Ant Group Wins Championship at Top Computer Vision Conference, Achieving Practical-Level Advancement in AIGC Detection

AIbase基地

Published in AI News · 6 min read · Apr 10, 2026

Ant Group recently took first place in two tracks of the CVPR 2026 NTIRE Image Detection Challenge: "Robustness Sample Testing in Complex Real-World Scenarios" and "Face Enhancement Anomaly Detection." The results support stronger risk identification in scenarios such as payments, content security review, and financial identity authentication in the AI era.

The risks posed by deepfakes and AIGC misuse are growing. Such fake content is difficult to distinguish with the naked eye, and the accuracy of existing detection models drops sharply in real-world scenarios and as multimodal large models iterate rapidly. The CVPR challenge targets this pain point directly, requiring models to maintain high accuracy and strong robustness under two extreme tests: "unknown generation architecture" and "complex degradation interference."

Ant Group originated in payment scenarios and has spent the past 20 years accumulating security technologies at an internationally leading level. It is now extending that advantage to AI security. Ant Group proposed a detection framework based on the DINOv3 visual foundation model, taking AIGC detection from the laboratory to real-world scenarios.

In the "Robustness Sample Testing in Complex Real-World Scenarios" track, the Ant AI Security Lab team built a training corpus of millions of high-quality samples, drawing on open-source datasets such as WildFake and on images from cutting-edge generation models such as Z-Image, Seedream, and Nano-banana-pro. The underlying architecture uses a dual-stream parallel structure, like giving the detection model two complementary eyes: one captures local details of an image and the other captures its overall features. The team also simulated the full chain of image degradation, from a single type of noise to multiple stacked distortions, reproducing the distortion that images pick up in real scenarios such as sharing on social platforms and being rephotographed from a screen. This significantly improved the model's detection capability in real scenarios.
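The degradation simulation described above can be illustrated with a minimal sketch. This is not the team's pipeline: the three operations (Gaussian noise, resampling, intensity quantization), their parameters, and all function names are assumptions chosen for illustration, and the image is a plain list of grayscale rows so that the example needs only the Python standard library.

```python
import random
from typing import Callable, List

Image = List[List[float]]  # grayscale image, pixel values in [0.0, 255.0]


def clamp(value: float) -> float:
    """Keep a pixel value inside the valid 0-255 range."""
    return max(0.0, min(255.0, value))


def add_gaussian_noise(img: Image, rng: random.Random, sigma: float = 8.0) -> Image:
    """Add zero-mean Gaussian noise to every pixel (illustrative sigma)."""
    return [[clamp(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def resample(img: Image, factor: int = 2) -> Image:
    """Box-downsample by `factor`, then upsample back by pixel repetition.

    Fine detail is lost, loosely imitating a platform resizing an upload.
    Assumes height and width are multiples of `factor`.
    """
    h, w = len(img), len(img[0])
    small = [
        [
            sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / (factor * factor)
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]
    return [[small[y // factor][x // factor] for x in range(w)] for y in range(h)]


def quantize(img: Image, levels: int = 16) -> Image:
    """Reduce the number of intensity levels, a crude stand-in for lossy compression."""
    step = 256.0 / levels
    return [[clamp((p // step) * step + step / 2) for p in row] for row in img]


def degrade(img: Image, seed: int = 0, max_ops: int = 3) -> Image:
    """Apply between one and `max_ops` distinct degradations in random order."""
    rng = random.Random(seed)
    ops: List[Callable[[Image], Image]] = [
        lambda im: add_gaussian_noise(im, rng),
        resample,
        quantize,
    ]
    count = rng.randint(1, min(max_ops, len(ops)))
    for op in rng.sample(ops, count):
        img = op(img)
    return img


if __name__ == "__main__":
    clean = [[float((x * 16 + y * 8) % 256) for x in range(8)] for y in range(8)]
    degraded = degrade(clean, seed=0)
    print(len(degraded), len(degraded[0]))  # the image size is unchanged
```

Training on both the clean and the degraded copy of each sample is one common way to use such a chain; whether the team did exactly that is not stated in the article.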

The team also proposed a two-stage detection paradigm called "Locate-Then-Examine," which first locates suspicious areas and then reviews them in detail, and built the FakeXplained dataset, which provides text explanations for local image regions. Given a suspicious image, the method not only determines whether it is AI-generated but also marks the regions that contain forgery flaws or violate physical common sense, and generates a detailed explanation for each. This breaks through the limitations of traditional "black-box" detection and makes model decisions traceable. To help practitioners jointly address deepfake challenges, the team has also open-sourced on GitHub what it describes as the most comprehensive AIGC image and video detection resource repository in the field.
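The control flow of a locate-then-examine pipeline can be sketched as follows. This shows only the two-stage shape, not the team's model: the patch-based scoring, the smoothness heuristic, the threshold, and every name in the code are hypothetical placeholders for what would be learned components in a real system.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values
THRESHOLD = 0.5  # hypothetical cut-off for calling a region suspicious


@dataclass
class Region:
    top: int
    left: int
    size: int
    score: float  # higher means more suspicious


def locate(img: Image, patch: int = 4, top_k: int = 2) -> List[Region]:
    """Stage 1: score non-overlapping patches and keep the most suspicious ones.

    The score is a placeholder that rewards unnaturally flat patches; a real
    system would use a learned localization model instead.
    """
    regions = []
    for top in range(0, len(img) - patch + 1, patch):
        for left in range(0, len(img[0]) - patch + 1, patch):
            pixels = [
                img[top + dy][left + dx]
                for dy in range(patch)
                for dx in range(patch)
            ]
            score = 1.0 / (1.0 + pvariance(pixels))
            regions.append(Region(top, left, patch, score))
    return sorted(regions, key=lambda r: r.score, reverse=True)[:top_k]


def examine(region: Region) -> str:
    """Stage 2: inspect one located region and explain the verdict in text."""
    verdict = "suspicious" if region.score >= THRESHOLD else "plausible"
    return (
        f"Region at row {region.top}, column {region.left} "
        f"({region.size}x{region.size}) looks {verdict}: "
        f"placeholder smoothness score {region.score:.2f}."
    )


def locate_then_examine(img: Image) -> dict:
    """Run both stages and return a verdict with per-region explanations."""
    regions = locate(img)
    return {
        "is_flagged": any(r.score >= THRESHOLD for r in regions),
        "regions": regions,
        "explanations": [examine(r) for r in regions],
    }


if __name__ == "__main__":
    # Textured 8x8 image with one perfectly flat 4x4 patch in the top-left corner.
    img = [[float((x * 37 + y * 91) % 256) for x in range(8)] for y in range(8)]
    for y in range(4):
        for x in range(4):
            img[y][x] = 100.0
    for line in locate_then_examine(img)["explanations"]:
        print(line)
```

Because the verdict is returned together with the regions and their explanations, a reviewer can trace a decision back to specific areas of the image, which is the traceability property the paragraph above describes.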

In the "Face Enhancement Anomaly Detection" track, the Ant International team won by precisely identifying and locating abnormal areas in face images. The technology is applied mainly in scenarios such as identity verification for financial transactions and review of account-opening materials, providing an important technical safeguard against deepfake and AIGC attacks. In cross-border payments and financial services, Ant International has applied AIGC identification technology extensively in eKYC and in document and material anti-counterfeiting, so that various types of generated content can be detected.

CVPR is the international Conference on Computer Vision and Pattern Recognition, hosted by the IEEE. Together with ICCV and ECCV, it is regarded as one of the three top-tier conferences in computer vision. This challenge attracted more than 500 teams from around the world.

This article is from AIbase Daily
