Recently, the China Academy of Information and Communications Technology (CAICT) officially launched version 3.0 of its "Fangsheng" benchmarking system, marking another major step forward for AI evaluation in China. The new version is a comprehensive upgrade over its predecessors: it adds tests of basic model attributes, systematically evaluating underlying characteristics such as parameter scale and inference efficiency. The system also lays groundwork for future frontier-capability testing, targeting ten advanced capabilities including full-modal understanding, long-term memory, and self-learning, and it provides deeper scenario-based evaluations for key industries such as industrial manufacturing, fundamental science, and finance.
To support "Fangsheng" 3.0, CAICT is strengthening its evaluation infrastructure on several fronts. First, it plans to expand its high-quality test data resources, adding 3 million new entries to meet the evaluation needs of multi-language, multi-task, and multi-scenario models. Second, it will systematically research and apply advanced testing methods, targeting key technical challenges in large-model evaluation such as high-quality test data synthesis and quality assessment. Finally, it will build a new generation of intelligent evaluation infrastructure, adding simulated environments for multi-agent interaction and environmental perception to support the evaluation of agent collaboration and dynamic-environment adaptation in complex scenarios.
Since 2024, CAICT has run a large-model benchmark test every two months. The latest round evaluated 141 large models and 7 agents, covering basic capabilities, reasoning, code application, and multimodal understanding. The results showed OpenAI's GPT-5 continuing to lead in overall capability, while domestic models such as Alibaba's Qwen3-Max-Preview and Moonshot's Kimi K2 performed strongly. Among multimodal models, image understanding has made clear breakthroughs, but complex logical reasoning tasks still leave room for improvement.
The code-application results likewise showed that while models perform well on simple function-level tasks, they still fall short in real-world project development. This indicates that the technological race between domestic and international players remains close, and that agents still need to improve in multimodal understanding and complex information processing.
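The report does not describe how "Fangsheng" actually scores these rounds, but the general shape of such a harness can be sketched: a task suite tagged by capability category, run against each model and averaged per category. The following minimal Python sketch is purely illustrative, not CAICT's methodology; every name in it (Task, exact_match, run_benchmark, the stand-in model) is hypothetical, and the exact-match scorer is a deliberately simple placeholder for real grading methods.

```python
# Illustrative sketch only: CAICT has not published "Fangsheng" internals.
# All names below are hypothetical.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    category: str   # e.g. "basic", "reasoning", "code", "multimodal"
    prompt: str
    reference: str  # expected answer for scoring


def exact_match(answer: str, reference: str) -> float:
    """Toy scorer: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if answer.strip().lower() == reference.strip().lower() else 0.0


def run_benchmark(model: Callable[[str], str], tasks: list[Task]) -> dict[str, float]:
    """Run every task against one model; return mean score per category."""
    scores: dict[str, list[float]] = {}
    for task in tasks:
        scores.setdefault(task.category, []).append(
            exact_match(model(task.prompt), task.reference)
        )
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}


if __name__ == "__main__":
    tasks = [
        Task("basic", "Capital of France?", "Paris"),
        Task("reasoning", "2 + 2 * 3 = ?", "8"),
    ]
    stub_model = lambda prompt: "Paris"  # stand-in for a real model API call
    print(run_benchmark(stub_model, tasks))  # {'basic': 1.0, 'reasoning': 0.0}
```

In practice, a suite at "Fangsheng" scale would replace the exact-match scorer with task-specific graders and loop this over all 141 models, but the per-category aggregation shown here is the basic pattern.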
CAICT will continue to strengthen R&D in large-model evaluation technology, enhance the credibility and authority of its evaluations, and support cutting-edge innovation in artificial intelligence and the development of new industrialization.