In the recently released Chinese Precise Instruction Following Evaluation Benchmark (SuperCLUE-CPIF), Baidu's ERNIE X1.1 scored 75.51, ranking first among domestic large models. The evaluation covers 10 well-known models from China and abroad, including GPT-5(high), DeepSeek-V3.2-Exp-Thinking, Claude-Sonnet-4.5-Reasoning, and Gemini-2.5-Pro, and focuses on assessing the ability of large language models (LLMs) to execute complex instructions in Chinese.
The SuperCLUE-CPIF evaluation looks not only at the task types and the number of instructions a model can handle, but places particular emphasis on its ability to turn natural-language instructions into outputs that meet specific requirements. In this evaluation, ERNIE X1.1 performed especially well on tasks drawn from real production settings, showing clear strengths in complex writing tasks and diverse scenarios.
ERNIE X1.1 is a deep-thinking model built on ERNIE 4.5. Its upgrade adopted an iterative hybrid reinforcement learning training framework: the model improves on both general tasks and agent tasks, and continues to raise overall performance through iterative training on self-distilled data.
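Baidu has not published the details of this framework, but the general pattern of alternating self-distillation with a reward-blended RL-style update can be illustrated with a toy loop. The sketch below is purely hypothetical: the `hybrid_reward` weights, the `model_score` abstraction, and the update rule are illustrative assumptions, not Baidu's method.

```python
import random

# Generic sketch of iterating self-distillation with an RL-style update that
# blends two reward signals (e.g., general tasks and agent tasks).
# All numbers and functions here are placeholders for illustration only.

def sample_responses(model_score, n=200):
    """The current model generates candidate responses of varying quality."""
    return [model_score + random.gauss(0, 0.1) for _ in range(n)]

def hybrid_reward(quality, w_general=0.6, w_agent=0.4):
    """Blend a general-task reward with an agent-task reward (assumed weights)."""
    general = max(0.0, min(1.0, quality))
    agent = max(0.0, min(1.0, quality - 0.02))
    return w_general * general + w_agent * agent

def self_distill(responses, keep_ratio=0.2):
    """Keep only the highest-reward responses as training data for the next round."""
    ranked = sorted(responses, key=hybrid_reward, reverse=True)
    return ranked[: int(len(ranked) * keep_ratio)]

model_score = 0.5  # stand-in for overall model capability
for round_idx in range(5):
    candidates = sample_responses(model_score)
    distilled = self_distill(candidates)
    target = sum(hybrid_reward(q) for q in distilled) / len(distilled)
    model_score += 0.5 * (target - model_score)  # move toward the high-reward data
    print(f"round {round_idx + 1}: capability ~ {model_score:.3f}")
```

In this toy setup each round's training data comes from the model's own best outputs, so the capability estimate rises gradually across iterations, which is the basic intuition behind iterative training on self-distilled data.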
In practical applications, ERNIE X1.1 can flexibly combine its built-in knowledge with online search tools to retrieve exactly the information users need, understand their creative writing requirements in depth, and produce content that is well structured, logically clear, and elegantly written. For example, when handling customer service for a shared-bike platform, ERNIE X1.1 can weigh both the user's emotional state and the type of problem, resolving the issue efficiently while delivering a complete, proactive service process.
As one of the earliest Chinese companies to invest in large model research and development, Baidu continues to drive the evolution of the ERNIE large model through its full-stack, self-developed "chip - framework - model - application" system. Data shows that ERNIE X1.1 improves on its predecessor ERNIE X1 by 34.8% in factual accuracy, 12.5% in instruction following, and 9.6% in agent capabilities. This result sets a new benchmark for the development of domestic large models.