How to Scale Large Models? Yang Zhilin's GTC Debut: Betting on Token Efficiency and Agent Cluster

AIbase基地

Published in AI News · 4 min read · Mar 18, 2026

The ticket to the second half of the large model era is no longer simply stacking more computing power, but restructuring the underlying logic.

At the NVIDIA GTC 2026 conference on March 18, Moonshot AI founder Yang Zhilin delivered a highly anticipated public speech. It was his first systematic disclosure of the core technology roadmap behind the Kimi K2.5 model, and it offered new insight into how large models may evolve in the "post-Scaling" era.

Yang Zhilin argued that breaking through the limits of intelligence requires thoroughly restructuring key technologies such as optimizers, attention mechanisms, and residual connections. He summarized Kimi's evolutionary path as three key dimensions that work in synergy:

Token Efficiency: Eliminate idle resources and pursue an ever higher computing efficiency ratio.

Long Context: Keep deepening Kimi's long-term memory advantage so it can process information at massive scale.

Agent Cluster: The form of intelligence is evolving from agents working alone to dynamically generated "digital clusters."

In Yang Zhilin's view, scaling now means finding scale effects in efficiency, memory, and automated collaboration. If the technical gains along these three dimensions multiply together, the model will reach a level of intelligence far beyond its current state.
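
To make the multiplication claim concrete, here is a minimal sketch with purely hypothetical gain factors; none of these numbers come from the talk, and the function and key names are illustrative only. It compares compounding three per-dimension gains multiplicatively against simply adding their increments.

```python
from math import prod


def combined_gain(gains, multiplicative=True):
    """Combine per-dimension gain factors (1.0 means no improvement)."""
    if multiplicative:
        # Gains compound: each dimension scales the result of the others.
        return prod(gains.values())
    # Additive baseline: sum only the increments above 1.0.
    return 1.0 + sum(g - 1.0 for g in gains.values())


# Hypothetical factors, chosen only for illustration (not from the speech).
hypothetical_gains = {
    "token_efficiency": 2.0,
    "long_context": 1.5,
    "agent_cluster": 3.0,
}

multiplied = combined_gain(hypothetical_gains)
added = combined_gain(hypothetical_gains, multiplicative=False)
print(f"multiplied: {multiplied:.1f}x, added: {added:.1f}x")
```

With these made-up inputs the multiplied case gives 9.0x against 4.5x for the additive one, which is the sense in which gains along the three dimensions would compound rather than merely stack.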

According to earlier release information, Kimi K2.5, launched in early January this year, already demonstrated this "all-around" capability. As Moonshot AI's most powerful open-source model so far, it adopts a native multimodal architecture, achieves state-of-the-art (SOTA) performance in code and visual understanding, and supports flexible switching between "thinking" and "non-thinking" modes to suit agent task scenarios.

With Moonshot AI's technical roadmap now public, the focus of competition among large models is shifting from "parameter count" to "intelligence density." As agent clusters are cast as the ultimate form of future intelligence, whether Kimi can leap ahead under Yang Zhilin's "three-dimensional multiplication" logic has become a focal point of industry attention.

This article is from AIbase Daily

—— Created by the AIbase Daily Team
