With the rapid development of artificial intelligence, intelligent interaction has become a new focus of the mobile internet. Recently, the THUNLP Lab at Tsinghua University and Mianbi Intelligence jointly released AgentCPM-GUI, an open-source project billed as the first open-source GUI (graphical user interface) Agent specifically optimized for Chinese apps. The project demonstrates the strength of domestic AI technology and opens new possibilities for the intelligent upgrade of the Android ecosystem.


Technical Breakthrough: The World's First GUI Agent Specialized for Chinese Apps

AgentCPM-GUI is built on Mianbi Intelligence's MiniCPM-V model and has 8 billion (8B) parameters. The model takes phone screenshots as input, identifies interface elements, and automatically executes user instructions. Unlike general-purpose Agents, AgentCPM-GUI has been optimized in depth for Chinese apps, covering more than 30 mainstream Chinese applications including AutoNavi Map, Dianping, Bilibili, and Xiaohongshu.
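The screenshot-in, action-out workflow described above can be pictured as a simple perceive-predict-act loop. The following is a minimal sketch only, not the project's actual code: the `Action` class, `run_agent`, and the stand-in functions are all hypothetical names introduced for illustration, and a real deployment would replace the stand-ins with a device screenshot call, the model's inference call, and an input-injection call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action record; not the project's real schema."""
    kind: str      # "tap", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    instruction: str,
    capture_screen: Callable[[], bytes],
    predict_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: take a screenshot, ask the model for one action, perform it."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screen()
        action = predict_action(instruction, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break  # the model signals that the task is finished
        execute(action)
    return history


# Stand-ins so the sketch runs without a phone or a model (assumptions).
scripted = [
    Action("tap", x=540, y=120),
    Action("type", text="example uploader"),
    Action("done"),
]
executed: List[Action] = []


def fake_capture() -> bytes:
    return b"fake-png-bytes"


def fake_predict(instruction: str, screenshot: bytes, history: List[Action]) -> Action:
    # A real model would look at the screenshot; this just replays a script.
    return scripted[len(history)]


history = run_agent("Check for new videos", fake_capture, fake_predict, executed.append)
print([a.kind for a in history])  # prints ['tap', 'type', 'done']
```

The `max_steps` cap is a design choice in this sketch so that a model that never emits "done" cannot loop forever.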


According to AIbase, the Agent performs exceptionally well at locating interface elements and executing tasks. In one demonstration, AgentCPM-GUI quickly opens Bilibili and checks whether a specific uploader (UP主) has posted a new video, operating smoothly and precisely. This capability stems from its deep understanding of the interface logic of Chinese apps and its efficient algorithm design.

Efficiency Revolution: Average Action Length Reduced to Only 9.7 Tokens

AgentCPM-GUI also stands out in on-device inference efficiency. Through a compact action-space design and a concise output format, the average action length has been reduced to 9.7 tokens, so each step requires less decoding and fewer computational resources. This means that even on ordinary Android devices, AgentCPM-GUI can respond quickly and run smoothly, giving users an interaction experience close to that of native apps.
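To see why a short action encoding matters, compare a verbose and a compact serialization of the same tap. This is a rough sketch under stated assumptions: both sets of field names are hypothetical rather than the project's schema, and character count is used only as a crude proxy for token count, since the real figure depends on the model's tokenizer.

```python
import json

# Hypothetical tap action written two ways; field names are illustrative only.
verbose_action = {
    "action_type": "click",
    "coordinates": {"x": 480, "y": 920},
    "description": "Tap the search box at the top of the screen",
}
compact_action = {"POINT": [480, 920]}


def serialize(action: dict, compact: bool) -> str:
    """Serialize an action; the compact form drops all optional whitespace."""
    if compact:
        return json.dumps(action, separators=(",", ":"))
    return json.dumps(action, indent=2)


verbose_text = serialize(verbose_action, compact=False)
compact_text = serialize(compact_action, compact=True)

print(compact_text)  # prints {"POINT":[480,920]}
print(len(verbose_text) > len(compact_text))  # prints True
```

Because the model must generate every output token in sequence, a shorter action string translates directly into fewer decoding steps per action.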

AIbase believes that this efficiency gain not only lowers the hardware barrier for developers and users but also lays the foundation for deploying AgentCPM-GUI widely on consumer electronics. On smartphones, tablets, and other smart terminals alike, AgentCPM-GUI has the potential to become a core engine of intelligent interaction.

Open Source Empowerment: Promoting Intelligent Upgrades in the Android Ecosystem

As a fully open-source project, AgentCPM-GUI reflects the commitment of Tsinghua University and Mianbi Intelligence to making AI technology widely accessible. The development team stated that the code and related documentation have been made public, so developers can freely access them and build on them. This will greatly reduce the cost of developing intelligent interaction for Chinese apps and help more small and medium-sized enterprises take part in building the intelligent ecosystem.

AIbase notes that the open release of AgentCPM-GUI has drawn significant industry attention. Industry insiders point out that the project not only fills a gap in the field of Chinese GUI Agents but also provides a valuable reference for the intelligent development of the global Android ecosystem. As more developers participate, AgentCPM-GUI is expected to raise the interactive experience of mainstream apps such as AutoNavi Map and Dianping to a new level.

Application Prospects: From Navigation to Social Interaction, Intelligence Everywhere

AgentCPM-GUI opens up broad room for intelligent applications in Chinese apps. In navigation, users can issue voice commands and let AgentCPM-GUI operate AutoNavi Map to plan routes. In social scenarios, the Agent can quickly browse notes on Xiaohongshu or videos on Bilibili and extract the information users need. In life services, the Agent can handle restaurant recommendations and reservations on Dianping in a single step.

AIbase predicts that as AgentCPM-GUI gains adoption, the user experience of Chinese apps will see a qualitative leap. Whether by improving operational efficiency or by optimizing personalized services, this Agent will become an intelligent bridge between users and applications.

A Milestone Breakthrough in Domestic AI

As a professional media outlet in the AI field, AIbase believes that the release of AgentCPM-GUI is not only a major research and development breakthrough for Tsinghua University and Mianbi Intelligence but also an important step for domestic AI onto the global stage. Its fine-grained optimization for Chinese apps and efficient on-device inference showcase the distinctive advantages of Chinese AI companies in localized scenarios.