Recently, the IPADS Lab at Shanghai Jiao Tong University released MobiAgent, a new toolkit for building mobile agents. The team says it lowers the barrier to developing personalized intelligent assistants and claims that its real-world performance exceeds that of GPT-5 and other top closed-source models.
MobiAgent gives anyone the opportunity to build their own AI assistant. The toolkit covers the entire process of creating a mobile agent from scratch: collecting operation data, training a model, and deploying that model on a smartphone. Because MobiAgent is open source, users can collect data, train models, and run AI assistants on their personal devices independently.
To verify MobiAgent's practical capabilities, the research team tested it on 20 popular Chinese applications. The results showed that the 7B-scale MobiAgent model not only outperformed several well-known closed-source large models but also ranked among the best open-source GUI agents of similar scale. MobiAgent's "Subconscious Memory Accelerator" helps the agent complete repetitive tasks quickly by learning from past operations, improving performance by up to 2 to 3 times.
The core of MobiAgent lies in its efficient data collection and intelligent training process. It uses lightweight tools to record users' phone operations, then leverages a general-purpose vision-language model (VLM) to generate high-quality training data. This data is refined and adjusted so that the trained agents generalize well. The "brain" of MobiAgent is divided into three parts: the "Planner", which is responsible for task planning; the "Decider", which makes decisions based on the current screen; and the "Executor", which performs the specific operations. This architecture makes model training more efficient and significantly improves response speed.
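The division of labor between the three roles can be illustrated with a minimal Python sketch. It is not MobiAgent's actual code: the function names, the `Action` type, and the keyword-matching rules are all hypothetical stand-ins for what would be trained models in the real system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    kind: str    # hypothetical action type, e.g. "tap" or "wait"
    target: str  # description of the UI element to act on


def planner(task: str) -> List[str]:
    """Split a task into ordered sub-goals (toy rule: split on ' then ')."""
    return [step.strip() for step in task.split(" then ") if step.strip()]


def decider(subgoal: str, screen: List[str]) -> Action:
    """Choose an action from the current screen (toy rule: keyword match)."""
    for element in screen:
        if element.lower() in subgoal.lower():
            return Action("tap", element)
    return Action("wait", "")  # nothing on screen matches the sub-goal


def executor(action: Action, log: List[str]) -> None:
    """Carry out the action; here it is only recorded in a log."""
    log.append(f"{action.kind}:{action.target}")


def run_agent(task: str, screen: List[str]) -> List[str]:
    """Plan once, then decide and execute one action per sub-goal."""
    log: List[str] = []
    for subgoal in planner(task):
        executor(decider(subgoal, screen), log)
    return log


if __name__ == "__main__":
    print(run_agent("open Search then tap Cart", ["Search", "Cart", "Profile"]))
```

Keeping the roles separate means each one can be swapped or retrained on its own, which is one plausible reading of why the article credits this split with more efficient training.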
With its AgentRR acceleration framework, MobiAgent draws on past operational experience to greatly improve the execution efficiency of repetitive tasks, with action reuse rates reaching 60% to 85%. This makes the AI assistant faster and more accurate when handling everyday tasks.
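The idea of reusing past actions can be sketched as a simple record-and-replay cache. This is only an illustration under stated assumptions, not the AgentRR implementation: the `ReplayCache` class, its exact-match key of task and screen, and the `slow_decide` callback are hypothetical, and a real system would need a more robust way to judge when a recorded action is safe to reuse.

```python
from typing import Callable, Dict, Tuple


class ReplayCache:
    """Toy record-and-replay cache keyed on (task, screen). Hypothetical."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}
        self.hits = 0
        self.lookups = 0

    def act(self, task: str, screen: str,
            slow_decide: Callable[[str, str], str]) -> str:
        """Replay a recorded action if one exists; otherwise decide and record."""
        key = (task, screen)
        self.lookups += 1
        if key in self._store:
            self.hits += 1
            return self._store[key]
        action = slow_decide(task, screen)  # stands in for a model call
        self._store[key] = action
        return action

    def reuse_rate(self) -> float:
        """Fraction of lookups answered from recorded actions."""
        return self.hits / self.lookups if self.lookups else 0.0


if __name__ == "__main__":
    cache = ReplayCache()

    def decide(task: str, screen: str) -> str:
        return "tap:" + screen

    # Run the same two-step task twice; the second run is served from the cache.
    for _ in range(2):
        for screen in ["home", "cart"]:
            cache.act("order coffee", screen, decide)
    print(cache.reuse_rate())
```

In this sketch the reuse rate is the share of steps that skip the slow decision path, which is the sense in which a higher rate translates into faster execution of repeated tasks.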
The launch of MobiAgent not only makes it easier to customize a personal intelligent assistant but also promotes the development of the entire mobile agent ecosystem, marking the arrival of an intelligent era in which users can "speak instead of tap."
Paper URL: https://arxiv.org/pdf/2509.00531