Early today, Microsoft officially released Mu, its latest small-parameter model. With only 330 million parameters, Mu matches the performance of Microsoft's previously released Phi-3.5-mini while being just one-tenth its size. More impressively, Mu sustains a response speed of over 100 tokens per second running offline on laptop NPUs, a rare breakthrough among small-parameter models.
A major highlight of the Mu model is its support for an agent in Windows Settings. Users simply issue natural-language instructions, which the agent converts into system operations in real time. For example, a request like "Make the mouse pointer larger and adjust the screen brightness" lets the agent accurately locate the relevant settings and apply them in one step, greatly improving the usability of Windows.
Mu Architecture: Excellent Optimization for Small-Scale Local Deployment
The Mu model draws inspiration from Microsoft's previously released Phi Silica model and is optimized for small-scale local deployment, making it especially suitable for Copilot+ PCs equipped with NPUs. Its core architecture is a decoder-only Transformer, on top of which it introduces three key innovations:
- Dual Layer Normalization: Applying LayerNorm both before and after each sub-layer in the Transformer keeps the distribution of activations statistically well behaved, stabilizing training. This avoids the training-instability issues common in deep networks, improving training efficiency and reducing resource consumption.
- Rotary Position Embedding (RoPE): Unlike traditional absolute position embeddings, RoPE applies a rotation in the complex domain, turning position encoding into a function that directly reflects the relative distance between tokens. This avoids the performance degradation traditional methods suffer on long sequences and gives the model strong long-sequence extrapolation capabilities.
- Grouped-Query Attention: This optimization addresses the high parameter count and memory consumption of traditional multi-head attention. By sharing keys and values within groups of query heads, it significantly reduces attention parameters and memory usage, lowering latency and power consumption on NPUs. At the same time, because query-head diversity is preserved, performance remains comparable to standard multi-head attention.
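The three mechanisms above can be sketched in a minimal NumPy decoder sub-layer. This is an illustrative reconstruction, not Mu's actual implementation: the dimensions, the random weights, the exact placement of the two LayerNorms relative to the residual connection, and the omission of learned scale/shift parameters are all simplifying assumptions.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each position over the feature dimension
    # (learned scale/shift omitted for brevity).
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def rope(x, base=10000.0):
    # x: (seq, heads, head_dim). Rotate channel pairs by position-dependent
    # angles so that dot products depend only on relative token distance.
    seq, heads, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)          # (half,)
    angles = np.arange(seq)[:, None] * freqs[None, :]  # (seq, half)
    cos = np.cos(angles)[:, None, :]                   # broadcast over heads
    sin = np.sin(angles)[:, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    # x: (seq, d_model). Queries get n_q_heads heads; keys/values get only
    # n_kv_heads, each shared by a group of n_q_heads // n_kv_heads query heads.
    seq, d_model = x.shape
    head_dim = d_model // n_q_heads
    group = n_q_heads // n_kv_heads
    q = (x @ wq).reshape(seq, n_q_heads, head_dim)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim)
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim)
    q, k = rope(q), rope(k)            # relative positions via rotation
    k = np.repeat(k, group, axis=1)    # share each KV head across its group
    v = np.repeat(v, group, axis=1)
    scores = np.einsum("qhd,khd->hqk", q, k) / np.sqrt(head_dim)
    mask = np.triu(np.full((seq, seq), -np.inf), 1)  # causal mask
    weights = np.exp(scores + mask)
    weights /= weights.sum(-1, keepdims=True)
    out = np.einsum("hqk,khd->qhd", weights, v)
    return out.reshape(seq, d_model)

def dual_norm_sublayer(x, f):
    # "Dual LayerNorm": normalize before AND after the sub-layer,
    # then add the residual (placement is an assumption).
    return x + layer_norm(f(layer_norm(x)))
```

With 4 query heads sharing 2 KV heads, the key/value projections are half the size of a standard multi-head layer, which is where the parameter and memory savings come from.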
In addition, the Mu model employs training techniques such as a warmup-stable-decay learning-rate schedule and the Muon optimizer to further improve performance. Microsoft trained Mu on A100 GPUs, following the approach pioneered with the Phi models: it was first pre-trained on hundreds of billions of high-quality educational tokens to learn grammar, semantics, and world knowledge. To further improve accuracy, Mu was then distilled from the Phi models, achieving significant parameter efficiency: with only one-tenth the parameters of Phi-3.5-mini, it reaches similar performance.
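A warmup-stable-decay schedule can be sketched as a simple piecewise function. The warmup/decay fractions and the linear decay shape below are illustrative defaults, not the hyperparameters Microsoft used for Mu.

```python
def wsd_lr(step, total_steps, peak_lr, warmup_frac=0.05, decay_frac=0.2):
    """Warmup-Stable-Decay learning rate: linear warmup to peak_lr,
    a long constant plateau, then decay to zero at the end of training.
    Fractions and the linear decay shape are illustrative assumptions."""
    warmup = int(total_steps * warmup_frac)
    decay_start = int(total_steps * (1 - decay_frac))
    if step < warmup:
        # linear warmup from ~0 to peak_lr
        return peak_lr * (step + 1) / warmup
    if step < decay_start:
        # stable phase: hold the peak learning rate
        return peak_lr
    # decay phase: linearly anneal to zero over the final steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - decay_start))
```

The long stable phase is the distinguishing feature versus cosine schedules: training can be extended or checkpointed mid-plateau without committing to a total step count up front.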
Empowering Windows Agents: The Perfect Combination of Low Latency and High Accuracy
To enhance the usability of Windows, Microsoft has been working to build an AI agent that understands natural language and can seamlessly modify system settings. Microsoft plans to integrate the Mu-driven agent into the existing settings search bar for a smooth user experience, which demands ultra-low-latency responses across the many possible settings.
After testing various models, Microsoft selected Mu for its fit to these constraints. Although the baseline Mu model saw roughly a 50% drop in accuracy without fine-tuning, Microsoft closed this gap by scaling the training data to 3.6 million samples (a 1,300-fold increase) and expanding coverage from about 50 settings to hundreds. Using techniques such as automated-annotation-based synthetic data, prompt tuning with metadata, diverse phrasing, noise injection, and smart sampling, the fine-tuned Mu settings agent met its quality goals. Testing shows the Mu-based agent understands and executes Windows settings operations well, with response times kept under 500 milliseconds.
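Two of the techniques named above, diverse phrasing and noise injection, can be illustrated with a toy synthetic-data generator. The templates, misspelling table, and action label below are hypothetical placeholders; the article names the techniques but not Microsoft's actual pipeline or label format.

```python
import random

# Hypothetical phrasing templates and typo table for settings utterances.
TEMPLATES = [
    "make the {target} {change}",
    "can you make the {target} {change}",
    "please make the {target} {change} for me",
]
TYPOS = {"brightness": ["britness", "brightnes"], "pointer": ["ponter"]}

def inject_noise(text, rng, p=0.3):
    # With probability p per known word, swap in a misspelling so the
    # fine-tuned model stays robust to real-world typos.
    for word, variants in TYPOS.items():
        if word in text and rng.random() < p:
            text = text.replace(word, rng.choice(variants))
    return text

def synthesize(target, change, action, rng, n=5):
    # Expand one annotated (utterance -> setting action) pair into n
    # diversely phrased, optionally noised training samples.
    samples = []
    for _ in range(n):
        text = rng.choice(TEMPLATES).format(target=target, change=change)
        samples.append({"text": inject_noise(text, rng), "action": action})
    return samples

# Example: one seed annotation fans out into several phrasings,
# all mapped to the same (hypothetical) settings action label.
data = synthesize("mouse pointer", "larger", "Pointer.Size:increase",
                  random.Random(0))
```

Scaling this fan-out across hundreds of settings is how a small seed of annotations can grow into millions of training samples.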