On September 19, Xiaomi announced the open-source release of Xiaomi-MiMo-Audio, its first native end-to-end speech large model, marking a major breakthrough in speech technology. Five years ago, the emergence of GPT-3 ushered in a new era for artificial general intelligence (AGI) in language, but the speech field has long been constrained by its reliance on large-scale annotated data, making the few-shot generalization seen in language models difficult to achieve. Now, Xiaomi-MiMo-Audio, built on an innovative pre-training architecture and roughly one hundred million hours of training data, has for the first time achieved few-shot generalization via In-Context Learning (ICL) in the speech field, with clear "emergence" behavior observed during pre-training.

The Xiaomi-MiMo-Audio model performs strongly on multiple standard evaluation benchmarks. It not only surpasses open-source models of the same parameter count, but also exceeds Google's closed-source speech model Gemini-2.5-Flash on the standard test set of the audio understanding benchmark MMAU, and outperforms OpenAI's closed-source speech model GPT-4o-Audio-Preview on the S2T task of the complex audio reasoning benchmark Big Bench Audio. This result demonstrates Xiaomi's depth in speech technology and points to a new direction for the development of speech AI.


The open-sourced Xiaomi-MiMo-Audio model features multiple innovations and first-time breakthroughs. First, it demonstrates for the first time that scaling lossless speech compression pre-training to one hundred million hours can "emerge" cross-task generalization, manifested as few-shot learning ability, which is regarded as the "GPT-3 moment" for the speech field. Second, Xiaomi is the first to clearly define the objectives of generative speech pre-training, and has open-sourced a complete speech pre-training recipe, including a lossless compression Tokenizer, a new model architecture, training methods, and an evaluation system, ushering in the "LLaMA moment" for the speech field. In addition, Xiaomi-MiMo-Audio is the first open-source model to introduce a thinking process into both speech understanding and speech generation, supporting mixed thinking.

Xiaomi adopted a simple, thorough, and direct open-source approach to accelerate research in the speech field. The release includes the pre-trained model MiMo-Audio-7B-Base and the instruction-tuned model MiMo-Audio-7B-Instruct, along with the Tokenizer model, a technical report, and an evaluation framework. MiMo-Audio-7B-Instruct can switch between non-thinking and thinking modes via the prompt; it offers a high starting point and great potential for reinforcement learning, serving as a new base model for research on speech RL and agentic training. The Tokenizer model has 1.2B parameters, uses a Transformer architecture, balances efficiency and performance, was trained from scratch on over ten million hours of speech data, and supports both audio reconstruction and audio-to-text tasks. The technical report presents the model and training details in full, while the evaluation framework, open-sourced on GitHub, supports more than 10 evaluation tasks.
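To make the mode switch concrete, here is a minimal sketch of how a prompt-controlled thinking toggle could be assembled on the client side. The `build_messages` helper and the system-instruction wording below are hypothetical illustrations, not the official MiMo-Audio API; consult the model card on Hugging Face for the actual prompt format.

```python
# Hypothetical sketch: toggling thinking vs. non-thinking mode via the prompt.
# The helper name and system-instruction strings are illustrative placeholders,
# not the documented MiMo-Audio-7B-Instruct interface.

def build_messages(user_text: str, thinking: bool = True) -> list[dict]:
    """Assemble a chat-style message list; the system instruction that
    toggles the thinking mode here is a placeholder assumption."""
    system = (
        "Think step by step before answering."
        if thinking
        else "Answer directly without a thinking trace."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_text},
    ]

# Non-thinking mode for a latency-sensitive request
msgs = build_messages("Transcribe and summarize this clip.", thinking=False)
```

In practice, the message list would then be passed to the model's chat template and generation call; the point is only that a single prompt-level switch selects the mode, as the release describes.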

Xiaomi stated that open-sourcing Xiaomi-MiMo-Audio will significantly accelerate the alignment of speech large model research with language large model research, laying an important foundation for speech AGI. Xiaomi will continue to open-source its work, and looks forward to moving, together with fellow travelers, toward the "singularity" of speech AI and into a new era of human-computer interaction.

https://huggingface.co/XiaomiMiMo/MiMo-Audio-7B-Instruct