Against the backdrop of rapid progress in speech synthesis technology, ModelBest and the Human-Computer Speech Interaction Laboratory at Tsinghua University's Shenzhen International Graduate School (THUHCSI) recently jointly released a new speech generation model, VoxCPM. With only 0.5B parameters, the model aims to deliver high-quality, natural-sounding speech synthesis.
The release of VoxCPM marks another milestone in high-fidelity speech generation. The model achieves industry-leading results on key metrics such as naturalness, speaker similarity, and prosodic expressiveness. Through zero-shot voice cloning, VoxCPM can reproduce a speaker's voice from only a short reference recording, enabling personalized speech synthesis. This opens up new application scenarios for speech generation, particularly personalized voice assistants and character voice acting.
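For developers, cloning starts from a short reference clip plus its transcript. Below is a minimal sketch using the project's voxcpm Python package; the parameter names follow the public README at the time of writing and may differ between versions, and the file paths are placeholders.

```python
# Minimal voice-cloning sketch with the voxcpm package (pip install voxcpm).
# Parameter names are taken from the project README and may change.
import soundfile as sf
from voxcpm import VoxCPM

model = VoxCPM.from_pretrained("openbmb/VoxCPM-0.5B")

wav = model.generate(
    text="Hello, this is my personalized voice assistant.",
    prompt_wav_path="reference_speaker.wav",   # a few seconds of the target voice
    prompt_text="Transcript of the reference recording.",
)
sf.write("cloned_voice.wav", wav, 16000)  # 16 kHz output assumed
```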
VoxCPM is open-sourced on GitHub, Hugging Face, and ModelScope, and an online playground lets developers explore its capabilities directly. On the authoritative Seed-TTS-EVAL speech synthesis benchmark, the model achieves a remarkably low word error rate and high speaker similarity. It is also efficient at inference: on a single NVIDIA RTX 4090 GPU, VoxCPM reaches a real-time factor (RTF) of approximately 0.17, fast enough for high-quality real-time interaction.
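As a reminder, RTF is the ratio of synthesis time to the duration of the audio produced, so values below 1.0 mean faster-than-real-time generation. A simple sketch of how one might measure it, reusing the model object from the snippet above (the 16 kHz sample rate is an assumption):

```python
# Illustrative RTF measurement: wall-clock synthesis time divided by the
# duration of the generated audio. `model` is the VoxCPM instance loaded above.
import time

start = time.perf_counter()
wav = model.generate(text="A sentence used to benchmark synthesis speed.")
elapsed = time.perf_counter() - start

audio_seconds = len(wav) / 16000   # samples / assumed 16 kHz sample rate
rtf = elapsed / audio_seconds
print(f"RTF: {rtf:.2f}")           # ~0.17 reported on a single RTX 4090
```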
VoxCPM stands out not only in raw performance but also in audio quality and emotional expression. The model selects appropriate voice, intonation, and prosody based on the text content, producing speech that approaches human delivery. Whether it is a weather report, a stirring speech, or a dialect-speaking broadcaster, VoxCPM renders it faithfully, offering an immersive listening experience.
In addition, VoxCPM's technical architecture is built on a diffusion-autoregressive speech generation paradigm, combining hierarchical language modeling with local diffusion over continuous representations, which significantly improves the expressiveness and naturalness of the generated speech. The core architecture consists of multiple cooperating modules that realize an efficient end-to-end "semantics-to-acoustics" generation process.
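To make that description concrete, here is a purely illustrative PyTorch sketch of how an autoregressive backbone and a local diffusion head can be coupled; it is not the released implementation, and every module name here is invented:

```python
# Conceptual sketch (NOT VoxCPM's actual code) of a diffusion-autoregressive
# pipeline: an autoregressive backbone produces a per-step semantic state,
# and a local diffusion head iteratively refines a continuous acoustic latent.
import torch
import torch.nn as nn

class DiffusionARSketch(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.lm_backbone = nn.GRU(dim, dim, batch_first=True)   # stand-in for the LM
        self.diffusion_head = nn.Sequential(                    # stand-in denoiser
            nn.Linear(dim * 2, dim), nn.SiLU(), nn.Linear(dim, dim)
        )

    def forward(self, text_emb: torch.Tensor, steps: int = 10) -> torch.Tensor:
        # 1) Autoregressive pass: semantic planning over the text sequence.
        hidden, _ = self.lm_backbone(text_emb)
        # 2) Local diffusion: denoise a continuous latent step by step,
        #    conditioned on the backbone's hidden states.
        latent = torch.randn_like(hidden)
        for _ in range(steps):
            latent = latent + self.diffusion_head(torch.cat([latent, hidden], dim=-1))
        return latent  # continuous acoustic features, decoded to a waveform downstream

latents = DiffusionARSketch()(torch.randn(1, 20, 512))  # (batch, frames, dim)
```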
🔗 GitHub:
https://github.com/OpenBMB/VoxCPM/
🔗 Hugging Face:
https://huggingface.co/openbmb/VoxCPM-0.5B
🔗 ModelScope:
https://modelscope.cn/models/OpenBMB/VoxCPM-0.5B
🔗 Playground (Online Demo):
https://huggingface.co/spaces/OpenBMB/VoxCPM-Demo
🔗 Audio Samples:
https://openbmb.github.io/VoxCPM-demopage