AIBase

AI News


Microsoft Open-Sources Real-Time Speech Model VibeVoice-Realtime-0.5B: Speech Starts in About 300 ms, Holds Up Even on 90-Minute Long Audio

Microsoft has open-sourced the real-time speech model VibeVoice-Realtime-0.5B, which offers very low latency and near-human voice quality. On average, the model takes only 300 milliseconds from text input to voice output, far less than the 1-3 seconds of traditional TTS models, which enables near-real-time speech synthesis.

8.4k 36 minutes ago

Microsoft Launches VibeVoice-Realtime-0.5B: Achieving Almost Real-Time Natural Speech Generation with Just 0.5B Parameters

Microsoft has released the real-time text-to-speech model VibeVoice-Realtime-0.5B. With only 0.5B parameters, it can start speaking in about 300 milliseconds, giving smooth, near-real-time speech generation. The model supports real-time transcription and speech generation in both Chinese and English; performance in Chinese is slightly lower, but overall fluency and fidelity remain high. Its natural sound quality has drawn attention.

8.1k 2 hours ago
© 2025 AIBase