Xiaomi Open-Sources OmniVoice: Zero-Shot Voice-Cloning TTS Covering 600+ Languages, With a WER of Just 0.84%, 40x Faster Speed, and Easy Support for Low-Resource Languages
Xiaomi's Kaldi team has open-sourced the OmniVoice model, which supports more than 600 languages. It achieves state-of-the-art (SOTA) results on multiple metrics across Chinese and multilingual TTS benchmarks: its Chinese WER is as low as 0.84%, and its multilingual performance surpasses that of mainstream commercial models, marking a new advance in speech synthesis.