Best Zero-shot AI Tools & Models - Premium Zero-shot News

AI News

Xiaomi Open-Sources a Major Project! OmniVoice Covers 600+ Languages for Zero-Shot Voice-Cloning TTS: WER of Only 0.84%, 40 Times Faster, and Low-Resource Languages Can Be Revived Easily

Xiaomi's Kaldi team has open-sourced the OmniVoice model, which supports more than 600 languages. It achieves SOTA results on multiple metrics across Chinese and multilingual TTS benchmarks: its Chinese WER is as low as 0.84%, and its multilingual performance surpasses mainstream commercial models, marking a new advance in speech synthesis.

11.1k 28 minutes ago
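
For readers unfamiliar with the WER figure quoted above: word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript of the synthesized audio into the reference text, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition only. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration, and nothing here reflects how the OmniVoice team actually ran its evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace. Real benchmarks usually
    normalize text first and may score characters instead for some languages.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 0.84% corresponds to fewer than one word error per hundred reference words.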

Meituan Open-Sources LongCat-AudioDiT: Pioneering Waveform Latent Space Modeling, Setting a New SOTA for Voice Cloning

Meituan's LongCat team has open-sourced LongCat-AudioDiT, an end-to-end model that directly models the waveform latent space. By eliminating the mel-spectrogram intermediate step, it reduces information loss and error accumulation, improving zero-shot voice cloning performance...

13.5k 3 hours ago

Starts with a Character! Alibaba Qwen3-TTS Makes Its Debut: 49 Voice Styles + 10 Languages, 9 Dialects, WER Outperforms Mainstream Commercial Models

Alibaba has launched Qwen3-TTS, a zero-shot, multi-role, cross-lingual speech synthesis model with lower word error rates than leading commercial engines. It offers 49 voices, supports 10 languages and 9 Chinese dialects, and gives developers 1 million free characters on Alibaba Cloud...

22k 1 day ago

Giant Network Launches Three Multi-Modal Models: Eliminating Video Distortion and Enabling Practical Song Usage Through Voice Conversion

Giant Network AI Lab, in collaboration with Tsinghua University and Northwestern Polytechnical University, has launched three audio-visual multi-modal generation technologies: YingVideo-MV (music-driven video generation), YingMusic-SVC (zero-shot voice conversion), and YingMusic-Singer (voice synthesis). All three will be open-sourced. YingVideo-MV can generate a video from nothing more than a piece of music and an image of a person.

13.4k 5 hours ago