The AI voice field has a notable new entrant: Soul's SoulX-Podcast voice model, which has quickly drawn industry attention. Designed specifically for podcast-style content, the model generates highly realistic speech and supports long-duration, multi-speaker, and multilingual dialogue, marking a further step in AI's simulation of natural conversation.

The core strengths of SoulX-Podcast are its fidelity and stability. It can generate more than 90 minutes of continuous dialogue without a loss of stability, keeping the output smooth and natural. This makes it well suited to long-form podcasts, interviews, and storytelling, and moves AI voice from short demos toward practical use.

Multilingual and Dialect Support: Chinese-English Bilingual Generation with Dialects Built In

The model handles multiple languages well, supporting multi-turn dialogue generation in Mandarin, English, and several Chinese dialects. Users can switch between Chinese and English or work in local dialects to give a podcast a more regional character. It also offers paralinguistic controls that reproduce non-verbal cues such as laughter, sighs, and pauses, which makes the generated speech more lively and immersive.
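To make the idea of a multi-speaker, multilingual script with paralinguistic cues concrete, here is a minimal Python sketch. It is only an illustration: the `Turn` class, the `render` function, the language labels, and the `[speaker|language] text <cue>` tag layout are all hypothetical and are not taken from SoulX-Podcast's actual input format, which is documented in the project repository.

```python
from dataclasses import dataclass, field

# Hypothetical set of non-verbal cues, mirroring the ones named in the article.
ALLOWED_CUES = {"laughter", "sigh", "pause"}


@dataclass
class Turn:
    speaker: str                 # hypothetical speaker label, e.g. "S1"
    text: str                    # what the speaker says
    language: str = "zh"         # hypothetical language/dialect label
    cues: list[str] = field(default_factory=list)  # cues placed after the text


def render(turns: list[Turn]) -> str:
    """Render turns as one tagged line per turn (illustrative format only)."""
    lines = []
    for turn in turns:
        for cue in turn.cues:
            if cue not in ALLOWED_CUES:
                raise ValueError(f"unknown cue: {cue}")
        cue_str = "".join(f" <{cue}>" for cue in turn.cues)
        lines.append(f"[{turn.speaker}|{turn.language}] {turn.text}{cue_str}")
    return "\n".join(lines)


script = [
    Turn("S1", "Welcome back to the show.", "en"),
    Turn("S2", "今天我们聊聊方言。", "zh", ["laughter"]),
    Turn("S1", "Give me a second to think.", "en", ["sigh"]),
]

print(render(script))
```

Keeping speaker, language, and cues as separate fields means a script can switch languages per turn and attach cues without hand-editing tag strings.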

Notably, SoulX-Podcast also advances zero-shot voice cloning and transfer. The model can clone a specific voice and tone directly, without additional training, which enables personalized voice customization. This lowers the barrier to development and gives content creators far more room to experiment, for example by quickly reproducing the style of a celebrity interview or simulating the distinctive tone of a virtual host.

Industry Impact: The AI Podcast Era Is Accelerating

The release is likely to encourage wider use of AI voice in media, entertainment, and education. Experts point out that SoulX-Podcast will challenge the traditional recording-studio model by letting small teams produce high-quality podcast content efficiently. As the model continues to evolve, it is expected to expand into real-time interaction and cross-platform integration.

Project Address: https://github.com/Soul-AILab/SoulX-Podcast