The VI-SVC model is just VITS without MAS and the DurationPredictor.
1 min of voice data can also be used to train a good TTS model! (few-shot voice cloning)
Easily train a good VC model with <= 10 mins of voice data!
SoftVC VITS Singing Voice Conversion
SOTA Open Source TTS
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and websocket server/client, as well as 11 programming languages.
Core Engine of Singing Voice Conversion & Singing Voice Clone
A simple, high-quality voice conversion tool focused on ease of use and performance.
Mobile anime-style AI waifu chat app
Best-practice TTS based on BERT and VITS, with some Natural Speech features from Microsoft; supports ONNX streaming output!