Best XTTS AI Tools & Models - Premium XTTS News

Models

qwen3-tts-flash

Alibaba

qwen3-tts-flash-realtime

Alibaba

qwen-tts-realtime

Alibaba

Input: $2.4 per million tokens

Output: $12 per million tokens

qwen-tts

Alibaba

Input: $1.6 per million tokens

Output: $10 per million tokens
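
As a rough illustration of how the per-million-token prices listed above translate into a bill, here is a minimal sketch. It assumes the prices are in US dollars and that billing scales linearly with token count; neither is stated in the listing. The function name `estimate_cost` and the table `PRICES_PER_MILLION` are hypothetical helpers, not part of any Alibaba API. Models without a listed price are left out.

```python
# Prices per million tokens, copied from the listing above.
# Assumption: the "$" amounts are US dollars and billing is linear in tokens.
PRICES_PER_MILLION = {
    "qwen-tts-realtime": {"input": 2.4, "output": 12.0},
    "qwen-tts": {"input": 1.6, "output": 10.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost for one workload on the given model."""
    if model not in PRICES_PER_MILLION:
        raise ValueError(f"no listed price for model: {model}")
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    price = PRICES_PER_MILLION[model]
    # Scale each per-million price by the fraction of a million tokens used.
    input_cost = input_tokens / 1_000_000 * price["input"]
    output_cost = output_tokens / 1_000_000 * price["output"]
    return input_cost + output_cost


if __name__ == "__main__":
    for name in PRICES_PER_MILLION:
        cost = estimate_cost(name, input_tokens=500_000, output_tokens=200_000)
        print(f"{name}: {cost:.2f}")
```

Under these assumptions, output tokens dominate the cost for both priced models, so a comparison should be run with token counts that reflect the real workload.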

Xtts Gguf

GenMedLabs

XTTS v2 GGUF is a memory-efficient text-to-speech system optimized for mobile devices. It uses a C++ inference engine to achieve ultra-low memory usage and fast loading.

Audio Processing, GGUF, Multiple Languages

246

Tunisian_TTS

amenIKh

An XTTS v2 text-to-speech model fine-tuned on a custom Tunisian dataset.

Audio Processing, Arabic

EGTTS V0.1

OmarSamir

A text-to-speech (TTS) model designed specifically for Egyptian Arabic, built on the XTTS v2 architecture.

Audio Processing, Arabic

101

XTTS V2 Urdu FT

suhaibrashid17

A TTS model that supports Urdu text-to-speech and voice cloning.

Audio Processing

XTTS V2

shadialhakimi

ⓍTTS-v2 is a voice generation model that supports 17 languages. Given just a 6-second audio clip, it can clone a voice and synthesize speech in that voice across languages.

Audio Processing

XTTS V2_C3PO

Xerror

A multilingual speech synthesis model fine-tuned on voice samples of C-3PO from 'Star Wars'. It supports 17 languages while retaining the character's iconic voice and sarcastic style.

Audio Processing

XTTS Hindi Finetuned

Abhinay45

A fine-tuned version of Coqui-AI's XTTS v2 model, optimized for Hindi speech, with support for voice cloning and multilingual speech generation.

Audio Processing

XTTS V2 Argentinian Spanish

UNRN

ⓍTTS is a voice generation model that can clone a voice from just 6 seconds of audio and apply it to different languages. This version supports Argentinian-accented Spanish.

Audio Processing, Spanish

XTTSv2 Hi_ft

AOLCDROM

A Hindi text-to-speech model fine-tuned from a fork of Coqui TTS. It supports Hindi speech synthesis as well as English speech with a Hindi accent.

Audio Processing, Transformers, Multiple Languages

XTTS V2 Argentinian Spanish

marianbasti

ⓍTTS is a speech generation model that can clone a voice from just 6 seconds of audio and apply it to different languages, without needing hours of training data.

Audio Processing, Spanish
