Kyutai Labs open-sources Kyutai TTS: real-time, low-latency speech synthesis technology

AIbase基地

Published in AI News · 4 min read · Jul 4, 2025

On July 3, the French artificial intelligence research institute Kyutai Labs announced the release of its latest text-to-speech (TTS) technology, Kyutai TTS, offering developers and AI enthusiasts an efficient, real-time solution for speech generation. Kyutai TTS stands out for its low latency and high-fidelity sound. It supports streaming text input, so it can begin generating audio before the full text is available, which makes it especially well suited to real-time interaction scenarios.
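To make the streaming-text idea concrete, here is a minimal client-side sketch in Python. It is not Kyutai's API: the function name `complete_words` and the idea of forwarding whole words are assumptions made for illustration. It shows how text arriving in arbitrary fragments (for example, from a language model) could be handed onward word by word, without waiting for the full sentence.

```python
from typing import Iterable, Iterator


def complete_words(fragments: Iterable[str]) -> Iterator[str]:
    """Yield each word as soon as it is complete in a stream of text fragments.

    Hypothetical helper: a real client would forward each yielded word to a
    streaming TTS endpoint instead of collecting the words in a list.
    """
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        # Everything before the last space consists of finished words;
        # the final piece may still be growing, so it stays in the buffer.
        *finished, buffer = buffer.split(" ")
        for word in finished:
            if word:  # skip empty strings produced by repeated spaces
                yield word
    if buffer:
        # The stream has ended, so the remaining piece is a complete word.
        yield buffer


if __name__ == "__main__":
    for word in complete_words(["Hel", "lo wor", "ld, how", " are you?"]):
        print(word)
```

This sketch only splits on spaces, so newlines and tabs would need extra handling in practice.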

Kyutai TTS delivers strong performance. On a single NVIDIA L40S GPU, the model can serve 32 requests simultaneously with a latency of only 350 milliseconds. In addition to generating high-quality audio, the system can emit precise timestamps for each word, which supports real-time subtitle generation and interactive applications such as the interruption-handling feature on the Unmute platform.
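As an illustration of how per-word timestamps could feed subtitle generation, the Python sketch below groups words into SubRip (SRT) cues. The input shape, a list of `(text, start_seconds, end_seconds)` tuples, and the names `format_timestamp` and `words_to_srt` are assumptions for this example. The article does not describe the format in which Kyutai TTS actually emits its timestamps.

```python
from typing import List, Tuple

# Hypothetical word record: (text, start time in seconds, end time in seconds).
Word = Tuple[str, float, float]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group timestamped words into SRT cues of at most max_words words."""
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        start = group[0][1]   # start time of the first word in the cue
        end = group[-1][2]    # end time of the last word in the cue
        text = " ".join(word[0] for word in group)
        index = len(cues) + 1  # SRT cue numbers start at 1
        cues.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [("Hello", 0.0, 0.4), ("world", 0.5, 0.9), ("again", 1.0, 1.5)]
    print(words_to_srt(sample, max_words=2))
```

In a live setting the same grouping could run incrementally, emitting a cue whenever `max_words` words have arrived.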

In terms of language support and quality evaluation, Kyutai TTS currently supports English and French, with word error rates (WER) of 2.82 and 3.29 respectively, indicating high accuracy. Speaker similarity reaches 77.1% (English) and 78.7% (French), so the generated voice stays natural and close to the original samples. The model can also handle long articles, going beyond the 30-second limit common in traditional TTS systems, which makes it well suited to long-form content such as news or books.
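For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, insertions, and deletions) between a reference transcript and a hypothesis, divided by the number of reference words. For TTS, the hypothesis is typically obtained by transcribing the generated audio with a speech recognizer. The Python sketch below shows that standard definition only. It is not Kyutai's evaluation pipeline, and it applies no text normalization, which real evaluations usually do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing takes i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

WER is often reported multiplied by 100. The article does not state the unit of its figures, so this sketch returns the raw ratio.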

Kyutai TTS uses a delayed streams modeling (DSM) architecture, combined with a Rust server for efficient batching. The code and model weights are now openly available on GitHub and Hugging Face, helping developers around the world drive innovation in speech technology.

This article is from AIbase Daily
