On August 15, 2025, Kunlun Wanwei (Kunlun Tech) announced the official launch of its Mureka V7.5 model for AI music creation. The announcement closed out the company's SkyWork AI Technology Launch Week. Over the five-day event, the company released a new model every day, covering core multimodal AI scenarios: SkyReels-A3, Matrix-Game 2.0, Matrix-3D, Skywork UniPic 2.0, and Skywork Deep Research Agent.

Mureka V7.5 was the highlight of the launch week, and its strongest showing is in Chinese song creation. According to the company, the model improves markedly on vocal timbre, performance technique, articulation, and emotional expression. Drawing on a deep understanding of Chinese musical styles and elements, Mureka V7.5 aims to convey the artistic character and emotional color of Chinese music across a wide range of styles, including traditional folk songs, opera, classic Chinese pop hits, and contemporary folk music.

To further improve the authenticity and emotional depth of vocal performances, the team optimized the Automatic Speech Recognition (ASR) technology used with Mureka V7.5. Beyond accurately identifying lyrics, the ASR component analyzes fine-grained aspects of real singing: breath usage, emotional fluctuations, and vocal detail. By dividing the audio into phrases and locating natural breathing and pause positions, it improves the clarity and structural realism of generated vocal segments. These details are fed back to the generation model, which makes the vocals sound more natural, with more realistic breath and emotional expression and less of a mechanical feel. The result is AI-performed songs that are more fluid and closer to human singing, particularly given the distinctive rhythms and breath requirements of Chinese songs.
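The phrase-division idea can be illustrated with a small sketch. This is not Mureka's pipeline, whose details are not public; it only assumes that an ASR system has already produced per-word timestamps, and it treats any silence longer than a threshold as a breath or pause boundary. The `Word` type, the `split_phrases` function, and the 0.35-second threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """A recognized word (or character) with timestamps in seconds."""
    text: str
    start: float
    end: float


def split_phrases(words: List[Word], min_pause: float = 0.35) -> List[List[Word]]:
    """Group timestamped words into phrases.

    A new phrase starts whenever the silence between the previous word's
    end and the next word's start is at least `min_pause` seconds.
    The threshold is an illustrative assumption, not a published value.
    """
    phrases: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        if current and word.start - current[-1].end >= min_pause:
            phrases.append(current)
            current = []
        current.append(word)
    if current:
        phrases.append(current)
    return phrases


# Two sung lines separated by a one-second breath (made-up timestamps).
lyrics = [
    Word("明月", 0.0, 0.5), Word("几时", 0.5, 1.0), Word("有", 1.0, 1.5),
    Word("把酒", 2.5, 3.0), Word("问", 3.0, 3.5), Word("青天", 3.5, 4.0),
]
phrases = split_phrases(lyrics)
phrase_texts = ["".join(w.text for w in phrase) for phrase in phrases]
print(phrase_texts)
```

A production system would presumably rely on acoustic cues such as energy and pitch rather than timestamps alone; the sketch only shows how pause positions can define phrase boundaries.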


At the same time, Kunlun Wanwei's speech team introduced MoE-TTS, which it describes as the first description-based Text-to-Speech (TTS) framework built on a Mixture of Experts (MoE) architecture. Aimed at open-ended description scenarios, MoE-TTS lets users control voice characteristics and style through natural-language descriptions. The team reports that, even when trained only on open-source data, it can match or surpass closed-source commercial products in how well the generated voice aligns with the described character. MoE-TTS targets a long-standing problem in descriptive TTS: when a description uses metaphors, analogies, or other complex figures of speech, the generated speech often deviates from what the user intended. The framework combines the text understanding of a pre-trained large language model (LLM) with dedicated speech expert modules, so that each modality is optimized independently without interfering with the other, which the team characterizes as generalization with "zero knowledge loss." On two test sets, one covering in-domain and one covering out-of-domain descriptions, MoE-TTS performed well on style expressiveness and overall alignment, showing its advantage in matching complex descriptions.
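The idea of decoupling modalities while keeping the pre-trained text knowledge frozen can be shown with a toy sketch. It is not the MoE-TTS architecture: the experts here are single scalar linear functions, the routing is a plain lookup on a modality tag, and every name (`Expert`, `ModalityRoutedLayer`, `sgd_step`) is hypothetical. It only demonstrates the property described above: text tokens always go through a frozen expert, so training on speech tokens cannot change how text is processed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Expert:
    """A toy expert: y = weight * x + bias."""
    weight: float
    bias: float
    frozen: bool = False

    def forward(self, x: float) -> float:
        return self.weight * x + self.bias


class ModalityRoutedLayer:
    """Routes each token to the expert for its modality (assumed design)."""

    def __init__(self) -> None:
        self.experts = {
            # Stands in for pre-trained LLM parameters: never updated.
            "text": Expert(weight=2.0, bias=0.0, frozen=True),
            # Newly added speech expert: trainable.
            "speech": Expert(weight=1.0, bias=0.0),
        }

    def forward(self, tokens: List[Tuple[str, float]]) -> List[float]:
        return [self.experts[modality].forward(x) for modality, x in tokens]

    def sgd_step(self, tokens: List[Tuple[str, float]],
                 targets: List[float], lr: float = 0.1) -> None:
        """One squared-error gradient step; frozen experts are skipped."""
        for (modality, x), target in zip(tokens, targets):
            expert = self.experts[modality]
            if expert.frozen:
                continue
            error = expert.forward(x) - target
            expert.weight -= lr * 2 * error * x
            expert.bias -= lr * 2 * error


layer = ModalityRoutedLayer()
tokens = [("text", 1.0), ("speech", 1.0)]
before = layer.forward(tokens)
layer.sgd_step(tokens, targets=[0.0, 3.0])
after = layer.forward(tokens)
print(before, after)
```

After the update, the text output is identical to what it was before, while the speech output has moved toward its target. In a real model the experts would be full feed-forward blocks inside a Transformer, but the freezing logic follows the same pattern.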

The release of MoE-TTS gives the academic community a reproducible solution for open-description TTS and shows the potential of combining modality decoupling with frozen-knowledge transfer in speech synthesis. The team expects this approach to help move the industry from "closed-label control" toward "free-form natural language control," improving the experience of digital humans, virtual assistants, and immersive content creation. MoE-TTS is still under active iteration. Future plans include integrating it into the Mureka-Speech platform as a base model for character dubbing, giving developers and creators worldwide an open, efficient, and customizable descriptive TTS capability.

With the release of Mureka V7.5 and MoE-TTS, Kunlun Wanwei has shown its technical strength and capacity for innovation in AI music creation and speech synthesis. Both releases open up new possibilities for creators and suggest new directions for research and development in related fields. Users worldwide can try the new V7.5 model at www.mureka.ai.