2024-09-19 09:56:02, AIbase
Tencent Launches EzAudio AI: Transforming Text into Realistic Audio in Seconds
Johns Hopkins University and Tencent AI Lab recently released EzAudio, a new text-to-audio generation model. The technology promises efficient, high-quality conversion of text prompts into audio, a notable advance for AI audio technology. Instead of working on traditional spectrograms, EzAudio operates in the latent space of audio waveforms. This lets it work at high temporal resolution without an additional neural vocoder to turn spectrograms back into sound. The architecture of EzAudio is referred to as EzAu.