Bidirectional Audio-Visual Separation: Tongyi Lab Releases PrismAudio to Let AI Understand Videos and Revoice Them
Alibaba's Tongyi Lab has launched PrismAudio, a framework that addresses audio-video desynchronization in AI video generation. It introduces a chain-of-thought mechanism: the model first analyzes the video content, then generates sound effects to match it, making the result more immersive. The research has been accepted to ICLR 2026.