Microsoft Launches MAI-Transcribe-1, the World's Most Accurate Speech-to-Text Model
Microsoft has launched MAI-Transcribe-1, a new speech-to-text model that achieved an average word error rate (WER) of just 3.9% across 25 languages, making it the most accurate transcription model in the world. The model performed strongly on the FLEURS benchmark, with particularly good results in 11 core languages, including English. It is the third product in Microsoft's MAI series, following the release of a speech synthesis model and an image generation model.
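
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not Microsoft's evaluation pipeline, the function name `word_error_rate` is hypothetical, and real benchmark runs typically normalize casing and punctuation first, which this sketch does not do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative only: splits on whitespace and applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

By this definition, a 3.9% WER corresponds to about 4 word errors for every 100 words in the reference transcript.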