Recently, Alibaba's TONGYI Lab officially launched FunAudio-ASR, its latest end-to-end large speech recognition model. The model's biggest highlight is its innovative context module, which significantly improves recognition accuracy in high-noise environments, cutting the hallucination rate from 78.5% to 10.7%, a drop of nearly 68 percentage points (roughly an 86% relative reduction). This breakthrough sets a new benchmark for the speech recognition industry and makes the model especially well suited to noisy settings such as meetings and public spaces.

The FunAudio-ASR model was trained on tens of millions of hours of audio data and integrates the semantic understanding capabilities of large language models, allowing it to outperform mainstream speech recognition systems such as Seed-ASR and KimiAudio-8B in challenging conditions, including far-field, noisy, and multi-speaker scenarios. For end users, this translates into clearer and more accurate transcription results.


In addition to the full version, Alibaba also released a lightweight variant called FunAudio-ASR-nano. This version maintains high recognition accuracy while reducing inference costs, making it suitable for resource-constrained deployment environments. Whether a large enterprise or a small team, users can find a version that fits their needs.


Currently, FunAudio-ASR is already in production use in DingTalk's "AI Note-taking" feature, video conferencing, and the DingTalk A1 hardware. Its API has also been officially launched on Alibaba Cloud's BaiLian platform, making it easy for developers to integrate. For enterprise users, this means they can leverage the technology to improve meeting efficiency and communication quality.
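To give a feel for what integration might look like, below is a minimal sketch of assembling a transcription request for an ASR service of this kind. Note that the model identifier, field names, and endpoint structure here are assumptions for illustration only, not the documented BaiLian API; consult the official platform documentation for the real interface.

```python
import json

def build_asr_request(audio_url: str, model: str = "fun-asr", language: str = "auto") -> str:
    """Assemble a JSON payload for a hypothetical file-transcription call.

    The schema below (model / input / parameters) is a common pattern for
    cloud ASR APIs, used here purely as an illustrative assumption.
    """
    payload = {
        "model": model,                       # hypothetical model identifier
        "input": {"file_url": audio_url},     # remote audio file to transcribe
        "parameters": {"language": language}, # e.g. auto-detect the language
    }
    return json.dumps(payload, ensure_ascii=False)

# Example: prepare a request for a meeting recording (URL is a placeholder).
request_body = build_asr_request("https://example.com/meeting.wav")
print(request_body)
```

An actual integration would POST this body, with an API key header, to the platform's transcription endpoint and then poll for or receive the recognized text.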

FunAudio-ASR not only marks a new breakthrough in speech recognition technology but also provides strong support for real-world applications, helping drive the broader adoption of AI.

Official introduction: https://mp.weixin.qq.com/s/7l5EPTU7cpz7GSN4RP91rg