In AI speech technology, balancing generality against accuracy has long been a challenge for the industry. On April 20, Alibaba's Tongyi Lab officially announced the release of Fun-ASR1.5, a large speech recognition model. Built on a unified large-model architecture, it is said to achieve a breakthrough in multilingual recognition, regional dialects, and complex contexts.
According to the announcement, Fun-ASR1.5's "hearing" is broad in scope. It covers 30 major languages worldwide and is also closely adapted to the seven major Chinese dialects and more than 20 local accents. The model is also reported to perform strongly in the domain of traditional culture: even for recitations of ancient poetry, with their fluctuating tones and unusual sentence structures, it can deliver high-precision real-time transcription.
Fun-ASR1.5 is now officially available on the Alibaba Cloud BaiLian platform. Tongyi Lab stated that the model will be offered through API services to customers in industries such as education, media, finance, technology, and culture, supporting their move toward intelligent office work and content production.




