Cohere, which has been active in the enterprise AI market, officially launched an open-source speech recognition model called
This model has 2 billion parameters and is designed for edge devices, aiming to overcome the latency bottleneck caused by the large size of earlier speech models. By open-sourcing it under the Apache 2.0 license, Cohere is following Meta's path: relying on the developer community to build out the ecosystem quickly, with the expectation that this will eventually feed back into commercial returns.
A Performance Powerhouse on the Edge: Supports 14 Languages and Outperforms Mainstream Competitors
Thanks to its smaller parameter count, the model can be deployed directly on end devices such as smartphones, PCs, or industrial gateways, without frequent calls to cloud computing resources. This not only greatly reduces data transmission latency but also offers a more secure option for privacy-sensitive industries such as banking, sales, and healthcare.
Strategic Expansion from Text to Speech: Rebuilding the Foundation of Intelligent Agent Interaction
Although Cohere has long focused on text generation, its move into speech recognition is seen as a key step toward building a comprehensive AI agent. The company announced that
Analysts point out that as Siri-style voice interaction becomes an entry point for AI, speech capabilities have become the essential "ears" through which intelligent agents perceive the world. With this "small but powerful" open-source strategy, Cohere is competing head-on in the edge computing and real-time speech translation market with IBM, Alibaba, and Zoom, which launched AI Companion 3.0.

