Microsoft's AI division released its first in-house AI models on Thursday: MAI-Voice-1 and MAI-1-preview. The move marks an important step toward self-developed models for Microsoft, one that could reduce its reliance on external providers and lay the foundation for future Copilot products.
MAI-Voice-1: A New Breakthrough in Fast Voice Generation
MAI-Voice-1 is a model focused on voice generation, and its main selling points are efficiency and low cost. Microsoft stated that the model can generate one minute of audio in under a second on a single GPU. It already powers some existing Microsoft features, such as Copilot Daily, an AI host that presents news headlines in a podcast style.
Users can now try MAI-Voice-1 in Copilot Labs by entering text and adjusting the voice and style to explore its voice generation capabilities.
MAI-1-preview: The Future Form of Copilot
The MAI-1-preview model, released at the same time, was reportedly trained on approximately 15,000 Nvidia H100 GPUs. It is designed to follow instructions and provide helpful responses to everyday queries.
Mustafa Suleyman, head of Microsoft's AI division, said the company's internal AI models will focus on consumer use cases, leveraging Microsoft's strengths in advertising and consumer data to build a true AI companion for consumers. Going forward, MAI-1-preview will be rolled out in parts of the Copilot AI assistant, supplementing or replacing the current reliance on OpenAI's large language models. The model is also already available for public testing on the AI benchmarking platform LMArena.
In a blog post, Microsoft's AI division stated that its ambitions do not stop there: the team plans to orchestrate a series of specialized models serving different user intents and use cases to unlock greater value.