Stability AI has launched Stable Audio 2.5, its latest audio generation model, aimed at professional sound production. The model is designed to help creative teams quickly generate high-quality, customizable audio and to meet growing market demand for audio content.
The headline improvement in Stable Audio 2.5 is its ability to generate more complex, multi-part musical compositions with an introduction, development, and ending. Stability AI says the new model responds more accurately to mood prompts such as "inspiring" and understands style-specific prompts such as "rich synth sounds." Users can generate a three-minute track in a few seconds; on an Nvidia H100 GPU, generation takes less than two seconds.
The model's speed comes from its post-training method, Adversarial Relativistic-Contrastive (ARC), which was developed by the company's research team. In May this year, Stability AI also released a compact model for smartphones that uses the ARC method. That model, Stable Audio Open Small, can generate up to 11 seconds of stereo audio on a mobile device in only seven seconds.
The main new feature in Stable Audio 2.5 is audio inpainting. Users can upload their own audio files, select a starting point, and let the AI generate what follows to complete or extend the existing recording. Users can also generate music from text prompts. Uploaded files must be free of copyrighted material, and Stability AI says it enforces this with a content recognition system. Like earlier versions, Stable Audio 2.5 was trained on a licensed dataset and is considered commercially safe.
Stability AI hopes the technology will be used in areas such as advertising, retail, and sonic branding. It has partnered with Amp, WPP's sonic branding agency, to give large clients a consistent audio identity. Stability AI's audio team can also fine-tune the model on a company's own sound library to create a unique audio identity. Stable Audio 2.5 will be available to WPP's global customers through the WPP Open platform.
Since launching Stable Audio 2 in April 2024, Stability AI has been expanding its network of audio partners and working to strengthen its finances. In March of this year, WPP made an investment of undisclosed size in Stability AI. Meta, meanwhile, is also accelerating its audio research.
Key Points:
🎵 The new Stable Audio 2.5 model generates complex, multi-part musical works and can produce tracks up to three minutes long in seconds.
🖌️ Introduces audio inpainting, which lets users upload audio files and have the AI complete or extend their recordings.
🤝 Stability AI is partnering with WPP's agency Amp to give large clients a consistent brand audio identity.