Stability AI and Arm have jointly released a compact text-to-audio model named "Stable Audio Open Small." The model is optimized to run on mobile devices such as smartphones, where it can generate high-quality stereo audio clips of up to 11 seconds in roughly 7 seconds.
This breakthrough is based on the "Adversarial Relativistic-Contrastive" (ARC) technique developed with researchers at UC Berkeley. On high-end hardware such as Nvidia H100 GPUs, the model is even faster, generating 44 kHz stereo audio in just 75 milliseconds and approaching real-time audio synthesis.
Compared with the original Stable Audio Open released last year, which has 1.1 billion parameters, this streamlined version uses only 341 million parameters, significantly reducing computational requirements so it can run smoothly on consumer-grade hardware. It marks the first major result since Stability AI and Arm announced their collaboration in March of this year.
To achieve smartphone-level performance, the development team thoroughly revamped the model architecture, restructuring it into three core components: an autoencoder for compressing audio data, an embedding module for interpreting text prompts, and a diffusion model for generating the final audio output.
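The three-stage flow described above, from text prompt to audio waveform, can be sketched as a minimal toy pipeline. All component names, dimensions, and the denoising loop below are illustrative assumptions for clarity, not Stability AI's actual implementation:

```python
import numpy as np

LATENT_DIM = 64       # assumed latent channel count
EMBED_DIM = 128       # assumed text-embedding size
SAMPLE_RATE = 44_100  # output sample rate reported for the model
HOP = 1024            # assumed samples per latent frame

def embed_text(prompt: str) -> np.ndarray:
    """Stand-in for the embedding module: hashes tokens into a fixed vector."""
    vec = np.zeros(EMBED_DIM)
    for tok in prompt.lower().split():
        vec[hash(tok) % EMBED_DIM] += 1.0
    return vec

def diffusion_sample(cond: np.ndarray, n_frames: int, steps: int = 8) -> np.ndarray:
    """Toy denoising loop: starts from noise and iteratively refines the
    latent toward the text conditioning signal."""
    rng = np.random.default_rng(0)
    latent = rng.standard_normal((LATENT_DIM, n_frames))
    for _ in range(steps):
        latent = 0.9 * latent + 0.1 * cond[:LATENT_DIM, None]
    return latent

def decode_audio(latent: np.ndarray) -> np.ndarray:
    """Stand-in autoencoder decoder: upsamples latent frames to stereo audio."""
    mono = np.repeat(latent.mean(axis=0), HOP)
    return np.stack([mono, mono])  # shape (2, samples): stereo

def generate(prompt: str, seconds: float) -> np.ndarray:
    cond = embed_text(prompt)                        # 1. text -> embedding
    n_frames = int(seconds * SAMPLE_RATE / HOP)
    latent = diffusion_sample(cond, n_frames)        # 2. diffusion in latent space
    return decode_audio(latent)                      # 3. latent -> waveform

audio = generate("rain on a tin roof", seconds=2.0)
print(audio.shape)  # stereo buffer: (2, n_samples)
```

The key design point is that the diffusion model never touches raw samples: it works in the autoencoder's compressed latent space, which is what makes generation cheap enough for a phone.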
According to Stability AI, the model excels particularly in generating sound effects and field recordings but still has limitations in music generation, especially when handling vocals. At present, it primarily supports English prompt inputs.
The model was trained using approximately 472,000 audio clips from the Freesound database that comply with CC0, CC-BY, or CC-Sampling+ licensing terms. The development team conducted a series of automated checks to screen the training data, aiming to avoid potential copyright issues.
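A license pre-filter of the kind described could look like the following sketch. The metadata field names and the allow-list here are assumptions for illustration, not Freesound's actual schema or Stability AI's pipeline:

```python
# Licenses the article says were permitted in the training set.
ALLOWED_LICENSES = {"CC0", "CC-BY", "CC-Sampling+"}

def is_usable(clip: dict) -> bool:
    """Keep only clips whose declared license is on the allow-list."""
    return clip.get("license") in ALLOWED_LICENSES

# Hypothetical clip metadata records.
clips = [
    {"id": 1, "license": "CC0"},
    {"id": 2, "license": "CC-BY-NC"},    # non-commercial: excluded
    {"id": 3, "license": "CC-Sampling+"},
]

dataset = [c for c in clips if is_usable(c)]
print([c["id"] for c in dataset])  # -> [1, 3]
```

In practice such a check would be one of several automated passes (license fields, duplicate detection, and so on) rather than the whole screening process.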