Tencent ARC Open-Sources AudioStory: Generating Long Audio with Large Language Models

AIbase基地

Published in AI News · 4 min read · Sep 1, 2025

Recently, the Tencent ARC team released AudioStory, a model that uses large language models (LLMs) to generate long narrative audio. Existing text-to-audio generation technology handles short audio well but struggles with temporal coherence and compositional reasoning in long narrative audio, and AudioStory is designed to address those challenges.

The core of AudioStory lies in its unified understanding and generation framework. The model can handle various tasks such as video dubbing, audio continuation, and long narrative audio synthesis. By combining large language models with audio generation systems, AudioStory can generate structured and temporally coherent audio narratives. The model has strong instruction-following and reasoning capabilities: it can break a complex narrative query into subtasks arranged in chronological order while keeping scene transitions continuous and the emotional tone consistent.
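To make the idea of chronologically ordered subtasks concrete, here is a minimal Python sketch of what such a plan might look like as data. It is an illustration only and not AudioStory's actual code or API: the `AudioEvent` class, the `schedule` and `tone_changes` helpers, and the hand-written example plan are all hypothetical, and in the real system the decomposition would come from the language model rather than being typed in.

```python
from dataclasses import dataclass


@dataclass
class AudioEvent:
    """One subtask in a narrative plan (hypothetical structure)."""
    caption: str       # what should be heard in this segment
    duration_s: float  # target length of the segment in seconds
    tone: str          # emotional tone label for the segment


def schedule(events: list[AudioEvent]) -> list[tuple[float, float, AudioEvent]]:
    """Lay events end to end and return (start, end, event) triples."""
    timeline = []
    cursor = 0.0
    for event in events:
        start, end = cursor, cursor + event.duration_s
        timeline.append((start, end, event))
        cursor = end
    return timeline


def tone_changes(events: list[AudioEvent]) -> list[int]:
    """Return the indices where the tone differs from the previous event."""
    return [i for i in range(1, len(events))
            if events[i].tone != events[i - 1].tone]


# A hand-written plan standing in for what a model might produce (assumption).
plan = [
    AudioEvent("footsteps on a wooden floor", 8.0, "tense"),
    AudioEvent("a door creaks open", 4.0, "tense"),
    AudioEvent("a cat darts out and a chase begins", 12.0, "playful"),
]

timeline = schedule(plan)
for start, end, event in timeline:
    print(f"{start:5.1f}s to {end:5.1f}s  [{event.tone}]  {event.caption}")
```

Each entry would then be handed to an audio generator in order; the `tone_changes` helper flags the transitions where a shift in emotional tone would need particular care.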

AudioStory has two notable features. First, a decoupled bridging mechanism divides the collaboration between the large language model and the audio generator into two specialized parts. Second, an end-to-end training approach unifies instruction understanding and audio generation, strengthening the synergy between components.

In addition, the research team has established a benchmark dataset called AudioStory-10K, covering diverse domains such as animated soundscapes and natural sound narratives. In extensive experiments, AudioStory outperforms previous text-to-audio generation models in both single-audio generation and narrative audio generation, in terms of both instruction following and audio quality.

The team has released the model's inference code along with a series of demonstration videos, including a dubbing example for the classic animation "Tom and Jerry" and examples of long audio generated from text, which illustrate the range of tasks the model covers.

Project: https://github.com/TencentARC/AudioStory

Key Points:
🎧 **AudioStory is a long-form narrative audio generation model developed by Tencent ARC, combining large language models and audio generation technology.**
📊 **The model has strong instruction-following capabilities and can generate coherent audio narratives, improving user experience.**
🛠️ **The team has released inference code and demonstrated multiple application cases, showcasing its advantages in video dubbing and long audio generation.**

Tags: AudioStory, Tencent ARC Team, Large Language Model, Long Narrative Audio

This article is from AIbase Daily



