Alibaba Tongyi Launches Qwen3-ASR-Toolkit for a New Breakthrough in Audio and Video Transcription

AIbase基地

Published in AI News · 3 min read · Sep 24, 2025

The Tongyi Qwen team at Alibaba recently released Qwen3-ASR-Toolkit, an open-source Python command-line tool for audio and video transcription. Its main purpose is to work around the three-minute audio limit of the Qwen3-ASR-Flash API, so that recordings lasting hours can be transcribed quickly. This makes it useful for anyone who needs to transcribe audio at scale.

Qwen3-ASR-Flash is the latest speech recognition model in the Tongyi Qianwen series, trained on massive multimodal data together with tens of millions of hours of ASR data. Its high recognition accuracy allows long audio and video content to be transcribed into text reliably, which saves considerable manual effort.

Qwen3-ASR-Toolkit uses voice activity detection (VAD) to split audio at natural pauses, so that sentences stay intact during transcription. It also automatically resamples audio of any sampling rate to 16 kHz mono before processing. In addition, segments are uploaded in parallel across multiple threads, which significantly reduces total processing time.
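The idea of cutting long audio at pauses and sending the pieces in parallel can be illustrated with a minimal sketch. This is not the toolkit's actual code: `plan_chunks`, `transcribe_all`, and the `transcribe_chunk` callable are hypothetical names, the pause timestamps are assumed to come from a separate VAD step, and a simple greedy rule stands in for whatever splitting strategy the toolkit really uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

# The three-minute API limit mentioned in the article, in seconds.
MAX_CHUNK_SECONDS = 180.0

Chunk = Tuple[float, float]  # (start, end) in seconds


def plan_chunks(duration: float, pauses: Sequence[float],
                max_len: float = MAX_CHUNK_SECONDS) -> List[Chunk]:
    """Greedily cut [0, duration] at pause timestamps so no chunk exceeds max_len.

    `pauses` is assumed to be a list of timestamps where a VAD step found silence.
    """
    cuts = sorted(p for p in pauses if 0.0 < p < duration)
    chunks: List[Chunk] = []
    start = 0.0
    while duration - start > max_len:
        limit = start + max_len
        # Latest pause that still keeps this chunk within the limit.
        candidates = [c for c in cuts if start < c <= limit]
        # Fall back to a hard cut when no pause is available.
        end = candidates[-1] if candidates else limit
        chunks.append((start, end))
        start = end
    chunks.append((start, duration))
    return chunks


def transcribe_all(chunks: Sequence[Chunk],
                   transcribe_chunk: Callable[[Chunk], str],
                   workers: int = 4) -> str:
    """Run the (hypothetical) per-chunk transcriber in a thread pool.

    pool.map returns results in input order, so the text is reassembled
    in the original sequence even if requests finish out of order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(transcribe_chunk, chunks))
    return " ".join(t.strip() for t in texts if t.strip())


if __name__ == "__main__":
    plan = plan_chunks(400.0, [100.0, 170.0, 200.0, 340.0])
    print(plan)
    # A stand-in transcriber that just labels each chunk.
    print(transcribe_all(plan, lambda c: f"[{c[0]:.0f}-{c[1]:.0f}]"))
```

Threads suit this workload because each request spends most of its time waiting on the network rather than on the CPU.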

Qwen3-ASR-Toolkit relies on FFmpeg for media handling, so it supports nearly all mainstream audio and video formats, including mp4, mov, mkv, mp3, wav, and m4a. Users can transcribe files as they are, without worrying about format compatibility.
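As a rough illustration of the FFmpeg step, the sketch below builds a command that converts any supported input into 16 kHz mono WAV. The function name `build_ffmpeg_cmd` and the file names are hypothetical, and the exact options the toolkit passes to FFmpeg are not stated in the article. The flags themselves are standard FFmpeg options.

```python
from typing import List


def build_ffmpeg_cmd(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Return an FFmpeg argument list that writes `src` as mono PCM WAV.

    This only builds the command. To run it, pass the list to
    subprocess.run(cmd, check=True) on a machine with FFmpeg installed.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", src,                # input: any container FFmpeg can read
        "-vn",                    # drop the video stream, keep audio only
        "-ac", "1",               # downmix to one channel (mono)
        "-ar", str(sample_rate),  # resample to the target rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        dst,
    ]


if __name__ == "__main__":
    print(" ".join(build_ffmpeg_cmd("talk.mp4", "talk.wav")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.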

GitHub: https://github.com/QwenLM/Qwen3-ASR-Toolkit

Key Points:
📌 Alibaba's Tongyi team launches Qwen3-ASR-Toolkit, which removes the three-minute duration limit on audio transcription and supports recordings that run for hours.
🎤 The tool is built on the latest Qwen3-ASR-Flash model, which provides high-accuracy speech recognition.
💻 It supports a wide range of audio and video formats through FFmpeg, so users can transcribe files without converting them first.


This article is from AIbase Daily



