On August 28, Tencent Hunyuan announced the open-source release of HunyuanVideo-Foley, an end-to-end video sound-effect generation model. Given a video and a text description, the model produces film-grade sound effects that accurately match the visuals, a breakthrough for video creation that ends the limitation of AI-generated videos being "seen" but not "heard" and makes silent AI video a thing of the past.

HunyuanVideo-Foley addresses three major pain points of existing audio generation technology. First, it improves generalization by training on a large-scale, high-quality text-video-to-audio (TV2A) dataset, allowing it to adapt to diverse content such as people, animals, natural landscapes, and cartoons while generating audio that accurately matches the visuals. Second, it adopts a dual-stream multimodal diffusion transformer (MMDiT) architecture that balances text and video semantics, producing richly layered composite sound effects and avoiding the audio-scene mismatches caused by over-reliance on text semantics. Finally, it introduces a representation alignment (REPA) loss that improves the quality and stability of audio generation, achieving professional-level audio fidelity.
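To make the REPA idea concrete, here is a minimal, hypothetical PyTorch sketch of a representation-alignment loss: hidden states from an intermediate denoiser block are projected and pulled toward features from a frozen, pretrained audio encoder via cosine similarity. All names (`repa_loss`, `projector`, the tensor shapes) are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def repa_loss(diffusion_feats: torch.Tensor,
              target_feats: torch.Tensor,
              projector: torch.nn.Module) -> torch.Tensor:
    """Sketch of a REPA-style alignment term.

    diffusion_feats: (batch, seq_len, d_model) hidden states from an
                     intermediate block of the denoising network.
    target_feats:    (batch, seq_len, d_target) features from a frozen,
                     pretrained audio encoder (detached, no gradients).
    projector:       small MLP mapping d_model -> d_target, trained jointly.
    """
    projected = projector(diffusion_feats)                       # (B, T, d_target)
    # Negative cosine similarity, averaged over tokens and batch.
    sim = F.cosine_similarity(projected, target_feats.detach(), dim=-1)  # (B, T)
    return -sim.mean()
```

In this sketch, the total training objective would be the ordinary diffusion loss plus a weighted REPA term, e.g. `loss = diffusion_loss + lambda_repa * repa_loss(...)`, where `lambda_repa` is a hypothetical weighting hyperparameter.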

Across multiple authoritative evaluation benchmarks, HunyuanVideo-Foley leads the field: the audio quality metric PQ rises from 6.17 to 6.59, the visual-semantic alignment metric IB from 0.27 to 0.35, and the temporal alignment metric DeSync improves from 0.80 to 0.74 (lower is better), setting new state-of-the-art (SOTA) results. In subjective evaluations, the model scored above 4.1 (out of 5) in mean opinion score on all three dimensions of audio quality, semantic alignment, and temporal alignment, approaching professional production standards.

The open sourcing of HunyuanVideo-Foley provides the industry with a reusable technical paradigm, accelerating the application of multimodal AI in the content creation field. Short video creators can instantly generate contextual sound effects, film teams can quickly complete ambient sound design, and game developers can efficiently build immersive auditory experiences.

Currently, users can download the model on GitHub and Hugging Face, or try it directly on the Hunyuan official website; a minimal download sketch follows the links below.

  • Experience entry: https://hunyuan.tencent.com/video/zh?tabIndex=0

  • Project website: https://szczesnys.github.io/hunyuanvideo-foley/

  • Code: https://github.com/Tencent-Hunyuan/HunyuanVideo-Foley

  • Technical report: https://arxiv.org/abs/2508.16930

  • Hugging Face: https://huggingface.co/tencent/HunyuanVideo-Foley
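
For local use, the checkpoints can be fetched with the standard `huggingface_hub` client. The snippet below is a minimal sketch; the `local_dir` path is an arbitrary choice rather than anything the repository prescribes, and inference itself should follow the instructions in the GitHub README.

```python
# Download the released HunyuanVideo-Foley weights from Hugging Face.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="tencent/HunyuanVideo-Foley",  # official model repo listed above
    local_dir="./HunyuanVideo-Foley",      # arbitrary local target directory
)
print(f"Model files downloaded to: {local_dir}")
```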