Tencent's Hunyuan team has officially released its latest video generation model, HunyuanVideo1.5, marking another significant step forward in video generation technology. This lightweight model, built on the Diffusion Transformer (DiT) architecture, has 8.3B parameters and can generate high-definition videos of 5 to 10 seconds. It is now available on Tencent's "Yuanbao" platform for users to try.

HunyuanVideo1.5 supports multiple generation modes. Users can generate videos from text descriptions (prompts), or combine an uploaded image with text to turn a static image into a dynamic video. The model accepts both Chinese and English input and maintains consistency between the input image and the generated video, matching the original image in tone, lighting, scene, subject, and detail.

In practical use, the model can generate complex scenes from prompts. For example, given a prompt describing a miniature English garden growing inside a suitcase, the model can render this process accurately, demonstrating strong instruction understanding and adherence. In addition, HunyuanVideo1.5 supports both realistic and animated styles and can render Chinese and English text within videos, greatly expanding the possibilities for content creation.

Technically, HunyuanVideo1.5 adopts an SSTA sparse attention mechanism that significantly improves inference efficiency, combined with a multi-stage progressive training strategy, achieving commercial-grade performance on key dimensions such as motion coherence and semantic adherence. The deployment barrier has also been lowered substantially: the model runs smoothly on a consumer-grade graphics card with just 14 GB of VRAM, allowing individual developers and creators to participate in video generation.
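A quick back-of-envelope calculation puts the 14 GB figure in context. For an 8.3B-parameter model, the weights alone occupy roughly 16.6 GB at half precision, so fitting within 14 GB presumably relies on techniques such as CPU offloading or quantization; that is an inference on our part, not an official detail. A minimal sketch of the arithmetic:

```python
# Back-of-envelope: weight-only memory for an 8.3B-parameter model at
# common precisions. Actual VRAM use also includes activations, the text
# encoder, and the VAE, and can be reduced by offloading or quantization
# (assumptions for illustration; not official figures).

PARAMS = 8.3e9          # 8.3 billion parameters
BYTES_PER_GB = 1e9      # decimal gigabytes

def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the model weights alone, in GB."""
    return params * bytes_per_param / BYTES_PER_GB

for precision, width in [("fp32", 4), ("fp16/bf16", 2), ("int8/fp8", 1)]:
    print(f"{precision:>9}: {weight_memory_gb(PARAMS, width):5.1f} GB")
```

At fp16/bf16 this prints about 16.6 GB, which is why an 8-bit representation (about 8.3 GB for weights) or partial offloading makes a 14 GB card plausible.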

Previous open-source state-of-the-art flagship models in video generation have typically exceeded 20B parameters and required over 50 GB of GPU memory. HunyuanVideo1.5 not only delivers a qualitative leap in generation quality but also strikes a balance between performance and model size. The model has been uploaded to Hugging Face and GitHub, and developers are welcome to download and try it.
With the release of HunyuanVideo1.5, Tencent further strengthens its position in artificial intelligence and video generation, giving content creators more powerful tools and broader creative possibilities. As the technology continues to mature, the application scenarios for video generation will only widen, and HunyuanVideo1.5 may well bring fresh changes to the industry.
