Alibaba Cloud has announced that Wan2.2, its highly anticipated video generation AI model, will be officially released soon. As the successor to Wan2.1, Wan2.2 is expected to bring significant improvements in performance, efficiency, and functionality. It continues Alibaba's open-source AI strategy and is intended to strengthen the company's position in the global AI video generation field. Following the successful open-sourcing of Wan2.1 in February 2025, the upcoming release of Wan2.2 has sparked lively discussion within the developer community and the industry.

Wan2.2: Technical Upgrades, Performance Breakthroughs

Wan2.1, built on a spatiotemporal variational autoencoder (VAE) and a diffusion transformer (DiT) architecture, scored 84.7% on the VBench benchmark, ahead of OpenAI's Sora (84.28%). According to social media discussions, Wan2.2 is expected to further optimize these technologies, significantly improving video generation speed and quality, especially for high-resolution output (such as 1080p) and long videos. Expected new features include:

  • Text-to-Video (T2V): Supports higher resolutions (such as 1080p and 4K) and longer video generation, with reduced generation time.
  • Image-to-Video (I2V): Improves the smoothness and realism of dynamic scenes, supporting more complex actions and scene transitions.
  • Video-to-Audio (V2A): Enhances the ability to generate matching audio from video content, improving the multimodal creation experience.
  • Multi-language and Style Expansion: Supports text effect generation in more languages and adds diverse artistic style templates, such as cyberpunk and realistic animation.
  • Hardware Optimization: Further reduces hardware requirements, with the T2V-1.3B model expected to run on devices with less memory (such as 6 GB), expanding the user base.
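
To put the memory figure in context, the sketch below estimates how much memory the model weights alone would occupy at common numeric precisions. It assumes the parameter counts can be read directly from the variant names (1.3 billion and 14 billion); the helper `weight_memory_gib` is hypothetical and not part of any Wan release. It ignores activations and any other components loaded alongside the main model, so actual requirements will be higher.

```python
# Rough estimate of weight storage only; activations and any auxiliary
# components are NOT included, so real memory use will be higher.

# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Parameter counts are assumed from the variant names, not measured.
    for name, params in [("T2V-1.3B", 1.3e9), ("T2V-14B", 14e9)]:
        for dtype in ("fp32", "fp16"):
            gib = weight_memory_gib(params, dtype)
            print(f"{name} {dtype}: {gib:.1f} GiB")
```

Under these assumptions, the 1.3B weights take roughly 2.4 GiB at 16-bit precision, which leaves headroom on a 6 GB device, while the 14B weights take roughly 26 GiB at the same precision.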

The training data for Wan2.2 is expected to build on and expand the Wan2.1 dataset (1.5 billion videos and 10 billion images), with improved data selection to enhance the diversity and realism of generated content.

Wan2.2 will continue to use the Apache 2.0 license, with code and model weights available free of charge through Alibaba Cloud's ModelScope and Hugging Face for both academic research and commercial use. Wan2.1 was released in four variants: T2V-1.3B, T2V-14B, I2V-14B-720P, and I2V-14B-480P. Wan2.2 is expected to add more variants, optimized for different hardware and use cases.
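
The Wan2.1 variant names appear to follow a task-size-resolution pattern. As a minimal sketch, the snippet below splits a name into those parts. The `Variant` type and `parse_variant` function are hypothetical helpers written for illustration, not part of the Wan codebase, and there is no guarantee that Wan2.2 will keep the same naming scheme.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    task: str                  # e.g. "T2V" (text-to-video) or "I2V" (image-to-video)
    size: str                  # parameter count label, e.g. "1.3B" or "14B"
    resolution: Optional[str]  # e.g. "720P"; None when the name has no resolution part


def parse_variant(name: str) -> Variant:
    """Split a name such as 'I2V-14B-720P' into task, size, and resolution."""
    parts = name.split("-")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected variant name: {name!r}")
    resolution = parts[2] if len(parts) == 3 else None
    return Variant(task=parts[0], size=parts[1], resolution=resolution)


if __name__ == "__main__":
    for name in ["T2V-1.3B", "T2V-14B", "I2V-14B-720P", "I2V-14B-480P"]:
        print(parse_variant(name))
```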


Developers are eagerly anticipating the open-source release of Wan2.2, which they believe will further challenge the market dominance of closed models such as OpenAI's Sora and help democratize AI video generation technology. Alibaba's move not only lowers the technical barrier to entry but also gives developers around the world more room to innovate.