Vidu, a leading player in China's AI video generation field, has recently announced a major upgrade to its Q1 model. The new "Reference-to-Video" feature allows users to upload up to seven reference images and generate 1080p videos with high visual consistency. The feature targets a long-standing bottleneck of AI video generation, namely keeping multiple scenes and multiple subjects consistent, and gives creators more flexibility and creative freedom.
Reference-to-Video: Unlock Complex Narratives with Seven Images
The "Reference-to-Video" feature is the core highlight of this Vidu Q1 update. Users can upload up to seven reference images covering elements such as people, scenes, and props, and combine them with a text prompt to generate a high-quality video. Vidu Q1 uses what the company describes as advanced semantic fusion technology to keep the elements drawn from the different images consistent in the video, avoiding issues common in traditional AI video generation such as scene breaks or character distortion.
For example, users can upload a photo of a person, a forest background, and an image of an animal, then enter the prompt: "A woman playing the guitar in a forest, an owl perched on a branch." Vidu Q1 can then generate a video that combines the guitar-playing action, the forest environment, and the owl, with highly realistic details such as clothing texture, background lighting, and animal movement. The feature gives animation, short-video, and advertising creators a powerful tool and significantly lowers the barrier to creating complex scenes.
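The workflow in this example can be sketched in a few lines of Python. This is only an illustration of the constraints stated in the article (up to seven reference images, one text prompt, 1080p output); the names `ReferenceToVideoRequest` and `build_request` and the file names are hypothetical and do not reflect Vidu's actual API.

```python
from dataclasses import dataclass
from typing import List

# Limit stated in the article; everything else here is a hypothetical sketch.
MAX_REFERENCE_IMAGES = 7


@dataclass
class ReferenceToVideoRequest:
    """Hypothetical container for one Reference-to-Video job."""
    prompt: str
    reference_images: List[str]
    resolution: str = "1080p"


def build_request(prompt: str, reference_images: List[str]) -> ReferenceToVideoRequest:
    """Validate the inputs and bundle them into a request object."""
    if not prompt.strip():
        raise ValueError("a text prompt is required")
    if not 1 <= len(reference_images) <= MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, "
            f"got {len(reference_images)}"
        )
    return ReferenceToVideoRequest(
        prompt=prompt.strip(),
        reference_images=list(reference_images),
    )


# The example from the article: a person, a forest background, and an animal.
request = build_request(
    "A woman playing the guitar in a forest, an owl perched on a branch.",
    ["woman.jpg", "forest.jpg", "owl.jpg"],
)
```

Checking the image count before submitting the job means an eighth image is rejected up front rather than after an upload.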
Multiple-Entity Consistency: Create a Cohesive Visual Experience
Multiple-Entity Consistency is one of Vidu Q1's core competitive advantages. Users can upload different types of reference images (such as characters, objects, and environments) to generate videos in which multiple entities interact, with each entity's characteristics remaining stable throughout the video. For example, when a user uploads a photo of a character, an image of a patterned outfit, and an image of a bicycle, Vidu Q1 can generate a smooth video in which the character wears the specified outfit and rides the bicycle, with details such as the outfit's pattern and the bicycle's design closely matching the reference images.