Alibaba Cloud continues to invest in the AIGC open-source ecosystem. Today, Tongyi Lab officially released its latest image editing model, Qwen-Image-Edit-2511, which targets the "slight drift" issue in the previous version (2509), where people or objects in the edited area would shift position. Through multiple technical optimizations, the new model significantly improves consistency and visual stability between the input and the edited result, giving developers a more reliable and precisely controllable generation tool.
Addressing Pain Points: Say Goodbye to "Getting Worse with Each Edit"
In the earlier Qwen-Image-Edit-2509, users reported that local modifications (such as changing clothes, adjusting hairstyles, or replacing backgrounds) often caused subtle but noticeable displacement or deformation of the target object, disrupting the overall harmony of the image. Qwen-Image-Edit-2511 strengthens its spatial alignment and structure preservation mechanisms so that edits affect only the specified areas while the rest of the image remains unchanged, delivering precise control: what you imagine is exactly what you get.
Technical Upgrades: Consistency at the Core, Balancing Generation Quality
The new version delivers key improvements in the following areas:
- Structural Consistency Optimization: Introduces an improved reference attention mechanism to strengthen the geometric structure constraints of the original image (a toy sketch of this idea appears after this list);
- Detail Fidelity Enhancement: Retains texture, lighting, and edge sharpness during pixel-level repair;
- Instruction-Image Alignment Enhancement: Better understands complex editing instructions (e.g., "put a red beret on the lady, with the hat in a natural position").
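To make the reference-attention idea concrete, below is a toy PyTorch sketch of such a block: keys and values computed from the original-image tokens are concatenated into the self-attention of the edited latent, so generation stays anchored to the source geometry. This is an illustrative reconstruction of the general technique, not the released model's actual implementation; the module name, shapes, and hyperparameters are hypothetical.

```python
# Toy sketch of a "reference attention" block: self-attention over the edited
# latent tokens is augmented with keys/values from the reference (original-image)
# tokens, so generation stays anchored to the source structure.
# Illustrative only -- not the released model's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReferenceAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        # x:   (batch, n_tokens, dim)  tokens of the latent being edited
        # ref: (batch, m_tokens, dim)  tokens of the original-image latent
        q = self.to_q(x)
        # Keys/values are drawn from BOTH streams, so every edited token can
        # attend to the untouched source and preserve its geometry.
        kv_input = torch.cat([x, ref], dim=1)
        k = self.to_k(kv_input)
        v = self.to_v(kv_input)

        def split_heads(t: torch.Tensor) -> torch.Tensor:
            b, n, d = t.shape
            return t.view(b, n, self.num_heads, d // self.num_heads).transpose(1, 2)

        out = F.scaled_dot_product_attention(split_heads(q), split_heads(k), split_heads(v))
        b, h, n, dh = out.shape
        out = out.transpose(1, 2).reshape(b, n, h * dh)
        return self.to_out(out)
```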
Open-Source Empowerment: Advancing the Maturity of the AIGC Toolchain
Qwen-Image-Edit-2511 is released with open model weights and inference code, supports editing via text instructions or mask images, and can be widely applied to scenarios such as e-commerce outfit changes, film post-production, design prototype iteration, and social media photo editing. Developers can build high-precision image editing applications on top of this model without training from scratch.
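As a minimal usage sketch with Hugging Face diffusers, the snippet below loads the weights and applies a text instruction. The repository id "Qwen/Qwen-Image-Edit-2511", the automatic pipeline resolution via DiffusionPipeline, and the call arguments (image, prompt) are assumptions based on how comparable image-editing pipelines are commonly used; consult the official model card and inference code for the exact class and parameters.

```python
# Minimal instruction-based editing sketch with diffusers.
# The repo id, pipeline resolution, and call arguments are assumptions;
# check the official Qwen-Image-Edit-2511 model card for the exact API.
import torch
from PIL import Image
from diffusers import DiffusionPipeline

# Load the released weights; DiffusionPipeline resolves the concrete pipeline
# class from the model's config (assumed Hugging Face repo id shown here).
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2511",  # assumed repo id
    torch_dtype=torch.bfloat16,
).to("cuda")

source = Image.open("lady.png").convert("RGB")  # hypothetical input image

# A text instruction matching the example from the announcement.
instruction = "Put a red beret on the lady, with the hat in a natural position."

# Argument names follow common image-editing pipelines in diffusers and may differ.
edited = pipe(image=source, prompt=instruction).images[0]
edited.save("lady_red_beret.png")
```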