ByteDance Releases Image Editing Model SeedEdit 3.0 with Further Improvement in Detail Preservation

AIbase

Published in AI News · 6 min read · Jun 6, 2025

On June 6, ByteDance's Seed team officially released the image editing model SeedEdit 3.0. The new version makes significant progress in preserving the image subject, handling background details, and following instructions, which greatly improves the usability and efficiency of image editing.

SeedEdit 3.0 is built on the text-to-image model Seedream 3.0. By introducing diverse data fusion methods and specialized reward models, it addresses the weaknesses of earlier image editing models in preserving the subject and background and in following instructions. The model can process and generate images at 4K resolution, editing the target regions finely while retaining the unedited regions with high fidelity. It performs especially well in complex scenarios such as portrait editing, background changes, and transformations of perspective and lighting.

For example, when asked to remove unwanted pedestrians from an image, SeedEdit 3.0 not only identifies and removes the irrelevant people accurately but also removes their shadows, which shows how well it handles detail. When converting a 2D illustration into a photo of a real-life model, it retains details such as clothing, accessories, and handbags, and produces an image with a street-style fashion look. SeedEdit 3.0 can also handle complex lighting transformations: it preserves details from nearby houses to distant water ripples and adjusts the rendering at the pixel level to match the change in light.

To achieve these capabilities, the Seed team proposed an efficient data fusion strategy and built several specialized reward models. By jointly training these reward models with the diffusion model, the team improved editing quality on key tasks such as face alignment and text rendering. SeedEdit 3.0 has also been optimized for inference speed and can complete inference within 10 seconds.

To evaluate SeedEdit 3.0, the team collected hundreds of real and synthetic test images and defined 23 categories of editing subtasks. These cover common operations such as stylization, addition, replacement, and deletion, as well as instruction-driven actions such as camera movement, object displacement, and scene transitions. Automated evaluation shows that SeedEdit 3.0 leads previous versions and comparable models in both edit preservation and instruction following. Human evaluation also indicates that image preservation is SeedEdit 3.0's most prominent strength, with a usability rate of 56.1%, a significant improvement over previous versions.
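The article does not say how the 56.1% figure is computed. As a rough illustration only, the sketch below assumes the rate is the share of edited outputs that human raters marked as usable, tallied overall and per subtask category. The function name, the category labels, and the sample ratings are all hypothetical and are not taken from the technical report.

```python
from collections import defaultdict


def usability_rate(ratings):
    """Compute overall and per-category rates from (category, is_usable) pairs.

    Assumption: a rating is a single human judgment that an edited image
    is usable (True) or not (False). This is not the report's definition.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")

    totals = defaultdict(int)  # number of rated images per category
    usable = defaultdict(int)  # number judged usable per category
    for category, is_usable in ratings:
        totals[category] += 1
        usable[category] += int(is_usable)

    per_category = {c: usable[c] / totals[c] for c in totals}
    overall = sum(usable.values()) / sum(totals.values())
    return overall, per_category


# Hypothetical ratings, invented purely for illustration.
sample = [
    ("stylization", True),
    ("stylization", False),
    ("deletion", True),
    ("deletion", True),
    ("camera movement", False),
]

overall, per_category = usability_rate(sample)
print(f"overall usability rate: {overall:.1%}")
for category, rate in per_category.items():
    print(f"{category}: {rate:.1%}")
```

Reporting the rate per category as well as overall makes it easier to see whether a single headline number hides weak subtasks.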

The release of SeedEdit 3.0 marks another important advance in AI image editing. The model combines several technical innovations with high practicality and efficiency in real-world use. The SeedEdit 3.0 technical report is now public, and the model is in testing on the Imdream web platform. Support in the DouBao app will follow soon. Users can try the model by uploading a reference image and entering a modification prompt.

Project homepage:

https://seed.bytedance.com/seededit

Technical report:

https://arxiv.org/pdf/2506.05083

How to try it:

Imdream web platform: Image generation > Upload reference image > Select 3.0 model > Input modification prompt (limited grayscale rollout);

DouBao app: AI image generation > Add reference image > Input modification prompt (coming soon).

This article is from AIbase Daily
