The teams of Shuicheng Yan and Mingming Cheng have released MDTv2, which speeds up training of DiT, the core component of Sora, and sets a new record on the ImageNet benchmark. By introducing the Masked Diffusion Transformer, they address the difficulty diffusion models have in learning semantic relationships. MDTv2 makes significant progress in both training speed and generation quality, giving it a strong performance advantage.