Challenging Conventional Wisdom: Ant Group and Renmin University to Launch the Industry's First Native MoE Diffusion Language Model at the 2025 Bund Conference
Ant Group and Renmin University have developed LLaDA-MoE, a native MoE-based diffusion language model trained on 20T of data, demonstrating the scalability and stability of the approach. It outperforms LLaDA 1.0/1.5 and Dream-7B, and rivals autoregressive models while offering faster inference. The model will be open-sourced soon.