Pioneering Diffusion Thought Chain: Making AI More Creative and Flexible

AIbase基地

Published in AI News · 4 min read · May 27, 2025

In recent artificial intelligence research, the concept of a "chain of thought" has drawn increasing attention, especially in the training and reasoning of large language models. Professor Guojun Qi's team at the MAPLE Lab at Westlake University has now proposed a "diffusive divergent chain of thought," a new reasoning method designed specifically for diffusion language models.

Traditional large language models usually adopt a linear chain of thought, reasoning step by step toward an answer. Human thinking, however, is often more complex: it is non-linear and proceeds in leaps. Professor Qi's team believes that mimicking this divergent thinking can improve a model's creativity and problem-solving ability.

The core of the diffusive divergent chain of thought is that the model may generate intermediate results in any order during reasoning, without having to follow conventional grammatical structure or remain readable. This lets the model explore more diverse thinking paths and produce more creative and flexible answers. The method has been applied successfully to several diffusion language models, where it performs particularly well on mathematical reasoning and code generation tasks compared with existing models.

In the implementation, the team optimizes the entire generation process with reinforcement learning. The model starts from a fully masked sequence that carries no information, progressively generates key information, and arrives at the final answer during the denoising process. Unlike a traditional chain of thought, the diffusive chain of thought can use intermediate generated content to improve the accuracy of the final answer.
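The decoding loop described above can be pictured with a toy sketch. This is not the team's implementation and does not use the LLaDOU code: `toy_predict` and `toy_unmask_scores` are hypothetical stand-ins (random stubs) for a diffusion language model and a learned unmasking policy, and the vocabulary is invented for illustration. It only shows the shape of the process: start from an all-mask sequence and reveal positions in whatever order the scores suggest, rather than left to right.

```python
import random

MASK = "<mask>"
VOCAB = ["2", "+", "3", "=", "5"]  # toy vocabulary, purely illustrative


def toy_predict(sequence, position, rng):
    """Hypothetical stand-in for a diffusion language model's token prediction."""
    return rng.choice(VOCAB)


def toy_unmask_scores(sequence, rng):
    """Hypothetical stand-in for an unmasking policy: a score per masked position."""
    return {i: rng.random() for i, tok in enumerate(sequence) if tok == MASK}


def denoise(length=8, per_step=2, seed=0):
    """Start from an all-mask sequence and reveal tokens in arbitrary order."""
    rng = random.Random(seed)
    sequence = [MASK] * length
    trajectory = [list(sequence)]  # keep every intermediate state
    while MASK in sequence:
        scores = toy_unmask_scores(sequence, rng)
        # Reveal the highest-scoring masked positions, wherever they sit.
        chosen = sorted(scores, key=scores.get, reverse=True)[:per_step]
        for position in chosen:
            sequence[position] = toy_predict(sequence, position, rng)
        trajectory.append(list(sequence))
    return trajectory


if __name__ == "__main__":
    for step, state in enumerate(denoise()):
        print(step, " ".join(state))
```

In a reinforcement learning setup of the kind the article mentions, the whole trajectory (which positions to reveal and which tokens to place) would presumably be treated as a sequence of actions and rewarded according to the final answer; the random stubs here do no learning at all.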

The research team's findings indicate that the diffusive divergent chain of thought not only strengthens the model's reasoning ability but also offers useful insights for future model training. With diffusion language models such as Google's recently released Gemini Diffusion attracting attention, the approach may have broader application potential. The team expects the diffusive chain of thought to become a standard training procedure for diffusion language models.

arXiv link: https://arxiv.org/abs/2505.10446

GitHub link: https://github.com/maple-research-lab/LLaDOU

Moonshot AI Releases and Open-Sources Kimi K2 Model, Strong in Code and Agentic Tasks

Moonshot AI has officially released its latest model, Kimi K2, and announced that it is open source. Built on an MoE architecture, the foundation model has drawn wide attention in the AI field since its release, thanks to its strong coding capabilities and its handling of general agent tasks. Kimi K2 has 1T total parameters, of which 32B are activated. It achieved top performance among open-source models on benchmarks such as SWE Bench Verified, Tau2, and AceBench.

Tencent Hunyuan-A13B Model API Launches

Tencent Cloud has officially launched the API service for the Tencent Hunyuan-A13B model on its website. Input is priced at 0.5 yuan per million tokens and output at 2 yuan per million tokens, which quickly sparked lively discussion in the developer community. Described as the industry's first open-source 13B-level MoE (Mixture of Experts) hybrid reasoning model, Hunyuan-A13B has 80B total parameters with only 13B activated, achieving performance comparable to leading open-source models of the same architecture while remaining efficient at inference.
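As a rough illustration of what the quoted prices mean in practice, the sketch below estimates the cost of a request from its token counts. It uses only the two per-million-token prices stated above; the function name and the example token counts are assumptions made for illustration, and no Tencent Cloud API is called.

```python
from decimal import Decimal

# Prices quoted in the article, in yuan per million tokens.
INPUT_YUAN_PER_MILLION = Decimal("0.5")
OUTPUT_YUAN_PER_MILLION = Decimal("2")


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated cost in yuan for the given token counts."""
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * INPUT_YUAN_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_YUAN_PER_MILLION
    return (input_cost + output_cost) / million


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    print(estimate_cost(200_000, 50_000), "yuan")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.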

AI Daily: Zhipu Launches PPT Generation Function AI Slides; Ke Ling AI Releases Ketur 2.1 Model

1. Zhipu launches free AI Slides for PPT generation.
2. Keling AI introduces KeTu 2.1 with 180 styles.
3. NVIDIA's DiffusionRenderer enables 3D scene editing.
4. Modao AI offers 30-second prototype generation.
5. Higgsfield creates avatars from 10 photos.
6. Google open-sources GenAI Processors.
7. Google Veo 3 adds image-to-video.
8. Mistral AI releases Devstral 2507 for code generation...

Mistral AI Releases Devstral 2507: Designed for Code-Centric Language Modeling

Mistral AI launched the Devstral 2507 series with two AI models: the open-source Devstral Small 1.1 (24 billion parameters, SWE-Bench score of 53.6%) and the enterprise version Devstral Medium 2507 (score of 61.6%). Small 1.1 supports a 128k context window and local deployment, while Medium 2507 outperforms some commercial models. Both are optimized for code reasoning and program synthesis, and support integration with agent frameworks.


This article is from AIbase Daily

AI News Recommendations

Microsoft BioEmu Model Dramatically Shortens Protein Simulation Time

City Commercial Banks Are Launching a Trend of Large Model Bidding, with Million-Level Investments Becoming a New Industry Opportunity!

Personification of Large AI Models: Grok 4 and Empathy with Musk?

Kling AI Releases KTu 2.1 Model: Significant Improvement in Image Generation Capabilities, Supports 180 Styles

Keling AI Launches Keltu 2.1 Model, Will Be Free for All Members for 7 Days

vivo New Multimodal Model Launches! AI's Ability to Understand GUI Interfaces is Upgraded Again!