d1
Improving the reasoning capabilities of diffusion large language models using reinforcement learning.
Common Product · Productivity · Reasoning · Reinforcement Learning
d1 improves the reasoning capabilities of diffusion large language models through reinforcement learning and masked self-supervised fine-tuning on high-quality reasoning trajectories. Its value lies in optimizing the model's reasoning process and reducing computational cost while keeping the learning dynamics stable. It is suited to users who want to work more efficiently on writing and reasoning tasks.