Recently, Ant Group officially open-sourced dInfer, the industry's first high-performance inference framework for diffusion language models. The release marks a significant breakthrough in the inference speed of diffusion language models (dLLMs) and an important step toward practical deployment of this emerging technology.
In the latest benchmark tests, dInfer runs 10.7 times faster than NVIDIA's Fast-dLLM framework. On the HumanEval code generation benchmark, dInfer reached 1,011 tokens per second in a single inference run, the first time in the open-source community that a diffusion language model's inference speed has clearly surpassed that of traditional autoregressive models. This progress has raised expectations that diffusion language models could become an important technological path toward Artificial General Intelligence (AGI).

What sets diffusion language models apart is that they treat text generation as a denoising process: a complete sequence is gradually recovered from random noise, which gives them high parallelism, a global view of the sequence, and a flexible generation structure. Despite this theoretical potential, dLLMs have been held back in practical inference by high computational cost, the breakdown of KV caching, and the difficulty of efficient parallel decoding. These obstacles have kept diffusion language models from realizing their full potential, and breakthroughs are urgently needed.
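To make the contrast with token-by-token autoregressive decoding concrete, the toy sketch below imitates the mask-and-denoise loop such models use: start from a fully masked sequence, predict every position in parallel, and commit only the most confident predictions at each step. The model here is a random stand-in, not dInfer or any real dLLM, and the confidence-based commit rule and all names are illustrative assumptions.

```python
import torch

# Toy sketch of the mask-and-denoise generation loop used by diffusion
# language models. The "model" below returns random logits and is purely
# illustrative; a real dLLM predicts all masked positions in one forward pass.
MASK_ID = 0
VOCAB_SIZE = 32

def toy_model(tokens: torch.Tensor) -> torch.Tensor:
    # Stand-in for a dLLM forward pass: logits of shape (seq_len, vocab_size).
    return torch.randn(tokens.shape[0], VOCAB_SIZE)

def denoise_generate(seq_len: int = 16, steps: int = 8) -> torch.Tensor:
    # Start from pure "noise": every position holds the mask token.
    tokens = torch.full((seq_len,), MASK_ID, dtype=torch.long)
    for _ in range(steps):
        masked = tokens == MASK_ID
        if not masked.any():
            break
        logits = toy_model(tokens)              # predict every position in parallel
        logits[:, MASK_ID] = float("-inf")      # a real model never emits the mask token
        conf, pred = torch.softmax(logits, dim=-1).max(dim=-1)
        # Commit only the most confident masked positions; the rest stay masked
        # and are re-predicted ("denoised") in the next iteration.
        k = max(1, int(masked.sum()) // 2)
        conf = conf.masked_fill(~masked, -1.0)
        commit = conf.topk(k).indices
        tokens[commit] = pred[commit]
    return tokens

print(denoise_generate())
```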
To address these challenges, dInfer is designed specifically for diffusion language models and comprises four core modules: model access, the KV cache manager, the diffusion iteration manager, and the decoding strategy. This modular design lets developers combine and optimize each module flexibly, like assembling LEGO bricks, and evaluate the results in a standardized way on a unified platform.
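dInfer's actual interfaces are not reproduced here, so the sketch below only illustrates what such a four-module composition could look like in Python: each module sits behind a small interchangeable interface, and a pipeline wires them together, one denoising pass per step. All class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Protocol

# Hypothetical interfaces mirroring the four modules described above.
# These names do NOT come from dInfer's real API; they only show how a
# modular, LEGO-style pipeline can be assembled and swapped piece by piece.

class ModelAdapter(Protocol):          # "model access"
    def forward(self, tokens: Any, kv: Any) -> Any: ...

class KVCacheManager(Protocol):        # decides what cached state can be reused
    def refresh(self, tokens: Any) -> Any: ...

class IterationManager(Protocol):      # controls the diffusion iteration schedule
    def should_stop(self, step: int, tokens: Any) -> bool: ...

class DecodingStrategy(Protocol):      # turns logits into committed tokens
    def commit(self, logits: Any, tokens: Any) -> Any: ...

@dataclass
class DiffusionPipeline:
    model: ModelAdapter
    cache: KVCacheManager
    iterator: IterationManager
    decoder: DecodingStrategy

    def generate(self, tokens: Any, max_steps: int = 32) -> Any:
        for step in range(max_steps):
            kv = self.cache.refresh(tokens)            # reuse or rebuild cached states
            logits = self.model.forward(tokens, kv)    # one parallel denoising pass
            tokens = self.decoder.commit(logits, tokens)
            if self.iterator.should_stop(step, tokens):
                break
        return tokens
```

Under this kind of layout, swapping, say, the decoding strategy requires no change to the other three modules, which is the standardized-comparison benefit the modular design aims at.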
On a node equipped with eight NVIDIA H800 GPUs, dInfer performs exceptionally well. At comparable model quality, it reaches an average inference speed of 681 tokens per second, versus only 63.6 tokens per second for Fast-dLLM. It is also 2.5 times faster than the autoregressive model Qwen2.5-3B served on vLLM, the industry-leading inference framework.
Ant Group stated that the release of dInfer is an important step in connecting cutting-edge research with industrial applications. The company looks forward to working with developers and researchers around the world to explore the enormous potential of diffusion language models and to build a more efficient and open AI ecosystem.


