At the recent ACL 2025 awards ceremony, a research paper co-authored by Dr. Wenfeng Liang of DeepSeek, together with Peking University and other institutions, won the Best Paper Award. The conference was unprecedented in scale: submissions nearly doubled to 8,360 papers, making the competition especially intense.
The paper introduces a new mechanism called Native Sparse Attention (NSA), which speeds up long-text processing by up to 11 times through joint optimization of the algorithm and the underlying hardware. More encouragingly, the speedup does not come at the cost of quality: the model's performance not only holds up but even surpasses that of traditional full-attention models. With this technique, the research team extended the context length to an impressive 1 million tokens, laying the groundwork for future advanced models.
The core of NSA is a dynamic hierarchical sparsity strategy built on three parallel attention branches that together capture the important information in a text: "compressed attention" summarizes global information, "selected attention" focuses on the most important token blocks, and "sliding-window attention" preserves the integrity of the local context. The design is flexible, is deeply optimized for modern GPU hardware, and is natively trainable end to end. A minimal sketch of how the three branch outputs might be combined is shown below.
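To make the three-branch design concrete, here is a minimal, hypothetical PyTorch sketch of how the outputs of the compressed, selected, and sliding-window branches could be combined through learned gates. The function name, tensor shapes, and gating details are illustrative assumptions for this article, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def nsa_combine(q, branches, gate_logits):
    """Hypothetical sketch of NSA's gated combination of three attention branches.

    q:           (batch, heads, q_len, d)  query states
    branches:    list of three (k, v) pairs, one per branch:
                 compressed (global summary) tokens, selected important
                 blocks, and the local sliding window; each k/v has shape
                 (batch, heads, n_branch, d), with n_branch much smaller
                 than the full sequence length.
    gate_logits: (batch, heads, q_len, 3)  learned per-branch gate scores.
    """
    gates = torch.sigmoid(gate_logits)              # one gate in [0, 1] per branch
    out = 0.0
    for i, (k, v) in enumerate(branches):
        # standard scaled dot-product attention within a single branch
        branch_out = F.scaled_dot_product_attention(q, k, v)
        out = out + gates[..., i:i + 1] * branch_out  # gate-weighted sum
    return out

# Toy usage: one decoding step over three short branch caches.
B, H, L, D = 1, 4, 1, 64
q = torch.randn(B, H, L, D)
branches = [(torch.randn(B, H, n, D), torch.randn(B, H, n, D)) for n in (32, 128, 256)]
gate_logits = torch.randn(B, H, L, 3)
out = nsa_combine(q, branches, gate_logits)          # shape (1, 4, 1, 64)
```

The key point the sketch illustrates is that each branch attends over a much shorter set of keys and values than the full sequence, and a small learned gate decides how much each branch contributes per query, which is where the speedup over full attention comes from.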
In testing on 64k-token sequences, NSA sped up decoding by 11.6 times, and the forward and backward passes by 9 times and 6 times respectively. More importantly, NSA held up across benchmarks: a 27B-parameter model outperformed the full-attention baseline on 7 of 9 evaluations, with clear advantages on complex tasks such as multi-hop question answering and code understanding.
This research opens up new possibilities for long-text processing, delivering gains in both speed and accuracy and pointing to broad application prospects for the NSA mechanism in AI.
Paper URL: https://arxiv.org/pdf/2502.11089