DeepSeek Releases V3.2-exp Model: Pioneering Sparse Attention Mechanism Significantly Reduces AI Inference Costs

AIbase

Published in AI News · 4 min read · Sep 30, 2025

The research team at artificial intelligence company DeepSeek announced on Monday that it has released an experimental model, V3.2-exp, designed to significantly reduce the cost of long-context operations through a "sparse attention" mechanism. The model was released simultaneously on Hugging Face and GitHub, accompanied by a detailed academic paper.

At the core of the model is DeepSeek Sparse Attention, a mechanism with two parts. First, a module called the "lightning indexer" prioritizes specific excerpts within the context window. Second, a separate "fine-grained token selection system" picks key tokens from those prioritized excerpts and loads them into a limited attention window. Together, the two components let the model process long contexts with a lower server load.
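The two-stage idea can be illustrated with a toy example. The sketch below is not DeepSeek's implementation: it assumes the indexer is a plain dot product over small "index" vectors and that selection is a simple top-k over individual tokens, and every function and variable name is hypothetical. It only shows the general pattern of scoring the context cheaply, keeping a few tokens, and running ordinary softmax attention over that subset.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def index_scores(query_idx, keys_idx):
    """Stage 1 (stand-in for the indexer): one cheap relevance score per context token."""
    return [dot(query_idx, key_idx) for key_idx in keys_idx]


def select_top_k(scores, k):
    """Stage 2 (stand-in for token selection): positions of the k highest scores, in order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def sparse_attention(query, keys, values, query_idx, keys_idx, k):
    """Softmax attention restricted to the k tokens chosen by the two stages above."""
    selected = select_top_k(index_scores(query_idx, keys_idx), k)
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, keys[i]) * scale for i in selected])
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, selected)) for d in range(dim)]
    return output, selected


# Toy data (assumed for illustration): four context tokens, two-dimensional keys and values.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
keys_idx = [[0.9], [0.1], [0.8], [-0.5]]  # small vectors used only for indexing
query, query_idx = [1.0, 0.0], [1.0]

output, selected = sparse_attention(query, keys, values, query_idx, keys_idx, k=2)
print(selected)  # [0, 2]
print(output)    # [1.0, 0.5]
```

Only the two selected tokens take part in the full attention computation; the other two are never scored against the full query vector, which is where the savings would come from at long context lengths.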

In preliminary tests, the new model showed significant advantages: DeepSeek reported that the cost of a simple API call in long-context operations can be cut by as much as half. Third-party testing is still needed to verify these claims, but because the model is open-weight and freely available on Hugging Face, independent evaluations should follow soon.
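A back-of-the-envelope count suggests why capping the number of attended tokens matters more as contexts grow. The sketch below counts query-key pairs for causal dense attention versus attention limited to at most k tokens per query. The window size of 2,048 and the context lengths are illustrative assumptions, not figures from DeepSeek, and the count ignores the indexer's own work and all non-attention compute, so it does not translate directly into the reported price reduction.

```python
def dense_pairs(context_len):
    """Query-key pairs scored by causal dense attention: the t-th token attends to t tokens."""
    return context_len * (context_len + 1) // 2


def sparse_pairs(context_len, k):
    """Pairs scored when each token attends to at most k selected tokens."""
    if context_len <= k:
        return dense_pairs(context_len)
    # The first k tokens behave densely; every later token attends to exactly k tokens.
    return dense_pairs(k) + (context_len - k) * k


# Hypothetical window size and context lengths, chosen only for illustration.
for n in (4_096, 32_768, 131_072):
    ratio = sparse_pairs(n, 2_048) / dense_pairs(n)
    print(f"{n:>7} tokens: sparse/dense pair ratio = {ratio:.3f}")
```

The dense count grows quadratically with context length while the capped count grows linearly, so the ratio shrinks as the context gets longer.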

DeepSeek's release is one of a series of recent efforts to address AI inference costs, meaning the server costs of running a trained AI model as opposed to the cost of training it. Unlike the R1 model, which focused on reducing training costs, the new model aims to make the basic Transformer architecture more efficient to operate, offering a more economical path to wider adoption of AI applications.

DeepSeek has been a focal point of this year's AI boom. Its earlier R1 model drew attention for its low-cost reinforcement learning training method. The sparse attention method may not generate as much excitement as R1, but it could offer useful lessons for AI providers worldwide and help reduce the operating costs of AI services.

Brave Search Upgrade: AI Q&A Features Provide a More Detailed Search Experience

Brave has introduced a new AI search feature that offers more detailed answers alongside the original concise responses, complementing the AI Answers function. The feature requires no mode switching and automatically optimizes the search experience for users. It currently has over 15 million daily active users.

GLM 4.6 by Zhipu is Here: Domestic Chips Join Forces to Drive AI Advancement

Zhipu has launched the GLM-4.6 model, which runs on domestic chips from Cambricon and achieves FP8+Int4 hybrid quantization deployment for the first time. This significantly reduces inference costs while maintaining model accuracy, opening a new path for running large models locally on domestic chips.

Accenture Lays Off Over 11,000 Employees, Fully Shifts to Artificial Intelligence

Accenture recently laid off over 11,000 employees, reducing its total headcount from 791,000 to 779,000. The company warned that further layoffs may follow if employees cannot adapt to AI demands. This round of layoffs is part of an $865 million restructuring plan that is expected to continue until November. CEO Julie Sweet emphasized the necessity of the transformation.

Volcano Engine Launches Doubao Large Model 1.6-Vision, Achieving Major Breakthroughs in Visual Understanding

Volcano Engine has launched Doubao Large Model 1.6-Vision, which it says achieves breakthroughs in visual understanding. The model's core highlight is its ability to call tools. Optimized algorithms and reinforcement learning significantly improve the accuracy and processing speed of image recognition and object detection.

Doubao Large Model 1.6-vision is officially released, with a comprehensive cost reduction of about 50% compared to the previous generation

Volcano Engine has released Doubao Large Model 1.6-vision, the first visual deep-thinking model in the family with tool invocation capabilities. It improves multimodal understanding and reasoning and supports the Responses API. Its core advantages include precise visual understanding through tool invocation, the ability to bring images into the chain of thought, and support for image operations such as positioning, cropping, and selection.

Stanford Top Scientist Xu Zuhong Joins Alibaba Tongyi

AI expert Xu Zuhong has joined the Alibaba Tongyi team, where he will be responsible for developing multimodal interaction models, a move that has drawn attention from the technology community. An IEEE Fellow with over 20 years of AI experience, he previously served as a tenured professor at Singapore Management University and as an associate professor at Nanyang Technological University. The hire is seen as an important strategic step for Alibaba in AI.

This article is from AIbase Daily

AI News Recommendations

Brave Search Upgrade: AI Q&A Features Provide a More Detailed Search Experience

AI Interviewer Startup Alex Secures $17 Million in Series A Funding: Completes Thousands of Interviews Daily, Investors Favor the Trend of Recruitment Automation

GLM 4.6 by Zhipu is Here: Domestic Chips Join Forces to Drive AI Advancement

Accenture Lays Off Over 11,000 Employees, Fully Shifts to Artificial Intelligence

Opera Launches AI-Powered Neon Browser for a New Intelligent Browsing Experience

Volcano Engine Launches Doubao Large Model 1.6-Vision, Achieving Major Breakthroughs in Visual Understanding

Doubao Large Model 1.6-vision is officially released, with a comprehensive cost reduction of about 50% compared to the previous generation

Stanford Top Scientist Xu Zuhong Joins Alibaba Tongyi

AI Daily: DeepSeek Releases V3.2-exp Model; Claude Sonnet 4.5 Released; ChatGPT Launches Instant Checkout Feature

Seko User Count Exceeds 100,000: AI Creation Enters a New One-Stop Stage
