Recently, Meta AI, in collaboration with the University of California, San Diego (UCSD), introduced a new technique called Deep Think with Confidence (DeepConf), which helps enterprises cut computing costs while maintaining high accuracy on complex reasoning tasks with large language models (LLMs).
Today, improving LLM reasoning often relies on the self-consistency strategy of sampling many reasoning paths and taking a majority vote. But compute costs grow rapidly with the number of samples, and low-quality reasoning paths can still drag the vote toward a wrong answer. DeepConf's innovation is that it no longer treats all reasoning paths equally: it filters paths and weights their votes using the model's own internal confidence signals.
DeepConf introduces several fine-grained confidence metrics (sketched in code after this list), such as:
Group Confidence: the average token confidence over a sliding window of consecutive tokens in the reasoning trace;
Tail Confidence: the average confidence over the final tokens, where the answer is produced;
Lowest Group Confidence: the confidence of the most "fragile" segment of the reasoning path;
Bottom-10% Confidence: the average confidence over the least confident 10% of groups.
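For illustration, here is a minimal Python sketch of how these metrics can be computed from per-token confidence scores (following the paper's definition, a token's confidence is the negative mean log-probability of the top-k candidate tokens at that step). The window size and tail length are assumptions chosen for readability, not the paper's exact settings:

```python
import numpy as np

def group_confidences(token_conf, window=256):
    """Sliding-window group confidence: mean token confidence over each
    window of `window` consecutive tokens (window size is illustrative)."""
    tc = np.asarray(token_conf, dtype=float)
    if len(tc) < window:
        return np.array([tc.mean()])
    cs = np.concatenate(([0.0], np.cumsum(tc)))   # O(n) sliding means
    return (cs[window:] - cs[:-window]) / window

def tail_confidence(token_conf, tail=128):
    """Average confidence over the last `tail` tokens, where the final
    answer is typically produced."""
    return float(np.mean(token_conf[-tail:]))

def lowest_group_confidence(token_conf, window=256):
    """Confidence of the single weakest segment of the reasoning path."""
    return float(group_confidences(token_conf, window).min())

def bottom_10pct_confidence(token_conf, window=256):
    """Average confidence over the least confident 10% of groups."""
    g = np.sort(group_confidences(token_conf, window))
    k = max(1, int(round(len(g) * 0.10)))
    return float(g[:k].mean())
```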
DeepConf supports two execution modes (both sketched below):
Offline Thinking: first generate multiple complete reasoning paths, then keep the most confident ones for plain or confidence-weighted voting;
Online Thinking: evaluate confidence in real time during generation; if the current path's confidence falls below a threshold, it is terminated immediately to save resources.
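A hedged sketch of both modes, assuming each completed trace has already been scored with one of the metrics above; the filtering fraction, window size, and the `next_token_confidence` generator are hypothetical placeholders, not the paper's exact mechanism:

```python
from collections import defaultdict

def offline_vote(traces, keep_frac=0.5):
    """Offline mode: keep the most confident fraction of finished traces,
    then run a confidence-weighted majority vote over their answers.
    `traces` is a list of (answer, confidence) pairs; `keep_frac` is an
    illustrative filtering ratio."""
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_frac))]
    votes = defaultdict(float)
    for answer, conf in kept:
        votes[answer] += conf            # each vote weighted by confidence
    return max(votes, key=votes.get)

def online_trace(next_token_confidence, threshold, window=256, budget=4096):
    """Online mode: stream per-token confidences from a (hypothetical)
    generator and abort as soon as the sliding-window confidence drops
    below `threshold`, saving the rest of the token budget."""
    recent, used = [], 0
    for conf in next_token_confidence:
        used += 1
        recent.append(conf)
        if len(recent) > window:
            recent.pop(0)
        if len(recent) == window and sum(recent) / window < threshold:
            return "terminated", used    # low-confidence path, stop early
        if used >= budget:
            break
    return "completed", used
```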
Across several open-source models (DeepSeek-8B, Qwen3-32B, GPT-OSS-120B) and hard mathematical and STEM reasoning benchmarks (AIME, HMMT, BRUMO25, GPQA-Diamond), DeepConf shows impressive results:
In offline mode with GPT-OSS-120B, accuracy on AIME2025 reached 99.9% while generating 84.7% fewer tokens than standard majority voting;
In online mode, DeepSeek-8B gained 5.8 percentage points of accuracy on AIME24 while using 77.9% fewer tokens.
Enterprises can choose between two presets based on their risk preferences (a threshold sketch follows this list):
DeepConf-high (conservative mode): typically cuts generation cost by about 50% with almost no loss in accuracy, suitable for high-stakes domains such as finance and law;
DeepConf-low (aggressive mode): saves 70%–85% of tokens, suitable for speed-sensitive scenarios with looser error tolerance, such as draft answers and knowledge retrieval.
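One plausible way to realize the two presets, assuming the online stopping threshold is calibrated on a small warm-up set of completed traces; the percentile mapping reflects our reading of the paper's settings and is used here purely illustratively:

```python
import numpy as np

def stopping_threshold(warmup_confs, mode="high"):
    """DeepConf-high terminates only clearly weak traces (roughly keeping
    the top 90% of the warm-up confidence distribution); DeepConf-low
    filters aggressively (roughly the top 10%). Values are illustrative."""
    keep_top = 0.90 if mode == "high" else 0.10
    return float(np.percentile(warmup_confs, 100 * (1 - keep_top)))
```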
Using DeepConf does not require retraining the model; it only adds a small amount of logic at inference time. It is also broadly compatible and integrates seamlessly with existing inference frameworks such as vLLM, SGLang, and TensorRT-LLM (a vLLM sketch follows). As the researchers put it, this offers a "plug-and-play" efficient solution for deploying LLM reasoning workloads in real-world enterprise settings.
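To make the "plug-and-play" claim concrete, here is a minimal offline-mode sketch against vLLM's public Python API (`LLM` and `SamplingParams` with `logprobs` are real API surface; the model choice, placeholder prompt, and scoring details are assumptions, and extracting the final answer from the generated text is task-specific and omitted):

```python
from collections import defaultdict
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3-32B")    # one of the models from the article
params = SamplingParams(n=8, temperature=0.7, max_tokens=2048, logprobs=5)

out = llm.generate(["Solve: ..."], params)[0]   # placeholder prompt
votes = defaultdict(float)
for seq in out.outputs:
    # Token confidence = negative mean logprob of the top-k candidates at
    # each step; a whole-trace mean stands in for the group/tail variants
    # sketched earlier.
    conf = sum(
        -sum(lp.logprob for lp in step.values()) / len(step)
        for step in seq.logprobs
    ) / len(seq.logprobs)
    votes[seq.text.strip()] += conf   # confidence-weighted vote
print(max(votes, key=votes.get))
```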
Paper: https://arxiv.org/abs/2508.15260
Key Points:
🧠 Confidence-Guided Selection: DeepConf filters or ranks reasoning paths based on local confidence (group, tail, lowest point, etc.), rather than using a one-size-fits-all majority voting approach.
⏱ Significantly Improved Efficiency: Achieves up to 99.9% reasoning accuracy, while reducing the number of generated tokens by as much as 84.7%.
🎛️ Adjustable Strategy Modes: Enterprises can choose between "high security" and "high efficiency" modes based on their risk preferences, achieving strong results with minimal resources.