Major Breakthrough in Mathematical AI Reasoning! We-Math 2.0 Builds a Full-Chain Knowledge System, Achieving a Significant Leap in Multimodal Learning Capabilities

AIbase

Published in AI News · 7 min read · Aug 29, 2025

Multimodal large models have made significant progress in areas such as image question answering and visual understanding, but they still show clear shortcomings in mathematical reasoning, which remains a core challenge. To address this, a joint research team from Beijing University of Posts and Telecommunications, Tencent WeChat, and Tsinghua University has released We-Math 2.0, a multimodal mathematical reasoning dataset and knowledge system.

The core of the new system is a systematic mathematical knowledge framework. It spans the range from primary school mathematics to advanced university mathematics and comprises 491 sub-knowledge points and 1,819 core knowledge principles, giving AI models a structured mathematical foundation to learn from.

Innovative Knowledge Architecture: Definition-Theorem-Application Integration

We-Math 2.0 organizes its content in a definition-theorem-application structure, so that mathematical concepts form a clearly linked network. This design mirrors how people learn mathematics and gives AI models structured reasoning paths. As a result, models can learn the connections between mathematical concepts rather than relying on simple pattern matching.

To address the uneven quality of existing open-source datasets, the research team built the MathBook-Standard dataset by hand, manually designing the questions and drawing the diagrams. The dataset pairs one question with multiple diagrams and one diagram with multiple questions, so each knowledge principle is covered from several angles, which increases the diversity and practical value of the data.
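The question-diagram pairing described above can be pictured as a many-to-many relation. The following is a minimal sketch of that idea only; the `Sample` record, its field names, and the example IDs are hypothetical and are not taken from the MathBook-Standard release.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    question: str   # question text or ID (hypothetical field)
    diagram: str    # diagram file or ID (hypothetical field)
    principle: str  # knowledge principle the sample targets


def index_samples(samples):
    """Group diagrams by question and questions by diagram."""
    diagrams_for = defaultdict(set)
    questions_for = defaultdict(set)
    for s in samples:
        diagrams_for[s.question].add(s.diagram)
        questions_for[s.diagram].add(s.question)
    return diagrams_for, questions_for


samples = [
    Sample("q1", "d1", "triangle angle sum"),
    Sample("q1", "d2", "triangle angle sum"),  # one question, multiple diagrams
    Sample("q2", "d1", "triangle angle sum"),  # one diagram, multiple questions
]
diagrams_for, questions_for = index_samples(samples)
print(sorted(diagrams_for["q1"]), sorted(questions_for["d1"]))
```

Indexing in both directions makes it easy to check how many distinct views each knowledge principle receives.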

Three-Dimensional Difficulty Modeling: Let AI Learn Gradually

Another key component of We-Math 2.0 is the MathBook-Pro module, which models the difficulty of multimodal math problems along three dimensions: reasoning step complexity, visual complexity, and contextual complexity. By systematically increasing difficulty along these dimensions, the research team expanded each basic question into eight samples at different difficulty levels.

This progressive difficulty design lets AI models build up their problem-solving ability gradually, starting from simple problems as human students do and working up to complex multimodal mathematical challenges. The approach is intended to improve the generalization capability of models.
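One way to arrive at eight samples per question, assuming each of the three dimensions is either left at its base level or raised by one step, is to enumerate all 2^3 combinations. The sketch below does exactly that. The binary-level assumption, the function name, and the field names are illustrative and are not confirmed by the article.

```python
from itertools import product

# The three difficulty dimensions named in the article.
DIMENSIONS = ("reasoning_steps", "visual", "contextual")


def expand_variants(question_id):
    """Enumerate every base/raised combination of the three dimensions."""
    variants = []
    for levels in product((0, 1), repeat=len(DIMENSIONS)):
        variants.append({
            "question_id": question_id,
            "levels": dict(zip(DIMENSIONS, levels)),
            "difficulty": sum(levels),  # number of raised dimensions
        })
    # Order from easiest to hardest for curriculum-style training.
    return sorted(variants, key=lambda v: v["difficulty"])


variants = expand_variants("q001")
for v in variants:
    print(v["difficulty"], v["levels"])
```

Under this assumption the first variant is the unmodified basic question and the last one raises all three dimensions at once.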

Hybrid Training Strategy: Dual-Drive of Supervised Learning and Reinforcement Learning

For training, We-Math 2.0 uses a hybrid strategy. The model is first fine-tuned with supervision on 1,000 high-quality samples to establish basic mathematical reasoning ability, and is then further optimized with reinforcement learning.

The system also uses a dynamic scheduling mechanism that adjusts the weighting and distribution of training data according to the types of errors the model makes. This adaptive approach is reported to improve training efficiency and effectiveness.
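The article does not give the actual scheduling rule. As a rough illustration of error-driven reweighting in general, the sketch below samples more training problems from categories where the model fails more often. The error categories, the smoothing scheme, and all function names are assumptions made for this example, not the method used in We-Math 2.0.

```python
import random


def update_weights(error_counts, attempt_counts, smoothing=1.0):
    """Return sampling weights proportional to each category's smoothed error rate."""
    rates = {
        cat: (error_counts.get(cat, 0) + smoothing)
        / (attempt_counts[cat] + 2 * smoothing)
        for cat in attempt_counts
    }
    total = sum(rates.values())
    return {cat: rate / total for cat, rate in rates.items()}


def sample_batch(pool, weights, batch_size, rng):
    """Draw categories by weight, then one random problem from each chosen category."""
    cats = list(weights)
    chosen = rng.choices(cats, weights=[weights[c] for c in cats], k=batch_size)
    return [rng.choice(pool[c]) for c in chosen]


# Hypothetical error categories and counts from one evaluation round.
attempts = {"knowledge": 100, "visual": 100, "context": 100}
errors = {"knowledge": 10, "visual": 40, "context": 20}
pool = {"knowledge": ["k1", "k2"], "visual": ["v1", "v2"], "context": ["c1", "c2"]}

weights = update_weights(errors, attempts)
batch = sample_batch(pool, weights, batch_size=4, rng=random.Random(0))
print(weights)
print(batch)
```

The smoothing term keeps every category at a nonzero weight, so problem types the model already handles well are still revisited occasionally.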

Experimental Validation: Significant Improvement in Multiple Metrics

Preliminary experimental results show that models trained with We-Math 2.0 improve significantly on multiple mainstream mathematical reasoning test sets. The results support the effectiveness of the new system and provide technical groundwork for the further development of multimodal mathematical AI.

AIbase Analysis: The release of We-Math 2.0 has both academic and practical value. From an academic perspective, the system provides a standardized dataset and evaluation framework for research on multimodal mathematical reasoning. From an application perspective, it is expected to promote deeper use of AI in fields such as mathematics education, scientific computing, and engineering.

By combining a systematic knowledge framework, a new difficulty modeling method, and a hybrid training strategy, We-Math 2.0 addresses core challenges facing current multimodal mathematical AI and lays a foundation for intelligent mathematics education and the automation of scientific research. The project marks an important step forward for AI in complex reasoning tasks.

With the open-source release of We-Math 2.0, more research teams are expected to build on it, further driving the development of multimodal mathematical AI technology.

Paper Address: https://arxiv.org/pdf/2508.10433




This article is from AIbase Daily
