Multimodal large models have made significant progress in areas such as image question answering and visual understanding, but mathematical reasoning remains a core challenge where they still fall clearly short. To address this gap, a joint research team from Beijing University of Posts and Telecommunications, Tencent WeChat, and Tsinghua University has released We-Math2.0, a multimodal mathematical reasoning dataset and knowledge system.

The core highlight of the new system is a systematic mathematical knowledge framework. It spans the full range from basic primary school mathematics to advanced university mathematics and comprises 491 sub-knowledge points and 1819 core knowledge principles, giving AI models a structured grounding in mathematics.


Innovative Knowledge Architecture: Definition-Theorem-Application Integration

We-Math2.0 organizes its knowledge in a definition-theorem-application structure, so that mathematical concepts form a clearly linked network. This design mirrors the way humans learn mathematics and gives AI models structured reasoning paths. As a result, models can better capture the connections between mathematical concepts rather than relying on simple pattern matching.
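To make the definition-theorem-application idea concrete, the following is a minimal sketch of how one knowledge point might be represented. The `KnowledgePoint` class, its field names, and the right-triangle example are all illustrative assumptions; they are not taken from the We-Math2.0 release.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgePoint:
    """One node in a hypothetical definition-theorem-application hierarchy."""
    name: str
    definitions: list[str] = field(default_factory=list)
    theorems: list[str] = field(default_factory=list)
    applications: list[str] = field(default_factory=list)

    def reasoning_path(self) -> list[tuple[str, str]]:
        """Return all principles ordered as definition -> theorem -> application."""
        stages = (
            ("definition", self.definitions),
            ("theorem", self.theorems),
            ("application", self.applications),
        )
        return [(stage, item) for stage, items in stages for item in items]


# Illustrative content only; not an entry from the actual knowledge system.
right_triangles = KnowledgePoint(
    name="Right triangles",
    definitions=["A right triangle has one 90-degree angle."],
    theorems=["Pythagorean theorem: a^2 + b^2 = c^2."],
    applications=["Find the hypotenuse given the two legs."],
)

for stage, principle in right_triangles.reasoning_path():
    print(f"{stage}: {principle}")
```

Keeping the three stages as separate fields makes the ordering explicit, so a problem can be tagged with the specific principle it exercises.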

To address the uneven quality of existing open-source datasets, the research team built the MathBook-Standard dataset from manually designed questions and diagrams. The dataset pairs one question with multiple diagrams and one diagram with multiple questions, so that each knowledge principle is covered from several angles, which increases the diversity and practicality of the data.

Three-Dimensional Difficulty Modeling: Let AI Learn Gradually

Another important innovation of We-Math2.0 is the MathBook-Pro module, which performs fine-grained three-dimensional difficulty modeling for multimodal math problems. By systematically increasing difficulty along three dimensions (reasoning step complexity, visual complexity, and contextual complexity), the research team expanded each basic question into eight samples at different difficulty levels.

This progressive difficulty design lets AI models start from simple problems and gradually build up their problem-solving ability, much as human students do, before tackling complex multimodal mathematical challenges. The approach is intended to improve the generalization capability of models.
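The count of eight samples is consistent with each of the three dimensions being either at its base level or raised, since 2^3 = 8. The sketch below enumerates those combinations and orders them from easiest to hardest. The binary on/off encoding, the dimension keys, and the `difficulty_variants` helper are assumptions made for illustration; the article does not spell out the exact scheme.

```python
from itertools import product

# Hypothetical keys for the three difficulty dimensions named in the article.
DIMENSIONS = ("reasoning_steps", "visual", "contextual")


def difficulty_variants():
    """Enumerate every base/raised combination, ordered easiest first.

    Assumes each dimension is binary: 0 = base level, 1 = raised.
    """
    variants = [
        dict(zip(DIMENSIONS, flags))
        for flags in product((0, 1), repeat=len(DIMENSIONS))
    ]
    # Order by how many dimensions are raised, giving a simple curriculum.
    variants.sort(key=lambda v: sum(v.values()))
    return variants


for level, variant in enumerate(difficulty_variants()):
    raised = [name for name, flag in variant.items() if flag] or ["none (base question)"]
    print(level, ", ".join(raised))
```

Under this assumption, the first entry is the unmodified base question and the last raises all three dimensions at once.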

Hybrid Training Strategy: Dual-Drive of Supervised Learning and Reinforcement Learning

For training, We-Math2.0 adopts a hybrid strategy. The system first performs supervised fine-tuning on 1,000 high-quality samples to establish basic mathematical reasoning ability, and then applies reinforcement learning for further optimization.

Notably, the system also implements a dynamic scheduling mechanism that adjusts the weights and distribution of training data according to the types of errors the model makes. This adaptive approach is reported to improve training efficiency and effectiveness.
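One simple way to picture error-driven scheduling is to sample more training problems from the categories where the model currently fails most often. The sketch below does this with error-rate-proportional weights. It is not the We-Math2.0 algorithm: the function names, the error categories, and the `floor` parameter are hypothetical and chosen only to illustrate the idea.

```python
import random


def update_weights(error_counts, seen_counts, floor=0.05):
    """Return sampling weights proportional to each category's error rate.

    `floor` (an assumed parameter) keeps mastered categories from
    disappearing from training entirely.
    """
    rates = {}
    for category, seen in seen_counts.items():
        # Categories never seen yet are treated as maximally uncertain.
        rate = error_counts.get(category, 0) / seen if seen else 1.0
        rates[category] = max(rate, floor)
    total = sum(rates.values())
    return {category: rate / total for category, rate in rates.items()}


def sample_batch(pool, weights, batch_size, rng):
    """Draw a batch, picking each problem's category according to `weights`."""
    categories = list(weights)
    chosen = rng.choices(
        categories, weights=[weights[c] for c in categories], k=batch_size
    )
    return [rng.choice(pool[c]) for c in chosen]


# Hypothetical error categories and placeholder problem IDs.
pool = {
    "knowledge": ["k1", "k2"],
    "visual": ["v1", "v2"],
    "contextual": ["c1", "c2"],
}
weights = update_weights(
    error_counts={"knowledge": 8, "visual": 2},
    seen_counts={"knowledge": 10, "visual": 10, "contextual": 10},
)
batch = sample_batch(pool, weights, batch_size=4, rng=random.Random(0))
```

Recomputing the weights after each evaluation round shifts the training distribution toward whatever the model currently gets wrong.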

Experimental Validation: Significant Improvement in Multiple Metrics

Preliminary experimental results show that models trained with We-Math2.0 achieve significant improvements on multiple mainstream mathematical reasoning benchmarks. These results support the effectiveness of the new system and provide technical groundwork for the development of multimodal mathematical AI.

AIbase Analysis: The release of We-Math2.0 has both academic and practical value. From an academic perspective, the system provides a standardized dataset and evaluation framework for research on multimodal mathematical reasoning; from an application perspective, it is expected to promote deeper use of AI in fields such as mathematics education, scientific computing, and engineering.

By combining a systematic knowledge framework, a new difficulty modeling method, and a hybrid training strategy, We-Math2.0 addresses core challenges faced by current multimodal mathematical AI and lays a foundation for intelligent mathematics education and the automation of scientific research. The project marks an important step forward for AI in complex reasoning tasks.

With We-Math2.0 released as open source, more research teams are expected to build on it, further driving the development of multimodal mathematical AI technology.

Paper: https://arxiv.org/pdf/2508.10433