On March 24, 2026, Meituan's LongCat team open-sourced LongCat-Flash-Prover, a model designed specifically for mathematical formalization and theorem proving. It targets the weakness of large language models in rigorous logical reasoning by decomposing formal reasoning into three atomic capabilities: auto-formalization, sketching, and proving. The stated goal is a paradigm shift from "probabilistic prediction of answers" to "rigorous logical proofs."
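To make the three capabilities concrete, here is a minimal Lean 4 sketch for the informal claim "the sum of two even natural numbers is even." It is an illustration only: the theorem names, the choice of problem, and the use of Mathlib are assumptions, not taken from the LongCat-Flash-Prover report, and the model's actual input and output formats may differ.

```lean
import Mathlib

-- 1. Auto-formalization: the informal claim becomes a formal statement.
--    The proof is deliberately left open at this stage.
theorem sum_even_statement (a b : ℕ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  sorry

-- 2. Sketching: the proof is outlined as intermediate steps,
--    with the hard step still marked `sorry`.
theorem sum_even_sketch (a b : ℕ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨m, hm⟩ := ha   -- hm : a = m + m
  obtain ⟨n, hn⟩ := hb   -- hn : b = n + n
  have h : a + b = (m + n) + (m + n) := by sorry
  exact ⟨m + n, h⟩

-- 3. Proving: every gap is closed, so the Lean checker accepts the proof.
theorem sum_even (a b : ℕ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨m, hm⟩ := ha
  obtain ⟨n, hn⟩ := hb
  exact ⟨m + n, by omega⟩
```

Only the third version counts as a proof. The first two compile with warnings because `sorry` stands in for missing steps.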


Under the combined tool-integrated reasoning (TIR) strategy, the model reached a 97.1% pass rate on the MiniF2F-Test benchmark with an inference budget of only 72, setting a new state of the art among open-source prover models. It also significantly outperformed existing open-source models on harder competition-level benchmarks such as MathOlympiad-Bench and PutnamBench.


Technically, LongCat-Flash-Prover adopts a TIR-based "hybrid expert iteration" framework. Generated output is checked through Lean4Server verification, semantic and theorem consistency checks, and legality checks covering nine types of cheating behavior, which helps close logical loopholes and catch deceptive code. During training, the team introduced a hierarchical masking strategy and token-level staleness control, which significantly improved the stability of reinforcement learning on the MoE architecture.
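The idea of a legality check can be illustrated with a small Python sketch. The article does not enumerate the nine cheating categories, so the three patterns below (`sorry`, `admit`, and a newly declared `axiom`) are assumed examples, and the function names are hypothetical. This is not the team's implementation, and a real pipeline would rely on the Lean compiler rather than text matching.

```python
import re

# Hypothetical patterns: the article does not list the nine cheating
# categories, so these are common examples chosen for illustration.
SUSPICIOUS_PATTERNS = {
    "sorry": re.compile(r"\bsorry\b"),
    "admit": re.compile(r"\badmit\b"),
    "axiom": re.compile(r"^\s*axiom\b", re.MULTILINE),
}


def strip_line_comments(lean_source: str) -> str:
    """Drop `--` line comments so commented-out text is not flagged."""
    return "\n".join(line.split("--", 1)[0] for line in lean_source.splitlines())


def find_violations(lean_source: str) -> list[str]:
    """Return the names of all suspicious patterns found in the source."""
    code = strip_line_comments(lean_source)
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items()
            if pattern.search(code)]


def statement_preserved(original_statement: str, proof_source: str) -> bool:
    """Check that the original theorem statement still appears verbatim
    (up to whitespace) in the submitted proof, i.e. it was not rewritten."""
    def normalize(text: str) -> str:
        return " ".join(text.split())
    return normalize(original_statement) in normalize(proof_source)
```

A proof that compiles but contains `sorry`, or that quietly replaces the target theorem with an easier one, would be rejected by checks of this kind before it is counted as a success.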

As AI reasoning shifts from handling the ambiguity of natural language to working in verifiable formal languages, prover models of this kind are moving beyond pure benchmark competition and becoming foundational infrastructure for basic scientific research. The release suggests that the era of AI participating deeply in frontier mathematical exploration and automated document verification is arriving faster.

  • GitHub: https://github.com/meituan-longcat/LongCat-Flash-Prover

  • Hugging Face: https://huggingface.co/meituan-longcat/LongCat-Flash-Prover

  • Report: https://github.com/meituan-longcat/LongCat-Flash-Prover/blob/main/LongCat_Flash_Prover_Technical_Report.pdf