MiniMax Xiyu Technology officially launched its next-generation frontier large model, MiniMax M3, on June 1, 2026. It is the first open-source large model in China to combine top-tier coding ability, a 1M ultra-long context, and native multimodal capabilities, and it aims to match overseas closed-source flagship models across the board.

To address the context-scaling bottleneck in complex agent tasks, M3 introduces a sparse attention architecture (MSA) built at the lowest level of the model. Compared with traditional approaches, it achieves more accurate KV partitioning and operator-level optimization, delivering four times the computing speed of comparable open-source solutions. At a 1M context length, the per-token compute cost is only one-tenth that of the previous-generation model, with more than 9x acceleration in the prefill stage and 15x acceleration in the decoding stage.
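
The details of MSA have not been published in this announcement, so the following is only a generic illustration of how sparse attention over a partitioned KV cache can cut per-token cost. It is a minimal sketch, assuming a simple routing rule (score each block by the query's dot product with the block's mean key, then attend only to the top-scoring blocks). The function name `block_sparse_attention` and the parameters `block_size` and `top_k` are hypothetical and are not taken from M3.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def block_sparse_attention(q, keys, values, block_size=4, top_k=2):
    """Attend one query over only the top_k best-matching KV blocks.

    Hypothetical illustration of block-sparse attention; not M3's MSA.
    """
    n, d = len(keys), len(q)
    # Partition the KV cache into contiguous blocks of token indices.
    blocks = [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]

    # Cheap routing score: the query dotted with each block's mean key.
    def block_score(idx):
        mean_key = [sum(keys[i][j] for i in idx) / len(idx) for j in range(d)]
        return dot(q, mean_key)

    ranked = sorted(range(len(blocks)),
                    key=lambda b: block_score(blocks[b]), reverse=True)
    selected = sorted(ranked[:top_k])

    # Ordinary softmax attention, restricted to tokens in the selected blocks.
    token_ids = [i for b in selected for i in blocks[b]]
    weights = softmax([dot(q, keys[i]) / math.sqrt(d) for i in token_ids])
    out = [sum(w * values[i][j] for w, i in zip(weights, token_ids))
           for j in range(len(values[0]))]
    return out, selected
```

With `top_k` fixed, the attention step touches `top_k * block_size` tokens regardless of context length, which is the general reason sparse schemes become more attractive as the context grows. The routing step in this sketch still scans every block, so a real system would need to cache or otherwise accelerate the block summaries.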

Trained natively on a trillion-scale mix of interleaved data, M3 has a highly unified semantic space. It surpasses GPT-5.5 and Gemini 3.1 Pro on authoritative software engineering and multimodal benchmarks such as SWE-Bench Pro. In extreme task tests, M3 demonstrates strong long-horizon autonomous planning. It autonomously reproduced the experiments of a top ICLR paper in 12 hours, and it ran continuously for 24 hours without reference code, making nearly 2,000 tool calls. It also raised FP8 matrix multiplication hardware utilization on the Hopper architecture from 7.6% to 71.3%, and autonomously scheduled the model to complete the full "data-training-iteration" process on the open PostTrainBench.
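
To put the utilization figures in perspective, the short calculation below converts them into an implied throughput gain. It is a rough sketch, assuming the hardware's peak FP8 throughput is unchanged, so that achieved throughput scales linearly with utilization. The variable names are illustrative only.

```python
# Utilization figures quoted in the article.
baseline_util = 0.076   # 7.6% before optimization
tuned_util = 0.713      # 71.3% after optimization

# Assumption: peak throughput is fixed, so achieved throughput
# is proportional to utilization.
implied_speedup = tuned_util / baseline_util
print(f"Implied kernel speedup: {implied_speedup:.2f}x")  # prints 9.38x
```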