On February 24th, the 360 ZhiNao team and Peking University jointly released Tiny-R1-32B-Preview, a medium-sized reasoning model. With only about 5% of the parameters of DeepSeek-R1 (671B), the model closely approaches the full model's performance, demonstrating the potential of small models for efficient reasoning.

The model's performance stands out in several key areas. In mathematics, Tiny-R1-32B-Preview scored 78.1 on the AIME 2024 evaluation, close to the original R1's 79.8 and well above DeepSeek-R1-Distill-Llama-70B's 70.0. In programming and science, it scored 61.6 on LiveCodeBench and 65.0 on GPQA-Diamond respectively, outperforming the current best open-source 70B model, DeepSeek-R1-Distill-Llama-70B, across the board. These results show that Tiny-R1-32B-Preview delivers near-R1 performance while, at roughly 5% of the parameter count, substantially reducing inference cost.


The core technique behind this breakthrough is a "divide-and-conquer, then merge" strategy. The research team used DeepSeek-R1 to generate large volumes of domain-specific data and trained separate models for three vertical fields: mathematics, programming, and science. They then used the Mergekit tool from the Arcee team to merge these specialist models, breaking through the performance ceiling of any single model and achieving balanced optimization across tasks. This approach not only improves model performance but also offers a new direction for the development of reasoning models.
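Conceptually, merging specialist checkpoints can be as simple as a weighted average of their parameters (the "linear" method among those Mergekit supports; the team's actual merge recipe is not detailed here). The sketch below uses invented scalar parameters in place of real weight tensors, purely to illustrate the idea:

```python
# Toy illustration of merging specialist models by weighted parameter
# averaging. Real tools like Mergekit operate on full weight tensors and
# offer more sophisticated methods (TIES, DARE, ...); the parameter names,
# values, and weights below are invented for illustration.

def merge_state_dicts(specialists, weights):
    """Return a parameter dict where each entry is the weighted average
    of the corresponding entries across the specialist models."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    merged = {}
    for name in specialists[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, specialists))
    return merged

# Scalar parameters stand in for weight tensors of the three specialists.
math_model    = {"layer.0.w": 1.0, "layer.1.w": 4.0}
code_model    = {"layer.0.w": 3.0, "layer.1.w": 0.0}
science_model = {"layer.0.w": 2.0, "layer.1.w": 2.0}

merged = merge_state_dicts([math_model, code_model, science_model],
                           weights=[0.4, 0.3, 0.3])
print(merged)
```

In practice, more refined methods prune or rescale each specialist's parameter deltas before combining them, which helps the merged model retain each domain skill rather than averaging it away.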

The joint research team of 360 ZhiNao and Peking University stated that the success of Tiny-R1-32B-Preview owes much to the open-source community. The model builds on DeepSeek-R1 distillation, incremental training of DeepSeek-R1-Distill-32B, and model merging techniques.

To make the technology broadly accessible, the research team has committed to publicly releasing the complete model repository, including technical reports, training code, and part of the dataset. The model has been uploaded to the Hugging Face platform at https://huggingface.co/qihoo360/TinyR1-32B-Preview.