Recently, Kunlun Wanwei officially released its new open-source model Skywork-R1V3.0, claiming that its multimodal reasoning approaches the level of entry-level human experts. The model was trained with a reinforcement learning strategy, which the company says brought significant progress in complex logical modeling and cross-disciplinary knowledge generalization.

Skywork-R1V3.0 was bootstrapped from the previous-generation Skywork-R1V2.0: high-quality distilled data and rejection sampling were used to build its multimodal reasoning training set. The model handles images as well as text, which significantly improves its ability to reason across the two.
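
To make the rejection sampling step concrete, here is a minimal sketch of the general idea, not the company's actual pipeline. It assumes the filter is a simple match between a candidate's final answer and a known reference answer, which the announcement does not specify. The names rejection_sample and toy_generate are hypothetical, and toy_generate is only a stand-in for sampling from an earlier model such as Skywork-R1V2.0.

```python
import random
from typing import Callable, Dict, List


def rejection_sample(
    problems: List[Dict[str, str]],
    generate: Callable[[str, random.Random], Dict[str, str]],
    samples_per_problem: int = 4,
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Keep at most one generated solution per problem whose answer matches the reference.

    Assumption: acceptance is an exact string match on the final answer.
    """
    rng = random.Random(seed)
    kept: List[Dict[str, str]] = []
    for problem in problems:
        for _ in range(samples_per_problem):
            candidate = generate(problem["question"], rng)
            if candidate["answer"].strip() == problem["reference"].strip():
                kept.append(
                    {
                        "question": problem["question"],
                        "reasoning": candidate["reasoning"],
                        "answer": candidate["answer"],
                    }
                )
                break  # accepted: move on to the next problem
            # otherwise the candidate is rejected and we sample again
    return kept


def toy_generate(question: str, rng: random.Random) -> Dict[str, str]:
    """Hypothetical stand-in for a model: adds two integers, sometimes off by one."""
    a, b = (int(x) for x in question.split("+"))
    total = a + b + rng.choice([0, 0, 1])  # roughly one wrong answer in three
    return {"reasoning": f"{a} plus {b} gives {total}", "answer": str(total)}


problems = [
    {"question": "2+3", "reference": "5"},
    {"question": "10+7", "reference": "17"},
]
dataset = rejection_sample(problems, toy_generate)
for row in dataset:
    print(row)
```

Only solutions that pass the check enter the training set, so the quality of the resulting data depends on how strict the acceptance test is.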

According to the company, Skywork-R1V3.0 was trained on only about 12,000 supervised fine-tuning samples and 13,000 reinforcement learning samples, which it presents as a case of "small data triggering great capability." On MMMU, a widely used comprehensive multimodal benchmark, Skywork-R1V3.0 scored 76.0, ahead of closed-source models such as Claude-3.7-Sonnet (75.0) and GPT-4.5 (74.4), a result the company cites as evidence of strong cross-modal understanding.

Skywork-R1V3.0 also performs well across physics, logic, and mathematical reasoning. In physics reasoning evaluations, the model scored 52.8 and 31.5, the best results among open-source models, indicating that it can work through complex physics problems. In a logic reasoning test, it scored 59.7.

In mathematical reasoning, the model scored 77.1, 59.6, and 52.6 on MathVista, MathVerse, and MathVision respectively, significantly outperforming other open-source models. These results make Skywork-R1V3.0 a strong competitor in open-source multimodal reasoning.

The release of Skywork-R1V3.0 sets a new high-water mark for open-source multimodal reasoning. Its strong performance, combined with its open availability, should help drive further development in the field.