Breaking with Convention: Open-Source Mini-o3 Model Achieves Ultra-Long Visual Reasoning, and Deep Thinking Is No Longer a Challenge
Recently, ByteDance and the University of Hong Kong jointly released Mini-o3, a new open-source visual reasoning model that marks another major advance in multi-turn visual reasoning. Unlike previous vision-language models (VLMs), which could only conduct one or two rounds of dialogue, Mini-o3 is capped at 6 rounds during training yet can extend its reasoning to dozens of rounds at test time, greatly improving its ability to handle visual questions. Mini-o3's strength lies in deep reasoning on high-difficulty visual search tasks, reaching