Baidu releases ERNIE-4.5-VL-28B-A3B-Thinking: Accurately locates image details to solve complex problems
Baidu has launched ERNIE-4.5-VL-28B-A3B-Thinking, a multimodal AI model that draws directly on image content when it reasons. The model scores well on several benchmarks, in some cases surpassing top commercial models such as Google Gemini 2.5 Pro and OpenAI GPT-5 High. Although it has 28 billion parameters in total, its routing architecture activates only 3 billion of them at inference time, which keeps inference lightweight and efficient.
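To make the 28-billion-total, 3-billion-active figure concrete, here is a minimal sketch of how a routing layer can leave most parameters idle. It assumes top-k gating over a set of experts, a common design for routed models that the article does not confirm for ERNIE. The expert count, the value of k, and the function names `top_k_route` and `active_fraction` are all hypothetical and chosen only for illustration.

```python
import math
import random


def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in chosen)  # subtract the max for numerical stability
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]


def active_fraction(num_experts, k, expert_params, shared_params):
    """Share of all parameters that run when only k experts are selected."""
    total = shared_params + num_experts * expert_params
    active = shared_params + k * expert_params
    return active / total


# Toy values (assumptions, NOT the model's real configuration).
NUM_EXPERTS, TOP_K = 16, 2

random.seed(0)
# Stand-in for the router's per-expert scores on one input.
scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
experts, weights = top_k_route(scores, TOP_K)

print("selected experts:", experts)
print("mixing weights:", weights)
print("toy active fraction:", active_fraction(NUM_EXPERTS, TOP_K, 1.0, 0.0))
# Ratio implied by the article's figures: 3 billion active of 28 billion total.
print("article's ratio:", 3 / 28)
```

By the article's own figures, roughly 11% of the model's parameters (3 of 28 billion) take part in any single forward pass. That ratio is why a routed model can be cheaper to run than a dense model of the same total size.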