Moondream 3.0, the newly released preview version, is built on an efficient Mixture of Experts (MoE) architecture and demonstrates impressive visual reasoning capabilities. The model has 9 billion total parameters but activates only 2 billion at inference, a lightweight design that makes it particularly effective in complex scenarios. Compared with the previous Moondream 2, version 3.0 surpasses industry-leading models such as GPT-5, Gemini, and Claude 4 in multiple benchmark tests, marking a genuine technological leap.
Moondream 3.0 supports a context length of 32K, making it well suited to real-time interaction and agent workflows. The model pairs a SigLIP vision encoder, which handles high-resolution images via multi-crop channel concatenation, with a custom, efficient SuperBPE tokenizer and a multi-head attention mechanism, significantly improving its long-context modeling. Although its training data volume, approximately 4.5 billion tokens, is far smaller than the trillions of tokens used by other leading models, Moondream 3.0 still achieves excellent performance.
A major highlight is the model's versatile visual skill set, including open-vocabulary object detection, pointing, counting, caption generation, and optical character recognition (OCR). It supports structured output, generating JSON arrays directly, for example extracting each dog's ID, coat color, and collar color from an image. Moondream 3.0 also performs impressively at user interface understanding, document transcription, and object localization.
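To illustrate the detection and structured-output skills, here is a minimal sketch using the moondream Python client (`pip install moondream`). The method names (`detect`, `query`) and the cloud API key follow the library's published moondream2 examples; the 3.0 preview's interface may differ, so treat the details as assumptions and check the current docs.

```python
import moondream as md
from PIL import Image

# Assumes a Moondream Cloud API key; a local model file can be
# passed via md.vl(model=...) instead.
model = md.vl(api_key="YOUR_API_KEY")
image = Image.open("dogs.jpg")

# Open-vocabulary detection: returns bounding boxes for the named object.
dogs = model.detect(image, "dog")
print(f"Found {len(dogs['objects'])} dogs")

# Structured output: request a JSON array directly in the prompt.
answer = model.query(
    image,
    "For each dog, return a JSON array of objects with the keys "
    "'id', 'coat_color', and 'collar_color'.",
)["answer"]
print(answer)
```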
Early benchmark results show Moondream 3.0 scoring 51.2 on COCO object detection, up 20.7 points from the previous version; its OCRBench score rose from 58.3 to 61.2, and it reached 60.3 on ScreenSpot UI F1@0.5. In practice, the model handles complex scenes with ease: picking out people wearing purple socks, selecting the quantity input field on a shopping page, pointing at bottles, and recommending utensils suitable for pasta. Its applications range from security monitoring and drone inspection to medical imaging and enterprise document processing.
Moondream 3.0 is open source and built around the idea of "no training, no ground-truth data, no heavy infrastructure": developers can unlock its visual understanding capabilities with simple prompts. According to community feedback, the model has been successfully deployed for robot semantic behaviors, on mobile devices, and on Raspberry Pi, making it a good fit for edge computing scenarios.
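For local experimentation in that prompt-driven style, the moondream2 Hugging Face model card documents a transformers-based interface like the sketch below. It is a reasonable guess that the 3.0 preview exposes similar methods, but the repo ID and API here are assumptions to verify against its own model card.

```python
from transformers import AutoModelForCausalLM
from PIL import Image

# moondream2's card documents this trust_remote_code API (caption/query/
# detect/point); the 3.0 preview's repo id and interface are assumptions.
model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2", trust_remote_code=True
)
image = Image.open("scene.jpg")

# Short caption for the whole image.
print(model.caption(image, length="short")["caption"])

# Free-form visual question answering with a plain prompt.
print(model.query(image, "How many people are wearing purple socks?")["answer"])

# Pointing: returns normalized (x, y) centers for each matching object.
print(model.point(image, "bottle")["points"])
```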
Key Points:
🌟 Moondream 3.0 has 9 billion parameters but activates only 2 billion, demonstrating efficient visual reasoning capabilities.
🔍 Supports open-vocabulary object detection and structured output, suitable for various scenarios.
💻 Open-source design, easy for developers to use, suitable for edge computing applications.