At the recent Ant Technology Day, Ant Group's Bai Ling team announced a significant decision: to fully open-source its multi-modal large model Ming-lite-omni. This marks another major open-source move by Ant Group in the AI field, and the industry regards Ming-lite-omni as the first open-source model that can rival GPT-4o in terms of modality support.


A 22-billion-parameter technical breakthrough

Ming-lite-omni is built on Ling-lite and adopts a Mixture-of-Experts (MoE) architecture, with 22 billion total parameters and 3 billion active parameters. This parameter scale sets a new high among open-source multi-modal models and reflects Ant Group's deep accumulation in large model technology.
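The gap between total and active parameters comes from the MoE design: a router sends each token to only a few experts, so only a small subset of the weights participates in any single forward pass. Below is a minimal, illustrative PyTorch sketch of sparse expert routing; the dimensions, expert count, and top-2 routing are placeholder choices for illustration, not Ming-lite-omni's actual implementation.

```python
import torch
import torch.nn as nn

class SparseMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: a router sends each token to its
    top-k experts, so only a fraction of the layer's parameters is
    active for any given token."""

    def __init__(self, d_model=512, d_ff=2048, num_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                      # x: (num_tokens, d_model)
        scores = self.router(x)                # (num_tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)      # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = SparseMoELayer()
total = sum(p.numel() for p in layer.parameters())
# Per token, only the router plus its top-2 experts do the work.
active = sum(p.numel() for p in layer.router.parameters()) + \
         2 * sum(p.numel() for p in layer.experts[0].parameters())
print(f"total parameters: {total:,}  active per token (approx.): {active:,}")
```

The same principle scales up in a model like Ming-lite-omni: roughly 22 billion parameters are stored, but only about 3 billion are used to process each token.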

Currently, the model weights and inference code for Ming-lite-omni have been fully open-sourced, and the training code and training data will be released in subsequent stages, providing comprehensive technical support for developers worldwide.

Ongoing open-source strategy shows results

This year, the Bai Ling large model team has open-sourced a series of models, including the large language models Ling-lite and Ling-plus, the multi-modal large model Ming-lite-uni, and a preview version of Ming-lite-omni.

Ling-lite-1.5, open-sourced in mid-May, approaches the state of the art at its scale, with performance between Qwen's 4B and 8B models, successfully verifying the feasibility of training a 300B-parameter SOTA MoE large language model on non-high-end computing platforms.

Performance comparable to international top-tier models

In multiple understanding and generation benchmarks, Ming-lite-omni performs on par with or better than leading multi-modal large models at the 10B scale. Ant Group stated that, as far as it knows, this is the first open-source model to rival GPT-4o in modality support, providing global developers with an important technical option and reference standard.

Xiting, head of the Bai Ling large model team, described the team's technical route: "We have consistently adopted the MoE architecture for both large language models and multi-modal large models, and we make extensive use of non-high-end computing platforms, successfully demonstrating that domestic GPUs can train models comparable to GPT-4o."