Liquid AI has officially launched LFM2-VL, a new series of vision-language foundation models optimized for low latency and device-aware deployment. The release includes two efficient variants, LFM2-VL-450M and LFM2-VL-1.6B, marking significant progress toward running multimodal AI on smartphones, laptops, wearables, and embedded systems without sacrificing speed or accuracy.


According to Liquid AI, LFM2-VL delivers up to twice the GPU inference speed of comparable vision-language models while maintaining competitive benchmark performance on tasks such as image description, visual question answering, and multimodal reasoning. The 450M-parameter variant targets resource-constrained environments, while the 1.6B-parameter variant offers stronger capabilities yet remains lightweight enough for a single GPU or high-end mobile devices.

LFM2-VL adopts a modular architecture that combines a language-model backbone (LFM2-1.2B or LFM2-350M), a SigLIP2 NaFlex vision encoder (400M or 86M parameters), and a multimodal projector. The projector applies a pixel-unshuffle step that dynamically reduces the number of image tokens, yielding faster processing. The model also handles images at their native resolution up to 512×512 pixels, avoiding distortion from rescaling; larger images are split into non-overlapping 512×512 patches, preserving detail and aspect ratio. The 1.6B version additionally encodes a downscaled thumbnail of the full image to provide global context.
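
The tiling and token-reduction ideas are easy to picture in code. The sketch below is purely illustrative and is not Liquid AI's preprocessing pipeline: `split_into_tiles` cuts an oversized image into non-overlapping 512×512 patches, and `pixel_unshuffle` shows the space-to-depth trick that trades spatial token count for channel width. The function names, the tile handling at image edges, and the factor of 2 are all our own assumptions.

```python
import numpy as np

TILE = 512  # maximum resolution the encoder handles natively (per the announcement)

def split_into_tiles(img: np.ndarray, tile: int = TILE):
    """Split an H x W x C image into non-overlapping tile x tile patches.

    Illustrative only: the real processor also decides when to pad, resize,
    or add a global thumbnail; here we simply slice the array.
    """
    h, w, _ = img.shape
    tiles = []
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            tiles.append(img[y:y + tile, x:x + tile])
    return tiles

def pixel_unshuffle(tokens: np.ndarray, factor: int = 2):
    """Space-to-depth: merge each factor x factor block of visual tokens into one,
    cutting the token count by factor**2 at the cost of a wider channel dimension."""
    h, w, c = tokens.shape
    tokens = tokens.reshape(h // factor, factor, w // factor, factor, c)
    tokens = tokens.transpose(0, 2, 1, 3, 4)
    return tokens.reshape(h // factor, w // factor, c * factor * factor)
```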


LFM2-VL's flexible inference lets users tune the balance between speed and quality at inference time to match device capabilities and application needs. The models are pre-trained, jointly trained to integrate vision and language, and finally fine-tuned, drawing on approximately 100 billion multimodal tokens to ensure strong image-understanding performance.
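
As a rough illustration of what on-device inference looks like through the Transformers integration mentioned below, here is a minimal sketch following the generic image-text-to-text pattern. The model ID (inferred from the collection naming), the placeholder image URL, and the generation settings are assumptions; the processor's exact arguments for capping image tokens (the speed/quality knob) are documented on the model card, which should be treated as the source of truth.

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

# Sketch only: the model ID is assumed from the Hugging Face collection naming;
# consult the official model card for exact usage.
model_id = "LiquidAI/LFM2-VL-1.6B"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder image
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# The processor reportedly exposes limits on image tokens (the speed/quality knob);
# argument names may differ per the model card, so we omit them rather than guess.
inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```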

In public benchmarks, LFM2-VL performs competitively with models such as InternVL3 and SmolVLM2 while using less memory and running faster, making it well suited to edge and mobile applications. Both models are open-weight and available for download on Hugging Face for research and commercial use; larger enterprises must obtain a commercial license by contacting Liquid AI. The models integrate with Hugging Face Transformers and support quantization for further efficiency gains on edge hardware.
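
For constrained hardware, the standard Transformers quantization path can likely be applied. The snippet below is a sketch assuming the usual bitsandbytes 4-bit configuration works with this architecture and that the smaller 450M variant is the natural choice for such devices; both assumptions should be checked against the official model card.

```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor, BitsAndBytesConfig

# Assumption: LFM2-VL loads through the generic image-text-to-text classes and
# accepts a standard bitsandbytes config; verify against the model card.
model_id = "LiquidAI/LFM2-VL-450M"  # smaller variant, aimed at constrained devices

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights to shrink the memory footprint
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed and stability
)

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```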

LFM2-VL aims to help developers and enterprises deploy multimodal AI on-device quickly, accurately, and efficiently, reducing reliance on the cloud and enabling new applications in robotics, IoT, smart cameras, and mobile assistants.

Hugging Face: https://huggingface.co/collections/LiquidAI/lfm2-vl-68963bbc84a610f7638d5ffa

Key Points:

🌟 LFM2-VL models deliver highly efficient GPU inference, up to twice as fast as existing models, across a wide range of devices.  

🖼️ Processes images at native resolution and splits larger images into 512×512 patches, preserving detail and aspect ratio.  

🚀 Both models are open-weight and can be downloaded on Hugging Face, suitable for research and commercial applications.