Zhipu has officially launched the open-source GLM-4.6V multimodal model series, comprising the base GLM-4.6V (106B total parameters, 12B activated) and the lightweight GLM-4.6V-Flash (9B). The new models extend the context window to 128k tokens and achieve SOTA visual understanding accuracy at their parameter scale. For the first time, Function Call capability is natively built into the vision model, enabling a complete "visual perception → executable action" workflow. API pricing is 50% lower than GLM-4.5V, at 1 yuan per million input tokens and 3 yuan per million output tokens; GLM-4.6V-Flash is completely free and comes integrated with the GLM Coding Plan and dedicated MCP tools, letting developers commercialize at zero cost.

Technical Highlights: 128k Multimodal Context + Native Visual Function Call

128k Multimodal Context: A single turn can take in 30 high-resolution images plus 80,000 words of text, and the model achieves SOTA results on long-video understanding benchmarks such as Video-MME and MMBench-Video.

Native Function Call: Visual signals are mapped directly to executable APIs without an additional projector, cutting latency by 37% and raising the success rate by 18% (see the sketch after this list).

Unified Encoding: Images, videos, and text share a single Transformer, with dynamic routing during inference, reducing GPU memory usage by 30%.
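To make the "visual perception → executable action" claim concrete, below is a minimal sketch of what a native Function Call request could look like, written against the existing zhipuai SDK conventions. The model identifier glm-4.6v, the add_calendar_event tool, and the assumption that the tools parameter works with the vision model are illustrative placeholders, not documented values; consult the official API docs for the exact interface.

```python
# Hedged sketch: an image input plus a tool definition, assuming GLM-4.6V keeps the
# OpenAI-compatible interface of the zhipuai SDK. "glm-4.6v" and the
# add_calendar_event tool are illustrative assumptions, not documented values.
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "add_calendar_event",  # hypothetical downstream API
        "description": "Create a calendar event from details found in an image",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "date": {"type": "string", "description": "ISO 8601 date"},
            },
            "required": ["title", "date"],
        },
    },
}]

response = client.chat.completions.create(
    model="glm-4.6v",  # assumed model identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/poster.jpg"}},
            {"type": "text", "text": "Add this event to my calendar."},
        ],
    }],
    tools=tools,
)

# With native Function Call, the vision model can reply with a structured tool call
# instead of free text describing the image.
message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name, call.function.arguments)
```

The point is the shape of the exchange: the image goes in as a content part, and the answer can come back as a callable function rather than prose, with no vision-to-text-to-prompt detour.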

Pricing and Licensing: Lightweight Version Free, Base Version Halved

GLM-4.6V-Flash (9B): Free to call, with open weights and a commercial-use license; suited to edge devices and SaaS integration.

GLM-4.6V (106B-A12B): Input at 1 yuan per million tokens and output at 3 yuan per million tokens, approximately 1/4 of GPT-4V's price.

50% Price Reduction: Overall reduction of 50% compared to GLM-4.5V, with a free 1 million token trial quota included.

Developer Tools: MCP + Coding Plan, One-Click Access

Dedicated MCP (Model Context Protocol) tool: Hook GLM-4.6V into VS Code and Cursor with roughly 10 lines of code, enabling "UI selection → automatic front-end code generation."

GLM Coding Plan: Offers 50+ scenario templates (websites, mini-programs, scripts) that turn visual requirements into executable code with automatic deployment.

Online Playground: Supports drag-and-drop image uploads, real-time debugging of Function Calls, and one-click export of Python/Node.js call snippets.
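For illustration, here is roughly what an exported Python call snippet might look like, applied to the UI-to-code scenario above. This is a sketch that assumes GLM-4.6V is reachable through the existing zhipuai SDK with the same multimodal message format as GLM-4V; the model name and the mockup URL are placeholders.

```python
# Hedged sketch of an exported call snippet: one multimodal request, no tools.
# Assumes the zhipuai SDK message format used for GLM-4V carries over; the model
# name "glm-4.6v" and the mockup URL are placeholders.
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="glm-4.6v",  # assumed model identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/mockup.png"}},
            {"type": "text", "text": "Generate responsive HTML/CSS for this UI mockup."},
        ],
    }],
)

print(response.choices[0].message.content)
```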

Benchmark Results: SOTA with the Same Parameters, Leading in Long Video Understanding

| Benchmark             | GLM-4.6V | GPT-4V | Gemini 1.5 Pro |
| --------------------- | -------- | ------ | -------------- |
| Video-MME             | 74.8     | 69.1   | 72.9           |
| MMBench-Video         | 82.1     | 78.4   | 80.6           |
| LongVideoBench (128k) | 65.3     | 58.2   | 62.1           |

Commercial Scenarios and Cases

Movie Previews: Directors upload character images and storyboards and automatically generate a 30-second preview video, with main-subject consistency above 96%.

Industrial Inspection: Capture equipment panel images → automatically identify abnormal areas → call a maintenance API to create work orders (a sketch of this loop follows the list).

Educational Materials: Teachers select textbook illustrations → generate 3D animations and voice explanations, and export to PPT with one click.
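A rough sketch of how the inspection loop could be wired up is shown below. Everything downstream of the model is hypothetical: create_work_order stands in for a real maintenance system, and the model identifier, tool schema, and tool-message format are assumptions based on the zhipuai SDK's existing function-calling conventions.

```python
# Hedged sketch of the inspection loop: send a panel image with a tool definition,
# execute the returned tool call against a stand-in maintenance API, then feed the
# result back for a final confirmation. All identifiers below are illustrative.
import json
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="YOUR_API_KEY")

def create_work_order(component: str, severity: str) -> dict:
    # Placeholder: a real deployment would call the plant's maintenance system here.
    return {"order_id": "WO-0001", "component": component, "severity": severity}

tools = [{
    "type": "function",
    "function": {
        "name": "create_work_order",
        "description": "Open a maintenance work order for an anomaly on an equipment panel",
        "parameters": {
            "type": "object",
            "properties": {
                "component": {"type": "string"},
                "severity": {"type": "string", "enum": ["low", "medium", "high"]},
            },
            "required": ["component", "severity"],
        },
    },
}]

messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/panel.jpg"}},
        {"type": "text", "text": "Inspect this panel and open a work order for any fault you find."},
    ],
}]

first = client.chat.completions.create(model="glm-4.6v", messages=messages, tools=tools)
reply = first.choices[0].message

if reply.tool_calls:
    call = reply.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = create_work_order(**args)

    # Echo the assistant turn and return the tool result so the model can confirm
    # the action in natural language (OpenAI-style tool-message convention assumed).
    messages.append(reply.model_dump())
    messages.append({"role": "tool", "content": json.dumps(result), "tool_call_id": call.id})
    final = client.chat.completions.create(model="glm-4.6v", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```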

Open Roadmap

Starting today: Weights, inference code, and MCP tools are open-sourced on GitHub and Hugging Face (search for GLM-4.6V).

Q1 2025: Release a 1M-token context version and an on-device INT4 quantized model that can run on laptop CPUs.

Q2 2025: Launch the "Visual Agent Store," letting developers list custom Function Calls and earn revenue on a per-call basis.

Industry Insights

While multimodal models are still largely at the stage of "understanding visuals," Zhipu has folded "understanding visuals + taking action" into one model: native Function Call integration lets images trigger APIs directly, skipping the roundabout vision → text → prompt chain. The free 9B version lowers the entry barrier, and halving the price of the 106B base model is a bid to quickly capture the visual-agent ecosystem. With 128k long-video understanding now practical, vertical scenarios such as film, industry, and education are likely to see large-scale adoption first. AIbase will continue to track its progress on on-device quantization and the Agent Store.