Baidu has reached a new milestone in the AIGC field, officially open-sourcing its latest visual understanding model, Qianfan-VL. The series comes in three sizes, 3B, 8B, and 70B parameters, suited to different application scenarios. Notably, training of the Qianfan-VL series relied entirely on Baidu's self-developed Kunlunxin P800 chip, demonstrating the strength of domestic chips in the field of artificial intelligence.

Qianfan-VL is a multimodal large model, capable of understanding both images and text; for example, it can analyze the data and trends in complex charts. Its core strengths are OCR (optical character recognition) and optimization for education scenarios. A user need only photograph an ID card, and the model automatically recognizes the name and ID number, delivering full-scenario text recognition: printed text, handwriting, and complex mathematical formulas can all be recognized, extracted, and converted into structured data.

In the education field, Qianfan-VL is positioned as a "super student": it can solve problems from a photo, perform geometric reasoning, and analyze functions. According to test results, the 70B version of Qianfan-VL scored 98.76 on the ScienceQA science question-answering benchmark, far exceeding competitors. It also stood out on the Chinese multimodal benchmark CCBench with a score of 80.98, demonstrating strong comprehension in the Chinese context.

The Kunlunxin P800 chip that powered Qianfan-VL's training has excellent power control, drawing 150W to 160W, which gives it clear advantages in energy consumption and heat dissipation in large-scale clusters. The P800's architecture separates the computing unit from the communication unit to improve chip utilization, and its "communication-computation integration" technology overlaps data transfer with computation, significantly improving model training performance.

The underlying architecture of Qianfan-VL integrates multiple industry-leading techniques and adopts an innovative "four-stage training pipeline," ensuring the model builds both a solid general-knowledge foundation and domain expertise during training. The entire Qianfan-VL series has been open-sourced on GitHub and Hugging Face for free use by enterprises and developers, and Baidu Intelligent Cloud's Qianfan platform also offers online trial and deployment services.

GitHub:

https://github.com/baidubce/Qianfan-VL

Hugging Face:

https://huggingface.co/baidu/Qianfan-VL-70B
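For readers who want to try the open-sourced checkpoints, here is a minimal Python sketch. It assumes the models load through the standard Hugging Face `transformers` interface with `trust_remote_code=True`, as many open-source vision-language models do, and that the 3B and 8B repos follow the same naming pattern as the 70B link above; verify both assumptions against the official model cards before use.

```python
# Sketch: locate and load a Qianfan-VL checkpoint from Hugging Face.
# Only the 70B repo id (baidu/Qianfan-VL-70B) is confirmed by the article;
# the 3B/8B ids and the transformers loading path are assumptions.

def repo_for(size: str) -> str:
    """Map a released model size to its (assumed) Hugging Face repo id."""
    sizes = {"3B", "8B", "70B"}  # the three released variants
    if size not in sizes:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(sizes)}")
    return f"baidu/Qianfan-VL-{size}"

def load_model(size: str = "8B"):
    """Lazily load tokenizer and model; larger sizes need substantial GPU memory."""
    from transformers import AutoModel, AutoTokenizer  # heavy import kept local
    repo = repo_for(size)
    tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
    model = AutoModel.from_pretrained(repo, trust_remote_code=True)
    return tokenizer, model

if __name__ == "__main__":
    print(repo_for("8B"))  # baidu/Qianfan-VL-8B
```

Keeping the `transformers` import inside `load_model` lets the lightweight repo-id helper be used (for example, to pass an id to the `huggingface_hub` CLI) without pulling in the full library.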

Key points:

🌟 Baidu's Qianfan-VL series model is officially open-sourced, including three versions: 3B, 8B, and 70B, suitable for different scenarios.  

🧠 The model has powerful multimodal capabilities, capable of recognizing text and images simultaneously, especially excelling in OCR and education fields.  

💡 The Kunlunxin P800 chip supports the model training, with low power consumption and high utilization efficiency, optimizing large-scale computing performance.