Microsoft has officially open-sourced its latest multi-modal reasoning model, Phi-4-reasoning-vision-15B. At 15B parameters, the model stays lightweight while striking a balance between high performance and low cost, offering a new option for complex visual tasks in resource-constrained environments.

A compact powerhouse driven by refined data

Unlike industry models that are typically trained on trillions of tokens, Phi-4-reasoning-vision was trained on only 200B multi-modal tokens. The development team prioritized data quality: deep cleaning of open-source data, targeted synthetic data generation, and careful tuning of the domain mix (for example, increasing the share of math data, which also improved computer-operation capabilities). As a result, the model performs strongly on scientific reasoning and screen-grounding tasks.


Innovative hybrid reasoning strategy

A major highlight of this model is the "hybrid reasoning path" design:

  • Perception tasks: For simple tasks such as image description and OCR, the model defaults to a direct-answer mode, which reduces latency.

  • Reasoning tasks: For complex logic such as mathematical formulas and scientific charts, the model automatically invokes a structured chain-of-thought (CoT) path to ensure accurate answers.

    Users can also manually switch between the two modes with specific guiding words to suit different scenarios.
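The article does not reveal the actual guiding words, so as an illustration only, here is a minimal prompt-builder sketch assuming hypothetical "/think" and "/no_think" control tags (placeholders, not confirmed Phi-4 syntax):

```python
def build_prompt(question: str, mode: str = "auto") -> str:
    """Prepend a hypothetical mode tag to steer the reasoning path.

    The real guiding words for Phi-4-reasoning-vision are not specified
    in the article; "/think" and "/no_think" are placeholders.
    """
    tags = {
        "direct": "/no_think",   # fast path: image description, OCR
        "reasoning": "/think",   # structured chain-of-thought path
        "auto": "",              # let the model pick the path itself
    }
    if mode not in tags:
        raise ValueError(f"unknown mode: {mode!r}")
    tag = tags[mode]
    return f"{tag} {question}" if tag else question
```

In practice, such a string would be placed in the user turn of a chat template before being passed to the model.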

Thanks to the integration of the SigLIP-2 dynamic-resolution encoder, the model perceives small elements in high-resolution screenshots well. This makes it a strong candidate for building computer-use agents (CUA) that can accurately identify and operate buttons and input fields in web or mobile interfaces.
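The article does not detail how SigLIP-2 handles resolution, but dynamic-resolution encoders commonly cover a large screenshot with fixed-size tiles, downscaling only when a tile budget is exceeded. A rough sketch of that general idea (tile size and budget here are illustrative, not the model's actual values):

```python
import math

def plan_tiles(width: int, height: int, tile: int = 384, max_tiles: int = 16):
    """Illustrative dynamic-resolution tiling plan (not the actual
    SigLIP-2 algorithm): cover a screenshot with tile x tile crops,
    shrinking the image uniformly if the grid would exceed max_tiles."""
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    while cols * rows > max_tiles:
        # Scale both sides by the same factor to preserve aspect ratio.
        scale = math.sqrt(max_tiles / (cols * rows))
        width = max(tile, int(width * scale))
        height = max(tile, int(height * scale))
        cols = math.ceil(width / tile)
        rows = math.ceil(height / tile)
    return cols, rows, (width, height)
```

For a 1920x1080 screenshot this yields a 5x3 grid at full resolution, so small UI elements keep their native pixels; a 4K capture gets downscaled until it fits the budget.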

Phi-4-reasoning-vision-15B is now available on multiple open-source platforms. Microsoft hopes this compact model will prove that in the multi-modal field, "smaller and faster" can coexist with "stronger," further driving the adoption of spatial intelligence and real-time interaction technologies.