XBai o4, the Fourth-Generation Open-Source Large Model Released by Wen Xiao Bai

AIbase基地

Published in AI News · 3 min read · Aug 4, 2025

The open-source large model field has a new release. "Wen Xiao Bai" has officially released its fourth-generation open-source model, XBai o4, which shows strong complex reasoning capabilities. Its Medium mode fully surpasses OpenAI o3-mini and outperforms Anthropic's Claude Opus on some benchmark tests.

XBai o4 introduces a "reflective generative paradigm" that combines Long-CoT reinforcement learning with process reward learning. This lets the model reason deeply and screen reasoning chains efficiently, while significantly reducing inference cost.

Technical Breakthrough: Unique "Reflective Generative Paradigm"

The core innovation of XBai o4 is its "reflective generative paradigm." This paradigm combines Long-CoT reinforcement learning with process reward learning, allowing a single model to complete two key tasks at once:

Deep reasoning: Think through multiple steps like humans.
High-quality reasoning chain selection: Evaluate and select the optimal reasoning path.
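The second task can be pictured as a best-of-N selection over candidate reasoning chains. The sketch below is only an illustration: the `step_scores` field, the geometric-mean aggregation, and the function names are assumptions made for this example, since the article does not say how XBai o4 aggregates its process reward scores.

```python
from math import exp, log
from typing import Dict, List, Sequence


def chain_score(step_scores: Sequence[float]) -> float:
    """Aggregate per-step scores in (0, 1] into one chain score.

    Uses a geometric mean (an assumption for this sketch), so a single
    weak step pulls the whole chain down.
    """
    if not step_scores:
        raise ValueError("a chain needs at least one scored step")
    return exp(sum(log(s) for s in step_scores) / len(step_scores))


def select_best_chain(chains: List[Dict]) -> Dict:
    """Return the candidate chain with the highest aggregated score."""
    return max(chains, key=lambda c: chain_score(c["step_scores"]))


# Hypothetical candidates: each has a final answer and per-step scores.
candidates = [
    {"answer": "42", "step_scores": [0.9, 0.8, 0.95]},
    {"answer": "41", "step_scores": [0.9, 0.3, 0.6]},
    {"answer": "42", "step_scores": [0.7, 0.7, 0.7]},
]

best = select_best_chain(candidates)
print(best["answer"])
```

Here the first candidate wins because every step scores well, while the second is penalized for one weak step.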

More notably, XBai o4 shares one backbone network between the process reward model (PRM) and the policy model, which cuts the inference time spent on process rewards by 99%. This optimization significantly improves the model's operational efficiency and makes it more practical to deploy.
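The idea of a shared backbone can be sketched with a toy model in which one forward pass feeds two lightweight heads. Everything here (the class, the heads, the arithmetic) is a hypothetical stand-in and not XBai o4's actual architecture. It only shows why scoring costs almost nothing extra once the backbone output is reused.

```python
import math
from typing import List, Tuple


class SharedBackboneModel:
    """Toy model: one backbone pass serves a policy head and a reward head."""

    def __init__(self) -> None:
        self.backbone_calls = 0  # counts the expensive forward passes

    def backbone(self, tokens: List[int]) -> List[float]:
        # Stand-in for the expensive shared network.
        self.backbone_calls += 1
        return [t / 10.0 for t in tokens]

    def policy_head(self, hidden: List[float]) -> int:
        # Toy "next token": index of the largest hidden feature.
        return max(range(len(hidden)), key=hidden.__getitem__)

    def reward_head(self, hidden: List[float]) -> float:
        # Toy process reward: sigmoid of the mean hidden feature.
        mean = sum(hidden) / len(hidden)
        return 1.0 / (1.0 + math.exp(-mean))

    def step(self, tokens: List[int]) -> Tuple[int, float]:
        # Both heads reuse the same hidden features from a single pass.
        hidden = self.backbone(tokens)
        return self.policy_head(hidden), self.reward_head(hidden)


model = SharedBackboneModel()
next_token, score = model.step([3, 7, 1])
print(next_token, round(score, 3), model.backbone_calls)
```

With a separate reward model, the same step would need a second full forward pass just to produce the score.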

Excellent Performance: Leading in Multiple Benchmark Tests

XBai o4 offers three modes (low, medium, high) to suit tasks of different complexity. Its performance has been validated on several key benchmark tests:

In Medium mode, XBai o4 fully surpasses OpenAI's o3-mini model.
In some benchmark tests, its performance even exceeds that of Anthropic's Claude Opus.
The model has demonstrated outstanding reasoning ability in multiple tests such as AIME24, AIME25, LiveCodeBench v5, and C-EVAL.

"Wen Xiao Bai" has open-sourced the relevant training and evaluation code on GitHub, which not only provides valuable resources for the AI research community but also indicates that open-source large models are rapidly enhancing their competitiveness in the field of complex reasoning.

Address: https://github.com/MetaStone-AI/XBai-o4

This article is from AIbase Daily
