First AI Assessment for Gaokao College Applications Released: Qwen Outperforms Human Counselors in Multiple Aspects

AIbase基地

Published in AI News · 5 min read · Jun 23, 2026

On June 23, the first AI capability assessment report for the college entrance examination (Gaokao) application scenario, "Gaokao AI Assessment Benchmark," was released. The report was independently completed by Yousong Lab, with the Qwen Gaokao application agent as the evaluation subject. The results show that Qwen's performance has reached the level of human application consultants, and it has advantages in stability, accuracy, structured expression, and efficiency.

Yousong Lab is an independent research team focused on artificial intelligence and educational decision-making. It has long studied large model capability assessment, AI applications in educational scenarios, and issues of information, cognition, and decision-making in students' academic advancement choices. Its research findings have been adopted by many universities and research institutions. The benchmark aims to establish a public, reproducible, and scalable evaluation framework for the rapidly emerging Gaokao application AI products, and to clarify which tasks AI can take on at this stage.

The report selected the Qwen Gaokao Agent as its first evaluation subject because it is built on Alibaba's 8 years of Gaokao service data and experience and is representative of the industry in product form, data accumulation, and user coverage. The human control group consisted of 53 application consultants with an average of 4.6 years of working experience.

The assessment covered four stages: basic facts and rules of Gaokao applications, simulated application filling, open consultation, and application recommendation reports. These correspond to the main steps students and parents go through when filling out applications: researching information, understanding the rules, planning options, and making decisions.

The results showed that Qwen answered all 44 objective questions correctly, a 100% accuracy rate, while human consultants averaged 89.3%. In simulated application filling, Qwen's plan included 6 acceptable applications with no explicit preference violations, and it hit the optimal result in the post-hoc evaluation; human consultants provided 5.3 acceptable applications on average. In open consultation, experts preferred Qwen's version in 58 of 100 anonymous comparisons, and its "directly displayable" rate was 56.0%, compared with 33.0% for human consultants. Experts considered Qwen more stable in professional path breakdown, risk warnings, and clarity of expression.

The report concluded that within the scope of the assessment tasks, Qwen's performance has reached the level of senior human consultants, especially showing advantages in stability, accuracy, structured expression, and response efficiency.

However, the report also pointed out that human consultants remain irreplaceable. On topics such as income expectations and employment prospects, which require careful calibration to individual circumstances, consultants can provide more practical advice. In scenarios such as parent-child communication and value trade-offs, even a well-structured AI answer cannot replace human communication and judgment.

The report suggests that AI is better suited to efficiently handling information verification, data organization, and initial screening of plans, while consultants can focus on family communication, value trade-offs, and personalized judgments. Only by combining the two can the application process become more accurate and better meet the actual needs of students and their families.

This article is from AIbase Daily

Created by the AIbase Daily Team