Best Benchmark Saturation AI Tools & Models - Premium Benchmark Saturation News

AI News

AI Star Student Takes a Heavy Blow! GPT-4o Scores Just 2.7 on Expert Exam

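Nature reports that GPT-4o scored only 2.7 out of 100 on "Humanity's Last Exam," and even the best-performing AI model scored only 8. The result has raised doubts about the real capabilities of AI. Traditional tests struggle to reflect true ability, mainly because of the "benchmark saturation" problem.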

13.6k 12 hours ago

Models

Model | Provider | Input tokens/M | Output tokens/M | Context Length
Claude 3 Opus | Anthropic | $105 | $525 | 200
qwen-image-edit | Alibaba | - | - | -
Qwen3-235B-A22B-Instruct-2507 | Alibaba | - | - | -
GLM-4.5-X | Chatglm | - | $16 | 128
o3 | OpenAI | $14 | $56 | 200
ERNIE-4.5-VL-424B-A47B-Paddle | Baidu | - | - | -
Yi-9B-200K | 01-ai | - | - | 200
GLM-4 | Chatglm | $100 | $100 | 128
Gemini 1.0 Pro | Google | $3.5 | $10.5 | -
Yi-6B-Chat | 01-ai | - | - | -
Baichuan-7B | Baichuan | - | - | -
ERNIE-3.0 | Baidu | - | - | -
ERNIE-1.0 | Baidu | - | - | -
