AIbase

2025-06-17 11:12:21.AIbase

In-Depth Review of Body Type Calculator: Is the AI Helper for Scientifically Shaping the Perfect Physique Actually Reliable?

2025-06-04 10:13:15.AIbase

Stanford's Latest Evaluation: DeepSeek R1 Medical AI Model Outperforms Google and OpenAI with High Scores

2025-05-29 11:07:22.AIbase

Google's Big Move! Open Source Evaluation Framework LMEval Launched, Making AI Model Comparisons More Transparent

2025-05-28 11:36:00.AIbase

Evaluation of Multi-modal Large Model Visual Reasoning Capability: o3 Scores Only 25.8%

2025-05-27 15:43:32.AIbase

Peking University Team Conducts First Systematic Evaluation of the Psychological Characteristics of Large Language Models, Promoting New Standards for AI Evaluation

2025-05-27 11:21:46.AIbase

OpenAI Releases Healthcare AI Evaluation Benchmark Dataset HealthBench

2025-04-18 10:53:12.AIbase

LMArena Officially Launches, Dedicated to Providing a Neutral AI Evaluation Platform

2025-04-16 11:24:23.AIbase

OpenAI Acquires Context.ai Team to Enhance AI Model Evaluation

2025-04-10 09:47:04.AIbase

OpenAI Launches Pioneers Program to Redefine AI Model Evaluation

2025-04-09 10:29:31.AIbase

OpenAI Launches Evals API: Ushering in a New Era of Programmatic AI Model Testing

2025-04-07 09:20:30.AIbase

Meta Accused of AI Model Double Standard: Maverick's Performance Varies Widely Between Evaluation and Public Versions

2025-04-03 14:00:32.AIbase

Gemini-2.5-pro Demonstrates Superior Mathematical Abilities in MathArena Evaluation, Surpassing Other Models

2025-04-02 14:47:08.AIbase

Arthur Launches First Open-Source Real-time AI Evaluation Engine: Arthur Engine

2025-03-21 11:48:03.AIbase

High School Student Creates AI Model Evaluation Website Using Minecraft

2025-03-21 09:45:00.AIbase

Minecraft Transformed into an AI Arena: High School Student Builds Innovative Model Evaluation Platform

2025-03-12 15:28:43.AIbase

Ant Group's Medical Large Language Model Wins Double Championship in MedBench Evaluation, Ushering in a New Era for Medical AI

2025-01-16 10:42:26.AIbase

Alibaba Qwen Team Releases New Process Reward Model, Advancing Mathematical Reasoning

2025-01-10 15:49:29.AIbase

GLM-4-9B Model Achieves a Hallucination Rate of Only 1.3%, Taking First Place in Global Large Model Evaluation

2025-01-02 14:30:40.AIbase

Microsoft Paper Reveals OpenAI Model Parameters? Medical AI Evaluation Unexpectedly Suggests 4o-mini Has Only 8B Parameters

2024-12-19 14:07:19.AIbase

AI is Not Omnipotent: Latest Research Reveals Top AI Models Exhibit Cognitive Impairments Similar to Early Dementia