AIBase

AI News


Stanford's Latest Evaluation: DeepSeek R1 Medical AI Model Outperforms Google and OpenAI with High Scores

Stanford University recently released a comprehensive evaluation of large AI models on clinical medical tasks. DeepSeek R1 ranked first among the nine leading models tested, with a 66% win rate and a macro-average score of 0.75. The evaluation goes beyond traditional medical licensing exam questions to cover the day-to-day work of clinicians, which makes its assessments more practical. The evaluation team developed an integrated assessment framework called MedHELM, which comprises 35 benchmarks covering 22 medical subtasks.

10.1k 4 days ago