In a highly anticipated artificial intelligence chess tournament, OpenAI's o3 model dominated the field and won the championship with an undefeated record. The tournament had one special rule: the participating AI models had to compete without any chess-specific training, relying only on whatever basic chess knowledge they had picked up from the internet beforehand.
In the final, o3 faced xAI's Grok 4 and won easily, 4-0. More impressively, o3 kept a perfect record throughout the tournament, winning all three of its matches 4-0, including a semifinal sweep of o4-mini, another OpenAI model.
Grok 4 also performed well on its way to the final, defeating two strong opponents from Google: Gemini 2.5 Flash and Gemini 2.5 Pro. At the time, Elon Musk confidently stated that the xAI team "basically didn't work on chess," implying that Grok 4's performance came from its general ability rather than from targeted training.
The final result, however, surprised many observers. Pedro Pinhata, the editor-in-chief of the chess website Chess.com, wrote in his post-match report: "Until the semifinals, it seemed nothing could stop Grok 4 from winning the game. But this illusion was shattered on the last day of the match."
Chess grandmaster and commentator Hikaru Nakamura put it bluntly during the live broadcast: "Grok made many mistakes during the game, but OpenAI did not." That concise assessment captured what decided the final.
Magnus Carlsen, the world's top-ranked chess grandmaster, also weighed in. He said that both AI models in the final played at roughly the level of an ordinary player who had just learned the rules, with an Elo rating of about 800. For comparison, Carlsen himself has an Elo rating of 2839 and second-ranked Hikaru Nakamura has 2807, which shows how far the models remain from elite human play.
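To put that rating gap in concrete terms, the standard Elo expected-score formula can be applied to the figures quoted above. The following is a minimal sketch, not part of the original report: the function name `elo_expected_score` is hypothetical, and it assumes Carlsen's informal estimate of 800 can be treated as if it were a real rating.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B under the Elo model.

    A win counts as 1, a draw as 0.5, and a loss as 0.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the article; 800 is Carlsen's rough estimate, not a measured rating.
model_rating = 800
carlsen_rating = 2839
nakamura_rating = 2807

# Expected score per game for an 800-rated player against Carlsen.
print(f"Model vs Carlsen: {elo_expected_score(model_rating, carlsen_rating):.2e}")

# For contrast, the expected score between the top two humans is close to even.
print(f"Carlsen vs Nakamura: {elo_expected_score(carlsen_rating, nakamura_rating):.3f}")
```

Under these assumptions, a gap of more than 2,000 points gives an expected score far below one point per ten thousand games, whereas the 32-point gap between Carlsen and Nakamura leaves their games close to a coin flip.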
Carlsen went on to analyze the limitations of these general-purpose AI models at chess. He found their play highly inconsistent, with the quality of their moves swinging widely from one position to the next. They did reasonably well at calculating captures but struggled with the core objective of checkmating the opponent. "They understand material advantage, but don't know how to win," Carlsen said, offering an analogy: "it's like being good at collecting ingredients but not knowing how to cook."
The result contrasts sharply with the record of specialized game-playing AI. The supercomputer "Deep Blue," which defeated chess grandmaster Garry Kasparov in 1997, and AlphaGo, which beat South Korean Go professional Lee Sedol in 2016, were both purpose-built systems with deep domain knowledge and dedicated training.
The limitations of general AI models at chess have been seen before. Earlier this year, in another tournament organized by chess International Master Levy Rozman, both Grok and ChatGPT lost to the specialized chess engine Stockfish, further confirming the gap in strength between general models and specialized systems.