According to a recent report by SemiAnalysis, OpenAI has not completed a large-scale pre-training run for a next-generation frontier model since the release of GPT-4o in May 2024. Its top team has repeatedly attempted to scale up parameters and data volume, but these efforts were halted by convergence difficulties or performance degradation. As a result, the much-anticipated GPT-5 series has remained essentially an optimized variant of GPT-4o rather than an architectural breakthrough.
At the same time, Google's TPUv7 has already completed large-scale pre-training validation on models such as Gemini 3, and for a cluster of equivalent compute its total cost of ownership (TCO) is roughly 30% lower than NVIDIA's solution. SemiAnalysis notes that OpenAI has not yet "truly deployed TPUs"; the mere news that it is evaluating them has pushed NVIDIA to concede on pricing for OpenAI's existing GPU clusters, saving OpenAI roughly 30% in costs and underscoring the cost-effectiveness advantage of TPUs.
Industry observers suggest that the pre-training scaling law is hitting three major bottlenecks: high-quality internet data is nearly exhausted while synthetic data costs up to $100 million per TB; failures are frequent in clusters of tens of thousands of GPUs; and hyperparameters for ever-larger MoE models are difficult to explore. OpenAI's stagnation is seen as a landmark signal that the field is entering a "post-scaling era," with companies turning to reasoning models, self-play RL, and multimodal post-training for incremental breakthroughs.


