For a long time, large model providers such as OpenAI and Anthropic have rapidly popularized their services among users through fixed monthly subscription models. However, a deep analysis by the industry research firm SemiAnalysis shows that this seemingly win-win model is now causing increasingly sharp cost crises for the vendors.
The testing agency found that due to the high Token consumption caused by heavy users in programming and "agent" interactions, the low subscription fees often fail to cover the underlying computing costs. For example, OpenAI's "ChatGPT Pro20x" plan priced at $200, if fully utilized, could theoretically result in an API billing cost of up to $14,000; similarly, Anthropic's same-priced product "Claude Max20x" could reach a Token cost of $8,000 under extreme usage.

This huge difference in revenue has forced vendors to pay close attention to the "utilization rate" of computing resources. The report points out that once a user's usage rate exceeds a certain threshold — for some tiers of Anthropic it is 20%, and for some plans of OpenAI it can be as low as 5.7% — the user's presence shifts from being profitable to being a loss. This means that companies don't need to wait for users to reach extreme usage frequencies; their subscription products are already operating at a loss.
This cost pressure is triggering a chain reaction in the industry. Internally, strategies that previously encouraged employees to use AI tools without limits are being quickly scaled back. It is reported that some companies have incurred as much as $500 million in expenses in a single month due to unlimited use of Claude, and had to urgently stop it.
To address this challenge, companies are turning to more targeted model routing strategies. By assigning complex tasks to expensive "frontier models" and delegating routine needs to cheaper models, some companies have successfully reduced their AI operational costs by up to 95%. Some startups have even opted for "technology migration," switching to more cost-effective models like DeepSeek to seek a more sustainable financial structure.
Expectations within the industry about the future are polarized. On one hand, with infrastructure expansion and improved computing efficiency, the operational costs of some mid-to-high-end models may further decrease; on the other hand, the most advanced models will likely remain costly for the foreseeable future, and vendors may gradually separate high-end features from mass subscription packages, shifting towards more granular billing models.



