Artificial intelligence startup Anthropic recently announced that its Claude Sonnet 4 model now supports a context window of up to 1 million tokens, up from the previous API limit of 200,000. The expanded window lets developers send more than 75,000 lines of code in a single request, greatly enhancing flexibility.
Long-context support is now in public beta on Anthropic's API and Amazon Bedrock, with Google Cloud Vertex AI set to follow soon. For now, the feature is limited to Tier 4 developers and subject to custom rate limits; Anthropic says it will open to more developers over the coming weeks.
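As a rough sketch of how a developer might opt in through the API, the request below assembles headers and a payload for a long-context call. The beta header value `context-1m-2025-08-07` and the model id `claude-sonnet-4-20250514` are assumptions for illustration, not details from the article; check Anthropic's API docs for the current values.

```python
def build_long_context_request(prompt: str) -> dict:
    """Assemble URL, headers, and JSON payload for a 1M-context
    Messages API call (header/model names are assumed, see above)."""
    return {
        "url": "https://api.anthropic.com/v1/messages",
        "headers": {
            "x-api-key": "YOUR_API_KEY",                 # placeholder
            "anthropic-version": "2023-06-01",
            "anthropic-beta": "context-1m-2025-08-07",   # assumed opt-in flag
            "content-type": "application/json",
        },
        "payload": {
            "model": "claude-sonnet-4-20250514",         # assumed model id
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```

Sending this payload (for example with `requests.post`) would only succeed for accounts that meet the Tier 4 requirement noted above.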
Because larger token windows demand more compute, Anthropic also introduced a new pricing plan. For prompts of 200,000 tokens or fewer, Sonnet 4 costs $3 per million input tokens and $15 per million output tokens; for prompts above 200,000 tokens, the rates rise to $6 per million input tokens and $22.50 per million output tokens. Developers can reduce costs with prompt caching and batch processing, with batch processing offering a 50% discount on 1M-context pricing.
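The tiered pricing above can be expressed as a small cost estimator. The function name and structure are illustrative; the rates are the ones quoted in the article, and the batch discount is applied as a flat 50% off the total.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      batch: bool = False) -> float:
    """Estimate Sonnet 4 request cost using the article's tiered rates.

    Prompts <= 200K input tokens: $3/M input, $15/M output.
    Prompts >  200K input tokens: $6/M input, $22.50/M output.
    Batch processing: 50% discount on the total.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 3.00, 15.00
    else:
        in_rate, out_rate = 6.00, 22.50
    cost = (input_tokens / 1_000_000) * in_rate \
         + (output_tokens / 1_000_000) * out_rate
    return cost * 0.5 if batch else cost
```

For example, a 500K-token prompt with 10K output tokens costs 0.5 × $6 + 0.01 × $22.50 = $3.225, or about $1.61 with the batch discount.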
In a recent Reddit AMA, OpenAI executives discussed the possibility of supporting longer context windows in their own models. CEO Sam Altman said OpenAI has not yet observed strong user demand for long context, but would consider adding support if enough interest emerges; given compute constraints, the team prefers to focus on other priorities. OpenAI's Michelle Pokrass added that the team had hoped to support up to 1 million tokens of context in GPT-5, especially for API applications, but could not because of the GPU requirements involved.
Anthropic's 1M context support directly competes with Google Gemini in long-context features, putting pressure on OpenAI to reconsider its product roadmap.
Key Points:
🆕 Anthropic's Claude Sonnet 4 model now supports up to 1 million context tokens, greatly enhancing development flexibility.
💰 A new pricing plan has been introduced, with different costs for prompts below and above 200,000 tokens, and developers can reduce costs through batch processing.
🤖 OpenAI is watching demand for long context and may adjust its product roadmap to respond to the competition.