Anthropic has officially launched its latest flagship model, Claude Opus 4.1, with significant improvements in agentic tasks, real-world coding, and reasoning. The release is positioned as a direct upgrade to Claude Opus 4 at the same price. It is available now to paid Claude users and through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI.

Claude Opus 4.1 scored 74.5% on SWE-bench Verified, a software engineering benchmark, up from 72.5% for Claude Opus 4, maintaining its leading position in the industry. According to Anthropic, the new model performs particularly well at multi-file code refactoring, precise debugging, and complex tasks. GitHub reports that Claude Opus 4.1 outperforms its predecessor across most capabilities, with the largest gains in multi-file code refactoring. Rakuten Group noted that the model can pinpoint errors in large codebases without making unnecessary adjustments or introducing new bugs, which makes everyday debugging considerably more efficient.

 Agentic Tasks and Reasoning Upgrades: Smarter and More Reliable  

Beyond coding, Claude Opus 4.1 has also made notable gains in agentic tasks and reasoning. The model shows stronger multi-step reasoning and detail tracking on benchmarks such as TAU-bench and GPQA Diamond, making it especially suitable for complex tasks that require long-running autonomous operation. Anthropic stated that Claude Opus 4.1 can perform agentic search more efficiently, comprehensively analyzing complex information sources such as patent databases, academic papers, and market reports to provide strategic insights for decision-making. The model has also been further optimized for data analysis and in-depth research, handling long-context information more accurately, with support for extended reasoning of up to 64K tokens.

 Seamless Upgrade: A Blessing for Developers and Enterprise Users  

Claude Opus 4.1 is designed as a "plug-and-play" replacement for Claude Opus 4. Developers only need to change the model string from `claude-opus-4-20250514` to `claude-opus-4-1-20250805`; no other API configuration changes are required. Anthropic recommends that all users upgrade to the new version for better performance. Pricing is unchanged from the previous version: $15 per million input tokens and $75 per million output tokens, with up to 90% cost savings from prompt caching and 50% savings from batch processing, which improves cost-effectiveness for enterprise users.
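As a rough illustration of the switch and the pricing described above, the following Python sketch swaps the model string in a request payload and estimates cost from token counts. The helper names (`upgrade_model`, `estimate_cost`) and the payload shape are hypothetical, and the batch discount is assumed to be a flat 50% of the list price. Prompt-caching savings are left out because the article does not describe how they are applied.

```python
# Model identifiers quoted in the article.
OLD_MODEL = "claude-opus-4-20250514"
NEW_MODEL = "claude-opus-4-1-20250805"

# List prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 75.0


def upgrade_model(request: dict) -> dict:
    """Return a copy of a request payload with the old model string replaced.

    Hypothetical helper: the payload is a plain dict with a "model" key.
    Every other field is left untouched.
    """
    updated = dict(request)
    if updated.get("model") == OLD_MODEL:
        updated["model"] = NEW_MODEL
    return updated


def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Estimate the cost in USD from token counts at the quoted list prices.

    Assumption: batch processing is modeled as a flat 50% discount.
    """
    cost = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000
    if batch:
        cost *= 0.5
    return cost


if __name__ == "__main__":
    request = {"model": OLD_MODEL, "max_tokens": 1024}
    print(upgrade_model(request)["model"])
    print(estimate_cost(200_000, 10_000))
    print(estimate_cost(200_000, 10_000, batch=True))
```

Under these assumptions, a job with 200,000 input tokens and 10,000 output tokens costs about $3.75 at list price, and half that when run as a batch.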

 Safety and Stability: Anthropic's Core Commitment  

As a company focused on AI safety, Anthropic continues to emphasize safety and reliability in the development of Claude Opus 4.1. The official system card shows that the model's harmless response rate has risen to 98.76% (compared to 97.27% for Opus 4), with a very low refusal rate of 0.08%. Although performance declined slightly on certain reward hacking tasks, Anthropic says that strict red team testing and Neptune v4 security system optimization keep the model far below high-risk thresholds for biological risks and cyber capabilities. This "incremental excellence" strategy reflects Anthropic's commitment to safety and controllability while pursuing performance improvements.

 Intensifying Industry Competition: A Promising Future  

The release of Claude Opus 4.1 comes as competition in the AI industry intensifies. Mike Krieger, Chief Product Officer at Anthropic, stated that the company previously focused too much on major upgrades, and that the release of Opus 4.1 reflects a greater emphasis on practicality and incremental improvements. Anthropic reportedly plans to launch "larger-scale model improvements" in the coming weeks, hinting that the Claude series may see more substantial updates. Meanwhile, rumors about the release of OpenAI's GPT-5 continue, and competition over the next generation of AI models is becoming increasingly fierce. The release of Claude Opus 4.1 strengthens Anthropic's competitive position in this field.

 Wide Application: Comprehensive Support from Development to Business  

Claude Opus 4.1 has been integrated into GitHub Copilot, where Copilot Enterprise and Pro+ subscribers can use it on GitHub, in Visual Studio Code, and in GitHub Mobile. Users can access the model through Anthropic's Pro, Max, Team, and Enterprise plans, while developers can build complex AI solutions via the API. Whether for code debugging, long-running tasks, or strategic decision support, Claude Opus 4.1 shows strong application potential as an assistant for developers and enterprises.

Summary