Anthropic Unveils a Major Update! Claude Sonnet 4.5 Outperforms GPT-5, the New King of Coding

AIbase基地

Published in AI News · 6 min read · Sep 30, 2025

Anthropic has released Claude Sonnet 4.5. The highly anticipated model officially launched on September 29 and is billed as "the best coding model in the world," marking a major advance in complex task handling and autonomous agents. The analysis below is based on the latest data.

Model Release and Key Highlights

Anthropic announced that Claude Sonnet 4.5 is now available globally through the Claude.ai website, the iOS and Android apps, and the API.

The model achieved leading results on the SWE-bench Verified coding benchmark and can work autonomously for more than 30 hours, far beyond the previous limit of 7 hours for Claude Opus 4. This means the AI is no longer limited to simple prototype generation; it can handle complex, multi-step tasks across codebases and build "production-ready" applications.

In practical use, the code-editing error rate of Claude Sonnet 4.5 fell from 9% in the previous version to 0%, with higher tool-use success rates and lower costs. It scored 61.4% on the OSWorld benchmark, which tests real computer tasks. That is 19.2 percentage points higher than the 42.2% Sonnet 4 scored four months earlier. The model's domain knowledge and reasoning in finance, law, medicine, and STEM have also improved significantly, surpassing Opus 4.1.

Technical Upgrades and Ecosystem Integration

This release comes with several product improvements that make the Claude ecosystem more practical. Claude Code gains a new "checkpoint" feature that lets users save progress at any time and roll back to a previous state, so a failed change does not interrupt development.

At the same time, the API now includes context editing and memory tools, enabling agents to run longer tasks. The Claude apps integrate code execution and file creation (such as spreadsheets and slides) directly, simplifying workflows. Anthropic also launched the Claude Agent SDK, which lets developers build custom AI agents using natural language, manage memory and permissions, and coordinate sub-agents.

This SDK integrates with the Claude for Chrome extension, which is now available to Max subscribers and supports agent operations within the browser. In addition, platforms such as GitHub Copilot, Replit Agent, and Amazon Bedrock have quickly integrated Sonnet 4.5, enhancing their multi-step reasoning and code understanding capabilities.

In terms of pricing, Claude Sonnet 4.5 keeps the same rates as Sonnet 4: $3 per million input tokens and $15 per million output tokens. This lowers the entry barrier for enterprises and reflects Anthropic's positioning as infrastructure for the AI economy.
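
To make the quoted rates concrete, here is a minimal sketch in Python that estimates the cost of a request from its token counts. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the calculation assumes only the flat $3 and $15 per-million-token rates cited above; it ignores any discounts or other pricing tiers that may apply.

```python
# Flat per-million-token rates quoted in the article (USD).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost at the flat rates above (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 200,000 input tokens and 20,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 20_000):.2f}")
```

At these rates, output tokens cost five times as much as input tokens, so long generated outputs tend to dominate the bill for agent-style workloads.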

Safety and Alignment Innovations

Anthropic emphasized that Claude Sonnet 4.5 is its "most aligned frontier model." Through extensive safety training, the model shows significantly less risky behavior such as sycophancy, deception, power-seeking, and encouraging delusions, and it defends better against prompt injection attacks. External expert assessments show that it makes more reliable ethical decisions across multiple domains, making it suitable for high-risk enterprise scenarios.

Industry Impact and Future Outlook

The release of Claude Sonnet 4.5 coincides with the rise of the AI agent wave. It challenges the dominance of OpenAI's GPT-5 and Google's Gemini 2.5 Pro in coding, and it injects new vitality into software development and automated workflows.

This article is from AIbase Daily
