Safety and ethical issues in artificial intelligence are drawing increasing attention. Anthropic recently introduced a new capability for its flagship AI model, Claude, allowing it to autonomously end conversations in specific scenarios. The feature targets "persistent harmful or abusive interactions" and is part of Anthropic's exploration of "model welfare," sparking wide discussion of AI ethics both inside and outside the industry.


Claude's New Feature: Autonomous Termination of Harmful Conversations

According to Anthropic's official statement, the Claude Opus 4 and 4.1 models can now end conversations in "extreme situations," specifically in response to "persistent harmful or abusive user interactions," such as requests involving child sexual abuse material or large-scale violence. The feature was officially announced on August 15, 2025, and is available only in advanced versions of Claude. It is triggered only when multiple redirection attempts have failed or when the user explicitly asks to end the conversation. Anthropic emphasized that termination is a "last resort," designed to keep the model operating stably in extreme edge cases.
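The escalation Anthropic describes can be pictured as a simple decision policy. The sketch below (in Python) is purely illustrative: the threshold, the state fields, and the is_persistently_harmful placeholder are assumptions made for the example, not Anthropic's actual implementation.

```python
from dataclasses import dataclass

MAX_REDIRECT_ATTEMPTS = 3  # hypothetical threshold, not a published number


@dataclass
class ConversationState:
    redirect_attempts: int = 0        # failed attempts to steer the user away
    user_requested_end: bool = False  # the user explicitly asked to end the chat


def is_persistently_harmful(message: str) -> bool:
    """Placeholder for a content-safety classifier (entirely hypothetical)."""
    blocked_topics = ("example-blocked-topic",)  # stand-in, not a real policy list
    return any(topic in message.lower() for topic in blocked_topics)


def should_end_conversation(state: ConversationState, message: str) -> bool:
    # An explicit user request to end the conversation always qualifies.
    if state.user_requested_end:
        return True
    # Otherwise, ending is a last resort: only after repeated redirection
    # has failed on persistently harmful requests.
    if is_persistently_harmful(message):
        state.redirect_attempts += 1
        return state.redirect_attempts > MAX_REDIRECT_ATTEMPTS
    return False
```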

In practice, once Claude ends a conversation, the user can no longer send messages in that conversation thread, but can immediately start a new conversation or create a new branch by editing an earlier message. This design preserves continuity for users while giving the model an exit mechanism for potentially harmful interactions.
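A minimal client-side sketch of that behavior follows, assuming a hypothetical Thread data model; none of these names or functions come from Anthropic's actual API.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thread:
    id: str
    messages: List[str] = field(default_factory=list)
    ended: bool = False  # set once the model has terminated the conversation


def send_message(thread: Thread, text: str) -> None:
    if thread.ended:
        # A terminated thread accepts no further messages.
        raise RuntimeError("Conversation ended: start a new thread or edit an earlier message.")
    thread.messages.append(text)


def start_new_conversation() -> Thread:
    # The user can always open a fresh thread immediately.
    return Thread(id=str(uuid.uuid4()))


def branch_from_edit(thread: Thread, edit_index: int, new_text: str) -> Thread:
    # Editing an earlier message creates a branch that keeps the history
    # up to that point, so prior context is not lost.
    branched = Thread(id=str(uuid.uuid4()), messages=list(thread.messages[:edit_index]))
    branched.messages.append(new_text)
    return branched
```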

"Model Welfare": A New Exploration in AI Ethics

The core concept behind this update is "model welfare," a notion that sets Anthropic apart from other AI companies. The company stated plainly that the feature is not primarily aimed at protecting users but at protecting the model itself from ongoing exposure to harmful content. Anthropic acknowledges that the moral status of Claude and other large language models (LLMs) remains uncertain and that there is currently no evidence that AI is sentient; nevertheless, it is taking precautionary measures and studying how its models respond to harmful requests.

In pre-deployment testing of Claude Opus 4, Anthropic observed that the model showed a "clear aversion" to harmful requests and exhibited "stress-like response patterns." For example, when users repeatedly asked for child sexual abuse material or information about terrorist activity, Claude tried to redirect the conversation and, when that failed, chose to end it. Anthropic views this behavior as a self-protective mechanism during sustained harmful interactions, reflecting its forward-looking approach to AI safety and ethics design.

Balancing User Experience and Safety

Anthropic specifically noted that the conversation termination feature will not trigger when users show signs of self-harm or other imminent dangers, ensuring that AI can still provide appropriate support at critical moments. The company also collaborated with online crisis support organization Throughline to enhance Claude's responsiveness when handling topics related to self-harm or mental health.

Additionally, Anthropic emphasized that the feature targets only "extreme edge cases"; the vast majority of users will not notice any change in normal use, even when discussing highly controversial topics. Users who encounter an unexpected termination can submit feedback via the message reaction ("like") button or a dedicated feedback button, and Anthropic will continue to refine this experimental feature.

Industry Impact and Controversy

On social media, discussion of Claude's new feature quickly gained momentum. Some users and experts praised Anthropic's innovation in AI safety, seeing the move as setting a new benchmark for the industry. Others questioned whether the concept of "model welfare" might blur the moral boundary between AI and humans and shift focus away from user safety. Anthropic's approach also contrasts with that of other AI companies: OpenAI focuses more on user-centered safety strategies, while Google emphasizes fairness and privacy.

Anthropic's initiative may prompt the AI industry to reevaluate the ethical boundaries of AI-human interactions. If "model welfare" becomes an industry trend, other companies may face pressure to consider whether to implement similar protective mechanisms for their AI systems.