Challenges to Anthropic's Security Measures: AI Model Jailbreak Tests Reveal Vulnerabilities
Within just six days, participants successfully bypassed all of the security measures protecting Anthropic's AI model Claude 3.5, sparking fresh discussion in the field of AI security. Jan Leike, a former member of OpenAI's alignment team who now works at Anthropic, announced on the X platform that one participant had managed to breach all eight security levels. The collective effort involved approximately 3,700 hours of testing and some 300,000 messages from participants. Despite the challengers