[AIbase Report] Although OpenAI is fully promoting its next-generation intelligent browser ChatGPT Atlas, the company's internal security director Dane Stuckey recently publicly expressed concerns about its potential security risks, drawing industry attention.

Stuckey pointed out that one of the major challenges facing Atlas is the "prompt injection" attack, which has not been completely resolved. This type of attack cleverly embeds malicious instructions in web pages, emails, or other content to trick the AI agent into performing unintended actions. Its impact could not only interfere with users' purchasing behavior but also lead to the theft of private data such as email content or login credentials.

OpenAI

He admitted that although OpenAI has conducted large-scale security testing and introduced multiple protection mechanisms and new model training methods in Atlas, "prompt injection" remains a difficult open issue that is difficult to completely eliminate in the short term.

To mitigate the risks, OpenAI has deployed two key defense measures in Atlas: the first is the "logout mode," which blocks the AI agent's access to user data when necessary, preventing information leaks from the source; the second is the "monitoring mode," which is suitable for sensitive websites and requires users to perform manual confirmation and supervision during critical interactions to ensure operational safety.

Stuckey said that the team is accelerating the development of more protective functions and rapid response systems to intervene and fix issues immediately when facing potential attacks. "The security challenges of Atlas are not just technical problems, but also a test of the new boundaries of human-AI collaboration," he emphasized.