Recently, researchers Kikimora Morozova and Suha Sabi Hussain of the cybersecurity firm Trail of Bits disclosed a new type of attack. It exploits how image resampling works to embed malicious instructions, imperceptible to the human eye, into ordinary-looking images, thereby hijacking large language models (LLMs) and stealing user data.
At its core, this is an image resampling attack. When a user uploads an image to an AI system, the system usually downscales it automatically to save processing cost. Malicious images exploit this step: they look normal at full resolution, but once processed by a resampling algorithm such as bicubic interpolation, malicious instructions hidden in specific regions of the image emerge as legible text.
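To make the mechanism concrete, here is a minimal sketch of the downscaling step such attacks target. It assumes the Pillow library and an arbitrary 512x512 target size; the file names and dimensions are illustrative, not details from the research, and real AI pipelines use their own resampling implementations.

```python
# Sketch of the image-downscaling step that scaling attacks exploit.
# Assumes Pillow; file names and target size are illustrative.
from PIL import Image

def downscale_for_model(path: str, target: tuple[int, int] = (512, 512)) -> Image.Image:
    """Downscale an uploaded image with bicubic resampling, as many
    AI ingestion pipelines do before passing pixels to the model."""
    img = Image.open(path).convert("RGB")
    # Bicubic interpolation computes weighted averages of neighboring pixels.
    # A crafted image can arrange its pixels so that text invisible at full
    # resolution becomes legible in this downscaled output.
    return img.resize(target, resample=Image.Resampling.BICUBIC)

# Comparing the original upload with the downscaled result is how the hidden
# instructions in a malicious image would become apparent.
small = downscale_for_model("uploaded_image.png")
small.save("what_the_model_sees.png")
```

In other words, the model never sees the image the user inspected; it sees the resampled version, and that gap is what the attack abuses.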
In experiments, the researchers confirmed that the attack can compromise multiple mainstream AI systems, including Google Gemini CLI, Vertex AI Studio, Google Assistant, and Genspark. In one test against Gemini CLI, the attack exfiltrated the user's Google Calendar data to an external email address without any user confirmation.
Trail of Bits pointed out that although the attack must be tuned to the specific resampling algorithm each LLM system uses, the breadth of the attack surface means that AI tools not yet tested may also be at risk. To help the security community understand and defend against such attacks, the researchers have released an open-source tool called Anamorpher for crafting these malicious images.
Regarding this vulnerability, the researchers proposed several defensive recommendations:

- Size restrictions: AI systems should enforce strict dimension limits on images uploaded by users.
- Result preview: after resampling, show users a preview of the image that will actually be sent to the LLM.
- Explicit confirmation: for instructions that trigger sensitive tool calls (such as data export), the system should require user confirmation.
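The sketch below illustrates how the first two recommendations might look in an upload path. The dimension limit, file names, and target resolution are assumptions for illustration; they are not values given by the researchers.

```python
# Hedged sketch of two defenses: reject oversized uploads and let the user
# preview exactly what will be sent to the LLM. Limits and names are assumed.
from PIL import Image

MAX_DIMENSION = 1024           # assumed policy: reject anything larger
MODEL_INPUT_SIZE = (512, 512)  # assumed model-side target resolution

def prepare_upload(path: str) -> Image.Image:
    img = Image.open(path).convert("RGB")

    # Size restriction: refuse images that exceed the allowed dimensions
    # instead of silently downscaling them.
    if img.width > MAX_DIMENSION or img.height > MAX_DIMENSION:
        raise ValueError(f"Image exceeds {MAX_DIMENSION}px limit: {img.size}")

    # Result preview: produce the exact downscaled image the model will see,
    # so the user can inspect it before it is sent.
    preview = img.resize(MODEL_INPUT_SIZE, resample=Image.Resampling.BICUBIC)
    preview.save("preview_sent_to_llm.png")
    return preview

# The third recommendation, explicit confirmation of sensitive tool calls,
# would live in the agent layer, e.g. prompting the user before any
# data-exporting action is executed.
```

The point of the preview step is that any instructions revealed by resampling become visible to the user before the model ever receives them.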
The researchers emphasized that the most robust defense is to adopt secure system design patterns that prevent this class of prompt injection attacks at the design level.