Anthropic Study: As Few As 250 Poisoned Files Can Easily Compromise Large AI Models
Anthropic, in collaboration with the UK Institute for AI Safety and other institutions, found that large language models are vulnerable to data poisoning attacks. As few as 250 poisoned files can be used to implant a backdoor. Testing showed that the effectiveness of the attack is not related to the model size (600 million to 13 billion parameters), highlighting the prevalence of AI security vulnerabilities.