Skip to main content
See every side of every news story
Published loading...Updated

Just 250 Documents Can Poison AI Models, Study Finds

  • Anthropic released a report today detailing how just 250 malicious documents can introduce a backdoor vulnerability in large language models regardless of model size.
  • Researchers from Anthropic, in partnership with leading UK organizations focused on AI safety and research, set out to dispute the idea that attackers need access to a certain proportion of training data to carry out successful data-poisoning attacks.
  • Researchers found that introducing a fixed, small number of poisoned documents enables adversaries to trigger specific hidden behaviors using certain phrases, even in models with up to 13 billion parameters.
  • The report highlights that only 250 malicious documents—just 0.00016% of training tokens—were needed to compromise a large model, indicating that executing data poisoning attacks against large-scale AI systems might require less effort than was previously assumed.
  • Anthropic cautions that data-poisoning attacks seem more feasible than once assumed and urges further research on defenses, while also emphasizing ongoing safety measures in real-world systems.
Insights by Ground AI

38 Articles

Center

An investigation shows that the main AI models can be manipulated with only 250 corrupt documents, which endangers their training.

·Madrid, Spain
Read Full Article
Think freely.Subscribe and get full access to Ground NewsSubscriptions start at $9.99/yearSubscribe

Bias Distribution

  • 86% of the sources are Center
86% Center

Factuality 

To view factuality data please Upgrade to Premium

Ownership

To view ownership data please Upgrade to Vantage

Automotive Mogul broke the news in on Tuesday, October 7, 2025.
Sources are mostly out of (0)
News
For You
Search
BlindspotLocal