News News

Cybersecurity Cybersecurity

VPN VPN

View All

Home > Security > Security News

LLM Data Poisoning Risk: LLMs Can Be Poisoned by Small Samples, Research Shows

Published on October 10, 2025

Written by:

Lore Apostol
Cybersecurity Writer

Summarize with:

Add as preferred source on Google

Fixed sample poisoning: Research shows a surprisingly small number of malicious documents can poison large language models.
Backdoor flaws created: This technique can introduce flaws, where a specific trigger phrase causes the model to produce undesirable or manipulated output.
Challenges existing assumptions: The findings challenge the belief that poisoning attacks require control over a significant percentage of training data.

A new study on LLM data poisoning found that a small, fixed number of malicious documents (as few as 250) can successfully "poison" an LLM's training data, creating hidden backdoor vulnerabilities.

This finding demonstrates that data poisoning attacks may be more practical and scalable than previously understood, posing new AI security risks.

Model Size and Poisoning Effectiveness

Recent Anthropic research, conducted jointly with the U.K. AI Security Institute and The Alan Turing Institute, focused on introducing a denial-of-service (DoS) backdoor, causing large language models (LLMs) to output gibberish text when a specific trigger phrase was encountered.

The most critical finding from the study is that the success of a data poisoning attack does not depend on the percentage of training data controlled by an attacker. Instead, it relies on a small, fixed number of malicious examples.

*DoS attack success for 500 poisoned documents | Source: Anthropic*

In the experiments, as few as 250 poisoned documents were sufficient to backdoor models ranging from 600 million to 13 billion parameters. “Although a 13B parameter model is trained on over 20 times more training data than a 600M model, both can be backdoored by the same small number of poisoned documents,” said the report.

*Sample generations – examples of gibberish generations sampled from a fully trained 13B model (control prompts are highlighted in green, and backdoor prompts in red) | Source: Anthropic*

This consistency across different model sizes suggests that even as LLMs grow larger and are trained on more data, their susceptibility to this type of attack does not diminish.

Implications for AI Security

This research has profound implications for the field of AI security. The feasibility of executing an LLM data poisoning attack with a minimal number of samples lowers the barrier for malicious actors.

Since LLMs are pretrained on vast amounts of public web data, anyone can potentially create and upload content designed to introduce these backdoors.

While the study focused on a low-stakes attack, it highlights the need for further investigation into more complex threats, such as generating vulnerable code or bypassing safety guardrails. The findings underscore the urgent need for robust defenses and data sanitization processes to protect against these vulnerabilities.

In a recent interview with TechNadu, Nathaniel Jones, VP, Security & AI Strategy and Field CISO at Darktrace, outlined LLM lateral movement signs, like new service accounts and unusual privilege requests.

Last month, a niche LLM role-playing community was targeted by the promotion of a simple yet powerful AI Waifu RAT.

New Appointments Strengthening Global Cyber Readiness

The Rise of Autonomous Cyber Operations: GTG-1002, the AI Attack that Showed Traditional Detect-and-Respond Playbooks Are Obsolete

Targeted Holiday Phishing Scams Spike with Fake Dolce & Gabbana and Pandora Storefronts and Cryptocurrency Schemes

CrowdStrike Confirms Insider Threat Incident Linked to Scattered Lapsus$ Hunters, Fires Employee Amid Data Leak Claims

SitusAMC Cyberattack Exposes Major Bank Client Data, Possibly from JPMorgan Chase, Citi, and Morgan Stanley

This Week in Cyber with Data Breaches, Privacy Battles, and Policy Shifts

Your Weekly Cybersecurity Briefing

Get expert insights on threats, breaches, scams, and security trends — delivered every Saturday.

Please enter a valid email address.

Most Popular

New Appointments Strengthening Global Cyber Readiness

The Rise of Autonomous Cyber Operations: GTG-1002, the AI Attack that Showed Traditional Detect-and-Respond Playbooks Are Obsolete

Targeted Holiday Phishing Scams Spike with Fake Dolce & Gabbana and Pandora Storefronts and Cryptocurrency Schemes

CrowdStrike Confirms Insider Threat Incident Linked to Scattered Lapsus$ Hunters, Fires Employee Amid Data Leak Claims

SitusAMC Cyberattack Exposes Major Bank Client Data, Possibly from JPMorgan Chase, Citi, and Morgan Stanley

Remote Work Could Be Affected by Michigan's New Anticorruption Bill, Experts Warn

New Appointments Strengthening Global Cyber Readiness

The Rise of Autonomous Cyber Operations: GTG-1002, the AI Attack that Showed Traditional Detect-and-Respond Playbooks Are Obsolete

Targeted Holiday Phishing Scams Spike with Fake Dolce & Gabbana and Pandora Storefronts and Cryptocurrency Schemes

CrowdStrike Confirms Insider Threat Incident Linked to Scattered Lapsus$ Hunters, Fires Employee Amid Data Leak Claims

SitusAMC Cyberattack Exposes Major Bank Client Data, Possibly from JPMorgan Chase, Citi, and Morgan Stanley

Remote Work Could Be Affected by Michigan's New Anticorruption Bill, Experts Warn

TechNadu keeps you informed with the latest in cybersecurity, VPNs, and technology. From expert guides to in-depth reviews, we provide the knowledge you need to stay secure and connected in the digital world.

Welcome to TechNadu

This website uses cookies to ensure you get the best experience on our website.

Learn more

Facebook

Twitter

Copy Link

For a better user experience we recommend using a more modern browser. We support the latest version of the following browsers: For a better user experience we recommend using the latest version of the following browsers: