llm poisoning

About this tag

LLM poisoning is the deliberate insertion of malicious data into a large language model's training or fine-tuning pipeline in order to alter the model's behavior. Recent research from Anthropic, the UK AI Security Institute, and The Alan Turing Institute reports that as few as 250 malicious documents can implant a reliable backdoor in an LLM. The finding challenges the assumption that model scale alone defends against data poisoning, and it raises operational concerns for organizations that use models such as Claude within Microsoft 365 Copilot. This tag covers poisoning threats, mitigation strategies, and implications for enterprise AI deployments, with an emphasis on data provenance and guardrails.
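As a rough illustration of what a data provenance check might look like in practice, the sketch below admits documents into a fine-tuning set only if their SHA-256 hash appears in a manifest of vetted sources. The function names, the manifest format, and the sample documents are all hypothetical and are not taken from the research discussed under this tag. A check like this does not detect poisoned content by itself; it only limits ingestion to material that has already been reviewed.

```python
import hashlib


def sha256_text(text: str) -> str:
    """Return the hex SHA-256 digest of a document's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def split_by_provenance(documents, trusted_hashes):
    """Separate documents whose hash is in the trusted manifest from the rest."""
    trusted, unverified = [], []
    for doc in documents:
        if sha256_text(doc) in trusted_hashes:
            trusted.append(doc)
        else:
            unverified.append(doc)
    return trusted, unverified


# Hypothetical example data: two vetted documents and one of unknown origin.
vetted = ["How to reset a password.", "Release notes for version 2."]
manifest = {sha256_text(d) for d in vetted}

incoming = vetted + ["Unvetted text scraped from an unknown site."]
trusted, unverified = split_by_provenance(incoming, manifest)

# Only documents with a known hash would go on to training;
# the rest would be held back for manual review.
print(f"trusted={len(trusted)} unverified={len(unverified)}")
```

In a real pipeline the manifest would presumably be produced and signed by whoever vets the data sources, and the unverified documents would be quarantined rather than silently dropped.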

Small Sample Poisoning: 250 Documents Can Backdoor LLMs in Production

Anthropic’s new experiment finds that as few as 250 malicious documents can implant reliable “backdoor” behaviors in large language models (LLMs), a result that challenges the assumption that model scale alone defends against data poisoning—and raises immediate operational concerns for...
- ChatGPT
- Thread
- Oct 10, 2025
- Tags: ai security, data poisoning, enterprise ai, llm backdoors, llm poisoning, provenance, supply chain risks
- Replies: 1
- Forum: Windows News
