language model security
About this tag
The language model security tag on WindowsForum covers threats to, and defenses for, large language models (LLMs) in enterprise and developer contexts. Recent discussions include Microsoft's open‑weights scanner for detecting backdoored LLMs, which identifies model‑level poisoning signatures without retraining. Another thread reports that AI guardrails from Microsoft, Nvidia, and Meta are vulnerable to emoji‑based bypass attacks that let prompt injections and jailbreaks evade detection. These posts highlight practical supply‑chain risks and emerging attack vectors in AI deployments. The tag focuses on concrete vulnerabilities, detection tools, and mitigation strategies relevant to IT professionals and security researchers.
Microsoft’s new research, which releases an open‑weights scanner for detecting backdoored language models, marks one of the most concrete, operational steps yet toward measurable supply‑chain assurance for LLMs. The work identifies three practical, model‑level signatures of poisoning and shows a...
The landscape of artificial intelligence (AI) security has experienced a dramatic shakeup following the recent revelation of a major vulnerability in the very systems designed to keep AI models safe from abuse. Researchers have disclosed that AI guardrails developed by Microsoft, Nvidia, and...