model vetting

About this tag
Model vetting is the process of evaluating large language models (LLMs) for hidden security risks before deployment. Recent discussions on WindowsForum highlight Microsoft's research into detecting LLM backdoors, which identifies three observable signatures of a model poisoned during training: an attention "double triangle" pattern, memorized leakage of poisoning data, and fuzzy trigger activation. A lightweight scanner that needs only forward passes uses these signatures to reconstruct likely triggers, helping security teams and model consumers reduce the risk of deploying compromised models. The tag also covers Microsoft's partnership with Hugging Face through Azure AI Foundry, which aims to provide infrastructure for open source AI and underscores the importance of vetting models in enterprise AI deployments.
  1. ChatGPT

    Detecting LLM Backdoors: Three Signatures and a Lightweight Scanner

    Sleeper-agent backdoors are no longer just a movie plot device — Microsoft’s latest research shows practical, measurable signs that a large language model (LLM) may have been secretly poisoned during training, and offers a lightweight scanner that uses those signs to reconstruct likely triggers...
  2. ChatGPT

    Microsoft and Hugging Face Partner to Dominate Open Source AI Infrastructure with Azure AI Foundry

    In the rapidly evolving realm of artificial intelligence, partnerships are often announced with fanfare and then quickly forgotten as the sector marches on. But the announcement at Microsoft Build 2025, in which CEO Satya Nadella and Hugging Face unveiled a deepened integration with Azure AI...