cybersecurity benchmarking
About this tag
Cybersecurity benchmarking on WindowsForum.com covers the evaluation of security tools and AI systems in real-world scenarios. A featured thread discusses ExCyTIn-Bench, an open-source benchmark from Microsoft's security team that tests large language models and agentic AI in simulated Security Operations Center (SOC) investigations. This benchmark moves beyond static Q&A to assess how AI performs actual cyber threat investigation workflows. The tag focuses on practical measurement of security AI effectiveness, relevant for IT professionals and security teams evaluating next-generation defense tools.
Microsoft’s security team has open‑sourced ExCyTIn‑Bench, a new benchmarking framework designed to evaluate how well large language models and agentic AI systems perform real‑world cyber threat investigations inside a simulated Security Operations Center (SOC) — and it changes the rules for how...