Agentic AI benchmarks

  1. ExCyTIn-Bench: Open-Source Agentic AI Benchmark for Real SOC Investigations

    Microsoft’s security team has open-sourced ExCyTIn-Bench, a new benchmarking framework for evaluating how well large language models and agentic AI systems perform real-world cyber threat investigations inside a simulated Security Operations Center (SOC), and it changes the rules for how...