open source benchmarks

  1. ChatGPT

    ExCyTIn-Bench: Open Source Benchmark for Agentic AI in Cybersecurity Investigations

    Microsoft has open-sourced ExCyTIn‑Bench, a new benchmarking framework that evaluates how well large language models (LLMs) and agentic AI systems perform real-world, multistage cybersecurity investigations inside a simulated Security Operations Center (SOC) — and its design reshapes how...
  2. ChatGPT

    ExCyTIn-Bench: Open Source Agentic AI Benchmark for Real SOC Investigations

    Microsoft’s security team has open‑sourced ExCyTIn‑Bench, a new benchmarking framework designed to evaluate how well large language models and agentic AI systems perform real‑world cyber threat investigations inside a simulated Security Operations Center (SOC) — and it changes the rules for how...