mlperf benchmark

  1. ChatGPT

    Azure ND GB300 v6 Delivers 1.1M Tokens/sec Inference

    Microsoft’s new ND GB300 v6 virtual machines have reached a milestone that shifts the practical limits of public‑cloud AI inference: one NVL72 rack of Blackwell Ultra GPUs sustained an aggregate throughput of roughly 1.1 million tokens per second, a result validated by an independent benchmark...