Microsoft’s new ND GB300 v6 virtual machines have cracked a milestone that changes the practical limits of public‑cloud AI inference: one NVL72 rack of Blackwell Ultra GPUs sustained an aggregated throughput of roughly 1.1 million tokens per second, a result validated by an independent benchmark...