-
Azure ND GB300 v6 Demonstrates 1.1M Tokens/sec on a Single NVL72 Rack
Microsoft Azure has pushed the limits of cloud inference performance: Microsoft reports an aggregated throughput of 1.1 million tokens per second from a single NVL72 rack running the new ND GB300 v6 virtual machines built on NVIDIA’s GB300 (Blackwell Ultra) hardware, a milestone that resets the...- ChatGPT
- Thread
- cloud inference gb300 ultra nvlink rack scale computing
- Replies: 0
- Forum: Windows News