Microsoft's recent unveiling of the Azure ND GB200 v6 Virtual Machines (VMs) marks a significant milestone in the evolution of AI infrastructure. These VMs, powered by NVIDIA's GB200 Grace Blackwell Superchips, are poised to redefine the cost-performance dynamics in AI computing.

Architectural Innovations

At the heart of the ND GB200 v6 series lies the NVIDIA GB200 Superchip, a fusion of two Blackwell GPUs and a Grace CPU, interconnected via the NVLink-C2C interface. This design facilitates high-speed, coherent access to a unified memory space, streamlining programming and supporting the substantial memory demands of next-generation large language models (LLMs) (techcommunity.microsoft.com).
Each ND GB200 v6 VM is equipped with:
  • 128 vCPUs: Leveraging NVIDIA Grace CPUs.
  • 900 GB of Memory: Utilizing LPDDR5X technology on the Grace CPUs.
  • Four NVIDIA Blackwell GPUs: Each with 192 GB of high-bandwidth memory (HBM3e).
  • Local Storage: 16 TB NVMe Direct across four disks.
  • Networking: 160 Gb/s Ethernet connectivity (learn.microsoft.com).
Fifth-generation NVLink provides a total of 4× 1.8 TB/s of NVLink bandwidth per VM, ensuring high-speed communication between the GPUs within a VM. In addition, a scale-out backend network offers 4× 400 Gb/s NVIDIA Quantum-2 CX7 InfiniBand connections per VM, enabling efficient interconnection of multiple VMs (learn.microsoft.com).
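A quick way to confirm this layout from inside a running VM is to enumerate the GPUs with NVML. The sketch below uses the nvidia-ml-py bindings (an assumption about the tooling installed on the VM image, not something specific to the ND GB200 v6 series) and should report four GPUs with roughly 192 GB of HBM each.

```python
# Minimal sketch: list the GPUs visible inside the VM via NVML.
# Assumes the NVIDIA driver and the nvidia-ml-py package are installed.
import pynvml

pynvml.nvmlInit()
try:
    count = pynvml.nvmlDeviceGetCount()
    print(f"Visible GPUs: {count}")  # expected: 4 per ND GB200 v6 VM
    for i in range(count):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):  # older bindings return bytes
            name = name.decode()
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"GPU {i}: {name}, {mem.total / 1e9:.0f} GB HBM")  # ~192 GB each
finally:
    pynvml.nvmlShutdown()
```

Running nvidia-smi topo -m in the same VM additionally shows the NVLink connectivity between the four GPUs.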

Performance Benchmarks

The ND GB200 v6 VMs have demonstrated exceptional performance metrics:
  • GEMM Performance: Achieved a sustained 2,744 TFLOPS on FP8 workloads, roughly double the previous H100 generation (techcommunity.microsoft.com); a minimal measurement sketch follows this list.
  • Memory Bandwidth: Attained 7.35 TB/s, operating at 92% efficiency of its theoretical peak (techcommunity.microsoft.com).
  • Inference Throughput: Set a world record by processing 865,000 tokens per second on the Llama 2 70B model, a 9x increase per rack compared to the ND H100 v5 VMs (techcommunity.microsoft.com).
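To make the GEMM figure above concrete, the sketch below shows one common way such a throughput number is measured: time a batch of large matrix multiplications and convert the result to TFLOPS. It uses BF16 matmuls in PyTorch as a stand-in, since FP8 GEMM paths are exposed differently across libraries, so the number it prints will not match the FP8 result; the methodology (2·M·N·K floating-point operations per product, divided by wall-clock time) is what it illustrates.

```python
# Minimal GEMM throughput sketch (BF16 stand-in for the FP8 figure quoted above).
import time
import torch

M = N = K = 8192
a = torch.randn(M, K, device="cuda", dtype=torch.bfloat16)
b = torch.randn(K, N, device="cuda", dtype=torch.bfloat16)

for _ in range(10):          # warm-up
    torch.matmul(a, b)
torch.cuda.synchronize()

iters = 100
start = time.perf_counter()
for _ in range(iters):
    torch.matmul(a, b)
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

flops = 2 * M * N * K * iters       # one multiply-add counted as 2 FLOPs
print(f"Sustained GEMM throughput: {flops / elapsed / 1e12:.0f} TFLOPS")
```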

Scalability and Networking

The ND GB200 v6 architecture scales up to 18 compute servers connected through NVIDIA NVLink Switch trays, forming a single NVLink domain of 72 Blackwell GPUs that delivers up to 1.4 exaFLOPS of FP4 Tensor Core throughput (techcommunity.microsoft.com).
The integration of NVIDIA Quantum-2 InfiniBand networking provides 400 Gb/s dedicated bandwidth to each GPU, with 1.6 Tb/s per VM and 28.8 Tb/s per GB200 NVL72 hyper-computer, facilitating efficient scaling to tens of thousands of GPUs (techcommunity.microsoft.com).
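These aggregate numbers follow directly from the per-GPU figures already quoted; the short check below reproduces them (constants are taken from this article, not measured).

```python
# Back-of-the-envelope check of the aggregate bandwidth figures quoted above.
GPUS_PER_VM = 4
VMS_PER_NVL72 = 18                # 18 compute servers per NVLink domain
IB_PER_GPU_GBPS = 400             # Quantum-2 CX7 InfiniBand, Gb/s per GPU
NVLINK_PER_GPU_TBPS = 1.8         # fifth-generation NVLink, TB/s per GPU

ib_per_vm = GPUS_PER_VM * IB_PER_GPU_GBPS / 1000
ib_per_rack = GPUS_PER_VM * VMS_PER_NVL72 * IB_PER_GPU_GBPS / 1000
nvlink_per_vm = GPUS_PER_VM * NVLINK_PER_GPU_TBPS

print(f"InfiniBand per VM:    {ib_per_vm:.1f} Tb/s")    # 1.6 Tb/s
print(f"InfiniBand per NVL72: {ib_per_rack:.1f} Tb/s")  # 28.8 Tb/s
print(f"NVLink per VM:        {nvlink_per_vm:.1f} TB/s") # 7.2 TB/s (4 x 1.8)
```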

Energy Efficiency and Sustainability

The NVIDIA Blackwell platform introduces significant energy efficiency improvements. The GB200 Superchip consumes 25% less power than its H100 predecessor while delivering a 30x performance increase (techcommunity.microsoft.com). This advancement aligns with Microsoft's commitment to sustainability, offering enhanced performance without a proportional increase in energy consumption.
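Taken at face value, the two figures above imply the performance-per-watt gain computed below; this is an inference from the quoted numbers, assuming both refer to the same workload and system boundary, not a separately reported metric.

```python
# Implied efficiency gain from the quoted figures (derived, not measured):
# 30x the performance at 0.75x the power.
perf_gain = 30.0
power_ratio = 1.0 - 0.25
print(f"Implied performance per watt vs. H100: {perf_gain / power_ratio:.0f}x")  # ~40x
```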

Security Enhancements

Security remains a top priority in the ND GB200 v6 series. Each server is equipped with an Azure Integrated Hardware Security Module (HSM), which strengthens key management by keeping encryption and signing keys within the hardware security module. This design meets the requirements of FIPS 140-3 Level 3 certification, providing robust protection for sensitive data (techcommunity.microsoft.com).

Implications for AI Development

The introduction of the ND GB200 v6 VMs is set to accelerate AI development across various sectors:
  • Research and Development: Facilitates the training of more complex and larger AI models, pushing the boundaries of AI capabilities.
  • Enterprise Applications: Enables businesses to deploy sophisticated AI solutions with improved performance and cost-efficiency.
  • Startups and Innovators: Provides access to cutting-edge AI infrastructure, leveling the playing field for smaller entities.

Conclusion

Microsoft's Azure ND GB200 v6 VMs, powered by NVIDIA's GB200 Superchips, represent a transformative advancement in AI infrastructure. By delivering unprecedented performance, scalability, and energy efficiency, these VMs are poised to reshape the AI cost-performance landscape, enabling a new era of innovation and development in artificial intelligence.

Source: DIGITIMES, "Microsoft debuts GB200 v6 VM, reshapes AI cost-performance race"
 
