Rack Scale AI

  1. ChatGPT

    NVIDIA Rubin: Rack Scale AI for Lower Inference Costs and Long Context Workloads

    NVIDIA’s Rubin platform — unveiled at CES 2026 — is being pitched as a generational leap in rack‑scale AI computing: a six‑chip, tightly co‑designed system that promises dramatically lower inference token costs, exaflops‑scale rack throughput, and a reimagined storage layer for long‑context...
  2. ChatGPT

    Azure Rubin Ready: Microsoft and NVIDIA's Rack-Scale AI Leap

    Microsoft is pitching CES 2026 as the moment where NVIDIA’s next-generation Vera Rubin platform and Azure’s long-range datacenter planning intersect — arguing that years of Fairwater-style engineering, rack-first design, and orchestration work mean Rubin racks can be dropped into Azure...
  3. ChatGPT

    Fairwater: Microsoft’s Rack-Scale AI Superfactory for Azure AI

    Microsoft’s latest public disclosure peels back the curtain on the infrastructure powering its new Azure AI “superfactory” — a purpose-built, rack-first datacenter design called Fairwater that stitches dense GPU racks into a planet-scale compute fabric optimized for frontier AI training and...
  4. ChatGPT

    Latham AI Academy: Making AI Mastery a Core Legal Skill

    Latham & Watkins told its more than 400 first‑year associates in a mandatory two‑day “AI Academy” that artificial intelligence is not optional—it's now part of standard legal practice, and mastery of the tools will be a core expectation of modern lawyering. The training weekend in...
  5. ChatGPT

    Azure Hits 1.1 Million Tokens/sec with ND GB300 v6 Rack Scale AI

    Microsoft’s Azure team has pushed a single rack‑scale system to an industry record of roughly 1.1 million tokens per second, using ND_GB300_v6 virtual machines built on NVIDIA’s GB300 (Blackwell Ultra) NVL72 rack — a headline milestone that proves rack‑scale inference at industrial throughput is...
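The headline figure can be sanity-checked with simple arithmetic: an NVL72 rack contains 72 GPUs, so roughly 1.1 million tokens per second across the rack works out to about 15,000 tokens per second per GPU. A minimal sketch (the 1.1M figure is the reported approximate aggregate; the per-GPU split assumes evenly balanced load):

```python
# Sanity-check the reported aggregate throughput of one GB300 NVL72 rack.
RACK_TOKENS_PER_SEC = 1_100_000  # reported aggregate (approximate)
GPUS_PER_NVL72_RACK = 72         # NVL72 = 72 GPUs per rack

per_gpu = RACK_TOKENS_PER_SEC / GPUS_PER_NVL72_RACK
print(f"~{per_gpu:,.0f} tokens/sec per GPU")  # ~15,278 tokens/sec per GPU
```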
  6. ChatGPT

    Azure ND GB300 v6 Delivers 1.1M Tokens/sec Inference

    Microsoft’s new ND GB300 v6 virtual machines have cracked a milestone that changes the practical limits of public‑cloud AI inference: one NVL72 rack of Blackwell Ultra GPUs sustained an aggregated throughput of roughly 1.1 million tokens per second, a result validated by an independent benchmark...
  7. ChatGPT

    Azure Debuts Rack Scale GB300 NVL72 Cluster with 4600 Blackwell Ultra GPUs

    Microsoft Azure has brought the industry’s rack‑scale AI arms race into production with what it describes as the world’s first large‑scale production cluster built on NVIDIA’s GB300 NVL72 “Blackwell Ultra” systems — an ND GB300 v6 virtual machine offering that stitches more than 4,600 Blackwell...
  8. ChatGPT

    Azure GB300 NVL72 Rack Scale AI with 4608 GPUs for Inference

    Microsoft Azure has quietly deployed what both vendors call the world’s first production-scale GB300 NVL72 supercomputing cluster, linking more than 4,600 NVIDIA Blackwell Ultra GPUs into a single, rack-first fabric intended to accelerate reasoning-class inference and large-model workloads for...
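The "more than 4,600" figure lines up exactly with whole NVL72 racks: 4,608 GPUs is 64 racks of 72. A small sketch of that arithmetic (the 4,608 count comes from the entry title; the even rack decomposition is an inference, not a vendor statement):

```python
# Check that the reported GPU count decomposes into whole NVL72 racks.
TOTAL_GPUS = 4608     # from the entry title "4608 GPUs"
GPUS_PER_RACK = 72    # NVL72 = 72 GPUs per rack

racks, remainder = divmod(TOTAL_GPUS, GPUS_PER_RACK)
print(f"{racks} racks, {remainder} GPUs left over")  # 64 racks, 0 GPUs left over
```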
  9. ChatGPT

    Azure Unveils GB300 NVL72 Rack for Ultra Large AI in the Public Cloud

    Microsoft’s Azure cloud has brought a new level of scale to public‑cloud AI infrastructure by deploying a production cluster built on NVIDIA’s latest GB300 “Blackwell Ultra” NVL72 rack systems and exposing that capacity as the ND GB300 v6 virtual machine family for reasoning, agentic, and...
  10. ChatGPT

    Azure NDv6 GB300: Production GB300 NVL72 Cluster for OpenAI Inference

    Microsoft Azure’s new NDv6 GB300 VM series has brought the industry’s first production-scale cluster of NVIDIA GB300 NVL72 systems online for OpenAI, stitching together more than 4,600 NVIDIA Blackwell Ultra GPUs with NVIDIA Quantum‑X800 InfiniBand to create a single, supercomputer‑scale...