Oracle’s OCI Zettascale10 is a clear escalation in the cloud AI arms race: a purpose‑built, multi‑data‑center supercluster that Oracle says will link up to 800,000 NVIDIA GPUs and deliver as much as 16 zettaFLOPS of peak AI performance, with availability planned in the second half of 2026.
Background
Oracle announced OCI Zettascale10 at Oracle AI World in Las Vegas, positioning the cluster as “the largest AI supercomputer in the cloud” and the next step beyond its initial Zettascale efforts introduced in 2024. The company frames Zettascale10 as a fabric‑level innovation: a combination of dense GPU capacity, a new Ethernet/RDMA networking fabric it calls Oracle Acceleron RoCE, and partnerships with major GPU vendors and AI platform providers. This announcement lands against a backdrop of hyperscalers and chipmakers racing to satisfy the explosive demand for training and running generative AI models. Oracle’s move follows a pattern where cloud providers pair tightly engineered racks and networking with vendor ecosystems to offer pay‑as‑you‑use access to scale previously available only to organizations operating huge private HPC facilities. Several independent outlets reported Oracle’s Zettascale10 figures soon after the keynote, echoing the company’s public claims.
What Oracle is claiming — the headline specs
- Peak performance: Oracle states OCI Zettascale10 can deliver up to 16 zettaFLOPS of peak AI compute across its multi‑data‑center clusters.
- GPU scale: The system is designed to scale to up to 800,000 NVIDIA GPUs in multi‑gigawatt clusters.
- Network fabric: Zettascale10 is built on Oracle Acceleron RoCE — a RoCEv2‑based fabric that isolates traffic into multiple planes and leverages GPU NIC switching to reduce tiers and latency.
- Multi‑vendor expansion: Oracle also plans AMD‑based superclusters using 50,000 AMD Instinct MI450 Series GPUs in initial deployments beginning in Q3 2026, reflecting a multi‑vendor supply strategy.
- Partnerships: Oracle confirmed a flagship deployment with OpenAI in Abilene, Texas (the Stargate program), and emphasized collaboration with NVIDIA for GPU systems and stack-level integration.
Technical architecture: how Oracle says Zettascale10 will work
Distributed gigawatt campuses and density optimization
Oracle describes Zettascale10 as a collection of gigawatt‑scale data center campuses placed within a roughly two‑kilometer radius to minimize intra‑cluster latency while still allowing multiple facilities to share a unified fabric. The idea is to house very dense, liquid‑cooled racks close enough to behave like a single cluster for latency‑sensitive AI training. That physical clustering reduces the need for ultra‑long‑haul GPU‑to‑GPU links and enables better utilization of power and cooling within a constrained geographic footprint.
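To make the latency rationale concrete, here is a rough, illustrative calculation (ours, not Oracle’s) of one‑way fiber propagation delay at a few distances, assuming signals travel at roughly two‑thirds of the speed of light in optical fiber:

```python
# Rough, illustrative estimate of one-way fiber propagation delay; the
# distances and velocity factor are assumptions, not Oracle figures.
SPEED_OF_LIGHT_KM_S = 299_792      # km/s in vacuum
FIBER_VELOCITY_FACTOR = 0.67       # light travels at roughly 2/3 c in fiber

def one_way_delay_us(distance_km: float) -> float:
    """Propagation delay in microseconds over a fiber run of distance_km."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR) * 1e6

for km in (2, 10, 100):
    print(f"{km:>4} km fiber run -> ~{one_way_delay_us(km):.1f} us one-way")
# Roughly 10 us per 2 km: distance sets a hard floor under GPU-to-GPU latency
# before any switching overhead, which is what motivates the tight campus radius.
```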
Oracle Acceleron RoCE: RDMA over Converged Ethernet, re‑imagined
At the network layer Oracle is pushing Acceleron RoCE, a RoCEv2‑oriented design that treats GPU NICs as active switching elements and uses multiple isolated network “planes.” Oracle’s stated benefits are listed below (a minimal plane‑failover sketch follows the list):
- Lower effective latency by eliminating traditional multi‑tier switch hops.
- Resilience through plane isolation — traffic can move to another plane if one becomes congested or degraded.
- Power efficiency via Linear Pluggable Optics (LPO) and Linear Receiver Optics (LRO) while retaining 400G/800G bandwidth.
- Operational flexibility for plane‑level updates and maintenance without disrupting full‑cluster jobs.
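To illustrate the plane‑isolation idea in the abstract, below is a minimal sketch of host‑side flow‑to‑plane selection with failover; the plane names, health model, and hashing policy are assumptions for illustration and do not describe Oracle’s Acceleron implementation.

```python
# Illustrative sketch only: plane names, health model, and hashing policy are
# assumptions for illustration, not Oracle's Acceleron design.
PLANES = ["plane-a", "plane-b", "plane-c", "plane-d"]

def healthy_planes(congested: set[str]) -> list[str]:
    """Return the planes currently considered usable."""
    return [p for p in PLANES if p not in congested]

def pick_plane(flow_id: int, congested: set[str]) -> str:
    """Hash a flow onto a healthy plane; when the congested set changes,
    affected flows are re-hashed onto the remaining healthy planes."""
    usable = healthy_planes(congested)
    if not usable:
        raise RuntimeError("no usable network plane")
    return usable[flow_id % len(usable)]

# Example: plane-b degrades mid-job; a flow that hashed to it moves elsewhere.
print(pick_plane(42, congested=set()))        # all planes healthy
print(pick_plane(42, congested={"plane-b"}))  # traffic steered off plane-b
```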
GPU mix and compute density
Oracle is pitching Zettascale10 primarily on NVIDIA’s next‑generation stack — the company’s release ties the fabric to NVIDIA AI infrastructure and references Blackwell‑class GPUs for the highest‑density NVIDIA deployments. Independent reporting and vendor commentary indicate that Oracle intends to offer extremely large NVIDIA‑based clusters while simultaneously preparing AMD‑based Helios rack clusters using the Instinct MI450 series for customers who want alternatives to NVIDIA. This gives customers choice across vendor architectures and mitigates single‑vendor supply risk.
Parsing the “16 zettaFLOPS” claim — what it likely means
Oracle’s press release and subsequent coverage quote a headline figure of 16 zettaFLOPS of peak performance. There are three important clarifications readers should bear in mind (a back‑of‑envelope per‑GPU calculation follows the list):
- Peak vs. sustained: “Peak” FLOPS usually refers to theoretical maximum arithmetic throughput under specific low‑precision formats and idealized conditions. Actual sustained throughput during model training depends on memory bandwidth, sparsity utilization, communication overhead, and real‑world precision (FP16/FP8, or newer sparse formats). Oracle’s press materials present 16 zettaFLOPS as a peak value rather than a typical sustained training throughput.
- Precision matters: Different FLOPS measures use different numeric formats (FP64, FP32, FP16, FP8, or FP4, with or without structured sparsity). Some third‑party reporting has inferred or suggested that vendors often use low‑precision or sparse formats to advertise massive peak FLOPS figures; Oracle’s public messaging does not pin the 16 zettaFLOPS to a single numeric format in the press release, so readers should not equate that headline with sustained FP32 performance. Where trade coverage mentions FP8/FP16 or sparse FP4 as candidate representations, those are interpretative and not explicitly declared by Oracle in the headline release. Treat format assumptions cautiously.
- Workload dependency: Training very large language models benefits disproportionately from matrix‑multiply throughput and low‑latency collective operations; other classes of workloads (HPC, double‑precision scientific simulations) are not structured to exploit a GPU‑centric zettaFLOPS figure. Oracle’s framing is explicit: Zettascale10 targets next‑generation AI workloads (training very large models and dense inference fabrics), not primarily legacy double‑precision HPC.
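One way to put the headline in context is simple division: spreading the quoted peak across the quoted maximum GPU count gives an implied per‑GPU figure. The sketch below uses only the two numbers Oracle has published; the interpretation in the comments is ours.

```python
# Back-of-envelope check: the only inputs are Oracle's two headline numbers.
PEAK_ZETTAFLOPS = 16        # quoted peak AI performance
ZETTA = 1e21
GPU_COUNT = 800_000         # quoted maximum GPU count

per_gpu_flops = PEAK_ZETTAFLOPS * ZETTA / GPU_COUNT
print(f"Implied per-GPU peak: {per_gpu_flops / 1e15:.0f} petaFLOPS")
# -> roughly 20 petaFLOPS per GPU, a level that current accelerators reach only
#    in very low-precision (and typically sparsity-assisted) formats, which is
#    why the headline should not be read as sustained FP32/FP16 throughput.
```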
Partnerships and vendor strategy: NVIDIA, AMD, and OpenAI
NVIDIA: scale with Blackwell and stack integration
Oracle’s Zettascale10 is tightly coupled with NVIDIA AI infrastructure for the NVIDIA‑sourced portion of the clusters. Oracle emphasizes stack integration — hardware plus optimized software and networking for cluster‑wide efficiency — and NVIDIA’s leadership in GPU accelerators is central to that story. Several outlets echoed Oracle’s message that NVIDIA is the primary vendor for reaching the full 800,000‑GPU scale on Zettascale10.
AMD: multi‑vendor expansion and 50,000 MI450 GPUs
Oracle also announced a substantial AMD collaboration: an initial 50,000 AMD Instinct MI450 Series GPU deployment, slated to begin in calendar Q3 2026, with expansion planned into 2027 and beyond. This was reported by multiple news outlets and attributed to Oracle/AMD announcements. The AMD deployment uses the company’s Helios rack design and shows Oracle is pursuing a multi‑vendor cloud strategy to diversify supply and appeal to customers with different stack preferences.
OpenAI and Stargate Abilene
Oracle confirmed that Zettascale10 forms the fabric underpinning a flagship supercluster built with OpenAI in Abilene, Texas, as part of the Stargate program. OpenAI representatives were quoted endorsing the fabric design and its gigawatt‑scale performance objectives. Close cooperation with an AI platform vendor like OpenAI signals intent to target the high end of model training workloads and to tune for those customers’ operational and regulatory needs (data locality, sovereign control, and so on).
Market and competitive implications
Oracle’s announcement is both technological and strategic. If deployed at the claimed scale, Zettascale10 would place Oracle among the few hyperscalers capable of offering zettascale‑class AI capacity to outside customers — not just internal research labs — and it would directly challenge the positioning of existing market leaders in hyperscale AI cloud services.
Several implications are worth highlighting:
- Democratization of scale: By packaging extreme scale into a cloud offering, Oracle aims to let enterprises and research institutions access model sizes and training throughput previously attainable only by the largest tech firms. This could reduce barriers for competitors and startups requiring massive training runs.
- Competitive pressure: The move intensifies competition with established hyperscalers that are already investing heavily in AI hardware and services. Oracle is pitching cost‑efficiency, low GPU‑to‑GPU latency, and multi‑vendor choice as differentiators. How those claims translate into billed price‑per‑token or price‑per‑training‑hour will determine uptake versus incumbents.
- Vendor relationships: Deep technical collaboration with NVIDIA and a parallel strategic relationship with AMD signal Oracle’s risk‑diversifying approach: rely on NVIDIA for the largest‑scale dense clusters, but build alternative AMD Helios racks for customers preferring or requiring non‑NVIDIA architectures. Market observers have characterized this as a defensive and opportunistic posture amid global GPU supply constraints.
Operational realities and sustainability concerns
Massive compute is costly — and energy‑intensive. Oracle’s marketing materials assert Zettascale10 achieves “less power per unit of performance” and reference power‑efficient optics and hyper‑optimized data center campuses. Those claims are meaningful but require validation in deployed operations: actual PUE (power usage effectiveness), site power footprints, and cooling strategies will govern the cluster’s environmental and operational cost profile. Oracle’s press materials present design choices intended to be power‑efficient, but independent verification will be necessary once facilities are operational.
Critics have raised predictable concerns:
- Energy and cooling: Multi‑gigawatt clusters require robust power grids, redundant cooling, and long‑term contracts for electricity. Building many such campuses is a capital‑ and energy‑intensive exercise. Oracle’s press release acknowledges the power focus but does not disclose facility‑level PUE or power agreements in public‑facing materials. That leaves a transparency gap for customers worried about operational carbon footprint and cost volatility; a rough sense of the energy scale involved is sketched after this list.
- Supply and lead times: Procuring hundreds of thousands of GPUs and the associated racks, DPUs, optics, and CPUs is an enormous logistics challenge. Oracle’s AMD deal for 50,000 MI450 units indicates the company is hedging against single‑vendor shortages, but global supply constraints remain a real risk for time‑to‑availability. Multiple reports corroborate the AMD order timeline, which starts in Q3 2026, suggesting vendors and hyperscalers are planning long lead times.
- Operational complexity: Managing fabrics with multiple isolated planes and tens of thousands of RDMA endpoints is nontrivial. Oracle claims Acceleron RoCE’s plane isolation boosts reliability, but such architectures impose new operational assumptions on orchestration, telemetry, and fault isolation that customers will want to validate in pilot programs.
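For a rough sense of what “multi‑gigawatt” implies in energy terms, the sketch below converts an assumed IT load and an assumed PUE into annual facility energy. Both inputs are illustrative assumptions, not figures Oracle has disclosed.

```python
# Illustrative only: IT load and PUE values are assumptions, not disclosed
# Oracle figures. PUE = total facility power / IT equipment power.
HOURS_PER_YEAR = 8_760

def annual_energy_twh(it_load_gw: float, pue: float) -> float:
    """Annual facility energy (TWh) for a given IT load (GW) and PUE."""
    return it_load_gw * pue * HOURS_PER_YEAR / 1_000  # GWh -> TWh

for it_gw, pue in [(1.0, 1.2), (2.0, 1.2), (2.0, 1.4)]:
    print(f"IT load {it_gw} GW at PUE {pue}: ~{annual_energy_twh(it_gw, pue):.1f} TWh/year")
# Even a single gigawatt-class campus at a competitive PUE implies on the order
# of ten terawatt-hours per year, which is why grid contracts and facility-level
# transparency matter to enterprise buyers and regulators.
```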
Business risks and what could derail adoption
Oracle’s marketing is forward‑looking and includes standard forward‑looking disclaimers; the real commercial success of Zettascale10 is not guaranteed. Major risk categories include:
- Pricing and unit economics: Hyperscale customers will compare price/performance to alternatives. If a public cloud offering cannot beat the economics of self‑hosted clusters for large AI labs, customers will remain cautious. Oracle’s claim of industry‑leading price‑performance needs quantifiable benchmarks against Azure, AWS, and GCP offerings to be persuasive.
- Vendor lock‑in vs. portability: The very innovations that reduce latency (custom fabrics, tight stack integration) can increase lock‑in. Oracle emphasizes distributed cloud and sovereignty controls, but customers will weigh the cost of moving models and data between clouds or on‑prem environments.
- Technical maturity of the fabric: Novel networking topologies and DPU/NIC‑centric switching reduce tiers but place critical trust in NIC firmware, DPU software, and vendor interoperability. Unforeseen stability or performance edge cases at extreme scale could surface only under production load. Oracle says Zettascale10 was first developed at the Stargate Abilene site, which helps but does not eliminate large‑scale risk.
- Market timing: Oracle is targeting broad Zettascale10 availability in the second half of calendar 2026, with initial AMD deployments beginning in Q3 2026. The market is fast moving: competitor offerings, chip roadmaps, and software optimizations could shift the competitive balance before Oracle secures large customer commitments.
Real‑world use cases and research impact
Assuming Oracle delivers the scale and the promised low latency, Zettascale10 could materially accelerate several domains:
- Large language model (LLM) training at scale: Faster parameter updates and lower communication overhead could shorten experimental cycles for foundation models, enabling larger architectures and faster iteration.
- Generative AI at production scale: Broad access to multi‑gigawatt clusters enables organizations to run inference on much larger models or to fine‑tune models on enterprise data with shorter turnaround times between iterations.
- Scientific compute and discovery: While HPC double‑precision workloads are not the headline use case, climate simulation, genomics, and drug discovery workflows that can be mapped to low‑precision ML accelerators could benefit from the raw scale if data movement and precision constraints are managed appropriately.
- Model research and prototyping democratization: Smaller labs and startups could test design choices on infrastructure comparable to the largest in‑house facilities, shifting the competitive moat from capex‑intensive hardware ownership to model engineering and data assets.
What to watch next — verification, pilots, and pricing
Oracle’s Zettascale10 announcement sets expectations; the next milestones that will determine market impact are concrete and measurable:
- Pilot results and public benchmarks: Look for independently verifiable benchmarks and case studies from early customers that show throughput, end‑to‑end training time, and network‑level metrics at meaningful scale. Oracle’s own tests are important, but third‑party validation will be decisive.
- Pricing and consumption models: Oracle has promised multi‑gigawatt deployments for customers; the pricing model (spot/committed/enterprise contracts) and effective price/performance against competitors will determine commercial adoption.
- Operational transparency on energy and PUE: Given the sustainability concerns around exascale and zettascale compute, independent reporting on data center PUE and carbon accounting for Abilene and subsequent campuses will matter for enterprise buyers and regulators.
- Supply timelines and vendor deliveries: Oracle’s AMD and NVIDIA supply roadmaps — both the 50,000 MI450 plan and the scaling to 800,000 NVIDIA GPUs — require many months of procurement, testing, and rollout. Any slippage or re‑allocation of capacity by vendors could shift Oracle’s timelines.
Conclusion
OCI Zettascale10 is a bold engineering and commercial proposition: a network‑centric, multi‑data‑center supercluster that promises up to 16 zettaFLOPS of peak performance and the ability to interconnect hundreds of thousands of GPUs with an Ethernet/RDMA fabric optimized for AI. Oracle’s multi‑vendor posture — heavy NVIDIA scale with parallel AMD Helios rack deployments — and the OpenAI Stargate collaboration give the project credibility and immediate market relevance.
However, headline numbers must be grounded in operational realities. The distinction between peak and sustained FLOPS, the energy and facility scale required to host multi‑gigawatt clusters, the supply chain and lead‑time challenges, and the need for independent benchmarks are all factors that will determine whether Zettascale10 becomes a transformative cloud offering or a high‑profile infrastructure bet. Oracle has set a clear roadmap and timetable; the industry will now wait for field evidence — pilot results, pricing, and verified performance — before recalibrating competitive strategies around Oracle’s newest claim to zettascale leadership.
Source: WebProNews Oracle Unveils OCI Zettascale10: World’s Largest AI Supercomputer with 16 Zettaflops