Microsoft’s cloud team has quietly re-architected the silicon under Azure to treat nearly every element of a server as a discrete security boundary — and it's shipping that architecture at scale across new servers this year and into 2025. What started as a collection of academic and hyperscaler experiments in hardware roots of trust has become a pragmatic platform: per-server Azure Integrated HSM modules, an updated open-source Caliptra 2.0 root-of-trust with an integrated post‑quantum accelerator, and coordinated work on layered NVMe key management. Together, these changes tighten tenant isolation, reduce cryptographic latency for demanding workloads, and push confidential computing from optional feature to baseline infrastructure for Azure customers.
This deep-dive explains what the new silicon actually does, why it matters for cloud security and enterprise compliance, and where the real risks and gaps remain as hyperscalers race to protect data and workloads in an era of large models, accelerated compute, and a looming quantum transition.

Background: why hyperscalers are baking security into silicon

Hyperscale clouds have long relied on a stack of software and discrete appliances — network firewalls, virtualized key services, appliance HSMs — to secure tenant data. But the arrival of large AI workloads, brittle supply chains, and hardware-level threat models has forced a rethink. Software-only controls add attack surface and latency; centralized HSM appliances can bottleneck and create complex operational and trust relationships; and confidential computing pushes requirements for in-use protection that only hardware can fully enforce.
The new generation of cloud hardware follows a few clear design principles:
  • Isolate early and often: make hardware elements (keys, networking, storage controllers) self-contained security domains.
  • Minimize remote trust: reduce the number of remote calls and shared appliances that require network-trust assumptions.
  • Open and auditable RoT: move to an auditable root of trust IP so device identities and attestations can be inspected by third parties.
  • Future-proof crypto: begin integrating post‑quantum cryptography (PQC) acceleration into silicon to avoid a painful whole‑fleet transition later.
These principles show up in three headline components of Microsoft’s recent push: the per-server Integrated HSM, the Caliptra 2.0 root-of-trust (with the Adams Bridge post‑quantum accelerator), and an industry initiative to standardize layered NVMe key management.

What Microsoft built: per-server HSMs and the Azure Integrated HSM

The problem with networked HSMs

Traditional HSMs are robust cryptographic vaults used to generate and protect keys and to perform cryptographic operations. In cloud datacenters they have historically been deployed as centralized or clustered appliances. That architecture works for many use cases, but it has practical drawbacks:
  • Remote access introduces latency for workloads that require frequent cryptographic operations.
  • A shared HSM cluster is a scaling and tenancy headache as more AI and confidential compute workloads demand keys.
  • Network-attached HSMs increase the number of network trust boundaries and raise operational complexity for large fleets.

A new design: HSM functionality per box

The response has been to disaggregate and localize. Instead of a single separate HSM appliance, Microsoft’s approach places a hardened HSM module into each server. The key engineering outcomes are:
  • Local, in-situ key operations: cryptographic operations occur inside the server’s secure module, reducing round-trip latency and enabling encryption and signing at application-grade speeds without exposing keys to host software.
  • Hardware acceleration for common primitives: the module is optimized for AES and private-key encryption operations to speed encryption/decryption and signing workflows.
  • Hardened interfaces: the module exposes tightly controlled device interfaces to the CPU/GPU/TEE via a secure device protocol designed to reduce attack surface when device drivers or hypervisors interact with it.
  • Physical tamper resistance: packaging and anti‑tamper measures prevent classic exfiltration or side‑channel extraction attempts if someone gains unauthorized physical access to a device.
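The "keys never leave the module" idea in the list above can be pictured with a small software stand-in. This is only a sketch under stated assumptions: `ToyKeyModule` and its methods are hypothetical names, HMAC-SHA256 stands in for whatever primitives the real hardware accelerates, and nothing here reflects the actual Azure Integrated HSM interface. The point is the shape of the boundary: callers hold an opaque handle and receive results, never key bytes.

```python
import hashlib
import hmac
import secrets


class ToyKeyModule:
    """Software stand-in for a hardware key boundary (illustrative only)."""

    def __init__(self):
        # Key material lives only in this private table; callers never see it.
        self._keys = {}

    def generate_key(self) -> str:
        """Create a key inside the module and return an opaque handle."""
        handle = secrets.token_hex(8)
        self._keys[handle] = secrets.token_bytes(32)
        return handle

    def sign(self, handle: str, message: bytes) -> bytes:
        """Compute an HMAC-SHA256 tag without exposing the key."""
        return hmac.new(self._keys[handle], message, hashlib.sha256).digest()

    def verify(self, handle: str, message: bytes, tag: bytes) -> bool:
        """Check a tag in constant time, again without exposing the key."""
        return hmac.compare_digest(self.sign(handle, message), tag)

    def destroy_key(self, handle: str) -> None:
        """Remove the key; later operations on this handle raise KeyError."""
        self._keys.pop(handle, None)


module = ToyKeyModule()
handle = module.generate_key()
tag = module.sign(handle, b"model-weights-v1")
print(module.verify(handle, b"model-weights-v1", tag))  # True
print(module.verify(handle, b"tampered", tag))          # False
```

In real hardware the boundary is enforced by silicon and packaging rather than by a private attribute, which is exactly what the software version cannot provide.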

Why per-server HSMs matter in practice

  • Latency-sensitive workloads (e.g., real-time inference, secure enclaves doing many short-lived cryptographic ops) benefit materially from on-box key ops.
  • Multi‑tenant isolation improves because keys never leave an HSM boundary that is cryptographically isolated from other tenants and from the host OS.
  • Simpler fleet scaling: each new server brings its own HSM capacity rather than consuming capacity from a centralized pool that must be carefully partitioned.

Caveats and verification notes

Microsoft has positioned the Integrated HSM as a standard part of new server rollouts. Public cloud vendors often phrase adoption timelines as “installed in every new server starting [year],” which indicates a rollout schedule rather than instant saturation. Customers should confirm the precise regions, hardware SKUs, and compliance validations applicable to their tenancy when relying on integrated HSM features for compliance requirements.

Caliptra 2.0: an open-source root-of-trust with a post‑quantum accelerator

From opaque RoT to auditable silicon

A root-of-trust (RoT) is the immutable foundation that lets an SoC assert its identity, attest measured boot state, and anchor firmware and platform identity. Historically, RoTs have been proprietary and closed, which makes independent verification difficult. The newer approach embeds an open-source RoT block into the SoC: a minimal, scrutinizable module whose RTL and firmware are publicly available for inspection.
Key characteristics of the updated RoT:
  • Identity and attestation: each device has a unique cryptographic identity enabling attestation of boot measurements.
  • Measured boot and immutable anchor: the RoT measures subsequent firmware components, providing cryptographic assurance that what boots is what it should be.
  • Compartmentalization: hardware barriers protect RoT assets even from privileged host software.
By shipping the RoT as open‑source hardware and firmware, the design is amenable to independent audits and third‑party verification, a practical way to earn trust in an industry where black‑box roots-of-trust breed suspicion.
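The measured-boot item above rests on a simple hash-chain pattern, sketched here in generic form. This is not Caliptra's firmware or register layout; the function names, the all-zero starting value, the component strings, and the choice of SHA-384 are assumptions made for illustration. Each stage folds the digest of the next component into a running register, so the final value depends on every component and on their order.

```python
import hashlib

DIGEST_SIZE = hashlib.sha384().digest_size  # 48 bytes


def extend(register: bytes, component: bytes) -> bytes:
    """Fold one component into the register: new = H(old || H(component))."""
    component_digest = hashlib.sha384(component).digest()
    return hashlib.sha384(register + component_digest).digest()


def measure_boot(components) -> bytes:
    """Measure a boot chain, starting from an all-zero register."""
    register = bytes(DIGEST_SIZE)
    for blob in components:
        register = extend(register, blob)
    return register


# Hypothetical component images, used only to show the behaviour.
golden = measure_boot([b"bootloader-v1", b"firmware-v7", b"runtime-v3"])
tampered = measure_boot([b"bootloader-v1", b"firmware-v7-evil", b"runtime-v3"])
reordered = measure_boot([b"firmware-v7", b"bootloader-v1", b"runtime-v3"])

print(golden == tampered)   # False: any changed component changes the result
print(golden == reordered)  # False: order matters too
```

A verifier that knows the expected final value can therefore detect a modified or reordered boot chain without inspecting each component itself.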

Caliptra 2.0 and the Adams Bridge accelerator

Caliptra’s second revision introduces two important features:
  • A post‑quantum cryptography (PQC) accelerator, publicly released as a hardware IP block (referred to in vendor material as the Adams Bridge accelerator). It accelerates NIST‑selected PQC algorithms, namely ML-KEM (formerly Kyber) and ML-DSA (formerly Dilithium), to enable attestation and signatures that are resilient to large-scale quantum adversaries.
  • Integration with LOCK (Layered Open-source Cryptographic Key management) for NVMe device key control, bringing the RoT’s capabilities down to storage-level key administration.
The presence of PQC primitives inside the RoT is strategically important: if attestations and identities are to remain useful as quantum capabilities evolve, the device identity layer must already be capable of post‑quantum signatures. Embedding PQC now avoids expensive retrofits later.

Open source: a double-edged sword

Open-sourcing the RoT has strong security arguments:
  • Transparency enables independent security researchers and vendors to spot and fix flaws.
  • Standardization reduces fragmentation and improves interoperability for remote attestation.
But open sourcing RoT logic also brings challenges:
  • The attack surface is visible: vulnerabilities become public quickly — though defenders argue that visibility accelerates detection and fixes.
  • Implementation subtleties (timing and side‑channel mitigations) remain critical and not always visible at the RTL level; these require careful hardware validation.

LOCK for NVMe: layered key management in storage

NVMe storage has evolved capabilities for host-side key management such as Key Per I/O and encryption offload. The next step is a layered, auditable key management block for storage devices so that device-level keys can be controlled via higher-level attestation and device identity.
What this enables:
  • Drive-level cryptographic isolation so a stolen or decommissioned drive cannot reveal tenant keys.
  • Host-to-device key flows that are provably anchored to a device identity controlled by a verified RoT.
  • Simpler tenant data erasure through key lifecycle operations rather than physical device sanitization.
Industry coordination around a layered open-source key-management profile for NVMe is underway, with major storage and cloud vendors aligning on an approach to connect RoT identity to storage-level key flows. This is especially important for multi‑tenant hyperscale environments where physical devices may be moved, reused, or sent for failure analysis.
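One plausible reading of "layered" key management, and of erasure through key lifecycle operations, is sketched below. This is an assumption-laden toy, not the LOCK specification: `ToyDrive`, the derivation label, and the HMAC-based derivation are all hypothetical. The media key depends on both a secret held inside the drive and a key supplied by the host, so neither party alone can reconstruct it, and destroying the drive secret makes every derived key unrecoverable.

```python
import hashlib
import hmac
import secrets


class ToyDrive:
    """Illustrative drive whose media key needs a drive secret and a host key."""

    def __init__(self):
        # Hypothetical per-drive secret that never leaves the device.
        self._drive_secret = secrets.token_bytes(32)

    def _media_key(self, host_key: bytes) -> bytes:
        if self._drive_secret is None:
            raise RuntimeError("drive secret erased; data is unrecoverable")
        # Layered derivation: neither input alone yields the media key.
        return hmac.new(self._drive_secret, b"media-key|" + host_key,
                        hashlib.sha256).digest()

    def key_fingerprint(self, host_key: bytes) -> str:
        """Return a short check value so keys can be compared without being exposed."""
        return hashlib.sha256(self._media_key(host_key)).hexdigest()[:16]

    def crypto_erase(self) -> None:
        """Destroy the drive secret; every derived media key goes with it."""
        self._drive_secret = None


drive = ToyDrive()
tenant_key = secrets.token_bytes(32)
other_tenant_key = secrets.token_bytes(32)

before = drive.key_fingerprint(tenant_key)
print(before == drive.key_fingerprint(tenant_key))        # True: derivation is stable
print(before == drive.key_fingerprint(other_tenant_key))  # False: per-tenant keys differ

drive.crypto_erase()
try:
    drive.key_fingerprint(tenant_key)
except RuntimeError as err:
    print("erased:", err)
```

The same host key presented to a different drive yields a different media key, which is why a drive pulled for failure analysis reveals nothing without both halves.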

How the pieces fit together: confidentiality and isolation across planes

The architecture Microsoft describes treats several traditionally separate planes as security surfaces:
  • Compute plane: the CPU/GPU/accelerator and their TEEs provide memory and execution isolation between tenants.
  • Control plane: management functions are offloaded to hardened DPUs and smartNICs to reduce trust in the host kernel and hypervisor.
  • Data plane: encrypted at rest and in motion, with keys bound to local HSMs.
  • Storage plane: NVMe drives adopt layered key management so that drive contents are worthless without keys bound to a verified RoT.
This layered isolation and separation of duties reduce the blast radius from a compromised hypervisor or management service: even if a host OS is compromised, keys and attestation anchors remain protected inside hardware modules.
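A minimal sketch of how attestation can gate key release across these planes follows. It is a toy under stated assumptions: `make_quote` and `release_secret` are hypothetical names, and an HMAC over a shared device key stands in for the asymmetric signature and certificate chain a real root of trust would use. The verifier releases a secret only when the quote is authentic, bound to a fresh nonce, and reports the expected measurement.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical "known good" measurement the verifier expects.
EXPECTED_MEASUREMENT = hashlib.sha384(b"approved-firmware-v7").digest()


def make_quote(device_key: bytes, measurement: bytes, nonce: bytes) -> bytes:
    """Device side: bind the measurement to the verifier's nonce.

    HMAC is a stand-in here; real attestation uses an asymmetric signature.
    """
    return hmac.new(device_key, measurement + nonce, hashlib.sha384).digest()


def release_secret(device_key: bytes, measurement: bytes, nonce: bytes,
                   quote: bytes, secret: bytes) -> Optional[bytes]:
    """Verifier side: return the secret only if every check passes."""
    expected_quote = make_quote(device_key, measurement, nonce)
    if not hmac.compare_digest(expected_quote, quote):
        return None  # forged quote, or a quote for a different nonce
    if not hmac.compare_digest(measurement, EXPECTED_MEASUREMENT):
        return None  # authentic quote, but unapproved firmware
    return secret


device_key = secrets.token_bytes(32)
tenant_secret = b"disk-unlock-key"
bad_measurement = hashlib.sha384(b"unapproved-firmware").digest()

nonce = secrets.token_bytes(16)
good_quote = make_quote(device_key, EXPECTED_MEASUREMENT, nonce)
ok = release_secret(device_key, EXPECTED_MEASUREMENT, nonce, good_quote, tenant_secret)

bad_quote = make_quote(device_key, bad_measurement, nonce)
rejected = release_secret(device_key, bad_measurement, nonce, bad_quote, tenant_secret)

# Replaying the earlier quote against a fresh nonce must fail.
fresh_nonce = secrets.token_bytes(16)
replayed = release_secret(device_key, EXPECTED_MEASUREMENT, fresh_nonce,
                          good_quote, tenant_secret)

print(ok == tenant_secret, rejected, replayed)  # True None None
```

The nonce check matters as much as the measurement check: without it, a compromised host could replay a quote captured before it was compromised.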

Strengths: what this approach gets right

  • Lower cryptographic latency: colocating HSM functionality with the compute node removes network hops for frequent crypto operations.
  • Stronger tenancy isolation: hardware-enforced keys and attestation make it much harder for co‑tenants or malicious host software to extract tenant secrets.
  • Auditable root-of-trust: open-source RoT logic creates an opportunity for independent verification and collaborative hardening.
  • Early PQC support: integrating a PQC accelerator into the RoT is a forward-looking move to avoid a massive retrofit when quantum‑safe algorithms become mandatory.
  • Composability with modern hardware: the model recognizes that CPUs, GPUs, DPUs, and SSDs all must participate in an end-to-end trust story.
These are practical improvements for customers running confidential workloads, regulated data, or latency-sensitive cryptographic operations.

Risks and gaps: where to be cautious

  • Operational assumptions and timelines: shipping a chip does not instantly change the threat model for every region or instance type. Adoption across the full Azure fleet will be phased by region, SKU, and hardware refresh cycles. Enterprises should validate availability in their target regions and understand fallback behaviors.
  • Certification and compliance details: public messaging claims FIPS-style conformity for HSM modules and various compliance targets. Customers should check the specific validation levels, certificate numbers, and the scope of the claims against independent certification records before assuming regulatory compliance.
  • Supply chain and fabrication concerns: turning open-source RTL into physical silicon still relies on foundries, external IP blocks, and production test flows. Hardware supply chains and logic locking practices have documented pitfalls; adversaries with influence at manufacturing stages or at firmware suppliers can still inject faults if manufacturing and firmware provenance are not tightly controlled.
  • Side‑channel and physical attack surface: packaging and anti-tamper measures are necessary but not sufficient. Side-channel leakage (power, EM, timing) is notoriously difficult to eliminate, particularly for PQC primitives not originally designed for silicon acceleration. Publishing PQC accelerator RTL helps both defenders and attackers, so careful side‑channel countermeasures and validation are critical.
  • Complexity of secure updates and recovery: a RoT that anchors boot and firmware must also provide safe update and recovery flows. If update paths are brittle or poorly authenticated, a locked-down RoT can make legitimate recovery impossible or can itself become an attack target. The balance between recoverability and tamper resistance is delicate.
  • Open source transparency vs. exploitability: open code accelerates discovery of bugs. That is the intended effect, but it requires an organized disclosure program, rapid patching, and transparent operational responses. The security story depends on the cloud provider and ecosystem vendors exercising disciplined vulnerability management.
  • Quantum transition complexity: PQC acceleration in silicon reduces the cost of migrating device identities and attestations to quantum-safe algorithms, but full ecosystem readiness (client support, certificate chains, key management tooling) remains a multi-year project. Early PQC rollouts can create interoperability challenges.
  • Trust and vendor control: per-server HSMs reduce reliance on centralized appliances, but customers trading cloud-managed keys for on-server HSMs must understand who controls administration and attestation. The devil is in the RBAC and key-escrow policies.

Practical guidance for enterprise customers and architects

  • Validate availability and zone coverage: confirm which regions and VM SKUs include the Integrated HSM and Caliptra 2.0 features. Rollout schedules vary and features may reach specific instance families first.
  • Ask for certification evidence: if you rely on FIPS or Common Criteria compliance, request the actual validation certificates and their scope. Not all HSM‑related claims are identical; check the validation level and authority.
  • Incorporate attestation into deployment workflows: use attestation flows to verify device boot state before provisioning keys or secrets. Integrate measured boot checks into deployment automation to establish trust in ephemeral or preemptible nodes.
  • Plan for PQC hybridization: begin experimenting with PQC attestation and signing workflows, but maintain hybrid approaches for interoperability. Test client libraries and enterprise key management systems with PQC-enabled attestations.
  • Audit firmware and supply chain policies: seek clarity on the provider’s supply chain attestations: who signs firmware, how updates are distributed, what mechanisms exist for emergency rollback, and how third parties can verify provenance.
  • Re-evaluate incident response playbooks: hardware-anchored keys change the blast radius and recovery options. Update IR playbooks to account for hardware revocation, device-level key rotation, and coordinated vendor disclosures.
  • Benchmark workloads for cryptographic latency: measure the real-world impact of local HSM acceleration on your own workloads. Crypto-heavy workloads such as TLS offload, model signing, and confidential computing primitives are the most likely to see substantial latency reductions.

The competitive landscape and industry implications

Hyperscalers have converged on a basic idea: hardware roots of trust and per‑server security primitives are necessary to reassure customers running high‑value workloads. What distinguishes different vendors is openness, integration, and the scope of what that silicon protects.
  • One major advantage of the open‑source RoT model is the potential for ecosystem interoperability. If multiple CPU, GPU, and storage vendors adopt a shared RoT design, customers can more easily verify device identity and build cross-cloud attestations.
  • Conversely, proprietary RoT implementations — while possibly more performant in the short term — risk vendor lock-in and opaque security claims that are harder to audit.
For chip designers and server OEMs, the move signals a demand for integrated cryptographic accelerators and secure interface protocols across the board. For enterprises, the arrival of per-server HSMs and PQC-capable RoTs makes confidential computing a practical option rather than a niche product.

Final assessment: meaningful progress, but not a magic bullet

Microsoft’s per-server HSM and the move to an open, auditable RoT with integrated PQC acceleration represent a substantive advance in cloud hardware security. The architecture reduces key exposure, lowers crypto latency for demanding workloads, and provides a more verifiable path for attestation and device identity. The industry effort to standardize layered NVMe key management extends that protection into storage, completing more of the confidentiality stack.
However, these gains are not panaceas. They shift the attack surface: hardware fabrication, firmware provenance, side‑channel resilience, and update mechanisms become higher‑value targets. Open-source RoTs invite more scrutiny but also accelerate vulnerability discovery. Enterprises must therefore treat the new primitives as powerful tools that must be integrated with rigorous operational controls: certificate validation, firmware monitoring, well‑drilled recovery processes, and supply-chain verification.
In short, the cloud’s security story is becoming more hardware-centric, and that is a necessary evolution. Organizations that understand the benefits and the residual risks, and that plan migration and verification carefully, will get tangible security and performance improvements. Those that rely on vendor messaging alone, without independent validation, will discover that hardware guarantees hold only when backed by equally disciplined operational controls.

Source: theregister.com, "Microsoft shows off custom silicon keeping Azure on lockdown"
 
