Zero Trust for Virtualization: PAWs, VM Encryption, and Immutable Backups

Google Cloud’s 2026 hardening update is a wake-up call: threat actors increasingly target the virtualization layer to perform reconnaissance, steal Active Directory material offline, or permanently destroy availability by corrupting virtual disks and backups. The technical countermeasure set it recommends—enclave the management plane, strictly isolate traffic types, enforce PAW-originated administration, apply host-level firewalls, and make Tier‑0 assets and backups cryptographically unreadable if stolen—is sound and urgent, but operationalizing it at scale requires careful planning, key‑management discipline, and testing before enforcement.

Background / Overview​

Virtualization platforms (VMware vSphere and Microsoft Hyper‑V) are no longer just compute substrates; they are high-value targets because a single hypervisor compromise can give an attacker the ability to manipulate or copy virtual disks, power off or snapshot fleets, and operate beneath guest-level detection. Recent casework and threat research — including incident writeups that describe attackers detaching domain controller disks, copying NTDS.dit, and returning disks with no in‑guest forensic trace — demonstrate the realism of this threat model. Google Cloud’s 2026 guidance codifies a layered, zero‑trust approach specifically targeted at preventing such destructive or credential‑theft attacks.
This feature distills that guidance, verifies the technical feasibility of the recommendations, cross‑references vendor documentation and independent best practices, and provides an actionable, prioritized implementation and test plan for WindowsForum readers who manage enterprise virtualization estates.

Threat model: what adversaries are trying to accomplish​

  • Gain administrative control of the virtualization management plane (vCenter, ESXi hosts, Hyper‑V management).
  • Use that control to perform offline credential theft (“disk swap”): power down a Domain Controller VM, detach its virtual disk, mount it on an attacker-controlled VM, extract NTDS.dit and SYSTEM hives, and then return the disk. This bypasses any EDR or logging inside the guest OS.
  • Corrupt or encrypt virtual disks and backups at scale (ransomware or wiper), often by running code directly on hypervisors or by modifying datastore files.
  • Sabotage recovery by manipulating or deleting backups and backup metadata, or by changing backup job configurations. Immutable and encrypted backup targets are the primary defense.
Understanding these objectives clarifies why defenses must protect three layers simultaneously: identity, the network/logical architecture, and cryptographic protections for stored artifacts (VM disks and backups).

A zero‑trust, defense‑in‑depth architecture for virtualization​

The core principle​

Enforce the rule that privileged credentials alone must never be sufficient to reach the management plane or to access raw virtual disks. To achieve this, combine: (1) logical enclave networks for management traffic, (2) PAW‑only administrative paths and strict L3/4 controls, (3) host‑level firewalling and lockdown features, and (4) cryptographic protections (VM encryption, Shielded/Guarded fabrics) for Tier‑0 VMs and backups. Google Cloud’s guidance frames this as preserving the network as the final protective boundary even when identity is compromised.

VMware vSphere: practical zero‑trust blueprint​

Immutable VLAN segmentation and VRF isolation​

  • Create dedicated VLANs (and preferably a dedicated VRF) for:
  • Host management (ESXi management)
  • vCenter/VCSA and infrastructure appliances
  • vMotion (non‑routable)
  • Storage (iSCSI/NFS/datastore management — non‑routable)
  • Production Guest VMs
  • Move infrastructure VLANs into a separate VRF so there is no L3 route from user/guest zones into the management zone. This prevents a routed pathway even if credentials leak. VMware and networking best practices both recommend isolating vMotion and storage traffic from any routable networks.

PAW‑Exclusive Access and L3/L4 policies​

  • Remove any direct corporate LAN routes to management subnets. Require that all administrative access originate only from a Privileged Access Workstation (PAW) subnet. Microsoft and CISA guidance for PAWs recommend this model for Tier‑0 operations.
  • Network ACL examples (apply at gateway and enforcement boundary):
  • ALLOW: TCP/443 (vCenter UI/API) and TCP/902 (MKS/console) from PAW subnet only.
  • DENY: TCP/22 (SSH), TCP/5480 (VAMI) from all sources except explicit PAW hosts (SSH should be disabled by default).
  • Egress controls: block all outbound internet access from management appliances except to vendor update endpoints and authorized identity providers. vCenter’s GUI cannot fully enforce egress controls, so hardware gateway filtering is necessary.
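The ACL intent above can be expressed as policy-as-code and unit-tested before it is pushed to the enforcement boundary. The sketch below is a minimal, hypothetical model of the stated rules — the PAW subnet (10.20.30.0/24) is an assumed placeholder, and SSH/VAMI are treated as closed by default per the guidance, with break-glass handled out of band:

```python
import ipaddress

# Hypothetical policy model of the ACLs above. The PAW subnet is an
# assumed example range; substitute your real PAW subnet.
PAW_SUBNET = ipaddress.ip_network("10.20.30.0/24")
ALLOWED_FROM_PAW = {443, 902}    # vCenter UI/API and MKS/console, PAW-only
DENIED_EVERYWHERE = {22, 5480}   # SSH/VAMI closed by default per the guidance

def decide(src_ip: str, dst_port: int) -> str:
    """Return 'ALLOW' or 'DENY' for a flow into the management subnet."""
    src = ipaddress.ip_address(src_ip)
    if dst_port in DENIED_EVERYWHERE:
        return "DENY"                      # break-glass is handled out of band
    if dst_port in ALLOWED_FROM_PAW and src in PAW_SUBNET:
        return "ALLOW"
    return "DENY"                          # default deny for everything else

print(decide("10.20.30.15", 443))   # PAW -> vCenter UI: ALLOW
print(decide("192.168.1.50", 443))  # corporate LAN -> vCenter: DENY
print(decide("10.20.30.15", 22))    # even PAW-origin SSH stays closed
```

Encoding the policy this way lets you regression-test ACL changes in CI before they reach the gateway, which matters once the management plane is the only path to recovery.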

Host‑based firewall and lockdown​

  • VCSA (Photon OS): switch to a default deny posture via the appliance management interface (VAMI) and use iptables/nftables when granular rules are required.
  • ESXi: deselect “Allow connections from any IP address” and scope management services to explicit management IPs; enable Lockdown Mode so host console access is only available from vCenter and designated break‑glass accounts. VMware documents and CIS benchmarks recommend lockdown for production hosts.

VM encryption and key management (definitive mitigation for disk swap)​

  • Encrypt all Tier‑0 VMs (Domain Controllers, PKI servers, backup servers) using vSphere VM Encryption and integrate with a KMIP‑compliant KMS. VM encryption makes raw VMDK files unreadable to an attacker who detaches or copies them. Vendor docs show VM Encryption protects virtual disks and selected VM files when correctly configured.
  • Key management cautions:
  • Use an external, hardened KMS (BYOK recommended) with strict access controls and separate admin roles.
  • Test key‑recovery and DR paths; losing keys can make backups irrecoverable. VMware and third‑party technical guidance both emphasize testing KMS recovery and understanding licensing/feature dependencies.
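One testable piece of KMS recovery hygiene is verifying that escrowed key material survives a DR restore intact before any encrypted VM depends on it. The sketch below is an illustrative approach, not any KMS vendor's API: an HMAC fingerprint of the escrowed key-encryption key (KEK) is recorded at escrow time under a separate audit secret (reflecting the role separation recommended above), then re-checked after recovery:

```python
import hashlib
import hmac
import os

# Illustrative sketch only: 'audit_key' is a hypothetical secret held by an
# audit role, separate from virtualization admins, per the role-separation
# guidance above.

def fingerprint(key_material: bytes, audit_key: bytes) -> str:
    """HMAC-SHA256 fingerprint recorded alongside the escrow record."""
    return hmac.new(audit_key, key_material, hashlib.sha256).hexdigest()

def verify_recovery(recovered: bytes, audit_key: bytes, expected: str) -> bool:
    """Constant-time check that restored key material matches the record."""
    return hmac.compare_digest(fingerprint(recovered, audit_key), expected)

audit_key = os.urandom(32)
kek = os.urandom(32)                      # stand-in for an escrowed KEK
record = fingerprint(kek, audit_key)      # stored with the escrow record

corrupted = bytes([kek[0] ^ 1]) + kek[1:]  # simulate a damaged restore
print(verify_recovery(kek, audit_key, record))        # intact restore: True
print(verify_recovery(corrupted, audit_key, record))  # damaged restore: False
```

A check like this can be folded into the annual KMS recovery exercise so that a silently corrupted escrow is caught before it becomes irreversible data loss.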

Hyper‑V: zero‑trust equivalents and Shielded VM protections​

Traffic segmentation and non‑routable streams​

  • Enforce VLAN separation for Host Management, Live Migration, CSV/Cluster Heartbeat, and Production VMs. Place Live Migration and CSV traffic on non‑routable VLANs. Microsoft guidance and published best practices highlight the necessity of isolation for high‑bandwidth cluster traffic and to avoid interception from other network segments.

PAW and Layer‑3/4 policies​

  • Require that all management sessions (WinRM, RDP, WMI/RPC) originate from the PAW subnet. Example ingress filtering:
  • ALLOW: WinRM/PowerShell Remoting (TCP/5985, TCP/5986), RDP (TCP/3389), WMI/RPC (TCP/135 + dynamic RPC ports) only from PAWs.
  • DENY: SMB (TCP/445), unmanaged RPC/WMI, and other management ports from untrusted networks.
  • Use Windows Admin Center gateway (HTTPS/TCP443) only when its gateway is itself hosted in an isolated management zone and subject to PAW‑origin restrictions.
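The Hyper-V ingress table above can be rendered mechanically into Windows Firewall rules so the PAW scoping stays consistent across hosts. The sketch below generates `netsh advfirewall` command strings from that table; the PAW subnet and rule names are placeholders, and the generated commands should be reviewed (and the dynamic RPC range handled separately) before applying:

```python
# Sketch: render the PAW-only ingress policy above as netsh advfirewall
# commands. PAW_SUBNET and rule names are assumed placeholders.
PAW_SUBNET = "10.20.30.0/24"

RULES = [
    ("PAW-WinRM-HTTP",  "5985"),   # PowerShell Remoting (HTTP)
    ("PAW-WinRM-HTTPS", "5986"),   # PowerShell Remoting (HTTPS)
    ("PAW-RDP",         "3389"),   # RDP
    ("PAW-RPC-Mapper",  "135"),    # RPC endpoint mapper (dynamic range separate)
]

def render(name: str, port: str, remote: str) -> str:
    return (f'netsh advfirewall firewall add rule name="{name}" '
            f"dir=in action=allow protocol=TCP localport={port} "
            f"remoteip={remote}")

commands = [render(name, port, PAW_SUBNET) for name, port in RULES]
for cmd in commands:
    print(cmd)
```

Generating rules from one source of truth, rather than hand-editing each host, keeps the allow-list auditable and makes drift between hosts easy to detect.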

Host‑based firewall and Shielded VMs​

  • Apply Windows Firewall with Advanced Security (WFAS) rules scoped to PAW and management subnets. Enable logging of dropped packets to feed SIEM rules that surface internal reconnaissance. Microsoft documentation for PAWs and Hyper‑V hardening recommends strict scope restriction and logging.
  • Use Hyper‑V Shielded VMs or Guarded Fabric for Tier‑0 workloads where operationally feasible. Shielded VMs combine Secure Boot, vTPM, and BitLocker to make offline access to VHD/VHDX unreadable on another host that is not authorized. This directly mitigates disk‑swap extraction scenarios for Hyper‑V.

Backups: encryption, immutability, and recovery hygiene​

Default‑encrypt Tier‑0 backups​

  • Ensure backup applications encrypt backups at rest using keys managed separately from the virtualization platform and stored in a hardened KMS. This prevents an attacker who obtains backup files from reading or manipulating payloads. Cloud and on‑prem vendors recommend encrypting backup repositories and separating KMS control from production administrators.

Immutable backup repositories and air‑gapping​

  • Implement immutable (WORM) backup targets for one or more retention tiers. Vendors such as Veeam and enterprise backup suites support hardened Linux repositories, S3 Object Lock, or other immutability constructs; industry guidance shows immutable backups materially reduce ransomware effectiveness. However, immutability is not a silver bullet—processes and job configurations must be protected so that an attacker cannot poison backups before immutability applies.
  • Air‑gapped and offline copies (tape or physically isolated media) remain important for the highest assurance cases. Regularly test restores from immutable and offline copies.
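The guarantee an immutable repository provides reduces to a simple rule worth internalizing: deletes and overwrites are refused until the retention lock on each restore point expires. The sketch below models that rule in the abstract — the 30-day retention and field names are illustrative, not any vendor's implementation:

```python
from datetime import datetime, timedelta, timezone

# Illustrative WORM rule: a restore point cannot be deleted until its
# retention lock expires. The 30-day period is an assumed example.
RETENTION = timedelta(days=30)

def delete_allowed(created_at: datetime, now: datetime) -> bool:
    """True only once the retention lock on a restore point has expired."""
    return now >= created_at + RETENTION

created = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(delete_allowed(created, created + timedelta(days=7)))   # still locked
print(delete_allowed(created, created + timedelta(days=31)))  # lock expired
```

The point of the model: even a fully compromised backup administrator cannot shorten the window retroactively, which is why the remaining attack surface is the job configuration applied *before* the lock, as discussed above.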

Protect backup management planes​

  • Require PAW and phishing‑resistant MFA for backup console access; isolate backup management into a dedicated, non‑routable management VLAN and enforce the same network ingress/egress controls as virtualization management. The attacker’s usual playbook includes simultaneous targeting of backups to prevent recovery.

Detecting the hypervisor heist: practical telemetry​

  • Forward hypervisor logs (vpxd.log, hostd.log, audit_records) to a centralized SIEM and create detections for:
  • SSH start events on ESXi hosts, host firewall changes, and unexpected API calls from unknown IPs.
  • Bulk VM power‑off or snapshot deletion patterns (e.g., >5 VMs off in 10 minutes).
  • VM disk detach/attach events where the target attachment is to a VM not in inventory or to a different owner account.
  • In Windows environments, monitor for DSRM password changes, unusual NTDS exports, and suspicious use of administrative accounts combined with datastore API activity. Google Cloud’s guidance lists named detections and high‑value event IDs that defenders should integrate into use cases and hunting rules.
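The bulk power-off detection above (more than 5 VMs off within 10 minutes) translates directly into a sliding-window rule over power-off events. The sketch below is a minimal SIEM-agnostic version operating on (timestamp, vm_name) pairs; thresholds are the article's examples and should be tuned to your environment:

```python
from datetime import datetime, timedelta

# Sliding-window detection: alert when more than THRESHOLD distinct VMs
# power off inside any WINDOW-sized span. Thresholds match the example
# above (>5 VMs in 10 minutes) and are tunable.
WINDOW = timedelta(minutes=10)
THRESHOLD = 5

def bulk_poweroff(events):
    """events: iterable of (timestamp, vm_name). Returns True on a burst."""
    events = sorted(events)
    start = 0
    for end, (ts, _) in enumerate(events):
        # Evict events older than the window relative to the current event.
        while events[start][0] < ts - WINDOW:
            start += 1
        distinct_vms = {vm for _, vm in events[start:end + 1]}
        if len(distinct_vms) > THRESHOLD:
            return True
    return False

t0 = datetime(2026, 1, 1, 3, 0)
burst = [(t0 + timedelta(minutes=i), f"vm{i}") for i in range(6)]
slow = [(t0 + timedelta(hours=i), f"vm{i}") for i in range(6)]
print(bulk_poweroff(burst))  # 6 VMs off within 5 minutes: burst
print(bulk_poweroff(slow))   # same VMs spread over hours: no alert
```

Most SIEMs express the same logic natively (e.g., count-distinct over a time bucket); the value of writing it out is making the window, threshold, and distinct-VM semantics explicit for the detection engineering team.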

Implementation roadmap — prioritized and testable​

  • Short term (0–30 days)
  • Inventory virtualization management endpoints and map current network flows and ACLs.
  • Create a PAW bastion/subnet and immediately require that at least one group of administrators use PAWs for Tier‑0 tasks; enforce conditional access on PAW devices.
  • Harden backup credentials and enable encryption at rest on backup repositories; enable remote logging from hypervisors to SIEM.
  • Medium term (1–3 months)
  • Build isolated management VLANs and VRF instances; apply ingress filters that allow only PAW subnets to reach management APIs and consoles.
  • Enable ESXi lockdown mode and remove direct root SSH access; scope vCenter and host firewall rules to explicit management IPs.
  • Implement immutability for at least one backup tier and validate restore procedures from immutable repositories.
  • Long term (3–12 months)
  • Encrypt Tier‑0 VMs and integrate with an external KMS; test DR/restore with encrypted VMs and ensure replication compatibility.
  • Review and reduce service principals and automation permissions that can create or modify images/backups or attach disks; enforce JIT/JEA for automation.
  • Conduct red team exercises that simulate a disk‑swap attack and validate detection, containment, and recovery playbooks.

Strengths of the guidance — what works well​

  • The emphasis on the network as a final barrier acknowledges the realistic case that identity can be compromised and places a practical, enforceable control between attackers and hypervisor artifacts. This is a pragmatic application of Zero Trust to infrastructure.
  • VM encryption and Shielded VM features directly address the offline theft vector: a stolen VMDK or VHDX becomes useless without keys and host‑authorization. Vendor docs consistently corroborate that properly deployed VM encryption protects virtual disk contents.
  • Immutable backups and strict backup key separation materially increase the chance of full operational recovery following destructive events. Vendor implementations (hardened repositories, object lock) are mature and widely deployed.

Practical risks, operational trade‑offs, and failure modes​

  • Key management is a single point of catastrophic failure if mishandled. Encrypting production VMs without tested KMS recovery/DR plans risks irreversible data loss. Any KMS migration, replication, or DR architecture must be exercised under realistic failure scenarios. Document every KMS recovery step and test it annually.
  • Performance and feature impact: VM encryption and Shielded VM features have known compatibility and feature limitations (replication/replication products, third‑party tooling, and older vSphere editions may behave differently). Verify product versions and support matrices before blanket deployment.
  • Operational complexity: VRF and strict L3/L4 isolation increase troubleshooting complexity and risk of accidental service disruption. Implement change control, automated testing, and runbooks before restricting production access paths.
  • False confidence in immutability: immutability protects stored backups but does not prevent an attacker from modifying backup job definitions, schedules, or excluding critical systems before the immutability window is applied. Protect backup management with PAW origin and phishing‑resistant MFA and alert on job‑definition changes.
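Alerting on job-definition changes, as recommended above, amounts to diffing successive snapshots of backup job configuration and surfacing scope removals or schedule edits before the next immutability window applies. The sketch below uses a hypothetical job schema (`scope`, `schedule` fields are illustrative, not a vendor's API):

```python
# Sketch: diff two snapshots of backup job definitions and emit alerts for
# changes an attacker might make to poison future backups. The job schema
# ('scope', 'schedule') is illustrative, not any vendor's API.
def job_drift(before: dict, after: dict) -> list[str]:
    alerts = []
    for job, old in before.items():
        new = after.get(job)
        if new is None:
            alerts.append(f"{job}: job deleted")
            continue
        removed = set(old["scope"]) - set(new["scope"])
        if removed:
            alerts.append(f"{job}: systems removed from scope: {sorted(removed)}")
        if old["schedule"] != new["schedule"]:
            alerts.append(f"{job}: schedule changed to {new['schedule']}")
    return alerts

before = {"tier0": {"scope": ["DC01", "PKI01"], "schedule": "daily"}}
after = {"tier0": {"scope": ["PKI01"], "schedule": "weekly"}}
for alert in job_drift(before, after):
    print(alert)
```

Feeding such diffs into the SIEM alongside the hypervisor telemetry above closes the gap between "backups are immutable" and "the right systems are still being backed up."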

Detection and incident response playbook highlights​

  • Assume breach: An effective playbook assumes an attacker can reach vCenter. Prioritize containment steps such as severing management network connectivity, isolating ESXi hosts via physical or virtual network switches, and placing hosts into maintenance/network isolation modes.
  • For suspected disk‑swap activity:
  • Immediately snapshot and preserve logs from vCenter, ESXi hostd/vpxd, and syslog collectors.
  • Preserve the datastore in a read‑only fashion where possible.
  • Use offline forensic tools to validate the integrity of DC disks; do not power on restored VMs until validated in an isolated environment.
  • If VM encryption is in place, coordinate with KMS custodians to prevent key loss and to assist in forensic validation.

Closing recommendations — prioritized checklist (for the next 90 days)​

  • Create and enforce a PAW program for every administrator with Tier‑0 privileges; block non‑PAW origin for management access in conditional access policies.
  • Immediately disable SSH on vCenter/hosts unless needed for emergency break‑glass; restrict any required SSH by ACL to explicit PAW jump hosts.
  • Forward hypervisor logs to SIEM and stand up detections for disk‑detach/attach, bulk VM power‑off, and unexpected datastore file modification.
  • Implement a “Default Encrypted” policy for Tier‑0 VMs and require KMS redundancy and documented recovery tests before rollout.
  • Protect backups with encryption, immutability, separate KMS control, and PAW‑only management access; test full restore from immutable media.

Final assessment​

Google Cloud’s 2026 edition on preparation and hardening provides a prescriptive, defensible blueprint that aligns with both vendor capabilities and real‑world adversary behavior: isolate the management plane, restrict administrative routes to PAWs, harden hosts, encrypt Tier‑0 assets, and make backups immutable and independently keyed. These controls collectively raise the cost and complexity of an attack from feasible to impractical for most adversaries. However, organizations must not treat the blueprint as a checklist to be applied without planning: the technical barriers (KMS design, replication impacts, operational troubleshooting, and recovery testing) are real and can themselves become sources of outage or data loss if implemented without rigorous testing and role separation.
Hardening the virtualization plane is essential, measurable work — plan a phased rollout, test every recovery path, and instrument your environment so that when adversaries probe the hypervisor, your SIEM surfaces that activity early. The combination of PAWs, network enclaves, host lockdowns, VM encryption, immutable backups, and a practiced incident response playbook is the most pragmatic path to survive an adversary that targets your most critical infrastructure.

Source: Google Cloud Proactive Preparation and Hardening Against Destructive Attacks: 2026 Edition | Google Cloud Blog