CVE-2026-31591: Linux KVM AMD SEV-SNP vCPU Locking Race Can Crash Hosts



CVE-2026-31591 is a Linux kernel vulnerability in KVM’s AMD SEV-SNP launch path. The issue affects the way KVM synchronizes Virtual Machine Save Areas, or VMSAs, when finalizing the launch of an SEV-SNP protected guest. In plain terms, the kernel was updating highly sensitive per-vCPU state during the final stage of confidential VM launch without first ensuring that all vCPUs were locked against concurrent manipulation. That opened a race window in which userspace could modify or run a vCPU while KVM was synchronizing and encrypting its VMSA.
The bug is specific, but the consequences matter: the vulnerable path could corrupt vCPU state, and in the worst case it could crash the host kernel. For virtualization operators, cloud platforms, confidential-computing environments, and anyone running AMD SEV-SNP guests on Linux KVM, this should be treated as an availability and isolation-hardening issue rather than a routine guest-only bug. The public NVD record describes the vulnerability as resolved in the Linux kernel and explains that userspace interaction during VMSA synchronization could corrupt vCPU state or crash the host.

What the Vulnerability Is

CVE-2026-31591 sits in the AMD Secure Encrypted Virtualization code inside KVM, specifically the SEV-SNP launch finish flow. SEV-SNP, short for Secure Encrypted Virtualization with Secure Nested Paging, is AMD’s more advanced confidential-computing technology. It is designed to protect guest memory and strengthen the boundary between the VM and the hypervisor. In KVM, support for this feature involves carefully staged launch commands, firmware interactions, page-state transitions, measurement, encryption, and protected vCPU state handling.
The vulnerable operation occurs when KVM synchronizes VMSAs for SNP guests. A VMSA contains the saved architectural state associated with a virtual CPU. In an SEV-SNP guest, that state is security-sensitive because it is part of what allows the guest to run with encrypted and protected execution context. During launch finish, KVM needs to synchronize vCPU state into the VMSA and then transition that VMSA into the protected firmware-managed state.
The problem was that the SNP path did not lock all vCPUs while doing this synchronization and encryption work. If userspace, such as QEMU or another VMM process interacting through KVM ioctls, was allowed to manipulate or run a vCPU during this window, KVM could end up synchronizing inconsistent state. This is the classic shape of a race condition: one path assumes state is stable while another actor can still modify it.
The fix is direct and revealing. The kernel patch adds locking around all vCPUs before updating SNP VMSAs, changes error exits to go through a common unlock path, and adds a lockdep assertion to verify that the per-vCPU mutex is held when synchronizing the VMSA. The upstream stable patch description states that all vCPUs must be locked when synchronizing and encrypting VMSAs for SNP guests, because otherwise userspace could manipulate or run a vCPU while its state is being synchronized.
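The patched control flow can be sketched as a userspace analogue. Everything below (the `vcpu` struct, `lock_all_vcpus`, `sync_vmsa`) is illustrative pthread code standing in for the kernel's `kvm_lock_all_vcpus`, `kvm_unlock_all_vcpus`, and `lockdep_assert_held` machinery; it is a simplified model of the pattern, not the actual KVM implementation.

```c
#include <assert.h>
#include <pthread.h>

#define NR_VCPUS 4

struct vcpu {
    pthread_mutex_t mutex;
    int locked;   /* bookkeeping standing in for lockdep ownership tracking */
    int regs;     /* stand-in for architectural vCPU state */
    int vmsa;     /* stand-in for the VMSA snapshot to be encrypted */
};

static struct vcpu vcpus[NR_VCPUS];

static void init_vcpus(void)
{
    for (int i = 0; i < NR_VCPUS; i++) {
        pthread_mutex_init(&vcpus[i].mutex, NULL);
        vcpus[i].locked = 0;
        vcpus[i].regs = i + 1;
        vcpus[i].vmsa = 0;
    }
}

/* Analogue of lockdep_assert_held(&vcpu->mutex): synchronizing a VMSA is
   only legal while that vCPU's mutex is held. */
static int sync_vmsa(struct vcpu *v)
{
    assert(v->locked);
    v->vmsa = v->regs;   /* snapshot a coherent copy of the state */
    return 0;
}

static void lock_all_vcpus(void)
{
    for (int i = 0; i < NR_VCPUS; i++) {
        pthread_mutex_lock(&vcpus[i].mutex);
        vcpus[i].locked = 1;
    }
}

static void unlock_all_vcpus(void)
{
    for (int i = 0; i < NR_VCPUS; i++) {
        vcpus[i].locked = 0;
        pthread_mutex_unlock(&vcpus[i].mutex);
    }
}

/* The fixed shape: lock every vCPU, do all VMSA work under the locks, and
   leave only through the shared unlock path, even on error. */
static int snp_launch_finish(void)
{
    int ret = 0;

    lock_all_vcpus();
    for (int i = 0; i < NR_VCPUS; i++) {
        ret = sync_vmsa(&vcpus[i]);
        if (ret)
            goto out_unlock;   /* no direct return while locks are held */
    }
out_unlock:
    unlock_all_vcpus();
    return ret;
}
```

The pre-fix shape differed in exactly the two ways the patch addresses: no `lock_all_vcpus()` call before the loop, and error paths that returned directly instead of jumping to the shared unlock label.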

Why VMSA Synchronization Matters

To understand the impact, it helps to understand the role of the VMSA. A vCPU is the virtual processor presented to a guest VM. Its state includes registers, control state, and other CPU execution context that must be saved, restored, and validated as the guest runs. A VMSA is the data structure used in AMD’s SEV-ES and SEV-SNP architectures to hold that saved state.
In ordinary virtualization, a race in vCPU state management can break a guest. In confidential virtualization, the stakes are higher because the state becomes part of a protected execution model. SEV-SNP is designed to ensure that guest memory and selected execution state cannot be casually inspected or tampered with by the host. That means KVM must be extremely precise about when guest state is mutable, when it is measured, when it is encrypted, and when it becomes protected.
The Linux kernel documentation describes SEV as an AMD-V extension that lets virtual machines run under a hypervisor while transparently encrypting VM memory with a key unique to that VM. SEV-SNP builds on that model with stronger integrity protections and launch-time guarantees. The launch process is not just a boot sequence; it is part of the trust establishment between the guest owner, the platform, firmware, and the hypervisor.
During SNP launch finish, the VMSA is effectively being finalized into the secure state expected by the SNP guest. If another thread can change vCPU state at the same time, the kernel may encrypt or transition a VMSA that does not reflect a coherent vCPU snapshot. At best, the guest gets bad state and fails. At worst, the host kernel follows inconsistent assumptions into a crash.
That is why the patch does not merely add a defensive check. It changes the locking behavior so that KVM freezes the relevant vCPU state before touching and encrypting the VMSAs.

Impact and Severity

The most likely impact of CVE-2026-31591 is denial of service. A local userspace VMM with access to the KVM VM file descriptor could potentially trigger the race during SEV-SNP guest launch, leading to corrupted guest state or a host kernel crash. Public vulnerability tracking has described the issue as affecting Linux kernel KVM and leading to denial of service, with fixes available in newer stable kernels.
This is not best understood as a remote unauthenticated internet vulnerability. Exploitation requires proximity to a virtualization host’s KVM interface. In most deployments, that means the attacker would need control over, or the ability to influence, a VM-management process such as QEMU, a cloud orchestration component, or another privileged local userspace component capable of issuing the relevant KVM SEV-SNP launch operations.
The practical severity depends heavily on deployment model:
  • On a personal workstation that does not run SEV-SNP guests, the issue may be irrelevant.
  • On a Linux KVM host using AMD EPYC hardware with SEV-SNP enabled, the issue is important.
  • On a cloud or hosting platform exposing confidential VM creation workflows to tenants, the issue deserves faster attention.
  • On a managed platform where only trusted infrastructure code can launch SNP guests, risk is reduced but not eliminated.
  • On systems where untrusted users can create or control confidential VMs, the risk is higher because the bug affects host availability.
A reasonable estimated CVSS v3.1 score would fall in the medium range, around 5.5 to 6.5, depending on how one models privileges and attack complexity. The attack is local from the host’s perspective, requires privileges sufficient to interact with KVM and SEV-SNP launch ioctls, does not require user interaction, and primarily affects availability. If a scoring model assumes low privileges and a reliable host crash, a vector similar to CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:L/A:H would land around moderate severity. If integrity impact is limited to guest state rather than host integrity, the availability component remains the dominant factor.
Operationally, administrators should treat this as a host-stability vulnerability in a sensitive virtualization subsystem. It may not be a broad emergency for every Linux machine, but it is significant for confidential-computing hosts.

Who Is Affected

The affected population is narrower than “all Linux users.” CVE-2026-31591 is relevant to systems that meet several conditions:
  • The system runs a Linux kernel with the vulnerable KVM AMD SEV-SNP code.
  • The system uses KVM virtualization.
  • The host runs on AMD hardware supporting SEV-SNP.
  • SEV-SNP guests are launched or can be launched.
  • Userspace has a path to manipulate or run vCPUs during the vulnerable launch operation.
Public tracking indicates affected upstream kernel versions included Linux 6.18 through 6.18.23, 6.19 through 6.19.13, and 7.0.0, with fixes in 6.18.24, 6.19.14, and 7.0.1. Distribution kernels may differ because vendors often backport fixes without changing the major kernel version string.
That final point is critical. A server running a distribution kernel labeled 6.12, 6.8, 5.15, or another enterprise version is not automatically safe or vulnerable based only on the version number. Enterprise Linux vendors routinely backport KVM, SEV, and security fixes. Administrators should check the vendor advisory, package changelog, and kernel build metadata rather than relying solely on upstream version comparisons.
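Because of that backport caveat, any scripted version check is only meaningful for upstream kernels. A minimal sketch against the fixed releases named above (the helper names and the treatment of out-of-range versions are ours, derived from the public tracking ranges):

```c
#include <assert.h>
#include <stdbool.h>

struct kver { int major, minor, patch; };

static int kver_cmp(struct kver a, struct kver b)
{
    if (a.major != b.major) return a.major - b.major;
    if (a.minor != b.minor) return a.minor - b.minor;
    return a.patch - b.patch;
}

/* True if an UPSTREAM kernel of this version is not vulnerable, i.e. it is
   either fixed or outside the affected ranges per public tracking. This
   logic is NOT valid for distribution kernels, which backport fixes without
   changing the version string. */
static bool upstream_kernel_ok(struct kver v)
{
    const struct kver series[] = { {6,18,0},  {6,19,0},  {7,0,0} };
    const struct kver fixed[]  = { {6,18,24}, {6,19,14}, {7,0,1} };
    const struct kver next[]   = { {6,19,0},  {7,0,0},   {7,1,0} };

    for (int i = 0; i < 3; i++) {
        if (kver_cmp(v, series[i]) >= 0 && kver_cmp(v, next[i]) < 0)
            return kver_cmp(v, fixed[i]) >= 0;
    }
    /* Versions outside the listed series are not affected per the tracking
       data (e.g. pre-6.18 kernels predate the vulnerable code). */
    return true;
}
```

For distribution kernels, the package changelog and vendor advisory remain the only reliable signal.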
The issue is most relevant to:
  • Cloud providers offering AMD SEV-SNP confidential VMs.
  • Private clouds running OpenStack, KVM, QEMU, and libvirt on AMD EPYC platforms.
  • Hosting providers experimenting with confidential computing.
  • Security research labs testing SEV-SNP guests.
  • Organizations using SNP-backed confidential workloads for regulated data.
  • CI and kernel-testing infrastructure that exercises KVM SEV-SNP launch flows.
It is less relevant to:
  • Intel-only KVM hosts.
  • AMD KVM hosts that do not support or enable SEV-SNP.
  • Systems that use virtualization but not KVM.
  • Systems that run ordinary, non-confidential VMs only.
  • Desktop users with no SEV-SNP guest launch workflow.

Technical Root Cause

The root cause is missing synchronization around a critical state transition. KVM was walking vCPUs and updating VMSA state during the SNP launch finish path, but it did not first acquire the locks needed to prevent concurrent vCPU activity. This created a mismatch between what the code needed and what the locking actually guaranteed.
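The hazard is easy to reproduce in miniature with ordinary pthreads: a "sync" that reads multi-word state while another thread keeps mutating it can observe a torn snapshot unless both sides hold the same mutex. This is a deliberately simplified model of the race, not KVM code; the two-word invariant stands in for a coherent vCPU snapshot.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct vstate {
    pthread_mutex_t mutex;
    long a, b;   /* invariant: a == b outside the critical section */
};

static struct vstate st = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };
static atomic_bool stop;

/* Models userspace running/manipulating the vCPU: both words are always
   updated together, under the lock. */
static void *mutator(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop)) {
        pthread_mutex_lock(&st.mutex);
        st.a++;
        st.b++;
        pthread_mutex_unlock(&st.mutex);
    }
    return NULL;
}

/* The fixed behavior: snapshot only with the mutex held. A reader that
   skipped the lock could see a != b, the analogue of an incoherent VMSA. */
static bool locked_snapshot_consistent(int iterations)
{
    pthread_t t;
    bool ok = true;

    atomic_store(&stop, false);
    if (pthread_create(&t, NULL, mutator, NULL) != 0)
        return false;
    for (int i = 0; i < iterations; i++) {
        pthread_mutex_lock(&st.mutex);
        if (st.a != st.b)   /* a torn snapshot would show up here */
            ok = false;
        pthread_mutex_unlock(&st.mutex);
    }
    atomic_store(&stop, true);
    pthread_join(t, NULL);
    return ok;
}
```

Dropping the lock in the reader turns the invariant check into a timing lottery, which is exactly the shape of bug the patch removes.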
The fix adds kvm_lock_all_vcpus(kvm) before the VMSA update loop and ensures that all exits pass through kvm_unlock_all_vcpus(kvm). It also adds lockdep_assert_held(&vcpu->mutex) inside the VMSA synchronization helper. The assertion is important because it documents and enforces the locking expectation: synchronizing VMSA state must happen only while the vCPU mutex is held.
The public patch shows the key logic clearly. Before the fix, errors inside the loop could return directly. After the fix, those exits go to a shared cleanup label so locks are released correctly. This is a common kernel hardening pattern: acquire the lock once, perform all sensitive work under the lock, and use a single cleanup path to avoid leaking locks or returning while protected state remains inconsistent.
The Linux KVM locking documentation underscores why this matters. KVM has a defined lock hierarchy, including the relationship between kvm->lock and vcpu->mutex, and complex KVM paths must respect that ordering to avoid races and deadlocks. The CVE-2026-31591 fix follows that philosophy by making vCPU locking explicit around the SEV-SNP state transition.

Why This Is a Host Issue, Not Just a Guest Issue

Many VM bugs only affect the guest. A bad virtual device state may crash the VM, a guest kernel may panic, or a launch may fail. CVE-2026-31591 is more serious because the vulnerable code runs in the host kernel’s KVM subsystem. If KVM mishandles protected vCPU state during SNP launch, the failure can propagate into the host kernel.
The CVE description explicitly warns that the outcome could be a host kernel crash. That makes availability the central security property at risk. In a single-tenant environment, a crash is disruptive. In a multi-tenant environment, a crash can affect other guests on the same physical host. In a cloud environment, host crashes can trigger workload evacuation, service-level violations, customer-visible downtime, and potentially data-loss risk for workloads that were not properly synchronized.
The vulnerability does not imply that SEV-SNP encryption is broken. It does not mean attackers can decrypt confidential VM memory. It does not mean SNP attestation is generally invalid. Instead, it is a kernel implementation flaw in the lifecycle management of protected vCPU state. Confidential-computing features are only as reliable as the kernel, firmware, VMM, and hardware transitions around them. This CVE is a reminder that launch-time state handling is a security boundary.

Exploitation Considerations

There is no need to assume broad exploitation from the open internet. The attacker needs access to the host-side virtualization control plane. However, in environments where tenants or users can trigger confidential VM launches, the vulnerable operation may be reachable indirectly through legitimate orchestration APIs.
A plausible attack path would involve causing a race between the SNP launch finish operation and vCPU manipulation or execution. The details depend on the userspace VMM, orchestration model, and kernel timing. Exploiting such races reliably can be difficult, but denial-of-service attacks do not always require perfect reliability. Repeated attempts that occasionally crash a host are still operationally serious.
The most exposed environments are those where untrusted users can:
  • Create SEV-SNP guests.
  • Control VM lifecycle timing.
  • Rapidly start, stop, or reconfigure vCPUs.
  • Interact with custom VMM tooling.
  • Run experimental QEMU or KVM management code.
  • Access host-level virtualization APIs through delegated services.
In contrast, if only a tightly controlled root-owned management daemon can launch SNP guests, exploitation would likely require compromising that daemon or abusing a higher-level API that maps to the vulnerable behavior.

Detection and Triage

Administrators should start by determining whether the system uses AMD SEV-SNP with KVM. Useful indicators include AMD EPYC hardware, the kvm_amd module, SEV/SNP-related kernel messages, QEMU command lines using SNP launch security options, and the presence of confidential-computing workflows.
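One concrete starting point is to probe the kvm_amd module's SNP parameter from the host. The path used below, /sys/module/kvm_amd/parameters/sev_snp, is an assumption about the module's parameter layout and may differ by kernel and distribution; treat a missing file as "cannot tell from here", not as proof of safety.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns "enabled", "disabled", or "unavailable" for a Y/N module
   parameter file such as /sys/module/kvm_amd/parameters/sev_snp. */
static const char *snp_param_state(const char *path)
{
    char buf[8] = {0};
    FILE *f = fopen(path, "r");

    if (!f)
        return "unavailable";   /* no kvm_amd module, or no such parameter */
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return "unavailable";
    }
    fclose(f);
    return (buf[0] == 'Y' || buf[0] == '1') ? "enabled" : "disabled";
}
```

A result of "enabled" means the host is in scope for the triage steps that follow; "unavailable" means the question has to be answered from inventory data instead.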
Next, check the running kernel and vendor patch level. On upstream kernels, look for versions containing the fix. For distribution kernels, inspect security advisories and changelogs. A vendor may mark the CVE fixed in a kernel package whose base version still appears older than the upstream fixed version.
Operational signals that could be consistent with this issue include:
  • Host kernel crashes during SEV-SNP guest launch.
  • KVM or kvm_amd stack traces involving SNP launch finish or VMSA synchronization.
  • Guest launch failures during the SNP finalization phase.
  • Reproducible instability when launching multi-vCPU SEV-SNP guests.
  • Failures that appear timing-sensitive or disappear when vCPU concurrency is reduced.
These symptoms are not unique to CVE-2026-31591. SEV-SNP depends on BIOS settings, PSP firmware, QEMU, OVMF, kernel configuration, and platform support, so a failed SNP launch can have many causes. But if crashes cluster around snp_launch_update_vmsa, snp_launch_finish, or sev_es_sync_vmsa, this CVE should be part of the triage checklist.
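Those symbol names can drive a trivial first-pass filter over oops or crash-dump text. The list mirrors the triage checklist above and is illustrative, not exhaustive:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if a kernel log line mentions one of the SNP launch/VMSA symbols
   called out in the triage checklist. */
static bool oops_mentions_snp_vmsa(const char *line)
{
    static const char *symbols[] = {
        "snp_launch_update_vmsa",
        "snp_launch_finish",
        "sev_es_sync_vmsa",
    };

    for (unsigned i = 0; i < sizeof(symbols) / sizeof(symbols[0]); i++)
        if (strstr(line, symbols[i]))
            return true;
    return false;
}
```

A match is a reason to prioritize this CVE in triage, not a confirmation: the same symbols appear in healthy launch paths, so surrounding context still matters.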

Mitigation and Remediation

The primary remediation is to update the Linux kernel to a version or vendor package containing the fix. Upstream stable releases identified in public tracking include 6.18.24, 6.19.14, and 7.0.1. Administrators should prefer their distribution’s security update channel where applicable, because enterprise distributions may backport the patch into older maintained kernels.
After updating, reboot into the patched kernel. Because KVM is kernel-resident, simply updating packages is not enough if the host continues running the old kernel. In clustered virtualization environments, live-migrate or shut down guests according to policy, drain hosts, patch, reboot, and then return hosts to service.
If immediate patching is not possible, temporary risk reduction options include:
  • Disable SEV-SNP guest creation until the host can be patched.
  • Restrict confidential VM launch permissions to trusted administrators.
  • Prevent untrusted tenants from creating SNP guests.
  • Reduce exposure of experimental or custom KVM management interfaces.
  • Monitor host crashes and KVM-related kernel oops reports.
  • Avoid running untrusted SEV-SNP launch workflows on shared production hosts.
  • Use scheduling or policy controls to isolate SNP workloads onto patched nodes.
These mitigations reduce reachability but do not fix the underlying race. The correct fix is the kernel update.

Guidance for Cloud and Enterprise Operators

Cloud and enterprise teams should treat CVE-2026-31591 as part of the confidential-computing patch baseline. Even if the vulnerability does not expose guest secrets directly, confidential-computing platforms depend on trust in the launch process. Bugs in that area can undermine reliability and customer confidence.
A practical response plan should include:
  • Inventory AMD SEV-SNP-capable hosts.
  • Identify which hosts actually run SNP guests.
  • Map kernel versions to vendor-fixed packages.
  • Prioritize multi-tenant and tenant-accessible SNP launch environments.
  • Patch and reboot hosts in maintenance windows.
  • Validate SNP guest launch after patching.
  • Monitor for regressions in QEMU, OVMF, PSP firmware, and kernel logs.
  • Update internal golden images or host baselines.
  • Document whether the CVE is applicable to each virtualization fleet.
Providers should also verify that orchestration layers do not allow untrusted users to trigger unusual VM state transitions during launch. Even after patching, launch flows for confidential VMs deserve stricter control than ordinary VM lifecycle paths.

Guidance for Windows-Focused Administrators

Although this is a Linux kernel KVM vulnerability, Windows administrators may still encounter it indirectly. Many organizations run mixed virtualization stacks: Windows management systems, Linux KVM hosts, Azure-connected infrastructure, Linux-based appliances, or confidential-computing workloads that host Windows guests on Linux KVM.
The vulnerability is in the Linux host kernel, not inside a Windows guest. A Windows VM running as an SEV-SNP guest would not need a Windows patch for this specific issue. The host kernel must be fixed. If your organization runs Windows workloads on Linux KVM using AMD SEV-SNP, patch the Linux virtualization hosts.
If CVE-2026-31591 appears in Microsoft security tooling or vulnerability dashboards, treat it as a host-platform issue. Check whether the flagged asset is a Linux-based host, appliance, container host, Azure Linux system, or other Microsoft-managed Linux component. Do not assume that a Windows Server patch cycle resolves it unless the vulnerable Linux kernel package is also updated.

Why the Fix Is the Right Shape

The patch is small but important. It does not redesign SEV-SNP. It does not change guest policy. It does not alter the cryptographic model. Instead, it enforces the concurrency rule that should have existed around the VMSA transition.
The most important parts are:
  • Acquire locks for all vCPUs before VMSA synchronization.
  • Assert that the vCPU mutex is held when synchronizing VMSA state.
  • Avoid direct returns from the protected section.
  • Ensure all error paths unlock vCPUs correctly.
  • Preserve the expected SEV-ES behavior while hardening the SNP path.
This is exactly how kernel race fixes often look: the vulnerable code was logically correct only if state stayed still, but the lock coverage did not guarantee that. The fix makes the assumption true.

Bottom Line

CVE-2026-31591 is a race-condition-style vulnerability in Linux KVM’s AMD SEV-SNP launch finish path. During VMSA synchronization and encryption, KVM failed to lock all vCPUs, allowing userspace to potentially manipulate or run a vCPU while its state was being finalized. The result could be corrupted vCPU state or a host kernel crash.
The issue is most relevant to Linux KVM hosts running AMD SEV-SNP confidential VMs. It is not a general Windows vulnerability, not a remote internet bug, and not evidence that SEV-SNP encryption itself is broken. But for cloud and virtualization operators, it is important because the failure mode can affect host availability.
Patch the host kernel, reboot into the fixed version, and restrict SEV-SNP guest launch capabilities until remediation is complete. For upstream users, fixed stable kernels are available; for enterprise users, the safest path is to apply the vendor-provided kernel security update and verify that the CVE is listed as resolved in the package changelog.

Source: NVD / Linux Kernel Security Update Guide - Microsoft Security Response Center
 
