A recently disclosed Linux kernel vulnerability, tracked as CVE‑2025‑40026, affects KVM's x86 virtualization paths. It stems from an unsafe assumption in a fastpath used when completing userspace I/O: KVM sometimes (re)checks L1 intercept state in a context that cannot safely perform the memory accesses the check requires, which can lead to kernel BUGs, vCPU failures and host instability. The upstream fix changes the control flow so that KVM no longer rechecks L1 intercepts from the fastpath and instead completes userspace I/O from a safe context, eliminating the illegal “sleep-from-fastpath” scenario that produced the most visible failures.
Source: MSRC Security Update Guide - Microsoft Security Response Center
Background
Virtualization on x86 uses complex cooperation between CPU-provided VM-exit metadata and hypervisor optimizations. KVM implements a number of fastpaths: small, high‑speed handlers that take advantage of CPU-supplied state (for example, a supplied “next RIP”) to avoid heavy emulation when a guest executes privileged instructions. These fastpaths are performance critical in cloud and virtualization workloads but must obey kernel context rules: certain fastpath code runs with interrupts disabled and in contexts where sleeping or memory-faulting operations are forbidden. When a fastpath unexpectedly needs to read guest memory (for example, to decode an instruction because the CPU did not supply next RIP), that read can fault and require sleeping, which is illegal in the fastpath context and produces kernel BUG traces.
The specific failure class CVE‑2025‑40026 falls into this context‑correctness family: the hypervisor’s fastpath logic rechecks L1 intercepts (an internal mechanism used to decide how userspace I/O completion and emulation should proceed) while still in a no-sleep context. If the recheck triggers code paths that access guest memory or otherwise can block, the kernel can hit “sleeping function called from invalid context” assertions, crash vCPUs, and in some cases escalate to host oops or panic. Upstream maintainers therefore modified the fastpath control flow to avoid doing these rechecks in forbidden contexts and to complete userspace I/O only from execution paths that permit safe memory access.
What went wrong — technical anatomy
Fastpaths, L1 intercepts and context rules
- Fastpaths are lean code paths invoked on specific VM‑exits (for example WRMSR, HLT or userspace I/O completions). They assume minimal work and run with limited kernel context (IRQs disabled, non‑sleepable).
- L1 intercepts are the intercepts that a nested (L1) hypervisor has configured for its own (L2) guest. KVM checks them to determine whether an event should be handled directly, forwarded to L1, or emulated. In some code paths these checks require reading guest instruction bytes or descriptors to decide the correct action.
- The bug arises when the fastpath assumes no further guest memory access is needed but then, under particular CPU/configuration conditions (for example when the CPU does not provide a valid next RIP, or when next-RIP support (nrips) is disabled), has to re-run an intercept check that reads guest memory. That read may fault and require sleeping, which the fastpath is not allowed to do. The incompatibility produces a kernel BUG.
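The context rule described above can be illustrated with a small toy model. This is a sketch in Python, not kernel code: `VcpuContext`, `read_guest_memory` and `SleepInAtomicContext` are hypothetical names invented for illustration, and the check only loosely mimics what the kernel's might-sleep assertions do.

```python
from contextlib import contextmanager


class SleepInAtomicContext(RuntimeError):
    """Toy stand-in for the kernel's 'sleeping function called from invalid context' BUG."""


class VcpuContext:
    """Hypothetical model of an execution context; not a real KVM structure."""

    def __init__(self):
        # Greater than zero means "IRQs disabled / must not sleep".
        self.atomic_depth = 0

    @contextmanager
    def fastpath(self):
        """Enter a non-sleepable region, as a fastpath handler would."""
        self.atomic_depth += 1
        try:
            yield
        finally:
            self.atomic_depth -= 1

    def might_sleep(self):
        """Raise if a potentially sleeping operation runs in a non-sleepable region."""
        if self.atomic_depth > 0:
            raise SleepInAtomicContext("sleeping function called from invalid context")


def read_guest_memory(ctx, gpa):
    """Model of a guest memory read: it may fault, so it may sleep."""
    ctx.might_sleep()
    return b"\x0f\x30"  # placeholder bytes (the WRMSR opcode), purely illustrative
```

Calling `read_guest_memory(ctx, 0x1000)` outside `ctx.fastpath()` succeeds, while the same call inside the `with ctx.fastpath():` block raises `SleepInAtomicContext`, which is the incompatibility the list above describes.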
Specific triggering scenario
While details differ across kernels and CPU vendors, practical trigger conditions documented in upstream discussion and downstream trackers include:
- The fastpath is taken for a VM‑exit reason (for example, userspace I/O completion or WRMSR/HLT).
- The CPU or configuration does not supply the instruction’s next RIP (nrips=false, or an AMD SVM-specific state), forcing KVM to decode the guest instruction stream to compute how far to advance RIP.
- Decoding the instruction requires fetching guest memory, which can fault and thus may sleep.
- The fastpath code rechecks or examines L1 intercepts while still in the non‑sleepable fastpath context; the ensuing memory access triggers a BUG.
Impact and exploitability
Primary impact: Denial‑of‑Service (availability)
The most likely and observed consequences are:
- Kernel BUG traces that reference “sleeping function called from invalid context”.
- vCPU failures and QEMU/kvm thread call traces.
- Potential kernel oops/panic that brings down the host or requires a reboot, which in turn disrupts all hosted guests and services.
Exploitability considerations
- Attack vector: local/guest. A tenant (or attacker controlling a guest VM) can craft workloads or instruction sequences that hit the vulnerable fastpath and put the host into the faulty state. This makes multi‑tenant and cloud hosts especially sensitive.
- Prerequisites: the host must run a vulnerable KVM x86 path on a CPU/configuration where the next RIP is not supplied (for example nrips=false on SVM implementations), so that the L1 intercept recheck has to read guest memory.
- Complexity: moderate. The conditions are specific (fastpath taken, next RIP absent, emulator path invoked while fastpath assumptions remain), but guests can generally trigger VM‑exits intentionally, so the risk is realistic for untrusted or hostile tenants.
The upstream fix — what changed
Upstream kernel maintainers addressed the issue with a surgical control‑flow change rather than a large rewrite:
- The fastpath handlers (particularly around WRMSR, HLT and userspace I/O completion) now avoid rechecking L1 intercepts when the CPU-supplied next RIP is missing or when the fastpath cannot guarantee that memory reads will be safe.
- When the next RIP is unavailable, KVM forces the code to take the emulator/slowpath while holding the appropriate reader protections and in a context that permits guest memory reads (so that any faulting instruction fetch will be handled safely).
- The net effect is that the emulator path — which can sleep — performs the decoding and any intercept decisions that require guest memory, removing the illegal sleep-from-fastpath condition.
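The routing decision described in these bullets can be sketched as a small decision function. This is a simplified model of the behavior as this article describes it, not the actual patch: `VmExit`, `FASTPATH_REASONS`, the reason strings and `choose_path` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VmExit:
    """Hypothetical, simplified view of a VM-exit."""
    reason: str               # e.g. "WRMSR", "HLT", "IO_COMPLETION" (illustrative labels)
    next_rip: Optional[int]   # None models "the CPU did not supply next RIP"


# Exit reasons this toy model treats as fastpath candidates (assumption).
FASTPATH_REASONS = {"WRMSR", "HLT", "IO_COMPLETION"}


def choose_path(exit_info: VmExit) -> str:
    """Return "fastpath" only when no guest memory read can be needed."""
    if exit_info.reason not in FASTPATH_REASONS:
        return "slowpath"
    if exit_info.next_rip is None:
        # Without next RIP the instruction must be decoded from guest memory,
        # which may fault and sleep, so the work goes to the sleepable path.
        return "slowpath"
    return "fastpath"
```

For example, `choose_path(VmExit("WRMSR", None))` yields `"slowpath"`, while `choose_path(VmExit("WRMSR", 0x1002))` yields `"fastpath"`. The point of the design is that the decision is made before any memory access is attempted, so the non-sleepable path never has to fall back midway.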
Detection, monitoring and incident response
Indicators to hunt for
- Kernel logs containing “sleeping function called from invalid context” in KVM-related stack traces.
- Call traces that include functions such as __might_fault, __might_resched, kvm_vcpu_read_guest_page, x86_decode_insn/x86_emulate_instruction, and frames referencing kvm_amd or SVM fastpaths.
- Repeated vCPU failures tied to guest activity that executes privileged instructions (WRMSR, HLT) or heavy userspace I/O flows.
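As a starting point for hunting these indicators, the following sketch scans kernel log text for the BUG marker and keeps only occurrences whose following trace lines mention KVM-related frames. The frame pattern and the 40-line window are assumptions chosen for illustration; tune both for your kernel's trace format.

```python
import re

BUG_MARKER = "sleeping function called from invalid context"

# Frame names taken from the indicators above; the pattern itself is an assumption.
KVM_FRAME = re.compile(r"kvm|svm|x86_emulate_instruction|x86_decode_insn", re.IGNORECASE)


def find_kvm_sleep_bugs(log_text, window=40):
    """Return (line_index, kvm_frames) for each BUG marker followed by KVM frames.

    line_index is the 0-based index of the marker line. Only the `window`
    lines after the marker are inspected, and scanning stops early at the
    next marker so one trace is not attributed to another.
    """
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if BUG_MARKER not in line.lower():
            continue
        frames = []
        for follow in lines[i + 1 : i + 1 + window]:
            if BUG_MARKER in follow.lower():
                break
            if KVM_FRAME.search(follow):
                frames.append(follow.strip())
        if frames:
            hits.append((i, frames))
    return hits
```

The function works on plain text, so it can be fed the saved output of `journalctl -k` or `dmesg`. An empty result does not prove the host is unaffected; it only means no matching trace was logged.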
Short incident playbook
- Isolate the host — remove it from production pools if multiple guests share critical resources.
- Correlate host logs with guest identifiers to determine which guest(s) were exercising the suspicious instruction sequences.
- Suspend or quarantine any untrusted guests pending remediation.
- Apply vendor/distribution kernel update that includes the upstream fix and reboot into the patched kernel.
- Re-run the previously failing workload in a controlled lab to confirm the kernel no longer hits the “sleeping function” trace.
Remediation and mitigations
Definitive remediation
- Install a kernel package that contains the upstream patch / backport from your distribution and reboot the host. This is the only reliable, long‑term fix. Upstream commit IDs were propagated into stable trees and distributions have published fixed packages. Confirm the package changelog or vendor advisory lists CVE‑2025‑40026 or the upstream commit id before wide deployment.
Temporary mitigations (if you cannot patch immediately)
- Limit exposure: avoid running untrusted or third‑party guests on hosts until the kernel is patched. If feasible, move untrusted tenants to patched hosts.
- Configuration review: where practical, avoid driver/CPU/KVM configurations that remove next‑RIP information (for example, do not run with nrips=false unless necessary). Treat any change to low‑level CPU/KVM flags as a last resort: such changes can have side effects and must be validated in test environments before rolling into production.
- Emulation fallback: in lab or non‑performance‑sensitive scenarios, restrict KVM to use the emulator path by default, ensuring all instruction decodes occur in sleepable contexts. This will incur performance penalties and is not generally suitable for production, but it reduces the immediate attack surface.
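For the configuration review, a host's current nrips setting can be read programmatically. The sketch below assumes the kvm_amd module exposes the parameter at `/sys/module/kvm_amd/parameters/nrips`; that path is an assumption to verify on your kernel, and it will not exist on Intel hosts or when kvm_amd is not loaded. The helper names are hypothetical.

```python
from pathlib import Path

# Assumed sysfs location of the kvm_amd "nrips" module parameter.
DEFAULT_NRIPS_PATH = "/sys/module/kvm_amd/parameters/nrips"


def parse_bool_param(raw):
    """Interpret a module parameter value; return None when unrecognized."""
    value = raw.strip()
    if value in ("1", "Y", "y"):
        return True
    if value in ("0", "N", "n"):
        return False
    return None


def read_nrips(param_path=DEFAULT_NRIPS_PATH):
    """Return True/False for the nrips setting, or None if it cannot be determined."""
    try:
        raw = Path(param_path).read_text()
    except OSError:
        # Module not loaded, non-AMD host, or path differs on this kernel.
        return None
    return parse_bool_param(raw)
```

A result of `False` means the host matches the example configuration named above; `None` means the check was inconclusive and says nothing about whether the host is vulnerable.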
Validation checklist after patching
- Confirm the kernel package changelog references the CVE or upstream commit.
- Reproduce prior triggers in a staging environment to verify the host no longer emits “sleeping function called from invalid context” traces for the tested workloads.
- Monitor kernel logs for 7–14 days post‑patch for regressions or unusual KVM behavior.
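The changelog check in the first item can be automated with a few lines. One practical pitfall: CVE identifiers copied from web pages often contain typographic hyphens (such as U+2011) that will not match the ASCII hyphens used in package changelogs. The sketch below normalizes both sides before comparing; the function names are hypothetical.

```python
import re

# Hyphen-like characters commonly introduced by web pages and word processors.
_HYPHEN_LIKE = "\u2010\u2011\u2012\u2013\u2014\u2212"
_HYPHEN_MAP = {ord(c): "-" for c in _HYPHEN_LIKE}


def normalize_hyphens(text):
    """Replace typographic hyphens and dashes with ASCII '-'."""
    return text.translate(_HYPHEN_MAP)


def changelog_mentions(changelog_text, cve_id="CVE-2025-40026"):
    """Return True if the changelog text references the given CVE ID."""
    needle = re.escape(normalize_hyphens(cve_id).lower())
    haystack = normalize_hyphens(changelog_text).lower()
    # The negative lookahead avoids matching a longer ID that shares the prefix.
    return re.search(needle + r"(?!\d)", haystack) is not None
```

Feed it the text produced by your package manager's changelog command. A `False` result is not conclusive on its own, because some vendors reference only the upstream commit ID; check the vendor advisory in that case.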
Operational guidance and risk trade-offs
Prioritization guidance
- Cloud providers and multi‑tenant hosts: Highest priority — patch immediately. An unpatched host is high‑value for attackers seeking reliable DoS against a hypervisor.
- Shared development servers / build clusters: High priority if they allow unprivileged or unvetted guests.
- Single‑user desktop hosts with trusted VMs: Lower immediate risk, but patching is still recommended to avoid accidental host instability.
Testing and rollout recommendations
- Stage the patch in a representative pilot group of virtualization hosts.
- Validate critical functions: guest boot, live migration, snapshots, storage IO, and performance-critical workloads.
- Roll out in phases with monitoring windows and rollback plans in case of regressions.
- Keep CPU microcode/firmware updates current alongside kernel patches; while this specific fix is a kernel control‑flow change, microcode updates remain part of a layered mitigation posture for other microarchitectural issues.
Residual risks and caveats
- The patch targets a narrow timing/context correctness issue; it does not eliminate other classes of hypervisor vulnerabilities such as speculative side‑channel primitives or separate memory‑corruption bugs.
- Distribution timelines vary; embedded and vendor‑custom kernels may lag upstream and require vendor‑specific backports. Confirm your vendor’s advisory and do not assume uniform coverage.
Why this matters beyond a single CVE
CVE‑2025‑40026 exemplifies a recurring pattern in virtualization security: micro‑optimizations that rely on tightly specified CPU-provided metadata can create brittle assumptions. When the hardware or hypervisor configuration does not match those assumptions, the optimizations can cause correctness failures that manifest as stability or availability incidents on hosts that co‑locate multiple tenants.
This CVE again demonstrates two operational lessons:
- Small, surgical upstream fixes that re-route work out of non‑sleepable fastpaths are the preferred engineering approach: they preserve performance while correcting the correctness assumption.
- Operators must track both upstream commits and vendor distribution advisories: upstream patches are often available before CVE assignment or vendor backports, and the mapping between commit IDs and distribution package versions is the authoritative way to confirm remediation.
Quick technical reference and commands
- Check kernel release: uname -r
- Check whether KVM is present: lsmod | grep kvm
- Search kernel logs for symptomatic traces: journalctl -k | grep -i "sleeping function called from invalid context" or dmesg | grep -i kvm
- Confirm package changelog after upgrade (example): rpm -q --changelog kernel-core | grep -i CVE-2025-40026 or apt changelog linux-image-$(uname -r) | grep -i CVE-2025-40026 (type the CVE ID with plain ASCII hyphens, otherwise grep will not match)
- If you cannot patch immediately: consider disallowing untrusted guests and limiting who may create or deploy VMs on shared hosts.
Conclusion — what operators should do now
- Treat CVE‑2025‑40026 as a high‑impact availability risk for hosts that run untrusted guests or operate in multi‑tenant clouds. The bug does not appear to enable remote code execution or data exfiltration in its published form, but it does enable reliable host destabilization from a guest.
- Apply vendor-supplied kernel updates that include the upstream fix as soon as those packages are available for your distribution. Validate changelogs and perform staged rollouts with monitoring windows.
- If patching is delayed, reduce exposure by avoiding untrusted guests, validating CPU/KVM configuration (avoid nrips=false unless required and tested), and increasing monitoring for related kernel BUG/OOPS traces.
- Maintain layered defenses: keep firmware/microcode current, enforce tenant isolation, and ensure strong change control for virtualization management planes. Small kernel correctness fixes like this one are simple but crucial patches that protect host reliability — and in shared infrastructure, reliability is security.