Linux 9p Client Race Fix CVE-2025-40027 Prevents Double List Del

A recent Linux kernel fix closes CVE-2025-40027, a race-condition bug in the net/9p client that could cause a double removal of a request from its tracking list — a logic race that KASAN and syzkaller surfaced as a general-protection fault and list corruption during heavy fuzzing of 9p client flows.

Background / Overview

The Plan 9-derived 9P filesystem (commonly called 9p or v9fs) provides a lightweight network filesystem transport used in a number of virtualization and embedded scenarios, notably QEMU's virtfs/virtio-9p passthrough. The 9p client stack lives under net/9p in the kernel tree and coordinates asynchronous requests via per-connection request lists (the req_list) and well-defined request status states such as REQ_STATUS_SENT, REQ_STATUS_RCVD, and REQ_STATUS_ERROR.
Syzkaller (the kernel fuzzer) reported a KASAN-caught trace in which p9_fd_cancelled() and p9_read_work() could both try to remove the same request node from its list, producing a double list_del() on the same element followed by memory corruption and protection faults. That failure chain matches the call trace published in the vulnerability record.
The underlying human story is familiar to kernel reviewers: concurrency in cleanup paths, combined with asymmetric list-management and status updates, produced a narrow timing window where two threads raced to remove and free the same request structure. The resulting corruption is an availability and stability risk — in the field it will most likely appear as kernel OOPSes, KASAN reports, or host crashes — although any time a kernel list or allocator is corrupted, escalation paths beyond DoS must be treated as theoretically possible with careful heap grooming.

What went wrong: technical anatomy

The actors: request lifetimes and request lists

The 9p client implements an asynchronous request model. Each request object is tracked on a req_list so the client can cancel, flush, or collect replies. Request lifetimes are governed by explicit status transitions (for example SENT → RCVD on a normal reply, or SENT → ERROR on failure) and by callbacks that perform list removal and resource cleanup. The list operations rely on the invariant that only the owner of a request will remove it from the list, a fragile property when multiple worker contexts and cancel paths exist.

The race

Syzkaller reproduced a scenario with two threads:
  • Thread A runs p9_read_work() and handles an incoming reply; in some flows it will cancel outstanding requests and call the client callback path which sets the request status and removes the request from the list.
  • Thread B runs p9_fd_cancelled() as part of connection teardown; it acquires m->req_lock, inspects the request status, and — based on older status checks — proceeds to process and remove the request.
Because the p9_client_cb() path may perform list removal outside the req_lock window, a narrow interleaving could lead to both threads calling list_del(&req->req_list) on the same req object: the first removal works, while the second operates on a node that is no longer on a list and may already be freed, producing list corruption and a subsequent access through a wild pointer. The kernel call trace in the KASAN hit shows the frames list_del, p9_fd_cancelled, p9_client_flush, and p9_client_rpc (innermost first), originating in a mount attempt, a typical path that fuzzers exercise.
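To make the failure mode concrete, here is a minimal Python model of a circular doubly linked list. It is not kernel code: the ListHead class and helper names are illustrative stand-ins, setting the pointers to None loosely imitates the poison values that the kernel's list_del() writes into a removed node, and the RuntimeError stands in for the fault a second removal would trigger.

```python
class ListHead:
    """Minimal model of a circular doubly linked list node."""

    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self


def list_add_tail(node, head):
    """Insert node just before head, i.e. at the tail of the list."""
    node.prev = head.prev
    node.next = head
    head.prev.next = node
    head.prev = node


def list_del(node):
    """Unlink node and poison its pointers (illustrative model only)."""
    if node.prev is None or node.next is None:
        raise RuntimeError(f"double list_del on {node.name}")
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = None  # stand-in for the kernel's poison pointer values
    node.next = None


def members(head):
    """Return the names of all nodes currently linked on head."""
    names, cur = [], head.next
    while cur is not head:
        names.append(cur.name)
        cur = cur.next
    return names


req_list = ListHead("req_list")
req = ListHead("req")
list_add_tail(req, req_list)

list_del(req)          # the first remover wins the race
try:
    list_del(req)      # the second remover arrives late
    second_removal = "silently succeeded"
except RuntimeError as exc:
    second_removal = str(exc)

print(members(req_list), "|", second_removal)
```

The second call models whichever of the two paths loses the race. In the kernel there is no exception to catch at that point, only pointers that no longer describe a valid list.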

Why the previous checks were insufficient

Originally p9_fd_cancelled() only skipped processing if req->status indicated RCVD (received) or ERROR, but other states can also indicate that the request has already been removed by a racing path. That meant p9_fd_cancelled() sometimes proceeded to act on requests that had already been cleaned up by p9_read_work() or the client callback. The patch changes the guard logic so that p9_fd_cancelled() only proceeds for requests that are still definitively in the SENT state; any other state is treated as not owned and is skipped. This is a conservative, defensive fix that reduces the concurrency surface by refusing to act if ownership is ambiguous.
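The difference between the two guards can be sketched as a deny-list check versus an allow-list check. The Python below is a model of the logic as this article describes it, not the kernel source. The ReqStatus enum and both function names are hypothetical, and FLSHD is included as an assumed extra state (the article names only SENT, RCVD, and ERROR) purely to show how a state outside the deny list slips through.

```python
from enum import Enum, auto


class ReqStatus(Enum):
    SENT = auto()
    RCVD = auto()
    ERROR = auto()
    FLSHD = auto()  # assumption: an extra state not named in the article


def old_guard_proceeds(status):
    """Deny-list check as described above: skip only RCVD and ERROR."""
    return status not in (ReqStatus.RCVD, ReqStatus.ERROR)


def new_guard_proceeds(status):
    """Allow-list check: act only on requests that are still SENT."""
    return status is ReqStatus.SENT


# States on which the two guards disagree are the ones the fix closes off.
differences = [s.name for s in ReqStatus
               if old_guard_proceeds(s) != new_guard_proceeds(s)]
print(differences)  # ['FLSHD']
```

The allow-list form stays correct if further states are added later, which is the main reason a check of this shape is easier to review.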

The patch and commit trace

Upstream maintainers merged a focused change to net/9p/trans_fd.c that adjusts the status check in p9_fd_cancelled() and prevents processing requests that are no longer in the SENT state. The canonical commit referenced by public advisories is 74d6a5d56629 with the commit message along the lines of:
  • "9p/trans_fd: Fix concurrency del of req_list in p9_fd_cancelled/p9_read_work"
The stable-kernel backport series shows this patch propagated to multiple long-term branches (including the 5.4.y, 5.10.y and 6.1.y stable trees) following the usual upstream → stable flow. Because the change is small and low-risk, it is also easy for distribution maintainers to backport.
Why this edit? The code-level rationale is intentionally minimal: rather than redesigning request ownership, the patch tightens the guard so p9_fd_cancelled() will not touch requests that have left the SENT state, because those transitions imply another actor already removed or is managing the request list. That single-condition inversion (from "skip RCVD/ERROR" to "only process SENT") is easier to reason about and lowers regression risk.

Impact assessment: who should care

  • Systems that mount 9p filesystems — including guests using QEMU virtfs (virtio-9p) or hosts running 9p transports (tcp, unix, fd, virtio) — are the potentially exposed population. QEMU-based setups and virtual appliance images that use -virtfs or -fsdev are primary examples.
  • The attack vector is primarily local: exploitation requires the attacker to be able to create request traffic that triggers the vulnerable code paths or to otherwise coerce the kernel into exercising the client cancel/cleanup flows. Many public trackers therefore label this CVE as not remotely exploitable in its raw form.
  • Operational severity: the most realistic impact is denial-of-service (kernel OOPS, KASAN reports, host instability). In theory, list/allocator corruption sometimes offers a path to memory-corruption primitives that a skilled attacker could convert into privilege escalation, but such paths are platform- and allocator-dependent and typically require significant additional effort. Treat escalation risk as possible but unproven.
Key takeaways:
  • This is a correctness and concurrency fix that mitigates a crash-on-certain-timing issue discovered by fuzz testing.
  • The immediate, observable symptom in the wild would be kernel OOPSes and KASAN traces referencing p9_fd_cancelled and list_del.

Exploitability: realistic threat model

  • Skill and access required: local access (or guest access in a virtualized environment) plus the ability to trigger 9p client operations. QEMU guests that mount host exports via 9p or host processes that mount remote 9p exports are both valid trigger vectors.
  • Remote exploitation without local or guest foothold is unlikely; this is not a network daemon exposing an unauthenticated remote code path. Public advisories and NVD record the bug as discovered by Syzkaller rather than by an incident-driven PoC.
  • Attack outcomes: most probable — crash/DoS; less likely but not impossible — carefully crafted heap manipulations could amplify list corruption into write primitives. Given the complexity of such exploit chains, operators should prioritize availability mitigation first.

Detection and forensic indicators

Look for these concrete signals in telemetry and kernel logs:
  • Kernel OOPS or KASAN backtraces that include p9_fd_cancelled, p9_read_work, list_del, or __list_del_entry. The NVD/OSV traces include an example call stack reported by syzkaller.
  • Messages tied to mounting or v9fs operations (mount attempts via mount -t 9p ...) or QEMU virtfs activity.
  • Repeated or correlated OOPSes during guest mount operations can indicate a race being hit in production. Centralize kernel logs and search for the function names above to detect hits.
Useful inspection commands:
  • Check for active 9p mounts: findmnt -t 9p
  • Check for 9p modules: lsmod | grep -E '(^9p|9pnet|9pnet_virtio)'
  • Search kernel logs for relevant traces: journalctl -k | grep -E 'p9_fd_cancelled|p9_read_work|list_del'
Those simple checks can rapidly identify whether a host has the 9p client active and whether the kernel produced related OOPSes.
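For centralized logs, the same search can be scripted. The following is a minimal sketch that scans kernel log text for the symbol names listed above; the function name find_9p_race_hits and the sample lines are made up for illustration and are not real kernel output.

```python
import re

# Symbols named in the indicators above; the longer list symbol comes first
# so it is reported in preference to its "list_del" substring.
SIGNATURES = ("p9_fd_cancelled", "p9_read_work", "__list_del_entry", "list_del")
PATTERN = re.compile("|".join(re.escape(s) for s in SIGNATURES))


def find_9p_race_hits(log_text):
    """Return (line_number, matched_symbol) pairs for matching log lines."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        match = PATTERN.search(line)
        if match:
            hits.append((lineno, match.group(0)))
    return hits


# Synthetic input for demonstration only.
sample = "\n".join([
    "synthetic: nothing of interest here",
    "synthetic:  __list_del_entry_valid+0x0/0x0",
    "synthetic:  p9_fd_cancelled+0x0/0x0",
])
hits = find_9p_race_hits(sample)
print(hits)  # [(2, '__list_del_entry'), (3, 'p9_fd_cancelled')]
```

In practice the input would be the text produced by journalctl -k or an exported log file. A hit on list_del alone is weak evidence, since many subsystems use list helpers, so treat it as significant only alongside one of the p9_ symbols.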

Mitigation and short-term controls

The recommended path is straightforward and targeted:
  • Primary fix: apply vendor/distribution kernel updates that include the upstream patch (commit 74d6a5d56629) and reboot into the patched kernel. The change is small and widely backported to stable series; most mainstream distributions will ship updates quickly once maintainers accept the stable commits.
If a timely kernel update is not possible, use these defensive controls as interim mitigations:
  • Prevent mounting of 9p filesystems by unprivileged users. Enforce mount policy (e.g., restrict mount to admins) and remove any automated mounts used by untrusted environments.
  • Unload or blacklist 9p-related modules on hosts that do not need 9p functionality. Typical module names are 9p, 9pnet, and 9pnet_virtio; the last is the virtio transport module used by QEMU guest virtfs. To blacklist persistently:
    • Create /etc/modprobe.d/disable-9p.conf containing:
        blacklist 9p
        blacklist 9pnet
        blacklist 9pnet_virtio
    • Reboot or manually rmmod the modules where safe.
    Note: removing modules may impact running guests or services that rely on virtfs or 9p exports; test before broad removal.
  • On hypervisors, prefer alternatives to 9p (for example virtio-fs where appropriate) and limit which guests are allowed to use passthrough filesystems. QEMU docs and kernel documentation discuss security-model and transport options for 9p.
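When the blacklist file is rolled out through configuration management, it is worth checking that all three entries actually landed. The sketch below parses modprobe.d-style text and reports which of the module names from the list above are not blacklisted; the function name and the sample text are hypothetical.

```python
WANTED = ("9p", "9pnet", "9pnet_virtio")


def missing_blacklist_entries(conf_text):
    """Return the wanted module names that have no 'blacklist' line."""
    listed = set()
    for raw in conf_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated tokens.
        tokens = raw.split("#", 1)[0].split()
        if len(tokens) == 2 and tokens[0] == "blacklist":
            listed.add(tokens[1])
    return [name for name in WANTED if name not in listed]


# Illustrative input: one entry is commented out and therefore inactive.
sample_conf = (
    "blacklist 9p\n"
    "# blacklist 9pnet\n"
    "blacklist 9pnet_virtio  # virtio transport\n"
)
missing = missing_blacklist_entries(sample_conf)
print(missing)  # ['9pnet']
```

One design caveat: in modprobe.d, a blacklist line stops alias-based autoloading but does not block an explicit modprobe of the module. Where that matters, an install directive that points the module at /bin/false is a commonly used stronger control.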
Operational checklist for triage:
  • Identify 9p usage across the fleet: findmnt -t 9p, lsmod | grep 9p, and configuration management records.
  • If present in production, prioritize patching hosts that host multiple tenants or that expose guest-facing workloads.
  • If patching is delayed: restrict mounts, blacklist modules on non-guest hosts, and harden access to virtualization tooling.
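For fleet-wide inventory where findmnt output is not already collected, the mount table can be parsed directly. This is a small sketch that assumes input in the usual /proc/mounts layout (source, mount point, filesystem type, options, and two numeric fields); the function name and the sample entries are illustrative only.

```python
def find_9p_mounts(mounts_text):
    """Return (source, mountpoint) pairs for entries whose fstype is 9p."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "9p":
            found.append((fields[0], fields[1]))
    return found


# Illustrative mount table; on a live host, read the text of /proc/mounts.
sample_mounts = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "hostshare /mnt/host 9p rw,relatime,trans=virtio 0 0\n"
)
nine_p = find_9p_mounts(sample_mounts)
print(nine_p)  # [('hostshare', '/mnt/host')]
```

An empty result only shows that nothing is mounted right now. Hosts with the modules loaded or with 9p entries in fstab or guest definitions still belong in the inventory.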

Vendor and distribution rollout: what to expect

Because the fix is concise and considered low-risk, maintainers posted stable backports rapidly to the stable kernel branches; stable patch lists include entries for multiple kernel releases. Distributions such as Debian and Ubuntu typically pull these stable commits into their kernel packages and announce CVE mappings (DEBIAN-CVE-2025-40027 / UBUNTU-CVE-2025-40027) via their security trackers. Operators should always verify the presence of the upstream commit in the packaged kernel changelog before assuming a host is patched.
Practical verification steps:
  • For packaged kernels: check the distribution security advisory or changelog for CVE-2025-40027 or for the commit ID 74d6a5d56629.
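  • For custom-built kernels: confirm the fix is present in the kernel tree. In a tree that contains the upstream history, git merge-base --is-ancestor 74d6a5d56629 HEAD exits with status 0 when the commit is an ancestor of the checked-out branch. Because git log --grep matches commit messages rather than commit IDs, git log --grep=74d6a5d56629 only finds stable backports that cite the upstream commit in their message; git log --grep='Fix concurrency del of req_list' searches by subject instead.
  • After patching: reboot into the new kernel, verify the running version with uname -r, and confirm that the earlier OOPS signatures no longer appear.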
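The changelog check for packaged kernels can be automated in a similar way. The sketch below looks for the CVE identifier, the upstream commit ID, or the commit subject quoted earlier in this article; the function name is hypothetical, and how the changelog text is obtained depends on the distribution's tooling, which is not covered here.

```python
# Markers taken from this article: CVE ID, upstream commit ID, commit subject.
FIX_MARKERS = (
    "CVE-2025-40027",
    "74d6a5d56629",
    "Fix concurrency del of req_list",
)


def changelog_mentions_fix(changelog_text):
    """Return True if any known marker for the fix appears in the text."""
    haystack = changelog_text.lower()
    return any(marker.lower() in haystack for marker in FIX_MARKERS)


print(changelog_mentions_fix("* 9p/trans_fd: fix concurrency del of req_list"))
print(changelog_mentions_fix("* unrelated scheduler change"))
```

A positive match is good evidence that the fix is included. A negative result is inconclusive, because stable backports carry their own commit IDs and some changelogs summarize fixes without listing CVE numbers.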

Pros and cons of the fix: strengths and residual risk

Strengths:
  • Surgical, low-risk change — the patch does not rearchitect the request model; it tightens the guard and preserves semantics for normal flows, which reduces regression probability.
  • Quick to backport — small diffs are easy for maintainers to apply to stable trees and vendor kernels. Evidence shows the patch landed in multiple stable branches.
Potential caveats:
  • Vendor lag — as with many kernel fixes, embedded systems, vendor forks and OEM kernels may lag upstream. These are the likely long tail for exposure. Operators of appliances or custom images must coordinate with vendors or apply the patch locally.
  • Residual uncertainty on exploit chains — while the public record does not show active exploitation or a public PoC, kernel list/allocator corruption is inherently dangerous. Treat claims that the issue is only DoS as pragmatic but not absolute; if an environment allows local code execution by untrusted actors, prioritize patching.

Recommended remediation plan (concise, actionable)

  • Inventory: map hosts that build or mount 9p or run QEMU guests that use virtfs. Use findmnt -t 9p, lsmod | grep 9p, and configuration management databases.
  • Patch: apply distribution kernel updates that include the upstream stable commit; for custom kernels, merge commit 74d6a5d56629 from upstream and build/test the kernel. Reboot into the patched kernel.
  • Validate: after patching, ensure no new kernel OOPSes referencing p9_fd_cancelled or list_del appear. Run representative mount/unmount/guest operations in staging to validate behavior.
  • Mitigate short-term: if patching cannot be immediate, restrict ability to mount 9p, blacklist 9p modules on non-required hosts, and isolate any test or CI hosts from production.
  • Monitor: add SIEM / log rules to flag repeated OOPS traces containing p9_fd_cancelled, p9_read_work, or list_del as high-priority alerts.

Broader lessons for operators and maintainers

  • Fuzzing (Syzkaller) continues to find concurrency corner cases in networked and asynchronous kernel subsystems. Rapidly applied, small fixes are the practical pattern for many kernel correctness problems.
  • Always map CVE identifiers to upstream commit IDs when evaluating whether a particular kernel package contains a fix; package changelogs and git history are the authoritative sources.
  • Long-tail risk (embedded devices, vendor forks) remains the main operational concern; include vendor kernel packages in any vulnerability inventory process.

Conclusion
CVE-2025-40027 is a focused concurrency fix in the 9p client stack that eliminates a narrow race which could produce double list_del() and kernel-level memory corruption. The patch is small, reviewable, and already backported in stable trees — operators should treat this as a standard kernel correctness update: identify exposed hosts, apply patched packages or upstream commits, and reboot in a controlled window. Where patching is delayed, temporarily restrict 9p usage or blacklist 9p transport modules until the kernel is updated. The technical fix demonstrates the kernel community's preference for surgical, defensive changes that close a race without broad refactors — a pragmatic balance between safety and regression risk.

Source: MSRC Security Update Guide - Microsoft Security Response Center
 
