Linux Kernel NFS Proc Cleanup Bug Fix CVE-2025-38400

A subtle error-handling bug in the Linux kernel's NFS code, tracked as CVE-2025-38400, has been fixed. When the initialization routine nfs_fs_proc_net_init failed, the kernel could leave behind a /proc/net/rpc/nfs entry and later log a warning and leak state during namespace teardown. The problem is now addressed in stable kernels and vendor updates.

Background / Overview

The Network File System (NFS) implementation in the Linux kernel exposes runtime state through procfs, including RPC-related entries under /proc/net/rpc. When a network namespace is set up, NFS creates proc entries that reflect RPC subsystem state; those entries must be removed cleanly when initialization fails or when the namespace is torn down. A syzbot fault-injection run found a path where nfs_fs_proc_net_init failed but the /proc/net/rpc/nfs entry was not removed, so rpc_proc_exit later tried to remove /proc/net/rpc and found it non-empty. The result was a kernel warning about "removing non-empty directory" and a small resource leak.

This is an error-handling and resource-cleanup bug rather than a memory-corruption or privilege-escalation vulnerability. The fix ensures that the code path handling a failed nfs_fs_proc_net_init removes the procfs node that was already created, so that subsequent cleanup succeeds without warnings. The change was backported into multiple stable kernel trees and incorporated into vendor kernel updates.

What exactly went wrong? Technical analysis

The failure and its trace

The syzbot report that triggered this fix shows an injected failure inside nfs_fs_proc_net_init caused by a low-level allocation fault (fault injection into a slab allocation). The call trace shows the creation path going through the procfs helpers (proc_create_net_data, proc_create_reg, __proc_create); when initialization did not complete, the nfs proc entry created earlier was never removed. Later, rpc_proc_exit (called as part of net namespace teardown) attempted to remove /proc/net/rpc and logged a "removing non-empty directory" warning because /proc/net/rpc/nfs remained.

This is fundamentally an error-path regression: the normal (success) path creates nfs under /proc/net/rpc and records state, while the error path that should roll back that creation did not remove the node, leaving inconsistent state that triggers a warning and a minor leak during cleanup.
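
The failure sequence described above can be modelled in a few lines of Python. This is only a toy model, not kernel code: the ProcDir class, the net_init and simulate functions, and the cleanup_on_error flag are all hypothetical names introduced for illustration. It shows why a missed rollback on the error path surfaces later, at teardown, as a non-empty-directory warning.

```python
class ProcDir:
    """Toy stand-in for a procfs directory (illustrative only, not kernel code)."""

    def __init__(self, name):
        self.name = name
        self.children = {}

    def create(self, child):
        self.children[child] = ProcDir(child)

    def remove(self, child):
        """Remove a child; return a warning string if it still had entries."""
        node = self.children.pop(child)
        if node.children:
            leaked = next(iter(node.children))
            return (
                "remove_proc_entry: removing non-empty directory "
                f"'{self.name}/{child}', leaking at least '{leaked}'"
            )
        return None


def net_init(net, fail_fs_proc_init, cleanup_on_error):
    """Model of namespace setup: create 'nfs' under net/rpc, then run a step that may fail."""
    rpc = net.children["rpc"]
    rpc.create("nfs")                 # earlier step succeeded
    if fail_fs_proc_init:             # stands in for the injected allocation failure
        if cleanup_on_error:
            rpc.remove("nfs")         # roll back the earlier step
        return False
    return True


def simulate(cleanup_on_error):
    """Run a failing init followed by teardown; return the teardown warning, if any."""
    net = ProcDir("net")
    net.create("rpc")
    net_init(net, fail_fs_proc_init=True, cleanup_on_error=cleanup_on_error)
    return net.remove("rpc")          # teardown, as rpc_proc_exit would attempt
```

Calling simulate(False) models the buggy error path and returns the warning text, while simulate(True) models an error path that unwinds its earlier step and returns None.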

Files and functions affected

The public advisories and stable-tree patches place the affected code in fs/nfs/inode.c (NFS core code). The fix landed as an upstream commit and was then applied to stable trees such as 6.12, 6.1, 5.15, 5.10 and 5.4, whose stable-queue patches reference that upstream commit.

Where the issue was introduced

Kernel maintainers’ analysis attributes the issue to earlier changes in the NFS namespace initialization code: the announcement mail indicates that the problem was introduced around kernel 6.8–6.9 and fixed in a later stable release. The Linux CVE announcement and the stable-tree emails reference the relevant commits and indicate the range of kernels in which the bug is present.

Affected systems, severity and scoring

  • Affected component: Linux kernel — NFS (fs/nfs).
  • Impact type: Availability (resource/state leak / warning during namespace teardown), not confidentiality or integrity. Several vendor advisories classify this as moderate / medium severity.
  • CVSS v3 example score: Amazon Linux advisory reports CVSS v3.1 5.5 (Medium) with vector CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H — reflecting a local attack vector and Availability impact. Different vendors and scanners may present slightly different scores but the consensus places the issue in the moderate/medium range.

Why local? The bug is reached through code paths that run during NFS proc initialization and namespace operations; the syzbot trace shows it appearing during an unshare/namespace operation originating from a local process. An attacker must therefore be able to run code locally (or within a container or another environment that can trigger net namespace operations) to reach the failing path. This reduces remote exploitation risk compared with network-facing memory-corruption bugs, but it does not eliminate concern where local code execution is possible (for example, multi-tenant hosts, containers, or compromised accounts).

Why it matters: real-world impact and attack surface

  • Containers and multi-tenant hosts: Modern systems often allow users to create network and user namespaces (containers). The syzbot trace indicates unshare_nsproxy_namespaces and create_new_namespaces were involved, meaning namespace operations can exercise the problematic path. On hosts where unprivileged user namespaces are allowed (or where users can acquire namespace capabilities via container tooling), an unprivileged user could trigger the faulty initialization path. That elevates practical risk on container platforms and shared build hosts.
  • Operational noise and availability: The direct outcome observed is kernel warnings and a leaked proc entry that prevents /proc/net/rpc removal. Repeated or targeted triggering might lead to resource exhaustion or operational instability in corner cases, producing confusing warnings in syslogs and complicating automated monitoring and cleanup. Vendor analyses characterize the primary impact as availability-related.
  • Limited to local vectors: There is no public evidence this bug allows privilege escalation, arbitrary code execution, or remote compromise by itself. Its primary risk is local denial-of-service-style disruption and log pollution. As of the time of the vendor and kernel notices, there are no public exploit reports tied to this CVE. That said, because local attack vectors can be part of complex multi-step exploit chains, the kernel maintainers and vendors treated the fault seriously and backported fixes.

Timeline and vendor responses

  • Discovery / report: A syzbot fault-injection run discovered the problem while testing NFS initialization with artificially induced allocation failures. The bug was reported to the kernel project and tracked as CVE-2025-38400.
  • Upstream fix: An upstream commit implemented proper cleanup in the error path; that upstream commit (and its patch) was then propagated into multiple stable trees. The kernel-stable maintainers posted the patch to 6.12-, 6.1-, 5.15-, 5.10- and 5.4-stable trees.
  • Vendor updates: Major distributors incorporated the fix in kernel updates and advisories — Amazon Linux (ALAS) and downstream vendors published fixes for affected kernel series. OSV and other vulnerability aggregators reflect publication and modification dates consistent with the kernel fix rollout. Administrators are advised to apply vendor kernel updates.

Detection: how to tell if this hit your systems

  • Check kernel logs (dmesg, journalctl -k) for the distinctive warning recorded in the syzbot trace: "remove_proc_entry: removing non-empty directory 'net/rpc', leaking at least 'nfs'" (or similar wording produced by remove_proc_entry in the generic proc code). That message is the clearest indicator that an initialization error left proc state behind.
  • Look for repeated or correlated warnings during operations that create namespaces (e.g., unshare, unshare --net, container startup) or during NFS client initialization. Hosts running container builders, CI runners, or multi-user compute clusters are the most likely places to see this.
  • Monitor for unexplained /proc anomalies: while /proc entries are kernel-managed and cannot generally be removed from userspace, unusual persistence of /proc/net/rpc/nfs during teardown or mismatches between running RPC/NFS subsystems and proc entries are signals of trouble.

If you find these warnings in your logs and you are on an unpatched kernel, treat them as evidence to prioritize a kernel update and, if necessary, avoid running the specific namespace operations until a patch is applied.
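
As a minimal sketch of the log check described above, the following Python helper scans kernel log lines for the warning text. The function name find_proc_leak_warnings and the exact regular expression are assumptions made for illustration; the pattern is modelled on the message quoted from the syzbot trace and may need loosening if your kernel words it differently.

```python
import re

# Pattern modelled on the warning text quoted in the syzbot trace.
WARNING_RE = re.compile(
    r"remove_proc_entry: removing non-empty directory '([^']+)', "
    r"leaking at least '([^']+)'"
)


def find_proc_leak_warnings(lines):
    """Return (directory, leaked_entry) pairs for every matching log line."""
    hits = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits
```

One way to use it is to pass the helper any iterable of lines, for example sys.stdin when the output of journalctl -k or dmesg is piped into a small wrapper script.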

Mitigation and remediation (recommended actions)

Immediate mitigations (short term)

  • Restrict unprivileged namespace creation: If feasible for your environment, consider disabling or restricting unprivileged user namespace creation to reduce the local attack surface. Use the distribution-recommended sysctl approach (for example, on Debian-family systems sysctl -w kernel.unprivileged_userns_clone=0 or configure kernel.userns_restrict where applicable). This is a trade-off — disabling user namespaces will break some container and sandbox use-cases (e.g., unprivileged containers, certain developer tools). Evaluate impact before changing the setting in production.
  • Limit access to hosts that allow unprivileged namespace creation: On multi-tenant infrastructure, restrict who can run container/job workloads that create namespaces. Enforce stricter access controls in CI/CD, build hosts, or shared developer machines.
  • Monitor kernel logs and alert on the NFS proc cleanup warning: Add a rule to your log-monitoring system to detect the remove_proc_entry: removing non-empty directory 'net/rpc' warning text and alert operators for further investigation.
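
Before changing any namespace setting, it can help to see what a host currently has configured. The sketch below reads sysctl values straight from /proc/sys. The helper names read_sysctl and userns_settings are hypothetical, and user.max_user_namespaces is an addition not mentioned in the text above (it is the mainline limit on user namespaces); any of these keys may be absent on a given distribution, in which case the helper reports None.

```python
from pathlib import Path

# Sysctl keys to inspect. The first two are the ones named in the text above;
# the third is an assumption added here (mainline user-namespace limit).
CANDIDATE_SYSCTLS = [
    "kernel.unprivileged_userns_clone",
    "kernel.userns_restrict",
    "user.max_user_namespaces",
]


def read_sysctl(name, root="/proc/sys"):
    """Return the sysctl value as a string, or None if the key does not exist."""
    path = Path(root, *name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


def userns_settings(root="/proc/sys"):
    """Map each candidate sysctl name to its current value (None when absent)."""
    return {name: read_sysctl(name, root) for name in CANDIDATE_SYSCTLS}
```

The root parameter exists only so the helper can be pointed at a test directory; on a live host the default of /proc/sys applies.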

Long-term remediation (apply patches)

  • Apply vendor kernel updates: The definitive fix is to upgrade to the patched kernel distributed by your operating system vendor. Vendors and the kernel-stable trees have included the upstream change; install the kernel package update from your vendor and reboot where required to activate the patched kernel. Distributors such as Amazon Linux have published fixes, and the patch has been backported to long-lived kernel series.
  • Backport or cherry-pick for custom kernels: If you run custom-built kernels, apply the upstream commit that fixes the nfs_fs_proc_net_init error-path cleanup and rebuild. Kernel-stable patches provide backports for older supported series (patch files and stable commit hashes were posted to stable queues). Kernel maintainers discourage ad-hoc cherry-picks in production without testing; instead, prefer vendor-supplied kernel updates if possible.
  • Restart affected services / reboot: After installing a patched kernel, reboot hosts to ensure the kernel-level change takes effect. For some environments, unloading and reloading NFS-related modules may be possible, but a reboot is the safest path to fully clear kernel-level state. Vendors’ advisories and the kernel CVE announcement recommend updating kernels rather than relying on user-mode cleanup.

Practical guidance for administrators and DevOps

  • Prioritize updates for hosts that:
      • Run multi-tenant services (CI runners, shared developer workstations).
      • Allow unprivileged namespace creation by users.
      • Host container runtime operations or frequent namespace manipulations.
      • Run older kernel branches where the patch was backported (check vendor advisories for your kernel series).
  • Test kernel updates in staging: Because kernel updates can affect drivers and container runtimes, test the vendor kernel update in a staging environment if you have custom drivers or specialized networking stacks.
  • Add detection and alerting: Implement log-based alerts for the specific kernel warning message and monitor for unusual /proc behavior. This enables early detection of attempted exploitation or accidental triggering.
  • Review namespace policy: If your environment relies on unprivileged user namespaces for workflows, schedule a risk review. Where disabling unprivileged namespaces is not acceptable, consider compensating controls: stricter process and user isolation, limiting who can run arbitrary containers, and stricter AppArmor/SELinux confinement.

How vendors scored and why scores vary

Different vendors and scanners assign slightly different severities depending on their scoring interpretations and environment models. The Amazon Linux advisory lists CVSS v3.1 base score 5.5 (Medium) using vector AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H, which reflects a local attack vector and an availability-only impact. Other scanners (Snyk, Rapid7) present the issue as moderate/medium severity, with CVSS calibrations consistent with a local, non-confidentiality-impacting flaw. When you map risk to your environment, consider whether local users or containerized workloads can reach the vulnerable code path.
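
To make the quoted score concrete, the sketch below recomputes a CVSS v3.1 base score from a vector string using the metric weights and rounding rule published in the CVSS v3.1 specification. The function names base_score and roundup are chosen here for illustration, and the sketch deliberately handles only scope-unchanged (S:U) vectors, which is all this advisory needs.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, following the v3.1 reference algorithm."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a scope-unchanged CVSS v3.1 vector string."""
    parts = dict(item.split(":") for item in vector.split("/"))
    parts.pop("CVSS", None)  # drop the optional "CVSS:3.1" prefix
    if parts.get("S") != "U":
        raise ValueError("this sketch only handles S:U (scope unchanged)")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For the vector quoted above, the impact sub-score is about 3.6 and the exploitability sub-score about 1.8, which rounds up to the 5.5 reported in the advisory. Changing PR:L to PR:N or AV:L to AV:N in the input shows how sensitive the result is to assumptions about who can reach the code path.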

Developer and maintainer commentary (what kernel maintainers changed)

The upstream patch ensures the NFS code removes the procfs entry on error in nfs_fs_proc_net_init so that subsequent rpc_proc_exit calls find /proc/net/rpc empty and can remove it cleanly. The kernel-stable maintainers added the patch to multiple trees (6.12, 6.1, 5.15, 5.10, 5.4) and included the change as part of regular stable updates rather than as a standalone emergency fix. This follows the standard kernel practice: fix upstream, then stabilize and backport to supported series.

Risk evaluation and final assessment

  • Likelihood of remote compromise: Very low. The vulnerability requires local-triggered code paths (namespace operations and NFS proc initialization). There is no public evidence of remote exploitation.
  • Likelihood of in-the-wild exploitation: Low, but not zero. The bug’s vector and impact make it an unlikely candidate for widespread weaponization. However, in multi-step attacks or in the presence of other local privilege issues, it could be part of larger exploit chains on poorly isolated hosts. No public PoC or weaponized exploit tied to this CVE was observed at the time vendors published fixes. This status can change; always check for later advisories.
  • Operational impact if triggered: Local warnings, possible resource leaks, unpredictable cleanup behavior during namespace teardown, and potential localized availability issues. Not a data-exfiltration or privilege-escalation bug by itself.

Overall, the vulnerability is best categorized as a medium-severity operational bug that’s important to patch on shared and containerized infrastructure but not an immediate emergency for standalone desktop hosts that do not allow local namespace manipulation.

Checklist: what to do now (quick actionable items)

  • Identify hosts running affected kernel versions (check distro advisories or kernel version).
  • Apply vendor kernel updates and reboot hosts into the patched kernel.
  • If you cannot patch immediately, reduce local attack surface by restricting unprivileged namespace usage (kernel.unprivileged_userns_clone / kernel.userns_restrict), and tighten who can run container workloads.
  • Add log monitoring for the warning signature (procfs cleanup warnings) and review historical logs for the same warnings to discover past triggers.
  • For custom kernels, apply the upstream change from the stable commit queue and test before broad deployment. Kernel-stable patches are available for multiple series.
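
For the first checklist item, a rough version comparison can help triage upstream or self-built kernels. The sketch below is an assumption-laden helper: parse_release, is_older_than and running_release are hypothetical names, and no fixed version is hard-coded because the correct threshold depends on your kernel series and must come from your vendor's advisory or the stable changelog.

```python
import platform
import re


def parse_release(release):
    """Parse a release string such as '6.12.30-generic' into (major, minor, patch)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(group or 0) for group in match.groups())


def is_older_than(release, fixed_release):
    """Return True if `release` sorts before `fixed_release` numerically."""
    return parse_release(release) < parse_release(fixed_release)


def running_release():
    """Return the release string of the running kernel (same as `uname -r`)."""
    return platform.release()
```

A call such as is_older_than(running_release(), fixed) is only meaningful when fixed is the first patched release of the same stable series. Distribution kernels often backport fixes without changing the upstream version number, so for vendor kernels the package changelog or advisory remains the authoritative check.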

Conclusion

CVE-2025-38400 is a textbook case of a subtle error-path cleanup bug: not catastrophic by itself, but disruptive for multi-tenant and containerized environments. The kernel community responded quickly with an upstream fix that was backported into stable series and pushed by distributors. The correct operational response for affected organizations is straightforward: patch the kernel, reboot, and — where appropriate — reduce the local namespace attack surface until updates can be deployed.

Administrators should treat this as a routine kernel patching priority for shared systems and CI/container hosts. Vigilant monitoring for the characteristic procfs cleanup warning will help detect attempts to trigger the failure path while updates are scheduled and deployed.

Source: MSRC Security Update Guide - Microsoft Security Response Center