CVE-2025-40362 CephFS MDS Caps Validation Fix in Linux Kernel

A subtle logic bug in the Linux kernel’s Ceph client has been assigned CVE‑2025‑40362 and patched. In multi‑filesystem (multifs) Ceph deployments, the MDS authorization caps check could be applied to the wrong filesystem because the code did not validate the filesystem name (fsname) alongside the caps, allowing a client’s permissions on one CephFS to leak into another.

Background​

CephFS supports running multiple named file systems (fsname) within a single Ceph cluster. Each filesystem has its own Metadata Server (MDS) state and its own authorization model: caps (capabilities) can be granted to Ceph clients for specific filesystems and paths (for example, caps mds = "allow rw fsname=cephfs path=/volumes/"). The bug addressed by CVE‑2025‑40362 arises when Ceph’s kernel client code checks MDS auth caps but fails to verify that the caps being evaluated actually belong to the same fsname as the mount. That omission can cause the kernel to apply caps from one filesystem to operations on another, breaking the intended separation of privilege between filesystems.

This vulnerability is significant because it is logical: it does not rely on memory corruption or a kernel crash, but on incorrect authorization semantics. In multi‑tenant or multi‑filesystem clusters the operational effect is straightforward: a client may be able to perform writes or deletes on a filesystem where it should only have read access if it holds stronger caps on a different filesystem in the same cluster. The public CVE record documents a clear reproduction scenario that demonstrates the issue on a vstart test cluster.

What the vulnerability does (technical summary)​

The root cause, in plain terms​

  • The Ceph kernel client performs an MDS auth caps check when a client attempts filesystem operations.
  • The check validated caps values (the allowed permissions) but did not always confirm that the fsname associated with those caps matched the filesystem the client was using.
  • As a result, a client key carrying caps for fsname A could, under certain code paths, have those caps applied to operations on fsname B. The intended isolation between named filesystems was violated.
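To make the gap concrete, here is a minimal userspace C sketch of the flawed decision; the struct and function names are hypothetical and only model the logic described above, not the actual fs/ceph code:

```c
/*
 * Illustrative userspace model of the flawed check; struct and function
 * names are hypothetical and do not mirror the actual fs/ceph sources.
 * The decision consults only the permission bits on a cap, never the
 * filesystem that cap was granted for.
 */
#include <stdbool.h>
#include <stdio.h>

struct mds_cap {
    const char *fsname;   /* filesystem this cap was granted for */
    bool        writable; /* does the cap include 'w'? */
};

/* Buggy shape of the check: the cap's fsname is never compared. */
static bool may_write_buggy(const struct mds_cap *cap, const char *mount_fsname)
{
    (void)mount_fsname;   /* provenance ignored -> caps can leak across fsnames */
    return cap->writable;
}

int main(void)
{
    /* Client holds rw on fsname2 but has mounted fsname1 with read-only caps. */
    struct mds_cap rw_on_fs2 = { .fsname = "fsname2", .writable = true };

    printf("write allowed on fsname1? %d\n",
           may_write_buggy(&rw_on_fs2, "fsname1"));   /* prints 1: the leak */
    return 0;
}
```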

Typical exploit / misconfiguration scenario​

The canonical reproduction described in public advisories shows how the issue manifests:
  • Create two CephFS filesystems on a test cluster: fsname1 and fsname2.
  • Grant the same client an r (read‑only) capability on fsname1 and rw (read/write) on fsname2.
  • Export the client’s keyring and mount fsname1 with that client key.
  • The client, despite being granted only read rights on fsname1, can create or delete files there because the stronger caps for fsname2 were misapplied to fsname1 during MDS authorization checks.
That sequence demonstrates the core logic failure: caps provenance (which filesystem the caps are intended for) was not always enforced.

Verified facts and authoritative sources​

  • The National Vulnerability Database (NVD) entry for CVE‑2025‑40362 records the kernel description and the reproduction steps; it explicitly notes that the fix ensures the MDS auth caps check validates the fsname along with caps.
  • The OSV (Open Source Vulnerabilities) record mirrors the NVD description and timestamp for publication, confirming independent ingestion of the report.
  • Public CVE aggregators (e.g., cvefeed, CVE Details) also index the same description and link to the kernel patch references; these independent indexes corroborate the technical summary and remediation direction.
These independent sources align on the core facts: the bug was in the Linux kernel Ceph client, its practical impact is incorrect authorization across multiple CephFS namespaces, and the upstream kernel received a fix that validates fsname when checking MDS auth caps.

The fix: what changed in the kernel​

Upstream maintainers corrected the Ceph client’s authorization logic so that the MDS auth caps check also validates the associated fsname before applying caps to a mount. Public trackers and aggregated CVE records indicate that multiple stable kernel backports were committed to propagate the fix to maintained kernel lines; the patch was intentionally scoped to the authorization check rather than a broad refactor, a surgical approach that reduces regression risk while restoring correct semantics.
Notes about the patch series from public trackers:
  • Later revisions of the patch stopped storing the fsname taken from the mdsmap, validated the mount’s fsname via ceph_mount_options instead, and tightened warning messages and defensive checks.
  • Maintainers applied defensive coding to avoid potential null dereferences and to ensure the right string comparisons are used when deciding whether a cap matches the active mount.
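As a rough sketch of that corrected, defensive shape of the check (again a userspace model with assumed names, not the actual kernel patch), the authoritative fsname comes from the mount’s parsed options, and a missing name on either side is treated as a non‑match:

```c
/*
 * Userspace sketch of the corrected, defensive check.  Field and
 * function names are assumptions for illustration, not the actual
 * kernel patch: the authoritative fsname comes from the mount's
 * parsed options, and a missing name on either side is a non-match
 * rather than a NULL dereference or an accidental wildcard.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct mount_opts {
    const char *fsname;   /* filesystem the client actually mounted */
};

static bool cap_matches_mount(const char *cap_fsname, const struct mount_opts *opts)
{
    if (!opts || !opts->fsname || !cap_fsname)
        return false;                              /* defensive: refuse to match */
    return strcmp(cap_fsname, opts->fsname) == 0;  /* exact fsname comparison */
}

static bool may_write_fixed(const char *cap_fsname, bool cap_writable,
                            const struct mount_opts *opts)
{
    return cap_matches_mount(cap_fsname, opts) && cap_writable;
}

int main(void)
{
    struct mount_opts fs1 = { .fsname = "fsname1" };

    /* An rw cap granted for fsname2 no longer satisfies a check on fsname1. */
    printf("write allowed on fsname1? %d\n",
           may_write_fixed("fsname2", true, &fs1));   /* prints 0: denied */
    return 0;
}
```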

Who should care (impact and scope)​

This vulnerability matters most to environments that meet the following criteria:
  • Clusters running multiple CephFS filesystems (multifs) on the same Ceph cluster.
  • Deployments that rely on per‑filesystem path‑scoped capabilities to enforce tenant isolation.
  • Systems where the same client principal (for example, client.usr) is intentionally granted differing rights on different filesystems, such as cloud providers, multi‑tenant storage providers, and complex on‑prem clusters using CephFS for isolated tenant data.
If you run only a single CephFS filesystem per cluster and do not reuse client keys across filesystems, the practical exposure may be limited. However, assumptions are risky — many operators reuse client identities for automation or simplicity. As a defensive posture, treat any multi‑fs use or key reuse as in‑scope until you confirm otherwise.

Detection and verification​

If you suspect an instance of this issue in your environment, use the following prioritized checks:
  • Inventory: identify hosts that mount CephFS and determine whether they mount named filesystems; check mount options and device strings for name= and mds_namespace= parameters, or for the newer <client>@<fsid>.<fsname>=/path device syntax that selects a specific filesystem. Correlate mounts with client key usage.
  • Permission test in a controlled environment: reproduce the canonical test on an isolated lab cluster (never run destructive tests on production):
      • Create fsname1 and fsname2.
      • Give client.X read on fsname1 and read/write on fsname2.
      • Mount fsname1 as client.X and attempt a write that should be forbidden.
      • If the write succeeds, the environment exhibits the problem described in the CVE (a minimal write‑probe sketch is included at the end of this section).
  • Kernel and Ceph logs: examine kernel messages (dmesg, journal) and MDS logs for warnings about fsname mismatches or unusual auth decisions; patched kernels may emit more specific warnings if fsname checks fail.
Testing should be performed in an isolated testbed with representative Ceph versions and kernel builds. The CVE description includes example reproduction steps; these are intended for safe, controlled validation only.
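The controlled permission test above can be scripted with a small write probe that tries to create a file under the mount where the client should only hold read caps. The sketch below is illustrative (the default path is only an example) and must be pointed exclusively at a lab mount:

```c
/*
 * Minimal write probe for the controlled permission test: attempt to
 * create a file under a CephFS mount where the client is supposed to
 * hold read-only caps.  On an affected kernel the create may succeed;
 * on a patched kernel it should fail (typically EACCES or EROFS).
 * Run this ONLY against a lab mount, never against production data.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* Example default; pass the read-only lab mount point as argv[1]. */
    const char *mount = (argc > 1) ? argv[1] : "/mnt/fsname1";
    char path[4096];

    snprintf(path, sizeof(path), "%s/cve-2025-40362-probe", mount);

    int fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0600);
    if (fd >= 0) {
        printf("UNEXPECTED: write allowed on %s (created %s)\n", mount, path);
        close(fd);
        unlink(path);           /* clean up the probe file */
        return 1;
    }
    printf("write denied on %s as expected: %s\n", mount, strerror(errno));
    return 0;
}
```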

Mitigation and remediation guidance​

The definitive remediation is to install kernel updates that include the upstream patch or to apply the specific patch into your distribution/OEM kernel and rebuild.
Practical steps operators should follow, prioritized:
  • Inventory and triage (hours)
      • Locate all systems that mount CephFS and record whether they use named filesystems (multifs).
      • Identify client keys that are used across multiple filesystems; flag keys with differing permissions on different filesystems.
      • For vendor appliances or OEM kernels, contact the vendor to confirm whether their kernel package includes the upstream backport.
  • Patch (days)
      • Apply kernel updates from your distribution that include the fix. Use vendor advisories and package changelogs to confirm the backport contains the Ceph fsname validation fix.
      • If you build and maintain custom kernels, merge the upstream patch and test thoroughly before roll-out.
  • Temporary compensating controls (until patched)
      • Avoid reusing client keys across distinct CephFS filesystems where different privileges are assigned.
      • Restrict who can create or distribute client keys that span multiple filesystems.
      • Use network and host‑level controls to limit which clients can mount which CephFS instances (for example, network segmentation, node labels, or mount policies).
      • If feasible, temporarily avoid using named multi‑fs setups for high‑risk tenants until the fix is deployed and validated.
  • Post‑patch verification
      • Re-run the controlled reproduction test in a staging environment to verify the behavioral fix: with the patch installed, the client that has only read rights on fsname1 must not be able to write even if it holds rw rights on fsname2.
      • Monitor MDS and kernel logs for warnings and confirm no regression.

Operational risk analysis (strengths, weaknesses, and business risk)​

Strengths of the remediation​

  • The upstream fix is targeted: it validates fsname and avoids broad refactors, which minimizes regression risk when backporting to stable kernel lines.
  • The vulnerability is deterministic and easy to reproduce in test, which helps operators validate fixes quickly.

Residual risks and caveats​

  • Distribution/packaging lag: enterprise vendors, OEMs, and embedded devices may take time to ship kernel packages containing the fix. Long‑tail appliances that embed specific kernel builds may remain vulnerable until vendors issue updates. Operators must track vendor advisories closely.
  • Operational complexity: environments using automation that intentionally reuses client credentials across filesystems for convenience will need to change deployment patterns or adjust ACLs — both can be operationally disruptive.
  • Testing risk: merging kernel patches into custom trees requires careful testing, especially on storage hosts that carry customer data. Operators should stage the update and use safe rollback plans.

Business impact​

  • For multi‑tenant service providers and cloud operators, the vulnerability has a direct confidentiality/integrity risk: tenants could gain more filesystem privileges than intended, potentially allowing tenant data modification or deletion.
  • For single‑tenant or small deployments where named filesystems and shared keys are not used, the business impact is lower — nevertheless, confirm configurations to be certain.

Why the missing MSRC page matters (and what to do)​

The fact that the referenced MSRC update guide page for CVE‑2025‑40362 returns “not found” is a helpful reminder that vendor pages and curated vulnerability trackers may lag or temporarily omit entries. Microsoft has been rolling out machine‑readable VEX/CSAF attestations for some products, but those attestations are product‑scoped and may not cover every artifact immediately. The absence of an MSRC entry does not mean the issue is absent from other artifacts such as Azure Linux kernels or WSL2 kernels; each artifact must be inventoried separately. For context, community trackers and vendor advisories corroborate the kernel‑level Ceph fix and list similar remediation guidance; don’t rely on a single vendor page to determine exposure.
Operational recommendation: if you rely on Microsoft‑provided Linux artifacts (Azure images, AKS node images, WSL2 kernels), verify whether those artifacts’ kernels contain the upstream commit or vendor backport by checking package changelogs or vendor advisory metadata rather than assuming a missing MSRC page implies safety.

Detection queries and incident response playbook​

High‑signal detection rules and steps to respond:
  • Hunt for anomalous writes: create short tests that attempt to write files in read‑only mounts using representative client keys; a successful write in that context is a high signal.
  • Audit mounts and key usage: log and monitor which keys are used to mount which fsname; alert when a single key appears across multiple filesystems with differing caps (see the mount scanner sketch after this list).
  • Preserve evidence: if you detect unauthorized writes, capture kernel logs, MDS logs, and any mount/credential metadata for forensic analysis.
  • Containment: temporarily remove affected nodes from service if you suspect unauthorized data modification, and isolate the key(s) in question (rotate or revoke client credentials where practical).
  • For managed providers: open a support ticket with your Ceph vendor/OEM and request the exact package identifiers that contain the backport.
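To support the mount and key audit step above, a small read‑only scanner over /proc/self/mounts can list every kernel CephFS mount with its device string and options, so the client name and selected filesystem can be correlated with your key inventory; a minimal sketch:

```c
/*
 * List CephFS mounts from /proc/self/mounts so the client name and the
 * selected filesystem can be correlated with key inventories.  Purely
 * read-only; interpreting the fields (name=, mds_namespace=, or the
 * newer device syntax) is left to the operator.
 */
#include <mntent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *fp = setmntent("/proc/self/mounts", "r");
    if (!fp) {
        perror("setmntent");
        return 1;
    }

    struct mntent *m;
    while ((m = getmntent(fp)) != NULL) {
        if (strcmp(m->mnt_type, "ceph") != 0)
            continue;   /* only kernel CephFS mounts */
        printf("device=%s mountpoint=%s options=%s\n",
               m->mnt_fsname, m->mnt_dir, m->mnt_opts);
    }
    endmntent(fp);
    return 0;
}
```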

Final notes and cautions​

  • The CVE description and community advisories do not indicate that this bug was used to achieve kernel code execution or remote privilege escalation; it is an authorization‑semantics bug. Treat claims of RCE or privilege escalation as environment dependent and as requiring specific enabling conditions beyond the core bug; public trackers focus on the auth caps leak and its access semantics.
  • Always verify vendor package changelogs and commit hashes before declaring a host patched. The upstream patch is straightforward, but distribution backports vary. Where a vendor claims a backport is included, confirm the exact commit hash or changelog entry in the package metadata.
  • If you cannot patch immediately, focus on key hygiene: avoid reusing client keys across multiple fsnames and restrict which principals can request broad capabilities. These mitigations reduce exposure even before kernels are updated.

Conclusion​

CVE‑2025‑40362 exposed a subtle but dangerous authorization check omission in the Linux kernel’s Ceph client: fsname was not always validated alongside MDS auth caps, allowing capability leakage across named filesystems. Upstream maintainers issued a targeted fix that restores correct fsname validation; operators must verify that their kernels contain the backport and apply vendor packages promptly. Multi‑tenant CephFS deployments, automation that reuses client keys, and environments that run multiple CephFS namespaces on a single cluster should treat this as high priority for inventory, verification, and patching. The vulnerability is a reminder that semantics and provenance checks are as security‑critical as memory safety, and that careful inventory and key hygiene remain essential controls while kernel patches propagate through vendor channels.
Source: MSRC Security Update Guide - Microsoft Security Response Center
 
