CVE-2025-31133: runc MaskedPaths Race and Local Container Escape

runc contains a newly disclosed local container escape and information-disclosure vulnerability (CVE-2025-31133). The flaw abuses runc’s maskedPaths handling by exploiting mount race conditions around the bind-mount of the container’s /dev/null. Operators should treat hosts that run untrusted images or parallel build systems as high-priority remediation targets.

Background / Overview

runc is the reference OCI runtime used by Docker, Kubernetes (via containerd), and many other container systems; it performs low-level namespace and mount setup and writes security metadata into pseudo-filesystems such as /proc. A class of recent disclosures shows that small validation or timing errors in those setup steps can have host-level impact when untrusted images, shared mounts, or parallel builds are involved. CVE-2025-31133 fits squarely into that pattern: when runc used a container’s /dev/null to mask a file (the maskedPaths feature), it did not sufficiently verify that the source inode was a genuine /dev/null, so an attacker could race or otherwise manipulate mounts and have the bind-mount use an attacker-controlled source instead. In plain terms, the vulnerability enables two practical attack variants:
  • an arbitrary mount gadget that can redirect mounts and reads to attacker-controlled files (leading to information disclosure, denial-of-service, or container escape), and
  • a maskedPaths bypass, where runc silently skips masking if it encounters certain errors during the bind-mount sequence, leaving normally hidden host paths exposed to the container.
Multiple independent trackers and the runc/GitHub advisory confirm the issue and the fixed releases; runc maintainers published targeted patches and staged them into releases to address the verification and race conditions.

Technical anatomy: what goes wrong

The role of maskedPaths and /dev/null

The maskedPaths feature allows container operators to hide or neutralize sensitive files exposed inside a container filesystem. For files, runc implements masking by bind-mounting the container’s own /dev/null on top of the target path, effectively turning reads into EOFs and writes into no-ops. For directories, runc instead mounts a read-only tmpfs over the path to hide its contents. Because file masking relies on bind-mounting a source inode that lives inside the container’s own filesystem, verifying that inode before trusting it is critical.
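
To make the feature concrete, the sketch below reads the `linux.maskedPaths` list from an OCI runtime `config.json` document and reports which expected paths are not listed. The `EXPECTED_MASKS` list, the function names, and the sample config are illustrative assumptions, not runc code or a recommended policy.

```python
import json

# Illustrative expectations only (an assumption for this sketch); derive the
# real list from your own runtime defaults and security policy.
EXPECTED_MASKS = ["/proc/kcore", "/proc/keys", "/sys/firmware"]


def masked_paths(config_text):
    """Return the linux.maskedPaths entries of an OCI config.json document."""
    config = json.loads(config_text)
    return list((config.get("linux") or {}).get("maskedPaths") or [])


def missing_masks(config_text, expected=EXPECTED_MASKS):
    """List expected paths that the config does not ask the runtime to mask."""
    configured = set(masked_paths(config_text))
    return [path for path in expected if path not in configured]


# Hypothetical, heavily trimmed config used only for demonstration.
EXAMPLE_CONFIG = json.dumps({
    "ociVersion": "1.0.2",
    "linux": {"maskedPaths": ["/proc/kcore", "/proc/keys"]},
})

print(missing_masks(EXAMPLE_CONFIG))
```

A check like this only shows what the configuration requests. It cannot show whether the runtime actually applied the mask, which is the gap this CVE concerns.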

The failure mode: insufficient inode verification + race windows

CVE-2025-31133 arises from two complementary problems:
  • runc did not robustly verify that the source path used for the bind-mount (the container’s /dev/null) was actually the expected device inode, and
  • the code path allowed race conditions with shared mounts (for example, mounts created concurrently by other containers or build steps) to change the inode or remove /dev/null between runc’s checks and the actual mount operation. Attackers can exploit this time-of-check/time-of-use (TOCTOU) window to present an attacker-controlled mount as the source, or to delete /dev/null so that runc’s logic skips the bind and leaves maskedPaths inactive.
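
As a rough illustration of the first problem, the following sketch shows what "verifying the expected device inode" can mean on Linux, where /dev/null is conventionally the character device with major 1 and minor 3. The function names are hypothetical and this is not runc's implementation.

```python
import os
import stat

# On Linux, /dev/null is the character device with major 1, minor 3.
NULL_MAJOR, NULL_MINOR = 1, 3


def is_null_device(mode, rdev):
    """True if the given st_mode / st_rdev pair describes the null device."""
    return (
        stat.S_ISCHR(mode)
        and os.major(rdev) == NULL_MAJOR
        and os.minor(rdev) == NULL_MINOR
    )


def path_is_null_device(path):
    """Path-based convenience check; lstat() so a symlink is never followed."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        return False
    return is_null_device(st.st_mode, st.st_rdev)
```

A path-based check like `path_is_null_device` only addresses the first problem. Its answer can be stale by the time the mount happens, which is exactly the TOCTOU window described in the second.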

Practical exploit primitives

The advisory and public analysis show practical exploit recipes that are surprisingly accessible in modern DevOps workflows:
  • Parallel container startup (for example, docker buildx parallelized builds) or shared tmpfs mounts create the necessary concurrent mount windows. Researchers reproduced race conditions using standard Dockerfiles that execute container startup operations concurrently.
  • An attacker-supplied image or build step can create and manipulate /dev/null inside the shared mount space to change the source inode runc will later bind. That means CI runners, public builder hosts, and shared developer machines are realistic targets.
Because the attack is local (AV:L in CVSS terminology), the bug cannot be exploited by a remote, unauthenticated attacker on its own. In multi-tenant CI and builder environments, however, the ability to run untrusted builds is effectively a vector into hosts used by many teams, blurring the line between “local” and “remote.”

Affected versions and patches

  • Affected: runc releases that predate the patched versions. Advisories list 1.2.7 and earlier, the 1.3 series up to 1.3.2, and the 1.4.0 release candidates up to 1.4.0-rc.2.
  • Patched: runc 1.2.8, 1.3.3 and 1.4.0-rc.3 contain the fixes that harden the maskedPaths handling and tighten inode/mount verification. Administrators should take vendor-packaged updates or rebuild runc from the patched upstream if vendor packages lag.
Operators should inventory all hosts that run runc (including containerd/Docker hosts, CI/build runners, and developer workstations used for builds) and map package versions to these patched releases immediately. Distribution trackers and vendor advisories simplify mapping packages to fixed versions; rely on your distro’s security tracker to find the exact package version that includes the upstream fix.
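
For a first-pass inventory, the version mapping above can be sketched as a small comparison helper. This is a minimal sketch that looks only at upstream version strings; `is_patched` is a hypothetical name, and a distribution package that backports the fix under an older upstream version would be misreported, so the distro tracker remains the authority.

```python
import re

_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-rc\.?(\d+))?")

# First fixed release per upstream series, as (patch, rc); rc None = final.
FIRST_FIXED = {(1, 2): (8, None), (1, 3): (3, None), (1, 4): (0, 3)}


def _key(patch, rc):
    # A release candidate sorts before the final release of the same version.
    return (patch, float("inf") if rc is None else rc)


def is_patched(version):
    """Return True if an upstream runc version is at or past the fixed release."""
    match = _VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized runc version: {version!r}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    rc = int(match.group(4)) if match.group(4) else None
    series = (major, minor)
    if series < (1, 2):
        return False  # older series fall in the "1.2.7 and earlier" range
    if series > (1, 4):
        return True   # assumption: later series carry the fix forward
    return _key(patch, rc) >= _key(*FIRST_FIXED[series])


for candidate in ("1.2.7", "1.2.8", "1.3.2", "1.4.0-rc.2", "1.4.0-rc.3"):
    print(candidate, "patched" if is_patched(candidate) else "VULNERABLE")
```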

Exploitation scenarios and threat modeling

Realistic high-risk environments

  • CI/CD runners that accept untrusted Dockerfiles or build contexts and use buildx or other parallel build mechanisms. The disclosure explicitly demonstrates how docker buildx-style concurrency can create exploitable race windows.
  • Multi-tenant shared developer hosts and public builder services where images from many tenants are executed with shared mounts or tmpfs.
  • Container hosts that allow privileged mounts or expose host pseudo-filesystems into build contexts. Avoid exposing host /proc-like pseudo-filesystems to untrusted builds.

What an attacker can and cannot do

What CVE-2025-31133 enables:
  • Host file disclosure: maskedPaths skipping or misbinding can reveal normally masked host files (for example, sensitive /proc entries).
  • Denial-of-service against host services by mounting or manipulating critical host resources.
  • Container escape or privilege escalation when chained with other issues: the runc maintainers and public analysts explicitly warn that chaining this bug with related runc or kernel issues (for example, proc-redirect races or SELinux helper bugs) can materially raise the attack impact to host compromise.
What it does not do by itself:
  • Remote unauthenticated takeover: the attacker must be able to run or influence container startup/creation or to manipulate shared mounts/build-time operations; this is a local vector. However, in shared CI and builder ecosystems that local access model is frequently achievable by an attacker who can submit a malicious build.

What the patches change (high-level)

The runc patchset addresses the root causes in three complementary ways:
  • Add stronger verification that the source inode used for bind-mount masking is the expected /dev/null device, not an attacker-controlled file.
  • Remove the silent ignore-on-ENOENT behavior where runc would skip masking if /dev/null was missing at mount time; the code now handles that condition explicitly instead of ignoring it, so maskedPaths are not silently bypassed.
  • Harden mount/path handling and reduce TOCTOU windows where practical (for example, using safer path joins and fd-based mount targets where possible). The broader patchset for the runc advisory includes safe path helpers and tighter inode checks.
These changes are narrow and targeted; they are not a full re-architecture of runc’s mount logic, but they address the specific verification and timing gaps that enabled this issue. Operators should apply the fixed releases rather than attempting to craft ad-hoc mitigations that could leave other TOCTOU windows exposed.
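
The combined idea behind these changes (verify the inode on an open file descriptor, and fail closed when the source is missing or wrong) can be sketched as follows. This is an illustration under Linux-only assumptions (`O_PATH`, null device at major 1 / minor 3); `MaskingError` and `open_verified_null` are hypothetical names, and the code is not taken from the runc patchset.

```python
import os
import stat


class MaskingError(RuntimeError):
    """Raised when a trustworthy masking source cannot be obtained."""


def open_verified_null(path):
    """Open `path` without following symlinks and verify it on the open fd.

    Returns a file descriptor that pins the verified inode. Raises
    MaskingError in every other case, including a missing file, so the
    caller can never silently skip masking.
    """
    flags = os.O_PATH | os.O_NOFOLLOW | os.O_CLOEXEC
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        raise MaskingError(f"cannot open masking source {path}: {exc}") from exc
    st = os.fstat(fd)
    is_null = stat.S_ISCHR(st.st_mode) and (
        os.major(st.st_rdev), os.minor(st.st_rdev)) == (1, 3)
    if not is_null:
        os.close(fd)
        raise MaskingError(f"{path} is not the null device")
    return fd
```

Because the check runs against the descriptor rather than the path, a later swap of the path no longer changes what was verified. A caller would then mount from the descriptor instead of resolving the path a second time.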

Immediate mitigations and operational checklist

If you cannot patch every host immediately, apply these prioritized mitigations:
  • Patch priority and inventory:
      • Inventory every host running runc (containerd, Docker, Kubernetes nodes, CI build hosts). Map package versions to runc releases and prioritize those with untrusted workloads.
  • Short-term runtime mitigations:
      • Use rootless containers or user namespaces for untrusted builds and images; these reduce the ability of a container to weaponize a misbound mount against host files because the mapped UIDs lack host privileges. The runc advisory explicitly lists rootless contexts as a practical mitigation when patching is delayed.
      • Limit or disable docker buildx-style parallel build concurrency on shared builder hosts until patches are applied. Reducing parallelism narrows TOCTOU windows.
  • Platform hardening:
      • Block untrusted images on shared CI runners (image signing, allowlist registries).
      • Avoid exposing host pseudo-filesystems or writable tmpfs mounts into builder containers.
  • Detection and telemetry:
      • Add monitoring for unexpected sysctl or /proc writes, changes to core_pattern, or writes to /proc/sysrq-trigger on builder/host nodes (these are high-signal artifacts because the exploit abuses proc-like paths).
      • Hunt CI logs for evidence of parallel container startups, repeated mount operations, or scripts that create symlinks inside tmpfs during builds; those are the practical PoC patterns used by researchers.
  • Incident response:
      • If you observe unexpected host /proc modifications, isolate affected hosts immediately and assume possible follow-on exploitation, since misdirected writes can be chained into more serious host impacts.
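
One low-cost way to approximate the telemetry item above is to snapshot a few high-signal sysctl files and compare them against a known-good baseline. The watch list and function names below are assumptions for illustration. Polling like this only sees persistent value changes, so it cannot observe writes to a write-only trigger file such as /proc/sysrq-trigger; those need audit-style tooling.

```python
import hashlib
from pathlib import Path

# Hypothetical watch list; extend it to match your own hosts.
WATCHED = ["/proc/sys/kernel/core_pattern", "/proc/sys/kernel/sysrq"]


def snapshot(paths=WATCHED):
    """Map each path to a SHA-256 digest of its contents (None if unreadable)."""
    result = {}
    for path in paths:
        try:
            result[path] = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        except OSError:
            result[path] = None  # absent or unreadable; still worth recording
    return result


def changed_paths(baseline, current):
    """Return the paths whose recorded digest differs between two snapshots."""
    return sorted(
        path for path in baseline.keys() | current.keys()
        if baseline.get(path) != current.get(path)
    )
```

In use, a baseline taken right after provisioning would be stored off-host, and any non-empty `changed_paths(baseline, snapshot())` result on a builder node would be treated as a reason to investigate.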

Detection & hunting playbook (practical rules)

  • Detection rules to deploy:
      • File ownership changes to unexpected UIDs on host files following build jobs (for example, sudden changes to UID mappings on files that should be root-owned).
      • Processes associated with build runners writing to /proc files that are not typically written by those services.
      • Creation of unusual symlinks inside tmpfs mounts used by builds (these were explicitly used in PoC reproductions).
  • Investigation steps when an alert fires:
      • Identify the job/request that created the container or build and capture container images/manifests.
      • Collect the node dmesg/journalctl and container runtime logs around the time of the suspected incident.
      • Check for sysctl changes, core_pattern modifications, or /proc/sysrq-trigger writes. If any of those are present, isolate the host immediately.
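
The symlink rule above can be prototyped with a short directory walk. This is a minimal sketch: the prefix list and function name are assumptions, and it only flags absolute link targets, so relative targets that climb out of the directory would need extra path normalization.

```python
import os

# Hypothetical prefixes considered sensitive for this hunt.
SENSITIVE_PREFIXES = ("/proc", "/sys")


def suspicious_symlinks(root, prefixes=SENSITIVE_PREFIXES):
    """Return sorted (link_path, target) pairs for symlinks into the prefixes."""
    findings = []
    # followlinks=False keeps the walk from descending through symlinks.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                continue
            target = os.readlink(full)
            if any(target == p or target.startswith(p + "/") for p in prefixes):
                findings.append((full, target))
    return sorted(findings)
```

Running it against the tmpfs or scratch directories that build jobs can write to gives a quick list of links worth correlating with the job that created them.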

Risk analysis: strengths of the disclosure and remaining uncertainties

What’s good about the fixes and the disclosure

  • The runc maintainers published a targeted patchset that closes the concrete verification and timing gaps rather than offering only a superficial check: the fixes include inode verification, safer path APIs, and explicit handling of ENOENT conditions that previously caused silent masking bypasses. Multiple independent trackers (GitHub advisory, CVE aggregators, distro security feeds) corroborate the affected and patched versions, which simplifies enterprise prioritization.

What remains uncertain or risky

  • The runc project notes that this patchset addresses the present failure modes but does not rewrite all mount or path-resolution logic; residual TOCTOU hazards may exist in adjacent code paths or in alternative runtimes (for example crun or youki). Operators should treat the fixes as necessary but not exhaustively protective for the general class of procfs/mount racing attacks.
  • SELinux/AppArmor effectiveness in real deployments can be environment-dependent. Some analyses note that LSM behavior varies with distro policies and SELinux modules; do not assume SELinux will stop every chained exploit without validating your specific policy in testbeds. Flag SELinux effectiveness as conditional until validated.

Exploit prevalence and telemetry

Public reporting showed no confirmed widespread in-the-wild exploitation at the time of disclosure, but high-quality PoCs and straightforward build-time reproduction scripts shorten the path to weaponization, so defenders should act as if it is imminent. Historically, once PoCs are public, attackers quickly automate exploitation against large fleets with exposed build nodes.

Practical roadmap for operators (recommended sequence)

  • Inventory all hosts running runc and prioritize CI/build hosts and multi-tenant nodes.
  • Deploy vendor or upstream patches for runc (1.2.8 / 1.3.3 / 1.4.0-rc.3) and restart container services where required. Confirm package mappings via your distro tracker.
  • For hosts that cannot be patched immediately, move critical workloads to patched hosts or use rootless containers for untrusted workloads.
  • Harden CI and builder policies: enforce image signing, restrict untrusted images, and reduce build concurrency where feasible.
  • Implement the detection rules above and run a focused hunt for recent changes to /proc/sys values or core_pattern edits across builder fleets.

Conclusion

CVE-2025-31133 is a high-consequence, local vulnerability that leverages a deceptively small verification gap and realistic mount race conditions to neutralize maskedPaths protections and to create mount gadgets that expose host resources. The practical risk is highest on shared CI/build hosts and multi-tenant container nodes where untrusted images and parallelism are common. runc’s maintainers published deliberate, narrow fixes that address the verification and ENOENT-handling failures; operators must patch runc to the fixed releases immediately, prioritize builder/CI hosts for remediation, and apply short-term mitigations (rootless containers, reduced build concurrency, runtime detections) until fleets are updated. Caveat: the exploitability and LSM behavior vary by environment. Treat any claims about SELinux or other mitigations as conditional until validated in your own testbeds, and assume that high-quality proof-of-concept material will make weaponization likely once public.


Source: MSRC Security Update Guide - Microsoft Security Response Center