CVE-2025-52881: runc procfs race enables container confinement bypass

runc’s handling of procfs writes contains a race-and-redirect weakness: an attacker can misdirect the writes runc uses to apply Linux Security Module (LSM) labels so that they land on fake or otherwise benign procfs files. The label is then never applied, which gives a practical path to disabling container confinement and to turning sysctl and other proc writes against the host. The vulnerability, tracked as CVE-2025-52881, was disclosed alongside patches and a coordinated advisory from the runc maintainers. It affects runc up to and including 1.2.7, 1.3.2 and 1.4.0-rc.2; fixes shipped in runc 1.2.8, 1.3.3 and 1.4.0-rc.3.

Background / Overview​

runc is the reference low-level container runtime used by Docker, Kubernetes, and many other container ecosystems to create and run OCI-compliant containers. Its job includes setting up namespaces, applying security labels, and writing LSM (AppArmor/SELinux) attributes and sysctl configuration into procfs for containerized processes. Because these operations touch host-sensitive pseudo-filesystems, any logic bug in how runc writes to /proc or validates its targets can have severe consequences for the host.
CVE-2025-52881 is a race condition enabling procfs write redirection and symlink following that lets an attacker cause runc to write to a procfs path other than the one runc intended. The exploit is not a simple remote network bug — it requires the ability to start containers with custom mount configurations, or otherwise trigger racing container executions (for example, via build-time parallelism such as docker buildx build) that share mounts. When successfully exploited, a malicious actor can:
  • Prevent intended LSM labels from being applied to container processes (effectively disabling confinement),
  • Redirect writes to innocuous-seeming proc files (like /proc/self/sched) that pass simple procfs checks, and
  • Weaponize redirected writes to more dangerous proc or sysctl targets (for example, /proc/sysrq-trigger or /proc/sys/kernel/core_pattern) to cause host instability or to facilitate further privilege escalation.
Multiple vulnerability trackers and vendor advisories confirm the problem, list the same affected versions and patched releases, and enumerate the same high-level exploitation scenarios. Debian, NVD and the runc project’s own advisory are consistent on the technical root cause and the remediation timeline.

Why this matters: LSM labels, procfs and container security​

LSM labels are a primary confinement control​

Linux Security Modules such as AppArmor and SELinux are widely used as a last line of defense for container isolation. runc writes per-process label attributes into procfs (for example, /proc/<pid>/attr/current or other LSM-specific attribute files) so that the kernel applies the intended LSM policy to container processes. If those writes are misdirected, the kernel never associates the container process with the LSM policy meant to restrict it, and the process runs without that confinement.

procfs writes can be weaponized​

procfs and sysctl interfaces are text-oriented and powerful. Misdirected writes to these files can do things ranging from benign no-ops to catastrophic host effects. The runc advisory explicitly calls out the risk of redirecting writes to files such as /proc/sysrq-trigger (which can trigger host hangs or panics) or writing a crafted core_pattern that executes host-level code when processes crash — an escalation path that turns a confinement bypass into host compromise.
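The core_pattern risk follows from documented kernel behavior: when the value of /proc/sys/kernel/core_pattern begins with a pipe character, the rest of the line is treated as a command line for a helper program that the kernel runs when a process dumps core. A minimal sketch of classifying a core_pattern value on that basis is shown here; the function name describe_core_pattern and the returned dictionary layout are illustrative assumptions, not part of runc or any advisory.

```python
def describe_core_pattern(value: str) -> dict:
    """Classify a core_pattern value as a file template or a piped helper.

    Hypothetical helper for illustration only.
    """
    stripped = value.strip()
    if stripped.startswith("|"):
        # Everything after the pipe is a command line; the first token is
        # the program the kernel would execute on a crash.
        parts = stripped[1:].split()
        return {
            "piped": True,
            "program": parts[0] if parts else "",
            "args": parts[1:],
        }
    # Otherwise the value is a file name template for the core file.
    return {"piped": False, "program": None, "args": []}
```

A piped value that names an unexpected program is the case worth alerting on, since that program would run on the host when any process crashes.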

The attack vector is local but practical​

This is a local attack class (AV:L in CVSS parlance), but the practical preconditions can be surprisingly easy to achieve in containerized deployments:
  • Malicious images or builder contexts can be distributed via registries.
  • Build-time or runtime parallelism (for example, docker buildx) can create the racing conditions needed for redirection.
  • Multi-tenant CI runners, shared builder hosts, and developer workstations that accept untrusted images are realistic attack surfaces.
Because of these operational realities, defenders must treat the vulnerability as high-priority for hosts running untrusted workloads, CI systems, or container build services.

Technical anatomy: how CVE-2025-52881 works​

Core failure mode: race + link-following in procfs writes​

At a high level, the bug allows a racing container (or parallel process) to present an alternate procfs target — either via a symbolic link in a tmpfs or by other mount manipulations — so that when runc attempts to write an LSM label (or sysctl), the write is delivered to a different proc path than runc intended.
The runc mitigation for a related older bug, CVE-2019-19921, previously checked only that the file being written looked like it belonged to procfs. Attackers can evade that check by arranging for /proc/self/attr/<label> to resolve to some procfs file that accepts writes but performs no LSM-labeling (for example /proc/self/sched), thereby defeating the simpler verification. CVE-2025-52881 expands on that class of attacks by using race conditions between container setups and shared mounts to redirect writes into attacker-chosen procfs targets even when the simplified procfs check passes.
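The link-following half of the problem can be reproduced harmlessly with ordinary files. The sketch below uses a temporary directory as a stand-in for procfs (an assumption made so that it is safe to run): a path-based write silently lands in a decoy file through a symlink, while the same write opened with O_NOFOLLOW is refused. The names naive_write, nofollow_write and demo are hypothetical. O_NOFOLLOW only guards the final path component, so this illustrates the weakness rather than reproducing the fix that runc shipped.

```python
import errno
import os
import tempfile


def naive_write(path: str, data: bytes) -> None:
    """Path-based write that follows whatever the final component points to."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def nofollow_write(path: str, data: bytes) -> None:
    """Same write, but refuse a symlink in the final path component."""
    fd = os.open(path, os.O_WRONLY | os.O_NOFOLLOW)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo() -> dict:
    """Show a write being redirected through a symlink, then being refused."""
    with tempfile.TemporaryDirectory() as d:
        decoy = os.path.join(d, "decoy")   # stand-in for a benign proc file
        label = os.path.join(d, "label")   # stand-in for the intended target
        open(decoy, "wb").close()
        os.symlink(decoy, label)           # attacker-controlled redirection

        naive_write(label, b"profile")     # succeeds, but lands in the decoy
        with open(decoy, "rb") as fh:
            redirected = fh.read()

        try:
            nofollow_write(label, b"profile")
            refused = False
        except OSError as exc:
            # Linux reports ELOOP when O_NOFOLLOW meets a symlink.
            refused = exc.errno == errno.ELOOP
        return {"redirected": redirected, "refused": refused}
```

The naive write reports success even though the intended target was never touched, which is why a misdirected label write can go unnoticed.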

Why race conditions matter here​

The exploit leverages timing and concurrent mount manipulation: runc assumes certain mountpoints are stable during the sequence of operations that prepare container namespaces and write attributes. When two containers or processes race while sharing mounts, an attacker can reorder operations so the proc path runc opens is the one the attacker controls. Because a path is resolved afresh each time it is used, and that resolution can cross symlinks and mountpoints that another process is changing concurrently, these race windows are exploitable whenever runc’s path handling and inode validation are insufficiently robust.
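One general defense against this kind of check-then-use gap is to validate the file descriptor that was actually opened rather than the path that was requested. The following sketch compares the device and inode of an open descriptor with a trusted reference, again using temporary files in place of procfs. The helper names same_file and demo are hypothetical, and this shows the general idea of inode verification, not the specific helpers in the runc patchset.

```python
import os
import tempfile


def same_file(fd: int, expected: os.stat_result) -> bool:
    """True if fd refers to the same device/inode pair as the reference."""
    actual = os.fstat(fd)
    return (actual.st_dev, actual.st_ino) == (expected.st_dev, expected.st_ino)


def demo() -> dict:
    """Swap a path between check and use, then detect it via the open fd."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target")
        decoy = os.path.join(d, "decoy")
        for path in (target, decoy):
            open(path, "wb").close()

        reference = os.stat(target)        # identity recorded at check time

        fd = os.open(target, os.O_WRONLY)
        before_swap = same_file(fd, reference)
        os.close(fd)

        # Simulated attacker action: the path now leads somewhere else.
        os.rename(target, target + ".orig")
        os.symlink(decoy, target)

        fd = os.open(target, os.O_WRONLY)
        after_swap = same_file(fd, reference)  # checked on the fd, not the path
        os.close(fd)
        return {"before_swap": before_swap, "after_swap": after_swap}
```

Because the comparison is made on the descriptor, a later change to the path cannot alter what the check saw, which is the property a path-based check lacks.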

Broader attack surface: writes beyond LSM labels​

CVE-2025-52881 is not limited to LSM label files. The advisory notes that any writes to /proc — including sysctls under /proc/sys — could be misdirected. Attackers can therefore aim for files with greater destructive potential (for example, /proc/sys/kernel/core_pattern or /proc/sysrq-trigger) to cause host-level side effects beyond simple containment bypass. The advisory explicitly documents such scenarios and the maintainers’ patchset attempts to harden how runc opens and writes to proc entries.

Affected versions and patches​

  • Affected: runc versions up to and including 1.2.7, 1.3.2, and 1.4.0-rc.2 (specific CPE ranges reflect those cutoffs).
  • Patched: runc 1.2.8, 1.3.3, and 1.4.0-rc.3; the runc project released a combined patchset that includes secure-proc handling, inode verification helpers, fd-based mount target handling, securejoin usages and other hardening changes.
The runc maintainers released a consolidated set of fixes that address this vulnerability and two related issues; the patches include a private fork/patch for the selinux helper module used by runc (the upstream selinux library was accordingly involved) together with replacements for insecure path joins and the addition of safe procfs helper APIs. Linux distribution trackers (for example, Debian’s security tracker) and major vulnerability databases list the same affected/fixed versions and have begun mapping vendor package updates to fixed releases; administrators should rely on their distribution’s packaged runc update or on rebuilding runc from patched upstream if distributions lag.
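The version cutoffs above can be encoded as a small comparison routine. This is a sketch under stated assumptions: the names parse, FIRST_FIXED and is_patched are hypothetical, branches older than 1.2 are treated as unpatched because the affected range is described as "up to and including" 1.2.7, and branches newer than 1.4 are assumed to carry the fix. It only understands upstream version numbers; a distribution may backport the fix under an older upstream version, so the distribution's own tracker remains authoritative for packaged builds.

```python
import re

_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-rc\.?(\d+))?")
_FINAL = 10**6  # sorts a final release after any release candidate


def parse(version: str) -> tuple:
    """Turn '1.4.0-rc.3' into a sortable tuple (major, minor, patch, rc)."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized runc version: {version!r}")
    major, minor, patch, rc = match.groups()
    return (int(major), int(minor), int(patch),
            int(rc) if rc is not None else _FINAL)


# First fixed release on each branch, as listed in the text above.
FIRST_FIXED = {
    (1, 2): parse("1.2.8"),
    (1, 3): parse("1.3.3"),
    (1, 4): parse("1.4.0-rc.3"),
}


def is_patched(version: str) -> bool:
    """True if an upstream version is at or beyond its branch's fixed release."""
    key = parse(version)
    branch = key[:2]
    if branch in FIRST_FIXED:
        return key >= FIRST_FIXED[branch]
    # Assumption: older branches received no fix; newer branches include it.
    return branch > (1, 4)
```

For example, is_patched("1.3.2") evaluates to False and is_patched("1.3.3") to True.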

Exploitation scenarios and attack chains​

The raw vulnerability is a local, race-based misdirection of proc writes, but its practical value depends on the attacker’s capabilities and environment.

Realistic attack chains​

  1. Malicious image / build-time abuse
    • An attacker publishes a Dockerfile or image that leverages buildx or parallel build runners to create the racing condition, or that creates mount-time symlinks and custom mounts to manipulate proc resolution. When a CI or developer build system runs that image, the race can occur and runc may misdirect proc writes.
  2. Post-compromise lateral use on shared hosts
    • If an attacker already controls an unprivileged process on a shared node (for example via a prior escape or misconfiguration), they can use the race to neutralize LSM protections and then launch further host-impacting exploitation such as invoking sysctl modifications or crafting core_pattern payloads to execute host-level scripts on crash.
  3. Chained exploitation with other runc/kernel weaknesses
    • The runc team and security analysts warned that CVE-2025-52881 can be combined with other runc issues (e.g., CVE-2025-31133 and CVE-2025-52565 from the same disclosure wave) to achieve full host compromise in some circumstances. The bypass of AppArmor or container-selinux labels in particular makes other remote or local weaknesses far easier to exploit. Public reporting has emphasized the ease of chaining these flaws on unpatched hosts.

What attackers cannot do (without additional conditions)​

  • The bug on its own is not a remote unauthenticated network exploit; it requires either local container startup control or the ability to influence mounts and build-time concurrency. That said, attacker-controlled images and distributed CI systems blur the line between “local” and “remote” in modern DevOps environments.

Mitigations and recommended remediation steps​

Immediate actions (apply now)​

  • Upgrade runc to one of the patched releases: 1.2.8, 1.3.3, or 1.4.0-rc.3 depending on your branch and vendor packaging. Apply vendor-supplied updates where possible; if you run runc built from source, rebuild from upstream after applying the official patches.
  • Patch your container hosts and CI/build runners that perform untrusted image builds (for example, public CI runners, shared docker build nodes, or developer build machines). These hosts are high-priority because the exploit leverages build-time parallelism and shared mounts.

Short- to medium-term mitigations (if you can’t patch immediately)​

  • Use rootless containers or user namespaces to reduce the impact of misdirected privileged writes. Running runc in an unprivileged user context prevents it from being able to write to many sensitive proc/sys targets. The runc advisory explicitly lists rootless containers as an effective mitigation for the privilege-escalation effects of redirected writes.
  • Harden CI/builder host policies: restrict which images can be used in shared build systems, apply strict image-signing and provenance policies, and avoid accepting builds from untrusted sources. Where feasible, temporarily limit build concurrency on critical hosts to narrow the race window.
  • Restrict access and mount privileges: limit which users and workflows can create mounts or bind-mounts into shared namespaces. Avoid exposing host-proc pseudo-filesystems into build contexts or using writable tmpfs mounts for untrusted workloads.
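Whether a process is actually running with the rootless property can be checked from its UID map. The sketch below parses the three-column format of /proc/<pid>/uid_map (ID inside the namespace, ID outside, range length) and reports whether root inside the namespace corresponds to UID 0 outside it. The function names parse_id_map and root_maps_to_host_root are hypothetical, and the "outside" column is relative to the namespace of whoever reads the file, so nested namespaces need more care than this sketch takes.

```python
def parse_id_map(text: str) -> list:
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(field) for field in fields))
    return entries


def root_maps_to_host_root(text: str) -> bool:
    """True if ID 0 inside the namespace is also ID 0 outside it."""
    for inside, outside, length in parse_id_map(text):
        if inside <= 0 < inside + length:
            # Translate ID 0 through this mapping range.
            return outside + (0 - inside) == 0
    return False  # ID 0 is not mapped at all
```

A possible use is root_maps_to_host_root(open("/proc/self/uid_map").read()) from inside a container: a result of False suggests that container root is an unprivileged ID on the host, which is the situation in which redirected writes to sensitive proc/sys targets are expected to fail.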

Operational controls and detection​

  • Inventory all hosts running runc and cross-check versions. Distribution package trackers (e.g., Debian’s security tracker) can help map vulnerable package builds to fixed revisions.
  • Hunt for suspicious build-time activity in CI logs: unexpected concurrency, errors during mount setup that suggest a race, unusual docker buildx invocations, or processes that create symlinks in tmpfs during builds.
  • Monitor /proc/sys writes and odd sysctl changes on critical build nodes for signs of deliberate misdirected writes or unexpected sysctl values that could presage a follow-up attack (for example, core_pattern changes or sysrq-trigger writes).
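Monitoring for unexpected sysctl values can start as a simple baseline comparison. The sketch below reads a list of sysctls and reports any whose value differs from a recorded baseline. The function names read_sysctls and diff_sysctls are hypothetical, the root parameter exists so the logic can be exercised against a scratch directory, and mapping dots in a sysctl name to path separators is a simplification that does not hold for every name.

```python
import os


def read_sysctls(names, root="/proc/sys"):
    """Read sysctl values by dotted name; unreadable entries become None."""
    values = {}
    for name in names:
        # Simplification: 'kernel.core_pattern' -> <root>/kernel/core_pattern
        path = os.path.join(root, *name.split("."))
        try:
            with open(path, "r", encoding="utf-8", errors="replace") as fh:
                values[name] = fh.read().strip()
        except OSError:
            values[name] = None
    return values


def diff_sysctls(baseline, current):
    """Return {name: (old, new)} for every value that changed."""
    return {
        name: (baseline.get(name), value)
        for name, value in current.items()
        if baseline.get(name) != value
    }
```

A baseline taken from a known-good host with read_sysctls(["kernel.core_pattern"]) can be compared against periodic readings, and any non-empty diff on a build node is worth correlating with recent container starts.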

Detection, telemetry and incident response guidance​

  • Add detection rules for the creation of unusual symlinks inside tmpfs mounts used during builds, or for processes writing to procfs paths that are atypical for those services (for example, builder processes writing to /proc/sys). These are practical indicators because the exploit relies on redirecting writes into attacker-controlled targets.
  • If you detect changed sysctl values or modified core_pattern on a host, treat it as a high-priority incident and isolate the host. Because redirected writes can create immediate host instability (including kernel panic triggers via /proc/sysrq-trigger), err on the side of caution when investigating any unusual proc modifications.
  • Review recent CI job histories for use of docker buildx parallel builds, suspicious PRs with Dockerfiles from untrusted contributors, or jobs that created and used custom mounts in build steps. These build-time patterns were explicitly used by researchers to validate the exploit.
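A detection rule for suspicious symlinks can be prototyped as a directory scan. The sketch below walks a directory tree without following links and reports symlinks whose target, resolved lexically, falls under /proc. The function name find_proc_symlinks and the prefixes parameter are hypothetical, and the scan is a heuristic: it does not detect bind mounts or links that are created and removed between scans.

```python
import os


def find_proc_symlinks(root, prefixes=("/proc",)):
    """Return sorted (link_path, raw_target) pairs for links into procfs."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Symlinks to directories are listed in dirnames, others in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            # Resolve relative targets lexically, without touching the target.
            resolved = os.path.normpath(os.path.join(dirpath, target))
            if any(resolved == p or resolved.startswith(p + "/")
                   for p in prefixes):
                hits.append((path, target))
    return sorted(hits)
```

Running this against the tmpfs or scratch directories that build jobs can write to gives a list of links to review; an empty result does not rule out the attack.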

Risk analysis: strengths and limitations of the disclosure and fixes​

Notable strengths​

  • The runc maintainers published a substantive, multi-commit patchset that goes beyond a superficial check; fixes include fd-based mount handling, secure joining of paths, safe procfs helper APIs and a collection of other hardenings to reduce the attack surface for path resolution and proc writes. That patchset is already present in the patched releases.
  • Multiple independent trackers and vendors (NVD, Debian, Chainguard, CVE aggregators and security vendors) corroborate the vulnerability description, affected versions and fixed versions, which simplifies enterprise patch prioritization.

Potential risks and residual uncertainties​

  • The advisory acknowledges that runc’s maintainers have not completed a comprehensive redesign to eliminate every possible timing or mount-resolution hazard and that the published fixes are part of a broader patchset; there remains a possibility that other runtimes (crun, youki) may have related issues. Operators should therefore treat the fixes as necessary but not necessarily exhaustive for the general class of procfs racing attacks. The runc advisory itself cautions about residual avenues and the need for continued hardening.
  • Actual exploitability is environment-dependent. PoCs and exploit templates exist (the runc advisory references Dockerfile-based reproductions using buildx), but at the time of the initial advisory there was no widely reported or confirmed exploitation campaign attributed to CVE-2025-52881. Detection and monitoring remain essential because the availability of high-quality PoCs compresses the time to weaponization.
  • Claims about SELinux behavior and exact cross-runtime effects are nuanced. The runc advisory notes that SELinux’s protections may behave differently and that runc had not fully validated all SELinux configurations against the new attack variants. Treat any statement about SELinux effectiveness as conditional until you have verified it against your specific policy and system configuration in a testbed.

Practical remediation checklist (prioritized)​

1. Immediately verify runc versions across your fleet (including CI/build hosts and Kubernetes nodes). Inventory and map to package versions.
2. Apply vendor-supplied updates or upgrade runc to the patched versions (1.2.8, 1.3.3, or 1.4.0-rc.3) and restart services/containers as required.
3. For environments that cannot patch immediately: transition critical workloads to rootless containers or isolate them on hosts that will be patched first.
4. Harden CI/build pipelines: block untrusted images, reduce build-node concurrency for untrusted builds, and restrict use of docker buildx where untrusted inputs exist.
5. Implement runtime monitoring rules for unusual proc/sys writes, changed core_pattern, or sysrq-trigger writes and correlate these with build or container startup events.
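The inventory step of the checklist can be supported by a short script that normalizes what each host reports. The sketch below extracts the version from the first line of `runc --version` output, which begins with "runc version", and groups hosts by the version found. How the output is collected from each host is left out, and the names parse_runc_version and group_hosts_by_version are hypothetical.

```python
import re


def parse_runc_version(output: str):
    """Extract the version token from `runc --version` output, or None."""
    match = re.search(r"^runc version (\S+)", output, re.MULTILINE)
    return match.group(1) if match else None


def group_hosts_by_version(inventory: dict) -> dict:
    """Map each reported version to a sorted list of hosts.

    `inventory` maps a host name to the raw text that host returned.
    Hosts whose output could not be parsed are grouped under 'unknown'.
    """
    groups = {}
    for host in sorted(inventory):
        version = parse_runc_version(inventory[host]) or "unknown"
        groups.setdefault(version, []).append(host)
    return groups
```

The resulting groups can then be compared with the fixed releases listed above, or with the distribution's package tracker where runc is installed from a package.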

Final assessment and recommendations​

CVE-2025-52881 is a high-impact local vulnerability in a critical container runtime that exposes a practical route to neutralize LSM confinement and to weaponize proc/sys writes against the host. The runc maintainers’ patchset addresses the immediate failure modes and provides safer APIs and inode/path verification, and multiple independent trackers (NVD, Debian, Chainguard and security vendors) confirm the affected and patched versions. Administrators should treat this as a high-priority remediation for hosts that run untrusted workloads, CI builders or shared infrastructure where untrusted images or parallel builds occur.
Short checklist:
  • Patch runc to the fixed releases immediately where feasible.
  • Prioritize CI/build hosts and multi-tenant nodes because those environments provide the most practical path to exploitation.
  • Use rootless containers as a mitigation when patching will be delayed.
  • Monitor for unusual /proc modifications and sysctl changes, and hunt CI logs for race-prone build steps.
Note on vendor references: the Microsoft Security Response Center (MSRC) update guide page for this CVE is rendered behind JavaScript and therefore may not be machine-readable by some automated scrapers; check the MSRC UI directly if you need Microsoft-specific mapping or KB guidance for related vendor products.
CVE-2025-52881 is a reminder that path handling, procfs interactions and mount-time race windows are a dangerous intersection of complexity and privilege — hardening container runtimes, securing CI/builder infrastructure and applying timely patches remain the most effective defenses against this class of host-impacting vulnerabilities.

Source: MSRC Security Update Guide - Microsoft Security Response Center
 
