EROFS CVE-2026-23224 Patch Fixes Race Condition in File-Backed DirectIO

EROFS in the Linux kernel has been patched for a race-condition use-after-free, tracked as CVE-2026-23224, that can trigger kernel panics when a file-backed mount is used together with the directio option. The fix replaces an unconditional free with simple reference counting, so the request structure that embeds the kiocb is released exactly once, by whichever path finishes with it last.

Background / Overview

EROFS (Enhanced Read-Only File System) is a space‑efficient read‑only filesystem included in upstream Linux that is increasingly used in embedded, container, and distribution images where immutability and compact on‑disk representation matter. A relatively recent enhancement added support for file‑backed mounts: instead of mounting erofs directly on a block device, the filesystem can use a regular file as its backing store. That feature allows disk images to be stored as files on another filesystem and mounted without requiring raw block devices.
To improve throughput in some scenarios, erofs also gained a directio mode for file-backed mounts. Direct I/O reads from the backing file without going through that file's page cache. However, mixing asynchronous direct I/O with filesystem code that also interacts with the kernel page cache introduces complexity and the potential for races, and that is exactly what CVE-2026-23224 exposed.
In short: when erofs is mounted from a file and directio is enabled, a race between the DIO submission path and the DIO completion callback could free an internal request object twice, or free it while the other path still held a pointer to it. A later dereference of that dangling pointer panics the kernel.

How the bug manifests: a readable call trace

The panic traces reproduced in public advisories include a distinctive stack that points to an asynchronous DIO/BIO path. The simplified structure, listed innermost frame first as in a kernel stack trace (so each arrow points from callee to caller), looks like this:
  • ext4_file_read_iter -> vfs_iocb_iter_read -> erofs_fileio_rq_submit
  • erofs_fileio_submit_bio -> z_erofs_runqueue -> z_erofs_read_folio
  • filemap_read_folio -> filemap_fault -> handle_mm_fault -> do_page_fault
On the completion side:
  • dio_aio_complete_work -> dio_complete -> erofs_fileio_ki_complete -> kfree(rq)
  • Later, file_accessed attempts to reference the freed rq->iocb.ki_filp, causing an access to a freed file pointer and leading to the panic.
That trace illustrates two interacting paths: the read submission path that may synchronously complete (or schedule completion) and the asynchronous completion callback. If both paths free the same request object (struct erofs_fileio_rq in the erofs implementation), the second free turns into a use‑after‑free (UAF) when some other code dereferences the freed memory.

Root cause: a classic race between submit and completion

Technically, the bug is created by insufficient lifetime management of a small request object ("rq") used to issue direct I/O against the file that backs the erofs instance.
  • erofs allocates an erofs_fileio_rq struct and prepares a kiocb and bio for a direct I/O read.
  • The code calls into the VFS asynchronous I/O submission (vfs_iocb_iter_read or iomap_dio_rw). If the I/O is queued, that call returns -EIOCBQUEUED.
  • When the kernel later completes the BIO, the DIO completion machinery invokes the kiocb completion handler erofs_fileio_ki_complete(), which performs cleanup and frees the rq.
  • Under certain timing, the submit path, which checks the submission return value, may call the same completion helper synchronously (when the submission returns anything other than -EIOCBQUEUED) and then unconditionally free rq again; or the submit path and the completion handler may each free rq independently. Either way, code elsewhere (for instance file_accessed, or other VFS code that reads iocb.ki_filp) ends up touching memory that has already been freed, which is a UAF and leads to a kernel panic.
In other words, two concurrent control flows, the submitter and the I/O completion work queue, were racing to free the same object. The fix converts that naive free into a reference-counted free so that only the last holder releases the object.

The patch and why it works

The upstream fix (landed across stable branches) is small, targeted, and conceptually simple:
  • Add a reference counter (refcount_t) to struct erofs_fileio_rq.
  • Initialize the counter to 2 when the request object is allocated.
  • Decrement the counter in both the submission path and in the kiocb completion path.
  • Only call kfree(rq) when the refcount reaches zero.
That approach converts two potential free sites into two holders of a single resource; whichever path finishes last actually reclaims the memory. This is the canonical, low-overhead remedy for a double-free/UAF caused by racing lifetime owners of the same structure. It preserves the existing structure and behavior of the erofs DIO implementation while eliminating the UAF without rearchitecting the I/O model.
The patch changes a few lines in fs/erofs/fileio.c: it adds a refcount field, sets it with refcount_set(&rq->ref, 2) on allocation, and replaces the unconditional kfree(rq) with a refcount_dec_and_test() check followed by kfree(rq). The change is minimal, easy to review, and carries little risk of a performance regression, because it adds only one atomic initialization and two atomic decrements per request.
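The two-holder discipline described above can be modelled in userspace. The following Python sketch is not the kernel patch: the class FileioRequest, its put() method, the free_count field and run_once() are hypothetical names chosen for illustration, and a lock stands in for the kernel's atomic refcount_t. It only shows that, when the count starts at 2 and each path drops one reference, the object is released exactly once regardless of which path finishes last.

```python
import threading


class FileioRequest:
    """Userspace stand-in for the request object; illustration only, not kernel code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ref = 2        # one reference for the submit path, one for completion
        self.free_count = 0  # counts how many times the object was "freed"

    def put(self):
        """Drop one reference; release the object only when the last one is dropped."""
        with self._lock:
            if self._ref <= 0:
                raise RuntimeError("reference dropped after the object was freed")
            self._ref -= 1
            last = self._ref == 0
        if last:
            self.free_count += 1  # stands in for kfree(rq)
        return last


def run_once():
    """Let a 'submit' thread and a 'completion' thread race to drop their references."""
    rq = FileioRequest()
    holders = [threading.Thread(target=rq.put) for _ in range(2)]
    for t in holders:
        t.start()
    for t in holders:
        t.join()
    return rq.free_count
```

Calling run_once() repeatedly should always report a single free, whichever thread happens to run last; a third put() on the same object raises instead of freeing again.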

Who is affected and how serious is this?

  • Affected component: Linux kernel EROFS code when an image is mounted from a regular file (file-backed mount) with the directio option enabled.
  • Exploit vector: local — the issue requires mounting an erofs image from a regular file with directio enabled, or otherwise exercising the file‑backed DIO read path. It is not a remotely exploitable network service vulnerability by default.
  • Impact: kernel panic (DoS), and in theory memory corruption that could be leveraged for more serious outcomes under complex, targeted exploitation scenarios. However, exploitation beyond crash would be nontrivial and requires local access and careful control of timing.
  • Severity: most public trackers mark this as medium priority for typical deployments. The risk is meaningful in contexts where erofs file‑backed mounts with directio are enabled on systems with untrusted local users or where crash resilience is critical.
Important operational note: not all erofs users are exposed. The most common usage of erofs is mounting images from block devices, and many deployments do not use file-backed mounts with directio. The vulnerable code path requires a more advanced combination (file-backed mount plus directio), so the practical attack surface is limited.

Detection: what to look for in logs and crash dumps

If you suspect a system hit this bug, look for kernel oops/panic messages that reference the following symbols or call sites (the exact offsets change by build, but the names are telling):
  • z_erofs_read_folio
  • z_erofs_runqueue
  • erofs_fileio_submit_bio
  • erofs_fileio_rq_submit
  • erofs_fileio_ki_complete
  • ext4_file_read_iter and vfs_iocb_iter_read (these appear in many traces because they reflect the caller stack)
The panic lines often include an ASCII stack trace in dmesg or the system journal. If you see a kernel panic where file_accessed or a NULL file pointer access is implicated, and the trace includes the erofs DIO path above, you have likely hit this condition.
For proactive detection, you can:
  • Audit mounts: flag any erofs mounts that use a regular backing file and the directio mount option.
  • Monitor /var/log/kern.log and journal for trace fragments listed above.
  • Use kernel crash tooling (kdump and the crash utility) to preserve a vmcore for offline analysis; the vmcore will show whether a freed pointer was dereferenced and supply the full stack.
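As a rough aid for the log monitoring step, the symbol names listed above can be searched for in captured kernel log text. The sketch below is a heuristic, not a definitive detector: the function names are hypothetical, the symbol list is simply the one given in this article, and a match only suggests that a trace passed through the erofs file-backed I/O path.

```python
import re

# Symbol names taken from the list in this article, in the same order.
EROFS_DIO_SYMBOLS = (
    "z_erofs_read_folio",
    "z_erofs_runqueue",
    "erofs_fileio_submit_bio",
    "erofs_fileio_rq_submit",
    "erofs_fileio_ki_complete",
)


def erofs_dio_symbols_in(log_text):
    """Return the listed symbols that occur in log_text, in list order."""
    return [
        sym
        for sym in EROFS_DIO_SYMBOLS
        if re.search(r"\b" + re.escape(sym) + r"\b", log_text)
    ]


def looks_like_erofs_fileio_trace(log_text):
    """Heuristic: at least one erofs_fileio_* frame appears in the text."""
    return any(sym.startswith("erofs_fileio_") for sym in erofs_dio_symbols_in(log_text))
```

The input can be any saved kernel log text, for example the contents of /var/log/kern.log or captured dmesg output; a positive result should still be confirmed against the full trace.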

Recommended mitigations and remediation

  • Patch promptly: apply the kernel fixes provided by your distribution vendor or pull the upstream stable patch into your kernel. Vendors have already shipped updates to stable kernels; for production systems follow your distro’s security advisories and install the signed kernel package updates as soon as possible.
  • Workaround until patched:
      • Avoid mounting erofs images from regular files with the directio option. Use buffered I/O (no directio) for file-backed erofs mounts, or mount erofs from a block device instead of a file if possible.
      • If you can’t avoid file-backed mounts, do not enable directio on those mounts until your kernel is updated.
  • Reduce local attack surface:
      • Restrict local user privileges. The exploit requires local access to set up and trigger the problematic mount and I/O pattern. Enforce least privilege and remove unnecessary ability to mount filesystems from untrusted users.
      • Use secure mount namespaces and container isolation; do not run untrusted workloads with privileges to manipulate mounts or mount options.
  • Test after patching:
      • After applying vendor kernel updates, restart into the patched kernel and validate mounts that previously used file-backed directio.
      • Run sanity I/O checks on erofs images and exercise workloads to ensure there are no regressions.

Timeline and upstream response

The problem was reported in relation to the file‑backed mount and directio support that was added to erofs not long before. The upstream patch was small and accepted into the stable trees; distribution vendors have since added the commit to their kernel update streams. The fix itself is conservative and limited to the erofs fileio code.
Because the patch is small, reviewable and follows a standard pattern (refcounting an object used by two concurrent execution contexts), it was backported quickly to stable kernels. Distributions that track stable Linux kernels should receive updates as normal kernel security errata.

Why this patch is the right tradeoff

  • Minimal change: adding a refcount is lower‑risk than changing the I/O completion model.
  • Familiar pattern: reference counting is a well understood concurrency technique in kernel code and is used elsewhere for similar owner/holder races.
  • Low runtime cost: the refcount adds one atomic initialization and two atomic decrements per request, which is negligible next to the cost of the I/O itself.
  • Preserves behavior: the patch does not change observable semantics of I/O completion to callers; it simply ensures memory isn't prematurely freed.
Given that the vulnerability was a timing/race condition limited to a narrow configuration, the patch’s scope matches the problem and avoids larger regressions.

Long-term implications and lessons learned

  • Complex I/O models are fragile. Mixing buffered page‑cache interactions with direct I/O paths and asynchronous completion callbacks increases concurrency complexity. Filesystems that add file‑backed modes must carefully consider lifecycle responsibilities for any per‑request objects.
  • Testing hard concurrency cases matters. Kernel fuzzers and syzbot have repeatedly unearthed corner‑case races in filesystem code; this is another example where automated fuzzing and long‑running stress tests (with DIO + mixed callers) can identify risky interactions.
  • Keep mount option combinations under configuration review. Administrators should be cautious about enabling less‑common mount options (like directio on file‑backed mounts) on production systems until those paths have had substantial bake time in stable trees.

Practical checklist for administrators

  • Inventory: find hosts with erofs mounts that use a regular file as their backing store.
      • Check /proc/mounts or the output of the mount command to find entries with filesystem type erofs and note mount options.
      • If erofs is file-backed and uses directio, schedule kernel updates or temporarily remount without directio.
  • Apply updates: install vendor kernel updates for the stable branches that include the erofs fix and reboot in maintenance windows.
  • Monitor: after patching, watch kernel logs for related oops traces and run the I/O workloads that exposed the problem to validate the fix.
  • Harden: reduce the set of users that can create mounts or load kernel modules; use mount namespaces and least-privilege container runtime settings.
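The inventory step can be partly automated by parsing the mount table. The sketch below makes several assumptions that should be checked on your systems: that the directio option appears verbatim in the options column of /proc/mounts, that the first column of a file-backed erofs mount is the path to the image file, and that paths contain no escaped whitespace. The function name and the is_regular_file parameter are hypothetical.

```python
import os


def find_erofs_directio_mounts(mounts_text, is_regular_file=os.path.isfile):
    """Return (source, mountpoint, options) for file-backed erofs mounts using directio.

    mounts_text is text in /proc/mounts format: source, mountpoint,
    filesystem type and options as the first four whitespace-separated fields.
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        source, mountpoint, fstype, options = fields[:4]
        if fstype != "erofs":
            continue
        if "directio" not in options.split(","):
            continue
        # A block device source is not a file-backed mount.
        if is_regular_file(source):
            flagged.append((source, mountpoint, options))
    return flagged
```

On a live host the input would be the contents of /proc/mounts, read with open("/proc/mounts").read(); the is_regular_file parameter exists so the check can be exercised against saved mount tables from other machines.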

Risk assessment for different environments

  • Desktop/laptop: Low to medium. Most desktop users do not use file‑backed erofs mounts with directio. If a distribution ships erofs images as regular files and sets directio by default, update quickly; otherwise risk is low.
  • Embedded appliances and OEM images: Moderate. Devices that mount images stored in a file on top of another filesystem may use directio for performance. Vendors should push kernel updates and test image mounts.
  • Cloud/container hosts: Medium. Container images and overlay approaches may interact with erofs images in unexpected ways. Any environment where untrusted tenants can trigger mount operations requires prompt patching.
  • Shared servers with many local users: Higher priority. On multi‑user systems where local users might mount files, the risk of local DoS is meaningful and should be mitigated sooner.

Final verdict

CVE‑2026‑23224 is a narrowly targeted but real use‑after‑free in the EROFS file‑backed directio path. It is not a widespread remote code execution vector — exploitation requires local access and a specific mount configuration — but it can cause kernel panics and therefore is a meaningful denial‑of‑service risk in the environments where the configuration is in use.
The upstream fix is small, principled, and low‑risk: it introduces a short‑lived reference count that guarantees the request object is freed exactly once, regardless of the timing of submit versus completion. Administrators running affected kernels should prioritize vendor kernel updates; in the meantime, avoiding the directio mount option for file‑backed erofs is a practical workaround.
Apply vendor kernel updates as soon as practical, verify erofs mounts after patching, and incorporate this incident into your configuration review process for uncommon mount option combinations. The lesson is familiar for kernel developers and sysadmins alike: subtle concurrency bugs hide in interactions between different I/O and completion paths, and small, well‑targeted fixes plus conservative configuration practices are the best immediate defenses.

Source: MSRC Security Update Guide - Microsoft Security Response Center
 
