The Linux kernel project has assigned CVE‑2025‑68357 to a recently discovered race/initialization bug in the iomap layer that left asynchronous read error completions without a backing workqueue. After a behavioral change deferred error completions to a dedicated workqueue (s_dio_done_wq), that workqueue was not always allocated for async read paths. Deferred read error completions could therefore be queued without a valid workqueue, producing oopses, incorrect error attribution, or high‑impact availability problems. The defect is recorded in the public CVE/NVD and OSV feeds and fixed in the upstream stable trees; operators should map the upstream commits to their vendor kernel packages and apply vendor-supplied kernel updates or backports as soon as they can.
Source: MSRC Security Update Guide - Microsoft Security Response Center
Background / Overview
The Linux iomap framework centralizes file I/O mapping and writeback logic for modern filesystems and is widely used by XFS and other filesystems to improve performance and reduce duplicated logic. As part of ongoing robustness work, an upstream change (commit 222f2c7c6d14, summarized in public trackers) moved error completion handling into a user‑context workqueue so that error processing and completion callbacks run safely outside of interrupt or other constrained kernel contexts. That change defers error completions to s_dio_done_wq, a per-superblock workqueue used by direct I/O paths.
However, the commit that introduced this deferral did not ensure that the workqueue was allocated when asynchronous read paths used the same deferred completion mechanism. The missing allocation produced a race or NULL workqueue usage when asynchronous reads hit error-completion code. The CVE record succinctly describes this: “Since commit 222f2c7c6d14 ('iomap: always run error completions in user context'), read error completions are deferred to s_dio_done_wq. This means the workqueue also needs to be allocated for async reads.”
This is primarily an availability and stability flaw: the codepath can trigger kernel warnings, oopses, or incorrect propagation of I/O errors (for example attributing an I/O error to the wrong file), with the greatest operational impact on systems doing heavy parallel or async I/O (multi‑tenant hosts, virtualized platforms, storage servers using XFS, or systems with complex block-layer stacks). Public trackers mark the attack vector as local (a process or tenant that can initiate async reads), with a likely impact class focused on availability rather than confidentiality or integrity.
What went wrong: technical anatomy
The change that introduced the mismatch
- A kernel commit moved error completion handling to run in user context via the s_dio_done_wq workqueue so that callbacks execute outside of constrained kernel contexts.
- The new deferral model assumes that callers that invoke error completion paths have a valid workqueue allocated and ready to dispatch workitems.
- For synchronous I/O and most write/completion paths the workqueue was allocated or always available; however, an async read path that relied on the same deferred completion flow could be executed in a code path where s_dio_done_wq was not guaranteed to have been allocated. The result: a NULL or invalid workqueue reference at dispatch time.
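The mismatch described in the list above can be illustrated with a toy user-space model. This is only a sketch, not kernel code: the `SuperBlock` class, the `fixed` flag, and the Python list standing in for a workqueue are all illustrative assumptions, and the rule "only async writes allocate the queue before the fix" is a simplification of the behaviour the CVE record describes.

```python
class SuperBlock:
    """Toy stand-in for a mounted filesystem's superblock (hypothetical)."""

    def __init__(self):
        # Allocated lazily, mirroring the idea that s_dio_done_wq
        # does not exist until some submission path asks for it.
        self.s_dio_done_wq = None


def submit_dio(sb, *, is_write, is_async, fixed):
    """Model submission: decide whether to allocate the completion queue.

    Assumption: before the fix only async writes allocate the queue;
    with the fix every async request does, reads included.
    """
    needs_wq = is_async and (is_write or fixed)
    if needs_wq and sb.s_dio_done_wq is None:
        sb.s_dio_done_wq = []  # stand-in for allocating the workqueue


def complete_with_error(sb, error):
    """Model the deferral: error completions always go to the queue."""
    if sb.s_dio_done_wq is None:
        # In the kernel this would be work queued on a NULL workqueue.
        raise RuntimeError("deferred completion queued on an unallocated workqueue")
    sb.s_dio_done_wq.append(error)


def run(fixed):
    """Submit one async read that fails, and report what happens."""
    sb = SuperBlock()
    submit_dio(sb, is_write=False, is_async=True, fixed=fixed)
    try:
        complete_with_error(sb, "EIO")
        return "deferred"
    except RuntimeError:
        return "crash"


print(run(fixed=False), run(fixed=True))  # prints "crash deferred"
```

The point of the model is the ordering: the deferral is unconditional, so the allocation has to be unconditional for every path that can reach it.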
Symptoms at runtime
- Kernel oops with a stack trace pointing into iomap/dio completion or workqueue dispatch code.
- Sporadic or repeatable "Buffer I/O error" messages in dmesg or syslog for the affected block device.
- Erroneous fsync or write completion errors that do not map to the file actually being written (mis‑attribution).
- On stressed systems (heavy async read workloads, high concurrency), repeated oopses can lead to host instability, filesystem errors (XFS metadata messages), and in-place recovery operations (unmount + fsck) being required.
Why the bug matters
- The bug is a classic initialization/order-of-operations defect: code assumes a required global or subsystem resource (the workqueue) is present, but in some async read paths the resource was not allocated.
- Unlike pure memory corruption bugs, this one can cause both immediate crashes and silent mis‑labeling of I/O errors — both are operationally serious for storage hosts and multi‑tenant environments.
- The vulnerability is local in exploitability (an attacker needs local file I/O capability) but the operational impact on availability makes it high priority for production infrastructure.
Affected components and versions
- Affected component: the Linux kernel iomap/DIO read completion paths that were modified by the commit referenced above.
- Upstream metadata and vulnerability feeds list the fix as present in stable kernel branches with specific commit IDs documented in the public CVE/NVD/OSV entries; those references are the starting point for mapping to vendor kernel packages.
- Vendors and distributions will backport the upstream stable commits into their kernel packages; do not assume a kernel version number alone proves a fix — inspect vendor changelogs or the package's included commit list to confirm presence of the fix.
Detection and triage for administrators
Use the following checklist to quickly triage potentially affected systems and detect active problems.
1. Check kernel logs for telltale signatures
- Search dmesg and journal logs for:
- kernel oops traces containing iomap, dio, s_dio_done_wq, or workqueue symbols.
- "Buffer I/O error" messages tied to device nodes (for example dm-*, sda).
- XFS metadata I/O error lines or filesystem hints to unmount/repair after crashes.
- Example commands:
- journalctl -k | grep -E 'iomap|s_dio_done_wq|dio|Buffer I/O error'
- dmesg | grep -i 'Buffer I/O error'
2. Determine whether your kernel includes the upstream fix
- Check your distribution package changelog for a backport that references CVE‑2025‑68357 or the upstream commit IDs referenced in public advisories.
- If you manage custom kernels, inspect the kernel source for the commit (search for changes that allocate s_dio_done_wq for async read paths or examine the iomap error-completion logic).
- Where possible, match your running kernel build (uname -r and packaged changelog) to vendor advisory mapping rather than relying on kernel version numbers alone.
3. If you see crashes but no package mapping yet
- Temporarily reduce asynchronous read concurrency or throttle I/O to reduce reproduction surface (not a fix, only a mitigation).
- Isolate workloads: move untrusted or heavy write/read tenants off shared hosts until a patched kernel is applied.
- Reproduce in test/staging with controlled workloads (see remediation/testing guidance below).
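For fleets where ad hoc grep is impractical, the log check in step 1 could be scripted along these lines. This is a minimal sketch: the signature names, the regular expressions, and the sample lines are assumptions chosen for illustration (the sample lines are synthetic, not captured from an affected system), and a match is only a hint to investigate, not proof of this CVE.

```python
import re

# Signature groups loosely mirroring the grep patterns in the checklist.
# The "dio" token is matched only when not embedded in a longer word,
# so unrelated strings such as "audio" or "radio" are not flagged.
SIGNATURES = {
    "iomap_dio": re.compile(
        r"iomap|s_dio_done_wq|(?<![a-z])dio(?![a-z])", re.IGNORECASE
    ),
    "buffer_io_error": re.compile(r"Buffer I/O error", re.IGNORECASE),
}


def classify(lines):
    """Group kernel log lines by the signature(s) they match."""
    hits = {name: [] for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits


# Synthetic sample input, for illustration only.
sample = [
    "Buffer I/O error on dev dm-0, logical block 0, async page read",
    "audio: device ready",
    "Workqueue: dio/sda1 iomap_dio_complete_work",
]

for name, matched in classify(sample).items():
    print(name, len(matched))
```

In practice the input would be the lines produced by `journalctl -k` or `dmesg`, for example read from a saved log file.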
Remediation and patching guidance
The only reliable remediation is to install a kernel that contains the upstream fix or a vendor-supplied backport that explicitly addresses CVE‑2025‑68357.
- Priority: patch multi-tenant hosts, storage servers, and VMs first; then developer workstations and less-exposed desktop machines.
- For distribution-managed systems:
- Watch the vendor security tracker or package changelog for kernel updates that list CVE‑2025‑68357 or reference the iomap s_dio_done_wq allocation fix.
- Staged rollout: deploy to a canary/pilot group, run writeback-heavy stress tests, and then roll to production after validation.
- For custom kernels:
1. Obtain the upstream stable commit(s) referenced by NVD/OSV (the public feeds list the commit references).
2. Backport the change to your stable branch following your standard kernel maintenance process.
3. Rebuild kernels with your production configuration.
4. Validate under representative stress workloads before mass rollout.
- Run filesystem writeback and async read stress tests (fio with direct I/O flags, real workloads that mimic your production I/O patterns).
- Run XFS-specific test suites or xfstests where XFS is used.
- Monitor for absence of the previously observed dmesg signatures and for stable sustained throughput.
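As one way to drive the async read stress test mentioned above, a small helper can assemble an fio command line for asynchronous direct reads. This is a sketch under assumptions: the function name, defaults, and target path are hypothetical, and the job only generates load. It does not inject read failures, so on its own it will not reach the error-completion path; that requires a fault-injection setup that is outside this example.

```python
import shlex


def fio_async_read_argv(target, *, iodepth=32, numjobs=4, runtime_s=60):
    """Build an fio argv for asynchronous O_DIRECT random reads.

    `target` is a file or block device in a staging environment.
    """
    if min(iodepth, numjobs, runtime_s) < 1:
        raise ValueError("iodepth, numjobs and runtime_s must be >= 1")
    return [
        "fio",
        "--name=async-dio-read",
        f"--filename={target}",
        "--rw=randread",        # read-only access pattern
        "--direct=1",           # bypass the page cache (direct I/O)
        "--ioengine=libaio",    # asynchronous submission
        f"--iodepth={iodepth}",
        "--bs=4k",
        f"--numjobs={numjobs}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
        "--readonly",           # fio safety check against accidental writes
    ]


# Hypothetical staging path; print the command instead of running it.
print(shlex.join(fio_async_read_argv("/mnt/staging/testfile")))
```

Printing the command rather than executing it keeps the helper safe to run anywhere; the resulting line can be reviewed and then run by hand on a staging host.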
Practical mitigations if you cannot patch immediately
- Reduce I/O parallelism and the number of concurrent async read operations on sensitive hosts by tuning application-level parameters or VM I/O limits.
- Remove or restrict access to shared storage devices from untrusted tenants.
- Consider taking high‑risk hosts out of multi‑tenant rotation until they can be patched.
- Configure enhanced logging and crash capture (enable kdump/vmcore collection) so that any kernel oops can be retained for postmortem analysis.
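Whether crash capture is actually armed on a host can be checked before relying on it. The sketch below reads the sysfs flag `/sys/kernel/kexec_crash_loaded`, which reports whether a crash kernel is currently loaded; the function name and the three-way return value are illustrative choices, and a loaded crash kernel does not by itself guarantee that the dump target is correctly configured.

```python
from pathlib import Path


def crash_kernel_loaded(flag_path="/sys/kernel/kexec_crash_loaded"):
    """Return True/False for the crash-kernel flag, or None if unreadable.

    None covers hosts where the file is missing (for example kernels
    built without kexec support) or not readable by the current user.
    """
    try:
        return Path(flag_path).read_text().strip() == "1"
    except OSError:
        return None


state = crash_kernel_loaded()
if state is None:
    print("crash kernel state unknown on this host")
elif state:
    print("crash kernel loaded: an oops-triggered panic can be captured")
else:
    print("no crash kernel loaded: vmcore capture will not happen")
```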
For packagers and kernel maintainers — backporting notes
- The fix for this CVE is small in concept (ensuring s_dio_done_wq is allocated when async read error completions may defer work there) but must be backported carefully because iomap and DIO code is sensitive to ordering and subtle pointer/lifetime issues.
- Backporters should:
1. Apply the upstream stable commit(s) that document the fix.
2. Run the distribution’s kernel test suites, plus filesystem and I/O stress tests representative of your user base.
3. Validate that the workqueue allocation is present in all paths that can defer completions to s_dio_done_wq, including unusual async read flows that some filesystems or device-mapper combinations may exercise.
- Ensure changelogs and advisory metadata include the CVE identifier and the upstream commit references so downstream admins can correlate packages to fixes.
Threat model and exploitability
- Attack vector: local. A process that can initiate asynchronous reads (or otherwise trigger the modified error‑completion path) can exercise the vulnerable code.
- Impact: primarily availability (kernel oopses, persistent I/O errors) and operational integrity (mis‑attribution of I/O errors).
- Complexity: low to moderate; triggers are local but reproducible in the right workload scenario (highly concurrent async I/O).
- Confidentiality/Integrity: public trackers indicate no direct data-exfiltration or RCE primitive is documented for this specific fix; the prime risk remains DoS and corruption/misattribution of error state.
Why the patch is the correct fix — and where risks remain
Strengths of the patch
- The upstream remedy restores the invariant that deferred completions will always be executed on a valid workqueue, preventing NULL workqueue dispatch and the resulting oopses or error-misattribution.
- The change is surgical and focused: it fixes the initialization/alloc path for s_dio_done_wq for all relevant code-paths (sync and async), minimizing risk of regressions when applied correctly.
- Because the change targets a resource allocation/initialization mismatch, it addresses the root cause rather than masking symptoms.
Lingering risks and caveats
- Backport quality matters: an incorrect backport may not resolve the race in all codepaths or could introduce new timing windows. Vendors must test under heavy I/O stress.
- Long-tail devices: embedded systems and vendor-kernel forks may lag in applying the fix; operators of such devices must coordinate with vendors or plan replacements when security is critical.
- Detection vs. prevention: relying on log monitoring alone is reactive; the only true prevention is an up-to-date kernel.
How to verify a patch presence (quick checklist)
- Confirm your vendor package changelog explicitly lists CVE‑2025‑68357 or the equivalent upstream commit IDs referenced by public trackers.
- If you have the kernel source package, inspect the iomap/DIO completion code for the allocation and initialization of s_dio_done_wq in async read paths.
- Reproduce a minimal test in a staging environment under representative concurrent async-read load and confirm the previous oops signatures do not appear.
- Capture and retain vmcore/kdump results if an oops occurs for offline analysis.
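The changelog check at the top of this list can be automated once the changelog text is in hand (for example from `rpm -q --changelog <package>` on RPM-based systems). The helper below is a sketch: its name and return shape are assumptions, the commit hash in the demo is a made-up placeholder rather than the real fix commit, and the real IDs should be taken from the NVD/OSV entries. It normalizes non-breaking hyphens because identifiers copied from web pages often contain them.

```python
def changelog_mentions_fix(changelog_text, cve_id, commit_ids=()):
    """Report whether a changelog mentions the CVE or any given commit.

    Commit IDs are compared on their first 12 characters so that
    abbreviated hashes in a changelog still match full-length IDs.
    """

    def norm(s):
        # U+2011 is the non-breaking hyphen often introduced by web copy.
        return s.replace("\u2011", "-").lower()

    text = norm(changelog_text)
    matched = [c for c in commit_ids if len(c) >= 7 and norm(c)[:12] in text]
    return {"cve_listed": norm(cve_id) in text, "commits_listed": matched}


# Synthetic changelog text and a placeholder hash, for illustration only.
demo = "- kernel: iomap fix (CVE-2025-68357), upstream commit 0123abcd4567\n"
print(changelog_mentions_fix(demo, "CVE-2025-68357", ["0123abcd4567"]))
```

A negative result is not conclusive: some vendors fold fixes into rebases without naming the CVE, in which case inspecting the source package remains the fallback.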
Hunting and forensic tips
- Preserve dmesg and kernel logs after any crash; enable kdump so you can analyze vmcores.
- Look for repeated oopses with iomap/dio/workqueue symbols; correlate with I/O workload logs (fio, application traces).
- If you suspect mis‑attributed I/O errors (fsync failures appearing on unrelated files), treat that as a high‑priority incident and isolate the host until a patched kernel can be applied.
Cross-verification and source note
Public vulnerability databases and package trackers have published entries for CVE‑2025‑68357; for example, NVD has an entry summarizing the change and linking to the upstream fix references, and OSV contains a corresponding record with the same summary and commit references. Kernel stable commit references are present in those feeds and in distribution advisories; map those upstream commits to your vendor kernel package to confirm remediation.
Note: some static fetch attempts to kernel.org commit pages may be blocked for automated crawlers or return 403/denied responses; when that occurs, the authoritative verification method is either cloning the stable kernel repository locally or checking vendor-provided changelogs that reference the upstream commits. Administrators should rely on vendor security advisories and the kernel Git history for definitive confirmation.
Recommended action plan (prioritized runbook)
- Inventory
- Identify hosts doing heavy async I/O or running XFS and high-throughput storage stacks.
- Capture uname -r, kernel package names, and vendor changelog state.
- Confirm
- Check vendor advisories for CVE‑2025‑68357 or for the upstream commit IDs.
- If unknown, inspect the kernel package source for the fix.
- Patch
- Deploy vendor-supplied kernels that include the backport; stage in pilot ring first.
- For in-house kernels, backport upstream stable commit(s), rebuild, and test.
- Validate
- Run workload-level I/O stress tests and xfstests where applicable.
- Confirm no reproductions and no recent iomap-related oopses in logs.
- Monitor
- Add dmesg/journal rules to alert on iomap/dio/workqueue oopses and Buffer I/O errors.
- Retain crash dumps for postmortem.
- Compensate (if unable to patch immediately)
- Reduce async read/write concurrency, isolate untrusted tenants, and restrict storage device access.
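The Inventory step of the runbook can start from a per-host record like the one sketched here. The field names and the helper functions are assumptions for illustration; the record covers only the running kernel release and XFS mount points, so kernel package names and changelog state still need to come from the distribution's package manager.

```python
import json
import platform
from pathlib import Path


def xfs_mountpoints(mounts_text):
    """Return mount points whose filesystem type is xfs.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "xfs":
            points.append(fields[1])
    return points


def inventory_record(mounts_path="/proc/mounts"):
    """Collect the running kernel release and any XFS mounts for one host."""
    try:
        mounts_text = Path(mounts_path).read_text()
    except OSError:
        mounts_text = ""  # non-Linux host or unreadable file
    info = platform.uname()
    return {
        "host": info.node,
        "kernel_release": info.release,  # same value as `uname -r`
        "xfs_mountpoints": xfs_mountpoints(mounts_text),
    }


print(json.dumps(inventory_record(), indent=2))
```

Emitting JSON makes it straightforward to aggregate records from many hosts and sort the patch queue by exposure.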
Conclusion
CVE‑2025‑68357 is a focused but operationally impactful Linux kernel defect: a missed workqueue allocation for asynchronous read error completions in the iomap/DIO stack. The upstream fix ensures that read error completions, which an earlier change deferred to s_dio_done_wq, are backed by a properly allocated workqueue in all paths, preventing crashes and misattributed I/O errors. Operators should treat this as a high-priority patch for storage and multi‑tenant infrastructure: confirm the presence of the upstream fix in vendor packages, stage and validate updated kernels under representative I/O loads, and apply updates broadly once verified. For forensic triage and hunting, search kernel logs for iomap/dio/workqueue oopses and Buffer I/O error traces, capture vmcores on crash, and prioritize hosts exposed to untrusted workloads until the patch is deployed.