Linux AFS Spinlock Recursion Bug CVE-2024-53090 and Workqueue Fix

A Linux kernel bug in the AFS subsystem, tracked as CVE-2024-53090, can trigger spinlock recursion and lead to a kernel oops or a full system denial of service. The issue arises because afs_wake_up_async_call takes a reference in a context where the subsequent put can re-enter the same lock path; the maintainers fixed it by deferring the cleanup to a workqueue.

Background

The vulnerability affects the Linux kernel's AFS (Andrew File System) code path where AF_RXRPC code interacts with afs_call structures. In short, afs_wake_up_async_call takes a reference to an afs_call so that it can queue work. If the call is already queued, that extra reference has to be released again, and the release (afs_put_call) can call back into AF_RXRPC code that tries to re-acquire the notify lock the caller already holds. The result is a spinlock recursion and a kernel BUG or panic. This is not an information-disclosure or code-execution flaw: it’s an availability problem that manifests as system instability or a crash.

The bug was reported and fixed in the kernel tree in November 2024. The fix changes the cleanup path so that the reference release is deferred to a workqueue, which avoids the lock re-entry. Vendor advisories and vulnerability catalogs assign a CVSS v3.1 base score of 5.5 (Medium), with a vector indicating a local attack vector and high availability impact (AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H).
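As a worked check of the score quoted above, the following sketch recomputes the CVSS v3.1 base score from the vector string. The metric weights and the round-up rule follow the published CVSS v3.1 specification; the function names are illustrative, and the sketch deliberately handles only the Scope: Unchanged case that this vector uses.

```python
import math

# CVSS v3.1 metric weights, limited to the Scope: Unchanged case.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},  # values for Scope: Unchanged
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

The impact term comes entirely from A:H (6.42 × 0.56 ≈ 3.60) and the exploitability term from the local, low-privilege metrics (≈ 1.83); their sum of about 5.43 rounds up to 5.5.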

Why this matters to WindowsForum readers and system administrators​

  • Availability is critical: Kernel oopses and panics are service-stoppers on servers and workstations alike. When AFS and RxRPC are in use (common in some enterprise and academic environments), a local actor, misbehaving process, or edge-case workload can cause a system-wide crash.
  • Surface area is local but real: While this CVE requires local access (not a remote unauthenticated exploit), many production scenarios give local access to untrusted workloads — shared systems, CI runners, container hosts, or multi-tenant VMs. In those environments, a medium-rated kernel DoS can have outsized operational impact.
  • Patching complexity: Kernel fixes must be consumed via distro kernel updates, backports, or by applying patches and recompiling kernels. Organizations that freeze kernels for stability may be slow to pick up fixes, increasing risk.
The Microsoft Security Response Center description of this CVE emphasizes the total loss of availability that a denial of service presents: either sustained while the attacker continues the activity, or persistent if the condition leaves the component unusable. That framing is accurate for a kernel-level crash that prevents access to resources until the system is rebooted or the patch is applied.

Technical deep dive: what exactly went wrong?​

The actors: AFS, rxRPC and afs_call​

  • AFS (Andrew File System): a distributed network filesystem that uses its own RPC transport (rxRPC) for communication between clients and servers.
  • rxRPC / AF_RXRPC: the kernel module that implements the rxRPC protocol, which AFS uses for remote procedure calls.
  • afs_call: the kernel structure that represents an in-flight remote AFS call; it’s reference-counted and may be queued for asynchronous processing.

The bug mechanics​

  • Entry state: AF_RXRPC code holds a notify lock (->notify_lock) and calls afs_wake_up_async_call while servicing rxRPC events.
  • Reference attempt: afs_wake_up_async_call attempts to take a reference to an afs_call so the call can be handed to a workqueue for deferred processing.
  • Double-queue / extraneous ref: If the afs_call is already queued, the reference just taken is surplus and must be released.
  • Release path causes callback: Calling afs_put_call to drop that extra reference may, in some flows, call rxrpc_kernel_shutdown_call, which may try to take ->notify_lock while it is still held.
  • Spinlock recursion: The same context tries to take a lock it already holds, which produces a kernel BUG indicating spinlock recursion and leads to an oops or panic. NIST’s NVD entry shows a representative oops trace for this condition.
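The sequence above can be illustrated with a small user-space toy model. This is not kernel code: ToySpinlock, ToyCall and the Python functions are hypothetical stand-ins named after the kernel functions they imitate, and the concurrent_put flag simply simulates another holder dropping its reference so that the surplus put happens to be the last one.

```python
import threading


class SpinlockRecursion(Exception):
    """Raised where the kernel would report 'BUG: spinlock recursion'."""


class ToySpinlock:
    """Non-reentrant lock that remembers its owner, like a debug spinlock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def acquire(self):
        if self._owner == threading.get_ident():
            raise SpinlockRecursion("lock already held by this context")
        self._lock.acquire()
        self._owner = threading.get_ident()

    def release(self):
        self._owner = None
        self._lock.release()


notify_lock = ToySpinlock()


class ToyCall:
    def __init__(self, queued):
        self.refs = 1
        self.queued = queued


def shutdown_call(call):
    # Stand-in for the shutdown path, which may take the notify lock.
    notify_lock.acquire()
    notify_lock.release()


def put_call(call):
    call.refs -= 1
    if call.refs == 0:
        shutdown_call(call)


def wake_up_async_call(call, concurrent_put=False):
    # Runs with notify_lock already held by the caller.
    call.refs += 1
    if concurrent_put:
        call.refs -= 1  # simulated: another holder drops its reference
    if call.queued:
        put_call(call)  # surplus reference released under the lock
    else:
        call.queued = True


def deliver_notification(call, concurrent_put=False):
    notify_lock.acquire()
    try:
        wake_up_async_call(call, concurrent_put)
    finally:
        notify_lock.release()


try:
    deliver_notification(ToyCall(queued=True), concurrent_put=True)
    outcome = "no recursion"
except SpinlockRecursion as exc:
    outcome = f"recursion detected: {exc}"
print(outcome)
```

In this model the recursion only appears when the call is already queued and the surplus put drops the last reference, which mirrors why the bug shows up as an edge case rather than on every notification.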

The fix​

Developers deferred the problematic afs_put_call path into a worker executed by a workqueue. By moving the reference drop and any potential callback into a new deferred worker context, the code ensures the notify_lock isn't held when rxrpc paths can re-enter, breaking the recursion chain. This is a classic deferred free approach to resolve lock-order and callback re-entry issues. Kernel commit references for the patch are listed in public vulnerability databases and NVD.
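A minimal user-space sketch of the deferred-cleanup idea follows. It is a toy model under stated assumptions, not the kernel patch: the deque stands in for a workqueue, ToySpinlock and ToyCall are hypothetical, and the line marked as simulated imitates another holder dropping its reference so that the surplus put is the last one.

```python
import threading
from collections import deque


class SpinlockRecursion(Exception):
    """Raised where the kernel would report 'BUG: spinlock recursion'."""


class ToySpinlock:
    """Non-reentrant lock that remembers its owner, like a debug spinlock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def acquire(self):
        if self._owner == threading.get_ident():
            raise SpinlockRecursion("lock already held by this context")
        self._lock.acquire()
        self._owner = threading.get_ident()

    def release(self):
        self._owner = None
        self._lock.release()


notify_lock = ToySpinlock()
deferred_cleanup = deque()  # stand-in for a kernel workqueue


class ToyCall:
    def __init__(self):
        self.refs = 1
        self.queued = True
        self.cleaned_up = False


def cleanup_call(call):
    # Stand-in for the shutdown path that may take the notify lock.
    notify_lock.acquire()
    try:
        call.cleaned_up = True
    finally:
        notify_lock.release()


def deferred_put_call(call):
    # Drop the reference now, but postpone the lock-taking cleanup.
    call.refs -= 1
    if call.refs == 0:
        deferred_cleanup.append(call)


def wake_up_async_call(call):
    # Runs with notify_lock already held by the caller.
    call.refs += 1
    call.refs -= 1  # simulated: another holder drops its reference
    if call.queued:
        deferred_put_call(call)
    else:
        call.queued = True


def deliver_notification(call):
    notify_lock.acquire()
    try:
        wake_up_async_call(call)
    finally:
        notify_lock.release()


def run_deferred_work():
    # Runs later, after notify_lock has been released.
    while deferred_cleanup:
        cleanup_call(deferred_cleanup.popleft())


call = ToyCall()
deliver_notification(call)
pending = len(deferred_cleanup)
run_deferred_work()
print(pending, call.cleaned_up)  # 1 True
```

The design point is that nothing capable of taking the notify lock runs while the lock is held; the worker takes the lock from a clean context instead.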

Affected kernels and distributions​

  • NVD and vendor trackers list affected kernel versions as Linux kernel releases up to (excluding) 6.11.9, as well as 6.12 release candidates (rc1–rc4). The fix was added to the kernel tree in November 2024 and later incorporated into stable trees and vendor kernels through backports and packaging.
  • Major Linux distributions released advisories or kernel updates referencing the kernel fix. Administrators should apply vendor-supplied kernel updates rather than attempting to hand‑apply patches unless they maintain custom kernels. Ubuntu’s security page lists CVE-2024-53090 and assigns it a medium priority; Red Hat/enterprise distros have similar entries in their errata systems.
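For a first-pass triage, a kernel release string can be compared against the range quoted above. The sketch below is only that: the helper names are hypothetical, the range constants are copied from this article rather than from an authoritative feed, and a distribution kernel with an older version number may already carry a backported fix, so the vendor advisory remains the deciding source.

```python
import re

FIXED_IN_STABLE = (6, 11, 9)      # first 6.11.y release outside the listed range
AFFECTED_6_12_RCS = {1, 2, 3, 4}  # 6.12 release candidates listed above


def parse_release(release):
    """Split a `uname -r` style string into ((major, minor, patch), rc)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(match[1]), int(match[2]), int(match[3] or 0))
    rc = int(match[4]) if match[4] else None
    return version, rc


def in_listed_range(release):
    """True if the upstream version falls in the range quoted in the text."""
    version, rc = parse_release(release)
    if version == (6, 12, 0) and rc is not None:
        return rc in AFFECTED_6_12_RCS
    return version < FIXED_IN_STABLE


for sample in ("6.11.8", "6.11.9", "6.12.0-rc3", "6.12.0"):
    print(sample, in_listed_range(sample))
```

A True result means "check the vendor advisory for this kernel", not "this host is confirmed vulnerable".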

Impact analysis: availability, exploitability, and risk profile​

Availability​

This is fundamentally an availability vulnerability: the observed failure mode is a kernel oops (BUG) caused by spinlock recursion. The worst-case outcome is an immediate kernel crash requiring a reboot, with potential filesystem or service corruption depending on what was running at the time. Kernel-level faults can also disrupt many dependent services, not just AFS clients. The Microsoft framing of "total loss of availability" for affected components is therefore an appropriate characterization in operational terms.

Confidentiality & Integrity​

There is no indication that this bug permits code execution or data manipulation beyond the denial-of-service condition. Public vulnerability records classify the confidentiality and integrity impact as none, which is why the CVSS vector carries C:N/I:N alongside A:H (high availability impact).

Exploitability​

  • Attack vector: Local (AV:L). An attacker needs a local foothold or the ability to trigger kernel paths that manipulate afs_call objects.
  • Privileges: Low (PR:L) in the CVSS vector, indicating an unprivileged user may be able to trigger the condition in some environments, though many such attacks require specific interactions with AFS or services that expose that code path.
  • Public exploits: At publication there is no evidence of a widespread public exploit or proof-of-concept weaponized in the wild. However, the vulnerability is relatively straightforward conceptually — a pathological sequence of events can be staged — so organizations should not be complacent.

Operational risk score​

Ranked for operational risk:
  • High risk for systems that actively use AFS and RxRPC (file servers, legacy academic environments).
  • Moderate risk on multi-tenant hosts where untrusted workloads can reach AFS code paths.
  • Low risk for systems that do not use AFS and where the afs kernel module is neither built nor loaded.

Detection and indicators of compromise (what to look for)​

Kernel logs and system messages are the primary detection source. Look for patterns like:
  • "BUG: spinlock recursion" or explicit kernel oops traces referencing rxrpc, afs_put_call, rxrpc_kernel_shutdown_call, rxrpc_input_data, and krxrpcio threads.
  • Repeated unexpected kernel oopses, kworker crashes, or service restarts on machines where AFS or RxRPC is used.
  • Unexplained reboots or kernel panics on AFS clients or servers following network operations.
Example log fragments from public reports include a stack trace starting with dump_stack_lvl, do_raw_spin_lock, rxrpc_kernel_shutdown_call, afs_put_call, rxrpc_notify_socket, rxrpc_input_data, rxrpc_io_thread — these are direct fingerprints of the failing path.
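One way to turn these fingerprints into a triage rule is sketched below. The function names in FINGERPRINT_FUNCS are the ones listed above; the scan_kernel_log helper, the 40-line window and the sample lines are illustrative assumptions, and the sample is hand-written for the demo rather than copied from a real trace.

```python
HEADLINE = "BUG: spinlock recursion"

# Function names cited above as fingerprints of the failing path.
FINGERPRINT_FUNCS = (
    "rxrpc_kernel_shutdown_call",
    "afs_put_call",
    "rxrpc_notify_socket",
    "rxrpc_input_data",
    "rxrpc_io_thread",
)


def scan_kernel_log(lines, window=40):
    """Return (line_index, matched_functions) for each recursion report.

    `window` is how many following lines are searched for fingerprint
    function names; 40 is an arbitrary choice for this sketch.
    """
    hits = []
    for index, line in enumerate(lines):
        if HEADLINE in line:
            following = lines[index + 1 : index + 1 + window]
            seen = [
                func
                for func in FINGERPRINT_FUNCS
                if any(func in text for text in following)
            ]
            hits.append((index, seen))
    return hits


# Hand-written sample for the demo, NOT a real kernel trace.
sample_log = [
    "unrelated message",
    "BUG: spinlock recursion",
    " rxrpc_kernel_shutdown_call",
    " afs_put_call",
    " rxrpc_input_data",
    "another unrelated message",
]

for index, funcs in scan_kernel_log(sample_log):
    print(f"line {index}: matched {', '.join(funcs) or 'no fingerprint functions'}")
```

A report that also matches afs_put_call and rxrpc_kernel_shutdown_call is a much stronger indicator of this CVE than the headline alone, since spinlock recursion can be reported for unrelated bugs.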

Mitigation and remediation guidance​

Immediate actions (if you manage affected systems)​

  • Patch promptly: Apply vendor kernel updates as they become available. This is the recommended course: distributions ship stable backports and packaging, which reduces build risk. Ubuntu, Red Hat, and other vendors issued advisories referencing the kernel fix.
  • If you cannot patch immediately:
      • Consider unloading the in-kernel AFS module on affected machines that do not require AFS functionality. The in-tree AFS client is built as the kafs module, so the command is rmmod kafs (this requires that no processes are using AFS and that module unloading is permitted).
      • If AFS is compiled into the kernel (not a module), consider rebooting into a kernel without AFS or applying vendor kernel livepatches if available.
      • Restrict local access to systems that use AFS: tighten sudoers, isolate builds and untrusted workloads, and reduce the number of users who can trigger local kernel code paths.
  • Monitoring: Add detection rules to monitoring and SIEM systems to flag "spinlock recursion" or the specific stack traces. Capture kernel logs centrally for faster triage.
  • Backups and recovery plans: Ensure you have tested backups and reboot/recovery playbooks; kernel oopses can be disruptive and require operator attention.
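To decide which hosts are candidates for the module-unload workaround, the loaded-module list can be checked. The sketch below parses text in the /proc/modules format; it assumes the in-tree AFS client and its transport appear under the module names kafs and rxrpc, the helper names are hypothetical, and the sample text is synthetic. Code built directly into the kernel does not appear in /proc/modules, so an empty result is not proof that AFS support is absent.

```python
WATCHED_MODULES = {"kafs", "rxrpc"}  # assumed module names, see lead-in


def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def afs_related_modules(proc_modules_text):
    """Return the watched modules that are currently loaded, sorted."""
    return sorted(WATCHED_MODULES & loaded_modules(proc_modules_text))


# Synthetic sample in /proc/modules layout, for illustration only.
sample = """\
kafs 100000 0 - Live 0x0000000000000000
rxrpc 200000 1 kafs, Live 0x0000000000000000
ext4 300000 2 - Live 0x0000000000000000
"""

print(afs_related_modules(sample))  # ['kafs', 'rxrpc']
```

On a live host the same function can be fed the contents of /proc/modules, for example afs_related_modules(open("/proc/modules").read()).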

Patching specifics​

  • Check your distribution’s kernel advisory for CVE-2024-53090 and apply the kernel update.
  • If running custom kernels, obtain the upstream patch (the kernel commit referenced in public trackers) and backport/apply it into your kernel tree, test in staging, and deploy.
  • For managed environments or appliances, liaise with vendors for a timeline on vendor-certified updates.

Long-term operational hygiene​

  • Reduce unnecessary kernel surfaces on multi-tenant hosts (disable unused filesystem modules).
  • Run kernel test harnesses that exercise networking and filesystems under load to catch edge-case re-entrancy bugs early.
  • Maintain a patch cadence that balances stability and security, and use vendor backports for critical kernel fixes.

Strengths and limitations of the fix​

Strengths​

  • Corrective approach: Deferring resource cleanup to a workqueue is a conservative, well-understood pattern to avert lock re-entry and callback ordering problems.
  • Minimal functional change: The fix isolates the cleanup path rather than rearchitecting AFS or rxRPC logic — it’s surgical and likely to be backportable to stable kernels.
  • Vendor uptake: The patch was applied to the kernel tree and quickly appeared in vulnerability trackers and vendor advisories, meaning administrators can consume fixes via normal update channels.

Limitations and residual risks​

  • Local attack vector remains: The fix prevents the kernel-level oops, but the attack vector (local triggering of complex kernel paths) still exists; untrusted local processes continue to pose risks in general.
  • Backporting complexity: For organizations using long-term stable kernels with heavy backports, integrating the fix may require careful testing to avoid regressions.
  • Not a mitigation for other kernel race/recursion bugs: The fix addresses a specific lock recursion; similar patterns might exist elsewhere. Broader kernel hardening and code audit work remain necessary.

Practical checklist for administrators (step-by-step)​

  • Inventory systems that run any AFS software or have the kafs or rxrpc kernel modules loaded.
  • Check your kernel version against the vulnerable ranges (releases before 6.11.9 and the affected 6.12 release candidates, rc1 through rc4).
  • Consult your distribution’s security advisory and plan kernel updates for affected hosts.
  • If immediate patching is not possible:
      • Unload the AFS module (kafs) where safely feasible.
      • Restrict local, unprivileged access on exposed hosts.
      • Add monitoring for kernel oops and specific stack-trace indicators.
  • Test updates in staging, then deploy to production with rollback plans.
  • Document and rehearse recovery steps for kernel panics (console capture, reboot sequencing, service restart order).

Final analysis and recommendations​

CVE-2024-53090 is a well-scoped kernel availability bug rooted in lock re-entry and reference-counting interactions between AFS and rxRPC. While the CVSS base score rates it as medium because it does not compromise confidentiality or integrity, its operational impact can be severe: kernel oopses and panics can render servers and services unusable until patched or rebooted. The fix implemented in the kernel, deferring the cleanup to a workqueue, is the appropriate engineering approach for this class of re-entrancy bug and has been accepted into upstream and vendor trees.

For administrators, the core priorities are straightforward: identify affected hosts, apply vendor-provided kernel updates as soon as practical, and apply compensating controls where immediate patching is not possible (module unloads, access restriction, monitoring). Organizations that host untrusted workloads should treat kernel-level DoS vectors seriously because they allow non-privileged actors to impact the entire system. Finally, maintain an operational posture that includes fast access to vendor security advisories and a tested recovery plan for kernel faults.

Caveat and verification note: public vulnerability trackers and NVD give the canonical description and CVSS vector for CVE-2024-53090; distro advisories and the kernel tree reference the exact patch commits. Where individual kernel commit pages or vendor bug trackers could not be accessed directly, this analysis relies on the NVD entry and major vendor advisories, which are independently verifiable. Administrators seeking the raw patch or commit context should consult kernel.org and their distribution’s security errata pages for the precise commit IDs and backport details.

Conclusion: treat CVE-2024-53090 as a kernel-level availability issue with operational risk. Prioritize kernel updates on AFS-using hosts, monitor for the characteristic spinlock oops traces, and defer to vendor-supplied patches for the safest remediation path.
Source: MSRC Security Update Guide - Microsoft Security Response Center
 
