Linux Bluetooth MGMT Fix: CVE-2025-40284 Cancels Mesh Timer on Device Removal

A missing timer cancellation in the Linux Bluetooth management (MGMT) code has been assigned CVE-2025-40284 and fixed upstream. The bug left the delayed mesh-transmit completion work scheduled after the HCI device (hdev) was removed, creating a use-after-free that could crash or hang affected systems. The remedy is a one-line synchronous cancel, the same defensive step the MGMT code already takes for its other delayed work items.

Background

Bluetooth Mesh is a growing use case for Linux-based systems that act as gateways, controllers, or test hosts for IoT and home automation scenarios. The Linux kernel implements core Bluetooth interfaces and the MGMT (management) layer that BlueZ and other userspace stacks use to control HCI devices. That management layer schedules delayed work items (timers) for several asynchronous activities; when an HCI device (hdev) is removed, MGMT must cancel those timers so no later callback dereferences memory that has already been freed.
CVE-2025-40284 describes precisely the situation where the mesh transmit completion timer — exposed in the kernel as a delayed_work named mesh_send_done — was not being canceled when MGMT removed the hdev. If that timer fires after the device structure has been freed, the timer handler runs on freed memory, leading to a slab-use-after-free and a kernel oops/crash. Automated BlueZ mesh tests ("Mesh - Send cancel - 1") and kernel sanitizer runs (KASAN) exposed the problem as intermittent but real instability.
The upstream fix is minimal and deliberate: cancel the delayed work synchronously when MGMT removes an hdev, matching the existing pattern used for other MGMT timers.

What happened (technical overview)​

The root cause: a missing cancellation

  • The Bluetooth MGMT code registers several delayed work items on each HCI device (hdev), for activities such as service-cache refresh, discovery-off, RPA expiration and mesh send completion.
  • When MGMT removes an hdev — for example during driver unload, device reset, or hot-unplug — the removal path normally cancels those delayed works to guarantee that no callbacks will run after the device is gone.
  • The mesh_send_done delayed work was omitted from that list. If the hdev was removed while mesh_send_done remained scheduled or running, the delayed-work handler could execute against memory that had already been freed, producing a slab-use-after-free.
  • KASAN traces and BlueZ test failures showed the resulting symptom: kernel oopses and worker hangs originating in timer/softirq processing, pointing to freed objects being accessed by the timer callback.

The fix​

  • The patch adds a single defensive call in the MGMT removal path: cancel_delayed_work_sync(&hdev->mesh_send_done);
  • This mirrors other cancel_delayed_work_sync calls already present for the MGMT-managed delayed works, ensuring that the mesh_send_done worker will not run after the hdev is released.
  • The change is conservative: it avoids races by using the synchronous cancel variant and keeps the device teardown semantics consistent with existing timers.
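The teardown ordering can be illustrated with a small userspace analogy. This is a sketch, not kernel code: FakeHdev and late_callbacks_after_removal are hypothetical names, and Python's threading.Timer (cancel() followed by join()) only approximates what cancel_delayed_work_sync does for a kernel delayed work.

```python
import threading


class FakeHdev:
    """Hypothetical stand-in for an HCI device; `freed` marks completed teardown."""

    def __init__(self, delay):
        self.freed = False
        self.late_callbacks = 0
        # Plays the role of the mesh_send_done delayed work.
        self.mesh_send_done = threading.Timer(delay, self._on_mesh_send_done)
        self.mesh_send_done.start()

    def _on_mesh_send_done(self):
        # In the kernel this would touch freed memory; here we only count it.
        if self.freed:
            self.late_callbacks += 1

    def remove(self, cancel_first):
        if cancel_first:
            # Analogy for cancel_delayed_work_sync(): stop the pending timer
            # and wait for any callback that is already running.
            self.mesh_send_done.cancel()
            self.mesh_send_done.join()
        self.freed = True


def late_callbacks_after_removal(cancel_first, delay=0.05):
    """Count callbacks that ran after the device was marked freed."""
    hdev = FakeHdev(delay)
    hdev.remove(cancel_first)
    # Wait for the timer thread to finish so the count is final.
    hdev.mesh_send_done.join()
    return hdev.late_callbacks
```

Calling late_callbacks_after_removal(cancel_first=False) models the unpatched removal path, where the callback still fires against a device that is already gone; with cancel_first=True the callback never runs after teardown.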

Why this matters: the impact and risk profile​

What the bug does​

  • Primary impact: kernel crash / denial of service (system instability).
  • Symptom set: KASAN-reported slab-use-after-free, kernel oops messages, hung kernel workers, and test-suite failures under BlueZ mesh tester automation.
  • Where it appears: in kernels that include the MGMT implementation prior to the fix and that have Bluetooth mesh support enabled. Systems running Bluetooth mesh userspace (BlueZ mesh) or test rigs are most likely to trigger the condition.

Attack surface and exploitability (practical view)​

  • The flaw is a memory-safety bug (use-after-free). In principle, use-after-free flaws can be escalated into code execution or privilege escalation, but exploitation depends heavily on context, memory layout, kernel mitigations (KASLR, SMEP/SMAP, CFI, etc.), and how the delayed callback uses the freed memory.
  • There are no confirmed reports of public exploit code that turns this bug into a remote code-execution vulnerability. The observable class of failure in test logs is a crash/hang detected by KASAN.
  • For most deployments, the realistic immediate threat is denial of service (node crash, kernel panic, or hung worker threads) rather than an obvious escalation to arbitrary code execution.
  • Systems that expose Bluetooth management interfaces to untrusted inputs — particularly automated mesh testers, gateway appliances, or shared testbeds — raise the probability an attacker could deliberately trigger the condition. For embedded IoT devices that implement mesh nodes or relays, the bug can reduce availability or cause instability.
Note: definitive claims about remote exploitability require deeper exploitation analysis and proof-of-concept code; at the time of disclosure there is no public proof-of-exploit showing privilege elevation or RCE from this specific bug. Treat exploitability as possible in theory but unproven in practice and prioritize stability fixes accordingly.

Which kernels and platforms are affected​

  • The bug is in the Linux kernel's Bluetooth MGMT implementation. It affects mainline kernels and stable branches prior to the patch landing.
  • Upstream stable branches have been updated with the fix; downstream distributions typically backport and release patched kernel packages.
  • Fixed upstream stable releases (representative points on each stable branch) include 6.1.159, 6.6.117, 6.12.59, and 6.17.9.
  • Distributions and vendors will publish their own security advisories and package updates that contain the upstream fixes or equivalent backports. Users should consult their vendor's security channels and package metadata to confirm whether their installed kernel includes the patch.
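As a rough first pass, a running kernel's release string can be compared against the fixed stable releases listed above. The sketch below assumes an upstream-style version number; the helper names are hypothetical, and the table contains only the versions named in this article. Distribution kernels often backport fixes without changing the upstream version number, so a "predates-fix" result is a prompt to check the vendor advisory, not proof of vulnerability.

```python
import re

# Fixed stable releases as listed in this article, keyed by (major, minor).
# Not an exhaustive list of vendor or distribution kernels.
FIXED_PATCHLEVEL = {(6, 1): 159, (6, 6): 117, (6, 12): 59, (6, 17): 9}


def parse_release(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def upstream_status(release):
    """Return 'fixed', 'predates-fix', or 'unknown' for a release string."""
    major, minor, patch = parse_release(release)
    fixed_patch = FIXED_PATCHLEVEL.get((major, minor))
    if fixed_patch is None:
        # Branch not covered by the table: consult the vendor advisory.
        return "unknown"
    return "fixed" if patch >= fixed_patch else "predates-fix"
```

On a live system the input could come from platform.release(), which returns the same string as uname -r.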

How to check if your system is vulnerable​

  • Determine your running kernel:
      • uname -r
      • hostnamectl (on systems that provide it)
  • Check whether your distribution has a security advisory or a kernel update mentioning Bluetooth fixes or the CVE number.
  • Inspect the kernel's Bluetooth MGMT source if you build kernels from source:
      • Look in net/bluetooth/mgmt.c for an existing call to cancel_delayed_work_sync(&hdev->mesh_send_done) inside the hdev removal function (mgmt_index_removed or equivalent).
      • If your source lacks that line and your kernel predates the fixed stable tag, you likely need the update.
  • For packaged kernels: query package changelogs or the vendor's CVE tracker to confirm whether the fix is present.
  • Check logs for relevant symptoms:
      • Kernel logs (dmesg, journalctl -k) containing KASAN or slab-use-after-free reports referencing mesh, run_timer_softirq, or mgmt_pending_remove.
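For source trees, the inspection step can be scripted. The following is a heuristic sketch: it assumes the removal function is named mgmt_index_removed, as described above, and uses simple brace counting that ignores braces inside comments or string literals. The helper names are hypothetical.

```python
import re

CANCEL_CALL = re.compile(
    r"cancel_delayed_work_sync\s*\(\s*&\s*hdev\s*->\s*mesh_send_done\s*\)"
)


def function_body(source, name):
    """Return the brace-delimited body of a C function definition, or None."""
    pattern = r"\b" + re.escape(name) + r"\s*\([^;{]*\)\s*\{"
    m = re.search(pattern, source)
    if not m:
        return None
    start = m.end() - 1  # index of the opening brace
    depth = 0
    for i in range(start, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return source[start:i + 1]
    return None


def removal_path_cancels_mesh_timer(source, func="mgmt_index_removed"):
    """True/False if the function was found; None if it was not."""
    body = function_body(source, func)
    if body is None:
        return None
    return CANCEL_CALL.search(body) is not None
```

A typical use is to read net/bluetooth/mgmt.c into a string and pass it to removal_path_cancels_mesh_timer. A None result means the function was not found under that name and the file needs manual review.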

Recommended remediation steps (practical, prioritized)​

  • Immediate action (all systems):
      • Update to the vendor-supplied kernel that includes the upstream fix. Rebooting into a patched kernel is required for the change to take effect.
      • If you cannot update immediately, consider temporarily disabling Bluetooth or Bluetooth Mesh userspace components until you can apply the patch. On servers or gateways that do not require Bluetooth, unloading the Bluetooth kernel module (and preventing automatic loading) reduces exposure.
  • For development, test, and CI systems:
      • Apply upstream stable patch or upgrade to the nearest fixed stable kernel (e.g., one of the versions listed earlier).
      • Re-run BlueZ mesh test suites to validate the fix under your test harness.
  • For embedded and appliance vendors:
      • Backport the one-line fix or pick a patched stable kernel branch and rebuild device kernels.
      • Validate with device-specific test cases that simulate hot-unplug and mesh-send cancel operations.
  • Monitoring and detection:
      • Watch kernel logs for slab-use-after-free or KASAN traces involving run_timer_softirq, mgmt_pending_remove, vhci_release, or mesh-send worker names.
      • Add alerting on kernel oopses and worker timeouts in devices that operate as mesh controllers or gateways.
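The log-watching step might be automated along these lines. This is a minimal sketch: the symbol list is taken from the names mentioned above, the 40-line context window is an arbitrary assumption, and the function name is hypothetical. It flags candidates for human review and does not identify this CVE on its own.

```python
import re

# Matches KASAN report headers such as "BUG: KASAN: slab-use-after-free in ...".
UAF_HEADER = re.compile(r"KASAN: .*use-after-free")

# Symbols this article associates with traces of the bug.
SYMBOLS = ("run_timer_softirq", "mgmt_pending_remove", "vhci_release", "mesh_send_done")


def suspect_reports(log_text, window=40):
    """Return (line_number, matched_symbols) for each KASAN use-after-free
    header whose own line or following `window` lines mention a symbol."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if not UAF_HEADER.search(line):
            continue
        context = "\n".join(lines[i:i + window + 1])
        found = [s for s in SYMBOLS if s in context]
        if found:
            hits.append((i + 1, found))
    return hits
```

The input can be the captured output of dmesg or journalctl -k; an empty list means no matching report was found in that text.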

Mitigation checklist (quick reference)​

  • Update kernel (preferred):
      • Install the vendor-distributed kernel update (contains the upstream fix).
      • Reboot into the patched kernel.
  • If an update is not immediately possible:
      • Stop BlueZ/mesh userspace and unload Bluetooth kernel modules.
      • Block access to Bluetooth control interfaces (the MGMT control channel on HCI sockets) from untrusted users or network paths.
      • Isolate affected devices from untrusted mesh nodes until patched.
  • For kernel builders:
      • Apply the upstream MGMT change to net/bluetooth/mgmt.c: add cancel_delayed_work_sync(&hdev->mesh_send_done); in mgmt_index_removed alongside the other cancel_delayed_work_sync calls.
      • Rebuild and test.

Why the patch is the right fix​

  • The change follows an existing defensive pattern. MGMT already cancels other delayed work items during hdev teardown; adding the mesh_send_done cancel fills a one-off omission.
  • Using cancel_delayed_work_sync is appropriate because it:
      • Ensures any running worker has completed before the hdev is released.
      • Prevents the delayed-work callback from running after the device memory is freed.
  • The fix avoids changing semantics of mesh transmission or adding expensive synchronization; it simply ensures safe teardown ordering.

Wider implications for IoT, gateways, and test infrastructures​

  • IoT gateways and hubs that route or manage Bluetooth Mesh traffic are prime candidates to be affected. A crash in the kernel's Bluetooth stack can disrupt multi-hop mesh behaviors and leave nodes disconnected from central controllers.
  • Automated testbeds (BlueZ test bot and similar) are particularly sensitive; the bug was discovered because mesh test cases sporadically hung or caused kernel crashes when run under automation and sanitizer configurations.
  • Embedded devices that rarely reboot or that run specialized kernels must pay close attention: bespoke or outdated kernel trees may not have the fix; vendors need to evaluate and push secure firmware updates.
  • Cloud VMs with Bluetooth passthrough or development hosts used for Bluetooth testing can also be affected if they rely on the unpatched kernel.

Developer and maintainer notes​

  • Maintain minimal exposure to untrusted control paths: even when a bug appears to be a crash-only issue, avoid leaving device-management interfaces open to untrusted userspace processes or remote systems.
  • Prefer synchronous cancellation (cancel_delayed_work_sync) in teardown paths where the worker callback accesses device-local memory. The asynchronous variant, cancel_delayed_work, does not wait for a handler that is already running, which leaves a race window between cancellation and freeing the device.
  • When introducing new delayed work into device structures, add the corresponding teardown cancellation in the same patch or review item to avoid omissions.
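The last point lends itself to a simple review aid. The sketch below is a text-level heuristic only: it pairs INIT_DELAYED_WORK(&ptr->field, ...) initializations with cancel_delayed_work_sync(&ptr->field) calls in the same source text. The function name is hypothetical, and a real tree may initialize and cancel a work item in different files, so results are prompts for a closer look.

```python
import re

INIT_RE = re.compile(r"INIT_DELAYED_WORK\s*\(\s*&\s*\w+\s*->\s*(\w+)")
SYNC_CANCEL_RE = re.compile(r"cancel_delayed_work_sync\s*\(\s*&\s*\w+\s*->\s*(\w+)")


def works_without_sync_cancel(source):
    """List delayed-work fields that are initialized in `source` but never
    passed to cancel_delayed_work_sync in the same text."""
    initialized = {m.group(1) for m in INIT_RE.finditer(source)}
    cancelled = {m.group(1) for m in SYNC_CANCEL_RE.finditer(source)}
    return sorted(initialized - cancelled)
```

Run against a source file that initializes mesh_send_done without a matching synchronous cancel, the function would report that field name.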

How we validated the facts​

  • The issue was identified in the kernel Bluetooth management code and described as a delayed-work cancellation omission; automated test logs and KASAN traces show slab-use-after-free behavior when the mesh_send_done timer executes after hdev removal.
  • Upstream maintainers applied a corrective patch that adds cancel_delayed_work_sync(&hdev->mesh_send_done) to the MGMT removal path.
  • The upstream fix has been merged into stable kernel branches and appears in subsequent stable releases; downstream vendors have been encouraged to backport and release updates via standard security channels.
  • Public vulnerability trackers have assigned CVE-2025-40284 to the problem and registered the description as the MGMT cancel omission leading to a crash.
Flagged limitation: definitive, low-level exploitability (i.e., whether this use-after-free can be reliably turned into kernel code execution against a fully patched host with modern mitigations) is not publicly documented and would require a separate exploitation study. The available evidence supports denial-of-service / crash impact as the practical and observed threat.

Practical FAQ​

  • Is my laptop at risk?
      • If your laptop runs a distribution kernel that includes the upstream fix (or a vendor backport), it is not at risk from this specific bug. If it runs an older or custom kernel that predates the fix and you use Bluetooth Mesh features, apply the kernel update.
  • Does this let an attacker run code as root?
      • There is no public proof that this bug leads to remote code execution or privilege escalation. The observed issue is a use-after-free that causes crashes. Use-after-free bugs can sometimes be exploited, so treat escalation as a theoretical risk until your vendor's updated kernel is installed.
  • Will disabling Bluetooth entirely fix it?
      • It mitigates the bug without fixing it: unloading the Bluetooth kernel module or stopping the Bluetooth userspace stack removes the immediate attack surface and prevents mesh timers from being scheduled. The underlying defect remains until a patched kernel is installed.
  • How urgent is the update?
      • Moderate-to-high for devices that run or test Bluetooth Mesh; routine for systems that do not use Bluetooth Mesh. Because the fix is small and low-risk, applying vendor kernel updates as part of regular maintenance is recommended.

Conclusion​

CVE-2025-40284 is an instructive example of how a single omitted timer cancellation can undermine kernel stability. The problem is not a feature gap or a complex design flaw — it’s a missing defensive step in device teardown semantics that allowed a delayed-work handler to touch memory that was already freed. Upstream maintainers applied a focused, low-risk fix that restores expected behavior: cancel the mesh_send_done delayed work when MGMT removes the hdev.
For operators and vendors the actionable guidance is simple and immediate: apply the kernel update from your distribution or vendor, reboot into the patched kernel, and if that is not immediately possible, disable Bluetooth or isolate devices that act as mesh controllers until the remediation is in place. Given the prevalence of Linux in gateways, testbeds, and embedded appliances that power Bluetooth Mesh deployments, this small fix has an outsized effect on reliability — and it demonstrates the value of rigorous automated testing and sanitizer-assisted development in finding and closing timing- and teardown-related kernel bugs.

Source: MSRC Security Update Guide - Microsoft Security Response Center
 
