Title: CVE-2025-21825 — bpf: “Cancel the running bpf_timer through kworker for PREEMPT_RT” (what happened, who’s affected, and what to do)
Date: March 6, 2025 (published / CVE assignment) — updated summary for sysadmins (Dec 7, 2025)
Summary
- A kernel-level locking problem involving BPF map timers (bpf_timer) and PREEMPT_RT real‑time kernels was assigned CVE-2025-21825. The issue can trigger kernel lockdep/BUG warnings (e.g., "BUG: scheduling while atomic") when a BPF map update frees an element while locks are held; the upstream fix changes how a running bpf_timer is cancelled on PREEMPT_RT systems by deferring full cancellation to kworker context.
- Impact: low severity and low exploitability in typical configurations because the problem depends on PREEMPT_RT (real‑time enabled kernels) and the specific timing of BPF map update + timer activity. Vendors have published fixes and/or backports; the Linux kernel stable trees include commits that remediate the issue.
What is the bug (plain English)
- The kernel's BPF htab map implementation can overwrite a pre‑allocated hash table (htab) element during an update. When overwriting, the map code protects the freeing of the old element with a per‑bucket "bucket lock" to avoid races with concurrent map updates that might reuse the stashed element.
- The problem arises because the routine that frees element fields (check_and_free_fields) may itself take another spinlock. On PREEMPT_RT, ordinary spinlocks become sleeping locks, so they must not be acquired while a raw spinlock is held. Cancelling a running bpf_timer (which relies on hrtimer_cancel and softirq_expiry_lock) can therefore try to acquire softirq_expiry_lock while the raw bucket spinlock is still held. That combination violates the locking rules and can trigger lockdep warnings and kernel BUG output such as "BUG: scheduling while atomic". In short: freeing under the bucket lock can reach a code path that may sleep, which is illegal in atomic context on PREEMPT_RT.
Why the fix matters (technical summary)
- On PREEMPT_RT kernels, synchronously cancelling a running high‑resolution timer can try to acquire locks that must not be taken while the raw bucket spinlock is held. The result is a lockdep complaint and crash‑like kernel BUG output, which at minimum signals that the kernel reached a state it should never be in. The upstream patch avoids that lock sequence while the bucket lock is held by splitting the cancel operation into two phases:
- Try a non‑blocking cancellation (hrtimer_try_to_cancel) while still in the locked scope; if that succeeds, proceed normally.
- If the timer's callback is currently running (so the non‑blocking cancel could not stop it), schedule a worker to call hrtimer_cancel later in process context (kworker), where it can safely acquire the needed locks without violating the PREEMPT_RT locking rules. This moves the potentially blocking cancellation out of the raw spinlock context.
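The two phases above can be illustrated with a small userspace model. This is only a sketch, not the kernel patch: ToyTimer, cancel_under_bucket_lock and run_worker are hypothetical names invented for illustration, and the "blocking" cancel is simulated rather than real. The one detail borrowed from the kernel is the return convention of hrtimer_try_to_cancel (1 if an active timer was stopped, 0 if it was not active, -1 if its callback is currently executing).

```python
from collections import deque


class ToyTimer:
    """Hypothetical stand-in for an hrtimer: 'inactive', 'queued' or 'running'."""

    def __init__(self, state="inactive"):
        self.state = state

    def try_to_cancel(self):
        # Same return convention as hrtimer_try_to_cancel(); never blocks.
        if self.state == "running":
            return -1          # callback is executing, cannot be stopped now
        if self.state == "queued":
            self.state = "inactive"
            return 1           # timer was active and has been stopped
        return 0               # timer was not active

    def cancel_blocking(self):
        # Models hrtimer_cancel(): in the kernel this may wait for the
        # callback to finish; here the wait is simulated by a state change.
        self.state = "inactive"


def cancel_under_bucket_lock(timer, deferred_work):
    """Phase 1: modelled atomic section, so only the non-blocking call is used."""
    result = timer.try_to_cancel()
    if result == -1:
        deferred_work.append(timer)    # hand phase 2 to the worker
        return "deferred"
    return "cancelled" if result == 1 else "inactive"


def run_worker(deferred_work):
    """Phase 2: modelled kworker context, where a blocking cancel is allowed."""
    handled = 0
    while deferred_work:
        deferred_work.popleft().cancel_blocking()
        handled += 1
    return handled


if __name__ == "__main__":
    queue = deque()
    busy = ToyTimer("running")
    print(cancel_under_bucket_lock(busy, queue))   # phase 1 outcome
    print(run_worker(queue), busy.state)           # phase 2 outcome
```

The point of the split is visible in the model: nothing inside cancel_under_bucket_lock can wait, and the only call that may wait runs later from the worker, after the lock would have been released.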
Important timeline & scope / kernel versions
- CVE published: March 6, 2025.
- The bug is tied to the BPF timer code introduced around kernel 5.15 and to softirq_expiry_lock, which has existed since v5.4. The Linux kernel CVE announcement and stable commit log indicate the issue was introduced in 5.15 (commit b00628b1...) and fixed in later kernels (fixes show up in 6.13.2 and in 6.14-rc1 via specific commits). The fix is only required on kernels that are built with PREEMPT_RT enabled (dependency: PREEMPT_RT enabled, typically in v6.12+).
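As a rough first-pass filter, the version range above can be turned into a comparison on the output of uname -r. This is a minimal sketch under loud assumptions: the function names are hypothetical, the bounds are simply the 5.15 and 6.13.2 figures quoted above, and the result says nothing about vendor backports, which can fix an older-looking kernel. Treat a True result as "check the vendor advisory", not as proof of exposure.

```python
import re

# Bounds taken from the version notes above (assumption: upstream numbering).
INTRODUCED = (5, 15, 0)
FIXED_IN_STABLE = (6, 13, 2)


def upstream_version(release):
    """Extract (major, minor, patch) from a string such as '6.12.9-rt'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def in_upstream_window(release):
    """True if the release falls between the introducing and fixing versions."""
    return INTRODUCED <= upstream_version(release) < FIXED_IN_STABLE


if __name__ == "__main__":
    import platform
    print(platform.release(), in_upstream_window(platform.release()))
```

Distribution kernels often keep a fixed upstream base version while carrying backported fixes, so this check is only useful for narrowing down which hosts need a closer look.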
Severity, exploitability, and practical risk
- Vendors and open vulnerability repositories classify this as low severity (CVSS ~3.3 in some vendor advisories) and EPSS / exploit prediction shows very low exploitation probability. Part of the reason for the relatively low score is that the issue requires PREEMPT_RT (real‑time) kernels and the specific timing conditions that lead to a lockdep BUG.
- That said, a kernel BUG or lockdep problem can be disruptive — on systems that run PREEMPT_RT kernels (e.g., real‑time appliances, some embedded systems, telecom, industrial controllers), this should be taken seriously. On general-purpose distributions (non‑PREEMPT_RT default kernels), the issue is much less likely to be triggered.
Where the fix landed (commits / files)
- The problem and the patch were recorded in the kernel CVE announcements and stable trees. The affected code is in kernel/bpf/helpers.c and the fixes are available as stable commits (the kernel CVE announcement lists the relevant commit IDs). If you need to pick patches instead of upgrading the kernel, those commits are the ones to review.
What administrators and developers should do (recommended actions)
- Inventory & triage (immediately)
- Find which systems run kernels compiled with PREEMPT_RT (real‑time). On each host, run:
- uname -r (to get kernel version)
- grep PREEMPT /boot/config-$(uname -r) || zgrep PREEMPT /proc/config.gz (to see if PREEMPT_RT/CONFIG_PREEMPT_RT is enabled)
- If you run PREEMPT_RT kernels and you use BPF programs or containers that load BPF maps/timers, prioritize those systems for patching. If your distro ships non‑PREEMPT_RT default kernels, the risk is lower. (The commands above are common sysadmin checks; vendor packaging may differ.)
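For fleets where the check has to be scripted, the same inventory step might look like the sketch below. It assumes the kernel config is available at one of the two locations used by the commands above; the function names are hypothetical. Note that a plain grep for PREEMPT also matches unrelated options, so the sketch looks for the exact line CONFIG_PREEMPT_RT=y.

```python
import gzip
import platform
from pathlib import Path


def config_enables_preempt_rt(config_text):
    """True only if the config contains the exact line CONFIG_PREEMPT_RT=y."""
    return any(line.strip() == "CONFIG_PREEMPT_RT=y"
               for line in config_text.splitlines())


def read_kernel_config(release=None):
    """Return the running kernel's config text, or None if it is unavailable."""
    release = release or platform.release()
    try:
        boot_config = Path(f"/boot/config-{release}")
        if boot_config.is_file():
            return boot_config.read_text(errors="replace")
        proc_config = Path("/proc/config.gz")
        if proc_config.is_file():
            with gzip.open(proc_config, "rt", errors="replace") as handle:
                return handle.read()
    except OSError:
        pass  # unreadable file: report as unknown rather than guessing
    return None


def preempt_rt_status():
    """Return 'rt', 'not-rt', or 'unknown' for the current host."""
    text = read_kernel_config()
    if text is None:
        return "unknown"
    return "rt" if config_enables_preempt_rt(text) else "not-rt"


if __name__ == "__main__":
    print(platform.release(), preempt_rt_status())
```

An "unknown" result means neither config location was readable; on such hosts the vendor's package metadata is the better source.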
- Patch / upgrade (primary mitigation)
- The recommended fix is to update the kernel to a version that contains the upstream fix — i.e., install the updated stable kernel release or vendor backport that includes the BPF patch set (see vendor advisories). Do not cherry‑pick unrelated kernel commits unless you have kernel expertise — the upstream CVE guidance likewise recommends updating to a patched stable kernel.
- Check vendor advisories for your distribution (SUSE, Ubuntu/USN, Red Hat, Oracle Linux, Amazon Linux, etc.); many vendors list CVE‑2025‑21825 and provide patched kernel packages or backports. OSV and vendor pages aggregate these vendor advisories.
- Short‑term mitigations (if you cannot patch immediately)
- If you can, avoid running PREEMPT_RT kernel builds until you can apply the vendor patch (for non‑critical real‑time workloads). For embedded or specialized systems that require PREEMPT_RT, plan a maintenance window to apply vendor‑provided patches.
- Reduce exposure by avoiding untrusted BPF program loading and restricting use of BPF interfaces (e.g., limit access to the bpf syscall, restrict rootless/unprivileged BPF where your distro supports it), because the condition requires BPF updates that trigger the path. (This is a general mitigation, not a guaranteed workaround.)
- Monitor dmesg / syslog for the specific kernel warnings that indicate the bug is happening — e.g., lockdep BUG output containing "BUG: scheduling while atomic" with stack traces pointing into htab_map_update_elem/hrt... — and treat such hits as high priority for patching.
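The unprivileged-BPF restriction mentioned in the short-term mitigations can be audited by reading the kernel.unprivileged_bpf_disabled sysctl. The sketch below only reads and interprets the value; the function names are hypothetical, and the value meanings follow the kernel's sysctl documentation. As noted above, this is a general hardening check and not a guaranteed workaround for this CVE.

```python
from pathlib import Path

SYSCTL_PATH = Path("/proc/sys/kernel/unprivileged_bpf_disabled")

# Meanings as described in the kernel sysctl documentation.
MEANINGS = {
    "0": "enabled: unprivileged users may call bpf()",
    "1": "disabled until reboot (cannot be cleared on the running kernel)",
    "2": "disabled, but an administrator can still change the setting",
}


def describe_unprivileged_bpf(raw_value):
    """Translate the raw sysctl content into a human-readable status."""
    value = raw_value.strip()
    return MEANINGS.get(value, f"unrecognized value {value!r}")


def current_unprivileged_bpf_setting():
    """Read the sysctl on this host; tolerate kernels that lack the file."""
    try:
        return describe_unprivileged_bpf(SYSCTL_PATH.read_text())
    except OSError:
        return "sysctl not readable on this host"


if __name__ == "__main__":
    print(current_unprivileged_bpf_setting())
```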
- Detection & monitoring
- Look for the kernel warning example reported in public advisories (the kernel BUG/lockdep trace). If you see that trace in your logs, prioritize patching that host. The linux‑cve announce message and other advisories include a representative dmesg trace you can match.
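Log matching of this kind can be automated with a short scanner. The sketch below is an assumption-laden starting point: it searches saved dmesg or syslog text for the "BUG: scheduling while atomic" string and then reports which of a few function names mentioned in this article appear in the lines that follow. The hint list and the 40-line window are illustrative choices, not values taken from the advisory, and real traces may name different frames.

```python
import re

SIGNATURE = re.compile(r"BUG: scheduling while atomic")
# Function names mentioned in this article; purely heuristic context hints.
CONTEXT_HINTS = ("htab_map_update_elem", "hrtimer_cancel", "bpf_timer")


def find_suspect_traces(log_text, window=40):
    """Return (line_number, line, matched_hints) for each signature hit."""
    lines = log_text.splitlines()
    hits = []
    for index, line in enumerate(lines):
        if not SIGNATURE.search(line):
            continue
        following = lines[index + 1:index + 1 + window]
        hints = sorted(hint for hint in CONTEXT_HINTS
                       if any(hint in entry for entry in following))
        hits.append((index + 1, line.strip(), hints))
    return hits


if __name__ == "__main__":
    import sys
    # Usage: dmesg | python3 scan.py   (script name is arbitrary)
    for number, text, hints in find_suspect_traces(sys.stdin.read()):
        print(f"line {number}: {text} | context: {', '.join(hints) or 'none'}")
```

A hit with no context hints is still worth a look, since "scheduling while atomic" can come from unrelated bugs; hits that also mention the BPF map or timer functions are the ones to prioritize for this CVE.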
Vendor status & references (examples)
- Linux upstream and kernel CVE announcements describe the problem, the affected file (kernel/bpf/helpers.c), and list the commits that fix it. The kernel CVE team and stable tree references are the authoritative source for the exact patch.
- OSV (Open Source Vulnerabilities) and vendor advisories (SUSE, Ubuntu/USN, Oracle, Amazon ALAS) list CVE-2025-21825 with vendor‑specific statuses (published March 6, 2025; some vendor pages list patch availability/backports). Check the advisory for your distro for the exact package and version to install.
- The NVD (National Vulnerability Database) entry summarises the issue and indicates the PREEMPT_RT dependency and the reasoning behind the fix.
Why the fix uses kworker (brief technical rationale)
- When a running hrtimer must be cancelled and that cancellation can block or acquire locks not allowed while holding the bucket raw spin‑lock, doing the blocking work in process context (a kworker) avoids violating lock ordering constraints on PREEMPT_RT. The patch thus tries a non‑blocking cancel immediately and, if unsuccessful, defers the blocking cancel to a worker. This prevents the bad lock ordering in atomic contexts while still ensuring the timer is eventually cancelled. There is a trade‑off: cancellation may be slightly delayed when the timer is running; that is recognized in the changelog commentary and may be revisited later.
FAQ / Common questions
- Does this affect all Linux systems?
- No. The condition requires PREEMPT_RT (real‑time) enabled kernels — that dramatically reduces the universe of affected systems compared with general‑purpose distributions. However, if your distro supplies a PREEMPT_RT kernel (embedded, real‑time appliances, or custom builds), you are in scope.
- Is there a public exploit?
- Public exploit activity is not documented and EPSS / exploit prediction indicates negligible exploitability; the patch was driven by correctness/lockdep issues rather than a known remote exploit chain. Still, kernel BUGs are serious for affected real‑time systems and should be remediated.
- Can I safely cherry‑pick the upstream commit instead of upgrading the kernel?
- Kernel changes interact with many subsystems; upstream notes and the kernel CVE team recommend upgrading to a stable patched kernel rather than ad‑hoc cherry‑picks unless you know how to thoroughly test kernel patches in your environment.
Closing notes (practical checklist)
- March 6, 2025: CVE-2025-21825 published upstream.
- If you run PREEMPT_RT kernels and BPF programs: plan for prompt patching. Check vendor advisories (OSV, SUSE, Ubuntu USN, Oracle, Amazon ALAS, Red Hat) for patched kernel packages/backports.
- If you cannot patch right away: restrict BPF program loading, monitor dmesg for the lockdep/BUG traces discussed above, and schedule a kernel upgrade in the next maintenance window.
References (selected)
- Linux kernel CVE announcement / linux‑cve‑announce: "CVE-2025-21825: bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT". March 6, 2025.
- NVD entry for CVE-2025-21825 (summary and dependency notes).
- OSV (Open Source Vulnerabilities) aggregated advisory: CVE-2025-21825 (lists related vendor advisories like SUSE, USN). Published 2025-03-06.
- Amazon Linux ALAS / CVE-2025-21825 (severity/CVSS reference).
- CVEDetails / vulnerability summary and kernel commit references.
Source: MSRC
Security Update Guide - Microsoft Security Response Center