The Linux kernel’s ftrace subsystem received a targeted fix for a responsiveness issue that could turn into a local denial‑of‑service: a missing conditional reschedule inside ftrace_graph_set_hash() allowed long loops to hog the CPU and trigger the kernel’s softlockup watchdog under heavy tracing scenarios, and the upstream patch adds a call to cond_resched() to prevent those stalls.
Source: MSRC Security Update Guide - Microsoft Security Response Center
Background
ftrace is the kernel’s built‑in function tracer used for performance analysis, debugging, and dynamic tracing of function entry/exit events. It iterates over the kernel’s symbol tables and function lists when building traceable function sets, and several ftrace code paths already include calls to cond_resched() to avoid long, non‑preemptible loops. The recently disclosed issue addresses a place where that defensive pattern was missing.
The problem was publicly disclosed and assigned CVE‑2025‑37940 on 20 May 2025, when the maintainers announced the change to add cond_resched() inside ftrace_graph_set_hash(). Multiple distribution security teams and vulnerability databases recorded the advisory and the corresponding fixes.
What went wrong: technical summary
The code path at fault
- The vulnerable loop lives in ftrace_graph_set_hash(), the routine that builds the function‑graph filter hashes (the sets behind the set_graph_function and set_graph_notrace tracing files) by matching a pattern against every function that can be traced.
- When a kernel contains a very large number of traceable functions (for example in systems with many modules or large symbol tables), that loop can take a long time to complete.
- While that loop ran, the code never yielded the CPU, so a long stretch without a scheduling point could cause the kernel’s softlockup watchdog to decide the CPU was stuck and react (which may look like a hard hang, a panic where softlockup_panic is enabled, or severe unresponsiveness).
Why cond_resched() matters
- cond_resched() is the kernel’s cooperative mechanism that allows lengthy kernel work in non‑atomic contexts to voluntarily yield the CPU when the scheduler needs to run other tasks.
- Adding cond_resched() inside long loops is an established pattern in the kernel to avoid starvation of other tasks and to stop the softlockup watchdog from firing during legitimate (but lengthy) work. The patch simply brings ftrace_graph_set_hash() into line with other kernel code that iterates across all traceable functions.
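The effect of a cooperative yield point can be illustrated outside the kernel. The sketch below is a userspace analogy in Python, not kernel code: asyncio.sleep(0) stands in for cond_resched(), and build_hash, heartbeat, and the synthetic record names are hypothetical. It shows that a second task makes no progress while the long loop runs without yield points, and that adding them leaves the loop's result unchanged.

```python
import asyncio

async def build_hash(records, pattern, yield_every=None):
    """Toy stand-in for a long matching loop (hypothetical, not kernel code)."""
    selected = set()
    for i, name in enumerate(records, start=1):
        if pattern in name:
            selected.add(name)
        # Cooperative yield point, analogous in spirit to cond_resched().
        if yield_every and i % yield_every == 0:
            await asyncio.sleep(0)
    return selected

async def heartbeat(counter, stop):
    """Another task that only makes progress when the loop above yields."""
    while not stop.is_set():
        counter["ticks"] += 1
        await asyncio.sleep(0)

async def run(yield_every):
    records = [f"func_{i}" for i in range(10_000)]
    counter = {"ticks": 0}
    stop = asyncio.Event()
    task = asyncio.create_task(heartbeat(counter, stop))
    selected = await build_hash(records, "func_99", yield_every)
    ticks_during_loop = counter["ticks"]  # heartbeat progress while the loop ran
    stop.set()
    await task
    return len(selected), ticks_during_loop

if __name__ == "__main__":
    print(asyncio.run(run(None)))    # no yield points: heartbeat is starved
    print(asyncio.run(run(1_000)))   # periodic yields: heartbeat gets to run
```

Both runs select the same set of names; only the heartbeat's tick count differs. The real cond_resched() differs in important ways (it yields only when the scheduler has flagged a pending reschedule), so this is an illustration of the idea rather than a model of the kernel's behaviour.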
Impact vector and scope
- Attack vector: local. The vulnerability requires local access—an attacker able to run code or trigger workloads on the host could cause the condition to arise. Most advisories classify the vector as local with low privileges required in practice.
- Security properties: confidentiality and integrity impact are none or negligible; availability impact is high because the result is a softlockup or severe unresponsiveness that effectively denies access to the affected system.
- Practical exploitation: there is no evidence this bug yields privilege escalation or remote code execution. The realistic risk is denial‑of‑service on machines where an attacker can either run tracing workloads or cause the kernel to walk very large symbol/function sets (for example by repeated module load/unload cycles or by enabling heavy tracing on expansive kernels).
Timeline and vendor reactions
- Upstream disclosure and patch: the Linux CVE announcement and kernel patch were posted and linked in the public CVE announcement on 20 May 2025. That announcement summarizes the change as “Add cond_resched() to ftrace_graph_set_hash()”.
- Distribution advisories: major vendors and distributions integrated the fix into their kernel updates and security notes; examples include Amazon Linux ALAS advisories, SUSE security notes, and Debian/Ubuntu package updates that reference the same root cause and remediation.
- Follow‑on fixes: the same class of issues in related ftrace code paths prompted further hardening later in the year (for example fixes addressing ftrace_module_enable and other paths that could produce RCU stalls or softlockups). These follow‑ups show the kernel community treated this as part of a category of CPU‑hogging / preemption‑sensitive bugs.
Verified technical facts and discrepancies
- Disclosure date: 20 May 2025. This is consistently stated in the upstream CVE announcement and many vendor advisories.
- Fix description: insertion of cond_resched() into the main loop inside ftrace_graph_set_hash(). Upstream commit references are consistently cited in public advisories, though some direct commit pages require kernel‑site access and distribution pages rehost or summarize the patch.
- CVSS and severity: vendors differ. For example, SUSE reports CVSSv3 = 5.5 (AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H), while Amazon Linux’s advisory listed a CVSSv3 value of 7.0 for their packaging context—differences stem from vendor scoring and affected‑package context. Treat per‑vendor CVSS values as vendor‑specific assessments rather than absolute truth.
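To see how the 5.5 figure follows from the vector SUSE published, the base score can be recomputed from the CVSS v3.1 base-score equations. The sketch below is a minimal calculator covering base metrics only; the function names are illustrative, and temporal and environmental metrics are ignored.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}

def roundup(value):
    """Round up to one decimal place as defined by CVSS v3.1."""
    i = round(value * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0

def base_score(vector):
    """Compute the base score for a vector such as 'AV:L/AC:L/...'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))

print(base_score("AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # prints 5.5
```

With confidentiality and integrity set to none, the entire impact term comes from A:H, which is why the score sits in the medium band despite the severe availability effect.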
Why this matters to admins and operators
Real‑world scenarios
- Shared servers and multi‑tenant hosts: on machines where untrusted users have shell access (shared hosting, university labs, CI runners), a local user could craft workloads that cause repeated expensive ftrace processing, causing severe system unresponsiveness for other tenants.
- Development and test rigs using heavy tracing: engineers debugging kernel or driver subsystems often enable broad tracing; without the fix, time‑consuming ftrace operations on large symbol sets could halt responsiveness and disrupt CI pipelines or tests.
- Module‑heavy systems: systems loading large driver stacks (GPU drivers, virtualized environments with many modules) are more likely to hold many traceable functions, increasing the chance the loop runs long enough to trigger a softlockup or RCU‑related stalls. Subsequent related fixes explicitly mention modules like amdgpu as one observed trigger.
The attack surface is local but the impact is severe
Although remote exploitation is not the described avenue, the availability loss is immediate and dramatic: an attacker who can run processes or trigger kernel operations locally can repeatedly force the condition. In production environments where local execution privileges exist for many users (or where containers share kernel access), this becomes an operational risk.
Patching, mitigation and hardening: practical guidance
Below are prioritized, actionable steps for operations, with an emphasis on minimal disruption and fast risk reduction.
Immediate recommendations (short term)
- Patch kernels as soon as reasonably possible. Apply vendor security updates that include the fix for CVE‑2025‑37940; consult your distribution’s security advisory and kernel package updates before scheduling reboots. Prioritize multi‑tenant and tracing‑heavy systems.
- Limit local unprivileged access. Audit who can run unprivileged workloads or load modules. Reduce the number of users with shell access on critical hosts and enforce least privilege.
- Disable ftrace or restrict tracing interfaces where not needed. If your environment doesn’t rely on kernel function tracing, consider disabling or restricting access to ftrace interfaces (for example, by controlling access to debugfs entries). This is a stopgap and not a substitute for patching.
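As a quick check on the interface-restriction step, the sketch below looks for tracing control files that are writable by anyone other than their owner. It assumes the usual mount points (/sys/kernel/tracing and the legacy /sys/kernel/debug/tracing); the file list is an illustrative selection, and the helper names are hypothetical. Files that are missing or unreadable are skipped silently.

```python
import os
import stat

# Usual tracefs mount points; the second is the legacy debugfs location.
TRACING_ROOTS = ("/sys/kernel/tracing", "/sys/kernel/debug/tracing")
# Illustrative selection of filter control files; they exist only when the
# kernel is built with the corresponding tracing options.
CONTROL_FILES = ("set_graph_function", "set_graph_notrace", "set_ftrace_filter")

def loosely_permissioned(path):
    """Return True if group or others have write permission on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))

def audit(roots=TRACING_ROOTS, names=CONTROL_FILES):
    """List control files that are writable by someone other than the owner."""
    findings = []
    for root in roots:
        for name in names:
            path = os.path.join(root, name)
            try:
                if loosely_permissioned(path):
                    findings.append(path)
            except OSError:
                # Missing file, unmounted tracefs, or no permission to stat.
                continue
    return findings

if __name__ == "__main__":
    for path in audit():
        print("writable by non-owner:", path)
```

An empty result only means the mode bits look tight from where the script runs; it does not account for capabilities, ACLs, or tracing front ends that run with elevated privileges.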
Operational mitigations (medium term)
- Harden container and sandbox configurations. Use kernel namespaces, seccomp, and other isolation tools to prevent containerized users from invoking heavy kernel tracing paths. Prevent unprivileged module loading via module signing, kernel lockdown, or distribution policies.
- Monitor for symptoms. Watch the kernel log for softlockup watchdog messages (typically of the form “BUG: soft lockup - CPU#N stuck for Ns!”) and RCU stall warnings. Configure alerting on repeated kernel watchdog events and establish automated remediation (for example, automated service fencing or host reboot after an unrecoverable softlockup).
- Apply configuration guardrails. Limit the use of global ftrace captures in production; prefer targeted trace sessions, restrict which users and groups may perform system‑wide tracing, and document safe tracing procedures.
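For the container-hardening item above, one simple audit is to check whether a workload holds capabilities that permit module loading or broad administrative access. The sketch below decodes the CapEff mask found in /proc/<pid>/status; the bit numbers are the Linux values for CAP_SYS_MODULE and CAP_SYS_ADMIN, while the function names and the choice of which capabilities to flag are assumptions made for illustration.

```python
# Linux capability bit numbers (from the kernel's capability definitions).
CAP_SYS_MODULE = 16   # load and unload kernel modules
CAP_SYS_ADMIN = 21    # broad administrative operations

FLAGGED = {"CAP_SYS_MODULE": CAP_SYS_MODULE, "CAP_SYS_ADMIN": CAP_SYS_ADMIN}

def parse_cap_eff(status_text):
    """Extract the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")

def risky_capabilities(mask):
    """Return the flagged capability names present in the mask, sorted."""
    return sorted(name for name, bit in FLAGGED.items() if mask & (1 << bit))

def current_risky_capabilities(path="/proc/self/status"):
    """Convenience wrapper for the calling process (Linux only)."""
    with open(path) as fh:
        return risky_capabilities(parse_cap_eff(fh.read()))
```

Calling current_risky_capabilities() inside a container gives a quick view of what that workload could do; an empty list is the expected result for an unprivileged container.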
Long‑term remediation
- Standardize on a safe kernel update cadence that includes preflight testing in a staging environment.
- Include ftrace and other debug subsystems in security‑critical configuration reviews; decide where tracing should be enabled by policy versus ad hoc access.
- For organizations with custom kernels (RT, PREEMPT_VOLUNTARY, or heavily patched drivers), maintain a process to backport comparable fixes (e.g., adding cond_resched() in equivalent loops) and to test RCU / preemption interactions under real workloads. Subsequent CVEs in the ftrace family show these issues can appear across related code paths and kernel versions.
Analysis: strengths of the fix, and potential residual risks
Strengths
- The fix is focused and minimal: adding cond_resched() inside a long loop is a conservative, well‑understood mitigation that reduces the chance the kernel becomes unresponsive without altering the logic of ftrace. That conservatism reduces the risk of unintended side effects. Upstream maintainers adopted this pattern in several related locations, which is a sign of a coherent mitigation strategy across the trace subsystem.
- No indication of privilege escalation: the vulnerability is availability‑only, meaning it does not directly allow attackers to gain higher privileges or execute arbitrary code, which narrows the security impact profile.
Residual risks and caveats
- Local attack remains the realistic vector: on systems where untrusted users can execute tasks or create workloads that force large symbol walks, the availability impact is real until kernels are patched. Tightening local execution controls may be required in the interim.
- Distribution patching lag and varying priorities: vendors score and prioritize fixes differently (example: SUSE’s scoring vs Amazon’s CVSS value), and some vendors may opt not to backport fixes for older or long‑term distributions depending on their policy. That means some deployed fleets will remain vulnerable longer and operators must track vendor advisories tightly.
- The class of bugs is broader than a one‑line patch: subsequent advisories and CVEs in the ftrace area show related code paths (module enable paths, kallsyms_lookup loops) are also susceptible to long‑running non‑preemptible work. Developers and operators should treat this as a category of preemption‑sensitive bugs that needs ongoing maintenance rather than as an isolated bug.
For kernel developers and security reviewers: technical considerations
- Use of cond_resched() must be safe: developers should ensure that the loops where cond_resched() is inserted are not in atomic or RCU critical sections that explicitly disallow rescheduling. The kernel already contains patterns and helper functions to help developers locate safe yield points; maintainers should follow established conventions and review similar loops.
- Be mindful of RCU and disabled preemption: the related fixes and advisories reference RCU critical sections and PREEMPT_VOLUNTARY kernels; long kallsyms lookups inside RCU/disabled preemption contexts can still produce stalls even if cond_resched() is present elsewhere. A holistic review of how symbol lookups, module loading, and trace registration interact is prudent.
- Backporting: vendors and integrators backport kernel fixes differently. When backporting, ensure tests replicate the workloads that trigger long ftrace_graph_set_hash runs (large symbol sets, module load sequences) so the ported change behaves as expected.
Detection and incident response playbook
- Indicators to look for:
  - Kernel log lines referencing softlockup or “CPU stuck” messages.
  - Sudden spikes in systemwide latency, failing heartbeats, or long scheduling latencies across cores during tracing or module operations.
  - Reproducible hangs during targeted tracing runs or module loads for drivers with many symbols.
- Response steps after detection:
  - Identify processes and operations that triggered the condition (tracing sessions, module loads).
  - If feasible, stop the offending trace session or unload the driver (note: unloading a driver may itself be problematic on some systems; proceed with caution).
  - Migrate affected workloads to other hosts and schedule a patch/reboot cycle.
  - Apply kernel updates and retest in a staging environment to confirm the fix removes the symptom.
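The log-based indicators above can be checked mechanically. The sketch below counts softlockup reports per CPU and RCU stall warnings in a stream of kernel log lines. The regular expressions reflect message formats commonly printed by recent kernels and should be treated as assumptions, since exact wording varies between versions; the function name is hypothetical.

```python
import re
from collections import Counter

# Assumed message formats; adjust for the kernel versions in your fleet.
SOFT_LOCKUP = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!")
RCU_STALL = re.compile(r"INFO: \w+ (?:self-)?detected stalls? on CPU")

def scan(lines):
    """Return (softlockup count per CPU, number of RCU stall warnings)."""
    lockups = Counter()
    rcu_stalls = 0
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            lockups[int(match.group(1))] += 1
        elif RCU_STALL.search(line):
            rcu_stalls += 1
    return lockups, rcu_stalls
```

Feeding it the lines from dmesg or journalctl -k output gives a per-CPU tally that can drive alerting; repeated hits on the same CPU during a tracing session or module load are the pattern described above.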
Practical Q&A — common operator questions
- Q: Is this a remote exploit?
- A: No. The issue is local. It becomes remotely reachable only if an attacker already has remote code execution on the host or can otherwise cause local workloads that walk the ftrace structures. Treat it as a local denial‑of‑service risk.
- Q: Can containers be used to exploit this on shared hosts?
- A: Yes; containers share the host kernel, and an untrusted container user with the ability to invoke tracing or heavy module interactions can contribute to the condition. Container isolation alone does not prevent kernel‑level availability issues. Harden container capabilities and restrict debug/tracing interfaces.
- Q: Will adding cond_resched() change ftrace behaviour or accuracy?
- A: No: cond_resched() only yields CPU time; it does not change the tracing semantics or the logic of the hash construction. It simply prevents long non‑preemptible stretches that block the scheduler.
Conclusion
CVE‑2025‑37940 is a focused but meaningful availability issue in the ftrace subsystem: the missing cond_resched() inside ftrace_graph_set_hash() could allow long kernel loops to trigger softlockup watchdogs and render systems unresponsive under heavy tracing or large symbol sets. The upstream fix is minimal and appropriate, but the operational risk is real for multi‑tenant and tracing‑heavy environments until kernels are patched and local access is constrained. Administrators should prioritize vendor kernel updates, harden who can run tracing operations, and monitor for softlockup indicators. Developers should treat this as a reminder that long kernel loops require explicit scheduling cooperation, and distributions should ensure consistent backporting and testing to protect production fleets.