CVE-2026-23379 ETS Offload Bug: 32-bit Overflow Causes Divide-by-Zero Panic

ChatGPT · Mar 26, 2026

CVE-2026-23379, a recent fix in the Linux kernel's packet-scheduler code, is a reminder that even “small” arithmetic mistakes in kernel offload code can have outsized consequences. The flaw sits in the ETS traffic scheduler path, where the kernel computes weighted round-robin parameters for hardware offload, and a plain 32-bit integer overflow can push the code into a divide-by-zero panic. In practical terms, a configuration whose quanta sum past what a 32-bit integer can hold can turn a networking feature into a system-crashing bug at exactly the wrong time. The upstream fix is conceptually simple (switch q_sum and q_psum to 64-bit integers), but the operational lesson is bigger: offload paths are not just performance accelerators, they are part of the security boundary.

Background

The Linux networking stack has always had a complicated relationship with hardware offload. Features such as ETS (Enhanced Transmission Selection) exist to let the kernel translate traffic-control policy into something NIC hardware can enforce more efficiently. That translation step is where software abstractions meet device-specific math, and that is exactly where assumptions tend to break. A bug in the control plane may not be glamorous, but it can still be catastrophic because it affects how the system decides who gets bandwidth, when packets are scheduled, and whether a configuration is even valid in the first place.
At a high level, ETS is meant to allocate link capacity across classes using weighted scheduling. The kernel converts quanta into WRR-style weights, then hands those weights to the offload path. That sounds straightforward until the implementation starts averaging sums of quanta that can grow larger than an unsigned int can safely hold. Once the arithmetic wraps, the scheduler can no longer trust the denominator it is about to divide by, and a divide-by-zero crash becomes possible. The patch description in the CVE record is blunt about that failure mode, and the embedded kernel splat shows the bug surfaced in ets_offload_change() while processing a tc command.
What makes this vulnerability notable is not only the crash, but the environment in which it appears. Kernel traffic-control code is often treated as infrastructure plumbing, yet that plumbing is increasingly exposed to automation, orchestration systems, and vendor-specific offload logic. In enterprise environments, the offload path can be touched by provisioning tools, config-management pipelines, and network-policy engines, meaning a flaw here can affect more than one appliance model or one class of host. That makes a scheduler bug feel less like an edge case and more like a systemic reliability risk.
The CVE also arrived with a familiar modern kernel-security pattern: a concise fix backed by a concrete crash trace, then a set of upstream stable references indicating the patch had already moved through the kernel’s maintenance machinery. That matters because the Linux stable process is often the real bridge between a code fix and practical remediation. In other words, this is not just an academic bug report; it is a patch-ready issue that downstream vendors can carry quickly.

Overview

The ETS offload path is a good example of why kernel security has become so tightly linked to data representation. The code is not trying to do exotic cryptography or parse untrusted user content. It is doing math on scheduler parameters, and the failure happens because the chosen integer type is too small for the possible range of values. That is a classic kernel bug pattern: the system is logically correct in intent, but its numeric assumptions are undersized for the real-world inputs it can receive.
This is also a reminder that “divide by zero” in kernel space is not a mere exception. In user space, a bad arithmetic path may kill a process. In kernel space, it can trigger a full machine panic, which is a much higher operational cost. The trace in the CVE description shows the fault occurred during a tc operation, which means a privileged configuration action was enough to reach the failure. That is a serious reliability issue even if the bug does not map neatly to remote exploitation.
The fix itself is notable because it changes the width of the accumulator rather than redesigning the scheduler. In practice, that is often the right call when the underlying model is sound and the problem is simply that intermediate values can exceed the container type. Using 64-bit integers for q_sum and q_psum preserves the intended logic while making overflow far less likely. The patch description explicitly identifies that change as the remedy.
There is also a broader pattern here that Linux maintainers know well: offload code often inherits the risks of both worlds. It must be correct enough for the kernel’s abstract policy model, but also robust enough to survive hardware acceleration paths, vendor extensions, and driver expectations. When a scheduler function becomes the point where arithmetic, policy, and hardware semantics all intersect, even a single narrow type choice can become a system-wide problem.

Why offload paths are especially fragile

Offload code tends to be less frequently exercised than the most common networking paths, which makes hidden assumptions more dangerous. A feature may work fine in ordinary lab tests and still fail under larger quanta, unusual class layouts, or a device that maps scheduler intent differently than the developer expected. That is why offload bugs so often show up in the long tail of production usage rather than during initial validation.

Why the CVE matters even without a CVSS score

The record indicates NVD had not yet assigned a CVSS base score at the time of publication. That does not make the issue minor; it simply means the classification work had not been completed yet. For operators, the absence of a score is not a reason to ignore the CVE. It is a reason to review whether the affected code exists in their kernels and whether the fix has been backported by their vendor.

The Arithmetic Failure Mode

At the heart of the bug is a very old kind of mistake: arithmetic overflow in a place where the code later assumes the result is safe to divide. The description says offloading ETS requires computing each class’s WRR weight by averaging over the sums of quanta as q_sum and q_psum. If those sums are stored in the same type as the individual quanta, the totals can wrap around, and once the denominator collapses to zero, the kernel hits a divide-by-zero path.
That kind of failure is easy to underestimate because the original values are not obviously malicious. They are just scheduler quanta, which makes the bug feel like a logic error rather than a security bug. But kernel bugs do not need attacker-controlled shellcode to be operationally dangerous. A bad mathematical assumption in privileged code can be enough to crash a host, interrupt traffic, or destabilize a networking appliance at scale.
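
To make the wraparound concrete, here is a minimal Python sketch. It is a model of 32-bit unsigned accumulation, not the kernel source: the helper sum_u32, the mask constant, and the two quanta values are assumptions chosen for illustration. Python integers do not overflow, so the mask stands in for the behavior of a C unsigned int.

```python
U32_MASK = 0xFFFFFFFF  # models a 32-bit unsigned accumulator


def sum_u32(quanta):
    """Add quanta the way a 32-bit unsigned accumulator would (modulo 2**32)."""
    total = 0
    for q in quanta:
        total = (total + q) & U32_MASK
    return total


# Hypothetical quanta: each fits in 32 bits, but together they total exactly 2**32.
quanta = [0x80000000, 0x80000000]

true_sum = sum(quanta)          # Python integers do not wrap
wrapped_sum = sum_u32(quanta)   # what a 32-bit accumulator would hold

print(true_sum)     # 4294967296
print(wrapped_sum)  # 0

try:
    share = quanta[0] * 100 // wrapped_sum
except ZeroDivisionError:
    # Python raises a catchable exception; kernel code faults instead.
    share = None
```

Two quanta that are individually valid are enough to zero the denominator, and nothing about either value looks wrong on its own.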

Overflow in a control-plane calculation

The problem is especially subtle because the arithmetic is not the end goal; it is an intermediate step used to derive weights. The sums are internal to the policy translation rather than values a user sets or sees directly, so they attract less scrutiny than the inputs themselves. Such intermediate calculations are where developers often reach for the smallest type that “seems sufficient,” only to discover later that real-world inputs are much larger or more varied than expected.
The crash trace in the record shows the faulting instruction occurred in the offload change path while handling tc. That matters because it demonstrates the bug can be reached during normal traffic-control administration rather than only through an obscure driver path. In kernel terms, that widens the significance from theoretical arithmetic bug to practical system reliability issue.

The flaw is caused by overflow-prone accumulation.
The resulting value can become zero unexpectedly.
The division happens in the offload path, not just in a benign helper.
The crash is triggered during traffic-control configuration.
The kernel panic is a full-system stability event, not a process-local error.

Why 64-bit integers are the correct fix

The patch note says to fix the problem using 64-bit integers for q_sum and q_psum. That is the cleanest solution because it increases the headroom of the intermediate totals without altering the scheduling model itself. It is the kind of fix kernel maintainers prefer when the logic is correct but the data type is not wide enough for the workload.
Just as importantly, this kind of fix limits collateral damage. If the team had rewritten the weighting logic more broadly, the risk of regression would be much higher. By changing the accumulator width, the patch keeps the computational intent intact and reduces the chance of introducing a new scheduling bug in the process. That is a small-patch, large-payoff situation, which is often the hallmark of a good kernel security fix.
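
The effect of widening the accumulators can be sketched in Python. This is a simplified model, assuming each weight is the difference between successive cumulative percentages of the quanta and that all quanta are positive. The function name wrr_weights, the bits parameter, and the sample quanta are hypothetical, and the code is not taken from the kernel.

```python
def wrr_weights(quanta, bits):
    """Derive percentage weights using accumulators of the given bit width.

    Simplified model: every intermediate value is truncated to `bits` bits,
    mimicking an unsigned accumulator of that width.
    """
    mask = (1 << bits) - 1

    q_sum = 0
    for q in quanta:
        q_sum = (q_sum + q) & mask

    weights = []
    q_psum = 0   # running (partial) sum of quanta
    w_prev = 0   # cumulative percentage reached by the previous class
    for q in quanta:
        q_psum = (q_psum + q) & mask
        w_psum = ((q_psum * 100) & mask) // q_sum  # fails if q_sum wrapped to 0
        weights.append(w_psum - w_prev)
        w_prev = w_psum
    return weights


small = [1500, 3000, 4500]
large = [0x80000000, 0x80000000]

print(wrr_weights(small, 32))  # [16, 34, 50]
print(wrr_weights(small, 64))  # [16, 34, 50]
print(wrr_weights(large, 64))  # [50, 50]
# wrr_weights(large, 32) raises ZeroDivisionError in this model
```

For ordinary inputs the two widths agree, which is the point: widening the accumulators leaves the scheduling result unchanged and only removes the wraparound.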

The Offload Path and Why It Matters

Hardware offload is attractive because it can reduce CPU load and improve throughput, but it also creates a second implementation surface for bugs. A software scheduler is already tricky; once the kernel starts translating that logic for hardware consumption, the number of assumptions multiplies. ETS is supposed to make quality-of-service behavior more deterministic, yet a divide-by-zero in the conversion layer does the opposite.
That is why offload bugs are often more than “just bugs.” They affect the contract between the kernel and the NIC driver, and that contract is part of the system’s trust boundary. If the kernel computes a weight incorrectly, the hardware may be asked to enforce a nonsensical policy, and the failure may appear far from the original arithmetic mistake. In that sense, the bug lives in a translation layer where simple assumptions can have cascading effects.

Offload is not a shortcut, it is another execution environment

Administrators sometimes treat offload as a performance feature that can be enabled once and forgotten. In reality, offload pushes logic into a more constrained and less forgiving environment, where the kernel must present values that the NIC and driver can safely consume. If the upstream math is wrong, the hardware path may magnify the mistake rather than contain it.
The CVE record’s wording makes clear that this is about the offload path specifically, not the entire ETS subsystem. That distinction matters because it helps narrow the remediation surface. Systems that use ETS only in software may not hit the bug in the same way, while environments that rely on offload are squarely in the risk zone. That is the sort of nuance security teams need during triage.

Offload introduces additional validation complexity.
Hardware paths can amplify small arithmetic mistakes.
The bug is in the translation layer between policy and device behavior.
Software-only usage and offloaded usage may have different exposure profiles.
A kernel panic in this layer can interrupt network policy enforcement.

What the Crash Trace Tells Us

The supplied oops is valuable because it turns an abstract bug report into a concrete failure path. The crash occurred in ets_offload_change+0x11f/0x290, which places the failure squarely inside the offload configuration logic. The stack trace then walks through ets_qdisc_change, qdisc_create, and the tc netlink plumbing, showing the bug is reachable through normal user-facing traffic-control operations performed with sufficient privileges.
That is important for two reasons. First, it confirms the bug is not hypothetical. Second, it shows the failure happens during administrative configuration, which means outages could occur during routine deployment or policy updates rather than during some rare corner case. A system can be stable for months and still fall over the moment an operator pushes a new qdisc layout.

Reading the stack in plain English

The stack trace tells a simple story: tc asks the kernel to modify a qdisc, the ETS code enters its change path, and the offload arithmetic blows up. That sequence is enough to show the bug is tied to a state transition rather than to data-plane packet handling. The distinction matters because it suggests the exploitability question is not “can an attacker send packets?” but rather “can an attacker or misconfigured automation induce a malformed scheduler state?”
The trace also underlines how dangerous kernel divide-by-zero faults are. In user space, the operating system can kill a process and move on. In kernel space, the same arithmetic failure can trigger a panic, especially if it occurs in a path that does not expect to recover from impossible values. That is why the CVE reads like a stability issue even though it is still security-relevant.

The fault is reachable through tc-driven configuration.
The crash lands in the ETS offload change path.
The trace shows the issue was triggered in practice, not merely inferred from code review.
The failure is in privileged kernel code.
The impact is a kernel panic, not a soft error.

Why the trace strengthens the case for a CVE

A CVE is easier to justify when a bug has a demonstrable crash signature, a narrow fix, and a known subsystem owner. This record has all three. The trace shows exactly where the divide-by-zero occurs, the fix is small and specific, and the problem is squarely in Linux kernel networking. That combination makes the vulnerability much easier for vendors and operators to assess than a vague “may crash under some conditions” report.

Enterprise Impact and Operational Risk

For enterprise teams, this CVE is best understood as a reliability and control-plane problem first and a security advisory second. If an organization uses ETS offload in production, a bad configuration rollout could trigger a kernel panic on hosts that should otherwise have been able to handle a routine quality-of-service change. That kind of failure has real costs: service interruption, incident escalation, failover churn, and postmortem work that might have been avoided by a prompt patch.
The risk profile is especially relevant for infrastructure teams managing clusters, network appliances, or specialized Linux deployments. These environments often use automated policy systems that assume kernel networking features are stable once validated. A divide-by-zero in the offload path breaks that assumption and forces operators to treat traffic-control changes as a higher-risk activity until the fix is in place.

Consumer systems vs enterprise systems

Consumer desktops are less likely to encounter this exact path, mostly because ETS offload configuration is not a common everyday action for most users. But consumer risk is not zero, especially on specialized systems, labs, or hobbyist setups where advanced networking features are enabled. Enterprise systems, by contrast, are more exposed because they are the ones most likely to use traffic shaping, policy enforcement, and hardware offload at scale.
That difference changes prioritization. A home user may never see the bug; a datacenter operator may see it during a maintenance window. The same flaw can therefore rank low in one environment and high in another, which is why administrators should anchor remediation to actual feature use rather than just headline severity.

Validate whether ETS offload is enabled in your fleet.
Check whether kernel-based traffic shaping is part of your service model.
Review automation that applies qdisc changes.
Coordinate patching with maintenance windows.
Confirm vendor backports instead of relying on mainline-only version checks.

Why mixed fleets complicate response

Modern enterprises rarely run a single kernel lineage. They may have cloud images, appliances, edge boxes, and vendor-supported distros all at once. That means a fix for one kernel tree does not guarantee protection everywhere, and a CVE can remain visible in vulnerability scans even after part of the fleet has been updated. The operational challenge is not just patching; it is patch verification across multiple packaging models.

How This Fits the Linux Security Pattern

This CVE fits a recurring Linux pattern: the most interesting bugs are often not the ones that look dramatic at first glance. Instead, they come from type choices, state transitions, or assumptions about “reasonable” input ranges. Those bugs are common because kernel code is full of performance-sensitive arithmetic, and performance-sensitive code tends to avoid unnecessary overhead until reality proves the shortcut too small.
Kernel maintainers are increasingly good at catching these issues early because of sanitizer coverage, code review discipline, and a mature stable pipeline. The CVE description shows that the problem was identified clearly enough to generate a reproducible panic trace and a targeted fix. That is a sign of health in the ecosystem, even if the bug itself is unpleasant.

Small type, big consequences

One of the enduring lessons of kernel development is that integer width matters. A 32-bit value may be perfectly adequate for one field and completely inadequate for an aggregate calculation. Once a calculation is used to derive policy or weights, the margin for error shrinks fast. This CVE is a textbook example of how “works in most cases” is not a good enough standard for privileged code.
It also shows why maintainers prefer fixes that align the data type with the actual computation. If the code needs the sum of multiple quanta, the accumulator should be large enough to hold that sum even when the inputs are unexpectedly large. Anything less is a latent outage waiting for the right configuration.
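
One way to sanity-check an accumulator's width is to compute the worst case directly. The sketch below is illustrative: accumulator_bits is a hypothetical helper, and both the 16-band limit and the scale factor of 100 (for percentage weights) are assumptions rather than values taken from the CVE record.

```python
def accumulator_bits(n_terms, term_bits, scale=1):
    """Smallest unsigned width, in bits, that holds scale * (sum of n_terms maximal terms)."""
    worst_case = n_terms * ((1 << term_bits) - 1) * scale
    return worst_case.bit_length()


MAX_BANDS = 16  # assumption: upper bound on the number of ETS bands

# Sum of 16 maximal 32-bit quanta, then multiplied by 100.
bits_needed = accumulator_bits(MAX_BANDS, term_bits=32, scale=100)

print(bits_needed)        # 43
print(bits_needed > 32)   # True: a 32-bit accumulator is too narrow
print(bits_needed <= 64)  # True: a 64-bit accumulator has headroom
```

Under those assumptions the worst case needs 43 bits, which is why a 32-bit accumulator can wrap while a 64-bit one cannot.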

Kernel security often fails through ordinary arithmetic.
The dangerous part is usually the accumulator, not the input field.
Stable fixes are strongest when they are surgical.
The most tractable bugs are often those with reproducible crashes.
Sanitizers and review processes are critical for early detection.

The broader lesson for maintainers

This vulnerability reinforces a principle that applies far beyond ETS: if the logic is approximate, the implementation still has to be exact about its arithmetic. Offload code, timing code, and scheduler code often operate on heuristics, but heuristics do not excuse overflow. In fact, the more approximate the policy, the more carefully the underlying math must be guarded.

Strengths and Opportunities

The good news is that this CVE appears to have a narrow fault surface, a clear fix, and a strong chance of quick backporting. That combination is the best-case scenario for a kernel bug: contained, understandable, and easy for downstream vendors to fold into maintenance releases. It is also a useful reminder that even obscure networking code benefits from the same defensive discipline as high-profile attack surfaces.

Minimal patch surface: the fix changes accumulator width rather than redesigning ETS.
Clear root cause: the bug is tied to overflow leading to divide-by-zero.
Actionable trace: the panic trace makes reproduction and triage straightforward.
Good fit for stable trees: the fix is the sort that downstream maintainers can backport cleanly.
Better robustness: 64-bit arithmetic reduces the chance of wraparound in the offload path.
Operational value: the patch should improve stability for systems that use qdisc offload.
Preventive lesson: the issue encourages broader auditing of similar scheduler math.

Why this is a good example of mature kernel maintenance

This is a case where the upstream response looks exactly as it should: identify the arithmetic defect, confirm the crash, apply the narrowest effective fix, and let the stable pipeline do its work. That is how Linux reduces risk without overengineering the remedy. For enterprise operators, that predictability is as important as the patch itself.

Risks and Concerns

The primary concern is that this kind of bug can be easy to dismiss as “just a crash” when it is really a sign of deeper arithmetic fragility in a privileged subsystem. If operators assume the issue is too niche to matter, they may leave offload-enabled kernels exposed longer than they should. That is especially risky in environments where configuration changes are automated and happen more often than human operators realize.
Another concern is that network-control bugs often blur the line between reliability and security. A crash during policy enforcement may not look like a classical exploit, but in the real world it can still mean denial of service, service interruption, or a broken failover chain. The kernel does not care whether the failure was “security” or “stability”; the host still goes down.

What could go wrong downstream

The most likely downstream risk is uneven backporting. One vendor may carry the fix quickly, another may fold it into a larger kernel maintenance update, and a third may not mark the advisory clearly enough for customers to connect the dots. That fragmentation is common in Linux security, and it is why version checks alone are not enough.
A second risk is that users may not know whether ETS offload is enabled in their stack. Many environments rely on layered tooling, so the actual feature state may be buried in driver settings or provisioning templates. If administrators cannot tell whether they are exposed, they cannot prioritize the patch correctly.

Potential for under-prioritization because the issue is not flashy.
Risk of uneven vendor backports across distributions.
Possibility of hidden feature exposure in automated deployments.
Chance of service disruption during routine network changes.
Ongoing need to audit adjacent scheduler arithmetic for similar bugs.

Why representation bugs tend to linger

Type-width mistakes survive code review because they look harmless in small tests and because the code often “works” until it doesn’t. That makes them hard to spot and easy to rationalize away. Yet in kernel code, a harmless-looking representation choice can be the difference between a stable deployment and a panic on a production host.

Looking Ahead

The most immediate thing to watch is whether downstream vendors have already incorporated the 64-bit accumulator fix into stable kernel streams. Because the vulnerability has a concrete crash trace and a straightforward patch, it is the sort of issue that should move quickly through vendor backport channels. The key question for administrators is not whether the bug is real; it is whether their exact kernel build has already absorbed the correction.
A second thing to watch is whether maintainers audit neighboring traffic-control paths for similar integer-width assumptions. Once one offload calculation has been shown to overflow in a destructive way, it becomes sensible to review other weight, quota, and averaging code in the same area. That kind of follow-on hygiene often yields more stability than the original bug fix alone.

Practical next steps for operators

Confirm whether your Linux kernel uses ETS offload in production.
Check vendor advisories for the exact build that includes the 64-bit fix.
Validate any automated tc/qdisc rollout scripts against patched kernels.
Monitor for backported security notes that mention CVE-2026-23379 by name.
Treat the issue as a stability-critical update if your systems rely on traffic shaping.
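
For teams that want a guard in their rollout tooling while patched kernels are being deployed, a pre-flight check along these lines is one option. It is a sketch under assumptions, not a substitute for the fix: quanta_fit_u32 is a hypothetical helper, the scale factor of 100 is assumed, and the check is deliberately conservative.

```python
U32_MAX = 0xFFFFFFFF


def quanta_fit_u32(quanta, scale=100):
    """Return True if scale * sum(quanta) still fits in a 32-bit unsigned integer."""
    return sum(quanta) * scale <= U32_MAX


# Hypothetical configurations a rollout script might be about to apply.
typical = [1514, 1514, 1514]
oversized = [0x80000000, 0x80000000]

print(quanta_fit_u32(typical))    # True
print(quanta_fit_u32(oversized))  # False
```

A script could refuse to push any ETS configuration for which the check returns False and flag it for review instead.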

What to expect from the ecosystem

It would not be surprising to see this CVE remain low-key in public severity scoring while still being treated seriously by maintainers and vendors. That is often how kernel bugs of this class behave: they are not headline-grabbing, but they are real, reproducible, and worth fixing promptly. The best outcome is the boring one — quiet backports, no regressions, and no further crashes.
In the end, CVE-2026-23379 is a reminder that kernel hardening is often about the unglamorous details: the size of an integer, the safety of a denominator, and the difference between a valid average and a catastrophic zero. The fix is small, but the lesson is large. When privileged code does math on behalf of hardware, every assumption has to survive production reality, not just the happy path in a test VM.

Source: NVD / Linux Kernel Security Update Guide - Microsoft Security Response Center
