CVE-2026-31494: macb ethtool Stats OOB Write Due to Queue Count Mismatch

A newly published Linux kernel vulnerability in the macb Ethernet driver is a reminder that even small accounting mistakes in networking code can become memory-safety bugs. CVE-2026-31494 covers an out-of-bounds write in gem_get_ethtool_stats, where the driver copies statistics for the maximum queue count instead of the number of currently active queues. The result, as the KASAN report shows, is a vmalloc-out-of-bounds write when the active queue count is lower than MACB_MAX_QUEUES. The fix is straightforward but important: make the copied size match the current queue topology, not the theoretical maximum.

[Figure: diagram of memory allocation buffers marked “OUT OF BOUNDS”, with the dev_ethtool / ioctl / memcpy flow.]

Background

The macb driver family is part of the Linux networking stack used on Cadence-based Ethernet hardware, including deployments common in embedded systems and boards such as Raspberry Pi–class devices. That matters because the bug sits in a code path that is not exotic or theoretical; it is part of ordinary network administration tooling through ethtool, which many operators use to inspect link state and driver statistics. The CVE description explicitly ties the crash to dev_ethtool and an ethtool invocation, showing this is a management-plane issue rather than a packet-parser edge case.
What makes this bug notable is that the driver already has one function that counts stats correctly. gem_get_sset_count calculates the number of statistics using the active queue count, while gem_get_ethtool_stats copies data using the maximum possible queue count. That mismatch between reserved memory and written memory is the heart of the vulnerability. In kernel terms, that is exactly the kind of error that can evade notice until a runtime environment exposes a configuration narrower than the developer assumed.
The KASAN splat in the advisory is also informative because it confirms the bug is not a vague logic issue. It is a concrete memory violation, and the trace shows a __asan_memcpy call inside gem_get_ethtool_stats writing 760 bytes into a one-page vmalloc region that had been sized for the actual stat array. That kind of failure is what turns a bookkeeping bug into a security concern.
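To put the 760-byte figure in perspective: ethtool hands drivers an array of 64-bit counters, so the reported write corresponds to 95 entries. The short sketch below works through that arithmetic. The allocation sizes it tries are hypothetical, since the advisory text quoted here does not say how many entries the buffer actually held.

```python
U64_BYTES = 8  # ethtool statistics are arrays of 64-bit counters


def entries_written(write_bytes: int) -> int:
    """Convert a memcpy length in bytes into a number of u64 counters."""
    if write_bytes % U64_BYTES:
        raise ValueError("length is not a whole number of u64 counters")
    return write_bytes // U64_BYTES


def overrun_bytes(write_bytes: int, allocated_entries: int) -> int:
    """Bytes written past the end of a buffer holding `allocated_entries` counters."""
    return max(0, write_bytes - allocated_entries * U64_BYTES)


if __name__ == "__main__":
    print("counters in the reported write:", entries_written(760))
    # Hypothetical allocation sizes; the real value is not given in the advisory.
    for allocated in (95, 80, 60):
        print(allocated, "entries allocated ->", overrun_bytes(760, allocated), "bytes past the end")
```

Any allocation smaller than 95 counters leaves part of that write outside the buffer, which is the condition KASAN flagged.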
This is a useful example of how Linux CVEs often arise in the control plane rather than the data plane. ethtool is not flashy, but it is deeply trusted by administrators and automation systems. When a driver gets its size math wrong in response to a routine query, the blast radius can include crashes, service disruption, or a kernel panic in environments that rely on the interface for uptime.

Overview​

At a high level, the issue is simple: the driver’s statistics layout depends on how many queues are actually active, but one of the retrieval routines ignored that fact. The kernel had already done the right thing when reserving and counting stats, yet the copy path still assumed the maximum queue count. When the active queue count is smaller, the code writes beyond the reserved buffer and corrupts memory.
That distinction between “configured maximum” and “current active state” is critical. Hardware may support several queues, but a given boot, board, or configuration may only enable a subset. If the code does not consistently use the active number, the logic diverges: one path allocates for one size, another path writes for a larger size, and the system ends up with a mismatch that KASAN can catch but production systems may only experience as a crash or memory corruption.
The published record also indicates the fix was pulled into stable kernel references from kernel.org, which is a strong clue that maintainers considered the bug suitable for backporting rather than just an upstream cleanup. The advisory includes multiple stable commit references, which is the usual sign that the issue has already been addressed in several maintained branches. In Linux security terms, that suggests the problem is not niche enough to ignore, even if exploitation appears limited to local, admin-triggered paths.
The timing matters too. The CVE was received and published on April 22, 2026, and the public record was immediately associated with kernel.org references. That means defenders should treat this as a live patch-management item rather than a theoretical advisory.

Root Cause

The kernel description says gem_get_sset_count computes statistics based on the active queue count, but gem_get_ethtool_stats copies using the maximum number of queues. That is the mismatch that causes the out-of-bounds write when active queues are fewer than the maximum supported by the hardware or driver configuration. In other words, the allocation logic and the write logic were not using the same model of the world.
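The mismatch can be modelled in a few lines. This is a user-space sketch in Python, not the driver's C code. The constants (`MAX_QUEUES`, `GLOBAL_STATS`, `PER_QUEUE_STATS`) and the helper names are assumptions chosen for illustration and may not match the real driver.

```python
# Hypothetical constants for illustration; the real driver's values may differ.
MAX_QUEUES = 8        # stand-in for MACB_MAX_QUEUES
GLOBAL_STATS = 40     # number of device-wide counters (assumed)
PER_QUEUE_STATS = 10  # number of counters per queue (assumed)


def sset_count(active_queues: int) -> int:
    """Models the counting path: sized from the active queue count."""
    return GLOBAL_STATS + PER_QUEUE_STATS * active_queues


def prefix_copy_len() -> int:
    """Models the pre-fix copy path: sized from the maximum queue count."""
    return GLOBAL_STATS + PER_QUEUE_STATS * MAX_QUEUES


def entries_past_end(active_queues: int) -> int:
    """Counters the pre-fix copy would write beyond the allocated buffer."""
    return max(0, prefix_copy_len() - sset_count(active_queues))


if __name__ == "__main__":
    for q in range(1, MAX_QUEUES + 1):
        print(f"active={q}: allocated={sset_count(q)} "
              f"copied={prefix_copy_len()} past_end={entries_past_end(q)}")
```

In this model the overrun disappears only when every queue is active, which is why the bug can stay hidden on hardware that enables the full queue set.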

Why this is more than a simple off-by-one

This is not a classic arithmetic mistake. It is a state-model mismatch. The code is simultaneously aware of the active queue count and blind to it, depending on which function is running. That kind of inconsistency is often more dangerous than a raw boundary error because it survives code review more easily: each individual function can look reasonable in isolation.

The KASAN trace is especially telling because it reports a vmalloc-out-of-bounds write, not a read. Writes are more dangerous because they can poison adjacent kernel data structures or metadata. Even if the bug is only triggered during a stats query, the damage occurs in the kernel’s memory space, where corruption can have consequences far beyond one ioctl call.
A second important point is that the write happens in response to a user-space ethtool invocation. That does not mean the bug is remotely exploitable in the usual sense, but it does mean the trigger is a normal administrative operation. Bugs reachable through standard management interfaces tend to get prioritized because they are realistic, reproducible, and difficult to fence off operationally.

What the crash report proves​

The advisory provides a full KASAN stack trace showing the fault in gem_get_ethtool_stats, called via dev_ethtool, dev_ioctl, sock_ioctl, and the ARM64 syscall path. That trace is valuable because it proves the bug is not speculative. It also shows the write occurs during a standard kernel-user interaction, which raises confidence that the patch addresses a real failure rather than a hypothetical one.
The memory layout details reinforce that conclusion. The write lands just beyond the vmalloc region reserved for the statistics buffer, and the report identifies the exact page and allocation path associated with the overrun. In practical terms, that means the size mismatch is deterministic enough to be caught in testing when KASAN is enabled, but still dangerous enough to matter in production kernels without such instrumentation.

How the Fix Works

The remedy is conceptually small: make gem_get_ethtool_stats honor the active queue count rather than blindly assuming the maximum queue count. The CVE text says the fix ensures the copied size only considers the active number of queues, which aligns the copy operation with the count logic already used elsewhere in the driver.

Why alignment between count and copy matters​

In kernel code, counting and copying must be treated as a paired contract. If one function computes a buffer size based on active queues and another function writes based on maximum queues, the contract is broken. That is how a safe buffer becomes unsafe even though both functions may appear individually correct.
The better design is to derive both the allocation and the copy from the same topology source. That makes the code easier to reason about and much harder to misuse during future refactors. It also reduces the chance of regressions when queue counts change dynamically across platforms or device revisions.
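One way to picture the "same topology source" idea is a single object that owns the length calculation, with both the allocation and the copy asking it for the size. This is a Python model of the design principle, not the actual patch. `Topology`, `allocate_stats`, `copy_stats`, and the constants are hypothetical names and values.

```python
from dataclasses import dataclass
from typing import List

GLOBAL_STATS = 40     # device-wide counters (assumed value)
PER_QUEUE_STATS = 10  # counters per queue (assumed value)


@dataclass(frozen=True)
class Topology:
    """Single source of truth for the current queue layout."""
    active_queues: int

    def stats_len(self) -> int:
        return GLOBAL_STATS + PER_QUEUE_STATS * self.active_queues


def allocate_stats(topo: Topology) -> List[int]:
    """Allocation path: buffer sized from the topology."""
    return [0] * topo.stats_len()


def copy_stats(topo: Topology, source: List[int], dest: List[int]) -> int:
    """Copy path: length taken from the same topology, never from a maximum."""
    n = topo.stats_len()
    if len(source) < n or len(dest) < n:
        raise ValueError("buffer smaller than the stats layout")
    dest[:n] = source[:n]
    return n


if __name__ == "__main__":
    topo = Topology(active_queues=2)
    driver_side = list(range(GLOBAL_STATS + PER_QUEUE_STATS * 8))  # sized for 8 queues
    user_side = allocate_stats(topo)
    print(copy_stats(topo, driver_side, user_side), len(user_side))
```

Because both paths call `stats_len()`, a future change to the layout cannot update one side without the other.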
The stable references listed in the advisory suggest the fix has already been propagated across maintained branches. That is reassuring, but it does not eliminate the need for downstream verification. Distributions often backport fixes in ways that preserve their local version numbers, so operators should not rely on version strings alone.

Why this kind of patch appeals to maintainers

The Linux stable process tends to favor narrow, obvious fixes that do not introduce broad behavioral changes. A patch that simply reduces the copied length to the active queue count fits that profile well. It corrects the defect without changing the driver’s external behavior beyond eliminating the overflow.
That is a big deal in networking code. If a patch touched queue enumeration, ethtool formatting, and multiple unrelated code paths, the regression risk would be higher. Here, the correction is tightly scoped: it makes the write match the already-established count. In kernel maintenance, small and precise is usually the safest shape for a security fix.

Technical Impact​

The immediate impact of the bug is memory corruption in kernel space. Depending on the build, platform, and kernel hardening settings, that can manifest as a panic, an oops, a silent overwrite, or an instrumented crash under KASAN. Because the write size in the report is substantial, this is not a trivial overrun that merely bumps a counter by a byte or two.
The larger technical concern is that statistics code tends to be treated as low risk, which means it can slip past defensive scrutiny. Developers often focus on packet parsing, DMA setup, and interrupt handling as the obvious danger zones. But any path that iterates over queues, descriptors, or dynamically sized stats tables can be just as dangerous if the size model is wrong.

Why statistics paths deserve attention​

Statistics are often assumed to be read-only and harmless. That assumption is only valid if the data source and the output size are kept in lockstep. When they are not, the stats path becomes a write primitive into kernel memory. That is why this CVE is significant even though the visible action is just an ethtool query.
For embedded and appliance environments, the stakes can be even higher. Those systems often run specialized kernels, custom drivers, or long-lived firmware branches. A bug in a hardware-specific network driver may persist for a long time before a vendor backport catches up, especially if the device only enables a subset of supported queues in the field.
The fact that the issue is tied to the macb family also matters because that driver is widely used in platforms where firmware and kernel integration are tightly coupled. The more specific the hardware stack, the more important it is that driver assumptions match actual deployment state, not just maximum theoretical capability.

What administrators should infer​

Administrators should read this CVE as a warning about control-plane trust. If a driver’s statistics path can overrun memory, then even a routine maintenance command can become a crash trigger. That is especially relevant in automation-heavy environments where health checks and telemetry systems query interfaces frequently.
It also means testing should not be limited to traffic forwarding. Management operations, especially ethtool reads and driver diagnostics, belong in the validation matrix. A kernel network driver can be functionally correct under packet load and still fail under stats queries if the bookkeeping is inconsistent.
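A validation matrix for this class of bug can be as simple as checking, for every queue count the hardware might expose, that the copy length never exceeds the allocated length. The sketch below does this against a Python model. The three length functions and their constants are hypothetical stand-ins for driver behaviour, not measurements from a real kernel.

```python
from typing import Callable, List

MAX_QUEUES = 8        # assumed stand-in for MACB_MAX_QUEUES
GLOBAL_STATS = 40     # assumed
PER_QUEUE_STATS = 10  # assumed


def allocated_len(active_queues: int) -> int:
    """Entries reserved by the counting path."""
    return GLOBAL_STATS + PER_QUEUE_STATS * active_queues


def prefix_copy_len(active_queues: int) -> int:
    """Entries written before the fix: ignores the active count."""
    return GLOBAL_STATS + PER_QUEUE_STATS * MAX_QUEUES


def fixed_copy_len(active_queues: int) -> int:
    """Entries written after the fix: follows the active count."""
    return GLOBAL_STATS + PER_QUEUE_STATS * active_queues


def find_overruns(count_fn: Callable[[int], int],
                  copy_len_fn: Callable[[int], int],
                  max_queues: int) -> List[int]:
    """Return every active-queue count for which the copy exceeds the allocation."""
    return [q for q in range(1, max_queues + 1) if copy_len_fn(q) > count_fn(q)]


if __name__ == "__main__":
    print("pre-fix overruns at:", find_overruns(allocated_len, prefix_copy_len, MAX_QUEUES))
    print("post-fix overruns at:", find_overruns(allocated_len, fixed_copy_len, MAX_QUEUES))
```

On real hardware the equivalent test is to run the stats query under a KASAN-enabled kernel for each supported queue configuration, not only the default one.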

Enterprise Exposure​

The enterprise impact is likely concentrated in systems that rely on macb-based hardware, especially embedded appliances, industrial boards, and specialized ARM deployments. Those platforms may not be common in desktop fleets, but they are common in edge infrastructure where uptime matters and physical access is limited. In those environments, a kernel crash caused by a stats query can be operationally expensive.

Why the risk is not evenly distributed​

Not every Linux system is equally exposed. Desktop users are less likely to notice this bug unless they happen to run affected hardware and exercise the exact ethtool path. By contrast, appliance vendors, telecom operators, and embedded integrators are much more likely to encounter it because they often query driver stats automatically for monitoring, telemetry, or watchdog purposes.
This is also a classic example of how hardware specificity affects vulnerability relevance. The code path is not universal, but where it is present, it can be repeatedly reached. That makes it especially important for fleet managers to inventory devices by driver family, not just by kernel version.
The Microsoft Security Update Guide listing is another clue that enterprise visibility matters here. Microsoft’s vulnerability portal often surfaces Linux CVEs for administrators who track mixed estates from a central dashboard, and that broadens the audience for remediation even when Windows is not directly affected.

Operational checklist​

  • Inventory systems that use Cadence macb/gem-family network drivers.
  • Confirm whether ethtool or similar telemetry jobs run automatically.
  • Check whether your distribution has already backported the fix.
  • Validate kernels in staging before rolling them into production.
  • Treat driver-statistics queries as possible crash triggers on older builds.
  • Prioritize systems that serve as gateways, routers, or management nodes.
  • Verify vendor firmware images, not just upstream kernel releases.
That checklist is especially important because edge and appliance operators often rely on vendor images rather than raw upstream kernels. A fix can exist upstream while the shipped image still carries the vulnerable implementation for weeks or months.
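The first checklist item can be approximated on a running Linux host by reading the driver symlink that sysfs exposes for each network interface. The sketch below assumes the driver is bound under the name `macb`; that name, and the helper functions themselves, are assumptions to adjust for your platform.

```python
import os
from typing import List, Optional


def driver_of(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[str]:
    """Return the bound driver name for an interface, or None if unavailable."""
    link = os.path.join(sysfs_root, iface, "device", "driver")
    try:
        return os.path.basename(os.readlink(link))
    except OSError:
        return None  # virtual interface, or no driver symlink present


def interfaces_using(driver: str, sysfs_root: str = "/sys/class/net") -> List[str]:
    """List interfaces whose driver symlink resolves to `driver`."""
    try:
        names = sorted(os.listdir(sysfs_root))
    except OSError:
        return []  # not a Linux host, or sysfs is not mounted
    return [name for name in names if driver_of(name, sysfs_root) == driver]


if __name__ == "__main__":
    # "macb" is the assumed driver name; adjust if your platform differs.
    for name in interfaces_using("macb"):
        print(name)
```

An empty result only says that no interface is currently bound to that driver name on this host; it says nothing about whether the kernel image contains the vulnerable code.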

Consumer and Embedded Exposure​

For ordinary consumer desktops and laptops, the practical risk is probably low. Most users do not interact with low-level network driver statistics, and many do not run hardware that exposes this code path at all. Still, low probability is not the same as no risk, especially if the kernel is reused in broader embedded or hobbyist environments.

Where this bug is most likely to matter​

Embedded Linux systems are the most obvious exposure point. Home routers, industrial controllers, SBC-based projects, and custom appliances may all use the macb driver or a derivative of it. In those cases, a management action that seems routine—querying interface stats—can trigger a kernel fault.
This is why consumer impact often looks different from embedded impact. On a desktop, a bad ethtool query might never be issued. On an appliance, stats collection is frequently automated and continuous. That changes the risk profile dramatically, because the bug is not just theoretically reachable; it becomes part of routine operations.

Why embedded systems are harder to patch​

Embedded products often have longer patch lifecycles and heavier vendor dependence. An upstream fix may be obvious and available, but the device image in the field may lag because the vendor has to certify a full firmware release. That delay can leave a known vulnerability exposed longer than it would be on a desktop distribution with rapid updates.
Another concern is that embedded deployments frequently suppress crashes in ways that make root-cause analysis slower. A stats query that wedges a board or trips a watchdog may not immediately be recognized as a driver memory bug. That makes timely backporting and validation especially important.
The most practical advice for consumers and small-office admins is simple: if you are running a device class that vendors describe as network-aware, managed, or appliance-like, do not assume this bug is irrelevant just because it is “kernel-only.” Kernel-only vulnerabilities are often exactly the ones that matter to devices that depend on the kernel for everything.

Relationship to the Linux CVE Process​

This CVE also illustrates how Linux security disclosure works in 2026. The kernel community assigns CVEs once a fix exists and has been applied to a stable tree, which is why the published record often follows the patch rather than preceding it by long periods. That model is different from disclosure programs that emphasize early CVE assignment before remediation.

Why the stable-tree references matter​

The record includes multiple stable references from kernel.org, which indicates that the fix has already been considered suitable for backporting across supported branches. That is one of the best signals defenders can get that an issue is real and actionable. It also means security teams should look for backported distro patches, not just upstream commit IDs.
The kernel documentation explains that CVEs are assigned as part of the stable release process after a fix is available and queued for supported releases, which helps explain why the public record can appear “late” relative to the actual patch work. In practice, that means administrators should monitor both CVE listings and distribution advisories, because the publication timing may not line up perfectly with patch availability.

What this means for patch management​

The main lesson is that a CVE entry is not the whole story. The actual remediation may already be in your distro’s kernel package, or it may be waiting in a vendor backport. The safest move is to verify package contents against the vendor changelog or source patch metadata.
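As a rough aid to that verification, the sketch below scans changelog text for a CVE identifier. It assumes you have already exported the kernel package changelog to a string using your distribution's tooling. The function name and the sample changelog are hypothetical.

```python
import re
from typing import List


def lines_mentioning(changelog_text: str, cve_id: str) -> List[str]:
    """Return changelog lines that mention the given CVE ID (case-insensitive)."""
    pattern = re.compile(re.escape(cve_id), re.IGNORECASE)
    return [line.strip() for line in changelog_text.splitlines() if pattern.search(line)]


if __name__ == "__main__":
    # Hypothetical changelog excerpt, for illustration only.
    sample = (
        "* kernel update\n"
        "- net: macb: fix ethtool stats copy length (CVE-2026-31494)\n"
        "- unrelated change\n"
    )
    for hit in lines_mentioning(sample, "CVE-2026-31494"):
        print(hit)
```

A match is a useful hint, but an empty result is not proof of exposure: some vendors reference the upstream commit rather than the CVE ID, so the source patch metadata remains the authoritative check.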
This also reinforces a broader best practice: treat management-plane vulnerabilities with the same seriousness as packet-path bugs. If a stats call can overrun memory, it belongs in your patch queue even if it doesn’t look remotely exploitable at first glance.

Comparison With Other Kernel Bugs​

CVE-2026-31494 belongs to a familiar class of Linux driver vulnerabilities: a size mismatch between what the kernel allocates and what it later copies. That pattern shows up across many subsystems because the kernel often has to translate abstract hardware capabilities into concrete memory layouts. The danger is that one code path uses the current state while another uses the maximum state, and the two never quite agree.

Why this class of bug keeps recurring​

Drivers evolve over time. Queue counts change, new hardware variants arrive, and old assumptions stay embedded in helper functions. It is easy for one helper to be updated to the new model while another remains bound to the old one. That is especially true in network drivers, where queueing models are central to performance and scalability.
The good news is that this kind of bug is usually fixable with a disciplined audit. The challenge is making sure every function that depends on queue topology uses the same source of truth. Once that happens, the driver becomes more maintainable and the risk of future size mismatches drops.
This CVE also sits alongside a broader set of 2026 Linux networking fixes that emphasize state consistency over cleverness. The theme across these issues is consistent: when kernel code assumes a fixed topology, but the runtime state is variable, memory safety problems follow. The result is often a small patch with a large defensive payoff.

What makes macb different​

What sets this bug apart is that the vulnerable path is stats reporting, not the transmit or receive fast path. That can make it easier to overlook, but it also makes the bug a useful case study. It proves that “non-critical” paths in the driver are still security-relevant because they sit inside trusted kernel boundaries.
A stats path also tends to be executed in diagnostic and monitoring workflows, which may happen on a schedule or during incident response. That means the trigger can be both routine and repeatable. In a production environment, that combination is enough to make a bug operationally serious even without a remote exploitation story.

Strengths and Opportunities​

The good news is that this issue looks well understood, narrowly scoped, and already fixed in stable references. That gives operators a clear remediation path and gives maintainers a model example of how to tighten size handling in driver code.
  • The bug has a clear root cause: mismatched queue counts in stats allocation and copy logic.
  • The fix is small and surgical, which lowers regression risk.
  • The crash was confirmed by KASAN, so the failure mode is well evidenced.
  • The vulnerability is in a management path, making it easier to test and verify.
  • Stable references already exist, which should help downstream backporting.
  • The issue creates an opportunity to audit similar queue-based stat code in other drivers.
  • The patch reinforces a useful design rule: keep counting and copying tied to the same source of truth.
The broader opportunity is architectural. Driver maintainers can use this case to review other helper pairs where one function counts based on active state and another writes based on capability maximums. Those are exactly the places where latent out-of-bounds writes like to hide.

Risks and Concerns​

The biggest risk is that this kind of bug is easy to dismiss as “just stats.” That framing is misleading because the write happens in kernel memory, and the trigger is a normal admin command. On systems that rely on the affected driver, that can turn a routine query into an outage.
  • The write occurs in kernel space, so corruption can be serious.
  • The trigger is a legitimate management command, not a rare exploit chain.
  • Systems with fewer active queues are more likely to expose the mismatch.
  • Embedded and appliance devices may patch more slowly than desktops.
  • Monitoring automation can repeatedly hit the vulnerable path.
  • Backport lag can leave fleets exposed even after an upstream fix exists.
  • The bug may be underappreciated because the feature looks low risk at first glance.
There is also the usual operational concern that a fixed upstream tree does not guarantee a fixed downstream image. Vendors may have to re-spin firmware, and organizations may need to wait for maintenance windows. That is why immediate inventory and package verification matter more than relying on headline status alone.

Looking Ahead​

The immediate next step is straightforward: vendors and downstream distributions should ensure the fix is present in all supported kernels that carry the macb driver. Fleet operators should verify whether their hardware even uses that stack, and if it does, they should confirm the patched build is in place before the next monitoring or maintenance cycle.

What to watch next​

  • Backport announcements from Linux distributions and appliance vendors.
  • Kernel release notes that mention macb or gem_get_ethtool_stats.
  • Firmware updates for ARM and embedded networking devices.
  • Any follow-up fixes in adjacent queue-statistics code.
  • NVD enrichment updates, including eventual severity scoring.
  • Vendor advisories that map the CVE to specific product families.
In a larger sense, this CVE is a reminder that the most dangerous bugs in mature kernel code are often the ones that look boring on paper. A stats function that copies too much data is easy to underestimate. But once a write crosses the buffer boundary in kernel space, the difference between a harmless bug and a security issue disappears quickly.
The real lesson of CVE-2026-31494 is not just that macb needed a queue-count fix. It is that kernel code must treat active state as the only state that matters when sizing memory. When the driver’s accounting and its copy logic diverge, the kernel stops being a careful manager of hardware state and becomes a writer past its own boundary. The patch closes that gap, but the architectural lesson will outlast this one CVE.

Source: NVD / Linux Kernel Security Update Guide - Microsoft Security Response Center
 
