CVE-2026-23320 Fixes USB Gadget f_ncm net_device Lifetime Bug

ChatGPT · Thursday at 4:07 AM

The Linux kernel’s CVE-2026-23320 is a reminder that some security issues are less about dramatic code execution and more about getting object lifetimes exactly right. In this case, the vulnerable path sits in usb: gadget: f_ncm, where the net_device was being created too early and destroyed too late, leaving it out of sync with the USB gadget’s bind/unbind lifecycle. That mismatch can produce dangling sysfs links and, in the worst case, a NULL pointer dereference when the gadget disappears but the network interface still thinks its parent exists. The upstream fix moves allocation into ncm_bind() and cleanup into ncm_unbind(), which is the kind of structural correction that often looks small in a patch but is large in operational effect.

Overview

USB gadget drivers live at a tricky intersection of device emulation, network plumbing, and user-controlled configuration. The Linux kernel documentation makes the core model clear: gadget functions are bound to a usb_configuration, and bind() is the point where resources are expected to be allocated for the active device state, while unbind() is where they should be torn down again. That lifecycle is not just administrative bookkeeping; it is the boundary that determines whether kernel objects remain valid when the physical or virtual USB connection changes.
The NCM function is especially sensitive because it exposes a network interface to the host while also supporting pre-bind configuration through configfs. That means users may want to set the interface name, MAC address, or other parameters before the function is actually connected. The vulnerability report explains that the old code allocated the net_device in ncm_alloc_inst() and freed it in ncm_free_inst(), tying it to the configuration instance rather than the active USB connection. When the gadget disconnected, the device node could outlive its parent gadget and continue to reference memory that no longer represented a live USB device.
This is the sort of bug that kernel maintainers have learned to treat seriously because it crosses subsystem boundaries. It is not merely a USB issue or merely a networking issue; it is a lifecycle coupling issue. Once a network interface can outlive the device that owns it, sysfs and rtnetlink code can end up traversing stale pointers, and the failure mode becomes much easier to trigger from ordinary management activity than from exotic exploit logic. The stack trace in the vulnerability description shows the crash surfacing during normal link-state handling through rtnetlink, which makes the bug feel less theoretical and more like a real-world reliability hazard.
There is also a broader pattern here. Kernel bugs that involve object lifetime are often patchable with a relatively modest refactor, but that refactor tends to expose adjacent assumptions. In this case, the fix does not just move allocation and deallocation; it also caches user-selected settings in f_ncm_opts using gether_opts, so pre-bind options survive until the device actually exists. That is a very Linux-kernel solution: preserve flexibility for user space, but ensure that the live object exists only while its owner is present.

What Changed in the NCM Function

The essential fix is straightforward: create the net_device when the USB function is bound, and destroy it when the function is unbound. The CVE description says exactly that, noting that ncm_bind() should allocate the device and ncm_unbind() should free it, rather than leaving the interface tied to configuration-instance lifetime. That change aligns the network interface with the moment it is actually usable, not the moment a configfs object was prepared.
That sounds simple, but it is a major design correction. When a USB gadget is disconnected, the underlying gadget device is removed, and any net_device still attached to it risks becoming a zombie object. In kernel terms, zombies are dangerous because they can still be discovered by sysfs, referenced by network management paths, or walked during notification flows even after their backing state has been torn down. The report’s example of a symlink under /sys/class/net/ncm0 pointing at a missing gadget path illustrates the mismatch vividly.
The change also preserves pre-bind configuration behavior. Because configfs users may set the interface name or MAC address before binding occurs, the fix moves those settings into cached option storage and reapplies them when the net_device is created. This is a subtle but important distinction: the patch does not sacrifice configurability in the name of safety. Instead, it defers instantiation until the object can be made valid, then copies the desired settings into the new live object.
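The lifecycle described above can be sketched as a small user-space model. This is a toy Python illustration, not kernel code: the names `NcmOpts`, `NetDevice`, `bind`, and `unbind` are hypothetical stand-ins for the roles the CVE description assigns to f_ncm_opts, the net_device, ncm_bind(), and ncm_unbind(), and the default values are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetDevice:
    """Hypothetical stand-in for the live net_device; exists only while bound."""
    ifname: str
    mac: str


@dataclass
class NcmOpts:
    """Hypothetical stand-in for f_ncm_opts: caches user intent, owns no live device."""
    ifname: str = "usb0"                 # assumed default, for illustration only
    mac: str = "00:00:00:00:00:00"       # assumed default, for illustration only
    net: Optional[NetDevice] = None      # set only between bind() and unbind()

    def bind(self) -> NetDevice:
        # Create the live device at bind time and apply the cached settings.
        self.net = NetDevice(ifname=self.ifname, mac=self.mac)
        return self.net

    def unbind(self) -> None:
        # Destroy the live device at unbind time; the cached settings remain.
        self.net = None


opts = NcmOpts()
opts.ifname = "ncm0"                     # configured before any device exists
opts.mac = "02:00:00:00:00:01"
assert opts.net is None                  # nothing live yet
dev = opts.bind()
assert (dev.ifname, dev.mac) == ("ncm0", "02:00:00:00:00:01")
opts.unbind()
assert opts.net is None and opts.ifname == "ncm0"
```

The point of the model is the separation: the options object outlives any number of bind/unbind cycles, while the device object never exists outside one.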

Why this matters for kernel design

A lot of kernel engineering is about deciding where an object’s true lifetime begins. In the old model, the NCM interface lived as long as the configuration instance, which was convenient but incorrect once USB connection state changed. In the new model, the interface lives only while the function is bound, which is awkward in a few preconfiguration paths but much safer over the full device lifecycle. That is a classic tradeoff in kernel work: convenience at setup time versus correctness at teardown time.

The live network device now matches the active USB binding.
Cached configfs options survive long enough to be applied later.
Dangling sysfs references are removed at the source.
The parent gadget no longer disappears before its child interface.
The patch reduces state mismatch between USB and networking subsystems.

The bigger lesson is that object ownership matters as much as object allocation. Kernel bugs often begin when code assumes two lifetimes are the same just because they usually overlap. CVE-2026-23320 shows what happens when that assumption fails.

Why the Crash Path Is Not Just a Cosmetic Bug

The vulnerability description includes a crash trace rooted in rtnl_newlink() and dev_change_flags(), eventually reaching a NULL dereference during __pi_strlen. That is a strong signal that this was not just an annoying sysfs inconsistency. The network stack was trying to format or inspect interface information after the gadget’s backing device had already vanished. In other words, the network object was still being treated as legitimate even though its parent had been destroyed.
That matters because crashes in management paths are often easier to reach than crashes in data paths. A user or daemon toggling flags, probing interfaces, or refreshing link state does not need to send crafted packets or trigger unusual USB traffic. Instead, the system can stumble into the bug through routine administrative behavior. When a kernel bug is reachable through ordinary state transitions, its practical impact is usually broader than the raw code path suggests.
The dangling sysfs symlink example is equally telling. Sysfs is supposed to reflect live kernel object relationships, and a broken link under /sys/class/net/ncm0 to a missing gadget path means the kernel’s object graph has become inconsistent. That kind of inconsistency can confuse monitoring tools, management daemons, container runtimes, and any automation that depends on stable device topology. If the object graph itself lies, the rest of the stack can make bad decisions on top of it.
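One way to look for the broken-link symptom is a small scan of the class directory. The following is a minimal sketch rather than an official diagnostic: the function name `find_dangling_links` is hypothetical, and because the text does not pin down exactly which link dangles, it checks both each entry in the directory and the links one level below it.

```python
import os
from typing import List


def find_dangling_links(class_dir: str = "/sys/class/net") -> List[str]:
    """Return entries (or their immediate children) that are symlinks to missing targets."""
    dangling: List[str] = []
    if not os.path.isdir(class_dir):
        return dangling
    for name in sorted(os.listdir(class_dir)):
        path = os.path.join(class_dir, name)
        # islink() inspects the link itself; exists() follows it to the target.
        if os.path.islink(path) and not os.path.exists(path):
            dangling.append(name)
            continue
        if os.path.isdir(path):
            for child in sorted(os.listdir(path)):
                child_path = os.path.join(path, child)
                if os.path.islink(child_path) and not os.path.exists(child_path):
                    dangling.append(f"{name}/{child}")
    return dangling
```

Calling `find_dangling_links()` with no arguments scans /sys/class/net; on a healthy system it would be expected to return an empty list.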

Reliability and security overlap

Kernel security teams increasingly treat reliability failures as security issues when they expose use-after-free, NULL dereference, or stale object access patterns. That is not because every crash is a vulnerability, but because object lifetime mistakes are often the same class of bug that later become exploitable. Even when the immediate result is “only” a denial of service, the root cause can still matter for defensive hardening. That is especially true in subsystems like USB gadget networking, where the boundary between host, device, and sysfs is already complex.

Management operations can hit the bug without special tooling.
User-visible crashes are likely during disconnect or reconfiguration.
Broken sysfs links indicate the kernel’s object graph is inconsistent.
The flaw sits close to the line between instability and memory safety.
Correcting lifecycle handling reduces the chance of adjacent bugs.

This is why the patch description emphasizes preserving an earlier use-after-free fix from commit 6334b8e4553c. The maintainers are clearly trying not to solve one lifetime problem by reintroducing another, which is exactly the right instinct in this part of the kernel.

How ConfigFS Co

One of the more interesting aspects of CVE-2026-23320 is the need to support pre-bind configuration. Configfs is powerful because it lets administrators prepare gadget functions before they are connected, which is useful for naming, addressing, and layout decisions. But that same flexibility can tempt developers into creating live kernel objects too early, just so those objects are available for configuration. The vulnerability shows why that temptation is dangerous.
The fix uses f_ncm_opts and gether_opts to cache user-selected settings, which decouples configuration intent from object existence. That is the right architectural move because it lets the kernel remember what the user wants without pretending the live network device already exists. Once ncm_bind() runs, those cached options can be applied to the actual device in one coherent step.
This is also a good example of how kernel maintainers balance user experience and correctness. If the patch had simply blocked pre-bind configuration, it would have been safer but less usable. Instead, the code was reshaped so pre-bind setup still works, but only as deferred state. In effect, configfs remains expressive while the driver becomes more disciplined about when it materializes the network interface.

The hidden cost of early allocation

Early allocation often feels harmless because it makes later code simpler. But in the kernel, early allocation is usually a promise that the object can survive every phase that follows. If that promise is false, the code eventually pays the price in teardown, disconnect, or error recovery. CVE-2026-23320 is a textbook case of that problem.

Pre-bind settings must be cached, not forced into a non-live object.
Live resources should appear only when the function is actually bound.
Teardown paths need to own the deletion of visible kernel interfaces.
Deferred materialization prevents stale references from outliving parents.
The fix preserves flexibility without preserving the bug.

The important takeaway is that configuration state and object state are not the same thing. They may be related, but kernel code should never rely on them sharing a lifetime unless that is explicitly guaranteed.

The Role of `gether_disconnect()` and State Checks

The patch also adds checks around opts->net in ncm_set_alt() and ncm_disable() so that gether_disconnect() runs only if a connection was actually established. That detail matters because teardown paths are often where latent assumptions explode. If a disconnect routine runs against an object that was never fully connected, it can perform invalid cleanup or touch state that never existed.
This is exactly the kind of nuance that separates a safe lifecycle fix from a partial one. The problem was not just “the object lives too long”; it was also “disconnect logic assumes a connection exists.” Those two issues reinforce each other, because a lingering net_device makes it easier for later code to believe a live connection still exists, even after the gadget disappeared. The new check introduces a simple but essential gate: only disconnect what was actually connected.
In a subsystem this layered, that sort of guard is not redundant. USB gadget, Ethernet emulation, configfs, and rtnetlink each have their own expectations about state transitions. The fix is making those expectations line up. That alignment is the real security improvement, not just the relocation of the allocation call.
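The gate described here, disconnecting only what was actually connected, can be illustrated with a short user-space model. This is a hedged sketch and not the driver's code: `NcmFunction`, `Connection`, and the `log` list are hypothetical, and the `net` attribute only plays the role the text assigns to opts->net.

```python
from typing import List, Optional


class Connection:
    """Hypothetical stand-in for the live link that gether_disconnect() would tear down."""


class NcmFunction:
    def __init__(self) -> None:
        self.net: Optional[Connection] = None   # models opts->net in this sketch
        self.log: List[str] = []

    def connect(self) -> None:
        self.net = Connection()
        self.log.append("connect")

    def disable(self) -> None:
        # Guard: skip the disconnect step if no connection was ever established.
        if self.net is None:
            self.log.append("disable: nothing to disconnect")
            return
        self.log.append("disconnect")
        self.net = None


fn = NcmFunction()
fn.disable()          # safe even though connect() never ran
fn.connect()
fn.disable()          # tears down the connection exactly once
fn.disable()          # a repeated disable is also a no-op
```

Without the guard, the first and third `disable()` calls would run teardown against state that does not exist, which is the failure pattern the paragraph above describes.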

Small guard, big consequence

A single pointer test can be the difference between safe cleanup and a cascade of bogus teardown steps. The CVE text suggests exactly that: the driver must not assume the network side was initialized just because the config object exists. In kernel work, one incorrect assumption often matters more than a dozen correct lines elsewhere.

opts->net becomes a truth test for actual connection state.
gether_disconnect() is now constrained to valid sessions.
Error recovery is less likely to touch already-freed gadget state.
Teardown logic becomes more defensive without becoming brittle.
The fix reduces the chance of follow-on crashes during disable paths.

That kind of defensive branching may look mundane, but it is often the last line of defense between an orderly disconnect and a kernel fault.

Historical Context: Why USB Gadget Bugs Keep Reappearing

USB gadget code has a long history of lifecycle bugs because it simulates device behavior rather than simply consuming it. Unlike host-side drivers, gadget functions have to mirror how a device appears, binds, configures, disconnects, and disappears from the host’s perspective. The kernel documentation explicitly frames binding and unbinding as the moments when resources should be tied to a gadget driver’s active lifecycle, which is why the fix in CVE-2026-23320 reads like a correction back to first principles.
That historical pattern matters because bugs often recur where abstractions are leaky. Gadget functions that expose networking, mass storage, or communications classes must coordinate with the USB composite framework and with subsystem-specific state. Each layer can be correct individually and still produce a bad lifetime story in combination. When the object graph crosses subsystems, the chance of stale references increases dramatically.
The NCM function is not unique in facing this challenge. Any gadget function that exposes a persistent kernel-visible object has to make the same decision: does the object belong to the configuration, or to the active connection? The answer is almost always the latter if the object is meant to represent real device availability. CVE-2026-23320 is simply a very clear example of what happens when that principle is violated.

Lifecycle discipline as a security feature

This bug also reinforces a broader lesson about kernel hardening: lifetime correctness is security. If a device node can survive after its parent object has been destroyed, the gap between “inactive” and “invalid” becomes exploitable territory. Even without a proven exploit chain, that is enough reason for maintainers to fix it aggressively.

USB gadget state should mirror connection state, not config preparation state.
Device visibility in sysfs should match real object ownership.
Disconnect paths need to assume the parent may already be gone.
Network interface objects should not be immortal by accident.
Bind/unbind are the proper anchors for visible runtime resources.

The upstream fix is therefore not just a patch for a crash. It is a cleanup of the architectural contract between USB and networking subsystems.

Enterprise and Embedded Impact

For desktop users, CVE-2026-23320 may seem niche. Most people are not routinely using USB gadget networking on a workstation, and the bug only matters if the relevant kernel function is built and actively used. But that perception changes fast in embedded and enterprise environments, where USB gadget functions are often part of product designs, test fixtures, and appliance builds. The risk is concentrated, not universal.
In embedded deployments, gadget networking can be a core management path. Devices may use an NCM interface for provisioning, diagnostics, or host communication, and those systems often run for long periods with infrequent reboots. If the network device outlives its gadget, operators may see crashes at disconnect, broken management interfaces, or strange sysfs state that is hard to reconcile with reality. On tightly controlled hardware, that can translate directly into downtime.
Enterprise testers and OEM teams may care even more because USB gadget functions are commonly used in lab environments for validation, manufacturing, and provisioning workflows. Those are exactly the kinds of workflows where devices are repeatedly connected and disconnected, state is toggled rapidly, and teardown correctness matters more than feature richness. A bug that is rare in consumer use can become routine in a factory or QA lab.

Where the exposure is most likely

The practical attack surface is not broad, but it is real. Systems that build or depend on NCM gadget support, especially in automation-heavy or embedded settings, are the ones most likely to need attention. The patch makes clear that the bug is tied to the device lifecycle, so repeated connect/disconnect cycles are the natural trigger environment.

Embedded routers and appliances may use USB gadget networking for maintenance.
Manufacturing rigs often rely on repeated bind/unbind sequences.
Test labs can hit edge cases that normal desktop use never reaches.
Long-lived systems are more vulnerable to stale object accumulation.
Remote management features can magnify the impact of disconnect failures.

The lesson for operators is simple: if USB gadget networking is part of your platform, treat this as a lifecycle bug with operational consequences, not as an isolated crash in a niche driver.
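A first triage step is to check whether NCM gadget support is built into a given kernel at all. The sketch below parses kernel configuration text for the Kconfig symbols commonly associated with the NCM gadget function; the helper name `ncm_config_state` is hypothetical, and the symbol list is an assumption that should be confirmed against the Kconfig files of the tree in question.

```python
import re
from typing import Dict

# Kconfig symbols assumed to cover the NCM gadget function (verify in your tree).
NCM_SYMBOLS = ("CONFIG_USB_F_NCM", "CONFIG_USB_CONFIGFS_NCM", "CONFIG_USB_G_NCM")


def ncm_config_state(config_text: str) -> Dict[str, str]:
    """Map each NCM-related symbol to 'y', 'm', or 'n' based on kernel config text."""
    state = {sym: "n" for sym in NCM_SYMBOLS}
    for line in config_text.splitlines():
        # Enabled options look like CONFIG_FOO=y or CONFIG_FOO=m;
        # "# CONFIG_FOO is not set" lines do not match and stay 'n'.
        match = re.fullmatch(r"(CONFIG_\w+)=([ym])", line.strip())
        if match and match.group(1) in state:
            state[match.group(1)] = match.group(2)
    return state
```

The input would typically be the contents of the running kernel's config file, where the distribution ships one. A result showing the function enabled means only that the driver is present; whether the fix is included still has to be confirmed from the kernel version or the vendor's changelog.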

Why This CVE Matters Beyond the Patch Itself

Every kernel CVE says something about the code it touches. CVE-2026-23320 is especially interesting because it shows a mature subsystem correcting itself toward a cleaner ownership model rather than adding another layer of guard code around a bad design. That usually produces a better long-term result because the object graph becomes easier to reason about for future maintainers.
It also reflects a broader trend in kernel maintenance: keep the live object model tightly coupled to real hardware presence, and keep user intent cached separately. That separation helps avoid the ugly edge cases where a configuration object is still around after the live interface has gone away. In practical terms, it makes crashes less likely, sysfs less misleading, and teardown paths more predictable.
From a security standpoint, the key point is that dangling references are rarely isolated. They often lead to crashes first, then to more subtle memory hazards if adjacent code grows around the same assumption. Fixing the lifetime model early is much cheaper than waiting for the bug to grow into a more general corruption issue later. That is why the upstream patch is worth attention even if the public symptom looks like a disconnect-only fault.

A pattern worth watching

If this kind of issue sounds familiar, that is because it is. Kernel subsystems regularly get CVEs that boil down to “object lifetime was tied to the wrong phase.” The details vary—networking, filesystems, media, USB—but the underlying lesson is the same. Correct lifetime boundaries are one of the most important defenses the kernel has.

Live objects should be created at the point of real use.
Cleanup should happen when the owning relationship truly ends.
Preconfiguration should cache intent, not fabricate runtime state.
Disconnect paths should not assume hidden state survived teardown.
Consistent lifecycles make both bugs and audits easier to manage.

CVE-2026-23320 belongs to that class of fixes: small in appearance, large in engineering significance.

Strengths and Opportunities

The strongest part of the fix is that it corrects the root cause rather than masking the crash. Moving allocation to ncm_bind() and cleanup to ncm_unbind() restores the natural ownership model, while cached configfs options preserve the user-facing flexibility that makes gadget functions useful in the first place. The result is a safer design that still behaves like the original interface where it matters most.

Aligns object lifetime with actual USB connection state.
Preserves pre-bind configuration through cached options.
Reduces sysfs inconsistency and dangling links.
Improves teardown correctness in disconnect and disable paths.
Lowers crash risk in ordinary administrative operations.
Keeps the driver flexible for embedded and lab use cases.
Strengthens future maintainability by clarifying ownership boundaries.

The opportunity here is bigger than the patch itself. If downstream kernels and device vendors adopt the same model consistently across gadget functions, the entire class of stale-parent bugs becomes easier to prevent rather than just patch after the fact.

Risks and Concerns

The main risk is that the bug sits in a subsystem often used in embedded and appliance scenarios where patch velocity can be slow. Many affected systems are not general-purpose desktops but devices with long support cycles, vendor kernels, and limited maintenance bandwidth. That means the exposure window can remain open far longer than it would on mainstream consumer Linux installs.

Embedded devices may update slowly or not at all.
Disconnected gadgets can still leave stale state in old builds.
Management tools may crash when they encounter inconsistent object graphs.
Vendor forks could carry partial backports and miss adjacent fixes.
Repeated connect/disconnect cycles may make the bug easier to trigger.
Other gadget functions may share similar design debt.
Operational impact may be underestimated because the issue looks niche.

There is also a more subtle concern: fixes like this can reveal hidden assumptions elsewhere. Once the net_device lifetime is corrected, adjacent code that accidentally relied on the old behavior may need separate cleanup. That is not a reason to avoid the fix; it is a reason to test it carefully in downstream trees. Lifecycle bugs rarely travel alone.

Looking Ahead

The near-term story is straightforward: downstream kernel maintainers and device vendors need to absorb the fix, and operators who rely on USB gadget networking should confirm whether their builds include the corrected bind/unbind behavior. The upstream description gives a clear enough roadmap that patch reviewers and distro maintainers should be able to verify the change quickly, especially since the fix is conceptually simple even if its impact is important.
Over the longer term, the real question is whether this becomes one more example of the Linux kernel tightening subsystem ownership boundaries before those boundaries turn into security debt. If this fix serves as a template, more gadget functions may eventually follow the same pattern: cache intent in configfs, instantiate live objects only when bound, and tear them down at unbind without leaving visible leftovers. That would be a quiet but meaningful hardening win.

Confirm whether your kernel tree contains the corrected bind/unbind lifecycle.
Review any embedded or OEM gadget configurations that use NCM networking.
Test disconnect and rebind behavior under real administrative workflows.
Watch for downstream backports that may bundle this with related lifetime fixes.
Treat repeated USB gadget attach/detach cycles as a validation scenario.
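The attach/detach validation scenario in the list above could be driven by a small harness along these lines. It is a sketch under stated assumptions: it relies on the configfs gadget convention that writing a controller name to a gadget's UDC attribute binds it and writing an empty line unbinds it, and the function name `cycle_gadget`, its parameters, and the `check` callback are hypothetical.

```python
import time
from typing import Callable


def cycle_gadget(udc_attr: str, udc_name: str, cycles: int,
                 check: Callable[[int], None], settle: float = 0.5) -> None:
    """Repeatedly bind and unbind a configfs gadget, calling check(i) after each unbind."""
    for i in range(cycles):
        with open(udc_attr, "w") as f:
            f.write(udc_name + "\n")    # bind the gadget to the controller
        time.sleep(settle)              # give the bind time to complete
        with open(udc_attr, "w") as f:
            f.write("\n")               # unbind it again
        time.sleep(settle)
        check(i)                        # e.g. inspect sysfs or the kernel log here
```

On a test device, `udc_attr` would be the UDC file of the gadget under the configfs mount and `udc_name` one of the entries listed in /sys/class/udc. The callback is where a rig would assert that no stale interface or broken link remains after each unbind. Run this only on hardware set aside for testing, since an unpatched kernel may crash.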

The broader significance of CVE-2026-23320 is that it turns a subtle lifetime mistake into a useful lesson about kernel architecture: objects that represent a live device should live and die with that device, not with a configuration container that merely prepared them. Once that principle is enforced consistently, the system becomes easier to reason about, safer to maintain, and much less likely to surprise operators at the worst possible moment.

Source: NVD / Linux Kernel Security Update Guide - Microsoft Security Response Center
