
Microsoft has published KB5089618, a Windows Update package that brings the Windows ML Runtime Qualcomm QNN Execution Provider component to version 2.2604.2.0 on devices running Windows 11, version 26H1. While the support note is brief, the update is part of a much larger shift in how Windows delivers on-device AI acceleration: instead of asking every app developer to package hardware-specific AI runtimes, Windows can service key execution providers through Windows Update and keep the AI stack aligned with the device’s silicon, drivers, and operating-system build.
KB5089618 applies to Windows 11 version 26H1, all editions. It is not a general-purpose Windows 11 update for existing 24H2 or 25H2 PCs, and it is not an in-place path to Windows 11 26H1. The update targets the Qualcomm QNN Execution Provider component used by ONNX Runtime and by Windows machine-learning scenarios that rely on ONNX Runtime. In plain terms, it updates the piece of the local AI runtime that helps Windows and apps run compatible ONNX models with Qualcomm hardware acceleration instead of relying only on the CPU.
The key installed entry to look for is:
- Windows ML Runtime Qualcomm QNN Execution Provider Update (KB5089618)
The update is delivered automatically through Windows Update. Microsoft does not describe a manual installer path in the support note, and the package is not positioned as an optional developer SDK download. The prerequisite is also important: the device must already have the latest cumulative update for Windows 11, version 26H1 installed. In managed environments, that means administrators should treat KB5089618 as dependent on the normal monthly servicing baseline for 26H1 rather than as a standalone component that can be evaluated entirely in isolation.
The support article also states that KB5089618 replaces KB5078978, which was an earlier Qualcomm QNN Execution Provider update for Windows 11 26H1. This replacement relationship matters for update compliance, because administrators may see older reporting references to KB5078978 while the newer applicable package is KB5089618. For most users, no direct action is needed: Windows Update should install the newer component automatically when the device is eligible and up to date.
The Qualcomm QNN Execution Provider is an execution provider for ONNX Runtime. ONNX Runtime is a cross-platform inference engine for running machine-learning models in the ONNX format. An execution provider is the layer that maps model operations to the available hardware backend. Without a hardware-specific provider, a model may run on the CPU or another generic backend. With the correct provider, supported parts of a model can be offloaded to specialized silicon, which can improve performance, responsiveness, and power efficiency.
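To make the provider concept concrete, the fallback behavior described above can be sketched in a few lines. This is a conceptual model, not ONNX Runtime's actual implementation; the `select_providers` function and the operator sets are illustrative. In the real Python API, an app passes a preference-ordered `providers` list when creating an `onnxruntime.InferenceSession`, and the runtime keeps a CPU fallback so unsupported work still executes.

```python
# Conceptual sketch of execution-provider resolution (not ONNX Runtime's
# real code). The real API call looks roughly like:
#   ort.InferenceSession("model.onnx",
#                        providers=["QNNExecutionProvider", "CPUExecutionProvider"])

def select_providers(preferred: list[str], available: set[str]) -> list[str]:
    """Keep requested providers that exist on this device, in preference order."""
    chosen = [p for p in preferred if p in available]
    # ONNX Runtime retains a CPU fallback so unsupported operators still run.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen

# On a Snapdragon device with the QNN provider installed, QNN is preferred:
print(select_providers(
    ["QNNExecutionProvider", "CPUExecutionProvider"],
    {"QNNExecutionProvider", "CPUExecutionProvider"},
))

# On hardware without the QNN provider, the same request degrades gracefully:
print(select_providers(
    ["QNNExecutionProvider", "CPUExecutionProvider"],
    {"CPUExecutionProvider"},
))
```

The point of the sketch is that the app states a preference and the platform resolves it, which is exactly the layer KB5089618 services.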
In this case, the execution provider connects ONNX Runtime to Qualcomm’s AI acceleration stack. Microsoft’s description says the QNN Execution Provider uses the Qualcomm AI Engine Direct SDK, also known as the QNN SDK, to construct a QNN graph from an ONNX model. That graph can then be executed by a supported Qualcomm accelerator backend library. On Windows on Snapdragon devices, that usually means the runtime is trying to take advantage of Qualcomm AI hardware such as the Hexagon Tensor Processor or related NPU resources where the model, data types, and operators are compatible.
This update is therefore not an ordinary driver update in the familiar display, audio, or network sense. It is an AI runtime component update. It sits in the path between Windows ML or ONNX Runtime workloads and the Qualcomm acceleration backend. That distinction explains why Microsoft labels the installed item as a Windows ML Runtime Qualcomm QNN Execution Provider Update rather than simply as a Qualcomm driver.
For everyday users, KB5089618 may not visibly change the desktop, Start menu, taskbar, or Settings app. The update is more likely to affect behind-the-scenes behavior for local AI workloads, including apps or Windows components that use ONNX Runtime through Windows ML. If an app uses an ONNX model that can run through the QNN provider, the updated component can help keep that experience compatible with the current Windows 11 26H1 servicing baseline and Qualcomm’s AI stack.
For developers, the update is more interesting. Windows is increasingly moving toward a model where AI apps can rely on the operating system to discover and service execution providers. Rather than bundling large hardware-specific inference libraries inside every application, developers can target Windows ML or ONNX Runtime pathways and allow the platform to select the appropriate acceleration provider. On Qualcomm systems, that provider may be QNN. On other hardware, a different provider may be selected. KB5089618 is one example of Microsoft servicing those provider components through Windows Update.
This approach has several advantages. First, it reduces app package size and dependency complexity. A developer does not need to ship every possible vendor runtime for every possible NPU or GPU. Second, it gives Microsoft and silicon vendors a channel to update AI execution behavior as new hardware arrives. Third, it helps keep Windows’ local AI platform more consistent across system updates, driver updates, and app updates. Finally, it lets Microsoft respond to compatibility, reliability, or performance issues in the AI stack without waiting for every individual app developer to publish a new build.
The scope of Windows 11 version 26H1 is critical to understanding KB5089618. Windows 11 26H1 is a specialized release designed for select new hardware platforms rather than a broad annual feature update for existing Windows 11 PCs. Microsoft has described 26H1 as a scoped release intended to support new device innovation in partnership with OEMs and silicon partners. It is not being offered as an in-place update from Windows 11 24H2 or Windows 11 25H2. Existing 24H2 and 25H2 devices remain on their own servicing path and continue to receive security, quality, and feature updates through the established Windows servicing model.
That means users should not expect KB5089618 to appear on a normal Intel, AMD, or earlier Qualcomm Windows 11 24H2/25H2 PC; it applies only to 26H1. If a user is checking Windows Update on a device running Windows 11 25H2 and does not see KB5089618, that is expected behavior. The absence of KB5089618 does not mean the device is missing a general Windows security patch. It means that the specific Qualcomm QNN Execution Provider update for Windows 11 26H1 does not apply to that system.
For Windows 11 24H2 and 25H2 systems, Microsoft has published separate Qualcomm QNN Execution Provider updates in the KB5077xxx range, including packages that apply to those versions instead of 26H1. The existence of those separate updates reinforces the idea that Microsoft is maintaining AI runtime components per Windows release and platform target. Administrators should therefore avoid assuming that one QNN Execution Provider KB applies universally across every Windows 11 version.
KB5089618 also highlights the role of componentized AI servicing. Historically, many Windows capabilities were serviced either through cumulative updates, Store app updates, driver packages, or application updates. AI runtimes blur those lines. They are low-level enough to require hardware-vendor alignment, but high-level enough to be used by applications and Windows features. By delivering the QNN Execution Provider through Windows Update, Microsoft can update a component that supports AI execution without necessarily bundling every change into a full feature update.
That is especially important for AI-capable PCs. Local AI workloads are sensitive to small differences in model format, quantization, operator coverage, memory handling, and backend behavior. A model that runs correctly on one runtime build may fail or fall back to the CPU on another if an operator is unsupported or a graph transformation changes. Updating the execution provider can improve the reliability of the path between ONNX Runtime and the device accelerator.
The support article does not provide a detailed changelog for version 2.2604.2.0. It says the update includes improvements to the Qualcomm QNN Execution Provider AI component for Windows 11 26H1. Because Microsoft does not list individual fixes, users and administrators should be cautious about making specific claims such as “this fixes a particular app” or “this improves performance by a certain percentage.” The safer interpretation is that the package updates the AI component and may include compatibility, reliability, or performance improvements, but Microsoft has not publicly broken those improvements down in the KB note.
In managed environments, KB5089618 should be handled like other Windows Update-delivered platform components. IT teams should confirm that Windows 11 26H1 devices are receiving the latest cumulative update first, then check whether the QNN provider update follows. If update compliance tools report KB5089618 as missing, the first question should be whether the device is actually running Windows 11 26H1 and whether it has the required cumulative update baseline. The second question should be whether update policies, deferrals, network controls, or Windows Update for Business settings are delaying the component.
Because Windows 11 26H1 is not a broad deployment release, many organizations may have only a limited number of devices affected by this KB. These are likely to be new hardware systems that shipped with 26H1 preinstalled. For those organizations, KB5089618 should be part of the device validation checklist for AI-capable workloads. If a business is testing local AI applications on Snapdragon-based 26H1 hardware, confirming the installed QNN Execution Provider version can help ensure tests are being run on the latest serviced stack.
The practical verification path is straightforward:
- Open Settings.
- Go to Windows Update.
- Select Update history.
- Look for Windows ML Runtime Qualcomm QNN Execution Provider Update (KB5089618).
For developers testing ONNX Runtime behavior, Windows Update history is only one layer of validation. The next layer is application-level testing. A model may still need to be quantized or structured correctly for the QNN HTP backend. The QNN Execution Provider does not magically make every ONNX model run on the NPU. Model compatibility depends on operator support, data types, shapes, quantization format, and backend capabilities. Unsupported parts of a graph may fail or fall back to another provider depending on runtime configuration.
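The partial-offload behavior described above can be illustrated with a simple sketch. This is not QNN's real partitioning logic, and the operator set shown is hypothetical; the idea is only that a provider claims the graph operators its backend supports, and everything else runs on another provider.

```python
# Illustrative sketch of why parts of an ONNX graph fall back to the CPU.
# QNN_SUPPORTED is a hypothetical operator set, not Qualcomm's actual coverage.

QNN_SUPPORTED = {"Conv", "Relu", "MatMul", "Add"}

def partition(graph_ops: list[str], supported: set[str]):
    """Split a sequence of graph operators into (offloaded, fallback) lists."""
    offloaded = [op for op in graph_ops if op in supported]
    fallback = [op for op in graph_ops if op not in supported]
    return offloaded, fallback

# A model with one operator the hypothetical backend does not support:
model = ["Conv", "Relu", "CustomOp", "MatMul"]
npu_part, cpu_part = partition(model, QNN_SUPPORTED)
print("Offloaded:", npu_part)   # the supported operators
print("Fallback:", cpu_part)    # the unsupported operator runs elsewhere
```

A single unsupported operator in the middle of a graph can force extra transitions between backends, which is one reason operator coverage updates in an execution provider can change observed performance.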
That distinction matters when diagnosing performance. A user may install KB5089618 and still see an AI workload run on the CPU if the model is not compatible with QNN acceleration. Conversely, an app may be using Windows ML and automatically select the best available provider without exposing the provider name to the user. In that case, the update can matter even though the app does not show “QNN” anywhere in its interface.
The QNN Execution Provider has several backend concepts. The HTP backend is commonly associated with NPU offload on Qualcomm hardware. Other backend options, such as CPU or GPU, may exist depending on the runtime and SDK configuration. For NPU-focused scenarios, the most important point is that the model must usually be prepared in a format the backend supports. Many NPU backends favor quantized models, often using 8-bit or 16-bit integer formats, because those formats are better suited to the accelerator’s performance and power characteristics.
This is why AI developers working on local inference should pay close attention to model preparation. A model exported from PyTorch or TensorFlow into ONNX as FP32 may be correct, but not optimal or even supported for a given NPU path. It may need quantization, fixed input shapes, and operator compatibility review. Tools such as ONNX Runtime quantization utilities and hardware-aware optimization pipelines can help convert models into more suitable formats. The QNN provider is one part of the puzzle; the model still needs to be designed or transformed for the target backend.
KB5089618 may also be relevant for apps that rely on Windows ML rather than directly calling ONNX Runtime APIs. Windows ML abstracts some of the provider-selection complexity. It can discover available hardware and use the appropriate execution provider. In that model, the QNN provider becomes part of the platform layer. Developers can focus on the model and app experience while Windows handles more of the low-level provider discovery and loading.
This abstraction is one reason Microsoft is emphasizing Windows Update delivery for execution providers. If execution providers are treated as platform components, then they can be updated alongside the OS and hardware ecosystem. That reduces the risk of an app bundling an outdated provider or using a runtime that does not match the driver and OS assumptions of a new device.
For users, the most common question is whether KB5089618 is “safe” or whether it should be skipped. Since Microsoft marks it for automatic delivery through Windows Update and makes it dependent on the latest 26H1 cumulative update, it should be treated as a normal servicing update for eligible devices. It is not described as a preview, workaround, or manual hotfix. Unless an organization has a specific change-management reason to pause updates temporarily, installing it is the expected path.
Another common question is whether the update requires a restart. The support article does not specify restart behavior. Windows Update may request a restart depending on what files are updated, what components are in use, and how the package is staged. Users should follow the Windows Update prompt. In enterprise environments, restart behavior should follow existing update policy and maintenance-window controls.
If KB5089618 fails to install, the troubleshooting approach should be similar to other Windows Update component failures:
- Make sure the device is actually running Windows 11 26H1.
- Install the latest cumulative update for 26H1 and reboot if required.
- Check Windows Update again.
- Review Windows Update error codes in Settings or event logs.
- On managed devices, verify that policy is not blocking the update category or delaying the package.
If the device is not on 26H1, the correct fix is not to force KB5089618; it is to use the updates applicable to that Windows version.
If the update installs but an AI application still does not use the NPU, the issue may not be the KB. Developers and power users should check whether the model is compatible with the QNN backend, whether the app is using Windows ML or ONNX Runtime in a way that enables hardware acceleration, whether CPU fallback is occurring, and whether the device’s Qualcomm drivers and firmware are current. The Windows Task Manager can show NPU utilization on supported systems, which can help determine whether a workload is actually reaching the NPU.
For deeper analysis, developers can use Windows Performance Recorder and Windows Performance Analyzer to inspect ONNX Runtime events, NPU usage, and execution timing. Qualcomm-specific profiling tools may also provide lower-level information about accelerator activity. These tools are more advanced than most users need, but they are useful when validating whether an execution provider update changes model load time, inference latency, CPU fallback behavior, or NPU utilization.
KB5089618 also fits into a broader pattern of AI component updates for Windows. Microsoft maintains “AI components” as separately identifiable pieces of the platform. These components can include runtimes, providers, and model-related infrastructure used by Windows AI experiences. As AI functionality becomes more dependent on local acceleration, these components will likely become more visible in update history. Users may see update names that refer to Windows ML Runtime, execution providers, or vendor-specific AI components rather than familiar OS feature names.
This visibility can be confusing at first. A typical user may not know what ONNX Runtime, QNN, or an execution provider is. But the naming is useful for transparency. It tells administrators and developers exactly which hardware acceleration component was updated. It also makes it possible to audit whether a device has the correct AI runtime layer for a given platform.
For Windows 11 26H1 devices, KB5089618 is one such audit point. It confirms that the Qualcomm QNN Execution Provider update version 2.2604.2.0 has been applied. If an organization is testing a fleet of new Snapdragon-based systems, ensuring that all test devices have the same QNN provider update can reduce variability. Without that consistency, performance comparisons may be misleading because two devices could be running different execution-provider builds.
The replacement of KB5078978 is also relevant for documentation and support teams. If internal instructions previously told users to check for KB5078978, those instructions should be updated to mention KB5089618. Since KB5089618 supersedes the earlier update, the newer package is the one users should expect to see after current servicing. Historical references to KB5078978 remain useful for understanding the update chain, but they should not be treated as the current endpoint.
It is also worth emphasizing what KB5089618 does not do. It does not turn a non-Qualcomm PC into a Qualcomm AI-accelerated system. It does not install Windows 11 26H1 on existing Windows 11 24H2 or 25H2 devices. It does not guarantee that every AI application will run faster. It does not replace the need for compatible models, drivers, and app code. It is a platform component update for eligible 26H1 systems that use the Qualcomm QNN Execution Provider.
The update’s connection to ONNX Runtime also does not mean that every ONNX Runtime installation on the system is automatically identical. Developers who install their own Python packages, NuGet packages, or private runtime builds may be using components outside the Windows-serviced path. Apps that rely on Windows ML or platform-provided execution providers benefit most directly from Windows Update-delivered provider servicing. Apps that bundle their own runtimes may need their own update process.
That distinction is important for support. If a Store app or built-in Windows feature uses Windows ML and the platform provider, KB5089618 may affect it. If a developer runs a custom Python environment with a separately installed onnxruntime-qnn package, behavior may depend on that package’s bundled libraries and configuration. In other words, the presence of KB5089618 is a strong indicator for the Windows-serviced AI component, but it is not a universal statement about every ONNX Runtime binary a developer may have installed manually.
For most users, the best advice is simple: keep Windows 11 26H1 fully updated, allow Windows Update to install KB5089618 automatically, and verify it in Update history if needed. For IT administrators, add KB5089618 to compliance checks for eligible Windows 11 26H1 Qualcomm systems and remember that it requires the latest cumulative update. For developers, treat it as part of the serviced AI runtime baseline and continue validating model compatibility, quantization, provider selection, and fallback behavior.
The larger story is that Windows AI acceleration is becoming a serviced platform capability rather than a collection of one-off vendor packages. KB5089618 is a small support article, but it points to that larger architecture. Microsoft, Qualcomm, ONNX Runtime, Windows ML, and Windows Update all meet at this layer. As Copilot+ PCs and NPU-enabled workloads become more common, these execution-provider updates will become increasingly important for ensuring that local AI runs efficiently, reliably, and consistently on the hardware it was designed to use.
Source: Microsoft Support KB5089618: Qualcomm QNN Execution Provider update (2.2604.2.0) - Microsoft Support