KB5078978: Qualcomm QNN Execution Provider Update for Windows 11 26H1 with ONNX Runtime

Microsoft has published KB5078978, a compact Windows Update package that refreshes the Qualcomm QNN Execution Provider used by the ONNX Runtime to version 1.8.30.0, and targets devices running Windows 11, version 26H1 — a continuation of Microsoft's componentized delivery of on‑device AI runtimes and vendor execution providers. (support.microsoft.com)

Background​

What the Qualcomm QNN Execution Provider is, and why it matters​

The QNN Execution Provider (QNN EP) is an ONNX Runtime backend that converts ONNX graphs into Qualcomm AI Engine (QNN) graphs and delegates operator execution to Qualcomm’s accelerator backends (HTP/NPU, GPU and CPU equivalents). It is the bridge that lets ONNX models run efficiently on Snapdragon SoCs by using the Qualcomm AI Engine Direct (QNN) SDK and related runtime components. On Windows, Microsoft distributes QNN EP as a system component so multiple apps and system features can reuse a single validated runtime instead of bundling their own copies.

Microsoft’s componentized AI update strategy​

Over the past year Microsoft has been shipping on‑device AI runtimes and model components as separate, small OS components (AI components / execution providers), delivered via Windows Update rather than waiting for full feature updates. This approach allows faster iteration of vendor runtimes and models, but it also means administrators and developers must track separate KB entries for updates that affect on‑device inference behavior. KB5078978 is the latest example of that pattern: it delivers Qualcomm’s QNN EP to the Windows 11, version 26H1 servicing branch. (support.microsoft.com)

What KB5078978 actually says​

  • Applies to: Windows 11, version 26H1 (all editions). (support.microsoft.com)
  • Component updated: Windows ML Runtime — Qualcomm QNN Execution Provider upgraded to 1.8.30.0 (public KB summary: “includes improvements”). (support.microsoft.com)
  • Delivery: Automatic via Windows Update; the update will appear in Settings → Windows Update → Update history after installation. (support.microsoft.com)
  • Prerequisite: the device must have the latest cumulative update (LCU) for Windows 11, version 26H1 installed before the package will apply. (support.microsoft.com)
  • Replacement behavior: the KB text explicitly states this update does not replace any previously released update (for 26H1). For other servicing branches Microsoft has previously published QNN EP updates that replaced earlier packages — for example the same 1.8.30.0 package was published for 24H2/25H2 under a different KB number. This shows Microsoft retargets the same EP binary to different Windows servicing branches with distinct KBs.
Important note: Microsoft’s public KB entries for these AI components are intentionally short; they describe scope, delivery, and gating but rarely include operator‑level change logs, microbenchmarks, or internal QA details. That lack of granular public changelog is an important operational consideration (see Risks & unknowns). (support.microsoft.com)

Technical deep dive: what QNN EP does and what changed (what we can verify)​

The role of QNN EP inside ONNX Runtime and Windows ML​

  • ONNX Runtime provides a modular runtime that supports multiple execution providers (EPs). The QNN EP is the provider enabling Qualcomm accelerator usage on both Android and Windows Snapdragon devices; on Windows it is surfaced via the Windows ML / shared ONNX Runtime distribution so applications can rely on a common runtime supplied by the platform.
  • The QNN EP constructs a device‑specific QNN graph from an ONNX model and delegates computation to Qualcomm’s backend libraries (HTP/NPU, GPU, or CPU). It supports provider configuration options (for example: backend_type, backend_path, profiling and context cache) that influence which backend is used and what runtime telemetry/profiling is emitted.
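To make the provider options above concrete, here is a minimal Python sketch of how they might be passed to ONNX Runtime. The helper names (qnn_provider_config, create_qnn_session) are hypothetical, and the option keys (backend_type, backend_path, profiling_level, profiling_file_path) and the session config keys for the context cache (ep.context_enable, ep.context_file_path) reflect my reading of the ONNX Runtime QNN EP documentation; verify them against the runtime version on your device.

```python
def qnn_provider_config(backend_type=None, backend_path=None,
                        profiling_level=None, profiling_file=None):
    """Build parallel (providers, provider_options) lists for InferenceSession.

    backend_type and backend_path are treated as mutually exclusive here;
    when neither is given, the sketch falls back to the HTP (NPU) backend.
    """
    if backend_type and backend_path:
        raise ValueError("pass backend_type or backend_path, not both")
    options = {}
    if backend_path:
        options["backend_path"] = str(backend_path)
    else:
        options["backend_type"] = backend_type or "htp"
    if profiling_level:
        options["profiling_level"] = profiling_level
    if profiling_file:
        options["profiling_file_path"] = str(profiling_file)
    # CPU is listed second so unsupported nodes have somewhere to fall back to.
    return ["QNNExecutionProvider", "CPUExecutionProvider"], [options, {}]


def create_qnn_session(model_path, context_cache_path=None, **kwargs):
    """Create a session; requires onnxruntime with QNN support installed."""
    import onnxruntime as ort  # imported lazily so the helper above stays testable

    session_options = ort.SessionOptions()
    if context_cache_path:
        # Assumed config keys for the context binary cache; check your ORT docs.
        session_options.add_session_config_entry("ep.context_enable", "1")
        session_options.add_session_config_entry(
            "ep.context_file_path", str(context_cache_path))
    providers, provider_options = qnn_provider_config(**kwargs)
    return ort.InferenceSession(
        model_path, sess_options=session_options,
        providers=providers, provider_options=provider_options)
```

Keeping the option-building step separate from session creation means the configuration can be logged or unit-tested on a machine that has no Qualcomm hardware.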

Build and packaging notes verified from ONNX Runtime documentation​

  • Starting with ONNX Runtime 1.18.0, prebuilt packages include the necessary QNN dependency libraries so developers no longer need to separately download a QNN SDK for some workflows. The QNN EP documentation also lists QNN Version Requirements and the SoC families used for testing (Qualcomm SC8280, SM8350 and various Snapdragon X parts appear in testing matrices). These entries confirm the EP's dependency and platform expectations for developers targeting Windows.
  • The QNN EP supports multiple backend types: htp (default NPU/HTP offload), gpu, cpu, as well as explicit backend DLL paths through provider options. Profiling options generate CSV and (when supported) QNN log files consumable by Qualcomm profiling tools. The EP also exposes mechanisms for context binary caching (to speed session creation) and has documented error handling for an NPU SSR (subsystem restart) condition that requires recreating session objects. These runtime behaviors and knobs are important for developers who deploy models to devices with Qualcomm NPUs.
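The session-recreation advice for the SSR condition can be sketched as a small retry wrapper. This is an illustration only: run_with_session_recreate is a hypothetical helper, and how an SSR failure is actually reported (exception type, status code, message text) is not specified here, so the caller supplies the is_recoverable predicate.

```python
def run_with_session_recreate(make_session, run, is_recoverable, max_recreates=1):
    """Run inference, rebuilding the session if a recoverable error occurs.

    make_session:   zero-argument callable returning a fresh session object
    run:            callable taking the session and returning the result
    is_recoverable: callable taking the exception, True if recreating may help
    """
    session = make_session()
    recreates = 0
    while True:
        try:
            return run(session)
        except Exception as exc:
            # Give up on unrelated errors or once the retry budget is spent.
            if recreates >= max_recreates or not is_recoverable(exc):
                raise
            recreates += 1
            session = make_session()  # discard the old session entirely
```

With ONNX Runtime, make_session would typically wrap the InferenceSession constructor and run would call session.run(...). Bounding the number of recreations keeps a persistent hardware fault from turning into an endless loop.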

How this update fits into the release history and what that implies​

Microsoft has published several QNN EP KBs over the last year, each targeted at a particular Windows servicing branch:
  • KB5067994 — QNN EP update (1.8.13.0) for Windows 11, version 24H2.
  • KB5072095 — QNN EP update (1.8.21.0) for 24H2/25H2 (replaced prior KB).
  • KB5077526 — QNN EP update (1.8.30.0) for 24H2/25H2 (replaced KB5072095).
  • KB5078978 — (this KB) QNN EP update (1.8.30.0) for 26H1. (support.microsoft.com)
The pattern is consistent: Microsoft repackages and retargets validated EP binaries to different Windows servicing branches and publishes separate KBs. For organizations that manage large fleets across servicing branches, this means the same functional update (the same EP version, here 1.8.30.0) may appear multiple times with different KB numbers depending on branch. That is useful for tracing compatibility to OS servicing level, but it is also a source of confusion if you track KB numbers rather than component versions.
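One way to keep tracking anchored on component versions is a small lookup table keyed by version and branch. The sketch below uses only the four KBs listed above; the record layout and function names are illustrative assumptions, not a Microsoft schema.

```python
from collections import defaultdict

# (KB number, QNN EP version, Windows 11 servicing branches), from the list above.
QNN_EP_KBS = [
    ("KB5067994", "1.8.13.0", ("24H2",)),
    ("KB5072095", "1.8.21.0", ("24H2", "25H2")),
    ("KB5077526", "1.8.30.0", ("24H2", "25H2")),
    ("KB5078978", "1.8.30.0", ("26H1",)),
]


def kbs_by_version(records):
    """Group KB entries by component version."""
    grouped = defaultdict(list)
    for kb, version, branches in records:
        grouped[version].append((kb, branches))
    return dict(grouped)


def kb_for(records, version, branch):
    """Return the KB that delivers `version` on `branch`, or None."""
    for kb, ver, branches in records:
        if ver == version and branch in branches:
            return kb
    return None


if __name__ == "__main__":
    for version, entries in kbs_by_version(QNN_EP_KBS).items():
        print(version, entries)
```

Looking up by (version, branch) answers the question a fleet owner usually has ("which package puts 1.8.30.0 on my 26H1 devices?") without assuming one KB per version.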

Practical implications — who this affects and how​

For end users (consumers)​

  • If your device is a Snapdragon‑powered Windows 11 PC running 26H1 and it meets the LCU prerequisite, you will receive this update automatically via Windows Update. The update will show up in Settings → Windows Update → Update history as “Windows ML Runtime Qualcomm QNN Execution Provider Update (KB5078978)”. (support.microsoft.com)
  • Typical consumer impact is transparent: improvements to the QNN EP are intended to make hardware‑accelerated AI features faster or more reliable, and non‑Copilot/AI users may not notice anything. However, in edge cases a runtime update can change the set of operators offloaded to the NPU or alter fallback behavior, which could influence app performance or correctness for apps that depend on implicit provider behavior. Because Microsoft’s public KB does not list operator‑level changes, there is limited public visibility into those runtime details. (support.microsoft.com)

For developers deploying ONNX models​

  • You should verify runtime behavior on the actual target hardware after the update. Use ONNX Runtime APIs to confirm the QNNExecutionProvider is available and being used, and run your model validation suite to check output parity, latency and memory usage. Typical checks include calling ONNX Runtime helper APIs (for example: ort.get_all_providers() / ort.get_available_providers() or inspecting session.get_providers()) to confirm provider availability and to ensure the provider options you depend on are honored.
  • For NPU scenarios you may want to toggle QNN EP provider options like backend_type, profiling_level, htp_performance_mode and device_id to reproduce realistic workloads. Use the EP’s profiling and context binary cache features to diagnose session creation time and operator mapping. The ONNX Runtime QNN EP documentation includes examples for Python and C++ showing how to pass provider options and enable profiling.
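The provider availability check described above might look like the following in Python. ort.get_available_providers() and session.get_providers() are real ONNX Runtime calls; the qnn_status and check_model helpers and their return strings are hypothetical names chosen for this sketch.

```python
QNN = "QNNExecutionProvider"
CPU = "CPUExecutionProvider"


def qnn_status(available, selected):
    """Classify QNN EP state from two provider-name lists."""
    if QNN not in available:
        return "unavailable"
    if QNN not in selected:
        return "available-not-selected"
    return "selected"


def check_model(model_path):
    """Create a session and report QNN EP status; needs onnxruntime installed."""
    import onnxruntime as ort  # lazy import keeps qnn_status testable anywhere

    available = ort.get_available_providers()
    requested = [p for p in (QNN, CPU) if p in available]
    session = ort.InferenceSession(model_path, providers=requested)
    return qnn_status(available, session.get_providers())
```

Note that a "selected" result only shows the provider was registered for the session. It does not prove every operator was offloaded to the NPU, so profiling output is still needed to confirm placement.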

For IT administrators and OEMs​

  • Because the update is delivered through Windows Update, enterprises can manage distribution using WSUS, Windows Update for Business and Endpoint Manager policies — for example, syncing and approving/declining updates in WSUS or deferring updates per Update for Business settings. WSUS allows you to approve updates for installation or leave them unapproved (detect only), which is the standard mechanism administrators use to stage new component updates for pilot rings before broad deployment.
  • The fact that Microsoft ships identical EP binaries across servicing branches but with different KBs means your internal update tracking should focus on component version numbers (e.g., QNN EP v1.8.30.0) as well as KBs — this avoids confusion where the same binary appears under multiple KB entries for different Windows branches.

Risks, unknowns and operational caveats​

1) The KB is vague about “improvements”​

Microsoft’s public KB entries for AI components typically say “includes improvements” without listing operator‑level changes, regression fixes, or performance deltas. That makes it hard to determine the exact impact on a given model or workload without hands‑on testing. Treat the KB as a distribution notice rather than a technical changelog. (support.microsoft.com)

2) Potential for compatibility and performance regressions​

Because execution providers change how operators are partitioned and executed (NPU vs CPU vs GPU), a component update can:
  • Alter numeric results slightly due to different kernel implementations or quantization behavior.
  • Change performance characteristics (latency, memory) for specific models.
  • Surface previously unseen errors if the updated QNN EP introduces new operator validations or stricter checks.
Developers should run regression suites and end‑to‑end performance tests on representative hardware after such updates. ONNX Runtime’s documented SSR (subsystem restart) condition and the advice to recreate sessions after an HTP SSR are concrete examples of runtime behavior you may need to handle in production apps.
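A regression suite for the "slightly different numeric results" case usually compares outputs within a tolerance instead of expecting bit-exact equality. The sketch below is one possible shape for that check on flattened output values; parity_report is a hypothetical helper, and the default tolerances are placeholders that should be tuned per model, especially for quantized models.

```python
import math


def parity_report(baseline, candidate, rel_tol=1e-3, abs_tol=1e-5):
    """Compare two flat sequences of floats captured before and after an update."""
    if len(baseline) != len(candidate):
        raise ValueError("output lengths differ; shapes may have changed")
    max_abs_diff = 0.0
    mismatches = 0
    for b, c in zip(baseline, candidate):
        max_abs_diff = max(max_abs_diff, abs(b - c))
        if not math.isclose(b, c, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches += 1
    return {
        "max_abs_diff": max_abs_diff,
        "mismatches": mismatches,
        "ok": mismatches == 0,
    }
```

Recording the maximum absolute difference alongside the pass/fail flag makes it easier to see drift that is still inside tolerance but growing from one runtime version to the next.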

3) Driver and firmware stack coupling​

NPU and GPU runtimes are tightly coupled to SoC firmware and vendor drivers. An EP binary optimized for a particular QNN SDK version or driver revision can behave differently if the underlying Qualcomm runtime (drivers/firmware) on a device is older or inconsistent. This coupling is why Microsoft gates component installs behind the OS LCU and why device vendors must test the full stack.

4) Visibility and troubleshooting limitations for end users​

If an app begins to fail after a component update, the public KB may not include the information you need to diagnose the issue. Administrative tools — WSUS metadata, event logs, Windows Update history, and ONNX Runtime profiling output — will be your primary sources for root cause analysis. Microsoft’s support and OEMs are the escalation paths when the public KB is insufficient. (support.microsoft.com)

Recommended checklist: verification and mitigation steps​

For developers — preflight and validation​

  • Reproduce your critical inference scenarios on representative Snapdragon device(s).
  • Use ONNX Runtime checks to confirm provider presence. In Python, call ort.get_available_providers() or instantiate a session and inspect sess.get_providers() to ensure QNNExecutionProvider is available and selected.
  • Run functional correctness (output parity) tests and performance benchmarks before and after the update. Pay attention to quantized models and QDQ (quantize‑dequantize) operator patterns; QNN EP often has specific guidance for quantized models and Windows ARM64 scenarios.
  • Enable and capture QNN EP profiling (CSV and qnn.log when present) to analyze operator placement and time spent on NPU vs CPU; this helps detect changes in graph partitioning or problematic kernels.
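For the before-and-after performance comparison in the checklist above, a simple harness can collect wall-clock latency samples and flag a slowdown on the median. This is a generic sketch, not a QNN-specific tool: measure_latency_ms and latency_regressed are hypothetical helpers, and the 10% threshold is an arbitrary placeholder.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=3, iterations=20):
    """Time repeated calls to run_once() and return per-call latency in ms."""
    for _ in range(warmup):
        run_once()  # warm-up runs are discarded (caches, lazy initialization)
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def latency_regressed(before_ms, after_ms, max_slowdown=0.10):
    """True if the median latency grew by more than max_slowdown (fractional)."""
    before = statistics.median(before_ms)
    after = statistics.median(after_ms)
    return after > before * (1.0 + max_slowdown)
```

In practice run_once would wrap a session.run(...) call on a representative input. Comparing medians instead of means reduces the influence of occasional outliers such as thermal throttling or background activity.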

For IT administrators — rollout and control​

  • Use WSUS or Windows Update for Business to stage the KB to a pilot ring first, approve or decline as needed, and monitor telemetry/performance from pilot devices. WSUS lets you approve updates for groups and see detection without immediate installation.
  • If you detect regressions, you can keep the update unapproved in WSUS for broader rings or use Update for Business deferral policies until a fix or mitigation is available. Document which KBs correspond to which component version so your rollback plan targets the correct package.
  • Collect logs: Windows Update event logs, application event logs, ONNX Runtime profiling outputs and any vendor driver logs to expedite vendor/Microsoft escalation.

For consumers — quick checks​

  • Confirm the update is installed: Settings → Windows Update → Update history and look for “Windows ML Runtime Qualcomm QNN Execution Provider Update (KB5078978)”. (support.microsoft.com)
  • If you experience app-specific problems after the update, collect app logs and Windows Update history and try a vendor‑recommended recovery path (rollback via System Restore or contact OEM support). For enterprise devices, talk to your IT helpdesk — they can check WSUS approvals and device compliance. (support.microsoft.com)

Final assessment: strengths, trade‑offs, and what to watch next​

Strengths​

  • Faster iteration: Microsoft’s componentized approach lets Qualcomm and Microsoft push targeted runtime improvements to devices without waiting for large cumulative or feature updates. This is good for performance and feature agility. (support.microsoft.com)
  • Shared runtime: A single, validated ONNX Runtime + QNN EP shipped by Windows reduces fragmentation compared to each app bundling its own runtime. That simplifies support and reduces overall disk footprint.
  • Tooling for debugging: The QNN EP exposes profiling and context cache features that help developers diagnose and optimize on‑device inference.

Trade‑offs and risks​

  • Limited public changelog: The public KB’s terseness forces reliance on in‑house testing or vendor support to understand operator‑level changes and regressions. This operational opacity is the single largest practical risk to production deployments. (support.microsoft.com)
  • Stack coupling: Performance or correctness changes may depend on driver/firmware versions that are outside Microsoft’s update cadence, requiring coordination with OEMs and Qualcomm.
  • Management complexity: The same EP binary appearing under different KB numbers for different Windows branches makes KB tracking by ID brittle; track component versions instead.

What to watch next​

  • Look for additional Microsoft release notes or advisories that map component versions to a changelog (Microsoft has “release information for AI components” pages that sometimes provide timelines; watch those pages and vendor blogs for deeper detail). If you depend on fixed operator implementations for production models, subscribe to vendor/OEM release channels and maintain a short testing gate in your deployment pipeline. (support.microsoft.com)

Microsoft’s KB5078978 is a small but consequential example of how the Windows ecosystem is evolving to deliver on‑device AI: componentized, vendor‑specific, and shipped through the same Windows Update channel that IT organizations already manage. That architecture improves agility for performance and bug fixes, but it also raises operational requirements for validation, driver coordination, and careful rollout planning. If your products or services rely on Qualcomm‑accelerated ONNX inference, treat this KB as a reminder to integrate runtime component updates into your release and test cycles — confirm provider availability, run regression and profiling tests, and use WSUS or Windows Update for Business to stage the update across your environment. (support.microsoft.com)

Source: Microsoft Support, KB5078978: Qualcomm QNN Execution Provider update (1.8.30.0)