Microsoft’s push to make AI genuinely local on Windows has produced a practical question for buyers, IT teams and privacy‑minded users: which AI capabilities actually run on your PC’s hardware, and how much control do you — or your organization — have over them? The headline answer is simple: several prominent Copilot+ features do run locally on capable hardware, but control is limited to high‑level toggles and permissioning — Windows chooses which silicon (CPU, GPU or NPU) executes each task, and users get only coarse controls to enable, disable or scope those features. This article examines exactly which local features you can control, how they map to hardware (especially Intel’s Core Ultra family), what the operating system manages for you, and where the real limits and risks live in practice.
Background / Overview
Microsoft has created a two‑tiered Windows experience: the baseline Copilot assistant (widely available and often cloud‑backed) and the higher tier, Copilot+, which is explicitly tied to on‑device inference acceleration. Copilot+ devices are certified around a practical performance floor — an on‑board Neural Processing Unit (NPU) capable of roughly 40 TOPS (trillions of operations per second) — and minimum system resources (for example, 16 GB RAM and a fast NVMe SSD are common gating criteria). That hardware floor enables features that are difficult to run responsively in the cloud or would leak too much sensitive content off‑device.

Intel’s Core Ultra family (Series 2) is one of the key silicon families delivering these NPUs in x86 laptops. Intel markets the Core Ultra 200V models as delivering significant local AI capacity (Intel combines CPU, Arc GPU and a dedicated NPU for platform TOPS numbers), and independent spec databases and early reviews report NPU figures in the ~45–48 TOPS range on select Core Ultra chips — comfortably above Microsoft’s 40 TOPS practical target. That combination of hardware, Windows runtime support and OEM firmware/UEFI updates is what unlocks the Copilot+ local experiences on Windows.
What local AI functions are available — and which actually run on your device?
Below are the main Copilot+ features that are explicitly designed to run locally (or benefit greatly from on‑device acceleration), and how much control you get over each.
Recall — local, searchable snapshot history
- What it does: Recall periodically captures encrypted snapshots of your screen and builds a local, searchable timeline that you can query in natural language to “retrace your steps.” It indexes text, images and contextual metadata so you can find the moment you saw a piece of information.
- Local vs cloud: Snapshots and the semantic index are stored and encrypted on‑device; Microsoft says it does not share snapshots or the associated vector index with the cloud. Access to Recall is gated by Windows Hello authentication and encryption keys protected by the TPM and virtualization‑based security (VBS) enclaves.
- User control: You can opt in or out, exclude specific apps from being captured and delete stored snapshots. Beyond those privacy toggles, there is no fine‑grained insight into the vector model or the internal representations used for semantic search. Third‑party apps (Signal, Brave, AdGuard) have implemented blocking mechanisms for Recall in response to privacy concerns, so ecosystem responses can shape behavior too.
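Administrators who want to verify Recall’s state in scripts rather than through the Settings UI can read the relevant Group Policy registry value directly. Below is a minimal read‑only sketch in Python; the value name DisableAIDataAnalysis under Software\Policies\Microsoft\Windows\WindowsAI is the widely reported policy for turning off Recall snapshots, but treat it as an assumption and verify the current name against Microsoft’s policy documentation.

```python
import winreg

POLICY_KEY = r"Software\Policies\Microsoft\Windows\WindowsAI"

def recall_policy_state(hive, hive_name):
    """Report whether the Recall snapshot policy value is set in a registry hive."""
    try:
        with winreg.OpenKey(hive, POLICY_KEY) as key:
            # "DisableAIDataAnalysis" is the reported policy value name; an
            # assumption here -- confirm against current Microsoft docs.
            value, _ = winreg.QueryValueEx(key, "DisableAIDataAnalysis")
            return f"{hive_name}: DisableAIDataAnalysis = {value} (1 = snapshots disabled)"
    except FileNotFoundError:
        return f"{hive_name}: no Recall policy configured"

for hive, name in ((winreg.HKEY_LOCAL_MACHINE, "HKLM"), (winreg.HKEY_CURRENT_USER, "HKCU")):
    print(recall_policy_state(hive, name))
```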
Live Captions (system‑wide) — real‑time speech‑to‑text and translation
- What it does: Live Captions transcribes audio from the system — videos, meetings, local media — into captions that appear as overlays or a bottom caption bar. On Copilot+ PCs, Live Captions can also perform real‑time translation (for example, translating spoken English to displayed German), and Microsoft emphasizes that the processing can happen entirely on‑device.
- Local vs cloud: When Live Captions runs on a Copilot+ PC it uses on‑device speech models and language packs; captions are not stored by default and Microsoft states that caption generation does not send the audio or transcripts to the cloud. The feature requires the relevant language packs and, for translations, Copilot+ hardware in some Windows builds.
- User control: Live Captions is a user toggle (quick settings, Accessibility), and you can personalize display options and whether your microphone input is included. However, once enabled the system will attempt to transcribe any audio it plays — there’s no per‑process “microphone permission” for captions beyond the global Live Captions setting. You can turn Live Captions on or off and choose languages, but you cannot direct which silicon the system uses; Windows decides whether to route the work to the NPU.
Cocreator / Paint generative features — local image synthesis and edits
- What it does: Cocreator in Paint lets you convert rough sketches or text prompts into generated images, perform generative erase/fill and restyle photos. On Copilot+ devices the diffusion models run locally and use the NPU for inference, enabling quicker, offline‑capable workflows with lower latency.
- Local vs cloud: Microsoft’s implementation is a hybrid — the actual image synthesis inference runs on device on eligible NPU hardware, but Microsoft also uses cloud‑based safety filters and moderation steps to enforce content rules and abuse prevention. Users must sign in with a Microsoft account for Cocreator, and Microsoft notes the hybrid approach explicitly: local generation plus cloud safety checks.
- User control: Cocreator is shown or hidden depending on your device’s capability; if your hardware lacks the NPU threshold the feature may be blocked or fall back to a cloud experience. You can disable or avoid using Cocreator, and in Paint you can remove generated content, but you cannot swap the on‑device model for an external custom model through the Paint UI — the model is a closed part of the Copilot runtime.
Click to Do, AI Actions and Copilot Actions — contextual automation & agentic workflows
- What they do: Click to Do surfaces contextual suggestions from anything on your screen (e.g., extract text, summarize a document, offer image edits). Copilot Actions (agentic workflows) can be granted permissions to open apps, manipulate files and run multi‑step automation inside a distinct, visible desktop so users can monitor and interrupt actions.
- Local vs cloud: Microsoft uses a hybrid model: short, latency‑sensitive reasoning can run on‑device using small local models on Copilot+ hardware; heavier reasoning or long‑context tasks may fall back to cloud models. The agent execution environment is designed to be visible and interruptible to increase accountability.
- User control: Actions are opt‑in and permissioned. Administrators get additional governance controls in enterprise builds (vPro, Intune policies) but individuals only see permissions prompts and the ability to stop or revoke agent access; they do not get low‑level control over which internal accelerator (NPU vs GPU) handles stages of the workflow.
Windows Studio Effects, Super Resolution and other media enhancements
- What they do: Camera and microphone enhancements (eye contact correction, background blur, voice focus, noise suppression), local upscaling (Super Resolution) and visual filters are accelerated on Copilot+ NPUs for smoother real‑time performance and battery efficiency.
- User control: These features are controlled through Settings or the respective app UI (camera/microphone settings). You can toggle individual effects, but you cannot reassign their execution to different silicon; Windows and the driver stack route workloads to the most efficient engine.
How Windows and Intel hardware actually manage execution — what you can’t control
The OS and platform runtime are the arbiters of execution. Microsoft’s Windows Copilot Runtime and the driver/firmware stack schedule inference across CPU, GPU and NPU with the following practical consequences:
- Automatic routing: Windows decides whether to run an inference task on the NPU, Arc GPU or CPU depending on model shape, memory footprint, energy budget and thermal headroom. Users don’t get a simple “use NPU” toggle per feature.
- Limited visibility: Task Manager and some third‑party tools can show utilization of the NPU, but they rarely explain which model or prompt context is active on the NPU. For power users, that means limited telemetry about model inputs, interim representations or the exact dataset that informed an inference; a minimal probe of what the runtime does expose is sketched after this list.
- Firmware and BIOS dependency: The NPU is often underpowered or unavailable without the latest UEFI/firmware and vendor drivers. OEM validation (Intel Evo, Copilot+ certification) includes BIOS/UEFI, thermal and power profiles that keep the NPU usable across mains and battery modes. That is why some early devices shipped with NPUs present but functionally limited until firmware updates arrived.
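There is no supported per‑feature override for that routing, but developers can at least enumerate which execution backends their machine’s runtime exposes. A minimal probe using ONNX Runtime’s Python API, assuming an appropriate onnxruntime package is installed (package and provider names vary by build; onnxruntime-directml is one example):

```python
import onnxruntime as ort

# Enumerate the execution providers this ONNX Runtime build can use.
# The list reflects the installed package (e.g. a DirectML-enabled build
# adds "DmlExecutionProvider"); it does not reveal what Windows' own
# Copilot features are running on the NPU at any given moment.
print(ort.get_available_providers())
```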
Energy, thermals and the performance trade‑offs
One of Microsoft’s design goals for Copilot+ devices is better efficiency for AI workloads. Running inference on a purpose‑built NPU is usually more energy efficient than running the same model on a CPU or GPU, which produces two tangible benefits:
- Longer battery life for mixed workloads (inference + productivity) when NPU offload succeeds.
- Lower perceptible heat and fan noise for short‑lived assistant interactions.
Developer and enterprise visibility: APIs, vPro and governance
- Developers: Microsoft exposes APIs (ONNX Runtime, Windows AI Platform) that enable software to invoke NPUs programmatically. In practice this is a developer scenario — mainstream users and administrators won’t use these APIs directly. For applications, Microsoft recommends using the Windows Copilot Runtime and the validated stacks to ensure compatibility; a minimal provider‑selection sketch follows this list.
- Enterprise: Intel vPro integrations and Windows management tools provide deeper inventory and configuration options (device discovery, energy profiles, firmware state). For enterprises that require auditability, these stacks bring more governance than consumer devices, but still do not provide low‑level model introspection: administrators can see that Copilot features are enabled, and query device capabilities, but they won’t be given full access to the internal weights of on‑device models.
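As a sketch of the developer‑facing path mentioned above: ONNX Runtime lets an application state an ordered provider preference for its own models, which is as close to “choose the silicon” as the stack currently gets, and even then it is a request, not a guarantee. The model path below is a placeholder, and DmlExecutionProvider is available only in DirectML‑enabled builds:

```python
import onnxruntime as ort

# Sketch: ask the runtime to prefer the DirectML path (which can target
# GPU/NPU silicon) and fall back to CPU. "model.onnx" is a placeholder;
# the runtime silently walks down the list if a provider is unavailable.
session = ort.InferenceSession(
    "model.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
print("Providers actually in use:", session.get_providers())
```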
Privacy, security and the Recall debate — what to worry about, and what’s mitigated
Local processing reduces cloud exposure, but it does not remove risk. The most debated example is Recall:
- Why Recall worries privacy advocates: it periodically captures screens that may include passwords, banking pages or other sensitive material; even if encrypted locally, a compromised device could expose those snapshots. That led to third‑party browser and app developers explicitly blocking Recall access inside their apps.
- Microsoft’s mitigations: Recall is opt‑in, uses Windows Hello gating, stores snapshots encrypted with TPM‑protected keys and limits snapshot storage sizes and retention. Nevertheless, these mitigations protect against external access more effectively than they protect against an already compromised local environment.
- Live Captions and local transcription: Microsoft states that Live Captions processing occurs on‑device on Copilot+ PCs and that captions are not stored persistently; this reduces cloud leak but creates an always‑listening inference surface while the feature is enabled. Users control the feature at the OS level, but not its internal routing.
Practical, buyer‑level guidance: what to check before you enable local AI features
- Verify hardware certification: Confirm the device is listed as Copilot+ or check the NPU TOPS value and that it meets the 40+ TOPS guidance. Microsoft’s Copilot+ device guidance and OEM product pages list certified machines and supported silicon families (Snapdragon X series, Intel Core Ultra 200V, AMD Ryzen AI 300 series).
- Keep firmware & drivers updated: Many NPU features require the latest UEFI/BIOS and platform drivers. If a feature behaves poorly, check for OEM firmware updates first.
- Audit privacy settings before enabling Recall: If you try Recall, exclude sensitive apps, enable Windows Hello and set retention limits. Consider device encryption and strong device‑access controls if Recall is enabled.
- Use Live Captions selectively: Turn it on when you need captions or translations; recognize it will transcribe system audio while enabled. Adjust appearance and microphone inclusion to match your scenario.
- For enterprises: pilot Copilot Actions in a controlled ring, define least‑privilege agent permissions and ensure logging/alerting surfaces agent behavior for auditing. Use vPro and enterprise management tools to inventory NPU capabilities and apply policy gating.
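To make that inventory point concrete, here is a hedged sketch: a Python wrapper around PowerShell’s Get-PnpDevice that flags devices whose friendly name mentions an NPU. The name‑match heuristic is an assumption, not a guaranteed device‑class query; production fleets should rely on MDM hardware inventory (for example, Intune) instead.

```python
import subprocess

# Heuristic NPU inventory: list PnP devices whose friendly name mentions "NPU".
cmd = (
    'Get-PnpDevice | Where-Object { $_.FriendlyName -match "NPU" } '
    '| Select-Object Status, Class, FriendlyName | Format-Table -AutoSize'
)
result = subprocess.run(
    ["powershell", "-NoProfile", "-Command", cmd],
    capture_output=True, text=True, check=False,
)
print(result.stdout.strip() or "No NPU-like devices reported by Get-PnpDevice.")
```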
Strengths, limits and the strategic trade‑offs
Strengths
- Latency and utility: On‑device inference removes round trips for many common tasks (image edits, short text completions, speech transcription), making the assistant feel integrated and immediate.
- Privacy posture: Keeping inference local reduces cloud exposure by default and enables offline modes for several features.
- Energy efficiency for targeted workloads: NPUs typically provide better ops/watt for inference than CPU/GPU alternatives, which can improve battery life under mixed usage.
Limits and risks
- Limited control and transparency: Users and admins can enable/disable features and scope access, but they do not get low‑level control over the runtime’s scheduling choices or the internal model state. That makes the platform a managed experience rather than a developer sandbox for arbitrary local models.
- Fragmentation across the installed base: Copilot+ gating creates a two‑tier experience across Windows devices. Some features will be limited or cloud‑backed on older machines; buyers must read spec sheets carefully.
- Privacy perception: Even with opt‑in controls, features that "watch" or "record" screens or audio create adoption friction and regulatory scrutiny; ecosystem pushback (e.g., Brave/Signal blocking Recall) shows real resistance.
Conclusion — what “control” really means in the Copilot+ world
The arrival of NPUs and Copilot+ hardware delivers a meaningful, practical step toward responsive, privacy‑lean AI on the PC. Several features — Recall, Live Captions with translation, Paint’s Cocreator and system‑wide media enhancements — do execute locally on Copilot+ devices and give users the immediate benefits of low latency and offline capability. But control is mostly at the feature level: you can enable or disable features, scope them to exclude applications or folders, and use enterprise policies to gate access. You cannot, in typical consumer or enterprise settings, reassign the execution of a Copilot feature from one silicon block to another, substitute your own model into Paint’s Cocreator, or inspect the internal vector store supporting Recall beyond Microsoft’s published privacy architecture.

For most users the net effect is positive: Copilot+ hardware and the Intel Core Ultra line give Windows a credible on‑device AI foundation that makes everyday features — captions, image edits, semantic search — faster and more private by default. For privacy‑sensitive users, administrators and regulators, the important work is governance: test features in controlled pilots, keep firmware and Windows builds up to date, apply least‑privilege policies and decide which trade‑offs (convenience vs absolute control) are acceptable for your environment. The new AI PC is a platform first and a toolbox second — powerful, useful, and deliberately opinionated about how inference is placed and managed.
Source: PCWorld, “Which local AI functions can really be controlled?”