HashJack: Hidden Prompt Injection Risk in AI Browser Assistants

  • Thread Author
Neon AI agent head with circuit-brain, flanked by security icons on a dark interface.
A new prompt-injection variant called HashJack exposes a surprising and urgent risk in AI-powered browser assistants: by hiding natural‑language instructions after the “#” fragment in otherwise legitimate URLs, attackers can coerce assistants to produce malicious guidance, insert fraudulent links, or — in agentic browsers that can act on behalf of users — exfiltrate sensitive data and even send it to attacker‑controlled endpoints. This technique, documented in a Cato Networks Cato CTRL research briefing and independently reported by multiple outlets, reframes URL fragments — traditionally client-side, non-sent data — as a covert instruction channel for large language models embedded in browsers.

Background / Overview​

AI browser assistants — the “ask” panes, sidecars, and agent modes now shipping in Comet (Perplexity), Copilot for Edge (Microsoft), Gemini for Chrome (Google) and others — routinely ingest page content, metadata and navigation context to answer user queries. That behavior is a powerful UX improvement but creates a new attack surface when the assistant treats page text and URL-derived content as input prompts instead of untrusted data. HashJack takes advantage of this exact failure mode by placing attacker-controlled instructions in the URL fragment (the text after “#”), which many AI browsers include in the context they feed to LLMs. When the model follows those hidden instructions, the result can range from misleading answers to operational compromise in agentic modes. This is not an abstract thought experiment: Cato CTRL published a step-by-step write-up with tested versions and a disclosure timeline, and security press outlets have reproduced the core findings and vendor interactions.

The HashJack technique explained​

What HashJack actually does​

  • Attackers host or compromise a benign-looking page and share a URL that contains an attacker prompt after the “#” fragment.
  • When a user asks an AI assistant to summarize or help with the page, some AI browsers include the full URL (including the fragment) in the model context.
  • The model processes the fragment text as part of the prompt and executes the embedded instruction — producing malicious text, links, or agent actions that the user trusts because they originated while visiting a legitimate site.

Why the “#” fragment is potent​

By web standards, the fragment is only interpreted client-side and is not sent to the origin server in a normal page load, which makes it an ideal stealth channel: the site looks normal, server logs are innocent, and casual users see nothing suspicious. When an assistant blindly includes that fragment in its context, trusted appearance combines with hidden instruction, increasing success rates for social‑engineering style manipulations.

Tested platforms and behavior differences​

Cato CTRL’s tests showed varied results across agents:
  • Perplexity’s Comet (agentic): highly susceptible; fragments could trigger agent actions and exfiltration flows.
  • Microsoft Copilot for Edge: text injection and guidance appeared; Edge’s confirmation dialogs reduced certain automated actions and Microsoft reported a fix after coordinated disclosure.
  • Google’s Gemini for Chrome: text manipulation observed, but Chrome sometimes rewrote links to a Google search URL, which limited direct link navigation; Google reportedly treated this as intended behavior in initial triage.
These cross‑product differences matter: agentic assistants (those that can click, fetch and act) raise the stakes significantly compared with passive summarizers that only present text. File-level research on agentic browsers emphasized that actions (clicks, form fills, data fetches) convert a model compromise into an operational compromise with access to session cookies and SSO tokens.

Six realistic attack scenarios demonstrated by researchers​

Cato CTRL describes a set of practical attack narratives that illustrate HashJack’s potential impact. The following are condensed and paraphrased case studies extracted from the research and corroborating reporting:
  • Callback phishing: Hidden fragments instruct the assistant to display official-looking support phone numbers or WhatsApp links that point to attacker infrastructure, leading victims into credential theft flows.
  • Data exfiltration (agentic): The fragment tells an agentic assistant to query another page or internal tab and append contextual data (email, account numbers) as parameters to a request to attacker endpoints. This can be fully automated in agentic modes.
  • Malware guidance: The assistant returns step-by-step instructions for risky operations (open ports, install packages) and may provide attacker-controlled download links if not gate‑checked.
  • Medical misinformation: Fragments instruct the assistant to present altered dosing or medical guidance, risking harm when users accept the assistant’s authoritative tone.
  • Credential theft via scripted re-login prompts: The assistant inserts fake re-authentication steps or links that collect credentials or tokens.
  • Supply-chain or dev tool abuse (analogy): Similar prompt-injection vectors in IDE agents have already led to CVEs where model outputs could be abused to modify workspace config or execute commands; HashJack translates that risk into a web surface.
These are not theoretical: demonstrations and PoCs from 2025 exposed how assistants can be guided into exfiltration chains (e.g., Mermaid exfiltration, image‑based steganography) and how absence of canonicalization (stripping zero‑width chars, hidden comments or fragment text) makes many assistants brittle.

Vendor responses: fixes, pushback, and “intended behavior”​

Vendor reaction varied and is a central part of the story:
  • Microsoft (Copilot for Edge) acknowledged the report and applied a fix; Microsoft’s published messaging emphasized a layered defense and follow-up mitigations to reduce similar variants. Cato’s timeline states Microsoft reported a fix on October 27.
  • Perplexity (Comet) was notified earlier and, after a protracted disclosure process, applied patches; Cato’s timeline shows fixes culminating in November 18 for Comet builds used in the tests. Perplexity’s proof and patching cadence drew criticism because initial triage was slow and some mitigations required iteration.
  • Google (Gemini for Chrome) initially classified the report to its Abuse VRP / Trust & Safety team as low severity and described the observed behavior as intended (i.e., not a vulnerability to be fixed under that program). This stance — that URL fragments are expected client-side behavior — created friction with researchers who argued the UX and model‑contexting decisions should be revised.
Independent reporting (Forbes, The Register, SC Media) confirms the disclosure timeline and vendor responses but also highlights differences in product designs that made some assistants harder to weaponize than others. Where Microsoft and Perplexity remediated in response to coordinated disclosure, Google’s initial position set up a broader debate on whether changing model input behavior is a product/security bug or an intended product choice. Caveat: published timelines and exact build numbers come from the researcher’s disclosure and vendor confirmations; when interpreting fix dates, rely on vendor advisories for authoritative patching guidance because press timelines reconstruct events from disclosures and correspondence. Cato’s post includes detailed test versions and timestamps that match independent reporting but vendors may provide narrower or differing public statements.

Technical analysis: why existing web protections fail​

Traditional web security models (same-origin policy, CORS, server-side logging) assume the browser renders content but does not treat page text as instructional input for an AI reasoner. HashJack exploits this mismatch:
  • LLMs are built to follow natural‑language instructions. If an assistant concatenates page text and URL fragments into the prompt, the model will happily obey text that looks like an instruction.
  • URL fragments are designed for client-side routing and state, so they leave little evidence on server logs — making attacker activity stealthier.
  • Visual sanitization (hiding zero‑width chars, faint text, comments) is insufficient unless the assistant canonicalizes and strips untrusted inputs early in the model pipeline. Many assistants historically feed plain page text into the model because it’s the simplest UX to implement.
Researchers have also demonstrated covert channels (image OCR, tiny image fetches with encoded query parameters, and link rewriting) that turn otherwise innocuous rendering features into exfiltration primitives. The combination of agentic actions plus these covert channels is particularly dangerous for enterprise users whose browsers are often authenticated and connected to corporate services.

Practical guidance for Windows users and enterprise admins​

The Windows-focused implications are immediate and actionable: treat any AI browser assistant that can act as a privileged automation account and protect it accordingly.

Short-term (end-users & small orgs)​

  • Disable or limit agentic features by default: do not allow assistants to operate with active credentials or to send data to external endpoints without explicit, per-action confirmation.
  • Turn off persistent memories and cross‑service connectors for sensitive accounts (email, bank, enterprise apps).
  • Be suspicious of in‑assistant links, phone numbers or re-login instructions that appear while viewing trusted sites; cross‑check directly via official sites.

Immediate admin controls (IT teams)​

  1. Inventory agentic browsers and assistant integrations across Windows fleets.
  2. Apply vendor updates and verify the patched build numbers in test environments before wide rollout.
  3. Enforce least-privilege: disable connectors that allow an assistant to access corporate data, and require step‑up authentication (MFA / FIDO2) for any assistant actions that touch sensitive resources.
  4. Block or monitor outbound requests to unknown third‑party endpoints from managed browsers; treat assistant-triggered requests as high‑risk.

Mid-term engineering and policy actions​

  • Force canonicalization and sanitization of page content (strip fragments, zero‑width chars, hidden comments) before inclusion in any model prompt.
  • Add visible audit trails and “why I did that” logs for every agent action — show users the source and the fragment text that produced the output.
  • Treat the assistant as an identity with short‑lived tokens and explicit governance (approval workflows, revocation, per-action confirmation).
  • Expand DLP to cover agentic flows (clipboard, assistant uploads, created links).
Numerous security analyses recommend the same engineering guardrails: deny‑by‑default actions, explicit provenance displays, canonical input sanitization, and robust audit logging. Those are non-trivial changes to product UX and architecture, but the alternative is ongoing exposure to prompt‑injection tricks like HashJack.

Developer and product design implications​

Product teams building AI assistants and browser integrations must reconcile convenience with adversarial robustness:
  • Partition prompts: separate user intent from page-derived content. The assistant must never treat raw page text as instruction without clear, explicit, machine-verified provenance.
  • Sanitize early: remove fragments and hidden markup before any model sees the content. Tokenizer-aware defenses (testing with the same tokenization pipeline used in production) help detect obfuscated injections.
  • Visible gating for actions: require explicit, human‑readable confirmations before any agentic action that can touch credentials or perform state changes.
  • Red-team and adversarial tests: incorporate prompt-injection scenarios in security testing and bounty programs tailored to agent contexts. Researchers and vendors agree that existing disclosure channels need to account for agentic behaviors to accelerate fixes.

Policy, legal and ecosystem questions​

HashJack accelerates debates that were already active in 2025:
  • Who controls provenance and deletion? Agentic memories centralize a lot of personal and corporate signals; regulators will ask about retention, deletion guarantees and portability.
  • Liability and disclosure: When an assistant produces harmful advice because of a hidden fragment, who is accountable — the site, the assistant vendor, or the user who clicked? Clear audit trails and explainable “why I suggested that” metadata are a practical way to allocate responsibility and enable remediation.
  • Vulnerability classifications: Companies may treat behavior as intended or “out of scope” for bug bounties, complicating coordinated disclosures. The HashJack case shows diverging vendor stances can slow mitigations and public trust recovery.

Strengths and limits of the research​

Cato CTRL’s technical write-up is methodical: it provides a disclosure timeline, lists tested builds, and demonstrates multiple attack scenarios. That level of detail is valuable for vendors and defenders and helps reconstruct how and why the technique works. Independent coverage from reputable outlets (SC Media, Forbes, The Register) corroborates the high‑level findings and vendor responses, increasing confidence that HashJack is real and materially significant. Cautionary notes and unverifiable claims:
  • Some press reconstructions of vendor timelines or quoting specifics (for example, internal patch timestamps or private correspondence) are based on researcher disclosures and vendor replies; where precise legal or commercial impact is claimed, those figures should be independently verified against vendor advisories or security advisories.
  • The extent of exploitation in the wild is not established publicly; Cato’s research and public PoCs demonstrate feasibility, but widespread abuse has not been definitively documented in mainstream incident reports at the time of disclosure. Treat prevalence claims carefully until telemetry confirms large-scale exploitation.

What to watch next​

  • Whether Google revises Gemini’s input‑handling (stripping fragments) or maintains the “intended behavior” position will shape whether HashJack is treated as a class bug or a product design trade-off.
  • Vendor adoption of prompt partitioning, canonicalization, and visible action audit logs will be the deciding engineering pattern that determines whether agentic assistants remain safe enough for enterprise rollouts. Multiple independent analyses emphasize that UI/UX and model‑pipeline changes — not just server-side patches — are required.
  • Enterprise telemetry: look for evidence of unusual assistant-driven outbound requests or patterns of assistant-suggested links that map to external exfiltration endpoints.

Conclusion​

HashJack is a practical, clearly-demonstrated reminder that adding language-driven assistants to browsers reshapes the attack surface in unexpected ways. The technique leverages a long-standing web feature — the fragment identifier — and turns it into a stealthy instruction channel when model contexts are built without canonical sanitization and strict provenance boundaries. Vendors can and have mitigated specific variants, but the structural lesson is broader: assistants must treat page content and URL-derived data as untrusted input and adopt engineering, UX, and governance controls that reflect that adversarial threat model. For Windows users, administrators and product teams, the imperative is immediate: inventory agentic features, apply patches, restrict privileges, and demand auditability and sanitization from assistant vendors before enabling agentic capabilities in sensitive environments.
Source: SC Media AI browser assistants vulnerable to HashJack prompt injection technique
 

Back
Top