LangGrinch CVE-2025-68664: Patch LangChain Core to Stop Serialization Exploits

The discovery and public disclosure of a critical serialization-injection flaw in LangChain Core — tracked as CVE-2025-68664 and widely discussed under the nickname LangGrinch — is a timely reminder that the rise of agentic AI and autonomous workflows changes the security calculus. The flaw is not a prompt-level weakness; it is a supply‑chain vulnerability in a widely used orchestration framework that, if left unpatched, can turn ordinary LLM outputs into a conduit for secret exfiltration, unintended class instantiation, and even template-driven code execution. Organizations must treat AI frameworks and orchestration layers as first-class attack surfaces and apply the same disciplined vulnerability management and runtime controls used for any critical infrastructure. The patch is straightforward; the operational work to find and clean vulnerable deployments is not.

Background / Overview​

AI applications have evolved from passive consumer features into active, permissioned actors: agents that call tools, orchestrators that maintain state, and pipelines that serialize and deserialize structured metadata across services. That architectural shift widens the attack surface beyond user prompts — to SDKs, serialization formats, orchestration state, and the libraries used to persist and restore agent state.
LangChain is one of the leading frameworks enabling agentic workflows in Python and JavaScript ecosystems. The vulnerability CVE-2025-68664 sits precisely at the heart of that workflow: the dumps() / dumpd() serialization APIs (and the corresponding loads behavior) failed to reliably separate data from control when a reserved marker — the lc key — could appear in otherwise free-form dictionaries. When that happened, deserialization treated attacker-controlled data as a trusted object descriptor rather than inert user data, enabling dangerous object reconstruction paths. This is a classic injection failure in a serialization context and a high-risk supply‑chain problem because many organizations treat framework components as trusted.

The LangGrinch vulnerability explained​

What went wrong: the reserved "lc" marker​

LangChain’s internal serialization format uses a reserved key, lc, to mark objects that represent serialized LangChain constructs (prompts, messages, documents, etc.). The vulnerable functions failed to escape or neutralize this reserved marker when user-controlled dictionaries were serialized. If an LLM output, user-supplied metadata, or third-party plugin injected a dictionary containing an lc key, the subsequent deserialization step could interpret that structure as a legitimate, executable LangChain object.
That misclassification breaks the fundamental design goal of serialization: treat data as data, not code. Because of how object reconstruction and the framework’s import maps work, maliciously crafted serialized manifests could cause:
  • Secrets exfiltration — deserialized objects could access environment variables or runtime secrets if the deserialization pathway allowed such reads.
  • Unexpected class instantiation — objects of authorized classes could be created with attacker-controlled initialization parameters, producing side effects.
  • Template-driven code execution — in configurations using templating engines like Jinja2, malicious templates embedded in reconstructed objects could execute arbitrary code.
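To make the injection shape concrete, the sketch below shows the approximate form of a LangChain serialization manifest and a cheap recursive pre-check for the reserved marker. The field names (lc, type, id, kwargs) follow public write-ups of the format and should be treated as approximate; the helper function is illustrative, not part of any library.

```python
# Illustrative only: the approximate shape of a serialized LangChain object
# manifest. On a vulnerable version, attacker-controlled data of this shape
# placed into metadata and later round-tripped through dumps()/loads() could
# be treated as an object descriptor instead of inert data.
malicious_metadata = {
    "lc": 1,                       # reserved marker the vulnerable code failed to escape
    "type": "constructor",         # tells the loader to reconstruct an object
    "id": ["langchain", "...", "SomeClass"],   # import path (elided here)
    "kwargs": {"template": "{{ ... }}"},       # attacker-controlled init args
}

def looks_like_lc_manifest(obj) -> bool:
    """Cheap pre-serialization check: does untrusted data carry the reserved key?"""
    if isinstance(obj, dict):
        if "lc" in obj:
            return True
        return any(looks_like_lc_manifest(v) for v in obj.values())
    if isinstance(obj, list):
        return any(looks_like_lc_manifest(v) for v in obj)
    return False

print(looks_like_lc_manifest(malicious_metadata))   # True
print(looks_like_lc_manifest({"notes": "benign"}))  # False
```

A check like this is a tripwire, not a fix: the real remediation is upgrading to a patched version, covered below.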

Technical scope and patched versions​

Trusted public records and project advisories confirm the affected ranges and patches:
  • For the Python langchain-core lineage: versions < 0.3.81 and versions >= 1.0.0 and < 1.2.5 are affected; 0.3.81 and 1.2.5 include the fixes.
  • Equivalent vulnerabilities were disclosed in the JavaScript/TypeScript packages and tracked under a coordinated advisory (separate CVE identifiers are associated with the JS ecosystem). Patch versions differ slightly in the JS packages and were released in tandem.
The vulnerability carries a CVSS v3.1 score of 9.3, reflecting remote exploitability with little or no privilege and high confidentiality impact. Multiple vulnerability databases and vendor advisories independently list the same score and severity.
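The affected ranges above translate into a simple version predicate. The sketch below encodes them with naive integer parsing (pre-release suffixes are not handled); the function name is ours, not part of any tooling.

```python
def langchain_core_is_patched(version: str) -> bool:
    """Return True if a langchain-core version string is in a patched range.

    Patched per the advisory ranges quoted above: >= 0.3.81 on the 0.x line,
    >= 1.2.5 on the 1.x line. Naive parsing; pre-release tags unsupported.
    """
    parts = tuple(int(p) for p in version.split(".")[:3])
    if parts < (1, 0, 0):
        return parts >= (0, 3, 81)
    return parts >= (1, 2, 5)

# Checks against the advisory boundaries:
print(langchain_core_is_patched("0.3.80"))  # False — affected
print(langchain_core_is_patched("0.3.81"))  # True  — fixed
print(langchain_core_is_patched("1.2.4"))   # False — affected
print(langchain_core_is_patched("1.2.5"))   # True  — fixed

# To read the version installed in the current environment (raises if absent):
# from importlib import metadata; metadata.version("langchain-core")
```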

Why this matters for AI supply chain security​

Agentic applications magnify the risk of framework-level flaws in three ways:
  • Agents act with privileges. Agents and orchestrators commonly hold API keys, cloud role bindings, or managed identities to access data sources and tooling. A compromised orchestration layer can be an instant pivot to sensitive resources.
  • Serialization is everywhere. Orchestration frameworks serialize state for persistence, resume, and cross-service messaging. Once a serialization API is trusted, many code paths may rely on its benign behavior.
  • LLM outputs are attacker-influenced. Prompt injection research has shown how model responses can be coerced into returning structured content. If those outputs are later serialized and trusted, the LLM becomes an indirect remote code/data injection vector. This coupling of prompt-layer weaknesses with supply‑chain vulnerabilities is exactly what makes the flaw so dangerous.

Microsoft Defender’s posture: what the vendor recommends and how it helps​

Microsoft’s published case study frames supply‑chain risk as an operational problem that requires discovery, posture management, runtime detection, and developer-focused remediation workflows. Their guidance for LangGrinch is as follows:
  • Patch immediately. Upgrade LangChain Core to patched versions (Python: 0.3.81 or 1.2.5 or later). Microsoft lists this as the primary mitigation.
  • Inventory and hunt. Use Defender for Cloud (Cloud Security Posture Management / DCSPM) to identify compute and container assets running vulnerable langchain-core versions. Microsoft also recommends connecting code environments to Defender for Cloud to find vulnerable dependencies earlier in the SDLC.
  • Remediate across the lifecycle (Code → Ship → Runtime). Defender’s posture recommendations span CI/CD scanning, container image scanning, and runtime controls that can block suspicious behaviors and network egress. Microsoft emphasizes integrated workflows that create GitHub issues directly from Defender for Cloud and use AI-assisted coding agents to shorten remediation cycles.
  • Hunting guidance and alerts. Microsoft produces hunting queries (KQL) for Defender XDR to find devices with vulnerable packages installed. The guidance points investigators to suspicious Python processes that access environment variables or make unusual network calls immediately after LLM interactions as high-priority indicators. The published KQL example is practical for teams already ingesting software inventory telemetry.
These capabilities — automated discovery, integrated remediation workflows with GitHub, and XDR hunting — are useful for rapidly reducing exposure in large estates. But they are not a panacea. The guidance itself acknowledges limitations in coverage and integration work required to connect code repositories and compute telemetry. Microsoft also notes continued expansion of platform coverage; customers should verify feature availability for their tenant and region before relying on platform-only remediation.

Confirmed facts and verifications​

  • CVE‑2025‑68664 is a real, publicly disclosed vulnerability affecting langchain-core serialization APIs; the issue is documented in NVD and multiple security advisories. CVSS = 9.3.
  • Patched Python versions are 0.3.81 and 1.2.5 (and later). For JS/TS, parallel patches were released for relevant package versions. Organizations must confirm which LangChain distribution they use (langchain vs langchain-core, Python vs npm).
  • Public, community-sourced proof-of-concept analyses and write-ups exist; however, there is no consensus public evidence of a large-scale, coordinated mass-exploitation campaign at the time of writing. Security vendors report PoCs and active community discussion, and some warn about the ease of weaponization — but broad, confirmed in-the-wild exploitation remains unproven in open reporting. Treat assertions of mass exploitation with caution until authoritative telemetry is published.

Practical, prioritized actions for security and engineering teams​

Below is an operational checklist — ranked and actionable — to reduce immediate risk and to harden AI application supply chains against similar problems in the future.
  • Patch first (minutes–hours)
  • Upgrade Python langchain-core to 0.3.81 or 1.2.5+ depending on your branch. Confirm dependency trees and rebuild container images.
  • For JS/TS builds, upgrade @langchain/core and langchain packages to the patched versions listed in the coordinated advisories.
  • Inventory and triage (hours–days)
  • Use software inventory tools (Defender, SCA scanners, container registries) to identify images or hosts with vulnerable versions.
  • If you use Defender for Cloud, run the Cloud Security Explorer and the provided software inventory queries to enumerate affected compute assets. Microsoft’s guidance describes connecting code environments and enabling D‑CSPM/Defender for Containers/Servers to broaden detection.
  • Block, compensate, monitor (days)
  • Where patching is delayed, apply compensating controls: restrict network egress from orchestration hosts, rotate any long‑lived keys accessible to agent runtimes, and enforce least-privilege managed identities.
  • Add runtime instrumentation to log deserialization events, suspicious object instantiation, and access to environment variables from Python processes handling LLM outputs.
  • Hunt and validate (days)
  • Use advanced hunting queries in XDR to look for:
  • Python processes associated with LangChain that read environment variables soon after an LLM call.
  • Unexpected outbound network connections following LLM interactions.
  • Microsoft provided a KQL example that locates installed langchain packages by version; adapt it to your telemetry schema before running it at scale.
  • Shift left (weeks)
  • Treat model outputs as untrusted inputs inside pipelines. Add explicit sanitization layers before any serialization step and avoid serializing raw LLM outputs unless absolutely necessary.
  • Update CI/CD SCA scanning to flag vulnerable langchain versions and fail builds that include them.
  • Add adversarial testing into MLOps pipelines (prompt-injection tests that target metadata fields such as metadata, additional_kwargs, and response_metadata).
  • Governance and cataloging (ongoing)
  • Create an AI Bill of Materials that lists models, agents, SDKs, plugins, container images, and their versions. Map which agents possess which identities and what data sources they can access.
  • Require attestations for third‑party agents/plugins and periodic revalidation of their supply‑chain provenance.
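The "shift left" item above — treat model outputs as untrusted inputs and sanitize before any serialization step — can be sketched as a small recursive filter. This is a hypothetical defense-in-depth helper, not a LangChain API; rejecting (raising) on reserved keys instead of silently stripping them is the stricter variant.

```python
RESERVED_KEYS = {"lc"}  # LangChain's reserved serialization marker

def sanitize_untrusted(obj):
    """Recursively drop reserved serialization markers from untrusted data.

    Intended to run over LLM outputs and user-supplied metadata *before*
    they reach any dumps()/dumpd() call, so attacker-controlled dicts
    cannot masquerade as object descriptors.
    """
    if isinstance(obj, dict):
        return {k: sanitize_untrusted(v)
                for k, v in obj.items() if k not in RESERVED_KEYS}
    if isinstance(obj, list):
        return [sanitize_untrusted(v) for v in obj]
    return obj

tainted = {"answer": "42", "metadata": {"lc": 1, "type": "constructor"}}
print(sanitize_untrusted(tainted))
# {'answer': '42', 'metadata': {'type': 'constructor'}}
```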

Hunting example and operational nuance​

Microsoft’s published KQL snippet is a practical starting point for defenders already ingesting software inventory:
DeviceTvmSoftwareInventory
| where SoftwareName has "langchain" and (
    // Lower version ranges
    (SoftwareVersion startswith "0." and toint(split(SoftwareVersion, ".")[1]) < 3)
    or (SoftwareVersion hasprefix "0.3." and toint(split(SoftwareVersion, ".")[2]) < 81)
    // v1.x affected before 1.2.5
    or (SoftwareVersion hasprefix "1." and (
        toint(split(SoftwareVersion, ".")[1]) < 2
        or (
            toint(split(SoftwareVersion, ".")[1]) == 2 and toint(split(SoftwareVersion, ".")[2]) < 5
        )
    ))
)
| project DeviceName, OSPlatform, SoftwareName, SoftwareVersion
That query is useful as-is for inventories ingested into Defender telemetry, but teams should avoid blind reuse: adapt the logic to match your inventory schema and to package managers that report versions differently (wheel metadata, pip freeze, container image labels). Also, remember that not all runtime installs will report through the same inventory feed — serverless functions, ephemeral containers, or developer laptops may be missed without additional collectors.
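For feeds the platform inventory does not cover, the same affected-range logic can be reapplied to pip-freeze-style text. A rough sketch, assuming exact `langchain-core==X.Y.Z` pins (range specifiers and extras are not handled):

```python
import re

PIN = re.compile(r"^(langchain-core)==(\d+)\.(\d+)\.(\d+)", re.IGNORECASE)

def vulnerable_pins(freeze_output: str):
    """Scan pip-freeze-style text for langchain-core pins in the affected
    ranges (< 0.3.81, or >= 1.0.0 and < 1.2.5), mirroring the KQL above."""
    hits = []
    for line in freeze_output.splitlines():
        m = PIN.match(line.strip())
        if not m:
            continue
        ver = tuple(int(x) for x in m.groups()[1:])
        affected = ver < (0, 3, 81) or (1, 0, 0) <= ver < (1, 2, 5)
        if affected:
            hits.append(line.strip())
    return hits

sample = "requests==2.32.0\nlangchain-core==0.3.80\nlangchain-core==1.2.5\n"
print(vulnerable_pins(sample))  # ['langchain-core==0.3.80']
```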

Critical analysis — strengths, limits, and operational risk​

Strengths of the Defender-centric approach​

  • Integrated lifecycle remediation: connecting posture findings into GitHub issues and automating fix workflows shortens MTTR in many organizations that already use Azure DevOps/GitHub tooling. This is a material operational advantage for large estates.
  • Telemetry fusion: correlating identity, endpoint, and cloud telemetry with AI-specific signals (prompts, retrieval traces) gives SOC analysts richer context than siloed alerts.

Important limitations and risks​

  • Coverage gaps: Defender’s ability to find all vulnerable instances depends on the organization’s telemetry connectors (containers, servers, code repositories). Remediation may still miss developer desktops, internal CI runners, or third-party services not reporting software inventory. Microsoft’s blog acknowledges ongoing expansion of scanners and coverage. Treat platform promises as conditional until you validate them in your environment.
  • Operational complexity and cost: instrumenting rich prompt telemetry, connecting code repos into posture tools, and tuning runtime detections require skilled staff. False positives from agent telemetry can easily overwhelm teams that lack AI-specific triage playbooks.
  • Over-reliance on vendor defaults: vendor tooling is helpful but not sufficient. Expect to augment with custom input sanitizers, CI gates, and runtime allowlists for deserialization where practical.
  • Exploit speed vs patch windows: high-severity vulnerabilities with low complexity (CVSS 9.3) create pressure to patch rapidly. Even when patches are available, organizations with complex deployment lifecycles will face nontrivial coordination to roll updates across distributed microservices and function runtimes. Multi-stage deployment, staged rollback plans, and canary validation are essential.

Is there evidence of exploitation in the wild?​

Security vendors and researchers published PoCs and exploit analyses shortly after disclosure; this spurred public discussion and constructive mitigation guidance. However, as of the latest coordinated advisories, there is no widely corroborated telemetry indicating a large-scale, automated exploitation campaign tied to CVE‑2025‑68664. That status could change quickly — defenders should treat the vulnerability as high-priority and assume exploitation is feasible, given the trivial attack complexity and public PoCs. Label any claims of widespread exploitation as currently unverified unless backed by reproducible, multi‑tenant telemetry.

Hardening guidance for developers and architects​

  • Never trust serialized LLM outputs. Before you serialize anything produced by a model, treat it as untrusted input. Apply explicit validation that strips or rejects reserved keys like lc, or better yet, avoid round‑tripping model responses through internal serializers unless strictly necessary.
  • Use allowlists for deserialization. Prefer allowlist patterns that only permit known-safe classes to be reconstructed. If the framework offers an allowed_objects parameter, use it; if not, wrap deserialization in a sandbox that enforces type and namespace checks.
  • Minimize secrets exposure. Remove long-lived secrets from runtime environments that agents can reach. Use short-lived managed identities and ephemeral tokens. Audit any code paths that read environment variables inside agent workflows.
  • Tighten template engines. If your orchestration uses template rendering (Jinja2 or similar), ensure templates are rendered with strict sandboxing and that untrusted template inputs are disabled or separated.
  • Adopt adversarial testing. Add tests that attempt to inject reserved markers into metadata fields and assert that pipelines sanitize or reject them.
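The allowlist pattern above can be sketched as a gate placed in front of deserialization. Everything here is hypothetical scaffolding: ALLOWED_IDS is an example allowlist you would populate yourself, and `real_loads` stands in for whatever loader your framework provides.

```python
ALLOWED_IDS = {
    # Hypothetical allowlist: fully-qualified paths of classes your
    # pipeline legitimately reconstructs. Everything else is rejected.
    ("langchain_core", "messages", "HumanMessage"),
}

def guarded_load(manifest: dict):
    """Validate that a serialized manifest names an approved class before
    any object reconstruction happens."""
    if not isinstance(manifest, dict) or manifest.get("lc") != 1:
        raise ValueError("not a serialized object manifest")
    ident = tuple(manifest.get("id", []))
    if ident not in ALLOWED_IDS:
        raise PermissionError(f"class not allowlisted: {ident}")
    # return real_loads(manifest)   # only reached for approved classes
    return ident

print(guarded_load({"lc": 1, "id": ["langchain_core", "messages", "HumanMessage"]}))
```

Failing closed like this means a newly needed class requires an explicit allowlist change — a deliberate friction that keeps attacker-chosen import paths out of the reconstruction step.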

Incident response playbook (high level)​

  • Isolate affected service instances and snapshot forensic data (memory and disk).
  • Rotate credentials and invalidate tokens the affected service could access, starting with high‑value secrets.
  • Execute focused hunts for deserialization anomalies, unusual class instantiation logs, and exfiltration patterns (unexpected blob downloads, egress to unfamiliar endpoints).
  • Remediate code and images, promote patched images through the pipeline, and validate behavior via canaries before broad rollout.
  • Notify downstream and upstream customers/suppliers if your agent or API integrations expose them to risk.

Conclusion​

CVE‑2025‑68664 (LangGrinch) is a stark illustration of how AI supply chain security is no longer an academic exercise. The interplay between LLM outputs, serialization formats, and orchestration frameworks turns old classes of bugs — injection, insecure deserialization, template injection — into AI-native threats with outsized impact. The fix is simple: patch LangChain, inventory your estate, and apply runtime compensations where immediate patching is impossible. The harder and longer-term work is organizational: building telemetry that captures prompt and orchestration signals, shifting left in MLOps, and adopting identity‑centric, least‑privilege controls for agents.
Microsoft Defender’s posture and XDR tooling provide practical capabilities — inventory, hunting, and integrated remediation workflows — that materially shorten the path from discovery to fix for many customers, but no single vendor tool removes the need for developer discipline, secure serialization practices, and proven incident playbooks. Treat frameworks and SDKs as part of your AI Bill of Materials, enforce explicit sanitization of model outputs, and prioritize patching high‑impact libraries as a continuous operational imperative.
If you manage AI workloads: patch first, inventory broadly, and instrument richly. The next supply‑chain surprise won’t wait for a convenient maintenance window.

Source: Microsoft Case study: Securing AI application supply chains | Microsoft Security Blog