AI Security

vLLM CVE-2025-66448: Auto-Map RCE via Model Configs

A critical RCE in vLLM allows malicious model configs to bypass trust_remote_code=False. We analyze the bug, the patch, and what every vLLM operator should do.

Shadab Khan
Security Engineer
7 min read

CVE-2025-66448 is the second high-severity remote code execution vulnerability to land in vLLM in 2025, following CVE-2025-62164 disclosed earlier in the year. It targets the way vLLM processes the auto_map field in Hugging Face model configurations: the framework would fetch and execute Python code from a remote repository even when the operator had set trust_remote_code=False, the documented safety boundary. The vulnerability was patched in vLLM 0.11.1 via pull request #27204. This post analyzes the bug, the exploitation pattern, and the defender response for any organization running vLLM in production.

What is the auto_map field and why does it matter?

When you load a Hugging Face model in vLLM (or Transformers), the framework reads config.json from the model repository to learn architecture, tokenizer, and processing details. The auto_map field is an optional configuration block that maps Auto classes (AutoModel, AutoTokenizer, AutoConfig) to specific implementations — including implementations defined in Python files within the model repository itself. The intended use case is legitimate: a model with novel architecture (say, a Mamba-style state-space model when SSMs were new) needs to ship its own modeling code because Transformers does not yet support it. The Hugging Face safety contract was that this code path required trust_remote_code=True. CVE-2025-66448 is the discovery that vLLM's handling broke that contract.
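
For orientation, this is the shape of an auto_map block in config.json. The file and class names here are illustrative placeholders, not taken from any particular model:

{
  "model_type": "custom-arch",
  "architectures": ["CustomModelForCausalLM"],
  "auto_map": {
    "AutoConfig": "configuration_custom.CustomConfig",
    "AutoModelForCausalLM": "modeling_custom.CustomModelForCausalLM"
  }
}

Each value names a Python file in the repository (minus the .py extension) and a class inside it; resolving the entry means importing that file.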

How does the exploit actually work?

The attacker hosts a model repository containing a crafted config.json with an auto_map block referencing a Python file. The Python file contains arbitrary code in its module-level scope or in the __init__ of an exposed class. When vLLM loads the model, the auto-mapping logic resolves the reference and triggers Python's import machinery — which executes the module-level code regardless of the trust_remote_code flag, because vLLM's pre-flight check did not propagate the flag deep enough into the loader. The result: simply pointing vLLM at the attacker-controlled model identifier is enough to execute attacker code in the inference process, with the privileges of whoever runs vLLM (typically a service account with GPU access and network egress).
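
A minimal sketch of the attacker's repository file, with hypothetical names. The essential detail is that the payload lives at module level, so resolving the auto_map reference is by itself execution:

# modeling_custom.py in the attacker's repository (names hypothetical).
# config.json's auto_map points here; Python's import machinery runs
# this module-level code the moment the loader resolves the reference.
import os

# Executes at import time with the vLLM process's privileges. A real
# payload would typically spawn a reverse shell or steal credentials;
# this harmless stand-in just proves execution.
os.system("id > /tmp/pwned")

class CustomModelForCausalLM:
    # The class the auto_map entry nominally points to. It never needs
    # to be instantiated for the payload above to run.
    pass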

Who is vulnerable?

Any vLLM deployment running a version before 0.11.1 that loads models specified at runtime, whether by user input, by API parameter, or by an automated workflow pulling from public registries, is vulnerable. The most exposed configurations are: hosted inference platforms that let customers specify a model identifier, self-hosted Ollama-style deployments where developers experiment with new Hugging Face models, and CI pipelines that automatically test new model releases. Deployments that exclusively load a hardcoded model from local disk are not directly exploitable via this vector, but they are still vulnerable if a developer fetches a new model into the local cache from an untrusted source.
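
The exposed pattern looks like the following minimal sketch. The function and its caller are hypothetical, but the key detail is real: on vulnerable versions, the flag on the last line did not stop auto_map code execution.

# Hypothetical service code showing the exposed pattern: the model
# identifier arrives from an API caller instead of a vetted allowlist.
from vllm import LLM

def load_model_for_request(model_id: str) -> LLM:
    # Before vLLM 0.11.1, this flag was not propagated into the
    # auto_map resolution path, so a user-supplied model_id pointing
    # at a malicious repository was an RCE vector.
    return LLM(model=model_id, trust_remote_code=False)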

What does the patch actually do?

PR #27204 makes the auto_map handling honor trust_remote_code=False strictly: when the flag is false, vLLM refuses to resolve any auto_map entry that would require executing repository code. Models that genuinely need custom code now produce a clear error rather than silently executing untrusted code. The fix is conservative in that it does not attempt to sandbox the code path, so the right operator posture is to keep trust_remote_code=False as the default and enable it only for explicitly vetted models. The patch landed in vLLM 0.11.1; the advisory recommends upgrading immediately for any production deployment.
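
Post-patch behavior, sketched under the assumption of a model whose config.json carries an auto_map entry. The model identifier is a placeholder and the exact exception type varies by version:

# With vLLM >= 0.11.1 and trust_remote_code=False, a model that needs
# repository code fails fast instead of importing it.
from vllm import LLM

try:
    llm = LLM(model="some-org/custom-arch-model", trust_remote_code=False)
except Exception as err:  # exact exception type/message vary by version
    print(f"Refused to resolve auto_map without trust_remote_code: {err}")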

How should we harden vLLM beyond the patch?

A single CVE patch is insufficient for a system in vLLM's threat model: a long-running inference process that pulls model artifacts from the public internet. We recommend three layers of defense, captured in the configuration sketch below and explained after it.

# vLLM hardening configuration for production deployment
deployment:
  vllm_version: ">=0.11.1"
  startup_flags:
    trust_remote_code: false
    enforce_eager: false
    disable_log_stats: false
  model_source:
    allowed_registries:
      - "internal-mirror.example-corp.local"
    blocked_registries:
      - "huggingface.co"   # require explicit promotion through mirror
    integrity_check:
      mode: required
      provider: openssf-model-signing
      trusted_signers:
        - "fulcio:meta-platforms@github-actions"
        - "fulcio:nvidia-publishing@github-actions"
  network:
    egress_allowlist: []   # inference workers should not reach external internet
    ingress: ["api-gateway-only"]
  runtime:
    container_image: "ghcr.io/vllm-project/vllm:0.11.1-distroless"
    seccomp_profile: "vllm-restricted.json"
    capabilities_drop: ["ALL"]
    read_only_root: true

First, mirror Hugging Face through an internal registry that you control, and scan model contents for known-bad signatures before promoting them. Second, require model-signing verification (OpenSSF Model Signing v1.0, integrated into NVIDIA NGC since March 2025) so that even if an attacker compromises the upstream Hugging Face account, an unsigned artifact never reaches your inference fleet. Third, eliminate inference-worker egress to the public internet: there is no legitimate reason for an inference process to talk to anything except your API gateway and your model storage. If CVE-2025-66448 had landed against a properly egress-locked vLLM cluster, the attacker's reverse shell would have failed to connect.
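
As one concrete instance of the third layer, a Kubernetes NetworkPolicy along these lines default-denies worker egress and allows only in-cluster model storage. The namespace, labels, and service names are assumptions about your cluster, not prescriptions:

# Egress lockdown sketch for vLLM workers. Namespace and labels are
# illustrative; adjust to your environment. Depending on how model
# storage is addressed, you may also need to allow DNS to kube-dns.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vllm-worker-egress-lockdown
  namespace: inference
spec:
  podSelector:
    matchLabels:
      app: vllm-worker
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: model-storage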

How does this compare to other model-loading RCE classes?

CVE-2025-66448 sits in a small but growing family of bugs where a documented safety flag did not enforce what its name implied. CVE-2025-32434 in PyTorch broke the weights_only=True guarantee for torch.load. CVE-2024-50050 in Meta's llama serving code was the root of the ShadowMQ propagation. The Hugging Face Transformers ecosystem has had its own series of trust_remote_code interpretation issues across 2024 and 2025. The common pattern: a flag added to make the safe path explicit, then a code path discovered where the flag was not propagated correctly. For defenders, the architectural lesson is never to rely on a single safety flag as the boundary. The boundary is the sandbox: a container, a microVM, or a gVisor sandbox with no network egress and no host filesystem access. If your inference workload runs in a properly constrained sandbox, a flag-bypass RCE stays contained to the sandbox rather than spreading to your cluster. The flags are useful belts; the sandbox is the suspenders.

What about the related vLLM vulnerabilities?

CVE-2025-62164, which affected vLLM 0.10.2 and later and was likewise fixed in 0.11.1, targets prompt-embedding handling via torch.load(). CVE-2025-9141 affects the Qwen3-Coder tool-call parser. And the ShadowMQ class of vulnerabilities, disclosed by Oligo Security in November 2025, identified ZeroMQ-and-pickle deserialization patterns shared across vLLM, NVIDIA TensorRT-LLM, Meta Llama, Microsoft Sarathi-Serve, and SGLang, copy-pasted across projects from a single upstream pattern. The common thread is that AI inference frameworks were architected for research velocity and inherit research-grade trust assumptions. Production hardening was bolted on later, and CVE-2025-66448 is the predictable consequence: a documented safety flag (trust_remote_code) that did not actually mean what its name implied.

What is the procurement signal for vLLM going forward?

Two high-severity RCE disclosures in a single year (CVE-2025-62164 mid-year, CVE-2025-66448 in November) put vLLM in an uncomfortable position from a procurement perspective: it remains the dominant open-source inference framework with by far the most active development, but its security cadence is still catching up to its capability cadence. The right enterprise posture is to continue using vLLM (the alternatives have their own issues: TensorRT-LLM was equally affected by ShadowMQ, SGLang was unpatched as of disclosure, and llama.cpp covers a smaller capability surface) while applying defense-in-depth that does not assume vLLM is bug-free. That means egress lockdown, network policy isolation, mandatory model-signing verification, and a CVE-tracking discipline that gives you patched versions within days of disclosure rather than weeks. vLLM 0.11.1 is a known good baseline; pin to that version or later and keep a documented upgrade path ready for the next disclosure, which will arrive.

How Safeguard Helps

Safeguard tracks vLLM (and TensorRT-LLM, SGLang, TGI, Sarathi-Serve) versions in your AIBOM and matches them against CVE-2025-66448, CVE-2025-62164, the ShadowMQ family, and any future disclosures, alerting within minutes of CVE publication. Griffin AI generates patched-configuration diffs against your live deployments, so you know exactly which production clusters need the 0.11.1 upgrade and which can wait. Policy gates block deployments of vLLM versions below the safe baseline and refuse model identifiers from non-mirrored registries, and the OpenSSF Model Signing integration verifies signatures on every model pull. Egress monitoring on inference workers flags any outbound connection that does not match the egress allowlist, catching exploitation attempts even when a CVE is unknown. The result: an AI-framework CVE becomes a tracked patching event rather than an emergency.
