Here's an attack pattern that couldn't have existed two years ago: an attacker asks ChatGPT to recommend a Python package for a specific task. ChatGPT confidently recommends a package that doesn't exist. The attacker registers that package name on PyPI with malicious code. Now, every developer who gets the same recommendation from ChatGPT and runs pip install is compromised.
This isn't theoretical. Researchers at Vulcan Cyber demonstrated this in mid-2023, finding that ChatGPT consistently hallucinated specific package names when asked certain questions, and those names were available for registration on public package registries.
How AI Hallucinations Create Real Packages
Large language models don't look up information in real time. They generate text based on patterns learned during training. When asked "what Python package handles X?", the model might generate a plausible package name that sounds right but doesn't actually exist.
The key insight is that these hallucinations are consistent. Ask the same model the same question multiple times, and it often generates the same fictional package name. This consistency is what makes the attack viable. The attacker doesn't need to guess what name an AI will recommend. They can simply ask and find out.
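To make that concrete, here is a minimal sketch of how one could measure that consistency, assuming access to the OpenAI chat completions client (openai v1+); the model name, prompt, and trial count are illustrative choices, not values from the cited research:

```python
# Minimal sketch: measure how consistently one model recommends the same
# package name for one question. The model name, prompt, and trial count
# are illustrative assumptions, not values from the cited research.
from collections import Counter

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "What Python package should I use to convert Word documents to "
    "Markdown? Reply with only the package name."
)

answers = Counter()
for _ in range(10):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-completion model works here
        messages=[{"role": "user", "content": PROMPT}],
        temperature=1.0,  # keep sampling on so repeated answers are meaningful
    )
    answers[resp.choices[0].message.content.strip().lower()] += 1

# One name dominating across trials is the consistency the attack relies on.
for name, count in answers.most_common():
    print(f"{count}/10  {name}")
```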
The attack flow:
- Query popular LLMs with common developer questions across multiple programming languages
- Identify package names that are hallucinated consistently
- Check whether those names are available on public registries (PyPI, npm, RubyGems); this check is trivially scriptable, as the sketch after this list shows
- Register the available names with packages containing malicious payloads
- Wait for developers following AI recommendations to install them
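The registry-availability check in step three takes only a few lines. A minimal sketch, assuming the public PyPI and npm registry endpoints (both return HTTP 404 for unregistered names); the candidate names are placeholders, not real hallucinated packages:

```python
# Minimal sketch of the registry-availability check in step three. Both
# endpoints are public and return HTTP 404 for unregistered names. The
# candidate names below are placeholders, not real hallucinated packages.
import requests

REGISTRIES = {
    "PyPI": "https://pypi.org/pypi/{name}/json",
    "npm": "https://registry.npmjs.org/{name}",
}

candidates = ["example-hallucinated-pkg", "another-fake-toolkit"]

for name in candidates:
    for registry, url in REGISTRIES.items():
        status = requests.get(url.format(name=name), timeout=10).status_code
        if status == 200:
            state = "TAKEN"
        elif status == 404:
            state = "available for registration"
        else:
            state = f"unclear (HTTP {status})"
        print(f"{registry:>4}  {name}: {state}")
```

Note the dual use: the identical 404 check, run before pip install, tells a defender that an AI-recommended name was hallucinated.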
Why This Attack Is Effective
Trust in AI recommendations. Developers increasingly use LLMs as a first source of information. A recommendation from ChatGPT carries weight, especially for developers new to a language or ecosystem. They may not verify that a package exists or check its download count before installing.
Plausible naming. Because LLMs generate names based on patterns in real packages, the hallucinated names sound legitimate. A package called python-http-client-toolkit sounds like something that should exist. A developer won't necessarily question it.
No verification step. Package managers install packages by name without requiring users to verify provenance. pip install hallucinated-package either finds the package or fails. If an attacker has registered the name, it succeeds, and the developer assumes everything is fine.
Scale. A single attacker can query LLMs with thousands of developer questions, identify hundreds of hallucinated package names, and register them in bulk across multiple registries. The attack scales to target any language ecosystem with a public package registry.
The Research
Several research groups have quantified this risk:
- Vulcan Cyber found that approximately 35% of ChatGPT code generation responses included at least one reference to a package that didn't exist
- Testing across Python and Node.js, researchers identified over 100 consistently hallucinated package names that were available for registration
- Some hallucinated names were remarkably specific, suggesting they might be based on patterns from deprecated packages or internal packages that leaked into training data
The consistency rate is what makes this particularly dangerous. If an LLM hallucinated different names each time, the attack surface would be too diffuse. But the same model giving the same wrong answer to the same question means the attacker has a reliable pipeline.
Variations and Escalations
Targeted hallucination inducement. Attackers can craft questions designed to elicit hallucinated package names for specific domains. "What's the best Python library for parsing DICOM medical images?" might consistently produce a non-existent package name that would be installed by healthcare developers.
Versioned hallucinations. LLMs sometimes hallucinate specific version numbers: "Use package-x version 2.4.1" where the package exists but version 2.4.1 doesn't. An attacker who compromises the package's maintainer account can publish the missing version and target developers who pin the hallucinated number.
Namespace confusion. Different registries have different namespace rules. An LLM might recommend @company/package on npm, hallucinating a scoped package under an organization that doesn't own that scope. An attacker could register the scope and package.
Documentation poisoning. Attackers can create professional-looking documentation sites for hallucinated packages, complete with installation instructions and usage examples. When a developer searches for the package after an AI recommendation, they find an apparently legitimate project.
Defensive Measures
Verify before installing. Before running pip install or npm install on an AI-recommended package, check: Does it exist on the registry? How many downloads does it have? When was it published? Who maintains it? A package with zero downloads published yesterday should raise immediate suspicion.
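Those checks can be scripted as a starting point, using PyPI's public JSON API and the third-party pypistats.org download-statistics service. A minimal sketch (endpoint paths are the public ones at the time of writing; treat the output as input to manual review, not a verdict):

```python
# Pre-install check for an AI-recommended PyPI package: existence, release
# history, first upload date, and recent downloads. pypistats.org is a
# public third-party statistics service, separate from PyPI itself.
import sys

import requests

def vet(name: str) -> None:
    meta = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if meta.status_code == 404:
        print(f"{name}: does NOT exist on PyPI -- likely a hallucination")
        return
    releases = meta.json()["releases"]
    uploads = [
        f["upload_time_iso_8601"]
        for files in releases.values()
        for f in files
    ]
    first = min(uploads) if uploads else "no files uploaded"
    stats = requests.get(
        f"https://pypistats.org/api/packages/{name}/recent", timeout=10
    )
    downloads = stats.json()["data"]["last_month"] if stats.ok else "unknown"
    # Heuristics, not verdicts: a days-old package with one release and a
    # handful of downloads deserves manual review before pip install.
    print(f"{name}: {len(releases)} release(s), first upload {first}, "
          f"{downloads} downloads last month")

if __name__ == "__main__":
    vet(sys.argv[1])
```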
Use lock files and verified dependency lists. Maintain curated lists of approved packages for your organization. New dependencies, whether recommended by AI or discovered elsewhere, should go through a review process.
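A curated list can be enforced with a small gate in CI. A sketch, assuming a one-name-per-line approved-packages.txt alongside a standard requirements.txt (both file names are illustrative; the packaging library's requirement parser is real):

```python
# CI gate: flag any declared dependency that is not on the approved list.
# approved-packages.txt (one name per line) and requirements.txt are
# illustrative file names; adapt them to your project layout.
from pathlib import Path

from packaging.requirements import Requirement  # pip install packaging

approved = {
    line.strip().lower()
    for line in Path("approved-packages.txt").read_text().splitlines()
    if line.strip() and not line.startswith("#")
}

for line in Path("requirements.txt").read_text().splitlines():
    line = line.strip()
    if not line or line.startswith(("#", "-")):
        continue  # skip blanks, comments, and pip options such as -r or -e
    name = Requirement(line).name.lower()
    if name not in approved:
        print(f"NOT APPROVED: {name} -- route through dependency review")
```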
Package registry defenses. Registries should consider reserving commonly hallucinated package names to prevent malicious registration. This is similar to how registries already reserve names that could cause confusion with popular packages.
LLM provider responsibility. AI companies should test their models for consistent package hallucination and implement guardrails. If a model consistently recommends a non-existent package, that's a detectable and fixable behavior.
Organizational policy. Establish clear guidelines that AI-recommended packages must be verified through the same dependency review process as any other new dependency. The convenience of AI should not bypass security controls.
How Safeguard.sh Helps
Safeguard.sh directly addresses the AI hallucination supply chain risk through policy gates that validate every dependency before it enters your pipeline. When a developer adds a package to a project, whether recommended by AI or found through any other means, Safeguard.sh can verify it against known-good registries, check its provenance and publication history, and flag newly published packages with no track record.
Our SBOM management ensures that every dependency in your software is documented and continuously monitored. If a hallucinated package somehow enters your supply chain, Safeguard.sh's continuous vulnerability monitoring will flag suspicious packages based on their metadata and behavior patterns. In a world where AI recommendations are becoming a primary way developers discover packages, Safeguard.sh provides the verification layer that prevents hallucinated recommendations from becoming real compromises.