Research

AI Code Assistant Package Hallucination Study

The Safeguard Research team measured how often AI coding assistants hallucinate non-existent packages, how sticky those hallucinations are, and what defenders should do.

Shadab Khan
Security Engineer
7 min read

Package hallucination from AI coding assistants stopped being a theoretical worry sometime in 2023, when researchers showed that large language models will confidently invent package names that do not exist on public registries. Attackers noticed. The "slopsquatting" pattern, in which someone registers a plausible-but-invented name that an assistant has been observed recommending, is now a recurring fixture in incident write-ups.

The Safeguard Research team wanted to move past anecdote and measure the phenomenon as it exists today. How often does it still happen? Which ecosystems are hardest hit? How consistent are hallucinations across prompts and models? This post is the result.

How did the team design the measurement?

We generated a prompt corpus of around ten thousand realistic developer questions across Python, JavaScript and TypeScript, Ruby, Rust, and Go, covering task framings a developer might actually use: "recommend a library for X", "give me a code snippet that uses Y", "what is the package that does Z", and similar. We ran the corpus against a representative set of current-generation assistants, spanning both general-purpose chat models and code-specialised systems, using default settings and neutral personas.

For every response, we extracted any package name referenced in import statements, install commands, or natural-language recommendations. We then verified each name against the live registry for the corresponding ecosystem. Names that did not resolve were classified as hallucinated and further categorised by proximity to real names, typo distance, and plausibility of meaning.
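As a rough illustration of that verification step, the sketch below extracts install commands and import statements from a response and checks each name against the PyPI or npm registry. The function names and extraction regexes are simplified stand-ins for the fuller pipeline, which also parses natural-language recommendations and covers the other ecosystems.

```python
import re
import requests

# Registry lookup endpoints; a 404 means the name does not resolve.
REGISTRY_URLS = {
    "pypi": "https://pypi.org/pypi/{name}/json",
    "npm": "https://registry.npmjs.org/{name}",
}

# Very rough extraction: install commands and import statements only.
INSTALL_RE = re.compile(r"\b(?:pip|pip3)\s+install\s+([A-Za-z0-9_.\-]+)")
IMPORT_RE = re.compile(r"^\s*(?:from|import)\s+([A-Za-z0-9_]+)", re.MULTILINE)

def extract_candidate_names(response_text: str) -> set[str]:
    """Pull package-like names out of an assistant response."""
    names = set(INSTALL_RE.findall(response_text))
    names |= set(IMPORT_RE.findall(response_text))
    return names

def name_exists(name: str, ecosystem: str = "pypi") -> bool:
    """Return True if the name resolves on the live registry."""
    url = REGISTRY_URLS[ecosystem].format(name=name)
    return requests.get(url, timeout=10).status_code == 200

# Flag any extracted name that does not resolve as a hallucination candidate.
response = "You can use flask-oauth2-client: pip install flask-oauth2-client"
for candidate in extract_candidate_names(response):
    if not name_exists(candidate, "pypi"):
        print(f"unresolved (possible hallucination): {candidate}")
```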

We repeated each prompt at least five times per model to measure consistency. Hallucination is only an attack vector if it is reproducible. A name generated once and never again is a curiosity. A name generated reliably across sessions is a target.
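A minimal way to operationalise that distinction, assuming you already have the per-session sets of unresolved names for a prompt, is to count how many independent sessions each name appears in:

```python
from collections import Counter

def reproducible_names(sessions: list[set[str]], min_sessions: int = 2) -> set[str]:
    """Names seen in at least `min_sessions` independent sessions of the
    same prompt; these are the ones worth treating as targets."""
    counts = Counter(name for session in sessions for name in session)
    return {name for name, count in counts.items() if count >= min_sessions}

# Five repeated sessions of the same prompt, hallucinated names only.
sessions = [
    {"flask-oauth2-client"},
    {"flask-oauth2-client", "redis-cache-utils"},
    {"http-request-helpers"},
    {"flask-oauth2-client"},
    set(),
]
print(reproducible_names(sessions))  # {'flask-oauth2-client'}
```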

How common is hallucination in current assistants?

Aggregate hallucination rates sit in the low single-digit percentages of responses for the best assistants and in the high single digits to low teens for the weakest, with meaningful ecosystem-specific variation.

Python and JavaScript, which dominate assistant training data, had the lowest rates. Ruby and Rust sat in the middle. Go was somewhat idiosyncratic: lower overall rates, but a higher share of hallucinations that looked legitimate, because the Go import-path naming convention makes invented names harder for a human reader to catch.

Compared with the 2023 and 2024 baselines that earlier academic work established, rates have come down noticeably. Assistants are better at saying "I am not sure" and better at citing concrete version numbers, both of which reduce hallucination surface. The problem is far from solved, however, and the economics of slopsquatting only require a small fraction of hallucinations to be reproducible to be profitable for an attacker.

How consistent are the hallucinations across sessions?

This is the uncomfortable part. A measurable share of hallucinated names recurred across sessions, sometimes across models.

In our data, somewhere between 20% and 40% of hallucinated names, depending on model and ecosystem, appeared in at least two independent sessions with the same prompt. A smaller but still notable slice, in the single-digit percentages, appeared across multiple different models, suggesting a shared training-data artifact or a shared inductive bias in how the models compose plausible names.

This is exactly the property that makes slopsquatting an attack rather than a statistical oddity. An attacker does not need to guess what an assistant will say. They can ask, observe, register the reproducible names, and wait.

Which naming patterns are most vulnerable?

Hallucinations clustered around a small number of patterns: compound descriptive names ("http-request-helpers", "flask-oauth2-client", "redis-cache-utils"), company-plus-feature names that resemble real vendor libraries, and plausible language ports of tools from other ecosystems.

Compound descriptive names were the single largest category. Models have clearly internalised the morphology of real package names, and when they need one that does not exist, they produce something that fits the pattern perfectly. A human developer skimming a response from an assistant is unlikely to pause on "a package called flask-oauth2-client", because of course such a package would exist.

Plausible cross-ecosystem ports were also common. A user asking for a Python equivalent of a well-known Rust crate sometimes got a name that sounded like a natural port but that had never been published. These are especially dangerous because the developer's mental model is already primed to trust the name.

What is the real-world exposure today?

We checked how many hallucinated names from our study were currently registered on public registries, and how many of those showed signs of being slopsquat registrations.

Somewhere in the low-to-mid teens of a percent of hallucinated names in our corpus resolved to a package that had been registered on the relevant public registry. Of those, a meaningful minority had metadata signatures consistent with slopsquatting: recent first publish, low download counts, anonymous maintainers, install-time scripts, and network calls to unfamiliar hosts.
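One of those signals, registration recency, can be read straight off the registry. The sketch below pulls the earliest upload time from the PyPI JSON API; download counts, maintainer history, and install-script inspection would need separate data sources, and the threshold shown is illustrative rather than the one we used.

```python
from datetime import datetime, timezone
import requests

def pypi_first_upload(name: str) -> datetime | None:
    """Earliest file upload time across all releases, or None if unknown."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if resp.status_code != 200:
        return None
    releases = resp.json().get("releases", {})
    times = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in releases.values()
        for f in files
    ]
    return min(times) if times else None

def looks_freshly_registered(name: str, max_age_days: int = 30) -> bool:
    """One slopsquat signal among several: a very recent first publish."""
    first = pypi_first_upload(name)
    if first is None:
        return False  # nothing published, or the name does not exist
    age = datetime.now(timezone.utc) - first
    return age.days <= max_age_days
```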

This does not mean those packages were all malicious. Some were legitimate libraries that happened to pick a plausible name after a model learned to invent it. But the overlap is large enough that we treat any install of a recently registered package that a developer learned about from an assistant as a case worth a second look.

What defensive patterns actually work?

Existence verification before install is the single most effective defensive layer, and it is cheap to implement.

First, require that every new direct dependency be looked up against a vetted catalogue before the install succeeds. Your internal proxy or package gateway should refuse to pass through a name on first use until a human or an automated policy has approved it. This blocks the slopsquatting attack regardless of whether the developer learned the name from an assistant, a colleague, or a blog post.
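A minimal sketch of the idea, using a flat approval file to stand in for whatever vetted catalogue your proxy or gateway actually consults:

```python
import subprocess
import sys
from pathlib import Path

# A vetted catalogue: in practice this lives behind your proxy or package
# gateway; the flat file here only keeps the sketch self-contained.
APPROVED_FILE = Path("approved-packages.txt")

def is_approved(name: str) -> bool:
    if not APPROVED_FILE.exists():
        return False
    approved = {line.strip().lower() for line in APPROVED_FILE.read_text().splitlines()}
    return name.lower() in approved

def guarded_install(name: str) -> None:
    """Refuse to install a direct dependency that has not been vetted yet."""
    if not is_approved(name):
        raise SystemExit(
            f"'{name}' is not in the vetted catalogue; "
            "request approval before installing."
        )
    subprocess.run([sys.executable, "-m", "pip", "install", name], check=True)

if __name__ == "__main__":
    guarded_install(sys.argv[1])
```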

Second, apply maturity gates. A package registered in the last seventy-two hours with no prior version history, no maintainer track record, and low download volume should require explicit justification before it enters a production build. The legitimate cost of this gate is small. The adversarial benefit is large.
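Expressed as policy code over metadata your proxy can already supply; the field names and thresholds here are illustrative, not a specific product's schema:

```python
from dataclasses import dataclass

@dataclass
class PackageFacts:
    # Facts a proxy or registry mirror can supply for a candidate package.
    first_publish_age_hours: float
    release_count: int
    weekly_downloads: int

def passes_maturity_gate(facts: PackageFacts) -> bool:
    """Require explicit justification for brand-new, unproven packages."""
    if facts.first_publish_age_hours < 72:
        return False
    if facts.release_count < 2:
        return False
    if facts.weekly_downloads < 100:  # threshold is a policy choice
        return False
    return True

# A package registered yesterday with one release and no real usage is held
# for review; an established library passes without friction.
print(passes_maturity_gate(PackageFacts(24, 1, 3)))          # False
print(passes_maturity_gate(PackageFacts(8760, 40, 250000)))  # True
```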

Third, train assistants and prompts inside your organisation to cite versions and link to the registry. An assistant response that says "install flask-oauth2-client" is easy to act on without thinking. An assistant response that links to a specific registry page prompts the reader to look at it, which is where most hallucinations get caught.

Fourth, log assistant-sourced dependency additions. A lightweight commit convention, an editor plugin, or a pull request template that surfaces which dependencies came from an AI suggestion turns an invisible vector into a reviewable one.
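One lightweight version of this, assuming a commit-trailer convention of your own choosing (the trailer name below is hypothetical, not a standard):

```python
import subprocess

# Convention (illustrative): commits that add an AI-suggested dependency
# carry a trailer such as
#   AI-Suggested-Dependency: flask-oauth2-client
# so reviewers and tooling can find every such addition in one pass.

def ai_suggested_dependency_commits() -> list[str]:
    """List commit hashes whose messages declare an AI-suggested dependency."""
    out = subprocess.run(
        ["git", "log", "--format=%H", "--grep=AI-Suggested-Dependency:"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.split()

if __name__ == "__main__":
    for sha in ai_suggested_dependency_commits():
        print(sha)
```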

Will this get better as models improve?

Partially. We expect aggregate hallucination rates to continue drifting down as retrieval-grounded assistants and registry-aware tool use become standard. The long tail of reproducible hallucinations will be slower to disappear, because some share of it comes from systematic properties of how models compose plausible names, and the attacker economics only require a small reproducible tail to work.

In the meantime, the defensive patterns above do not rely on the model getting better. They close the attack at the install boundary, which is where it has to succeed or fail.

What this means

Package hallucination is no longer a novelty finding. It is a steady, measurable property of how developers now write software, and it has a matching steady, measurable attacker response. Treating it as an install-time supply-chain problem, rather than an AI-alignment problem, puts the responsibility where the control actually is.

The teams that get this right are the ones whose proxy knows more about a package than the developer does, and whose policy runs before the install, not after.

How Safeguard.sh Helps

Safeguard.sh inspects every new dependency against a live catalogue of registered packages, maintainer reputation signals, registration age, and known slopsquat patterns, and blocks or gates first-time installs of risky names before they reach your build. We connect suggestions from AI assistants to the specific registry lookups that would have caught a hallucination, and we expose the evidence in pull request checks rather than in a separate console. Customers use Safeguard.sh to keep the benefits of AI-assisted development while closing the install-time hole that makes slopsquatting work.
