How to Detect Typosquatting in Package Installs

Build a pre-install guard that catches typosquatted npm, PyPI, and RubyGems dependencies using Levenshtein distance, download-count heuristics, and registry APIs.

Nayan Dey
Senior Security Engineer

Typosquats prey on a single wayward keystroke. Install requsts instead of requests and you hand an attacker a Python interpreter on your laptop. The recurring colors.js/color-js confusion and dozens of npm attacks all used names one or two characters away from a popular package. Closely related dependency-confusion attacks, such as the December 2022 PyTorch torchtriton incident, exploit the same blind trust in a package name. This tutorial shows you how to detect typosquatted names at install time using Levenshtein distance against a popularity baseline, query the npm and PyPI registries for publish-age and download-count signals, and wire the check into a pre-commit Git hook and CI job. Prerequisites: Node 20 or Python 3.11, a list of your organization's allowed top-level deps, and 30 minutes.

What signals indicate typosquatting?

Four signals matter: edit distance from a popular name, low download count, very recent first-publish date, and a different publisher than the package being imitated. Any two together is suspicious; three is a near-certain hit.

A package named reqeusts with 18 downloads this month, published 9 days ago, by a maintainer who has no other packages: that is the signature. Compare it against the real requests on PyPI: roughly 900M monthly downloads, first published in 2011, by kennethreitz.
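The "two is suspicious, three is a near-certain hit" rule can be sketched as a small scoring function. The candidate shape, the signal names, and the thresholds below are illustrative assumptions rather than part of any registry API. The sketch also assumes the other three signals only count once a name is close to a popular one.

```javascript
// Illustrative thresholds; tune them against your own data.
const THRESHOLDS = { maxEditDistance: 2, minMonthlyDownloads: 1000, minAgeDays: 30 };

// `candidate` is a hypothetical shape assembled from the checks in this tutorial:
// { editDistance, monthlyDownloads, ageDays, publisher, imitatedPublisher }
function scoreSignals(candidate, t = THRESHOLDS) {
  const nearName = candidate.editDistance > 0 && candidate.editDistance <= t.maxEditDistance;
  // The remaining signals only mean something relative to an imitated package.
  if (!nearName) return { signals: [], verdict: 'ok' };

  const signals = ['near-name'];
  if (candidate.monthlyDownloads < t.minMonthlyDownloads) signals.push('low-downloads');
  if (candidate.ageDays < t.minAgeDays) signals.push('recently-published');
  if (candidate.publisher !== candidate.imitatedPublisher) signals.push('different-publisher');

  // Two signals: ask a human. Three or more: block the install.
  const verdict = signals.length >= 3 ? 'block' : signals.length === 2 ? 'review' : 'ok';
  return { signals, verdict };
}

const result = scoreSignals({
  editDistance: 2,
  monthlyDownloads: 18,
  ageDays: 9,
  publisher: 'nobody-special-42',
  imitatedPublisher: 'kennethreitz',
});
console.log(result.verdict); // block
```

Returning the list of fired signals alongside the verdict makes the finding easier to explain in a hook or CI log.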

How do I compute name similarity?

Use Levenshtein distance with a small threshold (1 or 2) against a list of the top 1000 most-downloaded packages in your ecosystem. Most real typosquats are edit-distance 1 from a well-known name.

npm i fast-levenshtein   # local install so the import in check.mjs resolves
cat > check.mjs <<'EOF'
import lev from 'fast-levenshtein';
// Baseline of popular names; replace with your curated list.
const popular = ['react', 'lodash', 'axios', 'express', 'chalk'];
const input = process.argv.slice(2);
for (const name of input) {
  for (const p of popular) {
    const d = lev.get(name, p);
    // d === 0 is the real package; 1 or 2 edits is a likely typo.
    if (d > 0 && d <= 2) {
      console.log(`SUSPICIOUS: ${name} is ${d} edit(s) from ${p}`);
    }
  }
}
EOF
node check.mjs reqeusts loadash axois
# SUSPICIOUS: loadash is 1 edit(s) from lodash
# SUSPICIOUS: axois is 2 edit(s) from axios
# reqeusts is not flagged because requests is not in this npm-only list
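
Plain Levenshtein distance counts a swap of two adjacent characters (axois for axios) as two edits. Swapped letters are a common typing mistake, so one option is the optimal string alignment variant of Damerau-Levenshtein distance, which counts an adjacent transposition as a single edit. The function below is a dependency-free sketch of that variant, not part of fast-levenshtein. The name osaDistance is made up for this example.

```javascript
// Optimal string alignment distance: insert, delete, substitute,
// or swap two adjacent characters, each at cost 1.
function osaDistance(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d = Array.from({ length: rows }, () => new Array(cols).fill(0));
  for (let i = 0; i < rows; i++) d[i][0] = i; // delete every character of a
  for (let j = 0; j < cols; j++) d[0][j] = j; // insert every character of b

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,        // deletion
        d[i][j - 1] + 1,        // insertion
        d[i - 1][j - 1] + cost  // substitution or match
      );
      // Adjacent transposition, e.g. "oi" vs "io".
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

console.log(osaDistance('axois', 'axios'));       // 1
console.log(osaDistance('reqeusts', 'requests')); // 1
console.log(osaDistance('loadash', 'lodash'));    // 1
```

With this metric a threshold of 1 already catches single swaps, which may reduce the false positives that a threshold of 2 brings in for short names.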

Maintain your popular-packages list from real usage — npm's all-the-package-names is 2M entries and too noisy. A 1000-entry list curated from your package-lock.json across all repos works better.
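
One way to curate that list is to count how many repositories use each package and keep the most widely used names. The sketch below assumes the dependency names have already been extracted, one array per repository. The function name buildBaseline and the input shape are hypothetical.

```javascript
// repoDeps: one array of package names per repository (hypothetical input shape).
// Returns up to `limit` names, most widely used first.
function buildBaseline(repoDeps, limit = 1000) {
  const counts = new Map();
  for (const names of repoDeps) {
    // Count each package once per repository, even if it is listed several times.
    for (const name of new Set(names)) {
      counts.set(name, (counts.get(name) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .sort((x, y) => y[1] - x[1] || x[0].localeCompare(y[0])) // by count, then by name
    .slice(0, limit)
    .map(([name]) => name);
}

const baseline = buildBaseline([
  ['react', 'lodash'],
  ['react', 'axios'],
  ['react', 'lodash', 'lodash'],
]);
console.log(baseline); // [ 'react', 'lodash', 'axios' ]
```

Counting repositories instead of raw occurrences keeps one very large monorepo from dominating the baseline.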

How do I check registry metadata?

Query the npm registry directly for publish age and maintainer info. npm view <pkg> time.created maintainers returns the data in one round-trip.

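npm view requsts time.created 'maintainers[0].name'
# Illustrative output; npm prefixes each value with its field name
# when more than one field is requested:
# time.created = '2022-03-14T07:42:11.893Z'
# maintainers[0].name = 'nobody-special-42'

npm view requests time.created 'maintainers[0].name'
# time.created = '2011-09-14T04:22:58.093Z'
# maintainers[0].name = 'kennethreitz'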

A package published in the last 30 days with a maintainer who has no other packages should block the install. PyPI exposes the same data via its JSON API at https://pypi.org/pypi/<name>/json.
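
For PyPI, the first-publish date has to be derived from the release files. The sketch below takes the already-parsed JSON response and assumes it contains a releases object whose file entries carry an upload_time_iso_8601 field. The function names and the 30-day default are illustrative.

```javascript
// Earliest upload time across all release files, or null when nothing was uploaded.
// Assumes the response shape { releases: { "<version>": [ { upload_time_iso_8601 } ] } }.
function firstPublished(pypiJson) {
  let earliest = null;
  for (const files of Object.values(pypiJson.releases ?? {})) {
    for (const file of files) {
      const t = Date.parse(file.upload_time_iso_8601);
      if (!Number.isNaN(t) && (earliest === null || t < earliest)) earliest = t;
    }
  }
  return earliest === null ? null : new Date(earliest);
}

// True when the project first appeared less than `maxAgeDays` ago.
function isRecentlyPublished(pypiJson, now = new Date(), maxAgeDays = 30) {
  const first = firstPublished(pypiJson);
  if (first === null) return true; // no uploads at all: treat as untrusted
  const ageDays = (now.getTime() - first.getTime()) / 86400000;
  return ageDays < maxAgeDays;
}

// Usage with Node 18+ (global fetch), inside an async function:
//   const res = await fetch('https://pypi.org/pypi/requests/json');
//   console.log(isRecentlyPublished(await res.json()));
```

Passing the current time in as a parameter keeps the check deterministic in tests.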

How do I check download counts?

The npm API at api.npmjs.org/downloads/point/last-month/<pkg> returns monthly download counts. Anything under 1000/month that also sits one or two edits from a popular package name is a red flag.

curl -s https://api.npmjs.org/downloads/point/last-month/reqeusts | jq .
# { "downloads": 42, "start": "2024-01-15", "end": "2024-02-14", "package": "reqeusts" }
curl -s https://api.npmjs.org/downloads/point/last-month/requests | jq .
# { "downloads": 2891754, "start": "2024-01-15", "end": "2024-02-14", "package": "requests" }

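PyPI's download stats live in the public BigQuery table bigquery-public-data.pypi.file_downloads. Queries count against your BigQuery quota, and the free tier is usually sufficient for CI as long as each query filters on a date range and a project name. Cache results locally for 24 hours to stay within quotas.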
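
The 24-hour cache can be sketched as a wrapper around whatever lookup function fetches the count. The helper name withTtlCache is made up for this example. It caches in memory only, so results that must survive between CI runs would also need to be written to a file or a CI cache.

```javascript
// Wrap an async lookup (name -> value) with an in-memory time-to-live cache.
// `now` is injectable so the expiry logic can be tested without waiting.
function withTtlCache(lookup, ttlMs = 24 * 60 * 60 * 1000, now = () => Date.now()) {
  const cache = new Map();
  return async function cached(name) {
    const hit = cache.get(name);
    if (hit && now() - hit.at < ttlMs) return hit.value; // still fresh
    const value = await lookup(name);
    cache.set(name, { value, at: now() });
    return value;
  };
}

// Usage with Node 18+ (global fetch); the lookup below is one possible implementation:
//   const downloads = withTtlCache(async (pkg) => {
//     const res = await fetch(`https://api.npmjs.org/downloads/point/last-month/${pkg}`);
//     return (await res.json()).downloads;
//   });
//   console.log(await downloads('lodash'));
```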

How do I build a pre-install Git hook?

Add a pre-commit hook that runs against any changes to package.json or requirements.txt. Use Husky for npm projects or pre-commit for Python.

npm i -D husky lint-staged
npx husky init
cat > .husky/pre-commit <<'EOF'
npx lint-staged
EOF
cat > .lintstagedrc.json <<'EOF'
{ "package.json": "node ./scripts/check-typosquat.mjs" }
EOF

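The hook gives developers instant feedback at git commit time, before the suspicious dependency reaches teammates or CI. It cannot undo an npm install the developer already ran locally, so treat it as an early warning, not the only gate. Keep the script under 3 seconds or developers will bypass it with --no-verify.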
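
One way to stay inside that time budget is to check only the dependencies a commit adds. The sketch below compares two parsed package.json objects. The function names are hypothetical, and the previous version is assumed to come from something like git show HEAD:package.json.

```javascript
const FIELDS = ['dependencies', 'devDependencies', 'optionalDependencies', 'peerDependencies'];

// Every package name declared in a parsed package.json object.
function declaredDeps(pkg) {
  const names = new Set();
  for (const field of FIELDS) {
    for (const name of Object.keys(pkg[field] ?? {})) names.add(name);
  }
  return names;
}

// Names present in `after` but not in `before`, sorted for stable output.
function addedDeps(before, after) {
  const old = declaredDeps(before);
  return [...declaredDeps(after)].filter((name) => !old.has(name)).sort();
}

const before = { dependencies: { react: '^18.0.0' } };
const after = {
  dependencies: { react: '^18.0.0', loadash: '^1.0.0' },
  devDependencies: { husky: '^9.0.0' },
};
console.log(addedDeps(before, after)); // [ 'husky', 'loadash' ]
```

Only the names returned here need registry lookups in the hook. The full audit can wait for CI.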

How do I wire this into CI?

Run the same checker as a PR status check, but also audit the full lockfile — not just changed lines. Transitive deps can also be typosquats, and a malicious indirect dep is just as dangerous.

name: Typosquat check
on: [pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11
      - uses: actions/setup-node@1a4442cacd436585916779262731d5b162bc6ec7
        with: { node-version: '20' }
      - run: npm ci --ignore-scripts
      - run: node scripts/check-typosquat.mjs --lockfile package-lock.json
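
To audit transitive dependencies, the checker first needs every package name in the lockfile. The sketch below assumes a package-lock.json with lockfileVersion 2 or 3, where the packages object is keyed by install path. Version 1 lockfiles use a nested dependencies tree and are not handled here. The function name lockfileNames is hypothetical.

```javascript
// Collect unique package names from a parsed package-lock.json (lockfileVersion 2 or 3).
// Keys look like "node_modules/lodash" or "node_modules/express/node_modules/debug".
function lockfileNames(lock) {
  const marker = 'node_modules/';
  const names = new Set();
  for (const key of Object.keys(lock.packages ?? {})) {
    const at = key.lastIndexOf(marker);
    if (at === -1) continue; // the root project ("") or a workspace folder
    names.add(key.slice(at + marker.length)); // keeps scopes such as "@babel/core"
  }
  return [...names].sort();
}

const lock = {
  lockfileVersion: 3,
  packages: {
    '': { name: 'my-app' },
    'node_modules/lodash': {},
    'node_modules/@babel/core': {},
    'node_modules/express/node_modules/debug': {},
    'packages/app': {},
  },
};
console.log(lockfileNames(lock)); // [ '@babel/core', 'debug', 'lodash' ]
```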

Fail the build on any SUSPICIOUS finding and require a human to acknowledge it. The acknowledgement should live in a checked-in allowlist file, not as a CI override flag.
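
The allowlist check can be as simple as filtering findings against acknowledged pairs. The sketch below assumes a checked-in JSON file, here imagined as typosquat-allowlist.json, holding entries of the form { name, near, reason }. The file name, the entry shape, and the function name are all hypothetical.

```javascript
// findings:  [{ name, near }]          produced by the checker
// allowlist: [{ name, near, reason }]  parsed from the checked-in allowlist file
function unacknowledged(findings, allowlist) {
  // Key on both names so an acknowledgement covers one specific pair only.
  const key = (entry) => JSON.stringify([entry.name, entry.near]);
  const acknowledged = new Set(allowlist.map(key));
  return findings.filter((finding) => !acknowledged.has(key(finding)));
}

const findings = [
  { name: 'preact', near: 'react' },
  { name: 'loadash', near: 'lodash' },
];
const allowlist = [
  { name: 'preact', near: 'react', reason: 'legitimate package, reviewed by security' },
];
const remaining = unacknowledged(findings, allowlist);
console.log(remaining); // [ { name: 'loadash', near: 'lodash' } ]
// In CI, exit non-zero when anything remains:
//   if (remaining.length > 0) process.exit(1);
```

Because the allowlist is an ordinary file, every acknowledgement goes through code review and leaves a history in version control.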

How Safeguard Helps

Safeguard runs typosquatting detection on every SBOM it ingests, comparing each package name against a curated popularity corpus and enriching it with real-time threat intelligence from Socket, OSS Review Toolkit, and the OpenSSF malicious packages feed. Griffin AI ranks findings by reachability — a typosquat in a dev-only path is lower priority than one your production code actually imports. The platform keeps a historical registry of compromised-package incidents so a name flagged three months ago in another tenant auto-blocks in yours. Policy gates can reject a PR that introduces a low-reputation package whose name is two edits from a top-100 dep, and alerts route to Slack within seconds. Catch the typo before it ships.
