
Python Package Typosquatting in 2024: Scale, Tactics, and Defenses

Typosquatting on PyPI reached industrial scale in 2024, with attackers using automated tooling to register thousands of malicious package names targeting common misspellings of popular libraries.

James
Threat Intelligence Lead

Typosquatting, the practice of registering package names that are deliberate misspellings of popular libraries, has been a known threat in package registries for years. But in 2024, the scale and sophistication of typosquatting attacks on the Python Package Index (PyPI) reached a new level.

Security firms Phylum, Checkmarx, and Socket collectively identified and reported thousands of malicious PyPI packages in the first eight months of 2024. The attacks are no longer opportunistic one-offs. They are systematic, automated campaigns designed to compromise as many developers as possible.

The Scale of the Problem

Between January and August 2024, security researchers identified over 10,000 malicious or suspicious packages on PyPI. Not all of these were typosquats (some used dependency confusion, starjacking, or other techniques), but typosquatting remained the dominant attack vector.

The targeting follows a clear pattern. The most typosquatted packages are the most popular ones:

  • requests (typosquats: requets, reqeusts, request, requsts)
  • beautifulsoup4 (typosquats: beautifulsoup, beutifulsoup4, beautfulisoup4)
  • numpy (typosquats: numpi, numppy, nunpy)
  • pandas (typosquats: panda, pandsa, pandass)
  • flask (typosquats: flaask, flaskk, flsk)
  • django (typosquats: djano, dajngo, djnago)

Attackers use automated tools to generate permutations of popular package names (character swaps, omissions, additions, keyboard-proximity substitutions) and register them in bulk. One campaign might register 50-100 typosquats in a single day.
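
Defenders can use the same permutation logic to build a watchlist of names worth monitoring. The following is a minimal sketch under stated assumptions, not any attacker's or vendor's actual tooling: typo_candidates and the approximate QWERTY map KEYBOARD_NEIGHBORS are hypothetical names introduced here, and only single-edit variants are generated.

```python
from __future__ import annotations

# Approximate QWERTY adjacency (an assumption; adjust for other layouts).
KEYBOARD_NEIGHBORS = {
    "q": "wa", "w": "qeas", "e": "wrsd", "r": "etdf", "t": "ryfg",
    "y": "tugh", "u": "yihj", "i": "uojk", "o": "ipkl", "p": "ol",
    "a": "qwsz", "s": "awedxz", "d": "serfcx", "f": "drtgvc",
    "g": "ftyhbv", "h": "gyujnb", "j": "huikmn", "k": "jiolm",
    "l": "kop", "z": "asx", "x": "zsdc", "c": "xdfv", "v": "cfgb",
    "b": "vghn", "n": "bhjm", "m": "njk",
}


def typo_candidates(name: str) -> set[str]:
    """Return single-edit misspellings of a package name."""
    candidates: set[str] = set()
    for i, char in enumerate(name):
        # Omission: drop one character.
        candidates.add(name[:i] + name[i + 1:])
        # Addition: double one character.
        candidates.add(name[:i] + char + name[i:])
        # Keyboard-proximity substitution.
        for near in KEYBOARD_NEIGHBORS.get(char, ""):
            candidates.add(name[:i] + near + name[i + 1:])
    for i in range(len(name) - 1):
        # Swap: transpose two adjacent characters.
        candidates.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Swapping identical neighbors reproduces the original name; drop it.
    candidates.discard(name)
    candidates.discard("")
    return candidates
```

Several of the typosquats listed above (requets, reqeusts, flaask, dajngo) fall out of these four rules. Others, such as numpi, would need additional rules like phonetic substitution.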

Evolution of Techniques

The typosquatting attacks of 2024 are more sophisticated than earlier efforts in several ways:

Functional facades: Many malicious packages now include actual functionality that mimics the legitimate package. Instead of an empty package with only a malicious setup.py, the attacker copies the legitimate package's code and inserts malicious payloads. A developer who installs the typosquat and tests basic functionality may not notice anything wrong.

Delayed activation: Some packages wait before executing their malicious payload. The setup.py install hook runs clean code, but the payload activates after a delay, when the package is imported in a specific context, or when certain environment variables are present (indicating a CI/CD environment with valuable secrets).

Obfuscation layers: Malicious code is increasingly obfuscated using:

  • Base64 encoding with multiple layers.
  • Character code concatenation (chr(104)+chr(116)+chr(116)+chr(112) to spell "http").
  • Code stored in non-obvious locations like font files, image metadata, or test fixtures.
  • Dynamic code loading from external URLs, so the package itself appears clean during static analysis.
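
Some of these patterns can be flagged without running the package. The sketch below parses source text with Python's ast module and reports dynamic execution, common decoder calls, and chr() concatenation. The names DYNAMIC_EXEC, DECODERS, and obfuscation_indicators are hypothetical, the rules are deliberately simple, and legitimate code will sometimes match, so hits are prompts for manual review rather than verdicts.

```python
from __future__ import annotations

import ast

# Builtins that execute dynamically built code.
DYNAMIC_EXEC = {"exec", "eval", "__import__"}
# Decoder function names often used to unwrap encoded payloads.
DECODERS = {"b64decode", "b32decode", "b16decode", "a85decode"}


def _call_name(node: ast.AST) -> str | None:
    """Return the simple name of a called function, or None."""
    if isinstance(node, ast.Call):
        if isinstance(node.func, ast.Name):
            return node.func.id
        if isinstance(node.func, ast.Attribute):
            return node.func.attr
    return None


def obfuscation_indicators(source: str) -> list[str]:
    """List heuristic findings; the source is parsed, never executed."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        name = _call_name(node)
        is_bare_call = isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        if is_bare_call and name in DYNAMIC_EXEC:
            findings.append(f"line {node.lineno}: dynamic execution via {name}()")
        elif name in DECODERS:
            findings.append(f"line {node.lineno}: decoder call {name}()")
        elif (
            isinstance(node, ast.BinOp)
            and isinstance(node.op, ast.Add)
            and _call_name(node.left) == "chr"
            and _call_name(node.right) == "chr"
        ):
            findings.append(f"line {node.lineno}: chr() concatenation")
    return findings
```

A static check like this cannot see payloads fetched from external URLs at runtime or hidden in non-Python files, which is exactly why attackers use those locations.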

Environment-aware payloads: Advanced typosquats check their execution environment before deploying payloads. They look for:

  • CI/CD environment variables (GITHUB_TOKEN, CI, JENKINS_URL) to target build systems.
  • Cloud provider metadata endpoints to identify cloud environments.
  • Docker/container indicators to target containerized deployments.
  • Cryptocurrency wallet indicators to target individual developers.
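
A reviewer can look for the same probes in a package's source. The sketch below collects string constants that match the CI/CD variable names listed above, plus 169.254.169.254, the link-local address commonly used for cloud instance metadata (that address is an addition here, not from the campaigns described). CI_VARIABLES, METADATA_HOST, and environment_probes are hypothetical names, and many legitimate packages check variables such as CI, so treat matches as context for review.

```python
from __future__ import annotations

import ast

# Variable names taken from the list above.
CI_VARIABLES = {"GITHUB_TOKEN", "CI", "JENKINS_URL"}
# Link-local metadata address used by major cloud providers (assumption:
# a payload probing cloud metadata references it as a literal string).
METADATA_HOST = "169.254.169.254"


def environment_probes(source: str) -> set[str]:
    """Return the probe indicators that appear as string literals."""
    hits: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value in CI_VARIABLES:
                hits.add(node.value)
            elif METADATA_HOST in node.value:
                hits.add(METADATA_HOST)
    return hits
```

Literals assembled at runtime (for example through the obfuscation techniques described earlier) will not be caught by this kind of check.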

Common Payload Types

The malicious payloads delivered through PyPI typosquatting in 2024 generally fall into these categories:

Information stealers: The most common payload type. These collect environment variables, SSH keys, AWS credentials, browser cookies, and cryptocurrency wallet data, and exfiltrate them to attacker-controlled servers. Many use legitimate services (Discord webhooks, Telegram bots, Pastebin) for data exfiltration.

Reverse shells: Packages that establish persistent remote access to the compromised system. These are particularly dangerous in CI/CD environments where the reverse shell inherits the permissions of the build process.

Cryptocurrency miners: Some packages deploy cryptocurrency miners that consume compute resources. These are often targeted at CI/CD environments where compute usage may not be closely monitored.

Dependency chain injectors: The most subtle variant. The malicious package modifies the victim's Python environment to inject additional malicious dependencies, which persist even if the original typosquat is uninstalled.

PyPI's Response

PyPI has taken several steps to combat typosquatting:

Malware detection scanning: PyPI has implemented automated scanning that checks newly uploaded packages for known malicious patterns, including suspicious install hooks, obfuscated code, and network calls during installation.

Name similarity checks: New package registrations are now checked against a database of popular package names, with warnings or blocks for names that are too similar.
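
PyPI's actual similarity rules are not described here, so the following is only an illustrative sketch of how such a check could work. It uses optimal string alignment distance, a common edit distance that counts insertions, deletions, substitutions, and adjacent transpositions. POPULAR, similar_popular_names, and the threshold of one edit are assumptions made for the example.

```python
from __future__ import annotations

import re

# Stand-in for a real popularity database (assumption).
POPULAR = ("requests", "beautifulsoup4", "numpy", "pandas", "flask", "django")


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between two strings."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "eu" versus "ue".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


def similar_popular_names(candidate: str, popular=POPULAR, max_distance: int = 1) -> list[str]:
    """Return popular names within max_distance edits of the candidate."""
    # Normalize the way package indexes compare names (PEP 503).
    normalized = re.sub(r"[-_.]+", "-", candidate).lower()
    return [
        name for name in popular
        if 0 < osa_distance(normalized, name) <= max_distance
    ]
```

A threshold of one edit keeps false positives low but misses multi-edit variants such as beautfulisoup4; raising it trades precision for coverage.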

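Trusted Publishers: The Trusted Publishers program links package uploads to specific repositories and CI/CD workflows (such as GitHub Actions) through short-lived OpenID Connect tokens, so maintainers no longer need to store long-lived API tokens that can leak. This reduces, but does not eliminate, the risk from compromised credentials: an attacker who controls the linked repository or workflow can still publish. Packages published through Trusted Publishers can also carry verifiable build provenance.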

User reporting improvements: PyPI has improved its malware reporting process, enabling faster takedown of confirmed malicious packages.

Despite these improvements, the cat-and-mouse dynamic continues. Attackers adapt their techniques to evade detection, and the volume of new package submissions makes comprehensive human review impossible.

Protecting Your Organization

Preventive measures:

  1. Use a private PyPI mirror or proxy (Artifactory, Nexus, or devpi) that allows you to curate and approve packages before they are available to developers.
  2. Enable hash-checking mode in pip (pip install --require-hashes) to ensure that only packages with verified integrity are installed.
  3. Use lockfiles (pip-compile from pip-tools, or poetry.lock) to pin exact package versions and hashes. Lockfiles prevent accidental installation of unexpected packages.
  4. Configure pip to use only your private registry: Set --index-url to your private registry and avoid --extra-index-url, which makes pip consult additional indexes such as public PyPI alongside it.
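
Hash-checking mode only helps if every requirement is actually pinned. As a rough pre-flight check, the sketch below reports requirement lines that lack an exact version or a --hash option. The function name unpinned_requirements is hypothetical, and the parsing is intentionally simplified (it handles comments, option lines, and backslash continuations, not the full requirements file format).

```python
from __future__ import annotations

import re


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement specifiers missing '==' or '--hash='."""
    # Join backslash-continued lines so each requirement is one line.
    joined = text.replace("\\\n", " ")
    problems = []
    for raw in joined.splitlines():
        # Strip comments that start the line or follow whitespace.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        # Skip blanks and option lines such as --index-url or -r.
        if not line or line.startswith("-"):
            continue
        if "==" not in line or "--hash=" not in line:
            problems.append(line.split()[0])
    return problems
```

For example, a file containing the lines flask>=3.0 and numpy==2.0.0 with no hashes would return both entries, while a line of the form name==version --hash=sha256:... passes.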

Detective measures:

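  1. Scan dependencies with security tools like pip-audit, Safety, or commercial SCA solutions. Note that pip-audit and Safety primarily report known vulnerabilities; coverage of known malicious packages varies by tool and data source.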
  2. Review new dependencies before installation. Check publication date, maintainer history, and download counts. A package published last week with 5 downloads is more suspicious than one published 3 years ago with 5 million downloads.
  3. Monitor for unexpected packages in your environments. Regular audits of installed packages across development and CI/CD environments can catch typosquats that slipped through.
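
The review heuristic in item 2 can be written down as a simple rule. This is a sketch only: review_flags is a hypothetical helper, the thresholds MIN_AGE and MIN_DOWNLOADS are arbitrary assumptions rather than recommended values, and the caller is expected to supply the publication date and download count from whatever source it trusts.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Illustrative thresholds (assumptions, tune for your organization).
MIN_AGE = timedelta(days=30)
MIN_DOWNLOADS = 1000


def review_flags(
    first_published: datetime,
    downloads: int,
    now: datetime | None = None,
) -> list[str]:
    """Return reasons a dependency deserves a closer look."""
    # Both datetimes must be timezone-aware for the subtraction to work.
    now = now or datetime.now(timezone.utc)
    flags = []
    if now - first_published < MIN_AGE:
        flags.append(f"published less than {MIN_AGE.days} days ago")
    if downloads < MIN_DOWNLOADS:
        flags.append(f"fewer than {MIN_DOWNLOADS} downloads")
    return flags
```

Neither signal is proof of malice on its own, since every legitimate package starts out new and rarely downloaded; the flags are meant to trigger a manual review.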

Organizational measures:

  1. Establish an approved package list for your organization. New packages must go through a review and approval process before being added.
  2. Educate developers about typosquatting risks. A five-minute awareness session during team meetings can help reduce accidental installations.
  3. Use IDE extensions that check package names against known typosquats as developers write import statements.
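
An approved package list is straightforward to enforce in an audit script. The sketch below compares installed distribution names with an allowlist using the standard library's importlib.metadata. The helpers normalize, unapproved, and installed_names are hypothetical, and where the approved list lives (a file, a service) is left to the reader.

```python
from __future__ import annotations

import re
from importlib.metadata import distributions
from typing import Iterable


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unapproved(installed: Iterable[str], approved: Iterable[str]) -> list[str]:
    """Return installed package names that are not on the approved list."""
    allowed = {normalize(name) for name in approved}
    return sorted({normalize(name) for name in installed} - allowed)


def installed_names() -> list[str]:
    """List distribution names visible in the current environment."""
    names = (dist.metadata["Name"] for dist in distributions())
    # Skip distributions with missing or broken metadata.
    return [name for name in names if name]
```

Running unapproved(installed_names(), approved_list) in development and CI/CD environments gives a list of packages to investigate, including any typosquat that was installed by mistake.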

How Safeguard.sh Helps

Safeguard.sh provides automated defense against typosquatting and other package supply chain attacks.

  • Package reputation scoring evaluates every dependency in your SBOM based on publication age, maintainer trust, download patterns, and behavioral analysis, flagging packages that match typosquatting profiles.
  • Continuous SBOM monitoring detects when new, unreviewed packages enter your dependency tree, whether through direct installation or transitive dependency changes.
  • Policy enforcement lets you define rules that block installation of packages below a minimum trust score, packages published within a specified time window, or packages from unverified publishers.
  • Alerting and remediation provides immediate notification when a dependency is identified as malicious, with guidance on safe removal and replacement.

Typosquatting is a simple attack that exploits human fallibility. Automated tooling turns a moment of inattention into a detectable, blockable event instead of a compromise.
