Python PyPI Malware Campaigns in 2021

Malicious packages on PyPI surged in 2021, targeting developers with credential stealers, backdoors, and data exfiltration. Here's what the campaigns look like and how to defend against them.

Yukti Singhal
Security Researcher

Python's Package Registry Under Siege

PyPI (Python Package Index) hosts over 350,000 packages and serves billions of downloads per year. It's the backbone of the Python ecosystem — every pip install resolves against PyPI by default. And in 2021, it became a primary target for malware distribution.

The attacks aren't sophisticated. They don't need to be. PyPI's publishing model is open: anyone can create an account and upload a package with any name that isn't already taken. There's no mandatory code review, no identity verification, and — until recently — no mandatory two-factor authentication. The barrier to distributing malware through PyPI is essentially zero.

Major Campaigns in 2021

The aws-login0tool Campaign (June 2021)

Sonatype researchers discovered a package called aws-login0tool — a typosquat targeting developers looking for AWS authentication utilities. The package contained code that harvested AWS credentials, environment variables, and SSH keys, sending them to an attacker-controlled server via DNS exfiltration.

The package was designed to look legitimate: it included a plausible description, version history, and documentation. It targeted a specific niche — DevOps engineers working with AWS — where credential theft has maximum value.

The Discord Token Stealers (July-August 2021)

Multiple coordinated campaigns targeted Discord users through PyPI packages. Researchers at JFrog identified over 30 malicious packages designed to steal Discord authentication tokens. The packages had names like discord-tools, discord-exploit, and discordhelp — targeting developers building Discord bots.

The payloads scanned local browsers and Discord's local storage for authentication tokens, then exfiltrated them via webhook to Discord channels controlled by the attackers. Stolen tokens allowed full account takeover without credentials.

The noblesse Campaign (September 2021)

JFrog discovered a family of malicious packages including noblesse, genesisbot, aryi, suffer, and noblesse2. These packages contained code that:

  • Stole Discord tokens
  • Harvested browser credentials from Chrome
  • Took screenshots
  • Exfiltrated data via Discord webhooks

The packages targeted a specific demographic: young developers building Discord bots, who were less likely to scrutinize package contents.

setup.py Abuse (Throughout 2021)

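Many Python packages ship a setup.py script. When pip installs a package from a source distribution (sdist), it runs that script to build the package, so anything in it executes with the permissions of the user running pip. This is the primary install-time mechanism for PyPI malware. A package installed from a prebuilt wheel does not run setup.py, although its code still runs as soon as it is imported.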

Researchers documented campaigns where setup.py was used to:

  • Download and execute secondary payloads from remote servers
  • Install persistence mechanisms (cron jobs, scheduled tasks)
  • Modify system configurations
  • Exfiltrate credentials from ~/.aws/credentials, ~/.ssh/, and browser storage
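
Because setup.py is ordinary Python, it can be inspected without being executed. The sketch below parses a setup.py source string with the standard ast module and flags imports and calls that are unusual for a build script. The watchlists and the flag_setup_script name are illustrative assumptions, not an established detection rule set; a clean result does not prove a package is safe, and obfuscated payloads can evade a check this simple.

```python
import ast
from typing import List

# Hypothetical watchlists; tune them for your own environment.
SUSPICIOUS_MODULES = {"socket", "subprocess", "urllib", "requests", "base64", "ctypes"}
SUSPICIOUS_CALLS = {"exec", "eval", "compile", "__import__"}


def flag_setup_script(source: str) -> List[str]:
    """Return findings for a setup.py source string without executing it."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Collect the module names referenced by import statements.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            root = name.split(".")[0]
            if root in SUSPICIOUS_MODULES:
                findings.append(f"line {node.lineno}: imports {root}")
        # Flag direct calls to dynamic-execution builtins.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in SUSPICIOUS_CALLS
        ):
            findings.append(f"line {node.lineno}: calls {node.func.id}()")
    return findings
```

Running it over an unpacked sdist before installation gives a quick list of lines worth reading by hand.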

Dependency Confusion on PyPI

Following Alex Birsan's February 2021 research, multiple attackers attempted dependency confusion on PyPI. The --extra-index-url configuration in pip makes Python particularly vulnerable — pip checks both the public PyPI and the private index, preferring the higher version number regardless of source.

Organizations using internal package names without claiming them on PyPI were targeted throughout the year.
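
The selection behavior behind dependency confusion can be modeled in a few lines. This is a deliberately simplified sketch, not pip's actual resolver: it assumes plain dotted numeric versions, and the index labels and version numbers are made up for illustration. It shows only the core problem, which is that candidates from all indexes are pooled and the highest version wins.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain dotted numeric version such as '1.2.0' (no pre-releases)."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(indexes: Dict[str, List[str]]) -> Tuple[str, str]:
    """Pool candidates from every index and return (index, version) of the highest."""
    candidates = [
        (parse_version(version), index, version)
        for index, versions in indexes.items()
        for version in versions
    ]
    # The source index plays no role in the comparison; only the version does.
    _, index, version = max(candidates)
    return index, version


# An attacker publishes version 99.0.0 of an internal name on the public index.
print(pick_candidate({"private": ["1.0.0", "1.2.0"], "pypi": ["99.0.0"]}))
```

With both indexes configured, the public 99.0.0 release is selected over the private 1.2.0 release.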

Why PyPI Is Particularly Vulnerable

setup.py Is Arbitrary Code Execution

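Unlike npm's package.json, which is data with optional install scripts, Python's setup.py is a full Python script that pip executes when it builds a package from a source distribution. PEP 517/518 and pyproject.toml move build configuration into a declarative file, but the build backend still runs code shipped in the package when building from source. Prebuilt wheels are installed without running package code, and pip's --only-binary :all: option refuses source distributions entirely, but that option is not the default. In practice, every pip install of an unknown package should be treated as arbitrary code execution.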

No Namespace Scoping

npm has scoped packages (@company/package). PyPI doesn't have an equivalent. Every package name is global, first-come-first-served. This makes typosquatting and dependency confusion easier — there's no way to claim a namespace for your organization.

Limited Malware Detection

PyPI has minimal automated malware detection. Packages are published and available immediately. Detection relies primarily on community reporting and external security researchers scanning published packages.

In 2021, PyPI began partnering with organizations to improve automated malware detection, but the coverage remains limited compared to the volume of uploads.

Slow Removal

Even after a malicious package is reported, removal takes time. The PyPI team is small, and reports require investigation. During the gap between reporting and removal, the package continues to be available for download.

Defensive Practices

Use Virtual Environments

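Always install packages in virtual environments, never system-wide, and never run pip with sudo. This keeps a malicious package out of the system Python and stops its install-time code from running as root. A virtual environment is not a sandbox, though: code in the package still runs with your user's permissions and can read files such as ~/.aws/credentials. To evaluate an untrusted package, use a container or a disposable virtual machine.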
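
One way to enforce this habit in project tooling is to check for an active virtual environment before installing anything. The sketch below relies on the standard sys.prefix and sys.base_prefix attributes, which differ inside an environment created by the venv module. The in_virtualenv name is an assumption for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment and
    # sys.base_prefix points at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if not in_virtualenv():
    print("Warning: no virtual environment is active; refusing to install.")
```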

Verify Before Installing

Before pip install unfamiliar-package:

  • Check the package on pypi.org: download count, publication date, author, repository link
  • Low download counts and recent publication dates are warning signs
  • Verify the repository link exists and contains the actual source code
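
Part of this checklist can be automated against PyPI's JSON API at https://pypi.org/pypi/NAME/json. The sketch below is a starting point under stated assumptions: the 30-day threshold and the function names are arbitrary choices, and the check covers only publication date and the presence of a project link. It does not look at download counts, which would need a separate data source, and it does not confirm that a linked repository contains the published code.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def fetch_metadata(name: str) -> dict:
    """Fetch project metadata from PyPI's JSON API (requires network access)."""
    with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json", timeout=10) as resp:
        return json.load(resp)


def warning_signs(meta: dict, now: Optional[datetime] = None, min_age_days: int = 30) -> List[str]:
    """Return warning strings for a parsed PyPI JSON API response."""
    now = now or datetime.now(timezone.utc)
    warnings = []
    info = meta.get("info", {})
    if not (info.get("project_urls") or info.get("home_page")):
        warnings.append("no repository or homepage link")
    # Upload timestamps end in 'Z'; rewrite it so fromisoformat accepts it
    # on older Python versions.
    uploads = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in meta.get("releases", {}).values()
        for f in files
    ]
    if not uploads:
        warnings.append("no released files")
    elif now - min(uploads) < timedelta(days=min_age_days):
        warnings.append(f"first upload is less than {min_age_days} days old")
    return warnings
```

A typical use is warning_signs(fetch_metadata("unfamiliar-package")), with any returned warning treated as a prompt for manual review rather than a verdict.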

Use --index-url, Not --extra-index-url

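For private packages, configure pip with --index-url pointing to your private registry, and have that registry proxy or mirror the public packages you need. Don't use --extra-index-url, which adds the private registry alongside PyPI without replacing it, so pip can still pick a public package that shares a name with one of yours.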
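
A small check in CI can catch the risky setting before it ships. The sketch below scans the text of a requirements file or pip configuration file and reports lines that configure an extra index, in either the command-line form (--extra-index-url) or the config-file form (extra-index-url = ...). The function name and the sample hostname are hypothetical.

```python
from typing import List


def find_extra_index_lines(text: str) -> List[int]:
    """Return 1-based line numbers that configure an extra package index."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Drop comments, then look at how the remaining line begins.
        stripped = line.split("#", 1)[0].strip().lower()
        if stripped.startswith(("--extra-index-url", "extra-index-url")):
            hits.append(number)
    return hits


sample = """--index-url https://pkgs.example.internal/simple
--extra-index-url https://pypi.org/simple
requests==2.26.0
"""
print(find_extra_index_lines(sample))
```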

Pin Dependencies

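Pin exact versions. Running pip freeze > requirements.txt and committing the result is a start. Better yet, use pip-tools or poetry to generate lockfiles that include dependency hashes, and install with pip install --require-hashes -r requirements.txt so that pip rejects any file whose hash doesn't match: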

# requirements.txt with hashes
requests==2.26.0 \
    --hash=sha256:b8aa58f8cf793ffd8782d3d8cb19e66ef36f7aba4353831010519d7422db5288
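
The comparison that hash pinning relies on can be sketched with the standard hashlib module. This is an illustration of the idea only; matches_pinned_hash is a hypothetical helper, and pip performs its own verification when hashes are present in the requirements file.

```python
import hashlib


def matches_pinned_hash(path: str, pinned: str) -> bool:
    """Compare a file's SHA-256 digest with a pin of the form 'sha256:<hex>'."""
    algorithm, _, expected = pinned.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()
```

If an attacker replaces a release file, or a different file is served under the same name and version, the digest changes and the check fails.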

Scan Dependencies

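Use pip-audit or safety to check installed packages against databases of known vulnerabilities. These tools flag packages with published advisories; they do not detect previously unreported malware. bandit is a different kind of tool: it statically analyzes Python source code for insecure patterns, so it is useful for your own code or for manually reviewing a dependency's source, not for auditing a dependency list.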

Monitor for Typosquats

If your organization uses internal Python packages, claim those names on PyPI as placeholders. Monitor PyPI for packages with names similar to your internal packages.
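
Name monitoring can start with a simple similarity comparison. The sketch below uses difflib from the standard library to compare internal names against a list of published names, after normalizing both the way PEP 503 normalizes project names. The 0.85 cutoff, the function name, and the sample package names are assumptions for illustration; obtaining the list of published names is left out. Exact matches are skipped on the assumption that they are your own placeholders, so verify that separately.

```python
import difflib
import re
from typing import Dict, Iterable, List


def normalize(name: str) -> str:
    """PEP 503 normalization: lowercase, runs of '-', '_', '.' become one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def similar_names(
    internal: Iterable[str], published: Iterable[str], cutoff: float = 0.85
) -> Dict[str, List[str]]:
    """Map each internal name to published names that look confusingly similar."""
    published_norm = {normalize(p): p for p in published}
    report = {}
    for name in internal:
        target = normalize(name)
        matches = difflib.get_close_matches(
            target, list(published_norm), n=10, cutoff=cutoff
        )
        # Exact matches are excluded; they are the same project name.
        hits = [published_norm[m] for m in matches if m != target]
        if hits:
            report[name] = hits
    return report


print(similar_names(["acme-billing"], ["acme-biling", "requests", "acme-billing0"]))
```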

How Safeguard.sh Helps

Safeguard.sh monitors PyPI and other Python package sources for malicious packages, typosquats, and dependency confusion attempts targeting your organization. The platform maintains a continuously-updated database of known-malicious Python packages and cross-references your project dependencies against it in real time.

When a new malicious package is identified on PyPI, Safeguard.sh checks whether any of your projects depend on it — directly or transitively — and alerts you immediately. For dependency confusion risks, the platform compares your internal package names against public PyPI to identify potential squatting vulnerabilities before attackers exploit them.

The platform also integrates into your Python CI/CD pipeline, scanning installed packages at build time for known malware, unexpected behaviors, and security vulnerabilities. This build-time gate prevents malicious packages from entering your deployment pipeline, regardless of how they arrived in your dependency tree.
