When you install a package -- npm install, pip install, gem install -- you expect the files to land in the project's node_modules directory, virtual environment, or gem directory. What if they land somewhere else entirely?
Path traversal in package archives is a well-known vulnerability class that remains surprisingly common. A malicious package can contain files with names like ../../../etc/cron.d/backdoor or ../../.bashrc that, when extracted, write to locations outside the intended installation directory.
How Path Traversal Works in Packages
Most package formats are archive files (ZIP, tar.gz, or custom formats) that contain a directory structure. The package manager extracts this archive to a specific location. The vulnerability occurs when the archive contains entries with relative path components (..) that escape the extraction directory.
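To make the escape concrete, here is a minimal Python sketch of what a naive extractor effectively computes when it joins an entry name onto its destination. The function name naive_target and the destination path are illustrative assumptions, not taken from any real package manager.

```python
import posixpath


def naive_target(dest_dir: str, entry_name: str) -> str:
    """Join an archive entry name onto the destination with no validation."""
    # normpath collapses ".." segments lexically; with no symlinks involved,
    # this is the location a write to the joined path would reach.
    return posixpath.normpath(posixpath.join(dest_dir, entry_name))


print(naive_target("/app/node_modules/pkg", "lib/index.js"))
# /app/node_modules/pkg/lib/index.js
print(naive_target("/app/node_modules/pkg", "../../../etc/cron.d/backdoor"))
# /etc/cron.d/backdoor
```

Three `..` components are enough to climb out of a destination that is three levels deep; an attacker who does not know the depth can simply add more, because extra `..` components at the filesystem root have no effect.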
The Zip Slip vulnerability (disclosed by Snyk in 2018) affected hundreds of libraries across Java, JavaScript, Ruby, .NET, and Go that extracted ZIP files without checking for path traversal. The vulnerability was in the extraction logic, not the ZIP format itself -- the archive is valid, it just contains unexpected paths.
The Attack Chain
- Attacker creates a package that is functionally legitimate (to pass review and automated checks)
- The package archive contains files with traversal paths
- When installed, these files are written to attacker-chosen locations
- The written files could be cron jobs, SSH authorized keys, shell profile modifications, or overwritten application code
Real-World Examples
npm tar module (CVE-2021-32804). The tar npm package (used internally by npm itself) did not fully sanitize absolute paths. It stripped the path root from an entry name only once, so a name with repeated leading roots such as ////home/user/.bashrc was still absolute after stripping. This allowed a crafted tarball to write files outside the extraction directory.
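Prefix-stripping bugs of this kind are easy to reproduce. The following is a minimal Python sketch of the general pattern, not node-tar's actual code: both function names are hypothetical, and the crafted name is an assumed example of an entry with repeated leading path roots.

```python
def strip_root_once(name: str) -> str:
    """Flawed sanitizer: removes only a single leading slash."""
    return name[1:] if name.startswith("/") else name


def strip_root_fully(name: str) -> str:
    """Removes every leading slash, so the result is always relative."""
    return name.lstrip("/")


crafted = "////home/user/.bashrc"
print(strip_root_once(crafted))   # ///home/user/.bashrc (still absolute)
print(strip_root_fully(crafted))  # home/user/.bashrc
```

Even the second version only handles leading slashes. It does nothing about `..` components or symlinks, which is why sanitizing the name is not a substitute for checking the resolved path.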
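Python tarfile module. Python's tarfile.extractall() was historically vulnerable to path traversal (CVE-2007-4559). Python 3.12 added extraction filters that reject entries escaping the destination, but in 3.12 and 3.13 they must be requested explicitly (for example filter='data'); the safe filter becomes the default in Python 3.14. The zipfile module has sanitized member names in extractall() for much longer, but older Python versions and custom extraction code that builds paths by hand remain vulnerable.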
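For tar archives, the standard-library protection is the filter argument to tarfile's extractall(). The sketch below assumes Python 3.12 or later (or an older release that has received the backported filters, which is why it checks for tarfile.data_filter first). The helper names build_tar and extract_with_data_filter are hypothetical.

```python
import io
import tarfile
import tempfile


def build_tar(entry_name: str, payload: bytes) -> bytes:
    """Build an in-memory tar archive with one regular file (test fixture)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=entry_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def extract_with_data_filter(archive: bytes, dest: str) -> bool:
    """Return True if extraction succeeded, False if the filter rejected it."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        try:
            # The "data" filter refuses entries that would land outside dest.
            tar.extractall(dest, filter="data")
        except tarfile.FilterError:
            return False
    return True


# Only run the demo where extraction filters are available.
if hasattr(tarfile, "data_filter"):
    with tempfile.TemporaryDirectory() as dest:
        print(extract_with_data_filter(build_tar("pkg/readme.txt", b"ok"), dest))  # True
        print(extract_with_data_filter(build_tar("../evil.txt", b"x"), dest))      # False
```

Passing the filter explicitly keeps behavior the same across interpreter versions instead of depending on which default a given Python release uses.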
Ruby gem extraction. RubyGems has had multiple path traversal vulnerabilities in gem extraction, where a crafted .gem file could write files to arbitrary locations.
Why This Is Harder to Fix Than It Sounds
Simply checking for .. in file paths is insufficient. Attackers use various encodings and normalizations to bypass naive checks:
- Encoded sequences: %2e%2e%2f or ..%5c on Windows
- Unicode normalization: Some filesystems normalize Unicode characters, allowing bypass through visually similar characters
- Symlink chains: Create a symlink that points outside the directory, then write through the symlink in a subsequent file
- Absolute paths: Some archive formats allow absolute paths (/etc/cron.d/backdoor) that bypass relative path checks
- Windows-specific: Mixed path separators (..\..\..) or UNC paths (\\server\share\path)
A correct implementation must resolve the final path after all normalization and verify that it falls within the intended extraction directory. The check must happen after path resolution, not before.
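The resolve-then-verify order can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete extractor: resolve_inside is a hypothetical helper, and it checks the name exactly as the extractor would pass it to the operating system, so any decoding must happen before the call.

```python
import os


def resolve_inside(base_dir: str, entry_name: str) -> str:
    """Return the resolved target path, or raise ValueError if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." and follows symlinks that already exist on disk.
    # os.path.join discards base when entry_name is absolute, so absolute
    # entry names resolve to themselves and fail the containment check below.
    target = os.path.realpath(os.path.join(base, entry_name))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"entry escapes extraction directory: {entry_name!r}")
    return target
```

Because symlinks created by earlier entries change what later names resolve to, a check like this has to run for each entry at the moment it is written, not once up front over the list of names. Comparing with os.path.commonpath rather than a string prefix test also avoids accepting a sibling directory such as /srv/app-evil when the base is /srv/app.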
Package Manager Defenses
npm now validates file paths during extraction and rejects packages containing traversal paths. However, package install scripts (preinstall, install, postinstall) run arbitrary code and are not constrained by this validation.
pip relies on the Python zipfile and tarfile modules for extraction. Recent versions include path safety checks, but source distributions that run a setup.py or other custom build code at install time can bypass these protections.
Maven extracts dependencies to the local repository and does not typically extract file archives during build. However, Maven plugins that extract archives (like maven-dependency-plugin with unpack) may be vulnerable.
Go modules download source code and verify it against the checksum database. The go tool does not extract arbitrary archives during module installation, making it less susceptible to this class of attack.
Defensive Measures
Use current package manager versions. Newer versions of npm, pip, and gem have added path traversal protections. Keeping your package manager updated is a basic defense.
Validate custom extraction code. If your build process extracts archives (ZIP, tar, or other formats) from dependencies, ensure the extraction code checks for path traversal. Use library functions that include built-in safety checks.
Sandbox package installation. Run package installation in a sandboxed environment (container or restricted user) where writing to sensitive system locations is prevented by OS-level permissions.
Monitor filesystem changes during installation. Tools that track filesystem modifications during package installation can detect unexpected writes to locations outside the installation directory.
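A simple form of this monitoring is a before-and-after snapshot of the directories you care about. The Python sketch below is a rough illustration, assuming you choose the watched directories yourself; snapshot and changed_paths are hypothetical helpers, and the approach reports created or modified files but not deletions.

```python
import os


def snapshot(roots: list[str]) -> dict[str, tuple[int, int]]:
    """Map every file under the given roots to (size, mtime in nanoseconds)."""
    state = {}
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for fname in files:
                path = os.path.join(dirpath, fname)
                try:
                    st = os.lstat(path)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                state[path] = (st.st_size, st.st_mtime_ns)
    return state


def changed_paths(before: dict, after: dict) -> list[str]:
    """Paths that were created or modified between two snapshots."""
    return sorted(p for p, meta in after.items() if before.get(p) != meta)
```

In use, you would take one snapshot of the sensitive locations (for example a home directory or a cron directory) before running the install, take another afterwards, and treat any path returned by changed_paths as a write that needs an explanation.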
Review installation scripts. Package post-install scripts can write to arbitrary locations without using archive path traversal. Review the installation scripts of new dependencies before adding them to your project.
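For npm packages, part of this review can be automated by reading the scripts section of package.json. The Python sketch below lists the install-time lifecycle scripts a manifest declares; install_hooks is a hypothetical helper and the sample manifest is invented for illustration.

```python
import json

# Lifecycle scripts that npm runs when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_text: str) -> dict[str, str]:
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


sample = """
{
  "name": "example-package",
  "scripts": {
    "test": "node test.js",
    "postinstall": "node scripts/setup.js"
  }
}
"""
print(install_hooks(sample))  # {'postinstall': 'node scripts/setup.js'}
```

An empty result means no install-time hook is declared in that manifest; a non-empty one tells you which script files to read before adding the dependency.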
How Safeguard.sh Helps
Safeguard.sh analyzes packages for known vulnerabilities including path traversal risks. Our platform monitors package registries for malicious packages that exploit archive extraction flaws, and our SBOM generation tracks the full dependency tree so you know exactly which packages are being installed and what they contain. When path traversal vulnerabilities are discovered in package managers or extraction libraries, Safeguard.sh alerts you immediately.