Software Supply Chain Security

Malware Analysis Techniques for Suspicious npm Packages

When an npm package looks suspicious, you need a systematic approach to determine whether it is malicious. These analysis techniques separate noise from genuine threats.

Yukti Singhal
Security Researcher

The npm registry hosts over two million packages. Security researchers discover hundreds of malicious packages every month. Some are obvious -- a package named crossenv mimicking the legitimate cross-env with a postinstall script that exfiltrates environment variables. Others are subtle -- a package that functions normally for months before a maintainer account is compromised and a malicious update is pushed.

When a dependency triggers a security alert or an internal review flags something unusual, you need a systematic analysis process. Gut feelings are not sufficient. You need evidence.

Static Analysis: Reading the Code

The first step is reading the code. Download the package tarball (npm pack package-name) rather than installing it. Installation triggers lifecycle scripts, which is exactly what malicious packages exploit. The tarball gives you the code without executing anything.

Start with package.json. The scripts section is the most common malware vector. Look for preinstall, install, postinstall, preuninstall, and uninstall scripts that execute code (npm v7 and later no longer run the uninstall hooks, but their presence is still worth noting). Legitimate packages use lifecycle scripts for building native addons or running setup tasks. Malicious packages use them to execute payloads at install time, before the developer even imports the package.

A lifecycle script that runs node setup.js is not inherently suspicious. But setup.js that decodes a base64 string and evaluates it, or makes an HTTP request to a non-standard domain, is a red flag. Follow the execution chain from the lifecycle script to every file it touches.
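As a minimal sketch of that first look, the lifecycle scripts can be read straight out of the tarball that npm pack produces, without extracting or executing anything. The function name and hook list are illustrative, and the sketch assumes the conventional layout in which the manifest sits at package/package.json inside the archive.

```python
import json
import tarfile

# Hook names taken from the paragraph above; extend the tuple as needed.
LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall", "preuninstall", "uninstall")


def lifecycle_scripts(tarball_path: str) -> dict[str, str]:
    """Return the lifecycle scripts declared in a packed npm tarball.

    The archive is read in place: nothing is extracted and nothing runs.
    """
    with tarfile.open(tarball_path, mode="r:gz") as archive:
        # Assumption: the manifest lives at package/package.json.
        member = archive.extractfile("package/package.json")
        if member is None:
            raise ValueError("package/package.json is not a regular file")
        manifest = json.load(member)
    scripts = manifest.get("scripts", {})
    return {name: cmd for name, cmd in scripts.items() if name in LIFECYCLE_HOOKS}
```

Each command the function returns is a starting point for following the execution chain by hand; the sketch does not resolve the files those commands reference.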

Obfuscation is a strong signal. Legitimate npm packages rarely have a reason to obfuscate their code. Minification (removing whitespace and shortening variable names) is common for published libraries, but actual obfuscation -- encoding strings, inserting dead code, restructuring control flow -- indicates an intent to hide behavior. Formatters such as js-beautify or prettier restore a readable layout to minified code, although they cannot recover the original variable names. Reversing obfuscation requires more specialized tools such as de4js, or manual analysis.
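A rough way to triage many files for obfuscation is to count a few textual indicators. The sketch below is a heuristic under stated assumptions, not a detector: the regexes, the 80-character run length, and the 4.0-bit entropy cutoff are illustrative choices, and the _0x identifier pattern reflects the naming style that common JavaScript obfuscators tend to emit.

```python
import math
import re
from collections import Counter

LONG_BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{80,}={0,2}")
HEX_ESCAPE = re.compile(r"\\x[0-9a-fA-F]{2}")
HEX_IDENTIFIER = re.compile(r"\b_0x[0-9a-fA-F]{4,}\b")


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character; 0.0 for empty input."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def obfuscation_signals(source: str, min_entropy: float = 4.0) -> dict[str, int]:
    """Count crude obfuscation indicators in JavaScript source text."""
    blobs = [
        run for run in LONG_BASE64_RUN.findall(source)
        # Repetitive runs (padding, separators) have low entropy and are skipped.
        if shannon_entropy(run) >= min_entropy
    ]
    return {
        "encoded_blobs": len(blobs),
        "hex_escapes": len(HEX_ESCAPE.findall(source)),
        "hex_identifiers": len(HEX_IDENTIFIER.findall(source)),
    }
```

Nonzero counts only prioritize a file for manual review; bundled assets and embedded source maps can trigger the same indicators.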

Look for these specific patterns in the code:

Environment variable access. process.env access beyond expected configuration (like NODE_ENV) suggests data harvesting. Malware commonly reads AWS_ACCESS_KEY_ID, NPM_TOKEN, GITHUB_TOKEN, and similar credential variables.

Network requests to unusual destinations. HTTP calls to IP addresses, recently registered domains, or known paste sites (pastebin, requestbin) indicate exfiltration or command-and-control communication.

File system access outside the package directory. Reading ~/.ssh/id_rsa, ~/.npmrc, ~/.aws/credentials, or similar credential files is a clear malicious indicator.

Dynamic code execution. Watch for eval(), Function(), vm.runInNewContext(), and similar calls that evaluate strings constructed at runtime. This is the most common technique for hiding malicious payloads.

Child process spawning. Watch for child_process.exec(), child_process.spawn(), or execSync() calls that run shell commands, especially commands constructed from strings rather than static arguments.
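The five categories above can be turned into a first-pass scanner. The following is a sketch built on deliberately simple regexes; the category labels, the pattern table, and the scan_source name are all illustrative, and a regex match is only a pointer to a line worth reading, not proof of malice.

```python
import re

# One intentionally simple regex per category from the list above.
SUSPICIOUS_PATTERNS = {
    "credential environment variable": re.compile(
        r"\b(?:AWS_ACCESS_KEY_ID|NPM_TOKEN|GITHUB_TOKEN)\b"
    ),
    "suspicious network destination": re.compile(
        r"https?://\d{1,3}(?:\.\d{1,3}){3}|pastebin|requestbin", re.IGNORECASE
    ),
    "credential file access": re.compile(r"\.ssh/id_rsa|\.npmrc|\.aws/credentials"),
    "dynamic code execution": re.compile(
        r"\beval\s*\(|\bFunction\s*\(|vm\.runInNewContext\s*\("
    ),
    "child process": re.compile(r"child_process|\bexecSync\s*\("),
}


def scan_source(source: str) -> list[tuple[int, str, str]]:
    """Return (line_number, category, line_text) for every pattern hit."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for category, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, category, line.strip()))
    return hits
```

Because the scan is line based, it misses strings assembled across lines or decoded at runtime, which is why the obfuscation check comes first.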

Metadata Analysis

Package metadata provides context that code analysis alone does not.

Publication history. Check when the package was published and how many versions exist. A package published once with no updates is suspicious if it has significant download numbers. Use npm view package-name time to see the publication timeline.
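To work with the timeline programmatically, the same data can be requested as JSON (npm view package-name time --json) and summarized. The sketch below assumes the usual shape of that output: a mapping from version strings to ISO 8601 timestamps, plus created and modified entries. The function names and summary fields are illustrative.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def publication_summary(time_field: dict[str, str]) -> dict:
    """Summarize a version-to-timestamp mapping from the registry."""
    versions = {
        name: parse_timestamp(stamp)
        for name, stamp in time_field.items()
        if name not in ("created", "modified")
    }
    if not versions:
        raise ValueError("no published versions in time field")
    ordered = sorted(versions.items(), key=lambda item: item[1])
    (first_name, first_time), (last_name, last_time) = ordered[0], ordered[-1]
    return {
        "version_count": len(ordered),
        "first_version": first_name,
        "latest_version": last_name,
        "days_active": (last_time - first_time).days,
    }
```

A version_count of 1 combined with high download numbers is the case the paragraph above singles out for closer inspection.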

Maintainer history. Check whether the maintainer recently took over the package. Use npm view package-name maintainers and compare the result with the package's git history. A maintainer change followed by a version bump is a common attack pattern, as in the 2018 event-stream incident.

Download patterns. Sudden download spikes on a previously unknown package may indicate that the package is being pulled in as a transitive dependency of a more popular package that was recently compromised. Use npm's download count API to check trends.
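Once daily counts are in hand, a spike check is a few lines. This sketch assumes the response shape of the registry's per-day download range endpoint, a "downloads" list of objects that each carry a "downloads" count; the seven-day window and the factor of five are arbitrary starting values, not recommendations.

```python
from statistics import median


def counts_from_range_response(payload: dict) -> list[int]:
    """Pull daily counts out of a per-day downloads response (assumed shape)."""
    return [entry["downloads"] for entry in payload["downloads"]]


def detect_spike(daily_counts: list[int], window: int = 7, factor: float = 5.0) -> bool:
    """True when the recent window's mean exceeds factor times the earlier median."""
    if len(daily_counts) <= window:
        raise ValueError("need more history than the comparison window")
    baseline = median(daily_counts[:-window])
    recent_mean = sum(daily_counts[-window:]) / window
    # max(..., 1) stops a zero-download baseline from flagging any activity at all.
    return recent_mean > factor * max(baseline, 1)
```

Using the median for the baseline keeps a single noisy day in the earlier history from masking a real change in trend.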

README quality. Malicious packages often have minimal, copied, or nonsensical READMEs. Legitimate packages typically have documentation proportional to their functionality. A package claiming to be a comprehensive utility library with a two-line README warrants scrutiny.

Repository link verification. Check that the repository URL in package.json actually exists and contains the published code. Malicious packages sometimes list a legitimate project's repository to appear trustworthy while the actual published code differs entirely.
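One way to check that the published code matches the linked repository is to hash both trees and compare. The sketch below assumes two local directories: the extracted tarball contents and a checkout of the repository at the matching tag. Both function names are illustrative. Packages that publish build output will legitimately differ from their source tree, so treat the result as a list of files to read rather than a verdict.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path relative to root onto its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_trees(published: Path, repository: Path) -> dict[str, list[str]]:
    """Report files that exist only in the published package or differ in content."""
    pub, repo = file_digests(published), file_digests(repository)
    return {
        "only_in_published": sorted(set(pub) - set(repo)),
        "content_differs": sorted(
            name for name in pub.keys() & repo.keys() if pub[name] != repo[name]
        ),
    }
```

Files that appear only in the published package and are referenced from a lifecycle script deserve the first look.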

Dynamic Analysis: Running in Isolation

When static analysis is inconclusive, run the package in an isolated environment and observe its behavior. Never run suspicious packages on your development machine or any system with access to production resources.

Use a disposable virtual machine or container. Configure network monitoring to capture all outbound connections. Set up file system monitoring to detect reads and writes outside the package directory.

Install the package with npm install --ignore-scripts first to get the code without triggering lifecycle scripts. Then manually execute the lifecycle scripts while monitoring.

Tools that can help at this stage include:

Socket.dev provides automated analysis that checks for many of the patterns described above. It runs packages in sandboxes and analyzes their behavior.

npm-audit-resolver helps triage audit results and track decisions about flagged packages.

Sandworm monitors npm package behavior at runtime, detecting file system access, network requests, and child process spawning.

Record the DNS queries the package makes. Malware frequently uses DNS for command-and-control or exfiltration (DNS tunneling). A package that resolves unusual domains during installation or import is suspicious.
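Assuming the monitoring setup can export the hostnames that were queried during the run, a short filter separates expected lookups from ones that need explanation. The allowlist, the 40-character label threshold, and the function names below are hypothetical starting points; the capture itself is left to whatever network monitoring the sandbox uses.

```python
def is_allowed(name: str, allowed_suffixes: tuple[str, ...]) -> bool:
    """True when name equals an allowed domain or is a subdomain of one."""
    return any(name == suffix or name.endswith("." + suffix) for suffix in allowed_suffixes)


def flag_dns_queries(
    queries: list[str],
    allowed_suffixes: tuple[str, ...] = ("npmjs.org",),
    max_label_length: int = 40,
) -> list[tuple[str, str]]:
    """Return (hostname, reason) for each distinct query outside the allowlist."""
    flagged = []
    seen = set()
    for raw in queries:
        name = raw.strip().rstrip(".").lower()
        if not name or name in seen:
            continue
        seen.add(name)
        if is_allowed(name, allowed_suffixes):
            continue
        # Very long labels are a common sign of data encoded into hostnames.
        longest = max(len(label) for label in name.split("."))
        if longest > max_label_length:
            flagged.append((name, "long label, possible tunneling"))
        else:
            flagged.append((name, "not on allowlist"))
    return flagged
```

Anything the function returns still needs a human judgment; a content delivery network used by a legitimate native addon download would show up here too.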

Behavioral Pattern Recognition

After analyzing enough malicious packages, patterns emerge that speed up future analysis.

The build script dropper. A postinstall script downloads and executes a secondary payload. The package code itself appears benign because the malicious code is fetched at install time rather than shipped in the tarball. Detection: check for network requests in lifecycle scripts.

The delayed activation. The package works normally but includes a timer or condition that activates malicious behavior after a delay (days or weeks). This defeats analysis that only observes behavior during installation. Detection: look for setTimeout with large values, date comparisons, or external configuration checks.

The dependency chain bomb. The package itself is clean but depends on a malicious sub-package. The malicious code exists several levels deep in the dependency tree. Detection: analyze the entire dependency tree, not just the top-level package.

The protestware variant. The maintainer adds code that executes only under certain conditions (specific locales, IP ranges, or environment configurations). The code might delete files, display messages, or modify data. Detection: look for conditional execution based on locale, timezone, or IP address.
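Some of these detections can be partly automated. As one example, the sketch below looks for the delayed-activation pattern by finding setTimeout and setInterval calls whose delay is a numeric literal or a product of literals of at least one hour. The names and the threshold are illustrative, and the argument splitter does not understand brackets inside strings or comments, so it is a triage aid rather than a parser.

```python
import re
from math import prod

TIMER_CALL = re.compile(r"\bset(?:Timeout|Interval)\s*\(")
NUMERIC_PRODUCT = re.compile(r"^\s*\d[\d_]*(?:\s*\*\s*\d[\d_]*)*\s*$")


def call_arguments(source: str, open_paren: int) -> list[str]:
    """Split the call whose "(" is at open_paren into top-level arguments."""
    depth, start, args = 0, open_paren + 1, []
    for index in range(open_paren, len(source)):
        char = source[index]
        if char in "([{":
            depth += 1
        elif char in ")]}":
            depth -= 1
            if depth == 0:
                args.append(source[start:index])
                return args
        elif char == "," and depth == 1:
            args.append(source[start:index])
            start = index + 1
    return args  # unbalanced input: return what was collected


def long_timers(source: str, threshold_ms: int = 60 * 60 * 1000) -> list[tuple[int, int]]:
    """Return (line_number, delay_ms) for timers with a long literal delay."""
    findings = []
    for match in TIMER_CALL.finditer(source):
        args = call_arguments(source, match.end() - 1)
        if len(args) < 2 or not NUMERIC_PRODUCT.match(args[1]):
            continue
        delay = prod(int(part.replace("_", "")) for part in args[1].split("*"))
        if delay >= threshold_ms:
            line = source.count("\n", 0, match.start()) + 1
            findings.append((line, delay))
    return findings
```

Delays computed from variables, dates, or remote configuration are skipped by design, so date comparisons and external configuration checks still need a manual read.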

Building an Analysis Workflow

Systematize your analysis process. When a package is flagged:

  1. Download without installing. Extract the tarball.
  2. Check metadata: publication date, maintainer history, download trends.
  3. Read package.json scripts section. Follow the execution chain.
  4. Scan for obfuscation, dynamic evaluation, and network/filesystem patterns.
  5. If inconclusive, run in an isolated environment with monitoring.
  6. Document your findings with specific evidence.
  7. Report confirmed malicious packages to npm security.

Automation helps at scale. Configure your CI pipeline to flag packages that match suspicious patterns before they enter your dependency tree. This shifts analysis left -- catching threats before they are installed rather than after.
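A small CI gate along these lines can be sketched against package-lock.json. The example relies on the "packages" map and the "hasInstallScript" flag that npm writes in lockfileVersion 2 and 3 files; the function names and the idea of an approved set maintained by the team are assumptions for illustration.

```python
import json


def packages_with_install_scripts(lockfile_text: str) -> list[str]:
    """List lockfile entries that npm marked as having install scripts."""
    lock = json.loads(lockfile_text)
    return sorted(
        path
        for path, meta in lock.get("packages", {}).items()
        # The empty key is the root project itself, not a dependency.
        if path and meta.get("hasInstallScript")
    )


def unapproved_install_scripts(lockfile_text: str, approved: set[str]) -> list[str]:
    """Return install-script packages that nobody has reviewed and approved yet."""
    return [
        path
        for path in packages_with_install_scripts(lockfile_text)
        if path not in approved
    ]
```

A pipeline step could read the lockfile, call unapproved_install_scripts with the reviewed set, and fail the build when the returned list is not empty, which routes every new install-time script through the manual workflow above.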

How Safeguard.sh Helps

Safeguard.sh automates many of these analysis steps at the point of dependency resolution. It evaluates packages against behavioral signatures, metadata anomalies, and known malicious patterns before they enter your project. Its continuous monitoring catches compromised updates to existing dependencies -- the scenario where a trusted package turns malicious after a maintainer account takeover. For security teams that cannot manually analyze every dependency update, Safeguard.sh provides the automated first pass that identifies the packages requiring human investigation.
