Timelines are the spine of a good incident report. Without one, you have a pile of facts that no executive, auditor, or customer can turn into a decision. I have spent more hours than I care to admit reconstructing exactly when a malicious dependency entered a codebase, when it started shipping, and when the last customer pulled it. The craft of timeline reconstruction is unglamorous, but it is what separates a report that closes the incident from a report that reopens it six months later when someone asks a question nobody thought to ask.
The Three Clocks You Care About
Every dependency compromise has at least three clocks ticking in parallel. The registry clock is when the malicious version was published and available for consumption. The ingestion clock is when your pipelines first pulled it. The distribution clock is when you first shipped it to customers. The gap between these clocks is the difference between a near miss and a customer-impacting event.
I build the timeline by anchoring the registry clock first because it is the most objective. For npm, the publish timestamp is in the packument:
# .time maps each version string to its publish timestamp; drop the
# "created" and "modified" bookkeeping keys so only versions remain
curl -s https://registry.npmjs.org/left-pad | \
jq '.time | del(.created, .modified) | to_entries | map({version: .key, published: .value})'
For PyPI, the JSON API at https://pypi.org/pypi/requests/json gives upload times per file. For Maven Central, search.maven.org exposes timestamps per artifact. These timestamps are authoritative because they come from the publisher's infrastructure, not yours.
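The PyPI query has the same shape. A sketch against the requests package from the URL above; each release maps to a list of uploaded files, each carrying its own upload timestamp:
curl -s https://pypi.org/pypi/requests/json | \
jq '.releases | to_entries | map({version: .key, uploaded: [.value[].upload_time_iso_8601]})'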
Rebuilding the Ingestion Clock
The ingestion clock is harder because it lives in a dozen places: CI logs, developer workstations, cache proxies, lockfile commits. The single most useful source is your lockfile history. A lockfile tells you exactly which version was resolved at a given commit, and git blame tells you when that resolution changed.
# pickaxe (-S) restricts output to commits where the number of
# occurrences of the package name changed, i.e. the commits where
# the resolution was added, removed, or altered
git log --all --patch -S '@scope/compromised' -- package-lock.json
If you use a pull-through cache (Artifactory, Sonatype Nexus, or Verdaccio), the cache's first-fetch timestamp for the compromised version is gold. I pull these logs at the start of every investigation because they answer the question "when did this thing first touch our environment?" in a single line:
# from= is epoch milliseconds; scope the search to the remote repo
# that proxies the public registry
curl -u admin:$ART_TOKEN \
"https://artifactory.internal/api/search/creation?from=1710000000000&repos=npm-remote" | \
jq '.results[] | select(.uri | contains("compromised"))'
A common mistake here is to assume the lockfile commit date equals the ingestion date. It does not. The ingestion happens the first time any runner actually pulls the version, which might be hours or days after the commit. Anchor to the cache, not the repo.
The Distribution Clock
The distribution clock is where the stakes get real. If your build produced a container image on Tuesday and that image went to production on Wednesday, customers were exposed starting Wednesday. To reconstruct this clock, you need the artifact promotion record.
For container builds, docker manifest inspect gives you the config digest, and the config blob carries the build timestamp; a sketch of reading it follows the deployment example below. For traditional deploys, your CD system (ArgoCD, Spinnaker, Harness) has a deployment history API. Pull it for every environment:
argocd app history my-service -o json | \
jq '.[] | {revision, deployedAt, source}'
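For the container side, one way to read the build timestamp without walking the manifest by hand is sketched below, using crane from go-containerregistry; the image reference is a placeholder:
# crane fetches the image config blob; its .created field is the
# timestamp the builder recorded
crane config registry.internal/my-service:1.8.0 | jq -r '.created'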
Cross-reference each deployment's source commit against the lockfile history you already built. The first deployment whose source commit contains the compromised lockfile entry is your distribution start. The last deployment before the fix is your distribution end.
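A hypothetical sketch of that cross-reference, run inside a clone of the source repo and assuming an npm lockfile in the v2+ format (entries keyed under .packages); the version string matches the example timeline below:
# for each deployed revision, test whether the lockfile at that
# commit resolved the compromised version
for rev in $(argocd app history my-service -o json | jq -r '.[].revision'); do
  if git show "$rev:package-lock.json" | \
     jq -e '.packages["node_modules/@scope/compromised"].version == "4.2.1"' >/dev/null; then
    echo "$rev deployed the compromised version"
  fi
done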
Putting the Clocks Together
With the three clocks anchored, I build a single table. Columns are timestamp (UTC, always), clock (registry, ingestion, distribution, plus detection and containment once the response starts), event, and source. A typical reconstruction might look like:
- 2024-07-20 14:22 UTC, registry, malicious v4.2.1 published, npm packument
- 2024-07-20 15:07 UTC, ingestion, first pull to Artifactory, Artifactory creation API
- 2024-07-20 15:31 UTC, ingestion, first CI build includes version, Jenkins console log
- 2024-07-21 09:14 UTC, distribution, first prod deploy with compromised lib, ArgoCD history
- 2024-07-22 11:48 UTC, detection, Safeguard flags package, alert-id 9f3c
- 2024-07-22 12:03 UTC, containment, registry pin added, git commit abc123
- 2024-07-22 13:45 UTC, distribution, last affected prod replica drained, ArgoCD rollout
Every row has a source so that a reviewer can trace the claim back to a log line. This is the structure that survives audit. The first time a lawyer asks "how do you know this happened at 15:07," you will be glad you wrote it down.
Dealing with Gaps and Drift
Real timelines have gaps. Clocks drift. CI runners in different regions can log in different timezones. I have seen timelines off by an hour because someone forgot that their build farm was still on PST while the registry logs in UTC. Normalize everything to UTC at collection time, not at analysis time.
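Assuming GNU date, the normalization is a one-liner worth baking into every collection script; the input timestamp is illustrative:
# parse a PST-stamped log line and re-emit it in UTC
TZ=UTC date -d '2024-07-20 07:31 PST' '+%Y-%m-%d %H:%M UTC'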
Gaps usually fall into two categories. Either you do not have the log (the CI ran on an ephemeral runner whose logs expired), or the log exists but is ambiguous (the cache does not record which team member triggered the fetch). Write the gaps into the report as gaps. Nothing destroys credibility faster than a timeline that pretends to know things it does not.
For ephemeral runners, the fix is pre-incident: ship CI logs to long-term storage so that the next investigation has data. One line would have saved me a week of forensic guesswork on one engagement:
aws logs put-retention-policy --log-group-name /ci/ephemeral --retention-in-days 365
Communicating the Timeline
A reconstructed timeline is useless if nobody reads it. I present timelines as a single narrative paragraph at the top of the incident report, followed by the table, followed by a section per clock with the detailed evidence. Executives read the paragraph, engineers read the table, and auditors read the evidence section. The same document serves all three audiences.
Include a "what we don't know" section. Customers and regulators forgive uncertainty if you name it. They do not forgive false certainty that falls apart under questioning.
How Safeguard Helps
Safeguard reconstructs dependency timelines automatically by combining registry metadata, lockfile history, and artifact ingestion telemetry into a single correlated view. When you open a finding, the platform shows you the exact commit that introduced the dependency, the first build that consumed it, and every environment where it still runs — without you having to stitch together jq and grep across five systems. For incidents, Safeguard exports the timeline as a signed JSON artifact that plugs directly into your incident report, so the clocks you need are already aligned to UTC and sourced to their origin logs. The result is that the hardest part of post-incident reporting becomes the easiest.