
Zero-Day Discovery in Your Dependency Graph

Most zero-days that hurt enterprises in 2026 live three or four hops deep in the dependency graph. Here is what it takes to actually find them there.

Nayan Dey
Senior Security Engineer
7 min read

The dependency graph that ships with a typical enterprise web application is no longer something a human can read. A medium-sized Node service I audited last quarter resolved 1,427 packages, of which the team had directly required 38. The remaining 1,389 arrived as transitive dependencies, four to seven levels deep, each carrying its own assumptions about input validation, encoding, and trust. The application's own first-party code was about 80,000 lines. The dependency surface around it was somewhere north of 7 million.
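
You do not need an analysis platform to see the shape of this in your own repository. Here is a minimal sketch that counts direct versus transitive packages and the deepest hop, assuming the JSON tree shape that `npm ls --all --json` emits; the names and key format are otherwise mine.

```typescript
// Rough sketch: measure how deep your resolved dependency tree goes.
// Usage (assumes npm 7+): npm ls --all --json | npx tsx depth.ts
import { stdin } from "node:process";

interface TreeNode {
  version?: string;
  dependencies?: Record<string, TreeNode>;
}

function walk(node: TreeNode, depth: number, seen: Map<string, number>) {
  for (const [name, child] of Object.entries(node.dependencies ?? {})) {
    const key = `${name}@${child.version ?? "?"}`;
    // Record the shallowest hop at which each package first appears.
    if (!seen.has(key) || seen.get(key)! > depth) seen.set(key, depth);
    walk(child, depth + 1, seen);
  }
}

async function main() {
  const chunks: Buffer[] = [];
  for await (const chunk of stdin) chunks.push(chunk as Buffer);
  const tree: TreeNode = JSON.parse(Buffer.concat(chunks).toString("utf8"));

  const seen = new Map<string, number>();
  walk(tree, 1, seen);

  const direct = [...seen.values()].filter((d) => d === 1).length;
  console.log(`resolved packages: ${seen.size}`);
  console.log(`direct:            ${direct}`);
  console.log(`transitive:        ${seen.size - direct}`);
  console.log(`deepest hop:       ${Math.max(0, ...seen.values())}`);
}

main();
```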

Pattern scanners did not warn the team that any of this was a problem. CVE-matching tools came back clean, because there were no CVEs filed against the specific versions in use. Yet two of the deeply transitive packages contained reachable, exploitable bugs that nobody had named yet. Neither the maintainer nor the security industry had ever assigned them an identifier. They were genuine zero-days, and they were sitting four hops deep in a graph that nobody had read.

This is the modern shape of the problem. Zero-day discovery in 2026 is not primarily a question of finding novel bugs in your own first-party code, although that matters. It is a question of finding novel bugs in the graph of strangers' code that you ship inside your binary.

Why transitive dependencies are where zero-days hide

The dependency graph is asymmetric. The packages you directly require get more attention from you, because you read their docs, you bump them deliberately, and you notice when they break. The packages four hops away get almost no attention. They are pulled in to satisfy a peer dependency of a build-time tool of a framework you chose for unrelated reasons. You have probably never read the code. The maintainer is often a single individual. The CI is often minimal. The security review, if it exists, is usually a snapshot from a community audit two years ago.

This is the most fertile ground in the ecosystem for previously unnamed bugs. The combination of low attention, high reuse, and weak review produces a population of vulnerabilities that exist on disk but have never been catalogued. Zero-days are not rare in this graph. They are abundant. The reason they go unreported is that nobody is looking for them in the right way.

Why SCA scanners miss them

Software composition analysis tools were built to answer a different question: which of your dependencies have known CVEs? That question is useful, and the answer changes daily. But it is not the same question as "which of your dependencies contain bugs nobody has named yet." The CVE database is a record of what the security community has already found and reported. It is not, and was never intended to be, a comprehensive catalogue of vulnerabilities that exist in the code.

For a zero-day in a transitive dependency, the CVE simply does not exist. The bug is real, the data flow is exploitable, and the SCA tool will return clean for as long as the bug remains unreported. By the time the CVE lands, the patch has shipped and your incident response window has closed. SCA is the right architecture for known-vulnerability matching. It is the wrong architecture for zero-day discovery.
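
To make the architectural gap concrete, here is the matching model reduced to a sketch. The package names and the advisory entry are invented, and real tools consult feeds like OSV or the GitHub Advisory Database rather than an in-memory list, but the structural point is the same: the lookup can only return what someone has already filed.

```typescript
// Minimal sketch of SCA-style known-vulnerability matching.
// Assumes the `semver` package is installed (npm i semver).
import * as semver from "semver";

interface Advisory {
  package: string;
  vulnerableRange: string; // e.g. "<1.2.0"
  cve: string;
}

// The advisory DB contains only bugs somebody has already found and filed.
const advisoryDb: Advisory[] = [
  { package: "left-pad-esque", vulnerableRange: "<1.2.0", cve: "CVE-2025-0001" },
];

function scan(name: string, version: string): Advisory[] {
  return advisoryDb.filter(
    (a) => a.package === name && semver.satisfies(version, a.vulnerableRange)
  );
}

// A fourth-hop package with a real, reachable, unreported bug:
console.log(scan("deep-transitive-parser", "3.1.4")); // [] — reported "clean"
```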

What reachability across a dependency graph actually requires

To find a zero-day in transitive code, the analysis has to do three things that almost no commercial tool does well.

First, it has to construct the dependency graph in a form that preserves call-graph and data-flow edges, not just package names. A list of packages with versions tells you nothing about whether tainted data from your handler can reach a sink inside a fourth-hop dependency. You need the inter-procedural call graph that crosses package boundaries.
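
In sketch form, with every package and function name invented for illustration, the representation looks something like this: nodes are functions qualified by the package that owns them, and edges record both calls and data flow, so a path can be traced across package boundaries.

```typescript
// Sketch of a cross-package graph node: a function, not a package name.
interface FnNode {
  pkg: string;  // owning package, e.g. "app" or a transitive dependency
  fn: string;   // qualified function name
  file: string;
  line: number;
}

type EdgeKind = "call" | "dataflow";

interface Edge {
  from: FnNode;
  to: FnNode;
  kind: EdgeKind;
}

// A package list cannot answer "does tainted input reach this sink?";
// edges like these can, because they survive the package hops.
const edges: Edge[] = [
  {
    kind: "dataflow",
    from: { pkg: "app", fn: "handleUpload", file: "src/routes.ts", line: 42 },
    to: { pkg: "body-codec", fn: "decode", file: "lib/decode.js", line: 17 },
  },
  {
    kind: "dataflow",
    from: { pkg: "body-codec", fn: "decode", file: "lib/decode.js", line: 17 },
    to: { pkg: "deep-archive", fn: "extractTo", file: "src/extract.js", line: 88 },
  },
];
```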

Second, it has to do the reachability analysis at scale. A graph with seven million lines of transitive code cannot be analysed exhaustively in the same way you would analyse first-party code. The engine has to know which entry points in your application reach which functions in which dependencies, and prune the analysis to flows that have a chance of being triggered by an attacker.
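
A standard way to do that pruning, sketched here over the node-and-edge shape above with an illustrative adjacency map, is plain breadth-first reachability from the application's entry points; everything the walk never touches is excluded from the expensive taint analysis.

```typescript
// Sketch: compute the set of functions an attacker-driven entry point
// can reach, so analysis effort is spent only inside that set.
function reachableFrom(
  entryPoints: string[],            // keys like "app:handleUpload"
  adjacency: Map<string, string[]>  // caller key -> callee keys
): Set<string> {
  const seen = new Set<string>(entryPoints);
  const queue = [...entryPoints];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const callee of adjacency.get(current) ?? []) {
      if (!seen.has(callee)) {
        seen.add(callee);
        queue.push(callee);
      }
    }
  }
  return seen; // everything outside this set can be skipped entirely
}
```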

Third, it has to reason about exploitability with grounded evidence. A reachable taint flow into a transitive sink is not, by itself, a vulnerability. The hypothesised exploit conditions have to be checked against the actual sanitisers, type narrowing, and framework-level escaping along the path. Without that disproof step, the noise level is so high that the findings are useless.
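
Reduced to a sketch, the disproof step walks the hypothesised exploit conditions against what each hop on the path already neutralises. The sanitiser sets here are hand-written for illustration; a real engine would derive them from the actual code and framework semantics.

```typescript
// Sketch: a hypothesis survives only if no step on the path already
// enforces one of the conditions the exploit depends on.
interface PathStep {
  pkg: string;
  fn: string;
  sanitises: Set<string>; // exploit conditions this step neutralises
}

interface Hypothesis {
  cwe: string;                  // e.g. "CWE-22" (path traversal)
  requiredConditions: string[]; // what must hold for the exploit to fire
  path: PathStep[];
}

function disprove(h: Hypothesis): { killed: boolean; reason?: string } {
  for (const condition of h.requiredConditions) {
    const killer = h.path.find((step) => step.sanitises.has(condition));
    if (killer) {
      return {
        killed: true,
        reason: `${killer.pkg}:${killer.fn} already enforces "${condition}"`,
      };
    }
  }
  return { killed: false }; // survives: becomes a reported finding
}
```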

The engine-plus-Griffin AI architecture in this setting

The pipeline I have come to trust for this work splits the problem into stages that fit those three requirements. A static engine constructs the cross-package taint graph from the resolved dependency manifest and the source artifacts on disk. The graph is grounded in real edges, not heuristics. Griffin AI then reasons over each candidate flow, hypothesises a CWE class for the potential bug, and articulates the exploit conditions that would have to hold. A disproof pass tries to kill each hypothesis by examining sanitisers, narrowing, and runtime invariants along the entire cross-package path. Anything that survives all three stages is reported with its full taint path attached, including the package boundaries it crosses and the line numbers in each transitive package.

The findings that emerge from this pipeline are qualitatively unlike anything an SCA tool produces. They name the entry point in your code, the set of intermediate functions in your direct dependencies, the function in the four-hop transitive dependency where the sink lives, and the disproof attempt that failed. The triager has the entire reachability story in front of them.
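
As a rough shape, with field names that are mine rather than any tool's actual report schema, a finding of this kind carries something like:

```typescript
// Illustrative shape of a cross-package zero-day finding.
interface TaintHop {
  pkg: string;   // which package this hop lives in
  depth: number; // 0 = first-party code, 4 = fourth-hop transitive
  fn: string;
  file: string;
  line: number;
}

interface ZeroDayFinding {
  cweHypothesis: string;  // e.g. "CWE-78"
  entryPoint: TaintHop;   // the handler in your own code
  path: TaintHop[];       // every hop, across package boundaries
  sink: TaintHop;         // where the flow lands in transitive code
  failedDisproof: string; // why the attempt to kill the hypothesis failed
}
```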

What teams do with these findings

The most common workflow I see is responsible disclosure to the upstream maintainer, with the Safeguard report as supporting evidence. Maintainers respond well to reports that show the entire taint path. They are accustomed to dismissing speculative bug reports from drive-by reporters who never read the code. A report grounded in reachability, with the disproof attempt attached, is a different kind of artefact entirely. It signals that the reporter has done the work.

The second-order effect is that organisations using this pipeline are now contributing to the broader ecosystem's zero-day discovery rate, not just protecting themselves. A bug found in a transitive dependency, reported to the maintainer, and patched upstream is a bug that no longer exists for any downstream consumer. This is the closest the industry has to a positive externality in vulnerability work.

A note on the false-positive problem at this scale

The naive worry about running deep-graph reachability analysis is that the false-positive count will explode. Seven million lines of transitive code, multiplied by even a small per-line FP probability, would produce a queue that no team could read: at one spurious candidate per ten thousand lines, that is seven hundred findings per scan. The architecture that prevents this collapse is the disproof pass. The model hypothesises bug classes over the candidate flows the engine surfaced, and the disproof pass actively tries to falsify each hypothesis using the same engine evidence. Hypotheses that survive falsification become findings. Hypotheses that the disproof pass kills never reach the queue. Empirically, in the deployments I have observed, transitive-graph FP rates land in the same single-digit-to-low-teens band as first-party FP rates for the same pipeline, which is what makes the approach operationally viable rather than theoretically interesting.

How Safeguard helps

Safeguard ingests your full dependency manifest, resolves the transitive graph including build-time and dev tooling, and runs the engine-plus-Griffin AI pipeline across the entire surface on every merge. It surfaces reachable, disproof-tested zero-day candidates in transitive code that pattern scanners and SCA tools cannot reach by design. Each finding ships with the cross-package taint path, the CWE hypothesis, the failed disproof attempt, and a draft responsible-disclosure report you can send to the upstream maintainer. The platform tracks the disclosure lifecycle and notifies you when an upstream patch lands so you can update your build with a clear audit trail.
