Every security engineer I know has a drawer full of bugs that do not fit the catalogue. A privilege escalation chain that depends on a UUID collision. An information leak through a cache header that exposes tenant identity across shared CDN edges. A business-logic flaw where a currency conversion rounds in a direction that lets an attacker drain a few cents per transaction at scale. None of these are "unknown unknowns" to the humans who find them. They are simply patterns that sit outside the clean CWE taxonomy and therefore fall off the radar of tools that pattern-match against the catalogue.
Detecting genuinely novel bug classes is one of the fairer tests of an AI bug hunter, because it decouples the detection from the tool's training-set memorisation. If the bug looks like nothing in the public corpus, the model cannot have learned it by example. It has to reason.
The pure-LLM failure mode
Mythos-class tools ostensibly shine at novelty. They are language models; they are good at generating hypotheses; they do not need a pre-specified CWE to emit a finding. In practice, this freedom is the source of most of their false positives. Without a structured representation of the program, the model's "novelty" collapses into pattern-matching on training-set grammar. It will generate a finding shaped like a known bug, call it novel, and leave the reviewer to discover that the pattern it matched against does not correspond to any real primitive in the target program.
The 2025 analysis from the Oxford Internet Institute on "Hallucinated Novelty in LLM Code Audit" documented this carefully. It evaluated several pure-LLM scanners on a corpus of curated novel bugs that had been publicly disclosed after the models' training cutoff. The tools flagged findings at roughly their usual volume, but fewer than 8 percent of the findings labelled "novel" actually corresponded to the curated bugs. The rest were inventions.
What genuine novelty detection requires
A grounded pipeline has a harder job here but a more honest one. Griffin's engine represents the program explicitly: call graph, data flow, control flow, module boundaries, framework annotations. When it surfaces a path, the path is a real thing in the program. The novelty question then reduces to: does the combination of this source, this sink, and this constraint set correspond to a known CWE pattern, or is it something else?
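To make that concrete, here is a minimal sketch of what a grounded path record might look like. The `TaintPath` structure and every field name are hypothetical illustrations of the idea, not Griffin's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a grounded path record; names are illustrative,
# not Griffin's real schema.

@dataclass
class Node:
    module: str      # e.g. "billing.controllers.documents"
    function: str    # e.g. "get_document"
    line: int

@dataclass
class TaintPath:
    source: Node                 # where attacker-controlled data enters
    sink: Node                   # the sensitive operation it reaches
    steps: list[Node] = field(default_factory=list)        # hops through the call/data-flow graph
    constraints: list[str] = field(default_factory=list)   # conditions that must hold along the path
```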
Griffin's answer when the pattern does not match cleanly is to emit a finding with a CWE classification of "uncategorised" or a nearest-neighbour parent class, and to flag the deviation from the canonical pattern in the hypothesis. The disproof pass still runs: the claim has to survive attempts at invalidation regardless of whether the CWE label is standard. A reviewer reading a novel-class finding sees the grounded path, the non-matching pattern, and the disproof reasoning. They are not being asked to believe a novel claim on the model's authority; they are being shown evidence.
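Continuing the `TaintPath` sketch above, the fallback logic amounts to something like the following; the function and pattern-table shapes are hypothetical, not Griffin's API.

```python
from typing import Callable

# Hypothetical classification fallback: exact CWE patterns first, then a
# nearest parent class, then "uncategorised". The disproof pass described
# above runs on the resulting finding regardless of its label.

Pattern = Callable[[TaintPath], bool]   # predicate over a grounded path

def classify(path: TaintPath, exact: dict[str, Pattern], parents: dict[str, Pattern]) -> dict:
    for cwe_id, matches in exact.items():
        if matches(path):
            return {"cwe": cwe_id, "deviation": None}
    for cwe_id, matches in parents.items():
        if matches(path):
            return {"cwe": cwe_id,
                    "deviation": f"matches parent class {cwe_id} but no canonical child pattern"}
    return {"cwe": "uncategorised",
            "deviation": "no known CWE pattern fits this source/sink/constraint combination"}
```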
A recent example worth describing in the abstract
Without naming customers, one of the clearer examples from the last few months was a tenant-isolation bug in a SaaS application where a shared cache key did not include the tenant ID. The source was HTTP input shaped as a document ID; the sink was the shared cache lookup; the bug was that two tenants owning different documents with the same numeric ID could poison each other's cached responses. This is not a classic CWE. It is closest to CWE-359 (Exposure of Private Personal Information) or CWE-840 (Business Logic Errors), but neither captures it cleanly.
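To illustrate the shape of the bug, the vulnerable cache-key construction looks roughly like this. The names are hypothetical; the customer's code is of course not reproduced here.

```python
# Hypothetical reconstruction of the bug class; names are illustrative,
# not the customer's actual code.

def cached_document_response(cache, tenant_id: str, document_id: int, render):
    # Vulnerable: the key omits the tenant, so tenant A's document 42 and
    # tenant B's document 42 share one cache entry and can poison each other.
    key = f"doc-response:{document_id}"
    # Fix: include the tenant so entries are isolated per tenant, e.g.
    # key = f"doc-response:{tenant_id}:{document_id}"

    cached = cache.get(key)
    if cached is not None:
        return cached
    response = render(tenant_id, document_id)
    cache.set(key, response)
    return response
```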
Griffin's output described the path, flagged the missing tenant component in the cache key, and proposed the exploit conditions (attacker populates their document ID namespace to collide with the victim tenant's active documents). The disproof pass confirmed that the cache key construction genuinely omitted the tenant. A pure-LLM scanner on the same codebase produced a long list of "CWE-79 reflected XSS" findings based on pattern-matching input handling in the same controller, none of which were real.
The training-data dependency
The deeper issue is that pure-LLM scanners are only as creative as their training data. For CWE classes with extensive public corpora of example bugs (injection, XSS, path traversal, memory corruption in C), the scanner can at least pattern-match plausibly. For CWE classes with sparser public examples (tenant isolation, rate-limit bypass, business-logic confusion), the scanner either ignores them entirely or hallucinates them in inappropriate places. The training signal shapes both the recall and the noise in ways that are hard to predict without running the tool against a labelled novel-bug corpus.
Griffin's training signal is also not free of bias, but the engine-driven scaffolding prevents the model from emitting findings outside the flows the engine has proven exist. The model can still fail to classify a novel pattern correctly, but it cannot invent the flow from whole cloth.
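In practice that scaffolding constraint amounts to something like the check below: a model-proposed finding is only kept if its source/sink pair is backed by a flow the engine has actually derived. The function and data shapes are hypothetical, not Griffin's internals.

```python
# Hypothetical sketch of engine-driven scaffolding. `engine_flows` is assumed
# to be the set of (source, sink) pairs the engine has proven exist.

def accept_finding(proposed: dict, engine_flows: set[tuple[str, str]]):
    if (proposed["source"], proposed["sink"]) not in engine_flows:
        return None   # the model cannot invent a flow from whole cloth
    return proposed   # the classification may still be wrong, but the path is real
```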
Where grounded scanners still miss
It is worth being honest about what grounded pipelines cannot do. If the novelty lies in an area the engine does not model, for example cryptographic protocol bugs where the vulnerability is in the interaction of multiple independent flows rather than in any single flow, Griffin will not surface the bug. It does not have a representation of the protocol-level invariants being violated. These bugs are often found by protocol-aware fuzzing or by formal verification, not by the style of analysis an LLM-assisted scanner performs.
A realistic programme uses Griffin for the large class of bugs where taint-path reasoning applies, and pairs it with targeted dynamic analysis, fuzzing, and specification-driven verification for the rest. No single tool is going to find everything. The relevant comparison is which tools produce trustworthy output inside their scope.
How Safeguard helps
Safeguard treats novel bug classes as first-class citizens in the finding pipeline. Griffin AI's "uncategorised" findings are routed to a dedicated review queue where senior security engineers can inspect the grounded evidence and either formalise a new internal pattern or accept the finding as a one-off. Over time the platform accumulates a library of internally discovered patterns specific to your stack, and Griffin learns to recognise them on subsequent scans. The effect is a detection capability that grows with your programme rather than being pinned to the public CWE catalogue.
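A minimal sketch of that routing, assuming a simple internal pattern library keyed by matcher functions; the queue and field names are hypothetical, not Safeguard's actual API.

```python
# Hypothetical sketch of the routing described above; names are illustrative.

def route_finding(finding: dict, internal_patterns: list[dict],
                  review_queue: list, standard_queue: list) -> None:
    if finding["cwe"] != "uncategorised":
        standard_queue.append(finding)
        return
    # Internally discovered patterns are checked first, so a bug class
    # formalised on an earlier scan is recognised automatically on later ones.
    for pattern in internal_patterns:
        if pattern["matches"](finding):
            finding["cwe"] = pattern["internal_id"]
            standard_queue.append(finding)
            return
    # Genuinely unrecognised: send to the senior-review queue, where it can be
    # formalised as a new internal pattern or accepted as a one-off.
    review_queue.append(finding)
```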