Hallucinated Security Findings: Measurable Rates

Pure-LLM security analysis hallucinates findings at rates between 20% and 70% depending on the task and model. Grounding is the architectural answer.

Nayan Dey
Senior Security Engineer

When a frontier model is asked to find vulnerabilities in code without structured grounding, it produces findings, some real and some not. The rate of hallucinated findings (plausible-sounding vulnerabilities that don't actually exist in the code) varies with the task and model but consistently lands between 20% and 70% in published research. For production use, any rate above ~10% makes the output operationally unusable without heavy human filtering. Grounding is the architectural mechanism that brings the rate down to production-acceptable levels.

What hallucinated findings look like

Three common patterns:

  • Non-existent imports. The model reports a dangerous import that isn't in the file.
  • Wrong function attribution. The model attributes a vulnerability to the wrong function.
  • Confident-sounding non-vulnerabilities. The model reports a plausible-looking SQL injection in code that uses parameterised queries.

Each is costly to triage because the finding reads like genuine analysis, so a reviewer has to check it against the code before dismissing it.
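The first two patterns can be screened mechanically before a reviewer sees the finding. The following is a minimal sketch, assuming the analysed file is Python and that findings arrive as a hypothetical `Finding` record (the field names are illustrative, not any platform's schema). It uses the standard `ast` module to confirm that the reported import and function really exist in the file.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    """Hypothetical shape of a model-reported finding (an assumption)."""
    function: str          # function the model blames
    imported_module: str   # module the model says the file imports
    line: int              # line the model points at


def check_finding(source: str, finding: Finding) -> List[str]:
    """Return the reasons the finding contradicts the source (empty if none)."""
    tree = ast.parse(source)
    problems = []

    # Collect every module name the file actually imports.
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imported.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module)
    if finding.imported_module not in imported:
        problems.append(f"import '{finding.imported_module}' not in file")

    # Map each function name to the line span it occupies.
    spans = {
        node.name: (node.lineno, node.end_lineno)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    if finding.function not in spans:
        problems.append(f"function '{finding.function}' not in file")
    else:
        start, end = spans[finding.function]
        if not start <= finding.line <= end:
            problems.append(f"line {finding.line} is outside '{finding.function}'")
    return problems


# Toy file for illustration only.
SOURCE = '''import sqlite3

def load_user(conn, user_id):
    return conn.execute("SELECT * FROM users WHERE id = ?", (user_id,))

def ping():
    return "ok"
'''

# A finding that names an import the file lacks and blames the wrong function.
bad = check_finding(SOURCE, Finding("ping", "pickle", 4))
# A finding whose import, function, and line all match the file.
good = check_finding(SOURCE, Finding("load_user", "sqlite3", 4))
print(bad)
print(good)
```

A check like this says nothing about the third pattern: the second finding passes even though the query is parameterised, which is why structural checks alone are not enough.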

Why hallucination happens

Three reasons:

  • Models are completion engines. They produce the most likely next tokens given the input. Plausibility is optimised, not truth.
  • Security-specific training data is limited. The model's training includes general security knowledge but not the deep, codebase-specific analysis that confirming a vulnerability requires.
  • Multi-hop reasoning is unreliable. Chains of inference accumulate error.

Each factor compounds.
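The multi-hop point can be made concrete with a little arithmetic. This is a rough sketch only: it assumes every inference step is correct with the same probability and that errors are independent, and the 90% per-step figure is an illustrative assumption, not a measured value.

```python
def chain_accuracy(per_step_accuracy: float, hops: int) -> float:
    """Probability that every hop in a chain is correct, assuming independent errors."""
    return per_step_accuracy ** hops


# Illustrative only: 0.9 per step is an assumed figure, not a benchmark result.
for hops in (1, 3, 5, 10):
    print(hops, round(chain_accuracy(0.9, hops), 3))
```

Under these assumptions a five-hop chain is fully correct only about 59% of the time, and a ten-hop chain about 35% of the time, even though each individual step looks reliable.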

How grounding fixes it

Griffin AI's grounding approach:

  • The engine does reachability, taint, and call graph analysis deterministically.
  • The model reasons over the engine's structured output, not over raw code.

The model is asked "given this specific taint path, what is the exploit hypothesis?" rather than "find vulnerabilities in this code." The narrower question is one the model answers far more reliably.
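The difference between the two questions can be sketched in code. The example below is an assumption-laden illustration, not Griffin AI's implementation: `TaintStep`, `TaintPath`, and `build_prompt` are hypothetical names, and the path is presumed to come from a deterministic engine that has already established reachability.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaintStep:
    function: str
    file: str
    line: int


@dataclass(frozen=True)
class TaintPath:
    """Hypothetical engine output: one source-to-sink path (an assumption)."""
    source: str             # where untrusted data enters
    sink: str               # dangerous operation the data reaches
    steps: List[TaintStep]  # call chain from source to sink
    reachable: bool         # set by the engine, never by the model


def build_prompt(path: TaintPath) -> str:
    """Turn one engine-verified path into a narrow question for the model."""
    if not path.reachable:
        # Unreachable paths are filtered out before the model is involved.
        raise ValueError("unreachable path: nothing to ask the model")
    hops = "\n".join(
        f"  {i}. {s.function} ({s.file}:{s.line})"
        for i, s in enumerate(path.steps, start=1)
    )
    return (
        "Given this specific taint path, what is the exploit hypothesis?\n"
        f"Source: {path.source}\n"
        f"Sink: {path.sink}\n"
        f"Call chain:\n{hops}\n"
        "Reason only about the path above. Do not report other issues."
    )


# Toy path for illustration only.
example = TaintPath(
    source="HTTP query parameter 'q'",
    sink="subprocess.run(shell=True)",
    steps=[
        TaintStep("handle_search", "app/views.py", 42),
        TaintStep("run_query", "app/search.py", 17),
    ],
    reachable=True,
)
print(build_prompt(example))
```

The design choice is that the model never sees raw code it could misread: every function, file, and line in the prompt was produced by the engine, so the model has fewer facts to invent.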

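Published Griffin AI hypothesis-accuracy numbers: 81% full agreement, 94% with partial CWE credit. A 20% to 70% hallucination rate leaves pure-LLM analysis with only 30% to 80% of its findings real; the gap between that baseline and the grounded numbers is the grounding effect.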

What to evaluate

Three concrete checks:

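  1. Ask the platform to analyze code with no structured grounding. Measure the hallucination rate.
  2. Add grounding (reachability, SBOM) and re-measure on the same code.
  3. Compare operational usability: check whether the grounded rate falls below the ~10% level above which output needs heavy human filtering.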
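The three checks above can be scored with a few lines of code once a reviewer has confirmed which findings are real. This is a minimal sketch: the `(file, line, CWE)` key, the function names, and the sample findings are all hypothetical and made up for illustration, and the 10% threshold is the approximate figure quoted earlier in this article.

```python
from typing import Iterable, Set, Tuple

FindingKey = Tuple[str, int, str]  # (file, line, CWE id), an assumed key

USABLE_THRESHOLD = 0.10  # the ~10% level cited in this article


def hallucination_rate(reported: Iterable[FindingKey],
                       confirmed: Set[FindingKey]) -> float:
    """Fraction of reported findings that a reviewer could not confirm."""
    reported = list(reported)
    if not reported:
        return 0.0
    hallucinated = [f for f in reported if f not in confirmed]
    return len(hallucinated) / len(reported)


def is_usable(rate: float) -> bool:
    """True when the rate is at or below the assumed usability threshold."""
    return rate <= USABLE_THRESHOLD


# Toy data, invented purely to show the calculation.
confirmed = {("app/search.py", 17, "CWE-78"), ("app/db.py", 88, "CWE-89")}
ungrounded = [
    ("app/search.py", 17, "CWE-78"),
    ("app/db.py", 88, "CWE-89"),
    ("app/db.py", 12, "CWE-89"),     # not confirmed by review
    ("app/auth.py", 5, "CWE-502"),   # not confirmed by review
    ("app/views.py", 40, "CWE-79"),  # not confirmed by review
]
grounded = [("app/search.py", 17, "CWE-78"), ("app/db.py", 88, "CWE-89")]

rate_before = hallucination_rate(ungrounded, confirmed)
rate_after = hallucination_rate(grounded, confirmed)
print(f"ungrounded: {rate_before:.0%}, usable: {is_usable(rate_before)}")
print(f"grounded:   {rate_after:.0%}, usable: {is_usable(rate_after)}")
```

Running both configurations against the same code and the same reviewer-confirmed set keeps the comparison fair. The rate measures only how many reported findings are false, so missed vulnerabilities need a separate recall measurement.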

How Safeguard Helps

Safeguard's engine-plus-LLM grounding architecture measurably reduces the hallucination rate that afflicts pure-LLM security analysis. For teams whose triage time is dominated by false positives, grounding is the architectural property that changes the economics.
