Griffin is the heavyweight reasoning family: five size variants spanning 8B to a 671B-MoE flagship, all trained exclusively on a cybersecurity corpus. For each candidate finding it hypothesises an exploit chain, cites the call-graph path, attempts a disproof against the project's sanitiser config, and writes the patch.
Every variant shares the corpus, tokeniser and reasoning trace format. They differ in parameter count, context window and where they run.
| Variant | Parameters | Context window | Latency p95 | Deployment shape | Typical use |
|---|---|---|---|---|---|
| Griffin Lite | 8B | 32k | ~1.2s | IDE-side cloud burst / CLI deep-scan | Fast single-finding reasoning. |
| Griffin S | 14B | 64k | ~2.8s | Cloud | Mid-depth call-graph reasoning, PR-level reviews. |
| Griffin M | 32B | 128k | ~5.5s | Cloud | Repo-wide reasoning, transitive taint chains. |
| Griffin L | 70B | 128k | ~8s | Dedicated GPU | Multi-hop cross-package exploit hypothesis. Default production tier. |
| Griffin Zero | 671B-MoE (~37B active) | 256k | ~12s | Multi-GPU cluster / sovereign | Deepest reasoning, supply-chain-scale audits. |
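
As a rough illustration of how the table can be read against a budget, the sketch below picks the smallest variant whose context window and p95 latency both fit a request. The `Variant` type and `smallest_fit` helper are hypothetical names, the figures are copied from the table (taking 1k as 1,000 tokens, which is an assumption), and the filter ignores reasoning depth entirely, so it is not the triage routing itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    context_tokens: int  # context window from the table; 1k taken as 1,000 tokens (assumption)
    p95_seconds: float   # approximate p95 latency from the table


# Rows copied from the table, ordered smallest to largest.
VARIANTS = [
    Variant("Griffin Lite", 32_000, 1.2),
    Variant("Griffin S", 64_000, 2.8),
    Variant("Griffin M", 128_000, 5.5),
    Variant("Griffin L", 128_000, 8.0),
    Variant("Griffin Zero", 256_000, 12.0),
]


def smallest_fit(prompt_tokens: int, max_p95_seconds: float) -> Optional[Variant]:
    """Return the smallest variant that fits both limits, or None if none does."""
    for variant in VARIANTS:
        fits_context = prompt_tokens <= variant.context_tokens
        fits_latency = variant.p95_seconds <= max_p95_seconds
        if fits_context and fits_latency:
            return variant
    return None


# Hypothetical request: a 50k-token prompt with a 10-second latency budget.
choice = smallest_fit(50_000, 10.0)
print(choice.name if choice else "no variant fits")
```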
Every Griffin call emits a four-stage trace. Reviewers see the chain, not a single label, and can reject at any stage.
[01] HYPOTHESIS
class: CWE-502 (unsafe deserialization)
entry: HTTP POST /api/import-config
gadget: pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.9.10
[02] CITED PATH
handler.parseRequest() -> service.importConfig()
-> codec.decode(bytes) -> ObjectMapper.readValue(InputStream, Object.class)
6 hops, 3 package boundaries, 1 sanitiser bypassed (allow-list mismatch).
[03] DISPROOF ATTEMPT
- polymorphic typing disabled? no (DefaultTyping.NON_FINAL active)
- allow-list enforced? partial; missing on nested key 'plugins'
- sandbox or seccomp profile? none on this code path
refutation failed; finding stands.
[04] PROPOSED PATCH
- replace ObjectMapper.readValue with constrained reader
using ALLOWED_TYPES allow-list
- bump jackson-databind to >= 2.15.2 (advisory-aligned)
- add SecurityManager-equivalent unit test covering nested 'plugins'.

A triage score decides which Griffin size handles each candidate. You don't pay Zero-tier compute for an in-package call.
Eagle assigns a complexity score from the call graph: depth, sanitiser ambiguity, cross-package edges, sink severity.
Cheap, in-package candidates route to Lite. Mid-depth PR work routes to S or M. Multi-hop cross-package paths route to L. Sovereign or long-budget audits route to Zero.
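
The routing rules above can be sketched as a weighted score over the four call-graph signals followed by a threshold table. Everything numeric here is an assumption made for illustration: the weights, the thresholds, the 0.0 to 1.0 scales, and the `Candidate`, `complexity_score` and `route` names are hypothetical and are not Eagle's actual scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    depth: int                  # call-graph hops from entry point to sink
    sanitiser_ambiguity: float  # 0.0 = unambiguous, 1.0 = unclear (assumed scale)
    cross_package_edges: int    # package boundaries crossed on the path
    sink_severity: float        # 0.0 = benign, 1.0 = critical (assumed scale)
    deep_audit: bool = False    # sovereign or long-budget audit requested


def complexity_score(c: Candidate) -> float:
    # Illustrative weights only; the real weighting is not documented here.
    return (
        1.0 * c.depth
        + 2.0 * c.cross_package_edges
        + 3.0 * c.sanitiser_ambiguity
        + 4.0 * c.sink_severity
    )


def route(c: Candidate) -> str:
    """Map a candidate to a variant name using assumed thresholds."""
    if c.deep_audit:
        return "Griffin Zero"
    score = complexity_score(c)
    if c.cross_package_edges == 0 and score < 6:
        return "Griffin Lite"  # cheap, in-package
    if score < 10:
        return "Griffin S"     # mid-depth PR work
    if score < 14:
        return "Griffin M"
    return "Griffin L"         # multi-hop, cross-package


# A 6-hop path across 3 package boundaries with a partly enforced allow-list.
print(route(Candidate(depth=6, sanitiser_ambiguity=0.5,
                      cross_package_edges=3, sink_severity=1.0)))
```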
The chosen variant runs the hypothesise / cite / disprove / patch trace. The trace ships with the finding so reviewers can audit which variant produced what.
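
For reviewers who want to audit traces programmatically, a minimal sketch follows. It assumes the trace arrives as plain text with `[NN] STAGE NAME` headers, as in the sample shown earlier; the actual delivery format is not specified here, and `Finding` and `parse_trace` are hypothetical names. The parser splits the text into its four stages and records which variant produced it.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

STAGES = ("HYPOTHESIS", "CITED PATH", "DISPROOF ATTEMPT", "PROPOSED PATCH")
HEADER = re.compile(r"^\[(\d{2})\]\s+(.+)$")  # e.g. "[01] HYPOTHESIS"


@dataclass
class Finding:
    variant: str                  # which Griffin variant produced the trace
    stages: Dict[str, List[str]]  # stage name -> lines under that header


def parse_trace(text: str, variant: str) -> Finding:
    """Split a text trace into stages; raise if any of the four is missing."""
    stages: Dict[str, List[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            current = match.group(2)
            stages[current] = []
        elif current is not None:
            stages[current].append(line)
    missing = [name for name in STAGES if name not in stages]
    if missing:
        raise ValueError(f"trace is missing stages: {missing}")
    return Finding(variant=variant, stages=stages)


# Abbreviated copy of the sample trace shown earlier.
SAMPLE = """
[01] HYPOTHESIS
class: CWE-502 (unsafe deserialization)
[02] CITED PATH
handler.parseRequest() -> service.importConfig()
[03] DISPROOF ATTEMPT
- allow-list enforced? partial; missing on nested key 'plugins'
refutation failed; finding stands.
[04] PROPOSED PATCH
- bump jackson-databind to >= 2.15.2 (advisory-aligned)
"""

finding = parse_trace(SAMPLE, variant="Griffin L")
print(finding.variant, list(finding.stages))
```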
Pick the variant that fits your budget and watch it reason through your real call graph, not a benchmark.