Aegis. The reasoning architecture inside every Griffin model.
Aegis is the long-context reasoning architecture every Griffin variant runs on — sliding-window plus landmark attention, a security-augmented tokeniser, structured trace output, and mixture-of-experts routing in the largest tier. It is the reason the lineup is fast where it needs to be and deep where it has to be.
Four pieces that make Aegis what it is.
Sliding-window + landmark attention
Local attention over a windowed token range with sparse global landmarks. Lets Griffin Zero reason over a 256k-token call graph without quadratic blow-up. Latency stays bounded; the window slides where the reasoning needs it.
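As a rough illustration of why this pattern avoids quadratic cost, the sketch below builds a toy attention mask. It assumes causal attention, a fixed window, and landmarks placed at a fixed stride; none of these specifics are stated above, and the names `build_mask`, `window`, and `landmark_stride` are hypothetical.

```python
def build_mask(seq_len: int, window: int, landmark_stride: int) -> list[list[bool]]:
    """Toy mask: token i may attend to token j when j is not in the future and
    j is either inside the local window or sits on a landmark position.
    Fixed-stride landmarks are an assumption made for illustration only."""
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            causal = j <= i                      # never look ahead
            local = i - j < window               # inside the sliding window
            landmark = j % landmark_stride == 0  # sparse global position
            row.append(causal and (local or landmark))
        mask.append(row)
    return mask


def attended_pairs(mask: list[list[bool]]) -> int:
    """Count the (query, key) pairs the mask allows."""
    return sum(sum(row) for row in mask)


seq_len, window, stride = 64, 8, 16
mask = build_mask(seq_len, window, stride)
sparse_pairs = attended_pairs(mask)
dense_pairs = seq_len * (seq_len + 1) // 2  # full causal attention
print(f"sparse: {sparse_pairs} pairs, dense causal: {dense_pairs} pairs")
```

Under these assumptions each query attends to at most `window` local positions plus one position per landmark, so the work per token depends on the window size and landmark count rather than on the full sequence length.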
Security-augmented tokeniser
Adds ~28,000 tokens covering CWE/CVE IDs, taint operators, package coordinates (purl format), and attack-pattern shorthand. Each is a single token instead of 4–8 BPE pieces, which keeps the security-relevant context dense.
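To make the density argument concrete, here is a minimal sketch comparing token counts with and without an added vocabulary. The three vocabulary entries are hypothetical samples, and the regex fallback is only a crude stand-in for BPE fragmentation, not the real tokeniser.

```python
import re

# Hypothetical sample of the added vocabulary; the real entries are not listed here.
SECURITY_VOCAB = frozenset({
    "CVE-2021-44228",
    "CWE-79",
    "pkg:npm/lodash@4.17.21",
})

# Stand-in for a generic subword tokeniser: split into letter runs,
# digit runs, and single punctuation characters.
FALLBACK = re.compile(r"[A-Za-z]+|\d+|[^A-Za-z\d\s]")


def tokenize(text: str, vocab: frozenset = frozenset()) -> list[str]:
    """Emit a whole word as one token when it is in the vocabulary,
    otherwise fall back to fragmenting it."""
    tokens = []
    for word in text.split():
        if word in vocab:
            tokens.append(word)
        else:
            tokens.extend(FALLBACK.findall(word))
    return tokens


text = "triage CVE-2021-44228 CWE-79 pkg:npm/lodash@4.17.21"
generic = tokenize(text)
augmented = tokenize(text, SECURITY_VOCAB)
print(len(generic), "pieces without the added vocabulary")
print(len(augmented), "tokens with it")
```

The same identifiers occupy far fewer positions once they are single tokens, which is the effect described above; the exact ratio depends on the real tokeniser.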
Mixture-of-experts (Zero only)
Eight experts, top-2 routing, ~5.5% of parameters activated per token. Griffin Zero reaches 671B parameters at the inference cost of a ~37B dense model. The router itself is trained on the security task distribution.
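Top-2 routing can be sketched in a few lines. The example below assumes the gate weights are a softmax over the two selected experts, which is one common variant and not confirmed for Aegis; the scalar "experts" and the router logits are toy values. The last line checks the stated ratio: 37B active out of 671B total is about 5.5%.

```python
import math


def top2_route(logits: list[float]) -> list[tuple[int, float]]:
    """Pick the two highest-scoring experts and normalise their weights
    with a softmax over just those two (assumed variant)."""
    top = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:2]
    peak = max(logits[e] for e in top)
    exps = [math.exp(logits[e] - peak) for e in top]
    total = sum(exps)
    return [(e, w / total) for e, w in zip(top, exps)]


def moe_forward(x: float, router_logits: list[float], experts) -> float:
    """Run only the routed experts and mix their outputs by gate weight."""
    return sum(w * experts[e](x) for e, w in top2_route(router_logits))


# Eight toy experts: expert k simply scales its input by (k + 1).
experts = [lambda x, k=k: (k + 1) * x for k in range(8)]
router_logits = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.3, 1.0]

routes = top2_route(router_logits)
y = moe_forward(1.0, router_logits, experts)
active_fraction = 37 / 671  # active vs. total parameters, in billions
print(routes, y, f"{active_fraction:.1%}")
```

Only two of the eight expert functions are called per input; the other six cost nothing at inference time, which is where the gap between total and active parameters comes from.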
Structured trace output
Every response is emitted as HYPOTHESIS / CITED PATH / DISPROOF / PROPOSED PATCH. The trace is the auditability story — every finding ships with the reasoning the model walked and the refutation it tried.
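A downstream consumer might model the four fields like this. The `LABEL: value` text layout, the `Trace` class, and `parse_trace` are assumptions for illustration; the actual wire format is not specified here.

```python
from dataclasses import dataclass

FIELDS = ("HYPOTHESIS", "CITED PATH", "DISPROOF", "PROPOSED PATCH")


@dataclass
class Trace:
    hypothesis: str
    cited_path: str
    disproof: str
    proposed_patch: str


def parse_trace(text: str) -> Trace:
    """Parse an assumed 'LABEL: value' layout; unlabelled lines are
    treated as continuations of the previous field."""
    values: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if sep and label.strip() in FIELDS:
            current = label.strip()
            values[current] = rest.strip()
        elif current is not None:
            values[current] += "\n" + line
    missing = [f for f in FIELDS if f not in values]
    if missing:
        raise ValueError(f"trace is missing fields: {missing}")
    return Trace(*(values[f].strip() for f in FIELDS))


sample = """HYPOTHESIS: user input reaches render() unescaped
CITED PATH: handler.parse -> template.render
DISPROOF: no sanitiser found on the path
PROPOSED PATCH: escape input before render()"""

trace = parse_trace(sample)
print(trace.cited_path)
```

Rejecting a trace with a missing field keeps the contract strict: a finding without its disproof attempt never reaches an auditor.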
Four design choices a generic transformer doesn't make.
Tokeniser priors for security
Off-the-shelf tokenisers fragment CVE IDs, purls, and CWE numbers into character soup. Aegis treats them as single tokens, which preserves their semantic role inside long-context reasoning.
Long-context retrieval gates
Before attention runs, a retrieval gate pre-ranks call-graph chunks by relevance. The model only spends compute on the chunks that might matter. Time-to-first-token drops materially at 256k context.
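The gating step amounts to "score, rank, keep the top few". The sketch below uses word overlap as the relevance score purely as a stand-in; how Aegis actually scores chunks is not described here, and `relevance`, `gate`, and the sample chunks are hypothetical.

```python
def relevance(query: str, chunk: str) -> float:
    """Stand-in score: fraction of query words that appear in the chunk."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0


def gate(query: str, chunks: list[str], keep: int) -> list[str]:
    """Rank chunks by relevance and pass only the best `keep` to attention."""
    ranked = sorted(chunks, key=lambda ch: relevance(query, ch), reverse=True)
    return ranked[:keep]


chunks = [
    "parse_request reads user input from socket",
    "render_template writes html output",
    "log_metrics increments counter",
    "sanitize strips tags from user input",
]
selected = gate("user input sanitize", chunks, keep=2)
for chunk in selected:
    print(chunk)
```

Because the gate runs before attention, the expensive part of the model sees two chunks instead of four in this toy case; the saving grows with the size of the call graph.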
Reasoning trace as a first-class output
Chain-of-thought in generic models is a side effect. In Aegis it is the output contract — structured, fielded, and consumable downstream by the disproof head and the auto-fix pipeline.
Adversarial disproof decoder head
A second decoder head tries to refute whatever the main head proposed. Only candidates that survive disproof reach your queue. This is how Griffin keeps false-positive rates low at production scale.
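The control flow is a propose-then-refute filter. In the sketch below a simple rule (the path passes through a known sanitiser) stands in for the learned disproof head; `Candidate`, `SANITISERS`, `refuted`, and `review_queue` are hypothetical names, and the real head is a decoder, not a rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    finding: str
    path: tuple[str, ...]  # call path the main head cited


# Hypothetical sanitiser names used only by this toy refuter.
SANITISERS = {"escape_html", "parametrise_query"}


def refuted(candidate: Candidate) -> bool:
    """Toy disproof: the finding is refuted if the cited path is sanitised."""
    return any(step in SANITISERS for step in candidate.path)


def review_queue(candidates: list[Candidate]) -> list[Candidate]:
    """Only candidates that survive the disproof attempt are surfaced."""
    return [c for c in candidates if not refuted(c)]


candidates = [
    Candidate("XSS in profile page", ("read_form", "escape_html", "render")),
    Candidate("SQL injection in search", ("read_query", "build_sql", "execute")),
]
queue = review_queue(candidates)
for c in queue:
    print(c.finding)
```

The first candidate is dropped because its path is sanitised; the second survives and reaches the queue.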
Every Griffin variant runs Aegis attention.
Eagle and Lion use simpler dense architectures because their workloads don't need the long-context machinery.
Griffin Lite (8B)
Aegis attention at 32k context. Fast cloud-burst reasoning for single-finding workflows.
Griffin S / M (14B / 32B)
Aegis at 64k and 128k context. Mid-depth call-graph reasoning, PR-level review workloads.
Griffin L (70B)
Aegis at 128k context, dense. Default production tier for multi-hop, cross-package exploit hypotheses.
Griffin Zero (671B-MoE)
Aegis at 256k context with 8-expert MoE. Sovereign-grade reasoning, deepest audits, agentic disclosure workflows.
Numbers that justify the architecture choices.
See Aegis reasoning on your own code.
Book a session — we'll point Griffin at your repo and walk the structured reasoning trace with you.