Architecture · Aegis

Aegis. The reasoning architecture inside every Griffin model.

Aegis is the long-context reasoning architecture every Griffin variant runs on — sliding-window plus landmark attention, a security-augmented tokeniser, structured trace output, and mixture-of-experts routing in the largest tier. It is the reason the lineup is fast where it needs to be and deep where it has to be.

256k · Usable context (Zero)
~28k · Security tokens added
5.5% · Active params (Zero)
Top-2 · Expert routing
Architecture components

Four pieces that make Aegis what it is.

Sliding-window + landmark attention

Local attention over a windowed token range with sparse global landmarks. Lets Griffin Zero reason over a 256k-token call graph without quadratic blow-up. Latency stays bounded; the window slides where the reasoning needs it.
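The local-window-plus-landmark pattern can be illustrated as a boolean attention mask. This is a generic sketch of the technique, not Aegis's implementation; `seq_len`, `window`, and the landmark positions are hypothetical parameters.

```python
import numpy as np

def attention_mask(seq_len: int, window: int, landmarks: list[int]) -> np.ndarray:
    """True where query position i may attend to key position j."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    mask = np.abs(i - j) <= window   # local sliding-window band
    mask[:, landmarks] = True        # every token attends to the landmarks
    mask[landmarks, :] = True        # landmarks attend globally
    return mask

mask = attention_mask(seq_len=16, window=2, landmarks=[0, 8])
```

Each row permits on the order of `window + len(landmarks)` keys rather than `seq_len`, which is why cost grows roughly linearly with context instead of quadratically.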

Security-augmented tokeniser

Adds ~28,000 tokens covering CWE/CVE IDs, taint operators, package coordinates (purl format), and attack-pattern shorthand. Each is a single token instead of 4–8 BPE pieces, which keeps the security-relevant context dense.
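A toy version of the idea: match security identifiers first and emit each as a single token, letting everything else fall through to an ordinary subword pass (crudely simulated here by whitespace splitting). The patterns below are illustrative assumptions, not the Aegis vocabulary.

```python
import re

# Hypothetical patterns for CVE IDs, CWE IDs, and package coordinates (purl).
SECURITY_TOKENS = re.compile(
    r"(CVE-\d{4}-\d{4,7}|CWE-\d+|pkg:[a-z]+/[\w.\-]+@[\w.\-]+)"
)

def tokenise(text: str) -> list[str]:
    out = []
    for part in SECURITY_TOKENS.split(text):
        if SECURITY_TOKENS.fullmatch(part):
            out.append(part)              # one token per security identifier
        else:
            out.extend(part.split())      # stand-in for ordinary BPE pieces
    return out

toks = tokenise("Fixes CVE-2024-12345 in pkg:npm/lodash@4.17.21 (see CWE-79)")
```

Each identifier survives as one unit, so its semantic role is intact wherever it lands in a long context window.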

Mixture-of-experts (Zero only)

Eight experts, top-2 routing, ~5.5% of parameters activated per token. Griffin Zero reaches 671B parameters at the inference cost of a ~37B dense model. The router itself is trained on the security task distribution.
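Top-2 routing itself is simple to sketch. The snippet below is generic mixture-of-experts routing with renormalised softmax weights over the two selected experts, not Aegis's trained router; shapes are hypothetical.

```python
import numpy as np

def top2_route(logits: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (indices, weights) of the two highest-scoring experts per token."""
    idx = np.argsort(logits, axis=-1)[:, -2:]          # top-2 expert ids
    picked = np.take_along_axis(logits, idx, axis=-1)
    w = np.exp(picked - picked.max(-1, keepdims=True))
    w = w / w.sum(-1, keepdims=True)                   # renormalised softmax
    return idx, w

rng = np.random.default_rng(0)
idx, w = top2_route(rng.normal(size=(4, 8)))           # 4 tokens, 8 experts
```

With only 2 of 8 experts active per token plus the shared non-expert layers, the active parameter fraction sits far below the dense total, which is where a figure like ~5.5% of 671B (roughly a 37B-dense inference cost) comes from.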

Structured trace output

Every response is emitted as HYPOTHESIS / CITED PATH / DISPROOF / PROPOSED PATCH. The trace is the auditability story — every finding ships with the reasoning the model walked and the refutation it tried.
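Downstream consumers can treat the four fields as a parseable contract. The field names come from the trace format above; the colon-delimited wire format and parser are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Trace:
    hypothesis: str
    cited_path: str
    disproof: str
    proposed_patch: str

FIELDS = ["HYPOTHESIS", "CITED PATH", "DISPROOF", "PROPOSED PATCH"]

def parse_trace(text: str) -> Trace:
    values, current = {}, None
    for line in text.splitlines():
        head, _, rest = line.partition(":")
        if head.strip() in FIELDS:
            current = head.strip()
            values[current] = rest.strip()
        elif current:
            values[current] += " " + line.strip()   # continuation line
    return Trace(*(values[f] for f in FIELDS))

t = parse_trace(
    "HYPOTHESIS: tainted input reaches exec()\n"
    "CITED PATH: parse() -> build() -> exec()\n"
    "DISPROOF: checked for a sanitiser at build(); none present\n"
    "PROPOSED PATCH: escape argv before exec()"
)
```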

How Aegis differs

Four design choices a generic transformer doesn't make.

Tokeniser priors for security

Off-the-shelf tokenisers fragment CVE IDs, purls, and CWE numbers into character soup. Aegis treats them as single tokens, which preserves their semantic role inside long-context reasoning.

Long-context retrieval gates

Before attention runs, a retrieval gate pre-ranks call-graph chunks by relevance. The model only spends compute on the chunks that might matter. Time-to-first-token drops materially at 256k context.
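A minimal sketch of such a gate, assuming cosine similarity over precomputed chunk embeddings and a fixed top-k budget (both are assumptions, not Aegis internals):

```python
import numpy as np

def gate_chunks(query: np.ndarray, chunks: np.ndarray, top_k: int) -> np.ndarray:
    """Return indices of the top_k most query-relevant chunk embeddings."""
    q = query / np.linalg.norm(query)
    c = chunks / np.linalg.norm(chunks, axis=1, keepdims=True)
    scores = c @ q                              # cosine similarity per chunk
    return np.argsort(scores)[::-1][:top_k]     # best-first

rng = np.random.default_rng(1)
chunks = rng.normal(size=(1000, 64))            # 1000 call-graph chunk embeddings
query = chunks[42] + 0.01 * rng.normal(size=64) # query near chunk 42
keep = gate_chunks(query, chunks, top_k=8)
```

Attention then runs over 8 chunks instead of 1000; pruning before the expensive pass is where the time-to-first-token saving at long context comes from.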

Reasoning trace as a first-class output

Chain-of-thought in generic models is a side effect. In Aegis it is the output contract — structured, fielded, and consumable downstream by the disproof head and the auto-fix pipeline.

Adversarial disproof decoder head

A second decoder head tries to refute whatever the main head proposed. Only candidates that survive disproof reach your queue. This is how Griffin keeps false-positive rates low at production scale.
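The propose-then-refute gate reduces to a filter: a candidate reaches the queue only if the disproof pass returns no refutation. Both heads are stand-in callables here; in Aegis they are decoder heads inside the model, not an external API.

```python
def surviving_findings(candidates, try_disprove):
    """Keep candidates for which the disproof pass returns no refutation."""
    queue = []
    for finding in candidates:
        refutation = try_disprove(finding)   # None means disproof failed
        if refutation is None:
            queue.append(finding)
    return queue

# Toy disproof head: refute findings whose sink is actually sanitised.
def disprove(finding):
    sanitised = {"render()"}
    return "sink is sanitised" if finding["sink"] in sanitised else None

queue = surviving_findings(
    [{"sink": "exec()"}, {"sink": "render()"}], disprove
)
# → only the exec() finding survives
```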

Where Aegis appears

Every Griffin variant runs Aegis attention.

Eagle and Lino use simpler dense architectures because their workloads don't need the long-context machinery.

Variant 01

Griffin Lite (8B)

Aegis attention at 32k context. Fast cloud-burst reasoning for single-finding workflows.

Variant 02

Griffin S / M (14B / 32B)

Aegis at 64k and 128k context. Mid-depth call-graph reasoning, PR-level review workloads.

Variant 03

Griffin L (70B)

Aegis at 128k context, dense. Default production tier for multi-hop, cross-package exploit hypotheses.

Variant 04

Griffin Zero (671B-MoE)

Aegis at 256k context with 8-expert MoE. Sovereign-grade reasoning, deepest audits, agentic disclosure workflows.

Eval highlights

Numbers that justify the architecture choices.

256k · Usable context vs 32–128k baseline
98% · Adversarial prompt resistance
0.6% · Security-Q&A hallucination rate
+12% · Cross-package taint precision

See Aegis reasoning on your own code.

Book a session — we'll point Griffin at your repo and walk the structured reasoning trace with you.