The 32B dense Griffin variant. M is the smallest tier where 128k context and the full Aegis attention stack come together - sized for nightly repo-wide audits and coordinated multi-finding triage, with predictable dense activation and Sovereign-tier deployment support.
128k context holds the call-graph metadata for a multi-service repo in a single window. M is the smallest variant where 'reason over the whole repo' becomes a one-shot rather than a stitched-together chunked pass.
Comfortable with taint propagation that crosses several package boundaries. The disproof pass stays coherent at this depth; chains beyond seven hops favour Griffin L's deeper reasoning budget.
Correlates related findings across services in one reasoning pass rather than scoring each finding independently. Useful when several findings are symptoms of the same root cause and a coordinated patch is the right answer.
Carries the full Aegis attention stack at 128k context with dense activation. Predictable latency and memory footprint compared to the MoE-routed Zero variant - simpler to capacity-plan.
Honest limits. The router escalates up to L or Zero when depth, sovereign tier, or adversarial disproof demands it.
M supports Sovereign tier but not the full 256k context or 8-expert MoE depth. Air-gapped audits at supply-chain scale route to Griffin Zero with its retrieval-gated 256k window.
Zero outperforms M on the longest cross-package taint chains where the disproof pass needs more reasoning depth than 32B can carry. M will produce a hypothesis but the refutation step degrades past seven hops.
p95 of ~5.5s is too slow for inner-loop PR comments. PR-time and CLI deep-scan workloads route to Griffin S at 64k or Griffin Lite at 32k, where the latency budget fits the developer's flow.
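The escalation and downshift rules above can be sketched as a selection function. The variant names, the seven-hop limit, the 64k/128k context bounds, and the ~5.5s p95 all come from this page; the function itself, its signature, and the exact ordering of checks are illustrative assumptions, not the production router.

```python
# Illustrative sketch of the routing rules described above. Variant names and
# thresholds come from the spec; the function and check ordering are assumptions.

def pick_variant(context_tokens: int,
                 taint_chain_hops: int,
                 needs_air_gap: bool,
                 latency_budget_ms: int) -> str:
    """Pick a Griffin variant for a workload (hypothetical router logic)."""
    # Air-gapped, supply-chain-scale audits need Zero's retrieval-gated 256k window.
    if needs_air_gap or context_tokens > 128_000:
        return "griffin-zero"
    # Disproof chains beyond seven hops favour L's deeper reasoning budget.
    if taint_chain_hops > 7:
        return "griffin-l"
    # Any context over 64k pins M - that window is what S can't carry.
    if context_tokens > 64_000:
        return "griffin-m"
    # Inner-loop budgets tighter than M's ~5.5s p95 route to S or Lite.
    if latency_budget_ms < 5500:
        return "griffin-lite" if context_tokens <= 32_000 else "griffin-s"
    return "griffin-s"
```

Under these assumptions, a 100k-token repo audit lands on M, a nine-hop taint chain escalates to L, and a 2-second PR-comment budget drops to S or Lite depending on context size.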
| Spec | Griffin M |
| --- | --- |
| Parameters | 32B (dense) |
| Context window | 128k tokens |
| p95 latency | ~5.5s end-to-end |
| Active params per token | 32B (dense, no MoE) |
| Quantisation | FP16 default, INT8 available |
| Deployment tier(s) | Dedicated cluster, VPC-isolated, Sovereign |
| Minimum GPU | 2x A100 80GB |
| Recommended GPU | 1-2x H100 80GB |
| Memory footprint | ~64 GB at FP16 |
| Inference cost relative tier | Mid tier |
| Eval - exploit-hypothesis accuracy | 77% |
| Eval - adversarial prompt resistance | 97% |
| Eval - security-Q&A hallucination rate | 0.9% |
Scheduled audit pass across the entire repository in one context. M reads the full call graph, hypothesises across services, runs the disproof pass against the project's sanitiser config, and emits a ranked finding list with cited traces.
Folds related findings across services into a single reasoning trace, surfacing the root cause rather than the symptoms. Tunes the queue toward fixes that retire multiple findings together.
Drafts a coordinated patch set spanning the services that share the same vulnerable dependency or sanitiser-bypass pattern. The 128k window holds the diff plus the surrounding call graph in one context.
Targeted deep-scan of an unfamiliar codebase as part of due diligence. M reasons over the whole repository's transitive taint surface and emits a structured report with cited hops for each high-priority finding.
A repo-wide reasoning call with 128k context and correlated finding output.
```bash
curl -X POST https://api.safeguard.sh/v1/reason \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "variant": "griffin-m",
    "input": {
      "repo_id": "repo_2a71eb39",
      "mode": "repo-audit",
      "include_trace": true,
      "include_patch": true,
      "correlate_findings": true
    },
    "constraints": {
      "max_latency_ms": 7000,
      "context_budget_tokens": 128000
    }
  }'
```

M handles the upper-mid Eagle triage band: deeper call graphs, more cross-package edges, and findings that benefit from correlated reasoning. The router also pins M for any workload whose context exceeds 64k - that threshold is what S can't carry, regardless of the triage score.
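The same repo-audit call can be issued from Python. This is a minimal stdlib sketch mirroring the curl example above: the endpoint, variant name, and payload fields come from that example, while the helper names and defaults are illustrative. The response schema is not specified on this page, so the body is returned raw rather than parsed.

```python
# Sketch of the griffin-m repo-audit call from Python. Endpoint and payload
# fields mirror the curl example; helper names and defaults are assumptions.
import json
import os
import urllib.request

API_URL = "https://api.safeguard.sh/v1/reason"

def build_repo_audit_request(repo_id: str,
                             max_latency_ms: int = 7000,
                             context_budget_tokens: int = 128_000) -> dict:
    """Assemble the repo-audit payload shown in the curl example."""
    return {
        "variant": "griffin-m",
        "input": {
            "repo_id": repo_id,
            "mode": "repo-audit",
            "include_trace": True,
            "include_patch": True,
            "correlate_findings": True,
        },
        "constraints": {
            "max_latency_ms": max_latency_ms,
            "context_budget_tokens": context_budget_tokens,
        },
    }

def run_audit(repo_id: str) -> str:
    """POST the audit request and return the raw response body."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_repo_audit_request(repo_id)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TOKEN']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```

The `max_latency_ms` default of 7000 leaves headroom over M's ~5.5s p95, as in the curl example.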
128k context, dense activation, predictable capacity - the variant the router picks for repo-wide reasoning.