Variant - Griffin M

Griffin M. Repo-wide reasoning over transitive taint chains.

The 32B dense Griffin variant. M is the smallest tier where 128k context and the full Aegis attention stack come together - sized for nightly repo-wide audits and coordinated multi-finding triage, with predictable dense activation and Sovereign-tier deployment support.

32B
Parameters
128k
Context window
~5.5s
p95 latency
Dedicated / VPC / Sovereign
Deployment tier
Best at

What M earns its slot on.

Repo-wide reasoning across thousands of files

128k context holds the call-graph metadata for a multi-service repo in a single window. M is the smallest variant where 'reason over the whole repo' becomes a one-shot rather than a stitched-together chunked pass.
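The one-shot-vs-chunked distinction reduces to a token-budget check. A minimal client-side sketch, assuming the 128k window from the spec sheet (the helper name and token estimate are illustrative, not part of any Griffin API):

```python
# Hypothetical helper: decide whether a repo audit fits Griffin M's
# 128k window in one pass or has to be stitched across chunks.
GRIFFIN_M_CONTEXT = 128_000  # tokens, per the spec sheet

def plan_audit(repo_token_estimate: int, context_window: int = GRIFFIN_M_CONTEXT) -> dict:
    """Return a one-shot plan if the repo fits the window, else a chunked plan."""
    if repo_token_estimate <= context_window:
        return {"mode": "one-shot", "passes": 1}
    # Ceiling division: number of window-sized chunks the repo needs.
    passes = -(-repo_token_estimate // context_window)
    return {"mode": "chunked", "passes": passes}

print(plan_audit(90_000))   # {'mode': 'one-shot', 'passes': 1}
print(plan_audit(300_000))  # {'mode': 'chunked', 'passes': 3}
```

Anything over the window falls back to the stitched multi-pass behaviour M is sized to avoid.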

Transitive taint chains up to ~7 hops

Comfortable with taint propagation that crosses several package boundaries. The disproof pass stays coherent at this depth; chains beyond seven hops favour Griffin L's deeper reasoning budget.
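The seven-hop boundary can be sketched as a simple check over a source-to-sink chain. This is an illustrative sketch only - the chain representation and function names are assumptions, not Griffin internals:

```python
# Hypothetical sketch: a taint chain as a list of call-site identifiers
# from source to sink. Hops are the edges between them; chains beyond
# seven hops favour Griffin L's deeper reasoning budget.
MAX_M_HOPS = 7

def should_escalate_to_l(chain: list[str]) -> bool:
    """True when the chain's hop count exceeds Griffin M's comfort zone."""
    hops = max(len(chain) - 1, 0)
    return hops > MAX_M_HOPS

# A 6-hop chain (7 call sites) stays on M; a 9-hop chain escalates.
print(should_escalate_to_l([f"fn{i}" for i in range(7)]))   # False
print(should_escalate_to_l([f"fn{i}" for i in range(10)]))  # True
```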

Multi-finding correlation in a single pass

Correlates related findings across services in a single reasoning pass instead of treating each one independently. Useful when several findings are symptoms of the same root cause and a coordinated patch is the right answer.

Mid-depth Aegis attention (128k, no MoE)

Carries the full Aegis attention stack at 128k context with dense activation. Predictable latency and memory footprint compared to the MoE-routed Zero variant - simpler to capacity-plan.
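Dense activation makes the capacity maths straightforward: weight memory is just parameter count times bytes per parameter. A back-of-envelope sketch (weights only - KV cache and activations come on top):

```python
# Back-of-envelope capacity planning for a dense model: with no MoE
# routing, every token activates all 32B parameters, so weight memory
# is simply params x bytes per parameter.
def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    return params * bytes_per_param / 1e9

print(weight_memory_gb(32e9, 2))  # FP16: 64.0 GB, matching the spec sheet
print(weight_memory_gb(32e9, 1))  # INT8: 32.0 GB
```

The FP16 figure lines up with the ~64 GB footprint in the spec sheet, and explains the 2x A100 80GB minimum once cache and activation headroom are added.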

Not for

Where M should hand off.

Honest limits. The router escalates up to L or Zero when depth, sovereign tier, or adversarial disproof demands it.

Deepest sovereign workloads

M supports the Sovereign tier but not the full 256k context or the 8-expert MoE depth. Air-gapped audits at supply-chain scale route to Griffin Zero with its retrieval-gated 256k window.

Adversarial disproof on the most complex cross-package chains

Zero outperforms M on the longest cross-package taint chains, where the disproof pass needs more reasoning depth than 32B can carry. M will still produce a hypothesis, but the refutation step degrades past seven hops.

Latency-sensitive PR-time work

p95 of ~5.5s is too slow for inner-loop PR comments. PR-time and CLI deep-scan workloads route to Griffin S at 64k or Griffin Lite at 32k, where the latency budget fits the developer's flow.

Specs detail

The full spec sheet.

Parameters: 32B (dense)
Context window: 128k tokens
p95 latency: ~5.5s end-to-end
Active params per token: 32B (dense, no MoE)
Quantisation: FP16 default, INT8 available
Deployment tier(s): Dedicated cluster, VPC-isolated, Sovereign
Minimum GPU: 2x A100 80GB
Recommended GPU: 1-2x H100 80GB
Memory footprint: ~64 GB at FP16
Relative inference cost: Mid tier
Eval - exploit-hypothesis accuracy: 77%
Eval - adversarial prompt resistance: 97%
Eval - security-Q&A hallucination rate: 0.9%
Typical workflows

Where M earns its compute.

Nightly repo-wide audit

Scheduled audit pass across the entire repository in one context. M reads the full call graph, hypothesises across services, runs the disproof pass against the project's sanitiser config, and emits a ranked finding list with cited traces.

Coordinated triage of correlated findings

Folds related findings across services into a single reasoning trace, surfacing the root cause rather than the symptoms. Tunes the queue toward fixes that retire multiple findings together.

Auto-fix campaign across services

Drafts a coordinated patch set spanning the services that share the same vulnerable dependency or sanitiser-bypass pattern. The 128k window holds the diff plus the surrounding call graph in one context.

M&A-style diligence scan

Targeted deep-scan of an unfamiliar codebase as part of due diligence. M reasons over the whole repository's transitive taint surface and emits a structured report with cited hops for each high-priority finding.

API example

Run a repo-audit pass against Griffin M.

A repo-wide reasoning call with 128k context and correlated finding output.

POST /v1/reason - griffin-m
curl -X POST https://api.safeguard.sh/v1/reason \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "variant": "griffin-m",
    "input": {
      "repo_id": "repo_2a71eb39",
      "mode": "repo-audit",
      "include_trace": true,
      "include_patch": true,
      "correlate_findings": true
    },
    "constraints": {
      "max_latency_ms": 7000,
      "context_budget_tokens": 128000
    }
  }'
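The same request can be built and sanity-checked programmatically before sending. A hedged client-side sketch - the request shape is taken from the curl example above, but the helper is illustrative, not an official SDK:

```python
# Hypothetical helper: build the griffin-m repo-audit payload from the
# curl example and validate the constraints against the variant's limits.
import json

def build_repo_audit_payload(repo_id: str, max_latency_ms: int = 7000,
                             context_budget_tokens: int = 128_000) -> str:
    if context_budget_tokens > 128_000:
        raise ValueError("griffin-m caps out at a 128k context window")
    payload = {
        "variant": "griffin-m",
        "input": {
            "repo_id": repo_id,
            "mode": "repo-audit",
            "include_trace": True,
            "include_patch": True,
            "correlate_findings": True,
        },
        "constraints": {
            "max_latency_ms": max_latency_ms,
            "context_budget_tokens": context_budget_tokens,
        },
    }
    return json.dumps(payload)

body = build_repo_audit_payload("repo_2a71eb39")
print(json.loads(body)["variant"])  # griffin-m
```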
Routing

When the router picks M.

Triage score 0.6 - 0.75, or context > 64k

M handles the upper-mid Eagle triage band: deeper call graphs, more cross-package edges, and findings that benefit from correlated reasoning. The router also pins M for any workload whose context exceeds 64k - past that threshold S can't hold the window, regardless of the score.

Repo-wide audit or correlated findings -> M.
Score < 0.6 and context ≤ 64k -> downshift to S.
Score > 0.75 or chains beyond 7 hops -> upshift to L.
Sovereign deepest-tier audit -> Zero on sovereign cluster.
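The rules above can be sketched as a single decision function. Thresholds come from this page; the function signature and the `hops > 7` boundary (matching "beyond seven hops" elsewhere on the page) are assumptions, not the router's actual implementation:

```python
# Hypothetical sketch of the routing rules listed above, checked in
# priority order: sovereign deepest tier, then upshift, then downshift.
def route(score: float, context_tokens: int,
          hops: int = 0, sovereign_deepest: bool = False) -> str:
    if sovereign_deepest:
        return "griffin-zero"          # deepest sovereign audits
    if score > 0.75 or hops > 7:
        return "griffin-l"             # deeper reasoning budget needed
    if score < 0.6 and context_tokens <= 64_000:
        return "griffin-s"             # downshift when S can carry it
    return "griffin-m"                 # upper-mid band, or context-pinned

print(route(0.70, 100_000))           # griffin-m
print(route(0.50, 100_000))           # griffin-m (context pins M past 64k)
print(route(0.50, 32_000))            # griffin-s
print(route(0.65, 50_000, hops=9))    # griffin-l
```

Note the downshift to S requires both a low score and a context that fits 64k - a low-scoring workload with a large context still lands on M.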

Point M at your nightly audit.

128k context, dense activation, predictable capacity - the variant the router picks for repo-wide reasoning.