Variant · Griffin Zero

Griffin Zero. Sovereign-grade reasoning.

671B parameters via 8-expert mixture-of-experts, ~37B active per token, 256k usable context. The deepest reasoning the lineup ships — sized for sovereign and air-gapped deployments, cross-package chains beyond twelve hops, and the agentic disclosure workflows that the other variants cannot sustain.

Parameters	671B-MoE
Context window	256k
p95 latency	~12s
Deployment tier	Sovereign / Air-gapped

Best at

What only Zero can carry.

Sovereign and air-gapped workloads

The deepest reasoning Safeguard ships, running inside the customer's own perimeter with no egress. Built for the longest reasoning budgets — quarterly portfolio audits, regulator-grade findings, classified-environment review.

Cross-package taint chains beyond 12 hops

Where Griffin L tops out, Zero keeps reasoning. 256k usable context plus the MoE expert routing carry chains through five or more package boundaries without dropping the cited path.

Coordinated disclosure pipelines

Agentic workflow: propose the upstream patch, run it through the maintainer's project test suite, and draft the disclosure thread. Zero is the only variant with the reasoning budget to sustain the full chain autonomously.

Deepest agentic reasoning

Multi-step autonomous reasoning over tools, retrieval, and the disproof head — workflows that need to plan, execute, observe, and replan five or six times. Other variants lose the thread; Zero does not.
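
The plan, execute, observe, replan cycle described above can be pictured as a bounded loop. The sketch below is only an illustration of that control flow, not Zero's internals: `run_replan_loop`, `LoopResult`, and the three callables are hypothetical names, and the default cap of six rounds simply mirrors the "five or six times" in the text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    done: bool                      # True if the goal was reached within budget
    rounds: int                     # number of plan/execute rounds actually run
    observations: List[str] = field(default_factory=list)


def run_replan_loop(
    plan: Callable[[List[str]], str],
    execute: Callable[[str], str],
    is_done: Callable[[str], bool],
    max_rounds: int = 6,
) -> LoopResult:
    """Plan, execute, observe, and replan until done or the budget runs out."""
    observations: List[str] = []
    for round_no in range(1, max_rounds + 1):
        step = plan(observations)         # plan from everything observed so far
        observation = execute(step)       # execute the planned step (hypothetical tool call)
        observations.append(observation)  # observe and remember the outcome
        if is_done(observation):
            return LoopResult(True, round_no, observations)
    return LoopResult(False, max_rounds, observations)


# Toy usage: the stand-in "tool" only succeeds on the third attempt.
result = run_replan_loop(
    plan=lambda seen: f"attempt-{len(seen) + 1}",
    execute=lambda step: "confirmed" if step == "attempt-3" else "inconclusive",
    is_done=lambda obs: obs == "confirmed",
)
```

Each new plan is built from the accumulated observations, which is the part a smaller reasoning budget would be expected to lose over many rounds.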

Not for

Where Zero is overkill.

Zero is opt-in by deployment tier. The auto-router will not silently send shared-cloud traffic to Zero — too expensive, too slow for what most workloads need.

PR-time work

A ~12s p95 and Zero's cost profile are both wrong for inline PR review. Use Griffin Lite for IDE-side cloud-burst reasoning, and Griffin S for PR-level checks where the budget is seconds, not a session.

Latency-sensitive CI gates

If a deployment gate must finish in under ten seconds, Zero is the wrong shape. Route the gate to Griffin M or L — they will carry the typical taint chain at a fraction of the wall-clock cost.

Cost-sensitive workloads

Zero is the highest-cost tier in the lineup, so it is never the cheapest variant that could carry a chain. The auto-router avoids it by default on the shared-cloud tier; Zero is opt-in by deployment tier.

Specs detail

Everything you need to size a sovereign cluster.

Parameters	671B total (sparse mixture-of-experts)
Context window	256k usable (retrieval gates over windowed attention)
p95 latency	~12s end-to-end
Active params per token	~37B (~5.5% of total)
Quantisation	FP16 default · FP8 for sovereign deployments
Deployment tier(s)	Sovereign · Air-gapped · Dedicated cluster (large)
Minimum GPU	8x H100 80GB
Recommended GPU	11x H100 multi-AZ (Growth) up to 22x H100 multi-AZ (Mature)
Memory footprint	~340 GB at FP8 · ~680 GB at FP16
Inference cost relative tier	Highest
Eval — exploit-hypothesis accuracy	88%
Eval — adversarial prompt resistance	99%
Eval — security-Q&A hallucination rate	0.3%
Eval — top-5 candidate-path retention vs known CVE ground truth	97% (+12 precision points over Griffin L on cross-package taint paths)
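
As a rough sizing aid, the arithmetic behind a few of the rows above can be reproduced in a few lines. This sketch uses only figures from the table. It assumes "256k" means 256 × 1024 tokens, and it compares the listed memory footprint against raw aggregate GPU memory with no headroom for KV cache or runtime overhead, which the table does not quantify. The constant and function names are illustrative.

```python
# Figures taken from the spec table above.
TOTAL_PARAMS_B = 671        # total parameters, in billions
ACTIVE_PARAMS_B = 37        # approximate active parameters per token, in billions
CONTEXT_K = 256             # context window, in "k" tokens
MIN_GPUS = 8                # minimum GPU count (H100)
GPU_MEMORY_GB = 80          # memory per H100 80GB
FOOTPRINT_GB = {"fp8": 340, "fp16": 680}  # listed memory footprint per precision

# Share of the model that is active for any one token.
active_pct = round(100 * ACTIVE_PARAMS_B / TOTAL_PARAMS_B, 1)

# Assumption: "256k" is 256 * 1024 tokens.
context_tokens = CONTEXT_K * 1024

# Aggregate memory of the minimum cluster, ignoring any reserved headroom.
min_cluster_memory_gb = MIN_GPUS * GPU_MEMORY_GB


def fits_minimum_cluster(precision: str) -> bool:
    """True if the listed footprint fits in the minimum cluster's raw GPU memory."""
    return FOOTPRINT_GB[precision] <= min_cluster_memory_gb
```

On these figures the FP8 footprint fits within the 8x H100 minimum, while the FP16 footprint exceeds its 640 GB aggregate, so an FP16 deployment would need one of the larger recommended configurations.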

Typical workflows

Where Zero spends its reasoning budget.

Sovereign deployments

Run inside the customer's own region, account, and VPC, with no traffic leaving the perimeter. Zero is the reasoning tier sized for the regulatory and operational constraints of sovereign infrastructure.

Air-gapped audits

Bring-your-own-cluster, no network egress, no outbound telemetry. Zero ships as a container image plus weights; the eval gate and disproof head run entirely on customer hardware.

Coordinated upstream disclosure

End-to-end agentic flow: confirm the chain, draft the upstream patch, run the maintainer's test suite locally, and prepare the disclosure thread for a human to approve. Zero owns every step up to that approval.

Heaviest distillation experiments

Teacher model for the heaviest Lino distillation experiments. The Lino-class students that ship to IDEs inherit their reasoning priors from Zero's traces under sustained agentic load.

API example

Zero is opt-in by tier.

Sovereign and air-gapped tiers can request Zero directly. The agentic block opts the call into upstream patch drafting, project-test execution, and disclosure preparation.

griffin-zero · POST /v1/reason

$ curl https://api.safeguard.sh/v1/reason \
    -H "Authorization: Bearer $SAFEGUARD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "griffin-zero",
      "mode": "sovereign-audit",
      "project": "proj_a91f...",
      "deployment_tier": "sovereign",
      "candidate": {
        "cwe": "CWE-918",
        "entry": "POST /api/webhook/dispatch",
        "sink": "HttpClient.send(URI)"
      },
      "context_budget": 262144,
      "emit_trace": true,
      "run_disproof": true,
      "agentic": {
        "draft_upstream_patch": true,
        "run_project_tests": true,
        "draft_disclosure": true
      }
    }'

{
  "model": "griffin-zero",
  "finding_id": "find_8801",
  "verdict": "exploitable",
  "trace": {
    "hypothesis": "CWE-918 via webhook URL allow-list bypass",
    "cited_path_hops": 14,
    "package_boundaries": 5,
    "disproof_result": "refutation_failed",
    "upstream_patch_applied": true,
    "project_tests_passed": "412/412",
    "disclosure_draft_ready": true
  },
  "experts_activated": ["taint", "http", "sanitiser-ambiguity"],
  "latency_ms": 11904,
  "router_score": 0.94
}
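
The same call can be sketched from Python using only the standard library. The endpoint, header names, and field names are copied from the example above. The function names are illustrative, the project ID is left to the caller because the sample value is elided, and the reading of the response is an assumption: `ready_for_human_review` treats "refutation_failed" as meaning the finding survived the disproof attempt, and "412/412" as passed/total.

```python
import json
import urllib.request

API_URL = "https://api.safeguard.sh/v1/reason"  # endpoint from the curl example


def build_request(project: str, candidate: dict, api_key: str) -> urllib.request.Request:
    """Build the POST /v1/reason request shown in the curl example."""
    payload = {
        "model": "griffin-zero",
        "mode": "sovereign-audit",
        "project": project,
        "deployment_tier": "sovereign",
        "candidate": candidate,
        "context_budget": 262144,
        "emit_trace": True,
        "run_disproof": True,
        "agentic": {
            "draft_upstream_patch": True,
            "run_project_tests": True,
            "draft_disclosure": True,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)


def ready_for_human_review(response: dict) -> bool:
    """Assumed gate: exploitable, disproof did not refute it, all tests passed, draft ready."""
    trace = response.get("trace", {})
    passed, _, total = str(trace.get("project_tests_passed", "")).partition("/")
    return (
        response.get("verdict") == "exploitable"
        and trace.get("disproof_result") == "refutation_failed"
        and passed.isdigit()
        and passed == total
        and trace.get("disclosure_draft_ready") is True
    )
```

A caller would typically read the key from the `SAFEGUARD_API_KEY` environment variable, pass the result of `build_request` to `send`, and hand the finding to a reviewer only when `ready_for_human_review` returns true.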

Routing

When the auto-router escalates to Zero.

Triage score 0.9 – 1.0, OR sovereign tier, OR explicit request

Zero is opt-in by deployment tier. The default auto-router does not escalate to Zero on the shared-cloud tier — the cost and latency are wrong for typical production traffic. Zero is selected only when one of three conditions holds.

Triage score above 0.9 — the candidate's call-graph complexity exceeds what Griffin L can carry confidently.
Deployment tier is sovereign or air-gapped — Zero is the variant licensed and sized for those perimeters.
Caller explicitly requests Zero — pass model: "griffin-zero" to force the variant.
Shared-cloud traffic without an explicit request stays on Griffin L, whatever its triage score. No silent escalation, no surprise bill.
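
The three conditions and the shared-cloud rule can be read as a small decision function. This is a sketch of the rules as stated here, not the router's actual implementation. It assumes the shared-cloud rule takes precedence over the triage score, so the score only matters on other tiers. The tier strings "air-gapped", "shared-cloud", and "dedicated" are illustrative; only "sovereign" appears in the API example.

```python
from typing import Optional

ZERO_TIERS = {"sovereign", "air-gapped"}  # tiers that select Zero on their own
TRIAGE_THRESHOLD = 0.9                    # escalation threshold from the text


def escalates_to_zero(
    triage_score: float,
    deployment_tier: str,
    requested_model: Optional[str] = None,
) -> bool:
    """Return True if a request would be routed to Griffin Zero."""
    # An explicit request always forces the variant.
    if requested_model == "griffin-zero":
        return True
    # Sovereign and air-gapped perimeters select Zero by tier.
    if deployment_tier in ZERO_TIERS:
        return True
    # Assumption: shared-cloud never escalates silently, whatever the score.
    if deployment_tier == "shared-cloud":
        return False
    # Any other tier (for example a dedicated cluster) escalates on score.
    return triage_score > TRIAGE_THRESHOLD
```

Under this reading, `escalates_to_zero(0.95, "shared-cloud")` is false, while the same score on a hypothetical "dedicated" tier is true.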

Bring Zero inside your perimeter.

Sovereign-grade reasoning on hardware you control. Book a session to size the cluster and walk the agentic disclosure pipeline end-to-end.