Variant · Griffin L

Griffin L. The default production tier.

70 billion dense parameters, 128k context, ~8s p95. The variant the auto-router lands on for most production reasoning — multi-hop cross-package exploit hypothesis, adversarial disproof, auto-fix PR with the full trace attached. Mid-high cost, deep enough for the chains that actually ship.

70B · Parameters
128k · Context window
~8s · p95 latency
Dedicated GPU · Deployment tier
Best at

What L was built to carry.

Multi-hop cross-package exploit hypothesis

Reasons across up to ~12 call-graph hops and three or more package boundaries in a single pass. Cites the path, names the sanitiser that would have caught it, and proposes the patch.
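The hop-and-boundary accounting described above can be illustrated with a minimal sketch. The call graph, package tags, and `trace_path` helper below are hypothetical stand-ins, not Griffin internals; they only show what "hops" and "package boundaries" mean on a path from entry to sink.

```python
from collections import deque

# Hypothetical call graph: caller -> callees, each node tagged with its package.
CALL_GRAPH = {
    "api.import_config": ["parser.read_body"],
    "parser.read_body": ["mapper.deserialize"],
    "mapper.deserialize": [],
}
PACKAGE = {
    "api.import_config": "api",
    "parser.read_body": "parser",
    "mapper.deserialize": "jackson-databind",
}

def trace_path(entry: str, sink: str):
    """BFS from entry to sink; return (hops, package boundaries crossed, path)."""
    queue = deque([[entry]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            hops = len(path) - 1
            boundaries = sum(1 for a, b in zip(path, path[1:])
                             if PACKAGE[a] != PACKAGE[b])
            return hops, boundaries, path
        for callee in CALL_GRAPH.get(node, []):
            queue.append(path + [callee])
    return None  # no path: nothing to hypothesise about

hops, boundaries, _ = trace_path("api.import_config", "mapper.deserialize")
print(hops, boundaries)  # 2 2 -- two hops, two package boundaries
```

A real pass also carries taint and sanitiser facts along each edge; this sketch only counts the structure that the "~12 hops, three or more package boundaries" claim refers to.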

Adversarial disproof on real chains

Runs the disproof pass against the project's actual sanitiser config, allow-lists, and seccomp profile. Refutes its own hypothesis where the runtime would have stopped the chain, so the queue stays clean.

Auto-fix PR with deep reasoning trace

Drafts the patch, regenerates the unit test, and attaches the full HYPOTHESIS / CITED PATH / DISPROOF / PROPOSED PATCH trace to the PR. Reviewers approve the chain, not just the diff.

Default tier the auto-router lands on

When nothing cheaper can carry the chain and nothing larger is required by tier policy, Griffin L is what the router chooses. Most production reasoning work lands here.

Not for

Where L is the wrong tool.

Honest list. The auto-router avoids these by design; if you call L directly here, expect to pay for compute you didn't need.

Sub-second IDE work

An 8s p95 is wrong for a save-keystroke loop. Use Griffin Lite for IDE-side cloud-burst reasoning where the latency budget is measured in hundreds of milliseconds.

Sovereign or air-gapped work with the deepest reasoning budgets

Griffin L runs well on dedicated GPU, but the deepest sovereign workloads (12+ hop chains, agentic coordinated disclosure) should target Griffin Zero on a sovereign cluster.

Contexts beyond 128k tokens

L tops out at 128k usable context. For repositories where the relevant call graph cannot be cropped under 128k, route to Griffin Zero with 256k usable via retrieval gates.

Specs detail

Everything you need to size the deployment.

Parameters: 70B (dense)
Context window: 128k tokens
p95 latency: ~8s end-to-end
Active params per token: 70B (dense; all weights active)
Quantisation: FP16 default · INT8 available · FP8 in beta
Deployment tier(s): Dedicated cluster · VPC-isolated · Sovereign
Minimum GPU: 2x H100 80GB
Recommended GPU: 4x H100 80GB or 8x A100 80GB
Memory footprint: ~140 GB at FP16 · ~80 GB at INT8
Inference cost (relative tier): Mid-high
Eval · exploit-hypothesis accuracy: 81%
Eval · adversarial prompt resistance: 98%
Eval · security-Q&A hallucination rate: 0.6%
Eval · top-5 candidate-path retention vs known CVE ground truth: 94%
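The memory-footprint row follows directly from parameter count times bytes per weight. A back-of-envelope check, using the table's own figures (the gap between 70 GB raw INT8 weights and the quoted ~80 GB is runtime overhead such as activations and KV cache, which this sketch does not model):

```python
PARAMS = 70e9  # dense parameter count from the spec table

def weights_gb(bytes_per_param: float) -> float:
    """Raw weight memory in GB (1 GB = 1e9 bytes), before activations/KV cache."""
    return PARAMS * bytes_per_param / 1e9

print(f"FP16: {weights_gb(2):.0f} GB")  # FP16: 140 GB -- matches the ~140 GB row
print(f"INT8: {weights_gb(1):.0f} GB")  # INT8: 70 GB raw; the ~80 GB row adds overhead
```

The same arithmetic explains the GPU rows: ~140 GB of FP16 weights does not fit on a single 80 GB H100, hence the 2x H100 minimum.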
Typical workflows

Where L spends its compute.

Production auto-fix campaigns

Sweep a service or repo for a target CWE class. L reasons each candidate, runs the disproof pass, and emits one PR per surviving finding with the full trace attached.

Multi-service correlation

Trace a taint chain that crosses HTTP boundaries between two or three services. L holds the call graph across the boundary in context and cites which service contributed which hop.

Coordinated disclosure draft

Given a confirmed finding in a third-party dependency, L drafts the upstream issue, the suggested patch, and the disclosure thread before the human reviewer touches it.

Teacher for Lino distillation

L is the teacher model for the on-device distillation pipeline. Its traces are the supervision signal that lets a small Lino student reach Lite-class accuracy at IDE latency.

API example

Call L directly when you know you want it.

The auto-router will pick L for you on most production traffic. When you want to force the variant — for example during a campaign — pass it explicitly.

griffin-l · POST /v1/reason
$ curl https://api.safeguard.sh/v1/reason \
    -H "Authorization: Bearer $SAFEGUARD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "griffin-l",
      "mode": "multi-hop-reasoning",
      "project": "proj_a91f...",
      "candidate": {
        "cwe": "CWE-502",
        "entry": "POST /api/import-config",
        "sink": "ObjectMapper.readValue"
      },
      "context_budget": 128000,
      "emit_trace": true,
      "run_disproof": true
    }'

{
  "model": "griffin-l",
  "finding_id": "find_4129",
  "verdict": "exploitable",
  "trace": {
    "hypothesis": "CWE-502 via jackson-databind polymorphic typing",
    "cited_path_hops": 6,
    "disproof_result": "refutation_failed",
    "proposed_patch": "constrained-reader + advisory bump"
  },
  "latency_ms": 7842,
  "router_score": 0.83
}
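The same call from Python, for scripting a campaign. This is a sketch built from the curl example above, not a published client library: the endpoint and field names are taken from that example, and `build_reason_request` is an illustrative helper. The network send is left commented so the snippet runs offline.

```python
import json
import os
import urllib.request

# Same request as the curl example; field names come from that example only.
def build_reason_request(project: str, cwe: str, entry: str, sink: str) -> dict:
    return {
        "model": "griffin-l",          # force the variant instead of auto-routing
        "mode": "multi-hop-reasoning",
        "project": project,
        "candidate": {"cwe": cwe, "entry": entry, "sink": sink},
        "context_budget": 128_000,
        "emit_trace": True,            # attach the full reasoning trace to the finding
        "run_disproof": True,          # refute the hypothesis before emitting a finding
    }

payload = build_reason_request("proj_a91f...", "CWE-502",
                               "POST /api/import-config", "ObjectMapper.readValue")

req = urllib.request.Request(
    "https://api.safeguard.sh/v1/reason",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {os.environ.get('SAFEGUARD_API_KEY', '')}",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
print(payload["model"], payload["candidate"]["cwe"])  # griffin-l CWE-502
```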
Routing

When the auto-router picks L.

Triage score 0.75 – 0.9

Most reasoning work in production lands in this band. Eagle scores the candidate on call-graph depth, sanitiser ambiguity, cross-package edges and sink severity. Scores in the 0.75 – 0.9 range route to Griffin L by default: the chain is deep enough that S/M would under-reason, but not so deep that Zero is required.

  • Below 0.75: routed to Griffin S or M depending on context size.
  • 0.75 – 0.9: routed to Griffin L. Default landing zone for production reasoning.
  • Above 0.9 or sovereign-tier deployments: escalated to Griffin Zero.
  • Callers can override the router by passing model: "griffin-l" explicitly.

Put L on your next campaign.

Pick a CWE class, point Griffin L at the service, and review the surviving PRs with their reasoning traces attached.