Variant - Griffin Lite

Griffin Lite. Cloud-burst reasoning at IDE / CLI speed.

The 8B dense entry point of the Griffin lineup. Lite carries the full Aegis attention stack and the same structured trace contract as its larger siblings, sized to fit the PR-time and CI deep-scan budgets where every second of latency is visible to a developer.

8B
Parameters
32k
Context window
~1.2s
p95 latency
Shared / Dedicated
Deployment tier
Best at

What Lite earns its slot on.

PR-time reasoning on a single finding

Sub-1.5s round-trips fit inside the developer's pull-request inner loop. Lite carries enough capacity to explain one taint path and propose a sanitiser-aware fix without breaking flow.

CLI deep-scan in CI

Designed for the CI step that fans out across thousands of candidate findings. Cost per finding is the lowest of the Griffin family, with a tight latency budget that keeps PR checks under typical job timeouts.

Fast remediation suggestions

Generates a one- or two-hunk patch with a cited sanitiser, plus a short rationale. Good for the common case where the bug is local and the fix is mechanical — bumps, replacements, allow-list tightenings.

Reachability rationale on a single taint path

Lite is the cheapest variant that still emits a structured trace. For single-path reachability questions it produces hypothesis, cited hops, disproof attempt, and patch in one pass without escalating.
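The four trace parts named above can be pictured as a small structured object. The field names below are illustrative assumptions, not the documented trace schema:

```python
# Sketch of the four-part structured trace Lite emits for a single-path
# reachability question. Field names are assumptions for illustration,
# not the documented schema.
trace = {
    "hypothesis": "User-controlled query reaches a SQL sink unsanitised",
    "hops": [  # cited hops along the taint path
        {"file": "handlers/search.py", "symbol": "parse_query"},
        {"file": "db/store.py", "symbol": "run_raw"},
    ],
    "disproof_attempt": {
        "checked": "project sanitiser config",
        "refuted": False,  # the refutation failed, so the finding stands
    },
    "patch": "--- a/db/store.py\n+++ b/db/store.py\n...",
}

def is_complete(t: dict) -> bool:
    """A trace is complete when all four parts are present."""
    return all(k in t for k in ("hypothesis", "hops", "disproof_attempt", "patch"))
```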

Not for

Where Lite should hand off.

Honest limits. The router escalates to heavier variants when the path needs more reasoning depth than 8B can carry.

Repo-wide multi-hop reasoning

32k context tops out around a few hundred files of call-graph metadata. For audits that need to fold in the entire transitive closure, escalate to Griffin M or L.

Adversarial disproof on cross-package chains

Long disproof chains across package boundaries need the deeper reasoning budget of Griffin L or Zero. Lite will produce a hypothesis, but the disproof pass is unreliable past three hops.

Sovereign or classified workloads

Lite is sized for shared and dedicated cloud tiers. Air-gapped and sovereign deployments with the heaviest reasoning needs route to Griffin Zero on a sovereign cluster.

Specs detail

The full spec sheet.

Parameters: 8B (dense)
Context window: 32k tokens
p95 latency: ~1.2s end-to-end
Active params per token: 8B (dense, no MoE)
Quantisation: FP16 default, INT8 available
Deployment tier(s): Shared cloud, Dedicated cluster
Minimum GPU: 1x A100 40GB
Recommended GPU: 1x A100 80GB or H100
Memory footprint: ~24 GB at FP16
Relative inference cost: Lowest of the Griffin variants
Eval - exploit-hypothesis accuracy: 68%
Eval - adversarial prompt resistance: 94%
Eval - security-Q&A hallucination rate: 1.4%
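The ~24 GB FP16 footprint is consistent with weights plus KV cache at full context. A back-of-envelope check, where the layer count, KV heads, and head dimension are illustrative guesses for a typical 8B layout, not Griffin Lite's published architecture:

```python
# Back-of-envelope FP16 memory check for an 8B dense model at 32k context.
# The architecture numbers below are illustrative guesses, not published specs.
params = 8e9
bytes_fp16 = 2
weights_gb = params * bytes_fp16 / 1e9  # 16 GB of weights

layers, kv_heads, head_dim, context = 32, 8, 128, 32_768
# KV cache: K and V tensors per layer, per token, at FP16.
kv_gb = 2 * layers * kv_heads * head_dim * bytes_fp16 * context / 1e9  # ~4.3 GB

total_gb = weights_gb + kv_gb  # ~20 GB; activations and runtime overhead
                               # account for the rest of the quoted ~24 GB
```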
Typical workflows

Where Lite earns its compute.

Single-finding fix suggestion

A finding comes in; Lite produces a hypothesis, cites the path, attempts a disproof against the project's sanitiser config, and emits a patch hunk. The whole exchange fits inside a normal PR comment cycle.

PR comment generation

Renders the structured reasoning trace into a reviewer-friendly comment with the cited hops, the refutation that failed, and the proposed fix. The trace ships with the comment so reviewers can audit it.

CLI scan summarisation

After a CLI deep-scan, Lite condenses the per-finding traces into a top-priorities summary with rationales. Cheap enough to run across thousands of candidate findings in a single CI job.

Cloud-burst fallback for on-device

When the on-device student model's confidence falls below threshold, the IDE routes that single finding to Lite for a deeper reasoning pass without leaving the inner loop.
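That fallback is a simple confidence gate. A minimal sketch, where the threshold and both model interfaces are assumptions for illustration:

```python
CONFIDENCE_THRESHOLD = 0.7  # illustrative; tuned per deployment in practice

def reason_about(finding, on_device_model, call_griffin_lite):
    """Return the local result if the student model is confident,
    otherwise burst this one finding to Griffin Lite in the cloud."""
    local = on_device_model(finding)  # fast pass, runs in the IDE
    if local["confidence"] >= CONFIDENCE_THRESHOLD:
        return local
    # Low confidence: one cloud round-trip, pinned to the PR-time budget.
    return call_griffin_lite(finding, max_latency_ms=1500)
```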

API example

Invoke Lite directly when the router would have picked it anyway.

A single-finding reasoning call with the latency budget pinned to the PR-time tier.

POST /v1/reason - griffin-lite
curl -X POST https://api.safeguard.sh/v1/reason \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "variant": "griffin-lite",
    "input": {
      "finding_id": "fnd_8f2c19a4",
      "mode": "single-finding",
      "include_trace": true,
      "include_patch": true
    },
    "constraints": {
      "max_latency_ms": 1500,
      "context_budget_tokens": 32000
    }
  }'
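The same call from Python, using only the standard library. This builds the request from the curl example above; the response schema is not documented in this sheet, so only the request side is shown:

```python
import json
import urllib.request

def reason_single_finding(token: str, finding_id: str) -> urllib.request.Request:
    """Build the /v1/reason call from the curl example above.

    Returns a prepared Request; pass it to urllib.request.urlopen to send.
    """
    payload = {
        "variant": "griffin-lite",
        "input": {
            "finding_id": finding_id,
            "mode": "single-finding",
            "include_trace": True,
            "include_patch": True,
        },
        "constraints": {
            "max_latency_ms": 1500,         # pin to the PR-time tier
            "context_budget_tokens": 32000,
        },
    }
    return urllib.request.Request(
        "https://api.safeguard.sh/v1/reason",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```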
Routing

When the router picks Lite.

Triage score 0.0 - 0.4

Lite handles the low-complexity, single-path band of the Eagle triage score. That covers findings where the call graph has shallow depth, the sink severity is moderate, sanitiser ambiguity is low, and no cross-package edges are involved. Inside this band, Lite produces a structured trace at roughly a tenth of the L-tier compute cost.

Single-path taint, in-package, ≤ 3 hops -> Lite.
Above 0.4 or context > 32k -> escalates to S or M.
Cross-package or adversarial disproof -> L or Zero.
On-device confidence below threshold -> cloud burst into Lite.
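The escalation rules above can be sketched as a pure routing function. This is illustrative only: the shipping router also weighs sink severity and sanitiser ambiguity, which are folded into the triage score here:

```python
def route(triage_score: float, hops: int, context_tokens: int,
          cross_package: bool, adversarial_disproof: bool) -> str:
    """Pick a Griffin variant from the escalation rules listed above.

    Sketch only; cross-package and adversarial work routes to L here,
    or to Zero for sovereign deployments.
    """
    if cross_package or adversarial_disproof:
        return "griffin-l"
    if triage_score > 0.4 or context_tokens > 32_000:
        return "griffin-s"      # escalates further to M as needed
    if hops <= 3:
        return "griffin-lite"   # single-path, in-package band
    return "griffin-s"
```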

Drop Lite into your CI step.

Cheap enough to run on every pull request, deep enough to emit a structured trace with the finding.