How we scale models. Without scaling the harm.
Scaling model capability has to be paired with scaling our ability to red-team, disprove, and govern. We scale only when the safety posture scales with the capability. This page is the public framework.
Four safety levels. Each with required controls.
SL-1 — Bounded inline
Lion-class (1B)
On-device only. No exfiltration paths. Sub-100ms p95. Signed weights verified at install. Deployable on any developer machine; no network egress required.
SL-2 — Multi-tenant cloud
Eagle 13B · Griffin Lite 8B · Griffin S 14B
Multi-tenant inference; structured trace contract; per-tenant audit log; standard adversarial-resistance gate (≥0.94 prompt-injection block-rate).
SL-3 — Dedicated reasoning
Griffin M 32B · Griffin L 70B
Single-tenant inference cluster; advanced adversarial-resistance gate (≥0.97); coordinated-disclosure obligations; quarterly red-team rotation; structured-trace human audit on 300 samples per release.
SL-4 — Sovereign deep reasoning
Griffin Zero (671B-MoE)
Air-gapped or VPC-isolated. Full red-team + manual trace audit per release. Customer-controlled key material. ITAR/export-control review. Adversarial-resistance ≥0.99.
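
As an illustration only, the four levels above could be written down as a small lookup table with a gate check. This is a minimal sketch, not the release pipeline: the names `SafetyLevel`, `LEVELS`, and `meets_gate` are hypothetical, and it assumes the three numeric gates are all measured on the same 0 to 1 block-rate scale. No numeric gate is stated for SL-1, so it is stored as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SafetyLevel:
    name: str
    models: tuple
    # Minimum adversarial-resistance score; None where the page states no gate.
    min_block_rate: Optional[float]


# Values copied from the level descriptions; the structure itself is hypothetical.
LEVELS = {
    "SL-1": SafetyLevel("Bounded inline", ("Lion-class (1B)",), None),
    "SL-2": SafetyLevel(
        "Multi-tenant cloud",
        ("Eagle 13B", "Griffin Lite 8B", "Griffin S 14B"),
        0.94,
    ),
    "SL-3": SafetyLevel(
        "Dedicated reasoning", ("Griffin M 32B", "Griffin L 70B"), 0.97
    ),
    "SL-4": SafetyLevel(
        "Sovereign deep reasoning", ("Griffin Zero (671B-MoE)",), 0.99
    ),
}


def meets_gate(level: str, block_rate: float) -> bool:
    """Return True if a measured score clears the level's numeric gate.

    Levels without a stated gate always pass this particular check
    (an assumption); their other controls are not modelled here.
    """
    gate = LEVELS[level].min_block_rate
    return gate is None or block_rate >= gate
```

For example, `meets_gate("SL-3", 0.96)` is false because 0.96 is below the 0.97 gate. The non-numeric controls (audit logs, air-gapping, key custody) would need their own checks.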
What triggers a re-tier.
If safety regresses, the tier ceiling lowers. The model continues to ship — but at the lower safety level — until the posture catches up. The triggers:
- Adversarial-resistance regression > 0.5% on the held-out suite
- Hallucination-rate regression > 0.1% on the security-Q&A eval
- Refusal-rate regression > 5% on legitimate-research prompts
- Novel jailbreak class observed in MCP-server traffic or red-team logs
- Customer-facing safety incident attributed to model behaviour
- External regulatory or sectoral requirement change affecting deployment posture
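
The trigger list can be sketched as a simple check between two eval snapshots. This is a hedged illustration under stated assumptions: the thresholds are read as percentage points (the list does not say whether they are absolute or relative), a fired trigger lowers the ceiling by exactly one level, and SL-1 is the floor. `EvalSnapshot`, `fired_triggers`, and `lowered_ceiling` are hypothetical names, and the three non-numeric triggers are passed in as plain flags.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds from the trigger list, interpreted as percentage points (assumption).
ADV_RESISTANCE_MAX_DROP = 0.5
HALLUCINATION_MAX_RISE = 0.1
REFUSAL_MAX_RISE = 5.0


@dataclass(frozen=True)
class EvalSnapshot:
    adv_resistance_pct: float  # held-out suite; higher is better
    hallucination_pct: float   # security-Q&A eval; lower is better
    refusal_pct: float         # legitimate-research prompts; lower is better


def fired_triggers(
    baseline: EvalSnapshot,
    current: EvalSnapshot,
    novel_jailbreak: bool = False,
    customer_incident: bool = False,
    regulatory_change: bool = False,
) -> list[str]:
    """List the re-tier triggers that fire when moving from baseline to current."""
    fired = []
    if baseline.adv_resistance_pct - current.adv_resistance_pct > ADV_RESISTANCE_MAX_DROP:
        fired.append("adversarial-resistance regression")
    if current.hallucination_pct - baseline.hallucination_pct > HALLUCINATION_MAX_RISE:
        fired.append("hallucination-rate regression")
    if current.refusal_pct - baseline.refusal_pct > REFUSAL_MAX_RISE:
        fired.append("refusal-rate regression")
    if novel_jailbreak:
        fired.append("novel jailbreak class")
    if customer_incident:
        fired.append("customer-facing safety incident")
    if regulatory_change:
        fired.append("regulatory or sectoral requirement change")
    return fired


def lowered_ceiling(level: int, fired: list[str]) -> int:
    """Drop the ceiling one level if anything fired; never below SL-1 (assumption)."""
    return max(1, level - 1) if fired else level
```

With a baseline of 97.5% adversarial resistance and a current score of 96.8%, the 0.7-point drop exceeds the 0.5-point threshold, so a model at SL-3 would ship at SL-2 under this sketch until the score recovers.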
How we govern this.
Internal safety review board
Weekly review of red-team findings, eval regressions, and customer safety reports. Membership rotates across engineering, research, and security.
External red-team partners
Quarterly rotation of independent red teams with sector-specific specialisations (offensive security, prompt injection, AI safety).
Public RSP commitments
This page is the public commitment. Updated quarterly. Material changes flagged on the changelog.
Independent audit
Annual third-party audit of the eval methodology, the corpus curation, and the release pipeline. Summary published in the transparency report.