Responsibility · Transparency Report

What we count. What we publish. What we get wrong.

Aggregate platform numbers, government requests, incidents, and — the section nobody likes writing — the commitments we missed. Published quarterly. Methodology is open.

Last updated: Q2 2026
Aggregate Platform · Last 12 Months

The numbers, in one place.

Anonymised, aggregate. No individual-customer attribution. Methodology link below.

Active tenants

1,240

Single- and multi-tenant deployments combined.

Findings shipped

3.42M

Survived adversarial disproof; reached a customer queue.

Auto-remediated

61%

Findings closed by an automatic Safeguard patch.

Adversarial resistance

0.948

Mean score across the Griffin family this quarter.

Red-team first-pass

82%

Model releases that cleared the gate on the first attempt. Not 100%, on purpose.

Coordinated disclosures

147

Published with upstream maintainer in the loop.

Threat-feed items

412

Public posts with cited evidence and reproducible artefact.

Corpus rotations

4

Training-corpus refresh events; each one is recipe-versioned.

Government Requests

Requests received, complied with, challenged.

Single-digit counts. Illustrative; the live numbers are updated each quarter. The principle is: minimal compliance, robust challenge of overbroad requests, and customer notification wherever legally permitted.

Jurisdiction      Received   Complied   Challenged
United States         7          4          2
European Union        4          3          1
United Kingdom        2          1          1
Singapore             1          1          0
Other                 3          1          1

Illustrative counts. The live report updates these on a quarterly cadence with sealed-request caveats noted where applicable.

Incidents and Post-mortems

What broke and what we changed.

Illustrative excerpts from recent quarters. The live incident log lives on the status page.

SEV-2 · INC-2026-014

Eagle triage stream backlog after upstream advisory burst

Scope · Eagle queue latency rose to 14 minutes for ~3 hours on 12-Mar; multi-tenant only.

Root cause · Advisory-feed burst exceeded queue-worker concurrency cap; rate limiter held the queue rather than the producer.

Full RCA
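The fix for the root cause above is to move backpressure from the queue to the producer: a saturated worker pool should push back on the advisory feed rather than let the backlog grow. A minimal sketch follows; `AdvisoryIngest` and `MAX_INFLIGHT` are hypothetical names, not the actual service.

```python
import queue

class AdvisoryIngest:
    """Illustrative sketch of producer-side admission control.

    A bounded queue rejects new work once workers are saturated,
    so the producer backs off instead of queue latency inflating.
    """
    MAX_INFLIGHT = 100  # hypothetical worker concurrency cap

    def __init__(self):
        # Bounded queue: put_nowait() fails fast rather than buffering.
        self.q = queue.Queue(maxsize=self.MAX_INFLIGHT)

    def submit(self, advisory) -> bool:
        """Admit an advisory, or signal the producer to retry with backoff."""
        try:
            self.q.put_nowait(advisory)
            return True
        except queue.Full:
            return False  # backpressure lands on the producer, not the queue


ingest = AdvisoryIngest()
accepted = sum(ingest.submit(n) for n in range(150))
# Only the first MAX_INFLIGHT submissions are admitted; the burst is
# pushed back to the feed instead of becoming queue latency.
```

The design point is where the limit lives: rate-limiting the queue holds already-accepted work, while rate-limiting the producer keeps the backlog bounded during an advisory burst.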
SEV-3 · INC-2026-009

Griffin reasoning-trace truncation regression

Scope · 0.4% of Griffin M traces truncated below the documented 8k token contract for 9 hours.

Root cause · Inference batching change silently lowered the per-trace token budget; caught by a downstream contract test, not the release gate.

Full RCA
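A downstream contract test of the kind that caught this regression can be sketched as below. Only the 8k-token contract comes from the incident summary; the function names, the shared token-pool model, and the batch sizes are assumptions for illustration.

```python
TRACE_TOKEN_CONTRACT = 8_192  # documented per-trace minimum (from the incident)

def effective_trace_budget(batch_size: int, per_batch_tokens: int) -> int:
    """Model the regression: a batching change divides a shared token
    pool across the batch, silently shrinking each trace's budget."""
    return per_batch_tokens // batch_size

def test_trace_budget_contract():
    # A release gate that only checks batch_size=1 misses this class of
    # bug; the contract test exercises the batch sizes seen in serving.
    for batch_size in (1, 2, 4, 8):
        budget = effective_trace_budget(batch_size, per_batch_tokens=65_536)
        assert budget >= TRACE_TOKEN_CONTRACT, (
            f"trace budget {budget} < contract {TRACE_TOKEN_CONTRACT} "
            f"at batch_size={batch_size}"
        )

test_trace_budget_contract()
```

The lesson encoded here is that a documented contract needs an explicit test at the configurations production actually runs, not only at the release gate's defaults.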
SEV-1 · INC-2026-002

Cross-tenant trace metadata leak in the audit export

Scope · Two enterprise tenants saw tenant-id metadata fields belonging to other tenants in an exported audit bundle. No findings content leaked.

Root cause · Export bundler shared a non-thread-local context across tenant boundaries during parallel export. Affected customers were notified within 18 hours.

Full RCA
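The conventional fix for this class of bug is to make the export context thread-local, so parallel workers cannot observe each other's tenant metadata. A minimal sketch, with `export_bundle` and the context shape as hypothetical stand-ins for the real bundler:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Thread-local storage: each worker thread gets its own context slot,
# so tenant metadata cannot bleed across parallel exports.
_ctx = threading.local()

def export_bundle(tenant_id: str) -> str:
    _ctx.tenant_id = tenant_id  # visible only to this worker thread
    # ... build the audit bundle using _ctx.tenant_id ...
    return f"bundle for {_ctx.tenant_id}"

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(export_bundle, ["t-a", "t-b", "t-c", "t-d"]))
# Each bundle carries only its own tenant id, even under parallel export.
```

A shared mutable context (the pre-fix state) would let one thread overwrite another's tenant id between assignment and use; thread-local storage removes that interleaving by construction.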
Where We Did Not Meet a Commitment

The honest section.

Two misses in the last reporting period. Listed in plain language. No hedge.

Model-tier parity SLA missed by 12 days on the Griffin L → Griffin Zero promotion

Our public commitment is full-lineup parity across deployment shapes within 30 days of a tier promotion. On the most recent Griffin L → Griffin Zero promotion, sovereign customers waited 42 days. The cause was an unplanned export-control review on one corpus subset; we mis-scoped the review window when we announced the date. The revised commitment, learned from this miss, includes the review window inside the SLA rather than outside it.

Customer-facing latency regression on Eagle confidence calls

A 280ms p95 regression on Eagle confidence-score calls persisted for 11 days in Q1. Our commitment for a customer-facing performance regression of that magnitude is to root-cause and ship a fix within 5 business days. We took longer because the trace was mis-triaged into the standard backlog rather than the regression queue. The triage rule has been changed; the miss is logged here.
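The revised triage rule can be expressed as a routing predicate. This is a hedged sketch only: the 100ms materiality threshold, the field names, and the queue names are assumptions, not our actual policy encoding.

```python
# Hypothetical encoding of the revised triage rule: a customer-facing
# performance regression routes to the regression queue, never the
# standard backlog.
P95_REGRESSION_MS = 100  # assumed materiality threshold, illustrative

def route(issue: dict) -> str:
    if issue.get("customer_facing") and issue.get("p95_delta_ms", 0) >= P95_REGRESSION_MS:
        return "regression-queue"  # the 5-business-day clock starts here
    return "standard-backlog"

# The Q1 miss, replayed under the new rule:
assert route({"customer_facing": True, "p95_delta_ms": 280}) == "regression-queue"
```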

How we count what we count

Methodology, in two paragraphs.

Findings are counted at the moment they survive adversarial disproof and become available to a customer queue, not at the moment a candidate is generated. Auto-remediation is counted only when a Safeguard-proposed patch is the one that lands in the customer's main branch. Adversarial-resistance scores are mean scores across the held-out evaluation set; the per-model breakdown is in the research notes for that release. Threat-feed items are counted once per first publication. Disclosure counts include only items the upstream maintainer was contacted on; silent posts are not counted, because we do not do silent posts.

The full data-flow diagram, the metric definitions, and the pipeline that produces this page live on the architecture page. Each number on this page is reproducible from the source events. Where a count cannot be reproduced — because of a sealed-disclosure or sealed-request constraint — that limitation is noted next to the number rather than papered over.
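Under assumed event fields (not the real schema; `survived_disproof`, `queued_at`, `landed`, and `patch_source` are illustrative names), the first two counting rules reduce to a sketch like:

```python
def count_findings(events: list[dict]) -> int:
    """A finding counts once it survives adversarial disproof AND reaches
    a customer queue; generated-only candidates do not count."""
    return sum(1 for e in events
               if e.get("survived_disproof") and e.get("queued_at"))

def auto_remediation_rate(findings: list[dict]) -> float:
    """Counted only when the Safeguard-proposed patch is the one that
    lands in the customer's main branch."""
    closed = [f for f in findings if f.get("landed")]
    auto = [f for f in closed if f.get("patch_source") == "safeguard"]
    return len(auto) / len(closed) if closed else 0.0
```

The point of counting at queue time rather than generation time is that candidate volume is cheap to inflate; the published number measures only work that survived disproof and reached a customer.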

Want to see the full quarterly data set? Ask.