Ten commitments that bind how Safeguard treats customer data, model releases, disclosure, and our own mistakes. They are not marketing — they are the bar we have to clear. Where we fall short, we expect to be told.
Plain language. No legalese. If a commitment below ever fails to apply, we owe an explanation in writing.
Customer source code, prompts, scan artefacts, and findings never enter the training corpus. Anonymised model-behaviour telemetry is used to improve the model only with explicit opt-in; individual customer artefacts are not used at all.
Releases that regress on adversarial resistance, refusal-rate stability, or trace quality are held — regardless of headline benchmark gains. The gate is binary, not a target. The gate cannot be waived by a product manager.
Model weights are signed. Datasets are versioned. Training runs are reproducible from the recorded recipe. Any customer can request an attestation that ties a deployed model to the recipe that produced it.
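As a rough illustration of what such an attestation check could look like, here is a minimal Python sketch. The field names, the JSON canonicalisation, and the use of an HMAC are illustrative assumptions only; an actual scheme would likely use asymmetric signatures and a published key.

```python
import hashlib
import hmac
import json

# Hypothetical sketch: field names and HMAC signing are assumptions,
# not Safeguard's actual attestation format.

def weights_digest(weights_bytes: bytes) -> str:
    """Content-address the deployed weights."""
    return hashlib.sha256(weights_bytes).hexdigest()

def make_attestation(weights_bytes: bytes, recipe_id: str,
                     dataset_version: str, key: bytes) -> dict:
    # Tie the deployed weights to the recorded recipe and dataset version.
    payload = {
        "weights_sha256": weights_digest(weights_bytes),
        "recipe_id": recipe_id,
        "dataset_version": dataset_version,
    }
    canonical = json.dumps(payload, sort_keys=True).encode()
    payload["signature"] = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    return payload

def verify_attestation(att: dict, weights_bytes: bytes, key: bytes) -> bool:
    # Recompute the signature over the canonical payload, then confirm the
    # attested hash matches the weights actually deployed.
    payload = {k: v for k, v in att.items() if k != "signature"}
    canonical = json.dumps(payload, sort_keys=True).encode()
    expected = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(att["signature"], expected)
            and att["weights_sha256"] == weights_digest(weights_bytes))
```

The point of the sketch is the shape of the guarantee: a customer holding the attestation can independently recompute the weights hash and check the signature, so a substituted model fails verification.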
Every Griffin verdict emits a HYPOTHESIS / CITED PATH / DISPROOF / PROPOSED PATCH trace. We do not ship findings without that trace. We do not redact reasoning to make a number look better.
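The four-section trace can be pictured as a simple record with a ship gate on completeness. This is a sketch only: the class name, field types, and validation rule are assumptions, not the platform's actual schema, though the four section names come straight from the commitment above.

```python
from dataclasses import dataclass

# Illustrative sketch, not Griffin's real schema: the four fields mirror
# the named sections of a verdict trace.

@dataclass(frozen=True)
class VerdictTrace:
    hypothesis: str      # what the model suspects is wrong
    cited_path: str      # the code path cited as evidence
    disproof: str        # the attempt to falsify the hypothesis
    proposed_patch: str  # the remediation offered with the finding

    def is_shippable(self) -> bool:
        """A finding with any empty trace section is held, not shipped."""
        return all(
            section.strip()
            for section in (self.hypothesis, self.cited_path,
                            self.disproof, self.proposed_patch)
        )
```

The design choice the commitment implies is that the gate is structural: a finding missing any one of the four sections is simply not publishable, rather than publishable with a warning.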
No proprietary data formats where an open one exists. Full export of customer data on request, in an open schema, within five business days. Migration paths off Safeguard are documented and supported.
Griffin Zero on sovereign deployments uses the same weights, the same training recipe, and the same safety controls as the multi-tenant deployment. Air-gap is a deployment property, not a capability ceiling.
Aggregated, anonymised findings are used to publish public threat-feed items and research. They are not sold to ad networks, data brokers, offensive-security vendors, or any party operating outside a defensive use case.
When the platform identifies a candidate vulnerability in third-party code, the default is coordinated disclosure with the upstream maintainer under our published SLA. Public posting requires explicit customer consent.
If a security incident materially affects customer data, customer findings, or customer-deployed model artefacts, affected customers are notified within 24 hours of confirmation. The clock starts at confirmation, not at convenience.
When we mis-design a feature, mis-prioritise a roadmap, or mis-handle a customer interaction, we say so and change course. Where the misstep was public, the correction is public. Compounding a mistake costs more than admitting one.
Commitments are only as good as the mechanisms behind them. Each of the surfaces below is public and auditable.
Quarterly roadmap is public. Shipped, slipped, and cut items are all visible. Customers can see what changed and why.
Threat-feed items are published with the cited evidence and the trace that produced them. No findings posted without a reproducible artefact.
Any customer can request a signed attestation tying their deployed model to the training recipe and weights hash that produced it.
Aggregate platform numbers, government requests, incidents, and commitments missed are published on a quarterly cadence.
If a Safeguard release, decision, or behaviour visibly breaches any of the commitments above, raise it directly. The compliance mailbox is monitored by a named person on the responsibility team. Security-relevant findings can also route through the bug-bounty programme. Both channels guarantee a response.