OpenAI published the GPT-5 system card on August 13, 2025, alongside the model's general availability across ChatGPT, the API, and Azure OpenAI. The 80-plus-page document is the most consequential safety disclosure from OpenAI since the GPT-4 system card in March 2023, and it lands under the company's updated Preparedness Framework v2 (effective April 15, 2025). For platform security teams that have to underwrite GPT-5 in regulated workflows — financial advice, medical triage, code generation that ships to production — the system card is not optional reading. It documents the categorical risk tier OpenAI assigned to the model, the safeguards it claims sufficiently minimize that risk, and the residual failure modes that remain after those safeguards. This post extracts the supply-chain-relevant findings.
Where GPT-5 sits in the Preparedness Framework
Preparedness Framework v2 collapsed the prior four-tier scale into two operational thresholds: High capability and Critical capability. High means a model "could amplify existing pathways to severe harm" and must have safeguards in place before deployment. Critical means the model "could introduce unprecedented new pathways to severe harm" and additionally requires safeguards during development. GPT-5 is classified High in biological and chemical capabilities and High in cybersecurity. It is not classified Critical in any category. The High classification for biological and chemical capabilities is the most consequential single line in the card: it triggers stricter monitoring, a documented refusal stack for CBRN uplift requests, and a commitment by OpenAI not to make raw, unmitigated GPT-5 weights or full chain-of-thought outputs available through any public surface.
What changed between GPT-4o and GPT-5 on safety
The system card discloses substantial movement on three axes. First, the model's refusal behavior on dual-use biology questions improved against the December 2024 internal benchmark: GPT-5 refuses or safely completes 96% of evaluation prompts versus 81% for GPT-4o. Second, on the StrongREJECT jailbreak benchmark, GPT-5 with its production safety stack scores 0.93 (higher is more robust) compared with 0.83 for GPT-4o. Third, OpenAI added a "deliberative alignment" training step that requires the model to reason explicitly about policy before answering high-stakes prompts; the card credits this step for the lift on agentic safety evaluations. Notably, the card concedes that GPT-5 provides more cyberattack uplift than GPT-4o on the MITRE ATT&CK CTF benchmark, solving 35% of intermediate-tier capture-the-flag challenges versus 19% for GPT-4o, and OpenAI's mitigation here is refusal training plus downstream monitoring rather than a capability cap.
Sensitive conversations addendum
In November 2025 OpenAI published an addendum to the GPT-5 system card on sensitive conversations, following the disclosure that GPT-4o and earlier models had been involved in tragic interactions with vulnerable users. The addendum documents a new safety completion pathway for prompts involving suicidality, self-harm, or acute psychological distress: GPT-5 routes to specialized refusal-plus-resource responses, and high-confidence detections are flagged for human review when surfaced in ChatGPT (not in raw API traffic). For enterprise buyers building consumer-facing applications on the API, this matters: the default API behavior does not flag the same way ChatGPT does, and your application is responsible for replicating those safeguards.
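What replicating those safeguards can look like in practice: a minimal sketch, assuming OpenAI's moderation endpoint as the detector. The queue_for_review hook, the resource message, and the routing logic are illustrative assumptions, not the system card's implementation.

# Sketch: approximating ChatGPT's sensitive-conversation handling in an
# API-based app. The moderation endpoint and its self-harm categories are
# real; queue_for_review and CRISIS_RESOURCES are assumed application hooks.
from openai import OpenAI

client = OpenAI()

CRISIS_RESOURCES = (
    "It sounds like you're going through something serious. "
    "You can reach the 988 Suicide & Crisis Lifeline by calling or texting 988."
)

def queue_for_review(message: str) -> None:
    # Placeholder for your human-review pipeline (ticket, on-call page, etc.).
    print(f"flagged for review: {message[:80]!r}")

def guarded_reply(user_message: str) -> str:
    mod = client.moderations.create(
        model="omni-moderation-latest",
        input=user_message,
    )
    cats = mod.results[0].categories
    if cats.self_harm or cats.self_harm_intent or cats.self_harm_instructions:
        queue_for_review(user_message)  # mirror ChatGPT's human-review flagging
        return CRISIS_RESOURCES         # refusal-plus-resources, per the addendum
    resp = client.chat.completions.create(
        model="gpt-5-2025-08-13",
        messages=[{"role": "user", "content": user_message}],
    )
    return resp.choices[0].message.content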
GPT-5.1, GPT-5.2, and the cadence problem
OpenAI followed the August 13 launch with GPT-5.1 Instant and GPT-5.1 Thinking on November 12, 2025, then GPT-5.2 on December 11, 2025, and a GPT-5.2-Codex addendum on December 18, 2025. Each release carried a system card addendum rather than a full re-evaluation. From a vendor risk standpoint this is the same problem we have with browser engine versioning: the named model in your contract is not the model executing your prompt today. The API selector gpt-5 resolves to a latest-stable pointer that has changed at least three times since August. If your model risk policy demands version pinning, reference the dated model strings (gpt-5-2025-08-13, gpt-5.1-2025-11-12, etc.) and accept that those snapshots are retired on a schedule OpenAI publishes on its model deprecations page.
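In practice the pin is a one-line change. A minimal sketch follows; the dated snapshot string comes from the deprecations guidance above, while the audit logging is an assumption about how you might evidence it.

# Sketch: pin a dated snapshot instead of the moving alias. "gpt-5"
# resolves to whatever OpenAI currently designates as latest; the dated
# string fixes the exact model your risk review actually covered.
from openai import OpenAI

client = OpenAI()

PINNED_MODEL = "gpt-5-2025-08-13"  # the snapshot named in your contract and AIBOM

resp = client.chat.completions.create(
    model=PINNED_MODEL,  # never the bare "gpt-5" alias in regulated workflows
    messages=[{"role": "user", "content": "Summarize our retention policy."}],
)
print(resp.model)  # log the server-reported model string as audit evidence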
What this means for your AIBOM
The system card's most actionable artifact is the per-model evaluation table that maps each capability dimension to a numeric score, the safeguards applied, and residual risk language. Capturing those rows into an AIBOM entry gives downstream consumers — security review, regulators, customers — a deterministic answer to "what tier of model is running here and what did the vendor disclose about its hazards." A minimal AIBOM excerpt for GPT-5 in CycloneDX 1.6 ML-BOM form looks like this:
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "components": [
    {
      "type": "machine-learning-model",
      "name": "gpt-5",
      "version": "2025-08-13",
      "supplier": { "name": "OpenAI" },
      "modelCard": {
        "modelParameters": {
          "task": "text-generation",
          "approach": { "type": "supervised" }
        },
        "quantitativeAnalysis": {
          "performanceMetrics": [
            { "type": "StrongREJECT", "value": "0.93" },
            { "type": "MITRE-ATT&CK-CTF-intermediate", "value": "0.35" }
          ]
        },
        "considerations": {
          "ethicalConsiderations": [
            {
              "name": "Preparedness Framework v2: High capability in biological/chemical and cybersecurity",
              "mitigationStrategy": "Safeguards documented in the GPT-5 system card"
            }
          ]
        }
      }
    }
  ]
}
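Once that entry exists, a deployment gate can consume it mechanically. Below is a hedged sketch of such a gate; the 0.90 StrongREJECT floor and the command-line interface are illustrative policy choices, not values from the system card.

# Sketch: a CI policy gate over the ML-BOM above. The 0.90 StrongREJECT
# floor is an illustrative policy choice, not a value from the system card.
import json
import sys

MIN_STRONGREJECT = 0.90

def check(bom_path: str) -> int:
    with open(bom_path) as f:
        bom = json.load(f)
    for comp in bom.get("components", []):
        if comp.get("type") != "machine-learning-model":
            continue
        card = comp.get("modelCard", {})
        metrics = card.get("quantitativeAnalysis", {}).get("performanceMetrics", [])
        scores = {m["type"]: float(m["value"]) for m in metrics}
        if scores.get("StrongREJECT", 0.0) < MIN_STRONGREJECT:
            print(f"BLOCK {comp['name']}: StrongREJECT below {MIN_STRONGREJECT}")
            return 1
        print(f"PASS {comp['name']}@{comp.get('version')}")
    return 0

if __name__ == "__main__":
    sys.exit(check(sys.argv[1]))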
What's still missing from the disclosure
Three gaps remain. (1) Training-data provenance is described only in aggregate; there is no enumeration of sources sufficient to satisfy the EU AI Act's training data summary obligation that took effect August 2, 2025. OpenAI publishes a separate training data summary template, but it does not appear inside the system card. (2) The card describes safety evaluations OpenAI ran on the model in its production deployment configuration, including the production safety stack. Models routed through Azure OpenAI may run with a slightly different stack (Microsoft adds its own content filters), and that delta is not characterized. (3) The card does not disclose specific external red-team firms used, only that the SAFE program engaged "dozens of external testers." For regulated industries that need attestable assurance about test independence, that is insufficient.
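The second gap is at least empirically measurable: replay your own refusal evaluation set against both surfaces and diff the behavior. A rough sketch, assuming the OpenAI Python SDK's AzureOpenAI client; the endpoint, api_version, and deployment name are placeholders, and a real harness would replay a full evaluation set rather than one prompt.

# Sketch: characterize the OpenAI-vs-Azure safety-stack delta yourself by
# sending identical prompts to both surfaces. Endpoint, deployment name,
# and api_version below are placeholders for your own configuration.
from openai import OpenAI, AzureOpenAI

prompt = "..."  # a prompt from your internal refusal evaluation set

direct = OpenAI()
azure = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_version="2024-10-21",
    api_key="...",
)

for label, client, model in [
    ("openai", direct, "gpt-5-2025-08-13"),
    ("azure", azure, "YOUR-GPT5-DEPLOYMENT"),
]:
    r = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    print(label, r.choices[0].message.content[:120])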
How Safeguard Helps
Safeguard's AIBOM registry tracks GPT-5 (and 5.1, 5.2, and 5.2-Codex) as distinct CycloneDX 1.6 model components with system-card-derived metadata. Policy gates let security teams block product deployments that depend on a model classified High in capability dimensions outside your risk appetite; for example, a healthcare product can require that any model in use scores at least 0.90 on StrongREJECT and refuses or safely completes more than 95% of bio dual-use evaluation prompts. Griffin AI watches the OpenAI model deprecations feed and the Preparedness Framework page, raising findings when a contracted dated model is end-of-lifed or when the framework's capability thresholds change. TPRM workflows continuously score OpenAI's safety disclosure cadence against your vendor scorecard, alerting when system card addenda lag a model release by more than your policy permits.