On April 15, 2025, OpenAI published Preparedness Framework version 2, the first major rewrite of the framework since its initial release. The v2 document streamlines the capability levels from the prior four-tier system to two clear thresholds, adds AI self-improvement as a Tracked Category alongside cybersecurity and biology/chemistry, and tightens what it means, operationally, to "sufficiently minimize" risk. For enterprise defenders evaluating OpenAI's models, the framework is the document that explains why OpenAI made certain safety choices and what triggers the next round of restrictions. This post extracts the operationally relevant changes and explains what to track in your model risk register.
What are the new capability thresholds?
The Preparedness Framework v2 streamlines to two thresholds: High capability and Critical capability. High capability is defined as a capability level that could amplify existing pathways to severe harm; Critical capability as one that could introduce unprecedented new pathways to severe harm. The operational difference matters: systems reaching High capability must have safeguards in place before deployment, while systems reaching Critical capability must have safeguards in place during development, meaning OpenAI commits not to continue training toward such capabilities until those safeguards exist.
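To make the distinction concrete, here is a minimal sketch of how the two gate timings could be wired into a release pipeline. The threshold names follow the framework; the enum, function name, and gate labels are illustrative assumptions, not OpenAI tooling or terminology.

# Hypothetical sketch: v2 thresholds mapped to gate timing.
# Threshold names follow the framework; everything else is illustrative.
from enum import Enum

class Threshold(Enum):
    BELOW_HIGH = "below_high"
    HIGH = "high"
    CRITICAL = "critical"

def required_safeguard_gates(threshold: Threshold) -> list[str]:
    """Return which safeguard gates must be satisfied, and at what stage."""
    if threshold is Threshold.CRITICAL:
        # Critical: safeguards must already exist during development,
        # i.e. before further training toward the capability continues.
        return ["safeguards-during-development", "safeguards-before-deployment"]
    if threshold is Threshold.HIGH:
        # High: safeguards must be in place before the model is deployed.
        return ["safeguards-before-deployment"]
    return []  # Below High: standard release process applies.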
The reduction from four tiers to two is a meaningful simplification. The prior framework had Low, Medium, High, and Critical, which led to definitional disputes (where does Medium end and High begin?) that consumed governance attention without changing operational behavior much. The v2 binary is easier to operationalize: either a capability requires safeguards or it does not, and the disagreement collapses to whether the model passed the High threshold.
What is AI self-improvement and why was it added?
The third Tracked Category in v2 is AI self-improvement, sitting alongside cybersecurity and biological/chemical capabilities. OpenAI explained the addition by stating that self-improvement "presents a distinct plausible, net new, and potentially irremediable risk, namely that of a hard-to-track rapid acceleration in AI capabilities which could have hard-to-predict severely harmful consequences." Concretely, OpenAI is now evaluating whether its models can autonomously contribute to ML research and ML engineering and, most importantly, recursively improve their own training pipelines.
The GPT-5.2 system card update in December 2025 was the first to publish numerical evaluation results for this category. The model was reported as below the High threshold but with measurable capability — meaning the trajectory is real and is being tracked, but the redline has not been crossed. For enterprise defenders, this is not a category that affects today's deployments directly; it is the category that signals the regulatory environment those deployments will face in 2-3 years.
What changed about "sufficiently minimize"?
The phrase "sufficiently minimize risk" was load-bearing in the v1 framework but was operationally vague. V2 tightens it by requiring: (a) an explicit evaluation methodology, (b) third-party validation where available, (c) public disclosure of evaluation results in the system card, and (d) ongoing monitoring of deployed systems against the documented capability levels. The fourth point matters most operationally: it means OpenAI commits to re-evaluating deployed models if real-world usage reveals capability not seen in pre-release testing. For defenders, this is the closest the framework comes to a continuous-monitoring obligation, and it is the right hook to ask vendors about during procurement.
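One way to use that hook during procurement is to record each of the four requirements as an explicit evidence field in your vendor assessment. The field names below are our own sketch, not part of the framework:

# Hypothetical checklist derived from the four "sufficiently minimize"
# requirements; field names are illustrative, values filled in per vendor.
sufficiently_minimize_evidence = {
    "evaluation_methodology": None,      # (a) link to the published methodology
    "third_party_validation": None,      # (b) evaluator name and report, where available
    "system_card_disclosure": None,      # (c) URL of the disclosed evaluation results
    "post_deployment_monitoring": None,  # (d) the vendor's re-evaluation commitment
}

def evidence_gaps(evidence: dict) -> list[str]:
    """List the requirements the vendor has not yet evidenced."""
    return [field for field, value in evidence.items() if value is None]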
What are the Research Categories?
V2 introduced Research Categories: areas of capability that could pose risks of severe harm but do not yet meet the criteria for a Tracked Category. Those criteria are that the risk is (1) plausible, (2) measurable, (3) severe, (4) net new, and (5) instantaneous or irremediable. A capability area that fails any of these criteria is not Tracked; if it plausibly could meet them as models improve, it becomes a Research Category instead. This is OpenAI signaling future scope: capabilities they are watching but have not yet committed to evaluating against a redline. For enterprise governance, the Research Categories list is the leading indicator of which capabilities will be regulated in the next framework revision.
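The decision logic reduces to a short check. The five criteria are the framework's; the function below and its field names are a sketch of ours:

# Hypothetical sketch of the Tracked vs. Research classification logic.
TRACKED_CRITERIA = (
    "plausible", "measurable", "severe", "net_new", "instantaneous_or_irremediable",
)

def classify_category(meets: dict, plausibly_tracked_later: bool) -> str:
    """meets maps each criterion name to True/False for a capability area."""
    if all(meets.get(criterion, False) for criterion in TRACKED_CRITERIA):
        return "tracked"    # evaluated against a defined High/Critical threshold
    if plausibly_tracked_later:
        return "research"   # watched, not yet held to a redline
    return "out_of_scope"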
How does this compare to Anthropic's RSP and Google's FSF?
The three frontier labs now have broadly similar frameworks: Anthropic's Responsible Scaling Policy (v3.0 released February 2026), OpenAI's Preparedness Framework v2 (April 2025), and Google DeepMind's Frontier Safety Framework. All three define capability tiers, all three commit to pre-deployment evaluation, all three publish system-card-style disclosures. The honest assessment is that the frameworks are converging on a common architecture — capability evaluation as a release gate — while disagreeing on specific thresholds and remediation timelines. For enterprise defenders, the practical implication is that your model risk register can use the same template across vendors, with vendor-specific tier names mapped to a normalized severity scale.
# Normalized vendor risk-tier mapping for model risk register
vendor_tiers:
  anthropic:
    asl-2: { normalized: low }
    asl-3: { normalized: medium }
    asl-4: { normalized: high }
    asl-5: { normalized: critical }
  openai:
    below-high: { normalized: low_to_medium }
    high: { normalized: high }
    critical: { normalized: critical }
  google_deepmind:
    below-ccl: { normalized: low_to_medium }
    at-ccl: { normalized: high_to_critical }
internal_action:
  low: { gate: "standard-pr-review" }
  medium: { gate: "appsec-review-required" }
  high: { gate: "governance-board-approval" }
  critical: { gate: "do-not-deploy-without-vp-engineering-signoff" }
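A minimal usage sketch, assuming the mapping above is saved as vendor_tiers.yaml and PyYAML is installed; the rule that a ranged value such as high_to_critical resolves to its more severe end is our conservative assumption, not a vendor's:

# Hypothetical: resolve a vendor's disclosed tier to an internal deployment gate.
import yaml

with open("vendor_tiers.yaml") as f:
    config = yaml.safe_load(f)

def gate_for(vendor: str, tier: str) -> str:
    normalized = config["vendor_tiers"][vendor][tier]["normalized"]
    # Conservative assumption: a range such as "low_to_medium" maps to its higher end.
    severity = normalized.split("_to_")[-1]
    return config["internal_action"][severity]["gate"]

print(gate_for("openai", "high"))  # -> governance-board-approval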
What about the academic criticism?
In September 2025, a paper on arXiv (2509.24394) titled "The 2025 OpenAI Preparedness Framework does not guarantee any AI risk mitigation practices" argued that the framework's language is permissive enough that OpenAI could comply without meaningful mitigation, and provided a proof-of-concept analysis of the framework's affordances. The paper is worth reading; the critique is not that OpenAI is acting in bad faith but that the framework as written would not bind them to specific mitigations under adverse interpretation. For enterprise defenders, this is the right reason to require vendor evidence beyond compliance with their own framework — independent third-party evaluation, contractual SLAs around safeguards, and your own internal evaluations.
What is the framework's interaction with the EU AI Act?
The EU AI Act, which began phased enforcement in 2025, defines high-risk AI systems with their own evaluation and disclosure requirements. The OpenAI Preparedness Framework operates at a different layer — it covers frontier capability assessment, while the AI Act covers product-level deployment risk — but the two intersect at the system-card and capability-disclosure surface. OpenAI's preparedness disclosures are a useful input to AI Act conformity assessments for enterprises deploying GPT-5.x products in EU markets, but they are not a substitute. The AI Act requires deployer-side documentation of intended use, risk management measures, and post-market monitoring; the Preparedness Framework provides part of the evidence package but not the whole. Enterprises in EU markets should expect to combine vendor disclosures (Preparedness Framework, system card) with their own deployment-specific risk assessment to produce a complete conformity file.
What should enterprises do with this?
Three updates. First, map vendor tier classifications into your normalized risk register so cross-vendor decisions use the same units. Second, require vendors to cite the specific evaluation methodology and the third-party evaluator (the UK AISI, METR, Apollo Research, etc.) in their evidence package, not just the resulting classification. Third, track the Research Categories list as a procurement signal: capabilities that move from Research to Tracked are the capabilities your governance program will be asked about within the next 18 months.
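Put together, a single risk-register entry covering all three updates might look like the sketch below; the field names and placeholder values are ours, not any vendor's disclosure format:

# Hypothetical model-risk-register entry combining the three updates.
register_entry = {
    "model": "vendor-frontier-model",       # placeholder identifier
    "vendor_tier": "below-high",            # the vendor's own classification
    "normalized_tier": "low_to_medium",     # via the mapping shown earlier
    "evidence": {
        "evaluation_methodology": "link-to-published-methodology",
        "third_party_evaluator": "e.g. UK AISI, METR, or Apollo Research",
        "classification_source": "system-card-url",
    },
    "watchlist": {
        # Research Categories to re-check at each framework revision.
        "research_categories": ["fill-from-latest-framework-version"],
        "review_cadence_days": 90,          # our assumption; tune to your program
    },
}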
How Safeguard Helps
Safeguard parses OpenAI preparedness updates, Anthropic RSP versions, and Google DeepMind Frontier Safety Framework reports into a unified vendor-risk schema, normalized to the High/Critical scale used in v2. When OpenAI publishes a new framework version or a system card update, Safeguard diffs the disclosed capability tiers and flags policy gates that need governance review. Griffin AI generates internal evaluation harnesses aligned to the published methodologies, so your numbers are comparable to the vendor's. Policy gates enforce the model-tier-to-deployment mapping (a model classified High requires VP-engineering signoff; a model in Critical territory is blocked outright). The result: vendor safety frameworks become operational gates in your environment rather than marketing artifacts in your procurement folder.