In November 2025, OASIS Open announced the release of two operational frameworks from the Coalition for Secure AI (CoSAI): "Signing ML Artifacts: Building towards tamper-proof ML metadata records" and a companion AI Incident Response framework. CoSAI was launched in 2024 by Google, NVIDIA, Cisco, Microsoft, Anthropic, and others as an OASIS Open Project focused on standardizing AI security practices. The two frameworks released in November are the first operational deliverables — practical enough that enterprise security teams can adopt them directly rather than waiting for further iteration. This post walks through what each framework contains and how to roll them out alongside existing security programs.
What does the model signing framework actually specify?
The "Signing ML Artifacts" framework formalizes what the OpenSSF Model Signing v1.0 specification (also released in 2025) covers, but with two additions specific to enterprise deployment contexts. First, it defines a metadata-attestation chain: not just signing the model file, but attesting to the training data identity, the training configuration, the evaluation results, and the system card. Second, it provides verification policies for downstream consumers — guidance on what to check when pulling a model, when to fail closed versus open, and how to chain trust from upstream publisher through internal promotion gates.
The framework explicitly references in-toto attestation formats, the Sigstore transparency log (Rekor), and OASIS attestation chains. The output for an enterprise adopting it: every model in your AIBOM carries an attached signing manifest, a Rekor reference, and a pinned signing identity. The framework's authors include security engineers from Google, NVIDIA, Anthropic, and CoSAI's other founding members; the document represents the broadest cross-vendor agreement on model-integrity attestation that exists in 2025.
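A minimal sketch of the fail-closed pull-time check this enables, assuming an AIBOM record with the three fields named above (the field names are ours); full Sigstore signature and Rekor inclusion checks are noted in comments but left to the sigstore tooling.

# Fail-closed verification of a pulled model against its AIBOM record.
# Record field names (sha256, signing_identity, rekor_entry) are assumed,
# mirroring the fields described above.
import hashlib
from pathlib import Path

def verify_pulled_model(artifact_path: str, aibom_record: dict) -> None:
    """Raise (fail closed) unless the artifact matches the pinned digest."""
    digest = hashlib.sha256(Path(artifact_path).read_bytes()).hexdigest()
    if digest != aibom_record["sha256"]:
        raise RuntimeError(
            f"{artifact_path}: digest {digest} does not match AIBOM pin; "
            "refusing to load"
        )
    # A full promotion gate would also verify the artifact's signature
    # against aibom_record["signing_identity"] and confirm the entry
    # referenced by aibom_record["rekor_entry"] is present in the Rekor
    # transparency log.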
What is the AI Incident Response framework?
The IR framework adapts traditional incident response (NIST SP 800-61, ISO/IEC 27035) to AI-specific incident types: model jailbreak in production, prompt injection that abuses tool-call permissions to exfiltrate data, model output causing downstream harm, training-data leakage, model-weight exfiltration, and model substitution. For each incident type, the framework defines detection signals, initial containment actions, evidence-collection requirements, eradication steps, recovery procedures, and lessons-learned questions; a minimal skeleton of that structure appears below. The format will be familiar to anyone who runs a SOC: it is a runbook library, not a theoretical document.
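A sketch of that six-part structure as a reusable record; the field names paraphrase the sections listed above rather than quoting a normative schema.

# Skeleton of a per-incident-type runbook record; field names paraphrase
# the six sections the framework defines, not a normative schema.
from dataclasses import dataclass, field

@dataclass
class AIIncidentRunbook:
    incident_type: str                  # e.g. "model_weight_exfiltration"
    detection_signals: list[str] = field(default_factory=list)
    initial_containment_actions: list[str] = field(default_factory=list)
    evidence_collection_requirements: list[str] = field(default_factory=list)
    eradication_steps: list[str] = field(default_factory=list)
    recovery_procedures: list[str] = field(default_factory=list)
    lessons_learned_questions: list[str] = field(default_factory=list)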
The most operationally useful section is the evidence-collection guidance. AI incidents have evidence types that traditional IR does not handle well: model output logs, retrieval-augmented context windows, agent tool-call sequences, prompt history, and the state of any external systems the agent interacted with. The framework specifies what to capture and how long to retain it — the same way 800-61 specifies what to capture during a malware incident.
# CoSAI AI Incident Response — minimum evidence collection for an LLM incident
incident:
  type: prompt_injection_successful_exfiltration
  detected_at: "2025-11-22T14:33:00Z"
  evidence:
    request_traces:
      retention: 365_days
      required_fields:
        - request_id
        - user_id_or_session
        - model_id_and_version
        - prompt_full_text
        - system_prompt_full_text
        - rag_context_documents
        - tool_calls_with_arguments
        - tool_call_responses
        - model_response_full_text
        - safety_classifier_outputs
    upstream_evidence:
      - rag_index_state_at_incident_time
      - model_signature_verification_record
    downstream_evidence:
      - tool_target_system_logs
      - data_egress_records
  containment_actions:
    - revoke_compromised_session_token
    - quarantine_rag_documents_under_review
    - disable_affected_tool_scope
    - rotate_api_keys_for_invoked_tools
  notification:
    - security_oncall_pager
    - dpo_if_pii_exfiltrated
How does this fit with existing incident response programs?
The CoSAI framework is designed to layer onto existing IR programs rather than replace them. The recommended adoption pattern: extend your existing SIRT runbooks with AI-specific playbooks derived from the framework, train your existing on-call rotation on the new incident types, and add the evidence-collection requirements to your log retention configuration. The framework does not propose a separate AI Incident Response team — the practitioners who handle traditional security incidents are the right people to handle AI incidents, but they need new runbooks. The 2025 reality is that "AI incident" is a vague enough category that it can mean a model jailbreak, an exposed API key, a vendor outage, or a regulatory exposure. Treating them as separate incident types with separate runbooks gives the SOC a defined response surface.
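A sketch of folding those evidence-collection requirements into an existing log-retention configuration: the 365-day figure comes from the example above, while the other stream names and values are placeholders to set per your own policy.

# Overlay AI-incident retention floors onto an existing log-retention
# config. Only the 365-day trace figure comes from the example above;
# other streams and values are placeholders.
RETENTION_DAYS = {
    "llm_request_traces": 365,       # full prompt/response/tool-call traces
    "rag_index_snapshots": 90,       # placeholder: set per policy
    "tool_target_system_logs": 180,  # placeholder: set per policy
}

def apply_retention(log_config: dict) -> dict:
    """Raise each stream's retention to the AI-incident floor."""
    for stream, days in RETENTION_DAYS.items():
        # Never shorten an existing, longer retention period.
        log_config[stream] = max(days, log_config.get(stream, 0))
    return log_config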
How does signing tie into incident response?
The two frameworks are designed to interlock. A model substitution incident — where an attacker replaces a production model with a backdoored variant — is detectable in two ways: by signature verification failing at model-pull time (the signing framework's job) or by anomalous model output detected in production (the IR framework's job). The signing framework is the prevent layer; the IR framework is the detect-and-respond layer. The integration point is the AIBOM: if your AIBOM records the signing identity, the Rekor entry, and the expected SHA-256 for every production model, then any incident response activity can quickly verify whether the running artifact matches the expected one. Without that integration, you spend the first day of an incident reconstructing what should have been there.
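As a sketch of the day-one check this integration enables, assuming AIBOM records that carry a deployed path alongside the pinned digest (the field names are ours):

# First-hour IR triage: flag production models whose deployed artifact no
# longer matches the digest pinned in the AIBOM. Record fields
# (model_id, deployed_path, sha256) are assumed for illustration.
import hashlib
from pathlib import Path

def find_substituted_models(aibom_records: list[dict]) -> list[str]:
    """Return IDs of models whose running artifact fails the AIBOM pin."""
    mismatched = []
    for record in aibom_records:
        artifact = Path(record["deployed_path"])
        digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
        if digest != record["sha256"]:
            mismatched.append(record["model_id"])
    return mismatched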
What about coordinated disclosure for AI vulnerabilities?
The CoSAI work overlaps with, but does not subsume, the broader question of how to disclose AI-specific vulnerabilities. The OWASP Gen AI Security Project, the AI Vulnerability Database (AVID), and MITRE's ATLAS framework all cover related ground. CoSAI explicitly states it aims to harmonize rather than compete; the IR framework references all three as compatible upstream sources. For enterprises, the practical takeaway: use the CoSAI runbook templates, ingest CVE feeds (including AI-tagged CVEs), monitor AVID for emerging research, and use ATLAS as the threat-model reference for adversary tactics.
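One lightweight way to wire those sources together is to annotate each runbook with its upstream references. The ATLAS technique IDs below are our reading of MITRE's public matrix and should be verified against the live matrix before relying on them; the AVID search terms are placeholders.

# Cross-referencing runbook incident types with upstream threat-model
# sources. ATLAS technique IDs should be verified against the live
# matrix; the AVID queries are placeholder search terms.
UPSTREAM_REFS = {
    "prompt_injection_successful_exfiltration": {
        "atlas_technique": "AML.T0051",  # LLM Prompt Injection
        "avid_query": "prompt injection",
    },
    "model_jailbreak_in_production": {
        "atlas_technique": "AML.T0054",  # LLM Jailbreak
        "avid_query": "jailbreak",
    },
}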
What are the CoSAI member organizations actually committing to?
CoSAI's founding members include Google, NVIDIA, Cisco, Microsoft, Anthropic, IBM, OpenAI, PayPal, Intel, Amazon, and several others. The November 2025 frameworks are not just whitepapers: the founding members publicly committed to implementing them within their own AI offerings. NVIDIA's NGC catalog model signing (rolled out March 2025) is a concrete instance; Google's Secure AI Framework (SAIF) for internal model deployments incorporates the CoSAI guidance; Microsoft's responsible AI program for Azure ML overlaps significantly with it. The procurement implication for enterprise buyers: when negotiating with these vendors, the CoSAI frameworks are now part of the implicit contract. A vendor that fails to implement them is deviating from a public commitment, and that deviation is leverage in procurement and incident-response conversations.
What is the adoption timeline?
The November 2025 release is "operational draft" status: production-ready, but expected to iterate based on adopter feedback. CoSAI has explicitly invited adopters to submit case studies and recommended changes through the OASIS Open project. A realistic enterprise adoption path: pilot in a single AI team in Q1 2026, expand to all AI-using teams by Q3 2026, and integrate into procurement and vendor management by Q4 2026. The frameworks are not regulatory yet, but the trajectory matches what happened with CIS Controls, ISO 27001, and NIST CSF: industry consensus first, regulatory anchoring later. Defenders who wait for regulation will pay more to retrofit later.
How Safeguard Helps
Safeguard implements the CoSAI signing framework natively: every model in your AIBOM carries a signing manifest, a Rekor reference, and a pinned identity, with verification at both pull-time and load-time. The IR framework's evidence-collection requirements are wired into the platform's logging configuration — request traces, prompt history, tool-call sequences, and RAG context windows are captured with the retention periods CoSAI specifies. Griffin AI generates AI-incident runbooks for each of your deployed model surfaces, derived from the CoSAI templates and customized to your tool-call inventory. Policy gates block deployments that lack a complete signing chain or that do not configure the required logging surface, so the frameworks become operational policy rather than aspirational documentation. The result: when an AI incident happens, your team executes a tested runbook instead of inventing one under pressure.