AI Security

Training Data Provenance: Griffin AI vs Mythos

Training data is a supply chain component. Knowing what went into a model is the precondition for knowing what could come out of it. Few tools track this; the few that do matter disproportionately.

Shadab Khan
Security Engineer
5 min read

Every model's behaviour is a downstream effect of its training data. When the training data is opaque, the model's behaviour has an opaque lineage. For general-purpose consumer applications, opacity is a business decision with trade-offs. For enterprise security tooling, opacity is a liability with compliance and operational consequences that grow as regulatory regimes mature. Griffin AI and Mythos-class general-purpose AI-for-security tools handle training data provenance differently, and the differences have outlasted the early years of the AI-in-security debate.

What training data provenance means

Four concrete asks:

  • Catalogue of training data sources. What corpora contributed to training. At what scale. From what time ranges.
  • Fine-tune recipe documentation. What fine-tuning, if any, was applied. On what data. With what objectives.
  • Hash-level artifact tracking. A content hash identifying the exact training snapshot that produced the deployed model weights.
  • Drift attestations. Signed statements that the deployed model is the one the provenance describes.

Each of these is an element of an AI-BOM, an AI bill of materials for the model. Together they give a security team enough provenance to reason about the model's likely behaviour.
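To make the four asks concrete, here is a rough sketch of how they could be captured in a single machine-readable record. The field names and structure are illustrative assumptions, not a standardised AI-BOM schema.

    # Illustrative AI-BOM record covering the four provenance asks above.
    # Field names and structure are assumptions, not a standardised schema.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class DataSource:
        name: str                # e.g. a corpus name (hypothetical)
        approximate_tokens: int  # scale of the contribution
        time_range: str          # e.g. "2015-2023"

    @dataclass
    class FineTuneRecipe:
        objective: str                   # what the fine-tune was optimising for
        dataset_hash: str                # content hash of the fine-tuning snapshot
        sources: List[DataSource] = field(default_factory=list)

    @dataclass
    class AIBOMRecord:
        model_identifier: str                # pinned model version string
        training_sources: List[DataSource]   # catalogue of training data sources
        fine_tune: Optional[FineTuneRecipe]  # None if no fine-tuning was applied
        training_snapshot_hash: str          # hash of the snapshot behind the weights
        drift_attestation: Optional[str]     # signature over the deployed model's identity

A record like this only answers the provenance question if the vendor can also show it describes the model actually deployed, which is what the attestation elements are for.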

How Griffin AI handles it

Griffin AI is built on top of frontier models from Anthropic (Claude family). Anthropic publishes model cards, safety research, and release notes for the model versions it ships. Griffin AI adds layers on top:

Version pinning. Each Griffin AI release is pinned to a specific Anthropic model version. Customers see the pin in the release notes.

Eval attestation. Every Griffin AI release runs the full eval harness against the pinned model version. The output is a signed attestation that the model at that version passes the regression gates.

Degraded-mode documentation. When the underlying model version changes, Griffin AI documents what changed and what customers should expect.

No silent fine-tuning. Griffin AI does not fine-tune the underlying model on customer data. Customer data stays in customer context; the model's behaviour is defined by Anthropic's training data plus Griffin's prompt-level scaffolding.

The cumulative effect: customers know exactly which model is running, what it was trained on (at the level Anthropic publishes), and what eval behaviour it has. This is not perfect transparency, since no vendor fully discloses frontier-model training data, but it is materially more provenance than an opaquely fine-tuned model offers.
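To make the attestation idea concrete: one way to picture it is as a signature over the pinned model version and a digest of the eval results. The sketch below is a minimal illustration using Ed25519 from Python's cryptography package; the payload fields, the gating logic, and the example version string are assumptions, not Griffin AI's actual release process.

    # Minimal sketch of producing a signed eval attestation for a pinned model
    # version. Payload fields, thresholds, and the example version string are
    # illustrative assumptions, not Griffin AI's actual release process.
    import hashlib
    import json
    from cryptography.hazmat.primitives.asymmetric import ed25519

    def build_attestation(model_version: str, eval_results: dict,
                          signing_key: ed25519.Ed25519PrivateKey) -> dict:
        # Gate the release: every eval suite must meet its regression threshold.
        passed = all(score >= threshold
                     for score, threshold in eval_results.values())
        payload = {
            "model_version": model_version,
            "eval_results_sha256": hashlib.sha256(
                json.dumps(eval_results, sort_keys=True).encode()).hexdigest(),
            "regression_gates_passed": passed,
        }
        canonical = json.dumps(payload, sort_keys=True).encode()
        return {"payload": payload,
                "signature": signing_key.sign(canonical).hex()}

    # Hypothetical usage: each entry maps an eval suite to (score, threshold).
    key = ed25519.Ed25519PrivateKey.generate()
    attestation = build_attestation(
        "pinned-model-2025-06-01",                  # hypothetical version string
        {"triage_accuracy": (0.94, 0.90),
         "false_positive_suppression": (0.97, 0.95)},
        key,
    )

The specifics matter less than the property: anyone holding the vendor's public key can later check that the deployed version is the one the attestation describes.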

Where Mythos-class tools typically land

Some Mythos-class general-purpose AI-for-security tools fine-tune their own models on proprietary security datasets. The fine-tuning can produce meaningful quality gains. It also introduces new provenance questions:

  • What's in the fine-tuning dataset? Proprietary security data often includes partner-contributed data, customer-contributed data, and data scraped from public disclosures. The provenance can be murky.
  • How stable is the fine-tuned model? Retraining on new data produces a new model with new behaviour. Customers experience this as drift.
  • What happens if the fine-tune pipeline is compromised? The dependency tree for a fine-tune pipeline (data, code, infrastructure) is its own supply chain attack surface.

None of this is disqualifying. It is a different provenance shape with its own trade-offs.

Why this matters for compliance

Three concrete regulatory pressures that increase the value of provenance:

  • EU AI Act requires high-risk AI systems to document training data sources and fine-tuning methodology.
  • Sector regulations (financial services, healthcare) increasingly require AI vendors to attest to training-data controls.
  • Customer procurement — enterprise buyers increasingly ask for AI-BOM and training-data attestations during vendor review.

Vendors that can produce these documents meet the asks; vendors that cannot will struggle with regulated customers.

A practical audit checklist

Five questions for any AI-for-security vendor:

  1. What model is deployed right now, and what is its version identifier?
  2. What training data (or fine-tuning data) went into this model?
  3. When the model is upgraded, how are customers notified?
  4. How is the model upgrade gated — by evals? By customer preview?
  5. Can the vendor produce a signed attestation of the currently-deployed model for our compliance team?

Vendors who answer all five concretely are ready for regulated deployment. Vendors who answer some are partially ready. Vendors who answer none are asking for pre-regulation trust that the regulators are about to take away.
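Question five is the one a compliance team can check mechanically. A minimal sketch of that check, assuming the vendor publishes an Ed25519 public key and an attestation shaped like the earlier example, with a payload and a hex-encoded signature:

    # Sketch of customer-side verification of a vendor's signed model attestation.
    # Assumes the vendor publishes an Ed25519 public key and an attestation shaped
    # like the earlier example; the shape is an assumption, not a standard.
    import json
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric import ed25519

    def verify_attestation(attestation: dict,
                           vendor_public_key: ed25519.Ed25519PublicKey,
                           expected_model_version: str) -> bool:
        canonical = json.dumps(attestation["payload"], sort_keys=True).encode()
        try:
            vendor_public_key.verify(bytes.fromhex(attestation["signature"]),
                                     canonical)
        except InvalidSignature:
            return False  # signature does not match the payload
        # The attestation must describe the model version actually deployed for us.
        return attestation["payload"]["model_version"] == expected_model_version

The point is not this particular scheme; it is that question five has a yes-or-no answer a compliance team can file.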

What to evaluate

Three concrete checks:

  1. Ask for the current model identifier in writing.
  2. Ask what changes when the model upgrades — for findings, for API responses, for customer-experienced behaviour.
  3. Ask for a sample AI-BOM or training-data attestation.

The answers distinguish vendors whose AI governance is production-ready from vendors whose AI governance is pre-regulation.

How Safeguard Helps

Safeguard's AI-BOM capability tracks model versions, training-data attestations, and fine-tune provenance for models in the customer's environment — including the models that power Griffin AI itself. Pinned model versions, eval attestations, and degraded-mode documentation are published for every Griffin AI release. For organisations whose compliance programs are starting to ask AI-BOM questions, this provenance discipline is the foundation the rest of the AI governance program sits on.
