
AIBOM in 2026: Treating AI Models as a Software Supply Chain

The AI bill of materials is graduating from optional security artifact to procurement requirement. Here is what AIBOM/ML-BOM actually tracks in 2026, how it ties to the EU AI Act, and where it still falls short.

Priya Mehta
AI Policy Analyst

For two decades, the software supply chain meant code: packages, libraries, container layers, transitive dependencies you never chose but inherited anyway. The SBOM exists because we learned, painfully, that you cannot secure what you cannot enumerate. Log4Shell was the lesson nobody forgot.

In 2026, the same logic is being applied to models. A fine-tuned model pulled off Hugging Face is a dependency. The dataset it trained on is a dependency. The base weights, the tokenizer, the inference framework, the eval harness — all dependencies, and most of them arrive with provenance that ranges from thin to nonexistent. The AI bill of materials, or AIBOM, is the industry's attempt to drag that opacity into the light. This is a trend worth taking seriously, and also one worth being honest about: the standards are maturing fast, but the data feeding them is still mostly aspirational.

What an AIBOM Actually Tracks

A traditional SBOM lists components and versions. An AIBOM extends that vocabulary to the things that make a model a model. In practice, a useful AIBOM captures:

  • Model lineage — the base model, fine-tuning steps, and the weights themselves, ideally with cryptographic hashes so you can prove the artifact you are running is the artifact that was attested.
  • Training data summaries — what datasets contributed, their licenses, and known provenance gaps.
  • Training and inference configuration — architecture, hyperparameters, framework versions, and the compute used.
  • Evaluation and risk artifacts — benchmark results, bias assessments, and any documented limitations or human-oversight mechanisms.
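As a rough illustration of the list above, here is a minimal sketch of how one model entry might be assembled and later checked against its recorded hash. The layout is loosely modeled on CycloneDX ML-BOM, but the key names are an approximation and should be validated against the official schema. The `example:base-model` property, the function names, and the sample values are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large weight files never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_model_component(weights_path: Path, name: str, version: str,
                          base_model: str, datasets: list[dict],
                          limitations: list[str]) -> dict:
    """Assemble one model entry (illustrative subset, not schema-exact)."""
    return {
        "type": "machine-learning-model",
        "name": name,
        "version": version,
        "hashes": [{"alg": "SHA-256", "content": sha256_of_file(weights_path)}],
        # Lineage recorded as a free-form property; the name is hypothetical.
        "properties": [{"name": "example:base-model", "value": base_model}],
        "modelCard": {
            "modelParameters": {"datasets": datasets},
            "considerations": {"technicalLimitations": limitations},
        },
    }


def verify_weights(component: dict, weights_path: Path) -> bool:
    """Return True only if the file on disk matches the attested SHA-256."""
    recorded = {h["alg"]: h["content"] for h in component["hashes"]}
    return recorded.get("SHA-256") == sha256_of_file(weights_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        weights = Path(tmp) / "model.safetensors"
        weights.write_bytes(b"stand-in for real weights")
        entry = build_model_component(
            weights, "acme/support-ft", "1.0.0", "openbase-7b",
            datasets=[{"name": "support-tickets-2024", "license": "internal"}],
            limitations=["English only"],
        )
        print(verify_weights(entry, weights))  # True while the file is unchanged
```

The point of the hash is the second function: anyone holding the bill of materials can confirm that the weights they are about to load are the ones that were documented.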

The reason this matters is concrete. If a base model is later found to be poisoned, or a training dataset is discovered to contain copyrighted or unsafe material, an AIBOM is the difference between "we can trace exactly which of our products are affected in an afternoon" and "we have no idea, please stand by for several weeks." Model poisoning and data poisoning are not hypothetical attack classes; they are the AI-native version of the dependency-confusion and typosquatting attacks that already plague package registries.

The blast radius is also wider than people expect. A single popular open-weight model gets fine-tuned, quantized, merged, and re-uploaded thousands of times, and each derivative inherits whatever was wrong with its parent. That is precisely the transitive-dependency problem that made SBOMs necessary for code, except the dependency tree is now made of weights nobody can easily diff. Without a bill of materials, "which of our deployed models descend from this compromised base" is an unanswerable question.
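If lineage is recorded, that question becomes a graph walk. The sketch below assumes each AIBOM yields a list of parent models (more than one for merges) and finds everything downstream of a compromised base. The model names and the `lineage` structure are hypothetical.

```python
from collections import defaultdict, deque


def descendants_of(compromised: str, parents_of: dict[str, list[str]]) -> set[str]:
    """Return every model that directly or transitively derives from `compromised`."""
    # Invert the child -> parents records into a parent -> children index.
    children = defaultdict(list)
    for model, parents in parents_of.items():
        for parent in parents:
            children[parent].append(model)

    affected: set[str] = set()
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for child in children[current]:
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# Hypothetical lineage pulled from a set of AIBOMs: model -> its parent models.
lineage = {
    "acme/support-ft": ["openbase-7b"],
    "acme/support-bot-v2": ["acme/support-ft"],
    "openbase-7b-int4": ["openbase-7b"],
    "acme/search-ranker": ["otherbase-3b"],
    "acme/merged-assistant": ["acme/support-ft", "otherbase-3b"],
}
deployed = {"acme/support-bot-v2", "acme/search-ranker", "acme/merged-assistant"}

exposed = descendants_of("openbase-7b", lineage) & deployed
print(sorted(exposed))
```

Here the merged model is caught through only one of its two parents, which is the case that is easiest to miss by hand.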

The Standards Are Real Now — Pick Two

The encouraging part of the 2026 story is that the formats stopped being whiteboard diagrams. There are two that matter, and serious teams are using both.

CycloneDX ML-BOM, maintained under OWASP, added machine-learning support back in version 1.5 and has continued to mature. It is built to be generated automatically inside CI/CD pipelines, which is exactly where you want a bill of materials produced — as a byproduct of the build, not as a document somebody fills in by hand three months later. It represents models, datasets, and their dependencies, including dataset provenance and framework configuration.

SPDX 3.0, from the Linux Foundation and released in 2024, added an AI Profile and a Dataset Profile. Its advantage is regulatory weight: SPDX has been published as an international standard (ISO/IEC 5962:2021, which covers the earlier 2.2.1 specification), which makes it the more natural choice when you have to hand something to an auditor or a regulator.

The pragmatic pattern emerging this year is to generate CycloneDX ML-BOM internally for engineering and CI/CD use, and to require or produce SPDX 3.0 AI Profile artifacts when you are dealing with external vendors and filings. Tooling such as Protobom and BomCTL can translate between the two, so the choice is less either/or than it first appears. For teams building on Hugging Face, the OWASP AIBOM Generator introduced at RSAC 2025 is the fastest way to see what a real AIBOM looks like rather than arguing about it in the abstract.

The EU AI Act Turned This From Nice-to-Have to Obligation

The regulatory forcing function is no longer speculative. As of 2 August 2025, the EU AI Act's obligations for providers of general-purpose AI (GPAI) models entered into application. Providers placing a GPAI model on the market after that date must comply now; providers of models already on the market before that date have until 2 August 2027 to fall in line.

What the Act requires reads almost like an AIBOM specification written by lawyers. GPAI providers must maintain technical documentation covering model architecture, training methodology, the computational resources used, and training-data summaries, along with known limitations. They must give downstream providers enough information to integrate the model compliantly, implement policies to respect EU copyright law including the text-and-data-mining opt-out, and publish a sufficiently detailed summary of training content using a template from the AI Office. The technical documentation has to be kept up to date and provided to the AI Office and national competent authorities on request.

Read that list again with a security hat on. "Information to downstream providers so they can integrate compliantly" is, functionally, a machine-readable bill of materials. The regulation does not mandate a specific file format, but the obligations map so cleanly onto AIBOM fields that producing one is the path of least resistance to compliance. That alignment is the single biggest reason AIBOM moves from optional to expected in 2026.
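To make the mapping concrete, here is a minimal sketch of a gap check that takes the documentation items named above and looks for each one in an AIBOM-like record. The field paths are hypothetical and do not come from the Act, CycloneDX, or SPDX. This is a coverage aid under stated assumptions, not a compliance test.

```python
# Documentation items from the paragraph above, mapped to hypothetical
# dotted paths inside an internal AIBOM record.
REQUIRED_ITEMS = {
    "model architecture": "model.architecture",
    "training methodology": "training.methodology",
    "computational resources": "training.compute",
    "training-data summary": "data.summary",
    "known limitations": "model.limitations",
}


def lookup(record: dict, dotted_path: str):
    """Follow a dotted path through nested dicts; return None if any step is missing."""
    current = record
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def documentation_gaps(aibom: dict) -> list[str]:
    """List the documentation items that are absent or empty in the record."""
    gaps = []
    for item, path in REQUIRED_ITEMS.items():
        value = lookup(aibom, path)
        is_empty = hasattr(value, "__len__") and len(value) == 0
        if value is None or is_empty:
            gaps.append(item)
    return gaps


sample = {
    "model": {"architecture": "decoder-only transformer", "limitations": []},
    "training": {"methodology": "supervised fine-tuning on curated instructions"},
    "data": {"summary": "see published training-content summary"},
}
print(documentation_gaps(sample))
```

A check like this only confirms that a field is populated. Whether the content is adequate is still a human judgment.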

Where the Honesty Has to Come In

None of this is a finished story, and pretending otherwise does the field no favors.

The first problem is that an AIBOM is only as good as its inputs, and model provenance is genuinely hard to establish after the fact. A traditional package has a registry, a version, and a hash. A fine-tuned model downloaded from a community hub may have none of that lineage documented, and you cannot retroactively manufacture a clean training-data summary for weights someone else produced. The format gives you a place to record provenance; it does not create provenance that was never captured.

Second, there is a real risk of theater. A signed, schema-valid AIBOM that asserts "trained on proprietary data, details unavailable" satisfies a checkbox without reducing any risk. The artifact existing is not the same as the artifact being useful. Auditors and procurement teams will need to get good at reading AIBOMs critically rather than just confirming one was attached.
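One modest defense is to lint for substance as well as schema. The sketch below walks an AIBOM-like structure and reports every text field that looks like a placeholder. The pattern list is a hypothetical starting point, and a heuristic like this will produce both false positives and misses.

```python
import re

# Hypothetical phrases that suggest a field was filled in to pass validation.
PLACEHOLDER_PATTERNS = [
    r"\bunavailable\b",
    r"\bnot disclosed\b",
    r"\bunknown\b",
    r"\btbd\b",
    r"\bn/a\b",
]
PLACEHOLDER_RE = re.compile("|".join(PLACEHOLDER_PATTERNS), re.IGNORECASE)


def walk_strings(node, path=""):
    """Yield (path, text) for every string nested inside dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk_strings(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk_strings(value, f"{path}[{index}]")
    elif isinstance(node, str):
        yield path, node


def placeholder_fields(aibom) -> list[str]:
    """Return the paths of fields whose text matches a placeholder pattern."""
    return [path for path, text in walk_strings(aibom) if PLACEHOLDER_RE.search(text)]


sample = {
    "data": {"summary": "Trained on proprietary data, details unavailable"},
    "model": {"architecture": "decoder-only transformer"},
    "evals": ["TBD"],
}
print(placeholder_fields(sample))
```

A document that validates against the schema but fails this kind of check is a prompt for a conversation with the vendor, not an automatic rejection.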

Third, the dataset layer remains the weakest link. Enumerating datasets is one thing; verifying that a dataset is what it claims to be, free of poisoned samples or unlicensed content, is a far harder problem that no bill of materials format solves on its own. AIBOM tells you what to go check. It does not do the checking.

How Safeguard Helps

Safeguard treats models as first-class supply-chain components, not a separate universe. Our AIBOM/ML-BOM capability generates and ingests CycloneDX and SPDX artifacts alongside your existing SBOMs, captures model and dataset provenance with attestation, and runs the inputs through the same policy gates and vendor scorecards you already use for third-party risk — so a poisoned base model or an unlicensed dataset trips a gate before it ships, not after. Because the platform is model-agnostic, components like OpenAI Daybreak or Anthropic Mythos plug in as exactly that — components — while the verification and orchestration layer above them does the multi-agent checking that keeps false positives down. If you are trying to make AIBOM mean something more than a checkbox, reach out.
