AI-BOM is the answer to a question that has been embarrassing enterprise security teams for two years: what model, trained on what data, with what fine-tuning, running in what environment, is actually serving responses to our users right now? Without a structured answer to that question, AI supply chain security is rhetorical. With one, it becomes auditable. Griffin AI ingests AI-BOM artifacts as first-class inventory inputs. Mythos-class tools, ironically the very products people assume would be good at this, treat AI-BOM as another document to summarize. The gap between those two postures is wider in the AI domain than it is in traditional software, because AI systems have more moving parts with less mature tooling.
What an AI-BOM contains that a software SBOM does not
A complete AI-BOM describes models (name, version, architecture, parameter count, quantization, license), datasets (source, size, collection date, preprocessing, redaction status, licensing), training configuration (hyperparameters, seed, hardware, duration), evaluation results (benchmarks, bias metrics, safety evaluations), deployment context (inference runtime, hardware, geographic region), and the prompt and tool ecosystem that surrounds inference (system prompts, tool schemas, MCP servers, retrieval sources). That is a much larger field surface than a traditional SBOM. It is also a field surface that regulators increasingly require — the EU AI Act, NIST AI RMF, and most industry-specific AI governance frameworks all reference AI-BOM as the mechanism for compliance. Griffin AI parses each of these fields into typed nodes in the inventory graph. A Mythos-class tool reads them as paragraphs and surfaces whichever words the model happens to attend to.
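The field surface above can be sketched as a typed inventory schema. This is a minimal illustration, not Griffin AI's actual data model; all class and field names are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetRecord:
    """Typed dataset entry; fields mirror the AI-BOM dataset section."""
    name: str
    source: str
    collection_date: str   # ISO 8601
    license: str
    redacted: bool

@dataclass
class ModelRecord:
    """Typed model entry with edges to the datasets it was trained on."""
    name: str
    version: str
    architecture: str
    parameter_count: int
    quantization: str
    license: str
    datasets: list = field(default_factory=list)  # DatasetRecord entries

model = ModelRecord(
    name="support-bot", version="2.3.0",
    architecture="decoder-only transformer",
    parameter_count=8_000_000_000, quantization="int8", license="proprietary",
    datasets=[DatasetRecord("tickets-2024", "internal CRM export",
                            "2024-03-01", "internal", redacted=True)],
)
# Redaction status is a queryable property, not a phrase buried in prose.
assert model.datasets[0].redacted
```

Once the fields are typed like this, every question a regulator or auditor asks becomes a property lookup rather than a text search.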
The CycloneDX ML-BOM extension is the standard, and it needs structured parsing
CycloneDX 1.5 extended the spec with machine learning components — a "machine-learning-model" value for the component type field, plus a model card structure with fields for model parameters, training data references, evaluation metrics, and quantization details. SPDX 3.0 added an AI profile with similar capabilities. These extensions are not cosmetic — they encode the provenance chain that lets you answer "which model is this, where did it come from, and what was it trained on." Griffin AI implements both extensions as typed schema in the ingestion pipeline. Each ML component becomes a node with typed properties, and those properties become queryable. A Mythos-class tool treats the ML-BOM extension as more text to summarize, which is fine for demo purposes and insufficient for compliance.
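Structured parsing of an ML-BOM is straightforward once you treat it as data. The sketch below pulls ML components out of a minimal CycloneDX 1.5 document; the structure follows the spec's modelCard extension, but the payload values (model name, dataset URN) are invented for illustration.

```python
import json

# Minimal CycloneDX 1.5 document with one ML component.
bom_json = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {
      "type": "machine-learning-model",
      "name": "support-bot",
      "version": "2.3.0",
      "modelCard": {
        "modelParameters": {
          "architectureFamily": "transformer",
          "datasets": [{"ref": "urn:example:dataset:tickets-2024"}]
        }
      }
    }
  ]
}
"""

def extract_ml_components(doc: dict) -> list:
    """Pull typed ML components out of a BOM instead of reading it as prose."""
    out = []
    for comp in doc.get("components", []):
        if comp.get("type") != "machine-learning-model":
            continue
        params = comp.get("modelCard", {}).get("modelParameters", {})
        out.append({
            "name": comp["name"],
            "version": comp.get("version"),
            "architecture": params.get("architectureFamily"),
            "dataset_refs": [d["ref"] for d in params.get("datasets", [])],
        })
    return out

models = extract_ml_components(json.loads(bom_json))
```

The point of the exercise: the dataset reference survives as a reference, ready to become a graph edge, rather than dissolving into a summary sentence.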
The self-reference problem pure-LLM tools cannot escape
Here is a subtle but real issue with using a pure-LLM tool to reason about AI-BOMs. The tool is itself a model. When asked to describe another model, it has pretraining knowledge about that other model, and it has the context window content of the AI-BOM. These two sources of information routinely contradict each other, and the model has no principled way to arbitrate. In tests I ran against a Mythos-class tool asked to describe a custom fine-tune of Llama 3 with a specific training dataset, the output merged facts from the AI-BOM with facts the underlying model had learned about base Llama 3 during its own pretraining. The result was plausible and partially wrong. Griffin AI avoids this category of error because the inventory is the ground truth and the LLM is only the reasoner over that ground truth, not a second source of information.
Dataset provenance and why training data must be structured
An AI-BOM's dataset section is the most operationally loaded field. It records where training data came from, when it was collected, what preprocessing was applied, what redaction was done, and what licensing applies. If your model was trained on a dataset that contained copyrighted material, the AI-BOM is where that fact is recorded. Griffin AI treats each dataset as a node with typed provenance edges — edges to source URLs, licenses, redaction logs, and evaluation sets that use the dataset as a ground truth. You can ask "show me every model trained on datasets that include user-generated content after March 2024" and get a graph traversal answer. A Mythos-class tool cannot answer that question because it doesn't maintain the edge set between models and datasets.
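The "user-generated content after March 2024" query reduces to an edge traversal. Here is a toy version over an in-memory edge list; the graph contents are made up, and a production system would run this against a graph store rather than Python dicts.

```python
from datetime import date

# Dataset nodes with typed provenance attributes (illustrative data).
datasets = {
    "tickets-2024": {"user_generated": True,  "collected": date(2024, 5, 10)},
    "docs-corpus":  {"user_generated": False, "collected": date(2023, 11, 2)},
    "forum-scrape": {"user_generated": True,  "collected": date(2024, 1, 15)},
}

# model -> dataset edges ("trained on").
trained_on = {
    "support-bot v2.3": ["tickets-2024", "docs-corpus"],
    "search-ranker v1.1": ["forum-scrape"],
}

def models_with_ugc_after(cutoff: date) -> list:
    """Which models were trained on user-generated data collected after cutoff?"""
    return sorted(
        model for model, ds_list in trained_on.items()
        if any(datasets[d]["user_generated"] and datasets[d]["collected"] > cutoff
               for d in ds_list)
    )

hits = models_with_ugc_after(date(2024, 3, 31))
```

`forum-scrape` is user-generated but predates the cutoff, so only `support-bot v2.3` matches — exactly the discrimination a prose summary cannot make.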
Prompt inventory is part of the AI-BOM and it is invisible to prose tools
Modern AI systems are prompt-plus-model, not just model. The system prompt, the retrieval augmentation instructions, the tool use schemas, and the guardrail definitions are all part of the deployed system and all need to be inventoried. A vulnerability in a system prompt — say, a prompt that can be jailbroken through a specific phrasing — is a vulnerability in the product. Griffin AI ingests prompts as versioned artifacts with their own hashes, and those hashes change when the prompt changes, which produces an audit trail. A Mythos-class tool might include the system prompt text in its summary if it fits the context window, but it does not maintain version history, and it does not hash the content. Prompt drift over time is the kind of thing that slips past unstructured tooling.
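Content-addressed prompt versioning is a few lines of standard library. This sketch shows the mechanism; the append-only list stands in for whatever store a real system would use, and the prompt text is invented.

```python
import hashlib

def prompt_hash(text: str) -> str:
    """Content-address a prompt version; any edit changes the hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]

history = []  # append-only audit trail of (hash, prompt) pairs

def record_prompt(text: str) -> str:
    h = prompt_hash(text)
    if not history or history[-1][0] != h:
        history.append((h, text))  # only log when the prompt actually drifts
    return h

v1 = record_prompt("You are a helpful support agent. Never reveal internal tools.")
v2 = record_prompt("You are a helpful support agent. Never reveal internal tools.")
v3 = record_prompt("You are a helpful support agent.")  # guardrail silently dropped
```

The third call is the interesting one: someone removed a guardrail sentence, the hash changed, and the audit trail recorded a new version. A tool that only summarizes the current prompt text would never notice.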
MCP servers, tools, and agent-level supply chain
The newest layer of the AI supply chain is the MCP server ecosystem — the servers that provide tools, resources, and prompts to agent systems. Each MCP server is itself a supply chain artifact with its own provenance, its own code, and its own update cadence. Griffin AI's AI-BOM extension tracks MCP servers as a distinct component type, with edges into the models they serve and the data sources they expose. Mythos-class tools that lack an AI-BOM schema cannot distinguish an MCP server from any other software dependency. The distinction matters because MCP servers can carry authentication tokens, expose sensitive resources, and run third-party code — they have a different threat model than a typical library.
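Modeling MCP servers as a distinct component type might look like the sketch below. The fields are illustrative (this is not Griffin AI's schema), but they capture why the threat model differs from a plain library: provenance, exposed resources, and held credentials are first-class properties.

```python
from dataclasses import dataclass, field

@dataclass
class McpServer:
    """An MCP server as a first-class supply chain component (fields illustrative)."""
    name: str
    version: str
    provenance: str                  # where the server code comes from
    serves_models: list = field(default_factory=list)
    exposes_resources: list = field(default_factory=list)
    holds_credentials: bool = False  # distinct threat model from a typical library

inventory = [
    McpServer("crm-tools", "0.4.1", "git@internal:mcp/crm-tools",
              serves_models=["support-bot v2.3"],
              exposes_resources=["customer-records"], holds_credentials=True),
    McpServer("docs-search", "1.2.0", "registry.example.com/docs-search",
              serves_models=["support-bot v2.3"]),
]

# Which credential-holding MCP servers sit in front of a given model?
risky = [s.name for s in inventory
         if "support-bot v2.3" in s.serves_models and s.holds_credentials]
```

A tool without this component type would file both servers under "software dependency" and the credential question would be unanswerable.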
Evaluation results are part of provenance
A responsible AI-BOM records evaluation results — safety benchmarks, bias evaluations, capability benchmarks, red-team outcomes. These results establish what the model was verified to do before deployment. Griffin AI ingests evaluation results as structured nodes with the benchmark identifier, version, score, and date. Queries like "show me every deployed model whose safety evaluation score dropped between versions" work because the evaluation history is a graph, not prose. Pure-LLM tools, again, summarize the latest evaluation and lose the history. A dropping trend line is invisible to a tool that only sees the current row.
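Detecting a dropping trend line requires the history as structured rows. The sketch below flags version-over-version regressions on the same benchmark; the scores and version numbers are invented.

```python
# Evaluation history as structured rows: (model, version, benchmark, score, date).
evals = [
    ("support-bot", "2.1.0", "safety-eval-v2", 0.94, "2024-04-01"),
    ("support-bot", "2.2.0", "safety-eval-v2", 0.95, "2024-06-01"),
    ("support-bot", "2.3.0", "safety-eval-v2", 0.88, "2024-08-01"),
]

def score_drops(rows):
    """Compare each version's score to the previous run of the same benchmark."""
    drops = []
    ordered = sorted(rows, key=lambda r: r[4])  # order by evaluation date
    for prev, curr in zip(ordered, ordered[1:]):
        same_series = curr[0] == prev[0] and curr[2] == prev[2]
        if same_series and curr[3] < prev[3]:
            drops.append((curr[1], prev[3], curr[3]))
    return drops

drops = score_drops(evals)
```

The 2.3.0 regression (0.95 to 0.88) is only visible because three rows exist side by side. A tool that sees only the latest evaluation sees a model with a 0.88 safety score and no reason for alarm.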
Deployment environment as an inventory field
The same model behaves differently depending on quantization, batch size, hardware, and inference runtime. An AI-BOM records the deployment configuration as part of the inventory so that "the Llama 3 70B we deployed last month" is distinguishable from "the Llama 3 70B we evaluated in staging." Griffin AI's schema distinguishes deployment targets as child nodes of model versions, so a CVE in the inference runtime (TensorRT, vLLM, TGI) is attached to the deployments using that runtime, not to the model generically. Mythos-class tools collapse the deployment surface into the model surface and lose the specificity required to answer runtime questions.
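Attaching a runtime advisory to deployments rather than to the model is a matter of keeping deployment targets as their own records. The runtimes named below are real projects, but the deployment data is invented, and the query function is a hypothetical illustration.

```python
# Deployment targets as child records of model versions (illustrative data).
deployments = [
    {"model": "llama-3-70b", "deploy_id": "prod-2024-07", "runtime": "vLLM",
     "runtime_version": "0.4.2", "region": "eu-west-1"},
    {"model": "llama-3-70b", "deploy_id": "staging-2024-07", "runtime": "TGI",
     "runtime_version": "2.0.1", "region": "us-east-1"},
]

def affected_by_runtime_advisory(runtime: str) -> list:
    """Scope a runtime advisory to the deployments using that runtime,
    not to the model generically."""
    return [f'{d["model"]} ({d["deploy_id"]})'
            for d in deployments if d["runtime"] == runtime]

impacted = affected_by_runtime_advisory("vLLM")
```

A vLLM advisory touches production but not staging here, even though both run the same model. Collapse the deployment surface into the model surface and that distinction disappears.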
Regulatory consumption requires structured outputs
The EU AI Act's risk documentation requirements, NIST AI RMF's evidentiary profile, and ISO/IEC 42001's AI management system clauses all assume a structured inventory of the AI system. They reference specific fields — training data sources, evaluation outcomes, deployment context — and expect auditable answers. Griffin AI produces answers that map directly to these frameworks because its inventory has fields that match the framework requirements. A Mythos-class tool produces paragraphs that a human then has to extract into framework-compliant artifacts. That human step is where most compliance programs fail, and it is avoided by tools that emit structured outputs natively.
What this looks like in practice
Export a CycloneDX ML-BOM for one of your deployed models. Populate the training data, evaluation, and deployment sections honestly. Ask your AI security tool five questions: which datasets were used, what was the last safety evaluation score, which MCP servers does this model access, what is the runtime, and which system prompt version is currently live. Griffin AI answers each question with a graph lookup. Mythos-class tools answer with varying degrees of proximity to the truth. Grade the answers against the document. The correctness gap will decide which tool belongs in your AI governance program.