SPDX is the older of the two major SBOM formats and the more prescriptive. It predates CycloneDX by several years, it is an ISO standard (ISO/IEC 5962:2021), and it is the format that shows up in government procurement language. Regulators cite it by name. It also has a specification depth that rewards formal parsing and punishes loose interpretation. When I watch a Mythos-class tool handle an SPDX tag-value file, I am watching a language model try to pattern-match against a format designed for deterministic consumption. Sometimes it gets the right answer. Often it does not. Griffin AI's approach is to parse SPDX into its canonical object model, which is how the format was meant to be consumed.
What SPDX actually specifies that LLMs miss
An SPDX document has seven sections: document creation information, packages, files, snippets, relationships, annotations, and other licensing information. Each section has required and optional fields, and several fields have formal syntactic constraints — license expressions follow the SPDX license expression grammar, package verification codes are SHA-1 hashes over a canonical file list, and relationships use an enumerated vocabulary of 40-plus relationship types. A token-based reader handles the package names well because they are the most frequent substrings in the document. It handles the relationship types poorly because they are infrequent and semantically overloaded. Griffin AI's parser implements the relationship vocabulary as a typed enum, so when an SPDX document says DESCENDANT_OF or DYNAMIC_LINK or BUILD_DEPENDENCY_OF, the engine treats those as distinct edge types. A Mythos-class tool flattens them into "these things are related" and loses the semantics that make relationships useful.
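A minimal sketch of what treating the vocabulary as a typed enum looks like. The enum members below are actual SPDX 2.3 relationship types, but the parsing function and its shape are illustrative assumptions, not Griffin AI's internal API — the point is that an unknown relationship type fails loudly instead of degrading into "related somehow."

```python
from enum import Enum

class RelationshipType(Enum):
    # A small subset of the 40-plus types defined in SPDX 2.3
    DESCENDANT_OF = "DESCENDANT_OF"
    DYNAMIC_LINK = "DYNAMIC_LINK"
    STATIC_LINK = "STATIC_LINK"
    BUILD_DEPENDENCY_OF = "BUILD_DEPENDENCY_OF"
    TEST_DEPENDENCY_OF = "TEST_DEPENDENCY_OF"
    RUNTIME_DEPENDENCY_OF = "RUNTIME_DEPENDENCY_OF"

def parse_relationship(line: str):
    """Parse a tag-value line such as
    'Relationship: SPDXRef-A DYNAMIC_LINK SPDXRef-B'."""
    _, value = line.split(":", 1)
    src, rel, dst = value.split()
    # ValueError here means the document uses a type outside the vocabulary
    return src, RelationshipType(rel), dst

src, rel, dst = parse_relationship("Relationship: SPDXRef-A DYNAMIC_LINK SPDXRef-B")
```

Because each edge carries a distinct type, downstream queries can filter on it rather than on a flattened "depends on" notion.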
Package verification codes and why cryptographic integrity matters
Every SPDX package can carry a verification code — a SHA-1 computed over the canonical sort of the SHA-1s of the files in the package. The point of this field is to let you prove that the package you analyzed is byte-for-byte identical to the package you shipped. Griffin AI verifies the code at ingestion, and if the claimed code does not match the recomputed code, the package is flagged with a tamper indicator. A Mythos-class tool reads the verification code as a hex string and has no mechanism to actually verify it, so tampered SBOMs pass through unnoticed. This is not a hypothetical concern. In two of the supply chain incidents we helped investigate last year, the ground truth about attacker tampering was recovered from verification code mismatches. A tool that cannot compute the code cannot detect the mismatch.
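The algorithm itself is small, which is what makes skipping it inexcusable. This is a sketch of the SPDX 2.x package verification code as the spec defines it — sort the lowercase hex SHA-1s of the package's files, concatenate, and SHA-1 the result — with a hypothetical tamper check layered on top; the function names are mine, not the spec's or Griffin's.

```python
import hashlib

def verification_code(file_sha1s: list[str]) -> str:
    """SPDX package verification code: SHA-1 over the ASCII-sorted
    concatenation of the per-file SHA-1 hex digests."""
    concatenated = "".join(sorted(file_sha1s))
    return hashlib.sha1(concatenated.encode("ascii")).hexdigest()

def is_tampered(claimed: str, file_sha1s: list[str]) -> bool:
    """Flag a package whose claimed code does not recompute."""
    return claimed.lower() != verification_code(file_sha1s)
```

Note the order-independence: two tools that walk the file tree in different orders still agree on the code, which is exactly what makes it usable as a cross-tool integrity check.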
License expressions are a language, not a list
The SPDX license expression grammar allows compound expressions like (MIT OR Apache-2.0) AND (GPL-2.0-only WITH Classpath-exception-2.0). These are boolean formulas over license identifiers, not strings to match against. Griffin AI parses license expressions into an AST and evaluates them against policy. A license policy that says "reject any component where GPL-2.0-only is a required license" can be answered correctly by evaluating the expression tree. A pure-LLM tool does substring matching, which means it either over-flags (GPL appears somewhere in the expression, therefore the component is GPL-encumbered, even when GPL-2.0-only sits in an OR branch next to a permissive alternative) or under-flags (Apache-2.0 appears somewhere, so the tool assumes the component can be used under Apache alone, even when the expression requires it AND a copyleft license). Both failure modes have produced real policy violations in the field. Structured grammar evaluation eliminates this class of error.
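To make "required license" concrete, here is one way to evaluate it over an expression tree. The tuple-based AST and the `requires` function are an illustration under my own assumptions, not Griffin AI's representation: a license is required when every way of satisfying the expression involves it, which means AND propagates requirement from either branch while OR demands it in both.

```python
def requires(expr, lic: str) -> bool:
    """True if every way of satisfying `expr` involves license `lic`."""
    op = expr[0]
    if op == "LIC":                # leaf: a single license identifier
        return expr[1] == lic
    if op == "WITH":               # exception qualifies its license operand
        return requires(expr[1], lic)
    left, right = expr[1], expr[2]
    if op == "AND":                # both sides apply → required if in either
        return requires(left, lic) or requires(right, lic)
    if op == "OR":                 # a choice → required only if in both arms
        return requires(left, lic) and requires(right, lic)
    raise ValueError(f"unknown operator {op!r}")

# (MIT OR Apache-2.0) AND (GPL-2.0-only WITH Classpath-exception-2.0)
tree = ("AND",
        ("OR", ("LIC", "MIT"), ("LIC", "Apache-2.0")),
        ("WITH", ("LIC", "GPL-2.0-only"), "Classpath-exception-2.0"))
```

On this tree, GPL-2.0-only is required (it sits under an AND with no alternative) while MIT is not (Apache-2.0 is an acceptable substitute) — exactly the distinction substring matching cannot draw.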
Relationships are the graph — and they are where pure-LLM tools surrender
SPDX relationships deserve their own section because they are simultaneously the most powerful feature of the spec and the least accessible to token-based readers. Relationships like STATIC_LINK, DYNAMIC_LINK, TEST_DEPENDENCY_OF, RUNTIME_DEPENDENCY_OF, and DEV_DEPENDENCY_OF let you distinguish — at the edge level — what kind of coupling exists between two components. Griffin AI uses these to answer questions that matter to reachability analysis, like "show me only runtime-reachable vulnerabilities" or "filter out test-only dependencies from the production risk report." A Mythos-class tool reading the same document cannot do this filtering because it has collapsed the edge types into a single "depends on" concept. The outcome is a security report that treats your test fixtures as production risk surface, which inflates counts and erodes trust.
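The production-risk filtering described above reduces to a graph walk over typed edges. This sketch assumes a flat list of (src, type, dst) tuples standing in for a real graph store, and it restricts itself to DEPENDENCY_OF-style edges whose direction the spec defines as "A is a dependency of B"; the set names and function are hypothetical.

```python
RUNTIME_TYPES = {"RUNTIME_DEPENDENCY_OF"}          # edges that count in prod
# TEST_DEPENDENCY_OF and DEV_DEPENDENCY_OF edges are deliberately not walked

def runtime_reachable(edges, root):
    """Collect components reachable from `root` via runtime edges only."""
    reached, stack = set(), [root]
    while stack:
        node = stack.pop()
        for src, rel, dst in edges:
            if dst == node and rel in RUNTIME_TYPES and src not in reached:
                reached.add(src)
                stack.append(src)
    return reached

edges = [
    ("libA", "RUNTIME_DEPENDENCY_OF", "app"),
    ("fixture", "TEST_DEPENDENCY_OF", "app"),
    ("libB", "RUNTIME_DEPENDENCY_OF", "libA"),
]
```

A tool that has collapsed all three edges into "depends on" would report the test fixture in the same set as libA and libB; with typed edges, the walk excludes it by construction.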
Snippets and file-level SBOMs
SPDX supports snippet-level granularity — you can describe a range of bytes within a file, attach a license to that range, and record its origin. This matters for codebases that vendor pieces of open source into proprietary files, which is more common in firmware and embedded work than the JavaScript world likes to admit. Griffin AI supports snippet ingestion as sub-file nodes in the graph, which means a vulnerability in a vendored function can be traced to the specific byte range. A Mythos-class tool ignores snippets entirely because they don't fit the component-list mental model. If your codebase has snippet-level SPDX data, a pure-LLM tool is throwing it away.
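Tracing a finding to a byte range is straightforward once snippets are modeled as sub-file nodes. The dataclass below is an assumed shape, not Griffin AI's schema; it keeps only the SPDX snippet fields needed to answer "which vendored snippet covers this offset."

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Snippet:
    spdx_id: str
    file_spdx_id: str      # the file this snippet lives inside
    byte_start: int        # inclusive, per the SPDX snippet byte range
    byte_end: int          # inclusive
    license_concluded: str

def snippet_at(snippets, file_id: str, offset: int) -> Optional[Snippet]:
    """Return the snippet covering byte `offset` of file `file_id`, if any."""
    for s in snippets:
        if s.file_spdx_id == file_id and s.byte_start <= offset <= s.byte_end:
            return s
    return None

snippets = [Snippet("SPDXRef-Snip1", "SPDXRef-File1", 310, 420, "LGPL-2.1-only")]
```

With this index in place, a vulnerability located at a byte offset inside a proprietary file resolves to the vendored snippet and its concluded license rather than to the file as an undifferentiated blob.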
Creation info, annotations, and the audit trail
The SPDX document creation information carries the tool chain that produced the document, the creator identifiers, and the creation timestamp. Annotations carry reviewer signatures and notes. Griffin AI preserves these fields as first-class metadata, so when an auditor asks "who signed off on this SBOM and with which tool version," the answer is retrievable. A Mythos-class tool typically mentions the creator in the opening summary and then discards the rest of the metadata. The audit trail requirement that drives most procurement policies is answered by the fields that token-based tools throw away first.
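Preserving the audit trail means the creator fields have to survive parsing as structured data, not summary prose. A minimal sketch, assuming tag-value input; the Creator values follow the spec's `Tool:` / `Person:` / `Organization:` prefixes, while the function itself is illustrative.

```python
def creation_info(lines):
    """Extract audit-trail fields from tag-value creation-section lines."""
    info = {"creators": []}
    for line in lines:
        tag, _, value = line.partition(": ")
        if tag == "Creator":
            info["creators"].append(value)   # e.g. "Tool: sbom-gen-1.2"
        elif tag == "Created":
            info["created"] = value          # ISO 8601 timestamp
    return info

doc = creation_info([
    "Creator: Tool: example-sbom-gen-1.2",   # hypothetical tool name
    "Creator: Person: Jane Auditor",
    "Created: 2024-01-15T10:00:00Z",
])
```

The auditor's question — who signed off, with which tool version, when — is then a dictionary lookup rather than a hope that the summary mentioned it.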
The tag-value versus JSON-LD question
SPDX ships in multiple serializations — tag-value, RDF, JSON, YAML, and XML. Griffin AI ingests all of them through a common parser that produces the same canonical object model regardless of input encoding. This matters because different vendors emit SPDX in different serializations, and a procurement pipeline that normalizes across vendors cannot afford to lose fidelity at the format boundary. A Mythos-class tool tends to handle JSON reasonably well, tag-value poorly, and RDF essentially not at all, because the tokenization of tag-value files produces alignment issues that confuse the language model. Format-agnostic structured parsing is a prerequisite for cross-vendor SBOM work.
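The front end of a format-agnostic pipeline can be as simple as sniffing the serialization before dispatching to the matching reader. This is a sketch under obvious assumptions — real detection would also handle YAML, XML, and byte-order marks — and the readers it would dispatch to are placeholders, not Griffin AI's parsers.

```python
def detect_format(text: str) -> str:
    """Guess the SPDX serialization so the right reader can be dispatched.
    Every reader must emit the same canonical object model."""
    stripped = text.lstrip()
    if not stripped:
        return "unknown"
    if stripped.startswith("{"):
        return "json"
    if stripped.startswith("<"):
        return "rdf-xml"
    if "SPDXVersion:" in stripped.splitlines()[0]:
        return "tag-value"
    return "unknown"
```

The invariant worth enforcing is downstream of this function: whichever branch fires, the resulting object model is identical, so no query ever has to know which vendor's serialization it came from.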
License list currency and how it compounds
The SPDX license list is versioned and evolving. New identifiers are added when new licenses are recognized. Griffin AI tracks license list versions and normalizes identifiers to the current list, so a document that uses a deprecated identifier resolves to the current canonical form. A Mythos-class tool relies on the pretraining cutoff of its underlying model for license knowledge, which means any license added after that cutoff is treated as an unknown string. The gap is small month-to-month and large year-over-year. By the second year of use, a pure-LLM tool's license coverage lags the current list by a noticeable percentage.
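Normalization of deprecated identifiers is a lookup against a versioned table. The two entries below are real deprecations from the SPDX license list (the bare `GPL-2.0` and `GPL-2.0+` forms were superseded by the `-only`/`-or-later` identifiers); the table would be regenerated from the current license list release rather than hand-maintained, and the function shape is my own sketch.

```python
# Deprecated identifier → current canonical form (subset for illustration)
DEPRECATED = {
    "GPL-2.0": "GPL-2.0-only",
    "GPL-2.0+": "GPL-2.0-or-later",
}

def normalize(license_id: str) -> str:
    """Map a deprecated SPDX license identifier to its current form;
    identifiers that are already current pass through unchanged."""
    return DEPRECATED.get(license_id, license_id)
```

Because the table ships with the license list version rather than with a model checkpoint, updating coverage is a data refresh, not a retraining cycle — which is why the gap does not compound year-over-year.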
What to verify before you standardize
If your procurement team has standardized on SPDX — and most regulated industries have — the question worth asking is whether your AI security tool consumes the entire spec or just the component names. Export an SPDX 2.3 document with populated relationships, snippets, annotations, and compound license expressions. Ask your tool five questions that require those fields. Grade the answers against the document itself. The gap between tools that parse the spec and tools that read the spec is not a marketing gap. It is a correctness gap, and it shows up in every audit that follows.
Griffin AI's architectural commitment
Formal inventory inputs produce formal outputs. SPDX is a formal inventory. Griffin AI treats it that way. The result is that every SPDX field becomes a queryable property, every relationship becomes a typed edge, every license expression becomes an evaluable tree, and every verification code becomes a cryptographic check. That architectural commitment is why Griffin's SPDX findings match the document line-for-line and why pure-LLM findings do not.