
CycloneDX and SPDX: Why Safeguard Supports Both and How We Normalize Between Them

The SBOM format debate misses the point. Safeguard ingests both CycloneDX and SPDX, normalizes to a common model, and lets you query and export in either format.

Michael
AI/ML Engineer

The SBOM community has spent years debating whether CycloneDX or SPDX is the better standard. We have a different perspective: the debate is irrelevant for practitioners. What matters is that your tooling works with whatever format you encounter.

Safeguard supports both CycloneDX and SPDX. This is not a marketing checkbox. It is a deep, normalized integration that lets you ingest SBOMs in either format, query them uniformly, and export in either format. Here is how we built it and why.

Why Both Formats Exist

CycloneDX and SPDX serve overlapping but different audiences.

SPDX originated at the Linux Foundation in 2010, primarily as a license compliance tool. Its data model is rich in licensing constructs: license expressions, license references, extracted licensing information. SPDX was adopted as ISO/IEC 5962:2021, giving it formal international standard status. It is the format most commonly requested in license compliance contexts and in some government procurement specifications.

CycloneDX was created by OWASP in 2017, designed specifically for security use cases. Its data model is rich in security constructs: vulnerability references, exploitability assessments, composition completeness declarations, and service definitions. CycloneDX was built to be lightweight and machine-processable, with a focus on CI/CD integration.

Neither format is strictly better than the other. SPDX has more expressive license modeling. CycloneDX has more expressive security modeling. Both can represent the core component inventory that most SBOM use cases require.

In practice, organizations encounter both formats. Your internal tooling might generate CycloneDX, but a vendor provides SPDX. A government customer requires SPDX, but your commercial customers want CycloneDX. Your SCA tool outputs CycloneDX, but your license compliance tool expects SPDX.

Requiring a single format creates friction. Supporting both eliminates it.

The Normalization Challenge

The hard part of supporting both formats is not parsing. Parsing CycloneDX JSON/XML and SPDX JSON/tag-value/RDF is straightforward. The hard part is normalization: mapping the concepts in each format to a common internal model so that queries, policies, and vulnerability correlation work identically regardless of the source format.

Some mappings are obvious:

| CycloneDX | Safeguard Internal | SPDX |
|---|---|---|
| component.name | component.name | package.name |
| component.version | component.version | package.versionInfo |
| component.purl | component.purl | package.externalRef (purl type) |
| component.type | component.type | package.primaryPackagePurpose |
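As a rough illustration of these direct mappings, here is a minimal Python sketch. The `Component` class and both function names are hypothetical and are not Safeguard's actual schema or API. The input shapes assume CycloneDX JSON components and SPDX 2.x JSON packages, where the purl sits in `externalRefs` with `referenceType` set to `purl`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    """Hypothetical common-model record (illustrative only)."""
    name: str
    version: Optional[str]
    purl: Optional[str]
    type: Optional[str]


def from_cyclonedx(component: dict) -> Component:
    # CycloneDX JSON keeps all four fields directly on the component object.
    return Component(
        name=component["name"],
        version=component.get("version"),
        purl=component.get("purl"),
        type=component.get("type"),
    )


def from_spdx(package: dict) -> Component:
    # SPDX 2.x JSON carries the purl as an external reference of type "purl".
    purl = next(
        (ref.get("referenceLocator")
         for ref in package.get("externalRefs", [])
         if ref.get("referenceType") == "purl"),
        None,
    )
    purpose = package.get("primaryPackagePurpose")
    return Component(
        name=package["name"],
        version=package.get("versionInfo"),
        purl=purl,
        # Assumption: lowercasing the SPDX purpose is enough to align the two
        # vocabularies; a real mapping would need an explicit lookup table.
        type=purpose.lower() if purpose else None,
    )
```

With this shape, the same package described in either format produces an equal `Component`, which is the property the rest of the pipeline relies on.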

Others are less straightforward.

License Normalization

CycloneDX uses a list of license entries, each carrying an SPDX license ID, a free-form license name, or a full SPDX expression. SPDX uses license expressions with its own set of operators (AND, OR, WITH) and supports extracted licensing information for licenses not in the SPDX license list.

Our common model uses SPDX license expressions as the canonical format (since SPDX's license expression syntax is more expressive), but can represent both CycloneDX license lists and SPDX license expressions. During ingestion, CycloneDX license objects are converted to equivalent SPDX expressions. During export to CycloneDX, SPDX expressions are converted back to CycloneDX license objects.
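The ingestion direction can be sketched roughly as follows. This is an illustration, not Safeguard's implementation: the function name is hypothetical, joining multiple entries with `AND` is an assumption (CycloneDX does not define how multiple license entries combine), and mapping named licenses to `LicenseRef-` identifiers and empty lists to `NOASSERTION` are likewise choices made for the example.

```python
import re


def cyclonedx_licenses_to_spdx(licenses: list) -> str:
    """Fold a CycloneDX `licenses` array into one SPDX license expression."""
    terms = []
    for entry in licenses:
        if "expression" in entry:
            # Already an SPDX expression; keep it as-is.
            terms.append(entry["expression"])
            continue
        lic = entry.get("license", {})
        if "id" in lic:
            terms.append(lic["id"])
        elif "name" in lic:
            # Not on the SPDX list: build a LicenseRef identifier, which may
            # only contain letters, digits, "." and "-".
            slug = re.sub(r"[^A-Za-z0-9.-]+", "-", lic["name"]).strip("-")
            terms.append(f"LicenseRef-{slug}")
    if not terms:
        return "NOASSERTION"
    if len(terms) == 1:
        return terms[0]
    # Assumption: multiple entries all apply, so they are joined with AND.
    # Compound terms are parenthesized to preserve operator precedence.
    return " AND ".join(f"({t})" if " " in t else t for t in terms)
```

For example, a component listing `MIT` and `Apache-2.0` as separate license entries would normalize to `MIT AND Apache-2.0` under these assumptions.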

Relationship Normalization

CycloneDX represents relationships in two ways: implicitly through nesting (a component contains sub-components) and through a separate dependency graph that links components by reference. SPDX uses explicit relationship types (DEPENDS_ON, CONTAINS, BUILD_TOOL_OF, etc.) with a richer vocabulary.

Our common model uses explicit typed relationships (closer to the SPDX approach) and synthesizes them from CycloneDX's nesting and dependency structures during ingestion. When exporting to CycloneDX, relationships are converted to the appropriate nesting and dependency graph structures.
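A minimal sketch of the ingestion side, assuming CycloneDX JSON input in which every component carries a `bom-ref`. The function name and the tuple layout are hypothetical; only the `components`, `bom-ref`, `dependencies`, `ref`, and `dependsOn` keys come from the CycloneDX JSON format.

```python
from typing import Iterator, List, Optional, Tuple

# (source ref, relationship type, target ref)
Relationship = Tuple[str, str, str]


def synthesize_relationships(bom: dict) -> Iterator[Relationship]:
    """Turn CycloneDX nesting and dependency entries into typed relationships."""

    def walk(parent_ref: Optional[str], components: List[dict]) -> Iterator[Relationship]:
        for comp in components:
            ref = comp.get("bom-ref")
            if parent_ref and ref:
                # Nesting implies containment.
                yield (parent_ref, "CONTAINS", ref)
            yield from walk(ref, comp.get("components", []))

    yield from walk(None, bom.get("components", []))

    # The dependency graph maps one ref to the refs it depends on.
    for dep in bom.get("dependencies", []):
        for target in dep.get("dependsOn", []):
            yield (dep["ref"], "DEPENDS_ON", target)
```

Export to CycloneDX would run the same mapping in reverse: `CONTAINS` edges become nested components and `DEPENDS_ON` edges become `dependsOn` entries.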

Component Identity

The most critical normalization is component identity: determining when two component references in different SBOMs refer to the same software package. We use Package URL (purl) as the primary identity, supplemented by CPE and name/version pairs where purl is not available.

During ingestion, we extract or synthesize purls for every component:

  • If the SBOM includes a purl, we use it directly
  • If the SBOM includes ecosystem-specific identifiers (Maven coordinates, npm package names), we construct the corresponding purl
  • If only name and version are available, we attempt to match against known packages in our registry and assign the correct purl

This purl-first approach is what enables cross-format querying. When you search for "jackson-databind 2.15.3" across your inventory, the query resolves to pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.15.3 and matches regardless of whether the SBOM was CycloneDX or SPDX.
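The first two steps of that fallback chain might look like the sketch below. The function names and the `ecosystem`, `group`, `name`, and `version` input keys are hypothetical; the registry lookup for name-only components is not shown and simply returns `None` here.

```python
from typing import Optional
from urllib.parse import quote


def _enc(segment: str) -> str:
    # Percent-encode one purl segment ("@" becomes "%40", dots are kept).
    return quote(segment, safe="")


def maven_purl(group_id: str, artifact_id: str, version: str) -> str:
    return f"pkg:maven/{_enc(group_id)}/{_enc(artifact_id)}@{_enc(version)}"


def npm_purl(name: str, version: str) -> str:
    name = name.lower()  # npm package names are lowercased in purls
    if name.startswith("@") and "/" in name:
        # A scoped package: the scope becomes the purl namespace.
        scope, pkg = name.split("/", 1)
        return f"pkg:npm/{_enc(scope)}/{_enc(pkg)}@{_enc(version)}"
    return f"pkg:npm/{_enc(name)}@{_enc(version)}"


def resolve_purl(component: dict) -> Optional[str]:
    """Prefer an existing purl, then ecosystem coordinates (hypothetical keys)."""
    if component.get("purl"):
        return component["purl"]
    ecosystem = component.get("ecosystem")
    if ecosystem == "maven" and component.get("group"):
        return maven_purl(component["group"], component["name"], component["version"])
    if ecosystem == "npm":
        return npm_purl(component["name"], component["version"])
    # Name/version only: would fall through to a registry lookup (not shown).
    return None
```

Calling `maven_purl("com.fasterxml.jackson.core", "jackson-databind", "2.15.3")` yields the purl quoted above.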

Querying Across Formats

Once SBOMs are normalized, all Safeguard queries work across formats transparently. You do not need to know or specify the source format.

"Show me all products containing Log4j below 2.17.1"

This query searches across every SBOM in your inventory -- CycloneDX, SPDX, any version of either format -- and returns results with consistent component identity and metadata.
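To make the idea concrete, here is a simplified sketch of such a query over normalized data. It is not Safeguard's query engine: the function names and the `(product, purl)` inventory shape are hypothetical, and version comparison assumes plain dotted numeric versions, which real ecosystems do not guarantee.

```python
from typing import Iterable, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    # Assumption: dotted numeric versions only; qualifiers such as "-beta9"
    # raise ValueError and are skipped by the caller.
    return tuple(int(part) for part in version.split("."))


def products_below(inventory: Iterable[Tuple[str, str]],
                   package: str, ceiling: str) -> List[str]:
    """Return products containing `package` at a version below `ceiling`.

    `inventory` holds (product, purl) pairs taken from the normalized model,
    so the source SBOM format never appears in the query.
    """
    limit = parse_version(ceiling)
    hits = set()
    for product, purl in inventory:
        base, _, rest = purl.partition("@")
        # Drop purl qualifiers ("?...") and subpath ("#...") after the version.
        version = rest.split("?")[0].split("#")[0]
        if base != package or not version:
            continue
        try:
            if parse_version(version) < limit:
                hits.add(product)
        except ValueError:
            continue
    return sorted(hits)
```

A call such as `products_below(inventory, "pkg:maven/org.apache.logging.log4j/log4j-core", "2.17.1")` corresponds to the Log4j question above.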

Vulnerability correlation similarly works across formats. The correlation engine operates on purls, so a CVE match against a component found in a CycloneDX SBOM uses the same logic as a match against the same component found in an SPDX SBOM.

Policy evaluation is format-agnostic. A policy that blocks components with GPL licenses evaluates the normalized license expression, not the raw format-specific license representation.
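A deliberately simple sketch of such a check against the normalized expression follows. The function names are hypothetical, and the rule is an assumption made for illustration: it flags any expression that mentions a `GPL-` identifier, even inside an `OR` choice, whereas a production policy engine would parse the expression properly.

```python
import re
from typing import Set

OPERATORS = {"AND", "OR", "WITH"}


def license_ids(expression: str) -> Set[str]:
    """Extract identifiers from an SPDX license expression (operators removed)."""
    tokens = re.findall(r"[A-Za-z0-9.+-]+", expression)
    return {tok for tok in tokens if tok.upper() not in OPERATORS}


def violates_gpl_policy(expression: str) -> bool:
    # Matching on the "GPL-" prefix leaves LGPL-* and AGPL-* identifiers alone.
    return any(ident.startswith("GPL-") for ident in license_ids(expression))
```

Because the check only sees the normalized expression, the same rule applies whether the license arrived as a CycloneDX license object or an SPDX expression.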

Export in Either Format

Just as Safeguard can ingest both formats, it can export in both. This is useful for several scenarios:

  • A customer requires SPDX, but your tools generate CycloneDX. Upload CycloneDX, export as SPDX.
  • You want to feed SBOMs into a tool that only accepts CycloneDX, but your vendor provided SPDX.
  • You need to provide the same SBOM in both formats to different customers.

Export is not a simple format conversion -- it is a full reconstitution of the document from the normalized model. This means the exported document is well-formed and valid against the target format's schema, with all required fields populated.

We test export fidelity by round-tripping: ingest a CycloneDX document, export as SPDX, re-ingest the SPDX, export as CycloneDX, and compare with the original. Any loss of information that both formats can represent is a bug we fix.
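The comparison step of such a round-trip check could be sketched as follows. The function name and the choice of compared fields are assumptions; the ingest and export steps themselves are Safeguard internals and are not shown.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

CoreKey = Tuple[Optional[str], Optional[str], Optional[str]]


def round_trip_diff(original: Iterable[dict],
                    round_tripped: Iterable[dict]) -> Dict[str, Set[CoreKey]]:
    """Compare core component fields before and after a format round-trip."""

    def key(component: dict) -> CoreKey:
        # Assumption: purl, name, and version are the fields under test.
        return (component.get("purl"), component.get("name"), component.get("version"))

    before = {key(c) for c in original}
    after = {key(c) for c in round_tripped}
    return {"lost": before - after, "gained": after - before}
```

An empty `lost` and `gained` set means the core inventory survived the round-trip unchanged.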

What Gets Lost in Translation

In the interest of honesty: normalization is not lossless in every case.

SPDX's extracted licensing information (custom license text for non-standard licenses) does not have a direct CycloneDX equivalent. We preserve it in the internal model but cannot fully represent it in CycloneDX export.

CycloneDX's service definitions (for SaaS BOMs) do not have an SPDX equivalent. Again, preserved internally but lost on SPDX export.

CycloneDX's formulation data (build process description) is unique to CycloneDX and has no SPDX mapping.

For the core use cases -- component inventory, vulnerability correlation, license compliance, dependency tracking -- the normalization is lossless. The edge cases are in advanced features specific to one format.

Practical Recommendations

If you are starting from scratch and can choose a format:

  • Choose CycloneDX if your primary use case is security (vulnerability management, SCA, policy enforcement). Its data model is more natural for security workflows.
  • Choose SPDX if your primary use case is license compliance or if a specific regulation or customer contract requires it.
  • Choose both if you have mixed requirements, and let Safeguard handle the normalization.

If you are inheriting SBOMs from multiple sources in multiple formats, do not try to standardize the sources. Ingest everything into Safeguard and work from the normalized model. Trying to force all your vendors and tools onto a single format is a losing battle.

The SBOM format wars will eventually end, either because the formats converge or because interoperability matures to the point that the format choice is genuinely irrelevant. Until then, tooling that handles both is the pragmatic approach.
