SBOM Validation and Quality Checks: Ensuring Your SBOMs Are Actually Useful

A syntactically valid SBOM can still be useless. Here's how to validate structure, completeness, and accuracy to produce SBOMs worth trusting.

Shadab Khan
Cloud Security Architect
6 min read

Generating an SBOM is step one. Generating a good SBOM is the actual goal, and most organizations skip the quality assurance in between.

A syntactically valid CycloneDX document with 12 components when your application has 400 dependencies is worse than no SBOM at all. It creates false confidence. It passes automated checks that only verify schema compliance. And it fails catastrophically during the one scenario where SBOMs matter -- when you need to know if you're affected by a vulnerability.

Quality checks close the gap between "we have an SBOM" and "we have an SBOM we can trust."

The Three Layers of SBOM Quality

Layer 1: Structural Validity

Is the SBOM syntactically correct and schema-compliant?

This is the baseline. A CycloneDX JSON document should validate against the CycloneDX JSON Schema. An SPDX document should parse without errors. Structural validation catches:

  • Malformed JSON/XML
  • Missing required fields
  • Invalid field values (e.g., an unrecognized component type)
  • Broken references (a dependency referencing a non-existent component)

# Validate CycloneDX
cyclonedx-cli validate --input-file sbom.json --input-format json --input-version v1_5

# Validate SPDX (pyspdxtools ships with the spdx-tools Python package)
pyspdxtools -i sbom.spdx.json

Most tools generate structurally valid SBOMs. If yours doesn't, you have a tooling problem, not a quality problem.

Layer 2: Completeness

Does the SBOM include all the components that are actually in the software?

This is where most SBOMs fall short. Common completeness gaps:

  • Missing transitive dependencies -- the SBOM lists direct dependencies but not their dependencies
  • Missing OS packages -- container SBOMs that catalog application libraries but skip apt/apk packages
  • Missing native libraries -- binary dependencies that aren't managed by a package manager
  • Missing build tools -- compilers, linkers, and generators that affect the output

Completeness checking requires comparing the SBOM against ground truth:

# Compare SBOM component count against actual installed packages
SBOM_COUNT=$(cat sbom.json | jq '.components | length')
APK_COUNT=$(docker run --rm myimage apk list --installed 2>/dev/null | wc -l)
NPM_COUNT=$(docker run --rm myimage sh -c 'cd /app && npm ls --all --json 2>/dev/null' | jq '[.. | .version? // empty] | length')

echo "SBOM: $SBOM_COUNT components"
echo "APK packages: $APK_COUNT"
echo "NPM packages: $NPM_COUNT"

If the SBOM says 50 and the container has 250, you have a completeness problem.
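Counts only tell you a gap exists; a name-level diff tells you what's missing. A minimal sketch, assuming the SBOM is CycloneDX JSON and you've already captured the image's package list to a file (for example with docker run --rm myimage apk info > apk-names.txt, where the image name is hypothetical):

```shell
# List packages present in the image's package list but absent from the SBOM.
# Usage: sbom_missing sbom.json apk-names.txt
sbom_missing() {
  comm -13 \
    <(jq -r '.components[].name' "$1" | sort -u) \
    <(sort -u "$2")
}
```

Anything this prints is a completeness gap worth explaining -- either the generator missed it, or it genuinely isn't in the artifact.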

Layer 3: Accuracy

Are the component names, versions, and identifiers correct?

Accuracy issues are subtle and dangerous:

  • Wrong versions -- the SBOM says lodash@4.17.21 but the actual deployed version is 4.17.19
  • Wrong identifiers -- a Package URL that doesn't match the actual package
  • Stale data -- the SBOM was generated from a lockfile that doesn't match the deployed artifacts
  • Incorrect relationships -- a component listed as a dev dependency when it's actually required at runtime

Accuracy is hardest to verify automatically. Some approaches:

# Verify checksums: compute hash of actual file and compare to SBOM
ACTUAL_HASH=$(sha256sum node_modules/lodash/lodash.js | cut -d' ' -f1)
SBOM_HASH=$(cat sbom.json | jq -r '.components[] | select(.name=="lodash") | .hashes[]? | select(.alg=="SHA-256") | .content')

if [ "$ACTUAL_HASH" != "$SBOM_HASH" ]; then
  echo "MISMATCH: lodash hash doesn't match SBOM"
fi
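Another accuracy check needs no filesystem access at all: internal consistency. The version embedded in each component's Package URL should agree with its version field. A sketch with jq, assuming purls carry the version after the last @ (which holds for common purl types):

```shell
# Print components whose purl-embedded version disagrees with the version field.
purl_version_mismatches() {
  jq -r '.components[]
    | select(.purl != null and .version != null)
    # take the text after the last "@", dropping any ?qualifiers
    | ((.purl | split("@") | last | split("?") | first)) as $purlver
    | select($purlver != .version)
    | "\(.name): purl says \($purlver), version field says \(.version)"' "$1"
}
```

Any line of output means vulnerability matching on the purl and matching on the version field will give different answers for the same component.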

NTIA Minimum Elements

The NTIA (National Telecommunications and Information Administration) defined minimum elements for SBOMs. Every SBOM should include:

| Element | Description | CycloneDX Field | SPDX Field |
|---------|-------------|-----------------|------------|
| Supplier Name | Who supplies the component | supplier.name | supplier |
| Component Name | Name of the component | name | name |
| Version | Version string | version | versionInfo |
| Unique Identifier | Unique ID for the component | bom-ref, purl | SPDXID, externalRefs |
| Dependency Relationship | How components relate | dependencies | relationships |
| Author of SBOM Data | Who created the SBOM | metadata.authors | creationInfo.creators |
| Timestamp | When the SBOM was created | metadata.timestamp | creationInfo.created |

Check your SBOM against these minimums:

# Quick NTIA check for CycloneDX
cat sbom.json | jq '{
  has_timestamp: (.metadata.timestamp != null),
  has_tools_or_authors: ((.metadata.tools | length > 0) or (.metadata.authors | length > 0)),
  total_components: (.components | length),
  components_with_version: ([.components[] | select(.version != null and .version != "")] | length),
  components_with_purl: ([.components[] | select(.purl != null)] | length),
  components_with_supplier: ([.components[] | select(.supplier != null)] | length),
  has_dependencies: (.dependencies | length > 0)
}'

If components_with_purl is significantly lower than total_components, your SBOM will perform poorly in vulnerability matching.

Automated Quality Gates

Integrate quality checks into your CI/CD pipeline so bad SBOMs fail the build:

- name: Generate SBOM
  run: syft dir:. -o cyclonedx-json > sbom.json

- name: Validate SBOM Structure
  run: cyclonedx-cli validate --input-file sbom.json

- name: Check SBOM Quality
  run: |
    TOTAL=$(cat sbom.json | jq '.components | length')
    WITH_PURL=$(cat sbom.json | jq '[.components[] | select(.purl != null)] | length')
    WITH_VERSION=$(cat sbom.json | jq '[.components[] | select(.version != null and .version != "")] | length')
    HAS_DEPS=$(cat sbom.json | jq '.dependencies | length')
    
    echo "Components: $TOTAL"
    echo "With PURL: $WITH_PURL"
    echo "With version: $WITH_VERSION"
    echo "Dependency entries: $HAS_DEPS"
    
    # Fail if less than 80% of components have PURLs
    PURL_PCT=$(echo "$WITH_PURL * 100 / $TOTAL" | bc)
    if [ "$PURL_PCT" -lt 80 ]; then
      echo "FAIL: Only ${PURL_PCT}% of components have Package URLs"
      exit 1
    fi
    
    # Fail if no dependency information
    if [ "$HAS_DEPS" -eq 0 ]; then
      echo "FAIL: No dependency relationships found"
      exit 1
    fi

Tool-Specific Quality Issues

Different SBOM generation tools have different blind spots:

Syft produces comprehensive SBOMs but may miss Go binaries compiled without module information. Verify Go components by checking go version -m output against the SBOM.

Trivy focuses on vulnerability-relevant packages, which means it may skip components that don't map to known vulnerability databases. Good for security, potentially incomplete for compliance.

cdxgen has strong language ecosystem support but may miss OS-level packages in container scans. Pair it with a container-focused tool for full coverage.

Microsoft SBOM Tool generates SPDX with good provenance metadata but may not capture all transitive dependencies in some ecosystems.

Run multiple tools and compare the output. Discrepancies reveal blind spots:

# Generate with two tools
syft dir:. -o cyclonedx-json > sbom-syft.json
cdxgen -o sbom-cdxgen.json .

# Compare component counts
echo "Syft: $(cat sbom-syft.json | jq '.components | length') components"
echo "cdxgen: $(cat sbom-cdxgen.json | jq '.components | length') components"

If one tool finds 200 components and another finds 350, investigate the delta.
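The fastest way to see that delta is a set difference on Package URLs rather than raw counts. A sketch, assuming both outputs are CycloneDX JSON:

```shell
# Show purls found by only one of two SBOMs: column 1 is unique to the first
# file, column 2 (tab-indented) is unique to the second. Shared purls are hidden.
sbom_delta() {
  comm -3 \
    <(jq -r '.components[].purl // empty' "$1" | sort -u) \
    <(jq -r '.components[].purl // empty' "$2" | sort -u)
}
```

Each output line is a concrete question: why did one tool see this component and the other miss it?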

Ongoing Quality Monitoring

SBOM quality isn't a one-time check. Monitor quality metrics over time:

  • Completeness trend -- are SBOMs getting more or less complete across releases?
  • PURL coverage -- what percentage of components have Package URLs?
  • Supplier coverage -- what percentage of components have supplier information?
  • Freshness -- how old is the most recent SBOM for each product?

Track these as metrics alongside your other security KPIs.
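A lightweight way to start is to emit one metrics row per SBOM at build time and append it to a running log. The CSV shape below is an assumption, not a standard; a sketch:

```shell
# Emit one CSV row: timestamp, component count, % with purl, % with supplier.
sbom_metrics_row() {
  jq -r '(.components | length) as $total
    | [(now | todate),
       $total,
       (if $total > 0 then ([.components[] | select(.purl != null)] | length) * 100 / $total else 0 end),
       (if $total > 0 then ([.components[] | select(.supplier != null)] | length) * 100 / $total else 0 end)]
    | @csv' "$1"
}

# Append to a log at release time, e.g.:
# sbom_metrics_row sbom.json >> sbom-quality.csv
```

A flat file of these rows is enough to plot coverage trends per release before you invest in anything fancier.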

How Safeguard.sh Helps

Safeguard validates every uploaded SBOM against both structural and quality criteria. The platform flags SBOMs with low Package URL coverage, missing dependency relationships, or incomplete supplier information. Quality scores give you visibility into which SBOMs are trustworthy and which need attention. When vulnerability matching produces low-confidence results because of poor SBOM quality, Safeguard tells you -- so you fix the input, not the output.
