
Generating SBOMs with Syft: The Complete Guide

Syft is one of the most widely used open-source SBOM generators. Here's how to use it effectively for containers, directories, archives, and CI/CD pipelines.

Alex
DevSecOps Lead

Syft, built by Anchore, has become the de facto standard for SBOM generation in the open-source ecosystem. It's fast, supports a wide range of package ecosystems, handles container images natively, and outputs both CycloneDX and SPDX. If you're starting with SBOM generation, Syft is where most teams begin.

This guide covers everything from installation to advanced configuration, with real-world patterns for integrating Syft into production workflows.

Installation

Syft offers multiple installation methods:

# Official install script (Linux/macOS)
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin

# Homebrew (macOS)
brew install syft

# Chocolatey (Windows)
choco install syft

# Docker (no local installation needed)
docker run --rm anchore/syft:latest --help

# Go install
go install github.com/anchore/syft/cmd/syft@latest

Verify the installation:

syft version
# Prints the application name, version (e.g. 0.80.0), and build details
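In a CI script it can help to fail early with a clear message when Syft is missing, rather than partway through a build. The following is a minimal sketch of such a check; `require_tool` is a hypothetical helper name, not part of Syft.

```shell
# Fail early when a required tool is not on PATH.
# `require_tool` is a hypothetical helper, not a Syft command.
require_tool() {
    tool="$1"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "error: $tool not found on PATH" >&2
        return 1
    fi
}
```

Calling `require_tool syft || exit 1` at the top of a pipeline script stops the job before any scan is attempted.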

Basic Usage

Scanning Container Images

# Scan from Docker Hub
syft alpine:3.17

# Scan from a private registry
syft registry.example.com/myapp:latest

# Scan a locally built image
syft docker:myapp:dev

# Scan an OCI image directory
syft oci-dir:./my-image-export/

Scanning Directories

# Scan a project directory
syft dir:./my-project

# Scan a specific path
syft dir:/opt/application

Scanning Archives

# Scan a tarball
syft ./my-archive.tar.gz

# Scan a ZIP file
syft ./my-package.zip

# Scan a Java WAR/JAR
syft ./my-application.war

Output Formats

Syft writes both major SBOM standards, CycloneDX and SPDX, along with its own native JSON and a human-readable table:

# CycloneDX JSON (most common for security workflows)
syft alpine:latest -o cyclonedx-json > sbom.cdx.json

# CycloneDX XML
syft alpine:latest -o cyclonedx-xml > sbom.cdx.xml

# SPDX JSON
syft alpine:latest -o spdx-json > sbom.spdx.json

# SPDX Tag-Value
syft alpine:latest -o spdx-tag-value > sbom.spdx

# Syft's native JSON (richest detail)
syft alpine:latest -o json > sbom.syft.json

# Human-readable table (default)
syft alpine:latest -o table

# Multiple outputs simultaneously
syft alpine:latest -o cyclonedx-json=sbom.cdx.json -o spdx-json=sbom.spdx.json

The last command is particularly useful: generate both formats in a single scan pass.
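When scanning many images, the single-pass pattern can be wrapped so that output files are named after the image. This is a sketch under assumptions: `sbom_basename` and `generate_sboms` are hypothetical helper names, and the `.cdx.json` / `.spdx.json` suffixes simply follow the file names used above.

```shell
# Turn an image reference into a filesystem-safe base name
# by replacing '/', ':' and '@' with underscores.
sbom_basename() {
    printf '%s' "$1" | tr '/:@' '___'
}

# Write CycloneDX and SPDX SBOMs for one image in a single Syft run.
# $1 = image reference, $2 = output directory (defaults to the current one).
generate_sboms() {
    image="$1"
    outdir="${2:-.}"
    base="$outdir/$(sbom_basename "$image")"
    syft "$image" \
        -o "cyclonedx-json=$base.cdx.json" \
        -o "spdx-json=$base.spdx.json"
}
```

For example, `generate_sboms alpine:3.17 ./sboms` would write `./sboms/alpine_3.17.cdx.json` and `./sboms/alpine_3.17.spdx.json`.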

Supported Package Ecosystems

Syft detects packages from:

| Ecosystem | Detection Source | Notes |
|-----------|-----------------|-------|
| Alpine (APK) | apk database | Full package metadata |
| Debian (DPKG) | dpkg status | Includes source package info |
| RPM | rpm database | RHEL, CentOS, Fedora |
| Python | pip, poetry, conda, egg-info | Parses requirements, lockfiles, installed packages |
| Node.js | npm, yarn | package-lock.json, yarn.lock, node_modules |
| Go | go.mod, go.sum, binaries | Reads embedded module info from compiled binaries |
| Java | JAR, WAR, EAR, pom.xml | Inspects Maven metadata in archives |
| Ruby | Gemfile.lock, gemspec | Full gem resolution |
| Rust | Cargo.lock | Crate dependencies |
| .NET | packages.config, project.assets.json | NuGet packages |
| PHP | composer.lock | Packagist dependencies |
| Dart | pubspec.lock | Flutter/Dart packages |
| Swift | Package.resolved, Podfile.lock | CocoaPods and SPM |
| Haskell | stack.yaml.lock, cabal.project.freeze | Hackage packages |

For compiled Go binaries, Syft reads the module information embedded by the Go toolchain -- even without source code. This is particularly valuable for container scanning where you only have the binary.

Configuration

Syft accepts configuration via file, environment variables, or command-line flags.

Configuration File

Create .syft.yaml in your project root or ~/.syft.yaml:

# .syft.yaml
output:
  - "cyclonedx-json"

package:
  cataloger:
    enabled: true
    scope: "all-layers"  # or "squashed" for final filesystem only

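# Control which catalogers run. In Syft 0.x this is a plain list of
# cataloger names (the same values the --catalogers flag accepts);
# only the catalogers listed here will run.
catalogers:
  - "python-package-cataloger"
  - "javascript-package-cataloger"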

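# Registry authentication
# If your Syft version does not expand ${VAR} references in the config
# file, set SYFT_REGISTRY_AUTH_AUTHORITY, SYFT_REGISTRY_AUTH_USERNAME,
# and SYFT_REGISTRY_AUTH_PASSWORD in the environment instead.
registry:
  auth:
    - authority: "registry.example.com"
      username: "${REGISTRY_USER}"
      password: "${REGISTRY_PASS}"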

Layer Scanning Strategy

For container images, Syft offers two scanning strategies:

# Squashed: scan the final filesystem (default)
# This is what runs in production
syft alpine:latest --scope squashed

# All-layers: scan every layer individually
# Catches packages that were installed and then removed
syft alpine:latest --scope all-layers

squashed is usually what you want -- it reflects the actual deployed state. all-layers is useful for auditing: if a secret or vulnerable package was added in an early layer and removed later, it still exists in the image archive.
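To see what all-layers finds that squashed does not, the two table outputs can be compared. The sketch below assumes each scan was saved with `-o table=<file>` and that the table starts with a header row followed by NAME and VERSION as the first two columns; `table_packages` and `removed_in_later_layers` are hypothetical helper names.

```shell
# Print name@version for each row of a saved Syft table, skipping the header.
# Duplicates (the same package seen in several layers) are collapsed.
table_packages() {
    awk 'NR > 1 && NF >= 2 { print $1 "@" $2 }' "$1" | sort -u
}

# List packages that appear in the all-layers table ($2)
# but not in the squashed table ($1).
removed_in_later_layers() {
    squashed_pkgs=$(mktemp)
    all_pkgs=$(mktemp)
    table_packages "$1" > "$squashed_pkgs"
    table_packages "$2" > "$all_pkgs"
    comm -13 "$squashed_pkgs" "$all_pkgs"
    rm -f "$squashed_pkgs" "$all_pkgs"
}
```

Anything this prints was present in some layer but is absent from the final filesystem, which makes it a reasonable starting point for an audit.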

Advanced Patterns

Excluding Paths

Skip scanning certain directories (e.g., test fixtures, vendored copies):

syft dir:./my-project --exclude './test/**' --exclude './vendor/**'
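Once the exclusion list grows, it may be easier to keep the patterns in a file than on the command line. This is one possible sketch: the ignore-file format (one glob per line, `#` for comments) and the `scan_with_excludes` name are assumptions of this example, not Syft features.

```shell
# Run Syft against a target, adding one --exclude flag per pattern in a file.
# $1 = scan target (e.g. dir:./my-project), $2 = file with one glob per line.
scan_with_excludes() {
    target="$1"
    patterns_file="$2"
    set -- "$target"                      # rebuild the argument list
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        case "$pattern" in
            ''|'#'*) continue ;;          # skip blank lines and comments
        esac
        set -- "$@" --exclude "$pattern"
    done < "$patterns_file"
    syft "$@"
}
```

With a file containing `./test/**` and `./vendor/**`, `scan_with_excludes dir:./my-project .syftignore` runs the same command as the one-liner above.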

Scanning Remote Images Without Pulling

# Scan directly from registry without docker pull
syft registry:ghcr.io/myorg/myapp:v1.2.3

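With the registry: scheme, Syft downloads the image layers itself, so no Docker daemon is required. This is usually faster and uses less disk space than running docker pull first.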

Combining with Grype for Vulnerability Scanning

Syft and Grype (Anchore's vulnerability scanner) work together:

# Generate SBOM and scan for vulnerabilities
syft alpine:latest -o json | grype

# Or use Grype directly (it uses Syft internally)
grype alpine:latest
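In a pipeline it is often useful to keep the SBOM as an artifact and fail the build on serious findings. The sketch below saves Syft's native JSON and then points Grype at that file with its `sbom:` input scheme and `--fail-on` flag; the `scan_and_gate` name and the default file name are assumptions of this example.

```shell
# Generate an SBOM, keep it on disk, and fail if Grype finds
# vulnerabilities at or above the given severity.
# $1 = image, $2 = severity threshold (default: high), $3 = SBOM path.
scan_and_gate() {
    image="$1"
    severity="${2:-high}"
    sbom="${3:-sbom.syft.json}"
    syft "$image" -o "json=$sbom" || return 1
    grype "sbom:$sbom" --fail-on "$severity"
}
```

`scan_and_gate alpine:latest critical` would return a non-zero status only when Grype reports a critical vulnerability, which most CI systems treat as a failed step.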

CI/CD Integration

GitHub Actions (Official Action)

- name: Generate SBOM
  uses: anchore/sbom-action@v0
  with:
    image: myapp:${{ github.sha }}
    format: cyclonedx-json
    output-file: sbom.json
    upload-artifact: true
    upload-release-assets: true

GitLab CI

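sbom-generation:
  stage: security  # assumes a "security" stage is declared under stages:
  # The anchore/syft image ships without a shell, which GitLab needs to run
  # script commands, so this job installs Syft into a small base image.
  image: alpine:3.17
  variables:
    SYFT_REGISTRY_AUTH_AUTHORITY: ${CI_REGISTRY}
    SYFT_REGISTRY_AUTH_USERNAME: ${CI_REGISTRY_USER}
    SYFT_REGISTRY_AUTH_PASSWORD: ${CI_REGISTRY_PASSWORD}
  before_script:
    - apk add --no-cache curl
    - curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin
  script:
    - syft registry:${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA} -o cyclonedx-json > sbom.json
  artifacts:
    paths:
      - sbom.json
    reports:
      cyclonedx: sbom.json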

Jenkins

stage('SBOM') {
    steps {
        sh '''
            curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b ./bin
            ./bin/syft docker:myapp:${BUILD_NUMBER} -o cyclonedx-json > sbom.json
        '''
        archiveArtifacts artifacts: 'sbom.json'
    }
}

Performance Optimization

Syft scans can take seconds to minutes depending on image size and complexity. Optimize with:

  1. Use squashed scope unless you specifically need all-layers analysis
  2. Cache the vulnerability database if using Grype alongside Syft
  3. Scan from registry instead of pulling images locally
  4. Exclude irrelevant paths to reduce scan scope

# Benchmark scan time
time syft alpine:latest -o cyclonedx-json > /dev/null
# Typical: 2-5 seconds for a small image

time syft node:18 -o cyclonedx-json > /dev/null
# Typical: 10-30 seconds for a larger image

Common Issues and Troubleshooting

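Missing Go dependencies in binaries: Syft relies on the build info that the Go toolchain embeds in module-aware builds. Module mode has been the default since Go 1.16; older toolchains need GO111MODULE=on. Stripping symbols with -ldflags="-s -w" removes debug info, but Syft can usually still read the module data.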

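Incomplete Python packages: Syft needs egg-info or dist-info directories to detect installed packages. If that metadata is missing, for example because source files were copied into place or the dist-info directories were deleted to slim down an image, Syft won't detect those packages. Note that pip install --no-deps only skips dependencies; it still writes metadata for the package it installs.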

Docker socket access: When scanning local Docker images, Syft needs access to the Docker socket. In CI, ensure the runner has Docker available.
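One way to cope with runners that lack a Docker daemon is to choose the Syft source scheme based on whether the socket exists. This is a rough sketch: `image_source` is a hypothetical helper, the default socket path is the conventional `/var/run/docker.sock`, and the `registry:` fallback only works for images that have already been pushed.

```shell
# Print a Syft source string for an image: use the local Docker daemon
# when its socket is present, otherwise read straight from the registry.
# $1 = image reference, $2 = socket path (defaults to the usual location).
image_source() {
    image="$1"
    socket="${2:-/var/run/docker.sock}"
    if [ -S "$socket" ]; then
        printf 'docker:%s\n' "$image"
    else
        printf 'registry:%s\n' "$image"
    fi
}
```

A scan step can then run `syft "$(image_source registry.example.com/myapp:latest)"` and behave sensibly on both kinds of runner.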

How Safeguard.sh Helps

Safeguard integrates directly with Syft-generated SBOMs. Upload CycloneDX or SPDX output from Syft into the platform, and Safeguard picks up the full component inventory for continuous vulnerability monitoring. The platform enhances Syft's output with live vulnerability correlation, policy enforcement, and cross-project dependency search -- turning a point-in-time scan into ongoing supply chain visibility.
