GraalVM Native Image Supply Chain

GraalVM native images change the supply chain story in ways that most SBOM tooling has not caught up with yet. Here is what gets baked in, what gets stripped out, and what still needs to be tracked.

GraalVM Native Image has moved in the last two years from a niche tool for cold-start-sensitive workloads to a mainstream production choice, especially for teams running Spring Boot 3 or Quarkus in container environments. GraalVM for JDK 21 (September 2023) and GraalVM for JDK 22 (March 2024) shipped with production-ready native image support, and the community edition is now bundled in a way that makes adoption much less painful than it was even in 2022. But the supply chain story for native images is not the same as the supply chain story for JAR files, and most of the tooling that teams use to scan and SBOM their Java applications is still built around JAR-era assumptions.

This post walks through what actually happens to your dependency tree when you compile a native image, what that means for CVE applicability, how SBOMs should be generated for native binaries, and the specific gotchas I have hit integrating native image builds into a supply chain policy.

What native image actually does

GraalVM native image is an ahead-of-time compiler that takes your JARs and produces a single native executable with a minimal runtime, Substrate VM, compiled in. The key steps: points-to analysis computes the reachable part of your code, the reachable classes are initialized at build time where possible, the bytecode is compiled to native code, and the resulting binary includes only the reachable fraction of your dependencies.

The "reachable fraction" is the critical phrase. If your application uses jackson-databind but only ever calls ObjectMapper.readValue on a specific handful of types, the native image analysis may strip out the gadget-chain deserialization code paths that historically produced the Jackson CVEs. Conversely, if you use reflection (which is common in Spring), dynamic proxies, or ServiceLoader, the reachability metadata that makes those features work may pull in classes that the straight-line call graph would not reach, and those classes get compiled in even if you never exercise them.
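To make the contrast concrete, here is a minimal sketch of the two kinds of call. The class and method names are hypothetical and only illustrate the idea: the direct call is visible in the call graph, while the reflective one names its target at run time, so a native image build generally needs metadata to know the class must be kept.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative, not from any real project.
class ReflectionReachabilityDemo {

    // Direct call: static analysis sees the ArrayList constructor in the call graph.
    static List<String> direct() {
        return new ArrayList<>();
    }

    // Reflective call: the class name is only known at run time, so the
    // analysis cannot tell which class is needed unless metadata names it.
    @SuppressWarnings("unchecked")
    static List<String> reflective(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (List<String>) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "java.util.LinkedList";
        System.out.println(direct().getClass().getName());
        System.out.println(reflective(name).getClass().getName());
    }
}
```

On a regular JVM both calls succeed. In a native image, whether the reflective lookup works for an arbitrary class name depends on that class being registered for reflection, which is exactly how extra classes end up in the binary.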

From a supply chain perspective, this means the native binary contains a subset of the classes from the JARs it was built from, and the subset is determined by static analysis plus your reachability metadata (the JSON files under META-INF/native-image, such as reflect-config.json or, in newer releases, a single reachability-metadata.json).

The CVE applicability question

Take CVE-2022-42003, a Jackson deserialization vulnerability in versions before 2.13.4.1 and 2.12.7.1. The vulnerable code path only matters if the affected deserializer classes are in the reachable set and the feature that triggers it (UNWRAP_SINGLE_VALUE_ARRAYS, per the advisory) is enabled, and a tight Spring Boot web service often does not meet both conditions. Does your native image contain the vulnerable code?

The honest answer is: sometimes yes, sometimes no, and the only way to know is to check the binary. Standard JAR-based CVE scanning will flag your native image build because the input JAR has a vulnerable Jackson version. That is not wrong, but it is also not actionable without the reachability information. You need a VEX-style statement that says, for this specific native image build, the vulnerable code is not reachable, and here is the build-time evidence.
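A minimal sketch of how such a statement could be derived from build-time evidence follows. Everything here is an assumption for illustration: the record shape, the class names, and the idea that you already have the set of classes tied to the CVE and the set of classes the build reported as reachable. The status and justification strings follow the common VEX vocabulary, but this is not the schema of any particular VEX format.

```java
import java.util.Set;

// Hypothetical sketch: derive a VEX-style statement from build-time reachability.
class VexSketch {

    enum Status { NOT_AFFECTED, UNDER_INVESTIGATION }

    // Simplified statement; real VEX formats carry more fields.
    record Statement(String cve, String product, Status status, String justification) {}

    // vulnerableClasses: classes associated with the CVE (assumed to be known).
    // reachableClasses: classes the native image build reported as included.
    static Statement assess(String cve, String product,
                            Set<String> vulnerableClasses, Set<String> reachableClasses) {
        boolean anyIncluded = vulnerableClasses.stream().anyMatch(reachableClasses::contains);
        if (anyIncluded) {
            // Presence in the binary is not proof of exploitability, so flag for review.
            return new Statement(cve, product, Status.UNDER_INVESTIGATION,
                    "vulnerable class present in build-time reachable set");
        }
        return new Statement(cve, product, Status.NOT_AFFECTED, "vulnerable_code_not_present");
    }

    public static void main(String[] args) {
        Statement s = assess("CVE-2022-42003", "my-service-native",
                Set.of("com.example.lib.VulnerableDeserializer"),   // hypothetical class name
                Set.of("com.example.app.Main"));
        System.out.println(s);
    }
}
```

The design choice worth noting is the asymmetric outcome: absence from the reachable set supports a not-affected claim, while presence only triggers investigation, because a class can be compiled in without its vulnerable path ever being exercised.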

GraalVM 23.1 (the release that ships as GraalVM for JDK 21, September 2023) improved the tooling around build-time reachability reports. The -H:+PrintReachabilityRoots flag writes out the reachability root set. The --emit build-report flag produces an HTML report of what was included and why. Option names shift between releases, so check native-image --expert-options-all for the exact spelling in the version you run. These are the artifacts that a supply chain tool should be consuming to produce accurate VEX statements for native image builds.

Reachability metadata and the reachability-metadata-repository

The GraalVM team maintains a community repository at github.com/oracle/graalvm-reachability-metadata that contains reachability metadata for hundreds of popular Java libraries. When you build a native image, the Gradle plugin or Maven plugin (org.graalvm.buildtools.native 0.10.2 as of early 2024) can automatically apply this metadata so that reflection-heavy libraries work correctly.

From a supply chain perspective, the metadata repository is itself a dependency. It is versioned, it is mutable by contributors, and the specific metadata that shapes your native image build came from a specific commit of that repository. A malicious or buggy metadata entry could cause reflection targets to be included or excluded from your build in ways that affect behavior and security. Most teams do not track which version of the metadata repository was used for which build; GraalVM 24+ is adding build-info output that includes the metadata repo commit, but teams on older versions need to pin it explicitly.

The native-image-agent and the false-positive pipeline

The standard workflow for generating reachability metadata for your own application is to run the application under the native-image-agent during testing, which records all reflection, resource loading, and dynamic proxy usage. The recorded metadata is then committed to the repository and fed into the native image build.

The supply chain risk here is that the agent records whatever the tests exercised, which may be a superset of what production actually uses. Extra reflection targets become extra classes in the native binary, which become extra code that might be flagged by a CVE scan. This is fine in isolation, but it complicates the VEX story: the presence of a class in the binary does not mean the code path is exercised at runtime, even though the metadata made it statically reachable for the native image build.

Some teams solve this by running the agent against production-like traffic in a staging environment rather than against unit tests, which produces a more precise metadata set. This requires care around test data and PII, but it is the approach that produces the cleanest native images.

SBOM generation for native images

Standard SBOM tools (Syft, CycloneDX Maven Plugin, SPDX tools) scan the input JARs and produce an SBOM based on the dependencies declared in the build. This is correct for the inputs to the native image build, but it is not the same as the SBOM of the native binary itself.

The CycloneDX Maven plugin 2.8.0 (January 2024) added experimental support for native image SBOMs, which use the build-time reachability data to annotate dependencies with their reachability status. Syft 0.105 (April 2024) added basic support for reading native image metadata from the binary. Neither is fully mature yet; the standard pattern in 2024 is to generate two SBOMs per native image build: an input SBOM from the JAR tree and a build SBOM from the native image metadata. Downstream consumers use the input SBOM for license compliance and the build SBOM for vulnerability scoping.
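The two-SBOM pattern can be sketched as a simple comparison. This is an illustrative assumption about how a consumer might combine them, not the behavior of any specific tool: components are represented as plain identifier strings (package URLs in the example), and the labels are made up for this sketch.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: classify components by comparing the input SBOM
// (JAR tree) with the build SBOM (what the native image actually contains).
class SbomScopeSketch {

    static Map<String, String> scope(Set<String> inputComponents, Set<String> buildComponents) {
        Map<String, String> result = new TreeMap<>();
        for (String component : inputComponents) {
            // Declared dependency: either compiled into the binary or stripped.
            result.put(component, buildComponents.contains(component) ? "in-binary" : "stripped");
        }
        for (String component : buildComponents) {
            // Present only in the build SBOM, e.g. a system library.
            result.putIfAbsent(component, "build-only");
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> input = Set.of(
                "pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.13.4",
                "pkg:maven/org.example/unused-lib@1.0.0");          // hypothetical component
        Set<String> build = Set.of(
                "pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.13.4",
                "pkg:generic/musl@1.2.4");                          // hypothetical libc entry
        scope(input, build).forEach((c, s) -> System.out.println(s + "  " + c));
    }
}
```

Under this scheme, license compliance would look at every key from the input set, while vulnerability scoping would focus on the "in-binary" and "build-only" entries.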

The static linking question

Native images always depend on a libc implementation. By default on Linux the executable is dynamically linked against glibc, so the libc it runs with comes from the base image rather than from the build. Static linking is opt-in: --static --libc=musl produces a fully static binary against musl, which has a different and generally smaller CVE surface, and at that point the libc version is baked into the binary. If that libc version has a CVE, your binary is affected until you rebuild. Alpine-based container images typically use musl; Debian- and Ubuntu-based images use glibc.

Whichever you choose, the libc becomes part of your binary's supply chain and should be reflected in the SBOM. Most Java-ecosystem SBOM tools do not include the libc, because the JAR-era mental model does not have system libraries. When you ship native images, your SBOM story needs to bridge the Java ecosystem and the OS package ecosystem; tools like Syft and Trivy handle this relatively well because they were built with container SBOMs in mind.

Build reproducibility and the Oracle vs Community divide

GraalVM comes in two flavors: Oracle GraalVM (commercially supported, some optimizations exclusive) and the Community Edition (GPLv2 with the Classpath Exception, fewer optimizations). Binaries built with the two flavors are not byte-identical even for the same source; the Oracle version includes optimizations that change the output. From a supply chain reproducibility perspective, this matters because your reproducible-build verification needs to pin the exact GraalVM distribution and version.

GraalVM 17.0.7 and 21.0.1 (the JDK version-aligned releases) both have community and Oracle variants. A build-provenance statement should record which variant was used, along with the SDK version and the native image tooling version. The SLSA provenance format v1.0 (April 2023) supports this through the builder field for the build platform and resolvedDependencies for the toolchain artifacts that went into the build.
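One low-effort way to capture the variant is to read the standard JVM system properties at build time. The sketch below is an assumption about what is useful to record, not an official provenance schema, and it describes the JDK running this code, which is the GraalVM distribution only if the build itself runs on it. The class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: collect toolchain identity for a provenance record.
class ToolchainFingerprint {

    // Standard system properties; values differ between distributions.
    private static final List<String> KEYS = List.of(
            "java.vendor", "java.vendor.version", "java.vm.name",
            "java.version", "java.runtime.version");

    static Map<String, String> capture() {
        Map<String, String> fingerprint = new LinkedHashMap<>();
        for (String key : KEYS) {
            // Some properties are optional, so fall back to a marker value.
            fingerprint.put(key, System.getProperty(key, "unknown"));
        }
        return fingerprint;
    }

    public static void main(String[] args) {
        capture().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The output could then be attached to the provenance statement alongside the pinned metadata repository version, so a verifier can tell an Oracle GraalVM build from a Community Edition build of the same source.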

How Safeguard Helps

Safeguard ingests both the input JAR SBOM and the native image build SBOM for GraalVM builds, so our vulnerability scanner produces two sets of findings: CVEs applicable to the JAR inputs and CVEs actually reachable in the compiled binary. That distinction closes out the largest source of native-image-era false positives, which is CVE alerts on gadget chains that the native image stripped at build time. We also track the reachability metadata repository version and the GraalVM distribution used for each build, so your provenance and reproducibility story is complete without manual bookkeeping.
