in-toto Attestation Formats Reviewed

The in-toto attestation framework is the plumbing under SLSA, Sigstore, and most supply chain tooling. Here is a practical review of the v1 formats and their edges.

Nayan Dey
Senior Security Engineer

If you spend enough time reading supply chain tooling, you notice that everything eventually converges on in-toto attestations. SLSA provenance uses them. Sigstore's attestation flows use them. SBOM attachments use them. Witness attestations use them. The in-toto Attestation Framework v1, which reached stability in 2023, is the layer that makes those claims interoperable, and understanding it is the difference between wiring tools together cleanly and writing glue code forever.

This post is a practical review of the v1 format: the Statement structure, the DSSE envelope, the predicate ecosystem, and the gotchas we keep running into. It is not a specification walkthrough; the spec lives at github.com/in-toto/attestation and is short enough to read in one sitting. It is an opinionated tour of what matters operationally when your pipeline has to produce, sign, store, and verify these things.

What is a Statement, really?

An in-toto Statement is a JSON document with four fields, three of them required. The _type must be https://in-toto.io/Statement/v1, which pins the schema. The subject is a list of artifacts, each identified by a name and a digest map. The predicateType is a URI that identifies what kind of claim is being made. The predicate is the claim itself, structured according to whatever the predicate type defines; it is the one field the spec lets you omit, for cases where the predicate type alone says everything.

The Statement is deliberately minimal. It does not carry signatures. It does not carry metadata about when or how it was produced. All it does is say "here is a claim of type X, about artifacts with digests Y, and here is the structured body of the claim." Everything else, including authentication, is the job of the envelope that wraps it.
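To make the shape concrete, here is a minimal Python sketch that builds a Statement for a single artifact. The artifact name, the artifact bytes, and the predicate type URI are hypothetical placeholders; only the field names and the _type value come from the description above.

```python
import hashlib
import json

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"


def make_statement(name: str, artifact: bytes, predicate_type: str, predicate: dict) -> dict:
    """Build a v1 Statement about one artifact, identified by its SHA-256 digest."""
    return {
        "_type": STATEMENT_TYPE,
        "subject": [
            {"name": name, "digest": {"sha256": hashlib.sha256(artifact).hexdigest()}}
        ],
        "predicateType": predicate_type,
        "predicate": predicate,
    }


statement = make_statement(
    name="app.tar.gz",  # hypothetical artifact name
    artifact=b"example artifact bytes",  # stand-in for the real file contents
    predicate_type="https://example.com/predicate/demo/v1",  # hypothetical type URI
    predicate={"note": "illustrative claim body"},
)
print(json.dumps(statement, indent=2))
```

Note that nothing in the result says who produced it or when. That is the point: the document is only a claim, and it becomes trustworthy only once an envelope authenticates it.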

This separation matters more than it first appears. It means the same predicate body can be verified with different signing schemes. A SLSA provenance predicate can be wrapped in a DSSE envelope signed with Sigstore in one pipeline and wrapped in a DSSE envelope signed with a long-lived key in another. Consumers that understand the predicate do not need to know how it was signed; consumers that verify signatures do not need to understand the predicate.

Why DSSE and not plain JWS?

The Dead Simple Signing Envelope (DSSE), specified in secure-systems-lab/dsse, is what wraps Statements for signing. DSSE v1.0 defines a JSON structure with three fields: payloadType, payload (base64-encoded Statement), and signatures. The payload type is almost always application/vnd.in-toto+json when the payload is an in-toto Statement.

The reason the attestation framework chose DSSE over JWS is that JWS is surprisingly brittle for this use case. JWS is a large, flexible standard with a history of algorithm-confusion issues, and it bakes a header into the signed bytes in a way that complicates multi-signer scenarios. DSSE is deliberately plainer: the signed content is a well-defined PAE (Pre-Authentication Encoding) of the payload type and payload, and signatures are listed separately with keyid references. Cosign, Witness, and the SLSA generators all emit DSSE envelopes as their output.

A common integration error is to treat the payload as if it were a raw Statement. It is not; it is base64-encoded, and the signature is over the PAE of the payload type and the decoded payload bytes, not over the base64 string. When you write a verifier, the order is: parse the envelope, decode the payload, verify the signatures against the PAE, and only then parse the Statement and check the predicate type, so that nothing unauthenticated is ever interpreted. Skipping the PAE step and verifying against the raw payload is a bug that has shipped in more than one tool.
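The sketch below shows the PAE construction and a verifier that checks signatures before it interprets the payload. It is an illustration under one loud assumption: HMAC-SHA256 with a shared key stands in for the signature scheme so the example runs on the standard library alone. Real pipelines use asymmetric signatures such as ECDSA or Ed25519. The function names and the key are hypothetical.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE Pre-Authentication Encoding: 'DSSEv1 LEN(type) type LEN(body) body'."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def sign_envelope(statement: dict, keyid: str, key: bytes) -> dict:
    """Wrap a Statement in an envelope. HMAC is a stand-in, not a real DSSE scheme."""
    payload = json.dumps(statement).encode("utf-8")
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


def verify_envelope(envelope: dict, trusted_keys: dict) -> dict:
    """Return the parsed Statement only if a trusted key signed the PAE."""
    payload = base64.b64decode(envelope["payload"])  # raw bytes, not yet parsed
    message = pae(envelope["payloadType"], payload)
    for entry in envelope["signatures"]:
        key = trusted_keys.get(entry.get("keyid"))
        if key is None:
            continue  # unknown signer: ignore rather than fail
        expected = hmac.new(key, message, hashlib.sha256).digest()
        if hmac.compare_digest(expected, base64.b64decode(entry["sig"])):
            # payloadType is covered by the PAE, so it is authenticated here.
            if envelope["payloadType"] != PAYLOAD_TYPE:
                raise ValueError("unexpected payloadType")
            return json.loads(payload)
    raise ValueError("no valid signature from a trusted key")


demo_key = b"demo-shared-secret"  # hypothetical key, for illustration only
envelope = sign_envelope({"_type": "https://in-toto.io/Statement/v1"}, "demo", demo_key)
print(verify_envelope(envelope, {"demo": demo_key})["_type"])
```

Because the payload type is part of the signed bytes, an attacker cannot relabel a signed payload as a different type without invalidating the signature.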

Which predicate types actually matter?

The predicate type registry at github.com/in-toto/attestation/tree/main/spec/predicates lists the standard predicates. The ones that actually show up in production pipelines are SLSA Provenance v1 (https://slsa.dev/provenance/v1), SLSA Verification Summary v1 (https://slsa.dev/verification_summary/v1), SPDX (https://spdx.dev/Document/v2.3), CycloneDX (https://cyclonedx.org/bom/v1.5), and the generic Link predicate for older in-toto flows.

The SLSA Provenance v1 predicate is the one most consumers verify against. Its body has two top-level fields. buildDefinition describes the inputs: the build type (a URI), external parameters like the source repository and ref, internal parameters, and resolved dependencies. runDetails describes the execution: the builder identity, metadata about when the build ran, and optional byproducts.
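As a rough illustration of what verifying against this predicate can look like, the sketch below checks three fields against an expected-values policy. The top-level field names follow the description above; the builder ID, build type, the shape of externalParameters, and the policy dictionary are hypothetical, since the build type defines what the external parameters actually contain.

```python
def check_provenance(predicate: dict, policy: dict) -> list[str]:
    """Return a list of policy violations; an empty list means the checks passed."""
    problems = []
    build = predicate.get("buildDefinition", {})
    run = predicate.get("runDetails", {})

    if build.get("buildType") != policy["buildType"]:
        problems.append("unexpected buildType")
    if run.get("builder", {}).get("id") not in policy["trustedBuilders"]:
        problems.append("untrusted builder")
    # The keys inside externalParameters depend on the buildType; "repository"
    # is an assumption made for this example.
    if build.get("externalParameters", {}).get("repository") != policy["repository"]:
        problems.append("unexpected source repository")
    return problems


provenance = {  # hypothetical predicate body
    "buildDefinition": {
        "buildType": "https://example.com/build-types/demo/v1",
        "externalParameters": {
            "repository": "https://github.com/example/app",
            "ref": "refs/heads/main",
        },
        "internalParameters": {},
        "resolvedDependencies": [],
    },
    "runDetails": {
        "builder": {"id": "https://example.com/builders/demo"},
        "metadata": {"invocationId": "demo-run-1"},
        "byproducts": [],
    },
}

policy = {  # hypothetical policy
    "buildType": "https://example.com/build-types/demo/v1",
    "trustedBuilders": {"https://example.com/builders/demo"},
    "repository": "https://github.com/example/app",
}

print(check_provenance(provenance, policy))
```

Returning a list of violations rather than a boolean keeps the failure reasons available for logs and audit trails.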

The VSA (Verification Summary Attestation) predicate is underused but genuinely helpful for large deployments. A VSA is a claim by a verifier that it checked an artifact's provenance and that the artifact passed policy. Downstream consumers can trust the VSA instead of re-running every verification step. Under policies that require SLSA L3, a VSA produced by a trusted verifier is often the only thing a deployment-time gate checks.
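A deployment-time gate built on a VSA can be quite small. The sketch below assumes the VSA predicate has already been extracted from a verified envelope, and that it carries verifier.id, resourceUri, verificationResult, and verifiedLevels fields as in the SLSA VSA v1 predicate. The verifier ID, resource URI, and function names are hypothetical.

```python
import re


def highest_build_level(levels: list[str]) -> int:
    """Return the highest SLSA_BUILD_LEVEL_n in the list, or 0 if none is present."""
    best = 0
    for level in levels:
        match = re.fullmatch(r"SLSA_BUILD_LEVEL_(\d+)", level)
        if match:
            best = max(best, int(match.group(1)))
    return best


def vsa_allows(vsa: dict, resource_uri: str, trusted_verifiers: set, min_build_level: int) -> bool:
    """Admit the artifact only if a trusted verifier passed it at a sufficient level."""
    return (
        vsa.get("verifier", {}).get("id") in trusted_verifiers
        and vsa.get("verificationResult") == "PASSED"
        and vsa.get("resourceUri") == resource_uri
        and highest_build_level(vsa.get("verifiedLevels", [])) >= min_build_level
    )


vsa = {  # hypothetical VSA predicate body
    "verifier": {"id": "https://example.com/verifiers/release-gate"},
    "resourceUri": "https://example.com/artifacts/app",
    "verificationResult": "PASSED",
    "verifiedLevels": ["SLSA_BUILD_LEVEL_3"],
}

print(vsa_allows(
    vsa,
    resource_uri="https://example.com/artifacts/app",
    trusted_verifiers={"https://example.com/verifiers/release-gate"},
    min_build_level=3,
))
```

The gate still has to confirm that the Statement's subject digest matches the artifact being deployed; this sketch covers only the predicate body.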

Where do implementations disagree?

Even with a tight spec there are places where emitters disagree. The first is subject naming. The spec says subjects have a name field, but does not constrain what the name means. Some emitters use a filename, some use an OCI reference, some use a package URL, and some use an empty string. Verifiers that match on subject name break when an emitter changes convention; matching only on digest is safer.
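Digest-only matching can be sketched in a few lines. The helper below ignores the subject name entirely and recomputes the artifact's digest for the algorithms it supports. Requiring every supported digest in a subject to agree is a conservative choice made for this example, not something the text above prescribes, and the helper name is hypothetical.

```python
import hashlib

# Algorithms this sketch is willing to recompute and compare.
SUPPORTED = {"sha256": hashlib.sha256, "sha512": hashlib.sha512}


def subject_matches(statement: dict, artifact: bytes) -> bool:
    """True if any subject's supported digests all match the artifact bytes."""
    for subject in statement.get("subject", []):
        checked = [
            SUPPORTED[alg](artifact).hexdigest() == value.lower()
            for alg, value in subject.get("digest", {}).items()
            if alg in SUPPORTED
        ]
        # At least one supported digest must be present, and none may disagree.
        if checked and all(checked):
            return True
    return False


artifact = b"example artifact bytes"
statement = {
    "subject": [
        # The name is an empty string here, as some emitters produce.
        {"name": "", "digest": {"sha256": hashlib.sha256(artifact).hexdigest()}}
    ]
}
print(subject_matches(statement, artifact))
```

A subject that carries only digests in unsupported algorithms never matches, so an attestation cannot pass simply because the verifier did not recognize its hashes.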

The second disagreement is around resolvedDependencies in provenance. The spec defines a ResourceDescriptor structure, but emitters vary in how much detail they include. The slsa-github-generator populates dependencies with the commit SHA of the source repository. Bazel-based generators include every input hash. Cloud Build is somewhere in the middle. A verifier that expects exhaustive dependencies from one emitter will fail against a minimal one, even though both are spec-compliant.

The third is around how attestations are attached to artifacts. The in-toto spec does not define attachment; it defines the document. In practice there are three common attachments: as a file next to the artifact (artifact.intoto.jsonl), as an OCI artifact attached to the image in the registry (through cosign's attachment convention or the OCI referrers API), and as a Rekor log entry referenced by UUID. Consumers need to support all three, and a surprising number of CI systems produce one form while their downstream consumers expect another.
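For the file-based form, a reader might look like the sketch below. It assumes each non-empty line of the .jsonl file is one JSON-encoded DSSE envelope; some emitters write a different wrapper per line, so treat the layout as an assumption to confirm against your own producer. The function name is hypothetical, and the function only checks shape; it does not verify signatures.

```python
import io
import json

REQUIRED_FIELDS = ("payloadType", "payload", "signatures")


def read_envelopes(lines) -> list[dict]:
    """Parse one envelope per non-empty line, rejecting lines with the wrong shape."""
    envelopes = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines and a trailing newline
        envelope = json.loads(line)
        if not isinstance(envelope, dict):
            raise ValueError(f"line {number}: expected a JSON object")
        missing = [field for field in REQUIRED_FIELDS if field not in envelope]
        if missing:
            raise ValueError(f"line {number}: missing fields {missing}")
        envelopes.append(envelope)
    return envelopes


# An in-memory stand-in for open("artifact.intoto.jsonl").
sample = io.StringIO(
    '{"payloadType": "application/vnd.in-toto+json", "payload": "e30=", "signatures": []}\n'
    "\n"
)
print(len(read_envelopes(sample)))
```

Each envelope returned here still has to go through signature verification before its payload is trusted.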

What about multiple signatures?

DSSE supports multiple signatures on the same envelope, each with its own keyid. This is useful when multiple parties need to attest to the same claim: a builder can sign provenance, and then a policy verifier can add its own signature indicating that the provenance met policy. The payload and payload type stay the same; new signatures are appended to the signatures list.
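A sketch of the append-and-count pattern follows. As a loudly labeled assumption, HMAC-SHA256 with per-party shared keys stands in for real asymmetric signatures so the example needs only the standard library. The function names, key IDs, and keys are hypothetical.

```python
import base64
import hashlib
import hmac


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE Pre-Authentication Encoding of the payload type and raw payload."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def add_signature(envelope: dict, keyid: str, key: bytes) -> dict:
    """Return a copy of the envelope with one more signature over the same PAE."""
    payload = base64.b64decode(envelope["payload"])
    sig = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    entry = {"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}
    return {**envelope, "signatures": [*envelope["signatures"], entry]}


def verified_keyids(envelope: dict, trusted_keys: dict) -> set:
    """Collect the distinct trusted keyids whose signatures verify."""
    message = pae(envelope["payloadType"], base64.b64decode(envelope["payload"]))
    good = set()
    for entry in envelope["signatures"]:
        key = trusted_keys.get(entry.get("keyid"))
        if key is None:
            continue
        expected = hmac.new(key, message, hashlib.sha256).digest()
        if hmac.compare_digest(expected, base64.b64decode(entry["sig"])):
            good.add(entry["keyid"])
    return good


def meets_threshold(envelope: dict, trusted_keys: dict, threshold: int) -> bool:
    """Require at least `threshold` distinct trusted signers."""
    return len(verified_keyids(envelope, trusted_keys)) >= threshold


keys = {"builder": b"builder-key", "policy-verifier": b"verifier-key"}  # hypothetical
unsigned = {
    "payloadType": "application/vnd.in-toto+json",
    "payload": base64.b64encode(b'{"_type": "https://in-toto.io/Statement/v1"}').decode("ascii"),
    "signatures": [],
}
by_builder = add_signature(unsigned, "builder", keys["builder"])
by_both = add_signature(by_builder, "policy-verifier", keys["policy-verifier"])
print(sorted(verified_keyids(by_both, keys)))
```

Counting distinct keyids rather than signature entries matters: otherwise one party could satisfy a two-signer policy by signing twice.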

Most tooling ignores this capability today, treating DSSE as a single-signer format. That is fine for simple flows, but it is worth knowing the capability exists when designing distributed verification. Witness, which we cover in a separate post, is one of the few tools that takes multi-signer DSSE seriously and builds workflows around it.

How Safeguard Helps

Safeguard accepts in-toto attestations in their native DSSE form, validates the envelope, decodes the Statement, and routes the predicate to the right verifier based on predicateType. We support SLSA Provenance v1, SPDX, CycloneDX, and VSA predicates out of the box, and we preserve the original envelope bytes so audit trails remain verifiable after ingestion. The platform also handles the three attachment conventions (file, OCI referrer, Rekor reference) without forcing pipelines to standardize. When a predicate type arrives that we do not yet parse, we store the raw envelope and surface it under the artifact so your team can write custom policy against it.