Cloud Security

AWS ECR Signing Policies with Notation

ECR now supports Notation-based image signing and trust policy enforcement. Here is how to design signing policies that survive scale and auditors.

Shadab Khan
Security Engineer
7 min read

Notation is the CNCF project that has quietly become the most practical image signing tool for AWS shops. Cosign still has the broader community and better Sigstore integration, but Notation's native alignment with AWS Signer and ECR makes it the path of least resistance for signing images that will run on EKS, ECS, or App Runner. The real design question is not "Cosign or Notation" — it is how to structure trust policies so that the signing scheme actually improves security instead of becoming a second registry with the same access problems.

This is the walk-through I give teams that have decided on Notation and now have to make it work at production scale. It covers trust policy structure, identity hygiene, and the gotchas that show up when you onboard the third team.

Why use Notation with ECR rather than Cosign?

Because AWS Signer plus Notation is the only signing path that integrates with ECR replication, ECR pull-through caches, and AWS IAM out of the box. Cosign works with ECR as a generic OCI registry, and it works well, but the trust policy is enforced by your admission controller rather than by AWS itself. For shops where AWS is the security perimeter, that difference matters.

AWS Signer acts as a managed signing service with keys that live in a service-managed HSM. You do not handle the key material. You define a signing profile, grant a role permission to sign against it, and AWS handles the actual cryptographic operation. The signature is attached to the image as an OCI artifact using the Notation format. When the image is pulled by a Notation-aware client, the client validates the signature against the trust policy before the image is allowed to run.
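In practice, the CI-side signing step is a single Notation invocation through the AWS Signer plugin. A sketch, with placeholder account ID, region, profile name, repository, and digest:

```shell
# Sign by digest, not by tag, so the signature binds to immutable content.
# All identifiers below are illustrative placeholders.
notation sign \
  --plugin com.amazonaws.signer.notation.plugin \
  --id arn:aws:signer:us-east-1:123456789012:/signing-profiles/team_a_builder \
  --plugin-config aws-region=us-east-1 \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/prod/api@sha256:<digest>
```

The `--id` flag points at the signing profile ARN, which is how AWS Signer knows which managed key to use; the CI role never sees the key itself.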

The operational advantage is that you never have a signing key on a build host. In Cosign's keyless flow, you get a similar property through Sigstore, but Sigstore is an external dependency with its own trust assumptions. For regulated workloads, having the signing service inside the AWS compliance boundary simplifies the audit story significantly.

The trade-off is that Notation is less flexible. You cannot easily sign with a BYO key, the trust policy language is less expressive than Cosign's CEL-based policies, and the tooling ecosystem is smaller. If you need to sign with HSM-held customer keys for a regulatory requirement, Notation's default flow does not help.

How should I structure Notation trust policies?

One trust policy per trust scope, scoped by registry and repository pattern, with explicit identity requirements. The Notation trust policy has four main fields: registry scopes, signature verification level, trust stores, and trusted identities. Get all four right and you have a policy that is both enforceable and auditable.

registryScopes should never be "*" in production. Use explicit repository patterns so that each scope corresponds to a clear set of images. A scope like 123456789012.dkr.ecr.us-east-1.amazonaws.com/prod/* applies to all production repositories in one account. Separate scopes for dev, staging, and prod let you apply different identity requirements at each level.

signatureVerification.level should be strict for production. The audit level logs signature failures without blocking, which is appropriate during rollout. The skip level exists for emergencies and should never be the default. I recommend a pipeline that promotes a scope from audit to strict only after a two-week clean signal.

trustedIdentities is where most policies fail to be restrictive. A trusted identity looks like x509.subject:CN=mybuilder,O=mycompany,C=US. If you only specify the organization, any certificate from that org signs valid images. Pin the common name to a specific builder identity, and rotate the builder certificate as part of normal PKI hygiene.
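Putting the four fields together, a minimal trust policy for the production scope might look like the sketch below; the trust store name aws-signer-ts and the builder identity are illustrative:

```json
{
  "version": "1.0",
  "trustPolicies": [
    {
      "name": "prod-strict",
      "registryScopes": [
        "123456789012.dkr.ecr.us-east-1.amazonaws.com/prod/*"
      ],
      "signatureVerification": {
        "level": "strict"
      },
      "trustStores": [
        "signingAuthority:aws-signer-ts"
      ],
      "trustedIdentities": [
        "x509.subject:CN=mybuilder,O=mycompany,C=US"
      ]
    }
  ]
}
```

A second policy entry with the same trust store but level audit can cover the dev and staging scopes during rollout.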

How do I handle multi-team signing without one team's mistake breaking production?

Give each team its own signing profile, its own trusted identity, and its own trust policy scope — and enforce that only specific repositories can be signed by specific profiles. The mistake I see is a single organization-wide signing profile where every team's builder can sign against every repository, and then one compromised builder signs malicious production images.

AWS Signer supports profile-level IAM, so you can grant a team's CI role permission to sign against only their profile. Combine this with an IAM condition on ecr:PutImage that requires a signature from a specific profile — the condition references the Notation-Signing-Cert-Subject metadata — and you get end-to-end enforcement: team A's builder cannot publish a signed image into team B's repository, even if it has write access to the repo.

This takes a bit of IAM policy engineering. The base pattern I use is:

  • signer:SignPayload on a specific profile ARN, for the team's CI role
  • ecr:PutImage on the team's repositories, with a condition on the signature metadata
  • Repository policies on the team's ECR repos that deny uploads not signed by the expected profile

The repository policy is the belt-and-suspenders layer. Without it, an IAM policy misconfiguration on the CI role could still push unsigned images. With it, the registry itself refuses unsigned uploads for that repo.
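A minimal sketch of the first layer as an IAM policy for team A's CI role; the account ID and profile name team_a_builder are hypothetical, and the ecr:PutImage statement with its signature-metadata condition layers on top of this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SignAgainstTeamProfileOnly",
      "Effect": "Allow",
      "Action": [
        "signer:SignPayload",
        "signer:GetSigningProfile"
      ],
      "Resource": "arn:aws:signer:us-east-1:123456789012:/signing-profiles/team_a_builder"
    }
  ]
}
```

Because the Resource is a single profile ARN, a compromised team A builder cannot produce signatures that the trust policy attributes to any other team.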

What goes in the admission-time verification layer?

A Notation-aware admission controller (Ratify is the common choice on Kubernetes) configured to pull the same trust policy the CI system uses to sign. The policy lives in Git, is mounted into the admission controller, and is never edited out of band. If the policy changes, the change goes through a pull request and redeploys the admission controller.

The admission controller pulls the image's signature from ECR at verification time, validates the signature chain against the trust store, and matches the signer identity against the trusted identities list. If all of these pass, the pod is allowed. If any fail, the pod is rejected with an audit event.
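A sketch of a Ratify Verifier resource carrying the Git-managed trust policy; the trust store name and builder identity are illustrative, and field names can shift between Ratify versions, so check the release you deploy:

```yaml
apiVersion: config.ratify.deislabs.io/v1beta1
kind: Verifier
metadata:
  name: verifier-notation
spec:
  name: notation
  artifactTypes: application/vnd.cncf.notary.signature
  parameters:
    trustPolicyDoc:
      version: "1.0"
      trustPolicies:
        - name: prod-strict
          registryScopes:
            - 123456789012.dkr.ecr.us-east-1.amazonaws.com/prod/*
          signatureVerification:
            level: strict
          trustStores:
            - signingAuthority:aws-signer-ts
          trustedIdentities:
            - "x509.subject:CN=mybuilder,O=mycompany,C=US"
```

Rendering this manifest from the same Git-tracked trust policy file the CI system uses keeps signing and verification from drifting apart.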

The gotcha is that signature verification adds latency to pod creation. For a signed image on a warm path, verification is under 200ms. For a cold path, it can be 2-3 seconds. Tune your kubelet image pull timeout accordingly, and preload images in namespaces that need fast pod spin-up.

What about cross-account ECR replication and signatures?

Signatures replicate with the image when you use ECR's replication feature, but the trust store in the destination account has to be configured to trust the same signing identity. This is where cross-account supply chains break most often — the image replicates, the signature replicates, and then the destination admission controller rejects the pod because the trust store does not include the source account's signer.

The fix is to treat signing identity as a global resource. Define your trusted builders at the organization level in AWS Organizations, replicate the signer profiles across accounts using a shared trust store definition in Git, and ensure that every admission controller in every account pulls from the same Git source of truth. This is a bit of operational overhead, but it is the only way cross-account signing stays coherent.
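On the replication side, a registry-level ECR replication configuration (applied with aws ecr put-replication-configuration) copies the prod repositories, signatures included, to a destination account; the destination registry ID 222233334444 is a placeholder:

```json
{
  "rules": [
    {
      "destinations": [
        {
          "region": "us-east-1",
          "registryId": "222233334444"
        }
      ],
      "repositoryFilters": [
        {
          "filter": "prod/",
          "filterType": "PREFIX_MATCH"
        }
      ]
    }
  ]
}
```

Replication only moves the artifacts; the destination account's trust store and admission controller configuration still have to trust the source signer, which is exactly the Git-managed piece described above.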

How Safeguard.sh Helps

Safeguard.sh pairs ECR signing telemetry with reachability analysis across every signed image you run, so signature verification effort is focused on artifacts whose vulnerabilities sit in reachable call paths; reachability cuts 60 to 80 percent of findings that would otherwise crowd the admission audit log. Griffin AI generates Notation trust policies, AWS Signer profile bindings, and admission controller configurations from your SBOM graph, and alerts when the trust store drifts from Git. Safeguard's SBOM module attaches a signed SBOM attestation to every image at 100-level dependency depth, and its TPRM module tracks which external signer identities are currently in the trust store. Container self-healing rewrites workload image references automatically when a patched and resigned image is published through the same ECR replication path.
