ECR has supported OCI artifact attachments since 2023, which means Cosign signatures land in the registry next to the image rather than in a separate signature store. That single change took image signing from "interesting in theory" to "operationally viable in production AWS shops." The 2026 question is no longer whether to sign images. It is how to roll signing out across an AWS estate without breaking the deploy pipeline.
I have led three production rollouts of Cosign on ECR in the last fifteen months, in shops that ranged from a few hundred images to several thousand. The patterns that work are surprisingly consistent. The patterns that fail are also consistent, and the failure mode is almost always the same: signing the images is easy, verifying them at deploy time is the part that breaks production.
What is the right Cosign signing approach for ECR in 2026?
For most production AWS shops, the right approach is KMS-backed Cosign signing with a per-pipeline key, with keyless signing as a separate path for open-source artifacts the shop publishes externally. KMS-backed keys give you a managed, audited, rotatable key that never leaves AWS. Keyless signing using Sigstore's public good infrastructure is appropriate for public artifacts but exposes the signing identity to the public Rekor log, which is usually not what enterprises want for internal artifacts.
The KMS-backed pattern is straightforward. You create an asymmetric KMS signing key in the build account, grant the CodeBuild (or other CI) service role kms:Sign on it, and point Cosign at the key with the awskms:// URI. The signature is stored in ECR alongside the image, the public key is exported once and distributed to the verification surface, and the signing path is audited by CloudTrail.
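A sketch of that path end to end, assuming an alias/cosign/&lt;pipeline&gt; naming convention and Cosign 2.x flag names; the account ID, region, and repository are placeholders:

```bash
# One asymmetric signing key per pipeline; SIGN_VERIFY key usage is what
# Cosign requires.
aws kms create-key \
  --key-spec ECC_NIST_P256 \
  --key-usage SIGN_VERIFY \
  --description "Cosign signing key: app-build pipeline"

# A stable alias so the pipeline config never embeds a raw key ID.
aws kms create-alias \
  --alias-name alias/cosign/app-build \
  --target-key-id <key-id-from-create-key>

# In the build step, sign the pushed image by digest. The private key never
# leaves KMS; --tlog-upload=false keeps internal artifacts out of the public
# Rekor log, per the keyless discussion above.
cosign sign --key awskms:///alias/cosign/app-build \
  --tlog-upload=false \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/app@sha256:<digest>
```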
The per-pipeline key requirement is the part most rollouts get wrong. A single signing key for the whole organization is operationally simple and a security liability — a compromise of any pipeline that uses the key compromises every image the key has signed. A per-pipeline key contains the blast radius and gives auditors a clean answer to "which pipeline produced this artifact."
How do I avoid breaking the deploy pipeline during rollout?
Roll out in three phases: sign-only, verify-warn, verify-enforce. Each phase runs for at least two weeks before moving to the next, and the move is gated by a dashboard that shows the percentage of images signed, the percentage of pipelines that include the signing step, and the count of verification failures.
The sign-only phase adds the Cosign step to the build pipeline but does not enforce verification anywhere. The output of this phase is a complete inventory of signed images and a list of pipelines that are not signing. The list is the work item for the next phase.
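A sketch of that inventory for a single repository, using cosign triangulate to compute the tag Cosign stores the signature object under; the registry and repository names are placeholders:

```bash
REGISTRY=123456789012.dkr.ecr.us-east-1.amazonaws.com
REPO=app

# For each image digest in the repository, check whether the tag Cosign
# stores the signature under exists in ECR. Unsigned digests are the work
# items for the next phase.
for digest in $(aws ecr list-images --repository-name "$REPO" \
      --query 'imageIds[].imageDigest' --output text | tr '\t' '\n' | sort -u); do
  sig_tag=$(cosign triangulate "$REGISTRY/$REPO@$digest" | awk -F: '{print $NF}')
  if aws ecr describe-images --repository-name "$REPO" \
        --image-ids imageTag="$sig_tag" >/dev/null 2>&1; then
    echo "signed   $digest"
  else
    echo "UNSIGNED $digest"
  fi
done
```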
The verify-warn phase enables signature verification at deploy time, but configures the verification to log failures rather than block deploys. The output is a complete inventory of deploy paths that would fail verification, which is almost always larger than the pipeline list, because the same image can be deployed by multiple paths. Each failing path is a work item.
The verify-enforce phase flips the verification to blocking. By the time you get here, the previous two phases have driven the failure count to zero or near zero, and the move to enforcement is uneventful. If the failure count is non-zero, you stay in verify-warn until it is.
How do I verify signatures on EKS without operational pain?
Use Kyverno, or Gatekeeper paired with an external verifier such as Ratify, with an OCI signature verification policy configured with the public key for each authorised signer. The verification happens at admission time, before the image is pulled to the node, which means a verification failure is a clean rejection rather than a partial deployment.
The Kyverno pattern is the one I recommend most often, because it ships a first-class Cosign verification rule. The policy specifies the registry, the image pattern, and the public keys that are acceptable for that pattern. A pod that references an image whose signature does not verify against any of the listed keys is rejected at admission, and the rejection is logged with enough context to debug.
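A minimal sketch of such a policy, assuming the Kyverno 1.7+ verifyImages schema; the account ID, region, and PEM are placeholders, and the transparency log settings must match how your images were signed. The validationFailureAction field is also where the verify-warn to verify-enforce flip from the rollout phases happens:

```bash
kubectl apply -f - <<'EOF'
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-ecr-signatures
spec:
  validationFailureAction: Audit   # verify-warn; flip to Enforce for phase three
  webhookTimeoutSeconds: 30
  rules:
    - name: require-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "123456789012.dkr.ecr.us-east-1.amazonaws.com/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <exported PEM for the app-build pipeline key>
                      -----END PUBLIC KEY-----
EOF
```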
The public key distribution is the operational detail most rollouts underestimate. The keys have to be available at admission time, which means they have to live in a ConfigMap or Secret in the cluster, and they have to be updated whenever a new pipeline is added or a key is rotated. Automating this distribution with the same IaC that manages the cluster is the right pattern; doing it manually is the wrong pattern, and it scales poorly past a dozen pipelines.
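Exporting the verification key is a one-time step per pipeline. A sketch, assuming the alias from earlier and a kyverno-keys ConfigMap as the distribution point (both names are assumptions):

```bash
# Export the public half once; the private key stays in KMS.
cosign public-key --key awskms:///alias/cosign/app-build > app-build.pub

# Push it to the cluster. In practice this belongs in the same IaC that
# manages the cluster, not in an ad-hoc kubectl call.
kubectl create configmap kyverno-keys \
  --namespace kyverno \
  --from-file=app-build.pub \
  --dry-run=client -o yaml | kubectl apply -f -
```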
For ECS and Lambda, the verification path is different. ECS does not have an admission controller, so verification has to happen in the deploy script or the pipeline action that calls aws ecs update-service. Lambda's container image deployment also has no admission controller, so the verification happens in the pipeline. In both cases, the verification step is a simple cosign verify call with the public key and the image reference, and a non-zero exit code fails the deploy.
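A sketch of the ECS gate, assuming the exported public key ships with the pipeline and the images were signed without a Rekor entry; the cluster and service names are placeholders:

```bash
IMAGE=123456789012.dkr.ecr.us-east-1.amazonaws.com/app@sha256:<digest>

# The gate: cosign verify exits non-zero when no signature matches the key.
# --insecure-ignore-tlog matches images signed with --tlog-upload=false;
# drop it if your signatures carry transparency log entries.
cosign verify --key app-build.pub --insecure-ignore-tlog=true "$IMAGE" || exit 1

# Only reached when verification succeeded.
aws ecs update-service \
  --cluster prod \
  --service app \
  --force-new-deployment
```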
What about IRSA, Pod Identity, and the signing path?
The signing role needs kms:Sign on the signing key and almost nothing else; in practice Cosign also calls kms:DescribeKey and kms:GetPublicKey to resolve the key, so grant those two read-only actions alongside it. Use IRSA or EKS Pod Identity to bind the build pod's service account to that role, and scope the role's trust policy with a condition on the source identity so only the build pipeline can assume it.
The condition on the source ARN is the part most teams skip. Without it, any pod in the cluster that can assume the role can sign images, which is too wide. With it, only the specific build pod from the specific pipeline can use the key, and a compromise of any other pod in the cluster cannot produce a valid signature.
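A sketch of that scoping under IRSA, assuming an OIDC provider ID of EXAMPLE, a ci namespace, and an app-build service account (all placeholders). The sub condition in the trust policy is what pins the role to the one pod identity, and the inline policy grants only the KMS actions the signing path uses:

```bash
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:sub": "system:serviceaccount:ci:app-build"
      }
    }
  }]
}
EOF

aws iam create-role --role-name app-build-signer \
  --assume-role-policy-document file://trust-policy.json

# Grant only what signing uses: Sign on the one key, plus the two read-only
# calls Cosign makes to resolve it.
aws iam put-role-policy --role-name app-build-signer \
  --policy-name cosign-sign --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": ["kms:Sign", "kms:DescribeKey", "kms:GetPublicKey"],
      "Resource": "arn:aws:kms:us-east-1:123456789012:key/<key-id>"
    }]
  }'
```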
The audit log on the KMS key is the corresponding evidence trail. Every kms:Sign call is logged by CloudTrail with the calling principal, the source IP, the key ARN, and the signing algorithm; the principal should always be the build pod's role and the source IP a build node. CloudTrail deliberately omits the message being signed, so correlating a signature with a specific image digest relies on the pipeline's own build logs. An anomaly in any of those signals is the leading indicator that the signing path has been compromised.
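A quick way to pull the recent KMS Sign events for review; a sketch, since in production this review belongs in a scheduled query or a detection rule rather than an ad-hoc CLI call:

```bash
# Each returned record carries the calling principal, source IP, key ARN,
# and signing algorithm -- the signals worth alerting on.
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=Sign \
  --max-results 50 \
  --query 'Events[].CloudTrailEvent' \
  --output text
```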
What about the legacy unsigned images already in ECR?
Tag them with a legacy-unsigned label, exclude them from the verification policy with an explicit exception list, and set a deadline by which the exception list must shrink to zero. A blanket exception for "old images" tends to become permanent unless it has a deadline and a tracking ticket per image.
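In Kyverno terms, the exception list can live on the rule itself. A sketch of the earlier policy extended with it, assuming a Kyverno release that supports skipImageReferences (verify the field against your version's schema); the image names and digests are placeholders, and each entry pins one legacy image rather than a wildcard:

```bash
kubectl apply -f - <<'EOF'
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-ecr-signatures
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "123456789012.dkr.ecr.us-east-1.amazonaws.com/*"
          # The exception list: one explicit digest per legacy image, each
          # with a tracking ticket and a removal deadline. No wildcards.
          skipImageReferences:
            - "123456789012.dkr.ecr.us-east-1.amazonaws.com/legacy-api@sha256:<digest>"
            - "123456789012.dkr.ecr.us-east-1.amazonaws.com/batch-worker@sha256:<digest>"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <exported PEM>
                      -----END PUBLIC KEY-----
EOF
```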
The migration path for a legacy image is straightforward. The image's source repository builds a new image in the new pipeline, the new image is signed, the deployment manifest is updated to reference the new image, and the legacy image is removed from the exception list. The shop's standard image-update process should handle this naturally; the supply chain rollout just adds the verification gate to the existing process.
How Safeguard Helps
Safeguard verifies Cosign signatures on every ECR image at scan time, regardless of whether the image is currently deployed, and produces a continuous posture signal that maps signed images to deployed workloads. The posture signal flags images that are deployed but not signed, signed but not by an authorised key, or signed by a key that has been rotated out.
Griffin AI generates per-pipeline KMS keys, Cosign sign steps, Kyverno verification policies, and IRSA role configurations from an existing ECR deployment, then opens pull requests against the customer's IaC repo to drive the rollout. The PR-driven workflow is what makes the rollout tractable in shops with hundreds of pipelines.
Safeguard's SBOM module attaches an in-toto attestation to every ECR image alongside the Cosign signature. The TPRM module tracks the trust state of every base image and dependency the build consumes, and the container self-healing module automatically rewrites image references in deployed workloads when a patched image is published with a fresh signature. The verify-warn dashboard is built into the platform, so the rollout's phase progression is data-driven rather than guesswork.
For shops that need to cut signature verification noise, Safeguard's reachability scoring filters the verification findings to images that are actually deployed, which typically reduces the alert volume by 70 to 90 percent compared to the raw registry inventory.