Cloud Security

Azure ACR Image Signing with Notation Policy

Azure Container Registry plus Notation gives you signing, trust policy, and AKS enforcement without bolting on Sigstore. Here is how the pieces actually fit together.

Shadab Khan
Security Engineer
7 min read

Azure Container Registry's Notation integration is the Microsoft-native answer to image signing. It signs with keys held in Azure Key Vault, stores the resulting signatures in the registry as OCI artifacts, and verifies them at admission time using Ratify on AKS. The end-to-end story is materially easier to deploy than a cross-cloud Cosign or Sigstore setup, and if your org has standardized on Microsoft everywhere, this is the path of least resistance.

What the documentation does not do well is tell you how to scale this to a multi-team, multi-registry, multi-cluster environment without every team inventing their own trust model. This guide fills that gap with patterns that have survived about a dozen real deployments.

Why use Notation with ACR instead of Cosign?

Because the key material lives in Azure Key Vault and the signing operation is a native Azure call, which makes the compliance story dramatically simpler for Microsoft-centric organizations. Cosign is fine, and many teams run it on Azure successfully, but the keyless Sigstore flow pulls in a dependency on a public-good service and the keyed flow still needs you to build the Key Vault integration yourself.

With Notation and the AKV plugin, the signing request is an AKV Sign operation, which is already in your audit trail, already in your key management, and already subject to the same RBAC and rotation policies as every other key your organization holds. The signing identity is the Microsoft Entra principal that invoked the sign operation against the AKV key, which is verifiable in activity logs and traceable to whichever workload ran the operation.

The operational benefit is that the signing key never leaves Key Vault. The build host calls AKV, AKV signs the payload using the HSM-backed key, and the resulting signature is uploaded to ACR as an OCI artifact attached to the image. An attacker who compromises the build host gets access to call the signing operation for as long as the token is valid, but they do not get the key itself.
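For concreteness, here is a minimal sketch of that flow using the notation CLI with the azure-kv plugin. The registry, vault, certificate, and key names are all placeholders, and the plugin install steps are omitted; verify the details against the plugin's own docs.

```bash
# Placeholders throughout: prodacr, kv-prod, prod-signing-cert, prod-signer.
# Assumes `az login` has run and the azure-kv notation plugin is installed.

ACR=prodacr.azurecr.io
IMAGE="$ACR/team-a/api@sha256:<digest>"   # sign by digest, not by tag

# Resolve the Key Vault key ID behind the signing certificate.
KEY_ID=$(az keyvault certificate show \
  --vault-name kv-prod --name prod-signing-cert \
  --query kid -o tsv)

# One-time per build host: register the AKV-backed key with notation.
# (Add --plugin-config self_signed=true if the certificate is self-signed.)
notation key add prod-signer --plugin azure-kv --id "$KEY_ID" --default

# The sign call sends the payload to AKV, AKV signs with the HSM-backed key,
# and notation pushes the signature to ACR as an OCI artifact.
az acr login --name prodacr
notation sign "$IMAGE"
```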

How do I structure ACR, AKV, and signing profiles?

One registry per environment or per tenancy boundary, one AKV per environment, and one signing key per logical signing identity — not one signing key per project. The common mistake is to create a new signing key every time a new team starts signing, which produces a sprawling key inventory that nobody audits.

A signing identity in this model is a policy-level concept: "images signed by the production build system." That identity backs onto one AKV key, and the build systems for every production-eligible team are all granted permission to call that one key. The trust policy on the AKS side lists that one identity as trusted for the production scope. When you add a new team, they do not get a new signing key; they get permission to sign against the existing production identity.
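In Azure terms, onboarding a team to an existing signing identity is one role assignment on the shared key. A sketch, assuming the vault uses Azure RBAC; the vault, key, and principal names are placeholders:

```bash
# Scope the grant to the single production signing key, not the whole vault.
KEY_SCOPE="$(az keyvault show --name kv-prod --query id -o tsv)/keys/prod-signing-key"

# "Key Vault Crypto User" allows sign/verify operations but not key
# management or export, which is exactly what a CI principal needs.
az role assignment create \
  --assignee "$TEAM_B_CI_PRINCIPAL_ID" \
  --role "Key Vault Crypto User" \
  --scope "$KEY_SCOPE"
```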

If that feels too permissive, the second layer is IAM on ACR itself. A team's CI principal has push rights (AcrPush or a repository-scoped token) on only its own repositories, so even though team A's CI can sign against the shared production key, it cannot push to team B's repositories. The signing identity is the trust assertion; the registry RBAC is the authorization boundary.
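One concrete way to express that boundary today is ACR scope maps and tokens, which grant push on named repositories only. A sketch with placeholder names; repository-scoped permissions for Entra principals are newer and still evolving, so check what your tenant supports:

```bash
# Team A's credential can read and write team-a/api and nothing else.
az acr scope-map create --name team-a-push --registry prodacr \
  --description "Team A CI: push to team-a/api only" \
  --repository team-a/api content/read content/write

az acr token create --name team-a-ci --registry prodacr \
  --scope-map team-a-push
```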

For multi-registry topologies — a common pattern is one ACR per region with replication — use the same AKV key and signing identity across all registries, because an image signed once should verify everywhere the image is replicated.

How does the Ratify admission policy actually work?

Ratify is the Kubernetes admission controller that validates Notation signatures. On AKS it integrates with Azure Policy, so you configure it through the platform rather than deploying Ratify manifests directly. The policy is: for any image in a specified scope, the image must have a Notation signature, and the signature must chain up to a trust store that contains the expected root certificate.
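On AKS the platform-managed path is the Image Integrity feature, which deploys Ratify for you and wires it to Azure Policy. It is in preview at the time of writing, so treat the feature and flag names in this sketch as assumptions to verify against current docs:

```bash
# Preview feature registration; names may change as the feature matures.
az feature register --namespace Microsoft.ContainerService \
  --name EnableImageIntegrityPreview
az feature register --namespace Microsoft.ContainerService \
  --name AKS-AzurePolicyExternalData

# Requires the Azure Policy add-on already enabled on the cluster.
az aks update --resource-group rg-prod --name aks-prod \
  --enable-image-integrity
```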

The trust store in Ratify is a namespaced Kubernetes resource. Populate it with the certificates of each signing identity you trust for that cluster. A production cluster's trust store contains only the production signing root; a development cluster's trust store contains development roots and, optionally, production roots for read-only deployments.
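Depending on your Ratify version, the trust store resource is a CertificateStore (namespaced, older) or a KeyManagementProvider. A sketch of the latter, pulling the production root from AKV; the CRD group and parameter names follow Ratify's v1.x docs, and the vault, certificate, tenant, and client values are placeholders:

```bash
kubectl apply -f - <<'EOF'
apiVersion: config.ratify.deislabs.io/v1beta1
kind: KeyManagementProvider
metadata:
  name: prod-signing-root
spec:
  type: azurekeyvault
  parameters:
    vaultURI: https://kv-prod.vault.azure.net/
    certificates:
      - name: prod-signing-cert
    tenantID: <tenant-id>
    clientID: <workload-identity-client-id>
EOF
```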

The policy is expressed in Azure Policy syntax, and you can scope it per-namespace using Kubernetes labels. This lets you stage rollouts: enable the policy in warn mode on every namespace, then move namespaces to deny mode one at a time as you confirm every image is signed. The warn-to-deny progression is where most deployments fail, because teams try to flip the whole cluster in one change and discover they have 30 unsigned legacy images running.

Bake two weeks of warn-mode operation into every rollout. Use the policy compliance report to list every image that would have been rejected, chase down each image owner, and do not flip the switch until the report is clean.
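The compliance report is queryable from the CLI, which makes the chase-down list scriptable. A sketch, with a placeholder assignment name:

```bash
# Everything still non-compliant under the warn-mode assignment.
az policy state list \
  --filter "policyAssignmentName eq 'notation-image-integrity' and complianceState eq 'NonCompliant'" \
  --query "[].{resource:resourceId, evaluated:timestamp}" -o table
```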

What does a multi-team trust policy look like?

The trust policy lives in a Git repository owned by the platform team. It declares the registry scopes, the trust stores, the trusted identities, and the verification levels. Individual teams do not edit the trust policy directly; they file pull requests to add their signing identity to the list of trusted identities for their scope.

A multi-team scope definition in Notation's trustpolicy.json looks something like this: one entry per environment, with a registry scope that matches the environment's ACR, and trusted identities that include every signing root valid for that environment. Team-specific repositories are handled by ACR RBAC, not by the trust policy, because the trust policy is about who can sign, not about who can push.
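A sketch of that shape, with placeholder registry and certificate names. Note that registryScopes entries are fully qualified repositories (or a single "*" wildcard); check your Notation version before relying on anything broader:

```bash
cat > trustpolicy.json <<'EOF'
{
  "version": "1.0",
  "trustPolicies": [
    {
      "name": "production",
      "registryScopes": [
        "prodacr.azurecr.io/team-a/api",
        "prodacr.azurecr.io/team-b/web"
      ],
      "signatureVerification": { "level": "strict" },
      "trustStores": [ "ca:prod-root" ],
      "trustedIdentities": [
        "x509.subject: C=US, O=Example Corp, CN=prod-signer"
      ]
    },
    {
      "name": "development",
      "registryScopes": [ "devacr.azurecr.io/team-a/api" ],
      "signatureVerification": { "level": "audit" },
      "trustStores": [ "ca:dev-root" ],
      "trustedIdentities": [ "*" ]
    }
  ]
}
EOF
notation policy import trustpolicy.json
```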

This design has one failure mode: a team whose signing root is in the production trust policy can, in principle, sign any production image. Accept this as a trust premise of the system — the signing root is inside your organization — and audit for it. The audit is a simple query: list every signed image in the production ACR, grouped by signer identity, and alert on any signer that is not expected for that image's repository.
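A sketch of that audit, assuming notation's JSON output; the exact field names (issuedTo here) vary by version, so verify against `notation inspect -o json` on one image first. Registry and repository names are placeholders:

```bash
ACR=prodacr
az acr login --name "$ACR"

# Emit "repo <tab> signer subject" for every signed manifest in the registry.
for repo in $(az acr repository list --name "$ACR" -o tsv); do
  for digest in $(az acr manifest list-metadata --registry "$ACR" --name "$repo" \
                    --query '[].digest' -o tsv); do
    notation inspect --output json "$ACR.azurecr.io/$repo@$digest" 2>/dev/null \
      | jq -r --arg repo "$repo" \
          '.signatures[]?.certificates[0].issuedTo | "\($repo)\t\(.)"'
  done
done | sort -u    # diff this against the expected signer-per-repository map
```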

How do I handle signature replication across regions?

ACR replication copies both the image and its signature automatically, but the replicated signature still needs its trust chain validated in the destination region. This works as long as every region's trust store contains the same root certificate. The mistake is to define per-region trust stores that diverge over time, and then discover that a globally replicated image fails admission in one region.
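Geo-replication itself is one command per region on a Premium registry, and the Notation signature artifact rides along with the image:

```bash
az acr replication create --registry prodacr --location westeurope
az acr replication list --registry prodacr -o table
```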

Treat the trust store as a global resource. Store it in Git, render it from a single source of truth, and apply it to every AKS cluster in every region through Azure Policy. When you add a new signing identity, it propagates everywhere. When you rotate a root certificate, it rotates everywhere. Divergence is the enemy.
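If Azure Policy is not distributing the trust store for you, even a low-tech loop beats per-cluster hand edits. A sketch, assuming one kubeconfig context per cluster and a trust-store/ directory in Git holding the Ratify resources:

```bash
# Apply the single source of truth to every cluster; any divergence
# shows up as a diff in Git, not as a mystery admission failure.
for ctx in $(kubectl config get-contexts -o name); do
  kubectl --context "$ctx" apply -f trust-store/
done
```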

How Safeguard.sh Helps

Safeguard.sh correlates ACR signing events with reachability analysis on every image your AKS clusters run, so the signature verification audit log is focused on vulnerabilities that are actually called by deployed workloads; reachability cuts 60 to 80 percent of the noise that would otherwise come out of Ratify compliance reports. Griffin AI generates Notation trust policies, AKV signing key bindings, and Azure Policy scopes from your existing registry graph and alerts when the trust store drifts from Git. Safeguard's SBOM module attaches a signed SBOM attestation to every ACR artifact at 100-level dependency depth, and its TPRM module tracks which external signer identities are currently trusted by your clusters. Container self-healing rewrites AKS image references automatically when a patched and resigned image is published through the same ACR replication fabric.
