
K8s Admission Controllers for Supply Chain Policy

How to design Kubernetes admission controllers that enforce supply chain policy without turning every deploy into a 30-minute argument with the cluster.

Shadab Khan
Security Engineer
6 min read

Admission controllers are the last honest choke point in a Kubernetes cluster. Everything upstream — CI pipelines, registries, scanners, signing services — is advisory. The moment the API server accepts a pod spec, the workload is on the node. If you want supply chain policy to actually mean something, it lives here.

I have spent the last two years rolling admission policy out to clusters that range from two-node edge sites to multi-tenant platforms with 40,000 pods. The pattern that works is narrower than the docs suggest and more political than the blog posts admit.

Why put supply chain enforcement at admission time?

Because nothing else in the pipeline is actually a gate. CI can be bypassed with a direct kubectl apply. Registry scanning happens asynchronously. Signing is worthless if nothing verifies the signature before the container starts. Admission control is the point where the cluster decides whether an image gets to execute, and it has full visibility into the object spec, the namespace, and the workload identity making the request.

I have watched teams bolt every other control onto their pipeline and then discover a break-glass service account that skipped all of them. Admission was the only layer that caught it, because the API server asks the same questions of every request regardless of who sent it.

That said — admission control is not free. Every admission webhook is latency you add to every create and update in the cluster. A badly written validating webhook with a 2-second timeout can stall node autoscaling during an incident. Treat the admission path as production-critical infrastructure, not a side project.
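The latency and failure-mode knobs live on the webhook registration itself, not in the policy engine. A minimal sketch of a registration that keeps the timeout short and exempts system namespaces — names, namespaces, and the service endpoint are placeholders:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: supply-chain-policy
webhooks:
  - name: images.policy.example.com
    # Keep this short: it is added to every matching create/update.
    timeoutSeconds: 3
    # Fail blocks all matching requests when the webhook is down;
    # Ignore fails open. Pick deliberately, per cluster.
    failurePolicy: Fail
    namespaceSelector:
      matchExpressions:
        # Exempt system namespaces so a webhook outage cannot wedge the cluster.
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system"]
    clientConfig:
      service:
        name: policy-webhook
        namespace: policy-system
        path: /validate
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
    sideEffects: None
    admissionReviewVersions: ["v1"]
```

The namespaceSelector exemption is the piece teams forget, and it is the difference between a degraded policy layer and a cluster that cannot restart its own control-plane pods.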

What should admission controllers actually check for supply chain?

Three things, in priority order. First, image provenance — is this image from a registry you trust, signed by an identity you recognize, with attestations that match the workload? Second, image identity — is the image referenced by digest rather than a mutable tag? Third, runtime configuration that interacts with the supply chain surface, such as whether the pod pulls from ephemeral registries or mounts the container socket.

The thing I see teams get wrong is trying to do vulnerability gating at admission time. Do not. CVE data is stale within hours, scanners disagree, and the right answer for a critical vulnerability at 2 a.m. is rarely "refuse to schedule the pod." Put vulnerability checks in CI where a human can react, and reserve admission for policy that has a binary right answer. Signature verification is binary. "Does this image have fewer than 3 high-severity CVEs" is not.

A short list I actually deploy:

  • Reject images not referenced by SHA256 digest in production namespaces
  • Reject images from any registry not on an explicit allowlist
  • Require a valid Sigstore signature with a Fulcio identity matching the expected workload identity
  • Require an in-toto SLSA provenance attestation, verified against the expected builder
  • Block hostPath, privileged, and hostNetwork unless the namespace is labeled for system workloads
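The first two checks translate directly into a Kyverno ClusterPolicy. A sketch, assuming a hypothetical registry.example.com allowlist and a namespace named prod — adapt the match blocks to your own layout:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-sources
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-digest
      match:
        any:
          - resources:
              kinds: ["Pod"]
              namespaces: ["prod"]
      validate:
        message: "Production images must be pinned by digest."
        pattern:
          spec:
            containers:
              # Matches nginx@sha256:..., rejects mutable tags like nginx:latest.
              - image: "*@sha256:*"
    - name: registry-allowlist
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must come from an approved registry."
        pattern:
          spec:
            containers:
              - image: "registry.example.com/*"
```

In practice you need parallel rules for initContainers and ephemeralContainers as well, or the policy has a hole exactly where debug tooling likes to hide.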

OPA Gatekeeper or Kyverno — which one wins in 2026?

Kyverno wins for most supply chain policy, Gatekeeper wins for complex authorization logic. This is an unpopular opinion, and I have flipped on it twice, but the current state of both projects makes it clear.

Kyverno's native support for verifyImages with Cosign and Notary is the deciding factor. You write a policy that says "any image in this namespace must be signed by this identity" and Kyverno handles key management, transparency log verification, and mutation of the image reference to its digest. In Gatekeeper, you are writing Rego and shelling out to external data providers, and you will spend a weekend you do not have debugging cert chains.
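A sketch of what that looks like as a Kyverno verifyImages rule, assuming keyless Cosign signatures issued to a hypothetical GitHub Actions identity — the registry, org, and issuer values are placeholders:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  # Signature verification does network calls; give it more headroom
  # than a plain validation rule.
  webhookTimeoutSeconds: 15
  rules:
    - name: require-cosign-signature
      match:
        any:
          - resources:
              kinds: ["Pod"]
      verifyImages:
        - imageReferences:
            - "registry.example.com/*"
          attestors:
            - entries:
                - keyless:
                    # Fulcio identity the signature certificate must carry.
                    subject: "https://github.com/example-org/*"
                    issuer: "https://token.actions.githubusercontent.com"
                    rekor:
                      url: "https://rekor.sigstore.dev"
          # Rewrite the image reference to its verified digest at admission,
          # which also satisfies the digest-pinning requirement.
          mutateDigest: true
```

The mutateDigest behavior is the underrated part: one rule verifies the signature and pins the digest, so the tag a developer wrote in the manifest can never drift out from under the signature that was checked.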

Kyverno's weakness is expressiveness. If your policy needs to correlate four resources, read a ConfigMap, and check a label on the namespace's parent, Rego is a better language for that problem. Gatekeeper also has a more mature constraint framework story — if you need to generate policy reports for auditors who want to see every violation the cluster has ever produced, Gatekeeper's audit CRDs are less work.

One pattern that works well: Kyverno for image and signature policy at admission, Gatekeeper for RBAC and resource policy audit. They do not conflict and they each play to their strength.
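The Gatekeeper half of that split is mostly Constraints running in audit or dry-run mode. A sketch using the K8sRequiredLabels template from the Gatekeeper documentation — the template must be installed first, and the parameters shape varies between template versions:

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: namespaces-must-have-owner
spec:
  # dryrun records violations in audit results without blocking requests.
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["owner"]
```

Gatekeeper's audit controller re-evaluates every Constraint against existing cluster state on an interval and writes the violations back into the Constraint's status, which is the report an auditor actually wants to see.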

How do you roll admission policy out without breaking everything?

Start in audit mode, ship the policy to production, and leave it there for two weeks before you flip the enforcement switch. I have never seen this go well without that waiting period, because there is always a legacy workload deployed from a forgotten Helm chart pointing at an unsigned image in a registry nobody owns anymore.

The staged rollout that has worked for me looks like this:

  • Set validationFailureAction: Audit on every new policy. Log violations to Prometheus and send them to whoever owns the namespace.
  • After two weeks of clean audit signal for the known-good workloads, move non-production namespaces to Enforce.
  • Wait a full sprint.
  • Move production namespaces one at a time, with the team that owns the workload on a bridge call.
  • Enable the policy as a default for new namespaces, with a documented exception process.
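In Kyverno, the per-environment switch does not require duplicate policies: validationFailureActionOverrides lets a single policy audit everywhere while enforcing in the namespaces that have finished their soak. A sketch with placeholder namespace and registry names:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-sources
spec:
  # Default everywhere: log the violation, admit the workload.
  validationFailureAction: Audit
  validationFailureActionOverrides:
    # Namespaces that have completed the audit period get real enforcement.
    - action: Enforce
      namespaces:
        - "staging"
        - "prod-eu-west"
  rules:
    - name: registry-allowlist
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must come from an approved registry."
        pattern:
          spec:
            containers:
              - image: "registry.example.com/*"
```

Promoting a namespace is then a one-line diff to the overrides list, which is easy to review and easy to revert mid-incident.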

Do not skip steps. The time you save skipping staged rollout gets paid back with interest during the outage where a policy change blocks every pod restart across three regions because someone renamed a registry last Tuesday.

What about policy for pods that already exist?

Admission controllers only run at create and update. Anything already running in the cluster is invisible to validating webhooks until someone touches it. This matters for supply chain policy more than other policy classes, because the question "are any of my currently-running pods from an untrusted source" is one an auditor will ask.

The answer is audit mode combined with a separate scanner that lists all running pods, extracts their image references, and checks them against the same policy bundle. Kyverno has a reporting mode for this, and OPA Gatekeeper has its audit controller. Run it on a cron, alert on drift, and track the numbers over time. If you cannot produce a list of every unsigned image running in the cluster within 60 seconds of being asked, your supply chain program is narrative rather than operational.
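In Kyverno terms this is the background setting: the same rule that gates admission is periodically re-evaluated against resources already in the cluster, with violations written to per-namespace PolicyReport objects. A fragment, assuming a hypothetical registry allowlist rule:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: audit-running-images
spec:
  validationFailureAction: Audit
  # Re-evaluate existing resources, not just new admission requests,
  # and record violations as PolicyReport objects.
  background: true
  rules:
    - name: registry-allowlist
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Running image is not from an approved registry."
        pattern:
          spec:
            containers:
              - image: "registry.example.com/*"
```

A `kubectl get policyreports -A` then produces the drift list on demand, which is exactly the 60-second answer the auditor is asking for.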

How Safeguard.sh Helps

Safeguard.sh pairs admission policy with reachability analysis, so you are not burning cycles enforcing signatures on images that contain vulnerable code nothing in your workload ever executes — reachability cuts 60 to 80 percent of the noise that would otherwise show up in admission audit reports. Griffin AI generates and validates admission policy for Kyverno and Gatekeeper directly from your SBOMs, so a policy change in one cluster propagates to the rest with correct identity references. Safeguard's TPRM module tracks the trust chain of every third-party image to 100 levels of dependency depth, and container self-healing rewrites workloads automatically when a signed patched image is published upstream, closing the gap between "policy blocks deploy" and "policy unblocks deploy with the fix already applied."
