
A Practical Kubernetes Operator Security Checklist

Kubernetes operators run with broad cluster access. This checklist covers the controls that matter most in 2025, from RBAC scoping to image provenance.

Shadab Khan
Security Engineer

Kubernetes operators are the quietest privileged software in most clusters. They reconcile custom resources, provision cloud infrastructure, and often hold cluster-admin-equivalent rights. OperatorHub listed 417 community and certified operators in November 2025, and the average production cluster runs nine. When the CNCF reviewed operator incidents between 2023 and 2025, over-scoped ClusterRoles accounted for 46% of exploited paths and vulnerable controller images accounted for 22%. A compromised operator does not require a container escape to do damage; it can create pods, read secrets, and mint ServiceAccount tokens through the API it was built to call. This checklist distills the controls that matter most going into 2026, with specific examples you can lift into your own environment.

Why are operators such a high-leverage target?

Because they combine broad permissions with persistent reconciliation loops. A web pod that gets compromised must escape or pivot; an operator's pod often already has get, list, watch, create, update, delete on secrets and pods, cluster-wide. The OLM (Operator Lifecycle Manager) model historically encouraged requesting ClusterRole scope for convenience, and many maintainers have not trimmed permissions since. Combined with the fact that operators are typically run from vendor-published images updated on vendor cadence, they sit at the intersection of privileged code and supply chain trust.

What does minimum-viable RBAC look like for an operator in 2025?

Namespace-scoped when possible; ClusterRole only for genuinely cluster-wide resources; resourceNames restrictions where the operator reconciles a known set. Two long-standing features make scoping audits easier: aggregated ClusterRoles, which compose permissions from small, labeled roles, and kubectl auth can-i --list, which prints what a given identity is actually allowed to do. Because RBAC is purely additive and has no deny rules, the minimum checklist is about what you never grant: no cluster-admin binding, no get on secrets unless scoped with resourceNames, and no verb that does not appear in the operator's reconciliation traces over a 30-day window.

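# Scoped permission example: a namespaced Role, not cluster-admin
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: widget-operator   # hypothetical name, added so the manifest applies
  namespace: acme-system
rules:
# Reconcile only the operator's own custom resource.
- apiGroups: ["acme.io"]
  resources: ["widgets"]
  verbs: ["get", "list", "watch", "update", "patch"]
# Read exactly one named Secret; no list or watch on secrets.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["widget-admin"]
  verbs: ["get"]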
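
A Role grants nothing until it is bound to an identity. The sketch below shows one way to do that, assuming the Role above is named widget-operator and the operator runs under a dedicated ServiceAccount of the same name. Both names are hypothetical and only illustrate the pattern.

```yaml
# Dedicated identity for the operator, instead of the namespace's default ServiceAccount.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: widget-operator        # hypothetical name
  namespace: acme-system
---
# Bind the namespaced Role to that ServiceAccount only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: widget-operator        # hypothetical name
  namespace: acme-system
subjects:
- kind: ServiceAccount
  name: widget-operator
  namespace: acme-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: widget-operator        # assumes the Role carries this name
```

Once applied, kubectl auth can-i --list --as=system:serviceaccount:acme-system:widget-operator -n acme-system should show only the rules granted here plus the defaults every authenticated identity receives.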

What admission and runtime controls should gate operators?

Three layers. First, admission-time: use Kyverno or Gatekeeper to require signed images, to reject pods with hostPID, hostNetwork, or privileged containers, and to disallow automounting of default ServiceAccount tokens. Second, runtime: apply a Pod Security Admission restricted profile and, where possible, seccomp RuntimeDefault. Third, network: deny egress by default for operator namespaces, then allow only the API server and documented webhooks. CNCF's 2025 Kubernetes hardening guide bundles these as the operator-specific baseline.
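
The runtime and network layers can be sketched with built-in Kubernetes resources alone. The fragment below assumes the operator lives in a namespace called acme-system (the same illustrative namespace used earlier) and that the cluster's CNI plugin enforces NetworkPolicy. It is a starting point, not a complete baseline.

```yaml
# Enforce the Pod Security Admission "restricted" profile on the operator namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: acme-system
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
---
# Default-deny egress: the policy selects every pod in the namespace and
# lists no egress rules, so all outbound traffic is dropped.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress    # hypothetical name
  namespace: acme-system
spec:
  podSelector: {}
  policyTypes:
  - Egress
```

With this in place the operator cannot reach the API server or DNS either, so a second, narrower policy has to allow those destinations. Its addresses and ports are cluster-specific and are deliberately left out here.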

How should we verify operator supply chain integrity?

Require signed images with attestations. Red Hat OpenShift operators in the certified catalog have shipped with Sigstore-style signatures since OCP 4.16. Prometheus Operator, Strimzi, cert-manager, and Crossplane all sign images via cosign and publish SBOMs as of 2025. A gate-worthy policy has three requirements: a valid cosign signature against a pinned public key, an SBOM attestation in CycloneDX 1.6 or SPDX 3.0, and SLSA provenance pointing to a known-good build path. Operators without signed releases should be upgraded, replaced, or run with additional sandboxing.
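
The signature half of that policy could look roughly like the Kyverno ClusterPolicy below. It is a sketch under assumptions: the policy name, namespace, and registry path are hypothetical, the public key is a placeholder you must replace, and field names can shift between Kyverno releases, so check it against the version you run. It covers the cosign signature only, not the SBOM or SLSA attestations.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-operator-images   # hypothetical name
spec:
  validationFailureAction: Enforce       # block, rather than only audit, on failure
  webhookTimeoutSeconds: 30
  rules:
  - name: verify-cosign-signature
    match:
      any:
      - resources:
          kinds:
          - Pod
          namespaces:
          - acme-system                  # illustrative operator namespace
    verifyImages:
    - imageReferences:
      - "registry.example.com/acme/*"    # hypothetical registry path
      attestors:
      - entries:
        - keys:
            # Placeholder: paste the vendor's pinned cosign public key here.
            publicKeys: |-
              -----BEGIN PUBLIC KEY-----
              <pinned cosign public key>
              -----END PUBLIC KEY-----
```

Running the policy in Audit mode for a release cycle before switching to Enforce is a common way to find unsigned images without breaking reconciliation.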

What incident telemetry actually helps when an operator is compromised?

Kubernetes audit logs with a specific filter for ServiceAccount tokens issued to operator namespaces, plus API server metrics on spikes in create/patch activity from operator principals. Falco's 2025 default rule set added three operator-tailored rules: unexpected secret reads by a controller, ServiceAccount token creation outside reconciliation windows, and CRD reads from outside the operator pod. Ship those to your SIEM, tag them by operator name, and alert on rate rather than absolute counts. Most compromised operators we have seen generate orders of magnitude more API calls than normal before human eyes notice.
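
On the audit-log side, a filter like the one described above might be expressed as rules in the API server's audit policy. This is a sketch that assumes the illustrative acme-system namespace and a hypothetical widget-operator ServiceAccount. The rules would need to be merged into your existing policy, since the first matching rule wins.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
- RequestReceived
rules:
# Token requests for any ServiceAccount in the operator namespace.
- level: Metadata
  verbs: ["create"]
  namespaces: ["acme-system"]
  resources:
  - group: ""
    resources: ["serviceaccounts/token"]
# Secret access by the operator's identity, in any namespace.
# Metadata level keeps secret payloads out of the audit log.
- level: Metadata
  users: ["system:serviceaccount:acme-system:widget-operator"]   # hypothetical identity
  resources:
  - group: ""
    resources: ["secrets"]
# Every write made by the operator's identity, for rate-based alerting.
- level: Metadata
  users: ["system:serviceaccount:acme-system:widget-operator"]
  verbs: ["create", "update", "patch", "delete"]
```

Keeping these events at Metadata level is a deliberate choice: it records who did what and when, which is enough for rate alerts, without copying request bodies into the SIEM.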

What should you do this quarter?

Audit each operator against the checklist above, prioritizing those with ClusterRole access to secrets. Remove any bundled default ServiceAccount, pin operator versions by digest, and add Kyverno policies to block unsigned images in operator namespaces by default. Finally, treat operators as first-class components in your SBOM pipeline so vulnerability findings map to a responsible owner, not to a shared "platform" team inbox.

How Safeguard Helps

Safeguard treats every Kubernetes operator as a tracked asset with its own SBOM, signature status, and RBAC posture. Customers can query which operators request cluster-admin, which images lack Sigstore signatures, and which namespaces violate their operator baseline policy, all in one dashboard. Policy gates block deployment of operator images that fail signature or SBOM checks, and integration with Kyverno and Gatekeeper enforces the controls at admission. When a CVE lands on a controller image, Safeguard surfaces every cluster that runs it and generates a prioritized remediation plan in minutes.
