Rancher occupies a particular niche. Larger than k3s, smaller than OpenShift, more opinionated than vanilla kubeadm, and distinctly popular in the places where cloud providers are not available — European telcos, Latin American banks, on-prem enterprise estates, edge deployments running in places without reliable internet. As of mid-2024 the 2.8 release line is current and RKE2 1.28 is the default downstream Kubernetes distribution.
SUSE, which acquired Rancher Labs in 2020, has invested heavily in its security posture over the last several years. The CIS hardened profiles work out of the box. The RKE2 distribution is tested against the DISA STIG. Fleet, the GitOps-native multi-cluster management layer, has matured past its early instability.
Making all of that work in production still requires specific hardening choices. This is a guide to the ones that matter.
Start With RKE2, Not RKE1
RKE1 has been deprecated since 2022 and will be end-of-life mid-2025. If you are running new Rancher clusters, they should be RKE2. The older distribution is still supported but is not receiving new features, and its security posture trails RKE2 in several meaningful ways.
RKE2 was designed with the CIS Kubernetes Benchmark as a first-class requirement. It ships with containerd as the runtime, enables seccomp profiles by default, uses SELinux when available, and pre-configures kube-apiserver audit logging. Setting profile: cis-1.23 (or profile: cis-1.7, depending on the RKE2 version) in config.yaml pushes the cluster to CIS-compliant defaults, which closes around 80 percent of the benchmark items without manual work.
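A minimal config.yaml sketch for that posture might look like the following. The audit-log retention argument is an illustrative addition, not something the CIS profile sets for you, and the profile value should match your RKE2 minor version:

```yaml
# /etc/rancher/rke2/config.yaml
profile: cis-1.23          # apply CIS-hardened defaults for this RKE2 release line
selinux: true              # enable SELinux support where the host provides it
kube-apiserver-arg:
  - audit-log-maxage=30    # illustrative: retain API server audit logs for 30 days
```

Note that the CIS profile also expects host-level preparation, such as an etcd user existing on the node and the kernel parameters from the RKE2-shipped sysctl configuration being applied, so the server will refuse to start on an unprepared host.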
If you are still on RKE1 for existing clusters, the migration path to RKE2 goes through either cluster rebuild (cleanest, most work) or the Rancher-provided migration tooling (less clean, less work). The migration itself is a good time to harden, because you are already touching every cluster.
The 2023 CVEs That Shaped Current Defaults
Rancher has had real CVEs in the last two years, and the patterns they follow are worth understanding.
CVE-2023-32191, disclosed in May 2023, was an authorization bug where users with Manage-Cluster-Members permission could elevate their own permissions. It affected Rancher 2.6.x before 2.6.13 and 2.7.x before 2.7.4. The lesson was that Rancher's RBAC layer is not Kubernetes RBAC — it is a second system that maps to Kubernetes RBAC, and bugs in the mapping layer are where privilege escalation lives.
CVE-2023-22647, from January 2023, allowed a local admin with access to the Rancher UI to escalate to cluster admin on downstream clusters. The root cause was inadequate separation between the Rancher management plane and the downstream clusters it managed. The fix tightened the boundary, but the lesson is that the Rancher UI is a trust anchor for every cluster it manages.
CVE-2023-22650, same disclosure, was a Fleet issue where a malicious Git repository could be used to escalate privileges on clusters using Fleet-managed GitOps. The fix introduced signed commit verification requirements; the lesson is that GitOps without signature verification is a supply chain vector whether you call it that or not.
CVE-2024-22030, disclosed in February 2024, was an authorization bypass in Rancher webhook validation. It affected 2.7.x before 2.7.12 and 2.8.x before 2.8.3. The fix required a Rancher upgrade, and the fact that it was disclosed alongside a working PoC made the patching window short.
The pattern is that Rancher's management plane is high-value attack surface, and cluster-to-cluster separation is not automatic. Both of those observations inform the hardening specifics below.
Separate the Rancher Management Plane
A common architectural mistake is running Rancher on the same cluster as production workloads. This is convenient and saves a node pool. It is also the mistake that turns a Rancher CVE into a production compromise.
Run Rancher on a dedicated management cluster, ideally one that hosts only Rancher, cluster-level observability, and any necessary supporting tooling. The workloads you manage should live on separate downstream clusters.
This is what the "HA Rancher" reference architecture has always recommended, but in practice many deployments collapse the management and workload planes for cost reasons. The cost of maintaining a separate management cluster is real; the cost of not maintaining one shows up during incident response, when you discover that the blast radius of a Rancher compromise includes every workload you care about.
If you cannot run a dedicated management cluster, at least apply strict NetworkPolicy between the Rancher namespace and workload namespaces, and ensure the cluster's API server is not directly reachable from workload pods.
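A sketch of that fallback NetworkPolicy, assuming Rancher runs in the usual cattle-system namespace and that ingress traffic arrives via an ingress-nginx controller (adjust the namespace label to whatever controller you actually run):

```yaml
# Deny ingress to every pod in cattle-system except traffic arriving
# through the ingress controller's namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: rancher-ingress-restrict
  namespace: cattle-system
spec:
  podSelector: {}            # applies to all pods in cattle-system
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # assumed controller namespace
```

This only helps if the cluster's CNI actually enforces NetworkPolicy, which is worth verifying before relying on it.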
The Authentication Integration Points
Rancher supports authentication via local users, LDAP, Active Directory, GitHub, SAML, OIDC, and several others. Pick one federated provider and stick with it. Local users should not exist beyond a break-glass admin account.
The break-glass account matters. Rancher has had outages where the configured identity provider was unreachable and the only way to recover was a local admin login. Keeping one local admin with a credential stored in a separate password manager — not in the same identity system that might be down — is operational hygiene.
Multi-factor authentication on the identity provider side is the obvious next step. Rancher does not implement its own MFA; it relies on whatever your SSO provider does. This is fine if your SSO provider does MFA well. It is a gap if your SSO provider is configured loosely.
Downstream Cluster Kubeconfig Handling
Rancher distributes kubeconfig files for downstream clusters through the UI and API. These kubeconfigs are the credentials an operator uses to talk directly to a workload cluster's kube-apiserver.
By default, the kubeconfigs Rancher generates are long-lived — typically years. This is a bad posture. A kubeconfig leaked from a laptop in 2023 should not still work in 2025.
The fix is to configure short-lived kubeconfig TTLs at the Rancher level and have users regenerate them as needed. Rancher 2.7.5 and later support kubeconfig TTL configuration globally and per-user. Set the TTL to something reasonable — hours to days, not years — and require users to regenerate credentials instead of caching them forever.
For programmatic access, Rancher API tokens should have TTLs as well. The CATTLE_TOKEN_TTL environment variable controls the default. A value of 24 hours or 72 hours is reasonable for most operational patterns.
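Both TTLs can be set through Rancher's Setting resources on the management cluster. The setting names below (kubeconfig-default-token-ttl-minutes, auth-token-max-ttl-minutes) are the ones current Rancher 2.7/2.8 releases expose, but verify them against your release before applying:

```shell
# Kubeconfig tokens expire after 24 hours (1440 minutes).
kubectl patch settings.management.cattle.io kubeconfig-default-token-ttl-minutes \
  --type=merge -p '{"value":"1440"}'

# Cap API token lifetime at 72 hours (4320 minutes).
kubectl patch settings.management.cattle.io auth-token-max-ttl-minutes \
  --type=merge -p '{"value":"4320"}'
```

Run these against the Rancher management cluster itself, not a downstream cluster; the settings are global.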
Fleet and GitOps Signing
Fleet, Rancher's GitOps-native multi-cluster deployment system, has been the subject of several of the CVEs above for good reason: a GitOps pipeline that deploys to every cluster in your estate is an obvious high-value target.
Fleet 0.8 and later support Git commit signature verification, and enabling it is the single most impactful Fleet hardening step. Without signature verification, anyone who can push to the Fleet-watched repository can deploy anything to any cluster. With it, the attack path requires compromising a signing key, which is a meaningfully higher bar.
Beyond commit signing, restrict which repositories Fleet is allowed to pull from. The Fleet GitRepo resource specifies the repository URL, and a policy that enforces "Fleet only pulls from our own Git organisation" prevents GitRepos from being pointed, accidentally or maliciously, at external repositories.
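One way to enforce that restriction is an admission policy. The sketch below assumes Kyverno is installed on the management cluster, and the repository URL prefix is a placeholder for your own Git organisation:

```yaml
# Reject any Fleet GitRepo whose repo URL is outside the allowed organisation.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-fleet-gitrepos
spec:
  validationFailureAction: Enforce
  rules:
    - name: allowed-git-sources
      match:
        any:
          - resources:
              kinds:
                - GitRepo
      validate:
        message: "Fleet GitRepos must point at the example-org Git organisation."
        pattern:
          spec:
            repo: "https://git.example.com/example-org/*"   # placeholder prefix
```

An OPA Gatekeeper constraint or a ValidatingAdmissionPolicy would express the same rule; the point is that the restriction is enforced at admission time rather than by convention.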
Audit Logging and the SIEM Path
Rancher produces audit logs for the management plane. Downstream RKE2 clusters produce Kubernetes audit logs for their own API servers. Both should flow to a SIEM.
The Rancher management plane audit logs are the more unusual and more valuable data. They show which user created a cluster, who granted which permissions, when a kubeconfig was generated, which Fleet repository was added. This is the data that reconstructs timelines when something goes wrong.
Rancher's audit log configuration is through the CATTLE_AUDIT_LEVEL setting, with levels 0-3. Level 3 is verbose enough to reconstruct management plane activity and moderate enough that log volume is manageable. Level 0 is the default and is not useful.
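When Rancher is installed via its Helm chart, the audit level is set through the chart's auditLog values rather than by exporting the environment variable yourself. A sketch, assuming the standard rancher-stable chart install in cattle-system:

```shell
# Raise the management plane audit level and stream logs via a sidecar
# so a node-level log agent can ship them to the SIEM.
helm upgrade rancher rancher-stable/rancher \
  --namespace cattle-system \
  --reuse-values \
  --set auditLog.level=3 \
  --set auditLog.destination=sidecar
```

The sidecar destination writes audit entries to the sidecar container's stdout, which most log collectors already scrape; the alternative hostPath destination needs its own shipping arrangement.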
How Safeguard Helps
Safeguard integrates with Rancher to inventory every downstream cluster and their workloads, tracks Rancher and RKE2 versions against known CVEs (including the 2023 authorization and Fleet issues), and flags clusters running versions with unpatched advisories. We audit Rancher RBAC configurations for the patterns that have historically led to privilege escalation — users with Manage-Cluster-Members in production-tier projects, Fleet repositories without signature verification, kubeconfig TTLs set to values that exceed policy. For teams running heterogeneous Rancher estates, we normalize findings across RKE1, RKE2, and imported clusters so you can see the risk picture in one view rather than one per cluster type.