NetworkPolicy resources have been GA in Kubernetes since 1.7, and the conversation around them has been stuck in zero-trust talking points for about that long. Block east-west traffic by default. Allow only what you need. Pretend the cluster network is hostile, because it probably is.
All of that is correct. It also misses the scenario that network policies are surprisingly good at handling: your running workload turns hostile. Not because an attacker got in, but because you trusted a dependency that you should not have, and the dependency is now exfiltrating data or calling home.
The supply chain attacks of the last two years — xz-utils in March 2024, 3CX in 2023, Ledger Connect Kit in December 2023, the Polyfill.io compromise in June 2024 — all share a trait. The malicious code executes from inside a trusted process, in a trusted pod, doing something the pod's network position should not allow. Network policies, done right, catch exactly this.
The Shape of a Supply Chain Compromise
Consider what a backdoored dependency typically needs to do after it gains execution. It might beacon to a command-and-control server. It might exfiltrate credentials or data. It might pull down a second-stage payload. It might pivot to an internal service it was not supposed to talk to.
All of these are network operations. And in the normal course of running a web service, the pod almost never does any of these things. A typical API service talks to its database, talks to its cache, talks to a handful of internal services, and maybe calls a small number of external APIs. That is the complete network surface.
A default-allow NetworkPolicy posture means the compromised dependency has free rein on the cluster network. It can scan internal services, hit the cloud metadata endpoint, call any external address. A strict policy — allow database, allow cache, allow specific external APIs by DNS name, deny everything else — constrains the compromise to exactly the actions the legitimate workload would have taken.
The 3CX attack is the textbook case. The malicious code in the trojanized ffmpeg library (ffmpeg.dll on Windows) beaconed to specific C2 domains and downloaded later stages. A corporate endpoint with DNS filtering might have caught it. A Kubernetes workload with a tight egress NetworkPolicy would have blocked the C2 traffic entirely.
Default-Deny Is the Starting Point, Not the End
The first step in any NetworkPolicy program is a default-deny baseline. This is well-understood:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
```
Apply this to every namespace. Nothing can talk to anything. Then add explicit allow rules for the traffic that should exist.
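With egress denied by default, the first thing that breaks is usually DNS resolution. A minimal allow rule for cluster DNS might look like the sketch below; it assumes CoreDNS runs with the standard k8s-app: kube-dns label in kube-system, and a cluster on 1.21 or later, where namespaces automatically carry the kubernetes.io/metadata.name label:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector: {}        # every pod in the namespace
  policyTypes:
  - Egress
  egress:
  - to:
    # namespaceSelector AND podSelector in one entry:
    # only kube-dns pods in kube-system
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP
```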
Most teams stop here and feel they have done the work. They have not. The supply chain angle requires going further in two specific directions: egress filtering to external destinations, and restrictions on the cloud metadata endpoint.
Egress Is Where Supply Chain Defense Lives
Most NetworkPolicy documentation focuses on east-west traffic — pod to pod, namespace to namespace. That is important. But the supply chain threat model makes egress traffic at least as important.
Your API pod talks to Stripe, say, and to your corporate OAuth provider, and to PyPI for an occasional runtime package pull. That is three external destinations. A compromised dependency wanting to exfiltrate data is going to try to reach a C2 server, which is none of those three.
Kubernetes NetworkPolicy historically could not express "allow traffic to api.stripe.com" because the resource operates on IP addresses and CIDRs, not DNS names. This has been the operational pain point that kept most teams from taking egress filtering seriously.
Cilium's CiliumNetworkPolicy extended the model to include DNS-based rules via toFQDNs, and Calico offers equivalent DNS policy support in its commercial tiers. As of 2024, upstream Kubernetes NetworkPolicy still does not support FQDNs directly, but the tooling exists in the major CNIs. If you are running a recent Cilium (1.14 or later) or Calico Enterprise, you have what you need. Use it.
A tight egress policy for a typical API pod looks like this: allow DNS to CoreDNS, allow specific FQDNs for external APIs the service actually uses, allow internal services by pod selector, deny everything else. The list of allowed FQDNs is usually five to fifteen entries for a realistic service. Dependencies that try to reach arbitrary external hosts will be blocked.
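On Cilium, that shape translates into a CiliumNetworkPolicy. The sketch below is illustrative, not a drop-in: the payments namespace, the app: api and app: postgres labels, and the Stripe-only external surface are assumptions, and the DNS rule is required because toFQDNs resolution depends on Cilium's DNS proxy seeing the lookups:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-egress
  namespace: payments            # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app: api                   # hypothetical workload label
  egress:
  # DNS must go through Cilium's DNS proxy for toFQDNs to work
  - toEndpoints:
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"
  # External APIs the service actually uses, by name
  - toFQDNs:
    - matchName: api.stripe.com
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
  # Internal dependencies by label; everything else is denied
  - toEndpoints:
    - matchLabels:
        app: postgres
```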
The Cloud Metadata Endpoint Is the Special Case
The cloud metadata endpoint at 169.254.169.254 deserves its own policy. It is reachable from every pod by default, returns credentials for the node's IAM role (and, where IMDSv2 is not enforced, hands them out without even a session token), and is one of the most common lateral movement paths in Kubernetes compromises.
Block it at the NetworkPolicy level. Every pod should be denied egress to 169.254.169.254/32 unless it has a specific, audited reason to talk to the metadata service. AWS IAM Roles for Service Accounts (IRSA) makes this particularly important, because once IRSA is set up, no legitimate workload should need the node-level metadata endpoint at all.
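With upstream NetworkPolicy, the block is expressed as an ipBlock carve-out. One caveat: NetworkPolicy allows are additive, so this pattern belongs in namespaces where broad egress is deliberately permitted; in a default-deny namespace the metadata IP is already unreachable. A sketch:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-no-metadata
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32   # cloud metadata endpoint stays blocked
```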
The Capital One breach in 2019, the 2018 Tesla Kubernetes incident, and several smaller cloud compromises all involved attackers obtaining cloud credentials that compromised workloads could reach. Blocking the metadata endpoint is the single most impactful network policy most teams can write.
Namespace Boundaries and Labels
Pod-to-pod policies should operate on namespace labels rather than pod labels where possible. Namespace labels are managed by platform teams and are harder to forge than pod labels, which application teams control directly.
For a workload like "api pods in the payments namespace can talk to postgres pods in the payments-db namespace," the policy selectors should key on namespace labels team: payments and tier: database rather than pod labels. This defends against an application team accidentally — or maliciously — relabeling a pod to gain access it should not have.
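Assuming the namespace labels from that example (team: payments on the app namespace, with the database living in payments-db), the ingress side of the rule might look like the following; the app: postgres pod label and port 5432 are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-from-payments
  namespace: payments-db
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Keyed on a platform-managed namespace label, not a pod label
    - namespaceSelector:
        matchLabels:
          team: payments
    ports:
    - port: 5432
      protocol: TCP
```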
In 2023 and 2024, several CNIs added support for cluster-wide policies that can assert constraints across namespaces. Cilium's CiliumClusterwideNetworkPolicy and Calico's GlobalNetworkPolicy are the main examples. These are useful for enforcing baseline rules — the metadata endpoint block, for instance — that should apply regardless of what any namespace-level policy says.
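As a sketch of the baseline-rule idea, a Cilium cluster-wide deny for the metadata endpoint could look like this (it assumes Cilium 1.9 or later, where deny rules exist and take precedence over any namespace-level allow):

```yaml
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: deny-cloud-metadata
spec:
  endpointSelector: {}           # all pods, all namespaces
  egressDeny:
  - toCIDR:
    - 169.254.169.254/32         # wins over namespace-level allows
```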
Observability Matters
Writing policies without observability is guessing. Every serious CNI now ships flow logs — Cilium Hubble, Calico flow logs, Antrea Flow Exporter — that let you see what traffic was allowed or denied.
When you deploy a default-deny policy to a namespace for the first time, you will find workloads that depended on traffic you did not know about. Flow logs show you the actual patterns. Pair them with a dry-run or audit mode during rollout so you see denies without enforcing them, fix the gaps, and then flip to enforcement.
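Cilium, for instance, exposes an audit knob in its agent configuration; with it set, policy verdicts show up in flow logs as AUDIT instead of being enforced. A sketch, assuming the standard cilium-config ConfigMap in kube-system:

```yaml
# kube-system/cilium-config (excerpt)
data:
  policy-audit-mode: "true"   # log would-be denies without enforcing them
```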
After enforcement, flow logs become your supply chain early warning. A pod suddenly denying egress to a new external host is exactly what a compromise looks like. Ship those denies to your SIEM and alert on unusual patterns.
The CVE-2024-3154 Context
Runtime CVEs remind us why the belt-and-suspenders approach matters. CVE-2024-3154 in CRI-O, CVE-2024-21626 in runc — both allowed container breakouts under specific conditions. A breakout attacker lands on the node, not just in the pod, and their network reach is whatever the node's network interface allows.
Node-level egress filtering is harder than pod-level, but it is where the defense against breakout-plus-exfiltration lives. If your node security groups or equivalent cloud network controls allow nodes to talk to arbitrary internet hosts, a container breakout is a direct path to data exfiltration. Tighten the node network perimeter as well as the pod network policies.
How Safeguard Helps
Safeguard scans cluster NetworkPolicy configurations against supply chain threat models, not just zero-trust checklists. We flag namespaces without default-deny policies, workloads with egress to the cloud metadata endpoint, and pods that can reach arbitrary external hosts from a default CNI configuration. Our dependency risk scoring correlates with network exposure: a pod running a package with recent supply chain advisories — think event-stream, xz-utils, or the Polyfill.io chain — gets flagged more urgently if its NetworkPolicy would let a compromise beacon home. We generate candidate egress rules based on observed traffic patterns so hardening a default-allow workload does not have to start from a blank page. For teams on Cilium or Calico, we surface which FQDN-based rules would be needed to model the actual traffic a service produces.