
Container Escape Techniques in 2023: What's Changed and What Hasn't

Container escapes remain a real threat in multi-tenant environments. A look at the latest techniques, CVEs, and defenses as container security matures in 2023.

Bob
DevSecOps Engineer

Container escapes—where an attacker breaks out of a container's isolation to access the host system or other containers—remain one of the most serious threats in cloud-native environments. In 2023, several new escape techniques and vulnerabilities emerged, while the defensive landscape also evolved. Here's where things stand.

Why Container Escapes Matter

Containers provide process-level isolation, not hardware-level isolation. They share the host kernel, and the boundary between container and host is enforced by Linux kernel features: namespaces, cgroups, seccomp, and capabilities. An attacker who can subvert any one of these mechanisms, or exploit a bug in the shared kernel itself, may be able to escape the container.

In multi-tenant environments—public clouds, shared Kubernetes clusters, CI/CD build systems—a container escape can mean:

  • Accessing other tenants' data and workloads
  • Compromising the host system and all containers on it
  • Pivoting to the broader network
  • Accessing Kubernetes API credentials and cluster-wide resources

2023 CVEs and Techniques

runc CVE-2024-21626 (Researched in 2023, Disclosed January 2024)

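Although the CVE was assigned and publicly disclosed in early 2024, research into the underlying runc vulnerability began in 2023. runc leaked an internal file descriptor that referenced the host filesystem into the container process during container creation. By pointing the container's working directory at that descriptor through /proc/self/fd, an attacker could gain access to the host filesystem. The issue was fixed in runc 1.1.12. Because runc is the default low-level container runtime for Docker and most Kubernetes deployments, this was a high-impact vulnerability.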
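A quick way to triage hosts for this issue is to compare the reported runc version against the fixed release. The following is a minimal sketch, assuming upstream version numbering and that 1.1.12 is the first fixed release; the function names are made up for illustration, and distro packages that backport the fix under a different version string will not be classified correctly.

```python
import re

# Assumption: 1.1.12 is the first upstream runc release with the fix for
# CVE-2024-21626. Distro builds may backport the fix under other versions.
FIXED_RUNC = (1, 1, 12)


def parse_runc_version(text):
    """Extract (major, minor, patch) from text such as 'runc version 1.1.11'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def needs_runc_upgrade(version_text):
    """Return True when the reported version predates the fixed release."""
    return parse_runc_version(version_text) < FIXED_RUNC


for sample in ("runc version 1.1.11", "runc version 1.1.12"):
    verdict = "upgrade" if needs_runc_upgrade(sample) else "ok"
    print(f"{sample}: {verdict}")
```

In practice the input would be the first line of `runc --version` collected from each node. Treat the result as a prompt to check the vendor advisory, not as proof either way.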

Kernel Vulnerabilities

Several Linux kernel vulnerabilities in 2023 had container escape implications:

CVE-2023-0386: An OverlayFS vulnerability that allowed a local user to gain elevated privileges. In container contexts, this could be leveraged to escape from containers using OverlayFS (the default storage driver for Docker).

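CVE-2023-32233: A use-after-free in Netfilter (nf_tables) that allowed local privilege escalation. Exploitation requires CAP_NET_ADMIN, which a container may hold directly or obtain inside a new user namespace where unprivileged user namespaces are enabled, so the bug could be used for escape from within a container.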

CVE-2023-2598: An io_uring vulnerability allowing out-of-bounds memory access. The io_uring subsystem has been a frequent source of container-relevant privilege escalation bugs.

Kubernetes-Specific Escapes

Beyond kernel vulnerabilities, Kubernetes misconfigurations provide escape paths that don't require any CVE:

Privileged containers. Running containers with Docker's --privileged flag or Kubernetes' securityContext.privileged: true disables most container isolation. It is effectively the same as giving the container root access to the host.

Host PID/network namespace. Containers configured with hostPID: true or hostNetwork: true share namespaces with the host, providing direct access to host processes and network interfaces.

Sensitive mount points. Mounting the Docker socket (/var/run/docker.sock), the host filesystem, or Kubernetes service account tokens into containers provides escape paths. These paths work as designed rather than through any bug, but exposing them to untrusted workloads is still a misconfiguration.

Node-level access through Kubernetes API. A compromised pod with a service account that has excessive RBAC permissions can use the Kubernetes API to schedule privileged pods, access secrets, or modify workloads across the cluster.
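The misconfigurations above are all visible in the pod spec, so they can be checked mechanically. Here is a minimal sketch, assuming the spec has already been parsed into a plain dict (for example from `kubectl get pod -o json`). The function name and the list of risky host paths are illustrative assumptions, not an exhaustive policy, and RBAC permissions are out of scope for this check.

```python
# Illustrative, not exhaustive: host paths that commonly enable an escape.
RISKY_HOST_PATHS = {"/", "/var/run/docker.sock"}


def find_escape_risks(pod_spec):
    """Return human-readable findings for a pod spec given as a dict."""
    findings = []
    if pod_spec.get("hostPID"):
        findings.append("hostPID shares the host process namespace")
    if pod_spec.get("hostNetwork"):
        findings.append("hostNetwork shares the host network namespace")

    for volume in pod_spec.get("volumes", []):
        path = (volume.get("hostPath") or {}).get("path")
        if path in RISKY_HOST_PATHS:
            findings.append(f"hostPath volume mounts {path}")

    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    for container in containers:
        context = container.get("securityContext") or {}
        if context.get("privileged"):
            findings.append(f"container {container.get('name')} is privileged")
    return findings


risky_spec = {
    "hostPID": True,
    "volumes": [{"name": "sock", "hostPath": {"path": "/var/run/docker.sock"}}],
    "containers": [{"name": "build", "securityContext": {"privileged": True}}],
}
for finding in find_escape_risks(risky_spec):
    print(finding)
```

A check like this is best run both in CI against manifests and against live cluster state, since the two can drift apart.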

The Defense Stack

Container escape defense is layered. No single control is sufficient.

Kernel Hardening

Keep kernels and runtimes updated. Many container escape CVEs are kernel vulnerabilities, and most of the rest sit in the container runtime. Keeping host kernels and runtimes such as runc patched is one of the most impactful defenses.

Seccomp profiles. Seccomp (Secure Computing Mode) restricts which system calls a container can make. The default Docker seccomp profile blocks approximately 44 system calls, including many used in escape techniques. Custom profiles can be more restrictive.

AppArmor/SELinux. Mandatory access control systems provide additional restrictions beyond namespace isolation. AppArmor is default on Ubuntu-based systems; SELinux on RHEL-based systems.

io_uring restrictions. Given the frequency of io_uring vulnerabilities, many security-focused deployments disable io_uring entirely via seccomp or kernel parameters. GKE and some other managed Kubernetes services have reportedly restricted io_uring system calls; confirm the current default for your platform and version rather than assuming it.
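One way to block io_uring with seccomp is to remove its system calls from an allowlist-style profile. The sketch below assumes a profile in the Docker/OCI JSON layout whose defaultAction denies anything not explicitly allowed. The `strip_io_uring` helper and the tiny sample profile are illustrative; a real profile would be loaded from your runtime's default JSON file.

```python
import copy
import json

# The three io_uring system calls as named in the Linux syscall table.
IO_URING_SYSCALLS = {"io_uring_setup", "io_uring_enter", "io_uring_register"}


def strip_io_uring(profile):
    """Return a copy of an allowlist-style profile with io_uring calls removed."""
    hardened = copy.deepcopy(profile)
    kept_rules = []
    for rule in hardened.get("syscalls", []):
        if rule.get("action") == "SCMP_ACT_ALLOW" and "names" in rule:
            rule["names"] = [n for n in rule["names"] if n not in IO_URING_SYSCALLS]
            if not rule["names"]:
                continue  # the rule allowed nothing else, so drop it
        kept_rules.append(rule)
    hardened["syscalls"] = kept_rules
    return hardened


# Tiny stand-in profile for demonstration only.
sample_profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {"names": ["read", "write", "io_uring_setup"], "action": "SCMP_ACT_ALLOW"},
        {"names": ["io_uring_enter", "io_uring_register"], "action": "SCMP_ACT_ALLOW"},
    ],
}
print(json.dumps(strip_io_uring(sample_profile), indent=2))
```

Because the default action denies unlisted calls, removing the names is enough here. A profile that allows by default would need an explicit deny rule instead.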

Container Configuration

Don't run as root. Containers should run as non-root users. This is enforced by Kubernetes Pod Security Standards at the "restricted" level.

Drop capabilities. Linux capabilities provide fine-grained privilege control. Containers should start with all capabilities dropped and add only what's needed. Never grant CAP_SYS_ADMIN, CAP_NET_ADMIN, or CAP_SYS_PTRACE unless absolutely necessary.
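To see which capabilities a running container actually holds, you can decode the CapEff bitmask that Linux exposes in /proc/<pid>/status. This is a minimal sketch that only flags the three capabilities named above. The bit numbers come from the kernel's capability definitions, and the function names are illustrative.

```python
# Bit positions from the Linux capability definitions.
RISKY_CAPS = {12: "CAP_NET_ADMIN", 19: "CAP_SYS_PTRACE", 21: "CAP_SYS_ADMIN"}


def risky_caps_from_mask(cap_eff_hex):
    """Return the risky capability names set in a hex CapEff mask."""
    mask = int(cap_eff_hex, 16)
    return [name for bit, name in sorted(RISKY_CAPS.items()) if mask & (1 << bit)]


def read_cap_eff(status_path="/proc/self/status"):
    """Read the CapEff field for a process (Linux only)."""
    with open(status_path) as handle:
        for line in handle:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError(f"CapEff not found in {status_path}")


# Example mask with bits 12 and 21 set.
print(risky_caps_from_mask("0000000000201000"))
```

Running `risky_caps_from_mask(read_cap_eff())` inside a container gives a quick view of whether the capability drop took effect.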

Read-only root filesystem. Setting the container's root filesystem to read-only prevents many exploitation techniques that require writing to the filesystem.

Resource limits. Cgroup resource limits prevent denial-of-service from within containers and limit the impact of certain exploitation techniques.
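Taken together, the four settings above map onto a handful of fields in a Kubernetes container spec. The sketch below builds such a spec as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. The helper name, image reference, and limit values are placeholders chosen for illustration, not recommendations.

```python
import json


def hardened_container(name, image, cpu="500m", memory="256Mi"):
    """Build a container spec with the hardening settings discussed above."""
    return {
        "name": name,
        "image": image,
        "securityContext": {
            "runAsNonRoot": True,
            "allowPrivilegeEscalation": False,
            "readOnlyRootFilesystem": True,
            "capabilities": {"drop": ["ALL"]},
            "seccompProfile": {"type": "RuntimeDefault"},
        },
        "resources": {"limits": {"cpu": cpu, "memory": memory}},
    }


# Hypothetical image reference, for demonstration only.
spec = hardened_container("app", "registry.example/app:1.0")
print(json.dumps(spec, indent=2))
```

Applications that need a writable scratch directory under a read-only root filesystem usually get one through an emptyDir volume mounted at the path they write to.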

Runtime Security

Runtime detection tools. Tools like Falco, Tracee, and Tetragon monitor container behavior in real time, alerting on suspicious activities like unexpected process execution, file access to sensitive paths, or network connections to unusual destinations.
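Conceptually, these tools match a stream of kernel-level events against rules. The toy sketch below shows the idea only. The event shape, field names, and rule contents are invented for illustration and do not correspond to the rule syntax or event schema of Falco, Tracee, or Tetragon.

```python
# Hypothetical rule inputs; real tools ship far richer rule sets.
SENSITIVE_PREFIXES = ("/etc/shadow", "/var/run/docker.sock", "/proc/sysrq-trigger")
SHELL_BINARIES = {"sh", "bash", "dash"}


def classify_event(event):
    """Return alert messages for one event dict (hypothetical schema)."""
    alerts = []
    kind = event.get("type")
    if kind == "exec" and event.get("binary") in SHELL_BINARIES:
        alerts.append("shell spawned in container")
    if kind == "open" and str(event.get("path", "")).startswith(SENSITIVE_PREFIXES):
        alerts.append("sensitive path opened")
    return alerts


events = [
    {"type": "exec", "binary": "bash", "container": "web"},
    {"type": "open", "path": "/var/run/docker.sock", "container": "web"},
    {"type": "open", "path": "/app/config.yaml", "container": "web"},
]
for event in events:
    for alert in classify_event(event):
        print(f"{event['container']}: {alert}")
```

The real value of the production tools lies in collecting these events reliably from the kernel, typically with eBPF, which this sketch does not attempt.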

gVisor and Kata Containers. For high-security environments, sandbox runtimes provide stronger isolation than standard Linux containers. gVisor interposes a user-space kernel between the container and the host kernel. Kata Containers runs each container in a lightweight VM.
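In Kubernetes, a sandbox runtime is usually selected per pod through a RuntimeClass. The sketch below builds both objects as dicts. It assumes the node's container runtime has a handler named "runsc" configured for gVisor, which is a common convention but depends on how the nodes were set up. The object names and image are hypothetical.

```python
import json


def runtime_class(name, handler):
    """Build a RuntimeClass that maps a name to a node-level runtime handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def sandboxed_pod(pod_name, image, runtime_class_name):
    """Build a pod that asks to run under the given RuntimeClass."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": "app", "image": image}],
        },
    }


# Assumption: nodes expose a handler called "runsc" for gVisor.
gvisor_class = runtime_class("gvisor", "runsc")
pod = sandboxed_pod("untrusted-job", "registry.example/job:1.0", "gvisor")
print(json.dumps([gvisor_class, pod], indent=2))
```

Pods that reference a RuntimeClass whose handler is missing on the node will fail to start, so it is worth rolling this out to a test namespace first.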

Policy Enforcement

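Pod Security Standards. Kubernetes' built-in Pod Security Standards define three levels (privileged, baseline, and restricted), and the built-in Pod Security Admission controller applies them per namespace. The baseline and restricted levels enforce minimum security configurations, and the "restricted" profile prevents most misconfiguration-based escapes.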
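Pod Security Standards are applied per namespace through labels read by the built-in Pod Security Admission controller. Here is a minimal sketch that generates a labeled Namespace manifest. The helper name and the namespace name are illustrative, and setting all three modes to the same level is just one reasonable choice.

```python
import json

VALID_LEVELS = ("privileged", "baseline", "restricted")


def namespace_with_pss(name, level="restricted"):
    """Build a Namespace manifest labeled for Pod Security Admission."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown Pod Security Standards level: {level}")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                "pod-security.kubernetes.io/enforce": level,
                "pod-security.kubernetes.io/audit": level,
                "pod-security.kubernetes.io/warn": level,
            },
        },
    }


# Hypothetical namespace name, for demonstration only.
manifest = namespace_with_pss("tenant-a")
print(json.dumps(manifest, indent=2))
```

A gentler rollout sets only the warn and audit labels to "restricted" at first, then switches enforce once existing workloads pass.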

Policy engines. Tools like OPA Gatekeeper, Kyverno, and Kubewarden enforce custom policies that prevent dangerous container configurations from being deployed.

The Gap: Known vs. Unknown Escapes

The uncomfortable truth is that container isolation depends on the security of the Linux kernel, and the Linux kernel is a massive, complex codebase with a continuous stream of vulnerabilities. Patching known CVEs is necessary but not sufficient—there will always be unknown kernel vulnerabilities that could enable container escapes.

This is why defense in depth matters. Any single control can be bypassed. The combination of kernel hardening, container configuration, runtime detection, and policy enforcement creates multiple barriers that an attacker must overcome.

For workloads handling the most sensitive data, hardware-level isolation (VMs, bare-metal) remains the strongest guarantee. Containers are excellent for many use cases, but they're not the right isolation boundary for everything.

How Safeguard.sh Helps

Safeguard.sh monitors container security configurations across your Kubernetes clusters and container deployments, identifying misconfigurations that enable escape techniques—privileged containers, dangerous mount points, excessive capabilities, and missing security contexts. Our platform tracks kernel CVEs against your host infrastructure, alerts you when container runtime vulnerabilities are disclosed, and provides policy recommendations to harden your container security posture. By correlating container configurations with vulnerability data, Safeguard.sh helps you identify where your container isolation is weakest.
