Running Containers in Rootless Mode: A Practical Security Guide

Root in the container often means root on the host. Rootless mode breaks that assumption. Here is how to run Docker and Podman without root and why it matters more than you think.

Shadab Khan
DevSecOps Engineer

Running containers as root is the default. It is also what makes container escapes so damaging. When a process inside a container runs as UID 0, and the container runtime also runs as root on the host, an exploit that breaks out of the container lands as root on the host. Rootless mode removes that outcome: an escaped process holds only the privileges of an unprivileged user.

Yet most production environments still run containers as root. Not because they need to. Because nobody changed the default.

What Rootless Mode Actually Changes

In a traditional Docker setup, the Docker daemon (dockerd) runs as root. Every container it spawns inherits access to the host's root-owned resources through the daemon. Even if the container process runs as a non-root user internally, the daemon itself, and by extension the container runtime, operates with full host privileges.

Rootless mode changes the ownership model:

  • The container runtime runs as an unprivileged user
  • User namespaces map UID 0 inside the container to a non-root UID on the host
  • The container process cannot access host resources that the unprivileged user cannot access
  • Even a complete container escape lands you in an unprivileged user context

This is defense in depth at the infrastructure level. Not a policy. Not a configuration. A fundamental architectural change.

Setting Up Rootless Docker

Docker introduced rootless mode as an experimental feature in version 19.03 and declared it generally available in 20.10. The setup is straightforward but requires a few prerequisites.

Prerequisites

# Install uidmap (provides newuidmap and newgidmap)
sudo apt-get install -y uidmap

# Verify subordinate UID/GID ranges exist for your user
# (anchoring the pattern avoids matching other usernames that contain yours)
grep "^$(whoami):" /etc/subuid
grep "^$(whoami):" /etc/subgid
# Should show something like: youruser:100000:65536

If no entries exist in /etc/subuid and /etc/subgid, add them:

sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 $(whoami)
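The prerequisite check above can be scripted. The sketch below assumes the usual name:start:count line format of /etc/subuid and /etc/subgid and a minimum of 65536 subordinate IDs; subid_count and has_enough_subids are hypothetical helper names, not part of any package.

```shell
# Sum the subordinate ID counts for one user in a subuid/subgid-style file.
# Lines look like "name:start:count"; a user may have more than one line.
subid_count() {
  local user="$1" file="$2"
  awk -F: -v u="$user" '$1 == u { total += $3 } END { print total + 0 }' "$file"
}

# Succeed if the user has at least the required number of IDs (default 65536).
has_enough_subids() {
  local user="$1" file="$2" need="${3:-65536}"
  [ "$(subid_count "$user" "$file")" -ge "$need" ]
}

# Check the current user against both files, skipping files that are absent.
for f in /etc/subuid /etc/subgid; do
  if [ -r "$f" ]; then
    if has_enough_subids "$(id -un)" "$f"; then
      echo "$f: OK"
    else
      echo "$f: fewer than 65536 IDs for $(id -un)"
    fi
  fi
done
```

If either file reports too few IDs, the usermod command above adds a range.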

Installation

# Run the rootless setup script as your regular user (not with sudo)
dockerd-rootless-setuptool.sh install

# Set environment variables (add to .bashrc)
export PATH=$HOME/bin:$PATH
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock

Verification

# Check the daemon is running rootless
docker info --format '{{.SecurityOptions}}'
# The list should include: name=rootless

# Verify UID mapping
docker run --rm alpine id
# uid=0(root) gid=0(root) — inside container
# But mapped to your unprivileged UID on the host
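For CI or provisioning scripts, the same check can be turned into a pass/fail test. This is a minimal sketch that assumes the daemon reports its security options as a bracketed, space-separated list when queried with docker info --format '{{.SecurityOptions}}'; has_rootless_option is a hypothetical helper name.

```shell
# Succeed if a SecurityOptions string such as
# "[name=seccomp,profile=builtin name=rootless name=cgroupns]"
# contains the rootless entry.
has_rootless_option() {
  case " $(printf '%s' "$1" | tr -d '[]') " in
    *" name=rootless "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Query the daemon only if the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  opts=$(docker info --format '{{.SecurityOptions}}' 2>/dev/null || true)
  if has_rootless_option "$opts"; then
    echo "daemon is rootless"
  else
    echo "daemon is rootful or unreachable"
  fi
fi
```

Matching the whole token, rather than grepping for "root", avoids false positives from lines such as Docker Root Dir.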

Setting Up Rootless Podman

Podman was designed rootless from the start. It does not require a daemon, which eliminates an entire attack surface.

# Podman runs rootless by default for non-root users
podman run --rm alpine id
# uid=0(root) gid=0(root) — inside container

# Verify the mapping
podman unshare cat /proc/self/uid_map
# 0     1000        1    (UID 0 in container = UID 1000 on host)
# 1   100000    65536    (UIDs 1-65536 mapped to subordinate range)

No setup script needed. No daemon to configure. This is a large part of why security-conscious teams often choose Podman.
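To see how a given container UID lands on the host, the mapping can be applied by hand. The sketch below assumes uid_map-style input (inside start, outside start, length) like the output shown above; map_uid is a hypothetical helper name.

```shell
# Translate a container UID to a host UID using uid_map-style lines:
#   <inside-start> <outside-start> <length>
# Prints the host UID, or exits non-zero if the UID is not mapped.
map_uid() {
  local uid="$1" map_file="$2"
  awk -v uid="$uid" '
    uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
    END { if (!found) exit 1 }
  ' "$map_file"
}

# Example using the two-line mapping shown above.
map=$(mktemp)
printf '0 1000 1\n1 100000 65536\n' > "$map"
map_uid 0 "$map"      # container root    -> host UID 1000
map_uid 1000 "$map"   # container UID 1000 -> host UID 100999
rm -f "$map"
```

On a real host, the same function can read a file captured with podman unshare cat /proc/self/uid_map.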

What Breaks in Rootless Mode

Rootless mode is not free. Several common patterns require changes.

Binding to Privileged Ports

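By default, binding to ports below 1024 requires root or the CAP_NET_BIND_SERVICE capability. In rootless mode, you cannot publish port 80 or 443 directly:

# This fails in rootless mode
docker run -p 80:80 nginx
# Fails with a permission error (exact wording varies by runtime and version)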

# Solution: Use unprivileged ports with a reverse proxy
docker run -p 8080:80 nginx

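Alternatively, lower the kernel's unprivileged port threshold. This applies to every user on the host and lasts until reboot unless you persist it under /etc/sysctl.d: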

sudo sysctl net.ipv4.ip_unprivileged_port_start=80
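Before choosing between a proxy and the sysctl change, it can help to check which ports the current threshold already allows. This sketch reads the threshold from /proc and falls back to 1024 if the file is missing; can_bind_unprivileged is a hypothetical helper name.

```shell
# Succeed if an unprivileged process may bind the given port.
# The threshold is read from the kernel unless passed as a second argument.
can_bind_unprivileged() {
  local port="$1" start="${2:-}"
  if [ -z "$start" ]; then
    start=$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024)
  fi
  [ "$port" -ge "$start" ]
}

# Report on a few commonly published ports.
for p in 80 443 8080; do
  if can_bind_unprivileged "$p"; then
    echo "port $p: bindable without privileges"
  else
    echo "port $p: needs a lower threshold or a reverse proxy"
  fi
done
```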

OverlayFS Limitations

On older kernels (before 5.11 in mainline), rootless mode cannot use the native overlay2 storage driver. It falls back to fuse-overlayfs or, failing that, vfs, both of which are slower:

# Check current storage driver
docker info | grep "Storage Driver"

# For best performance on older kernels, install fuse-overlayfs
sudo apt-get install -y fuse-overlayfs

Kernel 5.11+ supports native overlayfs in user namespaces. If you can upgrade your kernel, do it.
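A provisioning script can make the same decision from the kernel release string. The sketch below compares only the major and minor numbers reported by uname -r; kernel_at_least is a hypothetical helper name, and distribution kernels that backport features may behave differently from what the version alone suggests.

```shell
# Succeed if a kernel release string (e.g. "5.15.0-91-generic")
# is at least the given major.minor version.
kernel_at_least() {
  local release="$1" want_major="$2" want_minor="$3" major minor
  major=${release%%.*}          # text before the first dot
  minor=${release#*.}           # text after the first dot
  minor=${minor%%[!0-9]*}       # keep only the leading digits
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if kernel_at_least "$(uname -r)" 5 11; then
  echo "kernel $(uname -r): native rootless overlayfs should be available"
else
  echo "kernel $(uname -r): expect fuse-overlayfs or vfs"
fi
```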

Host Volume Mounts

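File ownership inside rootless containers is translated through the user namespace. Your own host UID maps to UID 0 in the container, so a file you own on the host appears to be owned by root inside the container. Files owned by host UIDs outside the mapping, including host root, appear as nobody:

# Create a file on the host
touch /tmp/testfile
ls -la /tmp/testfile
# -rw-r--r-- 1 youruser youruser ... /tmp/testfile

# Inside the rootless container, the same file appears to be owned by root
docker run --rm -v /tmp/testfile:/tmp/testfile alpine ls -la /tmp/testfile
# -rw-r--r-- 1 root root ... /tmp/testfile

The friction shows up when the container process runs as a non-root user: that UID maps into your subordinate range on the host, so it cannot write to files you own. With Podman, --userns=keep-id maps your host UID to the same UID inside the container. With Docker, adjust ownership of the mounted path or the subordinate UID mappings.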

Cgroup Management

Rootless containers can enforce resource limits (CPU, memory) only on cgroup v2 with systemd delegation. If your host still runs cgroup v1, those limits may not work:

# Check cgroup version
mount | grep cgroup
# cgroup2 on /sys/fs/cgroup type cgroup2: you're good
# tmpfs on /sys/fs/cgroup type tmpfs: cgroup v1, limits may not work

Most modern distributions (Ubuntu 22.04+, Fedora 36+, RHEL 9+) ship with cgroup v2 enabled.
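For scripts, the filesystem type of /sys/fs/cgroup is easier to test than mount output. This sketch assumes GNU-style stat -fc %T, which reports cgroup2fs on a cgroup v2 host and tmpfs on cgroup v1; cgroup_version_from_fstype is a hypothetical helper name.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup version label.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo v2 ;;
    tmpfs)     echo v1 ;;
    *)         echo unknown ;;
  esac
}

# Inspect the local host if the cgroup mount point exists.
if [ -d /sys/fs/cgroup ]; then
  fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo unknown)
  echo "cgroup: $(cgroup_version_from_fstype "$fstype")"
fi
```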

Security Impact Analysis

Container Escapes Become Low-Impact

CVE-2019-5736 (runc container escape) allowed a malicious container to overwrite the host runc binary and gain root access. In rootless mode, root in the container is an unprivileged user on the host, so it typically cannot overwrite a root-owned runc binary at all. Even where an escape succeeds, the attacker lands as that unprivileged user. The blast radius drops from "complete host compromise" to "whatever one unprivileged account can reach."

Kernel Exploits Lose Potency

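Kernel exploits that depend on real root, or on capabilities in the host's initial user namespace, lose potency: root in a rootless container holds capabilities only inside its own user namespace. Rootless mode is not a substitute for patching, though. Dirty Pipe (CVE-2022-0847), which allowed privilege escalation through the kernel's pipe mechanism, was exploitable by unprivileged processes, so running under a non-root UID does not block it. Keep the kernel patched, and treat rootless mode as a limit on what other escapes can reach.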

Docker Socket Exposure Neutralized

Mounting the Docker socket (/var/run/docker.sock) into a container is equivalent to giving root access. In rootless mode, the socket is in the user's runtime directory, and accessing it only grants the user's unprivileged access level.

Rootless in Kubernetes

Running the kubelet and its container runtime rootless is still an alpha-level feature in Kubernetes (as of 1.28), but you can achieve similar isolation:

User Namespaces in Kubernetes

Kubernetes 1.25 introduced alpha support for user namespaces in pods. It must be enabled through a feature gate and requires a container runtime that supports it:

apiVersion: v1
kind: Pod
metadata:
  name: userns-pod
spec:
  hostUsers: false  # Enable user namespace for this pod
  containers:
  - name: app
    image: myapp:latest
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000

Rootless Kubernetes Nodes

Projects like Usernetes run the entire Kubernetes stack (kubelet, CRI runtime, CNI plugins) in rootless mode. This is experimental but represents the direction the ecosystem is heading.

Performance Considerations

Rootless mode has measurable overhead:

  • Network: slirp4netns or pasta can add roughly 10-20% latency compared to host networking, depending on workload. Rootless Docker runs on RootlessKit, normally with slirp4netns as the network driver and the builtin port driver.
  • Storage: fuse-overlayfs is roughly 5-15% slower for I/O-intensive workloads compared to native overlay2. Native overlayfs on kernel 5.11+ eliminates this gap.
  • Build time: Image builds can be 10-30% slower due to namespace overhead.

For most workloads, this overhead is negligible compared to the security improvement. For I/O-heavy databases or high-throughput network services, benchmark before committing.
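When benchmarking, it helps to reduce each pair of measurements to one overhead figure. The sketch below computes percent slowdown from a rootful baseline and a rootless measurement in the same unit; overhead_pct is a hypothetical helper name, and the sample numbers are illustrative, not measured results.

```shell
# Print the percent overhead of a candidate measurement over a baseline.
# Both values must use the same unit (seconds, milliseconds, ...).
overhead_pct() {
  awk -v base="$1" -v cand="$2" 'BEGIN {
    if (base <= 0) exit 1                      # avoid division by zero
    printf "%.1f\n", (cand - base) / base * 100
  }'
}

# Illustrative numbers only: a 12.0 s rootful build vs a 13.8 s rootless build.
overhead_pct 12.0 13.8
```

Run each workload several times in both modes and compare medians, since a single run can easily vary by more than the overhead being measured.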

Migration Strategy

  1. Start with development environments. Run Docker or Podman rootless on developer machines. Find compatibility issues early.
  2. CI/CD pipelines next. Build and test containers in rootless mode. This catches permission issues in Dockerfiles.
  3. Staging environments. Run your full application stack rootless. Performance test under load.
  4. Production. Roll out rootless mode to production nodes. Keep rootful mode available for the few workloads that genuinely need it (CNI plugins, storage drivers).

How Safeguard.sh Helps

Safeguard.sh detects containers running with elevated privileges and flags workloads that could run in rootless mode but are not. The platform analyzes your container configurations, identifies unnecessary root access, and provides specific remediation steps to shift workloads to rootless execution. For teams managing the transition, Safeguard.sh tracks which environments have moved to rootless mode and which still carry the higher-risk rootful configuration, giving leadership clear metrics on security posture improvement.
