Container Security

Chiseled / Distroless Image Rollout Program

What it takes to standardise on chiseled and distroless container images across an engineering organisation: which workloads benefit, which do not, and how to handle the operational quirks of shell-less containers.

Shadab Khan
Security Engineer
7 min read

Distroless images have been around since 2017. Chiseled images, the more recent Ubuntu-flavoured cousin, arrived a few years later. Both promise the same thing: a container image that contains only the application and its direct runtime dependencies, with no shell, no package manager, and no general-purpose Linux utilities. Both deliver dramatic reductions in attack surface, image size, and CVE counts.

The reason this is still a program rather than a default is that the operational model is genuinely different. You cannot exec a shell into a distroless container. You cannot apt install a debugging tool. You cannot run the usual incident-response playbook of poking around the running pod. Engineering teams used to those workflows have to learn new ones, and the rollout has to bring them along.

This post covers the program plan, the workload fit assessment, the build pipeline changes, and the operational quirks that the rollout has to handle.

Workload Fit Assessment

Not every workload benefits equally. The first step of the program is honest classification.

Statically linked binaries, particularly Go, Rust, and similar single-binary services, are a clean fit. The application brings everything it needs. The base image can be functionally empty, often just a CA bundle and a /etc/passwd entry. The savings are dramatic, often a fifty-to-one reduction in image size and an effective elimination of base-layer CVEs.
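
As a sketch, the whole Dockerfile for such a service can be as small as the following, using the multi-stage pattern covered below. The module layout and image tags are illustrative:

    # Build stage: full Go toolchain.
    FROM golang:1.22 AS build
    WORKDIR /src
    COPY . .
    # CGO_ENABLED=0 produces a statically linked binary.
    RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

    # Final stage: distroless static ships CA certificates, tzdata,
    # and a nonroot /etc/passwd entry, and little else.
    FROM gcr.io/distroless/static-debian12:nonroot
    COPY --from=build /out/app /app
    ENTRYPOINT ["/app"]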

JVM workloads are a moderate fit. The JVM expects a libc, a few system libraries, and sometimes a temp directory layout. Distroless Java images exist and work well for production, with the caveat that diagnostic tools like jstack and jmap need to be invoked through ephemeral debug containers rather than directly.
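
A sketch for a JVM service, assuming a Maven build and Google's distroless Java base; the exact tags and jar path will vary:

    FROM maven:3.9-eclipse-temurin-17 AS build
    COPY . /src
    RUN mvn -f /src/pom.xml -q package

    # The distroless Java base carries the JVM, libc, and supporting
    # system libraries, but no shell and no package manager.
    FROM gcr.io/distroless/java17-debian12
    COPY --from=build /src/target/app.jar /app.jar
    ENTRYPOINT ["java", "-jar", "/app.jar"]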

Python and Node workloads are a harder fit. Both ecosystems mix native and pure-language dependencies in ways that make the runtime requirements vary across applications. A Python image that depends on a few well-known libraries can run on a distroless Python base. An image whose dependency graph reaches into image processing, machine learning, or scientific computing libraries often pulls in system dependencies that are not in distroless.
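
For Python, the workable pattern is usually to install dependencies in the build stage and copy them across. A sketch, assuming the dependency set is pure Python or ships prebuilt wheels, and that the builder's interpreter version matches the distroless base's:

    FROM python:3.11-slim AS build
    WORKDIR /app
    COPY requirements.txt .
    # Install dependencies into a directory we can copy wholesale.
    RUN pip install --no-cache-dir --target=/app/deps -r requirements.txt
    COPY . .

    FROM gcr.io/distroless/python3-debian12
    WORKDIR /app
    COPY --from=build /app /app
    ENV PYTHONPATH=/app/deps
    ENTRYPOINT ["python3", "main.py"]

The moment a dependency needs a system library that the base does not carry, this pattern breaks, which is exactly the classification line drawn above.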

Workloads with unusual requirements, such as those that need a specific shell for an init script or that bundle debug tools intentionally, are not good first candidates. They are often good third- or fourth-quarter candidates after other rationalisation work.

Build Pipeline Changes

The build pipeline change is mechanically simple and culturally significant. The Dockerfile FROM line changes from a general-purpose base to a minimal one. The build moves to a multi-stage pattern where the build stage uses a full image with all the tooling needed to compile, and the final stage uses the minimal image and copies only the artifacts.

The cultural significance is that engineers can no longer rely on the runtime image to provide tools they used during build. If your build stage installs ImageMagick to compile a binding, the final stage will not have ImageMagick available unless the binding is statically linked or the final stage explicitly includes it. This catches teams off guard the first time; by the second time, the lesson has been internalised.
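
A contrived sketch of the failure mode, using libjpeg as a stand-in for any native build dependency; the soname and path are Debian-specific and illustrative:

    FROM debian:12 AS build
    RUN apt-get update && apt-get install -y build-essential libjpeg-dev
    COPY . /src
    RUN make -C /src   # links the binary against libjpeg

    FROM gcr.io/distroless/base-debian12
    COPY --from=build /src/app /app
    # Without this line the binary fails at startup with a loader
    # error: the runtime .so must be copied across explicitly.
    COPY --from=build /usr/lib/x86_64-linux-gnu/libjpeg.so.62 \
                      /usr/lib/x86_64-linux-gnu/
    ENTRYPOINT ["/app"]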

For the rollout, we shipped a set of internal multi-stage Dockerfile templates per language. Engineers copying from a template got the multi-stage pattern by default, with the build stage and the final stage clearly delineated. We also shipped a CI lint step that flagged any Dockerfile whose final stage, or only stage, was built on a known-large general-purpose base; a sketch of that check follows.
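
The base-image list and repository layout below are illustrative:

    #!/bin/sh
    # Flag Dockerfiles whose final (or only) stage builds on a full
    # general-purpose base image.
    status=0
    for f in $(git ls-files '*Dockerfile*'); do
      final=$(grep -iE '^[[:space:]]*FROM' "$f" | tail -n 1)
      case "$final" in
        *ubuntu:*|*debian:*|*centos:*|*fedora:*)
          echo "$f: final stage uses a full base: $final" >&2
          status=1 ;;
      esac
    done
    exit $status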

Debugging Without A Shell

The operational change that matters most is debugging. The standard incident playbook of "kubectl exec into the pod and look around" stops working when the pod has no shell.

The replacement playbook has three legs.

Ephemeral debug containers. Kubernetes has supported ephemeral debug containers since the feature went stable in 1.25. The pattern is to attach a fully-featured container to a running pod's namespaces, giving the responder access to the same network, IPC, and PID space as the workload, with their own filesystem that has the diagnostic tools they need. We standardised on a single internal debug image with a curated tool set, and trained on-call engineers to reach for it.
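
In practice this is a single command; the pod name, container name, and internal debug image below are hypothetical:

    # Attach a tooled debug container to a running pod, sharing its
    # process namespace via --target so the workload's processes are
    # visible from the debug shell.
    kubectl debug -it payments-api-7d4b9c6f5-x2kqp \
      --image=registry.internal/sre/debug-tools:latest \
      --target=app -- sh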

Better observability up front. The reason engineers exec into pods is usually to look at logs, files, or metrics that should have been visible elsewhere. The rollout was an opportunity to push for richer structured logs, more granular metrics, and exposed health endpoints. By the second quarter, on-call engineers were exec-ing into containers far less often than before, simply because they did not need to.

Crash dumps and core files. For workloads where post-mortem analysis matters, we pointed the kernel's core pattern at a mounted volume rather than at the container filesystem. Combined with the debug container approach, this covered the diagnostic cases that the missing shell would otherwise have made awkward.
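
One way to wire this up, sketched under the assumption of a hostPath volume shared between workloads and a privileged node agent:

    # kernel.core_pattern is not namespaced, so it is set once per
    # node, for example by a privileged DaemonSet:
    sysctl -w kernel.core_pattern='/var/lib/cores/core.%e.%p.%t'
    # Workload pods then mount the same directory read-write:
    #   volumes: [{name: cores, hostPath: {path: /var/lib/cores}}]
    # Analysis happens from an ephemeral debug container that reads
    # the dumps from that volume, with the debugger living in the
    # debug image rather than the workload image.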

Image Lineage And Refresh

Distroless images have their own update cadence. The publishers, Google for distroless and Canonical for chiseled images, push updates when the underlying components have security fixes. The rollout has to keep up.

The pattern we use is mirror-and-refresh. Every approved minimal base is mirrored into our internal registry, and a refresh job runs daily to pull the latest digest from upstream. When a new digest appears, automated pull requests are opened against application repositories that depend on that base, updating the FROM digest reference. The PR includes the upstream changelog and the diff in CVE coverage.
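
The refresh job itself is small. A sketch using skopeo, with illustrative image names:

    # Mirror the upstream base and detect digest changes.
    src=docker://gcr.io/distroless/static-debian12:nonroot
    dst=docker://registry.internal/base/distroless-static:nonroot
    skopeo copy --all "$src" "$dst"
    new=$(skopeo inspect --format '{{ .Digest }}' "$src")
    # If $new differs from the digest recorded on the previous run,
    # open PRs bumping the FROM ...@sha256 reference in each repo
    # that consumes this base, attaching the upstream changelog.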

This is the same pattern used for general base image management, but the consequences are different. Distroless bases change less often than full distros, but when they change, the diff is meaningful and the upgrade is usually safe. The PR-driven workflow gets us to coverage within hours of an upstream release for active applications, and within a week for less active ones.

The Holdouts

Every rollout has a long tail of workloads that resist migration. The healthy posture is to track them honestly rather than to pretend they have migrated.

Three categories of holdout recur. Vendor images that we cannot rebuild ourselves and that the vendor has not yet shipped a minimal variant of. Legacy applications whose dependency graphs are too tangled to unwind in a quarter. And workloads whose owning team has been disbanded and whose maintenance is bouncing between platforms.

For each category, the program has a different remedy. Vendor images get a procurement track, with the security team's buying power leveraged to push the vendor toward minimal variants. Legacy applications get a sunset track, where the migration is part of a broader rationalisation. Orphaned workloads get an ownership track, where someone has to claim them or they get retired.

Holdouts are reported in the same dashboard as migrated workloads, so that the rollout's coverage number tells the truth rather than a flattering subset of it.

How Safeguard Helps

Safeguard supports the rollout end to end. The image scanner classifies workloads by their base image and produces a coverage report broken down by minimal-base type, so the program leadership has a single view of progress. The remediation engine watches upstream distroless and chiseled releases, mirrors them into the internal registry, and opens automated PRs against application repos with digest updates and CVE diffs. The admission policy engine can enforce minimal-base requirements per namespace, so that production namespaces accept only minimal images while development namespaces remain flexible. The runtime stack supports ephemeral debug container workflows with audit logging, so that on-call work continues without the missing shell becoming a barrier. And the holdout tracking surfaces the long tail in the same dashboard as the migrated workloads, with the procurement and sunset tracks each having their own workflows. The result is a minimal-image program that gets to high coverage and stays there.
