The Architecture of Absolute Minimality: Mastering FROM scratch in Docker

Starting from nothing is a powerful idea in software engineering, and in the container ecosystem it takes the form of the FROM scratch instruction. Most developers layer their applications on top of full distributions such as Ubuntu, Debian, or Alpine; scratch is the ultimate tabula rasa. It is not an image in the traditional sense but a reserved name that tells the Docker build engine to begin with an entirely empty filesystem. The approach is used mainly for building base images and for super-minimal production images that contain little more than a single statically compiled binary and the few files it needs. With no shell, package manager, or unused system libraries, the resulting image is about as small as it can be and exposes a minimal attack surface, provided the developer can overcome the technical hurdles that come with the absence of a standard operating environment.

The Technical Anatomy of the scratch Reserved Image

The scratch image is a unique entity within the Docker ecosystem: a reserved, empty image that serves only as a starting point for building other images. As of Docker 1.5.0 (specifically docker/docker#8827), FROM scratch is a no-op in the Dockerfile and does not create an extra layer. An image that previously would have had two layers therefore has just one.

The primary function of scratch is to signal to the Docker build process that the very next command in the Dockerfile should be the first filesystem layer in the image. This means there is no inherited root filesystem; there are no /bin, /etc, or /usr directories unless the developer explicitly adds them via COPY or ADD instructions.

The administrative constraints of scratch are strict. Although it appears in Docker's repository on the Docker Hub, it is not a pullable entity. A user cannot execute docker pull scratch, nor can they run a container directly from it using docker run scratch. Furthermore, no image can be tagged with the name scratch. It exists solely as a reference point within the Dockerfile to initiate the creation of a new image from a state of zero bytes.

Implementation Strategies for Minimal Containers

To create a functional container starting from scratch, a developer must provide everything the application needs to execute. Because the image is empty, common utilities like wget, curl, or even a basic shell (sh) are nonexistent. This necessitates a strategy where the build environment is separated from the final runtime environment, often achieved through multi-stage builds.

For instance, a basic minimal container can be constructed using the following Dockerfile structure:

```dockerfile
FROM scratch
COPY hello /
CMD ["/hello"]
```

In this scenario, the COPY instruction takes a binary named hello from the build context and places it at the root of the image. The CMD instruction then makes this binary the container's default command. For this to work, however, the hello binary must be completely self-contained. If it relies on any dynamic libraries (such as glibc) or on a language runtime, the container will fail to start because those dependencies are not present in the scratch filesystem.

To build a more complete base image, one can start from the Alpine Linux mini root filesystem. Since scratch provides no tools for downloading files, the archive must first be downloaded onto the host machine:

```bash
wget -O alpine-minirootfs.tgz https://bit.ly/alpine-minirootfs-3-19-1
```

Once the archive is on the host, it can be added to the scratch base image using the ADD instruction in the Dockerfile, effectively layering a functional, albeit minimal, Linux distribution on top of the empty start point.

Critical Pitfalls and Technical Resolutions

The transition to FROM scratch introduces several "invisible" failures. These are errors that do not appear during the build phase but manifest as runtime panics or failures because the application expects certain system files to be present.

The Certificate Authority (CA) Bundle Gap

One of the most common failures occurs when a program attempts to make an HTTPS request. In a standard Linux environment, the system provides a bundle of CA certificates (typically located in /etc/ssl/certs/) that allows the application to verify the SSL/TLS certificates of remote servers. In a scratch container, this folder is missing.

When a Go program compiled with CGO_ENABLED=0 attempts to fetch a URL via HTTPS in a scratch container, it produces the following error:

panic: Get "https://labs.iximiuz.com/": tls: failed to verify certificate: x509: certificate signed by unknown authority

The technical cause is the absence of the root certificates required to establish a chain of trust. The resolution involves copying the CA bundle from a builder stage into the final scratch image.

Corrected implementation:

```dockerfile
# syntax=docker/dockerfile:1

# -=== Builder image ===-
FROM golang:1 AS builder
WORKDIR /app
COPY <<EOF main.go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	resp, err := http.Get("https://labs.iximiuz.com/")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println("Response:", string(body))
}
EOF
# Build a static binary named "main" from the single source file
RUN CGO_ENABLED=0 go build main.go

# -=== Target image ===-
FROM scratch

# Copy the CA bundle from the builder stage
COPY --from=builder /etc/ssl/certs/ /etc/ssl/certs/
COPY --from=builder /app/main /
CMD ["/main"]
```

By adding COPY --from=builder /etc/ssl/certs/ /etc/ssl/certs/, the developer ensures that the runtime environment possesses the necessary cryptographic material to perform secure network communications.

The Zoneinfo and Timezone Dependency

Another critical failure relates to time and location services. Many applications require the ability to handle different time zones (e.g., Europe/Amsterdam). In Linux, this information is stored on disk, typically in the /usr/share/zoneinfo directory. Programs look up these files at runtime to calculate offsets from UTC.

If a Go program attempts to load a location using time.LoadLocation("Europe/Amsterdam") inside a scratch container, it will fail with the following panic:

panic: unknown time zone Europe/Amsterdam

The impact is a complete failure of any time-sensitive logic that relies on geographic location. Because these files do not appear magically, they must be explicitly provided.

The fix involves mirroring the CA bundle approach:

```dockerfile
# Builder stage omitted; same as in the CA bundle example

# -=== Target image ===-
FROM scratch

# Copy the timezone info from the builder stage
COPY --from=builder /usr/share/zoneinfo/ /usr/share/zoneinfo/
COPY --from=builder /app/main /
CMD ["/main"]
```

This ensures that the application can access the binary timezone database and correctly identify the time in specific global regions.

User Management and Permissions

Running containers as the root user is a security risk. However, standard user management relies on the existence of /etc/passwd and /etc/group files. In a scratch container, these files are absent, so nothing inside the container can map a numeric User ID (UID) to a human-readable username. The kernel itself only ever works with numeric IDs; the name lookup is done in user space.

While creating these files manually is tedious, there is a workaround for Go programs compiled with CGO_ENABLED=0. Developers can use the USER instruction to set numeric UIDs and GIDs, and the ENV instruction to set a placeholder username.

Example of numeric user implementation:

```dockerfile
FROM scratch
USER 65532:65532
ENV USER=nonroot
COPY --from=builder /app/main /
CMD ["/main"]
```

This allows the process to run as a non-privileged user (UID 65532), reducing the impact of a potential container breakout. However, it is important to note that this "hack" is only effective for statically compiled binaries that do not rely on C libraries for user lookup.

Dependency Analysis: Static vs. Dynamic Linking

The viability of FROM scratch depends entirely on the linking method used during the compilation of the application.

Statically Linked Binaries

A statically linked binary includes all the library code it needs to run. In the context of Go, setting CGO_ENABLED=0 tells the compiler to avoid using the C toolchain and to link the binary statically. These binaries are the primary candidates for scratch images because they do not look for shared object files (.so) at runtime.

Dynamically Linked Binaries

Most programs are dynamically linked, meaning they expect the operating system to provide common libraries (like libc) at runtime. A Go program built with CGO_ENABLED=1 that uses cgo, either directly or through standard-library packages such as net and os/user, is normally linked against the system's libc and will try to load it from the filesystem.

In a scratch container, these libraries are missing, and so is the dynamic loader that would resolve them. Attempting to run a dynamically linked binary fails immediately, often with a misleading "no such file or directory" error even though the binary itself is present. This is why building on FROM scratch is described as difficult for anything other than small, simple, and statically compiled programs.

The following table summarizes the requirements and capabilities when utilizing FROM scratch versus standard base images.

| Feature | Standard Base Image (e.g., Ubuntu) | FROM scratch |
| --- | --- | --- |
| Initial Filesystem | Full Root FS (bins, libs, etc.) | Empty (0 bytes) |
| Shell Access | Available (sh, bash) | None |
| Package Manager | Available (apt, apk) | None |
| Build Layer | Inherits layers from base | Next command is first layer |
| Security Surface | Large (contains many tools) | Minimal (single binary) |
| Binary Requirement | Dynamic or Static | Strictly Static (or manual lib copy) |
| CA Certificates | Pre-installed | Must be manually added |
| Timezone Data | Pre-installed | Must be manually added |

Conclusion: Strategic Analysis of the Tabula Rasa Approach

The decision to utilize FROM scratch is a trade-off between extreme optimization and operational complexity. From a technical perspective, removing every unnecessary binary significantly hardens the container. An attacker who gains execution capabilities within a scratch container finds themselves in a barren environment; there is no ls to list files, no cat to read configuration, and no curl to download malicious payloads. This makes scratch a strong choice for high-security, production-grade microservices.

However, the administrative burden is shifted entirely to the developer. The responsibility for maintaining the runtime environment—including the management of CA certificates, timezone databases, and user permissions—now falls on the Dockerfile's author. The "invisible" dependencies of modern software are exposed when the safety net of a distribution is removed.

In summary, FROM scratch is not a general-purpose tool for every project. It is a specialized instrument for developers who have achieved a high level of maturity in their build pipeline, specifically those utilizing statically compiled languages like Go or Rust. When the overhead of a minimal distribution like Alpine is still too large, or when the security requirements demand a zero-trust filesystem, scratch provides the only path to absolute minimality. The success of a scratch implementation lies not in the instruction itself, but in the meticulous reconstruction of the minimal viable environment required for the application to survive and thrive.

