GitLab CI Containerization and Registry Integration

Automating the creation, testing, and deployment of container images within a Continuous Integration (CI) pipeline is a fundamental pillar of modern DevOps. GitLab supports this on all tiers (Free, Premium, and Ultimate) and all offerings (GitLab.com, GitLab Self-Managed, and GitLab Dedicated), so developers can integrate Docker and other container engines directly into their software development lifecycle. At its core, using GitLab CI/CD to build Docker images means turning application code into a portable, immutable artifact that is pushed to a container registry and later pulled for deployment. Because GitLab CI jobs themselves typically execute inside Docker containers, building a "container within a container" is an architectural challenge that requires a specific configuration strategy: Docker-in-Docker (DinD), the shell executor, or a daemonless alternative such as Podman.

Architectural Strategies for Docker Command Execution

To execute Docker commands within a CI/CD pipeline, the environment must provide access to a Docker daemon (dockerd). Since the standard GitLab Runner environment is often a container, the CLI cannot simply communicate with a local daemon unless specific configurations are applied.

The Shell Executor Approach

One method to enable Docker commands is to configure the GitLab Runner to use the shell executor. In this setup, the runner does not start a new container for every job but instead executes the scripts directly on the host machine's shell.

Installation and Registration: The process begins by installing GitLab Runner on a server and registering it. During registration, the shell executor must be selected explicitly. Newer GitLab releases replace registration tokens with runner authentication tokens, which are passed with --token instead. An example registration command is:
sudo gitlab-runner register -n \
  --url "https://gitlab.com/" \
  --registration-token REGISTRATION_TOKEN \
  --executor shell \
  --description "My Runner"
Dependency Requirements: For this configuration to function, the Docker Engine must be installed on the same server where the GitLab Runner is hosted.
Permission Logic: The gitlab-runner user executes the job commands, so it must be allowed to access the Docker socket, typically by adding it to the docker group.
Impact: This approach avoids nested containers, but jobs run directly on the host as gitlab-runner, and membership in the docker group is effectively root-equivalent on that host, which increases the security risk.
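The permission step above can be checked with a small helper. This is a minimal sketch: in_docker_group is a hypothetical function name, and it assumes the group is called docker, as it is on a standard Docker Engine installation.

```shell
# Sketch: report whether a user belongs to the docker group.
# in_docker_group is a hypothetical helper, not part of GitLab Runner.
in_docker_group() {
  # id -nG prints the user's group names separated by spaces;
  # grep -qx matches one whole line exactly and prints nothing.
  id -nG "$1" | tr ' ' '\n' | grep -qx docker
}
```

If the check fails on the runner host, the usual fix is sudo usermod -aG docker gitlab-runner, followed by sudo -u gitlab-runner -H docker info to confirm that the user can reach the daemon.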

Docker-in-Docker (DinD)

Docker-in-Docker is the standard technique for running Docker commands when the job itself is containerized. It solves the problem by running a separate Docker daemon as a service alongside the build job.

The Daemon Mechanism: The Docker CLI is merely a client. It communicates with dockerd to perform actual work. In a CI environment, the docker:dind image provides this daemon.
Service Configuration: In the .gitlab-ci.yml file, the docker:dind image is defined as a service. This allows the job container to communicate with the daemon container.
Service Aliasing: To make the connection explicit, an alias can be assigned to the service, such as dockerdaemon.
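Environment Variables: The Docker CLI must be told where to find the daemon, which is done with the DOCKER_HOST variable. Port 2375 is the daemon's unencrypted port; recent docker:dind images enable TLS on port 2376 by default, so a plain-TCP setup also sets DOCKER_TLS_CERTDIR to an empty string. A typical configuration looks like:
DOCKER_HOST: tcp://dockerdaemon:2375/
DOCKER_TLS_CERTDIR: ""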
Impact: DinD allows for a clean, isolated environment for every build but requires the runner to be configured in privileged mode, which can be a security concern in shared environments.
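The connection logic can be sketched as a shell helper that a job script might call before running docker. The helper name dind_docker_host is hypothetical, and the dockerdaemon default mirrors the alias used above. The sketch assumes the conventions of the official docker images: DOCKER_TLS_CERTDIR defaults to /certs, a non-empty value means TLS on port 2376, and an empty value means plain TCP on port 2375.

```shell
# Sketch: derive DOCKER_HOST for a docker:dind service.
# dind_docker_host is a hypothetical helper, not part of Docker or GitLab.
dind_docker_host() {
  local svc="${1:-dockerdaemon}"
  # Assumption: when the variable is unset, fall back to /certs, the
  # default baked into the official docker images. An explicitly empty
  # value is kept as-is and selects the unencrypted port.
  local certdir="${DOCKER_TLS_CERTDIR-/certs}"
  if [ -n "$certdir" ]; then
    printf 'tcp://%s:2376\n' "$svc"
  else
    printf 'tcp://%s:2375\n' "$svc"
  fi
}
```

A job would then run export DOCKER_HOST="$(dind_docker_host dockerdaemon)" followed by docker info. With TLS enabled, the client typically also needs DOCKER_TLS_VERIFY=1 and DOCKER_CERT_PATH="$DOCKER_TLS_CERTDIR/client".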

Podman as a Daemonless Alternative

Podman is a largely command-compatible reimplementation of the Docker tooling with a fundamentally different architecture. Unlike Docker, Podman does not rely on a central daemon.

Architecture: Because there is no daemon, the Podman CLI performs all the work itself. This removes the need for a DinD service and, in many setups, for privileged mode, although running Podman inside an unprivileged container can still require extra runner configuration.
Compatibility: Podman accepts most of the same commands and options as Docker, making it a near drop-in replacement for most CI pipelines.
Implementation in GitLab CI: A job using Podman can be configured using the quay.io/podman/stable image. The script involves logging into the registry and building the image:
podman login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
podman build -t "$CI_REGISTRY_IMAGE:podman" .
podman push "$CI_REGISTRY_IMAGE:podman"
Command Aliasing: To maintain compatibility with scripts written for Docker, users can create a symbolic link from podman to docker:
ln -s /usr/bin/podman /usr/bin/docker
Impact: Using Podman significantly simplifies the .gitlab-ci.yml configuration and enhances security by removing the requirement for privileged containers.
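Where the job user cannot write to /usr/bin, a shell function is one possible alternative to the symbolic link. This is a sketch that assumes podman is on PATH in the job image; unlike the link, it only affects the shell session in which it is defined.

```shell
# Sketch: forward docker invocations to podman for the current shell only.
# Assumption: podman is installed and on PATH in the job image.
docker() {
  podman "$@"
}
```

Existing script lines such as docker build -t "$CI_REGISTRY_IMAGE:podman" . would then run through Podman unchanged, provided they use only options that both tools support.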

Implementing the Build Workflow

A complete build pipeline requires a source repository, a definition of the environment (Dockerfile), and a set of instructions for the CI runner (.gitlab-ci.yml).

Repository Preparation and Branching

The process begins with the creation of a project and the organization of the codebase.

Initial Setup: A new repository is created in GitLab, such as one named build-with-ci-example.
Local Environment: The repository is cloned to a local machine using the command:
git clone <repo URL>
cd build-with-ci-example
Feature Isolation: To avoid disrupting the main branch, a specific feature branch is created for CI implementation:
git checkout -b feature/add-CI

Constructing the Dockerfile

The Dockerfile defines the environment and the application layers. An effective Dockerfile uses arguments for flexibility and ensures the image is lean.

Base Image Configuration: The use of ARG allows the base image to be configurable, which is useful for testing different versions of a language. For example:
ARG BASE_IMAGE=python:3.7
FROM ${BASE_IMAGE}
System Maintenance: To ensure security and stability, the image should be updated and cleaned of unnecessary files:
USER root
RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*
User Management: For security, images should not run as root. A dedicated user is created:
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker
Environment Setup: The HOME variable and the working directory are set so that the application has a consistent starting point:
ENV HOME=/home/docker
WORKDIR ${HOME}/data
USER docker
Execution Logic: A script is used as the ENTRYPOINT to handle initialization, and CMD supplies its default argument, which is replaced by any arguments given to docker run:
COPY entrypoint.sh $HOME/entrypoint.sh
ENTRYPOINT ["/bin/bash", "/home/docker/entrypoint.sh"]
CMD ["Docker"]

The Entrypoint Script

The entrypoint.sh script acts as the gateway for the container, ensuring that the environment is correctly initialized before the main application starts. It typically includes:
- Shebang: #!/usr/bin/env bash
- Error handling: set -e to ensure the script exits on the first failure.
- Main function: A structured function to manage the startup logic.

Registry Interaction and Image Management

The GitLab Container Registry is a built-in tool for storing and managing Docker images. Proper authentication and tagging are critical for a successful pipeline.

Authentication and Pushing

Before any image can be uploaded, the CI job must authenticate with the registry.

Authentication: The pipeline logs in with predefined variables provided by GitLab: $CI_REGISTRY for the registry address, and $CI_REGISTRY_USER and $CI_REGISTRY_PASSWORD for the job's temporary credentials.
Building the Image: An image is built and tagged with the registry path:
docker build -t registry.example.com/group/project/image .
Pushing the Image: The image is then uploaded to the registry:
docker push registry.example.com/group/project/image
Versioning and Tags: Using the Git SHA in the image tag is recommended. This ensures that each job produces a unique image, preventing the use of stale images and providing a clear audit trail of which commit produced which image.
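SHA-based tagging can be sketched as a small helper that composes the image reference from GitLab's predefined variables CI_REGISTRY_IMAGE and CI_COMMIT_SHORT_SHA. The function name image_ref is hypothetical.

```shell
# Sketch: build a unique image reference from GitLab's predefined variables.
# image_ref is a hypothetical helper name.
image_ref() {
  # ${VAR:?message} aborts with an error if the variable is unset or
  # empty, which catches accidental runs outside a CI job.
  printf '%s:%s\n' \
    "${CI_REGISTRY_IMAGE:?not running in GitLab CI}" \
    "${CI_COMMIT_SHORT_SHA:?not running in GitLab CI}"
}
```

A job script could then call docker build --pull -t "$(image_ref)" . followed by docker push "$(image_ref)", so that every commit yields a distinct tag.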

Optimal Build Practices

To ensure reliability and performance, specific Docker flags and strategies should be employed.

Pulling Base Images: Using docker build --pull ensures that the latest version of the base image is fetched, preventing the use of outdated cached versions.
Explicit Pulls: In environments with multiple runners that cache images locally, an explicit docker pull should be executed before every docker run to ensure the most recent version of the built image is used.
Parallelization: Jobs in the same stage run in parallel when enough runners are available, so several Docker images can be built at once, shortening the overall pipeline duration.

Pulling and Running Built Images

Once an image is pushed to the registry, it can be pulled and executed on any machine with Docker installed.

Pulling the Image: The full registry name must be used:
docker pull gitlab-registry.cern.ch/<user name>/build-with-ci-example:py-3.8
Running the Container: The image can be run with specific flags, such as --rm to remove the container after exit and -ti for an interactive terminal:
docker run --rm -ti gitlab-registry.cern.ch/<user name>/build-with-ci-example:py-3.8 python3 --version

Comparison of Build Methods

The following table provides a technical comparison of the three primary methods for executing Docker builds within GitLab CI.

Method	Requirement	Daemon Needed	Privileged Mode	Complexity
Shell Executor	Docker Engine on Host	Yes (Host)	No	Low
Docker-in-Docker	`docker:dind` service	Yes (Container)	Yes	High
Podman	`quay.io/podman/stable`	No	No	Low

Detailed Analysis of CI Integration

The integration of container builds into a CI pipeline transforms the development process from a manual "it works on my machine" approach to a standardized, automated delivery system. By utilizing the GitLab Container Registry, the project maintains a strong link between the source code and the resulting artifact. The use of the .gitlab-ci.yml file allows for the definition of stages, such as a build stage, where the image is constructed and pushed.

When comparing the methods, the shift toward Podman reflects a trend toward "daemonless" containers, which reduces the attack surface of the CI runner. Traditional DinD requires privileged = true in the [runners.docker] section of the runner's config.toml (or --docker-privileged at registration), which disables most container isolation and effectively gives the job container root-level access to the host machine. Podman avoids the daemon entirely: the CLI starts containers itself rather than delegating to a background process.

Furthermore, the use of the before_script section in .gitlab-ci.yml is a best practice for authentication. Placing the docker login or podman login command in the before_script ensures that all subsequent steps in the job have the necessary permissions to interact with the registry without repeating the login logic in every single task.
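
A login step suitable for before_script might look like the sketch below. The function name registry_login is hypothetical; the three CI_REGISTRY variables are predefined by GitLab. The sketch passes the password on standard input with --password-stdin rather than with -p, which keeps it out of the process argument list.

```shell
# Sketch: log in to the GitLab Container Registry once per job.
# registry_login is a hypothetical helper name.
registry_login() {
  # printf avoids the trailing newline that echo would append.
  printf '%s' "$CI_REGISTRY_PASSWORD" |
    docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
}
```

podman login accepts the same --password-stdin option, so the helper should carry over to the Podman setup by swapping the command name.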

The combination of a configurable ARG in the Dockerfile and the use of GitLab's predefined variables creates a highly flexible pipeline. For example, a project can build separate images for Python 3.7 and Python 3.8 in parallel by passing different arguments to the build command, allowing for comprehensive compatibility testing across multiple runtime versions.
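
The per-version build can be sketched as a helper that overrides BASE_IMAGE through --build-arg. It assumes the Dockerfile declares ARG BASE_IMAGE as shown earlier; the function name build_for_python and the py-<version> tag scheme are illustrative choices.

```shell
# Sketch: build one image per Python version by overriding BASE_IMAGE.
# build_for_python and the py-<version> tag scheme are illustrative.
build_for_python() {
  local version="$1"
  # --pull refreshes the base image; --build-arg fills the Dockerfile's
  # ARG BASE_IMAGE for this build only.
  docker build --pull \
    --build-arg "BASE_IMAGE=python:${version}" \
    -t "${CI_REGISTRY_IMAGE}:py-${version}" .
}
```

In a pipeline, each version would typically be its own job (or one entry of a parallel:matrix definition) that calls build_for_python with a different argument. On a single machine, build_for_python 3.7 & build_for_python 3.8 & wait runs both builds concurrently.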