GitLab CI YAML Include Architecture and Modular Configuration

The architecture of modern continuous integration and continuous deployment pipelines necessitates a move away from monolithic configuration files. As projects scale, the .gitlab-ci.yml file often becomes an unmanageable behemoth, leading to configuration drift, merge conflicts, and significant maintenance overhead. GitLab addresses this challenge through the include keyword, a powerful mechanism that allows developers to modularize their pipeline definitions by splitting them across multiple files. This capability transforms the CI configuration from a static script into a dynamic, composable system where common patterns can be reused across different projects, branches, or environments.

The include functionality is not merely a file-concatenation tool; it is a sophisticated merging engine. When GitLab parses a pipeline, it evaluates all included files and merges them into a single effective configuration. This process allows for the centralization of governance—such as security scans or compliance checks—while permitting individual project teams to define specific build and test logic. By leveraging nested includes, templates, and conditional rules, organizations can implement a hierarchical configuration strategy that balances global standardization with local flexibility.

Modularization Strategies via Include Sub-keys

GitLab provides several distinct methods for incorporating external YAML configurations, each tailored to a specific architectural need. These are implemented as sub-keys under the primary include keyword.

Local File Inclusion

The include:local sub-key is the primary method for restructuring a pipeline within a single repository. It allows a developer to reference a file using a relative path from the project root.

  • Direct Fact: Use include:local to reference YAML files located within the same project.
  • Impact Layer: This enables the separation of concerns. For instance, a project can have a dedicated ci/build.gitlab-ci.yml for compilation and a ci/test.gitlab-ci.yml for quality assurance, preventing the main .gitlab-ci.yml from becoming thousands of lines long.
  • Contextual Layer: This is the simplest form of modularization and serves as the foundation for more complex structures, such as the .gitlab/ci/ directory pattern used in the Python project example later in this article to house the common parts of the CI pipeline definition.
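
As a minimal sketch of this split, the main file below pulls in the two files named above. The ci/ paths are illustrative and assume both files exist in the same repository.

```yaml
# .gitlab-ci.yml (orchestrator)
# The ci/ paths are illustrative; adjust them to the real repository layout.
include:
  - local: ci/build.gitlab-ci.yml   # compilation jobs
  - local: ci/test.gitlab-ci.yml    # quality-assurance jobs

stages:
  - build
  - test
```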

Project-Based Inclusion

When a configuration is useful across multiple projects within the same GitLab instance, include:project is employed, with its file sub-key naming the path of the YAML file inside that project.

  • Direct Fact: This method allows the inclusion of a YAML file from a different project on the same GitLab instance.
  • Impact Layer: This is critical for platform engineering teams who maintain "golden paths" or standardized pipeline templates. Instead of copying and pasting YAML across fifty projects, they maintain one central repository of CI logic.
  • Contextual Layer: This shifts the maintenance burden from the individual developer to a centralized maintainer, ensuring that a security update to a build script is propagated to all projects simultaneously.
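
A hedged sketch of what a consuming project might write is shown below. The project path, tag, and file name are hypothetical placeholders, not names taken from a real repository.

```yaml
# .gitlab-ci.yml in a consuming project
include:
  - project: platform-team/ci-templates            # hypothetical central repository
    ref: v1.4.0                                     # hypothetical tag; optional
    file: /templates/python-build.gitlab-ci.yml    # hypothetical file path
```

If ref is omitted, the file is taken from the HEAD of the central project's default branch, which is what makes simultaneous propagation possible. Pinning a ref trades that immediacy for stability.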

Remote File Inclusion

For organizations operating across multiple GitLab instances or utilizing external configuration sources, include:remote provides the necessary bridge.

  • Direct Fact: The include:remote sub-key fetches a YAML file from an external URL.
  • Impact Layer: This allows for the sharing of CI logic across entirely different organizational boundaries or different GitLab installations, facilitating a "Configuration as Code" approach that transcends a single instance.
  • Contextual Layer: While powerful, this introduces a dependency on external network availability and the security of the remote host.
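
For illustration only, a remote include could look like the following. The URL is a made-up placeholder; the file must be reachable with a plain HTTP or HTTPS GET request.

```yaml
include:
  # Hypothetical URL; replace with the raw address of a real YAML file.
  - remote: 'https://gitlab.example.com/platform/ci-templates/-/raw/main/security-scan.gitlab-ci.yml'
```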

Template-Based Inclusion

GitLab ships a large library of reference templates that can be integrated using include:template.

  • Direct Fact: include:template pulls in the official CI/CD templates that are distributed with GitLab.
  • Impact Layer: Newcomers can quickly bootstrap their pipelines using maintained configurations for languages like Python without writing YAML from scratch.
  • Contextual Layer: Users must be aware that official templates are subject to change over time, meaning a pipeline that works today might require updates if the underlying GitLab template is modified.
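
A short sketch using one of the templates distributed with GitLab (Auto DevOps is chosen here only as an example):

```yaml
include:
  - template: Auto-DevOps.gitlab-ci.yml   # resolved from GitLab's bundled templates, not from the repository
```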

The Merging Engine and Configuration Overrides

The process of combining the main .gitlab-ci.yml with included files is known as merging. This is not a simple append operation but a logical merge of YAML objects.

Array Merging Limitations

One of the most critical technical constraints of the GitLab merging engine is how it handles arrays, such as the script section of a job.

  • Direct Fact: You cannot add or modify individual items within an array during the merge process.
  • Impact Layer: If an included template defines a script array, and the main file defines a script array for the same job, the main file's array completely replaces the template's array.
  • Contextual Layer: To achieve a "combined" script, the developer must explicitly repeat all previous commands from the template and then add the new command.

Example of Array Overwrite:

If autodevops-template.yml contains:

```yaml
production:
  stage: production
  script:
    - install_dependencies
    - deploy
```

And .gitlab-ci.yml contains:

```yaml
include: 'autodevops-template.yml'

stages:
  - production

production:
  script:
    - install_dependencies
    - deploy
    - notify_owner
```

The resulting job executes all three commands. Had the developer written only - notify_owner, the install_dependencies and deploy steps would be missing from the merged job, because the main file's script array replaces the template's array.
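
Hashes behave differently from arrays: their keys are combined, and only colliding keys are overridden. The sketch below illustrates this under stated assumptions; the file name base.gitlab-ci.yml and all values are hypothetical.

```yaml
# base.gitlab-ci.yml (hypothetical included file)
variables:
  POSTGRES_USER: user
  POSTGRES_DB: app_db

production:
  stage: production
  script:
    - install_dependencies
    - deploy
```

```yaml
# .gitlab-ci.yml
include:
  - local: base.gitlab-ci.yml

stages:
  - production

variables:
  POSTGRES_USER: root      # collides with the included key, so this value wins

production:
  environment:
    name: production       # new key, added to the included job
```

The effective configuration would then be expected to look like this, with the script array intact because the main file never redefines it:

```yaml
variables:
  POSTGRES_USER: root
  POSTGRES_DB: app_db

stages:
  - production

production:
  stage: production
  script:
    - install_dependencies
    - deploy
  environment:
    name: production
```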

Nested Includes and Idempotency

GitLab supports the nesting of includes, meaning an included file can itself include another file.

  • Direct Fact: Nested includes can go multiple levels deep (e.g., .gitlab-ci.yml -> another-config.yml -> config-defaults.yml).
  • Impact Layer: This allows for a layered configuration hierarchy. Global defaults can be placed at the bottom of the chain, while project-specific overrides sit at the top.
  • Contextual Layer: Duplicate includes are handled idempotently: if the same configuration file is reached through different include paths, the effect is the same as if it had been included once, so no redundant job definitions result.
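
The three-level chain mentioned above can be sketched as follows. The file names come from the example chain; the contents of each file are assumptions made for illustration.

```yaml
# .gitlab-ci.yml (top of the chain: project-specific jobs)
include:
  - local: another-config.yml

unit-tests:
  script:
    - echo "project-specific job"
```

```yaml
# another-config.yml (middle layer)
include:
  - local: config-defaults.yml
```

```yaml
# config-defaults.yml (bottom of the chain: global defaults)
default:
  interruptible: true
```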

Dynamic Inclusion via Variables and Rules

Modern GitLab CI versions have introduced the ability to make the include process dynamic, allowing the pipeline to adapt based on the environment or the commit.

Variable Support in Includes

The include section can utilize specific types of variables to determine which files to load.

  • Direct Fact: Supported variables are project variables, group variables, instance variables, and project predefined variables.
  • Impact Layer: This allows for environment-specific configuration loading. For example, a different set of CI files can be loaded based on the $CI_PROJECT_PATH.
  • Contextual Layer: Starting in GitLab 14.2, the $CI_COMMIT_REF_NAME variable is supported, returning the full ref path (e.g., refs/heads/branch-name).
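
A minimal sketch of a variable inside an include, using the $CI_PROJECT_PATH variable mentioned above. The file name is illustrative.

```yaml
include:
  - project: '$CI_PROJECT_PATH'          # project predefined variable, expanded when includes are resolved
    file: '.compliance-gitlab-ci.yml'    # illustrative file name
```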

Variable Limitations

Not all variables are available during the include phase because YAML files are parsed before the pipeline is actually created.

  • Direct Fact: Pipeline predefined variables such as CI_PIPELINE_ID, CI_PIPELINE_URL, CI_PIPELINE_IID, and CI_PIPELINE_CREATED_AT are unavailable.
  • Impact Layer: Developers cannot use the unique ID of a pipeline to dynamically select an include file.
  • Contextual Layer: Furthermore, variables defined within the variables: section of a job or the global variables section cannot be used in include statements because includes are evaluated before jobs are initialized.

Conditional Inclusion with Rules

The include:rules keyword allows for the conditional inclusion of files based on the state of CI/CD variables.

  • Direct Fact: rules:if can be used to conditionally include files.
  • Impact Layer: This prevents the pipeline from becoming bloated with unnecessary jobs. For example, a "deployment" YAML can be included only if the commit is pushed to the main branch.
  • Contextual Layer: In versions prior to GitLab 14.5, $CI_COMMIT_REF_NAME held the full ref path inside include, so a rule often had to match it with the regex operator =~ (for example =~ /main/) rather than compare it with ==.
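
A sketch of conditional inclusion, assuming two hypothetical files (builds.yml and deploys.yml) and a hypothetical variable $INCLUDE_BUILDS:

```yaml
include:
  - local: builds.yml
    rules:
      - if: $INCLUDE_BUILDS == "true"       # hypothetical CI/CD variable
  - local: deploys.yml
    rules:
      - if: $CI_COMMIT_BRANCH == "main"     # deployment jobs only for the main branch
```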

Troubleshooting Common Inclusion Failures

Despite the power of the include system, developers frequently encounter pitfalls related to stage mapping and job visibility.

The Missing Job Phenomenon

A common issue occurs when a developer includes a file containing multiple jobs, but only the first job appears in the merged YAML view or the pipeline.

  • Direct Fact: If the stages list in the main file does not match the stages required by the included jobs, those jobs may not execute or appear correctly.
  • Impact Layer: A developer might define build and another-build in an included file, but if another-build is not assigned to a stage defined in the main .gitlab-ci.yml's stages list, it will be omitted from the pipeline.
  • Contextual Layer: This often happens when the main file defines a strict list of stages:

    ```yaml
    stages:
      - build
      - test
      - deploy
    ```

    If the included file contains a job whose stage is not in this list (a job with no stage keyword defaults to test), the job does not end up in the pipeline as intended. To fix this, assign every included job explicitly to a stage that the main file declares.
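
A sketch of the fix, using the build and another-build job names from above. The file path and script commands are hypothetical.

```yaml
# ci/build.gitlab-ci.yml (hypothetical included file)
build:
  stage: build            # matches an entry in the main file's stages list
  script:
    - echo "compile"

another-build:
  stage: build            # explicit stage instead of relying on the default
  script:
    - echo "compile variant"
```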

Undefined Needs in Conditional Includes

There is a known limitation regarding the `needs:` keyword when used with conditionally included jobs.

  • Direct Fact: You cannot use needs: to create a dependency pointing to a job added via include:local:rules.
  • Impact Layer: When the configuration is validated, GitLab returns undefined need: <job-name>.
  • Contextual Layer: This creates a challenge for complex DAG (Directed Acyclic Graph) pipelines where the existence of a dependency is conditional.
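
The configuration shape that runs into this limitation can be sketched as follows. The file path and job names are hypothetical, and the error noted in the comment is the one described above rather than a captured log.

```yaml
# .gitlab-ci.yml
include:
  - local: ci/package.gitlab-ci.yml      # hypothetical file that defines a job named "package"
    rules:
      - if: $CI_COMMIT_BRANCH == "main"

publish:
  stage: deploy
  script:
    - echo "publish"
  needs:
    - package    # per the limitation above, validation reports: undefined need: package
```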

Technical Specifications and Version History

The evolution of the `include` keyword reflects GitLab's commitment to expanding the flexibility of CI/CD.

| Feature | Version Introduced | Detail |
|---|---|---|
| Initial Include Support | Early Versions | Basic local, remote, and template support. |
| Feature Flag Removal | 13.9 | Standardized the include behavior across all installations. |
| Variable Support (Project/Group/Instance) | 14.2 | Allowed dynamic pathing via organizational variables. |
| $CI_COMMIT_REF_NAME Support | 14.2 | Enabled branch-specific inclusion paths. |
| Pipeline Variable Support | 14.5 | Expanded variables to include trigger and manual run variables. |
| Regex Resolution | 14.5 | Fixed behavior of $CI_COMMIT_REF_NAME matching in rules. |
| Nested Includes | 14.8 | Allowed configuration files to include other configuration files. |

Implementation Example: Python Project Structure

To illustrate the practical application of these concepts, consider a professional Python project structure designed for maximum reusability.

Common Configuration (.gitlab/ci/common.gitlab-ci.yml)

This file houses the global defaults and base images to ensure consistency across all Python projects in the organization.

```yaml
stages:
  - lint
  - test
  - run
  - deploy

default:
  interruptible: true

variables:
  PY_COLORS: "1"
  CACHE_PATH: "$CI_PROJECT_DIR/.cache"
  UV_CACHE_DIR: "$CACHE_PATH/uv"
  UV_PROJECT_ENVIRONMENT: "$CACHE_PATH/venv"
  PIP_CACHE_DIR: "$CACHE_PATH/pip"

.base_image:
  image: python:3.13

.base_rules:
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

.dependencies:
  before_script:
    - pip install --upgrade pip
    - pip install uv
    - uv sync --frozen
  cache:
    key:
      files:
        - uv.lock
      prefix: $CI_JOB_IMAGE
    paths:
      - "$CACHE_PATH"
```

Main Configuration (.gitlab-ci.yml)

The main file acts as the orchestrator, pulling in the common logic and defining the specific project jobs.

```yaml
include:
  - local: .gitlab/ci/common.gitlab-ci.yml

lint-job:
  extends:
    - .base_image
    - .dependencies   # installs uv, which the script below relies on
  stage: lint
  script:
    - uv run flake8 .

test-job:
  extends:
    - .base_image
    - .dependencies
  stage: test
  script:
    - uv run pytest
```

Analysis of Architectural Impact

The transition from a single `.gitlab-ci.yml` to a modular `include`-based architecture changes how CI configuration is maintained. Treating pipeline definitions as reusable modules rather than one static script brings three operational advantages.

First, less redundancy. When a security vulnerability is found in a base image, a single change in a centralized `defaults.gitlab-ci.yml` file reaches every pipeline that includes it. This eliminates the "update toil" associated with manual edits across hundreds of repositories.

Second, a better developer experience. New developers are no longer intimidated by a 2,000-line YAML file. Instead, they interact with a high-level orchestrator and open only the modular files relevant to their current task.

Third, stronger compliance. By using `include:project` or `include:remote`, a central governance team can supply the compliance-checking jobs that every pipeline is expected to include. Since these files can be protected in a separate repository, individual project developers cannot alter the scan definitions themselves.

The trade-off is reduced visibility. The "Merged YAML" view in the GitLab Pipeline Editor becomes the most important tool for debugging. Without it, tracing the origin of a specific job configuration through three layers of nested includes and conditional rules would be nearly impossible. Viewing the final merged YAML is the main safeguard against the "hidden override" problem, where a main file unintentionally wipes out a critical step in an included template.

