GitLab CI/CD Configuration Modularization via Multiple Includes

The architectural integrity of a continuous integration and continuous delivery (CI/CD) pipeline depends heavily on its maintainability and scalability. In GitLab CI/CD, the include keyword serves as the primary mechanism for modularizing pipeline configurations. By allowing the separation of the main .gitlab-ci.yml file from specific job definitions, templates, or environment-specific configurations, GitLab enables a "Don't Repeat Yourself" (DRY) approach to DevOps. When a project scales, a single monolithic YAML file becomes an operational liability, leading to merge conflicts and cognitive overload for maintainers. The implementation of multiple includes allows engineers to decompose complex pipelines into smaller, reusable fragments that can be managed independently across different projects or repositories.

The ability to incorporate multiple external files is not merely a convenience but a strategic necessity for enterprise-grade infrastructure. Whether utilizing local files within the same repository, remote files via HTTPS, or centralized templates from a dedicated DevOps project, the include mechanism allows for a tiered configuration strategy. This modularity supports the creation of a "Golden Path" for developers, where common security scans, deployment patterns, and testing frameworks are standardized across an organization, while individual projects retain the flexibility to override specific parameters.

Comprehensive Syntax for Multiple Includes

GitLab provides several distinct methods for incorporating external configuration files. Depending on the source of the configuration and the desired level of flexibility, users can employ different sub-keys within the include array.

The most basic form of inclusion is the direct file reference. A user can specify a single file or an array of files.

For a single file, the syntax can be written as a simple string:
include: 'my-config.yml'

Alternatively, it can be defined as a single-item array:
include:
  - 'my-config.yml'

To incorporate multiple files, an array of strings is used:
include:
  - 'https://gitlab.com/awesome-project/raw/main/.before-script-template.yml'
  - 'templates/.after-script-template.yml'

For more complex requirements, GitLab provides specialized sub-keys to define the source of the YAML content.

The local keyword is used for files within the same repository:
include:
  - local: 'templates/.after-script-template.yml'

The remote keyword allows for the inclusion of files hosted on external servers via URL:
include:
  - remote: 'https://gitlab.com/awesome-project/raw/main/.before-script-template.yml'

The template keyword includes one of the CI/CD templates that ship with GitLab:
include:
  - template: Auto-DevOps.gitlab-ci.yml

The project keyword is used to pull configurations from other GitLab projects, which is essential for centralized CI management. This syntax allows specifying the project path, the branch or tag (ref), and the specific file:
include:
  - project: 'my-group/my-project'
    ref: main
    file: 'templates/.gitlab-ci-template.yml'

A flexible configuration can combine all these types into a single array:
include:
  - 'https://gitlab.com/awesome-project/raw/main/.before-script-template.yml'
  - 'templates/.after-script-template.yml'
  - template: Auto-DevOps.gitlab-ci.yml
  - project: 'my-group/my-project'
    ref: main
    file: 'templates/.gitlab-ci-template.yml'

Optimized Project Inclusion Syntax

In earlier versions of GitLab, the include:project syntax was limited to a single file per entry. This forced DevOps engineers to repeat the project path and reference for every single file they wanted to include from a central repository. For example, if a user needed two different terraform configurations from a central project, they would have to write:

include:
  - project: devops/ci-cd/pipelines
    ref: latest
    file: terraform/terraform.yml
  - project: devops/ci-cd/pipelines
    ref: latest
    file: terraform/deploy/continuous.yml

This repetition increased the verbosity of the .gitlab-ci.yml file and made it more prone to errors during manual updates. To resolve this, GitLab updated the syntax to allow the file key to accept a list of files. This optimization allows the project and reference to be declared once, followed by multiple files.

The optimized syntax is as follows:
include:
  - project: devops/ci-cd/pipelines
    ref: latest
    file:
      - terraform/terraform.yml
      - terraform/deploy/continuous.yml

This change significantly reduces the size of the configuration file and simplifies the process of adding new shared templates to a pipeline.

Nested Includes and Duplicate Handling

GitLab supports nested includes, meaning an included file can itself contain an include section that pulls in further files. This creates a hierarchy of configurations. For instance, a main .gitlab-ci.yml file might include a local file called /.gitlab-ci/another-config.yml, which in turn includes /.gitlab-ci/config-defaults.yml.
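As a rough sketch of that chain, the three files might look like the following. The file paths come from the example above; the job name and the contents of the defaults file are assumptions made purely for illustration. Each document separated by `---` stands for a separate file.

```yaml
# File: .gitlab-ci.yml (main file), pulls in one local file
include:
  - local: '/.gitlab-ci/another-config.yml'
---
# File: /.gitlab-ci/another-config.yml, itself includes the shared defaults
include:
  - local: '/.gitlab-ci/config-defaults.yml'

# Hypothetical job, shown only so this file has content of its own
build-job:
  script: echo "building"
---
# File: /.gitlab-ci/config-defaults.yml, end of the chain (hypothetical contents)
default:
  retry: 1
```

The main file never mentions config-defaults.yml directly; it reaches the pipeline only through the intermediate file.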

A critical aspect of this system is how GitLab handles duplicate inclusions. If a configuration file is included multiple times—either directly in the main file or via different nested paths—it is treated as if it were included only once.

Consider a scenario where a main file includes unit-tests.gitlab-ci.yml and smoke-tests.gitlab-ci.yml. Both of these files independently include defaults.gitlab-ci.yml.

Contents of defaults.gitlab-ci.yml:
default:
  before_script: default-before-script.sh
  retry: 2

Contents of unit-tests.gitlab-ci.yml:
include:
  - local: defaults.gitlab-ci.yml

unit-test-job:
  script: unit-test.sh
  retry: 0

Contents of smoke-tests.gitlab-ci.yml:
include:
  - local: defaults.gitlab-ci.yml

smoke-test-job:
  script: smoke-test.sh
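The main file for this scenario is not shown. A minimal sketch, assuming all three files sit in the repository root and are pulled in with the local keyword, could be:

```yaml
# File: .gitlab-ci.yml (main file)
# Assumption: the two test files live in the repository root.
include:
  - local: 'unit-tests.gitlab-ci.yml'
  - local: 'smoke-tests.gitlab-ci.yml'
```

The shared defaults file is reached twice, once through each test file, which is the duplicate-inclusion case described above.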

In this case, the final merged configuration results in the following logic:

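| Job Name       | before_script            | script        | retry |
|----------------|--------------------------|---------------|-------|
| unit-test-job  | default-before-script.sh | unit-test.sh  | 0     |
| smoke-test-job | default-before-script.sh | smoke-test.sh | 2     |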

The unit-test-job overrides the retry value of 2 (from the default) with 0, while the smoke-test-job inherits the default value of 2.

The order of includes is paramount when overrides are involved. If multiple included files define the same keyword or job, the definition from the file included last takes precedence: later entries in the include array override earlier ones. Definitions written directly in the main .gitlab-ci.yml file override anything that comes from an included file, regardless of where the include section appears in that file.
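A small sketch can make the precedence concrete. The file names ci/base.yml and ci/overrides.yml and the job lint-job are hypothetical, chosen only for this illustration. Each document separated by `---` stands for a separate file.

```yaml
# File: ci/base.yml (hypothetical), included first
lint-job:
  stage: test
  script: echo "lint from base"
  retry: 1
---
# File: ci/overrides.yml (hypothetical), included second
lint-job:
  retry: 0
---
# File: .gitlab-ci.yml (main file)
include:
  - local: 'ci/base.yml'
  - local: 'ci/overrides.yml'   # later entry: its keys win on conflict

lint-job:
  script: echo "lint from main file"   # main-file keys override included ones
```

Under these assumptions the merged lint-job keeps stage: test from the base file, takes retry: 0 from the later include, and runs the script defined in the main file, because job definitions are merged key by key.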

Variable Integration within Include Sections

The include keyword is not static; it can be dynamically influenced by variables. This allows pipelines to be environment-aware or to load different configurations based on the branch or tag being processed.

GitLab supports several types of variables within the include section:

  • Project variables
  • Group variables
  • Instance variables
  • Project predefined variables

Starting with GitLab 14.2, the $CI_COMMIT_REF_NAME predefined variable is supported. However, users must be aware that when used within an include block, $CI_COMMIT_REF_NAME returns the full reference path (e.g., refs/heads/branch-name) rather than just the branch name.

This variable support enables advanced patterns, such as loading a specific configuration file based on the current git ref, allowing for separate pipeline definitions for main versus develop branches.
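One way such a pattern might look is sketched below. TEMPLATE_REF and PIPELINE_PROFILE are hypothetical project- or group-level CI/CD variables that do not appear elsewhere in this article, and the set of variables that can be expanded in include depends on the GitLab version, so treat this as a pattern to verify rather than a guaranteed configuration.

```yaml
# TEMPLATE_REF and PIPELINE_PROFILE are hypothetical CI/CD variables
# defined in the project or group settings (assumption).
include:
  - project: 'my-group/my-project'
    ref: '$TEMPLATE_REF'                 # for example a branch or tag name
    file: 'templates/.gitlab-ci-template.yml'
  - local: 'ci/$PIPELINE_PROFILE.yml'    # for example ci/main.yml or ci/develop.yml
```

Using a dedicated variable rather than $CI_COMMIT_REF_NAME in a file path sidesteps the full reference path behaviour noted above.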

Technical Constraints and Known Issues

Despite the power of multiple includes, there are specific technical pitfalls and bugs that engineers must navigate.

The Stages Definition Conflict

A significant issue occurs when multiple included templates define their own stages. In a standard GitLab CI configuration, the stages keyword defines the global order of execution for jobs.

If a user includes multiple files and each of those files defines a stages block, GitLab does not concatenate the lists. When includes are combined, hash maps are merged key by key, but arrays such as stages are replaced as a whole, so only the stages definition of the last included file is respected.

Example of the failure:
Main file:
include:
  - project: 'my-group/my-project'
    file: 'test1.yml'
  - project: 'my-group/my-project'
    file: 'test2.yml'

test1.yml defines:
stages:
  - test_stage1

test2.yml defines:
stages:
  - test_stage2

The resulting error is that the job in test1.yml (which belongs to test_stage1) becomes invalid because the global stages list was overwritten by test2.yml, which only contains test_stage2.

To resolve this, declare the complete list of stages in the primary .gitlab-ci.yml file. Because definitions in the main file take precedence over included ones, that top-level list is the one GitLab uses, and jobs from the included files attach to its stages instead of depending on whichever stages block happened to be merged last.
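Applied to the failing example above, the fix could be sketched as follows, reusing the same project path, file names, and stage names:

```yaml
# File: .gitlab-ci.yml (main file)
# Declare every stage used by any included file, in execution order.
stages:
  - test_stage1
  - test_stage2

include:
  - project: 'my-group/my-project'
    file: 'test1.yml'
  - project: 'my-group/my-project'
    file: 'test2.yml'
```

With both stages listed at the top level, the job from test1.yml refers to a stage that exists in the final configuration.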

Comparative Summary of Include Types

| Include Type | Source Location      | Best Use Case                                     | Syntax Example                        |
|--------------|----------------------|---------------------------------------------------|---------------------------------------|
| Local        | Same repository      | Project-specific modularization                   | local: 'path/to/file.yml'             |
| Remote       | External URL (HTTPS) | Shared scripts across different GitLab instances  | remote: 'https://url.com/file.yml'    |
| Template     | GitLab internal      | Standardized official GitLab workflows            | template: Auto-DevOps.gitlab-ci.yml   |
| Project      | Other GitLab project | Centralized organizational CI/CD standards        | project: 'group/proj', file: 'ci.yml' |

Implementation Strategy for Pipeline Restructuring

To effectively transition from a monolithic pipeline to a modular one using multiple includes, the following technical steps are recommended.

First, identify common patterns across different jobs. If multiple jobs use the same before_script or retry logic, these should be moved into a defaults.gitlab-ci.yml file.

Second, group jobs by functional domain. For example, all testing jobs can be moved to tests.yml, and all deployment jobs to deploy.yml.

Third, utilize the include keyword in the main .gitlab-ci.yml to pull these files back together.

Example of a restructured main file:
include:
  - local: 'ci/defaults.yml'
  - local: 'ci/tests.yml'
  - local: 'ci/deploy.yml'
  - project: 'org/devops-templates'
    ref: 'v1.0'
    file: 'security-scan.yml'
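The first two steps can be sketched as the contents of the local files that the main file pulls in. The job names, scripts, and default values below are illustrative assumptions, not taken from a real project. Each document separated by `---` stands for a separate file.

```yaml
# File: ci/defaults.yml, settings shared by every job (values are illustrative)
default:
  before_script:
    - echo "preparing environment"
  retry: 2
---
# File: ci/tests.yml, all testing jobs (hypothetical job name and script)
unit-tests:
  stage: test
  script:
    - ./run-tests.sh
---
# File: ci/deploy.yml, all deployment jobs (hypothetical job name and script)
deploy-production:
  stage: deploy
  script:
    - ./deploy.sh
```

The sketch relies on GitLab's default stages, which include test and deploy, so no explicit stages block is needed here.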

By splitting the pipeline, the project benefits from:

  • Reduced file size: The main configuration becomes a high-level map of the pipeline rather than a thousand-line script.
  • Improved readability: New developers can easily find the deployment logic in deploy.yml without scrolling through test configurations.
  • Reusability: The security-scan.yml from the org/devops-templates project can be reused by hundreds of other projects, ensuring a consistent security posture.

Conclusion

The use of multiple includes in GitLab CI/CD transforms the pipeline configuration from a static script into a dynamic, programmable infrastructure. By leveraging local, remote, template, and project includes, organizations can implement a sophisticated hierarchy of configurations that balance global standards with local flexibility. While the system is powerful, it requires a deep understanding of the merge order and the potential for stages overrides to avoid pipeline failures. The transition from a single-file configuration to a modular array of includes is a prerequisite for any project aiming for professional DevOps maturity. Through the strategic use of project-level variables and the optimized file list syntax, teams can minimize repetition and maximize the agility of their software delivery lifecycle.

