Orchestrating Complex Pipelines via Multi-File GitLab CI Configurations

The architecture of a modern Continuous Integration and Continuous Deployment (CI/CD) pipeline often begins as a simple, monolithic .gitlab-ci.yml file. However, as projects evolve, the complexity of the automation logic grows, leading to files that span hundreds or even thousands of lines. This growth often results in a loss of structural clarity, making the pipeline difficult to maintain, audit, and scale. The solution to this systemic complexity is the transition from a single-file configuration to a multi-file architecture. By leveraging the include keyword and strategic project settings, engineers can decompose their pipelines into modular, reusable, and instance-specific components. This approach not only improves readability but also enables the reuse of common pipeline definitions across multiple projects and different GitLab instances, ensuring a consistent standard of quality across an entire organization's software delivery lifecycle.

Managing Pipelines Across Multiple GitLab Instances

When organizations employ repository mirroring, they frequently encounter a scenario where the same codebase is hosted across multiple GitLab instances (e.g., a public GitLab.com instance and several private, self-managed instances). In these environments, the pipeline requirements often diverge significantly. A primary catalyst for this divergence is the use of GitLab Runners. Because it is highly unlikely that two separate instances use the same runner tags, a single, unconditional configuration file rarely works unchanged on every instance. This is further complicated by the fact that, as of April 2021, GitLab does not support conditional runner tags (such as runner1 OR runner2).

To resolve these discrepancies, engineers can adopt two primary strategies: the Single YAML File approach and the Multiple YAML File approach.

The Single YAML File Strategy

It is possible to maintain a single .gitlab-ci.yml file that dynamically adapts its behavior based on the instance where the pipeline is triggered. This is achieved by utilizing the CI_SERVER_HOST predefined variable in conjunction with GitLab's rules syntax.

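The logic relies on the fact that GitLab evaluates rules in order and stops at the first match. To implement this, developers define a rule that explicitly excludes a job, using when: never, if CI_SERVER_HOST does not match the expected server address. Because a job whose rules match nothing is not added to the pipeline at all, the exclusion rule should be followed by a catch-all rule (for example, when: on_success) so that the job still runs on the intended target and is skipped everywhere else.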

The following configuration demonstrates how to isolate jobs for different sites:

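```yaml
# Runs on every instance.
global-job:
  script:
    - make test

# Hidden template: restricts a job to Site A.
.rules-siteA:
  rules:
    - if: '$CI_SERVER_HOST != "gitlab.siteA.example.com"'
      when: never
    # Catch-all: without a matching rule the job would never be added.
    - when: on_success

siteA-job:
  extends: .rules-siteA
  script:
    - make test-siteA

# Hidden template: restricts a job to Site B.
.rules-siteB:
  rules:
    - if: '$CI_SERVER_HOST != "gitlab.siteB.example.com"'
      when: never
    - when: on_success

siteB-job:
  extends: .rules-siteB
  script:
    - make test-siteB
```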

In this architecture, the .rules-siteA and .rules-siteB blocks act as hidden templates (indicated by the leading dot) that are extended by the actual jobs. This ensures that siteA-job only executes on the Site A server and siteB-job only executes on the Site B server.
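Since divergent runner tags are the main reason for splitting jobs per instance, the same hidden template can also carry the tags. The following is a minimal sketch, not a definitive setup: the tag name `siteA-docker` is a hypothetical placeholder, and the trailing `when: on_success` rule assumes the job should run normally once the host check passes.

```yaml
.rules-siteA:
  rules:
    # Skip the job on every instance except Site A.
    - if: '$CI_SERVER_HOST != "gitlab.siteA.example.com"'
      when: never
    # Otherwise add the job to the pipeline as usual.
    - when: on_success
  tags:
    - siteA-docker   # hypothetical runner tag registered on Site A

siteA-job:
  extends: .rules-siteA
  script:
    - make test-siteA
```

Keeping the tags in the hidden template means every job that extends it inherits both the host restriction and the matching runner selection.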

The Multiple YAML File Strategy

When the differences between instances are too vast for a single file to manage, the most supportable method is to create completely separate pipeline files for each GitLab instance. This eliminates the need for complex conditional logic and prevents the .gitlab-ci.yml file from becoming an unmanageable "spaghetti" of rules.

GitLab allows the CI configuration path to be customized on a per-project basis. This can be configured by navigating to Settings -> CI/CD -> General pipelines and defining a Custom CI configuration path.

A typical project structure for this approach would look like this:

  • .gitlab-ci.yml (Used for GitLab.com)
  • siteA.gitlab-ci.yml (Used for Site A's instance)
  • siteB.gitlab-ci.yml (Used for Site B's instance)

By assigning siteA.gitlab-ci.yml as the configuration path in the Site A instance settings, that specific instance will ignore the default file and execute the logic tailored to its specific environment and runner tags.
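A site-specific file of this kind might look like the sketch below. The runner tag `siteA-runner` is a hypothetical placeholder; the job reuses the `make test-siteA` command from the earlier example.

```yaml
# siteA.gitlab-ci.yml (illustrative contents)
default:
  tags:
    - siteA-runner   # hypothetical tag; applied to every job in this file

siteA-job:
  script:
    - make test-siteA
```

Because the file is only ever read by the Site A instance, it needs no CI_SERVER_HOST checks at all.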

The Mechanics of the Include Keyword

The include keyword is the fundamental building block for modular CI/CD. It allows a main configuration file to import external YAML files, which can then be merged into the final pipeline configuration. This functionality is available across all tiers, including Free, Premium, and Ultimate, and is supported on GitLab.com, GitLab Self-Managed, and GitLab Dedicated.

Syntax for Single File Inclusion

For simple modularity, a single configuration file can be included using two different syntax styles.

The same-line syntax:

```yaml
include: 'my-config.yml'
```

The array-style syntax:

```yaml
include:
  - 'my-config.yml'
```

Advanced Array-Based Inclusions

When a pipeline requires multiple external configurations, an array is used. This allows the mixing of different include types, such as remote files, local files, and predefined templates.

The following example demonstrates a hybrid inclusion strategy:

```yaml
include:
  - 'https://gitlab.com/awesome-project/raw/main/.before-script-template.yml'
  - 'templates/.after-script-template.yml'
  - template: Auto-DevOps.gitlab-ci.yml
  - project: 'my-group/my-project'
    ref: main
    file: 'templates/.gitlab-ci-template.yml'
```

The impact of this flexibility is significant. By using the project keyword, teams can centralize their CI logic in a dedicated "DevOps-Templates" repository. This means that if a company updates a security scanning script in one central project, every project that includes that file from a branch ref (such as main) will pick up the update on its next pipeline run, without requiring manual changes to each individual repository.
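Automatic propagation is a trade-off: a change in the central project reaches every consumer at once. As a hedged sketch, a consuming project can instead pin `ref` to a tag (or a commit SHA) and upgrade deliberately. The project path, tag, and file name below are hypothetical.

```yaml
include:
  - project: 'my-group/devops-templates'        # hypothetical central templates project
    ref: v1.2.0                                 # hypothetical tag; a branch name or commit SHA also works
    file: '/templates/security-scan.gitlab-ci.yml'   # hypothetical file path
```

Tracking a branch favors fast rollout of fixes; pinning a tag favors reproducible pipelines.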

Specific Include Types

GitLab provides four primary ways to include files, each serving a different operational need:

  • Local: This targets files within the same repository. It is ideal for splitting a large pipeline into logical stages.
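  • Remote: This imports a file from a full URL. The file must be reachable through an unauthenticated HTTP/HTTPS GET request, because include:remote does not support authentication. This is useful for sharing configurations across different organizations.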
  • Template: This imports a template provided by GitLab, such as Auto-DevOps.gitlab-ci.yml.
  • Project: This allows the inclusion of a file from a different project on the same GitLab instance.

Structural Decomposition and Pipeline Organization

As pipelines grow to several hundred lines, they often lose their conceptual structure. A best-practice approach is to split the pipeline by stage. This prevents the main .gitlab-ci.yml from becoming a bottleneck and allows developers to locate specific job definitions quickly.

Stage-Based Splitting

A highly organized pipeline can be divided into files corresponding to the logical flow of the software delivery process. For example:

  • .gitlab/ci/lint.gitlab-ci.yml: Contains all jobs related to the lint stage.
  • .gitlab/ci/test.gitlab-ci.yml: Contains all jobs related to the test stage.
  • .gitlab/ci/run.gitlab-ci.yml: Contains all jobs related to the run stage.
  • .gitlab/ci/deploy.gitlab-ci.yml: Contains all jobs related to the deploy stage.

These files are then aggregated in the main .gitlab-ci.yml using the include:local sub-key:

```yaml
include:
  - local: '.gitlab/ci/lint.gitlab-ci.yml'
  - local: '.gitlab/ci/test.gitlab-ci.yml'
  - local: '.gitlab/ci/run.gitlab-ci.yml'
  - local: '.gitlab/ci/deploy.gitlab-ci.yml'
```

This separation creates a clean entry point for the pipeline. A developer needing to modify a deployment script no longer needs to scroll through 500 lines of linting and testing code; they can navigate directly to deploy.gitlab-ci.yml.
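Each per-stage file then holds ordinary job definitions. The sketch below shows what the lint file could contain; the job name and the `make lint` command are hypothetical. It assumes the main .gitlab-ci.yml declares `lint` under `stages:`, since `lint` is not one of GitLab's default stages.

```yaml
# .gitlab/ci/lint.gitlab-ci.yml (illustrative contents)
lint-job:
  stage: lint        # assumes `lint` is listed under `stages:` in the main .gitlab-ci.yml
  script:
    - make lint      # hypothetical command
```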

Nested Includes and Configuration Overrides

GitLab supports nested includes, meaning a file included by the main .gitlab-ci.yml can itself include another file. This allows for the creation of a hierarchy of configurations, moving from general defaults to specific project requirements.

The Depth of Nesting

Consider a scenario where a configuration is nested three levels deep:

  1. .gitlab-ci.yml includes /.gitlab-ci/another-config.yml.
  2. /.gitlab-ci/another-config.yml includes /.gitlab-ci/config-defaults.yml.
  3. /.gitlab-ci/config-defaults.yml defines the default section.

Example of /.gitlab-ci/config-defaults.yml:

```yaml
default:
  after_script:
    - echo "Job complete."
```

This hierarchical structure allows a centralized team to define global after_script or before_script behaviors that are automatically propagated through all nested levels.
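The middle level of the chain described above could be sketched as follows. The job `example-job` is hypothetical and is there only to show a job that would inherit the `after_script` from the defaults file.

```yaml
# /.gitlab-ci/another-config.yml (middle level of the chain)
include:
  - local: '/.gitlab-ci/config-defaults.yml'

# Hypothetical job; it picks up `default:after_script` from config-defaults.yml.
example-job:
  script:
    - echo "Running job"
```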

Overriding and Precedence

When the same configuration file is included multiple times, or when an included file contains a default section that is also defined in the main file, GitLab applies a specific override logic. Values defined in the including file override those coming from the files it includes, and among the included files, the last time a configuration is included is the one that takes precedence.

For example, if you have a defaults.gitlab-ci.yml file:

```yaml
default:
  before_script: echo "Default before script"
```

And a unit-tests.gitlab-ci.yml file that includes those defaults but overrides them:

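```yaml
include:
  # defaults.gitlab-ci.yml lives in this repository, so it is included
  # with `local`; `template` only resolves GitLab-provided templates.
  - local: 'defaults.gitlab-ci.yml'

# Overrides the before_script inherited from defaults.gitlab-ci.yml.
default:
  before_script: echo "Unit test default override"

unit-test-job:
  script: unit-test.sh
```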

If the main .gitlab-ci.yml includes both unit-tests.gitlab-ci.yml and a second file, smoke-tests.gitlab-ci.yml, that sets the same default keys, the order of these includes determines the final state of the default section. The final configuration is a merge of all included files, where the last definition of a key overrides previous definitions.
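A minimal sketch of the ordering effect, assuming both files sit at the repository root and that smoke-tests.gitlab-ci.yml (whose contents are not shown in this article) also sets `default:before_script`:

```yaml
# .gitlab-ci.yml
include:
  - local: 'unit-tests.gitlab-ci.yml'
  - local: 'smoke-tests.gitlab-ci.yml'   # included last, so its `default` values win over unit-tests'

# Anything defined directly in this file overrides both included files, e.g.:
# default:
#   before_script: echo "Main file override"
```

Swapping the two include entries would make the unit-test override the effective one instead.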

GitLab CI/CD Templates and Reuse

GitLab provides a vast library of reference templates that can be utilized to accelerate pipeline development. These templates offer sophisticated, pre-tested configurations for a variety of languages and tools.

Implementation of Templates

There are two primary ways to utilize these templates:

  • As-is: Use the include:template sub-key to bring in a template that works out of the box.
  • Modified: Copy a template into a local YAML file, modify it to fit specific project needs, and include it via include:local.

While templates provide a significant head start, it is critical to be aware that GitLab-provided templates may change over time. If a project relies on a specific version of a template, creating a local copy is a safer strategy to ensure pipeline stability.
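The two approaches can be sketched side by side. The local path in the second option is a hypothetical location for the copied template; only one of the two `include` blocks would be active in a real file.

```yaml
# Option 1: use the GitLab-maintained template as-is.
include:
  - template: Auto-DevOps.gitlab-ci.yml

# Option 2 (alternative to option 1): include a frozen local copy instead.
# include:
#   - local: '.gitlab/ci/auto-devops-pinned.gitlab-ci.yml'   # hypothetical path to the copied template
```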

Summary Comparison of Configuration Methods

The following table provides a technical comparison of the different methods for managing multi-instance and multi-file GitLab CI configurations.

| Method | Primary Use Case | Implementation Mechanism | Complexity | Maintenance Effort |
|---|---|---|---|---|
| Single YAML | Mirrored repos with slight differences | CI_SERVER_HOST + rules | Medium | High (due to rule growth) |
| Custom Path | Vastly different instance requirements | Settings -> CI/CD -> General pipelines | Low | Low |
| Local Include | Pipeline organization and readability | include:local | Low | Very Low |
| Project Include | Cross-project standardization | include:project | Medium | Low (centralized updates) |
| Nested Include | Hierarchical configuration management | Multi-level include chains | High | Medium |

Conclusion

The transition from a monolithic .gitlab-ci.yml to a modular, multi-file architecture is a necessity for any professional DevOps operation. By utilizing the include keyword, teams can transform their CI/CD logic into a series of discrete, manageable components. Whether it is through the use of CI_SERVER_HOST to handle mirrored instances, the use of custom configuration paths for site-specific needs, or the implementation of stage-based local includes for better organization, the goal is the same: the reduction of cognitive load for the maintainer and the increase in reliability for the pipeline.

The ability to nest includes and override defaults provides a powerful mechanism for creating "gold images" of pipeline configurations that can be shared across an entire organization. This ensures that security scans, linting standards, and deployment gates are applied consistently, regardless of the project. Ultimately, the mastery of these advanced GitLab CI features allows an organization to scale its automation without sacrificing the maintainability of its codebase.
