Integrating GitLab Runners with Azure Virtual Machine Scale Sets

GitLab and Microsoft Azure complement each other in a software delivery setup: GitLab provides source code management and pipeline orchestration, while Azure supplies the compute, storage, and networking needed to run build and deployment pipelines. Integrating a GitLab runner, the software agent that picks up and executes jobs, into an Azure environment lets an organization move from static build servers to infrastructure that scales with demand. Teams can then add build capacity during peak commit periods instead of waiting on a fixed pool of machines, which reduces queue congestion and hardware bottlenecks.

Understanding the GitLab Runner Ecosystem

A GitLab runner is a critical piece of software that acts as the execution arm of the GitLab CI/CD pipeline. When a developer pushes code or manually triggers a pipeline, the GitLab instance identifies the required jobs and places them in a queue. The runner is the agent that polls this queue, claims the job, and executes the scripts defined in the .gitlab-ci.yml file.

There are three primary classifications of runners, each serving a distinct organizational need:

  • Shared runners: These are registered at the instance level and are available to all projects within a GitLab instance (on GitLab.com, GitLab itself provides them). Current GitLab documentation calls them instance runners. They are ideal for general tasks but may lack the specific environment configurations required for specialized builds.
  • Specific runners: These are assigned to individual projects rather than to the whole instance; current GitLab documentation calls them project runners. They are typically used when a project requires a unique environment or has strict security constraints.
  • Group runners: These are shared across multiple projects within a specific group, offering a middle ground between shared and specific runners for team-based resource management.

The choice of executor is also pivotal. For instance, the Kubernetes executor is highly effective for large-scale deployment tasks and CI/CD pipelines that require significant resource isolation, allowing jobs to be run in ephemeral pods. However, for those requiring deep integration with Azure's infrastructure, the Virtual Machine Scale Set (VMSS) approach provides a robust alternative for horizontal scaling.

Azure Infrastructure Requirements and Prerequisites

Before initiating the deployment of a GitLab runner on Azure, certain environmental and administrative prerequisites must be satisfied to ensure a seamless integration.

Required Tooling and Accounts

The deployment process relies on a combination of command-line interfaces and active accounts:

  • Azure CLI: This must be installed and properly configured to allow the deployment scripts to communicate with the Azure API.
  • Azure Developer CLI (azd): This tool is required for the streamlined deployment of the infrastructure samples.
  • GitLab Account: A valid account with a project already created where the runner will be registered.
  • Azure Subscription: A subscription with sufficient quota to provision the necessary Virtual Machines (VMs) and Scale Sets.
  • Basic Technical Knowledge: Familiarity with Azure networking, GitLab CI/CD concepts, and Infrastructure as Code (IaC) principles.
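
The tooling prerequisites above can be checked before starting. The following is a minimal sketch in POSIX shell, assuming that az and azd are the command names installed by the Azure CLI and the Azure Developer CLI, and that git is also needed for cloning the sample later. The function name missing_tools is illustrative only.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# az and azd come from the prerequisites list; git is an assumption
# (it is needed later to clone the sample repository).
missing=$(missing_tools az azd git)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

This only confirms that the commands exist. It does not confirm that the Azure CLI is logged in or that the subscription has enough quota.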

Registration Token Acquisition

The runner needs a token to authenticate with GitLab. This article calls it the registration token; in the current GitLab workflow, where the runner is created in the UI first, it is formally a runner authentication token. It is the shared secret between the Azure VM and the GitLab instance, so it should be handled like a password. To obtain this token, follow these steps:

  1. Navigate to the specific GitLab project (for example, https://gitlab.com/your-username/your-project).
  2. Access the "Settings" menu and select "CI/CD".
  3. Expand the "Runners" section.
  4. Click the "New project runner" button.
  5. Configure the runner settings, including the operating system (Linux or Windows) and optional tags such as azure or vmss.
  6. Decide whether to check "Run untagged jobs", which determines if the runner picks up jobs without specific tags.
  7. Click "Create runner" and securely copy the registration token, which typically starts with the prefix glrt-.
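
Before passing the token to a deployment, a quick format check can catch copy and paste mistakes. This is a minimal sketch that only tests for the glrt- prefix mentioned above; it cannot tell whether GitLab will accept the token. The function name and the GITLAB_RUNNER_TOKEN variable are hypothetical.

```shell
#!/bin/sh
# Return 0 when the value starts with "glrt-" and has at least one
# character after the prefix; return 1 otherwise.
looks_like_runner_token() {
  case "$1" in
    glrt-?*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Read the token from the environment rather than hard-coding it.
# GITLAB_RUNNER_TOKEN is a hypothetical variable name.
if looks_like_runner_token "${GITLAB_RUNNER_TOKEN:-}"; then
  echo "Token format looks plausible."
else
  echo "GITLAB_RUNNER_TOKEN is unset or lacks the glrt- prefix."
fi
```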

Architecture of the GitLab Runner VMSS Solution

The deployment using the Azure-Samples/Gitlab-Runner-VMSS approach utilizes a sophisticated "Manager and Worker" architecture to handle auto-scaling.

The Manager VM Role

The Manager VM serves as the brain of the operation. It runs the GitLab Runner software configured with the "instance" executor and the Azure autoscaler plugin. Its primary responsibilities include:

  • Registration: Utilizing the glrt- token to authenticate and register itself with the GitLab instance.
  • Job Polling: Continuously monitoring the GitLab API for any pending jobs assigned to it.
  • Auto-scaling Orchestration: When the manager detects queued jobs, it triggers the Azure autoscaler to spin up new instances within the Virtual Machine Scale Set.

The VMSS Worker Cycle

The Virtual Machine Scale Set (VMSS) provides the actual compute resources where the jobs are executed. The lifecycle of a job in this environment follows a specific flow:

  • Job Distribution: Once the VMSS instances are active, the manager distributes the pending jobs to the available workers.
  • Job Execution: The workers execute the build, test, or deploy scripts.
  • Scale Down: To optimize costs, the system monitors idle time. After the jobs complete and the specified idle time expires, the instances are automatically terminated.
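
The relationship between queued jobs and worker instances can be illustrated with simple arithmetic. The sketch below is a simplified model, not the autoscaler plugin's actual algorithm: it assumes the number of instances needed is the job count divided by the per-instance capacity, rounded up and capped at the maximum instance count. It ignores idle instances and scale-down timing, and assumes a capacity of at least 1.

```shell
#!/bin/sh
# desired_instances PENDING_JOBS CAPACITY_PER_INSTANCE MAX_INSTANCES
# Simplified model: ceiling(pending / capacity), capped at the maximum.
desired_instances() (
  pending=$1 capacity=$2 max=$3
  needed=$(( (pending + capacity - 1) / capacity ))  # ceiling division
  [ "$needed" -gt "$max" ] && needed=$max            # respect the cap
  echo "$needed"
)

desired_instances 3 1 10    # prints 3
desired_instances 25 1 10   # prints 10 (capped)
desired_instances 5 2 10    # prints 3
```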

Deployment Implementation Process

The deployment can be achieved through two primary methods: a direct manual installation on a single Azure VM or a managed deployment via the VMSS sample repository.

Method 1: Manual Installation on Azure VM

For simpler requirements or static resource needs, the runner can be installed directly on a Linux virtual machine.

  1. Repository Setup: Install the official GitLab runner repository using the following command:
    curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash
  2. Software Installation: Install the runner package via the package manager:
    sudo apt-get install gitlab-runner
  3. Registration: Initiate the registration process:
    sudo gitlab-runner register
  4. Configuration: Provide the GitLab instance URL and the registration token when prompted, along with the runner description and relevant tags.
  5. Execution: Start the runner service:
    sudo gitlab-runner start
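
For scripted setups, registration can also run without prompts. The following sketch wraps the non-interactive form of the register command in a function with a dry-run switch so the command can be reviewed before it is executed. The shell executor, the description, and the placeholder token are assumptions for illustration; the original steps do not name an executor.

```shell
#!/bin/sh
# register_runner GITLAB_URL TOKEN DESCRIPTION
# With DRY_RUN set to a non-empty value the command is printed, not run.
register_runner() {
  ${DRY_RUN:+echo} sudo gitlab-runner register \
    --non-interactive \
    --url "$1" \
    --token "$2" \
    --executor "shell" \
    --description "$3"
}

# Dry run with placeholder values; set DRY_RUN to an empty value and
# supply a real token to perform the registration.
DRY_RUN=1
register_runner "https://gitlab.com/" "glrt-REPLACE_ME" "azure-vm-runner"
```

Tags and the "Run untagged jobs" choice are configured when the runner is created in the GitLab UI, so they are not passed here.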

Method 2: Automated VMSS Deployment

For high-availability and scalable environments, the Azure-Samples repository is utilized.

  1. Clone the repository:
    git clone https://github.com/Azure-Samples/Gitlab-Runner-VMSS.git
  2. Navigate into the project directory:
    cd Gitlab-Runner-VMSS
  3. Execute the deployment via the Azure Developer CLI (azd) or the provided scripts, ensuring the gitlabToken is correctly supplied.
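
One possible shape of the azd step is sketched below. The azd subcommands used (auth login, env new, env set, up) are standard Azure Developer CLI commands, but how the sample expects the token to be supplied is an assumption: GITLAB_TOKEN is a hypothetical variable name, and the sample's README should be checked for the real one. The dry-run switch prints the commands instead of running them.

```shell
#!/bin/sh
# deploy_with_azd ENVIRONMENT_NAME TOKEN
# Run from inside the cloned Gitlab-Runner-VMSS directory.
# With DRY_RUN set to a non-empty value the commands are only printed.
deploy_with_azd() {
  env_name=$1
  token=$2
  run=${DRY_RUN:+echo}
  $run azd auth login &&
    $run azd env new "$env_name" &&
    # GITLAB_TOKEN is a hypothetical name for the gitlabToken input.
    $run azd env set GITLAB_TOKEN "$token" &&
    $run azd up
}

DRY_RUN=1
deploy_with_azd "my-runner-env" "glrt-REPLACE_ME"
```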

Configuration Parameters and Defaults

The VMSS solution supports a variety of parameters to customize the deployment according to the specific needs of the project.

Parameter            Description                                  Default       Required
-------------------  -------------------------------------------  ------------  --------
appName              Environment name used for resource naming    -             Yes
location             The Azure region for deployment              -             Yes
gitlabToken          The registration token from GitLab           -             Yes
runnerType           Operating system for the runner              Linux         No
vnetAddressSpace     Virtual network CIDR block                   10.0.0.0/16   No*
subnetAddressSpace   Subnet CIDR block                            10.0.1.0/24   No*
existingVnetId       ID of an existing Virtual Network            -             No
existingSubnetId     ID of an existing subnet                     -             No

*Note: vnetAddressSpace and subnetAddressSpace are required if a new virtual network is being created during deployment.
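
When custom address spaces are supplied, the subnet block has to fall inside the virtual network block. The sketch below checks this for IPv4 CIDR notation using plain shell arithmetic. It is an illustrative helper and not part of the sample; the function names are hypothetical, and it assumes well-formed input without leading zeros in the octets.

```shell
#!/bin/sh
# Convert a dotted IPv4 address to a single integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# cidr_contains OUTER INNER
# Return 0 when the INNER block lies entirely within the OUTER block.
cidr_contains() (
  outer_ip=${1%/*}; outer_len=${1#*/}
  inner_ip=${2%/*}; inner_len=${2#*/}
  # The inner prefix must be at least as long as the outer prefix.
  [ "$inner_len" -ge "$outer_len" ] || exit 1
  mask=$(( (4294967295 << (32 - outer_len)) & 4294967295 ))
  [ $(( $(ip_to_int "$outer_ip") & mask )) -eq $(( $(ip_to_int "$inner_ip") & mask )) ]
)

# Check the default values from the parameter table.
if cidr_contains "10.0.0.0/16" "10.0.1.0/24"; then
  echo "subnet fits inside the virtual network"
else
  echo "subnet is outside the virtual network"
fi
```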

Autoscaler Default Settings

The autoscaling behavior is governed by settings found in scripts/configure-manager-vm.sh, which can be tuned for performance or cost:

  • Max instances: 10 (The maximum number of VMs that can be created).
  • Idle count: 1 (The minimum number of instances kept running).
  • Idle time: 20 minutes (The duration a VM remains active after a job finishes before being terminated).
  • Capacity per instance: 1 job per VM.
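
These defaults correspond to autoscaler settings in the GitLab Runner config.toml file. The sketch below renders such a fragment from shell variables. The key names (capacity_per_instance, max_instances, idle_count, idle_time) are GitLab Runner autoscaler settings, but the variable names and the way the fragment is generated are assumptions; the sample's configure-manager-vm.sh may write the file differently.

```shell
#!/bin/sh
# Print an autoscaler fragment for config.toml, using the defaults
# listed above unless the corresponding variable is set.
render_autoscaler_settings() {
  cat <<EOF
[runners.autoscaler]
  capacity_per_instance = ${CAPACITY_PER_INSTANCE:-1}
  max_instances = ${MAX_INSTANCES:-10}

  [[runners.autoscaler.policy]]
    idle_count = ${IDLE_COUNT:-1}
    idle_time = "${IDLE_TIME:-20m0s}"
EOF
}

render_autoscaler_settings
```

Setting a variable such as MAX_INSTANCES=5 before calling the function changes the rendered value. A shorter idle time lowers cost at the expense of more frequent VM start-up delays.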

Validation and Monitoring

After deployment, it is essential to verify that the runner is correctly communicating with the GitLab instance.

Verification Steps in GitLab

  1. Navigate to the GitLab project.
  2. Go to Settings → CI/CD → Runners.
  3. Confirm that the runner is listed with a green "online" indicator.
  4. Ensure the description matches the pattern azure-vmss-runner-{environment-name}.
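
The same check can be scripted against the GitLab REST API, which lists a project's runners at /api/v4/projects/:id/runners. The sketch below is a rough illustration: GITLAB_URL, GITLAB_API_TOKEN, and the function names are hypothetical, and the JSON is matched with a plain text search rather than parsed, which assumes the compact output format (no spaces around the colon).

```shell
#!/bin/sh
# Build the runner description expected for an environment name.
expected_description() {
  printf 'azure-vmss-runner-%s\n' "$1"
}

# list_project_runners PROJECT_ID
# Needs a personal access token in GITLAB_API_TOKEN (hypothetical name).
list_project_runners() {
  curl --silent --fail \
    --header "PRIVATE-TOKEN: $GITLAB_API_TOKEN" \
    "${GITLAB_URL:-https://gitlab.com}/api/v4/projects/$1/runners"
}

# runner_listed ENVIRONMENT_NAME
# Reads the API response on stdin; returns 0 if the description appears.
runner_listed() {
  grep -qF "\"description\":\"$(expected_description "$1")\""
}

# Example with a canned response instead of a live API call.
sample='[{"id":1,"description":"azure-vmss-runner-dev","online":true}]'
if printf '%s' "$sample" | runner_listed dev; then
  echo "runner for environment dev is listed"
fi
```

Against a live instance, the two functions would be combined as list_project_runners followed by a pipe into runner_listed.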

Functional Testing

To confirm the runner is executing jobs correctly, create a .gitlab-ci.yml file with the following content:

    test-runner:
      script:
        - echo "Hello from Azure VMSS runner!"
        - uname -a

Commit and push this file, then navigate to CI/CD → Pipelines to verify the job execution.

Azure Portal Monitoring

Users should monitor the resource group named rg-{your-environment-name} in the Azure Portal to verify:

  • The Manager VM is in a "Running" state.
  • The VMSS is created (it may show 0 instances when no jobs are queued).
  • Network Security Groups (NSGs) are correctly configured.
  • The Activity Log contains no deployment errors.

To quickly open the resource group in the browser on macOS, the following command can be used (on Linux, replace open with xdg-open):
az group show --name rg-{your-environment-name} --query id -o tsv | xargs -I {} open "https://portal.azure.com/#@/resource{}"
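
The portal checks can also be run from the Azure CLI. The sketch below assumes the resource group follows the rg-{your-environment-name} pattern described above; the function names are hypothetical, and the queries select only a few fields for readability.

```shell
#!/bin/sh
# Build the resource group name for an environment name.
resource_group_name() {
  printf 'rg-%s\n' "$1"
}

# show_runner_resources ENVIRONMENT_NAME
# Requires a logged-in Azure CLI session.
show_runner_resources() {
  rg=$(resource_group_name "$1")
  # VMs with their power state (the Manager VM should be running).
  az vm list --resource-group "$rg" --show-details \
    --query "[].{name:name, state:powerState}" --output table
  # Scale sets with their current capacity (0 is normal when idle).
  az vmss list --resource-group "$rg" \
    --query "[].{name:name, capacity:sku.capacity}" --output table
  # Names of the Network Security Groups in the group.
  az network nsg list --resource-group "$rg" --query "[].name" --output tsv
}

echo "Resource group to inspect: $(resource_group_name my-runner-env)"
```

Calling show_runner_resources with the environment name runs the three queries in turn.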

Troubleshooting and Technical Recovery

If the runner does not appear online in GitLab, a systematic troubleshooting approach is required.

Manager VM Diagnostics

The first step is to access the Manager VM via SSH using the IP address found in the Azure Portal:
ssh azureuser@{manager-vm-ip}

Once connected, execute the following commands to verify the service status:

  • Verify registration:
    sudo gitlab-runner verify
  • Check systemd status:
    sudo systemctl status gitlab-runner
  • Inspect real-time logs:
    sudo journalctl -u gitlab-runner -f

Network and Infrastructure Analysis

Network connectivity issues can prevent the runner from polling the GitLab instance. Verify connectivity with:
curl -I https://gitlab.com
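
For use in a script, the same check can be reduced to an HTTP status code. The sketch below treats any 2xx or 3xx response as reachable, which is an assumption about what counts as healthy. The function names and the RUN_CHECK switch are hypothetical; curl reports 000 when no response is received.

```shell
#!/bin/sh
# Print the HTTP status code for a URL (000 when the request fails).
http_status() {
  curl --silent --output /dev/null --max-time 10 \
    --write-out '%{http_code}' "$1"
}

# Return 0 for 2xx and 3xx status codes, 1 for anything else.
status_is_reachable() {
  case "$1" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Only contact the network when RUN_CHECK=1 is set.
if [ "${RUN_CHECK:-0}" = "1" ]; then
  code=$(http_status "https://gitlab.com")
  if status_is_reachable "$code"; then
    echo "GitLab reachable (HTTP $code)"
  else
    echo "GitLab not reachable (HTTP $code)"
  fi
fi
```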

If the deployment itself failed, check the Azure Portal:
  1. Navigate to the resource group.
  2. Click on "Deployments" in the left-hand menu.
  3. Review the failed deployment logs for specific error messages.

Conclusion

Running GitLab runners on Azure, particularly with Virtual Machine Scale Sets, turns CI/CD build capacity from a fixed resource into an elastic one. A Manager VM with the "instance" executor and the Azure autoscaler plugin watches the GitLab job queue and scales worker instances up and down with demand, within the configured maximum instance count and idle settings. This relieves the queueing bottleneck common with fixed build servers and reduces spending on idle compute, although the Manager VM and any configured idle instances keep running. A manual installation on a single VM suits small projects or steady workloads, while the VMSS deployment suits teams that need build capacity to follow demand.

Sources

  1. Understanding GitLab Runner - Nikila Fernando
  2. GitLab Runner VMSS - Azure Samples
