GitLab Orchestration on Amazon EC2

Running GitLab on Amazon Elastic Compute Cloud (EC2) pairs a complete DevOps platform with a scalable infrastructure-as-a-service (IaaS) provider. Deploying GitLab on EC2 is not a single task but a spectrum of architectural choices, ranging from single-box installations for small teams to highly automated, autoscaling runner fleets for global enterprises. In both cases, AWS compute hosts the GitLab instance itself and executes the Continuous Integration and Continuous Deployment (CI/CD) pipeline jobs. Moving from manual setups to Infrastructure-as-Code (IaC) removes the time-consuming work of provisioning infrastructure, installing software, and configuring runners by hand, and replaces it with a repeatable, consistent deployment pattern.

Provisioning GitLab Instances on AWS

When an organization decides to host its own GitLab instance on AWS, the primary objective is often to balance speed of deployment with the level of control required over the underlying operating system and configuration. GitLab offers two primary pathways for this deployment, catering to different levels of technical requirement and licensing needs.

The first option is the Marketplace subscription. This path is designed for teams that require a rapid start with a high-tier license. GitLab provides a 5-user subscription via the AWS Marketplace, which allows teams to launch an Ultimate licensed instance almost immediately. The operational advantage here is the integration of billing; users can maintain continued AWS billing for their license. Furthermore, this model is flexible, as it can be upgraded to any other GitLab licensing tier through an AWS Marketplace Private Offer. A critical technical detail for administrators is that no migration is necessary when moving from the initial Marketplace subscription to a larger, non-time-based license from GitLab, and per-minute licensing is automatically removed upon the acceptance of a private offer.

The second option involves the use of official GitLab Amazon Machine Images (AMIs). These AMIs are produced during the regular release process of GitLab, ensuring that the software is optimized for the AWS environment. These images serve two distinct purposes:

  • They can be used for a standard single-instance GitLab installation.
  • They can be specialized for specific GitLab service roles, such as a Gitaly server, by modifying the /etc/gitlab/gitlab.rb configuration file.

From a technical perspective, the official AMIs are built upon the Amazon-prepared Ubuntu AMI and are available for both x86 and ARM architectures. This provides flexibility in choosing the EC2 instance type to optimize for cost or performance. One security-relevant detail of these official AMIs is the initial password: the password of the GitLab root user is set to the EC2 Instance ID. This behavior is specific to the AMIs published by GitLab and is not found in other image types.

Regarding licensing, these installations typically start as either the open-source Community Edition (CE) or the Free Enterprise Edition (EE). It is generally recommended to start with the Enterprise Edition because it provides the most seamless upgrade path to paid tiers. If a user starts with the Community Edition but later decides to subscribe to a Premium or Ultimate plan, a migration to the Enterprise Edition is mandatory.

Architectural Foundations and Network Configuration

A robust GitLab deployment on AWS requires a carefully planned network topology to ensure security, availability, and scalability. The recommended architecture moves beyond a simple instance and incorporates a variety of AWS services to handle different layers of the application stack.

The networking foundation begins with the creation of a Virtual Private Cloud (VPC). This provides a logically isolated section of the AWS Cloud that gives the administrator complete control over the network environment. For a standard GitLab deployment, the VPC is typically named gitlab-vpc and configured with an IPv4 CIDR block of 10.0.0.0/16. A critical step in this process is enabling DNS resolution within the VPC settings to ensure that the GitLab instance can resolve internal and external addresses.

Within this VPC, the architecture utilizes subnets distributed across at least two Availability Zones (AZs) to ensure high availability. This design includes:

  • Public subnets: These are required for components that must be accessible from the internet, such as the Network Load Balancer. These subnets must be associated with an Internet Gateway and have a corresponding Route Table to direct traffic.
  • Private subnets: These host the actual GitLab instances, database servers, and cache layers, shielding them from direct internet exposure.
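The subnet layout described above can be sketched with Python's standard `ipaddress` module. This is only an illustration: the /24 subnet size, the ordering of public and private blocks, and the Availability Zone names are assumptions, not values prescribed by the text.

```python
import ipaddress

# The VPC block from the text; the /24 subnet size below is an assumption.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")

# Hypothetical Availability Zone names; substitute the zones of your region.
AVAILABILITY_ZONES = ["us-east-1a", "us-east-1b"]


def plan_subnets(vpc, zones, prefix=24):
    """Assign one public and one private subnet to each Availability Zone."""
    blocks = vpc.subnets(new_prefix=prefix)  # consecutive blocks of the VPC
    plan = []
    for zone in zones:
        for tier in ("public", "private"):
            plan.append({"zone": zone, "tier": tier, "cidr": str(next(blocks))})
    return plan


if __name__ == "__main__":
    for subnet in plan_subnets(VPC_CIDR, AVAILABILITY_ZONES):
        print(subnet["zone"], subnet["tier"], subnet["cidr"])
```

With two zones this yields four non-overlapping subnets, so each zone has a public subnet for the load balancer and a private subnet for the GitLab instances.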

The comprehensive architecture integrates the following AWS services:

Service       Purpose in GitLab architecture                       Pricing model
-----------   --------------------------------------------------   ---------------------------------
EC2           Hosts the GitLab application on shared hardware      On-demand (or Reserved/Dedicated)
S3            Stores backups, LFS objects, and CI/CD artifacts     S3 Standard/Intelligent Tiering
NLB           Routes incoming network traffic to GitLab instances  Network Load Balancer hourly/data
RDS           Manages the PostgreSQL relational database           RDS Instance pricing
ElastiCache   Provides a Redis environment for in-memory caching   Cache node hourly rate

For security and identity management, an IAM EC2 instance role and profile are mandatory. Because GitLab utilizes Amazon S3 for object storage, the EC2 instances require read, write, and list permissions. By using an IAM Role (such as GitLabS3Access), administrators avoid the security risk of embedding static AWS access keys within the GitLab configuration files.
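As a rough sketch of the permissions described above, the following Python builds an IAM policy document that grants read, write, and list access to a single bucket. The bucket name is hypothetical, and the action list covers only the three permissions named in the text; a production policy may need further actions, such as deleting objects.

```python
import json

# Hypothetical bucket name; the text does not name the buckets.
BUCKET = "example-gitlab-artifacts"


def s3_access_policy(bucket):
    """Build an IAM policy document granting read, write, and list access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read and write individual objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # Listing applies to the bucket itself, not to its objects.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_access_policy(BUCKET), indent=2))
```

Note that object-level actions and the bucket-level list action use different resource ARNs, which is why the policy has two statements.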

GitLab Runner Automation on EC2

The GitLab Runner is the agent that executes the jobs defined in the .gitlab-ci.yml file. In an enterprise environment, manually installing runners is inefficient and prone to configuration drift. The remedy is to manage GitLab Runner deployment with an Infrastructure-as-Code (IaC) model.

By utilizing IaC, organizations can deploy the entire runner architecture through scripts, ensuring that every runner is configured identically. This approach allows for the enforcement of guardrails and best practices directly within the code. One of the most significant benefits of this automation is the implementation of autoscaling. This ensures that EC2 resources are only active when pipeline jobs are running and are terminated when idle, which directly reduces operational costs.
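The scale-to-zero idea can be illustrated with a small sizing function. This is a simplified model written for this article, not the algorithm GitLab Runner or AWS Auto Scaling actually uses; the function name, the jobs-per-instance ratio, and the instance cap are all hypothetical.

```python
import math


def desired_runner_instances(pending_jobs, jobs_per_instance=1, max_instances=10):
    """Return how many EC2 instances to keep running for the current queue."""
    if pending_jobs <= 0:
        return 0  # nothing queued: scale to zero so idle instances cost nothing
    needed = math.ceil(pending_jobs / jobs_per_instance)
    return min(needed, max_instances)  # the cap acts as a cost guardrail
```

For example, under these assumptions five queued jobs at two jobs per instance call for three instances, and an empty queue calls for none.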

Configuring the runner's identity and access starts with creating a dedicated IAM role: select EC2 as the use case, attach a previously created policy such as gl-s3-policy, and name the role (e.g., GitLabS3Access). The role is then referenced in a launch template so that every runner instance launched via autoscaling has the permissions it needs to interact with the project's S3 buckets.
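To make the role setup concrete, here is a minimal sketch of the two JSON structures involved: the trust policy that corresponds to choosing EC2 as the use case, and the part of a launch template that attaches the role. It assumes the instance profile has the same name as the role; the AMI ID and instance type in the usage line are placeholders.

```python
import json

ROLE_NAME = "GitLabS3Access"  # role name taken from the text

# Trust policy that lets EC2 instances assume the role.
EC2_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def launch_template_data(image_id, instance_type, profile_name=ROLE_NAME):
    """Build launch template data that attaches the instance profile."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        # Assumption: the instance profile is named after the role.
        "IamInstanceProfile": {"Name": profile_name},
    }


if __name__ == "__main__":
    # Placeholder AMI ID and instance type, for illustration only.
    data = launch_template_data("ami-0123456789abcdef0", "t3.medium")
    print(json.dumps(data, indent=2))
```

Because the role lives in the launch template rather than on individual machines, instances created by a scaling event inherit the same permissions without further configuration.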

Furthermore, GitLab supports the AWS Instance Metadata Service Version 2 (IMDSv2). This is a more secure method of retrieving instance metadata compared to Version 1. GitLab is designed to automatically use IMDSv2 when available, only falling back to IMDSv1 if necessary. Consequently, administrators can safely require IMDSv2 on all EC2 instances to enhance the security posture of the environment.
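The difference between the two versions is that IMDSv2 is session-based: a client first requests a token with a PUT call and then presents that token on every metadata read. The sketch below shows the two requests using only the Python standard library. It illustrates the protocol and is not GitLab's own implementation; the function names are made up for this example.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds=21600):
    """IMDSv2 step 1: a PUT request that asks for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token, path="meta-data/instance-id"):
    """IMDSv2 step 2: a GET request that presents the session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_instance_id(timeout=2):
    """Only works on an EC2 instance; elsewhere the request fails or times out."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```

When IMDSv2 is required on an instance, a plain GET without the token header is rejected, which is what blocks the simple request-forwarding attacks that IMDSv1 allowed.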

Deploying Applications to EC2 via GitLab CI/CD

GitLab provides integrated tooling to facilitate the deployment of applications from a CI/CD pipeline onto EC2 instances. This is achieved through a combination of specialized Docker images and pipeline templates.

Authentication and Connectivity

Before any deployment can occur, GitLab must be authenticated with AWS. The most common method involves the creation of an IAM user with the necessary permissions. From the AWS Security credentials menu, an Access Key ID and Secret Access Key are generated. These are then stored in the GitLab project under Settings > CI/CD as protected environment variables:

  • AWS_ACCESS_KEY_ID: The public identifier for the IAM user.
  • AWS_SECRET_ACCESS_KEY: The private key used for signing requests.
  • AWS_DEFAULT_REGION: The specific AWS region where the resources are located.
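A deployment job can fail in confusing ways when one of these variables is missing, for example when a protected variable is not exposed to an unprotected branch. A small pre-flight check, sketched here with a hypothetical helper name, makes that failure explicit:

```python
import os

REQUIRED_AWS_VARIABLES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_variables(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARIABLES if not env.get(name)]
```

A job script could call `missing_aws_variables()` as its first step and abort with the returned names if the list is not empty.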

For users seeking higher security, GitLab supports the use of ID tokens and OpenID Connect (OIDC). This method is superior to storing long-lived credentials in variables, although it requires a different configuration path than the standard variable-based guidance.
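On the AWS side, the OIDC approach relies on an IAM role whose trust policy accepts GitLab ID tokens instead of static keys. The sketch below builds such a trust policy. All argument values are placeholders, and the format of the `sub` condition reflects the subject claim of GitLab ID tokens as commonly documented; verify it against the tokens issued by your own instance before relying on it.

```python
import json


def oidc_trust_policy(account_id, gitlab_host, project_path, branch):
    """Build a trust policy for a role assumed through a GitLab ID token."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{gitlab_host}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    # Restrict the role to one project and one branch.
                    "StringEquals": {
                        f"{gitlab_host}:sub": (
                            f"project_path:{project_path}"
                            f":ref_type:branch:ref:{branch}"
                        )
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account, host, project, and branch.
    policy = oidc_trust_policy(
        "123456789012", "gitlab.example.com", "mygroup/myproject", "main"
    )
    print(json.dumps(policy, indent=2))
```

The condition is what keeps the role from being assumed by pipelines in other projects on the same GitLab instance.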

Utilizing the EC2 Deployment Template

GitLab offers a specific template called AWS/CF-Provision-and-Deploy-EC2 to automate the deployment process. When this template is utilized and the corresponding JSON objects are configured, the pipeline executes a three-stage process:

  1. Infrastructure Provisioning: The pipeline uses the AWS CloudFormation API to create the necessary stack.
  2. Artifact Management: During the build phase, the pipeline generates an artifact and pushes it to an AWS S3 bucket.
  3. Application Deployment: The content is then deployed from S3 onto the target AWS EC2 instance.

To implement this, the user must create two specific JSON configurations. The first is for the CloudFormation stack. The second is for the S3 push, which must include:

```json
{
  "applicationName": "string",
  "source": "string",
  "s3Location": "s3://your/bucket/project_built_file..."
}
```

In this configuration, the source field must precisely match the location where the previous build job stored the application.
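Since a mismatch between the source field and the build output is an easy mistake to make, the file can be generated by a script that checks the two against each other. The following is a sketch under assumptions: the helper names, file paths, and bucket in the usage note are hypothetical, and only the three field names come from the configuration above.

```python
import json
from pathlib import Path


def build_s3_push_config(application_name, source, s3_location):
    """Return the S3 push configuration with the three fields from the text."""
    return {
        "applicationName": application_name,
        "source": source,
        "s3Location": s3_location,
    }


def write_s3_push_config(config, build_output, destination):
    """Write the config only if `source` matches the build job's output path."""
    if Path(config["source"]) != Path(build_output):
        raise ValueError(
            f'source {config["source"]!r} does not match build output {build_output!r}'
        )
    Path(destination).write_text(json.dumps(config, indent=2) + "\n")
    return destination
```

A call such as `write_s3_push_config(build_s3_push_config("my-app", "build/app.zip", "s3://example-bucket/app.zip"), "build/app.zip", "s3_push.json")` would write the file, while a differing build path raises an error instead of producing a configuration that fails later in the pipeline.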

Advanced Deployment Patterns and ECS Integration

While EC2 is a primary target, GitLab also provides deep integration for Amazon Elastic Container Service (ECS). This is managed through the AWS/Deploy-ECS.gitlab-ci.yml template.

This ECS template is a composite that includes two other templates: Jobs/Build.gitlab-ci.yml and Jobs/Deploy/ECS.gitlab-ci.yml. A critical operational requirement is that these sub-templates must not be included individually; only the main AWS/Deploy-ECS.gitlab-ci.yml template should be referenced. The sub-templates are designed for internal use and may change or move unexpectedly. The job names within these templates are also subject to change, so they should not be overridden in a custom pipeline: the override stops working as soon as GitLab renames the job in the underlying template.

For those managing the rollout of ECS services, GitLab provides a mechanism to control the waiting behavior of the deployment. By default, the pipeline waits for the rollout to complete. To disable this behavior, set the variable CI_AWS_ECS_WAIT_FOR_ROLLOUT_COMPLETE_DISABLED to any non-empty value.
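The phrase "any non-empty value" has a consequence that is easy to miss: the variable is a presence flag, not a boolean. The following sketch models the rule as stated; it is not the template's actual implementation, and the function name is hypothetical.

```python
import os

VARIABLE = "CI_AWS_ECS_WAIT_FOR_ROLLOUT_COMPLETE_DISABLED"


def rollout_wait_disabled(env=None):
    """True when the variable is set to any non-empty string."""
    env = os.environ if env is None else env
    return env.get(VARIABLE, "") != ""
```

Under this rule, setting the variable to "false" or "0" still disables waiting; to keep the default behavior, the variable has to be left unset or empty.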

Detailed Analysis of Deployment Strategies

The choice between a single-box installation and a distributed architecture on AWS depends on the organization's scale and risk tolerance. A single-box installation using an AMI is ideal for rapid prototyping or very small teams, as it minimizes the complexity of managing multiple AWS services like RDS and ElastiCache. However, this creates a single point of failure.

In contrast, the recommended architecture employing a Network Load Balancer (NLB) and RDS ensures that the GitLab instance is highly available and the data is durable. The use of the NLB allows for the distribution of traffic across multiple EC2 instances, while RDS handles the database overhead, which is often the primary bottleneck in GitLab performance.

The shift toward using GitLab Runners on EC2 via IaC represents a transition from "pet" servers to "cattle." By automating the runner lifecycle, organizations can scale their compute power horizontally based on the current queue of jobs. This not only optimizes cost but also ensures that each job starts in a clean, consistent environment, eliminating the "it works on my machine" problem that often plagues manual runner configurations.

The security integration through IMDSv2 and IAM roles further matures the deployment. By eliminating the need for static keys on the EC2 instances and relying on temporary, role-based credentials, the attack surface is significantly reduced. This is particularly important for GitLab instances that handle sensitive source code and deployment keys.

