Orchestrating GitLab CI/CD Pipelines on Amazon EC2 Infrastructure

Running GitLab CI/CD on Amazon Elastic Compute Cloud (EC2) combines version control, automated pipeline orchestration, and scalable cloud infrastructure. GitLab CI/CD automates continuous integration, delivery, and deployment so that code moves from the repository to a live production environment with minimal manual intervention. A working GitLab CI/CD pipeline has two essential components: the .gitlab-ci.yml file, which is the blueprint describing the pipeline's jobs, and GitLab Runner, the application that executes those jobs.

Provisioning GitLab Runners by hand is slow and error-prone. It requires setting up infrastructure, installing the software dependencies each workload needs, and configuring the runner to communicate with the GitLab instance. When an organization manages hundreds of pipelines across multiple environments, this manual approach becomes a bottleneck. Infrastructure-as-Code (IaC) addresses this by automating the deployment and administration of GitLab Runners on Amazon EC2. With the infrastructure defined in code, the entire runner architecture can be deployed via scripts, so every instance is repeatable and consistent. Organizational guardrails and best practices can be enforced directly in that code, and autoscaling can terminate unused resources to reduce operating costs.

Infrastructure Foundation and Network Topology

Establishing a secure and scalable environment for GitLab and its associated runners requires a structured approach to networking within Amazon Web Services. The primary step involves the creation of a Virtual Private Cloud (VPC), which provides a logically isolated section of the AWS Cloud.

To initiate this, a VPC is created with a specific name, such as gitlab-vpc, and assigned an IPv4 CIDR block, typically 10.0.0.0/16. This block defines the range of private IP addresses available within the network. Once the VPC is established, DNS resolution must be enabled via the VPC settings to ensure that internal service discovery and naming conventions function correctly across the infrastructure.

The network is then divided into subnets distributed across at least two Availability Zones (AZs). Spreading subnets across zones provides high availability: if one zone experiences an outage, the infrastructure remains operational in another. The architecture distinguishes between public and private subnets:

Public subnets are designed for resources that require direct internet access, such as load balancers or bastion hosts. They require a route table with a route to an Internet Gateway attached to the VPC.
Private subnets are utilized for sensitive components, such as the Gitaly Cluster, ensuring they are not directly reachable from the public internet, thereby reducing the attack surface.
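
The subnet layout described above can be sketched with Python's standard ipaddress module. This is only an illustration under stated assumptions: the /24 subnet size, the Availability Zone names, and the gitlab-public-… naming are hypothetical choices, not values prescribed by GitLab or AWS. Only the 10.0.0.0/16 block comes from the text.

```python
import ipaddress
from itertools import islice

VPC_CIDR = "10.0.0.0/16"              # VPC block named in the text
ZONES = ["us-east-1a", "us-east-1b"]  # hypothetical Availability Zone names
SUBNET_PREFIX = 24                    # assumed subnet size, not from the source


def plan_subnets(vpc_cidr=VPC_CIDR, zones=ZONES, prefix=SUBNET_PREFIX):
    """Return one public and one private subnet per Availability Zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # Take only as many blocks as needed: two per zone.
    blocks = islice(vpc.subnets(new_prefix=prefix), 2 * len(zones))
    plan = []
    for index, block in enumerate(blocks):
        tier = "public" if index % 2 == 0 else "private"
        plan.append({
            "name": f"gitlab-{tier}-{block.network_address}",
            "cidr": str(block),
            "az": zones[index // 2],
            "tier": tier,
        })
    return plan


if __name__ == "__main__":
    for subnet in plan_subnets():
        print(subnet)
```

The plan is plain data, so it can be reviewed or fed into whatever IaC tool creates the actual subnets.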

Advanced Component Orchestration: Gitaly and Storage

Gitaly serves as the high-level RPC access layer for Git repositories within the GitLab ecosystem. To overcome the limitations of standard storage and ensure high performance, a Gitaly Cluster (utilizing Praefect) is recommended. This service must be hosted on a separate EC2 instance located within one of the previously configured private subnets.

The deployment of a Gitaly instance involves the following specifications:

AMI Selection: The latest Ubuntu Server LTS (HVM) with SSD Volume Type is utilized, though users should always verify the latest supported OS version in the official GitLab documentation.
Instance Sizing: An m5.xlarge instance type is selected to provide the necessary compute and memory for repository operations.
Security Group Configuration: A dedicated security group, such as gitlab-gitaly-sec-group, must be created. This group requires a custom TCP rule allowing traffic on port 8075, which is the primary port for Gitaly communication.
Network Placement: The instance is placed in a private subnet (e.g., gitlab-private-10.0.1.0) with "Auto-assign Public IP" disabled to maintain strict internal isolation.
Access Management: A dedicated key pair, such as gitaly, is generated and its private key file (gitaly.pem) is saved for secure SSH access.
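
As a rough sketch of the security group rule in the list above, the function below builds an ingress rule for port 8075 as a plain dictionary. The dictionary follows the shape of an EC2 API IpPermissions entry. Restricting the source to a range inside the VPC is an assumption made here for illustration; the text only states that the port must be open.

```python
import ipaddress

GITALY_PORT = 8075        # Gitaly port named in the text
VPC_CIDR = "10.0.0.0/16"  # VPC block from the networking section


def gitaly_ingress_rule(source_cidr=VPC_CIDR, vpc_cidr=VPC_CIDR):
    """Build a TCP ingress rule for Gitaly, refusing sources outside the VPC."""
    source = ipaddress.ip_network(source_cidr)
    vpc = ipaddress.ip_network(vpc_cidr)
    if not source.subnet_of(vpc):
        # Assumed policy: Gitaly should only be reachable from inside the VPC.
        raise ValueError(f"{source} is not inside VPC range {vpc}")
    return {
        "IpProtocol": "tcp",
        "FromPort": GITALY_PORT,
        "ToPort": GITALY_PORT,
        "IpRanges": [{"CidrIp": str(source), "Description": "Gitaly RPC"}],
    }


if __name__ == "__main__":
    print(gitaly_ingress_rule("10.0.1.0/24"))
```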

GitLab Runner Deployment via Infrastructure-as-Code

Automating the GitLab Runner deployment on EC2 eliminates the inconsistency of manual setups. By utilizing IaC, the deployment process becomes a scriptable event, allowing for rapid scaling and version-controlled configuration changes.

The process begins with the creation of a custom Amazon Machine Image (AMI) from an instance that already has the required software installed. With that instance selected in the EC2 dashboard, choose "Create image" from the Actions menu and name the image GitLab-Source. This custom AMI captures the pre-configured software environment needed for the runners.

Once the AMI is ready, a Launch Template is configured to standardize the deployment of runners within an Auto Scaling Group:

Template Name: gitlab-launch-template.
AMI Selection: The GitLab-Source custom AMI.
Instance Type: A minimum of c5.2xlarge is recommended to handle the computational demands of pipeline workloads.
Key Pair: A new key pair named gitlab-launch-template is created, and its private key file (gitlab-launch-template.pem) is saved for administrative access.
Storage: The root volume is set to 8 GiB, which is sufficient given that primary data is not stored on the root volume.

To provide the runner with the necessary permissions to interact with other AWS services, an IAM role is created. This involves selecting the EC2 use case and attaching a specific policy, such as gl-s3-policy. The resulting role, GitLabS3Access, is then associated with the launch template. For enhanced security, GitLab supports AWS Instance Metadata Service Version 2 (IMDSv2). The system is designed to automatically utilize IMDSv2 when available, falling back to IMDSv1 only if necessary, which allows administrators to safely require IMDSv2 on all EC2 instances.
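
The launch template settings and the IMDSv2 requirement above could be expressed as a CloudFormation resource, sketched here as a Python dictionary. Several details are assumptions for illustration: the AMI ID is a placeholder, /dev/sda1 is assumed to be the root device name, and the instance profile is assumed to share the name of the GitLabS3Access role.

```python
import json


def launch_template_resource(image_id):
    """Return an AWS::EC2::LaunchTemplate resource for the runner instances."""
    return {
        "Type": "AWS::EC2::LaunchTemplate",
        "Properties": {
            "LaunchTemplateName": "gitlab-launch-template",
            "LaunchTemplateData": {
                "ImageId": image_id,                  # ID of the GitLab-Source AMI
                "InstanceType": "c5.2xlarge",
                "KeyName": "gitlab-launch-template",
                # Assumes the instance profile is named after the IAM role.
                "IamInstanceProfile": {"Name": "GitLabS3Access"},
                "BlockDeviceMappings": [
                    # Assumed root device name; 8 GiB root volume from the text.
                    {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": 8}}
                ],
                # "required" rejects IMDSv1 requests, enforcing IMDSv2.
                "MetadataOptions": {"HttpTokens": "required"},
            },
        },
    }


if __name__ == "__main__":
    # Placeholder AMI ID; substitute the real ID of the custom image.
    resource = launch_template_resource("ami-0123456789abcdef0")
    print(json.dumps({"Resources": {"RunnerLaunchTemplate": resource}}, indent=2))
```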

Implementing the CI/CD Pipeline for EC2 Deployment

Deploying an application to EC2 via GitLab CI/CD requires a strategic approach to the pipeline definition. GitLab provides a specific template, AWS/CF-Provision-and-Deploy-EC2, to facilitate this process.

The operational flow of this template involves three primary stages:

Infrastructure Provisioning: The pipeline utilizes the AWS CloudFormation API to create the necessary stack based on defined JSON objects.
Artifact Management: Upon a successful build, the pipeline creates an artifact and pushes it to a designated AWS S3 bucket.
Application Deployment: The content is then deployed from the S3 bucket onto the target AWS EC2 instance.

To configure this template, users must provide specific JSON objects. For the S3 push configuration, the JSON must include:

```json
{
  "applicationName": "string",
  "source": "string",
  "s3Location": "s3://your/bucket/project_built_file...]"
}
```

In this context, the source attribute specifies the location where the build job generated the application files.
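
One way to produce this object from a build script is sketched below. The function name, the s3:// prefix check, and the example values (my-app, the build path, and the bucket name) are hypothetical; only the three key names come from the configuration described above.

```python
import json


def s3_push_config(application_name, source, s3_location):
    """Build the S3 push object with the three keys the template expects."""
    if not s3_location.startswith("s3://"):
        raise ValueError("s3Location must be an s3:// URI")
    return {
        "applicationName": application_name,
        "source": source,            # where the build job wrote its output
        "s3Location": s3_location,   # destination object in the bucket
    }


if __name__ == "__main__":
    config = s3_push_config(
        "my-app",                                # hypothetical application name
        "./build/my-app.zip",                    # hypothetical build output
        "s3://my-bucket/my-app/my-app.zip",      # hypothetical bucket and key
    )
    print(json.dumps(config, indent=2))
```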

Template inclusion rules also matter. For ECS deployments, the AWS/Deploy-ECS.gitlab-ci.yml template includes two sub-templates: Jobs/Build.gitlab-ci.yml and Jobs/Deploy/ECS.gitlab-ci.yml. These sub-templates are designed for use only through the main template and must not be included on their own. Users should also avoid overriding the job names defined in these templates, because the names are subject to change, at which point the override stops working.
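
A small pre-merge check can catch direct inclusion of those sub-templates. The sketch below assumes the template names have already been extracted from the include section of .gitlab-ci.yml; the function name is hypothetical and this is not a GitLab feature.

```python
# Sub-templates that the text says must only be pulled in by the main template.
INTERNAL_SUBTEMPLATES = {
    "Jobs/Build.gitlab-ci.yml",
    "Jobs/Deploy/ECS.gitlab-ci.yml",
}


def forbidden_includes(template_names):
    """Return the included templates that should not be included directly."""
    return sorted(set(template_names) & INTERNAL_SUBTEMPLATES)


if __name__ == "__main__":
    includes = ["AWS/Deploy-ECS.gitlab-ci.yml", "Jobs/Build.gitlab-ci.yml"]
    print(forbidden_includes(includes))
```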

In ECS-based deployments, the pipeline waits for the rollout to complete. To disable this behavior, set the CI/CD variable CI_AWS_ECS_WAIT_FOR_ROLLOUT_COMPLETE_DISABLED to a non-empty value.

Senior-Level Deployment Strategy and Troubleshooting

A production-grade deployment to EC2 requires more than running a script: it must be idempotent and safe. A senior-level approach to deploying via GitLab CI/CD involves a multi-stage strategy:

Docker Integration: The process starts with the creation of a Dockerfile to containerize the application.
Pipeline Architecture: A multi-stage pipeline is configured to build the image and push it to a secure registry.
Secret Management: SSH keys and host information are stored as CI/CD variables within GitLab to avoid hardcoding sensitive data.
Deployment Execution: SSH is used in the deploy stage to trigger the update on the EC2 instance.
Safe Replacement: The deployment process must handle container replacement safely to avoid downtime.

Conflicts can arise during deployment, a common one being a port conflict with Nginx. For example, if an Nginx service on the host already occupies the port the new deployment needs, a command such as systemctl stop nginx may be required to free it. The deployment must also be idempotent, meaning that running the deployment script multiple times results in the same state without causing errors.
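
To make the idempotency requirement concrete, the sketch below renders a shell script that a deploy job could run on the instance over SSH. The container name, port mapping, and image reference are hypothetical. The script removes any existing container before starting the new one, so running it repeatedly is intended to end in the same state.

```python
import shlex


def render_deploy_script(image, container="app", port_mapping="80:8080"):
    """Render a remote shell script that replaces a running container."""
    img = shlex.quote(image)
    name = shlex.quote(container)
    ports = shlex.quote(port_mapping)
    lines = [
        "set -eu",                                        # stop on the first real failure
        f"docker pull {img}",                             # fetch the image before touching the old container
        f"docker rm -f {name} >/dev/null 2>&1 || true",   # tolerate a missing container
        f"docker run -d --name {name} --restart unless-stopped -p {ports} {img}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical image reference pushed by the build stage.
    print(render_deploy_script("registry.example.com/group/app:1.2.3"))
```

Pulling the image first keeps the old container serving if the pull fails. There is still a short gap between removing the old container and starting the new one, so this simple replacement alone does not guarantee zero downtime.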

For troubleshooting the GitLab Runner on EC2, administrators should connect to the instance and examine the CloudFormation log files located at /var/log/cfn-*.log. These logs provide critical insights into why a stack may have failed to provision or why a runner is not registering correctly.
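
A quick way to scan those logs is sketched below. The keyword list is an assumption about what a failure line looks like, not a documented log format, and reading /var/log may require elevated privileges on the instance.

```python
import glob
import os


def find_error_lines(log_dir="/var/log", pattern="cfn-*.log",
                     keywords=("error", "fail")):
    """Return (path, line number, text) for log lines containing a keyword."""
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                lowered = line.lower()
                if any(keyword in lowered for keyword in keywords):
                    hits.append((path, number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for path, number, text in find_error_lines():
        print(f"{path}:{number}: {text}")
```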

Summary of Technical Specifications

The following table outlines the critical technical requirements and configurations for the GitLab and Gitaly EC2 environment.

Component	Requirement/Value	Purpose
VPC CIDR Block	`10.0.0.0/16`	Private network addressing
Gitaly Instance Type	`m5.xlarge`	Repository RPC access
Gitaly Port	`8075`	Internal Gitaly communication
Runner Instance Type	`c5.2xlarge`	Pipeline workload execution
Runner Root Volume	8 GiB	OS and system binaries
AMI Name	`GitLab-Source`	Standardized runner image
IAM Role	`GitLabS3Access`	S3 bucket interaction
IMDS Version	Version 2	Enhanced instance security

Lifecycle Management and Resource Cleanup

The ultimate benefit of using an IaC-based approach for GitLab Runners is the ease of lifecycle management. Because the infrastructure is defined as a CloudFormation stack, updating the runners is as simple as updating the template and re-running the script.

To maintain cost efficiency, autoscaling is implemented to terminate runners when they are not in use, so the organization pays only for the compute power it consumes during active pipeline runs. When the entire environment is no longer needed, cleanup is straightforward: deleting the CloudFormation stack removes all associated resources, including EC2 instances, security groups, and network interfaces, which prevents future charges.

Conclusion

The deployment of GitLab CI/CD on Amazon EC2 is a comprehensive exercise in cloud engineering that balances flexibility with strict control. By combining custom AMIs, Launch Templates, and IAM roles, organizations can move away from fragile manual installations toward a robust, scalable architecture. Placing Gitaly in private subnets and adopting IMDSv2 underscore a commitment to security and high performance.

The transition to an IaC model not only reduces the time required to deploy runners but also ensures that the environment is consistent across different AWS accounts. Predefined templates like AWS/CF-Provision-and-Deploy-EC2 provide a streamlined path from build to deployment, provided that the rules regarding template inclusion and job names are followed. Ultimately, the integration of these tools creates a professional-grade pipeline capable of handling enterprise-scale workloads while maintaining the agility to scale down for cost optimization.