Integrating GitLab CI with Kubernetes Orchestration

Integrating GitLab CI with Kubernetes changes how software is delivered, tested, and scaled. It turns the development lifecycle from a series of disconnected steps into an automated pipeline in which infrastructure is as programmable as application code. Kubernetes automates the deployment, scaling, and operation of application containers, which makes it a natural foundation for GitLab's continuous integration and continuous delivery (CI/CD) capabilities. With Kubernetes orchestration, organizations can allocate resources dynamically and scale up or down with real-time demand, which improves hardware utilization in production and reduces the disruption that usually accompanies feature rollouts.

GitLab and Kubernetes can work together in three distinct ways, which can be adopted independently or combined. First, GitLab can be used to deploy software directly to a Kubernetes cluster. Second, Kubernetes can manage the runners attached to a GitLab instance, so that the compute power available for CI jobs is elastic. Third, the entire GitLab application and its associated services can be hosted on a Kubernetes cluster. These modes mix freely: an Omnibus GitLab instance running on a traditional virtual machine can still use a Docker runner to deploy software to a Kubernetes cluster, which amounts to a hybrid approach to infrastructure management.

Architectural Integration Modes

The integration of GitLab and Kubernetes is not a one-size-fits-all implementation. Depending on the organizational needs, the integration can be structured in the following ways:

Deploying software from GitLab to Kubernetes: This involves using GitLab CI/CD pipelines to push containerized applications into a Kubernetes environment.
Managing Runners via Kubernetes: This utilizes the Kubernetes cluster to orchestrate the lifecycle of GitLab runners, allowing them to scale based on the number of pending jobs.
Running GitLab on Kubernetes: This involves deploying the GitLab core services (Web, API, Sidekiq, etc.) as pods within a cluster, ensuring high availability and easier maintenance of the GitLab instance itself.

These modes can be layered. For instance, a team might run their GitLab instance on a VM but use a Kubernetes-based runner to deploy their applications to a separate production Kubernetes cluster.

Kubernetes Agent for CI/CD Workflows

The GitLab agent for Kubernetes provides a secure, streamlined method for connecting GitLab CI/CD pipelines to a cluster. This connection is primarily managed through agent contexts, which are made available to the job through its KUBECONFIG. Users can select one of these contexts on the command line using:

kubectl config use-context <path/to/agent/project>:<agent-name>

In scenarios where certificate-based connections are already present, a manual configuration of a new kubectl context is required to ensure the agent connection takes precedence. This process involves defining specific variables and a before_script to authenticate the job.

The required variables for this configuration include:

KUBE_CONTEXT: The specific name assigned to the new context.
AGENT_ID: The numeric identifier of the agent.
K8S_PROXY_URL: The URL of the Kubernetes API proxy exposed by the agent server (KAS). For gitlab.com, this is https://kas.gitlab.com/k8s-proxy/. For Omnibus deployments, it follows the pattern https://<GITLAB_DOMAIN>/-/kubernetes-agent/k8s-proxy/.

To establish this connection, the following sequence of commands must be executed in the before_script:

```bash
kubectl config set-credentials agent:$AGENT_ID --token="ci:${AGENT_ID}:${CI_JOB_TOKEN}"
kubectl config set-cluster gitlab --server="${K8S_PROXY_URL}"
kubectl config set-context "$KUBE_CONTEXT" --cluster=gitlab --user="agent:${AGENT_ID}"
kubectl config use-context "$KUBE_CONTEXT"
```

For environments utilizing KAS with self-signed certificates, the Kubernetes client must be configured to trust the certificate authority (CA). This is achieved by setting a CI/CD variable named SSL_CERT_FILE containing the KAS certificate in PEM format.
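
As a sketch, the variables and before_script described above might be combined in a single .gitlab-ci.yml job like the one below. The job name deploy, the context name my-context, the agent ID 1234, and the final kubectl get pods step are illustrative placeholders, and the job is assumed to run in an image that already provides kubectl.

```yaml
deploy:
  variables:
    KUBE_CONTEXT: my-context   # placeholder name for the new context
    AGENT_ID: 1234             # placeholder; replace with the agent's numeric ID
    # Omnibus pattern; replace <GITLAB_DOMAIN> with the instance's domain
    K8S_PROXY_URL: https://<GITLAB_DOMAIN>/-/kubernetes-agent/k8s-proxy/
  before_script:
    # Authenticate as the CI job through the agent, then switch to the new context
    - kubectl config set-credentials agent:$AGENT_ID --token="ci:${AGENT_ID}:${CI_JOB_TOKEN}"
    - kubectl config set-cluster gitlab --server="${K8S_PROXY_URL}"
    - kubectl config set-context "$KUBE_CONTEXT" --cluster=gitlab --user="agent:${AGENT_ID}"
    - kubectl config use-context "$KUBE_CONTEXT"
  script:
    # Illustrative check that the connection works
    - kubectl get pods
```

Keeping the context setup in before_script means every command in script already talks to the cluster through the agent connection.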

Migrating to Kubernetes Runners

The transition from legacy Docker runners to Kubernetes-based runners requires a strategic approach to minimize downtime and ensure stability. A proven migration path involves a phased rollout and a fallback mechanism.

The migration sequence typically follows these stages:

Creation of a temporary tag: A specific tag, such as k8s-default, is created to allow a subset of users to test the new runners.
Opt-in phase: Users are randomly selected or allowed to opt-in to the new offering, allowing the infrastructure team to troubleshoot the executor and gather operational know-how.
Gradual acceptance of untagged jobs: The system begins accepting jobs that do not have specific tags, routing them to the Kubernetes runners.
Decommissioning legacy runners: Once stability is confirmed, the old Docker runners are removed.

During this transition, a fallback mechanism is critical. By maintaining the docker tag in .gitlab-ci.yml files, users can force their jobs to land on the legacy Docker runners if they encounter errors on the Kubernetes infrastructure. Only users with untagged jobs or the k8s-default tag are routed to the new system. This ensures that critical production pipelines are not interrupted by migration artifacts.
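
The routing described above comes down to the tags keyword in .gitlab-ci.yml. The following is a minimal sketch; the job names and the make build command are hypothetical, while the docker and k8s-default tags are the ones named in the text.

```yaml
# Fallback: this job stays on the legacy Docker runners.
build-legacy:
  tags:
    - docker
  script:
    - make build   # placeholder build command

# Opt-in: this job is routed to the new Kubernetes runners.
build-k8s:
  tags:
    - k8s-default
  script:
    - make build   # placeholder build command
```

Once the Kubernetes runners accept untagged jobs, removing the tags block entirely has the same effect as the k8s-default tag.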

One common pitfall during this migration is the "Ping" failure. In some configurations, ping is disabled by default on the Kubernetes runners, so jobs that depend on it fail unexpectedly even though they succeeded on the Docker runners.

Advanced Runner Configuration and Security

Implementing GitLab runners on Kubernetes requires meticulous attention to network security and secret management to prevent unauthorized access and data leaks.

Network Policy Enforcement

To restrict network access for CI job pods, a NetworkPolicy should be implemented. This prevents pods from initiating unauthorized inbound or outbound connections. A standard configuration for gitlab-runner-jobs in the gitlab-runner namespace is as follows:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: gitlab-runner-jobs
  namespace: gitlab-runner
spec:
  podSelector:
    matchLabels:
      app: gitlab-runner-job
  policyTypes:
    - Ingress
    - Egress
  ingress: []
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 5000
```

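This policy blocks all inbound traffic (ingress: []) and limits outbound traffic to DNS resolution (UDP/TCP port 53), HTTPS on TCP port 443 (which covers GitLab API access), and container registry access on TCP port 5000. Because the 443 and 5000 rules use the CIDR 0.0.0.0/0, they allow those ports to any destination, not only to GitLab or the registry; a narrower CIDR tightens the policy further.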
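
The policy only affects pods that carry the label app: gitlab-runner-job. One way to make job pods carry it is the Kubernetes executor's pod_labels setting. The sketch below assumes the runner is installed with the GitLab Runner Helm chart and configured through runners.config in values.yaml; the namespace and label come from the policy itself.

```yaml
# values.yaml (sketch; assumes the GitLab Runner Helm chart)
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        # Job pods are created in the namespace the NetworkPolicy lives in
        namespace = "gitlab-runner"
        # Label every job pod so the policy's podSelector matches it
        [runners.kubernetes.pod_labels]
          "app" = "gitlab-runner-job"
```

Without a matching label the policy selects no pods, and job traffic remains unrestricted.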

Secrets Management

Handling sensitive data within CI/CD pipelines requires moving away from environment variables toward volume-mounted secrets. This is configured within the values.yaml of the runner deployment.

The configuration should utilize the [[runners.kubernetes.volumes.secret]] section to mount secrets as read-only volumes. For example:

```toml
[[runners.kubernetes.volumes.secret]]
  # "name" is the name of the Kubernetes Secret to mount
  name = "gitlab-ci-secrets"
  mount_path = "/secrets"
  read_only = true
```
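
Inside a job, a secret mounted this way appears as one file per key under the mount path. A minimal sketch of a job reading it follows; the job name, the key name api-token, and the deploy.sh script are hypothetical.

```yaml
deploy:
  script:
    # Each key of the mounted Secret is exposed as a file under /secrets
    - export API_TOKEN="$(cat /secrets/api-token)"
    - ./deploy.sh   # placeholder for whatever consumes the token
```

Reading the value at the point of use keeps it out of the job's predefined variables, which is the motivation for mounting it as a volume in the first place.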

For more complex requirements, projected volumes can be used to aggregate multiple secrets into a single path. This is useful for providing both Docker and NPM configurations:

```toml
[[runners.kubernetes.volumes.projected]]
  name = "credentials"
  mount_path = "/credentials"

  [[runners.kubernetes.volumes.projected.sources.secret]]
    name = "docker-config"
    items = [ { key = "config.json", path = "docker/config.json" } ]

  [[runners.kubernetes.volumes.projected.sources.secret]]
    name = "npm-config"
    items = [ { key = ".npmrc", path = "npm/.npmrc" } ]
```

RBAC and Resource Management

Proper Role-Based Access Control (RBAC) is mandatory for GitLab CI to interact with a Kubernetes cluster. For instance, when integrating with Flux, specific ClusterRoleBinding resources must be created to grant the CI job necessary permissions.

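The following configuration defines two ClusterRoleBindings, ci-job-admin and ci-job-view, which bind the gitlab:ci_job group to the flux-edit-flux-system and flux-view-flux-system ClusterRoles respectively: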

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ci-job-admin
roleRef:
  name: flux-edit-flux-system
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
subjects:
  - name: gitlab:ci_job
    kind: Group
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ci-job-view
roleRef:
  name: flux-view-flux-system
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
subjects:
  - name: gitlab:ci_job
    kind: Group
```

Furthermore, for a repository to function correctly with these integrations, GitLab must be version 11.3 or higher, and the user must be bound to the admin (cluster-admin) ClusterRole.

Deployment and Resource Cleanup

In a typical testing environment, resources such as nginx.yaml are deployed via the pipeline. To clean up these resources, the user should delete the corresponding file (e.g., clusters/testing/nginx.yaml), and Flux will automatically handle the removal of related resources from the cluster.

Additionally, the container-registry-secret environment must be stopped. Stopping the environment triggers its on_stop job, which removes the secret from the cluster, ensuring that no orphaned credentials remain. This modular approach allows for scaling deployments across different projects, where OCI images are built in one project and retrieved by Flux from the correct registry in another.
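
The stop behaviour described above relies on GitLab's environment keywords. The following sketch shows how a job that creates the secret might be paired with an on_stop job that removes it. The job names, the secret name registry-auth, and the flux-system namespace are assumptions; the environment name is the one used in the text. The jobs are also assumed to run with kubectl already pointed at the cluster.

```yaml
apply-registry-secret:
  script:
    # Create a registry pull secret from GitLab's predefined registry variables
    - >-
      kubectl create secret docker-registry registry-auth
      --namespace flux-system
      --docker-server="${CI_REGISTRY}"
      --docker-username="${CI_REGISTRY_USER}"
      --docker-password="${CI_REGISTRY_PASSWORD}"
  environment:
    name: container-registry-secret
    on_stop: delete-registry-secret   # job to run when the environment is stopped

delete-registry-secret:
  script:
    # Remove the secret so no orphaned credentials remain in the cluster
    - kubectl delete secret registry-auth --namespace flux-system
  environment:
    name: container-registry-secret
    action: stop
  when: manual
```

CI_REGISTRY_PASSWORD is tied to the running job, so a real setup would likely substitute a longer-lived credential such as a deploy token.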

Summary of Technical Specifications

The following table summarizes the critical components of the GitLab CI and Kubernetes integration:

| Component | Specification/Requirement | Purpose |
| --- | --- | --- |
| GitLab Version | 11.3 or higher | Minimum version for K8s integration features |
| RBAC Role | `cluster-admin` | Required for full cluster management |
| Connection Method | KAS Proxy / Agent | Secure tunnel between CI and Cluster |
| Network Policy | Ingress: None / Egress: 53, 443, 5000 | Hardened pod security |
| Secret Storage | Projected Volumes | Secure mounting of `.npmrc` and `config.json` |
| Load Balancing | Simultaneous Cluster Execution | Distributed job processing across multiple clusters |

Analysis of Scaling and Performance

The use of Kubernetes for GitLab runners provides a significant advantage in terms of elasticity. When multiple clusters are running simultaneously, GitLab balances jobs between them, preventing any single cluster from becoming a bottleneck.

The "Scaling-up" capability is further enhanced by the existence of QA infrastructure. GitLab Service maintains a dedicated QA environment that can be registered to the production instance at any time. This allows the infrastructure to scale up instantaneously during emergency situations or peak demand periods, ensuring that the CI/CD pipeline remains responsive regardless of the load.

The integration of monitoring tools, such as the coreos/prometheus-operator ServiceMonitor, allows for the automatic monitoring of deployed applications. By integrating runner metrics into a broader observability stack, administrators can visualize the performance of their CI pipeline and identify bottlenecks in pod startup times or resource contention within the cluster.
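
As an illustration of the monitoring integration, a ServiceMonitor for the runner's metrics endpoint could look roughly like the sketch below. It assumes the Prometheus Operator CRDs are installed and that a Service exposing the runner's metrics already exists. The resource name, the app: gitlab-runner label, and the port name metrics are hypothetical and must match that Service.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: gitlab-runner          # hypothetical name
  namespace: gitlab-runner
spec:
  selector:
    matchLabels:
      app: gitlab-runner       # hypothetical label on the runner's metrics Service
  endpoints:
    - port: metrics            # hypothetical named port on that Service
      interval: 30s            # scrape every 30 seconds
```

Prometheus then discovers the runner automatically, and the scraped metrics can feed dashboards that track job queueing and pod startup times.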

Conclusion

The integration of GitLab CI and Kubernetes is a comprehensive architectural strategy that addresses the needs of modern, high-velocity software teams. By moving from static Docker runners to an elastic Kubernetes-based executor, organizations can achieve a level of scalability and resilience that was previously unattainable. The implementation of the GitLab Agent for Kubernetes provides a secure, identity-aware bridge that eliminates the need for exposing clusters to the public internet, while strict NetworkPolicies and projected volume secrets ensure that the security posture of the cluster is not compromised by the execution of CI jobs.

The migration process, as evidenced by the transition from k8s-default tags to full decommissioning of Docker runners, highlights the necessity of a phased approach. The ability to use fallback tags ensures that the transition to a more complex infrastructure does not result in a loss of productivity. Ultimately, the combination of Kubernetes' orchestration capabilities and GitLab's CI/CD pipeline transforms the cluster into a dynamic execution environment where resources are consumed only when needed and destroyed immediately upon completion, resulting in a highly efficient, cost-effective, and scalable software delivery engine.