High availability for the Kubernetes control plane is a core requirement for production environments, even with a lightweight distribution like K3s. In a standard single-node K3s installation, the API server runs on a single machine, which makes that machine a single point of failure: if it crashes or goes down for maintenance, the cluster's management layer becomes unreachable. The architecture described here addresses this with multiple control plane nodes, an HAProxy load balancing layer, and a Keepalived virtual IP (VIP). Together they keep the Kubernetes API reachable when an individual control plane node or load balancer fails, providing a resilient foundation for containerized workloads.
Infrastructure Architecture and Resource Allocation
The foundation of a highly available K3s cluster requires a strategic distribution of virtualized resources to ensure that neither the control plane nor the load balancing layer becomes a bottleneck. A professional implementation typically leverages a hypervisor such as Proxmox Virtual Environment to manage the underlying virtual machines. The distribution of roles across these VMs allows for the isolation of the control plane, the worker nodes, and the ingress traffic management.
The following table delineates the precise resource specifications and network identifiers for a standard 7-node high availability deployment:
| VM Name | MAC Address | IP Address | Memory (GB) | Disk Size (GB) | CPU Cores | Role |
|---|---|---|---|---|---|---|
| k3s-server-01 | BC:24:11:66:6F:07 | 192.168.1.201 | 4 | 32 | 2 | Control Plane |
| k3s-server-02 | BC:24:11:02:57:C8 | 192.168.1.202 | 4 | 32 | 2 | Control Plane |
| k3s-server-03 | BC:24:11:4F:A3:86 | 192.168.1.203 | 4 | 32 | 2 | Control Plane |
| k3s-worker-01 | BC:24:11:12:B1:D2 | 192.168.1.211 | 8 | 32 | 4 | Worker Node |
| k3s-worker-02 | BC:24:11:07:BA:1A | 192.168.1.212 | 8 | 32 | 4 | Worker Node |
| k3s-lb-01 | BC:24:11:EA:6D:1F | 192.168.1.221 | 2 | 32 | 1 | Load Balancer (Master) |
| k3s-lb-02 | BC:24:11:6B:DC:F9 | 192.168.1.222 | 2 | 32 | 1 | Load Balancer (Backup) |
Three control plane nodes are needed to maintain quorum in the embedded etcd datastore. If a single node fails, the remaining two still form a majority, so the cluster keeps accepting writes and a split-brain scenario is avoided. The worker nodes are allocated more CPU and memory (8 GB of RAM and 4 cores) than the control plane nodes because they run the containerized applications. The load balancer nodes are kept lean, with only 2 GB of RAM and 1 CPU core, because their primary function is routing TCP traffic.
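The majority rule behind this sizing can be checked with a little arithmetic. The sketch below is illustrative only: quorum_size and fault_tolerance are hypothetical helper names, not part of K3s or etcd, and the model assumes every member has exactly one vote.

```bash
#!/bin/bash
# Quorum arithmetic for a cluster of N voting etcd members.

# Smallest number of members that forms a majority: floor(N / 2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Members that can fail while a majority still survives: N - quorum.
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for members in 1 2 3 4 5; do
    echo "members=$members quorum=$(quorum_size "$members") tolerated_failures=$(fault_tolerance "$members")"
done
```

For three members the loop reports a quorum of 2 and one tolerated failure. A fourth member raises the quorum to 3 without tolerating any additional failure, which is why odd member counts are preferred.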
Proxmox Virtual Environment Provisioning
The deployment process begins with the creation of a standardized template to ensure consistency across all nodes. Using a cloud-init image significantly reduces the time required to spin up new instances and ensures that all OS-level configurations are identical.
To create the initial template in Proxmox, the following shell commands are utilized:
```bash
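#!/bin/bash
# Create the base VM (ID 9000) with a VirtIO NIC on bridge vmbr0 and a VirtIO SCSI controller.
qm create 9000 --name ubuntu-cloud-init --memory 2048 --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci
# Import the Ubuntu cloud image as the VM's boot disk.
qm set 9000 --scsi0 local-lvm:0,import-from=/var/lib/vz/template/iso/noble-server-cloudimg-amd64.img
# Attach the cloud-init drive and boot from the imported disk.
qm set 9000 --ide1 local-lvm:cloudinit
qm set 9000 --boot order=scsi0
# Route the console to a serial port, as many cloud images expect.
qm set 9000 --serial0 socket --vga serial0
# Convert the VM into a reusable template.
qm template 9000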
```
Once the template is established, the specific nodes are cloned and resized to match the requirements defined in the resource table. For example, to create the first server node, the following sequence is executed:
```bash
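#!/bin/bash
# Clone the template into VM ID 100 and grow its disk to the size in the resource table.
qm clone 9000 100 --name k3s-server-01
qm resize 100 scsi0 32G
# Pin the MAC address listed for k3s-server-01 in the resource table.
qm set 100 --net0 virtio=BC:24:11:66:6F:07,bridge=vmbr0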
```
This methodical approach to provisioning ensures that the environment is reproducible and scalable. By using the qm toolset, administrators can rapidly expand the cluster by adding more worker nodes or rotating control plane nodes without manually configuring each VM from scratch.
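The same clone sequence can be generated for every VM in the resource table. The following is a minimal dry-run sketch that only prints the qm commands rather than running them. VM IDs 101 to 106 and the helper name print_vm_commands are assumptions made for illustration; names, MAC addresses, memory (in MiB), and core counts are taken from the table.

```bash
#!/bin/bash
# Dry run: print the qm commands for each VM instead of executing them.
# VM IDs 101-106 are hypothetical; the other values come from the resource table.

TEMPLATE_ID=9000

# Arguments: <vmid> <name> <mac> <memory_mib> <cores>
print_vm_commands() {
    echo "qm clone $TEMPLATE_ID $1 --name $2"
    echo "qm resize $1 scsi0 32G"
    echo "qm set $1 --memory $4 --cores $5 --net0 virtio=$3,bridge=vmbr0"
}

while read -r vmid name mac memory cores; do
    print_vm_commands "$vmid" "$name" "$mac" "$memory" "$cores"
done <<'EOF'
100 k3s-server-01 BC:24:11:66:6F:07 4096 2
101 k3s-server-02 BC:24:11:02:57:C8 4096 2
102 k3s-server-03 BC:24:11:4F:A3:86 4096 2
103 k3s-worker-01 BC:24:11:12:B1:D2 8192 4
104 k3s-worker-02 BC:24:11:07:BA:1A 8192 4
105 k3s-lb-01 BC:24:11:EA:6D:1F 2048 1
106 k3s-lb-02 BC:24:11:6B:DC:F9 2048 1
EOF
```

Reviewing the printed commands before running them on the Proxmox host keeps the step reversible; nothing is created until the output is executed deliberately.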
Implementing the HAProxy Load Balancing Layer
HAProxy serves as the critical traffic director for the K3s cluster. Its primary responsibility is to accept requests directed at the Kubernetes API server (typically on port 6443) and distribute them across the available control plane nodes. This prevents any single server from becoming a bottleneck and ensures that if one server fails, the API remains available.
The installation of the necessary software on the load balancer nodes (k3s-lb-01 and k3s-lb-02) is performed via the following command:
```bash
sudo apt-get install haproxy keepalived
```
HAProxy Configuration Deep Dive
The configuration of HAProxy is split into a frontend and a backend. The frontend defines how the load balancer listens for incoming traffic, while the backend defines where that traffic is sent.
The following configuration must be appended to /etc/haproxy/haproxy.cfg on both k3s-lb-01 and k3s-lb-02:
```haproxy
frontend k3s-frontend
    bind *:6443
    mode tcp
    option tcplog
    default_backend k3s-backend

backend k3s-backend
    mode tcp
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s
    server k3s-server-01 192.168.1.201:6443 check
    server k3s-server-02 192.168.1.202:6443 check
    server k3s-server-03 192.168.1.203:6443 check
```
The use of mode tcp is mandatory because the Kubernetes API communicates via HTTPS (TLS), and HAProxy must pass the encrypted traffic through without attempting to decrypt it at the load balancer level. The balance roundrobin algorithm ensures that requests are distributed evenly among the three control plane nodes. The check parameter enables health monitoring; if a server node fails the TCP check, HAProxy will automatically stop routing traffic to it until it returns to a healthy state.
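The interaction between round-robin selection and health checks can be modelled in a few lines. This is a deliberately simplified sketch, not HAProxy's actual scheduler (which also handles weights and connection state): the hypothetical pick_server function receives only the servers currently passing their checks and rotates through them.

```bash
#!/bin/bash
# Simplified model of round-robin selection over the servers that pass their health check.

# pick_server <request_number> <healthy_server>...
# Prints the server that would handle the given request (numbered from 0).
pick_server() {
    request=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "no healthy backend" >&2
        return 1
    fi
    # Skip ahead by the request number modulo the count of healthy servers.
    shift $(( request % $# ))
    echo "$1"
}

echo "All three control plane nodes healthy:"
for i in 0 1 2 3; do
    pick_server "$i" 192.168.1.201 192.168.1.202 192.168.1.203
done

echo "k3s-server-01 failing its check:"
for i in 0 1 2 3; do
    pick_server "$i" 192.168.1.202 192.168.1.203
done
```

In the second loop 192.168.1.201 never appears, which mirrors how a server that fails its check is taken out of rotation until it recovers.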
In alternative environments, such as those using OPNsense, the configuration follows a similar logic but is implemented via a GUI plugin. In such cases, a firewall alias (e.g., K3s_Control_Plane_Loadbalancer) is created for the Virtual IP, and a firewall rule is established to allow traffic from the worker nodes to that VIP on port 6443.
Ensuring Redundancy with Keepalived
While HAProxy provides load balancing across the K3s servers, the HAProxy instance itself could still be a single point of failure. Keepalived solves this by providing a Virtual IP (VIP) that floats between the two load balancer nodes.
Master Node Configuration (k3s-lb-01)
On the master load balancer, the /etc/keepalived/keepalived.conf file is configured to claim the VIP and monitor the health of the HAProxy process.
```keepalived
global_defs {
    enable_script_security
    script_user root
}

vrrp_script chk_haproxy {
    script 'killall -0 haproxy' # faster than pidof
    interval 2
}

vrrp_instance haproxy-vip {
    interface eth0
    state MASTER # MASTER on k3s-lb-01, BACKUP on k3s-lb-02
    priority 200 # 200 on k3s-lb-01, 100 on k3s-lb-02
    virtual_router_id 51
    virtual_ipaddress {
        192.168.1.220/24
    }
    track_script {
        chk_haproxy
    }
}
```
Backup Node Configuration (k3s-lb-02)
The backup node is configured similarly but with a lower priority and a state of BACKUP. This ensures that it only assumes the VIP if the master node fails.
```keepalived
global_defs {
    enable_script_security
    script_user root
}

vrrp_script chk_haproxy {
    script 'killall -0 haproxy' # faster than pidof
    interval 2
}

vrrp_instance haproxy-vip {
    interface eth0
    state BACKUP # MASTER on k3s-lb-01, BACKUP on k3s-lb-02
    priority 100 # 200 on k3s-lb-01, 100 on k3s-lb-02
    virtual_router_id 51
    virtual_ipaddress {
        192.168.1.220/24
    }
    track_script {
        chk_haproxy
    }
}
```
The vrrp_script chk_haproxy is a critical component. By executing killall -0 haproxy, Keepalived checks if the HAProxy process is still running. If the process crashes, the script fails, and Keepalived triggers a failover of the VIP (192.168.1.220) to the backup node. This ensures that the API gateway remains available even if the underlying load balancer software fails.
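The check relies on signal 0, which asks the kernel whether a process exists without delivering any signal to it. The sketch below demonstrates the same idea with kill -0 against a process ID instead of killall -0 against a process name. The helper name process_alive and the background sleep process are stand-ins chosen for illustration.

```bash
#!/bin/bash
# Signal 0 runs the existence and permission checks but delivers no signal.

# process_alive <pid>: succeeds while the process exists.
process_alive() {
    kill -0 "$1" 2>/dev/null
}

# Start a throwaway background process to stand in for haproxy.
sleep 30 &
pid=$!

if process_alive "$pid"; then
    echo "running"
fi

# Stop the process and reap it so the PID no longer exists.
kill "$pid"
wait "$pid" 2>/dev/null || true

if ! process_alive "$pid"; then
    echo "stopped"
fi
```

Keepalived treats a non-zero exit status from the tracked script the same way this sketch treats the failed probe: as evidence that the service is gone.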
To apply these changes, both services must be restarted on both load balancer VMs:
```bash
sudo systemctl restart haproxy
sudo systemctl restart keepalived
```
K3s Server Installation and HA Configuration
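With the load balancing infrastructure in place, the K3s servers can be initialized. For high availability, this setup uses the embedded etcd datastore instead of the default SQLite database, which lives on a single server and cannot be replicated. K3s also supports an external datastore for HA, which is not covered here.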
Initializing the First Control Plane Node
The first server node is initialized using the --cluster-init flag. This command initializes the etcd database and prepares the cluster for additional control plane nodes. The --tls-san flag is used to include the Virtual IP in the API server's TLS certificate, ensuring that requests sent to the VIP are trusted.
To install a specific version of K3s, the INSTALL_K3S_VERSION variable is set first:
```bash
export INSTALL_K3S_VERSION=v1.31.6+k3s1
curl -sfL https://get.k3s.io | sh -s - server --cluster-init --disable="traefik" --disable="servicelb" --tls-san=192.168.1.220 --node-taint CriticalAddonsOnly=true:NoExecute
```
The use of --disable="traefik" and --disable="servicelb" is common in HA setups where external load balancers or specialized ingress controllers are preferred. The --node-taint CriticalAddonsOnly=true:NoExecute ensures that standard application pods are not scheduled on the control plane nodes, reserving their resources for cluster management.
Joining Additional Control Plane Nodes
Once the first node is active, the remaining control plane nodes are joined to the cluster. This can be automated via Ansible. In a structured Ansible role, the deployment might look like this:
```yaml
- name: Setup remaining k3s servers
  hosts: remaining_k3s_servers
  roles:
    - { role: k3s-servers, tags: [k3s-baseline] }
```
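In the Ansible role, the INSTALL_K3S_EXEC value passed to the installer for subsequent nodes includes the --tls-san flag pointing to the VIP. Joining servers also need the address of an existing server and the cluster token, which are not shown in this excerpt:
```yaml
INSTALL_K3S_EXEC: "server --disable servicelb --tls-san {{ k3s_control_plane_vip }}"
```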
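Without Ansible, the equivalent manual step is a single installer invocation on each additional server. The sketch below only prints that command as a dry run. The helper name server_join_command is hypothetical, the choice of 192.168.1.201 as the server to join through is an assumption, and the token placeholder stands in for the value stored on the first server.

```bash
#!/bin/bash
# Dry run: print the join command for an additional control plane node.

# server_join_command <existing_server_ip> <vip> <token>
server_join_command() {
    echo "curl -sfL https://get.k3s.io | K3S_TOKEN=$3 sh -s - server --server https://$1:6443 --disable servicelb --tls-san $2"
}

# The placeholder stands in for the contents of
# /var/lib/rancher/k3s/server/node-token on k3s-server-01.
server_join_command 192.168.1.201 192.168.1.220 '<node-token>'
```

The printed command would be run on k3s-server-02 and k3s-server-03 once the first node reports Ready.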
After the nodes have joined, the status can be verified using the following command:
```bash
kubectl get nodes
```
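Expected output for a healthy HA control plane (this sample uses different node names and an older K3s version than the commands above; names, ages, and versions will reflect the actual cluster):
```text
NAME            STATUS   ROLES                       AGE    VERSION
k8s-control-1   Ready    control-plane,etcd,master   297d   v1.24.17+k3s1
k8s-control-2   Ready    control-plane,etcd,master   40h    v1.24.17+k3s1
k8s-control-3   Ready    control-plane,etcd,master   40h    v1.24.17+k3s1
```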
Transitioning from Single-Node to High Availability
It is possible to migrate an existing single-node K3s cluster to an HA cluster. This process typically involves changing the datastore from SQLite to etcd.
Using an Ansible playbook with the -e reinstall=true flag can trigger this transition:
```bash
# Example of triggering a re-installation to switch to etcd
ansible-playbook site.yml -e reinstall=true
```
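After the migration, the configuration file /etc/rancher/k3s/config.yaml should list the VIP under tls-san (10.0.3.100 in this example, which uses a different network than the 192.168.1.220 VIP configured earlier):
```yaml
token: X
tls-san:
  - 10.0.3.100
```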
It is important to note that simply updating the config.yaml file is not sufficient to update existing certificates. A full re-installation or a certificate rotation is required to ensure the API server recognizes the VIP as a valid Subject Alternative Name (SAN).
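One way to confirm that the serving certificate actually carries the VIP is to inspect its subjectAltName extension. The sketch below assumes the extension text is supplied on standard input in the comma-separated form that openssl prints. The helper name san_lists_ip and the sample string are illustrative and were not captured from a real certificate.

```bash
#!/bin/bash
# san_lists_ip <ip>: reads subjectAltName text on stdin and succeeds
# only if it contains an exact "IP Address:<ip>" entry.
san_lists_ip() {
    tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -Fqx "IP Address:$1"
}

# Against a live cluster the input could come from openssl, for example:
#   openssl s_client -connect 192.168.1.220:6443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -ext subjectAltName | san_lists_ip 192.168.1.220

# Illustrative input only.
sample='DNS:kubernetes, DNS:localhost, IP Address:127.0.0.1, IP Address:192.168.1.220'
if printf '%s\n' "$sample" | san_lists_ip 192.168.1.220; then
    echo "VIP present in SAN list"
else
    echo "VIP missing from SAN list"
fi
```

Matching whole entries rather than substrings avoids a false positive when one address is a prefix of another, such as 192.168.1.22 and 192.168.1.220.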
Joining Worker Nodes to the HA Cluster
Worker nodes do not require the same complexity as control plane nodes. They simply need to be told where the API server is located and provide the correct join token. In an HA setup, the worker nodes are joined using the Keepalived VIP rather than the IP of a specific server node.
This ensures that if one control plane node fails, the worker node's connection to the API is seamlessly maintained by HAProxy routing the request to another healthy server.
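A worker join therefore targets the VIP on port 6443. The following dry-run sketch prints the installer invocation instead of running it. The helper name agent_join_command is hypothetical and the token placeholder must be replaced with the real cluster token.

```bash
#!/bin/bash
# Dry run: print the command that joins a worker (agent) through the VIP.

# agent_join_command <vip> <token>
agent_join_command() {
    echo "curl -sfL https://get.k3s.io | K3S_URL=https://$1:6443 K3S_TOKEN=$2 sh -"
}

# The placeholder stands in for the contents of
# /var/lib/rancher/k3s/server/node-token on a control plane node.
agent_join_command 192.168.1.220 '<node-token>'
```

Setting K3S_URL is what makes the installer configure the node as an agent rather than a server.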
Failure Scenario Analysis
The combination of HAProxy and Keepalived creates a multi-layered safety net for the K3s API.
- K3s Server Node Failure: If k3s-server-01 goes down, HAProxy detects the failure via the tcp-check and, after a few failed check intervals, stops routing traffic to 192.168.1.201. Traffic is redistributed to k3s-server-02 and k3s-server-03.
- HAProxy Instance Failure: If the HAProxy process on k3s-lb-01 crashes, the vrrp_script chk_haproxy fails. Keepalived shifts the Virtual IP 192.168.1.220 to k3s-lb-02 within a few seconds.
- Total Load Balancer Node Failure: If the entire VM k3s-lb-01 suffers a hardware failure, Keepalived's VRRP protocol detects the loss of the heartbeat and promotes k3s-lb-02 to MASTER status, assuming the VIP.
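These scenarios can be condensed into a small decision model. It is a simplification under stated assumptions: the hypothetical api_status function takes only the number of healthy load balancers and healthy control plane nodes, and it assumes the three-member etcd cluster needs two members for quorum before the API can serve requests.

```bash
#!/bin/bash
# Simplified availability model for the 2 load balancer / 3 server layout.

# api_status <healthy_load_balancers> <healthy_control_plane_nodes>
api_status() {
    if [ "$1" -lt 1 ]; then
        echo "unreachable: no load balancer holds the VIP"
    elif [ "$2" -lt 2 ]; then
        echo "unavailable: etcd has lost quorum"
    else
        echo "available"
    fi
}

api_status 2 3   # everything healthy
api_status 2 2   # one K3s server down
api_status 1 3   # one load balancer down
api_status 1 1   # two K3s servers down
api_status 0 3   # both load balancers down
```

The model makes the limits of the design explicit: it tolerates one failure in each layer, but not the loss of both load balancers or of two control plane nodes at once.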
Conclusion
The implementation of a K3s cluster with HAProxy and Keepalived transforms a lightweight Kubernetes distribution into a robust, enterprise-ready platform. By decoupling the API access point from the physical servers using a Virtual IP and utilizing a TCP load balancer to distribute traffic across an odd number of control plane nodes, administrators eliminate single points of failure. The synergy between the Proxmox-based virtualized infrastructure and the VRRP-driven failover mechanism ensures that the cluster remains manageable and available, regardless of individual component failures. The transition from SQLite to etcd is the final architectural piece that allows the state of the cluster to be replicated across multiple nodes, completing the high-availability loop. This setup is an essential blueprint for any technician looking to deploy K3s in an environment where downtime is not an option.