Automating K3s Orchestration via Terraform and Proxmox

Deploying a Kubernetes cluster has historically been a complex undertaking that demanded detailed knowledge of node joins, authentication, and component networking. Working environments were often difficult to reproduce, which made running useful services on them precarious. k3s changes this. By packaging the standard Kubernetes components into a single binary and simplifying the authentication setup, k3s makes it possible to run a fully Kubernetes-compliant cluster without specialist expertise in cluster management.

When k3s is combined with an Infrastructure as Code (IaC) tool such as Terraform and a configuration management tool such as Ansible, spinning up an entire cluster changes from a manual, error-prone series of steps in a web interface into a handful of scripted commands. This approach automates the underlying virtual infrastructure on Proxmox, so the environment is reproducible and can be rebuilt quickly after the failures commonly associated with hardware-constrained deployments.

The Infrastructure Foundation

Before the automation pipeline can be executed, several critical prerequisites must be established. These tools form the bedrock of the deployment process, ensuring that the control machine can communicate with the Proxmox API and configure the resulting virtual machines.

The primary tool for infrastructure provisioning is Terraform. To get started, Terraform must be downloaded from its official downloads page and placed within the system's $PATH so that it can be invoked from any directory. Beyond the core binary, the Proxmox Terraform provider is required to allow Terraform to communicate with the Proxmox API. This provider is installed using the go install command as specified in the provider's README, and the resulting binaries must be placed in the ~/.terraform.d/plugins/ directory. On Terraform 0.13 and later, a provider that is published in the Terraform Registry and declared in a required_providers block is instead downloaded by terraform init, so the manual build mainly applies to older releases or custom builds.

In addition to Terraform, the environment requires a functional Proxmox host and the installation of Ansible. While Terraform handles the creation of the virtual machines, Ansible is utilized for the subsequent configuration of the k3s software, filling the gap where Terraform's ability to track configuration changes is limited.

To enable smooth automation, a cloud-init Ubuntu template must be constructed within Proxmox. Cloud-init is a package shipped in the template image that applies initial settings, such as the hostname, user account, SSH keys, and network configuration, when a VM first boots. Without a cloud-init enabled template, Terraform would be unable to inject these settings into the VMs during the boot process, and every node would require manual intervention.
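
As a rough sketch of how such a template might be built on the Proxmox host, the function below strings together the usual qm commands. The VM ID, template name, image path, bridge name, and storage name are all placeholders, and the disk volume name assumes an LVM-style storage such as local-lvm. The run helper only prints the commands unless DRY_RUN=0 is set, so the sequence can be reviewed before anything is executed.

```shell
# run: print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# build_template VMID IMAGE STORAGE
#   VMID    numeric ID for the template VM (placeholder, e.g. 9000)
#   IMAGE   path to a downloaded Ubuntu cloud image (placeholder)
#   STORAGE Proxmox storage name (assumed to be LVM-style, e.g. local-lvm)
build_template() {
  vmid="$1"; image="$2"; storage="$3"
  # Create an empty VM shell with one virtio NIC on the default bridge.
  run qm create "$vmid" --name ubuntu-cloudinit --memory 2048 --net0 virtio,bridge=vmbr0 &&
  # Import the cloud image as a disk and attach it as the boot disk.
  run qm importdisk "$vmid" "$image" "$storage" &&
  run qm set "$vmid" --scsihw virtio-scsi-pci --scsi0 "$storage:vm-$vmid-disk-0" &&
  run qm set "$vmid" --boot order=scsi0 &&
  # Attach the cloud-init drive that carries the per-VM settings.
  run qm set "$vmid" --ide2 "$storage:cloudinit" &&
  # Many cloud images expect a serial console.
  run qm set "$vmid" --serial0 socket --vga serial0 &&
  # Convert the VM into a template that can be cloned.
  run qm template "$vmid"
}
```

Calling build_template 9000 ./ubuntu-cloudimg.img local-lvm prints the seven commands; running it again with DRY_RUN=0 on the Proxmox host would execute them.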

Proxmox API Integration and VM Provisioning

Once the environment is prepared, the connection between the local automation machine and the Proxmox API must be established. This is achieved by exporting specific environment variables that the Proxmox Terraform provider uses for authentication and connectivity.

The following variables must be defined:

PM_API_URL: This is the endpoint for the Proxmox API, formatted as https://<node_ip>:8006/api2/json.
PM_USER: The user account used for authentication, such as root@pam.
PM_PASS: The password associated with the Proxmox user.

If these environment variables are not set, the user is prompted for the values every time a terraform plan or terraform apply command is executed.
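
A minimal sketch of setting the three variables in a shell session follows. The address 192.0.2.10 is a documentation placeholder, and check_pm_env is a hypothetical helper added here only to catch a forgotten variable before Terraform runs.

```shell
# Placeholder values: replace the address and user with your own.
export PM_API_URL="https://192.0.2.10:8006/api2/json"
export PM_USER="root@pam"
# Avoid hard-coding the password in a script. One option in bash is to read it
# interactively:
#   read -r -s -p "Proxmox password: " PM_PASS && export PM_PASS

# check_pm_env: report any of the three variables that are unset or empty.
# Returns 0 when all are present, 1 otherwise.
check_pm_env() {
  missing=0
  for name in PM_API_URL PM_USER PM_PASS; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_pm_env && terraform plan makes the omission visible as a short message naming the missing variable.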

The provisioning workflow follows a standard Terraform lifecycle:

terraform init: Initializes the working directory and downloads the necessary providers.
terraform plan: Generates an execution plan, showing what resources will be created.
terraform apply: Executes the plan to create the virtual machines on the Proxmox host.
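
One common variation on this lifecycle, shown here as a sketch rather than the article's exact procedure, is to save the plan to a file and apply that file, so that what gets applied is exactly what was reviewed. The file name tfplan is arbitrary.

```shell
# provision: run init, write the plan to a file, then apply that saved plan.
# Each step runs only if the previous one succeeded.
provision() {
  terraform init &&
  terraform plan -out=tfplan &&
  terraform apply tfplan
}
```

Applying a saved plan does not ask for confirmation, so the review has to happen on the plan output before the last step.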

After these commands are executed, a waiting period is required to allow the VMs to complete their initial cloud-init configuration. This phase is critical as it sets up the base OS and networking before the Kubernetes layer is applied.
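
Rather than guessing how long to wait, the wait can be scripted. The sketch below assumes the nodes are reachable over SSH under the names passed in and that the cloud-init CLI is present in the image; cloud-init status --wait blocks until the first-boot run has finished. MAX_TRIES and RETRY_DELAY are hypothetical tuning knobs introduced for this example.

```shell
# wait_for_cloud_init HOST...
# Retries while SSH is not yet reachable, then blocks until cloud-init is done.
# Gives up on a host after MAX_TRIES failed attempts (default 30).
wait_for_cloud_init() {
  for host in "$@"; do
    tries=0
    until ssh -o ConnectTimeout=5 "$host" cloud-init status --wait >/dev/null 2>&1; do
      tries=$((tries + 1))
      if [ "$tries" -ge "${MAX_TRIES:-30}" ]; then
        echo "gave up waiting for $host" >&2
        return 1
      fi
      sleep "${RETRY_DELAY:-10}"
    done
    echo "$host: cloud-init finished"
  done
}
```

For example, wait_for_cloud_init k8s-control-1 k8s-worker-1 k8s-worker-2 k8s-worker-3 returns once all four nodes report completion. Note that a cloud-init run that ends in an error also yields a non-zero exit status, so such a node is reported as a timeout here.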

Resource Specification and Cluster Topology

The architecture of the cluster is defined in the terraform.tfvars file, where the specific resources for each node are mapped. This allows for a heterogeneous cluster where control plane nodes and worker nodes have different hardware allocations.

The following table outlines the specific configurations used for the cluster nodes:

Node Name	Role	CPU Cores	Memory (MB)	Disk Size	Description
k8s-control-1	Control Plane	4	4096	30G	Kubernetes control plane node 1
k8s-worker-1	Worker	2	4096	120G	Kubernetes worker node 1
k8s-worker-2	Worker	2	4096	120G	Kubernetes worker node 2
k8s-worker-3	Worker	2	4096	120G	Kubernetes worker node 3

The disk size for the control plane is notably smaller at 30G; this value was previously utilized for Longhorn storage and is slated for removal in future iterations. The worker nodes are allocated 120G to accommodate the actual workloads and data.
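
To make the mapping from the table to terraform.tfvars concrete, the sketch below prints a vms list with one entry per node. The attribute names (name, desc, cores, memory, disk_size) are illustrative assumptions and must be matched to the variable definition in the actual Terraform configuration.

```shell
# vm_entry NAME DESC CORES MEMORY DISK
# Prints one element of a hypothetical `vms` list; attribute names are assumptions.
vm_entry() {
  cat <<EOF
  {
    name      = "$1"
    desc      = "$2"
    cores     = $3
    memory    = $4
    disk_size = "$5"
  },
EOF
}

# render_vms: assemble the full list from the node table.
render_vms() {
  echo "vms = ["
  vm_entry k8s-control-1 "Kubernetes control plane node 1" 4 4096 30G
  for i in 1 2 3; do
    vm_entry "k8s-worker-$i" "Kubernetes worker node $i" 2 4096 120G
  done
  echo "]"
}
```

Redirecting the output of render_vms to a scratch file and comparing it with the existing terraform.tfvars is a quick way to check that the file and the table agree.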

A significant limitation in the current Proxmox setup is that the replication of Virtual Machine Templates is not possible across the Proxmox cluster. This forces the initial provisioning to occur on a single Proxmox host, which is why the pve_node property is often ignored during the initial phase. Once the VMs are provisioned, they must be migrated manually to other nodes within the cluster. To resolve this, storage that supports replication would be required for the templates.

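To simplify access to these nodes, the ~/.ssh/config file should be configured. By defining the Host, HostName, and User for each node (e.g., k8s-control-1, k8s-worker-1), the administrator can SSH into any node by name without retyping its address or username. Provided key-based authentication is set up (for example, with a public key injected through cloud-init), no password is required either.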
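
A small helper can generate these stanzas. In the sketch below the IP address and the user name ubuntu are placeholders; the user depends on what the cloud-init template creates.

```shell
# ssh_host_entry ALIAS ADDRESS USER
# Prints one Host stanza suitable for appending to ~/.ssh/config.
ssh_host_entry() {
  cat <<EOF
Host $1
    HostName $2
    User $3
EOF
}
```

For example, ssh_host_entry k8s-control-1 192.0.2.11 ubuntu >> ~/.ssh/config adds the control plane node, after which ssh k8s-control-1 is enough to connect.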

K3s Configuration and Ansible Orchestration

While Terraform is excellent for provisioning the virtual hardware, it lacks the granular tracking required for software configuration changes. This is where Ansible is introduced. Ansible takes the raw VMs provided by Terraform and transforms them into a functioning Kubernetes cluster.

The process begins in the ansible-roles directory. The primary configuration point is the inventory.toml file, where the IP addresses of the created VMs must be specified. This file tells Ansible which hosts are part of the cluster and how to reach them.
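
The exact layout of the inventory depends on the playbook, so the following is only a sketch of what a TOML inventory could look like. The group names control_plane and workers are hypothetical and must match the groups the playbook targets; the addresses are placeholders. Ansible's TOML inventory plugin expects each group's hosts under a GROUP.hosts table, with per-host variables such as ansible_host given inline.

```shell
# inventory_toml CONTROL_IP WORKER_IP...
# Prints a TOML inventory with one control plane node and N workers.
# Group names are assumptions; adjust them to the playbook.
inventory_toml() {
  control_ip="$1"; shift
  echo "[control_plane.hosts]"
  echo "k8s-control-1 = { ansible_host = \"$control_ip\" }"
  echo
  echo "[workers.hosts]"
  i=1
  for ip in "$@"; do
    echo "k8s-worker-$i = { ansible_host = \"$ip\" }"
    i=$((i + 1))
  done
}
```

Running inventory_toml 192.0.2.11 192.0.2.21 192.0.2.22 192.0.2.23 prints an inventory for the four nodes, which can be compared against the addresses Terraform assigned.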

The execution flow for a full setup is as follows:

terraform init
terraform apply
cd ansible
ansible-playbook -i hosts setup-k3s.yml
export KUBECONFIG=./k3s.yaml
kubectl get nodes

This sequence ensures that the infrastructure is built first, the k3s software is installed and configured second, and the cluster is verified third.

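For users wishing to manage the cluster from a local machine, the kubeconfig file is located at /etc/rancher/k3s/k3s.yaml on the control plane node. Copying this file to the local machine, replacing the default 127.0.0.1 server address with the address of the control plane node, and pointing KUBECONFIG at the copy allows kubectl to manage the cluster remotely.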
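
The kubeconfig that k3s writes points at https://127.0.0.1:6443, which only works on the server itself. A minimal sketch of the rewrite needed for remote use is shown below; the IP address is a placeholder, and the function simply filters standard input to standard output.

```shell
# point_kubeconfig_at ADDRESS
# Reads a k3s kubeconfig on stdin and replaces the loopback server address
# with ADDRESS, writing the result to stdout.
point_kubeconfig_at() {
  sed "s#https://127\.0\.0\.1:6443#https://$1:6443#"
}
```

A possible use, assuming the SSH user may run sudo on the node, is ssh k8s-control-1 sudo cat /etc/rancher/k3s/k3s.yaml | point_kubeconfig_at 192.0.2.11 > k3s.yaml, followed by export KUBECONFIG=./k3s.yaml.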

Lifecycle Management and Node Maintenance

One of the primary advantages of moving from physical hardware, such as Raspberry Pis, to Proxmox VMs is the ability to perform rapid maintenance and recovery. In physical deployments, node failures often require physically reconnecting power or reflashing SD cards. In a virtualized environment managed by IaC, these pain points are eliminated.

The process for removing or replacing a node is streamlined into several logical steps:

Drain the node to ensure pods are moved to other healthy nodes.
Remove the node from the cluster.
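Destroy and recreate the VM of the node using terraform destroy and terraform apply. Because a bare terraform destroy removes every resource in the state, restrict it to the affected VM with the -target option.
Remove the old SSH host key of the VM from known_hosts on the Ansible control node, since the recreated VM presents a new one.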
Rerun the Ansible playbook to reconfigure the new node.
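
The steps above can be sketched as a single function. Several things here are assumptions: the Terraform resource address is hypothetical (terraform state list shows the real ones), the node name is assumed to double as its SSH host alias, and the inventory and playbook names are taken from the setup sequence shown earlier. The run helper prints each command instead of executing it unless DRY_RUN=0 is set.

```shell
# run: print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# replace_node NODE TF_ADDR [ANSIBLE_DIR]
#   NODE        Kubernetes node name, assumed to match its SSH host alias
#   TF_ADDR     Terraform resource address of the VM (hypothetical)
#   ANSIBLE_DIR directory holding the inventory and playbook (default: ansible)
replace_node() {
  node="$1"; addr="$2"; dir="${3:-ansible}"
  # 1. Move workloads off the node.
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data &&
  # 2. Remove the node object from the cluster.
  run kubectl delete node "$node" &&
  # 3. Destroy only this VM, then recreate whatever is missing.
  run terraform destroy -target="$addr" &&
  run terraform apply &&
  # 4. Forget the old SSH host key of the recreated VM.
  run ssh-keygen -R "$node" &&
  # 5. Reconfigure the new node.
  run ansible-playbook -i "$dir/hosts" "$dir/setup-k3s.yml"
}
```

Calling replace_node k8s-worker-2 'proxmox_vm_qemu.vm["k8s-worker-2"]' prints the six commands for review. If known_hosts also records the node by IP address, that entry needs the same ssh-keygen -R treatment.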

This lifecycle allows the administrator to destroy and recreate the entire cluster when necessary, provided that stateful data is backed up externally. Furthermore, the use of a baseline template means that any service that cannot be containerized for k3s can still be deployed as a VM using the same automated process.

Comparative Analysis of Infrastructure Paradigms

The shift from physical ARM-based clusters (Raspberry Pi) to virtualized x86 clusters on Proxmox addresses several systemic failures. Physical nodes are prone to unresponsive states due to overload, which often necessitates a hard reboot. Additionally, physical network interfaces can fail silently, requiring a reboot to restore connectivity.

By utilizing a virtualized environment, the administrator gains:

Robustness: The ability to recreate the environment from code.
Scalability: The ease of adding new worker nodes by simply adding a new entry to the vms list in terraform.tfvars.
Consistency: Every node is based on the same cloud-init Ubuntu template, so all nodes start from an identical OS baseline.

For those looking to further optimize the process, existing Terraform modules such as fvumbaca/k3s/proxmox can be explored to reduce the reliance on custom Ansible playbooks for the initial configuration, although specific configurations like Longhorn would still require manual or scripted intervention.

Analysis of Reproducibility and Scalability

The core objective of this architecture is reproducibility. By defining the infrastructure in terraform.tfvars and the configuration in Ansible playbooks, the administrator moves away from the "snowflake server" problem, where each node is uniquely configured and impossible to replicate.

The impact of this reproducibility is most evident during experimentation. If a configuration change causes a cluster-wide failure, the administrator does not need to spend hours troubleshooting the state of each node. Instead, they can execute terraform destroy to wipe the slate clean and then rerun the initialization sequence. This creates a sandbox environment where the cost of failure is minimal.

From a scalability perspective, the architecture allows for a rapid increase in compute capacity. To add a fourth worker node, the administrator adds a new block to the vms array in the Terraform variables, runs terraform apply, updates the Ansible inventory, and reruns the playbook. The entire process takes minutes rather than hours, and the resulting node is built from the same template and playbook as its peers.