K3s Orchestration via Ansible Automation

Deploying Kubernetes clusters by hand is labor-intensive, repetitive, and prone to configuration errors. K3s, a lightweight Kubernetes distribution built for IoT, edge computing, and low-powered devices such as the Raspberry Pi, makes container orchestration practical at the edge. Although K3s is designed for simplicity, installing a multi-node cluster, especially one that requires high availability or a specific networking backend, still calls for a structured approach that keeps every node consistent. This is where Ansible, an IT automation platform from Red Hat, helps. With Ansible, operators replace manual shell script execution with a "desired state" defined for their infrastructure. That makes it possible to scale clusters quickly, enforce configuration standards, and deploy identical environments on hardware ranging from home labs to production clusters.

The Architectural Synergy of K3s and Ansible

K3s is a CNCF-certified Kubernetes distribution that strips out legacy and non-essential code and ships as a single, small binary. This makes it a good fit for environments where resources are constrained. The challenge arises when moving from a single-node setup to a multi-node cluster: joining worker nodes to a master node, managing tokens, and making sure the server API is reachable involve several manual steps that are prone to human error.

Ansible solves this by acting as the orchestration layer. Ansible operates on a push-based model, using "playbooks" to manage the state of remote machines. Instead of logging into every virtual machine individually, an operator can execute a single playbook that targets a group of servers. Ansible leverages modules to shift the system toward the desired state defined in the playbook. In the context of K3s, this means Ansible can handle the downloading of the installation script, the injection of security tokens, and the execution of the install command with the specific flags required for different node roles.

Infrastructure Prerequisites for K3s Deployment

To successfully implement a K3s cluster using Ansible, several foundational components must be in place. The environment must be prepared to support both the automation engine and the resulting Kubernetes nodes.

  • A hypervisor
    This software is required to run the virtual machines that will serve as Kubernetes nodes. Choosing a reliable hypervisor ensures that the underlying hardware is efficiently partitioned.

  • Ubuntu Server 20.04
    The deployment of node VMs typically utilizes Ubuntu Server 20.04. Using a consistent ISO across all node VMs ensures that the operating system baseline is identical, which reduces the likelihood of unexpected failures during the Ansible playbook execution.

  • Ansible Installation
    The automation platform must be installed on a control node. This is the machine from which the playbooks are executed to configure the remote servers.

  • Text Editor
    A text editor with YAML support is needed for writing and modifying playbooks. VS Code is one suitable option.

Deep Dive into the K3s Installation Workflow

The process of deploying K3s via Ansible involves a structured series of tasks designed to ensure that the cluster is initialized correctly and that nodes join the cluster in the proper sequence.

Pre-Installation Checks and Resource Acquisition

Before the installation process begins, the system must determine if K3s is already present to avoid redundant installations and potential configuration conflicts.

  • Checking for Existing Installations
    Ansible uses a shell command to check for the existence of the K3s binary at the path /usr/local/bin/k3s. This check is critical because subsequent installation tasks are conditioned on the result of this test. If the binary is found, the installation steps are skipped.

  • Acquiring the Installation Script
    The K3s installation script is retrieved from the official source via an HTTP GET request to https://get.k3s.io. The script is saved to /tmp/k3s_install.sh on the target nodes. This ensures that the latest version of the installer is utilized for every node in the cluster.

  • Secret Management
    Security tokens are essential for joining nodes to a cluster. Ansible can import these variables from an encrypted file, such as k3s-secrets.yml.vault. This allows sensitive data to be stored securely while still being accessible to the playbook during runtime.
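
The three steps above could be expressed as Ansible tasks roughly as follows. This is a minimal sketch rather than the playbook from any particular repository: the task names and the registered variable k3s_binary_check are assumptions made for illustration, while the paths, URL, and vault file name come from the text.

```yaml
# Sketch only: task names and `k3s_binary_check` are illustrative assumptions.
- name: Check whether the K3s binary already exists
  ansible.builtin.shell: test -f /usr/local/bin/k3s
  register: k3s_binary_check
  changed_when: false   # a read-only check never changes the host
  failed_when: false    # a missing binary is expected on a fresh node

- name: Download the K3s installation script
  ansible.builtin.uri:
    url: https://get.k3s.io
    method: GET
    dest: /tmp/k3s_install.sh
  when: k3s_binary_check.rc != 0

- name: Load the cluster token from the encrypted variables file
  ansible.builtin.include_vars:
    file: k3s-secrets.yml.vault
  when: k3s_binary_check.rc != 0
```

The non-zero return code of the check is what gates the later tasks, so a node that already has the binary skips the download and the secret import.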

Node-Specific Execution Logic

A K3s cluster consists of different roles, each requiring a different set of installation flags. Ansible manages these via conditional logic based on the node_type variable defined in the inventory file.

  • Initial Master Node
    The first master node must initialize the cluster. The installation script is executed with the --cluster-init flag, which starts a new cluster backed by embedded etcd. In addition, the --disable=traefik flag turns off the default Traefik ingress controller, and --flannel-backend=vxlan sets the Flannel networking backend explicitly (vxlan is also the K3s default).

  • Subsequent Master Nodes
    Additional master nodes join the existing cluster rather than initializing a new one. The installer runs on them with a --server flag pointing to the primary master node's IP address on port 6443.

  • Worker Nodes
    Worker nodes are installed as agents. The command includes the agent argument, the cluster token, and the server URL of the master node. This allows the worker to register with the control plane and start running the pods that are scheduled onto it.
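
For illustration, an inventory that assigns node_type per host might look like the sketch below. The host name k3s-m1 appears in the commands later in this article; the file name, the group name, the other host names, the IP addresses (taken from a documentation range), and the node_type values are all hypothetical.

```yaml
# inventory.yml (hypothetical file name, group name, addresses, and node_type values)
all:
  children:
    k3s_cluster:
      hosts:
        k3s-m1:
          ansible_host: 192.0.2.11
          node_type: init-master   # first master, initializes the cluster
        k3s-m2:
          ansible_host: 192.0.2.12
          node_type: master        # joins the existing control plane
        k3s-m3:
          ansible_host: 192.0.2.13
          node_type: master
        k3s-w1:
          ansible_host: 192.0.2.21
          node_type: worker        # installed as an agent
        k3s-w2:
          ansible_host: 192.0.2.22
          node_type: worker
```

Keeping the role in a host variable lets a single playbook target the whole group and branch on node_type, instead of maintaining one playbook per role.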

Advanced Cluster Bootstrapping and Service Management

For more complex deployments, specifically those involving high availability, the bootstrap process must be handled with precision. This involves temporary services that facilitate the initial cluster formation.

Bootstrap Service Lifecycle

The use of a bootstrap service allows the cluster to reach a stable state before transitioning to the permanent K3s service.

  • Bootstrap Template Deployment
    A template file, such as k3s-bootstrap-followers.service.j2, is deployed to the systemd directory as k3s-bootstrap.service. This file is configured with root ownership and 0644 permissions.

  • Service Orchestration
    The bootstrap service is started on the primary master node first. Ansible utilizes the ansible.builtin.systemd module to trigger a daemon reload and start the service. A delay and retry mechanism is implemented to ensure the service is fully operational before the rest of the cluster attempts to join.

  • Follower Node Activation
    Once the primary master is active, the bootstrap service is started on the follower nodes. This sequential activation prevents race conditions during the initial cluster handshake.

  • Cleanup and Transition
    After the bootstrap phase is complete, the k3s-bootstrap service is stopped and the service file is removed from /etc/systemd/system/k3s-bootstrap.service. The system then transitions to the primary k3s.service, which manages the long-term operation of the node.
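
The lifecycle above might translate into tasks along these lines. Treat this as a sketch under assumptions: the retry counts, the registered variable bootstrap_start, the use of k3s-m1 as the primary master's inventory name, and the final task that starts k3s.service are illustrative choices, not details confirmed by the text.

```yaml
# Sketch only: retries, delays, and variable names are illustrative assumptions.
- name: Deploy the bootstrap unit file
  ansible.builtin.template:
    src: k3s-bootstrap-followers.service.j2   # example template named in the text
    dest: /etc/systemd/system/k3s-bootstrap.service
    owner: root
    group: root
    mode: "0644"

- name: Start the bootstrap service on the primary master first
  ansible.builtin.systemd:
    name: k3s-bootstrap
    daemon_reload: true
    state: started
  register: bootstrap_start
  until: bootstrap_start is succeeded
  retries: 3
  delay: 10
  when: inventory_hostname == "k3s-m1"

- name: Start the bootstrap service on the follower nodes
  ansible.builtin.systemd:
    name: k3s-bootstrap
    daemon_reload: true
    state: started
  when: inventory_hostname != "k3s-m1"

- name: Stop the bootstrap service once the cluster has formed
  ansible.builtin.systemd:
    name: k3s-bootstrap
    state: stopped

- name: Remove the bootstrap unit file
  ansible.builtin.file:
    path: /etc/systemd/system/k3s-bootstrap.service
    state: absent

- name: Hand over to the permanent K3s service
  ansible.builtin.systemd:
    name: k3s
    daemon_reload: true
    enabled: true
    state: started
```

Because Ansible finishes each task on all targeted hosts before moving to the next one, placing the primary master's start task ahead of the followers' task is enough to enforce the ordering.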

Post-Deployment Configuration and Cluster Access

Once the Ansible playbook has finished executing, the cluster is operational, but the operator must still configure local access to manage the Kubernetes resources.

Kubeconfig Retrieval and Modification

The kubeconfig file contains the necessary credentials and server information to interact with the cluster via kubectl.

  • Locating the Config
    The kubeconfig is located on the master nodes at /etc/rancher/k3s/k3s.yaml. The operator can retrieve its content by connecting to a master node over SSH.

  • Local Environment Configuration
    To use the configuration on a local machine, the server property must be modified. By default, it points to https://127.0.0.1:6443. This must be updated to the actual IP address of one of the master nodes in the cluster.

  • Connectivity Testing
    Once the modified kubeconfig is placed on the local machine, the connection can be verified using the command kubectl get ns. This lists the namespaces and confirms that the local kubectl client can communicate with the remote K3s API server.
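
The text describes retrieving and editing the kubeconfig by hand. As one possible alternative, the same steps could be automated with a short play like the sketch below. The local file name kubeconfig-k3s.yaml is hypothetical, and the play assumes that k3s-m1 is a master in the inventory and that fact gathering is enabled so that ansible_default_ipv4 is available.

```yaml
# Sketch only: the local file name is a hypothetical choice.
- name: Copy the kubeconfig from the first master to the control node
  hosts: k3s-m1
  become: true   # the file on the node is normally readable only by root
  tasks:
    - name: Fetch /etc/rancher/k3s/k3s.yaml
      ansible.builtin.fetch:
        src: /etc/rancher/k3s/k3s.yaml
        dest: "{{ playbook_dir }}/kubeconfig-k3s.yaml"
        flat: true

    - name: Point the server entry at the master's real address
      ansible.builtin.replace:
        path: "{{ playbook_dir }}/kubeconfig-k3s.yaml"
        regexp: 'https://127\.0\.0\.1:6443'
        replace: "https://{{ ansible_default_ipv4.address }}:6443"
      delegate_to: localhost
      become: false   # edit the local copy as the current user
```

After the play runs, kubectl --kubeconfig kubeconfig-k3s.yaml get ns should list the cluster's namespaces from the control node.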

Comparison of K3s Provisioning Tools

While Ansible provides a robust way to manage state, other projects and tools exist to simplify the K3s deployment process depending on the user's specific needs.

| Tool | Primary Language | Key Feature | Use Case |
|---|---|---|---|
| k3s-ansible | YAML/Ansible | Playbook-based bootstrapping | Multi-node clusters for Ansible users |
| k3sup | Golang | SSH-only requirements | Fast setup with external datastores |
| autok3s | N/A | Graphical User Interface (GUI) | Cloud providers and VM provisioning for non-CLI users |
| hetzner-k3s | Crystal | Hetzner Cloud integration | Specific deployments on Hetzner Cloud infrastructure |

Technical Implementation Summary

The following table outlines the critical Ansible components used in the K3s deployment process.

| Ansible Component | Purpose | Implementation Detail |
|---|---|---|
| ansible.builtin.shell | Command Execution | Used for running the sh /tmp/k3s_install.sh script |
| ansible.builtin.uri | File Download | Fetches the installation script from https://get.k3s.io |
| ansible.builtin.include_vars | Secret Injection | Imports k3s-secrets.yml.vault for token management |
| ansible.builtin.template | Config Deployment | Deploys .j2 templates for systemd service files |
| ansible.builtin.systemd | Service Management | Handles daemon_reload and service state (started/stopped) |

Implementation Logic for Master and Worker Nodes

The differentiation between node roles is handled through the when conditional in Ansible. This ensures that the correct installation flags are applied to the correct machines.

For the Initial Master Node, the logic is as follows:

```bash
sh /tmp/k3s_install.sh --token {{ k3s_token }} --disable=traefik --flannel-backend=vxlan --cluster-init
```

For additional Master Nodes, the logic shifts to:

```bash
sh /tmp/k3s_install.sh --token {{ k3s_token }} --disable=traefik --flannel-backend=vxlan --server https://{{ hostvars["k3s-m1"]["ansible_default_ipv4"]["address"] }}:6443
```

For Worker Nodes, the command is executed as:

```bash
sh /tmp/k3s_install.sh agent --token {{ k3s_token }} --server https://{{ hostvars["k3s-m1"]["ansible_default_ipv4"]["address"] }}:6443
```
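
Wrapped in tasks, the role selection could look like the following sketch, shown for the initial master and the workers; additional masters would follow the same pattern with their own command. The node_type values are hypothetical labels, and no_log is an optional addition that is not mentioned in the text.

```yaml
# Sketch only: the node_type values are hypothetical labels.
- name: Install K3s on the initial master
  ansible.builtin.shell: >-
    sh /tmp/k3s_install.sh
    --token {{ k3s_token }}
    --disable=traefik
    --flannel-backend=vxlan
    --cluster-init
  when: node_type == "init-master"
  no_log: true   # keep the token out of task output

- name: Install K3s on worker nodes
  ansible.builtin.shell: >-
    sh /tmp/k3s_install.sh agent
    --token {{ k3s_token }}
    --server https://{{ hostvars["k3s-m1"]["ansible_default_ipv4"]["address"] }}:6443
  when: node_type == "worker"
  no_log: true
```

The folded scalar (>-) joins the lines into a single command, which keeps long flag lists readable without changing what is executed.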

The execution of the entire process is triggered by the following command:

```bash
ansible-playbook k3s.yml
```

If the security variables were encrypted using Ansible Vault, the command is modified to request the password:

```bash
ansible-playbook k3s.yml --ask-vault-pass
```

Analysis of Orchestration Efficiency

The integration of Ansible into the K3s lifecycle provides a significant reduction in the "time-to-cluster" metric. In a manual scenario, an operator would have to SSH into each node, download the script, manually enter the token, and verify the status. In a cluster of ten nodes, this would be a highly repetitive and error-prone process.

Automation also keeps the join sequence consistent. For example, a 30-second ansible.builtin.pause gives the master node's API server time to initialize before the follower nodes attempt to join. This avoids most of the "connection refused" errors that occur when nodes are joined in rapid succession by hand.
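
A sketch of that wait is shown below, together with a polling variant. The second task is an assumption added for comparison and is not described in the text; it waits for port 6443 to accept connections instead of sleeping for a fixed time, and its 120-second timeout is an arbitrary illustrative value.

```yaml
# Fixed delay, as described in the text.
- name: Give the first master time to bring up its API server
  ansible.builtin.pause:
    seconds: 30

# Alternative (assumption, not from the text): poll the API port instead.
- name: Wait until the API port on the first master accepts connections
  ansible.builtin.wait_for:
    host: "{{ hostvars['k3s-m1']['ansible_default_ipv4']['address'] }}"
    port: 6443
    timeout: 120
```

A fixed pause is simpler to reason about, while polling tends to finish sooner on fast hardware and fails loudly if the API server never comes up.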

Furthermore, the ability to utilize ansible.builtin.template for service files allows for the dynamic injection of variables. This means that if the server IP or the token changes, the operator only needs to update the inventory or the vault file, rather than manually editing service files on every node. This creates a "single source of truth" for the cluster configuration.

Setting failed_when: false on the check for the K3s binary is a critical design choice. It prevents the task from failing when the file is missing, so the play continues into the installation tasks. Combined with the when conditions on those tasks, this makes the playbook idempotent: it can be run multiple times without causing side effects or duplicating installations, which is the core strength of the Ansible approach.

Conclusion

Deploying K3s through Ansible combines lightweight container orchestration with infrastructure automation. By shifting the installation logic from manual shell execution to a declarative playbook, operators gain a level of consistency and scalability that is hard to achieve by hand. The process begins with the preparation of Ubuntu Server VMs and progresses through an ordered series of tasks: validating the current state, acquiring the official installation script, and applying role-specific configurations.

Bootstrap services and sequential node activation let high-availability clusters form with far less risk of race conditions, and Ansible Vault keeps cluster tokens encrypted at rest. Compared with tools such as k3sup or autok3s, the Ansible approach offers more flexibility for those who want to manage the entire lifecycle of their infrastructure as code. The result is a Kubernetes environment that can be deployed, scaled, and managed with minimal manual intervention, letting the operator focus on application deployment rather than the underlying plumbing of the cluster.
