Architecting K3s Clusters on Oracle Cloud Infrastructure Always Free Tier

The intersection of lightweight Kubernetes distributions and generous cloud free tiers has created a playground for developers, DevOps engineers, and students to experiment with container orchestration without incurring financial risk. Oracle Cloud Infrastructure (OCI) provides one of the most substantial "Always Free" offerings in the industry, particularly through its Ampere A1 Compute instances. When paired with K3s—a highly optimized, lightweight Kubernetes distribution—users can deploy a functional, multi-node cluster that mirrors production-grade environments. This capability transforms the OCI Always Free tier from a simple virtual machine provider into a robust platform for hosting side projects, learning GitOps workflows, or maintaining lightweight production workloads.

The core appeal of K3s lies in its reduced memory footprint and simplified installation process. Unlike standard Kubernetes (K8s), which is resource-intensive and often requires a complex setup of etcd and several control plane components, K3s bundles everything into a single binary. On OCI, this is particularly beneficial because it allows the cluster to run comfortably within the RAM limits of the free tier. By leveraging ARM-based Ampere A1 instances, which offer significantly more resources than traditional x86-based free VMs, users can scale their cluster to include multiple nodes, effectively distributing workloads and ensuring that the control plane remains responsive.

OCI Always Free Resource Allocation and Hardware Strategies

Selecting the right hardware configuration is the most critical step in ensuring a stable K3s deployment. Oracle Cloud offers two primary paths for free compute resources: the Ampere ARM-based instances and the AMD-based instances. The distribution of these resources determines the architectural capacity of the cluster.

The Ampere A1 ARM64 instances are the crown jewels of the OCI free tier. Users are allocated up to 4 OCPUs and 24 GB of RAM in total. This allocation is flexible, allowing the user to create a single massive machine or split the resources across multiple smaller instances. For a K3s cluster, splitting these resources is the optimal strategy. A common configuration involves creating two separate machines, each equipped with 2 OCPUs and 12 GB of RAM. In this scenario, one machine functions as the K3s master (server), handling the API server and scheduling, while the second serves as a worker node to execute the pods. This separation ensures that if a worker node crashes due to a memory-intensive application, the control plane remains operational.

Alternatively, OCI provides two AMD-based instances, each with 1 OCPU and 1 GB of RAM. While these are too constrained to run a modern K3s server effectively, they serve specific purposes within a larger architectural ecosystem. One such use case is hosting a dedicated MySQL database on an AMD instance, which K3s can use as its external datastore in place of the embedded database. Keeping the cluster state on a separate node prevents the database's memory consumption from competing with the Kubernetes control plane on the ARM nodes.

The following table outlines the resource distribution options available for K3s deployment on OCI:

| Instance Type | CPU Architecture | Max Free Allocation | Recommended K3s Role | Typical Resource Split |
| --- | --- | --- | --- | --- |
| Ampere A1 | ARM64 | 4 OCPUs / 24 GB RAM | Master & Worker Nodes | 2 Nodes (2 OCPU / 12 GB RAM each) |
| AMD | x86_64 | 2 Instances (1 OCPU / 1 GB RAM each) | Database / Utility Node | 1 Node for MySQL / State |

Manual Provisioning and Initial Instance Setup

For users who prefer manual control over their infrastructure, the process begins within the OCI Console under the "Create VM Instance" menu. The choice of operating system is paramount; Ubuntu is the recommended image for K3s due to its broad community support and compatibility with the K3s installation scripts.

When creating the k3s-master instance, the user must select the ARM shape and allocate approximately 15 GB of boot volume storage. A critical step during the creation process is the management of SSH keys. OCI needs a public key to grant access: the user can either upload an existing public key or let the console generate a key pair and offer it for download. If the pair is generated by OCI, it is imperative to download and secure both the private and public keys immediately, because the private key is not offered again and the same pair is reused for the subsequent provisioning of worker and database nodes.

Once the instance status turns green (Running), the user connects via SSH using the public IP address. On a Linux or macOS system, this is done from the terminal. Windows users often rely on PuTTY, which expects the key in its own .ppk format, so a .pem key must first be converted with PuTTYgen.

```bash
ssh -i id.rsa ubuntu@<PUBLIC IP ADDRESS>
```

After gaining access, the first administrative task is to set a descriptive hostname to avoid confusion in the kubectl output. This is done using the hostnamectl utility.

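```bash
# Changing the hostname requires root privileges
sudo hostnamectl set-hostname k3s-master
# Print the current settings to confirm the change
hostnamectl
```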

Following the master node setup, secondary nodes are provisioned. If a database node is required, an AMD instance is created with 1 GB of RAM and 10 GB of HDD, designated as k3s-mysql. A MySQL installation is then performed on this node, including the creation of a dedicated user and database with specific admin rights. The private IP of this machine must be noted, as it will be referenced by the K3s cluster for connectivity. Finally, a second ARM instance (k3s-arm-node2) is created with 2 OCPUs and 12 GB of RAM to act as a scaling agent for the cluster.
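The wiring between the three machines can be sketched as a few helper functions that only print the install commands, so they can be reviewed before being run on each node. This is a minimal sketch: the function names are hypothetical, and the credentials and IP addresses passed in are placeholders. The `mysql://user:pass@tcp(host:3306)/db` endpoint form and the `K3S_URL` / `K3S_TOKEN` variables follow the K3s installer conventions.

```shell
# Sketch: print the K3s install commands for a MySQL-backed server and an agent.
# Function names are illustrative; all values passed in are placeholders.

# Print a K3s datastore endpoint for MySQL: mysql://user:pass@tcp(host:3306)/db
k3s_datastore_endpoint() {
  local user="$1" pass="$2" host="$3" db="$4"
  printf 'mysql://%s:%s@tcp(%s:3306)/%s\n' "$user" "$pass" "$host" "$db"
}

# Print (do not run) the server install command that points K3s at MySQL.
k3s_server_install_cmd() {
  local endpoint="$1"
  printf "curl -sfL https://get.k3s.io | sh -s - server --datastore-endpoint='%s'\n" "$endpoint"
}

# Print (do not run) the command that joins a worker node to the server.
k3s_agent_install_cmd() {
  local server_ip="$1" token="$2"
  printf "curl -sfL https://get.k3s.io | K3S_URL='https://%s:6443' K3S_TOKEN='%s' sh -\n" "$server_ip" "$token"
}
```

For example, `k3s_server_install_cmd "$(k3s_datastore_endpoint k3s '<PASSWORD>' '<MYSQL_PRIVATE_IP>' k3s)"` prints the command for k3s-master. The join token for the agent is written by the server to `/var/lib/rancher/k3s/server/node-token`.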

Automated Infrastructure as Code with Terraform

For those seeking a professional DevOps approach, manual setup is replaced by Terraform. Terraform allows the entire OCI infrastructure—virtual cloud networks, security lists, and compute instances—to be codified and deployed consistently.

The automation process begins with the configuration of environment variables. These variables provide the Terraform OCI provider with the necessary authentication credentials to manage resources on behalf of the user.

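```bash
# The part after TF_VAR_ must match the var.<name> references in the .tf files
# (var.compartment_ocid and var.private_key_path are used below)
export TF_VAR_compartment_ocid="<COMPARTMENT_OCID>"
export TF_VAR_region="<REGION_NAME>"
export TF_VAR_tenancy_ocid="<TENANCY_OCID>"
export TF_VAR_user_ocid="<USER_OCID>"
export TF_VAR_fingerprint="<RSA_FINGERPRINT>"
export TF_VAR_private_key_path="<PRIVATE_KEY_PATH>"
export TF_VAR_ssh_authorized_keys='["<SSH_PUBLIC_KEY>"]'
```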
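Because Terraform only sees variables that are actually exported, a small pre-flight check can catch a forgotten export before any API call is made. The helper below is a sketch with a hypothetical name (`require_tf_vars`); it takes the variable names without the `TF_VAR_` prefix and reports any that are unset or empty in the environment.

```shell
# Report which TF_VAR_<name> variables are missing from the environment.
# Returns 0 when every listed variable is exported and non-empty, 1 otherwise.
require_tf_vars() {
  local missing=0 name var
  for name in "$@"; do
    var="TF_VAR_${name}"
    # printenv only sees exported variables, which is what Terraform reads
    if [ -z "$(printenv "$var")" ]; then
      echo "missing: ${var}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call before running Terraform would be `require_tf_vars tenancy_ocid user_ocid fingerprint region || exit 1`, which lists the four names the provider block reads.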

The Terraform project structure typically involves three primary files: terraform.tfvars for secret and environment-specific data, provider.tf to define the OCI provider, and main.tf to define the resources.

The provider.tf file initializes the connection to Oracle:

```hcl
provider "oci" {
  tenancy_ocid     = var.tenancy_ocid
  user_ocid        = var.user_ocid
  private_key_path = var.private_key_path
  fingerprint      = var.fingerprint
  region           = var.region
}
```

The main.tf file leverages modules to deploy the K3s cluster. By using a module, such as github.com/garutilorenzo/k3s-oci-cluster, the user can specify the cluster name, token, and public IP CIDR without writing hundreds of lines of resource definitions.

```hcl
module "k3s_cluster" {
  source = "github.com/garutilorenzo/k3s-oci-cluster"

  region              = var.region
  availability_domain = "<change_me>"
  compartment_ocid    = var.compartment_ocid
  my_public_ip_cidr   = "<change_me>"
  cluster_name        = "<change_me>"
  environment         = "staging"
  k3s_token           = "<change_me>"
}
```

The deployment is executed through the standard Terraform workflow: terraform init to download providers and modules, followed by terraform apply to provision the hardware. This method ensures that the cluster is reproducible and can be destroyed and recreated in minutes.
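The workflow can be wrapped in a short function that stops at the first failing step. This is a sketch rather than a prescribed procedure: the function name `tf_deploy` is hypothetical, and the intermediate `terraform plan` step with a saved plan file is an addition to the two commands named above, included so the changes can be reviewed before they are applied.

```shell
# Sketch of an init -> plan -> apply sequence; aborts at the first failure.
# The optional argument is the plan file name (defaults to "tfplan").
tf_deploy() {
  local plan_file="${1:-tfplan}"
  terraform init -input=false || return 1
  terraform plan -input=false -out="$plan_file" || return 1
  # Applying a saved plan executes exactly what was shown by the plan step
  terraform apply -input=false "$plan_file"
}
```

Tearing the cluster down again is a matter of running `terraform destroy` in the same directory.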

Advanced Cluster Architecture and GitOps Integration

Beyond basic installation, a sophisticated K3s deployment on OCI incorporates modern cloud-native patterns. A high-maturity setup utilizes a 3-node Ampere A1 ARM64 cluster to maximize the 24 GB RAM allocation while maintaining redundancy.

In a professional-grade architecture, the traffic flow is engineered for security and scalability. The journey of a request begins at Cloudflare DNS, which directs traffic to an OCI Network Load Balancer. From there, the traffic hits an Ingress/NAT node before reaching the K3s server and worker nodes. This layout isolates the internal cluster IP space from the public internet, reducing the attack surface.

To manage applications, GitOps is implemented via Argo CD. Argo CD continuously monitors a Git repository and ensures that the state of the K3s cluster matches the configuration defined in the code. This eliminates the need for manual kubectl apply commands and provides a clear audit trail of all changes.
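As an illustration of what Argo CD watches, the function below prints an Application manifest with automated sync enabled. It is a minimal sketch: the function name is hypothetical, the repository URL, path, and namespace are supplied by the caller, and it assumes Argo CD is installed in the `argocd` namespace.

```shell
# Print an Argo CD Application manifest for one Git-tracked app.
# Arguments: app name, Git repo URL, path inside the repo, target namespace.
argocd_application() {
  local name="$1" repo_url="$2" path="$3" namespace="$4"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ${repo_url}
    targetRevision: HEAD
    path: ${path}
  destination:
    server: https://kubernetes.default.svc
    namespace: ${namespace}
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
EOF
}
```

The output can be piped straight into the cluster, for example `argocd_application demo '<GIT_REPO_URL>' apps/demo demo | kubectl apply -f -`. With `selfHeal` enabled, manual changes made with kubectl are reverted to the state in Git.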

The networking layer is further enhanced using the Kubernetes Gateway API and Envoy Gateway. Instead of relying on the default Traefik ingress provided by K3s, Envoy Gateway offers more granular control over traffic routing. Security is automated through cert-manager, which interfaces with Let's Encrypt to provide automatic TLS certificates. DNS records are updated automatically via ExternalDNS, which syncs Kubernetes services with Cloudflare records.
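The certificate side of this setup can be sketched as a cert-manager ClusterIssuer that solves Let's Encrypt DNS-01 challenges through Cloudflare. The function name, the issuer name `letsencrypt-prod`, and the Secret name and key holding the Cloudflare API token are assumptions for illustration; the Secret is expected to exist already in the cert-manager namespace.

```shell
# Print a ClusterIssuer that uses Let's Encrypt with a Cloudflare DNS-01 solver.
# Arguments: contact email, name of the Secret holding the Cloudflare API token.
letsencrypt_cluster_issuer() {
  local email="$1" token_secret="${2:-cloudflare-api-token}"
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ${email}
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: ${token_secret}
              key: api-token
EOF
}
```

DNS-01 is a reasonable fit here because the challenge is answered through the Cloudflare API, so certificates can be issued even before the load balancer is reachable from the internet.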

Secret management is handled outside of the cluster to avoid storing sensitive data in Git. OCI Vault is used to store secrets, which are then synced into the Kubernetes namespace using External Secrets. This creates a secure chain of custody for API keys and database passwords.
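The sync step can be illustrated with an ExternalSecret resource, which tells the External Secrets operator which remote entry to copy into a Kubernetes Secret. This sketch assumes a ClusterSecretStore backed by OCI Vault has already been configured; its name is passed in by the caller, and the function name and the `password` key are hypothetical. The `v1beta1` API version shown may need adjusting to whatever the installed operator release serves.

```shell
# Print an ExternalSecret that copies one remote secret into a namespace.
# Arguments: secret name, namespace, ClusterSecretStore name, remote secret key.
external_secret() {
  local name="$1" namespace="$2" store="$3" remote_key="$4"
  cat <<EOF
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: ${store}
  target:
    name: ${name}
  data:
    - secretKey: password
      remoteRef:
        key: ${remote_key}
EOF
}
```

Only this manifest, which contains a reference and no secret value, needs to live in Git.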

Troubleshooting Common K3s Deployment Failures

Deploying K3s on OCI is not without its challenges. One of the most frequent issues encountered by users involves networking and route failures during the master node installation.

A common error manifests as dial tcp 10.43.0.1:443: connect: no route to host. This typically occurs when the K3s pods—specifically coredns and metrics-server—cannot communicate with the Kubernetes API server. In these instances, the kubectl get pods -A command reveals that pods are stuck in CrashLoopBackOff or Error states.
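On a cluster with many system pods, a small filter makes the unhealthy ones stand out. The helper below is a sketch with a hypothetical name; it relies only on the standard column order of `kubectl get pods -A --no-headers` (namespace, name, ready, status).

```shell
# Read `kubectl get pods -A --no-headers` output on stdin and print
# "namespace/pod status" for every pod that is not Running or Completed.
failing_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage: `kubectl get pods -A --no-headers | failing_pods`. The logs of any pod it lists can then be read with `kubectl -n <namespace> logs <pod>`.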

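Analyzing the logs of a failing pod, such as coredns, often shows the same "no route to host" error. The address 10.43.0.1 is the cluster-internal service IP of the Kubernetes API, taken from the default K3s service CIDR (10.43.0.0/16). The error is frequently caused by the OCI Virtual Cloud Network (VCN) security lists or the local OS firewall (iptables/ufw) blocking traffic to port 443 on that address, or blocking traffic between the service CIDR and the pod network (10.42.0.0/16 by default).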
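One way to test the firewall hypothesis is to admit cluster traffic explicitly on each node. The function below is a sketch, not a verified fix: it only prints iptables commands so they can be reviewed first. The function name is hypothetical, the node subnet is whatever CIDR the VCN subnet uses, and the pod and service CIDRs default to the K3s defaults (10.42.0.0/16 and 10.43.0.0/16). The ports are the K3s API (6443/tcp), flannel VXLAN (8472/udp), and the kubelet (10250/tcp).

```shell
# Print (do not apply) iptables commands that admit K3s traffic on a node.
# Arguments: node subnet CIDR, optional pod CIDR, optional service CIDR.
k3s_firewall_rules() {
  local node_cidr="$1" pod_cidr="${2:-10.42.0.0/16}" svc_cidr="${3:-10.43.0.0/16}"
  # Node-to-node traffic: API server, flannel VXLAN overlay, kubelet
  echo "iptables -I INPUT -s ${node_cidr} -p tcp --dport 6443 -j ACCEPT"
  echo "iptables -I INPUT -s ${node_cidr} -p udp --dport 8472 -j ACCEPT"
  echo "iptables -I INPUT -s ${node_cidr} -p tcp --dport 10250 -j ACCEPT"
  # Traffic originating from pods and cluster services
  echo "iptables -I INPUT -s ${pod_cidr} -j ACCEPT"
  echo "iptables -I INPUT -s ${svc_cidr} -j ACCEPT"
}
```

After reviewing the output of `k3s_firewall_rules '<NODE_SUBNET_CIDR>'`, the commands can be run with sudo. If the error persists, the FORWARD chain and the VCN security list rules for the same ports are the next places to look.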

Another point of failure is the use of the --docker flag during installation. While some users prefer Docker over the default containerd, this can introduce compatibility issues with certain K3s versions or OCI image configurations. If pods are crashing repeatedly during the installation of the master node via the curl -sfL https://get.k3s.io | sh -s - server --docker command, it is recommended to test the installation without the docker flag to determine if the container runtime is the source of the instability.

Storage Solutions and Data Persistence

In a distributed K3s cluster, managing persistent storage across multiple nodes is a significant hurdle. Since OCI instances use block storage that is tied to a specific instance, a standard PersistentVolume (PV) cannot simply migrate from one node to another.

To solve this, Longhorn is often deployed. Longhorn is a cloud-native hyper-converged storage system that turns local block storage on each OCI instance into a distributed storage pool. By replicating data across multiple nodes, Longhorn ensures that if one node fails, the volume remains available on other nodes in the cluster. This is essential for stateful applications like databases or content management systems that cannot rely on the ephemeral storage of a VM.

The interaction between Longhorn and OCI block storage allows for the creation of ReadWriteOnce (RWO) and ReadWriteMany (RWX) volumes. This means that a pod can be rescheduled to a different Ampere A1 node without losing its data, providing the high availability typically associated with paid enterprise Kubernetes services.
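From the application's point of view, using Longhorn comes down to naming its StorageClass in a PersistentVolumeClaim. The sketch below prints such a claim; the function name is hypothetical, and it assumes Longhorn was installed with its default StorageClass name, `longhorn`.

```shell
# Print a PersistentVolumeClaim backed by the Longhorn StorageClass.
# Arguments: claim name, size (default 5Gi), access mode (default ReadWriteOnce).
longhorn_pvc() {
  local name="$1" size="${2:-5Gi}" mode="${3:-ReadWriteOnce}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ${mode}
  storageClassName: longhorn
  resources:
    requests:
      storage: ${size}
EOF
}
```

For a shared volume, the third argument can be set to `ReadWriteMany`, for example `longhorn_pvc shared-data 10Gi ReadWriteMany | kubectl apply -f -`.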

Resource Optimization and Pre-flight Considerations

To successfully maintain "Always Free" status, users must be vigilant about resource consumption. Oracle's free tier has strict limits; exceeding these can lead to instances being stopped or terminated.

A critical pre-flight check is ensuring the chosen region has sufficient ARM capacity. Because the Ampere A1 instances are highly sought after, some regions frequently report "Out of Capacity" errors. Users should be prepared to try multiple regions or use automation scripts to poll for availability.
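The polling approach can be sketched as a generic retry helper that reruns a provisioning command until it succeeds or a maximum number of attempts is reached. The function name is hypothetical, and the attempt count and delay are the caller's choice.

```shell
# Run a command until it succeeds, up to max_attempts, sleeping between tries.
# Usage: retry_until_success <max_attempts> <delay_seconds> <command...>
retry_until_success() {
  local max_attempts="$1" delay_seconds="$2" attempt=1
  shift 2
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
  return 0
}
```

For example, `retry_until_success 30 120 terraform apply -auto-approve` would retry every two minutes for up to an hour. A generous delay is advisable so the loop does not hammer the OCI API.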

Furthermore, users must be aware of the trial period transition. OCI typically grants a 30-day trial period with additional credits. Any "paid" resources deployed during this time that are not marked as "Always Free" will be terminated or hibernated once the trial expires. It is vital to ensure that the instance shapes selected are explicitly labeled as "Always Free Eligible."

The following checklist serves as a final verification before deployment:

  • Verify the region has available Ampere A1 capacity.
  • Ensure the SSH public key is correctly uploaded to OCI and the private key is stored securely.
  • Confirm that the VCN security lists allow traffic on port 6443 (K3s API) and port 80/443 (HTTP/S).
  • Check that the total OCPU count across all ARM instances does not exceed 4.
  • Check that the total RAM across all ARM instances does not exceed 24 GB.
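The last two checks are simple arithmetic and can be scripted. The sketch below sums a planned set of Ampere A1 nodes and compares the totals with the 4 OCPU / 24 GB allowance; the function name and the `ocpus:memory_gb` argument format are assumptions for illustration, and only whole numbers are supported.

```shell
# Check a planned set of A1 nodes against the 4 OCPU / 24 GB allowance.
# Each argument is "<ocpus>:<memory_gb>" with integer values, e.g. "2:12".
check_a1_budget() {
  local total_ocpus=0 total_mem=0 spec
  for spec in "$@"; do
    total_ocpus=$((total_ocpus + ${spec%%:*}))   # part before the colon
    total_mem=$((total_mem + ${spec##*:}))       # part after the colon
  done
  echo "total: ${total_ocpus} OCPU, ${total_mem} GB"
  # Succeed only when both totals fit inside the free allowance
  [ "$total_ocpus" -le 4 ] && [ "$total_mem" -le 24 ]
}
```

The two-node layout described earlier passes: `check_a1_budget 2:12 2:12` prints `total: 4 OCPU, 24 GB` and returns success, while adding a third `1:6` node to that layout makes the check fail.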

Conclusion: The Strategic Value of K3s on OCI

The deployment of K3s on Oracle Cloud Infrastructure represents a sophisticated synthesis of efficiency and power. By leveraging the Ampere A1 ARM64 architecture, users can move beyond the limitations of single-node "micro-k8s" setups and enter the realm of true multi-node orchestration. The ability to split 24 GB of RAM and 4 OCPUs across a master and multiple worker nodes provides a platform capable of handling legitimate workloads, from CI/CD runners to complex microservices architectures.

The transition from manual setup to Infrastructure as Code via Terraform marks the professionalization of the environment. By codifying the network, compute, and K3s installation, the cluster becomes an asset that can be versioned, audited, and scaled. The addition of GitOps through Argo CD and automated networking via the Gateway API and cert-manager transforms the cluster into a mirror of modern enterprise Kubernetes environments, providing invaluable experience in managing the lifecycle of cloud-native applications.

Ultimately, the success of a K3s cluster on OCI depends on the balance between resource allocation and architectural choices. Whether utilizing an AMD node for a dedicated MySQL backend to preserve ARM memory or implementing Longhorn for distributed storage, the flexibility of the OCI Always Free tier allows for an exhaustive exploration of Kubernetes. This environment proves that high-quality, scalable orchestration does not require a massive budget, but rather a deep understanding of resource constraints and a commitment to automated, reproducible infrastructure.

Sources

  1. Sundeep Machado - Free Kubernetes Cluster on Oracle Cloud
  2. Sudhanva - K3s on Oracle Cloud Always Free
  3. GitHub Issue k3s-io/k3s#2046
  4. GitHub r0b2g1t/k3s-cluster-on-oracle-cloud-infrastructure
  5. Garuti Lorenzo - Deploy Kubernetes for Free Oracle Cloud
