Synchronizing K3s Cluster Persistence via NFS CSI and Subdir External Provisioners

The fundamental challenge of stateful applications in Kubernetes, and specifically in the lightweight K3s distribution, is that container storage is ephemeral unless it is backed by a persistent volume. By default, K3s ships the local-path provisioner. It is efficient for single-node deployments and for applications that do not need data mobility, but it binds each volume to a directory on one specific node's disk. In a multi-node cluster this is a serious limitation: a pod scheduled to Node A cannot access data stored on Node B. If the node holding the data fails, the pod either cannot be rescheduled elsewhere or starts without its persistent state, and the application is effectively broken.

To resolve this, architects implement Network File System (NFS) integration. By leveraging an external NFS server, the storage is decoupled from the compute nodes. This transition transforms the storage architecture from "node-local" to "cluster-wide," allowing any pod, regardless of which node it resides on, to mount the same volume. This is critical for high-availability (HA) deployments where pods must be able to drift across the cluster without data loss. Achieving this requires the deployment of a Container Storage Interface (CSI) driver or an external provisioner that can dynamically create directories on the NFS server and map them to Persistent Volume Claims (PVCs) within the K3s control plane.

The Architectural Prerequisites for NFS Integration

Before initiating the software configuration within the cluster, the underlying infrastructure must be prepared. K3s does not provide the NFS server itself; it acts as the client that consumes the exported storage.

The NFS Server Requirement
A pre-existing NFS server is a non-negotiable requirement for this configuration. This server can take several forms depending on the scale of the operation:
- Network Attached Storage (NAS): Commercial NAS devices often have NFS support built-in, requiring only the enablement of the service and the creation of a shared folder.
- Dedicated Linux Server: A standalone machine running a Linux distribution configured as an NFS server.
- Virtual Machine: A lightweight VM dedicated to file serving.

For those deploying a manual Linux-based NFS server on Debian-based systems such as Raspberry Pi OS or Ubuntu, the nfs-kernel-server package is required. It provides the user-space tools and service definitions for the kernel's NFS server, which exports local directories over the network. The installation command is:

```bash
sudo apt install nfs-kernel-server
```

Installing the package sets up the NFS service, which listens for mount requests from the K3s nodes. Without it, mount attempts from the nodes typically fail with "Connection refused" or time out. A "Permission denied" or "access denied" error usually points to the export configuration instead.
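On a self-managed Linux server, the export itself still has to be declared in /etc/exports. The following is a minimal sketch, assuming a hypothetical export directory /srv/nfs/k3s and a hypothetical node subnet 192.168.1.0/24 (neither comes from the original setup). The helper name exports_entry is also made up for illustration, and the options shown are common defaults rather than a recommendation for every environment.

```bash
# Hypothetical values: adjust to the real export directory and node subnet.
EXPORT_DIR="/srv/nfs/k3s"
CLIENT_CIDR="192.168.1.0/24"

# Print one /etc/exports entry in the form: <directory> <clients>(<options>)
exports_entry() {
  printf '%s %s(rw,sync,no_subtree_check)\n' "$1" "$2"
}

# To apply on the NFS server (requires root), one could run:
#   sudo mkdir -p "$EXPORT_DIR"
#   exports_entry "$EXPORT_DIR" "$CLIENT_CIDR" | sudo tee -a /etc/exports
#   sudo exportfs -ra    # re-read /etc/exports without restarting the service
```

Restricting the export to the cluster's subnet, rather than using a wildcard, keeps the share from being mountable by unrelated hosts.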

The NFS Client Configuration
Every node in the K3s cluster (both the server/control-plane and any worker agents) must be capable of mounting NFS shares. This requires specific client-side utilities. On Ubuntu or similar Debian-based distributions, the nfs-common package must be installed.

```bash
sudo apt install nfs-common
```

nfs-common is needed because the standard mount command delegates NFS mounts to the mount.nfs helper, which this package provides. If it is missing on any node, the kubelet cannot attach the volume to pods scheduled there, and the mount fails with an error such as "bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program".

Pre-Flight Connectivity Verification

Before applying Kubernetes manifests, it is a technical best practice to verify that the network path between the K3s node and the NFS server is open and that the server is correctly exporting the desired path.

Validating Share Availability
The showmount utility queries the NFS server for its exported directories. A successful reply confirms that the server is reachable and lists each export together with the clients allowed to mount it.

```bash
showmount -e <server>
```

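If this command fails, the cause is usually the NFS server's /etc/exports file or a firewall blocking port 2049 (NFS) or 111 (rpcbind). Note that showmount relies on the server's rpcbind and mountd services, so a server that offers only NFSv4 may not answer it even though mounts work.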
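The export list can also be checked mechanically. The sketch below assumes the usual showmount -e output, where the first line is a header and each following line starts with an exported path. The function name export_listed and the host name in the usage comment are placeholders, not part of the original guide.

```bash
# Succeeds if the given path appears in `showmount -e` output read from stdin.
# Line 1 of that output is a header ("Export list for <server>:"), so skip it.
export_listed() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Usage on a K3s node (host and path are placeholders):
#   showmount -e nas.example.internal | export_listed /volume1/k8s-nfs \
#     && echo "export is visible" || echo "export not listed"
```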

Manual Mount Testing
To eliminate Kubernetes-level variables, a manual mount should be performed on the node. This process involves creating a temporary directory and attempting to bind the remote share to it.

```bash
mkdir -p /tmp/nfscheck
sudo mount -t nfs <server>:<path> /tmp/nfscheck
```

Once mounted, the user should verify the disk space and connection using the df -h command:

```bash
df -h /tmp/nfscheck
```

A successful output will show the filesystem originating from the NAS or server FQDN (e.g., nas.fqdn.here:/volume1/k8s-nfs) and the total capacity available. Once verified, the mount must be cleaned up to prevent stale handles:

```bash
sudo umount /tmp/nfscheck
```
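The manual steps confirm that the share mounts, but not that it is writable, which is what the provisioner will need. The following is a minimal sketch of a write probe that could be run while the share is still mounted, that is, before the umount step. The function name can_write and the probe file name are illustrative assumptions.

```bash
# Succeeds if a file can be created and then removed inside the given directory.
can_write() {
  local dir probe
  dir=$1
  probe="$dir/.nfs-write-test.$$"
  touch "$probe" 2>/dev/null && rm -f "$probe"
}

# Usage while the share is mounted:
#   can_write /tmp/nfscheck && echo "writable" || echo "read-only or permission denied"
```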

Deploying the NFS Subdir External Provisioner via Helm

K3s includes a built-in Helm controller that allows administrators to deploy Helm charts by simply placing a manifest file in a specific directory. This eliminates the need to install the Helm binary manually on the host.

The Helm Controller Mechanism
The K3s Helm controller monitors the directory /var/lib/rancher/k3s/server/manifests/. When a HelmChart resource is detected here, K3s automatically installs and manages the lifecycle of that chart.

To deploy the NFS subdir external provisioner, a manifest file (e.g., helm-controller.yaml or nfs-controller.yml) must be created.

Configuration Variant A: Dedicated Namespace
For a clean architectural separation, the provisioner can be placed in its own namespace. The following configuration creates the nfs namespace and deploys the nfs-subdir-external-provisioner from the official Kubernetes SIGs repository.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: nfs
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: nfs
  namespace: nfs
spec:
  chart: nfs-subdir-external-provisioner
  repo: https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
  targetNamespace: nfs
  set:
    nfs.server: x.x.x.x
    nfs.path: /exported/path
    storageClass.name: nfs
```

Configuration Variant B: Default Namespace with Mount Options
In some environments, specific NFS versions are required for compatibility with older NAS devices. The valuesContent block allows for the definition of mountOptions, such as forcing NFS version 3.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: default
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: nfs
  namespace: default
spec:
  chart: nfs-subdir-external-provisioner
  repo: https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
  targetNamespace: default
  set:
    storageClass.name: nfs
  valuesContent: |-
    nfs:
      server: 192.168.68.118
      path: /i-data/f01e5fea/nfs/k3s
      mountOptions:
        - nfsvers=3
```

The nfs-subdir-external-provisioner is a critical component because it automates the creation of subdirectories on the NFS share. Instead of the administrator manually creating a folder for every volume, the provisioner creates a unique directory for every Persistent Volume Claim (PVC), preventing data collisions between different applications.

Storage Class Management and Defaulting

Once the Helm chart is applied, the K3s cluster will register a new StorageClass. The StorageClass acts as the "recipe" that the cluster uses to provision storage.

Verifying Storage Class Creation
Once the Helm controller has pulled the images and started the provisioner pod, which can take a few minutes, the available storage classes can be listed:

```bash
kubectl get storageclasses
```

The output will typically show two entries:

```
NAME                   PROVISIONER                                         RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION
local-path (default)   rancher.io/local-path                               Delete          WaitForFirstConsumer   false
nfs                    cluster.local/nfs-nfs-subdir-external-provisioner   Delete          Immediate              true
```

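The "Immediate" binding mode for the NFS class is significant. Unlike WaitForFirstConsumer, which delays volume creation until a pod using the claim is scheduled, Immediate means the NFS provisioner creates the volume as soon as the PVC is submitted. That is safe here because an NFS volume is reachable from every node, so there is no placement decision to wait for.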
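Rather than waiting a fixed amount of time for the chart to deploy, the check can be polled. The sketch below is one way to do that. The function name wait_for_storageclass and its default of 30 attempts, 10 seconds apart, are arbitrary choices for illustration.

```bash
# Poll until the named StorageClass exists, or give up after N attempts.
# Arguments: <name> [attempts] [seconds between attempts]
wait_for_storageclass() {
  local name attempts delay i
  name=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if kubectl get storageclass "$name" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_storageclass nfs && kubectl get storageclasses
```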

Setting NFS as the Default Storage Class
By default, K3s uses local-path. If an administrator wants all PVCs to use NFS unless otherwise specified, the default storage class must be changed. This is done by annotating the storage classes.

To remove the default status from the local-path provisioner:
```bash
kubectl annotate storageclass local-path storageclass.kubernetes.io/is-default-class=false --overwrite
```

To set the nfs storage class as the default:
```bash
kubectl annotate storageclass nfs storageclass.kubernetes.io/is-default-class=true --overwrite
```

This configuration change ensures that any deployment requesting storage without an explicit storageClassName will automatically be provisioned on the NFS server.
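After changing the annotations, it is worth confirming that exactly one class is marked as default. The sketch below relies on the "(default)" marker that kubectl prints next to the class name in its table output. The function name default_storageclasses is a hypothetical helper.

```bash
# Print the name of every StorageClass that kubectl marks as "(default)".
default_storageclasses() {
  kubectl get storageclass --no-headers | awk '$2 == "(default)" { print $1 }'
}

# Usage: the output should be the single line "nfs".
#   default_storageclasses
```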

Persistent Volume Claim (PVC) Implementation

A Persistent Volume Claim is a request for storage by a user or application. It specifies the size and access mode required.

Creating the PVC Manifest
To test the NFS integration, a PVC manifest must be created. The accessModes should be set to ReadWriteOnce (RWO) or ReadWriteMany (RWX). One of the primary advantages of NFS is the support for ReadWriteMany, allowing multiple pods on different nodes to read and write to the same volume simultaneously.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfsclaim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: nfs
  resources:
    requests:
      storage: 100Mi
```

Applying the PVC:
```bash
kubectl apply -f pvc.yaml
```
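The manifest above requests ReadWriteOnce. To exercise the shared-access case described earlier, a second claim can request ReadWriteMany instead. The following is a minimal sketch; the claim name nfsclaim-shared and the wrapper function rwx_pvc_manifest are hypothetical and not part of the original setup.

```bash
# Print a PVC manifest that requests a volume mountable read-write by many pods.
rwx_pvc_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfsclaim-shared   # hypothetical name, chosen for this example
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs
  resources:
    requests:
      storage: 100Mi
EOF
}

# Usage:
#   rwx_pvc_manifest | kubectl apply -f -
```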

Verification of the Binding Process
When the PVC is applied, the NFS provisioner detects the request, creates a directory on the NFS server (e.g., /exported/path/pvc-bdd42e9b...), and creates a corresponding Persistent Volume (PV) object in K3s.

The status check command:
```bash
kubectl get pvc nfsclaim
```

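The expected output shows the status as Bound, confirming that the provisioner created the backing directory and that the claim is bound to its PV. Note that the nfs-subdir-external-provisioner does not enforce the requested 100Mi: the size is recorded on the PV, but the volume can grow until the NFS export itself is full.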
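For scripted checks, the phase can be read directly instead of parsing the table. This sketch assumes the claim lives in the current namespace; pvc_phase is a hypothetical helper name.

```bash
# Print the phase of a PVC (for example "Pending" or "Bound").
pvc_phase() {
  kubectl get pvc "$1" -o jsonpath='{.status.phase}'
}

# Usage:
#   [ "$(pvc_phase nfsclaim)" = "Bound" ] && echo "claim is bound"
#
# Recent kubectl versions can also block until the claim is bound:
#   kubectl wait --for=jsonpath='{.status.phase}'=Bound pvc/nfsclaim --timeout=60s
```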

Practical Application: Connecting Pods to NFS Storage

The final stage of the deployment is mounting the provisioned volume into a running container. This is achieved by referencing the claimName within the pod's volume specification.

Example Nginx Deployment with NFS
The following manifest deploys an Nginx pod where the default HTML directory is mapped to the NFS volume. This allows the administrator to upload a website to the NFS server, and the Nginx pod will serve it regardless of which node the pod is running on.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs-pod
spec:
  containers:
    - name: nfs-pod
      image: nginx
      volumeMounts:
        - name: nfs-volume
          mountPath: /usr/share/nginx/html
  volumes:
    - name: nfs-volume
      persistentVolumeClaim:
        claimName: nfsclaim
```

In this configuration, the volumeMounts section maps the internal container path /usr/share/nginx/html to the volume named nfs-volume, which is backed by the nfsclaim PVC.
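A quick way to confirm the mount works end to end is to write a file through the pod and read it back. The following is a rough sketch; smoke_test_pod is a hypothetical helper, and the test string is arbitrary. If the write fails with a permission error, the notes under Permissions and Ownership apply.

```bash
# Write a test page into the mounted volume, then read it back through the pod.
smoke_test_pod() {
  local pod
  pod=$1
  kubectl exec "$pod" -- sh -c 'echo "served from NFS" > /usr/share/nginx/html/index.html' &&
    kubectl exec "$pod" -- cat /usr/share/nginx/html/index.html
}

# Usage:
#   smoke_test_pod nfs-pod
# The same index.html should then be visible in the PVC's directory on the NFS server.
```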

Comparative Analysis of Storage Strategies in K3s

The choice between local-path and NFS involves a trade-off between latency and availability.

Local-Path Storage
- Performance: High, since reads and writes go straight to the node's local disk (NVMe, SSD, or an SD card on a Raspberry Pi) with no network hop.
- Reliability: Low for pods. If the node dies, the data is inaccessible until the node recovers.
- Use Case: Database logs, scratch space, or single-node home labs.

NFS Storage
- Performance: Limited by network bandwidth (1Gbps or 10Gbps) and NFS protocol overhead.
- Reliability: High for pods. The data exists independently of the K3s nodes.
- Use Case: CMS uploads, shared configuration files, and high-availability application state.

Advanced Considerations and Troubleshooting

Despite the streamlined setup via Helm, several operational hurdles can occur.

Permissions and Ownership
A common issue with NFS in Kubernetes is the "Permission denied" error. It occurs when the user inside the container (e.g., the nginx user in the official Nginx image) has a different UID/GID from the owner of the directory on the NFS server. Options include aligning the directory ownership on the server with the container's UID/GID, exporting with no_root_squash in /etc/exports (which only helps containers running as root, and lets them act as root on the share), or, in a private environment where security allows, setting the directories to mode 777.
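Diagnosing a mismatch starts with comparing numeric IDs on both sides. The sketch below assumes GNU stat, as found on Debian-based servers; owner_ids is a hypothetical helper name.

```bash
# Print the numeric owner and group of a path as "uid:gid" (GNU stat syntax).
owner_ids() {
  stat -c '%u:%g' "$1"
}

# Usage on the NFS server, against the directory created for a PVC:
#   owner_ids /exported/path/<pvc-directory>
# Compare with the identity the container runs as:
#   kubectl exec nfs-pod -- id
```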

Network Latency and Timeouts
In unstable network environments, NFS mounts can hang, leaving pods stuck in the Terminating state because the client keeps retrying a server that no longer responds. Adding the soft mount option, or tuning the timeo and retrans parameters in the mountOptions section of the Helm chart, can mitigate this. Soft mounts trade hangs for I/O errors, so they carry a risk of incomplete writes and are best avoided for data that must not be corrupted.
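Candidate options can be tried with a manual mount on a node before they are written into the chart's mountOptions. The values below (timeo=100, retrans=3) are illustrative assumptions rather than tuned recommendations, and mount_opts is a hypothetical helper.

```bash
# Build an NFS option string for a soft mount.
# timeo is in tenths of a second; retrans is the number of retries
# before a soft mount returns an error to the application.
mount_opts() {
  printf 'soft,timeo=%s,retrans=%s' "$1" "$2"
}

# Usage on a node (requires root):
#   sudo mount -t nfs -o "$(mount_opts 100 3)" <server>:<path> /tmp/nfscheck
# The equivalent mountOptions list entries would be: soft, timeo=100, retrans=3
```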

Backup and Recovery
Because all PVC data is centralized on the NFS server, the backup strategy shifts from "backing up multiple nodes" to "backing up one server." Administrators can use standard Linux tools like rsync or tar to copy or archive the NFS export directory, which captures every PVC in the cluster in one pass.
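A minimal sketch of the tar approach follows, assuming it runs on the NFS server with read access to the export. The function name backup_export and the archive naming scheme are illustrative. Files that are being written while the archive is created may be captured in an inconsistent state, so this is not a substitute for application-level backups of databases.

```bash
# Archive the contents of an export directory into a timestamped tarball.
# Arguments: <export directory> <destination directory>
# Prints the path of the archive on success.
backup_export() {
  local src dest archive
  src=$1
  dest=$2
  archive="$dest/nfs-export-$(date +%Y%m%d-%H%M%S).tar.gz"
  tar -czf "$archive" -C "$src" . && printf '%s\n' "$archive"
}

# Usage (paths are placeholders):
#   backup_export /exported/path /backups
```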

Conclusion

The integration of NFS into a K3s cluster represents a critical evolution from a simple test environment to a production-ready stateful architecture. By replacing the restrictive local-path provisioner with the nfs-subdir-external-provisioner, administrators effectively solve the pod-scheduling deadlock associated with local volumes. The process, facilitated by the K3s built-in Helm controller, allows for a declarative approach to storage management, ensuring that infrastructure changes can be version-controlled via manifests.

The transition to NFS allows for the implementation of ReadWriteMany (RWX) volumes, which is a prerequisite for many modern microservices architectures that require shared access to assets. While this introduces a dependency on network stability and requires careful management of NFS server permissions, the benefit of seamless pod mobility across a Raspberry Pi or VM cluster far outweighs the overhead. The ultimate result is a resilient, scalable, and flexible storage layer that ensures application persistence regardless of the underlying compute node's health.
