Implementing High Availability NFS Storage Architectures in Kubernetes Environments

The integration of Network File System (NFS) protocols within Kubernetes clusters represents a foundational approach to solving the challenge of persistent, shared storage. While the technology itself is a long-standing standard in networked environments, its application within modern container orchestration platforms is critical for workloads requiring specific data access patterns. In a Kubernetes ecosystem, the primary driver for adopting NFS is the requirement for ReadWriteMany (RWX) access modes. Unlike block storage, which is typically restricted to ReadWriteOnce (RWO) for a single node at a time, NFS allows multiple pods across different nodes to simultaneously read and write to the same underlying file system. This capability is non-negotiable for complex distributed applications, such as content management systems, shared media processing pipelines, or collaborative machine learning datasets, where data consistency and availability across a multi-node cluster are paramount.

However, a standard, single-node NFS server is a single point of failure (SPOF). If the host running the NFS service suffers a hardware fault or a kernel panic, every Kubernetes pod that relies on that storage will see its I/O stall or fail, which can lead to application crashes or data corruption. To mitigate this, enterprise-grade architectures replace the single server with a High Availability (HA) NFS cluster. Such a cluster keeps the storage service reachable even if a physical node fails, using replicated data and a floating virtual IP address so that failover is largely transparent to the workloads running in Kubernetes.

The Mechanics of ReadWriteMany and Shared Storage Requirements

In Kubernetes, storage is abstracted through several layers: PersistentVolumes (PV), PersistentVolumeClaims (PVC), and the underlying storage provider. When an application requires a "shared" storage capability, it is specifically requesting the ReadWriteMany access mode.

The impact of this requirement is profound for microservices architecture. In a standard deployment where a pod is scheduled on Node A, if that pod uses a standard block volume, Node B cannot access that same volume. This forces developers to design applications that are either stateless or utilize complex application-level replication. By leveraging NFS, the infrastructure handles the sharing mechanism at the network layer, allowing the application logic to remain simple.

The following table outlines the primary access modes and how NFS facilitates specific Kubernetes requirements:

| Access Mode | Description | NFS Capability | Kubernetes Use Case |
| --- | --- | --- | --- |
| ReadWriteOnce (RWO) | Volume can be mounted as read-write by a single node. | Not the primary strength of NFS | Block storage, databases (standard) |
| ReadOnlyMany (ROX) | Volume can be mounted as read-only by many nodes. | Fully supported | Configuration files, static assets |
| ReadWriteMany (RWX) | Volume can be mounted as read-write by many nodes. | Core functionality of NFS | Shared logs, ML datasets, web assets |

High Availability Architectures for NFS via LINBIT Technologies

To achieve true resilience without specialized hardware, organizations can build software-defined high availability on commercial off-the-shelf (COTS) servers. LINBIT provides a framework for building these clusters from open-source tools and standard hardware, avoiding the vendor lock-in often associated with expensive, proprietary Network Attached Storage (NAS) or Storage Area Network (SAN) appliances.

The core of this high-availability strategy is DRBD (Distributed Replicated Block Device). DRBD acts as a block storage driver that enables synchronous data replication between cluster nodes. Unlike asynchronous replication, which might lead to data loss during a failover, synchronous replication ensures that a write is acknowledged only after it has been committed to both the local device and the remote peer.

The Role of DRBD and Replication Layers

DRBD operates at the block level, making it transparent to the file system sitting above it. When an NFS server is running on top of a DRBD device, the data being exported via NFS is actually being mirrored in real-time to another node in the cluster.

DRBD replicates data synchronously between cluster nodes. As a result, when a node fails, the secondary node already holds an exact, bit-for-bit copy of the data, which prevents data loss and makes the data available immediately. Because replication happens at the block level, the NFS export can move quickly between nodes without a full data resynchronization.
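
As a rough illustration of how synchronous replication is expressed in configuration, the sketch below writes a minimal DRBD 9 resource file to the current directory. The resource name `nfs`, the node names, addresses, port, and backing disk are hypothetical placeholders, not values from a LINBIT reference setup. `protocol C` is DRBD's synchronous mode, and the quorum settings assume a three-node cluster.

```shell
# Write a sketch of a DRBD 9 resource definition to a local file.
# All names, addresses, and devices below are placeholders (assumptions).
cat << EOF > nfs.res
resource nfs {
  net {
    protocol C;            # synchronous: a write completes only after peers confirm it
  }
  options {
    quorum majority;       # a node must see a majority of the cluster to stay writable
    on-no-quorum io-error; # fail I/O rather than risk diverging data
  }
  device    /dev/drbd1000;
  disk      /dev/vg_drbd/nfs;
  meta-disk internal;

  on node-a { node-id 0; address 192.168.222.21:7000; }
  on node-b { node-id 1; address 192.168.222.22:7000; }
  on node-c { node-id 2; address 192.168.222.23:7000; }

  connection-mesh { hosts node-a node-b node-c; }
}
EOF
```

On real cluster nodes a file like this would typically live under /etc/drbd.d/ and be activated with `drbdadm create-md nfs` followed by `drbdadm up nfs`; the exact layout depends on the environment.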

Cluster Resource Managers: DRBD Reactor vs. Pacemaker

Managing the "intelligence" of an HA cluster—deciding when to failover and which node should host the service—requires a Cluster Resource Manager (CRM). LINBIT supports two primary methods for managing HA NFS:

DRBD Reactor: This is a LINBIT-developed CRM. It is a lighter, more streamlined solution. It uses a promoter plugin that watches the quorum state of DRBD devices. If a primary node fails, the promoter plugin automatically promotes the DRBD device on the healthy node and starts the necessary NFS services. It is ideal for users seeking a streamlined, purpose-built management tool.
Pacemaker: This is a community-developed, highly robust CRM. It is more complex and carries a steeper learning curve for configuration and management. However, it is superior for complex environments where strict ordering of services and collocation (ensuring specific services run on the same node) are required.
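
To make the promoter idea more concrete, the following sketch writes a DRBD Reactor promoter snippet to a local file. It is an assumption-laden outline rather than a tested configuration: the resource name `nfs`, the device `/dev/drbd1000`, the ext4 file system, the state directory, and the client network are placeholders, while the export path and virtual IP reuse the example values that appear later in this article. The OCF resource agents and their parameters should be verified against the DRBD Reactor and resource-agents documentation before use.

```shell
# Write a sketch of a DRBD Reactor promoter snippet to a local file.
# Resource name, device, paths, and client network are placeholders (assumptions).
cat << EOF > nfs-promoter.toml
[[promoter]]
[promoter.resources.nfs]
# Services are started in the order listed once this node promotes the DRBD resource.
start = [
  "ocf:heartbeat:Filesystem fs_nfs device=/dev/drbd1000 directory=/drbd/exports fstype=ext4",
  "ocf:heartbeat:nfsserver nfsserver nfs_shared_infodir=/drbd/exports/nfs_info",
  "ocf:heartbeat:exportfs export_nfs_app directory=/drbd/exports/nfs-app clientspec=192.168.222.0/24 fsid=1 options=rw",
  "ocf:heartbeat:IPaddr2 vip ip=192.168.222.25 cidr_netmask=24",
]
EOF
```

On the cluster nodes, a snippet like this would be placed in /etc/drbd-reactor.d/ and picked up after the drbd-reactor service is reloaded or restarted. Listing the virtual IP last means clients are only directed to a node once the file system and export are ready.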

Implementation Strategies: DigitalOcean vs. Self-Managed HA

The method of implementing NFS in Kubernetes depends heavily on the environment, whether it is a managed service like DigitalOcean Kubernetes (DOKS) or a self-managed, high-availability cluster.

DigitalOcean NFS Integration

In managed environments like DOKS, the complexity of the storage backend is abstracted away. Users connect to a DigitalOcean NFS Share, which is pre-configured for high availability by the provider. This is particularly useful for specialized workloads like AI/ML, where multiple pods need access to large training datasets.

To use this in DOKS, the workflow follows a strict sequence:
1. Retrieve connection details: Users must access the Network File Storage page in the DigitalOcean control panel.
2. Identify the Server IP and Mount Path: For an entry like 10.128.0.69:/123456/6160d138-60cb-4e61-9ff3-076eebed5c0f, the IP is 10.128.0.69 and the path is /123456/6160d138-60cb-4e61-9ff3-076eebed5c0f.
3. API Access: For automation, a GET request can be sent to the /v2/nfs endpoint.

The implementation in Kubernetes involves three manual steps:
- Static provisioning of a PersistentVolume (PV).
- Binding the PV to a PersistentVolumeClaim (PVC).
- Mounting the PVC to the target workload.
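
The first two steps can be sketched as a single manifest, assuming the example share address shown above. The object names `do-nfs-pv` and `do-nfs-pvc` and the 50Gi size are hypothetical; an empty `storageClassName` is used here so that the claim binds to the statically created volume rather than triggering dynamic provisioning.

```shell
# Write a static PV and a matching PVC for the example DigitalOcean NFS share.
# Names and capacity are illustrative assumptions.
cat << EOF > do-nfs-static.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: do-nfs-pv            # hypothetical name
spec:
  capacity:
    storage: 50Gi            # assumed size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""       # static binding only
  nfs:
    server: 10.128.0.69
    path: /123456/6160d138-60cb-4e61-9ff3-076eebed5c0f
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: do-nfs-pvc           # hypothetical name
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: do-nfs-pv      # bind explicitly to the PV above
  resources:
    requests:
      storage: 50Gi
EOF
```

Running `kubectl apply -f do-nfs-static.yaml` would create both objects; a workload then references the claim `do-nfs-pvc` in its `volumes` section to complete the third step.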

Self-Managed HA NFS via LINBIT VSAN

For users requiring more control or those who need to integrate non-Kubernetes workloads with their storage, LINBIT VSAN provides an "easy mode." VSAN combines DRBD, LINSTOR, and a web-based front end into a Linux distribution specifically designed for creating HA storage clusters. This allows an administrator to provide persistent storage to a Kubernetes cluster while simultaneously allowing a legacy, non-Kubernetes server on the same network to access the same highly available NFS export.

Technical Workflow: Configuring Persistent Volume Claims (PVC)

When managing NFS in Kubernetes, it is a best practice to define NFS exports as PersistentVolumes (PVs) rather than mounting directly in the Pod spec. This allows for better abstraction and resource management.
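
A PersistentVolume for the HA export might look like the sketch below. It reuses the example virtual IP (192.168.222.25) and export path (/drbd/exports/nfs-app) that appear later in this article; the name `nfs-app-pv` and the `Retain` reclaim policy are assumptions. It also assumes the cluster has no default StorageClass; if it does, the volume and the claim would both need a matching `storageClassName` in order to bind.

```shell
# Write a sketch of a PersistentVolume that points at the HA NFS export.
# The PV name and reclaim policy are illustrative assumptions.
cat << EOF > nfs-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-app-pv           # hypothetical name
spec:
  capacity:
    storage: 4Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.222.25   # virtual IP of the HA NFS cluster
    path: /drbd/exports/nfs-app
EOF
```

Applying it with `kubectl apply -f nfs-pv.yaml` before creating the claim gives the claim a volume of matching size and access mode to bind to.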

Step 1: Defining the PersistentVolumeClaim

To request the shared storage, a user must create a PVC that specifies the ReadWriteMany access mode. The following command creates the necessary YAML manifest:

```bash
cat << EOF > nfs-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-app
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 4Gi
EOF
```

Once the manifest is created, it must be applied to the cluster:

```bash
kubectl apply -f nfs-pvc.yaml
```

To verify the status of the claim, use:

```bash
kubectl get pvc
```

The output should indicate that the status is Bound, ensuring the claim is successfully linked to a provisioned PV.

Step 2: Deploying the Pod for Testing

Once the PVC is active, a Pod can be deployed to test writes to the shared volume. The following manifest uses an Alpine image and a shell loop to append the pod's hostname and the current date to a text file every 30 seconds, simulating a continuous writer on the shared storage.

```bash
cat << EOF > nfs-in-pv.yaml
kind: Pod
apiVersion: v1
metadata:
  name: nfs-in-pv
spec:
  containers:
    - name: nfs-app
      image: alpine
      volumeMounts:
        - name: nfs-volume
          mountPath: /data
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo \$(hostname; date) >> /data/test-file.txt; sleep 30s; done"]
  volumes:
    - name: nfs-volume
      persistentVolumeClaim:
        claimName: nfs-app
EOF
```

Applying this manifest:

```bash
kubectl apply -f nfs-in-pv.yaml
```

Step 3: Validating Data Integrity and Shared Access

To confirm that the pod is writing to the file through the NFS mount, run a tail command against the target file from inside the pod deployed in Step 2:

```bash
kubectl exec nfs-in-pv -- tail /data/test-file.txt
```

The output should show a hostname and timestamp appended every 30 seconds. When several pods mount the same claim, their entries appear interleaved in this one file, which demonstrates that the ReadWriteMany capability is functioning correctly at the storage layer.
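
To actually observe interleaved writes from several pods, one option is to run the same loop from a Deployment with multiple replicas that all mount the `nfs-app` claim. The sketch below is an illustration only; the Deployment name `nfs-writers`, its labels, and the replica count are hypothetical.

```shell
# Write a Deployment whose replicas all append to the same file on the shared claim.
# The Deployment name, labels, and replica count are illustrative assumptions.
cat << EOF > nfs-writers.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-writers          # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nfs-writers
  template:
    metadata:
      labels:
        app: nfs-writers
    spec:
      containers:
        - name: writer
          image: alpine
          command: ["/bin/sh"]
          args: ["-c", "while true; do echo \$(hostname; date) >> /data/test-file.txt; sleep 30s; done"]
          volumeMounts:
            - name: nfs-volume
              mountPath: /data
      volumes:
        - name: nfs-volume
          persistentVolumeClaim:
            claimName: nfs-app
EOF
```

After `kubectl apply -f nfs-writers.yaml`, running `kubectl exec deploy/nfs-writers -- tail /data/test-file.txt` should list entries carrying the hostnames of the different replicas. Whether the replicas land on different nodes depends on the scheduler, so this confirms shared write access rather than a particular placement.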

Networking and the Virtual IP (VIP) Concept

In a High Availability NFS cluster, the storage is not tied to a single physical IP address of a single server. Instead, the services are exposed via a Virtual IP (VIP) address. This VIP "floats" across the cluster nodes.

If the node currently acting as the NFS server fails, the cluster manager (such as Pacemaker or DRBD Reactor) migrates the VIP to the healthy node. To the Kubernetes cluster, this transition appears as a brief network hiccup rather than a permanent storage loss.

In a typical configuration:
- The VIP might be 192.168.222.25/24.
- The exported file system root might be /drbd/exports/.

When defining a Pod that bypasses the PV/PVC abstraction for direct mounting (though not recommended for production), the spec.volumes.nfs.server must point to this VIP.

```bash
cat << EOF > nfs-in-pod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: nfs-in-a-pod
spec:
  containers:
    - name: nfs-app
      image: alpine
      volumeMounts:
        - name: nfs-volume
          mountPath: /data
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo \$(hostname; date) >> /data/test-file.txt; sleep 30s; done"]
  volumes:
    - name: nfs-volume
      nfs:
        server: 192.168.222.25
        path: /drbd/exports/nfs-app
EOF
```

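Note that the inline nfs volume in a Pod specification does not accept custom NFS mount options (such as nolock or soft). Such options have to be set on the server side, in /etc/nfsmount.conf on each Kubernetes node, or through the mountOptions field of a PersistentVolume, which is another reason to prefer the PV/PVC approach for production use.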
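
A PersistentVolume, unlike an inline Pod volume, accepts a `mountOptions` list. The sketch below shows where such options would go; the volume name `nfs-app-tuned` and the specific options are illustrative assumptions, and `nfsvers=4.1` presumes the HA NFS server exports over NFSv4.1. A `hard` mount is shown because it makes clients retry during a failover instead of returning errors to the application.

```shell
# Write a PersistentVolume that carries explicit NFS mount options.
# The PV name and the chosen options are illustrative assumptions.
cat << EOF > nfs-pv-mountopts.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-app-tuned        # hypothetical name
spec:
  capacity:
    storage: 4Gi
  accessModes:
    - ReadWriteMany
  mountOptions:              # applied when a node mounts the export
    - hard
    - nfsvers=4.1
  nfs:
    server: 192.168.222.25
    path: /drbd/exports/nfs-app
EOF
```

Pods that mount a claim bound to this volume inherit the options without any change to their own specification.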

Advanced Storage Orchestration: LINSTOR and CSI

While manual NFS configuration provides clarity, modern DevOps workflows often use the LINSTOR Operator and the Container Storage Interface (CSI) driver. The Operator is designed to manage the lifecycle of LINSTOR and DRBD resources directly through Kubernetes-native APIs.

The introduction of Kubernetes-native RWX support in the LINSTOR Operator (specifically since version 2.10.0) allows for even tighter integration between the storage layer and the container orchestration layer. This removes the need for manual PV/PVC management in many scenarios, enabling dynamic provisioning of high-availability NFS-backed storage.

Analytical Conclusion

The implementation of NFS in Kubernetes is a trade-off between simplicity and reliability. While a standard NFS mount is trivial to implement, it is fundamentally insufficient for production-grade Kubernetes environments due to the inherent lack of high availability in the NFS protocol itself. To bridge this gap, the integration of block-level replication (DRBD) and robust cluster resource management (DRBD Reactor or Pacemaker) is essential.

The architectural decision between a managed service like DigitalOcean NFS, a specialized appliance like LINBIT VSAN, and a custom-built DRBD/Pacemaker cluster depends on the specific requirements of the workload. For AI/ML workloads that need shared access to large datasets with minimal operational effort, a managed service is often the pragmatic choice. For complex enterprise environments that must serve both legacy and containerized workloads under strict data consistency requirements, a self-managed HA NFS cluster built on DRBD is the stronger fit. Ultimately, the transition from "simple NFS" to "Highly Available NFS" is what lets shared storage in Kubernetes move from a development convenience to a resilient, mission-critical production service.
