Rook-Ceph Integration for k3s Environments

The intersection of lightweight Kubernetes distributions and distributed storage systems represents a critical evolution in edge computing and resource-constrained infrastructure. k3s, designed as a slim, single-binary Kubernetes distribution, provides the orchestrator necessary for rapid deployment in IoT nodes, lab servers, and edge environments. However, the utility of a lightweight orchestrator is severely limited without a persistent storage layer that can withstand the volatility of distributed workloads. This is where the integration of Ceph, a highly scalable, open-source distributed storage system, becomes transformative. Ceph provides a unified system for object, block, and file storage, engineered to run on commodity hardware. When managed by Rook—a cloud-native storage orchestrator—Ceph is transformed from a complex set of storage services into a self-managing, self-scaling, and self-healing storage solution.

The integration of Rook-Ceph within a k3s cluster solves the fundamental problem of storage persistence. The local-path provisioner that k3s ships with by default binds each volume to the node where it was created, so a pod that is rescheduled or scaled onto another node cannot reach its data. By deploying Rook-Ceph, the infrastructure gains a resilient storage layer that replicates data across multiple nodes. If a single node fails, the data remains available, and the system heals itself automatically. This combination turns a "just enough" infrastructure into a dependable, production-ready platform.

Architecture of Rook-Ceph on k3s

The deployment of Rook-Ceph on k3s involves a layered architectural approach. At the base is the k3s distribution, which optimizes Kubernetes for smaller footprints. Above this sits Rook, which acts as the intelligence layer. Rook does not provide storage itself; rather, it orchestrates Ceph, automating its deployment and management. Ceph then provides the actual storage pools.

Ceph's architecture is centered on the concept of distributed, self-healing storage pools. It supports three primary types of storage:

Block storage: Provided via Ceph RBD (RADOS Block Device).
File storage: Provided via CephFS.
Object storage: Provided via the RADOS Gateway (RGW).

The connection between k3s and Ceph is mediated by CSI (Container Storage Interface) drivers, which handle provisioning and the mount lifecycle transparently. When a pod references a PersistentVolumeClaim (PVC), the CSI driver provisions a matching volume in Ceph using credentials held in Kubernetes Secrets, and access to those Secrets is governed by Role-Based Access Control (RBAC). The developer simply claims a volume, and Ceph handles the underlying replication (typically three-way) without manual intervention.

k3s-Specific Deployment Considerations

Deploying Rook-Ceph on k3s is not identical to deploying it on standard Kubernetes (k8s) due to the specific optimizations k3s implements to reduce resource consumption. These differences can cause deployment failures if not explicitly addressed.

The most critical difference lies in the data directory structure. Standard Kubernetes uses /var/lib/kubelet, whereas k3s keeps its state under /var/lib/rancher/k3s, and the kubelet directory can end up at /var/lib/rancher/k3s/agent/kubelet. Because the location depends on the k3s version and startup flags, confirm the actual path on a node before changing anything. If the Rook-Ceph CSI plugin is configured with a path the kubelet does not use, the CSI driver cannot find the kubelet directories it needs to register and mount volumes, and workloads stall: Persistent Volume Claims (PVCs) may remain Pending, or the pods that use them may never start.

Furthermore, k3s utilizes an embedded containerd instance rather than a separate container runtime. This means the containerd socket paths and internal data management differ from vanilla Kubernetes. To ensure a successful deployment, administrators must override the default kubelet directory path to match the k3s non-standard location.
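One way to apply the override is sketched below. It assumes Rook was installed from its standard manifests into the rook-ceph namespace, so that the operator reads CSI settings such as `ROOK_CSI_KUBELET_DIR_PATH` from the `rook-ceph-operator-config` ConfigMap. The helper name `kubelet_dir_patch` is illustrative, and the path should be confirmed on a node before the patch is applied.

```bash
#!/usr/bin/env bash
# Sketch: point Rook's CSI driver at the k3s kubelet directory.
# Assumptions: the operator runs in the rook-ceph namespace and reads
# ROOK_CSI_KUBELET_DIR_PATH from the rook-ceph-operator-config ConfigMap.

# Path taken from the text above; confirm it exists on your nodes first.
KUBELET_DIR="${KUBELET_DIR:-/var/lib/rancher/k3s/agent/kubelet}"

# Hypothetical helper: build a JSON merge patch that sets the kubelet directory.
kubelet_dir_patch() {
  printf '{"data":{"ROOK_CSI_KUBELET_DIR_PATH":"%s"}}\n' "$1"
}

# Print the patch so it can be reviewed before it is applied.
kubelet_dir_patch "$KUBELET_DIR"

# To apply it:
# kubectl -n rook-ceph patch configmap rook-ceph-operator-config \
#   --type merge -p "$(kubelet_dir_patch "$KUBELET_DIR")"
```

Helm-based installs of the operator chart expose what should be the same setting as the `csi.kubeletDirPath` value.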

Infrastructure Prerequisites

To establish a functional Rook-Ceph cluster on k3s, specific hardware and software prerequisites must be met to ensure stability and performance.

The cluster must consist of at least 3 nodes. This minimum is required to support the replication factor typically used in Ceph. If a cluster has fewer than three nodes, achieving the standard three-way replication of data is impossible, which compromises the self-healing and high-availability nature of the system.

Storage requirements for each node include:

Unformatted raw disks or partitions: These are required for OSDs (Object Storage Daemons).
Raw disk access: Ceph requires direct access to the disk to manage its own file system.
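A quick way to spot candidate OSD devices is to look for whole disks that carry no filesystem signature. The sketch below parses the key="value" output of `lsblk`; the helper name `raw_disk_candidates` is illustrative, and the result is only a starting point, since a disk that holds a partition table can still report an empty FSTYPE.

```bash
#!/usr/bin/env bash
# Sketch: list whole disks that carry no filesystem signature.
# Input format is the key="value" output of:
#   lsblk -P -n -d -o NAME,TYPE,FSTYPE

# Hypothetical helper: reads lsblk pair lines on stdin, prints candidate names.
raw_disk_candidates() {
  grep ' TYPE="disk"' | grep ' FSTYPE=""' | sed -E 's/^NAME="([^"]*)".*/\1/'
}

# Run against the local machine when lsblk is available.
if command -v lsblk >/dev/null 2>&1; then
  lsblk -P -n -d -o NAME,TYPE,FSTYPE | raw_disk_candidates
fi
```

Check each listed device by hand (for example with `lsblk /dev/<name>`) before handing it to Ceph.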

In virtualized environments, such as VirtualBox, these OSDs should be backed by attached virtual hard disks. A critical warning for administrators: avoid deploying Ceph directly on a host laptop or desktop. Ceph takes over the raw devices it is given, and a misconfigured device selection can consume a disk the host depends on, leading to catastrophic data loss or system instability. The use of VMs is mandatory for safety and isolation.

The software stack used in a typical implementation includes:

RHEL 9.4 VMs: Used as the base operating system.
kubectl: Configured to reach the k3s cluster.

k3s Cluster Setup Procedure

The initial stage of the deployment is the establishment of the k3s cluster itself. This involves configuring a master node and joining worker nodes to the cluster.

To install k3s on the master node (e.g., k3s-ctrl1), the following command is executed:

```bash
curl -sfL https://get.k3s.io | sh -
```

Once the installation is complete, the master node generates a cluster token. This token is essential for authenticating worker nodes. The token can be retrieved using the following command:

```bash
cat /var/lib/rancher/k3s/server/node-token
```

After obtaining the token, the remaining nodes must be joined to the cluster. This is achieved by executing the installation script on the worker nodes while providing the master node's IP address and the retrieved token:

```bash
# Replace <master_node_ip> (for example 192.168.8.128) and <node-token>
# with the master node's IP address and the token retrieved above.
curl -sfL https://get.k3s.io | K3S_URL=https://<master_node_ip>:6443 K3S_TOKEN=<node-token> sh -
```

To verify that the nodes have successfully joined the cluster and are in a Ready state, the following command is used:

```bash
kubectl get node -o wide
```
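Since three-way replication needs at least three nodes, it can help to count the Ready nodes before moving on. The following is a minimal sketch; the helper names are illustrative, and it treats a cordoned node (status `Ready,SchedulingDisabled`) as not Ready.

```bash
#!/usr/bin/env bash
# Sketch: count Ready nodes from `kubectl get nodes --no-headers` output.

# Hypothetical helper: reads node lines on stdin, prints how many are Ready.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Hypothetical helper: succeeds when at least three nodes are Ready.
enough_nodes_for_replication() {
  [ "$1" -ge 3 ]
}

# Only query the cluster when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  ready="$(kubectl get nodes --no-headers 2>/dev/null | count_ready_nodes)"
  if enough_nodes_for_replication "$ready"; then
    echo "OK: $ready Ready nodes"
  else
    echo "Only $ready Ready nodes; three-way replication needs at least 3"
  fi
fi
```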

Finally, the nodes can be labeled as worker nodes to ensure proper scheduling of Rook-Ceph components.
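The labeling step might look like the sketch below, which prints one `kubectl label` command per node using the conventional `node-role.kubernetes.io/worker` key. The node names are placeholders (only k3s-ctrl1 is named in this article), and `worker_label_commands` is an illustrative helper.

```bash
#!/usr/bin/env bash
# Sketch: print the kubectl commands that label nodes with the worker role.
# The node names passed in below are placeholders.

# Hypothetical helper: emits one label command per node name.
worker_label_commands() {
  local node
  for node in "$@"; do
    printf 'kubectl label node %s node-role.kubernetes.io/worker=worker --overwrite\n' "$node"
  done
}

# Review the commands, then pipe them to `sh` to run them.
worker_label_commands k3s-worker1 k3s-worker2
```

On its own this label mainly changes the ROLES column of `kubectl get node`; it influences where Rook-Ceph pods run only if placement rules or node selectors reference it.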

Configuring the Rook-Ceph Storage Layer

Once the k3s cluster is operational, the Rook-Ceph storage layer must be deployed. This involves creating the orchestrator and then defining the storage pools.
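The operator and cluster can be created from the example manifests in the Rook repository. The sketch below only prints the commands. It assumes a Rook release that keeps its examples under `deploy/examples`, and `ROOK_VERSION` stands for a release tag of your choosing, since this article does not name one.

```bash
#!/usr/bin/env bash
# Sketch: print the commands that install the Rook operator and a CephCluster
# from the example manifests in the Rook repository.
# Assumption: ROOK_VERSION is a release tag you have chosen.

# Hypothetical helper: emits the install commands for a given release tag.
rook_install_commands() {
  local version="$1"
  cat <<EOF
git clone --single-branch --branch ${version} https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml
kubectl -n rook-ceph get pods
EOF
}

# Print the commands for review; nothing is executed here.
rook_install_commands "${ROOK_VERSION:-<release-tag>}"
```

Wait until the operator, monitor, and OSD pods in the rook-ceph namespace are running before defining pools.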

Creating CephBlockPool and StorageClass

Rook does not automatically create storage pools or StorageClasses. These must be defined manually to allow k3s to provision volumes. A CephBlockPool defines how data is replicated and where the failure domain lies.

The following configuration for storageclass-k3s.yaml establishes a replicated pool and a corresponding StorageClass:

```yaml
# Replicated pool: three copies, each on a different host.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
# StorageClass that provisions RBD images from the pool above.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  # The controller-expand secret is needed for allowVolumeExpansion to work.
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
```

This configuration specifies a failure domain of host, meaning each replica is placed on a different node. The replication size is set to 3, so every object is stored three times and the pool can lose a node without losing data. The StorageClass, named rook-ceph-block, uses the rook-ceph.rbd.csi.ceph.com provisioner to create volumes in this pool and formats them with ext4.
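Replication has a direct cost in capacity: with size 3, usable space is roughly one third of the raw total. The arithmetic below is a rough sketch with illustrative figures; it ignores Ceph's own overhead and the headroom kept free for recovery, and `usable_gib` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: rough usable capacity of a replicated pool.
# Every object is stored `size` times, so usable space is about raw / size.

# Hypothetical helper: usable_gib <nodes> <gib_per_node> <replica_size>
usable_gib() {
  echo $(( $1 * $2 / $3 ))
}

# Illustrative figures: 3 nodes with one 100 GiB disk each, replica size 3.
usable_gib 3 100 3   # prints 100
```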

To apply this configuration, the following command is used:

```bash
kubectl apply -f storageclass-k3s.yaml
```

Testing PVC Creation

With the StorageClass in place, the final step is to verify that k3s can successfully claim persistent volumes from Ceph. A PersistentVolumeClaim (PVC) is created to request a specific amount of storage.

The following configuration for test-pvc-k3s.yaml requests 5Gi of storage:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 5Gi
```

Applying the PVC:

```bash
kubectl apply -f test-pvc-k3s.yaml
```

Checking the status of the PVC:

```bash
kubectl get pvc ceph-test-pvc
```
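A Bound claim shows that provisioning works, but not that a node can mount the volume. As a further check, a throwaway pod can mount the claim and write a file. The sketch below generates such a manifest; the pod name, image, and mount path are illustrative choices, and `test_pod_manifest` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: generate a throwaway pod that mounts the test PVC and writes a file.
# The pod name, image, and mount path are illustrative choices.

# Hypothetical helper: prints a Pod manifest for the given claim name.
test_pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ceph-test-pod
spec:
  restartPolicy: Never
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /data/probe && cat /data/probe"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: $1
EOF
}

# Print the manifest; pipe it to `kubectl apply -f -` to create the pod.
test_pod_manifest ceph-test-pvc
```

If the volume mounts correctly, the pod should run to completion and `kubectl logs ceph-test-pod` should show the line it wrote. Delete the pod and the claim afterwards to release the storage.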

Troubleshooting and Optimization

Even with correct configurations, k3s-specific issues can arise, particularly regarding the CSI (Container Storage Interface) provisioner.

Solving Pending PVCs

If a PVC remains in a Pending state, it usually means the CSI driver could not provision the volume; on k3s, path mismatches are a frequent culprit. The first step in troubleshooting is to examine the logs of the CSI provisioner:

```bash
kubectl -n rook-ceph logs deploy/csi-rbdplugin-provisioner -c csi-provisioner | tail -20
```

To ensure the kubelet path is correctly configured to match k3s's non-standard layout, administrators should check the daemonset configuration:

```bash
# Match case-insensitively so hostPath and registration paths are shown as well.
kubectl -n rook-ceph get ds csi-rbdplugin -o yaml | grep -i kubelet
```

Verification of the CSI driver and pod status can be performed using:

```bash
kubectl get csidriver
kubectl -n rook-ceph get pods | grep csi
kubectl -n rook-ceph describe deploy csi-rbdplugin-provisioner | grep -A5 "Volumes:"
```
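The checks in this section can be tied together with a small triage helper. The sketch below maps a PVC phase to the next step described above; `pvc_next_step` is a hypothetical helper, and the claim name is the test PVC from the previous section.

```bash
#!/usr/bin/env bash
# Sketch: map a PVC phase to the next check described in this section.

# Hypothetical helper: pvc_next_step <phase>
pvc_next_step() {
  case "$1" in
    Bound)   echo "ok: volume provisioned" ;;
    Pending) echo "check: csi-rbdplugin-provisioner logs and kubelet path" ;;
    *)       echo "unknown phase: $1" ;;
  esac
}

# Only query the cluster when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  phase="$(kubectl get pvc ceph-test-pvc -o jsonpath='{.status.phase}' 2>/dev/null)"
  pvc_next_step "${phase:-unknown}"
fi
```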

Identity and Security Management

A common failure point in Ceph deployments is the manual distribution of Ceph keys. Manual key management increases the attack surface and complicates compliance with frameworks such as SOC 2 or ISO 27001.

To optimize security and reliability, the following strategies should be implemented:

OIDC-based Identity: Integrate identity management with systems such as Okta or AWS IAM for secrets delivery.
Automated Rotation: Implement automatic credential rotation to ensure that static files or shared secrets are not used.
RBAC Integration: Use Kubernetes Role-Based Access Control to govern how pods interact with Ceph credentials.

Comparative Analysis of Storage Components

The following table outlines the specific roles and impacts of the components used in the Rook-Ceph and k3s integration.

| Component | Direct Function | Impact on User | Contextual Relation |
| --- | --- | --- | --- |
| k3s | Lightweight K8s Distribution | Reduced overhead for edge/IoT | Orchestrates the Rook operator |
| Rook | Storage Orchestrator | Automates Ceph deployment | Manages Ceph OSDs and pools |
| Ceph | Distributed Storage System | Provides block, file, and object storage | The actual data persistence layer |
| CSI Driver | Volume Provisioning Interface | Transparent mount/unmount | Connects k3s pods to Ceph volumes |
| OSD | Object Storage Daemon | Physical data storage | Distributed across k3s nodes |
| PVC | Volume Request | Simple API for storage claims | Triggers CSI provisioner for Ceph |

Conclusion

The integration of Rook-Ceph within a k3s cluster provides a robust solution for persistent storage in edge and resource-constrained environments. By addressing the specific architectural differences of k3s—most notably the relocated kubelet directory at /var/lib/rancher/k3s/agent/kubelet—administrators can deploy a storage layer that is not only resilient but also self-healing.

The transition from manual storage management to an orchestrated Rook-Ceph system eliminates the risks associated with scaling pods, where standard storage backends often fail. The ability to provide unified block, file, and object storage on commodity hardware ensures that the infrastructure can grow without proportional increases in complexity. Furthermore, the shift toward OIDC-based identity and automated secret rotation transforms the security posture from a manual, error-prone process into a compliant, production-grade system.

Ultimately, the synergy between k3s and Rook-Ceph allows for the deployment of high-availability applications on lightweight infrastructure. The capability to replicate data three ways across multiple nodes ensures that failures are handled gracefully, maintaining system uptime. This architecture effectively bridges the gap between the simplicity of a lightweight orchestrator and the power of a distributed storage system, creating a dependable foundation for modern cloud-native workloads.