Operationalizing the ELK Stack on Kubernetes: Infrastructure, Configuration, and Deployment Strategies

Aggregating, indexing, and visualizing log data across distributed systems is a persistent challenge for DevOps engineering. The ELK stack, named for Elasticsearch, Logstash, and Kibana, has long been one of the most widely adopted toolsets for this purpose. As organizations migrate toward containerized architectures, running these components inside Kubernetes clusters has become common practice. Kubernetes provides the automation required to manage stateful Elasticsearch nodes, scale Logstash processing, and keep Kibana dashboards highly available. This analysis examines the architectural considerations, infrastructure prerequisites, and deployment methodologies for running the ELK stack on Kubernetes, drawing upon established configurations and the Elastic Cloud on Kubernetes (ECK) framework.

Fundamental Kubernetes Concepts and Cluster Architecture

Before deploying the ELK stack, it helps to understand the underlying Kubernetes object model. Kubernetes represents infrastructure as distinct objects, including Pods, Services, Namespaces, and Volumes. To manage these resources effectively, engineers use labels and selectors. Labels are key/value pairs attached to objects to group and organize them. For instance, a Pod running a web server might carry the label app: nginx, while a Pod hosting a specific site might also carry site: example.com. Selectors let Services (and queries) match these labels, enabling targeted interaction with specific subsets of objects. A selector requiring app = nginx and site = example.com matches only Pods that carry both labels, which is what makes precise traffic routing and resource management possible.
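
As a minimal sketch of how labels and selectors fit together, the snippet below writes an illustrative manifest in which a Service selects Pods carrying both labels. The file name labels-demo.yml, the Pod name web, and the Service name web-svc are assumptions made for this example and are not part of the deployment files discussed later.

```shell
# Write an illustrative manifest; every name in it is an assumption for this example.
cat > labels-demo.yml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: nginx
    site: example.com
spec:
  containers:
    - name: nginx
      image: nginx:stable
      ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  # Equality-based selector: a Pod must carry BOTH labels to receive traffic.
  selector:
    app: nginx
    site: example.com
  ports:
    - port: 80
      targetPort: 80
EOF

# On a running cluster the same selector can be used from the command line:
#   kubectl apply -f labels-demo.yml
#   kubectl get pods -l app=nginx,site=example.com
```

A Pod labeled only app: nginx would not be selected by this Service, because both key/value pairs have to match.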

Traffic management within the cluster is further abstracted by Ingress. While Services sit in front of Pods and route requests to them, an Ingress sits in front of Services to handle external HTTP(S) access, and an Ingress controller implements its rules. The controller balances traffic across Services, can terminate SSL/TLS so that web traffic is encrypted, and supports name-based virtual hosting. In a name-based hosting scenario, multiple domain names, such as a.example.com and b.example.com, point to the same Ingress IP address, and the controller directs traffic based on the requested hostname. This layer is critical for securely exposing Kibana dashboards or Elasticsearch APIs to external users or monitoring systems.
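
The name-based hosting scenario can be sketched with a single Ingress object that carries one rule per hostname. The file name ingress-demo.yml, the Ingress name, and the backend names service-a and service-b are hypothetical; an Ingress controller must already be installed in the cluster for the rules to take effect.

```shell
# Illustrative Ingress with two hostnames sharing one entry point.
cat > ingress-demo.yml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: name-based-demo
spec:
  rules:
    # Requests for a.example.com are routed to service-a.
    - host: a.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-a
                port:
                  number: 80
    # Requests for b.example.com are routed to service-b.
    - host: b.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-b
                port:
                  number: 80
EOF

# On a running cluster:
#   kubectl apply -f ingress-demo.yml
```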

Infrastructure Prerequisites and Environment Setup

Deploying Elasticsearch requires significant computational resources due to the memory-intensive nature of its indexing engine. Consequently, the underlying infrastructure must be provisioned carefully. For development or testing environments, Vagrant serves as a powerful tool to provision virtual machines that mimic a production Kubernetes cluster. A robust setup typically involves one master node and two worker nodes.

The provisioning process begins with downloading the necessary configuration files, such as those found in the k8s_ubuntu directory structure. The core configuration is defined in the Vagrantfile, which instructs Vagrant on how to build the virtual environment. On older Windows systems, it is advisable to edit this file using WordPad rather than Notepad, as older versions of Notepad often struggle with end-of-line (EOL) characters, resulting in formatting errors that prevent proper execution.

Critical adjustments to the Vagrantfile are required to ensure Elasticsearch runs successfully. Under the "Kubernetes Worker Nodes" section, the variable v.memory should be set to 4096. This allocates 4 GB of RAM to each worker node, a minimum requirement for Elasticsearch to function properly when managing multiple nodes. Additionally, the CPU allocation should be increased by setting v.cpus to 2 instead of the default 1. Once these modifications are saved, the cluster is initiated by executing vagrant up. This command triggers the download and configuration of the necessary components, a process that may take considerable time. Upon completion, the master node can be accessed via vagrant ssh kmaster, and the health of the cluster verified using kubectl get nodes, which lists the active nodes in the infrastructure.
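
The two Vagrantfile adjustments can also be scripted. The sketch below works on a stand-in fragment, since the real Vagrantfile in k8s_ubuntu is not reproduced here; the block variable name, the provider, and the starting memory value are assumptions, and the output file name Vagrantfile.edited is hypothetical.

```shell
# Stand-in fragment; the real worker-node section will look different.
cat > Vagrantfile.sample <<'EOF'
    workernode.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.cpus = 1
    end
EOF

# Rewrite both settings, writing to a new file so the input stays untouched.
sed -e 's/^\([[:space:]]*v\.memory[[:space:]]*=\).*/\1 4096/' \
    -e 's/^\([[:space:]]*v\.cpus[[:space:]]*=\).*/\1 2/' \
    Vagrantfile.sample > Vagrantfile.edited

# Show the resulting settings.
grep -E 'v\.(memory|cpus)' Vagrantfile.edited
```

These sed expressions rewrite every matching line, so against a full Vagrantfile they would also change the master node's settings if it uses the same variable names; editing the worker section by hand avoids that.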

Deploying the Elasticsearch Component

The first component of the ELK stack to be deployed is Elasticsearch. Before initializing the cluster, security and access controls must be established. A service account with read access to services, endpoints, and namespaces should be created to manage permissions within the cluster. This is achieved by applying the relevant Role-Based Access Control (RBAC) configuration:

    kubectl apply -f rbac.yml
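
The contents of rbac.yml are not shown in this article. As a rough sketch of what such a file could contain, the snippet below writes a ServiceAccount, a ClusterRole granting read access to services, endpoints, and namespaces, and a binding between the two. The file name rbac-sketch.yml and the object name elasticsearch-logging are assumptions chosen to match the service name used later.

```shell
# Illustrative RBAC objects; not the actual rbac.yml from the repository.
cat > rbac-sketch.yml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elasticsearch-logging
rules:
  # Read-only verbs on the three core resource types named in the text.
  - apiGroups: [""]
    resources: ["services", "endpoints", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elasticsearch-logging
subjects:
  - kind: ServiceAccount
    name: elasticsearch-logging
    namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: elasticsearch-logging
EOF

# On a running cluster:
#   kubectl apply -f rbac-sketch.yml
```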

With the permissions set, the Elasticsearch cluster itself is deployed using a StatefulSet. StatefulSets are preferred for Elasticsearch because they ensure stable, unique network identifiers and persistent storage for each node, which is crucial for maintaining data integrity in a distributed search engine. The deployment is executed via:

    kubectl apply -f elastic.yml
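
To illustrate the StatefulSet properties mentioned above, here is a trimmed sketch rather than the actual elastic.yml. The image tag, replica count, storage size, cluster name, and the governing Service name elasticsearch-logging-headless are all assumptions. The governing Service is not defined here, and host-level settings such as vm.max_map_count are left out, so this is not a complete manifest.

```shell
# Illustrative StatefulSet showing stable identities and per-node storage.
cat > elastic-sketch.yml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
spec:
  # Pods get stable names: elasticsearch-logging-0, elasticsearch-logging-1.
  serviceName: elasticsearch-logging-headless
  replicas: 2
  selector:
    matchLabels:
      app: elasticsearch-logging
  template:
    metadata:
      labels:
        app: elasticsearch-logging
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
          env:
            - name: cluster.name
              value: logging
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: elasticsearch-logging-headless
            - name: cluster.initial_master_nodes
              value: elasticsearch-logging-0,elasticsearch-logging-1
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: transport
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
  # One PersistentVolumeClaim is created per Pod and survives rescheduling.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi
EOF
```

The stable Pod names are what allow cluster.initial_master_nodes to be listed ahead of time, which a Deployment with randomly named Pods could not offer.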

To make the Elasticsearch cluster accessible to other components and external tools, a service of type ClusterIP or LoadBalancer is created:

    kubectl apply -f elastic-service.yml
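
A Service of this kind could look roughly like the sketch below. The name elasticsearch-logging and the kube-system namespace follow the port-forward command used for verification; the selector label and the file name elastic-service-sketch.yml are assumptions.

```shell
# Illustrative ClusterIP Service in front of the Elasticsearch Pods.
cat > elastic-service-sketch.yml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
spec:
  # ClusterIP keeps the API internal; LoadBalancer would expose it externally.
  type: ClusterIP
  selector:
    app: elasticsearch-logging
  ports:
    - name: http
      port: 9200
      targetPort: 9200
EOF
```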

Verification of the Elasticsearch instance can be performed by forwarding the service port to the local machine, allowing direct interaction with the Elasticsearch API:

    kubectl port-forward -n kube-system svc/elasticsearch-logging 9200:9200

Users can then browse to http://localhost:9200 to confirm that the cluster is responsive and returning valid JSON responses.
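
Beyond opening the root URL, the cluster health endpoint gives a quick pass/fail signal. The helper below is a small sketch that pulls the status field out of a _cluster/health response using only sed; the function name cluster_status and the sample JSON are made up for illustration.

```shell
# Extract the "status" field from a _cluster/health JSON document read on stdin.
cluster_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# With the port-forward active, the live check would be:
#   curl -s http://localhost:9200/_cluster/health | cluster_status

# Offline demonstration with a trimmed sample response (prints: green).
echo '{"cluster_name":"demo","status":"green","number_of_nodes":2}' | cluster_status
```

A status of green means all primary and replica shards are allocated, yellow means some replicas are unassigned, and red means at least one primary shard is missing.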

Log Aggregation with Logstash

Logstash acts as the processing engine in the ELK stack. It receives logs from various sources, parses them, and formats them into a structure that Elasticsearch can index efficiently. The deployment of Logstash involves two distinct configuration steps: defining the configuration pipeline and creating the deployment resource.

First, the Logstash configuration file, which defines input, filter, and output stages, is applied to the cluster:

    kubectl apply -f logstash-config.yml
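
The actual logstash-config.yml is not reproduced here. One plausible shape is a ConfigMap that embeds a pipeline with the three stages, as sketched below. The ConfigMap name, the JSON filter, the index name, and the Elasticsearch host are assumptions; port 5044 is the conventional Beats port.

```shell
# Illustrative ConfigMap holding a three-stage Logstash pipeline.
cat > logstash-config-sketch.yml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: kube-system
data:
  logstash.conf: |
    input {
      # Receive events from Filebeat.
      beats {
        port => 5044
      }
    }
    filter {
      # Expand the message into fields when the application logs JSON.
      json {
        source => "message"
        skip_on_invalid_json => true
      }
    }
    output {
      elasticsearch {
        hosts => ["http://elasticsearch-logging:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
      }
    }
EOF
```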

Subsequently, the Logstash deployment itself is created, pulling the configured image and setting up the necessary containers:

    kubectl apply -f logstash-deployment.yml

This two-step process ensures that the processing logic is separated from the container orchestration logic, allowing for easier updates to the parsing rules without redeploying the entire container infrastructure.
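
The separation can be seen in a Deployment that only mounts the pipeline from a ConfigMap. This is a sketch under assumptions: the image tag, the label, and the ConfigMap name logstash-config are hypothetical, and the Service that Filebeat would use to reach port 5044 is omitted.

```shell
# Illustrative Deployment that mounts the pipeline from a ConfigMap.
cat > logstash-deployment-sketch.yml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      containers:
        - name: logstash
          image: docker.elastic.co/logstash/logstash:7.17.0
          ports:
            - containerPort: 5044
          volumeMounts:
            # The official image loads pipeline files from this directory.
            - name: pipeline
              mountPath: /usr/share/logstash/pipeline
      volumes:
        - name: pipeline
          configMap:
            name: logstash-config
EOF
```

Because the parsing rules live in the ConfigMap, changing them means re-applying only that object. A running Pod still needs a restart to pick up the change unless Logstash's automatic config reload is enabled.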

Log Shipping with Filebeat

While Logstash processes data, Filebeat serves as the lightweight shipper. It is responsible for collecting logs from the various nodes in the Kubernetes cluster and forwarding them to Logstash. To ensure comprehensive log collection, Filebeat is typically deployed as a DaemonSet. A DaemonSet ensures that a copy of the Filebeat Pod runs on every node in the cluster, capturing logs from all running containers regardless of where they are scheduled.

The deployment command for Filebeat is:

    kubectl apply -f filebeat-daemon-set.yml
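
As an illustration of the DaemonSet pattern, and not the contents of filebeat-daemon-set.yml itself, the sketch below pairs a small Filebeat configuration with a DaemonSet that mounts the node's log directories. The image tag, the object names, and the Logstash address logstash:5044 are assumptions.

```shell
# Illustrative Filebeat configuration and DaemonSet.
cat > filebeat-sketch.yml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
data:
  filebeat.yml: |
    filebeat.inputs:
      - type: container
        paths:
          - /var/log/containers/*.log
    output.logstash:
      hosts: ["logstash:5044"]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:7.17.0
          args: ["-c", "/etc/filebeat.yml", "-e"]
          securityContext:
            runAsUser: 0
          volumeMounts:
            - name: config
              mountPath: /etc/filebeat.yml
              subPath: filebeat.yml
              readOnly: true
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: dockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: filebeat-config
        # hostPath volumes expose the node's own log files to the Pod.
        - name: varlog
          hostPath:
            path: /var/log
        - name: dockercontainers
          hostPath:
            path: /var/lib/docker/containers
EOF
```

The hostPath mounts are what tie each Filebeat Pod to the node it runs on; the second mount matters only on nodes that use Docker as the container runtime.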

This architecture creates a robust data pipeline: application logs are collected by Filebeat on each node, shipped to Logstash for parsing and enrichment, and finally indexed into Elasticsearch for long-term storage and search.

Visualization with Kibana

The final component, Kibana, provides the graphical interface for visualizing the data stored in Elasticsearch. Kibana is deployed as a separate service to allow users to query logs, create dashboards, and monitor system health.

The deployment is executed using:

    kubectl apply -f kibana.yml
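
A Kibana manifest typically combines a Deployment with a Service. The sketch below is illustrative and is not the kibana.yml referenced above: the image tag, the label, the NodePort service type, and the Elasticsearch URL are assumptions, while the name kibana-logging and the kube-system namespace follow the commands used in this section.

```shell
# Illustrative Kibana Deployment and Service.
cat > kibana-sketch.yml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana-logging
  template:
    metadata:
      labels:
        app: kibana-logging
    spec:
      containers:
        - name: kibana
          image: docker.elastic.co/kibana/kibana:7.17.0
          env:
            # Points Kibana at the Elasticsearch Service inside the cluster.
            - name: ELASTICSEARCH_HOSTS
              value: http://elasticsearch-logging:9200
          ports:
            - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
spec:
  type: NodePort
  selector:
    app: kibana-logging
  ports:
    - port: 5601
      targetPort: 5601
EOF
```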

Access to Kibana depends on the service type defined in the configuration. With a LoadBalancer service in a cloud environment, the provider assigns the external IP address automatically. Minikube has no such provider, so the external IP stays pending unless minikube tunnel is running; the simpler route there is to ask Minikube for a reachable URL for the service:

    minikube service kibana-logging -n kube-system

Alternatively, if a NodePort service is configured, users can reach Kibana at any node's IP address followed by the assigned node port. Once in Kibana, users must create an index pattern (called a data view in recent Kibana versions), typically selecting @timestamp as the time filter field, before the Discover tab shows any data. This tab allows engineers to search through logs, filter by Kubernetes labels, and identify errors, providing insight into the health and performance of the microservices running within the cluster.

Advanced Management with Elastic Cloud on Kubernetes (ECK)

For production-grade deployments, manual YAML management can become cumbersome. The Elastic Cloud on Kubernetes (ECK) operator provides a declarative approach to managing Elasticsearch and Kibana resources. ECK automates the lifecycle management of these components, including cluster bootstrapping, scaling, and backups.

Using ECK involves interacting with Kubernetes custom resources. The operator installs custom resource definitions (CRDs) such as Elasticsearch and Kibana, and engineers create clusters by declaring custom resources of those kinds that specify the number of nodes, storage classes, and resource limits. ECK then handles the creation of the underlying StatefulSets and Services. This approach also simplifies security management; for instance, the operator stores generated credentials in Kubernetes Secrets, from which passwords can be extracted, and it sets up secure authentication between Kibana and Elasticsearch. Furthermore, ECK supports installing plugins on Elasticsearch nodes running within Kubernetes containers, for example through init containers, enabling extended functionality without manually rebuilding container images.
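
A small sketch of this declarative style follows. It assumes the ECK operator and its CRDs are already installed; the resource name quickstart, the stack version, node count, and storage size are illustrative choices, as is the file name eck-sketch.yml.

```shell
# Illustrative ECK custom resources for Elasticsearch and Kibana.
cat > eck-sketch.yml <<'EOF'
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.13.0
  nodeSets:
    - name: default
      count: 3
      config:
        # Avoids the vm.max_map_count requirement on test clusters.
        node.store.allow_mmap: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 5Gi
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: quickstart
spec:
  version: 8.13.0
  count: 1
  # ECK wires up credentials and TLS between Kibana and this cluster.
  elasticsearchRef:
    name: quickstart
EOF

# On a cluster with the operator installed:
#   kubectl apply -f eck-sketch.yml
# The generated password for the built-in elastic user can then be read with:
#   kubectl get secret quickstart-es-elastic-user \
#     -o go-template='{{.data.elastic | base64decode}}'
```

The Secret name is derived from the resource name, so a cluster called something other than quickstart would use a correspondingly different Secret.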

System requirements for running a local ECK-based ELK stack are rigorous. A development environment should have at least 12 GB of RAM, 8 CPU cores, and a fast internet connection. These specifications ensure that the multiple Java-based processes of Elasticsearch, along with the Kubernetes control plane, do not exhaust system resources. For users lacking such hardware, utilizing a Virtual Private Server (VPS) is recommended.

Conclusion

Deploying the ELK stack on Kubernetes transforms log management from a static, server-bound task into a dynamic, scalable operation. By leveraging Kubernetes features such as StatefulSets for Elasticsearch, DaemonSets for Filebeat, and robust service definitions for Kibana, organizations can achieve high availability and seamless scalability. The use of tools like Vagrant for local development and the Elastic Cloud on Kubernetes (ECK) operator for production management provides a comprehensive framework for handling the complexity of distributed logging. As Kubernetes adoption continues to grow, mastering the integration of the ELK stack within this ecosystem becomes an essential skill for DevOps professionals, ensuring that observability keeps pace with infrastructure innovation.

