Logstash ECK Orchestration and Pipeline Configuration

Deploying Logstash on Kubernetes with the Elastic Cloud on Kubernetes (ECK) operator combines data pipeline processing with container orchestration. Logstash is the ingestion engine of the Elastic Stack, responsible for collecting, transforming, and routing event data. On Kubernetes, Logstash changes from a standalone server into a managed resource: the ECK operator handles lifecycle management, scaling, and configuration updates. The result is a resilient architecture that pairs the operational flexibility of containers with the analytics capabilities of the Elastic Stack. Because the operator manages the deployment as a single entity rather than a loose set of pods, it can be upgraded, secured, and monitored with minimal manual intervention.

The ECK Operator Framework

The Elastic Cloud on Kubernetes (ECK) Operator is a specialized controller designed to simplify the deployment and management of Elastic Stack applications. It functions by watching custom resources within the Kubernetes API and automatically provisioning the necessary clusters and configurations.

The operator provides several out-of-the-box benefits that impact the overall stability of the system:

Lifecycle management: This ensures that the Logstash instance is managed from deployment to decommissioning without manual script execution.
Upgrades: The operator facilitates version transitions by performing rolling restarts of Logstash Pods when the YAML specification is edited.
Security: ECK configures TLS certificates and credentials for the stack, so that components communicate over encrypted, authenticated connections.
Persistent storage: The operator automates the provisioning of storage for data that must survive pod restarts.
Monitoring: Integrated monitoring tools allow operators to track the health and performance of the Logstash pipelines.

Infrastructure Prerequisites

Before initiating the deployment of Logstash, certain environmental requirements must be met to ensure the stability of the Elastic Stack.

Requirement	Description
Kubernetes Cluster	A running cluster version 1.21+ is required.
kubectl	The command-line tool must be installed and configured for cluster communication.
Helm	Helm v3.x is highly recommended for simplified deployment.
StorageClass	A configured StorageClass is necessary for the provision of Persistent Volumes (PVs).
Ingress Controller	A LoadBalancer or Ingress Controller (e.g., NGINX, Traefik, or Istio) is required for external access.
Resources	A minimum of 8GB of available memory across cluster nodes is required to support the full stack.
Concepts	A fundamental understanding of Pods, Services, and Deployments is essential.

The requirement for a StorageClass is particularly critical because Logstash often requires persistent storage for plugins and internal queues. Without a functioning StorageClass, the automated provisioning of the logstash-data volume will fail. Furthermore, the memory requirement of 8GB is a baseline; production environments with heavy pipeline processing may require significantly more.

Logstash Resource Specification

The configuration of Logstash on ECK is managed through a YAML specification. This specification acts as the single source of truth for the Logstash instance.

Versioning and Scaling

The spec section of the YAML allows for the definition of the Logstash version and the number of replicas. For example, using version: 9.4.2 ensures that the operator pulls the correct image. The count attribute determines the number of pods, enabling horizontal scaling to handle increased data ingestion loads.

Configuration Methods

There are three primary ways to define the Logstash configuration:

Direct Specification: Using the spec.config section to define settings (the ECK equivalent of logstash.yml).
Secret-Based Configuration: Providing configuration through a Kubernetes Secret specified in the spec.configRef section.
Volume-Based Configuration: Setting path.config in a pipeline definition so that Logstash loads pipeline files from volumes mounted on the Logstash container.
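As an illustration of the Secret-based method, the following sketch moves the logstash.yml settings into a Kubernetes Secret and references it through spec.configRef. The Secret name quickstart-config is a placeholder chosen for this example, and the logstash.yml key is assumed to be the key the operator reads; both should be verified against the ECK version in use.

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: quickstart
spec:
  version: 9.4.2
  count: 1
  elasticsearchRefs:
    - name: quickstart
      clusterName: qs
  # Settings come from the Secret below instead of spec.config
  configRef:
    secretName: quickstart-config   # hypothetical Secret name
---
apiVersion: v1
kind: Secret
metadata:
  name: quickstart-config           # must match configRef.secretName
stringData:
  # Assumed key name: the content is the equivalent of logstash.yml
  logstash.yml: |-
    pipeline.workers: 4
    log.level: debug
```

Keeping settings in a Secret lets the same Logstash resource be reused across environments while the settings themselves are managed separately.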

Example Specification

The following example demonstrates a basic Logstash deployment with a defined pipeline:

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: quickstart
spec:
  version: 9.4.2
  count: 1
  elasticsearchRefs:
    - name: quickstart
      clusterName: qs
  config:
    pipeline.workers: 4
    log.level: debug
  pipelines:
    - pipeline.id: main
      config.string: |
        input {
          beats {
            port => 5044
          }
        }
        output {
          elasticsearch {
            hosts => [ "${QS_ES_HOSTS}" ]
            user => "${QS_ES_USER}"
            password => "${QS_ES_PASSWORD}"
            ssl_certificate_authorities => "${QS_ES_SSL_CERTIFICATE_AUTHORITY}"
          }
        }
```

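In this configuration, pipeline.workers: 4 sets the number of worker threads that execute the filter and output stages of the pipeline, while log.level: debug provides high-verbosity logging for troubleshooting. The pipelines section defines the input and output logic inline. The QS_ES_* variables referenced in the output are populated by the operator from the referenced Elasticsearch cluster, with the QS prefix derived from clusterName: qs.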

Storage and Volume Management

Storage is a critical component of Logstash's reliability, particularly when dealing with data persistence and queue management.

The logstash-data Volume

By default, the ECK operator creates a PersistentVolume called logstash-data. This volume is mapped to the path /usr/share/logstash/data within the container. This directory is typically used for storage required by various Logstash plugins.

The default specifications for this volume are as follows:

Capacity: 1.5Gi.
StorageClass: Uses the standard StorageClass of the Kubernetes cluster.

For production environments, the default 1.5Gi may be insufficient. Users can override these settings by adding an entry named logstash-data to the spec.volumeClaimTemplates section. This allows the administrator to specify the desired storage capacity and a specific Kubernetes storage class to ensure the performance and reliability of the underlying disk.
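A minimal sketch of such an override is shown here. In the resource itself the field is written as volumeClaimTemplates, and the entry must keep the name logstash-data for the operator to map it to the default data path. The 10Gi size and the fast-ssd storage class are illustrative assumptions, not recommendations.

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: logstash
spec:
  # Other attributes (version, count, pipelines) omitted for brevity
  volumeClaimTemplates:
    - metadata:
        name: logstash-data          # keep this name so it replaces the default volume
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: fast-ssd   # hypothetical StorageClass name
        resources:
          requests:
            storage: 10Gi            # illustrative capacity
```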

Volume Support and Breaking Changes

Volume support for Logstash was introduced in ECK 2.9.0 as a breaking change. Logstash resources created with earlier ECK versions must therefore be recreated to work with the new volume management logic.

Persistent Queues and Dead Letter Queues

Persistent Queues (PQs) and Dead Letter Queues (DLQs) are not currently managed by the Logstash operator. This means that users who require these features must manually create and manage their own Volumes and VolumeMounts. This manual overhead is necessary to ensure that data is not lost during unexpected pod crashes or restarts.
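The manual wiring for a persistent queue might look like the following sketch. The volume name pq, the mount path /usr/share/logstash/pq, and the 2Gi size are assumptions made for this example; queue.type and path.queue are standard Logstash settings. A DLQ could be handled the same way with its own volume and the path.dead_letter_queue setting.

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: logstash
spec:
  # Other attributes omitted for brevity
  config:
    queue.type: persisted
    path.queue: /usr/share/logstash/pq   # must match the mountPath below
  volumeClaimTemplates:
    - metadata:
        name: pq                         # hypothetical volume name
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi                 # illustrative capacity
  podTemplate:
    spec:
      containers:
        - name: logstash
          volumeMounts:
            - name: pq                   # refers to the claim template above
              mountPath: /usr/share/logstash/pq
              readOnly: false
```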

Data Loss Risks with emptyDir

If data persistence is not a requirement, users may opt for an emptyDir volume. However, this is highly discouraged in production environments because emptyDir volumes are deleted when the pod is removed, leading to permanent data loss.
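For completeness, a sketch of the emptyDir variant follows. It assumes that declaring a pod volume named logstash-data in the podTemplate takes the place of the default persistent volume; this should be confirmed for the ECK version in use, and the approach is only suitable where losing the data directory is acceptable.

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: logstash
spec:
  # Other attributes omitted for brevity
  podTemplate:
    spec:
      volumes:
        - name: logstash-data   # same name as the default data volume
          emptyDir: {}          # contents are discarded when the pod is removed
```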

Queue Management and Graceful Shutdown

Ensuring that data is not lost during a pod termination is a primary concern in Kubernetes environments.

Draining the Queue

To prevent data loss during a shutdown, it is recommended to set queue.drain: true in the Logstash configuration. This setting ensures that Logstash attempts to process all queued events before the process terminates.

Termination Grace Period

Because draining a queue takes time, the default Kubernetes termination grace period of 30 seconds may be insufficient. Users should increase terminationGracePeriodSeconds in the podTemplate.

Example configuration for graceful shutdown:

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: logstash
spec:
  config:
    queue.drain: true
  podTemplate:
    spec:
      terminationGracePeriodSeconds: 604800
```

Kubernetes Limitations

Despite these configurations, a known issue exists where Kubernetes may not honor terminationGracePeriodSeconds settings greater than 600. This means that even with queue.drain: true and a high grace period, a queue may not be fully drained before the pod is killed.

Manual DLQ Draining

In the current technical preview, there is no mechanism to automatically drain a Dead Letter Queue (DLQ) before Logstash shuts down. To manually drain a DLQ:

Stop sending data to the DLQ by disabling the DLQ feature or disabling the associated pipelines.
Wait for events to stop flowing through any pipelines that read from the DLQ input.
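The first step, disabling the DLQ feature, could be expressed in the resource roughly as follows. This is a sketch that assumes the DLQ was enabled through the standard dead_letter_queue.enable setting in spec.config; if it was enabled per pipeline, the change would be made in the corresponding pipeline definition instead.

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: logstash
spec:
  # Other attributes omitted for brevity
  config:
    # Stop writing new events to the DLQ so that the existing backlog can drain
    dead_letter_queue.enable: false
```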

External Access and Networking

Logstash often needs to receive data from outside the Kubernetes cluster, which introduces networking challenges due to the nature of the data protocols used.

The Ingress Limitation

A critical distinction exists between how Elasticsearch/Kibana and Logstash are exposed. Standard Kubernetes Ingress resources are designed for HTTP/HTTPS traffic. While these work for Elasticsearch and Kibana, they are unsuitable for Logstash because Logstash typically relies on raw TCP/UDP for data ingestion.

Alternative Exposure Methods

To allow external data sources to reach Logstash, one of the following manual methods must be employed:

NodePort Service: This exposes the Logstash service on a high-numbered port (by default in the 30000-32767 range) across every node in the cluster.
LoadBalancer Service: If the cluster supports it (e.g., via MetalLB), a layer-4 LoadBalancer can be used to provide a single external IP.
Manual Ingress Controller Configuration: For users of NGINX or Traefik, TCP/UDP forwarding must be configured manually in the Ingress Controller itself (for ingress-nginx, through its TCP/UDP services ConfigMap; for Traefik, through dedicated TCP/UDP entry points and routes). This configuration is entirely separate from the standard Kubernetes Ingress resource.
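As a sketch of the first two options, the Logstash resource can declare the Service itself through a spec.services entry. The service name beats and the port name filebeat are placeholders chosen for this example, and port 5044 matches the Beats input used earlier; changing type to LoadBalancer gives the layer-4 variant on clusters that support it.

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: quickstart
spec:
  version: 9.4.2
  count: 1
  # Pipelines and Elasticsearch references omitted for brevity
  services:
    - name: beats                  # hypothetical service name
      service:
        spec:
          type: NodePort           # use LoadBalancer for a single external IP
          ports:
            - name: filebeat       # hypothetical port name
              port: 5044
              protocol: TCP
              targetPort: 5044     # must match the port of the Logstash input
```

With NodePort, external shippers connect to any node address on the allocated node port; with LoadBalancer, they connect to the external IP on port 5044.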

Deployment Workflow via Helm

For those seeking a streamlined installation, Helm v3.x is the recommended tool. The deployment process involves using a values file to define the stack's parameters and then executing the installation command.

The installation command is as follows:

```bash
helm install elastic-stack -f elastic-stack-values.yaml ./path-to-your-helm-chart -n elastic-system
```

Once the installation is complete, it is necessary to verify the status of the pods to ensure the ECK operator has successfully provisioned the components.

Analysis of Logstash Kubernetes Integration

The integration of Logstash into Kubernetes via the ECK operator transforms the data pipeline from a static installation into a dynamic, scalable service. The shift toward declarative configuration through YAML allows for consistent environments across development and production. However, the complexity of the system is evident in the networking and storage layers.

The reliance on raw TCP/UDP for ingestion creates a friction point with standard Kubernetes Ingress controllers, forcing administrators to move toward Layer-4 load balancing or complex ConfigMap modifications for Passthrough. This highlights a fundamental architectural divide between the web-centric nature of Kubernetes Ingress and the stream-centric nature of Logstash.

Furthermore, the tension between Kubernetes' preference for rapid pod recycling and Logstash's need for data integrity is apparent in the queue draining process. The terminationGracePeriodSeconds limitation shows that, while the ECK operator provides the tools for stability, the underlying Kubernetes orchestration layer still imposes constraints that can lead to data loss if not carefully managed.

In summary, the ECK operator effectively abstracts the deployment of the Elastic Stack, but the operator is not a complete substitute for deep architectural knowledge of Kubernetes. Success in deploying Logstash depends on the correct configuration of volumeClaimTemplates for persistence and the implementation of non-HTTP networking solutions for external data ingestion.