Architectural Mastery: Deploying the Elastic Stack on Ubuntu 22.04 for Centralized Logging

The modern enterprise landscape generates an astronomical volume of telemetry data, ranging from application logs and kernel events to security audits and performance metrics. Managing this data across a distributed network of servers is a logistical impossibility without a centralized system. The Elastic Stack—historically and commonly referred to as the ELK Stack—serves as the industry-standard solution for this challenge. By integrating Elasticsearch, Logstash, and Kibana, along with the "Beats" family of shippers, organizations can transform raw, unstructured text files into actionable intelligence through a process known as centralized logging.

Centralized logging is the practice of aggregating logs from every single source in any format into a single, searchable repository. The technical necessity for this arises from the fragmented nature of distributed systems; when an application spans multiple microservices across ten different servers, identifying a request-flow failure requires correlating logs from all ten machines during a specific millisecond window. Without the Elastic Stack, a system administrator would be forced to manually SSH into each machine and grep through disparate files—a process that is both error-prone and prohibitively slow. By consolidating these logs, the Elastic Stack enables the correlation of events across different servers, allowing administrators to pinpoint the exact moment a failure cascaded from one node to another.

The ecosystem comprises four primary architectural pillars. Elasticsearch sits at the center as the search and analytics engine that stores and indexes the data. Logstash functions as the data processing pipeline, ingesting data from multiple sources, transforming it, and sending it to the storage layer. Kibana provides the visualization layer, turning the complex JSON queries of Elasticsearch into intuitive dashboards and graphs. Finally, Filebeat and other "Beats" agents serve as the lightweight shippers that forward logs and files from the edge of the network to the central stack.

Hardware and Environmental Prerequisites

Before initiating the installation process, the host environment must meet specific hardware and software criteria to ensure the stability of the Java Virtual Machine (JVM) and the efficiency of the indexing engine. Elasticsearch is resource-intensive, particularly regarding memory allocation, as it relies heavily on the filesystem cache and the JVM heap.

The minimum viable environment for a functional Elastic Stack server on Ubuntu 22.04 consists of the following specifications:

Component	Minimum RAM	Recommended RAM	CPU Cores	Disk Space
Elasticsearch	2GB+	4GB+	2 Cores	50GB+
Logstash	1GB+	2GB+	1 Core	10GB
Kibana	1GB+	2GB+	1 Core	1GB
Total Server	4GB	8GB	2+ Cores	61GB+

From a technical perspective, the 4GB RAM minimum is a strict baseline. Elasticsearch needs memory both for the JVM heap and for the operating system's filesystem cache, so an undersized host tends to produce OutOfMemoryError exceptions inside the JVM or to have the Linux OOM killer terminate the process outright. The 2-core requirement allows indexing and search threads to run in parallel. These figures grow with log volume; a high-traffic production environment will require significantly more storage and compute power to prevent indexing lag.
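The baseline above can be checked before anything is installed. The following is a minimal sketch rather than an official tool: the function name meets_minimum is hypothetical, and the 3800 MB threshold is an assumption that allows for a "4GB" machine reporting slightly less than 4096 MB of usable memory.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; thresholds mirror the sizing table above.
MIN_RAM_MB=3800   # assumption: a "4GB" host reports a little under 4096 MB
MIN_CORES=2

# meets_minimum RAM_MB CORES -> succeeds when both values reach the baseline.
meets_minimum() {
  [ "$1" -ge "$MIN_RAM_MB" ] && [ "$2" -ge "$MIN_CORES" ]
}

# MemTotal in /proc/meminfo is reported in kB; convert it to MB.
ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo 2>/dev/null)
cores=$(nproc 2>/dev/null || echo 1)
ram_mb=${ram_mb:-0}

if meets_minimum "$ram_mb" "$cores"; then
  echo "OK: ${ram_mb} MB RAM and ${cores} cores"
else
  echo "WARNING: ${ram_mb} MB RAM and ${cores} cores is below the 4GB / 2-core baseline"
fi
```

The check only reads system information, so it is safe to run on a production host.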

The operational environment requires an Ubuntu 22.04 server configured with a non-root sudo user. This is a critical security measure to prevent the accidental execution of system-level commands as the root user, which could jeopardize the integrity of the operating system. Java deserves a note of its own: Elasticsearch and Logstash run on the JVM, while Kibana runs on Node.js and the Beats agents are written in Go. Recent Elasticsearch and Logstash packages ship with a bundled JDK, so a system-wide Java installation is optional rather than mandatory. If one is installed anyway, for example for other tooling on the host, a supported release such as Java 17 should be used, and the headless JRE (Java Runtime Environment) is recommended for servers because it reduces the installation footprint by omitting graphical user interface components.

The Core Installation Sequence and Component Order

A critical architectural requirement of the Elastic Stack is version parity. The "Same-Version Rule" dictates that every component (Elasticsearch, Logstash, Kibana, and Filebeat) should be installed at the same version number. For instance, if the deployment uses Elasticsearch 8.13.4 from the 8.x repository configured later in this guide, then Kibana, Logstash, and the Beats agents should also be version 8.13.4. This ensures API compatibility and prevents data serialization errors between the shipper and the indexer.
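Version parity is easy to verify on a Debian-based host once the packages are in place. The sketch below assumes all four components were installed from apt on the same machine; the function names are hypothetical, and dpkg-query is the only external command used.

```bash
# Succeeds only when every argument is the identical version string.
all_same_version() {
  first=$1
  for v in "$@"; do
    [ "$v" = "$first" ] || return 1
  done
  return 0
}

# Print the installed version of a Debian package, or nothing if it is absent.
installed_version() {
  dpkg-query -W -f='${Version}' "$1" 2>/dev/null
}

# Compare the four packages this guide installs.
check_stack_versions() {
  all_same_version \
    "$(installed_version elasticsearch)" \
    "$(installed_version kibana)" \
    "$(installed_version logstash)" \
    "$(installed_version filebeat)"
}
```

Running check_stack_versions && echo "versions match" reports parity. A package that is not installed yields an empty string, so a mismatch is also reported when only some of the components are present.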

When deploying a self-managed cluster, the installation must follow a specific logical order to ensure that dependencies are resolved. The recommended sequence is as follows:

1. Elasticsearch (The storage and indexing engine must exist before data can be sent).
2. Kibana (The visualization tool requires an active Elasticsearch cluster to connect to).
3. Logstash (The pipeline needs a destination to ship the processed data).
4. Beats/Filebeat (The shippers are the final link, forwarding data to Logstash or Elasticsearch).

This order ensures that when the "edge" components (like Filebeat) start, they do not encounter "Connection Refused" errors, which could lead to crash loops or data loss during the initial startup phase.
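The same ordering concern applies on every reboot, not only on first install. One way to guard against it is to have a start script wait for the upstream service before launching the next component. The helper below is a generic sketch; retry_until is a hypothetical name, not part of any Elastic tooling.

```bash
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only when another attempt will follow.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

As a usage example, retry_until 30 2 curl -sk -o /dev/null https://localhost:9200 && sudo systemctl start filebeat waits up to about a minute for Elasticsearch to accept connections before Filebeat is started. The https scheme and the -k flag are assumptions that match the default 8.x setup, where TLS is enabled with a self-signed certificate; curl exits successfully as soon as any HTTP response arrives, which is enough for a readiness probe.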

Installing the Java Runtime Environment

Elasticsearch and Logstash are the JVM-based components of the stack, and their recent packages include a bundled JDK, so this step is optional. Where a system-wide Java runtime is wanted, the following commands update the local package index and install the headless version of OpenJDK 17.

```bash
sudo apt update
sudo apt install openjdk-17-jre-headless -y
```

After installation, it is mandatory to verify the version to ensure the environment is correctly configured.

```bash
java -version
```

The presence of the headless version is vital for server environments as it minimizes the attack surface by removing unnecessary X11 libraries.

Elasticsearch Installation and Configuration

Elasticsearch is the central nervous system of the stack. It provides the REST API and the indexing capabilities required to search through millions of log entries in milliseconds.

To install Elasticsearch, the system must first trust the official Elastic repository via a GPG key. This ensures that the packages downloaded are authentic and have not been tampered with.

```bash
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
```

Once the key is in place, the repository is added to the system's sources list:

```bash
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
```

The package cache is then updated and the software is installed:

```bash
sudo apt update
sudo apt install elasticsearch -y
```

Following the installation, the configuration file located at /etc/elasticsearch/elasticsearch.yml must be modified to define the node's behavior within the cluster.

```bash
sudo nano /etc/elasticsearch/elasticsearch.yml
```

The following configuration parameters are essential for a basic setup:

cluster.name: elk-cluster (Defines the name of the cluster for identification).
node.name: node-1 (Assigns a unique name to this specific node).
path.data: /var/lib/elasticsearch (Specifies where the actual index data is stored on the disk).
path.logs: /var/log/elasticsearch (Defines the location for the engine's own internal logs).
network.host: localhost (Binds the service to the local interface for security).
http.port: 9200 (The standard port for REST API communication).
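For reference, the same six settings in plain YAML form, without the annotations, can be printed with a small helper. This is only a convenience sketch; render_es_settings is a hypothetical name, and the values are the ones listed above.

```bash
# Print the basic settings as YAML so they can be reviewed and merged by hand.
render_es_settings() {
  cat <<'EOF'
cluster.name: elk-cluster
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: localhost
http.port: 9200
EOF
}

render_es_settings
```

The output is meant to be merged into the existing file rather than written over it, because the 8.x packages add their own security settings to elasticsearch.yml during installation.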

By binding the network host to localhost, the administrator prevents the Elasticsearch API from being exposed directly to the public internet, which would be a catastrophic security failure.
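Once the configuration is saved, the service has to be enabled and restarted before the settings take effect. The sketch below wraps the usual systemd commands in a dry-run helper so the steps can be previewed first; the run, start_elasticsearch, and check_elasticsearch names are hypothetical. The certificate path is where the 8.x Debian package places its generated CA, and the elastic user is the built-in superuser whose password is printed during installation.

```bash
# Echo commands instead of running them when DRY_RUN=1, to preview the steps.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Register the unit, enable it at boot, and apply the edited configuration.
start_elasticsearch() {
  run sudo systemctl daemon-reload
  run sudo systemctl enable elasticsearch.service
  run sudo systemctl restart elasticsearch.service
}

# Query the REST API over TLS; curl prompts for the elastic user's password.
check_elasticsearch() {
  run sudo curl --cacert /etc/elasticsearch/certs/http_ca.crt \
    -u elastic https://localhost:9200
}
```

Calling start_elasticsearch followed by check_elasticsearch should return a short JSON document containing the cluster name and version number. Setting DRY_RUN=1 beforehand prints the commands without executing them.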

Logstash and Kibana Deployment

Logstash acts as the intermediary pipeline and follows an "Input-Filter-Output" architecture: it collects data from sources such as Filebeat, filters it (for example, using Grok patterns to parse unstructured text into structured fields), and outputs the resulting events to Elasticsearch.
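The three stages map directly onto the three sections of a Logstash pipeline file. The following sketch prints a minimal pipeline for syslog-style lines. Several details are assumptions: port 5044 is the conventional Beats port, SYSLOGLINE is one of the stock Grok patterns, the syslog-* index name is purely illustrative, and the TLS and credential settings that a default 8.x cluster requires are left out for brevity.

```bash
# Print a minimal Input-Filter-Output pipeline (hypothetical helper name).
render_logstash_pipeline() {
  cat <<'EOF'
input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "%{SYSLOGLINE}" }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "syslog-%{+YYYY.MM.dd}"
  }
}
EOF
}

render_logstash_pipeline
```

On a package install, pipeline files of this kind normally live under /etc/logstash/conf.d/, so the output could be reviewed and then saved there with sudo tee.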

Kibana serves as the window into the data. By default, however, Kibana listens only on localhost. To make it reachable from a browser on a remote workstation, a reverse proxy is typically placed in front of it, and Nginx is a common choice for this purpose.

Installing Nginx allows the administrator to map an external port (typically 80 or 443) to the internal Kibana port. Furthermore, because the Elastic Stack exposes sensitive server metadata and logs, it is imperative to secure the Nginx proxy with a TLS/SSL certificate. Without encryption, the credentials used to access Kibana and the log data itself would be transmitted in plaintext, leaving the system vulnerable to man-in-the-middle attacks.
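A server block for this arrangement might look like the output of the helper below. It is a sketch under stated assumptions: the helper name, the hostname, and the certificate paths are all hypothetical placeholders, and Kibana is assumed to be listening on its default port 5601 on the same host.

```bash
# render_kibana_proxy SERVER_NAME CERT_PATH KEY_PATH
# Prints an Nginx server block that terminates TLS and forwards to Kibana.
render_kibana_proxy() {
  server_name=$1
  cert=$2
  key=$3
  cat <<EOF
server {
    listen 443 ssl;
    server_name ${server_name};

    ssl_certificate     ${cert};
    ssl_certificate_key ${key};

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto \$scheme;
    }
}
EOF
}
```

For example, render_kibana_proxy kibana.example.com /etc/ssl/certs/kibana.crt /etc/ssl/private/kibana.key prints a block that can be reviewed, saved under /etc/nginx/sites-available/, and validated with sudo nginx -t before Nginx is reloaded.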

Integrating Filebeat for Log Shipping

Filebeat is a lightweight shipper that resides on the servers producing the logs. Its primary role is to monitor log files, harvest new entries, and send them to Logstash or Elasticsearch.

The integration of Filebeat completes the data chain:
- Filebeat reads a log file from the disk.
- Filebeat ships the log to Logstash.
- Logstash parses the log and adds metadata.
- Logstash sends the parsed event to Elasticsearch, which indexes it.
- Kibana queries Elasticsearch to display the log in a dashboard.

This decoupled architecture ensures that if the central Elastic Stack server goes down, Filebeat can keep track of where it left off in the log file (using a registry file), preventing data loss once the connection is restored.
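On the shipping side, the first two steps of the chain correspond to one input section and one output section in filebeat.yml. The sketch below prints a minimal configuration; the helper name, the input id, and the /var/log/*.log path are illustrative assumptions, and the Logstash address is passed in as an argument.

```bash
# render_filebeat_config LOGSTASH_HOST_PORT
# Prints a minimal filebeat.yml that tails log files and ships to Logstash.
render_filebeat_config() {
  logstash=$1
  cat <<EOF
filebeat.inputs:
  - type: filestream
    id: system-logs
    enabled: true
    paths:
      - /var/log/*.log

output.logstash:
  hosts: ["${logstash}"]
EOF
}

render_filebeat_config "localhost:5044"
```

Filebeat accepts only one active output, so the output.elasticsearch section in the stock configuration file has to be commented out when output.logstash is used.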

Alternative Deployment Methods: Docker and Containers

For users who prefer an immutable infrastructure approach, the Elastic Stack can be deployed using Docker. This avoids the complexities of manual package management on Ubuntu and ensures that the environment is identical across development, staging, and production.

Container images are available through the Elastic Docker Registry. To deploy multiple nodes simultaneously—such as a three-node Elasticsearch cluster for high availability—Docker Compose is the recommended tool. Using a docker-compose.yml file, an administrator can define the networks, volume mounts for persistent data, and environment variables for all components in a single declarative document.
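A Compose file for a small development setup can be generated as follows. This sketch is deliberately simpler than the three-node cluster mentioned above: it defines a single Elasticsearch node plus Kibana, disables security (acceptable only on an isolated development machine), and publishes the ports on the loopback interface. The helper name and the esdata volume name are hypothetical; the version tag is passed in so that both images stay at the same release.

```bash
# render_compose STACK_VERSION
# Prints a single-node development docker-compose.yml (not production-ready).
render_compose() {
  version=$1
  cat <<EOF
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:${version}
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false   # development only
    ports:
      - "127.0.0.1:9200:9200"
    volumes:
      - esdata:/usr/share/elasticsearch/data

  kibana:
    image: docker.elastic.co/kibana/kibana:${version}
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "127.0.0.1:5601:5601"
    depends_on:
      - elasticsearch

volumes:
  esdata:
EOF
}
```

Redirecting the output to docker-compose.yml and running docker compose up -d in the same directory starts both containers. The named volume keeps index data across container restarts.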

Network and Security Considerations

Operating an Elasticsearch cluster requires specific ports to be open and reachable. The REST interface (Port 9200) and the Kibana interface (Port 5601) must be accessible to authorized users for the cluster to be usable.
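On Ubuntu, access to these ports is commonly restricted with ufw. The helper below only prints the rules so they can be reviewed before being applied; the function name and the example network range are hypothetical. Opening port 9200 has an effect only when Elasticsearch is bound to a non-loopback address, which the basic configuration in this guide avoids.

```bash
# firewall_rules_for CIDR
# Prints (does not run) ufw rules opening the two ports to one trusted range.
firewall_rules_for() {
  cidr=$1
  for port in 9200 5601; do
    echo "sudo ufw allow from ${cidr} to any port ${port} proto tcp"
  done
}

firewall_rules_for "10.0.0.0/24"
```

Limiting the rules to a known source range keeps the services unreachable from the wider internet even if a bind address is later changed.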

In a production environment, security is not optional. The use of CA-signed certificates for Elasticsearch is highly recommended. If a production environment is planned, these certificates should be configured before the deployment of Fleet or the Elastic Agent. If security certificates are changed or updated after the agents are installed, the Elastic Agents must be completely reinstalled to recognize the new chain of trust.

Conclusion: A Comprehensive Analysis of the Stack's Impact

The deployment of the Elastic Stack on Ubuntu 22.04 represents more than just the installation of software; it is the implementation of a sophisticated data observability strategy. By transitioning from fragmented, local log files to a centralized analytics engine, an organization gains the ability to perform real-time forensics.

The technical synergy between the components—the raw ingestion of Filebeat, the transformative power of Logstash, the indexing speed of Elasticsearch, and the visual clarity of Kibana—creates a feedback loop that significantly reduces the Mean Time to Resolution (MTTR) during system outages. The requirement for version parity and the strict adherence to installation order underscores the interdependence of these tools. When configured with a secure Nginx reverse proxy and proper JVM memory allocation, the Elastic Stack becomes a resilient foundation for any infrastructure monitoring project. The shift toward containerized deployment via Docker further enhances this resilience, providing a pathway toward scalable, cloud-native logging architectures that can grow alongside the organization's data requirements.