Architecting Centralized Logging: A Comprehensive Guide to Deploying the Elastic Stack on Ubuntu 20.04

Deploying the Elastic Stack, still commonly called the ELK Stack, marks a move from fragmented log management to a centralized logging architecture. The Elastic Stack is a collection of open-source software from Elastic that lets system administrators and DevOps engineers search, analyze, and visualize logs generated from any source and in any format. In modern distributed environments, logs are often scattered across dozens or even hundreds of individual servers, which makes manual troubleshooting impractical. Centralized logging solves this by aggregating all telemetry into a single, searchable repository, so engineers can identify problems with servers or applications by querying one data store. It also makes it possible to correlate logs across multiple servers within a specific time frame, which is indispensable for tracing a request through a microservices environment or diagnosing cascading failures across a cluster.

The ecosystem consists of four primary components: Elasticsearch, Logstash, Kibana, and Beats (specifically Filebeat in many standard deployments). Elasticsearch serves as the heart of the stack, functioning as a highly scalable search and analytics engine. Logstash acts as the processing pipeline, ingesting data from multiple sources, transforming it, and sending it to a destination. Kibana provides the visualization layer, turning the raw data stored in Elasticsearch into intuitive dashboards and charts. Finally, Beats—such as Filebeat—are lightweight shippers that reside on the edge of the network to forward logs and files to the central stack. While these components can be distributed across a massive cluster, this guide focuses on a consolidated installation on a single Ubuntu 20.04 server, utilizing Nginx as a reverse proxy to make the Kibana interface accessible via a web browser, as Kibana is typically restricted to the localhost by default.

Hardware and System Prerequisites

Before starting the installation, confirm that the hardware meets the minimum requirements; otherwise the Linux Out-Of-Memory (OOM) killer may terminate the Elasticsearch process under load. The Elastic Stack is resource-intensive, particularly in RAM and CPU, because Elasticsearch and Logstash run on the Java Virtual Machine (JVM) and Elasticsearch indexes data with the Lucene engine.

The foundational environment must be running Ubuntu 20.04 (Focal Fossa) or 22.04. From a hardware perspective, a minimum of 4GB of RAM is required, although 8GB is strongly recommended for production or data-heavy environments. The system must possess at least 2 CPU cores to handle the concurrent processing of indexing and searching. Root or sudo access is mandatory to modify system-level configurations and install packages.

The specific resource allocation per component is detailed in the following table:

Component	Minimum RAM	CPU Requirement	Minimum Disk Space
Elasticsearch	2GB+	2 cores	50GB+
Logstash	1GB+	1 core	10GB
Kibana	1GB+	1 core	1GB

Falling short of these specifications can severely degrade performance. Recent Elasticsearch releases size the JVM heap automatically from the available memory, while older 7.x releases ship with a fixed 1GB default. If the server has limited memory, administrators can set the heap size manually in the JVM options, for example to 1GB or 512MB, though this may limit the volume of logs the system can process efficiently.
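The minimums above can be checked before installing anything. The following is a minimal sketch, assuming a Linux host that exposes /proc/meminfo and provides nproc; the function name check_minimum and the 3800 MB threshold are choices made for this example (a nominal 4GB machine reports slightly less than 4096 MB), not values from Elastic.

```shell
#!/usr/bin/env bash
# Compare detected RAM and CPU cores against the minimums stated above.
# MIN_RAM_MB is set slightly below 4096 because a nominal 4GB machine
# reports a little less than that (an assumption of this sketch).
MIN_RAM_MB=3800
MIN_CORES=2

# check_minimum VALUE MINIMUM -> prints "ok" or "below minimum"
check_minimum() {
  if [ "$1" -ge "$2" ]; then
    echo "ok"
  else
    echo "below minimum"
  fi
}

# MemTotal in /proc/meminfo is reported in kB; fall back to 0 if unreadable.
ram_mb=0
if [ -r /proc/meminfo ]; then
  ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
fi
ram_mb=${ram_mb:-0}
cores=$(nproc 2>/dev/null || echo 1)

echo "RAM:   ${ram_mb} MB -> $(check_minimum "$ram_mb" "$MIN_RAM_MB")"
echo "Cores: ${cores} -> $(check_minimum "$cores" "$MIN_CORES")"
```

The script only reports; it does not block the installation, so it is safe to run on any candidate server.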

The Java Runtime Environment Installation

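Elasticsearch and Logstash run on the Java Virtual Machine. Recent packages of both ship with a bundled JDK, so a system-wide Java installation is not strictly required for them; it remains relevant for older releases and for setups that are configured to use the system Java. When a system Java is used, Java 11 or 17 is required depending on the version of the Elastic Stack being deployed. For installations targeting the latest versions of the stack, OpenJDK 17 is the preferred choice.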

To install the JRE, the system package cache must first be updated to ensure the latest metadata is retrieved from the Ubuntu repositories.

sudo apt update

Once updated, the headless version of OpenJDK 17 should be installed. The headless version is chosen because the Elastic Stack server typically does not require a graphical user interface, thereby reducing the system footprint.

sudo apt install openjdk-17-jre-headless -y

To verify that the Java environment is correctly installed and mapped to the system path, the following command must be executed:

java -version

This check matters when the stack is configured to use the system Java rather than a bundled JDK: in that case a missing or incompatible JVM will prevent the Elasticsearch service from starting.

Configuring the Elastic Repository

The components of the Elastic Stack are not available in the default Ubuntu package repositories. Consequently, the official Elastic APT repository must be added to the system. This process involves importing the GPG key to ensure the integrity and authenticity of the packages, protecting the system from package spoofing.

First, the GPG key is downloaded and stored in the system's keyring:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

Following the key import, the repository definition is added to the sources list. For those installing the 8.x branch:

echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

Alternatively, for the 7.x branch, the following commands can be used. Note that apt-key is deprecated in newer Ubuntu releases, although it still works on 20.04:

sudo apt -y install gnupg

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

sudo apt -y install apt-transport-https

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

After adding the repository, the package cache must be refreshed again:

sudo apt update
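The two branches differ only in the branch number inside the repository URL, and the keyring-based signed-by form can be used for either of them. The sketch below builds the source line for a given major version; the function name elastic_repo_line is hypothetical, and the script only prints the line rather than writing to /etc.

```shell
#!/usr/bin/env bash
# Keyring path used earlier in this guide.
KEYRING=/usr/share/keyrings/elasticsearch-keyring.gpg

# elastic_repo_line MAJOR -> prints the APT source line for that branch.
elastic_repo_line() {
  local major="$1"
  echo "deb [signed-by=${KEYRING}] https://artifacts.elastic.co/packages/${major}.x/apt stable main"
}

# Print the line for the 8.x branch. To install it, pipe the output to:
#   sudo tee /etc/apt/sources.list.d/elastic-8.x.list
elastic_repo_line 8
```

Only one branch should be configured at a time, so that every component of the stack is installed from the same release line.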

Elasticsearch Installation and Core Configuration

With the repository active, Elasticsearch can be installed using the standard APT package manager.

sudo apt install elasticsearch -y

A specific version can be pinned at install time. This matters because the same version should be used across the entire stack (Elasticsearch, Kibana, Logstash, and Filebeat), and the pinned version must be available from the repository branch that was configured (a 7.x version requires the 7.x repository):

sudo apt -y install elasticsearch=7.10.2
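Version parity can be checked once several components are installed. The following sketch assumes a Debian-based host with dpkg-query; the helper names are hypothetical. Package versions may carry a Debian epoch and revision (for example 1:7.10.2-1), so they are normalized before being compared.

```shell
#!/usr/bin/env bash
# normalize_version VERSION -> strips a Debian epoch ("1:") and revision ("-1").
normalize_version() {
  local v="$1"
  v="${v#*:}"
  v="${v%%-*}"
  echo "$v"
}

# versions_match V1 V2 ... -> succeeds only if every argument is identical.
versions_match() {
  local first="$1" v
  for v in "$@"; do
    [ "$v" = "$first" ] || return 1
  done
  return 0
}

# Collect the versions of whichever stack packages are installed.
versions=()
for pkg in elasticsearch kibana logstash filebeat; do
  raw=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null)
  if [ -n "$raw" ]; then
    v=$(normalize_version "$raw")
    echo "$pkg $v"
    versions+=("$v")
  fi
done

if [ "${#versions[@]}" -eq 0 ]; then
  echo "no Elastic Stack packages installed"
elif versions_match "${versions[@]}"; then
  echo "versions match"
else
  echo "version mismatch"
fi
```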

Once installed, the configuration is managed via the elasticsearch.yml file. This file defines how the node behaves within a cluster and how it interacts with the network.

sudo nano /etc/elasticsearch/elasticsearch.yml

The following configurations are essential for a standard single-node setup:

Cluster and Node Identification: cluster.name: elk-cluster and node.name: node-1 allow the node to identify itself within the network.
Path Management: path.data: /var/lib/elasticsearch and path.logs: /var/log/elasticsearch define where the actual data and logs are stored on the disk.
Network Binding: network.host: localhost ensures that the service only listens for local connections by default, while http.port: 9200 sets the standard REST API port.
Discovery Mode: discovery.type: single-node is crucial for single-server installations to prevent the node from attempting to find other cluster members.

Security settings are also integrated into this file to protect sensitive log data:

xpack.security.enabled: true enables authentication and authorization for the cluster.
xpack.security.enrollment.enabled: true allows additional nodes and Kibana to be enrolled with enrollment tokens.
xpack.security.http.ssl.enabled: true mandates encrypted communication on the REST API, using the certs/http.p12 keystore.
xpack.security.transport.ssl.enabled: true secures the communication between nodes, using certs/transport.p12 as both keystore and truststore.
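Put together, the settings above might look like the following file. This is a sketch that writes a sample to a scratch directory for review instead of touching /etc/elasticsearch; the keystore and truststore path keys are assumptions that follow the certs/http.p12 and certs/transport.p12 files mentioned above. If the existing elasticsearch.yml already contains a cluster.initial_master_nodes line, it generally has to be removed when discovery.type: single-node is set.

```shell
#!/usr/bin/env bash
# Write a sample single-node configuration to a scratch directory.
# Copying it to /etc/elasticsearch/elasticsearch.yml is left as a manual step.
scratch_dir=$(mktemp -d)
es_conf="${scratch_dir}/elasticsearch.yml"

cat > "$es_conf" <<'EOF'
# Cluster and node identification
cluster.name: elk-cluster
node.name: node-1

# Data and log locations
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

# Listen on the loopback interface only
network.host: localhost
http.port: 9200

# Single-server installation
discovery.type: single-node

# Security (keystore paths are assumptions based on the text above)
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/http.p12
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: certs/transport.p12
xpack.security.transport.ssl.truststore.path: certs/transport.p12
EOF

# List the active (non-comment, non-blank) settings.
grep -Ev '^[[:space:]]*(#|$)' "$es_conf"
```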

JVM Heap Memory Tuning

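Elasticsearch is notorious for its memory consumption. Recent releases size the JVM heap automatically based on the memory available to the node, and older 7.x releases default to a fixed 1GB heap. If the server has limited resources, or also hosts Logstash and Kibana as in this guide, setting the heap explicitly helps avoid system instability.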

For newer installations, the heap options are managed in a dedicated directory:

sudo nano /etc/elasticsearch/jvm.options.d/heap.options

For older releases that do not read the jvm.options.d directory, the primary jvm.options file is used:

sudo nano /etc/elasticsearch/jvm.options

The administrator must modify the -Xms (initial heap size) and -Xmx (maximum heap size) settings. For example, to reduce usage to 1GB:

-Xms1g
-Xmx1g

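Setting both values to the same size avoids heap resizing at runtime. As a rule of thumb the heap should not exceed about half of the physical RAM, which leaves room for the operating system's file cache and the other services on the host and reduces the risk of the OOM killer terminating the process.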
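A heap value can be derived from the machine's memory rather than picked by hand. The sketch below applies a common rule of thumb (half of RAM, capped at 31GB); the function names are hypothetical, and on a single server that also runs Logstash and Kibana a smaller share than one half may be more appropriate.

```shell
#!/usr/bin/env bash
# suggest_heap_mb TOTAL_RAM_MB -> half of RAM, capped at 31 GB (31744 MB).
suggest_heap_mb() {
  local total_mb="$1"
  local heap_mb=$(( total_mb / 2 ))
  local cap_mb=31744
  if [ "$heap_mb" -gt "$cap_mb" ]; then
    heap_mb="$cap_mb"
  fi
  echo "$heap_mb"
}

# render_heap_options HEAP_MB -> the two lines for a jvm.options.d file,
# with initial and maximum heap set to the same value.
render_heap_options() {
  printf -- '-Xms%sm\n-Xmx%sm\n' "$1" "$1"
}

# Example for a 4GB server; prints -Xms2048m and -Xmx2048m.
render_heap_options "$(suggest_heap_mb 4096)"
```

The printed lines can be saved as /etc/elasticsearch/jvm.options.d/heap.options, after which the service needs a restart to pick them up.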

Service Activation and Password Management

After configuration and memory tuning, the Elasticsearch service must be enabled to start on boot and then manually started.

sudo systemctl daemon-reload

sudo systemctl enable elasticsearch

sudo systemctl start elasticsearch

The current status of the service can be verified via:

sudo systemctl status elasticsearch

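In versions where X-Pack security is enabled, the default elastic superuser needs a password. The 8.x packages print a generated password during installation; if that output was lost, the following command (available in 8.x) resets the password to a new random value. On 7.x, passwords are set with the elasticsearch-setup-passwords tool instead.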

sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

It is imperative to save this password immediately, as it is required for all subsequent Kibana and API interactions.

To verify that the installation is successful and the API is responding, a curl request can be sent to the local port. If SSL is enabled (the -k flag skips certificate verification, which is acceptable only for a local smoke test):

curl -k -u elastic:YOUR_PASSWORD https://localhost:9200

If SSL is disabled:

curl -u elastic:YOUR_PASSWORD http://localhost:9200
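On a small server the API may not answer on the very first request after a start or restart, so scripted checks often retry for a short while. The following is a generic sketch; retry and probe_es are hypothetical helper names, and ES_PASSWORD is an assumed environment variable holding the elastic user's password.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY_SECONDS COMMAND... -> run COMMAND until it succeeds
# or the number of attempts is exhausted.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# probe_es URL -> succeeds when the endpoint answers with an HTTP success code.
# -f fails on HTTP errors, -s silences progress output, -k skips certificate
# verification (local smoke test only).
probe_es() {
  curl -fsk -o /dev/null -u "elastic:${ES_PASSWORD:-}" "$1"
}

# Usage (not executed here):
#   ES_PASSWORD='...' retry 30 2 probe_es https://localhost:9200
```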

Kibana Deployment and Reverse Proxy Setup

Kibana is the visualization window into the Elasticsearch data. Its installation is straightforward via the APT manager:

sudo apt install kibana -y

The configuration file kibana.yml must be edited to set the network binding and define the server identity.

sudo nano /etc/kibana/kibana.yml

Key settings include:

server.port: 5601: The default port for Kibana.
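server.host: "0.0.0.0": This binds Kibana to all network interfaces, allowing it to be reached outside the localhost. When Nginx on the same host fronts Kibana, as in this guide, "localhost" is the safer value, because the proxy reaches Kibana over the loopback interface.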
server.name: "kibana-server": A friendly name for the server.

Because exposing port 5601 directly to the internet is a security risk, Nginx is used as a reverse proxy. This allows Kibana to be served over port 80 (HTTP) or 443 (HTTPS) and gives a single place to add TLS and access control.
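A reverse proxy of this kind can be sketched as a single Nginx server block. The script below writes a sample to a scratch directory for review; it assumes Kibana listens on localhost:5601 and uses your_domain as a placeholder, and it covers plain HTTP only.

```shell
#!/usr/bin/env bash
# Write a sample Nginx server block to a scratch directory for review.
scratch_dir=$(mktemp -d)
site_file="${scratch_dir}/your_domain"

# The quoted 'EOF' keeps Nginx variables such as $host from being
# expanded by the shell.
cat > "$site_file" <<'EOF'
server {
    listen 80;
    server_name your_domain www.your_domain;

    location / {
        # Forward every request to the local Kibana instance.
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_cache_bypass $http_upgrade;
    }
}
EOF

cat "$site_file"
```

On Ubuntu such a file is typically placed in /etc/nginx/sites-available/, linked into /etc/nginx/sites-enabled/, checked with sudo nginx -t, and activated with sudo systemctl reload nginx.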

For a professional production setup, it is strongly encouraged to install a TLS/SSL certificate. This requires a Fully Qualified Domain Name (FQDN) and corresponding DNS records:

your_domain pointing to the server's public IP.
www.your_domain pointing to the server's public IP.

A certificate from a certificate authority such as Let's Encrypt can then be used to secure the Nginx server block, ensuring that the valuable server information accessible via the Elastic Stack is not exposed to unauthorized users.

Log Aggregation with Logstash and Filebeat

While Elasticsearch stores the data and Kibana visualizes it, Logstash and Filebeat are responsible for the movement of data.

Filebeat is a lightweight agent installed on the servers where logs are generated. It forwards these logs to Logstash or directly to Elasticsearch. Because Filebeat only reads and ships log lines, the cost of parsing them stays off the servers that produce the logs.

Logstash acts as the intermediate processor. It can filter, parse, and enrich the data. For example, it can take a raw system log and turn it into a structured JSON object that Elasticsearch can index efficiently. The flow of data follows this logical path:

Log Sources (Syslog, App Logs) -> Filebeat/Logstash -> Elasticsearch -> Kibana -> End User.

This architecture ensures that the "heavy lifting" of data transformation happens in Logstash, while Elasticsearch focuses on indexing and Kibana focuses on querying.
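The data flow above can be expressed as a small Logstash pipeline. The following sketch writes a sample pipeline file to a scratch directory; port 5044 for the Beats input, the SYSLOGLINE grok pattern, and the index naming are assumptions chosen for illustration. The output block assumes an unsecured local Elasticsearch; with security enabled it would additionally need credentials and TLS settings.

```shell
#!/usr/bin/env bash
# Write a sample Logstash pipeline to a scratch directory for review.
scratch_dir=$(mktemp -d)
pipeline_file="${scratch_dir}/beats-pipeline.conf"

cat > "$pipeline_file" <<'EOF'
# Receive events from Filebeat.
input {
  beats {
    port => 5044
  }
}

# Parse raw syslog lines into structured fields.
filter {
  grok {
    match => { "message" => "%{SYSLOGLINE}" }
  }
}

# Index the structured events into the local Elasticsearch node.
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}
EOF

cat "$pipeline_file"
```

On a package installation, pipeline files normally live in /etc/logstash/conf.d/, and the syntax can be checked with Logstash's --config.test_and_exit option before the service is restarted.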

Conclusion

The successful deployment of the Elastic Stack on Ubuntu 20.04 turns raw log files scattered across servers into a searchable, centralized data store. By keeping version parity across all components, so that Elasticsearch, Kibana, Logstash, and Filebeat share the same version number, administrators avoid the compatibility conflicts that often plague these installations. Moving from a default installation to a production-ready environment requires a solid understanding of JVM heap management, network binding, and the use of Nginx as a secure gateway.

Given the sensitivity of system logs, X-Pack security and SSL/TLS encryption should be treated as essential rather than optional. When configured correctly, the stack provides deep visibility into system health, allowing rapid detection of anomalies and correlation of events across distributed nodes. The investment in hardware, specifically the 8GB RAM recommendation, pays off in stability and query performance, so that the centralized logging system remains a reliable asset rather than a bottleneck.