The Elastic Stack, historically referred to as the ELK Stack, represents a foundational suite of open-source tools designed for centralized logging, real-time analysis, and data visualization. While the acronym originally denoted Elasticsearch, Logstash, and Kibana, the modern ecosystem has expanded to include lightweight shipping agents known as Beats, specifically Filebeat, to handle log ingestion at the source. This stack provides critical visibility into system and application logs, enabling administrators to correlate events across multiple servers, identify emerging issues within specific time frames, and generate custom dashboards for monitoring and troubleshooting. The architecture relies on a strict versioning protocol; all components within the stack must share the same version number to ensure compatibility and stability. Implementing this infrastructure on Ubuntu-based systems, such as Ubuntu 20.04 or Ubuntu 24.04, requires precise configuration of Java environments, package repositories, and inter-component communication protocols.
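The same-version rule can be checked mechanically once the packages are installed. The following is a minimal sketch, assuming the components were installed as .deb packages named elasticsearch, logstash, kibana, and filebeat; the helper names `installed_version` and `versions_match` are hypothetical and not part of any Elastic tooling.

```bash
# Print the installed version of a Debian package (prints nothing if it is absent).
installed_version() {
  dpkg-query -W -f='${Version}' "$1" 2>/dev/null
}

# Succeed only if every argument is identical to the first one.
versions_match() {
  local first="$1" v
  for v in "$@"; do
    [ "$v" = "$first" ] || return 1
  done
  return 0
}
```

One possible invocation is `versions_match "$(installed_version elasticsearch)" "$(installed_version kibana)" "$(installed_version filebeat)" && echo "versions match"`.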
Architectural Components and Data Flow
Understanding the specific role of each component is essential for proper configuration and troubleshooting. The stack operates as a pipeline where data moves from collection to storage to visualization.
Elasticsearch serves as the core of the stack. It is a distributed, RESTful search and analytics engine built upon Apache Lucene. Its high-performance nature makes it ideal for log analytics, particularly due to its support for schema-free JSON documents and its ability to interact with a myriad of programming languages. Elasticsearch acts as the central repository where indexed data resides.
Logstash functions as a server-side data processing pipeline. It collects data from multiple heterogeneous sources, transforms that data into a usable format, and ships it to a designated destination, typically Elasticsearch. This transformation step is critical for normalizing log formats from different applications before they are stored.
Kibana provides the user interface for the stack. As a web-based interface, it visualizes the logs that have been collected by Logstash and indexed by Elasticsearch. It offers a suite of visualization tools, including histograms, line graphs, pie charts, and heat maps. Kibana also includes built-in geospatial support, allowing data to be mapped by geographic coordinates. Because Kibana listens only on localhost by default, it is often placed behind a reverse proxy, such as Nginx, to make it reachable from a web browser in production environments.
Filebeat belongs to the Beats family, the addition that turned the ELK Stack into the broader Elastic Stack (though the suite is often still colloquially called ELK). It is a lightweight shipper designed to forward and centralize logs and files from local systems to the central stack. Filebeat collects logs locally and sends them to Logstash or directly to Elasticsearch, depending on the configuration.
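The Filebeat to Logstash to Elasticsearch flow described above can be made concrete with a small Logstash pipeline definition. The sketch below only prints such a definition; it assumes Filebeat ships to port 5044, that Elasticsearch is reachable at localhost:9200 without authentication, and the function name `logstash_beats_pipeline` is hypothetical.

```bash
# Print a minimal Logstash pipeline: receive events from Beats and
# index them in Elasticsearch under a per-beat, per-day index name.
logstash_beats_pipeline() {
  cat <<'EOF'
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}
EOF
}
```

The output could be written to a file under /etc/logstash/conf.d/ (for example with `logstash_beats_pipeline | sudo tee /etc/logstash/conf.d/beats.conf`, where the file name is an assumption). Any transformation logic would sit in a `filter` block between the input and output sections.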
Prerequisites and Java Environment Configuration
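Before deploying any component of the Elastic Stack, the underlying operating system must be prepared. Elasticsearch is written in Java and runs on the Java Virtual Machine. Recent Elasticsearch packages ship with a bundled JDK, so a system-wide Java installation is not strictly required for them, but installing a compatible Java Development Kit (JDK) remains a common preparatory step and is the approach followed here. The system should first be updated so that all existing packages are current before new software sources are introduced.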
To update the system on Ubuntu, the following commands are executed:
```bash
sudo apt update
sudo apt upgrade -y
```
For installations on Ubuntu 24.04 and compatible systems, OpenJDK 17 is a reasonable choice: it is a Long-Term Support (LTS) release and is installed directly from the standard Ubuntu repositories using the APT package manager:
```bash
sudo apt install openjdk-17-jdk -y
```
Verification of the Java installation is critical to prevent subsequent installation errors. The version can be confirmed by running:
```bash
java -version
```
Repository Management and Component Installation
The components of the Elastic Stack are not available in the default Ubuntu apt repositories. Therefore, the Elastic package source list must be manually added to the system. This involves importing the Elastic GPG key and adding the repository to the APT sources list. While the specific commands for adding the repository vary slightly depending on the Ubuntu version (20.04 vs 24.04) and the specific Elastic version being targeted (such as 8.x), the principle remains consistent: the system must be pointed to artifacts.elastic.co or the appropriate Elastic CDN to fetch the .deb packages.
For example, when installing Filebeat via direct download, the .deb package is retrieved from the Elastic artifacts repository:
```bash
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.9.2-amd64.deb
sudo dpkg -i filebeat-8.9.2-amd64.deb
```
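The repository-based route described earlier in this section (import the GPG key, then add the source list) can be sketched as follows. This is an illustration under assumptions rather than a definitive procedure: it targets the 8.x package line, stores the key in /usr/share/keyrings, and the function names `elastic_repo_line` and `add_elastic_repo` are hypothetical.

```bash
KEYRING=/usr/share/keyrings/elasticsearch-keyring.gpg

# Build the APT source line for a given major line, e.g. "8.x".
elastic_repo_line() {
  local major="$1"
  echo "deb [signed-by=${KEYRING}] https://artifacts.elastic.co/packages/${major}/apt stable main"
}

# Import the Elastic signing key, register the repository, refresh the index.
add_elastic_repo() {
  local major="$1"
  wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch \
    | sudo gpg --dearmor -o "$KEYRING"
  elastic_repo_line "$major" \
    | sudo tee "/etc/apt/sources.list.d/elastic-${major}.list" >/dev/null
  sudo apt-get update
}
```

With the repository registered (`add_elastic_repo 8.x`), each component can then be installed through APT, for example `sudo apt-get install elasticsearch`.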
Elasticsearch Initialization and Verification
Elasticsearch should be started only after the Java environment is confirmed and the repository configuration is complete. The service is managed via systemd. To start the service for the first time:
```bash
sudo systemctl start elasticsearch
```
To ensure Elasticsearch persists across server reboots, it must be enabled:
```bash
sudo systemctl enable elasticsearch
```
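Administrators should allow Elasticsearch a few moments to fully initialize; querying the service immediately after starting it may result in connection errors. Once initialized, the service can be tested by sending an HTTP GET request to port 9200 on localhost. The command below assumes that security features are disabled; Elasticsearch 8.x packages enable TLS and authentication by default, in which case the request must use https:// and supply credentials: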
```bash
curl -X GET "localhost:9200"
```
A successful response will return JSON data containing basic information about the local node, including the cluster name, cluster UUID, version number, build flavor, and the Lucene version. The response typically includes the tagline "You Know, for Search."
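Because the node needs time to come up, a small retry loop can replace manual waiting before the first query. The helper below is a generic sketch; the name `retry` and its argument order (attempts, delay in seconds, then the command) are assumptions made for illustration.

```bash
# Run a command until it succeeds, up to a fixed number of attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local tries="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$tries" ]; do
    "$@" && return 0
    # Pause between attempts, but not after the final failure.
    [ "$i" -lt "$tries" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For instance, `retry 30 2 curl -fsS "localhost:9200"` would poll the node every two seconds for up to a minute; `-f` makes curl return a non-zero status on HTTP errors so that the loop keeps trying.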
Kibana Deployment
Kibana is installed only after Elasticsearch is operational, because it requires a live Elasticsearch instance to connect to for data storage and retrieval. The installation process mirrors that of the other components: the Elastic repository is added, and the Kibana .deb package is installed from it.
Because Kibana is configured to listen on localhost by default, it is not directly accessible from external networks. To enable web browser access, a reverse proxy such as Nginx is typically configured to forward traffic to the Kibana port (default 5601). This setup enhances security by preventing direct exposure of the Kibana interface to the internet.
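A reverse-proxy definition of this kind might look like the sketch below, which only prints an Nginx server block. The host name passed in is a placeholder, the function name `kibana_nginx_conf` is hypothetical, and the example omits TLS and authentication, both of which a production setup would normally add.

```bash
# Print an Nginx server block that forwards requests to Kibana on localhost:5601.
kibana_nginx_conf() {
  local server_name="$1"
  cat <<EOF
server {
    listen 80;
    server_name ${server_name};

    location / {
        proxy_pass http://localhost:5601;
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
    }
}
EOF
}
```

On Ubuntu's Nginx package layout, the output could be saved with `kibana_nginx_conf kibana.example.com | sudo tee /etc/nginx/sites-available/kibana`, linked into sites-enabled, checked with `sudo nginx -t`, and activated with `sudo systemctl reload nginx`.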
Filebeat Configuration and Data Shipping
Filebeat is configured to ship logs to the processing pipeline. The configuration file is located at /etc/filebeat/filebeat.yml. Using a text editor such as nano, administrators modify this file to define inputs and outputs.
To collect system logs and Apache logs, the filestream input type is used. The configuration must enable the input and specify the glob paths for the log files:
```yaml
filebeat.inputs:

# filestream is an input for collecting log messages from files.
- type: filestream
  # Unique ID among all inputs, an ID is required.
  id: my-filestream-id
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    - /var/log/apache2/*.log
```
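Before starting Filebeat, it can be useful to confirm that the glob patterns actually match files on the host. The following sketch assumes the intended patterns are /var/log/*.log and /var/log/apache2/*.log; the function name `count_matches` is hypothetical.

```bash
# Count how many of the given paths are regular files.
# The shell expands the glob before the function is called; a pattern
# with no matches is passed through literally and is not counted.
count_matches() {
  local n=0 f
  for f in "$@"; do
    [ -f "$f" ] && n=$((n + 1))
  done
  echo "$n"
}
```

Calling `count_matches /var/log/*.log` or `count_matches /var/log/apache2/*.log` prints the number of files each input path would pick up; a result of 0 suggests a wrong path or a service that is not logging there.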
The output configuration determines where Filebeat sends the data. If shipping to Logstash, the Elasticsearch output block must be commented out, and the Logstash output block must be uncommented and configured with the IP address of the Logstash server and the appropriate port (default 5044):
```yaml
# ---------------------------- Elasticsearch Output ----------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.x.x:5044"]
```
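Filebeat ships `test config` and `test output` subcommands that can catch YAML mistakes and unreachable outputs before the service is started. The wrapper below is a small sketch around them; the function name `verify_filebeat` is hypothetical, and the default path assumes the .deb package layout.

```bash
# Validate the Filebeat configuration, then try to reach the configured output.
verify_filebeat() {
  local config="${1:-/etc/filebeat/filebeat.yml}"
  sudo filebeat test config -c "$config" || return 1
  sudo filebeat test output -c "$config" || return 1
}
```

Running `verify_filebeat` with no arguments checks the default configuration file; a non-zero exit status indicates that either the syntax check or the connection test failed.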
Once configured, Filebeat is enabled and started:
```bash
sudo systemctl enable filebeat
sudo systemctl start filebeat
```
The status is verified using:
```bash
sudo systemctl status filebeat
```
Setting Up Kibana Dashboards and Index Templates
For Kibana to visualize data effectively, specific index patterns and dashboards must be loaded. Filebeat provides setup commands to automate this process. First, the ingest pipeline for the system module is loaded:
```bash
sudo filebeat setup --pipelines --modules system
```
Next, the index template is loaded into Elasticsearch. An index represents a set of documents with similar characteristics. The command below disables the Logstash output temporarily to connect directly to Elasticsearch for the setup process:
```bash
sudo filebeat setup --index-management -E output.logstash.enabled=false -E 'output.elasticsearch.hosts=["localhost:9200"]'
```
Finally, the Kibana dashboards and index patterns are loaded. This step requires specifying the Kibana host:
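```bash
# Quote the whole hosts setting so the shell passes the brackets and quotes through intact.
sudo filebeat setup -E output.logstash.enabled=false -E 'output.elasticsearch.hosts=["localhost:9200"]' -E setup.kibana.host=localhost:5601
```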
After these setup commands complete, Filebeat resumes normal operation, shipping log data to Logstash for processing and eventually to Elasticsearch for indexing, where it becomes available for visualization in Kibana.
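To confirm that events are actually arriving, the document count of the Filebeat indices can be queried through the Elasticsearch `_count` API. The sketch below assumes an unauthenticated HTTP endpoint and index names beginning with `filebeat-`; the function name `filebeat_doc_count` is hypothetical, and the JSON is parsed with sed only to avoid extra dependencies.

```bash
# Print the number of documents stored in indices matching filebeat-*.
filebeat_doc_count() {
  local host="${1:-localhost:9200}"
  curl -fsS "http://${host}/filebeat-*/_count" \
    | sed -n 's/.*"count":\([0-9][0-9]*\).*/\1/p'
}
```

Running `filebeat_doc_count` twice a few seconds apart should show the number growing if the pipeline is healthy; an empty result points to a connection or authentication problem.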
Conclusion
The implementation of the Elastic Stack on Ubuntu provides a robust framework for centralized logging and data analysis. By strictly adhering to version compatibility across Elasticsearch, Logstash, Kibana, and Filebeat, administrators ensure a stable foundation. The process requires careful attention to Java prerequisites, repository configuration, and service management. Furthermore, the configuration of Filebeat inputs and outputs, along with the proper loading of Kibana dashboards and index templates, is critical for transforming raw log data into actionable insights. As systems grow in complexity, the ability to correlate logs across multiple servers and visualize them in real-time becomes indispensable for maintaining system health and security.