Centralized Log Management: Deploying the Elastic Stack on Linux Servers

The management of system logs across distributed infrastructure has evolved from a manual, fragmented process into a streamlined, automated discipline. In environments where multiple services run simultaneously, the traditional method of inspecting logs individually on each server is inefficient and prone to human error. The Elastic Stack, formerly known as the ELK Stack, addresses this challenge by providing a centralized logging solution. This suite of open-source software tools allows administrators to collect, search, analyze, and visualize logs generated from any source in any format. By aggregating data into a single location, the stack enables the correlation of logs across multiple servers during specific time frames, facilitating rapid troubleshooting and performance monitoring. The core components of this architecture include Elasticsearch for storage and search, Logstash for data processing, Kibana for visualization, and Filebeat as a lightweight shipper for forwarding logs from client endpoints to the central server.

System Preparation and Java Dependencies

Before deploying the Elastic Stack components, the Linux server must be prepared to handle the resource-intensive nature of Java-based applications. Elasticsearch and Logstash run on the Java Virtual Machine, so a suitable Java runtime is a prerequisite. The required version depends on the distribution and on the Elastic Stack release being deployed: this guide pairs Elasticsearch 7.6.1 with OpenJDK 1.8 and Elasticsearch 8.x with JDK 21. Elasticsearch packages from 7.0 onward also ship with a bundled OpenJDK, so a system-wide JDK matters mostly for components that do not bundle one, such as older Logstash releases.

To ensure system stability and security, the first step involves updating the operating system’s package lists and upgrading installed packages. This process applies the latest security patches and ensures compatibility with subsequent installations. On Red Hat-based systems, this is achieved using the package manager.

```bash
yum update
```

If the required Java version is not present, it must be installed. For environments targeting older Elastic versions or specific legacy requirements, OpenJDK 1.8 is a common choice. The installation can be performed via the command line.

```bash
yum -y install java-1.8.0-openjdk*
```

Following installation, verification is critical to ensure the runtime environment is correctly configured. The version string confirms the build and architecture compatibility.

```bash
java -version
openjdk version "1.8.0_362"
```

In more recent deployments targeting Elasticsearch 8.x, the requirement shifts to JDK 21. This version can be downloaded directly from Oracle and installed via RPM packages.

```bash
cd /opt
wget https://download.oracle.com/java/21/latest/jdk-21_linux-x64_bin.rpm
rpm -Uvh jdk-21_linux-x64_bin.rpm
```

Verification for the newer JDK version follows the same protocol.

```bash
java -version
java version "21.0.2" 2024-01-16 LTS
Java(TM) SE Runtime Environment (build 21.0.2+13-LTS-58)
Java HotSpot(TM) 64-Bit Server VM (build 21.0.2+13-LTS-58, mixed mode, sharing)
```

Memory management is another critical aspect of preparation. Elasticsearch benefits significantly from locked memory to prevent the operating system from swapping memory pages to disk, which causes latency. To enable this, the mlockall capability must be configured in the systemd service file.

```bash
vi /usr/lib/systemd/system/elasticsearch.service
```

Within this file, the memory lock limit is set to infinity.

```
LimitMEMLOCK=infinity
```

Additionally, the environment configuration file must be updated to reflect this setting.

```bash
vi /etc/sysconfig/elasticsearch
```

The variable is adjusted as follows.

```
MAX_LOCKED_MEMORY=unlimited
```
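As an alternative to editing the packaged unit file, which a package upgrade can overwrite, the same limit can be placed in a systemd drop-in. The following is a minimal sketch under that assumption; the function names are hypothetical helpers, and the default directory is the standard systemd override location for the `elasticsearch` unit.

```shell
# Print the drop-in contents; LimitMEMLOCK belongs in the [Service] section.
memlock_dropin() {
  printf '[Service]\nLimitMEMLOCK=infinity\n'
}

# Write the drop-in below a unit override directory.
# Default: the standard systemd location for the elasticsearch unit.
write_memlock_dropin() {
  local dir="${1:-/etc/systemd/system/elasticsearch.service.d}"
  mkdir -p "$dir"
  memlock_dropin > "$dir/override.conf"
}

# Usage (as root), followed by a reload so systemd picks up the new limit:
#   write_memlock_dropin
#   systemctl daemon-reload
```

Keeping the override in its own file leaves the vendor unit untouched, so the setting is easier to audit and to remove later.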

Security modules such as SELinux can interfere with the log retrieval process and network communication required by the Elastic Stack. To prevent conflicts, SELinux is often disabled in testing and development environments.

```bash
vi /etc/sysconfig/selinux   # set SELINUX=disabled
setenforce 0                # switch to permissive mode until the reboot
reboot
getenforce
Disabled
```

Configuring Repositories and Installing Elasticsearch

Elasticsearch serves as the central repository for all collected data. It provides a distributed, RESTful search and analytics engine. Installation methods vary depending on whether the administrator prefers direct binary downloads or repository management.

For systems using YUM (CentOS, RHEL), adding the official Elastic repository ensures access to the latest packages and simplifies future updates. The first step is importing the GPG key to verify the integrity of the packages.

```bash
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
```

A repository configuration file, for example /etc/yum.repos.d/elasticsearch.repo, is then created or edited to define the source of the packages. For Elasticsearch 8.x, the configuration looks like this.

```
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md
```

With the repository defined, the package can be installed. Note that the repository is disabled by default in the config, so it must be explicitly enabled during the install command.

```bash
yum install --enablerepo=elasticsearch elasticsearch
```

Alternatively, for specific version control or environments without repository access, the RPM can be downloaded manually. This method is useful for deploying specific legacy versions like 7.6.1.

```bash
mkdir ./ELK
cd ELK
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.6.1-x86_64.rpm
rpm -ivh elasticsearch-7.6.1-x86_64.rpm
```

After installation, the package presence can be verified.

```bash
rpm -qa | grep elasticsearch
elasticsearch-7.6.1-1.x86_64
```

Configuration of Elasticsearch is handled through a YAML file. Key settings include the network host and HTTP port. Binding to localhost is a common security practice in initial setups, restricting access to the local machine.

```bash
vi /etc/elasticsearch/elasticsearch.yml
```

Relevant configuration lines include the following.

```
bootstrap.memory_lock: true
network.host: localhost
http.port: 9200
```

Once configured, the service must be started and enabled to persist across reboots.
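Starting the service and confirming the memory lock could look like the sketch below. It assumes a systemd-based host and a node reachable over plain HTTP on localhost:9200, as configured above; on 8.x packages security is enabled by default, so the check would need HTTPS and credentials instead. The function names and the `ES_HOST`/`ES_PORT` variables are hypothetical.

```shell
# Base URL of the local node, matching network.host and http.port above.
es_url() {
  printf 'http://%s:%s' "${ES_HOST:-localhost}" "${ES_PORT:-9200}"
}

# Start the service now and enable it at boot (run as root).
start_elasticsearch() {
  systemctl daemon-reload
  systemctl enable --now elasticsearch.service
}

# Ask the node whether bootstrap.memory_lock took effect.
# A node with locked memory reports "mlockall" : true.
check_memory_lock() {
  curl -s "$(es_url)/_nodes?filter_path=**.mlockall&pretty"
}

# Usage (as root):
#   start_elasticsearch
#   check_memory_lock
```

If the response reports `false`, the memory lock limit was most likely not applied to the service and the earlier systemd settings are worth rechecking.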

Integrating Logstash and Kibana

Logstash acts as the data processing pipeline, ingesting data from various sources, transforming it, and sending it to Elasticsearch. Kibana provides the web interface for visualizing the data stored in Elasticsearch. A critical rule in Elastic Stack deployment is version consistency; all components must run the same major and minor version to ensure compatibility.

On Ubuntu 22.04, the installation process for the Elastic Stack involves adding the Elastic GPG key and repository similar to the YUM method, but using APT. After installing Elasticsearch, Logstash, and Kibana, configuration is required. Kibana is typically configured to listen only on localhost for security reasons. To make it accessible over a network, a reverse proxy such as Nginx is deployed.
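The APT equivalent of the YUM steps might be sketched as follows. The keyring path and repository line follow the pattern in Elastic's published APT instructions for 8.x; the function names are hypothetical, and the major version is an assumption to adjust for the target release.

```shell
# Location where the dearmored Elastic signing key is stored.
KEYRING=/usr/share/keyrings/elasticsearch-keyring.gpg

# Print the APT source line for a given major version (default: 8).
elastic_apt_source() {
  local major="${1:-8}"
  printf 'deb [signed-by=%s] https://artifacts.elastic.co/packages/%s.x/apt stable main\n' \
    "$KEYRING" "$major"
}

# Import the key, register the repository, and install the three server components (run as root).
install_elastic_stack_apt() {
  wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | gpg --dearmor -o "$KEYRING"
  apt-get install -y apt-transport-https
  elastic_apt_source 8 > /etc/apt/sources.list.d/elastic-8.x.list
  apt-get update
  apt-get install -y elasticsearch logstash kibana
}
```

Installing all three packages from the same repository keeps their versions aligned, which matches the version-consistency rule described above.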

Nginx intercepts incoming web requests and forwards them to the Kibana service running on localhost. This setup allows administrators to access the Kibana dashboard from remote browsers without exposing the Kibana port directly to the public network.
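A reverse-proxy configuration of this kind could be generated as in the sketch below. It assumes Kibana listens on its default port 5601 on localhost; the server name `kibana.example.com`, the function name, and the target file in the usage comment are placeholders, and a production setup would normally add TLS and authentication.

```shell
# Print an Nginx server block that forwards requests to Kibana on localhost:5601.
kibana_nginx_conf() {
  local server_name="${1:-kibana.example.com}"
  cat <<EOF
server {
    listen 80;
    server_name ${server_name};

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
    }
}
EOF
}

# Usage (as root): write the config, validate it, then reload Nginx.
#   kibana_nginx_conf logs.example.com > /etc/nginx/conf.d/kibana.conf
#   nginx -t && systemctl reload nginx
```

Running `nginx -t` before the reload catches syntax errors without interrupting the running proxy.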

Filebeat, the lightweight shipper, is installed on client machines to forward logs to the central Elastic Stack server. It monitors log files and sends new entries to Logstash or directly to Elasticsearch. In the provided reference architecture, Filebeat sends data to port 5044 on the ELK host.

```
tcp6       0      0 :::5044                 :::*                    LISTEN      17021/java
tcp6       0      0 <ELK HOST>:5044         <Client>:35030          ESTABLISHED 17021/java
```

The LISTEN entry shows the Logstash Beats input accepting connections on port 5044, and the ESTABLISHED entry shows that Filebeat on the client holds an open connection to it.
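The configuration behind such a connection might resemble the sketch below: a Logstash pipeline with a Beats input on port 5044, and the matching Filebeat output on the client. The function names, the host `elk.example.com`, and the file paths in the usage comments are placeholders; the Elasticsearch output assumes a local node on plain HTTP.

```shell
# Logstash pipeline: accept Beats connections on 5044 and index into the local node.
logstash_beats_pipeline() {
  cat <<'EOF'
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}
EOF
}

# Filebeat output section for a client; the argument is the ELK server's host name.
filebeat_logstash_output() {
  local elk_host="${1:-elk.example.com}"
  printf 'output.logstash:\n  hosts: ["%s:5044"]\n' "$elk_host"
}

# Usage:
#   On the ELK host:  logstash_beats_pipeline > /etc/logstash/conf.d/beats.conf
#   On each client:   merge the output of filebeat_logstash_output into
#                     /etc/filebeat/filebeat.yml and disable output.elasticsearch,
#                     since Filebeat allows only one active output.
```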

Visualizing Logs in Kibana

Once the stack is operational and data is flowing, the final step is visualization. Accessing Kibana via a web browser opens the management interface. For new instances, users are often prompted to try sample data or configure index patterns.

An index pattern (called a data view in Kibana 8.x) defines which Elasticsearch indices are available for searching and visualization. It is created from the management section by entering a pattern that matches the index names and selecting the time field. This step is crucial because Kibana needs to know which field contains the timestamp in order to provide time-based analytics.
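Before defining the pattern, it can help to confirm that matching indices actually exist in Elasticsearch. The sketch below assumes Filebeat-style index names (`filebeat-*`) and a node on plain HTTP at localhost:9200; the function names are hypothetical.

```shell
# Build the _cat/indices URL for an index pattern (default: filebeat-*).
indices_url() {
  printf 'http://localhost:9200/_cat/indices/%s?v' "${1:-filebeat-*}"
}

# List the indices that the pattern would match, with column headers.
list_indices() {
  curl -s "$(indices_url "$@")"
}

# Usage:
#   list_indices              # indices matching filebeat-*
#   list_indices 'logstash-*' # indices matching another pattern
```

An empty result usually means no data has reached Elasticsearch yet, in which case Kibana has nothing to match the pattern against.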

After the index pattern is established, users navigate to the Discover tab. This interface presents the raw log data in a searchable, paginated view. Advanced filtering capabilities allow administrators to narrow down results based on specific keywords, log levels, or source IPs. This capability transforms vast amounts of unstructured log data into actionable insights, enabling rapid identification of errors or anomalies across the entire infrastructure.

Conclusion

The deployment of the Elastic Stack on Linux servers represents a significant leap in operational efficiency for system administrators. By moving from manual, siloed log inspection to a centralized, automated architecture, organizations can gain comprehensive visibility into their infrastructure. The integration of Elasticsearch for storage, Logstash for processing, and Kibana for visualization, coupled with Filebeat for data collection, creates a robust ecosystem for log management. Proper preparation of the Linux environment, including Java installation, memory locking, and security configuration, is essential for optimal performance. As infrastructure scales, this stack provides the necessary tools to correlate events across multiple servers, troubleshoot complex issues, and maintain system reliability. The flexibility of the Elastic Stack, supporting various Linux distributions and installation methods, ensures it remains a versatile solution for both development and production environments.