Comprehensive Architecture and Deployment Guide for Jaeger via Docker

Distributed tracing is a critical requirement for modern microservices architectures, where a single user request may traverse dozens of independent services. Jaeger provides the observability needed to track these requests. Deploying Jaeger through Docker allows for rapid prototyping, scaling, and consistent environments across development and production. This guide covers Jaeger's Docker-based deployment, from the all-in-one convenience image to the specialized components of a production-grade distributed system, and the transition toward the OpenTelemetry-based V2 architecture.

The Jaeger All-In-One Deployment Model

The all-in-one image is the primary entry point for developers and system administrators who need to evaluate distributed tracing without the complexity of managing multiple containers. This image is a consolidated binary that encompasses the three primary backend components: the Jaeger UI, the jaeger-collector, and the jaeger-query service.

To provide immediate utility, the all-in-one image includes an in-memory storage component. Data is not persisted to disk; if the container is restarted, all captured traces are lost. This design is intended for local development, CI/CD testing, and demonstration purposes, and removes the need to configure an external database like Elasticsearch or Cassandra.

The most efficient method to launch this environment is via a single Docker command. The following command initializes the Jaeger all-in-one instance with specific port mappings and environmental configurations:

docker run --rm --name jaeger \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 14250:14250 \
  -p 14268:14268 \
  -p 14269:14269 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.76.0

The technical breakdown of this command reveals several critical operational layers:

  • The --rm flag ensures that the container is automatically removed when it stops, preventing the accumulation of stopped containers on the Docker host.
  • The -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 environment variable configures the collector to accept Zipkin-compatible traces on port 9411.
  • The -p flags map the internal container ports to the host machine, allowing external microservices and the browser to communicate with the Jaeger backend.

For users who prefer not to use Docker, the all-in-one functionality is also available as a binary distribution archive. The executable can be launched using the following command:

jaeger-all-in-one --collector.zipkin.host-port=:9411

Once the system is operational, the administrative interface is accessible through the Jaeger UI, which serves as the visualization layer for distributed traces. Users can navigate to http://localhost:16686 to access this frontend.
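Because the container may need a moment before it starts serving requests, scripts that launch it often poll the UI before continuing. The following is a minimal sketch of such a readiness check. The helper name wait_until, the attempt count, and the delay are illustrative assumptions, not part of Jaeger or Docker.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (assumes the all-in-one container is running with port 16686 mapped):
#   wait_until 30 1 curl -fsS http://localhost:16686/ && echo "Jaeger UI is up"
```

Keeping the probe command as an argument means the same helper can check any of the mapped ports, not only the UI.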

Detailed Network Port Analysis for Jaeger Containers

Understanding the networking layer is paramount for the successful deployment of Jaeger. Each port exposed by the container serves a specific function within the distributed tracing pipeline. Failure to map these ports correctly will result in spans failing to reach the collector or the UI failing to retrieve data from the query service.

The following table provides a comprehensive mapping of the ports utilized by the Jaeger container:

Port   Protocol  Component  Function
16686  HTTP      query      Serve the frontend (Jaeger UI)
4317   gRPC      collector  Accept OpenTelemetry Protocol (OTLP) over gRPC
4318   HTTP      collector  Accept OpenTelemetry Protocol (OTLP) over HTTP
14268  HTTP      collector  Accept jaeger.thrift directly from clients
14250  gRPC      collector  Accept model.proto
9411   HTTP      collector  Zipkin-compatible endpoint (optional)

The functional impact of these ports is significant:

  • Port 16686 is the primary gateway for the end-user. It hosts the web-based UI that allows developers to search for traces, analyze latency, and visualize the call graph of their services.
  • Ports 4317 and 4318 receive OpenTelemetry Protocol (OTLP) data over gRPC and HTTP respectively. By supporting OTLP, Jaeger allows services to send traces using a vendor-neutral standard, facilitating easier migrations between different observability tools.
  • Port 14268 allows legacy Jaeger clients to send spans using the Thrift protocol, ensuring backward compatibility for older instrumented applications.
  • Port 14250 accepts spans in Jaeger's model.proto format over gRPC. It serves internal communication between Jaeger components, such as an agent forwarding spans to the collector.
  • Port 9411 provides an interoperability layer. Since Zipkin is another popular tracing tool, this port allows Zipkin-instrumented applications to send their data to Jaeger without changing their client configuration.
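To check that the OTLP/HTTP port is mapped correctly, a hand-built span can be posted to it. The sketch below only builds the JSON payload; the function name otlp_span_json, the service name, and the span name are hypothetical, and the /v1/traces path in the usage comment is the standard OTLP/HTTP traces path, assumed to be served on port 4318 as in the table above.

```shell
# Print a minimal OTLP/JSON payload containing one span.
# $1 = service name, $2 = span name (both illustrative values).
otlp_span_json() {
  # Random identifiers: 16 bytes (32 hex chars) and 8 bytes (16 hex chars).
  trace_id=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
  span_id=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')
  # The span starts now and lasts one second; times are in nanoseconds.
  start_s=$(date +%s)
  end_s=$((start_s + 1))
  printf '{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"%s"}}]},"scopeSpans":[{"scope":{"name":"manual-test"},"spans":[{"traceId":"%s","spanId":"%s","name":"%s","kind":1,"startTimeUnixNano":"%s000000000","endTimeUnixNano":"%s000000000"}]}]}]}\n' \
    "$1" "$trace_id" "$span_id" "$2" "$start_s" "$end_s"
}

# Example (assumes a Jaeger container with port 4318 mapped to localhost):
#   otlp_span_json demo-service manual-span |
#     curl -fsS -X POST -H 'Content-Type: application/json' \
#          --data-binary @- http://localhost:4318/v1/traces
```

If the post succeeds, the service name should appear in the UI's service list shortly afterwards.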

Specialized Jaeger Component Analysis

While the all-in-one image is ideal for starters, production environments require a decoupled architecture where each component can be scaled independently. Jaeger provides several specialized images to facilitate this.

Jaeger Query

The jaeger-query component is the read-side of the Jaeger architecture. Its primary responsibility is to serve the Jaeger UI and provide an API that retrieves traces from the configured storage backend. By separating the query service from the collector, an organization can scale the UI and API access independently of the data ingestion rate.

The latest available version for this component is 1.76. Users can interact with this component using Docker or Podman. To view the help options for this specific image, the following command is used:

docker run cr.jaegertracing.io/jaegertracing/jaeger-query:1.76 --help

or

podman run cr.jaegertracing.io/jaegertracing/jaeger-query:1.76 --help
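As a rough illustration of the read-side API, the sketch below asks the query service for the list of known services. It assumes the /api/services endpoint that the UI itself calls; that endpoint is treated here as internal and unversioned, so it may change between releases. The function name jaeger_services is hypothetical.

```shell
# Print the JSON list of service names known to jaeger-query.
# $1 = base URL of the query service (defaults to the local UI port).
jaeger_services() {
  base_url=${1:-http://localhost:16686}
  curl -fsS "${base_url}/api/services"
}

# Example (assumes the query service is reachable on localhost:16686):
#   jaeger_services
#   jaeger_services http://jaeger-query.internal:16686
```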

Jaeger Collector

The jaeger-collector is the ingestion engine. It receives spans from agents or directly from clients and saves them into persistent storage. In a distributed setup, multiple collectors can be deployed behind a load balancer to handle high volumes of trace data.

The jaeger-collector (v1) is now marked as deprecated, with the latest available version being 1.76. This deprecation is part of the strategic shift toward the OpenTelemetry Collector.

Jaeger Ingester

The jaeger-ingester serves as an alternative to the collector. Rather than receiving spans directly from clients, the ingester reads spans from a Kafka topic and saves them to the storage backend. This is a critical pattern for high-throughput environments where Kafka acts as a buffer to prevent the storage backend from being overwhelmed during traffic spikes.

The latest available version of the jaeger-ingester is 1.76. Users can verify its configuration using:

docker run cr.jaegertracing.io/jaegertracing/jaeger-ingester:1.76 --help

or

podman run cr.jaegertracing.io/jaegertracing/jaeger-ingester:1.76 --help

Jaeger Agent

The jaeger-agent was originally designed to run as a sidecar process or a host agent. Its role was to receive spans from Jaeger clients and forward them to the collector, reducing the network overhead on the application.

However, the jaeger-agent (v1) is officially deprecated and is no longer recommended for use. The latest available version was 1.62. The industry shift toward the OpenTelemetry Collector has rendered the specialized agent redundant, because instrumented applications can export OTLP directly to a collector.
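Without an agent, an application instrumented with an OpenTelemetry SDK is typically pointed at the collector through the standard OTLP exporter environment variables. The sketch below assumes a Jaeger backend listening for OTLP over HTTP on localhost:4318; the service name is a placeholder.

```shell
# Standard OpenTelemetry SDK settings; the values are illustrative.
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318   # OTLP/HTTP port
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_SERVICE_NAME=my-service                        # placeholder name

# Start the instrumented application in this same shell so that it
# inherits the variables above.
```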

Jaeger Operator

For organizations utilizing Kubernetes, the jaeger-operator provided a way to package, deploy, and manage Jaeger installations using a declarative approach. This reduced the manual effort required to configure complex distributed tracing setups.

The jaeger-operator is now deprecated, with the last available version being 1.65. Users are encouraged to look toward the official Kubernetes Operator repository at https://github.com/jaegertracing/jaeger-operator for the most current deployment patterns.

Storage and Utility Components

Jaeger's ecosystem includes several utilities to manage the underlying data and storage schemas.

Jaeger Remote Storage

The jaeger-remote-storage image allows for the sharing of a single-node storage backend, such as memory, across multiple Jaeger processes. This is particularly useful in hybrid configurations where some components are containerized and others are not. The current version is 2.17.0.

docker run cr.jaegertracing.io/jaegertracing/jaeger-remote-storage:2.17.0 --help

Jaeger Cassandra Schema

When using Apache Cassandra as a persistent storage backend, the jaeger-cassandra-schema utility is required. This script initializes the Cassandra keyspace and the necessary schema to store traces efficiently. The latest version is 2.17.0.

docker run cr.jaegertracing.io/jaegertracing/jaeger-cassandra-schema:2.17.0 --help

Jaeger ES Index Cleaner

Elasticsearch does not natively support data Time-To-Live (TTL). To prevent the storage from growing indefinitely, Jaeger provides the jaeger-es-index-cleaner. This utility script purges old indices from Elasticsearch based on a retention policy.

docker run cr.jaegertracing.io/jaegertracing/jaeger-es-index-cleaner:2.17.0 --help
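The idea of a date-based retention policy can be sketched in a few lines. This is not the cleaner's actual implementation. It assumes daily indices whose names end in a YYYY-MM-DD date (for example jaeger-span-2024-01-01), and the function name index_expired is hypothetical.

```shell
# Succeeds (exit status 0) when a daily index is older than the cutoff.
# $1 = index name ending in YYYY-MM-DD, $2 = cutoff date as YYYY-MM-DD.
index_expired() {
  # Take the trailing date, drop the dashes, and compare as integers.
  idx_num=$(printf '%s' "$1" | tail -c 10 | tr -d '-')
  cutoff_num=$(printf '%s' "$2" | tr -d '-')
  [ "$idx_num" -lt "$cutoff_num" ]
}

# Example: with a cutoff of 2024-01-15, the first index would be purged
# and the second kept.
#   index_expired jaeger-span-2024-01-01 2024-01-15 && echo "purge"
#   index_expired jaeger-span-2024-01-20 2024-01-15 || echo "keep"
```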

Jaeger Spark Dependencies

The spark-dependencies component (v2) is an Apache Spark job. It is designed to collect Jaeger spans from storage, analyze the links between services, and store the resulting analysis for presentation in the Jaeger UI as a service dependency view.

Transition to Jaeger V2 and OpenTelemetry

The most significant evolution in the project is the introduction of Jaeger V2. This version is built directly on the OpenTelemetry Collector, merging the capabilities of Jaeger's tracing with the industry-standard OpenTelemetry framework.

This shift represents a move away from Jaeger-specific components (like the agent and collector) toward a unified collector architecture. Jaeger V2 allows users to leverage the extensive plugin ecosystem of OpenTelemetry.

According to Docker Hub, the V2 images are available under the jaegertracing/jaeger repository. The versioning has progressed significantly, with versions such as 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, and 2.17.0 available.

The images are optimized for multiple architectures, including:

  • linux/amd64
  • linux/arm64
  • linux/ppc64le

The image sizes for V2 are relatively compact, typically ranging between 46 MB and 53 MB, making them highly efficient for containerized deployment.
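A minimal sketch of starting a V2 container follows. It assumes the jaegertracing/jaeger image serves the UI and the OTLP endpoints on the same ports listed earlier in this guide; the wrapper name run_jaeger_v2 is hypothetical, and the default tag is simply the newest version mentioned above.

```shell
# Start a single Jaeger V2 container with the UI and OTLP ports mapped.
# $1 = image tag (defaults to 2.17.0).
run_jaeger_v2() {
  version=${1:-2.17.0}
  docker run --rm --name jaeger \
    -p 16686:16686 \
    -p 4317:4317 \
    -p 4318:4318 \
    "jaegertracing/jaeger:${version}"
}

# Example:
#   run_jaeger_v2          # uses the default tag
#   run_jaeger_v2 2.16.0   # pins a different release
```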

Deployment with Docker Compose and HotROD

To illustrate the power of Jaeger in a real-world microservices environment, the HotROD (Rides on Demand) demo application is provided. HotROD consists of several microservices that use OpenTelemetry for distributed tracing, providing a tangible example of how traces flow through a system.

The recommended way to deploy this is via Docker Compose. Users can download the necessary configuration file using curl:

curl -O https://raw.githubusercontent.com/jaegertracing/jaeger/refs/heads/main/examples/hotrod/docker-compose.yml

A critical operational detail when using Docker Compose is the management of image tags. If users rely on the latest tag, Docker Compose will pull the image once and keep it in the local image cache. Subsequent runs will use the cached image even if a newer version is available on Docker Hub. This can lead to "stale" versions and potential incompatibilities between the Jaeger backend and the HotROD application.

To prevent this, users should explicitly define the versions using environment variables. For example:

JAEGER_VERSION=2.0.0 HOTROD_VERSION=1.63.0 docker compose -f docker-compose.yml up

To shut down the environment, the following command is used:

docker compose -f docker-compose.yml down
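Instead of passing the versions on every invocation, they can be recorded once in a .env file, which Docker Compose reads for variable substitution. The sketch below assumes the downloaded docker-compose.yml consumes JAEGER_VERSION and HOTROD_VERSION as in the example above; the helper name pin_versions is hypothetical.

```shell
# Write a Compose .env file that pins both image versions.
# $1 = Jaeger version, $2 = HotROD version, $3 = target file (default .env).
pin_versions() {
  env_file=${3:-.env}
  printf 'JAEGER_VERSION=%s\nHOTROD_VERSION=%s\n' "$1" "$2" > "$env_file"
}

# Example (run in the directory that holds docker-compose.yml):
#   pin_versions 2.0.0 1.63.0
#   docker compose -f docker-compose.yml up
```

Committing the .env file alongside the Compose file keeps every environment on the same pair of versions.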

Troubleshooting Common Docker Deployment Issues

Deployment of Jaeger in Docker, especially on non-standard environments, can lead to connectivity and networking hurdles.

A common point of failure occurs when users deploy Jaeger on Windows using legacy tools like Docker Toolbox. In such environments, the Docker host is often a virtual machine with its own internal IP address. When a user runs a command like:

docker run -d --name jaeger -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 -p 5775:5775/udp -p 6831:6831/udp -p 6832:6832/udp -p 5778:5778 -p 16686:16686 -p 14268:14268 -p 9411:9411 jaegertracing/all-in-one:latest

they may find that although the container is reported as "Up", the ports are not actually listening on the host machine's localhost. This is a network mapping issue caused by the virtualization layer of Docker Toolbox, not a fault in the Jaeger binary itself. In these cases, the user must use the IP address of the Docker VM rather than localhost to access the UI at port 16686. Note that this command also predates the current syntax: it uses the older COLLECTOR_ZIPKIN_HTTP_PORT variable rather than the COLLECTOR_ZIPKIN_HOST_PORT variable shown earlier.
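With Docker Toolbox, the VM address is normally obtained from docker-machine. The sketch below builds the UI address from a given host; the function name jaeger_ui_url is hypothetical, and the usage comment assumes the Toolbox VM carries the name "default".

```shell
# Print the Jaeger UI URL for a given Docker host.
# $1 = host name or IP address (defaults to localhost).
jaeger_ui_url() {
  host=${1:-localhost}
  printf 'http://%s:16686\n' "$host"
}

# Docker Toolbox example (assumes the VM is named "default"):
#   jaeger_ui_url "$(docker-machine ip default)"
```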

Comparative Analysis of Component Versions

The following table outlines the versioning and status of key Jaeger components as of the current documentation:

Component                Latest Version  Status      Primary Purpose
jaeger-all-in-one        1.76            Active      Dev/demo combined backend
jaeger-query             1.76            Deprecated  Trace retrieval and UI
jaeger-collector         1.76            Deprecated  Span ingestion and storage
jaeger-ingester          1.76            Active      Kafka-based span processing
jaeger-agent             1.62            Deprecated  Sidecar span forwarding
jaeger-operator          1.65            Deprecated  Kubernetes management
jaeger-remote-storage    2.17.0          Active      Storage sharing
jaeger-cassandra-schema  2.17.0          Active      Cassandra initialization
jaeger-es-index-cleaner  2.17.0          Active      Elasticsearch cleanup
Jaeger V2                2.17.0          Active      OTel-based architecture

Conclusion: The Evolution of Jaeger Deployment

The deployment of Jaeger via Docker has evolved from a monolithic "all-in-one" approach to a highly decoupled, specialized architecture, and finally toward a unified OpenTelemetry-based model. The all-in-one image remains the gold standard for rapid prototyping due to its in-memory storage and single-command startup. However, for production, the move toward V2 is an architectural necessity.

By integrating with the OpenTelemetry Collector, Jaeger V2 removes the need for deprecated components like the jaeger-agent and jaeger-collector (v1), replacing them with a more flexible and standardized pipeline. This transition allows organizations to avoid vendor lock-in and simplify their observability stack. Whether deploying a simple local instance for testing or a massive distributed cluster using Kafka, Cassandra, and Elasticsearch, the Docker ecosystem provides the necessary tools to ensure Jaeger is deployed consistently and scaled efficiently across any infrastructure.

Sources

  1. Jaeger Getting Started
  2. Jaeger Download
  3. Jaeger Google Groups
  4. Docker Hub - Jaeger Tags
  5. Docker Hub - Jaeger
  6. Docker Hub - Jaeger Organization
