Thanos Kubernetes Distributed Monitoring Architecture

Scalable monitoring in cloud-native environments calls for something other than a single, monolithic Prometheus server. Thanos extends Prometheus from a local time-series database into a globally queryable, highly available system with long-term storage. Integrated with existing Prometheus servers, it provides a global query view, high availability, and cost-effective access to historical data. All components ship in a single binary and are selected by subcommand, so they can be deployed independently. This modularity lets an organization adopt a subset of Thanos features for immediate testing, or roll them out incrementally in a complex production environment.

Thanos runs in diverse environments, from Kubernetes clusters to traditional bare-metal installations. It is not tied to Kubernetes, but because Kubernetes, Thanos, and Prometheus are all Cloud Native Computing Foundation (CNCF) projects, the most popular deployment patterns are centered on Kubernetes. Thanos is written in Go and runs on any x64 operating system.

Distributed Architecture and Component Port Layout

Thanos is fundamentally designed as a distributed system. For production environments, running Thanos on a single node is strongly discouraged due to the inherent risks of single points of failure and the inability to scale horizontally. For small-scale setups, a vanilla Prometheus installation may be sufficient. However, for those utilizing a single node for development, testing, or experimentation, a specific port layout is recommended to avoid interface conflicts and ensure component communication.

The communication between Thanos components relies heavily on gRPC for high-performance data transfer and HTTP for management and querying.

Component        Interface                 Port
Sidecar          gRPC                      10901
Sidecar          HTTP                      10902
Query            gRPC                      10903
Query            HTTP                      10904
Store            gRPC                      10905
Store            HTTP                      10906
Receive          gRPC (store API)          10907
Receive          HTTP (remote write API)   10908
Receive          HTTP                      10909
Rule             gRPC                      10910
Rule             HTTP                      10911
Compact          HTTP                      10912
Query Frontend   HTTP                      10913
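
As a small illustration of how the layout above might be kept consistent on a single node, the following Python sketch stores the table as data, checks that no port is claimed twice, and builds listen addresses from it. The helper names and the 0.0.0.0 bind host are assumptions made for this example; they are not part of Thanos.

```python
# Single-node port layout from the table above, keyed by (component, interface).
PORT_LAYOUT = {
    ("sidecar", "grpc"): 10901,
    ("sidecar", "http"): 10902,
    ("query", "grpc"): 10903,
    ("query", "http"): 10904,
    ("store", "grpc"): 10905,
    ("store", "http"): 10906,
    ("receive", "grpc"): 10907,               # store API
    ("receive", "http-remote-write"): 10908,  # remote write API
    ("receive", "http"): 10909,
    ("rule", "grpc"): 10910,
    ("rule", "http"): 10911,
    ("compact", "http"): 10912,
    ("query-frontend", "http"): 10913,
}


def find_port_conflicts(layout):
    """Return {port: [owners]} for every port claimed by more than one interface."""
    owners = {}
    for key, port in layout.items():
        owners.setdefault(port, []).append(key)
    return {port: keys for port, keys in owners.items() if len(keys) > 1}


def listen_address(component, interface, host="0.0.0.0"):
    """Build a host:port listen address for one component interface."""
    return f"{host}:{PORT_LAYOUT[(component, interface)]}"
```

An empty result from find_port_conflicts(PORT_LAYOUT) means every interface has its own port, which is the property the recommended layout is meant to guarantee.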

Deployment and Installation Methodology

Thanos aims for a simple deployment and maintenance model. Because it is written in Go, it can be run on any x64 operating system.

Software Acquisition and Toolchain

Depending on the version of Go being used, the method for acquiring the Thanos binary varies.

  • For older environments utilizing Go 1.12+, users can build Thanos from source by ensuring the toolchain is installed and the environment variables are configured as follows:
    GOPATH must be set, and PATH=${GOPATH}/bin:${PATH}.
    The installation is then performed via:
    go get github.com/thanos-io/thanos/cmd/thanos
    This places the thanos binary in ${GOPATH}/bin, which the PATH setting above makes available on the command line.

  • For modern environments using Go 1.17 and later, the go get method for installing executables is deprecated. In these cases, go install is the appropriate replacement. To avoid ambiguity when using a version suffix with go install, all arguments must refer to main packages within the same module at the same version.

Kubernetes Integration and Object Storage

In a Kubernetes environment, the integration of Thanos often involves the use of cloud-native object storage for long-term persistence. This allows the system to offload Time Series Database (TSDB) data from expensive SSDs to cheaper, scalable storage. For demonstration or local development purposes, MinIO can be used as a replacement for S3.

The process for establishing a MinIO-based storage backend on Kubernetes involves several steps:

  1. Adding the Bitnami repository:
    helm repo add bitnami https://charts.bitnami.com/bitnami

  2. Installing the MinIO instance into the monitoring namespace used by the later steps:
    helm install minio bitnami/minio --namespace monitoring --set persistence.enabled=false

  3. Retrieving the necessary security credentials:
    export ROOT_USER=$(kubectl get secret --namespace monitoring minio -o jsonpath="{.data.root-user}" | base64 -d)
    export ROOT_PASSWORD=$(kubectl get secret --namespace monitoring minio -o jsonpath="{.data.root-password}" | base64 -d)

  4. Creating a dedicated Thanos bucket:
    kubectl run --namespace monitoring minio-client --rm --tty -i --restart='Never' --env MINIO_SERVER_ROOT_USER=$ROOT_USER --env MINIO_SERVER_ROOT_PASSWORD=$ROOT_PASSWORD --env MINIO_SERVER_HOST=minio --image docker.io/bitnami/minio-client -- mb -p minio/thanos
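
Once the bucket exists, Thanos components need an object-storage configuration that points at it. The Python sketch below renders such a configuration for the MinIO instance created above. The endpoint minio:9000 (the release's service name plus MinIO's default API port) and the insecure setting (plain HTTP for a demo instance) are assumptions, and the credentials shown are placeholders.

```python
import json


def render_objstore_config(bucket, endpoint, access_key, secret_key, insecure=True):
    """Render a Thanos S3 object-storage config as YAML text.

    Values are JSON-quoted, which is also valid YAML, so special
    characters in credentials cannot break the document.
    """
    config = {
        "bucket": bucket,
        "endpoint": endpoint,      # assumption: in-cluster MinIO service and default port
        "access_key": access_key,
        "secret_key": secret_key,
        "insecure": insecure,      # assumption: demo MinIO served over plain HTTP
    }
    lines = ["type: S3", "config:"]
    lines += [f"  {key}: {json.dumps(value)}" for key, value in config.items()]
    return "\n".join(lines) + "\n"


# Placeholder credentials; in practice use $ROOT_USER and $ROOT_PASSWORD from step 3.
print(render_objstore_config("thanos", "minio:9000", "example-user", "example-password"))
```

The rendered text would typically be stored in a Kubernetes Secret and handed to the Sidecar, Store Gateway, and Compactor, since all three read from or write to the same bucket.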

Advanced Configuration and Component Scaling

The flexibility of Thanos allows for granular control over how components are deployed and scaled within a cluster. This is typically achieved through Helm charts that manage Kubernetes manifests.

High Availability and Anti-Affinity

To ensure the monitoring system remains operational during node failures, Thanos components are often deployed with multiple replicas and strict scheduling constraints.

  • The replicaCount setting allows for horizontal scaling. For example, configuring replicaCount: 3 for the Query component ensures that no single pod failure disrupts the global query view.
  • The podAntiAffinityPreset: hard setting ensures that replicas of the same component are never scheduled on the same Kubernetes node, so a single node crash cannot take down all replicas.
  • PodDisruptionBudgets (PDB) can be enabled via pdb: create: true to ensure a minimum number of available pods during voluntary disruptions.
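
The effect of these three settings can be reasoned about with a toy model. The Python sketch below is not the Kubernetes scheduler; it is a deliberately simplified illustration in which hard anti-affinity means "at most one replica per node" and a PodDisruptionBudget is reduced to a minimum-available count. All function names are hypothetical.

```python
def schedule_with_hard_anti_affinity(replica_count, nodes):
    """Place at most one replica per node; replicas beyond the node count stay pending."""
    placement = {f"replica-{i}": node for i, node in zip(range(replica_count), nodes)}
    pending = max(0, replica_count - len(nodes))
    return placement, pending


def survivors_after_node_failure(placement, failed_node):
    """Count replicas still running after one node is lost."""
    return sum(1 for node in placement.values() if node != failed_node)


def eviction_allowed(healthy, min_available):
    """PDB-style check: a voluntary eviction may proceed only if enough pods remain."""
    return healthy - 1 >= min_available


placement, pending = schedule_with_hard_anti_affinity(3, ["node-a", "node-b", "node-c"])
print(placement, "pending:", pending)
print("survivors:", survivors_after_node_failure(placement, "node-a"))
```

The model also shows the cost of the hard preset: with three replicas and only two nodes, one replica stays pending rather than sharing a node.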

Component-Specific Configurations

Different Thanos components require specific flags and configurations to optimize performance and data retention.

  • Query Component:
    The replicaLabel (e.g., prometheus_replica) names the label that distinguishes high-availability Prometheus replicas, so the Query component can deduplicate the series they report.
    The --query.promql-engine=thanos flag selects the Thanos PromQL engine in place of the default Prometheus engine.
    Service discovery for StoreAPI is managed via a ConfigMap.

  • Query Frontend:
    The Query Frontend's HTTP connection pool to its downstream querier can be tuned with flags such as the following, whose value is inline YAML:
    --query-frontend.downstream-tripper-config="max_idle_conns_per_host": 100
    It can also be integrated with Redis for caching to reduce the load on backend components:
    type: REDIS
    config:
      addr: 'redis:6379'

  • Compactor:
    The Compactor manages data retention and downsampling. Configuration allows for different retention periods based on resolution:
    retentionResolutionRaw: 90d
    retentionResolution5m: 180d
    retentionResolution1h: 2y

  • Store Gateway:
    The Store Gateway provides access to historical data stored in object storage. It can be configured with Redis caching to improve retrieval speeds:
    config:
      type: REDIS
      config:
        addr: 'redis:6379'
        cache_size: '1G'
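
To make the role of the replica label in the Query component more concrete, here is a minimal Python sketch of deduplication across two high-availability Prometheus replicas. It is a simplification under stated assumptions: series are plain (labels, samples) pairs, replicas are assumed to share exact timestamps, and the first value seen for a timestamp wins. Thanos's actual deduplication is more involved, because replica scrape timestamps rarely line up exactly.

```python
def deduplicate(series_list, replica_label="prometheus_replica"):
    """Merge series that differ only in the replica label.

    Each series is (labels: dict, samples: list of (timestamp, value)).
    """
    merged = {}
    for labels, samples in series_list:
        # Identity of the series once the replica label is ignored.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        by_ts = merged.setdefault(key, {})
        for ts, value in samples:
            # Keep the first value seen for a timestamp; later replicas fill gaps.
            by_ts.setdefault(ts, value)
    return [(dict(key), sorted(by_ts.items())) for key, by_ts in merged.items()]


replica_0 = ({"job": "api", "prometheus_replica": "0"}, [(10, 1.0), (20, 2.0)])
replica_1 = ({"job": "api", "prometheus_replica": "1"}, [(20, 2.0), (30, 3.0)])
print(deduplicate([replica_0, replica_1]))
```

In this example replica 0 missed the sample at timestamp 30 and replica 1 missed the one at 10; the merged series contains all three points under a single label set without the replica label.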

Theoretical Architecture for Maximum Resilience

A robust monitoring architecture must account for the possibility of a total cluster collapse. If the Prometheus server and the Thanos components reside within the same cluster they are monitoring, a catastrophic cluster failure will result in the loss of observability precisely when it is most needed.

The External Monitoring Pattern

To achieve a "survivable" monitoring state, a hybrid deployment pattern is employed where critical retrieval and management components are placed outside the main Kubernetes cluster.

  • Metrics Collection:
    Prometheus exporters expose metrics from various sources, and short-lived jobs push theirs to the Pushgateway. The Prometheus server scrapes both the exporters and the Pushgateway.

  • Storage and Sidecar Integration:
    Prometheus keeps recent data on local SSDs. A Thanos Sidecar runs alongside the Prometheus server; it exposes that local data to Thanos Query and uploads TSDB blocks to an external S3-compatible bucket for long-term storage.

  • Data Optimization:
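    The Thanos Compactor compacts the blocks in the S3 bucket, downsamples them, and can deduplicate data from replicas. Downsampling speeds up queries over long time ranges; storage costs are bounded by the retention settings rather than by downsampling, which adds lower-resolution blocks alongside the raw data.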

  • Retrieval and Visualization:
    The Thanos Store (Store Gateway) serves the archived data from S3. When a user requests data via Grafana, the request is routed to the Thanos Query component, which fetches recent data from the Sidecar and historical data from the Store.

  • Resilience Strategy:
    By placing Grafana, Thanos Query, Thanos Compactor, and Thanos Store outside the primary Kubernetes cluster, the administrator ensures that dashboards remain accessible and historical data can be queried even if the monitored cluster is completely offline.
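
The downsampling performed in the Data Optimization step can be illustrated with a short Python sketch. It groups raw samples into fixed windows (300 seconds here, matching a 5m resolution) and keeps a handful of aggregates per window. The function name and the dictionary layout are assumptions for this example and do not reflect the on-disk format Thanos uses.

```python
def downsample(samples, resolution_seconds=300):
    """Aggregate (timestamp_seconds, value) samples into fixed windows.

    Returns one dict per window with count, sum, min, and max, from which
    averages and extremes can be answered without the raw samples.
    """
    windows = {}
    for ts, value in samples:
        start = ts - ts % resolution_seconds  # align to the window boundary
        windows.setdefault(start, []).append(value)
    return [
        {"start": start, "count": len(v), "sum": sum(v), "min": min(v), "max": max(v)}
        for start, v in sorted(windows.items())
    ]


# Ten samples, one per minute, collapse into two 5-minute windows.
raw = [(i * 60, i) for i in range(10)]
for window in downsample(raw):
    print(window)
```

A dashboard spanning months then reads a few aggregate points per series instead of every raw sample, which is where the query speed-up comes from.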

Analysis of Distributed Monitoring Performance

The transition from a standalone Prometheus instance to a Thanos-integrated architecture represents a shift from local vertical scaling to distributed horizontal scaling. The primary impact for the operator is the removal of the storage bottleneck. In a standard Prometheus setup, the retention period is limited by the size of the local disk; however, with Thanos, the retention is limited only by the capacity of the object storage.

The implementation of the Sidecar creates a bridge between the short-term, high-performance local storage of Prometheus and the long-term, low-cost storage of the cloud. This allows for a tiered storage strategy where the most recent data is accessed with minimum latency via the Prometheus API, while historical data is retrieved via the Store Gateway.
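
The tiered lookup described above can be sketched as a simple routing decision. In the Python example below, a fixed cutoff separates data still held locally by Prometheus from data that exists only in object storage. The function name, the backend names, and the fixed cutoff are illustrative assumptions; Thanos Query makes this decision from the time ranges its StoreAPI endpoints report, not from a hard-coded retention value.

```python
def plan_sources(query_start, query_end, now, local_retention_seconds):
    """Decide which backends a query time range touches.

    Timestamps are in seconds. Data newer than the cutoff is assumed to be
    served by the Sidecar, older data by the Store Gateway.
    """
    cutoff = now - local_retention_seconds
    sources = []
    if query_start < cutoff:
        sources.append("store-gateway")  # historical data in object storage
    if query_end >= cutoff:
        sources.append("sidecar")        # recent data on Prometheus local disk
    return sources


now = 100_000
print(plan_sources(95_000, now, now, 7_200))  # recent range only
print(plan_sources(0, now, now, 7_200))       # spans both tiers
```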

From a performance perspective, the Query Frontend and Redis caching matter most for long-range queries. Without them, global queries across multiple Prometheus instances can suffer significant latency and timeouts. The Query Frontend splits long range queries into smaller ones and caches their results, which keeps the underlying Query and Store components from being overwhelmed.
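
The splitting-and-caching behavior of a query frontend can be sketched in a few lines of Python. This is a simplified model, not the Query Frontend's implementation: a plain dictionary stands in for Redis, the fetch callback stands in for the downstream querier, and all names are hypothetical.

```python
def split_by_interval(start, end, interval):
    """Split [start, end) into sub-ranges aligned to multiples of interval."""
    ranges = []
    cursor = start
    while cursor < end:
        boundary = (cursor // interval + 1) * interval
        ranges.append((cursor, min(boundary, end)))
        cursor = min(boundary, end)
    return ranges


def query_range(start, end, interval, cache, fetch):
    """Answer a range query from cached sub-ranges, fetching only the misses."""
    results = []
    for sub in split_by_interval(start, end, interval):
        if sub not in cache:
            cache[sub] = fetch(*sub)  # only uncached sub-ranges reach the backend
        results.extend(cache[sub])
    return results


backend_calls = []


def fake_fetch(sub_start, sub_end):
    """Stand-in backend that records each call and returns one marker value."""
    backend_calls.append((sub_start, sub_end))
    return [sub_start]


cache = {}
query_range(10, 250, 100, cache, fake_fetch)
query_range(10, 250, 100, cache, fake_fetch)  # served entirely from the cache
print(len(backend_calls))
```

Because the sub-ranges are aligned to interval boundaries, repeated or overlapping dashboard queries map onto the same cache keys, so the second call above does not reach the backend at all.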

The use of gRPC throughout Thanos is a deliberate design choice to minimize overhead. Large volumes of time-series data move between the Store, Query, and Sidecar components, and gRPC's binary Protocol Buffers encoding and streaming make those transfers cheaper than traditional REST/JSON interfaces would.

