Distributed Identity Management via Keycloak on Kubernetes Infrastructure

The orchestration of modern identity and access management (IAM) requires a level of resilience, scalability, and availability that traditional single-instance deployments cannot provide. As microservices architectures become the standard for enterprise applications, the role of a centralized Single Sign-On (SSO) provider becomes critical. Keycloak, an open-source identity and access management solution, provides the necessary framework for managing federated identities, social logins, and complex authorization flows. When deployed within a Kubernetes environment, Keycloak transforms from a simple authentication server into a highly available, cloud-native service capable of scaling alongside the application workloads it protects. This deployment strategy leverages the orchestration capabilities of Kubernetes to handle container lifecycle management, automated scaling, and self-healing, ensuring that authentication remains an uninterrupted service for end-users.

Architectural Framework for High Availability Keycloak Clusters

A robust Keycloak deployment on Kubernetes is not merely a single pod running a container; it is a multi-layered architecture designed to prevent single points of failure. The architecture involves several distinct components working in concert to maintain session integrity and service availability.

At the edge of the cluster, the User Browser initiates a Login Request, which is intercepted by an Ingress Controller. The Ingress Controller acts as the entry point, managing external traffic and routing it to the appropriate internal service. The traffic is then directed to a Keycloak Service, which functions as a stable internal load balancer. This service distributes incoming requests across multiple Keycloak Pods. In a highly available configuration, these pods operate in cluster mode with at least two replicas, so that the failure of a single pod or node does not result in a service outage.

The data persistence layer is handled by a dedicated database, such as PostgreSQL, which stores critical information including realm configurations, client details, and user credentials. To maintain a seamless user experience, Keycloak uses Infinispan for distributed session caching. If a user is authenticated by Pod 1, their session data is synchronized across the cluster, allowing Pod 2 to recognize the session if a subsequent request is routed there. This synchronization is what keeps the login experience uninterrupted: users are not forced to re-authenticate when a pod is rescheduled or fails.

The logical flow of data and authentication follows these paths:

  • User Browser to Ingress Controller via Login Request
  • Ingress Controller to Keycloak Service
  • Keycloak Service to Keycloak Pod 1 and Keycloak Pod 2
  • Keycloak Pods to PostgreSQL Database for persistent storage
  • Keycloak Pod 1 and Keycloak Pod 2 to Infinispan Cache for distributed session management
  • Application Service A to Keycloak Service via OIDC Token Validation
  • Application Service B to Keycloak Service via SAML Assertion
  • Application Service C to Keycloak Service via Social Login
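The Service hop in the flow above could be sketched as the following manifest. This is a minimal illustration, not a definitive configuration: the `app: keycloak` label and the port names are assumptions, while the name `keycloak-service`, the `keycloak` namespace, and port 80 match values used later in this article. Ports 8080 and 9000 are Keycloak's default HTTP and management ports in recent releases.

```yaml
# Minimal sketch of the internal Service that load-balances across Keycloak pods.
apiVersion: v1
kind: Service
metadata:
  name: keycloak-service
  namespace: keycloak
  labels:
    app: keycloak            # assumed label, reused by selectors
spec:
  type: ClusterIP
  selector:
    app: keycloak            # assumed pod label; must match the pod template
  ports:
    - name: http
      port: 80               # port the Ingress backend targets
      targetPort: 8080       # Keycloak's default HTTP port
    - name: management
      port: 9000
      targetPort: 9000       # default management port (health and metrics)
```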

Infrastructure Prerequisites and Environment Initialization

Before initiating the deployment, the underlying infrastructure must meet specific versioning and resource requirements to ensure stability and compatibility with modern Kubernetes features.

The primary prerequisite is a running Kubernetes cluster version 1.25 or later. This version ensures support for the necessary networking and API resources required by Keycloak and its associated ingress controllers. The user must have kubectl configured with appropriate administrative permissions to interact with the cluster API. Additionally, a PostgreSQL database must be available; this can be a managed cloud service or a PostgreSQL instance deployed within the cluster itself.

For local development and testing of the cluster configuration, Minikube serves as the preferred tool. The environment must have the Ingress addon enabled to manage external traffic routing.

The initial cluster preparation involves several command-line operations to establish the environment:

  • minikube start to bootstrap the local cluster
  • minikube addons enable ingress to activate the ingress controller
  • minikube addons list to verify the status of installed addons
  • minikube tunnel to provide network access for the ingress controller in a local environment

Namespace and Secret Management

To maintain strict isolation and follow the principle of least privilege, Keycloak should be deployed within its own dedicated namespace. This prevents resource conflicts and allows for granular Role-Based Access Control (RBAC).

Secrets are utilized to handle sensitive information, ensuring that credentials such as database passwords and admin usernames are not hardcoded into deployment manifests. These secrets are injected into the pods at runtime as environment variables.

The following operations establish the necessary logical boundaries and sensitive data stores:

```bash
# Create a dedicated namespace for Keycloak
kubectl create namespace keycloak

# Create a secret for the Keycloak admin credentials
kubectl create secret generic keycloak-admin \
  --namespace keycloak \
  --from-literal=username=admin \
  --from-literal=password='YourStrongPassword123!'

# Create a secret for the PostgreSQL connection
kubectl create secret generic keycloak-db \
  --namespace keycloak \
  --from-literal=db-url='jdbc:postgresql://postgres-service:5432/keycloak' \
  --from-literal=db-username=keycloak \
  --from-literal=db-password='DbPassword456!'
```

The creation of the keycloak-db secret is particularly critical, as it encapsulates the JDBC URL, the database username, and the password required for the Keycloak instance to establish a connection to the persistence layer.
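As a rough sketch of how these secrets reach the pod at runtime, the container's `env` list can reference each key with `secretKeyRef`. The secret names and keys below are the ones created above. The `KC_BOOTSTRAP_ADMIN_*` variable names are an assumption about the Keycloak release in use: they apply to recent versions, whereas older releases read `KEYCLOAK_ADMIN` and `KEYCLOAK_ADMIN_PASSWORD` instead.

```yaml
# Fragment of a container spec: inject secret values as environment variables.
env:
  - name: KC_BOOTSTRAP_ADMIN_USERNAME   # assumes a recent Keycloak release
    valueFrom:
      secretKeyRef:
        name: keycloak-admin
        key: username
  - name: KC_BOOTSTRAP_ADMIN_PASSWORD   # assumes a recent Keycloak release
    valueFrom:
      secretKeyRef:
        name: keycloak-admin
        key: password
  - name: KC_DB_USERNAME
    valueFrom:
      secretKeyRef:
        name: keycloak-db
        key: db-username
```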

Data Persistence and PostgreSQL Deployment

Keycloak requires a reliable, ACID-compliant relational database to function. While various databases are supported, PostgreSQL is a standard choice for production-grade deployments. In a production-like scenario, such as a deployment on AWS, the database configuration must be highly available to match the uptime requirements of the identity service.

In a local development environment using Helm, a PostgreSQL cluster can be deployed with a single replica for testing, though production environments should utilize more replicas to ensure data redundancy.

The deployment of the database via Helm follows this structure:

```bash
# Add the Bitnami repository to manage Helm charts
helm repo add bitnami https://charts.bitnami.com/bitnami

# Install the PostgreSQL HA (High Availability) cluster.
# The "hotel" namespace must already exist; substitute the namespace
# you use for the database.
helm install -n hotel keycloak-db bitnami/postgresql-ha --set postgresql.replicaCount=1
```

In a production setting, the postgresql.replicaCount would be increased to 3 or more to ensure that even if a database node fails, the identity service remains operational.
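The same setting can be kept in a values file rather than passed with `--set`. The sketch below is an assumption-laden illustration: the file name `values-prod.yaml` is hypothetical, and only the `postgresql.replicaCount` key is taken from the command above.

```yaml
# values-prod.yaml (hypothetical file name)
# Mirrors the --set postgresql.replicaCount flag, raised for production.
postgresql:
  replicaCount: 3
```

Such a file would be supplied to Helm with the `-f` flag in place of the `--set` argument.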

Keycloak Deployment Configuration and Environment Variables

The configuration of Keycloak within Kubernetes is heavily reliant on environment variables that control the internal behavior of the Quarkus-based Keycloak runtime. These variables define how the service handles clustering, health checks, metrics, and networking.

The integration of Infinispan is perhaps the most critical component for achieving high availability. Without proper cache configuration, the pods will act as isolated islands, unable to share session state. By using the jdbc-ping transport stack, Keycloak pods can use the shared database to discover each other and form a cluster, ensuring that session data is distributed and available across all members of the cluster.

The following table outlines the essential environment variables required for a functional and observable Keycloak deployment:

| Environment Variable | Value/Setting | Impact and Purpose |
|---|---|---|
| KC_CACHE | ispn | Enables the use of Infinispan for distributed caching. |
| KC_CACHE_STACK | jdbc-ping | Configures the discovery mechanism to use the database for cluster membership. |
| KC_DB_URL | (Database JDBC URL) | Defines the connection string for the persistent PostgreSQL instance. |
| KC_DB_USERNAME | (Secret-derived) | The username used to authenticate with the PostgreSQL database. |
| KC_DB_PASSWORD | (Secret-derived) | The password used to authenticate with the PostgreSQL database. |
| KC_HOSTNAME | https://keycloak.example.com | Sets the base URL for the Keycloak instance to prevent redirect issues. |
| KC_HTTP_ENABLED | true | Allows HTTP traffic, typically used behind an ingress that handles TLS termination. |
| KC_PROXY_HEADERS | xforwarded | Ensures Keycloak correctly interprets X-Forwarded-* headers from the Ingress. |
| KC_METRICS_ENABLED | true | Activates the Prometheus-compatible metrics endpoint. |
| KC_HEALTH_ENABLED | true | Activates the management endpoints for health and readiness checks. |
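The two cache settings from the table might be expressed in a container spec as follows. This is a sketch under assumptions: the `jdbc-ping` stack is only available in sufficiently recent Keycloak releases, and the `KC_DB` entry is an addition not listed in the table, included because Keycloak needs to be told which database vendor to use.

```yaml
# Fragment of a container spec: clustering-related environment variables.
env:
  - name: KC_CACHE
    value: "ispn"          # distributed Infinispan caches
  - name: KC_CACHE_STACK
    value: "jdbc-ping"     # discover peers through the shared database
  - name: KC_DB
    value: "postgres"      # addition: selects the PostgreSQL vendor
```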

For monitoring and reliability, Kubernetes Probes are essential. A readinessProbe ensures that the service does not receive traffic until it is fully initialized, while a livenessProbe monitors the health of the container and restarts it if the process becomes unresponsive.

The following configuration snippet demonstrates how these variables and probes are integrated into a deployment manifest:

```yaml
# Example configuration fragments for the Keycloak container spec
env:
  - name: KC_DB_URL
    valueFrom:
      secretKeyRef:
        name: keycloak-db
        key: db-url
  - name: KC_DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: keycloak-db
        key: db-password
  - name: KC_HEALTH_ENABLED
    value: "true"
  - name: KC_METRICS_ENABLED
    value: "true"
  - name: KC_HOSTNAME
    value: "https://keycloak.example.com"
  - name: KC_HTTP_ENABLED
    value: "true"
  - name: KC_PROXY_HEADERS
    value: "xforwarded"
# "management" refers to a named container port declared in the pod spec
readinessProbe:
  httpGet:
    path: /health/ready
    port: management
  initialDelaySeconds: 30
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /health/live
    port: management
  initialDelaySeconds: 60
  periodSeconds: 30
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: "2"
    memory: 2Gi
```
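For context, the fragments above belong inside a container definition. A minimal sketch of the surrounding Deployment might look like the following. The resource name, the `app: keycloak` label, the port names, and the `latest` image tag are all assumptions for illustration; a real deployment would pin a specific Keycloak release. Ports 8080, 9000, and 7800 are Keycloak's default HTTP, management, and JGroups ports in recent releases.

```yaml
# Skeleton Deployment showing where env, probes, and resources are placed.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keycloak               # assumed name
  namespace: keycloak
spec:
  replicas: 2                  # at least two pods for high availability
  selector:
    matchLabels:
      app: keycloak            # assumed label
  template:
    metadata:
      labels:
        app: keycloak
    spec:
      containers:
        - name: keycloak
          image: quay.io/keycloak/keycloak:latest   # pin a release in practice
          args: ["start"]      # production mode
          ports:
            - name: http
              containerPort: 8080
            - name: management   # referenced by the probes via port: management
              containerPort: 9000
            - name: jgroups      # cluster communication between pods
              containerPort: 7800
          # env, readinessProbe, livenessProbe, and resources go here
```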

Networking, Ingress, and TLS Configuration

To expose Keycloak to the outside world, an Ingress resource is required. This resource manages how external traffic is routed to the Keycloak service. In a production environment, traffic must be encrypted using TLS (Transport Layer Security).

The deployment typically uses an Ingress Controller such as ingress-nginx. In local development with Minikube, the Ingress addon is used instead. For automated TLS management, cert-manager with an issuer such as Let's Encrypt is a widely used choice.

The process of configuring the Ingress involves:

  • Installing the Ingress Controller (e.g., ingress-nginx).
  • Creating a TLS Secret containing the certificate and private key.
  • Defining an Ingress resource that specifies the hostnames and the secret to use.

The following command is used to create a quick Ingress resource for local testing, using wget to fetch a template and sed to inject the correct Minikube IP:

```bash
wget -q -O - https://raw.githubusercontent.com/keycloak/keycloak-quickstarts/refs/heads/main/kubernetes/keycloak-ingress.yaml | \
sed "s/KEYCLOAK_HOST/keycloak.$(minikube ip).nip.io/" | \
kubectl create -f -
```

For a more manual or controlled approach, a standard Ingress manifest is used:

```yaml
# Ingress resource for external access with TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: keycloak-ingress
  namespace: keycloak
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - keycloak.example.com
      secretName: auth-tls-secret
  rules:
    - host: keycloak.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: keycloak-service
                port:
                  number: 80
```
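The `cert-manager.io/cluster-issuer` annotation above refers to a ClusterIssuer named `letsencrypt-prod`, which has to exist in the cluster. A minimal sketch of such an issuer, assuming cert-manager is already installed and HTTP-01 validation through the nginx ingress class, could look like this. The email address and the account key secret name are placeholders, not values from the original setup.

```yaml
# Sketch of a Let's Encrypt ClusterIssuer for cert-manager (HTTP-01 challenge).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com                 # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key     # placeholder secret name
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx          # assumes a recent cert-manager
```

With this in place, cert-manager would request a certificate for the hosts listed under `tls` and store it in the secret named by `secretName`.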

Observability and Monitoring Strategies

A mission-critical identity service requires deep visibility into its operational state. Keycloak provides built-in support for Prometheus, which allows for the scraping of performance and health metrics.

When KC_METRICS_ENABLED is set to true, Keycloak exposes metrics at the /metrics endpoint on the management port. These metrics are essential for tracking key performance indicators (KPIs) such as:

  • Login success and failure rates
  • Total number of tokens issued
  • Active session durations
  • Request latency and throughput
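One way to have Prometheus collect these metrics is a ServiceMonitor, sketched below. This assumes the Prometheus Operator is installed, that the Keycloak Service carries an `app: keycloak` label, and that it exposes a port named `management`. None of these are stated in the original setup, so treat the names as illustrative.

```yaml
# Sketch of a ServiceMonitor that scrapes /metrics on the management port.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keycloak               # assumed name
  namespace: keycloak
spec:
  selector:
    matchLabels:
      app: keycloak            # assumed Service label
  endpoints:
    - port: management         # assumed Service port name
      path: /metrics
      interval: 30s
```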

For comprehensive observability, tools like OneUptime can be integrated. OneUptime provides a layer of proactive monitoring, uptime tracking, and incident management. By setting up internal HTTP monitors that specifically target Keycloak's health endpoints (/health/ready and /health/live), administrators can receive alerts through communication channels like Slack, email, or PagerDuty. This allows the technical team to detect authentication outages and remediate them before the impact reaches the end-user.

Testing and Validation in Local Environments

Before a deployment is pushed to a production environment, it must be validated in a local cluster. This involves running the Keycloak statefulset and verifying the administrative access.

A quick way to start Keycloak in a development cluster is through the official quickstart YAML:

```bash
kubectl create -f https://raw.githubusercontent.com/keycloak/keycloak-quickstarts/refs/heads/main/kubernetes/keycloak.yaml
```

This command initializes Keycloak with default credentials (admin/admin). Once the pods are running and the Ingress is configured, the administrator can access the console through the following URLs:

  • Keycloak Home: https://<host>/
  • Keycloak Admin Console: https://<host>/admin
  • Keycloak Account Console: https://<host>/realms/myrealm/account

It is important to note that the Account Console URL will only function once a specific "realm" has been created within the Keycloak configuration. In Keycloak, a "realm" acts as a multi-tenant boundary, allowing different organizations or environments to maintain separate sets of users and clients within the same deployment.

Conclusion

Deploying Keycloak on Kubernetes represents a significant shift from traditional server-based identity management toward a resilient, cloud-native paradigm. By leveraging Kubernetes' orchestration capabilities, such as Namespaces for isolation, Secrets for security, and Ingress for controlled external access, organizations can build a highly available identity backbone. The integration of Infinispan for distributed caching via jdbc-ping is the technical linchpin that ensures session continuity across a cluster of pods, effectively solving the problem of stateful sessions in a stateless container environment. Furthermore, the adoption of observability through Prometheus metrics and external monitoring platforms like OneUptime ensures that the identity layer is not a "black box," but a transparent, measurable component of the larger microservices ecosystem. Ultimately, the combination of Keycloak's robust IAM features and Kubernetes' scalable infrastructure provides the necessary foundation for secure, enterprise-grade distributed applications.
