Identity Federation and Observability: Implementing OAuth2 via Keycloak for Grafana Ecosystems

The integration of Keycloak and Grafana represents a critical architectural junction for modern DevOps and Site Reliability Engineering (SRE) workflows. At the core of this integration lies the transition from fragmented, siloed authentication mechanisms to a unified, centralized Single Sign-On (SSO) architecture. Keycloak, an open-source identity provider, serves as the authoritative source of truth, capable of managing complex identity lifecycles through support for standardized protocols such as OpenID Connect (OIDC), OAuth 2.0, and SAML. When this identity layer is successfully bridged to Grafana, the result is a streamlined security posture where user permissions, roles, and access controls are managed in a single, auditable location.

This technical configuration goes beyond simple login capabilities. It encompasses the orchestration of user federation from legacy systems like Active Directory or LDAP, the automated provisioning of clients via Infrastructure as Code (IaC) tools like OpenTofu, and the sophisticated mapping of JWT (JSON Web Token) claims to Grafana-specific permission levels. Furthermore, the integration extends into the realm of observability, where Keycloak's internal metrics—such as HTTP request latency and deployment evolution—can be visualized within Grafana dashboards to ensure the health of the entire authentication pipeline.

The Role of Keycloak as an Identity Provider

Keycloak operates as a robust identity management solution that facilitates the central management of users, applications, and permissions. Within a modern Kubernetes-based deployment, Keycloak acts as the gatekeeper for various microservices and dashboards.

The fundamental power of Keycloak lies in its ability to federate identities from diverse, pre-existing sources. This capability ensures that organizations do not need to migrate entire user databases to a new system. Instead, Keycloak can act as a proxy or bridge for:

  • Active Directory (AD) integration
  • LDAP-based user directories
  • External relational databases

In a typical environment, such as a rhmlab realm, administrators configure LDAP federation to connect to an existing domain, such as rhmlab.local. This lets users already present in the corporate directory access Grafana without a second set of credentials. Note that after the initial configuration of LDAP federation, the user list in the Keycloak UI may appear empty. To verify connectivity and user availability, perform an explicit search in the Keycloak administration console.

Beyond federation, Keycloak allows for the direct creation of local users, such as a kctest user, which can be utilized for testing authentication flows in isolation from the enterprise directory. This dual approach—combining external federation with local identity management—provides a flexible foundation for both development and production-scale authentication requirements.

Automated Client Provisioning with OpenTofu

In the context of modern DevOps, manual configuration of OAuth2 clients is considered an anti-pattern. To maintain consistency and scalability, the provisioning of the Grafana client within Keycloak should be managed through Infrastructure as Code (IaC). Using OpenTofu, engineers can define the Keycloak client, its secrets, and its redirect URIs in a declarative manner.

The configuration of a keycloak_openid_client resource for Grafana requires precise definitions to ensure the OAuth2 handshake completes successfully. The following configuration block illustrates a robust setup for a confidential client:

```hcl
resource "keycloak_openid_client" "grafana" {
  realm_id  = keycloak_realm.homelab.id
  client_id = "grafana"
  name      = "Grafana Client"
  enabled   = true

  # Confidential client: Grafana authenticates to Keycloak with a secret.
  access_type   = "CONFIDENTIAL"
  client_secret = "yourclientsecret" # placeholder value

  standard_flow_enabled        = true # Authorization Code Flow
  implicit_flow_enabled        = false
  direct_access_grants_enabled = true
  use_refresh_tokens           = true

  root_url  = "https://grafana.lan"
  admin_url = "https://grafana.lan"
  base_url  = "/applications"

  valid_redirect_uris = [
    "https://grafana.lan/login/generic_oauth"
  ]
  web_origins = [
    "https://grafana.lan"
  ]
  valid_post_logout_redirect_uris = [
    "https://grafana.lan/login"
  ]
}
```

Each attribute in this block has a direct impact on the security and functionality of the SSO flow:

  • client_secret: This is the cryptographic proof used by the Grafana backend to authenticate itself to Keycloak. If this does not match, the OAuth2 exchange will fail.
  • standard_flow_enabled: Setting this to true enables the Authorization Code Flow, the recommended flow for server-side web applications.
  • valid_redirect_uris: This is a critical security boundary. The URI https://grafana.lan/login/generic_oauth must be explicitly whitelisted so that Keycloak refuses to deliver authorization codes to any other address, which prevents authorization code interception attacks.
  • web_origins: This setting controls Cross-Origin Resource Sharing (CORS) behavior, allowing the Grafana frontend to communicate correctly with the Keycloak identity provider.
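
To make the attributes above concrete, the following Python sketch builds the authorization request that starts the Authorization Code Flow. It assumes Keycloak's usual OIDC endpoint layout (/realms/<realm>/protocol/openid-connect/auth) and a realm named homelab, as suggested by the resource reference above; the host keycloak.lan, the state value, and the helper name build_authorization_url are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_authorization_url(keycloak_base, realm, client_id, redirect_uri, scopes, state):
    """Return the URL the browser is sent to at the start of the Authorization Code Flow."""
    endpoint = f"{keycloak_base.rstrip('/')}/realms/{realm}/protocol/openid-connect/auth"
    query = urlencode({
        "response_type": "code",       # the flow permitted by standard_flow_enabled = true
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must match an entry in valid_redirect_uris
        "scope": " ".join(scopes),
        "state": state,                # opaque value echoed back to the client (CSRF protection)
    })
    return f"{endpoint}?{query}"


url = build_authorization_url(
    "https://keycloak.lan",            # hypothetical Keycloak host
    "homelab",
    "grafana",
    "https://grafana.lan/login/generic_oauth",
    ["openid", "email", "profile", "roles"],
    "example-state",                   # hypothetical; a real client generates a random value
)
params = parse_qs(urlsplit(url).query)
print(url)
```

Grafana performs this step itself once generic OAuth is configured; the sketch is only meant to show which client attributes surface in the request that Keycloak validates.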

Furthermore, the configuration of scopes is essential for ensuring that the JWT contains the necessary user metadata. By defining default_scopes, such as email, profile, and roles, the administrator ensures that Grafana receives the identity information required to build the user profile and assign permissions.

```hcl
resource "keycloak_openid_client_default_scopes" "grafana_client_default_scopes" {
  depends_on = [keycloak_openid_client_optional_scopes.grafana_client_optional_scopes]

  realm_id  = keycloak_realm.homelab.id
  client_id = keycloak_openid_client.grafana.id

  default_scopes = [
    "email",
    "offline_access",
    "profile",
    "roles",
  ]
}
```

Advanced Role Mapping and JWT Claim Parsing

One of the most complex aspects of integrating Keycloak with Grafana is the translation of Keycloak roles into Grafana-specific permissions. Grafana recognizes a specific hierarchy of permissions:

  • GrafanaAdmin: Full access to all configuration and user management.
  • Admin: Access to dashboard management and data source configuration.
  • Editor: Ability to create and modify dashboards and panels.
  • Viewer: Read-only access to existing dashboards and data.

To automate this, administrators must map Keycloak roles (either Realm roles or Client roles) to these Grafana levels. While both types of roles can be utilized, the configuration of the "Mapper" is the deciding factor in a successful integration.

In recent versions of Keycloak, the structure of the JSON Web Token (JWT) differs from what many older Grafana examples assume. Historically, realm roles were mapped directly to a roles attribute. In the current default token layout, client roles are nested within the resource_access object under the client identifier, while realm roles sit under realm_access. A typical JWT payload now contains a fragment like this:

```json
{
  "resource_access": {
    "grafana": {
      "roles": [
        "grafanaadmin"
      ]
    }
  }
}
```

This structural change means that a standard, unconfigured parsing of the JWT by Grafana will fail to find the required roles. To resolve this, engineers must implement one of two strategies:

  1. Reconfigure Keycloak Mappers: Modify the Keycloak client mapper to map roles directly to a top-level attribute rather than using the resource_access nesting. This is often achieved via a Client Role Mapper that maps directly to roles.
  2. Configure Grafana Role Attribute Path: Adjust the Grafana configuration so that role_attribute_path, a JMESPath expression evaluated against the token claims, points at the nested location. This allows Grafana to traverse the nested structure, for example, by targeting resource_access.grafana.roles.

The choice between these two methods depends on whether the administrator prefers to centralize the logic within the Identity Provider or the Service Provider (Grafana). Using a client role mapper that maps directly to roles is often considered the more straightforward approach for simplifying the role_attribute_path configuration in Grafana.
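
The nested lookup described in the second strategy can be sketched in Python to inspect what a given token would resolve to. This is a minimal illustration, not Grafana's implementation: the role names admin and editor and the helper names are hypothetical (only grafanaadmin appears in the sample payload), and the token is decoded without signature verification, which is acceptable for debugging only.

```python
import base64
import json

# Hypothetical Keycloak client-role names mapped to Grafana levels,
# ordered from most to least privileged.
ROLE_MAP = [
    ("grafanaadmin", "GrafanaAdmin"),
    ("admin", "Admin"),
    ("editor", "Editor"),
]


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def grafana_role(claims, client_id="grafana", default="Viewer"):
    """Walk resource_access.<client_id>.roles and return the highest matching level."""
    roles = claims.get("resource_access", {}).get(client_id, {}).get("roles", [])
    for keycloak_role, grafana_level in ROLE_MAP:
        if keycloak_role in roles:
            return grafana_level
    return default


def _b64(obj):
    """Base64url-encode a JSON object the way JWT segments are encoded."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# Unsigned sample token built around the payload shown earlier.
sample_claims = {"resource_access": {"grafana": {"roles": ["grafanaadmin"]}}}
sample_token = f"{_b64({'alg': 'none'})}.{_b64(sample_claims)}."

print(grafana_role(decode_jwt_payload(sample_token)))
```

In Grafana itself the same traversal is written as a JMESPath expression in role_attribute_path; the Python version only mirrors that logic so a token can be checked by hand before changing either side of the integration.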

Observability: Visualizing Keycloak Metrics in Grafana

The integration between Keycloak and Grafana is not merely about authentication; it is also about monitoring the health of the authentication infrastructure itself. Keycloak provides a wealth of internal metrics that can be scraped by Prometheus and subsequently visualized in Grafana to provide insights into deployment performance and security events.

To begin this observability journey, the Keycloak instance must be configured to expose metrics. For high-resolution monitoring, particularly when analyzing HTTP request latency via heatmaps, the following setting must be enabled:

http-metrics-histograms-enabled = true

Enabling histograms allows for the calculation of quantiles and the generation of heatmaps, which are essential for detecting performance degradation or latency spikes in the authentication process.
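
To see why histogram buckets are enough to derive quantiles, the sketch below estimates a quantile from cumulative bucket counts using linear interpolation inside the matching bucket, which is similar in spirit to what Prometheus's histogram_quantile function does. The bucket boundaries and counts are invented sample data, and estimate_quantile is a hypothetical helper, not part of Keycloak or Prometheus.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    buckets is a list of (upper_bound_seconds, cumulative_count) pairs, sorted by
    bound and ending with the +Inf bucket, as in a Prometheus histogram.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # quantile falls in +Inf: report the last finite bound
            if count == lower_count:
                return upper_bound  # empty bucket, nothing to interpolate
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Invented sample: cumulative request counts per latency bucket (seconds).
sample_buckets = [(0.05, 50), (0.1, 80), (0.25, 95), (0.5, 99), (math.inf, 100)]

p50 = estimate_quantile(0.5, sample_buckets)
p90 = estimate_quantile(0.9, sample_buckets)
print(round(p50, 3), round(p90, 3))
```

Without histograms only counts and sums are available, so an average latency can be computed but percentiles and heatmaps cannot.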

The process for setting up these dashboards involves the following steps:

  1. Ensure Keycloak metrics are enabled within the Keycloak deployment configuration.
  2. Configure a Prometheus instance to scrape the Keycloak metrics endpoint.
  3. Retrieve the official Keycloak-Grafana dashboard JSON files.
  4. Execute the following command to clone the necessary dashboard assets. BRANCH_FROM_STEP_1 is a placeholder for the dashboard repository branch to check out:

```bash
git clone -b BRANCH_FROM_STEP_1 https://github.com/keycloak/keycloak-grafana-dashboard.git
```

  5. Import the downloaded JSON files into your running Grafana instance.
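
The import step can also be scripted. The sketch below prepares request bodies in the shape accepted by Grafana's dashboard HTTP API (POST /api/dashboards/db). It rests on several assumptions: that the JSON files in the clone are plain dashboard definitions, that the helper names are free to choose (they are hypothetical), and that the actual HTTP call with an API token is handled separately. Dashboards that declare datasource inputs may still need to be imported through the Grafana UI.

```python
import json
from pathlib import Path


def build_import_payload(dashboard, overwrite=True):
    """Wrap a dashboard definition in the body expected by POST /api/dashboards/db."""
    dashboard = dict(dashboard)  # shallow copy so the caller's dict is untouched
    dashboard["id"] = None       # let the target Grafana instance assign its own id
    return {"dashboard": dashboard, "overwrite": overwrite}


def load_dashboards(directory):
    """Yield (path, payload) for every *.json file below the given directory."""
    for path in sorted(Path(directory).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            yield path, build_import_payload(json.load(handle))


# Invented minimal dashboard, just to show the resulting payload shape.
payload = build_import_payload({"id": 42, "uid": "kc-demo", "title": "Demo"})
print(json.dumps(payload))
```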

Once imported, these dashboards allow administrators to visualize the evolution of Keycloak metrics over time, such as login success/failure rates, session counts, and resource usage. This level of visibility is critical for identifying potential brute-force attacks or misconfigurations in the OAuth2 flow.

Configuration Requirements and Potential Pitfalls

When configuring OAuth2 authentication, several critical configuration parameters must be meticulously verified to avoid "looping" login attempts or authentication failures.

| Configuration Parameter | Description | Impact of Error |
| --- | --- | --- |
| root_url | The base URL of the Grafana instance. | Incorrect URLs will cause the OAuth2 callback to fail. |
| client_id | The unique identifier of the client in Keycloak. | Mismatched IDs prevent the initial handshake. |
| client_secret | The secret key shared between Keycloak and Grafana. | Incorrect secrets result in 401 Unauthorized errors. |
| valid_redirect_uris | The whitelist of allowed callback URLs. | Incorrect URIs trigger "Invalid Redirect URI" errors in Keycloak. |
| role_attribute_path | The JMESPath expression used to find roles in the JWT. | Incorrect paths result in users being assigned the 'Viewer' role by default. |

One significant pitfall occurs when a user has the same email address across multiple authentication providers (e.g., Keycloak and Grafana.com). If this is not addressed, Grafana may fail to match the incoming SSO identity to the existing local user profile. Additional configuration is then required so that users are matched on a unique claim, such as the subject identifier, rather than on the email address alone.

Furthermore, the [server] section of the Grafana configuration must have its root_url set correctly. This is essential because the callback URL generated during the OAuth2 request depends on the server's knowledge of its own public-facing address. If the root_url is misconfigured, the valid_redirect_uris defined in Keycloak will not match the incoming request, leading to a complete failure of the SSO flow.
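
The dependency between root_url and valid_redirect_uris can be checked before deployment with a small Python sketch. It assumes the callback path /login/generic_oauth used earlier in this article and performs an exact-match comparison, which simplifies Keycloak's actual redirect URI validation; both helper names are hypothetical.

```python
def expected_callback_url(root_url):
    """Derive the OAuth2 callback URL from Grafana's [server] root_url."""
    return root_url.rstrip("/") + "/login/generic_oauth"


def callback_is_allowed(root_url, valid_redirect_uris):
    """Exact-match check against the URIs registered on the Keycloak client."""
    return expected_callback_url(root_url) in valid_redirect_uris


allowed = ["https://grafana.lan/login/generic_oauth"]

# A correctly configured public address matches the registered URI.
print(callback_is_allowed("https://grafana.lan/", allowed))

# A root_url left at a localhost address does not, so the SSO flow would fail.
print(callback_is_allowed("http://localhost:3000/", allowed))
```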

Analysis of User Experience and Permission Enforcement

The success of a Keycloak-Grafana integration is ultimately measured by the seamlessness of the user experience and the integrity of the permission enforcement. When the integration is functioning correctly, the user experience is characterized by a single click on the "Sign in with Keycloak" button, followed by a standardized Keycloak login interface.

The impact of role mapping is most visible in the user interface of the Grafana Dashboard itself. The presence or absence of specific menus and actions is strictly controlled by the roles parsed from the JWT:

  • Users with the Admin role will observe a comprehensive set of menus, including access to the administration panel and data source management.
  • Users with the Editor role will have the ability to interact with panels and dashboards but will lack access to critical system configurations.
  • Users with the Viewer role will be presented with a restricted interface, where they can consume data but cannot alter the underlying visualization structure.

The distinction between these roles is even apparent in the "Profile" section of the Grafana Dashboard. An Admin user can view expanded profile details and management options, whereas a Viewer's profile is limited to basic identity information. This granular control ensures that even in a shared environment, the principle of least privilege is maintained, preventing unauthorized users from making destructive changes to the monitoring infrastructure.

