Integrating Ansible with Kerberos for Windows management replaces basic authentication protocols with a ticket-based security architecture. In enterprise environments, particularly those built on Active Directory (AD), security policies often mandate the move from NTLM (New Technology LAN Manager) to Kerberos because of NTLM's inherent weaknesses, such as its susceptibility to relay attacks. With Kerberos authentication for WinRM (Windows Remote Management), the Ansible controller never sends the password to the Windows target. Instead, it obtains a Ticket Granting Ticket (TGT) and uses it to request service tickets that the target can verify. Identity verification is handled by a trusted third party, the Key Distribution Center (KDC), which provides a scalable and secure method for managing thousands of Windows endpoints.
The Technical Foundation of Kerberos in Ansible
Kerberos is a network authentication protocol designed to provide strong authentication for client-server applications by using secret-key cryptography. In the context of Ansible, Kerberos transforms the way the controller interacts with the Windows target. Instead of the controller sending a password to the target server via WinRM, the controller requests a service ticket from the KDC. The Windows server then validates this ticket, granting access based on the trust established between the server and the KDC.
One critical detail is that Kerberos realm names are case-sensitive. In Active Directory environments, the realm is the domain name written in uppercase (for example, EXAMPLE.COM). A lowercase realm in the principal or in krb5.conf leads to authentication failures, because it is treated as a different realm that the client cannot match to a KDC.
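Because the realm's case is an easy thing to get wrong, it can help to normalize it before it reaches kinit or the inventory. The following is a minimal sketch; `to_principal` is a hypothetical helper name and `svc_ansible` is an illustrative account, neither of which comes from Ansible or the Kerberos tooling.

```bash
# Build a Kerberos principal whose realm is always uppercase.
# to_principal is a hypothetical helper, not part of Ansible or krb5.
to_principal() {
    local user realm
    user="$1"
    # tr is used instead of Bash-only ${var^^} so the function also works in plain sh.
    realm=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    printf '%s@%s\n' "$user" "$realm"
}

# Example (svc_ansible is an illustrative account name):
#   to_principal svc_ansible example.com   # prints svc_ansible@EXAMPLE.COM
```

Only the realm is changed; the user part is passed through untouched.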
Environment Preparation and Dependency Installation
To establish a functional Kerberos bridge between a Linux-based Ansible controller (such as Red Hat Enterprise Linux or Rocky Linux) and a Windows Server 2022 environment, specific system-level and Python-level dependencies must be present.
System-Level Packages
The controller requires the Kerberos workstation tools and development headers to compile the necessary Python bindings. On RHEL or Rocky Linux, the following commands are utilized:
```bash
sudo yum update
sudo yum -y install python3.12
sudo yum -y install python3.12-pip
sudo yum -y install gcc python3.12-devel krb5-devel krb5-libs krb5-workstation
```
These packages provide the kinit utility for ticket acquisition and the shared libraries required by the Python GSSAPI and Kerberos modules.
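A quick way to confirm that the installation produced the expected tools is to check that they are on the PATH. This is only a sketch: `missing_tools` is a hypothetical helper, and the list of commands to check is an assumption based on the packages named above.

```bash
# Print the name of every listed command that cannot be found on PATH.
# missing_tools is a hypothetical helper; pass it whichever tools you rely on.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: an empty result means kinit, klist, and gcc are all available.
#   missing_tools kinit klist gcc
```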
Python Library Requirements
Ansible relies on specific Python libraries to handle the Kerberos handshake over WinRM. If these are missing, the user will encounter the specific error: kerberos: the python kerberos library is not installed.
The installation process involves using the pip package manager to install the pywinrm library with Kerberos support, along with other essential helper libraries:
```bash
python3.12 -m pip install --user ansible
python3.12 -m pip install --user ansible-lint
python3.12 -m pip install --user pywinrm
# Quote the extra so the shell does not treat the brackets as a glob pattern
python3.12 -m pip install --user 'pywinrm[kerberos]'
```
For users utilizing Python 3.12 on Rocky Linux, additional libraries such as pykerberos, gssapi, krb5, and pypsrp[kerberos]<=1.0.0 are recommended to ensure stability and compatibility with the Kerberos protocol.
Configuring the Kerberos Client (krb5.conf)
The krb5.conf file is the central configuration for the Kerberos client. It tells the system where the KDC is located and which realm to use. While typically located in /etc/krb5.conf, advanced users may implement a local configuration to avoid system-wide changes.
Standard Configuration Structure
A typical krb5.conf file contains three primary sections: [libdefaults], [realms], and [domain_realm].
```ini
[libdefaults]
    default_realm = EXAMPLE.COM
    dns_lookup_realm = false
    dns_lookup_kdc = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    rdns = false

[realms]
    EXAMPLE.COM = {
        kdc = 192.168.100.2
        admin_server = 192.168.100.2
    }

[domain_realm]
    .example.com = EXAMPLE.COM
    example.com = EXAMPLE.COM
```
Technical Analysis of Configuration Parameters
- default_realm: Specifies the default realm for the system. This must be uppercase (e.g., EXAMPLE.COM).
- dns_lookup_realm and dns_lookup_kdc: When set to false, the system does not use DNS records (TXT for realm mapping, SRV for KDC discovery) and relies strictly on the addresses provided in the [realms] section. This is particularly useful in sandbox environments where DNS is unreliable.
- forwardable: When set to true, it allows the ticket to be forwarded to another server, which is a prerequisite for Kerberos delegation (the "double-hop" scenario).
- rdns: Disabling reverse DNS lookups (false) prevents delays and failures when the network does not have a consistent reverse lookup zone.
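The two most common mistakes in this file, a lowercase default_realm and a default realm with no matching stanza under [realms], can be caught with a small check before running kinit. The sketch below assumes a simple krb5.conf laid out like the one shown earlier; `krb5_default_realm` and `check_default_realm` are hypothetical helper names, and the parsing is deliberately naive (it ignores include directives and comments).

```bash
# Extract the first default_realm value from a krb5.conf-style file.
krb5_default_realm() {
    sed -n 's/^[[:space:]]*default_realm[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1" | head -n 1
}

# Succeed (and print the realm) only if default_realm is set, is uppercase,
# and has a "REALM = {" stanza somewhere in the file.
check_default_realm() {
    local file realm upper
    file="$1"
    realm=$(krb5_default_realm "$file")
    if [ -z "$realm" ]; then
        echo "no default_realm found in $file" >&2
        return 1
    fi
    upper=$(printf '%s' "$realm" | tr '[:lower:]' '[:upper:]')
    if [ "$realm" != "$upper" ]; then
        echo "default_realm '$realm' is not uppercase" >&2
        return 1
    fi
    # Compare with whitespace removed so "EXAMPLE.COM = {" and "EXAMPLE.COM={" both match.
    if ! awk -v r="$realm" '{ line = $0; gsub(/[ \t]/, "", line); if (line == r "={") found = 1 }
                            END { exit !found }' "$file"; then
        echo "no [realms] stanza for $realm" >&2
        return 1
    fi
    printf '%s\n' "$realm"
}

# Example: check_default_realm /etc/krb5.conf
```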
Ansible Inventory Configuration for Kerberos
Once the system is prepared and the tickets are obtainable, the Ansible inventory must be modified to route traffic through Kerberos instead of NTLM.
Inventory Variables for Kerberos
The following variables must be defined in the inventory (e.g., hosts.ini or inventory.yml) to ensure the connection is established correctly:
| Variable | Recommended Value | Purpose |
|---|---|---|
| ansible_connection | winrm | Defines the connection plugin to use |
| ansible_port | 5986 | The standard port for HTTPS WinRM |
| ansible_winrm_transport | kerberos | Explicitly tells WinRM to use Kerberos authentication |
| ansible_winrm_scheme | https | Ensures the connection is encrypted via SSL/TLS |
| ansible_winrm_server_cert_validation | ignore | Skips certificate validation (common in internal labs) |
| ansible_user | user@REALM | The user principal name (UPN), with the realm in uppercase |
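To show how these variables fit together, the sketch below prints an INI-style inventory group that carries them. It is an illustration rather than a required layout: `render_inventory`, the group name `windows`, and the host and account names in the example are all assumptions. The `ignore` setting for certificate validation mirrors the table and is only appropriate for labs.

```bash
# Print an INI inventory group configured for Kerberos over HTTPS WinRM.
# Usage: render_inventory <user> <realm> <host> [<host> ...]
# render_inventory and the "windows" group name are hypothetical.
render_inventory() {
    local user realm
    user="$1"
    realm=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    shift 2
    echo "[windows]"
    printf '%s\n' "$@"
    cat <<EOF

[windows:vars]
ansible_connection=winrm
ansible_port=5986
ansible_winrm_transport=kerberos
ansible_winrm_scheme=https
ansible_winrm_server_cert_validation=ignore
ansible_user=${user}@${realm}
EOF
}

# Example (illustrative names):
#   render_inventory svc_ansible example.com win01.example.com > hosts.ini
```

No ansible_password is written, on the assumption that a ticket is obtained separately with kinit.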
Comparison: Kerberos vs. NTLM Configuration
The primary technical difference lies in the ansible_winrm_transport variable. NTLM authenticates with a challenge-response exchange derived from the password supplied in ansible_password, whereas Kerberos authenticates with tickets and requires the user to be formatted as user@REALM. Furthermore, if a valid Ticket Granting Ticket (TGT) already exists on the controller, the ansible_password variable can be omitted entirely, as the ticket is used for authentication.
Validating and Testing the Kerberos Chain
Before executing a playbook, it is mandatory to verify that the controller can actually communicate with the KDC and obtain a ticket.
Manual Ticket Acquisition
The kinit command is used to authenticate the user and store the TGT in the local cache.
kinit user@EXAMPLE.COM
After running kinit, the user should verify the ticket status using klist.
klist
The output should show a valid ticket for the service principal krbtgt/REALM@REALM (for the realm used in this article, krbtgt/EXAMPLE.COM@EXAMPLE.COM). If kinit fails, the administrator must investigate the following:
- DNS resolution: Can the controller resolve the FQDN of the Domain Controllers?
- Network connectivity: Is port 88 (Kerberos) open on the domain controllers?
- Realm formatting: Is the realm name provided in uppercase?
- Credentials: Are the account credentials correct?
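The first three items on this checklist can be probed from the controller with standard tools. The helpers below are a hedged sketch: the function names are hypothetical, `port_open` relies on Bash's /dev/tcp feature and the coreutils `timeout` command, and `has_ticket` assumes the MIT `klist` shipped in krb5-workstation, where `-s` runs silently and reports the cache state through the exit status.

```bash
# DNS: succeeds if the name resolves through the system resolver.
can_resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Network: succeeds if a TCP connection to the given port (default 88) opens within 3 seconds.
# Kerberos also uses UDP 88; only TCP is probed here.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "${2:-88}" 2>/dev/null
}

# Realm: succeeds only if the part after "@" is already uppercase.
realm_is_upper() {
    local realm
    realm="${1#*@}"
    [ "$realm" = "$(printf '%s' "$realm" | tr '[:lower:]' '[:upper:]')" ]
}

# Ticket: succeeds only if the credential cache holds a readable, unexpired ticket.
has_ticket() {
    klist -s
}

# Example (dc01.example.com is an illustrative domain controller name):
#   can_resolve dc01.example.com && port_open dc01.example.com 88 \
#       && realm_is_upper user@EXAMPLE.COM && echo "preflight ok"
```

Incorrect credentials cannot be detected this way; they only surface when kinit itself is run.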
Advanced Implementation Strategies
Localized Kerberos Configurations
In scenarios where the administrator does not have root access to /etc/krb5.conf or needs different configurations for different targets, a local configuration can be used. This is achieved by creating a wrapper script (e.g., kinit.sh) and pointing Ansible to it.
kinit.sh content:
```bash
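#!/bin/bash
# Run kinit against the krb5.conf that sits next to this script.
cd "$(dirname "$0")" || exit 1
export KRB5_CONFIG="$PWD/krb5.conf"
# Pass every argument through: Ansible may supply kinit flags before the principal.
exec kinit "$@"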
```
In the Ansible inventory, this is configured via:
ansible_winrm_kinit_cmd: "./kinit.sh"
This approach decouples the Kerberos configuration from the global system state, allowing for more flexible multi-domain management.
Managing Kerberos in Execution Environments (EE)
When running Ansible within an Execution Environment (such as AWX or Ansible Automation Platform on K3s), the kinit process occurs inside the container. This introduces complexity in debugging, as the ticket cache is internal to the pod.
To investigate Kerberos issues in an EE, a debug playbook can be used to pause the execution, allowing the administrator to enter the pod.
Debug Playbook:
```yaml
- name: Debug Kerberos Authentication
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Ensure /etc/krb5.conf is mounted
      ansible.builtin.debug:
        msg: "{{ lookup('file', '/etc/krb5.conf') }}"
    - name: Pause for specified minutes for debugging
      ansible.builtin.pause:
        minutes: 10
```
The debugging workflow follows these steps:
1. Launch the job.
2. Identify the pod: kubectl -n <namespace> get pod (look for automation-job-*).
3. Access the pod: kubectl -n <namespace> exec -it <pod name> -- bash.
4. Verify the configuration: cat /etc/krb5.conf.
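Steps 2 to 4 can be combined into one small helper while the job is paused. This is a sketch under assumptions: `pick_job_pod` and `show_pod_krb5` are hypothetical names, and the script simply takes the first pod whose name starts with automation-job-, which may not be the right one if several jobs run at once.

```bash
# Pick the first automation job pod from `kubectl get pod -o name` output
# (lines of the form "pod/<name>").
pick_job_pod() {
    sed -n 's|^pod/\(automation-job-.*\)$|\1|p' | head -n 1
}

# Print /etc/krb5.conf from inside the running job pod.
# Usage: show_pod_krb5 <namespace>
show_pod_krb5() {
    local ns pod
    ns="$1"
    pod=$(kubectl -n "$ns" get pod -o name | pick_job_pod)
    if [ -z "$pod" ]; then
        echo "no automation-job pod found in namespace $ns" >&2
        return 1
    fi
    kubectl -n "$ns" exec "$pod" -- cat /etc/krb5.conf
}
```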
Handling Kerberos Delegation and the "Double-Hop" Problem
A common challenge in Windows automation is Kerberos delegation. This occurs when the Ansible controller authenticates to Server A, and Server A then needs to authenticate to Server B (e.g., accessing a network share). This is known as the "double-hop" scenario.
Debugging with klist.exe
To verify if a ticket has been successfully delegated to the remote Windows server, the administrator can use klist.exe on the remote side. This will display the TGT information for the current session, confirming whether the identity of the user was passed from the controller to the server.
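One way to run this check without logging on to the server is an ad-hoc Ansible command. The sketch below assumes the ansible.windows collection is installed and that delegation is requested through the WinRM connection variable ansible_winrm_kerberos_delegation; `remote_klist`, the DRY_RUN switch, and the default inventory name hosts.ini are hypothetical conveniences.

```bash
# Run klist.exe on a Windows host with Kerberos delegation requested.
# Usage: remote_klist <host> [<inventory>]
# Set DRY_RUN=1 to print the command instead of executing it.
remote_klist() {
    local host inventory
    host="$1"
    inventory="${2:-hosts.ini}"
    # Rebuild the positional parameters as the full ansible command line.
    set -- ansible "$host" -i "$inventory" \
        -m ansible.windows.win_command -a 'klist.exe' \
        -e ansible_winrm_kerberos_delegation=true
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Example (win01.example.com is an illustrative host):
#   remote_klist win01.example.com hosts.ini
```

If the ticket was forwarded, the remote output should include a krbtgt entry for the connecting user.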
Using Become for Credential Delegation
While the become keyword in Ansible is not technically part of the Kerberos authentication mechanism, it is used to execute tasks as a different user. To test if a network share is accessible via delegation, a hardcoded become configuration can be used for verification:
```yaml
- name: "Copy File From Network Share"
  ansible.windows.win_copy:
    src: \\path\to\file.txt
    dest: C:\Temp\Test\
    remote_src: True
  become: True
  become_method: runas
  become_flags: logon_type=new_credentials logon_flags=netcredentials_only
  vars:
    ansible_become_user: hard code username
    ansible_become_pass: hard code password
```
Service Principal Names (SPN) and Delegation Failures
If Kerberos delegation fails despite correct configuration, the issue often lies with the Service Principal Names (SPN). Delegation only works if the outbound authentication also uses Kerberos. If the system providing the share has incorrect SPNs associated with it, the authentication request will attempt to reach a machine that "doesn't exist," resulting in a failure.
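A suspect SPN can be tested from the controller by asking the KDC for a service ticket, for example with the `kvno` utility from krb5-workstation. The sketch below only builds the principal string; `spn_for` is a hypothetical helper, the host name in the example is illustrative, and lowercasing the host part is an assumption made for consistency rather than a Kerberos requirement.

```bash
# Build a service principal name of the form service/host.fqdn@REALM.
# Usage: spn_for <service> <host-fqdn> <realm>
spn_for() {
    local service host realm
    service="$1"
    host=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    realm=$(printf '%s' "$3" | tr '[:lower:]' '[:upper:]')
    printf '%s/%s@%s\n' "$service" "$host" "$realm"
}

# Example (needs a valid TGT; files01.example.com is an illustrative file server):
#   kvno "$(spn_for cifs files01.example.com example.com)"
```

If the KDC has no account registered for that SPN, the ticket request should fail, which points to the SPN rather than to the Ansible configuration.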
Conclusion: Strategic Analysis of Kerberos Integration
The transition to Kerberos for Ansible-managed Windows environments is a necessity for any organization prioritizing security over convenience. The technical overhead—specifically the requirement for krb5-devel libraries, the precise configuration of krb5.conf, and the management of TGT lifetimes—is significant but provides a superior security posture.
The most critical failure point in this architecture is usually the discrepancy between DNS and the Kerberos realm. Because Kerberos relies heavily on the FQDN (Fully Qualified Domain Name), any mismatch between the hostname and the SPN will break the authentication chain. Furthermore, the move toward Execution Environments (EE) necessitates a shift in how tickets are handled; administrators must move away from manual kinit calls on the controller and instead focus on ConfigMaps and volume mounts to provide krb5.conf to the containers.
Ultimately, the use of Kerberos removes the need to supply a stored password for every connection once a valid ticket is available, shifting the trust to the KDC. While NTLM remains an option for isolated labs, the Kerberos implementation described here is the only viable path for scalable, secure, and auditable enterprise automation.