The orchestration of VMware ESXi environments using Ansible represents a sophisticated intersection of infrastructure-as-code (IaC) and virtualization management. At its core, this process relies on the ability of an Ansible controller to communicate with the ESXi hypervisor through two primary channels: the VMware API (typically via the community.vmware collection) and the Secure Shell (SSH) interface for direct command-line execution. Achieving a stable automation pipeline requires a deep understanding of the underlying Python dependencies, the specific requirements of the vSphere API, and the nuances of how Ansible handles delegation and connection plugins. When these components are misconfigured, administrators often encounter cryptic errors related to SSL versions or unreachable hosts, which are usually symptoms of a fundamental misunderstanding of the communication port or the target execution environment.
Architectural Foundation of Ansible and VMware Integration
The integration between Ansible and VMware ESXi is not a monolithic process but rather a layered architectural stack. To successfully automate an ESXi host, the environment must be provisioned with specific software components that act as the bridge between the YAML playbooks and the hypervisor's API.
The Technical Stack for VMware Automation
The following table delineates the critical components required for a functional Ansible-to-ESXi deployment based on observed working configurations.
| Component | Version/Specification | Role in Automation |
|---|---|---|
| Ansible Controller OS | CentOS 8.4 | The execution environment hosting the Ansible engine |
| Ansible Version | 2.13.5 | The core automation engine managing playbooks |
| community.vmware Collection | 3.5.0 | The specialized module set for VMware interaction |
| pyVmomi | 8.0.0.1.2 | The Python library facilitating VMware API communication |
| Python Version | 3.9 | The runtime environment for Ansible and its libraries |
| urllib3 | 2.0.4 | The HTTP library handling network requests to the API |
The Python Dependency Layer
The reliance on pyVmomi and urllib3 is critical because Ansible does not communicate with ESXi natively; the community.vmware modules use these libraries to turn each task into vSphere API calls. For instance, pyVmomi provides the object-oriented interface to the VMware vSphere API. If these libraries are outdated or mismatched, the controller may fail during the SSL handshake and the playbook stops at that task, often with a wrong_version_number error raised during SSL negotiation. Because the message looks like a network problem, the administrator may blame a firewall when the real cause is the local Python environment or, as the next section explains, the port being addressed.
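Before debugging the network, it can help to confirm which library versions the controller is actually using. The tasks below are a minimal sketch, assuming pip is installed for the interpreter that runs Ansible; `ansible_playbook_python` is Ansible's built-in variable for that interpreter, and the task names are illustrative.

```yaml
# Sketch: report the pyVmomi and urllib3 versions seen by the controller's Python.
# Assumes pip is available for the interpreter that runs ansible-playbook.
- name: Query installed library versions with pip
  ansible.builtin.command: "{{ ansible_playbook_python }} -m pip show pyvmomi urllib3"
  register: pip_show
  changed_when: false
  delegate_to: localhost

- name: Print the reported versions
  ansible.builtin.debug:
    var: pip_show.stdout_lines
```

Comparing this output with the table above shows quickly whether the controller has drifted from a known working combination.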
Navigating API Communication and the SSL Port Conflict
A common failure point in ESXi automation is the confusion between the management port (SSH) and the API port (HTTPS). This distinction is vital for the community.vmware modules.
The SSL WRONG_VERSION_NUMBER Error
When an administrator uses a module such as community.vmware.vmware_host_facts and explicitly sets port: 22 in the task, the task fails, typically with an SSL WRONG_VERSION_NUMBER error.
- Direct Fact: The VMware API listens on HTTPS port 443, not SSH port 22.
- Technical Layer: When `port: 22` is specified, Ansible attempts to initiate an HTTPS handshake with a port that is expecting the SSH protocol. Because the protocol formats differ, the SSL library returns a version mismatch error.
- Impact Layer: The user experiences a total failure of the task, despite having valid credentials and network connectivity.
- Contextual Layer: This highlights the necessity of understanding that `community.vmware` modules are API-driven, whereas the `command` or `shell` modules are SSH-driven.
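The difference between the two ports can be checked from the controller before any VMware module runs. This is a sketch only, reusing the 192.168.50.5 address from this article's example; the timeouts and the assumption that the SSH banner begins with "SSH-2.0" are illustrative choices.

```yaml
# Sketch: pre-flight checks run from the controller.
- name: Confirm the vSphere API port accepts connections
  ansible.builtin.wait_for:
    host: "192.168.50.5"
    port: 443
    timeout: 10
  delegate_to: localhost

# Port 22 greets clients with a plain-text SSH banner, not a TLS handshake,
# which is why an HTTPS client pointed at it reports a version error.
- name: Confirm port 22 speaks SSH rather than TLS
  ansible.builtin.wait_for:
    host: "192.168.50.5"
    port: 22
    search_regex: "SSH-2.0"
    timeout: 10
  delegate_to: localhost
```

If the first task fails, the problem is reachability of the API; if only the second fails, SSH is probably disabled on the host, which matters for the esxcli tasks discussed later.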
Correcting the Playbook Logic
To resolve the connectivity issue, the port parameter must be removed or changed to 443. By removing port: 22, the module defaults to the standard API port.
Example of a corrected configuration for gathering host facts:
```yaml
# No port is set, so the module uses the default API port (443).
- name: Check VMware ESXi server configuration
  community.vmware.vmware_host_facts:
    esxi_hostname: "esx7-dev"
    hostname: "192.168.50.5"
    username: "root"
    password: "********"
    validate_certs: false
  delegate_to: "localhost"
```
In this configuration, delegate_to: "localhost" is used because the vmware_host_facts module executes on the Ansible controller: the controller makes an API call to the ESXi host rather than logging into that host to run a script.
Advanced Deployment and Lifecycle Management
Beyond simple fact-gathering, Ansible can be used for the full deployment lifecycle of ESXi, from hardware firmware updates to the installation of the hypervisor itself.
Hardware-Level Orchestration (Dell PowerEdge Example)
Sophisticated playbooks, such as those found in the ansible-vsphere repository, extend automation to the hardware layer. This involves a multi-stage process:
- DRAC Update: Updating the Dell Remote Access Controller (DRAC) to the latest software version. This requires the `drac=true` extra variable to trigger the update logic.
- Management Host Setup: Configuring the controller to act as the orchestration point.
- NFS Server Deployment: Setting up an NFS server on the controller to share the ESXi installation software with the target nodes.
- ISO Customization: Customizing the ESXi installation ISO to allow for a self-installing (unattended) process.
- OS Installation: Installing the ESXi hypervisor across all defined nodes.
- vCenter Deployment: Deploying vCenter on the primary node and subsequently creating the datacenter and cluster structures.
For these playbooks to function, a specific file structure is required:
- hosts: Based on hosts-example for inventory.
- group_vars/all.yml: Based on group_vars/all_yml.example.
- host_vars/<hostname>: Individual files for each ESXi node based on host_vars/host-example.
Nested ESXi Deployment Challenges
Deploying a nested ESXi environment (running ESXi inside another ESXi VM) introduces complexities regarding OVF/OVA deployment. Users have reported failures when using the vmware_deploy_ovf module, specifically citing unsupported parameters.
The following parameters have been identified as causing failures in certain versions or configurations:
- ova_hardware_networks
- ova_networks
- ova_properties
These errors indicate a mismatch between the module version and the parameters passed, requiring the administrator to verify the documentation for the specific version of the community.vmware collection being used.
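As a point of comparison, the sketch below uses parameter names that `community.vmware.vmware_deploy_ovf` documents for network mappings and OVF properties (`networks` and `properties`). Everything else is an assumption for illustration: the variable names, the OVA path, the VM name, and the `guestinfo.hostname` property key, which depends entirely on the appliance being deployed.

```yaml
# Sketch: deploy a nested ESXi appliance with documented parameter names.
# All variables, the OVA path, and the property key are hypothetical.
- name: Deploy a nested ESXi appliance from an OVA
  community.vmware.vmware_deploy_ovf:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: false
    datacenter: "{{ datacenter_name }}"
    datastore: "{{ datastore_name }}"
    name: "nested-esxi-01"
    ovf: "/path/to/nested-esxi.ova"
    networks:
      "VM Network": "{{ target_portgroup }}"   # OVF network name -> target port group
    properties:
      guestinfo.hostname: "nested-esxi-01"     # appliance-specific key, assumed here
    power_on: true
  delegate_to: localhost
```

If a task still reports unsupported parameters, running `ansible-doc community.vmware.vmware_deploy_ovf` against the installed collection shows the exact names that version accepts.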
Execution Context: Delegation and SSH Failures
One of the most complex aspects of Ansible for ESXi is the distinction between running a module via API and running a command via SSH.
The delegate_to and hosts Dilemma
When performing tasks like installing a VIB (VMware Installation Bundle) update, the administrator must use the command module. Unlike API modules, the command module requires an SSH connection to the target host.
A common failure occurs when a task is defined as:
```yaml
- name: ESXi Install Update
  command: "esxcli software vib install -d /vmfs/volumes/datastore_x.x.x.x/VMware-ESXi-7.0U3n-219308-depot.zip"
```
If the hosts value is localhost and no delegation is specified, Ansible attempts to run esxcli on the Linux controller. Since esxcli only exists on the ESXi hypervisor, the task fails with:
[Errno 2] No such file or directory: b'esxcli'
Solving the "Unreachable" SSH State
To successfully run esxcli commands, the target host must be correctly specified. If the host is changed to the ESXi IP but the error Permission denied (publickey,keyboard-interactive) appears, the following prerequisites must be met:
- SSH Service Enablement: The SSH service must be active on the ESXi host. This can be done manually or via the `community.vmware.vmware_host_service_manager` module.
- Key Exchange: Public keys must be correctly placed in `/etc/ssh/keys-root/authorized_keys` on the ESXi host.
- Task Delegation: For tasks that enter maintenance mode, the `vmware_maintenancemode` module should be delegated to `localhost` because it uses the API. However, the subsequent `command` for the update must be targeted at the ESXi host.
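The first prerequisite can itself be automated through the API. The task below is a minimal sketch that reuses the host details from the earlier fact-gathering example; `esxi_password` is a hypothetical variable, and `TSM-SSH` is the service key ESXi uses for its SSH daemon.

```yaml
# Sketch: start the SSH service over the API so later SSH-based tasks can connect.
# esxi_password is a hypothetical variable (for example, supplied from a vault).
- name: Start the SSH service on the ESXi host
  community.vmware.vmware_host_service_manager:
    hostname: "192.168.50.5"
    username: "root"
    password: "{{ esxi_password }}"
    esxi_hostname: "esx7-dev"
    validate_certs: false
    service_name: "TSM-SSH"
    state: start
  delegate_to: localhost
```

Because this task talks to the API, it works even while SSH is still disabled, which makes it a natural first step in an update play.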
Correct execution flow for updates:
1. Task: vmware_maintenancemode -> delegate_to: localhost (API call).
2. Task: command: esxcli... -> delegate_to: <ESXI_IP> or defined in hosts (SSH call).
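The two steps above can be sketched as a single play. This is an illustration under stated assumptions, not a complete patching workflow: the `esxi_hosts` inventory group and the `esxi_password` variable are hypothetical, the inventory name is assumed to match the name the host reports for itself, SSH access as root is assumed to be working, and the reboot and exit from maintenance mode are omitted. The depot path is copied unchanged from the earlier example.

```yaml
# Sketch: API call on the controller, then an SSH command on the ESXi host.
- name: Patch ESXi hosts
  hosts: esxi_hosts            # hypothetical inventory group of ESXi hosts
  remote_user: root
  gather_facts: false          # skip fact gathering over SSH; not needed here
  tasks:
    - name: Enter maintenance mode (API call from the controller)
      community.vmware.vmware_maintenancemode:
        hostname: "{{ ansible_host | default(inventory_hostname) }}"
        username: "root"
        password: "{{ esxi_password }}"
        esxi_hostname: "{{ inventory_hostname }}"
        validate_certs: false
        state: present
      delegate_to: localhost

    - name: Install the update (SSH call on the ESXi host)
      ansible.builtin.command: "esxcli software vib install -d /vmfs/volumes/datastore_x.x.x.x/VMware-ESXi-7.0U3n-219308-depot.zip"
```

Keeping `hosts` pointed at the ESXi nodes and delegating only the API task means the `command` task needs no `delegate_to` at all, which avoids the localhost mistake described earlier.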
Comparative Analysis of Management Tools
While Ansible provides powerful orchestration capabilities, there are scenarios where other tools are more appropriate, and a hybrid approach is often the practical choice.
| Tool | Best Use Case | Strength | Weakness in ESXi Context |
|---|---|---|---|
| Ansible | Orchestration, Provisioning, Config Mgmt | Agentless, YAML-based, scalable | Limited by API/SSH availability, complex delegation |
| vSphere Lifecycle Manager | Host Patching, Firmware Updates | Native integration, validated paths | Less flexible for custom workflows |
| PowerCLI (Get-EsxCli) | Complex Hypervisor Queries, Scripting | Deep integration with ESXi internals | Requires PowerShell environment |
Conclusion: Strategic Implementation Analysis
The automation of VMware ESXi via Ansible is a powerful but fragile process that demands strict adherence to communication protocols. The primary failure vectors are usually found not in the YAML syntax but in the underlying transport layer. The most important distinction for an administrator is between API-based modules, which use port 443 and should be delegated to localhost, and shell-based commands, which use port 22 and must be targeted at the remote host.
For those deploying nested environments or performing large rollouts, combining hardware-level automation (like DRAC updates) with software-level orchestration (via the community.vmware collection) allows for a "zero-touch" deployment. However, the recurring issue of urllib3 versioning and SSL mismatches underscores the need for a standardized Python environment on the Ansible controller. Ultimately, while Ansible is well suited to deployment and configuration, for the specific task of updating ESXi hosts many administrators prefer vSphere Lifecycle Manager or PowerCLI because of the risks and complexity of managing SSH sessions and esxcli executions at scale.