Architecting Network Automation with Ansible for Cisco NX-OS

The shift toward Infrastructure as Code (IaC) has fundamentally transformed the management of data center networking, moving away from manual command-line interface (CLI) interactions toward programmatic, scalable, and repeatable configurations. At the center of this transformation for Cisco Nexus environments is the integration of Ansible with NX-OS. Ansible, an agentless open-source automation engine, allows network engineers to treat their network configuration as software, leveraging YAML-based playbooks to orchestrate deployments, manage configurations, and perform complex audits across vast fleets of switches. By removing the need for a resident agent on the switch—relying instead on standard communication protocols—Ansible minimizes the overhead on the control plane while providing a robust framework for configuration management and orchestration.

The ecosystem for Cisco NX-OS automation is primarily driven by the cisco.nxos collection, a comprehensive set of modules and plugins designed to interface specifically with the Nexus series of switches. This collection aligns the NX-OS experience with other core networking platforms, ensuring that a consistent operational model is applied across the enterprise. The power of this integration lies in its modularity; tasks are broken down into discrete units of work, each handled by a Python-based module that abstracts the underlying CLI or API complexity. Whether the goal is to deploy a spine-leaf architecture, configure BGP peering, or manage VRFs, the Ansible framework provides the necessary tools to ensure that the state of the network matches the intended design defined in the codebase.

Core Infrastructure and Installation Requirements

Establishing a functional automation environment requires a precise alignment of software dependencies and system configurations. The foundation of any Ansible deployment is Python, as Ansible itself is written in Python and utilizes it to execute the modules that interact with the network hardware.

For users seeking a rapid start, a VSCode Devcontainer provides a pre-configured, isolated environment that ensures all dependencies are met without polluting the host operating system. Alternatively, a standard Linux machine is sufficient, provided it has Python version 3.11 or higher installed. This floor tracks the control-node Python requirement of recent ansible-core releases (ansible-core 2.18, for example, requires Python 3.11 or newer on the control node), so a single interpreter version covers every ansible-core release that the cisco.nxos collection currently supports.

The installation process begins with the deployment of the Python package manager, known as pip. Pip is an essential administrative tool that allows for the installation and management of external libraries and dependencies that are not included in the Python standard library. Without pip, the installation of the Ansible engine and its associated collections would be manually intensive and prone to versioning errors.

Once pip is operational, the Ansible engine is installed via the terminal using the following command:

sudo pip install ansible

Following the installation, verify the installed version (for example, with ansible --version) to confirm that it is compatible with the intended collection versions. This check prevents "version skew," where the automation engine lacks the features needed by the modules referenced in the playbooks.

Technical Architecture of Ansible Playbooks

The operational logic of Ansible is structured around the concept of playbooks. A playbook is a human-readable file written in YAML (YAML Ain't Markup Language), a data-serialization language specifically designed to be intuitive for both humans and machines. The hierarchical structure of a playbook is as follows:

  • Playbooks: The top-level container for the automation logic.
  • Plays: Within a playbook, one or more plays are defined. A play maps a specific group of hosts (e.g., "spine_switches") to a set of tasks.
  • Tasks: Each play consists of one or more tasks. A task is the smallest unit of execution that describes a specific action to be taken.
  • Modules: Every task is associated with a module. A module is a specialized Python script shipped with the Ansible installation or via a collection. The module is the actual "worker": for network devices it runs on the Ansible control node and uses the configured connection (SSH or NX-API) to apply the required configuration or retrieve data from the switch.

For example, when configuring a Nexus 9000 switch, a playbook might define a play for "Leaf Nodes," and within that play, a task using the cisco.nxos.nxos_l2_interfaces module to set the switchport mode. This modular approach ensures that the automation is granular and that failures can be isolated to specific tasks rather than crashing the entire deployment.
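The hierarchy described above can be sketched as a single playbook file. This is a minimal illustration rather than a reference design: the group name leaf_switches, the interface Ethernet1/10, and VLAN 100 are hypothetical values chosen for the example.

```yaml
---
# Playbook: the file itself is a list of plays (one play here).
- name: Leaf Nodes                     # Play: maps a host group to tasks
  hosts: leaf_switches                 # hypothetical inventory group
  gather_facts: false                  # skip the default (server-oriented) fact gathering
  tasks:
    - name: Set switchport mode on a host-facing port   # Task
      cisco.nxos.nxos_l2_interfaces:                     # Module
        config:
          - name: Ethernet1/10         # hypothetical interface
            mode: access
            access:
              vlan: 100                # hypothetical VLAN ID
        state: merged
```

If this task fails on one switch, Ansible reports the failure against that task and host, which is what makes troubleshooting at task granularity practical.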

Deep Dive into the cisco.nxos Collection

The cisco.nxos collection is a specialized bundle of content designed to automate Cisco NX-OS network appliances. This collection is categorized as Red Hat Ansible Certified Content, meaning it undergoes rigorous testing and is entitled to official support through the Ansible Automation Platform (AAP).

Connection Protocols

To interact with a Nexus switch, Ansible must establish a communication channel. The cisco.nxos collection supports two primary connection methods:

  • network_cli: This method simulates a human operator logging into the CLI via SSH. It is the most universal method and is supported across almost all NX-OS versions.
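  • httpapi: This method leverages NX-API, an HTTP/HTTPS interface on the switch that accepts commands wrapped in JSON-RPC requests and returns structured responses. Because the output does not have to be scraped from CLI text, this is generally faster and more efficient for large-scale data retrieval. NX-API must be enabled on the switch (feature nxapi) before this method can be used.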
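The two connection methods are selected through inventory variables. The following YAML inventory is a sketch under stated assumptions: the group names, host names, and addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical, and credentials are deliberately left out because they would normally come from a vault or the command line.

```yaml
---
all:
  children:
    nxos_cli:                          # hypothetical group reached over SSH
      hosts:
        leaf01:
          ansible_host: 192.0.2.11
      vars:
        ansible_connection: ansible.netcommon.network_cli
        ansible_network_os: cisco.nxos.nxos
    nxos_api:                          # hypothetical group reached over NX-API
      hosts:
        spine01:
          ansible_host: 192.0.2.1
      vars:
        ansible_connection: ansible.netcommon.httpapi
        ansible_network_os: cisco.nxos.nxos
        ansible_httpapi_use_ssl: true
        ansible_httpapi_validate_certs: false   # lab assumption; validate certificates in production
```

Keeping the connection settings at group level means the same playbook can target either group without modification.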

Compatibility and Support Matrix

The collection has been tested against specific software versions to ensure stability. For MDS switches, the modules are validated against NX-OS 8.4(1) and 9.4(3a), and they are expected to function on all releases above 8.4(1).

In terms of the Ansible engine, the collection requires ansible-core 2.16.0 or later. There is also a critical dependency on the ansible.netcommon collection. Specifically:

  • For ansible-core versions up to 2.18.x, the compatible versions are ansible.netcommon v8.0.1 and cisco.nxos v10.2.0.
  • For ansible-core versions 2.19 and above, ansible.netcommon must be version 8.1.0 or higher to resolve compatibility issues.
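One way to make this alignment explicit is to pin the collections in a requirements file and install from it with ansible-galaxy collection install -r requirements.yml. The sketch below encodes the first pairing listed above, for a control node running ansible-core 2.18.x or earlier; the file name is the conventional one and the pins should be adjusted to the release actually in use.

```yaml
---
# requirements.yml: pins for ansible-core up to 2.18.x, per the matrix above
collections:
  - name: cisco.nxos
    version: "10.2.0"
  - name: ansible.netcommon
    version: "8.0.1"
    # On ansible-core 2.19 and above, replace the pin above with:
    # version: ">=8.1.0"
```

Committing this file alongside the playbooks keeps every engineer and CI runner on the same collection versions.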

Comprehensive Module Analysis

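The cisco.nxos collection provides a wide range of resource modules. A resource module manages one specific part of the device configuration and drives it to a declared desired state. Because the module compares the requested state with the running configuration before acting, rerunning the same task makes no further changes (idempotency).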

Command and Generic Execution Modules

When no dedicated resource module exists for a particular feature, the nxos_config module serves as a fallback. It lets the operator push raw CLI configuration lines to the device, providing a safety net for emerging features that have not yet been codified into a dedicated module. Additionally, cisco.nxos.nxos is the name under which the collection ships its cliconf, httpapi (NX-API), and netconf plugins, which carry commands to the device over the corresponding transport.
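A fallback task of this kind might look like the following sketch. The group name nxos_switches, the interface, and the description text are hypothetical; the second task uses nxos_command, the collection's module for running show commands, to read the result back.

```yaml
---
- name: Fallback configuration with raw CLI lines (illustrative)
  hosts: nxos_switches                 # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Push a CLI line under an interface context
      cisco.nxos.nxos_config:
        parents: interface Ethernet1/1     # hypothetical interface
        lines:
          - description uplink-to-spine    # hypothetical description

    - name: Read the interface configuration back
      cisco.nxos.nxos_command:
        commands:
          - show running-config interface Ethernet1/1
      register: show_output

    - name: Display the returned text
      ansible.builtin.debug:
        var: show_output.stdout_lines
```

Because nxos_config works on raw lines, it is best reserved for features that no resource module covers yet.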

Layer 2 and Layer 3 Interface Management

The management of interfaces is a core component of any network automation strategy. The following modules provide granular control:

  • cisco.nxos.nxos_l2_interfaces: Manages Layer 2 interface settings. Recent updates in version 11.1.3 added an alias for the mode option as switchport_mode. Version 10.2.0 enhanced this module's ability to handle CDP, Link flap, and beacon attributes.
  • cisco.nxos.nxos_l3_interfaces: Manages Layer 3 IP addresses and routing settings. Version 11.1.3 included a rewrite of this module to improve the code logic for handling redirects.
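As a brief illustration of the Layer 3 side, the sketch below first places a port in routed mode with nxos_interfaces and then assigns an address with nxos_l3_interfaces. The group name spine_switches is borrowed from the earlier example in the text; the interface and the address (from the 192.0.2.0/24 documentation range) are hypothetical.

```yaml
---
- name: Routed uplink on a spine (illustrative)
  hosts: spine_switches
  gather_facts: false
  tasks:
    - name: Put the port into Layer 3 mode first
      cisco.nxos.nxos_interfaces:
        config:
          - name: Ethernet1/1          # hypothetical interface
            mode: layer3
            enabled: true
        state: merged

    - name: Assign the IPv4 address
      cisco.nxos.nxos_l3_interfaces:
        config:
          - name: Ethernet1/1
            ipv4:
              - address: 192.0.2.1/31  # hypothetical address
        state: merged
```

Splitting the work across two resource modules mirrors how each module owns one slice of the interface configuration.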

Routing and Advanced Networking

The collection includes high-level modules for complex routing protocols:

  • cisco.nxos.nxos_bgp_global: Manages the global BGP process and configuration.
  • cisco.nxos.nxos_bgp_address_family: Handles the specific address family configurations within BGP.
  • cisco.nxos.nxos_bgp_neighbor_address_family: Manages the relationship between BGP neighbors and their associated address families.
  • cisco.nxos.nxos_bgp_templates: Used to standardize BGP configurations across multiple neighbors using templates.
  • cisco.nxos.nxos_static_routes: Manages static routing. Version 11.1.3 fixed a critical facts parser issue: inline VRF routes are now correctly filtered out of the global route collection, which prevents the incorrect VRF route deletions seen previously.
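A minimal BGP peering play built on these modules could be sketched as follows. The AS numbers, router ID, neighbor address, and group name are hypothetical, and the play assumes the BGP feature has to be enabled first, which is done here with the collection's nxos_feature module.

```yaml
---
- name: Minimal eBGP peering on a spine (illustrative)
  hosts: spine_switches                # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Enable the BGP feature
      cisco.nxos.nxos_feature:
        feature: bgp
        state: enabled

    - name: Configure the global BGP process and one neighbor
      cisco.nxos.nxos_bgp_global:
        config:
          as_number: "65001"           # hypothetical local AS
          router_id: 192.0.2.1
          neighbors:
            - neighbor_address: 192.0.2.2
              remote_as: "65002"       # hypothetical remote AS
              description: leaf01
        state: merged
```

Address-family settings for the process and for each neighbor would then be layered on with nxos_bgp_address_family and nxos_bgp_neighbor_address_family.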

Device Security and Global Configuration

Security is managed through a series of targeted modules:

  • cisco.nxos.nxos_aaa_server: Manages the global configuration of Authentication, Authorization, and Accounting (AAA) servers.
  • cisco.nxos.nxos_aaa_server_host: Handles the host-specific configurations for AAA servers.
  • cisco.nxos.nxos_acls: A resource module for managing Access Control Lists. Version 10.2.0 fixed an issue where TCAM bank errors were not being correctly captured by the error regex.
  • cisco.nxos.nxos_acl_interfaces: Manages the application of ACLs to specific interfaces.
  • cisco.nxos.nxos_banner: Used to manage the multiline banners that appear upon login to the device.
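To show the shape of an ACL definition, here is a hedged sketch using nxos_acls. The ACL name, sequence number, and source prefix are hypothetical, and the option layout reflects the module's documented structure as best understood here; it should be checked against the module documentation for the installed collection version.

```yaml
---
- name: Manage an IPv4 ACL (illustrative)
  hosts: nxos_switches                 # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Permit SSH from one management subnet
      cisco.nxos.nxos_acls:
        config:
          - afi: ipv4
            acls:
              - name: MGMT-SSH         # hypothetical ACL name
                aces:
                  - sequence: 10
                    grant: permit
                    protocol: tcp
                    source:
                      prefix: 192.0.2.0/24   # hypothetical management subnet
                    destination:
                      any: true
                      port_protocol:
                        eq: "22"
        state: merged
```

Binding the ACL to a port is a separate step handled by nxos_acl_interfaces.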

Infrastructure Health and Monitoring

To maintain network stability, Ansible provides modules for monitoring and redundancy:

  • cisco.nxos.nxos_bfd_global: Configures Bidirectional Forwarding Detection (BFD) at the global level to detect link failures rapidly.
  • cisco.nxos.nxos_bfd_interfaces: Manages BFD settings on a per-interface basis.
  • cisco.nxos.nxos_hsrp_interfaces: Manages Hot Standby Router Protocol (HSRP) configurations. This module replaces the deprecated nxos_hsrp module. Version 11.1.3 fixed parsers for the preempt and priority attributes.
  • cisco.nxos.nxos_snmp_server: Manages SNMP configurations. Version 11.1.3 resolved a community parsing issue.

Detailed Module Mapping and Specifications

The following table provides a structured overview of the core modules available within the cisco.nxos collection and their primary functions.

| Module Name | Primary Function | Key Capability |
|---|---|---|
| cisco.nxos.nxos | Command Execution | Supports cliconf, NX-API, and Netconf |
| cisco.nxos.nxos_aaa_server | Security | Global AAA server configuration |
| cisco.nxos.nxos_aaa_server_host | Security | Host-specific AAA configuration |
| cisco.nxos.nxos_acls | Traffic Control | Access Control List resource management |
| cisco.nxos.nxos_acl_interfaces | Traffic Control | ACL application to interfaces |
| cisco.nxos.nxos_banner | Administration | Multiline banner management |
| cisco.nxos.nxos_bfd_global | Reliability | Global BFD configuration |
| cisco.nxos.nxos_bfd_interfaces | Reliability | Per-interface BFD configuration |
| cisco.nxos.nxos_bgp_global | Routing | Global BGP process management |
| cisco.nxos.nxos_bgp_address_family | Routing | BGP address family resource management |
| cisco.nxos.nxos_bgp_neighbor_address_family | Routing | BGP neighbor address family management |
| cisco.nxos.nxos_bgp_templates | Routing | BGP template resource management |
| cisco.nxos.nxos_l2_interfaces | Interface | L2 switchport and VLAN configuration |
| cisco.nxos.nxos_l3_interfaces | Interface | L3 IP and routing interface management |
| cisco.nxos.nxos_hsrp_interfaces | Redundancy | HSRP configuration on interfaces |
| cisco.nxos.nxos_vrf_global | Routing | Global VRF management (incl. RD attribute) |

Versioning Evolution and Release Analysis

The cisco.nxos collection follows Semantic Versioning for its own releases, which allows engineers to track changes, bug fixes, and deprecations. PEP 440 is the schema used to express the ansible-core versions that each release supports.

Analysis of Version 11.x Releases

The 11.x series focuses on refinement and bug resolution. Version 11.1.3 is particularly significant for its fixes in the nxos_facts module, specifically improving how facts are handled when using the httpapi connection type. This ensures that the state of the device is accurately reported back to the Ansible controller. Furthermore, the nxos_static_routes module received a critical update to prevent the accidental deletion of VRF routes by correctly filtering them from the global route collection.
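Fact gathering over httpapi can be exercised with a short play such as the one below. This is a sketch: the group name nxos_api is hypothetical and is assumed to be configured for the httpapi connection, and l2_interfaces is just one of the resource subsets that could be requested.

```yaml
---
- name: Gather NX-OS facts over NX-API (illustrative)
  hosts: nxos_api                      # hypothetical group using the httpapi connection
  gather_facts: false
  tasks:
    - name: Collect minimal facts plus Layer 2 interface resource facts
      cisco.nxos.nxos_facts:
        gather_subset:
          - min
        gather_network_resources:
          - l2_interfaces

    - name: Show the structured resource data
      ansible.builtin.debug:
        var: ansible_network_resources
```

The structured data returned here is the same representation that the resource modules compare against when deciding whether a change is needed.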

Analysis of Version 10.x Releases

The 10.x series introduced major structural changes. Version 10.0.0 established the minimum required ansible-core version as 2.16.0. Version 10.1.0 marked a transition in the collection's API, deprecating the nxos_hsrp and nxos_vrf_interface modules in favor of nxos_hsrp_interfaces and nxos_vrf_interfaces, respectively. This change reflects a shift toward "interface-centric" resource management, which is more aligned with how NX-OS is architected.

Version 10.2.0 introduced enhancements to the nxos_interfaces module, adding support for:

  • service-policy
  • logging
  • mac-address
  • snmp configuration

Additionally, the nxos_vrf_global module was updated to support the rd (Route Distinguisher) attribute, which is essential for MPLS and VRF-Lite deployments.

Operational Implementation Workflow

To successfully implement Ansible for NX-OS, a structured workflow must be followed. This process moves from the local environment setup to the execution of role-based tasks.

Environment Preparation

  1. Install Python 3.11 or higher.
  2. Install the pip package manager.
  3. Execute sudo pip install ansible.
  4. Install the cisco.nxos collection from Ansible Galaxy (for example, ansible-galaxy collection install cisco.nxos) or from GitHub.

Playbook Development

The development of playbooks usually involves the creation of "Roles." Roles allow for the organization of tasks into reusable components. For instance, in a data center deployment, an engineer would create separate role task files for "Spine" switches and "Leaf" switches.

  • Spine Role Tasks: These tasks typically focus on high-throughput routing, BGP configurations, and core interface settings.
  • Leaf Role Tasks: These tasks focus on VLANs, L2 interface configurations, and end-device connectivity.

The actual tasks are written in YAML files. Each task calls a specific module from the cisco.nxos collection and passes a state parameter. Modules such as nxos_banner accept state: present to apply the configuration if it is missing, or state: absent to remove it. Resource modules use states such as merged, replaced, overridden, and deleted instead.
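The present and absent states can be illustrated with the nxos_banner module. In this sketch the group name and banner text are hypothetical, and the removal task carries the special never tag so that it only runs when explicitly requested with --tags banner_off.

```yaml
---
- name: Banner lifecycle (illustrative)
  hosts: nxos_switches                 # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Ensure the MOTD banner is present
      cisco.nxos.nxos_banner:
        banner: motd
        text: "Authorized access only" # hypothetical banner text
        state: present

    - name: Ensure the MOTD banner is absent
      cisco.nxos.nxos_banner:
        banner: motd
        state: absent
      tags:
        - never                        # skipped unless another of its tags is selected
        - banner_off
```

Running the play a second time without changes should report ok rather than changed for the first task, which is the idempotent behavior described earlier.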

Execution and Validation

Once the playbooks are written, they are executed against the target inventory. The agentless nature of Ansible means nothing is installed on the switch: for network devices the module code runs on the control node, which connects via SSH or HTTP(S), sends the resulting CLI commands or NX-API requests, and retrieves the result. Validation is performed by checking the changed or failed status of each task in the Ansible output.
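Task results can also be captured and inspected inside the playbook itself. The sketch below registers the result of a resource module and prints its changed flag together with the commands the module generated; the group, interface, and VLAN values are hypothetical.

```yaml
---
- name: Apply a change and inspect the result (illustrative)
  hosts: leaf_switches                 # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Configure an access port
      cisco.nxos.nxos_l2_interfaces:
        config:
          - name: Ethernet1/10         # hypothetical interface
            mode: access
            access:
              vlan: 100                # hypothetical VLAN ID
        state: merged
      register: l2_result

    - name: Report whether anything changed and which commands were sent
      ansible.builtin.debug:
        msg: "changed={{ l2_result.changed }} commands={{ l2_result.commands }}"
```

An empty commands list with changed reported as false indicates that the device already matched the declared state.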

Conclusion

The integration of Ansible with Cisco NX-OS represents a paradigm shift in network administration, moving from reactive, manual configuration to a proactive, declarative model. By utilizing the cisco.nxos collection, organizations can achieve a level of consistency and scalability that is impossible through manual CLI entry. The transition from deprecated modules like nxos_hsrp to more robust, interface-specific modules like nxos_hsrp_interfaces demonstrates the maturity of the toolset.

The critical importance of version alignment—specifically the dependency between ansible-core, ansible.netcommon, and the cisco.nxos collection—cannot be overstated. A failure to align these versions, such as using ansible-core 2.19 with an outdated ansible.netcommon version below 8.1.0, will result in compatibility failures. When these technical requirements are met, the result is a powerful automation framework capable of managing everything from basic banners to complex BGP address families and VRF routing. The ability to treat the network as code not only reduces the risk of human error but also enables rapid disaster recovery and seamless scaling of the data center fabric.

