Apr 4

Greg

Arista Zero Touch Provisioning Using The Ansible Automation Platform

Zero Touch Provisioning (ZTP) is the dream for network engineers, is it not? The idea is that you take a fresh out-of-the-box switch (or one that has had its configuration scrubbed), plug it in, and it is automatically provisioned. Every major switch vendor, including Arista, Cisco, and Juniper, has a ZTP procedure, and they all seem to follow a fairly similar flow. I’m going to show you the simple steps I followed to do this with Arista kit.

Video Demo

Basic Flow

This is the basic order of operations.

1. Plug in a new switch and power it on.
2. The switch sends a Bootp query asking for an IP address.
3. The DHCP/Bootp server returns an IP address along with DHCP option 67, which carries the path to a base configuration file for the switch.
4. The switch pulls the base config from that source, which can be TFTP, SFTP, FTP, HTTP, or HTTPS.
5. The switch loads the config and reboots.
6. In the base config I placed a simple script that calls the Ansible Automation Platform (AAP) API with a curl command. curl is a command-line tool for making HTTP requests. In the request I send over the IP address of the switch.
7. AAP then connects to the switch and looks up its serial number.
8. AAP uses that serial number to determine which config options should be set for this device, connects to the switch, makes all of the proper adjustments, and saves the settings.

The whole process completes in less than 7 minutes…which is pretty crazy.

DHCP/Bootp Configuration

My lab router, which runs all of my infrastructure, is a Mikrotik. It acts both as my DHCP/Bootp server (handing out an IP and pointing to the initial config file) and as my TFTP server (serving the initial config files).
So I really just enable Bootp and then add DHCP option 67 as follows:
Configure DHCP server:

Set up option 67 (you can see that I’ve configured it to use TFTP). Keep in mind that you need to put single quotes around this string or it will fail:

Last, I put in the option group associated with this specific option:

TFTP Initial Configuration Script

Here’s a copy of my initial config script:

!
hostname provision-me
ip name-server vrf default 8.8.8.8
!
! ntp server <NTP-SERVER-IP>
!
username admin privilege 15 role network-admin secret lab
!
interface Management1
 ip address 10.1.12.99/24
!
ip access-list open
 10 permit ip any any
!
ip route 0.0.0.0/0 10.1.12.1
!
ip routing
!
management api http-commands
 no shutdown
!
! banner login
! Welcome to $(hostname)!
! This switch has been provisioned using the ZTPServer from Arista Networks
! Docs: http://ztpserver.readthedocs.org/
! Source Code: https://github.com/arista-eosplus/ztpserver
! EOF
!
event-handler callaap
 trigger on-startup-config
 ! For default VRF, make sure to update the ztpserver url
 action bash export SYSIP=`FastCli -p 15 -c 'show run int management 1 | grep -Eo "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"'`; curl -f -k -H 'Content-Type: application/json' -XPOST -d '{"extra_vars": "{\"host_ip\": \"'$SYSIP'\"}"}' --user MyUser:MyPassword https://10.1.12.34/api/v2/job_templates/146/launch/
end

Looking at the script above, it places the device on the management subnet of the local network. The IP addresses in this script will need to change depending on which site it is deployed at. This could easily be done via automation and a jinja2 template.
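
As a rough illustration of that idea, the sketch below renders the site-specific lines of the base config from a few per-site values. It uses Python’s standard-library string.Template as a stand-in for jinja2, and the function name and parameters are hypothetical. The sample values are the ones from the script above.

```python
from string import Template

# Only the site-specific lines of the base config are templated here.
# string.Template is a stand-in for jinja2 (an assumption for this sketch).
SITE_LINES = Template(
    "interface Management1\n"
    " ip address $mgmt_ip/$prefix_len\n"
    "!\n"
    "ip route 0.0.0.0/0 $gateway\n"
)


def render_site_lines(mgmt_ip: str, prefix_len: int, gateway: str) -> str:
    """Fill in the management IP, prefix length, and gateway for one site."""
    return SITE_LINES.substitute(
        mgmt_ip=mgmt_ip, prefix_len=prefix_len, gateway=gateway
    )


if __name__ == "__main__":
    # Values taken from the base config shown above.
    print(render_site_lines("10.1.12.99", 24, "10.1.12.1"))
```

The rest of the base config would stay static, so only these few values need to come from per-site data.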

The really important bit here is the event-handler at the very end, named “callaap”.
It uses the on-startup-config trigger. Once the switch pulls this config it reboots, and once it comes back online it executes the handler’s action.
Breaking the action down: it first figures out the management IP and saves it to a variable. It then calls the AAP API with a simple curl command to launch a job template, passing the management IP along in the call.
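
The nested quoting in that curl command can be hard to read, so here is a minimal Python sketch of the same request body. The function name is made up for illustration, and the code only builds the payload without contacting AAP.

```python
import json


def build_launch_payload(host_ip: str) -> str:
    """Return the JSON body that the event-handler's curl command posts.

    As in the curl example, extra_vars is itself a JSON-encoded string
    nested inside the outer JSON document.
    """
    extra_vars = json.dumps({"host_ip": host_ip})
    return json.dumps({"extra_vars": extra_vars})


if __name__ == "__main__":
    # 10.1.12.99 is the Management1 address from the base config.
    print(build_launch_payload("10.1.12.99"))
```

The outer object has a single extra_vars key, and its value carries host_ip, which is the variable the arista-ztp.yml playbook reads.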

AAP Configuration/Playbooks

I’m not going to detail every single playbook, as they are mostly duplicates of each other. I am, however, going to break down three of them. All of the files can be found in my public GitHub repo.

arista-ztp.yml playbook:

---
- name: zero touch provisioning for an Arista host
  hosts: provision_host
  gather_facts: false
  vars:
    host_ip: 1.1.1.1
 
  tasks:
  - name: set new ansible host via passed variables
    ansible.builtin.set_fact:
      ansible_host: "{{ host_ip }}"
 
  - name: gather facts on host
    arista.eos.eos_facts:
      gather_subset: hardware
    register: provision_facts
 
  - name: loop through hosts in inventory looking for matching serial number
    when: hostvars[item]['serial'] == provision_facts.ansible_facts.ansible_net_serialnum
    ansible.builtin.set_fact:
      new_host: "{{ item }}"
    loop: "{{ groups['all'] }}"
 
  - name: set stats so the hostname will be passed between workflows
    ansible.builtin.set_stats:
      data:
        stat_host: "{{ new_host }}"
 
- name: provision the found switch
  hosts: "{{ hostvars['provision_host']['new_host'] }}"
  gather_facts: false
  vars:
    secret_password: lab
    # figure out the default gateway based on switch IP
    default_gateway: "{{ int_ip | regex_search('\\b(?:[0-9]{1,3}\\.){3}\\b') }}1"
 
  tasks:
  - name: set new ansible host via passed variables
    ansible.builtin.set_fact:
      int_ip: "{{ hostvars[inventory_hostname]['ansible_host'] }}"
      ansible_host: "{{ host_ip }}"
 
  - name: place the template config file on the host
    arista.eos.eos_config:
      lines: "{{ lookup('template', 'arista_config.j2') }}"
      replace: block
    ignore_errors: true
 
- name: connect into new switch and save
  hosts: "{{ hostvars['provision_host']['new_host'] }}"
  gather_facts: false
  vars:
 
  tasks:
  - name: reset ip for host
    ansible.builtin.set_fact:
      ansible_host: "{{ int_ip }}"
 
  - name: save to startup config
    arista.eos.eos_command:
      commands: copy running-config startup-config

In my inventory I have a host named “provision_host” set up with a bogus IP. This gives me a target for the “hosts” section of the first play. The very first task resets this host’s IP to the address that the basic config script passed in its API call. I then connect to the switch, gather facts from it, and loop through my inventory looking for a matching serial number. Once I find a match, I set a variable to the proper name of the new switch. That name not only sets the hostname on the switch, it is also used in the “hosts” section of the following plays.
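
The serial number lookup boils down to a simple search over the inventory. The Python sketch below shows the same logic outside of Ansible. The function name, hostnames, and serial numbers are all made up for illustration. Unlike the playbook loop, it stops at the first match, which gives the same result as long as serial numbers are unique.

```python
from typing import Optional


def find_host_by_serial(hostvars: dict, serial: str) -> Optional[str]:
    """Return the inventory hostname whose 'serial' variable matches."""
    for hostname, variables in hostvars.items():
        # Hosts without a serial (such as provision_host) never match.
        if variables.get("serial") == serial:
            return hostname
    return None


# Hypothetical inventory data; hostnames and serial numbers are invented.
inventory = {
    "provision_host": {"ansible_host": "1.1.1.1"},
    "switch1": {"ansible_host": "10.1.12.101", "serial": "SN0001"},
    "switch2": {"ansible_host": "10.1.12.102", "serial": "SN0002"},
}

if __name__ == "__main__":
    print(find_host_by_serial(inventory, "SN0002"))
```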

The second play sets its hosts field to the name of the host we just discovered in the inventory. It takes the switch’s inventory IP address and builds the default gateway from it (keeping the first three octets and adding a 1 as the last octet). I then use the switch template to blast on some new settings based on the info pulled from the inventory.
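
To see what the default_gateway expression does, here is a small Python sketch that applies the same regular expression with the standard re module. The function name is hypothetical. Like the playbook, it assumes the gateway is always the .1 address sharing the first three octets with the switch.

```python
import re

# Same pattern as the playbook's regex_search: three "octet plus dot" groups.
FIRST_THREE_OCTETS = re.compile(r"\b(?:[0-9]{1,3}\.){3}\b")


def default_gateway(int_ip: str) -> str:
    """Keep the first three octets of int_ip and append 1 as the last octet."""
    match = FIRST_THREE_OCTETS.search(int_ip)
    if match is None:
        raise ValueError(f"not an IPv4 address: {int_ip!r}")
    # match.group(0) still ends with a dot, e.g. "10.1.12."
    return match.group(0) + "1"


if __name__ == "__main__":
    print(default_gateway("10.1.12.99"))
```

For the management address used earlier, 10.1.12.99, this gives 10.1.12.1, which matches the default route in the base config.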

The last play saves the config. It needs its own connection because the previous play changed the switch’s IP, so I point ansible_host at the new address, reconnect, and finish the save.

After this I run all of my infrastructure as code playbooks to finish filling out the configs. I do this by creating a simple workflow:

The cool thing about a workflow is that I can run things easily in parallel if I like, which means configuration happens faster.

I’m going to break down a couple of the playbooks as there are different ways to accomplish similar tasks.
arista-vlandb.yml

- name: configure vlan db on aristas
  hosts: "{{ stat_host }}"
  gather_facts: false
  vars:
  tasks:
  - name: parse the vlandb config
    arista.eos.eos_vlans:
      running_config: "{{ lookup('file', 'configs/' + inventory_hostname + '-vlansdb') }}"
      state: parsed
    register: parsed_config
 
  - name: set vlans based on file settings
    arista.eos.eos_vlans:
      config: "{{ parsed_config.parsed }}"
      state: overridden
 
  - name: save to startup config
    arista.eos.eos_command:
      commands: copy running-config startup-config

In this one, and in most of the remaining playbooks, I use the awesome “parsed” state built into the modules. It takes a standard CLI config and parses it into a YAML data model. I store that data model in a variable in memory, then turn around and push it back into the module. It’s a simple way to take standard CLI and push it onto your kit. The next playbook, arista-vlans.yml, is an example of using a data model for your configuration directly.
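
To make the idea concrete, here is a toy Python sketch of what a CLI-to-data-model step looks like conceptually. It is not the module’s parser. The function is hypothetical, it only handles single “vlan N” stanzas with an optional “name” line, and the sample VLANs are made up. The vlan_id and name keys are assumed to match the eos_vlans data model, so check the module documentation before relying on them.

```python
def parse_vlan_cli(cli_text: str) -> list:
    """Turn 'vlan N' / 'name X' CLI lines into a list of dictionaries.

    Toy parser for illustration: VLAN ranges such as 'vlan 10-20'
    are not supported.
    """
    vlans = []
    current = None
    for raw_line in cli_text.splitlines():
        line = raw_line.strip()
        if line.startswith("vlan "):
            # A new stanza starts; remember it so 'name' can attach to it.
            current = {"vlan_id": int(line.split()[1])}
            vlans.append(current)
        elif line.startswith("name ") and current is not None:
            current["name"] = line.split(maxsplit=1)[1]
    return vlans


# Made-up sample of a VLAN database in CLI form.
SAMPLE_CLI = """\
vlan 10
   name users
!
vlan 20
   name servers
"""

if __name__ == "__main__":
    print(parse_vlan_cli(SAMPLE_CLI))
```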

arista-vlans.yml

- name: configure vlan db on aristas
  hosts: "{{ stat_host }}"
  gather_facts: false
  vars:
  tasks:
  - name: pull in config file
    ansible.builtin.include_vars:
      file: "configs/{{ inventory_hostname }}-vlans.yml"
 
  - name: Configure trunk ports
    when: item.mode == "trunk"
    arista.eos.eos_l2_interfaces:
      config:
      - name: "{{ item.int }}"
        mode: trunk
        trunk:
          native_vlan: "{{ item.native | default(omit) }}"
          trunk_allowed_vlans: "{{ item.trunk_allowed | default(omit) }}"
      state: replaced
    loop: "{{ vlans }}"
 
  - name: Configure access ports
    when: item.mode == "access"
    arista.eos.eos_l2_interfaces:
      config:
      - name: "{{ item.int }}"
        mode: access
        access:
          vlan: "{{ item.access_vlan }}"
      state: replaced
    loop: "{{ vlans }}"
 
  - name: save to startup config
    arista.eos.eos_command:
      commands: copy running-config startup-config

In this playbook I pull in a data model from a file named HOSTNAME-vlans.yml. This gives me the variables that I plug into the tasks. I distinguish between trunk ports and access ports so that each variable lands in the right place. As a last step I save the config.
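
The shape of that data model can be inferred from the keys the tasks reference (int, mode, native, trunk_allowed, access_vlan). The Python sketch below uses a made-up pair of entries to show how each one maps onto the structure passed to eos_l2_interfaces. The interface names, VLAN numbers, and function name are assumptions for illustration.

```python
def build_l2_config(entry: dict) -> dict:
    """Build one eos_l2_interfaces-style config item from a data-model entry."""
    if entry["mode"] == "trunk":
        trunk = {}
        # Optional keys are left out when absent, much like default(omit).
        if "native" in entry:
            trunk["native_vlan"] = entry["native"]
        if "trunk_allowed" in entry:
            trunk["trunk_allowed_vlans"] = entry["trunk_allowed"]
        return {"name": entry["int"], "mode": "trunk", "trunk": trunk}
    if entry["mode"] == "access":
        return {
            "name": entry["int"],
            "mode": "access",
            "access": {"vlan": entry["access_vlan"]},
        }
    raise ValueError(f"unsupported mode: {entry['mode']!r}")


# Hypothetical contents of configs/HOSTNAME-vlans.yml once loaded.
vlans = [
    {"int": "Ethernet1", "mode": "trunk", "native": 1, "trunk_allowed": "10,20"},
    {"int": "Ethernet2", "mode": "access", "access_vlan": 10},
]

if __name__ == "__main__":
    for entry in vlans:
        print(build_l2_config(entry))
```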

My AAP Inventory

In the variables section of the inventory I have a few common settings configured:

My inventory consists of three hosts. These could easily be sourced from a CMDB like ServiceNow.

Last, here you can see that I have the IP address configured as well as the designated serial number for each switch.

Conclusion

This is a really awesome way to deploy a LOT of kit quickly. With this process even fairly non-technical folks should be able to deploy a lot of kit on their own.

I also really enjoy the infrastructure as code approach taken here. The idea that all of your configuration can be done via config files in your code repository is something of a game changer. If I want to add a VLAN, I don’t log in to the switch. Instead I update the config in the repo and have my automation push the change. This way I have a full audit trail with revision history for every change (and other engineers can approve changes before they go out).

If you have any questions or comments, I’d love to hear them. Good luck, and happy automating!
