Feb 20 / Greg

Automating VMware Alternatives With Ansible And Ascender

I have personally used VMware with good success for nearly two decades. While it is a good product, folks occasionally ask me about alternatives and how viable they are. For this article/demo I’m going to use Proxmox VE, a competent, user-friendly hypervisor, and I’m going to examine the question primarily from an automation perspective.

First, and most importantly, when the automation is fronted by Ascender, the user experience is exactly the same no matter which hypervisor sits behind it. Your users can move smoothly from one platform to the other without knowing anything has changed.

On the backend, Ansible and Ascender will use different playbooks for each hypervisor, which means different modules and occasionally slightly different procedures, but as you’ll see, they aren’t all that different.

Demo Video

Interfaces Compared

VMware’s vCenter has been a pretty consistent interface for a while now:

Proxmox VE may be new to you, but the interface should look and feel awfully familiar:

Both very similarly allow you to access and edit settings and add/delete VMs and templates…BUT, who wants to manage infrastructure manually?

Playbooks Compared

You can find my Proxmox playbooks here, and my VMware playbooks are here.

The two playbooks in question, one for Proxmox and one for VMware, are structured very similarly, but use their respective modules.

I’m going to break down and comment on each playbook below.
Proxmox Playbook (proxmox-vm.yml)

---
- name: Create and control VMs on Proxmox using Ascender
  hosts: localhost
  gather_facts: false
  vars:
    # Configure the api connection info for Proxmox host
    proxmox_auth: &proxmox_auth
      api_host: proxmox.gregsowell.com
      api_user: "{{ gen1_user }}"
      # api_user: root@pam # username format example
      # Use standard password
      api_password: "{{ gen1_pword }}"
      # Use api token and secret - example format
      # api_token_id: gregisa5.0
      # api_token_secret: 72a72987-ff68-44f1-9fee-c09adaaecf4d

Above I use a concept in YAML known as an anchor. In every task below I will reference this anchor with an alias. This essentially allows me to define a set of options once and inject them in various other places in my playbook. An anchor is created with the & symbol, which labels the mapping that follows so it can be reused later.
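For illustration, here is a minimal, standalone sketch of the same anchor/alias pattern (the key names here are made up, purely for the example):

defaults: &connection_defaults   # "&" creates the anchor
  api_host: example.host.local
  api_user: example-user

create_vm_options:
  <<: *connection_defaults       # "*" is the alias; "<<:" merges the anchored keys in here
  name: demo-vm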

Here I’m setting up the authentication. I have examples of using username/password or an API token. Notice that for the username and password I’m using variables. This is because I’m passing a custom credential from Ascender into the playbook at run time. Doing this allows me to securely maintain those credentials in the Ascender database.
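Ascender, like AWX, lets you define custom credential types. A sketch of one that injects these two variables might look like this (the field ids are my assumption, chosen to match the playbook’s variable names):

inputs:
  fields:
    - id: gen1_user        # assumed field id, matches the playbook variable
      type: string
      label: API Username
    - id: gen1_pword       # assumed field id, matches the playbook variable
      type: string
      label: API Password
      secret: true
injectors:
  extra_vars:
    gen1_user: "{{ gen1_user }}"
    gen1_pword: "{{ gen1_pword }}"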

    # Configure template to clone for new vm
    clone_template: rocky9-template
 
    # Configure location on Host to store new vm
    storage_location: local-lvm
 
    # Linked Clone needs format=unspecified and full=false
    # Format of new VM
    vm_format: qcow2
 
    # Default name to use for the VM
    vm_name: test-rocky9
 
    # How many cores
    # vm_cores: 2
 
    # How many vcpus:
    # vm_vcpus: 2
 
    # Options for specifying network interface type
    #vm_net0: 'virtio,bridge=vmbr0'
 
    #vm_ipconfig0: 'ip=192.168.1.1/24,gw=192.168.1.1'

Above, you can see that I’m setting up defaults for my standard variables, which can be overridden at run time. Remember that variables passed in as extra_vars have the highest level of precedence.
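For example, launching the job with a set of extra_vars like the following (hypothetical values) would override the defaults above:

# hypothetical launch-time extra_vars
vm_action: provision
vm_name: web01-rocky9
vm_cores: 2
vm_net0: 'virtio,bridge=vmbr0'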

    # Switches for playbook control
    # vm_action: provision
    # vm_action: start
    # vm_action: stop
    # vm_action: delete
 
 
  tasks:
  - name: Block for creating a vm
    when: vm_action == "provision"
    block:
    - name: Create a VM based on a template
      community.general.proxmox_kvm:
        # api_user: root@pam
        # api_password: secret
        # api_host: helldorado
        <<: *proxmox_auth
        clone: "{{ clone_template }}"
        name: "{{ vm_name }}"
        node: proxmox
        storage: "{{ storage_location }}"
        format: "{{ vm_format }}"
        timeout: 500
        cores: "{{ vm_cores | default(omit) }}"
        vcpus: "{{ vm_vcpus | default(omit) }}"
        net:
          net0: "{{ vm_net0 | default(omit) }}"
        ipconfig:
          ipconfig0: "{{ vm_ipconfig0 | default(omit) }}"
      register: vm_provision_info
 
    - name: Pause after provision to give the API a chance to catch up
      ansible.builtin.pause:
        seconds: 10

Here you can see my first two tasks. I’m looking for the vm_action variable to be set to provision, and if it is, I run this block of code. I then clone a template. Here I’m doing a full clone, but I could also make it a linked clone if I wanted. You can also see some customization variables that, when not set, are omitted. After the VM provisions, the play pauses for 10 seconds to give the system time to register the new VM before it proceeds.

This is also where you see the anchor being referenced. The “<<: *proxmox_auth” line is the alias combined with a merge key: it takes the chunk of variables from above and injects it into this module’s parameters (which makes modification easier and saves several lines of code in each task).

  - name: Start VM
    when: (vm_action == "provision" and vm_provision_info.changed) or vm_action == "start"
    community.general.proxmox_kvm:
      <<: *proxmox_auth
      name: "{{ vm_name }}"
      node: proxmox
      state: started

You can see in this start task that I have two conditions: one fires when vm_action is set to start, and the other fires when the action is provision and the provision task actually made a change. The second condition exists because a cloned template won’t be started by default. We check the changed status because the clone operation is idempotent; if the VM already exists, it won’t do anything, it will simply report back “ok”.

  - name: Stop VM
    when: vm_action == "stop"
    community.general.proxmox_kvm:
      <<: *proxmox_auth
      name: "{{ vm_name }}"
      node: proxmox
      state: stopped
 
  - name: Delete VM block
    when: vm_action == "delete"
    block:
    - name: Stop VM with force
      community.general.proxmox_kvm:
        <<: *proxmox_auth
        name: "{{ vm_name }}"
        node: proxmox
        state: stopped
        force: true
 
    - name: Pause to allow shutdown to complete
      ansible.builtin.pause:
        seconds: 10
 
    - name: Delete VM
      community.general.proxmox_kvm:
        <<: *proxmox_auth
        name: "{{ vm_name }}"
        node: proxmox
        state: absent

The delete block isn’t too complex. It first stops the VM, pauses so the system will register the change, then performs the delete from disk.

VMware Playbook (vmware-vm.yml)
Compare and contrast the two playbooks: they are laid out almost identically, the tasks are in the same order, and they are configured almost the same. This means transitioning from one to the other should be pretty seamless.

---
- name: Create and control VMs on VMware using Ascender
  hosts: localhost
  gather_facts: false
  vars:
    # below are all of the details for the VM.  I'm overriding these at runtime.
    vm_datacenter: MNS
    vm_name: snowtest2
#    vm_template: Windows2016
    vm_template: Rocky8.6
#    vm_template: Rocky8
#    vm_template: Rocky9
    vm_folder: /Greg/ciq
    vm_disksize: 50
    vm_datastore: SSD
    # minimum of 4GB of RAM
    vm_memory: 4096
    vm_cpus: 4
    vm_netname: Greg
    vm_ip: 10.1.12.56
    vm_netmask: 255.255.255.0
    vm_gateway: 10.1.12.1
 
    vmware_auth: &vmware_auth
      hostname: "{{ vcenter_hostname }}"
      username: "{{ gen1_user }}"
      password: "{{ gen1_pword }}"
      validate_certs: no
 
    # Switches for playbook control
    # vm_action: provision
    # vm_action: start
    # vm_action: stop
    # vm_action: delete
 
 
  tasks:
  - name: Provision a VM
    when: vm_action == "provision"
    community.vmware.vmware_guest:
      <<: *vmware_auth
      folder: "{{ vm_folder }}"
      name: "{{ vm_name }}"
      datacenter: "{{ vm_datacenter }}"
      state: poweredon
#      guest_id: centos64Guest
      template: "{{ vm_template }}"
  # This is hostname of particular ESXi server on which user wants VM to be deployed
      disk:
      - size_gb: "{{ vm_disksize }}"
        type: thin
        datastore: "{{ vm_datastore }}"
      hardware:
        memory_mb: "{{ vm_memory }}"
        num_cpus: "{{ vm_cpus}}"
        scsi: paravirtual
      networks:
      - name: "{{ vm_netname}}"
        connected: true
        start_connected: true
        type: dhcp
        # type: static
        # ip: "{{ vm_ip }}"
        # netmask: "{{ vm_netmask }}"
        # gateway: "{{ vm_gateway }}"
        # dns_servers: "{{ vm_gateway }}"
#      wait_for_ip_address: true
#        device_type: vmxnet3
    register: deploy_vm
 
  - name: Start a VM
    when: vm_action == "start"
    community.vmware.vmware_guest:
      <<: *vmware_auth
      folder: "{{ vm_folder }}"
      name: "{{ vm_name }}"
      datacenter: "{{ vm_datacenter }}"
      state: poweredon
 
  - name: Stop a VM
    when: vm_action == "stop"
    community.vmware.vmware_guest:
      <<: *vmware_auth
      folder: "{{ vm_folder }}"
      name: "{{ vm_name }}"
      datacenter: "{{ vm_datacenter }}"
      # state: poweredoff
      state: shutdownguest 
 
  - name: Delete a VM
    when: vm_action == "delete"
    community.vmware.vmware_guest:
      <<: *vmware_auth
      folder: "{{ vm_folder }}"
      name: "{{ vm_name }}"
      datacenter: "{{ vm_datacenter }}"
      state: absent
      force: true

Interface Comparison In Ascender

For my job templates (how you pull your playbook and required components together in Ascender) I’m using something called a survey. This allows you to quickly/easily configure a set of questions for the user to answer before the automation is run. Here’s an example of my VMware survey:

Adding an entry is as simple as clicking add and filling in the blanks:

Notice in the above there is a field for “Answer variable name”. This is the extra_vars variable name that the answer will be passed to the playbook under. That’s how these surveys work; they simply pass info to the playbook as extra_vars.

So for the sake of comparison, here’s what it looks like when you launch the two different job templates:
VMware Job Template:

Proxmox Job Template:

Notice how they are almost identical. I added all of the options to the VMware JT, but I kept the Proxmox one a little cleaner. I could have just as easily added all of the knobs, but I wanted to show how you can make them full featured or simple. In the end they accomplish the same task. You put in the VM’s name, choose an action to perform, and add any additional data required. That’s literally all there is to it!

When I say choose an action it’s as easy as this:

Conclusion

Now, this should serve to show you how easy it is to automate different hypervisors. Keep in mind that this example is simple, but it can readily be expanded to automate virtually all of the functions of your virtual environment.

As always, I’m looking for feedback. How would you use this in your environment…how would you tweak or tune this?

CIQ also does professional services, so if you need help building, configuring, or migrating your automation strategy, environments, or systems, please reach out!

Thanks and happy automating!

Feb 19 / Greg

Why Am I Ryan Shacklett

Hey everybody, I’m Greg Sowell and this is Why Am I, a podcast where I talk to interesting people and try to trace a path to where they find themselves today. My guest this go around is Ryan Shacklett. Ryan has multiple fursonas, with his main being Wild Acai. For the uninitiated, that means he attends conventions and events dressed in an insanely impressive animal suit…often referred to as a “furry”. Not only does it provide a place to explore parts of your personality, but it also allows for a lot of creative expression. Not only does he participate, but he also gets to give these experiences to others through his own multi-employee company that creates these impressive suits. Help us grow by sharing with someone!

Youtube version here:

Please show them some love on their socials here: http://waggerycos.com/,

https://twitter.com/waggerycos.

If you want to support the podcast you can do so via https://www.patreon.com/whyamipod (this gives you access to bonus content including their Fantasy Restaurant!)

Feb 11 / Greg

Why Am I Dan De Leon

Hey everybody, I’m Greg Sowell and this is Why Am I, a podcast where I talk to interesting people and try to trace a path to where they find themselves today. My guest this go around is Dan De Leon. Dan lives his life the way he leads his church: open and affirming. That was a term new to me, but essentially it means all people are welcome and supported…no matter how able-bodied you are, or where you are in the LGBTQIA+ box of crayons. Dan gives me hope. Hope that the religious people I care about can come to the one real truth: loving others unconditionally is the only thing that matters. You follow this principle first, and have your faith fit in around that, not the other way around. I learn a LOT in this conversation, and I hope you do too. Please enjoy this chat with Dan. Help us grow by sharing with someone!

Youtube version here:

Please show them some love on their socials here: https://www.friends-ucc.org/,

https://www.facebook.com/friendschurchucc,

https://www.instagram.com/friends_ucc.

If you want to support the podcast you can do so via https://www.patreon.com/whyamipod (this gives you access to bonus content including their Fantasy Restaurant!)

Feb 4 / Greg

Why Am I Scott Jones

Hey everybody, I’m Greg Sowell and this is Why Am I, a podcast where I talk to interesting people and try to trace a path to where they find themselves today.  My guest this go around is Scott Jones.  He’s a British born Aussie who has always felt a bit like a fish out of water.  Part of that feeling came to light when he recently realized he’s gay, and now the world literally looks different.  I mean imagine waking up one day, and getting to experience the world through a beautiful new lens.  I hope you enjoy this chat with Scott.  Help us grow by sharing with someone!

Youtube version here:

If you want to support the podcast you can do so via https://www.patreon.com/whyamipod (this gives you access to bonus content including their Fantasy Restaurant!)

Jan 29 / Greg

Install Performance Co-pilot Via Ansible And Ascender


Performance Co-Pilot (PCP) is a suite of tools for system performance monitoring. We see it used quite a bit in the HPC space, either to squeeze as much performance out of a system as possible or to troubleshoot performance issues. It can be tedious to install and manage…unless, of course, you use automation!

I’ll describe my architecture, review my playbooks, and have a look at it all working.

Video Demo

How It Works

PCP has a LOT of components and options; I really just intend to describe how I’m configuring it.

First, what is a “Collection host”? Any regular server or VM running PCP to gather info on itself is considered a collection host. So most of the configured hosts will be collection hosts.


Once a collector is configured, an admin will generally SSH into it to access the PCP data. These hosts can also run something like Redis with Grafana to graph the info, which means the admin is going straight to the host either way.


When your environment begins to grow, it can be a bit tedious to connect to each host to access PCP info.


This is where a “Monitoring Host” comes in. A monitoring host stores info from multiple collection hosts. This means an admin only needs to connect to the monitoring host to gain insight about any of the collection hosts…a one-stop-shop as it were.

You can either push or pull data. If you push data from the collectors, they incur some additional overhead. If you pull from the monitoring host, it absorbs that cost instead, which is less likely to skew the performance metrics coming from the collection hosts.

I’ve also seen some data saying that a monitoring host should be capped somewhere around a thousand collectors.

Playbooks

All of my playbooks can be found here in my git repository.

pcp-install.yml: this playbook connects to the PCP collectors, configures them to collect locally, and prepares them to allow monitoring hosts to access them:

---
- name: Install/configure PCP on various hosts
  hosts: pcp-hosts
  gather_facts: false
  vars:
    # Services to be enabled/started
    enable_services: 
      - pmcd
      - pmlogger
 
    # The subnets or ranges of hosts allowed to connect to clients to fetch info
    remote_subnets:
      - 10.0.*
      - 192.168.5.10
 
  tasks:
  # dnf install required pcp packages
  - name: Install pcp packages
    ansible.builtin.dnf:
      name: "{{ item }}"
      state: latest
    loop:
      - pcp
      - pcp-system-tools
    notify: restart pcp
 
  - name: Configure the pmcd process(add all of the allowed subnets)
    ansible.builtin.blockinfile:
      path: /etc/pcp/pmcd/pmcd.conf
      block: "{{ lookup('ansible.builtin.template', 'pmcd-access.j2') }}"
      insertafter: "\\[access\\]"
    notify: restart pcp
 
  - name: Configure the pmcd options to listen on the correct IP
    ansible.builtin.lineinfile:
      path: /etc/pcp/pmcd/pmcd.options
      line: "-i {{ hostvars[inventory_hostname].ansible_host }}"
 
  - name: Enable pmcd listening ports on firewall
    ansible.posix.firewalld:
      port: 44321/tcp
      permanent: true
      immediate: true
      state: enabled
    ignore_errors: true
 
  - name: Enable selinux for pmcd services
    ansible.builtin.shell: "{{ item }}"
    ignore_errors: true
    loop:
      - setsebool -P pcp_read_generic_logs on
      - setsebool -P pcp_bind_all_unreserved_ports on
 
  - name: Start and enable pcp services
    ansible.builtin.service:
      name: "{{ item }}"
      state: started
      enabled: true
    loop: "{{ enable_services }}"
 
  handlers:
  - name: restart pcp
    ansible.builtin.service:
      name: "{{ item }}"
      state: restarted
    loop: "{{ enable_services }}"

I’m going to point out some things of note in the above playbook.
First is the remote_subnets variable. This should be populated with the IPs or subnets of your monitoring hosts. It’s essentially an access list of who is allowed to connect and retrieve PCP data.

Most of the tasks are pretty straightforward, but I thought I would have a look at one that includes a Jinja2 template:

  - name: Configure the pmcd process(add all of the allowed subnets)
    ansible.builtin.blockinfile:
      path: /etc/pcp/pmcd/pmcd.conf
      block: "{{ lookup('ansible.builtin.template', 'pmcd-access.j2') }}"
      insertafter: "\\[access\\]"
    notify: restart pcp

This task manages a block of configuration using the blockinfile module, but I’m pulling that block from a dynamic Jinja2 template (in the templates folder) named pmcd-access.j2:

{% for item in remote_subnets %}
allow hosts {{ item }} : fetch;
{% endfor %}

Taking a look at the template above, you can see I have a simple “for loop”. I loop over the contents of remote_subnets and fill out an allow hosts line for each entry. Anything inside of {% %} is control logic and is omitted from the actual output of the template.
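With the example remote_subnets values from the playbook above, the rendered block dropped in under the [access] section would look like this:

allow hosts 10.0.* : fetch;
allow hosts 192.168.5.10 : fetch;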

Now that the PCP collectors are installed and configured, I’ll run the pcp-monitor.yml playbook to configure the monitor host:

---
- name: Install/configure PCP monitor host
  hosts: pcp-monitor
  gather_facts: false
  vars:
    # Services to be enabled/started
    enable_services: 
#      - pmcd
      - pmlogger
 
    collection_directory: /var/log/pcp/pmlogger/
 
    # Do you want to set the pmlogger config files to use host IP address instead of inventory_hostname
    config_via_host: true
 
  tasks:
  # - name: debug data
  #   ansible.builtin.debug:
  #     var: hostvars[item]
  #   loop: "{{ groups['pcp-hosts'] }}"
 
  - name: Install pcp packages
    ansible.builtin.dnf:
      name: "{{ item }}"
      state: latest
    loop:
      - pcp
      - pcp-system-tools
    notify: restart pcp
 
  - name: Create config file for each pcp-host
    ansible.builtin.template:
      src: pmlogger-monitor.j2
      dest: "/etc/pcp/pmlogger/control.d/{{ item }}"
    loop: "{{ groups['pcp-hosts'] }}"
    notify: restart pcp
 
  - name: Create collector host directories by looping over pcp-hosts group
    ansible.builtin.file:
      path: "{{ collection_directory }}{{ item }}"
      state: directory
      mode: '0777'
    loop: "{{ groups['pcp-hosts'] }}"
 
  - name: Start and enable pcp services
    ansible.builtin.service:
      name: "{{ item }}"
      state: started
      enabled: true
    loop: "{{ enable_services }}"
 
  handlers:
  - name: restart pcp
    ansible.builtin.service:
      name: "{{ item }}"
      state: restarted
    loop: "{{ enable_services }}"

Again, I’ll try and point out the less obvious or perhaps more interesting parts of the above playbook.

The variable collection_directory is where the collected PCP data from the collectors will be stored.

The config_via_host variable is one I put in especially for my lab environment. When the config files are created, they point at a host to collect from. If this variable is set to true, the host’s IP address will be used. If it’s set to false, the inventory_hostname will be used (generally a Fully Qualified Domain Name, or FQDN).

In the previous playbook I used a template, and I’m using one in the monitor host configuration as well, in the following task:

  - name: Create config file for each pcp-host
    ansible.builtin.template:
      src: pmlogger-monitor.j2
      dest: "/etc/pcp/pmlogger/control.d/{{ item }}"
    loop: "{{ groups['pcp-hosts'] }}"
    notify: restart pcp

Here I’m using the template module directly rather than the template lookup plugin. Let’s examine the referenced pmlogger-monitor.j2 template:

{% if config_via_host %}
{{ hostvars[item].ansible_host }} n n PCP_LOG_DIR/pmlogger/{{ item }} -r -T24h10m -c config.{{ item }}
{% else %}
{{ item }} n n PCP_LOG_DIR/pmlogger/{{ item }} -r -T24h10m -c config.{{ item }}
{% endif %}

This one uses a conditional “if else” statement rather than just a loop. This is where I check whether the collector host should be referenced via the inventory_hostname or via the ansible_host.
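For example, with config_via_host set to true and a collector named greg-rocky9 whose ansible_host is 192.168.5.20 (both hypothetical values), the rendered control entry would be:

192.168.5.20 n n PCP_LOG_DIR/pmlogger/greg-rocky9 -r -T24h10m -c config.greg-rocky9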

Executing/Troubleshooting Automation

Configure/Install/Troubleshoot Collector
Once you’ve added your inventories, projects, credentials and job templates, you can execute the automation for installing the collectors:

If you want to test the collector host, you can pretty easily do it by SSHing in and issuing the “pcp” command:

If the monitor is getting “connection refused”, be sure to check the listening ports on the collector with “ss -tlp | grep 44321”:

Configure/Install/Troubleshoot Monitor
Once you run the monitor playbook, you should see the success message:

Now, if you want to test the monitor host, you can SSH into it and check the collection_directory. In my case I had it as “/var/log/pcp/pmlogger/”:

You can see here my PCP collector Greg-rocky9 folder is showing up, but is there data inside?:

This folder is full of data. If it wasn’t, I would do a “tail pmlogger.log” in that folder to get an idea of what was happening:

Conclusion

While PCP may not be for everyone, it can be configured quite easily. The tricky thing about performance data is that once you have a performance issue, you can’t go back in time and enable data collection, so why not start collecting BEFORE there’s an issue 🙂.

As always, thanks for reading. If you have any questions or comments, I’d love to hear them. If you use PCP in your environment, I’d love to hear about that also! If we can help you on your automation journey, please reach out to me.

Good luck and happy PCP automating!

Jan 28 / Greg

Why Am I Geoffrey Mark

Hey everybody, I’m Greg Sowell and this is Why Am I, a podcast where I talk to interesting people and try to trace a path to where they find themselves today. My guest this go around is Geoffrey Mark. This cat started in show business at 15, and hasn’t stopped since. He’s done Broadway, he dances, he sings, he’s a comedian, and just to fill some time he’s also been a best-selling author. He has some valuable advice on grief, life, and talent. Oh, and don’t forget to sparkle like Geoff. Help us grow by sharing with someone!

Youtube version here:

Please show them some love on their socials here: https://www.instagram.com/geoffreymarkshowbiz/?hl=en,

https://twitter.com/thegeoffreymark,

https://www.facebook.com/groups/478789255814114/,

https://a.co/d/5c1P6Wv.

If you want to support the podcast you can do so via https://www.patreon.com/whyamipod (this gives you access to bonus content including their Fantasy Restaurant!)

Jan 24 / Greg

“Build And Replace” Linux Migration Via Ansible, Ascender, AWX, AAP

Migrating from one Linux major version to another never seems to be a simple task, but through the magic of automation, it can be made a lot simpler and more reproducible. I’m going to cover the Ansible playbooks I created to do the work, then I’ll execute them using our enterprise automation platform, Ascender.

Our recommended method is to:
– Back up configuration and data from the old server
– Provision a brand new server with the required apps
– Restore configurations and data to the new server
– Test services on the new server
– Sunset the old server

Video Demo

Playbooks

First, I’m using resources from the community.general collection found here. I actually have a version of it included in my git repository.
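If you would rather pull the collection at project sync time instead of bundling it, a minimal collections/requirements.yml (my assumption about placement, following the standard Ansible convention) would be:

# collections/requirements.yml
collections:
  - name: community.general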

All of my playbooks can be found here in my git repository.

I’ll cover some of the playbooks here… mostly discussing the highlights. The discover-backup.yml playbook is the first playbook run:

---
- name: Discover/backup hosts to be migrated
  hosts: migration-hosts
  gather_facts: false
  vars:
    # The host to store backup info to
    backup_storage: backup-storage
 
    # The location on the backup host to store info
    backup_location: /tmp/migration
  tasks:
  - name: Execute rpm to get list of installed packages
    ansible.builtin.command: rpm -qa --qf "%{NAME} %{VERSION}-%{RELEASE}\n"
    register: rpm_query
 
  - name: Populate service facts - look for running services
    ansible.builtin.service_facts:
 
  # - name: Print service facts
  #   ansible.builtin.debug:
  #     var: ansible_facts.services
 
  - name: Create backup directory on backup server - unique for each host
    ansible.builtin.file:
      path: "{{ backup_location }}/{{ inventory_hostname }}"
      state: directory
      mode: '0733'
    delegate_to: "{{ backup_storage }}"
 
  # - name: Backup groups
  #   ansible.builtin.include_tasks:
  #     file: group-backup.yml
 
  - name: Backup Apache when httpd is installed and enabled
    when: item is search('httpd ') and ansible_facts.services['httpd.service'].status == 'enabled'
    ansible.builtin.include_tasks:
      file: apache-backup.yml 
    loop: "{{ rpm_query.stdout_lines }}"

In the above, the first task I run uses the RPM command to gather information on all of the installed packages. Generally, I prefer to use a purpose-built module if one exists. In this instance, the ansible.builtin.package_facts module is designed to do this, but I found it didn’t always report correctly for CentOS 7 servers, so I went with the RPM command, which always works. This list of apps will be used toward the bottom of the playbook.
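For comparison, here is a minimal sketch of the purpose-built approach, assuming package_facts behaves on your distribution:

# Alternative sketch: gather package facts with the purpose-built module
- name: Gather installed package facts
  ansible.builtin.package_facts:
    manager: auto

- name: Report whether httpd is installed
  ansible.builtin.debug:
    msg: "httpd is installed"
  when: "'httpd' in ansible_facts.packages"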

Next, I create a directory for each host on a backup server. This will be the repository for all of my configs and data backed up from the old server.

The last task is where the real work happens. I loop over the list of installed packages on the server and check whether one of them is Apache and whether the httpd service is enabled. If those conditions are met, it pulls in the apache-backup.yml task file. That task file is something I created to back up things from my environment. If I had FTP services on some of my servers, I would also need an ftp-backup task file and an additional matching task, just like the apache-backup one.
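That extra matching task would mirror the Apache one; a hypothetical vsftpd version (with an ftp-backup.yml task file you would write yourself) might look like this:

# Hypothetical example, mirroring the Apache task above
- name: Backup vsftpd when it is installed and enabled
  when: item is search('vsftpd ') and ansible_facts.services['vsftpd.service'].status == 'enabled'
  ansible.builtin.include_tasks:
    file: ftp-backup.yml   # a task file you would create, like apache-backup.yml
  loop: "{{ rpm_query.stdout_lines }}"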

The apache-backup.yml file is actually fairly simple:

# Task file for backing up apache
 
# Backup apache config files
- name: Create an archive of the config files
  community.general.archive:
    path: /etc/httpd/con*
    dest: "/tmp/{{ inventory_hostname }}-httpd.tgz"
 
- name: Copy apache config files to ansible server
  ansible.builtin.fetch:
    src: "/tmp/{{ inventory_hostname }}-httpd.tgz"
    dest: "/tmp/{{ inventory_hostname }}-httpd.tgz"
    flat: true # Changes default fetch so it will save directly in destination
 
- name: Copy config archive to backup server from local ansible server
  ansible.builtin.copy:
    src: "/tmp/{{ inventory_hostname }}-httpd.tgz"
    dest: "{{ backup_location }}/{{ inventory_hostname }}/{{ inventory_hostname }}-httpd.tgz"
  delegate_to: "{{ backup_storage }}"
 
# Backup apache data files
- name: Create an archive of the data directories
  community.general.archive:
    path: /var/www
    dest: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
 
- name: Copy apache data files to ansible server
  ansible.builtin.fetch:
    src: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
    dest: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
    flat: true # Changes default fetch so it will save directly in destination
 
- name: Copy data archive to backup server from local ansible server
  ansible.builtin.copy:
    src: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
    dest: "{{ backup_location }}/{{ inventory_hostname }}/{{ inventory_hostname }}-httpd-data.tgz"
  delegate_to: "{{ backup_storage }}"

Taking a look at the above task file, you can see that it first creates an archive of the Apache configuration files. Really, it’s more or less a zip file.

It pulls the archive off the server, then pushes it over to a backup server.

It then repeats these actions for the data directories.

The next playbook is called provision-new-server.yml. I’ll leave you to look at it if you like, but it:
– Connects to vCenter and provisions a new server
– Waits for the server to pull an IP address
– Adds the new host to the inventory via the Ascender API

Now that the old server is backed up and the new server has been provisioned, it’s time to restore some services on the new one. This is done with the restore.yml playbook:

---
- name: Playbook to restore configs on new servers
  hosts: migration-hosts 
  gather_facts: false
  vars:
    # The host to store backup info to
    backup_storage: backup-storage
 
    # The location on the backup host to store info
    backup_location: /tmp/migration
 
  tasks:
  - name: Set the restore server variables
    ansible.builtin.set_fact:
      restore_server: "new-{{ inventory_hostname }}"
 
  # - name: Debug restore_server
  #   ansible.builtin.debug:
  #     var: restore_server
 
  # grab a list of the files on the backup server for this host
  - name: Find all files in hosts' backup directories
    ansible.builtin.find:
      paths: "{{ backup_location }}/{{ inventory_hostname }}"
#      recurse: yes
    delegate_to: "{{ backup_storage }}"
    register: config_files
 
  # - name: Debug config_files
  #   when: item.path is search(inventory_hostname + '-httpd.tgz')
  #   ansible.builtin.debug:
  #     var: config_files
  #   loop: "{{ config_files.files }}"
 
  # for each task type, loop through backup files and see if they exist - call restore task file
  - name: If apache is installed, call install task file
    when: item.path is search(inventory_hostname + '-httpd.tgz')
    ansible.builtin.include_tasks: 
      file: apache-restore.yml
    loop: "{{ config_files.files }}"

The first task in the above sets a restore_server variable to the name of the new server. In my playbooks I named the new server “new-{{ inventory_hostname }}”. This means it’s the name of the old server with “new-” on the front… not overly complex, but it does the trick.

The second task will search the backup folder’s directory and find all files that have been backed up for each host.

Somewhat similar to the backup procedure, the last task in the restore procedure loops over the files from the backup server and calls task files for the various applications/packages. In this case, I’m looking for the Apache backup files and, when they’re found, running the apache-restore.yml task file.

Next is to examine the apache-restore.yml file:

# Task file for installing and configuring apache
 
# - name: Debug restore_server
#   ansible.builtin.debug:
#     var: restore_server
 
# Install apache
- name: Install apache
  ansible.builtin.dnf:
    name: httpd
    state: latest
  delegate_to: "{{ restore_server }}"
 
- name: Copy apache config files to ansible server
  ansible.builtin.fetch:
    src: "{{ backup_location }}/{{ inventory_hostname }}/{{ inventory_hostname }}-httpd.tgz"
    dest: "/tmp/{{ inventory_hostname }}-httpd.tgz"
    flat: true # Changes default fetch so it will save directly in destination
  delegate_to: "{{ backup_storage }}"
 
- name: Copy config archive to new server from local ansible server
  ansible.builtin.copy:
    src: "/tmp/{{ inventory_hostname }}-httpd.tgz"
    dest: "/tmp/{{ inventory_hostname }}-httpd.tgz"
  delegate_to: "{{ restore_server }}"
 
- name: Extract config archive
  ansible.builtin.unarchive:
    src: "/tmp/{{ inventory_hostname }}-httpd.tgz"
    dest: /etc/httpd
    remote_src: true
  delegate_to: "{{ restore_server }}"
 
- name: Copy apache data files to ansible server
  ansible.builtin.fetch:
    src: "{{ backup_location }}/{{ inventory_hostname }}/{{ inventory_hostname }}-httpd-data.tgz"
    dest: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
    flat: true # Changes default fetch so it will save directly in destination
  delegate_to: "{{ backup_storage }}"
 
- name: Copy data archive to new server from local ansible server
  ansible.builtin.copy:
    src: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
    dest: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
  delegate_to: "{{ restore_server }}"
 
- name: Extract config archive
  ansible.builtin.unarchive:
    src: "/tmp/{{ inventory_hostname }}-httpd-data.tgz"
    dest: /var/www
    remote_src: true
  delegate_to: "{{ restore_server }}"
 
- name: Start service httpd and enable it on boot
  ansible.builtin.service:
    name: httpd
    state: started
    enabled: yes
  delegate_to: "{{ restore_server }}"

The above is quite simple. First things first, I install Apache. Next I connect to the backup server, copy the config archive over, and extract it. I then do the same thing for the data archive. Last, I start and enable the Apache service.

After this, I run the suspend-old.yml playbook to pause the old VM.
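That playbook isn’t reproduced here, but a minimal sketch of the idea, reusing the vCenter connection variables from the VMware playbook earlier in this post (an assumption on my part), could be:

# Sketch: suspend the old VM in vCenter (assumes the same auth vars as the VMware playbook)
- name: Suspend the old VM
  community.vmware.vmware_guest_powerstate:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ gen1_user }}"
    password: "{{ gen1_pword }}"
    validate_certs: false
    name: "{{ inventory_hostname }}"
    state: suspended
  delegate_to: localhost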

Last of all, I’ll run my testing playbooks, which are designed for each app.
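For Apache, such a test can be as simple as checking that the restored site answers over HTTP; a hedged sketch, assuming the “new-” hostname resolves in your environment:

# Sketch: verify the restored Apache service responds (assumes "new-<host>" resolves via DNS)
- name: Verify Apache responds on the new server
  ansible.builtin.uri:
    url: "http://new-{{ inventory_hostname }}/"
    status_code: 200
  delegate_to: localhost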

Ascender Configuration

I’ve covered adding inventories, projects, and job templates in other blog posts.

I will show the workflow template I created to tie all of the job templates together, though:

A workflow allows me to take playbooks of all sorts and string them together with on-success or on-failure branching logic. It also allows me to make my playbooks flexible and reusable.

Conclusion

Migrating infrastructure is often complex and time-consuming, and while we can’t conjure more hours or employees to complete the task, we can employ our secret weapon: automation.

CIQ is ready not only to help you stand up Ascender in your environment, but also to bring in experts to help you migrate your infrastructure. We have tools to assist, and at the end you’ll have the automations for your environment ready for continued and future use!

As always, thanks for reading and I appreciate your feedback; happy migrating!