Installing An Ansible Automation Platform Cluster
Clustering AAP is a good idea for several reasons: it gives you some high availability (a node can die and you keep operating), it spreads the load across multiple control nodes, and you can connect to any of them through the standard GUI.
A cluster setup follows the standard install process, but with a couple of tweaks.
The standard install process has you download the latest AAP files, but in my case I’m sticking with Tower 3.7.4.
Here’s the standard cluster install documentation, but it leaves out a couple of key points.
Install Process
I’ve spun up four updated CentOS 7 boxes for my quick lab demo. 10.1.12.81, 10.1.12.82, and 10.1.12.83 are my clustered user interface servers (Tower), and 10.1.12.84 is my standalone database server.
Here’s my modified inventory file for this setup:
    [tower]
    # localhost ansible_connection=local
    10.1.12.81 ansible_user=root ansible_password=MyPassword
    10.1.12.82 ansible_user=root ansible_password=MyPassword
    10.1.12.83 ansible_user=root ansible_password=MyPassword

    [database]
    10.1.12.84 ansible_user=root ansible_password=MyPassword

    [all:vars]
    admin_password='redhat'

    pg_host='10.1.12.84'
    pg_port='5432'

    pg_database='awx'
    pg_username='awx'
    pg_password='redhat'
I’m actually running this from the 10.1.12.81 server, which raises the question: why did I comment out the localhost entry and list that server just like the other hosts? Well, because the install threw an error when I left the localhost entry in, LOL. So even if you are doing the install from one of the servers, add it to the list just like the others.
Notice that I also specified the user and password the installer uses to SSH to each host. If I run the script now, it fails with the following message:
    TASK [awx_install : Fail play when grabbing SECRET_KEY fails] ******************************************************************************************
    fatal: [10.1.12.81]: FAILED! => {"changed": false, "msg": "Failed to read /etc/tower/SECRET_KEY from primary tower node"}
    fatal: [10.1.12.82]: FAILED! => {"changed": false, "msg": "Failed to read /etc/tower/SECRET_KEY from primary tower node"}
    fatal: [10.1.12.83]: FAILED! => {"changed": false, "msg": "Failed to read /etc/tower/SECRET_KEY from primary tower node"}
The quick fix is to type “ssh 10.1.12.81” and accept the SSH host key, which adds it to known_hosts. Then do the same for 10.1.12.82, 10.1.12.83, and 10.1.12.84.
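If you’d rather not SSH to every box by hand, the same step can probably be scripted. This is only a minimal sketch, not something the installer ships with: the helper name trust_hosts is my own invention, the host list is copied from the inventory above, and I’m assuming the stock OpenSSH ssh-keyscan and ssh-keygen tools are on the path. Just like typing “yes” at the prompt, it trusts whatever key each host presents, so only use it on a network you control.

```shell
#!/bin/sh
# Hosts from the inventory above (three Tower nodes plus the database).
HOSTS="10.1.12.81 10.1.12.82 10.1.12.83 10.1.12.84"

# trust_hosts FILE HOST...  (hypothetical helper name)
# Appends the hashed SSH host key of each HOST to FILE,
# skipping any host that already has an entry there.
trust_hosts() {
    known_hosts_file="$1"
    shift
    mkdir -p "$(dirname "$known_hosts_file")"
    for host in "$@"; do
        # ssh-keygen -F looks the host up in the given known_hosts file.
        if ssh-keygen -F "$host" -f "$known_hosts_file" >/dev/null 2>&1; then
            continue
        fi
        # ssh-keyscan -H fetches the host's public keys and hashes the hostname.
        ssh-keyscan -H "$host" >> "$known_hosts_file" 2>/dev/null
    done
}
```

To use it, source the snippet and run `trust_hosts ~/.ssh/known_hosts $HOSTS` as the same user that will run ./setup.sh, since known_hosts is per user.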
Now once I run the ./setup.sh script it completes juuuuust fine.
    PLAY RECAP *********************************************************************************************************************************************
    10.1.12.81 : ok=136 changed=77 unreachable=0 failed=0 skipped=66 rescued=0 ignored=2
    10.1.12.82 : ok=124 changed=69 unreachable=0 failed=0 skipped=63 rescued=0 ignored=1
    10.1.12.83 : ok=124 changed=69 unreachable=0 failed=0 skipped=63 rescued=0 ignored=1
    10.1.12.84 : ok=55  changed=23 unreachable=0 failed=0 skipped=35 rescued=0 ignored=0
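With four hosts the recap is easy to eyeball, but on a bigger cluster a small filter might help. Here is a rough sketch that reads recap text on standard input and prints any host whose failed or unreachable count is above zero. The function name recap_failures is made up for this post, and it assumes the recap lines look like the ones above.

```shell
#!/bin/sh
# recap_failures  (hypothetical helper name)
# Reads Ansible PLAY RECAP text on stdin, prints each host with
# failed > 0 or unreachable > 0, and returns 1 if any host was printed.
recap_failures() {
    awk '
        / : ok=/ {
            bad = 0
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^(failed|unreachable)=/) {
                    split($i, kv, "=")
                    if (kv[2] + 0 > 0) bad = 1
                }
            }
            if (bad) { print $1; found = 1 }
        }
        END { exit (found ? 1 : 0) }
    '
}
```

One way to feed it is to save the installer output first, for example `./setup.sh 2>&1 | tee setup-output.log` followed by `recap_failures < setup-output.log`, where setup-output.log is just a file name I picked.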
Conclusion
Setting up a cluster really isn’t that bad, and it can gain you a lot of resiliency and additional flexibility in your environment. Let me know if you have any questions or comments.
Thanks and happy clustering.