Saturday, October 23, 2021

Upgraded three servers from Devuan Beowulf OS version to Devuan Chimaera OS version

Today I upgraded three servers from Devuan Beowulf to Devuan Chimaera: a gold OS template used for deployment, and two DNS servers.

The procedure to turn a gold OS image into a DNS server is handled entirely by some Ansible scripts I wrote.  I used Puppet a long time ago, but I tired of restarting Puppet agents to resolve misconfigured systems, and I prefer the "push" configuration of Ansible over the "pull" technique of Puppet.  I can deploy a template in VMware, change its name and IP address, reboot it, connect it to the SSH web of trust and the Active Directory web of trust, run Ansible against it, and it turns into a fully featured DNS server in a couple of minutes.  I could have shut down the old servers and brought up new ones, but the upgrade process was so flawless and fast on the template that I upgraded the pair of DNS servers in place instead of building new ones and reloading.  It only took minutes either way, and I was interested to see what would happen (given that nearly instant rollback is possible with VMware, and I'm alone on a Saturday morning, it's not like there's any risk LOL).

First, in VMware vSphere, my backout plan, in case the upgrade went poorly, was to shut down the images, duplicate them, upgrade the duplicates, and keep the untouched originals around.  I've seen performance problems caused by forgotten VMware snapshots left in place; keeping full copies is less of a headache, and "shut down the new one and start up the old one" is faster than a VMware snapshot rollback.  I only use FOSS software on these servers, so duplicate images raise no licensing issues the way Windows would.  I can leave untouched images running and connect/disconnect image network interfaces in mere seconds...

These are resolving DNS servers, not authoritative DNS servers, so a simpler plan is a better plan.  If they were authoritative I'd spin up new servers and use 'dig' to test that they work properly.  But I'm the only person using these resolvers on a Saturday morning, so it's pretty safe.  The simplest plan that gets the job done is the plan most likely to succeed.  I would have to allocate two more routable IP addresses to run both test and production images simultaneously; it's not really worth it to log into NetBox and justify the allocation.

After the VMware work, I upgraded the Devuan Beowulf packages to their latest (and last) versions: the usual "apt-get update", "apt-get dist-upgrade", "apt-get autoremove", and finally "apt-get clean".  Then I ran the ansible-playbook for each server against it and tested that everything works.  Not much happened in the upgrades (I generally maintain each server every two months).
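The maintenance sequence above can be sketched as a small script.  This is only an illustration, not the author's tooling: the names MAINTENANCE_STEPS and run_steps are hypothetical, and the default is a dry run that executes nothing.

```python
import subprocess

# The four maintenance steps named in the post, in the order given.
MAINTENANCE_STEPS = [
    ["apt-get", "update"],
    ["apt-get", "dist-upgrade"],
    ["apt-get", "autoremove"],
    ["apt-get", "clean"],
]


def run_steps(steps, dry_run=True):
    """Return the planned commands as strings, running them unless dry_run.

    With dry_run=True (the default) nothing is executed, so the plan
    can be reviewed first.
    """
    planned = []
    for argv in steps:
        planned.append(" ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit,
            # so a failed step prevents the later ones from running.
            subprocess.run(argv, check=True)
    return planned


if __name__ == "__main__":
    for line in run_steps(MAINTENANCE_STEPS):
        print(line)
```

Calling run_steps(MAINTENANCE_STEPS, dry_run=False) as root would actually run the commands, and apt-get may still prompt for confirmation.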

I do not store or configure major-version settings like APT 'list' files in Ansible, as the chance of a "surprise upgrade" is not worth the risk.  I only had a couple of servers to upgrade, so I removed the old /etc/apt/sources.list.d/beowulf.list file and set up a new /etc/apt/sources.list.d/chimaera.list file along the lines of the Devuan suggested file.
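The post does not show the contents of the new list file.  As a rough sketch, assuming the common Devuan layout of a release suite plus "-updates" and "-security" suites on the deb.devuan.org/merged mirror, a file like it could be generated as below.  The function name render_sources is hypothetical, and the exact lines should be checked against the Devuan upgrade guide before use.

```python
# Assumption: the standard Devuan merged mirror; the post does not name one.
DEVUAN_MIRROR = "http://deb.devuan.org/merged"


def render_sources(suite, components=("main",)):
    """Build APT source lines for a Devuan release suite.

    Emits one 'deb' line each for the suite itself, its -updates
    suite and its -security suite.
    """
    suites = [suite, f"{suite}-updates", f"{suite}-security"]
    comps = " ".join(components)
    lines = [f"deb {DEVUAN_MIRROR} {name} {comps}" for name in suites]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print rather than write, so nothing under /etc/apt is touched.
    print(render_sources("chimaera"), end="")
```

Printing the result instead of writing it keeps the manual, one-server-at-a-time review step that the post prefers for major version changes.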

The "apt-get upgrade" and "apt-get dist-upgrade" steps, as per the Devuan suggested upgrade path, were completely uneventful.  I have apt-listchanges configured to send changelogs and news to me via email.  I will save those emails for later reference in case any problem develops, but usually those upgrade logs end up deleted after a couple of months.

According to those upgrade emails, Exim, the mail transfer agent, has recently undergone a substantial major upgrade possibly requiring configuration changes, and GnuPG no longer uses the ~/.gnupg/options file, in favor of ~/.gnupg/gpg.conf.  For me everything is fine; others may find those changes more relevant.

I ran "apt-get autoremove" and "apt-get clean" to clean up after the upgrade.  It was interesting to see that Devuan no longer installs Python version 2 by default (although it is installable), so I had to update my Ansible inventory to specify that Devuan based operating systems have a Python path of "ansible_python_interpreter=/usr/bin/python3" instead of "ansible_python_interpreter=/usr/bin/python" for legacy python2.  I keep my Ansible scripts in a Git repository, so I committed, documented, and uploaded my small change.
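For anyone with many inventory entries, the interpreter change could be applied mechanically.  A minimal sketch, assuming an INI-style inventory; the function name set_interpreter and the group name "devuan" in the example are hypothetical, and only the variable name and the two paths come from the post.

```python
import re

# Matches the variable and whatever path currently follows it.
_INTERPRETER = re.compile(r"(ansible_python_interpreter=)\S+")


def set_interpreter(inventory_text, path="/usr/bin/python3"):
    """Point every ansible_python_interpreter setting at `path`."""
    # A function replacement avoids backslash handling in `path`.
    return _INTERPRETER.sub(lambda m: m.group(1) + path, inventory_text)


if __name__ == "__main__":
    old = "[devuan:vars]\nansible_python_interpreter=/usr/bin/python\n"
    print(set_interpreter(old), end="")
```

Running it twice gives the same result, so it is safe to apply to an inventory that has already been updated.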

I ran my Ansible configuration script on the DNS server.  Aside from the previously mentioned upgrade from python2 to python3, it was uneventful.

I did a server reboot (technically unnecessary) to verify everything starts up correctly after a reboot, which it did.

I verified everything was working on the DNS servers (note: I have a cluster and did one server at a time).  They both do forward and reverse resolution for IPv4 and IPv6, and also forward a subdomain to an Active Directory domain controller cluster that I also maintain, which is based on Samba, and it all works quite well.
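The post does not say how the checks were run.  One way to script forward and reverse checks against a specific resolver is to build 'dig' command lines, as sketched here.  The server address, host name, and function names are placeholders, and run_checks assumes 'dig' is installed.

```python
import ipaddress
import subprocess


def dig_commands(server, hostname, addresses):
    """Build dig commands: A and AAAA lookups, then one PTR per address."""
    commands = [
        ["dig", f"@{server}", hostname, "A", "+short"],
        ["dig", f"@{server}", hostname, "AAAA", "+short"],
    ]
    for address in addresses:
        # Raises ValueError early if the address is mistyped.
        ipaddress.ip_address(address)
        # -x asks dig for the reverse (PTR) record, IPv4 or IPv6.
        commands.append(["dig", f"@{server}", "-x", address, "+short"])
    return commands


def run_checks(commands):
    """Run each command; return those that failed or gave no answer."""
    failures = []
    for argv in commands:
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0 or not result.stdout.strip():
            failures.append(" ".join(argv))
    return failures


if __name__ == "__main__":
    # Placeholder values from the documentation address ranges.
    for argv in dig_commands("192.0.2.53", "host.example.com",
                             ["192.0.2.10", "2001:db8::10"]):
        print(" ".join(argv))
```

An empty list from run_checks means every query returned at least one answer; it does not check that the answers are the expected ones.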

I cleared any alerts in Zabbix, a LAN server monitoring system.  I run Zabbix using Docker images; it works well and alerts me to any server failures (such as reboots).  I could set a maintenance interval in Zabbix to silence alerting, but I believe it counterproductive; if the software upgrade fails and DNS queries no longer resolve, I want to know immediately rather than at the end of a scheduled maintenance interval...  Zabbix caught the server reboot, and also automatically opened a problem ticket "Operating system description has changed".  I acknowledged and closed that automatically opened problem ticket.

After the servers ran for an hour I checked the Zabbix performance graphs and saw no substantial change in performance.  The much less granular VMware monitoring more or less matched what I saw in Zabbix.  It's always worrisome if CPU use or disk space goes wildly higher OR lower after an upgrade.  Everything seems to be working normally.
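The "higher OR lower" point can be made concrete with a symmetric comparison of a metric before and after the upgrade.  This is only an illustration: the function name changed_wildly and the 50% tolerance are arbitrary assumptions, not anything taken from Zabbix or from the post.

```python
def changed_wildly(before, after, tolerance=0.5):
    """Return True if `after` moved more than `tolerance` away from
    `before`, as a fraction of `before`, in either direction."""
    if before == 0:
        # No baseline to compare against: any non-zero value counts.
        return after != 0
    return abs(after - before) / abs(before) > tolerance


if __name__ == "__main__":
    # Example CPU percentages; the values are made up.
    print(changed_wildly(20.0, 22.0))  # small drift
    print(changed_wildly(20.0, 5.0))   # large drop
```

A large drop is flagged the same as a large rise, since a service that has silently stopped doing work also uses less CPU.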

Finally I updated the three runbooks I maintain in Todoist, an online web and mobile app for to-do lists.  I set the next date to check up on the servers for two months from now, as usual for these servers, documented the upgrade in the server log, let the users know I'm done and how to reach me if necessary, etc.

In the future I will clean up and remove the old stuff in VMware, assuming the new DNS servers work fine and there's no reason to roll back.  It's nice to know I can roll back almost instantly, although typically there's no need.

Hilariously, the only problem I had with the entire major version upgrade was that the spelling of Chimera has apparently changed since my Dungeons and Dragons days, and is now spelled Chimaera.

My primary reference for the project was:

https://www.devuan.org/os/documentation/install-guides/chimaera/upgrade-to-chimaera

And that, in summary, is how to spend about two hours painlessly upgrading three Devuan servers to the latest version.
