Friday, November 24, 2023

Proxmox VE Cluster - Chapter 015 - Migrate workload of the old OS2 cluster

A voyage of adventure, moving a diverse workload running on OpenStack, Harvester, and RKE2 K8S clusters over to a Proxmox VE cluster.


I plan to reuse the OpenStack cluster OS2 hardware to increase Proxmox cluster capacity. Before I can reuse the hardware, I need to move all remaining workload off the OpenStack hardware. Each OS2 workload is either migrated to Proxmox immediately if it's important enough, or "paused" for a few days and restored on Proxmox after the cluster work is done. Here is a list of the OS2 cluster workload, with a description of what I did to each application:

storage

An Ubuntu Samba and NFS fileserver.

Converted from FreeBSD to Ubuntu. While doing that, I added NFS file serving for Ubuntu to my Ansible setup, so that is scripted and automated now.
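For the curious, here is roughly what that automation boils down to on the Ubuntu side. This is a hand-written Python sketch of the steps rather than my actual Ansible role, and the export path and subnet are made-up placeholders.

```python
#!/usr/bin/env python3
"""Rough sketch of NFS serving setup on Ubuntu. The real work is done by
Ansible; the export path and network below are placeholders, not the
actual values from my playbook. Run as root."""
import subprocess

EXPORTS_LINE = "/srv/share 192.168.1.0/24(rw,sync,no_subtree_check)\n"  # placeholder export

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Install the NFS server package.
run(["apt-get", "install", "-y", "nfs-kernel-server"])

# Append the export if it is not already present.
with open("/etc/exports", "a+") as f:
    f.seek(0)
    if EXPORTS_LINE not in f.read():
        f.write(EXPORTS_LINE)

run(["exportfs", "-ra"])                              # reload the exports table
run(["systemctl", "enable", "--now", "nfs-server"])   # make sure the service runs at boot
```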

netbox

https://netbox.dev/

A complete FOSS IPAM system.

This one was tricky: it's a docker-compose deployment with local volumes, so I needed to back up the database on the old system and then restore it onto the new system, more or less. I need to completely redesign this so all data is stored over NFS instead of in local host volumes. It's the only container I have that stores data in local volumes, which makes management a hassle.

I recommend against directly following the database restore instructions on:

https://github.com/netbox-community/netbox-docker/wiki/Troubleshooting#database-operations

I unfortunately have extensive experience with restoring an older schema on top of a freshly installed, empty new schema, resulting in considerable data loss. At least I keep good backups, LOL, so I was able to recover from that. A better strategy is NOT to start everything, shut off the client processes, and then (try to) restore an old schema backup over an empty new schema, as the existing online docs describe. Instead, start ONLY the postgres container; while it is still empty, restore the old schema backup into postgres; then, and only then, start up the client containers (everything else) and let the automatic upgrade process upgrade the schema of the freshly restored database. That strategy worked perfectly with no data loss.
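In concrete terms, the sequence that worked looks roughly like the sketch below. It assumes the stock netbox-docker compose project, with the database service named "postgres", the default "netbox" user and database, and a plain SQL dump copied over from the old host; adjust the names to your setup.

```python
#!/usr/bin/env python3
"""Sketch of the restore ordering that avoided data loss. Assumes the
stock netbox-docker compose project and a plain-SQL pg_dump from the
old host named netbox-old.sql (placeholder file name)."""
import subprocess

def compose(*args):
    subprocess.run(["docker", "compose", *args], check=True)

# 1. Bring up ONLY the database container, so the schema stays empty.
compose("up", "-d", "postgres")

# 2. Restore the old dump into the still-empty database.
with open("netbox-old.sql", "rb") as dump:
    subprocess.run(
        ["docker", "compose", "exec", "-T", "postgres",
         "psql", "-U", "netbox", "netbox"],
        stdin=dump, check=True)

# 3. Only now start everything else; NetBox's startup migrations
#    upgrade the freshly restored (older) schema in place.
compose("up", "-d")
```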

I filed a docs improvement bug regarding the above adventure at GitHub:

https://github.com/netbox-community/netbox-docker/issues/1113

unifi

The "unifi" controller software in a Docker container, for a cloudy-ish Wi-Fi network based on Ubiquiti hardware.

The Unifi controller is just another Docker container. However, a new controller IP address means re-homing every Unifi device off the old server and its old IP address and onto the new server with its new IP address. I copied the NFS-mounted volume for the unifi controller over to unificontroller-old, so that I could start controllers on both the new and old servers. Obviously, every Ubiquiti hardware device on the LAN reconnected to the unifi-old server, although the new unifi server looked fine (other than acting like all devices had suddenly disconnected from it, which is accurate). Then on the old controller, under "Settings", "System", "Advanced", there's an "inform host" setting which had the old server's IP address, so I put in the new address and hit apply. There's also a way to manually SSH into each device individually, which can be a bit of a pain, so I used the web UI "Inform" method.

The above resulted in a minor problem: the "inform host" on the new controller was pointing to the old controller's IP address (because the new controller was a clone of the old controller), and the "inform host" on the old controller was pointing to the new host, so the devices ping-ponged back and forth between the old and new controllers for a while. I fixed the "inform host" setting on the new controller and the Wi-Fi devices started coming online. Cool. The ethernet switches were mad at me for at least several minutes, I think I crashed some firmware or something, although they did eventually ALL come online. Mildly interesting that the Wi-Fi devices reconnect much faster than the ethernet switches. In summary, that was exciting for a while, but in the end it all worked pretty well.
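For reference, the manual SSH route I skipped looks roughly like this. It's only a sketch: the device IPs, credentials, and inform URL are placeholders, and it assumes SSH access is enabled on the UniFi devices and that paramiko is installed.

```python
#!/usr/bin/env python3
"""Sketch of the manual per-device re-home I avoided by using the web
UI "Inform" method. All addresses and credentials are placeholders."""
import paramiko

NEW_CONTROLLER = "http://192.0.2.10:8080/inform"   # placeholder inform URL
DEVICES = ["192.0.2.20", "192.0.2.21"]              # placeholder device IPs

for ip in DEVICES:
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(ip, username="admin", password="device-password")  # placeholder creds
    # Point the device at the new controller's inform URL.
    _, out, err = ssh.exec_command(f"set-inform {NEW_CONTROLLER}")
    print(ip, out.read().decode(), err.read().decode())
    ssh.close()
```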

redmine

https://www.redmine.org/

A FOSS project management suite.

Shut down the docker-compose stack on the old server, start it on the new server; painless. All the data volumes reside on the same NAS NFS server, so this was just moving the Docker "compute host" from the old OpenStack cluster number two to the new Proxmox Docker "compute host".
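Here is that general pattern, sketched in Python; the host names and compose directory are placeholders, and it assumes the compose files live on the shared NFS mount so both hosts see the same data.

```python
#!/usr/bin/env python3
"""Sketch of the "painless" move used for redmine and the other
NFS-backed stacks: stop the stack on the old host, start it on the new
one. Host names and the compose path are placeholders."""
import subprocess

OLD_HOST = "docker-old.example.com"     # placeholder old compute host
NEW_HOST = "docker-new.example.com"     # placeholder new compute host
COMPOSE_DIR = "/srv/compose/redmine"    # lives on the shared NFS mount

def remote(host, command):
    subprocess.run(["ssh", host, command], check=True)

remote(OLD_HOST, f"cd {COMPOSE_DIR} && docker compose down")
remote(NEW_HOST, f"cd {COMPOSE_DIR} && docker compose up -d")
```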

es02, es03, kibana

https://www.elastic.co/

Elasticsearch infrastructure for syslog storage and analysis

This turned into a larger adventure than initially planned. It ended up being an upgrade of Elasticsearch from 8.6 to 8.10, with a new cluster formed on ES01, ES02, and ES03. Later I will add ES04, ES05, and ES06. This seems like a lot of work to store syslog messages, but Elasticsearch as a database technology is fun to play with and Kibana can make really cool graphical dashboards, so it's worth the effort.
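After a reshuffle like this I like to confirm the version and node count before pointing syslog back at the cluster. A quick sketch, with a placeholder node address and credentials, and certificate verification turned off for brevity:

```python
#!/usr/bin/env python3
"""Quick sanity check after forming the new cluster: confirm the
Elasticsearch version and that all three nodes have joined. Host and
credentials are placeholders for whatever your cluster actually uses."""
import requests

ES = "https://es01.example.com:9200"   # placeholder node address
AUTH = ("elastic", "changeme")          # placeholder credentials

health = requests.get(f"{ES}/_cluster/health", auth=AUTH, verify=False).json()
info = requests.get(ES, auth=AUTH, verify=False).json()

print("version:", info["version"]["number"])   # expect 8.10.x after the upgrade
print("status:", health["status"])             # want "green"
print("nodes:", health["number_of_nodes"])     # expect 3 (es01, es02, es03)
```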

portainer

https://www.portainer.io/

A FOSS centralized web based Docker management tool.

Shut down the docker container on the old server, start it on the new server, painless.

guacamole

The Apache Guacamole project provides a website that turns any web browser into an SSH client or remote desktop client.

Shut down the old docker container, start it on the new server, painless.

dc21, dc22 to dc02, dc03

https://www.samba.org/

Samba Active Directory Domain Controller Cluster.

I need to remove dc21 and dc22 as the second-to-last VMs removed from OS2, because some VMs on OS2 still point to dc21 and dc22 for DNS resolution.

Probably the only "interesting" thing to remember to do was moving the FSMO roles off of dc21 and onto dc01.
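On the Samba side that boils down to a couple of samba-tool commands run on the DC that should end up holding the roles (dc01 here); a rough sketch:

```python
#!/usr/bin/env python3
"""Sketch of the FSMO move, run on dc01 (the DC that should end up
holding the roles). samba-tool will prompt for credentials as needed."""
import subprocess

def samba_tool(*args):
    subprocess.run(["samba-tool", *args], check=True)

samba_tool("fsmo", "show")                    # see which DC currently holds each role
samba_tool("fsmo", "transfer", "--role=all")  # pull all roles onto this DC (dc01)
samba_tool("fsmo", "show")                    # confirm dc21 no longer holds anything
```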

I took this opportunity to clean up the old DNS entries. I use the RSAT tools on a Windows 11 desktop, which work pretty well for managing Samba Active Directory.

dns21, dns22 to dns02, dns03

Ubuntu servers that resolve DNS requests forwarded from the Domain Controller cluster.

dc01 forwards DNS resolution to dns01 and dns02, dc02 forwards to dns02 and dns03, and so forth, so everything has multiple backups.  This works pretty well.
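A quick way to spot-check that chain after the swap is to ask every DC and every resolver for an external name directly; a sketch using dig, where the test name is just an example:

```python
#!/usr/bin/env python3
"""Spot-check the forwarding chain: query each DC and each resolver
directly for an external name and make sure they all answer. Hostnames
follow the layout described above; the test name is just an example."""
import subprocess

SERVERS = ["dc01", "dc02", "dc03", "dns01", "dns02", "dns03"]
TEST_NAME = "www.debian.org"   # any external name will do

for server in SERVERS:
    result = subprocess.run(
        ["dig", "+short", f"@{server}", TEST_NAME],
        capture_output=True, text=True)
    answer = result.stdout.strip() or "NO ANSWER"
    print(f"{server}: {answer}")
```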

I need to remove dns21 and dns22 after removing dc21 and dc22. These will be the last VMs removed from OS2.

emby

https://emby.media/

A media server and DVR for Roku and other household set top boxes.

Just an Ubuntu server with the Emby package installed.

I need to do the conversion when the recordings list is empty.

Note there is, of course, a new IP address.

Emby has a very elaborate and detailed manual provisioning process, documented in its Redmine runbook "issue".

ubuntu

General end-user use.

Move the old ubuntu to ubuntu-old in DNS.

Set up a new Ubuntu for end-user use.

Make sure Ansible runs on the new ubuntu before shutting down the old ubuntu.

backup

An Ubuntu NFS and Samba fileserver holding backup data.

This was a test fileserver before converting "storage2" to "storage". The only problem I ran into was minor: FreeBSD prefers the use of the group "wheel" whereas Ubuntu prefers the use of the group "root".


Next blog post: there's nothing left on OpenStack cluster 2, so shut it down and prepare the OS2 cluster hardware for reuse as additional Proxmox capacity.
