Wednesday, November 15, 2023

Proxmox VE Cluster - Chapter 011 - Move OpenStack cluster 1 workload to Proxmox

A voyage of adventure, moving a diverse workload running on OpenStack, Harvester, and RKE2 K8S clusters over to a Proxmox VE cluster.

Before the hardware used for OpenStack cluster OS1 can be repurposed for the Proxmox cluster, I need to move all the virtual machines and containers off OS1.  There are several options for each workload: delete it temporarily until there is more capacity, delete it permanently if it is no longer needed, or move it to the Proxmox cluster.

The old "warm backup" availability strategy for OpenStack was that some workloads, for example one of the minor file servers, were installed on both clusters but only operating on one cluster at a time.  It was expensive to keep two copies of "everything" around, and running only one copy on the Proxmox cluster should save quite a bit of capacity overall.

Here is a list of the workloads I moved to Proxmox:

netbootxyz

Netboot.xyz provides network booting infrastructure.  Network booting starts with a DHCP server like ISC DHCP (or Kea...) pointing a booting PC at a TFTP address, in this case the address of the netboot.xyz server.  That is where netboot.xyz comes into the picture: it serves a really nice CLI menu of dozens of operating system install ISO files, which makes it a very convenient way to install an OS.  There are also plenty of testing and troubleshooting images available.
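The DHCP side looks something like this dhcpd.conf fragment (the addresses are examples, not my actual LAN, and the filenames come from the netboot.xyz download page):

```conf
# Client architecture option, used to pick BIOS vs UEFI boot files.
option arch code 93 = unsigned integer 16;

subnet 10.1.1.0 netmask 255.255.255.0 {
  range 10.1.1.100 10.1.1.199;
  next-server 10.1.1.5;            # TFTP: the netboot.xyz VM
  if option arch = 00:07 {
    filename "netboot.xyz.efi";    # UEFI x86_64 clients
  } else {
    filename "netboot.xyz.kpxe";   # legacy BIOS clients
  }
}
```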

https://netboot.xyz/

The VM is a simple Ubuntu 20.04 install that runs Docker.  I NFS mount all my Docker volumes, which has worked well for several years.  The move was uneventful: shut down the old VM on the OS1 cluster, start the new VM (on a new address) on the Proxmox cluster, run a script I keep in the NFS-mounted Docker directory to pull and start a netboot.xyz container, repoint the DHCP servers to the new netboot.xyz IP address, and it just works.
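The pull-and-start script is roughly the following sketch (the volume path and published ports are examples; adjust to taste):

```shell
#!/bin/sh
# Pull the latest netboot.xyz image and (re)start the container.
docker pull ghcr.io/netbootxyz/netbootxyz
docker rm -f netbootxyz 2>/dev/null
docker run -d --name netbootxyz --restart unless-stopped \
  -p 69:69/udp -p 3000:3000 \
  -v /mnt/docker/netbootxyz/config:/config \
  ghcr.io/netbootxyz/netbootxyz
```

Because the config volume lives on NFS, the same script works identically on whichever host the container lands on.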

wiki

This is a Docker container of DokuWiki.  I use it as a "home page" or "phone book" for the LAN.  If it's a web-accessible server, it has a link to it on the wiki.

https://www.dokuwiki.org/dokuwiki

This is another simple Ubuntu VM holding a Docker container, much like the netbootxyz VM above.  One of the many advantages of storing my Docker volumes over NFS is that a move like this is so simple: shut down the Docker container on the old server, start the Docker container on the new server, done.  The move was uneventful.
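As a sketch of the pattern (NFS server name, export path, and image are examples), every Docker host mounts the same export and the container start line is identical everywhere:

```shell
# /etc/fstab line on each Docker host:
#   nfs01:/export/docker  /mnt/docker  nfs  defaults,_netdev  0  0
# Start the wiki on whichever host should own it right now:
docker run -d --name wiki --restart unless-stopped \
  -p 80:80 \
  -v /mnt/docker/wiki:/config \
  lscr.io/linuxserver/dokuwiki
```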

dhcp11, dhcp12, dhcp21, dhcp22 all replaced by dhcp01, dhcp02

This is a classic dual server ISC-DHCPD cluster.

https://www.isc.org/dhcp/

On the OS1 cluster this ran on FreeBSD, and I converted it to Ubuntu 20.04.  This conversion was uneventful.  I am aware ISC DHCP is discontinued as of 2022, and Kea is the next generation of ISC-supported DHCP servers.  I will convert to Kea later; stay tuned for a Spring City Solutions YouTube channel video about that conversion process.
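For reference, the classic two-server setup hinges on a failover peer declaration like this dhcpd.conf fragment (addresses and timers are examples; the mirror-image config on dhcp02 says "secondary" and swaps the addresses):

```conf
failover peer "dhcp-failover" {
  primary;                       # "secondary;" on dhcp02
  address 10.1.1.11;             # this server (dhcp01)
  peer address 10.1.1.12;        # the other server (dhcp02)
  port 647;
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;                     # primary only
  split 128;                     # primary only
  load balance max seconds 3;
}

subnet 10.1.1.0 netmask 255.255.255.0 {
  pool {
    failover peer "dhcp-failover";
    range 10.1.1.100 10.1.1.199;
  }
}
```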

dc11, dc12 replaced by dc01

This LAN uses Samba servers as Active Directory Domain Controllers.  It is really nice to have network access to my home directory from any machine, and SSO is also pretty cool.

https://www.samba.org/

This was also a conversion from FreeBSD Samba (which is pretty easy to use) to Ubuntu Samba (which is definitely not as easy to use).

After the initial OS install and Ansible configuration, do not do a "net ads join -U administrator" during the Ansible process.  The Ubuntu samba-tool utility has to create a fresh smb.conf file from scratch during the joining process, so just move the Ansible-provided file out of the way temporarily (the Ansible file should be identical other than configuring DNS forwarder servers).  After the initial configuration you will have to manually edit (or use Ansible) to fix the DNS forwarder entry in /etc/samba/smb.conf.

You can join an Ubuntu DC to a domain with no error messages while running the "user" set of Samba services (smbd, nmbd, and winbind), and there will be zero error messages aside from the "samba-tool drs showrepl" command failing to connect to port 135, and of course the DC not working in general.  It seems that on Ubuntu you need to shut down the user-class daemons smbd, nmbd, and winbind using systemctl, then look up how systemd permanently shuts down samba-ad-dc in order to figure out how to "unmask" that service.

The next Ubuntu Samba-related problem is that systemd-resolved is autoconfigured to start on port 53 before Samba (while refusing external connections), and samba-ad-dc will start successfully without any error messages while failing to bind to port 53.  In summary, by default, domain controller authoritative DNS will fail to work.  Systemd is always so annoying to use and just makes everything harder.  The solution is to "systemctl stop" and "systemctl mask" the systemd-resolved service.

A final minor problem, or surprise, is that I couldn't get replication working without a reboot.  Yes, I know it's on a fifteen-minute (or so) timer, but it just wouldn't start without a reboot.  Believe it or not, on FreeBSD, none of this drama is necessary.

Despite the extensive effort required to work around systemd, Samba eventually worked.
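Condensed into commands, the Ubuntu-specific dance looks roughly like this (the realm and forwarder address are examples, not my actual domain):

```shell
# The AD DC mode replaces smbd, nmbd, and winbind with samba-ad-dc,
# so stop and disable the member-server daemons first.
systemctl disable --now smbd nmbd winbind

# samba-ad-dc ships masked on Ubuntu; unmask and enable it.
systemctl unmask samba-ad-dc
systemctl enable samba-ad-dc

# Free port 53: systemd-resolved grabs it before Samba's DNS can.
systemctl disable --now systemd-resolved
systemctl mask systemd-resolved

# Move any pre-provisioned smb.conf aside; samba-tool writes its own.
mv /etc/samba/smb.conf /etc/samba/smb.conf.ansible

# Join as an additional domain controller (not "net ads join").
samba-tool domain join example.lan DC -U administrator

# Re-add the DNS forwarder line to the generated smb.conf, e.g.
# "dns forwarder = 10.1.1.53" in [global], then start the DC.
systemctl start samba-ad-dc
```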

dns11, dns12 replaced by dns01

These are typical ISC Bind DNS servers.  Samba Active Directory Domain Controllers do not have a very advanced resolver, so I forward DNS queries for everything they're not authoritative for to a resolver cluster.  I've always preferred a DNS architecture that keeps authoritative DNS and DNS resolving separate.

https://www.isc.org/bind/

The VM is a simple Ubuntu server acting as a DNS resolver (not serving authoritative DNS).  The conversion was uneventful.  Active Directory domain controllers DC11 and DC12 use DNS11 and DNS12, so I could not shut down and remove DNS11 and DNS12 until after removing DC11 and DC12.
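A resolver-only BIND boils down to a named.conf.options fragment along these lines (the LAN prefix is an example):

```conf
// Restrict recursion to the LAN; no authoritative zones are loaded,
// so this box only resolves.
acl lan { 10.1.0.0/16; localhost; };

options {
  recursion yes;
  allow-recursion { lan; };
  allow-query { lan; };
  dnssec-validation auto;
};
```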

observium

Observium is an SNMP-based monitoring system.  It mostly monitors my Netgear Ethernet switches, but it can monitor other devices too.

https://www.observium.org/

The VM is another simple Ubuntu Docker server.  I had to expand the memory to 8 GB, expand the hard disk to 32 GB in Proxmox (then expand the LVM PV, then the LVM LV, and finally the filesystem), and expand the CPU allocation to dual core.  It starts up slowly, but works.
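The in-guest half of the disk expansion is the usual chain of grow steps; a sketch assuming the default Ubuntu LVM layout (device and VG names vary, check with lsblk and vgs first):

```shell
# After growing the virtual disk in the Proxmox UI:
growpart /dev/sda 3                              # grow the partition
pvresize /dev/sda3                               # grow the LVM PV
lvextend -l +100%FREE /dev/ubuntu-vg/ubuntu-lv   # grow the LV
resize2fs /dev/ubuntu-vg/ubuntu-lv               # grow the ext4 filesystem
```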

zabbix

Zabbix is an agent-based detailed monitoring system.  All VMs and some physical hardware run the zabbix agent for advanced monitoring and trend analysis.

https://www.zabbix.com/

The Zabbix system has two VMs.  In the past I've run into weird problems with Zabbix trying to monitor itself.  One way around that is Zabbix's proxy/concentration server, so I installed one of those and let it connect to the main Zabbix server.  It is probably no longer necessary, but it is pretty cool.

The zabbix VM is yet another simple Ubuntu Docker server.  This one was a bit of a headache: I ran into some kind of incompatibility between the old and new MySQL versions.  Had I not moved to a new cluster, it would have hit me at the next container upgrade anyway.  I ended up doing a complete Zabbix reinstall; there were too many new features for the backup to be useful, etc.  In the end, after some labor, it works pretty well.

The zabbixproxy VM, again, yet another simple Ubuntu Docker server, holds the Docker container for zabbix-proxy.  The move from the OS1 cluster was uneventful.
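The proxy container itself is about this simple; a sketch where the server hostname and the SQLite flavor of the image are my assumptions:

```shell
# zabbix-proxy with embedded SQLite state, pointed at the main server.
docker run -d --name zabbixproxy --restart unless-stopped \
  -e ZBX_SERVER_HOST=zabbix.example.lan \
  -e ZBX_HOSTNAME=zabbixproxy \
  -p 10051:10051 \
  -v /mnt/docker/zabbixproxy:/var/lib/zabbix \
  zabbix/zabbix-proxy-sqlite3
```

The proxy then shows up on the main server once a proxy with the matching name is created in the Zabbix frontend.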

zerotier

ZeroTier is a complete VPN solution.  Pretty cool!

https://www.zerotier.com/

The reinstallation was uneventful.  I ended up creating a new connection to ZeroTier and then rerouting the LAN's traffic to the new connection (because for a while I was running both in parallel for testing purposes, which also necessitated some LAN IP address and static route juggling).  I also updated DNS to point to dc01 (and later on added dc02 and dc03, although they don't exist yet).
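The reinstall-and-rejoin itself is only a few commands (the network ID below is a placeholder for your own ZeroTier network):

```shell
# ZeroTier's documented one-line installer, then join and verify.
curl -s https://install.zerotier.com | sudo bash
sudo zerotier-cli join <network-id>
sudo zerotier-cli listnetworks   # status shows OK once authorized
```

The new node still has to be authorized in the ZeroTier Central web console before traffic flows.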


Next blog post will be about hardware prep work to turn the old OpenStack OS1 cluster into more Proxmox nodes.
