Monday, December 4, 2023

Proxmox VE Cluster - Chapter 018 - Moving the remainder of workload to the full size cluster


A voyage of adventure, moving a diverse workload running on OpenStack, Harvester, and RKE2 K8S clusters over to a Proxmox VE cluster.


Some notes on moving the remainder of the old OpenStack workload to the full size Proxmox cluster.  These VMs were "paused" for a couple days and recreated on Proxmox.


Elasticsearch cluster members es04, es05, es06

This is the other half of the six-host Elasticsearch cluster.  Rather than storing the disk images on Ceph (foreshadowing of future posts...) or enabling HA, high availability (more foreshadowing of adventures to come...), I use local 100 GB LVM disks because the Proxmox VE system only uses a couple gigs of my 1 TB SSD OS install drives.

Adding more cluster members to an existing Elasticsearch cluster is no big deal.  Create a temporary cluster enrollment token on any existing cluster member, install a fresh, unconfigured Elasticsearch package on the new VM, run the node reconfiguration script with that token, and wait until it's done.  The main effort is adjusting the config for kibana, filebeat, and metricbeat in Ansible so I can push out config changes to all hosts to use the additional three cluster members.  It 'just works'.  Currently, I have index lifecycle management set to keep only a couple days of logs and metrics because it seems 600 gigs of log storage fills up faster than it did back in the 'old days'.
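For illustration only, here is a minimal Python sketch of the two pieces of config mentioned above: the list of Elasticsearch hosts that filebeat and metricbeat would point at, and an index lifecycle policy body that deletes old indices.  The names es01 through es03, the https scheme, port 9200, the daily rollover, and the two day retention are all assumptions made for the example; my real values live in Ansible and are not shown here.

```python
import json

# es04-es06 are named in this post; es01-es03 are assumed names for the
# first half of the cluster.
ES_NODES = [f"es{n:02d}" for n in range(1, 7)]


def beats_hosts(nodes, port=9200, scheme="https"):
    """Return a hosts list suitable for an output.elasticsearch section."""
    return [f"{scheme}://{node}:{port}" for node in nodes]


def ilm_delete_policy(retention_days):
    """Build an ILM policy body: roll over daily, delete after N days."""
    return {
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_age": "1d"}}},
                "delete": {
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    # Print both structures as JSON so they can be pasted or templated.
    print(json.dumps(beats_hosts(ES_NODES), indent=2))
    print(json.dumps(ilm_delete_policy(2), indent=2))
```

The policy dictionary is the kind of body that gets sent to the Elasticsearch ILM policy API; the hosts list is what changes when three more members join.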

jupyter, mattermost, navidrome, pocketmine, tasmoadmin, ttrss, others...

These are just Docker hosts that run Docker containers.  The scripts to set up the Docker containers, and the Docker volumes, are stored on the main NFS server, so re-deployment amounts to installing an Ubuntu server, letting Ansible set it up to join the AD domain, install Docker for me, set up autofs, etc, then simply running my NFS mounted scripts to run Docker containers accessing NFS mounted Docker volumes.
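As a rough sketch of what one of those NFS mounted scripts boils down to, the Python below assembles a "docker run" command that bind-mounts a directory from the NFS share into a container.  The container name, image name, and both paths are hypothetical placeholders, not my real layout; the point is only that the container's data lives on NFS, so the Docker host itself is disposable.

```python
import shlex


def docker_run_argv(name, image, nfs_volume_dir, container_path):
    """Build the argv for a detached container with one NFS-backed bind mount."""
    return [
        "docker", "run",
        "--detach",
        "--name", name,
        "--restart", "unless-stopped",
        # Host side is an autofs/NFS path, container side is the app's data dir.
        "--volume", f"{nfs_volume_dir}:{container_path}",
        image,
    ]


if __name__ == "__main__":
    argv = docker_run_argv(
        name="example-app",                            # hypothetical
        image="example/app:latest",                    # hypothetical
        nfs_volume_dir="/net/nfs/docker/example-app",  # hypothetical
        container_path="/data",                        # hypothetical
    )
    # Print the command instead of running it, so this is safe to try anywhere.
    print(shlex.join(argv))
```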

booksonic, others...

Another Docker host like the above paragraph.  I had set up Active Directory authentication for a couple applications running in Docker containers and I had some "fun" reconfiguring them to use the new domain controller IP addresses.  No big deal, however, AD auth reconfiguration was an unexpected additional step.  If "everything" is configured automatically in Ansible, but it's not REALLY "everything", then it's easy to forget some application-level configuration remains necessary.  Every system that's big enough has a couple loose ends somewhere.

kapua, kura, hawkbit, mqttrouter (containing Eclipse Mosquitto)

This is my local install of the Eclipse project Java IoT suite that I use for microcontroller experimentation and applications.

Kapua is a web based server for IoT that does everything except firmware updates.  The software is run via a complicated shell script running version 1 docker-compose that works fine with version 2 docker compose, after exporting some shell environment variables to force the correct Kapua version and editing the start up script to run v2 "docker compose" instead of v1 "docker-compose".  Kapua overall is a bit too complicated to explain in this blog post.
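The v1 to v2 edit is mechanical enough to sketch in a few lines of Python.  This is not the Kapua script itself, which I am not reproducing here; the sample command line is made up, and the regular expression is just one way to swap the "docker-compose" command for "docker compose" while leaving file names such as docker-compose.yml alone.

```python
import re

# Match the v1 command name only when it stands alone: not part of a path
# or a longer word, and not followed by an extension such as ".yml".
_V1_COMMAND = re.compile(r"(?<![\w./-])docker-compose(?![\w.-])")


def use_compose_v2(script_text):
    """Rewrite v1 'docker-compose' invocations to v2 'docker compose'."""
    return _V1_COMMAND.sub("docker compose", script_text)


if __name__ == "__main__":
    # Made-up example line, not taken from the real start up script.
    sample = "docker-compose -f compose/docker-compose.yml up -d\n"
    print(use_compose_v2(sample), end="")
```

Review the result by hand before running it; a blind search and replace on somebody else's shell script is exactly the kind of loose end that bites later.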

Kura is an example Java IoT device framework running locally in Docker instead of on real hardware, for testing Kapua and generally messing around.

Hawkbit is a firmware updater and it works great; anything with wifi/ethernet and MCUboot can upgrade itself very reliably, or recover from being bricked.  Works great with STM32 boards.

Finally, as for mqttrouter, simply start it with the config stored on NFS and Eclipse Mosquitto works.

The Eclipse project Java-based IoT suite is REALLY cool.  Once upon a time I planned a multi-video YouTube series using it and Zephyr, but I ran out of RAM on my STM32 boards before implementing more than 50% of the Kapua/Kura protocol.  Nowadays I'd just install Kura on a Raspberry Pi, if not Node-RED, or on the smaller end install one of the microcontroller Python implementations and call it good.  Maybe some day I'll get back into Eclipse Java IoT.

win11

This was a gigantic struggle.  The Proxmox side works perfectly with an emulated TPM, and the install went smoothly.  The problem is that I have a valid Windows license on microsoft.com for this VM, but the image refused to 'activate'.  I paid list price for a license that I can't even use; I can see why people have a bad attitude about Microsoft...  Nonetheless, via various technical means I now have a remotely accessible, domain-joined Windows 11 image that I can reach through Apache Guacamole's RDP support from any modern web browser (including my Chromebook) to run Windows "stuff" remotely.  Works pretty well, aside from the previously mentioned license activation problem.  Everything 'Microsoft' is a struggle all the time.

ibm7090, pdp8, rdos, rsx11, tops10, mvs, a couple others

Runs the latest OpenSIMH retrocomputing emulator in a tmux window.  The MVS host has the "famous" MVS/370 Turnkey 5 installed along with a console 3270 emulator.  The disk images are normally stored over NFS along with all configs.  All data is stored in projects on Redmine.  I have login entries on Apache Guacamole so I have full access to my retrocomputing environment via any web browser.


Next blog post:  Various operations issues.  Upgrading Proxmox VE software, daily stuff like that.
