Friday, December 8, 2023

Proxmox VE Cluster - Chapter 020 - Architecture 1.0 Review and Future Directions


A voyage of adventure, moving a diverse workload running on OpenStack, Harvester, and RKE2 K8S clusters over to a Proxmox VE cluster.


What worked

So far, everything.  It all works better than I expected and has been less of a headache than I anticipated.  Performance is vastly better than on OpenStack, for various design and overall architectural reasons.

How long it took

I am writing these blog posts in a non-linear fashion, so the final editing of this post is being done on a cluster that already has CEPH, HA, the new software defined networking system, and quite a few other interesting features, most of which already have rough draft blog posts.

However, if you believe Clockify, over the course of half a year of hobby-scale effort I have logged 125 hours, 57 minutes, and 52 seconds getting to this point.  So, around "three weeks of full-time labor" to convert a small OpenStack cluster to a medium-size, half-configured Proxmox VE cluster.  I believe this is about half the time it took to convert from VMware to OpenStack a couple of years back.

Future Adventures

This is just a list of topics you can expect to see in blog posts and Spring City Solutions Youtube videos, probably after the holidays or in late winter / early spring:

  • Setting up and upgrading CEPH.
  • Optimizing memory, CPU, and storage.
  • Adding the new SDN feature.
  • Open vSwitch and Netgear hardware QoS.
  • Connecting Ansible and probably Terraform to Proxmox.
  • Monitoring using Observium, Zabbix, and Elasticsearch.
  • Setting up Rancher and RKE2 production clusters on top of Proxmox.
  • Backups using the Proxmox Backup Server product.
  • Cloud-init: will I ever get it working the way I want it to work?
  • HA (High Availability): unfortunately, I can verify this software feature works excellently during hardware failures.
  • USB passthrough.

Most of the stuff listed above is done or in process, and already partially documented in rough draft blog posts.  CEPH integration, for example, has been unimaginably cool.


Anyway, thanks for reading and have a great day!

Wednesday, December 6, 2023

Proxmox VE Cluster - Chapter 019 - Proxmox Operations


A voyage of adventure, moving a diverse workload running on OpenStack, Harvester, and RKE2 K8S clusters over to a Proxmox VE cluster.


Proxmox Operations is a broad and complicated topic.


Day-to-day operations are performed ALMOST entirely in the web GUI, with very few visits to the CLI.  I have years of experience with VMware and OpenStack, and weeks, maybe even months, of experience with Proxmox, so let's compare the experience:

  • VMware:  vSphere is installed on the cluster as an image, and because it's an incredibly expensive piece of licensed software, you get one installation of vSphere (maybe two, depending on HA success) and you get to hope it works.  Backup, restore, upgrades, and installation work about as well as you'd expect for "enterprise" grade software.
  • OpenStack: Horizon is installed on the controller, and the controller is NOT part of the cluster.  It's free, so feel free to install multiple controllers, although I never operated that way.  It's expensive in terms of hardware, because the core design assumes you're running a rather large cloud, not a couple of hosts in a rack.  Upgrades are a terrifying, moderately painful, and long process.  The kolla-ansible solution of running it all in containers is interesting, although it replaces the un-troubleshoot-able complication of a bare metal installation with an equal level of un-troubleshoot-able complication in Docker containers.
  • Proxmox VE: Every VE node has a web front end to do CRUD operations against the shared cluster configuration database.  The VE system magically synchronizes the hardware to match the configuration database.  Very cool design and 100% reliable so far.  Scalability is excellent; whereas OpenStack assumes you're rolling in with a minimum of a dozen or so nodes, Proxmox works from as low as one isolated node.  (A small API sketch follows just below.)
An interesting operational note is that the Proxmox UI is more "polished" and "professional" and "complete" than either alternative.  FOSS usually has a reputation for inadequate UIs, but Proxmox has the best UI of the three.
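
To make the "any node can drive the whole cluster" point concrete, here's a minimal sketch using the third-party proxmoxer Python library (not something Proxmox ships); the node name and API token are hypothetical placeholders, and the same calls work no matter which cluster member you point them at.

    # Minimal sketch: read the shared cluster configuration through any one node.
    # Assumes the third-party "proxmoxer" library (pip install proxmoxer requests);
    # the node name and API token below are hypothetical placeholders.
    from proxmoxer import ProxmoxAPI

    # Any cluster member works here, since every node serves the same cluster view.
    prox = ProxmoxAPI(
        "proxmox001.example.com",
        user="root@pam",
        token_name="automation",
        token_value="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        verify_ssl=False,
    )

    # Every node the cluster configuration database knows about.
    for node in prox.nodes.get():
        print(node["node"], node["status"])

    # The VMs defined on one node, straight from the same configuration database.
    for vm in prox.nodes("proxmox001").qemu.get():
        print(vm["vmid"], vm["name"], vm["status"])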

Upgrades

Let's consider one operational task: upgrades.  Proxmox is essentially a Debian Linux installation with a bunch of Proxmox-specific packages installed on top of it, not all that different from installing Docker or Elasticsearch from upstream.  I try to upgrade every node in the cluster at least monthly; the less that changes per upgrade, the less "exciting" the upgrade.  With Debian-based operating systems in general, the level of excitement and drama and stress scales exponentially with the number of upgraded software packages.

The official Proxmox process for upgrades is basically: hit it, maybe reboot, all good.
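
For the curious, here's a hedged sketch of what that per-node "hit it" step amounts to at the shell level, wrapped in Python since that's what I script with; this reflects my understanding of the GUI's upgrade button, not an official procedure.

    # Rough per-node upgrade sketch, run ON the node being upgraded.
    # Proxmox is Debian underneath, so the GUI's upgrade button is essentially
    # an apt dist-upgrade of the base OS plus the Proxmox packages.
    import subprocess

    def run(cmd):
        """Echo a command, run it, and stop on the first failure."""
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    run(["pveversion"])                 # record the starting version
    run(["apt", "update"])              # refresh the Debian and Proxmox repos
    run(["apt", "dist-upgrade", "-y"])  # the actual upgrade
    run(["pveversion"])                 # confirm the new version
    # Reboot by hand afterwards if a new kernel landed.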

As you'd expect, there are complications, IRL.

First I make a plan: I upgrade all the hosts in one sitting because I don't want cross-version compatibility issues in the cluster, and I start with the least sensitive cluster host.  Note that if you log into proxmox001 and upgrade/reboot proxmox002, you stay logged into the cluster.  However, if you log into proxmox001 and upgrade and reboot proxmox001, you lose web access to the rest of the cluster during the reboot (as a workaround, simply log into the proxmox002 web UI while rebooting proxmox001).

Next I verify the backups of the VMs on a node and generally poke through the logs.  If I'm getting hardware errors or something, I want to know before I start changing software.  Yes, this blog post series is non-linear and I haven't mentioned backups or the Proxmox Backup Server product yet, but those posts are coming soon.

I generally shut down clustered VMs and unimportant VMs, and migrate "important" VMs to other hosts.
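
Here's a hedged sketch of that evacuation step using the stock qm and pvesh CLIs, again wrapped in Python; the node names, VM IDs, and the "important" list are hypothetical, and live migration details depend on your storage and HA setup.

    # Hedged sketch: empty a node of VMs before upgrading it.
    # Run on the node being evacuated; node names and VM IDs are hypothetical.
    import json
    import subprocess

    SOURCE_NODE = "proxmox001"      # the node about to be upgraded
    TARGET_NODE = "proxmox002"      # where the important VMs should land
    IMPORTANT = {101, 102}          # VMs to live-migrate instead of shutting down

    # Ask the API (via pvesh) for this node's VMs in machine-readable form.
    vms = json.loads(subprocess.run(
        ["pvesh", "get", f"/nodes/{SOURCE_NODE}/qemu", "--output-format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout)

    for vm in vms:
        vmid = int(vm["vmid"])
        if vm["status"] != "running":
            continue
        if vmid in IMPORTANT:
            # Live-migrate the important stuff to another cluster member.
            subprocess.run(["qm", "migrate", str(vmid), TARGET_NODE, "--online"], check=True)
        else:
            # Clustered and unimportant VMs just get a clean shutdown.
            subprocess.run(["qm", "shutdown", str(vmid)], check=True)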

There are special notes about the Beelink DKMS process for the custom ethernet driver using non-free firmware.  Basically, Proxmox 8.0 shipped with a Linux kernel that could be modified to use the DKMS driver for the broken Realtek ethernet driver; however, the DKMS driver does NOT seem compatible with the kernel shipped with Proxmox 8.1, so after some completely fruitless hours of effort I simply removed my three Beelink microservers from the cluster.  "Life's too short to use Realtek."  You'd think Linux hardware compatibility would be better in 2023 than in 1993 when I got started, but really there isn't much difference between 2023 and 1993, and plenty of stuff just doesn't work.  So, here's a URL to remove nodes from a cluster, which is a bit more involved than adding nodes LOL:
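
That link aside, here's my hedged understanding of the rough shape of a node removal using the pvecm tool; check the official docs before trying it, since the departing node has to be powered off and must never rejoin the cluster with the same identity.

    # Hedged sketch of removing a retired node from the cluster with pvecm.
    # Run on a SURVIVING cluster member, after the departing node is powered off.
    # "proxmox007" is a hypothetical node name; read the official docs first.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    run(["pvecm", "nodes"])                  # confirm the cluster still lists the node
    run(["pvecm", "delnode", "proxmox007"])  # drop it from the corosync membership
    run(["pvecm", "status"])                 # verify quorum still looks sane
    # The leftover /etc/pve/nodes/proxmox007 directory can be cleaned up by hand later.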

Other than fully completing and verifying the operation of exactly one node at a time, I have no serious advice.  Upgrades on Proxmox generally just work, somehow with even less drama than VMware upgrades, and light-years less stress than an OpenStack upgrade.  Don't forget to update the Runbook docs and the due date in Redmine after each node upgrade.

Note that upgrading the Proxmox VE software is only half the job; once that's done across the entire cluster, it's time to look at CEPH.  Again, these blog posts are being written long after the action, and I haven't mentioned CEPH in a blog post yet.  Those posts are on the way.


Shortly after I rough-drafted these blog posts, Proxmox 8.1 dropped, along with an upgrade from CEPH Quincy to CEPH Reef.  AFAIK any CEPH upgrade, even a minor version bump, is basically the same as a major upgrade, just much less exciting and stressful.  I do everything for a minor upgrade in the same order and process, more or less, as for a major CEPH version upgrade, and that may even be correct.  It does work, at least so far.
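
As a hedged sketch of the sort of sanity checks I mean, the standard ceph CLI can confirm the whole cluster has converged on one version before and after each node gets touched; this is my own habit, not the official upgrade guide.

    # Hedged sketch: CEPH sanity checks around a node-by-node upgrade.
    # Uses the standard ceph CLI available on any Proxmox node running CEPH.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # Before touching anything: overall health and which daemon versions are running.
    run(["ceph", "-s"])
    run(["ceph", "versions"])

    # Optionally keep CEPH from rebalancing while a node reboots mid-upgrade.
    run(["ceph", "osd", "set", "noout"])

    # ... upgrade and reboot one node, wait for it to rejoin ...

    # Afterwards: clear the flag and confirm every daemon reports the same version.
    run(["ceph", "osd", "unset", "noout"])
    run(["ceph", "versions"])
    run(["ceph", "-s"])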

Next post: a summary and evaluation of "Architecture Level 1.0", where we've been and where we're going.

Monday, December 4, 2023

Proxmox VE Cluster - Chapter 018 - Moving the remainder of workload to the full size cluster


A voyage of adventure, moving a diverse workload running on OpenStack, Harvester, and RKE2 K8S clusters over to a Proxmox VE cluster.


Some notes on moving the remainder of the old OpenStack workload to the full size Proxmox cluster.  These VMs were "paused" for a couple days and recreated on Proxmox.


Elasticsearch cluster members es04, es05, es06

This is the other half of the six-host Elasticsearch cluster.  Rather than storing the disk images on CEPH (foreshadowing of future posts...) or enabling HA high availability (more foreshadowing of adventures to come...), I use local 100 GB LVM disks, because the Proxmox VE system only uses a couple of gigs of my 1 TB SSD OS install drives.

Adding more cluster members to an existing Elasticsearch cluster is no big deal.  Create a temporary cluster enrollment token on any existing cluster member, install a blank unused Elasticsearch binary on the VM, run the cluster-mode reconfiguration script with the previously mentioned token, and wait until it's done.  The main effort is adjusting the config for Kibana, Filebeat, and Metricbeat in Ansible so I can push out config changes to all hosts to use the additional three cluster members.  It 'just works'.  Currently, I have index lifecycle management set to keep only a couple of days of logs and metrics, because it seems 600 gigs of logs fills up faster than it did back in the 'old days'.
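
For concreteness, here's a hedged sketch of that join dance using the stock Elasticsearch 8.x helper scripts, run from the new node; the paths are the Debian package defaults, and the hostnames and ssh access are assumptions on my part.

    # Hedged sketch: join a freshly installed Elasticsearch 8.x node to an existing cluster.
    # Run FROM the new node; paths are Debian package defaults, hostnames are hypothetical.
    import subprocess

    ES_BIN = "/usr/share/elasticsearch/bin"
    EXISTING_MEMBER = "es01"   # any existing cluster member, reachable over ssh

    # 1. Mint a short-lived node enrollment token on an existing member.
    token = subprocess.run(
        ["ssh", EXISTING_MEMBER, f"{ES_BIN}/elasticsearch-create-enrollment-token -s node"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()

    # 2. Reconfigure the blank, never-started local install to join that cluster.
    subprocess.run(
        [f"{ES_BIN}/elasticsearch-reconfigure-node", "--enrollment-token", token],
        check=True,
    )

    # 3. Start the service and let it sync into the cluster.
    subprocess.run(["systemctl", "enable", "--now", "elasticsearch"], check=True)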

jupyter, mattermost, navidrome, pocketmine, tasmoadmin, ttrss, others..

These are just Docker hosts that run Docker containers.  The scripts to set up the Docker containers, and the Docker volumes themselves, are stored on the main NFS server, so re-deployment amounts to installing an Ubuntu server, letting Ansible set it up (join the AD domain, install Docker for me, set up autofs, etc.), and then simply running my NFS-mounted scripts to start Docker containers that access NFS-mounted Docker volumes.
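
As a hedged illustration of the pattern (not my actual scripts), one of those per-container start scripts boils down to something like this; the container name, image, port, and NFS paths are made-up examples.

    # Hedged illustration of the "everything lives on NFS" container pattern.
    # Not my actual script; the name, image, port, and NFS paths are made up.
    import subprocess

    NAME = "navidrome"
    IMAGE = "deluan/navidrome:latest"
    NFS_VOLUME = "/nfs/docker-volumes/navidrome"   # autofs-mounted NFS path

    # Recreate the container from scratch; all persistent state stays on NFS,
    # so the Docker host itself is disposable.
    subprocess.run(["docker", "rm", "-f", NAME], check=False)
    subprocess.run([
        "docker", "run", "-d",
        "--name", NAME,
        "--restart", "unless-stopped",
        "-p", "4533:4533",
        "-v", f"{NFS_VOLUME}/data:/data",
        "-v", f"{NFS_VOLUME}/music:/music:ro",
        IMAGE,
    ], check=True)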

booksonic, others...

Another Docker host like the above.  I had set up Active Directory authentication for a couple of applications running in Docker containers, and I had some "fun" reconfiguring them to use the new domain controller IP addresses.  No big deal; however, AD auth reconfiguration was an unexpected additional step.  If "everything" is configured automatically in Ansible, but it's not REALLY "everything", then it's easy to forget that some application-level configuration remains necessary.  Every system that's big enough has a couple of loose ends somewhere.

kapua, kura, hawkbit, mqttrouter (containing Eclipse Mosquitto)

This is my local install of the Eclipse project Java IoT suite that I use for microcontroller experimentation and applications.

Kapua is a web-based server for IoT that does everything except firmware updates.  The software is run via a complicated shell script written for version 1 docker-compose, which works fine with version 2 docker compose after exporting some shell environment variables to force the correct Kapua version and editing the start-up script to run the v2 "docker compose" command instead of v1 "docker-compose".  Kapua overall is a bit too complicated to explain in this blog post.

Kura is an example Java IoT device framework running locally in Docker instead of on real hardware, for testing Kapua and generally messing around.

Hawkbit is a firmware updater, and it works great: anything with WiFi/Ethernet and MCUboot can upgrade itself very reliably, or recover from being bricked.  Works great with STM32 boards.

Finally, as for mqttrouter: simply start it from the NFS-hosted config and Eclipse Mosquitto just works.

The Eclipse project Java-based IoT suite is REALLY cool, and once upon a time I planned a multi-video Youtube series using it and Zephyr, but I ran out of RAM on my STM32 boards before implementing more than 50% of the Kapua/Kura protocol.  Nowadays I'd just install Kura on a Raspberry Pi, if not Node-RED, or on the smaller end install one of the microcontroller Python implementations and call it good; maybe some day I'll get back into Eclipse Java IoT.

win11

This was a gigantic struggle.  The Proxmox side works perfectly, emulated TPM included, and the install went perfectly smoothly.  The problem was that I have a valid Windows license on microsoft.com for this VM, but the image refused to 'activate'.  I paid list price for a license that I can't even use; I can see why people have a bad attitude about Microsoft...  Nonetheless, via various technical means I now have a remotely accessible, domain-joined Windows 11 image that I can access via Apache Guacamole's RDP support from any modern web browser (including my Chromebook) to run Windows "stuff" remotely.  Works pretty well, aside from the previously mentioned license activation problem.  Everything 'Microsoft' is a struggle all the time.

ibm7090, pdp8, rdos, rsx11, tops10, mvs, a couple others

Runs the latest OpenSIMH retrocomputing emulator in a tmux window.  The MVS host has the "famous" MVS/370 Turnkey 5 installed along with a console 3270 emulator.  The disk images are normally stored over NFS along with all configs.  All data is stored in projects on Redmine.  I have login entries on Apache Guacamole so I have full access to my retrocomputing environment via any web browser.


Next blog post:  Various operations issues.  Upgrading Proxmox VE software, daily stuff like that.