Proxmox VE Cluster - Chapter 001 - Why switch to Proxmox VE?
A voyage of adventure, moving a diverse workload running on OpenStack, Harvester, and RKE2 K8S clusters over to a Proxmox VE cluster.
Why switch from OpenStack / Harvester / RKE2 K8S, VMware, and other tech to a standard data center baseline of Proxmox VE?
- Harvester's hardware requirements are high. Harvester is awesome, but some of my smaller nodes spend most of their CPU cycles and memory running Harvester itself rather than running my workloads. A full Rancher cluster to control Harvester is very cool technology, but the hardware overhead is expensive.
- Harvester upgrades fail because the hardware load is too high. Related to the above, I'm having trouble upgrading the more heavily loaded nodes because they can barely run Harvester at all, much less afford the extra system load to upgrade K8S.
- I can't really go backward to VMware because of the hardware compatibility lists, etc. Technically I could spend the money, but I don't think it would be worth it.
- I've learned everything I can learn from OpenStack and looking at trends it's time to 'jump ship' from OpenStack.
- Upgrades for OpenStack kolla-ansible are non-trivial and a little more interactive than I would prefer. I'm running two OpenStack clusters, and I push the workload over to one cluster temporarily while upgrading the other. It takes a lot of time and sysadmin effort.
- Proxmox has some cool new features to experiment with, like the native Ceph distributed cluster filesystem integrated into the system, and the very cool looking Proxmox Backup Server system.
- I want a "single pane of glass" to manage my cluster hardware with respect to monitoring, control, backup, etc. I will put everything on Proxmox and control everything via a single Proxmox cluster. I don't want an RKE2 K8S cluster AND a Harvester cluster AND two OpenStack clusters to manage; ideally, just one big Proxmox cluster.
Very high level plan for the overall conversion project:
- Architecture level 1.0 will be a phased conversion of all workload into a large Proxmox VE cluster. This will be an OpenStack / Harvester type design and workload wedged into Proxmox VE.
- Architecture level 2.0 will be system integration to make this a Proxmox-styled cluster, integrating with monitoring and automation and generally adapting the workload to make everything feel integrated rather than a different system's workload being temporarily run on Proxmox.
- Architecture level 3.0 will be more R&D focused, advancing into interesting extra features that are Proxmox-specific, such as a cluster wide filesystem, the Proxmox Backup Server system, some interesting advanced networking ideas, and new stuff in general.
How did I get here?
Can't figure out how to get where you're going, unless you know how you got where you are right now.
- Around the turn of the century, had bare metal Linux servers running LXC and also the FreeBSD equivalent.
- Around the 2010s, had some sysadmin level experience with VMware at work, and I signed up for the "ESXi Evaluation Experience", which gives you a limited non-commercial license to pretty much the entire collection of VMware software. This was pretty awesome for several years. However, the continual drift in the hardware compatibility list, the dramatically increasing hardware requirements, and generally being tired of paying for even a discounted VMware license meant I moved away from VMware.
- Around the late 2010s / 2020 timeframe, replaced the VMware cluster with OpenStack. OpenStack is FOSS, but the labor required to keep it up is expensive.
- In the early 2020s I started experimenting with RKE2 K8S on bare metal, and Rancher's Harvester bare metal virtualization solution. Nice tech and works well, but it's designed for "larger" individual nodes than I can afford.
All of the above leads me to Proxmox VE as the foundation for my entire mini-datacenter. I will eventually put everything on Proxmox except for my NAS and a standalone Proxmox Backup Server.
Something to note about this series is that the blog posts appear some time after "the action". So as you read a new blog post, this all happened some weeks or months ago. This gives me time to document things I've missed, circle back around, etc. On the bad side, I might miss some details of something I did months ago. On the good side, I've circled back to document bugs and workarounds and any other areas of friction, which should save you, as the reader, some time if you implement a Proxmox cluster at your site.
Next post in the series will be a more detailed description of my Architecture level 1.0.