Wednesday, June 29, 2022

Adventures of a Small Time OpenStack Sysadmin Chapter 002 - The Plan, v 1.0

Adventures of a Small Time OpenStack Sysadmin relates the experience of converting a small VMware cluster into two small OpenStack clusters, and the adventures and friends I made along the way.

First, I need an operations and logistical plan.  What's going to go where, and when: the usual project planning puzzle.

Before the big hardware crash, I had some experience with OpenStack and researched my options, so I was not going in blind.  Also, if you're creative enough, and have enough resources at hand, it's hard to get yourself boxed in.  At least so far in life, I've always managed to figure my way out of whatever I got myself into.  So far, LOL.

The existing VMware cluster has six hosts, and vSAN really needs a minimum of three hosts.  The cluster workload was somewhat less than fifty percent of capacity, and probably less than a third would be "tolerable" for a while.  This would imply I could split the six-host cluster: convert the first three hosts to OpenStack, move everything from VMware to OpenStack, convert the other three hosts and add them to the now larger OpenStack cluster, and I'd be done.
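A rough sketch of the arithmetic behind that split, assuming identical hosts and a workload that spreads evenly across them (both simplifications for illustration, and the function name is hypothetical):

```python
def utilization_after_split(total_hosts, current_utilization, remaining_hosts):
    """Fraction of capacity in use once the same workload runs on fewer hosts.

    Assumes identical hosts and an evenly spread workload, which is an
    illustrative simplification rather than a measurement of any real cluster.
    """
    if not 0 < remaining_hosts <= total_hosts:
        raise ValueError("remaining_hosts must be between 1 and total_hosts")
    if current_utilization < 0:
        raise ValueError("current_utilization must not be negative")
    return current_utilization * total_hosts / remaining_hosts


if __name__ == "__main__":
    # Six hosts shrinking to three: compare a 50% load with a one-third load.
    for load in (0.5, 1 / 3):
        after = utilization_after_split(6, load, 3)
        print(f"{load:.0%} on six hosts -> {after:.0%} on three hosts")
```

Halving the host count doubles the utilization, so a cluster sitting right at fifty percent would leave no headroom on three hosts, while one trimmed to a third would land around two thirds.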

No big deal, probably a long weekend's work.  With respect to installing OpenStack, I'm sure the experience and docs have improved over the years, so a couple of "apt-get install" lines on a fresh Ubuntu install and I'll be running.  Looking back, I was sooooooo optimistic.  "In the old days," when I was experimenting with OpenStack many years ago, this new-fangled "Kolla-Ansible" project was too new to use, so I figured I'd use the OpenStack repos and just install components one at a time by hand, just like the good ole days, with some help from my existing Ansible infrastructure to replicate across all the hosts and keep the cluster identical.

So in more detail, my plan, version 1.0, looked like this:

Decommission as many legacy cruft/junk/obsolete VMs as possible.  If it's not there anymore, I don't have to move it or worry about making it work.  All clusters accumulate old junk; what a glorious time for a spring cleaning marathon!

Shut down vSAN, for safety's sake, and run everything off the slow NAS over NFS.  I've done this before and have some HUGE storage servers that run off this NAS; it's just a matter of some Storage vMotion moves, then shut off the vSAN.  NFS to a spinning rust HDD NAS is slow, a lot slower than vSAN on an all-SSD cluster with 10G Ethernet, but it's "fast enough for a while".

Infrastructure prep before starting.  Why is my netboot.xyz infrastructure not working for PXE netboot/installation of software?  I've certainly installed ESXi over the network before, along with other OSes.  Get a head start on documenting everything in Netbox, tidy up the Ethernet switch VLANs and stuff, prep Ansible for future hosts and services...

Shut down ESXi hosts 1, 2, and 3 safely and cleanly.  Clean, dust, relabel the hardware, update the docs in Netbox and in the Active Directory DNS (which is actually a cluster of Samba servers acting as DCs, and has worked great for many years now).

Bare metal Ubuntu installs on hosts 1, 2, and 3.  Docs imply the OpenStack "Yoga" version works best on Ubuntu 20.04.  I plan to have Bind DNS running on all bare metal hosts so OS Designate can dynamically configure DNS for the cluster; it will be interesting to see how that interacts with the existing Active Directory install (spoiler: it was challenging and needed a lot of reworking...).  Originally I planned to move my DHCP servers to bare metal installs; that plan also changed along the way.  One plan that worked pretty well was setting up the entire OpenStack cluster as a giant NTP cluster.  My innovative and creative solution to OpenStack not handling USB passthrough like VMware was to just set up Docker on one of the hosts and not virtualize those USB-requiring applications at all.  This step also includes proving out the LAN, and in retrospect my MTU testing was not careful enough, leading to some considerable trouble later on.
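For the MTU part of proving out the LAN, one common check is a ping with the "don't fragment" bit set and a payload sized to exactly fill the MTU.  The sketch below only builds that command for Linux iputils ping; the hostname and the MTU values are placeholders, not the ones from this cluster, and the helper names are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError("MTU is too small to carry an ICMP echo request")
    return payload


def build_ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command that forbids fragmentation.

    -M do sets the don't-fragment bit, -s sets the payload size in bytes,
    and -c limits the number of echo requests sent.
    """
    return ["ping", "-M", "do", "-s", str(icmp_payload_for_mtu(mtu)),
            "-c", str(count), host]


if __name__ == "__main__":
    # "host2.example" and both MTU values are placeholders for illustration.
    for mtu in (1500, 9000):
        print(" ".join(build_ping_command("host2.example", mtu)))
```

If the full-size ping fails between any pair of hosts while a smaller one succeeds, something on that path has a smaller MTU than expected, which is worth finding before the cluster depends on it.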

OpenStack has a wildly different project architecture than VMware.  VMware sets up simple hypervisors on all hosts, then runs "the cool stuff" as virtualized hosts, so vCenter lives as just another image.  OpenStack kind of inverts that architecture, so you need a controller host (or, ideally, a cluster) that runs bare metal MySQL as a database, RabbitMQ or similar as a message queue, and so on, while the hypervisors ONLY run production workload.  So there's some infrastructure to set up: those applications along with memcached, etcd, etc.

At this point I planned to follow the online OpenStack "Installation Tutorial" documentation.  I was a little nervous that most of the docs referenced Ubuntu 18 or even Ubuntu 14...  I designed a sensible dependency tree: Keystone first, Swift before Glance, Placement before Nova, etc.  I figured I'd set up the basics now, and experiment with more advanced features like Magnum and Mistral and Zun and Trove later.
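That kind of dependency tree can be turned into an install order with a topological sort.  This is a minimal sketch using Python's standard graphlib module (Python 3.9 or later); the only edges in it are the ones named above, with "Keystone first" read as every other service depending on Keystone, so it is an illustration rather than the complete tree.

```python
from graphlib import TopologicalSorter

# Each service maps to the set of services that must be installed before it.
# Only the orderings mentioned in the text are encoded; "Keystone first" is
# interpreted here as every other service depending on Keystone.
DEPENDS_ON = {
    "keystone": set(),
    "swift": {"keystone"},
    "glance": {"keystone", "swift"},
    "placement": {"keystone"},
    "nova": {"keystone", "placement"},
}


def install_order(depends_on):
    """Return one valid install order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, service in enumerate(install_order(DEPENDS_ON), start=1):
        print(step, service)
```

Several orders can satisfy the same constraints, so the exact sequence printed may vary; what is guaranteed is that each service appears after everything it depends on.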

"Move everything from VMware to OpenStack".  Sounds simple.  In retrospect, as usual, the last ten percent of any effort takes ninety percent of the time, recursively...

Shut down, clean up, and re-install the last three hosts, 4, 5, and 6.  This will be the end of the VMware cluster, so I don't have to be so careful, plus with the previous experience, this should be pretty smooth.

Add hosts 4, 5, and 6 to the existing cluster consisting of hosts 1, 2, and 3.  I have some experience messing around with OS Swift, so I know it'll take some effort, but it's quite possible.

Add "cool new extra" services to the larger capacity full size OpenStack.  Maybe try out Trove for databases instead of spawning off more Docker containers, that sort of thing.  At the time, I planned on setting up an ELK stack, which I've done before, to replace Log Insight.

I expected to do one last sweep thru the entire system to update docs, update hardware labels, verify and maybe even test new backup strategies.

The plan sounded great...  However, IT work is similar to my military experience, in that no matter how well designed a plan is, the plan never survives contact with the enemy.  The mental effort of making a plan provides the "virtual experience" to improve the odds of success, so it's not wasted time.

Stay tuned for the next chapter!
