Thursday, June 30, 2022

Adventures of a Small Time OpenStack Sysadmin Chapter 003 - Starting Conditions

Adventures of a Small Time OpenStack Sysadmin relates the experience of converting a small VMware cluster into two small OpenStack clusters, and the adventures and friends I made along the way.


To get where you're trying to go, first you have to figure out where you are.

The good news is, I'm the guy who stood up the VMware cluster, so I have a pretty good idea of how it works.  And I like having good documentation, so I have a local installation of Netbox, which is the best IPAM solution I've ever seen, or even dreamed of.

There are six identical ESXi hosts, all a couple of years old, the "famous" SuperMicro SYS-E200-8D model that is so popular in home labs around the world.  Each has 96 gigs of RAM, because during my short, wild, and experimental VMware NSX era, NSX was incredibly memory hungry, to the point where you'd think the world's memory manufacturers had bribed VMware to find some way to use more memory.  Like, what is it even doing with all those gigabytes of RAM on each host?

As for networking, each ESXi host has an IPMI port with a pretty good web-accessible HTML KVM, plus dual one-gig ethernet ports and dual ten-gig ethernet ports.  The VMware networking concept, at least pre-NSX, is to set up distributed switches across the hosts and uplink each switch to VLANs, preferably across all the ethernet ports, to different physical switches.  So I have a one-gig ethernet switch with 12 connections and a ten-gig ethernet switch (which was admittedly expensive some years ago...) with 12 connections.  All the ports are configured the same way and identically trunked, although I tried to follow VMware guidelines to pin the vMotion vmkernel port to one dedicated 10G uplink and the vSAN vmkernel port to a different dedicated 10G uplink.  In theory (and in practice, a couple of times...) it was possible to yank out any three of the four ethernet ports and the system would keep working, although perhaps slowly.  Well, it was possible to REALLY fool VMware in the old days by admin-downing ports or blocking VLANs on the switch side, but in general it was pretty bulletproof.  NSX was somewhat less bulletproof due to its extreme complexity, but I had given up on NSX years ago.  OpenStack does networking quite a bit differently than VMware; I'll get to that later...

As for storage, there's an all-SSD vSAN across all six ESXi hosts, which for years has been one hundred percent reliable; I've never personally experienced data loss or, frankly, even much of a problem with vSAN.  I also have a huge iX Systems hardware NAS of many terabytes, and a couple of smaller TrueNAS boxes based on Intel NUCs from a few years ago with a fraction of a TB each.  VMware works reliably over NFS, although obviously the vSAN is enormously faster.  Over the years I upgraded the storage on each host: it started out with cheap small HDDs, moved to small SSDs, and later to larger SSDs as prices fell.  Those SSDs will be used for the bare-metal OS and Cinder storage.  Each host also has an internal M.2 NVMe SSD that was used for vSAN cache and that I intend to use for OpenStack Swift.  Storage under OpenStack Cinder will also work a lot differently than it did under VMware...

Generally the easiest way to load balance, back up, replicate, and otherwise administer Docker workloads on VMware was to set up a host for every project (or even every container!) and let vMotion, DRS, and HA do their magic.  With Ansible to automate the Linux side and products like Orchestrator to automate the VMware side, it only takes minutes to spin up a new Active Directory-connected Docker host on the old VMware system.  I can likely replicate that workflow on OpenStack using Heat (or Magnum with really small Docker Swarms?), or I could eventually move to Zun.
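
Just to sketch the idea, not the final design: something like the snippet below, using the openstacksdk Python library, could stand in for the Orchestrator side of that workflow by booting a fresh instance and letting cloud-init install Docker.  The cloud, image, flavor, network, and keypair names are all placeholders I made up for illustration, and the Ansible and Active Directory join steps are left out entirely.

# Rough sketch: boot a Docker host on OpenStack with openstacksdk.
# All names (cloud, image, flavor, network, keypair) are placeholders.
import base64
import openstack

# cloud-init user data that installs Docker on first boot
CLOUD_INIT = """#cloud-config
packages:
  - docker.io
runcmd:
  - systemctl enable --now docker
"""

conn = openstack.connect(cloud="homelab")  # entry in clouds.yaml

image = conn.compute.find_image("ubuntu-22.04")
flavor = conn.compute.find_flavor("m1.medium")
network = conn.network.find_network("lan-vlan10")

server = conn.compute.create_server(
    name="docker-host-01",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
    key_name="admin-key",
    user_data=base64.b64encode(CLOUD_INIT.encode()).decode(),
)
server = conn.compute.wait_for_server(server)
print("Docker host ready at", server.access_ipv4)

Heat or Magnum would express roughly the same thing declaratively, but even this little imperative version shows how much of the old "minutes to a new Docker host" experience should carry over.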

The biggest changes will be infrastructural in nature.  Aside from the previously mentioned Zun to handle Docker, I would be using Heat instead of Orchestrator, and 'something' in place of Log Insight.  Probably a homemade ELK stack?  In the old days you'd install an ELK stack on bare metal, or a bare-metal image anyway, just like you'd install something like Apache Guacamole on bare metal, but now there are Docker containers for seemingly every service.  So somehow this conversion project is already morphing into a larger one: not just dragging and dropping existing images, but changing entire software architectures to use more Docker containerization and so forth.  Not exactly the first IT project in history to experience massive scope creep over time...
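
As a tiny illustration of the "everything is a container now" point, here is roughly what standing up just the Elasticsearch piece of a homemade ELK stack looks like with the Docker SDK for Python.  The image tag, port mapping, and memory limit are only examples; a real deployment would obviously want Logstash, Kibana, persistent volumes, and TLS on top.

# Rough sketch: run a single-node Elasticsearch container with the
# Docker SDK for Python.  Image tag and settings are examples only.
import docker

client = docker.from_env()

es = client.containers.run(
    "docker.elastic.co/elasticsearch/elasticsearch:8.3.0",
    name="elasticsearch",
    detach=True,
    environment={
        "discovery.type": "single-node",
        "xpack.security.enabled": "false",  # acceptable for a homelab test, not production
    },
    ports={"9200/tcp": 9200},
    mem_limit="2g",
)
print(es.status)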

Stay tuned for the next chapter!
