Tuesday, February 21, 2023

Rancher Suite K8S Adventure - Chapter 007 - Helm

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Helm version 3.11 is installed, using Ansible, on all members of the Rancher RKE2 cluster and on my Ubuntu experimentation box.  Honestly, this is almost identical to yesterday's process for installing kubectl; it's just a different repo and a different package.

https://helm.sh/docs/intro/install/

The Ubuntu package I'm installing is Helm 3.11.1, as seen at

https://helm.baltorepo.com/stable/debian/packages/helm/releases/

And I'm doing an "apt-mark hold" on it to make sure it's not accidentally upgraded.

Here is a link to the gitlab repo directory for the Ansible helm role:

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/tree/master/roles/helm

If you look at the Ansible task file named packages.yml, it installs some boring required packages first, deletes the repo key if it's too old, downloads a new copy of the repo key if it's not already present, adds the local copy of the repo key to apt's list of known good keys, installs the sources.list file for the repo, does an apt-get update, takes helm out of "hold" state, installs the latest package for helm version 3.11, and finally places helm back in "hold" state so it's not magically upgraded to the latest version (3.12 or 3.13 or something by now).  Glad I don't have to do that by hand on every machine.
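
The unhold / install / hold part of that sequence can be sketched in plain shell.  This is only a rough sketch, not the Ansible role itself: the helper names run and install_pinned and the DRY_RUN switch are made up for illustration, and by default it only prints the commands instead of running them.  The '3.11.*' version pattern is an assumption about how the package versions are named in the repo.

```shell
#!/bin/sh
# Sketch of the "unhold, install a pinned version, hold again" step.
# run and install_pinned are hypothetical helper names, not part of the role.
# DRY_RUN defaults to 1, so nothing is changed unless you set DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"        # dry run: just show the command
    else
        "$@"             # real run: execute it
    fi
}

# install_pinned PACKAGE VERSION_PATTERN
install_pinned() {
    pkg="$1"
    ver="$2"
    run sudo apt-mark unhold "$pkg"
    run sudo apt-get install -y "${pkg}=${ver}"
    run sudo apt-mark hold "$pkg"
}

install_pinned helm '3.11.*'
```

Run it once as-is to see the three commands, then again with DRY_RUN=0 once they look right.  The same function would work for kubectl with a '1.24.*' pattern.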

Simply add "- helm" to a machine's Ansible playbook, then run "ansible-playbook --tags helm playbooks/someHostname.yml" and it works.

As of the time this blog was written, "helm version" looks like this:

vince@ubuntu:~$ helm version
version.BuildInfo{Version:"v3.11.1", GitCommit:"293b50c65d4d56187cd4e2f390f0ada46b4c4737", GitTreeState:"clean", GoVersion:"go1.18.10"}
vince@ubuntu:~$ 

Monday, February 20, 2023

Rancher Suite K8S Adventure - Chapter 006 - Kubectl

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Kubectl version 1.24 is installed on all members of the Rancher RKE2 cluster and on my Ubuntu experimentation box using Ansible.

https://kubernetes.io/docs/reference/kubectl/

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands

https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/

The exact version of the Ubuntu package I'm installing is 1.24.10-00 as seen at

https://packages.cloud.google.com/apt/dists/kubernetes-xenial/main/binary-amd64/Packages

And I'm doing an "apt-mark hold" on it to make sure it's not accidentally upgraded.

Here is a link to the gitlab repo directory for the Ansible kubectl role:

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/tree/master/roles/kubectl

If you look at the Ansible task file named packages.yml, it installs some boring required packages first, deletes the Google K8S repo key if it's too old, downloads a new copy of the Google K8S repo key if it's not already present, adds the local copy of the Google K8S repo key to apt's list of known good keys, installs the sources.list file for Google's K8S repo, does an apt-get update, takes kubectl out of "hold" state, installs the latest package for kubectl version 1.24, and finally places kubectl back in "hold" state so it's not magically upgraded to the latest version (1.26 or 1.27 or something by now).  Glad I don't have to do that by hand on every machine, LOL!

Ansible makes life easy.  All I need to do to have the most recent 1.24 kubectl installed on an Ubuntu system is add "- kubectl" to that system's playbook, then run "ansible-playbook --tags kubectl playbooks/someHostname.yml" and, like magic, in seconds it'll work.

As of the time this blog was written, "kubectl version --short" looks like this:

vince@ubuntu:~$ kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.24.10
Kustomize Version: v4.5.4
The connection to the server localhost:8080 was refused - did you specify the right host or port?
vince@ubuntu:~$ 

One last step: you probably want to enable bash autocompletion for kubectl in the .bashrc of whatever username you log in as.  My .bashrc file has a line like this:

source <(kubectl completion bash)

Mine is actually wrapped by some if $HOSTNAME lines, but whatever.

After you do this and log back in, you can type "kubectl" and hit tab a couple times and autocompletion will work. Pretty cool!
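
For the curious, a hostname-wrapped version might look something like the sketch below.  This is not my actual .bashrc: the helper name wants_kubectl_completion and the host patterns are assumptions made up for illustration, and it uses eval instead of the source line above so the fragment also parses in a plain POSIX shell.

```shell
# ~/.bashrc fragment: load kubectl completion only on selected hosts.
# wants_kubectl_completion is a hypothetical helper; the host patterns
# (rancher*, ubuntu) are assumptions based on the hostnames in this series.
wants_kubectl_completion() {
    case "$1" in
        rancher*|ubuntu) return 0 ;;
        *) return 1 ;;
    esac
}

# Only load completion when the host matches AND kubectl is installed.
if wants_kubectl_completion "${HOSTNAME:-$(uname -n)}" \
        && command -v kubectl >/dev/null 2>&1; then
    # In bash this does the same job as: source <(kubectl completion bash)
    eval "$(kubectl completion bash)"
fi
```

The "command -v" check keeps the login quiet on machines where kubectl has not been installed yet.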

Friday, February 17, 2023

Rancher Suite K8S Adventure - Chapter 005 - Ubuntu 20.04 install on a Beelink Mini S 5095

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

I do all installs using Mattermost Playbooks; this particular example is named "Bare Metal Ubuntu 20".  Life is easier with Mattermost.  When I figure out a convenient way to share Mattermost playbooks I'll add a link here.  If you've never used this software, you're missing out... I would describe it as Slack meets an outline-oriented todo app.

https://mattermost.com/

Hardware

The hardware I selected for my three node Rancher RKE2 cluster is the Beelink Mini S 5095.  It's considerably cheaper than a Raspberry Pi, easier to get, Intel CPU based, much faster, and has much more storage.  Sadly, the Raspberry Pi platform has been eliminated from the market by heavy competition and supply chain problems.  The Raspberry Pi was cool tech for its day, but it's unavailable and/or too expensive now.  The Beelink is simply a mini size PC.  This particular model seems very popular in the set top box media player subculture, often used as a Plex or Emby front end instead of using Roku-type hardware.

https://www.bee-link.com/beelink-mini-s-n5095-mini-pc

BIOS configuration was uneventful.

Hit Del while booting to enter BIOS setup

Menu "Main" - Set hwclock to UTC time

Menu "Advanced" "MAC whatever IPv4 Network Configuration" - Configured Enabled, Enable DHCP

"Security" "Secure Boot" "Disable"

"Boot" - Setup Prompt Timeout change from 1 to 3, Quiet Boot Disabled

"Save and Exit" - "Save and Reset"

Reboot, hit del again to enter setup again (can't save and do a pxeboot in the same step, don't know why, doesn't really matter in the long run)

"Save and Exit" Boot Override "UEFI PXE"

I have a netboot.xyz installation on the LAN so I can PXE boot for OS installations.

https://netboot.xyz/

An example of how to configure the ISC DHCP server for PXE based netboot.xyz:

https://gitlab.com/SpringCitySolutionsLLC/dhcp/-/blob/master/header.dhcpd.conf.dhcp11

Likewise, if you use OpenStack and its HEAT template system, you can install netboot.xyz on Zun container service using this example:

https://gitlab.com/SpringCitySolutionsLLC/openstack-scripts/-/blob/master/projects/infrastructure/netbootxyz/netbootxyz.yml

OS

The Ubuntu 20.04 install was mostly uneventful, aside from the usual annoyances revolving around timezone settings, avoiding incorrect DHCP autoconfiguration, etc.  It's the usual Ubuntu experience.

In the Netboot.xyz menu: "Linux Network Installs (64-bit)"

"Ubuntu"

"Ubuntu 20.04 LTS Focal Fossa (Legacy)"

Don't use: "Ubuntu 20.04 LTS Focal Fossa (Subiquity)" - Install seems to hang at "hdaudio hdaudioCOD2: Unable to bind the codec"

"Install"

Reasonable defaults as usual

Full name for the new user: Ubuntu

Password for ubuntu user is "the standard LAN password".  It doesn't matter; the username will be deleted after ansible connects it to AD anyway.

Force timezone to "Central".  I don't live in Chicago, LOL.

The only software to install is OpenSSH server

Note that upon bootup it looks like a failed boot but ctrl-alt-f1 etc will work, very annoying.

Super annoying that it autoconfigures the enp2s0 ethernet as DHCP with no option to change. You can crash out of the DHCP setting and enter manual config mode.  If that fails and it installs in DHCP mode (super annoying) then:

boot, log in as ubuntu, sudo vi /etc/netplan/01-netcfg.yaml and do something like this:

network:
  version: 2
  renderer: networkd
  ethernets:
    enp2s0:
      dhcp4: no
      addresses: [10.10.20.71/16]
      gateway4: 10.10.1.1
      nameservers:
        addresses: [10.10.7.3,10.10.7.4]

Then a quick "sudo netplan apply" and "ip addr" to verify and of course ssh in over the LAN to verify.

sudo reboot now

There is some weird bug where Ubuntu looks like the boot failed but as soon as you hit C-A-F1 you see a login, who knows.  Weird text console bug at bootup doesn't seem to matter.

Verify SSH works over the LAN as the ubuntu user, which ansible will bootstrap into an AD connection.

sudo shutdown now

At this point I physically installed the new server in the data center rack.  Properly label ethernet cables on both sides using the BradyLabel model 41 (yeah, it's a bit of a brag, I really like this label maker), update the port name so the Observium installation makes pretty graphs with the correct server name, all the usual tasks.

Here is a link to the Ansible playbook for rancher1.  There's nothing special or unusual about it; it's just a very small desktop PC being configured into a server.

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/blob/master/playbooks/rancher1.yml

At this point the server is completely integrated in my infrastructure, although no "K8S specific" software has been installed.  AD SSO works, NTP works, Elasticsearch logging and metrics work, Zabbix monitoring works, etc.

Thursday, February 16, 2023

Rancher Suite K8S Adventure - Chapter 004 - Rancher RKE2 IP Addressing and DNS

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Each machine in the cluster needs an IP address, so I added them in Netbox (it's a web-based IPAM system) and physically labeled each machine.

rancher1.cedar.mulhollon.com = 10.10.20.71

rancher2.cedar.mulhollon.com = 10.10.20.72

rancher3.cedar.mulhollon.com = 10.10.20.73

Ansible does my DNS configuration in Active Directory as seen at this URL:

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/blob/master/roles/activedirectory/tasks/rancher1.yml

The cluster overall needs an entry where rancher.cedar.mulhollon.com points to all the load balancers in the cluster.  This results in several problems:

I've never really found "the right way" to store a DNS entry like that in Netbox.  Then again, it's not using an IP address of its own, so does it really "need" to be stored in Netbox?

The other problem is that the Samba internal DNS back end does not support round robin DNS, LOL.

https://wiki.samba.org/index.php/Samba_Internal_DNS_Back_End#Limitations

So, that's annoying.  It'll always return the same answer in the same order.  I think this will be "OK" for access when/if the first host crashes, but it will be very suboptimal for load balancing because essentially all incoming traffic will go to only one cluster machine.
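
A quick way to see the "same answer in the same order" behavior for yourself is to ask the DNS server a bunch of times and count how many different first answers come back.  The sketch below is just that, a sketch: the function names are made up, and it assumes dig is installed (dnsutils package on Ubuntu).  Nothing runs until you call the functions.

```shell
# Count how many different "first answers" a series of lookups produced.
# Reads one first-answer IP per line on stdin; 1 means no rotation at all.
distinct_first_answers() {
    sort -u | wc -l | tr -d ' '
}

# Ask the resolver COUNT times and print the first A record of each reply.
# Hypothetical helper; needs dig and is not called automatically here.
sample_first_answers() {
    name="$1"
    count="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        dig +short "$name" A | head -n 1
        i=$((i + 1))
    done
}
```

Usage would be something like "sample_first_answers rancher.cedar.mulhollon.com 10 | distinct_first_answers".  With a server that never rotates its answers I'd expect that to print 1; with real round robin across three hosts it should climb toward 3.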

In the long run I've been considering setting up a "nice" load balancer on the large Harvester cluster to do proper load balancing for the small cluster and vice versa, or something like that.

Wednesday, February 15, 2023

Rancher Suite K8S Adventure - Chapter 003 - Software Version Selections

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

I've been unsatisfied with how other examples online of Rancher Suite installations select their software versions.  I do not personally approve of "just install whatever is git branch latest or main"; I've seen that blow up pretty badly.  "Everybody knows" Helm only supports the previous K8S API version, except that's wrong: way back in the 3.0.0 days, Helm changed to supporting the last n-3 versions of the K8S API.  And who can really memorize that Helm 3.8.x only supports 1.20.x to 1.23.x anyway (honestly, I had to look it up...)?  A few minutes of planning ahead can save many hours of troubleshooting later on.
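
Since nobody can memorize the skew table, here is a small sketch that writes a few rows of it down and checks a Helm / K8S pairing against them.  The function names are made up for illustration, and the rows are my transcription of the Helm version skew page for a handful of releases, so verify them against that page before trusting the answer.

```shell
# Print the lowest and highest supported K8S minor version for a Helm release.
# Rows transcribed from the Helm version skew page; only a few are listed.
helm_k8s_range() {
    case "$1" in
        3.8)  echo "20 23" ;;
        3.9)  echo "21 24" ;;
        3.10) echo "22 25" ;;
        3.11) echo "23 26" ;;
        *)    return 1 ;;    # unknown Helm release: no data
    esac
}

# helm_supports HELM_MINOR K8S_MINOR   (e.g. helm_supports 3.11 1.24)
# Returns 0 if inside the range, 1 if outside, 2 if the Helm release is unknown.
helm_supports() {
    range="$(helm_k8s_range "$1")" || return 2
    k8s_minor="${2#1.}"      # "1.24" -> "24"
    low="${range% *}"
    high="${range#* }"
    [ "$k8s_minor" -ge "$low" ] && [ "$k8s_minor" -le "$high" ]
}

if helm_supports 3.11 1.24; then
    echo "Helm 3.11 with K8S 1.24: inside the supported range"
fi
```

The table is deliberately hard-coded rather than computed from the n-3 rule, so a future change in the policy shows up as a missing row instead of a silently wrong answer.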

Anyway.  The process I used was to open up diagrams.net and create a big "flowchart" of the nine software components I plan to use, then sequentially cycle through a list of URLs, trying to condense my version selections into one coherent set where everything is predicted to work.

https://www.suse.com/suse-rancher/support-matrix/all-supported-versions/rancher-v2-7-1/

https://www.suse.com/suse-rke2/support-matrix/all-supported-versions/rke2-v1-24/

https://www.suse.com/suse-harvester/support-matrix/all-supported-versions/harvester-v1-1-1/

https://helm.sh/docs/topics/version_skew/

(And many other URLs unfortunately not recorded)

The one exception to the above is "officially" Harvester v1.1.1 only works with the v2.6 branch of Rancher, but unofficially extensive online research indicates everyone says it works just great for them.  Presumably support for Harvester v1.1.1 will only improve with time as Rancher's v2.7 branch develops.

As a disclaimer, this is being written in very early February 2023, on a posting delay to later that month, and you might be reading this post months or even years later, so don't laugh too hard at the "ancient" software versions listed above.

Another disclaimer is this is merely the first version of the plan.  I'm sure changes will be required.

Tuesday, February 14, 2023

Rancher Suite K8S Adventure - Chapter 002 - Plan 1.0

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Starting conditions

I have two small OpenStack clusters running independently.  This way I can move everything from cluster one to cluster two, upgrade cluster one from version 'yoga' to version 'zed', then test and migrate everything to cluster one, then upgrade cluster two, etc.  It has proven to be operationally highly effective and efficient.  All I'm being wasteful about is running two copies of OpenStack Horizon which is pretty lightweight.  VMware had excellent incredibly reliable USB passthru and nothing else has that feature, so I have a couple tiny dedicated bare metal servers connected to various USB hardware (SDR devices, IoT microcontroller hardware, some home automation using ZWAVE, etc).  I have a TrueNAS NAS machine for slow bulk NFS and iSCSI storage that works pretty well.

First, set up a Rancher cluster.  Small Intel SBCs are now much cheaper and somewhat more readily available than slower, lower-capacity Raspberry Pi SBCs, which seems crazy to me, but I guess we live in interesting times.  My intent is to set up a three node RKE2 cluster and install Rancher on that microcluster.

Second, I have enough spare parts lying around to set up a small Harvester HCI cluster.  All I intend to buy is new, fresh NVMe for Longhorn cluster storage.  I can back up my Longhorn data to my existing TrueNAS server.  These ancient Intel NUCs are not high performance, but should be more than adequate for testing and experimentation.  I intend to apply "real world" container and VM workloads to this small cluster and document my experiences.

Third, I will migrate the workload off one of the two OpenStack clusters to the other OpenStack cluster and convert that hardware to a medium size Harvester cluster.  Given the experience in step two above, it should be easy to migrate the real world workload off the remaining OpenStack cluster to the new Harvester cluster.

Fourth, as the remaining OpenStack cluster should be empty now, I will convert that hardware into additional capacity for the now "large" sized Harvester cluster.

I would anticipate along the way I will have several systems to convert and document.  For example, currently the primary NTP servers for the LAN are the OpenStack servers.  I'm tentatively considering changing that to use the Rancher cluster hosts as the primary LAN NTP servers; everything on the LAN is configured with Ansible so this should be a half hour job?  Another example of a larger scale system is I do logging using an ElasticSearch cluster and MetricBeat/FileBeat and Kibana, etc, and I'm curious how well, or if, K8S integrates into an ElasticSearch logging infrastructure.  Speaking of everything on the LAN being configured with Ansible, I wonder how (or if?) I will be able to integrate the 'Rancher Suite' with Ansible.

Ending Conditions

One micro RKE2 cluster dedicated solely to running Rancher.  A small Harvester cluster mostly for experimentation although I may distribute some infrastructure workload on it for increased system reliability.  A 'large' Harvester cluster for real workload.

After you experience the operational flexibility and increased reliability of having multiple clusters, it's hard to go back to having only one cluster.  Technically the workload would easily fit on the 'large' Harvester cluster, but I will keep the small Harvester cluster around for experimentation purposes, etc.

Monday, February 13, 2023

Rancher Suite K8S Adventure - Chapter 001 - Goals

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Goal

The goal of this new project is to convert a small OpenStack cluster to the Suse 'Rancher Suite'.  AFAIK the integrated set of Suse K8S cluster projects doesn't have a formal name so I'm calling it the 'Rancher Suite'.

History

Here's how I see the historical epic of virtualization.  (epic rant incoming)  Around the turn of the century was the rise of VM virtualization: run a hypervisor on the bare metal, which emulated/virtualized a VM, which runs an OS that usually doesn't entirely realize it's a VM, then run your apps on the OS.  I had admin access on some corporate VMware clusters and, a little later, on some large corporate OpenStack clusters.  Meanwhile in the mid 00s, a different approach was becoming popular; chroot jails for enhanced security had been a thing since at least the 90s, maybe before, and the Linux kernel was "threaded" sorta by the mid 00s such that LXC containers could run multiple parallel runtime instances of the OS on top of one running kernel.  This is not theoretical; in the late 00s I was running multiple virtual machines on LXC on some Linux servers and it worked quite well.  Starting with LXC-type containerization, add overlay filesystems and, admittedly, a lot of hype, and Docker was born.  Then Docker led to Docker-Swarm (Docker-Swarm is like 'Fight Club' in that the first rule of Docker-Swarm is nobody talks about Docker-Swarm).  Docker-Swarm plus a lot of cool/complicated stuff leads to Kubernetes, which everyone abbreviates to K8S.

So, how to square the circle of virtualization?  Do we run virtual VMs on hypervisors and put workload on top of that, or run containers on K8S clusters and put workload on top of that?  The 'Marketing Department' name for trying to mix VMs with Containers seems to be 'Hyper Converged Infrastructure'.

I've tried several HCI strategies in the past.  Running hand configured VMs that host Docker, Docker-Swarm, and small K8S clusters like the K3S project worked fine on VMware and OpenStack.  Likewise, as you'd expect, there are automation 'solutions' for VMware and OpenStack that eliminate the hand configuration aspect by being, more or less, very small shell scripts that install preconfigured images for you.  Another way around it is that OpenStack has the Zun project, which provides Docker Engine drivers for OpenStack resources such that containers can natively connect to OpenStack's block storage and network infrastructure, and this works very well for me.  Going the opposite direction, although I've never personally worked with it, the kubevirt.io project lets you run VMs on top of K8S containers, at least as I understand it.

An interesting looking HCI system I want to try with this project is the SUSE 'Rancher Suite' of tightly integrated projects.  Rancher is a controller and orchestrator of K8S clusters, Longhorn is a cool distributed block storage for K8S (the analogy is a vSAN for containers...), Harvester is a Linux bare metal OS designed to hold K8S clusters and VMs, and RKE2 (and, I suppose, RKE1) is a really nice K8S implementation.  What's noteworthy about this list of projects from SUSE is that they're all very compatible with each other, so I will deploy them as an integrated system and move the entire infrastructure from OpenStack to Rancher Suite.  What could possibly go wrong?

Why blog this project?

I think Rancher Suite sounds really cool.  If we're honest with ourselves, most technology selections are based primarily on this criterion, although there's usually a dense wrapper of rationalization surrounding the decision.

The usual self-promotion, of course.  Need a C2C or W2 contract systems engineer?

I'm dissatisfied with the design process seen in YouTube and blog posts of similar deployments.  Just install 'latest' or 'main' branch of all 15 software components and hope for the best.  It's a moving target.  It's like throwing a net at a flock of birds after they've already scattered.  Why doesn't SUSE sell/promote a package deal named 'SUSE cluster 2023' where it's all guaranteed to "just work" together?

Likewise I'm dissatisfied with the demarcation point seen in other creative expositions.  Too many end with the admin running 'kubectl get nodes' then concluding the demonstration.  Wait a minute, I have many containers and VMs that need to be imported into the cluster and there's numerous other day to day 'operational concerns' such as logging and monitoring and backups that have not been addressed.  I'm reminded of the bad old days of Linux OS reviews in the 90s when the review would end at the conclusion of the installation process; wait a minute, I need to do actual stuff AFTER the install; where's the review about anything more than five minutes after the install?  LOL some things never change, do they?

Finally, I write these things to informally document.  I have IPAM using Netbox, and I keep runbooks in Mattermost, and I keep detailed documentation in Redmine just like any other civilized engineer; but sometimes informal prose will effectively jog my memory.