Tuesday, February 28, 2023

Rancher Suite K8S Adventure - Chapter 012 - A first log in and tour of Rancher

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Today is the first log in and tour of Rancher.

Where we left off yesterday, a kubectl 'rollout status' showed Rancher was up and running.
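
For reference, this is the command from yesterday's chapter; it should report the deployment as successfully rolled out:

kubectl -n cattle-system rollout status deploy/rancher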

Go to https://YourClusterHostname, where YourClusterHostname is the round-robin DNS name, not one specific host.  Because you're using a self-signed cert, you'll have to click "OK" or "Proceed" or whatever your web browser requires.

Log in with your bootstrap password from yesterday.

Rancher will want to verify the cluster URL; unless you messed something up, the default should be correct.  You'll also have to click the EULA acceptance checkbox.

This will drop you into the default dashboard page.

One cool feature of Rancher is centralized authorization for cluster control.  I'm not going to configure Active Directory auth.  Partly because I could never get it working; AD has near-infinite flexibility, so if the only feedback is "authentication failed" it can take a near-infinite amount of time to configure.  The other problem is my DCs are hosted as VMs, and I don't like the idea of being locked out of Rancher due to a cluster problem and therefore being unable to log into Rancher to fix the cluster problem.  Kind of like the old joke about hosting your DHCP servers as VMware images and then making your ESXi hosts configure their networks using DHCP.

Anyway, add at least one user for daily use.  Much as most sysadmins do not use root on a Linux box all day, it's probably good not to use admin on Rancher:

Left Hamburger menu, "Configuration", "Users & Authentication"

"Users" "Create" and pay close attention to Global Permissions, Administrator vs Standard User, etc.

To work around the "Password must be at least 12 characters" error:

Left Hamburger menu, "Global Settings", and change password-min-length to something that doesn't force people to use Post-it notes as a password manager.

Log out as admin and log in as a normal-ish user.

Set your preferences via the user icon on the right, under "Preferences".  The default color theme changes at night, which I find incredibly disturbing when it happens, so I always force it to the "Light" theme.  This is also where you can change the "Login Landing Page" from home to a specific cluster.

Time for a quick tour.  This tour provides a high level view of the rest of the series.

Home

At login you will be dropped into "Home", which you can also reach from the left Hamburger menu, "Home".  It gives you a list of your clusters, and we will look at clusters later.

List of Clusters

The next Hamburger menu entry is a list of your clusters, just local right now.  Again, we will look at clusters later.

Continuous Delivery

After the cluster list in the Hamburger menu is "Continuous Delivery", the GitOps stuff where committing code results, optimistically, in passed tests and successful deployments.  You will get a "You don't have any Git Repositories in your Workspaces" message, and we will return to this cool feature another day.  It's an undermarketed feature, very cool...

Cluster Management

"Cluster Management" is the next entry in the Hamburger menu.  The right side is yet another view of your clusters, but the left side is where you enter your cloud credentials, select drivers for Rancher clusters, etc.  AWS is far too expensive to permanently use compared to cheap onsite cluster hardware, but its fun and cool to experiment with.

The Harvester component of the Rancher Suite allows HCI integration of clusters and virtual machines.  This is the page where you import your Harvester Cluster into Rancher.

Users & Authentication

You previously visited "Users & Authentication" when creating a non-admin user for daily Rancher use.  This is also where you configure Authorization Providers.

Extensions

By default, the "Extensions" menu does nothing because the Extension Operator is not enabled.  Clicking the button to enable it adds a new repo full of cool Rancher "stuff".  Obviously you cannot install stuff from the internet if your install is air-gapped.  Anyway, after enabling the Extension Operator, note that no extensions are installed by default; click "Available" and, as of the time of writing, there are exactly two extensions available, one for Elemental OS and one for Kubewarden.

Global Settings

"Global Settings" was where you reconfigured the minimum password length but there are all kinds of cool settings here.  "Home Links" lets you add or change the links on the home page, which is pretty cool.

Examine a typical cluster

Time for a quick glance at a cluster.  The only cluster we have right now is "local", the one running Rancher.  There are at least three different ways to access cluster "local": the home page, the entry in the Hamburger menu, and opening Cluster Management and clicking local.  Note that Cluster Management provides more "control" features and fewer "monitoring" features, so just use the home page for now, because the first stop is the event log.

If you enabled the Extension Operator, that will have generated about two dozen events.  I selected them all and deleted them.  This is a much more exciting page when you're troubleshooting a problem.

Looking at the main cluster page, there's an option to add a cluster badge, which AFAIK is purely decorative.  I like to add the URL of my cluster, so the cluster named "local" that runs Rancher has, in my case, a badge reading "rancher.cedar.mulhollon.com".

Note that next to the badge setting there's an option to enable monitoring via the Cluster Tools charts page.  That is a long story for another day.  Aside from monitoring, there are plenty of other cluster-scale tools Rancher can install, including the backup system, Longhorn for data storage, and various scanning and alerting systems.

Something you'll rapidly discover when looking at the Rancher cluster's workloads, etc., is the drop-down menu at the top of the screen, probably set by default to "Only User Namespaces".  There is no user workload on the Rancher cluster, so, for example, the deployment list will be empty.  Simply change the drop-down to something like "All Namespaces" and the list will fill with rancher and cert-manager and so forth.  It's instructive to click through on the deployment for Rancher, then perhaps select the services or ingresses for that deployment, and see that the HTTPS port you're using to access Rancher is right there, visible inside Rancher.  Or click through from the Rancher deployment to the pod for Rancher, then the container, then click the three dots to look at the container logs for Rancher.
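
If you prefer the CLI, roughly the same exploration can be done with kubectl; a quick sketch, where the namespace and deployment names match this install (yours may differ):

kubectl get deployments --all-namespaces
kubectl -n cattle-system get services,ingresses
kubectl -n cattle-system logs deploy/rancher --tail=20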

That was a whirlwind tour of Rancher.  Here's a great operational resource for new Rancher users:

https://ranchermanager.docs.rancher.com/pages-for-subheaders/new-user-guides

At this point you have a great manager of clusters, but no clusters to manage.  Next we work on adding a small Harvester cluster.

Monday, February 27, 2023

Rancher Suite K8S Adventure - Chapter 011 - Install Rancher on RKE2 cluster

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

The next step is to install Rancher on the RKE2 cluster.  Happily, this is the simplest step of the entire series, so far.

The references are:

https://www.rancher.com/products/rancher

https://ranchermanager.docs.rancher.com/

https://ranchermanager.docs.rancher.com/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli

https://ranchermanager.docs.rancher.com/pages-for-subheaders/install-upgrade-on-a-kubernetes-cluster

There's really only one line to run, admittedly a very long line:

helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --version 2.7.1 \
  --set hostname=rancher.cedar.mulhollon.com \
  --set replicas=1 \
  --set bootstrapPassword=ThisIsNotMyRealPassword

Obviously your hostname, password, and maybe even the requested version will be different from the above.  Don't use a real password; I will explain why later.
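
If you're unsure which chart version to request, helm can list what the repo offers; a quick sketch using the rancher-latest repo added in the cert-manager chapter:

helm search repo rancher-latest/rancher --versions | head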

Let's watch the progress of the install:

kubectl -n cattle-system rollout status deploy/rancher

Note this will take a while... at least five minutes in my experience.
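
If you want more detail than the single rollout status line, watching the pods come up in the namespace works too; a small optional sketch:

kubectl -n cattle-system get pods --watch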

Remember when I noted that you should not use a "real" password for the bootstrap password?  Try this command line:

kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{"\n"}}'

Oh.  That's why.  Although I suppose anyone who has root or kubectl access on your cluster pretty much owns the cluster and everything on it anyway, so there's not much lost by having the password in there.

Tomorrow we're done with the CLI and installation tasks; it's time to tour Rancher!

Friday, February 24, 2023

Rancher Suite K8S Adventure - Chapter 010 - Cert-Manager

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

The next step is to install cert-manager on the new Rancher RKE2 cluster.

The references are:

https://ranchermanager.docs.rancher.com/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli

https://www.jetstack.io/open-source/cert-manager/

https://cert-manager.io/docs/

Add the repos for jetstack and rancher:

helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
helm repo add jetstack https://charts.jetstack.io
helm repo update

then verify:

vince@ubuntu:~$ helm repo list
NAME            URL
rancher-latest  https://releases.rancher.com/server-charts/latest
jetstack        https://charts.jetstack.io
vince@ubuntu:~$ 

Create the namespace for Rancher; we'll create the cert-manager namespace as part of its install:

kubectl create namespace cattle-system

then verify

vince@ubuntu:~$ kubectl get namespaces | grep cattle-system
cattle-system     Active   30s
vince@ubuntu:~$ 

Next install the CRDs (Custom Resource Definitions) used by cert-manager:

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.7.1/cert-manager.crds.yaml

Finally, have helm create the cert-manager namespace (we could have created it above... whatever) and install cert-manager:

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.7.1

Let's take a look at the cert-manager namespace:

vince@ubuntu:~$ kubectl get all --namespace=cert-manager
NAME                                           READY   STATUS    RESTARTS   AGE
pod/cert-manager-646c67487-kmrml               1/1     Running   0          112s
pod/cert-manager-cainjector-7cb8669d6b-wjdcz   1/1     Running   0          112s
pod/cert-manager-webhook-696c5db7ff-slrsv      1/1     Running   0          112s

NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cert-manager           ClusterIP   10.43.61.195    <none>        9402/TCP   112s
service/cert-manager-webhook   ClusterIP   10.43.207.185   <none>        443/TCP    112s

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cert-manager              1/1     1            1           112s
deployment.apps/cert-manager-cainjector   1/1     1            1           112s
deployment.apps/cert-manager-webhook      1/1     1            1           112s

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cert-manager-646c67487               1         1         1       112s
replicaset.apps/cert-manager-cainjector-7cb8669d6b   1         1         1       112s
replicaset.apps/cert-manager-webhook-696c5db7ff      1         1         1       112s
vince@ubuntu:~$ 

Obviously you'll have different IP addresses and times above, but it should look similar, plus or minus obvious blogging platform formatting issues.

Now it's time to verify cert-manager works.  You can go through the steps listed here, but it's tedious to cut and paste:

https://cert-manager.io/docs/installation/verify/

The verification process has you install cmctl, which requires brew, which I don't have on Ubuntu (long story), so that's tedious.  Next the verification process has you look at the pods in the namespace (see the first 'paragraph' of the get all output above).  After that is a longer process: create a YAML cert request, submit it to cert-manager, see if cert-manager issues you a self-signed cert per the YAML, and finally delete it.  There's also a cert-manager-verifier tool:

https://github.com/alenkacz/cert-manager-verifier

However, unless something looks broken, the simplest way to test cert-manager is to install Rancher, and as the plan is to install Rancher tomorrow, it's probably OK to skip extensive testing.
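
That said, if you do want a quick smoke test without installing cmctl, here's a rough sketch based on the self-signed Issuer/Certificate steps in the cert-manager verification docs; the test namespace and resource names are just examples, nothing Rancher needs:

kubectl create namespace cert-manager-test
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  dnsNames:
    - example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
EOF
# give the webhook a few seconds, then the certificate should show READY=True
kubectl -n cert-manager-test get certificate selfsigned-cert
kubectl delete namespace cert-manager-test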


Thursday, February 23, 2023

Rancher Suite K8S Adventure - Chapter 009 - Rancher Cluster Additional Nodes Install

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Now it's time to connect the remaining nodes to the first node, which was configured yesterday.  Server nodes are schedulable on RKE2, it takes three RKE2 server nodes for HA, and coincidentally I have exactly three mini servers allocated to this project, so my cluster design is three servers, zero agents.

The install process for the second and subsequent members of the cluster:

# curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION="v1.24.10+rke2r1" sh -

Now, create a micro-mini config just enough to bootstrap:

mkdir -p /etc/rancher/rke2/

cat node-token >> /etc/rancher/rke2/config.yaml

vi /etc/rancher/rke2/config.yaml

Edit the file so the token line you just appended becomes a proper "token:" key, ending up with something like:

server: https://<server>:9345
token: <contents of node-token>
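
Alternatively, a one-shot sketch that builds the same config.yaml, assuming the node-token landed in /root from the scp in the previous chapter and that rancher1 is your first server:

mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<EOF
server: https://rancher1.cedar.mulhollon.com:9345
token: $(cat /root/node-token)
EOF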

# systemctl enable rke2-server.service

# systemctl start rke2-server.service

(Insert an extremely dramatic very long pause here, about nine minutes?)

Eventually these two commands will settle down and look normal:

systemctl status rke2-server

journalctl -u rke2-server -f

(Repeat above for all future additional nodes in your cluster)

Some fun commands to try:

kubectl get node

kubectl get pods --all-namespaces

At this point you should have a stable three (or more?) node RKE2 cluster ready to run K8S workloads.

Tomorrow, "cert-manager"

Wednesday, February 22, 2023

Rancher Suite K8S Adventure - Chapter 008 - Rancher Cluster First RKE2 Install

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

I chose not to automate this install as RKE2 doesn't support anything more modern than downloading shell scripts from the internet; I suppose the entire point of installing Rancher as a cluster orchestrator is to avoid this kind of weirdness in the future.

The strange-looking INSTALL_RKE2_VERSION value below more or less comes from:

https://update.rke2.io/v1-release/channels
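
If you're curious what the channel server says, you can just fetch it; it should return JSON listing the release channels and the version each one currently points at:

curl -s https://update.rke2.io/v1-release/channels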

First member of a cluster install process:

# curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION="v1.24.10+rke2r1" sh -

# systemctl enable rke2-server.service

# systemctl start rke2-server.service

(Insert an extremely dramatic very long pause here)

Eventually these two commands will settle down and look normal.

systemctl status rke2-server

journalctl -u rke2-server -f

Note that a kubectl config file will be written to /etc/rancher/rke2/rke2.yaml

and a token file will be written to /var/lib/rancher/rke2/server/node-token.  Copy it to the other nodes:

scp /var/lib/rancher/rke2/server/node-token root@otherHosts:

(Repeat above for all future additional nodes in your cluster)

scp /etc/rancher/rke2/rke2.yaml vince@ubuntu:

(Or whatever your "local" experimenting system)

Some fun commands to try as root on your first node:

export KUBECONFIG=/etc/rancher/rke2/rke2.yaml

kubectl get node

kubectl get pods --all-namespaces

helm ls --all-namespaces

Log into your experimental machine, for me that's vince@ubuntu, and recall you scp'd over the kubectl config, filename rke2.yaml.  That YAML specifies the server address as 127.0.0.1, which is not going to work from a remote machine.

First, mkdir ~/.kube then cp rke2.yaml ~/.kube/config

vi ~/.kube/config and change "server: https://127.0.0.1:6443" to something reminiscent of "server: https://rancher1.cedar.mulhollon.com:6443"; obviously your DNS name will be different.
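
The same steps as a quick scripted sketch, assuming the same filenames and hostname as above:

mkdir -p ~/.kube
cp rke2.yaml ~/.kube/config
chmod 600 ~/.kube/config      # optional: keep the credentials private
sed -i 's|https://127.0.0.1:6443|https://rancher1.cedar.mulhollon.com:6443|' ~/.kube/config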

At this point you should be able to run "kubectl get node" from a remote machine, or try "helm ls --all-namespaces".  Cool.

Tomorrow, we add the rest of the nodes to the cluster.

Tuesday, February 21, 2023

Rancher Suite K8S Adventure - Chapter 007 - Helm

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Helm version 3.11 is installed on all members of the Rancher RKE2 cluster and on my Ubuntu experimentation box using Ansible.  Honestly this is almost identical to the process for installing kubectl yesterday; it's just a different repo and a different package.

https://helm.sh/docs/intro/install/

The exact version of the Ubuntu package I'm installing is 3.11.1 (see the "helm version" output below), as seen at

https://helm.baltorepo.com/stable/debian/packages/helm/releases/

And I'm doing an "apt hold" on it to make sure it's not accidentally upgraded.
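
For reference, the manual equivalent of that hold step is just apt-mark; a small sketch with the helm package:

sudo apt-mark hold helm       # keep apt upgrade from touching it
sudo apt-mark showhold        # list currently held packages
sudo apt-mark unhold helm     # release the hold before a deliberate upgrade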

Here is a link to the gitlab repo directory for the Ansible helm role:

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/tree/master/roles/helm

If you look at the Ansible task named packages.yml: it installs some boring required packages first, deletes the repo key if it's too old, downloads a new copy of the repo key if it's not already present, adds the local copy of the repo key to apt's list of known good keys, installs the sources.list file for the repo, does an apt-get update, takes helm out of "hold" state, installs the latest package for helm version 3.11, and finally places helm back on "hold" so it's not magically upgraded to the latest version (3.12 or 3.13 or something by now).  Glad I don't have to do that manually by hand on every machine.

Simply add "- helm" to a machine's Ansible playbook, then run "ansible-playbook --tags helm playbooks/someHostname.yml" and it works.

As of the time this blog was written, "helm version" looks like this:

vince@ubuntu:~$ helm version
version.BuildInfo{Version:"v3.11.1", GitCommit:"293b50c65d4d56187cd4e2f390f0ada46b4c4737", GitTreeState:"clean", GoVersion:"go1.18.10"}
vince@ubuntu:~$ 

Monday, February 20, 2023

Rancher Suite K8S Adventure - Chapter 006 - Kubectl

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Kubectl version 1.24 is installed on all members of the Rancher RKE2 cluster and on my Ubuntu experimentation box using Ansible.

https://kubernetes.io/docs/reference/kubectl/

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands

https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/

The exact version of the Ubuntu package I'm installing is 1.24.10-00 as seen at

https://packages.cloud.google.com/apt/dists/kubernetes-xenial/main/binary-amd64/Packages

And I'm doing an "apt hold" on it to make sure it's not accidentally upgraded.

Here is a link to the gitlab repo directory for the Ansible kubectl role:

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/tree/master/roles/kubectl

If you look at the Ansible task named packages.yml: it installs some boring required packages first, deletes the Google K8S repo key if it's too old, downloads a new copy of the key if it's not already present, adds the local copy to apt's list of known good keys, installs the sources.list file for Google's K8S repo, does an apt-get update, takes kubectl out of "hold" state, installs the latest package for kubectl version 1.24, and finally places kubectl back on "hold" so it's not magically upgraded to the latest version (1.26 or 1.27 or something by now).  Glad I don't have to do that manually by hand on every machine, LOL!

Ansible makes life easy: all I need to do to have the right kubectl installed on an Ubuntu system is add "- kubectl" to that system's playbook, then run "ansible-playbook --tags kubectl playbooks/someHostname.yml" and, like magic, in seconds it works.

As of the time this blog was written, "kubectl version --short" looks like this:

vince@ubuntu:~$ kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.24.10
Kustomize Version: v4.5.4
The connection to the server localhost:8080 was refused - did you specify the right host or port?
vince@ubuntu:~$ 

The last step: you probably want to enable bash autocompletion for kubectl in .bashrc for whatever username you log in as.  My .bashrc file has a line like this:

source <(kubectl completion bash)

Mine is actually wrapped by some if $HOSTNAME lines, but whatever.
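
Something along these lines, as a sketch; the hostname check is just an example, and the author's actual guard may differ:

if [ "$HOSTNAME" = "ubuntu" ]; then
  source <(kubectl completion bash)
fi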

After you do this and log back in, you can type "kubectl" and hit tab a couple times and autocompletion will work. Pretty cool!

Friday, February 17, 2023

Rancher Suite K8S Adventure - Chapter 005 - Ubuntu 20.04 install on a Beelink Mini S 5095

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

I do all installs using Mattermost Playbooks; this particular playbook is named "Bare Metal Ubuntu 20".  Life is easier with Mattermost.  When I figure out a convenient way to share Mattermost playbooks I'll add a link here.  If you've never used this software, you're missing out... I would describe it as Slack meets an outline-oriented to-do app.

https://mattermost.com/

Hardware

The hardware I selected for my three-node Rancher RKE2 cluster is the Beelink Mini S 5095.  It's considerably cheaper than a Raspberry Pi, easier to get, Intel CPU based, much faster, and has much more storage; sadly, the Raspberry Pi platform has been eliminated from the market by heavy competition and supply chain problems.  The Raspberry Pi was cool tech for its day, but it's unavailable and/or too expensive now.  The Beelink is simply a mini-size PC.  This particular model seems very popular in the set-top-box media player subculture, often used as a Plex or Emby front end instead of Roku-type hardware.

https://www.bee-link.com/beelink-mini-s-n5095-mini-pc

BIOS configuration was uneventful.

Hit Del while booting to enter BIOS setup

Menu "Main" - Set hwclock to UTC time

Menu "Advanced" "MAC whatever IPv4 Network Configuration" - Configured Enabled, Enable DHCP

"Security" "Secure Boot" "Disable"

"Boot" - Setup Prompt Timeout change from 1 to 3, Quiet Boot Disabled

"Save and Exit" - "Save and Reset"

Reboot, hit Del again to enter setup again (you can't save and do a PXE boot in the same step; don't know why, doesn't really matter in the long run)

"Save and Exit" Boot Override "UEFI PXE"

I have a netboot.xyz installation on the LAN so I can PXE boot for OS installations.

https://netboot.xyz/

An example of how to configure the ISC DHCP server for PXE based netboot.xyz:

https://gitlab.com/SpringCitySolutionsLLC/dhcp/-/blob/master/header.dhcpd.conf.dhcp11

Likewise, if you use OpenStack and its HEAT template system, you can install netboot.xyz on Zun container service using this example:

https://gitlab.com/SpringCitySolutionsLLC/openstack-scripts/-/blob/master/projects/infrastructure/netbootxyz/netbootxyz.yml

OS

The Ubuntu 20.04 install was mostly uneventful, aside from the usual annoyances revolving around timezone settings, avoiding incorrect DHCP autoconfiguration, etc.  It's the usual Ubuntu experience.

In the Netboot.xyz menu: "Linux Network Installs (64-bit)"

"Ubuntu"

"Ubuntu 20.04 LTS Focal Fossa (Legacy)"

Don't use: "Ubuntu 20.04 LTS Focal Fossa (Subiquity)" - Install seems to hang at "hdaudio hdaudioCOD2: Unable to bind the codec"

"Install"

Reasonable defaults as usual

Full name for the new user: Ubuntu

Password for the ubuntu user is "the standard LAN password"; it doesn't matter, since the username will be deleted after Ansible connects the machine to AD anyway.

Force the timezone to "Central" (America/Chicago); no, I don't live in Chicago, LOL.

The only software to install is OpenSSH server

Note that upon bootup it looks like a failed boot, but Ctrl-Alt-F1 etc. will get you a console; very annoying.

It's super annoying that the installer autoconfigures the enp2s0 ethernet as DHCP with no obvious option to change it.  You can back out of the DHCP setting and enter manual config mode.  If that fails and it installs in DHCP mode anyway, then:

Boot, log in as ubuntu, sudo vi /etc/netplan/01-netcfg.yaml, and do something like this:

network:
  version: 2
  renderer: networkd
  ethernets:
    enp2s0:
      dhcp4: no
      addresses: [10.10.20.71/16]
      gateway4: 10.10.1.1
      nameservers:
        addresses: [10.10.7.3,10.10.7.4]

Then a quick "sudo netplan apply", an "ip addr" to verify, and of course ssh in over the LAN as a final check.
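
Roughly this, as a sketch; the addresses match the netplan example above:

sudo netplan apply
ip addr show enp2s0        # confirm the static 10.10.20.71/16 address took
ip route                   # confirm the default route via 10.10.1.1
# then from another machine on the LAN:
ssh ubuntu@10.10.20.71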

sudo reboot now

There is some weird bug where Ubuntu looks like the boot failed, but as soon as you hit Ctrl-Alt-F1 you see a login; who knows.  The weird text console bug at bootup doesn't seem to matter.

Verify SSH works over the LAN as the ubuntu user, which Ansible will bootstrap into an AD connection.

sudo shutdown now

At this point I physically installed the new server in the data center rack.  Properly label the ethernet cables on both ends using the BradyLabel model 41 (yeah, it's a bit of a brag, I really like this label maker), update the switch port name so the Observium installation makes pretty graphs with the correct server name, and so on through all the usual tasks.

Here is a link to the Ansible playbook for rancher1.  There's nothing special or unusual about it; it's just a very small desktop PC being configured into a server.

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/blob/master/playbooks/rancher1.yml

At this point the server is completely integrated in my infrastructure, although no "K8S specific" software has been installed.  AD SSO works, NTP works, Elasticsearch logging and metrics work, Zabbix monitoring works, etc.

Thursday, February 16, 2023

Rancher Suite K8S Adventure - Chapter 004 - Rancher RKE2 IP Addressing and DNS

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Each machine in the cluster needs an IP address, so I added them in Netbox (it's a web-based IPAM system) and physically labeled each machine.

rancher1.cedar.mulhollon.com = 10.10.20.71

rancher2.cedar.mulhollon.com = 10.10.20.72

rancher3.cedar.mulhollon.com = 10.10.20.73

Ansible does my DNS configuration in Active Directory as seen at this URL:

https://gitlab.com/SpringCitySolutionsLLC/ansible/-/blob/master/roles/activedirectory/tasks/rancher1.yml

The cluster overall needs an entry where rancher.cedar.mulhollon.com points to all the load balancers in the cluster.  This results in a couple of problems:

I've never really found "the right way" to store a DNS entry like that in Netbox.  Then again, it's not consuming an IP address of its own, so does it really "need" to be stored in Netbox?

The other problem is that Samba's internal DNS backend does not support round-robin DNS, LOL.

https://wiki.samba.org/index.php/Samba_Internal_DNS_Back_End#Limitations

So, that's annoying.  It'll always return the same answer in the same order.  I think this will be "OK" for access when/if the first host crashes, but it will be very suboptimal for load balancing because essentially all incoming traffic will go to only one cluster machine.
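
You can see the behavior for yourself once the records exist; a quick sketch (all three A records should come back, but with Samba's internal DNS they'll come back in the same order every time):

dig +short rancher.cedar.mulhollon.com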

In the long run I've been considering setting up a "nice" load balancer on the large Harvester cluster to do proper load balancing for the small cluster and vice versa, or something like that.

Wednesday, February 15, 2023

Rancher Suite K8S Adventure - Chapter 003 - Software Version Selections

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

I've been unsatisfied with how other online examples of Rancher Suite installations select their software versions.  I do not personally approve of "just install whatever is on the latest or main git branch"; I've seen that blow up pretty badly.  "Everybody knows" Helm only supports the previous K8S API version, although that's wrong: way back in the 3.0.0 days, Helm changed to supporting the last n-3 minor versions of the K8S API, and who can really memorize that Helm 3.8.x only supports K8S 1.20.x to 1.23.x anyway (honestly, I had to look it up...).  A few minutes of planning ahead can save many hours of troubleshooting later on.

Anyway.  The process I used was to open up diagrams.net and create a big "flowchart" of the nine software components I plan to use, then sequentially cycle through a list of URLs, trying to condense my version selections into one coherent set where everything is predicted to work.

https://www.suse.com/suse-rancher/support-matrix/all-supported-versions/rancher-v2-7-1/

https://www.suse.com/suse-rke2/support-matrix/all-supported-versions/rke2-v1-24/

https://www.suse.com/suse-harvester/support-matrix/all-supported-versions/harvester-v1-1-1/

https://helm.sh/docs/topics/version_skew/

(And many other URLs unfortunately not recorded)

The one exception to the above is that "officially" Harvester v1.1.1 only supports the v2.6 branch of Rancher, but unofficially, extensive online research indicates it works just fine with v2.7 for everyone who has tried it.  Presumably support for Harvester v1.1.1 will only improve with time as Rancher's v2.7 branch develops.

As a disclaimer, this is being written in very early February 2023, on a posting delay to later that month, and you might be reading this post months or even years later, so don't laugh too hard at the "ancient" software versions listed above.

Another disclaimer is this is merely the first version of the plan.  I'm sure changes will be required.

Tuesday, February 14, 2023

Rancher Suite K8S Adventure - Chapter 002 - Plan 1.0

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Starting conditions

I have two small OpenStack clusters running independently.  This way I can move everything from cluster one to cluster two, upgrade cluster one from version 'yoga' to version 'zed', then test and migrate everything back to cluster one, then upgrade cluster two, etc.  It has proven to be operationally highly effective and efficient.  All I'm being wasteful about is running two copies of OpenStack Horizon, which is pretty lightweight.  VMware had excellent, incredibly reliable USB passthrough and nothing else has that feature, so I have a couple of tiny dedicated bare metal servers connected to various USB hardware (SDR devices, IoT microcontroller hardware, some home automation using Z-Wave, etc.).  I have a TrueNAS machine for slow bulk NFS and iSCSI storage that works pretty well.

First, set up a Rancher cluster.  Small Intel SBCs are now much cheaper and somewhat more readily available than slower, lower-capacity Raspberry Pi SBCs, which seems crazy to me, but I guess we live in interesting times.  My intent is to set up a three-node RKE2 cluster and install Rancher on that microcluster.

Second, I have enough spare parts lying around to set up a small Harvester HCI cluster.  All I intend to buy is new, fresh NVMe storage for Longhorn.  I can back up my Longhorn data to my existing TrueNAS server.  These ancient Intel NUCs are not high performance, but they should be more than adequate for testing and experimentation.  I intend to apply "real world" container and VM workloads to this small cluster and document my experiences.

Third, I will migrate the workload off one of the two OpenStack clusters to the other OpenStack cluster and convert that hardware to a medium size Harvester cluster.  Given the experience in step two above, it should be easy to migrate the real world workload off the remaining OpenStack cluster to the new Harvester cluster.

Fourth, as the remaining OpenStack cluster should be empty now, I will convert that hardware into additional capacity for the now "large" sized Harvester cluster.

I anticipate that along the way I will have several systems to convert and document.  For example, the primary NTP servers for the LAN are currently the OpenStack servers.  I'm tentatively considering using the Rancher cluster hosts as the primary LAN NTP servers instead; everything on the LAN is configured with Ansible, so this should be a half-hour job?  Another example of a larger-scale system: I do logging using an Elasticsearch cluster with Metricbeat/Filebeat and Kibana, etc., and I'm curious how well, or if, K8S integrates into an Elasticsearch logging infrastructure.  Speaking of everything on the LAN being configured with Ansible, I wonder how (or if?) I will be able to integrate the 'Rancher Suite' with Ansible.

Ending Conditions

One micro RKE2 cluster dedicated solely to running Rancher.  A small Harvester cluster mostly for experimentation although I may distribute some infrastructure workload on it for increased system reliability.  A 'large' Harvester cluster for real workload.

After you experience the operational flexibility and increased reliability of having multiple clusters, it's hard to go back to having only one.  Technically the workload would easily fit on the 'large' Harvester cluster alone, but I will keep the small Harvester cluster around for experimentation purposes, etc.

Monday, February 13, 2023

Rancher Suite K8S Adventure - Chapter 001 - Goals

A travelogue of converting from OpenStack to Suse's Rancher Suite for K8S including RKE2, Harvester, kubectl, helm.

Goal

The goal of this new project is to convert a small OpenStack cluster to the SUSE 'Rancher Suite'.  AFAIK the integrated set of SUSE K8S cluster projects doesn't have a formal name, so I'm calling it the 'Rancher Suite'.

History

Here's how I see the historical epic of virtualization.  (Epic rant incoming.)  Around the turn of the century was the rise of VM virtualization: run a hypervisor on the bare metal, which emulates/virtualizes a VM, which runs an OS that usually doesn't entirely realize it's a VM, then run your apps on the OS.  I had admin access on some corporate VMware clusters and, a little later, on some large corporate OpenStack clusters.  Meanwhile, in the mid 00s, a different approach was becoming popular; chroot jails for enhanced security had been a thing since at least the 90s, maybe before, and by the mid 00s the Linux kernel was "threaded" enough, sort of, that LXC containers could run multiple parallel userspace instances of the OS on top of one running kernel.  This is not theoretical; in the late 00s I was running multiple virtual-machine-like containers on LXC on some Linux servers and it worked quite well.  Starting from LXC-type containerization, plus the addition of overlay filesystems and, admittedly, a lot of hype, Docker was born.  Then Docker led to Docker Swarm (Docker Swarm is like 'Fight Club' in that the first rule of Docker Swarm is nobody talks about Docker Swarm).  Docker Swarm plus a lot of cool/complicated stuff led to Kubernetes, which everyone abbreviates to K8S.

So, how to square the circle of virtualization?  Do we run VMs on hypervisors and put workload on top of that, or run containers on K8S clusters and put workload on top of that?  The 'Marketing Department' name for trying to mix VMs with containers seems to be 'Hyper Converged Infrastructure'.

I've tried several HCI strategies in the past.  Running hand-configured VMs that host Docker, Docker Swarm, and small K8S clusters like the K3s project worked fine on VMware and OpenStack.  Likewise, as you'd expect, there are automation 'solutions' for VMware and OpenStack that eliminate the hand-configuration aspect by being, more or less, very small shell scripts that install preconfigured images for you.  Another way around it is OpenStack's Zun project, which provides Docker Engine drivers for OpenStack resources such that containers can natively connect to OpenStack's block storage and network infrastructure, and this works very well for me.  Going the opposite direction, although I've never personally worked with it, the kubevirt.io project lets you run VMs on top of K8S, at least as I understand it.

An interesting-looking HCI system I want to try with this project is the SUSE 'Rancher Suite' of tightly integrated projects.  Rancher is a controller and orchestrator of K8S clusters; Longhorn is a cool distributed block storage for K8S (the analogy is a vSAN for containers...); Harvester is a Linux bare metal OS designed to host K8S clusters and VMs; RKE2 (and, I suppose, RKE1) is a really nice K8S implementation.  What's noteworthy about this list of projects from SUSE is that they're all very compatible with each other, so I will deploy them as an integrated system.  So I will move the entire infrastructure from OpenStack to the Rancher Suite.  What could possibly go wrong?

Why blog this project?

I think the Rancher Suite sounds really cool.  If we're honest with ourselves, most technology selections are based primarily on that criterion, although there's usually a dense wrapper of rationalization surrounding the decision.

The usual self-promotion, of course.  Need a C2C or W2 contract systems engineer?

I'm dissatisfied with the design process seen in YouTube videos and blog posts of similar deployments: just install the 'latest' or 'main' branch of all 15 software components and hope for the best.  It's a moving target.  It's like throwing a net at a flock of birds after they've already scattered.  Why doesn't SUSE sell/promote a package deal named 'SUSE cluster 2023' where it's all guaranteed to "just work" together?

Likewise, I'm dissatisfied with the demarcation point seen in other creative expositions.  Too many end with the admin running 'kubectl get nodes' and then concluding the demonstration.  Wait a minute: I have many containers and VMs that need to be imported into the cluster, and there are numerous other day-to-day 'operational concerns' such as logging, monitoring, and backups that have not been addressed.  I'm reminded of the bad old days of Linux OS reviews in the 90s, when the review would end at the conclusion of the installation process; wait a minute, I need to do actual stuff AFTER the install; where's the review of anything more than five minutes after the install?  LOL, some things never change, do they?

Finally, I write these things as informal documentation.  I have IPAM in Netbox, I keep runbooks in Mattermost, and I keep detailed documentation in Redmine just like any other civilized engineer; but sometimes informal prose is what effectively jogs my memory.