In another “running kubernetes” flavoured post I’ll be trying out
microk8s
High availability K8s.
Low-ops, minimal production Kubernetes,
for devs, cloud, clusters, workstations, Edge and IoT.
In the past I’ve been using k3s for small (mostly single
node) “clusters”. The pitch is very similar. The few differences I can point
out now are:
microk8s is a bit heavier - I don’t think it will run on a 1GB VPS, let’s try it out
microk8s has a better HA story - definitely want to try this out
k3s comes as a single “run everywhere” binary whereas microk8s requires snap
Setting up
I’ll be using Hetzner Cloud to provision a few
small VMs for my laboratory. But I’ll start with a single one.
Specs: Ubuntu 20.04 (so we get snap out of the box), 40GB local NVMe storage, 2x
vCPU, 4GB RAM. And a private network so I can connect future instances to it as
well. A minute later I’m in.
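The install itself isn’t shown above; assuming the standard snap package (the channel pin here is my guess), it’s a single command plus a wait for readiness:
root@microk8s-1:~# snap install microk8s --classic --channel=1.20/stable
root@microk8s-1:~# microk8s status --wait-ready
Then the fun part - enabling addons: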
root@microk8s-1:~# microk8s enable dashboard dns helm3 metrics-server registry
storage traefik
Enabling Kubernetes Dashboard
Enabling Metrics-Server
...
clusterrolebinding.rbac.authorization.k8s.io/traefik-ingress-controller created
service/traefik-web-ui created
ingress.networking.k8s.io/traefik-web-ui created
traefik ingress controller has been installed on port 8080
I must say this already feels much smoother than the k3s setup.
Now let’s start using it (like k3s, there is a kubectl bundled in).
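Typing microk8s kubectl in full gets old quickly; one option (this uses snap’s alias mechanism) is to expose it under its usual name:
root@microk8s-1:~# snap alias microk8s.kubectl kubectl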
root@microk8s-1:~# microk8s kubectl get namespaces
NAME STATUS AGE
kube-system Active 7m5s
kube-public Active 7m4s
kube-node-lease Active 7m4s
default Active 7m4s
container-registry Active 3m43s
traefik Active 3m41s
root@microk8s-1:~# microk8s kubectl get nodes
NAME STATUS ROLES AGE VERSION
microk8s-1 Ready <none> 7m8s v1.20.5-34+40f5951bd9888a
Now how do I connect to this from my laptop? A quick search says microk8s config
will output a kubeconfig file. And indeed it works.
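Grabbing it is just a matter of piping the command’s output over SSH (the host placeholder and the path here are mine):
andraz@amaterasu /tmp/temp
$ ssh root@<server-ip> microk8s config > kubeconfig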
andraz@amaterasu /tmp/temp
$ export KUBECONFIG=$(pwd)/kubeconfig
andraz@amaterasu /tmp/temp
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
microk8s-1 Ready <none> 12m v1.20.5-34+40f5951bd9888a
Can we get a dashboard?
andraz@amaterasu /tmp/temp
$ k get service -A | grep dashboard
kube-system kubernetes-dashboard ClusterIP 10.152.183.220 <none> 443/TCP 12m
kube-system dashboard-metrics-scraper ClusterIP 10.152.183.160 <none> 8000/TCP 12m
andraz@amaterasu /tmp/temp
$ k proxy
Starting to serve on 127.0.0.1:8001
And indeed I get the Web UI available on localhost.
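With the proxy running, the dashboard sits behind the API server’s service proxy; for an HTTPS service named kubernetes-dashboard in kube-system the standard path is:
http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
Logging in still wants a token, which can be pulled from one of the kube-system service account secrets.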
How about resource usage?
root@microk8s-1:~# free -m
total used free shared buff/cache available
Mem: 3840 1154 331 1 2354 2524
Swap: 0 0 0
microk8s comes in at about 1.1GB of RAM. Slightly heavier than k3s (around 0.8GB)
but lighter than RKE (around 1.7GB).
Deploying workloads
Let’s try to get a demo server up and running. I’ll be using nginx-hello from
nginx-demos.
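The manifest itself isn’t reproduced here, but a sketch of roughly what gets applied looks like this (the image tag, labels, and container port are reconstructed from the output below, so treat them as assumptions):
$ kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        image: nginxdemos/nginx-hello   # assumed tag; this image may listen on 8080 rather than 80
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello
spec:
  selector:
    app: hello
  ports:
  - port: 80
    targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
  - host: demo.microk8s.edofic.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello
            port:
              number: 80
EOF
With the domain pointed at the node’s public IP, traefik (on port 8080, as noted earlier) should route requests through to the pods: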
$ curl http://demo.microk8s.edofic.com:8080
Server address: 10.1.35.141:80
Server name: hello-767b8bc964-scq6z
Date: 14/Apr/2021:17:05:10 +0000
URI: /
Request ID: bebd6693485e5581305da8c3e68060de
And indeed it does.
Storage
I’ll skip over storage as I can see from the description in status that it’s
just a host path provisioner - pretty much the same as k3s. If you want fancier
storage you need to bring your own (plenty of options).
Storage classes confirm there is nothing fancy going on
$ k get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
microk8s-hostpath (default) microk8s.io/hostpath Delete Immediate false 8h
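Using it is the usual PersistentVolumeClaim dance; a minimal sketch against the built-in class (the claim name and size are made up):
$ kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  storageClassName: microk8s-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
Keep in mind that hostpath volumes live on a single node’s disk, so they won’t follow pods around the cluster in the HA experiments below.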
Scaling / high availability
So let’s try out the magical HA setup! According to the
docs it should be as simple as
creating a token on the existing node and then issuing a single command on new
nodes to join. But first I need to create some new nodes.
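Creating them is the same recipe as before; roughly, with the hcloud CLI (the server type, image, and network names are my assumptions):
$ hcloud server create --name microk8s-2 --type cx21 --image ubuntu-20.04 --network microk8s-net
$ hcloud server create --name microk8s-3 --type cx21 --image ubuntu-20.04 --network microk8s-net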
I created two more VMs just like the first one and ran through the same setup
steps on each. Then, on the existing node, I generated a join token:
root@microk8s-1:~# microk8s add-node
From the node you wish to join to this cluster, run the following:
microk8s join 162.55.52.75:25000/0b9399975ac32273a17f210ae7e25802
If the node you are adding is not reachable through the default interface you can use one of the following:
microk8s join 162.55.52.75:25000/0b9399975ac32273a17f210ae7e25802
microk8s join 10.0.0.2:25000/0b9399975ac32273a17f210ae7e25802
And follow the instructions (using the private IP). First on the second node.
root@microk8s-2:~# microk8s join 10.0.0.2:25000/0b9399975ac32273a17f210ae7e25802
Contacting cluster at 10.0.0.2
Waiting for this node to finish joining the cluster. ..
root@microk8s-2:~# microk8s status
microk8s is running
high-availability: no
datastore master nodes: 10.0.0.2:19001
datastore standby nodes: none
...
root@microk8s-2:~# microk8s kubectl get nodes
NAME STATUS ROLES AGE VERSION
microk8s-1 Ready <none> 8h v1.20.5-34+40f5951bd9888a
microk8s-2 Ready <none> 69s v1.20.5-34+40f5951bd9888a
Apparently I have a two-node cluster now. Let’s join the third node -
microk8s should automatically switch to HA mode.
root@microk8s-3:~# microk8s join 10.0.0.2:25000/0b9399975ac32273a17f210ae7e25802
Contacting cluster at 10.0.0.2
Failed to join cluster. Error code 500. Invalid token
Huh, looks like join tokens are single-use, so I need to generate a fresh one with another microk8s add-node. Fine.
root@microk8s-3:~# microk8s join 10.0.0.2:25000/cab6b9f7f538bbb5c7e1a3a3f6239c50
Contacting cluster at 10.0.0.2
Waiting for this node to finish joining the cluster. ..
After this is done I can check (on any node; the outputs now agree)
root@microk8s-1:~# microk8s status
microk8s is running
high-availability: yes
datastore master nodes: 10.0.0.2:19001 10.0.0.3:19001 10.0.0.4:19001
datastore standby nodes: none
...
Yay: high-availability: yes. It worked. Looking from my laptop I now also see
three nodes.
$ k get nodes
NAME STATUS ROLES AGE VERSION
microk8s-2 Ready <none> 5m10s v1.20.5-34+40f5951bd9888a
microk8s-1 Ready <none> 9h v1.20.5-34+40f5951bd9888a
microk8s-3 Ready <none> 33s v1.20.5-34+40f5951bd9888a
If I check pod placement I see that everything is still running on node 1.
andraz@amaterasu /tmp/temp
$ k get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hello-767b8bc964-scq6z 1/1 Running 0 92m 10.1.35.141 microk8s-1 <none> <none>
hello-767b8bc964-f4xfs 1/1 Running 0 92m 10.1.35.142 microk8s-1 <none> <none>
hello-767b8bc964-sf4ds 1/1 Running 0 92m 10.1.35.143 microk8s-1 <none> <none>
But if I kill any pod it should quickly be rescheduled around the cluster
andraz@amaterasu /tmp/temp
$ k delete pod hello-767b8bc964-sf4ds
pod "hello-767b8bc964-sf4ds" deleted
andraz@amaterasu /tmp/temp
$ k get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hello-767b8bc964-scq6z 1/1 Running 0 92m 10.1.35.141 microk8s-1 <none> <none>
hello-767b8bc964-f4xfs 1/1 Running 0 92m 10.1.35.142 microk8s-1 <none> <none>
hello-767b8bc964-p9xw8 1/1 Running 0 9s 10.1.100.129 microk8s-2 <none> <none>
andraz@amaterasu /tmp/temp
Great, everything behaving as expected.
Load balancing
I want to try killing my initial node to see HA in action, but this will bork my
ingress since the DNS record is hard-coded to that node’s IP. Luckily Hetzner also
provides managed load balancers, so I’ll set one up. I can use labels to
automatically pick up servers as targets, use health checks to decide which ones
take traffic, and even expose port 80 externally while targeting 8080 internally
via the private IPs. And yes, I can target any server, as the traefik ingress
controller is provisioned on all of them and will route traffic internally to
wherever the target pods are scheduled.
So all I really need to do now is update my domain record and my cluster is
none the wiser.
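For reference, the CLI side of that setup looks roughly like this (the load balancer type, location, label, and network name are all assumptions):
$ hcloud load-balancer create --name microk8s-lb --type lb11 --location nbg1
$ hcloud load-balancer attach-to-network microk8s-lb --network microk8s-net
$ hcloud load-balancer add-target microk8s-lb --label-selector role=microk8s --use-private-ip
$ hcloud load-balancer add-service microk8s-lb --protocol tcp --listen-port 80 --destination-port 8080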
Mind the lack of port 8080 in the URL this time
$ curl http://demo.microk8s.edofic.com
Server address: 10.1.35.142:80
Server name: hello-767b8bc964-f4xfs
Date: 14/Apr/2021:17:32:52 +0000
URI: /
Request ID: 237198f55f60c5cc4629750d1f4e3908
Deleting the initial node
With my setup now truly HA, let’s try to kill the initial node. I simulated this
by doing a hard power off of microk8s-1.
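With the hcloud CLI that’s a one-liner (assuming the same server name as above):
$ hcloud server poweroff microk8s-1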
The load balancer needed a few moments to detect the node was down and then it
stopped routing traffic to it.
Microk8s needed a bit more time. Initially kubectl get nodes still showed node 1
as Ready. But after a few minutes it turned NotReady
root@microk8s-2:~# microk8s kubectl get nodes
NAME STATUS ROLES AGE VERSION
microk8s-1 NotReady <none> 9h v1.20.5-34+40f5951bd9888a
microk8s-2 Ready <none> 36m v1.20.5-34+40f5951bd9888a
microk8s-3 Ready <none> 32m v1.20.5-34+40f5951bd9888a
However, this means that my pods are still scheduled to node 1, so I get reduced
availability of my service. The Service object still routes correctly, so
requests go through, but it’s not great.
Reading the docs, I now need manual intervention to tell Kubernetes that this
node is not coming back.
root@microk8s-2:~# microk8s remove-node microk8s-1 --force
root@microk8s-2:~# microk8s kubectl get nodes
NAME STATUS ROLES AGE VERSION
microk8s-2 Ready <none> 39m v1.20.5-34+40f5951bd9888a
microk8s-3 Ready <none> 35m v1.20.5-34+40f5951bd9888a
Better. How about my pods?
root@microk8s-2:~# microk8s kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hello-767b8bc964-p9xw8 1/1 Running 0 34m 10.1.100.129 microk8s-2 <none> <none>
hello-767b8bc964-5787w 1/1 Running 0 3m38s 10.1.151.68 microk8s-3 <none> <none>
hello-767b8bc964-bk5gd 1/1 Running 0 3m38s 10.1.151.65 microk8s-3 <none> <none>
All rescheduled. So microk8s HA is not a silver bullet - it does not know about
the underlying compute infrastructure. But it’s really low-ops - just as the
description claims :D
Closing thoughts
It’s an interesting piece of technology. I definitely see a use case where you
want to easily set up a cluster on a small number of manually managed machines.
It may come in handy in the future.