Kubeadm defaults to running a single member etcd cluster in a static pod, managed by the kubelet on the control-plane node. This is not a high availability (HA) setup as the etcd cluster contains only one member and cannot sustain any members becoming unavailable.
$ kubectl get pods -n kube-system
NAME                         READY   STATUS    RESTARTS   AGE
etcd-k8s-master1.novalocal   1/1     Running   0          4d3h
A cluster with external etcd, by contrast, is a topology where the distributed data store provided by etcd is external to the cluster formed by the nodes that run the control plane components. The etcd members run on separate hosts, and each etcd host communicates with the kube-apiserver of each control-plane node. This topology decouples the control plane from the etcd members and therefore provides an HA setup in which losing an etcd member has less impact and does not affect cluster redundancy. On the other hand, the higher reliability comes at the cost of more hosts: the minimum number of etcd hosts is three, and it goes up to five or seven (there is no hard limit, but an etcd cluster should probably have no more than seven members).
Why an odd number of cluster members?
An etcd cluster needs a majority of nodes, a quorum, to agree on updates to the cluster state. For a cluster with n members, quorum is the integer part of (n/2) + 1. For any odd-sized cluster, adding one node always increases the number of nodes necessary for quorum. Although adding a node to an odd-sized cluster appears better since there are more machines, the fault tolerance is worse: exactly the same number of nodes may fail without losing quorum, but there are more nodes that can fail. For example, a three-member cluster has a quorum of two and tolerates one failure, while a four-member cluster has a quorum of three and still tolerates only one failure.
This page walks through the process of creating an HA etcd cluster of three members that can be used as an external etcd cluster when using kubeadm to set up a Kubernetes cluster (official guide).
Before you begin
Make sure that:
- the three hosts can talk to each other over the default ports 2379 and 2380 (see the note in Create cluster with Kubeadm);
- the three hosts have been set up following the steps in the chapter Prepare the cluster nodes;
- each host has access to the Kubernetes container image registry k8s.gcr.io (try the commands kubeadm config images list or kubeadm config images pull).
Setting up the etcd cluster
Configure kubelet
On each host, configure the kubelet to be a service manager for etcd. With administrator privileges, override the service priority by creating a new unit file that has higher precedence than the kubeadm-provided kubelet unit file.
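A minimal sketch of such an override, adapted from the official kubeadm guide; the unit file name and the kubelet flags (in particular the cgroup driver) are assumptions that depend on your kubelet version and container runtime:

cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
# Replace "systemd" with the cgroup driver of your container runtime if it differs.
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd
Restart=always
EOF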
Then, restart and check the kubelet status to ensure it is running.
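For example:

systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet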
Location of Unit Files
The files that define how systemd will handle a unit can be found in many different locations, each of which has different priorities and implications. The system's copies of unit files are generally kept in the /lib/systemd/system directory. When software installs unit files on the system, this is the location where they are placed by default. You should not edit files in this directory; instead, if necessary, override them from another unit file location that supersedes this one. If you wish to modify the way that a unit functions, the best place to do so is the /etc/systemd/system directory. Unit files found in this directory take precedence over any other location on the filesystem. If you need to modify the system's copy of a unit file, putting a replacement in this directory is the safest and most flexible way to do so.
Generate the certificate authority
Appoint one of the etcd hosts as the master etcd: it will be used to generate all the necessary certificates, which will then be distributed to the other hosts. If you already have a CA, the only action required is copying the CA's crt and key files to /etc/kubernetes/pki/etcd/ca.crt and /etc/kubernetes/pki/etcd/ca.key on the master etcd. After those files have been copied, proceed to the next step. If you do not already have a CA, run kubeadm init phase certs etcd-ca on the master etcd, which will create the two files mentioned above.
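For reference, a sketch of the second case:

# on the master etcd host
kubeadm init phase certs etcd-ca
# this creates:
#   /etc/kubernetes/pki/etcd/ca.crt
#   /etc/kubernetes/pki/etcd/ca.key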
Create configuration files and certificates
Using the following script, generate one kubeadm configuration file for each host that will have an etcd member running on it, and then generate the corresponding certificates. Copy the script to the master etcd and run it there.
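A sketch of such a script, adapted from the official kubeadm guide. The example IPs and the infra0, infra1, infra2 member names are placeholders, and the kubeadm.k8s.io/v1beta2 apiVersion assumes the kubeadm v1.20 era referenced later in this page; adjust them to your environment.

# Run on the master etcd host (HOST0). Replace the IPs with those of your hosts.
export HOST0=10.0.0.6
export HOST1=10.0.0.7
export HOST2=10.0.0.8

# Temporary directories to stage the files destined for the other hosts.
mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/ /tmp/${HOST2}/

ETCDHOSTS=(${HOST0} ${HOST1} ${HOST2})
NAMES=("infra0" "infra1" "infra2")

# Generate one kubeadm configuration file per etcd member.
for i in "${!ETCDHOSTS[@]}"; do
HOST=${ETCDHOSTS[$i]}
NAME=${NAMES[$i]}
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
apiVersion: "kubeadm.k8s.io/v1beta2"
kind: ClusterConfiguration
etcd:
    local:
        serverCertSANs:
        - "${HOST}"
        peerCertSANs:
        - "${HOST}"
        extraArgs:
            initial-cluster: ${NAMES[0]}=https://${ETCDHOSTS[0]}:2380,${NAMES[1]}=https://${ETCDHOSTS[1]}:2380,${NAMES[2]}=https://${ETCDHOSTS[2]}:2380
            initial-cluster-state: new
            name: ${NAME}
            listen-peer-urls: https://${HOST}:2380
            listen-client-urls: https://${HOST}:2379
            advertise-client-urls: https://${HOST}:2379
            initial-advertise-peer-urls: https://${HOST}:2380
EOF
done

# Generate the certificates for HOST2 and HOST1, staging them under /tmp.
for HOST in ${HOST2} ${HOST1}; do
  kubeadm init phase certs etcd-server --config=/tmp/${HOST}/kubeadmcfg.yaml
  kubeadm init phase certs etcd-peer --config=/tmp/${HOST}/kubeadmcfg.yaml
  kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST}/kubeadmcfg.yaml
  kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST}/kubeadmcfg.yaml
  cp -R /etc/kubernetes/pki /tmp/${HOST}/
  # clean up the certificates that must not be reused for the next member
  find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
  # the CA key must never leave the master etcd host
  rm /tmp/${HOST}/pki/etcd/ca.key
done

# Finally, generate the certificates for HOST0 and leave them in place.
kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml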
Copy kubeadm configs and certificates
The certificates have been generated and now they must be moved to their respective hosts. The complete list of files required on each host is shown below (a sketch of the copy procedure follows the listing):
/tmp/${HOST0}
└── kubeadmcfg.yaml
---
/etc/kubernetes/pki
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
└── etcd
├── ca.crt
├── ca.key
├── healthcheck-client.crt
├── healthcheck-client.key
├── peer.crt
├── peer.key
├── server.crt
└── server.key
$HOME # HOST1
└── kubeadmcfg.yaml
---
/etc/kubernetes/pki
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
└── etcd
├── ca.crt
├── healthcheck-client.crt
├── healthcheck-client.key
├── peer.crt
├── peer.key
├── server.crt
└── server.key
$HOME # HOST2
└── kubeadmcfg.yaml
---
/etc/kubernetes/pki
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
└── etcd
├── ca.crt
├── healthcheck-client.crt
├── healthcheck-client.key
├── peer.crt
├── peer.key
├── server.crt
└── server.key
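A sketch of the copy, assuming SSH access to the other hosts with a sudo-capable user (the ubuntu user name is a placeholder):

# on the master etcd, for HOST1 (repeat for HOST2)
USER=ubuntu
HOST=${HOST1}
scp -r /tmp/${HOST}/* ${USER}@${HOST}:
ssh ${USER}@${HOST}

# then, on the remote host:
sudo -Es
chown -R root:root pki
mv pki /etc/kubernetes/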
Create the static pod manifests
Now that the certificates and configs are in place, it's time to create the manifests. On each host, run the kubeadm command to generate a static pod manifest for etcd.
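For example, following the official guide (the configuration paths match the listing above):

# on HOST0 (the master etcd)
kubeadm init phase etcd local --config=/tmp/${HOST0}/kubeadmcfg.yaml

# on HOST1 and HOST2
kubeadm init phase etcd local --config=$HOME/kubeadmcfg.yaml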
Optional: Check the cluster health
To verify that the procedure was performed correctly, we use the following command:
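A sketch of the check, run with the etcd container image as in the official guide; it assumes Docker is available on the host:

docker run --rm -it \
  --net host \
  -v /etc/kubernetes:/etc/kubernetes k8s.gcr.io/etcd:${ETCD_TAG} etcdctl \
  --cert /etc/kubernetes/pki/etcd/peer.crt \
  --key /etc/kubernetes/pki/etcd/peer.key \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --endpoints https://${HOST0}:2379 endpoint health --cluster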
Set ${ETCD_TAG} to the version tag of your etcd image (e.g. 3.4.13-0). To see the etcd image and tag that kubeadm uses, execute kubeadm config images list --kubernetes-version ${K8S_VERSION}, where ${K8S_VERSION} is, for example, v1.20.2. Finally, set ${HOST0} to the IP address of the host you are testing.
Join the etcd cluster
At this point the etcd cluster is ready and must be joined to the control plane. Let's move on to the control-plane node, onto which we copy the following files from the master etcd:
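A sketch of the copy, assuming SSH access to the control-plane node (the CONTROL_PLANE user and address are placeholders):

export CONTROL_PLANE="ubuntu@10.0.0.10"
scp /etc/kubernetes/pki/etcd/ca.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.key "${CONTROL_PLANE}":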
Create a file called kubeadm-config.yaml on the control-plane node with the following contents (replace the IPs appropriately); it will be passed to kubeadm init in the next step.
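A sketch of the configuration, following the external etcd example in the official guide. LOAD_BALANCER_DNS, LOAD_BALANCER_PORT and the ETCD_x_IP endpoints are placeholders, the apiVersion again assumes kubeadm v1.20, and the caFile/certFile/keyFile paths assume the certificates copied above have been placed under /etc/kubernetes/pki on the control-plane node:

cat << EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.20.2
controlPlaneEndpoint: "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT"
etcd:
  external:
    endpoints:
      - https://ETCD_0_IP:2379
      - https://ETCD_1_IP:2379
      - https://ETCD_2_IP:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
EOF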
Run kubeadm init on the control-plane node (it is recommended to save the join commands returned in the output to a text file for later use).
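For example, assuming kubeadm-config.yaml is in the current directory:

sudo kubeadm init --config kubeadm-config.yaml --upload-certs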
We perform the steps suggested in the command output and already seen in the chapter Building the cluster. Finally, we quickly verify that the etcd cluster is "seen" by the control plane and that it is healthy.
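A sketch of such a check from the control-plane node, reusing the client certificates copied earlier; it assumes etcdctl is installed locally and ETCD_0_IP is a placeholder for one of the etcd member IPs:

ETCDCTL_API=3 etcdctl \
  --cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
  --key /etc/kubernetes/pki/apiserver-etcd-client.key \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --endpoints https://ETCD_0_IP:2379 \
  endpoint health --cluster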
Later we will see how to monitor an external etcd cluster with Prometheus Operator.