Kubespray provides additional playbooks to manage your cluster: you can upgrade it by running the upgrade-cluster.yml playbook. You may want to upgrade because the GitHub repository has been updated or because you want to apply some customizations. In practice, the upgrade command is similar to the one used for cluster deployment:
# As usual, run the command from the "kubespray" folder
$ ansible-playbook -i inventory/mycluster/hosts.yaml -b upgrade-cluster.yml
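Each Kubespray tag supports a limited range of Kubernetes versions; if you want to pin a specific one instead of taking the tag's default, you can pass it as an extra variable. A minimal sketch, assuming the requested version is among those supported by the checked-out tag:

# Pin the Kubernetes version (it must be supported by the current Kubespray tag)
$ ansible-playbook -i inventory/mycluster/hosts.yaml -b upgrade-cluster.yml -e kube_version=v1.20.4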
The difference lies, of course, in the playbook used. Components are upgraded in the order in which they were installed by the Ansible playbook. The order of component installation is as follows:

- Docker / Containerd
- etcd
- kubelet and kube-node
- network_plugin (such as Calico or Weave)
- kube-apiserver, kube-scheduler, and kube-controller-manager
- Add-ons (such as KubeDNS)
It is important to note that upgrade-cluster.yml can only be used to upgrade an existing cluster: there must be at least one kube-master already deployed. In any case, before launching the upgrade, let's look at a handful of parameters that can be changed and at what to do to avoid versioning problems.
The main files to analyze and modify to configure our cluster are:

- inventory/mycluster/group_vars/all/all.yml
- inventory/mycluster/group_vars/k8s-cluster/addons.yml
- inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
In particular, the second file on the list lets you have applications such as Helm, the Metrics Server, and the NGINX Ingress Controller installed in the cluster out of the box. The third lets you choose the destination path of the k8s files, the CNI (Calico, Cilium, Weave, or Flannel), the CRI (Docker, CRI-O, containerd), and much more. Furthermore, it offers the possibility to access the Kubernetes API directly from the SA, even though it is not part of the cluster, and to install Netchecker, whose agents verify connectivity between the Pods.
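As a minimal, illustrative excerpt of the third file (the variable names match the Kubespray defaults of this era, but the values shown here are assumptions; check the file shipped with the tag you have checked out):

# Illustrative excerpt from inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
kube_version: v1.20.4            # Kubernetes version deployed by this Kubespray tag
kube_config_dir: /etc/kubernetes # destination path of the k8s files
kube_network_plugin: calico      # CNI: calico, cilium, weave or flannel
container_manager: docker        # CRI: docker, crio or containerd
kubeconfig_localhost: true       # copy admin.conf to the SA to reach the API from outside the cluster
deploy_netchecker: true          # install Netchecker to verify Pod-to-Pod connectivity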
Multiple upgrades
Attempting to upgrade from an older release straight to the latest one is unsupported and likely to break something: do not skip releases when upgrading, but proceed one tag at a time.
For instance, if you are on v2.13.1, check out v2.13.2, run the upgrade, check out the next tag, run the next upgrade, and so on, until you reach the latest version of the repository. To manage the repository tags, run the following commands inside the kubespray folder:
# Current repo tag
$ git describe --tag
v2.13.1-115-g68d18daf

# List of all tags
$ git tag
...
v2.13.1   # current version
v2.13.2
v2.13.3
v2.13.4
v2.14.0
v2.14.1
v2.14.2
v2.15.0   # final version

# Check out the next tag
$ git checkout v2.13.2
Previous HEAD position was a923f4e7 Update kube_version_min_required and cleanup hashes for release (#7160)
HEAD is now at 3d6b9d6c Update hashes and set default to 1.17.7 (#6286)
Then update the cluster with the command shown at the beginning of this section, now based on the v2.13.2 tag, and repeat tag by tag up to v2.15.0. Inevitably, small manual interventions may be necessary from time to time during the upgrade.
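If you prefer to script the procedure, a minimal sketch of the tag-by-tag loop could look like the following. This helper loop is an assumption on my part, not part of the Kubespray tooling; since a tag may also change the Ansible dependencies, reinstalling requirements.txt between checkouts is a prudent extra step.

# Hypothetical helper loop: upgrade one tag at a time, stopping at the first failure
for tag in v2.13.2 v2.13.3 v2.13.4 v2.14.0 v2.14.1 v2.14.2 v2.15.0; do
  git checkout "$tag" || exit 1
  pip install -r requirements.txt   # dependencies may change between tags
  ansible-playbook -i inventory/mycluster/hosts.yaml -b upgrade-cluster.yml || exit 1
done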
It can be instructive to analyze what happens in the cluster during the upgrade. Run the upgrade command from the SA and, in another terminal connected to a cluster node, watch live what happens inside it. Run the command
$ watch -x kubectl get pod,node -o wide -A
# The following screen will appear, which updates periodically
Every 2.0s: kubectl get pod,node -o wide -A node1: Tue Mar 9 17:18:01 2021
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default pod/netchecker-agent-hostnet-l6s5x 1/1 Running 0 25h 192.168.100.18 node1
default pod/netchecker-agent-hostnet-rf5jl 1/1 Running 0 25h 192.168.100.23 node2
default pod/netchecker-agent-hostnet-sc5h7 1/1 Running 0 25h 192.168.100.25 node3
default pod/netchecker-agent-kqsz7 1/1 Running 0 25h 10.233.90.3 node1
default pod/netchecker-agent-lp5pf 1/1 Running 0 25h 10.233.92.2 node3
default pod/netchecker-agent-z7vb5 1/1 Running 0 25h 10.233.96.2 node2
default pod/netchecker-server-f98789d55-xr6n9 1/1 Running 2 24h 10.233.96.8 node2
kube-system pod/calico-kube-controllers-596bd759d5-x2zqc 1/1 Running 0 24h 192.168.100.23 node2
kube-system pod/calico-node-772q2 1/1 Running 0 25h 192.168.100.23 node2
kube-system pod/calico-node-lnh5z 1/1 Running 0 25h 192.168.100.25 node3
kube-system pod/calico-node-zcqjh 1/1 Running 0 25h 192.168.100.18 node1
kube-system pod/coredns-657959df74-7289c 1/1 Running 0 24h 10.233.96.7 node2
kube-system pod/coredns-657959df74-rtl2d 1/1 Running 0 24h 10.233.90.4 node1
kube-system pod/dns-autoscaler-b5c786945-brq6n 1/1 Running 0 24h 10.233.90.5 node1
kube-system pod/kube-apiserver-node1 1/1 Running 0 25h 192.168.100.18 node1
kube-system pod/kube-controller-manager-node1 1/1 Running 0 25h 192.168.100.18 node1
kube-system pod/kube-proxy-67lvh 1/1 Running 0 24h 192.168.100.18 node1
kube-system pod/kube-proxy-whqwb 1/1 Running 0 24h 192.168.100.25 node3
kube-system pod/kube-proxy-zs6kf 1/1 Running 0 24h 192.168.100.23 node2
kube-system pod/kube-scheduler-node1 1/1 Running 0 25h 192.168.100.18 node1
kube-system pod/metrics-server-5cd75b7749-d2594 2/2 Running 0 24h 10.233.90.6 node1
kube-system pod/nginx-proxy-node2 1/1 Running 0 25h 192.168.100.23 node2
kube-system pod/nginx-proxy-node3 1/1 Running 0 25h 192.168.100.25 node3
kube-system pod/nodelocaldns-hj5t8 1/1 Running 0 25h 192.168.100.18 node1
kube-system pod/nodelocaldns-j7zvh 1/1 Running 0 25h 192.168.100.23 node2
kube-system pod/nodelocaldns-jqbx7 1/1 Running 0 25h 192.168.100.25 node3
NAMESPACE NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE
node/node1 Ready control-plane,master 25h v1.20.4 192.168.100.18 <none> CentOS Linux 8
node/node2 Ready <none> 25h v1.20.4 192.168.100.23 <none> CentOS Linux 8
node/node3 Ready <none> 25h v1.20.4 192.168.100.25 <none> CentOS Linux 8
The nodes are not upgraded all at once, but in turn. The node being upgraded changes its STATUS to Ready,SchedulingDisabled. As long as it remains in this state, you will notice that all the Pods scheduled on it are evicted and moved to the other available nodes. Once the upgrade of that node is finished, it returns to Ready and the playbook moves on to the next node.
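For illustration only (node names follow the example above; the exact output is an assumption), the node list during the upgrade of node2 might look like this:

$ kubectl get node
NAME    STATUS                     ROLES                  AGE   VERSION
node1   Ready                      control-plane,master   25h   v1.20.4
node2   Ready,SchedulingDisabled   <none>                 25h   v1.20.4
node3   Ready                      <none>                 25h   v1.20.4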