...
To restore a cluster, all that is needed is a single snapshot file (snapshot.db). A cluster restore with etcdctl snapshot restore creates new etcd data directories; all members must restore from the same snapshot. Restoring overwrites some snapshot metadata (specifically, the member ID and cluster ID), so each member loses its former identity. Therefore, in order to start a cluster from a snapshot, the restore must start a new logical cluster.
...
Code Block |
---|
language | bash |
---|
title | Restore snapshot |
---|
|
# Copy the snapshot.db file to all etcd nodes
$ scp snapshot.db etcd1:
# Repeat the copy for all etcd members, then run the restore on each one to create its new data directory
$ etcdctl snapshot restore <path>/<snapshot> [--data-dir <data_dir>] --name etcd1 --initial-cluster etcd1=https://<IP1>:2380,etcd2=https://<IP2>:2380,etcd3=https://<IP3>:2380 --initial-cluster-token <token> --initial-advertise-peer-urls https://<IP1>:2380
# For instance, for each of the three nodes
$ etcdctl snapshot restore snapshot.db --data-dir /tmp/restore_snap1 --name etcd1 --initial-cluster etcd1=https://etcd1:2380,etcd2=https://etcd2:2380,etcd3=https://etcd3:2380 --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://etcd1:2380
$ etcdctl snapshot restore snapshot.db --data-dir /tmp/restore_snap2 --name etcd2 --initial-cluster etcd1=https://etcd1:2380,etcd2=https://etcd2:2380,etcd3=https://etcd3:2380 --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://etcd2:2380
$ etcdctl snapshot restore snapshot.db --data-dir /tmp/restore_snap3 --name etcd3 --initial-cluster etcd1=https://etcd1:2380,etcd2=https://etcd2:2380,etcd3=https://etcd3:2380 --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://etcd3:2380 |
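Before running the restores, the snapshot file itself can be sanity-checked. A minimal sketch, assuming etcdctl v3 is available on the node holding snapshot.db (on recent etcd releases the same check is also offered by etcdutl):
Code Block |
---|
language | bash |
---|
title | Check snapshot (optional) |
---|
|
# Print hash, revision, total keys and size of the snapshot file
$ etcdctl snapshot status snapshot.db -w table |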
Before we continue, let's stop all the API server instances. Then stop the etcd service on the nodes.
Code Block |
---|
language | bash |
---|
title | Pause cluster |
---|
|
# Let's go to the master(s) and temporarily move the "kube-apiserver.yaml" file
$ sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
# Stop etcd service on etcd node(s)
$ sudo systemctl stop etcd.service |
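To make sure nothing is still writing to etcd before the data directories are touched, it is worth confirming that both components are really down. A sketch, assuming a systemd-managed etcd and a CRI runtime reachable via crictl on the master(s) (adapt to your runtime, e.g. docker ps on Docker-based setups):
Code Block |
---|
language | bash |
---|
title | Verify everything is stopped |
---|
|
# On the etcd node(s): should print "inactive"
$ sudo systemctl is-active etcd.service
# On the master(s): no kube-apiserver container should be listed anymore
$ sudo crictl ps | grep kube-apiserver |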
As mentioned, the restore command generates a member directory, which must then be copied into the path where the etcd node data are stored (the default path is /var/lib/etcd/).
Code Block |
---|
language | bash |
---|
title | Copy snapshot |
---|
|
# Copy the restored member directory into the path where the etcd node data are stored
$ cp -r <path>/<restore>/member /var/lib/etcd/ |
...
# For each etcd node
$ cp -r $HOME/etcd1.etcd/member/ /var/lib/etcd/ |
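Note that if /var/lib/etcd/ still holds the member directory of the old cluster, it should be moved out of the way before copying, so the restored data does not end up mixed with the stale data. A minimal sketch, assuming the default data path (the .old suffix is just an example):
Code Block |
---|
language | bash |
---|
title | Move old data aside |
---|
|
# Keep the previous data directory around until the restore is confirmed
$ sudo mv /var/lib/etcd/member /var/lib/etcd/member.old |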
...
Finally, we restart the etcd service on the nodes and restore the API server.
Code Block |
---|
language | bash |
---|
title | Restart cluster |
---|
|
# Start etcd service on etcd node(s)
$ sudo systemctl start etcd.service
# Restore the API server from the master(s)
$ sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/ |
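Once the services are back up, it is worth verifying that all members have joined the new logical cluster and that the control plane responds again. A sketch, assuming the usual client port 2379 and the TLS files used by your cluster (paths will differ per setup):
Code Block |
---|
language | bash |
---|
title | Verify the restored cluster |
---|
|
# All three members should be listed and healthy
$ etcdctl --endpoints https://etcd1:2379,https://etcd2:2379,https://etcd3:2379 --cacert <ca.crt> --cert <client.crt> --key <client.key> member list
$ etcdctl --endpoints https://etcd1:2379,https://etcd2:2379,https://etcd3:2379 --cacert <ca.crt> --cert <client.crt> --key <client.key> endpoint health
# And from a node with kubectl access
$ kubectl get nodes
$ kubectl get pods -n kube-system |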
Tip |
---|
We also recommend restarting any components (e.g. kube-scheduler, kube-controller-manager, kubelet) to ensure that they don't rely on stale data. Note that in practice the restore takes a bit of time; during the restoration, critical components will lose their leader lock and restart themselves. |
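A sketch of one way to do this on a kubeadm-style setup, where kube-scheduler and kube-controller-manager run as static pods and the kubelet is managed by systemd (adapt to your own deployment):
Code Block |
---|
language | bash |
---|
title | Restart components |
---|
|
# Restart the kubelet on every node
$ sudo systemctl restart kubelet.service
# On the master(s), recreate the static control-plane pods by briefly moving their manifests, as done above for the API server
$ sudo mv /etc/kubernetes/manifests/kube-scheduler.yaml /tmp/ && sleep 20 && sudo mv /tmp/kube-scheduler.yaml /etc/kubernetes/manifests/
$ sudo mv /etc/kubernetes/manifests/kube-controller-manager.yaml /tmp/ && sleep 20 && sudo mv /tmp/kube-controller-manager.yaml /etc/kubernetes/manifests/ |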
...
Code Block |
---|
language | bash |
---|
title | Copy snapshot |
---|
|
# Copy the restored member directory into the path where the etcd node data are stored
$ cp -r <path>/<restore>/member /var/lib/etcd/
# For each etcd node
$ cp -r /tmp/restore_snap1/member /var/lib/etcd/
$ cp -r /tmp/restore_snap2/member /var/lib/etcd/
$ cp -r /tmp/restore_snap3/member /var/lib/etcd/ |
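Depending on how etcd is run, the copied files may also need the correct ownership. If etcd runs as a dedicated etcd user rather than root (an assumption, check your unit file), something like the following is needed:
Code Block |
---|
language | bash |
---|
title | Fix ownership (if needed) |
---|
|
# Make the restored files owned by the user that runs etcd
$ sudo chown -R etcd:etcd /var/lib/etcd/ |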