Rook provides a growing number of storage providers to a Kubernetes cluster, each with its own operator to deploy and manage the resources for the storage provider. One of these is Ceph, a highly scalable distributed storage solution for block storage, object storage, and shared filesystems, with years of production deployments behind it. Here you can find the official Rook documentation.

Glossary

We present a small glossary that can be useful in reading this page:

  • Container Storage Interface (CSI) is a set of specifications for container orchestration frameworks to manage storage. The CSI spec abstracts common storage features such as create/delete volumes, publish/unpublish volumes, stage/unstage volumes, and more. It is projected that CSI will be the only supported persistent storage driver in the near future.
  • Logical Volume Manager (LVM) is a tool for logical volume management. LVM can be used to create easy-to-maintain logical volumes, manage disk quotas using logical volumes, resize logical volumes on the fly, create software RAIDs, combine hard drives into a big storage pool, and more.
  • Ceph ManaGeR daemons (MGRs) are runtime daemons responsible for keeping track of runtime metrics and the current state of your Ceph cluster. They run alongside the monitor daemons to provide additional monitoring and interfaces to external monitoring and management systems.
  • Ceph MONitors (MONs) are responsible for maintaining the maps of the cluster required for the Ceph daemons to coordinate with each other. There should always be more than one MON running to increase the reliability and availability of your storage service.
  • Ceph Object Storage Daemons (OSDs) are the heart and soul of the Ceph storage platform. Each OSD manages a local device and together they provide the distributed storage. Rook automates the creation and management of OSDs, hiding as much of this complexity as possible based on the desired state declared in the CephCluster CR.
  • Placement Groups (PGs) are an internal implementation detail of how Ceph distributes data. You can allow the cluster to either make recommendations or automatically tune PGs based on how the cluster is used by enabling pg-autoscaling, as sketched right after this list.
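
As an example of the last point, pg-autoscaling can be enabled from the CephCluster CR through the pg_autoscaler manager module. The following is only a minimal, illustrative excerpt; check the CephCluster spec of the Rook version you deploy for the exact fields and defaults:

Enable pg-autoscaling (CephCluster excerpt)
spec:
  mgr:
    modules:
      # the pg_autoscaler module lets Ceph tune the number of PGs per pool automatically
      - name: pg_autoscaler
        enabled: true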

Prerequisites

In order to configure the Ceph storage cluster, the following prerequisites must be met:

  1. Kubernetes cluster v1.16 or higher;
  2. LVM needs to be available on the hosts where OSDs will be running;
  3. Raw devices (no partitions or formatted filesystems) or raw partitions (no formatted filesystem);
  4. At least three worker nodes.

The first two points are easy to check off: verify the version of your Kubernetes cluster and install the lvm2 package, as shown in the first snippet below. As for the third point, the second snippet shows how to confirm whether your devices or partitions contain a formatted filesystem: if the FSTYPE field is not empty, there is a filesystem on top of the corresponding device. In the example output, only vdb can be used for Ceph; vda and its partitions cannot.
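
For reference, here are the two commands just mentioned (the package installation assumes a CentOS host; use your distribution's package manager otherwise):

Check k8s Version and Install LVM
$ kubectl version --short
$ sudo yum install -y lvm2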

Formatted FS?
$ lsblk -f
NAME                  FSTYPE      LABEL UUID                                   MOUNTPOINT
vda
└─vda1                LVM2_member       eSO50t-GkUV-YKTH-WsGq-hNJY-eKNf-3i07IB
  ├─ubuntu--vg-root   ext4              c2366f76-6e21-4f10-a8f3-6776212e2fe4   /
  └─ubuntu--vg-swap_1 swap              9492a3dc-ad75-47cd-9596-678e8cf17ff9   [SWAP]
vdb

From the last point of the list it is clear that 3 worker nodes, and therefore 3 storage volumes, are required. Attach one empty volume to each worker node (refer to the OpenStack guide). In the volume creation dialog, it is important to set the Volume Source field to No source, empty volume. As for size, a few tens of GB per volume are enough.
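
If you prefer the OpenStack CLI over the dashboard, an empty volume can be created and attached roughly as follows; the size, volume name and server name are only placeholders to be adapted to your environment (repeat for each worker):

Attach Empty Volumes (OpenStack CLI)
$ openstack volume create --size 50 ceph-osd-worker-1
$ openstack server add volume k8s-worker-1 ceph-osd-worker-1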

Deploy the Rook Operator

The first step is to deploy the Rook operator. To do this, clone the repository from GitHub, move to the indicated folder and run

Rook Operator
$ git clone --single-branch --branch v1.8.7 https://github.com/rook/rook.git
$ cd rook/deploy/examples
$ kubectl create -f crds.yaml -f common.yaml -f operator.yaml

The operator, like the other components we will see shortly, is deployed in the rook-ceph namespace. Verify that the rook-ceph-operator pod is in the Running state before proceeding

Verify Operator
$ kubectl get all -l app=rook-ceph-operator -n rook-ceph
NAME                                      READY   STATUS    RESTARTS   AGE
pod/rook-ceph-operator-5ff4d5c446-4ldhx   1/1     Running   0          5h40m
NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/rook-ceph-operator-5ff4d5c446   1         1         1       5h40m
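
If the operator pod does not reach the Running state, inspecting its logs is usually the quickest way to understand why:

Operator Logs
$ kubectl -n rook-ceph logs deploy/rook-ceph-operator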

Create a Rook Ceph Cluster

Now that the Rook operator is running we can create the Ceph cluster. For the cluster to survive reboots, make sure the dataDirHostPath property is set to a path that is valid for your hosts (the default is /var/lib/rook). Create the cluster (the operation takes a few minutes)

Create Cluster
$ kubectl create -f cluster.yaml
# List pods in the rook-ceph namespace. You should be able to see the following pods once they are all running
$ kubectl -n rook-ceph get pod
NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-9r4hd                                            3/3     Running     0          5h51m
csi-cephfsplugin-dfffx                                            3/3     Running     0          5h51m
csi-cephfsplugin-mlr6c                                            3/3     Running     0          5h51m
csi-cephfsplugin-provisioner-8658f67749-bprh7                     6/6     Running     9          5h51m
csi-cephfsplugin-provisioner-8658f67749-lqlm6                     6/6     Running     24         5h51m
csi-rbdplugin-provisioner-6bc6766db-2m72j                         6/6     Running     21         5h51m
csi-rbdplugin-provisioner-6bc6766db-vfv6n                         6/6     Running     6          5h51m
csi-rbdplugin-r6kzg                                               3/3     Running     0          5h51m
csi-rbdplugin-slglp                                               3/3     Running     0          5h51m
csi-rbdplugin-xksk8                                               3/3     Running     0          5h51m
rook-ceph-crashcollector-k8s-worker-1.novalocal-685685cd4bfr2fp   1/1     Running     0          5h50m
rook-ceph-crashcollector-k8s-worker-2.novalocal-65799fd97cvbq78   1/1     Running     0          5h40m
rook-ceph-crashcollector-k8s-worker-3.novalocal-78499fc58dwhgwm   1/1     Running     0          5h51m
rook-ceph-mgr-a-774d799bc7-jfc9m                                  1/1     Running     0          5h50m
rook-ceph-mon-a-57498775bf-d9kjk                                  1/1     Running     0          5h51m
rook-ceph-mon-b-866d86c8ff-rj5g9                                  1/1     Running     0          5h51m
rook-ceph-mon-c-dbdc6994b-wtvfz                                   1/1     Running     0          5h50m
rook-ceph-operator-5ff4d5c446-4ldhx                               1/1     Running     0          5h53m
rook-ceph-osd-0-57bf74dc8-kj444                                   1/1     Running     0          5h40m
rook-ceph-osd-1-86dc8bf468-rsvld                                  1/1     Running     0          5h40m
rook-ceph-osd-2-5b87cf587d-9pqsx                                  1/1     Running     0          5h40m
rook-ceph-osd-prepare-k8s-worker-1.novalocal-9wzk2                0/1     Completed   0          9m10s
rook-ceph-osd-prepare-k8s-worker-2.novalocal-sgcdb                0/1     Completed   0          9m8s
rook-ceph-osd-prepare-k8s-worker-3.novalocal-lwpxz                0/1     Completed   0          9m5s

If you did not modify the cluster.yaml above, one OSD is expected to be created per node. The file, whose defaults are fine in most cases, contains many parameters that can be changed. Here you will find a detailed list.
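
To give an idea of the parameters involved, below is a minimal, illustrative excerpt of a CephCluster spec with the settings discussed above; the real cluster.yaml shipped with Rook contains many more fields (such as the Ceph container image):

CephCluster Excerpt
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  # path on the hosts where mons and OSDs cache their configuration;
  # it must be preserved for the cluster to survive reboots
  dataDirHostPath: /var/lib/rook
  mon:
    # three mons, each on a different node, for reliability
    count: 3
    allowMultiplePerNode: false
  storage:
    # let Rook consume the empty raw devices it finds on all nodes
    useAllNodes: true
    useAllDevices: true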

Cleanup

If you want to tear down the cluster and bring up a new one, be aware of the following resources that will need to be cleaned up:

  • rook-ceph namespace: The Rook operator and cluster created by operator.yaml and cluster.yaml (the cluster CRD);
  • /var/lib/rook: Path on each host in the cluster where configuration is cached by the ceph mons and osds.

Note

If you changed the default namespaces or paths such as dataDirHostPath in the sample yaml files, you will need to adjust these namespaces and paths throughout these instructions. Moreover, first you will need to clean up the resources created on top of the Rook cluster.
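
For example, if you created block storage on top of the cluster following the Rook examples, the application PVCs and the StorageClass would have to be removed first; the namespace and StorageClass names below are only placeholders:

Clean up Resources created on top of Rook (example)
$ kubectl -n my-app delete pvc --all
$ kubectl delete storageclass rook-ceph-block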

A namespace cannot be removed until all of its resources are removed. Therefore, to remove it, execute the commands in the following order

Delete CephCluster, Operator and related Resources
$ kubectl -n rook-ceph delete cephcluster rook-ceph
# Verify that the cluster CRD has been deleted (kubectl -n rook-ceph get cephcluster), before continuing.
# Remember that the path of the following files is "rook/deploy/examples".
$ kubectl delete -f operator.yaml
$ kubectl delete -f common.yaml
$ kubectl delete -f crds.yaml

At this point connect to each machine and delete /var/lib/rook, or the path specified by dataDirHostPath, as sketched below. If the cleanup instructions are not executed in the order above, or you otherwise have difficulty cleaning up the cluster, refer to the troubleshooting tips in the official Rook cleanup documentation.
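
The host-side cleanup amounts to removing the data directory on every node that hosted mons or OSDs; the hostname below is a placeholder (adjust the path if you changed dataDirHostPath):

Remove dataDirHostPath on the Hosts
$ ssh k8s-worker-1 "sudo rm -rf /var/lib/rook"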
