Rook provides a growing number of storage providers to a Kubernetes cluster, each with its own operator to deploy and manage the resources for the storage provider. One of these is Ceph: a highly scalable distributed storage solution for block storage, object storage, and shared filesystems with years of production deployments. The official Rook documentation can be found here.
Glossary
We present a small glossary that may be useful when reading this page:
- Container Storage Interface (CSI) is a set of specifications for container orchestration frameworks to manage storage. The CSI spec abstracts common storage features such as create/delete volumes, publish/unpublish volumes, stage/unstage volumes, and more. It is projected that CSI will be the only supported persistent storage driver in the near future.
- Logical Volume Manager (LVM) is a tool for logical volume management. LVM can be used to create easy-to-maintain logical volumes, manage disk quotas using logical volumes, resize logical volumes on the fly, create software RAIDs, combine hard drives into a single large storage pool, and more.
- The Ceph ManaGeR daemons (MGRs) are runtime daemons responsible for keeping track of runtime metrics and the current state of your Ceph cluster. They run alongside the monitor daemons to provide additional monitoring and interfaces to external monitoring and management systems.
- Ceph MONitors (MONs) are responsible for maintaining the maps of the cluster required for the Ceph daemons to coordinate with each other. There should always be more than one MON running to increase the reliability and availability of your storage service.
- Ceph Object Storage Daemons (OSDs) are the heart and soul of the Ceph storage platform. Each OSD manages a local device and together they provide the distributed storage. Rook automates the creation and management of OSDs, hiding the complexity as much as possible, based on the desired state declared in the CephCluster CR.
- Placement Groups (PGs) are an internal implementation detail of how Ceph distributes data. You can allow the cluster to either make recommendations or automatically tune PGs based on how the cluster is used by enabling pg-autoscaling.
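To make the last item concrete, PG autoscaling is enabled per pool on a running cluster. Below is a minimal sketch, assuming admin access to the Ceph CLI (for example from the Rook toolbox) and an example pool named replicapool, neither of which is part of this guide:

```
# Turn on the PG autoscaler for an example pool named "replicapool"
ceph osd pool set replicapool pg_autoscale_mode on

# Show current and suggested PG counts for every pool
ceph osd pool autoscale-status
```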
Prerequisites
In order to configure the Ceph storage cluster, the following prerequisites must be met:
- Kubernetes cluster v1.16 or higher;
- LVM needs to be available on the hosts where OSDs will be running;
- Raw devices or raw partitions (no formatted filesystems);
- At least three worker nodes.
The first two points are easy to tick off: check the version of your Kubernetes cluster with kubectl version --short and install (on CentOS) the required package with sudo yum install -y lvm2. As for the third point, you can confirm whether your partitions or devices contain a formatted filesystem with the following command.
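A minimal sketch with lsblk; the vda/vdb device names and the output below are illustrative and will differ on your nodes:

```
# List block devices and any filesystem found on them
lsblk -f

# Illustrative output: vda hosts the OS, vdb is raw and therefore usable by Ceph
# NAME    FSTYPE  LABEL  UUID  MOUNTPOINT
# vda
# ├─vda1  xfs            ...   /boot
# └─vda2  xfs            ...   /
# vdb
```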
If the FSTYPE field is not empty, there is a filesystem on top of the corresponding device. In that case you can use only vdb for Ceph and cannot use vda or its partitions.
From the last point of the list it is clear that three worker nodes, and therefore three storage volumes, are required. Attach one volume to each of your worker nodes (refer to the OpenStack guide). In the volume creation dialog, it is important to set the Volume Source field to No source, empty volume. As for size, a few tens of GB are enough.
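If you prefer the OpenStack CLI over the dashboard dialog, the equivalent steps look roughly like this; the volume and server names are placeholders and the size is only an example:

```
# Create an empty 50 GB volume (no source) for one worker node
openstack volume create --size 50 ceph-osd-worker-1

# Attach it to the corresponding worker node
openstack server add volume worker-1 ceph-osd-worker-1
```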
Deploy the Rook Operator
The first step is to deploy the Rook operator. To do this, clone the repository from GitHub, move to the examples folder, and apply the operator manifests as shown below.
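A sketch of the usual sequence from the Rook quickstart; the branch to check out and the exact path of the examples folder depend on the Rook release you deploy (recent releases use deploy/examples, older ones cluster/examples/kubernetes/ceph):

```
# Clone the Rook repository (pick the branch or tag matching your target release)
git clone --single-branch --branch master https://github.com/rook/rook.git
cd rook/deploy/examples

# Create the CRDs, the common resources and the operator itself
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
```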
The operator, like the other components that we will see shortly, is deployed in the rook-ceph namespace. Verify that the rook-ceph-operator pod is in the Running state before proceeding.
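A quick check, assuming the default namespace and the app=rook-ceph-operator label set by operator.yaml:

```
# The operator pod should report STATUS=Running before you continue
kubectl -n rook-ceph get pods -l app=rook-ceph-operator
```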
Create a Rook Ceph Cluster
Now that the Rook operator is running we can create the Ceph cluster. For the cluster to survive reboots, make sure you set the dataDirHostPath property to a path that is valid on your hosts (the default is /var/lib/rook). Create the cluster (the operation takes a few minutes).
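Assuming you are still in the Rook examples folder from the previous step, this amounts to applying the sample manifest and watching the pods come up:

```
# Create the CephCluster custom resource; the operator reacts by deploying mons, mgr and OSDs
kubectl create -f cluster.yaml

# Follow the progress; OSD pods appear only after the rook-ceph-osd-prepare jobs complete
kubectl -n rook-ceph get pods --watch
```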
If you did not modify the cluster.yaml above, it is expected that one OSD will be created per node. The file, whose defaults are fine in most cases, contains many parameters that can be changed. Here you will find a detailed list.
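To confirm the expected one-OSD-per-node layout, you can list the OSD pods together with the nodes they were scheduled on (assuming the default rook-ceph namespace and the app=rook-ceph-osd label):

```
# With the default cluster.yaml, expect one rook-ceph-osd pod per worker node
kubectl -n rook-ceph get pods -l app=rook-ceph-osd -o wide
```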
Cleanup
If you want to tear down the cluster and bring up a new one, be aware of the following resources that will need to be cleaned up:
- rook-ceph namespace: the Rook operator and cluster created by operator.yaml and cluster.yaml (the cluster CRD);
- /var/lib/rook: path on each host in the cluster where configuration is cached by the Ceph MONs and OSDs.
Note
If you changed the default namespaces or paths such as dataDirHostPath in the sample yaml files, you will need to adjust these namespaces and paths throughout these instructions. Moreover, first you will need to clean up the resources created on top of the Rook cluster.
A namespace cannot be removed until all of its resources are removed. Therefore, to delete it, we execute the commands in the following order.
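A sketch of that order, assuming the default names from the sample yaml files and that you are still in the Rook examples folder:

```
# 1. Delete the CephCluster custom resource first and wait for its pods to terminate
kubectl -n rook-ceph delete cephcluster rook-ceph

# 2. Then remove the operator, the common resources (which include the rook-ceph namespace) and the CRDs
kubectl delete -f operator.yaml
kubectl delete -f common.yaml
kubectl delete -f crds.yaml
```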
At this point connect to each machine and delete /var/lib/rook, or the path specified by dataDirHostPath.
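On each host this is a plain recursive delete; adjust the path if you customised dataDirHostPath:

```
# Run on every node that hosted Ceph daemons
sudo rm -rf /var/lib/rook
```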
If the cleanup instructions are not executed in the order above, or you otherwise have difficulty cleaning up the cluster, here are a few things to try.