Why Persistent Volume?
The need for persistent storage in Kubernetes arises for two reasons:
- Pods are ephemeral;
- data often has to be shared between different Pods that are part of the same application, or between the containers of the same Pod.
Kubernetes doesn't provide data persistence out of the box: data written to a container's filesystem is lost when the Pod is re-created. You therefore need to create and configure the actual physical storage and manage it yourself. Once it is configured, you can consume that storage through the Kubernetes storage components.
How does storage work in Kubernetes?
To make this possible, Kubernetes provides three components that you use to connect the actual physical storage to your Pod, so that the application inside the container can access it. They are listed below, with definitions taken from the official Kubernetes documentation:
- Persistent Volume (PV): a piece of storage in the cluster that has been provisioned by an administrator (static provisioning) or through Storage Classes (dynamic provisioning). It is a cluster resource, just as a node is, and it has a lifecycle independent of any individual Pod that uses the PV.
- Storage Class (SC): a way for administrators to describe the "classes" of storage they offer. Borrowing from the OOP paradigm, a dynamically provisioned PV can be seen as an instance of the SC, and the PVC plays the role of the constructor call that creates it.
- Persistent Volume Claim (PVC): a request for storage by a user. It is similar to a Pod: Pods consume node resources, and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory), while claims can request a specific size and specific access modes (more details below and in the official documentation).
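To make the relationship between a claim and a Pod concrete, here is a minimal sketch in Python that builds a PVC and a Pod as plain dictionaries and prints them as JSON, a format that `kubectl apply -f` accepts alongside YAML. Every name in it (`demo-claim`, `demo-pod`, the `nginx` image, the `/data` mount path, the 1Gi size) is an illustrative assumption and does not come from this guide.

```python
import json

# Hypothetical names, chosen only for illustration.
CLAIM_NAME = "demo-claim"

# The user's request for storage: a size and an access mode.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": CLAIM_NAME},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

# The Pod never names a PV directly: it references the claim,
# and the container mounts the resulting volume at a path.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-pod"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx",
                "volumeMounts": [{"name": "data", "mountPath": "/data"}],
            }
        ],
        "volumes": [
            {
                "name": "data",
                "persistentVolumeClaim": {"claimName": CLAIM_NAME},
            }
        ],
    },
}


def to_manifest(*objects):
    """Wrap several objects in a v1 List so that one file holds them all."""
    wrapper = {"apiVersion": "v1", "kind": "List", "items": list(objects)}
    return json.dumps(wrapper, indent=2)


if __name__ == "__main__":
    print(to_manifest(pvc, pod))
```

The point of the sketch is the chain of references: the container mounts a volume by name, the volume points to the claim, and the claim is what Kubernetes binds to a PV.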
From the descriptions of the three components, it is clear that PVs and SCs are managed by the administrators (backend), while PVCs concern the user (frontend). It also follows that there are two ways of provisioning PVs:
- static provisioning: a cluster administrator creates a number of PVs. They carry the details of the real storage, which is available for use by cluster users. They exist in the Kubernetes API and are available for consumption.
- dynamic provisioning: in this case the cluster administrator only creates a Storage Class manifest. Users then reference it (responsibly) in their PVCs, and the PVs their applications need are created on demand.
In the static approach, if no PV in the cluster meets the user's needs, the user has to contact the administrator and request the creation of a new PV with the desired specifications. This scenario, although more controlled than the dynamic one, can become inefficient when requests are numerous.
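The difference between the two approaches can be sketched by looking at which objects each side has to write. The Python snippet below is an illustration under stated assumptions: the object names, the `manual` class label, the `hostPath` backend (suitable only for single-node test clusters) and the provisioner string `example.com/hypothetical-provisioner` are all placeholders, not values from this guide or from a real storage driver.

```python
import json


def static_pv(name, size, host_path):
    """PV written up front by the administrator (static provisioning)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": "manual",  # assumed label used to match claims
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "hostPath": {"path": host_path},  # assumed backend, test clusters only
        },
    }


def storage_class(name, provisioner):
    """SC written by the administrator (dynamic provisioning)."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "reclaimPolicy": "Delete",
    }


def claim(name, size, class_name):
    """PVC written by the user in both approaches."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": class_name,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Static: the administrator must create a matching PV before the claim can bind.
static_objects = [
    static_pv("pv-static-demo", "5Gi", "/mnt/data"),
    claim("claim-static-demo", "5Gi", "manual"),
]

# Dynamic: no PV manifest at all; the claim names the SC and the
# provisioner behind it is expected to create the PV on demand.
dynamic_objects = [
    storage_class("fast-demo", "example.com/hypothetical-provisioner"),
    claim("claim-dynamic-demo", "5Gi", "fast-demo"),
]

if __name__ == "__main__":
    for title, objects in (("static", static_objects), ("dynamic", dynamic_objects)):
        print(f"# {title} provisioning")
        for obj in objects:
            print(json.dumps(obj, indent=2))
```

Note that the user's claim has the same shape in both cases. Only the administrator's side changes: one PV per request in the static case, a single reusable SC in the dynamic one.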
Parameterization
These Kubernetes components can be finely tuned through a number of parameters. We briefly present the theory behind their configuration.
Persistent Volume
The aspects concerning the PV are:
- Capacity. A PV has a specific storage capacity, set through the PV's capacity attribute (spec.capacity.storage). Examples of accepted values are 500k, 100M, 5G, 800Ki, 350Mi, 1Ti. A suffix containing the letter "i" (Ki, Mi, Gi, Ti) denotes a power-of-two multiple, while a plain SI suffix (k, M, G, T) denotes a power-of-ten multiple: 1Ki is 1024 bytes, whereas 1k is 1000 bytes. Note that the decimal kilo suffix is a lowercase k. More details are available in the Kubernetes documentation on resource quantities.
- Volume mode. Kubernetes supports two volume modes: Filesystem (the default) and Block. In Filesystem mode, the volume is mounted into a directory in the Pod. In Block mode, the volume is presented to the Pod as a raw block device, without any filesystem on it. This gives the Pod the fastest possible access to the volume, since there is no filesystem layer between the two, but the application must know how to handle a raw block device.
- Access Mode. The ways of accessing the volume are listed below. Not every storage provider, however, supports all three modes.
  - ReadWriteOnce (RWO): the volume can be mounted as read-write by a single node;
  - ReadOnlyMany (ROX): the volume can be mounted as read-only by many nodes;
  - ReadWriteMany (RWX): the volume can be mounted as read-write by many nodes.
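The three parameters above can be illustrated with a short Python sketch. It converts a capacity string to bytes, which shows the difference between decimal and binary suffixes, and it assembles the corresponding part of a PV spec. The helper names `quantity_to_bytes` and `persistent_volume` are hypothetical, the parser handles only the integer-style suffixes used in this section (the decimal kilo suffix is written as a lowercase k), and the storage backend field that a real PV also needs (for example `nfs` or `csi`) is deliberately left out.

```python
import json
from decimal import Decimal

# Power-of-ten and power-of-two suffixes used for storage quantities.
DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

ACCESS_MODES = {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany"}
VOLUME_MODES = {"Filesystem", "Block"}


def quantity_to_bytes(quantity):
    """Convert a capacity string such as '350Mi' or '5G' to bytes."""
    # Binary suffixes are checked first because they are two characters long.
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    for suffix, factor in DECIMAL_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # no suffix: plain bytes


def persistent_volume(name, capacity, access_modes, volume_mode="Filesystem"):
    """Build the parameter part of a PV manifest (backend field omitted)."""
    unknown = set(access_modes) - ACCESS_MODES
    if unknown:
        raise ValueError(f"unknown access modes: {sorted(unknown)}")
    if volume_mode not in VOLUME_MODES:
        raise ValueError(f"unknown volume mode: {volume_mode}")
    quantity_to_bytes(capacity)  # raises on a malformed capacity string
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": capacity},
            "volumeMode": volume_mode,
            "accessModes": list(access_modes),
        },
    }


if __name__ == "__main__":
    for q in ("500k", "100M", "5G", "800Ki", "350Mi", "1Ti"):
        print(q, "=", quantity_to_bytes(q), "bytes")
    pv = persistent_volume("pv-demo", "350Mi", ["ReadWriteOnce", "ReadOnlyMany"])
    print(json.dumps(pv, indent=2))
```

With these definitions, `quantity_to_bytes("800Ki")` gives 819200 while `quantity_to_bytes("500k")` gives 500000. A PV may list several access modes, as in the example, but that only declares what the volume supports: it does not mean every backend can honor them.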