Why Persistent Volume?
The need for persistent storage in Kubernetes arises for two reasons:
- ephemerality of the Pods;
- the ability to share data between different Pods, which are part of the same application, or between the containers of the same Pod.
Kubernetes doesn't provide data persistence out of the box, which means that when a Pod is re-created, its data is gone. You therefore need to create, configure, and manage the actual physical storage yourself. Once it is configured, you can consume that storage through the Kubernetes storage components.
How does storage work in Kubernetes?
To do this, Kubernetes provides 3 components that you use to connect the actual physical storage to your Pod, so that the application inside the container can access it. They are listed below, with definitions taken from the official guide:
- Persistent Volume (PV): a piece of storage in the cluster that has been provisioned by an administrator (static provisioning) or using Storage Classes (dynamic provisioning). It is a resource in the cluster, like CPU or RAM, and has a lifecycle independent of any individual Pod that uses the PV (reference to the guide).
- Storage Class (SC): a way for administrators to describe the "classes" of storage they offer (reference to the guide). Borrowing from the OOP paradigm, a dynamically provisioned PV can be thought of as an instance of an SC, with the PVC playing the role of the constructor call that creates it.
- Persistent Volume Claim (PVC): a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (more details later or in the guide).
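To make the chain from Pod to storage concrete, here is a minimal sketch of a claim and of a Pod that mounts it. The manifests are built as Python dictionaries and printed as JSON. The names `app-data`, `app`, `data`, and the mount path `/data` are assumptions chosen for illustration, not values from any real cluster.

```python
import json

# Hypothetical claim name, used only for illustration.
CLAIM_NAME = "app-data"

# The user's request for storage: a size and an access mode,
# with no detail about the underlying physical storage.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": CLAIM_NAME},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

# The Pod refers to the claim by name and mounts the resulting volume
# into its container at a path of its choosing.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx",
                "volumeMounts": [{"name": "data", "mountPath": "/data"}],
            }
        ],
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": CLAIM_NAME}}
        ],
    },
}

# Print both manifests as JSON documents.
for manifest in (pvc, pod):
    print(json.dumps(manifest, indent=2))
```

The Pod never mentions a PV directly: it only knows the claim, and the cluster is responsible for binding that claim to a suitable PV.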
From the descriptions of the 3 components it is clear that PVs and SCs are managed by the administrators (backend), while PVCs concern the user (frontend). It is also clear that there are two ways of provisioning PVs:
- static provisioning: a cluster administrator creates a number of PVs. They carry the details of the real storage, which is available for use by cluster users. They exist in the Kubernetes API and are available for consumption.
- dynamic provisioning: in this case the cluster administrator only creates an SC manifest, which users then rely on (responsibly) to instantiate the PVs their applications need.
In the static approach, if no PV present on the cluster meets the user's needs, the latter will have to contact the administrator and request the creation of a new PV with the desired specifications. This scenario, although more controlled than the dynamic one, can be inefficient if the requests become numerous.
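The dynamic approach can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `fast`, the claim name `db-data`, and especially the provisioner string are hypothetical placeholders, and a real cluster would use the provisioner of its own storage backend.

```python
import json

# Administrator side: a single StorageClass replaces a pool of
# hand-made PVs. The provisioner below is a hypothetical placeholder.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "fast"},
    "provisioner": "example.com/hypothetical-provisioner",
    "reclaimPolicy": "Delete",
    "volumeBindingMode": "Immediate",
}


def make_claim(name, size, class_name):
    """User side: build a PVC that asks the given class for `size` of storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": class_name,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Each new claim against the class triggers the creation of a matching PV,
# so the user does not have to ask the administrator for one.
pvc = make_claim("db-data", "10Gi", storage_class["metadata"]["name"])

print(json.dumps(storage_class, indent=2))
print(json.dumps(pvc, indent=2))
```

In the static approach the administrator would instead write one PersistentVolume manifest per piece of storage, and the claim would bind to whichever existing PV satisfies it.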
Parameterization
These Kubernetes components can be finely customized through a number of parameters. We briefly present the theoretical aspects of their customization.
Persistent Volume
The following parameters can be managed only if the static mode is chosen, since in the dynamic case the PVs are generated on the basis of the SC and the PVCs. The aspects concerning the PV are:
- Capacity. A PV has a specific storage capacity, set using the PV's capacity attribute. Examples of accepted values are 500k, 100M, 5G, 800Ki, 350Mi, 1Ti. The letter "i" after a prefix indicates a binary (power-of-two) multiplier rather than a decimal one, so 1Ki is 1024 bytes while 1k is 1000 bytes.
- Volume mode. Kubernetes supports the Filesystem (default) and Block volume modes. In the first case, the volume is mounted into Pods as a directory. In the second, the volume is presented to a Pod as a block device, without any filesystem on it. This mode gives a Pod the fastest possible access to a volume, with no filesystem layer between the Pod and the volume.
- Access Mode. The ways of accessing the volume are shown below. Not all providers, however, support the 3 modes listed.
- ReadWriteOnce (RWO): the volume can be mounted as read-write by a single node;
- ReadOnlyMany (ROX): the volume can be mounted read-only by many nodes;
- ReadWriteMany (RWX): the volume can be mounted as read-write by many nodes.
- Class. A PV can have a class, which is specified by setting the storageClassName attribute to the name of an SC. A PV of a particular class can only be bound to PVCs requesting that class. A PV with no storageClassName has no class and can only be bound to PVCs that request no particular class. In the static case, therefore, the class merely acts as a label.
- Reclaim Policy. Current reclaim policies are listed below. As in the case of access mode, policy support depends on the provider used.
- Retain: manual reclamation;
- Recycle: basic scrub (rm -rf /thevolume/*);
- Delete: associated storage asset is deleted.
- Mount Options. A Kubernetes administrator can specify additional mount options to apply when a PV is mounted on a node, using the mountOptions attribute. Mount options are not validated, so the mount will simply fail if one is invalid. Again, not all persistent volume types support mount options.
- Node Affinity. Kubernetes lets us restrict the set of nodes from which the volume can be accessed. Pods that use a PV will only be scheduled to nodes that are selected by the node affinity.
- Phase. Although it is not a parameter, we conclude the part on the PV with an overview of the phases a volume can be in.
- Available: a free resource that is not yet bound to a claim;
- Bound: the volume is bound to a claim;
- Released: the claim has been deleted, but the resource is not yet reclaimed by the cluster;
- Failed: the volume has failed its automatic reclamation.
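The parameters above can be seen together in a single statically provisioned PV. The sketch below assumes an NFS-backed volume. The server `nfs.example.com`, the export path, the node name `node-1`, the class name `manual`, and the PV name are all hypothetical. The small `to_bytes` helper is also an illustration only: it covers just the suffixes discussed in the capacity item, whereas the real Kubernetes quantity grammar accepts more forms.

```python
import json
import re

# Simplified multipliers: decimal suffixes are powers of 1000,
# binary ("i") suffixes are powers of 1024.
_MULTIPLIERS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def to_bytes(quantity):
    """Convert a quantity such as '5Gi' or '100M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", quantity)
    if match is None or match.group(2) not in _MULTIPLIERS:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(match.group(1)) * _MULTIPLIERS[match.group(2)]


pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "pv-example"},
    "spec": {
        "capacity": {"storage": "5Gi"},             # Capacity
        "volumeMode": "Filesystem",                 # Volume mode
        "accessModes": ["ReadWriteMany"],           # Access mode
        "storageClassName": "manual",               # Class
        "persistentVolumeReclaimPolicy": "Retain",  # Reclaim policy
        "mountOptions": ["hard", "nfsvers=4.1"],    # Mount options
        "nodeAffinity": {                           # Node affinity
            "required": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": "kubernetes.io/hostname",
                                "operator": "In",
                                "values": ["node-1"],
                            }
                        ]
                    }
                ]
            }
        },
        # The actual storage behind the PV (hypothetical NFS export).
        "nfs": {"server": "nfs.example.com", "path": "/exports/data"},
    },
}

print(json.dumps(pv, indent=2))
print(to_bytes("5G"), to_bytes("5Gi"))  # 5000000000 5368709120
```

The phase does not appear in the manifest because it is reported by the cluster in the object's status, not set by the administrator.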
Persistent Volume Claim
Now let's move on to the component that deals with claiming pieces of storage. A PVC can be used in both static and dynamic mode, but it was designed mainly for the user (frontend) side of dynamic provisioning. The aspects concerning the PVC are:
- Capacity. Claims, like Pods, can request specific quantities of a resource. In this case, the request is for storage and uses the same convention as the PV.
- Volume mode. Same conventions as PV.
- Access Mode. Same conventions as PV.
- Selector. Claims can specify a label selector to further filter the set of volumes. Only the volumes whose labels match the selector can be bound to the claim.
- Class.