Let's now apply the theoretical concepts to a Kubernetes cluster. The practical test will cover both static and dynamic provisioning.
In this guide we will use one of the most popular volume types in Kubernetes, namely Network File System (NFS). It's a distributed file system protocol that allows a user on a client computer to access files over a network much as if they were local storage. Of course, the NFS share must already exist: Kubernetes doesn't run NFS, the Pods just access it. If you are setting up an NFS server for the first time, you can consult one of the many guides on the web.
Static provisioning
Let's start with the simplest case, namely the static one. For convenience, we create a folder in which we will place only three .yaml files: one for the PV, one for the PVC and, finally, one for the application that will make use of the persistent data. After sharing the folder /home/share between the cluster nodes, we copy the following .yaml file.
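A minimal pv.yaml along these lines is consistent with the kubectl output shown a little further below (the NFS server address 192.168.1.10 is a placeholder: replace it with your own server):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mypv
spec:
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    # Placeholder address: replace with your NFS server and its exported path
    server: "192.168.1.10"
    path: /home/share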
Next we will use an Nginx image to carry out our tests. We then create a file in the shared folder, called index.html, containing a simple string like "Hello from Kubernetes Storage!". At this point, let's deploy the PV. Note that the component's status is Available for now.
$ kubectl apply -f pv.yaml
persistentvolume/mypv created
$ kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
mypv   100Mi      RWX            Retain           Available                                   5s
Let's move to the user side (frontend) and copy the following file
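A pvc.yaml sketch along these lines matches the output that follows (the empty storageClassName and the requested size equal to the full PV capacity are our assumptions; any amount up to the PV capacity would bind):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  # Empty string disables dynamic provisioning, so the claim binds to a static PV
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi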
Before applying this file, we need to create a namespace. PVs do not have namespaces (just like apiservices, nodes, and namespaces themselves), because they are created and made available at the cluster level. A PVC, on the other hand, is a request for storage made by the user, who certainly does not have administrator (backend) scope, and therefore it needs a namespace. So:
$ kubectl create ns static
namespace/static created
$ kubectl apply -f pvc.yaml -n static
persistentvolumeclaim/mypvc created
$ kubectl get pvc -n static
NAMESPACE   NAME    STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
static      mypvc   Bound    mypv     100Mi      RWX                           2m
# Let's display the PV again. Note that, after binding with the PVC, its status has changed
# to Bound and the CLAIM column now reports the claim it is bound to
$ kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM          STORAGECLASS   REASON   AGE
mypv   100Mi      RWX            Retain           Bound    static/mypvc                           5m
Now all we have to do is implement an application that takes advantage of the infrastructure we have built. For testing purposes we could also create a simple Pod, but here we will make use of a StatefulSet, the workload API object used to manage stateful applications (well suited to our case). We therefore copy
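A possible stateful.yaml, reconstructed from the output shown further below (Service and StatefulSet in a single file, separated by ---; names, labels, and ports are taken from that output, everything else is an assumption):

apiVersion: v1
kind: Service
metadata:
  name: mysvc
spec:
  selector:
    app: myapp
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mystate
spec:
  serviceName: mysvc
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: mycontainer
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            # All replicas share the same NFS-backed volume
            - name: mydata
              mountPath: /usr/share/nginx/html
      volumes:
        - name: mydata
          persistentVolumeClaim:
            claimName: mypvc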
Deploy the application, in the same namespace as the PVC, and verify that everything works correctly. To check it, you can run the curl command followed by the IP of the Service or Pod, or go directly to the folder specified in the container's mountPath (in this case /usr/share/nginx/html). So
$ kubectl get all -n static -o wide
NAME            READY   STATUS    RESTARTS   AGE   IP               NODE
pod/mystate-0   1/1     Running   0          13m   172.16.231.239   mycentos-1.novalocal
pod/mystate-1   1/1     Running   0          13m   172.16.94.103    mycentos-2.novalocal
pod/mystate-2   1/1     Running   0          13m   172.16.141.62    mycentos-ing.novalocal
NAME            TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE   SELECTOR
service/mysvc   ClusterIP   10.98.99.55   <none>        80/TCP    13m   app=myapp
NAME                       READY   AGE   CONTAINERS    IMAGES
statefulset.apps/mystate   3/3     13m   mycontainer   nginx
# We use, for example, the IP of the service and of pod/mystate-2
$ curl 10.98.99.55
Hello from Kubernetes Storage!
$ curl 172.16.141.62
Hello from Kubernetes Storage!
# Let's enter pod/mystate-1 and go to the path indicated in the StatefulSet manifest, or run the "curl localhost" command
$ kubectl exec -it pod/mystate-1 -n static -- bash
root@mystate-1:/$ curl localhost
Hello from Kubernetes Storage!
root@mystate-1:/$ cat /usr/share/nginx/html/index.html
Hello from Kubernetes Storage!
Finally, try to modify the index.html file from inside the Pod and verify that the changes are reflected in the file on the node (and vice versa).
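A possible check, using the same names and paths as above:

# Inside the Pod
$ kubectl exec -it pod/mystate-1 -n static -- bash
root@mystate-1:/$ echo "A new message" > /usr/share/nginx/html/index.html
root@mystate-1:/$ exit
# On the node exporting /home/share: the change is visible
$ cat /home/share/index.html
A new message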
Dynamic provisioning
The major difference between dynamic provisioning and the static case is that the PV is replaced by the SC. So let's copy the following .yaml file.
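An sc.yaml sketch consistent with the output below (the provisioner name is an assumption and must match the PROVISIONER_NAME configured in the external provisioner deployed later):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mysc
  annotations:
    # Marks this SC as the cluster default
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: false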
Deploy and view the newly created component.
$ kubectl apply -f sc.yaml
storageclass.storage.k8s.io/mysc created
$ kubectl get sc
NAME             PROVISIONER   RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
mysc (default)   nfs           Retain          WaitForFirstConsumer   false                  2s
Unfortunately, NFS doesn't provide an internal provisioner, but an external provisioner can be used. For this purpose, we will implement the following two .yaml files. The first is related to the service account: it creates the ServiceAccount together with the roles and role bindings it needs within the Kubernetes cluster.
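An abbreviated sketch of such a file, modeled on the upstream nfs-client provisioner RBAC manifests (resource names are assumptions; the upstream version also adds a Role and RoleBinding for leader election):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # Must match the namespace the provisioner is deployed into
    namespace: dynamic
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io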
The second file implements an automatic provisioner that uses your existing, already configured NFS server to support dynamic provisioning in Kubernetes.
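A sketch of this second file, modeled on the upstream nfs-client provisioner deployment (the image, the names, and the NFS server address and path are assumptions to adapt to your environment; PROVISIONER_NAME must match the provisioner field of the SC):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          # One commonly used image; newer releases are published as
          # nfs-subdir-external-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            # Must match the "provisioner" field of the StorageClass
            - name: PROVISIONER_NAME
              value: nfs
            # Placeholders: your NFS server and its exported path
            - name: NFS_SERVER
              value: "192.168.1.10"
            - name: NFS_PATH
              value: /home/share
      volumes:
        - name: nfs-client-root
          nfs:
            server: "192.168.1.10"
            path: /home/share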
Now we can deploy but, as already seen in static provisioning, we must first create a new namespace.
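For example (the file names rbac.yaml and provisioner.yaml are assumptions):

$ kubectl create ns dynamic
namespace/dynamic created
$ kubectl apply -f rbac.yaml -n dynamic
$ kubectl apply -f provisioner.yaml -n dynamic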
We then claim a piece of storage, instantiating a PV by means of a PVC. Copy
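A pvc.yaml sketch consistent with the output below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  # References the StorageClass created above
  storageClassName: mysc
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 60Mi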
So deploy the PVC and see what happens. Note that the PVC status is Pending. This is because the SC (see above) has VOLUMEBINDINGMODE equal to WaitForFirstConsumer. This means that the PV will only be created once an application that requires it exists.
$ kubectl get pvc -n dynamic
NAME                          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mypvc   Pending                                      mysc           3m3s
We use the same application seen in the static case and check again what happens to the SC, PV, and PVC supply chain.
$ k apply -f stateful.yaml -n dynamic
statefulset.apps/mystate created
service/mysvc created
$ k get sc,pv,pvc -n dynamic
NAME                                         PROVISIONER   RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/mysc (default)   nfs           Retain          WaitForFirstConsumer   false                  80m
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM           STORAGECLASS   AGE
persistentvolume/pvc-100108de-18fb-4b85-864e-56e204f5b2d8   60Mi       RWX            Retain           Bound    dynamic/mypvc   mysc           12s
NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mypvc   Bound    pvc-100108de-18fb-4b85-864e-56e204f5b2d8   60Mi       RWX            mysc           7m18s
Then, after the application is deployed, a PV is automatically generated (the system appends a hash code to the end of the component's name) with exactly the required capacity. The PVC is now in the Bound status and reports, in the adjacent column, the PV to which it is connected. If we go and check the /home/share folder, we will find a new directory with the composite name <namespace>-<pvc_name>-pvc-<hash_code>. We insert a simple index.html file in this directory and perform the same checks carried out in the static case.
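For example (the directory name is illustrative; use the one actually created on your share, and the Service or Pod IPs of the dynamic namespace):

# On the node exporting the NFS share
$ echo "Hello from dynamic storage!" > /home/share/dynamic-mypvc-pvc-100108de-18fb-4b85-864e-56e204f5b2d8/index.html
# Then query the Service (or a Pod), as in the static case
$ kubectl get svc -n dynamic
$ curl <service-ip>
Hello from dynamic storage!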
Limitations
In the "parametrization" paragraph of the Storage chapter, numerous customization possibilities are listed, accompanied, however, by a warning: not all the parameters presented are supported by the various plugins. The limitations of the implementations presented here are several:
- ACCESSMODES: the chosen option is irrelevant, because the data access mode is controlled by NFS (in particular, by the /etc/exports file);
- RECLAIMPOLICY: the Retain policy is used, even if you have chosen Delete;
- VOLUMEBINDINGMODE: regardless of the choice, it waits for the creation of the first Pod before creating the PV (only in the dynamic case);
- ALLOWVOLUMEEXPANSION: volume expansion is not allowed, even if you set it to true (only in the dynamic case).
These limitations are the result of poor integration between the parties, due to the lack of a real Storage provider. Here, in practice, the cluster was reading and writing data directly on the machines' hard drives. In the next sub-chapter we will put into practice a procedure that allows us to get around the problem, without the need to pay for a Storage provider (AWSElasticBlockStore, AzureFile, GCEPersistentDisk, etc.). We will introduce a layer between the VM hard drives and Kubernetes, which will act as a mediator. In this way we will be able to access a wider range of parameters, better able to meet our storage needs, while still using the volumes of the VMs.