When we studied ConfigMap and Secret earlier, we met the Kubernetes concept of a Volume (storage volume). Through the volumes and volumeMounts fields, a Volume works like a “virtual disk” mounted into a Pod, injecting configuration into the Pod as files for its processes to use.
However, the Volumes we used back then could hold only a small amount of data, which is far from a real “virtual disk”.
Now let’s look at more advanced uses of Volume. We will cover the API objects that Kubernetes uses to manage storage resources (PersistentVolume, PersistentVolumeClaim, and StorageClass), and then use a local disk to create a storage volume that is actually usable.
1. PersistentVolume
We built a WordPress website in the Kubernetes cluster, but it has a serious problem: Pods have no persistence, so MariaDB cannot store its data “permanently”.
The containers in a Pod are created from an image, and the image itself is read-only, so a process can only read and write in a temporary storage area. Once the Pod is destroyed, that temporary storage is reclaimed immediately and the data is lost.
To make sure the data still exists after a Pod is destroyed and rebuilt, we need a way for the Pod to use a real “virtual disk”. How do we do that?
In fact, the Kubernetes Volume is already a good abstraction for data storage. It only states that there is a “storage volume”; the type, capacity, and storage method of that volume are left open. A Pod does not need to care about those specialized and complicated details: as long as volumeMounts is set, the Volume can be mounted into the container and used.
Kubernetes therefore builds on the Volume concept and adds the PersistentVolume object, which specifically represents a persistent storage device while hiding the underlying storage details. We only need to know that it stores data safely and reliably. (Because the name PersistentVolume is long, it is usually abbreviated to PV.)
So where do the PVs in the cluster come from?
As an abstraction of storage, a PV is backed by an actual storage device or file system, such as Ceph, GlusterFS, NFS, or even a local disk. Managing these is beyond the scope of Kubernetes, so a system administrator generally maintains them separately and then creates the corresponding PVs in Kubernetes.
Note that a PV is a system resource of the cluster, an object at the same level as a Node. A Pod has no right to manage it, only the right to use it.
2. PersistentVolumeClaim/StorageClass
Now that we have PVs, can we mount and use one directly in a Pod?
Not yet, because storage devices differ too much: some are fast and some are slow; some allow shared reads and writes and some allow only exclusive access; and capacities can be as large as the TB or PB level…
With so many kinds of storage devices, managing them all with a single PV object is asking too much and violates the “single responsibility” principle. Letting a Pod choose a PV directly is also inflexible. So Kubernetes added two more objects, PersistentVolumeClaim and StorageClass, using the “middle layer” idea to further refine how storage volumes are allocated and managed.
PersistentVolumeClaim, PVC for short, does what its name suggests: it requests storage resources from Kubernetes. A PVC is the object that a Pod uses. It acts as the Pod’s agent and applies to the system for a PV on the Pod’s behalf. Once the request succeeds, Kubernetes associates the PV with the PVC, an action called “binding” (bind).
However, the system holds many storage resources, and searching through all of them to find a suitable PV for a PVC would be cumbersome, so this is where StorageClass comes in.
StorageClass is a bit like the IngressClass we saw earlier. It abstracts a specific type of storage system (such as Ceph or NFS) and acts as a “coordinator” between PVC and PV, helping a PVC find a suitable PV. In other words, it simplifies the process of mounting a “virtual disk” in a Pod, so that the Pod never has to see the implementation details of the PV.
If this still feels only half clear, don’t worry; let’s use an everyday example for comparison. After all, we know less about storage systems than about the familiar CPU and memory, so the three new Kubernetes concepts of PV, PVC, and StorageClass are not particularly easy to grasp.
Here is the example. Suppose you need 10 sheets of paper for printing at the office, so you call the front desk and explain what you need.
- The act of “calling” corresponds to the PVC: it requests storage resources from Kubernetes.
- The front desk stocks office paper of various brands, sizes, and specifications, which corresponds to StorageClass.
- The front desk picks a brand according to your needs and takes a pack of A4 paper from stock. The pack may hold more than 10 sheets, but it still meets the requirement. The front desk then adds an entry to the register noting that you claimed office supplies. This process is the binding of PVC to PV.
- The pack of A4 paper in your hand is the storage object, the PV.
3. Use YAML to describe PersistentVolume
Kubernetes has many types of PV. Let’s start with the simplest form of local storage, HostPath. It is very similar to mounting a local directory with the -v parameter in Docker, and it is a good way to get a first feel for how PVs are used.
Because a Pod may run on any node of the cluster, we first need to act as the system administrator and create a directory on every node. This directory will be mounted into the Pod as a local storage volume.
To keep things simple, I created a directory named host-10m-pv under /tmp, representing a storage device with only 10MB of capacity.
With the storage in place, we can describe this PV object in YAML.
Unfortunately, kubectl create cannot create a PV object directly. You can only use kubectl api-resources and kubectl explain to look up the PV field descriptions, and then write the PV’s YAML description file by hand.
Here is a YAML example that you can use as a template for your own PV:
# host-10m-pv.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: host-10m-pv

spec:
  storageClassName: host-test
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Mi
  hostPath:
    path: /tmp/host-10m-pv/
storageClassName is the StorageClass just mentioned, the abstraction of the storage type. This PV is managed manually by us, so the name can be anything. I used host-test here, and you could just as well use manual, hand-work, or something similar.
accessModes defines the access mode of the storage device. Put simply, it is the read/write permission of the virtual disk, similar to file access modes in Linux. The 3 main modes in Kubernetes are:
- ReadWriteOnce: The storage volume can be read and written, but it can be mounted only by Pods on one node.
- ReadOnlyMany: The storage volume is read-only and cannot be written, and it can be mounted multiple times by Pods on any node.
- ReadWriteMany: The storage volume can be read and written, and it can also be mounted multiple times by Pods on any node.
Note that these 3 access modes are restrictions at the node level rather than the Pod level, because storage is a system-level concept and does not belong to the processes in a Pod.
Obviously, a local directory can only be used on its own node, so this PV uses ReadWriteOnce.
The third field, capacity, is easy to understand: it is the capacity of the storage device, which I set to 10MB here.
A reminder: Kubernetes follows international standards for storage capacity units. The KB/MB/GB we use in everyday speech are normally meant with a base of 1024, so they must be written as Ki/Mi/Gi. Be careful not to get this wrong, or the unit will not match the actual capacity.
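To make the difference between the two unit families concrete, here is a minimal Python sketch. It is not how Kubernetes itself parses quantities; the function name quantity_to_bytes is made up for this illustration, and it handles only whole numbers with the suffixes listed in the two tables.

```python
# Simplified sketch: only whole numbers with the suffixes below are handled,
# not the full Kubernetes quantity syntax.
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}   # base 1024
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}     # base 1000


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity string such as '10Mi' or '10M' to bytes."""
    # Check the two-letter binary suffixes first so 'Mi' is not read as 'M'.
    for suffix, factor in BINARY.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    for suffix, factor in DECIMAL.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a bare number is treated as a byte count


print(quantity_to_bytes("10Mi"))  # 10485760
print(quantity_to_bytes("10M"))   # 10000000
```

Writing 10M instead of 10Mi would therefore describe a volume that is almost 5% smaller than intended.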
The last field, hostPath, is the simplest. It specifies the local path of the storage volume, which is the directory we created on the node. With these fields describing the PV’s type, access mode, capacity, and storage location, the storage device is defined.
4. Use YAML to describe PersistentVolumeClaim
With the PV in place, the cluster has persistent storage that Pods can use. Next we define a PVC object to request storage from Kubernetes.
The following YAML is a PVC that requests a 5MB storage device with the access mode ReadWriteOnce:
# host-5m-pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: host-5m-pvc

spec:
  storageClassName: host-test
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Mi
The content of a PVC is very similar to that of a PV, but it does not represent actual storage. It is a “request” or “claim”, and its spec field describes the “desired state” of the storage.
So storageClassName and accessModes in the PVC are the same as in the PV, but there is no capacity field. Instead, resources.requests states how much capacity is wanted.
Kubernetes then uses the description in the PVC to find a PV that matches the StorageClass and capacity, and binds the PV and PVC together to allocate the storage, much like the earlier phone call for A4 paper.
5. Using PersistentVolume in Kubernetes
With the PV and PVC prepared, a Pod can now get persistent storage.
First use kubectl apply to create the PV object:
kubectl apply -f host-10m-pv.yml
Then use kubectl get pv to check its status:
The output shows that this PV has a capacity of 10MB, the access mode RWO (ReadWriteOnce), and the StorageClass host-test that we defined ourselves. Its status is Available, meaning it is ready to be allocated to a Pod at any time.
Next, we create the PVC to request storage resources:
kubectl apply -f host-5m-pvc.yml
kubectl get pvc
Once the PVC object is created, Kubernetes immediately searches the cluster for a PV that meets the requirements, based on StorageClass, resources, and other conditions. If it finds a suitable storage object, it “binds” the two together.
The PVC requests 5MB, but the only PV in the system is 10MB and there is no better fit, so Kubernetes has to allocate this PV. The extra capacity is a “bonus”.
You will see that both objects have the status Bound, meaning the storage request succeeded. The actual capacity of the PVC is the PV’s 10MB rather than the 5MB originally requested.
So what happens if we increase the capacity requested by the PVC, for example to 100MB?
Let’s delete the PVC first:
kubectl delete -f host-5m-pvc.yml
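Check the status of the PV with kubectl get pv. After the PVC is deleted, the PV shows Released rather than Available.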
To make the PV usable again, first delete the previous binding information from it (the claimRef section under spec):
kubectl edit pv host-10m-pv
Check the status of the PV again. It has returned to the normal Available state.
Change the capacity requested by the PVC to 100Mi, run kubectl apply, and check the status of the PV and PVC again.
You will see that the PVC stays in the Pending state, which means Kubernetes cannot find storage in the system that meets the request and cannot allocate any resources. Binding completes only once a PV that satisfies the request exists.
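The matching behaviour seen in this section (a 5Mi claim binds to the 10Mi PV, a 100Mi claim stays Pending) can be modelled with a short Python sketch. This is a simplified illustration, not the actual controller logic: the class names, the find_pv function, and the claim name host-100m-pvc are all made up for the example, and choosing the smallest candidate is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

MI = 2**20  # bytes in one Mi


@dataclass
class PV:
    name: str
    storage_class: str
    access_modes: List[str]
    capacity: int  # bytes


@dataclass
class PVC:
    name: str
    storage_class: str
    access_modes: List[str]
    request: int  # bytes


def find_pv(claim: PVC, volumes: List[PV]) -> Optional[PV]:
    """Return a PV that satisfies the claim, or None (the claim stays Pending)."""
    candidates = [
        pv for pv in volumes
        if pv.storage_class == claim.storage_class            # same class name
        and set(claim.access_modes) <= set(pv.access_modes)   # modes supported
        and pv.capacity >= claim.request                      # large enough
    ]
    # Assumption of this sketch: prefer the smallest PV that is big enough.
    return min(candidates, key=lambda pv: pv.capacity, default=None)


volumes = [PV("host-10m-pv", "host-test", ["ReadWriteOnce"], 10 * MI)]
small = PVC("host-5m-pvc", "host-test", ["ReadWriteOnce"], 5 * MI)
large = PVC("host-100m-pvc", "host-test", ["ReadWriteOnce"], 100 * MI)  # hypothetical

print(find_pv(small, volumes).name)  # host-10m-pv
print(find_pv(large, volumes))       # None
```

The second result corresponds to the Pending state: no PV is large enough, so nothing can be bound.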
6. Mount PersistentVolume for Pod
With persistent storage in place, we can now mount the volume in a Pod. First define the storage volume in spec.volumes, and then mount it into the container with containers.volumeMounts.
Because we are using a PVC, we use the persistentVolumeClaim field in volumes to specify the name of the PVC.
Below is the Pod’s YAML description file, which mounts the storage volume at the /tmp directory of the Nginx container:
# host-path-pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: host-pvc-pod

spec:
  volumes:
  - name: host-pvc-vol
    persistentVolumeClaim:
      claimName: host-5m-pvc

  containers:
  - name: ngx-pvc-pod
    image: nginx:alpine
    ports:
    - containerPort: 80
    volumeMounts:
    - name: host-pvc-vol
      mountPath: /tmp
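The chain of names in this file (volumeMounts refers to a volume, the volume refers to a PVC, and the bound PVC leads to a PV and its host directory) can be traced with a small Python sketch. The dictionaries mirror only the fields needed from the three YAML files; the bound table and the resolve_mounts function are made up for this illustration, since in a real cluster Kubernetes records the binding itself.

```python
# Plain dicts mirroring the relevant fields of the PV and Pod manifests.
pvs = {
    "host-10m-pv": {"spec": {"hostPath": {"path": "/tmp/host-10m-pv/"}}},
}
pod = {
    "spec": {
        "volumes": [
            {"name": "host-pvc-vol",
             "persistentVolumeClaim": {"claimName": "host-5m-pvc"}},
        ],
        "containers": [
            {"name": "ngx-pvc-pod",
             "volumeMounts": [{"name": "host-pvc-vol", "mountPath": "/tmp"}]},
        ],
    }
}
# Hand-written stand-in for the PVC-to-PV binding that Kubernetes maintains.
bound = {"host-5m-pvc": "host-10m-pv"}


def resolve_mounts(pod, bound, pvs):
    """Map each container mountPath to the host directory of the bound PV."""
    # Step 1: volume name -> PVC name, taken from spec.volumes.
    claims = {v["name"]: v["persistentVolumeClaim"]["claimName"]
              for v in pod["spec"]["volumes"]}
    result = {}
    for container in pod["spec"]["containers"]:
        for mount in container["volumeMounts"]:
            # Step 2: PVC name -> bound PV name -> hostPath on the node.
            pv_name = bound[claims[mount["name"]]]
            result[mount["mountPath"]] = pvs[pv_name]["spec"]["hostPath"]["path"]
    return result


print(resolve_mounts(pod, bound, pvs))  # {'/tmp': '/tmp/host-10m-pv/'}
```

The Pod never names the PV or the host directory directly; it only knows the PVC name, which is exactly the decoupling the “middle layer” is meant to provide.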
I drew the relationship between the Pod and the PVC/PV as a diagram (the accessModes field is omitted), which shows how they are connected:
Now we create this Pod and check its status:
kubectl apply -f host-path-pod.yml
kubectl get pod -o wide
Kubernetes scheduled it onto the worker node. So was the PV mounted successfully? Let’s enter the container with kubectl exec and run a few commands to find out:
A host-pvc.txt file was created in the container’s /tmp directory. According to the PV definition, it should land on the worker node’s disk, so we log in to the worker node to check:
You will see that the worker node’s local directory does contain a host-pvc.txt file, and its timestamp confirms that it is the file just created in the Pod.
Because the data produced by the Pod has been written to disk through the PV, if the Pod is deleted and recreated, the storage volume will still use this directory when it is mounted. The data remains intact, and persistent storage is achieved.
There is still one small problem. This PV is of the HostPath type, so the data lives only on this node. If the rebuilt Pod is scheduled to another node, the local directory mounted there will not be the previous storage location, and persistence breaks.
Therefore, HostPath PVs are generally used for testing, or for applications such as DaemonSets that are tightly tied to a node. In the next lesson we will look at how to achieve data persistence that works on any node.
7. Summary
- PersistentVolume, PV for short, is Kubernetes’ abstraction of a storage device and is maintained by the system administrator. It must clearly describe the device’s type, access mode, capacity, and other information.
- PersistentVolumeClaim, PVC for short, requests storage resources from the system on behalf of a Pod. It declares the storage requirements, and Kubernetes finds the most suitable PV and binds it.
- StorageClass abstracts a specific type of storage system, classifies and groups PV objects, and simplifies the PV/PVC binding process.
- HostPath is the simplest PV. The data is stored locally on the node, which is fast but cannot migrate with the Pod.
- A PVC is only a request. What the Pod actually uses is a volume, and the PV is mounted into the Pod through that volume.
Kubernetes has a special form of storage volume called emptyDir. Its life cycle is the same as the Pod’s, so it outlives individual containers, but it is not persistent storage. It can be used for temporary data or caching.
If the storage system conforms to the CSI standard, the ReadWriteOncePod access mode can also be used in accessModes to allow only a single Pod to read and write, which gives finer-grained control.
1. The StorageClass here is host-test, which we defined ourselves. Don’t we need to manually create a StorageClass object named host-test? If not, what is it and what role does it play?
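Answer: For PVs created purely by hand, no dedicated StorageClass object is needed; the name only has to be the same in the PV and the PVC so that they can be matched. A StorageClass object will be used later, with storage systems such as NFS.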