k8s: data storage

Table of Contents

1. The concept of data storage

2. Basic storage

1. EmptyDir storage volume

2. hostPath storage volume

3. nfs shared storage volume

3. Advanced storage

1. PV (persistent volume)

2. PVC (persistent volume claim)

3. Static PV experiment

4. Dynamic PV experiment

4.1 Install nfs on the stor01 node and configure the nfs service

4.2 Create Service Account

4.3 Use Deployment to create NFS Provisioner

4.4 Create StorageClass

4.5 Create PVC and Pod tests


1. The concept of data storage

The life cycle of files on a container's disk is short, which causes problems when running important applications in containers. First, when a container crashes, the kubelet restarts it, but the files in the container are lost: the container restarts in a clean state (the original state of the image). Second, when multiple containers run in the same Pod, files usually need to be shared between them. The Volume abstraction in Kubernetes solves both problems: the containers in a Pod share a Volume through the Pause container.

2. Basic Storage

1. EmptyDir storage volume

EmptyDir is the most basic Volume type. An EmptyDir is an empty directory on the Host.

An EmptyDir is created when a Pod is assigned to a Node. Its initial content is empty, and there is no need to specify a corresponding directory or file on the host, because Kubernetes allocates a directory automatically. When the Pod is destroyed, the data in the EmptyDir is permanently deleted as well.

The uses of EmptyDir are as follows:

  • Temporary space, such as a temporary directory that is required when some applications are running and does not need to be retained permanently
  • A directory where one container needs to obtain data from another container (multi-container shared directory)
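
Beyond the disk-backed default, an emptyDir can also be backed by memory (tmpfs) via the medium field. The fragment below is a hedged sketch of that variant (the volume name cache is just an illustration), separate from the experiment that follows:

  volumes:
  - name: cache
    emptyDir:
      medium: Memory    #Back the volume with tmpfs (RAM) instead of the node's disk
      sizeLimit: 64Mi   #Optional cap on usage; exceeding it can get the Pod evicted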

Next, let’s use EmptyDir in a case of file sharing between two containers in the same Pod.

  • Create a pod-emptydir.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-emptydir
  namespace: default
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    #Define the container's mounts
    volumeMounts:
    #The storage volume name to use; it must match the name of a volume in the volumes field below
    - name: html
      #The directory inside the container to mount to
      mountPath: /usr/share/nginx/html/
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    #Mount the same volume at a different path in this container
    volumeMounts:
    - name: html
      mountPath: /data/
    command: ['/bin/sh','-c','while true; do echo $(date) >> /data/index.html; sleep 2; done']
  #Define the storage volumes
  volumes:
  #The storage volume name
  - name: html
    #The storage volume type
    emptyDir: {}
kubectl apply -f pod-emptydir.yaml
#Create pod

kubectl get pods -o wide
#View details

Two containers are defined above: busybox appends the current date to /data/index.html every two seconds, and we then check whether those dates can be fetched through the nginx (myapp) container's HTML directory, which proves that the emptyDir mounted in both containers is shared.
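
A quick way to verify the sharing (a sketch; substitute the Pod IP reported by kubectl get pods -o wide):

kubectl exec -it pod-emptydir -c myapp -- tail /usr/share/nginx/html/index.html
#The myapp container sees the timestamps the busybox container appends to /data/index.html

curl <pod-ip>
#The same growing list of timestamps is returned over HTTP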


2. hostPath storage volume

The data in EmptyDir is not persisted; it is destroyed when the Pod ends. If you simply want to persist data to the host, HostPath is an option.

HostPath mounts an actual directory on the Node host into the Pod for the container to use. With this design, even after the Pod is destroyed, the data still exists on the Node host.

  • Create a mounting directory on the node01 node
mkdir -p /data/pod/volume1
echo 'node01.kgc.com' > /data/pod/volume1/index.html
  • Create a mounting directory on the node02 node
mkdir -p /data/pod/volume1
echo 'node02.kgc.com' > /data/pod/volume1/index.html
  • Create Pod resources
vim pod-hostpath.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-hostpath
  namespace: default
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
    #Define the container's mounts
    volumeMounts:
    #The storage volume name to use; it must match the name of a volume in the volumes field below
    - name: html
      #The directory inside the container to mount to
      mountPath: /usr/share/nginx/html
      #Mount mode; false (the default) means read-write
      readOnly: false
  #The volumes field defines the host or distributed-filesystem storage volumes associated with the pause container
  volumes:
    #Storage volume name
    - name: html
      #hostPath means the volume is a directory on the host machine
      hostPath:
        #The directory path on the host machine
        path: /data/pod/volume1
        #DirectoryOrCreate means the directory is created automatically if it does not exist on the host
        type: DirectoryOrCreate
kubectl apply -f pod-hostpath.yaml
  • Access Test
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-hostpath 2/2 Running 0 37s 10.244.2.35 node02 <none> <none>

curl 10.244.2.35
node02.kgc.com
  • Delete the pod and rebuild it again to verify whether the original content can still be accessed
kubectl delete -f pod-hostpath.yaml
kubectl apply -f pod-hostpath.yaml

kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-hostpath 2/2 Running 0 36s 10.244.2.37 node02 <none> <none>

curl 10.244.2.37
node02.kgc.com
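
Note that hostPath data is per-node: the content survived only because the rebuilt Pod landed on node02 again. To make that deterministic you can pin the Pod to a node. Below is a minimal sketch (assuming the node name node02 from the output above), not part of the original experiment:

apiVersion: v1
kind: Pod
metadata:
  name: pod-hostpath-pinned
spec:
  nodeName: node02    #Bypass the scheduler and always place this Pod on node02, where its hostPath data lives
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  volumes:
    - name: html
      hostPath:
        path: /data/pod/volume1
        type: DirectoryOrCreate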

3. nfs shared storage volume

HostPath solves data persistence, but if the Node fails and the Pod is moved to another node, the problem reappears. At that point a separate network storage system is needed; NFS and CIFS are commonly used.

NFS is a network file system. You can build an NFS server and connect the storage in the Pod directly to it. Then, no matter which node the Pod is moved to, as long as the Node can reach the NFS server, the data can be accessed.

  • Install nfs on the stor01 node and configure the nfs service
mkdir /data/volumes -p
chmod 777 /data/volumes

vim /etc/exports
/data/volumes 192.168.10.0/24(rw,no_root_squash)

systemctl start rpcbind
systemctl start nfs

showmount -e
Export list for stor01:
/data/volumes 192.168.10.0/24
  • Master node operation
vim pod-nfs-vol.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-vol-nfs
  namespace: default
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  volumes:
    - name: html
      nfs:
        path: /data/volumes
        server: stor01
kubectl apply -f pod-nfs-vol.yaml

kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
pod-vol-nfs 1/1 Running 0 21s 10.244.2.38 node02
  • Create index.html on nfs server
cd /data/volumes
vim index.html
<h1>nfs stor01</h1>
  • Master node operation
curl 10.244.2.38
<h1>nfs stor01</h1>

kubectl delete -f pod-nfs-vol.yaml #Delete the nfs-backed pod and re-create it to verify that the data persists

kubectl apply -f pod-nfs-vol.yaml
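
After re-creating the Pod, the same check as before confirms the data persisted (the Pod IP will usually change, so read it from the output first):

kubectl get pods -o wide
#Note the new Pod IP

curl <new-pod-ip>
<h1>nfs stor01</h1>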

3. Advanced Storage

We learned earlier how to use NFS for storage, which requires users to build an NFS system and configure nfs in the yaml. Since Kubernetes supports many storage systems, it is unrealistic to expect users to master them all. To hide the details of the underlying storage implementation and make storage easier to use, k8s introduces two resource objects: PV and PVC.

  • PV (Persistent Volume): Persistent storage volume. It is used to describe or define a storage volume, which is usually defined by operation and maintenance engineers.
  • PVC (Persistent Volume Claim): It is a request for persistent storage. It is used to describe what kind of PV storage you want to use or what conditions you want to meet.

The usage logic of PVC: define a storage volume of type persistentVolumeClaim in the Pod and specify the required size when defining it. The PVC must establish a relationship with a matching PV: the claim is satisfied by the PV, and the PV in turn is carved out of the actual storage space. PV and PVC are the storage abstractions provided by Kubernetes.

The interaction between PV and PVC follows this life cycle:
Provisioning -> Binding -> Using -> Releasing -> Reclaiming

  • Provisioning: creating the PV, either directly (static) or dynamically through a StorageClass
  • Binding: assigning the PV to the PVC
  • Using: the Pod uses the Volume through the PVC; the StorageObjectInUseProtection admission controller (PVCProtection in 1.9 and earlier) prevents deletion of a PVC that is in use
  • Releasing: the Pod releases the Volume and the PVC is deleted
  • Reclaiming: reclaiming the PV, which can be kept for future use or deleted directly from the underlying storage

According to these 5 stages, a PV can be in one of 4 states:

Available: the PV is available and has not been bound by any PVC
Bound: the PV has been bound to a PVC
Released: the PVC has been deleted, but the resource has not yet been reclaimed by the cluster
Failed: automatic reclamation of the PV failed
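
You can watch these state transitions live with a standard kubectl flag:

kubectl get pv -w
#The STATUS column moves through Available -> Bound -> Released as PVCs are created and deleted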

The specific process of a PV from creation to destruction is as follows:

1. After a PV is created, its status is Available, waiting to be bound by a PVC.
2. Once bound by a PVC, the PV's status changes to Bound, and it can be used by Pods that reference the corresponding PVC.
3. After the Pod finishes with it, the PV is released and its status changes to Released.
4. A Released PV is reclaimed according to its defined reclaim policy. There are three policies: Retain, Delete and Recycle. Retain keeps everything in place: the cluster does nothing and waits for the user to manually handle the data in the PV and then delete the PV by hand. With Delete, Kubernetes automatically deletes the PV along with its data. With Recycle, Kubernetes deletes the data in the PV and then sets its status back to Available so it can be bound and used by a new PVC.

1. PV (persistent volume)

PV is an abstraction of storage resources. The following is the resource list file.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv2 #Note: PV is an abstraction of a storage resource; it is cluster-scoped, so no namespace is defined
spec:
  nfs: #Storage type, corresponding to the underlying real storage (there are many, such as NFS, GFS, CIFS, etc.)
    path: #(the exported path to mount)
    server: #(the storage server name or address)
  capacity: #Storage capacity, i.e. the amount of storage space
    storage: 2Gi
  accessModes: #Access modes
  storageClassName: #Storage class
  persistentVolumeReclaimPolicy: #Reclaim policy

Description of key configuration parameters of PV

Storage type

  • The actual underlying storage type, k8s supports multiple storage types, and the configuration of each storage type is different.

Storage capacity

  • Currently, only storage space settings (storage=1Gi) are supported. In the future, configuration of IOPS, throughput and other indicators may be added.

Access Modes

Used to describe the user application’s access permissions to the storage resource. The following access modes are supported:

  • ReadWriteOnce (RWO): Read and write permissions, but can only be mounted by a single node
  • ReadOnlyMany (ROX): Read-only permission, can be mounted by multiple nodes
  • ReadWriteMany (RWX): read and write permissions, can be mounted by multiple nodes

Reclaim Policy (persistentVolumeReclaimPolicy)

What to do with the PV after it is no longer in use. Three policies are currently supported:

  • Retain: retain the data; an administrator must clean it up manually
  • Recycle: clear the data in the PV, equivalent to running rm -rf /thevolume/*
  • Delete: the backing storage deletes the volume; this is typically used with cloud providers' storage services
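
For illustration, the reclaim policy of an existing PV can be changed with kubectl patch (a sketch assuming a PV named pv003, like the one created later in this section):

kubectl patch pv pv003 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

kubectl get pv pv003 -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
#Should now print: Retain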

Storage Class

A PV can specify a storage class through the storageClassName parameter.

  • PVs with a specific category can only be bound to PVCs that requested that category
  • PVs with no category set can only be bound to PVCs that do not request any category.

2. PVC (persistent volume claim)

PVC is a request for storage resources, used to declare requirements on storage space, access modes, and storage class.

Resource manifest file

apiVersion: v1
kind: PersistentVolumeClaim #The PVC resource type
metadata:
  name: pvc
  namespace: dev #A namespace can be specified
spec:
  accessModes: #Access modes
  selector: #Select PVs by label
  storageClassName: #Storage class
  resources: #Request space
    requests:
      storage: 5Gi

Description of key configuration parameters of PVC

Access Modes

  • Used to describe the user application’s access rights to storage resources.

Selector

  • By setting the Label Selector, PVC can be used to filter PVs that already exist in the system.
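
A minimal sketch of a PVC that filters PVs by label (pvc-with-selector is a hypothetical name; the label name: pv003 matches a PV defined later in this section):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-with-selector
spec:
  accessModes: ["ReadWriteMany"]
  selector:
    matchLabels:
      name: pv003    #Only PVs carrying this label are candidates for binding
  resources:
    requests:
      storage: 2Gi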

Storage Class (storageClassName)

  • When defining PVC, you can set the required back-end storage class. Only PVs with this class set can be selected by the system.

Resources

  • Describes a request for a storage resource

3. Static PV experiment

Use NFS as the backing storage to demonstrate PV usage: create five PVs, corresponding to five exported NFS paths.

1. Configure nfs storage

mkdir -p /data/volumes/v{1,2,3,4,5}

vim /etc/exports
/data/volumes/v1 192.168.10.0/24(rw,no_root_squash)
/data/volumes/v2 192.168.10.0/24(rw,no_root_squash)
/data/volumes/v3 192.168.10.0/24(rw,no_root_squash)
/data/volumes/v4 192.168.10.0/24(rw,no_root_squash)
/data/volumes/v5 192.168.10.0/24(rw,no_root_squash)

exportfs -arv

showmount -e

Official documentation: https://kubernetes.io/zh-cn/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolume

2. Define PV

Five PVs are defined here, each specifying its mount path, access modes, and capacity.

vim pv-demo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv001
  labels:
    name: pv001
spec:
  nfs:
    path: /data/volumes/v1
    server: stor01
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 1Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv002
  labels:
    name: pv002
spec:
  nfs:
    path: /data/volumes/v2
    server: stor01
  accessModes: ["ReadWriteOnce"]
  capacity:
    storage: 2Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv003
  labels:
    name: pv003
spec:
  nfs:
    path: /data/volumes/v3
    server: stor01
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 2Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv004
  labels:
    name: pv004
spec:
  nfs:
    path: /data/volumes/v4
    server: stor01
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 4Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv005
  labels:
    name: pv005
spec:
  nfs:
    path: /data/volumes/v5
    server: stor01
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 5Gi
kubectl apply -f pv-demo.yaml
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv001 1Gi RWO,RWX Retain Available 7s
pv002 2Gi RWO Retain Available 7s
pv003 2Gi RWO,RWX Retain Available 7s
pv004 4Gi RWO,RWX Retain Available 7s
pv005 5Gi RWO,RWX Retain Available 7s

3. Define PVC

The PVC's access mode here is ReadWriteMany (multi-node read-write). The access mode must be among those defined on a PV for the two to match. The PVC requests 2Gi, so it automatically matches a ReadWriteMany PV with 2Gi capacity; once the match succeeds, the PVC's status becomes Bound.

vim pod-vol-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
  namespace: default
spec:
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-vol-pvc
  namespace: default
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  volumes:
    - name: html
      persistentVolumeClaim:
        claimName: mypvc

kubectl apply -f pod-vol-pvc.yaml
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv001 1Gi RWO,RWX Retain Available 19m
pv002 2Gi RWO Retain Available 19m
pv003 2Gi RWO,RWX Retain Bound default/mypvc 19m
pv004 4Gi RWO,RWX Retain Available 19m
pv005 5Gi RWO,RWX Retain Available 19m

kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mypvc Bound pv003 2Gi RWO,RWX 22s

4. Test access

Create index.html on the storage server, write data, and view the corresponding page by accessing the Pod.

cd /data/volumes/v3/
echo "welcome to use pv3" > index.html

kubectl get pods -o wide
pod-vol-pvc 1/1 Running 0 3m 10.244.2.39 k8s-node02

curl 10.244.2.39
welcome to use pv3

4. Dynamic PV experiment

The PV and PVC modes described above require operations staff to create PVs first, after which developers define PVCs that bind to them one-to-one. If there are thousands of PVC requests, thousands of PVs would have to be created, which makes maintenance very expensive for operations staff. Kubernetes provides a mechanism to create PVs automatically, called StorageClass, which acts as a template for creating PVs.

Creating a StorageClass requires defining the attributes of the PV (storage type, size, etc.) plus the storage plug-in, such as Ceph, used to create such PVs. With these two pieces of information, Kubernetes can find the matching StorageClass for a user-submitted PVC, call the storage plug-in declared by that StorageClass, create the required PV automatically, and bind it.
Build StorageClass + NFS to implement dynamic PV creation on NFS

Dynamic PV creation as built into Kubernetes does not cover NFS, so an external storage volume plug-in is needed to provision the PVs. For details, see: https://kubernetes.io/zh/docs/concepts/storage/storage-classes/

This volume plug-in is called a Provisioner (storage provisioner); for NFS it is nfs-client. The external plug-in automatically creates PVs on the configured NFS server.

4.1 Install nfs on the stor01 node and configure the nfs service

mkdir /opt/k8s
chmod 777 /opt/k8s/

vim /etc/exports
/opt/k8s 192.168.10.0/24(rw,no_root_squash,sync)

systemctl restart nfs

4.2 Create Service Account

Create a Service Account to manage the permissions the NFS Provisioner needs to run in the k8s cluster, and bind it to rules covering PV, PVC, StorageClass, and other resources.

vim nfs-client-rbac.yaml
#Create a Service Account to manage the permissions of NFS Provisioner running in the k8s cluster
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
---
#Create cluster role
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: nfs-client-provisioner-clusterrole
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["create", "delete", "get", "list", "watch", "patch", "update"]
---
#Cluster role binding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: nfs-client-provisioner-clusterrolebinding
subjects:
- kind: ServiceAccount
  name: nfs-client-provisioner
  namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-clusterrole
  apiGroup: rbac.authorization.k8s.io
kubectl apply -f nfs-client-rbac.yaml

4.3 Use Deployment to create NFS Provisioner

The NFS Provisioner (i.e. nfs-client) has two functions: it creates a mount point (volume) under the NFS shared directory, and it associates the PV with that NFS mount point.

#In k8s 1.20+ the RemoveSelfLink feature gate is enabled by default, so the nfs provisioner reports an error when dynamically creating PVs. The workaround is as follows:
vim /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - command:
    - kube-apiserver
    - --feature-gates=RemoveSelfLink=false #Add this line
    - --advertise-address=192.168.80.20
...

kubectl apply -f /etc/kubernetes/manifests/kube-apiserver.yaml
kubectl delete pods kube-apiserver -n kube-system
kubectl get pods -n kube-system | grep apiserver
#Create NFS Provisioner
vim nfs-client-provisioner.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner #Specify Service Account account
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: nfs-storage #Configure the provisioner Name and ensure that the name is consistent with the provisioner name in the StorageClass resource
            - name: NFS_SERVER
              value: stor01 #Configure the bound nfs server
            - name: NFS_PATH
              value: /opt/k8s #Configure the bound nfs server directory
      volumes: #Declare nfs data volume
        - name: nfs-client-root
          nfs:
            server: stor01
            path: /opt/k8s
kubectl apply -f nfs-client-provisioner.yaml

kubectl get pod
NAME READY STATUS RESTARTS AGE
nfs-client-provisioner-cd6ff67-sp8qd 1/1 Running 0 14s

4.4 Create StorageClass

Create the StorageClass, which is responsible for matching PVCs, calling the NFS provisioner to do the provisioning work, and associating PVs with PVCs.

vim nfs-client-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client-storageclass
provisioner: nfs-storage #The name here should be consistent with the environment variable PROVISIONER_NAME in the provisioner configuration file
parameters:
  archiveOnDelete: "false" #false means that the data will not be archived when the PVC is deleted, that is, the data will be deleted
  
  
kubectl apply -f nfs-client-storageclass.yaml

kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nfs-client-storageclass nfs-storage Delete Immediate false 43s
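
For reference, a StorageClass also accepts a few optional top-level fields; the sketch below shows the standard ones with assumed values (Delete/Immediate/false are the defaults reflected in the output above):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client-storageclass
provisioner: nfs-storage
reclaimPolicy: Delete           #What happens to dynamically created PVs when their PVC is deleted (Delete or Retain)
volumeBindingMode: Immediate    #Immediate binds on PVC creation; WaitForFirstConsumer delays until a Pod uses the PVC
allowVolumeExpansion: false     #Whether bound PVCs may be resized (requires plug-in support)
parameters:
  archiveOnDelete: "false"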

4.5 Create PVC and Pod test

vim test-pvc-pod.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-nfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-client-storageclass #Associate the StorageClass object; must match the StorageClass name created above
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-storageclass-pod
spec:
  containers:
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command:
    - "/bin/sh"
    - "-c"
    args:
    - "sleep 3600"
    volumeMounts:
    - name: nfs-pvc
      mountPath: /mnt
  restartPolicy: Never
  volumes:
  - name: nfs-pvc
    persistentVolumeClaim:
      claimName: test-nfs-pvc #Consistent with the PVC name
kubectl apply -f test-pvc-pod.yaml
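
To verify dynamic provisioning end to end (a sketch; the auto-generated PV name and NFS subdirectory will differ in your cluster):

kubectl get pvc test-nfs-pvc
#STATUS should be Bound, with an auto-generated VOLUME name like pvc-<uid>

kubectl get pv
#A PV created by the provisioner should be bound to default/test-nfs-pvc

kubectl exec -it test-storageclass-pod -- sh -c 'echo "hello nfs" > /mnt/test.txt'

#On the stor01 NFS server, the file appears under the automatically created subdirectory:
ls /opt/k8s/
cat /opt/k8s/*/test.txt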