Kubernetes Quality of Service – QoS


Author: rab

Foreword

I covered Kubernetes' admission control policy earlier, but have you ever considered this question: if the Node where a Pod is running runs short of resources, how does Kubernetes decide which Pods to evict? This is where the Quality of Service class (QoS class) comes in. Kubernetes uses the QoS class of each Pod when evicting Pods from a Node that has insufficient resources.

When Kubernetes creates a Pod, it sets one of the following QoS classes to the Pod:

Guaranteed
Burstable
BestEffort

Which QoS class a Pod is assigned depends on the resource requests and limits set on the Pod's containers.
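The assignment rule can be sketched in a few lines of Python. This is a simplified illustration, not the Kubernetes implementation: the function name `qos_class` and the plain-dict Pod layout are hypothetical, init containers are ignored, and quantities are compared as strings (so "1Gi" and "1024Mi" would be treated as different).

```python
from typing import Any, Dict

RESOURCES = ("cpu", "memory")  # only these two resources are considered here


def qos_class(pod_spec: Dict[str, Any]) -> str:
    """Return the QoS class for a simplified Pod spec given as a plain dict.

    Hypothetical helper for illustration only.
    """
    any_set = False         # does any container set a cpu/memory request or limit?
    all_guaranteed = True   # does every container have limit == request for both?

    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        limits = resources.get("limits") or {}
        requests = dict(resources.get("requests") or {})

        for name in RESOURCES:
            # A limit without a request implies request == limit.
            if name in limits and name not in requests:
                requests[name] = limits[name]
            if name in limits or name in requests:
                any_set = True
            # Guaranteed needs a limit and an equal request for this resource.
            if name not in limits or requests.get(name) != limits[name]:
                all_guaranteed = False

    if not any_set:
        return "BestEffort"
    if all_guaranteed:
        return "Guaranteed"
    return "Burstable"


# A container with only a memory request and a larger memory limit.
memory_only = {
    "containers": [
        {"resources": {"limits": {"memory": "200Mi"},
                       "requests": {"memory": "100Mi"}}}
    ]
}
print(qos_class(memory_only))  # prints "Burstable"
```

Note that the check runs per container: a single container without limits is enough to keep the whole Pod out of the Guaranteed class.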

1. Pod whose QoS class is Guaranteed

1.1 Overview

For a Pod to be assigned the Guaranteed QoS class, every container in the Pod must have both a memory limit and a CPU limit, and each resource request must equal the corresponding limit. The Pod then requests a specific amount of resources and is capped at that same amount, so the resources it asked for are reserved for it. For this reason, the Guaranteed class is often used for critical applications that must always have the resources they need.

1.2 Case

Below is the manifest for a Pod with one container. The container has a memory request and a memory limit, both 200 MiB, and a CPU request and a CPU limit, both 700 milliCPU.

apiVersion: v1
kind: Pod
metadata:
  name: qos-demo
  namespace: qos-example
spec:
  containers:
  - name: qos-demo-ctr
    image: nginx
    resources:
      limits:
        memory: "200Mi"
        cpu: "700m"
      requests:
        memory: "200Mi"
        cpu: "700m"
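To make the units in this manifest concrete, here is a small sketch that converts quantity strings to plain numbers. The helper `parse_quantity` is hypothetical and handles only a subset of the suffixes Kubernetes accepts: "Mi" is a binary suffix (2^20 bytes) and "m" means one thousandth of a CPU.

```python
from decimal import Decimal

# Binary and decimal suffixes; only a subset of what Kubernetes accepts.
_SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as "200Mi" or "700m" to a plain number.

    Hypothetical helper for illustration only.
    """
    # Try two-character suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)  # no suffix: the value is already in base units


print(parse_quantity("200Mi"))  # prints 209715200 (bytes)
print(parse_quantity("700m"))   # prints 0.700 (CPU cores)
```

So the container above is limited to 209,715,200 bytes of memory and 0.7 of one CPU core.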

**Note:** If a Container specifies its own memory limit but does not specify a memory request, Kubernetes will automatically specify a memory request equal to the memory limit for it. Likewise, if a container specifies its own CPU limit but not a CPU request, Kubernetes will automatically assign it a CPU request equal to the CPU limit. In other words, in addition to the above case, a Pod whose QoS class is Guaranteed can also be written like this:

apiVersion: v1
kind: Pod
metadata:
  name: qos-demo
  namespace: qos-example
spec:
  containers:
  - name: qos-demo-ctr
    image: nginx
    resources:
      limits:
        memory: "200Mi"
        cpu: "700m"

At this point, Kubernetes automatically assigns it CPU requests equal to the CPU limit, and memory requests equal to the memory limit.
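The defaulting behaviour described in the note can be sketched as follows. This is only an illustration of the rule as stated in this article, covering CPU and memory; the function name `apply_request_defaults` and the dict layout are hypothetical, not a Kubernetes API.

```python
from typing import Dict


def apply_request_defaults(
    resources: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, str]]:
    """Fill in missing cpu/memory requests from the matching limits.

    Hypothetical helper; returns a new dict and leaves the input untouched.
    """
    limits = dict(resources.get("limits", {}))
    requests = dict(resources.get("requests", {}))
    for name in ("cpu", "memory"):
        # Only fill the gap: an explicit request is never overwritten.
        if name in limits and name not in requests:
            requests[name] = limits[name]
    return {"limits": limits, "requests": requests}


limits_only = {"limits": {"memory": "200Mi", "cpu": "700m"}}
print(apply_request_defaults(limits_only)["requests"])
# prints {'cpu': '700m', 'memory': '200Mi'}
```

After defaulting, requests and limits are equal for both resources, which is why the limits-only manifest still ends up in the Guaranteed class.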

View Pod details:

kubectl get pod qos-demo --namespace=qos-example --output=yaml

spec:
  containers:
    ...
    resources:
      limits:
        cpu: 700m
        memory: 200Mi
      requests:
        cpu: 700m
        memory: 200Mi
    ...
status:
  qosClass: Guaranteed

It can be seen that the QoS class of the Pod is Guaranteed.

2. Pod whose QoS class is Burstable

2.1 Overview

A Pod is assigned the Burstable QoS class when it does not meet the criteria for Guaranteed but at least one of its containers has a memory or CPU request or limit. The typical case is a resource request that is lower than the resource limit. Such a Pod can use more resources than it requests, up to its limit, but under high load it competes with other Pods for the extra capacity and may not get enough. Therefore, the Burstable class is typically used for applications that need flexibility but do not need a guaranteed fixed amount of resources.

2.2 Case

Below is the manifest for a Pod with one container. The container has a memory limit of 200 MiB and a memory request of 100 MiB.

apiVersion: v1
kind: Pod
metadata:
  name: qos-demo-2
  namespace: qos-example
spec:
  containers:
  - name: qos-demo-2-ctr
    image: nginx
    resources:
      limits:
        memory: "200Mi"
      requests:
        memory: "100Mi"

View Pod details:

kubectl get pod qos-demo-2 --namespace=qos-example --output=yaml

spec:
  containers:
  - image: nginx
    imagePullPolicy: Always
    name: qos-demo-2-ctr
    resources:
      limits:
        memory: 200Mi
      requests:
        memory: 100Mi
  ...
status:
  qosClass: Burstable

It can be seen that the QoS class of the Pod is Burstable.

3. Pod with QoS class BestEffort

3.1 Overview

A Pod is assigned the BestEffort QoS class when none of its containers set any memory or CPU requests or limits. Such a Pod can use whatever resources are free on the node, but it is not guaranteed to obtain enough and may be affected by other Pods on the node. Therefore, the BestEffort class is typically used for non-critical applications, such as test applications or less critical workloads.

3.2 Case

Below is the manifest for a Pod with one container. The container has no memory or CPU limits or requests.

apiVersion: v1
kind: Pod
metadata:
  name: qos-demo-3
  namespace: qos-example
spec:
  containers:
  - name: qos-demo-3-ctr
    image: nginx

View Pod details:

kubectl get pod qos-demo-3 --namespace=qos-example --output=yaml

spec:
  containers:
    ...
    resources: {}
  ...
status:
  qosClass: BestEffort

It can be seen that the QoS class of the Pod is BestEffort.

Then the question arises: when node resources are insufficient, which Pods should be deleted first to release resources?

BestEffort Pods are usually evicted first. They have no resource requests set, so they have the least protection against eviction. When node resources are insufficient, Kubernetes reclaims BestEffort Pods first to free up resources.
Burstable Pods have less protection than Guaranteed Pods. When node resources are insufficient, Burstable Pods are evicted before Guaranteed Pods.
Guaranteed Pods are usually the last to be evicted. Their resource requests and resource limits are equal, so they are treated as the most protected Pods.
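The ordering above can be sketched as a simple sort. This models only the coarse BestEffort, Burstable, Guaranteed order described in this article; the kubelet's actual ranking also takes into account how far a Pod's usage exceeds its requests and the Pod's priority. The names `eviction_order` and `_EVICTION_RANK` are hypothetical.

```python
from typing import List, Tuple

# Lower rank means "reclaimed earlier" in this simplified model.
_EVICTION_RANK = {"BestEffort": 0, "Burstable": 1, "Guaranteed": 2}


def eviction_order(pods: List[Tuple[str, str]]) -> List[str]:
    """Order (name, qos_class) pairs from first-evicted to last-evicted.

    Hypothetical helper; Pods in the same class keep their input order
    because Python's sort is stable.
    """
    ranked = sorted(pods, key=lambda pod: _EVICTION_RANK[pod[1]])
    return [name for name, _ in ranked]


pods = [
    ("qos-demo", "Guaranteed"),
    ("qos-demo-3", "BestEffort"),
    ("qos-demo-2", "Burstable"),
]
print(eviction_order(pods))
# prints ['qos-demo-3', 'qos-demo-2', 'qos-demo']
```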

Summary

QoS (Quality of Service) in Kubernetes (K8s) is a mechanism for classifying Pods according to their resource requests and limits. There are three QoS classes: Guaranteed, Burstable, and BestEffort. The class reflects how the CPU and memory requests and limits of a Pod are configured. Kubernetes uses QoS classes to manage resources on shared nodes so that each Pod receives a reasonable allocation.

When a Node runs short of resources, BestEffort Pods are evicted first, then Burstable Pods, and finally Guaranteed Pods.

-END
