Understanding Kubernetes Limits and Requests in One Article

When working with containers in Kubernetes, it is important to know which resources are involved and how much of each your workloads need. Some processes require more CPU or memory than others. Some are critical and should never be starved.

Knowing this, we should properly configure our containers and pods to get the best of both worlds.

In this article we will cover:

Introduction to Limits and Requests of Kubernetes

Practice case

Kubernetes Requests

Kubernetes Limits

CPU characteristics

Memory characteristics

Best practice

Namespace ResourceQuota

Namespace LimitRange

Summary

Introduction to Limits and Requests of Kubernetes

Limits and Requests are key Kubernetes settings, applied mainly to CPU and memory.

Kubernetes defines Limits as the maximum amount of a resource that a container may use, which means that the container can never consume more than the specified amount of memory or CPU.

Requests, on the other hand, refer to the minimum guaranteed amount of resources reserved for a container.

Practice case

Let’s take a look at the deployment below, which sets Limits and Requests on CPU and memory for two different containers.

kind: Deployment
apiVersion: apps/v1
...
template:
  spec:
    containers:
    - name: redis
      image: redis:5.0.3-alpine
      resources:
        limits:
          memory: 600Mi
          cpu: 1
        requests:
          memory: 300Mi
          cpu: 500m
    - name: busybox
      image: busybox:1.28
      resources:
        limits:
          memory: 200Mi
          cpu: 300m
        requests:
          memory: 100Mi
          cpu: 100m

If we deploy this to a node with 4 cores and 16 GB of RAM, we can derive the following.

The effective request of the Pod is 400 MiB of memory and 600 millicores of CPU, so the scheduler needs a node with at least that much free allocatable capacity.

The Redis container gets 512 CPU shares and the busybox container gets 102. Kubernetes always allocates 1024 shares per core, so redis: 1024 * 0.5 cores = 512 and busybox: 1024 * 0.1 cores ≈ 102.

If the Redis container tries to allocate more than 600 MiB of RAM, it will be OOM killed, most likely failing the Pod.

If Redis tries to use more than 100ms of CPU every 100ms (the node has 4 cores, so 400ms are available every 100ms), it will be CPU throttled, resulting in poor performance.

If the busybox container tries to allocate more than 200 MiB of RAM, it will be OOM killed, resulting in a failed Pod.

If busybox tries to use more than 30ms of CPU every 100ms, it will be CPU throttled, resulting in poor performance.
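The arithmetic in this practice case can be reproduced with a short Python sketch. The dictionary layout and the helper names are hypothetical and exist only for illustration; the constants are the ones quoted above (1024 shares per core and a 100ms scheduling period).

```python
# Hypothetical container specs mirroring the practice case.
# CPU values are in millicores, memory values in MiB.
CONTAINERS = {
    "redis": {"req_cpu": 500, "req_mem": 300, "lim_cpu": 1000, "lim_mem": 600},
    "busybox": {"req_cpu": 100, "req_mem": 100, "lim_cpu": 300, "lim_mem": 200},
}

SHARES_PER_CORE = 1024  # CPU shares granted per requested core
PERIOD_MS = 100         # scheduling period used in the text


def cpu_shares(request_millicores):
    """CPU shares derived from a CPU request, truncated to an integer."""
    return request_millicores * SHARES_PER_CORE // 1000


def cpu_quota_ms(limit_millicores):
    """CPU time in ms a container may use per period before it is throttled."""
    return limit_millicores * PERIOD_MS / 1000


def pod_requests(containers):
    """Effective Pod request: the sum of the requests of its containers."""
    cpu = sum(c["req_cpu"] for c in containers.values())
    mem = sum(c["req_mem"] for c in containers.values())
    return cpu, mem


if __name__ == "__main__":
    print("pod request (millicores, MiB):", pod_requests(CONTAINERS))
    for name, spec in CONTAINERS.items():
        print(name, cpu_shares(spec["req_cpu"]), cpu_quota_ms(spec["lim_cpu"]))
```

Running it gives a Pod request of 600 millicores and 400 MiB, 512 and 102 shares, and quotas of 100ms and 30ms per period, matching the figures in the text.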

Kubernetes Requests

Kubernetes defines a request as the minimum guaranteed amount of resources used by a container.

Basically, it sets the amount of resources reserved for the container, whether or not the container actually uses them.

When a Pod is scheduled, kube-scheduler checks its requests in order to assign it to a node whose free allocatable resources cover at least the sum of the requests of all containers in the Pod. If the requests are higher than the resources available on every node, the Pod will not be scheduled and will remain in the Pending state.

For more information about Pending status, see Understanding Kubernetes Pod pending problems [1].

In this example, in the container definition, we set a request for 100m (100 millicores) of CPU and 4Mi of memory.

resources:
  requests:
    cpu: 0.1
    memory: 4Mi

Requests are usually used in the following scenarios:

When assigning a Pod to a Node, all specified requests for containers in the Pod are satisfied.

At runtime, the requested amount is guaranteed as the minimum available to the containers in this Pod.

Kubernetes Limits

Kubernetes defines Limits as the maximum amount of resources used by a container.

This means that the container can never consume more than the specified amount of memory or CPU.

resources:
  limits:
    cpu: 0.5
    memory: 100Mi

Limits are usually used in the following scenarios:

When assigning Pods to a node, if a container sets a limit but no request, Kubernetes by default sets the request equal to the limit.

At runtime, Kubernetes will check whether the amount of resources consumed by the containers in the Pod is higher than the amount indicated by the limit.

CPU characteristics

CPU is a compressible resource, which means it can be stretched to meet all demands. If processes ask for more CPU than is available, some of them will be throttled.

CPU represents computing processing time, measured in cores.

You can use millicores (m) to express quantities smaller than a core (for example, 500m is half a core).

The minimum quantity is 1m.

A node may have more than one core available, so it is possible to request more than 1 CPU.
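As a small illustration of these units, the following Python sketch converts a CPU quantity written either as cores or as millicores into an integer number of millicores. The function name cpu_to_millicores is hypothetical, and rejecting values finer than 1m is this sketch's reading of the minimum-quantity rule above, not a copy of the Kubernetes parser.

```python
from fractions import Fraction


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "500m", "0.1" or 2 to millicores."""
    text = str(quantity).strip()
    if text.endswith("m"):
        # Already expressed in millicores, e.g. "500m".
        return int(text[:-1])
    # Plain numbers are cores; Fraction avoids floating-point rounding.
    millicores = Fraction(text) * 1000
    if millicores.denominator != 1:
        raise ValueError(f"{quantity!r} is finer than the 1m minimum")
    return int(millicores)


if __name__ == "__main__":
    for value in ("500m", "0.1", "0.5", 2):
        print(value, "->", cpu_to_millicores(value))
```

With this helper, "0.5" and "500m" both come out as 500, which is why the two notations can be used interchangeably in manifests.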

Memory characteristics

Memory is an incompressible resource, meaning it cannot be stretched like a CPU. If a process doesn’t get enough memory to do work, the process is killed.

In Kubernetes, the unit of memory is bytes.

You can use E, P, T, G, M, k for Exabyte, Petabyte, Terabyte, Gigabyte, Megabyte, and kilobyte, although only the last four are commonly used (e.g., 500M, 4G).

WARNING: don’t use a lowercase m for memory (it stands for millibytes, which is ridiculously low).

You can define Mebibytes with Mi, and the other binary units with Ei, Pi, Ti, Gi, and Ki (for example, 500Mi).

A Mebibyte is 2 to the power of 20 bytes, and its analogs (Kibibyte, Gibibyte…) are likewise powers of 2. The unit was created to avoid confusion with the Kilo and Mega prefixes of the metric system, which are multiples of 1000. You should prefer this notation, because memory sizes are conventionally counted in powers of 2.
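To make the difference between these suffixes concrete, here is a minimal Python sketch that converts a memory quantity to bytes. The function name memory_to_bytes and the two lookup tables are hypothetical; it handles only the suffixes listed above and is not the Kubernetes quantity parser.

```python
from fractions import Fraction

# Decimal (power-of-10) and binary (power-of-2) suffixes from the text.
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9,
           "T": 10**12, "P": 10**15, "E": 10**18}
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
          "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}


def memory_to_bytes(quantity):
    """Convert a memory quantity such as "500M" or "500Mi" to bytes."""
    text = quantity.strip()
    # Check the two-letter binary suffixes before the one-letter ones.
    for suffix, factor in BINARY.items():
        if text.endswith(suffix):
            return Fraction(text[:-2]) * factor
    for suffix, factor in DECIMAL.items():
        if text.endswith(suffix):
            return Fraction(text[:-1]) * factor
    if text.endswith("m"):
        # Lowercase m means thousandths of a byte.
        return Fraction(text[:-1]) / 1000
    return Fraction(text)


if __name__ == "__main__":
    for value in ("500M", "500Mi", "500m"):
        print(value, "=", memory_to_bytes(value), "bytes")
```

The three inputs differ only in their suffix, yet they evaluate to 500000000 bytes, 524288000 bytes, and half a byte respectively, which is the reason for the warning about lowercase m.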

Best practice

In Kubernetes, you should rarely use limits to control your resource usage. This is because if you want to avoid starvation (make sure every important process gets its share), you should use requests first.

By setting a limit, you’re just preventing the process from retrieving additional resources in special cases, causing OOM killing on the memory side and Throttling on the CPU side (the process will need to wait for the CPU to be available again).

For more information, check out the article about OOM and Throttling [2].

If you set a request value equal to the limit in all containers in a pod, that pod will get a guaranteed quality of service.
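The rule above can be sketched as a small Python function. Besides Guaranteed, Kubernetes has two other quality of service classes, Burstable and BestEffort, which the sketch returns for the remaining cases. The function name qos_class and the dictionary layout are hypothetical, and quantities are compared as plain strings, so "0.5" and "500m" would not be treated as equal here.

```python
def qos_class(containers):
    """Classify a Pod from a list of container resource dicts.

    Each container is a dict with optional "requests" and "limits"
    dicts, keyed by "cpu" and "memory".
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request falls back to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    redis = {"requests": {"cpu": "500m", "memory": "300Mi"},
             "limits": {"cpu": "1", "memory": "600Mi"}}
    pinned = {"requests": {"cpu": "500m", "memory": "300Mi"},
              "limits": {"cpu": "500m", "memory": "300Mi"}}
    print(qos_class([redis]))
    print(qos_class([pinned]))
    print(qos_class([{}]))
```

The Redis container from the practice case has requests below its limits, so a Pod containing it is classified as Burstable rather than Guaranteed.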

Also note that Pods whose resource usage is higher than their requests are more likely to be evicted, so setting very low requests will do more harm than good. For details, see Pod eviction and Quality of Service [3].

Namespace ResourceQuota

Thanks to namespaces, we can isolate Kubernetes resources into different groups, also known as tenants.

With ResourceQuota, you can set a memory or CPU limit for the entire namespace, ensuring that entities within it cannot consume more than this amount.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-demo
spec:
  hard:
    requests.cpu: 2
    requests.memory: 1Gi
    limits.cpu: 3
    limits.memory: 2Gi

requests.cpu: The maximum sum of CPU requests across all Pods in this namespace.

requests.memory: The maximum sum of memory requests across all Pods in this namespace.

limits.cpu: The maximum sum of CPU limits across all Pods in this namespace.

limits.memory: The maximum sum of memory limits across all Pods in this namespace.

Then, apply it to your namespace.

kubectl apply -f resourcequota.yaml --namespace=mynamespace

You can list the current ResourceQuota for a namespace with the following command.

kubectl get resourcequota -n mynamespace

Note that if you set a ResourceQuota for a specific resource in a namespace, then you will need to specify the corresponding limit or request for each Pod in that namespace. Otherwise, Kubernetes will return a “failed quota” error.

Error from server (Forbidden): error when creating "mypod.yaml": pods "mypod" is forbidden: failed quota: mem-cpu-demo: must specify limits.cpu,limits.memory,requests.cpu,requests.memory

If you try to add a new Pod whose container limit or request exceeds the current ResourceQuota, Kubernetes will return an “exceeded quota” error.

Error from server (Forbidden): error when creating "mypod.yaml": pods "mypod" is forbidden: exceeded quota: mem-cpu-demo, requested: limits.memory=2Gi,requests.memory=2Gi, used: limits.memory=1Gi,requests.memory=1Gi, limited: limits.memory=2Gi,requests.memory=1Gi
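The comparison behind the “exceeded quota” error can be sketched in a few lines of Python. The function name check_quota is hypothetical and the real admission logic covers more cases; the figures are the requested, used, and limited values from the error message above.

```python
GI = 2**30  # bytes in one Gi


def check_quota(hard, used, requested):
    """Return the resource names whose quota the new Pod would exceed."""
    return [
        name
        for name, amount in requested.items()
        if name in hard and used.get(name, 0) + amount > hard[name]
    ]


if __name__ == "__main__":
    hard = {"limits.memory": 2 * GI, "requests.memory": 1 * GI}
    used = {"limits.memory": 1 * GI, "requests.memory": 1 * GI}
    requested = {"limits.memory": 2 * GI, "requests.memory": 2 * GI}
    print(check_quota(hard, used, requested))
```

Both entries are reported, because the 1Gi already used plus the 2Gi requested is above the 2Gi quota for limits and above the 1Gi quota for requests.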

Namespace LimitRange

ResourceQuotas are useful if we want to limit the total amount of resources a namespace can allocate. But what happens if we want to provide default values for the elements inside?

A LimitRange is a Kubernetes policy that constrains the resource settings of each entity in a namespace.

apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-resource-constraint
spec:
  limits:
  - default:
      cpu: 500m
    defaultRequest:
      cpu: 500m
    min:
      cpu: 100m
    max:
      cpu: "1"
    type: Container

default: Containers created without a limit get this value as their limit.

defaultRequest: Containers created without a request get this value as their request.

min: Created containers cannot have a smaller limit or request than this.

max: Created containers cannot have limits or requests greater than this value.

Later, if you create a new Pod with no requests or limits set, LimitRange will automatically set these values for all its containers.

Limits:
  cpu: 500m
Requests:
  cpu: 500m

Now, imagine that you add a new Pod with a CPU limit of 1200m. You will get the following error.

Error from server (Forbidden): error when creating "pods/mypod.yaml": pods "mypod" is forbidden: maximum cpu usage per Container is 1, but limit is 1200m

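Note that on some clusters every container effectively gets a 100m CPU request even if you never created a LimitRange yourself, because the platform ships a default LimitRange in the namespace. Without any LimitRange, a container that sets neither a request nor a limit has no CPU request at all.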
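The defaulting and validation that a LimitRange performs can be sketched in Python as follows. The function name apply_limit_range is hypothetical, the values mirror the cpu-resource-constraint manifest (in millicores), and the error text only imitates the message shown above. It is a simplified model of the behaviour, not the admission controller itself.

```python
# Values from the cpu-resource-constraint LimitRange, in millicores.
LIMIT_RANGE = {"default": 500, "defaultRequest": 500, "min": 100, "max": 1000}


def apply_limit_range(cpu_request=None, cpu_limit=None, lr=LIMIT_RANGE):
    """Fill in missing CPU values, then validate them against min and max."""
    if cpu_request is None:
        # A container with a limit but no request gets request = limit;
        # with neither set, the LimitRange defaultRequest applies.
        cpu_request = cpu_limit if cpu_limit is not None else lr["defaultRequest"]
    if cpu_limit is None:
        cpu_limit = lr["default"]
    if cpu_limit > lr["max"]:
        raise ValueError(
            f"maximum cpu usage per Container is {lr['max']}m, "
            f"but limit is {cpu_limit}m"
        )
    if cpu_request < lr["min"]:
        raise ValueError(
            f"minimum cpu usage per Container is {lr['min']}m, "
            f"but request is {cpu_request}m"
        )
    if cpu_request > cpu_limit:
        raise ValueError("cpu request must not exceed the cpu limit")
    return cpu_request, cpu_limit


if __name__ == "__main__":
    print(apply_limit_range())  # container with nothing set
    try:
        apply_limit_range(cpu_limit=1200)
    except ValueError as err:
        print("rejected:", err)
```

A container with nothing set receives the defaults from the manifest, while a 1200m limit is rejected because it is above the 1 CPU maximum.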

Summary

Choosing the right requests and limits for our Kubernetes cluster is key to getting the best energy consumption and cost.

Allocating too many resources to our Pods can cause costs to skyrocket.

Sizing Pods too small, or dedicating very little CPU or memory to them, will result in applications not functioning properly and even in Pods being evicted.

As mentioned earlier, Kubernetes limits should not be used except in very specific circumstances, as they can cause more harm than good. In low-memory conditions, containers have the potential to be killed, and in low-CPU conditions, containers may be throttled.

For requests, use them when you need to ensure that a process gets a guaranteed share of a resource.