Introduction
Out of memory (OOM) errors and CPU throttling are major pain points for resource handling in cloud applications when using Kubernetes.
Why is that?
CPU and memory requirements in cloud applications are becoming increasingly important because they are directly related to your cloud costs.
With limits and requests, you can configure how much memory and CPU a pod can reserve and consume, preventing resource starvation and keeping cloud costs under control.
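As a sketch, requests and limits are set per container in the pod spec; the names, image, and values below are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app            # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25     # illustrative image
      resources:
        requests:           # what the scheduler reserves for the container
          memory: "128Mi"
          cpu: "250m"
        limits:             # hard caps enforced at runtime
          memory: "256Mi"
          cpu: "500m"
```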
Pods may be evicted through preemption or node pressure if the node does not have enough resources.
When a process runs out of memory (OOM), it is killed because it used more memory than it is allowed to.
If CPU consumption is higher than the limit, the process is not killed; instead, it starts being throttled.
But how do you proactively monitor how close a Kubernetes pod is to OOM and CPU throttling?
Kubernetes OOM
Each container in a Pod requires memory to run.
Kubernetes limits are set per container in a Pod definition or Deployment definition.
All modern Unix systems have a way to kill processes when they need to reclaim memory. In Kubernetes, this is flagged as exit code 137, with the reason OOMKilled.
State:          Running
  Started:      Thu, 10 Oct 2019 11:14:13 +0200
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
  Started:      Thu, 10 Oct 2019 11:04:03 +0200
  Finished:     Thu, 10 Oct 2019 11:14:11 +0200
Exit code 137 means that the process used more memory than allowed and was terminated.
This relies on a Linux kernel feature: the kernel assigns an oom_score to every process running in the system, and a tunable called oom_score_adj allows adjusting that score, which Kubernetes uses to implement Quality of Service classes. When memory needs to be reclaimed, the kernel's OOM Killer reviews the running processes and kills those with the highest scores, i.e., those using more memory than they should.
Note that in Kubernetes, a process can reach any of the following limits:
- The Kubernetes limit set on the container.
- The Kubernetes ResourceQuota set on the namespace.
- The actual memory size of the node.
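For the namespace-level cap, a ResourceQuota can be sketched like this; the name, namespace, and values are illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-quota           # illustrative name
  namespace: team-a         # illustrative namespace
spec:
  hard:
    requests.memory: 1Gi    # total memory requests allowed in the namespace
    limits.memory: 2Gi      # total memory limits allowed in the namespace
```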
Memory overcommit
Limits can be higher than requests, so the sum of all limits can exceed node capacity. This is called overcommit, and it is very common. If all containers use more memory than they requested, the node can run out of memory. This usually results in some pods being killed to free up memory.
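As a sketch, you can measure the memory overcommit ratio per node with a PromQL expression over kube-state-metrics; values above 1 mean the limits on that node exceed its allocatable memory (assuming a kube-state-metrics version that exposes the `resource` label):

```
sum by (node) (kube_pod_container_resource_limits{resource="memory"})
  / sum by (node) (kube_node_status_allocatable{resource="memory"}) > 1
```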
Monitoring Kubernetes OOM
When using the Prometheus node exporter, there is a metric called node_vmstat_oom_kill. Tracking when OOM kills occur is important, but you may want to know about such events before they happen.
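To at least catch OOM kills as they happen, a minimal alerting expression over this counter could be:

```
rate(node_vmstat_oom_kill[5m]) > 0
```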
Instead, you can check how close a process is to Kubernetes limits:
(sum by (namespace, pod, container) (container_memory_working_set_bytes{container!=""}) / sum by (namespace, pod, container) (kube_pod_container_resource_limits{resource="memory"})) > 0.8
Kubernetes CPU throttling
CPU throttling is the behavior of slowing down a process when it is about to reach certain resource limits.
Similar to the memory case, these limits may be:
- The Kubernetes limit set on the container.
- The Kubernetes ResourceQuota set on the namespace.
- The actual CPU capacity of the node.
Consider the following analogy. We have a highway with some traffic where:
- The CPU is the road.
- Vehicles represent processes, and each vehicle has a different size.
- Multiple lanes represent multiple cores.
- A request would be a dedicated road, such as a bike lane.
Throttling here manifests itself as a traffic jam: eventually, all processes will run, but everything will be slower.
CPU Processes in Kubernetes
CPUs are handled using shares in Kubernetes. Each CPU core is divided into 1024 shares, which are then distributed among all running processes using the cgroups (control groups) feature of the Linux kernel.
If the CPU can handle all current processes, no action is required. If processes demand more than 100% of the available CPU, shares come into play. Like any Linux system, Kubernetes relies on the kernel's CFS (Completely Fair Scheduler), so processes with more shares will get more CPU time.
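As a sketch of that translation under cgroups v1, the kubelet converts a CPU request in millicores into shares as `millicores * 1024 / 1000` (cgroups v2 uses a different knob, cpu.weight):

```yaml
resources:
  requests:
    cpu: "250m"   # cgroups v1: cpu.shares = 250 * 1024 / 1000 = 256
```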
Unlike memory, Kubernetes does not kill pods for throttling.
CPU statistics can be viewed in /sys/fs/cgroup/cpu/cpu.stat
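Under cgroups v1 this file exposes three counters: nr_periods (CFS enforcement intervals elapsed), nr_throttled (intervals in which the group was throttled), and throttled_time (total throttled time in nanoseconds). The values below are illustrative:

```
nr_periods 345489
nr_throttled 502
throttled_time 109456213
```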
CPU Excessive Usage
As we saw in the limits and requests article, setting limits or requests is important when we want to restrict the resource consumption of a process. However, be careful not to set the sum of all requests larger than the actual CPU capacity of the node: since requests are guaranteed, each container must be able to obtain the CPU it requested.
Monitoring Kubernetes CPU throttling
You can check how close a process is to the Kubernetes limit:
(sum by (namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!=""}[5m])) / sum by (namespace, pod, container) (kube_pod_container_resource_limits{resource="cpu"})) > 0.8
If we want to track the amount of throttling happening in the cluster, cAdvisor provides container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total. With these two, you can easily calculate the percentage of throttled CPU periods.
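As a sketch, the ratio of throttled periods per container could be expressed like this (the 25% threshold is an illustrative choice, not a recommendation):

```
sum by (namespace, pod, container) (rate(container_cpu_cfs_throttled_periods_total{container!=""}[5m]))
  / sum by (namespace, pod, container) (rate(container_cpu_cfs_periods_total{container!=""}[5m])) > 0.25
```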
Best practices
Pay attention to limits and requests
Limits are a way of setting a maximum resource cap on a container, but they need to be treated with care, as you may end up with a process being throttled or killed.
Prepare to be evicted
By setting a very low request, you might think this guarantees your process a minimum amount of CPU or memory. But the kubelet will first evict the pods that are using more than they requested, so a very low request marks your pods as the first to be killed!
If you need to protect specific pods from preemption (when kube-scheduler needs to allocate a new pod), assign a priority to your most important processes.
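A minimal sketch of how a priority could be assigned; the class name, value, and description are illustrative:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority       # illustrative name
value: 1000000              # higher value = higher priority
globalDefault: false
description: "Protects critical workloads from preemption."
```

Pods then opt in by setting `priorityClassName: high-priority` in their spec.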
Throttling is the silent enemy
By setting unrealistic limits or overcommitting, you may not realize that your process is being throttled and performance is suffering. Proactively monitor your CPU usage and understand your actual limits in containers and namespaces.
Summary
Here’s a Kubernetes resource management cheat sheet for CPU and memory, summarizing this article and the other articles in the same series.