Kubernetes OOM and CPU Throttling Issues

Introduction

Out of memory (OOM) kills and CPU throttling are two of the main pain points of resource handling for cloud applications running on Kubernetes. Why?

CPU and memory requirements in cloud applications are becoming increasingly important because they are directly related to your cloud costs.

With limits and requests, you can configure how much memory and CPU your pods are allowed to use, both to prevent resource starvation and to keep cloud costs under control (a minimal example manifest follows the list below).

  • Pods may be evicted due to preemption or node pressure if the node does not have enough resources.
  • When a process runs out of memory (OOM), it is killed because it asked for more memory than it is allowed to use.
  • When CPU consumption is higher than the limit, the process starts to be throttled.
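
To make this concrete, here is a minimal sketch of a Pod manifest with requests and limits set per container (the name, image, and values are purely illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: demo-app                 # illustrative name
spec:
  containers:
  - name: app
    image: nginx:1.25            # any image; used only as an example
    resources:
      requests:                  # what the scheduler reserves for this container
        memory: "128Mi"
        cpu: "250m"
      limits:                    # hard ceiling: going over memory => OOMKilled,
        memory: "256Mi"          # going over CPU => throttling
        cpu: "500m"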

So, how do you monitor when a Pod is about to be OOM killed, or when its CPU is about to be throttled?

Kubernetes OOM

Each container in a Pod requires memory to run.

Kubernetes limits are set per container in a Pod definition or Deployment definition.

All modern Unix systems have a way to kill processes when they need to reclaim memory. This will be flagged as exit code 137 or OOMKilled.

State: Running
    Started: Thu, 10 Oct 2019 11:14:13 +0200
Last State: Terminated
    Reason: OOMKilled
    Exit Code: 137
    Started: Thu, 10 Oct 2019 11:04:03 +0200
    Finished: Thu, 10 Oct 2019 11:14:11 +0200

Exit code 137 means that the process used more memory than it was allowed and had to be terminated by the operating system.

This is a feature of Linux, where the kernel assigns an oom_score to each process running in the system. In addition, it allows a value called oom_score_adj to be set, which Kubernetes uses to implement its Quality of Service classes. The kernel also runs the OOM Killer, which reviews processes and kills those using more memory than they should (for example, more than their limit).

Note that in Kubernetes, a process may hit any of the following limits:

  • Kubernetes limits set on containers.
  • The Kubernetes ResourceQuota set on the namespace.
  • The actual memory size of the node.
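
The namespace-level ResourceQuota mentioned in the list above caps the total requests and limits of all Pods in a namespace. A minimal sketch, with illustrative names and values:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota               # illustrative name
  namespace: team-a              # illustrative namespace
spec:
  hard:
    requests.cpu: "4"            # sum of CPU requests allowed in the namespace
    requests.memory: 8Gi         # sum of memory requests allowed in the namespace
    limits.cpu: "8"              # sum of CPU limits allowed in the namespace
    limits.memory: 16Gi          # sum of memory limits allowed in the namespace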


Memory overcommitment

Limits can be higher than requests, so the sum of all limits can be higher than the node capacity. This is called overcommitment, and it is very common. In practice, if all containers use more memory than they requested, the node can run out of memory. This usually causes some pods to be killed in order to free up memory.
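
As a made-up illustration of how this plays out on a single node:

Node allocatable memory: 4Gi
  pod-a   requests: 1Gi   limits: 3Gi
  pod-b   requests: 2Gi   limits: 3Gi
Sum of requests = 3Gi  -> fits, both pods are scheduled
Sum of limits   = 6Gi  -> overcommitted: if both pods approach their limits,
                          the node runs out of memory and pods get OOM killed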

Monitor Kubernetes OOM

In the Prometheus ecosystem, node-exporter exposes a metric called node_vmstat_oom_kill. It is important to track when an OOM kill happens, but you probably want to get ahead of such events and have visibility into them before they occur.
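
For example, a simple alert-style expression that fires when any OOM kill happened on a node recently might look like this (the 10 minute window is an arbitrary choice, and the metric is only available on kernels that expose oom_kill in /proc/vmstat):

rate(node_vmstat_oom_kill[10m]) > 0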

We’d rather check how close the process is to the Kubernetes limits:

(sum by (namespace, pod, container)
(container_memory_working_set_bytes{container!=""}) / sum by
(namespace, pod, container)
(kube_pod_container_resource_limits{resource="memory"})) > 0.8

Kubernetes CPU throttling

CPU throttling is the act of slowing down a process when it is about to hit certain resource limits. Similar to the memory case, these limits may be:

  • Kubernetes limits set on the container.
  • The Kubernetes ResourceQuota set on the namespace.
  • The actual computing power of the node.

Consider the following analogy. We have a highway with the following traffic flow:

  • The CPU is the road.
  • The vehicles represent processes, and each has a different size.
  • Multiple lanes represent multiple CPU cores.
  • A request would be a reserved lane, such as a bike lane.

Throttling here is represented as traffic jams: eventually, all processes will run, but everything will be slower.

CPU processing logic in Kubernetes

CPUs are handled via shares in Kubernetes. Each CPU core is divided into 1024 shares, which are then divided between all running processes using the cgroups (control groups) feature of the Linux kernel.
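
For example, on cgroup v1 a container's CPU request is converted into cpu.shares roughly as request-in-millicores / 1000 * 1024:

A CPU request of "1" (one full core)  -> 1024 shares
A CPU request of "250m"               -> 250 / 1000 * 1024 = 256 shares
A CPU request of "2"                  -> 2048 shares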


If the CPU can handle all current processes, no action is needed. If processes demand more than 100% of the available CPU, shares come into play. As in any Linux system, Kubernetes relies on the kernel's CFS (Completely Fair Scheduler), so processes with more shares get more CPU time.

Unlike memory, Kubernetes does not kill Pods due to throttling.


You can check a container's CPU stats in /sys/fs/cgroup/cpu/cpu.stat (the cgroup v1 path).
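
On cgroup v1 that file contains three counters; the numbers below are made up:

nr_periods 345489             # CFS enforcement intervals that have elapsed
nr_throttled 502              # how many of those intervals the cgroup was throttled in
throttled_time 109456789012   # total time spent throttled, in nanoseconds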

CPU overcommitment

As we saw in the limits and requests article, setting limits or requests is important when we want to restrict the resource consumption of a process. Be careful, though, not to set total requests larger than the actual CPU size of the node; each container should have a guaranteed amount of CPU.
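
A made-up example of CPU overcommitment through limits on a 4-core node:

Node allocatable CPU: 4000m
  container-a   requests: 1000m   limits: 3000m
  container-b   requests: 1000m   limits: 3000m
Sum of requests = 2000m -> fits, both are scheduled
Sum of limits   = 6000m -> overcommitted: if both try to use their full limit at
                           the same time there is not enough CPU, so CFS shares
                           decide how the cores are split and everything runs slower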

Monitor Kubernetes CPU throttling

You can check how close a process is to Kubernetes limits:

(sum by (namespace, pod, container)
(rate(container_cpu_usage_seconds_total{container!=""}[5m])) / sum by
(namespace, pod, container)
(kube_pod_container_resource_limits{resource="cpu"}))

If we want to track the amount of throttling happening in the cluster, cAdvisor provides two metrics: container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total. With these two metrics, you can easily calculate the percentage of throttled periods out of all CPU periods.
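
For example, the fraction of throttled periods per container could be computed like this (the 5m window and the grouping labels are choices, not requirements):

sum by (namespace, pod, container)
(rate(container_cpu_cfs_throttled_periods_total{container!=""}[5m])) / sum by
(namespace, pod, container)
(rate(container_cpu_cfs_periods_total{container!=""}[5m]))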

Best Practices

Be careful with limits and requests

Limits are a way to set an upper bound on the resources a container can use on a node, but they need to be treated with care, as you may end up being throttled or having processes killed.

Get ready for eviction

By setting very low requests, you might think this grants your process a minimum amount of CPU or memory. But the kubelet will first evict the pods that are using more than they requested, so you are effectively marking these processes as the first ones to be killed!

If you need to protect specific Pods from preemption (when kube-scheduler needs to allocate new Pods), assign Priority Classes to the most important processes.
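
A minimal sketch of such a PriorityClass, and of a Pod that uses it (the names and the value are illustrative):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-workloads       # illustrative name
value: 1000000                   # higher value = higher priority
globalDefault: false
description: "Pods that should not be preempted by ordinary workloads."

The Pod then opts in through spec.priorityClassName:

apiVersion: v1
kind: Pod
metadata:
  name: important-app            # illustrative name
spec:
  priorityClassName: critical-workloads
  containers:
  - name: app
    image: nginx:1.25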

Throttling is a silent enemy

If you set unrealistic limits or overcommit, you may not realize that your processes are being throttled and that performance is suffering. Proactively monitor CPU usage and know your exact limits at both the container and namespace level, so you can spot issues in time.

Appendix

The picture below gives a good overview of CPU and memory limits in Kubernetes, for reference:

[Figure: overview of Kubernetes CPU and memory requests and limits, from the original article]

This article is translated from: Kubernetes OOM and CPU Throttling – Sysdig