Introduction
Kube-downscaler is an open source tool that allows users to define when pod resources in Kubernetes are automatically scaled down. This helps reduce infrastructure costs by reducing resource usage during off-peak hours.
In this article, we will detail the functionality, installation, and configuration of kube-downscaler, as well as its use cases and future prospects.
Features of kube-downscaler
Kube-downscaler is a powerful schedule-based tool for scaling applications in a Kubernetes cluster up or down. In this section, we’ll explore some of the tool’s key features:
Compatibility with Kubernetes features or tools
Kube-downscaler also supports Horizontal Pod Autoscaling (HPA) and can be used in conjunction with HPA to ensure the required number of replicas are maintained for the application. This enables kube-downscaler to provide additional flexibility and fine-grained control for application scaling in Kubernetes.
Karpenter and kube-downscaler are two tools that can be used together to provide a complete and powerful resource management solution for Kubernetes clusters. By using Karpenter in conjunction with kube-downscaler, Kubernetes clusters can benefit from horizontal and vertical scaling. Downscaler allows reducing the number of pods, while Karpenter optimizes node utilization by consolidating pods onto fewer or different types of machines.
Automatically scale deployment replicas based on defined time periods
Kube-downscaler can automatically scale deployment replicas based on predefined time periods. This means we can set up a schedule to increase or decrease the number of replicas at specific times of the day, week, or month.
For example, if we know that our application experiences high traffic during certain times of the day, we can configure kube-downscaler to automatically scale up replicas during those times and then scale them down when traffic decreases.
This allows scaling in anticipation of peak loads, rather than waiting for peak loads to occur and be handled by the HPA. This can help optimize resource usage and ensure your application is always available and responsive.
That said, kube-downscaler is mainly used to scale replicas down and optimize cluster costs; day-to-day reactive scaling is usually left to the HPA.
Installation and configuration of kube-downscaler
Installing kube-downscaler on a Kubernetes cluster
1. Clone the kube-downscaler repository from Codeberg:
   git clone https://codeberg.org/hjacobs/kube-downscaler.git
2. Enter the kube-downscaler directory:
   cd kube-downscaler
3. Edit the deploy/kube-downscaler.yaml file to customize the configuration to your specific needs. For example, you can adjust time zones, schedules, and scaling rules.
4. Apply the configuration to your Kubernetes cluster:
   kubectl apply -f deploy/
This command deploys the kube-downscaler controller and creates a kube-downscaler deployment.
You can verify that the controller is running by checking the logs of the kube-downscaler deployment:
   kubectl logs -f deployment/kube-downscaler
After the installation is complete, some configuration is required.
Configure kube-downscaler according to specific user needs
Kube-downscaler lets you customize scaling behaviour by using annotations on Kubernetes deployment objects.
The downscaler/downtime annotation on a deployment specifies the downtime period during which the deployment should be scaled down.
The downscaler/downtime-replicas annotation sets the number of replicas the deployment is scaled down to during that period.
Combined, these annotations allow you to create customized scaling plans based on specific business needs and resource utilization patterns.
By adjusting them, you can configure kube-downscaler to scale your deployments in a way that balances application availability and cost efficiency.
The following is a simple deployment configuration using kube-downscaler:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: random-deployment
  annotations:
    # kube-downscaler
    downscaler/downtime: "Mon-Fri 00:00-07:00 Europe/Berlin"
    downscaler/downtime-replicas: "1"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: random
  template:
    metadata:
      labels:
        app: random
    spec:
      containers:
      - name: random-container
        image: random-image

With this configuration, the number of replicas is reduced to 1 from midnight to 7 a.m., Monday through Friday (Europe/Berlin time zone).
kube-downscaler will then automatically start scaling pods down according to the defined schedule.
At this point, kube-downscaler is installed and running on the Kubernetes cluster.
Algorithm
Kube-downscaler will scale down a workload's replicas if all of the following conditions are true:
- The current time is not part of the "uptime" schedule or is part of the "downtime" schedule. Schedules are evaluated in the following order:
  - downscaler/downscale-period or downscaler/downtime annotation on the workload definition
  - downscaler/upscale-period or downscaler/uptime annotation on the workload definition
  - downscaler/downscale-period or downscaler/downtime annotation on the workload's namespace
  - downscaler/upscale-period or downscaler/uptime annotation on the workload's namespace
  - --upscale-period or --default-uptime CLI parameters
  - --downscale-period or --default-downtime CLI parameters
  - UPSCALE_PERIOD or DEFAULT_UPTIME environment variables
  - DOWNSCALE_PERIOD or DEFAULT_DOWNTIME environment variables
- The workload's namespace is not part of the exclusion list:
  - If an exclusion list is provided, it is used instead of the default (which contains only kube-system).
- The workload's labels do not match the labels list.
- The workload's name is not part of the exclusion list.
- The workload is not marked for exclusion (annotation downscaler/exclude: "true" or downscaler/exclude-until: "2024-04-05").
- There are no active pods that force the whole cluster into uptime (annotation downscaler/force-uptime: "true").
Minimum replicas
By default, deployments are scaled down to zero replicas. This can be configured via the downscaler/downtime-replicas annotation on the deployment or its namespace, or via the --downtime-replicas CLI option.
Example: downscaler/downtime-replicas: "1"
Specific workloads
For HorizontalPodAutoscalers, the minReplicas field cannot normally be set to zero, so downscaler/downtime-replicas should be set to at least 1 when an HPA is attached. CronJobs are handled as you would expect: during downtime they are suspended by setting suspend: true.
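What a suspended CronJob looks like can be sketched as follows (an illustration only; the names and schedule are placeholders, and kube-downscaler flips the suspend flag itself rather than requiring you to set it):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report            # placeholder name
spec:
  schedule: "0 2 * * *"
  # During downtime kube-downscaler sets suspend to true,
  # so no new Jobs are created until it is set back to false.
  suspend: true
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: report-image   # placeholder image
```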
Points to note
Note that a default grace period of 15 minutes applies to newly created deployments. For example, for a new nginx deployment:
- If the current time is outside the uptime schedule Mon-Fri 9-17 (Buenos Aires timezone), the deployment will not be scaled down immediately, but only after the 15-minute grace period. The downscaler will eventually log something like:
INFO: Scaling down Deployment default/nginx from 1 to 0 replicas (uptime: Mon-Fri 09:00-17:00 America/Buenos_Aires, downtime: never)
Note that if a HorizontalPodAutoscaler (HPA) is used with a deployment, consider the following:
- If scaling down to 0 replicas is required, the annotation should be applied on the Deployment. This is a special case because minReplicas is not allowed to be 0 on an HPA. Setting the deployment's replicas to 0 essentially disables the HPA. In this case, the HPA will emit events such as failed to get memory utilization: unable to get metrics for resource memory: no metrics returned from resource metrics API, since there are no pods to retrieve metrics from.
- If scaling down to more than 0 replicas is required, the annotation should be applied on the HPA. This allows pods to keep scaling dynamically based on external traffic even during downtime, while staying at the downtime minimum when traffic is low. Annotating the deployment instead of the HPA would cause a race condition: kube-downscaler scales the deployment down, and the HPA scales it back up whenever its minReplicas is higher.
To enable the downscaler on the HPA with --downtime-replicas=1, make sure to add the following annotations to the deployment and the HPA:
$ kubectl annotate deploy nginx 'downscaler/exclude=true'
$ kubectl annotate hpa nginx 'downscaler/downtime-replicas=1'
$ kubectl annotate hpa nginx 'downscaler/uptime=Mon-Fri 09:00-17:00 America/Buenos_Aires'
Detailed configuration
Uptime/downtime spec
Downscalers are configured via command line arguments, environment variables, or Kubernetes annotations.
Time definitions (such as DEFAULT_UPTIME) accept a comma-separated list of specifications. For example, the following configuration scales down all deployments outside of working hours:
DEFAULT_UPTIME="Mon-Fri 07:30-20:30 Europe/Berlin"
Scale down only on weekends and on Fridays after 20:00:
DEFAULT_DOWNTIME="Sat-Sun 00:00-24:00 CET, Fri-Fri 20:00-24:00 CET"
Each time specification can be in one of two formats:
- Recurring specifications have the form <WEEKDAY-FROM>-<WEEKDAY-TO> <HH>:<MM>-<HH>:<MM> <TIMEZONE>. The timezone value can be any Olson timezone, e.g. "US/Eastern", "PST" or "UTC".
- Absolute specifications have the form <TIME_FROM>-<TIME_TO>, where each <TIME> is an ISO 8601 date and time of the form <YYYY>-<MM>-<DD>T<HH>:<MM>:<SS>[+-]<TT>:<TT>.
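To make the recurring format concrete, here is a minimal Python sketch of how such a specification could be matched against the current time. This is an illustration only, not kube-downscaler's actual parser: it assumes the simple "Mon-Fri 07:30-20:30 Europe/Berlin" shape and does not handle weekday ranges that wrap past Sunday.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def in_recurring_spec(spec: str, now: datetime) -> bool:
    """Check whether `now` falls inside a recurring spec such as
    'Mon-Fri 07:30-20:30 Europe/Berlin' (no wrap-around ranges)."""
    days, hours, tz = spec.split()
    day_from, day_to = (WEEKDAYS.index(d) for d in days.split("-"))
    start, end = hours.split("-")
    # Evaluate the time in the spec's own timezone
    local = now.astimezone(ZoneInfo(tz))
    h1, m1 = map(int, start.split(":"))
    h2, m2 = map(int, end.split(":"))
    minutes = local.hour * 60 + local.minute
    return (day_from <= local.weekday() <= day_to
            and h1 * 60 + m1 <= minutes < h2 * 60 + m2)
```

Note that "24:00" works as an end time because the comparison is strict, so every minute of the final day is still covered.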
Alternative logic based on periods
Instead of a strict uptime or downtime, you can choose a time period for scaling up or down. The time definitions are the same. In this case, scaling up or down only happens during the given period, and the rest of the time the workload is left alone.
If an upscale or downscale period is configured, uptime and downtime are ignored. This means that some options are mutually exclusive: for example, you can use --downscale-period or --default-downtime, but not both at the same time.
The following definition will scale the cluster down between 19:00 and 20:00. If you scale the cluster up manually, it will not be scaled down again until 19:00-20:00 the next day.
DOWNSCALE_PERIOD="Mon-Sun 19:00-20:00 Europe/Berlin"
Command line options
Available command line options:
- --dry-run: Dry-run mode: doesn't change anything, just prints what would be done
- --debug: Debug mode: print more information
- --once: Run the loop only once and exit
- --interval: Loop interval (default: 30 seconds)
- --namespace: Limit the downscaler to a single namespace (default: all namespaces). This is mainly useful for deployment scenarios where the deployer of kube-downscaler only has access to a given namespace (not cluster access). Cannot be combined with --exclude-namespaces.
- --include-resources: Downscale resources of the given kinds, as a comma-separated list.
- --grace-period: Grace period in seconds before a new deployment is scaled down (default: 15 minutes). The grace period starts when the deployment is created, i.e., updated deployments will be scaled down immediately regardless of the grace period.
- --upscale-period: Alternative logic to scale up only during the given period (default: never). Also configurable via the UPSCALE_PERIOD environment variable or the downscaler/upscale-period annotation on each deployment.
- --downscale-period: Alternative logic to scale down only during the given period (default: never). Also configurable via the DOWNSCALE_PERIOD environment variable or the downscaler/downscale-period annotation on each deployment.
- --default-uptime: Default time frame to scale up (default: always). Also configurable via the DEFAULT_UPTIME environment variable or the downscaler/uptime annotation on each deployment.
- --default-downtime: Default time frame to scale down (default: never). Also configurable via the DEFAULT_DOWNTIME environment variable or the downscaler/downtime annotation on each deployment.
- --exclude-namespaces: Exclude namespaces from downscaling (list of regular expression patterns, default: kube-system). Also configurable via the EXCLUDE_NAMESPACES environment variable. Cannot be combined with --namespace.
- --exclude-deployments: Exclude specific deployments/statefulsets/cronjobs from downscaling (default: kube-downscaler, downscaler). Also configurable via the EXCLUDE_DEPLOYMENTS environment variable. Despite its name, this option matches the name of any included resource type (Deployment, StatefulSet, CronJob, ...).
- --downtime-replicas: Default number of replicas to scale down to; the downscaler/downtime-replicas annotation takes precedence over this value.
- --deployment-time-annotation: Optional: the name of an annotation to use instead of the resource's creation timestamp. Use this option if you want your resources to be kept scaled up during the grace period (--grace-period) after a deployment rather than after creation. The annotation's timestamp value must have exactly the same format as Kubernetes' creationTimestamp: %Y-%m-%dT%H:%M:%SZ. Recommendation: set this annotation automatically through your deployment tooling.
- --matching-labels: Optional: list of workload labels covered by the kube-downscaler scope. All workloads whose labels do not match any in the list are ignored. For backward compatibility, if this parameter is not specified, kube-downscaler applies to all resources.
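To illustrate how these options are typically wired up, here is a hedged sketch of how the container in deploy/kube-downscaler.yaml could pass them as arguments. The flag values and image name are illustrative assumptions, not defaults:

```yaml
# Excerpt of a kube-downscaler Deployment spec (illustrative values)
spec:
  template:
    spec:
      containers:
      - name: downscaler
        image: hjacobs/kube-downscaler     # image name is an assumption
        args:
        - --interval=60
        - --default-uptime=Mon-Fri 07:30-20:30 Europe/Berlin
        - --exclude-namespaces=kube-system,monitoring
        - --downtime-replicas=0
```

Command line flags set here act as cluster-wide defaults; per-workload and per-namespace annotations override them.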
Namespace defaults
DEFAULT_UPTIME, DEFAULT_DOWNTIME, FORCE_UPTIME and exclusions can also be configured using namespace annotations. Where configured, these values supersede the global defaults.

apiVersion: v1
kind: Namespace
metadata:
  name: foo
  labels:
    name: foo
  annotations:
    downscaler/uptime: Mon-Sun 07:30-18:00 CET
The following annotations are supported at the namespace level:
- downscaler/upscale-period
- downscaler/downscale-period
- downscaler/uptime: sets the "uptime" for all resources in this namespace
- downscaler/downtime: sets the "downtime" for all resources in this namespace
- downscaler/force-downtime: force downscaling of all resources in this namespace; can be true/false
- downscaler/force-uptime: force upscaling of all resources in this namespace; can be true/false
- downscaler/exclude: set to true to exclude all resources in the namespace
- downscaler/exclude-until: temporarily excludes all resources in the namespace until the given timestamp
- downscaler/downtime-replicas: overrides the default target replicas to scale down to (default: zero)
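For example, a whole namespace can be temporarily taken out of scope with the exclusion annotations described above. This is an illustrative sketch; the namespace name and date are placeholders:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging                        # placeholder name
  annotations:
    # Skip downscaling for everything in this namespace until the given date
    downscaler/exclude-until: "2024-04-05"
```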
Use cases
The primary use case for this tool is to reduce costs by optimizing the utilization of Kubernetes cluster resources. However, it can also be used to warm up the cluster ahead of expected load and avoid over-reliance on the HPA.
While this is not its primary purpose, combining scheduled scaling with the HPA provides an alternative solution that ensures high availability of applications while minimizing infrastructure costs.
Reduce costs
The main use case for kube-downscaler is cost reduction. By scaling deployments down during off-peak hours such as nights and weekends, unused resources are freed, and the cluster can run on fewer or smaller nodes, especially when combined with a node-level tool such as Karpenter.
Service interruption prevention
Another use case for kube-downscaler is to prevent service outages during peak usage. By defining a plan for scaling resources during periods of high demand, kube-downscaler can help scale deployments pre-emptively and avoid HPA delays to ensure applications remain available and responsive even during peak usage.
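This pre-warming pattern can be sketched with an upscale period so that the workload is scaled up shortly before a known traffic spike. The names, image, and times below are illustrative placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend                 # placeholder name
  annotations:
    # Scale up every weekday just before the expected morning traffic peak
    downscaler/upscale-period: "Mon-Fri 07:30-08:30 Europe/Berlin"
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      containers:
      - name: web
        image: nginx                 # placeholder image
```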
Suggestions
Kube-downscaler scales based on predefined schedules, which may not suit all use cases. Additionally, it does not support metric-driven autoscaling, which means users must manually adjust their scaling plans to meet changing needs.
Another solution to consider is KEDA. KEDA is an open source project that provides dynamic, event-driven autoscaling capabilities for Kubernetes applications. Using KEDA, users can set custom scaling rules based on various metrics such as queue length, CPU usage, or custom metrics.
This allows for more granular control over resource usage and ensures that the application always scales correctly to meet demand.
Additionally, KEDA is compatible with a wide range of Kubernetes applications, including stateful and stateless applications, and supports many event sources such as Azure Event Hubs, Kafka, and RabbitMQ.
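KEDA also has a cron scaler that covers a schedule-based use case similar to kube-downscaler's. The following ScaledObject is a hedged sketch: the deployment name and times are placeholders, and field details should be checked against the KEDA documentation for your version:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-frontend-scaler      # placeholder name
spec:
  scaleTargetRef:
    name: web-frontend           # placeholder Deployment to scale
  triggers:
  - type: cron
    metadata:
      timezone: Europe/Berlin
      start: 30 7 * * *          # scale up at 07:30
      end: 30 20 * * *           # scale back down at 20:30
      desiredReplicas: "5"
```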
Conclusion
Kube-downscaler is a powerful tool for managing resource usage in Kubernetes clusters. By defining scaling plans, users can optimize resource usage in the cluster and reduce costs while ensuring that applications remain available and responsive even during peak usage.
While kube-downscaler is a valuable tool for managing resource usage in a Kubernetes cluster, it has some limitations. If you need more granular control over resource scaling or need automatic, metric-driven scaling, it may be worth considering an alternative solution like KEDA.