[Building Prometheus monitoring from scratch] Section 5: Installing Prometheus on k8s and metric optimization

Article directory

  • Preface
  • Install Prometheus on Kubernetes
    • StatefulSet configuration
    • ConfigMap configuration explained, with metric collection optimization ideas
      • How to filter metrics (reducing metric volume)
      • How to get all k8s metrics?
    • Start
      • Viewing metric volume

Preface

In our Prometheus monitoring architecture, the Prometheus instance running on Kubernetes is mainly responsible for monitoring k8s resources such as Nodes, Pods, and StatefulSets. We install one such Prometheus in every k8s cluster, while the binary-deployed Prometheus acts as the central collection point, gathering metrics from the other exporters.

Install Prometheus on Kubernetes

First, a brief explanation of the role each resource plays:

  • Namespace: provides resource isolation
  • StatefulSet: the Prometheus service itself
  • ConfigMap: the Prometheus configuration file
  • Service: a NodePort Service for accessing the Prometheus UI from outside the cluster
  • ClusterRole, ClusterRoleBinding: define and grant cluster-wide permissions
  • ServiceAccount: provides an identity for the Prometheus process

To keep the manifests readable, each k8s resource is listed as a separate file.

  • prometheus-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: prometheus
  • prometheus-StatefulSet.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: prometheus
spec:
  serviceName: prometheus
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      volumes:
        - hostPath:
            path: /data/prometheus
            type: ''
          name: data
        - name: config-volume
          configMap:
            name: prometheus-config
        - name: timezone
          hostPath:
            path: /etc/localtime
      containers:
        - image: prom/prometheus:v2.48.0-rc.1
          name: prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yaml" # Configuration file path
            - "--storage.tsdb.path=/prometheus" # Specify tsdb data path
            - "--storage.tsdb.retention.time=2d" # Data retention period
            - "--web.enable-lifecycle" # Support hot update, directly execute localhost:9090/-/reload to take effect immediately
          ports:
            - containerPort: 9090
              name: http
          securityContext:
            runAsUser: 0
          resources: # Configure according to needs
            limits:
              cpu: '1'
              memory: 2Gi
            requests:
              cpu: '0'
              memory: '0'
          volumeMounts:
            - mountPath: "/etc/prometheus"
              name: config-volume
            - mountPath: "/prometheus"
              name: data
            - name: timezone
              mountPath: /etc/localtime
  • prometheus-ClusterRole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
  • prometheus-ClusterRoleBinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prometheus
  • prometheus-ServiceAccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: prometheus
  • prometheus-Service.yaml
apiVersion: v1
kind: "Service"
metadata:
  name: prometheus
  namespace: prometheus
  labels:
    name: prometheus
spec:
  ports:
  - name: prometheus
    protocol: TCP
    port: 39090
    nodePort: 39090 # Port exposed outside the cluster
    targetPort: 9090
  selector:
    app: prometheus
  type: NodePort
  • prometheus-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prometheus
data:
  prometheus.yaml: |
    global:
      scrape_interval: 30s
      evaluation_interval: 30s
      external_labels:
        origin_prometheus: k8s-prome-demo # Distinguishes between clusters

    remote_write: # Remote write to remote VM storage
      - url: http://10.0.1.50:8428/api/v1/write

    scrape_configs:
      - job_name: 'k8s-state-metrics'
        metrics_path: "/metrics"
        static_configs:
        - targets: ['kube-state-metrics.kube-system.svc:8080']
        metric_relabel_configs:
        - source_labels:
          - __name__
          regex: '(kube_.*_info|kube_namespace_labels|kube_node_status_.*|kube_pod_container_resource_.*|kube_pod_container_status_restarts_total).*'
          action: keep

      - job_name: 'k8s-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: metrics/cadvisor
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        metric_relabel_configs:
        - source_labels:
          - __name__
          regex: '(container_cpu_usage_seconds_total|container_.*_bytes|container_memory_rss|container_spec_cpu_quota).*'
          action: keep
        - source_labels: [instance]
          separator: ;
          regex: (.+)
          target_label: node
          replacement: $1
          action: replace

      - job_name: 'k8s-kubelet'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)

      - job_name: 'k8s-apiserver'
        kubernetes_sd_configs:
        - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https
        - target_label: __address__
          replacement: kubernetes.default.svc:443

Don't rush to start it yet. First walk through the configuration so you can adapt it to your own environment.

StatefulSet configuration

See the comments on the container args:

        - image: prom/prometheus:v2.48.0-rc.1
          name: prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yaml" # Configuration file path
            - "--storage.tsdb.path=/prometheus" # Specify the TSDB data path
            - "--storage.tsdb.retention.time=2d" # Data retention period
            - "--web.enable-lifecycle" # Supports hot reload; hit localhost:9090/-/reload to apply changes immediately

ConfigMap configuration explained, with metric collection optimization ideas

  1. In the global config, add an external_labels entry to distinguish between multiple Prometheus instances or k8s clusters:
global:
...
      external_labels:
        origin_prometheus: k8s-prome-demo # Distinguishes between clusters
  2. Configure remote write to VictoriaMetrics storage:
 remote_write: # Remote write to the remote VM storage
      - url: http://IP of your VM:8428/api/v1/write
  3. The scrape_configs section that follows defines four jobs:
  • k8s-state-metrics: collects detailed resource-state metrics generated by kube-state-metrics, such as the status of Deployments, Nodes, and Pods.
  • k8s-cadvisor: collects container-level resource usage and performance metrics, such as CPU, memory, and disk usage, through cAdvisor (Container Advisor).
  • k8s-kubelet: collects node and Pod performance and status information exposed by the Kubernetes kubelet.
  • k8s-apiserver: collects performance and health metrics of the Kubernetes API server, such as request latency, request rate, and error rate.
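As a quick sanity check, you can confirm which jobs a config file defines with a one-liner. The sample file below is a hypothetical stand-in for your local prometheus-configmap.yaml:

```shell
# Stand-in for your local prometheus-configmap.yaml (illustrative only)
cat > /tmp/configmap-sample.yaml <<'EOF'
scrape_configs:
  - job_name: 'k8s-state-metrics'
  - job_name: 'k8s-cadvisor'
  - job_name: 'k8s-kubelet'
  - job_name: 'k8s-apiserver'
EOF

# Extract every job_name; this should print the four jobs listed above
grep -o "job_name: '[^']*'" /tmp/configmap-sample.yaml | cut -d"'" -f2
```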

How to filter metrics (reducing metric volume)

For example, the k8s-state-metrics job contains the following metric_relabel_configs block, where a regular expression on the metric name keeps only the metrics that match.

 metric_relabel_configs:
        - source_labels:
          - __name__
          regex: '(kube_.*_info|kube_namespace_labels|kube_node_status_.*|kube_pod_container_resource_.*|kube_pod_container_status_restarts_total).*'
          action: keep
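You can test such a keep regex locally before deploying: pipe candidate metric names through grep -E with the same pattern. This is only an approximation (Prometheus anchors regexes to the full metric name, which the leading ^ mimics here), but it catches most mistakes:

```shell
# Candidate metric names; kube_deployment_spec_replicas should be dropped
printf '%s\n' \
  kube_pod_info \
  kube_namespace_labels \
  kube_pod_container_resource_limits \
  kube_deployment_spec_replicas \
| grep -E '^(kube_.*_info|kube_namespace_labels|kube_node_status_.*|kube_pod_container_resource_.*|kube_pod_container_status_restarts_total).*'
```

Only the first three names survive the filter, matching what `action: keep` would retain.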

A brief note on the optimization approach: if you don't know where to start, open a relevant k8s Dashboard in Grafana and view its full JSON via Settings → JSON Model.

There you will see expressions like the one below; container_cpu_usage_seconds_total is one such metric. Collect every metric referenced across the Dashboard and you will have a rough list of the metrics worth scraping.

 "expr": "topk(10, sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{origin_prometheus=~"$origin_prometheus", pod != "", container!=""} )))",
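Turning a Dashboard's JSON Model into a metric list can be scripted. A rough sketch: grep the "expr" fields and pull out metric-shaped identifiers. The sample file is a hypothetical stand-in for your exported dashboard JSON, and the prefix filter assumes k8s-style metric names:

```shell
# Stand-in for a dashboard export; a real JSON Model is much larger
cat > /tmp/dashboard-sample.json <<'EOF'
{"expr": "topk(10, sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{pod!=\"\"}[5m])))"}
EOF

# Pull identifiers out of every "expr" field, keep k8s-looking metric names
grep -o '"expr": *"[^"]*"' /tmp/dashboard-sample.json \
  | grep -oE '[a-z_][a-z0-9_:]*' \
  | grep -E '^(container|kube|node)_' \
  | sort -u
```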

How to get all k8s metrics?

  1. Get all metrics exposed by kube-state-metrics
$ kubectl get svc -n kube-system
kube-state-metrics NodePort 10.68.24.12 <none> 8080:30080/TCP,8081:30081/TCP 2y179d

$ curl http://<k8s-state-metrics-ip>:8080/metrics
#http://10.68.24.12:8080/metrics
  2. Get all metrics for k8s-cadvisor
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.0.0.20 Ready node 2y179d v1.14.6

$ curl \
--cert admin.pem \
--key admin-key.pem \
--cacert ca.pem \
https://<node-IP>:10250/metrics/cadvisor
# https://10.0.0.20:10250/metrics/cadvisor
  3. Get all metrics for k8s-kubelet
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.0.0.20 Ready node 2y179d v1.14.6

$ curl \
--cert admin.pem \
--key admin-key.pem \
--cacert ca.pem \
https://<node-IP>:10250/metrics
# https://10.0.0.20:10250/metrics
  4. Get all metrics for k8s-apiserver
$ kubectl get svc kubernetes
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.68.0.1 <none> 443/TCP 2y179d

$ cd /etc/kubernetes/ssl
$ curl \
--cert admin.pem \
--key admin-key.pem \
--cacert ca.pem \
https://<kubernetes-CLUSTER-IP>/metrics
# https://10.68.0.1/metrics
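Once you have a /metrics dump, counting the distinct metric names shows how much each job contributes. A sketch against a small sample (in practice, pipe the real curl output from the commands above instead of the sample file):

```shell
# Small stand-in for a real /metrics dump
cat > /tmp/metrics-sample.txt <<'EOF'
# HELP kube_pod_info Information about pod.
# TYPE kube_pod_info gauge
kube_pod_info{pod="a"} 1
kube_pod_info{pod="b"} 1
kube_node_status_capacity{node="n1",resource="cpu"} 4
EOF

# Drop comment lines, strip labels and values, de-duplicate metric names
grep -v '^#' /tmp/metrics-sample.txt | cut -d'{' -f1 | awk '{print $1}' | sort -u
```

Append `| wc -l` to get the count instead of the list.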

Start

$ ls
prometheus-namespace.yaml
prometheus-StatefulSet.yaml
prometheus-configmap.yaml
prometheus-Service.yaml
prometheus-ClusterRole.yaml
prometheus-ClusterRoleBinding.yaml
prometheus-ServiceAccount.yaml

# Start everything at once
$ kubectl apply -f .
  • Visit Prometheus at: http://your k8s cluster IP:39090
  • In the StatefulSet we enabled --web.enable-lifecycle, so configuration changes can be hot-reloaded without restarting the Prometheus service:
curl -X POST http://your k8s cluster IP:39090/-/reload

Viewing metric volume

  • Show the top 50 metrics by series count
topk(50, count by (__name__, job)({__name__=~".+"}))
  • Show the top 50 metrics by series count for a given job
topk(50, count by (__name__)({job="node_exporter"}))
  • Total number of series in Prometheus
sum(count by (__name__, job)({__name__=~".+"}))
  • Number of series for metrics matching a given name pattern
sum(count by (__name__, job)({__name__=~"node_ipvs_.+"}))

Gradually adjust and trim the collected metrics based on your machine specs and business needs.

To learn more, follow this column: Prometheus Monitoring.