Monitoring system-Prometheus (3) Metrics in Prometheus

Article directory

  • Monitoring system-Prometheus (3) Metrics in Prometheus
    • 1. Prometheus indicator type
      • 1. GaugeVec and Gauge type indicators of Prometheus
    • 2. Actual demo of converting heartbeat data into gauge type indicators of Prometheus
      • Used promhttp.Handler() of the Prometheus Go client library to expose the metrics interface
    • 3. Configure Prometheus to obtain custom indicators
      • How to tell if a custom endpoint is working properly
      • How to determine whether Prometheus successfully obtained and stored custom indicators
      • How to check the port Prometheus is listening on
    • 4. Security of metrics interface
    • 5. grafana configuration indicator panel

Prometheus four major metrics and applications
Reference URL: https://www.jianshu.com/p/fa5f911003c6

Prometheus, a project of the Cloud Native Computing Foundation (CNCF), has become the most popular open source monitoring software and is effectively the industry standard for metrics monitoring.

Prometheus defines a metric exposition format and a remote write protocol. The community and many vendors have adopted both, making them the de facto standard for describing and collecting metrics.

OpenMetrics is another CNCF project that builds on the Prometheus exposition format to provide a standardized, vendor-neutral model for collecting metrics; it is intended to become an Internet Engineering Task Force (IETF) standard.

1. Prometheus indicator types

The overall architecture of Prometheus is divided into a server side and an exporter side. Exporters are usually developed with an official client library (such as the Go SDK).
One concept needs to be clarified here: the metric type is mainly a client-side concept. It governs how metric values are produced and maintained, which makes it easier to model different kinds of business data.

Prometheus's client libraries provide four core metric types, and the client calls a different API for each. The Prometheus server, in contrast, does not make use of the type information: it flattens everything into untyped time series, that is, sequences of (value, timestamp) samples. Timestamps are usually added by the monitoring backend (such as Prometheus) or by an agent at scrape time.

Prometheus uses a pull model to collect these metrics; that is, Prometheus actively scrapes HTTP endpoints that expose metrics. These endpoints can be exposed natively by the component being monitored, or by a separate exporter that runs alongside it.

Prometheus can scrape metrics in both the Prometheus exposition format and the OpenMetrics format.

  • Counter: a cumulative counter that only increases and never decreases.
    Typical uses are the number of requests, the number of errors, and so on. A Counter is reset to zero only when the monitored process restarts.

  • Gauge: a single value that can go up or down.
    For example, the number of goroutines in a running Go application can be represented by this type, and it is very common for CPU, memory, and similar system statistics. In business scenarios, a Gauge can track the length of a work queue so that a backlog is noticed in time. The value is not fixed; it fluctuates and reflects the current state.

  • Histogram: samples observations and counts them in configurable buckets, which describes the distribution of the samples.
    People often look only at the average of a quantity, such as average CPU usage or average page response time. An average hides outliers, which is why a distribution is more informative for values such as latency.

  • Summary: tracks the total sum and the number of observations, and can additionally calculate quantiles on the client side.

Each type has its suitable usage scenarios.

1. GaugeVec and Gauge type indicators of Prometheus

GaugeVec

  • A metric vector: a collection of Gauge metrics that supports multi-dimensional labels.
  • Allows multiple Gauge instances that share the same metric name and help text.
  • Each time series is distinguished by its combination of label values.
  • Usage scenario: the same metric needs to be grouped, for example per node or per instance.

Gauge

  • A single gauge metric that represents the current value as a float64.
  • No variable labels, just a metric name and a single time series.
  • Usage scenario: there is no need to distinguish dimensions, and the metric directly reflects one current value.

Differences

  • GaugeVec supports variable labels, Gauge does not.
  • GaugeVec represents multiple time series, and Gauge represents a single time series.
  • Both must be registered before they are exposed. With a GaugeVec you first select a child Gauge through With or WithLabelValues and then set its value, while a Gauge is set directly.

When to use which

  • Use GaugeVec for metrics that need labels.
  • Use Gauge for simple metrics without labels.

If there are multiple agents to monitor, GaugeVec is a good choice. Its label mechanism is well suited to scenarios where the same metrics are collected from many objects.

Main reasons:

  1. Every agent reports the same metrics, such as CPU usage and memory usage.
  2. The values of each agent still need to be told apart.
  3. GaugeVec supports this through labels.
  4. An agent_id label can be defined to identify each agent.
  5. When metrics are reported, each distinct agent_id label value automatically gets its own time series.
  6. Prometheus queries and dashboards can filter and group by agent_id.

2. Convert heartbeat data to Prometheus gauge type indicator practical demo

From the open source project elkeid:

var agentGauge = map[string]*prometheus.GaugeVec{
	"cpu":         initPrometheusAgentCpuGauge(),
	"rss":         initPrometheusAgentRssGauge(),
	"du":          initPrometheusAgentDuGauge(),
	"read_speed":  initPrometheusAgentReadSpeedGauge(),
	"write_speed": initPrometheusAgentWriteSpeedGauge(),
	"tx_speed":    initPrometheusAgentTxSpeedGauge(),
	"rx_speed":    initPrometheusAgentRxSpeedGauge(),
}

This code defines a map named agentGauge that stores a set of Prometheus GaugeVec metrics.
GaugeVec is a gauge vector provided by the Prometheus client library: a gauge metric that carries multi-dimensional labels, so that its values can be grouped.

  1. The map's key is a string and its value is of type *prometheus.GaugeVec.
  2. The key is the short name of the metric, such as "cpu"; it matches a field name in the heartbeat data.
  3. The value is the corresponding GaugeVec, created by an initialization function such as the one below.
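// initPrometheusAgentCpuGauge creates and registers the per-agent CPU gauge.
func initPrometheusAgentCpuGauge() *prometheus.GaugeVec {
	prometheusOpts := prometheus.GaugeOpts{
		Name: "elkeid_ac_agent_cpu",
		Help: "Elkeid AC agent cpu",
	}
	// One time series per (agent_id, name) label combination.
	vec := prometheus.NewGaugeVec(prometheusOpts, []string{"agent_id", "name"})
	// MustRegister panics if registration fails, e.g. on a duplicate name.
	prometheus.MustRegister(vec)
	return vec
}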

This function initializes a Prometheus GaugeVec metric.

  1. Create a GaugeOpts value and set the metric's name and help text.
  2. Call NewGaugeVec to create a GaugeVec from the GaugeOpts and a list of label names.
  3. Call prometheus.MustRegister to register the metric with the default registry.
  4. Return the created GaugeVec.
func metricsAgentHeartBeat(agentID, name string, detail map[string]interface{}) {
	if detail == nil {
		return
	}
	// k is the heartbeat field name, v the GaugeVec registered for it.
	for k, v := range agentGauge {
		if raw, ok := detail[k]; ok {
			// Only float64 values are exported; other types are skipped.
			if fv, ok2 := raw.(float64); ok2 {
				v.With(prometheus.Labels{"agent_id": agentID, "name": name}).Set(fv)
			}
		}
	}
}

This function converts the agent heartbeat data into Prometheus gauge metrics.

  1. Receive the parameters:
     • agentID: the agent ID
     • name: the name of the agent
     • detail: the heartbeat details, a map
  2. Iterate over the predefined agentGauge map, which holds a set of GaugeVec metrics created in advance.
  3. Look up the field in detail that corresponds to the key, such as cpu or rss.
  4. If it is found, assert that the value is a float64, because Prometheus metric values are float64.
  5. Call the With method of the GaugeVec to set the labels. agent_id and name are set here to distinguish different agents.
  6. Call the Set method to set the value, that is, the value of this metric for this agent.
  7. Repeat the process for every metric in the map.

For example, suppose the heartbeat data contains:

  • cpu with the value 0.00216669

After the conversion described above, the following GaugeVec series is set:

  • elkeid_ac_agent_cpu{agent_id="1ea7…", name="agent-1"} 0.00216669

In this way, monitoring metrics are extracted from the heartbeat data.

The metricsAgentHeartBeat function only updates Prometheus metrics and does not store the heartbeat data. Its sole focus is converting heartbeat data into Prometheus metrics and publishing them for monitoring. Persistent storage is a separate concern and is the responsibility of other modules.

This separation keeps the code tidy and decoupled, and places each concern in an appropriate module, which also helps efficiency and scalability.

Used promhttp.Handler() of the Prometheus Go client library to expose the metrics interface

router.GET("/metrics", func(c *gin.Context) {
	promhttp.Handler().ServeHTTP(c.Writer, c.Request)
})

This code uses the promhttp package of the Prometheus Go client library to expose a metrics endpoint in the Gin framework. It is essential for Prometheus, because it is what allows an external Prometheus server to pull the application's metrics data.

  1. Define an HTTP GET route for /metrics.
  2. Call promhttp.Handler().ServeHTTP in the request handler.
  3. This uses the default Prometheus handler, which serves the metrics of the default registry.
  4. The handler returns the metrics data encoded in the Prometheus text format.
  5. The endpoint returns data similar to the following:
# HELP http_requests_total Total number of HTTP requests made.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"} 3 1395066363000

3. Configure Prometheus to obtain custom indicators

In the previous demo, the Prometheus client SDK was used directly in the code to expose metrics. The next step is to configure Prometheus to scrape that HTTP endpoint.

demo:

- job_name: ac
  scheme: https
  tls_config:
    insecure_skip_verify: true
  static_configs:
    - targets:
        - '127.0.0.1:6752'

This configuration, which belongs under scrape_configs in prometheus.yml, makes Prometheus scrape metrics from the address 127.0.0.1:6752. No metrics_path is set, so the default path /metrics is used.

  • job_name: the job name, here ac
  • scheme: use the HTTPS protocol
  • tls_config: certificate verification is skipped because the target uses a self-signed certificate
  • static_configs: static target configuration
    • targets: the addresses to scrape, here 127.0.0.1:6752

How to determine whether a custom endpoint is working properly

The simplest check is to access the custom port directly.

Open the endpoint in a browser and check the response:

https://x.x.x.x:{the endpoint port you defined}/metrics

Or fetch the endpoint with curl and check whether the response body contains the expected metrics (add -k for an HTTPS endpoint with a self-signed certificate). For example:

curl -s http://localhost:8080/metrics

How to determine whether Prometheus successfully obtains and stores custom metrics

  1. Check on the Targets page of the Prometheus UI (Status -> Targets) whether the state of the target is UP.
  2. Graph your metrics on the Graph page of the Prometheus UI. If they can be plotted, Prometheus has scraped and stored them.
  3. Query the metric through the Prometheus HTTP API. If the result is not empty, the scrape works. For example:
curl 'http://localhost:9090/api/v1/query?query=your_metric_name'
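For a scripted version of the API check, a small Go program can run the instant query and count the series returned. This is a sketch under assumptions: it expects a query that returns an instant vector (such as a plain metric name), models only the response fields it needs, and the function names parseSeriesCount and seriesCount are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// queryResponse models the parts of the /api/v1/query reply used here.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  []interface{}     `json:"value"` // [unix timestamp, "value as string"]
		} `json:"result"`
	} `json:"data"`
}

// parseSeriesCount decodes a query reply and returns the number of series.
func parseSeriesCount(data []byte) (int, error) {
	var qr queryResponse
	if err := json.Unmarshal(data, &qr); err != nil {
		return 0, err
	}
	if qr.Status != "success" {
		return 0, fmt.Errorf("query returned status %q", qr.Status)
	}
	return len(qr.Data.Result), nil
}

// seriesCount runs an instant query against a Prometheus server.
func seriesCount(baseURL, expr string) (int, error) {
	resp, err := http.Get(baseURL + "/api/v1/query?query=" + url.QueryEscape(expr))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	return parseSeriesCount(body)
}

func main() {
	// Default Prometheus address and a placeholder metric name.
	n, err := seriesCount("http://localhost:9090", "your_metric_name")
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	fmt.Println("matching series:", n)
}
```

A count of zero with status success means the query is valid but Prometheus holds no current samples for that metric, which usually points back to the scrape target.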

How to check the port Prometheus is listening on

  1. Check how Prometheus was started. The listening address is set with the --web.listen-address command-line flag and defaults to port 9090. It is not set in prometheus.yml; what a typical prometheus.yml does contain is a job in which Prometheus scrapes itself, and its target shows the port in use:
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'prometheus'

scrape_configs:
  - job_name: 'prometheus'

    static_configs:
    - targets: ['localhost:9090']
  2. Use the netstat command. Run on the Prometheus server:
$ sudo netstat -plnt | grep prometheus
tcp 0 0 127.0.0.1:9090 0.0.0.0:* LISTEN 7558/prometheus
  3. Use the ss command. Run on the Prometheus server:
$ sudo ss -plnt | grep prometheus
LISTEN 0 128 127.0.0.1:9090 *:* users:(("prometheus",pid=7558,fd=4))
  4. Access the Prometheus HTTP API, by default at http://localhost:9090. If it responds, the port is open.
  5. Inspect the Prometheus process. If --web.listen-address was passed, it appears in the command line; otherwise the default port applies:
$ ps -ef | grep prometheus
prometh+ 7558 99 01:42 ? 00:00:07 prometheus --config.file=/path/to/prometheus.yml

4. Security of metrics interface

To secure the metrics interface, consider the following points:

  1. Modify the default port to avoid using common ports.
  2. You can add basic authentication and check the Authorization information of the request header in the code.
  3. Use HTTPS encryption to transmit metrics data.
  4. Restrict the access IP of the metrics interface and only allow access by the IP of the Prometheus server.
  5. Use a firewall policy to only open the IP of the Prometheus server to access the metrics port.
  6. Do not expose the metrics interface to the public network. Expose it only on the intranet and let Prometheus scrape it from there.
  7. Periodically rotate the API key for accessing the metrics interface.
  8. Check the configuration of Prometheus and do not collect irrelevant metrics interfaces.
  9. Properly set the access control of Prometheus and only allow access to relevant teams.

Taking the above measures together strengthens the security of the metrics interface and prevents data leakage or exploitation.

5. Grafana configuration indicator panel

TODO
Add a metrics data source. Grafana supports many kinds of data sources; here we choose Prometheus.
Two examples built with go_client will illustrate the use of Counter and Gauge, combined with Grafana to configure good-looking graphs.