The right way to monitor a Redis cluster with Prometheus


Monitoring a Redis cluster with Prometheus follows the usual routine: use an exporter. The exporter collects metrics and exposes them to Prometheus over HTTP, and Grafana plots the data from those metrics. Prometheus also checks the collected data against the alerting rules you define to decide whether to send an alert to Alertmanager, and Alertmanager then decides whether to send a notification.

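Alerts go through three states

Inactive: the rule's condition is not met, so nothing is triggered.
Pending: the condition is met, but not yet for the waiting time you set, which is the for in the rule.
Firing: the condition has held for the whole waiting time, and the alert is sent out by email, DingTalk, etc.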
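
The three states can be illustrated with a small simulation. This is only a sketch of the lifecycle, not how Prometheus is actually implemented; the function name alert_state and its parameters are made up for illustration.

```python
from typing import Optional


def alert_state(active_since: Optional[float], now: float, for_seconds: float) -> str:
    """Return the state of one alert at time `now`.

    active_since: when the rule expression first became true, or None if it
                  is currently false (hypothetical parameter, for illustration).
    for_seconds:  the `for` duration from the rule, in seconds.
    """
    if active_since is None:
        return "inactive"   # expression is false, nothing is triggered
    if now - active_since < for_seconds:
        return "pending"    # true, but not yet for long enough
    return "firing"         # true for at least `for`, so the alert is sent


# The expression becomes true at t=100 and the rule has `for: 5m` (300 seconds)
print(alert_state(None, 50, 300))
print(alert_state(100, 200, 300))
print(alert_state(100, 400, 300))
```

The three calls walk through inactive, pending and firing in that order: the alert only fires once the condition has been true for the full 300 seconds.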

Without further ado, let's start monitoring the Redis cluster.

Monitoring a Redis cluster with redis_exporter

The exporter to use for a given application is listed on the official Prometheus website, under Exporters and Integrations.

For Redis, the exporter is redis_exporter: https://github.com/oliver006/redis_exporter

Supports Redis 2.x – 5.x

Installation and parameters

Download and unpack the release:

wget https://github.com/oliver006/redis_exporter/releases/download/v1.3.5/redis_exporter-v1.3.5.linux-amd64.tar.gz
tar zxvf redis_exporter-v1.3.5.linux-amd64.tar.gz
cd redis_exporter-v1.3.5.linux-amd64/
./redis_exporter <flags>

redis_exporter supports many flags, but only a few matter here:

./redis_exporter --help
Usage of ./redis_exporter:
    -redis.addr string
    Address of the Redis instance to scrape (default "redis://localhost:6379")
    -redis.password string
    Password of the Redis instance to scrape
    -web.listen-address string
    Address to listen on for web interface and telemetry. (default ":9121")

Monitoring a single Redis instance

nohup ./redis_exporter -redis.addr 172.18.11.138:6379 -redis.password xxxxx &

Add the single instance to Prometheus

  - job_name: redis_since
    static_configs:
      - targets: ['172.18.11.138:9121']
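
Before wiring the exporter into Prometheus, it can help to fetch its /metrics page and look for redis_up. Below is a minimal sketch in Python; the helper name read_metric and the sample text are illustrative only, and real exporter output contains many more lines.

```python
from typing import Optional


def read_metric(text: str, name: str) -> Optional[float]:
    """Return the value of the first sample called `name` in Prometheus text format."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        metric, _, value = line.rpartition(" ")
        # the metric part is either `name` or `name{labels}`
        if metric == name or metric.startswith(name + "{"):
            return float(value)
    return None


# Illustrative sample, shaped like the page served at http://172.18.11.138:9121/metrics
sample = """\
# TYPE redis_up gauge
redis_up 1
"""
print(read_metric(sample, "redis_up"))
```

In practice the text would come from the exporter itself, for example via urllib.request.urlopen on the /metrics URL. A value of 1 for redis_up means the exporter could reach Redis.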

Redis cluster monitoring solution

This took some effort. Most of what I found online only covers monitoring a single instance.

Approaches I tried first.
Both of the following fail with an authentication error:

level=error msg="Redis INFO err: NOAUTH Authentication required."

Method one

nohup ./redis_exporter -redis.addr 172.18.11.139:7000 172.18.11.139:7001 172.18.11.140:7002 172.18.11.140:7003 172.18.11.141:7004 172.18.11.141:7005 -redis.password xxxxx &

Method two

nohup ./redis_exporter -redis.addr redis://h:sxxmy.312==/@172.18.11.139:7000 redis://h:sxxmy.312==/@172.18.11.139:7001 redis://h:sxxmy.312==/@172.18.11.140:7002 redis://h:sxxmy.312==/@172.18.11.140:7003 redis://h:sxxmy.312==/@172.18.11.141:7004 redis://h:sxxmy.312==/@172.18.11.141:7005 -redis.password xxxxx &

I then considered the crudest approach: starting one redis_exporter per instance. But in that setup many cluster-level metrics, such as cluster_slot_fail, cannot be used, so I abandoned this method.

nohup ./redis_exporter -redis.addr 172.18.11.139:7000 -redis.password xxxxxx -web.listen-address 172.18.11.139:9121 > /dev/null 2>&1 &
nohup ./redis_exporter -redis.addr 172.18.11.139:7001 -redis.password xxxxxx -web.listen-address 172.18.11.139:9122 > /dev/null 2>&1 &
nohup ./redis_exporter -redis.addr 172.18.11.140:7002 -redis.password xxxxxx -web.listen-address 172.18.11.139:9123 > /dev/null 2>&1 &
nohup ./redis_exporter -redis.addr 172.18.11.140:7003 -redis.password xxxxxx -web.listen-address 172.18.11.139:9124 > /dev/null 2>&1 &
nohup ./redis_exporter -redis.addr 172.18.11.141:7004 -redis.password xxxxxx -web.listen-address 172.18.11.139:9125 > /dev/null 2>&1 &
nohup ./redis_exporter -redis.addr 172.18.11.141:7005 -redis.password xxxxxx -web.listen-address 172.18.11.139:9126 > /dev/null 2>&1 &

In the end I filed an issue on GitHub. After some back-and-forth with the author in my broken English, I finally understood... it turns out the official documentation already covers this:

scrape_configs:
  ## config for the multiple Redis targets that the exporter will scrape
  - job_name: 'redis_exporter_targets'
    static_configs:
      - targets:
        - redis://first-redis-host:6379
        - redis://second-redis-host:6379
        - redis://second-redis-host:6380
        - redis://second-redis-host:6381
    metrics_path: /scrape
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: <<REDIS-EXPORTER-HOSTNAME>>:9121
  
  ## config for scraping the exporter itself
  - job_name: 'redis_exporter'
    static_configs:
      - targets:
        - <<REDIS-EXPORTER-HOSTNAME>>:9121
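
What the relabeling does becomes clearer once you look at the request Prometheus ends up making. Each target address is copied into the target URL parameter and kept as the instance label, and the address that is actually contacted is replaced by the exporter's. A rough sketch in Python of the resulting URL follows; the function name scrape_url and the host name redis-exporter-host are made up for illustration.

```python
from urllib.parse import urlencode


def scrape_url(target: str, exporter: str, metrics_path: str = "/scrape") -> str:
    """Build the URL Prometheus requests once the relabel_configs are applied."""
    # __address__ -> __param_target: the Redis address becomes ?target=...
    query = urlencode({"target": target})
    # __address__ is then overwritten with the exporter's host:port
    return f"http://{exporter}{metrics_path}?{query}"


# "redis-exporter-host" is a hypothetical stand-in for <<REDIS-EXPORTER-HOSTNAME>>
for t in ["redis://first-redis-host:6379", "redis://second-redis-host:6380"]:
    print(scrape_url(t, "redis-exporter-host:9121"))
```

So a single exporter process serves every Redis node: Prometheus always talks to the exporter, and the target parameter tells the exporter which node to query for that scrape.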

Redis cluster in practice

Start redis_exporter

nohup ./redis_exporter -redis.password xxxxx &

The key point is the Prometheus configuration:

  - job_name: 'redis_exporter_targets'
    static_configs:
      - targets:
        - redis://172.18.11.139:7000
        - redis://172.18.11.139:7001
        - redis://172.18.11.140:7002
        - redis://172.18.11.140:7003
        - redis://172.18.11.141:7004
        - redis://172.18.11.141:7005
    metrics_path: /scrape
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 172.18.11.139:9121
  ## config for scraping the exporter itself
  - job_name: 'redis_exporter'
    static_configs:
      - targets:
        - 172.18.11.139:9121

With this, the cluster data is collected. But the exporter log shows:

time="2019-12-17T09:10:49+08:00" level=error msg="Couldn't connect to redis instance"

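During my lunch break it suddenly clicked: the exporter itself only needs to connect to one node of the cluster, and the other nodes are still queried through the /scrape targets. Started without -redis.addr, it falls back to the default redis://localhost:6379, which is most likely what the error above was about. So I changed the startup command to:

nohup ./redis_exporter -redis.addr 172.18.11.141:7005 -redis.password xxxxx &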

Prometheus configuration remains unchanged


Alerting rules

groups:
- name: Redis
  rules:
    - alert: RedisDown
      expr: redis_up == 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Redis down (instance {{ $labels.instance }})"
        description: "Redis is down\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: MissingBackup
      expr: time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Missing backup (instance {{ $labels.instance }})"
        description: "Redis has not been backed up for 24 hours\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: OutOfMemory
      expr: redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Out of memory (instance {{ $labels.instance }})"
        description: "Redis is running out of memory (> 90%)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: ReplicationBroken
      expr: delta(redis_connected_slaves[1m]) < 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Replication broken (instance {{ $labels.instance }})"
        description: "Redis instance lost a slave\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: TooManyConnections
      expr: redis_connected_clients > 1000
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Too many connections (instance {{ $labels.instance }})"
        description: "Redis instance has too many connections\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: NotEnoughConnections
      expr: redis_connected_clients < 5
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Not enough connections (instance {{ $labels.instance }})"
        description: "Redis instance should have more connections (> 5)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: RejectedConnections
      expr: increase(redis_rejected_connections_total[1m]) > 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Rejected connections (instance {{ $labels.instance }})"
        description: "Some connections to Redis have been rejected\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
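
To make the thresholds concrete, here is the arithmetic behind two of the rules, MissingBackup and OutOfMemory, redone in Python with made-up sample values. The function names are illustrative only; Prometheus evaluates the PromQL expressions itself.

```python
def missing_backup(now: float, last_save_ts: float) -> bool:
    """time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24"""
    return now - last_save_ts > 60 * 60 * 24


def out_of_memory(used_bytes: float, total_bytes: float) -> bool:
    """redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90"""
    return used_bytes / total_bytes * 100 > 90


GIB = 1024 ** 3

# Last RDB save was 25 hours ago: the condition is true
print(missing_backup(now=1_000_000, last_save_ts=1_000_000 - 25 * 3600))
# 15 GiB used on a 16 GiB host is 93.75%: the condition is true
print(out_of_memory(15 * GIB, 16 * GIB))
# 8 GiB used on a 16 GiB host is 50%: the condition is false
print(out_of_memory(8 * GIB, 16 * GIB))
```

Keep in mind that a true condition only puts the alert into the pending state; with for: 5m it has to stay true for five minutes before the alert fires.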
