Foreword
Bigkey and hotkey are two common problems with Redis in production. This article analyzes both problems from four angles: what they are, why they are harmful, how to discover them, and how to resolve them.
bigkey
Concept
In plain terms, a big key is a key whose value is very large and takes up a lot of Redis memory; it is essentially a large-value problem. The key name is usually set by the program itself, while the value often comes from data the program does not control, so the value can grow very large.
Because these values are so large, serializing and deserializing them takes a long time. Operating on a big key is therefore usually slow, which can block Redis (commands execute on a single thread) and degrade overall performance.
Use several practical examples to describe the characteristics of the big key:
- A String key whose value is 5 MB (the value itself is too large)
- A List key with 20,000 elements (too many elements)
- A ZSet key with 10,000 members (too many members)
- A Hash key with only 1,000 members, but whose members add up to 10 MB (the members are too large)
The general industry guideline (see the Alibaba and Kuaishou Redis development specifications) is:
Keep string values within 10 KB, and keep the number of elements in a hash, list, set, or zset below 5,000.
Hazard
- Slow queries: a big key holds a lot of data, so a single request on it can take a very long time to execute
- Unbalanced cluster memory: in cluster mode, nodes holding more big keys use much more memory than others, which hurts cluster stability
- Blocking on expiration: when a big key expires (is deleted), the single-threaded Redis server blocks while freeing it, stalling other client commands
- Network saturation: imagine a 10 MB string value requested by 10,000 clients at once; that is roughly 10 MB × 10,000 ≈ 100 GB of outbound traffic, far beyond what a typical network card can deliver, which disrupts the normal operation of the server
Discover
The main idea is to scan all the keys in Redis and check the size of each key.
- redis-cli ships with a --bigkeys option that scans the keyspace and reports, for each data type, the key that occupies the most space (by string length or element count)
- Alibaba Cloud's Redis big key search tool
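A minimal illustration of the first approach (this requires a reachable Redis instance; the `-i` flag throttles the underlying SCAN loop to limit the impact on production traffic):

```
redis-cli --bigkeys -i 0.1
```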
Delete
There are two main approaches: 1. incremental (loop) deletion; 2. Redis's asynchronous delete command.
Loop through delete
- Hash delete: hscan + hdel
```java
// Requires redis.clients.jedis.{Jedis, ScanParams, ScanResult},
// java.util.List and java.util.Map.Entry
public void delBigHash(String host, int port, String password, String bigHashKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    ScanParams scanParams = new ScanParams().count(100);
    String cursor = "0";
    do {
        // Fetch roughly 100 fields per iteration and delete them one by one
        ScanResult<Entry<String, String>> scanResult = jedis.hscan(bigHashKey, cursor, scanParams);
        List<Entry<String, String>> entryList = scanResult.getResult();
        if (entryList != null && !entryList.isEmpty()) {
            for (Entry<String, String> entry : entryList) {
                jedis.hdel(bigHashKey, entry.getKey());
            }
        }
        cursor = scanResult.getStringCursor();
    } while (!"0".equals(cursor));
    // Finally delete the (now nearly empty) key itself
    jedis.del(bigHashKey);
}
```
- List delete: ltrim
```java
public void delBigList(String host, int port, String password, String bigListKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    long llen = jedis.llen(bigListKey);
    int counter = 0;
    int left = 100;
    while (counter < llen) {
        // Trim 100 elements off the left end each time
        jedis.ltrim(bigListKey, left, llen);
        counter += left;
    }
    // Finally delete the key
    jedis.del(bigListKey);
}
```
- Set delete: sscan + srem
```java
public void delBigSet(String host, int port, String password, String bigSetKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    ScanParams scanParams = new ScanParams().count(100);
    String cursor = "0";
    do {
        ScanResult<String> scanResult = jedis.sscan(bigSetKey, cursor, scanParams);
        List<String> memberList = scanResult.getResult();
        if (memberList != null && !memberList.isEmpty()) {
            for (String member : memberList) {
                jedis.srem(bigSetKey, member);
            }
        }
        cursor = scanResult.getStringCursor();
    } while (!"0".equals(cursor));
    // Finally delete the key itself
    jedis.del(bigSetKey);
}
```
- SortedSet delete: zscan + zrem
```java
public void delBigZset(String host, int port, String password, String bigZsetKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    ScanParams scanParams = new ScanParams().count(100);
    String cursor = "0";
    do {
        ScanResult<Tuple> scanResult = jedis.zscan(bigZsetKey, cursor, scanParams);
        List<Tuple> tupleList = scanResult.getResult();
        if (tupleList != null && !tupleList.isEmpty()) {
            for (Tuple tuple : tupleList) {
                jedis.zrem(bigZsetKey, tuple.getElement());
            }
        }
        cursor = scanResult.getStringCursor();
    } while (!"0".equals(cursor));
    // Finally delete the key itself
    jedis.del(bigZsetKey);
}
```
Delete asynchronously
Since Redis 4.0, keys can be deleted asynchronously: just use the UNLINK command, which hands the actual memory reclamation to Redis's lazyfree background thread.
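For example (`mybigkey` is a placeholder name), the only change on the caller's side is the command name:

```
DEL mybigkey      # synchronous: blocks the server while the value is freed
UNLINK mybigkey   # asynchronous: unlinks the key at once, frees memory in a background thread
```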
Resolve
The main way to solve the big key problem is big key splitting: split the data of one big key into multiple small keys, then access them via client-side sharding.
For example:
- string type
  - Convert the string into a hash or list, then split that hash or list as described below
  - For a plain string, you can also:
    1. Use a more compact serialization protocol
    2. Apply a compression algorithm, compressing on write and decompressing on read
- list type: split into small keys such as list:0, list:1, list:2, …, list:N, and route each element to a subkey by id % N
- set type: same as list
- …
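The id % N routing above can be sketched as a small helper. This is a minimal sketch: the base key name, separator, and modulus are illustrative choices, not a fixed convention.

```java
class BigKeySplitter {
    // Route an element id to one of n subkeys, e.g. id 13 with n = 10 -> "list:3".
    // All reads and writes for that id must then target the same subkey,
    // so each subkey stays small.
    static String subKeyFor(String baseKey, long id, int n) {
        return baseKey + ":" + (id % n);
    }
}
```

Because the mapping is deterministic, any client can compute the subkey locally without coordination; the cost is that operations spanning the whole logical collection now require touching all N subkeys.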
hotkey
Concept
The data for any given key lives on a single Redis instance on one server. If a huge number of requests suddenly target that key, the traffic concentrates on that one instance and can hit its processing ceiling: the instance's CPU usage reaches 100%, or its network card traffic hits its limit, degrading the stability and availability of the system. In the worst case the server goes down and can no longer serve requests at all.
For a single Redis instance, the industry rule of thumb puts the theoretical ceiling at roughly 100,000 OPS; the actual number depends on the machine's configuration.
Hazard
Overly concentrated traffic overloads a single Redis node (roughly 100,000 OPS per node), which can crash the Redis service; a flood of failed Redis requests then falls through to the database, which crashes in turn, making the entire service unavailable.
Clearly, hot keys can do great harm to service availability, so we should discover and resolve them promptly.
Discover
As the analysis above shows, the harm done by a hot key is considerable. We cannot afford to wait until a hot key has already dragged down the service before acting: by then the damage to the business is self-evident. If we can instead monitor for hot keys and spot them before they appear, that matters greatly for the stability of the business system. So what means do we have to spot a hot key ahead of time?
1. Estimate business traffic
Based on the activities and features a business is launching, we can predict in advance that certain scenarios will produce hot keys. For example, the data involved in a promotion will all be cached in Redis, which is very likely to cause hot key problems.
- Advantages: simple; hot keys are found from experience, so they can be detected and handled early
- Disadvantages: not every hot key can be predicted; some, such as breaking news events, cannot be foreseen in advance
2. Client-side monitoring
We normally talk to the Redis server through an SDK (for example the Java client libraries Jedis and Redisson). We can wrap that client library so that it collects statistics before each request is sent and periodically reports the collected data to a central service for aggregation.
- Advantages: a simple solution
- Disadvantages:
  - It intrudes on client code, or requires secondary development of the SDK
  - It cannot adapt to a multi-language architecture: an SDK must be developed for each language, so development and maintenance costs grow over time
3. Proxy-layer monitoring
If all Redis requests go through a proxy, you can instead modify the proxy code to collect statistics; the idea is essentially the same as on the client.
- Advantages: completely transparent to users, and it avoids the client SDK's language-heterogeneity and version-upgrade problems
- Disadvantages:
  - Higher development cost than the client-side approach
  - Not every Redis cluster architecture has a proxy (this approach requires deploying one)
4. Redis's built-in commands
The hotkeys option
Redis 4.0.3 added a hot key lookup feature: run redis-cli --hotkeys to scan the current keyspace. It is implemented with SCAN plus OBJECT FREQ, so it requires maxmemory-policy to be set to an LFU policy.
- Advantages: no secondary development needed; a ready-made tool can be used directly
- Disadvantages:
  - It has to scan the entire keyspace, so it is far from real-time
  - Scan time grows with the number of keys; with many keys it may take a long time
The monitor command
The monitor command streams every command the Redis server receives in real time. You can capture data with redis-cli monitor and feed it into a ready-made analysis tool, such as redis-faina, to count hot keys.
- Advantages: no secondary development needed; ready-made tools can be used directly
- Disadvantages: under high concurrency, this command risks blowing up memory (its output buffer grows quickly) and also reduces Redis performance
5. Rely on cloud vendors' infrastructure
Major cloud vendors all provide hot key and big key discovery; likewise, the in-house Redis monitoring tools built on large companies' base platforms can discover hot keys and big keys.
Resolve
1. Multi-level cache
When a hot key appears, load it into the application's JVM; subsequent requests for that key are then served directly from the JVM instead of going to the Redis layer. Many tools can serve as this local cache, such as Ehcache, the Cache utility in Google Guava, or even a plain HashMap.
There are two issues to watch when using a local cache:
- If hot keys are cached locally, keep the local cache from growing so large that it pressures the JVM heap
- You must handle read/write data consistency between the local cache and the Redis cluster
2. Load balancing
From the earlier analysis we know that a hot key arises when a large number of requests for the same key all land on the same Redis instance. If we can spread those requests across different instances and prevent the traffic skew, the hot key problem disappears.
So how do we split requests for one hot key across different instances? We can use the hot key backup approach. The basic idea is to append a random prefix or suffix to the hot key, turning one key into M copies, where M is a multiple of the number of Redis instances. Accessing the one Redis key then becomes accessing one of M Redis keys; after sharding, those M keys are distributed across different instances, so the access traffic is spread evenly over all of them.
```go
// N is the number of Redis instances; M is 2 times N
func getData() {
    const M = N * 2
    // Generate a random number in [0, M)
    random = GenRandom(0, M)
    // Construct the backup key name
    bakHotKey = hotKey + "_" + random
    data = redis.GET(bakHotKey)
    if data == NULL {
        data = redis.GET(hotKey)
        if data == NULL {
            // Watch out for cache breakdown and cache avalanche here
            data = GetFromDB()
            redis.SET(hotKey, data, expireTime)
            redis.SET(bakHotKey, data, expireTime+GenRandom(0, 5))
        } else {
            redis.SET(bakHotKey, data, expireTime+GenRandom(0, 5))
        }
    }
    return data
}
```
Problems:
- It wastes Redis memory. You can add a switch in the configuration center and only read the temporary backup keys while the switch is on
- Data consistency:
  - Consistency across the Redis nodes cannot be guaranteed; data may briefly disagree while the copies are being synchronized
  - When the data is updated, every Redis copy must be updated at the same time, and inconsistency can appear there too
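To make the update fan-out concrete, here is a hypothetical helper that enumerates every backup copy of a hot key, following the `hotKey + "_" + random` naming from the pseudocode above. Any update must write all of these keys (plus the original), which is exactly where the inconsistency window comes from.

```java
import java.util.ArrayList;
import java.util.List;

class HotKeyBackup {
    // Hypothetical helper: the names of all M backup copies of a hot key.
    // An updater loops over this list and rewrites every copy; until the
    // loop finishes, readers may see a mix of old and new values.
    static List<String> backupKeys(String hotKey, int m) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            keys.add(hotKey + "_" + i);
        }
        return keys;
    }
}
```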
Summary
No single hot key solution is a silver bullet for every scenario; we must choose based on the specific business. What we can see, though, is that every solution brings some consistency problem. But since hot keys only arise under high concurrency, in that setting we generally only need to guarantee eventual consistency, not strong consistency.
This also reveals a familiar truth: consistency and availability cannot both be had in full, exactly as CAP predicts.