1. HyperLoglog-cardinality statistics
1. HyperlogLog data type characteristics
- Redis HyperLogLog is an algorithm used for cardinality statistics to complete the statistics of independent totals
- The advantage of HyperLogLog is that when the number or volume of input elements is very, very large, the space required to calculate the cardinality is always fixed and small. With a cost of 12 KB of memory, the cardinality of nearly 2^64 different elements can be calculated.
- Because HyperLogLog only calculates the cardinality based on the input elements, and does not store the input elements themselves, HyperLogLog cannot return the individual elements of the input like a set.
- Its bottom layer uses the string data type
- It is an imprecise statistical algorithm with a standard error of 0.81%
What is cardinality
The number of unique elements in the dataset.
2. Application scenarios
-
Page visits (UV): A user who visits multiple times can only be counted as one person.
The traditional implementation stores the user’s id and compares each time. This method wastes space when there are more users, and our purpose is just to count, Hyperloglog can help us use the smallest space to complete. -
That is, if fault tolerance is allowed or the accuracy requirements are not so high, then Hyperloglog must be used!
-
If fault tolerance is not allowed, just use set or your own data type!
3. Commonly used API and use Spring client test
redis native API
command | description |
---|---|
pfadd key element1 [elememt2…] | Add the specified element to the HyperLogLog |
pfcount key [key] | Returns the cardinality estimate for the given HyperLogLog. |
pfmerge mergekey sourcekey [sourcekey…] | Merge multiple HyperLogLogs into one HyperLogLog |
Command | Description |
---|---|
setbit key offset value | Set the value for the offset bit of the specified key |
getbit key offset | Get The value of the offset bit |
bitcount key [start end] | The number of bits whose statistical string is set to 1, you can also specify the statistical range by byte |
bitop operation destkey key[key…] | Perform bit operation on one or more string keys that save binary bits, and save the result to destkey. |
BITPOS key bit [start] [end] | Returns the first bit in the string that is set to 1 or 0. start and end can only be byte-wise, not bit-wise |
# Add data and get data 127.0.0.1:6379> setbit sign 0 1 # Set the 0th bit of sign to 1 (integer) 0 127.0.0.1:6379> setbit sign 2 1 # Set the second bit of sign to 1, if not set, the default is 0 (integer) 0 127.0.0.1:6379> setbit sign 3 1 (integer) 0 127.0.0.1:6379> setbit sign 5 1 (integer) 0 127.0.0.1:6379> type sign # The bottom layer is also of type String string 127.0.0.1:6379> getbit sign 2 # Get the value of the second bit (integer) 1 127.0.0.1:6379> getbit sign 3 (integer) 1 127.0.0.1:6379> getbit sign 4 # If not set, the default is 0- and false (integer) 0 # The number of 1 in the statistical data - that is, the number of true 127.0.0.1:6379> BITCOUNT sign # Count the digits of 1 in the sign (integer) 4
spring client operation API
opsForValue().setBit(bitKey, 1, true)
– set state
The default status of each bit is falseopsForValue().getBit(bitKey, 0))
– query statusredisConnection.bitCount(bitKey.getBytes())
– count the number of true status
/** * Test the operation of BitMaps * Records - query and statistics */ @Test public void testBitMap() {<!-- --> String bitKey = "test:bit:01"; // record data status - default false redisTemplate.opsForValue().setBit(bitKey, 1, true); redisTemplate.opsForValue().setBit(bitKey, 4, true); redisTemplate.opsForValue().setBit(bitKey, 7, true); // Inquire System.out.println(redisTemplate.opsForValue().getBit(bitKey, 0)); System.out.println(redisTemplate.opsForValue().getBit(bitKey, 1)); System.out.println(redisTemplate.opsForValue().getBit(bitKey, 2)); // statistics Object execute = redisTemplate. execute(new RedisCallback() {<!-- --> @Override public Object doInRedis(RedisConnection redisConnection) throws DataAccessException {<!-- --> return redisConnection. bitCount(bitKey. getBytes()); } }); System.out.println(execute); }
connection.bitOp()
– bit operation
/** * OR operation * Count the Boolean values of 3 sets of data, and perform OR operation on these 3 sets of data. */ @Test public void testBitMapOperation() {<!-- --> String bitKey2 = "test:bm:02"; redisTemplate.opsForValue().setBit(bitKey2, 0, true); redisTemplate.opsForValue().setBit(bitKey2, 1, true); redisTemplate.opsForValue().setBit(bitKey2, 2, true); String bitKey3 = "test:bm:03"; redisTemplate.opsForValue().setBit(bitKey3, 2, true); redisTemplate.opsForValue().setBit(bitKey3, 3, true); redisTemplate.opsForValue().setBit(bitKey3, 4, true); String bitKey4 = "test:bm:04"; redisTemplate.opsForValue().setBit(bitKey4, 4, true); redisTemplate.opsForValue().setBit(bitKey4, 5, true); redisTemplate.opsForValue().setBit(bitKey4, 6, true); // Merge processing String bitKeyOR = "test:bm:or"; Object obj = redisTemplate. execute(new RedisCallback() {<!-- --> @Override public Object doInRedis(RedisConnection connection) throws DataAccessException {<!-- --> connection.bitOp(RedisStringCommands.BitOperation.OR, bitKeyOR.getBytes(), bitKey2.getBytes(), bitKey3.getBytes(), bitKey4.getBytes()); return connection. bitCount(bitKeyOR. getBytes()); } }); System.out.println(obj); // The number of statistics // After merging, the status of each bit System.out.println(redisTemplate.opsForValue().getBit(bitKeyOR, 0)); System.out.println(redisTemplate.opsForValue().getBit(bitKeyOR, 1)); System.out.println(redisTemplate.opsForValue().getBit(bitKeyOR, 2)); System.out.println(redisTemplate.opsForValue().getBit(bitKeyOR, 3)); System.out.println(redisTemplate.opsForValue().getBit(bitKeyOR, 4)); System.out.println(redisTemplate.opsForValue().getBit(bitKeyOR, 5)); System.out.println(redisTemplate.opsForValue().getBit(bitKeyOR, 6)); }