Paginated list caching, do you really know how to do it?

I have read many articles about caching, and the ideas on multi-level caching and paginated list caching in particular inspired me greatly.

In this article, we will talk about paginated list caching, hoping to help readers deepen their understanding of caching techniques.

1 Directly cache paginated list results

Obviously, this is the simplest and easiest way to understand.

We cache paging results according to different paging conditions. The pseudo code is as follows:

public List<Product> getPageList(String param, int page, int size) {
  String key = "productList:page:" + page + ":size:" + size +
               ":param:" + param;
  List<Product> dataList = cacheUtils.get(key);
  if (dataList != null) {
    return dataList;
  }
  dataList = queryFromDataBase(param, page, size);
  if (dataList != null) {
       cacheUtils.set(key, dataList, Constants.ExpireTime);
  }
  return dataList;
}

The advantages of this solution are engineering simplicity and fast reads, but it has an obvious flaw: the granularity of the list cache is very coarse.

If any item in the list is added, updated, or deleted, every affected page cache must be invalidated to keep the data consistent.

There are two ways:

1. Rely on cache expiration (lazy invalidation), which only works if the business can tolerate briefly stale data;

2. Use the Redis KEYS command to find all paging cache keys for the business and delete them. However, KEYS scans the entire keyspace and blocks Redis's single-threaded server, causing large latency spikes.

Using the KEYS command in a production environment is dangerous and accident-prone; it is not recommended (the incremental SCAN command is the usual non-blocking substitute).

2 Query the object ID list and cache each object entry

Although caching paginated results directly is easy, the cache granularity is too coarse, and keeping the data consistent is troublesome.

So our goal is fine-grained control of the cache.

We query the paginated list of product IDs, cache each product object individually, and then assemble the IDs and cached objects into a list to return to the front end.

The pseudo code is as follows:

Core process:

1. Query the paging ID list from the database

// Query the paginated product ID list from the database
List<Long> productIdList = queryProductIdListFromDataBase(
                           param,
                           page,
                           size);

The corresponding SQL is similar:

-- offset = (page - 1) * size, computed in application code,
-- since MySQL's LIMIT clause does not accept expressions
SELECT id FROM products
ORDER BY id
LIMIT ?, ?
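Because the LIMIT offset must be a plain value, it is computed in application code before the query is issued. A minimal sketch (the class and method names are illustrative):

```java
public class PageOffset {
    // Convert a 1-based page number and page size into a LIMIT offset
    static long offset(int page, int size) {
        // page 1 starts at row 0, page 2 skips the first `size` rows, etc.
        return (long) (page - 1) * size;
    }

    public static void main(String[] args) {
        // e.g. page 3 with 20 rows per page skips the first 40 rows
        System.out.println(offset(3, 20));
        System.out.println(offset(1, 20));
    }
}
```

The offset and size are then bound to the two `?` placeholders of the prepared statement.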

2. Get product objects from cache in batches

Map<Long, Product> cachedProductMap = cacheUtils.mget(productIdList);

If we use a local cache, fetching the objects one by one directly from the local cache is also very fast.

If we use a distributed cache, Redis natively supports batch query commands such as MGET and HMGET.

3. Collect the product IDs that missed the cache

List<Long> noHitIdList = new ArrayList<>(productIdList.size());
for (Long productId : productIdList) {
     if (!cachedProductMap.containsKey(productId)) {
         noHitIdList.add(productId);
     }
}

Because entries may have expired or been evicted, some products may be missing from the cache, so we need to find out which ones.

4. Query the missed product information list from the database in batches and reload it into the cache

First, query the missed products from the database in a single batch query. Please note: one batch query, not one query per ID.

List<Product> noHitProductList = batchQuery(noHitIdList);

The parameter is the list of product IDs that missed the cache, assembled into a single IN query so that the round trip is amortized:

SELECT * FROM products WHERE id IN (1, 2, 3, 4);

Then write the missed products back to the cache using Redis's MSET command (note that MSET cannot set a per-key TTL; expirations, if needed, must be set separately).

// Write the missed products back to the cache
Map<Long, Product> noHitProductMap =
         noHitProductList.stream()
         .collect(
           Collectors.toMap(Product::getId, Function.identity())
         );
cacheUtils.mset(noHitProductMap);
// Merge the missed products into the aggregate map
cachedProductMap.putAll(noHitProductMap);

5. Traverse the product ID list and assemble the object list

List<Product> result = new ArrayList<>(productIdList.size());
for (Long productId : productIdList) {
    Product product = cachedProductMap.get(productId);
    if (product != null) {
       result.add(product);
    }
}

In this solution, when the cache is fully hit, only two network round trips are needed: one database query for the ID list, and one batch Redis read for the objects, so performance is good.

All operations are batched, so even with some cache misses the overall latency stays low.
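Putting the five steps together, here is a minimal runnable sketch, with in-memory maps standing in for Redis and MySQL (the helper names follow the pseudo code above, but the implementations are simulations, not real cache or database calls):

```java
import java.util.*;
import java.util.stream.*;

public class ProductPageCache {
    record Product(long id, String name) {}

    // In-memory stand-ins for the Redis cache and the MySQL table
    static final Map<Long, Product> cache = new HashMap<>();
    static final Map<Long, Product> db = new LinkedHashMap<>();

    // Step 1: simulates "SELECT id FROM products ORDER BY id LIMIT offset, size"
    static List<Long> queryProductIdListFromDataBase(int page, int size) {
        return db.keySet().stream()
                 .sorted()
                 .skip((long) (page - 1) * size)
                 .limit(size)
                 .collect(Collectors.toList());
    }

    // Step 4 helper: simulates "SELECT * FROM products WHERE id IN (...)"
    static List<Product> batchQuery(List<Long> ids) {
        return ids.stream().map(db::get).filter(Objects::nonNull)
                  .collect(Collectors.toList());
    }

    static List<Product> getPageList(int page, int size) {
        List<Long> productIdList = queryProductIdListFromDataBase(page, size);

        // Step 2: batch-read the objects from the cache (mget)
        Map<Long, Product> cachedProductMap = new HashMap<>();
        for (Long id : productIdList) {
            Product p = cache.get(id);
            if (p != null) cachedProductMap.put(id, p);
        }

        // Step 3: collect the IDs that missed the cache
        List<Long> noHitIdList = productIdList.stream()
                .filter(id -> !cachedProductMap.containsKey(id))
                .collect(Collectors.toList());

        // Step 4: batch-load misses from the DB and write them back (mset)
        if (!noHitIdList.isEmpty()) {
            for (Product p : batchQuery(noHitIdList)) {
                cache.put(p.id(), p);
                cachedProductMap.put(p.id(), p);
            }
        }

        // Step 5: assemble the result in the original ID order
        List<Product> result = new ArrayList<>(productIdList.size());
        for (Long id : productIdList) {
            Product p = cachedProductMap.get(id);
            if (p != null) result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        for (long i = 1; i <= 5; i++) db.put(i, new Product(i, "p" + i));
        System.out.println(getPageList(1, 2)); // first page: ids 1, 2
        System.out.println(getPageList(2, 2)); // second page: ids 3, 4
    }
}
```

The second call to `getPageList` for the same page would hit the warmed cache and skip step 4 entirely.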

The "query the object ID list, then cache each object entry" approach is also more flexible: the ID list does not have to come from the database; it can just as well come from a search engine, Redis, and so on.

The following figure shows the search process of OSChina (Open Source China):

The essence is: the paginated search results contain only the business object IDs, and the object details are fetched from the cache plus MySQL.

3 Cache a list of object IDs, along with each object entry

The author once refactored a Moments-style (friend-circle) feed service: on entering the class page, all posts by class members are displayed as a waterfall feed.

Class Circle

We use push mode to store each post ID in a Redis ZSet. A ZSet (sorted set) is a data structure consisting of unique string members kept in sorted order, where each member is associated with a floating-point score.

ZSet uses a member -> score structure:

  • member: the unique member identifier, which also serves as the secondary sort key (when scores are equal, Redis orders members lexicographically)
  • score: the sort score, stored as a double

As shown in the figure above, the ZSet stores the post ID list: the member is the post ID, and the score is the creation timestamp.

The paging effect can be achieved through ZSet’s ZREVRANGE command.

ZREVRANGE is a Redis sorted-set command that returns the members in a given index range, ordered from highest score to lowest.

In order to achieve the paging effect, pass the appropriate start and stop index parameters to ZREVRANGE.
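As a sketch: for a 1-based page number and page size, the ZREVRANGE index range is start = (page - 1) * size and stop = page * size - 1. The simulation below uses a TreeMap in place of a real ZSet (a real implementation would call something like Jedis's `zrevrange`; the TreeMap simplification cannot hold duplicate scores):

```java
import java.util.*;
import java.util.stream.*;

public class FeedPage {
    // ZREVRANGE index range for a 1-based page number
    static int start(int page, int size) { return (page - 1) * size; }
    static int stop(int page, int size)  { return page * size - 1; }

    // Simulate ZREVRANGE: walk members from highest score to lowest,
    // then apply the start/stop window.
    static List<Long> zrevrange(NavigableMap<Double, Long> zset,
                                int page, int size) {
        return zset.descendingMap().values().stream()
                   .skip(start(page, size))
                   .limit(size)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        NavigableMap<Double, Long> zset = new TreeMap<>();
        // key = score (creation timestamp), value = post ID
        for (long id = 1; id <= 5; id++) zset.put((double) id * 1000, id);
        System.out.println(zrevrange(zset, 1, 2)); // newest two posts: [5, 4]
        System.out.println(zrevrange(zset, 2, 2)); // next page: [3, 2]
    }
}
```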

Through the ZREVRANGE command, we can query the post ID list.

After querying the post ID list, we also need to cache each post object. A post includes functional data such as details, comments, likes, and favorites, and each of these needs its own cache configuration.

Whether querying the cache or rewriting it, batch operations are more efficient and improve system performance.

**If the cached object structure is simple, use the MGET and HMGET commands; if the structure is complex, consider pipelines or Lua scripts.** The batch solution the author chose is Redis's pipeline feature.

Let’s walk through the process of fetching a paginated feed:

  1. Use ZSet’s ZREVRANGE command, passing the paging parameters, to query the post ID list;
  2. Pass the post ID list, use Redis’s pipeline feature to batch-fetch post details, comments, likes, and favorites from the cache, and assemble this functional data into the result list.
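A minimal sketch of step 2, with plain HashMaps standing in for the per-post detail, comment, and like caches (`FeedItem` and `assemble` are hypothetical names; a real implementation would queue these reads on a Redis pipeline and flush them in a single round trip):

```java
import java.util.*;

public class FeedAssembler {
    // Hypothetical view object combining a post's functional data
    record FeedItem(long postId, String detail, int comments, int likes) {}

    static List<FeedItem> assemble(List<Long> postIds,
                                   Map<Long, String> detailCache,
                                   Map<Long, Integer> commentCache,
                                   Map<Long, Integer> likeCache) {
        List<FeedItem> feed = new ArrayList<>(postIds.size());
        // With Redis, all of these per-post reads would be queued on one
        // pipeline and sent together instead of one round trip each.
        for (Long id : postIds) {
            String detail = detailCache.get(id);
            if (detail == null) continue; // post missing or evicted
            feed.add(new FeedItem(id, detail,
                                  commentCache.getOrDefault(id, 0),
                                  likeCache.getOrDefault(id, 0)));
        }
        return feed;
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(5L, 4L); // from ZREVRANGE, newest first
        Map<Long, String> details = Map.of(5L, "post five", 4L, "post four");
        Map<Long, Integer> comments = Map.of(5L, 2);
        Map<Long, Integer> likes = Map.of(4L, 7);
        System.out.println(assemble(ids, details, comments, likes));
    }
}
```

Keeping the output in the ID-list order preserves the ZSet's newest-first sorting in the final feed.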

4 Summary

This article introduces three ways to implement paginated list caching:

  1. Caching paginated list results directly

  2. Query a list of object IDs and cache only each object entry

  3. Cache a list of object IDs, along with each object entry

These three approaches build on one another. The keys are:

fine-grained control over caching, and batch loading of objects.