Implementing single-machine current limiting based on Spring Boot + Guava

1. What is current limiting

Services can sometimes be under great pressure due to frequent network requests, especially those caused by illegal network attacks. Such situations sometimes require some restrictions. Current limiting can be considered a type of service degradation. Current limiting is to limit the input and output traffic of the system to achieve the purpose of protecting the system. Generally speaking, the throughput of the system can be measured. In order to ensure the stable operation of the system, once the threshold that needs to be restricted is reached, the traffic needs to be restricted and some measures must be taken to achieve the purpose of limiting the traffic. For example: restricting the other party’s request. This restriction can be based on several bases: requesting IP, user’s unique identifier, requested interface address, etc.

2. Why current limiting is necessary

Internet systems usually face large concurrent and large-traffic requests. In emergencies (the most common scenarios are flash sales and rush sales), the instantaneous large traffic will directly overwhelm the system and make it impossible to provide external services. One of the most common solutions to prevent this situation is current limiting. When requests reach a certain number or rate of concurrency, wait, queue, downgrade, deny service, etc.

For example: the 12306 ticket purchase system adopts current limiting when facing high concurrency. During peak traffic periods, prompts often appear; “There are currently many people in line, please try again later!”

3. Common current limiting algorithms

1. Counter method

Set the maximum number of requests allowed within a time window. If the number of requests in the current window exceeds this set number, subsequent requests within the window will be rejected.

The key point is ① time window ② counter.

For example, we set the maximum number of requests in 1 second to 100, and use a counter to record the number of requests in this second. At the beginning of each second, the counter starts recording the request amount from 0. If the counter reaches 100 within this second, all subsequent requests will be rejected before the second ends.

The counter method is a simple and crude method. That is to say, because it is simple and crude, there will be certain problems. For example, there are 100 requests at 0.9 seconds, and there are 100 requests at 1.1 seconds. According to the calculation of the counter method, the first and second seconds are indeed limited to 100 requests within their respective time ranges, but there are 200 requests between 0.5 seconds and 1.5 seconds. This is a critical problem of the counter method. . If the request volume is concentrated at the critical point of the two time windows, a request spike will occur.

We can use the sliding window method to divide the time window more precisely to solve this problem. Divide 1 second into 10 grids, and each grid uses a separate counter to count the current 100 millisecond request. As time goes by, our window also slides with time, and each time we calculate the request volume of the last 10 grids. If the total number of requests in the last 10 cells exceeds 100, all requests in the next cell will be rejected. The greater the number of grids, the more precise we can control the flow. This part of the logic can be implemented with the help of LinkedList.

The disadvantage of the sliding window is that it needs to occupy memory space to save the request of each grid in the latest time window.

2. Leaky bucket algorithm

We imagine a bucket with a fixed capacity and a hole in the bottom. Request is water poured in from top to bottom. The flow may be rapid or it may be a trickle. Because the capacity of the bucket is fixed, if too much water is poured in, we will not be able to use it if it overflows. So no matter what, the water leaking from the bottom of the bucket flows out at a constant speed. What we have to do is to deal with the water flowing out at a constant speed.

The key points are ① the inflow rate is variable ② the outflow rate is constant ③ only outgoing requests are processed

The advantage of leaky buckets is that the requests that need to be processed are always at a fixed rate, which is particularly stable. The corresponding disadvantage is that leaky buckets cannot solve sudden traffic situations.

3. Token bucket algorithm

Compared to leaky buckets, token buckets are handled in exactly the opposite way. We imagine that we still have a fixed-capacity bucket, but this time tokens are generated at a stable rate and placed in the bucket. When there is a request, a token needs to be obtained to pass it through. Because the capacity of the bucket is fixed, tokens will not be produced when they are full. When there are sufficient tokens in the bucket, burst traffic can directly obtain tokens and pass through. When the tokens are taken out and the bucket is empty, subsequent requests have to wait for tokens to be generated before they can pass. In this case, the rate of request passing becomes stable.

The key point is that ① the token generation rate is fixed ② you can pass if you can get the token

The advantage of token bucket is that it can handle burst traffic and is widely used in various traffic limiting tools. For example, Guava’s RateLimiter.

4. Implementing single-machine current limiting based on RateLimiter

Use File>New>Project>Spring Initializr in IDEA
Choose Spring Web and Lombok

The created pom is as follows

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.0</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>club.westudy</groupId>
    <artifactId>limitFlow-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>limitFlow-demo</name>
    <description>limitFlow-demo</description>
    <properties>
        <java.version>11</java.version>
        <maven.source>11</maven.source>
        <maven.compile>11</maven.compile>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-aop</artifactId>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>30.0-jre</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>

Create annotation

package club.westudy.limitflowdemo.annoations;

import java.lang.annotation.*;
import java.util.concurrent.TimeUnit;

@Retention(RetentionPolicy.RUNTIME)
@Target({<!-- -->ElementType.METHOD})
@Documented
public @interface Limit {<!-- -->
    /**
     * The key of the resource, unique
     * Function: different interfaces, different flow control
     */
    String key() default "";
 
    /**
     * Maximum number of access restrictions
     */
    double permitsPerSecond () ;
 
    /**
     * Get the maximum waiting time for tokens
     */
    long timeout();
 
    /**
     * Maximum waiting time to obtain the token, unit (eg: minutes/seconds/milliseconds) Default: milliseconds
     */
    TimeUnit timeunit() default TimeUnit.MILLISECONDS;
 
    /**
     * Prompt message for not getting token
     */
    String msg() default "The system is busy, please try again later.";
}

Create AOP aspects

package club.westudy.limitflowdemo.aop;

import club.westudy.limitflowdemo.annoations.Limit;
import com.google.common.collect.Maps;
import com.google.common.util.concurrent.RateLimiter;
import lombok.extern.slf4j.Slf4j;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.reflect.MethodSignature;
import org.springframework.stereotype.Component;

import java.lang.reflect.Method;
import java.util.Map;

@Slf4j
@Aspect
@Component
public class LimitAop {<!-- -->
    /**
     * Store the token bucket, the key is the token name, and the value is the token bucket
     */
    private final Map<String, RateLimiter> limitMap = Maps.newConcurrentMap();

    @Around("@annotation(club.westudy.limitflowdemo.annoations.Limit)")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {<!-- -->
        MethodSignature signature = (MethodSignature) joinPoint.getSignature();
        Method method = signature.getMethod();
        //Get limit annotation
        Limit limit = method.getAnnotation(Limit.class);
        if (limit != null) {<!-- -->
            //key function: different interfaces, different flow control
            String key = limit.key();
            RateLimiter rateLimiter = null;
            //Verify whether the cache has a hit key
            if (!limitMap.containsKey(key)) {<!-- -->
                //Create token bucket
                rateLimiter = RateLimiter.create(limit.permitsPerSecond());
                limitMap.put(key, rateLimiter);
                log.info("New token bucket: {}, capacity: {}", key, limit.permitsPerSecond());
            }
            rateLimiter = limitMap.get(key);
            // get token
            boolean acquire = rateLimiter.tryAcquire(limit.timeout(), limit.timeunit());
            // If the command cannot be obtained, an exception prompt will be returned directly.
            if (!acquire) {<!-- -->
                log.debug("Token bucket: {}, failed to obtain token", key);
                return limit.msg();
            }
        }
        return joinPoint.proceed();
    }

}

CreateController

package club.westudy.limitflowdemo.web;

import club.westudy.limitflowdemo.annoations.Limit;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.concurrent.TimeUnit;

@Slf4j
@RestController
@RequestMapping("/limit")
public class LimitController {<!-- -->
    /**
     * When you frantically access http://localhost:8080/limit/test1 in the browser, "There are currently a lot of people in the queue, please try again later!" or "ok" will appear.
     * @return
     */
    @GetMapping("/test1")
    @Limit(key = "limit1", permitsPerSecond = 1, timeout = 500, timeunit = TimeUnit.MILLISECONDS,msg = "There are currently many people in the queue, please try again later!")
    public String limit1() {<!-- -->
        log.info("Token bucket limit2 successfully obtained the token");
        return "ok";
    }

    /**
     * When you access http://localhost:8080/limit/test2 in a browser, "The system is busy, please try again later!" or "ok" will appear.
     * @return
     */
    @GetMapping("/test2")
    @Limit(key = "limit2", permitsPerSecond = 2, timeout = 500, timeunit = TimeUnit.MILLISECONDS,msg = "The system is busy, please try again later!")
    public String limit2() {<!-- -->
        log.info("Token bucket limit2 successfully obtained the token");
        return "ok";
    }
}

When the browser frantically accesses http://localhost:8080/limit/test1, it will refresh twice within 1 second and the message “There are currently many people in the queue, please try again later!”
When the browser frantically accesses http://localhost:8080/limit/test2, the message “The system is busy, please try again later” will appear when it is refreshed three times within 1 second!