2. Sentinel basic usage & rate limiting rules (2)

2.2.1 What is Sentinel
Sentinel is an open-source, lightweight, highly available flow control component for distributed service architectures, developed by the Alibaba middleware team. It takes traffic as its entry point and helps users protect service stability across multiple dimensions, such as flow control, circuit breaking and degradation, and system load protection.

2.2.2 Basic concepts
• Resources (the things that need to be protected)

Resources are a key concept in Sentinel. A resource can be anything within a Java application, such as a service provided by the application, a service provided by another application that the application calls, or even a piece of code. In the rest of this document, we use "resource" to refer to a block of code.

Any code defined through the Sentinel API is a resource and can be protected by Sentinel. In most cases, you can use method signatures, URLs, or even service names as resource names to identify resources.

A resource name can be defined in two ways: it can be set explicitly through @SentinelResource(value = "doTest"); if it is not set, the request endpoint is used by default.

• Rules (rate limiting rules / circuit breaker rules)

Rules are defined around the real-time status of resources and can include flow control rules, circuit breaker degradation rules, and system protection rules. All rules can be adjusted dynamically in real time.

2.2.3 Console
Start method:

Download the jar package and start it from the command line

Download the dashboard jar for the version you need from https://github.com/alibaba/Sentinel/releases, change to the directory containing the jar, and run it with the java command:

java -Dserver.port=7777 -Dcsp.sentinel.dashboard.server=localhost:7777 -Dproject.name=sentinel-dashboard-1.8.0 -jar sentinel-dashboard-1.8.0.jar

Access address: localhost:7777

The default username and password are both: sentinel

2.2.4 Core API usage
The principle of flow control is to monitor indicators such as QPS or the number of concurrent threads of application traffic, and control the traffic when it reaches a specified threshold to prevent the system from being overwhelmed by instantaneous traffic peaks and ensure high application availability.

Let's write a simple example to see how to use Sentinel to implement rate limiting.

First, write a simple order query interface to use in the subsequent rate limiting examples:

(demo01 in the springboot-sentinel project)

Add the Sentinel dependency:

<dependency>
    <groupId>com.alibaba.csp</groupId>
    <artifactId>sentinel-core</artifactId>
    <version>1.8.0</version>
</dependency>

Startup class:

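// SpringbootSentinelApplication.java
import java.util.ArrayList;
import java.util.List;

import com.alibaba.csp.sentinel.slots.block.RuleConstant;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRule;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRuleManager;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringbootSentinelApplication {

    public static void main(String[] args) {
        // Load the flow rules before the application starts serving requests
        initFlowQpsRule();
        SpringApplication.run(SpringbootSentinelApplication.class, args);
    }

    public static void initFlowQpsRule() {
        List<FlowRule> rules = new ArrayList<FlowRule>();
        FlowRule rule1 = new FlowRule();
        rule1.setResource("queryOrderInfo");
        // Limit QPS to 2
        rule1.setCount(2);
        // Threshold type: QPS
        rule1.setGrade(RuleConstant.FLOW_GRADE_QPS);
        rule1.setLimitApp("default");
        rules.add(rule1);
        FlowRuleManager.loadRules(rules);
    }
}

// OrderQueryService.java
import org.springframework.stereotype.Component;

@Component
public class OrderQueryService {
    public String queryOrderInfo(String orderId) {
        System.out.println("Get order information:" + orderId);
        return "return OrderInfo:" + orderId;
    }
}

// OrderController.java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class OrderController {
    @Autowired
    private OrderQueryService orderQueryService;

    @RequestMapping("/getOrder")
    @ResponseBody
    public String queryOrder1(@RequestParam("orderId") String orderId) {
        return orderQueryService.queryOrderInfo(orderId);
    }
}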

Under normal circumstances, calling the order query interface in OrderController returns the order information. How do we keep the QPS of this interface at or below 2? Sentinel rate limiting provides the following implementation methods.

How to use Sentinel:

First, define the rate limiting rules: which interface to limit, what the QPS limit is, which calling app is restricted, and so on:

public void initFlowQpsRule() {
    List<FlowRule> rules = new ArrayList<FlowRule>();
    FlowRule rule1 = new FlowRule();
    // KEY holds the resource name, e.g. "queryOrderInfo" as in the startup class above
    rule1.setResource(KEY);
    // Limit QPS to 2
    rule1.setCount(2);
    // Threshold type: QPS
    rule1.setGrade(RuleConstant.FLOW_GRADE_QPS);
    // "default": do not distinguish between callers
    rule1.setLimitApp("default");
    rules.add(rule1);
    FlowRuleManager.loadRules(rules);
}

There are many ways to implement rate limiting; only two common ones are covered here:

(1) Rate limiting method 1: define resources with the exception-throwing API

This method is more intrusive: the protected code has to be wrapped with the try-catch style API at the point where the interface is called:

/**
 * Rate limiting method 1: define resources with the exception-throwing API
 *
 * @param orderId the order ID to query
 * @return the order information, or a fallback message when rate limited
 */
@RequestMapping("/getOrder2")
@ResponseBody
public String queryOrder2(@RequestParam("orderId") String orderId) {
    Entry entry = null;
    // Resource name
    String resourceName = "queryOrderInfo";
    try {
        // Register entry to the resource; throws BlockException if a rule blocks the call
        entry = SphU.entry(resourceName);
        // Protected logic, here the order query interface
        return orderQueryService.queryOrderInfo(orderId);
    } catch (BlockException blockException) {
        // Execution reaches here when the interface is rate limited
        blockException.printStackTrace();
        return "Interface current limit, return empty";
    } finally {
        // SphU.entry(xxx) must be paired with entry.exit(); otherwise the call chain records become inconsistent
        if (entry != null) {
            entry.exit();
        }
    }
}

Test: when QPS > 2, the interface returns the rate-limited message from the catch block.

We can see output like the following in the log ~/logs/csp/${appName}-metrics.log.xxx (on Windows, for example, under C:\Users\gupao-jingtian\logs\csp):

timestamp|formatted timestamp|resource name|passed requests|rejected requests|number of successes|number of exceptions|rt average response time
1600608724000|2020-09-20 21:32:04|helloWorld|5|6078|5|0|5|0|0|0
1600608725000|2020-09-20 21:32:05|helloWorld|5|32105|5|0|0|0|0|0
1600608726000|2020-09-20 21:32:06|helloWorld|5|41084|5|0|0|0|0|0
1600608727000|2020-09-20 21:32:07|helloWorld|5|72211|5|0|0|0|0|0
1600608728000|2020-09-20 21:32:08|helloWorld|5|60828|5|0|0|0|0|0
1600608729000|2020-09-20 21:32:09|helloWorld|5|41696|5|0|0|0|0|0


(2) Rate limiting method 2: define resources with annotations

The try-catch style API above achieves rate limiting, but it is too intrusive. Using annotations is recommended instead. Unless otherwise noted, the examples below use annotations.

For the use of annotations, see: Sentinel annotation use

First, add the dependency that supports annotations:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-sentinel</artifactId>
</dependency>

In the OrderQueryService class, use the annotation to rate limit the order query method:

/**
 * Order query interface, using the Sentinel annotation to implement rate limiting
 *
 * @param orderId the order ID to query
 * @return the order information
 */
@SentinelResource(value = "queryOrderInfo3", blockHandler = "handleFlowQpsException",
        fallback = "queryOrderInfo3Fallback")
public String queryOrderInfo3(String orderId) {

    // Simulate a business exception thrown while the interface is running
    if ("000".equals(orderId)) {
        throw new RuntimeException();
    }

    System.out.println("Get order information:" + orderId);
    return "return OrderInfo3 :" + orderId;
}

/**
 * Handling logic when the order query interface is rate limited or degraded
 *
 * Note: the return type and parameters must match the original method,
 * with an extra BlockException as the last parameter
 */
public String handleFlowQpsException(String orderId, BlockException e) {
    e.printStackTrace();
    return "handleFlowQpsException for queryOrderInfo3: " + orderId;
}

/**
 * Fallback for exceptions thrown while the order query interface is running
 *
 * Note: the return type and parameters must match the original method,
 * optionally with an extra Throwable as the last parameter
 */
public String queryOrderInfo3Fallback(String orderId, Throwable e) {
    return "fallback queryOrderInfo3: " + orderId;
}

• blockHandler = "handleFlowQpsException" handles Sentinel rate limiting / circuit breaking errors (BlockException);

• fallback = "queryOrderInfo3Fallback" handles exceptions thrown in the interface (business code exceptions, Sentinel rate limiting and circuit breaking exceptions, etc.). Note: both handler methods must have the same return type and parameters as the protected method; the blockHandler takes an additional trailing BlockException parameter, and the fallback may take an additional trailing Throwable parameter.

Test:

/**
 * Rate limiting method 2: define resources with annotations
 *
 * @param orderId the order ID to query
 * @return the order information
 */
@RequestMapping("/getOrder3")
@ResponseBody
public String queryOrder3(@RequestParam("orderId") String orderId) {
    return orderQueryService.queryOrderInfo3(orderId);
}

2.2.5 Flow control
Flow control is a common concept in network transmission, where it is used to adjust the sending rate of network packets. From the perspective of system stability, the rate at which requests are processed also needs careful consideration. Requests can arrive at any time, randomly and uncontrollably, while the system's processing capacity is limited, so we need to control traffic according to what the system can handle. Sentinel acts as a dispatcher that can reshape random requests into a suitable form as needed, as shown in the following figure:

Flow control can be approached from the following perspectives:

• Number of concurrent threads

• QPS

• Resource calling relationship

Whichever approach is used, the core principle of flow control is the same: monitor indicators such as the QPS or the number of concurrent threads of application traffic, and control the traffic once an indicator reaches its threshold, so that the system is not overwhelmed by instantaneous traffic peaks and remains highly available.

In Sentinel, rate limiting manifests directly as a FlowException thrown when executing Entry nodeA = SphU.entry(resourceName). FlowException is a subclass of BlockException, so you can catch BlockException to customize the handling logic after a request is rate limited.

Multiple rate limiting rules can be created for the same resource. FlowSlot traverses all of the resource's rules in order until one rule triggers rate limiting or all rules have been traversed.
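
The traversal described above can be pictured with a small library-free sketch. It is not Sentinel's FlowSlot code: the Rule class and the firstBlockingRule helper are hypothetical names introduced only for illustration, and the current QPS is passed in as a plain number instead of being measured.

```java
// Library-free sketch: Rule and firstBlockingRule are hypothetical names, not Sentinel classes.
class RuleTraversalSketch {
    /** Hypothetical stand-in for a flow rule: a label plus a QPS threshold. */
    static final class Rule {
        final String name;
        final double count;

        Rule(String name, double count) {
            this.name = name;
            this.count = count;
        }
    }

    /**
     * Checks the rules in order. Returns the name of the first rule that would be
     * exceeded by admitting one more request, or null if every rule lets it pass.
     */
    static String firstBlockingRule(double currentQps, Rule... rules) {
        for (Rule rule : rules) {
            if (currentQps + 1 > rule.count) {
                return rule.name; // stop at the first rule that triggers
            }
        }
        return null; // all rules traversed without blocking
    }
}
```

With a loose rule (threshold 10) listed before a strict rule (threshold 2), a resource currently at 2 QPS is blocked by the strict rule, even though the loose rule was checked first and passed.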

A rate limiting rule mainly consists of the following elements, which can be combined to achieve different rate limiting effects:

• resource: the resource name, that is, the target of the rate limiting rule

• count: the rate limiting threshold

• grade: the threshold type (QPS or number of concurrent threads)

• limitApp: the calling source (origin) that flow control applies to. If it is default, the calling source is not distinguished.

    ◦ default: the caller is not distinguished, and requests from any caller are counted in the rate limiting statistics. If the total number of calls to this resource name exceeds the threshold defined by this rule, rate limiting is triggered.

    ◦ {some_origin_name}: the rule targets a specific caller, and only requests from this caller are flow controlled. For example, if NodeA configures a rule for the caller caller1, flow control is triggered if and only if the request to NodeA comes from caller1.

    ◦ other: flow control applies to traffic from all callers other than {some_origin_name}. For example, if resource NodeA is configured with one rule for the caller caller1 and another rule with the caller other, then calls to NodeA from any caller except caller1 cannot exceed the threshold defined by the other rule.

    ◦ Multiple rules can be configured for the same resource name. The order in which the rules take effect is: {some_origin_name} > other > default

• strategy: the rate limiting strategy based on the calling relationship

• controlBehavior: the flow control effect (direct rejection, Warm Up, uniform queuing)

Real-time statistics can be viewed at this address: http://localhost:8719/cnode?id=doTest

• Thread: the number of threads currently processing the resource;

• Pass: the number of requests that arrived within one second;

• blocked: the number of requests that were flow controlled within one second;

• Success: the number of requests successfully processed within one second;

• total: the sum of the requests that arrived and the requests that were blocked within one second;

• RT: the average response time of the resource within one second;

• 1m-pass: the number of requests that arrived within one minute;

• 1m-block: the number of requests that were blocked within one minute;

• 1m-all: the sum of the requests that arrived and the requests that were blocked within one minute;

• exception: the total number of business exceptions within one second.

2.2.5.1 Number of concurrent threads
Concurrency control protects the business thread pool from being exhausted by slow calls. For example, when a downstream application that the application depends on becomes unstable for some reason and its response latency increases, the caller sees lower throughput and more occupied threads, and in extreme cases the thread pool can even be exhausted. To cope with too many occupied threads, the industry uses isolation schemes, such as giving different business logic different thread pools so that the businesses do not compete for resources (thread pool isolation). Although this scheme isolates well, the price is too many threads and a relatively high thread context switching overhead, especially for low-latency calls. Sentinel's concurrency control does not create or manage thread pools; it simply counts the number of threads in the current request context (the number of calls being executed). If the threshold is exceeded, new requests are rejected immediately, which has an effect similar to semaphore isolation. Concurrency control is usually configured on the calling side.
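
The counting idea can be sketched without any library. This is a minimal illustration, assuming a single in-process counter; ConcurrencyLimitSketch, tryEnter, and exit are hypothetical names, not Sentinel APIs, and Sentinel's own bookkeeping is more involved.

```java
// Library-free sketch of semaphore-style concurrency limiting; all names are hypothetical.
class ConcurrencyLimitSketch {
    private final int maxConcurrent;
    private int inFlight = 0;

    ConcurrencyLimitSketch(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    /** Admits the call if fewer than maxConcurrent calls are executing; otherwise rejects at once. */
    synchronized boolean tryEnter() {
        if (inFlight >= maxConcurrent) {
            return false; // no queueing and no extra thread pool: fail fast
        }
        inFlight++;
        return true;
    }

    /** Must be called exactly once after every successful tryEnter(), typically in a finally block. */
    synchronized void exit() {
        inFlight--;
    }

    /** Number of calls currently being executed. */
    synchronized int current() {
        return inFlight;
    }
}
```

A caller would wrap the protected call in tryEnter() and a finally block that calls exit(), which mirrors the pairing of SphU.entry and entry.exit() shown earlier.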

2.2.5.2 QPS traffic
When the QPS exceeds a certain threshold, measures are taken to control the traffic. The flow control effects are direct rejection, Warm Up, and uniform queuing, which correspond to the controlBehavior field in FlowRule.

2.2.5.3 Based on calling relationship
A calling relationship includes a caller and a callee. One method may call other methods, forming a call chain. Rate limiting based on calling relationships triggers flow control according to different dimensions of the call:

• Rate limiting based on the caller

• Rate limiting based on the call chain entry

• Flow control for associated resources

2.2.5.3.1 Rate limiting based on caller
In many scenarios it is also important to rate limit by caller. For example, two services A and B both send requests to the Service Provider. If we want to limit only the requests from service B, we can set the limitApp of the rate limiting rule to the name of service B. The Sentinel Dubbo Adapter automatically resolves the application name of the Dubbo consumer (caller) as the caller name (origin) and carries it along when protecting the resource. If a rule does not configure a caller (default), it takes effect for all callers. If a rule configures a caller, it takes effect only for that caller.

Caller-based rate limiting controls traffic according to the request source. The source is configured through the limitApp attribute, which has three options:

• default: the caller is not distinguished, and requests from any caller are counted in the rate limiting statistics. If the total number of calls to this resource name exceeds the threshold defined by this rule, rate limiting is triggered.

• {some_origin_name}: the rule targets a specific caller, and only requests from this caller are flow controlled. For example, if NodeA configures a rule for the caller caller1, flow control is triggered if and only if the request to NodeA comes from caller1.

• other: flow control applies to traffic from all callers other than {some_origin_name}. For example, if resource NodeA is configured with one rule for the caller caller1 and another rule with the caller other, then calls to NodeA from any caller except caller1 cannot exceed the threshold defined by the other rule.

Multiple rules can be configured for the same resource. The order in which the rules take effect is: {some_origin_name} > other > default
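
One simplified reading of this precedence can be sketched as follows. It picks a single rule per caller, which is an assumption made for illustration and not a description of how Sentinel evaluates rules internally; LimitAppSketch, Rule, and select are hypothetical names.

```java
// Library-free sketch of the precedence {some_origin_name} > other > default; names are hypothetical.
class LimitAppSketch {
    /** Hypothetical stand-in for a flow rule: the limitApp value plus a threshold. */
    static final class Rule {
        final String limitApp;
        final double count;

        Rule(String limitApp, double count) {
            this.limitApp = limitApp;
            this.count = count;
        }
    }

    /** Returns the rule that applies to the given caller, or null if none matches. */
    static Rule select(String origin, Rule... rules) {
        Rule other = null;
        Rule fallback = null;
        for (Rule rule : rules) {
            if (rule.limitApp.equals(origin)) {
                return rule; // a rule naming this caller wins outright
            }
            if ("other".equals(rule.limitApp)) {
                other = rule;
            } else if ("default".equals(rule.limitApp)) {
                fallback = rule;
            }
        }
        return other != null ? other : fallback; // "other" outranks "default"
    }
}
```

With rules for default (100), other (10), and caller1 (5), a request from caller1 is checked against 5, while a request from any other caller is checked against 10.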

2.2.5.3.2 Rate limiting based on call chain entry
A rate-limited method may be reached through different call chains. For example, requests from both Entrance1 and Entrance2 call the resource NodeA. Sentinel allows a resource to be rate limited based only on the statistics of one particular entrance. For example, we can set strategy to RuleConstant.STRATEGY_CHAIN and refResource to Entrance1, which means that only calls from Entrance1 are recorded in the rate limiting statistics of NodeA, while calls arriving through Entrance2 are ignored.
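
To make the effect concrete, here is a library-free sketch in which only calls arriving through one entrance are counted and limited. ChainStrategySketch and tryPass are hypothetical names, the clock is passed in as a millisecond value (assumed non-negative), and a fixed one-second window stands in for Sentinel's own statistics.

```java
// Library-free sketch of entrance-based limiting; all names are hypothetical.
class ChainStrategySketch {
    private final String refEntrance;
    private final int count;
    private long windowStartMs = -1;
    private int passedInWindow = 0;

    ChainStrategySketch(String refEntrance, int count) {
        this.refEntrance = refEntrance;
        this.count = count;
    }

    /** Returns true if a call arriving through the given entrance may proceed at time nowMs. */
    synchronized boolean tryPass(String entrance, long nowMs) {
        if (!refEntrance.equals(entrance)) {
            return true; // other entrances are neither counted nor limited by this rule
        }
        long start = nowMs - (nowMs % 1000);
        if (start != windowStartMs) {
            windowStartMs = start; // a new one-second window begins
            passedInWindow = 0;
        }
        if (passedInWindow >= count) {
            return false;
        }
        passedInWindow++;
        return true;
    }
}
```

With refEntrance set to "Entrance1" and a threshold of 1, a second call through Entrance1 in the same second is rejected, while calls through Entrance2 always pass.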


2.2.5.3.3 Flow control for associated resources
When there is resource contention or a dependency between two resources, the two resources are associated. For example, read and write operations on the same field in a database contend with each other: if reads are too fast they slow down writes, and if writes are too fast they slow down reads. If reads and writes are left to compete freely, the overhead of the contention itself reduces overall throughput. Association-based rate limiting can be used to avoid excessive contention between associated resources. For example, the resources read_db and write_db represent database reads and writes respectively. To give writes priority, we can set a rate limiting rule on read_db: set strategy to RuleConstant.STRATEGY_RELATE and refResource to write_db. Then, when database writes are too frequent, requests to read data are limited.
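
The write-priority behavior can be sketched without Sentinel as follows. This is an illustration under simplifying assumptions: RelateStrategySketch, recordWrite, and tryRead are hypothetical names, time is passed in as a non-negative millisecond value, and writes are counted in a fixed one-second window.

```java
// Library-free sketch of association-based limiting (reads yield to writes); names are hypothetical.
class RelateStrategySketch {
    private final int writeThreshold;
    private long windowStartMs = -1;
    private int writesInWindow = 0;

    RelateStrategySketch(int writeThreshold) {
        this.writeThreshold = writeThreshold;
    }

    private void roll(long nowMs) {
        long start = nowMs - (nowMs % 1000);
        if (start != windowStartMs) {
            windowStartMs = start; // a new one-second window begins
            writesInWindow = 0;
        }
    }

    /** Writes are only counted here; this rule never limits them. */
    synchronized void recordWrite(long nowMs) {
        roll(nowMs);
        writesInWindow++;
    }

    /** A read passes only while the associated write traffic stays under the threshold. */
    synchronized boolean tryRead(long nowMs) {
        roll(nowMs);
        return writesInWindow + 1 <= writeThreshold;
    }
}
```

The key point is that the decision for read_db is driven by the statistics of write_db, not by how many reads have passed.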

2.2.6 Circuit breaker degradation
2.2.6.1 What is circuit breaker degradation?
In addition to flow control, circuit breaking and degrading unstable resources in the call chain is another important measure for ensuring high availability.

Because calling relationships are complex, if one resource in the call chain is unstable, requests eventually pile up. When a resource in the call chain is in an unstable state (for example, calls time out or the exception ratio rises), Sentinel's circuit breaker degradation restricts calls to that resource so that requests fail fast, which avoids affecting other resources and causing cascading failures. Once a resource is degraded, calls to it are automatically cut off for the following degradation time window (the default behavior is to throw a DegradeException).

Sentinel and Hystrix follow the same principle: when a resource in the call chain becomes unstable, for example when timeouts occur or the exception ratio rises, calls to that resource are restricted and requests fail fast, so that other resources are not affected and an avalanche effect is avoided.

2.2.6.2 Circuit Breaker Degradation Design Concept
Sentinel and Hystrix take completely different approaches to restricting calls.

Hystrix uses thread pools to isolate dependencies (which correspond to resources in our terminology). The advantage is that resources are isolated as completely as possible. The disadvantages are the added cost of thread switching and the need to allocate a thread pool size to each resource in advance.

Sentinel takes two approaches to this problem:

• Limiting the number of concurrent threads

Unlike resource pool isolation, Sentinel reduces the impact of unstable resources on other resources by limiting the number of concurrent threads per resource. This avoids the cost of thread switching and does not require you to pre-allocate thread pool sizes. When a resource becomes unstable, for example when its response time increases, the direct effect is that threads gradually pile up on it. Once the number of threads on a particular resource reaches a certain level, new requests for that resource are rejected. Requests are accepted again only after the piled-up threads have finished their tasks.

• Degrading resources by response time

In addition to controlling the number of concurrent threads, Sentinel can quickly degrade unstable resources based on response time. When the response time of a dependent resource is too long, all access to the resource is rejected directly and is not restored until the specified time window has passed.
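
The response-time approach can be illustrated with a deliberately simplified breaker. It is a sketch, not Sentinel's degrade logic: tripping after a fixed number of consecutive slow calls is an assumption chosen for brevity, and SlowCallBreakerSketch, allow, and record are hypothetical names. Time is passed in as milliseconds so the behavior is deterministic.

```java
// Library-free sketch of response-time-based degradation; all names are hypothetical.
class SlowCallBreakerSketch {
    private final long maxRtMs;
    private final int slowCallsToTrip;
    private final long timeWindowMs;
    private int consecutiveSlow = 0;
    private long openUntilMs = Long.MIN_VALUE;

    SlowCallBreakerSketch(long maxRtMs, int slowCallsToTrip, long timeWindowMs) {
        this.maxRtMs = maxRtMs;
        this.slowCallsToTrip = slowCallsToTrip;
        this.timeWindowMs = timeWindowMs;
    }

    /** Returns false while the resource is degraded, so callers can fail fast. */
    synchronized boolean allow(long nowMs) {
        return nowMs >= openUntilMs;
    }

    /** Records the response time of a finished call and degrades the resource if needed. */
    synchronized void record(long rtMs, long nowMs) {
        if (rtMs <= maxRtMs) {
            consecutiveSlow = 0; // a fast call resets the streak
            return;
        }
        consecutiveSlow++;
        if (consecutiveSlow >= slowCallsToTrip) {
            openUntilMs = nowMs + timeWindowMs; // reject everything until the window has passed
            consecutiveSlow = 0;
        }
    }
}
```

With a 100 ms limit, two slow calls to trip, and a 5000 ms window, two slow calls in a row cause every request to be rejected for the next five seconds, after which access is restored.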

The following is a demo of Sentinel’s circuit breaker degradation using an annotation-based approach.

demo02 in the springboot-sentinel project

2.2.7 Control strategy
2.2.7.1 Direct rejection
Direct rejection (RuleConstant.CONTROL_BEHAVIOR_DEFAULT) is the default flow control behavior. When the QPS exceeds the threshold of any rule, new requests are rejected immediately by throwing a FlowException. This behavior is suitable when the processing capacity of the system is known precisely, for example when the system's exact capacity limit has been determined through load testing.
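
As a rough picture of direct rejection, the sketch below counts requests in a fixed one-second window and rejects anything over the threshold. The fixed window is a simplifying assumption and is not how Sentinel keeps its statistics; FixedWindowRejectSketch and tryPass are hypothetical names, and time is passed in as a non-negative millisecond value.

```java
// Library-free sketch of direct rejection with a fixed one-second window; names are hypothetical.
class FixedWindowRejectSketch {
    private final int qpsThreshold;
    private long windowStartMs = -1;
    private int passedInWindow = 0;

    FixedWindowRejectSketch(int qpsThreshold) {
        this.qpsThreshold = qpsThreshold;
    }

    /** Returns true if the request at time nowMs passes, false if it is rejected immediately. */
    synchronized boolean tryPass(long nowMs) {
        long start = nowMs - (nowMs % 1000);
        if (start != windowStartMs) {
            windowStartMs = start; // a new one-second window begins
            passedInWindow = 0;
        }
        if (passedInWindow >= qpsThreshold) {
            return false; // over the threshold: reject without waiting
        }
        passedInWindow++;
        return true;
    }
}
```

With a threshold of 2, the third and later requests within the same second are rejected, and the counter starts over in the next second.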

2.2.7.2 Warm Up
Warm Up (RuleConstant.CONTROL_BEHAVIOR_WARM_UP) is the preheating / cold start mode. When the system has been at a low load level for a long time and traffic suddenly increases, pushing the system straight to a high load level may crush it instantly. With "cold start", the allowed traffic increases slowly and reaches the upper threshold gradually within a certain period of time, which gives the cold system time to warm up and prevents it from being overwhelmed.

As shown in the figure below, the maximum concurrency the system can currently handle is 480. At the bottom mark the system has been idle, and then the request volume suddenly rises steeply. At this point the system does not raise the QPS directly to the maximum value; instead, the threshold increases gradually over a certain period of time, and the period in between is the system's gradual warm-up process.
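
The gradual increase can be approximated with a straight-line ramp. This is only a sketch of the idea: the linear shape and the coldFactor parameter (the ratio between the full threshold and the starting threshold) are assumptions made for illustration, Sentinel's own warm-up calculation differs, and WarmUpSketch and thresholdAt are hypothetical names.

```java
// Library-free sketch of a linear warm-up ramp; all names and the linear shape are assumptions.
class WarmUpSketch {
    /**
     * Returns the allowed QPS after elapsedMs of warm-up, rising linearly from
     * maxQps / coldFactor to maxQps over warmUpPeriodMs.
     */
    static double thresholdAt(double maxQps, double coldFactor, long warmUpPeriodMs, long elapsedMs) {
        double coldQps = maxQps / coldFactor;
        if (elapsedMs <= 0) {
            return coldQps; // system is still cold
        }
        if (elapsedMs >= warmUpPeriodMs) {
            return maxQps; // fully warmed up
        }
        double progress = (double) elapsedMs / warmUpPeriodMs;
        return coldQps + (maxQps - coldQps) * progress;
    }
}
```

Using the maximum of 480 from the text, an assumed cold factor of 3, and an assumed warm-up period of 10 seconds, the allowed QPS starts at 160, reaches 320 after 5 seconds, and reaches 480 after 10 seconds.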

2.2.7.3 Uniform queuing
Uniform queuing (RuleConstant.CONTROL_BEHAVIOR_RATE_LIMITER) strictly controls the interval between requests, that is, it lets requests pass at a uniform rate, which corresponds to the leaky bucket algorithm.

This behavior is mainly used to handle intermittent bursts of traffic, such as message queues. Imagine a scenario where a large number of requests arrive in one second and the following seconds are idle. We want the system to process these requests gradually during the idle period that follows, instead of directly rejecting the excess requests in the first second.
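
The pacing behavior can be sketched as a small leaky-bucket style scheduler. It is an illustration, not Sentinel's implementation: UniformQueueSketch and tryAcquire are hypothetical names, the maximum queueing time is an assumed parameter, and instead of sleeping the method returns how long the caller should wait, with time passed in as milliseconds.

```java
// Library-free sketch of uniform queuing (leaky bucket pacing); all names are hypothetical.
class UniformQueueSketch {
    private final long intervalMs;
    private final long maxQueueingMs;
    // Start far in the past so that the very first request passes without waiting.
    private long latestPassedMs = Long.MIN_VALUE / 2;

    UniformQueueSketch(int qps, long maxQueueingMs) {
        this.intervalMs = 1000L / qps; // fixed gap between two consecutive requests
        this.maxQueueingMs = maxQueueingMs;
    }

    /**
     * Returns how many milliseconds the request arriving at nowMs must wait
     * before it may proceed, or -1 if the wait would exceed maxQueueingMs.
     */
    synchronized long tryAcquire(long nowMs) {
        long expectedMs = latestPassedMs + intervalMs;
        if (expectedMs <= nowMs) {
            latestPassedMs = nowMs; // the slot is free, pass immediately
            return 0;
        }
        long waitMs = expectedMs - nowMs;
        if (waitMs > maxQueueingMs) {
            return -1; // queue is too long, reject
        }
        latestPassedMs = expectedMs; // reserve the next slot for this request
        return waitMs;
    }
}
```

With 2 QPS and a 1000 ms queueing limit, a burst of four requests at the same instant is spread out: the first passes at once, the second waits 500 ms, the third waits 1000 ms, and the fourth is rejected because it would have to wait 1500 ms.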