Cloud-native microservices: Spring Cloud Hystrix downgrade and circuit breaking in practice

Directory of series articles

Chapter 1 Application of Java Thread Pool Technology
Chapter 2 Application of CountDownLatch and Semaphore
Chapter 3 Introduction to Spring Cloud
Chapter 4 Spring Cloud Netflix-Eureka
Chapter 5 Spring Cloud Netflix Ribbon
Chapter 6 OpenFeign of Spring Cloud
Chapter 7 GateWay of Spring Cloud
Chapter 8 Hystrix of Spring Cloud Netflix

Article directory

  • Directory of series articles
  • Foreword
  • 1. Hystrix concept
  • 2. Function of Hystrix
    • 2.1. Service downgrade
    • 2.2. Service circuit breaker
  • 3. Case
    • 3.1. Service provider downgrade
      • 3.1.1. Modify pom.xml configuration dependencies
      • 3.1.2. Add downgrade annotation code to the microservice method
      • 3.1.3. Add annotations to the startup class
    • 3.2. Consumer downgrade
      • 3.2.1. Modify pom.xml configuration dependencies
      • 3.2.2. Modify application.yml configuration
      • 3.2.3. Add code to the consumer method
      • 3.2.4. Add annotations to the consumer startup class
    • 3.3. Global downgrade
    • 3.4. Decoupling and downgrading
      • 3.4.1. Modify application.yml
      • 3.4.2. Modify the method
    • 3.5. Service circuit breaker
      • 3.5.1. Steps to implement service circuit breaker in Hystrix
      • 3.5.2. Example
  • Summary

Foreword

When microservices call one another, for example when microservice A calls microservices B and C, which in turn call other microservices, the result is the so-called “fan-out”.

If a microservice on the fan-out chain responds too slowly or becomes unavailable, calls to microservice A tie up more and more system resources until the system crashes. This is the “avalanche effect”.

A component such as Hystrix is therefore needed so that the failure of one microservice does not trigger an avalanche across the whole system, which improves the resilience of the distributed system.

1. Hystrix concept

  • Hystrix is an open source library for handling latency and fault tolerance in distributed systems. It ensures that when one service fails, the failure does not cause an avalanche across the whole system, which improves the resilience of the distributed system;
  • Hystrix acts as a “circuit breaker”: when a service fails, the fault monitoring of the breaker detects it and returns a response that the caller can handle, so that calling threads are not held for a long time and the fault does not spread.

2. Function of Hystrix

2.1. Service downgrade

When a service unit fails, the fault monitoring of the circuit breaker returns an expected, handleable alternative response (the fallback) to the calling method, instead of making it wait for a long time or throwing an exception that it cannot handle. This prevents the fault from propagating through the distributed system.

2.2. Service circuit breaker

When calls to a service unit keep failing and the failure rate reaches a threshold, the circuit breaker opens: subsequent calls are cut off and a preset failure response is returned directly, instead of waiting for a long time or throwing an exception that the calling method cannot handle. This prevents the failure from spreading through the distributed system.

3. Case

3.1. Service provider downgrade

3.1.1. Modify pom.xml configuration dependencies

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    <version>2.2.10.RELEASE</version>
</dependency>

3.1.2. Add downgrade annotation code to the microservice method

// Downgrade to userInfoListFallBack if the call takes more than 3 seconds
@HystrixCommand(fallbackMethod = "userInfoListFallBack", commandProperties = {
        @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
})

  • The fallbackMethod attribute specifies the downgrade (fallback) method.
  • The execution.isolation.thread.timeoutInMilliseconds property sets the timeout for the call. If the method returns within this time it completes normally; otherwise the downgrade method is executed.
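
The timeout-then-downgrade behaviour can be illustrated without Hystrix. The following is a minimal plain-Java sketch, assuming the primary call runs on another thread and the caller waits for at most the configured time. The class TimeoutFallbackSketch and the method callWithFallback are hypothetical names used only for illustration. Unlike Hystrix, the sketch does not interrupt the slow call after the timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Illustrative only: a hand-rolled stand-in for "timeout, then fallback".
class TimeoutFallbackSketch {

    /** Returns the primary result, or the fallback result if primary fails or exceeds timeoutMillis. */
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback, long timeoutMillis) {
        try {
            // Run the primary logic on another thread and wait for a bounded time
            return CompletableFuture.supplyAsync(primary).get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        } catch (ExecutionException | TimeoutException e) {
            // The primary logic threw an exception or took too long: downgrade
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        Supplier<String> slow = () -> {
            try {
                Thread.sleep(500); // simulate a slow provider
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "user list";
        };
        String fast = callWithFallback(() -> "user list", () -> "fallback", 300);
        String late = callWithFallback(slow, () -> "fallback", 100);
        System.out.println(fast); // user list
        System.out.println(late); // fallback
    }
}
```

The caller always receives a usable value within roughly the timeout, which is the point of the downgrade: the calling thread is never held for as long as the slow provider takes.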

3.1.3. Add annotations to the startup class

@EnableHystrix

3.2. Consumer downgrade

3.2.1. Modify pom.xml configuration dependencies

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    <version>2.2.10.RELEASE</version>
</dependency>

3.2.2. Modify application.yml configuration

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 3000

3.2.3. Add code to the consumer method

@HystrixCommand(fallbackMethod = "fallbackMethod")

3.2.4. Add annotations to the consumer startup class

@EnableHystrix

3.3. Global downgrade

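@RestController
@RequestMapping("/user")
@DefaultProperties(defaultFallback = "globalFallback") // class-level default downgrade method
@Slf4j
public class UserConsumer {

    // No fallbackMethod is set here, so the class-level globalFallback is used
    @GetMapping("/userInfoList")
    @HystrixCommand
    public List<UserInfo> userInfoList() {
        // ... call the user service and return its result ...
    }

    // Illustrative default fallback: it takes no parameters and returns a type
    // compatible with the protected method; the empty list is a placeholder result
    public List<UserInfo> globalFallback() {
        return Collections.emptyList();
    }
}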
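
According to the hystrix-javanica documentation, a fallbackMethod declared on @HystrixCommand takes priority over the class-level defaultFallback, and the default fallback takes no parameters apart from an optional Throwable. The plain-Java sketch below only illustrates that order of preference. FallbackPrecedenceSketch and run are hypothetical names, and a null argument stands for "no method-level fallback declared".

```java
import java.util.function.Supplier;

// Plain-Java illustration of fallback precedence; this is not Hystrix code.
class FallbackPrecedenceSketch {

    /** Runs primary; on failure uses methodFallback if one is given, otherwise defaultFallback. */
    static <T> T run(Supplier<T> primary, Supplier<T> methodFallback, Supplier<T> defaultFallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // A method-specific fallback wins over the class-wide default
            Supplier<T> chosen = (methodFallback != null) ? methodFallback : defaultFallback;
            return chosen.get();
        }
    }

    public static void main(String[] args) {
        Supplier<String> failing = () -> { throw new IllegalStateException("provider down"); };
        Supplier<String> global = () -> "global fallback";
        Supplier<String> specific = () -> "method fallback";

        String withoutSpecific = run(failing, null, global);
        String withSpecific = run(failing, specific, global);
        System.out.println(withoutSpecific); // global fallback
        System.out.println(withSpecific);    // method fallback
    }
}
```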

3.4. Decoupling and downgrading

Both a downgrade method specified on a business method and a global downgrade method must be in the same class as the business method to take effect, so business logic and downgrade logic are tightly coupled. Declaring the downgrade class on the Feign client moves the downgrade logic into a class of its own.

3.4.1. Modify application.yml

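feign:
  # Spring Cloud 2020.0 and later use feign.circuitbreaker.enabled;
  # earlier releases use feign.hystrix.enabled instead:
  # hystrix:
  #   enabled: true
  circuitbreaker:
    enabled: true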

3.4.2. Modify the method

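@FeignClient(value = "USER-SERVICE", fallback = UserServiceFallBack.class) // on the Feign client interface
public interface UserService { ... }

@Component // the downgrade class must be registered as a Spring bean
public class UserServiceFallBack implements UserService { ... }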
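
The idea behind this decoupling can be shown in plain Java: the downgrade logic lives in a separate class that implements the same interface, and a wrapper redirects a failed call to it. The sketch below uses a JDK dynamic proxy for the wrapper. UserApi, UserApiFallback, and FallbackProxySketch are hypothetical stand-ins, and this is not how Feign is implemented internally.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the Feign client interface
interface UserApi {
    List<String> userInfoList();
}

// The downgrade logic lives in its own class and implements the same interface
class UserApiFallback implements UserApi {
    @Override
    public List<String> userInfoList() {
        return Collections.emptyList();
    }
}

class FallbackProxySketch {

    /** Wraps target so that a call that throws is answered by the same method on fallback. */
    static <T> T withFallback(Class<T> api, T target, T fallback) {
        Object proxy = Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] { api },
                (self, method, args) -> {
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        // The real call failed: delegate to the fallback implementation
                        return method.invoke(fallback, args);
                    }
                });
        return api.cast(proxy);
    }

    public static void main(String[] args) {
        UserApi failing = () -> { throw new IllegalStateException("provider down"); };
        UserApi guarded = withFallback(UserApi.class, failing, new UserApiFallback());
        System.out.println(guarded.userInfoList()); // []
    }
}
```

The business code only sees the UserApi interface, so the downgrade class can change without touching any business method.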

3.5. Service circuit breaker

The circuit breaker mechanism is a protection mechanism for microservice call chains that emerged to deal with the avalanche effect.
It involves three circuit breaker states:

  • Closed: When service access is normal, the circuit breaker is closed and the service caller can call the service normally.
  • Open: By default, when the call error rate of an interface reaches a threshold (for example, 50%) within a fixed period of time, the circuit breaker opens. While it is open, subsequent calls to the service are cut off and the circuit breaker executes the local downgrade (FallBack) method.
  • Half-Open: After the circuit breaker has been open for a period of time, it enters the half-open state. In this state the circuit breaker tries to resume calls to the service by letting a trial request through (Hystrix lets a single request through) and observing the outcome. If the call succeeds, the service is considered to have returned to normal and the circuit breaker closes; if it fails, the circuit breaker opens again.

3.5.1. Steps to implement service circuit breaker in Hystrix

  • When the call error rate of the service reaches or exceeds the rate specified by Hystrix (default is 50%), the circuit breaker opens.
  • Once the circuit breaker is open, Hystrix starts a sleep time window. During this window the downgrade logic of the service temporarily takes the place of the main business logic, and the original main business logic is unavailable.
  • When a request calls the service again, the downgrade logic is invoked directly and quickly returns a failure response, which avoids a system avalanche.
  • When the sleep time window expires, Hystrix enters the half-open state and lets a trial request call the original main business logic of the service, observing whether it succeeds.
  • If the trial call succeeds, the service is considered to have returned to normal: Hystrix closes the circuit breaker and the original main business logic is restored. Otherwise Hystrix opens the circuit breaker again, the sleep time window restarts, and the cycle above repeats.
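
The cycle described above can be sketched as a small state machine. This is a minimal plain-Java sketch under stated assumptions: a single-threaded caller, plain counters in place of the rolling statistical window used by Hystrix, and the current time passed in explicitly so the behaviour is easy to follow. BreakerSketch and its method names are hypothetical and are not part of the Hystrix API.

```java
import java.util.function.Supplier;

// Illustrative only: not thread-safe, and plain counters stand in for the rolling window.
class BreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;   // minimum calls before the breaker may open
    private final int errorThresholdPercentage; // error rate (percent) that opens the breaker
    private final long sleepWindowMillis;       // time to stay open before a trial call

    private State state = State.CLOSED;
    private int total;
    private int errors;
    private long openedAt;

    BreakerSketch(int requestVolumeThreshold, int errorThresholdPercentage, long sleepWindowMillis) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    State state() { return state; }

    /** Runs primary unless the breaker is open; uses fallback on failure or short circuit. */
    <T> T call(Supplier<T> primary, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < sleepWindowMillis) {
                return fallback.get();   // short circuit: the primary logic is not invoked
            }
            state = State.HALF_OPEN;     // sleep window expired: allow one trial call
        }
        try {
            T result = primary.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure(nowMillis);
            return fallback.get();
        }
    }

    private void onSuccess() {
        if (state == State.HALF_OPEN) {  // trial succeeded: close and reset the statistics
            state = State.CLOSED;
            total = 0;
            errors = 0;
        } else {
            total++;
        }
    }

    private void onFailure(long nowMillis) {
        if (state == State.HALF_OPEN) {  // trial failed: reopen and restart the sleep window
            open(nowMillis);
            return;
        }
        total++;
        errors++;
        if (total >= requestVolumeThreshold && errors * 100 >= errorThresholdPercentage * total) {
            open(nowMillis);
        }
    }

    private void open(long nowMillis) {
        state = State.OPEN;
        openedAt = nowMillis;
    }

    public static void main(String[] args) {
        BreakerSketch breaker = new BreakerSketch(4, 50, 1000);
        Supplier<String> failing = () -> { throw new IllegalStateException("provider down"); };
        Supplier<String> healthy = () -> "ok";
        Supplier<String> fallback = () -> "fallback";

        for (int i = 0; i < 4; i++) {
            breaker.call(failing, fallback, 0);
        }
        System.out.println(breaker.state());                       // OPEN
        System.out.println(breaker.call(healthy, fallback, 500));  // fallback
        System.out.println(breaker.call(healthy, fallback, 1000)); // ok
        System.out.println(breaker.state());                       // CLOSED
    }
}
```

Note that the call at time 500 returns the fallback even though the provider has recovered: while the breaker is open, the primary logic is not tried at all until the sleep window has passed.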
Parameter descriptions:

  • metrics.rollingStats.timeInMilliseconds: the statistical time window over which requests and errors are counted.
  • circuitBreaker.sleepWindowInMilliseconds: the sleep time window. After the circuit breaker has been open for this long, it automatically enters the half-open state.
  • circuitBreaker.requestVolumeThreshold: the request volume threshold. Within the statistical time window, the total number of requests must reach this value before Hystrix may open the circuit breaker. The default is 20, which means that if fewer than 20 calls are made within the statistical time window, the circuit breaker stays closed even if every call fails.
  • circuitBreaker.errorThresholdPercentage: the error percentage threshold. Once the request volume threshold has been met within the statistical time window, the circuit breaker opens when the error percentage reaches this value. A value of 50 means 50%: if the service receives 30 calls in the window and 15 of them fail, the error percentage reaches 50% and the circuit breaker opens.
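
The two thresholds combine into a single decision, which can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the error percentage is truncated to an integer and compared with "greater than or equal". TripRuleSketch and shouldOpen are hypothetical names, not Hystrix API.

```java
// Illustrative only: the open/stay-closed decision for one statistical window.
class TripRuleSketch {

    /** True when the window holds enough requests and the error rate reaches the threshold. */
    static boolean shouldOpen(int totalRequests, int errorCount,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests == 0 || totalRequests < requestVolumeThreshold) {
            return false; // too few requests to judge, even if all of them failed
        }
        int errorPercentage = errorCount * 100 / totalRequests; // truncated integer percent
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        System.out.println(shouldOpen(30, 15, 20, 50)); // true: 50% of 30 calls failed
        System.out.println(shouldOpen(19, 19, 20, 50)); // false: fewer than 20 calls
        System.out.println(shouldOpen(10, 6, 10, 60));  // true: 60% of 10 calls failed
        System.out.println(shouldOpen(10, 5, 10, 60));  // false: 50% is below 60%
    }
}
```

The first two calls use the default thresholds (20 requests, 50%); the last two use a request volume threshold of 10 and an error threshold of 60.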

3.5.2. Example

@HystrixCommand(fallbackMethod = "paymentCircuitBreakerFallback", commandProperties = {
        @HystrixProperty(name = "circuitBreaker.enabled", value = "true"), // enable the circuit breaker
        @HystrixProperty(name = "metrics.rollingStats.timeInMilliseconds", value = "10000"), // statistical time window: 10 seconds
        @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "10"), // minimum number of requests in the window
        @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "30000"), // sleep window: 30 seconds
        @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "60") // error percentage that opens the breaker
})

// UserService.java
public interface UserService {
    String paymentCircuitBreaker(Integer id);
}

// UserServiceImpl.java
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import com.netflix.hystrix.contrib.javanica.conf.HystrixPropertiesManager;
import org.springframework.stereotype.Service;
import java.util.UUID;

@Service
public class UserServiceImpl implements UserService {
    @Override
    @HystrixCommand(fallbackMethod = "paymentCircuitBreakerFallback", commandProperties = {
            @HystrixProperty(name = "circuitBreaker.enabled", value = "true"), // enable the circuit breaker
            @HystrixProperty(name = HystrixPropertiesManager.METRICS_ROLLING_STATS_TIME_IN_MILLISECONDS, value = "10000"), // statistical time window: 10 seconds
            @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_REQUEST_VOLUME_THRESHOLD, value = "10"), // minimum number of requests in the window
            @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "30000"), // sleep window: 30 seconds
            @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "60") // error percentage that opens the breaker
    })
    public String paymentCircuitBreaker(Integer id) {
        if (id < 0) {
            // A negative id simulates a failing call
            throw new RuntimeException("******id cannot be negative");
        }
        String serialNumber = UUID.randomUUID().toString();
        return Thread.currentThread().getName() + "\t" + "The call was successful, serial number: " + serialNumber;
    }

    // Downgrade method: same parameter list and return type as the protected method
    public String paymentCircuitBreakerFallback(Integer id) {
        return "id cannot be negative, please try again later, /(ㄒoㄒ)/~~ id: " + id;
    }
}

Summary

Hystrix is a library open sourced by Netflix for handling latency and fault tolerance in distributed systems. It improves the availability and stability of distributed systems through resource isolation, service downgrade, circuit breaking, and other mechanisms. This chapter introduced service downgrade (on the provider side, on the consumer side, globally, and decoupled through Feign) and the service circuit breaker, with practical examples to make the role of Hystrix easier to understand.