Performance issues caused by log4j2 synchronized logs

51c2e3a90a3acccd25be4799caaa7076.gif

1. Problem review

1.1 Problem Description

In the performance test of the project, as the number of concurrency of related interfaces increases, the response time of the interface becomes longer, the interface throughput no longer increases, and the CPU usage of the application is high.

1.2 Analysis ideas

Who causes high CPU and blocks the increase of interface TPS? What is the call chain distribution of the interface response time? Is there any slow point?

1) The CPU of the application using flame graph analysis is as follows. The log4j2 log accounts for about 40% of the CPU. It is initially suspected to be a problem with log4j2.

2d4a509847bd8409858e325f70cea584.png

2) Analysis of call chain

Checking the call chain through pfinder, we found that the interface took a total of 78ms, and there were no obviously slow calling methods or slow SQL. We first eliminated the code problems of the interface itself.

1.3 Preliminary Conclusion

For log4j2 problems, you need to analyze the log-related configuration log4j2.xml in detail.

820c55fcff5d88ea12aa94a0eb9eb3d1.png

As you can see above, the nodes under the Loggers node in the log and the nodes are all printed synchronization logs. Synchronized logs mean that the business logic and log output statements of the program are both run in one thread. When there are too many logs, it blocks the operating efficiency of the business to a certain extent. Try changing to asynchronous logging:

Change to asynchronous log configuration: use AsyncLogger

830fb2566a254d285df2b1c79cea7b42.png

1.4 Regression Verification

3eb4ec057912b0023af46e194a7a5d0e.png

After changing the synchronous log to an asynchronous log. For the same 10 concurrency, the TP99 of the interface dropped from 51ms to 23ms, the throughput TPS of the interface increased from 493 to 1078, and the application CPU dropped from 82% to 57%.

Perfect end. The problem is solved, but we still need to study the log4j2 log in detail.

1.5 Conclusion
  • Log4j2’s use of asynchronous logging will greatly improve performance and reduce the impact on the application itself.

  • Fundamentally reduce unnecessary log output.

But how is log4j2 asynchronous log implemented and what is the difference between synchronous log? Why is asynchronous logging more efficient? It inspired me to learn some related knowledge of log4j2. Let me share it with you:

2. log4j2 log

2.1 Advantages of log4j2

Log4j2 is an upgraded version of log4j 1.x. It refers to some excellent designs of logback and fixes some problems, thus bringing some major improvements, mainly including:

e48d1c76040a6cb822595726fd09d7cd.png

  • Exception handling, in logback, exceptions in Appender will not be perceived by the application, but in log4j2, some exception handling mechanisms are provided.

  • Performance improvement. Compared with log4j 1 and logback, log4j2 has obvious performance improvement. There will be official test data later.

  • The automatic reload configuration refers to the design of logback. Of course, it will provide automatic refresh parameter configuration. The most practical thing is that we can dynamically modify the log level in production without restarting the application – which is very sensitive for monitoring. of.

  • No garbage mechanism. In most cases, log4j2 can use its designed garbage-free mechanism to avoid jvm gc caused by frequent log collection.

2.2 Log4J2 log classification

There are two ways to record logs in Log4j2: synchronous logs and asynchronous logs. Asynchronous logs can be divided into two ways: using AsyncAppender and using AsyncLogger. There are three log modes using LOG4J2, fully asynchronous log, mixed mode, and synchronous log. The performance is from high to low. The more threads, the higher the efficiency. It can also avoid the log thread stuck situation.

2f6262cb1bfa56d5000821e7713160db.png

Performance comparison of synchronous and asynchronous logs:

f1949bb42195bd9cad2e2daf658543b0.png

2.3 Synchronization log

Use the synchronized log of log4j2 for log output. The log output statement and the business logic statement of the program will run in the same thread. That is, when outputting the log, you must wait for the log output statement to be executed before executing the subsequent business logic statement.

2.4 Asynchronous logs

When using asynchronous logs for output, the log output statements and business logic statements do not run in the same thread. Instead, a dedicated thread is used for log output operations. The main thread that processes the business logic can execute subsequent business without waiting. logic.

The biggest feature of log4j2 is asynchronous logging. Its performance improvement mainly benefits from asynchronous logging. Let’s take a look at how to use log4j2’s asynchronous logging.

There are two ways to implement asynchronous logs in Log4j2: AsyncAppender and AsyncLogger.
in:

  • AsyncAppender uses ArrayBlockingQueue to save log events that need to be output asynchronously;

  • AsyncLogger uses the Disruptor framework to achieve high throughput.

Note that these are two different implementation methods, with different reflections in design and source code.

2.4.1 AsyncAppender

AsyncAppender is implemented by referencing other Appenders. When log events arrive, another thread will be opened to process them. It should be noted that if an exception occurs during Appender, it will not be perceived by the application. AsyncAppender should be configured after the Appender it references, and is implemented by default using java.util.concurrent.ArrayBlockingQueue without the need for other external class libraries. When using this Appender, you need to be aware that blocking queues are susceptible to lock contention in a multi-threaded environment, which may have an impact on performance.

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn">
 <Appenders>
 <!--Normal Appender configuration, the RollingFile configured here will be referenced by name in the AsyncAppender below-->
 <RollingFile name="RollingFileError" fileName="${Log_Home}/error.${date:yyyy-MM-dd}.log" immediateFlush="true"
filePattern="${Log_Home}/$${date:yyyy-MM}/error-%d{MM-dd-yyyy}-%i.log.gz">
     <PatternLayout pattern="%d{yyyy-MM-dd 'at' HH:mm:ss z} %-5level %logger{36} : %msg%xEx%n"/>
       <ThresholdFilter level="error" onMatch="ACCEPT" onMismatch="DENY"/>
       <Policies>
        <TimeBasedTriggeringPolicy modulate="true" interval="1"/>
        <SizeBasedTriggeringPolicy size="10MB"/>
        </Policies>
 </RollingFile>
 <!--An Appender is configured-->
 <!--Asynchronous AsyncAppender configuration directly refers to the name of RollingFile above-->
 <Async name="Async">
        <AppenderRef ref="RollingFileError"/>
 </Async>
 <!--The asynchronous AsyncAppender is configured and several configurations are needed-->
 </Appenders>
 <Loggers>
         <Root level="error">
          <!--If the name of the asynchronous AsyncAppender is referenced here, it is the asynchronous output log-->
         <!--If you reference the name of RollingFile in the Appenders tag here, it will be a synchronized output log -->
          <AppenderRef ref="Async"/>
</Root>
 </Loggers>
</Configuration>

2.4.2 AsyncLogger

AsyncLogger is the highlight of log4j2 and is also the officially recommended asynchronous method. It can make calling Logger.log return faster. AsyncLogger in Log4j2 uses the Disruptor framework internally.

Introduction to Disruptor

Disruptor is a high-performance queue developed by LMAX, a British foreign exchange trading company. The system developed based on Disruptor can support 6 million orders per second in a single thread.
Currently, many well-known projects including Apache Strom and Log4j2 have applied Disruptor to obtain high performance.
The internal core data structure of the Disruptor framework is RingBuffer, which is a lock-free ring queue.

Why is Disruptor so fast?

  • lock-free – uses CAS to achieve thread safety

  • Using cache line padding to resolve spurious sharing issues

There are two options for asynchronous logging: global async and hybrid async.

1) Global asynchronous

Global asynchronous means that all logs are recorded asynchronously. There is no need to make any changes to the configuration file. You only need to add a parameter when starting the jvm; this is the simplest configuration and provides the best performance.
Then add the log4j2.component.properties configuration file in the src/java/resources directory

Set asynchronous log system properties

log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector

2) Mixed asynchronous

Mixed asynchronous means that you can use synchronous logs and asynchronous logs at the same time in your application, which makes the log configuration more flexible. Because the Log4j document also says that although Log4j2 provides a set of exception handling mechanisms that can cover most states, there are still a small number of special situations that cannot be fully handled. For example, if we record audit logs, then the official It is recommended to use synchronous logs. For other places that only record a program log, using asynchronous logs will greatly improve performance and reduce the impact on the application itself. The mixed asynchronous method needs to be implemented by modifying the configuration file and using AsyncLogger to mark the configuration.

Step 1: Add relevant jar packages to pom

<dependency>
  <groupId>com.lmax</groupId>
  <artifactId>disruptor</artifactId>
  <version>3.4.2</version>
</dependency>

Step 2: An example of mixed configuration of log4j2.xml synchronous log and asynchronous log (configuring AsyncLogger) is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!--Log level and priority sorting: OFF > FATAL > ERROR > WARN > INFO > DEBUG > TRACE > ALL -->
 <!--status="WARN": used to set the information output level of log4j2's own internal log. The default is OFF-->
 <!--monitorInterval="30": Interval seconds, automatically detect changes in configuration files and reconfigure itself-->
<configuration status="WARN" monitorInterval="30">
   <Properties>
       <!--1. Customize some constants and then use ${variable name} to reference -->
       <Property name="logFilePath">log</Property>
       <Property name="logFileName">test.log</Property>
   </Properties>
   <!--2. Appenders: Define output content, output format, output method, log saving strategy, etc. The following three tags are commonly used [console, File, RollingFile]-->
   <!--Configure the destination of log output in Appenders
           console is only the console system.out.println
           rollingFile only generates a new file when the file size reaches the specified size -->
   <appenders>
       <!--console: Configuration of console output-->
       <console name="Console" target="SYSTEM_OUT">
           <!--PatternLayout: The format of the output log, LOG4J2 defines the output code, see the second part for details %p output priority, that is, DEBUG, INFO, WARN, ERROR, FATAL-->
           <PatternLayout pattern="[%d{HH:mm:ss:SSS}] [%p] - %l - %m%n"/>
       </console>
       <!--File: Synchronously output logs to local files-->
       <!--append="false": According to the log policy, clear the file and re-enter the log each time, which can be used for testing-->
       <File name="log" fileName="${logFilePath}/${logFileName}" append="false">
           <!-- Formatted output:
           %d represents the date, %thread represents the thread name,
           %-5level: Level displays 5 characters wide from the left
           %thred: Output the name of the thread that generated the log event
           %class: is the output class
           %L: Line number in the output code
           %M: method name
           %msg: log message,
           %n is a newline character
           %c: The category to which the output log information belongs, usually the full name of the category.
           %t: Output the name of the thread that generated the log event
           %l: Outputs the location where the log event occurred, equivalent to the combination of %C.%M (%F:%L), including the category name, the thread where it occurred, and the number of lines in the code. Example: Testlog4.main(TestLog4.Java:10)
           %p: Output log information priority, namely DEBUG, INFO, WARN, ERROR, FATAL,
           2020.02.06 at 11:19:54 CST INFOcom.example.redistest.controller.PersonController 40 setPerson - 1 piece of data added successfully
           -->
           <!-- %class{36} means the class name can be up to 36 characters -->
           <PatternLayout pattern="%d{HH:mm:ss.SSS} %-5level %class{36} %L %M - %msg%xEx%n"/>
       </File>
       <!--
       The key point is the date format after filePattern and the interval of TimeBasedTriggeringPolicy. Which digit the date format is accurate to and which unit the interval is accurate to.
       1) TimeBasedTriggeringPolicy needs to be used in conjunction with filePattern. Since the minimum time granularity configured in filePattern is set to dd days, it means that a new file is created every day to save the log.
       2) SizeBasedTriggeringPolicy means that when the file size is larger than the specified size, a new file is generated to save the log. Used with %i -->
       <RollingFile name="RollingFileInfo" fileName="${sys:user.home}/logs/info.log"
                    filePattern="${sys:user.home}/logs/$${date:yyyy-MM}/info-%d{yyyy-MM-dd}-%i.log">
           <!--ThresholdFilter: Log output filter-->
           <!--level="info": log level, onMatch="ACCEPT": if the level is above info, accept, onMismatch="DENY": if the level is below info, reject-->
           <ThresholdFilter level="info" onMatch="ACCEPT" onMismatch="DENY"/>
           <PatternLayout pattern="[%d{HH:mm:ss:SSS}] [%p] - %l - %m%n"/>
           <!-- Policies: Log rolling strategy-->
           <Policies>
               <!-- TimeBasedTriggeringPolicy: Time rolling policy, the default is to generate new files at 0 o'clock, interval="6": Customize the file rolling time interval, generate new files every 6 hours, modulate="true": Whether to generate files in 0 o'clock offset time, that is, 6 o'clock, 12 o'clock, 18 o'clock, 0 o'clock -->
               <TimeBasedTriggeringPolicyinterval="6" modulate="true"/>
               <!-- SizeBasedTriggeringPolicy: File size rolling policy-->
               <SizeBasedTriggeringPolicysize="100 MB"/>
           </Policies>
           <!-- If the DefaultRolloverStrategy attribute is not set, it will default to a maximum of 7 files in the same folder, and 20 is set here -->
           <DefaultRolloverStrategy max="20"/>
       </RollingFile>
       <RollingFile name="RollingFileWarn" fileName="${sys:user.home}/logs/warn.log"
                    filePattern="${sys:user.home}/logs/$${date:yyyy-MM}/warn-%d{yyyy-MM-dd}-%i.log">
           <ThresholdFilter level="warn" onMatch="ACCEPT" onMismatch="DENY"/>
           <PatternLayout pattern="[%d{HH:mm:ss:SSS}] [%p] - %l - %m%n"/>
           <Policies>
               <TimeBasedTriggeringPolicy/>
               <SizeBasedTriggeringPolicy size="100 MB"/>
           </Policies>
       </RollingFile>
       <RollingFile name="RollingFileError" fileName="${sys:user.home}/logs/error.log"
                    filePattern="${sys:user.home}/logs/$${date:yyyy-MM}/error-%d{yyyy-MM-dd}-%i.log">
           <ThresholdFilter level="error" onMatch="ACCEPT" onMismatch="DENY"/>
           <PatternLayout pattern="[%d{HH:mm:ss:SSS}] [%p] - %l - %m%n"/>
           <Policies>
               <TimeBasedTriggeringPolicy/>
               <SizeBasedTriggeringPolicy size="100 MB"/>
           </Policies>
       </RollingFile>
   </appenders>
   <!--3. Then define the logger. Only when the logger is defined and the appender introduced will the appender take effect-->
   <loggers>
       <!--Filter out some useless DEBUG information from spring and mybatis-->
       <!--The Logger node is used to specify the log form individually, and name is the package path. For example, you want to specify the INFO level for all logs under the org.springframework package. -->
       <logger name="org.springframework" level="INFO"></logger>
       <logger name="org.mybatis" level="INFO"></logger>
       <!-- The Root node is used to specify the root log of the project. If the Logger is not specified separately, the Root log output will be used by default -->
       <root level="all">
           <appender-ref ref="Console"/>
           <appender-ref ref="RollingFileInfo"/>
           <appender-ref ref="RollingFileWarn"/>
           <appender-ref ref="RollingFileError"/>
       </root>
       <!--AsyncLogger: Asynchronous log, LOG4J has three log modes, fully asynchronous log, mixed mode, and synchronous log. The performance ranges from high to low. The more threads, the higher the efficiency. It can also avoid the log thread stuck situation -->
       <!--additivity="false": additivity sets whether the event is output in the root logger. In order to avoid repeated output, you can set additivity to "false" under the Logger tag to only output in the custom Appender.
-->
       <AsyncLogger name="AsyncLogger" level="trace" includeLocation="true" additivity="false">
           <appender-ref ref="RollingFileError"/>
       </AsyncLogger>
   </loggers>
</configuration>

2.5 Things to note when using asynchronous logs

There are some things you need to pay attention to when using asynchronous logs, as follows:

  1. Do not use AsyncAppender and AsyncLogger at the same time. That is, do not use the Async logo and configure AsyncLogger at the same time when configuring Appender. This will not cause an error, but it will not have any benefit in improving performance.

  2. Do not still use AsyncAppender and AsyncLogger when global synchronization is turned on. This has the same meaning as the previous one, that is, if you use asynchronous logs, AsyncAppender, AsyncLogger and global logs should not appear at the same time.

  3. If it is not absolutely necessary, regardless of whether it is synchronous or asynchronous, set immediateFlush to false, which will greatly help improve performance.

  4. If it is not really needed, do not print location information, such as HTML location, or %C or $class, %F or %file, %l or %location, %L or %line, %M or %method, in pattern mode. Wait, because Log4j needs to take a snapshot of the stack when printing logs to obtain this information, which is a huge loss in performance.

3. Summary

During the stress test, we try our best to explore and excavate the root causes of the problems, and constantly accumulate and summarize practical experience. Especially when using some open source components, you need to learn and understand the usage specifications and best practices in detail. If necessary, you can add performance testing to ensure that we meet our quality and performance requirements.

Reference materials

[1] https://www.yisu.com/zixun/623058.html

[2] https://www.jianshu.com/p/9f0c67facbe2

[3] https://blog.csdn.net/thinkwon/article/details/101625124

[4] https://zhuanlan.zhihu.com/p/386990511

-end-