How to diagnose Full GC or OOM problems in production Java applications

1 Introduction

A production application service sometimes runs Full GC frequently without reclaiming memory, or even fails with an OutOfMemoryError (OOM). This usually points to a memory leak or an undersized heap. To find the cause, dump the heap and inspect where the memory is being held.

Dump files, also called memory dump files or memory snapshot files, are snapshots of the memory of a process or system at a given moment. When a process crashes or misbehaves, or at any other time, tools can save the memory of the system or process for debugging and analysis. A dump contains module information, thread information, call stacks, exception information, and so on.

2 Find the PID of the Java service process

ps aux | grep java
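The plain grep above also matches the grep process itself. As a minimal sketch, the hypothetical helper below (java_pids is not part of any tool) filters `ps -eo pid,args` output down to processes whose executable is named java:

```shell
# Read "PID COMMAND ..." lines on stdin and print the PIDs whose
# executable is java (either a bare "java" or a path ending in /java).
# java_pids is an illustrative helper name, not a standard command.
java_pids() {
  awk '$2 ~ /(^|\/)java$/ { print $1 }'
}

# Usage on a live host:
#   ps -eo pid,args | java_pids
```

The header line and the `grep java` process are skipped because their second field is not a java executable.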

3 Simple analysis of GC

The jstat tool that ships with the JDK gives a quick overview of GC activity.

# Output every 1000 milliseconds
jstat -gcutil <pid> 1000

The output is as follows:

S0     S1     E      O      M      CCS    YGC    YGCT    FGC    FGCT    GCT
0.00   66.15  28.84  77.76  93.88  91.90  183    1.058   4      0.517   1.575
0.00   66.15  29.51  77.76  93.88  91.90  183    1.058   4      0.517   1.575
0.00   66.15  30.45  77.76  93.88  91.90  183    1.058   4      0.517   1.575

S0: Survivor space 0 usage, as a percentage

S1: Survivor space 1 usage, as a percentage

E: Eden space (young generation) usage, as a percentage

O: Old generation usage, as a percentage

M: Metaspace usage, as a percentage

CCS: Compressed class space usage, as a percentage

YGC: Number of Young GC events

YGCT: Total Young GC time, in seconds

FGC: Number of Full GC events

FGCT: Total Full GC time, in seconds

GCT: Total GC time, in seconds

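The difference between two adjacent samples gives the time taken by the GC that ran between them. Dividing the total time by the event count gives the average time per GC: YGCT / YGC for Young GC and FGCT / FGC for Full GC.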
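The averages described above can be computed directly from the jstat output. The sketch below assumes the column order shown earlier (YGC, YGCT, FGC and FGCT in columns 7 to 10) and that jstat reports the time columns in seconds; gc_averages is a hypothetical helper name.

```shell
# Read `jstat -gcutil` output on stdin and print the average pause per
# GC event, based on the last sample. Assumes the column layout shown
# above: YGC=$7, YGCT=$8, FGC=$9, FGCT=$10, with times in seconds.
gc_averages() {
  awk '$1 ~ /^[0-9.]+$/ { ygc = $7; ygct = $8; fgc = $9; fgct = $10 }
       END {
         if (ygc > 0) printf "avg young GC: %.2f ms\n", ygct / ygc * 1000
         if (fgc > 0) printf "avg full GC: %.2f ms\n", fgct / fgc * 1000
       }'
}

# Usage on a live host (5 samples, one per second):
#   jstat -gcutil <pid> 1000 5 | gc_averages
```

For the sample rows above (183 Young GCs in 1.058 s, 4 Full GCs in 0.517 s) this works out to about 5.78 ms per Young GC and 129.25 ms per Full GC.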

Rule of thumb for judging whether GC is healthy: a Young GC completes quickly (within 50 milliseconds) and runs infrequently (about once every 10 seconds); a Full GC completes quickly (within 1 second) and runs infrequently (about once every 10 minutes).

If the Full GC count climbs quickly while the Young GC count stays the same or barely changes, the heap is too small for the workload, and a memory leak is likely.
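One way to spot this pattern is to compare the first and last samples of a jstat run. The following is only a sketch under the same column-layout assumption as the output shown earlier; gc_trend is a hypothetical helper, and the warning condition is simply the one stated in the text, not a tuned threshold.

```shell
# Compare the first and last `jstat -gcutil` samples read from stdin.
# Columns assumed: O=$4, YGC=$7, FGC=$9 (layout shown earlier).
gc_trend() {
  awk '$1 ~ /^[0-9.]+$/ {
         if (!seen) { y0 = $7; f0 = $9; seen = 1 }
         y1 = $7; f1 = $9; old = $4
       }
       END {
         if (!seen) exit 1   # no data rows were read
         printf "young GCs: +%d, full GCs: +%d, old gen: %.2f%%\n", y1 - y0, f1 - f0, old
         if (f1 - f0 > 0 && y1 - y0 == 0)
           print "warning: full GC count rising while young GC count is flat"
       }'
}

# Usage on a live host (60 samples, one per second):
#   jstat -gcutil <pid> 1000 60 | gc_trend
```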

4 Export dump file

To analyze a dump file, you must first obtain the dump file. There are basically two ways to obtain it.

4.1 Set JVM startup options

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/logs/xxx.hprof

When an OOM occurs, the JVM automatically writes the dump file to the specified path, and you only need to collect it from there.
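These options are passed on the java command line when the service starts. A minimal sketch of a launch wrapper follows; app.jar, the -Xmx2g heap size and the start_app function are placeholders for illustration, not values from this article.

```shell
# Hypothetical launch wrapper: app.jar and -Xmx2g are placeholders.
JAVA_OPTS="-Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/logs/xxx.hprof"

start_app() {
  # $JAVA_OPTS is left unquoted on purpose so that each option
  # becomes a separate argument to java.
  java $JAVA_OPTS -jar app.jar
}
```

HeapDumpPath may also name a directory, in which case the JVM chooses a file name of the form java_pid<pid>.hprof inside it.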

4.2 Obtain through jmap

jmap -dump:live,format=b,file=/logs/xxx.hprof <pid>
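Note that the live option dumps only reachable objects, so the JVM runs a Full GC first and the application pauses while the dump is written. When several dumps are taken over time, unique file names help keep them apart. The helper below is a sketch; heap_dump_path and its naming scheme are assumptions for illustration.

```shell
# Build a dump path such as /logs/heap-12345-20240101-120000.hprof.
# The /logs directory follows the article; the naming scheme is illustrative.
heap_dump_path() {
  pid=$1
  # Use the second argument as the timestamp if given, else the current time.
  stamp=${2:-$(date +%Y%m%d-%H%M%S)}
  printf '/logs/heap-%s-%s.hprof\n' "$pid" "$stamp"
}

# Usage on a live host (replace <pid>):
#   jmap -dump:live,format=b,file="$(heap_dump_path <pid>)" <pid>
```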

5 Analyze dump files

If a dump file is smaller than the memory of your own computer, you can download it and analyze it locally. The available tools are JVisualVM, jhat, and MAT.

5.1 Using JVisualVM

JVisualVM is in the bin directory of the JDK (it is bundled through JDK 8; later versions offer it as the separate VisualVM download). Open it, select File->Load, and choose the dump file.

Click the thread that raised the error (XNIO-1 task-1 in this example) and wait a moment; the heap dump view then opens at the place in that thread where the OOM occurred.

Check the number of instances of each class and the size they occupy.

Click the Classes view -> right-click the class name with the largest number of instances -> select Show in Instances View -> view the references.

Right-click and select Show in Thread to view the details.

5.2 Using jhat

jhat -J-Xmx1024M [file]

After running the command, wait until the console prints "Started HTTP on port 7000", then open http://ip:7000 in a browser. By default the page lists all object instances in the system, grouped by package.

At the bottom of the page, the Other Queries section contains links that show the number of object instances in the JVM, the size of the objects in the JVM, and so on. Click the link that shows object sizes.

jhat performs poorly on large heap dump files and is very slow.

5.3 Using MAT (recommended)

5.3.1 Download

MAT download address: https://eclipse.dev/mat/previousReleases.php

Download the MAT version that matches your JDK version and operating system. The uname -a command shows the server's kernel version and architecture.

[root@xxx ~]# uname -a
Linux xxxx 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

In this example the JDK is 1.8 and the system is x86_64, so the Linux x86_64 (GTK) build of MAT is the one to select.

5.3.2 Unzip

unzip MemoryAnalyzer-1.8.0.20180604-linux.gtk.x86_64.zip

5.3.3 Modify memory

vi MemoryAnalyzer.ini
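The setting to change in this file is the -Xmx line, which caps the heap MAT itself may use; for a large dump it generally needs to be raised. The same edit can be scripted. The sketch below assumes GNU sed (for -i) and that the file already contains a line starting with -Xmx; set_mat_heap and the 4g value are illustrative, so pick a size that suits the dump and the machine.

```shell
# Replace the -Xmx line in a MemoryAnalyzer.ini-style file.
# Assumes GNU sed and an existing line that starts with -Xmx.
set_mat_heap() {
  ini=$1
  size=$2
  sed -i "s/^-Xmx.*/-Xmx${size}/" "$ini"
}

# Usage (4g is only an example value):
#   set_mat_heap MemoryAnalyzer.ini 4g
```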

5.3.4 File Analysis

./ParseHeapDump.sh [hprof file] org.eclipse.mat.api:suspects org.eclipse.mat.api:overview org.eclipse.mat.api:top_components

Replace [hprof file] with the actual path of the hprof file. After the run, a set of index and threads files and 3 zip archives are generated in the directory that holds the hprof file. The 3 archives are the analysis reports to focus on:

  • xxx_Leak_Suspects.zip: The report contains areas suspected of causing memory leaks, and the report includes a class hierarchy diagram. For OOM scenarios, it is easy to locate which object occupies a large amount of memory and is not released.
  • xxx_System_Overview.zip: Contains basic information of heap dump, related configuration and thread information of dump process JVM, etc.
  • xxx_Top_Components.zip: View the objects/class/classloader/packages that take up the largest space. Reports are presented in the form of pie charts and tables. Through this report, you can locate which objects occupy more memory when the Java program is running, which is very helpful for troubleshooting and program optimization.

These three reports are the key to analyzing the problem. We use reports to identify objects that occupy too much memory, and then analyze program logic using logs and project source code to gradually locate problems.

Start with xxx_Leak_Suspects.zip: unzip it and open index.html in a browser.

In most cases the problem is easy to spot from this report, but sometimes further analysis is needed.
