Java garbage collector

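How is garbage identified?

1. Reference counting. Each object keeps a count of the references that point to it; when the count drops to 0, the object is considered garbage. This method cannot handle circular references: in the cycle below every object is still referenced by another one, so no count ever reaches 0, even when nothing outside the cycle can reach it.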

A → B → C

↑________↓

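2. Root searching (reachability analysis)

Starting from a set of root objects (GC Roots), the collector follows references and marks every object it can reach; whatever is left unmarked is garbage. For example, when a Java program starts it calls the main method, and the objects referenced from the main method are root objects.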

Which instances are roots?

Thread stack variables, JNI pointers, the constant pool, and static variables in the method area. In more detail:

  • references on the JVM stacks,
  • references on the native method stacks,
  • the run-time constant pool,
  • static references in the method area,
  • Class objects (Clazz),
  • JNI pointers (objects used when calling native C/C++ methods).
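The difference between the two approaches can be sketched on the A → B → C cycle from above. This is a toy model, not JVM code: the heap is a map from object names to the names they reference, and the class name `CycleDemo`, the root `stackVar`, and the object `X` are all made up for illustration. It assumes Java 9 or later for `Map.of` and `List.of`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model: the heap is a map from an object's name to the names it references.
class CycleDemo {
    // Reference count of each object = number of incoming references.
    static Map<String, Integer> referenceCounts(Map<String, List<String>> refs) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> targets : refs.values()) {
            for (String target : targets) {
                counts.merge(target, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reachability analysis: mark everything that can be reached from the roots.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : refs.getOrDefault(current, List.of())) {
                if (marked.add(target)) {   // true only the first time target is seen
                    pending.push(target);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = Map.of(
            "stackVar", List.of("X"),   // a root that references X
            "A", List.of("B"),
            "B", List.of("C"),
            "C", List.of("A"));         // closes the cycle
        // A, B and C each keep a count of 1, so reference counting never frees them.
        System.out.println(referenceCounts(refs).get("A"));                      // prints 1
        // Reachability from the root finds only X, so the cycle is garbage.
        System.out.println(reachable(refs, List.of("stackVar")).contains("A"));  // prints false
    }
}
```

The cycle keeps every count at 1, yet none of its members is marked when the search starts from the root, which is why reachability analysis can reclaim it.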

Commonly used GC algorithms

Mark-Sweep: Marks the live objects, then frees the dead ones in place. The algorithm is relatively simple and works best when many objects survive; it needs two scans, which costs time, and it is prone to fragmentation.

Copying: Suitable when relatively few objects survive. It scans only once, which improves efficiency, and it eliminates fragmentation; the costs are that part of the space is held in reserve and that object references must be adjusted because objects move.

Mark-Compact: Produces no fragmentation, makes object allocation easy, and does not halve the usable memory; but it needs two scans and has to move objects, which makes it slower. Its marking stage is the same as in Mark-Sweep. After marking, however, instead of freeing the garbage in place, it moves the surviving objects to one end of the space and then clears all memory beyond the boundary. Each of these three algorithms has its own advantages, disadvantages, and suitable scenarios.
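The fragmentation difference between Mark-Sweep and Mark-Compact can be illustrated on a tiny array "heap". This is only a sketch under simplifying assumptions: every object occupies one slot, 0 means a free slot, and the `live` array stands in for the result of the marking stage. The class and method names are hypothetical.

```java
import java.util.Arrays;

class SweepVsCompactDemo {
    // Mark-Sweep: free the dead slots in place; survivors do not move.
    static int[] sweep(int[] heap, boolean[] live) {
        int[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (!live[i]) {
                result[i] = 0;
            }
        }
        return result;
    }

    // Mark-Compact: slide the survivors to one end; everything after them is free.
    static int[] compact(int[] heap, boolean[] live) {
        int[] result = new int[heap.length];   // all slots start free (0)
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                result[next++] = heap[i];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] live = {true, false, true, false, false, true};
        System.out.println(Arrays.toString(sweep(heap, live)));    // prints [1, 0, 3, 0, 0, 6]
        System.out.println(Arrays.toString(compact(heap, live)));  // prints [1, 3, 6, 0, 0, 0]
    }
}
```

After the sweep the three free slots are split into two holes, while after compaction they form one contiguous block, at the price of moving objects 3 and 6.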

JVM generational memory model (the basis of generational garbage collection)

1. Object life cycle
  • Most objects have extremely short life cycles
  • A few objects survive for a long time
2. JVM generation model
1. Young generation and old generation

Heap memory is divided into young generation and old generation

(1) Young generation

The young generation is also called the new generation. It holds objects that are expected to be reclaimed soon after they are created and used.

Layout of the young generation
  • It is divided into the Eden area, the from-survivor area, and the to-survivor area
  • That is, one Eden area and two smaller survivor spaces; the default size ratio is 8:1:1
(2) Old generation

Stores long-lived objects

Why is the heap divided into a young generation and an old generation?

Because objects with different lifetimes call for different garbage collection algorithms, and matching the algorithm to the lifetime makes collection more efficient.

  • Most objects in Java die shortly after they are created. Objects that do not need to live long are placed in the young generation, which uses one garbage collection algorithm;
  • Objects that live for a long time are placed in the old generation, which uses a different garbage collection algorithm.
Why is there a survivor area?
  • Without a survivor area, with only Eden, every object that survives a minor GC would be sent straight to the old generation. The old generation would fill up quickly and trigger full GCs, hurting performance
  • With a survivor area, fewer objects are sent to the old generation, so full GCs occur less often
Why are there two survivor areas?

Each minor GC copies the live contents of Eden and one survivor area into the other survivor area, which avoids fragmentation.
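The copying between the two survivor areas can be sketched as follows. This is a simplified model rather than HotSpot's implementation: the `Obj` record, the `isLive` predicate (which stands in for reachability analysis), and the exact promotion comparison are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

class MinorGcDemo {
    // Hypothetical object record: an id plus the number of minor GCs survived.
    static final class Obj {
        final int id;
        int age;
        Obj(int id, int age) { this.id = id; this.age = age; }
    }

    List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    List<Obj> old = new ArrayList<>();

    // One minor GC: survivors of Eden and "from" are copied into "to"
    // (or promoted once they are old enough), then the two survivor roles swap.
    void minorGc(IntPredicate isLive, int tenuringThreshold) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isLive.test(o.id)) {
                continue;               // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);             // promoted to the old generation
            } else {
                to.add(o);              // copied to the other survivor area
            }
        }
        eden.clear();                   // Eden and "from" are now entirely free
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        MinorGcDemo heap = new MinorGcDemo();
        for (int id = 1; id <= 3; id++) {
            heap.eden.add(new Obj(id, 0));
        }
        heap.minorGc(id -> id != 2, 2);   // object 2 is dead
        heap.eden.add(new Obj(4, 0));
        heap.minorGc(id -> id != 2, 2);
        // Objects 1 and 3 survived twice and were promoted; object 4 sits in a survivor area.
        System.out.println(heap.old.size() + " " + heap.from.size() + " " + heap.to.size()); // prints "2 1 0"
    }
}
```

Because live objects are always copied into an empty area, the area they leave behind can be reclaimed wholesale, with no holes left to manage.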

Permanent generation

There was a permanent generation before JDK 1.8. Starting from JDK 1.8 the permanent generation was removed. The method area still exists as a concept; what changed is its implementation, which is now called the metaspace instead of the permanent generation.

The class metadata that used to live in the permanent generation is now kept in native memory (the metaspace), while the static variables of classes and interned strings are kept in the Java heap.

When does an object enter the old generation?

A surviving object is copied back and forth between survivor1 (s0, from) and survivor2 (s1, to), and its age grows with each copy. When the age exceeds the limit, the object moves to the old generation. The limit is configured with the -XX:MaxTenuringThreshold parameter.

1. The object has survived more young GCs (YGC) than -XX:MaxTenuringThreshold allows. Default values:

  • Parallel Scavenge: 15
  • CMS: 6
  • G1: 15

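2. Dynamic age determination

When objects are copied from one survivor area to the other (s1 -> s2) and the objects up to a certain age together take up more than 50% of the survivor area,

the objects of that age and older, that is, the oldest ones, are put into the old generation.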

3. Allocation guarantee

If the survivor area does not have enough space for the objects that survive a YGC, the old generation acts as the guarantor and the objects go directly into the old generation.

4. Large objects

-XX:PretenureSizeThreshold defines how big a "large object" is; objects above this size are allocated directly in the old generation.
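The dynamic age rule (item 2 above) can be sketched as a small function. This is a sketch modeled on the rule as described, not a copy of the JVM's code: the method name and the `bytesByAge` input are hypothetical, and the 50% figure is passed in as `targetSurvivorRatio`, on the assumption that it corresponds to HotSpot's -XX:TargetSurvivorRatio setting.

```java
class DynamicAgeDemo {
    // bytesByAge[a] = total size of the surviving objects whose age is a (index 0 is unused).
    // Returns the age at which objects get promoted during this collection.
    static int tenuringThreshold(long[] bytesByAge, long survivorCapacity,
                                 int targetSurvivorRatio, int maxTenuringThreshold) {
        long desired = survivorCapacity * targetSurvivorRatio / 100;
        long total = 0;
        for (int age = 1; age < bytesByAge.length; age++) {
            total += bytesByAge[age];
            if (total > desired) {
                // Objects up to this age already overflow the target share of the survivor area.
                return Math.min(age, maxTenuringThreshold);
            }
        }
        return maxTenuringThreshold;   // never overflowed: fall back to the configured limit
    }

    public static void main(String[] args) {
        // Ages 1 and 2 together take 60 of 100 bytes, which is more than 50%.
        System.out.println(tenuringThreshold(new long[] {0, 30, 30, 20, 10}, 100, 50, 15)); // prints 2
        // The survivors stay below 50%, so the configured limit applies.
        System.out.println(tenuringThreshold(new long[] {0, 10, 10, 10}, 100, 50, 15));     // prints 15
    }
}
```

In the first call, objects aged 2 or more would be promoted even though the configured limit is 15, which is why objects can reach the old generation much earlier than -XX:MaxTenuringThreshold suggests.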

When is an object allocated on the stack?

- Small, thread-private objects

- Objects that do not escape

- Objects that support scalar replacement

- No tuning required

// Scalar replacement: a User object that does not escape can be represented
// by its fields alone (id and name), with no object allocated on the heap.
public class V15_TestTLAB {
    User u;

    class User {
        int id;
        String name;

        public User(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    void alloc(int i) {
        // Nothing holds a reference to this object and it never leaves the
        // braces of this method. This is called "no escape".
        new User(i, "name " + i);
        // u = new User(i, "name " + i);   // escapes: stored in a field
        // return user;                    // escapes: returned to the caller
    }
}

When is an object allocated in a TLAB (Thread Local Allocation Buffer)?

- A TLAB is carved out of Eden; by default it takes 1% of Eden

- With multiple threads, each thread can allocate from its own buffer without competing for Eden, which improves efficiency

- Small objects

- No tuning required
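Why a thread-local buffer avoids contention can be shown with a minimal sketch. It is an illustration under assumptions, not how HotSpot is written: addresses are plain `long` offsets, and the `Eden` and `Tlab` classes are hypothetical. Reserving a chunk from the shared Eden needs an atomic compare-and-set, while allocating inside a thread's own buffer is an ordinary pointer bump.

```java
import java.util.concurrent.atomic.AtomicLong;

class TlabDemo {
    // Shared Eden: every reservation must win an atomic update of the top pointer.
    static final class Eden {
        private final AtomicLong top = new AtomicLong(0);
        private final long capacity;

        Eden(long capacity) { this.capacity = capacity; }

        // Returns the start offset of the reserved chunk, or -1 if Eden is full.
        long reserve(long bytes) {
            while (true) {
                long start = top.get();
                if (start + bytes > capacity) {
                    return -1;
                }
                if (top.compareAndSet(start, start + bytes)) {
                    return start;
                }
                // Another thread moved the pointer first: retry.
            }
        }
    }

    // Thread-local buffer: owned by one thread, so no atomics are needed.
    static final class Tlab {
        private long next;
        private final long end;

        Tlab(long start, long size) {
            this.next = start;
            this.end = start + size;
        }

        // Returns the object's offset, or -1 when the buffer is exhausted.
        long allocate(long bytes) {
            if (next + bytes > end) {
                return -1;
            }
            long address = next;
            next += bytes;
            return address;
        }
    }

    public static void main(String[] args) {
        Eden eden = new Eden(1000);
        eden.reserve(100);                        // chunk [0, 100) goes to some other thread
        Tlab tlab = new Tlab(eden.reserve(100), 100);
        System.out.println(tlab.allocate(40));    // prints 100
        System.out.println(tlab.allocate(40));    // prints 140
        System.out.println(tlab.allocate(40));    // prints -1
    }
}
```

When the buffer runs out, as in the last call, a thread would go back to Eden for a new chunk, so the contended step happens once per buffer instead of once per object.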

Throughput

Throughput is the ratio of the time the CPU spends running user code to the total CPU time consumed, that is

Throughput = time running user code / (time running user code + garbage collection time).

Assuming that the virtual machine runs for a total of 100 minutes, of which garbage collection takes 1 minute, the throughput is 99%.
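The same arithmetic as a tiny helper, for trying other figures; the class and method names are made up for this example.

```java
class ThroughputDemo {
    // Throughput = time running user code / (time running user code + garbage collection time).
    static double throughput(double userTime, double gcTime) {
        return userTime / (userTime + gcTime);
    }

    public static void main(String[] args) {
        // 100 minutes in total, of which 1 minute is garbage collection.
        System.out.println(throughput(99, 1));   // prints 0.99
    }
}
```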

Minor GC and Full GC

  • Young generation GC (Minor GC): garbage collection that takes place in the young generation. Because most Java objects die shortly after they are created, Minor GC is very frequent, and it is generally fast. See the previous article for the underlying principles.
  • Old generation GC (Major GC / Full GC): garbage collection that takes place in the old generation. A Major GC is often accompanied by at least one Minor GC, but not always: the collection strategy of the Parallel Scavenge collector, for example, can choose to run a Major GC directly. A Major GC is generally more than 10 times slower than a Minor GC. Strictly speaking, a Full GC collects the whole heap, young and old generations together, but the two terms are often used interchangeably.

Garbage collector combination

How the 7 classic garbage collectors can be combined:

Notes on the diagram:

  1. A line between two collectors means that they can be used together;
  2. Serial Old serves as the fallback for CMS when a "Concurrent Mode Failure" occurs;
  3. G1 can be used for both the young generation and the old generation;
  4. Red dotted lines: JDK 8 deprecated these two combinations, and JDK 9 removed them completely;
  5. Green dotted line: this combination was deprecated in JDK 14;
  6. Green dotted border: CMS was removed in JDK 14.

Viewing the default garbage collector

Run the program in a JDK 8 environment with the -XX:+PrintCommandLineFlags JVM parameter; the flags printed at startup include the one that selects the garbage collector.
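The collectors of a running JVM can also be listed from inside the program through the standard `java.lang.management` API. The sketch below only prints the names the JVM reports; those names depend on the JDK version and on the collector flags in effect, so no particular output is assumed here. The class name `ActiveCollectors` is made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the collectors the current JVM exposes, one per managed memory area.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Running it once with default settings and once with an explicit collector flag such as -XX:+UseSerialGC shows how the reported names change.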

Introduction to the classic garbage collector

Serial and Serial Old collectors

The Serial collector is a single-threaded garbage collector and the default young generation collector of HotSpot in Client mode. It reclaims memory with the copying algorithm, collecting serially and stopping the world (STW) while it runs;

The Serial Old collector is the old generation counterpart of Serial. It reclaims memory with the mark-compact algorithm, also collecting serially with an STW pause:

  • Serial Old is the default old generation collector in Client mode;
  • In Server mode Serial Old has two main uses: it is paired with the young generation collector Parallel Scavenge, and it serves as the fallback collector for the old generation collector CMS.

Serial suits virtual machines running in Client mode or environments with little memory (tens of MB to one or two hundred MB). Because it collects serially and its STW pauses are long, it is not suitable for applications that need fast responses or are highly interactive.

The Serial collector is enabled with the -XX:+UseSerialGC parameter, which makes the young generation use Serial and the old generation use Serial Old.

ParNew collector

ParNew is short for Parallel New and is a multi-threaded version of the Serial collector. It is the default young generation collector for many JVMs running in Server mode. It reclaims memory with the copying algorithm, collecting in parallel with an STW pause.

The ParNew collector is enabled with the -XX:+UseParNewGC parameter, which makes the young generation use ParNew and leaves the old generation unaffected.

Schematic diagram of the Serial, ParNew and Serial Old collectors:

Parallel Scavenge + Parallel Old (PS + PO) collectors

The Parallel Scavenge collector also works on the young generation, and it also uses the copying algorithm, parallel collection, and STW pauses.

Parallel Scavenge compared with ParNew:

  • Parallel Scavenge is a throughput-first garbage collector;
  • Parallel Scavenge has an adaptive adjustment strategy.

JDK 1.6 added a parallel collector for the old generation, the Parallel Old collector, to replace Serial Old in this role. Parallel Old uses the mark-compact algorithm, parallel collection, and STW pauses.

-XX:+UseParallelGC selects the Parallel Scavenge collector for the young generation, and -XX:+UseParallelOldGC selects the Parallel Old collector for the old generation. The two come as a pair: turning one on also turns on the other.

In addition, -XX:ParallelGCThreads= sets the number of threads of the parallel collector:

  • By default, when there are 8 CPUs or fewer, the value of -XX:ParallelGCThreads= equals the number of CPUs;
  • When there are more than 8 CPUs, the value of -XX:ParallelGCThreads= equals 3 + 5*CPU_COUNT/8.

-XX:+UseAdaptiveSizePolicy turns on Parallel Scavenge's adaptive adjustment policy:

  • In this mode, the young generation size, the ratio of Eden to the survivor areas, and the age threshold for promotion to the old generation are adjusted automatically to strike a balance among heap size, throughput, and pause time.
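The default thread count described above can be written as a one-line function. This is a sketch of the stated rule only; the class and method names are made up, and the division is assumed to be integer division.

```java
class ParallelGcThreadsDemo {
    // Default number of parallel GC threads for a given CPU count, per the rule above.
    static int defaultParallelGcThreads(int cpuCount) {
        return cpuCount <= 8 ? cpuCount : 3 + 5 * cpuCount / 8;
    }

    public static void main(String[] args) {
        System.out.println(defaultParallelGcThreads(4));    // prints 4
        System.out.println(defaultParallelGcThreads(8));    // prints 8
        System.out.println(defaultParallelGcThreads(16));   // prints 13
        // The value this rule gives for the machine the program is running on:
        System.out.println(defaultParallelGcThreads(Runtime.getRuntime().availableProcessors()));
    }
}
```

Above 8 CPUs, each additional CPU adds only 5/8 of a thread, so the collector does not claim every core on large machines.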

CMS collector

In JDK 1.5 HotSpot introduced its first truly concurrent collector, CMS (Concurrent Mark Sweep), which for the first time let the garbage collection threads and the user threads work at the same time. CMS focuses on keeping the pauses of user threads during garbage collection as short as possible.

As an old generation collector, CMS cannot be paired with the young generation collector Parallel Scavenge; it can only be used with ParNew or Serial.

Schematic diagram of the CMS collector:

It is mainly divided into the following steps:

  1. Initial Mark: all user threads are suspended (STW). This stage only marks the objects directly referenced by GC Roots, so it is very fast and the STW pause is very short;
  2. Concurrent Mark: starting from the objects directly referenced by GC Roots, this stage traverses the whole object graph. It takes a long time, but it runs concurrently with the user threads, so there is no STW;
  3. Remark: because the user threads kept running during the previous stage, this stage corrects the mark records of the objects whose marks changed in the meantime. It requires STW; the pause is somewhat longer than the initial mark but much shorter than the concurrent mark stage;
  4. Concurrent Sweep: this stage removes the garbage and reclaims its space. Since no objects are moved, no STW is needed.

The advantages and disadvantages of CMS are both obvious.

Advantages:

  • Concurrent collection;
  • Low latency.

Disadvantages:

  • It produces fragmentation. Because the user threads are still running during the sweep stage, only the mark-sweep algorithm, which does not move objects, can be used, and this algorithm fragments the memory;
  • It is sensitive to CPU resources. Besides running the user threads, the CPUs also have to run the garbage collection threads, which lowers throughput;
  • It cannot deal with floating garbage. The user threads do not stop during the concurrent stages, so new garbage appears while the collection is in progress. CMS cannot mark this garbage and has to leave it for the next GC;
  • In addition, because the user threads are not interrupted during a CMS collection, they must be left enough memory to keep working. CMS therefore cannot wait until the old generation is nearly full; it starts collecting once memory usage reaches a certain threshold. If the reserved memory still runs out while CMS is working, a "Concurrent Mode Failure" occurs: the virtual machine falls back to its backup plan and temporarily uses the Serial Old collector to collect the old generation.

Parameters of the CMS collector:

  • -XX:+UseConcMarkSweepGC, turns on CMS. Once it is on, -XX:+UseParNewGC is turned on automatically;
  • -XX:CMSInitiatingOccupancyFraction=, sets the old generation occupancy threshold as a percentage. Once the threshold is reached, CMS starts collecting (the default is 68 in JDK 5 and earlier, and 92 in JDK 6 and later);
  • -XX:+UseCMSCompactAtFullCollection, specifies that the memory is compacted after the old generation has been collected, to avoid fragmentation;
  • -XX:CMSFullGCsBeforeCompaction=, sets how many collections run before the memory is compacted;
  • -XX:ParallelCMSThreads=, sets the number of CMS threads. The default is (ParallelGCThreads + 3)/4, using integer division. Since ParallelGCThreads defaults to the number of CPUs when there are no more than 8 of them, an 8-core machine starts (8 + 3)/4 = 2 CMS threads by default. During the concurrent stages a quarter of the CPU capacity therefore goes to garbage collection and only 75% is left for the user threads, and with fewer than 4 cores the share taken by CMS is even larger. CMS is therefore not suitable for scenarios with high throughput requirements.

G1 collector

The G1 (Garbage First) collector divides the heap into many independent regions, which need not be physically contiguous, and uses different regions to play the roles of Eden, Survivor, and the old generation.

Generations in G1 are logical, not physical. In its memory model the heap is no longer split into one young area and one old area; it is split into small memory blocks called Regions, and each Region can switch among the Eden, Survivor, and Old roles. There is also a special kind of region, Humongous, which holds objects that exceed 50% of the region size. A region may belong to the young generation or to the old generation, but only to one of them at any given time. The concepts of young generation, survivor area, and old generation still exist as logical concepts, which makes it possible to reuse the logic of the earlier generational framework.

Not requiring physical contiguity brings a further benefit: some regions contain a lot of garbage and others very little, and G1 collects the regions with the most garbage first, so that it reclaims as much space as possible in as little time as possible. This is where the name Garbage First comes from.

The G1 garbage collection process is shown in the figure below:

It is mainly divided into the following steps:

  1. Initial Marking: marks only the objects directly referenced by GC Roots. It requires STW, but it is very fast;
  2. Concurrent Marking: starting from GC Roots, performs reachability analysis on the objects in the heap to find the live ones. This stage takes a long time, but it runs concurrently with the user threads;
  3. Final Marking: corrects the mark records of the objects that changed because the user threads kept running during concurrent marking. It requires STW;
  4. Selection and collection: ranks the regions by their reclamation value and cost, builds a collection plan that fits the pause time the user expects, and reclaims the chosen regions. This stage pauses the user threads (STW).

Advantages and disadvantages of the G1 collector

Advantages:

  • Parallelism and concurrency;
  • Generational collection: different kinds of objects can be handled with different algorithms;
  • Space compaction: the mark-compact algorithm means no memory fragmentation;
  • Predictable pauses: the user can explicitly specify that, within any time slice of M milliseconds, garbage collection should take no more than N milliseconds. G1 works toward this target by collecting the regions with the greatest reclamation value first, following its priority list.

Disadvantages:

  • It has no advantage over CMS in small-memory environments; G1 is suited to large heaps;
  • While the user program runs, G1 has a larger memory footprint for garbage collection and a higher additional execution load than CMS.

Parameters of the G1 collector:

  • -XX:+UseG1GC, turns on G1;
  • -XX:G1HeapRegionSize=, sets the region size. The value is a power of 2 between 1MB and 32MB, and the goal is to divide the minimum heap into roughly 2048 regions; a value of 2MB therefore corresponds to a minimum heap of about 4GB;
  • -XX:MaxGCPauseMillis=, sets the target for the maximum GC pause time (the JVM tries its best to meet it but gives no guarantee). The default is 200ms;
  • -XX:ParallelGCThreads=, sets the number of GC threads used during STW phases; by default it follows the number of logical processors up to a value of 8;
  • -XX:ConcGCThreads=, sets the number of threads used for concurrent marking; the recommended value is about 1/4 of ParallelGCThreads;
  • -XX:InitiatingHeapOccupancyPercent=, sets the Java heap occupancy threshold that triggers a concurrent GC cycle; a cycle starts once occupancy exceeds this value. The default is 45.
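The region-size rule above (a power of 2 between 1MB and 32MB, aiming at roughly 2048 regions) and the 50% humongous limit can be sketched as follows. This is an illustration of the stated rule only; the method names are hypothetical, and the actual calculation inside HotSpot depends on the JDK version and on which heap size it starts from.

```java
class G1RegionSizeDemo {
    static final long MB = 1024L * 1024L;

    // Picks a power-of-two region size in [1MB, 32MB] that yields about 2048 regions.
    static long regionSize(long heapBytes) {
        long target = Math.max(heapBytes / 2048, MB);
        long size = Long.highestOneBit(target);   // round down to a power of two
        return Math.min(size, 32 * MB);
    }

    // An object that exceeds half a region is placed in Humongous regions.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = regionSize(4096 * MB);                        // a 4GB heap
        System.out.println(region / MB);                            // prints 2
        System.out.println(isHumongous(3 * MB / 2, region));        // prints true
        System.out.println(regionSize(512 * MB) / MB);              // prints 1
        System.out.println(regionSize(128L * 1024 * MB) / MB);      // prints 32
    }
}
```

With 2MB regions, any object larger than 1MB is humongous, which is one reason to raise -XX:G1HeapRegionSize= for applications that allocate many large arrays.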

Summary

| Garbage collector | Classification          | Generation    | Algorithm             | Feature              | Applicable scenario                                      |
|-------------------|-------------------------|---------------|-----------------------|----------------------|----------------------------------------------------------|
| Serial            | Serial                  | Young         | Copying               | Response speed first | Client mode in a single-CPU environment                  |
| ParNew            | Parallel                | Young         | Copying               | Response speed first | Used with CMS in Server mode in a multi-CPU environment  |
| Parallel          | Parallel                | Young         | Copying               | Throughput first     | Background operations that need little interaction       |
| Serial Old        | Serial                  | Old           | Mark-Compact          | Response speed first | Client mode in a single-CPU environment                  |
| Parallel Old      | Parallel                | Old           | Mark-Compact          | Throughput first     | Background operations that need little interaction       |
| CMS               | Concurrent              | Old           | Mark-Sweep            | Response speed first | Internet or B/S services                                 |
| G1                | Parallel and concurrent | Young and old | Copying, Mark-Compact | Response speed first | Server-side applications                                 |

Newer garbage collectors

The Epsilon collector, the Shenandoah collector, and the ZGC collector