Understanding the CMS garbage collector in one article

CMS Garbage Collector

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause times.

CMS is mainly used in Internet-facing services that need low latency, that is, fast response times.

Enable the CMS collector with the parameter -XX:+UseConcMarkSweepGC.

The algorithm used by the CMS collector is the mark-sweep algorithm.

CMS garbage collector features:

  • 1) CMS collects only the old generation and the permanent generation (replaced by the metaspace from JDK 1.8 onward); it does not collect the young generation. By default CMS does not collect the permanent generation. To enable that, set -XX:+CMSClassUnloadingEnabled (off by default in JDK 6 and 7; JDK 8 enables it by default).
  • 2) CMS has to collect ahead of time: it must finish a collection before the old generation runs out of memory. Otherwise the concurrent collection fails (a concurrent mode failure) and the JVM falls back to the single-threaded Serial Old collector. CMS therefore has a threshold that triggers collection (parameter -XX:CMSInitiatingOccupancyFraction, default 92%): a collection starts when old generation or permanent generation occupancy reaches 92%.
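The 92% default is not a hard-coded constant. As far as I can tell from the HotSpot sources, when CMSInitiatingOccupancyFraction is left unset (its raw default is -1), the threshold is derived from two other flags, MinHeapFreeRatio (default 40) and CMSTriggerRatio (default 80). The sketch below only reproduces that arithmetic; the class and method names are hypothetical and not part of any JVM API.

```java
// Illustrative sketch: names are hypothetical, not a JVM API.
final class CmsTriggerDefaults {
    /**
     * Returns the old generation occupancy (in percent) at which a CMS cycle starts.
     * A non-negative explicit fraction wins; otherwise the value is derived
     * from MinHeapFreeRatio and CMSTriggerRatio.
     */
    static double initiatingOccupancyPercent(int explicitFraction,
                                             int minHeapFreeRatio,
                                             int cmsTriggerRatio) {
        if (explicitFraction >= 0) {
            return explicitFraction;
        }
        return (100 - minHeapFreeRatio) + (cmsTriggerRatio * minHeapFreeRatio) / 100.0;
    }

    public static void main(String[] args) {
        // Defaults: fraction unset (-1), MinHeapFreeRatio=40, CMSTriggerRatio=80.
        System.out.println(initiatingOccupancyPercent(-1, 40, 80)); // 92.0
        // An explicit -XX:CMSInitiatingOccupancyFraction=70 overrides the derived value.
        System.out.println(initiatingOccupancyPercent(70, 40, 80)); // 70.0
    }
}
```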

CMS garbage collection process

The CMS garbage collection process is divided into seven phases: initial mark, concurrent mark, concurrent preclean, abortable preclean, remark, concurrent sweep, and concurrent reset.

Initial mark

The initial mark phase marks live objects in two groups:

  • 1) Old generation objects directly referenced by GC Roots.
  • 2) Old generation objects referenced by live objects in the young generation.

To shorten the initial mark phase and its pause, you can enable parallel initial marking with -XX:+CMSParallelInitialMarkEnabled and raise the number of parallel marking threads with -XX:ParallelGCThreads. The thread count should not exceed the number of CPU cores.

NOTE: The initial mark phase is stop-the-world (STW).

Concurrent mark

The concurrent mark phase traces live objects starting from those marked in the initial mark phase, and it runs concurrently with the application. While it runs, young generation objects may be promoted to the old generation, objects may be allocated directly in the old generation, and references held by old generation objects may change. The affected objects must be marked again, so the Card containing each of them is marked Dirty. This avoids missing live objects without rescanning the entire old generation.

Note:

  • 1) The concurrent mark phase does not cause STW.
  • 2) Because the application keeps allocating and promoting objects while concurrent marking runs, this phase can easily lead to a Concurrent Mode Failure.

Three-color marking

Because concurrent marking runs at the same time as the application, references between objects can change while marking is in progress. CMS therefore tracks marking state with three-color marking, which assigns each object one of three colors: white, gray, or black.

  • Black: the garbage collector has visited the object and scanned all of its references. A black object is fully scanned and known to be live. If other references point to a black object, it does not need to be scanned again. A black object cannot point directly to a white object (only via a gray object).
  • Gray: the garbage collector has visited the object, but at least one of its references has not been scanned yet.
  • White: the garbage collector has not visited the object. At the start of reachability analysis all objects are white; any object that is still white at the end is unreachable.

Marking process:

  • 1) Initially, all objects are in the [White Collection].
  • 2) Move the objects directly referenced by GC Roots to the [Gray Collection].
  • 3) Take an object from the [Gray Collection]:
    • Move every white object it references to the [Gray Collection].
    • Move the object itself to the [Black Collection].

Repeat step 3 until the [Gray Collection] is empty. Objects still in the [White Collection] at that point are unreachable from GC Roots and can be reclaimed.
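The marking steps above can be sketched in a few lines of Java. This is a minimal single-threaded model, assuming objects are identified by strings and the heap is an adjacency map; the class and method names are made up for illustration and have nothing to do with how HotSpot represents objects.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: the "heap" is a map from an object name to the names it references.
final class TriColorMarking {
    /** Returns the set of objects that end up black, i.e. reachable from the roots. */
    static Set<String> mark(Map<String, List<String>> heap, List<String> roots) {
        Set<String> black = new HashSet<>();
        Set<String> seen = new HashSet<>();      // gray or black, i.e. no longer white
        Deque<String> gray = new ArrayDeque<>();

        // Step 2: objects directly referenced by GC Roots become gray.
        for (String root : roots) {
            if (seen.add(root)) {
                gray.push(root);
            }
        }
        // Step 3: repeat until the gray collection is empty.
        while (!gray.isEmpty()) {
            String current = gray.pop();
            for (String child : heap.getOrDefault(current, List.of())) {
                if (seen.add(child)) {           // white -> gray
                    gray.push(child);
                }
            }
            black.add(current);                  // gray -> black
        }
        return black;                            // everything else is still white
    }

    public static void main(String[] args) {
        Map<String, List<String>> heap = Map.of(
                "A", List.of("B", "C"),
                "B", List.of("D"),
                "E", List.of("A"));              // E is not reachable from the roots
        Set<String> live = mark(heap, List.of("A"));
        System.out.println(live.contains("D"));  // true
        System.out.println(live.contains("E"));  // false
    }
}
```

Objects that never enter the `seen` set correspond to the [White Collection] at the end of marking.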

Three-color marking suffers from two problems: over-marking and missed marking.

Missed marking

Consider four objects A, B, C, and D, where A references B and C, and B references D. After the initial mark, A has been visited and is gray, and the other objects are white. Marking continues with B and C: A becomes black, B becomes gray (its reference to D has not been scanned yet), C becomes black, and D is still white. Suppose the application now removes the reference from B to D and makes C reference D. B then turns black without D ever being visited. This is where the problem arises: C is already black, so its references are not scanned again, yet C now references D, which has never been scanned. If garbage collection proceeds, D is reclaimed even though it is still reachable. This is the missed marking problem.

Solution to missed marking

Incremental update: record each newly added reference in a collection, turn the source of the reference gray, and rescan it in the remark phase. For example, when C gains a reference to D, C is turned gray and added to the collection of new references. During the remark phase, C is used as a root and scanning continues downward from it.
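To make the effect of incremental update concrete, here is a small simulation of an A, B, C, D scenario in which B initially references D and the application then moves that reference to C. It is a sketch under simplifying assumptions: marking is stepped by hand rather than run on a separate thread, and `IncrementalUpdateDemo`, `addRef`, and the other names are hypothetical, not HotSpot code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of an incremental-update write barrier; not HotSpot code.
final class IncrementalUpdateDemo {
    final Map<String, List<String>> refs = new HashMap<>();
    final Set<String> black = new HashSet<>();
    final Deque<String> gray = new ArrayDeque<>();
    final Set<String> dirty = new HashSet<>();  // black sources that gained a reference
    final boolean barrierEnabled;

    IncrementalUpdateDemo(boolean barrierEnabled) {
        this.barrierEnabled = barrierEnabled;
    }

    void addRef(String from, String to) {
        refs.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        // Write barrier: remember an already-black source so it is rescanned later.
        if (barrierEnabled && black.contains(from)) {
            dirty.add(from);
        }
    }

    void removeRef(String from, String to) {
        List<String> out = refs.get(from);
        if (out != null) {
            out.remove(to);
        }
    }

    /** Scans one gray object: unmarked referents become gray, then it turns black. */
    void scanOne() {
        String current = gray.pop();
        for (String child : refs.getOrDefault(current, List.of())) {
            if (!black.contains(child) && !gray.contains(child)) {
                gray.push(child);
            }
        }
        black.add(current);
    }

    /** Remark: turn every recorded source gray again and finish marking. */
    void remark() {
        for (String source : dirty) {
            black.remove(source);
            gray.push(source);
        }
        dirty.clear();
        while (!gray.isEmpty()) {
            scanOne();
        }
    }

    /** Replays the scenario and reports whether D ends up marked. */
    static boolean dSurvives(boolean barrierEnabled) {
        IncrementalUpdateDemo gc = new IncrementalUpdateDemo(barrierEnabled);
        gc.addRef("A", "B");
        gc.addRef("A", "C");
        gc.addRef("B", "D");
        gc.gray.push("A");
        gc.scanOne();               // A is black; B and C are gray
        gc.scanOne();               // C is black; B is still gray, D is white
        gc.removeRef("B", "D");     // the application moves the reference...
        gc.addRef("C", "D");        // ...from B to the already-black C
        gc.scanOne();               // B is black; D was never visited
        gc.remark();
        return gc.black.contains("D");
    }

    public static void main(String[] args) {
        System.out.println(dSurvives(false)); // false: D is missed
        System.out.println(dSurvives(true));  // true: C is rescanned during remark
    }
}
```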

Over-marking

Consider four objects A, B, C, and D, where A and B are black, C is gray, and D is white. While the GC is scanning D, the application sets the reference to B to null. B should now be collected, but because the GC does not rescan black objects, B stays black and survives this collection. It can only be reclaimed once the next GC marks the heap again. Unlike missed marking, over-marking does not cause bugs in the system; it only delays reclamation.

Concurrent preclean

Concurrent preclean handles live objects that were left unmarked because reference relationships changed during the concurrent mark phase. It does so by scanning every Card marked Dirty.

For example, if node 3 comes to reference node 7 during concurrent marking, the Card containing node 3 is marked Dirty, so precleaning rescans that Card and marks node 7 as a live object.
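The Dirty Card bookkeeping can be pictured as a byte-per-card table over the old generation. The sketch below assumes 512-byte cards, which is HotSpot's usual card size, and uses offsets into the old generation instead of real addresses; `CardTableSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative card table: one flag per 512-byte card of the old generation.
final class CardTableSketch {
    static final int CARD_SHIFT = 9;   // 2^9 = 512 bytes per card
    private final boolean[] dirty;

    CardTableSketch(long oldGenBytes) {
        long cardSize = 1L << CARD_SHIFT;
        dirty = new boolean[(int) ((oldGenBytes + cardSize - 1) >> CARD_SHIFT)];
    }

    /** Simulated write barrier: a reference field at this offset was updated. */
    void markDirty(long offsetInOldGen) {
        dirty[(int) (offsetInOldGen >> CARD_SHIFT)] = true;
    }

    /** Preclean: return the indexes of the Dirty cards to rescan, and clean them. */
    List<Integer> drainDirtyCards() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                cards.add(i);
                dirty[i] = false;
            }
        }
        return cards;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096); // 8 cards
        table.markDirty(100);    // card 0
        table.markDirty(1500);   // card 2
        table.markDirty(1535);   // card 2 again
        System.out.println(table.drainDirtyCards()); // [0, 2]
        System.out.println(table.drainDirtyCards()); // []
    }
}
```

Only the objects on the returned cards need to be rescanned, which is why the collector does not have to walk the whole old generation again.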

Abortable preclean

The abortable preclean phase does the same work as the concurrent preclean phase: it handles live objects left unmarked because reference relationships changed during the concurrent phases, by scanning every Card marked Dirty. Unlike concurrent preclean, it is triggered conditionally, and the conditions are controlled by two CMS parameters:

  • Parameter CMSScheduleRemarkEdenSizeThreshold, default value: 2M.
  • Parameter CMSScheduleRemarkEdenPenetration, default value: 50%.

These two parameters are generally used together: abortable preclean starts once Eden usage exceeds 2M, and it is aborted, moving on to the remark phase, once Eden usage reaches 50%.

CMS also provides the parameter CMSMaxAbortablePrecleanTime (default 5 s): once this time limit is reached, abortable preclean is aborted and the remark phase begins, whether or not Eden usage has reached the value configured by CMSScheduleRemarkEdenPenetration.
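The start and abort conditions can be summarized as two small predicates. This is a simplified model of the rules as described in this section, not the JVM's actual scheduling code; the class and method names are hypothetical, and only the three parameter names in the comments come from CMS.

```java
// Illustrative model of the abortable preclean conditions; names are hypothetical.
final class AbortablePrecleanPolicy {
    private final long edenSizeThresholdBytes;  // CMSScheduleRemarkEdenSizeThreshold
    private final int edenPenetrationPercent;   // CMSScheduleRemarkEdenPenetration
    private final long maxPrecleanMillis;       // CMSMaxAbortablePrecleanTime

    AbortablePrecleanPolicy(long edenSizeThresholdBytes,
                            int edenPenetrationPercent,
                            long maxPrecleanMillis) {
        this.edenSizeThresholdBytes = edenSizeThresholdBytes;
        this.edenPenetrationPercent = edenPenetrationPercent;
        this.maxPrecleanMillis = maxPrecleanMillis;
    }

    /** The defaults quoted in the text: 2M, 50%, 5 s. */
    static AbortablePrecleanPolicy defaults() {
        return new AbortablePrecleanPolicy(2L * 1024 * 1024, 50, 5_000);
    }

    /** Abortable preclean starts only once Eden usage exceeds the size threshold. */
    boolean shouldStart(long edenUsedBytes) {
        return edenUsedBytes > edenSizeThresholdBytes;
    }

    /** It ends when Eden is full enough or the time limit is reached. */
    boolean shouldAbort(long edenUsedBytes, long edenCapacityBytes, long elapsedMillis) {
        boolean edenFullEnough =
                edenUsedBytes * 100 >= edenCapacityBytes * edenPenetrationPercent;
        boolean timedOut = elapsedMillis >= maxPrecleanMillis;
        return edenFullEnough || timedOut;
    }

    public static void main(String[] args) {
        AbortablePrecleanPolicy policy = defaults();
        long mb = 1024 * 1024;
        System.out.println(policy.shouldStart(1 * mb));                   // false
        System.out.println(policy.shouldStart(3 * mb));                   // true
        System.out.println(policy.shouldAbort(40 * mb, 100 * mb, 1_000)); // false
        System.out.println(policy.shouldAbort(50 * mb, 100 * mb, 1_000)); // true
        System.out.println(policy.shouldAbort(10 * mb, 100 * mb, 5_000)); // true
    }
}
```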

Finally, CMS provides the parameter CMSScavengeBeforeRemark (off by default; turning it on with -XX:+CMSScavengeBeforeRemark is recommended), which forces a Minor GC before the remark phase begins.

Remark

The remark phase marks all live objects in the old generation, and to do so it scans the entire heap. The young generation has to be scanned because an old generation object that is referenced from the young generation is treated as live. This holds even when the referencing young generation object is itself no longer reachable: such unreachable objects still act as GC Roots when the old generation is scanned.

The remark phase can take a long time. Setting -XX:+CMSScavengeBeforeRemark runs a Minor GC before remark, which reclaims unreachable young generation objects and moves the remaining ones to the survivor space or promotes them to the old generation. The young generation scan then only has to cover the survivor space, which greatly reduces scanning time.

In addition, parallel remarking can be enabled with -XX:+CMSParallelRemarkEnabled to improve marking efficiency and shorten the remark phase.

NOTE: The remark phase is stop-the-world (STW).

Concurrent sweep

The concurrent sweep phase removes objects that were not marked and reclaims their memory.

Because the application keeps running during the concurrent sweep phase, it keeps producing new unreachable objects, that is, garbage. This garbage appears after marking has finished, so CMS cannot reclaim it in the current collection and it has to wait for the next GC. It is called floating garbage.

Concurrent reset

Concurrent reset resets the internal data structures of CMS to prepare for the next CMS cycle.

This phase runs concurrently with the application.

CMS issues and optimization

Why does CMS use the mark-sweep algorithm

CMS collects concurrently with the application. A mark-compact algorithm would move objects to new memory addresses, so the references held by user threads would have to be updated to avoid bugs caused by the moves. That update requires STW and lengthens the collection (and therefore the pause), which contradicts the design goal of CMS: minimizing collection pause time.

How to reduce the remark pause time

Generally, about 80% of CMS GC time is spent in the remark phase. Try adding -XX:+CMSScavengeBeforeRemark to reduce it.

How to optimize memory fragmentation

CMS uses the mark-sweep algorithm, so collection leaves memory fragmented. Heavy fragmentation hampers the allocation of large objects: the old generation may have enough free space in total but no contiguous region large enough for a large object, which triggers a Full GC.
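The difference between total free space and contiguous free space is easy to see with a toy free list. The sketch below assumes free blocks are just sizes in KB; `FragmentationSketch` and its methods are hypothetical, and a real allocator is far more involved.

```java
import java.util.List;

// Illustrative sketch: a free list reduced to a list of block sizes in KB.
final class FragmentationSketch {
    static int totalFree(List<Integer> freeBlocksKb) {
        int total = 0;
        for (int size : freeBlocksKb) {
            total += size;
        }
        return total;
    }

    /** An allocation succeeds only if a single free block is large enough. */
    static boolean canAllocate(List<Integer> freeBlocksKb, int requestKb) {
        for (int size : freeBlocksKb) {
            if (size >= requestKb) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> fragmented = List.of(64, 32, 128, 16);
        System.out.println(totalFree(fragmented));          // 240
        System.out.println(canAllocate(fragmented, 200));   // false: no block holds 200 KB

        // After compaction the same free space forms one contiguous block.
        List<Integer> compacted = List.of(240);
        System.out.println(canAllocate(compacted, 200));    // true
    }
}
```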

CMS provides two parameters:

  • UseCMSCompactAtFullCollection (enabled by default) compacts memory when a Full GC is performed. Compaction cannot run concurrently, so the pause becomes longer.
  • CMSFullGCsBeforeCompaction sets how many non-compacting Full GCs run before a compacting Full GC. The default value is 0, which means every Full GC compacts.

Note: Setting CMSFullGCsBeforeCompaction above 0 makes Full GC compaction less frequent and shortens the average Full GC pause, but it lets fragmentation build up and makes Full GCs more frequent. Choose the value by weighing Full GC pause duration against the amount of memory fragmentation.
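The counting rule behind CMSFullGCsBeforeCompaction can be modeled with a simple counter. This mirrors the behavior as described in this article, not necessarily HotSpot's exact decision logic; `CompactionSchedule` is a hypothetical name.

```java
// Illustrative counter for "N non-compacting Full GCs, then one compacting Full GC".
final class CompactionSchedule {
    private final int fullGcsBeforeCompaction;  // models CMSFullGCsBeforeCompaction
    private int sinceLastCompaction = 0;

    CompactionSchedule(int fullGcsBeforeCompaction) {
        this.fullGcsBeforeCompaction = fullGcsBeforeCompaction;
    }

    /** Call once per Full GC; returns true if this Full GC should compact. */
    boolean onFullGc() {
        if (sinceLastCompaction >= fullGcsBeforeCompaction) {
            sinceLastCompaction = 0;
            return true;
        }
        sinceLastCompaction++;
        return false;
    }

    public static void main(String[] args) {
        CompactionSchedule everyTime = new CompactionSchedule(0);
        System.out.println(everyTime.onFullGc()); // true
        System.out.println(everyTime.onFullGc()); // true

        CompactionSchedule everyThird = new CompactionSchedule(2);
        System.out.println(everyThird.onFullGc()); // false
        System.out.println(everyThird.onFullGc()); // false
        System.out.println(everyThird.onFullGc()); // true
    }
}
```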

Promotion Failed and Concurrent Mode Failure

Promotion Failed

Promotion Failed occurs during a Minor GC when the survivor space cannot hold the objects being moved and the old generation cannot hold them either. (When Promotion Failed occurs, CMS has not yet had a chance to collect the old generation, so the old generation cannot accept the promoted objects. A Concurrent Mode Failure follows, and the collector falls back to a stop-the-world Serial Old collection.)

Concurrent Mode Failure

Concurrent Mode Failure is specific to CMS. The CMS collection threads run concurrently with user threads. While the old generation is being collected, objects are promoted from the young generation, or large objects that do not fit in the young generation are allocated directly in the old generation. If the old generation cannot hold these promoted or large objects, the JVM reports a Concurrent Mode Failure.

Impact of Concurrent Mode Failure: The old generation garbage collector degrades from CMS to Serial Old, all user threads are suspended, and the pause time becomes longer.

Solution

1) CMS is triggered too late

The parameter -XX:CMSInitiatingOccupancyFraction=N makes CMS start a collection when old generation usage reaches N%. A threshold is needed because CMS leaves floating garbage; the default is 92%. Lower the value as appropriate, for example -XX:CMSInitiatingOccupancyFraction=70.
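Before choosing N, it helps to know how full the old generation actually gets. One way to observe this from inside the application is the standard java.lang.management API, as in the sketch below. The `OldGenOccupancy` class and `percent` helper are illustrative names; the pool names printed depend on the collector in use (under CMS the old generation pool is, as far as I know, reported as "CMS Old Gen").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative sketch: print the occupancy of every heap memory pool.
final class OldGenOccupancy {
    /** Occupancy in percent, or -1 when the pool has no defined maximum. */
    static double percent(long used, long max) {
        return max <= 0 ? -1.0 : used * 100.0 / max;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;                       // skip metaspace, code cache, etc.
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue;                       // the pool is no longer valid
            }
            System.out.printf("%s: %.1f%%%n",
                    pool.getName(), percent(usage.getUsed(), usage.getMax()));
        }
    }
}
```

If the old generation pool regularly climbs close to the configured threshold before a CMS cycle completes, that suggests the cycle is starting too late.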

2) Too much memory fragmentation

Enable compaction and set the compaction interval to a reasonable value.

For example, -XX:CMSFullGCsBeforeCompaction=5 runs a compacting Full GC after every 5 non-compacting Full GCs.

3) Garbage is generated too quickly

  • Promotion threshold is too small.
  • Survivor space is too small.
  • The Eden area is too small, causing the promotion rate to be too fast.
  • Large objects exist.

CMS related parameters

-XX:+UseConcMarkSweepGC
Enables the CMS collector. The JVM default collector is Parallel GC up to JDK 8 and G1 from JDK 9 onward.
-XX:+UseParNewGC
With the CMS collector, the young generation is collected in parallel by multiple threads (enabled automatically when UseConcMarkSweepGC is enabled).
-XX:+CMSParallelRemarkEnabled
Runs the remark phase in parallel to reduce pauses (enabled by default).
-XX:+CMSConcurrentMTEnabled
Runs the concurrent CMS phases with multiple threads, so several GC threads work concurrently with the application threads (enabled by default).
-XX:ConcGCThreads
Sets the number of threads used by the concurrent CMS phases.
-XX:ParallelGCThreads
Sets the number of threads used by the parallel (stop-the-world) phases of CMS.
-XX:CMSInitiatingOccupancyFraction
Old generation occupancy, in percent, at which the CMS collector starts a collection; the default value is 92. Use it together with UseCMSInitiatingOccupancyOnly. On its own, the value is only used for the first CMS cycle, after which the JVM adjusts the trigger point from its own statistics.
-XX:+UseCMSInitiatingOccupancyOnly
Makes CMS always use CMSInitiatingOccupancyFraction as the trigger instead of its own estimates. Off by default.
-XX:+CMSClassUnloadingEnabled
Unlike the parallel collector, the CMS collector does not collect the permanent generation by default. Set -XX:+CMSClassUnloadingEnabled to enable it. Off by default in JDK 6 and 7; JDK 8 enables it by default.
-XX:+CMSIncrementalMode
Turns on the incremental mode of the CMS collector. Incremental mode makes each collection cycle take longer, but pauses tend to be shorter. Off by default.
-XX:CMSFullGCsBeforeCompaction
Sets the number of non-compacting Full GCs to run before a compacting Full GC. The default value is 0.
-XX:+CMSScavengeBeforeRemark
Runs a young generation GC before the CMS remark phase to reduce the number of objects scanned from GC Roots and make remark faster. Off by default.
-XX:+ExplicitGCInvokesConcurrent
When enabled, an explicit System.gc() call triggers a concurrent CMS collection instead of a Full GC.
-XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses
Ensures that a System.gc() call also includes the permanent generation in the CMS collection.
-XX:+DisableExplicitGC
Makes the JVM ignore System.gc() calls entirely, regardless of the collector in use.
-XX:+UseCompressedOops
Compresses ordinary object pointers (object references) to improve memory utilization (enabled by default).
-XX:MaxGCPauseMillis=200
Sets the target maximum GC pause time in milliseconds. Do not set it too low.