Garbage collector and memory allocation strategy

Overview

Garbage collection has to answer three questions:

  1. Which memory needs to be recycled?
  2. When will it be recycled?
  3. How will it be recycled?

Algorithms for determining whether an object is alive:
  1. Reference counting: add a reference counter to the object. Each time a reference to it is created, the counter is incremented by 1; each time a reference expires, the counter is decremented by 1. An object whose counter is 0 can no longer be used. This algorithm cannot handle circular references: two objects that reference only each other never reach a count of 0.
  2. Reachability analysis: use a set of root objects called “GC Roots” as the starting points and search downward from them along reference relationships. The path traveled during the search is called a “reference chain”. An object that is not connected to any GC Root by a reference chain is unreachable and can be reclaimed.
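
The reachability search described above can be illustrated with a toy object graph. This is only a sketch of the idea, not how HotSpot implements it; the class names ReachabilitySketch and Node are hypothetical. It also shows why a cycle that reference counting could never free is simply unreachable here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model of reachability analysis (hypothetical names, not a JVM API).
class ReachabilitySketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        Node(String name) { this.name = name; }
    }

    // Returns every node reachable from the roots by following reference chains.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        root.refs.add(a);
        // b and c reference each other, but no chain from root leads to them.
        b.refs.add(c);
        c.refs.add(b);

        Set<Node> live = reachable(Collections.singletonList(root));
        System.out.println("a live: " + live.contains(a)); // true
        System.out.println("b live: " + live.contains(b)); // false
        System.out.println("c live: " + live.contains(c)); // false
    }
}
```

With reference counting, b and c would each keep a count of 1 forever; with reachability analysis they are garbage because nothing reachable from the root points at them.
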
Next, the reference types:
  1. Strong reference: the ordinary kind of reference, such as one created with “new Object()”. As long as a strong reference still exists, the garbage collector will never reclaim the referenced object.
  2. Soft reference: describes objects that are useful but not essential. Before the system throws an out-of-memory error, objects associated only with soft references are included in a second round of collection. Only if that collection still does not free enough memory is the out-of-memory error thrown. Use the SoftReference class to implement soft references.
  3. Weak reference: also describes non-essential objects, but is weaker than a soft reference. Objects associated only with weak references survive just until the next garbage collection. When the garbage collector runs, it reclaims them regardless of whether memory is currently sufficient. Use the WeakReference class to implement weak references.
  4. Phantom reference: the weakest reference relationship. An object instance cannot be obtained through a phantom reference. The only purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector. Use the PhantomReference class to implement phantom references.
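
A small sketch of how the three reference classes behave while the object is still strongly reachable. The class name ReferenceKindsSketch and the probe() helper are hypothetical. The sketch deliberately avoids triggering a collection, since when soft and weak referents are actually cleared depends on the collector and on memory pressure.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class: compares what get() returns for each reference kind.
class ReferenceKindsSketch {
    // Returns {softSees, weakSees, phantomSees} while a strong reference exists.
    static boolean[] probe() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        // A phantom reference must be registered with a queue; the queue is how
        // the "object was reclaimed" notification is delivered.
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        boolean phantomSees = phantom.get() != null; // get() always returns null
        boolean softSees = soft.get() == strong;     // referent still strongly held
        boolean weakSees = weak.get() == strong;     // referent still strongly held
        return new boolean[] { softSees, weakSees, phantomSees };
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("soft.get() returns the object:    " + r[0]);
        System.out.println("weak.get() returns the object:    " + r[1]);
        System.out.println("phantom.get() returns the object: " + r[2]);
    }
}
```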

A demonstration of object self-rescue:

public class FinalizeEscapeGC {
    public static FinalizeEscapeGC SAVE_HOOK = null;

    public void isAlive() {
        System.out.println("yes, I am still alive :)");
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("finalize method executed!");
        FinalizeEscapeGC.SAVE_HOOK = this;
    }

    public static void main(String[] args) throws InterruptedException {
        SAVE_HOOK = new FinalizeEscapeGC();

        // The object successfully saves itself for the first time
        SAVE_HOOK = null;
        System.gc();
        // Because finalize() has a very low priority, pause for 0.5 seconds to wait for it to execute.
        Thread.sleep(500);
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("no, I am dead :(");
        }

        // The following code is identical to the block above, but this time the self-rescue fails.
        SAVE_HOOK = null;
        System.gc();
        // Because finalize() has a very low priority, pause for 0.5 seconds to wait for it to execute.
        Thread.sleep(500);
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("no, I am dead :(");
        }
    }
}

Output:

finalize method executed!
yes, I am still alive :)
no, I am dead :(

The sample code contains two identical fragments, yet the first escape succeeds and the second fails. The reason is that the system automatically calls an object's finalize() method at most once. If the object faces collection again, its finalize() method will not be executed a second time.

Generational collection theory
  1. Weak generational hypothesis: most objects die young, shortly after they are created.
  2. Strong generational hypothesis: the more garbage collections an object has survived, the less likely it is to die.

Garbage collection terms for different generations:

  1. Partial GC: a collection that does not target the entire Java heap. It is subdivided into:
    1. New generation collection (Minor GC / Young GC): a collection that targets only the new generation.
    2. Old generation collection (Major GC / Old GC): a collection that targets only the old generation. Currently only the CMS collector collects the old generation on its own.
    3. Mixed GC: a collection that targets the entire new generation and part of the old generation. Currently only the G1 collector behaves this way.
  2. Full GC: a collection of the entire Java heap and the method area.

Garbage collection algorithm

Mark-sweep algorithm

The algorithm has two phases, “mark” and “sweep”. First, every object that needs to be reclaimed is marked; once marking is complete, all marked objects are reclaimed together. The roles can also be reversed: mark the surviving objects and reclaim everything that is unmarked. It has two main disadvantages:

  1. Execution efficiency is unstable. If the heap holds many objects and most of them must be reclaimed, a large number of mark and sweep operations are needed, so efficiency drops as the number of objects grows.
  2. Memory fragmentation. Sweeping leaves many non-contiguous free chunks. When a larger object needs to be allocated later and no sufficiently large contiguous block can be found, another garbage collection has to be triggered early.
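
The fragmentation problem can be seen in a toy model. The sketch below is an illustration under simplifying assumptions, not a real collector: the heap is an int array in which each cell holds the id of the object occupying it (0 means free), and the mark phase is assumed to have already produced the set of live ids. The names MarkSweepSketch, sweep, totalFree, and largestFreeRun are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy mark-sweep model: each heap cell stores an object id, 0 means free.
class MarkSweepSketch {
    // Sweep phase: free every cell whose owner was not marked live.
    static void sweep(int[] heap, Set<Integer> live) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && !live.contains(heap[i])) {
                heap[i] = 0;
            }
        }
    }

    // Total number of free cells.
    static int totalFree(int[] heap) {
        int free = 0;
        for (int cell : heap) {
            if (cell == 0) free++;
        }
        return free;
    }

    // Length of the longest contiguous run of free cells.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int cell : heap) {
            run = (cell == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 2, 3, 3, 4, 4};             // four objects, two cells each
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3)); // result of the mark phase
        sweep(heap, live);
        System.out.println(Arrays.toString(heap));          // [1, 1, 0, 0, 3, 3, 0, 0]
        System.out.println("free cells: " + totalFree(heap));            // 4
        System.out.println("largest free run: " + largestFreeRun(heap)); // 2
    }
}
```

Four cells are free after the sweep, but a three-cell object still cannot be allocated because the largest contiguous run is only two cells.
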

Mark-copy algorithm

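The new generation generally uses this algorithm. In its simplest form, semispace copying, memory is split into two equal halves and only one half is used at a time; surviving objects are copied to the other half and the used half is then cleared in one go. This cuts the available memory to half of its original size, which wastes a lot of space.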

New generation collectors of the HotSpot virtual machine, such as Serial and ParNew, divide the new generation into one larger Eden space and two smaller Survivor spaces. Only Eden and one of the Survivor spaces are used for allocation at any time. When a garbage collection occurs, the surviving objects in Eden and the in-use Survivor space are copied to the other Survivor space in one pass, and then Eden and the previously used Survivor space are cleared directly. The default size ratio of Eden to each Survivor space in HotSpot is 8:1, so only one Survivor space, 10% of the new generation, is “wasted”. If the Survivor space is not large enough to hold the objects that survive a Minor GC, other memory areas (in practice mostly the old generation) are relied on as an allocation guarantee.
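
The 10% figure follows directly from the ratio: with Eden : Survivor = 8 : 1 and two Survivor spaces, the new generation is split into 8 + 1 + 1 = 10 parts, of which 9 are usable at any time. The sketch below works through that arithmetic. The class and method names are hypothetical, and the numbers are idealized: a real JVM rounds the space sizes for alignment, so actual sizes can differ slightly. In HotSpot the ratio is controlled by the -XX:SurvivorRatio option.

```java
// Idealized Eden/Survivor sizing for a given ratio (hypothetical helper names).
class SurvivorRatioSketch {
    // Size of ONE Survivor space: the new generation is ratio + 2 equal parts.
    static long survivorSize(long newGenSize, int survivorRatio) {
        return newGenSize / (survivorRatio + 2);
    }

    // Eden takes whatever the two Survivor spaces do not.
    static long edenSize(long newGenSize, int survivorRatio) {
        return newGenSize - 2 * survivorSize(newGenSize, survivorRatio);
    }

    // Space usable for allocation at any time: Eden plus one Survivor space.
    static long usableSize(long newGenSize, int survivorRatio) {
        return edenSize(newGenSize, survivorRatio) + survivorSize(newGenSize, survivorRatio);
    }

    public static void main(String[] args) {
        long newGen = 100; // e.g. 100 MB, units do not matter for the ratio
        int ratio = 8;     // Eden : Survivor = 8 : 1
        System.out.println("Eden:     " + edenSize(newGen, ratio));     // 80
        System.out.println("Survivor: " + survivorSize(newGen, ratio)); // 10 (each)
        System.out.println("Usable:   " + usableSize(newGen, ratio));   // 90
    }
}
```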

Mark-compact algorithm

The essential difference from the mark-sweep algorithm is that mark-sweep is a non-moving collection algorithm, whereas mark-compact is a moving one: after marking, it moves all surviving objects toward one end of the memory space and then clears everything beyond the boundary. Whether to move surviving objects is a risky decision with pros and cons either way.

If surviving objects are moved, especially in an area like the old generation where a large number of objects survive each collection, moving them and updating every place that references them is an extremely heavy operation. The user application must also be fully suspended while objects are moved, a pause known as “Stop The World”. If surviving objects are not moved, the fragmentation problem can only be handled by more complex memory allocators and memory accessors. Because memory access is the most frequent operation a user program performs, any extra burden added at this point greatly affects the throughput of the application.
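
To make the cost of moving concrete, here is a toy sliding compaction. It is a sketch under simplifying assumptions rather than a real collector: the heap is an int array in which each cell holds the id of the object occupying it (0 means free), and all non-zero cells are assumed to belong to surviving objects. The names MarkCompactSketch and compact are hypothetical. The returned forwarding table stands for the extra work of updating every reference to a moved object.

```java
import java.util.Arrays;

// Toy sliding compaction: each heap cell stores an object id, 0 means free.
class MarkCompactSketch {
    // Slides surviving cells to the low end of the heap, preserving their order.
    // Returns forwarding[oldIndex] = newIndex for surviving cells, -1 for free ones.
    static int[] compact(int[] heap) {
        int[] forwarding = new int[heap.length];
        Arrays.fill(forwarding, -1);
        int next = 0; // next free slot at the compacted end
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0) {
                int id = heap[i];
                forwarding[i] = next;
                heap[i] = 0;     // vacate the old location
                heap[next] = id; // occupy the new one (same cell when i == next)
                next++;
            }
        }
        return forwarding;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 0, 0, 3, 3, 0, 0}; // fragmented: free runs of 2 and 2
        int[] forwarding = compact(heap);
        System.out.println(Arrays.toString(heap));       // [1, 1, 3, 3, 0, 0, 0, 0]
        System.out.println(Arrays.toString(forwarding)); // [0, 1, -1, -1, 2, 3, -1, -1]
    }
}
```

After compaction the four free cells form one contiguous block, but every reference to object 3 must be rewritten from its old cells (4 and 5) to its new ones (2 and 3), which is why the application has to be paused while this happens.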

In the HotSpot virtual machine, the throughput-focused Parallel Scavenge collector relies on the mark-compact algorithm for the old generation, while the latency-focused CMS collector is based on the mark-sweep algorithm. There is also a compromise: let the virtual machine use mark-sweep most of the time and temporarily tolerate memory fragmentation, and only when fragmentation has become severe enough to affect object allocation run one mark-compact collection to obtain a tidy memory space again. This is the approach CMS takes when it faces too much fragmentation.

Classic garbage collector

If the garbage collection algorithm is the methodology of memory recycling, then the garbage collector is the practitioner of memory recycling.

The figure above shows seven collectors that work on different generations. A line between two collectors means they can be used together. These pairings are not fixed: the Serial + CMS and ParNew + Serial Old combinations were declared deprecated in JDK 8.
