What are the components of the Java memory area?

Hello everyone, I am Hululu. Today we will take a look at the Java memory area. This article is based on the HotSpot virtual machine and JDK 8.

Foreword

The Java memory area, also called the runtime data area or, loosely, the JVM memory model, refers to the regions in which the Java virtual machine (JVM) stores data while it runs. The term emphasizes how memory space is divided. It is often confused with the Java Memory Model (JMM), which is a different concept: the JMM defines the access rules for the variables in a program, that is, the low-level details of how the virtual machine stores variables into memory and reads them back out. There is also more than one JVM. Over the history of Java there have been many excellent virtual machines, and the most familiar one is HotSpot. If that name is new to you: the JDK you download from Oracle’s official website ships with HotSpot as its virtual machine.

The best-known feature of the HotSpot VM is hot spot code detection: execution counters identify the code most worth compiling, and the JIT compiler is then notified to compile it. By combining the compiler and the interpreter, HotSpot balances program response time against peak execution performance.

A brief introduction to the main components in the figure above:

Class loader subsystem: loads compiled .class files into the JVM. See: Class loader
Execution engine: includes the just-in-time compiler and the garbage collector. The just-in-time compiler compiles Java bytecode into platform-specific machine code, and the garbage collector reclaims objects that are no longer used at run time
Native library interface: calls the operating system’s native method libraries to carry out specific operations
Runtime data area: stores the data generated while the JVM runs. Different virtual machines differ slightly in how they allocate memory, but they generally follow the “Java Virtual Machine Specification”. The specification describes five runtime data areas: the program counter, the Java virtual machine stack, the native method stack, the heap, and the method area. Below we use the figure as a reference and go through each part in detail

Java memory area

Program Counter

The program counter (Program Counter Register) is a small block of memory that stores the address of the next instruction to execute. In the conceptual model of the virtual machine, the bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute. Basic functions such as branching, looping, jumping, exception handling, and thread resumption all rely on this counter.

Let’s decompile the class file in Java:

In JVM terms, the program counter is a small memory area that can be regarded as the line-number indicator of the bytecode executed by the current thread. The PC register, which is also called a “program counter”, is a register inside the CPU and is a hardware concept.

Since the program counter holds the address of the next instruction to execute, instruction execution in the JVM goes roughly as follows: the execution engine reads the address of the next instruction from the program counter, fetches the corresponding instruction, and executes it. When the instruction finishes, the bytecode interpreter selects the next instruction and updates the value in the pc register accordingly. This cycle repeats until the program ends.
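The fetch, execute, update cycle described above can be sketched as a toy interpreter loop. This is only an illustration under stated assumptions: the class name PcLoopSketch, the four opcodes, and the int-array encoding are invented here and are not real JVM bytecodes or HotSpot code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy fetch-execute loop. Opcodes and encoding are hypothetical, for illustration only. */
class PcLoopSketch {
    static final int PUSH = 0; // PUSH n : push the constant n (2 cells: opcode + operand)
    static final int ADD  = 1; // ADD    : pop two values, push their sum (1 cell)
    static final int GOTO = 2; // GOTO t : continue at index t (2 cells: opcode + target)
    static final int HALT = 3; // HALT   : stop and return the top of the stack (1 cell)

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;                              // index of the next instruction
        while (true) {
            int opcode = code[pc];               // fetch the instruction the pc points at
            switch (opcode) {
                case PUSH:
                    stack.push(code[pc + 1]);
                    pc += 2;                     // sequential case: advance by the instruction length
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    pc += 1;
                    break;
                case GOTO:
                    pc = code[pc + 1];           // branch: the pc is overwritten with the target
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + opcode + " at pc=" + pc);
            }
        }
    }

    public static void main(String[] args) {
        // Index: 0     1  2     3  4    5     6  7     8    9
        int[] program = {PUSH, 2, PUSH, 3, ADD, GOTO, 9, PUSH, 100, HALT};
        System.out.println(run(program));
    }
}
```

With the sample program in main, the pc takes the values 0, 2, 4, 5 and then jumps to 9, so the PUSH 100 at index 7 is never executed and the result is 2 + 3.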

The bytecode interpreter can work out the execution order of all bytecode instructions by itself. The program counter only records the address of the bytecode instruction currently being executed, so that the next instruction can still be found after a thread switch.

We know that operating system threads are scheduled by the CPU, and that JVM multithreading is implemented by rotating CPU time slices among threads. A thread may be suspended when its time slice runs out. When it gets a time slice again, it needs to continue from where it left off. In the JVM, the position a thread has reached in the bytecode is recorded by the program counter.

This is simple with a single thread. With multiple threads, the CPU may switch away from a thread in the middle of its work, run more urgent work, and come back to the first thread later. Especially on a single-core CPU, the CPU switches threads frequently so that multiple tasks appear to run “simultaneously”. For a thread to resume at the position where it was interrupted, each thread needs its own independent program counter that is not affected by the others. In other words, the program counter is thread-private: each thread has its own.

The program counter is the only area for which the Java Virtual Machine Specification does not define any OutOfMemoryError (memory overflow) condition. Its life cycle follows the thread: it is created when the thread is created and dies when the thread ends. If the current thread is executing a Java method, the program counter records the address of the virtual machine bytecode instruction being executed. If it is executing a Native method, the value of the counter is empty (undefined).

The PC register of the CPU is not quite the same thing as the program counter in the JVM:

The PC register always points to the memory address of the next instruction to execute (it is never undefined). Before a program starts executing, the starting address of its instruction sequence, that is, the address of the memory unit holding the first instruction of the program, is loaded into the PC, and the CPU fetches the first instruction from memory according to the PC.

While executing an instruction, the CPU automatically modifies the PC: each time an instruction is executed, the PC increases by the number of bytes in that instruction, so that it always points to the address of the next instruction to fetch.

Since most instructions execute sequentially, modifying the PC usually just means adding the “number of instruction bytes” to it. When the program branches, the end result of executing the branch instruction is a new PC value, namely the target address of the branch. The processor always fetches, decodes, and executes instructions according to the PC, and this is how branching is realized.

Virtual machine stack

The virtual machine stack (JVM Stacks) works like the stack data structure: first in, last out. Like the program counter, it is thread-private and its life cycle matches that of the thread: it is created with the thread and dies with it.

The virtual machine stack describes the memory model of Java method execution: each method call creates a stack frame, which stores the local variable table, operand stack, dynamic link, method exit, and other information. The time from a stack frame being pushed onto the virtual machine stack to being popped off it corresponds to the time from a Java method being called to its completion.

A stack frame is the data structure the virtual machine uses to support method calls and method execution. It is the element of the virtual machine stack in the runtime data area. A stack frame stores the method’s local variable table, operand stack, dynamic link, return address, and similar information.

Note that:

In the currently active thread, only the frame at the top of the stack is valid. It is called the current stack frame, and the method being executed is called the current method. The stack frame is the basic structure a method needs in order to run, and all instructions executed by the execution engine operate only on the current stack frame.
The data of a method call is passed through the stack. Each method call pushes a corresponding stack frame onto the stack, and each method completion pops one.
Each stack frame contains four areas: local variable table, operand stack, dynamic link, return address

In the “Java Virtual Machine Specification”, two error conditions are specified for this memory area:

If the stack depth requested by a thread is greater than the depth allowed by the virtual machine, a StackOverflowError will be thrown;
If the Java virtual machine stack can be dynamically expanded, an OutOfMemoryError will be thrown when the stack tries to expand and cannot obtain enough memory, or when there is not enough memory to initialize the stack for a new thread. The “Java Virtual Machine Specification” explicitly lets a virtual machine choose whether to support dynamic stack expansion. HotSpot chooses not to support it, so on HotSpot a running thread will not cause an OutOfMemoryError (memory overflow) through stack expansion
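The first condition is easy to trigger with unbounded recursion. The following is a minimal sketch, assuming a hypothetical class named StackDepthSketch. The depth it reports is not a fixed number: it depends on the frame size, the JIT state, and the stack size configured with -Xss.

```java
/** Counts how many frames fit on the current thread's stack before StackOverflowError. */
class StackDepthSketch {
    private static int depth;

    // Each call pushes one more stack frame; there is deliberately no exit condition.
    private static void recurse() {
        depth++;
        recurse();
    }

    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here the deep frames have been popped again,
            // so the thread can continue normally.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before StackOverflowError: " + measureDepth());
    }
}
```

Running it again with a larger value such as -Xss4m would be expected to report a larger depth, since -Xss controls the stack capacity of each thread.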

We mainly introduce the structure of the stack frame below:

local variable table

Local variable table: the area that stores method parameters and local variables. It holds values of the primitive data types (boolean, byte, char, short, int, float, long, double) and object references (the reference type, which is not the object itself: it may be a pointer to the starting address of the object, or it may point to a handle representing the object or to another location associated with the object).

We know that a local variable cannot be used until it has been assigned a value. Member variables are different. A static variable is assigned twice: once in the preparation phase of class loading, when it receives the system default value, and once in the initialization phase of class loading, when it receives the initial value defined in the code. Instance variables live on the heap with their object and also start out with default values. For more, see: class loader
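The difference between fields and local variables can be checked with a few lines of Java. This is a small sketch with invented names (DefaultValuesSketch and its fields). The commented-out lines show the local-variable case, which the compiler rejects.

```java
/** Fields receive default values automatically; local variables do not. */
class DefaultValuesSketch {
    static int staticCounter;       // never assigned in code: keeps the default value 0
    static int configured = 42;     // default 0 in the preparation phase, then 42 in the initialization phase
    int instanceNumber;             // instance fields start at 0 / false / null
    boolean instanceFlag;
    String instanceText;

    static String describe() {
        DefaultValuesSketch s = new DefaultValuesSketch();
        // int local;
        // System.out.println(local);   // does not compile: variable local might not have been initialized
        return staticCounter + "," + configured + "," + s.instanceNumber
                + "," + s.instanceFlag + "," + s.instanceText;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints 0,42,0,false,null
    }
}
```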

The capacity of the local variable table is measured in variable slots (Variable Slot), and each slot can hold a value up to 32 bits long. Primitive values, references, and returnAddress (return address) values each occupy one slot, except long and double, which need two.

When a method is called, the virtual machine uses the local variable table to pass argument values to the parameter list. For an instance method, the slot at index 0 of the local variable table is by default used to pass a reference to the object the method belongs to (inside the method, this implicit parameter is accessible through the keyword this), and the remaining parameters are laid out in parameter-list order, occupying slots starting from 1. For more, see: Detailed explanation of the keyword this. We can write an example to verify this:

public class Test {
    void fun() {
    }
}

Compile with javac -g:vars Test.java to generate the Test.class file. The -g:vars option is required; otherwise the local variable table (LocalVariableTable) will not be shown when decompiling. Now let’s decompile it:

javap -v Test

Classfile /D:/GiteeProjects/study-java/study/src/com/company/test3/Test.class
  Last modified 2022-11-20; size 261 bytes
  MD5 checksum 72c7d1fcc5d83dd6fc82c43ae55f2b34
public class com.company.test3.Test
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #3.#11         // java/lang/Object."<init>":()V
   #2 = Class              #12            // com/company/test3/Test
   #3 = Class              #13            // java/lang/Object
   #4 = Utf8               <init>
   #5 = Utf8               ()V
   #6 = Utf8               Code
   #7 = Utf8               LocalVariableTable
   #8 = Utf8               this
   #9 = Utf8               Lcom/company/test3/Test;
  #10 = Utf8               fun
  #11 = NameAndType        #4:#5          // "<init>":()V
  #12 = Utf8               com/company/test3/Test
  #13 = Utf8               java/lang/Object
{
  public com.company.test3.Test();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/company/test3/Test;

  void fun();
    descriptor: ()V
    flags:
    Code:
      stack=0, locals=1, args_size=1
         0: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       1     0  this   Lcom/company/test3/Test;   // Note: this occupies Slot 0
}
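The slot rules described earlier (slot 0 for this in instance methods, two slots for long and double) can also be worked out in code. The sketch below is a hypothetical helper, not part of any JDK tool: it uses reflection to compute the starting slot of each parameter, and the class and method names are invented for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

/** Computes the first local-variable slot of each parameter from the slot rules. */
class SlotLayoutSketch {
    // Sample methods whose parameter layout we inspect.
    void instanceMethod(int a, long b, Object c) { }
    static void staticMethod(double d, int e) { }

    static int[] parameterSlots(Method m) {
        Class<?>[] types = m.getParameterTypes();
        int[] slots = new int[types.length];
        // Instance methods reserve slot 0 for 'this'; static methods start at slot 0.
        int next = Modifier.isStatic(m.getModifiers()) ? 0 : 1;
        for (int i = 0; i < types.length; i++) {
            slots[i] = next;
            // long and double occupy two slots, everything else one.
            next += (types[i] == long.class || types[i] == double.class) ? 2 : 1;
        }
        return slots;
    }

    // Convenience wrapper that looks up one of the sample methods by name.
    static int[] slotsOf(String name, Class<?>... parameterTypes) {
        try {
            return parameterSlots(SlotLayoutSketch.class.getDeclaredMethod(name, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // this=0, a=1, b=2..3, c=4
        System.out.println(Arrays.toString(slotsOf("instanceMethod", int.class, long.class, Object.class)));
        // d=0..1, e=2
        System.out.println(Arrays.toString(slotsOf("staticMethod", double.class, int.class)));
    }
}
```

Compiling a class like this with javac -g:vars and inspecting it with javap -v should show the same slot numbers in the LocalVariableTable.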

operand stack

The operand stack mainly holds intermediate results and temporary values produced during method execution. Computation is carried out by pushing values onto the stack and popping them off. While a method runs, various bytecode instructions write to and read from the operand stack, that is, they perform push and pop operations. The JVM execution engine mentioned earlier is a stack-based execution engine, and the stack in question is the operand stack.
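As a concrete case, a static method that returns a + b for two int parameters compiles to the four instructions iload_0, iload_1, iadd, ireturn. The sketch below replays those steps by hand with an ArrayDeque standing in for the operand stack. The class OperandStackSketch and the simulateAdd helper are illustrative inventions, not how HotSpot is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Replays the bytecode of a simple addition on a hand-made operand stack. */
class OperandStackSketch {
    // javac compiles this body to: iload_0, iload_1, iadd, ireturn
    static int add(int a, int b) {
        return a + b;
    }

    // Manual simulation of those four instructions.
    static int simulateAdd(int a, int b) {
        int[] locals = {a, b};                     // local variable table: slot 0 = a, slot 1 = b
        Deque<Integer> operandStack = new ArrayDeque<>();

        operandStack.push(locals[0]);              // iload_0: push slot 0
        operandStack.push(locals[1]);              // iload_1: push slot 1
        int right = operandStack.pop();            // iadd: pop two operands...
        int left = operandStack.pop();
        operandStack.push(left + right);           // ...and push their sum
        return operandStack.pop();                 // ireturn: pop the result and hand it to the caller
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3) + " " + simulateAdd(2, 3)); // prints 5 5
    }
}
```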

dynamic link

Each stack frame holds a reference to the runtime constant pool of the class that the current method belongs to. The purpose is that, when the current method needs to call another method, the corresponding symbolic reference can be found in the runtime constant pool and converted into a direct reference, after which the method can be called directly. This is dynamic linking. In essence, symbolic references are converted to direct references while the method is running. Not all method calls need this dynamic conversion: some symbolic references are converted into direct references during the class loading phase, which is called static resolution. It applies when the version of the method to call can be determined at compile time, which includes calls to static methods, private methods, instance constructors, and superclass methods.
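The difference between a call whose target is fixed up front and one that is only known at run time shows up in ordinary Java code. In this sketch (all class names are made up for the example), the static call always reaches Animal.kind(), while the instance call is dispatched according to the runtime class of the object.

```java
/** Contrasts a statically resolved call with a call bound at run time. */
class DispatchSketch {
    static class Animal {
        String speak() { return "..."; }
        static String kind() { return "animal"; }
    }

    static class Dog extends Animal {
        @Override
        String speak() { return "woof"; }
        static String kind() { return "dog"; }   // hides Animal.kind(), does not override it
    }

    static String virtualCall() {
        Animal a = new Dog();
        // Compiled to invokevirtual: the method actually run depends on the
        // runtime class of 'a', which is Dog here.
        return a.speak();
    }

    static String staticCall() {
        // Compiled to invokestatic: the target is fixed and does not depend on any object.
        return Animal.kind();
    }

    public static void main(String[] args) {
        System.out.println(virtualCall()); // prints woof
        System.out.println(staticCall());  // prints animal
    }
}
```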

return address

A Java method can exit in two ways:

Normal exit: execution reaches one of the return bytecode instructions, such as return;
Abnormal exit: an exception is thrown and is not handled inside the method

Either way, control returns to the point where the method was called. Exiting a method is equivalent to popping the current stack frame. We can see that a stack frame is created with the method call and destroyed when the method ends. Whether the method completes normally or abnormally, it counts as the end of the method.

Native method stack

Native Method Stack: it is thread-private, and its function is basically the same as that of the virtual machine stack. The difference is that the virtual machine stack serves Java methods, while the native method stack serves the Native methods used by the virtual machine. A Native method calls into a native C/C++ library through JNI (Java Native Interface), and that code is no longer under the control of the JVM.

The best-known native method is probably System.currentTimeMillis(). JNI lets Java make deep use of operating system features and reuse non-Java code. However, heavy use of native methods inevitably weakens the JVM’s control over the system.
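Whether a method is declared native can be checked through reflection, since native is an ordinary method modifier. A minimal sketch, assuming a hypothetical helper class NativeCheckSketch:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Uses reflection to check whether a method carries the 'native' modifier. */
class NativeCheckSketch {
    static boolean isNative(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            Method m = owner.getDeclaredMethod(name, parameterTypes);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // System.currentTimeMillis() is declared 'public static native long'.
        System.out.println(isNative(System.class, "currentTimeMillis")); // prints true
        // String.length() is an ordinary Java method.
        System.out.println(isNative(String.class, "length"));            // prints false
    }
}
```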

When a native method executes, a stack frame is also created on the native method stack to store its local variable table, operand stack, dynamic link, and exit information. After the method finishes, the frame is popped and its memory is released. Like the virtual machine stack, the native method stack can throw StackOverflowError and OutOfMemoryError.

In addition, the Java Virtual Machine Specification places no special requirements on the native method stack, so a virtual machine is free to implement it as it sees fit. HotSpot simply merges the native method stack and the virtual machine stack into one. Therefore, on HotSpot, although the -Xoss option (which sets the native method stack size) exists, it has no effect, and the stack capacity can only be set with the -Xss option.

Heap

The heap (Heap) is the largest memory area managed by the Java virtual machine and is shared by all threads. Its sole purpose is to store object instances, and almost all object instances are allocated on the heap. However, with the development of JIT compilers and the maturing of escape analysis, techniques such as allocation on the stack mean that not every object instance necessarily ends up on the heap. Thread-local allocation buffers (TLAB) are thread-private regions carved out of the heap that make allocation faster.

The Java Virtual Machine Specification stipulates that the Java heap may occupy physically discontiguous memory as long as it is logically contiguous. Current mainstream virtual machines implement the heap as expandable (controlled by -Xmx and -Xms). If there is no memory left in the heap to complete an instance allocation and the heap cannot be expanded any further, an OutOfMemoryError will be thrown.
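The current heap limits can be read from inside a running program through java.lang.Runtime. The sketch below uses an invented class name, HeapInfoSketch. The values it prints depend on the machine and on the -Xms and -Xmx settings, so no particular numbers should be expected.

```java
/** Reads the heap figures that the running JVM reports about itself. */
class HeapInfoSketch {
    // Upper bound the JVM will attempt to use for the heap (related to -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the JVM (starts near -Xms and may grow).
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Part of the committed heap that is currently unused.
    static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap:       " + maxHeapBytes() / mb + " MB");
        System.out.println("committed heap: " + committedHeapBytes() / mb + " MB");
        System.out.println("free in heap:   " + freeHeapBytes() / mb + " MB");
    }
}
```

Starting the program with, for example, -Xms64m -Xmx256m would be expected to change the first two figures accordingly.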

Method area

The method area (Method Area) stores data that the virtual machine has loaded, such as class information, constants, static variables, and code compiled by the just-in-time compiler. It is a memory area shared by all threads.

In the Java virtual machine specification, the method area is described as a logical part of the heap, but it has an alias called Non-Heap (non-heap), which is distinguished from the Java heap.

The method area is a concept defined by the JVM specification, not a concrete implementation. Because the specification places very loose restrictions on the method area, it behaves differently on different virtual machines. Let’s take the HotSpot virtual machine as an example:

Before JDK 8, HotSpot implemented the method area of the Java Virtual Machine Specification as the permanent generation
In JDK 8 and later, HotSpot implements the method area of the Java Virtual Machine Specification as the metaspace

Many articles on the Internet like to use “permanent generation” or “metaspace” to replace the method area, but the two are not equivalent in essence. The method area is the concept of the Java virtual machine specification, “permanent generation” or “metaspace” are the two implementation methods of the method area

Before JDK 7 the method area was a separate area, and the HotSpot design team extended generational GC to cover it, so that HotSpot’s garbage collector could manage this memory just as it manages the Java heap. Other virtual machines (such as BEA JRockit and IBM J9) have no concept of a permanent generation.

The HotSpot team obviously realized that it is not a good idea to use the permanent generation to implement the method area:

Strings exist in the permanent generation, which is prone to performance problems and memory overflow

It is difficult to determine the size of the class and method information, so it is difficult to specify the size of the permanent generation. If it is too small, permanent generation overflow will easily occur, and if it is too large, it will easily cause old generation overflow.

The permanent generation will bring unnecessary complexity to the GC, and the recovery efficiency is low.

Therefore, in JDK 1.8 the “permanent generation” was removed completely and replaced by the metaspace. The content that remained in the permanent generation was moved to the metaspace, which is allocated directly in native memory.

When the method area cannot satisfy a memory allocation request, an OutOfMemoryError will be thrown. Metaspace is implemented using direct (native) memory, which we discuss in detail below.
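The heap and non-heap split is visible through the standard java.lang.management API, which lists the memory pools of the running JVM. The following is a sketch with an invented class name. On HotSpot from JDK 8 onward the non-heap list would be expected to include a pool named Metaspace, but pool names are implementation details and may differ between JVMs and garbage collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

/** Lists the memory pools the running JVM exposes, grouped by type. */
class MemoryPoolsSketch {
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```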

That’s about it for the Java memory area. Let’s add a few concepts that are easily confused.

String constant pool

Strings are a reference type, but they are among the most frequently used data types in Java. To save memory and improve performance, the designers of Java created an area called the string constant pool, which stores strings so that the same string does not have to be created repeatedly. The string constant pool is shared by all classes, and there is only one in a virtual machine.

After a class has been loaded, verified, and prepared, string object instances are created in the heap, and references to them are stored in the string constant pool (this describes HotSpot in JDK 7 and later). In HotSpot the string constant pool is implemented by the StringTable class, a hash table that stores string references.

Before JDK 7, the string constant pool was in the method area (permanent generation), and the string objects themselves were stored in the pool. In JDK 7 and later, the string constant pool was moved from the method area to the heap: string objects are stored in heap memory, and only references to them are kept in the string constant pool.
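The effect of the pool can be observed by comparing references with ==. The sketch below uses an invented class name and an arbitrary literal. It relies only on behavior guaranteed by the Java language and the String.intern() contract: equal literals share one instance, new String(...) always creates a separate object, and intern() returns the pooled instance.

```java
/** Compares references to show which strings come from the string constant pool. */
class StringPoolSketch {
    static boolean literalsShareInstance() {
        String a = "hululu";
        String b = "hululu";
        return a == b;                 // same pooled object, so true
    }

    static boolean newStringIsDistinct() {
        String a = "hululu";
        String b = new String("hululu");
        return a != b && a.equals(b);  // different objects with equal content, so true
    }

    static boolean internReturnsPooled() {
        String a = "hululu";
        String b = new String("hululu").intern();
        return a == b;                 // intern() hands back the pooled instance, so true
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance()); // prints true
        System.out.println(newStringIsDistinct());   // prints true
        System.out.println(internReturnsPooled());   // prints true
    }
}
```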

The removal of HotSpot’s permanent generation already began in JDK 7, mainly because GC in the permanent generation was too inefficient. By JDK 8 the permanent generation had been removed completely. Java programs usually create a large number of strings that are waiting to be reclaimed, and keeping the string constant pool on the heap lets string memory be reclaimed more efficiently and promptly.

Runtime constant pool

The runtime constant pool (Runtime Constant Pool) is part of the method area. Besides descriptive information such as the class version, fields, methods, and interfaces, a Class file also contains a constant pool table (Constant Pool Table), which stores the literals and symbolic references generated at compile time. After the class is loaded, this content is stored in the runtime constant pool in the method area, along with the direct references that the symbolic references are translated into. Each class therefore has its own runtime constant pool.

Another important feature of the runtime constant pool, compared with the constant pool of the Class file, is that it is dynamic. The Java language does not require constants to be produced only at compile time: content that was not preset in the constant pool of the Class file can still enter the pool, and new constants can be added while the program runs.

Since the runtime constant pool is part of the method area, it is naturally limited by the memory of the method area. When the constant pool can no longer obtain memory, an OutOfMemoryError will be thrown.

Direct memory

Since JDK 8 the permanent generation has been replaced by the metaspace, and the metaspace uses direct memory. Direct memory (Direct Memory) is not part of the runtime data area of the Java virtual machine, nor is it a memory area defined in the Java Virtual Machine Specification.

NIO, added in JDK 1.4, introduced an I/O approach based on channels (Channel) and buffers (Buffer). It can use native libraries to allocate off-heap memory directly and then operate on that memory through a DirectByteBuffer object stored in the Java heap, which acts as a reference to it. This can significantly improve performance in some scenarios because it avoids copying data back and forth between the Java heap and the native heap.
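Allocating direct memory from Java code goes through ByteBuffer.allocateDirect. The sketch below, with an invented class name, writes and reads a value through an off-heap buffer and contrasts it with an ordinary heap buffer.

```java
import java.nio.ByteBuffer;

/** Allocates a small off-heap buffer and compares it with a heap buffer. */
class DirectBufferSketch {
    // Writes one int into direct memory and reads it back.
    static int roundTrip(int value) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // 16 bytes outside the Java heap
        direct.putInt(value);
        direct.flip();                                     // switch from writing to reading
        return direct.getInt();
    }

    static boolean isOffHeap(ByteBuffer buffer) {
        return buffer.isDirect();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));                             // prints 42
        System.out.println(isOffHeap(ByteBuffer.allocateDirect(16)));  // prints true
        System.out.println(isOffHeap(ByteBuffer.allocate(16)));        // prints false: backed by a heap array
    }
}
```

On HotSpot the total amount of memory available to direct buffers can be capped with the -XX:MaxDirectMemorySize option, which is separate from -Xmx.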

Obviously, native direct memory is not limited by the size of the Java heap, but since it is still memory, it is limited by the total memory of the machine (RAM plus swap space or page file) and by the address space of the processor. When configuring virtual machine parameters, server administrators usually set options such as -Xmx according to the actual memory available but often forget about direct memory. The sum of all memory areas can then exceed the physical memory limit (including physical and operating-system-level limits), which leads to an OutOfMemoryError during dynamic expansion.

Summary

Thread-private areas (the program counter, virtual machine stack, and native method stack): created when the thread starts and destroyed when the thread ends
Thread-shared areas (the method area and heap): created when the virtual machine starts and destroyed when the virtual machine shuts down
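The split between private and shared areas can be seen in a small multithreaded program. In this sketch (the class name is invented), each worker counts in a local variable that lives in its own stack frame, and all workers then add their result to one AtomicInteger object that lives on the shared heap.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Each thread has private locals on its own stack; heap objects are shared by all threads. */
class SharedVsPrivateSketch {
    static int runWorkers(int threadCount, int iterations) {
        AtomicInteger shared = new AtomicInteger();   // one object on the heap, visible to every thread
        Thread[] workers = new Thread[threadCount];

        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                int local = 0;                        // lives in this thread's stack frame only
                for (int i = 0; i < iterations; i++) {
                    local++;                          // no synchronization needed: nobody else can see it
                }
                shared.addAndGet(local);              // publishing to the heap requires a thread-safe update
            });
            workers[t].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 1000)); // prints 4000
    }
}
```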

