Why does Java NIO cause off-heap memory OOM?

Description

  • One day an alert fired: a service deployed on one machine had suddenly become unreachable. The first reaction should be to log in to the machine and check the logs, because when a service goes down the most likely cause is an OOM. The machine’s log contained the following:

  • nio handle failed java.lang.OutOfMemoryError: Direct buffer memory
    at org.eclipse.jetty.io.nio.xxxx
    at org.eclipse.jetty.io.nio.xxxx
    at org.eclipse.jetty.io.nio.xxxx
    
  • It is indeed an OOM. Which memory area is the problem? The message says Direct buffer memory, and the stack is full of Jetty method calls. These logs alone are enough to analyze the cause of the OOM.

Direct buffer memory

  • Direct buffer memory is off-heap memory: memory outside the JVM heap that the garbage collector does not manage directly, but that Java code can still use. It is native memory obtained from the OS. “Direct memory” is a slightly odd name, so this article calls it “off-heap memory”.

  • Jetty runs as a JVM process, and the system we wrote runs inside it.

  • This OOM is caused by Jetty’s use of off-heap memory. A reasonable deduction is that Jetty keeps allocating off-heap memory until none is left, and the next allocation fails with OOM.

Underlying technology to solve OOM

  • Jetty is written in Java, so how does it allocate off-heap memory from Java code, and how is that memory released? This involves the internals of Java NIO.
  • JVM performance tuning is relatively easy. Solving an OOM is harder: apart from the trivial cases, such as code that creates objects without end, many production OOM problems are technically difficult and require solid skills.

How is off-heap memory allocated and how is it released?

  • To allocate off-heap memory in Java code, you use the DirectByteBuffer class. Application code obtains one by calling ByteBuffer.allocateDirect(), since the class itself is not public. The DirectByteBuffer object itself lives in the JVM heap.

  • But when the object is constructed, a block of off-heap memory is allocated and associated with it: the small on-heap object holds the address of the off-heap block.

  • That is the basic idea behind allocating off-heap memory.
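
As a minimal sketch of the allocation described above, the following uses the standard ByteBuffer.allocateDirect() call. The class and method names (DirectAllocationSketch, allocateOffHeap) are made up for illustration and do not come from Jetty.

```java
import java.nio.ByteBuffer;

class DirectAllocationSketch {
    /** Allocates capacityBytes of off-heap memory and returns the on-heap object that wraps it. */
    static ByteBuffer allocateOffHeap(int capacityBytes) {
        // The returned object is a DirectByteBuffer; it lives on the heap,
        // while the bytes it wraps live outside the heap.
        return ByteBuffer.allocateDirect(capacityBytes);
    }

    public static void main(String[] args) {
        ByteBuffer direct = allocateOffHeap(1024 * 1024); // 1 MB off-heap
        ByteBuffer onHeap = ByteBuffer.allocate(1024);    // ordinary heap buffer, for contrast
        System.out.println("direct.isDirect() = " + direct.isDirect());
        System.out.println("onHeap.isDirect() = " + onHeap.isDirect());
    }
}
```

The off-heap block has no explicit free call in this API; it is released only after the wrapping object is garbage collected.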

How to release off-heap memory

  • When a DirectByteBuffer object is no longer referenced and becomes garbage, it is collected during some YGC or Full GC.
  • Whenever a DirectByteBuffer object is collected, its associated off-heap memory is released.

Then why is there still an off-heap memory overflow?

  • If you create many DirectByteBuffer objects that together hold a large amount of off-heap memory, and no GC runs that collects those objects, their off-heap memory is not released.
  • Once the existing DirectByteBuffer objects hold all the available off-heap memory, any attempt to allocate more reports a memory overflow. So when do large numbers of DirectByteBuffer objects stay uncollected, leaving a large amount of off-heap memory that cannot be released?
  • One possibility is that the system is under high concurrency and creates so many DirectByteBuffers at once that they occupy all the off-heap memory, so further allocation fails with OOM. But that is clearly not the case with this system.
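
To see how much off-heap memory uncollected DirectByteBuffers are holding, one option is the standard BufferPoolMXBean, which reports a pool named "direct". The sketch below is only an illustration: the class and method names are hypothetical, and it was not part of the original investigation.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class DirectPoolMonitorSketch {
    /** Bytes currently used by the JVM's "direct" buffer pool, or -1 if no such pool is reported. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    /** Allocates count direct buffers of sizeBytes each and keeps them all reachable. */
    static List<ByteBuffer> holdBuffers(int count, int sizeBytes) {
        List<ByteBuffer> held = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            held.add(ByteBuffer.allocateDirect(sizeBytes));
        }
        return held;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        List<ByteBuffer> held = holdBuffers(8, 1024 * 1024);
        long after = directMemoryUsed();
        System.out.println("direct pool grew by " + (after - before)
                + " bytes while " + held.size() + " buffers are held");
    }
}
```

As long as the buffers stay reachable, or are garbage that no GC has collected yet, the pool's usage does not go down.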

The real cause of off-heap memory overflow

  • You can use jstat to observe the running online system, check request processing times in the logs, analyze past GC logs, and look at the call latency of each interface. The analysis went as follows.

  • First, interface latency. The system’s concurrency is not high, but each request is slow to process, taking about 1 second on average.

  • Then jstat showed that as the system keeps being called, it keeps creating objects, including the DirectByteBuffer objects that Jetty creates to allocate off-heap memory. This continues until Eden fills up and a YGC is triggered.

  • At the moment of a YGC, some requests are often still in progress, so many DirectByteBuffer objects are still alive and cannot be collected. The DirectByteBuffer objects belonging to requests that have already finished can be collected.

  • So after each YGC, some DirectByteBuffer objects and other live objects must be moved to the Survivor area. As I recall, when the system was launched its memory allocation was extremely unreasonable: only 100 to 200 MB went to the young generation, while 700 to 800 MB went to the old generation. As a result, the Survivor area in the young generation was only 10 MB. The objects surviving a YGC (including some DirectByteBuffers) often exceeded 10 MB, could not fit into Survivor, and went directly to the old generation.

  • This process repeats, so DirectByteBuffer objects slowly accumulate in the old generation, and those DirectByteBuffers hold a lot of off-heap memory.

  • Many of the DirectByteBuffers in the old generation are actually garbage, but because the old generation is not full, no Full GC is triggered, so they are never collected. Meanwhile they keep holding a large amount of off-heap memory.

  • Eventually, when the system tries to allocate more off-heap memory, all of it is already held by DirectByteBuffers in the old generation. They are collectable, but since a Full GC of the old generation is never triggered, their off-heap memory is never reclaimed, and the result is OOM.
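
The 10 MB Survivor figure mentioned above can be sanity-checked with simple arithmetic. This sketch assumes the usual HotSpot layout of one Eden plus two Survivor spaces and a SurvivorRatio of 8 (Eden is 8 times one Survivor). The article does not state the ratio, so the value 8 and the 100 MB young generation are assumptions, and collectors with adaptive sizing may pick different sizes at runtime.

```java
class SurvivorSizeSketch {
    /**
     * Size of ONE Survivor space when the young generation is split into
     * Eden plus two Survivors and Eden is survivorRatio times one Survivor.
     */
    static long survivorBytes(long youngGenBytes, int survivorRatio) {
        return youngGenBytes / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed values: 100 MB young generation, SurvivorRatio = 8.
        long survivor = survivorBytes(100 * mb, 8);
        System.out.println("one Survivor space: " + (survivor / mb) + " MB");
    }
}
```

Under these assumptions, a 100 MB young generation gives 10 MB per Survivor space, so any YGC that leaves more than 10 MB of live objects promotes the excess straight to the old generation.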

Why does this Java NIO look so silly?

  • Didn’t Java NIO consider that this would happen?

  • It did. NIO knows that many DirectByteBuffer objects may be unused yet keep occupying off-heap memory because no GC has been triggered. So Java NIO does the following: when a new off-heap allocation finds that not enough off-heap memory is left, it calls System.gc() to ask the JVM to run a GC, collect the unreferenced garbage DirectByteBuffer objects, and release their off-heap memory.

  • As long as a GC can be triggered to collect some unreferenced DirectByteBuffers, some off-heap memory is released and new buffers can be allocated. But our JVM had also been started with this option:

  • -XX:+DisableExplicitGC
    
  • As a result, System.gc() had no effect, which led to the OOM.
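
The fallback described above can be modelled in a few lines. This is a simplified toy model, not the JDK source: the class, its methods, and the gcHook parameter are all hypothetical, and the real implementation also retries and waits for cleanup to finish. The hook stands in for System.gc(), and a hook that does nothing models -XX:+DisableExplicitGC.

```java
import java.util.concurrent.atomic.AtomicLong;

class ReserveWithGcFallbackSketch {
    private final long limitBytes;
    private final AtomicLong reserved = new AtomicLong();

    ReserveWithGcFallbackSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Tries to reserve size bytes without exceeding the limit. */
    private boolean tryReserve(long size) {
        while (true) {
            long current = reserved.get();
            if (current + size > limitBytes) {
                return false;
            }
            if (reserved.compareAndSet(current, current + size)) {
                return true;
            }
        }
    }

    /** Reserves size bytes; on failure runs gcHook once and retries before giving up. */
    void reserve(long size, Runnable gcHook) {
        if (tryReserve(size)) {
            return;
        }
        gcHook.run(); // stand-in for System.gc(); a no-op models -XX:+DisableExplicitGC
        if (tryReserve(size)) {
            return;
        }
        throw new OutOfMemoryError("Direct buffer memory (simulated)");
    }

    /** Called when a garbage buffer is collected and its off-heap memory is freed. */
    void release(long size) {
        reserved.addAndGet(-size);
    }

    long reservedBytes() {
        return reserved.get();
    }

    public static void main(String[] args) {
        ReserveWithGcFallbackSketch pool = new ReserveWithGcFallbackSketch(100);
        pool.reserve(80, () -> { });
        // A working "GC" frees the 80 garbage bytes, so the retry succeeds.
        pool.reserve(40, () -> pool.release(80));
        System.out.println("reserved after GC fallback: " + pool.reservedBytes());
    }
}
```

With a hook that does nothing, the second reserve call throws instead, which mirrors the OOM in this incident.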

Ultimate Optimization

  • The project had two problems:
  • The memory settings were unreasonable, so DirectByteBuffer objects slowly entered the old generation and their off-heap memory could not be released.
  • -XX:+DisableExplicitGC was set, so Java NIO could not proactively request a GC to collect garbage DirectByteBuffer objects, which also prevented off-heap memory from being released.
  • The fixes:
  • Allocate memory sensibly: give the young generation more memory and the Survivor area more space.
  • Remove -XX:+DisableExplicitGC so that System.gc() takes effect.
  • After this optimization, DirectByteBuffers generally no longer enter the old generation. As long as they stay in the young generation, their off-heap memory is released normally by young GCs.
  • With -XX:+DisableExplicitGC removed, when Java NIO finds off-heap memory insufficient it calls System.gc() to ask the JVM for a garbage collection, which reclaims some DirectByteBuffers and releases their off-heap memory.
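
To confirm on a running JVM whether explicit GC is still disabled, one possible check reads the flag through HotSpotDiagnosticMXBean. This is a sketch that assumes a HotSpot-based JVM with the jdk.management module available. The class and method names are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class ExplicitGcFlagCheckSketch {
    /** Returns true if this JVM was started with -XX:+DisableExplicitGC (HotSpot only). */
    static boolean explicitGcDisabled() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // getVMOption returns the current value of the flag as a string ("true" or "false").
        return Boolean.parseBoolean(bean.getVMOption("DisableExplicitGC").getValue());
    }

    public static void main(String[] args) {
        System.out.println("DisableExplicitGC = " + explicitGcDisabled());
    }
}
```

If this reports true, System.gc() calls, including the ones Java NIO makes when off-heap memory runs short, have no effect.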
