Netty IO Principles, Part Three

Recycling conditions for off-heap memory:

Buffer underlying principle:

Direct memory and heap memory source code implementation:
// The base class of Buffer; defines the four index fields
public abstract class Buffer {
    // Invariant: mark <= position <= limit <= capacity
    private int mark = -1;    // Marked index, restored by reset()
    private int position = 0; // Index of the next element to read or write
    private int limit;        // First index that must not be read or written (after writing, flip() sets it to the final write position)
    private int capacity;     // Total number of elements the buffer can hold

    // Used by DirectByteBuffer to hold the off-heap memory address
    long address;
}
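As a sketch of how these indices move in practice, the snippet below (an illustrative example, not JDK source) writes three bytes and then calls flip(), which sets limit to the old position and resets position to 0:

```java
import java.nio.ByteBuffer;

public class BufferIndexDemo {
    // Returns {position, limit, capacity} after writing 3 bytes and flipping.
    public static int[] indicesAfterFlip() {
        ByteBuffer buf = ByteBuffer.allocate(8);        // capacity = 8
        buf.put((byte) 1).put((byte) 2).put((byte) 3);  // position advances to 3
        buf.flip();  // limit = old position (3), position = 0: ready for reading
        return new int[] { buf.position(), buf.limit(), buf.capacity() };
    }

    public static void main(String[] args) {
        int[] idx = indicesAfterFlip();
        System.out.println("position=" + idx[0] + " limit=" + idx[1] + " capacity=" + idx[2]);
    }
}
```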

// Extends Buffer directly; implements the basic byte-buffer operations and the
// static factory methods that create the concrete Buffer instances.
public abstract class ByteBuffer extends Buffer implements Comparable<ByteBuffer>{
    final byte[] hb; // Backing byte array, used by HeapByteBuffer

    // The default allocation is on the heap
    public static ByteBuffer allocate(int capacity) {
        if (capacity < 0)
            throw new IllegalArgumentException();
        return new HeapByteBuffer(capacity, capacity); // Create a heap buffer
    }

    // Creates a direct (off-heap) memory buffer
    public static ByteBuffer allocateDirect(int capacity) {
        return new DirectByteBuffer(capacity); // Create a direct memory buffer
    }
}

// Extends ByteBuffer and backs the buffer with a byte array on the heap.
// It is package-private, so instances can only be created through the ByteBuffer factory methods.
class HeapByteBuffer extends ByteBuffer{
    HeapByteBuffer(int cap, int lim) {
        super(-1, 0, lim, cap, new byte[cap], 0); // Directly initialize hb byte array
    }
}

// Extends ByteBuffer and maps a file descriptor (fd) to a user-space address.
public abstract class MappedByteBuffer extends ByteBuffer{
    private final FileDescriptor fd; // file fd
    MappedByteBuffer(int mark, int pos, int lim, int cap,FileDescriptor fd)
    {
        super(mark, pos, lim, cap);
        this.fd = fd;
    }
    
    MappedByteBuffer(int mark, int pos, int lim, int cap) { // No fd mapping; used by DirectByteBuffer
        super(mark, pos, lim, cap);
        this.fd = null;
    }
}

// Extends MappedByteBuffer and implements the direct (off-heap) memory buffer
class DirectByteBuffer extends MappedByteBuffer implements DirectBuffer{
    protected static final Unsafe unsafe = Bits.unsafe(); // Unsafe instance used to manage off-heap memory
    DirectByteBuffer(int cap) { // package-private
        super(-1, 0, cap, cap); // Initialize the parent class
        boolean pa = VM.isDirectMemoryPageAligned();
        int ps = Bits.pageSize();
        long size = Math.max(1L, (long)cap + (pa ? ps : 0));
        Bits.reserveMemory(size, cap);

        long base = 0;
        try {
            base = unsafe.allocateMemory(size); // Allocate direct memory
        } catch (OutOfMemoryError x) {
            Bits.unreserveMemory(size, cap);
            throw x;
        }
        unsafe.setMemory(base, size, (byte) 0);
        if (pa && (base % ps != 0)) {
            // Round up to page boundary
            address = base + ps - (base & (ps - 1));
        } else {
            address = base;
        }
        cleaner = Cleaner.create(this, new Deallocator(base, size, cap)); // Frees the direct memory when the buffer is garbage-collected
        att = null;
    }
}
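The practical difference between the two factory methods can be checked with hasArray() and isDirect(): a heap buffer is backed by the hb byte array, while a direct buffer only stores an off-heap address. A small illustrative check (the class name is my own, not from the source):

```java
import java.nio.ByteBuffer;

public class HeapVsDirectDemo {
    // Heap buffer: backed by a byte[] (the hb field), so hasArray() is true.
    public static boolean heapHasArray() {
        return ByteBuffer.allocate(16).hasArray();
    }

    // Direct buffer: memory lives outside the heap at `address`, no backing array.
    public static boolean directHasNoArray() {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        return direct.isDirect() && !direct.hasArray();
    }
}
```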
Netty’s ByteBuf:

Netty’s ByteBuf is a wrapper layered on top of NIO’s ByteBuffer.

What performance does object pooling improve?

1. It reduces the system cost of YGC. If an object is released as soon as it is used up, it is reclaimed during the next YGC, which costs time. With an object pool, the pooled objects are eventually moved to the old generation during GC, so YGC is no longer affected by them.

2. It can avoid the performance loss caused when creating and destroying objects.

When will the ByteBuf object be returned to the object pool?
First, recall the four main garbage collection algorithms:

1. Reference counting: when an object’s reference count drops to 0, it is considered garbage. However, this method cannot reclaim circular references:

A → B → C
↑________↓

2. Mark-Sweep: the algorithm is relatively simple and works well when most objects survive; however, it requires two passes (mark, then sweep), is relatively slow, and leaves memory fragmentation.

3. Copying: suited to cases where relatively few objects survive. A single pass improves efficiency and leaves no fragmentation; however, half the space is wasted and object references must be adjusted.

4. Mark-Compact: produces no fragmentation, makes allocation easy, and does not halve the usable memory; however, it needs two passes and must move objects, so it is slower. Its marking phase is the same as Mark-Sweep’s, but after marking, instead of sweeping the garbage directly, it moves the surviving objects to one end and then clears all memory beyond that boundary.
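The circular-reference problem from point 1 can be sketched with a toy refCnt field (an illustrative example, not Netty or JVM code): once A and B reference each other, dropping the external references leaves both counts at 1, so pure reference counting never reclaims them.

```java
public class RefCountCycleDemo {
    // Toy reference-counted node; "freed" would mean refCnt reaching 0.
    static class Node {
        int refCnt = 1; // one reference held by the creator
        Node next;

        void retain()  { refCnt++; }
        void release() { refCnt--; }
    }

    // Build A -> B and B -> A, then drop the external references.
    public static int[] leakedCounts() {
        Node a = new Node();
        Node b = new Node();
        a.next = b; b.retain(); // A references B
        b.next = a; a.retain(); // B references A: the cycle
        a.release();            // creator drops A
        b.release();            // creator drops B
        // Both counts are still 1 because of the cycle, so neither is collected.
        return new int[] { a.refCnt, b.refCnt };
    }
}
```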

ByteBuf uses the reference-counting approach, through the ReferenceCounted interface:

public interface ReferenceCounted {
    int refCnt();
    ReferenceCounted retain();
    ReferenceCounted retain(int var1);
    ReferenceCounted touch();
    ReferenceCounted touch(Object var1);
    boolean release();
    boolean release(int var1);
}
public abstract class ByteBuf implements ReferenceCounted, Comparable<ByteBuf> {

}
public abstract class AbstractByteBuf extends ByteBuf {
    int readerIndex;
    int writerIndex;
    private int markedReaderIndex;
    private int markedWriterIndex;
    private int maxCapacity;
}
public abstract class AbstractReferenceCountedByteBuf extends AbstractByteBuf {
    private static final AtomicIntegerFieldUpdater<AbstractReferenceCountedByteBuf> AIF_UPDATER =
    AtomicIntegerFieldUpdater.newUpdater(AbstractReferenceCountedByteBuf.class, "refCnt");
    private static final ReferenceCountUpdater<AbstractReferenceCountedByteBuf> updater =
    new ReferenceCountUpdater<AbstractReferenceCountedByteBuf>() {
        @Override
        protected AtomicIntegerFieldUpdater<AbstractReferenceCountedByteBuf> updater() {
            return AIF_UPDATER;//An attribute defined above
        }
        @Override
        protected long unsafeOffset() {
            return REFCNT_FIELD_OFFSET;
        }
    };
}
// This class implements the ReferenceCounted operations
public abstract class ReferenceCountUpdater<T extends ReferenceCounted> {}
// Performs atomic operations on an ordinary field of an existing class without modifying its source code
public abstract class AtomicIntegerFieldUpdater<T> {
    // Pass the target class and the name of the (volatile) field to be updated atomically;
    // the underlying mechanism is reflection plus CAS.
    public static <U> AtomicIntegerFieldUpdater<U> newUpdater(Class<U> tclass,
                                                              String fieldName) {
        return new AtomicIntegerFieldUpdaterImpl<U>
            (tclass, fieldName, Reflection.getCallerClass());
    }
}
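The pattern above can be sketched as a minimal, self-contained reference counter (a simplified illustration of what AbstractReferenceCountedByteBuf does, not the actual Netty code): a single static AtomicIntegerFieldUpdater CASes a plain volatile int field, so each buffer instance avoids carrying a separate AtomicInteger object.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

public class RefCntUpdaterDemo {
    // One shared updater for the whole class; it mutates the volatile
    // int field below via reflection plus CAS.
    private static final AtomicIntegerFieldUpdater<RefCntUpdaterDemo> UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(RefCntUpdaterDemo.class, "refCnt");

    private volatile int refCnt = 1; // a new object starts with one reference

    public int refCnt() { return refCnt; }

    public RefCntUpdaterDemo retain() {
        UPDATER.incrementAndGet(this);
        return this;
    }

    // Returns true when the count hits 0, i.e. the object may be recycled.
    public boolean release() {
        return UPDATER.decrementAndGet(this) == 0;
    }
}
```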
ByteBuf has two pointers: readerIndex and writerIndex.

Calling discardReadBytes() discards the bytes that have already been read, moving the unread bytes to the front of the buffer to reclaim space for writing.
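The two pointers and discardReadBytes() can be illustrated with a toy buffer (a deliberately simplified sketch, not Netty's implementation), maintaining 0 <= readerIndex <= writerIndex <= capacity:

```java
public class TinyByteBuf {
    private final byte[] data;
    private int readerIndex; // next byte to read
    private int writerIndex; // next slot to write

    public TinyByteBuf(int capacity) { data = new byte[capacity]; }

    public void writeByte(byte b) { data[writerIndex++] = b; }
    public byte readByte()        { return data[readerIndex++]; }

    // Copy the unread region [readerIndex, writerIndex) to the front,
    // freeing the already-read space for new writes.
    public void discardReadBytes() {
        int readable = writerIndex - readerIndex;
        System.arraycopy(data, readerIndex, data, 0, readable);
        readerIndex = 0;
        writerIndex = readable;
    }

    public int readerIndex() { return readerIndex; }
    public int writerIndex() { return writerIndex; }
}
```

Note that the copy is why Netty documents discardReadBytes() as a trade of CPU time for space: the unread bytes are physically moved.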

Solving the ABA problem of CAS:
public class AtomicStampedReference<V> {
    public boolean compareAndSet(V expectedReference,
                                 V newReference,
                                 int expectedStamp,
                                 int newStamp) {
        Pair<V> current = pair;
        return
            expectedReference == current.reference &&
            expectedStamp == current.stamp &&
            ((newReference == current.reference &&
              newStamp == current.stamp) ||
             casPair(current, Pair.of(newReference, newStamp)));
    }
}
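A short usage sketch (illustrative; the class and method names are my own) shows the stamp defeating an A -> B -> A sequence: the reference is back to A, but the stamp has advanced, so a CAS that still holds the stale stamp fails.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    // The stamp acts as a version number alongside the reference.
    public static boolean staleCasFails() {
        String a = "A", b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);
        int oldStamp = ref.getStamp(); // a stale thread remembers (A, 0) here

        ref.compareAndSet(a, b, 0, 1); // A -> B, stamp 0 -> 1
        ref.compareAndSet(b, a, 1, 2); // B -> A, stamp 1 -> 2

        // A plain CAS on the reference alone would wrongly succeed (it sees A);
        // with the stamp check, the stale CAS fails.
        return ref.compareAndSet(a, b, oldStamp, oldStamp + 1);
    }
}
```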
NIO’s underlying model:

FileInputStream fileInputStream = new FileInputStream("NioTest1.txt");
FileChannel fileChannel = fileInputStream.getChannel();

public FileChannel getChannel() {
    synchronized (this) {
        if (channel == null) {
            channel = FileChannelImpl.open(fd, path, true, false, this);
        }
        return channel;
    }
}
public static FileChannel open(FileDescriptor var0, String var1, boolean var2, boolean var3, Object var4) {
    return new FileChannelImpl(var0, var1, var2, var3, false, var4);
}
private FileChannelImpl(FileDescriptor var1, String var2, boolean var3, boolean var4, boolean var5, Object var6) {
    this.fd = var1;
    this.readable = var3;
    this.writable = var4;
    this.append = var5;
    this.parent = var6;
    this.path = var2;
    this.nd = new FileDispatcherImpl(var5);
}
ByteBuffer buffer = ByteBuffer.allocate(512);
fileChannel.read(buffer);

Underneath, these calls still invoke the operating system’s read and write functions.
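Putting the pieces together, a minimal read loop might look like this (an illustrative sketch; the file name and the write helper are my own, and the decode step assumes single-byte content, since a multi-byte character could be split across two reads):

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class FileChannelReadDemo {
    // Helper used only to set up a file for the read loop below.
    public static void writeAll(String path, String text) throws IOException {
        try (FileOutputStream out = new FileOutputStream(path);
             FileChannel channel = out.getChannel()) {
            channel.write(ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8)));
        }
    }

    public static String readAll(String path) throws IOException {
        try (FileInputStream in = new FileInputStream(path);
             FileChannel channel = in.getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(512);
            StringBuilder sb = new StringBuilder();
            while (channel.read(buffer) != -1) { // fill the buffer from the file
                buffer.flip();                   // switch to read mode
                sb.append(StandardCharsets.UTF_8.decode(buffer));
                buffer.clear();                  // reuse the buffer for the next read
            }
            return sb.toString();
        }
    }
}
```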

Netty architecture diagram:

Transport Services corresponds to the Netty transport layer, and Protocol Support corresponds to the Netty protocol layer.

What is a thread?

A thread is a separate execution path within a program; multiple threads let a program handle several tasks at the same time instead of only one.

What is the difference between a process and a thread?
A process is the basic unit the OS uses to allocate resources; a thread is the basic unit of execution scheduling (for example a main thread, a socket thread, or a UI thread). The most important allocated resource is an independent memory space: threads share their process’s memory space and do not have one of their own.
Process: also called a Task in Linux, it is the basic unit of resource allocation. Resources: an independent address space,
    kernel data structures (the process descriptor), global variables, data segments...
Threads in Linux are implemented as ordinary processes that share resources (memory space, global data, etc.) with other processes.
Other systems have their own implementations of the so-called LWP, the Light Weight Process.
High-level understanding: a thread is a different execution route within a process.

In the Linux source code, sys_fork is called to create a process and sys_clone to create a thread; both ultimately call do_fork. The difference lies in the clone flags: sys_clone’s clone_flags are passed in from user space, while sys_fork supplies them itself.

How to achieve high performance?

Keep the thread constantly working. Without a Selector in between, a thread blocks whenever its Channel has no data, and each new connection would need yet another thread, which is very inefficient. So a Selector layer is added in the middle: the thread asks the Selector whether any channel has data, and if not, it can do other work first.

So can all Channels be registered to Selector?

The answer is no, which is why Java defines SelectableChannel. Looking at its implementing classes, essentially all network-related Channels qualify, but local ones such as FileChannel do not: FileChannel operates on local files, while the Selector model is aimed at network transmission.
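A minimal sketch of that pattern (illustrative; the class name and use of an ephemeral port are my own): a non-blocking ServerSocketChannel is registered with a Selector for OP_ACCEPT, and selectNow() returns immediately instead of blocking when nothing is ready, leaving the thread free for other work.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class SelectorDemo {
    // FileChannel does not extend SelectableChannel, so only network
    // channels like ServerSocketChannel can be registered with a Selector.
    public static int pendingEvents() throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(0)); // ephemeral port
        server.configureBlocking(false);       // must be non-blocking before register()
        server.register(selector, SelectionKey.OP_ACCEPT);

        // selectNow() never blocks: with no incoming connection it returns 0.
        int ready = selector.selectNow();
        server.close();
        selector.close();
        return ready;
    }
}
```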

Event loop group:

Multiple ServerSocketChannels generate multiple SocketChannels as connections arrive. When an event is generated, the channel is registered, via the Chooser, onto one of the group’s threads for processing; events can also be distributed between threads.
