Netty encoder and decoder

Article directory

1. Decoder principle and practice
- 1. ByteToMessageDecoder decoder
- 2. Custom integer decoder
  - 1) Conventional method
  - 2) ReplayingDecoder decoder
- 3. Packet decoder
- 3. MessageToMessageDecoder decoder
2. Netty’s built-in Decoder
- 1. LineBasedFrameDecoder decoder
- 2. DelimiterBasedFrameDecoder decoder
- 3. LengthFieldBasedFrameDecoder decoder
3. Encoder principle and practice
- 1. MessageToByteEncoder encoder
- 2. MessageToMessageEncoder encoder
4. Combination of decoder and encoder
- 1. ByteToMessageCodec codec
- 2. CombinedChannelDuplexHandler combiner

Netty reads binary data from the underlying Java channel into a ByteBuf and passes it into the channel pipeline, where inbound processing begins. During inbound processing, the binary ByteBuf data must be decoded into Java POJO objects; this is the job of Netty’s decoders (Decoder). During outbound processing, the results of business processing must be encoded from Java POJO objects into ByteBuf binary data, which is then sent to the peer through the underlying Java channel; this is the job of Netty’s encoders (Encoder).

1. Decoder principle and practice

A Netty decoder is essentially an inbound processor. It decodes or reformats the data passed on by the previous inbound processor and outputs the result to the next inbound processor.

A standard decoder takes a ByteBuf buffer as input and outputs Java POJO objects one by one. Netty has a built-in base class for this kind of decoder, called ByteToMessageDecoder.

All decoders in Netty are inbound processors: they implement the ChannelInboundHandler interface directly or indirectly.

1. ByteToMessageDecoder decoder

ByteToMessageDecoder is a very important decoder base class. It is an abstract class that implements the basic decoding flow. ByteToMessageDecoder extends ChannelInboundHandlerAdapter, so it is an inbound processor, and it provides the framework for decoding a ByteBuf into Java POJO objects.

Roughly, ByteToMessageDecoder first decodes the ByteBuf passed in by the previous station into a List of objects, then iterates over this list and passes the Java POJO objects one by one to the inbound processor of the next station.

The decoding method of ByteToMessageDecoder is called decode, and the base class declares it only as an abstract method. The actual decoding, that is, how ByteBuf data is converted into objects, is left to subclasses. As the parent class, ByteToMessageDecoder only provides the process framework: the decode method of the subclass adds each decoded object to the result list supplied by the parent class, and the parent class then passes the elements of that list to the next station one by one.

To implement your own decoder, first inherit the ByteToMessageDecoder abstract class, then implement its abstract decode method and write the decoding logic there. Overall, the process is roughly as follows:

Inherit the ByteToMessageDecoder abstract class.
Implement the abstract decode method of the base class and write the ByteBuf-to-POJO decoding logic there, decoding the binary data into Java POJO objects one by one.
In the decode method of the subclass, add each decoded Java POJO object to the List parameter. This list is passed in by the parent class and is the parent class’s result collection.
The parent class passes each result in the List to the inbound processor of the next station.
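
The four steps above can be sketched without Netty. The following plain-JDK model only illustrates the template-method flow and is not Netty’s implementation: SketchDecoder, SketchIntDecoder and the Consumer that stands in for the next inbound processor are hypothetical names, and java.nio.ByteBuffer is used in place of ByteBuf.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Plain-JDK model of the decode flow; a sketch, not Netty's actual class. */
abstract class SketchDecoder {
    // Stands in for the inbound processor of the next station.
    private final Consumer<Object> next;

    SketchDecoder(Consumer<Object> next) {
        this.next = next;
    }

    /** Subclasses read from 'in' and add every decoded object to 'out'. */
    protected abstract void decode(ByteBuffer in, List<Object> out);

    /** Parent-class logic: call decode, then forward the results one by one. */
    final void channelRead(ByteBuffer in) {
        List<Object> out = new ArrayList<>();
        decode(in, out);
        for (Object msg : out) {
            next.accept(msg);
        }
    }
}

/** Subclass that only contains the decoding logic (4 bytes -> one Integer). */
final class SketchIntDecoder extends SketchDecoder {
    SketchIntDecoder(Consumer<Object> next) {
        super(next);
    }

    @Override
    protected void decode(ByteBuffer in, List<Object> out) {
        while (in.remaining() >= Integer.BYTES) {
            out.add(in.getInt());
        }
    }
}
```

The subclass never forwards anything itself; it only fills the list, which is the division of work described above.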

What ByteToMessageDecoder passes to the next station is the decoded Java POJO objects (it traverses all elements in the list and calls the fireChannelRead method with each element as the argument), not the ByteBuf buffer. So who is responsible for reference counting and releasing the ByteBuf buffer?

The ByteToMessageDecoder base class itself is responsible for releasing the inbound ByteBuf buffer. It automatically calls the release method to reduce the reference count of that ByteBuf by 1; no code in the subclass is needed for this.

If this ByteBuf needs to be used after decoding, you can call the retain method in the decode method to increase the reference count.
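
The effect of retain and release can be illustrated with a minimal plain-JDK sketch of reference counting. The RefCountedSketch class below is hypothetical and only models the counting rule; it is not Netty’s ByteBuf.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of a reference-counted buffer; not Netty's ByteBuf. */
final class RefCountedSketch {
    // A new object starts with one reference, held by its creator.
    private final AtomicInteger refCnt = new AtomicInteger(1);

    /** Adds one reference, for example because decode wants to keep using the buffer. */
    RefCountedSketch retain() {
        refCnt.incrementAndGet();
        return this;
    }

    /** Drops one reference; returns true when the count reaches 0. */
    boolean release() {
        return refCnt.decrementAndGet() == 0;
    }

    int refCnt() {
        return refCnt.get();
    }

    public static void main(String[] args) {
        RefCountedSketch buf = new RefCountedSketch();
        buf.retain();                       // decode keeps a reference: count is 2
        System.out.println(buf.release());  // the parent class releases once: false
        System.out.println(buf.release());  // the last holder releases: true
    }
}
```

Without the retain call, the first release would already bring the count to 0, and any later use of the buffer would be an error.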

2. Custom integer decoder

1) Conventional method

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;

import java.util.List;

public class Byte2IntegerDecoder extends ByteToMessageDecoder {
    @Override
    protected void decode(ChannelHandlerContext channelHandlerContext, ByteBuf byteBuf, List<Object> list) throws Exception {
        // An int occupies 4 bytes, so read only while at least 4 bytes are readable
        while (byteBuf.readableBytes() >= 4) {
            int i = byteBuf.readInt();
            System.out.println("Decode an integer:" + i);
            list.add(i);
        }
    }
}

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

public class IntegerProcessHandler extends ChannelInboundHandlerAdapter {

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        // The decoder in front of this processor has already turned the bytes into an Integer
        Integer integer = (Integer) msg;
        System.out.println("Print out an integer:" + integer);
        super.channelRead(ctx, msg);
    }
}

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.embedded.EmbeddedChannel;

public class Byte2IntegerDecoderTester {

    public static void main(String[] args) {
        EmbeddedChannel embeddedChannel = new EmbeddedChannel(new ChannelInitializer<EmbeddedChannel>() {
            @Override
            protected void initChannel(EmbeddedChannel ch) throws Exception {
                ch.pipeline().addLast(new Byte2IntegerDecoder())
                        .addLast(new IntegerProcessHandler());
            }
        });
        ByteBuf buf = Unpooled.buffer();
        buf.writeInt(1);
        // Simulate inbound data arriving on the channel
        embeddedChannel.writeInbound(buf);
    }
}

With the Byte2IntegerDecoder integer decoder above, you have to check the number of readable bytes in the ByteBuf yourself and read an integer only when there are enough bytes. Netty’s ReplayingDecoder class can perform this length check for you.

2) ReplayingDecoder decoder

The ReplayingDecoder class is a subclass of ByteToMessageDecoder. Its function is:

Before reading data from the ByteBuf buffer, check whether the buffer has enough bytes.
If there are enough bytes in the ByteBuf, the read proceeds normally; otherwise decoding stops.

In other words, an integer decoder written on top of the ReplayingDecoder base class does not need to perform the length check itself.

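import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ReplayingDecoder;

import java.util.List;

// Void: this decoder does not use the state attribute of ReplayingDecoder
public class Byte2IntegerReplayDecoder extends ReplayingDecoder<Void> {
    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
        int i = in.readInt();
        System.out.println("Decode an integer:" + i);
        out.add(i);
    }
}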

As you can see, a decoder that inherits from ReplayingDecoder needs no length-checking code. Internally, ReplayingDecoder defines a new buffer class that decorates the ByteBuf buffer, named ReplayingDecoderBuffer (ReplayingDecoderByteBuf in Netty 4.1). Before actually reading data, this decorator first checks the number of readable bytes. If there are enough, the data is read; otherwise a ReplayError is thrown (Netty 4 uses an internal Signal for this purpose). When ReplayingDecoder catches the ReplayError, it keeps the data and waits for the next IO event to try reading again.

That is, the actual argument in received by the decode method of a ReplayingDecoder is not the original ByteBuf but a ReplayingDecoderBuffer. It extends the ByteBuf class, wraps most of the read methods, and performs the length check before each read.
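
The decorate, throw and rewind idea can be modelled in a few lines of plain JDK code. This is a sketch under stated assumptions, not Netty’s source: ReplaySketch, ReplaySignal and CheckedBuffer are hypothetical names, and java.nio.ByteBuffer replaces ByteBuf.

```java
import java.nio.ByteBuffer;

final class ReplaySketch {
    /** Thrown instead of reading when the buffer is too short (plays the role of ReplayError). */
    static final class ReplaySignal extends RuntimeException {
        ReplaySignal() {
            super(null, null, false, false); // no stack trace: it is only used for control flow
        }
    }

    /** Decorator that checks the length before every read. */
    static final class CheckedBuffer {
        private final ByteBuffer buf;

        CheckedBuffer(ByteBuffer buf) {
            this.buf = buf;
        }

        int readInt() {
            if (buf.remaining() < Integer.BYTES) {
                throw new ReplaySignal();
            }
            return buf.getInt();
        }
    }

    /** Returns the sum of two ints, or null if more data is needed. */
    static Integer tryDecode(ByteBuffer buf) {
        int checkpoint = buf.position();
        try {
            CheckedBuffer in = new CheckedBuffer(buf);
            return in.readInt() + in.readInt(); // the decode logic contains no length checks
        } catch (ReplaySignal signal) {
            buf.position(checkpoint);           // rewind and retry once more bytes have arrived
            return null;
        }
    }
}
```

The decode logic reads as if all data were present; the cost is that a failed attempt is thrown away and repeated from the checkpoint.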

Of course, ReplayingDecoder is useful for more than length checks; its more important role is decoding packets that were split in transit.

3. Packet decoder

The underlying communication protocol transmits data in packets. A piece of data may reach the peer in several parts, which means that the packets sent by the sender may be split and reassembled several times in transit. The packets received by the receiver are therefore not necessarily identical to those sent by the sender.

In Java OIO stream-based code this problem is usually not visible, because a blocking read holds the program until the requested data has been read in full before execution continues. In Java NIO, however, reads are non-blocking, so the data received in one read may not line up with the packets sent by the sender. For example, the sender sends ABC and DEF, and the receiver receives ABCD and EF.

ReplayingDecoder can also solve this kind of problem. During parsing, if the readable data in the current ByteBuf is not enough, ReplayingDecoder ends the parsing attempt and tries again once enough data is readable. All of this is implemented inside ReplayingDecoder and requires no work in user code. Unlike an integer, a string does not have a fixed length. Therefore string transmission in Netty generally uses the common Header-Content protocol:

The Header part of the protocol holds the byte length of the string. It can be stored as an integer.
The Content part of the protocol holds the byte array of the string.
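
As a concrete illustration of this layout, the sender side can be sketched with the plain JDK. The class name HeaderContentFrames is hypothetical, and the 4-byte big-endian header is the same assumption the decoders in this article make.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class HeaderContentFrames {
    /** Header = 4-byte big-endian length, Content = UTF-8 bytes of the string. */
    static byte[] encode(String text) {
        byte[] content = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + content.length)
                .putInt(content.length) // Header
                .put(content)           // Content
                .array();
    }

    public static void main(String[] args) {
        // "hi" becomes 6 bytes: 00 00 00 02 68 69
        for (byte b : encode("hi")) {
            System.out.printf("%02x ", b);
        }
    }
}
```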

During actual transmission, a Header-Content packet is encoded into one ByteBuf at the sending end. By the time it reaches the receiving end, it may have been split into several ByteBuf fragments. How can these uneven fragments be decoded back into the original content packet?

ReplayingDecoder has a very important state member attribute, which records the stage the decoder has currently reached in the decoding process. The type of this attribute is the generic type parameter of ReplayingDecoder, and ReplayingDecoder provides a constructor that takes the initial state. It also provides the checkpoint(state) method, which sets the state to a new value and records the read breakpoint pointer.

The read breakpoint pointer is another important member of the ReplayingDecoder class. It saves the starting read index of the internal ReplayingDecoderBuffer, somewhat like a mark. When a read finds that there is not enough readable data, a ReplayError is thrown and the read index is restored to the breakpoint set by the last checkpoint call. The next read therefore starts again from that breakpoint.

So the read can be split into two stages: the first stage reads the length and the second stage reads the string. Using the state attribute of ReplayingDecoder described above, a custom string packet decoder can be implemented as follows:

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ReplayingDecoder;

import java.nio.charset.StandardCharsets;
import java.util.List;

public class StringReplayDecoder extends ReplayingDecoder<StringReplayDecoder.Status> {

    private int length;
    private byte[] inBytes;

    public StringReplayDecoder() {
        // Initial state: the next thing to read is the length header
        super(Status.PARSE_1);
    }

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
        switch (state()) {
            case PARSE_1 -> {
                // Stage 1: read the length, then move the checkpoint past the header
                length = in.readInt();
                inBytes = new byte[length];
                checkpoint(Status.PARSE_2);
            }
            case PARSE_2 -> {
                // Stage 2: read the content; replayed from the checkpoint if bytes are missing
                in.readBytes(inBytes, 0, length);
                out.add(new String(inBytes, StandardCharsets.UTF_8));
                checkpoint(Status.PARSE_1);
            }
        }
    }

    enum Status {
        PARSE_1, PARSE_2
    }
}

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

public class StringProcessHandler extends ChannelInboundHandlerAdapter {

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        System.out.println("Print out a string:" + msg);
        super.channelRead(ctx, msg);
    }
}

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.embedded.EmbeddedChannel;

import java.nio.charset.StandardCharsets;

public class StringReplayDecoderTester {

    public static void main(String[] args) {
        EmbeddedChannel embeddedChannel = new EmbeddedChannel(new ChannelInitializer<EmbeddedChannel>() {
            @Override
            protected void initChannel(EmbeddedChannel ch) throws Exception {
                ch.pipeline().addLast(new StringReplayDecoder())
                        .addLast(new StringProcessHandler());
            }
        });
        final String str = "Hello world!";
        // Use the same charset for the length header and for the content
        final byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < 3; i++) {
            ByteBuf buf = Unpooled.buffer();
            // Header: total byte length of the content that follows
            buf.writeInt((i + 1) * bytes.length);
            // Content: the string repeated i + 1 times
            for (int j = 0; j < i + 1; j++) {
                buf.writeBytes(bytes);
            }
            embeddedChannel.writeInbound(buf);
        }
    }
}
Output result:
Print out a string:Hello world!
Print out a string:Hello world!Hello world!
Print out a string:Hello world!Hello world!Hello world!

As you can see, the strings we wrote were printed correctly. This relies on the state attribute of ReplayingDecoder. When the decoder is created, its state is PARSE_1. In this state it tries to read an integer, which is the length of the string (the packet); it then changes the state to PARSE_2 and the decode method returns. Because nothing has been added to the out list, the second processor is not triggered. On the next call the decoder is in the PARSE_2 state, so it calls the in.readBytes method to read the specified number of bytes. If the string is incomplete because the packet was split in transit, the target length cannot be reached: a ReplayError is thrown and the read index is restored to the breakpoint set by the last checkpoint call, so the next read starts again from that position. This guarantees that packets are read correctly.

Although split ByteBuf packets can be decoded correctly this way, inheriting from this class is not recommended in real development, because:

Not all ByteBuf operations are supported by the ReplayingDecoderBuffer decorator class. Calling an unsupported operation in the decode method throws an exception.
When the parsing logic is complex, ReplayingDecoder parses relatively slowly. When the ByteBuf does not contain enough data, ReplayingDecoder catches a ReplayError, restores the read index of the ByteBuf to the previous read breakpoint (checkpoint), ends the parsing attempt, and waits for the next IO read event. Under poor network conditions the parsing logic for a single packet may therefore be executed many times, and if parsing is CPU-intensive this places a heavy burden on the CPU.

Therefore, ReplayingDecoder is more suitable for scenarios with simple data parsing logic. For complex scenarios, it is recommended to use ByteToMessageDecoder or its subclasses, as follows:

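import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;

import java.nio.charset.StandardCharsets;
import java.util.List;

public class StringIntegerHeaderDecoder extends ByteToMessageDecoder {
    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
        // The header (an int) needs 4 bytes; wait for more data if they are not all here yet
        if (in.readableBytes() < 4) {
            return;
        }
        // Remember the position before the header so that the read can be undone
        in.markReaderIndex();
        int length = in.readInt();
        if (in.readableBytes() < length) {
            // Content incomplete: reset to before reading the length and wait for more data
            in.resetReaderIndex();
            return;
        }
        byte[] inBytes = new byte[length];
        in.readBytes(inBytes, 0, length);
        out.add(new String(inBytes, StandardCharsets.UTF_8));
    }
}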

On the surface, the ByteToMessageDecoder base class is stateless, unlike ReplayingDecoder, which uses state to record the current reading stage. In fact, ByteToMessageDecoder keeps an internal cumulation buffer that holds the bytes that have not been parsed yet. ByteToMessageDecoder and its subclasses are therefore stateful processors and cannot be shared between channels: every time a channel pipeline is initialized, a new instance of the decoder must be created.
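
To see why the leftover bytes make a decoder stateful, here is a plain-JDK sketch of the same length-prefixed decoding with an explicit cumulation field. It is an illustration only, not Netty’s implementation; LengthPrefixedStringDecoder and feed are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LengthPrefixedStringDecoder {
    // Per-connection state: bytes received so far that do not yet form a whole packet.
    private ByteBuffer cumulation = ByteBuffer.allocate(0);

    /** Feeds one received fragment in and returns every string completed by it. */
    List<String> feed(byte[] chunk) {
        // Append the new fragment to the leftover bytes of earlier reads.
        ByteBuffer in = ByteBuffer.allocate(cumulation.remaining() + chunk.length);
        in.put(cumulation).put(chunk).flip();

        List<String> out = new ArrayList<>();
        while (in.remaining() >= Integer.BYTES) {
            in.mark();
            int length = in.getInt();
            if (length < 0) {
                throw new IllegalStateException("negative length: " + length);
            }
            if (in.remaining() < length) {
                in.reset(); // content incomplete: go back to before the header
                break;
            }
            byte[] content = new byte[length];
            in.get(content);
            out.add(new String(content, StandardCharsets.UTF_8));
        }
        // Keep the unread rest for the next fragment.
        cumulation = in.slice();
        return out;
    }
}
```

If two connections fed their fragments into the same instance, the leftover bytes of one would be prepended to the data of the other, which is why each pipeline needs its own decoder.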

3. MessageToMessageDecoder decoder

The previous decoders all decode the binary data in a ByteBuf buffer into ordinary Java POJO objects. To convert one POJO object into another POJO object, inherit a different Netty decoder base class, MessageToMessageDecoder. When inheriting it, specify the generic type argument, which is the Java POJO type of the inbound message.

2. Netty’s built-in Decoder

Fixed-length packet decoder: FixedLengthFrameDecoder
- Usage scenario: every received data packet has the same fixed length.
- The inbound ByteBuf is split into fixed-length data packets (ByteBuf), which are then sent to the next inbound processor (ChannelHandler).
Line-based packet decoder: LineBasedFrameDecoder
- Usage scenario: each ByteBuf data packet is delimited by a line feed (or a carriage return followed by a line feed).
Custom delimiter packet decoder: DelimiterBasedFrameDecoder
Custom length packet decoder: LengthFieldBasedFrameDecoder
- A flexible-length decoder. A length field in the ByteBuf data packet stores the length of the original data; during decoding, the original data packet is extracted based on this length.

1. LineBasedFrameDecoder decoder

The string packet decoder above transmits content according to the Header-Content protocol. If you do not use this protocol and instead separate the strings at the sending end with a newline (“\n” or “\r\n”), you need the LineBasedFrameDecoder decoder.

The working principle of this decoder is simple. It scans the readable bytes of the ByteBuf and checks whether the byte stream contains the newline “\n” or “\r\n”. If it does, that position is used as the end position, and the bytes from the reader index up to that position form one successfully decoded ByteBuf data packet.

LineBasedFrameDecoder also supports a maximum length (passed to the constructor), which is the maximum number of bytes a line may contain. If no newline is found within the maximum length, a TooLongFrameException is thrown.
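
The scanning rule can be sketched with the plain JDK as follows. This simplified model works on one complete byte array and ignores bytes after the last newline, whereas the real decoder keeps them for the next read; LineSplitSketch and splitLines are hypothetical names, and the exception type is a stand-in.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LineSplitSketch {
    /** Splits complete lines out of 'data'; "\n" and "\r\n" both end a line. */
    static List<String> splitLines(byte[] data, int maxLength) {
        List<String> frames = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] != '\n') {
                continue;
            }
            // Drop the "\r" of a "\r\n" pair so that it is not part of the line.
            int end = (i > start && data[i - 1] == '\r') ? i - 1 : i;
            if (end - start > maxLength) {
                throw new IllegalStateException("line longer than " + maxLength + " bytes");
            }
            frames.add(new String(data, start, end - start, StandardCharsets.UTF_8));
            start = i + 1;
        }
        return frames;
    }
}
```

For the input "ab\r\ncd\nef" this yields the two lines "ab" and "cd"; "ef" has no terminating newline yet and is not emitted.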

2. DelimiterBasedFrameDecoder decoder

The DelimiterBasedFrameDecoder decoder can use not only newlines but also other special characters as packet delimiters, such as the tab character “\t”. One of its constructors is as follows:

public DelimiterBasedFrameDecoder(
    int maxFrameLength,     // Maximum length of a decoded data packet
    boolean stripDelimiter, // Whether the delimiter is removed from the decoded data packet
    ByteBuf delimiter)      // The delimiter

The delimiter is of ByteBuf type, that is, the byte array corresponding to the delimiter needs to be wrapped in ByteBuf.
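
What the stripDelimiter flag changes can be shown with a small plain-JDK sketch. It is not Netty code: DelimiterSplitSketch and split are hypothetical names, and the delimiter is passed as a byte array instead of a ByteBuf.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DelimiterSplitSketch {
    /** Cuts 'data' at every occurrence of 'delimiter'; trailing bytes without a delimiter are ignored. */
    static List<byte[]> split(byte[] data, byte[] delimiter, boolean stripDelimiter) {
        if (delimiter.length == 0) {
            throw new IllegalArgumentException("empty delimiter");
        }
        List<byte[]> frames = new ArrayList<>();
        int start = 0;
        int i = 0;
        while (i + delimiter.length <= data.length) {
            boolean match = Arrays.equals(
                    Arrays.copyOfRange(data, i, i + delimiter.length), delimiter);
            if (!match) {
                i++;
                continue;
            }
            // Either stop before the delimiter or keep it as part of the packet.
            int end = stripDelimiter ? i : i + delimiter.length;
            frames.add(Arrays.copyOfRange(data, start, end));
            i += delimiter.length;
            start = i;
        }
        return frames;
    }
}
```

With the tab character as delimiter, "a\tbc\td" gives the packets "a" and "bc" when stripDelimiter is true, and "a\t" and "bc\t" when it is false.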

3. LengthFieldBasedFrameDecoder decoder

LengthFieldBasedFrameDecoder can be translated as the length-field packet decoder. The value of the length field in a transmitted packet is the number of bytes of content carried by the packet. For ordinary content transmission based on the Header-Content protocol, prefer the built-in LengthFieldBasedFrameDecoder for decoding.

Its most commonly used constructor is as follows:

public LengthFieldBasedFrameDecoder(
    int maxFrameLength, // Maximum length of data packet sent
    int lengthFieldOffset, // length field offset value
    int lengthFieldLength, //The number of bytes occupied by the length field itself
    int lengthAdjustment, // Offset correction of length field
    int initialBytesToStrip) // The starting number of bytes to discard

maxFrameLength: the maximum length of a data packet
lengthFieldOffset: the index of the first byte of the length field within the byte array of the whole packet
lengthFieldLength: the number of bytes occupied by the length field; 4 if the length field is an int
lengthAdjustment: when the transmission protocol is more complex (for example, it contains a protocol version number or a magic number in addition to the length field), the length must be corrected during decoding. The correction value is calculated as: content field offset – length field offset – number of bytes of the length field
initialBytesToStrip: the number of bytes at the start of the packet, in front of the valid Content field, that belong to other fields and are discarded from the final decoded result

Assume that the data packet ByteBuf (58 bytes) we want to transmit includes the following three parts:

Length field (4 bytes): 52
Version field (2 bytes): 10
content field (52 bytes): xxxxx

The parameters passed to the constructor of the LengthFieldBasedFrameDecoder decoder are then:

The maximum length can be set to 1024
The length field offset is 0
The length field length is 4
The length adjustment is 2, that is, there are 2 bytes between the end of the length field and the start of the content
To obtain only the bytes of the Content, the first 6 bytes are discarded (initialBytesToStrip is 6)
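
The arithmetic behind these five values can be checked with a short plain-JDK sketch. It is only a model of the calculation: LengthFieldSketch and extract are hypothetical names, and the sketch supports only a 4-byte big-endian length field.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class LengthFieldSketch {
    /** Applies the four layout parameters to one complete packet and returns the bytes that remain. */
    static byte[] extract(byte[] packet, int lengthFieldOffset, int lengthFieldLength,
                          int lengthAdjustment, int initialBytesToStrip) {
        if (lengthFieldLength != Integer.BYTES) {
            throw new IllegalArgumentException("this sketch only supports a 4-byte length field");
        }
        // Value stored in the length field (52 in the example above).
        int lengthValue = ByteBuffer.wrap(packet).getInt(lengthFieldOffset);
        // Whole packet = bytes up to the end of the length field + adjustment + length value.
        int frameLength = lengthFieldOffset + lengthFieldLength + lengthAdjustment + lengthValue;
        if (packet.length < frameLength) {
            throw new IllegalArgumentException("incomplete packet");
        }
        return Arrays.copyOfRange(packet, initialBytesToStrip, frameLength);
    }

    public static void main(String[] args) {
        // 4-byte length (52) + 2-byte version (10) + 52 content bytes = 58 bytes
        byte[] packet = ByteBuffer.allocate(58).putInt(52).putShort((short) 10).array();
        System.out.println(extract(packet, 0, 4, 2, 6).length); // prints 52
    }
}
```

Here 0 + 4 + 2 + 52 = 58 covers the whole packet, and stripping the first 6 bytes leaves exactly the 52 content bytes.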

3. Encoder principle and practice

After business processing in Netty is complete, the result is often a Java POJO object, which must be encoded into a ByteBuf and written to the underlying Java channel through the pipeline.

Encoders mirror decoders. An encoder is an outbound processor that encodes the input from the previous station (usually some kind of Java POJO object) into a binary ByteBuf, or into another Java POJO object.

An encoder is an implementation of the ChannelOutboundHandler outbound processor. After an encoder has encoded the outbound object, the data is passed to the next ChannelOutboundHandler outbound processor for further outbound processing. Since only ByteBuf data can ultimately be written to the channel, the first encoder added to the channel pipeline must encode the data into a ByteBuf (outbound processing runs from back to front, so the first encoder added is the last one to run).
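
The back-to-front order can be made concrete with a plain-JDK sketch in which each encoder is modelled as a function. This is an illustration of the ordering rule only; OutboundOrderSketch and write are hypothetical names and not Netty APIs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class OutboundOrderSketch {
    /** Runs the processors from the last one added to the first one added. */
    static Object write(List<Function<Object, Object>> pipeline, Object msg) {
        for (int i = pipeline.size() - 1; i >= 0; i--) {
            msg = pipeline.get(i).apply(msg);
        }
        return msg;
    }

    public static void main(String[] args) {
        List<Function<Object, Object>> pipeline = new ArrayList<>();
        // Added first, runs last: must turn the message into bytes.
        pipeline.add(m -> ByteBuffer.wrap(((String) m).getBytes(StandardCharsets.UTF_8)));
        // Added last, runs first: POJO-to-POJO encoding (Integer -> String).
        pipeline.add(m -> String.valueOf(m));

        ByteBuffer bytes = (ByteBuffer) write(pipeline, 42);
        System.out.println(bytes.remaining()); // prints 2, the bytes of "42"
    }
}
```

Swapping the two add calls would hand an Integer to the byte encoder and fail with a ClassCastException, which is the ordering mistake the rule above guards against.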

1. MessageToByteEncoder encoder

MessageToByteEncoder is a very important encoder base class. Its function is to encode a Java POJO object into a ByteBuf data packet. It is an abstract class that implements only the basic encoding flow; the encoding itself is done by calling the abstract encode method, whose logic must be implemented by subclasses.

To implement your own encoder, inherit the MessageToByteEncoder base class and implement its abstract encode method. When inheriting MessageToByteEncoder, supply the generic type argument, which is the Java POJO type before encoding.

2. MessageToMessageEncoder encoder

Besides encoding POJO objects into ByteBuf data, a POJO object can also be encoded into another POJO object. This is done by inheriting the MessageToMessageEncoder encoder and implementing its abstract encode method. The encode method of the subclass contains the logic that converts the original POJO type into the target POJO type; once an object has been encoded, it is simply added to the List parameter of the encode method.

4. Combination of decoder and encoder

During pipeline processing, data often flows in and out, decoding when it comes in and encoding when it goes out. Therefore, if some kind of encoding logic is added to the same pipeline, it is often necessary to add a corresponding decoding logic.

The encoders and decoders described so far are implemented separately. For example, decoding from ByteBuf to POJO is done by inheriting the ByteToMessageDecoder base class or one of its subclasses, and encoding from POJO to ByteBuf is done by inheriting the MessageToByteEncoder base class or one of its subclasses. In short, an encoder and a decoder with opposite logic are implemented in two different classes, so a matching pair has to be added to the channel pipeline in two separate steps.

Netty therefore provides the Codec types, which implement a matching encoder and decoder in the same class.

1. ByteToMessageCodec codec

The base class for a matching encoder and decoder between POJO and ByteBuf is called ByteToMessageCodec, which is an abstract class. Functionally, inheriting it is equivalent to inheriting both the ByteToMessageDecoder decoder and the MessageToByteEncoder encoder.

ByteToMessageCodec contains two abstract methods, one for encoding and one for decoding, and both must be implemented:

Encoding method: encode(ChannelHandlerContext, I, ByteBuf)
Decoding method: decode(ChannelHandlerContext, ByteBuf, List<Object>)

2. CombinedChannelDuplexHandler combiner

The previous combination of encoder and decoder is achieved through inheritance. Forcing the encoder and decoder logic into the same class is not appropriate for a pipeline that needs only encoding or only decoding.

Besides inheritance, an encoder and a decoder can also be combined through composition. Composition is more flexible than inheritance: the encoder and decoder can be bundled together or used separately.

Netty provides a combiner for this purpose, the CombinedChannelDuplexHandler base class. A class that inherits it does not have to squeeze the encoding and decoding logic into one class as ByteToMessageCodec does; instead it reuses the existing encoder and decoder. It is used as follows:

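import io.netty.channel.CombinedChannelDuplexHandler;

// Integer2ByteEncoder is assumed to be an existing MessageToByteEncoder<Integer>; it is not shown in this article
public class IntegerDuplexHandler extends CombinedChannelDuplexHandler<Byte2IntegerDecoder, Integer2ByteEncoder> {
    public IntegerDuplexHandler() {
        super(new Byte2IntegerDecoder(), new Integer2ByteEncoder());
    }
}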

In short, this class lets an encoder and a decoder with opposite logic be used together or separately, which is very convenient.
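
The difference between composition and inheritance can be sketched with the plain JDK. The combiner below holds an existing decoder and encoder instead of inheriting from them; CombinedCodecSketch and IntCodecParts are hypothetical names and this is not how CombinedChannelDuplexHandler is implemented.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

/** Hypothetical combiner: delegates to an existing decoder and encoder. */
final class CombinedCodecSketch<T> {
    private final Function<ByteBuffer, T> decoder;
    private final Function<T, ByteBuffer> encoder;

    CombinedCodecSketch(Function<ByteBuffer, T> decoder, Function<T, ByteBuffer> encoder) {
        this.decoder = decoder;
        this.encoder = encoder;
    }

    T decode(ByteBuffer in) {
        return decoder.apply(in);
    }

    ByteBuffer encode(T msg) {
        return encoder.apply(msg);
    }
}

/** The two parts stay usable on their own. */
final class IntCodecParts {
    static final Function<ByteBuffer, Integer> DECODER = in -> in.getInt();

    static final Function<Integer, ByteBuffer> ENCODER = value -> {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES);
        buf.putInt(value).flip();
        return buf;
    };
}
```

A pipeline that needs both directions uses new CombinedCodecSketch<>(IntCodecParts.DECODER, IntCodecParts.ENCODER), while a pipeline that only decodes can use IntCodecParts.DECODER alone.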