Motivation
- We deal with IO every day and feel familiar with it, but the picture in each of our minds may not be the same
- “Proficient” in IO, yet shaky on the details -> the thread that strings the pieces together is missing
- High-level languages and frameworks hide the details of IO-related APIs -> predecessors have already paved over the hard parts
- Too basic…not “high-end”…Typical reactions: “That’s it?” “Isn’t that simple?” “I already know everything you’re saying”…or the “F word”
- Time…“Why don’t I spend this time learning algorithms, architecture, or the business?”
It’s interesting to talk about what everyone knows
Bounded Context
IO is too broad…to reduce ambiguity -> limit the scope of this talk:
- IO over network sockets
- OS: Linux
- Language: Java
- TCP/IP stack: TCP and above (rarely anything lower…)
- NIO -> Java New I/O (covering both blocking & non-blocking modes)
How far can a Java user reach?
High-level programming languages are “weak”, very “weak”:
- Function calls -> fast…but cannot do much on their own
- Kernel calls -> frequent user/kernel mode switches…slow…but unavoidable
Even for a single statement:
System.out.println("HelloWorld")
JVM: What is a console? Where is the printer? Help me install a driver!
Kernel: you, you, you…let me do it…
Soft interrupt -> registers record the kernel function & parameters -> kernel executes write…finally…"HelloWorld"
(picture from https://pediaa.com/difference-between-kernel-and-shell)
Kernel -> software that exposes system calls upward and manages IO devices downward -> Linux = kernel + miscellaneous tools
What is a soft interrupt…a mode switch…a system call? -> www.baidu.com
Not going into details…but each of them costs time (usually imperceptible while we develop)
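To make the HelloWorld path a bit more concrete, here is a minimal sketch that skips `System.out` and writes the bytes to the standard output descriptor directly. `FileDescriptor.out` and `FileOutputStream` are real JDK classes; the class name `HelloSyscallSketch` and the method `writeToStdout` are made up for illustration, and the claim that the native write ends up in the Linux `write` system call is the assumption this talk works under.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class HelloSyscallSketch {
    // Writes the text to the process's standard output and returns the byte count.
    static int writeToStdout(String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // FileDescriptor.out wraps the stdout descriptor of the JVM process.
        // The stream is deliberately not closed: closing it would close stdout.
        FileOutputStream stdout = new FileOutputStream(FileDescriptor.out);
        stdout.write(bytes); // native method; the kernel does the real work
        stdout.flush();
        return bytes.length;
    }

    public static void main(String[] args) throws IOException {
        writeToStdout("HelloWorld\n");
    }
}
```

Running it under a tracer such as `strace -f -e trace=write java HelloSyscallSketch.java` should show the kernel call behind the one-liner.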
Title: IO
What is IO in JAVA?
API?
FileInputStream
```java
/**
 * Reads a byte of data from this input stream. This method blocks
 * if no input is yet available.
 *
 * @return     the next byte of data, or <code>-1</code> if the end of the
 *             file is reached.
 * @exception  IOException  if an I/O error occurs.
 */
public int read() throws IOException {
    return read0();
}

private native int read0() throws IOException;
```
Can simply calling read0() talk to the IO device?…It seems something is missing
JVM: Kernel, go talk to the IO device for me (same as the HelloWorld example above)
So…no matter how Java evolves, as long as the JVM cannot interact with IO hardware directly…the kernel does the real interaction
In other words…Java IO APIs ultimately map to -> the corresponding system calls (if one day the JVM can talk to hardware directly…that will be a new chapter)
How does the kernel interact with IO devices?
W. Richard Stevens’s five IO models
- BIO, blocking IO (Java stream-based BIO API / NIO API)
- NIO, non-blocking IO (Java channel-based NIO API)
- IO multiplexing (Java channel-based NIO API)
- Signal-driven IO
- AIO, asynchronous IO
No matter which model…the behavior can be split into two steps
- 1 Wait for the data to be ready
- 2 Copy the data from the kernel buffer -> application (JVM instance)
The IO models differ in how they behave during these two steps
BIO: put simply -> after the process/thread makes the system call, it is blocked (stuck) until the kernel’s corresponding read has completed
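A minimal sketch of that blocking behavior, using the classic `java.net` stream API over a loopback connection. The class name `BlockingReadSketch`, the method `echoOnce` and the delay parameter are invented for this example; the point is only that `accept()` and `read()` do not return until the kernel has something to hand over.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class BlockingReadSketch {
    // Starts a loopback server, lets a client send `message` after `delayMillis`,
    // and returns what the server's blocking read received.
    static String echoOnce(String message, long delayMillis) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(server.getInetAddress(), server.getLocalPort())) {
                    Thread.sleep(delayMillis); // the server thread is stuck in read() meanwhile
                    OutputStream out = s.getOutputStream();
                    out.write(message.getBytes(StandardCharsets.UTF_8));
                    out.flush();
                } catch (IOException | InterruptedException e) {
                    throw new RuntimeException(e);
                }
            });
            client.start();
            try (Socket conn = server.accept()) {   // blocks until a client connects
                InputStream in = conn.getInputStream();
                byte[] buf = new byte[1024];
                int n = in.read(buf);                // blocks until bytes arrive (or EOF)
                client.join();
                return n < 0 ? "" : new String(buf, 0, n, StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce("ping", 200));
    }
}
```

While the client sleeps, the calling thread can do nothing else, which is why this model usually ends up with one thread per connection.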
NIO:
- Before the data is ready, the read system call returns immediately -> the process/thread is not blocked (non-blocking)
- Once the data is ready, the read system call blocks the process/thread only while the data is copied to the application
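The two bullets above can be observed from Java with a non-blocking `SocketChannel`. This is a sketch under assumptions: the names `NonBlockingReadSketch` and `readBeforeAndAfter` are hypothetical, and the loopback setup exists only to keep the example self-contained.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

class NonBlockingReadSketch {
    // Returns {bytes read before the peer wrote anything, bytes read once data arrived}.
    static int[] readBeforeAndAfter(String message) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel conn = server.accept()) {
                conn.configureBlocking(false);
                ByteBuffer buf = ByteBuffer.allocate(1024);
                int before = conn.read(buf); // nothing sent yet: returns 0 right away
                client.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                int after;
                while ((after = conn.read(buf)) == 0) {
                    Thread.sleep(1); // polling: every attempt is another system call
                }
                return new int[] {before, after};
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] r = readBeforeAndAfter("hello");
        System.out.println("before=" + r[0] + " after=" + r[1]);
    }
}
```

The `while` loop is the price of this model when used alone: the thread never blocks, but it has to keep asking.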
Multiplexing:
- The select-related system call takes a set of file descriptors as its argument (think of it as a group of sockets)
- While none of the sockets in the group has data ready -> the process/thread that called select stays blocked
- Once data is ready on any of them, the process/thread resumes running and calls read to fetch the data (at this point read is expected to return data right away)
- The read system call is usually put in non-blocking (NIO) mode
- Functionally it is very similar to “BIO + one dedicated thread per connection” -> except that a single process/thread calls select, blocks, and listens on many file descriptors (sockets) at once
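The bullets above can be sketched with `java.nio.channels.Selector`: one thread registers two connections, blocks in `select()`, and wakes up only for the one that has data. The class name `SelectSketch`, the method `firstReadable` and the labels "A"/"B" are invented for illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectSketch {
    // Two connections are registered; only client B sends data.
    // Returns "<label>:<data>" for the connection that select() reported as readable.
    static String firstReadable() throws Exception {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel clientA = SocketChannel.open(server.getLocalAddress());
                 SocketChannel connA = server.accept();
                 SocketChannel clientB = SocketChannel.open(server.getLocalAddress());
                 SocketChannel connB = server.accept()) {
                connA.configureBlocking(false); // required before registering with a Selector
                connB.configureBlocking(false);
                connA.register(selector, SelectionKey.OP_READ, "A");
                connB.register(selector, SelectionKey.OP_READ, "B");

                clientB.write(ByteBuffer.wrap("hi".getBytes(StandardCharsets.UTF_8)));

                while (selector.select() == 0) {
                    // select() blocks until at least one registered channel is ready
                }
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                SelectionKey key = it.next();
                it.remove();
                ByteBuffer buf = ByteBuffer.allocate(16);
                int n = ((SocketChannel) key.channel()).read(buf);
                return key.attachment() + ":" + new String(buf.array(), 0, n, StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstReadable());
    }
}
```

Connection A stays registered the whole time but never costs a read attempt, because the kernel only reports the descriptor that is actually ready.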
A brief summary
The BIO APIs (all there was before JDK 1.4) only involve the system calls of the kernel’s BIO model
The NIO (New I/O) APIs, introduced in JDK 1.4, involve the system calls of the kernel’s BIO, NIO & multiplexing models
Roughly speaking: Java NIO API >= Java BIO API in terms of functionality
Topic: Java Reactor pattern
Why use NIO?
The limitations of BIO…you can probably find plenty of them (www.baidu.com):
- Threads
- Concurrency
- Resources
- …
Simplicity comes at a high price
Why use the Reactor pattern with Java NIO?
Standing on the shoulders of giants…do we still remember what the ground feels like?
First of all…can an NIO-based application be built without the select (multiplexing) APIs? YES
Code as proof:
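```java
// @author: Yukai
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

import lombok.SneakyThrows;

public class JavaPureNioDemo {

    public static void main(String[] args) throws IOException {
        List<SocketChannel> socketChannels = new ArrayList<>();
        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.configureBlocking(false);
        ssc.bind(new InetSocketAddress(9999));
        while (true) {
            /*
             * Attention... system calls are going on like crazy here,
             * although you can't feel it.
             */
            SocketChannel sc = ssc.accept(); // returns null when no connection is pending
            if (sc != null) {
                sc.configureBlocking(false);
                socketChannels.add(sc);
            }
            // Poll every known connection on every iteration, whether it has data or not.
            socketChannels.forEach(JavaPureNioDemo::printReadInfo);
        }
    }

    @SneakyThrows // Lombok: rethrows the checked IOException without declaring it
    public static void printReadInfo(SocketChannel sc) {
        ByteBuffer buffer = ByteBuffer.allocate(1024 * 4);
        SocketAddress localAddress = sc.getLocalAddress();
        if (sc.read(buffer) <= 0) {
            return; // nothing to read this round (0) or the peer has closed (-1)
        }
        buffer.flip();
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        System.out.println("socketChannel:" + localAddress + " " + new String(bytes));
    }
}
```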
However…NIO used this way tends to be much worse than multithreading + BIO:
BIO + multithreading | NIO + busy polling |
---|---|
Per connection: one system call -> block -> one read per request | Per connection: unlimited system calls -> try to read again and again (often reading nothing) |
One thread per connection | One thread manages many connections |

The first row alone…is enough to veto this approach
So…can one thread manage many connections while avoiding the flood of system calls? YES
Reactor pattern -> integrates the multiplexing API (Selector)
Recommended reading from the master himself (Doug Lea): Scalable IO in Java
Reactor: dispatches IO events to the appropriate handlers
Unlike the operating system’s IO models, Reactor is a programming pattern…when applying it:
- Calls to the Java Selector API -> kernel multiplexing system calls
- Calls to the Java NIO channel API -> kernel non-blocking system calls
- Complex problems get broken down into smaller pieces
- The application replaces “infinite polling” with well-placed blocking
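A rough single-threaded sketch of these points, loosely following the handler-as-attachment idea from Scalable IO in Java. It is not the design from those slides: `EchoReactorSketch`, its `Handler` interface and `roundTrip` are hypothetical names, the server only echoes bytes, and the write loop assumes small replies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

class EchoReactorSketch implements Runnable {
    /** Hypothetical handler type: what the reactor dispatches an event to. */
    interface Handler { void handle() throws IOException; }

    private final Selector selector;
    private final ServerSocketChannel server;
    private volatile boolean running = true;

    EchoReactorSketch() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT, (Handler) this::accept);
    }

    int port() { return server.socket().getLocalPort(); }

    void stop() {
        running = false;
        selector.wakeup(); // make a blocked select() return so the loop can exit
    }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select(); // the only blocking point: no busy polling
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isValid()) ((Handler) key.attachment()).handle(); // dispatch
                }
                selector.selectedKeys().clear();
            }
            for (SelectionKey key : selector.keys()) key.channel().close();
            selector.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void accept() throws IOException {
        SocketChannel conn = server.accept();
        if (conn == null) return;
        conn.configureBlocking(false);
        Handler onRead = () -> echo(conn);
        conn.register(selector, SelectionKey.OP_READ, onRead);
    }

    private void echo(SocketChannel conn) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        if (conn.read(buf) < 0) { conn.close(); return; } // peer closed; key is cancelled too
        buf.flip();
        while (buf.hasRemaining()) conn.write(buf); // sketch: assumes small replies
    }

    // Usage: start the reactor, send one message, return the echoed reply.
    static String roundTrip(String message) throws Exception {
        EchoReactorSketch reactor = new EchoReactorSketch();
        Thread loop = new Thread(reactor);
        loop.start();
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer reply = ByteBuffer.allocate(bytes.length);
        try (SocketChannel client = SocketChannel.open(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), reactor.port()))) {
            client.write(ByteBuffer.wrap(bytes));
            while (reply.hasRemaining() && client.read(reply) >= 0) {
                // blocking client: keep reading until the whole echo is back
            }
        } finally {
            reactor.stop();
            loop.join();
        }
        return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
    }
}
```

Accepting and reading are both just handlers hanging off a key, so adding a protocol parser later means swapping the `onRead` handler rather than touching the loop.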
A Reactor example
We are going to write a simple embedded NIO web container
Divide-and-conquer
First…set a small goal:
- A Embeddable
- B Uses Java NIO
- C Applies the Reactor pattern
- D Runs…and can parse business data packets carried over the HTTP protocol
Then…let’s look at what BIO and NIO have in common and how they differ at the API level
The biggest commonality is probably: both are stream-oriented APIs, and both deal with byte streams (byte[])
The biggest difference is probably: how data is read from the socket connection
- BIO reads once (after recovering from the blocked state, the read API reads byte by byte until -1) -> the data read can always be deserialized into a complete business data packet
- NIO reads once (per read event), and it is not certain that the byte[] read corresponds to a complete business data packet: e.g. one read of an HTTP request may return only a single byte, which cannot be deserialized into a message
Solution: caching
1 Wrap the SocketChannel bound to each connection in an object (XSocket)
2 Each XSocket maintains an elastic (growable) cache
3 On every NIO read event -> read as much as possible (while read returns > 0) and append it to the cache without further thought
4 After reading, let the parser for the specific protocol try to deserialize a business data packet (on success, the consumed bytes must be trimmed from the cache)
5 Execute the business logic
6 Repeat steps 3, 4, 5
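Steps 2 to 4 can be sketched as a small growable cache with a "try to parse" method. Everything here is an assumption made for illustration: the name `XSocketCacheSketch`, the doubling growth policy, and the choice to treat an HTTP request head (terminated by a blank line) as the "business data packet". The SocketChannel wrapping from step 1 is left out to keep the sketch short.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class XSocketCacheSketch {
    private byte[] cache = new byte[64]; // elastic: grows when a chunk does not fit
    private int size = 0;

    // Step 3: append whatever this read event produced, complete message or not.
    void append(ByteBuffer chunk) {
        int n = chunk.remaining();
        if (size + n > cache.length) {
            cache = Arrays.copyOf(cache, Math.max(cache.length * 2, size + n));
        }
        chunk.get(cache, size, n);
        size += n;
    }

    // Step 4: try to cut one complete HTTP request head (ending in CRLF CRLF) out of the cache.
    // Returns null when the cache does not hold a complete head yet.
    String tryParseHead() {
        for (int i = 0; i + 3 < size; i++) {
            if (cache[i] == '\r' && cache[i + 1] == '\n'
                    && cache[i + 2] == '\r' && cache[i + 3] == '\n') {
                int end = i + 4;
                String head = new String(cache, 0, end, StandardCharsets.ISO_8859_1);
                // Trim: move the leftover bytes (start of the next message) to the front.
                System.arraycopy(cache, end, cache, 0, size - end);
                size -= end;
                return head;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        XSocketCacheSketch cache = new XSocketCacheSketch();
        cache.append(ByteBuffer.wrap("GET / HT".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(cache.tryParseHead()); // incomplete so far
        cache.append(ByteBuffer.wrap("TP/1.1\r\nHost: a\r\n\r\n".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(cache.tryParseHead());
    }
}
```

A real container would also need to handle request bodies and put an upper bound on the cache size; both are outside this sketch.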
However…all of the above…seems possible without a Selector
So what role does the Selector…play?
My understanding: it lets the main loop’s thread block at the right moments, which greatly reduces system calls (remember the example just now that did not use a Selector)
Complete container design
Argument Code
Reference:
http://gee.cs.oswego.edu/dl/cpjslides/nio.pdf
http://tutorials.jenkov.com/java-nio/non-blocking-server.html
W. Richard Stevens – Unix Network Programming Volume 1 3rd Edition – The Sockets Networking API