Blocking and non-blocking network IO and the reactor model

Table of contents

  • 1. Network `IO` Responsibilities
    • Operating on `IO`
      • `IO` operation modes
      • The specific difference between blocking and non-blocking `IO`
        • The flow of blocking `IO` in system calls
        • The flow of non-blocking `IO` in system calls
      • Network programming system calls combine detection and operation
        • `accept`:
        • `read`:
        • `write`:
  • 2. How system calls behave with non-blocking `IO`
    • `connect`
    • `accept`
    • `read`
    • `write`
  • 3. When a network connection is disconnected
    • Active close
    • Passive close
  • 4. Separating detection from operation
    • Taking `epoll` as an example of `IO` multiplexing
      • Establishing a connection
      • Disconnecting
      • Message arrival
      • Message sending
    • Detailed explanation of `epoll`
      • The four parameters of `epoll_wait`
  • 5. Introducing `Reactor`
    • What is `reactor`?
    • Why is there a `reactor`?
    • Why is `reactor` paired with non-blocking `IO`?
  • 6. Applications of `reactor` in middleware
    • 1. `redis`
    • 2. `memcache`
    • 3. `nginx`

1. Network IO Responsibilities

Operating on IO

IO operation modes

  • Blocking IO
  • Non-blocking IO

Use fcntl to set whether an fd is blocking or non-blocking; once an fd is marked non-blocking, every system call that uses this fd behaves non-blockingly.
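
For example, a minimal sketch of switching an fd to non-blocking with fcntl (the helper name here is illustrative):

        #include <fcntl.h>

        // Sketch: mark an already-created fd (socket, pipe, ...) as non-blocking.
        // After this, accept/connect/read/write on this fd return immediately
        // instead of blocking.
        int set_nonblocking(int fd) {
            int flags = fcntl(fd, F_GETFL, 0);              // read the current file status flags
            if (flags == -1) return -1;
            return fcntl(fd, F_SETFL, flags | O_NONBLOCK);  // add the O_NONBLOCK flag
        }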

The specific difference between blocking and non-blocking IO:

The difference lies in what happens when the data is not ready for processing:

The flow of blocking IO in system calls

read blocking call flow: first check whether the kernel buffer contains data. If it does not, block and wait. When the client sends us data and the network card driver fills the kernel buffer, the blocked system call is woken up, the data is copied to user space, and the call returns.

write blocking call flow: first check whether the kernel buffer has room to write (i.e., whether it is full). If it is full, block until the network protocol stack has sent out some of the kernel buffer's data; once there is free space, the blocking ends, the data is written into the kernel buffer, and the call returns the number of bytes actually copied into kernel space.

**accept blocking call flow:** check whether there is a node in the full connection queue. If not, block and wait; if there is, take it out and return the newly assigned client fd.

The flow of non-blocking IO in system calls

The call returns immediately, regardless of whether the kernel buffer is readable or writable, and sets errno when necessary.

Network programming system calls combine detection and operation

accept:

  • Detection: whether the full connection queue has pending connections
  • Operation: take a connection out of the full connection queue and return the newly created client fd along with the client's IP and port

read:

  • Detection: Whether the kernel buffer has data or not
  • Operation: copy the kernel buffer data to the user mode buffer

write:

  • Detection: whether the kernel buffer has room for more data (i.e., whether it is full)
  • Operation: Copy the user-mode buffer to the kernel-mode buffer

2. How system calls behave with non-blocking IO

connect

  • Calling connect repeatedly on a non-blocking socket may give different return values and errno values. The first call returns -1 with errno set to EINPROGRESS, telling us that a connection is being established. On a subsequent call, errno == EISCONN tells us that the connection has been established successfully.
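
A hedged sketch of that pattern, assuming sockfd is a non-blocking socket and server_addr has already been filled in: call connect again and inspect errno.

        #include <errno.h>
        #include <stdio.h>
        #include <sys/socket.h>

        // Sketch: sockfd is a non-blocking socket and server_addr has been
        // filled in beforehand (both are placeholders here).
        int ret = connect(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr));
        if (ret == 0 || errno == EISCONN) {
            printf("connection established\n");       // a repeated call reports EISCONN
        } else if (errno == EINPROGRESS || errno == EALREADY) {
            printf("connection still in progress\n"); // the first call reports EINPROGRESS
        } else {
            perror("connect");                        // a real error
        }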

accept

  • If the full connection queue is empty (no client has completed a connection to the server), accept returns -1 and errno is set to EWOULDBLOCK, telling us that the full connection queue is empty

read

  • After a read call, check errno:
    • read == -1 && errno == EWOULDBLOCK: the kernel read buffer has no data yet
    • read == -1 && errno == EINTR: the call was interrupted by a signal while data could have been read

write

  • After a write call, check errno:
    • Same as above, except that EWOULDBLOCK here means the kernel send buffer is full
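
A minimal sketch of interpreting these return values on a non-blocking fd (fd and buf are placeholders):

        #include <errno.h>
        #include <unistd.h>

        // Sketch: fd is a non-blocking socket and buf is a user-space array
        // (both placeholders).
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            // n bytes were copied from the kernel buffer into buf
        } else if (n == 0) {
            // the peer closed the connection (see section 3)
        } else if (errno == EWOULDBLOCK || errno == EAGAIN) {
            // the kernel read buffer has no data yet -- try again later
        } else if (errno == EINTR) {
            // interrupted by a signal -- safe to retry
        }
        // write() is symmetric: -1 with EWOULDBLOCK/EAGAIN means the kernel
        // send buffer is currently full.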

3. When a network connection is disconnected

Active close

  • shutdown closes one end
  • close closes both ends (it first releases the fd's resources, and then the read and write ends are closed accordingly)
  • The client's write end is paired with the server's read end, and the client's read end is paired with the server's write end
    • So when the client calls shutdown to close its read end, the corresponding write end on the server is effectively closed (the server has to detect this itself)

Passive close

  • read returns 0: tells us the connection has been closed; more precisely, the server's read end is closed

  • write returns -1 && errno == EPIPE: tells us that the server's write end is closed

  • Why do we need to distinguish between read-end closure and write-end closure?

    • Because some server frameworks need to support a half-closed state
      • If only the read end is closed, the server can still call write to send this connection's unsent data; this corresponds to the stage of the TCP four-way close where the peer has sent its FIN but we have not yet sent ours (sketched below)
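
A minimal sketch of this half-close handling, assuming read() on connfd has just returned 0 and pending_buf/pending_len are hypothetical leftovers queued for this connection:

        #include <unistd.h>

        // Sketch: read() has just returned 0 on connfd, i.e. the peer sent its
        // FIN and our read end is closed, but our write end still works.
        // pending_buf/pending_len are hypothetical unsent data for this connection.
        if (pending_len > 0) {
            write(connfd, pending_buf, pending_len);  // flush the unsent data
        }
        close(connfd);  // now send our own FIN and release the fd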

4. Separating detection from operation

accept itself detects whether there are nodes in the full connection queue; with IO multiplexing, that detection is done separately, so that by the time execution reaches accept the IO is guaranteed to be ready.

IO multiplexing can also detect the readiness of many IOs at the same time.

Taking epoll as an example of IO multiplexing

Establishing a connection

  • Flow for detecting an incoming (passive) connection

    • socket
    • bind & listen
    • Register listenfd for read events (EPOLLIN)
      • How does epoll monitor this? epoll listens for read events on listenfd. When a client connects and the handshake completes, a new node is added to the full connection queue, and this sends a signal (EPOLLIN) telling epoll (likewise select or poll) that the read event has fired, meaning the incoming-connection IO is ready. accept is then called; at this point accept only performs the operation, without detection, because the IO multiplexing layer has already done the checking (a small accept-side sketch follows the connect demo below)
  • Flow for detecting an active (outgoing) connection

    • The server acts as a client to connect to other services such as mysql

    • Register for write events (EPOLLOUT)

      • The server calls connect, which sends a SYN packet; at this point the connection status is "in progress" (errno == EINPROGRESS). How does IO multiplexing detect whether the connection has been established? It detects whether the final ACK of the three-way handshake has been sent; when it has, a signal tells epoll that the write event has fired (so when epoll reports a write event being triggered, it means our connection has been established successfully)

      • The following is a small demo where epoll is responsible for the detection of connect

        // Assumptions for this sketch: sockfd was created with socket() and
        // set non-blocking with fcntl(O_NONBLOCK), server_addr has been
        // filled in, and send_buf holds the message to send.
        #define EPOLL_SIZE 10

        struct epoll_event event, events[EPOLL_SIZE];
        int epollfd, ret;

        connect(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr));

        epollfd = epoll_create(EPOLL_SIZE);

        event.events = EPOLLOUT;            // connect success is reported as a write event
        event.data.fd = sockfd;
        ret = epoll_ctl(epollfd, EPOLL_CTL_ADD, sockfd, &event);

        epoll_wait(epollfd, events, EPOLL_SIZE, -1); // -1 means block until an event fires

        if ((events[0].events & EPOLLERR) || (events[0].events & EPOLLHUP)) {
            printf("connect failed\n");
            exit(EXIT_FAILURE);
        } else if (events[0].events & EPOLLOUT) {
            printf("connect success\n");

            // send a message
            ret = write(sockfd, send_buf, strlen(send_buf));
            if (ret == -1) {
                perror("write");
                exit(EXIT_FAILURE);
            }
        }
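
For the accept (passive) side described earlier, a corresponding minimal sketch, assuming listenfd is a bound, listening, non-blocking socket and epollfd an existing epoll instance:

        #include <sys/epoll.h>
        #include <sys/socket.h>
        #include <netinet/in.h>

        // Sketch: listenfd is a bound, listening, non-blocking socket and
        // epollfd an existing epoll instance; EPOLL_SIZE as in the demo above.
        struct epoll_event ev, evs[EPOLL_SIZE];
        ev.events = EPOLLIN;                    // new connections arrive as read events
        ev.data.fd = listenfd;
        epoll_ctl(epollfd, EPOLL_CTL_ADD, listenfd, &ev);

        int n = epoll_wait(epollfd, evs, EPOLL_SIZE, -1);
        for (int i = 0; i < n; i++) {
            if (evs[i].data.fd == listenfd && (evs[i].events & EPOLLIN)) {
                struct sockaddr_in cli;
                socklen_t len = sizeof(cli);
                // epoll has already done the detection, so accept here is purely
                // the operation: take a node off the full connection queue
                int clientfd = accept(listenfd, (struct sockaddr *)&cli, &len);
                // ... set clientfd non-blocking and register it with epoll too
            }
        }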
        

Disconnecting

Detect client disconnection by checking events[i].events:

  • EPOLLRDHUP: indicates that the read end is closed (the peer has shut down its write end)
  • EPOLLHUP: indicates that both the read and write ends are closed

Message arrival

  • The client fd triggers EPOLLIN (detection), and then we call read (operation)

Message sending

  • The client fd triggers EPOLLOUT (detection), and then we call write/send (operation)

Detailed explanation of epoll

epoll_create creates a red-black tree and a ready list (a doubly linked list)

epoll_ctl adds, deletes, and modifies nodes in the red-black tree. At the same time, it registers a callback with the network card driver; when the corresponding event is triggered, the callback is invoked and copies the triggered event into the rdlist doubly linked list;

Calling epoll_wait copies the ready events in rdlist to user space (after they are copied out, those events are cleared from the kernel's ready queue)

The four parameters of epoll_wait

  • Parameters

    • The epoll file descriptor

    • A user-space array used to receive the events from the kernel's ready queue

    • The maximum number of events to copy out

    • Timeout (in milliseconds)

      • epoll itself is synchronous IO; there is no blocking/non-blocking flag here, but whether epoll_wait blocks is controlled by the timeout:

        set to -1, it blocks (waiting on the ready queue as long as there is no event);

        set to 0, it behaves non-blockingly (it just polls the ready queue and returns immediately)

  • Return value

    • The number of events actually fetched
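
A brief sketch of how the four parameters and the return value fit together (epollfd is assumed to be an existing epoll instance; the dispatch comments echo the previous section):

        #include <sys/epoll.h>
        #include <unistd.h>

        #define MAX_EVENTS 1024

        // Sketch: epollfd is an existing epoll instance (placeholder).
        struct epoll_event events[MAX_EVENTS];    // parameter 2: user-space array

        // parameter 1: epollfd, parameter 3: capacity, parameter 4: timeout (-1 = block)
        int nready = epoll_wait(epollfd, events, MAX_EVENTS, -1);

        for (int i = 0; i < nready; i++) {        // return value: events actually fetched
            int fd = events[i].data.fd;
            if (events[i].events & (EPOLLRDHUP | EPOLLHUP)) {
                close(fd);                        // read end closed / both ends closed
            } else if (events[i].events & EPOLLIN) {
                // detection done -> operation: read() or accept() on fd
            } else if (events[i].events & EPOLLOUT) {
                // kernel send buffer has room -> operation: write()/send()
            }
        }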

5. Introducing Reactor

The idea is to turn IO operations into event handling.

What is reactor?

A server has many IOs. We hand them over to epoll (the kernel) to manage; once an event (readable or writable) occurs on an IO, the callback function registered for that event is triggered to handle the business logic. Each fd has a corresponding structure that stores the necessary information (callback functions and independent read/write buffers).
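
A minimal sketch of such a per-fd structure, with illustrative field and callback names (the article later refers to it as sock_item):

        #define BUF_LEN 4096

        typedef int (*event_cb)(int fd);   // callback to run when the event fires

        // Sketch of the per-fd structure described above; field and callback
        // names are illustrative (the article later calls it sock_item).
        struct sock_item {
            int      fd;
            event_cb recv_cb;              // invoked on a readable (EPOLLIN) event
            event_cb send_cb;              // invoked on a writable (EPOLLOUT) event
            char     rbuf[BUF_LEN];        // this connection's private read buffer
            char     wbuf[BUF_LEN];        // this connection's private write buffer
            int      rlen, wlen;           // bytes currently held in each buffer
        };

        // The event loop only learns "fd X is readable/writable" from epoll;
        // the reactor looks up the sock_item for that fd and calls recv_cb/send_cb.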

Why is there a reactor?

What does reactor add on top of epoll, and what are the benefits?

  • epoll: manages IO
  • reactor: manages events; different events correspond to different callback functions
  • Thanks to the sock_item encapsulation, unprocessed data is kept in an independent per-connection buffer (independent of other fds/clients)
    • The benefit?
      • For example, in an http server, recv may only receive 1024 bytes at a time while a GET request is 2000 characters long; each chunk received goes into that fd's own buffer and is not affected by input from other fds

Why is reactor paired with non-blocking IO?

Since IO multiplexing already tells us when an IO is ready, why do we still need non-blocking IO? Consider three situations:

  • multi-threaded environment

    • Multiple threads use the same epoll to monitor the same fd

      When IO arrives on this fd, the ready-queue node notifies every thread's epoll that a new connection has been established here.

      If blocking IO is used, once one thread's accept takes the connection out, the other woken threads' accept calls will block there (with non-blocking IO they simply get EWOULDBLOCK and go back to epoll_wait).

      **Thundering herd:** think of a playboy with many girlfriends, each of whom he calls "wife". One day he runs into all of them at once and shouts "wife!", and every one of them answers.

  • edge triggered case

    • Under edge triggering, when an IO event occurs on a client fd, epoll_wait takes the corresponding node out of rdlist; the fd is not automatically put back into rdlist even if the read buffer still holds data (under level triggering, by contrast, the fd is automatically kept in rdlist as long as the read buffer still has data, so epoll_wait triggers again)

      Therefore, under edge triggering the read buffer must be drained in one go, otherwise leftover data lingers (similar to TCP packet sticking). If the fd were blocking, read would not return once the buffer had been emptied but would block instead, so the fd must be set to non-blocking in order to tell that everything has been read (see the drain-loop sketch after this list)

  • select bug

    • When a new data segment arrives in a socket receive buffer, select reports the socket descriptor as readable; but the protocol stack then discards the segment after finding that its checksum is wrong. At that point, calling read finds no data to read, and if the socket is not set non-blocking, read blocks the current thread.
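
The drain loop referenced in the edge-triggered case above, as a sketch (fd and buf are placeholders; fd must be non-blocking for the loop to terminate):

        #include <errno.h>
        #include <unistd.h>

        // Sketch: fd is a non-blocking connection registered with EPOLLET and
        // buf is a user buffer (both placeholders). Read until the kernel
        // buffer is empty; with a blocking fd the final read would hang
        // instead of returning EWOULDBLOCK/EAGAIN.
        for (;;) {
            ssize_t n = read(fd, buf, sizeof(buf));
            if (n > 0) {
                // process n bytes
            } else if (n == 0) {
                close(fd);                 // peer closed the connection
                break;
            } else if (errno == EWOULDBLOCK || errno == EAGAIN) {
                break;                     // kernel buffer drained; wait for the next edge
            } else if (errno == EINTR) {
                continue;                  // interrupted by a signal; retry
            } else {
                break;                     // a real error
            }
        }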

6. Applications of reactor in middleware

1. redis

Using a single reactor

  • specific environment

    • A key-value, data-structure, in-memory database

      You look up a value by its key; the value can be one of a variety of data structures, and all of these data operations are performed in memory

    • Command processing is single-threaded

  • Why redis uses a single reactor

    • A single reactor is enough, because the core business logic is single-threaded
    • Operating on the data structures is simple, and the time complexity of the commands is relatively low
  • How redis uses the reactor

    General flow:

    • Create listenfd
    • bind, listen
    • Register listenfd for read events
    • The read event triggers the accept callback
    • clientfd registers for read events
    • The read event triggers the read callback
  • What optimizations redis has made to the reactor

    • It can enable multi-threading: reading data and decoding the protocol are bound together and handled in worker threads, encoding and writing data are handled in other worker threads, while the business logic in the middle stays in the main thread

2. memcache

Multiple reactors

  • What optimizations it has made to the reactor

In the main thread there is a reactor dedicated to accepting connections; each worker thread runs its own sub-reactor, and the threads communicate with one another through pipes

3. nginx

Multiple reactors

Multi-process: each worker process has its own epfd, all listening on the same listenfd; the thundering herd is handled in user space, which also allows custom load balancing
