Exploring the Innovation of the Reactor Network Model in Today's Application Fields

This article is shared from the Huawei Cloud Community post "Controlling the Future of Network Technology: Exploring the Innovation of the Reactor Network Model in Today's Application Fields", by Lion Long.

This article introduces the Reactor network model in Linux network design and its importance in practical applications. The Reactor model is a classic event-driven design pattern widely used to build high-performance, scalable network servers. We will explore the fundamentals and components of the Reactor model, and detail how it is implemented in Linux network programming.

1. Introduction to reactor network model programming

The Reactor model converts the detection of IO into the processing of events; it is an asynchronous event mechanism. A reactor uses IO multiplexing to detect IO readiness, typically via select, poll, or epoll.

The general logic of a reactor:

(1) socket() creates the listening socket, listenfd;

(2) bind() and listen() configure listenfd: bind the address and start listening;

(3) listenfd registers a read event and is handed to epoll for management;

(4) when the read event on listenfd triggers, the accept callback is invoked;

(5) the accepted client connection yields a clientfd, whose IO produces read events;

(6) each event triggers its corresponding callback function.

1.1, Establishing a connection

Receive client connections.

//...
int epfd = epoll_create(1);  // create the epoll object
//...
int listenfd = socket(AF_INET, SOCK_STREAM, 0);  // create the listening socket
//...
struct epoll_event ev;
ev.events = EPOLLIN;
epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &ev);  // register the read event
//...
// when the read event of listenfd triggers, call accept to take the connection
struct sockaddr_in clientaddr;
socklen_t len = sizeof(clientaddr);
int clientfd = accept(listenfd, (struct sockaddr *)&clientaddr, &len);
struct epoll_event cev;  // a fresh event struct for the new connection
cev.events = EPOLLIN;
epoll_ctl(epfd, EPOLL_CTL_ADD, clientfd, &cev);  // register the new connection's read event
//...

Connect to third-party services.

//...
int epfd = epoll_create(1);  // create the epoll object
//...
int fd = socket(AF_INET, SOCK_STREAM, 0);  // create the socket
//...
struct sockaddr_in serveraddr;  // address of the third-party service
socklen_t len = sizeof(serveraddr);
// on a non-blocking fd, connect() returns -1 with errno == EINPROGRESS
connect(fd, (struct sockaddr *)&serveraddr, len);
//...
struct epoll_event ev;
ev.events = EPOLLOUT;
epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);  // register the write event
//...
// when the write event of fd triggers, the connection has been established
if (status == e_connecting && (ev.events & EPOLLOUT))
{
    status = e_connected;
    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
}
//...

1.2, Disconnect

//...
if (ev.events & EPOLLRDHUP)
{
    // the peer closed its write end: close our read side
    close_read(fd);
}
if (ev.events & EPOLLHUP)
{
    // both directions are down: close the fd entirely
    close(fd);
}
//...

1.3, Data arrival

// ...
if (ev.events & EPOLLIN)
{
    while (1)
    {
        int n = recv(clientfd, buffer, buffer_size, 0);
        if (n < 0)
        {
            if (errno == EINTR)  // interrupted by a signal: retry
                continue;
            if (errno == EWOULDBLOCK)  // receive buffer drained: stop
                break;
            close(clientfd);  // real error
            break;
        }
        else if (n == 0)
        {
            // the peer closed the connection: shut down our read side
            close_read(clientfd);
            break;
        }
        else
        {
            // handle business logic on the n bytes received
        }
    }
    //...
}
// ...

1.4, Data sending

// ...
if (ev.events & EPOLLOUT)
{
    while (1)
    {
        int n = send(clientfd, buffer, buffer_size, 0);
        if (n < 0)
        {
            if (errno == EINTR)  // interrupted by a signal: retry
                continue;
            if (errno == EWOULDBLOCK)
            {
                // kernel send buffer is full: wait for the next write event
                struct epoll_event e;
                e.events = EPOLLOUT;
                epoll_ctl(epfd, EPOLL_CTL_ADD, clientfd, &e);  // register the write event
                return;
            }
            close(clientfd);  // real error
            return;
        }
        if (n == buffer_size)
        {
            // everything was sent: stop watching for writability
            epoll_ctl(epfd, EPOLL_CTL_DEL, clientfd, NULL);
            // or: epoll_ctl(epfd, EPOLL_CTL_MOD, clientfd, &e);
            break;
        }
        break;  // partial send: adjust the buffer and wait for EPOLLOUT (omitted)
    }
}
//...

1.5, Common questions about the reactor

1. The epoll "thundering herd"

What is the "thundering herd"? Network programs often use a multi-threaded or multi-process model in which each thread or process owns its own epoll object, while the single listenfd produced by socket(), bind(), and listen() may be managed by several of those epoll objects. When a connection arrives, every epoll is notified and every process or thread wakes up to respond, but in the end only one accept() succeeds. This is the "thundering herd" problem.

2. Level-triggered and edge-triggered

Level-triggered (LT): as long as unread data remains in the receive buffer, epoll keeps reporting the fd as readable, until the data has been drained.

Edge-triggered (ET): an event is reported only once, when it arrives; reads and writes generally need a loop to drain the buffer completely.

3. Why must a reactor be used with non-blocking IO?

There are three main reasons:

(1) In a multi-threaded or multi-process setup, one listenfd may be managed by several epoll (IO multiplexer) objects. When a connection arrives, every epoll is notified and responds, but only one accept() succeeds; with a blocking listenfd, the losers would block inside accept() indefinitely. Non-blocking IO lets them return immediately.

(2) Under edge-triggering, an event fires only once, so the handler must loop until the buffer is empty; with a blocking fd, the final read after the buffer is drained would block forever instead of returning.

(3) A known select() race: select reports a socket readable when a new segment arrives in its receive buffer, but the protocol stack may then detect a bad checksum and discard the segment. The subsequent recv/read finds no data, and if the socket is not non-blocking, that recv/read blocks the current thread.

4. Must IO multiplexing be combined with non-blocking IO?

No; blocking mode can also work. For example, MySQL uses select to accept connections and then dedicates one thread to each connection. Alternatively, you can first query how many bytes are waiting in the receive buffer and then read exactly that much in a single call, though this is relatively inefficient:

int n = EVBUFFER_MAX_READ_DEFAULT;
ioctl(fd, FIONREAD, &n);  // get the number of bytes waiting in the receive buffer

2. reactor application scenarios

There are scenarios that call for a single reactor and scenarios that call for multiple reactors; the multi-reactor case further splits into multi-threaded and multi-process variants.

2.1, Redis – using a single reactor

Redis is an in-memory, key-value network database component with rich data structures. Its command processing is single-threaded.

2.1.1 Why does redis use a single reactor?

To understand why Redis uses only a single reactor, note that Redis command processing is single-threaded.

Redis provides rich data structures, and locking them correctly would be very complicated, so Redis processes commands in a single thread. Because the core business logic runs on one thread, adding more reactors would not let commands be processed any faster; therefore Redis uses a single reactor.

In addition, the time complexity of most Redis commands is low, so multiple reactors are not needed.

2.1.2, redis processing reactor block diagram

2.1.3, redis optimization of reactor

The business logic was optimized by introducing IO threads:

After data is received, it is handed to an IO thread for decoding; before data is sent, the packaged data is handed to an IO thread for encoding and then written out. In terms of the block diagram above, (read + decode) and (encode + write) are moved into IO threads.

reason:

With a single thread, when the volume of data received or sent becomes too large, the thread becomes overloaded, so multiple threads are needed for IO data processing. Protocol decoding in particular involves large amounts of data and is time-consuming, and warrants a dedicated IO thread.

Scenario example:

The client uploads log records; the client obtains leaderboard records.

2.1.4, look at redis source code from the perspective of reactor

Create an epoll object:

Create a socket and bind the listener:

put listenfd into epoll management:

Listen for events:

Handle events:

Register read events for clientfd:

2.2, Memcached – using multiple reactors in multi-threaded mode

Memcached is an in-memory, key-value network database component. Its command processing is multi-threaded.

Memcached depends on libevent, an event-driven library; memcached builds its networking on top of libevent.

2.2.1 Why does memcached use multiple reactors?

Unlike Redis, memcached does not support rich data structures: the data structure behind its values is relatively simple, so locking is relatively easy. Therefore, multi-threading can be introduced to improve efficiency.

2.2.2 How does memcached handle reactor?

The memcached main thread runs one reactor that is mainly responsible for accepting connections. After a connection is accepted, a worker thread is chosen by load balancing and notified through a pipe; the client's fd is then managed by that thread's reactor, and each thread handles the business logic for its own connections.

2.2.3, look at memcached source code from the perspective of reactor

Download the latest memcached from github.

Start source code analysis:

Create a socket and bind the listener:

Register listenfd read event:

Assign clientfd to a specific thread and add a read event:

2.3, Nginx – using multiple reactors in multi-process mode

Nginx can act as a reverse proxy and uses multiple processes to handle its work.

The master process creates listenfd, binds, and listens; it then forks multiple worker processes, each with its own epoll object, so listenfd is managed by multiple epoll objects. This creates a thundering-herd problem that must be dealt with, and events are distributed via load balancing.

2.3.1, Solving the "thundering herd" problem

The locking approach: nginx sets up a region of shared memory and places a lock in it; the worker processes compete for the lock, and only the process that wins it may accept connections.

2.3.2, Load balancing

Each process defines a maximum number of connections. When a process's connection count exceeds 7/8 of that maximum, the process temporarily stops accepting new connections, leaving them to other processes.

This prevents one process from accumulating too many connections while others have too few, keeping the per-process connection counts roughly balanced.

When every process has reached 7/8 of its maximum, nginx as a whole becomes very slow to accept new connections.

Summary

In this article, we explored the Linux Reactor network model in depth and highlighted its importance and advantages in practice. The Reactor model is an efficient network design pattern that excels at handling concurrent connections, allowing us to build high-performance, scalable network applications.

First, we covered the basic principles of the Reactor model: it uses an event-driven approach in which a main loop monitors input events, and once an event occurs the corresponding handler is invoked. This non-blocking design lets a server handle large numbers of concurrent connections efficiently without creating a thread for each connection.

Then we discussed the practical application of the Reactor model in Linux network design, analyzing the event-handling and callback mechanisms to help readers optimize the design of their network applications.

Compared with traditional multi-thread or multi-process models, the Reactor model makes better use of system resources and reduces the overhead of context switching and thread creation, thus improving an application's concurrent processing capability.

This article aims to give readers a thorough understanding of the Linux Reactor network model and to encourage them to use this model in their own applications to build higher-performance, more reliable network software. Mastering the Reactor model will help readers navigate the future of network technology with confidence and meet its ever-changing challenges.
