Linux network programming – IO multiplexing

IO multiplexing

IO multiplexing allows a single thread or process to monitor and manage many IO descriptors at once. It is particularly suited to programs that handle a large number of concurrent socket connections, such as web servers, database servers, and other network applications. Instead of blocking on any single descriptor, the application waits on all of them together and processes data as soon as any descriptor becomes ready.

Core concepts

Blocking and non-blocking IO:

  • Blocking IO: When an application performs an IO operation, it must wait for the IO operation to complete before continuing to perform other tasks.
  • Non-blocking IO: The IO call returns immediately, so the application can go on to other tasks. If the operation cannot be completed right away, the call fails with an error (EAGAIN or EWOULDBLOCK on Linux) and the application retries later.
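
The difference between the two modes can be seen in a minimal sketch like the following. The helper names set_nonblocking and try_read, and the -2 return convention, are hypothetical and chosen only for illustration; fcntl(), O_NONBLOCK, and EAGAIN are the standard POSIX interfaces.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch fd to non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Try to read without waiting.
 * Returns the number of bytes read, 0 at end of file, -1 on a real error,
 * or -2 when no data is available yet (hypothetical convention). */
ssize_t try_read(int fd, void *buf, size_t len) {
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

On a blocking descriptor the same read() call would simply not return until data arrived.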

Synchronous and asynchronous IO:

  • Synchronous IO: After the application initiates an IO operation, it must wait or actively poll to know when the IO operation is completed.
  • Asynchronous IO: After the application initiates an IO operation, the system will notify the application when the IO operation is completed.

IO multiplexing technology

The core of IO multiplexing is to use a system call to monitor multiple file descriptors to see which file descriptors are ready for read or write operations. There are several main IO multiplexing techniques:

  1. select: The earliest IO multiplexing method. Its main limitation is that it can only monitor descriptors numbered below FD_SETSIZE (typically 1024).
  2. poll: Similar to select, but it takes an array of descriptors and has no fixed limit on their number.
  3. epoll: A Linux-specific method that provides better scalability, especially in the case of a large number of concurrent connections.

How it works

Consider a network application, such as a web server. In the simplest case, each time the server accepts a connection, it creates a new process or thread to handle it. However, this method will lead to a huge waste of resources in a high-concurrency environment.

The working principle of IO multiplexing is as follows:

  1. A main thread/process uses system calls such as select, poll or epoll to monitor multiple file descriptors simultaneously.
  2. The system call returns when one or more of the file descriptors is ready for read or write operations.
  3. The main thread/process then performs IO on the ready descriptors without blocking.

Advantages and limitations

Advantages:

  • A small number of threads can manage a large number of descriptors.
  • Fewer threads/processes means fewer context switches and therefore higher efficiency.
  • Scales to a very large number of connections, especially with epoll.

Limitations:

  • Programming using IO multiplexing technology is usually complex.
  • Not all operating systems support all IO multiplexing technologies. For example, epoll is only available on Linux.

Summary

IO multiplexing is a powerful technique for handling large numbers of concurrent network connections. Despite its high programming complexity, it is still the technology of choice for many network applications considering its performance and efficiency in high-concurrency environments.

select()

select() is a classic I/O multiplexing function used to monitor multiple file descriptors (usually socket descriptors) to see whether they are ready for reading or writing, or whether an exceptional condition needs to be handled. Its main application is in network programming, especially when the application needs to handle multiple concurrent connections or multiple I/O streams.

Function prototype

#include <sys/select.h>

int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

Parameter explanation

  • nfds: Used to specify the range of file descriptors to be checked. Specifically, the maximum file descriptor value to be checked plus 1.
  • readfds: A set of file descriptors that an application wants to know if they are ready for reading.
  • writefds: A set of file descriptors that an application wants to know if they are ready to be written to.
  • exceptfds: The set of file descriptors that the application wants checked for exceptional conditions (for example, out-of-band data on a socket).
  • timeout: The maximum time that select() waits. If it is NULL, select() blocks until a descriptor is ready; if both of its fields are zero, select() returns immediately.

File descriptor set

fd_set is a set data type specifically used for select(). Here are some macros related to it:

  • FD_ZERO(fd_set *set): Clear the file descriptor set.
  • FD_SET(int fd, fd_set *set): Add a file descriptor to the set.
  • FD_CLR(int fd, fd_set *set): Remove a file descriptor from the set.
  • FD_ISSET(int fd, fd_set *set): Check whether the file descriptor is in the set.
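
A small sketch of these macros in use follows. The helper build_fd_set is a hypothetical name; it fills a set from an array and returns the highest descriptor, which is the value select() needs (plus 1) as nfds.

```c
#include <sys/select.h>

/* Fill *set with the descriptors in fds[0..n-1].
 * Descriptors outside [0, FD_SETSIZE) are skipped, because storing them
 * in an fd_set is undefined. Returns the highest descriptor stored,
 * or -1 if the set ended up empty. */
int build_fd_set(const int *fds, int n, fd_set *set) {
    int max_fd = -1;

    FD_ZERO(set);                       /* always start from an empty set */
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            continue;
        FD_SET(fds[i], set);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }
    return max_fd;
}
```

After the call, FD_ISSET() tests membership and FD_CLR() removes a single descriptor again.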

Return value

  • A value greater than 0 is the number of ready file descriptors.
  • 0 indicates that the timeout expired before any file descriptor became ready.
  • -1 indicates an error; errno is set to describe it.

How it works

  1. The application sets readfds, writefds, and exceptfds to instruct select() which file descriptors to monitor.
  2. The application calls the select() function.
  3. The select() function blocks until one of the following conditions is met:
    • There is a file descriptor ready (read, write, or exception).
    • The timeout has expired.
  4. After select() returns, the application uses FD_ISSET() on readfds, writefds, and exceptfds to determine which file descriptors are ready and acts accordingly. Because select() overwrites the sets with its results, they must be rebuilt before the next call.
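
The four steps above can be sketched as one function. The name select_readable is hypothetical; the sketch monitors only for readability, uses no timeout, and assumes n is at least 1 and every descriptor is below FD_SETSIZE.

```c
#include <stddef.h>
#include <sys/select.h>

/* Block until at least one of fds[0..n-1] is readable.
 * Stores the ready descriptors in ready[] (which must have room for n
 * entries) and returns how many there are, or -1 on error. */
int select_readable(const int *fds, int n, int *ready) {
    fd_set readfds;
    int max_fd = -1;

    /* Step 1: describe what to monitor. */
    FD_ZERO(&readfds);
    for (int i = 0; i < n; i++) {
        FD_SET(fds[i], &readfds);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }

    /* Steps 2 and 3: call select() and block until something is ready. */
    if (select(max_fd + 1, &readfds, NULL, NULL, NULL) == -1)
        return -1;

    /* Step 4: readfds now holds only the ready descriptors. */
    int count = 0;
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readfds))
            ready[count++] = fds[i];
    return count;
}
```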

Advantages and disadvantages of using select()

Advantages:

  • Can handle multiple descriptors.
  • Can be used cross-platform (UNIX/Linux and Windows are supported).

Disadvantages:

  • On every call the whole descriptor set is copied into the kernel and scanned linearly, and the application must scan it again afterwards to find the ready descriptors. This is inefficient when the number of descriptors is large.
  • The size of fd_set is fixed at FD_SETSIZE (typically 1024), which limits the descriptors that select() can handle.
  • Readiness is reported for as long as the condition holds: if a descriptor is ready but the application does not process it, the next select() call returns immediately and reports it again, which can cause repeated wake-ups that do no useful work.

Despite these drawbacks, select() is still widely used, especially in older code. Modern systems often prefer other multiplexing mechanisms, such as poll(), epoll (Linux), or kqueue (BSD).

Example

This example uses select() to implement a Hello server. When a client connects and sends data, the server responds with a simple “Hello, World!” HTTP response regardless of what request is sent.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <arpa/inet.h>
#include <sys/select.h>

#define PORT 8080
#define BUFFER_SIZE 2048
#define MAX_CLIENTS 5

const char *HTTP_RESPONSE = "HTTP/1.1 200 OK\r\n"
                            "Content-Type: text/plain\r\n"
                            "Content-Length: 13\r\n"
                            "Connection: close\r\n\r\n"
                            "Hello, World!";

int main(void) {
    int server_socket, client_socket, max_sd, sd, activity;
    int client_sockets[MAX_CLIENTS] = {0};  /* 0 marks a free slot */
    struct sockaddr_in server_address, client_address;
    socklen_t client_len;
    char buffer[BUFFER_SIZE];
    fd_set read_fds;

    server_socket = socket(AF_INET, SOCK_STREAM, 0);
    if (server_socket == -1) {
        perror("Could not create socket");
        exit(1);
    }

    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET;
    server_address.sin_addr.s_addr = INADDR_ANY;
    server_address.sin_port = htons(PORT);

    if (bind(server_socket, (struct sockaddr *)&server_address, sizeof(server_address)) == -1) {
        perror("Bind failed");
        exit(1);
    }

    if (listen(server_socket, 3) == -1) {
        perror("Listen failed");
        exit(1);
    }

    printf("Waiting for connections on port %d...\n", PORT);

    while (1) {
        /* select() overwrites the set, so rebuild it on every iteration */
        FD_ZERO(&read_fds);
        FD_SET(server_socket, &read_fds);
        max_sd = server_socket;

        for (int i = 0; i < MAX_CLIENTS; i++) {
            sd = client_sockets[i];
            if (sd > 0)
                FD_SET(sd, &read_fds);
            if (sd > max_sd)
                max_sd = sd;
        }

        activity = select(max_sd + 1, &read_fds, NULL, NULL, NULL);

        if (activity < 0) {
            if (errno != EINTR)
                perror("Select error");
            continue;  /* the set holds no results after an error */
        }

        /* Activity on the listening socket means a new connection */
        if (FD_ISSET(server_socket, &read_fds)) {
            client_len = sizeof(client_address);
            client_socket = accept(server_socket, (struct sockaddr *)&client_address, &client_len);
            if (client_socket < 0) {
                perror("Accept error");
                exit(1);
            }

            printf("New connection from %s:%d\n",
                   inet_ntoa(client_address.sin_addr), ntohs(client_address.sin_port));

            int i;
            for (i = 0; i < MAX_CLIENTS; i++) {
                if (client_sockets[i] == 0) {
                    client_sockets[i] = client_socket;
                    break;
                }
            }
            if (i == MAX_CLIENTS)
                close(client_socket);  /* no free slot: refuse the connection */
        }

        /* Activity on a client socket means data or a disconnect */
        for (int i = 0; i < MAX_CLIENTS; i++) {
            sd = client_sockets[i];
            if (sd > 0 && FD_ISSET(sd, &read_fds)) {
                ssize_t read_size = recv(sd, buffer, sizeof(buffer), 0);
                if (read_size <= 0) {
                    if (read_size < 0)
                        perror("recv");
                    client_len = sizeof(client_address);
                    if (getpeername(sd, (struct sockaddr *)&client_address, &client_len) == 0)
                        printf("Client disconnected: %s:%d\n",
                               inet_ntoa(client_address.sin_addr), ntohs(client_address.sin_port));

                    close(sd);
                    client_sockets[i] = 0;
                } else {
                    send(sd, HTTP_RESPONSE, strlen(HTTP_RESPONSE), 0);
                    /* To echo the request instead: send(sd, buffer, read_size, 0); */
                }
            }
        }
    }

    close(server_socket);
    return 0;
}

This example creates a server that uses select() to monitor connection requests and client data. When a new client connects to the server, it adds the client’s socket to the client sockets array. When the client sends data, the server returns a “Hello, World!” HTTP response. When a client disconnects, it removes that client’s socket from the array.

Open a new terminal and use curl to send an HTTP request. You will see the HTTP response returned by the server:

$ curl http://localhost:8080
Hello, World!

poll()

The poll() function is another I/O multiplexing tool that monitors multiple file descriptors to see whether they are ready for reading or writing, or whether an exceptional condition is pending. Compared to select(), poll() scales better because it is not restricted to a fixed-size descriptor set.

Function prototype

#include <poll.h>

int poll(struct pollfd *fds, nfds_t nfds, int timeout);

Parameter explanation

  • fds: A pointer to an array of pollfd structures that describe the file descriptors to be monitored.
  • nfds: The number of items in the fds array.
  • timeout: The maximum wait time in milliseconds. If it is -1, poll() waits indefinitely; if it is 0, poll() returns immediately.
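
To make the three parameters concrete, here is a minimal sketch that waits for one descriptor. The helper name poll_readable is hypothetical; the array passed to poll() has a single element, so nfds is 1.

```c
#include <poll.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int poll_readable(int fd, int timeout_ms) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    int ready = poll(&pfd, 1, timeout_ms);  /* one entry, so nfds == 1 */
    if (ready <= 0)
        return ready;
    return (pfd.revents & POLLIN) ? 1 : 0;
}
```

Unlike select(), the timeout is a plain integer and the descriptor value itself is not limited by FD_SETSIZE.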

pollfd structure

This structure is defined in the <poll.h> header file and contains the following fields:

struct pollfd {
    int fd;        /* file descriptor */
    short events;  /* events to monitor */
    short revents; /* events that actually occurred */
};
  • fd: The file descriptor to monitor. A negative value makes poll() ignore the entry.
  • events: Bitmask of events to monitor, set by the application. It can be a combination of values such as:
    • POLLIN: Data is available to read.
    • POLLOUT: Data can be written.
  • revents: Output field. When poll() returns, the kernel sets it to the events that actually occurred. Besides the requested events, it can contain the following values, which are always reported and are ignored if set in events:
    • POLLERR: Error condition.
    • POLLHUP: Hang-up (the peer closed its end).
    • POLLNVAL: The descriptor is not an open file.
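
One possible way to interpret revents after poll() returns is sketched below. The enum fd_state and the function classify_revents are hypothetical names used only for this illustration; the ordering of the checks is a design choice, not something poll() requires.

```c
#include <poll.h>

enum fd_state { FD_IDLE, FD_READABLE, FD_CLOSED, FD_BROKEN };

/* Classify the revents value that poll() filled in for one descriptor. */
enum fd_state classify_revents(short revents) {
    if (revents & (POLLERR | POLLNVAL))
        return FD_BROKEN;     /* error, or the descriptor is not open */
    if (revents & POLLIN)
        return FD_READABLE;   /* read pending data before handling a hang-up */
    if (revents & POLLHUP)
        return FD_CLOSED;     /* peer hung up and nothing is left to read */
    return FD_IDLE;
}
```

Checking POLLIN before POLLHUP lets the application drain any data that arrived before the peer closed the connection.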

Return value

  • If one or more file descriptors are ready, returns the number of ready file descriptors.
  • If timeout occurs, 0 is returned.
  • If an error occurs, -1 is returned.

How it works

  1. The application initializes the pollfd structure array and sets the file descriptors and events to be monitored.
  2. The application calls the poll() function.
  3. The poll() function blocks until one of the following conditions is met:
    • One or more file descriptors are ready.
    • The timeout has expired.
  4. After poll() returns, the application checks the revents field of each pollfd structure to determine which file descriptors are ready and acts accordingly.

Advantages and disadvantages of using poll()

Advantages:

  • In contrast to select(), poll() is not restricted by a fixed-size set of file descriptors.
  • poll() provides a more intuitive interface to explicitly specify the required events for each file descriptor.

Disadvantages:

  • Although poll() can handle any number of file descriptors, the kernel and the application must still traverse the entire descriptor list on every call, which becomes inefficient when the list is large.
  • On some systems, poll() may not perform as well as more advanced multiplexing mechanisms (such as Linux’s epoll).

Overall, poll() provides a more flexible way to monitor file descriptors than select(), but it still becomes less efficient as the number of connections grows. For large numbers of connections, consider a more advanced multiplexing mechanism.

Example

The following is a simple example of using poll(). It is also a Hello server.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <poll.h>

#define PORT 8080
#define BUFFER_SIZE 2048
#define MAX_CLIENTS 5

const char *HTTP_RESPONSE = "HTTP/1.1 200 OK\r\n"
                            "Content-Type: text/plain\r\n"
                            "Content-Length: 13\r\n"
                            "Connection: close\r\n\r\n"
                            "Hello, World!";

int main(void) {
    int server_socket, client_socket;
    struct sockaddr_in server_address, client_address;
    socklen_t client_len;
    char buffer[BUFFER_SIZE];

    /* Slot 0 is the listening socket; slots 1..MAX_CLIENTS are clients */
    struct pollfd fds[MAX_CLIENTS + 1];

    server_socket = socket(AF_INET, SOCK_STREAM, 0);
    if (server_socket == -1) {
        perror("Could not create socket");
        exit(1);
    }

    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET;
    server_address.sin_addr.s_addr = INADDR_ANY;
    server_address.sin_port = htons(PORT);

    if (bind(server_socket, (struct sockaddr *)&server_address, sizeof(server_address)) == -1) {
        perror("Bind failed");
        exit(1);
    }

    if (listen(server_socket, 3) == -1) {
        perror("Listen failed");
        exit(1);
    }

    printf("Waiting for connections on port %d...\n", PORT);

    fds[0].fd = server_socket;
    fds[0].events = POLLIN;

    for (int i = 1; i <= MAX_CLIENTS; i++) {
        fds[i].fd = -1;  /* poll() ignores entries with a negative fd */
    }

    while (1) {
        int activity = poll(fds, MAX_CLIENTS + 1, -1);  /* infinite timeout */

        if (activity < 0) {
            perror("Poll error");
            continue;
        }

        if (fds[0].revents & POLLIN) {
            client_len = sizeof(client_address);
            client_socket = accept(server_socket, (struct sockaddr *)&client_address, &client_len);

            if (client_socket < 0) {
                perror("Accept error");
                continue;
            }

            printf("New connection from %s:%d\n",
                   inet_ntoa(client_address.sin_addr), ntohs(client_address.sin_port));

            int i;
            for (i = 1; i <= MAX_CLIENTS; i++) {
                if (fds[i].fd == -1) {
                    fds[i].fd = client_socket;
                    fds[i].events = POLLIN;
                    break;
                }
            }
            if (i > MAX_CLIENTS)
                close(client_socket);  /* no free slot: refuse the connection */
        }

        for (int i = 1; i <= MAX_CLIENTS; i++) {
            if (fds[i].fd == -1) continue;

            /* POLLHUP and POLLERR are reported even though only POLLIN was requested */
            if (fds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
                ssize_t read_size = recv(fds[i].fd, buffer, sizeof(buffer), 0);
                if (read_size <= 0) {
                    if (read_size < 0)
                        perror("recv");
                    client_len = sizeof(client_address);
                    if (getpeername(fds[i].fd, (struct sockaddr *)&client_address, &client_len) == 0)
                        printf("Client disconnected: %s:%d\n",
                               inet_ntoa(client_address.sin_addr), ntohs(client_address.sin_port));

                    close(fds[i].fd);
                    fds[i].fd = -1;  /* mark the slot as free again */
                } else {
                    send(fds[i].fd, HTTP_RESPONSE, strlen(HTTP_RESPONSE), 0);
                    /* To echo the request instead: send(fds[i].fd, buffer, read_size, 0); */
                }
            }
        }
    }

    close(server_socket);
    return 0;
}

This code creates a server that uses poll() to monitor connection requests and data from clients. When a client connects to the server, it adds the client’s socket to the array monitored by poll(). When the client sends data, the server returns a “Hello, World!” HTTP response. When the client disconnects, it removes the socket from the monitoring array.

Open a new terminal and use curl to send an HTTP request. You will see the HTTP response returned by the server:

$ curl http://localhost:8080
Hello, World!

epoll()

epoll is a Linux-specific I/O multiplexing mechanism that provides a more efficient way to monitor the activity of multiple file descriptors. Unlike the traditional select() and poll(), epoll uses an event-driven approach and returns only the file descriptors that are actually active, instead of requiring the status of every file descriptor to be checked. This makes epoll very efficient when handling large numbers of file descriptors.

Basic concepts and functions

  1. epoll_create(): Create a new epoll instance.
int epoll_create(int size);

Although this function takes a size parameter, Linux has ignored it since kernel 2.6.8; it is kept only for backward compatibility and must still be greater than zero.

  2. epoll_ctl(): Add, modify, or delete a monitored file descriptor in the epoll instance.
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
  • epfd: The file descriptor of the epoll instance returned by epoll_create().
  • op: Operation type, which is one of EPOLL_CTL_ADD (add), EPOLL_CTL_MOD (modify), or EPOLL_CTL_DEL (delete).
  • fd: The file descriptor to be operated on.
  • event: Pointer to an epoll_event structure describing the events of interest on fd and the user data that epoll_wait() returns with them.

  3. epoll_wait(): Wait for one or more file descriptors in the epoll instance to become active.
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
  • epfd: The file descriptor of the epoll instance returned by epoll_create().
  • events: Array of epoll_event structures used to return active events.
  • maxevents: The size of the events array.
  • timeout: Timeout in milliseconds. -1 means wait indefinitely.
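
The three calls fit together as in the following sketch, which waits for one descriptor. The helper name epoll_readable is hypothetical, and it creates and destroys an epoll instance on every call purely to keep the example self-contained; a real server would create the instance once and reuse it. It uses epoll_create1(0), the variant without the size parameter.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int epoll_readable(int fd, int timeout_ms) {
    int epfd = epoll_create1(0);            /* create the epoll instance */
    if (epfd == -1)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event out;
    int result = -1;

    /* Register fd, then wait for at most one event. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0) {
        int n = epoll_wait(epfd, &out, 1, timeout_ms);
        if (n >= 0)
            result = (n == 1 && (out.events & EPOLLIN)) ? 1 : 0;
    }

    close(epfd);                            /* releases the instance */
    return result;
}
```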

epoll_event structure

struct epoll_event {
    uint32_t events;   /* Epoll events */
    epoll_data_t data; /* User data variable */
};
  • events: A bit set indicating the events of interest and the returned events, for example EPOLLIN, EPOLLOUT, EPOLLERR, etc.
  • data: A union that can hold user-defined data, such as a file descriptor or a pointer. Because it is a union, only one of its members can be used at a time.
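
The data field is what lets a program attach its own state to each descriptor. The sketch below stores a pointer in data.ptr; struct conn and conn_register are hypothetical names introduced only for this illustration.

```c
#include <stdlib.h>
#include <sys/epoll.h>

/* Hypothetical per-connection state kept by the application. */
struct conn {
    int fd;
    size_t bytes_received;
};

/* Register fd with the epoll instance epfd and attach a newly allocated
 * struct conn through data.ptr. Returns the connection, or NULL on failure.
 * The caller owns the returned memory and frees it when the connection ends. */
struct conn *conn_register(int epfd, int fd) {
    struct conn *c = calloc(1, sizeof *c);
    if (c == NULL)
        return NULL;
    c->fd = fd;

    /* data is a union: use data.ptr here instead of data.fd, not both. */
    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = c };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        free(c);
        return NULL;
    }
    return c;
}
```

When epoll_wait() later reports an event for this descriptor, the same pointer comes back in the event's data.ptr, so no lookup table from descriptor to state is needed.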

How it works

  1. Create an epoll instance.
  2. Use epoll_ctl() to add file descriptors and their events of interest to the instance, or to modify existing registrations.
  3. Use epoll_wait() to wait for an event to occur.
  4. When epoll_wait() returns, handle active events.
  5. Repeat steps 3 and 4.

Advantages

  1. Scalability: Compared to select and poll, epoll handles a large number of concurrent connections much better.
  2. Efficiency: epoll returns only the active file descriptors, so the application does not have to check every file descriptor on each call.
  3. No fixed limit: Unlike select, which is bound by FD_SETSIZE, epoll is usually limited only by the system’s maximum number of file descriptors.

Disadvantages

  1. Linux-specific: epoll is Linux-specific and is not portable to other UNIX systems or Windows.

In general, epoll is an ideal choice for high-concurrency server applications on Linux. It removes the performance bottleneck that select and poll run into when monitoring a large number of connections.

Example

The following is a simple example of using epoll. It is also a Hello server.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <arpa/inet.h>
#include <sys/epoll.h>

#define PORT 8080
#define BUFFER_SIZE 2048
#define MAX_EVENTS 10

const char *HTTP_RESPONSE = "HTTP/1.1 200 OK\r\n"
                            "Content-Type: text/plain\r\n"
                            "Content-Length: 13\r\n"
                            "Connection: close\r\n\r\n"
                            "Hello, World!";

int main(void) {
    int server_socket, client_socket;
    struct sockaddr_in server_address, client_address;
    socklen_t client_len;
    char buffer[BUFFER_SIZE];

    int epoll_fd = epoll_create1(0);
    if (epoll_fd == -1) {
        perror("epoll_create1");
        exit(EXIT_FAILURE);
    }

    server_socket = socket(AF_INET, SOCK_STREAM, 0);
    if (server_socket == -1) {
        perror("Could not create socket");
        exit(1);
    }

    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET;
    server_address.sin_addr.s_addr = INADDR_ANY;
    server_address.sin_port = htons(PORT);

    if (bind(server_socket, (struct sockaddr *)&server_address, sizeof(server_address)) == -1) {
        perror("Bind failed");
        exit(1);
    }

    if (listen(server_socket, 10) == -1) {
        perror("Listen failed");
        exit(1);
    }

    printf("Waiting for connections on port %d...\n", PORT);

    /* Register the listening socket with the epoll instance */
    struct epoll_event ev, events[MAX_EVENTS];
    ev.events = EPOLLIN;
    ev.data.fd = server_socket;
    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, server_socket, &ev) == -1) {
        perror("epoll_ctl: server_socket");
        exit(EXIT_FAILURE);
    }

    while (1) {
        int nfds = epoll_wait(epoll_fd, events, MAX_EVENTS, -1);
        if (nfds == -1) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal: retry */
            perror("epoll_wait");
            exit(EXIT_FAILURE);
        }

        /* Only the descriptors that are actually ready are returned */
        for (int n = 0; n < nfds; ++n) {
            if (events[n].data.fd == server_socket) {
                client_len = sizeof(client_address);  /* must be set before accept() */
                client_socket = accept(server_socket, (struct sockaddr *)&client_address, &client_len);
                if (client_socket == -1) {
                    perror("accept");
                    continue;
                }
                printf("New connection from %s:%d\n",
                       inet_ntoa(client_address.sin_addr), ntohs(client_address.sin_port));

                ev.events = EPOLLIN;
                ev.data.fd = client_socket;
                if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, client_socket, &ev) == -1) {
                    perror("epoll_ctl: client_socket");
                    exit(EXIT_FAILURE);
                }
            } else {
                ssize_t read_size = recv(events[n].data.fd, buffer, sizeof(buffer), 0);
                if (read_size <= 0) {
                    if (read_size == 0) {  /* client disconnected */
                        printf("Client disconnected\n");
                    } else {
                        perror("recv");
                    }
                    /* Closing the socket also removes it from the epoll set */
                    close(events[n].data.fd);
                } else {
                    send(events[n].data.fd, HTTP_RESPONSE, strlen(HTTP_RESPONSE), 0);
                    /* To echo the request instead: send(events[n].data.fd, buffer, read_size, 0); */
                }
            }
        }
    }

    close(server_socket);
    return 0;
}

This code creates a server that uses epoll to listen for connection requests and data from clients. When a client connects to the server, it adds the client’s socket to the epoll monitoring set. When the client sends data, the server returns a “Hello, World!” HTTP response. When the client disconnects, the server closes the socket, which also removes it from the epoll monitoring set.

Open a new terminal and use curl to send an HTTP request. You will see the HTTP response returned by the server:

$ curl http://localhost:8080
Hello, World!
