34 | What you must know about Linux networking (Part 2)

In the previous section, we covered the basic principles of Linux networking. To review briefly: Linux builds its network protocol stack on the TCP/IP model, which consists of four layers: the application layer, the transport layer, the network layer, and the network interface layer. These layers are also the core components of the Linux network stack.

When an application sends a data packet through the socket interface, the packet is processed layer by layer from top to bottom in the network protocol stack before it is handed to the network card for transmission. When a packet is received, it travels through the stack in the opposite direction, from bottom to top, and is finally delivered to the application.

After understanding the basic principles and sending and receiving processes of Linux networks, you must be eager to know how to observe the performance of the network. Specifically, what metrics can be used to measure Linux network performance?

Performance indicators

In fact, we usually measure network performance using indicators such as bandwidth, throughput, delay, and PPS (Packets Per Second).

Bandwidth represents the maximum transmission rate of the link, usually in b/s (bits per second).

Throughput represents the amount of data successfully transmitted per unit time, usually in b/s (bits/second) or B/s (bytes/second). Throughput is limited by bandwidth, and throughput/bandwidth is the utilization of that network.

Delay is the time from when a network request is sent until the remote response is received. This indicator can mean different things in different scenarios. For example, it can represent the time it takes to establish a connection (such as TCP handshake delay), or the time it takes for a data packet to make a round trip (such as RTT).

PPS is the abbreviation of Packets Per Second, which expresses the transmission rate in network packets. PPS is usually used to evaluate the forwarding capability of a network device. Hardware switches, for example, can usually achieve line-rate forwarding (that is, PPS reaches or comes close to the theoretical maximum), whereas forwarding on Linux servers is easily affected by the size of network packets.

In addition to these indicators, network availability (whether the network can communicate normally), the number of concurrent connections (the number of TCP connections), packet loss rate (the percentage of packets lost), and retransmission rate (the proportion of network packets retransmitted) are also commonly used performance indicators.
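To make the relationship between bandwidth, packet size, and PPS more concrete, here is a minimal sketch that computes the theoretical maximum PPS of an Ethernet link. The function name max_pps is my own, and the 20 bytes of per-frame wire overhead (preamble, start delimiter, and inter-frame gap) is an assumption that holds for standard Ethernet; treat the result as an upper bound, not as something a real server will necessarily reach.

```python
# Theoretical maximum packets per second on an Ethernet link.
# Assumption: every frame costs 20 extra bytes on the wire
# (7-byte preamble + 1-byte start delimiter + 12-byte inter-frame gap).
ETHERNET_WIRE_OVERHEAD_BYTES = 20


def max_pps(bandwidth_bps: float, frame_bytes: int) -> float:
    """Return the line-rate PPS for frames of the given size (FCS included)."""
    bits_on_wire = (frame_bytes + ETHERNET_WIRE_OVERHEAD_BYTES) * 8
    return bandwidth_bps / bits_on_wire


if __name__ == "__main__":
    gigabit = 1_000_000_000  # 1 Gb/s
    for size in (64, 512, 1518):
        print(f"{size:>5}-byte frames: {max_pps(gigabit, size):,.0f} packets/s")
```

On a 1 Gb/s link this gives roughly 1.49 million packets/s for 64-byte frames but only about 81 thousand packets/s for 1518-byte frames, which is why small packets are the hard case for software forwarding.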

Next, please open a terminal, log in to the server via SSH, and then explore and observe these performance indicators with me.

Network configuration

The first step in analyzing network problems is usually to review the configuration and status of network interfaces. You can use the ifconfig or ip command to view the network configuration. I personally recommend using the ip tool because it provides richer functions and an easier-to-use interface.

ifconfig and ip belong to the net-tools and iproute2 packages respectively; iproute2 is the successor to net-tools. Most distributions install them by default, but if you can't find the ifconfig or ip command, you can install these two packages.

Taking the network interface eth0 as an example, you can run the following two commands to view its configuration and status:

$ ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
      inet 10.240.0.30 netmask 255.240.0.0 broadcast 10.255.255.255
      inet6 fe80::20d:3aff:fe07:cf2a prefixlen 64 scopeid 0x20<link>
      ether 78:0d:3a:07:cf:3a txqueuelen 1000 (Ethernet)
      RX packets 40809142 bytes 9542369803 (9.5 GB)
      RX errors 0 dropped 0 overruns 0 frame 0
      TX packets 32637401 bytes 4815573306 (4.8 GB)
      TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
$ ip -s addr show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  link/ether 78:0d:3a:07:cf:3a brd ff:ff:ff:ff:ff:ff
  inet 10.240.0.30/12 brd 10.255.255.255 scope global eth0
      valid_lft forever preferred_lft forever
  inet6 fe80::20d:3aff:fe07:cf2a/64 scope link
      valid_lft forever preferred_lft forever
  RX: bytes packets errors dropped overrun mcast
   9542432350 40809397 0 0 0 193
  TX: bytes packets errors dropped carrier collsns
   4815625265 32637658 0 0 0 0

As you can see, the indicators output by the ifconfig and ip commands are basically the same; only the display format differs slightly. For example, both include the network interface status flags, MTU size, IP address, subnet, MAC address, and statistics on packets sent and received.

The meaning of each of these indicators is explained in detail in the tools' documentation. However, several indicators are closely related to network performance and deserve your special attention.

First, the status flag of the network interface. RUNNING in ifconfig output, or LOWER_UP in ip output, both indicate that the physical network is connected, that is, the network card is connected to the switch or router. If you can’t see them, it usually means the network cable is unplugged.

Second, the size of the MTU. The default MTU size is 1500. Depending on the network architecture (such as whether an overlay network such as VXLAN is used), you may need to increase or decrease the MTU value.

Third, the IP address, subnet, and MAC address of the network interface. These are required for the network to function properly, so you need to make sure they are configured correctly.

Fourth, the number of bytes, packets, errors, and dropped packets sent and received by the network. In particular, when the errors, dropped, overruns, carrier, and collisions counters in the TX and RX sections are not 0, it usually indicates a network I/O problem. Specifically:

errors indicates the number of error packets, such as parity errors, frame synchronization errors, etc.;

dropped indicates the number of dropped packets, that is, the packet has been received in the Ring Buffer, but was lost due to insufficient memory or other reasons;

overruns indicates the number of packets lost because the Ring Buffer overflowed, that is, packets arrived faster than the Ring Buffer could be drained (the queue was full), resulting in packet loss;

carrier indicates the number of packets in which carrier errors occurred, such as duplex mode mismatch, physical cable problems, etc.;

collisions represents the number of collision packets.
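If you want to check these counters from a script instead of reading them by eye, the sketch below parses text in the /proc/net/dev layout and lists every non-zero error counter. The function names are my own, the byte and packet counts are copied from the ip output above, and the dropped value of 7 is invented purely to show what a hit looks like. As far as I know, the fifo column here is what ifconfig reports as overruns.

```python
# Sketch: flag non-zero error counters for each interface.
# Assumption: the input follows the /proc/net/dev layout (two header lines,
# then "iface:" followed by 8 receive fields and 8 transmit fields).
RX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "colls", "carrier", "compressed")
SUSPECT = ("errs", "drop", "fifo", "frame", "colls", "carrier")


def parse_net_dev(text: str) -> dict:
    """Map interface name -> {"rx": {...}, "tx": {...}} of integer counters."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        numbers = [int(v) for v in values.split()]
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:16])),
        }
    return stats


def nonzero_problems(stats: dict) -> list:
    """Return (iface, direction, counter, value) for every non-zero error counter."""
    found = []
    for iface, directions in stats.items():
        for direction, counters in directions.items():
            for key in SUSPECT:
                if counters.get(key, 0) > 0:
                    found.append((iface, direction, key, counters[key]))
    return found


# Hypothetical sample: the rx drop count of 7 is made up for illustration.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth0: 9542432350 40809397 0 7 0 0 0 193 4815625265 32637658 0 0 0 0 0 0
"""

if __name__ == "__main__":
    for problem in nonzero_problems(parse_net_dev(SAMPLE)):
        print(problem)
```

On a real machine you could pass the contents of /proc/net/dev to parse_net_dev instead of the sample text. Because the counters are cumulative, comparing two readings taken a few seconds apart tells you whether a problem is still happening.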

Socket information

ifconfig and ip only show the statistics of packets sent and received by the network interface, but in actual performance problems, we must also pay attention to the statistics in the network protocol stack. You can use netstat or ss to view socket, network stack, network interface, and routing table information.

I personally recommend using ss to query network connection information, because it provides better performance (faster speed) than netstat.

For example, you can execute the following command to query socket information:

# head -n 3 means only displaying the first 3 lines
# -l means only display listening sockets
# -n means display numeric address and port (instead of name)
# -p means display process information
$ netstat -nlp | head -n 3
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 840/systemd-resolve

# -l means only display listening sockets
# -t means only display TCP sockets
# -n means display numeric address and port (instead of name)
# -p means display process information
$ ss -ltnp | head -n 3
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=840,fd=13))
LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1459,fd=3))

The output of netstat and ss is also similar, both showing the socket status, receive queue, send queue, local address, remote address, process PID and process name, etc.

Among them, the receive queue (Recv-Q) and the send queue (Send-Q) require your special attention, and they should usually be 0. When you find that they are not 0, it means that there is accumulation of network packets. Of course, also note that they have different meanings in different socket states.

When the socket is in the connected state (Established),

Recv-Q represents the number of bytes in the socket buffer that have not been taken away by the application (that is, the receive queue length).

Send-Q represents the number of bytes that have not been acknowledged by the remote host (i.e., the length of the send queue).

When the socket is in the listening state (Listening),

Recv-Q represents the current length of the full connection queue.

Send-Q represents the maximum length of the full connection queue.

The so-called full connection means that the server has received the client's ACK and completed the TCP three-way handshake, at which point the connection is moved to the full connection queue (also known as the accept queue). Sockets in this queue still need to be taken out by the accept() system call before the server can actually start processing the client's requests.

Corresponding to the full connection queue, there is also a semi-connection queue (also known as the SYN queue). A semi-connection is a connection that has not yet completed the TCP three-way handshake, so it is only halfway established. After the server receives the SYN packet from the client, it puts the connection in the semi-connection queue and then sends a SYN+ACK packet to the client.
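Since Recv-Q and Send-Q of a listening socket describe the queue of connections waiting for accept(), you can use them to spot a server that is falling behind. The following is only a sketch: accept_queue_usage is a name I made up, the port 80 row is an invented example, and the 50% threshold is arbitrary.

```python
# Sketch: spot listening sockets whose full connection queue is filling up.
# Assumption: the input is `ss -ltn` output, where for LISTEN sockets Recv-Q is
# the current queue length and Send-Q is the maximum queue length.
def accept_queue_usage(ss_output: str) -> list:
    """Return (local_address, current, maximum, fill_ratio) per LISTEN socket."""
    rows = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue  # skip the header and anything that is not listening
        current, maximum = int(fields[1]), int(fields[2])
        ratio = current / maximum if maximum else 0.0
        rows.append((fields[3], current, maximum, ratio))
    return rows


# Hypothetical sample: the second row is invented for illustration.
SAMPLE = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     127.0.0.53%lo:53    0.0.0.0:*
LISTEN  96      128     0.0.0.0:80          0.0.0.0:*
"""

if __name__ == "__main__":
    for address, current, maximum, ratio in accept_queue_usage(SAMPLE):
        status = "filling up" if ratio >= 0.5 else "ok"  # arbitrary threshold
        print(f"{address}: {current}/{maximum} ({ratio:.0%}) {status}")
```

A queue that stays close to its maximum suggests the application is not calling accept() fast enough, so new connections are at risk of being dropped.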

Protocol stack statistics

Similarly, you can use netstat or ss to view protocol stack information:

$ netstat -s
...
Tcp:
    3244906 active connection openings
    23143 passive connection openings
    115732 failed connection attempts
    2964 connection resets received
    1 connections established
    13025010 segments received
    17606946 segments sent out
    44438 segments retransmitted
    42 bad segments received
    5315 resets sent
    InCsumErrors: 42
...

$ ss -s
Total: 186 (kernel 1446)
TCP: 4 (estab 1, closed 0, orphaned 0, synrecv 0, timewait 0/0), ports 0

Transport Total IP IPv6
* 1446 - -
RAW 2 1 1
UDP 2 2 0
TCP 4 3 1
...

The statistics of these protocol stacks are very intuitive. ss only displays brief statistics such as connected, closed, orphan sockets, etc., while netstat provides more detailed network protocol stack information.

For example, the netstat output above shows TCP statistics such as active connection openings, passive connection openings, failed connection attempts, and the number of segments sent and received.
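These counters can also be turned into the retransmission rate mentioned at the beginning of this section. The sketch below is one common approximation (segments retransmitted divided by segments sent out), not an official formula; tcp_retransmission_rate is my own name, and the sample text is copied from the netstat output above. Keep in mind that the counters are cumulative, so the result is a long-run average rather than the current rate.

```python
import re


def tcp_retransmission_rate(netstat_output: str) -> float:
    """Return segments retransmitted / segments sent out from `netstat -s` text."""
    sent = re.search(r"(\d+) segments sent out", netstat_output)
    retrans = re.search(r"(\d+) segments retransmitted", netstat_output)
    if not sent or not retrans:
        raise ValueError("TCP segment counters not found")
    sent_count = int(sent.group(1))
    return int(retrans.group(1)) / sent_count if sent_count else 0.0


# Lines copied from the netstat -s example above.
SAMPLE = """\
Tcp:
    13025010 segments received
    17606946 segments sent out
    44438 segments retransmitted
"""

if __name__ == "__main__":
    print(f"retransmission rate: {tcp_retransmission_rate(SAMPLE):.3%}")
```

For the numbers above this works out to roughly 0.25%. To see the current rate, take two samples some seconds apart and apply the same division to the differences.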

Network throughput and PPS

Next, let’s take a look at how to check the current network throughput and PPS of the system. Here, I recommend using our old friend sar, which we have used many times in the previous CPU, memory and I/O modules.

Add the -n parameter to sar to view network statistics, such as network interface (DEV), network interface error (EDEV), TCP, UDP, ICMP, etc. Execute the following command to get network interface statistics:

# The number 1 means outputting a set of data every 1 second
$ sar -n DEV 1
Linux 4.15.0-1035 (ubuntu) 01/06/19 _x86_64_ (2 CPU)

13:21:40 IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
13:21:41 eth0 18.00 20.00 5.79 4.25 0.00 0.00 0.00 0.00
13:21:41 docker0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
13:21:41 lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

There are many indicators output here. Let me briefly explain their meaning.

rxpck/s and txpck/s are the received and sent PPS respectively, in packets/second.

rxkB/s and txkB/s are the receiving and sending throughputs respectively, in KB/second.

rxcmp/s and txcmp/s are the number of compressed data packets received and sent respectively, in packets/second.

%ifutil is the usage of the network interface, which is (rxkB/s + txkB/s)/Bandwidth in half-duplex mode and max(rxkB/s, txkB/s)/Bandwidth in full-duplex mode.

Among them, Bandwidth can be queried using ethtool. Its unit is usually Gb/s or Mb/s, but note that the lowercase letter b here represents bits rather than bytes. The units we usually mention for Gigabit network cards, 10 Gigabit network cards, etc. are also bits. As you can see below, my eth0 network card is a Gigabit network card:

$ ethtool eth0 | grep Speed
  Speed: 1000Mb/s
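Because sar reports throughput in kilobytes while ethtool reports bandwidth in bits, it is easy to get the units wrong when checking %ifutil by hand. Here is a small sketch of the calculation described above. The function name ifutil_percent is my own, and I am assuming that sar's kB means 1024 bytes and that 1 Mb/s means 1,000,000 bits per second.

```python
# Assumption: sar's "kB" is 1024 bytes and the link speed is given in Mb/s
# (1 Mb/s = 1,000,000 bits/s), as ethtool reports it.
def ifutil_percent(rx_kb_s: float, tx_kb_s: float, speed_mbps: float,
                   full_duplex: bool = True) -> float:
    """Approximate %ifutil from sar throughput and the ethtool link speed."""
    # Full duplex: the busier direction counts; half duplex: both directions share the link.
    kb_s = max(rx_kb_s, tx_kb_s) if full_duplex else rx_kb_s + tx_kb_s
    bits_per_second = kb_s * 1024 * 8  # kilobytes/s -> bits/s
    return bits_per_second / (speed_mbps * 1_000_000) * 100


if __name__ == "__main__":
    # eth0 row from the sar output above, on the 1000Mb/s link
    print(f"{ifutil_percent(5.79, 4.25, 1000):.4f}%")
```

For the eth0 row shown earlier, the result is well below 0.01%, which is consistent with the 0.00 that sar prints in the %ifutil column.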

Connectivity and latency

Finally, we usually use ping, which is based on the ICMP protocol, to test connectivity and latency to a remote host. For example, by executing the following command, you can test the connectivity and latency from the local machine to the IP address 114.114.114.114:

# -c3 means sending ICMP packets three times and then stopping.
$ ping -c3 114.114.114.114
PING 114.114.114.114 (114.114.114.114) 56(84) bytes of data.
64 bytes from 114.114.114.114: icmp_seq=1 ttl=54 time=244 ms
64 bytes from 114.114.114.114: icmp_seq=2 ttl=47 time=244 ms
64 bytes from 114.114.114.114: icmp_seq=3 ttl=67 time=244 ms

--- 114.114.114.114 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 244.023/244.070/244.105/0.034 ms

The output of ping can be divided into two parts.

The first part is information about each ICMP request, including the ICMP sequence number (icmp_seq), the TTL (time to live, that is, the number of hops the packet may still traverse), and the round-trip delay.

The second part is a summary of three ICMP requests.

For example, the output above shows that 3 network packets were sent and 3 responses were received, with no packet loss. This shows that the test host can reach 114.114.114.114. The average round-trip time (RTT) is 244 ms, that is, it takes a total of 244 ms from sending the ICMP request to receiving the reply from 114.114.114.114.
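If you want to record these numbers in a monitoring script, the summary lines are easy to parse. The sketch below assumes the Linux (iputils) ping output format shown above; other ping implementations word the summary differently, and parse_ping_summary is simply a name I chose.

```python
import re


def parse_ping_summary(ping_output: str) -> dict:
    """Extract packet loss (%) and RTT statistics (ms) from Linux ping output."""
    loss = re.search(r"([\d.]+)% packet loss", ping_output)
    rtt = re.search(
        r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms",
        ping_output,
    )
    if not loss or not rtt:
        raise ValueError("ping summary not found")
    rtt_min, rtt_avg, rtt_max, rtt_mdev = (float(v) for v in rtt.groups())
    return {
        "loss_percent": float(loss.group(1)),
        "min_ms": rtt_min,
        "avg_ms": rtt_avg,
        "max_ms": rtt_max,
        "mdev_ms": rtt_mdev,
    }


# Summary lines copied from the ping example above.
SAMPLE = """\
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 244.023/244.070/244.105/0.034 ms
"""

if __name__ == "__main__":
    print(parse_ping_summary(SAMPLE))
```

If ping is interrupted or every packet is lost, the rtt line is missing, so the function raises an error instead of returning misleading values.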

Summary

We usually use bandwidth, throughput, delay and other indicators to measure network performance; accordingly, you can use ifconfig, netstat, ss, sar, ping and other tools to view the performance indicators of these networks.

In the next section, I'll take you a step further into how Linux networking works, using the classic C10K and C100K problems.
