A detailed, illustrated analysis of the iperf3 -A parameter

Table of Contents

  • 1. The official description from man iperf3 is as follows:
  • 2. Explanation:
  • 3. Use n format to set CPU affinity
    • 3.1. Server
    • 3.2. Client
    • 3.3. Test results
  • 4. Use n,m format to set CPU affinity
    • 4.1. Server run command
    • 4.2. Client run command
    • 4.3. Test results
  • 5. Recommended solution for iperf3 CPU affinity configuration
  • 6. CPU affinity setting under NUMA architecture

CPU affinity means binding a process to a particular core so that the operating system does not migrate it to other cores while it runs.

On operating systems that support CPU affinity settings (such as Linux), iperf3 can be bound to a specified CPU core.
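
On Linux, the set of cores a process is allowed to run on can be read from /proc. The following is a minimal sketch, assuming a Linux host with /proc mounted; allowed_cpus is a hypothetical helper name used only for illustration and is not part of iperf3.

```shell
# Print the list of CPU cores a process is allowed to run on (Linux only).
# Usage: allowed_cpus [pid]
# Without a pid, /proc/self is read; child processes inherit the affinity
# of the shell, so the result also describes the calling shell.
allowed_cpus() {
    pid="${1:-self}"
    # Cpus_allowed_list is a kernel-provided field such as "0-7" or "2".
    awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/$pid/status"
}

allowed_cpus    # typically lists every core when no affinity has been set
```

For an iperf3 process started with -A 2, calling allowed_cpus with that process ID would be expected to print 2.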

1. The official description from man iperf3 is as follows:

-A, --affinity n/n,m
Set the CPU affinity, if possible (Linux, FreeBSD, and Windows only). On both the client and server you can set the local affinity by using the n form of this argument (where n is a CPU number). In addition, on the client side you can override the server’s affinity for just that one test, using the n,m form of argument. Note that when using this feature, a process will only be bound to a single CPU (as opposed to a set containing potentially multiple CPUs).

2. Explanation:

You can use the -A or --affinity parameter followed by a CPU number (if you don’t know how to check the CPUs of the host, see How to know how many CPUs the current host has and which CPUs).
It sets the CPU affinity if the operating system supports it (Linux, FreeBSD and Windows). With the n format, you can set the local CPU affinity on either the server or the client (n is the CPU number). With the n,m format, which is used on the client, you set both the client’s own affinity (the client runs on core n of the client host) and the server’s affinity for this one test (the server runs on core m of the server host). Note that with this parameter the iperf3 process is bound to a single CPU core; without it, iperf3 may run on any core in the set allowed by the operating system, which is usually all cores of the host.
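
Before choosing a value for n, it helps to confirm that the CPU number exists on the host. The sketch below assumes that CPU numbers start at 0 and that the online CPUs are numbered contiguously (which may not hold if cores have been taken offline); cpu_count and valid_cpu are hypothetical helper names.

```shell
# Number of CPUs currently online, as reported by the system.
cpu_count() {
    getconf _NPROCESSORS_ONLN
}

# Succeed if $1 looks like a usable CPU number for -A.
# Assumption: CPUs are numbered 0 .. cpu_count-1 without gaps.
valid_cpu() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -lt "$(cpu_count)" ]
}

if valid_cpu 2; then
    echo "CPU 2 exists on this host"
else
    echo "CPU 2 does not exist on this host"
fi
```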

3. Use n format to set CPU affinity

The n format sets the CPU affinity of the client or the server itself. We use the following commands to bind both the client and the server to CPU No. 2 on their respective hosts, and then start the 1 Gbps bandwidth test. Note that, to make the test results more obvious, our client is an 8-core i8 x86 machine with a single-core frequency of 4.0 GHz, while the server is a Raspberry Pi 4B with an ARM A53 CPU and a single-core frequency of 1 GHz.

3.1. Server

Use the following command to start the server and bind the iperf3 server process to cpu2. After the client starts sending UDP streams, the CPU load displayed by top shows that the usage of cpu2 has reached 100%, while the usage of the other cores remains very low (CPU0 is slightly loaded because the Linux TCP/IP protocol stack runs on CPU0). This indicates that the iperf3 process is successfully bound to cpu2.

iperf3 -s -A 2

3.2. Client

Use the following command to bind the client to CPU No. 2 and send data at a rate of 750 Mbps.

/usr/bin/iperf3 -c 192.168.3.60 -u -b 750M -t 100 -A 2

We can see that the CPU usage of cpu2 is significantly higher than that of the other cores, which means that iperf3 runs on that CPU. When we replace -A 2 in the command above with -A 6, the CPU usage of cpu6 becomes significantly higher than that of the other cores. This shows that -A is in effect.

3.3. Test results

During the bandwidth test, top shows that the usage of CPU No. 2 on the client side rises significantly while the usage of the other client cores does not change. Likewise, the usage of CPU No. 2 on the server side rises significantly while the other server cores show no change. This means that the iperf3 process is successfully bound to the specified CPU on both the client and the server.
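
Besides watching top, the core on which a process most recently ran can be read directly from /proc. This is a sketch assuming a Linux host, where field 39 of /proc/<pid>/stat holds that CPU number; last_cpu is a hypothetical helper name.

```shell
# Print the CPU number a process last ran on (field 39 of /proc/<pid>/stat).
# Usage: last_cpu <pid>
last_cpu() {
    stat="$(cat "/proc/$1/stat")" || return 1
    # The command name (field 2) may contain spaces, so drop everything up
    # to the last closing parenthesis; the remainder starts at field 3.
    rest="${stat##*) }"
    set -- $rest
    # Field 39 overall is therefore positional parameter 37 here.
    echo "${37}"
}

last_cpu "$$"    # core on which this shell most recently ran
```

When called repeatedly with the process ID of an iperf3 server started with -A 2, the helper would be expected to keep printing 2 for as long as the binding is in effect.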

4. Use n,m format to set CPU affinity

This format can only be used on the client, but it takes effect on both the client and the server, and only for the duration of this test. The server affinity that the client specifies with m overrides, for this test, the affinity that the server specified with the n format when it started. After the test is completed, the iperf3 client exits automatically, and the server goes back to listening on its port (5201 by default) for the next test. At that point the server returns to the CPU affinity specified with the n format when it was started.
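
The override rule described above can be written down as a small model. This is only an illustration of the behaviour as described in this article, not iperf3’s own argument handling; effective_server_core is a hypothetical name.

```shell
# Model of which core the server uses for one test (illustration only).
#   $1 = core given to the server at startup (iperf3 -s -A <n>)
#   $2 = -A argument given to the client ("n" or "n,m")
effective_server_core() {
    case "$2" in
        *,*) echo "${2#*,}" ;;   # n,m form: m overrides the server's own setting
        *)   echo "$1" ;;        # n form: the server keeps its startup affinity
    esac
}

effective_server_core 2 1,3   # prints 3 (override for this one test)
effective_server_core 2 1     # prints 2 (the server's own setting applies)
```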

We use the following commands to bind the server to CPU No. 2, then use the client command to bind the client to CPU No. 1 (n = 1) and the server to CPU No. 3 (m = 3), and then start the 1 Gbps bandwidth test. Note that our client is a 4-core i3 x86 machine with a single-core frequency of 3.6 GHz, while the server is a Raspberry Pi 4B with an ARM A53 CPU and a single-core frequency of 1 GHz.

4.1. Server run command

Run the following command to bind the server to CPU No. 2.

iperf3 -s -A 2

4.2. Client run command

Run the following command to assign the client to CPU No. 1 (n=1) and the server to CPU No. 3 (m = 3).

/usr/bin/iperf3 -c 192.168.3.60 -u -b 750M -t 100 -A 1,3

4.3. Test results

During the bandwidth test, top shows that the usage of CPU No. 3 on the server side rises significantly, while the usage of the other cores does not change much. This means that, during this test, iperf3 on the server switched from the originally designated CPU No. 2 to CPU No. 3 as instructed by the client command.

5. Recommended solution for iperf3 CPU affinity configuration

If the affinity is not set, Linux schedules iperf3 onto whichever CPU core it considers most appropriate given the current load of each core. This causes two problems:

  • By default, iperf3 first runs on CPU No. 0. Because the processing of the TCP/IP protocol stack itself is also fixed on CPU No. 0, the load on CPU No. 0 becomes very high soon after the test starts, and packets are lost because they cannot be processed in time.
  • When Linux finds that CPU No. 0 is too busy and migrates the iperf3 process to another CPU, packet reception pauses briefly, which causes packet loss during that period.

As shown in the figure below, when the CPU on which the server runs is not specified, both problems can be observed: the packet loss rate rises at certain times, and if we look at top at those moments, we find that many CPU switches occur whenever packets are lost.

Therefore, on a system with limited CPU capacity (that is, a system whose CPU processing power is not enough to saturate the network), it is recommended to specify the CPU affinity from the beginning. Generally speaking, the Linux TCP/IP protocol stack is fixed on CPU core No. 0, and TCP/IP processing itself takes up a lot of CPU power, so we should generally avoid CPU No. 0.
The server and the client can then use the following recommended settings, which run the server on CPU core No. 2 and the client on CPU core No. 3 of their respective hosts (if you know which CPU is the most idle in your system, you can assign the affinity to that CPU instead).

iperf3 -s -A 2
iperf3 -c 192.168.3.1 -u -b 1000m -A 3
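
If it is not obvious which CPU is the most idle, one rough way to estimate it is to sample the per-core idle counters in /proc/stat twice and compare them. The sketch below assumes a Linux host; snapshot and idlest_cpu are hypothetical helper names, and skipping CPU No. 0 simply follows the recommendation above.

```shell
# Print "cpuN idle_ticks" for every core, read from /proc/stat.
snapshot() {
    awk '/^cpu[0-9]+ / { print $1, $5 }' /proc/stat
}

# Print the number of the core that gained the most idle time during the
# sampling interval ($1 seconds, default 1), skipping CPU 0.
# On a single-core host only CPU 0 exists, so 0 is printed.
idlest_cpu() {
    before="$(snapshot)"
    sleep "${1:-1}"
    after="$(snapshot)"
    printf '%s\n---\n%s\n' "$before" "$after" | awk '
        $1 == "---" { second = 1; next }      # separator between samples
        !second     { idle[$1] = $2; next }   # remember the first sample
        {
            id = substr($1, 4)                # "cpu3" -> "3"
            if (id == "0") next               # avoid CPU 0
            delta = $2 - idle[$1]
            if (best == "" || delta > max) { max = delta; best = id }
        }
        END { print (best == "" ? 0 : best) }
    '
}

idlest_cpu 1
# Possible use: iperf3 -s -A "$(idlest_cpu)"
```

A one-second sample is only a snapshot, so the result is best treated as a starting point and checked against top during the actual test.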

6. CPU affinity setting under NUMA architecture

If lscpu reports more than one NUMA node, it is best to avoid crossing NUMA nodes: run iperf3 on a different CPU core that belongs to the same NUMA node as the network card and the TCP/IP protocol stack.
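
To find out which cores share a NUMA node with the network card, the information exposed under /sys can be used. This is a sketch assuming a Linux host with sysfs mounted; eth0 is an assumed interface name, and nic_numa_node and node_cpus are hypothetical helper names.

```shell
# Print the NUMA node of a network interface, or -1 if the kernel does
# not report one (virtual interfaces, non-NUMA hosts, missing interface).
nic_numa_node() {
    f="/sys/class/net/$1/device/numa_node"
    if [ -r "$f" ]; then cat "$f"; else echo -1; fi
}

# Print the CPU list of a NUMA node (for example "0-7"), if it exists.
node_cpus() {
    f="/sys/devices/system/node/node$1/cpulist"
    if [ -r "$f" ]; then cat "$f"; fi
}

node="$(nic_numa_node eth0)"    # eth0 is an assumed interface name
if [ "$node" -ge 0 ]; then
    echo "eth0 is attached to NUMA node $node, cores: $(node_cpus "$node")"
else
    echo "no NUMA node reported for eth0"
fi
```

A core number taken from the printed list can then be passed to -A so that iperf3 stays on the same node as the network card.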