39 | HTTP Performance Optimization (Part 1)

Since we want to optimize performance, we first need to ask: what exactly is performance? What indicators describe it, how should they be measured, and what measures can we take to improve them?

“Performance” is actually a complex concept. Different people and different application scenarios define it differently. For HTTP, a very complex system with many participating roles, it is especially difficult to sum up performance in just a word or two.

Let’s start with HTTP’s most basic “request-response” model. There are two roles in this model, the client and the server, plus the transmission link between them, so we can examine performance from these three angles.

HTTP Server

Let’s look at the server first. It typically runs on Linux and uses web server software such as Apache or Nginx to provide services. For a server, “performance” therefore means service capability: handling as many user requests as possible, as quickly as possible.

There are three main indicators for measuring server performance: throughput (requests per second), concurrency, and response time (time per request).

Throughput is what we usually call RPS, the number of requests per second (sometimes also called TPS or QPS). It is the server’s most basic performance indicator: the higher the RPS, the better the performance.

Concurrency reflects the server’s load capacity, that is, how many clients the server can serve at the same time. Naturally, the more the better, since that means serving more users.

Response time reflects the server’s processing capability, in other words its speed: the shorter the response time, the more users the server can serve per unit of time, which in turn improves throughput and concurrency.
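These three indicators are not independent. As a rough rule of thumb (essentially Little’s law; the numbers below are made up purely for illustration), they are related by:

concurrency ≈ throughput × average response time

So a server sustaining 1,000 RPS with an average response time of 100 ms is holding roughly 1,000 × 0.1 = 100 requests in flight at any moment.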

In addition to the three basic performance indicators above, the server must also watch the occupancy of system resources such as CPU, memory, hard disk, and network card. Utilization that is too high or too low can both be a sign of trouble.

Over the years of HTTP’s development, many mature tools have emerged for measuring these server performance indicators, including open-source and commercial ones, command-line and graphical ones.

On Linux, the most commonly used performance testing tool is probably ab (Apache Bench). For example, the following command uses a concurrency of 100 to send a total of 10,000 requests:

ab -c 100 -n 10000 'http://www.xxx.com/'   # 100 concurrent clients, 10,000 requests in total
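ab has many more options. As a small sketch (same placeholder host as above), the flags -t and -k let you run for a fixed duration with HTTP keep-alive instead of a fixed request count:

ab -c 100 -t 10 -k 'http://www.xxx.com/'   # run for 10 seconds, reusing connections via keep-alive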

For system resource monitoring, Linux also comes with many tools. Commonly used ones include uptime, top, vmstat, netstat, and sar; you may know them even better than I do. Here are a few simple examples:

top               # view CPU and memory usage
vmstat 2          # report overall system status every 2 seconds
sar -n DEV 2      # report traffic on all network interfaces every 2 seconds
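The other tools mentioned above are used in the same spirit; for example:

uptime            # load averages over the last 1, 5, and 15 minutes
netstat -ant      # list all TCP connections in numeric form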

By understanding these performance indicators, we know the direction of server-side optimization: use system resources rationally, improve throughput and concurrency, and reduce response time.

HTTP Client

After looking at the server’s performance indicators, let’s take a look at how to measure the client’s performance.

The client is a consumer of information: all data must be obtained from the server over the network, so its most basic performance indicator is “latency.”

I briefly introduced latency when discussing HTTP/2. “Latency” is essentially “waiting”: the time spent waiting for data to arrive at the client. Because the HTTP transmission link is very complex, latency has many causes.

First of all, remember that there is an “insurmountable” obstacle: the speed of light. The latency imposed by geographical distance cannot be engineered away; accessing a website thousands of kilometers away will obviously incur a larger delay.
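A quick back-of-the-envelope calculation shows how hard this floor is. Light travels through optical fiber at roughly two-thirds of its vacuum speed, about 200,000 km/s, so for a server 2,000 km away the round trip alone takes at least:

minimum RTT = 2 × 2,000 km ÷ 200,000 km/s = 20 ms

and no amount of server tuning can get below that.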

The second factor is bandwidth, which covers every link along the way: cable, WiFi, or 4G at the access side, plus the networks within and between operators. Any one of these may become a bottleneck, reducing transfer speed and increasing latency.

The third factor is the DNS query. If the domain name is not cached locally, a query must be sent to the DNS system, incurring a series of network round trips; until the IP address comes back, the client can only wait and cannot access the website.
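You can observe this cost directly from the command line; as a small example (placeholder host again), dig prints how long the lookup took:

dig www.xxx.com   # the ';; Query time:' line in the output shows the DNS lookup time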

The fourth factor is the TCP handshake. You should be familiar with it by now: a connection is established only after the three packets SYN, SYN/ACK, and ACK have been exchanged, which costs at least one extra round trip, so the delay it adds is again governed by the speed of light and bandwidth.
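Since the handshake costs roughly one round trip, a quick way to estimate its lower bound is simply to measure the RTT, for example with ping (host is a placeholder):

ping -c 4 www.xxx.com   # the reported round-trip times approximate the handshake's minimum cost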

After the TCP connection is established, normal data exchange begins, followed by parsing HTML, executing JavaScript, layout, and rendering, all of which also take time. But these no longer belong to HTTP, so they are outside the scope of today’s discussion.

When I talked about HTTPS, I introduced a special website, “SSLLabs” (https://www.ssllabs.com/). For HTTP performance there is likewise a dedicated test site, “WebPageTest” (https://www.webpagetest.org/). Its distinguishing feature is its many test points around the world: you can choose the geographical location, device model, operating system, and browser to run the test from, which makes it very convenient and easy to use.

The final result of the test is an intuitive “waterfall chart” that clearly shows the order in which the page’s resources load and how long each one takes. For example, the picture below is a test of the GitHub homepage.

The developer tools built into browsers such as Chrome can also observe client-side latency. The left side of the Network panel shows the time consumed by each URI, and the right side displays a similar waterfall chart.

Click on a URI, and the Timing tab displays a small “waterfall chart” of its own: a detailed breakdown of where the time went, with the causes of delay clearly listed, as in the picture below:

What do the indicators in the picture mean? Let me explain:

Because of “head-of-line blocking,” the browser opens at most 6 concurrent connections per domain name under HTTP/1.1. When a page has many resources, requests must wait in a queue (Queued, Queueing); here the request waited 1.62 seconds before the browser began processing it.

The browser then needs to pre-allocate resources and schedule the connection, which took 11.56 milliseconds (Stalled).

The domain name must be resolved before connecting; thanks to a local cache, this took only 0.41 milliseconds (DNS Lookup).

Establishing the connection to the server is very expensive: 270.87 milliseconds in total, of which 134.89 milliseconds went to the TLS handshake, leaving 135.98 milliseconds for the TCP handshake (Initial connection, SSL).

Actually sending the request data was very fast, taking only 0.11 milliseconds (Request sent).

After that comes waiting for the server’s response. The technical term is TTFB (Time To First Byte), the “first byte response time”; it includes both the server’s processing time and the network transmission time, and took 124.2 milliseconds here.

Receiving the response data was also very fast, taking 3.58 milliseconds (Content Download).
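You can reproduce a similar breakdown from the command line. As a minimal sketch (placeholder host), curl’s -w option prints its built-in timing variables, which map roughly onto the phases above; note that curl’s timers are cumulative from the start of the request, not per-phase durations:

curl -s -o /dev/null 'https://www.xxx.com/' -w '
DNS lookup:    %{time_namelookup}s
TCP connect:   %{time_connect}s
TLS handshake: %{time_appconnect}s
TTFB:          %{time_starttransfer}s
Total:         %{time_total}s
'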

As you can see from this picture, the waiting time in a single HTTP “request-response” is alarming: of the 415.04 milliseconds total, delay accounts for almost 99%.

Therefore, the key to client-side HTTP performance optimization is: reducing latency.

HTTP Transmission Link

Taking HTTP’s basic “request-response” model as the starting point, we have just derived some performance indicators. Now let’s widen the view to the “real world” and look at the transmission link between client and server, which is also key to HTTP performance.

Remember the Internet diagram from Chapter 8? I changed it slightly, dividing it into several areas called the “first mile,” the “middle mile,” and the “last mile.”

The “first mile” is the website’s exit: the transmission line through which the server connects to the Internet. Its bandwidth directly determines the website’s ability to serve the outside world, that is, its throughput and related indicators. Obviously, to optimize performance we should invest in this “first mile,” purchasing as much bandwidth as possible and connecting to more operator networks.

The “middle mile” is the actual Internet, composed of countless small networks. In reality it is far longer than a “mile”: a vast and complex web in which geographical distance and network interconnection seriously affect transmission speed. Fortunately, HTTP has a “good helper” here, the CDN, which helps a website span “thousands of mountains and rivers” and makes the distance truly feel like just “one mile.”

The “last mile” is the user’s entrance to the Internet: optical fiber and network cable for fixed-line users, WiFi and base stations for mobile users. It used to be the main bottleneck on the client side, with high latency and little bandwidth, but with the spread of 4G and high-speed broadband in recent years the “last mile” has improved greatly and is no longer the main constraint on performance.

Beyond these “three miles,” I personally think there is also a “zero mile”: the web service system inside the website itself. It is in fact a small network in its own right (though it may be very large), and the data processing and transmission within it add delay and lengthen the server’s response time, so it is an optimization point that must not be ignored.

Across this entire transmission path, the final “last mile” is beyond our control, so we can only work hard on the “zero mile,” the “first mile,” and the “middle mile”: increasing bandwidth, reducing latency, and optimizing transmission speed.

Summary

Performance optimization is a complex topic; for HTTP it can be broken down into server-side optimization, client-side optimization, and transmission-link optimization.

The server has three main performance indicators: throughput, concurrency, and response time; in addition, resource utilization must be watched.

The client’s basic performance indicator is latency; the influencing factors include geographical distance, bandwidth, DNS queries, the TCP handshake, and so on.

The transmission link from server to client can be divided into three parts; the ones we can optimize are the first two, the “first mile” and the “middle mile.”

There are many tools for measuring these indicators: on the server side, ab, top, sar, and others; on the client side, test websites and the browser’s developer tools.
