A comparison of Python's network request libraries (urllib3, requests, aiohttp) over HTTP and HTTPS, and of concurrency models (multi-threading, gevent, asyncio), including a very large thread pool versus a 2n + 1 thread pool

This article has three goals, and a warning: do not go astray by obsessing over concepts alone. Some people assume that knowing a set of concepts tells them how things will behave, but in practice it is often not what they think.

This article benchmarks several network request libraries, several concurrency models, and thread pools of various sizes by making 50,000 requests to a small static Baidu page, testing HTTP and HTTPS separately:

https://www.baidu.com/content-search.xml
http://www.baidu.com/content-search.xml

The page body is tiny, so slow responses caused by poor network speed or bandwidth can essentially be ruled out.

1. Summarize the performance of Python's network request libraries: urllib3, requests, and aiohttp.

2. Summarize the concurrency efficiency of multi-threading, asyncio, and gevent.

3. On a 4-core CPU, compare the efficiency of a 200-thread pool against a 9-thread pool (2 × 4 + 1).

The test output is shown below. nb_log automatically prints a timestamp with every print call, which makes it easy to see from the console how many requests complete per second; be sure to import nb_log.

Test scenario 1

urllib3 + ThreadPoolExecutor (200 threads) + connection pool + HTTPS

On average, 750 requests completed per second.
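The article does not include the benchmark script itself, so here is a minimal offline sketch of the scenario-1 pattern: a 200-worker ThreadPoolExecutor sharing one urllib3 connection pool. The live request is replaced by a short sleep so the sketch runs without network access; the pool sizes and request count are assumptions, not the author's exact script.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(_):
    # In the real benchmark this is a pooled urllib3 GET, created once and shared:
    #   http = urllib3.PoolManager(maxsize=200)
    #   return http.request("GET", "https://www.baidu.com/content-search.xml").status
    time.sleep(0.01)  # stand-in for network IO so the sketch runs offline
    return 200

N = 2000  # the article uses 50,000; smaller here for a quick run
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=200) as pool:
    statuses = list(pool.map(fetch, range(N)))
elapsed = time.perf_counter() - start
print(f"{N / elapsed:.0f} requests per second")
```

With nb_log imported, the timestamp it adds to every print serves the same purpose as the requests-per-second figure computed here.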

Test scenario 2

requests + ThreadPoolExecutor (200 threads) + connection pool + HTTPS

On average, 310 requests completed per second.
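For clarity on what "connection pool" means in the requests scenarios: a shared Session keeps urllib3 connection pools underneath and reuses TCP/TLS connections, while bare requests.get() builds and tears down a connection on every call. The adapter sizes below are assumptions, not the author's exact settings:

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()  # reuses TCP/TLS connections via urllib3 pools
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=200)
session.mount("https://", adapter)  # let all 200 threads share pooled connections

# Pooled (scenario 2):   session.get("https://www.baidu.com/content-search.xml")
# Unpooled (scenario 8): requests.get(...) opens a new connection every call
print(type(session).__name__)
```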

Test scenario 3

urllib3 + gevent (200 concurrent) + connection pool + HTTPS

On average, 270 requests completed per second.
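A hypothetical sketch of the gevent variant, assuming the usual pattern: monkey.patch_all() makes blocking socket IO cooperative, and a gevent.pool.Pool caps concurrency at 200. The sleep stands in for the pooled urllib3 GET so the sketch runs offline:

```python
# monkey.patch_all() must run before anything imports socket/ssl
from gevent import monkey
monkey.patch_all()

import time
import gevent.pool

def fetch(_):
    # Real test: a pooled urllib3 GET against the Baidu URL; the sleep is a
    # stand-in for network IO (cooperative after monkey-patching)
    time.sleep(0.01)
    return 200

pool = gevent.pool.Pool(200)  # at most 200 greenlets in flight
start = time.perf_counter()
statuses = list(pool.imap_unordered(fetch, range(2000)))
elapsed = time.perf_counter() - start
print(f"{2000 / elapsed:.0f} requests per second")
```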

Test scenario 4

urllib3 + thread pool (200 concurrent) + connection pool + HTTP (note: HTTP, not HTTPS; HTTPS executes more code paths and consumes more CPU)

On average, 1300 requests completed per second.

Test scenario 5

requests + thread pool (200 concurrent) + connection pool + HTTP (note: HTTP, not HTTPS)

On average, 330 requests completed per second.

Test scenario 6

urllib3 + gevent (200 concurrent) + connection pool + HTTP (note: HTTP, not HTTPS)

On average, 580 requests completed per second.

Test scenario 7

urllib3 + thread pool (200 concurrent) + no connection pool + HTTPS (note: HTTPS, not HTTP)

On average, 580 requests completed per second.

Test scenario 8

requests + thread pool (200 concurrent) + no connection pool + HTTPS (note: HTTPS, not HTTP)

On average, 120 requests completed per second.

Test scenario 9

aiohttp + asyncio (200 concurrent) + connection pool + HTTPS (note: HTTPS, not HTTP)

On average, 990 requests completed per second.

Test scenario 10

aiohttp + asyncio (200 concurrent) + connection pool + HTTP (note: HTTP, not HTTPS)

On average, 1080 requests completed per second.
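The aiohttp scenarios presumably share one ClientSession (its connector is the connection pool) and cap concurrency with a semaphore. Here is an offline sketch of that shape, with asyncio.sleep standing in for the aiohttp GET; the session handling in the comment is an assumption about the author's script:

```python
import asyncio
import time

async def fetch(sem):
    async with sem:  # at most 200 requests in flight
        # Real test, with one shared aiohttp session created once:
        #   async with session.get("https://www.baidu.com/content-search.xml") as resp:
        #       return resp.status
        await asyncio.sleep(0.01)  # stand-in for network IO
        return 200

async def main(n=2000):
    sem = asyncio.Semaphore(200)
    return await asyncio.gather(*(fetch(sem) for _ in range(n)))

start = time.perf_counter()
statuses = asyncio.run(main())
elapsed = time.perf_counter() - start
print(f"{len(statuses) / elapsed:.0f} requests per second")
```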

Test scenario 11

aiohttp + asyncio (200 concurrent) + no connection pool + HTTP (note: no connection pool is used)

On average, 400 requests completed per second.

Test scenario 12

aiohttp + asyncio (200 concurrent) + no connection pool + HTTPS (note: no connection pool is used)

On average, 170 requests completed per second.

In summary: requesting HTTP is much faster than HTTPS.

Connection pooling is much faster than not using connection pooling.

gevent is slower than a thread pool.

urllib3 is much faster than requests.

asyncio + aiohttp is about as efficient as a thread pool + urllib3, and asyncio will never beat multi-threading by dozens of times. There is no need to worry much about the theoretical cost of thread switching: that is just theory, the difference is not large in real scenarios, and synchronous programming is much simpler.

The reason the request rates differ across scenarios is that in every case a single CPU core is pegged at 100%; a single-core, single-process run cannot get any faster, unless you spend 100,000 yuan to build a custom liquid-nitrogen-cooled 10 GHz CPU.

A requests GET to an HTTP URL actually executes about 24,000 lines of code, while a urllib3 GET to the same URL actually executes about 11,000 lines. Because urllib3 actually executes less than half as many lines as requests, urllib3's performance is clearly better than that of requests.

How do you find out how many lines of code actually run behind a function call? You could set breakpoints in PyCharm and step through line by line, but you would be counting until you doubt your life. Instead, pip install pysnooper_click_able, then add its decorator to the function you want to analyze; it records the execution trace and the total number of lines actually executed. pysnooper_click_able is a god-tier black-tech decorator.
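I will not reproduce pysnooper_click_able's exact API here, but the underlying measurement, counting every executed line through a trace hook, can be sketched with the standard library alone:

```python
import sys

def count_executed_lines(func, *args, **kwargs):
    """Run func and count how many 'line' events fire in the frames it
    creates: the same measurement idea, minus the clickable trace output."""
    counted = {"lines": 0}

    def tracer(frame, event, arg):
        if event == "line":
            counted["lines"] += 1
        return tracer  # keep tracing nested frames line by line

    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(None)  # always detach the hook
    return result, counted["lines"]

def demo(n):
    total = 0
    for i in range(n):
        total += i
    return total

result, lines = count_executed_lines(demo, 100)
print(result, lines)  # a 4-line function body executes ~200 lines for n=100
```

Running this on a single urllib3 or requests GET is how you arrive at line counts like the 11,000 and 24,000 quoted above.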

Do the math on the actual number of lines executed: a urllib3 GET runs roughly 11,000 lines of code, and at 1,000 requests per second that is on the order of ten million lines of Python executed per second, which is already respectable. People who know nothing about performance believe a single process can request 1 million URLs per second; they are off by a factor of 1,000, because they never used pysnooper_click_able to count executed lines and imagine that sending a URL request takes only about ten lines of Python. Big mistake. If anyone thinks a single process can request even 10,000 URLs per second, no matter what async or threaded implementation they use, I will bet 10,000 yuan against it.
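The arithmetic behind this claim can be checked in a few lines, using the figures from this article:

```python
lines_per_get = 11_000        # lines actually executed by one urllib3 GET (measured above)
requests_per_second = 1_000   # observed single-process throughput
python_lines_per_second = lines_per_get * requests_per_second
print(python_lines_per_second)  # 11,000,000: on the order of ten million lines/second

# Believing in "1 million requests per second" on the same budget implies
# each request costs only:
implied_lines_per_request = python_lines_per_second // 1_000_000
print(implied_lines_per_request)  # 11 lines per request, about 1,000x too optimistic
```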

When requesting a very fast interface on a 4-core machine, 200 threads is much faster than 4 threads. If the interface takes a long time to respond, say 1 second, then the time for 200 threads to complete 50,000 requests is far less than the time 4 threads would take, and there is no need to fret over the resources consumed by thread switching.
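A back-of-envelope check of this claim, assuming each request blocks for a full second (for IO-bound work, throughput ≈ concurrency / latency):

```python
latency_s = 1.0          # assumed per-request IO wait
total_requests = 50_000  # the article's workload

def wall_time(threads):
    # IO-bound: each thread completes about 1/latency requests per second,
    # so total wall time is total_requests / (threads / latency)
    return total_requests * latency_s / threads

print(wall_time(4), wall_time(200))  # 12500.0 vs 250.0 seconds
```

At a 1-second latency the 200-thread pool is 50x faster, and the CPU is idle almost the whole time either way.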

So be bold: as long as the work is not pure computation, and each task spends more than 0.5 seconds on IO, open 200 threads and do not be afraid of the CPU blowing up. Some people who bought a Mac laptop are genuinely scared of opening many threads, always worrying that too many threads will overheat and damage the machine. Don't worry; if it blows up, I will pay for it.

Besides, if you love your laptop that much, why spend 13,000 yuan on an Apple laptop in the first place? A notebook with a 5800U (8 cores, 16 threads), 32 GB of RAM, and the Deepin 20 system costs just over 4,000 yuan. It can run code with 16 processes, each opening 200 threads internally, and still be nowhere near its limit, unlike those who dare not open multiple threads for fear their Mac will blow up.