Illustration | Why Python multithreading cannot take advantage of multiple cores

1. Global Interpreter Lock

A common question: why can't Python's multi-threading take advantage of multi-core processors?

The Global Interpreter Lock (GIL) is a mechanism that a programming-language interpreter uses to synchronize threads: it allows only one thread to execute in the interpreter at any given time.

Even on a multi-core processor, an interpreter that uses a GIL lets only one thread run at a time. Common interpreters with a GIL include CPython and Ruby MRI.

So the GIL is not unique to Python. It is a mechanism that an interpreter implementation uses to handle multi-threading, not a feature of the language itself.

2. Python interpreters

Python is an interpreted language: code is executed by an interpreter. Python has several interpreters, implemented in different languages, each with its own characteristics.

[Figure: simplified diagram of how a Python program is interpreted and executed]

  • CPython

CPython is the reference implementation, written in C, and the most widely used interpreter. It interoperates easily with C/C++ libraries, which is one reason for its popularity.

  • Jython

A Python interpreter written in Java. It compiles Python code to Java bytecode for execution and interoperates easily with Java class libraries.

  • IronPython

An implementation for the .NET platform. Like Jython, it compiles Python code to bytecode for its host platform, here .NET, and interoperates easily with .NET class libraries.

  • IPython

An enhanced interactive shell built on CPython. Interactivity is improved, but code execution and features are the same as in CPython.

  • PyPy

An implementation focused on execution speed. It uses JIT (just-in-time) compilation to compile Python code dynamically at run time.

PyPy's behavior differs from CPython's in a small number of cases, so the same code can produce different results. If you plan to use PyPy in a project to improve performance, learn these differences in advance.
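Since behavior can differ between interpreters, it helps to know which one your code is actually running on. Here is a minimal sketch using the standard `platform` module; the helper name `describe_interpreter` is my own and not part of any library.

```python
import platform


def describe_interpreter() -> str:
    """Return the implementation name and version of the running interpreter."""
    # platform.python_implementation() returns a string such as
    # 'CPython', 'PyPy', 'Jython' or 'IronPython'.
    name = platform.python_implementation()
    version = platform.python_version()
    return f"{name} {version}"


if __name__ == "__main__":
    print(describe_interpreter())
```

A project that wants to support both CPython and PyPy could branch on this value where the two are known to behave differently.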

3. CPython is thread-unsafe

CPython threads are native operating-system threads (pthreads on Linux), and they are scheduled entirely by the operating system.

pthreads provide no automatic protection for shared data: users must use locks to make multi-threaded operations safe. Multi-threaded Python code under CPython therefore faces the same thread-safety issues.

This is why CPython relies on the GIL, a choice that stored up trouble for the multi-core era.
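To make the point about locks concrete, here is a small sketch of a shared counter updated from several threads. The GIL protects the interpreter's internals, but a compound operation such as `value += 1` in your own code is still a read-modify-write sequence, so an explicit lock is the safe choice. The names `Counter` and `run_workers` are hypothetical, chosen only for this illustration.

```python
import threading


class Counter:
    """A counter whose increments are guarded by an explicit lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Hold the lock for the whole read-modify-write sequence.
        with self._lock:
            self.value += 1


def run_workers(num_threads: int, increments_per_thread: int) -> int:
    """Increment one shared counter from several threads; return the final value."""
    counter = Counter()

    def worker() -> None:
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(4, 10_000))  # prints 40000
```

With the lock in place the final value always equals the number of threads times the number of increments per thread.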

4. GIL background and challenges

Guido van Rossum began developing Python in 1989 and first released it in 1991. At that time, CPU clock speeds had not yet reached 1 GHz and programs ran on single-core machines. Mainstream multi-core desktop processors from Intel and AMD did not arrive until 2005.

[Figure: Python version release timeline]

4.1 The impact of multi-core on software systems

Gordon Moore predicted in 1965 that the number of components per integrated circuit would double every year, and in 1975 revised the pace to every two years; it is often quoted as every 18 to 24 months. The trend was expected to hold into 2015-2020.

While Moore's Law translated into faster clocks, software could rely on hardware advances alone for better performance, or enjoy large gains with only small changes.

Starting around 2005, however, clock speeds stopped rising in step with transistor counts.

As clock rates stalled or even fell because of physical limits, processor manufacturers began packing more cores into a single chip.

This trend puts increasing pressure on application development and programming language design.

Programmers and programming language decision-makers have to consider how to quickly adapt to multi-core hardware to improve software performance and the market share of programming languages, and Python is no exception.

4.2 The impact of multi-core on CPython

In the single-core era, Guido van Rossum, who valued elegance, clarity, and simplicity, chose to implement a global mutex at the interpreter level to protect Python objects and make good use of the single CPU core. This approach worked well in the single-core era.

Without the GIL, developers on a single core would have had to manage tasks themselves, which would not have made the best use of the CPU.

[Figure: Guido van Rossum, the creator of Python]

But in the multi-core era, the effective way to use CPU cores is parallelism. Multi-threading is a natural way to achieve parallelism, yet CPython's GIL prevents threads from using multiple cores.

4.3 The GIL: a blessing and a curse

CPython's GIL is convenient for users, and many important packages and language features have been built on the assumption that it exists.

However, the ubiquity of multi-core CPUs and the impact of other languages on Python make GIL seem primitive and crude, and its inability to effectively utilize multi-core processors has become a drawback.

5. Problems the GIL exposes in the multi-core era

To understand the GIL's impact on multi-threaded programs, you need to understand the basics of how it operates.

  • Single-core CPU situation

CPython's threads are pthreads, scheduled by the operating system's scheduling algorithm.

Whenever the interpreter has executed a certain amount of bytecode, or a thread blocks on system I/O, the running thread is forced to release the GIL. This lets the operating system schedule another thread, so the single core stays busy. On a single core, the gap between releasing the GIL and a thread running again is very short.

  • Multi-core CPU situation

With multiple threads on multiple cores, when the thread running on CPU-A releases the GIL, threads on other CPUs wake up and compete for it, but the thread on CPU-A may reacquire the GIL immediately.

The awakened threads on the other CPUs can then only watch the thread on CPU-A run again, and go back to waiting until they are next scheduled.

As a result, a multi-core CPU switches threads frequently and wastes resources, while only the one thread holding the GIL actually executes Python code. This is why multi-threaded execution on a multi-core CPU can be slower than single-threaded execution.

This resembles the thundering herd problem in network programming, where multiple threads listen on the same port, except that it happens at the CPU level and the waste is even greater.

6. Practical impact of GIL

  • I/O intensive

The interpreter switches threads efficiently, which is beneficial even when running multiple threads on a single-core CPU.

In I/O-intensive programs such as web crawlers, threads spend most of their time waiting and release the GIL while they wait, so multi-threaded performance under the GIL is not as bad as you might think.

  • CPU intensive

For CPU-intensive programs, the GIL is a serious problem. Such programs spend little time waiting, so threads rarely give up the GIL voluntarily; all tasks queue for a single core while the other cores sit idle. This is a poor use of a multi-core machine.
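The difference between the two workloads is easy to measure yourself. The sketch below is only an illustration: `io_task` simulates I/O with `time.sleep` (which releases the GIL while waiting), `cpu_task` is a pure-Python loop, and `timed` is a hypothetical helper. It assumes a standard CPython build with the GIL enabled, and the timings will vary by machine, so no expected numbers are shown.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_task(delay: float) -> float:
    """Simulate blocking I/O; the GIL is released while sleeping."""
    time.sleep(delay)
    return delay


def cpu_task(n: int) -> int:
    """Pure-Python arithmetic; the thread holds the GIL while computing."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(func, args_list, workers: int):
    """Run func over args_list on `workers` threads; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(func, args_list))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    workloads = (
        ("I/O-bound", io_task, [0.2] * 4),
        ("CPU-bound", cpu_task, [500_000] * 4),
    )
    for label, func, args in workloads:
        _, one = timed(func, args, workers=1)
        _, four = timed(func, args, workers=4)
        print(f"{label}: 1 thread {one:.2f}s, 4 threads {four:.2f}s")
```

On a GIL-enabled interpreter you should see the I/O-bound case speed up roughly in proportion to the number of threads, while the CPU-bound case gains little or nothing.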

7. Removing and optimizing the GIL

The GIL has always been controversial, and there have been repeated attempts and proposals to remove or optimize it. However, the complexity of the interpreter and the many libraries that depend on the GIL have made its removal a distant prospect.

  • Remove GIL

One attempt was made in 1999 for Python 1.5, with the free-threading patch by Greg Stein.

This patch removed the GIL completely and replaced it with fine-grained locks. However, removing the GIL came at a cost to single-threaded execution speed.

Single-threaded code ran approximately 40% slower. Two threads showed a speedup, but beyond that the gain did not scale linearly with the number of cores. Because of the slowdown, the patch was rejected and largely forgotten.

Multi-core was still a fantasy in 1999; today, removing the GIL is extremely difficult and the actual effect of removal is uncertain. The ecosystem has come too far to turn back easily.

  • Optimize GIL

In 2009, Antoine Pitrou implemented a new GIL, released in Python 3.2, with some positive results.

This was a major change. The old GIL counted Python instructions to decide when a thread should give up the lock.

But a single Python instruction can represent a large amount of work, so instruction counts are a poor measure of time. The new GIL instead uses a fixed timeout to tell the current thread to give up the lock, which makes switching between threads more predictable.
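The timeout used by the new GIL is exposed through the `sys` module as the "switch interval", with a documented default of 5 milliseconds. The sketch below reads it and changes it temporarily; `with_switch_interval` is a hypothetical helper written for this example, not a standard function.

```python
import sys


def with_switch_interval(seconds: float, func):
    """Call func() under a temporary switch interval, then restore the old one."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        # Always restore the previous value, even if func() raises.
        sys.setswitchinterval(old)


if __name__ == "__main__":
    print(f"current switch interval: {sys.getswitchinterval() * 1000:.1f} ms")
    inside = with_switch_interval(0.01, sys.getswitchinterval)
    print(f"inside the helper: {inside * 1000:.1f} ms")
```

A shorter interval makes threads hand over the GIL more often, at the cost of more switching overhead; for most programs the default is a reasonable choice.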

8. Solutions to GIL defects

As a popular, vigorous language, Python has not stood still in the multi-core era. Even with the limitations of the GIL, there are still many ways to make programs embrace multi-core.

  • Multiple processes

Python 2.6 introduced the multiprocessing module to make up for the limitations the GIL imposes on the threading module. A multi-process program gives each process its own interpreter and its own GIL, so processes do not compete for a single GIL and can run on multiple cores. The cost is that synchronization and communication between processes become harder.

  • Ctypes

One of CPython's strengths is its integration with C modules, so you can use ctypes to call a C dynamic library and move the computation there. ctypes releases the GIL while the C function runs, so the C code can make use of multiple cores.

  • Coroutine

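Coroutines are another useful tool. They run within a single thread and so do not use multiple cores by themselves, but they handle I/O-bound concurrency without contending for the GIL. Before Python 3.4 the standard library offered no coroutine framework, so third-party libraries such as gevent and Tornado filled the gap.

Python 3.4 added the asyncio module to the standard library, and Python 3.5 added the async/await syntax, giving coroutines first-class support.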
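Of the options above, the multi-process route is the most direct way to put several cores to work on pure-Python code. Here is a minimal sketch using `multiprocessing.Pool`; the workload `count_primes` and the chosen limits are made up for illustration, and the pool size of 4 is an assumption you would tune to your machine.

```python
from multiprocessing import Pool


def count_primes(limit: int) -> int:
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def main() -> None:
    limits = [20_000, 20_000, 20_000, 20_000]
    # Each worker is a separate process with its own interpreter and its own GIL.
    with Pool(processes=4) as pool:
        results = pool.map(count_primes, limits)
    print(results)


if __name__ == "__main__":
    # The guard is needed so that worker processes can import this module
    # without starting pools of their own.
    main()
```

Arguments and results are pickled and sent between processes, which is the communication cost mentioned above, so this pays off mainly when each task does a substantial amount of computation.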

9. Summary

The GIL remains one of the hardest technical challenges in Python. It is not a problem with the language itself: languages without a GIL simply push the problem of thread safety onto the user. Python's author instead chose to handle it inside the interpreter, giving users an elegant language.

Although the arrival of the multi-core era has exposed the shortcomings of the GIL, Python's decision-makers and community developers have taken many other measures to embrace multi-core. It is unwise to criticize the GIL without understanding this.

Just as the relations of production must adapt to the development of productive forces, it is one-sided to judge a mechanism's merits without its historical context. The GIL should be viewed dialectically.


syntaxbug.com © 2021 All Rights Reserved.