Lecture 24 | Multi-threading and concurrency issues that cannot be ignored

Since we are talking about server-side development, we cannot avoid the topics of multi-threading and concurrency: without them it is impossible to build a network server at all, unless your project sits on top of an existing server such as Nginx or Apache.

What are the differences and connections between multithreading and concurrency?

When we talk about concurrency we also have to mention parallelism, so I will cover three concepts together: concurrency, parallelism, and multi-threading. As a beginner, you may not be clear on how multi-threading and concurrency differ and how they relate, so let's look at them one by one.

Concurrency arises when the computer has only one CPU but multiple threads need to run. The CPU cannot simply run one program to completion and only then start the next one; no one could tolerate that kind of efficiency. So a trick was devised.

The CPU divides its running time into small time slices and hands them out to the threads in turn. A thread that is not running is suspended; when its slice comes around, it wakes up and runs. The switching happens so fast that the threads appear to be running at the same time.

You can picture it like this: a chess master plays against ten opponents at once, taking them in turns. He starts with player No. 1, makes his move, walks over to player No. 2, makes a move there, and keeps rotating until he comes back around to player No. 1 for the next move. As long as the master plays fast enough and moves between boards fast enough, every opponent feels as if there were ten masters, one per board. In reality there is only one master playing; he is just moving very fast.

Parallelism is different from concurrency. Parallelism happens when there are multiple physical CPUs: it is true simultaneous execution, concurrency realized in physical hardware. In the analogy, parallelism really is several chess masters, each handling their own opponents. Of course, each CPU can still perform concurrent switching while the machine as a whole runs in parallel.

Multi-threading divides a single process into several units of execution. The threads within one process share the process's memory and resources, which makes communication between threads very convenient.

Multi-threading is like a chef in charge of three pots: one is braising ribs, one is cooking fish, and one is boiling noodles. The contents and the heat of the three pots differ, but all the seasonings and ingredients, the vegetables, oil, water, salt, MSG, sugar, soy sauce and so on, come from the same place (that is, the resources are shared). The chef himself is the process, and he is assigned three threads (the three pots), each cooking something different. The three dishes may not finish at the same time, but the chef knows when each one is done and when each needs attention. That is a metaphor for multi-threading.

When we write a network server, we must consider multi-threading and concurrency. Network concurrency is similar in spirit to CPU concurrency: it describes how many users the server can support logging in at the same time, or operating online at the same time.

Why does Python have problems when using multiple CPUs?

So let's come back to the question: why do Python, Ruby, or Node.js have trouble making use of multiple CPUs? Because their reference implementations are written in C/C++. Yes, that is where the problem lies.

Our subsequent content will still be written in Python, so let's look at Python's multi-threading problem first. Python has a GIL (Global Interpreter Lock), and the problem lies in the GIL.

Threads in the Python interpreter written in C (hereafter C-Python) are native operating-system threads: pthreads on Linux and Windows threads on Windows. Thread scheduling is handled entirely by the operating system.

A Python interpreter process has one main thread plus the user program's execution threads. Yet even on a multi-core CPU platform, the GIL prevents multiple threads from executing in parallel. Why is that?

Because the threads inside a Python interpreter process run in a cooperative multi-tasking fashion. When a thread hits an I/O (input/output) operation, it releases the GIL. A compute-intensive thread (one running compute-heavy logic) releases the GIL roughly every 100 interpreter "ticks". You can think of a tick as an instruction of the Python virtual machine; the tick count is independent of the length of the CPU's time slice. We can control when the GIL is released by changing this check interval through the sys module's setcheckinterval().
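
As a minimal sketch of that knob (note that setcheckinterval() belongs to the old GIL; it was deprecated in Python 3.2 and has since been removed from recent Python 3 releases):

import sys

# Old-GIL era (CPython 2.x): the check interval is the number of
# interpreter ticks executed between GIL release points.
print(sys.getcheckinterval())   # defaults to 100
sys.setcheckinterval(1000)      # release the GIL less frequently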

On a single-core CPU, these periodic checks simply produce a thread switch every few hundred ticks, and that works fine. On a multi-core CPU, it does not. Starting with Python 3.2, a new GIL implementation is used: instead of counting ticks, it uses a fixed timeout to tell the current thread to give up the global lock. When the current thread holds the lock and another thread requests it, the current thread is forced to release it after five milliseconds.
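
The new timeout can be inspected and tuned; a minimal sketch:

import sys

# New-GIL era (Python 3.2+): a switch interval replaces tick counting.
print(sys.getswitchinterval())   # 0.005 seconds, i.e. the 5 ms timeout
sys.setswitchinterval(0.01)      # let a thread hold the GIL up to 10 ms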

If we want real parallelism, Python's multi-threading will not get us there, so we can create independent processes instead. Python 2.6 and above ships the multiprocessing package for exactly this purpose; a sketch follows.
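
As a minimal sketch of that idea (the pool size and workload numbers here are illustrative, and the with-statement form of Pool requires Python 3.3+):

from multiprocessing import Pool

def my_counter(n):
    # Compute-intensive work; each worker process has its own GIL
    i = 0
    for _ in range(n):
        i += 1
    return i

if __name__ == '__main__':
    with Pool(processes=2) as pool:   # two worker processes
        results = pool.map(my_counter, [10000000, 10000000])
    print(sum(results))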

We can also write the performance-critical parts in C/C++ as a Python extension, and use ctypes to let the Python program directly call the exported functions of the dynamic library compiled from the C code.
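
A minimal sketch of that approach; the library name libcounter.so and its exported function heavy_count are hypothetical, standing in for whatever you compile from your own C code:

import ctypes

# Load a shared library compiled from C; ctypes releases the GIL around
# calls made through CDLL, so the C code can run on another core.
lib = ctypes.CDLL('./libcounter.so')        # hypothetical library
lib.heavy_count.argtypes = [ctypes.c_long]  # declare the C signature
lib.heavy_count.restype = ctypes.c_long

print(lib.heavy_count(10000000))            # hypothetical exported function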

The C-Python GIL problem lives in C-Python's implementation language, C itself. The GIL exists to keep the Python interpreter running smoothly, so in effect multi-threading only simulates concurrency by switching threads. Even so, this can speed up I/O-intensive tasks. Why?

Because while a thread is waiting to read or write a file, it can release the GIL entirely and let other threads run. A compute-intensive task, on the other hand, may end up slower than a single thread. Why? Because the GIL is a global exclusive lock: it cannot make good use of a multi-core CPU, and instead it degrades multi-threading into a single thread doing context switches.

Let’s take a look at how single-threading compares to multi-threading in computationally intensive code.

Single-threaded version:

from threading import Thread
import time

def my_counter():
    # Compute-intensive work: pure Python arithmetic that holds the GIL
    i = 0
    for x in range(10000):
        i = i + 1
    return True

def run():
    start_time = time.time()
    for tt in range(2):
        t = Thread(target=my_counter)
        t.start()
        t.join()  # join inside the loop: each thread finishes before the next starts
    end_time = time.time()
    print("count time: {}".format(end_time - start_time))

if __name__ == '__main__':
    run()

Multi-threaded version:

from threading import Thread
import time

def my_counter():
    # The same compute-intensive work as before
    i = 0
    for x in range(10000):
        i = i + 1
    return True

def run():
    thread_array = {}
    start_time = time.time()
    for tt in range(2):
        t = Thread(target=my_counter)
        t.start()
        thread_array[tt] = t  # keep a handle; join only after all threads have started
    for i in range(2):
        thread_array[i].join()
    end_time = time.time()
    print("count time: {}".format(end_time - start_time))

if __name__ == '__main__':
    run()

Of course, we can also make the number in range() much larger to see a bigger difference.

Under the old scheme, when the tick count reaches the release threshold, the GIL is released and then immediately re-acquired. On a single CPU this causes no trouble. On multiple CPUs, however, just as the thread on the second CPU is being woken up to take the lock, the main thread on the first CPU has already grabbed it again and continues executing. The thread on the second CPU is woken up only to keep waiting: wait, wake, wait, wake. The result is that only one CPU is actually executing instructions, while the other CPUs' time is wasted. This is where the problem lies.

This, again, is a problem of the Python implementation written in C. If you use the Java implementation of Python (Jython) or the .NET implementation (IronPython), there is no GIL problem; they do not have a GIL at all. There is also the newer implementation PyPy (note, though, that PyPy retains a GIL). So these problems really come down to differences between implementations.

How to take advantage of multi-threading and concurrency as much as possible?

Let's try another approach. We stay with C-Python, but we want to exploit multi-threading and concurrency as much as possible. How should we do that?

Python 2.6 and above provides the multiprocessing package to work around the GIL's efficiency problem. The difference is that it uses multiple processes instead of multiple threads. Each process has its own independent GIL, so processes never contend with each other for one interpreter lock across CPUs; they are fully independent.

Of course, multiprocessing brings problems of its own. First of all, it makes data communication and synchronization between processes harder to implement in the program.

Take a counter as an example. Suppose we want multiple workers to accumulate into the same variable. With threading, we just declare a global variable and guard it with a threading.Lock context. With multiprocessing, processes cannot see each other's data, so we can only declare a Queue in the parent process and put/get through it, or fall back on shared memory, shared files, pipes, and so on. A minimal threading sketch of such a counter is shown below.
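
Here is that threading version as a minimal sketch (the count of 10000 and the two worker threads are illustrative):

from threading import Thread, Lock

counter = 0      # the shared global variable
lock = Lock()    # guards every update to counter

def add_one():
    global counter
    for _ in range(10000):
        with lock:            # Lock used as a context manager
            counter += 1

threads = [Thread(target=add_one) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 20000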

Now let's look at multiprocessing's solution for sharing data between processes.

from multiprocessing import Process, Queue

def f(q):
    # Runs in the child process; the Queue carries data back to the parent
    q.put([4031, 1024, 'my data'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())  # blocks until the child has put its data
    p.join()

Such a solution works, but it is less convenient to code; still, it is a serviceable workaround.
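
For completeness, here is a minimal sketch of the shared-memory alternative mentioned earlier, using multiprocessing.Value (the counts are again illustrative):

from multiprocessing import Process, Value

def worker(shared):
    for _ in range(10000):
        with shared.get_lock():   # the Value carries its own lock
            shared.value += 1

if __name__ == '__main__':
    total = Value('i', 0)         # 'i' = a C int living in shared memory
    procs = [Process(target=worker, args=(total,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(total.value)            # 20000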

Summary

Let’s wrap up today’s content.

First, we introduced several concepts. Concurrency is a single CPU switching between multiple threads' tasks. Parallelism is multiple CPUs running multiple threads' tasks simultaneously. A thread is an independent unit of execution within a process that shares all of the process's resources. Network concurrency is the number of users and tasks a server can carry at the same time.

Python implemented in C has the GIL lock problem, which makes its multi-threaded compute-intensive tasks inefficient. Solutions include using multiple processes, or switching to another implementation of the Python language, such as PyPy or Jython.