The real Python multithreading is here!


[CSDN editor’s note] IBM engineer Martin Heinz writes that Python is about to get true multithreading.

Original: https://martinheinz.dev/blog/97

Unauthorized reprinting is prohibited!

Author | Martin Heinz Editor | Meng Yidan

Translation Tools | ChatGPT

At 32 years old, Python still lacks true parallelism. That is about to change with a new feature called the “Per-Interpreter GIL”, which arrives in the upcoming Python 3.12 release. The release is still a few months away (expected in October 2023), but the relevant code is already in place, so we can take an early look at how to write truly concurrent Python code using the subinterpreter API.


Subinterpreter

First, let’s explain how the “Per-Interpreter GIL” solves Python’s lack of proper concurrency.

In Python, the GIL (Global Interpreter Lock) is a mutex that allows only one thread at a time to control the Python interpreter. This means that even if we create multiple threads (e.g. using the threading module), only one of them executes Python code at any given moment.
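The effect can be observed with a CPU-bound function. The following is a minimal sketch and is not taken from the original article; count_down, run_sequential, run_threaded, and timed are hypothetical names chosen for illustration. On a standard GIL-enabled CPython build, the threaded run typically takes about as long as the sequential one, although exact timings depend on the machine and the build.

```python
import threading
import time


def count_down(n):
    # Pure-Python, CPU-bound loop: only one thread at a time can execute it.
    while n > 0:
        n -= 1


def run_sequential(n, workers):
    # Run the same workload `workers` times, one after another.
    for _ in range(workers):
        count_down(n)


def run_threaded(n, workers):
    # Run the same workload in `workers` threads and wait for all of them.
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def timed(func, *args):
    # Return the wall-clock time taken by func(*args).
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


N, WORKERS = 2_000_000, 2
sequential_time = timed(run_sequential, N, WORKERS)
threaded_time = timed(run_threaded, N, WORKERS)
print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

If threads could run Python code in parallel, the threaded time would be expected to approach half of the sequential time with two workers.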

With the “Per-Interpreter GIL”, individual Python interpreters no longer share a single GIL. This level of isolation allows each sub-interpreter to truly run in parallel. We can therefore bypass Python’s concurrency limitations by spawning additional sub-interpreters, each with its own GIL.

A more detailed explanation can be found in PEP 684, which documents this feature/change: https://peps.python.org/pep-0684/#per-interpreter-state


Hands-on experience

Install

To try this feature, we need the latest version of Python, which has to be built from source:

# https://devguide.python.org/getting-started/setup-building/#unix-compiling
git clone https://github.com/python/cpython.git
cd cpython


./configure --enable-optimizations --prefix=$(pwd)/python-3.12
make -s -j2
./python
# Python 3.12.0a7 + (heads/main:22f3425c3d, May 10 2023, 12:52:07) [GCC 11.3.0] on linux
# Type "help", "copyright", "credits" or "license" for more information.

Where is the C-API?

Now that the latest version is installed, how do we use the subinterpreter? Can it be imported directly? No, as mentioned in PEP-684: “This is an advanced feature designed for a small subset of users of the C-API.”

Currently, the Per-Interpreter GIL feature is only available through the C-API, so there is no direct interface for Python developers. Such an interface is expected to arrive with PEP 554 which, if accepted, should land in Python 3.13. Until then, we have to figure out how to use subinterpreters ourselves.

With the help of some scattered documentation in the CPython codebase, we can take one of two approaches:

Use the _xxsubinterpreters module. It is implemented in C, hence the odd name, which also means we can’t easily inspect its code (at least not from Python);

Or use CPython’s test module, which contains example Interpreter (and Channel) classes used for testing.

# Choose one of these:
import _xxsubinterpreters as interpreters
from test.support import interpreters
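Which of these imports works depends on the Python build and on whether the test package is installed at all. As a rough sketch, and assuming only the standard importlib machinery, the check below reports which modules can be located; module_available is a hypothetical helper name, not part of CPython.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if an import spec for `name` can be found.

    For dotted names, find_spec imports the parent packages first.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. an install without `test`) lands here.
        return False


print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
for name in ("_xxsubinterpreters", "test.support.interpreters"):
    status = "available" if module_available(name) else "missing"
    print(f"{name}: {status}")
```

Both names are internal to CPython, so their presence may differ between versions and distributions.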

In the following demonstrations we mainly use the second approach. We have found the sub-interpreters, but we also need to borrow a few helper functions from CPython’s test module in order to pass code to a sub-interpreter:

from textwrap import dedent
import os

# https://github.com/python/cpython/blob/
# 15665d896bae9c3d8b60bd7210ac1b7dc533b093/Lib/test/test__xxsubinterpreters.py#L75
def _captured_script(script):
    # Wrap `script` so that its stdout is written to the write end of a pipe.
    r, w = os.pipe()
    # Re-indent continuation lines to match the {indented} placeholder below.
    indented = script.replace('\n', '\n                ')
    wrapped = dedent(f"""
        import contextlib
        with open({w}, 'w', encoding="utf-8") as spipe:
            with contextlib.redirect_stdout(spipe):
                {indented}
        """)
    return wrapped, open(r, encoding="utf-8")


def _run_output(interp, request, channels=None):
    # Run `request` in the sub-interpreter and return everything it printed.
    script, rpipe = _captured_script(request)
    with rpipe:
        interp.run(script, channels=channels)
        return rpipe.read()
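The pipe-based capture used by these helpers can be tried without any sub-interpreter. The sketch below applies the same wrapping technique but runs the wrapped script with exec() in the current interpreter, as a stand-in for interp.run; capture_stdout_via_pipe is a hypothetical name used only for this illustration.

```python
import os
from textwrap import dedent


def capture_stdout_via_pipe(script):
    # Create a pipe: the script writes to `w`, and we read from `r`.
    r, w = os.pipe()
    # Re-indent continuation lines so they line up inside the nested `with`.
    indented = script.replace("\n", "\n                ")
    wrapped = dedent(f"""
        import contextlib
        with open({w}, 'w', encoding="utf-8") as spipe:
            with contextlib.redirect_stdout(spipe):
                {indented}
        """)
    with open(r, encoding="utf-8") as rpipe:
        # Stand-in for interp.run(wrapped): execute in the current interpreter.
        exec(wrapped, {})
        # The write end was closed by the `with` block, so read() reaches EOF.
        return rpipe.read()


output = capture_stdout_via_pipe("print('hello')\nprint('world')")
print(repr(output))
# 'hello\nworld\n'
```

Because the script's output goes through the pipe rather than the shared stdout, the caller receives it as an ordinary string.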

Combining the interpreters module with the helper functions above, we can spawn our first sub-interpreter:

from test.support import interpreters


main = interpreters.get_main()
print(f"Main interpreter ID: {main}")
# Main interpreter ID: Interpreter(id=0, isolated=None)


interp = interpreters.create()


print(f"Sub-interpreter: {interp}")
# Sub-interpreter: Interpreter(id=1, isolated=True)


# https://github.com/python/cpython/blob/
# 15665d896bae9c3d8b60bd7210ac1b7dc533b093/Lib/test/test__xxsubinterpreters.py#L236
code = dedent("""
            from test.support import interpreters
            cur = interpreters.get_current()
            print(cur.id)
            """)


out = _run_output(interp, code)


print(f"All Interpreters: {interpreters.list_all()}")
# All Interpreters: [Interpreter(id=0, isolated=None), Interpreter(id=1, isolated=None)]
print(f"Output: {out}") # Result of 'print(cur.id)'
# Output: 1

One way to spawn and run a new interpreter is to call the create function and then pass the interpreter to the _run_output helper together with the code to execute.

An easier way is:

interp = interpreters.create()
interp.run(code)

This calls the interpreter’s run method directly. However, if we run either of the snippets above, we get the following error:

Fatal Python error: PyInterpreterState_Delete: remaining subinterpreters
Python runtime state: finalizing (tstate=0x000055b5926bf398)

To avoid this error, we also need to clean up any dangling interpreters:

def cleanup_interpreters():
    for i in interpreters.list_all():
        if i.id == 0:  # main
            continue
        try:
            print(f"Cleaning up interpreter: {i}")
            i.close()
        except RuntimeError:
            pass  # already destroyed


cleanup_interpreters()
# Cleaning up interpreter: Interpreter(id=1, isolated=None)
# Cleaning up interpreter: Interpreter(id=2, isolated=None)

Threads

While it is possible to run code using the above helper functions, it may be more convenient to use the familiar interface from the threading module:

import threading


def run_in_thread():
    t = threading.Thread(target=interpreters.create)
    print(t)
    t.start()
    print(t)
    t.join()
    print(t)


run_in_thread()
run_in_thread()


# <Thread(Thread-1 (create), initial)>
# <Thread(Thread-1 (create), started 139772371633728)>
# <Thread(Thread-1 (create), stopped 139772371633728)>
# <Thread(Thread-2 (create), initial)>
# <Thread(Thread-2 (create), started 139772371633728)>
# <Thread(Thread-2 (create), stopped 139772371633728)>

Passing the interpreters.create function to Thread automatically spawns a new sub-interpreter inside the thread.

We can also combine the two approaches and pass a helper function to threading.Thread:

import time


def run_in_thread():
    interp = interpreters.create(isolated=True)
    t = threading.Thread(target=_run_output, args=(interp, dedent("""
            import _xxsubinterpreters as _interpreters
            cur = _interpreters.get_current()

            import time
            time.sleep(2)
            # Can't print from here, won't bubble-up to main interpreter

            assert isinstance(cur, _interpreters.InterpreterID)
            """)))
    print(f"Created Thread: {t}")
    t.start()
    return t




t1 = run_in_thread()
print(f"First running Thread: {t1}")
t2 = run_in_thread()
print(f"Second running Thread: {t2}")
time.sleep(4)  # Need to sleep to give Threads time to complete
cleanup_interpreters()

Here we also demonstrate the _xxsubinterpreters module in place of the one in test.support. Each thread sleeps for 2 seconds to simulate some “work”. Note that we never call join() to wait for the threads; instead, the main thread sleeps long enough for them to finish and then cleans up the interpreters.
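The fixed time.sleep(4) works in this example because each thread runs for about two seconds. When the run time is not known in advance, joining the threads is usually more robust. The sketch below shows the pattern with plain threads; simulated_work is a hypothetical stand-in for _run_output(interp, code) and only sleeps.

```python
import threading
import time


def simulated_work(seconds):
    # Hypothetical stand-in for _run_output(interp, code); it only sleeps.
    time.sleep(seconds)


threads = [threading.Thread(target=simulated_work, args=(0.2,)) for _ in range(2)]
for t in threads:
    t.start()

# join() blocks until each thread has finished, however long that takes.
for t in threads:
    t.join()

all_finished = not any(t.is_alive() for t in threads)
print(f"All threads finished: {all_finished}")
# All threads finished: True
```

In the sub-interpreter example, a cleanup step such as cleanup_interpreters() would follow the join() calls.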

Channels

If we dig deeper into the CPython test module, we also find implementations of the RecvChannel and SendChannel classes, which resemble the channels known from Go. To use them:

# https://github.com/python/cpython/blob/
# 15665d896bae9c3d8b60bd7210ac1b7dc533b093/Lib/test/test_interpreters.py#L583
r, s = interpreters.create_channel()


print(f"Channel: {r}, {s}")
# Channel: RecvChannel(id=0), SendChannel(id=0)


orig = b'spam'
s.send_nowait(orig)
obj = r.recv()
print(f"Received: {obj}")
# Received: b'spam'


cleanup_interpreters()
# Cleanup is needed; otherwise we get:
# free(): invalid pointer
# Aborted (core dumped)

This example shows how to create a channel with a receiver (r) end and a sender (s) end. We pass data into the sender with send_nowait and read it on the other side with recv. The channel is really just another sub-interpreter, so, as before, we need to clean up when we are done.
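For readers unfamiliar with Go-style channels, the send/receive pattern loosely resembles passing items through a queue between threads. The sketch below swaps in the standard-library queue.Queue purely as an analogy: it is not the channel API, and it only connects threads inside one interpreter. The names channel, received, and receiver are hypothetical.

```python
import queue
import threading

channel = queue.Queue()  # stand-in for the (r, s) channel pair
received = []


def receiver():
    # Blocks until an item arrives, loosely like r.recv().
    received.append(channel.get())


t = threading.Thread(target=receiver)
t.start()
channel.put_nowait(b"spam")  # loosely like s.send_nowait(orig)
t.join()

print(f"Received: {received[0]}")
# Received: b'spam'
```

The main difference is the boundary being crossed: the channels in the article pass data between interpreters, whereas a queue passes objects between threads that share one interpreter.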

Dig deeper

Finally, if we want to tweak the sub-interpreter options that are otherwise set in C code, we can use a function from the test.support module, namely run_in_subinterp_with_config:

import test.support


def run_in_thread(script):
    test.support.run_in_subinterp_with_config(
        script,
        use_main_obmalloc=True,
        allow_fork=True,
        allow_exec=True,
        allow_threads=True,
        allow_daemon_threads=False,
        check_multi_interp_extensions=False,
        own_gil=True,
    )


code = dedent("""
            from test.support import interpreters
            cur = interpreters.get_current()
            print(cur)
            """)


run_in_thread(code)
# Interpreter(id=7, isolated=None)
run_in_thread(code)
# Interpreter(id=8, isolated=None)

This function is a Python wrapper around the underlying C functions. It exposes some sub-interpreter options, such as own_gil, which specifies whether the sub-interpreter should have its own GIL.


Summary

As you can see, the API is not simple to use. Unless you already have C expertise or desperately want to use subinterpreters, it is best to wait for the release of Python 3.13. Alternatively, you can try the extrainterpreters project, which provides a friendlier Python API for working with subinterpreters.
