Python acceleration running skills

Python is a scripting language that has some shortcomings in efficiency and performance compared to compiled languages such as C/C++. However, there are many times when Python’s efficiency is not as exaggerated as imagined. This article summarizes some tips for speeding up the running of Python code.

0. Code optimization principles

This article will introduce many techniques to speed up the running of Python code. Before going into the details of code optimization, you need to understand some basic principles of code optimization.

The first basic principle: don’t optimize prematurely

Many people start writing code with the goal of performance optimization. “It is much easier to make a correct program faster than to make a fast program correct.” Therefore, the prerequisite for optimization is that the code can work properly. Optimizing prematurely may neglect grasping the overall performance indicators. Don’t reverse priorities before getting global results.

The second basic principle: weigh the cost of optimization

Optimization comes at a cost, and it is almost impossible to solve all performance problems. The choice usually faced is time for space or space for time. In addition, development costs also need to be considered.

The third principle: Don’t optimize those irrelevant parts

If every part of the code were to be optimized, these changes would make the code difficult to read and understand. If your code is running slowly, first find where the code is slow, usually the inner loop, and focus on optimizing where it is slow. Elsewhere, a little loss of time makes little difference.

1. Avoid global variables

# This writing method is not recommended. Code time: 26.8 seconds
import math

size=10000
for x in range(size):
    for y in range(size):
        z = math.sqrt(x) + math.sqrt(y)

Many programmers will start by writing some simple scripts in Python. When writing scripts, they are usually accustomed to writing them directly as global variables, such as the above code. However, due to the different implementations of global variables and local variables, code defined in the global scope will run much slower than code defined in a function. By putting script statements into functions, you can typically get a 15% – 30% speedup.

# Recommended writing method. Code time: 20.6 seconds
import math

def main(): # Define it into a function to reduce the use of all variables
    size=10000
    for x in range(size):
        for y in range(size):
            z = math.sqrt(x) + math.sqrt(y)

main()

2. Avoid .

2.1 Avoid module and function attribute access

# This writing method is not recommended. Code time: 14.5 seconds
import math

def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(math.sqrt(i))
    return result

def main():
    size=10000
    for _ in range(size):
        result = computeSqrt(size)

main()

Each time . (attribute access operator) is used, specific methods, such as __getattribute__() and __getattr__(), will be triggered. Dictionary operations are performed, so additional time overhead is incurred. Attribute access can be eliminated through the from import statement.

# Optimize the writing method for the first time. Code time: 10.9 seconds
from math import sqrt

def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(sqrt(i)) # Avoid the use of math.sqrt
    return result

def main():
    size=10000
    for _ in range(size):
        result = computeSqrt(size)

main()

In Section 1, we mentioned that the search for local variables will be faster than that of global variables, so for frequently accessed variables sqrt, you can speed up the operation by changing it to a local variable.

# Second optimization of writing. Code time: 9.9 seconds
import math

def computeSqrt(size: int):
    result = []
    sqrt = math.sqrt #Assign value to local variable
    for i in range(size):
        result.append(sqrt(i)) # Avoid the use of math.sqrt
    return result

def main():
    size=10000
    for _ in range(size):
        result = computeSqrt(size)

main()

In addition to math.sqrt, there is also . in the computeSqrt function, which is the that calls list >append method. By assigning this method to a local variable, the use of . inside the for loop in the computeSqrt function can be completely eliminated.

# Recommended writing method. Code time: 7.9 seconds
import math

def computeSqrt(size: int):
    result = []
    append = result.append
    sqrt = math.sqrt #Assign value to local variable
    for i in range(size):
        append(sqrt(i)) # Avoid the use of result.append and math.sqrt
    return result

def main():
    size=10000
    for _ in range(size):
        result = computeSqrt(size)

main()

2.2 Avoid intra-class attribute access

# This writing method is not recommended. Code time: 10.4 seconds
import math
from typing import List

classDemoClass:
    def __init__(self, value: int):
        self._value = value
    
    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        for _ in range(size):
            append(sqrt(self._value))
        return result

def main():
    size=10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        result = demo_instance.computeSqrt(size)

main()

The principle of avoiding . also applies to in-class properties. Accessing self._value will be slower than accessing a local variable. By assigning frequently accessed intra-class properties to a local variable, you can improve code running speed.

# Recommended writing method. Code time: 8.0 seconds
import math
from typing import List

classDemoClass:
    def __init__(self, value: int):
        self._value = value
    
    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        value = self._value
        for _ in range(size):
            append(sqrt(value)) # Avoid the use of self._value
        return result

def main():
    size=10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        demo_instance.computeSqrt(size)

main()

3. Avoid unnecessary abstractions

# Not recommended, code takes 0.55 seconds
classDemoClass:
    def __init__(self, value: int):
        self.value = value

    @property
    def value(self) -> int:
        return self._value

    @value.setter
    def value(self, x: int):
        self._value = x

def main():
    size=1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i

main()

Any time you wrap code with additional layers of processing (such as decorators, property access, descriptors), you’re going to make the code slower. In most cases, it is necessary to re-examine the definition of using property accessors. Using getter/setter functions to access properties is usually a coding style left by C/C++ programmers. If it’s really not necessary, use simple attributes.

# Recommended writing method, code time: 0.33 seconds
classDemoClass:
    def __init__(self, value: int):
        self.value = value # Avoid unnecessary property accessors

def main():
    size=1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i

main()

4. Avoid data duplication

4.1 Avoid meaningless data copy

# This writing method is not recommended. The code takes 6.5 seconds.
def main():
    size=10000
    for _ in range(size):
        value = range(size)
        value_list = [x for x in value]
        square_list = [x * x for x in value_list]

main()

The value_list in the above code is completely unnecessary and would create unnecessary data structures or copies.

# Recommended writing method, code time: 4.8 seconds
def main():
    size=10000
    for _ in range(size):
        value = range(size)
        square_list = [x * x for x in value] # Avoid meaningless copying

main()

Another situation is that you are too paranoid about Python’s data sharing mechanism, do not understand or trust Python’s memory model well, and abuse functions such as copy.deepcopy(). Usually the copy operation can be eliminated in these codes.

4.2 Do not use intermediate variables when exchanging values

# Not recommended, code takes 0.07 seconds
def main():
    size=1000000
    for _ in range(size):
        a=3
        b = 5
        temp=a
        a = b
        b=temp

main()

The above code creates a temporary variable temp when exchanging values. Without the help of intermediate variables, the code is more concise and runs faster.

# Recommended writing method, code time: 0.06 seconds
def main():
    size=1000000
    for _ in range(size):
        a=3
        b = 5
        a, b = b, a # without intermediate variables

main()

4.3 Use join instead of + for string concatenation

# This writing method is not recommended. The code takes 2.6 seconds.
import string
from typing import List

def concatString(string_list: List[str]) -> str:
    result = ''
    for str_i in string_list:
        result + = str_i
    return result

def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)

main()

When using a + b to splice strings, since strings in Python are immutable objects, they will allocate a memory space and combine a and b are copied to the newly applied memory space respectively. Therefore, if you want to splice n strings, n-1 intermediate results will be generated. Each intermediate result needs to apply for and copy memory, which seriously affects the operating efficiency. . When using join() to splice strings, the total memory space that needs to be applied for will be calculated first, then the required memory will be applied for all at once, and each string element will be copied to the memory. go.

# Recommended writing method, code time: 0.3 seconds
import string
from typing import List

def concatString(string_list: List[str]) -> str:
    return ''.join(string_list) # use join instead of +

def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)

main()

5. Utilize the short-circuit characteristics of if conditions

# Not recommended, code takes 0.05 seconds
from typing import List

def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    abbr_count = 0
    result = ''
    for str_i in string_list:
        if str_i in abbreviations:
            result + = str_i
    return result

def main():
    for _ in range(10000):
        string_list = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        result = concatString(string_list)

main()

The short-circuit characteristic of if conditions refers to statements such as if a and b. When a is False, Return directly without calculating b; for statements such as if a or b, when a is True Will be returned directly without calculating b. Therefore, in order to save running time, for the or statement, variables with a higher probability of being True should be written before or, and and should be pushed back.

# Recommended writing method, code takes 0.03 seconds
from typing import List

def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    abbr_count = 0
    result = ''
    for str_i in string_list:
        if str_i[-1] == '.' and str_i in abbreviations: # Take advantage of the short circuit characteristic of if condition
            result + = str_i
    return result

def main():
    for _ in range(10000):
        string_list = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        result = concatString(string_list)

main()

6. Loop optimization

6.1 Use for loop instead of while loop

# This writing method is not recommended. Code time: 6.7 seconds
def computeSum(size: int) -> int:
    sum_ = 0
    i = 0
    while i < size:
        sum_ + = i
        i+=1
    return sum_

def main():
    size=10000
    for _ in range(size):
        sum_ = computeSum(size)

main()

Python’s for loop is much faster than the while loop.

# Recommended writing method. Code time: 4.3 seconds
def computeSum(size: int) -> int:
    sum_ = 0
    for i in range(size): # for loop instead of while loop
        sum_ + = i
    return sum_

def main():
    size=10000
    for _ in range(size):
        sum_ = computeSum(size)

main()

6.2 Use implicit for loops instead of explicit for loops

For the above example, you can go one step further and use an implicit for loop instead of an explicit for loop.

# Recommended writing method. Code time: 1.7 seconds
def computeSum(size: int) -> int:
    return sum(range(size)) # Implicit for loop instead of explicit for loop

def main():
    size=10000
    for _ in range(size):
        sum = computeSum(size)

main()

6.3 Reduce the calculation of the inner for loop

# This writing method is not recommended. Code time: 12.8 seconds
import math

def main():
    size=10000
    sqrt = math.sqrt
    for x in range(size):
        for y in range(size):
            z = sqrt(x) + sqrt(y)

main()

In the above code, sqrt(x) is located inside the for loop. It will be recalculated during each training process, which increases the time overhead.

# Recommended writing method. Code time: 7.0 seconds
import math

def main():
    size=10000
    sqrt = math.sqrt
    for x in range(size):
        sqrt_x = sqrt(x) # Reduce the calculation of the inner for loop
        for y in range(size):
            z = sqrt_x + sqrt(y)

main()

7. Use numba.jit

We follow the example introduced above and use numba.jit on this basis. numba can JIT compile Python functions into machine code for execution, greatly improving the code running speed. For more information about numba see the home page below:

http://numba.pydata.org/numba.pydata.org

# Recommended writing method. Code time: 0.62 seconds
import numba

@numba.jit
def computeSum(size: float) -> int:
    sum = 0
    for i in range(size):
        sum + = i
    return sum

def main():
    size=10000
    for _ in range(size):
        sum = computeSum(size)

main()

8. Choose the appropriate data structure

Python’s built-in data structures such as str, tuple, list, set, dict The bottom layer is implemented in C, which is very fast. It is almost impossible to implement a new data structure yourself and achieve the built-in speed in terms of performance.

list is similar to std::vector in C++ and is a dynamic array. It will pre-allocate a certain amount of memory space. When the pre-allocated memory space is used up and elements are added to it, it will apply for a larger memory space, then copy all the original elements there, and then destroy the previous memory. space before inserting new elements. The operation is similar when deleting elements. When the used memory space is less than half of the pre-allocated memory space, an additional small memory will be applied for, an element copy will be made, and then the original large memory space will be destroyed. Therefore, if there are frequent addition and deletion operations, and the number of added and deleted elements is large, the efficiency of the list will not be high. In this case, you should consider using collections.deque. collections.deque is a double-ended queue that has the characteristics of both stack and queue. It can perform insertion and deletion operations of O(1) complexity at both ends.

The search operation of list is also very time-consuming. When you need to frequently find certain elements in list, or frequently access these elements in order, you can use bisect to maintain the order of list objects and store them in order. A binary search is performed to improve search efficiency.

Another common requirement is to find minimum or maximum values. At this time, you can use the heapq module to convert list into a heap, so that the time complexity of obtaining the minimum value is O(1).

The following web page gives the time complexity of various operations on commonly used Python data structures:

TimeComplexity – Python Wikiwiki.python.org

Reference materials

  • https://zhuanlan.zhihu.com/p/143052860

  • David Beazley & Brian K. Jones. Python Cookbook, Third edition. O’Reilly Media, ISBN: 9781449340377, 2013.

  • Zhang Ying & Lai Yonghao. Writing high-quality code: 91 suggestions for improving Python programs. Machinery Industry Press, ISBN: 9787111467045, 2014.

The knowledge points of the article match the official knowledge files, and you can further learn relevant knowledge. Python entry skill treeHomepageOverview 386872 people are learning the system