cpuinfo library: generate C++ interface using Python

Article directory

    • 1. Purpose
    • 2. Design
    • 3. Generator: Python code
    • 4. Result: C++ code


1. Purpose

Background: The cpuinfo library provides CPU information queries and covers common instruction set architectures such as x86 and ARM. It can also be used to verify the correctness of a self-implemented version of the basic CPU query functions. The point of a self-implementation is to simplify the functionality while writing an original implementation instead of copying the open source code. "Original" here means working through the design oneself.

The cpuinfo library is a C library officially maintained by PyTorch. In other words, both its interface and its implementation are C-style. The advantage is that the code is easier to understand; the disadvantage is that using it is more verbose, especially at the beginning and end of use, where initialization and deinitialization must be done manually:

  • cpuinfo_initialize()
  • cpuinfo_deinitialize()

If you forget to initialize or deinitialize, errors follow easily. For ease of use, it is advisable to wrap the API functions from cpuinfo.h in C++: provide a class CpuInfo that performs initialization and deinitialization in its constructor and destructor, so that callers of the CpuInfo class do not need to know about these steps at all.

The final code screenshot is as follows:

2. Design

The naive approach is to find each API function, copy them into the cpuinfo.hpp file one by one, and hand-write a wrapper function for each. Disadvantages: too many manual steps, mistakes are easy to make, and a global change (such as adding a newline everywhere) is very troublesome.

Where it is easy to go wrong, it will go wrong!

Since about 50 functions starting with cpuinfo_ are involved, consider using Python to scan the cpuinfo.h file, generate the CpuInfo class automatically, and write it to the cpuinfo.hpp file. The trade-offs of this approach are also clear:

  • If the cpuinfo library is updated upstream, with APIs added or removed or their parameter lists and return types changed, the wrapper can be regenerated automatically, avoiding error-prone manual edits
  • You need some proficiency in Python and must correctly parse each original API declaration, including:
    • return type
    • function name
    • parameter list
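
As a rough illustration of the three pieces listed above, the following sketch parses declarations with a regular expression. It is not the generator used in this article (that one appears in section 3 and uses plain string splitting). The sample header text, the pattern `DECL_RE`, and the function `parse_declaration` are all hypothetical names made up for this example, and the sample lines only imitate the style of cpuinfo.h.

```python
import re

# Sample lines in the style of cpuinfo.h (an assumption for illustration;
# the real header may differ between versions).
SAMPLE_HEADER = """\
#define CPUINFO_ABI
bool CPUINFO_ABI cpuinfo_initialize(void);
void CPUINFO_ABI cpuinfo_deinitialize(void);
const struct cpuinfo_processor* CPUINFO_ABI cpuinfo_get_processor(uint32_t index);
"""

# Groups: return type, then the CPUINFO_ABI marker, function name, parameters.
DECL_RE = re.compile(
    r"^(?P<ret>.+?)\s+CPUINFO_ABI\s+(?P<name>\w+)\s*\((?P<params>.*)\)\s*;$"
)


def parse_declaration(line):
    """Return (return_type, name, [(param_type, param_name), ...]) or None."""
    m = DECL_RE.match(line.strip())
    if m is None:
        return None  # not an exported declaration (e.g. the #define line)
    params = []
    raw = m.group("params").strip()
    if raw and raw != "void":
        for item in raw.split(","):
            # The trailing identifier is the parameter name; the rest is its type.
            pm = re.match(r"^(.*?)(\w+)$", item.strip())
            if pm is None:
                raise ValueError("cannot parse parameter: " + item)
            params.append((pm.group(1).strip(), pm.group(2)))
    return m.group("ret"), m.group("name"), params


for text in SAMPLE_HEADER.splitlines():
    print(parse_declaration(text))
```

This sketch assumes every parameter is named and that each declaration fits on one line; function-pointer parameters or multi-line declarations would need a real parser.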

This is a bit like implementing a simple parser as a small homework assignment in a compilers course.

Specific design:

  • CodeWriter class: responsible for writing code, including indentation, spaces, saving the file, etc.
  • Parsing the C API functions: parse out the return type, function name, and parameter list
  • Emitting the function calls: handled API by API
  • Special handling: for the constructor and destructor, the return statement and the return type are not needed and must not be written
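
To make the special handling concrete, here is a minimal sketch of how one wrapper method could be emitted from already-parsed pieces. The function `emit_wrapper` and its signature are hypothetical and exist only for this example; the actual generator in section 3 interleaves these steps with its CodeWriter. Skipping `return` for `void` functions is also an assumption of this sketch, not something the generator below does.

```python
def emit_wrapper(return_type, c_name, params, indent=4):
    """Return the lines of one CpuInfo member that wraps the C function c_name.

    params is a list of (type, name) pairs; an empty list stands for (void).
    """
    pad = " " * indent
    args_decl = ", ".join("{} {}".format(t, n) for t, n in params)
    args_call = ", ".join(n for _, n in params)

    if c_name == "cpuinfo_initialize":
        # Constructor: no return type in the header, no return statement.
        header, prefix = "CpuInfo()", ""
    elif c_name == "cpuinfo_deinitialize":
        # Destructor: same rule as the constructor.
        header, prefix = "~CpuInfo()", ""
    else:
        cpp_name = c_name[len("cpuinfo_"):]  # drop the C-style prefix
        header = "{} {}({})".format(return_type, cpp_name, args_decl)
        prefix = "" if return_type == "void" else "return "

    return [
        pad + header,
        pad + "{",
        pad * 2 + "{}{}({});".format(prefix, c_name, args_call),
        pad + "}",
    ]


print("\n".join(emit_wrapper("bool", "cpuinfo_initialize", [])))
print("\n".join(emit_wrapper("const struct cpuinfo_core*", "cpuinfo_get_core",
                             [("uint32_t", "index")])))
```

Returning a list of lines instead of writing to a file keeps the special cases easy to check in isolation.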

3. Generator: Python code

Python code:

# Author: ChrisZZ <[email protected]>
# Homepage: https://github.com/zchrissirhcz

class CodeWriter(object):
    """Collects output lines and tracks the current indentation level."""

    def __init__(self, indent_len):
        self.lines = []
        self.indent_num = 0           # current nesting depth
        self.indent_len = indent_len  # spaces per indentation level

    def write(self, content):
        padding = (self.indent_len * self.indent_num) * ' '
        line = padding + content
        self.lines.append(line)

    def save(self, filename):
        with open(filename, "w") as fout:
            for line in self.lines:
                fout.write(line + "\n")

    def tab(self):
        self.indent_num += 1

    def backspace(self):
        if self.indent_num > 0:
            self.indent_num -= 1

header_path = "/home/zz/work/cpuinfo/include/cpuinfo.h"

# Collect the exported declarations: lines that carry the CPUINFO_ABI marker,
# excluding the #define of the marker itself.
s1 = []
with open(header_path, "r") as fin:
    for line in fin:
        line = line.rstrip()
        if (" CPUINFO_ABI " in line) and ("define" not in line):
            s1.append(line)

class Param(object):
    def __init__(self, type_str, name):
        self.type_str = type_str
        self.name = name

class FunctionDeclaration(object):
    def __init__(self, line: str, delimiter="CPUINFO_ABI"):
        # Expected shape: "<return type> CPUINFO_ABI <name>(<params>);"
        items = line.split(delimiter)
        self.return_type = items[0].rstrip()
        # Drop the trailing ';'
        function_name_and_param_lst = items[1].strip()[0:-1]
        self.c_function_name = function_name_and_param_lst.split('(')[0]
        self.cpp_function_name = self.c_function_name[len("cpuinfo_"):]

        # Drop the closing ')'
        self.param_lst_str = function_name_and_param_lst.split('(')[1][:-1]
        self.param_lst = []
        param_kv_lst = self.param_lst_str.split(',')
        for param_kv in param_kv_lst:
            # The last token is the parameter name; the rest is its type
            param_kv_items = param_kv.strip().split(' ')
            param_name = param_kv_items[-1]
            param_type_str = ' '.join(param_kv_items[0:-1])
            param = Param(param_type_str, param_name)
            self.param_lst.append(param)


w = CodeWriter(4)
w.write("#include <cpuinfo.h>")
w.write("")
w.write("class CpuInfo")
w.write("{")
w.write("public:")
w.tab()
for line in s1:
    fd = FunctionDeclaration(line)

    # Constructor and destructor have no return type and no return statement
    require_return = True
    if fd.c_function_name == "cpuinfo_initialize":
        declaration = "CpuInfo()"
        require_return = False
    elif fd.c_function_name == "cpuinfo_deinitialize":
        declaration = "~CpuInfo()"
        require_return = False
    else:
        declaration = "{:s} {:s}".format(fd.return_type, fd.cpp_function_name)
        if fd.param_lst_str == "void":
            declaration += "()"
        else:
            declaration += "({:s})".format(fd.param_lst_str)

    w.write(declaration)
    w.write('{')
    w.tab()
    call = ""
    if require_return:
        call = "return "  # trailing space separates it from the callee name
    call += "{:s}".format(fd.c_function_name)
    if fd.param_lst_str == "void":
        call += "();"
    else:
        # Forward the parameter names only, without their types
        call += "("
        index = 0
        for param in fd.param_lst:
            if index > 0:
                call += ", "
            call += "{:s}".format(param.name)
            index += 1
        call += ");"
    w.write(call)
    w.backspace()
    w.write('}')
w.backspace()
w.write("};")
w.save("tests/cpuinfo.hpp")

4. Result: C++ code

Generated cpuinfo.hpp C++ code:

#include <cpuinfo.h>

class CpuInfo
{
public:
    CpuInfo()
    {
        cpuinfo_initialize();
    }
    ~CpuInfo()
    {
        cpuinfo_deinitialize();
    }
    const struct cpuinfo_processor* get_processors()
    {
        return cpuinfo_get_processors();
    }
    const struct cpuinfo_core* get_cores()
    {
        return cpuinfo_get_cores();
    }
    const struct cpuinfo_cluster* get_clusters()
    {
        return cpuinfo_get_clusters();
    }
    const struct cpuinfo_package* get_packages()
    {
        return cpuinfo_get_packages();
    }
    const struct cpuinfo_uarch_info* get_uarchs()
    {
        return cpuinfo_get_uarchs();
    }
    const struct cpuinfo_cache* get_l1i_caches()
    {
        return cpuinfo_get_l1i_caches();
    }
    const struct cpuinfo_cache* get_l1d_caches()
    {
        return cpuinfo_get_l1d_caches();
    }
    const struct cpuinfo_cache* get_l2_caches()
    {
        return cpuinfo_get_l2_caches();
    }
    const struct cpuinfo_cache* get_l3_caches()
    {
        return cpuinfo_get_l3_caches();
    }
    const struct cpuinfo_cache* get_l4_caches()
    {
        return cpuinfo_get_l4_caches();
    }
    const struct cpuinfo_processor* get_processor(uint32_t index)
    {
        return cpuinfo_get_processor(index);
    }
    const struct cpuinfo_core* get_core(uint32_t index)
    {
        return cpuinfo_get_core(index);
    }
    const struct cpuinfo_cluster* get_cluster(uint32_t index)
    {
        return cpuinfo_get_cluster(index);
    }
    const struct cpuinfo_package* get_package(uint32_t index)
    {
        return cpuinfo_get_package(index);
    }
    const struct cpuinfo_uarch_info* get_uarch(uint32_t index)
    {
        return cpuinfo_get_uarch(index);
    }
    const struct cpuinfo_cache* get_l1i_cache(uint32_t index)
    {
        return cpuinfo_get_l1i_cache(index);
    }
    const struct cpuinfo_cache* get_l1d_cache(uint32_t index)
    {
        return cpuinfo_get_l1d_cache(index);
    }
    const struct cpuinfo_cache* get_l2_cache(uint32_t index)
    {
        return cpuinfo_get_l2_cache(index);
    }
    const struct cpuinfo_cache* get_l3_cache(uint32_t index)
    {
        return cpuinfo_get_l3_cache(index);
    }
    const struct cpuinfo_cache* get_l4_cache(uint32_t index)
    {
        return cpuinfo_get_l4_cache(index);
    }
    uint32_t get_processors_count()
    {
        return cpuinfo_get_processors_count();
    }
    uint32_t get_cores_count()
    {
        return cpuinfo_get_cores_count();
    }
    uint32_t get_clusters_count()
    {
        return cpuinfo_get_clusters_count();
    }
    uint32_t get_packages_count()
    {
        return cpuinfo_get_packages_count();
    }
    uint32_t get_uarchs_count()
    {
        return cpuinfo_get_uarchs_count();
    }
    uint32_t get_l1i_caches_count()
    {
        return cpuinfo_get_l1i_caches_count();
    }
    uint32_t get_l1d_caches_count()
    {
        return cpuinfo_get_l1d_caches_count();
    }
    uint32_t get_l2_caches_count()
    {
        return cpuinfo_get_l2_caches_count();
    }
    uint32_t get_l3_caches_count()
    {
        return cpuinfo_get_l3_caches_count();
    }
    uint32_t get_l4_caches_count()
    {
        return cpuinfo_get_l4_caches_count();
    }
    uint32_t get_max_cache_size()
    {
        return cpuinfo_get_max_cache_size();
    }
    const struct cpuinfo_processor* get_current_processor()
    {
        return cpuinfo_get_current_processor();
    }
    const struct cpuinfo_core* get_current_core()
    {
        return cpuinfo_get_current_core();
    }
    uint32_t get_current_uarch_index()
    {
        return cpuinfo_get_current_uarch_index();
    }
    uint32_t get_current_uarch_index_with_default(uint32_t default_uarch_index)
    {
        return cpuinfo_get_current_uarch_index_with_default(default_uarch_index);
    }
};