[Link loading and libraries] Organization of Linux shared libraries

Because dynamic linking has many advantages, a large number of programs have adopted it, and the number of shared objects in a system has grown very large. Without a good way to organize them, shared object files would be scattered across many directories, which makes long-term maintenance and upgrades difficult. Operating systems therefore generally define rules for how shared objects are organized into directories and how they are used. This chapter introduces the management of shared libraries under Linux.

Shared library version

  • Shared library compatibility

Developers of shared libraries constantly release new versions to fix bugs, add features, or improve performance. Thanks to the flexibility of dynamic linking, a program and the shared libraries it depends on can be developed and updated independently. However, an update to a shared library may change or remove part of its interface, and programs that depend on the library may then fail to run normally. In the simplest case, updates to shared libraries fall into two categories.

  1. Compatible updates. The update only adds content on top of the original shared library; all original interfaces remain unchanged.
  2. Incompatible updates. The update changes the original interface, so programs that use the original interface may fail to run or may run abnormally.

For shared libraries written in C, the main changes that alter the ABI are as follows:

  1. The behavior of an exported function changes, that is, calling the function produces different results than before.
  2. An exported function is removed.
  3. The structure of exported data changes. For example, the layout of a structure variable defined by the shared library changes because members are deleted or reordered, or because of any other change that alters the memory layout of the structure.
  4. The interface of an exported function changes, for example its return type or parameters.
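
The effect of a layout change (item 3) can be illustrated with a small sketch. It uses Python's ctypes module to describe two versions of a structure; the names PointV1 and PointV2 and their members are hypothetical and do not come from any real library. A program compiled against the first layout would expect x at offset 0, while a library built with the second layout stores it at offset 4.

```python
import ctypes


class PointV1(ctypes.Structure):
    # Layout exported by the (hypothetical) old version of the library.
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


class PointV2(ctypes.Structure):
    # The new version inserts a member at the front, so every later
    # member moves to a different offset.
    _fields_ = [
        ("id", ctypes.c_int32),
        ("x", ctypes.c_int32),
        ("y", ctypes.c_int32),
    ]


def layout(struct_type):
    """Return ({member name: byte offset}, total size in bytes)."""
    offsets = {
        name: getattr(struct_type, name).offset
        for name, _ctype in struct_type._fields_
    }
    return offsets, ctypes.sizeof(struct_type)


print(layout(PointV1))  # ({'x': 0, 'y': 4}, 8)
print(layout(PointV2))  # ({'id': 0, 'x': 4, 'y': 8}, 12)
```

Because compiled code refers to members by offset rather than by name, the old program and the new library would disagree about where each member lives.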

  • Shared library version naming

Linux has a set of naming rules for the shared libraries in a system. A shared library file name must have the following form:

libname.so.x.y.z

The name starts with the prefix “lib”, followed by the library name and the suffix “.so”, and ends with a version number made up of three numbers: “x” is the major version number, “y” is the minor version number, and “z” is the release version number. The three numbers have different meanings.

The major version number indicates a major upgrade of the library. Libraries with different major version numbers are incompatible with each other: programs that depend on the old major version must be modified where necessary and recompiled before they can run with the new version of the shared library.
The minor version number indicates an incremental upgrade, that is, new interface symbols are added while the existing symbols remain unchanged. With the same major version number, a library with a higher minor version number is backward compatible with one that has a lower minor version number.
The release version number indicates bug fixes, performance improvements, and similar changes. It adds no new interfaces and does not change existing ones.
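
As a rough illustration of the naming rule, the following sketch splits a file name of the form libname.so.x.y.z into its parts and applies the compatibility rule for major and minor version numbers. The type and function names are made up for this example.

```python
import re
from typing import NamedTuple


class LibVersion(NamedTuple):
    name: str
    major: int
    minor: int
    release: int


# lib + name + .so + three dot-separated numbers
_PATTERN = re.compile(
    r"^lib(?P<name>.+)\.so\.(?P<x>\d+)\.(?P<y>\d+)\.(?P<z>\d+)$"
)


def parse_library_filename(filename):
    """Split 'libname.so.x.y.z' into (name, major, minor, release)."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not of the form libname.so.x.y.z: {filename!r}")
    return LibVersion(
        match["name"], int(match["x"]), int(match["y"]), int(match["z"])
    )


def can_replace(new, old):
    """True if `new` keeps every interface that `old` offers.

    Same library, same major version, and a minor version that is at
    least as high. The release number does not affect the interface.
    """
    return (
        new.name == old.name
        and new.major == old.major
        and new.minor >= old.minor
    )
```

For instance, under this model libfoo.so.1.3.0 can stand in for libfoo.so.1.2.5, but libfoo.so.2.0.0 cannot.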

  • SO-NAME

The major version number and minor version number of a shared library determine the interface of a shared library. How does the dynamic linker know which shared libraries a program depends on and what their version numbers are?

Solaris and Linux generally use a naming mechanism called SO-NAME to record shared library dependencies. Each shared library has a corresponding SO-NAME, which is the library’s file name with the minor version number and release version number removed and only the major version number kept. For example, the SO-NAME of libfoo.so.2.6.1 is libfoo.so.2.
For each shared library, the system creates a soft link in the library’s directory whose name is the SO-NAME and which points to the library file.

So what is the use of creating a soft link with the name “SO-NAME”?

In fact, this soft link will point to the shared library in the directory with the same major version number and the latest minor version number and release version number.
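
The relationship between file names, SO-NAMEs, and soft link targets can be sketched as follows. This only models the naming rule described above on a plain list of file names; it is not how any real tool is implemented, and the helper names are hypothetical.

```python
import re

# Group 1 is the SO-NAME (name plus major version); groups 2 and 3 are
# the minor and release version numbers.
_VERSIONED = re.compile(r"^(lib.+\.so\.\d+)\.(\d+)\.(\d+)$")


def soname_of(filename):
    """Drop the minor and release numbers: libfoo.so.2.6.1 -> libfoo.so.2."""
    match = _VERSIONED.match(filename)
    if match is None:
        raise ValueError(f"not of the form libname.so.x.y.z: {filename!r}")
    return match.group(1)


def link_targets(filenames):
    """Map each SO-NAME to the file with the highest (minor, release)."""
    best = {}
    for filename in filenames:
        match = _VERSIONED.match(filename)
        if match is None:
            continue  # e.g. an existing soft link such as libfoo.so.2
        soname = match.group(1)
        version = (int(match.group(2)), int(match.group(3)))
        if soname not in best or version > best[soname][0]:
            best[soname] = (version, filename)
    return {soname: filename for soname, (_v, filename) in best.items()}
```

Version numbers are compared as integers, so 2.10.0 is correctly treated as newer than 2.6.1.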

The purpose of the SO-NAME soft link is to let every module that depends on a shared library refer to it by its SO-NAME, rather than by its full version number, when compiling, linking, and running.
When an ELF file is built, the SO-NAMEs of the shared libraries it depends on are saved in its “.dynamic” section. When the dynamic linker later looks for those dependencies, it finds the SO-NAME soft links in the system’s shared library directories, and these links lead it to the latest version of each shared library.

Linux provides a tool called “ldconfig”, which should be run whenever a shared library is installed or updated. It traverses all default shared library directories, such as /lib and /usr/lib, and updates every soft link so that it points to the latest version of its shared library; if a new shared library has been installed, ldconfig creates the corresponding soft link for it.

Symbol version

If a system holds several copies of a shared library with the same major version number but different minor version numbers, the dynamic linker uses the copy with the highest minor version number. A problem arises when even that copy has a lower minor version number than the one the program requires. The policy of SunOS 4.x is to warn the user that only a shared library with a lower minor version number is available, and then let the program continue to run. Systems with a more conservative policy refuse to run the program if no shared library with a high enough minor version number is available, in order to prevent unexpected behavior; on such systems, programs that need a newer minor version cannot run at all. We can call this the minor version number intersection problem.
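
The two policies can be summarized in a few lines. This is only a model of the decision described above, with hypothetical names and messages; it does not reproduce the code of any real dynamic linker.

```python
def check_minor_version(required_minor, available_minor, strict):
    """Decide whether a program may run; returns (may_run, message).

    strict=False models the permissive policy (warn and continue);
    strict=True models the conservative policy (refuse to run).
    """
    if available_minor >= required_minor:
        return True, "ok"
    detail = (
        f"need minor version {required_minor}, "
        f"found only {available_minor}"
    )
    if strict:
        return False, "refused: " + detail
    return True, "warning: " + detail


print(check_minor_version(3, 2, strict=False))
print(check_minor_version(3, 2, strict=True))
```

Neither outcome is satisfying: the permissive policy may run a program that calls a missing interface, and the strict policy may block a program that only uses interfaces the older library already has.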

The existence of SO-NAME does nothing to solve the minor version number intersection problem. Modern systems address it with a more sophisticated approach: the symbol version mechanism.

  • Symbol-based versioning mechanism

Starting from version 2.1, Glibc under Linux supports a scheme called the symbol-based versioning mechanism. The basic idea is to associate a version number with each exported and imported symbol. In practice it works in a way similar to symbol name decoration.

For example, when libfoo.so.1.2 is upgraded to 1.3, it keeps the SO-NAME libfoo.so.1, but the global symbols added in version 1.3 are marked with a label such as “VERS_1.3”. If the newly added global symbols are labeled in this way at every minor version upgrade, each symbol in the shared library carries a label that shows which version introduced it, such as “VERS_1.1”, “VERS_1.2”, “VERS_1.3”, or “VERS_1.4”.

  • Symbol versioning in Solaris

Solaris’s ld linker adds a version mechanism and a scope mechanism for shared libraries.

The idea of the version mechanism is simple: define named sets of symbols, such as “VERS_1.1” and “VERS_1.2”, each of which contains some designated symbols. Besides symbols, a set can also contain another set; for example, “VERS_1.2” can contain the set “VERS_1.1”.

In Solaris, programmers can write a file called a symbol version script when linking a shared library. At link time, the linker builds the shared library according to the script, setting up the symbol sets and the relationships between them.

Once the symbols of a shared library belong to version sets, the most obvious effect is that when we build (compile and link) an application, the linker can record in the program’s final output file the symbol version sets that the program depends on.
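
The idea of named symbol sets that contain other sets can be modeled with a small table. The sketch below assumes that each set inherits from at most one other set, and the library contents (add in one set, sub in the next) are invented for illustration. It also shows the check that becomes possible once a program records the sets it depends on.

```python
# Hypothetical version sets of one build of a library.
LIBRARY_SETS = {
    "VERS_1.1": {"symbols": {"add"}, "inherits": None},
    "VERS_1.2": {"symbols": {"sub"}, "inherits": "VERS_1.1"},
}


def symbols_in(version_sets, name):
    """All symbols of a set, including those of the sets it contains."""
    symbols = set()
    while name is not None:
        entry = version_sets[name]
        symbols |= entry["symbols"]
        name = entry["inherits"]
    return symbols


def unsatisfied(version_sets, required):
    """Version sets a program recorded that this library does not define."""
    return sorted(name for name in required if name not in version_sets)


print(sorted(symbols_in(LIBRARY_SETS, "VERS_1.2")))  # ['add', 'sub']
print(unsatisfied(LIBRARY_SETS, {"VERS_1.1", "VERS_1.3"}))  # ['VERS_1.3']
```

A program that recorded a dependency on VERS_1.3 can therefore be rejected precisely because that set is missing, without relying on the minor version number in the file name.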

  • GCC extensions to the Solaris symbol versioning mechanism

GCC also allows the symbol version to be specified with an assembler directive called “.symver”. The directive can be used in GAS assembly, or in GCC’s C/C++ source code by means of embedded assembly.

asm(".symver add, add@VERS_1.1");

  • Practice of symbol version mechanism in Linux system

Under Linux, when we link a shared library with ld, we can use the “--version-script” option; with GCC, we can use the “-Xlinker” option followed by “--version-script”, which passes “--version-script” on to the ld linker. For example, if the source file is “lib.c” and the symbol version script is “lib.ver”:

gcc -shared -fPIC lib.c -Xlinker --version-script lib.ver -o lib.so
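
A version script such as “lib.ver” lists the version sets, the symbols each one exports, and the set it inherits from. The sketch below generates such a script in GNU ld syntax from a small table. The symbol names add and sub and the set names are assumptions chosen to match the labels used earlier, not the contents of a real library, and the helper function is hypothetical.

```python
def render_version_script(version_sets):
    """Render GNU ld version script text.

    version_sets is a list of (set name, exported symbols, parent set or
    None), ordered so that a parent appears before the sets that inherit
    from it.
    """
    lines = []
    for name, symbols, parent in version_sets:
        lines.append(f"{name} {{")
        lines.append("    global:")
        lines.extend(f"        {symbol};" for symbol in symbols)
        if parent is None:
            # Hide every symbol not listed in any version set.
            lines.append("    local:")
            lines.append("        *;")
        lines.append(f"}} {parent};" if parent else "};")
    return "\n".join(lines) + "\n"


script = render_version_script([
    ("VERS_1.1", ["add"], None),
    ("VERS_1.2", ["sub"], "VERS_1.1"),
])
print(script, end="")
```

The printed text has one block per version set, with the second block closed by “} VERS_1.1;” to express that VERS_1.2 contains VERS_1.1. Saved as lib.ver, it could be passed to the linker with the command shown above.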

Shared library system path

At present, most open source operating systems, including Linux, follow a standard called FHS (Filesystem Hierarchy Standard). The standard specifies how system files should be laid out, including the structure, organization, and role of each directory, which helps promote compatibility between open source operating systems.
FHS specifies three main locations for storing shared libraries in a system:

  1. /lib: mainly stores the most critical and basic shared libraries of the system, such as the dynamic linker, the C runtime library, and the math library.
  2. /usr/lib: mainly stores key shared libraries that are not needed at system runtime, chiefly libraries used during development.
  3. /usr/local/lib: holds libraries that are not closely tied to the operating system itself, mainly libraries for third-party applications.

Shared library search process

When a dynamically linked ELF executable starts, the dynamic linker is started as well. On Linux systems the dynamic linker is /lib/ld-linux.so.X (X is the version number). The paths of the modules that a dynamically linked module depends on are stored in its “.dynamic” section, each as an entry of type DT_NEEDED. If DT_NEEDED holds an absolute path, the dynamic linker looks for the library at that path; if it holds a relative path, the dynamic linker searches /lib, /usr/lib, and the directories specified by the /etc/ld.so.conf configuration file.
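
The lookup rule can be sketched as a small function. This is a simplified model under stated assumptions: the set existing_files stands in for the real file system, the directories are tried in the order the text lists them, and the cache and environment variables are ignored. A real dynamic linker is more involved.

```python
import posixpath

DEFAULT_DIRS = ["/lib", "/usr/lib"]


def resolve(needed, conf_dirs, existing_files):
    """Return the path chosen for one DT_NEEDED entry, or None.

    needed         -- the string stored in the DT_NEEDED entry
    conf_dirs      -- directories taken from /etc/ld.so.conf
    existing_files -- set of paths that exist (hypothetical file system)
    """
    if posixpath.isabs(needed):
        # An absolute path is used as it is, with no searching.
        return needed if needed in existing_files else None
    for directory in DEFAULT_DIRS + list(conf_dirs):
        candidate = posixpath.join(directory, needed)
        if candidate in existing_files:
            return candidate
    return None


files = {"/usr/lib/libfoo.so.1", "/usr/local/lib/libbar.so.2"}
print(resolve("libfoo.so.1", ["/usr/local/lib"], files))
print(resolve("libbar.so.2", ["/usr/local/lib"], files))
```

The first call finds the library in a default directory; the second succeeds only because /usr/local/lib was supplied as a configured directory.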

ld.so.conf is a text configuration file that lists library directories, and it may include other configuration files that list further directories. For example:
/usr/local/lib
/lib/i486-linux-gnu
/usr/lib/i486-linux-gnu
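
How such a file expands into a list of directories can be sketched as follows. The sketch assumes the commonly seen layout in which a line starting with “include” names other files by a wildcard pattern and “#” starts a comment. The dictionary files stands in for the real file system, and the file names in it are examples only.

```python
import fnmatch


def read_conf(path, files, seen=None):
    """Return the directories listed by a configuration file.

    files maps a path to that file's text. Included files are expanded
    in sorted order, and each file is read at most once.
    """
    seen = set() if seen is None else seen
    if path in seen:
        return []
    seen.add(path)
    directories = []
    for raw_line in files.get(path, "").splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line.startswith("include "):
            pattern = line[len("include "):].strip()
            for included in sorted(fnmatch.filter(files, pattern)):
                directories.extend(read_conf(included, files, seen))
        else:
            directories.append(line)
    return directories


files = {
    "/etc/ld.so.conf": "include /etc/ld.so.conf.d/*.conf\n/usr/local/lib\n",
    "/etc/ld.so.conf.d/i486-linux-gnu.conf":
        "# Multiarch support\n/lib/i486-linux-gnu\n/usr/lib/i486-linux-gnu\n",
}
print(read_conf("/etc/ld.so.conf", files))
```

With this input the result contains the two i486-linux-gnu directories from the included file, followed by /usr/local/lib from the main file.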

Linux systems include a program called ldconfig. It creates, deletes, or updates the SO-NAME (that is, the corresponding symbolic link) of each shared library in the shared library directories, so that every SO-NAME points to the correct shared library file. It also collects these SO-NAMEs and stores them centrally in the /etc/ld.so.cache file, building an SO-NAME cache.

Environment variables

  • LD_LIBRARY_PATH

Linux systems provide several ways to change the paths that the dynamic linker uses to load shared libraries. The easiest is to set the LD_LIBRARY_PATH environment variable.

  • LD_PRELOAD

Another environment variable is LD_PRELOAD. In this variable we can specify shared libraries, or even object files, to be preloaded. The files listed in LD_PRELOAD are loaded before the dynamic linker searches for shared libraries according to its fixed rules, so they take priority over the shared libraries in the directories specified by LD_LIBRARY_PATH.
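
The combined effect of LD_PRELOAD and LD_LIBRARY_PATH can be modeled with a short sketch. It assumes that LD_LIBRARY_PATH is a colon-separated list of directories and that LD_PRELOAD entries are separated by spaces or colons. Empty entries are simply dropped here, which is a simplification, and all paths in the example are invented.

```python
def load_plan(env, default_dirs=("/lib", "/usr/lib")):
    """Describe what the dynamic linker would consider, in order.

    env is a dictionary of environment variables. Preloaded files come
    first; directories from LD_LIBRARY_PATH are searched before the
    default directories.
    """
    preload = env.get("LD_PRELOAD", "").replace(":", " ").split()
    extra_dirs = [
        directory
        for directory in env.get("LD_LIBRARY_PATH", "").split(":")
        if directory
    ]
    return {
        "preload": preload,
        "search_dirs": extra_dirs + list(default_dirs),
    }


plan = load_plan({
    "LD_PRELOAD": "/tmp/libhook.so",
    "LD_LIBRARY_PATH": "/opt/app/lib:/home/user/lib",
})
print(plan["preload"])
print(plan["search_dirs"])
```

In this model a symbol defined in /tmp/libhook.so would be seen before the same symbol in any library found through the search directories.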

  • LD_DEBUG

Another very useful environment variable is LD_DEBUG, which turns on the debugging output of the dynamic linker. When it is set, the dynamic linker prints various useful information at run time, which is a great help when developing and debugging shared libraries.

LD_DEBUG=files ./HelloWorld.out