The purpose of each tool in the GCC compilation toolchain, and the ELF file format

Article directory

  • 1. What is GCC
  • 2. Composition of GCC
  • 3. Use of GCC
    • 1. Create the test.c file and enter the code
    • 2. Preprocessing
    • 3. Compile into assembly code
    • 4. Assembly
    • 5. Linking
      • ① Common usage
      • ② Library link
        • Ⅰ Change test.c code
        • Ⅱ Link dynamic library and run
    • 6. Analyze ELF files

1. What is GCC

GCC stands for GNU Compiler Collection, a compiler suite that can compile multiple languages. Initially GCC was a C-only compiler (GNU C Compiler). Besides C, it now also supports C++, Objective-C, Fortran, Ada, Go and other languages. GCC supports multiple hardware platforms.

2. Composition of GCC

As shown in the picture:

3. Use of GCC

GCC compilation has four stages: preprocessing, compilation, assembly, and linking.

1. Create the test.c file and enter the code

//test.c
#include <stdio.h>
int main(void)
{
    printf("Hello World!\n");
    return 0;
}

2. Preprocessing

Preprocessing handles the preprocessor directives in the source code, such as #include, macro definitions, and conditional compilation. It expands and replaces the source text according to these directives and produces a preprocessed source file.

gcc -E test.c -o test.i or gcc -E

View the preprocessed test.i.

The file is still human-readable text at this point.

3. Compile into assembly code

During the compilation phase, the compiler translates the preprocessed source code into assembly language. It performs lexical analysis, syntax analysis and semantic analysis to generate intermediate code or assembly code.

gcc -S test.i -o test.s

4. Assembly

During the assembly phase, the assembler converts the assembly code into machine code and writes it to an object file. It translates assembly instructions into machine instructions and produces a binary file for the target platform (on Linux, a relocatable ELF object file).

gcc -c test.s -o test.o


If you open test.o, you can see that it now contains binary machine instructions rather than readable text.

5. Linking

① Common usage

During the linking phase, the linker combines multiple object files and library files into an executable file or shared library. It resolves symbol references against symbol definitions and generates the final executable or shared library.

gcc test.o -o test

You can also run

gcc test.c -o test

to complete all four stages in one step.

Run the compiled executable with ./test.

② Library link

In the linking phase, gcc first searches the paths specified by the -L option on the command line, then the paths in the environment variable LIBRARY_PATH, and finally the default paths /lib, /usr/lib and /usr/local/lib. When both a dynamic and a static version of a library are found, the dynamic library is linked by preference.
With

ldd test

you can check which dynamic libraries test is linked to.

You can see that the linked libraries are mainly the dynamic libraries of glibc on Linux.

How to create a dynamic library is explained in the article on using dynamic and static libraries with GCC.

Here we try using the liboxx.so library created there.

Ⅰ Change test.c code

Change test.c so that it uses the liboxx.so dynamic library created earlier.

Ⅱ Link dynamic library and run

gcc -o test test.c liboxx.so


View the linked libraries with the ldd command.

You can see that liboxx.so is now linked.

6. Analyze ELF files

Definition: ELF (Executable and Linkable Format) is a standard file format for binary executable files, shared libraries, object files, etc., and is widely used in Unix-like systems.
The ELF file format defines how executable code, data, symbol tables, debugging information and so on are organized in the file and laid out in memory when loaded. The ELF file format has the following characteristics:
1. Scalability: The ELF file format supports multiple architectures, such as x86, ARM, MIPS, etc., and can adapt to different hardware platforms.
2. Independence: The ELF file format is not tied to a specific operating system, so many different systems use it. A given ELF binary still targets one particular system and architecture.
3. Relocatability: The ELF file format supports the relocation of code and data, allowing multiple object files to be linked into an executable file or shared library.
4. Extensible header: The ELF header contains metadata about the file, such as the entry point address and the locations of the program header table and the section header table.
5. Debugging information support: An ELF file can contain debugging information, used for program debugging, symbol lookup, etc.
The ELF file format is widely used in Unix-like systems such as Linux, FreeBSD and Solaris (macOS uses the Mach-O format instead). It provides a standard file format that enables different compilers, linkers, and debuggers to work together on executable files, shared libraries, and object files.

The ELF file layout is shown in the figure below. Everything between the ELF Header and the Section Header Table is made up of sections. A typical ELF file contains the following sections:
.text: The machine instructions of the compiled program.
.rodata: ro stands for read only, that is, read-only data (such as string literals and const globals).
.data: Initialized C global variables and static local variables.
.bss: Uninitialized (or zero-initialized) C global variables and static local variables.
.debug: Debugging information (in practice a group of sections whose names start with .debug), which the debugger uses to help debugging.

You can use readelf -S to list the sections of an ELF file.

You can also disassemble the ELF file to view its instructions and data with:

objdump -D test


This shows the machine instructions of test after disassembly.
To write the disassembly to a file for viewing:

objdump -D test > test.txt
This writes the disassembly of test to test.txt.

You can also use

objdump -S test

to disassemble the file with the C source code interleaved. This requires that the program was compiled with debugging information:

gcc -o test -g test.c -loxx

The -g option makes the compiler generate debugging information, so that a debugger can debug the program at source code level.

