C++ (Qt) software debugging: using dmesg on Linux to locate where a program crashed (14)

Article directory

  • C++ (Qt) software debugging—linux uses dmesg to locate the program crash location (14)
    • 1. Foreword
    • 2. ELF file
    • 3. Commonly used tools
    • 4. Use dmesg to locate abnormal locations
      • 1.1 Exceptions occur in executable programs
      • 1.2 Exception occurs in dynamic library

1. Foreword

In day-to-day development, programs often crash and exit. On Linux, we can usually generate a core file and debug it to find where the crash occurred. However, if core dumps were not enabled, or the program crashed on a user's machine without producing a core file, that approach is unavailable. In that case, the information in dmesg can help locate the crash.

  • dmesg may behave differently on different systems. For example, on the Kylin system I did not find any dmesg log entry after the program crashed;
  • Test system: ubuntu20.04

2. ELF file

ELF (Executable and Linkable Format) is a universal binary file format mainly used to store executable programs, shared libraries, object files, etc. in Unix and Linux systems. The following is a detailed description of the ELF file:

  1. File Header:

    • The ELF file starts with a fixed-size file header, which contains basic information about the file, such as file type, target architecture, entry point address, etc.
  2. Program Header Table:

    • The program header table is a table that contains descriptive information about each program segment (a segment is a memory allocation unit) in the ELF file, such as segment type, file offset, virtual memory address, size, etc.
  3. Section Header Table:

    • The section header table contains descriptive information about each section (a logical unit within the file, such as the code section or data section) in the ELF file, including name, type, size, offset, etc. Common sections:
      • .init (_init): Holds initialization code that runs before main. It is normally the first section of the executable code segment. Note that the entry point recorded in the ELF header (e_entry) usually points at _start in .text rather than at .init.
      • .text (code section): This section contains the machine code instructions of the program, which is the executable code part. It is one of the most important sections in an executable file.
      • .data (data section): Contains the initial values of initialized global and static variables. This data can be read and modified throughout the life cycle of the program.
      • .rodata (read-only data section): This section contains read-only data, usually constants, string literals, etc. This data cannot be modified at runtime.
      • .bss (uninitialized data section): Holds uninitialized (or zero-initialized) global and static variables. It occupies no space in the file; the loader allocates it and fills it with zeros when the program is loaded.
      • .plt (Procedure Linkage Table): Used for lazy binding of dynamically linked function calls, i.e. calls into shared libraries.
      • .got (Global Offset Table): The global offset table contains the addresses of global variables and functions used for dynamic linking.
      • .debug_* (debug information sections, e.g. .debug_info, .debug_line): Contain debugging information, such as type information and source line mappings, which is helpful for debugging and analyzing programs.
      • .rela.text / .rel.text (relocation sections): Contain relocation entries for the code, i.e. the places in .text that must be patched at link time.
      • .comment (comment section): Contains comment information about the program, such as the compiler version, which does not affect the execution of the program.
      • .strtab (string table section): Holds the symbol name strings referenced by the symbol table. Section names are stored in the separate .shstrtab section.
  4. Section Data:

    • The bulk of an ELF file is the content of its sections: code, read-only data, writable data, etc. This is the actual data and instructions of the program.
  5. Symbol Table:

    • The symbol table stores information about symbols (such as functions and variables) defined and referenced in the program, including symbol names, addresses, sizes, etc. It is very useful during debugging and linking.
  6. Relocation Table:

    • The relocation table lists the locations whose addresses must be filled in or adjusted at link time or load time, so that the program is loaded and executed correctly in memory.
  7. Dynamic Linking Information:

    • If the ELF file supports dynamic linking (shared libraries), it contains dynamic linking information, including the name of the shared library and related information.
  8. String Table:

    • The string table stores various strings, such as symbol names and section names, which are referenced elsewhere in the ELF file by their offset into the table.
  9. Version Information:

    • If the file supports versioning, version-related information is included so that the correct shared library version can be selected during dynamic linking.
  10. Additional information:

    • ELF files can also contain other information specific to the target architecture and application, such as program entry points, alignment requirements, file permissions, etc.
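
To make the file header description above more concrete, here is a minimal sketch that reads a few header fields from an ELF image already held in memory. The struct and function names (ElfSummary, parseElfHeader) are illustrative and not part of any library, and the sketch only handles little-endian files; in practice readelf -h shows the same information.

```cpp
#include <cstddef>
#include <cstdint>

// A few fields picked out of the ELF file header (illustrative names).
struct ElfSummary {
    bool is64Bit;
    std::uint16_t type;     // e_type: 1 = relocatable, 2 = executable, 3 = shared object or PIE, 4 = core
    std::uint16_t machine;  // e_machine: target architecture, e.g. 62 = x86-64
    std::uint64_t entry;    // e_entry: entry point address
};

// Read an unsigned little-endian integer of `size` bytes starting at `p`.
static std::uint64_t readLe(const unsigned char* p, std::size_t size)
{
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < size; ++i)
        value |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    return value;
}

// Parse the start of an ELF image. Returns false if the buffer is too short,
// lacks the ELF magic number, or is not little-endian (not handled here).
bool parseElfHeader(const unsigned char* data, std::size_t size, ElfSummary& out)
{
    if (size < 24 || data[0] != 0x7f || data[1] != 'E' || data[2] != 'L' || data[3] != 'F')
        return false;
    if (data[4] != 1 && data[4] != 2)  // EI_CLASS: 1 = 32-bit, 2 = 64-bit
        return false;
    if (data[5] != 1)                  // EI_DATA: 1 = little-endian
        return false;

    const bool is64 = data[4] == 2;
    const std::size_t entrySize = is64 ? 8 : 4;
    if (size < 24 + entrySize)
        return false;

    out.is64Bit = is64;
    out.type = static_cast<std::uint16_t>(readLe(data + 16, 2));     // follows the 16-byte e_ident
    out.machine = static_cast<std::uint16_t>(readLe(data + 18, 2));
    out.entry = readLe(data + 24, entrySize);                        // follows the 4-byte e_version
    return true;
}
```

To use it, read the first bytes of a file into a buffer and pass the buffer and its length to parseElfHeader.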

3. Commonly used tools

  • file: View basic information about the file.
  • ldd: View the dynamic link libraries that an executable file or shared library file depends on.
  • nm: Tool for viewing symbol information in binary executables or shared libraries.
  • strings: Used to extract printable character sequences from binary files, often used to find text in files.
  • strip: Linux command for stripping symbol table information from executable files.
  • readelf: Display information about ELF files.
  • objdump: Tool for viewing and analyzing executable files, object files, and shared libraries, usually provided with GNU Binutils. It provides detailed information about these files, including program code, data segments, symbol tables, relocation tables, and more.
  • netstat: A command line tool used to display network-related information such as network connections, routing tables, interface statistics, Masquerade connections, etc.
  • ps: Used to display information about the processes currently running on the system.
  • top/htop: Command line tool for monitoring system processes, commonly used in Unix and Unix-like systems. It provides real-time information about the processes running in the system, including CPU usage, memory usage, process list, and other system performance-related information.
  • dmesg: Command used to view and analyze Linux operating system kernel messages. It provides detailed reports on system startup, hardware detection, device driver loading, kernel errors, and more.
  • addr2line: Convert addresses to file names and line numbers.
  • tcpdump: A powerful network packet capture tool for capturing and analyzing network data packets.

4. Use dmesg to locate abnormal locations

  • Test program: C++, Qt

    #include "widget.h"
    #include "ui_widget.h"
    
    Widget::Widget(QWidget *parent)
        : QWidget(parent)
        , ui(new Ui::Widget)
    {
        ui->setupUi(this);
    }
    
    Widget::~Widget()
    {
        delete ui;
    }
    
    
    void Widget::on_pushButton_clicked()
    {
        QWidget* w = nullptr;
        // w->show(); // Uncomment to make the crash occur inside the dynamic library
        int* a = nullptr;
        *a = 10; // Null pointer write: the crash occurs in the executable itself
    }
    

1.1 Exceptions occur in executable programs

  • Run the program and trigger the crash;

  • Use the dmesg command to view the program exception log (sometimes you need to use sudo dmesg);

    [44741.877559] untitled4[14899]: segfault at 0 ip 00005598c5d3b8e4 sp 00007ffd7505a2b8 error 6 in untitled4[5598c5d3b000+1000]
    [44741.877582] Code: ff ff ff 48 89 ef be 38 00 00 00 5d e9 c5 f9 ff ff 0f 1f 44 00 00 f3 0f 1e fa 48 83 ef 10 eb d6 66 0f 1f 44 00 00 f3 0f 1e fa <c7> 04 25 00 00 00 00 00 00 00 00 0f 0b 66 2e 0f 1f 84 00 00 00 00
    
    • untitled4[14899]: The name and process ID of the program that crashed;
    • segfault at [address]: The memory address whose access triggered the segfault (0 here, i.e. a null pointer dereference).
    • ip [instruction pointer address]: The virtual address of the instruction that was executing when the segfault occurred.
    • sp [stack pointer address]: The value of the stack pointer when the segfault occurred.
    • error [error code]: The page-fault error code, a bit mask printed in hexadecimal. On x86, 6 means a user-mode write to an unmapped page and 4 means a user-mode read of an unmapped page.
    • in [file][start address + size]: The file (executable or shared library) mapped at the instruction pointer, followed by the start address of that mapping in memory and its size.
  • Two kinds of addresses are involved here:

    1. Memory address: the virtual address that the program's instructions, data, and stack occupy while the process is running. It is assigned when the program is loaded and, with address space layout randomization (ASLR) and position-independent executables, usually differs from run to run. dmesg records the instruction pointer at the time of the crash as a memory address of this kind;
    2. Address in the executable file: the address that the compiler and linker assign to code, data, and symbols when building the file. For position-independent executables and shared libraries, these are addresses relative to the load base. They are stored in the file and do not change between runs.
  • Therefore, to locate the crash from the dmesg address, the memory address must be converted to an address in the executable file;

  • In this log, the mapping reported by dmesg is the executable code segment, and .init is the first section in that segment, so: address in the file = instruction pointer address - mapping start address + .init section address;

    • You can view the .init section address with the nm command (the _init symbol) or with objdump -p (the vaddr of the executable LOAD segment);

    • 0x00005598c5d3b8e4 - 0x5598c5d3b000 + 0x3000 = 0x38e4

    • Run addr2line -e <executable> <address> -Cfi, here addr2line -e untitled4 0x38e4 -Cfi, to get the function in which the crash occurred, its source file, and the line number.

    • Note: If the program was built in Release mode without debugging symbols, only the function name is available, not the file and line number.
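
The steps in this subsection can be sketched in code. The following is a minimal example, not a complete tool: it parses a "segfault at" line in the format shown above, decodes the error code using the x86 page-fault bit layout, and converts the instruction pointer into an address in the file. All names (SegfaultInfo, parseSegfaultLine, describeError, addressInFile) are hypothetical, and the segment start address (0x3000 in this article) still has to be looked up with nm or objdump -p.

```cpp
#include <cstdint>
#include <regex>
#include <string>

// Fields of a kernel "segfault at ..." line (illustrative names).
struct SegfaultInfo {
    std::string process;          // e.g. "untitled4"
    int pid = 0;
    std::uint64_t faultAddr = 0;  // address whose access faulted
    std::uint64_t ip = 0;         // instruction pointer
    std::uint64_t sp = 0;         // stack pointer
    unsigned error = 0;           // page-fault error code
    std::string object;           // file mapped at ip
    std::uint64_t mapStart = 0;   // start address of that mapping
    std::uint64_t mapSize = 0;    // size of that mapping
};

// Parse one dmesg line. Accepts "[start+size]" with or without spaces around '+'.
bool parseSegfaultLine(const std::string& line, SegfaultInfo& out)
{
    static const std::regex pattern(
        R"(([^\s\[\]]+)\[(\d+)\]: segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) )"
        R"(error ([0-9a-f]+) in ([^\s\[\]]+)\[([0-9a-f]+)\s*\+\s*([0-9a-f]+)\])");
    std::smatch m;
    if (!std::regex_search(line, m, pattern))
        return false;

    auto fromHex = [](const std::string& s) { return std::stoull(s, nullptr, 16); };
    out.process = m.str(1);
    out.pid = std::stoi(m.str(2));
    out.faultAddr = fromHex(m.str(3));
    out.ip = fromHex(m.str(4));
    out.sp = fromHex(m.str(5));
    out.error = static_cast<unsigned>(fromHex(m.str(6)));
    out.object = m.str(7);
    out.mapStart = fromHex(m.str(8));
    out.mapSize = fromHex(m.str(9));
    return true;
}

// Decode the low three bits of the x86 page-fault error code.
std::string describeError(unsigned error)
{
    std::string text = (error & 0x4) ? "user-mode " : "kernel-mode ";
    text += (error & 0x2) ? "write" : "read";
    text += (error & 0x1) ? ", protection violation" : ", page not mapped";
    return text;
}

// address in file = ip - mapping start + start address of the mapped segment
std::uint64_t addressInFile(const SegfaultInfo& info, std::uint64_t segmentStart)
{
    return info.ip - info.mapStart + segmentStart;
}
```

For the log line in this subsection, addressInFile(info, 0x3000) gives 0x38e4, the value passed to addr2line, and describeError(6) reports a user-mode write to an unmapped page.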

1.2 Exception occurs in dynamic library

  • When an exception occurs in the dynamic library, the dmesg log information is as follows;

    [47072.125853] untitled4[15938]: segfault at 28 ip 00007f885dca8f7e sp 00007ffe035bc640 error 4 in libQt5Widgets.so.5.12.5[7f885db0e000+61d000]
    [47072.125882] Code: 48 89 df be 01 00 00 00 5b 48 8b 40 68 ff e0 90 66 90 66 2e 0f 1f 84 00 00 00 00 00 48 8b 05 b1 ec 6a 00 53 48 89 fb 48 8b 38 <48> 8b 43 28 8b 70 0c 48 8b 07 ff 90 a8 00 00 00 83 f8 04 74 2d 83
    
  • In this case the instruction pointer lies inside the shared library's mapping, so the file name and start address in the dmesg log refer to the dynamic library rather than the executable;

  • Here the reported mapping is taken to begin at the start of the library file (offset 0), so no section address needs to be added: address in the file = instruction pointer address - mapping start address. If the library's executable LOAD segment does not start at offset 0 (check with objdump -p), add its vaddr in the same way as for the executable;

  • 0x00007f885dca8f7e - 0x7f885db0e000 = 0x19af7e

  • Use ldd to find the path of the dynamic library;

  • Run addr2line -e <dynamic library> <address> -Cfi to get the function in which the crash occurred, its source file, and the line number.
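
As a small illustration of the last two steps, the sketch below computes the address in the file and assembles the corresponding addr2line command line. The function name addr2lineCommand is hypothetical, and the library path has to be replaced with the full path reported by ldd; paths containing spaces would additionally need quoting, which this sketch does not do.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Build an addr2line command for a crash address taken from dmesg.
// segmentStart is the address in the file at which the reported mapping
// begins: 0 for the library case above, 0x3000 for the executable case
// earlier in the article.
std::string addr2lineCommand(const std::string& objectPath,
                             std::uint64_t ip,
                             std::uint64_t mapStart,
                             std::uint64_t segmentStart = 0)
{
    std::ostringstream cmd;
    cmd << "addr2line -e " << objectPath << " 0x" << std::hex
        << (ip - mapStart + segmentStart) << " -Cfi";
    return cmd.str();
}
```

Called with the values from the log above, addr2lineCommand("libQt5Widgets.so.5.12.5", 0x7f885dca8f7e, 0x7f885db0e000) produces a command that queries address 0x19af7e.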
