[Shell command collection: linker tools] The Linux ld command links object files and libraries into executable files or library files

Contents

Description
- Syntax
- Parameter description
- Error conditions
Notes
Underlying implementation
Example
- Example 1
- Example 2
- Example 3
- Example 4
- Example 5
- Example 6
- Example 7
Simulating the idea of ld in C
Conclusion

Description

ld is the linker in the Linux environment. Its main job is to combine multiple object files (.o or .obj files) into a single executable file or library file. Its main functions are:

Object file linking: ld can link multiple object files into a single output file. These object files are usually generated by a compiler and contain the machine code of the program.
Resolve symbols: During linking, ld matches every symbol reference in the object files with its definition, so that all function and variable references are resolved correctly.
Generate output files: ld can generate executable files and shared libraries, and with -r it can produce a combined relocatable object file. Static libraries (.a archives) are created with ar, not ld.
Handling relocation: Object files are compiled without knowing the final address of their code and data, so ld processes their relocation entries and patches each reference to point at the correct memory address.
Merge sections: ld merges sections of the same kind from the object files, such as .text (code) or .data (initialized data).
Link-time optimization: In some cases, ld can also take part in link-time optimization (for example through a compiler-provided LTO plugin) to improve the performance of the generated executable.
Handling startup code: The startup code is the first piece of code that runs when the program starts executing. A compiler driver such as gcc passes the startup object files to ld automatically; when you run ld directly, you have to supply them yourself.

Overall, the ld command is an indispensable tool in Linux programming. It ensures that all parts of the program are correctly combined to generate a runnable executable file.

Syntax

ld [options] object_files [--] [library_files]

Parameter description

-b input-format, --format=input-format: Specify the binary format of the input object files that follow this option.
-Bstatic: Link only against static libraries for the -l options that follow.
-Bdynamic: Link against dynamic libraries for the -l options that follow. This is normally the default on platforms that support shared libraries.
-Bsymbolic: When creating a shared library, bind references to global symbols to the definitions inside that shared library, if any.
-c MRI-commandfile, --mri-script=MRI-commandfile: For compatibility with the MRI linker, ld accepts script files written in the MRI command language.
--cref: Output a cross-reference table.
-d, -dc, -dp: Allocate space for common symbols even if a relocatable output file is specified (using -r). The script command "FORCE_COMMON_ALLOCATION" has the same effect.
-defsym=symbol=expression: Create a global symbol in the output file whose value is the absolute address given by expression.
-demangle: Demangle symbol names in error messages and other output.
-e entry, --entry=entry: Use the specified symbol as the entry point of the program.
-E, --export-dynamic: For ELF files, when creating a dynamically linked executable, add all symbols to the dynamic symbol table.
-f name, --auxiliary=name: For ELF shared objects, set the DT_AUXILIARY field to name.
-F name, --filter=name: For ELF shared objects, set the DT_FILTER field to name. This tells the dynamic linker that the symbol table of the shared object being created should be used as a filter on the symbol table of the shared object name.
-g: Ignored. Provided for compatibility with other tools.
-h name, -soname=name: For ELF shared objects, set the DT_SONAME field to name.
-I file, --dynamic-linker=file: Specify the dynamic linker. This only makes sense when building a dynamically linked ELF executable. The default dynamic linker is usually correct; do not use this option unless you know what you are doing.
-l namespec, --library=namespec: Add the specified library to the list of files to be linked.
-L searchdir, --library-path=searchdir: Add the specified directory to the list of directories searched for libraries.
-M, --print-map: Print a link map to standard output for diagnostic purposes.
-Map=mapfile: Write the link map to the specified file.
-m emulation: Emulate the specified linker. ld --verbose or ld -V lists the available emulations.
-N, --omagic: Make the text and data sections readable and writable.
-n, --nmagic: Turn off page alignment of sections and disable linking against shared libraries. If the output format supports Unix-style magic numbers, mark the output as "NMAGIC".
-noinhibit-exec: Keep the output file whenever it is still usable, even if non-fatal link errors occur. Normally, ld does not produce an output file if it encounters errors during linking.
-no-keep-memory: ld normally favors speed over memory usage by caching the symbol tables of input files in memory. This option tells ld to reread the symbol tables as necessary instead. You may need it when linking large executables if ld runs out of memory.
-O level: For a numeric level greater than zero, ld optimizes the output. This can take significantly longer and is best reserved for the final binary.
-o output, --output=output: Specify the name of the output file.
--oformat=output-format: Specify the binary format of the output file. This option must be written with two dashes, because -oformat would be read as -o with the file name "format".
-R filename, --just-symbols=filename: Read symbol names and addresses from the specified file, without relocating it or including it in the output.
-r, --relocatable: Generate relocatable output (called a partial link).
-rpath=dir: Add the specified directory to the runtime library search path.
-rpath-link=dir: Specify a directory to search at link time for shared libraries that are required by other shared libraries.
-S, --strip-debug: Omit debugger symbol information from the output file.
-s, --strip-all: Omit all symbol information from the output file.
-shared, -Bshareable: Create a shared library.
--split-by-file[=size]: Like --split-by-reloc, but create a new output section for each input file once size is reached. size defaults to 1.
--split-by-reloc[=count]: Try to create extra output sections so that no single output section contains more than count relocations.
--section-start=sectionname=org: Place the specified section at the absolute address org in the output file.
-T scriptfile, --script=scriptfile: Use scriptfile as the linker script. This script replaces ld's default linker script (rather than adding to it), so it must specify everything needed for the output file. If scriptfile does not exist in the current directory, ld searches the directories specified by preceding -L options.
-Ttext=org: Use the specified address as the start of the text segment.
-Tdata=org: Use the specified address as the start of the data segment.
-Tbss=org: Use the specified address as the start of the bss segment.
-t, --trace: Print the names of input files as they are processed.
-u symbol, --undefined=symbol: Enter symbol in the output file as an undefined symbol. This can, for example, trigger the linking of additional modules from libraries.
-v, -V, --version: Display the ld version number.
-warn-common: Warn when a common symbol is combined with another common symbol or with a symbol definition.
-warn-constructors: Warn if any global constructors are used.
-warn-once: Warn only once for each undefined symbol, rather than once for every module that refers to it.
-warn-section-align: Warn if the address of an output section is changed because of alignment.
--whole-archive: For each archive named after this option, include every object file in the archive instead of only the members that are needed.
-X, --discard-locals: Delete all temporary local symbols.
-x, --discard-all: Delete all local symbols.

Error conditions

Unresolved symbol: If ld cannot find the definition of a symbol during linking, it reports an "undefined reference" error naming that symbol.
Multiple definitions: If the same symbol is defined in more than one object file, ld reports a "multiple definition" error.
File format mismatch: If you try to link object files in incompatible formats, such as ELF and a.out, ld will report an error.
Library not found: If a library specified with the -l option is not found in the library search path, ld will report an error.
Input/output errors: If an error occurs while reading an input file or writing an output file, ld will report an error.

Notes

When using the ld command in Linux, there are a few things to consider to ensure that the linking process goes smoothly and produces the correct output file:

Order of libraries: The order of arguments on the command line matters. Object files should normally be listed first, followed by libraries. ld processes its arguments from left to right and searches a static library only at the point where it appears, pulling in just the members that define symbols still undefined at that moment. A library listed before the object file that needs it will therefore not satisfy that reference.
Linking system libraries: System libraries are usually linked with the -l option, such as -lm for the math library. If a library is not in the default search path, add its directory with the -L option.
Static vs. dynamic linking: By default, ld prefers dynamic libraries. If you need static linking, you can use the -static option, but be aware that static linking increases the size of the output file.
Symbol conflict: Make sure the same symbol is not defined in more than one object file, otherwise ld will report a multiple definition error.
Use an appropriate linker script: ld uses a linker script to control the layout of the output file. If you have special needs, you can use the -T option to specify a custom linker script.
Remove unnecessary symbol information: In order to reduce the size of the output file, you can use the -s option to remove symbol information. But this can make subsequent debugging difficult.
Ensure compatibility: Compatibility issues may arise if the linked object files were generated on a different system or using different compiler options. Make sure all files are compiled in the same environment.
Check Output: After linking is complete, use the file command to check the type of the output file to make sure it is in the expected format (for example, an ELF executable file).
Avoid using outdated libraries or options: Some libraries or options may have been deprecated over time. Make sure you are using current versions of libraries and recommended linking options.
Read the documentation: The functions and options of ld may change with version updates. Check man ld or other relevant documentation regularly for the latest information and advice.

In short, you need to pay attention to many aspects when using the ld command to ensure that the linking process is correct and the expected output file is generated.

Underlying implementation

The ld command, part of the GNU Binutils suite, is the standard linker in Linux. Its underlying implementation involves multiple complex steps and algorithms. The following is an overview of its core implementation:

Input processing:
- ld first reads all input object files and library files.
- It parses the headers of ELF (Executable and Linkable Format) files, or files in other formats, to obtain section information, symbol tables, etc.
Symbol analysis:
- ld builds a global symbol table containing symbols from all input files.
- For unresolved symbols (for example, functions referenced in an object file but defined elsewhere), ld searches all input library files for definitions of these symbols.
- If the definition of a symbol cannot be found, ld will report an error.
Address assignment:
- ld merges the sections of the same name from all input files (such as .text, .data) into one large output section and assigns an address to it.
- It also handles relocation entries that specify the location of code or data that needs to be modified at link time.
Relocation:
- Because object files are compiled independently, the address references they contain may not be the final correct addresses. ld updates these references with relocation entries so that they point to the correct addresses.
Output generation:
- Once all symbols have been resolved and all segments assigned addresses, ld generates an output file.
- The output file is usually in ELF format, but can be in other formats, depending on the target platform.
Optimization:
- In some cases, ld can also perform link-time optimizations, such as removing unused code or data.

ld is written mainly in C and is open source. If you are interested in the detailed implementation, you can read the source code of the GNU Binutils project.

Overall, the underlying implementation of ld involves multiple complex steps, from parsing of input files to address assignment, relocation and final output generation. These steps ensure that the resulting executable or library file has correct access to its code and data when run.

Example

Example 1

Link two object files, file1.o and file2.o, to generate an executable file named program:

ld -o program file1.o file2.o

Example 2

Link the object file file.o and use the static library libstatic.a:

ld -o program file.o -Bstatic -lstatic

Example 3

Link the object file and specify the path of the dynamic linker:

ld -o program file.o -I/usr/local/lib/ld-linux.so.2

Example 4

Link the object file and specify elf64-x86-64 as the binary format of the output file. The option needs two dashes, because -oformat would be parsed as -o followed by a file name:

ld -o program file.o --oformat=elf64-x86-64

Example 5

Link the object files and force the symbol my_symbol to be an undefined symbol in the output file:

ld -o program file.o -u my_symbol

Example 6

Link object files using linker script linker_script.ld:

ld -o program file.o -T linker_script.ld

Example 7

Link the object file and print the name of each input file as it is processed:

ld -o program file.o -t

Simulating the idea of ld in C

Reimplementing ld is a complex task, because ld is a complete linker that handles multiple file formats, resolves symbols, processes relocations, and so on. A full implementation requires a lot of code and in-depth knowledge.

The following simplified, conceptual linker only illustrates the most basic idea: combining several inputs into one output. It is a very basic example and cannot be used for actual linking tasks.

#include <stdio.h>
#include <stdlib.h>

// Assume two trivial "object files" that contain only raw bytes:
// file1.o: "HELLO "
// file2.o: "WORLD"

int main(int argc, char *argv[]) {
    if (argc < 4) {
        fprintf(stderr, "Usage: %s output input1 input2\n", argv[0]);
        return EXIT_FAILURE;
    }

    FILE *input1 = fopen(argv[2], "rb");
    FILE *input2 = fopen(argv[3], "rb");
    FILE *output = fopen(argv[1], "wb");

    if (!input1 || !input2 || !output) {
        perror("Error opening files");
        return EXIT_FAILURE;
    }

    // Simply copy the contents of the two input files to the output file.
    // fgetc returns an int, so ch must be an int for the EOF test to work.
    int ch;
    while ((ch = fgetc(input1)) != EOF) {
        fputc(ch, output);
    }
    while ((ch = fgetc(input2)) != EOF) {
        fputc(ch, output);
    }

    fclose(input1);
    fclose(input2);
    fclose(output);

    printf("Linking completed.\n");
    return EXIT_SUCCESS;
}

This simplified linker only copies the contents of two input files into a single output file. The real ld linker handles complex tasks such as supporting various object file formats, resolving symbols, processing relocations, and merging sections.

If you really want to get a deeper understanding of how linkers work, I recommend reading professional books on linkers and loaders, or looking at the source code of an open source linker like GNU ld.

Conclusion

During our exploration process, we have gained an in-depth understanding of the powerful functions and widespread applications of Shell commands. However, learning these techniques is just the beginning. The real power comes from how you integrate them into your daily routine to increase efficiency and productivity.

Psychology tells us that learning is a continuous and actively involved process. So, I encourage you to not only read and understand these commands, but also practice them. Try creating your own commands and gradually master shell programming so that it becomes part of your daily routine.

Finally, please remember: everyone can become a Shell programming expert through continuous learning and practice. I look forward to seeing you make further progress on this journey!

Read my CSDN homepage and unlock more exciting content: Bubble’s CSDN homepage