cc1
`cc1` is a core part of the LLVM toolchain, specifically of the Clang compiler. Although developers rarely interact with it directly in their daily programming work, understanding how it works helps in understanding the whole compilation process.
- Definition: `cc1` is the actual compiler frontend of Clang. The `clang` command-line tool is a driver: when we use it to compile C/C++ code, it internally starts the frontend (invoked as `clang -cc1`) to do most of the work.
- Main Responsibilities:
  - Lexical analysis: Break the source code into tokens.
  - Syntax analysis: Organize these tokens into a syntax tree.
  - Semantic analysis: Determine the meaning of the syntax tree and perform preliminary error checking.
  - Intermediate representation generation: Generate LLVM IR from the syntax tree.
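- Why it exists:
  - The main reason for separating `cc1` and `clang` is modularity. The `clang` driver only decides which jobs to run (compile, assemble, link) and with which arguments, while `cc1` does the actual compilation. The naming follows GCC, where the `gcc` driver starts `cc1` for C code and `cc1plus` for C++ code; in Clang, the same `-cc1` frontend handles both languages.
  - It also provides other tools, such as compiler plugins or advanced tools, with direct access to various stages of the compilation process.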
- How to use: Typically, developers do not need to call `cc1` directly because the `clang` tool does this for us automatically. However, knowing that it exists and how to invoke it helps when debugging and studying the compilation process. For example, the `clang -###` command shows how `clang` calls `cc1`.
- Example: If you run the command `clang -### example.c`, Clang prints, without running them, the `cc1` invocation and the arguments passed to it.
- Note:
  - Although `cc1` is a central part of the compilation process, it is not responsible for linking. That work is done by other tools, such as `lld` or the system’s default linker. A separate assembler is only involved when Clang’s integrated assembler is not used; on most targets `cc1` emits the object file itself.
  - `cc1plus` is the similar tool in GCC, designed specifically for working with C++ code. Clang has no separate `cc1plus`.
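The `clang -###` example can be tried with a short shell sketch. The file name `example.c` comes from the text; the temporary directory and the helper name `cc1_jobs` are assumptions made only for illustration, and the exact arguments printed depend on the Clang version and platform.

```shell
# Keep only the frontend jobs from the job list that `clang -###` prints.
# Each argument in that list is double-quoted, so we match the quoted "-cc1".
cc1_jobs() {
  grep -e '"-cc1"'
}

# Hypothetical input file, created in a temporary directory.
workdir=$(mktemp -d)
printf 'int main(void) { return 0; }\n' > "$workdir/example.c"

# -### prints the jobs to stderr without running them, hence the 2>&1.
if command -v clang >/dev/null 2>&1; then
  clang -### "$workdir/example.c" 2>&1 | cc1_jobs || true
fi
```

The line that survives the filter is the frontend job; the remaining lines of the unfiltered output are version information and, typically, the linker job.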
Overall, `cc1` is the core frontend in the Clang compiler that processes source code and generates LLVM IR. Studying it together with the rest of the compilation process is a good way into compilation principles and the LLVM framework.
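The frontend stages listed earlier can each be observed from the `clang` driver. The following is a minimal sketch, assuming `clang` is on the `PATH`; the helper names (`dump_tokens`, `dump_ast`, `emit_ir`) and the sample file are hypothetical, while the flags themselves are standard Clang options.

```shell
# One helper per stage; the helper names are hypothetical.
dump_tokens() { clang -fsyntax-only -Xclang -dump-tokens "$1"; }  # lexical analysis
dump_ast()    { clang -fsyntax-only -Xclang -ast-dump "$1"; }     # syntax and semantic analysis
emit_ir()     { clang -S -emit-llvm "$1" -o "$2"; }               # LLVM IR generation

# Small sample input in a temporary directory.
workdir=$(mktemp -d)
printf 'int add(int a, int b) { return a + b; }\n' > "$workdir/example.c"

if command -v clang >/dev/null 2>&1; then
  dump_tokens "$workdir/example.c" 2>&1     # the token dump is written to stderr
  dump_ast "$workdir/example.c"
  emit_ir "$workdir/example.c" "$workdir/example.ll"
  grep 'define' "$workdir/example.ll"       # show the function definition in the IR
fi
```

Options passed after `-Xclang` are handed to the `cc1` frontend unchanged, which is why the dump flags need that prefix when going through the driver.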
llc
`llc` is a component of the LLVM toolchain. Its main function is to convert LLVM intermediate representation (LLVM IR) into target machine code. LLVM IR is a low-level, largely architecture-independent intermediate representation, usually stored in a `.ll` or `.bc` file. `llc` converts this intermediate representation into assembly or machine code for any of the supported target platforms.
- Main functions: Convert LLVM’s intermediate representation (IR) into assembly code or machine code for a specific target platform.
- Usage scenarios:
  - When we have an LLVM IR file and want to generate assembly or machine code for a specific architecture.
  - When the compilation pipeline is run by hand, stage by stage, `llc` is usually executed right after `clang` has converted the C/C++ source code to LLVM IR. In a normal build, `clang` runs the same code generator internally.
- Basic usage: `llc input.ll` generates a platform-dependent assembly file, such as `input.s`.
- Common options:
  - `-march=`: Specify the target architecture, such as `x86`, `arm`, `aarch64`, etc.
  - `-filetype=`: Specify the output file type. For example, `asm` produces an assembly file (default), and `obj` produces an object file.
  - `-o`: Specify the name of the output file.
- Example: If we have an LLVM IR file named `input.ll` and want to convert it to assembly code for the ARM architecture: `llc -march=arm input.ll -o output-arm.s`
- Note:
  - Before using `llc`, make sure that the backend for the target architecture was enabled when your LLVM installation was built. Otherwise, `llc` reports an error stating that the architecture is not supported. Running `llc --version` lists the registered targets.
  - For developers, `llc` is usually just one step in the complete compilation and linking process. Generating an executable binary requires additional tools and steps, such as an assembler and a linker.
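The usage notes above can be put together in a small shell sketch. It assumes `llc` is on the `PATH`; the hand-written IR function `@add` and the file names are made up for the example, and the output targets whatever the host default is.

```shell
# A tiny hand-written LLVM IR module (hypothetical content for illustration).
workdir=$(mktemp -d)
cat > "$workdir/input.ll" <<'EOF'
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ret i32 %sum
}
EOF

if command -v llc >/dev/null 2>&1; then
  # List the backends this build of llc supports.
  llc --version | sed -n '/Registered Targets/,$p'
  # Default output type is assembly for the host target.
  llc "$workdir/input.ll" -o "$workdir/input.s"
  # Emit an object file instead of assembly.
  llc -filetype=obj "$workdir/input.ll" -o "$workdir/input.o"
fi
```

Adding `-march=arm` to either `llc` call would select the ARM backend instead, provided it appears in the registered targets list.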
In general, `llc` is one of the core components of LLVM, allowing developers to generate target machine code from LLVM IR. Because the same IR can be handed to different backends, optimization and code generation become possible for multiple target platforms.
“Assembler” in LLVM often refers to the LLVM IR assembler and disassembler. This is slightly different from a machine code assembler in the usual sense. The LLVM IR assembler and disassembler convert between the textual and binary forms of LLVM’s intermediate representation (IR).
- LLVM IR: LLVM IR is LLVM’s intermediate representation and can be considered a low-level but still readable programming language. It comes in two forms:
  - Text form (also known as LLVM assembly language). Usually saved as a `.ll` file.
  - Binary form (also known as LLVM bitcode). Usually saved as a `.bc` file.
- Task:
  - Assembler: Converts textual LLVM IR to binary LLVM bitcode.
  - Disassembler: Converts binary LLVM bitcode to textual LLVM IR.
- Use: This conversion can be performed using the `llvm-as` and `llvm-dis` tools.
  - Use `llvm-as` to assemble: `llvm-as input.ll -o output.bc`
  - Use `llvm-dis` to disassemble: `llvm-dis input.bc -o output.ll`
- Note:
  - “Assembly” and “disassembly” here are relative to LLVM IR, not to machine code assembly.
  - LLVM also contains a machine code assembler, but it is normally used as a library rather than as a separate step: when `clang` or `llc` emits an object file, the integrated assembler writes the machine code directly instead of first producing an assembly file. The standalone `llvm-mc` tool exposes this assembler. It is still possible to generate target machine assembly code from LLVM IR using the `llc` tool.
  - The textual representation of LLVM IR is often used for debugging, analysis, or teaching purposes because it provides a human-readable intermediate compiled representation.
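A round trip through both tools shows the two forms of LLVM IR side by side. This is a minimal sketch, assuming `llvm-as` and `llvm-dis` are on the `PATH` under those names (some distributions add a version suffix); the IR function `@answer` and the file names are invented for the example.

```shell
# A tiny textual IR module (hypothetical content for illustration).
workdir=$(mktemp -d)
cat > "$workdir/input.ll" <<'EOF'
define i32 @answer() {
  ret i32 42
}
EOF

if command -v llvm-as >/dev/null 2>&1 && command -v llvm-dis >/dev/null 2>&1; then
  llvm-as "$workdir/input.ll" -o "$workdir/output.bc"        # text -> bitcode
  llvm-dis "$workdir/output.bc" -o "$workdir/roundtrip.ll"   # bitcode -> text
  # The instruction survives the round trip unchanged.
  grep 'ret i32 42' "$workdir/roundtrip.ll"
fi
```

The regenerated `.ll` file may differ cosmetically from the input (for example, added module metadata comments), but it describes the same module.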
Overall, LLVM provides tools to assemble and disassemble its intermediate representation, the LLVM IR. This allows developers to convert between text and binary formats, making it easier to understand, debug, and optimize their code.
lld
`lld` is the linker of the LLVM project. Linking is the final stage of the compilation process and involves combining the various object files produced by the compiler (usually `.o` or `.obj` files) into a single executable file or shared library. (Static libraries are created by an archiver such as `llvm-ar`, not by the linker.) `lld` is designed to perform this task.
- Definition: `lld` is the official linker of the LLVM project, designed to provide high-performance and modular linking for a variety of platforms. It aims to provide speeds that are comparable to or better than other system linkers and to work seamlessly with other LLVM tools.
- Features:
  - Cross-platform: `lld` supports multiple object file formats, including ELF (Linux), Mach-O (macOS), and COFF (Windows).
  - Performance: `lld` is designed to be fast. It is often compared for performance with other linkers such as GNU ld or gold and performs well in most cases.
  - Simplicity: Compared to other linkers, `lld`’s code base is relatively small and modular, making it easy to maintain and extend.
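- How to use: If you have the full LLVM suite installed, you can usually use `lld` through the compiler driver: `clang -fuse-ld=lld your_source_file.c`
  Or invoke it directly under the name that matches the target format (`ld.lld` for ELF, `ld64.lld` for Mach-O, `lld-link` for COFF, `wasm-ld` for WebAssembly), because the plain `lld` command is only a generic driver: `ld.lld [options] input_files`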
- Subproject: `lld` itself is modular and contains several ports, each dedicated to providing support for a specific target platform:
  - `ELF`: Provides support for ELF-based systems such as Linux.
  - `COFF`: Provides support for Windows platforms.
  - `Mach-O`: Provides support for Apple platforms such as macOS and iOS.
  - `Wasm`: Provides support for WebAssembly.
  - `MinGW`: Provides support for MinGW.
- Why choose `lld`:
  - Seamless integration with LLVM: If you are already using the LLVM toolchain (such as `clang`), `lld` integrates with these tools without extra setup.
  - Open source and active development: As part of the LLVM project, `lld` is actively developed and receives support from the community.
  - Performance: For projects that need to be built quickly, `lld` may provide faster link times than other linkers.
- Note:
  - Although `lld` has proven reliable in many scenarios, there may be some compatibility or feature differences with other linkers depending on the specific use case and platform.
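Linking through the driver can be tried with the sketch below. It assumes a Linux (ELF) host with `clang`, `ld.lld`, and optionally `readelf` on the `PATH`; the file names are hypothetical. On ELF targets `lld` normally records its name in the `.comment` section of the output, which gives a quick way to check which linker was used.

```shell
# Minimal program to link (hypothetical file name).
workdir=$(mktemp -d)
printf 'int main(void) { return 0; }\n' > "$workdir/hello.c"

# Assumption: a Linux (ELF) host with clang and ld.lld installed.
if [ "$(uname -s)" = Linux ] && command -v clang >/dev/null 2>&1 \
   && command -v ld.lld >/dev/null 2>&1; then
  if clang -fuse-ld=lld "$workdir/hello.c" -o "$workdir/hello"; then
    "$workdir/hello" && echo "linked with lld and ran successfully"
    # Look for the linker's identification string in the .comment section.
    if command -v readelf >/dev/null 2>&1; then
      readelf -p .comment "$workdir/hello" | grep -i 'lld' || true
    fi
  else
    echo "linking with lld failed on this system" >&2
  fi
fi
```

Going through `clang -fuse-ld=lld` rather than calling `ld.lld` by hand lets the driver supply the startup files and default libraries, which is usually the simpler route.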
In summary, `lld` is the linker component of the LLVM project, providing a high-performance, modular solution for linking object files into executables or libraries.