Using clang-query

1. What is clang-query?

clang-query is a command-line tool built on Clang's LibTooling and AST Matchers libraries. It runs queries against the Clang abstract syntax tree (AST) of your source code, helping users understand the structure of the code.

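2. How does clang-query work?

clang-query uses Clang to parse your source code into an AST and then evaluates matcher expressions against that tree. The compiler flags needed for parsing come either from a compilation database (compile_commands.json) or from the arguments given after -- on the command line.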

3. What can clang-query do?

clang-query can match AST nodes against matcher expressions, display the matched nodes and their AST subtrees, and set options or define named values (the set and let commands). Matchers are parsed and evaluated at run time, so a query can be refined interactively without recompiling anything.

4. How to use clang-query?

Run the clang-query tool on a source file:

clang-query demo.c --

On the command line, "--" is a special marker that ends clang-query's own options. Anything after it is passed to the compiler as flags for parsing the file (for example, clang-query demo.c -- -std=c11). A bare "--" means that no extra flags are given and no compilation database is looked up.
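
To have something to query, a small file like the following could be saved as demo.c. This is only a sketch: the file contents and the function names square and sum_to are made up for illustration and are not part of the original example.

```c
/* demo.c: hypothetical input file for trying out clang-query.
 *
 * Load it with:   clang-query demo.c --
 * At the prompt, a query such as
 *     m functionDecl(isExpansionInMainFile())
 * should report the two functions defined below, because both are
 * written in the main file and no headers are included.
 */

/* Returns x squared. */
int square(int x) {
    return x * x;
}

/* Returns 1 + 2 + ... + n (0 when n <= 0). */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

The file has no main() on purpose: clang-query only parses the code and does not need a complete program.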

5. What are the commonly used configuration items?

  1. set output detailed-ast
    This command switches the output format to a detailed AST (abstract syntax tree) dump, so each match is printed as a hierarchy of AST nodes rather than as a source-location diagnostic. It helps you explore the structure of the matched code.

  2. set traversal IgnoreUnlessSpelledInSource
    This command sets the traversal mode so that only nodes spelled explicitly in the source code are considered. AST nodes that the compiler generates implicitly (for example, implicit casts) are skipped.

  3. set bind-root true
    This command binds the outermost node of each match to the name "root". If you name nodes yourself with .bind("foo"), you can use set bind-root false to suppress the extra "root" binding.

  4. set print-matcher true
    This command prints the matcher expression along with its matches, which is useful for seeing which matcher produced a given match.

  5. enable output dump
    This command enables dump output (the same detailed AST format) in addition to any output modes that are already active.
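
The effect of the traversal mode is easiest to see on code that contains a compiler-generated node. The sketch below is a hypothetical file (the name widen.c and the function widen are assumptions for illustration). Under the AsIs traversal mode, a query such as m returnStmt(hasReturnValue(declRefExpr())) is not expected to match it, because the returned expression is wrapped in an implicit cast; after set traversal IgnoreUnlessSpelledInSource, the same query should match.

```c
/* widen.c: hypothetical file showing a compiler-generated AST node. */

/* The return expression is spelled as just `n`, but because `n` is an
 * int and the function returns long, Clang wraps it in ImplicitCastExpr
 * nodes that do not appear anywhere in the source text. */
long widen(int n) {
    return n;
}
```

Running the query once with set output detailed-ast makes the ImplicitCastExpr nodes visible in the dump.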

6. Explanation of terms that may be used

caller: the function that makes a call

callee: the function that is called

void foo() {
    // ...
}

void bar() {
    foo(); // Here, bar is caller and foo is callee.
}

bar is a caller because it calls another function; foo is a callee because it is called by another function.

In Clang's abstract syntax tree (AST), a function call is represented by a CallExpr node, and the part of that expression that names the function being called is its callee.

For example, in the expression foo(), foo is the callee. The callee() matcher lets you constrain this part of a function call when writing AST matchers.

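#include <stdlib.h>
#include <string.h>  /* memset */

/* Allocates s bytes and zero-fills them; returns NULL if allocation fails. */
void *acme_zmalloc(size_t s){
  void *ptr=malloc(s);
  if(ptr!=NULL)
    memset(ptr,0,s);
  return ptr;
}

/* Allocates s bytes without initializing them. */
void *foo(int s){
  return malloc(s);
}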

7. What are the commonly used checking commands?

(1) Query function declaration

m is short for match.

clang-query> m functionDecl()
(2) Restrict matches to function declarations in the main file

Filter out function declarations that come from included files and return only those written in the main source file.

clang-query> m functionDecl(isExpansionInMainFile())

Match #1:

/home/josecep/workplace/init.cpp:4:1: note: "root" binds here
void function(void){
^~~~~~~~~~~~~~~~~~~~
1 match.
clang-query>

(3) Query variable declaration
clang-query> m varDecl()

(4) Restrict matches to variable declarations in the main file

clang-query> m varDecl(isExpansionInMainFile())

Match #1:

/home/josecep/workplace/init.cpp:5:2: note: "root" binds here
        int x = 0;
        ^~~~~~~~~

Match #2:

/home/josecep/workplace/init.cpp:6:2: note: "root" binds here
        char *txt = nullptr;
        ^~~~~~~~~~~~~~~~~~~

Match #3:

/home/josecep/workplace/init.cpp:7:2: note: "root" binds here
        double d = NAN;
        ^~~~~~~~~~~~~~
3 matches.
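
Note that varDecl() also covers function parameters, since a ParmVarDecl is a kind of VarDecl. The following hypothetical file (vars.c, call_count, advance, step and next are all made-up names) contains one of each kind of variable. A query like m varDecl(isExpansionInMainFile()) would be expected to report all three, while m varDecl(isExpansionInMainFile(), unless(parmVarDecl())) should leave out the parameter.

```c
/* vars.c: hypothetical file with three kinds of variable declarations. */

/* File-scope variable: a VarDecl. */
static int call_count = 0;

/* `step` is a function parameter: a ParmVarDecl, which is also a VarDecl. */
int advance(int step) {
    /* Local variable: a VarDecl. */
    int next = call_count + step;
    call_count = next;
    return next;
}
```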

(5) Enable detailed AST output mode

In this mode, each match is printed as a dump of the matched node's AST subtree (for example, a function declaration together with its body) instead of a one-line source diagnostic.

clang-query> set output detailed-ast

To leave detailed AST output, switch to another output mode, for example set output diag (the default diagnostic output) or set output print.

The example session also sets the traversal mode:

clang-query> set traversal IgnoreUnlessSpelledInSource

(6) Query variable declarations in the main file (with detailed AST output)
clang-query> m varDecl(isExpansionInMainFile())

Match #1:

Binding for "root":
VarDecl 0x5646c7531ef8 </home/josecep/workplace/init.cpp:5:2, col:10> col:6 x 'int' cinit
`-IntegerLiteral 0x5646c7531f60 <col:10> 'int' 0


Match #2:

Binding for "root":
VarDecl 0x5646c7531fb0 </home/josecep/workplace/init.cpp:6:2, col:14> col:8 txt 'char *' cinit
`-ImplicitCastExpr 0x5646c7532028 <col:14> 'char *' <NullToPointer>
  `-CXXNullPtrLiteralExpr 0x5646c7532018 <col:14> 'std::nullptr_t'
.........
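(7) Match malloc and free function declarations in the main file
clang-query> m functionDecl(isExpansionInMainFile(),anyOf(hasName("malloc"),hasName("free")))
0 matches.

There are no matches because malloc and free are not declared in the main file itself; their declarations come from system headers.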
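(8) Match all function declarations in the main file whose names contain f

matchesName() takes a regular expression and searches for it in the fully qualified name (here ::function), so a partial match is enough. hasName() is the matcher that compares whole names.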

clang-query> m functionDecl(isExpansionInMainFile(),matchesName("f"))

Match #1:

Binding for "root":
FunctionDecl 0x5646c7531e38 </home/josecep/workplace/init.cpp:4:1, line:8:1> line:4:6 function 'void ()'
`-CompoundStmt 0x5646c75321d0 <col:20, line:8:1>
  |-DeclStmt 0x5646c7531f80 <line:5:2, col:11>
  | `-VarDecl 0x5646c7531ef8 <col:2, col:10> col:6 x 'int' cinit
  |   `-IntegerLiteral 0x5646c7531f60 <col:10> 'int' 0
  |-DeclStmt 0x5646c7532040 <line:6:2, col:21>
  | `-VarDecl 0x5646c7531fb0 <col:2, col:14> col:8 txt 'char *' cinit
  |   `-ImplicitCastExpr 0x5646c7532028 <col:14> 'char *' <NullToPointer>
  |     `-CXXNullPtrLiteralExpr 0x5646c7532018 <col:14> 'std::nullptr_t'
  `-DeclStmt 0x5646c75321b8 <line:7:2, col:16>
    `-VarDecl 0x5646c7532070 <col:2, /usr/include/math.h:98:35> /home/josecep/workplace/init.cpp:7:9 d 'double' cinit
      `-ImplicitCastExpr 0x5646c75321a0 </usr/include/math.h:98:15, col:35> 'double' <FloatingCast>
        `-ParenExpr 0x5646c7532180 <col:15, col:35> 'float'
          `-CallExpr 0x5646c7532140 <col:16, col:34> 'float'
            |-ImplicitCastExpr 0x5646c7532128 <col:16> 'float (*)(const char *) noexcept' <BuiltinFnToFnPtr>
            | `-DeclRefExpr 0x5646c75320d8 <col:16> '<builtin fn type>' Function 0x5646c73fccd8 '__builtin_nanf' 'float (const char *) noexcept'
            `-ImplicitCastExpr 0x5646c7532168 <col:32> 'const char *' <ArrayToPointerDecay>
              `-StringLiteral 0x5646c75320f8 <col:32> 'const char[1]' lvalue ""

1 match.
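
To see how the regular expression behaves, it can help to query a file with several similar names. The file below is a hypothetical sketch (names.c and the functions fill, buffer_first and reset are made up). With it, m functionDecl(isExpansionInMainFile(), matchesName("f")) would be expected to report fill and buffer_first but not reset, while the anchored pattern matchesName("^::f") should narrow the result to fill, assuming the name is matched in its "::name" form.

```c
/* names.c: hypothetical file for experimenting with matchesName(). */

/* Name starts with "f". */
int fill(int *buf, int len, int value) {
    for (int i = 0; i < len; ++i) {
        buf[i] = value;
    }
    return len;
}

/* Name contains "f" but does not start with it. */
int buffer_first(const int *buf) {
    return buf[0];
}

/* Name contains no "f" at all. */
int reset(int *buf, int len) {
    return fill(buf, len, 0);
}
```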

(9) Query all binary operation expressions in the main file

binaryOperator(isExpansionInMainFile()) finds all binary operator expressions in the main file (such as +, -, *, and /).

clang-query> m binaryOperator(isExpansionInMainFile())
0 matches.

  • Find binary operator expressions in the main file that use the "*" operator and whose right-hand side is the integer literal 0.
clang-query> m binaryOperator(isExpansionInMainFile(),hasOperatorName("*"),hasRHS(integerLiteral(equals(0))))
0 matches.
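
Both queries report 0 matches here because the example file contains no binary operators. A hypothetical file like the one below (ops.c and its function names are assumptions for illustration) would give the matchers something to find: binaryOperator(isExpansionInMainFile()) should report all three expressions, and adding hasOperatorName("*") with hasRHS(integerLiteral(equals(0))) should narrow that to the one in scaled_by_zero.

```c
/* ops.c: hypothetical file for trying the binaryOperator() queries. */

/* `w * h`: a "*" operator, but the right-hand side is not a literal. */
int area(int w, int h) {
    return w * h;
}

/* `x * 0`: a "*" operator whose right-hand side is the literal 0. */
int scaled_by_zero(int x) {
    return x * 0;
}

/* `x + 0`: the right-hand side is the literal 0, but the operator is "+". */
int plus_zero(int x) {
    return x + 0;
}
```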
(10) Match all function calls in the source code
clang-query> m callExpr()
(11) Match function call expressions whose callee is a function declaration
clang-query> m callExpr(callee(functionDecl()))

callExpr() matches function call expressions, and callee(functionDecl()) narrows the result to calls whose callee resolves to a function declaration.

void foo() {
    /* some logic */
}

int main() {
    foo();
}

In this case, the call foo(); is matched by the clang-query command because it is a function call expression and its callee is a function declaration.
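
A call whose callee is not a function declaration makes the restriction clearer. In the hypothetical file below (calls.c, twice, direct_call and indirect_call are made-up names), the call through the function pointer op refers to a variable rather than a function, so callExpr(callee(functionDecl())) is expected to report only the call in direct_call; a matcher such as callExpr(callee(varDecl())) should pick up the other one.

```c
/* calls.c: hypothetical file contrasting direct and indirect calls. */

static int twice(int x) {
    return 2 * x;
}

int direct_call(int x) {
    return twice(x);          /* callee is the function declaration `twice` */
}

int indirect_call(int x) {
    int (*op)(int) = twice;   /* `op` is a variable holding a function pointer */
    return op(x);             /* callee is the variable `op`, not a function */
}
```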

clang-query> m callExpr(callee(functionDecl(hasName("malloc"))))

clang-query> m callExpr(callee(functionDecl(hasName("free"))))


8. What reference materials are there?

  1. AST Matcher Reference: the official list of matchers that can be used in clang-query
https://clang.llvm.org/docs/LibASTMatchersReference.html
  2. Clang tutorial, part II (LibTooling example): a tutorial on building Clang-based tools that work with the AST
https://kevinaboos.wordpress.com/2013/07/23/clang-tutorial-part-ii-libtooling-example/
  3. YouTube tutorial: demonstrates how to install and use clang-query
https://www.youtube.com/watch?v=VqCkCDFLSsc