From Java bytecode to ASM practice

1. Overview

AOP (Aspect-Oriented Programming) is now widely used. The following explanation, excerpted from Baidu Encyclopedia, is relatively easy to understand.

In the software industry, AOP is the abbreviation of Aspect Oriented Programming: a technique that achieves unified maintenance of program functionality through pre-compilation and runtime dynamic proxies. AOP is a continuation of OOP, a hot topic in software development, an important part of the Spring framework, and a derivative paradigm of functional programming. AOP can be used to isolate the individual parts of the business logic, which reduces the coupling between them, improves the reusability of the program, and improves development efficiency.

AOP is a programming idea, and there are many ways to implement it, such as Spring, AspectJ, Javassist and ASM. Since I do Android development, I will use some Android examples.

  • JakeWharton’s hugo is a typical application. It uses a custom Gradle plug-in + AspectJ to print the parameters, return results and execution time of methods with a specific annotation to Logcat, to facilitate development and debugging.
  • Since I have recently been learning about Java bytecode and ASM, I followed suit and wrote Koala, which implements the same function as hugo: it prints the parameters, return results and execution time of methods with a specific annotation to Logcat, to facilitate development and debugging. The difference is that I use a custom Gradle plug-in + ASM.

So what is ASM? The article Introduction to ASM 3.0 describes this AOP tool well. Here is an excerpt:

ASM is a Java bytecode manipulation framework. It can be used to dynamically generate classes or enhance the functionality of existing classes. ASM can directly generate binary class files or dynamically change class behavior before the class is loaded into the Java virtual machine. Java classes are stored in strictly formatted .class files that have enough metadata to parse all elements of the class: class name, methods, properties, and Java bytecode (instructions). After ASM reads information from the class file, it can change the class behavior, analyze the class information, and even generate new classes according to user requirements.

To put it simply, javac compiles a .java file into a .class file. Although the contents of .class files differ, they all share the same format. ASM uses the visitor pattern to follow that format and scan the contents of a .class file from beginning to end. During the scan you can perform operations on the .class file, which feels a bit like black magic.

2. Java bytecode & virtual machine

2.1 Java Bytecode

Many people may not be very familiar with Java bytecode. They probably know that javac compiles a .java file into a .class file, and that the .class file stores the bytecode corresponding to the .java file. Take the following Demo.java, which is very simple:

package com.lijiankun24.classpractice;

public class Demo {

    private int m;

    public int inc() {
        return m + 1;
    }
}

Compile it with javac to generate the corresponding Demo.class file, then open Demo.class in an editor that can display its raw bytes. The content is a binary stream of 8-bit bytes, shown here as a long run of hexadecimal digits whose layout follows the Java Virtual Machine Specification:

cafe babe 0000 0034 0013 0a00 0400 0f09
0003 0010 0700 1107 0012 0100 016d 0100
0149 0100 063c 696e 6974 3e01 0003 2829
5601 0004 436f 6465 0100 0f4c 696e 654e
756d 6265 7254 6162 6c65 0100 0369 6e63
0100 0328 2949 0100 0a53 6f75 7263 6546
696c 6501 0009 4465 6d6f 2e6a 6176 610c
0007 0008 0c00 0500 0601 0004 4465 6d6f
0100 106a 6176 612f 6c61 6e67 2f4f 626a
6563 7400 2100 0300 0400 0000 0100 0200
0500 0600 0000 0200 0100 0700 0800 0100
0900 0000 1d00 0100 0100 0000 052a b700
01b1 0000 0001 000a 0000 0006 0001 0000
0001 0001 000b 000c 0001 0009 0000 001f
0002 0001 0000 0007 2ab4 0002 0460 ac00
0000 0100 0a00 0000 0600 0100 0000 0600
0100 0d00 0000 0200 0e
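
The start of the dump above can be decoded by hand, because every .class file begins with the same fixed-size fields: a 4-byte magic number, a 2-byte minor version, a 2-byte major version and a 2-byte constant pool count. The following is a minimal sketch of that decoding using only the JDK; the class name ClassFileHeader and its field names are made up for this illustration and are not part of any library.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: decodes the fixed-size fields at the start of a .class file.
class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    ClassFileHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        int magic = in.getInt();               // u4 magic
        int minor = in.getShort() & 0xFFFF;    // u2 minor_version
        int major = in.getShort() & 0xFFFF;    // u2 major_version
        int cpCount = in.getShort() & 0xFFFF;  // u2 constant_pool_count
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    public static void main(String[] args) {
        // First ten bytes of the Demo.class dump: cafe babe 0000 0034 0013
        byte[] head = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x34, 0x00, 0x13
        };
        ClassFileHeader h = parse(head);
        // prints: magic=0xCAFEBABE minor=0 major=52 constant_pool_count=19
        System.out.printf("magic=0x%08X minor=%d major=%d constant_pool_count=%d%n",
                h.magic, h.minorVersion, h.majorVersion, h.constantPoolCount);
    }
}
```

Major version 52 corresponds to Java 8, and a constant pool count of 19 means the pool holds entries numbered 1 to 18.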

If you then run javap -verbose Demo.class to view the contents of Demo.class, the output shows that a .class file mainly contains the constant pool, field table, method table and attribute table. How can the constant pool and the method table be recovered from a binary stream of 8-bit bytes? The article Understanding the Bytecode Structure of .class Files covers this in detail, using a simple example to analyze the hexadecimal .class file step by step.

2.2 Java virtual machine class loading mechanism

The above section introduces the structure of the .class file, but the .class file is static. It will eventually be loaded by the virtual machine before it can be executed. So the question is, when will the .class file be loaded?

Generally speaking, a .class file contains one Java class, so .class files and Java classes are closely related. To talk about when a .class file is loaded, we have to mention the life cycle of a Java class, which has seven stages: loading, verification, preparation, resolution, initialization, use and unloading. The Java Virtual Machine Specification does not stipulate when a class must be loaded, but it does stipulate when a class must be initialized. Since loading must happen before initialization, the specification indirectly constrains when the .class file is loaded.

There are five situations in which a class must be initialized. These five situations are called active references to a Java class; all other ways of referencing a class are called passive references.

As mentioned above, the life cycle of a Java class is divided into loading, verification, preparation, resolution, initialization, use and unloading. The most important are the first five stages: loading, verification, preparation, resolution and initialization. What happens in these five stages?

Take the simple example below. The Constant class has a static code block and a static variable. When is value assigned? When is the static code block executed? The answer is: in the initialization stage of the class.

public class Constant {

    static {
        System.out.println("Constant init!");
    }

    public static String value = "lijiankun24!";
}

If a Java class has static code blocks or static variables with initializers, the compiler automatically generates a class constructor for it, the <clinit>() method (note: not the instance constructor). The static code blocks are executed and the static variables are initialized inside this class constructor, in the order they appear in the source, and the class constructor runs in the initialization stage of the class.
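
The timing described above can be observed with a small experiment. The sketch below is a variant of the Constant class, not the original: the NAME constant, the record helper and the EVENTS list are additions made for this illustration so that the order of events can be inspected. Reading a compile-time constant is a passive reference and does not initialize the class, while reading the non-final static field is an active reference and does.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    // Records events so the initialization order can be inspected afterwards.
    static final List<String> EVENTS = new ArrayList<>();

    static class Constant {
        // Compile-time constant: javac inlines it at the use site,
        // so reading it does not trigger initialization of Constant.
        static final String NAME = "Constant";

        static {
            EVENTS.add("static block");
        }

        // Non-constant static field: assigned inside the class constructor.
        static String value = record("value assigned");

        private static String record(String event) {
            EVENTS.add(event);
            return "lijiankun24!";
        }
    }

    public static void main(String[] args) {
        String name = Constant.NAME;      // passive reference, no initialization
        EVENTS.add("after reading NAME");
        String value = Constant.value;    // active reference, triggers initialization
        System.out.println(name + " / " + value);
        // prints: [after reading NAME, static block, value assigned]
        System.out.println(EVENTS);
    }
}
```

The static block is recorded before the assignment to value because the class constructor runs the static initializers in source order.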

When it comes to loading Java classes, we have to talk about the class loader ClassLoader in Java. We must also be clear about the parent delegation model and its benefits.
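
To make the parent delegation model a little more concrete, the following sketch walks from a class's own loader up through its parents. It only uses ClassLoader.getParent() and Class.getClassLoader() from the JDK; the class name LoaderChainDemo and the "bootstrap" label are made up for this illustration. The concrete loader class names differ between JDK versions and on Android, so no exact output is shown.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from the loader that defined the class up through its parents.
    static List<String> loaderChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader is represented as null in the API.
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // An application class: its chain has several loaders and ends at the bootstrap loader.
        System.out.println(loaderChain(LoaderChainDemo.class));
        // A core class: defined directly by the bootstrap loader.
        System.out.println(loaderChain(String.class));
    }
}
```

Because each loader asks its parent first, a request for java.lang.String always ends up at the bootstrap loader, so an application cannot replace core classes with its own copies.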

The above is just a rough introduction. For more detail on the five kinds of active references, the life cycle of classes, class constructors, class loaders and the parent delegation model, please read the article Understanding the Class Loading Mechanism in the JVM.

2.3 Java virtual machine bytecode execution engine

A very important area among the JVM's runtime memory areas is the Java virtual machine stack. Each time a Java method is executed, a stack frame is pushed onto the Java virtual machine stack. When the method finishes, its stack frame is popped.
The two most important concepts in a stack frame are the local variable table and the operand stack. Executing the bytecode of a Java method means running bytecode instructions that manipulate the local variable table and the operand stack, and finally returning the result. If you want to learn the Java bytecode instructions, I recommend this article.
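
As an illustration of how instructions drive the local variable table and the operand stack, the toy interpreter below executes the seven code bytes of Demo.inc() taken from the hex dump in section 2.1 (2a b4 00 02 04 60 ac, that is aload_0, getfield #2, iconst_1, iadd, ireturn). This is a teaching sketch, not how a real JVM is implemented: the DemoObject class is a stand-in for an instance of Demo, and the constant pool index of getfield is skipped on the assumption that it refers to the field m.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyInterpreter {
    // Stand-in for an instance of Demo; only the field m matters here.
    static class DemoObject {
        int m;
        DemoObject(int m) { this.m = m; }
    }

    // Opcode values as defined by the JVM specification.
    static final int ICONST_1 = 0x04;
    static final int ALOAD_0 = 0x2a;
    static final int IADD = 0x60;
    static final int IRETURN = 0xac;
    static final int GETFIELD = 0xb4;

    static int execute(byte[] code, Object[] locals) {
        Deque<Object> stack = new ArrayDeque<>(); // the operand stack of this frame
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF;
            switch (opcode) {
                case ALOAD_0: // push local variable 0 ("this")
                    stack.push(locals[0]);
                    break;
                case GETFIELD: { // pop the object reference, push the field value
                    pc += 2; // skip the constant pool index; assumed to refer to Demo.m
                    DemoObject target = (DemoObject) stack.pop();
                    stack.push(target.m);
                    break;
                }
                case ICONST_1: // push the int constant 1
                    stack.push(1);
                    break;
                case IADD: { // pop two ints, push their sum
                    int right = (Integer) stack.pop();
                    int left = (Integer) stack.pop();
                    stack.push(left + right);
                    break;
                }
                case IRETURN: // pop the result and hand it to the caller
                    return (Integer) stack.pop();
                default:
                    throw new IllegalStateException("Unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        throw new IllegalStateException("Reached the end of the code without a return");
    }

    public static void main(String[] args) {
        byte[] incCode = {0x2a, (byte) 0xb4, 0x00, 0x02, 0x04, 0x60, (byte) 0xac};
        Object[] locals = {new DemoObject(41)}; // local variable 0 holds "this"
        System.out.println(execute(incCode, locals)); // prints 42
    }
}
```

The operand stack never holds more than two values here, which matches the max_stack value of 2 recorded for inc() in the class file.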

Besides how a method executes, you also need to understand method invocation in Java. Method invocation here means turning the symbolic reference to a method in the .class file into a direct reference to that method. This may happen during class loading or at run time.
For some methods the direct reference is already fixed during class loading, for example static methods, private methods and instance constructors. Calling such methods is called resolution. Static dispatch is likewise settled before the program runs: the version of an overloaded method is chosen from the static (declared) types of the arguments, and overloading is its most common application.
For other methods the direct reference is determined only at run time, for example overridden methods. Calling an overridden method requires the actual type of the receiver object, so the invokevirtual bytecode instruction looks up the appropriate method at run time. This is dynamic dispatch.
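
The difference between the two kinds of dispatch shows up when the static type and the actual type of a variable differ. The following minimal sketch uses made-up class names (Human, Man) and methods (greet, describe) purely for illustration.

```java
class DispatchDemo {
    static class Human {
        String describe() { return "human"; }
    }

    static class Man extends Human {
        @Override
        String describe() { return "man"; }
    }

    // Overloads: the version is chosen from the static type of the argument.
    static String greet(Human human) { return "hello, human"; }
    static String greet(Man man) { return "hello, man"; }

    public static void main(String[] args) {
        Human person = new Man(); // static type Human, actual type Man

        // Static dispatch: the argument's static type is Human.
        System.out.println(greet(person));     // prints: hello, human

        // Dynamic dispatch: the receiver's actual type is Man.
        System.out.println(person.describe()); // prints: man
    }
}
```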

The Java virtual machine interprets and executes bytecode on a stack-based architecture. The stack here refers to the operand stack inside each frame of the Java virtual machine stack. Interpreted execution is the counterpart of compiled execution: the source code is compiled into the bytecode instruction set, which is then executed by an interpreter. You do not need to understand this too deeply; knowing the definitions is enough.

The above introduces stack frames in the Java virtual machine stack, method invocation, resolution, static dispatch, dynamic dispatch, and stack-based interpreted execution in the Java virtual machine. For details, please refer to the article Virtual Machine Bytecode Execution Engine.

3. Visitor Pattern & ASM

3.1 Visitor Pattern

The ASM library is a tool for analyzing and modifying code at the Java bytecode level. So what is the relationship between ASM and the visitor pattern? The visitor pattern is mainly used to operate on and modify data whose structure is relatively stable. From the previous sections we know that the structure of a .class file is fixed: it mainly consists of the constant pool, field table, method table, attribute table and so on. By using the visitor pattern to scan each table in the .class file, its contents can be modified along the way. Before learning ASM, you can read about the visitor pattern and ASM in this article.
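
The idea can be sketched without ASM itself. In the toy example below, ClassModel plays the role of the stable structure, ModelVisitor is the visitor with one callback per kind of element, and MethodFilter is an adapter placed in the middle of the chain that changes what the next visitor sees. All of these names are invented for this illustration and are not ASM's API, although ASM's ClassReader, ClassVisitor and ClassWriter cooperate in a broadly similar way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class VisitorSketch {
    // Visitor with one callback per element kind; by default each call is forwarded.
    static class ModelVisitor {
        private final ModelVisitor next;
        ModelVisitor(ModelVisitor next) { this.next = next; }
        void visitClass(String name) { if (next != null) next.visitClass(name); }
        void visitField(String name) { if (next != null) next.visitField(name); }
        void visitMethod(String name) { if (next != null) next.visitMethod(name); }
    }

    // The stable structure: it always presents its parts in the same order.
    static class ClassModel {
        final String name;
        final List<String> fields;
        final List<String> methods;

        ClassModel(String name, List<String> fields, List<String> methods) {
            this.name = name;
            this.fields = fields;
            this.methods = methods;
        }

        void accept(ModelVisitor visitor) {
            visitor.visitClass(name);
            for (String field : fields) visitor.visitField(field);
            for (String method : methods) visitor.visitMethod(method);
        }
    }

    // Terminal visitor that records everything it is shown.
    static class Recorder extends ModelVisitor {
        final List<String> seen = new ArrayList<>();
        Recorder() { super(null); }
        @Override void visitClass(String name) { seen.add("class " + name); }
        @Override void visitField(String name) { seen.add("field " + name); }
        @Override void visitMethod(String name) { seen.add("method " + name); }
    }

    // Adapter in the middle of the chain: hides one method, forwards the rest.
    static class MethodFilter extends ModelVisitor {
        private final String hidden;
        MethodFilter(ModelVisitor next, String hidden) {
            super(next);
            this.hidden = hidden;
        }
        @Override void visitMethod(String name) {
            if (!name.equals(hidden)) super.visitMethod(name);
        }
    }

    public static void main(String[] args) {
        ClassModel demo = new ClassModel("Demo", Arrays.asList("m"), Arrays.asList("<init>", "inc"));
        Recorder recorder = new Recorder();
        demo.accept(new MethodFilter(recorder, "<init>"));
        System.out.println(recorder.seen); // prints: [class Demo, field m, method inc]
    }
}
```

Because the structure drives the traversal and the visitors only react to callbacks, a new transformation can be added as one more adapter without touching the structure or the other visitors.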

3.2 Introduction and use of ASM library

ASM can directly produce binary .class files or dynamically modify class behavior before the class is loaded into the JVM. The article Introduction and Use of the ASM Library describes the structure of the ASM library and several important Core API classes, including ClassVisitor, ClassReader, ClassWriter, MethodVisitor and AdviceAdapter. Through two simple examples, it shows how to modify the bytecode of methods and of fields in a Java class.

When you first start out, you may not be very clear about how bytecode executes, which makes ASM harder to use. ASM officially provides a helper tool, ASMifier. We can first write the target code, compile it into a .class file with javac, and then run ASMifier on this .class file to obtain the ASM code that corresponds to the code we want to insert.

For details on using the Core API and ASMifier of the ASM library, please refer to the article Introduction and Use of the ASM Library.

4. Koala

Finally, having learned the theory, I wrote a small project to put it into practice: Koala, a library that uses a custom Gradle plug-in + ASM to implement the same functionality as Jake Wharton’s hugo library. It prints the parameters, return results and execution time of methods with a specific annotation to Logcat, to facilitate development and debugging.

4.1 Add Koala Gradle Plugin dependency

Add the following code in build.gradle of the project:

buildscript {
    repositories {
        maven {
            url "https://plugins.gradle.org/m2/"
        }
    }
    dependencies {
        classpath "gradle.plugin.com.lijiankun24:buildSrc:1.1.1"
    }
}

Add the following code to build.gradle in the module you need to use:

apply plugin: "com.lijiankun24.koala-plugin"

4.2 Add Koala dependency

Gradle:

 compile 'com.lijiankun24:koala:1.1.2'

Maven:

<dependency>
    <groupId>com.lijiankun24</groupId>
    <artifactId>koala</artifactId>
    <version>1.1.2</version>
    <type>pom</type>
</dependency>

4.3 Use

It is very simple to use. Add the @KoalaLog annotation to the Java method, as shown below:

@KoalaLog
public String getName(String first, String last) {
    SystemClock.sleep(15); // Don't ever really do this!
    return first + " " + last;
}

When the above method is called, the output in Logcat looks like this:

09-04 20:51:38.008 12076-12076/com.lijiankun24.practicedemo I/0KoalaLog: ┌──────────────────────────────────────────────────────────────────────
09-04 20:51:38.008 12076-12076/com.lijiankun24.practicedemo I/1KoalaLog: │ The class's name: com.lijiankun24.practicedemo.MainActivity
09-04 20:51:38.008 12076-12076/com.lijiankun24.practicedemo I/2KoalaLog: │ The method's name: getName(java.lang.String, java.lang.String)
09-04 20:51:38.008 12076-12076/com.lijiankun24.practicedemo I/3KoalaLog: │ The arguments: [li, jiankun]
09-04 20:51:38.008 12076-12076/com.lijiankun24.practicedemo I/4KoalaLog: │ The result: li jiankun
09-04 20:51:38.008 12076-12076/com.lijiankun24.practicedemo I/5KoalaLog: │ The cost time: 15ms
09-04 20:51:38.008 12076-12076/com.lijiankun24.practicedemo I/6KoalaLog: └──────────────────────────────────────────────────────────────────────
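
To see what the annotation saves you from writing, here is roughly what the same logging would look like if done by hand in plain Java. This is only a sketch of the observable effect under stated assumptions: it is not the bytecode that Koala actually inserts, it prints to standard output instead of Logcat, and the class name ManualLogSketch and the method getNameLogged are invented for the illustration.

```java
import java.util.Arrays;

class ManualLogSketch {
    static String getName(String first, String last) {
        return first + " " + last;
    }

    // Hand-written version of the effect: log arguments, result and elapsed time.
    static String getNameLogged(String first, String last) {
        long start = System.nanoTime();
        String result = getName(first, last);
        long costMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("The method's name: getName(java.lang.String, java.lang.String)");
        System.out.println("The arguments: " + Arrays.toString(new Object[] {first, last}));
        System.out.println("The result: " + result);
        System.out.println("The cost time: " + costMs + "ms");
        return result;
    }

    public static void main(String[] args) {
        getNameLogged("li", "jiankun");
    }
}
```

Writing this wrapper for every method quickly becomes repetitive, which is why generating the equivalent instructions at build time from a single annotation is attractive.
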

4.4 ProGuard rules

-keep class com.lijiankun24.koala.** { *; }

GitHub address: Koala
