How webpack works

Contents

    • Merge code
    • Modularization
    • Webpack packaging
    • The structure of webpack
    • Webpack source code
      • addEntry and _addModuleChain
      • buildModule
      • Compilation hook
      • Produce build results

Understanding how webpack is implemented and mastering its basic workflow pays off in day-to-day use. When we run into a problem, it helps us get to the root cause and work out an approach to solving it. It also helps us better understand what loaders and plugins are for and how to use them, which makes customized build requirements easier to handle.

Setting aside webpack’s complex loader and plugin mechanisms, webpack is essentially a JS module bundler, a tool for packaging multiple code modules together. So let’s leave the intricate overall implementation for later and first look at the basic workflow of a relatively simple JS module bundler. Once we understand how a bundler works, we can go through the whole webpack process and understand the loader and plugin mechanisms.

Merge code

Before JS module systems appeared, some front-end libraries that wanted to split their code into several files wrote simple tools to merge those files, in a defined order, into the final JS file. For example:

// A.js
function A() {}

// B.js
function B() {}

// result
function A() {}
function B() {
  // You can use method A here
}

Such a merge script is very simple, and readers can try writing one with Node. However, this simple way of merging has obvious shortcomings, and it becomes hard to maintain as the code base grows:

  • It is difficult to determine the order in which files should be merged
  • Variable and function names in different code files can easily conflict
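
A merge script of the kind described above might look like the following minimal sketch. The helper name `mergeFiles` and the in-memory `sources` table are assumptions made for illustration; a real script would read the file contents from disk with Node's `fs` module.

```javascript
// Hypothetical helper: concatenates source files in the order given.
// `sources` maps a file name to its content; a real script would read
// the contents from disk instead of using an in-memory table.
function mergeFiles(order, sources) {
  return order
    .map((name) => `// ${name}\n${sources[name]}`)
    .join('\n\n');
}

const sources = {
  'A.js': 'function A() {}',
  'B.js': 'function B() { A(); }',
};

// The caller picks the order by hand; nothing in the script knows
// that B.js uses A.js.
const merged = mergeFiles(['A.js', 'B.js'], sources);
console.log(merged);
```

Note that the order is passed in manually and nothing stops two files from declaring the same name, which are exactly the two shortcomings listed above.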

JS module systems emerged to solve these problems. Today, webpack is the most popular JS module bundler in the front-end community. Let’s take a look at how webpack solves the two problems above.

Modularization

The first problem is the order in which module files are placed when they are merged, which can be understood simply as the execution order of the module code. The CommonJS and ES Module specifications both define how a module declares its dependencies. Suppose we write the following code in our modules:

// entry.js
import { bar } from './bar.js'; // Depends on ./bar.js module

// bar.js
const foo = require('./foo.js'); // Depends on the ./foo.js module

This declares that before the code of the current module (entry.js) can run, the bar.js module has to be executed, and bar.js in turn depends on foo.js. The bundler has to parse the dependency on bar.js out of the entry code (the first snippet), read the bar.js file, parse out its dependency on foo.js, and keep resolving dependencies recursively until no new ones are found. Together, the dependent modules form a module dependency tree.

Dependency resolution and management is one of the most important jobs of a bundler such as webpack. If foo.js does not depend on any other module, the dependency tree of this example is very simple: entry.js -> bar.js -> foo.js. In day-to-day development, of course, the module dependencies we encounter are usually far more complex.

If we apply this to file merging, the code of the foo.js and bar.js modules above has to be placed before our entry code. But rather than simply merging files in dependency order, webpack takes a cleverer approach, one that also happens to solve the second problem of file merging mentioned earlier.
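
The recursive resolution described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `files` table stands in for the file system, the names `findDependencies` and `collectModules` are made up, paths are used as-is without resolution, and dependencies are found with a regular expression rather than by parsing an AST as a real bundler does.

```javascript
// In-memory stand-in for the file system (hypothetical file contents).
const files = {
  './entry.js': "import { bar } from './bar.js';",
  './bar.js': "const foo = require('./foo.js');",
  './foo.js': 'module.exports = 1;',
};

// Naive dependency extraction with a regular expression; a real bundler
// parses the code into an AST instead.
function findDependencies(code) {
  const pattern = /(?:from\s+|require\()\s*['"]([^'"]+)['"]/g;
  return Array.from(code.matchAll(pattern), (match) => match[1]);
}

// Walk the dependencies recursively, visiting each module only once.
function collectModules(entry, graph = {}) {
  if (graph[entry]) return graph; // already visited
  const deps = findDependencies(files[entry]);
  graph[entry] = deps;
  for (const dep of deps) collectModules(dep, graph);
  return graph;
}

const graph = collectModules('./entry.js');
console.log(graph);
```

The recursion stops when a module has no dependencies that have not been visited yet, which corresponds to the "until no new dependencies are found" condition in the text.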

Webpack packaging

Once the dependencies have been resolved, webpack takes advantage of JavaScript functions and provides some code to tie the modules together: each module is wrapped in a JS function, and a method for referencing dependent modules is provided, shown as __webpack_require__ in the example below. This both keeps variables from interfering with each other and gives effective control over the execution order. A simplified example:

// The code of every dependent module is wrapped and collected in one file, keyed by module ID.
const modules = {}

// entry.js
modules['./entry.js'] = function() {
  const { bar } = __webpack_require__('./bar.js')
}

// bar.js
modules['./bar.js'] = function() {
  const foo = __webpack_require__('./foo.js')
}

// foo.js
modules['./foo.js'] = function() {
  // ...
}

// The results of modules that have already been executed are saved here
const installedModules = {}

function __webpack_require__(id) {
  // ...
  // If the module is already in installedModules, return it directly
  // If not, get the function from modules, execute it, cache the result in installedModules and return it
}

When we covered size optimization of the JS code generated by webpack earlier, we mentioned a configuration option that removes part of this function wrapper code. The biggest drawback of the wrapper-based implementation is that it increases the size of the generated JS. When webpack can determine the execution order of the code, and can use unique module IDs to rename variables inside modules so that they do not conflict, this so-called glue code is no longer necessary.

Readers interested in how module code can be bundled without function wrappers can read this article: Webpack and Rollup: the same but different. You can also build a simple example with rollup and look at the generated code.
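
As a hand-written illustration of output without function wrappers, assume each of the three modules declared a top-level variable called `value`. The suffix-based renaming scheme below is made up for this sketch and is not what any particular bundler emits.

```javascript
// Modules are inlined in dependency order, and each top-level name gets
// a suffix derived from its module so that names cannot collide.

// foo.js
const value_foo = 'foo';

// bar.js
const value_bar = `bar uses ${value_foo}`;

// entry.js
const value_entry = value_bar.toUpperCase();
console.log(value_entry);
```

No require function or module table is needed here, because the order of the statements already encodes the execution order.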

The structure of webpack

Webpack needs to be highly extensible, especially to support plugins. It uses the tapable library (also developed by the author of webpack) to control each step of the build process.

For more on how to use this library, see its official documentation: tapable. It is not very complicated to use; its main purpose is to let you add various hook methods (Hooks).

Webpack defines its main build process on top of tapable and then uses tapable hooks to extend itself with a rich set of features. The same hooks give webpack its powerful external extensibility, that is, the plugin mechanism.

With that in mind, let’s look at several important concepts and at the main workflow of webpack.
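
The hook pattern can be sketched as follows. `MiniSyncHook` is a simplified stand-in written for this example; it is not tapable itself and only mimics the idea of registering callbacks with `tap` and triggering them with `call`. The hook names `start` and `done` and the `LoggerPlugin` label are likewise hypothetical.

```javascript
// Simplified stand-in for a synchronous hook; this is not tapable itself,
// it only mimics the tap/call pattern.
class MiniSyncHook {
  constructor() {
    this.taps = [];
  }
  // Register a named callback.
  tap(name, fn) {
    this.taps.push({ name, fn });
  }
  // Run every registered callback in registration order.
  call(...args) {
    for (const { fn } of this.taps) fn(...args);
  }
}

// A hypothetical build process exposing two hooks.
const hooks = { start: new MiniSyncHook(), done: new MiniSyncHook() };
const log = [];

hooks.start.tap('LoggerPlugin', (entry) => log.push(`start ${entry}`));
hooks.done.tap('LoggerPlugin', (stats) => log.push(`done in ${stats.ms}ms`));

// The build process triggers the hooks at the corresponding steps.
hooks.start.call('./entry.js');
hooks.done.call({ ms: 12 });
console.log(log);
```

The build code only knows that hooks exist at certain steps; what happens at each step is decided by whoever registered callbacks, which is what makes the process extensible from outside.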

  • Compiler: webpack’s entry point at run time. When instantiated, it defines the main flow of a webpack build and creates the core object used during the build, the compilation.
  • Compilation: instantiated by the Compiler. It stores the data used by each step of the build and controls how that data changes.
  • Chunk: the class that represents a chunk, the backbone of the build process. Generally, one entry corresponds to one chunk. The chunk objects needed during the build are created and managed by the Compilation.
  • Module: the class that represents a code module; many subclasses derive from it to handle different cases. All information about a code module is stored on its Module instance, for example dependencies, which records the module’s dependencies.
  • Parser: one of the more complex parts. It uses acorn to produce the AST and extracts the code module’s dependencies from it.
  • Dependency: the object that holds a dependency of a code module as found during parsing.
  • Template: the code template used to generate the final output. The wrapper function code mentioned above is generated with the corresponding Template.

The official definition of Compiler and Compilation is:

The compiler object represents the complete webpack environment configuration. This object is created once when webpack starts and is configured with all operational settings, including options, loaders and plugins. When a plugin is applied within a webpack environment, the plugin receives a reference to this compiler object. You can use it to access webpack’s main environment.

The compilation object represents a single build of versioned assets. When running the webpack development middleware, a new compilation is created every time a file change is detected, generating a new set of compiled assets. A compilation object exposes the current state of module resources, compiled assets, changed files, and watched dependencies. It also provides callbacks at many key steps, which a plugin can choose to use for custom processing.

These are some of the more important parts of webpack’s source code. The approximate workflow of a webpack run is as follows:

Create Compiler ->
Call compiler.run to start building ->
Create Compilation ->
Start creating Chunks based on configuration ->
Use Parser to parse dependencies starting from each Chunk ->
Use Module and Dependency to manage code module relationships ->
Use Template to generate result code based on Compilation data

This is only a general outline based on the author’s understanding, and the details are considerably more complicated. Part of the complexity lies in the technical implementation, and part lies in the logic of the features being implemented. Covering it all in depth would take a great deal of space and might not be very effective. Until we need to implement a specific feature, we do not have to care about such implementation details; it is enough to analyze the overall process at a higher level.

Readers who want to explore the implementation details of a particular part can check the webpack source code, starting with the basic webpack process: Compiler Hooks.

The links given here point to the master branch of the 4.x source code. The webpack source is relatively hard to read. If you want to learn the whole workflow of a bundler, consider reading the source code of rollup instead, which is much more readable.

Webpack source code

Most of webpack’s main build logic lives in Compilation. To understand the loader and plugin mechanisms, we need to dig into Compilation.

The implementation of Compilation is also fairly complex: lib/Compilation.js alone has nearly 2,000 lines of code. Let’s go through a few key parts.

addEntry and _addModuleChain

As the name suggests, addEntry adds the configured entry to the build task. Once the webpack configuration has been parsed and the build is ready to start, addEntry is executed, and it calls _addModuleChain to create the corresponding Module instance for the entry file (at this point the entry file is effectively the first dependency).

_addModuleChain creates a moduleFactory based on the type of the entry file’s first dependency, and then uses this moduleFactory to create a Module instance for the entry file. This Module instance manages the data needed for the rest of this entry’s build. For the implementation of the Module class, see the source: lib/Module.js, which is a base class. Most of the Module instances for the code modules we build are created from the class in lib/NormalModule.js.

We introduce addEntry mainly to locate the starting point of the whole build, so that everything can be traced from there and any further digging can start from this point.

buildModule

Once a Module instance has been created, the next important step is to execute compilation.buildModule. This method mainly calls the build method of the Module instance, which sets up what the Module instance needs. For the purpose of tracing the process, the most important part is the call to runLoaders.

runLoaders is implemented by a library that webpack depends on: loader-runner. It is fairly easy to understand: it executes the loaders that apply to the module, passing the source code through each loader specified in the configuration in turn, and saves the result once processing is finished.

As described earlier, a webpack loader is a converter, and this is the point where loaders come into play. Readers who want to know the details of loader execution can study the implementation of loader-runner.

After the build method of the Module instance has run the loaders and finished converting the module’s own code, there is one more very important step: calling the Parser instance to parse out the modules that this module depends on. The results are stored in module.dependencies. At first only the dependency paths are saved; they are processed later by compilation.processModuleDependencies, which handles each dependent module in turn and so builds the whole dependency tree recursively.
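
The idea of passing source code through a chain of loaders can be sketched like this. The loaders `stripComments` and `addBanner` and the helper `runLoaderChain` are hypothetical and are not part of loader-runner; real loaders can also be asynchronous and have a pitching phase, both of which are left out here.

```javascript
// Hypothetical loaders: each one is a function from source text to source text.
const stripComments = (source) => source.replace(/\/\/.*$/gm, '').trim();
const addBanner = (source) => `/* built */\n${source}`;

// Apply loaders from the last entry to the first, which matches the order
// webpack uses for a list of loaders in a rule. Asynchronous loaders and
// the pitching phase are omitted in this sketch.
function runLoaderChain(loaders, source) {
  return loaders.reduceRight((code, loader) => loader(code), source);
}

const output = runLoaderChain([addBanner, stripComments], 'const a = 1; // note');
console.log(output);
```

Each loader only sees the output of the previous one, so loaders can be combined freely as long as each produces something the next can consume.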

Compilation hook

We mentioned earlier that webpack uses tapable to define hooks for each step of the build process. Event functions are registered on these hooks and are triggered when the corresponding step runs. A registered function can adjust the context data of the build or do additional processing. This is webpack’s plugin mechanism.

The following piece of code can be found in lib/webpack.js, webpack’s execution entry point:

if (options.plugins && Array.isArray(options.plugins)) {
  for (const plugin of options.plugins) {
    if (typeof plugin === "function") {
      plugin.call(compiler, compiler);
    } else {
      plugin.apply(compiler);
    }
  }
}

The plugin’s apply method registers event hook functions on the compiler instance. Some of the compiler’s event hooks provide a reference to the compilation instance, and through that reference event functions can be registered on the compilation instance as well. In this way a plugin’s reach can extend across the whole webpack build process.

For the names and definitions of these event hooks, see the official documentation: compiler event hooks and compilation event hooks.

A later chapter introduces how to write a webpack plugin. Reading the two parts together should help in understanding how webpack plugins are executed.
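
The following sketch shows this chain of registrations. To stay runnable without webpack installed, it uses a fake compiler and compilation built from a tiny `createHook` helper; only the shape `hooks.<name>.tap(...)` is borrowed from webpack, and `ModuleCounterPlugin` is a made-up plugin for illustration.

```javascript
// Minimal fake hook so the example runs without webpack. Only the shape
// (hooks.<name>.tap / call) is borrowed from the real API.
function createHook() {
  const taps = [];
  return {
    tap: (name, fn) => taps.push(fn),
    call: (...args) => taps.forEach((fn) => fn(...args)),
  };
}

const fakeCompiler = { hooks: { compilation: createHook() } };

// A hypothetical plugin: it registers on a compiler hook, receives the
// compilation there, and registers again on a compilation hook.
class ModuleCounterPlugin {
  constructor() {
    this.count = 0;
  }
  apply(compiler) {
    compiler.hooks.compilation.tap('ModuleCounterPlugin', (compilation) => {
      compilation.hooks.buildModule.tap('ModuleCounterPlugin', () => {
        this.count += 1;
      });
    });
  }
}

const plugin = new ModuleCounterPlugin();
plugin.apply(fakeCompiler);

// Simulate what a build would do: create a compilation, build two modules.
const fakeCompilation = { hooks: { buildModule: createHook() } };
fakeCompiler.hooks.compilation.call(fakeCompilation);
fakeCompilation.hooks.buildModule.call({ id: './entry.js' });
fakeCompilation.hooks.buildModule.call({ id: './bar.js' });
console.log(plugin.count);
```

In a real build the compiler and compilation objects come from webpack, and the plugin only supplies the apply method.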

Produce build results

The last part is using Template to produce the code of the final build result. We will not cover it in detail and only leave a few pointers for readers who want to dig deeper:

  • Base Template class: lib/Template.js
  • The main Template class in common use: lib/MainTemplate.js
  • The code in Compilation that produces the build results: compilation.createChunkAssets
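
As a rough idea of what generating result code from a template means, the sketch below renders a table of module sources into the text of a bundle. The function `renderBundle` and the toy runtime it emits are assumptions for illustration and do not reflect how webpack's Template classes are implemented.

```javascript
// Hypothetical template function: turns a table of module sources into
// the text of a bundle. The runtime it emits is a toy, not webpack's.
function renderBundle(entryId, moduleSources) {
  const moduleEntries = Object.entries(moduleSources)
    .map(([id, code]) => `${JSON.stringify(id)}: function (module, require) {\n${code}\n}`)
    .join(',\n');
  return [
    '(function (modules) {',
    '  var cache = {};',
    '  function require(id) {',
    '    if (cache[id]) return cache[id].exports;',
    '    var module = (cache[id] = { exports: {} });',
    '    modules[id](module, require);',
    '    return module.exports;',
    '  }',
    `  return require(${JSON.stringify(entryId)});`,
    `})({\n${moduleEntries}\n})`,
  ].join('\n');
}

const bundleText = renderBundle('./entry.js', {
  './entry.js': "module.exports = require('./foo.js') + 1;",
  './foo.js': 'module.exports = 41;',
});

// Evaluate the generated text to check that it is a working bundle.
const bundleResult = new Function(`return ${bundleText};`)();
console.log(bundleResult);
```

The fixed lines play the role of the template, and the module table is the data filled into it.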

That concludes this part. Readers who want to explore further are encouraged to use breakpoint debugging, together with what has been covered here, to walk roughly through the webpack build process. Doing so leaves a much stronger impression of this material, and breakpoints also make it possible to study the details of a particular part in a more targeted way.