5 tips on how to write optimal code in the V8 engine

This chapter will dive into the internals of Google’s V8 engine. We’ll also provide a few tips on how to write better JavaScript code – best practices that the SessionStack development team followed when building the product.

Overview

A JavaScript engine is a program or an interpreter that executes JavaScript code. It can be implemented as a standard interpreter or as a just-in-time compiler that compiles JavaScript to bytecode in some form.

Here is a list of popular projects implementing JavaScript engines:

  • V8 – written in C++, open-sourced by Google
  • Rhino – managed by the Mozilla Foundation, open source, developed entirely in Java.
  • SpiderMonkey – the first JavaScript engine, which once powered Netscape Navigator and today powers Firefox.
  • JavaScriptCore – open source, marketed as Nitro, and developed by Apple for Safari.
  • KJS – The KDE engine, originally developed by Harri Porten for the KDE project’s Konqueror browser.
  • Chakra (JScript9) – Internet Explorer
  • Chakra (JavaScript) – Microsoft Edge
  • Nashorn – Open source as part of OpenJDK, written by the Oracle Java Language and Tool Group.
  • JerryScript – a lightweight IoT engine.

The origin of the V8 engine

The V8 engine is open-sourced by Google and written in C++. It is built into Google Chrome. Unlike most other engines, V8 is also used outside the browser, in the popular Node.js runtime.

Originally V8 was designed to improve the performance of JavaScript execution in web browsers. To gain speed, V8 translates JavaScript code into more efficient machine code instead of using an interpreter. It compiles JavaScript to machine code at execution time with a just-in-time (JIT) compiler, as many modern JavaScript engines such as SpiderMonkey or Rhino (Mozilla) do. The main difference was that V8, in its original design, did not produce bytecode or any intermediate code.

V8 used to have two compilers

Before V8 5.9 (early 2017), the engine had two compilers:

  • full-codegen – A simple and fast compiler to produce simple and relatively slow machine code.
  • Crankshaft – A more sophisticated (just-in-time) optimizing compiler used to generate efficient code.

The V8 engine also uses multiple threads internally:

  • The main thread does what you would expect: it fetches the code, compiles it and executes it.
  • A separate thread compiles the code, so the main thread can keep executing while the code is being optimized.
  • A profiler thread tells the runtime which methods we spend a lot of time in, so that Crankshaft can optimize them.
  • Several threads handle garbage collector sweeps.

When JavaScript code is executed for the first time, V8 uses full-codegen to translate the parsed JavaScript directly into machine code, without any transformation in between. This lets it start running machine code very quickly. Note that in this pipeline V8 does not use an intermediate bytecode representation, which removes the need for an interpreter.

When the code has been executing for a while, the profiler thread has collected enough data to tell Crankshaft which methods can be optimized.

Next, Crankshaft optimization begins in another thread. It translates the JavaScript abstract syntax tree into a high-level static single-assignment (SSA) representation called Hydrogen and tries to optimize that Hydrogen graph. Most optimizations are done at this level.

Inlining

The first optimization is to inline as much code as possible in advance. Inlining is the process of replacing a call site (the line of code where the function is called) with the body of the called function. This simple step makes the optimizations that follow more meaningful.
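As a rough source-level illustration of the idea, the sketch below shows a call and a hand-inlined equivalent. V8 performs inlining on its internal graph rather than on your source text, and the function names here are made up for the example.

```javascript
// Hypothetical helper: small enough to be a typical inlining candidate.
function square(n) {
  return n * n;
}

// What you write: the loop body calls square() on every iteration.
function sumOfSquares(values) {
  let total = 0;
  for (const v of values) {
    total += square(v);
  }
  return total;
}

// Roughly what the optimizer works with after inlining: the call is gone,
// replaced by the body of square().
function sumOfSquaresInlined(values) {
  let total = 0;
  for (const v of values) {
    total += v * v;
  }
  return total;
}

console.log(sumOfSquares([1, 2, 3]));        // 14
console.log(sumOfSquaresInlined([1, 2, 3])); // 14
```

Once the call is gone, the multiplication sits directly inside the loop, where later optimization passes can see it.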

Hidden class

JavaScript is a prototype-based language: there are no classes, and objects are created through a cloning process. JavaScript is also a dynamic programming language, which means that properties can be added to or removed from an object at will after it has been instantiated.

Most JavaScript interpreters use dictionary-like structures (based on a hash function) to store the location of object property values in memory. This makes retrieving a property value in JavaScript more expensive than in non-dynamic languages such as Java or C#. In Java, all object properties are determined by a fixed object layout before compilation and cannot be dynamically added or removed at runtime (C# does have a dynamic type, but that is another topic). As a result, property values (or pointers to them) can be stored in memory as a contiguous buffer with a fixed offset between them. The size of each offset can easily be determined from the property type, whereas this is not possible in JavaScript because a property’s type can change at runtime.

Since using a dictionary to find the location of an object’s properties in memory is very inefficient, V8 uses a different method: hidden classes. Hidden classes work similarly to the fixed object layouts (classes) used in languages such as Java, except that they are created at runtime. Now, let’s see what they actually look like:

function Point(x, y) {
  this.x = x;
  this.y = y;
}

var p1 = new Point(1, 2);

Once the “new Point(1, 2)” call occurs, V8 will create a hidden class called “C0”.

“C0” is empty because no properties have been defined for the point object yet.

Once the first statement “this.x = x” is executed (inside the Point function), V8 will create a second hidden class called “C1” that is based on “C0”. “C1” describes the location in memory (relative to the object pointer) where the property “x” can be found. In this example, “x” is stored at offset 0, which means that when the point object is viewed as a contiguous buffer in memory, the first offset corresponds to property “x”. V8 will also update “C0” with a “class transition”, which states that if a property “x” is added to a point object, the hidden class should switch from “C0” to “C1”. The hidden class of the point object is now “C1”.

Whenever a new property is added to an object, the old hidden class is updated with a transition path to the new hidden class. Hidden class transitions are important because they allow objects created in the same way to share hidden classes. If two objects share a hidden class and the same property is added to both of them, the transitions ensure that both objects receive the same new hidden class and all the optimized code that comes with it.

The same process is repeated when the “this.y = y” statement is run (again in the Point function, after the “this.x = x” statement).

A new hidden class called “C2” is created, and a class transition is added to “C1” stating that if a property “y” is added to a point object (that already contains property “x”), the hidden class should change to “C2”. The hidden class of the point object is then updated to “C2”.
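The C0, C1, C2 chain described above can be pictured with a small toy model. This is only a sketch of the idea: the `HiddenClass` class and its fields are invented for illustration and do not correspond to V8’s internal data structures or any public API.

```javascript
// Toy model only: these names are illustrative and are not V8 APIs.
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;        // property name -> slot index
    this.transitions = new Map();  // property name -> next HiddenClass
  }

  // Follow (or create) the transition taken when `name` is added.
  addProperty(name) {
    if (!this.transitions.has(name)) {
      const nextOffsets = new Map(this.offsets);
      nextOffsets.set(name, this.offsets.size);
      this.transitions.set(name, new HiddenClass(nextOffsets));
    }
    return this.transitions.get(name);
  }
}

const C0 = new HiddenClass();      // empty: no properties yet
const C1 = C0.addProperty("x");    // after this.x = x
const C2 = C1.addProperty("y");    // after this.y = y

// A second object built the same way walks the same transitions.
const again = C0.addProperty("x").addProperty("y");

console.log(C1.offsets.get("x"));  // 0
console.log(C2.offsets.get("y"));  // 1
console.log(again === C2);         // true
```

Because transitions are stored on the class they start from, any object that adds the same properties in the same order ends up at the same class.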

Hidden class transitions depend on the order in which properties are added to an object. Look at the following code snippet:

function Point(x, y) {
  this.x = x;
  this.y = y;
}

var p1 = new Point(1, 2);
p1.a = 5;
p1.b = 6;

var p2 = new Point(3, 4);
p2.b = 7;
p2.a = 8;

Now, you might assume that p1 and p2 would use the same hidden classes and transitions. However, this is not the case. For “p1”, property “a” is added first and then property “b”. For “p2”, property “b” is added first, followed by “a”. As a result, “p1” and “p2” end up with different hidden classes because they followed different transition paths. In cases like this, it is much better to initialize dynamic properties in the same order so that hidden classes can be reused.
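One way to follow this advice is to route all dynamic property additions through a single function, so the order cannot vary between call sites. The sketch below assumes a helper named `withExtras`, which is made up for this example.

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
}

// Hypothetical helper: every caller adds the extra properties in one fixed order.
function withExtras(point, a, b) {
  point.a = a;
  point.b = b;
  return point;
}

const p1 = withExtras(new Point(1, 2), 5, 6);
const p2 = withExtras(new Point(3, 4), 8, 7);

// Property insertion order is observable; hidden classes themselves are not.
console.log(Object.keys(p1).join()); // x,y,a,b
console.log(Object.keys(p2).join()); // x,y,a,b
```

Plain JavaScript cannot inspect hidden classes directly. For experiments, Node.js can be started with `--allow-natives-syntax`, which exposes internal debugging helpers such as `%HaveSameMap(p1, p2)`; these are not a stable API and should not appear in production code.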

Inline caching

V8 takes advantage of another technique for optimizing dynamically typed languages called inline caching. Inline caching relies on the observation of repeated calls to the same method on objects of the same type.

We will touch on the general concept of inline caching here (in case you don’t have time to read a more in-depth article on the subject).

How does it work? V8 maintains a cache of object types that were passed to recently called methods as arguments, and then uses this information to assume that at some point in the future this object type will be passed to the method. If V8 is good at predicting the type of object being passed into a method, it can bypass the process of finding how to access the object’s properties and instead use stored information from previously queried objects’ hidden classes.

So how do the concepts of hidden classes and inline caching relate? Whenever a method is called on a specific object, the V8 engine has to look up that object’s hidden class to determine the offset for accessing a specific property. After two successful calls of the same method on the same hidden class, V8 omits the hidden class lookup and simply adds the property’s offset to the object pointer itself. For all subsequent calls to that method, the V8 engine assumes that the hidden class has not changed and jumps directly to the memory address of the property using the offset stored from previous lookups. This greatly increases execution speed.
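The mechanism can be sketched with a toy model of a single property-load site. Everything here is an assumption made for illustration: `shape` stands in for a hidden class, objects are modelled as a shape plus a slot array, and the cache is filled after one lookup rather than two to keep the code short.

```javascript
// Toy model only: `shape` stands in for a hidden class; none of this is V8 API.
const pointShape = { offsets: { x: 0, y: 1 } };
const otherShape = { offsets: { y: 0, x: 1 } };

function makeObject(shape, slots) {
  return { shape, slots };
}

// Builds a property-load site with its own one-entry (monomorphic) cache.
function makeLoadSite(name) {
  let cachedShape = null;
  let cachedOffset = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    if (obj.shape === cachedShape) {
      stats.hits++;                          // fast path: reuse the offset
      return obj.slots[cachedOffset];
    }
    stats.misses++;                          // slow path: full lookup
    cachedShape = obj.shape;
    cachedOffset = obj.shape.offsets[name];
    return obj.slots[cachedOffset];
  }

  return { load, stats };
}

const site = makeLoadSite("x");
site.load(makeObject(pointShape, [1, 2]));   // miss, fills the cache
site.load(makeObject(pointShape, [3, 4]));   // hit
site.load(makeObject(otherShape, [5, 6]));   // miss: different shape
console.log(site.stats);                     // { hits: 1, misses: 2 }
```

The third load shows the cost of mixing shapes: the object holds the same kind of data, but because its shape differs, the cached offset cannot be reused.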

Inline caching is also why it is so important for objects of the same type to share hidden classes. If you create two objects of the same type but with different hidden classes (as in the earlier example), V8 will not be able to use inline caching: even though the two objects are of the same type, their hidden classes assign different offsets to their properties.

The two objects from the earlier example are basically the same, but their “a” and “b” properties were created in a different order.

Compile to machine code

Once the Hydrogen graph is optimized, Crankshaft lowers it to a lower-level representation called Lithium. Most of the Lithium implementation is architecture-specific. Register allocation happens at this level.

Finally, Lithium is compiled to machine code. Then something else happens, called OSR: on-stack replacement. Before V8 started compiling and optimizing an obviously long-running method, it was most likely already running it. V8 does not throw away the work done so far and start over with the optimized version. Instead, it transforms all of the current context (stack, registers) so that it can switch to the optimized version in the middle of execution. This is a complex task, bearing in mind that, among other optimizations, V8 has inlined code along the way. V8 is not the only engine capable of this.

There are safeguards, called deoptimization, that perform the opposite transformation and revert to the non-optimized code when an assumption the engine made no longer holds.
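A typical way for an assumption to break is a change in argument types. The sketch below shows the pattern; whether and when V8 actually optimizes or deoptimizes `add` depends on the engine version and its heuristics, so treat the comments as a description of what may happen, not a guarantee.

```javascript
function add(a, b) {
  return a + b;
}

// Warm-up: only small integers are seen, so an optimizer may specialize
// add() for integer addition.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = add(total, 1);
}

// A string argument violates that assumption. The result is still correct;
// the engine just has to fall back to a more general code path.
const label = add("total: ", total);

console.log(total); // 100000
console.log(label); // total: 100000
```

Deoptimization never changes what the program computes, only how fast it runs. To watch it happen, V8’s `--trace-deopt` flag (for example `node --trace-deopt script.js`) logs deoptimization events.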

Garbage Collection

For garbage collection, V8 uses a traditional generational approach, with mark-and-sweep to clean up the old generation. The marking phase has to stop JavaScript execution. To control GC costs and make execution more stable, V8 uses incremental marking: instead of walking the whole heap and trying to mark every possible object, it walks only part of the heap and then resumes normal execution. The next GC stop continues from where the previous heap walk left off. This allows for very short pauses during normal execution. As mentioned earlier, the sweep phase is handled by separate threads.
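The idea of marking in slices can be sketched with a toy object graph and a worklist that is processed a few objects at a time. This is an illustration under simplifying assumptions, not V8’s collector: the names are invented, and the graph does not change between steps, whereas a real incremental marker needs extra bookkeeping to handle objects that are modified while marking is paused.

```javascript
// Toy model only: objects are plain records with a `refs` array.
function makeNode(name) {
  return { name, refs: [], marked: false };
}

// Creates a marker that processes at most `budget` objects per step.
function createIncrementalMarker(roots) {
  const worklist = [...roots];
  return function step(budget) {
    let processed = 0;
    while (worklist.length > 0 && processed < budget) {
      const node = worklist.pop();
      if (node.marked) continue;
      node.marked = true;
      processed++;
      worklist.push(...node.refs);
    }
    return worklist.length === 0; // true once marking is complete
  };
}

const a = makeNode("a");
const b = makeNode("b");
const c = makeNode("c");
const garbage = makeNode("garbage");
a.refs.push(b);
b.refs.push(c);

const step = createIncrementalMarker([a]);
let done = step(1);          // marks a, then "resumes JavaScript"
while (!done) {
  done = step(1);            // later slices continue where marking stopped
}
console.log(c.marked, garbage.marked); // true false
```

The worklist is what lets each slice pick up where the previous one stopped; anything still unmarked when it is empty is unreachable and can be swept.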

Ignition and TurboFan

With the release of V8 5.9 in early 2017 came a new execution pipeline. The new pipeline brings even greater performance gains and significant memory savings to real-world JavaScript programs.

The new execution pipeline is built on top of Ignition, the new V8 interpreter, and TurboFan, V8’s latest optimizing compiler.

Since the release of V8 5.9, full-codegen and Crankshaft (the technologies V8 had been using since 2010) are no longer used by V8 to run JavaScript, because the V8 team had struggled to keep pace with new JavaScript language features and the optimizations those features require.

This means that V8 as a whole will have a simpler and more maintainable architecture.

Improvements to Web and Node.js benchmark scores

These improvements are just the start. The new Ignition and TurboFan pipeline paves the way for further optimizations that will boost JavaScript performance and shrink V8’s footprint in both Chrome and Node.js in the coming years.

Finally, here are some tips on how to write well-optimized, better JavaScript. You could easily derive them from the content above, but for your convenience, here is a summary:

How to write optimized JavaScript code

  • Order of object properties: Always instantiate your object properties in the same order so that hidden classes, and subsequently optimized code, can be shared.
  • Dynamic properties: Adding properties to an object after instantiation forces a hidden class change and slows down any methods that were optimized for the previous hidden class. Instead, assign all of an object’s properties in its constructor.
  • Methods: Code that executes the same method repeatedly will run faster than code that executes many different methods only once (thanks to inline caching).
  • Arrays: Avoid sparse arrays whose keys are not incremental numbers. A sparse array that does not contain every element is stored internally as a hash table, and accessing its elements is more expensive. Likewise, try to avoid pre-allocating large arrays; it is better to grow them as you go. Finally, don’t delete elements from arrays, as this makes the keys sparse.
  • Tagged values: V8 represents objects and numbers with 32 bits. It uses one bit to mark whether the value is an object (flag = 1) or an integer (flag = 0). Such an integer is called an SMI (SMall Integer) because it has only 31 bits. If a numeric value does not fit in 31 bits, V8 boxes the number: it turns it into a double and creates a new object to hold it. Whenever possible, use 31-bit signed numbers to avoid the expensive boxing operation into a JS object.
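The array and tagged-value tips can be made concrete with a short sketch. The helper `fitsIn31Bits` is hypothetical and only checks the numeric range described above; the representation V8 actually picks for a number depends on the platform and build, and cannot be observed from JavaScript.

```javascript
// Dense array: grown as it is used, indices 0..n-1 all present.
const dense = [];
for (let i = 0; i < 5; i++) {
  dense.push(i * 2);
}

// Sparse array: one far-away index leaves a large run of holes.
const sparse = [];
sparse[0] = 0;
sparse[10000] = 1;

console.log(dense.length, Object.keys(dense).length);   // 5 5
console.log(sparse.length, Object.keys(sparse).length); // 10001 2

// Hypothetical helper: true if n is an integer in the 31-bit signed range
// described above, i.e. -2^30 .. 2^30 - 1.
function fitsIn31Bits(n) {
  return Number.isInteger(n) && n >= -(2 ** 30) && n <= 2 ** 30 - 1;
}

console.log(fitsIn31Bits(2 ** 30 - 1)); // true
console.log(fitsIn31Bits(2 ** 30));     // false
console.log(fitsIn31Bits(0.5));         // false
```

The gap between `length` and the number of keys is a quick way to spot a sparse array in your own code.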