How js calls c++ and c++ calls libuv in Node.js

Registration and use of c++ modules

We know that Node.js is composed of js, c++, and c code. This article looks at how the three layers divide the work and cooperate, using the net module as an example. The net module is loaded as follows.

const net = require('net');

The net module is a native js module, implemented in lib/net.js in the Node.js source. It wraps both tcp and pipes; only the tcp part is covered here. A tcp server can be created with the following code.

const net = require('net');

net.createServer((socket) => {}).listen(80);

js itself has no networking capability, so the network functionality is implemented by a c++ module inside Node.js. When the server is created, Node.js therefore creates an object backed by c++.

const {
  TCP,
  constants: TCPConstants,
} = internalBinding('tcp_wrap');
new TCP(TCPConstants.SERVER);

The TCP object encapsulates the underlying TCP functionality and corresponds to the tcp_wrap module in the c++ layer. Before analyzing tcp_wrap, let’s first look at what internalBinding does.

let internalBinding;
{
  const bindingObj = ObjectCreate(null);
  internalBinding = function internalBinding(module) {
    let mod = bindingObj[module];
    if (typeof mod !== 'object') {
      mod = bindingObj[module] = getInternalBinding(module);
    }
    return mod;
  };
}
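
A quick way to see what this wrapper does is to run it against a stand-in loader. In the sketch below, getInternalBinding is a hypothetical replacement that only counts its calls (the real one is implemented in c++), and ObjectCreate is replaced by the plain Object.create.

```javascript
// Hypothetical stand-in for the real c++ loader: it only counts its calls.
let loadCount = 0;
function getInternalBinding(name) {
  loadCount++;
  return { name };
}

let internalBinding;
{
  const bindingObj = Object.create(null);
  internalBinding = function internalBinding(module) {
    let mod = bindingObj[module];
    if (typeof mod !== 'object') {
      // First request for this name: load it and remember the result.
      mod = bindingObj[module] = getInternalBinding(module);
    }
    return mod;
  };
}

const first = internalBinding('tcp_wrap');
const second = internalBinding('tcp_wrap');
console.log(first === second, loadCount); // true 1
```

The second call returns the cached object, so the loader runs only once per module name.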

internalBinding adds a cache on top of getInternalBinding. Let’s continue with getInternalBinding.

// Find the corresponding module based on the module name
void GetInternalBinding(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);
  //Module name
  Local<String> module = args[0].As<String>();
  node::Utf8Value module_v(env->isolate(), module);
  Local<Object> exports;
  // Find the module named module_v and marked as NM_F_INTERNAL
  node_module* mod = FindModule(modlist_internal, *module_v, NM_F_INTERNAL);
  exports = InitModule(env, mod, module);
  args.GetReturnValue().Set(exports);
}

getInternalBinding looks up the module by name in the module list and then runs its initialization function.

//Initialize a module, that is, execute the registration function inside it
static Local<Object> InitModule(Environment* env,
                                node_module* mod,
                                Local<String> module) {
  Local<Object> exports = Object::New(env->isolate());
  Local<Value> unused = Undefined(env->isolate());
  mod->nm_context_register_func(exports, unused, env->context(), mod->nm_priv);
  return exports;
}

So where do the modules in the linked list come from? To answer that, we go back to tcp_wrap. The last line of tcp_wrap.cc is the following (the Initialize function becomes the value of the nm_context_register_func field used above).

NODE_MODULE_CONTEXT_AWARE_INTERNAL(tcp_wrap, node::TCPWrap::Initialize)

NODE_MODULE_CONTEXT_AWARE_INTERNAL is a macro. The macro expansion is as follows:

#define NODE_MODULE_CONTEXT_AWARE_INTERNAL(modname, regfunc) \
  NODE_MODULE_CONTEXT_AWARE_CPP(modname, regfunc, nullptr, NM_F_INTERNAL)

#define NODE_MODULE_CONTEXT_AWARE_CPP(modname, regfunc, priv, flags) \
  static node::node_module _module = { \
      NODE_MODULE_VERSION, \
      flags, \
      nullptr, \
      __FILE__, \
      nullptr, \
      (node::addon_context_register_func) (regfunc), \
      NODE_STRINGIFY(modname), \
      priv, \
      nullptr \
    }; \
    void _register_ ## modname() { \
      node_module_register(&_module); \
    }

After macro expansion, a node_module structure is defined first, followed by a _register_xxx function; for the tcp module this is _register_tcp_wrap. These functions are executed when Node.js initializes.

void RegisterBuiltinModules() {
  // After macro expansion, a series of _register_xxx functions are executed
#define V(modname) _register_##modname();
  NODE_BUILTIN_MODULES(V)
#undef V
}

So the _register_tcp_wrap function is executed. It contains a single line of code:

node_module_register(&_module);

node_module_register is defined as follows

extern "C" void node_module_register(void* m) {
  struct node_module* mp = reinterpret_cast<struct node_module*>(m);
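  // Simplified: modules flagged NM_F_INTERNAL go onto modlist_internal,
  // the list that GetInternalBinding searches
  mp->nm_link = modlist_internal;
  modlist_internal = mp;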
} 

This simply adds a node_module to the head of the linked list, which completes the registration of the module. It is the same linked list that GetInternalBinding searches. At this point we understand how c++ modules are registered and how js gets access to them.
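
The registration and lookup steps can be modelled in a few lines of js. This is only a sketch of the idea: registerModule, findModule and initModule are illustrative names that mirror node_module_register, FindModule and InitModule, the flag value is assumed for the example, and the real c++ functions do more checking.

```javascript
// A js model of the module list. All names here are illustrative.
const NM_F_INTERNAL = 4; // flag value assumed for this sketch

let modlist = null; // head of the singly linked list

// Counterpart of node_module_register: push onto the head of the list.
function registerModule(mod) {
  mod.link = modlist;
  modlist = mod;
}

// Counterpart of FindModule: walk the list, match by name, check the flag.
function findModule(name, flag) {
  let mp = modlist;
  while (mp !== null && mp.name !== name) {
    mp = mp.link;
  }
  return mp !== null && (mp.flags & flag) !== 0 ? mp : null;
}

// Counterpart of InitModule: run the registration function on a fresh object.
function initModule(mod) {
  const target = {};
  mod.registerFunc(target);
  return target;
}

registerModule({
  name: 'tcp_wrap',
  flags: NM_F_INTERNAL,
  registerFunc(target) { target.TCP = class TCP {}; },
  link: null,
});
registerModule({
  name: 'pipe_wrap',
  flags: NM_F_INTERNAL,
  registerFunc(target) { target.Pipe = class Pipe {}; },
  link: null,
});

const binding = initModule(findModule('tcp_wrap', NM_F_INTERNAL));
console.log(typeof binding.TCP); // function
console.log(findModule('missing', NM_F_INTERNAL)); // null
```

Because new modules are pushed onto the head, the list ends up in reverse registration order, which does not matter for a lookup by name.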

Using the functions of the c++ module

In this section we look at how js uses the functions of a c++ module once it has access to it. First, what does the c++ layer do when the js layer executes new TCP? Let’s look at how TCP is defined.

void TCPWrap::Initialize(Local<Object> target,
                         Local<Value> unused,
                         Local<Context> context,
                         void* priv) {
  Environment* env = Environment::GetCurrent(context);
  //Create a new function with New as callback
  Local<FunctionTemplate> t = env->NewFunctionTemplate(New);
  Local<String> tcpString = FIXED_ONE_BYTE_STRING(env->isolate(), "TCP");
  // Function name
  t->SetClassName(tcpString);
  t->InstanceTemplate()
    ->SetInternalFieldCount(StreamBase::kStreamBaseFieldCount);
  //Set the prototype method of t, that is, the attribute of TCP.prototype
  env->SetProtoMethod(t, "open", Open);
  env->SetProtoMethod(t, "bind", Bind);
  env->SetProtoMethod(t, "listen", Listen);
 
  // target is the exported object, set the TCP properties of the object
  target->Set(env->context(),
              tcpString,
              t->GetFunction(env->context()).ToLocalChecked()).Check();
}

The code above looks a bit complicated because it relies on some knowledge of v8. Translated into js, it is roughly as follows.

// Stand-in for the c++ class so that the sketch runs on its own
class TCPWrap {
  static Connect() {}
}

function FunctionTemplate(cb) {
  // Properties copied onto every instance (InstanceTemplate)
  const map = {};
  function Tmp() {
    Object.assign(this, map);
    cb(this);
  }
  return {
    PrototypeTemplate() {
      return {
        set(k, v) {
          Tmp.prototype[k] = v;
        },
      };
    },
    InstanceTemplate() {
      return {
        set(k, v) {
          map[k] = v;
        },
      };
    },
    GetFunction() {
      return Tmp;
    },
  };
}

const TCPFunctionTemplate = FunctionTemplate((target) => { target[0] = new TCPWrap(); });
TCPFunctionTemplate.PrototypeTemplate().set('connect', TCPWrap.Connect);
TCPFunctionTemplate.InstanceTemplate().set('name', 'hi');
const TCP = TCPFunctionTemplate.GetFunction();

So when new TCP is executed in the js layer, v8 first creates a new object and then runs the callback registered on the template, which in tcp_wrap.cc is New.

void TCPWrap::New(const FunctionCallbackInfo<Value>& args) {
  // Some parameter handling is omitted here
  new TCPWrap(env, args.This(), provider);
}

New creates a TCPWrap object. The TCPWrap constructor is as follows:

TCPWrap::TCPWrap(Environment* env, Local<Object> object, ProviderType provider)
    : ConnectionWrap(env, object, provider) {
  //Initialize a tcp handle
  int r = uv_tcp_init(env->event_loop(), &handle_);
}

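At this point, the js-layer TCP object has led to the creation of a c++ TCPWrap object, which in turn holds a libuv uv_tcp_t structure in its handle_ field.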

This seems simple, but a lot of detail is hidden here, starting with the base class of the c++ modules. TCPWrap ultimately inherits from BaseObject, whose constructor performs some critical operations. In the code below, object is the v8 object that the js layer receives from new TCP, while this is the c++ object created by new TCPWrap.

// Store the object in persistent_handle_ and retrieve it through object() when necessary
BaseObject::BaseObject(Environment* env, v8::Local<v8::Object> object)
    : persistent_handle_(env->isolate(), object), env_(env) {
  //Save this into object
  object->SetAlignedPointerInInternalField(0, static_cast<void*>(this));
}

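After TCPWrap is initialized, the two objects reference each other: internal field 0 of the js object points to the TCPWrap object, and persistent_handle_ of the TCPWrap object refers back to the js object.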
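
The two-way link set up by BaseObject can be modelled in plain js. This is only an analogy: a WeakMap plays the role of v8's internal field 0, and the property and class names below are chosen for the sketch rather than taken from Node.js.

```javascript
// Models internal field 0: a slot on the js object that js code cannot see.
const internalFields = new WeakMap();

class BaseObject {
  constructor(object) {
    this.persistentHandle = object;   // c++ -> js, like persistent_handle_
    internalFields.set(object, this); // js -> c++, like internal field 0
  }
  object() {
    return this.persistentHandle;
  }
}

class TCPWrap extends BaseObject {
  constructor(object) {
    super(object);
    this.handle = { type: 'uv_tcp_t' }; // stand-in for the libuv handle_
  }
}

// Counterpart of TCPWrap::New: runs with the freshly created js object.
function TCP() {
  new TCPWrap(this);
}

const tcp = new TCP();
const wrap = internalFields.get(tcp);
console.log(wrap instanceof TCPWrap, wrap.object() === tcp); // true true
```

Starting from either side, the other object can be reached, which is what the later listen and callback paths rely on.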

This completes the initialization of new TCP, so we return to the code at the beginning of the article. When we create a tcp server, listen is called to start it. When the js layer calls listen, the c++ layer’s Listen function is executed.

void TCPWrap::Listen(const FunctionCallbackInfo<Value>& args) {
  TCPWrap* wrap;
  // Unpack TCPWrap and store it in wrap
  ASSIGN_OR_RETURN_UNWRAP(&wrap,
                          args.Holder(),
                          args.GetReturnValue().Set(UV_EBADF));
  Environment* env = wrap->env();
  int backlog;
  if (!args[0]->Int32Value(env->context()).To(&backlog)) return;
  //OnConnection is the callback triggered when there is a newly established connection (three-way handshake has been completed)
  int err = uv_listen(reinterpret_cast<uv_stream_t*>(&wrap->handle_),
                      backlog,
                      OnConnection);
  args.GetReturnValue().Set(err);
}

ASSIGN_OR_RETURN_UNWRAP is the key code. From the analysis of new TCP we already know how the js object and the c++ object are related. When the js layer calls listen, the receiver is the object created by new TCP. ASSIGN_OR_RETURN_UNWRAP retrieves the TCPWrap object associated with that object so that it can be used. This is how the js layer calls functions in the c++ layer.
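
The unwrap-or-return pattern can be sketched in js as well. Everything here is a simplified model: unwrap is a hypothetical helper standing in for ASSIGN_OR_RETURN_UNWRAP, the WeakMap stands in for the internal field, and the numeric value of UV_EBADF is the one used on Linux, assumed for the example.

```javascript
const UV_EBADF = -9; // value on Linux, assumed for this sketch

const wraps = new WeakMap(); // js object -> its wrapper

// Rough counterpart of ASSIGN_OR_RETURN_UNWRAP: fetch the wrapper or null.
function unwrap(holder) {
  return wraps.get(holder) ?? null;
}

function TCP() {
  wraps.set(this, { listening: false, backlog: 0 });
}

TCP.prototype.listen = function listen(backlog) {
  const wrap = unwrap(this);
  if (wrap === null) return UV_EBADF; // the "OR_RETURN" branch
  wrap.listening = true;
  wrap.backlog = backlog;
  return 0;
};

const server = new TCP();
console.log(server.listen(511)); // 0
// Calling the method on an object that was never wrapped fails cleanly.
console.log(TCP.prototype.listen.call({}, 511)); // -9
```

The early return matters because a prototype method can be invoked with any receiver, not only with objects created by new TCP.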

c++ layer calls libuv

Next we analyze how the c++ layer calls libuv. As the previous section showed, once the TCPWrap object has been obtained, the following code is executed.

int err = uv_listen(reinterpret_cast<uv_stream_t*>(&wrap->handle_),
                      backlog,
                      OnConnection);

So what is &wrap->handle_? Let’s take a look at the definition of TCPWrap. TCPWrap inherits from ConnectionWrap.

class TCPWrap : public ConnectionWrap<TCPWrap, uv_tcp_t>

ConnectionWrap is a template class.

// WrapType is the class of the c++ layer, UVType is the type of libuv
template <typename WrapType, typename UVType>
class ConnectionWrap : public LibuvStreamWrap {
 public:
  static void OnConnection(uv_stream_t* handle, int status);
  static void AfterConnect(uv_connect_t* req, int status);
 protected:
  ConnectionWrap(Environment* env,
                 v8::Local<v8::Object> object,
                 ProviderType provider);
  UVType handle_;
};

So wrap->handle_ has type uv_tcp_t, and &wrap->handle_ is a pointer to it. During initialization this handle is associated with the TCPWrap object (TCPWrap inherits from HandleWrap).

HandleWrap::HandleWrap(Environment* env,
                       Local<Object> object,
                       uv_handle_t* handle,
                       AsyncWrap::ProviderType provider)
    : AsyncWrap(env, object, provider),
      state_(kInitialized),
      handle_(handle) {
  //Save the relationship between Libuv handle and c++ object
  handle_->data = this;
}

We will see what this is used for later. At this point the libuv handle points back to its c++ object through handle_->data, and the c++ object holds the handle as handle_.

Callbacks

Recall the listen code:

int err = uv_listen(reinterpret_cast<uv_stream_t*>(&wrap->handle_),
                      backlog,
                      OnConnection);

We now know which structure is passed to libuv. When a new connection has been established on the listening socket, libuv calls back the OnConnection function.

template <typename WrapType, typename UVType>
void ConnectionWrap<WrapType, UVType>::OnConnection(uv_stream_t* handle,
                                                    int status) {
  // Get the c++ layer TCPWrap object corresponding to the libuv structure
  WrapType* wrap_data = static_cast<WrapType*>(handle->data);
  Environment* env = wrap_data->env();
  // Creation of client_handle is omitted here; it is equivalent to executing new TCP at the js layer
  // Call back into js
  Local<Value> argv[] = { Integer::New(env->isolate(), status), client_handle };
  wrap_data->MakeCallback(env->onconnection_string(), arraysize(argv), argv);
}
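
The role of handle->data can be seen in a small js model. The two fake functions below are hypothetical stand-ins for libuv, which only knows about handles and callbacks and nothing about the wrapper class; the back-pointer is what lets a static callback find its owner.

```javascript
// Minimal stand-in for libuv: it stores a callback on the handle and later
// invokes it with the handle only.
function fakeUvListen(handle, cb) {
  handle.connectionCb = cb;
}
function fakeIncomingConnection(handle, status) {
  handle.connectionCb(handle, status);
}

class Wrap {
  constructor(name) {
    this.name = name;
    this.seen = [];
    this.handle = { data: null, connectionCb: null };
    this.handle.data = this; // same idea as handle_->data = this
  }
  // Static, like ConnectionWrap::OnConnection: there is no `this` to rely on.
  static onConnection(handle, status) {
    const wrap = handle.data; // recover the owning wrapper
    wrap.seen.push(status);
  }
}

const a = new Wrap('a');
const b = new Wrap('b');
fakeUvListen(a.handle, Wrap.onConnection);
fakeUvListen(b.handle, Wrap.onConnection);
fakeIncomingConnection(b.handle, 0);
console.log(a.seen.length, b.seen.length); // 0 1
```

Both handles share the same callback function, yet only the wrapper that owns the signalled handle is updated.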

In OnConnection we first obtain the corresponding c++ layer TCPWrap object through handle->data (this is why HandleWrap associates the two data structures: to preserve the context). We then call TCPWrap’s MakeCallback function (onconnection_string is the string “onconnection”). MakeCallback is defined in AsyncWrap, which TCPWrap inherits from.

inline v8::MaybeLocal<v8::Value> AsyncWrap::MakeCallback(
    const v8::Local<v8::Name> symbol,
    int argc,
    v8::Local<v8::Value>* argv) {
  v8::Local<v8::Value> cb_v;
  // Look up the property named by symbol on the js object; its value should be a function
  if (!object()->Get(env()->context(), symbol).ToLocal(&cb_v))
    return v8::MaybeLocal<v8::Value>();
  // If it is not a function, return an empty result
  if (!cb_v->IsFunction()) {
    // TODO(addaleax): We should throw an error here to fulfill the
    // `MaybeLocal<>` API contract.
    return v8::MaybeLocal<v8::Value>();
  }
  //Callback, see async_wrap.cc
  return MakeCallback(cb_v.As<v8::Function>(), argc, argv);
}

object() returns the object created by new TCP in the js layer. MakeCallback reads the onconnection property of that object (this property is set in the js layer). Its value is a function, which is then passed on to another overload of MakeCallback.

MaybeLocal<Value> AsyncWrap::MakeCallback(const Local<Function> cb,
                                          int argc,
                                          Local<Value>* argv) {
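  // Async context of this wrap, passed along to InternalMakeCallback
  async_context context { get_async_id(), get_trigger_async_id() };
  MaybeLocal<Value> ret = InternalMakeCallback(
      env(), object(), cb, argc, argv, context);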
  return ret;
}

This overload in turn calls InternalMakeCallback, which uses the v8 API to invoke the function and so calls back into the js layer.

ret = callback->Call(env->context(), recv, argc, argv);
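
The name-based lookup that MakeCallback performs can be modelled in plain js. This is a simplified sketch: makeCallback is an illustrative function, not a Node.js API, and it leaves out the async context handling done by InternalMakeCallback.

```javascript
// Rough counterpart of AsyncWrap::MakeCallback(symbol, argc, argv).
function makeCallback(object, name, args) {
  const cb = object[name];
  if (typeof cb !== 'function') return undefined; // nothing to call
  // Like callback->Call(context, recv, argc, argv) with recv = object.
  return cb.apply(object, args);
}

const handle = {}; // stands for the js-layer TCP object
const before = makeCallback(handle, 'onconnection', [0, {}]);
console.log(before); // undefined

// The js layer installs the callback as a plain property.
handle.onconnection = function onconnection(err, clientHandle) {
  return err === 0 && this === handle ? 'accepted' : 'failed';
};
const after = makeCallback(handle, 'onconnection', [0, {}]);
console.log(after); // accepted
```

Because the callback is found by property name at call time, the js layer can set or replace onconnection after the c++ object has been created.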

This is, roughly, how js, c++, and libuv interact in Node.js, and the same pattern applies to other modules as well.