Babylon Native headless rendering of 3D scenes

When we say rendering, we’re usually talking about an application rendering at 60fps, whether it’s a game or another application that uses the GPU. However, in other scenarios we might want to use the GPU to run processes that don’t display anything at all, such as processing video, manipulating images, or rendering 3D assets, perhaps all running on the server. In this article I will describe how to implement such a scenario using Babylon Native. Specifically, I’ll show you an example of how to use Babylon Native to capture screenshots of 3D assets using DirectX 11 on Windows.

Disclaimer: The API contract used in this example is subject to change as the core team is still working on the correct API contract form.


1. Console application

The sample repository is located here. It uses CMake to generate Visual Studio projects for Windows. The Babylon Native and DirectXTK dependencies are included as Git submodules and referenced from CMake. DirectXTK is used only to save DirectX textures to PNG files.

The core of the application is a file called App.cpp and its JavaScript counterpart in index.js. Let’s dig into some details, starting with the native code side.

1.1 Create DirectX graphics device

First, we need to create a standalone DirectX device.

winrt::com_ptr<ID3D11Device> CreateD3DDevice()
{
    winrt::com_ptr<ID3D11Device> d3dDevice{};
    uint32_t flags = D3D11_CREATE_DEVICE_SINGLETHREADED | D3D11_CREATE_DEVICE_BGRA_SUPPORT;
    winrt::check_hresult(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, flags, nullptr, 0, D3D11_SDK_VERSION, d3dDevice.put(), nullptr, nullptr));
    return d3dDevice;
}

This code is fairly standard, but it can be adapted, for example, to use WARP if the environment does not have a GPU.
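As a hedged sketch of that adaptation (the function name is illustrative, not from the sample; this Windows-only fragment assumes the same winrt helpers as above), the hardware driver can be tried first with WARP as a CPU fallback:

```cpp
#include <winrt/base.h>
#include <d3d11.h>

// Sketch: try the hardware driver first, then fall back to the WARP
// software rasterizer when no GPU is present (e.g. on a headless server).
winrt::com_ptr<ID3D11Device> CreateD3DDeviceWithFallback()
{
    winrt::com_ptr<ID3D11Device> d3dDevice{};
    uint32_t flags = D3D11_CREATE_DEVICE_SINGLETHREADED | D3D11_CREATE_DEVICE_BGRA_SUPPORT;

    HRESULT hr = D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, flags,
        nullptr, 0, D3D11_SDK_VERSION, d3dDevice.put(), nullptr, nullptr);
    if (FAILED(hr))
    {
        // No hardware device available: WARP runs entirely on the CPU.
        winrt::check_hresult(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_WARP, nullptr, flags,
            nullptr, 0, D3D11_SDK_VERSION, d3dDevice.put(), nullptr, nullptr));
    }
    return d3dDevice;
}
```

WARP is considerably slower than a real GPU, but for occasional screenshot generation on a server it can be a reasonable trade-off.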

Next, we’ll use this DirectX device to create a Babylon Native graphics device.

std::unique_ptr<Babylon::Graphics::Device> CreateGraphicsDevice(ID3D11Device* d3dDevice)
{
    Babylon::Graphics::DeviceConfiguration config{d3dDevice};
    std::unique_ptr<Babylon::Graphics::Device> device = Babylon::Graphics::Device::Create(config);
    device->UpdateSize(WIDTH, HEIGHT);
    return device;
}

We have to specify the width and height (1024×1024 in this case) explicitly because this graphics device is not associated with a window or view the way a Babylon Native device normally is.

1.2 Create JavaScript host environment

Of course, we also have to create a JavaScript hosting environment, in this case using Chakra (the default on Windows), to load the Babylon.js core and loader modules as well as index.js, where the previously mentioned JavaScript logic resides. Afterwards, we start rendering a frame, which allows JavaScript to queue graphics commands.

auto runtime = std::make_unique<Babylon::AppRuntime>();
runtime->Dispatch([&device](Napi::Env env)
{
    device->AddToJavaScript(env);

    Babylon::Polyfills::Console::Initialize(env, [](const char* message, auto)
    {
        std::cout << message;
    });

    Babylon::Polyfills::Window::Initialize(env);
    Babylon::Polyfills::XMLHttpRequest::Initialize(env);

    Babylon::Plugins::NativeEngine::Initialize(env);
});

Babylon::ScriptLoader loader{*runtime};
loader.LoadScript("app:///Scripts/babylon.max.js");
loader.LoadScript("app:///Scripts/babylonjs.loaders.js");
loader.LoadScript("app:///Scripts/index.js");

device->StartRenderingCurrentFrame();
deviceUpdate->Start();

Using Chakra with Visual Studio is convenient because we can add a `debugger;` statement anywhere in the JavaScript code and the Visual Studio Just-In-Time Debugger will prompt, via a dialog box, to attach and debug the JavaScript. Note that the application must be running in a debug configuration for this to work.

1.3 Output texture

We also have to create an output render target texture for the Babylon.js camera’s outputRenderTarget. First, we create a DirectX render target texture.

winrt::com_ptr<ID3D11Texture2D> CreateD3DRenderTargetTexture(ID3D11Device* d3dDevice)
{
    D3D11_TEXTURE2D_DESC desc{};
    desc.Width = WIDTH;
    desc.Height = HEIGHT;
    desc.MipLevels = 1;
    desc.ArraySize = 1;
    desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    desc.SampleDesc = {1, 0};
    desc.Usage = D3D11_USAGE_DEFAULT;
    desc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;
    desc.CPUAccessFlags = 0;
    desc.MiscFlags = 0;

    winrt::com_ptr<ID3D11Texture2D> texture;
    winrt::check_hresult(d3dDevice->CreateTexture2D(&desc, nullptr, texture.put()));
    return texture;
}

We then expose the native texture to JavaScript through a Babylon Native plug-in called ExternalTexture.

std::promise<void> addToContext{};
std::promise<void> startup{};

loader.Dispatch([externalTexture = Babylon::Plugins::ExternalTexture{outputTexture.get()}, &addToContext, &startup](Napi::Env env)
{
    auto jsPromise = externalTexture.AddToContextAsync(env);
    addToContext.set_value();

    jsPromise.Get("then").As<Napi::Function>().Call(jsPromise,
    {
        Napi::Function::New(env, [&startup](const Napi::CallbackInfo& info)
        {
            auto nativeTexture = info[0];
            info.Env().Global().Get("startup").As<Napi::Function>().Call(
            {
                nativeTexture,
                Napi::Value::From(info.Env(), WIDTH),
                Napi::Value::From(info.Env(), HEIGHT),
            });
            startup.set_value();
        })
    });
});

addToContext.get_future().wait();

deviceUpdate->Finish();
device->FinishRenderingCurrentFrame();

startup.get_future().wait();

Note that because this is not a typical rendering application, we render individual frames explicitly, so we also need a synchronization construct (std::promise in this case) to enforce the correct ordering.

As described in the ExternalTexture documentation, the ExternalTexture::AddToContextAsync function requires the graphics device to render one frame before completing. The addToContext future will wait until AddToContextAsync is called, and FinishRenderingCurrentFrame will render a frame to allow AddToContextAsync to complete.
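Stripped of the Babylon types, the hand-off is just promises passed between the main thread and the JavaScript thread. A minimal, self-contained sketch of the same ordering pattern (all names here are illustrative, not from the sample):

```cpp
#include <future>
#include <string>
#include <thread>
#include <vector>

// Returns the order in which the two threads ran their steps.
// "requested" plays the role of addToContext, "frameDone" stands in for
// FinishRenderingCurrentFrame, and "done" stands in for startup.
std::vector<std::string> runOrderedFrame()
{
    std::vector<std::string> log; // accesses are strictly ordered by the promises
    std::promise<void> requested;
    std::promise<void> frameDone;
    std::promise<void> done;

    std::thread jsThread([&] {
        log.push_back("js: request texture");
        requested.set_value();               // let the main thread render
        frameDone.get_future().wait();       // wait for the rendered frame
        log.push_back("js: texture ready");
        done.set_value();                    // let the main thread continue
    });

    requested.get_future().wait();           // like addToContext.get_future().wait()
    log.push_back("main: finish frame");     // like FinishRenderingCurrentFrame()
    frameDone.set_value();
    done.get_future().wait();                // like startup.get_future().wait()
    log.push_back("main: save png");

    jsThread.join();
    return log;
}
```

Each promise set/wait pair establishes a happens-before edge, so the four log entries always come out in the same order regardless of how the threads are scheduled.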

2. JavaScript (Part 1)

Next, we’ll review the first part of the JavaScript side (startup). Ignoring the typical Babylon.js engine and scene setup, this function takes a parameter called nativeTexture, which is the texture from the AddToContextAsync result. This parameter is then wrapped using wrapNativeTexture and added as a color attachment to the Babylon.js render target texture. We’ll see how to use it shortly.

function startup(nativeTexture, width, height) {
    engine = new BABYLON.NativeEngine();

    scene = new BABYLON.Scene(engine);
    scene.clearColor = BABYLON.Color3.White();

    scene.createDefaultEnvironment({ createSkybox: false, createGround: false });

    outputTexture = new BABYLON.RenderTargetTexture(
        "outputTexture",
        {
            width: width,
            height: height
        },
        scene,
        {
            colorAttachment: engine.wrapNativeTexture(nativeTexture),
            generateDepthBuffer: true,
            generateStencilBuffer: true
        }
    );
}

2.1 glTF assets

Back to the native code, we are now ready to load the glTF resource and capture the screenshot.

struct Asset
{
    const char* Name;
    const char* Url;
};

std::array<Asset, 3> assets =
{
    Asset{"BoomBox", "https://raw.githubusercontent.com/KhronosGroup/glTF-Sample-Models/master/2.0/BoomBox/glTF/BoomBox.gltf"},
    Asset{"GlamVelvetSofa", "https://raw.githubusercontent.com/KhronosGroup/glTF-Sample-Models/master/2.0/GlamVelvetSofa/glTF/GlamVelvetSofa.gltf"},
    Asset{"MaterialsVariantsShoe", "https://raw.githubusercontent.com/KhronosGroup/glTF-Sample-Models/master/2.0/MaterialsVariantsShoe/glTF/MaterialsVariantsShoe.gltf"},
};

for (const auto& asset : assets)
{
    RenderDoc::StartFrameCapture(d3dDevice.get());

    device->StartRenderingCurrentFrame();
    deviceUpdate->Start();

    std::promise<void> loadAndRenderAsset{};

    loader.Dispatch([&loadAndRenderAsset, &asset](Napi::Env env)
    {
        std::cout << "Loading " << asset.Name << std::endl;

        auto jsPromise = env.Global().Get("loadAndRenderAssetAsync").As<Napi::Function>().Call(
        {
            Napi::String::From(env, asset.Url)
        }).As<Napi::Promise>();

        jsPromise.Get("then").As<Napi::Function>().Call(jsPromise,
        {
            Napi::Function::New(env, [&loadAndRenderAsset](const Napi::CallbackInfo&)
            {
                loadAndRenderAsset.set_value();
            })
        });
    });

    loadAndRenderAsset.get_future().wait();

    deviceUpdate->Finish();
    device->FinishRenderingCurrentFrame();

    RenderDoc::StopFrameCapture(d3dDevice.get());

    auto filePath = GetModulePath() / asset.Name;
    filePath.concat(".png");
    std::cout << "Writing " << filePath.string() << std::endl;

    // See https://github.com/Microsoft/DirectXTK/wiki/ScreenGrab#srgb-vs-linear-color-space
    winrt::check_hresult(DirectX::SaveWICTextureToFile(context.get(), outputTexture.get(), GUID_ContainerFormatPng, filePath.c_str(), nullptr, nullptr, true));
}

This may seem long, but it’s not overly complicated. We loop through each asset and call the JavaScript function loadAndRenderAssetAsync, wait for it to complete, and save the PNG to disk.
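One small detail in the loop is worth calling out: `filePath.concat(".png")` appends characters to the existing filename, whereas `operator/` would insert a path separator. A tiny illustration (`MakePngPath` is a hypothetical helper; the directory stands in for `GetModulePath()`):

```cpp
#include <filesystem>
#include <string>

// Build "<dir>/<assetName>.png": operator/ adds a directory separator,
// while concat() appends raw characters to the end of the path.
std::filesystem::path MakePngPath(const std::filesystem::path& dir,
                                  const std::string& assetName)
{
    std::filesystem::path filePath = dir / assetName; // e.g. "out/BoomBox"
    filePath.concat(".png");                          // "out/BoomBox.png"
    return filePath;
}
```

Using `/` instead of `concat` here would produce `out/BoomBox/.png`, which is why the sample concatenates the extension.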

3. JavaScript (Part 2)

The loadAndRenderAssetAsync function on the JavaScript side imports the glTF asset, sets up the camera, waits for the scene to be ready, and renders a single frame. This should look similar to what happens with a web application using Babylon.js!

async function loadAndRenderAssetAsync(url) {
    if (rootMesh) {
        rootMesh.dispose();
    }

    const result = await BABYLON.SceneLoader.ImportMeshAsync(null, url, undefined, scene);
    rootMesh = result.meshes[0];

    scene.createDefaultCamera(true, true);
    scene.activeCamera.alpha = 2;
    scene.activeCamera.beta = 1.25;
    scene.activeCamera.outputRenderTarget = outputTexture;

    await scene.whenReadyAsync();

    scene.render();
}

The camera’s outputRenderTarget is assigned the output render target texture created earlier, so the scene renders into this texture instead of the default back buffer, which of course does not exist in this context. This, in turn, renders directly into the native DirectX render target texture we set up at the start.

4. Results

Building and running the ConsoleApp loads each asset in turn and writes three PNG files next to the executable: BoomBox.png, GlamVelvetSofa.png, and MaterialsVariantsShoe.png.

5. Debugging using RenderDoc

And one more thing! Notice the helper calls to RenderDoc::StartFrameCapture and RenderDoc::StopFrameCapture? They tell RenderDoc when to start and stop capturing, because in this non-typical rendering situation RenderDoc cannot tell on its own where a frame begins or ends. RenderDoc capture can be enabled by uncommenting a line in RenderDoc.h, and it is very useful for debugging GPU issues.
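The RenderDoc.h helper in the sample presumably wraps RenderDoc's documented in-application API; a hedged sketch of what such a wrapper can look like on Windows (the namespace and function names mirror the calls above, but the actual helper in the repository may differ):

```cpp
#include <windows.h>
#include <renderdoc_app.h> // ships with RenderDoc

namespace RenderDoc
{
    // renderdoc.dll is only loaded when the app is launched from RenderDoc,
    // so all of these calls become no-ops in a normal run.
    static RENDERDOC_API_1_1_2* s_api = nullptr;

    void Init()
    {
        if (HMODULE mod = GetModuleHandleA("renderdoc.dll"))
        {
            auto getApi = reinterpret_cast<pRENDERDOC_GetAPI>(
                GetProcAddress(mod, "RENDERDOC_GetAPI"));
            getApi(eRENDERDOC_API_Version_1_1_2, reinterpret_cast<void**>(&s_api));
        }
    }

    void StartFrameCapture(void* d3dDevice)
    {
        if (s_api) s_api->StartFrameCapture(d3dDevice, nullptr);
    }

    void StopFrameCapture(void* d3dDevice)
    {
        if (s_api) s_api->EndFrameCapture(d3dDevice, nullptr);
    }
}
```

Bracketing each asset's render with these calls produces one RenderDoc capture per screenshot, which makes it easy to inspect the exact draw calls behind a bad PNG.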

6. Conclusion

I hope this gives you an idea of how to use Babylon Native in a headless environment. This is not a typical scenario, but it is one that is more difficult or expensive to implement using other technologies. We will continue to work hard to make Babylon Native useful in as many scenarios as possible.

Original link: Babylon Native headless rendering – BimAnt