Principle and Implementation of BVH Raycast Algorithm

In this tutorial, we’ll introduce and use a technique developed by researchers Kay and Kajiya in 1986. Since then, other acceleration structures have proven better than their technique, but their solution could help formulate the principles by which most acceleration structures are built. This is a good starting point, and like Utah Teapot, we’ll have the opportunity to introduce some interesting new techniques that have emerged from many other computer graphics algorithms, such as octree structures. We strongly recommend that you read their papers.

Recommendation: Use NSDT Scene Designer to quickly build 3D scenes.

1. The range of bounding volume

Figure 1: Loose and complex bounding volumes

In the image above, when the bounding volume representing the object fits the object too loosely, many rays that intersect the extent do not intersect the shape and are wasted (top). Using a more complex bounding volume (bottom) gives better results, but ray tracing is more expensive.

Previously we showed intuitively that ray tracing can be accelerated using simple techniques such as ray tracing against bounding volumes. As Kay and Kajiya point out in their paper, these techniques only work if ray tracing these bounding volumes, or what they call extents, is much faster than ray tracing the objects themselves. They also point out that if the bounds loosely fit an object, then many rays that intersect the bounding volume are likely to actually miss the object inside.

On the other hand, if the range accurately describes the object, then all rays that intersect the range will also intersect the object. Obviously, the bounding volume meeting this criterion is the object itself. Therefore, a good choice for a bounding volume is a shape that provides a good trade-off between compactness (how close the range fits to the object) and speed (complex bounding volumes are more expensive to ray-trace than simple shapes).

The idea is shown in Figure 1, where the box around the teapot is raytraced faster than the tighter bounding volume around the teapot in the image below. However, more rays intersecting this box will miss the teapot geometry than would fit the model more closely. Shapes like spheres and boxes yield some pretty good results in most cases, but using the idea of finding a good compromise between simplicity and speed, Kay and Kajiya propose to further improve these simple shapes.

Figure 2: The bounding box of an object can be seen as the intersection of planes parallel to the x and y axes

Bounding boxes can be seen as planes that intersect each other. To make this demonstration easier, we only consider the two-dimensional case. Suppose we need to compute the bounding box of a teapot. Technically, this can be seen as the intersection of two planes parallel to the y-axis and two planes parallel to the x-axis (Figure 2). Do we now need to find a way to compute these planes? what should we do? We’ve already covered the plane equation to calculate the intersection of the ray with the box, and how to calculate the intersection of the ray with the triangle. Let’s recall that the plane equation (in three dimensions this time) can be defined as:

where the A, B, C terms define the plane’s normal (a vector perpendicular to the plane), and d is the distance along this vector from the world origin to this plane. x, y, z define the 3D Cartesian coordinates of a point lying on the plane. This equation states that the coordinates of any point multiplied by the plane normal coordinates, minus d, equals zero. This equation can also be used to project the vertices of the teapot onto a plane and find a value for d.

For a given point P(x,y,z) and a given plane N(x,y,z), we can solve the equation to get:

We can rewrite this equation into a more traditional point-matrix multiplication form (Equation 1):

Figure 3: Projecting a point on a plane with a normal (1,0,0).
If we project a vertex P(x,y,z) on a y-axis plane parallel to the normal N(1,0,0), giving the distance along the x-axis from the origin to a plane parallel to the y-axis, if We repeat this process for all vertices of the model, and we can prove that the point d-value with the minimum value and the point d-value with the maximum value correspond to the minimum and maximum range of the x-coordinate of the object respectively. The two values dnear and dfar describe the two planes of the constrained object (as shown in Figure 3). We can do this with the following pseudocode:

// plane normal
vector N(1, 0, 0);
float Dnear = INFINITY;
float Dfar = -INFINITY;
for (each point in model) {
    // project point
    float D = N.x * P.x + N.y * P.y + N.z * P.z;
    Dnear = min(D, Dnear);
    Dfar = max(D, Dfar);
}

Figure 4: The intersection of three plates parallel to the xz- yz- and xy-planes respectively defines an axis-aligned bounding box (AABB)

In their paper, Kay and Kajiya refer to the region of space between two planes as the slab, and the normal vectors that define the orientation of the slab as the plane-set normal. They observed:

Different choices of plane set normals result in different bounding plates for the object. The intersection of a set of bounding slabs produces a bounding volume. In order to create a closed bounding volume in 3-space, at least three bounding plates must be involved and they must be chosen so that the defined normal vectors span 3-space.

The simplest example of this principle is the axis-aligned bounding box (AABB), which is defined by three flat plates parallel to the xz, yz, and xy planes, respectively (Figure 4). However, Kay and Kajiya propose to use not just three but seven plates to obtain a tighter bounding volume. The plane set normals for these slabs are pre-selected and independent of the objects being bounded. To better visualize how this works, let’s go back to the two-dimensional case again. Assume the plane set normals used to bind the objects are:

Figure 5: An object bounded by four plane set normals.

Figure 5 shows an object surrounded by these preselected plane set normals (this is a reproduction of Figure 3 in the Kay and Kajiya paper).

As you can see, the resulting bounding box fits the object better than a simple bounding box. For the 3D case, they use seven plane set normals:

The first three plane-set normals define an axis-aligned bounding box, and the last four plane-set normals define an octagonal parallelepiped. To construct the bounding volume of an object, we simply find the minimum and maximum values of
Each plate is normalized by projecting the vertices of the model onto the seven plane set normals.

2. Calculate the intersection point of light and volume

Next, we need to write some code to raytrace the volume. The principle is very similar to that of ray box intersection. A slab is defined by two mutually parallel planes, if the ray is not parallel to these planes, it will intersect them producing two values tmin and tfar. To calculate the intersection distance, we simply substitute into the ray equation A x + B y + C z ? d = N i ? P x , y , z ? d = 0, leading to (Equation 2):

where Ni is one of the seven plane-set normals, O and R are the origin and direction of the ray, respectively, and d is the distance from the world origin to the plane with normal Ni, which has the intersection point Pxyz. Ni.O and Ni.R can be reused to calculate the intersection distance t between the ray and the two planes. Substituting precomputed d values (dnear and dfar) for each plane, the test board yields a value.

// computing intersection of a given ray with slab
float NdotO = N . O;
float NdotR = N . R;
float tNear = (dNear - NdotO) / NdotR;
float tFar = (dFar - NdotO) / NdotR;

Figure 6: When the product of R and N is less than zero, the effects of near and far need to be reversed.

As with ray tracing for boxes, care must be taken when the denominator Ni.R approaches zero. Also, we need to swap dnear and dfar when the denominator is less than zero (see Figure 6). As for ray box intersection, this test is performed for each slab that encloses the object. Among all calculated tnear we will keep the largest value, all calculated tfar values
value, we will keep the smallest one. If the ray intersects the volume, the t-values indicate where those intersections are along the ray. The resulting interval (defined by tnear and tfar)) can be used to estimate the position of the object along the ray. Now let’s try to put what we have learned into practice.

3. BVH implementation source code

The following C++ code implements the methods described in this chapter. The BVH class is derived from the base AccelerationStructure class. In this class we create a structure called Extents that holds the dnear and dfar values for all seven predefined plane set normals (Lines 7-11).

class BVH : public AccelerationStructure
{
    static const uint8_t kNumPlaneSetNormals = 7;
    static const Vec3f planeSetNormals[kNumPlaneSetNormals];
    struct Extents
    {
        Extents()
        {
            for (uint8_t i = 0; i < kNumPlaneSetNormals; + + i)
                d[i][0] = kInfinity, d[i][1] = -kInfinity;
        }
        bool intersect(const Ray<float> & amp; ray, float & amp; tNear, float & amp; tFar, uint8_t & amp; planeIndex);
        float d[kNumPlaneSetNormals][2];
    };
    Extents *extents;
public:
    BVH(const RenderContext *rcx);
    const Object* intersect(const Ray<float> &ray, IsectData &isectData) const;
    ~BVH();
};

In the constructor of the class, we allocate an extents array to store the bounding volume data for all objects in the scene (line 3). Then we loop through all objects and call the method computeBounds of the Object class to compute the values dNear and dFar for each slab surrounding the object (Lines 4-8). In the code snippet below, we only show this function of the PolygonMesh class. We iterate over all vertices of the mesh and project them onto the current plane (Lines 27-31). This ends the work done in the class constructor.

BVH::BVH(const RenderContext *rcx) : AccelerationStructure(rcx), extents(NULL)
{
    extents = new Extents[rcx->objects. size()];
    for (uint32_t i = 0; i < rcx->objects. size(); + + i) {
        for (uint8_t j = 0; j < kNumPlaneSetNormals; + + j) {
            rcx->objects[i]->computeBounds(planeSetNormals[j], extents[i].d[j][0], extents[i].d[j][1]);
        }
    }
}

const Vec3f BVH::planeSetNormals[BVH::kNumPlaneSetNormals] = {
    Vec3f(1, 0, 0),
    Vec3f(0, 1, 0),
    Vec3f(0, 0, 1),
    Vec3f( sqrtf(3) / 3.f, sqrtf(3) / 3.f, sqrtf(3) / 3.f),
    Vec3f(-sqrtf(3) / 3.f, sqrtf(3) / 3.f, sqrtf(3) / 3.f),
    Vec3f(-sqrtf(3) / 3.f, -sqrtf(3) / 3.f, sqrtf(3) / 3.f),
    Vec3f( sqrtf(3) / 3.f, -sqrtf(3) / 3.f, sqrtf(3) / 3.f) };
}

class PolygonMesh : public Object
{
    ...
    void computeBounds(const Vec3f & amp;planeNormal, float & amp;dnear, float & amp;dfar) const
{
        float d;
        for (uint32_t i = 0; i < maxVertIndex; + + i) {
            d = dot(planeNormal, P[i]);
            if (d < dnear) dnear = d;
            if (d > dfar) dfar = d;
        }
    }
    ...
};

Instead of intersecting every object in the scene, we call the intersection method of the BVH class once the render function is called. First, the rays are tested against all bounding volumes of all objects in the scene. To do this, we call the intersect method from the Extent structure with the current test object’s volume data (line 24). This method simply computes the intersection of the current ray with each of the seven plates surrounding the object, and keeps track of the maximum of the dNear values and the minimum of the computed dFar values. If dFar is greater than dFar, intersection with the bounding volume occurs and the function returns true. In the code version below, if the ray intersects the volume, we set the member variable N in the IsectData structure to the normal of the intersecting plane. The result of the dot product of this vector N with the ray direction is used to set the color of the current pixel. The resulting image is shown in Figure 7. This helps visualize the bounding volume that is intersected and surrounds the object.

inline bool BVH::Extents::intersect(
const Ray<float> & ray,
    float *precomputedNumerator, float
*precomputeDenominator,
float &tNear, float &tFar, uint8_t
&planeIndex)
{
for (uint8_t i = 0; i < kNumPlaneSetNormals; + + i) {
        float tn = (d[i][0] - precomputedNumerator[i]) / precomputedDenominator[i];
        float tf = (d[i][1] - precomputedNumerator[i]) / precomputedDenominator[i];
        if (precomputeDenominator[i] < 0) std::swap(tn, tf);
        if (tn > tNear) tNear = tn, planeIndex = i;
        if (tf < tFar) tFar = tf;
        if (tNear > tFar) return false;
    }
 
    return true;
}

const Object* BVH::intersect(const Ray<float> & amp;ray, IsectData & amp;isectData) const
{
float tClosest = ray.tmax;
    Object *hitObject = NULL;
    float precomputedNumerator[BVH::kNumPlaneSetNormals], precomputeDenominator[BVH::kNumPlaneSetNormals];
    for (uint8_t i = 0; i < kNumPlaneSetNormals; + + i) {
        precomputedNumerator[i] = dot(planeSetNormals[i], ray.orig);
        precomputeDenominator[i] = dot(planeSetNormals[i], ray.dir);;
    }
    for (uint32_t i = 0; i < rc->objects. size(); + + i) {
        __sync_fetch_and_add( &numRayVolumeTests, 1);
        float tNear = -kInfinity, tFar = kInfinity;
        uint8_t planeIndex;
        if (extents[i].intersect(ray, precomputedNumerator, precomputedDenominator, tNear, tFar, planeIndex)) {
            if (tNear < tClosest)
                tClosest = tNear, isectData.N = planeSetNormals[planeIndex], hitObject = rc->objects[i];
        }
    }
 
    return hitObject;
}

Figure 7: This image shows the intersecting bounding volumes enclosing each of the 32 Bezier patches that make up the teapot.
But in the final version of the intersect method below, if the bounding volume intersects, we test for intersection between the ray and the object (or object, if it’s a triangle mesh) enclosed by the volume. When the test succeeds, if the intersection The distance is the smallest we’ve found so far, we update tClosest and keep a pointer to the intersected object.

int main(int argc, char **argv)
{
    clock_t timeStart = clock();
    ...
    rc->accelStruct = new BVH(rc);//AccelerationStructure(rc);
    render(rc, filename);
    ...
    printf("Render time : .2f (sec)\\
", (float)(timeEnd - timeStart) / CLOCKS_PER_SEC);
    printf("Total number of triangles : %d\\
", totalNumTris);
    printf("Total number of primary rays : %llu\\
", numPrimaryRays);
    printf("Total number of ray-triangles tests : %llu\\
", lrt::numRayTrianglesTests);
    printf("Total number of ray-triangles intersections : %llu\\
", lrt::numRayTrianglesIsect);
    printf("Total number of ray-volume tests : %llu\\
", lrt::numRayBoxTests);
    return 0;
}

Finally, if we compile and run the program with the new acceleration structure, we get the following statistics:

Render time: 2.92 (sec)
Total number of triangles : 16384
Total number of primary rays : 307200
Total number of ray-triangles tests : 80998400
Total number of ray-triangles intersections : 111889
Total number of ray-volume tests : 9830400

This technique is 1.34 times faster than the method in the previous chapter. This may not seem like much, but a few seconds saved in a simple scene can turn into hours in a complex one.

4. Improvement plan

Although we improved on the results from Chapter 2, this technique still suffers from the problem that the rendering time is proportional to the number of objects in the scene. To further improve the performance of this approach, Kay and Kajiya suggest using a hierarchy of volumes. To quote the author of the paper again:

For each such node, we compute a bounding volume. Then, if the ray misses the bounding volume of a given node in the hierarchy, we can refuse further consideration of that node’s entire subtree. Using a hierarchy results in a cost of computing images that scales logarithmically with the number of objects in the scene.

We will implement this idea in the next chapter.

Original Link: BVH RayCast Algorithm – BimAnt