Hands-on: Accelerating LiDAR Point Clouds with NVIDIA's CUDA-Based Point Cloud Library (PCL)

Author | Point Cloud PCL



Preface

In this article, we introduce how to use CUDA-PCL to process point clouds for the best performance. Because PCL cannot fully utilize CUDA on Jetson, NVIDIA has developed CUDA-based libraries that provide the same functions as PCL. Code address: https://github.com/NVIDIA-AI-IOT/cuPCL.git (currently only dynamic libraries and header files are provided; the authors say the source code will be open-sourced in the future).

cuPCL contains libraries for processing point clouds with CUDA, along with examples of their usage. The project has several subfolders, each containing a CUDA-implemented library and sample code that uses it and checks performance and accuracy by comparing its output with PCL's. The libraries support Xavier, Orin, and Linux x86.

About the Authors

Lei Fan is a senior CUDA software engineer at NVIDIA. He currently works with NVIDIA’s Chinese technical support engineer team to develop solutions for optimizing software performance through CUDA.

Lily Li works in developer relations on NVIDIA's robotics team. She develops robotics solutions within the Jetson ecosystem to help create best practices.

Main Content

Many Jetson users choose lidar as the main sensor for localization and perception. A lidar describes the spatial environment around the vehicle as a set of three-dimensional points called a point cloud. Point clouds sample the surfaces of surrounding objects, and their long range and high accuracy make them well suited to advanced obstacle perception, mapping, localization, and planning algorithms.


This article introduces CUDA-PCL 1.0, focusing on three CUDA-accelerated PCL libraries:

1.CUDA-ICP

2.CUDA-Segmentation

3.CUDA-Filter


CUDA-ICP

In iterative closest point (ICP), the target (reference) point cloud is held fixed while the source cloud is transformed to best match it. The algorithm iteratively computes the transformation that minimizes an error metric, typically the distance from each source point to the reference cloud, such as the sum of squared differences between matched pairs of points. ICP is one of the most widely used algorithms for aligning 3D models when an initial guess of the rigid transformation is available. Its advantages include high-precision matching results and robustness across different initializations, but it consumes substantial computing resources. To improve ICP performance on Jetson, NVIDIA released a CUDA-based ICP that can replace the original version in the Point Cloud Library (PCL). The code below shows its usage: instantiate the class, then call its icp() method directly.

cudaICP icpTest(nPCountM, nQCountM, stream);
    icpTest.icp(cloud_source, nPCount,
            cloud_target, nQCount,
            Maxiterate, threshold,
            transformation_matrix, stream);

ICP computes the transformation matrix between two point clouds:

transformation_matrix * source(P) = target(Q)

Because a lidar delivers a bounded number of points per frame, the maximum point counts are known in advance. nPCountM and nQCountM are these maxima and are used to pre-allocate buffers for ICP.

class cudaICP
{
public:
    /* nPCountM and nQCountM are the maximum of count for input clouds.
       They are used to pre-allocate memory.
    */
    cudaICP(int nPCountM, int nQCountM, cudaStream_t stream = 0);
    ~cudaICP(void);
    /*
    cloud_target = transformation_matrix * cloud_source
    When the Epsilon of the transformation_matrix is less than threshold,
    the function returns transformation_matrix.
    Input:
        cloud_source, cloud_target: Data pointer for the point cloud.
        nPCount: Point number of the cloud_source.
        nQCount: Point number of the cloud_target.
        Maxiterate: Maximum number of iterations.
        threshold: When the Epsilon of the transformation_matrix is less than
            threshold, the function returns transformation_matrix.
    Output:
        transformation_matrix
    */
    void icp(float *cloud_source, int nPCount,
            float *cloud_target, int nQCount,
            int Maxiterate, double threshold,
            Eigen::Matrix4f &transformation_matrix,
            cudaStream_t stream = 0);
    void *m_handle = NULL;
};

Table 2 shows the performance comparison between CUDA-ICP and PCL-ICP.

[Image: Table 2, CUDA-ICP vs. PCL-ICP performance comparison]

The two point cloud frames before ICP alignment:


The two point cloud frames after ICP alignment:


CUDA-Segmentation

A point cloud map contains many ground points, which not only make the map look cluttered but also interfere with the classification, identification, and tracking of obstacle point clouds, so they need to be removed first. Ground removal can be achieved through point cloud segmentation. The library implements this with Random Sample Consensus (RANSAC) fitting and nonlinear optimization. The sample code below instantiates the class, initializes the parameters, and then calls cudaSeg.segment() to perform ground removal.

//Now Just support: SAC_RANSAC + SACMODEL_PLANE
  std::vector<int> indexV;
  cudaSegmentation cudaSeg(SACMODEL_PLANE, SAC_RANSAC, stream);
  segParam_t setP;
  setP.distanceThreshold = 0.01;
  setP.maxIterations = 50;
  setP.probability = 0.99;
  setP.optimizeCoefficients = true;
  cudaSeg.set(setP);
  cudaSeg.segment(input, nCount, index, modelCoefficients);
  for(int i = 0; i < nCount; i++)
  {
    if(index[i] == 1)
    indexV.push_back(i);
  }

CUDA-Segmentation segments an input cloud of nCount points according to the given parameters; index marks which input points belong to the detected plane, and modelCoefficients holds the plane's coefficients.

typedef struct {
  double distanceThreshold;
  int maxIterations;
  double probability;
  bool optimizeCoefficients;
} segParam_t;
class cudaSegmentation
{
public:
    //Now Just support: SAC_RANSAC + SACMODEL_PLANE
    cudaSegmentation(int ModelType, int MethodType, cudaStream_t stream = 0);
    ~cudaSegmentation(void);
    /*
    Input:
        cloud_in: Data pointer for point cloud
        nCount: Count of points in cloud_in
    Output:
        Index: Data pointer that has the index of points in a plane from input
      modelCoefficients: Data pointer that has the group of coefficients of the plane
    */
    int set(segParam_t param);
    void segment(float *cloud_in, int nCount,
            int *index, float *modelCoefficients);
private:
    void *m_handle = NULL;
};

Table 3 shows the performance comparison between CUDA-Segmentation and PCL-Segmentation.

[Image: Table 3, CUDA-Segmentation vs. PCL-Segmentation performance comparison]

Figures 3 and 4 show the raw point cloud and a processed version that retains only obstacle-related points. This pipeline is typical in point cloud processing: removing the ground, deleting some points, extracting features, and clustering the rest.


Figure 3. Original point cloud of CUDA-Segmentation


Figure 4. Point cloud processed by CUDA-Segmentation

CUDA-Filter

Filtering is one of the most important preprocessing operations before segmentation, detection, and recognition of point clouds. Filtering constrains point coordinates directly along the X, Y, or Z axis; a pass-through filter can restrict a single axis (commonly Z) or all three. CUDA-Filter currently supports only PassThrough, but more methods will be supported in the future. The code below creates an instance of the class, initializes the parameters, and then calls filter() directly.

cudaFilter filterTest(stream);
  FilterParam_t setP;
  FilterType_t type = PASSTHROUGH;
  setP.type = type;
  setP.dim = 2;
  setP.upFilterLimits = 1.0;
  setP.downFilterLimits = 0.0;
  setP.limitsNegative = false;
  filterTest.set(setP);
  filterTest.filter(output, &countLeft, input, nCount);

CUDA-Filter filters the nCount input points on the GPU according to the parameters and outputs a result containing countLeft points.

typedef struct {
    FilterType_t type;
    //0=x,1=y,2=z
    int dim;
    float upFilterLimits;
    float downFilterLimits;
    bool limitsNegative;
} FilterParam_t;
class cudaFilter
{
public:
    cudaFilter(cudaStream_t stream = 0);
    ~cudaFilter(void);
    int set(FilterParam_t param);
    /*
    Input:
        source: data pointer for point cloud
        nCount: count of points in cloud_in
    Output:
        output: data pointer which has points filtered by CUDA
        countLeft: count of points in output
    */
    int filter(void *output, unsigned int *countLeft, void *source, unsigned int nCount);
    void *m_handle = NULL;
};

Table 4 shows the performance comparison between CUDA-Filter and PCL-Filter.

[Image: Table 4, CUDA-Filter vs. PCL-Filter performance comparison]

Figures 5 and 6 show an example of PassThrough filtering with constraints on the X-axis.


Figure 5. Original point cloud.


Figure 6. Point cloud filtered by constraints on the X-axis.

Comparison with other modules

VoxelGrid

[Image: VoxelGrid performance comparison]

cuOctree

[Image: cuOctree performance comparison]

cuCluster

[Image: cuCluster performance comparison]

cuNDT

[Image: cuNDT performance comparison]

Related Links

https://github.com/NVIDIA-AI-IOT/cuPCL.git

https://developer.nvidia.com/blog/accelerating-lidar-for-robotics-with-cuda-based-pcl/

https://developer.nvidia.com/blog/detecting-objects-in-point-clouds-with-cuda-pointpillars/

