Converting 3D Point Cloud Data to a Bird’s-Eye View: A Python Implementation

This article is written mainly with reference to an English blog post, and it is intended only as a personal study note.

Article directory

    • 1. Point cloud data
    • 2. Image and point cloud coordinates
    • 3. Create a bird’s-eye view of point cloud data
      • 3.1 Relevant coordinate axes of bird’s eye view
      • 3.2 Limiting the range of point cloud data
      • 3.3 Mapping point positions to pixel positions
      • 3.4 Switching to a new zero point
      • 3.5 Pixel values
      • 3.6 Create image matrix
      • 3.7 Display

1. Point cloud data

Point cloud data should be represented as a numpy array with N rows and at least 3 columns. Each row corresponds to a single point, with at least 3 values representing its position in space: (x, y, z).

If the point cloud data comes from a lidar sensor, it may provide additional values for each point, such as “reflectivity,” which is a measure of how much of the laser beam is reflected back by obstacles at that location. In this case, the point cloud data may be an N×4 array.
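
As a concrete starting point, the snippet below is a minimal sketch of loading lidar data into such an array. It assumes a KITTI-style .bin file of float32 values laid out as x, y, z, reflectance; the file name is only a placeholder.

import numpy as np

# Minimal sketch: load a KITTI-style .bin lidar scan (float32 values
# laid out as x, y, z, reflectance) into an (N, 4) numpy array.
# "lidar_scan.bin" is a placeholder path, not a file from this article.
points = np.fromfile("lidar_scan.bin", dtype=np.float32).reshape(-1, 4)

print(points.shape)  # (N, 4)
print(points[:3])    # first three points: x, y, z, reflectance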

2. Image and point cloud coordinates

Some important things to note about images:

  • Coordinate values in images are always positive.

  • The origin is at the upper left corner.

  • Coordinates are integer values.

Things to note about point cloud coordinates:

  • Coordinate values in a point cloud can be positive or negative.

  • Coordinates can take real numbers.

  • The positive x-axis represents forward.

  • The positive y-axis represents left.

  • The positive z-axis represents up.

3. Create a bird’s-eye view of point cloud data

3.1 Relevant coordinate axes of bird’s eye view

In order to create a bird’s-eye view image, the relevant axes of the point cloud data will be the x-axis and y-axis.

However, we must be careful and take the following points into account (a rough sketch of this axis relationship follows the list):

  • The image axes and the point cloud axes mean opposite things: the image’s x-axis corresponds to the point cloud’s y-axis, and the image’s y-axis corresponds to the point cloud’s x-axis.

  • The corresponding axes also point in opposite directions.

  • You have to shift these values so that (0,0) becomes the smallest coordinate value in the image.
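
As a rough sketch of that relationship (no scaling or shifting yet; the helper name below is hypothetical and only for illustration):

# Conceptual mapping (before scaling to a pixel resolution or shifting
# the origin): the image's x comes from the negated point cloud y, and
# the image's y comes from the negated point cloud x.
def lidar_to_image_axes(x_lidar, y_lidar):
    return -y_lidar, -x_lidar

# A point 10 m ahead and 2 m to the left ends up left of and above the
# (still unshifted) image origin.
print(lidar_to_image_axes(10.0, 2.0))  # (-2.0, -10.0)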

3.2 Limiting the range of point cloud data

Often it is useful to focus only on specific regions of the point cloud. Therefore, we want to create a filter that only keeps points within our region of interest.

Since we’re looking at the data from the top and we’re aiming to convert it to an image, I’ll use an orientation that is more consistent with the image axes. Below, I specify the range of values I want to focus on relative to the origin. Anything to the left of the origin is considered negative, and anything to the right is considered positive. The x-axis of the point cloud will be interpreted as the forward direction (this will be the upward direction of our bird’s-eye image).

The code below sets the rectangle of interest to span 10m on either side of the origin and 20m forward from the origin.

side_range=(-10, 10) # left-most to right-most
fwd_range=(0, 20) # back-most to forward-most

Next, we create a filter that only keeps points that are actually within the rectangle we specify.

import numpy as np

# EXTRACT THE POINTS FOR EACH AXIS
x_points = points[:, 0]
y_points = points[:, 1]
z_points = points[:, 2]

# FILTER - To return only indices of points within desired cube
# Three filters for: Front-to-back, side-to-side, and height ranges
# Note left side is positive y axis in LIDAR coordinates
f_filt = np.logical_and((x_points > fwd_range[0]), (x_points < fwd_range[1]))
s_filt = np.logical_and((y_points > -side_range[1]), (y_points < -side_range[0]))
filter = np.logical_and(f_filt, s_filt)
indices = np.argwhere(filter).flatten()

#KEEPERS
x_points = x_points[indices]
y_points = y_points[indices]
z_points = z_points[indices]
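
As a quick sanity check (with three made-up points, purely for illustration), the filter keeps a point 5 m ahead and 3 m to the left, and drops points that fall outside either range:

import numpy as np

# Sanity check of the range filter on made-up points (x forward, y left, z up).
side_range = (-10, 10)  # same example values as above
fwd_range = (0, 20)

demo = np.array([[ 5.0,  3.0, -1.5],   # 5 m ahead, 3 m left  -> kept
                 [30.0,  0.0, -1.5],   # 30 m ahead           -> outside fwd_range
                 [10.0, 15.0, -1.5]])  # 15 m to the left     -> outside side_range

demo_f = np.logical_and(demo[:, 0] > fwd_range[0], demo[:, 0] < fwd_range[1])
demo_s = np.logical_and(demo[:, 1] > -side_range[1], demo[:, 1] < -side_range[0])
print(np.logical_and(demo_f, demo_s))  # [ True False False]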

3.3 Mapping point positions to pixel positions

Now we have a bunch of points with real-valued positions, and we need to map them to integer pixel positions. We could simply typecast all the x and y values to integers, but we could lose a lot of resolution. For example, if the points are measured in meters, each pixel would represent a 1×1 meter rectangle, and we would lose any detail smaller than that. This might be fine if you have a point cloud of something like a mountain scene, but if you want to capture finer detail and recognize people, cars, or even smaller objects, this method is not good enough.

However, the above method can be slightly modified so that we can get the resolution we want. We can scale the data first before typecasting to an integer. For example, if the unit of measure is meters and we want a resolution of 5cm, we can do this:

res = 0.05
# CONVERT TO PIXEL POSITION VALUES - Based on resolution
x_img = (-y_points / res).astype(np.int32) # x axis is -y in LIDAR
y_img = (-x_points / res).astype(np.int32) # y axis is -x in LIDAR

You may have noticed that the x and y axes have been swapped and the directions reversed so we can start working with image coordinates.
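
For instance (a made-up point, just to make the numbers concrete), a point 12.30 m ahead of the sensor and 4.56 m to its left maps to the following pixel positions at 5 cm resolution:

# Made-up example point: 12.30 m forward, 4.56 m to the left.
res = 0.05
x_point, y_point = 12.30, 4.56

x_img = int(-y_point / res)  # -91
y_img = int(-x_point / res)  # -246
print(x_img, y_img)

Both values are negative, which is exactly why the next step shifts everything to a new zero point.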

3.4 Switching to a new zero point

The x and y data are not quite ready to be mapped to an image. We may also have negative x and y values. So we need to shift the data so that (0,0) is the minimum value.

# SHIFT PIXELS TO HAVE MINIMUM BE (0,0)
# floor and ceil used to prevent anything being rounded to below 0 after shift
x_img -= int(np.floor(side_range[0] / res))
y_img += int(np.ceil(fwd_range[1] / res))
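
With the example ranges used above (side_range = (-10, 10), fwd_range = (0, 20), res = 0.05), the shifts work out to 200 pixels sideways and 400 pixels forward; a small check, continuing the made-up point from the previous sketch:

import numpy as np

side_range = (-10, 10)  # same example values as above
fwd_range = (0, 20)
res = 0.05

shift_x = -int(np.floor(side_range[0] / res))
shift_y = int(np.ceil(fwd_range[1] / res))
print(shift_x, shift_y)               # 200 400

# The made-up point from the previous sketch (x_img = -91, y_img = -246)
# therefore lands at pixel (109, 154), safely inside the image.
print(-91 + shift_x, -246 + shift_y)  # 109 154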

3.5 Pixel values

So we’ve used the point data to specify the x and y positions in the image. All we need to do now is decide what values to fill these pixel locations with. One possibility is to fill them with height data. Two things must be kept in mind:

  • Pixel values should be integers.

  • Pixel values should lie in the range 0-255.

We can grab the min and max height values from the data and rescale that range to fit within 0-255. Another approach, which will be used here, is to set a range of height values that we want to focus on, and clip anything below or above that range to the min and max respectively. This is useful because it allows us to obtain the maximum amount of detail in the region of interest.

height_range = (-2, 0.5) # bottom-most to upper-most

# CLIP HEIGHT VALUES - to between min and max heights
pixel_values = np.clip(a=z_points,
                       a_min=height_range[0],
                       a_max=height_range[1])

Next, we rescale these values to lie between 0 and 255 and typecast them to integers.

def scale_to_255(a, min, max, dtype=np.uint8):
    """ Scales an array of values from specified min, max range to 0-255
        Optionally specify the data type of the output (default is uint8)
    """
    return (((a - min) / float(max - min)) * 255).astype(dtype)

# RESCALE THE HEIGHT VALUES - to be between the range 0-255
pixel_values = scale_to_255(pixel_values, min=height_range[0], max=height_range[1])
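
As a quick check of scale_to_255 (using the same height_range bounds as above), the two clip bounds map to the ends of the 0-255 range and the midpoint maps to the middle:

import numpy as np

# The clip bounds map to 0 and 255; a height halfway between maps to 127.
print(scale_to_255(np.array([-2.0, -0.75, 0.5]), min=-2, max=0.5))  # [  0 127 255]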

3.6 Create image matrix

Now we’re ready to actually create the image. We just need to initialize an array whose size depends on the spatial extent we want the image to cover and on the resolution we chose. We then use the x and y values converted to pixel positions as indices into the array, and assign to those indices the pixel values chosen in the previous section.

# INITIALIZE EMPTY ARRAY - of the dimensions we want
x_max = 1 + int((side_range[1] - side_range[0])/res)
y_max = 1 + int((fwd_range[1] - fwd_range[0])/res)
im = np.zeros([y_max, x_max], dtype=np.uint8)

# FILL PIXEL VALUES IN IMAGE ARRAY
im[y_img, x_img] = pixel_values
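
One thing to be aware of: if several points fall into the same pixel, the plain assignment above keeps whichever value happens to be written last. If you would rather keep the highest point in each cell, one possible variation (not part of the original article) is numpy's unbuffered in-place maximum:

import numpy as np

# Optional variation: when several points map to the same (y_img, x_img)
# cell, keep the maximum pixel value instead of the last one written.
# np.maximum.at applies the operation in place, even with repeated indices.
np.maximum.at(im, (y_img, x_img), pixel_values)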

3.7 Display

Currently, the image is stored as a numpy array. If we wish to visualize it, we can convert it to a PIL image and view it.

# CONVERT FROM NUMPY ARRAY TO A PIL IMAGE
from PIL import Image
im2 = Image.fromarray(im)
im2.show()
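
Alternatively (a small sketch, assuming matplotlib is installed; the output file name is just a placeholder), the array can be displayed or saved with matplotlib:

# DISPLAY / SAVE WITH MATPLOTLIB (alternative to PIL)
import matplotlib.pyplot as plt

plt.imshow(im, cmap="gray", vmin=0, vmax=255)
plt.axis("off")
plt.savefig("birdseye.png", bbox_inches="tight")  # placeholder file name
plt.show()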