Talking about “360 panoramic stitching” with fisheye cameras, starting from an open source project

Table of Contents

Overview

Starting from the background of the 360 panorama

Perspective transformation, skipping parameter calibration

Post-processing of the stitched image

References


Overview

The reason for writing this article stems entirely from an open source project (see reference [1] for the GitHub link). The project covers a fairly complete production pipeline for a surround-view system, including calibration, projection, stitching, and real-time operation. This article mainly sorts out some implementation details of panoramic stitching and records my own thoughts along the way. Building on this open source project, the follow-up plan is to: (1) complete a 360 panoramic stitching demo based on the calibration parameters (intrinsic and extrinsic) of an autonomous vehicle's surround-view cameras; (2) take the parameter calibration steps that are skipped in this article and analyze them in detail in a separate article.

Starting from the background of the 360 panorama

Baidu Encyclopedia: “The 360-degree panoramic reversing image is a parking assistance system that lets the driver view, on the in-vehicle display, real-time fused images (bird's-eye-view images) of the vehicle's full 360-degree surroundings. With an ultra-wide viewing angle and seamless stitching, it reveals the blind zones around the vehicle and helps drivers park more intuitively and safely. It is also called a panoramic parking imaging system (as distinct from the ‘panorama’ systems on the market that merely show the surrounding views in a split-screen display).”

(Image from the Internet; will be removed upon request.)

To put it more simply, the 360 panoramic image visually presents a bird's-eye view, that is, the view looking down from the sky at the roof of the car and the ground around it. However, this bird's-eye view is obtained by stitching photos taken from multiple viewpoints. As shown in the picture below, the car carries a fisheye camera in each of the four directions: front, rear, left, and right. Due to field-of-view constraints, the left and right fisheye cameras are usually mounted under the side mirrors.

(Image from the Internet; will be removed upon request.)

By stitching the four fisheye cameras' images onto one picture with a suitable method, and then pasting a photo of the vehicle into the center, an approximately complete panoramic image is formed. The picture below shows the front, rear, left, and right photos from the four fisheye cameras used in the open source project.

Perspective transformation, skipping parameter calibration

The focus of this article is to sort out how to transform the perspective from fisheye images to the panoramic image, while skipping the calibration of the camera's intrinsic and extrinsic parameters. To go from a monocular fisheye image to a bird's-eye view, the most important step is the projection from the left image to the right image shown below, also called a perspective transformation.

The change from left to right is achieved by the following expression. Let me add here that some students call the perspective matrix the extrinsic parameters; I think that statement is far-fetched. To illustrate the problem, let us first discuss it in the world coordinate system; the same reasoning applies to other coordinate systems.

(Image from the Internet; will be removed upon request.)
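The formula in the original post is an image; what the three questions below refer to is essentially the standard pinhole projection between pixel coordinates and world coordinates, reproduced here for readability (my notation, not the author's figure):

s\,[u, v, 1]^T = K\,[R \mid t]\,[X_w, Y_w, Z_w, 1]^T

where K is the intrinsic matrix and [R \mid t] the extrinsic parameters; going back from pixel coordinates to world coordinates means inverting this relation, which needs both.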
  1. Referring to the formula above, back-projecting pixel coordinates to the world coordinate system requires both the intrinsic and extrinsic parameters. So even if the target of the perspective transformation is the world coordinate system, can the perspective matrix really be called just the extrinsic parameters?
  2. Secondly, is it appropriate to define the coordinates obtained after the perspective transformation as world coordinates? Each view's perspective transformation is computed separately, so the results do not even live in the same coordinate system.
  3. Finally, going from pixel coordinates to “world coordinates” in this way only yields coordinates on a normalized plane, which are not world coordinates in the strict sense. Is it then accurate to call the coordinates obtained after the perspective transformation world coordinates?

The above three questions are just a starting point, and I would like to discuss them with you. I also plan to write two more articles, on camera intrinsic parameters and on real-vehicle panoramic stitching respectively, to examine this question with data.

Quoted from:
Perspective Transformation | TheAILearner
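The cited formula is shown as an image in the original post; it is the standard 3x3 perspective (homography) transformation, written out here for readability:

[x', y', w]^T = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & 1 \end{bmatrix} [x, y, 1]^T, \qquad (x_{new}, y_{new}) = (x'/w, \; y'/w)

The upper-left 2x2 block encodes rotation, scale, and shear; a_{13} and a_{23} encode translation; a_{31} and a_{32} are the projection terms that create the perspective effect; and the lower-right element is conventionally normalized to 1.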

The above formula explains the perspective transformation matrix and the physical meaning of its elements. If you are interested, you can go directly to the original address and read it; the illustration of the perspective transformation below is also quite vivid.

Quoted from:
Perspective Transformation | TheAILearner

The author applies the same kind of perspective transformation to all four fisheye images. But there is another important task before the projection can be carried out: removing the distortion of the image, and making some corrections to the field of view in the process.

(Image from the Internet; will be removed upon request.)
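The formula referenced in the next paragraph is shown as an image in the original post; judging from the code below, it is the “corrected” intrinsic matrix obtained by scaling the focal lengths and shifting the principal point (my reconstruction):

K_{new} = \begin{bmatrix} s_x f_x & 0 & c_x + d_x \\ 0 & s_y f_y & c_y + d_y \\ 0 & 0 & 1 \end{bmatrix}

where (s_x, s_y) corresponds to scale_xy and (d_x, d_y) to shift_xy in the code.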

Regarding the correction matrix in OpenCV's de-distortion method: referring to the formula above, if you have worked out what each element of the intrinsic matrix means, it should be clear what multiplying a coefficient at the corresponding non-zero element positions does: it scales and translates the pixel coordinates along each axis. With the output image size fixed, the effect of this change amounts to “cropping” the view. This completes the distortion removal.

def update_undistort_maps(self):
    new_matrix = self.camera_matrix.copy()
    # Scale the focal lengths and shift the principal point ("correcting" the intrinsics)
    new_matrix[0, 0] *= self.scale_xy[0]
    new_matrix[1, 1] *= self.scale_xy[1]
    new_matrix[0, 2] += self.shift_xy[0]
    new_matrix[1, 2] += self.shift_xy[1]
    width, height = self.resolution

    # Precompute the remap tables for fisheye undistortion with the corrected matrix
    self.undistort_maps = cv2.fisheye.initUndistortRectifyMap(
        self.camera_matrix,
        self.dist_coeffs,
        np.eye(3),
        new_matrix,
        (width, height),
        cv2.CV_16SC2
    )
    return self

def undistort(self, image):
    # Apply the precomputed maps to remove the fisheye distortion
    result = cv2.remap(image, *self.undistort_maps,
                       interpolation=cv2.INTER_LINEAR,
                       borderMode=cv2.BORDER_CONSTANT)
    return result
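For reference, here is a minimal standalone version of the same two steps outside the project's class. The intrinsic matrix, distortion coefficients, scale/shift values, and file name below are placeholders for illustration, not the project's calibration results:

import cv2
import numpy as np

# Placeholder fisheye intrinsics and distortion coefficients; replace with real calibration.
K = np.array([[320.0, 0.0, 640.0],
              [0.0, 320.0, 480.0],
              [0.0, 0.0, 1.0]])
D = np.array([[-0.05], [0.01], [-0.005], [0.001]])
scale_xy = (0.7, 0.8)        # arbitrary field-of-view scaling
shift_xy = (-150, -100)      # arbitrary principal-point shift
width, height = 1280, 960

# Build the "corrected" camera matrix: scale the focal lengths, shift the principal point.
new_K = K.copy()
new_K[0, 0] *= scale_xy[0]
new_K[1, 1] *= scale_xy[1]
new_K[0, 2] += shift_xy[0]
new_K[1, 2] += shift_xy[1]

# Precompute the remap tables, then undistort one frame.
map1, map2 = cv2.fisheye.initUndistortRectifyMap(
    K, D, np.eye(3), new_K, (width, height), cv2.CV_16SC2)
frame = cv2.imread("front.png")   # placeholder file name
undistorted = cv2.remap(frame, map1, map2,
                        interpolation=cv2.INTER_LINEAR,
                        borderMode=cv2.BORDER_CONSTANT)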

Next we move on to the perspective transformation itself. It needs the input image and a transformation matrix. For how the transformation matrix is obtained, please refer to the source code; the basic idea is to take the known real-world size of the calibration cloth, scale it to the corresponding target coordinates (used as dst), take the coordinates of the corresponding points in the undistorted fisheye image as src, and obtain the matrix with getPerspectiveTransform (a small sketch follows after the code below). I won't go into detail here; I will write a dedicated article later.

def project(self, image):
    # Warp the undistorted image into the bird's-eye view using the precomputed matrix
    result = cv2.warpPerspective(image, self.project_matrix, self.project_shape)
    return result
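As promised above, here is a minimal sketch of how project_matrix could be obtained with getPerspectiveTransform. The src/dst corner coordinates and the file name are made-up values; in practice src comes from picking the calibration-cloth corners in the undistorted fisheye image and dst from the chosen bird's-eye-view scale:

import cv2
import numpy as np

# Corners of the calibration pattern in the undistorted fisheye image
# (placeholder values, normally picked by hand or detected automatically).
src = np.float32([[360, 300], [920, 300], [180, 600], [1100, 600]])

# Where those corners should land in the bird's-eye view, i.e. the real-world
# layout of the calibration cloth scaled to pixels (placeholder values).
dst = np.float32([[200, 100], [600, 100], [200, 500], [600, 500]])

project_matrix = cv2.getPerspectiveTransform(src, dst)

undistorted = cv2.imread("front_undistorted.png")   # placeholder file name
birdeye_view = cv2.warpPerspective(undistorted, project_matrix, (800, 600))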

The next step is to assemble the projected views into one picture. The main thing to sort out here is the relative position and orientation of each view. Since each view's perspective transformation is processed separately, each result is still oriented the way its camera faces: the rear camera image needs to be rotated 180 degrees and placed at the bottom of the bird's-eye view, the left camera view needs to be rotated counterclockwise and placed on the left side, and the right side is handled symmetrically (a rough layout sketch follows after the code below).

def flip(self, image):
    if self.camera_name == "front":
        # Front view is already oriented correctly
        return image.copy()
    elif self.camera_name == "back":
        # Rotate 180 degrees (flip both axes)
        return image.copy()[::-1, ::-1, :]
    elif self.camera_name == "left":
        # Rotate 90 degrees counterclockwise
        return cv2.transpose(image)[::-1]
    else:
        # Rotate 90 degrees clockwise
        return np.flip(cv2.transpose(image), 1)
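To make the layout concrete, here is a rough sketch of how the flipped views could be pasted onto one canvas. The canvas size and the car-region borders are invented for illustration; in the project they are derived from the calibration-cloth geometry:

import numpy as np

# Invented dimensions: full canvas and the borders of the central car region.
total_w, total_h = 1200, 1600
xl, xr = 400, 800        # left/right borders of the car region
yt, yb = 500, 1100       # top/bottom borders of the car region

# Placeholder flipped views with the shapes this layout expects.
front_flipped = np.zeros((yt, total_w, 3), np.uint8)
back_flipped = np.zeros((total_h - yb, total_w, 3), np.uint8)
left_flipped = np.zeros((total_h, xl, 3), np.uint8)
right_flipped = np.zeros((total_h, total_w - xr, 3), np.uint8)

# Paste each strip in place; the corner regions overlap and are simply overwritten here,
# whereas the project blends them (see the next section).
canvas = np.zeros((total_h, total_w, 3), np.uint8)
canvas[:yt, :] = front_flipped
canvas[yb:, :] = back_flipped
canvas[:, :xl] = left_flipped
canvas[:, xr:] = right_flipped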

Post-processing of the stitched image

The stitching and smoothing of the bird's-eye view in the final stage is not the focus of this article, but it mainly involves several strategies:

  • Weighted averaging of pixel values in the overlapping areas (a rough sketch of this follows after the code below)
  • Adjusting the brightness of each area so the stitched image has consistent luminance
  • Using white balance to even out the differing intensities of the cameras' color channels

In the open source project these steps are invoked roughly as follows:

birdview = BirdView()
Gmat, Mmat = birdview.get_weights_and_masks(projected)
birdview.update_frames(projected)
birdview.make_luminance_balance().stitch_all_parts()
birdview.make_white_balance()
birdview.copy_car_image()
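As a rough illustration of the first strategy, per-pixel weights in an overlap region can be derived from each view's distance transform, so that pixels deeper inside a view contribute more of that view's color. This is only a hedged sketch of the idea, not the project's exact get_weights_and_masks implementation:

import cv2
import numpy as np

def blend_overlap(img_a, img_b, mask_a, mask_b):
    """Blend two projected views where their valid regions (255 in the uint8 masks) overlap."""
    overlap = cv2.bitwise_and(mask_a, mask_b)

    # Distance of each pixel to the border of the view's valid region.
    dist_a = cv2.distanceTransform(mask_a, cv2.DIST_L2, 3)
    dist_b = cv2.distanceTransform(mask_b, cv2.DIST_L2, 3)
    w = (dist_a / (dist_a + dist_b + 1e-6))[..., None]   # weight of view A, broadcast over channels

    # Start with whichever view is valid, then replace the overlap with the weighted average.
    out = np.where(mask_a[..., None] > 0, img_a, img_b).astype(np.float32)
    blend = w * img_a.astype(np.float32) + (1.0 - w) * img_b.astype(np.float32)
    out[overlap > 0] = blend[overlap > 0]
    return out.astype(np.uint8)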

References

[1] https://github.com/neozhaoliang/surround-view-system-introduction