This article is based on OpenCV 4.8.0. Environment configuration may be covered separately later; for now I will just mention vcpkg, which makes it really easy.
1 General process
A sparse point cloud is built up gradually from multiple images. The process usually includes the following steps:
- Initial reconstruction:
  The choice of the initial two images is very important: it is the foundation of the whole process, and every subsequent image is added on top of these two.
  - For the input images, first extract feature points (for example SIFT, SURF, or ORB), then match the feature points across images to establish the correspondences between them
  - Estimate the camera's extrinsic parameters (rotation matrix R and translation vector T) from the essential matrix E between the two images, then use triangulation to compute some initial 3D points
  For the concrete steps, see my previous blog posts
- Incremental reconstruction:
  From here on, images are added one at a time and the 3D point cloud grows step by step
  - Add a new image: load a new image into the reconstruction process
  - Feature extraction and matching: extract feature points from the new image and match them against the previous image to obtain new correspondences
  - Pose estimation: estimate the new image's camera pose relative to the previous one, usually with PnP (Perspective-n-Point) – with known camera intrinsics K, use the 3D points visible from this view (object_points) and their corresponding image points (image_points) to estimate where the camera was when the shot was taken
  - Triangulation of new 3D points: use the new matches and the estimated pose (R, T) to triangulate new 3D points
  - Point cloud merging: merge the newly generated 3D points with the existing cloud to build a larger sparse point cloud
- Global point cloud optimization: once the sparse point cloud has been generated, a global optimization technique such as Bundle Adjustment can be applied to improve its accuracy
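The quantity Bundle Adjustment minimizes is the total reprojection error. As a minimal illustration of that residual (plain C++, no Ceres; a hypothetical pinhole camera with focal length f and principal point (cx, cy), point already expressed in the camera frame):

```cpp
#include <array>

// Pinhole projection of a camera-frame point (x, y, z) with focal
// length f and principal point (cx, cy): u = f*x/z + cx, v = f*y/z + cy.
std::array<double, 2> project(const std::array<double, 3>& p,
                              double f, double cx, double cy) {
    return { f * p[0] / p[2] + cx, f * p[1] / p[2] + cy };
}

// Squared reprojection error between predicted and observed pixel;
// Bundle Adjustment minimizes the sum of these over all observations.
double reprojectionError2(const std::array<double, 2>& pred,
                          const std::array<double, 2>& obs) {
    double dx = pred[0] - obs[0], dy = pred[1] - obs[1];
    return dx * dx + dy * dy;
}
```

A point at (0, 0, 2) with cx = cy = 0 projects exactly to the principal point, so its error against an observation at (0, 0) is zero; Ceres would jointly adjust the camera poses and 3D points to drive these residuals down.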
2 Prepare code
In the previous article, all the code was squeezed into the main function, which was very unsightly. Now let's reorganize it. I have only just started learning C++, so please forgive the rather rough code.
2.1 Include.h
All header files and macros used are collected here for convenience. Since Bundle Adjustment will be used later, ceres is introduced (configuring its environment may be covered separately; it is genuinely troublesome, and vcpkg is highly recommended). The large number of #define lines and the #pragma warning(disable: 4996) are there to suppress ceres-related errors.
```cpp
#ifndef INCLUDES_H
#define INCLUDES_H

#define GLOG_NO_ABBREVIATED_SEVERITIES
#define _CRT_NONSTDC_NO_DEPRECATE
#define NOMINMAX
#define _CRT_NONSTDC_NO_WARNINGS
#pragma warning(disable: 4996)

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>
#include <fstream>
#include <ceres/ceres.h>
#include <ceres/rotation.h>

using namespace cv;
using namespace std;

#endif // !INCLUDES_H
#pragma once
```
2.2 Constructor
The Constructor class contains functions for several key steps of the 3D reconstruction:

- findCamera: used in the initial reconstruction; recovers R and T from the essential matrix E (with RANSAC)
- maskoutPoints: filters the matched points using the inlier mask
- pointsReconstruct: triangulates the matched points with R and T to generate 3D points
Constructor.h:
```cpp
#ifndef CONSTRUCTOR_H
#define CONSTRUCTOR_H
#include "Includes.h"
#include "Images.h"

class Constructor
{
public:
	// Input K and the matched points of image 1 and image 2; output R, T; points are filtered in place
	static void findCamera(Mat K, vector<Point2f>& point1, vector<Point2f>& point2,
		Mat& output_R, Mat& output_T, vector<uchar>& mask);

	// Input the matched points and the inlier mask; keeps only the inlier points
	static void maskoutPoints(vector<Point2f>& input_points, vector<uchar>& input_mask);

	// Input K and the R, T and matched points of both images; returns the triangulated 3D points
	// (returned by value: returning a reference to a local would dangle)
	static vector<Point3d> pointsReconstruct(const Mat& K, Mat& R1, Mat& T1, Mat& R2, Mat& T2,
		vector<Point2f>& points1, vector<Point2f>& points2);
};
#endif // !CONSTRUCTOR_H
#pragma once
```
Constructor.cpp:
```cpp
#include "Constructor.h"

void Constructor::findCamera(Mat K, vector<Point2f>& point1, vector<Point2f>& point2,
	Mat& output_R, Mat& output_T, vector<uchar>& mask)
{
	vector<uchar> inliers;
	Mat F = findFundamentalMat(point1, point2, inliers, FM_RANSAC, 1, 0.5);
	Mat E = K.t() * F * K;
	//Mat E = findEssentialMat(point1, point2, K, RANSAC, 0.6, 1.0, inliers);
	mask = inliers;

	// Filter the matched points down to the inliers
	Constructor::maskoutPoints(point1, inliers);
	Constructor::maskoutPoints(point2, inliers);

	// Decompose the E matrix to obtain R and T
	int pass_count = recoverPose(E, point1, point2, K, output_R, output_T);
}

void Constructor::maskoutPoints(vector<Point2f>& input_points, vector<uchar>& input_mask)
{
	vector<Point2f> temp_points(input_points);
	input_points.clear();
	for (int i = 0; i < temp_points.size(); ++i)
	{
		if (input_mask[i])
			input_points.push_back(temp_points[i]);
	}
}

vector<Point3d> Constructor::pointsReconstruct(const Mat& K, Mat& R1, Mat& T1, Mat& R2, Mat& T2,
	vector<Point2f>& points1, vector<Point2f>& points2)
{
	// Build the projection matrices
	Mat proj1(3, 4, CV_32FC1);
	Mat proj2(3, 4, CV_32FC1);

	// Put the rotation matrices and translation vectors into the projection matrices
	R1.convertTo(proj1(Range(0, 3), Range(0, 3)), CV_32FC1);
	T1.convertTo(proj1.col(3), CV_32FC1);
	R2.convertTo(proj2(Range(0, 3), Range(0, 3)), CV_32FC1);
	T2.convertTo(proj2.col(3), CV_32FC1);

	// Multiply by the intrinsic matrix to get the final projection matrices
	Mat fK;
	K.convertTo(fK, CV_32FC1);
	proj1 = fK * proj1;
	proj2 = fK * proj2;

	// Triangulate to obtain homogeneous coordinates (float, matching the projection matrices)
	Mat point4D_homogeneous(4, points1.size(), CV_32FC1);
	triangulatePoints(proj1, proj2, points1, points2, point4D_homogeneous);

	// Convert the homogeneous coordinates to 3D points
	vector<Point3d> point3D;
	point3D.reserve(point4D_homogeneous.cols);
	for (int i = 0; i < point4D_homogeneous.cols; ++i)
	{
		Mat_<float> col = point4D_homogeneous.col(i);
		col /= col(3);
		point3D.push_back(Point3d(col(0), col(1), col(2)));
	}
	// Return the 3D points (by value; the original returned a reference to this local)
	return point3D;
}
```
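The final loop of pointsReconstruct divides each homogeneous 4-vector by its w component. That conversion, isolated in plain C++ for clarity (a hypothetical helper, no OpenCV):

```cpp
#include <array>

// Convert a homogeneous coordinate (x, y, z, w) to a Euclidean 3D
// point by dividing through by w, as done per column of point4D_homogeneous.
std::array<double, 3> fromHomogeneous(const std::array<double, 4>& h) {
    return { h[0] / h[3], h[1] / h[3], h[2] / h[3] };
}
```

For example, (2, 4, 6, 2) maps to (1, 2, 3). Points whose w is close to zero lie near infinity and are one source of the "outrageous points" mentioned later.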
2.3 Image
In order to add images incrementally, we need to record, for each feature point in an image, its corresponding point in space – correspond_struct_idx
The Images class has the following member variables:

- Mat image – stores the image
- vector&lt;KeyPoint&gt; keyPoints – stores the feature points
- Mat descriptor – stores the feature descriptors
- vector&lt;int&gt; correspond_struct_idx – index in the point cloud of the spatial point corresponding to each feature point
- vector&lt;Point2f&gt; matchedPoints – stores the matched points
- vector&lt;Vec3b&gt; colors – stores the color information of the matched points
- Mat R, T – store the camera's rotation matrix and translation vector
It also has several important image-processing functions:

- Images: constructor; extracts feature points while reading the image
- matchFeatures: matches feature points against another image
- findColor: extracts color information at the matched points
- getObjPointsAndImgPoints: finds the points in the current match that are already in the point cloud and collects object_points and image_points – preparation for PnP
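The filtering inside matchFeatures is essentially Lowe's ratio test: a knn match with k = 2 is kept only when the best distance is clearly below the second-best. The core logic, stripped of the OpenCV types (a sketch using the same 0.76 ratio as the code):

```cpp
#include <utility>
#include <vector>

// Lowe's ratio test over {best, second-best} distance pairs: keep
// match i only if best < ratio * secondBest (ratio = 0.76 as in matchFeatures).
std::vector<int> ratioTest(const std::vector<std::pair<float, float>>& knn2,
                           float ratio = 0.76f) {
    std::vector<int> kept;
    for (int i = 0; i < static_cast<int>(knn2.size()); ++i)
        if (knn2[i].first < ratio * knn2[i].second)
            kept.push_back(i);
    return kept;
}
```

For distance pairs {10, 100}, {90, 100}, {50, 70}, only the first and third survive (50 &lt; 0.76 × 70 = 53.2); the ambiguous second pair is dropped. The actual member function adds a second, absolute-distance threshold on top of this.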
Image.h:
```cpp
#ifndef IMAGES_H
#define IMAGES_H
#include "Includes.h"

class Images
{
public:
	Mat image;                          // stores the image
	vector<KeyPoint> keyPoints;         // stores the feature points
	Mat descriptor;                     // stores the feature descriptors
	vector<int> correspond_struct_idx;  // index in the point cloud of the spatial point for each feature point
	vector<Point2f> matchedPoints;      // stores the matched points
	vector<Vec3b> colors;               // stores the colors of the matched points
	Mat R, T;                           // camera rotation matrix and translation vector
	vector<Point3f> object_points;      // 3D points corresponding to the matches in the previous image
	vector<Point2f> image_points;       // corresponding pixels in the current image

	// Constructor: reads the image from the given path and extracts SIFT feature points and descriptors
	Images(string const image_path);

	// Feature matching: matches the current image against another image
	void matchFeatures(Images& otherImage, vector<DMatch>& outputMatches);

	// Extracts color information at the matched points
	void findColor();

	// Walks the current matches, finds those already in the point cloud, and collects object_points and image_points
	void getObjPointsAndImgPoints(vector<DMatch>& matches, vector<Point3d>& all_reconstructed_points, Images& preImage);
};
#endif // !IMAGES_H
#pragma once
```
Image.cpp:
```cpp
#include "Images.h"

Images::Images(string const image_path)
{
	// Read the image
	this->image = imread(image_path);
	if (this->image.empty())
	{
		cout << "Could not read image: " << image_path << endl;
	}

	// Extract SIFT feature points and descriptors
	Ptr<SIFT> sift = SIFT::create(0, 17, 0.0000000001, 16);
	sift->detectAndCompute(this->image, noArray(), this->keyPoints, this->descriptor);

	// No feature point corresponds to a cloud point yet
	for (int i = 0; i < keyPoints.size(); i++)
	{
		correspond_struct_idx.push_back(-1);
	}
}

void Images::findColor()
{
	// Walk all matched points
	for (Point2f& Points : this->matchedPoints)
	{
		// Read the pixel color and store it in the color vector
		Vec3b color = this->image.at<Vec3b>(Points.y, Points.x);
		this->colors.push_back(color);
	}
}

void Images::matchFeatures(Images& otherImage, vector<DMatch>& outputMatches)
{
	// Clear previous matches
	otherImage.matchedPoints.clear();
	this->matchedPoints.clear();

	vector<vector<DMatch>> matches;
	FlannBasedMatcher matcher;
	// Use FlannBasedMatcher for 2-nearest-neighbor matching
	matcher.knnMatch(this->descriptor, otherImage.descriptor, matches, 2);

	// Compute the minimum distance, considering only matches that pass a loose ratio check
	float min_dist = FLT_MAX;
	for (int r = 0; r < matches.size(); ++r)
	{
		if (matches[r][0].distance < 2.5 * matches[r][1].distance)
		{
			float dist = matches[r][0].distance;
			if (dist < min_dist)
				min_dist = dist;
		}
	}

	// Keep only the good matches (ratio test plus an absolute distance threshold)
	for (int i = 0; i < matches.size(); i++)
	{
		if (matches[i][0].distance < 0.76 * matches[i][1].distance
			&& matches[i][0].distance < 8 * max(min_dist, 10.0f))
		{
			outputMatches.push_back(matches[i][0]);
		}
	}

	// Store the matched points in the matchedPoints vectors
	for (int i = 0; i < outputMatches.size(); ++i)
	{
		this->matchedPoints.push_back(this->keyPoints[outputMatches[i].queryIdx].pt);
		otherImage.matchedPoints.push_back(otherImage.keyPoints[outputMatches[i].trainIdx].pt);
	}
}

// Collect 3D object points and 2D image points from the matches
void Images::getObjPointsAndImgPoints(vector<DMatch>& matches, vector<Point3d>& all_reconstructed_points, Images& preImage)
{
	// Clear object_points and image_points
	this->object_points.clear();
	this->image_points.clear();

	// Walk all matches
	for (int i = 0; i < matches.size(); i++)
	{
		// Index of the 3D point corresponding to this match in the previous image
		int matched_world_point_indices = preImage.correspond_struct_idx[matches[i].queryIdx];

		// If the previous image already has a 3D point for this match
		// (note: >= 0, since -1 marks "no point" and valid cloud indices start at 0)
		if (matched_world_point_indices >= 0)
		{
			// Add the 3D point from the previous image to object_points
			this->object_points.push_back(all_reconstructed_points[matched_world_point_indices]);
			// Add the 2D point of this new image to image_points
			this->image_points.push_back(this->keyPoints[matches[i].trainIdx].pt);
		}
	}
}
```
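The key mechanism in getObjPointsAndImgPoints is the correspond_struct_idx lookup: each keypoint stores the index of its 3D point in the cloud, with -1 meaning "not reconstructed yet". A toy version of that lookup (hypothetical names, plain ints instead of the OpenCV types):

```cpp
#include <vector>

// For each matched keypoint index of the previous image, check whether
// a 3D point already exists for it; -1 marks "not in the cloud yet",
// so the test must be >= 0 (cloud indices start at 0).
std::vector<int> collectKnownCloudIndices(const std::vector<int>& correspondIdx,
                                          const std::vector<int>& matchedPrevKp) {
    std::vector<int> cloudIndices;
    for (int kp : matchedPrevKp)
        if (correspondIdx[kp] >= 0)
            cloudIndices.push_back(correspondIdx[kp]);
    return cloudIndices;
}
```

With correspondIdx = {5, -1, 0, -1}, matching keypoints 0..3 yields cloud indices {5, 0}; note that testing &gt; 0 instead of &gt;= 0 would silently drop cloud point 0.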
3 Specific implementation
Building on the initial 3D point cloud constructed from the first two images, we now implement incremental reconstruction over multiple images.
3.1 Initial build
This has been described in detail in previous posts: match features, recover the camera extrinsics R and T via the essential matrix E, and triangulate to build the point cloud.
Of special note: for the subsequent incremental reconstruction, we need to record the correspondence between each feature point of the initial two images and the point cloud.
```cpp
void initConstruction(vector<Images>& initImages, vector<Point3d>& all_reconstructed_points, vector<Vec3b>& all_points_colors)
{
	initImages.push_back(Images(INIT_IMG_PATH1));
	initImages.push_back(Images(INIT_IMG_PATH2));

	vector<DMatch> matches;
	initImages[0].matchFeatures(initImages[1], matches);

	vector<uchar> mask;
	Constructor::findCamera(K, initImages[0].matchedPoints, initImages[1].matchedPoints,
		initImages[1].R, initImages[1].T, mask);

	// The first camera sits at the origin
	initImages[0].R = Mat::eye(3, 3, CV_64FC1);
	initImages[0].T = Mat::zeros(3, 1, CV_64FC1);

	all_reconstructed_points = Constructor::pointsReconstruct(K,
		initImages[0].R, initImages[0].T, initImages[1].R, initImages[1].T,
		initImages[0].matchedPoints, initImages[1].matchedPoints);

	initImages[1].findColor();
	for (int i = 0; i < initImages[1].colors.size(); i++)
	{
		all_points_colors.push_back(initImages[1].colors[i]);
	}

	// Record, via the mask, which cloud point each feature point of the initial two images corresponds to
	int idx = 0;
	for (int i = 0; i < matches.size(); i++)
	{
		if (mask[i])
		{
			initImages[0].correspond_struct_idx[matches[i].queryIdx] = idx;
			initImages[1].correspond_struct_idx[matches[i].trainIdx] = idx;
			idx++;
		}
	}
}
```
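The mask handling here (maskoutPoints inside findCamera, and the correspond_struct_idx recording at the end) both reduce to compacting data by a RANSAC inlier mask. The pattern in isolation (a generic sketch with simplified types):

```cpp
#include <vector>

// Keep only the elements whose mask entry is nonzero, mirroring what
// Constructor::maskoutPoints does with a RANSAC inlier mask.
template <typename T>
std::vector<T> maskout(const std::vector<T>& values,
                       const std::vector<unsigned char>& mask) {
    std::vector<T> kept;
    for (std::size_t i = 0; i < values.size(); ++i)
        if (mask[i])
            kept.push_back(values[i]);
    return kept;
}
```

maskout&lt;int&gt;({10, 20, 30, 40}, {1, 0, 1, 1}) keeps {10, 30, 40}. The same mask drives the index recording: only inlier matches receive a correspond_struct_idx entry.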
3.2 Incremental Build
- Create subImageBag and add initImages[1] to it. This is the second image of initImages (array index 1); it is what the next image will be matched against (otherwise the next image would have nothing to match with)
- Loop over the image file paths in sub_image_paths
- Inside the loop, create a new Images object for each path and add it to subImageBag. The container subImageBag then holds the second image of the initial pair first, followed by the images to be added one by one
- Call addImageConstruction, passing subImageBag, along with all_reconstructed_points for the sparse point cloud and all_points_colors for the point cloud colors

Inside addImageConstruction:

- Loop over the images in subImageBag starting at index 1 (index 0 is the initial image already used for the initial build, so it is skipped)
- For each pair of adjacent images:
  - Use matchFeatures to find the feature matches between the two adjacent images and store the results in matches
  - Use getObjPointsAndImgPoints to obtain the 3D points and image points corresponding to the matched features – preparation for PnP
  - Use findCamera to RANSAC-filter the matched points and produce a mask marking the valid matches (here only for filtering; the R and T it returns are discarded)
  - Use solvePnPRansac to estimate the new image's camera pose, obtaining R and T
  - Convert the rotation vector to a rotation matrix (solvePnPRansac returns a rotation vector r)
  - Use pointsReconstruct to triangulate new 3D points between the new image and the previous one, storing the result in new_restructure_points
  - Use findColor to obtain the color information of the new image's points
  - Record the correspondences between feature points and the point cloud: traverse matches and, for entries flagged in mask, record the index each newly generated point will have in the cloud, maintaining the point-to-cloud correspondence
  - Finally, append the newly generated 3D points new_restructure_points and their colors to all_reconstructed_points and all_points_colors, continually growing the point cloud
```cpp
void addImageConstruction(vector<Images>& subImageBag, vector<Point3d>& all_reconstructed_points, vector<Vec3b>& all_points_colors)
{
	for (int i = 1; i < subImageBag.size(); i++)
	{
		cout << i << endl;

		vector<DMatch> matches;
		subImageBag[i - 1].matchFeatures(subImageBag[i], matches);

		subImageBag[i].getObjPointsAndImgPoints(matches, all_reconstructed_points, subImageBag[i - 1]);

		// Only used to RANSAC-filter the matched points and obtain the mask
		vector<uchar> mask;
		Mat discardR, discardT;
		Constructor::findCamera(K, subImageBag[i - 1].matchedPoints, subImageBag[i].matchedPoints, discardR, discardT, mask);

		solvePnPRansac(subImageBag[i].object_points, subImageBag[i].image_points, K, noArray(), subImageBag[i].R, subImageBag[i].T);
		// solvePnPRansac returns a rotation vector; convert it to a rotation matrix
		Rodrigues(subImageBag[i].R, subImageBag[i].R);

		vector<Point3d> new_restructure_points;
		new_restructure_points = Constructor::pointsReconstruct(K,
			subImageBag[i - 1].R, subImageBag[i - 1].T, subImageBag[i].R, subImageBag[i].T,
			subImageBag[i - 1].matchedPoints, subImageBag[i].matchedPoints);

		subImageBag[i].findColor();

		// Record which cloud point each feature point of this image pair corresponds to
		int idx = 0;
		for (int k = 0; k < matches.size(); k++)
		{
			if (mask[k])
			{
				subImageBag[i - 1].correspond_struct_idx[matches[k].queryIdx] = all_reconstructed_points.size() + idx;
				subImageBag[i].correspond_struct_idx[matches[k].trainIdx] = all_reconstructed_points.size() + idx;
				idx++;
			}
		}

		for (int k = 0; k < new_restructure_points.size(); k++)
		{
			all_reconstructed_points.push_back(new_restructure_points[k]);
			all_points_colors.push_back(subImageBag[i].colors[k]);
		}
	}
}
```
3.3 Complete main.cpp
```cpp
// Image file paths and the output path
//#define INIT_IMG_PATH1 "test_img\\images\\100_7103.jpg"
//#define INIT_IMG_PATH2 "test_img\\images\\100_7104.jpg"
#define INIT_IMG_PATH1 "test_img\\First stage\\B25.jpg"
#define INIT_IMG_PATH2 "test_img\\First stage\\B24.jpg"
#define PLY_SAVE_PATH "test_img\\results\\output.ply"

#include "Includes.h"
#include "Images.h"
#include "Constructor.h"

//const Mat K = (Mat_<double>(3, 3) << 2905.88, 0, 1416, 0, 2905.88, 1064, 0, 0, 1);
const Mat K = (Mat_<double>(3, 3) << 719.5459, 0, 0, 0, 719.5459, 0, 0, 0, 1);

//const vector<string> sub_image_paths = {
//	/*"test_img\\images\\100_7100.jpg", "test_img\\images\\100_7101.jpg", "test_img\\images\\100_7102.jpg",*/
//	/*"test_img\\images\\100_7103.jpg", "test_img\\images\\100_7104.jpg",*/
//	"test_img\\images\\100_7105.jpg", "test_img\\images\\100_7106.jpg", "test_img\\images\\100_7107.jpg",
//	"test_img\\images\\100_7108.jpg", "test_img\\images\\100_7109.jpg"/*, "test_img\\images\\100_7110.jpg"*/
//};
const vector<string> sub_image_paths = {
	"test_img\\First stage\\B23.jpg",
	"test_img\\First stage\\B22.jpg",
	"test_img\\First stage\\B21.jpg"
};

void initConstruction(vector<Images>& initImages, vector<Point3d>& all_reconstructed_points, vector<Vec3b>& all_points_colors)
{
	initImages.push_back(Images(INIT_IMG_PATH1));
	initImages.push_back(Images(INIT_IMG_PATH2));

	vector<DMatch> matches;
	initImages[0].matchFeatures(initImages[1], matches);

	vector<uchar> mask;
	Constructor::findCamera(K, initImages[0].matchedPoints, initImages[1].matchedPoints,
		initImages[1].R, initImages[1].T, mask);

	initImages[0].R = Mat::eye(3, 3, CV_64FC1);
	initImages[0].T = Mat::zeros(3, 1, CV_64FC1);

	all_reconstructed_points = Constructor::pointsReconstruct(K,
		initImages[0].R, initImages[0].T, initImages[1].R, initImages[1].T,
		initImages[0].matchedPoints, initImages[1].matchedPoints);

	initImages[1].findColor();
	for (int i = 0; i < initImages[1].colors.size(); i++)
	{
		all_points_colors.push_back(initImages[1].colors[i]);
	}

	// Record, via the mask, which cloud point each feature point of the initial two images corresponds to
	int idx = 0;
	for (int i = 0; i < matches.size(); i++)
	{
		if (mask[i])
		{
			initImages[0].correspond_struct_idx[matches[i].queryIdx] = idx;
			initImages[1].correspond_struct_idx[matches[i].trainIdx] = idx;
			idx++;
		}
	}
}

void addImageConstruction(vector<Images>& subImageBag, vector<Point3d>& all_reconstructed_points, vector<Vec3b>& all_points_colors)
{
	for (int i = 1; i < subImageBag.size(); i++)
	{
		cout << i << endl;

		vector<DMatch> matches;
		subImageBag[i - 1].matchFeatures(subImageBag[i], matches);

		subImageBag[i].getObjPointsAndImgPoints(matches, all_reconstructed_points, subImageBag[i - 1]);

		// Only used to RANSAC-filter the matched points and obtain the mask
		vector<uchar> mask;
		Mat discardR, discardT;
		Constructor::findCamera(K, subImageBag[i - 1].matchedPoints, subImageBag[i].matchedPoints, discardR, discardT, mask);

		solvePnPRansac(subImageBag[i].object_points, subImageBag[i].image_points, K, noArray(), subImageBag[i].R, subImageBag[i].T);
		Rodrigues(subImageBag[i].R, subImageBag[i].R);

		vector<Point3d> new_restructure_points;
		new_restructure_points = Constructor::pointsReconstruct(K,
			subImageBag[i - 1].R, subImageBag[i - 1].T, subImageBag[i].R, subImageBag[i].T,
			subImageBag[i - 1].matchedPoints, subImageBag[i].matchedPoints);

		subImageBag[i].findColor();

		// Record which cloud point each feature point of this image pair corresponds to
		int idx = 0;
		for (int k = 0; k < matches.size(); k++)
		{
			if (mask[k])
			{
				subImageBag[i - 1].correspond_struct_idx[matches[k].queryIdx] = all_reconstructed_points.size() + idx;
				subImageBag[i].correspond_struct_idx[matches[k].trainIdx] = all_reconstructed_points.size() + idx;
				idx++;
			}
		}

		for (int k = 0; k < new_restructure_points.size(); k++)
		{
			all_reconstructed_points.push_back(new_restructure_points[k]);
			all_points_colors.push_back(subImageBag[i].colors[k]);
		}
	}
}

int main()
{
	try
	{
		vector<Images> initImages;
		vector<Point3d> all_reconstructed_points;
		vector<Vec3b> all_points_colors;
		initConstruction(initImages, all_reconstructed_points, all_points_colors);

		vector<Images> subImageBag;
		subImageBag.push_back(initImages[1]);
		for (auto& image_path : sub_image_paths)
		{
			subImageBag.push_back(Images(image_path));
		}
		addImageConstruction(subImageBag, all_reconstructed_points, all_points_colors);

		// Write the point cloud to a PLY file by hand
		std::ofstream plyFile(PLY_SAVE_PATH);

		// PLY header
		plyFile << "ply\n";
		plyFile << "format ascii 1.0\n";
		plyFile << "element vertex " << all_reconstructed_points.size() << "\n";
		plyFile << "property float x\n";
		plyFile << "property float y\n";
		plyFile << "property float z\n";
		plyFile << "property uchar blue\n";
		plyFile << "property uchar green\n";
		plyFile << "property uchar red\n";
		plyFile << "end_header\n";

		// Point cloud data
		for (int i = 0; i < all_reconstructed_points.size(); ++i)
		{
			cv::Vec3b color = all_points_colors[i];
			cv::Point3f point = all_reconstructed_points[i];
			plyFile << point.x << " " << point.y << " " << point.z << " "
				<< static_cast<int>(color[0]) << " "
				<< static_cast<int>(color[1]) << " "
				<< static_cast<int>(color[2]) << std::endl;
		}
		plyFile.close();
		return 0;
	}
	catch (Exception e)
	{
		cout << e.msg << endl;
	}
}
```
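The hand-written PLY export in main can be checked in isolation: the header must declare the exact vertex count and the same property order used when writing the rows (blue/green/red here, since OpenCV's Vec3b stores colors as BGR). A minimal, hypothetical version writing to a string instead of a file:

```cpp
#include <sstream>
#include <string>
#include <vector>

// One point with position and BGR color, mirroring the data written in main.
struct PlyPoint { double x, y, z; unsigned char b, g, r; };

// Write an ASCII PLY with blue/green/red properties in the same order
// as the row data, matching the export code in main.
std::string writePly(const std::vector<PlyPoint>& pts) {
    std::ostringstream ply;
    ply << "ply\nformat ascii 1.0\n"
        << "element vertex " << pts.size() << "\n"
        << "property float x\nproperty float y\nproperty float z\n"
        << "property uchar blue\nproperty uchar green\nproperty uchar red\n"
        << "end_header\n";
    for (const PlyPoint& p : pts)
        ply << p.x << " " << p.y << " " << p.z << " "
            << (int)p.b << " " << (int)p.g << " " << (int)p.r << "\n";
    return ply.str();
}
```

The uchar-to-int cast matters: streaming an unsigned char directly would emit a raw byte instead of a decimal number, corrupting the ASCII PLY.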
4 Summary notes
The source code is simply the Includes.h, Constructor.h, Constructor.cpp, Image.h, Image.cpp, and main.cpp given above.
Before incremental images are added (reconstruction from the two initial images):
After incremental reconstruction:
Note:
At present only the basic pipeline is complete. Many places still need tuning, for example:

- The SIFT parameters
- The RANSAC parameters
- The choice of initial images (very important)
- The ratio-test settings in matchFeatures
- Further outlier-rejection measures to eliminate outrageous points
- Most importantly, BA has not been added yet!
- ...
The current results are not great yet. The revolution has not yet succeeded; comrades still need to work hard!
References:

- Multi-view 3D reconstruction based on OpenCV and C++
- OpenCV implements SfM (3): multi-view 3D reconstruction