Xidian Computer Vision Assignment 2: Image Registration and Stitching

Image registration and stitching

ps: The matching examples use photos of the water dispenser in Building C of the school. Thinking back, the days of being criticized by Ma Yuan in Building C were exhausting. For reference only.

Directory

Image registration and stitching
1 Overall idea
2 SIFT algorithm
2.1 Algorithm Principle
2.2 Algorithm Steps
2.3 Code Implementation
2.4 SIFT algorithm result image
3 Matching feature points with the RANSAC algorithm
3.1 Introduction to the RANSAC algorithm
3.2 Basic assumptions of RANSAC
3.3 Basic steps of RANSAC
3.4 Application of RANSAC in image matching
3.5 RANSAC matching algorithm implementation code
3.6 Result after RANSAC eliminates incorrect matches
4 Using geometric transformations to transform and stitch the images
4.1 Basic matrices of geometric transformations
4.2 Basic idea
4.2.1 Obtaining the homography matrix
4.2.2 Transforming the second image with the homography matrix
4.2.3 Stitching the images
4.3 Implementation Code
4.4 Result image
5 Overall code

1 Overall idea

(1) Based on the material on image feature point extraction, select a feature point extraction algorithm (SIFT) to extract the feature points.

(2) Use the RANSAC algorithm to match the feature points and reject incorrect matches.

(3) Use an image geometric transformation algorithm to transform and stitch the images.

2 SIFT algorithm

2.1 Algorithm Principle

Scale-invariant feature transform (SIFT) is a computer vision algorithm used to detect and describe local features in images. It finds extreme points in scale space and extracts their position, scale, and rotation invariants. SIFT features are based on local interest points in the appearance of an object and are independent of the size and rotation of the image. Their tolerance to illumination, noise, and slight viewpoint changes is also quite high. Because of these properties, SIFT features are highly distinctive and relatively easy to retrieve: in a large feature database, objects can be identified easily and misidentification is rare.

2.2 Algorithm Steps

1. Scale-space extremum detection: search image locations over all scales. Potential interest points that are invariant to scale and rotation are identified with difference-of-Gaussian functions.

First, the image is repeatedly smoothed and downsampled to obtain images at different scales (a Gaussian pyramid).

After the Gaussian pyramid is obtained, adjacent levels within each octave are subtracted at the same positions to generate a difference-of-Gaussian (DoG) pyramid (a minimal sketch of this construction is given after step 4 below).

(figure: Gaussian pyramid and difference-of-Gaussian pyramid)

2. Keypoint localization: at each candidate location, the position and scale are determined by fitting a detailed model. Keypoints are selected based on their stability.

(figure: keypoint localization)

3. Orientation assignment: based on the local gradient directions of the image, one or more orientations are assigned to each keypoint. All subsequent operations on the image data are performed relative to the orientation, scale, and position of the keypoints, which provides invariance to these transformations. After the gradients around a keypoint are computed, a histogram is used to accumulate the gradient magnitudes and directions of the pixels in its neighborhood. The orientation histogram divides the range from 0 to 360 degrees into 36 bins of 10 degrees each. As shown in the figure, the peak of the histogram represents the dominant direction of the neighborhood gradients and is taken as the main orientation of the keypoint. To make matching more robust, any direction whose peak is greater than 80% of the main peak is also kept as an auxiliary orientation of the keypoint.

(figure: gradient orientation histogram of the keypoint neighborhood)

4. Keypoint description: in a neighborhood around each keypoint, the local image gradients are measured at the selected scale and transformed into a representation that tolerates fairly large local shape deformations and illumination changes. After the steps above, each keypoint carries three pieces of information: position, scale, and orientation. The next step is to build a descriptor for each keypoint, a vector that describes the keypoint so that it stays stable under changes such as illumination and viewpoint. The descriptor covers not only the keypoint itself but also the surrounding pixels that contribute to it, and it should be highly distinctive to increase the probability of correct feature matching.
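
To make step 1 concrete, the following is a minimal sketch of building a Gaussian pyramid and subtracting adjacent levels to obtain the DoG pyramid. The number of octaves, the number of levels per octave, and the base sigma are illustrative values only; they are not the parameters used inside OpenCV's SIFT implementation.

import cv2
import numpy as np

def dog_pyramid(gray, num_octaves=4, num_levels=5, sigma=1.6):
    # scale multiplier between adjacent levels inside one octave
    k = 2 ** (1.0 / (num_levels - 1))
    dog = []
    base = gray.astype(np.float32)
    for _ in range(num_octaves):
        # Gaussian-blurred images of the current octave
        gaussians = [cv2.GaussianBlur(base, (0, 0), sigma * (k ** i)) for i in range(num_levels)]
        # subtract adjacent levels at the same positions to get the DoG images
        dog.append([gaussians[i + 1] - gaussians[i] for i in range(num_levels - 1)])
        # downsample to start the next octave
        base = cv2.resize(base, (base.shape[1] // 2, base.shape[0] // 2))
    return dog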

2.3 Code Implementation

The code uses the SIFT functions built into the Python OpenCV library to compute the keypoints kp1 and their descriptors des1 (and kp2, des2 for the second image).

Matching is then implemented by comparing the Euclidean distances between the descriptors of the two images.

This process uses the FLANN matcher, which relies on a KD-tree to find nearest neighbours between the two descriptor sets efficiently.

For each keypoint we retrieve its best and second-best matches. When the distance to the best match is much smaller than the distance to the second-best match (Lowe's ratio test), the best match is accepted as a good match.

Finally, the accepted match pairs are stored in the good list.

import cv2

# read the two input images as grayscale (the file names are placeholders)
img1gray = cv2.imread('img1.jpg', cv2.IMREAD_GRAYSCALE)
img2gray = cv2.imread('img2.jpg', cv2.IMREAD_GRAYSCALE)

# SIFT detector (cv2.xfeatures2d.SIFT_create() in older OpenCV versions)
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1gray, None)
kp2, des2 = sift.detectAndCompute(img2gray, None)

# FLANN parameters: find the nearest neighbours with a KD-tree
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1, des2, k=2)
matchesMask = [[0, 0] for i in range(len(matches))]

good = []
pts1 = []
pts2 = []
# Lowe's ratio test: keep the best match only when it is clearly better than the second best
for i, (m, n) in enumerate(matches):
    if m.distance < 0.7 * n.distance:
        good.append(m)
        pts2.append(kp2[m.trainIdx].pt)
        pts1.append(kp1[m.queryIdx].pt)
        matchesMask[i] = [1, 0]

2.4 SIFT algorithm result image
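
The result figure itself is not reproduced here. As a sketch of how such a figure can be generated from the variables defined above, OpenCV's drawMatchesKnn can visualise the matches retained by the ratio test (the output file name is a placeholder):

draw_params = dict(matchColor=(0, 255, 0),        # accepted matches drawn in green
                   singlePointColor=(255, 0, 0),  # unmatched keypoints drawn in blue
                   matchesMask=matchesMask,       # mask built by the ratio test above
                   flags=cv2.DrawMatchesFlags_DEFAULT)
match_img = cv2.drawMatchesKnn(img1gray, kp1, img2gray, kp2, matches, None, **draw_params)
cv2.imwrite('sift_matches.jpg', match_img)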

3 Matching feature points with the RANSAC algorithm

3.1 Introduction to the RANSAC algorithm

RANSAC is short for "RANdom SAmple Consensus". It iteratively estimates the parameters of a mathematical model from a set of observed data that contains outliers. It is a non-deterministic algorithm: it only produces a reasonable result with a certain probability, and increasing the number of iterations increases that probability.

3.2 Basic assumptions of RANSAC

1. "Inliers" (normal, correct data) are data whose distribution can be described by a model with a certain set of parameters, while "outliers" (abnormal data) are data that do not fit that model.

2. The data can also be contaminated by noise, i.e. outliers caused by extreme noise, wrong measurements, or incorrect assumptions about the data.

3. RANSAC assumes that, given a (usually small) set of inliers, there exists a procedure that can estimate the model parameters that best explain or fit these data.

3.3 Basic steps of RANSAC

1: Assume a model (for example a straight-line equation) and randomly sample N points (take N = 2 as an example) to fit the model.

2: Since the data are not strictly linear, the points fluctuate around the model. Assume a tolerance sigma, find the points whose distance to the fitted curve lies within the tolerance, and count them.

3: Randomly select N points again and repeat steps 1-2 until the iterations end.

4: Each fit yields a count of points within the tolerance; the fit with the largest count is taken as the final result.
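
A minimal line-fitting sketch of these four steps is given below; the tolerance sigma, the iteration count, and the random seed are hypothetical values chosen for illustration.

import numpy as np

def ransac_line(points, sigma=1.0, iter_num=100):
    # points is an (N, 2) array of (x, y) samples; fit y = k * x + b
    best_inliers = 0
    best_model = None
    rng = np.random.default_rng(0)
    for _ in range(iter_num):
        # step 1: randomly pick N = 2 points and fit a line through them
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # skip vertical pairs for this simple model
        k = (y2 - y1) / (x2 - x1)
        b = y1 - k * x1
        # step 2: count the points whose distance to the line is within sigma
        d = np.abs(k * points[:, 0] - points[:, 1] + b) / np.sqrt(k * k + 1)
        inliers = np.sum(d < sigma)
        # step 4: keep the model with the largest inlier count
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (k, b)
        # step 3: the loop repeats the random sampling
    return best_model, best_inliers

Called with an (N, 2) array of points, it returns the line parameters (k, b) supported by the most inliers.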

3.4 Application of RANSAC in image matching

Randomly select 8 of the matched point pairs and estimate the fundamental matrix F with the normalized eight-point algorithm.

Compute the distance dp from each of the remaining point pairs to its corresponding epipolar line. If dp < d, the point is an inlier, otherwise it is an outlier. Record the number of inliers that satisfy this condition as num.

Iterate k times, or stop early if the proportion of inliers num reaches 95% or more. The fundamental matrix with the largest num is selected as the final result.
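
For reference, the point-to-epipolar-line distance used as the inlier criterion can be written out explicitly: if the epipolar line corresponding to a point is $l = (A, B, C)^T$, i.e. $Ax + By + C = 0$, then for a matched point $(x, y)$

$$d_p = \frac{|Ax + By + C|}{\sqrt{A^2 + B^2}}.$$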

3.5 RANSAC matching algorithm implementation code
import random
import numpy as np

def compute_fundamental(x1, x2):
    n = x1.shape[1]
    if x2.shape[1] != n:
        raise ValueError("Number of points don't match.")

    # build the coefficient matrix of the eight-point equations
    A = np.zeros((n, 9))
    for i in range(n):
        A[i] = [x1[0, i] * x2[0, i], x1[0, i] * x2[1, i], x1[0, i] * x2[2, i],
                x1[1, i] * x2[0, i], x1[1, i] * x2[1, i], x1[1, i] * x2[2, i],
                x1[2, i] * x2[0, i], x1[2, i] * x2[1, i], x1[2, i] * x2[2, i]]

    # compute linear least squares solution
    U, S, V = np.linalg.svd(A)
    F = V[-1].reshape(3, 3)

    # constrain F: make rank 2 by zeroing out the last singular value
    U, S, V = np.linalg.svd(F)
    S[2] = 0
    F = np.dot(U, np.dot(np.diag(S), V))

    return F / F[2, 2]


def compute_fundamental_normalized(x1, x2):
    """ Computes the fundamental matrix from corresponding points
    (x1, x2 3*n arrays) using the normalized 8 point algorithm. """
    n = x1.shape[1]
    if x2.shape[1] != n:
        raise ValueError("Number of points don't match.")

    # normalize image coordinates
    x1 = x1 / x1[2]
    mean_1 = np.mean(x1[:2], axis=1)
    S1 = np.sqrt(2) / np.std(x1[:2])
    T1 = np.array([[S1, 0, -S1 * mean_1[0]], [0, S1, -S1 * mean_1[1]], [0, 0, 1]])
    x1 = np.dot(T1, x1)

    x2 = x2 / x2[2]
    mean_2 = np.mean(x2[:2], axis=1)
    S2 = np.sqrt(2) / np.std(x2[:2])
    T2 = np.array([[S2, 0, -S2 * mean_2[0]], [0, S2, -S2 * mean_2[1]], [0, 0, 1]])
    x2 = np.dot(T2, x2)

    # compute F with the normalized coordinates
    F = compute_fundamental(x1, x2)

    # reverse the normalization
    F = np.dot(T1.T, np.dot(F, T2))

    return F / F[2, 2]


def randSeed(good, num=8):
    '''
    :param good: initial matching point pairs
    :param num: number of point pairs to sample
    :return: list of 8 randomly chosen point pairs
    '''
    eight_point = random.sample(good, num)
    return eight_point


def PointCoordinates(eight_points, keypoints1, keypoints2):
    '''
    :param eight_points: randomly chosen point pairs (each element is a one-element list containing a DMatch)
    :param keypoints1: keypoints of the first image
    :param keypoints2: keypoints of the second image
    :return: homogeneous coordinates of the chosen points in both images
    '''
    x1 = []
    x2 = []
    tuple_dim = (1.,)
    for i in eight_points:
        tuple_x1 = keypoints1[i[0].queryIdx].pt + tuple_dim
        tuple_x2 = keypoints2[i[0].trainIdx].pt + tuple_dim
        x1.append(tuple_x1)
        x2.append(tuple_x2)
    return np.array(x1, dtype=float), np.array(x2, dtype=float)


def ransac(good, keypoints1, keypoints2, confidence, iter_num):
    Max_num = 0
    good_F = np.zeros([3, 3])
    inlier_points = []
    for i in range(iter_num):
        eight_points = randSeed(good)
        x1, x2 = PointCoordinates(eight_points, keypoints1, keypoints2)
        F = compute_fundamental_normalized(x1.T, x2.T)
        num, ransac_good = inlier(F, good, keypoints1, keypoints2, confidence)
        if num > Max_num:
            Max_num = num
            good_F = F
            inlier_points = ransac_good
    print(Max_num, good_F)
    return Max_num, good_F, inlier_points


def computeReprojError(x1, x2, F):
    """
    Calculate the reprojection error (unused alternative to the epipolar distance below)
    """
    ww = 1.0 / (F[2, 0] * x1[0] + F[2, 1] * x1[1] + F[2, 2])
    dx = (F[0, 0] * x1[0] + F[0, 1] * x1[1] + F[0, 2]) * ww - x2[0]
    dy = (F[1, 0] * x1[0] + F[1, 1] * x1[1] + F[1, 2]) * ww - x2[1]
    return dx * dx + dy * dy


def inlier(F, good, keypoints1, keypoints2, confidence):
    num = 0
    ransac_good = []
    x1, x2 = PointCoordinates(good, keypoints1, keypoints2)
    for i in range(len(x2)):
        line = F.dot(x1[i].T)
        # in epipolar geometry the epipolar line is [A B C] with Ax + By + C = 0,
        # and its direction vector can be written as [-B, A]
        line_v = np.array([-line[1], line[0]])
        err = np.linalg.norm(np.cross(x2[i, :2], line_v) / np.linalg.norm(line_v))
        # err = computeReprojError(x1[i], x2[i], F)
        if abs(err) < confidence:
            ransac_good.append(good[i])
            num += 1
    return num, ransac_good
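
A minimal driver for these functions could look as follows. It assumes the kp1, kp2 and matches variables produced in section 2.3, and the confidence threshold (in pixels) and iteration count are hypothetical values. Note that PointCoordinates indexes each element of good as i[0], so each element of the list passed to ransac is a one-element list [m].

# collect the ratio-test matches as one-element lists, as expected by PointCoordinates
good_pairs = []
for m, n in matches:
    if m.distance < 0.7 * n.distance:
        good_pairs.append([m])

# confidence is the distance threshold for inliers, iter_num the number of RANSAC iterations
Max_num, good_F, inlier_points = ransac(good_pairs, kp1, kp2, confidence=30, iter_num=500)
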
3.6 Result after RANSAC eliminates incorrect matches

(figure: feature matches after RANSAC removes incorrect matches)

4 Using geometric transformations to transform and stitch the images

4.1 Basic matrices of geometric transformations

(figure: basic geometric transformation matrices)
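
The original figure is not reproduced here. For reference, the projective (homography) transformation that section 4.2 relies on maps a point (x, y) to (x', y') in homogeneous coordinates as

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix},$$

where $\sim$ denotes equality up to a scale factor. Translation, rotation, scaling and other affine transformations are special cases in which the last row of the matrix is (0, 0, 1).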

4.2 Basic idea

4.2.1 Obtaining the homography matrix

M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

Using OpenCV's findHomography function we obtain the transformation matrix M.

4.2.2 Transforming the second image with the homography matrix

warpImg = cv2.warpPerspective(testImg, M, (testImg.shape[1], testImg.shape[0]), flags=cv2.WARP_INVERSE_MAP)

Using OpenCV's warpPerspective function we obtain the transformed image.

4.2.3 Stitching the images

Find the columns where the two images overlap. In the regions where they do not overlap, the new image is copied from whichever original image has content there; in the overlapping region the two images are fused with linear interpolation.
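
The blending weight used in the code below can be written out explicitly: for a pixel in column $c$ of the overlap region bounded by columns left and right,

$$\alpha = \frac{|c - \mathrm{left}|}{|c - \mathrm{left}| + |c - \mathrm{right}|}, \qquad \mathrm{res}(r, c) = (1 - \alpha)\,\mathrm{srcImg}(r, c) + \alpha\,\mathrm{warpImg}(r, c),$$

so the contribution of warpImg grows linearly from the left edge of the overlap to the right edge.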

4.3 Implementation Code
# src_pts / dst_pts come from the ratio-test matches of section 2.3
src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
# find the homography matrix with the RANSAC algorithm
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
mask_matches = mask.ravel().tolist()
warpImg = cv2.warpPerspective(testImg, M, (testImg.shape[1], testImg.shape[0]),
                              flags=cv2.WARP_INVERSE_MAP)

# srcImg and testImg are assumed to have been padded to the same canvas size beforehand
rows, cols = srcImg.shape[:2]

# find the overlapping columns of the two images
for col in range(0, cols):
    if srcImg[:, col].any() and warpImg[:, col].any():
        left = col
        break
for col in range(cols - 1, 0, -1):
    if srcImg[:, col].any() and warpImg[:, col].any():
        right = col
        break

res = np.zeros([rows, cols, 3], np.uint8)
for row in range(0, rows):
    for col in range(0, cols):
        if not srcImg[row, col].any():
            res[row, col] = warpImg[row, col]
        elif not warpImg[row, col].any():
            res[row, col] = srcImg[row, col]
        else:
            # blend the overlapping part of the two images with linear interpolation
            srcImgLen = float(abs(col - left))
            testImgLen = float(abs(col - right))
            alpha = srcImgLen / (srcImgLen + testImgLen)
            res[row, col] = np.clip(srcImg[row, col] * (1 - alpha) + warpImg[row, col] * alpha, 0, 255)

4.4 Result image

(figure: stitching result)

5 Overall code

The complete code is available at https://gitee.com/orangeinus/xd_-cs_computer_vison_2