1 Affine transformation
1.1 What is affine transformation
In image processing we often need to translate, scale, rotate, or flip an image; all of these operations are affine transformations. An affine transformation, also known as an affine mapping, is a geometric operation that maps one vector space into another by applying a linear transformation followed by a translation. In practice, an image affine transformation is typically a rotation or scaling combined with a translation. The transformation is described by a matrix M, and the main difficulty lies in working out the entries of M for a given operation.
1.2 Mathematical expression of affine transformation
An affine transformation, also called an affine mapping, applies a linear transformation to a vector space and then a translation, producing another vector space. Describing an affine transformation therefore amounts to describing how one vector space is mapped onto another.
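Suppose there is a vector space k:

$$k = \begin{bmatrix} x \\ y \end{bmatrix}$$

There is also a vector space j:

$$j = \begin{bmatrix} x' \\ y' \end{bmatrix}$$

If we want to change the vector space from k to j, we can transform it through the following formula, where W is a 2×2 weight matrix and b is a translation vector:

$$j = W k + b$$

By splitting the above formula into its components, we get:

$$x' = w_{00} x + w_{01} y + b_0$$
$$y' = w_{10} x + w_{11} y + b_1$$

We then convert the above equations into a single matrix multiplication:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} w_{00} & w_{01} & b_0 \\ w_{10} & w_{11} & b_1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$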
The conversion between two vector spaces can be achieved through the parameter matrix M. When performing affine transformation, we only need one matrix M to achieve translation, scaling, rotation and flip transformation.
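To make the role of M concrete, here is a minimal NumPy sketch that applies a 2×3 matrix to a few points by appending a 1 to each coordinate pair. The helper name `apply_affine` and the sample values are illustrative assumptions, not part of OpenCV.

```python
import numpy as np


def apply_affine(M, points):
    """Apply a 2x3 affine matrix M to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=np.float64)
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.hstack([points, ones])  # shape (N, 3): each row is (x, y, 1)
    return homogeneous @ np.asarray(M, dtype=np.float64).T  # shape (N, 2)


# Example matrix: scale x by 2 and y by 3, then shift by (10, 20)
M = np.array([[2.0, 0.0, 10.0],
              [0.0, 3.0, 20.0]])
result = apply_affine(M, [[0, 0], [1, 1]])
print(result)  # (0, 0) maps to (10, 20) and (1, 1) maps to (12, 23)
```

The last column of M carries the translation, and the left 2×2 block carries the linear part.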
1.3 Affine transformation in opencv
In OpenCV, the warpAffine function implements affine transformation.
cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) → dst
- src: input image array
- M: affine transformation matrix
- dsize: the size of the transformed image
- flags: interpolation algorithm used
- borderValue: border padding value
1.3.1 Image translation
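In the plane coordinate system there are points P(x, y) and P′(x′, y′). To move point P to P′, we apply the following transformation:

$$x' = x + \Delta x$$
$$y' = y + \Delta y$$

Here Δx and Δy are the offsets in the x direction and the y direction. Written in matrix form:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & \Delta x \\ 0 & 1 & \Delta y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$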
The above matrix M is the translation parameter of affine transformation, which is implemented using the warpAffine function in OpenCV as follows:
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt


    def show_cmp_img(original_img, transform_img):
        _, axes = plt.subplots(1, 2)
        # Show the images side by side
        axes[0].imshow(original_img)
        axes[1].imshow(transform_img)
        # Set the subplot titles
        axes[0].set_title("original image")
        axes[1].set_title("transform image")
        plt.show()


    # Define the image translation matrix:
    # x shifts 100 pixels to the left (negative = left, positive = right)
    # y shifts 200 pixels down (negative = up, positive = down)
    # np.float was removed in NumPy 1.24, so np.float64 is used instead
    M = np.array([[1, 0, -100], [0, 1, 200]], dtype=np.float64)
    # Read the image that needs to be translated
    img = cv2.imread("../data/girl02.jpg")
    # Convert the image from BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Keep the output size equal to the input size, as (width, height)
    dsize = img.shape[:2][::-1]
    # Fill the border with white so that the shift is easy to see
    translation_img = cv2.warpAffine(img, M, dsize, borderValue=(255, 255, 255))
    # Show the images
    show_cmp_img(img, translation_img)
The running results are shown as follows:
1.3.2 Image flip
Use opencv’s affine transformation to achieve horizontal flipping, vertical flipping, and mirror inversion of images (horizontal and vertical flipping at the same time)
A, B, C, and D in the above figure represent the four vertices of the image. To flip the image horizontally, we exchange point A with point B and point C with point D, mirroring them about the vertical center line of the image. Horizontal flipping can be achieved through the following formula:

$$x' = -x + w$$
$$y' = y$$

Here w is the width of the image. In the same way, we obtain the formula for vertical flipping:

$$x' = x$$
$$y' = -y + h$$

Here h is the height of the image. Combining the two gives the transformation matrix for image flipping:

$$M = \begin{bmatrix} -1 & 0 & w \\ 0 & -1 & h \end{bmatrix}$$

Because pixel indices run from 0 to w − 1 and from 0 to h − 1, an exact pixel-for-pixel flip uses w − 1 and h − 1 as the offsets.
Using the warpAffine function in OpenCV, this is implemented as follows:

    import cv2
    import matplotlib.pyplot as plt
    import numpy as np


    def show_cmp_img(original_img, transform_img):
        _, axes = plt.subplots(1, 2)
        # Show the images side by side
        axes[0].imshow(original_img)
        axes[1].imshow(transform_img)
        # Set the subplot titles
        axes[0].set_title("original image")
        axes[1].set_title("transform image")
        plt.show()


    horizontal_flip = True
    vertical_flip = True
    img = cv2.imread("../data/girl02.jpg")
    # Get the width and height of the input image
    height, width = img.shape[:2]
    # Start from the identity so that a disabled flip leaves its axis unchanged
    # (np.float was removed in NumPy 1.24, so np.float64 is used instead)
    M = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float64)
    # Horizontal flip; the offset is width - 1 because pixel indices start at 0
    if horizontal_flip:
        M[0] = [-1, 0, width - 1]
    # Vertical flip; the offset is height - 1 for the same reason
    if vertical_flip:
        M[1] = [0, -1, height - 1]
    # Convert the image from BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # The flipped image keeps the size of the input image
    img_flip = cv2.warpAffine(img, M, (width, height))
    show_cmp_img(img, img_flip)
The running results are shown as follows:
OpenCV's flip function can also flip the image.
flip function parameters:
- src: input image array
- flipCode: image flip parameter; 1 means horizontal flip, 0 means vertical flip, -1 means mirror flip.
    import cv2

    img = cv2.imread("../data/girl02.jpg")
    # Horizontal flip
    horizontal_flip_img = cv2.flip(img, 1)
    # Vertical flip
    vertical_flip_img = cv2.flip(img, 0)
    # Mirror flip (horizontal and vertical at the same time)
    mirror_flip_img = cv2.flip(img, -1)

The image can also be flipped with NumPy indexing:

    import cv2

    img = cv2.imread("../data/girl02.jpg")
    # Horizontal flip: reverse the column order
    horizontal_flip_img = img[:, ::-1]
    # Vertical flip: reverse the row order
    vertical_flip_img = img[::-1]
    # Mirror flip: reverse both rows and columns
    mirror_flip_img = img[::-1, ::-1]
1.3.3 Image scaling
If we want to scale point P in the coordinate system, we multiply x and y by the scaling factors f_x and f_y:

$$x' = f_x \cdot x$$
$$y' = f_y \cdot y$$

Written in matrix form:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f_x & 0 & 0 \\ 0 & f_y & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

With the above matrix M we can scale the image. Using the warpAffine function in OpenCV, this is implemented as follows:
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt


    def show_cmp_img(original_img, transform_img):
        _, axes = plt.subplots(1, 2)
        # Show the images side by side
        axes[0].imshow(original_img)
        axes[1].imshow(transform_img)
        # Set the subplot titles
        axes[0].set_title("original image")
        axes[1].set_title("transform image")
        plt.show()


    # Scale factor for the width
    fx = 0.5
    # Scale factor for the height
    fy = 2
    # Define the image scaling matrix
    # (np.float was removed in NumPy 1.24, so np.float64 is used instead)
    M = np.array([[fx, 0, 0], [0, fy, 0]], dtype=np.float64)
    # Read the image
    img = cv2.imread("../data/girl02.jpg")
    # Get the width and height of the image
    height, width = img.shape[:2]
    # Convert the image from BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # The output size is the input size multiplied by the scale factors
    scale_img = cv2.warpAffine(img, M, (int(width * fx), int(height * fy)))
    # Show the images
    show_cmp_img(img, scale_img)
The results are shown below:
The resize function in opencv can also achieve the same effect.
1.3.4 Image rotation
Rotation around the origin: Let’s first look at how a point on a two-dimensional plane rotates around the origin.
In the above figure, point v′ is obtained by rotating point v by θ degrees around the origin. Expressing v in polar coordinates gives v(r cos φ, r sin φ), so v′ is (r cos(θ + φ), r sin(θ + φ)). Expanding with the angle-sum identities for sine and cosine gives:

$$x' = r\cos(\theta + \varphi) = r\cos\varphi\cos\theta - r\sin\varphi\sin\theta = x\cos\theta - y\sin\theta$$
$$y' = r\sin(\theta + \varphi) = r\cos\varphi\sin\theta + r\sin\varphi\cos\theta = x\sin\theta + y\cos\theta$$

Expressing the above formula with the matrix M, we get:

$$M = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \end{bmatrix}$$

Special note: the derivation above uses a rectangular coordinate system with the origin at the lower left corner and y pointing up. An image has its origin at the upper left corner and y pointing down, so we need to negate the angle θ. Since cos(−θ) = cos θ and sin(−θ) = −sin θ, the matrix M becomes:

$$M = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \end{bmatrix}$$

Note also that the trigonometric functions take angles in radians, so we need to convert from degrees. The conversion code is as follows:

    # Convert the angle from degrees to radians
    radian_theta = theta / 180 * np.pi
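The two rotation matrices can be checked numerically on a single point. The sketch below assumes a 90-degree angle and the point (1, 0) purely for illustration; np.deg2rad is NumPy's built-in equivalent of the manual conversion.

```python
import numpy as np

theta = 90
radian_theta = np.deg2rad(theta)  # same value as theta / 180 * np.pi

# Rotation about the origin in a y-up (mathematical) coordinate system
M_math = np.array([[np.cos(radian_theta), -np.sin(radian_theta), 0],
                   [np.sin(radian_theta), np.cos(radian_theta), 0]])
# The same rotation with theta negated, for image coordinates (y points down)
M_image = np.array([[np.cos(radian_theta), np.sin(radian_theta), 0],
                    [-np.sin(radian_theta), np.cos(radian_theta), 0]])

p = np.array([1.0, 0.0, 1.0])  # the point (1, 0) with a trailing 1 appended

# In the y-up system the point moves to approximately (0, 1)
print(M_math @ p)
# In image coordinates it moves to approximately (0, -1), which is towards
# the top of the screen, so the rotation still looks counterclockwise
print(M_image @ p)
```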
To rotate the image counterclockwise by θ degrees around the origin, the OpenCV code is as follows:

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt


    def show_cmp_img(original_img, transform_img):
        _, axes = plt.subplots(1, 2)
        # Show the images side by side
        axes[0].imshow(original_img)
        axes[1].imshow(transform_img)
        # Set the subplot titles
        axes[0].set_title("original image")
        axes[1].set_title("transform image")
        plt.show()


    theta = 30
    # Convert the angle from degrees to radians
    radian_theta = theta / 180 * np.pi
    # Define the transformation matrix for rotation around the origin
    M = np.array([[np.cos(radian_theta), np.sin(radian_theta), 0],
                  [-np.sin(radian_theta), np.cos(radian_theta), 0]])
    # Read the image
    img = cv2.imread("../data/girl02.jpg")
    # The rotated image keeps the width and height of the input
    height, width = img.shape[:2]
    # Convert the image from BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Rotate counterclockwise by theta degrees around the origin
    rotate_img = cv2.warpAffine(img, M, (width, height))
    # Show the images
    show_cmp_img(img, rotate_img)
The running results are shown as follows:
1.3.5 Rotate around any point
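In the figure below, point v is rotated 90 degrees around point (a, b) to obtain v′. This is equivalent to first translating v to v1, then rotating v1 by 90 degrees around the origin to get v2, and finally translating v2 by the same distance in the opposite direction of the first translation to obtain v′. In this way, the problem of rotating around an arbitrary point is reduced to the problem of rotating around the origin.
Let's review the transformation formula for rotating coordinates around the origin:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Building on this formula, we extend it to rotation around an arbitrary point c(a, b). We first translate the original coordinates so that c moves to the origin, then rotate, and finally translate back in the opposite direction. This gives the transformation formula for rotation around an arbitrary point:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix} + \begin{bmatrix} a \\ b \end{bmatrix}$$

Expanding it gives:

$$x' = x\cos\theta - y\sin\theta + (1 - \cos\theta)a + b\sin\theta$$
$$y' = x\sin\theta + y\cos\theta + (1 - \cos\theta)b - a\sin\theta$$

Expressing the above equations with the matrix M:

$$M = \begin{bmatrix} \cos\theta & -\sin\theta & (1 - \cos\theta)a + b\sin\theta \\ \sin\theta & \cos\theta & (1 - \cos\theta)b - a\sin\theta \end{bmatrix}$$

In the above formula, c(a, b) is the center of rotation. Because of the image coordinate system, θ again has to be negated. The final expression of the matrix M is as follows:

$$M = \begin{bmatrix} \cos\theta & \sin\theta & (1 - \cos\theta)a - b\sin\theta \\ -\sin\theta & \cos\theta & (1 - \cos\theta)b + a\sin\theta \end{bmatrix}$$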
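The translate, rotate, translate-back reasoning can be verified numerically. The sketch below builds the three steps as 3×3 matrices and compares their product with the closed-form matrix for image coordinates; the function names and the sample center (50, 40) are illustrative assumptions.

```python
import numpy as np


def rotation_about_point(theta_deg, a, b):
    """Closed-form 2x3 matrix for image coordinates (theta already negated)."""
    t = np.deg2rad(theta_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s, (1 - c) * a - b * s],
                     [-s, c, (1 - c) * b + a * s]])


def composed_rotation(theta_deg, a, b):
    """The same transform built as translate(-c), rotate, translate(+c)."""
    t = np.deg2rad(theta_deg)
    c, s = np.cos(t), np.sin(t)
    to_origin = np.array([[1, 0, -a], [0, 1, -b], [0, 0, 1]], dtype=np.float64)
    rotate = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=np.float64)
    back = np.array([[1, 0, a], [0, 1, b], [0, 0, 1]], dtype=np.float64)
    # Matrices apply right to left; drop the last row to get the 2x3 form
    return (back @ rotate @ to_origin)[:2]


M_closed = rotation_about_point(30, 50, 40)
M_composed = composed_rotation(30, 50, 40)
print(np.allclose(M_closed, M_composed))  # True

# The center of rotation is a fixed point of the transform
print(M_closed @ np.array([50.0, 40.0, 1.0]))
```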
The code using opencv is as follows:
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt


    def show_cmp_img(original_img, transform_img):
        _, axes = plt.subplots(1, 2)
        # Show the images side by side
        axes[0].imshow(original_img)
        axes[1].imshow(transform_img)
        # Set the subplot titles
        axes[0].set_title("original image")
        axes[1].set_title("transform image")
        plt.show()


    img = cv2.imread("../data/girl02.jpg")
    theta = 30
    height, width = img.shape[:2]
    # Rotate around the center of the image
    point_x, point_y = int(width / 2), int(height / 2)
    # Convert the angle from degrees to radians
    radian_theta = theta / 180 * np.pi
    # Define the transformation matrix for rotation around an arbitrary point
    M = np.array([[np.cos(radian_theta), np.sin(radian_theta),
                   (1 - np.cos(radian_theta)) * point_x - point_y * np.sin(radian_theta)],
                  [-np.sin(radian_theta), np.cos(radian_theta),
                   (1 - np.cos(radian_theta)) * point_y + point_x * np.sin(radian_theta)]])
    # Convert the image from BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Rotate counterclockwise by theta degrees around the chosen point;
    # the output keeps the width and height of the input
    rotate_img = cv2.warpAffine(img, M, (width, height))
    # Show the images
    show_cmp_img(img, rotate_img)
The running results are shown as follows:
When the image is rotated around its center, the parts that fall outside the original frame are cropped. If we want the rotated image to remain complete, the code is as follows:
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt


    def show_cmp_img(original_img, transform_img):
        _, axes = plt.subplots(1, 2)
        # Show the images side by side
        axes[0].imshow(original_img)
        axes[1].imshow(transform_img)
        # Set the subplot titles
        axes[0].set_title("original image")
        axes[1].set_title("transform image")
        plt.show()


    img = cv2.imread("../data/girl02.jpg")
    theta = 30
    is_completed = True
    height, width = img.shape[:2]
    # Rotate around the center of the image
    point_x, point_y = int(width / 2), int(height / 2)
    # Convert the angle from degrees to radians
    radian_theta = theta / 180 * np.pi
    # Define the transformation matrix for rotation around an arbitrary point
    M = np.array([[np.cos(radian_theta), np.sin(radian_theta),
                   (1 - np.cos(radian_theta)) * point_x - point_y * np.sin(radian_theta)],
                  [-np.sin(radian_theta), np.cos(radian_theta),
                   (1 - np.cos(radian_theta)) * point_y + point_x * np.sin(radian_theta)]])
    # Convert the image from BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Decide whether the rotated image needs to remain intact
    if is_completed:
        # Enlarge the output so that the rotated image is not cropped;
        # abs() keeps the sizes valid for angles outside 0 to 90 degrees
        new_height = height * np.abs(np.cos(radian_theta)) + width * np.abs(np.sin(radian_theta))
        new_width = height * np.abs(np.sin(radian_theta)) + width * np.abs(np.cos(radian_theta))
        # Shift the translation terms so the result is centered in the new canvas
        M[0, 2] += (new_width - width) * 0.5
        M[1, 2] += (new_height - height) * 0.5
        height = int(np.round(new_height))
        width = int(np.round(new_width))
    # Rotate counterclockwise by theta degrees around the image center
    rotate_img = cv2.warpAffine(img, M, (width, height))
    # Show the images
    show_cmp_img(img, rotate_img)
The running results are shown as follows:
2 Using OpenCV to correct distorted images
In day-to-day image processing we often encounter distorted images. We first need to correct the distortion and only then pass the images to a deep learning model for processing. A distorted image looks as follows:
To correct a tilted target, we first use a method such as contour detection to obtain the coordinates of the 4 key points of the target; then we compute the 4 new coordinate points they should map to; then we use these 4 pairs of key points to calculate the transformation matrix M; finally we apply the transformation matrix to the target. Note that 4 point pairs determine a perspective transformation, of which the affine transformation is a special case, so the code below uses getPerspectiveTransform and warpPerspective. The steps are as follows:
- Read the input image;
- Get the 4 coordinate points of the original target (upper left, lower left, upper right, lower right);
- Calculate the new coordinate point through 4 coordinate points;
- Use opencv to calculate the affine transformation matrix M;
- Apply an affine transformation to transform and display the results.
2.1 Get four vertex coordinates
    import cv2
    import numpy as np


    def get4points(img: np.ndarray, thed, n):
        # Grayscale and binarization
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        ret, binary = cv2.threshold(gray, thed, 255, cv2.THRESH_BINARY)
        # Search for contours (two return values, as in OpenCV 4.x)
        contours, hierarchy = cv2.findContours(
            binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
        # Select the required contour according to the contour length
        len_list = []
        for i in range(len(contours)):
            len_list.append(len(contours[i]))
        # Select the n-th longest contour
        sy = np.argsort(np.array(len_list))[-n]
        # Find the vertices from the sums and differences of the coordinates
        sum_list = []
        dif_list = []
        for i in contours[sy]:
            sum_list.append(i[0][0] + i[0][1])
            dif_list.append(i[0][0] - i[0][1])
        id_lb = np.argsort(np.array(sum_list))
        id_lb2 = np.argsort(np.array(dif_list))
        # Smallest x + y is the upper left, largest x + y is the lower right
        lu_id, rd_id = id_lb[0], id_lb[-1]
        # Smallest x - y is the lower left, largest x - y is the upper right
        ld_id, ru_id = id_lb2[0], id_lb2[-1]
        points = np.array([contours[sy][lu_id][0], contours[sy][rd_id][0],
                           contours[sy][ld_id][0], contours[sy][ru_id][0]])
        return points, contours, sy
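The vertex search in get4points relies on a simple heuristic: in image coordinates the upper left corner has the smallest x + y, the lower right the largest x + y, the lower left the smallest x − y, and the upper right the largest x − y. A minimal NumPy sketch on a hypothetical set of contour points (the coordinates are made up for illustration):

```python
import numpy as np

# Hypothetical contour points of a slightly tilted quadrilateral, as (x, y);
# the last point lies on the top edge and should not be picked as a corner
pts = np.array([[12, 8], [205, 20], [198, 150], [5, 140], [100, 14]])

sums = pts[:, 0] + pts[:, 1]   # x + y for every point
diffs = pts[:, 0] - pts[:, 1]  # x - y for every point

left_up = pts[np.argmin(sums)]     # smallest x + y
right_down = pts[np.argmax(sums)]  # largest x + y
left_down = pts[np.argmin(diffs)]  # smallest x - y
right_up = pts[np.argmax(diffs)]   # largest x - y

print(left_up, right_up, right_down, left_down)
```

The heuristic assumes the target is only moderately tilted; for a shape rotated by close to 45 degrees the sums or differences of two corners can tie, and the assignment becomes ambiguous.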
2.2 Affine transformation
    import cv2
    import numpy as np


    def four_point_transform(image, pts):
        # Order the points as top-left, top-right, bottom-right, bottom-left
        # (order_points is defined in the complete code in section 2.3)
        rect = order_points(pts)
        (tl, tr, br, bl) = rect
        # Width of the new image: the longer of the bottom and top edges
        widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
        widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
        maxWidth = max(int(widthA), int(widthB))
        # Height of the new image: the longer of the right and left edges
        heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
        heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
        maxHeight = max(int(heightA), int(heightB))
        # Construct the 4 coordinate points of the new image
        dst = np.array([
            [0, 0],
            [maxWidth - 1, 0],
            [maxWidth - 1, maxHeight - 1],
            [0, maxHeight - 1]], dtype="float32")
        # Compute the 3x3 perspective transformation matrix from the 4 point pairs
        M = cv2.getPerspectiveTransform(rect, dst)
        # Apply the transformation
        warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
        # Return the transformed result
        return warped
2.3 Complete code
    # coding=utf-8
    import numpy as np
    import cv2
    import matplotlib.pyplot as plt


    def get4points(img: np.ndarray, thed, n):
        # Grayscale and binarization
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        ret, binary = cv2.threshold(gray, thed, 255, cv2.THRESH_BINARY)
        # Search for contours (two return values, as in OpenCV 4.x)
        contours, hierarchy = cv2.findContours(
            binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
        # Select the required contour according to the contour length
        len_list = []
        for i in range(len(contours)):
            len_list.append(len(contours[i]))
        # Select the n-th longest contour
        sy = np.argsort(np.array(len_list))[-n]
        # Find the vertices from the sums and differences of the coordinates
        sum_list = []
        dif_list = []
        for i in contours[sy]:
            sum_list.append(i[0][0] + i[0][1])
            dif_list.append(i[0][0] - i[0][1])
        id_lb = np.argsort(np.array(sum_list))
        id_lb2 = np.argsort(np.array(dif_list))
        # Smallest x + y is the upper left, largest x + y is the lower right
        lu_id, rd_id = id_lb[0], id_lb[-1]
        # Smallest x - y is the lower left, largest x - y is the upper right
        ld_id, ru_id = id_lb2[0], id_lb2[-1]
        points = np.array([contours[sy][lu_id][0], contours[sy][rd_id][0],
                           contours[sy][ld_id][0], contours[sy][ru_id][0]])
        return points, contours, sy


    def order_points(pts):
        # Output order: top-left, top-right, bottom-right, bottom-left
        rect = np.zeros((4, 2), dtype="float32")
        # Top-left has the smallest x + y, bottom-right the largest
        s = pts.sum(axis=1)
        rect[0] = pts[np.argmin(s)]
        rect[2] = pts[np.argmax(s)]
        # np.diff gives y - x: top-right has the smallest, bottom-left the largest
        diff = np.diff(pts, axis=1)
        rect[1] = pts[np.argmin(diff)]
        rect[3] = pts[np.argmax(diff)]
        return rect


    def four_point_transform(image, pts):
        # Order the points as top-left, top-right, bottom-right, bottom-left
        rect = order_points(pts)
        (tl, tr, br, bl) = rect
        # Width of the new image: the longer of the bottom and top edges
        widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
        widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
        maxWidth = max(int(widthA), int(widthB))
        # Height of the new image: the longer of the right and left edges
        heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
        heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
        maxHeight = max(int(heightA), int(heightB))
        # Construct the 4 coordinate points of the new image
        dst = np.array([
            [0, 0],
            [maxWidth - 1, 0],
            [maxWidth - 1, maxHeight - 1],
            [0, maxHeight - 1]], dtype="float32")
        # Compute the 3x3 perspective transformation matrix from the 4 point pairs
        M = cv2.getPerspectiveTransform(rect, dst)
        # Apply the transformation
        warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
        # Return the transformed result
        return warped


    def show_cmp_img(original_img, transform_img):
        _, axes = plt.subplots(1, 2)
        # Show the images side by side
        axes[0].imshow(original_img)
        axes[1].imshow(transform_img)
        # Set the subplot titles
        axes[0].set_title("original image")
        axes[1].set_title("transform image")
        plt.show()


    # Read the image
    image = cv2.imread('../data/warp01.png')
    # Threshold at 127 and use the longest contour (n = 1)
    points, _, _ = get4points(image, 127, 1)
    # Get the original coordinate points
    pts = np.array(points, dtype="float32")
    # Transform the original image
    warped = four_point_transform(image, pts)
    show_cmp_img(image, warped)
The running results are shown as follows: