Use OpenCV to correct image distortion

1 Affine transformation

1.1 What is affine transformation

Image processing often involves operations such as translation, scaling, rotation, and flipping, all of which are affine transformations. An affine transformation, also known as an affine mapping, is a geometric operation that maps one vector space into another by applying a linear transformation followed by a translation. In practice, rotating an image and then stretching or shifting it is a typical affine transformation. An affine transformation is described by a matrix M, and because one matrix can combine several operations, M can be hard to work out by hand.

1.2 Mathematical expression of affine transformation

An affine transformation, also called an affine mapping, applies a linear transformation to a vector space and then a translation, producing another vector space. In other words, it describes how to map one vector space onto another.
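Suppose there is a vector space k:

$$k = (x, y)$$

There is also a vector space j:

$$j = (x', y')$$

If we want to map the vector space k to j, we can use the following formula, where W is a 2×2 matrix and b is a translation vector:

$$j = W k + b$$

Writing the formula out component by component, we get

$$x' = w_{00}\,x + w_{01}\,y + b_0, \qquad y' = w_{10}\,x + w_{11}\,y + b_1$$

We then convert these equations into a matrix multiplication:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} w_{00} & w_{01} & b_0 \\ w_{10} & w_{11} & b_1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$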

The conversion between two vector spaces can be achieved through the parameter matrix M. When performing affine transformation, we only need one matrix M to achieve translation, scaling, rotation and flip transformation.
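
As a rough illustration of how a single 2×3 matrix M acts on coordinates, the sketch below applies M to a few points with NumPy. The helper name apply_affine and the example matrix are made up for this illustration and are not part of OpenCV.

```python
import numpy as np


def apply_affine(M, points):
    """Apply a 2x3 affine matrix M to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=np.float64)
    # Append a column of ones so the translation column of M takes part in the product
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return homogeneous @ np.asarray(M, dtype=np.float64).T


# Hypothetical example matrix: scale x by 2, keep y, then shift by (10, -5)
M = np.array([[2, 0, 10],
              [0, 1, -5]], dtype=np.float64)
pts = np.array([[0, 0], [1, 1], [3, 4]])

result = apply_affine(M, pts)
print(result)
```

The first two columns of M carry the linear part (scaling, rotation, flipping) and the last column carries the translation, which is why one matrix is enough for all of these operations.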

1.3 Affine transformation in OpenCV

In OpenCV, the warpAffine function implements the affine transformation.

cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) → dst

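src: input image array

M: 2×3 affine transformation matrix

dsize: size of the output image, given as (width, height)

flags: interpolation method (cv2.INTER_LINEAR by default)

borderMode: how pixels outside the source image are extrapolated (cv2.BORDER_CONSTANT by default)

borderValue: fill value for the border when borderMode is cv2.BORDER_CONSTANT (0, i.e. black, by default)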

1.3.1 Image translation

In the plane coordinate system, there are points P(x, y) and P′(x′, y′). To move point P to P′, we apply the following transformation:

$$x' = x + \Delta x, \qquad y' = y + \Delta y$$

Here Δx and Δy are the offsets in the x direction and the y direction. Converting this into matrix form gives

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & \Delta x \\ 0 & 1 & \Delta y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

The 2×3 matrix M above is the translation form of the affine matrix. It is implemented with OpenCV's warpAffine function as follows:

import cv2
import numpy as np
import matplotlib.pyplot as plt


def show_cmp_img(original_img,transform_img):
    _, axes = plt.subplots(1, 2)
    # show image
    axes[0].imshow(original_img)
    axes[1].imshow(transform_img)
    # Set subtitle
    axes[0].set_title("original image")
    axes[1].set_title("transform image")
    plt.show()


# Define the image translation matrix
# x offset: -100 pixels (negative moves left, positive moves right)
# y offset: 200 pixels (negative moves up, positive moves down)
# np.float was removed in NumPy 1.24, so np.float64 is used instead
M = np.array([[1, 0, -100], [0, 1, 200]], dtype=np.float64)

# Read the image that needs to be translated
img = cv2.imread("../data/girl02.jpg")

# Convert the image from BGR to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Define the size of the image after translation and keep it consistent with the original image size
dsize = img.shape[:2][::-1]

# Fill the exposed border with white so that the shift is easy to see
translation_img = cv2.warpAffine(img, M, dsize, borderValue=(255, 255, 255))

# show image
show_cmp_img(img, translation_img)

The running results are shown as follows:

1.3.2 Image flip

OpenCV's affine transformation can flip an image horizontally, vertically, or both at once (a mirror flip).

A, B, C, and D in the figure represent the four vertices of the image. To flip the image horizontally, point A must swap places with point B, and point C with point D. In other words, every point is mirrored about the vertical center line of the image. Horizontal flipping can be achieved through the following formula

$$x' = -x + (w - 1), \qquad y' = y$$

The w in the above formula represents the width of the image. The offset is w − 1 rather than w because pixel columns are indexed from 0 to w − 1. In the same way, the formula for vertical flipping is

$$x' = x, \qquad y' = -y + (h - 1)$$

where h represents the height of the image. Combining both gives the transformation matrix for image flipping:

$$M = \begin{bmatrix} -1 & 0 & w - 1 \\ 0 & -1 & h - 1 \end{bmatrix}$$

It is implemented with OpenCV's warpAffine function as follows:

import cv2
import matplotlib.pyplot as plt
import numpy as np


def show_cmp_img(original_img,transform_img):
    _, axes = plt.subplots(1, 2)
    # show image
    axes[0].imshow(original_img)
    axes[1].imshow(transform_img)
    # Set subtitle
    axes[0].set_title("original image")
    axes[1].set_title("transform image")
    plt.show()


horizontal_flip = True
vertical_flip = True

img = cv2.imread("../data/girl02.jpg")

# Get the width and height of the input image
height,width = img.shape[:2]

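# Initialize the transformation matrix as the identity, so that an axis
# that is not flipped is left unchanged
# (np.float was removed in NumPy 1.24, so np.float64 is used instead)
M = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float64)

# Horizontal flip: x' = (width - 1) - x, because columns are indexed 0..width-1
if horizontal_flip:
    M[0] = [-1, 0, width - 1]

# Vertical flip: y' = (height - 1) - y
if vertical_flip:
    M[1] = [0, -1, height - 1]

# Convert the image from BGR to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Apply the flip; the output keeps the input size
img_flip = cv2.warpAffine(img, M, (width, height))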

show_cmp_img(img, img_flip)

The running results are shown as follows:
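
To see why the translation term of a flip matrix matters, the following NumPy sketch maps the four corners of a small hypothetical image through a flip matrix. It assumes pixel coordinates that run from 0 to w − 1 and from 0 to h − 1.

```python
import numpy as np

w, h = 4, 3  # hypothetical image size: 4 columns, 3 rows

# Mirror flip (horizontal and vertical) with offsets w - 1 and h - 1
M = np.array([[-1, 0, w - 1],
              [0, -1, h - 1]], dtype=np.float64)

# Corners as homogeneous (x, y, 1) columns:
# top-left, top-right, bottom-left, bottom-right
corners = np.array([[0, w - 1, 0, w - 1],
                    [0, 0, h - 1, h - 1],
                    [1, 1, 1, 1]], dtype=np.float64)

flipped = (M @ corners).T
print(flipped)  # every corner lands on the diagonally opposite corner

# With offsets w and h instead, the top-left corner is mapped to (w, h),
# which lies one pixel outside the valid index range
M_off = np.array([[-1, 0, w], [0, -1, h]], dtype=np.float64)
outside = M_off @ np.array([0, 0, 1], dtype=np.float64)
print(outside)
```

With the larger offsets the flipped image is shifted by one pixel, so one row and one column of the output are filled with the border value instead of image content.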

OpenCV's flip function can also flip the image.

flip function parameters:

src: input image array
flipCode: flip mode. A positive value (for example 1) flips horizontally, 0 flips vertically, and a negative value (for example -1) flips both horizontally and vertically (mirror flip).

import cv2

img = cv2.imread("../data/girl02.jpg")

#horizontal flip
horizontal_flip_img = cv2.flip(img,1)

#vertical flip
vertical_flip_img = cv2.flip(img,0)

#mirror flip
mirror_flip_img = cv2.flip(img,-1)

Flipping the image with NumPy indexing

import cv2

img = cv2.imread("../data/girl02.jpg")

#horizontal flip
horizontal_flip_img = img[:,::-1]

#vertical flip
vertical_flip_img = img[::-1]

#mirror flip
mirror_flip_img = img[::-1,::-1]
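
One detail of the NumPy indexing approach is that the slices are views on the original array rather than copies. The sketch below, which uses a small synthetic array as a stand-in for an image, checks the slices against NumPy's own flip helpers and shows how to get a contiguous copy. Whether a copy is needed depends on what the result is passed to, so treat this as an optional precaution.

```python
import numpy as np

img = np.arange(12, dtype=np.uint8).reshape(3, 4)  # small stand-in for an image

horizontal = img[:, ::-1]
vertical = img[::-1]
both = img[::-1, ::-1]

# The slices select the same pixels as NumPy's dedicated flip helpers
same_h = np.array_equal(horizontal, np.fliplr(img))
same_v = np.array_equal(vertical, np.flipud(img))
same_b = np.array_equal(both, np.flip(img, axis=(0, 1)))
print(same_h, same_v, same_b)

# A slice is a view that shares memory with img and is not C-contiguous
print(np.shares_memory(horizontal, img))
print(horizontal.flags["C_CONTIGUOUS"])

# Copy into a contiguous buffer if a downstream function requires one
horizontal_copy = np.ascontiguousarray(horizontal)
print(horizontal_copy.flags["C_CONTIGUOUS"])
```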

1.3.3 Image scaling

To scale point P in the coordinate system, we can use the following formula

$$x' = f_x\,x, \qquad y' = f_y\,y$$

That is, x and y are each multiplied by a scaling factor. We also convert this into matrix form

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f_x & 0 & 0 \\ 0 & f_y & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

With the matrix M above we can scale the image. It is implemented with OpenCV's warpAffine function as follows:

import cv2
import numpy as np
import matplotlib.pyplot as plt


def show_cmp_img(original_img,transform_img):
    _, axes = plt.subplots(1, 2)
    # show image
    axes[0].imshow(original_img)
    axes[1].imshow(transform_img)
    # Set subtitle
    axes[0].set_title("original image")
    axes[1].set_title("transform image")
    plt.show()


# Scaling factor for the width
fx = 0.5

# Scaling factor for the height
fy = 2

# Define the image scaling matrix
# (np.float was removed in NumPy 1.24, so np.float64 is used instead)
M = np.array([[fx, 0, 0], [0, fy, 0]], dtype=np.float64)

# read image
img = cv2.imread("../data/girl02.jpg")

# Get the width and height of the image
height, width = img.shape[:2]

# Convert the image from BGR to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Apply the scaling; the output size is the input size multiplied by fx and fy
scale_img = cv2.warpAffine(img, M, (int(width*fx), int(height*fy)))

# show image
show_cmp_img(img, scale_img)

The results are shown below:

The resize function in opencv can also achieve the same effect.


1.3.4 Image rotation
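Rotation around the origin: Let's first look at how a point on a two-dimensional plane rotates around the origin.

In the figure, point v′ is obtained by rotating point v by θ degrees around the origin. Writing v in polar coordinates gives v = (r cos φ, r sin φ), so v′ = (r cos(θ + φ), r sin(θ + φ)). Expanding with the angle-sum identities for sine and cosine gives

$$\begin{aligned} x' &= r\cos(\theta + \varphi) = r\cos\varphi\cos\theta - r\sin\varphi\sin\theta = x\cos\theta - y\sin\theta \\ y' &= r\sin(\theta + \varphi) = r\sin\varphi\cos\theta + r\cos\varphi\sin\theta = x\sin\theta + y\cos\theta \end{aligned}$$

Expressing the above formula with the matrix M, we get

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Special note: the derivation above uses a rectangular coordinate system whose origin is the lower left corner, with the y axis pointing up. An image has its origin at the upper left corner, with the y axis pointing down, so we need to negate the angle θ. Since cos(−θ) = cos θ and sin(−θ) = −sin θ, the M matrix becomes

$$M = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \end{bmatrix}$$

Note also that NumPy's trigonometric functions take angles in radians, so θ must be converted from degrees first. The conversion code is as follows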
#Convert angles to radians
radian_theta = theta/180 * np.pi
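
As a quick check of the radian conversion and of the sign convention from the special note, the sketch below rotates the point (1, 0) by 90 degrees with both variants of the matrix. The names R_math and R_image are chosen for this illustration only.

```python
import numpy as np

theta = 90
radian_theta = np.deg2rad(theta)  # same value as theta / 180 * np.pi

# Rotation matrix for a coordinate system whose y axis points up
R_math = np.array([[np.cos(radian_theta), -np.sin(radian_theta)],
                   [np.sin(radian_theta), np.cos(radian_theta)]])

# Matrix with theta negated, as used for image coordinates (y axis points down)
R_image = np.array([[np.cos(radian_theta), np.sin(radian_theta)],
                    [-np.sin(radian_theta), np.cos(radian_theta)]])

p = np.array([1.0, 0.0])
p_math = R_math @ p    # approximately (0, 1)
p_image = R_image @ p  # approximately (0, -1)
print(p_math, p_image)
```

In image coordinates a smaller y means higher up on the screen, so (0, −1) is the position that looks like a counterclockwise quarter turn of (1, 0) to a viewer.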
Rotate the image counterclockwise θ degrees around the origin. The opencv code is implemented as follows:

import cv2
import numpy as np
import matplotlib.pyplot as plt


def show_cmp_img(original_img,transform_img):
    _, axes = plt.subplots(1, 2)
    # show image
    axes[0].imshow(original_img)
    axes[1].imshow(transform_img)
    # Set subtitle
    axes[0].set_title("original image")
    axes[1].set_title("transform image")
    plt.show()


theta = 30

# Convert angles to radians
radian_theta = theta/180 * np.pi

# Define the transformation matrix for rotation around the origin
M = np.array([[np.cos(radian_theta), np.sin(radian_theta), 0],
             [-np.sin(radian_theta), np.cos(radian_theta), 0]])
# read image
img = cv2.imread("../data/girl02.jpg")

# Define the width and height of the rotated image
height, width = img.shape[:2]

# Convert the image from BGR to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Rotate counterclockwise by theta degrees around the origin
rotate_img = cv2.warpAffine(img, M, (width, height))

# show image
show_cmp_img(img,rotate_img)
The running results are shown as follows:

1.3.5 Rotate around any point
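In the figure, point v is rotated 90 degrees around point (a, b) to obtain v′. This is equivalent to first translating v so that (a, b) moves to the origin, which gives the point v1; then rotating v1 by 90 degrees around the origin to get the point v2; and finally translating v2 by the same distance in the opposite direction of the first translation to get v′. In this way, we convert the problem of rotation around any point into the problem of rotation around the origin.

Let's review the transformation formula for rotating coordinates around the origin:

$$x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta$$

Starting from this formula, we extend it to rotation around any point c(a, b). We first translate the original coordinates by (−a, −b), then apply the rotation, and finally translate back by (a, b). This gives the transformation formula for rotation around any point:

$$\begin{aligned} x' &= (x - a)\cos\theta - (y - b)\sin\theta + a \\ y' &= (x - a)\sin\theta + (y - b)\cos\theta + b \end{aligned}$$

Expand it to get

$$\begin{aligned} x' &= x\cos\theta - y\sin\theta + (1 - \cos\theta)\,a + b\sin\theta \\ y' &= x\sin\theta + y\cos\theta + (1 - \cos\theta)\,b - a\sin\theta \end{aligned}$$

Express the above equation with matrix M:

$$M = \begin{bmatrix} \cos\theta & -\sin\theta & (1 - \cos\theta)\,a + b\sin\theta \\ \sin\theta & \cos\theta & (1 - \cos\theta)\,b - a\sin\theta \end{bmatrix}$$

c(a, b) in the above formula is the center of rotation. As before, the image coordinate system requires θ to be negated, so the final expression of the M matrix is as follows

$$M = \begin{bmatrix} \cos\theta & \sin\theta & (1 - \cos\theta)\,a - b\sin\theta \\ -\sin\theta & \cos\theta & (1 - \cos\theta)\,b + a\sin\theta \end{bmatrix}$$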
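
The translate, rotate, translate back reasoning can be checked numerically. The sketch below builds the three steps as 3×3 homogeneous matrices, multiplies them, and compares the result with the closed-form M used in the code that follows (with θ already negated for image coordinates). The function name and the example values are assumptions made for this illustration.

```python
import numpy as np


def rotation_about_point(theta_deg, a, b):
    """Build the 2x3 matrix for an image-space rotation about (a, b) by
    composing translate(-a, -b), rotate, translate(a, b) as 3x3 matrices."""
    t = np.deg2rad(theta_deg)
    to_origin = np.array([[1, 0, -a], [0, 1, -b], [0, 0, 1]], dtype=np.float64)
    # theta is negated because the image y axis points down
    rotate = np.array([[np.cos(t), np.sin(t), 0],
                       [-np.sin(t), np.cos(t), 0],
                       [0, 0, 1]])
    back = np.array([[1, 0, a], [0, 1, b], [0, 0, 1]], dtype=np.float64)
    # The rightmost matrix is applied first
    return (back @ rotate @ to_origin)[:2]


theta, a, b = 30, 120.0, 80.0  # hypothetical angle and rotation center
M = rotation_about_point(theta, a, b)

t = np.deg2rad(theta)
closed_form = np.array([
    [np.cos(t), np.sin(t), (1 - np.cos(t)) * a - b * np.sin(t)],
    [-np.sin(t), np.cos(t), (1 - np.cos(t)) * b + a * np.sin(t)]])

matches = np.allclose(M, closed_form)
center_fixed = np.allclose(M @ np.array([a, b, 1.0]), [a, b])
print(matches, center_fixed)
```

The second check confirms the defining property of the construction: the rotation center itself is not moved by M.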

The code using opencv is as follows:
import cv2
import numpy as np
import matplotlib.pyplot as plt


def show_cmp_img(original_img,transform_img):
    _, axes = plt.subplots(1, 2)
    # show image
    axes[0].imshow(original_img)
    axes[1].imshow(transform_img)
    # Set subtitle
    axes[0].set_title("original image")
    axes[1].set_title("transform image")
    plt.show()


img = cv2.imread("../data/girl02.jpg")

theta = 30
height, width = img.shape[:2]

# Define rotation around the center of the image
point_x, point_y = int(width/2), int(height/2)

# Convert angles to radians
radian_theta = theta / 180 * np.pi

# Define the transformation matrix for rotation around any point
M = np.array([[np.cos(radian_theta), np.sin(radian_theta),
               (1-np.cos(radian_theta))*point_x-point_y*np.sin(radian_theta)],
              [-np.sin(radian_theta), np.cos(radian_theta),
               (1-np.cos(radian_theta))*point_y + point_x*np.sin(radian_theta)]])

# Convert the image from BGR to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Define the width and height of the rotated image
height, width = img.shape[:2]

# Rotate counterclockwise by theta degrees around (point_x, point_y)
rotate_img = cv2.warpAffine(img, M, (width, height))

# show image
show_cmp_img(img, rotate_img)
The running results are shown as follows:

When the image is rotated around its center, the corners that fall outside the original frame are cropped. If we want the rotated image to remain complete, the code is as follows:
import cv2
import numpy as np
import matplotlib.pyplot as plt


def show_cmp_img(original_img,transform_img):
    _, axes = plt.subplots(1, 2)
    # show image
    axes[0].imshow(original_img)
    axes[1].imshow(transform_img)
    # Set subtitle
    axes[0].set_title("original image")
    axes[1].set_title("transform image")
    plt.show()


img = cv2.imread("../data/girl02.jpg")

theta = 30
is_completed = True
height, width = img.shape[:2]

# Define rotation around the center of the image
point_x, point_y = int(width/2), int(height/2)

# Convert angles to radians
radian_theta = theta / 180 * np.pi

# Define the transformation matrix for rotation around any point
M = np.array([[np.cos(radian_theta), np.sin(radian_theta),
               (1-np.cos(radian_theta))*point_x-point_y*np.sin(radian_theta)],
              [-np.sin(radian_theta), np.cos(radian_theta),
               (1-np.cos(radian_theta))*point_y + point_x*np.sin(radian_theta)]])
# Convert the image from BGR to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Define the width and height of the rotated image
height, width = img.shape[:2]

# Determine whether the rotated image needs to remain intact
if is_completed:
    # Enlarge the output so that the rotated image is not cropped;
    # abs() keeps the size positive for angles outside 0-90 degrees
    new_height = height * abs(np.cos(radian_theta)) + width * abs(np.sin(radian_theta))
    new_width = height * abs(np.sin(radian_theta)) + width * abs(np.cos(radian_theta))

    # Shift the translation terms so the result is centered in the larger canvas
    M[0, 2] += (new_width - width) * 0.5
    M[1, 2] += (new_height - height) * 0.5
    height = int(np.round(new_height))
    width = int(np.round(new_width))

# Rotate counterclockwise by theta degrees around the image center
rotate_img = cv2.warpAffine(img, M, (width, height))
# show image
show_cmp_img(img, rotate_img)
The running results are shown as follows:
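
The enlarged output size is the bounding box of the rotated rectangle. The helper below is a small sketch of that calculation on its own. The name rotated_bounds is made up for this example, and absolute values are used so that the result stays valid for angles outside 0 to 90 degrees.

```python
import numpy as np


def rotated_bounds(width, height, theta_deg):
    """Return (new_width, new_height) of the axis-aligned box that
    encloses a width x height rectangle rotated by theta_deg degrees."""
    t = np.deg2rad(theta_deg)
    cos_t, sin_t = abs(np.cos(t)), abs(np.sin(t))
    new_width = int(np.round(height * sin_t + width * cos_t))
    new_height = int(np.round(height * cos_t + width * sin_t))
    return new_width, new_height


print(rotated_bounds(400, 300, 0))    # (400, 300): no rotation, no change
print(rotated_bounds(400, 300, 90))   # (300, 400): width and height swap
print(rotated_bounds(400, 300, 30))   # (496, 460)
print(rotated_bounds(400, 300, 150))  # (496, 460): same box as 30 degrees
```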

2 Use OpenCV to correct image distortion
In day-to-day image processing we often encounter distorted images. They need to be corrected first and only then passed to a deep learning model for processing. A distorted image looks like the following:

To correct the tilted target, we first use a method such as contour detection to obtain the coordinates of the target's 4 key points; then we derive the 4 corresponding target points; then we use these 4 point pairs to calculate the transformation matrix M; and finally we apply M to the image. Because four point pairs are used, M here is a 3×3 perspective transformation matrix (an affine matrix is already fixed by three point pairs), which is why the code calls getPerspectiveTransform and warpPerspective. Proceed as follows:

Read the input image;
Get the 4 corner points of the original target (upper left, upper right, lower right, lower left);
Compute the 4 target corner points from them;
Use OpenCV to calculate the perspective transformation matrix M;
Apply the perspective transformation and display the result.

2.1 Get four vertex coordinates
def get4points(img: np.ndarray, thed, n):
    # Grayscale and binarization
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ret, binary = cv2.threshold(gray, thed, 255, cv2.THRESH_BINARY)

    # Search for contours (OpenCV 4.x returns two values; OpenCV 3.x returns three)
    contours, hierarchy = cv2.findContours(
        binary,
        cv2.RETR_LIST,
        cv2.CHAIN_APPROX_SIMPLE)

    # Rank the contours by their number of points
    len_list = []
    for i in range(len(contours)):
        len_list.append(len(contours[i]))

    # Select the contour with the n-th most points (n=1 picks the largest)
    sy = np.argsort(np.array(len_list))[-n]

    # Find vertices using x + y and x - y of every contour point
    sum_list = []
    dif_list = []
    for i in contours[sy]:
        sum_list.append(i[0][0] + i[0][1])
        dif_list.append(i[0][0] - i[0][1])

    id_lb = np.argsort(np.array(sum_list))
    id_lb2 = np.argsort(np.array(dif_list))
    lu_id, rd_id = id_lb[0], id_lb[-1]
    ld_id, ru_id = id_lb2[0], id_lb2[-1]

    points = np.array([contours[sy][lu_id][0], contours[sy][rd_id][0],
                       contours[sy][ld_id][0], contours[sy][ru_id][0]])

    return points, contours, sy
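
The vertex search in get4points relies on a simple heuristic: among all contour points, the upper-left corner has the smallest x + y, the lower-right corner the largest x + y, the lower-left corner the smallest x − y, and the upper-right corner the largest x − y. The sketch below applies the same idea to a handful of made-up contour points. The heuristic assumes the target is only moderately tilted and may pick the wrong points for strongly rotated shapes.

```python
import numpy as np

# Hypothetical contour points (x, y) of a slightly tilted quadrilateral,
# including a few extra points along its edges
contour = np.array([[52, 30], [120, 34], [310, 45], [305, 160],
                    [298, 240], [150, 236], [40, 232], [45, 130]])

sums = contour[:, 0] + contour[:, 1]   # x + y for every point
diffs = contour[:, 0] - contour[:, 1]  # x - y for every point

top_left = contour[np.argmin(sums)]      # smallest x + y
bottom_right = contour[np.argmax(sums)]  # largest x + y
bottom_left = contour[np.argmin(diffs)]  # smallest x - y
top_right = contour[np.argmax(diffs)]    # largest x - y

print(top_left, top_right, bottom_right, bottom_left)
```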
2.2 Perspective transformation
def four_point_transform(image, pts):
    # Get coordinate points and separate them
    rect = order_points(pts)
    (tl, tr, br, bl) = rect
    # Calculate the width value of the new image and select the maximum value of the horizontal difference
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))

    # Calculate the height value of the new image and select the maximum value of the vertical difference
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))

    # Construct 4 coordinate points of the new picture
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")

    # Get the perspective transformation matrix from the 4 point pairs
    M = cv2.getPerspectiveTransform(rect, dst)
    # Apply the perspective transformation
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    # Return the transformed result
    return warped
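
To make it clearer what a matrix built from 4 point pairs represents, the sketch below solves for a 3×3 perspective matrix directly with NumPy. This is a conceptual illustration under the assumption that the bottom-right entry of the matrix is fixed to 1. It is not OpenCV's implementation, and the function names and sample points are made up.

```python
import numpy as np


def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] = 1) that maps each
    src point (x, y) to its dst point (u, v)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1), rearranged the same way
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=np.float64),
                        np.array(b, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)


def project(H, point):
    """Apply H to one (x, y) point and divide by the third coordinate."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])


# Hypothetical tilted quadrilateral: top-left, top-right, bottom-right, bottom-left
src = np.array([[52, 30], [310, 45], [298, 240], [40, 232]], dtype=np.float64)
# Upright target rectangle in the same corner order
dst = np.array([[0, 0], [259, 0], [259, 199], [0, 199]], dtype=np.float64)

H = perspective_matrix(src, dst)
mapped = np.array([project(H, p) for p in src])
print(np.round(mapped, 3))  # each source corner lands on its target corner
```

The division by the third coordinate is what distinguishes a perspective transformation from an affine one, and it is the reason a 2×3 matrix is not enough to straighten a quadrilateral whose opposite sides are not parallel.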
2.3 Complete code
# coding=utf-8
import numpy as np
import cv2
import matplotlib.pyplot as plt


def get4points(img: np.ndarray, thed, n):
    # Grayscale and binarization
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ret, binary = cv2.threshold(gray, thed, 255, cv2.THRESH_BINARY)

    # Search for contours (OpenCV 4.x returns two values; OpenCV 3.x returns three)
    contours, hierarchy = cv2.findContours(
        binary,
        cv2.RETR_LIST,
        cv2.CHAIN_APPROX_SIMPLE)

    # Rank the contours by their number of points
    len_list = []
    for i in range(len(contours)):
        len_list.append(len(contours[i]))

    # Select the contour with the n-th most points (n=1 picks the largest)
    sy = np.argsort(np.array(len_list))[-n]

    # Find vertices using x + y and x - y of every contour point
    sum_list = []
    dif_list = []
    for i in contours[sy]:
        sum_list.append(i[0][0] + i[0][1])
        dif_list.append(i[0][0] - i[0][1])

    id_lb = np.argsort(np.array(sum_list))
    id_lb2 = np.argsort(np.array(dif_list))
    lu_id, rd_id = id_lb[0], id_lb[-1]
    ld_id, ru_id = id_lb2[0], id_lb2[-1]

    points = np.array([contours[sy][lu_id][0], contours[sy][rd_id][0],
                       contours[sy][ld_id][0], contours[sy][ru_id][0]])

    return points, contours, sy


def order_points(pts):
    #Initialize coordinate points
    rect = np.zeros((4, 2), dtype = "float32")

    # The upper left corner has the smallest x + y, the lower right corner the largest
    s = pts.sum(axis = 1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    # np.diff gives y - x for each point: the upper right corner has the
    # smallest value, the lower left corner the largest
    diff = np.diff(pts, axis = 1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    return rect


def four_point_transform(image, pts):
    # Get coordinate points and separate them
    rect = order_points(pts)
    (tl, tr, br, bl) = rect
    # Calculate the width value of the new image and select the maximum value of the horizontal difference
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))

    # Calculate the height value of the new image and select the maximum value of the vertical difference
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))

    # Construct 4 coordinate points of the new picture
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")

    # Get the perspective transformation matrix from the 4 point pairs
    M = cv2.getPerspectiveTransform(rect, dst)
    # Apply the perspective transformation
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    # Return the transformed result
    return warped


def show_cmp_img(original_img, transform_img):
    _, axes = plt.subplots(1, 2)
    # show image
    axes[0].imshow(original_img)
    axes[1].imshow(transform_img)
    # Set subtitle
    axes[0].set_title("original image")
    axes[1].set_title("transform image")
    plt.show()


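# Read the image
image = cv2.imread('../data/warp01.png')
if image is None:
    raise FileNotFoundError("could not read ../data/warp01.png")

# Threshold at 127 and use the largest contour (n=1)
points, _, _ = get4points(image, 127, 1)

# Get the original coordinate points
pts = np.array(points, dtype="float32")

# Transform the original image
warped = four_point_transform(image, pts)

# Convert BGR to RGB so that matplotlib shows the correct colors
show_cmp_img(cv2.cvtColor(image, cv2.COLOR_BGR2RGB),
             cv2.cvtColor(warped, cv2.COLOR_BGR2RGB))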

The running results are shown as follows:

