OpenCV implements geometric transformation of images

Article directory

- Summary: OpenCV implements geometric transformation, image thresholding and smoothing of images
- Transform
- Summary

Summary: OpenCV implements geometric transformation, image thresholding and smoothing of images

This article covers three core image-processing topics in the OpenCV library: geometric transformations, image thresholding, and image smoothing. The geometric transformation section explains translation, rotation, scaling, and affine transformation in detail and shows how to implement each with OpenCV functions. The thresholding section covers simple, adaptive, and Otsu thresholding. The smoothing section covers mean, Gaussian, and median filtering and their use in denoising and blurring.

Main content:

Geometric transformation:
    Image translation: Introduces how to implement image translation operation through OpenCV's cv2.warpAffine() function.
    Image rotation: Demonstrates how to use OpenCV's cv2.getRotationMatrix2D() and cv2.warpAffine() functions to achieve image rotation.
    Image Scaling: Introduces the concepts of image reduction and enlargement, and how to use OpenCV's cv2.resize() function to achieve scaling.
    Affine transformation: The definition and implementation of affine transformation are discussed, including the calculation and application of transformation matrices.

Image thresholding:
    Simple threshold: Introduces the basic principles of simple threshold processing and the use of the OpenCV function cv2.threshold().
    Adaptive Thresholding: Discusses the concept of adaptive thresholding and the use of the cv2.adaptiveThreshold() function in OpenCV.
    Otsu Threshold: Introduces the principle of Otsu threshold method and how to use OpenCV's cv2.threshold() function combined with the cv2.THRESH_OTSU flag to implement automatic threshold selection.

Image smoothing:
    Mean filtering: Introduces the concept of mean filtering and the use of the cv2.blur() function in OpenCV.
    Gaussian filtering: Discusses the principle of Gaussian filtering and the use of the OpenCV function cv2.GaussianBlur().
    Median filtering: Introduces the characteristics of median filtering and how to use OpenCV’s cv2.medianBlur() function to implement median filtering.

Transform

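This section applies different geometric transformations to images, such as scaling, translation, rotation, affine transformation, and perspective transformation.
Functions used: cv.resize, cv.warpAffine, cv.getRotationMatrix2D, cv.getAffineTransform, cv.getPerspectiveTransform, cv.warpPerspective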

1. Scaling
OpenCV provides two main image transformation functions, cv.warpAffine and cv.warpPerspective, which can perform all kinds of image transformations. cv.warpAffine takes a 2×3 transformation matrix as input, while cv.warpPerspective takes a 3×3 transformation matrix as input.

Scaling resizes an image. OpenCV provides the cv.resize() function, which accepts either an explicit output size or scale factors (fx and fy). Several interpolation methods are available. Typically, cv.INTER_AREA is used to shrink images, while cv.INTER_CUBIC (slower) and cv.INTER_LINEAR are used to enlarge them. cv.INTER_LINEAR is the default and works for all resizing purposes.

Here are two ways to resize the input image:

import cv2 as cv

img = cv.imread('img.png')
assert img is not None, "img.png could not be read"

# Method 1: Use fx and fy as scaling factors
res = cv.resize(img, None, fx=2, fy=2, interpolation=cv.INTER_CUBIC)

# Method 2: Set the new image size manually, as (width, height)
height, width = img.shape[:2]
res = cv.resize(img, (2 * width, 2 * height), interpolation=cv.INTER_CUBIC)

cv.namedWindow('Resized Image', cv.WINDOW_NORMAL) # Create a window that the user can resize
cv.imshow('Resized Image', res) # Display the image in the window

cv.waitKey(0) # Wait for the user to press any key
cv.destroyAllWindows() # Close the window


2. Translation
Translation shifts the location of the image content. If the shift in the (x, y) direction is (t_x, t_y), you can create the affine transformation matrix M as follows:

$$M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}$$

You can create this matrix as a NumPy array of type np.float32 and pass it to OpenCV's cv.warpAffine() function. In this matrix, t_x is the shift along the x-axis and t_y is the shift along the y-axis. The third argument of cv.warpAffine() is the size of the output image, given as (width, height), that is, (number of columns, number of rows).

import numpy as np
import cv2 as cv

# Read grayscale image
img = cv.imread('img.png', 0)
rows, cols = img.shape

# Define the translation matrix
M = np.float32([
    [1, 0, 100],  # shift 100 pixels along the x-axis
    [0, 1, 50],   # shift 50 pixels along the y-axis
])

# Apply the translation
dst = cv.warpAffine(img, M, (cols, rows))

# Display the translated image
cv.imshow('img', dst)
cv.waitKey(0)
cv.destroyAllWindows()

3. Rotation
Rotating an image requires a rotation angle θ and a center of rotation. OpenCV provides the function cv.getRotationMatrix2D to compute the rotation matrix. The function takes the following parameters:
    Coordinates of the rotation center (center_x, center_y)
    Rotation angle θ in degrees (positive values rotate counter-clockwise)
    Scaling factor scale
The function returns a 2×3 transformation matrix, which can be passed to the cv.warpAffine() function to rotate the image.

import numpy as np
import cv2 as cv

# Read grayscale image
img = cv.imread('img.png', 0)

# Get the height and width of the image
rows, cols = img.shape

# Compute the coordinates of the rotation center (the center of the image)
center_x = (cols - 1) / 2.0
center_y = (rows - 1) / 2.0

# Get the rotation matrix. The parameters are the rotation center, the rotation angle in degrees (180), and the scaling factor (1 means no scaling).
M = cv.getRotationMatrix2D((center_x, center_y), 180, 1)

# Apply the affine transformation to rotate the image by 180 degrees
dst = cv.warpAffine(img, M, (cols, rows))

# Display the rotated image
cv.imshow('img', dst)
cv.waitKey(0)
cv.destroyAllWindows()

4. Affine transformation

An affine transformation keeps lines that are parallel in the original image parallel in the output image. To find the transformation matrix, we need to select three non-collinear points in the original image and their corresponding positions in the output image. These three point pairs determine a 2×3 affine transformation matrix, which can be used to map any point in the original image to the corresponding point in the output image. OpenCV provides the function cv.getAffineTransform for calculating this matrix, which is then passed to cv.warpAffine.

Here is an example that demonstrates how to use the cv.getAffineTransform function to perform an affine transformation:

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# Read the input image
img = cv.imread('drawing.png')
rows, cols, ch = img.shape

# Three points in the original image and their corresponding positions in the output image
pts1 = np.float32([[50, 50], [200, 50], [20, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250]])

# Calculate the affine transformation matrix
M = cv.getAffineTransform(pts1, pts2)

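# Apply affine transformation
dst = cv.warpAffine(img, M, (cols, rows))

# Display input image and output image
# (cv.imread returns BGR while matplotlib expects RGB, so convert before plotting)
plt.subplot(121), plt.imshow(cv.cvtColor(img, cv.COLOR_BGR2RGB)), plt.title("Input")
plt.subplot(122), plt.imshow(cv.cvtColor(dst, cv.COLOR_BGR2RGB)), plt.title("Output")
plt.show()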

In this example, pts1 are the three points in the original image, and pts2 are their corresponding positions in the output image. The function cv.getAffineTransform calculates the affine transformation matrix M, and then the cv.warpAffine function applies the matrix to the original image to obtain the output image dst. The picture on the left is the input image, and the picture on the right is the output image.

5. Perspective transformation
A perspective transformation maps a quadrilateral region of an image onto another quadrilateral. Straight lines remain straight after the transformation, but parallel lines do not necessarily stay parallel. It requires a 3×3 matrix that maps any point in the original image to the corresponding point in the output image. To find this matrix, we need to select four points in the input image and their corresponding points in the output image. No three of these four points may be collinear. The matrix can be calculated using the function cv.getPerspectiveTransform and then passed to the cv.warpPerspective function to apply the transformation.

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# Read the input image
img = cv.imread('sudoku.png')
rows, cols, ch = img.shape

# Four points in the original image and their corresponding positions in the output image
pts1 = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
pts2 = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])

# Calculate the perspective transformation matrix
M = cv.getPerspectiveTransform(pts1, pts2)

# Apply perspective transformation
dst = cv.warpPerspective(img, M, (300, 300))

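# Display input image and output image
# (cv.imread returns BGR while matplotlib expects RGB, so convert before plotting)
plt.subplot(121), plt.imshow(cv.cvtColor(img, cv.COLOR_BGR2RGB)), plt.title('Input')
plt.subplot(122), plt.imshow(cv.cvtColor(dst, cv.COLOR_BGR2RGB)), plt.title('Output')
plt.show()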

In this example, pts1 are the four points in the original image, and pts2 are their corresponding positions in the output image. The function cv.getPerspectiveTransform calculates the perspective transformation matrix M, and then the cv.warpPerspective function applies the matrix to the original image to obtain the output image dst. The picture on the left is the input image, and the picture on the right is the output image.

Summary

This article covers the key skills for using the OpenCV library to apply geometric transformations, thresholding, and smoothing to images. These techniques are widely used in image processing, computer vision, and image analysis, and they serve as building blocks for more complex image processing tasks.