Comprehensive Experiment Report on Digital Image Processing
1. Canny edge detection
An image's edges can be extracted by convolving it directly with a first-order or second-order derivative operator, but these methods have shortcomings. On the one hand, the detection accuracy is not high enough; on the other hand, the extracted edges are not a single pixel wide. Canny edge detection is a more advanced method that addresses both problems.
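The Canny edge detection algorithm was designed around the following three goals:
- Low error rate: real edges are detected and spurious ones are not
- Good localization: detected edge points lie close to the true edge
- Single response: each true edge point yields only one detected point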
The Canny algorithm does not simply filter the image with one convolution kernel; it is a sequence of steps, which can be roughly divided into the following four:
- Gaussian filtering (reduce noise, prevent false edges)
- Calculate the gradient and gradient direction of an image
- Apply non-maximum suppression to a gradient image
- Detect and connect edges using dual thresholding and connectivity analysis
The first two steps are equivalent to a simple edge detection algorithm, the third step thins the edges to a single pixel width, and the fourth step removes false edges.
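As a minimal sketch of the first step, a smoothing mask can be built by sampling a 2D Gaussian on a square grid and normalizing the weights so they sum to 1. The function name `gaussian_kernel` is illustrative; the defaults mirror the 5x5 window and sigma of 3.0 passed to `cv2.GaussianBlur` in the implementation later in this section, and the size is assumed to be odd.

```python
import math


def gaussian_kernel(size=5, sigma=3.0):
    """Return a size x size Gaussian mask whose weights sum to 1 (size must be odd)."""
    half = size // 2
    kernel = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
               for dx in range(-half, half + 1)]
              for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    # Normalize so that smoothing does not change the overall brightness
    return [[v / total for v in row] for row in kernel]


kernel = gaussian_kernel()
```

A larger sigma spreads the weights more evenly over the window, which removes more noise but also blurs genuine edges more.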
Detailed description
Detailed step-by-step explanations are as follows:
- Convolve the original image with a Gaussian mask. Gaussian smoothing reduces noise so that it is not mistaken for edges in the later steps.
- Filter the smoothed image with the Sobel operator to obtain the derivatives fx and fy in the x and y directions, then compute the edge strength and angle from them. The strength is the square root of the sum of the squares, sqrt(fx^2 + fy^2), and the angle is arctan(fy / fx).
- Quantize the edge angle. A pixel only has neighbours in four directions, so the exact arctan value is not needed; the angle is assigned to one of four bins, 0, 45, 90 and 135 degrees, with the bin boundaries halfway between them. The bin can be decided from the ratio tan = fy / fx alone: 0 degrees corresponds to -0.4142 < tan < 0.4142 (0.4142 is tan 22.5°), and the other bins follow by analogy.
- Apply non-maximum suppression to the edge strength along the quantized angle to thin the edges. A large gradient alone does not pin down the edge; only the points whose gradient is a local maximum along the gradient direction are kept, and the rest are set to zero. For example, at 0 degrees the point (x, y) is kept only if its strength is the largest among (x, y), (x+1, y) and (x-1, y).
- Detect and connect edges using a dual-threshold algorithm. The usual way to reduce the number of false edge segments is to threshold the non-maximum-suppressed image N[i, j] and set every value below the threshold to zero. The dual-threshold algorithm applies two thresholds τ1 and τ2, with τ2 ≈ 2τ1, to the non-maximum-suppressed image, producing two edge images N1[i, j] (low threshold) and N2[i, j] (high threshold). Because N2[i, j] is obtained with the high threshold, it contains few false edges, but its contours have gaps (they are not closed). The algorithm therefore links the edges in N2[i, j] into contours, and whenever it reaches the end of a contour it searches the 8 neighbours of that point in N1[i, j] for an edge that can extend the contour. It keeps collecting edges from N1[i, j] in this way until the gaps in N2[i, j] are bridged.
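The angle quantization in the steps above can be sketched without calling arctan at all, by comparing the ratio fy / fx against tan 22.5° (about 0.4142) and tan 67.5° (about 2.4142). This is only an illustration: the function name `quantize_direction` is hypothetical, the 2.4142 boundary is what "by analogy" is assumed to mean here, and which diagonal is called 45 or 135 degrees depends on the direction of the image's y axis.

```python
def quantize_direction(fx, fy):
    """Map a gradient (fx, fy) to one of the bins 0, 45, 90 or 135 degrees."""
    t_low = 0.4142   # tan(22.5 degrees)
    t_high = 2.4142  # tan(67.5 degrees)
    if fx == 0:
        return 90    # ratio undefined: treat as a vertical gradient
    t = fy / fx
    if -t_low < t < t_low:
        return 0
    if t_low <= t < t_high:
        return 45    # fx and fy have the same sign
    if -t_high < t <= -t_low:
        return 135   # fx and fy have opposite signs
    return 90
```

Opposite gradients such as (1, 1) and (-1, -1) fall into the same bin, which is what non-maximum suppression needs, since it compares a pixel with both neighbours along that line.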
Code Implementation
    import math
    import numpy as np
    import cv2


    def myCanny(image, tl, th):
        # Step 1: Gaussian blur to suppress noise
        image = cv2.GaussianBlur(image, (5, 5), 3.0, sigmaY=3.0)
        print('Gaussian blur done')

        # Step 2: gradient magnitude and direction
        # Convert to float so that squaring does not overflow int16
        gx = cv2.Sobel(image, cv2.CV_16S, 1, 0).astype(np.float64)
        gy = cv2.Sobel(image, cv2.CV_16S, 0, 1).astype(np.float64)
        m = (gx * gx + gy * gy) ** 0.5
        theta = np.zeros(m.shape)
        for x in range(0, m.shape[0]):
            for y in range(0, m.shape[1]):
                if gx[x][y] == 0:
                    theta[x][y] = math.pi / 2
                else:
                    theta[x][y] = math.atan(gy[x][y] / gx[x][y])
        print('Gradient calculation done')

        # Step 3: non-maximum suppression
        # x indexes rows and y indexes columns; target holds the (row, column)
        # offsets of the two neighbours along the gradient direction
        gn = m.copy()
        for x in range(0, m.shape[0]):
            for y in range(0, m.shape[1]):
                if -math.pi / 8 <= theta[x][y] <= math.pi / 8:
                    target = (0, 1, 0, -1)
                elif math.pi / 8 <= theta[x][y] <= 3 * math.pi / 8:
                    target = (-1, -1, 1, 1)
                elif -3 * math.pi / 8 <= theta[x][y] <= -math.pi / 8:
                    target = (1, -1, -1, 1)
                else:
                    target = (1, 0, -1, 0)
                if 0 <= x + target[0] < m.shape[0] and 0 <= y + target[1] < m.shape[1]:
                    if m[x][y] < m[x + target[0]][y + target[1]]:
                        gn[x][y] = 0
                if 0 <= x + target[2] < m.shape[0] and 0 <= y + target[3] < m.shape[1]:
                    if m[x][y] < m[x + target[2]][y + target[3]]:
                        gn[x][y] = 0
        print('Non-maximum suppression done')

        # Step 4: dual-threshold edge detection
        gnh = gn.copy()  # strong edges (at or above th)
        gnl = gn.copy()  # weak edges (between tl and th) after the subtraction below
        for x in range(0, gn.shape[0]):
            for y in range(0, gn.shape[1]):
                if gn[x][y] > 255:
                    gn[x][y] = 255
                if gn[x][y] < th:
                    gnh[x][y] = 0
                if gn[x][y] < tl:
                    gnl[x][y] = 0
        gnl = gnl - gnh
        label = np.zeros(gn.shape)
        s = []  # stack for the flood fill
        q = []  # pixels of the weak component currently being explored
        connected = False
        for x in range(0, gn.shape[0]):
            for y in range(0, gn.shape[1]):
                if gnl[x][y] > 0 and label[x][y] == 0:
                    label[x][y] = 255
                    s.append((x, y))
                    q.append((x, y))
                    while s:
                        xy = s.pop()
                        # (row, column) offsets of the 8 neighbours
                        target = (-1, -1, -1, 0, -1, 1, 0, -1, 0, 1, 1, -1, 1, 0, 1, 1)
                        for i in range(0, 8):
                            tempx, tempy = xy[0] + target[i * 2], xy[1] + target[i * 2 + 1]
                            if 0 <= tempx < gn.shape[0] and 0 <= tempy < gn.shape[1]:
                                if gnl[tempx][tempy] > 0 and label[tempx][tempy] == 0:
                                    label[tempx][tempy] = 255
                                    s.append((tempx, tempy))
                                    q.append((tempx, tempy))
                                if gnh[tempx][tempy] > 0:
                                    connected = True
                    # Discard the weak component if it touches no strong edge
                    if connected == False:
                        while q:
                            xy = q.pop()
                            label[xy[0]][xy[1]] = 0
                    q = []
                    connected = False
                if gnh[x][y] > 0:
                    label[x][y] = 255
        print('Dual-threshold edge detection done')
        return label.astype(np.uint8)


    image = cv2.imread("lena.jpg", 0)
    mycanny = myCanny(image, 20, 100)
    cvcanny = cv2.Canny(image, 20, 100)
    cv2.imshow('image', image)
    cv2.imshow('myCanny', mycanny)
    cv2.imshow('OpencvCanny', cvcanny)
    k = cv2.waitKey(0)
    if k == 27:  # close the windows if ESC was pressed
        cv2.destroyAllWindows()
Experimental results
Image before processing
Image after processing
2. Otsu image segmentation method
Otsu's method (OTSU) is an algorithm for determining the threshold used to binarize an image. It picks the global threshold that maximizes the between-class variance of the two resulting pixel classes, and in that sense it is the optimal global threshold.
Advantages: the calculation is simple and fast, and because the threshold is derived from the image's own histogram, the result is largely unaffected by the overall brightness and contrast of the image.
Disadvantages: it is sensitive to image noise; it only separates a single target from the background; and when the target and the background differ greatly in size, the between-class variance as a function of the threshold may have two or more peaks, in which case the segmentation is poor.
Rationale
The target and the background of an image usually differ noticeably in gray level, and the algorithm is built on this observation. The difference between regions can be measured by variance.
Assume that a threshold TH divides the image into two regions with mean gray values m1 and m2, and that a pixel falls into the first or second region with probability p1 or p2 respectively. The mean of the entire image can then be expressed as:
$$m = p_1 m_1 + p_2 m_2$$
The between-class variance can be expressed as:
$$\sigma^2 = p_1 (m_1 - m)^2 + p_2 (m_2 - m)^2$$
Substituting the expression for m and using p1 + p2 = 1, this simplifies to:
$$\sigma^2 = p_1 p_2 (m_1 - m_2)^2$$
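The intermediate steps of this simplification are short enough to write out. The derivation below uses only the definitions already given, plus the fact that the two class probabilities sum to 1:

```latex
\begin{aligned}
m_1 - m &= m_1 - (p_1 m_1 + p_2 m_2) = (1 - p_1)\, m_1 - p_2 m_2 = p_2 (m_1 - m_2) \\
m_2 - m &= m_2 - (p_1 m_1 + p_2 m_2) = (1 - p_2)\, m_2 - p_1 m_1 = p_1 (m_2 - m_1) \\
\sigma^2 &= p_1 (m_1 - m)^2 + p_2 (m_2 - m)^2 \\
         &= p_1 p_2^2 (m_1 - m_2)^2 + p_2 p_1^2 (m_1 - m_2)^2 \\
         &= p_1 p_2 (p_1 + p_2)(m_1 - m_2)^2 \\
         &= p_1 p_2 (m_1 - m_2)^2
\end{aligned}
```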
The TH that maximizes the above formula is the required threshold.
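To make the search for TH concrete, here is a minimal sketch that evaluates the between-class variance p1·p2·(m1 − m2)² for every candidate threshold of a small, made-up histogram. The function name `otsu_threshold` and the 9-level histogram are illustrative assumptions; pixels with a gray level at or below the threshold are assumed to form the first class.

```python
def otsu_threshold(hist):
    """Return the gray level k that maximizes p1 * p2 * (m1 - m2) ** 2.

    hist[i] is the number of pixels with gray level i. Pixels with level <= k
    form class 1 and the remaining pixels form class 2.
    """
    total = sum(hist)
    weighted_total = sum(i * h for i, h in enumerate(hist))
    best_k, best_var = 0, -1.0
    count1 = 0  # number of pixels in class 1
    sum1 = 0    # sum of gray levels in class 1
    for k, h in enumerate(hist):
        count1 += h
        sum1 += k * h
        count2 = total - count1
        if count1 == 0 or count2 == 0:
            continue  # one class is empty, so the class means are undefined
        p1, p2 = count1 / total, count2 / total
        m1 = sum1 / count1
        m2 = (weighted_total - sum1) / count2
        var_between = p1 * p2 * (m1 - m2) ** 2
        if var_between > best_var:  # on ties, the lowest threshold is kept
            best_k, best_var = k, var_between
    return best_k


# Hypothetical histogram with two clusters: gray levels 0-2 and 6-8
hist = [5, 8, 6, 0, 0, 0, 7, 9, 4]
threshold = otsu_threshold(hist)
```

Every threshold inside the empty gap between the two clusters gives the same variance. This sketch keeps the lowest of them, which is a design choice; averaging the tied values is an equally reasonable alternative.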
Code Implementation
    import numpy as np
    import cv2


    def myOtsu(image):
        # Histogram of gray levels
        n = np.zeros(256, dtype=int)
        for x in range(image.shape[0]):
            for y in range(image.shape[1]):
                n[image[x][y]] += 1
        p = n / sum(n)  # probability of each gray level

        # p1[k]: cumulative probability up to level k
        # m[k]: cumulative mean up to level k
        p1 = np.zeros(256)
        m = np.zeros(256)
        for k in range(0, 256):
            p1[k] = sum(p[:k + 1])
            if k > 0:
                m[k] = m[k - 1] + k * p[k]
        mg = m[255]  # global mean

        # Between-class variance for every candidate threshold
        varB = np.zeros(256)
        for k in range(0, 256):
            if p1[k] > 0 and p1[k] < 1:
                varB[k] = ((mg * p1[k] - m[k]) ** 2) / (p1[k] * (1 - p1[k]))

        # If several thresholds reach the maximum, use their average
        resultList = []
        for k in range(0, 256):
            if varB[k] == np.amax(varB):
                resultList.append(k)
        result = np.average(resultList).astype(np.uint8)
        print(result)

        # Binarize: pixels above the threshold become white
        newImage = np.zeros(image.shape, dtype=np.uint8)
        for x in range(image.shape[0]):
            for y in range(image.shape[1]):
                if image[x][y] > result:
                    newImage[x][y] = 255
        return newImage


    image = cv2.imread('3.jpg', 0)
    new = myOtsu(image)
    cv2.imshow('image', new)
    k = cv2.waitKey(0)
    if k == 27:  # close the window if ESC was pressed
        cv2.destroyAllWindows()
3. OpenCV face detection
Topic: photo-based face detection using OpenCV.
Background
Haar-like features
In plain terms, Haar-like features describe facial characteristics. A Haar feature value reflects how the gray level changes between adjacent rectangular regions of the image. Some characteristics of a face can be described by simple rectangle features: for example, the eyes are darker than the cheeks, the sides of the bridge of the nose are darker than the bridge itself, and the mouth is darker than its surroundings.
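A rectangle feature of this kind can be sketched as the difference between the pixel sums of two adjacent rectangles, computed with an integral image (summed-area table) so that each rectangle sum costs four lookups. The function names and the 4x4 toy patch below are illustrative assumptions, not part of OpenCV's API or of the trained cascade files.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Lower rectangle minus upper rectangle of a (2 * height) x width window."""
    upper = rect_sum(ii, top, left, height, width)
    lower = rect_sum(ii, top + height, left, height, width)
    return lower - upper


# Toy 4x4 patch: a dark band (eyes) above a brighter band (cheeks)
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 2, 4)
```

A large positive value means the lower rectangle is much brighter than the upper one, which is the "eyes darker than cheeks" pattern; on a uniform patch the feature is zero.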
OpenCV API
The APIs used in this experiment include common image reading, grayscale conversion, image display, simple image editing, etc.; the description is as follows:
Load image
The target path of the image needs to be provided.
    import cv2
    image = cv2.imread(imagepath)
Grayscale conversion
Grayscale conversion turns the color image into a single-channel grayscale image, which reduces the amount of computation.
    import cv2
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
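For intuition, the conversion for one pixel can be sketched as a weighted sum of its three channels. The weights 0.299, 0.587 and 0.114 are the ones OpenCV documents for RGB-to-gray conversion; the helper name `bgr_to_gray` and the rounding are illustrative assumptions, and the library's own rounding may differ slightly.

```python
def bgr_to_gray(pixel):
    """Convert one (B, G, R) pixel to a gray level with the standard luma weights."""
    b, g, r = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


gray_levels = [bgr_to_gray(p) for p in [(0, 0, 0), (255, 255, 255), (0, 255, 0)]]
```

Green contributes the most and blue the least, so a pure green pixel maps to a much brighter gray than a pure blue one.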
Drawing
OpenCV can edit the image arbitrarily; you can use the following function to draw a rectangle on the image:
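    import cv2
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
The two points are the top-left and bottom-right corners of the rectangle, (0, 255, 0) is the color in BGR order (green), and the last parameter specifies the line thickness in pixels.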
Show images
The processed image is either displayed or saved to a physical storage medium.
    import cv2
    cv2.imshow("Image Title", image)
Load the trained classifier
The training data is essentially a description of facial features. It has already been trained, so once OpenCV loads it, it can look for those features in a picture and detect faces.
    import cv2
    face_cascade = cv2.CascadeClassifier(r'./haarcascade_frontalface_default.xml')
This training data is open source and can be used directly.
Training data reference address: https://github.com/opencv/opencv/tree/master/data/haarcascades
Face detection
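Once the classifier is loaded, you can use it to detect faces in new pictures:
    import cv2
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor=1.15,
        minNeighbors=5,
        minSize=(5, 5),
        flags=cv2.CASCADE_SCALE_IMAGE  # named cv2.cv.CV_HAAR_SCALE_IMAGE in OpenCV 2.x
    )
You can adjust the detection accuracy by modifying the parameter values of this function: scaleFactor sets how much the image is shrunk between successive scales, minNeighbors sets how many overlapping candidate rectangles are needed to keep a detection, and minSize is the smallest face size considered.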
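As a rough illustration of why scaleFactor trades speed against accuracy, the sketch below counts how many times an image can be shrunk by that factor before its shorter side drops below a detection window. This is a simplified model for intuition, not OpenCV's internal logic; the function name `pyramid_levels`, the 480-pixel side and the 30-pixel window are all assumptions.

```python
def pyramid_levels(short_side, window, scale_factor):
    """Count the scales at which a window of the given size still fits the image."""
    levels = 0
    size = float(short_side)
    while size >= window:
        levels += 1
        size /= scale_factor  # shrink the image for the next scale
    return levels


fine = pyramid_levels(480, 30, 1.1)
coarse = pyramid_levels(480, 30, 1.2)
```

Under this model a factor of 1.1 examines about twice as many scales as 1.2, so it is slower but less likely to skip over a face whose size falls between two scales.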
After completing the experimental process through the above API, the obtained data can be post-processed and the results can be visualized.
Code Implementation
Image based
    import cv2

    img = cv2.imread("test.jpg")
    color = (0, 255, 0)  # box color in BGR order (green)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    classfier = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
    faceRects = classfier.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3, minSize=(32, 32))
    if len(faceRects) > 0:
        for faceRect in faceRects:
            x, y, w, h = faceRect
            # Draw the box with a 10-pixel margin around the detected face
            cv2.rectangle(img, (x - 10, y - 10), (x + w + 10, y + h + 10), color, 3)
    cv2.imwrite('output.jpg', img)
    cv2.imshow("Find Faces!", img)
    cv2.waitKey(0)
Video based
    import cv2
    from time import sleep

    cascPath = "haarcascade_frontalface_alt2.xml"
    faceCascade = cv2.CascadeClassifier(cascPath)
    video_capture = cv2.VideoCapture(0)

    while True:
        if not video_capture.isOpened():
            print('Unable to load camera.')
            sleep(5)
        ret, frame = video_capture.read()
        if not ret:
            break  # no frame could be read, so stop instead of crashing below
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = faceCascade.detectMultiScale(
            gray,
            scaleFactor=1.1,
            minNeighbors=5,
            minSize=(30, 30),
            # flags=cv2.CASCADE_SCALE_IMAGE
        )
        # Draw a green box around every detected face
        for (x, y, w, h) in faces:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow('Video', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
            break

    video_capture.release()
    cv2.destroyAllWindows()