Comprehensive experiment of digital image processing based on Python

Comprehensive Experiment Report on Digital Image Processing

1. Canny edge detection

We can extract edges by convolving the image directly with a first-order or second-order derivative operator, but these methods have shortcomings: the detection accuracy is limited, and the extracted edges are not a single pixel wide. Canny edge detection is a more advanced method that addresses both problems.

The Canny edge detection algorithm was designed around the following three goals:

  1. Low error rate
  2. Good localization of edge points
  3. A single response per edge point

The Canny algorithm does not simply filter the image with a single convolution kernel; it is a sequence of steps, which can be roughly divided into the following four:

  1. Gaussian filtering (reduce noise, prevent false edges)
  2. Calculate the gradient and gradient direction of an image
  3. Apply non-maximum suppression to a gradient image
  4. Detect and connect edges using dual thresholding and connectivity analysis

The first two steps amount to a simple edge detection algorithm, the third step thins the edges to a single-pixel width, and the fourth step removes false edges.

Detailed description

Detailed step-by-step explanations are as follows:

  1. Convolve the original image data with a Gaussian mask. Use Gaussian smoothing to denoise the image to facilitate subsequent processing
  2. Filter with the Sobel operator to obtain the derivatives fx and fy in the x and y directions, then compute the edge strength and angle from them. The strength is the square root of the sum of their squares, sqrt(fx^2 + fy^2), and the angle is arctan(fy / fx).
  3. Quantize the edge angle. Working with the exact arctan value is unnecessarily complicated, so the angle is quantized in steps of 45 degrees to 0, 45, 90 or 135 degrees. For example, the angle maps to 0 degrees when -0.4142 < tan < 0.4142 (0.4142 is approximately tan 22.5°); the other directions follow by analogy.
  4. Apply non-maximum suppression to the edge strength according to the edge angle, which thins the edges of the image. A large gradient alone is not enough to mark an edge: only points whose gradient is a local maximum along the gradient direction are kept, and all others are suppressed. For example, at 0 degrees the point (x, y) is kept only if its strength is the maximum among (x, y), (x+1, y) and (x-1, y).
  5. Detect and connect edges with the double-threshold algorithm. A typical way to reduce the number of false edge segments is to threshold the non-maximum-suppressed image N[i, j], setting all values below the threshold to zero. The double-threshold algorithm applies two thresholds τ1 and τ2, with τ2 ≈ 2τ1, to obtain two thresholded edge images N1[i, j] (low threshold) and N2[i, j] (high threshold). Because N2[i, j] is obtained with the high threshold, it contains few false edges, but its contours have gaps (they are not closed). The algorithm links the edges in N2[i, j] into contours; when it reaches the end of a contour, it searches the 8 neighbors of that point in N1[i, j] for edge pixels that can extend the contour. It keeps collecting edges from N1[i, j] in this way until the gaps in N2[i, j] are bridged.
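The angle quantization in step 3 can be done without calling arctan at all, by comparing the ratio fy / fx against tan 22.5° ≈ 0.4142 and tan 67.5° ≈ 2.4142. The following is a minimal sketch of that idea; the function name quantize_direction is hypothetical and is not part of the implementation in this report, and angles are measured from the x axis toward the y axis, modulo 180 degrees.

```python
import math

TAN_22_5 = math.tan(math.radians(22.5))   # about 0.4142
TAN_67_5 = math.tan(math.radians(67.5))   # about 2.4142

def quantize_direction(fx, fy):
    """Return the gradient direction quantized to 0, 45, 90 or 135 degrees.

    fx and fy are the horizontal and vertical derivative responses at one
    pixel. Directions are taken modulo 180 degrees, so (fx, fy) and
    (-fx, -fy) give the same result.
    """
    if fx == 0:
        # Vertical gradient (or no gradient at all): treat as 90 degrees.
        return 90
    t = fy / fx  # tangent of the gradient angle
    if -TAN_22_5 < t < TAN_22_5:
        return 0
    if TAN_22_5 <= t < TAN_67_5:
        return 45
    if -TAN_67_5 < t <= -TAN_22_5:
        return 135
    return 90
```

Each quantized direction then selects which pair of neighbors is compared during non-maximum suppression in step 4.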

Code Implementation

import math
import numpy as np
import cv2

def myCanny(image, tl, th):

    # Gaussian blur to suppress noise
    image = cv2.GaussianBlur(image, (5,5), 3.0, sigmaY=3.0)
    print('Gauss Blur done')
    # Gradient calculation; float output avoids int16 overflow when squaring
    gx = cv2.Sobel(image,cv2.CV_64F,1,0)
    gy = cv2.Sobel(image,cv2.CV_64F,0,1)
    m = (gx*gx + gy*gy) ** 0.5
    theta = np.zeros(m.shape)
    for x in range(0, m.shape[0]):
        for y in range(0, m.shape[1]):
            if(gx[x][y] == 0):
                theta[x][y] = math.pi/2
            else:
                theta[x][y] = math.atan(gy[x][y]/gx[x][y])
    print('Gradient Calculation done')
    # Non-maximum suppression
    gn = m.copy()
    for x in range(0, m.shape[0]):
        for y in range(0, m.shape[1]):
            if math.pi * -1/8 <= theta[x][y] and theta[x][y] <= math.pi * 1/8:
                target = (0,1,0,-1)
            elif math.pi * 1/8 <= theta[x][y] and theta[x][y] <= math.pi * 3/8:
                target = (-1,-1,1,1)
            elif math.pi * -3/8 <= theta[x][y] and theta[x][y] <= math.pi * -1/8:
                target = (1,-1,-1,1)
            else:
                target = (1,0,-1,0)
            if 0 <= x + target[0] and x + target[0] < m.shape[0] and 0 <= y + target[1] and y + target[1] < m.shape[1]:
                if m[x][y] < m[x + target[0]][y + target[1]]:
                    gn[x][y] = 0
            if 0 <= x + target[2] and x + target[2] < m.shape[0] and 0 <= y + target[3] and y + target[3] < m.shape[1]:
                if m[x][y] < m[x + target[2]][y + target[3]]:
                    gn[x][y] = 0
    print('Non-Maximum Suppression done')
    # Dual-threshold edge detection
    gnh = gn.copy()   # strong edges (values >= th)
    gnl = gn.copy()   # weak edges (values >= tl); strong ones are subtracted below
    for x in range(0, gn.shape[0]):
        for y in range(0, gn.shape[1]):
            if gn[x][y] > 255:
                gn[x][y] = 255
            if gn[x][y] <th:
                gnh[x][y] = 0
            if gn[x][y] < tl:
                gnl[x][y] = 0
    gnl = gnl - gnh
    label = np.zeros(gn.shape)
    s = []
    q = []
    connected = False
    for x in range(0, gn.shape[0]):
        for y in range(0, gn.shape[1]):
            if gnl[x][y] > 0 and label[x][y] == 0:
                label[x][y] = 255
                s.append((x,y))
                q.append((x,y))
                while s:
                    xy = s.pop()
                    target = (-1,-1,-1,0,-1,1,0,-1,0,1,1,-1,1,0,1,1)
                    for i in range(0, 8):
                        tempx, tempy = xy[0] + target[i*2], xy[1] + target[i*2 + 1]
                        if 0 <= tempx and tempx < gn.shape[0] and 0 <= tempy and tempy < gn.shape[1]:
                            if gnl[tempx][tempy] > 0 and label[tempx][tempy] == 0:
                                label[tempx][tempy] = 255
                                s.append((tempx,tempy))
                                q.append((tempx,tempy))
                            if gnh[tempx][tempy] > 0:
                                connected = True
                if connected == False:
                    while q:
                        xy = q.pop()
                        label[xy[0]][xy[1]] = 0
                q = []
                connected = False
            if gnh[x][y] > 0:
                label[x][y] = 255
    print('Dual-threshold edge detection done')
    return label.astype(np.uint8)
    
    

image = cv2.imread("lena.jpg", 0)
mycanny = myCanny(image, 20, 100)
cvcanny = cv2.Canny(image, 20,100)
cv2.imshow('image', image)
cv2.imshow('myCanny', mycanny)
cv2.imshow('OpencvCanny', cvcanny)
k = cv2.waitKey(0)
if k == 27: # wait for ESC key to exit
    cv2.destroyAllWindows()


Experimental results


Image before processing


Image after processing

2. Otsu image segmentation method

The Otsu method (OTSU) is an algorithm for determining the threshold used to binarize an image. It selects the global threshold that maximizes the between-class variance, and in that sense it gives the best global threshold for the image.

Advantages: the calculation is simple and fast, and it is not affected by the brightness and contrast of the image.

Disadvantages: it is sensitive to image noise; it only handles single-target segmentation; and when the target and the background differ greatly in size, the between-class variance function may become bimodal or multimodal, in which case the result is poor.

Rationale

Different regions of an image differ considerably from each other, and the algorithm is built on this observation. The difference between regions can be measured by variance.

Assume a threshold TH divides the image into two regions with mean values m1 and m2, and that a pixel falls into the two regions with probabilities p1 and p2 respectively. The mean value of the entire image can then be expressed as:

$$ m = p_1 m_1 + p_2 m_2 $$

The between-class variance can be expressed as:

$$ \sigma^2 = p_1 (m_1 - m)^2 + p_2 (m_2 - m)^2 $$

Simplified:

$$ \sigma^2 = p_1 p_2 (m_1 - m_2)^2 $$

The TH that maximizes the above formula is the required threshold.
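The simplification of the between-class variance is not spelled out in the text. A short derivation, assuming only that every pixel belongs to one of the two regions (so that p_1 + p_2 = 1) and the definition of m given earlier, could go as follows:

```latex
\begin{aligned}
m_1 - m &= m_1 - (p_1 m_1 + p_2 m_2) = (1 - p_1) m_1 - p_2 m_2 = p_2 (m_1 - m_2) \\
m_2 - m &= m_2 - (p_1 m_1 + p_2 m_2) = (1 - p_2) m_2 - p_1 m_1 = p_1 (m_2 - m_1) \\
\sigma^2 &= p_1 (m_1 - m)^2 + p_2 (m_2 - m)^2 \\
         &= p_1 p_2^2 (m_1 - m_2)^2 + p_2 p_1^2 (m_1 - m_2)^2 \\
         &= p_1 p_2 (p_1 + p_2)(m_1 - m_2)^2 = p_1 p_2 (m_1 - m_2)^2
\end{aligned}
```

The simplified form shows that the variance is large when the two regions have well-separated means and neither region is negligibly small.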

Code Implementation

import math
import numpy as np
import cv2

def myOtsu(image):
    # Histogram of gray levels 0..255
    n = np.zeros(256, dtype=int)
    for x in range(image.shape[0]):
        for y in range(image.shape[1]):
            n[image[x][y]] += 1
    p = n / sum(n)          # normalized histogram
    p1 = np.zeros(256)      # cumulative probability of levels <= k
    m = np.zeros(256)       # cumulative mean up to level k
    for k in range(0, 256):
        p1[k] = sum(p[:k + 1])
        if k > 0:
            m[k] = m[k-1] + k * p[k]
    mg = m[255]             # global mean
    # Between-class variance for every candidate threshold k
    varB = np.zeros(256)
    for k in range(0, 256):
        if p1[k] > 0 and p1[k] < 1:
            varB[k] = ((mg * p1[k] - m[k]) ** 2) / (p1[k] * (1 - p1[k]))
    # If several levels attain the maximum, use their average
    resultList = []
    for k in range(0, 256):
        if varB[k] == np.amax(varB):
            resultList.append(k)
    result = np.average(resultList).astype(np.uint8)
    print(result)
    # Binarize: pixels above the threshold become white
    newImage = np.zeros(image.shape, dtype=np.uint8)
    for x in range(image.shape[0]):
        for y in range(image.shape[1]):
            if image[x][y] > result:
                newImage[x][y] = 255
    return newImage

image = cv2.imread('3.jpg',0)
new = myOtsu(image)
cv2.imshow('image', new)
k = cv2.waitKey(0)
if k == 27: # wait for ESC key to exit
    cv2.destroyAllWindows()
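The per-pixel Python loops in myOtsu are slow on large images. As a possible alternative, the same threshold can be computed with vectorized NumPy operations. This is a sketch under the assumption that the input is an 8-bit grayscale array; the name otsu_threshold is hypothetical. The expression in the code, (mg * p1 - m)^2 / (p1 * (1 - p1)), equals the between-class variance p1 p2 (m1 - m2)^2 from the Rationale section, because the cumulative mean m(k) is p1 m1 and the global mean mg is p1 m1 + p2 m2.

```python
import numpy as np

def otsu_threshold(image):
    """Return the Otsu threshold of an 8-bit grayscale image."""
    # Normalized histogram p[k] for gray levels 0..255.
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    levels = np.arange(256)
    p1 = np.cumsum(p)            # probability of the class with levels <= k
    m = np.cumsum(levels * p)    # cumulative first moment up to level k
    mg = m[-1]                   # global mean
    # Between-class variance; left at 0 where one of the classes is empty.
    denom = p1 * (1.0 - p1)
    var_b = np.zeros(256)
    valid = denom > 0
    var_b[valid] = (mg * p1[valid] - m[valid]) ** 2 / denom[valid]
    # Average all levels that attain the maximum, truncating like the
    # uint8 conversion in the loop version.
    best = np.flatnonzero(var_b == var_b.max())
    return int(best.mean())
```

With the threshold t in hand, the binarization itself can also be done without loops, for example with np.where(image > t, 255, 0).astype(np.uint8).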

3. OpenCV face detection

Comprehensive experiment report on digital image processing. Topic: Photo-Based Face Detection Using OpenCV.

Background

Haar-like

In plain terms, Haar-like features describe facial characteristics. A Haar feature value reflects the grayscale variation within a region of the image. For example, some features of the face can be described by simple rectangular features: the eyes are darker than the cheeks, the sides of the bridge of the nose are darker than the bridge itself, and the mouth is darker than its surroundings.
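To make the idea of a rectangular feature concrete, here is a small sketch of a two-rectangle Haar-like feature computed with an integral image (summed-area table), so that each rectangle sum costs four lookups. The function names are hypothetical, and this illustrates the general technique only; it is not OpenCV's internal implementation.

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with an extra zero row and column at the top/left."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_vertical_feature(ii, top, left, height, width):
    """Lower-half sum minus upper-half sum of a (2 * height) x width window.

    A large positive value means the upper half is darker than the lower
    half, for example an eye region sitting above a brighter cheek region.
    """
    upper = rect_sum(ii, top, left, height, width)
    lower = rect_sum(ii, top + height, left, height, width)
    return lower - upper
```

A cascade classifier evaluates many such features at different positions and sizes and combines their responses to decide whether a window contains a face.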

OpenCV API

The APIs used in this experiment include common image reading, grayscale conversion, image display, simple image editing, etc.; the description is as follows:

Load image

The target path of the image needs to be provided.

import cv2
image = cv2.imread(imagepath)

Grayscale conversion

Converting the image to grayscale reduces the amount of computation.

import cv2
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)

Drawing

OpenCV can draw on the image; for example, the following function draws a rectangle of width w and height h with its top-left corner at (x, y):

import cv2
cv2.rectangle(image,(x,y),(x+w,y+h),(0,255,0),2)

The last parameter of the function specifies the line thickness in pixels.

Show images

The processed image is either displayed or saved to a physical storage medium.

import cv2
cv2.imshow("Image Title",image)

Get training data

The training data is essentially a description of facial features. Once OpenCV has loaded this trained classifier, it can look for those features in a picture to detect faces.

import cv2
face_cascade = cv2.CascadeClassifier(r'./haarcascade_frontalface_default.xml')

This training data is open source and can be used directly.

Training data reference address: https://github.com/opencv/opencv/tree/master/data/haarcascades

Face detection

After loading the trained classifier, you can use OpenCV to detect faces in new pictures.

import cv2

faces = face_cascade.detectMultiScale(
   gray,
   scaleFactor = 1.15,
   minNeighbors = 5,
   minSize = (5,5),
   flags = cv2.CASCADE_SCALE_IMAGE  # cv2.cv.CV_HAAR_SCALE_IMAGE in OpenCV 2.x
)

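You can tune the detection results by changing the parameters of this function: scaleFactor controls how much the image is shrunk between successive scales, minNeighbors is the number of neighboring detections a candidate needs in order to be kept, and minSize is the smallest object size considered.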
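To get a feel for the scaleFactor parameter, note that the detector searches the image at a series of scales, each one scaleFactor times the previous. The sketch below is a simplified illustration of that relationship, not OpenCV's actual implementation; the function name pyramid_scales and the 24-pixel base window in the usage note are assumptions made for this example.

```python
def pyramid_scales(image_size, window_size, scale_factor):
    """List the scale multipliers at which the detection window fits the image.

    image_size is the smaller image dimension in pixels, window_size the
    base size of the detection window, and scale_factor the growth per step.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    scales = []
    scale = 1.0
    while window_size * scale <= image_size:
        scales.append(scale)
        scale *= scale_factor
    return scales
```

For a 240-pixel image and a 24-pixel window, len(pyramid_scales(240, 24, 1.1)) is much larger than len(pyramid_scales(240, 24, 1.3)): a smaller scaleFactor examines more scales, which tends to find more faces at the cost of more computation.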

After completing the experimental process through the above API, the obtained data can be post-processed and the results can be visualized.

Code Implementation

Image based

import cv2

img = cv2.imread("test.jpg")
color = (0, 255, 0)

# The cascade classifier works on grayscale images
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

classfier = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")

faceRects = classfier.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3, minSize=(32, 32))
if len(faceRects) > 0:
    for faceRect in faceRects:
        x, y, w, h = faceRect
        # Draw the box with a 10-pixel margin around the detected face
        cv2.rectangle(img, (x - 10, y - 10), (x + w + 10, y + h + 10), color, 3)

cv2.imwrite('output.jpg',img)
cv2.imshow("Find Faces!",img)
cv2.waitKey(0)

Video based

import cv2
from time import sleep

cascPath = "haarcascade_frontalface_alt2.xml"
faceCascade = cv2.CascadeClassifier(cascPath)

video_capture = cv2.VideoCapture(0)

while True:
    if not video_capture.isOpened():
        print('Unable to load camera.')
        sleep(5)
        continue  # retry instead of reading from a closed camera

    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30),
        # flags=cv2.CASCADE_SCALE_IMAGE
    )

    # Mark every detected face on the frame
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imshow('Video', frame)

    # Press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()

Resources

Size: 25.2MB
Resource download: https://download.csdn.net/download/s1t16/87484814