Application of OpenCV’s HSV color space in color recognition in unmanned vehicles

RGB is the three-primary color space and the one most people know best: any color you see can be made by mixing the three primary colors. However, color-based image processing is usually carried out in HSV space. HSV (Hue, Saturation, Value) is a color space built around the intuitive characteristics of color, and it is also known as the hexcone model.

The value range of the 8-bit HSV color space in OpenCV is H: [0, 179], S: [0, 255], V: [0, 255]. For hue H, small values are close to red, and as the value rises the hue moves through yellow, green, and cyan to blue (around 120); beyond that it passes through purple and wraps back to red at the top of the range. Describing a color by its hue is more precise than simply calling it red. The smaller the saturation S, the paler the color, and the larger it is, the more vivid; the smaller the value V, the darker the color, and the larger it is, the brighter. Notice the color change in the picture above!
The reason for choosing HSV is that the hue H alone largely determines a color; combined with thresholds on saturation and brightness, it is enough to decide whether a pixel belongs to that color. In RGB, by contrast, a color is spread across three components, and you have to judge the contribution ratio of each one. The recognition range in HSV space is wider, and it is more convenient to use.
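
As a quick check of these value ranges, the following sketch converts a few RGB colors to OpenCV's 8-bit HSV scale using only the standard library. The helper name rgb_to_opencv_hsv is hypothetical, and the scaling (colorsys outputs in 0 to 1, multiplied by 180, 255, and 255) is an assumption that mirrors OpenCV's 8-bit convention rather than a call into OpenCV itself.

```python
import colorsys

def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to OpenCV-style 8-bit HSV (hypothetical helper)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # colorsys returns h, s, v in [0, 1]; OpenCV stores H as degrees / 2.
    return (round(h * 180) % 180, round(s * 255), round(v * 255))

for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)), ("blue", (0, 0, 255))]:
    print(name, rgb_to_opencv_hsv(*rgb))
```

Pure blue lands at H = 120, which sits inside the blue hue band of 100 to 124 used later in this article.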

1. Example demonstration

Let’s look at an example: take a pack of "Wide and Narrow" brand cigarettes with blue outer packaging, then identify and track its color.

1.1, color recognition and tracking

Since my desktop has no camera installed, I use the camera on the unmanned vehicle here. Getting the camera video works a little differently from plain OpenCV, but it is similar; after all, it is built on OpenCV.

from jetbotmini import Camera
from jetbotmini import bgr8_to_jpeg
import cv2
import numpy as np
import traitlets
import ipywidgets.widgets as widgets
from IPython.display import display

# Target color, set to blue array here
color_lower = np.array([100,43,46])
color_upper = np.array([124, 255, 255])
# Camera instance
camera = Camera.instance(width=720, height=720)
# Display control (a video is just a continuous sequence of pictures)
color_image = widgets.Image(format='jpeg', width=500, height=400)
display(color_image)

# Recognize the color in real time and feed it back to the above control
while 1:
    frame = camera.value # (720, 720, 3) (H,W,C)
    frame = cv2.resize(frame, (400, 400)) # (400, 400, 3)
    frame = cv2.GaussianBlur(frame,(5,5),0) # Gaussian filter: (5, 5) is the kernel width and height; sigma 0 means the standard deviation is computed from the kernel size
    hsv = cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)#Convert BGR to HSV
    mask=cv2.inRange(hsv,color_lower,color_upper)
    mask=cv2.erode(mask,None,iterations=2) # Erode to remove small specks and ragged edges
    mask=cv2.dilate(mask,None,iterations=2) # Dilate to restore the size of the remaining region
    mask=cv2.GaussianBlur(mask,(3,3),0)
    cnts=cv2.findContours(mask.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)[-2] # Contours; [-2] works for both OpenCV 3 and OpenCV 4 return values
    if len(cnts)>0:
        cnt = max(cnts,key=cv2.contourArea) # Contour with the largest area
        (color_x,color_y),color_radius=cv2.minEnclosingCircle(cnt) # Center and radius of the minimum enclosing circle
        if color_radius > 10:
            # circle label
            cv2.circle(frame,(int(color_x),int(color_y)),int(color_radius),(255,0,255),2)
    color_image.value = bgr8_to_jpeg(frame) # convert it into a picture and pass it to the Image component

You can see the pink circle following the blue object in the tracking screen. The camera and the widgets, which include many interactive web components, are not covered here. If you are interested, you can refer to: the real-time captured pictures of the camera of the unmanned vehicle and the related operations of the widgets

The meaning of this code is fairly clear. Instantiate the camera, then process each frame: convert it from BGR to HSV (the image read here is BGR, not RGB), use cv2.inRange with the HSV bounds of the target color to build a mask, clean the mask up with erosion and dilation, find the contours in it, and draw a circle (the minimum enclosing circle) around the largest contour as a label. If recognition is poor even when the ambient light is sufficient, we can manually adjust the parameters used inside the endless loop.

1.2, cv2.inRange

Next, some of the functions used in the code above are explained.
The mask is built with mask=cv2.inRange(hsv,color_lower,color_upper). A pixel becomes 255 (white) when every channel lies between the lower and upper boundary arrays, inclusive, and 0 (black) when any channel falls below the lower boundary or above the upper boundary. The result is a single-channel image. Let’s look at the code to understand intuitively:

import cv2
import numpy as np
import matplotlib.pyplot as plt

img_cv2 = cv2.imread('test.jpg')
hsv = cv2.cvtColor(img_cv2, cv2.COLOR_BGR2HSV)
lowerb = np.array([20, 20, 20])
upperb = np.array([200, 200, 200])

# Black and white single channel (H,W)
mask = cv2.inRange(hsv, lowerb, upperb)
cv2.imshow('Display', mask)
cv2.waitKey(0)
cv2.destroyAllWindows()

plt.subplot(1,2,1); plt.imshow(img_cv2, aspect='auto');plt.axis('off');plt.title('BGR')
plt.subplot(1,2,2); plt.imshow(mask,aspect='auto');plt.axis('off');plt.title('mask')
plt.show()

BGR and mask as shown below:

1.3, cv2.erode and cv2.dilate

Erosion is an image morphology operation, and it does what the name suggests: it erodes the edges of bright regions. We can check the help of this function:
erode(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst
The source image src and the target image dst have the same size. The kernel determines how strongly the image is eroded; if None is passed, a default 3×3 rectangular kernel is used. A test in code:

import cv2
import numpy as np

image = cv2.imread('test.jpg')
kernel = np.ones((5, 5), np.uint8)
image = cv2.erode(image, None)
cv2.imshow('erode', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Here, in image = cv2.erode(image, None), the second parameter (the kernel) can be None or an explicit kernel. Change the call to image = cv2.erode(image, kernel) and modify the size of the kernel to see the effect:

In essence, a kernel slides over the image, much like a convolution. For a binary image, an output pixel is 1 only if every pixel under the kernel is 1; otherwise it is set to 0. As the comment in the code says, this removes noise and small burrs along edges. The cv2.dilate function can be regarded as the opposite of erosion: after erosion the black part expands and the white part shrinks, while dilation expands the white part and shrinks the black part.

1.4, cv2.GaussianBlur

Gaussian blur is also used for denoising, and it works mainly against Gaussian noise. Let’s add Gaussian noise to a picture and then use this function to denoise it:

import cv2 as cv
import numpy as np
 
def myShow(name,img):
    cv.imshow(name,img)
    cv.waitKey(0)
    cv.destroyAllWindows()
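# Add Gaussian noise; val is the noise variance on the [0, 1] scale
def addGauss(img, mean=0, val=0.01):
    img = img / 255
    gauss = np.random.normal(mean,val**0.5,img.shape) # standard deviation = sqrt(variance)
    img = img + gauss
    return np.clip(img, 0, 1) # keep values in the displayable [0, 1] range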

img = cv.imread('gauss.png')
img1 = addGauss(img)
myShow('img1',img1)

img2 = cv.GaussianBlur(img1,(3,3),0)
myShow('img2',img2)

Here I put the three images together: the original, the one with Gaussian noise added, and the one after noise reduction, as shown below:

1.5, cv2.findContours and drawContours

The contour-finding function findContours(image, mode, method[, contours[, hierarchy[, offset]]]) -> contours, hierarchy detects contours in a binary image, so we first convert the picture to grayscale and then turn the grayscale image into a binary image with a threshold.
Finally, we draw the contours using drawContours(image, contours, contourIdx, color[, thickness[, lineType[, hierarchy[, maxLevel[, offset]]]]]) -> image
Let’s look at the implementation of the code:

import cv2
import numpy as np

image = cv2.imread('test.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
ret, binary = cv2.threshold(gray,0,255,cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
image = cv2.drawContours(image,contours,-1,(255,0,0),2)

#cv2.imshow('binary',binary)
cv2.imshow('contours',image)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.RETR_EXTERNAL: Detect only the outermost contours, ignoring any structure inside them.
cv2.RETR_TREE: Retrieve all contours and reconstruct the full hierarchy of nested contours (the mode used in the code above).
cv2.CHAIN_APPROX_SIMPLE: Compress horizontal, vertical, and diagonal segments, keeping only their end points. For example, a rectangular outline needs only 4 points to store the contour information.
cv2.CHAIN_APPROX_NONE: Store all contour points; the pixel positions of two adjacent points differ by no more than 1.

1.6, HSV color value

Blue is used here as an example. What if you want other colors? Their HSV value ranges are shown in the figure below:

2. OpenCV knowledge points

A lot of OpenCV functionality is used here. Let’s get familiar with the commonly used operations, such as reading pictures and converting them to grayscale.

2.1, read and display pictures

import cv2

img = cv2.imread('test.jpg', 0)
cv2.imshow("image",img)
# If you comment out the wait button and release window resources below, the state of "window not responding" will appear, and the picture cannot be displayed normally
cv2.waitKey()
cv2.destroyAllWindows()

Of course, if this visual library is not installed, an error will be reported: ModuleNotFoundError: No module named ‘cv2’

Installation command: pip install opencv-python -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com

When reading an image with cv2.imread(), a second parameter of 0 selects grayscale mode; cv2.IMREAD_GRAYSCALE can also be used instead of 0.
Other values are as follows:
1 means color image, cv2.IMREAD_COLOR
-1 means load the image unchanged, including any alpha channel, cv2.IMREAD_UNCHANGED

2.2, read and save pictures

img = cv2.imread('test.jpg', 0)
cv2.imwrite('new.jpg', img)

In this way, the image read in grayscale mode is saved as a new image by cv2.imwrite.

2.3, read and display video

Similarly, let’s look at video. Put simply, playing a video means reading its frames (pictures) one after another.

import cv2

cap = cv2.VideoCapture('test.mp4')
print(cap.read()[1].shape) # (1080, 1920, 3)

This reads one frame of the video; the return value is a tuple (ret, frame). What about reading the entire video? Let’s take a look:

import cv2

cap = cv2.VideoCapture('test.mp4')
if not cap.isOpened():
    print('Cannot open video file')
else:
    fps = cap.get(cv2.CAP_PROP_FPS) # cv2.CAP_PROP_FPS has the value 5, so cap.get(5) also works
    print('Frame rate:', fps,'FPS') # Frame rate: 29.97002997002997 FPS
    f_count = cap.get(cv2.CAP_PROP_FRAME_COUNT) # cv2.CAP_PROP_FRAME_COUNT has the value 7
    print('Total number of frames: ', f_count) #Total number of frames: 3394.0

while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        cv2.imshow('Frame', frame)
        key = cv2.waitKey(20)
        # Press q to exit
        if key == ord('q'):
            break
    else:
        break

# Pay attention to release resources
cap.release()
cv2.destroyAllWindows()

In this way, the video is displayed just as a picture is. Here cv2.waitKey(20) waits 20 milliseconds between consecutive frames; the larger the value, the longer the wait and the slower the video plays.
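
Instead of hard-coding the delay, it can be derived from the frame rate reported by the file. A minimal sketch, using a hypothetical helper named frame_delay_ms and ignoring the time spent decoding and drawing each frame:

```python
def frame_delay_ms(fps):
    """Delay to pass to cv2.waitKey so playback roughly matches fps."""
    if fps <= 0:
        return 1          # some sources report 0 FPS; fall back to a minimal wait
    return max(1, round(1000 / fps))

print(frame_delay_ms(29.97002997002997))   # 33
print(frame_delay_ms(25))                  # 40
```

The result would be used as cv2.waitKey(frame_delay_ms(fps)) inside the playback loop.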
Of course, what matters more is capturing live video from a camera.

# parameter 0 indicates the default camera of the device, and the parameter selection can be changed when the device has multiple cameras
cap = cv2.VideoCapture(0)

Of course, the ID of some external cameras may not be 0; we can find it by trying each ID in turn:

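import cv2

ID = 0
MAX_ID = 10 # arbitrary upper limit so the search stops when no camera is present
while ID < MAX_ID:
    cap = cv2.VideoCapture(ID)
    ret, frame = cap.read()
    cap.release() # free the device before trying the next ID
    if ret:
        print(ID)
        break
    ID += 1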

2.4, multiple pictures composite video

Writing the pictures in a directory to a video is similar to the previous methods, but note that every picture written to the video must have the same size. If the pictures in the directory have different sizes, writing them to the video directly will fail, so here each picture is resized to a common size, using interpolation:

import cv2
import os
path = 'imgs'
size = (600,400) # (W,H)
fps = 1
#fourcc = cv2.VideoWriter_fourcc('X','V','I','D')
fourcc = cv2.VideoWriter_fourcc(*'XVID')
video = cv2.VideoWriter('hi.avi',fourcc,fps,size)

for item in os.listdir(path):
    if item.lower().endswith('.jpg'):
        img = cv2.imread(os.path.join(path,item))
        img1 = cv2.resize(img, size, interpolation=cv2.INTER_CUBIC)
        print(img1.shape) # (H,W,C)
        video.write(img1)
video.release()
cv2.destroyAllWindows()

The interpolation method specified by the interpolation parameter:

INTER_NEAREST: nearest neighbor interpolation method
INTER_LINEAR: bilinear interpolation, default
INTER_CUBIC: Bicubic interpolation within a 4×4 pixel neighborhood
INTER_LANCZOS4: Lanczos interpolation within 8×8 pixel neighborhood

2.5, straight line, rectangle, circle and other shapes

Some common shapes, which are very common in practical applications, are listed here:

2.5.1, straight lines

cv2.line(img, startPoint, endPoint, color, thickness)
img: the target image object to be drawn
startPoint : pixel coordinates of the starting position
endPoint: end position pixel coordinates
color: the color to draw
thickness: the width of the drawn line

2.5.2, circle

cv2.circle(img, centerPoint, radius, color, thickness)
img: the target image object to be drawn
centerPoint: Pixel coordinates of the center position of the drawn circle
radius: the radius of the drawn circle
color: the color to draw
thickness: the width of the drawn line (thickness is a negative number, indicating that the circle is filled)

2.5.3, rectangle

cv2.rectangle(img, point1, point2, color, thickness)
img: the target image object to be drawn
point1: pixel coordinates of the upper left vertex position
point2: pixel coordinates of the lower right vertex position
color: the color to draw
thickness: the width of the drawn line

2.5.4, Text

cv2.putText(img, text, point, font, size, color, thickness)
img: the target image object to be drawn
text: the drawn text
point: pixel coordinates of the bottom-left corner of the text
font: the font to draw with, for example cv2.FONT_HERSHEY_SIMPLEX
size: the font scale factor of the drawn text
color: the color used for drawing
thickness: the width of the drawn line

2.5.5, image scaling

cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) -> dst
src: input image
dsize: output image size as (width, height); if None, the size is computed from fx and fy
dst: output image
fx, fy: scaling factors along the x-axis and y-axis
interpolation: interpolation method

3, bgr8_to_jpeg

Finally, to display video in the Image widget, we need to pass each frame to the widget. Each frame is a BGR image, while the widget's format is jpeg, so a conversion is required.
The bgr8_to_jpeg function encodes the image into an in-memory buffer; it is essentially a wrapper around the imencode function, which compresses the image.
The function source code is:

def bgr8_to_jpeg(value, quality=75):
    return bytes(cv2.imencode('.jpg', value)[1])

The imencode function inside is as follows:

imencode(ext, img[, params]) -> retval, buf
ext: the file extension that defines the output format
img: the image to write
buf: The output buffer is resized to fit the compressed image

To quickly view a function's source code when it is inconvenient to open the source file:

import inspect
print(inspect.getsource(bgr8_to_jpeg))
