[Simple Web Application 3] Realize face comparison

Article directory

  • Recap
  • Effect demo
  • Implementation process
    • 1. utils.py
    • 2. compare.html
    • 3. forms.py
    • 4. insightface_api.py
    • 5. app.py
  • Record
    • 1. Bugs
      • 1.1 cv2.imshow() reports an error
      • 1.2 The insightface face detection boxes are misplaced
    • 2. Miscellaneous Notes
  • Summary

Recap

In the previous section, I used PaddleHub's face detection model pyramidbox_lite_mobile to build a small application that uploads a face picture in the browser and runs face detection on it. In this section, we will implement uploading any two face pictures and checking whether they show the same person.

csdn personal homepage: https://blog.csdn.net/m0_63238256

Effect demo

Implementation process

Main tools: Flask, Insightface

I wanted to keep using PaddleHub, but I couldn't find a suitable out-of-the-box model, so I switched to insightface. insightface is a fairly well-known open-source library for face recognition, and it can be installed in Python directly with pip. The documentation I found is in English, so getting started was a bit difficult for me (my English is poor).

insightface’s github address: https://github.com/deepinsight/insightface/tree/master/model_zoo

This section roughly covers the following:

  • Package the face recognition function of insightface as a Web service

  • The backend obtains two face pictures from the browser form, forwards them to face recognition, and gets the face annotation frame and feature vector

  • Realize face comparison through cosine similarity of feature vectors (judging whether two faces belong to the same person)

  • Draw face annotation frame

  • Display the results on the frontend

Supplement: Strictly speaking, insightface does not need to be wrapped as a Web service; wrapping it in a function would be enough. However, my Flask and insightface are installed in different Python virtual environments, so the two communicate through an API.

Directory structure:

- templates
  - compare.html
- app.py
- insightface_api.py
- forms.py
- utils.py

Among them, utils.py encapsulates some small scripts.

1. utils.py

I chose base64 encoding to pass data (images, feature vectors, bounding boxes, etc.) between the APIs. The data has to be converted between formats several times, so I wrapped the conversions in functions and put them in one place. Well… the file also contains a few small helpers for other purposes.

This keeps the logic of the main code clear.

import base64
import numpy as np
import cv2


# The content of the image file is converted to cv2 and scaled to the specified size
def content_to_cv2(contents: list, size: tuple):
    '''
    content -> np -> cv2 -> cv2<target_size>'''
    imgs_np = [np.asarray(bytearray(content), dtype=np.uint8) for content in contents]
    imgs_cv2 = [cv2.imdecode(img_np, cv2.IMREAD_COLOR) for img_np in imgs_np]
    imgs_cv2 = [cv2.resize(img_cv2, size, interpolation=cv2.INTER_LINEAR) for img_cv2 in imgs_cv2]
    return imgs_cv2


def base64_to_cv2(img: str):
    # Note: only suitable for images, not suitable for other numpy arrays, such as bboxs (face annotation frame) data
    # base64 -> binary -> ndarray -> cv2
    # decode to binary data
    img_codes = base64.b64decode(img)
    img_np = np.frombuffer(img_codes, np.uint8)
    img_cv2 = cv2.imdecode(img_np, cv2.IMREAD_COLOR)
    return img_cv2


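def cv2_to_base64(image):
    # Encode the image as JPEG, then base64-encode the JPEG bytes
    data = cv2.imencode('.jpg', image)[1]
    # tobytes() replaces the deprecated tostring()
    return base64.b64encode(data.tobytes()).decode('utf8')


def np_to_base64(array):
    # Raw bytes only: dtype and shape are not stored in the string
    return base64.b64encode(array.tobytes()).decode('utf8')
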

def base64_to_np(arr_b64):
    return np.frombuffer(base64.b64decode(arr_b64), np.float32)


# Display images in cv2 format
def cv2_show(img_cv2):
    cv2.imshow('img', img_cv2)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


# Draw the face annotation box
def cv2_with_rectangle(img_cv2, bboxes: list):
    '''return --> the image with a rectangle drawn for every bounding box'''
    for bbox in bboxes:
        # cv2.rectangle expects integer pixel coordinates
        x1, y1, x2, y2 = (int(v) for v in bbox[:4])
        cv2.rectangle(
            img_cv2,
            (x1, y1),
            (x2, y2),
            (255, 0, 0),  # blue (OpenCV uses BGR order)
            thickness=2)
    return img_cv2


# Calculate cosine similarity of feature vectors
def compare_face(emb1: np.ndarray, emb2: np.ndarray, threshold=0.6):
    '''
    @return -> (<numpy.bool_>, <numpy.float32>)
    - bool: whether the two faces are judged to be the same person
    - float: cosine similarity in [-1, 1]; the larger the value, the more similar

    @params
    - threshold: cosine similarity above which two faces count as the same person
    '''
    sim = np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))
    return sim > threshold, sim
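
As a quick sanity check of the cosine similarity formula used in compare_face, the sketch below applies the same formula to made-up toy vectors. The vectors are not real face embeddings, and the name cosine_similarity is only chosen for this illustration.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Same formula as compare_face: dot product divided by the product of the norms
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors, made up for illustration (real embeddings are much longer)
same_direction = cosine_similarity(np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0]))
orthogonal = cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
opposite = cosine_similarity(np.array([1.0, 0.0]), np.array([-1.0, 0.0]))

print(same_direction, orthogonal, opposite)
```

Only the direction of the vectors matters, not their length: scaling a vector leaves the similarity unchanged. The threshold that separates "same person" from "different person" is an empirical choice and probably needs tuning per model.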
    

2. compare.html

Here is a relatively crude implementation of the display effect in the browser.

<h1>Face comparison</h1>


<!-- Form for uploading images -->
<form action="" method="post" class="mt-4" enctype="multipart/form-data">
    <!-- It seems the csrf line below could be deleted, though Flask-WTF validates this token by default -->
    {{ form.csrf_token }}
    <li>{{ form.face_img() }}</li>
    <li>{{ form.face_img2() }}</li>
    <li><input type="submit" value="Submit"></li>
</form>

<!-- Display test results -->
{% for img_base64 in imgs_base64 %}
    <img src="data:image/jpeg;base64,{{ img_base64 }}" width="250" height="250">
{% endfor %}
{% if imgs_base64 %}
    <p>Are they the same person: <b>{{ is_like }}</b></p>
    <p>Similarity [-1, 1]: <b>{{ how_like }}</b></p>
{% endif %}

<!-- Display error message -->
{% if form.face_img.errors %}
    <div class="alert alert-danger">
        {% for error in form.face_img.errors %}
            {{ error }}
        {% endfor %}
    </div>
{% endif %}

3. forms.py

The form Face2Form receives two face pictures from the browser so that they can be compared.

You may be wondering why Face2Form inherits from ImageForm. Um… ImageForm was used for face detection in the previous section, and inheriting from it lets the earlier face detection code keep working.

from flask_wtf import FlaskForm
from flask_wtf.file import FileAllowed, FileRequired, FileSize, FileField

class ImageForm(FlaskForm):
    face_img = FileField("face_img",
        validators=[
            FileRequired(message="cannot be empty"),
            FileAllowed(['jpg', 'png'], message="only supports jpg/png format"),
            FileSize(max_size=2048000, message="The picture cannot be larger than 2Mb")
        ],
        description="The picture cannot be larger than 2Mb, only supports jpg/png format"
    )

# Two face pictures --> used to compare whether the faces are the same
class Face2Form(ImageForm):
    face_img2 = FileField("face_img2",
        validators=[
            FileRequired(message="cannot be empty"),
            FileAllowed(['jpg', 'png'], message="only supports jpg/png format"),
            FileSize(max_size=2048000, message="The picture cannot be larger than 2Mb")
        ],
        description="The picture cannot be larger than 2Mb, only supports jpg/png format"
    )

4. insightface_api.py

The python library of insightface needs to be installed first:

pip install -U insightface

The model downloaded automatically when the code runs is buffalo_l. To use another model, download the model file manually and extract it into the ~/.insightface/models/ directory. I chose buffalo_sc because it is only 16MB, compared with 326MB for buffalo_l, so it is much more lightweight.

Model manual download address: https://github.com/deepinsight/insightface/tree/master/python-package

(This page is also a tutorial on using insightface's Python library.)

This API receives one picture as base64-encoded data and returns a JSON response body. The structure of the return value is as follows:

{
    'embeddings': [<embedding1>, <embedding2>, ...],
    'bboxes': [<bbox1>, <bbox2>, ...]
}

code:

# basically complete
from flask import Flask, jsonify, request
from insightface.app import FaceAnalysis
from insightface.data import get_image as ins_get_image
import cv2
import numpy as np
import base64
from utils import base64_to_cv2, np_to_base64

app = Flask(__name__)
face_analysis = FaceAnalysis(providers=['CUDAExecutionProvider', 'CPUExecutionProvider'], name='buffalo_sc')
face_analysis.prepare(ctx_id=0, det_size=(640, 640))


@app.route('/detect', methods=['POST'])
def detect_faces():

    # The request body is the base64-encoded image (bytes)
    img_base64 = request.data
    img_cv2 = base64_to_cv2(img_base64)
    faces = face_analysis.get(img_cv2)

    embeddings = [np_to_base64(face['embedding']) for face in faces]
    bboxes = [np_to_base64(face['bbox']) for face in faces]  # [x1, y1, x2, y2]: upper-left and lower-right corners

    return jsonify({"embeddings": embeddings, "bboxes": bboxes})


if __name__ == '__main__':
    app.run(port=6000, debug=True)

5. app.py

I use a lot of list comprehensions (which I learned not long ago), and that may hurt the readability of the code, so I will briefly explain some of them.

rs = [requests.post(url=url, headers=headers, data=img_base64) for img_base64 in imgs_base64]
rs_json = [r.json() for r in rs]
bboxs = [r_json['bboxes'] for r_json in rs_json]
bboxs = [[base64_to_np(bbox) for bbox in bs] for bs in bboxs]

imgs_base64 holds the data of the two pictures. The code sends two requests to the face recognition API wrapped earlier; each request recognizes one picture and returns a response object. The two responses are put into a list, rs, so the structure of rs is as follows:

[<Response1>, <Response2>]

r.json() extracts the data from each response object, giving the list rs_json. Each element of the list is the recognition result for one picture:

[<r_json1>, <r_json2>]

Next, the bounding boxes of each picture are taken out to get bboxs. Each string is the base64 encoding of one face's bounding box data. A picture can contain several faces, so each element of bboxs (each inner list) may hold several strings.

[[str1_1, str1_2, ...],
 [str2_1, str2_2, ...]]

Then each string is decoded into a numpy array, giving the decoded bboxs, structured roughly as follows:

[[[x1, y1, x2, y2],
  [x1, y1, x2, y2], ...],
 [[x1, y1, x2, y2],
  [x1, y1, x2, y2], ...]]
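
The nested structure can be checked offline without starting either server. The sketch below is only an illustration under some assumptions: the box values are fabricated, json.loads stands in for r.json(), and the two helper functions are re-declared so the snippet is self-contained. The 'bboxes' key follows the return structure shown in section 4.

```python
import base64
import json

import numpy as np


def np_to_base64(array: np.ndarray) -> str:
    return base64.b64encode(array.tobytes()).decode("utf8")


def base64_to_np(arr_b64: str) -> np.ndarray:
    return np.frombuffer(base64.b64decode(arr_b64), np.float32)


# Fabricated bounding boxes: picture 1 has two faces, picture 2 has one
fake_boxes = [
    [np.array([10, 20, 110, 140], dtype=np.float32),
     np.array([150, 30, 230, 130], dtype=np.float32)],
    [np.array([40, 50, 160, 200], dtype=np.float32)],
]

# Stand-ins for the JSON text carried in each response body
bodies = [json.dumps({"bboxes": [np_to_base64(b) for b in boxes]}) for boxes in fake_boxes]

# The same comprehension steps as above, with json.loads playing the role of r.json()
rs_json = [json.loads(body) for body in bodies]
bboxs = [r_json["bboxes"] for r_json in rs_json]
bboxs = [[base64_to_np(bbox) for bbox in bs] for bs in bboxs]

print([len(bs) for bs in bboxs])  # number of faces per picture
print(bboxs[0][1])                # second face of the first picture
```

The outer list has one entry per picture and each inner list has one array per detected face, so an inner list can also be empty when no face is found.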

code:

from flask import Flask, render_template, request
import requests
from forms import Face2Form
import time
from utils import cv2_to_base64, base64_to_np
from utils import compare_face, cv2_with_rectangle, content_to_cv2


app = Flask(__name__)
app.config['SECRET_KEY'] = 'your_secret_key_here'

# 2. Compare two faces
@app.route('/compare', methods=['GET', 'POST'])
def compare():
    form = Face2Form()
    
    if form.validate_on_submit():
        
        # 1. Get the face image file from the front end
        file1 = form.face_img.data
        file2 = form.face_img2.data
        files = [file1, file2]
        contents = [file.read() for file in files]

        # 2. The image file is converted to cv2 and scaled to the specified size
        imgs_cv2 = content_to_cv2(contents, (300, 250))

        # 3. cv2 to base64 encoded string --> pass to the model
        imgs_base64 = [cv2_to_base64(img_cv2) for img_cv2 in imgs_cv2]

        # 4. Load the model --> get the feature vector + face annotation box
        headers = {"Content-type": "application/json"}
        url = "http://127.0.0.1:6000/detect"
        rs = [requests.post(url=url, headers=headers, data=img_base64) for img_base64 in imgs_base64]
        rs_json = [r.json() for r in rs]
        embeddings = [r_json['embeddings'] for r_json in rs_json]
        embeddings = [[base64_to_np(emb) for emb in embs] for embs in embeddings]
        bboxs = [r_json['bboxes'] for r_json in rs_json]
        bboxs = [[base64_to_np(bbox) for bbox in bs] for bs in bboxs]

        # 5. Compare the feature vectors of the first faces in the two pictures
        embs = [embeddings[i][0] for i in range(len(embeddings))]
        is_like, how_like = compare_face(embs[0], embs[1], threshold=0.5)

        # 6. Draw the bounding boxes of the detected faces
        imgs_cv2 = [cv2_with_rectangle(imgs_cv2[i], bboxs[i]) for i in range(len(imgs_cv2))]
        imgs_base64 = [cv2_to_base64(img_cv2) for img_cv2 in imgs_cv2]

        # 7. Return the comparison result
        return render_template(
            'compare.html', form=form,
            imgs_base64=imgs_base64,
            is_like=is_like,
            how_like=how_like)

    return render_template('compare.html', form=form)

# --> Start the app
if __name__ == '__main__':
    app.run(debug=True, port=5000)

Start the application

Note that app.py needs to run simultaneously with insightface_api.py to work properly.

python insightface_api.py
python app.py

Record

1. Bugs

1.1 cv2.imshow() reports an error

Details:

Traceback (most recent call last):
  File "d:\code_all\code_python\Web Development Basics\face_verify\compare.py", line 18, in <module>
    cv2_show(img)
  File "d:\code_all\code_python\Web Development Basics\face_verify\compare.py", line 10, in cv2_show
    cv2.imshow('img', img_cv2)
cv2.error: OpenCV(4.7.0) D:\a\opencv-python\opencv-python\opencv\modules\highgui\src\window.cpp:1272: error: (-2:Unspecified error) The function is
not implemented. Rebuild the library with Windows, GTK + 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config,
then re-run cmake or configure script in function 'cvShowImage'

Roughly speaking, this error means that the installed OpenCV build lacks window (GUI) support, so the library needs to be rebuilt or replaced with a build that includes it. Solution: reinstall the opencv library.

1.2 The insightface face detection boxes are misplaced

As shown below (the left is the face detection effect of PaddleHub, and the right is insightface)

The images I displayed with cv2 on the insightface server side looked normal, with no distortion. An open-source face recognition library like insightface should not be this bad, so the problem had to be on my side.

Solution: It turned out that a wrong index was used when drawing the rectangular bounding box: the indices should be [0, 1, 2, 3], but I had written [0, 1, 2, 2].
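
One way to make this kind of index slip harder to write is to unpack the box into named coordinates once instead of repeating numeric indices. A minimal sketch follows; bbox_corners is a hypothetical helper and not part of the original utils.py.

```python
def bbox_corners(bbox):
    """Return ((x1, y1), (x2, y2)) as integer pixel coordinates."""
    # Unpacking raises ValueError if bbox does not hold exactly four values
    x1, y1, x2, y2 = (int(v) for v in bbox)
    return (x1, y1), (x2, y2)


# A made-up box in [x1, y1, x2, y2] order
top_left, bottom_right = bbox_corners([12.7, 30.2, 118.9, 150.0])
print(top_left, bottom_right)
```

The two tuples can then be passed to cv2.rectangle as its two corner points.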

2. Miscellaneous Notes

1. About base64 to numpy

When converting base64 back to a numpy array, pay attention to the element type. The data was float32 before encoding, so it must be decoded as float32; with another type such as uint8 (common for images), even the number of elements will not match. Also, after the round trip np --> base64 --> np, a multidimensional numpy array comes back flattened, so use reshape after decoding to restore the original shape (between apps, the shape can be passed along as another piece of data).
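
Both effects are easy to reproduce on a small array. The sketch below uses a made-up 2x4 float32 array as a stand-in for two bounding boxes; the variable names are only for illustration.

```python
import base64

import numpy as np

# A made-up 2x4 float32 array standing in for two bounding boxes
original = np.array([[10, 20, 110, 140],
                     [150, 30, 230, 130]], dtype=np.float32)

encoded = base64.b64encode(original.tobytes()).decode("utf8")
raw = base64.b64decode(encoded)

wrong = np.frombuffer(raw, np.uint8)     # wrong dtype: one element per byte
flat = np.frombuffer(raw, np.float32)    # right dtype, but flattened to 1-D
restored = flat.reshape(original.shape)  # the shape has to be supplied separately

print(wrong.size, flat.shape, restored.shape)
```

The base64 string carries only raw bytes, so the receiver has to know both the dtype and the shape in advance or receive them as extra fields.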

2. About image aspect ratio and recognition effect

In the previous code, the input image is resized directly to a fixed size such as (250, 250). If the original image does not have the same aspect ratio as the target, this resize distorts it: the image is stretched or squashed, which sometimes makes the faces in it unrecognizable.

It is possible to write a resize that preserves the aspect ratio instead.
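
A common approach (often called letterboxing) is to scale the image by a single factor so that it fits inside the target, then pad the remaining space. The sketch below only computes the geometry; letterbox_geometry is a hypothetical helper name, and centering the image inside the padding is an assumption made for this example.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Return (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom)."""
    # Use the smaller scale factor so the whole image fits inside the target
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, max(1, round(src_w * scale)))
    new_h = min(dst_h, max(1, round(src_h * scale)))
    # Split the leftover space evenly; any odd pixel goes to the right/bottom
    pad_w, pad_h = dst_w - new_w, dst_h - new_h
    pad_left, pad_top = pad_w // 2, pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


# A 400x200 picture fitted into a 250x250 target
print(letterbox_geometry(400, 200, 250, 250))
```

With OpenCV, the image would first be resized to (new_w, new_h) with cv2.resize and then padded with cv2.copyMakeBorder using the four padding values. Bounding boxes detected on the padded image are in padded coordinates, which is fine as long as they are drawn on that same image.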

Summary

This time, I successfully implemented face comparison in the browser. My understanding of the Flask framework is still fairly shallow. Sometimes I want to accomplish a task but don't know how to do it, or how to find the documentation and related material. For example, I wondered what jsonify() returns to the other side and how the other side retrieves the data.

Hmm… I feel that my ability to pick up and use unfamiliar things needs to improve.

Article link: https://cfeng.blog.csdn.net/article/details/129719839