Face vector similarity queries in MySQL using Euclidean distance

Foreword

As the title suggests, this post takes extracted face feature vectors and writes a Euclidean distance SQL statement to query the top_K most similar records in the database. This is an unconventional approach: at the business layer there are ready-made face retrieval APIs on the market, and at the technical layer there are now dedicated vector databases.

With 128-dimensional face vectors stored in relational MySQL, the Euclidean distance has to be computed by looping over every dimension and taking the square root, and the rows are then sorted by that distance. Every query scans the whole table, so it gets slower as the data grows. Still, there may be niche cases that need this. The steps below go from feature extraction to writing the SQL.
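The brute-force idea described above can be sketched in plain Python before moving to SQL. This is only an illustration under assumptions: the helper names `euclidean_distance` and `top_k` and the toy 3-dimensional records are made up here and are not part of the original code.

```python
import heapq
import math


def euclidean_distance(a, b):
    """Square root of the summed squared per-dimension differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, records, k=5):
    """Return the k (distance, record_id) pairs closest to `query`.

    `records` is an iterable of (record_id, vector) pairs. Every record is
    scanned, which is why the cost grows linearly with the table size.
    """
    scored = ((euclidean_distance(query, vec), rid) for rid, vec in records)
    return heapq.nsmallest(k, scored)


# Toy 3-dimensional data standing in for 128-dimensional descriptors.
records = [
    ("a", [0.0, 0.0, 0.0]),
    ("b", [1.0, 1.0, 1.0]),
    ("c", [0.1, 0.0, 0.0]),
]
print(top_k([0.0, 0.0, 0.0], records, k=2))
```

The SQL version later in this post does the same thing: one distance per row, then a sort with a limit.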

Environment

python 3.8

opencv-python 4.8

dlib 19.24.2

mysql 5.7

Feature vector extraction

Strictly speaking, this step can be skipped, but randomly generated vectors would not look like real face data. Here we read the face in an image and use the dlib library's face landmark detector and face recognition model to generate a 128D vector. Note that on Windows, generating the vector may fail with the following error.

Could not locate zlibwapi.dll. Please make sure it is in your library path!

The zlibwapi library has to be downloaded first; I have placed a copy at gitee.com/gaoxingqufuhchao/opencv_demo. After unpacking, put the zlibwapi.lib file in the lib directory of the CUDA installation and the zlibwapi.dll file in its bin directory, then run the following code.

import cv2
import dlib
import os

images_path = os.path.join("./imgs/", "62.jpg")
# Read the image as a pixel array (OpenCV returns the channels in BGR order)
images = cv2.imread(images_path)

# face detector
face_detector = dlib.get_frontal_face_detector()
faces = face_detector(images, 1)
face = faces[0] # Get the first face

# The upper, lower, left and right coordinates of the face
face_left = face.left()
face_right = face.right()
face_top = face.top()
face_bottom = face.bottom()

# Draw the rectangular position of the face on the picture
#rectangle_img = images
# cv2.rectangle(rectangle_img, (face_left, face_top), (face_right, face_bottom), (255, 0, 0), 2)
# cv2.imwrite('./imgs/draw_imgs/rectangle_test.png', rectangle_img)

# Face key point predictor
predictor = dlib.shape_predictor("./lib/dlib/shape_predictor_68_face_landmarks.dat")
shape = predictor(images, face)

# Draw the facial landmarks on a copy so the original image stays untouched
point_img = images.copy()
for p in range(0, 68):
    cv2.circle(point_img, (shape.part(p).x, shape.part(p).y), 2, (0, 255, 0))
# cv2.imwrite('./imgs/draw_imgs/point_test.png', point_img)

# Facial recognition model
recognition_model = dlib.face_recognition_model_v1("./lib/dlib/dlib_face_recognition_resnet_model_v1.dat")
# Compute the descriptor from the clean image, not the one with landmarks drawn on it
features = recognition_model.compute_face_descriptor(images, shape)
features_list = list(features)

print(features_list)

cv2.imshow("frame", point_img)

# Wait indefinitely for the user to press a key, so the window remains open
cv2.waitKey(0)

cv2.destroyAllWindows()

Vector data collection and storage (table structure)

CREATE TABLE `face_data` (
  `face_id` bigint(20) NOT NULL COMMENT 'number',
  `face_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL COMMENT 'face name',
  `feature_vector` varchar(10240) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'feature vector',
  `update_time` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',
  `create_time` datetime DEFAULT NULL COMMENT 'Creation time',
  `data_id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'data primary key',
  PRIMARY KEY (`data_id`) USING BTREE
) ENGINE=MyISAM AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=DYNAMIC;
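The `feature_vector` column stores the descriptor as comma-separated text. A minimal sketch of converting between a Python list and that text form follows; the helper names `vector_to_text` and `text_to_vector` are hypothetical, and the two-value sample is only for illustration.

```python
def vector_to_text(vector):
    """Join the components with commas and no spaces, ready for feature_vector."""
    return ",".join(repr(float(v)) for v in vector)


def text_to_vector(text):
    """Parse the comma-separated text back into a list of floats."""
    return [float(part) for part in text.split(",")]


sample = [-0.0781729444861412, 0.110044464468956]
stored = vector_to_text(sample)
print(stored)
print(text_to_vector(stored) == sample)
```

Using `repr` keeps enough digits for the float to survive the round trip. A 128-value descriptor in this form takes a few thousand characters, which fits the `varchar(10240)` column.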

Customized Euclidean distance calculation function

The Euclidean distance needs the difference between the two points in every dimension. In MySQL this means looping over the comma-separated values of each vector, accumulating the squared differences, and finally taking the square root to get the distance. The smaller the distance, the more similar the two faces. The function below loops over the dimensions (128 for a dlib descriptor). The inner SUBSTRING_INDEX keeps everything up to the i-th comma, and the outer SUBSTRING_INDEX with -1 takes the last value of that prefix, which is the value of the i-th dimension.

CREATE DEFINER=`root`@`localhost` FUNCTION `euclidean_distance`(vector1 VARCHAR(10240), vector2 VARCHAR(10240)) RETURNS float
    DETERMINISTIC
BEGIN
-- The parameters are sized like the feature_vector column; VARCHAR(128) is too short for 128 values
DECLARE dims INT;
DECLARE i INT DEFAULT 1;
DECLARE total FLOAT DEFAULT 0;
-- Number of dimensions = number of commas in vector1 + 1
SET dims = LENGTH(vector1) - LENGTH(REPLACE(vector1, ',', '')) + 1;
WHILE i <= dims DO
-- Take the i-th value of each vector and accumulate the squared difference
SET total = total + POW(SUBSTRING_INDEX(SUBSTRING_INDEX(vector1, ',', i), ',', -1) - SUBSTRING_INDEX(SUBSTRING_INDEX(vector2, ',', i), ',', -1), 2);
SET i = i + 1;
END WHILE;
RETURN SQRT(total);
END
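To cross-check the stored function outside the database, the same string-based extraction can be mirrored in Python. This is a sketch under assumptions: `substring_index` is a rough emulation of MySQL's SUBSTRING_INDEX written for this example, and the other helper names are hypothetical too.

```python
import math


def substring_index(text, delim, count):
    """Rough emulation of MySQL's SUBSTRING_INDEX for a non-empty delimiter."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    if count < 0:
        return delim.join(parts[count:])
    return ""


def nth_component(text, i):
    """1-based i-th comma-separated value, extracted the same way as in the SQL."""
    return float(substring_index(substring_index(text, ",", i), ",", -1))


def euclidean_distance_text(vector1, vector2):
    """Distance between two comma-separated vectors of equal length."""
    dims = vector1.count(",") + 1
    total = 0.0
    for i in range(1, dims + 1):
        total += (nth_component(vector1, i) - nth_component(vector2, i)) ** 2
    return math.sqrt(total)


print(euclidean_distance_text("0,0", "3,4"))
print(nth_component("1,2", 5))
```

Asking for a component past the end of the string returns the last value again, as the second print shows. A loop that runs past the real number of dimensions therefore keeps adding the last squared difference, so the loop bound has to match the vector length.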

SQL statement

To query, call the function above to compute the distance between the specified vector and every vector stored in the table, then sort by distance and take the top 5.

1. Test to calculate the distance between two 2-dimensional vector points

select euclidean_distance("-0.0781729444861412,0.110044464468956", "-0.12611718475818634,0.13194166123867035")


2. Query the top 5 most similar items to the specified vector

SELECT
face_id,
euclidean_distance('-0.0781729444861412,0.110044464468956,0.06956595182418823,-0.04195471480488777,-0.13797412812709808,-0.01051325537264347,-0.08628914505243301,-0.17037545144557953,0.10026334226131439,-0.12803488969802856,0.23731382191181183,-0.094626285135746,-0.196399986743927,-0.06725674867630005,-0.08001142740249634,0.15153078734874725,-0.14427754282951355,-0.18382087349891663,-0.05251418426632881,-0.031287774443626404,0.1022581234574318,0.0790780633687973,0.008861299604177475,0.020882003009319305,-0.06397232413291931,-0.2706142067909241,-0.107462577521801,-0.03476380184292793,0.043968893587589264,-0.05694590508937836,-0.06642886996269226,0.04123011231422424,-0.18131156265735626,-0.03576948121190071,0.04664576053619385,0.1326863318681717,0.003225848078727722,-0.047573719173669815,0.1651453822851181,-0.029888691380620003,-0.21578450500965118,0.08537548780441284,0.0766519159078598,0.2540043294429779,0.22466638684272766,0.06360028684139252,-0.019438330084085464,-0.1689058244228363,0.043127596378326416,-0.1 0.1506 9229900836945,-0.029495960101485252,0.1517685353755951,-0.09771490097045898,0.055743880569934845,0.18273179233074188,-0.05583849176764488,0.004194958135485649,-0.11965519189834595,0.17633725702762604,0.07909074425697327,-0.09237003326416016,-0.22499722242355347,0.13512863218784332,-0.16307398676872253,-0.16911545395851135,0.07714464515447617,-0.16546982526779175,-0.1413039118051529,-0.266265332698822,0.022507959976792336,0.4201160669326782,0.12732553482055664,-0.2231403887271881,0.036990948021411896,0.010170978493988514,-0.00629305187612772,0.1687457412481308,0.10473113507032394,0.00399409607052803,-0.013406611979007721,-0.12538184225559235,0.005433365702629089,0.2502303421497345,-0.02349778823554516,-0.07866109907627106,0.21128004789352417,0.0021615102887153625,-0.0009294161573052406,0.06593659520149231,0.031842511147260666,0.008746307343244553,0.05120106786489487,-0.19502216577529907,-0.037121932953596115,-0.027497289702296257,-0.0628679022192955,-0.06642049551010132,0.09603795409202576,-0.10845164209604263,0.13225442171096802,0.005377473309636116,0.04529388248920441,-0.05451689288020134,-0.038580723106861115,-0.12509524822235107,0.016826599836349487,0.20309969782829285,-0.22045192122459412,0.2263873666524887,0.12779565155506134,0.118931345641613,0.12429559975862503,0.16083121299743652,0.15235833823680878,0.04558601975440979,-0.07488320767879486,-0.21370939910411835,-0.07309035211801529,0.0741523876786232,-0.013888441026210785,0.16118189692497253,0.10571793466806412', feature_vector)
AS distance
from face_data
order by distance asc limit 5
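From application code, the long literal does not need to be pasted by hand. The sketch below builds the same query with placeholders. It assumes a DB-API driver that uses `%s` placeholders (PyMySQL is one example), and `build_top_k_query` is a hypothetical helper name. No database connection is opened here.

```python
def build_top_k_query(features, k=5):
    """Return (sql, params) for the top-k query against face_data."""
    # Serialize the descriptor the same way it is stored: commas, no spaces.
    vector_text = ",".join(repr(float(v)) for v in features)
    sql = (
        "SELECT face_id, euclidean_distance(%s, feature_vector) AS distance "
        "FROM face_data ORDER BY distance ASC LIMIT %s"
    )
    return sql, (vector_text, int(k))


sql, params = build_top_k_query([0.5, -0.25], k=3)
print(sql)
print(params)
```

With an open cursor from such a driver, the pair would be passed as `cursor.execute(sql, params)`, which leaves quoting of the vector string to the driver.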

Others

In a real application the user may already be logged in to an account, with face authentication acting like a verification code. In that case the user ID is obtained first, the user's stored facial features are fetched from the database, and that stored vector is compared with the vector captured by the camera. Next, we use numpy to demonstrate the Euclidean distance between two face vectors, and then collect faces from the OBS virtual camera.
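The one-to-one check described above can be sketched as follows. The 0.4 cutoff is the rule of thumb used in this post, while the function name `is_same_person` and the toy 2-dimensional vectors are assumptions made for illustration. `math.dist` requires Python 3.8 or later.

```python
import math

# Rule-of-thumb cutoff from this post; tune it against your own data.
SAME_PERSON_THRESHOLD = 0.4


def is_same_person(stored_vector, probe_vector, threshold=SAME_PERSON_THRESHOLD):
    """Compare the stored descriptor of one user with a freshly captured one.

    Returns (match, distance), where match is True when the Euclidean
    distance is below the threshold.
    """
    if len(stored_vector) != len(probe_vector):
        raise ValueError("descriptors must have the same number of dimensions")
    distance = math.dist(stored_vector, probe_vector)
    return distance < threshold, distance


# Toy 2-dimensional vectors standing in for 128-dimensional descriptors.
print(is_same_person([0.0, 0.0], [0.3, 0.0]))
print(is_same_person([0.0, 0.0], [3.0, 4.0]))
```

Because only one stored vector is compared, this path needs no table scan and no SQL distance function.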

Euclidean distance

import os
import dlib
import glob
import numpy as np
import cv2

# Calculate the Euclidean distance of the face feature vector
face_01 = [-0.11144012212753296,0.18558962643146515,0.0016858093440532684,-0.030448582023382187,-0.12307003140449524,-0.03573177754878998,-0.09033556282520294,-0.11640644073486328,0.1467275768518448,-0.020302172750234604,0.3062277138233185,-0.0453588105738163,-0.19605347514152527,-0.0651734471321106,-0.058096982538700104,0.16225126385688782,-0.21905581653118134,-0.1029929518699646,0.013068988919258118,0.04029736667871475,0.05545109510421753,0.048230767250061035,0.0634097084403038,0.08036009967327118,-0.024840131402015686,-0.30172669887542725,-0.07167031615972519,-0.04815453290939331,0.09843403100967407,-0.07043944299221039,-0.11155916005373001,0.06715665757656097,-0.1751995086669922,-0.09061796963214874,0.07401317358016968,0.06398393213748932,-0.05072097107768059,-0.021325048059225082,0.2341330200433731,-0.02269885316491127,-0.15249019861221313,0.05262574926018715,0.08381971716880798,0.3157000243663788,0.20683002471923828,0.07494638860225677,0.000228429795242846,-0.12610206007957458,0.046763330698013306,
    # Gap in the source: several values are missing at this point; residue as printed: -0.1 0.1787737 9059791565
    -0.04263974353671074,0.11318269371986389,-0.11839815974235535,0.07388490438461304,0.11303036659955978,-0.03952416777610779,0.013598351739346981,-0.09794041514396667,0.27298909425735474,0.08994753658771515,-0.11253070831298828,-0.18397797644138336,0.07603368163108826,-0.10888601094484329,-0.1495959311723709,0.06924907863140106,-0.13828963041305542,-0.162084698677063,-0.3005405366420746,0.017705515027046204,0.33277949690818787,0.1113307923078537,-0.21170896291732788,0.064120352268219,-0.011443777941167355,-0.08527453988790512,0.07820181548595428,0.11378934234380722,-0.055745672434568405,-0.016978316009044647,-0.15705107152462006,-0.021412618458271027,0.22962769865989685,-0.03235346078872681,-0.06656813621520996,0.22491349279880524,0.012072622776031494,0.029958534985780716,0.048272471874952316,0.0701068788766861,-0.043136656284332275,-0.02105512097477913,-0.20309726893901825,-0.07898224890232086,-0.0016847627703100443,-0.059206850826740265,-0.0697859525680542,0.14457839727401733,-0.17818854749202728,0.10272862762212753,0.012656494975090027,0.026718538254499435,-0.04024617373943329,-0.01651758886873722,-0.09254957735538483,-0.024667389690876007,0.13368789851665497,-0.22851689159870148,0.2718657851219177,0.10549997538328171,0.13540413975715637,0.13176411390304565,0.11028678715229034,0.06738141179084778,0.020261352881789207,-0.025652553886175156,-0.11222656071186066,-0.08695542067289352,0.0743352621793747,-0.005103417672216892,0.15722909569740295,0.08712321519851685]
face_02 = [-0.06928954273462296,0.11818723380565643,0.040963172912597656,-0.03380622714757919,-0.125632643699646,0.014028718695044518,-0.09829825907945633,-0.13999347388744354,0.10254613310098648,-0.13002724945545197,0.2407195270061493,-0.07027815282344818,-0.25125718116760254,-0.032886065542697906,-0.023477450013160706,0.13404496014118195,-0.16187764704227448,-0.18711647391319275,-0.03598808869719505,-0.023680226877331734,0.06534431874752045,0.06392928957939148,-0.02046673744916916,0.026214662939310074,-0.07153397798538208,-0.3270975947380066,-0.12721474468708038,-0.011338643729686737,0.04254578426480293,-0.07672133296728134,-0.05328984931111336,0.05578911304473877,-0.19925925135612488,-0.0457644984126091,0.05274736136198044,0.10184930264949799,-0.010766003280878067,-0.04257964715361595,0.14281891286373138,-0.0182943232357502,-0.21731841564178467,0.04613770544528961,0.09813840687274933,0.291894793510437,0.19322234392166138,0.052529361099004745,-0.04576069116592407,-0.12129618227481842,0.024770550429821014,-0.12463519722223282,0.14057938754558563,0.19306844472885132,0.11266665160655975,0.03293035924434662,0.032030634582042694,-0.09783944487571716,-0.01481545064598322,0.15577413141727448,-0.1594703644514084,0.05806709825992584,0.1744152009487152,-0.06528271734714508,-0.012003951705992222,-0.09977130591869354,0.15240584313869476,0.07093586027622223,-0.0664982721209526,-0.2397862672805786,0.117387555539608,-0.1688736081123352,
    # Gap in the source: several values are missing at this point; residue as printed: -0.124 0.429 0091395378113
    0.12104129791259766,-0.1617836207151413,0.06288817524909973,0.02954815700650215,-0.07710471004247665,0.17532427608966827,0.08251089602708817,0.007482852786779404,-0.034130796790122986,-0.12541988492012024,0.07749714702367783,0.24795527756214142,0.007125057280063629,-0.06481210887432098,0.2083437591791153,-0.013705004006624222,-0.030975420027971268,0.07249104231595993,0.06181362271308899,-0.010871338658034801,0.0194531362503767,-0.1996041238307953,-0.08055830746889114,-0.018493879586458206,-0.047192126512527466,-0.06708985567092896,0.11301972717046738,-0.1046549454331398,0.10359721630811691,0.03471799194812775,0.04366254806518555,0.004359336569905281,-0.01676999032497406,-0.1265866607427597,-0.05440763011574745,0.17379546165466309,-0.25932636857032776,0.14294737577438354,0.11041377484798431,0.07096583396196365,0.06915411353111267,0.08656369149684906,0.10719767212867737,0.02909727394580841,-0.11240153759717941,-0.24186159670352936,-0.06347794830799103,0.07049560546875,-0.013076946139335632,0.1388940066099167,0.04212008789181709]
face_03 = [-0.12067002058029175,0.17179107666015625,0.005495276302099228,-0.0025373362004756927,-0.12682169675827026,-0.046837806701660156,-0.06272760033607483,-0.11881549656391144,0.131977841258049,-0.007651656866073608,0.30675050616264343,-0.06511575728654861,-0.19687950611114502,-0.05430234596133232,-0.061561085283756256,0.15335164964199066,-0.1840490847826004,-0.10880403220653534,0.003372782375663519,0.020364956930279732,0.07791364192962646,0.04028210788965225,0.06327678263187408,0.09058713912963867,-0.041233912110328674,-0.2671954035758972,-0.07270435988903046,-0.05469353869557381,0.10929089039564133,-0.07597753405570984,-0.12340617924928665,0.060622505843639374,-0.19064907729625702,-0.11399897933006287,0.09063999354839325,0.06096772477030754,-0.07348572462797165,-0.03499014303088188,0.2246333211660385,-0.04379507154226303,-0.14929009974002838,0.049476295709609985,0.09826625883579254,0.3019770681858063,0.1847151517868042,0.08934237062931061,-0.01426877174526453,-0.15356388688087463,0.03547971695661545,
    # Gap in the source: several values are missing at this point; residue as printed: -0.166 0.168923854 82788086
    -0.024090711027383804,0.09763640910387039,-0.12262223660945892,0.07233531028032303,0.11958196014165878,-0.03347306326031685,0.018053846433758736,-0.11581023037433624,0.24381747841835022,0.07473541796207428,-0.09524650871753693,-0.18174171447753906,0.07709110528230667,-0.12344227731227875,-0.1506771594285965,0.09598767012357712,-0.11854802072048187,-0.1694779396057129,-0.3036326766014099,0.01272799912840128,0.355130672454834,0.11748693883419037,-0.1921916902065277,0.06253601610660553,-0.008992472663521767,-0.08943237364292145,0.07774092257022858,0.09980118274688721,-0.06231308355927467,-0.023926906287670135,-0.15287624299526215,-0.012958861887454987,0.23298338055610657,-0.044136106967926025,-0.06738908588886261,0.2136751115322113,0.020884856581687927,0.019170459359884262,0.05304054170846939,0.07699457556009293,-0.04709392040967941,-0.013687841594219208,-0.1895902156829834,-0.07537028193473816,-0.009934772737324238,-0.056975316256284714,-0.06765390932559967,0.13928769528865814,-0.17901428043842316,0.08535695821046829,0.014542719349265099,0.011990601196885109,-0.04733775928616524,-0.022746175527572632,-0.11427918076515198,-0.02713892236351967,0.11226294934749603,-0.2461158186197281,0.2587936222553253,0.10381574183702469,0.14032721519470215,0.11607213318347931,0.10593312233686447,0.061977699398994446,0.03079364448785782,-0.03547735884785652,-0.12035234272480011,-0.08648461103439331,0.09446684271097183,-0.008687352761626244,0.14321735501289368,0.08460132777690887]
#fusionyellow
face_04 = [-0.0781729444861412,0.110044464468956,0.06956595182418823,-0.04195471480488777,-0.13797412812709808,-0.01051325537264347,-0.08628914505243301,-0.17037545144557953,0.10026334226131439,-0.12803488969802856,0.23731382191181183,-0.094626285135746,-0.196399986743927,-0.06725674867630005,-0.08001142740249634,0.15153078734874725,-0.14427754282951355,-0.18382087349891663,-0.05251418426632881,-0.031287774443626404,0.1022581234574318,0.0790780633687973,0.008861299604177475,0.020882003009319305,-0.06397232413291931,-0.2706142067909241,-0.107462577521801,-0.03476380184292793,0.043968893587589264,-0.05694590508937836,-0.06642886996269226,0.04123011231422424,-0.18131156265735626,-0.03576948121190071,0.04664576053619385,0.1326863318681717,0.003225848078727722,-0.047573719173669815,0.1651453822851181,-0.029888691380620003,-0.21578450500965118,0.08537548780441284,0.0766519159078598,0.2540043294429779,0.22466638684272766,0.06360028684139252,-0.019438330084085464,-0.1689058244228363,0.043127596378326416,
    # Gap in the source: several values are missing at this point; residue as printed: -0.1 0.1506 9229900836945
    -0.029495960101485252,0.1517685353755951,-0.09771490097045898,0.055743880569934845,0.18273179233074188,-0.05583849176764488,0.004194958135485649,-0.11965519189834595,0.17633725702762604,0.07909074425697327,-0.09237003326416016,-0.22499722242355347,0.13512863218784332,-0.16307398676872253,-0.16911545395851135,0.07714464515447617,-0.16546982526779175,-0.1413039118051529,-0.266265332698822,0.022507959976792336,0.4201160669326782,0.12732553482055664,-0.2231403887271881,0.036990948021411896,0.010170978493988514,-0.00629305187612772,0.1687457412481308,0.10473113507032394,0.00399409607052803,-0.013406611979007721,-0.12538184225559235,0.005433365702629089,0.2502303421497345,-0.02349778823554516,-0.07866109907627106,0.21128004789352417,0.0021615102887153625,-0.0009294161573052406,0.06593659520149231,0.031842511147260666,0.008746307343244553,0.05120106786489487,-0.19502216577529907,-0.037121932953596115,-0.027497289702296257,-0.0628679022192955,-0.06642049551010132,0.09603795409202576,-0.10845164209604263,0.13225442171096802,0.005377473309636116,0.04529388248920441,-0.05451689288020134,-0.038580723106861115,-0.12509524822235107,0.016826599836349487,0.20309969782829285,-0.22045192122459412,0.2263873666524887,0.12779565155506134,0.118931345641613,0.12429559975862503,0.16083121299743652,0.15235833823680878,0.04558601975440979,-0.07488320767879486,-0.21370939910411835,-0.07309035211801529,0.0741523876786232,-0.013888441026210785,0.16118189692497253,0.10571793466806412]
face_01_arr = np.array(face_01)
face_02_arr = np.array(face_02)
face_03_arr = np.array(face_03)
face_04_arr = np.array(face_04)

# A distance below about 0.4 generally means the two faces are probably the same person
# distance = np.linalg.norm(face_02_arr - face_04_arr)
# Toy 2-D check: sqrt((10-15)^2 + (51-78)^2) = sqrt(754), about 27.46
distance = np.linalg.norm(np.array([10,51]) - np.array([15,78]))
print(distance)

OBS virtual camera face collection

import cv2

indices = 0
# Open the OBS virtual camera by device index; for a real network camera, pass its address instead
cap = cv2.VideoCapture(0)
face_xml = cv2.CascadeClassifier("models/haarcascade_frontalface_default.xml")  # Load the Haar cascade face detector
while True:
    ret, frame = cap.read()
    if not ret:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) #Convert to grayscale image
    face = face_xml.detectMultiScale(gray, 1.3, 10) # Detect faces and return face location information

    if indices % 3000 == 0:
        print(face)

    indices += 1
    for (x, y, w, h) in face:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)

    cv2.imshow("frame", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cap.release()
cv2.destroyAllWindows()
