Design and implementation of a Python image retrieval system (computer competition project)

0 Preface

This post is part of a series of high-quality competition projects. Today's topic is:

Design and implementation of a Python image retrieval system

Here is an overall rating of the topic from a senior student (each item is scored out of 5 points):

  • Difficulty factor: 3 points
  • Workload: 3 points
  • Innovation point: 4 points

The topic is relatively new and well suited to a competition. Senior students highly recommend it!

More information, project sharing:

https://gitee.com/dancheng-senior/postgraduate

1 Introduction to the topic

Image retrieval means finding, in a collection of pictures, those similar to a given query image; in other words, searching for pictures by picture.
In the Internet era, with the rise of social networks, the volume of picture and video data online grows at an alarming rate every day, gradually forming enormous image databases. How to efficiently retrieve the pictures a user needs from such huge, information-rich collections has become an active research direction in the field of information retrieval.

2 Introduction to image retrieval

Given a query image containing a specific instance (e.g. a specific object, scene, building, etc.), image retrieval aims to find images containing the same instance from database images. However, due to different shooting angles, lighting, or occlusion conditions of different images, how to design an effective and efficient image retrieval algorithm that can cope with these intra-class differences is still a research problem.

Typical process of image retrieval
First, a suitable representation vector is extracted from each image. Second, a nearest neighbor search is performed over these representation vectors, using Euclidean distance or cosine distance, to find similar images. Finally, post-processing techniques can be applied to fine-tune the retrieval results. The performance of an image retrieval algorithm therefore depends mainly on the quality of the extracted image representation.
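The nearest neighbor step can be illustrated with a minimal NumPy sketch. The function names `l2_normalize` and `search` and the toy vectors are assumptions made for illustration; they are not part of the project code.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def search(query_vec, db_vecs, top_k=5, metric="cosine"):
    """Return (indices, distances) of the top_k database vectors closest to the query."""
    if metric == "cosine":
        # Cosine distance = 1 - cosine similarity of the normalized vectors.
        dists = 1.0 - l2_normalize(db_vecs) @ l2_normalize(query_vec)
    elif metric == "euclidean":
        dists = np.linalg.norm(db_vecs - query_vec, axis=1)
    else:
        raise ValueError(f"unknown metric: {metric}")
    order = np.argsort(dists)[:top_k]
    return order, dists[order]


# Toy example: four 3-dimensional "image representations" (hypothetical data).
db = np.array([[1.0, 0.0, 0.0],
               [0.9, 0.1, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
query = np.array([1.0, 0.05, 0.0])
indices, distances = search(query, db, top_k=2)
```

This brute-force scan compares the query with every database vector, which is fine for small collections; large databases usually need an approximate nearest neighbor index instead.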

(1) Unsupervised image retrieval

Unsupervised image retrieval aims to extract image representations without resorting to other supervised information, using only the ImageNet pre-trained model as a fixed feature extractor.

Intuitive approach
Since deep fully connected features provide a high-level description of image content and are already in a "natural" vector form, an intuitive idea is to use them directly as the representation vector of the image. However, because fully connected features are designed for image classification and lack a description of image details, the retrieval accuracy of this approach is only moderate.

Utilizing deep convolutional features
Since deep convolutional features have better detailed information and can handle any

CroW
Deep convolutional features are a distributed representation. Although the response value of a single neuron says little about whether the corresponding area contains the target, an area is likely to contain the target if many neurons respond strongly there at the same time. Therefore, CroW sums the feature maps along the channel direction to obtain a two-dimensional aggregation map, and uses the result after normalization and square-root scaling as the spatial weight. The channel weight of CroW is defined according to the sparsity of each feature map. This is similar to the IDF term of TF-IDF in natural language processing: it boosts features that appear infrequently but are discriminative.
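The CroW weighting described above can be sketched as follows. This is an illustrative approximation under stated assumptions, not a reference implementation: the function name `crow_descriptor`, the `eps` smoothing constant, and the random feature map are all hypothetical, and the input is assumed to be a non-negative (post-ReLU) feature map of shape (C, H, W).

```python
import numpy as np


def crow_descriptor(feature_map, eps=1e-6):
    """Aggregate a (C, H, W) non-negative feature map into a C-dimensional vector."""
    C, H, W = feature_map.shape

    # Spatial weights: sum over channels, L2-normalize, then take the square root.
    spatial = feature_map.sum(axis=0)
    spatial = spatial / (np.linalg.norm(spatial) + eps)
    spatial = np.sqrt(spatial)

    # Channel weights: channels that are rarely active get larger weights (IDF-like).
    nonzero_ratio = (feature_map > 0).reshape(C, -1).mean(axis=1)
    channel = np.log((C * eps + nonzero_ratio.sum()) / (eps + nonzero_ratio))

    # Spatially weighted sum-pooling, followed by channel weighting.
    pooled = (feature_map * spatial[None, :, :]).sum(axis=(1, 2))
    descriptor = pooled * channel
    return descriptor / (np.linalg.norm(descriptor) + eps)


# Hypothetical stand-in for a convolutional feature map (8 channels, 5x5 positions).
rng = np.random.default_rng(0)
fmap = np.maximum(rng.normal(size=(8, 5, 5)), 0.0)
vec = crow_descriptor(fmap)
```

In practice the feature map would come from the last convolutional layer of a pre-trained network rather than from random numbers.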

Class weighted features
This method attempts to combine the network’s category prediction information to make the spatial weights more discriminative. Specifically, it uses CAM to obtain the semantic information of the most representative regions corresponding to each category in the pre-trained network, and then uses the normalized CAM results as spatial weights.

PWA
PWA found that different channels of deep convolutional features correspond to responses from different parts of the target. Therefore, PWA selects a series of discriminative feature maps, combines their normalized results as spatial weights, and concatenates the results as the final image representation.

(2) Supervised image retrieval

Supervised image retrieval first fine-tunes the ImageNet pre-trained model on an additional training data set, and then extracts image representations from the fine-tuned model. To achieve better results, the training data set used for fine-tuning is usually similar to the data set to be used for retrieval. In addition, a region proposal network can be used to extract foreground regions of the image that may contain objects.

Siamese network
Similar to the idea behind face recognition, the model is trained on pairs or triplets of images (for a triplet: an anchor, a similar sample, and a dissimilar sample), so that the distance between similar samples becomes as small as possible and the distance between dissimilar samples as large as possible.
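The ternary (triplet) training objective mentioned above is commonly written as a hinge loss on squared distances. The sketch below is a minimal NumPy version for illustration only; the function name `triplet_loss`, the `margin` value, and the toy embeddings are assumptions, and a real system would compute this inside a deep learning framework so gradients can flow back to the network.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge loss over a batch of (anchor, positive, negative) embeddings."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # distance to similar sample
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # distance to dissimilar sample
    # The loss is zero once the negative is farther than the positive by the margin.
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))


anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.1, 0.0]])
negative = np.array([[1.0, 0.0]])
easy = triplet_loss(anchor, positive, negative)  # well-separated triplet
hard = triplet_loss(anchor, negative, positive)  # roles swapped: large loss
```

The well-separated triplet contributes no loss, while the swapped one is penalized, which is what pushes similar samples together and dissimilar samples apart during training.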

3 Image retrieval steps

Image retrieval mainly consists of the following steps:

  • Input image

  • Feature extraction: reduce the dimensionality of the image data and extract its discriminative information. Generally, an image is reduced to a single vector.

  • Metric learning: a metric function computes the distance between image features, and this distance is used as the loss to train the feature extraction network, so that features extracted from similar images are close together and features extracted from different kinds of images are far apart.

  • Re-ranking: use the manifold relationship between data points to reorder the initial ranking and obtain better retrieval results.
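To give a feel for the re-ranking step, here is a small sketch. Note that it does not implement a manifold-based method; it swaps in average query expansion, a simpler re-ranking technique, as a stand-in. The function name `rerank_with_query_expansion` and the toy vectors are assumptions made for illustration.

```python
import numpy as np


def rerank_with_query_expansion(query_vec, db_vecs, top_k=3):
    """Re-rank by averaging the query with its top_k first-pass neighbours."""
    def normalize(x):
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)

    db_n = normalize(db_vecs)
    q_n = normalize(query_vec)
    # First pass: rank by cosine similarity to the original query.
    initial = np.argsort(-(db_n @ q_n))
    # Second pass: build an expanded query from the query and its best matches.
    expanded = normalize(q_n + db_n[initial[:top_k]].sum(axis=0))
    reranked = np.argsort(-(db_n @ expanded))
    return initial, reranked


# Hypothetical 2-dimensional features for four database images.
db = np.array([[1.0, 0.0],
               [0.8, 0.6],
               [0.0, 1.0],
               [-1.0, 0.0]])
query = np.array([1.0, 0.1])
initial, reranked = rerank_with_query_expansion(query, db, top_k=2)
```

The idea shared with manifold-based re-ranking is that neighbours of the query carry information about which other images are relevant; query expansion exploits this with a single extra search.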

4 Application examples

A senior student built a demo image search tool for this project. The effect is as follows.

Project code:

Key code:

    # -*- coding: utf-8 -*-
    from math import sqrt
    import cv2
    import time
    import os
    import numpy as np
    from scipy.stats import pearsonr
    import pymysql
    # Configuration files
    from config import *
    from mysql_config import *
    from utils import getColorVec, Bdistance

    db = pymysql.connect(host=DB_addr, user=DB_user, password=DB_passwod, database=DB_name)


    def query(filename):
        """Print the MATCH_ITEM_NUM stored images most correlated with the given image."""
        if filename == "":
            fileToProcess = input("Enter the file name of the picture in the subfolder: ")
        else:
            fileToProcess = filename
        # fileToProcess = "45.jpg"
        if not os.path.exists(FOLDER + fileToProcess):
            raise RuntimeError("File does not exist")
        start_time = time.time()
        img = cv2.imread(FOLDER + fileToProcess)
        colorVec1 = getColorVec(img)
        # Use a streaming (server-side) cursor so rows are fetched one at a time
        conn = pymysql.connect(host=DB_addr, user=DB_user, password=DB_passwod, database=DB_name,
                               port=3306, charset='utf8', cursorclass=pymysql.cursors.SSCursor)
        leastNearRInFive = 0

        # Rlist holds the best correlations so far in descending order;
        # namelist holds the matching file names.
        Rlist = [0] * MATCH_ITEM_NUM
        namelist = ["k"] * MATCH_ITEM_NUM

        try:
            with conn.cursor() as cursor:
                cursor.execute("select name, featureValue from " + TABLE_NAME + " order by name")
                row = cursor.fetchone()
                count = 1
                while row is not None:
                    # Skip the query image itself
                    if row[0] == fileToProcess:
                        row = cursor.fetchone()
                        continue
                    # Parse the stored comma-separated feature vector (float() instead of eval())
                    colorVec2 = list(map(float, row[1].split(',')))
                    R2 = pearsonr(colorVec1, colorVec2)
                    rela = R2[0]
                    # R2 = Bdistance(colorVec1, colorVec2)
                    # rela = R2
                    # To ignore the sign of the correlation:
                    # if abs(rela) > abs(leastNearRInFive):
                    # Take the sign into account:
                    if rela > leastNearRInFive:
                        index = 0
                        for one in Rlist:
                            if rela > one:
                                # Insert at the right position and drop the weakest match
                                Rlist.insert(index, rela)
                                Rlist.pop(MATCH_ITEM_NUM)
                                namelist.insert(index, row[0])
                                namelist.pop(MATCH_ITEM_NUM)
                                leastNearRInFive = Rlist[MATCH_ITEM_NUM - 1]
                                break
                            index += 1
                    count += 1
                    row = cursor.fetchone()
        finally:
            conn.close()
        end_time = time.time()
        time_cost = end_time - start_time
        print("spend ", time_cost, ' s')
        for one in range(0, MATCH_ITEM_NUM):
            print(namelist[one] + "\t\t" + str(float(Rlist[one])))


    if __name__ == '__main__':
        # WriteDb()
        # exit()
        query("")
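The key code relies on `getColorVec` from `utils`, which is not shown here. As a rough idea of what such a colour feature might look like, the sketch below builds a per-channel colour histogram, stores it as a comma-separated string (the format the query code parses), and compares two images by Pearson correlation. Everything in it is an assumption for illustration: `color_vector` and `to_db_string` are hypothetical names, the bin count is arbitrary, and `np.corrcoef` is used in place of `scipy.stats.pearsonr` to keep the example NumPy-only (both give the same correlation coefficient).

```python
import numpy as np


def color_vector(img, bins=8):
    """Concatenate per-channel histograms of an (H, W, 3) uint8 image, normalized to sum to 1."""
    hists = [np.histogram(img[:, :, c], bins=bins, range=(0, 256))[0] for c in range(3)]
    vec = np.concatenate(hists).astype(np.float64)
    return vec / vec.sum()


def to_db_string(vec):
    """Serialize a feature vector as a comma-separated string for storage."""
    return ",".join(f"{v:.6f}" for v in vec)


# Two synthetic images (hypothetical data): a random image and a slightly brighter copy.
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
img_b = np.clip(img_a.astype(np.int16) + 10, 0, 255).astype(np.uint8)

vec_a = color_vector(img_a)
stored = to_db_string(color_vector(img_b))           # what would be written to the table
vec_b = np.array([float(v) for v in stored.split(",")])

# Pearson correlation between the two colour vectors.
r = np.corrcoef(vec_a, vec_b)[0, 1]
```

Colour histograms ignore where colours appear in the image, so they are cheap to compute and compare but much less discriminative than the deep features discussed in section 2.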

Effect


