PP-OCR OpenVINO implementation (C++)

Foreword

Tasks during the trial period

The entry interview was all about deep learning, but after joining I never touched deep learning again; I was assigned to traditional image processing and classical algorithms. It is also a sweatshop: unpaid overtime (no probation-period subsidy either), being reachable at any time on weekends, and the salary is small. Communicating with the team leader is very difficult too; I can't tell whether the problem is his language logic or mine. I can't hold on any longer, so I've been preparing to leave and organizing my notes recently, briefly recording the projects I did. I won't explain the tools in depth, just describe the process directly.

Environment preparation

Download from the official website: Download link

I downloaded the 2022.3 version.

Also download OpenCL; I am using the 2023.2 version.

Here is a compressed package containing both pieces of software: download link

Extract the OpenVINO archive to your chosen location. The 2023 version I tried first turned out not to work for me (haha), so I downloaded the 2022 version instead. The 2023 download is an exe that installs to the C drive by default; for convenience I put the 2022 version on the C drive as well.

The path is as shown in the figure:

After unzipping, create a shortcut folder for easy use.

Then add the path:

C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\tbb\bin
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\tbb\bin\intel64\vc14
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\tbb\redist\intel64\vc14
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\bin\intel64\Release
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\bin\intel64\Debug
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\hddl\bin
C:\Program Files (x86)\Intel\openvino_2022.3.1\tools\compile_tool

Add these to the system environment variables. You can point them at the shortcut folder instead (the 2023 version's directory name is very long, so the shortcut folder helps a lot).

Then install OpenCL (I won't post the screenshot since mine is already installed).

Notice!!! Remember that OpenCV is also required, and it must be added to the environment variables too!

Then set up the project in Visual Studio (include directories, library directories, and linker dependencies):

The environment is ready.

Code

The code is written in C++: the assigned task was to write C++ code, build it into a DLL, and call it from C# to perform PP-OCR text recognition.

I'll just paste the C++ code here.

Notice!!! I referred to this article and made some changes.

Because the output is a DLL, the image is converted to a byte array on the C# side and passed to C++, and the model is passed to C++ as a file path and loaded there.

OpenvinoOcr.cpp

#include <iostream>
#include <time.h>
#include "opencv2/opencv.hpp"
#include <opencv2/core/core.hpp>
#include "OpenvinoOcr.h"
#include <inference_engine.hpp>
#include <exception>
//#include <CL/cl.hpp>
//#include <CL/opencl.h>

using namespace std;

//global variables
InferenceEngine::ExecutableNetwork* Ptr[5];
InferenceEngine::Core core;
InferenceEngine::ExecutableNetwork executable_network;
unordered_map<int, InferenceEngine::ExecutableNetwork> Models; // model dictionary
//unordered_map<int, InferenceEngine::ExecutableNetwork>::iterator ModelsIter;

//Store model variables in the dictionary; num controls how many copies are stored
DLLEXPORT bool Modelinsert(int num, InferenceEngine::ExecutableNetwork executable_network) {
    for (int i = 0; i <= num; i++) {
        Models.insert(pair<int, InferenceEngine::ExecutableNetwork>(i, executable_network));
    }
    return true;
}

//Find the corresponding model variable by key
DLLEXPORT InferenceEngine::ExecutableNetwork Find_model(int num)
{
    InferenceEngine::ExecutableNetwork executable_network;
    executable_network = Models.find(num)->second;
    return executable_network;
}

//Model initialization check
DLLEXPORT bool InnitPtr(int num) {
    if (Ptr[0] == NULL)
    {
        for (int i = 0; i <= num; i++) {
            Ptr[i] = new InferenceEngine::ExecutableNetwork();
        }
    }
    return true;
}

//Read the path and load the model into the model variable
DLLEXPORT bool Create_model(int num, char* xmlPathc) {
    try {
        if (Models.empty() || num > Models.size()) {
            string xmlPath(xmlPathc);
            InferenceEngine::CNNNetwork network = core.ReadNetwork(xmlPath);
            executable_network = core.LoadNetwork(network, "GPU");
            for (int i = 0; i <= num; i++) {
                Models.insert(pair<int, InferenceEngine::ExecutableNetwork>(i, executable_network));
            }
        }
        return true;
    }
    catch (const exception&) {
        return false; // loading failed (bad path, unsupported model, missing GPU plugin, ...)
    }
}


//Clear the models
DLLEXPORT void DeleteModel()
{
    Models.clear();
    core.UnregisterPlugin("GPU");
}

//Find (or create) the model
DLLEXPORT bool FindModel(int num, char* xmlPathc) {
    InnitPtr(num);
    Create_model(num, xmlPathc);
    return true;
}

//Normalization
void normalizeImage(const cv::Mat& image, cv::Mat& out, std::vector<double> mean, std::vector<double> stdv)
{
    if (image.empty())
        throw "normalizeImage input image is empty()!";
    if (mean.size() != stdv.size())
        throw "normalizeImage mean.size() != stdv.size()!";
    if (mean.size() != image.channels())
        throw "normalizeImage mean.size() != image.channels()";

    for (double stdv_item : stdv)
    {
        //if the standard deviation is zero, all pixels of the image are the same
        if (stdv_item == 0)
            throw "normalizeImage stdv is zero";
    }

    image.convertTo(out, CV_32F, 1.0 / 256.0f, 0);

    if (out.channels() == 1)
    {
        out -= mean[0];
        out /= stdv[0];
    }
    else if (out.channels() > 1)
    {
        std::vector<cv::Mat> channelImage;
        cv::split(out, channelImage);
        for (int i = 0; i < out.channels(); i++)
        {
            channelImage[i] -= mean[i];
            channelImage[i] /= stdv[i];
        }
        cv::merge(channelImage, out);
    }

    return;
}
//Padding
void paddingImage(const cv::Mat& image, cv::Mat& out,
    int top, int left, int bottom, int right,
    int borderType, const cv::Scalar& value)
{
    if (image.empty())
        throw "padding input image is empty()!";

    cv::copyMakeBorder(image, out, top, bottom, left, right, borderType, value);

    return;
}

//Preprocessing
void paddleOCRPreprocess(const cv::Mat& image, cv::Mat& out, const int targetHeight, const int targetWidth,
    std::vector<double> mean, std::vector<double> stdv)
{
    int sourceWidth = image.cols;
    int sourceHeight = image.rows;
    double sourceWHRatio = (double)sourceWidth / sourceHeight;

    int newHeight = targetHeight;
    int newWidth = newHeight * sourceWHRatio;

    if (newWidth > targetWidth)
        newWidth = targetWidth;
    cv::resize(image, out, cv::Size(newWidth, newHeight));

    normalizeImage(out, out, mean, stdv);

    //Padding: the resized height always equals targetHeight, but the width may not reach targetWidth
    if (newWidth < targetWidth)
    {
        int right = targetWidth - newWidth;
        //paddingImage(out, out, 0, 0, 0, right, cv::BORDER_REPLICATE); // pad by replicating the last column
        paddingImage(out, out, 0, 0, 0, right, cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0)); // zero padding
    }
    //showImage(out, "padding", 1, 0);
}
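The width logic above is the core of the recognition preprocessing: the crop is resized to a fixed height, the width follows the source aspect ratio, and any shortfall against the target width is filled with zero padding on the right. Here is a minimal standalone sketch of just that arithmetic (the function name is mine, not part of the original code):

```cpp
#include <algorithm>
#include <utility>

// Compute the resized width and the amount of right padding, mirroring
// paddleOCRPreprocess: height is forced to targetH, width keeps the source
// aspect ratio (truncated to int, as in the code above), clamped to targetW.
std::pair<int, int> recResizeWidth(int srcW, int srcH, int targetH, int targetW)
{
    double ratio = static_cast<double>(srcW) / srcH;
    int newW = static_cast<int>(targetH * ratio);
    if (newW > targetW)
        newW = targetW;
    return { newW, targetW - newW }; // {resized width, right padding}
}
```

For example, a 100x50 crop resized for a 48x320 input becomes 96 pixels wide with 224 pixels of padding, while a very wide crop is clamped to 320 with no padding.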

//Post-processing (CTC decoding)
void paddleOCRPostProcess(cv::Mat& output, std::string& result, float& prob)
{
    //character dictionary; index 0 of the network output is the CTC blank
    std::string dict = "0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~!\"#$%&'()*+,-./ ";

    if (output.empty())
        return;
    result = "";
    int h = output.rows;
    int w = output.cols;
    std::vector<int> maxIndex;
    std::vector<float> maxProb;

    double maxVal;
    cv::Point maxLoc;
    for (int row = 0; row < h; row++)
    {
        cv::Mat temp(1, w, CV_32FC1, output.ptr<float>(row));
        cv::minMaxLoc(temp, NULL, &maxVal, NULL, &maxLoc);
        maxIndex.push_back(maxLoc.x);
        maxProb.push_back((float)maxVal);
    }

    std::vector<int> selectedIndex;
    std::vector<float> selectedProb;
    //Keep the positions in maxIndex that differ from the previous index and are not 0 (blank);
    //handle the first element separately
    if (maxIndex.size() != 0 && maxIndex[0] != 0)
    {
        selectedIndex.push_back(maxIndex[0]);
        selectedProb.push_back(maxProb[0]);
    }
    for (int i = 1; i < maxIndex.size(); i++)
    {
        if (maxIndex[i] != maxIndex[i - 1] && maxIndex[i] != 0)
        {
            selectedIndex.push_back(maxIndex[i]);
            selectedProb.push_back(maxProb[i]);
        }
    }

    double meanProb = 0;
    for (int i = 0; i < selectedIndex.size(); i++)
    {
        result += dict[selectedIndex[i] - 1]; // -1: skip the blank at output index 0
        meanProb += selectedProb[i];
    }
    if (selectedIndex.size() == 0)
        meanProb = 0;
    else
        meanProb /= selectedIndex.size();
    prob = meanProb;
    return;
}
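The selection loop above is CTC greedy decoding: index 0 of the network output is the blank symbol, consecutive repeats of the same index are collapsed, and the survivors are shifted by -1 into the dictionary. A self-contained sketch of the same rule (the function name is mine):

```cpp
#include <string>
#include <vector>

// Minimal CTC greedy decode, matching the selection logic in
// paddleOCRPostProcess: drop blanks (index 0), collapse repeats,
// map the rest into the dictionary with an offset of -1.
std::string ctcGreedyDecode(const std::vector<int>& maxIndex,
                            const std::string& dict)
{
    std::string result;
    int prev = 0; // treat "before the first step" as blank
    for (int idx : maxIndex)
    {
        if (idx != 0 && idx != prev)
            result += dict[idx - 1];
        prev = idx;
    }
    return result;
}
```

Note that a blank between two identical indices separates them, so the same character can legitimately appear twice in a row in the decoded text.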

//Convert std::string to a heap-allocated char* (the caller owns the returned buffer)
char* string2char(string s) {
    char* c;
    const int len = s.length();
    c = new char[len + 1];
    strcpy(c, s.c_str());
    return c;
}

DLLEXPORT char* OpenvinoPpocr(unsigned char* ucImg, int width, int height, double* similarity, int num)
{
    string result = "";
    cv::Mat image = cv::Mat(height, width, CV_8UC3, ucImg, 0); // wrap the byte array from C# (BGR, no copy)
    string inputNodeName = "x", outputNodeName = "softmax_2.tmp_0";

    vector<double> mean = { 0.5, 0.5, 0.5 };
    vector<double> stdv = { 0.5, 0.5, 0.5 };
    const int targetHeight = 48;
    const int targetWidth = 320;
    Ptr[num] = &(Models.find(num)->second);
    InferenceEngine::InferRequest inferRequest = Ptr[num]->CreateInferRequest();
    inferRequest.Infer(); // first inference on the not-yet-filled blob (warm-up)
    InferenceEngine::Blob::Ptr inputBlobPtr = inferRequest.GetBlob(inputNodeName);
    InferenceEngine::SizeVector inputSize = inputBlobPtr->getTensorDesc().getDims();
    auto inputdata = inputBlobPtr->buffer()
        .as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
    InferenceEngine::Blob::Ptr outputBlobPtr = inferRequest.GetBlob(outputNodeName);
    auto outputData = outputBlobPtr->buffer()
        .as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
    InferenceEngine::SizeVector outputSize = outputBlobPtr->getTensorDesc().getDims();
    cv::Mat input, output;
    size_t channels = inputSize[1];
    size_t inputHeight = inputSize[2];
    size_t inputWidth = inputSize[3];
    size_t imageSize = inputHeight * inputWidth;
    paddleOCRPreprocess(image, input, targetHeight, targetWidth, mean, stdv);
    // copy HWC (interleaved) pixels into the planar NCHW input blob
    for (size_t pid = 0; pid < imageSize; ++pid)
    {
        for (size_t ch = 0; ch < channels; ++ch)
        {
            inputdata[imageSize * ch + pid] = input.at<cv::Vec3f>(pid)[ch];
        }
    }
    inferRequest.Infer();
    cv::Mat temp(outputSize[1], outputSize[2], CV_32FC1, outputData);
    output = temp;
    float prob;
    paddleOCRPostProcess(output, result, prob);
    *similarity = (double)prob;
    char* resultc = string2char(result);
    return resultc;
}


The .h file. If you add or modify an interface, remember that the declarations in the .h file must match the definitions in the .cpp file.

#pragma once
#ifndef _PADDLEOCR_H_
#define _PADDLEOCR_H_

#include "opencv.hpp"
#include "inference_engine.hpp"
#define DLLEXPORT extern "C" __declspec(dllexport)

using namespace std;

void normalizeImage(const cv::Mat& image, cv::Mat& out, std::vector<double> mean, std::vector<double> stdv);

void paddingImage(const cv::Mat& image, cv::Mat& out,
    int top, int left, int bottom, int right,
    int borderType, const cv::Scalar& value = cv::Scalar());

void paddleOCRPreprocess(const cv::Mat& image, cv::Mat& out, const int targetHeight, const int targetWidth,
    std::vector<double> mean, std::vector<double> stdv);

void paddleOCRPostProcess(cv::Mat& output, std::string& result, float& prob);

char* string2char(string s);

//DLLEXPORT void demo(char* xmlPathc, char* binPathc, unsigned char* ucImg, int width, int height, char* resultc);
DLLEXPORT char* OpenvinoPpocr(unsigned char* ucImg, int width, int height, double* similarity, int num);

DLLEXPORT bool Create_model(int num, char* xmlPathc);

DLLEXPORT void DeleteModel();

DLLEXPORT bool InnitPtr(int num);

DLLEXPORT bool FindModel(int num, char* xmlPathc);

#endif // !_PADDLEOCR_H_



Model

It seems that ONNX models can now be used directly without any problem, but what I had was an IR model. I looked up the conversion method online; it may require the OpenVINO Python package (the Model Optimizer), and you can find the details on CSDN; it is not difficult. Because this is still a deep-learning model and our models are fairly task-specific, I won't include ours here. The modified code above is there for reference; please make good use of it.