Foreword
Tasks during the trial period
My entry interview was all about deep learning, but after joining I never touched deep learning again; I was assigned to traditional image processing and classical algorithms instead. It is also a sweatshop: unpaid overtime (no probation-period subsidy), being reachable on weekends at any time, and only a small salary. Communicating with the team leader is difficult too; I don't know whether the problem is his logic or mine. I can't hold on any longer, so I have been preparing to leave and have been organizing my files, and here I briefly record the projects I did. I won't explain the tools in depth, just describe the process directly.
Environment preparation
Download from the official website: Download link
I downloaded version 2022.3.
Download OpenCL at the same time; I am using version 2023.2.
Here is a compressed package containing both pieces of software: download link
Extract the OpenVINO archive to the desired location. The 2023 version I tried first turned out not to work for me (haha), so I downloaded the 2022 version instead. The original 2023 version ships as an exe that installs to the C drive by default, so for convenience I put the 2022 version on the C drive as well.
The path is as shown in the figure:
After unzipping, create a shortcut folder for easy use.
Then add the paths:
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\tbb\bin
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\tbb\bin\intel64\vc14
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\tbb\redist\intel64\vc14
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\bin\intel64\Release
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\bin\intel64\Debug
C:\Program Files (x86)\Intel\openvino_2022.3.1\runtime\3rdparty\hddl\bin
C:\Program Files (x86)\Intel\openvino_2022.3.1\tools\compile_tool
Add them to the system environment variables. You can point them at the shortcut folder (the 2023 version's directory name is long and complicated, so a shortcut folder makes this much easier).
Then install OpenCL (I won't post a screenshot since mine is already installed).
Note!!! Also remember to install OpenCV, and to set its environment variables as well!
Then in VS:
The environment is ready.
Code
The assigned task: write C++ code, build it into a DLL, and use that DLL from C# to run PP-OCR text recognition.
I'll just paste the C++ code here.
Note!!! I referred to this article and made some changes.
Because the result is a DLL, the image is converted to a byte array in C# and passed to C++, and the model is passed to C++ as a file path and loaded there.
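A note on the buffer layout this assumes (a sketch; the helper name `pixelIndex` is mine, not part of the project): the C# side must pass the bitmap as one tightly packed, row-major BGR byte buffer, because `cv::Mat(height, width, CV_8UC3, ptr, 0)` wraps the bytes without copying and, with step 0 (AUTO_STEP), assumes rows have no padding. The byte offset of channel `ch` of pixel `(row, col)` is then:

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of channel `ch` of pixel (row, col) in a tightly packed,
// row-major, 3-channel (e.g. BGR) buffer of the given width.
std::size_t pixelIndex(int row, int col, int ch, int width) {
    return (static_cast<std::size_t>(row) * width + col) * 3 + ch;
}
```

If the C# bitmap has stride padding (rows rounded up to a 4-byte boundary), the bytes must be repacked before passing them over, or the wrapped `cv::Mat` will be skewed.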
OpenvinoOcr.cpp
#include <iostream>
#include <time.h>
#include "opencv2/opencv.hpp"
#include <opencv2/core/core.hpp>
#include "OpenvinoOcr.h"
#include <inference_engine.hpp>
#include <exception>
//#include <CL/cl.hpp>
//#include <CL/opencl.h>

using namespace std;

// Global variables
InferenceEngine::ExecutableNetwork* Ptr[5];
InferenceEngine::Core core;
InferenceEngine::ExecutableNetwork executable_network;
unordered_map<int, InferenceEngine::ExecutableNetwork> Models; // dictionary of loaded models
//unordered_map<int, InferenceEngine::ExecutableNetwork>::iterator ModelsIter;

// Store model variables in the dictionary; num controls how many entries are stored
DLLEXPORT bool Modelinsert(int num, InferenceEngine::ExecutableNetwork executable_network)
{
    for (int i = 0; i <= num; i++) {
        Models.insert(pair<int, InferenceEngine::ExecutableNetwork>(i, executable_network));
    }
    return true;
}

// Look up the model variable for the given key
DLLEXPORT InferenceEngine::ExecutableNetwork Find_model(int num)
{
    InferenceEngine::ExecutableNetwork executable_network;
    executable_network = Models.find(num)->second;
    return executable_network;
}

// Model initialization check
DLLEXPORT bool InnitPtr(int num)
{
    if (Ptr[0] == NULL) {
        for (int i = 0; i <= num; i++) {
            Ptr[i] = new InferenceEngine::ExecutableNetwork();
        }
    }
    return true;
}

// Read the model from the given path and store it in the dictionary
DLLEXPORT bool Create_model(int num, char* xmlPathc)
{
    try {
        if (Models.empty() || num > (int)Models.size()) {
            string xmlPath(xmlPathc);
            InferenceEngine::CNNNetwork network = core.ReadNetwork(xmlPath);
            executable_network = core.LoadNetwork(network, "GPU");
            for (int i = 0; i <= num; i++) {
                Models.insert(pair<int, InferenceEngine::ExecutableNetwork>(i, executable_network));
            }
        }
        return true;
    }
    catch (const exception&) {
        return false;
    }
}

// Clear the models
DLLEXPORT void DeleteModel()
{
    Models.clear();
    core.UnregisterPlugin("GPU");
}

// Find (or create) a model
DLLEXPORT bool FindModel(int num, char* xmlPathc)
{
    InnitPtr(num);
    Create_model(num, xmlPathc);
    return true;
}

// Normalization
void normalizeImage(const cv::Mat& image, cv::Mat& out, std::vector<double> mean, std::vector<double> stdv)
{
    if (image.empty())
        throw "normalizeImage input image is empty()!";
    if (mean.size() != stdv.size())
        throw "normalizeImage mean.size() != stdv.size()!";
    if ((int)mean.size() != image.channels())
        throw "normalizeImage mean.size() != image.channels()";
    for (double stdv_item : stdv) {
        // if the standard deviation is zero, all pixels of the image are the same
        if (stdv_item == 0)
            throw "normalizeImage stdv is zero";
    }
    image.convertTo(out, CV_32F, 1.0 / 256.0f, 0);
    if (out.channels() == 1) {
        out -= mean[0];
        out /= stdv[0];
    }
    else if (out.channels() > 1) {
        std::vector<cv::Mat> channelImage;
        cv::split(out, channelImage);
        for (int i = 0; i < out.channels(); i++) {
            channelImage[i] -= mean[i];
            channelImage[i] /= stdv[i];
        }
        cv::merge(channelImage, out);
    }
    return;
}

// Padding
void paddingImage(const cv::Mat& image, cv::Mat& out, int top, int left, int bottom, int right,
                  int borderType, const cv::Scalar& value)
{
    if (image.empty())
        throw "padding input image is empty()!";
    cv::copyMakeBorder(image, out, top, bottom, left, right, borderType, value);
    return;
}

// Preprocessing: resize to the target height while keeping the aspect ratio, then pad the width
void paddleOCRPreprocess(const cv::Mat& image, cv::Mat& out, const int targetHeight, const int targetWidth,
                         std::vector<double> mean, std::vector<double> stdv)
{
    int sourceWidth = image.cols;
    int sourceHeight = image.rows;
    double sourceWHRatio = (double)sourceWidth / sourceHeight;
    int newHeight = targetHeight;
    int newWidth = newHeight * sourceWHRatio;
    if (newWidth > targetWidth)
        newWidth = targetWidth;
    cv::resize(image, out, cv::Size(newWidth, newHeight));
    normalizeImage(out, out, mean, stdv);
    // Padding: the resized height always equals targetHeight, but the width may not reach targetWidth
    if (newWidth < targetWidth) {
        int right = targetWidth - newWidth;
        //paddingImage(out, out, 0, 0, 0, right, cv::BORDER_REPLICATE); // pad by replicating the last column
        paddingImage(out, out, 0, 0, 0, right, cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0)); // zero padding
    }
    //showImage(out, "padding", 1, 0);
}

// Post-processing: greedy CTC decoding of the recognition output
void paddleOCRPostProcess(cv::Mat& output, std::string& result, float& prob)
{
    std::string dict = "0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~!\"#$%&'()*+,-./ ";
    if (output.empty())
        return;
    result = "";
    int h = output.rows;
    int w = output.cols;
    std::vector<int> maxIndex;
    std::vector<float> maxProb;
    double maxVal;
    cv::Point maxLoc;
    for (int row = 0; row < h; row++) {
        cv::Mat temp(1, w, CV_32FC1, output.ptr<float>(row));
        cv::minMaxLoc(temp, NULL, &maxVal, NULL, &maxLoc);
        maxIndex.push_back(maxLoc.x);
        maxProb.push_back((float)maxVal);
    }
    std::vector<int> selectedIndex;
    std::vector<float> selectedProb;
    // Keep positions whose index differs from the previous one and is not 0 (the CTC blank);
    // check the first element separately
    if (maxIndex.size() != 0 && maxIndex[0] != 0) {
        selectedIndex.push_back(maxIndex[0]);
        selectedProb.push_back(maxProb[0]);
    }
    for (int i = 1; i < (int)maxIndex.size(); i++) {
        if (maxIndex[i] != maxIndex[i - 1] && maxIndex[i] != 0) {
            selectedIndex.push_back(maxIndex[i]);
            selectedProb.push_back(maxProb[i]);
        }
    }
    double meanProb = 0;
    for (int i = 0; i < (int)selectedIndex.size(); i++) {
        result += dict[selectedIndex[i] - 1]; // index 0 is the blank, so shift by one
        meanProb += selectedProb[i];
    }
    if (selectedIndex.size() == 0)
        meanProb = 0;
    else
        meanProb /= selectedIndex.size();
    prob = (float)meanProb;
    return;
}

char* string2char(string s)
{
    char* c;
    const int len = s.length();
    c = new char[len + 1];
    strcpy(c, s.c_str());
    return c;
}

DLLEXPORT char* OpenvinoPpocr(unsigned char* ucImg, int width, int height, double* similarity, int num)
{
    string result = "";
    cv::Mat image = cv::Mat(height, width, CV_8UC3, ucImg, 0);
    string inputNodeName = "x", outputNodeName = "softmax_2.tmp_0";
    vector<double> mean = { 0.5, 0.5, 0.5 };
    vector<double> stdv = { 0.5, 0.5, 0.5 };
    const int targetHeight = 48;
    const int targetWidth = 320;
    Ptr[num] = &(Models.find(num)->second);
    InferenceEngine::InferRequest inferRequest = Ptr[num]->CreateInferRequest();
    inferRequest.Infer();
    InferenceEngine::Blob::Ptr inputBlobPtr = inferRequest.GetBlob(inputNodeName);
    InferenceEngine::SizeVector inputSize = inputBlobPtr->getTensorDesc().getDims();
    auto inputdata = inputBlobPtr->buffer()
        .as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
    InferenceEngine::Blob::Ptr outputBlobPtr = inferRequest.GetBlob(outputNodeName);
    auto outputData = outputBlobPtr->buffer()
        .as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
    InferenceEngine::SizeVector outputSize = outputBlobPtr->getTensorDesc().getDims();
    cv::Mat input, output;
    size_t channels = inputSize[1];
    size_t inputHeight = inputSize[2];
    size_t inputWidth = inputSize[3];
    size_t imageSize = inputHeight * inputWidth;
    paddleOCRPreprocess(image, input, targetHeight, targetWidth, mean, stdv);
    // Copy the preprocessed HWC image into the NCHW input blob
    for (size_t pid = 0; pid < imageSize; ++pid) {
        for (size_t ch = 0; ch < channels; ++ch) {
            inputdata[imageSize * ch + pid] = input.at<cv::Vec3f>(pid)[ch];
        }
    }
    inferRequest.Infer();
    cv::Mat temp(outputSize[1], outputSize[2], CV_32FC1, outputData);
    output = temp;
    float prob;
    paddleOCRPostProcess(output, result, prob);
    *similarity = (double)prob;
    char* resultc = string2char(result);
    return resultc;
}
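For clarity, the decoding rule inside paddleOCRPostProcess above is greedy CTC decoding: take the argmax at each time step, drop positions that repeat the previous index or equal 0 (the CTC blank), and map each surviving index i to dict[i - 1]. Here is that logic isolated as a dependency-free sketch (the tiny dictionary and probability rows below are made up for illustration):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Greedy CTC decode: argmax per time step, collapse consecutive repeats,
// skip the blank (index 0), map index i to dict[i - 1].
// meanProb receives the average probability of the kept characters.
std::string ctcGreedyDecode(const std::vector<std::vector<float>>& probs,
                            const std::string& dict, float& meanProb) {
    std::string result;
    int prev = 0;      // argmax of the previous time step
    float sum = 0.0f;
    int kept = 0;
    for (const auto& row : probs) {
        int idx = (int)(std::max_element(row.begin(), row.end()) - row.begin());
        if (idx != 0 && idx != prev) {   // not the blank, not a repeat
            result += dict[idx - 1];     // index 0 is the blank, so shift by one
            sum += row[idx];
            ++kept;
        }
        prev = idx;
    }
    meanProb = kept ? sum / kept : 0.0f;
    return result;
}
```

With dict "ab" and four time steps whose argmaxes are 1, 1, 0, 2, the repeat and the blank are dropped and the result is "ab".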
The .h file. If you want to add or modify an interface, remember that the declarations in the .cpp and .h files must stay consistent.
#pragma once
#ifndef _PADDLEOCR_H_
#define _PADDLEOCR_H_

#include "opencv.hpp"
#include "inference_engine.hpp"

#define DLLEXPORT extern "C" __declspec(dllexport)

using namespace std;

void normalizeImage(const cv::Mat& image, cv::Mat& out, std::vector<double> mean, std::vector<double> stdv);
void paddingImage(const cv::Mat& image, cv::Mat& out, int top, int left, int bottom, int right,
                  int borderType, const cv::Scalar& value = cv::Scalar());
void paddleOCRPreprocess(const cv::Mat& image, cv::Mat& out, const int targetHeight, const int targetWidth,
                         std::vector<double> mean, std::vector<double> stdv);
void paddleOCRPostProcess(cv::Mat& output, std::string& result, float& prob);
char* string2char(string s);

//DLLEXPORT void demo(char* xmlPathc, char* binPathc, unsigned char* ucImg, int width, int height, char* resultc);
DLLEXPORT char* OpenvinoPpocr(unsigned char* ucImg, int width, int height, double* similarity, int num);
DLLEXPORT bool Create_model(int num, char* xmlPathc);
DLLEXPORT void DeleteModel();
DLLEXPORT bool InnitPtr(int num);
DLLEXPORT bool FindModel(int num, char* xmlPathc);

#endif // !_PADDLEOCR_H_
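The DLLEXPORT macro above is what makes these functions callable from C# via P/Invoke: extern "C" turns off C++ name mangling so the exported symbol keeps its plain name, and __declspec(dllexport) puts it in the DLL's export table. A minimal sketch of the pattern (AddExample is a made-up function for illustration; the #ifdef is only so the snippet also compiles on non-Windows compilers):

```cpp
#include <cassert>

// extern "C" disables C++ name mangling so the exported name matches
// what C# [DllImport] looks up; __declspec(dllexport) is Windows-only.
#ifdef _WIN32
#define DLLEXPORT extern "C" __declspec(dllexport)
#else
#define DLLEXPORT extern "C"
#endif

// Hypothetical exported function, for illustration only.
DLLEXPORT int AddExample(int a, int b) {
    return a + b;
}
```

On the C# side this would be declared roughly as `[DllImport("YourDll.dll")] static extern int AddExample(int a, int b);` (the DLL file name here is a placeholder).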
Model
It seems ONNX models can now be used directly without problems, but what I had was an IR model. I searched online for the conversion method; it apparently requires the Python openvino package (the Model Optimizer). You can find the details on CSDN; it is not difficult. Since this is still a deep-learning algorithm and the models are fairly task-specific, I won't include ours here. The modified code above is provided for reference; use it well.