OpenCV image segmentation in practice: deploying a real-time portrait matting (portrait segmentation) model.

MODNet is a lightweight matting model. A previous post deployed MODNet's ONNX model with Python; this chapter deploys MODNet with NCNN. In addition, the model is statically quantized, reducing its size to about 1/4 of the original. The matting effect is shown below:

The complete code and required weights are linked at the end of the article.


## 1. NCNN compilation

For detailed steps, refer to the official compilation tutorial.

### 1. Compile protobuf

Download protobuf: https://github.com/google/protobuf/archive/v3.4.0.zip

Open the "x64 Native Tools Command Prompt for VS 2017" command-line tool from the Start menu (a newer version also works; I used the 2022 prompt successfully), and compile protobuf:

```bash
cd <protobuf-root-dir>
mkdir build
cd build
cmake -G"NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=%cd%/install -Dprotobuf_BUILD_TESTS=OFF -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ../cmake
nmake
nmake install
```


### 2. Compile NCNN


Clone the NCNN repository:

```bash
git clone https://github.com/Tencent/ncnn.git
```

Compile NCNN (I am not using Vulkan here; refer to the official tutorial if you need it), replacing the paths in the command with your own:

```bash
cd <ncnn-root-dir>
mkdir -p build
cd build
cmake -G"NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=%cd%/install -DProtobuf_INCLUDE_DIR=<protobuf-root-dir>/build/install/include -DProtobuf_LIBRARIES=<protobuf-root-dir>/build/install/lib/libprotobuf.lib -DProtobuf_PROTOC_EXECUTABLE=<protobuf-root-dir>/build/install/bin/protoc.exe -DNCNN_VULKAN=OFF -DOpenCV_DIR=C:/opencv/opencv/build ..
nmake
nmake install
```

## 2. NCNN deployment

### 1. Environment

- Windows (CPU)
- OpenCV 4.5.5
- Visual Studio 2019

### 2. ONNX to NCNN model

First, convert the simplified ONNX model obtained earlier into an NCNN model (this produces two files, with .param and .bin suffixes). Check that no errors are reported during the conversion; otherwise the model will fail to load later:

```bash
../ncnn/build/tools/onnx/onnx2ncnn simple_modnet.onnx simple_modnet.param simple_modnet.bin
```
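Before writing any inference code, it can help to verify that the converted files actually load. The following is a minimal sketch (file names as generated above, assumed to be in the working directory); in NCNN, `load_param` and `load_model` return 0 on success:

```cpp
#include <iostream>
#include "net.h"

int main() {
    ncnn::Net net;
    // Both loaders return 0 on success and non-zero on failure.
    if (net.load_param("simple_modnet.param") != 0 ||
        net.load_model("simple_modnet.bin") != 0) {
        std::cout << "Failed to load the converted model!" << std::endl;
        return -1;
    }
    std::cout << "Converted model loaded successfully." << std::endl;
    return 0;
}
```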

### 3. C++ code

Code structure: the project consists of three files, MODNet.h, MODNet.cpp, and main.cpp.

MODNet.h code:

```cpp
#pragma once
#include <string>
#include <vector>
#include "net.h"
#include <opencv.hpp>
#include "time.h"


class MODNet
{
private:
    std::string param_path;
    std::string bin_path;
    std::vector<int> input_shape;
    ncnn::Net net;

    // MODNet normalizes inputs as (x / 255 - 0.5) / 0.5, i.e. (x - 127.5) / 127.5
    const float norm_vals[3] = { 1 / 127.5f, 1 / 127.5f, 1 / 127.5f };
    const float mean_vals[3] = { 127.5f, 127.5f, 127.5f };

    cv::Mat normalize(cv::Mat& image);

public:
    MODNet() = delete;
    MODNet(const std::string param_path, const std::string bin_path, std::vector<int> input_shape);
    ~MODNet();

    cv::Mat predict_image(cv::Mat& image);
    void predict_image(const std::string& src_image_path, const std::string& dst_path);

    void predict_camera();
};
```

MODNet.cpp code:

#include "MODNet.h"


MODNet::MODNet(const std::string param_path, const std::string bin_path, std::vector<int> input_shape)
:param_path(param_path), bin_path(bin_path), input_shape(input_shape) {<!-- -->
net.load_param(param_path.c_str());
net.load_model(bin_path.c_str());
}


MODNet::~MODNet() {<!-- -->
net.clear();
}


cv::Mat MODNet::normalize(cv::Mat & amp; image) {<!-- -->
std::vector<cv::Mat> channels, normalized_image;
cv::split(image, channels);

cv::Mat r, g, b;
b = channels.at(0);
g = channels.at(1);
r = channels.at(2);
b = (b / 255. - 0.5) / 0.5;
g = (g / 255. - 0.5) / 0.5;
r = (r / 255. - 0.5) / 0.5;

normalized_image.push_back(r);
normalized_image.push_back(g);
normalized_image.push_back(b);

cv::Mat out = cv::Mat(image.rows, image.cols, CV_32F);
cv::merge(normalized_image, out);
return out;
}


cv::Mat MODNet::predict_image(cv::Mat & amp; image) {<!-- -->
cv::Mat rgbImage;
cv::cvtColor(image, rgbImage, cv::COLOR_BGR2RGB);
ncnn::Mat in = ncnn::Mat::from_pixels_resize(rgbImage.data, ncnn::Mat::PIXEL_RGB, image.cols, image.rows, input_shape[3], input_shape[2]);
in.substract_mean_normalize(mean_vals, norm_vals);
ncnn::Extractor ex = net.create_extractor();
ex.set_num_threads(4);
ex.input("input", in);
ncnn::Mat out;
ex.extract("output", out);

cv::Mat mask(out.h, out.w, CV_8UC1);
const float* probMap = out.channel(0);

for (int i{<!-- --> 0 }; i < out.h; i + + ) {<!-- -->
for (int j{<!-- --> 0 }; j < out.w; + + j) {<!-- -->
mask.at<uchar>(i, j) = probMap[i * out.w + j] > 0.5 ? 255 : 0;
}
}
cv::resize(mask, mask, cv::Size(image.cols, image.rows), 0, 0);
cv::Mat segFrame;
cv::bitwise_and(image, image, segFrame, mask = mask);
return segFrame;
}


void MODNet::predict_image(const std::string & amp; src_image_path, const std::string & amp; dst_path) {<!-- -->
cv::Mat image = cv::imread(src_image_path);
cv::Mat segFrame = predict_image(image);
cv::imwrite(dst_path, segFrame);
}


void MODNet::predict_camera() {<!-- -->
cv::Mat frame;
cv::VideoCapture cap;
int deviceID{<!-- --> 0 };
int apiID{<!-- --> cv::CAP_ANY };
cap.open(deviceID, apiID);
if (!cap.isOpened()) {<!-- -->
std::cout << "Error, cannot open camera!" << std::endl;
return;
}
//--- GRAB AND WRITE LOOP
std::cout << "Start grabbing" << std::endl << "Press any key to terminate" << std::endl;
int count{<!-- --> 0 };
clock_t start{<!-- --> clock() }, end{<!-- --> 0 };
double fps{<!-- --> 0 };
for (;;)
{<!-- -->
// wait for a new frame from camera and store it into 'frame'
cap.read(frame);
// check if we succeeded
if (frame.empty()) {<!-- -->
std::cout << "ERROR! blank frame grabbed" << std::endl;
break;
}
cv::Mat segFrame = predict_image(frame);

// fps
+ + count;
end = clock();
fps = count / (float(end - start) / CLOCKS_PER_SEC);
if (count >= 50) {<!-- -->
count = 0; //Prevent count overflow
start = clock();
}
std::cout << "FPS: " << fps << " Seg Image Number: " << count << " time consume:" << (float(end - start) / CLOCKS_PER_SEC) < < std::endl;
//Set parameters related to drawing text
std::string text{<!-- --> std::to_string(fps) };
int font_face = cv::FONT_HERSHEY_COMPLEX;
double font_scale = 1;
int thickness = 2;
int baseline;
cv::Size text_size = cv::getTextSize(text, font_face, font_scale, thickness, & amp;baseline);

//Center the text box and draw it
cv::Point origin;
origin.x = 20;
origin.y = 20;
cv::putText(segFrame, text, origin, font_face, font_scale, cv::Scalar(0, 255, 255), thickness, 8, 0);

// show live and wait for a key with timeout long enough to show images
imshow("Live", segFrame);
if (cv::waitKey(5) >= 0)
break;

}
cap.release();
cv::destroyWindow("Live");
return;
}

</code><img class="look-more-preCode contentImg-no-view" src="//i2.wp.com/csdnimg.cn/release/blogv2/dist/pc/img/newCodeMoreBlack. png" alt="" title="">
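Note that predict_image thresholds the alpha matte at 0.5, which gives hard edges around hair and clothing. If a smoother cutout is wanted, the soft alpha values can be used as blending weights instead. The helper below is a hypothetical variant, not part of the original project; it assumes the same "output" blob as above and composites the foreground onto a flat background color:

```cpp
#include <opencv.hpp>
#include "net.h"

// Hypothetical helper (not in the original project): soft-composite the
// foreground onto a flat background using MODNet's alpha matte.
cv::Mat composite_soft(const cv::Mat& image, const ncnn::Mat& out, cv::Scalar bg_color)
{
    // Wrap the single-channel alpha matte (values in [0, 1]) in a cv::Mat
    // header, then resize it to the original image size.
    const float* probMap = out.channel(0);
    cv::Mat alpha(out.h, out.w, CV_32FC1, (void*)probMap);
    cv::resize(alpha, alpha, cv::Size(image.cols, image.rows));

    cv::Mat result(image.size(), CV_8UC3);
    for (int i = 0; i < image.rows; ++i) {
        for (int j = 0; j < image.cols; ++j) {
            float a = alpha.at<float>(i, j);
            cv::Vec3b fg = image.at<cv::Vec3b>(i, j);
            for (int c = 0; c < 3; ++c)
                result.at<cv::Vec3b>(i, j)[c] =
                    cv::saturate_cast<uchar>(a * fg[c] + (1 - a) * bg_color[c]);
        }
    }
    return result;
}
```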

main.cpp code:

```cpp
#include <opencv.hpp>
#include <iostream>
#include "MODNet.h"
#include <vector>
#include "net.h"
#include "time.h"


int main() {
    std::string param_path{ "onnx_model\\simple_modnet.param" };
    std::string bin_path{ "onnx_model\\simple_modnet.bin" };
    std::vector<int> input_shape{ 1, 3, 512, 512 };
    MODNet model(param_path, bin_path, input_shape);

    // predict and display
    cv::Mat image = cv::imread("C:\\Users\\langdu\\Pictures\\test.png");
    cv::Mat segFrame = model.predict_image(image);
    cv::imshow("1", segFrame);
    cv::waitKey(0);

    // camera
    //model.predict_camera();
    return 0;
}
```

## 3. NCNN quantization

For mobile devices, the requirements on model size are very strict, so effective methods are needed to reduce storage. Quantization is an effective way to shrink a model. For background on quantization, see: [Deep Learning] Model Quantization - Notes/Experiments

The static quantization method is used here. Comparison of model size before and after quantization:

|  | bin (KB) | param (KB) |
| --- | --- | --- |
| before quantization | 25236 | 22 |
| after quantization | 6442 | 24 |

As the table shows, the quantized model is only about 1/4 of its original size (the bin file shrinks from 25236 KB to 6442 KB), at the cost of some prediction accuracy. The quantization steps below follow the official tutorial.

### 1. Model optimization

Use ncnnoptimize (in the build\tools directory) to optimize the model.

```bash
./ncnnoptimize simple_modnet.param simple_modnet.bin quantize_modnet.param quantize_modnet.bin 0
```


### 2. Create calibration table


When creating the calibration table with ncnn, the mean and norm parameters can be adjusted; see [here](https://xiulian.blog.csdn.net/article/details/105511041). Note that norm=0.00784 ≈ 1/127.5, matching the (x - 127.5) / 127.5 normalization used in the C++ code, and that the pixel setting is consistent with the MODNet official repo, which is BGR. The images used for calibration are stored in the images folder.

```bash
find images/ -type f > imagelist.txt
./ncnn2table quantize_modnet.param quantize_modnet.bin imagelist.txt modnet.table mean=[127.5,127.5,127.5] norm=[0.00784,0.00784,0.00784] shape=[512,512,3] pixel=BGR thread=8 method=kl
```


### 3. Quantization


Use the following command to get the int8 model.

```bash
./ncnn2int8 quantize_modnet.param quantize_modnet.bin modnet_int8.param modnet_int8.bin modnet.table
```

### 4. Use

This gives the int8 model, and using it is very simple: just replace the bin and param paths in the code above with the paths of the generated int8 model.
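For example, a minimal sketch (the int8 file names follow the ncnn2int8 command above; the onnx_model folder is assumed to be the same one used earlier):

```cpp
// Same code as before; only the weight paths change to the int8 model.
std::string param_path{ "onnx_model\\modnet_int8.param" };
std::string bin_path{ "onnx_model\\modnet_int8.bin" };
std::vector<int> input_shape{ 1, 3, 512, 512 };
MODNet model(param_path, bin_path, input_shape);
```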

### 5. Prediction results after quantization

First, the prediction result before quantization:


The result after quantization (some accuracy is lost: the shoes are predicted completely, but there are many errors on the floor):


Project case demo address: link