Deep Neural Networks – Convert PyTorch Segmentation Models and Run Them with OpenCV v4.8.0

Goals

In this tutorial you will learn how to

  • Convert PyTorch segmentation models
  • Run the converted PyTorch models using OpenCV
  • Evaluate PyTorch and OpenCV DNN models

We will discuss the above points using the FCN ResNet-50 architecture as an example.

Introduction

The conversion pipeline with the OpenCV API follows the same key steps for PyTorch classification and segmentation models. First, convert the model to ONNX format using PyTorch's built-in torch.onnx.export function. Then pass the resulting .onnx model to cv.dnn.readNetFromONNX, which returns a cv.dnn.Net object ready for DNN operations.

Practice

In this section we will cover the following points:

  1. Create a segmentation model conversion pipeline and run inference
  2. Evaluate and test segmentation models

If you only want to run the evaluation or test pipeline, you can skip the “Model conversion pipeline” section.

Model conversion pipeline

The code for this chapter is located in the dnn_model_runner module and can be executed through the following command line:

python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_fcnresnet50

The following code contains instructions for the following steps:

  1. Instantiate a PyTorch model
  2. Convert PyTorch model to .onnx
  3. Read the converted network using the OpenCV API
  4. Prepare the input data
  5. Run inference
  6. Get color mask from prediction
  7. Visualize results
# Initialize PyTorch FCN ResNet-50 model
original_model = models.segmentation.fcn_resnet50(pretrained=True)
# Get the path to the PyTorch model converted to ONNX
full_model_path = get_pytorch_onnx_model(original_model)
# Use OpenCV API to read the converted .onnx model
opencv_net = cv2.dnn.readNetFromONNX(full_model_path)
print("OpenCV model was successfully read. Layer IDs: \n", opencv_net.getLayerNames())
# Get preprocessed image
img, input_img = get_processed_imgs("test_data/sem_segm/2007_000033.jpg")
# Get OpenCV DNN predictions
opencv_prediction = get_opencv_dnn_prediction(opencv_net, input_img)
# Get the original PyTorch FCN ResNet-50 predictions
pytorch_prediction = get_pytorch_dnn_prediction(original_model, input_img)
pascal_voc_classes, pascal_voc_colors = read_colors_info("test_data/sem_segm/pascal-classes.txt")
# Get the color segmentation mask
opencv_colored_mask = get_colored_mask(img.shape, opencv_prediction, pascal_voc_colors)
pytorch_colored_mask = get_colored_mask(img.shape, pytorch_prediction, pascal_voc_colors)
# Get PASCAL VOC color palette
color_legend = get_legend(pascal_voc_classes, pascal_voc_colors)
cv2.imshow('PyTorch Color Mask', pytorch_colored_mask)
cv2.imshow('OpenCV DNN color mask', opencv_colored_mask)
cv2.imshow('Color Legend', color_legend)
cv2.waitKey(0)

To run model inference, we will use the following image from the PASCAL VOC validation dataset:

PASCAL VOC img (this image is corrupted)

The target segmentation result is

PASCAL VOC ground truth (this image is corrupted)

For PASCAL VOC color decoding and its mapping to prediction masks, we also need the pascal-classes.txt file, which contains the complete list of PASCAL VOC classes and corresponding colors.
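As an illustration of how such a file could be turned into class and color lists, here is a minimal sketch. It assumes that each line holds a class name followed by three integer color components; the real layout of pascal-classes.txt may differ. The helper name parse_class_colors and the sample lines are hypothetical and are not the module's read_colors_info implementation.

```python
import io


def parse_class_colors(lines):
    """Parse lines of the assumed form '<class name> <c0> <c1> <c2>'."""
    classes, colors = [], []
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        # The last three tokens are the color; everything before is the name
        classes.append(" ".join(parts[:-3]))
        colors.append(tuple(int(v) for v in parts[-3:]))
    return classes, colors


# Hypothetical sample content, not copied from the real file
sample = io.StringIO("background 0 0 0\naeroplane 128 0 0\npotted plant 0 64 0\n")
classes, colors = parse_class_colors(sample)
print(classes)
print(colors)
```

The index of each entry then serves as the class ID, so colors[class_id] gives the color used for that class in the mask.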

Let’s take the pre-trained PyTorch FCN ResNet-50 as an example and dive into each code step:

  • Instantiate the PyTorch FCN ResNet-50 model:
# Initialize PyTorch FCN ResNet-50 model
original_model = models.segmentation.fcn_resnet50(pretrained=True)
  • Convert PyTorch model to ONNX format:
# Define the directory for saving the converted model
onnx_model_path = "models"
# Define the name of the converted model
onnx_model_name = "fcnresnet50.onnx"
# Create the directory for the converted model
os.makedirs(onnx_model_path, exist_ok=True)
# Get the full path of the converted model
full_model_path = os.path.join(onnx_model_path, onnx_model_name)
# Generate model inputs to build the graph
generated_input = Variable(
    torch.randn(1, 3, 500, 500)
)
# Export the model to ONNX format
torch.onnx.export(
    original_model,
    generated_input,
    full_model_path,
    verbose=True,
    input_names=["input"],
    output_names=["output"],
    opset_version=11
)

The code for this step is no different from the classification conversion case. So after successfully executing the above code, we will get models/fcnresnet50.onnx.

  • Use cv.dnn.readNetFromONNX to read the converted network and pass the ONNX model obtained in the previous step into it:
# Read the converted .onnx model using OpenCV API
opencv_net = cv2.dnn.readNetFromONNX(full_model_path)
  • Prepare the input data:
# Read image
input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
input_img = input_img.astype(np.float32)
# Target image size
img_height = input_img.shape[0]
img_width = input_img.shape[1]
# Define preprocessing parameters
mean = np.array([0.485, 0.456, 0.406]) * 255.0
scale = 1 / 255.0
std = [0.229, 0.224, 0.225]
# Prepare the input blob to fit the model input:
# 1. Subtract the mean
# 2. Scale pixel values to the range from 0 to 1
input_blob = cv2.dnn.blobFromImage(
    image=input_img,
    scalefactor=scale,
    size=(img_width, img_height), # Image target size
    mean=mean,
    swapRB=True, # BGR -> RGB
    crop=False # Center crop
)
# 3. Divide by std
input_blob[0] /= np.asarray(std, dtype=np.float32).reshape(3, 1, 1)

In this step, we read the image and prepare the model input with the cv2.dnn.blobFromImage function, which returns a 4-dimensional blob. Note that cv2.dnn.blobFromImage first subtracts the mean value and only then scales the pixel values. Therefore, the mean is multiplied by 255.0 to reproduce the original image preprocessing order:

img /= 255.0
img -= [0.485, 0.456, 0.406]
img /= [0.229, 0.224, 0.225]
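
The equivalence of the two orders can be checked with plain arithmetic. The following sketch does not call OpenCV; it only emulates the "subtract mean, then scale" order on a random stand-in image (an assumption for illustration) and compares it with the "scale, subtract mean, divide by std" order shown above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for an RGB image with pixel values in [0, 255], shape (H, W, 3)
img = rng.integers(0, 256, size=(4, 5, 3)).astype(np.float32)

mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

# Reference order: scale to [0, 1], subtract the mean, divide by std
reference = (img / 255.0 - mean) / std

# blobFromImage-style order: subtract (mean * 255), then scale by 1/255,
# followed by the manual division by std
blob_style = (img - mean * 255.0) * (1.0 / 255.0) / std

max_abs_diff = float(np.abs(reference - blob_style).max())
print("max abs difference:", max_abs_diff)
```

Up to floating-point rounding, both orders give the same values, which is why the mean passed to the blob function is expressed in the 0 to 255 range.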
  • OpenCV cv.dnn_Net inference:
# Set up OpenCV DNN input
opencv_net.setInput(preproc_img)
# OpenCV DNN inference
out = opencv_net.forward()
print("OpenCV DNN segmentation prediction: \n")
print("* shape: ", out.shape)
# Get the IDs of the predicted classes
out_predictions = np.argmax(out[0], axis=0)

After executing the above code, we will get the following output:

OpenCV DNN segmentation prediction:
* shape: (1, 21, 500, 500)

Each of the 21 prediction channels (21 is the number of PASCAL VOC classes) holds an unnormalized score indicating how likely it is that a pixel belongs to the corresponding PASCAL VOC class.
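
As a small illustration of how a (1, 21, H, W) output becomes a per-pixel class map, the sketch below uses a random stand-in array instead of a real network output. It assumes the channels hold unnormalized scores, as the comment in the PyTorch inference code suggests; under that assumption a softmax over the class axis gives values that sum to 1 per pixel, and the argmax is the same before and after the softmax.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a network output of shape (batch, classes, height, width)
out = rng.normal(size=(1, 21, 4, 6))

scores = out[0]  # shape (21, H, W)

# Numerically stable softmax over the class axis
shifted = scores - scores.max(axis=0, keepdims=True)
exp_scores = np.exp(shifted)
probs = exp_scores / exp_scores.sum(axis=0, keepdims=True)

# Per-pixel class ID: index of the largest score along the class axis
class_ids = np.argmax(scores, axis=0)  # shape (H, W)
print(class_ids.shape)
```

Since only the class IDs are needed for the mask, the softmax step can be skipped in practice.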

  • PyTorch FCN ResNet-50 model inference:
original_net.eval()
preproc_img = torch.FloatTensor(preproc_img)
with torch.no_grad():
    # Get unnormalized probabilities for each class
    out = original_net(preproc_img)['out']
print("\nPyTorch segmentation model prediction: \n")
print("* shape: ", out.shape)
# Get the IDs of the predicted classes
out_predictions = out[0].argmax(dim=0)

After running the above code, we will get the following output:

PyTorch segmentation model prediction:
* shape: torch.Size([1, 21, 366, 500])

The PyTorch predictions also contain, for every pixel, a likelihood value for each class.

  • Get the color mask from the prediction:
# Convert mask value to PASCAL VOC color
processed_mask = np.stack([colors[color_id] for color_id in segm_mask.flatten()])
# Reshape the mask into a 3-channel image
processed_mask = processed_mask.reshape(mask_height, mask_width, 3)
processed_mask = cv2.resize(
    processed_mask, (img_width, img_height), interpolation=cv2.INTER_NEAREST
).astype(np.uint8)
# Convert color mask from BGR to RGB for compatibility with PASCAL VOC colors
processed_mask = cv2.cvtColor(processed_mask, cv2.COLOR_BGR2RGB)

In this step, we map the class IDs in the segmentation mask to the colors of the corresponding predicted classes. Let’s take a look at the results:

OpenCV color mask (this image is corrupted)
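
The color lookup in this step can also be expressed with NumPy integer indexing instead of a per-pixel Python loop. This is a sketch under stated assumptions: the three-color palette and the tiny mask below are hypothetical stand-ins, whereas the real palette comes from pascal-classes.txt.

```python
import numpy as np

# Hypothetical 3-class palette; the real one comes from pascal-classes.txt
palette = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0]], dtype=np.uint8)

# Stand-in mask of class IDs with shape (H, W)
segm_mask = np.array([[0, 1, 1],
                      [2, 2, 0]])

# Indexing the palette with the mask yields an (H, W, 3) color image
colored = palette[segm_mask]
print(colored.shape)
```

Indexing this way keeps the mask shape, so no flatten and reshape step is needed before resizing.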

For an extended evaluation of the models, we can use the py_to_py_segm script of the dnn_model_runner module. This part of the module is described in the next section.

Model Evaluation

The dnn_model_runner module provided in samples/dnn allows running the complete evaluation pipeline on the PASCAL VOC dataset and testing the following PyTorch segmentation models:

  • FCN ResNet-50
  • FCN ResNet-101

This list can also be extended with further appropriate evaluation pipeline configurations.

Evaluation Mode

The following line indicates that the module is running in evaluation mode:

python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name <pytorch_segm_model_name>

The segmentation model chosen from the list will be read into an OpenCV cv.dnn_Net object. Evaluation results (pixel accuracy, mean IoU, inference time) for the PyTorch and OpenCV models are written to log files. Inference time values are also plotted on a chart to summarize the collected model information.
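
For readers unfamiliar with these metrics, here is a minimal sketch of pixel accuracy and mean IoU computed from a confusion matrix. It follows the common textbook definitions and is not taken from dnn_model_runner; the helper names and the tiny label arrays are hypothetical, and dataset-specific details such as ignored labels are not handled.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes):
    """Count (ground truth, prediction) pairs over all pixels."""
    idx = gt.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the ground truth."""
    return np.trace(cm) / cm.sum()


def mean_iou(cm):
    """Average of per-class intersection over union."""
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both gt and prediction
    return (intersection[present] / union[present]).mean()


gt = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
cm = confusion_matrix(gt, pred, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

In this toy case four of six pixels match, and the per-class IoU values are 1/3, 2/3 and 1/2.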

The necessary evaluation configuration is defined in test_config.py:

@dataclass
class TestSegmConfig:
    img_root_dir: str = "./VOC2012"
    img_dir: str = os.path.join(img_root_dir, "JPEGImages/")
    img_segm_gt_dir: str = os.path.join(img_root_dir, "SegmentationClass/")
    # Reduced val: https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/data/pascal/seg11valid.txt
    segm_val_file: str = os.path.join(img_root_dir, "ImageSets/Segmentation/seg11valid.txt")
    color_file_cls: str = os.path.join(img_root_dir, "ImageSets/Segmentation/pascal-classes.txt")

These values can be modified based on the selected model pipeline.

To start the evaluation of PyTorch FCN ResNet-50, run the following line:

python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name fcnresnet50

Test Mode

The following line runs the module in test mode, which walks through the model inference steps:

python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name <pytorch_segm_model_name> --test True --default_img_preprocess <True/False> --evaluate False

Here the default_img_preprocess key defines whether you want to parameterize the model test process with specific values or use the default values, for example scale, mean or std.

The test configuration is defined in the TestSegmModuleConfig class of test_config.py:

@dataclass
class TestSegmModuleConfig:
    segm_test_data_dir: str = "test_data/sem_segm"
    test_module_name: str = "segmentation"
    test_module_path: str = "segmentation.py"
    input_img: str = os.path.join(segm_test_data_dir, "2007_000033.jpg")
    model: str = ""
    frame_height: str = str(TestSegmConfig.frame_size)
    frame_width: str = str(TestSegmConfig.frame_size)
    scale: float = 1.0
    mean: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    std: List[float] = field(default_factory=list)
    crop: bool = False
    rgb: bool = True
    classes: str = os.path.join(segm_test_data_dir, "pascal-classes.txt")

Default image preprocessing options are defined in default_preprocess_config.py:

pytorch_segm_input_blob = {
    "mean": ["123.675", "116.28", "103.53"],
    "scale": str(1 / 255.0),
    "std": ["0.229", "0.224", "0.225"],
    "crop": "False",
    "rgb": "True"
}
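
The default options are stored as strings, so they need to be converted before use. The sketch below shows one possible conversion (how dnn_model_runner itself parses these values is not shown here, so the parsing code is an assumption) and checks that the default mean equals the [0, 1]-range mean used earlier, multiplied by 255.

```python
import numpy as np

pytorch_segm_input_blob = {
    "mean": ["123.675", "116.28", "103.53"],
    "scale": str(1 / 255.0),
    "std": ["0.229", "0.224", "0.225"],
    "crop": "False",
    "rgb": "True",
}

# Convert the string-valued options into numeric and boolean values
mean = np.array([float(v) for v in pytorch_segm_input_blob["mean"]])
scale = float(pytorch_segm_input_blob["scale"])
std = np.array([float(v) for v in pytorch_segm_input_blob["std"]])
crop = pytorch_segm_input_blob["crop"] == "True"
rgb = pytorch_segm_input_blob["rgb"] == "True"

# The default mean matches the [0, 1]-range mean scaled to [0, 255]
matches = np.allclose(mean, np.array([0.485, 0.456, 0.406]) * 255.0)
print(matches)
```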

The basis of the model testing is implemented in samples/dnn/segmentation.py. segmentation.py can be run on its own, with the converted model provided in --input and the parameters for cv2.dnn.blobFromImage filled in.

To reproduce the OpenCV steps described in “Model Conversion Pipeline” from scratch using dnn_model_runner, execute the following line:

python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name fcnresnet50 --test True --default_img_preprocess True --evaluate False