Goals
In this tutorial you will learn how to
- Convert PyTorch segmentation models
- Run the converted PyTorch model using OpenCV
- Evaluate PyTorch and OpenCV DNN models
We will discuss the above points using the FCN ResNet-50 architecture as an example.
Introduction
The key steps of the conversion pipeline with the OpenCV API are the same for PyTorch classification and segmentation models. The first step is to convert the model to ONNX format using the built-in PyTorch function torch.onnx.export. The resulting .onnx model is then passed to cv.dnn.readNetFromONNX, which returns a cv.dnn.Net object ready for DNN operations.
Practice
In this section we will cover the following points:
- Create a segmentation model conversion pipeline and run inference
- Evaluate and test segmentation models
If you only want to run the evaluation or test pipeline, you can skip the “Model Conversion Pipeline” section.
Model Conversion Pipeline
The code for this section is located in the dnn_model_runner module and can be run with the following command:
python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_fcnresnet50
The code below performs the following steps:
- Instantiate the PyTorch model
- Convert the PyTorch model to .onnx
- Read the converted network using the OpenCV API
- Prepare the input data
- Run inference
- Get the colored mask from the predictions
- Visualize the results
    # Initialize the PyTorch FCN ResNet-50 model
    original_model = models.segmentation.fcn_resnet50(pretrained=True)

    # Get the path to the PyTorch model converted to ONNX
    full_model_path = get_pytorch_onnx_model(original_model)

    # Read the converted .onnx model with the OpenCV API
    opencv_net = cv2.dnn.readNetFromONNX(full_model_path)
    print("OpenCV model read successfully. Layer IDs: \n", opencv_net.getLayerNames())

    # Get the preprocessed image
    img, input_img = get_processed_imgs("test_data/sem_segm/2007_000033.jpg")

    # Get the OpenCV DNN predictions
    opencv_prediction = get_opencv_dnn_prediction(opencv_net, input_img)

    # Get the original PyTorch FCN ResNet-50 predictions
    pytorch_prediction = get_pytorch_dnn_prediction(original_model, input_img)

    pascal_voc_classes, pascal_voc_colors = read_colors_info("test_data/sem_segm/pascal-classes.txt")

    # Get the colored segmentation masks
    opencv_colored_mask = get_colored_mask(img.shape, opencv_prediction, pascal_voc_colors)
    pytorch_colored_mask = get_colored_mask(img.shape, pytorch_prediction, pascal_voc_colors)

    # Get the PASCAL VOC color palette
    color_legend = get_legend(pascal_voc_classes, pascal_voc_colors)

    cv2.imshow('PyTorch Color Mask', pytorch_colored_mask)
    cv2.imshow('OpenCV DNN color mask', opencv_colored_mask)
    cv2.imshow('Color Legend', color_legend)

    cv2.waitKey(0)
To run model inference, we will use the following image from the PASCAL VOC validation dataset:
(Figure: PASCAL VOC image)
The target segmentation result is:
(Figure: PASCAL VOC ground truth)
To decode the PASCAL VOC colors and map them to the predicted masks, we also need the pascal-classes.txt file, which contains the complete list of PASCAL VOC classes and their corresponding colors.
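As an illustration of how such a class and color list can be turned into lookup tables, here is a minimal sketch. It assumes that each line holds a class name followed by three integer color components separated by whitespace; the actual layout of pascal-classes.txt may differ. The helper name parse_color_lines and the sample lines are hypothetical and are not part of dnn_model_runner.

```python
def parse_color_lines(lines):
    """Split each line into a class name and an integer color triple.

    Assumed (hypothetical) line layout: "<class name> <c1> <c2> <c3>".
    """
    classes, colors = [], []
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        # Everything before the last three fields is treated as the name.
        classes.append(" ".join(parts[:-3]))
        colors.append(tuple(int(v) for v in parts[-3:]))
    return classes, colors


# Hypothetical sample lines, used instead of reading the real file.
sample = ["background 0 0 0", "aeroplane 128 0 0", "", "bicycle 0 128 0"]
classes, colors = parse_color_lines(sample)
print(classes)
print(colors)
```

The position of a class in the list then serves as its class ID, so colors[class_id] gives the color to paint for that class.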
Let’s take the pre-trained PyTorch FCN ResNet-50 as an example and dive into each code step:
- Instantiate the PyTorch FCN ResNet-50 model:
    # Initialize the PyTorch FCN ResNet-50 model
    original_model = models.segmentation.fcn_resnet50(pretrained=True)
- Convert PyTorch model to ONNX format:
    # Define the directory for the converted model
    onnx_model_path = "models"
    # Define the name of the converted model
    onnx_model_name = "fcnresnet50.onnx"

    # Create the directory for the converted model
    os.makedirs(onnx_model_path, exist_ok=True)

    # Get the full path to the converted model
    full_model_path = os.path.join(onnx_model_path, onnx_model_name)

    # Generate a model input to build the graph
    generated_input = Variable(
        torch.randn(1, 3, 500, 500)
    )

    # Export the model to ONNX format
    torch.onnx.export(
        original_model,
        generated_input,
        full_model_path,
        verbose=True,
        input_names=["input"],
        output_names=["output"],
        opset_version=11
    )
The code for this step does not differ from the classification conversion case. After the above code runs successfully, we get models/fcnresnet50.onnx.
- Read the converted network with cv.dnn.readNetFromONNX, passing in the ONNX model obtained in the previous step:
    # Read the converted .onnx model with the OpenCV API
    opencv_net = cv2.dnn.readNetFromONNX(full_model_path)
- Prepare the input data:
    # Read the image
    input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    input_img = input_img.astype(np.float32)

    # Target image size
    img_height = input_img.shape[0]
    img_width = input_img.shape[1]

    # Define the preprocessing parameters
    mean = np.array([0.485, 0.456, 0.406]) * 255.0
    scale = 1 / 255.0
    std = [0.229, 0.224, 0.225]

    # Prepare the input blob to fit the model input:
    # 1. Subtract the mean
    # 2. Scale the pixel values to the range from 0 to 1
    input_blob = cv2.dnn.blobFromImage(
        image=input_img,
        scalefactor=scale,
        size=(img_width, img_height),  # target image size
        mean=mean,
        swapRB=True,  # BGR -> RGB
        crop=False  # no center crop
    )
    # 3. Divide by the standard deviation
    input_blob[0] /= np.asarray(std, dtype=np.float32).reshape(3, 1, 1)
In this step, we read the image and prepare the model input with the cv2.dnn.blobFromImage function, which returns a 4-dimensional blob. Note that cv2.dnn.blobFromImage first subtracts the mean and only then scales the pixel values. The mean is therefore multiplied by 255.0 to reproduce the original preprocessing order:
    img /= 255.0
    img -= [0.485, 0.456, 0.406]
    img /= [0.229, 0.224, 0.225]
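The equivalence of the two orderings can be checked numerically. The sketch below uses only NumPy and a small random array as a stand-in for an image (an assumption for illustration; it does not call cv2.dnn.blobFromImage itself). It compares "scale, subtract the mean, divide by std" with "subtract the mean times 255, scale, divide by std".

```python
import numpy as np

# Per-channel statistics from the text above.
mean = np.array([0.485, 0.456, 0.406], dtype=np.float64)
std = np.array([0.229, 0.224, 0.225], dtype=np.float64)
scale = 1 / 255.0

# Hypothetical 4x4 three-channel image with pixel values in [0, 255].
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3)).astype(np.float64)

# Reference order: scale to [0, 1], subtract the mean, divide by std.
reference = (img / 255.0 - mean) / std

# blobFromImage-style order: subtract the mean first, then scale,
# so the mean has to be expressed in the 0..255 range.
blob_style = (img - mean * 255.0) * scale / std

max_abs_diff = float(np.abs(reference - blob_style).max())
print("max abs difference:", max_abs_diff)
```

The two results agree up to floating-point rounding, because (x - 255m) / 255 = x / 255 - m for every pixel value x and mean m.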
- OpenCV cv.dnn_Net inference:
    # Set the OpenCV DNN input
    opencv_net.setInput(preproc_img)

    # Run OpenCV DNN inference
    out = opencv_net.forward()
    print("OpenCV DNN segmentation prediction: \n")
    print("* shape: ", out.shape)

    # Get the IDs of the predicted classes
    out_predictions = np.argmax(out[0], axis=0)
After executing the above code, we will get the following output:
    OpenCV DNN segmentation prediction:

    * shape:  (1, 21, 500, 500)
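Each of the 21 prediction channels (21 is the number of PASCAL VOC classes, including the background) contains per-pixel scores that indicate how likely each pixel is to belong to the corresponding class. These are unnormalized scores rather than true probabilities, but the per-pixel argmax over the channels still selects the most likely class.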
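To make the channel-wise argmax concrete, here is a toy example with 3 classes on a 2x2 image instead of 21 classes on a 500x500 image. The score values are made up for illustration. The sketch also applies a softmax over the channel axis, on the assumption that the network output is unnormalized, to show that normalizing does not change the selected class.

```python
import numpy as np

# Hypothetical scores for 3 classes on a 2x2 image, shape (1, C, H, W).
out = np.array([[
    [[2.0, 0.1], [0.3, 0.2]],  # class 0
    [[0.5, 1.5], [0.1, 0.9]],  # class 1
    [[0.1, 0.2], [3.0, 0.4]],  # class 2
]], dtype=np.float32)

# Class ID per pixel: index of the highest score along the channel axis.
class_ids = np.argmax(out[0], axis=0)

# Softmax over the channel axis turns the scores into probabilities.
shifted = out[0] - out[0].max(axis=0, keepdims=True)  # numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=0, keepdims=True)

print(class_ids)
# [[0 1]
#  [2 1]]
```

The result has shape (H, W): one class ID per pixel, which is the form later used to look up colors.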
- PyTorch FCN ResNet-50 model inference:
    original_net.eval()
    preproc_img = torch.FloatTensor(preproc_img)

    with torch.no_grad():
        # Get the unnormalized probabilities for each class
        out = original_net(preproc_img)['out']

    print("\nPyTorch segmentation model prediction: \n")
    print("* shape: ", out.shape)

    # Get the IDs of the predicted classes
    out_predictions = out[0].argmax(dim=0)
After launching the above code, we will get the following output:
    PyTorch segmentation model prediction:

    * shape:  torch.Size([1, 21, 366, 500])
The PyTorch prediction likewise contains a score for each class at every pixel.
- Get the colored mask from the predictions:
    # Convert the mask values to PASCAL VOC colors
    processed_mask = np.stack([colors[color_id] for color_id in segm_mask.flatten()])

    # Reshape the mask into a 3-channel image
    processed_mask = processed_mask.reshape(mask_height, mask_width, 3)
    processed_mask = cv2.resize(processed_mask, (img_width, img_height), interpolation=cv2.INTER_NEAREST).astype(
        np.uint8)

    # Convert the colored mask from BGR to RGB for compatibility with the PASCAL VOC colors
    processed_mask = cv2.cvtColor(processed_mask, cv2.COLOR_BGR2RGB)
In this step, we map the class IDs in the segmentation mask to the colors of the corresponding predicted classes. Let’s take a look at the results:
(Figure: OpenCV colored mask)
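The color lookup in this step can also be written with NumPy integer-array indexing. The sketch below is an alternative formulation rather than the module's own code: it uses a hypothetical 3-color palette and a 2x3 mask, and checks that indexing the palette with the mask gives the same image as stacking per-pixel lookups and reshaping.

```python
import numpy as np

# Hypothetical 3-class palette, one color triple per class ID.
colors = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0]], dtype=np.uint8)

# Hypothetical 2x3 mask of predicted class IDs.
segm_mask = np.array([[0, 1, 1], [2, 2, 0]])
mask_height, mask_width = segm_mask.shape

# Indexing the palette with the mask yields an (H, W, 3) image directly.
colored_mask = colors[segm_mask]

# Same result as stacking per-pixel lookups and reshaping afterwards.
stacked = np.stack([colors[color_id] for color_id in segm_mask.flatten()])
stacked = stacked.reshape(mask_height, mask_width, 3)

print(colored_mask.shape)
```

The indexed version avoids the Python-level loop over every pixel, which matters for 500x500 masks.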
For an extended evaluation of the models, we can use the py_to_py_segm script of the dnn_model_runner module. This part of the module is described in the next section.
Model Evaluation
The dnn_model_runner module proposed in dnn/samples allows running the full evaluation pipeline on the PASCAL VOC dataset and testing the following PyTorch segmentation models:
- FCN ResNet-50
- FCN ResNet-101
This list can also be extended with further appropriate evaluation pipeline configurations.
Evaluation Mode
The following line runs the module in evaluation mode:
python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name <pytorch_segm_model_name>
The segmentation model chosen from the list will be read into an OpenCV cv.dnn_Net object. The evaluation results of the PyTorch and OpenCV models (pixel accuracy, mean IoU, inference time) are written to a log file. The inference time values are also shown in a chart to summarize the obtained model information.
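For readers unfamiliar with the two accuracy metrics, the sketch below shows one common way to compute pixel accuracy and mean IoU from a confusion matrix. It is an illustration under stated assumptions, not the implementation used by dnn_model_runner: the function names are hypothetical, the masks are tiny made-up arrays of class IDs, and any ignored label in the ground truth would have to be masked out beforehand.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes):
    """Count pixels per (ground-truth class, predicted class) pair."""
    idx = gt.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(conf):
    """Fraction of pixels whose predicted class equals the ground truth."""
    return np.trace(conf) / conf.sum()


def mean_iou(conf):
    """Average of per-class intersection over union, skipping absent classes."""
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    return (tp[present] / union[present]).mean()


# Hypothetical ground-truth and predicted masks with 3 classes.
gt = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])

conf = confusion_matrix(gt, pred, num_classes=3)
print("pixel accuracy:", pixel_accuracy(conf))
print("mean IoU:", mean_iou(conf))
```

In this toy case 4 of the 6 pixels match, and the per-class IoU values are 1/3, 2/3 and 1/2, which average to 0.5.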
The necessary evaluation configuration is defined in test_config.py:
    @dataclass
    class TestSegmConfig:
        img_root_dir: str = "./VOC2012"
        img_dir: str = os.path.join(img_root_dir, "JPEGImages/")
        img_segm_gt_dir: str = os.path.join(img_root_dir, "SegmentationClass/")
        # Reduced validation set: https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/data/pascal/seg11valid.txt
        segm_val_file: str = os.path.join(img_root_dir, "ImageSets/Segmentation/seg11valid.txt")
        color_file_cls: str = os.path.join(img_root_dir, "ImageSets/Segmentation/pascal-classes.txt")
These values can be modified based on the selected model pipeline.
To start the evaluation of PyTorch FCN ResNet-50, run the following line:
python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name fcnresnet50
Test Mode
The following line runs the module in test mode, which walks through the model inference steps:
python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name <pytorch_segm_model_name> --test True --default_img_preprocess <True/False> --evaluate False
The default_img_preprocess key defines whether you want to parameterize the model test process with specific values of your own, or use the default values for parameters such as scale, mean or std.
The test configuration is represented by the TestSegmModuleConfig class in test_config.py:
    @dataclass
    class TestSegmModuleConfig:
        segm_test_data_dir: str = "test_data/sem_segm"
        test_module_name: str = "segmentation"
        test_module_path: str = "segmentation.py"
        input_img: str = os.path.join(segm_test_data_dir, "2007_000033.jpg")
        model: str = ""

        frame_height: str = str(TestSegmConfig.frame_size)
        frame_width: str = str(TestSegmConfig.frame_size)
        scale: float = 1.0
        mean: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
        std: List[float] = field(default_factory=list)
        crop: bool = False
        rgb: bool = True
        classes: str = os.path.join(segm_test_data_dir, "pascal-classes.txt")
The default image preprocessing options are defined in default_preprocess_config.py:
    pytorch_segm_input_blob = {
        "mean": ["123.675", "116.28", "103.53"],
        "scale": str(1 / 255.0),
        "std": ["0.229", "0.224", "0.225"],
        "crop": "False",
        "rgb": "True"
    }
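The default options are stored as strings, presumably so that they can be passed on as command-line arguments. The sketch below shows one way to turn them back into typed values, and checks that the default mean is simply the normalized mean [0.485, 0.456, 0.406] expressed in the 0..255 range. The helper to_blob_params is hypothetical and is not part of dnn_model_runner.

```python
# String-valued options as listed in the text above.
pytorch_segm_input_blob = {
    "mean": ["123.675", "116.28", "103.53"],
    "scale": str(1 / 255.0),
    "std": ["0.229", "0.224", "0.225"],
    "crop": "False",
    "rgb": "True",
}


def to_blob_params(cfg):
    """Convert string options into typed values (hypothetical helper)."""
    return {
        "mean": [float(v) for v in cfg["mean"]],
        "scale": float(cfg["scale"]),
        "std": [float(v) for v in cfg["std"]],
        "crop": cfg["crop"] == "True",
        "rgb": cfg["rgb"] == "True",
    }


params = to_blob_params(pytorch_segm_input_blob)

# The default mean equals the normalized mean scaled to the 0..255 range.
normalized_mean = [0.485, 0.456, 0.406]
rescaled = [m * 255.0 for m in normalized_mean]
mean_matches = all(abs(a - b) < 1e-9 for a, b in zip(params["mean"], rescaled))
print("mean matches rescaled normalized mean:", mean_matches)
```

Comparing the flag strings against "True" avoids the pitfall that bool("False") evaluates to True in Python.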
The model testing basics are implemented in samples/dnn/segmentation.py. segmentation.py can be run on its own with the converted model provided in --input and the parameters for cv2.dnn.blobFromImage populated.
To reproduce the OpenCV steps described in “Model Conversion Pipeline” from scratch using dnn_model_runner, run the following line:
python -m dnn_model_runner.dnn_conversion.pytorch.segmentation.py_to_py_segm --model_name fcnresnet50 --test True --default_img_preprocess True --evaluate False