Goals
In this tutorial you will learn how to
- Convert a TensorFlow (TF) segmentation model
- Run the converted TensorFlow model using OpenCV
- Evaluate TensorFlow and OpenCV DNN models
We will discuss the above points using the DeepLab architecture as an example.
Introduction
The key concepts of the conversion pipeline for TensorFlow classification and segmentation models with the OpenCV API are almost identical, except for the graph optimization stage. The first step in converting a TensorFlow model to cv.dnn.Net is to obtain a frozen TF model graph. A frozen graph combines the model graph structure with the stored values of the required variables (such as weights). Frozen graphs are usually saved in protobuf (.pb) files. To read the generated segmentation model .pb file using cv.dnn.readNetFromTensorflow, you first need to modify the graph using the TF graph transform tool.
Practice
In this part we will cover the following points:
- Create a TF segmentation model conversion pipeline and run inference
- Evaluate and test TF segmentation models
If you only want to run the evaluation or test model pipeline, you can skip the “Model Conversion Pipeline” section of this tutorial.
Model Conversion Pipeline
The code for this subchapter is located in the dnn_model_runner module and can be executed with the following command line:
python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_deeplab
TensorFlow segmentation models can be found in the TensorFlow Research Models section, which contains model implementations based on published research papers. We will retrieve the archive containing the pre-trained TF DeepLabV3 from the following link:
http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz
The complete process of obtaining the frozen graph is described in deeplab_retrievement.py:
import os
import tarfile
import urllib.request

def get_deeplab_frozen_graph():
    # Define the model path to download
    models_url = 'http://download.tensorflow.org/models/'
    mobilenetv2_voctrainval = 'deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz'

    # Build the model download link
    model_link = models_url + mobilenetv2_voctrainval

    try:
        urllib.request.urlretrieve(model_link, mobilenetv2_voctrainval)
    except Exception:
        print("TF DeepLabV3 not retrieved: {}".format(model_link))
        return

    tf_model_tar = tarfile.open(mobilenetv2_voctrainval)

    # Iterate over the obtained model files
    for model_tar_elem in tf_model_tar.getmembers():
        # Check whether the archive member is the frozen graph
        if TF_FROZEN_GRAPH_NAME in os.path.basename(model_tar_elem.name):
            # Extract the frozen graph
            tf_model_tar.extract(model_tar_elem, FROZEN_GRAPH_PATH)

    tf_model_tar.close()
After running this script:
python -m dnn_model_runner.dnn_conversion.tf.segmentation.deeplab_retrievement
we will get frozen_inference_graph.pb in deeplab/deeplabv3_mnv2_pascal_trainval.
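The member-filtering step of the retrieval can be tried without downloading anything. The following is a minimal sketch, not the deeplab_retrievement.py code itself: it builds a tiny stand-in archive and extracts only the member whose file name contains the graph name. The value of FROZEN_GRAPH_NAME, the helper names, and the toy archive contents are assumptions made for illustration.

```python
import io
import os
import tarfile
import tempfile

# Assumed stand-in for TF_FROZEN_GRAPH_NAME used by the tutorial script.
FROZEN_GRAPH_NAME = "frozen_inference_graph"


def extract_frozen_graph(tar_path, out_dir, graph_name=FROZEN_GRAPH_NAME):
    """Extract archive members whose file name contains graph_name."""
    extracted = []
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            if member.isfile() and graph_name in os.path.basename(member.name):
                archive.extract(member, out_dir)
                extracted.append(os.path.join(out_dir, member.name))
    return extracted


def make_toy_archive(tar_path):
    """Build a tiny stand-in archive so the sketch runs without a download."""
    files = {
        "deeplabv3_mnv2_pascal_trainval/frozen_inference_graph.pb": b"graph",
        "deeplabv3_mnv2_pascal_trainval/other_file.txt": b"other",
    }
    with tarfile.open(tar_path, "w:gz") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))


with tempfile.TemporaryDirectory() as tmp_dir:
    toy_tar = os.path.join(tmp_dir, "toy_deeplab.tar.gz")
    make_toy_archive(toy_tar)
    extracted_paths = extract_frozen_graph(toy_tar, os.path.join(tmp_dir, "deeplab"))
    graph_exists = all(os.path.isfile(p) for p in extracted_paths)
    extracted_names = [os.path.relpath(p, tmp_dir) for p in extracted_paths]
    print(extracted_names)
```

Opening the archive in a with block closes it even if extraction fails, which the explicit close() call does not guarantee.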
The extracted frozen_inference_graph.pb needs to be optimized before the network can be loaded with OpenCV. To optimize the graph, we use TF TransformGraph with default parameters:
DEFAULT_OPT_GRAPH_NAME = "optimized_frozen_inference_graph.pb"
DEFAULT_INPUTS = "sub_7"
DEFAULT_OUTPUTS = "ResizeBilinear_3"
DEFAULT_TRANSFORMS = "remove_nodes(op=Identity)" \
                     " merge_duplicate_nodes" \
                     " strip_unused_nodes" \
                     " fold_constants(ignore_errors=true)" \
                     " fold_batch_norms" \
                     " fold_old_batch_norms"

def optimize_tf_graph(
        in_graph,
        out_graph=DEFAULT_OPT_GRAPH_NAME,
        inputs=DEFAULT_INPUTS,
        outputs=DEFAULT_OUTPUTS,
        transforms=DEFAULT_TRANSFORMS,
        is_manual=True,
        was_optimized=True
):
    # ...
    tf_opt_graph = TransformGraph(
        tf_graph,
        inputs,
        outputs,
        transforms
    )
To run the graph optimization process, execute the following command line:
python -m dnn_model_runner.dnn_conversion.tf.segmentation.tf_graph_optimizer --in_graph deeplab/deeplabv3_mnv2_pascal_trainval/frozen_inference_graph.pb
As a result, the deeplab/deeplabv3_mnv2_pascal_trainval directory will contain optimized_frozen_inference_graph.pb.
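The transform list is written as adjacent string literals, which Python joins without any separator, so every continuation needs its own leading space. The sketch below illustrates the effect of a missing space. The split_transforms helper is hypothetical, and whether the tutorial module splits the string before passing it to TransformGraph is not shown in the excerpt.

```python
# Adjacent string literals are concatenated as-is, so each continuation
# carries its own leading space to keep the transforms separated.
TRANSFORMS_OK = (
    "remove_nodes(op=Identity)"
    " merge_duplicate_nodes"
    " strip_unused_nodes"
    " fold_constants(ignore_errors=true)"
    " fold_batch_norms"
    " fold_old_batch_norms"
)

TRANSFORMS_BROKEN = (
    "remove_nodes(op=Identity)"
    " merge_duplicate_nodes"
    "strip_unused_nodes"  # missing leading space: fuses with the previous name
)


def split_transforms(transforms):
    """Split the space-separated transform list (hypothetical helper)."""
    return transforms.split()


ok_list = split_transforms(TRANSFORMS_OK)
broken_list = split_transforms(TRANSFORMS_BROKEN)
print(len(ok_list), ok_list[2])
print(broken_list)
```

With the missing space, two transform names merge into a single unknown name, which is easy to overlook when the string is long.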
Once we have the model graphs, let’s go through the steps listed below:
- Read the TF frozen_inference_graph.pb graph
- Read the optimized TF frozen graph using the OpenCV API
- Prepare the input data
- Run inference
- Get color masks from the predictions
- Visualize the results
# Get the TF model graph from the obtained frozen graph
deeplab_graph = read_deeplab_frozen_graph(deeplab_frozen_graph_path)

# Read the optimized DeepLab frozen graph with the OpenCV API
opencv_net = cv2.dnn.readNetFromTensorflow(opt_deeplab_frozen_graph_path)
print("OpenCV model has been read successfully. Model layers: \n", opencv_net.getLayerNames())

# Get the processed images
original_img_shape, tf_input_blob, opencv_input_img = get_processed_imgs("test_data/sem_segm/2007_000033.jpg")

# Get OpenCV DNN predictions
opencv_prediction = get_opencv_dnn_prediction(opencv_net, opencv_input_img)

# Get TF model predictions
tf_prediction = get_tf_dnn_prediction(deeplab_graph, tf_input_blob)

# Get PASCAL VOC classes and colors
pascal_voc_classes, pascal_voc_colors = read_colors_info("test_data/sem_segm/pascal-classes.txt")

# Get the colored segmentation masks
opencv_colored_mask = get_colored_mask(original_img_shape, opencv_prediction, pascal_voc_colors)
tf_colored_mask = get_tf_colored_mask(original_img_shape, tf_prediction, pascal_voc_colors)

# Get the PASCAL VOC color palette
color_legend = get_legend(pascal_voc_classes, pascal_voc_colors)

cv2.imshow('TensorFlow color mask', tf_colored_mask)
cv2.imshow('OpenCV DNN color mask', opencv_colored_mask)
cv2.imshow('Color Legend', color_legend)
We will use the following image from the PASCAL VOC validation dataset to run model inference:
PASCAL VOC image (image missing)
The target segmentation result is:
PASCAL VOC ground truth (image missing)
For PASCAL VOC color decoding and its mapping to prediction masks, we also need the pascal-classes.txt file, which contains the complete list of PASCAL VOC classes and their corresponding colors.
Let’s take the pre-trained TF DeepLabV3 MobileNetV2 as an example and look at each step in depth:
- Read the TF frozen_inference_graph.pb graph:
# Initialize the DeepLab model graph
model_graph = tf.Graph()

# Obtain the graph definition from the frozen graph file
with tf.io.gfile.GFile(frozen_graph_path, 'rb') as graph_file:
    tf_model_graph = GraphDef()
    tf_model_graph.ParseFromString(graph_file.read())

with model_graph.as_default():
    tf.import_graph_def(tf_model_graph, name='')
- Read the optimized TF frozen graph using the OpenCV API:
# Read the optimized DeepLab frozen graph with the OpenCV API
opencv_net = cv2.dnn.readNetFromTensorflow(opt_deeplab_frozen_graph_path)
- Prepare the input data using the cv2.dnn.blobFromImage function:
# Read the image
input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
input_img = input_img.astype(np.float32)

# Preprocess the image for the TF model input
tf_preproc_img = cv2.resize(input_img, (513, 513))
tf_preproc_img = cv2.cvtColor(tf_preproc_img, cv2.COLOR_BGR2RGB)

# Define preprocessing parameters for OpenCV DNN
mean = np.array([1.0, 1.0, 1.0]) * 127.5
scale = 1 / 127.5

# Prepare the input blob to fit the model input:
# 1. subtract the mean
# 2. scale the pixel values to the range [-1, 1]
input_blob = cv2.dnn.blobFromImage(
    image=input_img,
    scalefactor=scale,
    size=(513, 513),  # image target size
    mean=mean,
    swapRB=True,  # BGR -> RGB
    crop=False  # center crop
)
Please note the preprocessing order in the cv2.dnn.blobFromImage function. The mean is subtracted first, and only then are the pixel values multiplied by the defined scale. Therefore, to reproduce the TF image preprocessing pipeline, the mean is set to 127.5 per channel (a vector of ones multiplied by 127.5) and the scale to 1/127.5. Another important point is the image preprocessing of TF DeepLab. To pass an image into the TF model, we only need to construct an input of the appropriate shape; the rest of the image preprocessing is described in feature_extractor.py and is called automatically.
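To make the order of operations concrete, here is a rough NumPy approximation of what cv2.dnn.blobFromImage does with these parameters, leaving out resizing and cropping. It is a sketch for checking the arithmetic, not OpenCV's implementation; the function name blob_from_image_sketch and the one-pixel toy image are assumptions.

```python
import numpy as np


def blob_from_image_sketch(bgr_img, scale, mean, swap_rb=True):
    """NumPy approximation of cv2.dnn.blobFromImage without resize or crop."""
    img = bgr_img.astype(np.float32)
    if swap_rb:
        img = img[:, :, ::-1]      # BGR -> RGB
    img = (img - mean) * scale     # subtract the mean first, then scale
    blob = img.transpose(2, 0, 1)  # HWC -> CHW
    return blob[np.newaxis, ...]   # add the batch dimension: NCHW


# One BGR pixel covering the minimum, middle and maximum of the 0..255 range.
toy_img = np.array([[[0, 127.5, 255]]], dtype=np.float32)
mean = np.array([1.0, 1.0, 1.0], dtype=np.float32) * 127.5
blob = blob_from_image_sketch(toy_img, scale=1 / 127.5, mean=mean)

print(blob.shape)
print(blob.ravel())
```

The extreme input values 0 and 255 end up at -1 and 1, so the preprocessed pixel values lie in the range [-1, 1].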
- Run OpenCV cv.dnn_Net inference:
# Set the OpenCV DNN input
opencv_net.setInput(preproc_img)

# Run OpenCV DNN inference
out = opencv_net.forward()

print("OpenCV DNN segmentation prediction: \n")
print("* shape: ", out.shape)

# Get the IDs of the predicted classes
out_predictions = np.argmax(out[0], axis=0)
After executing the above code, we will get the following output:
OpenCV DNN segmentation prediction:
* shape: (1, 21, 513, 513)
Each of the 21 prediction channels (21 is the number of PASCAL VOC classes) contains, for every pixel, a score that indicates how likely the pixel is to belong to the corresponding PASCAL VOC class.
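The reduction from per-class channels to a class map can be checked on a toy tensor. The sketch below uses 3 classes and a 2x2 image instead of the real 21 x 513 x 513 output; the score values are made up for illustration.

```python
import numpy as np

# Toy network output: batch of 1, 3 classes, 2x2 pixels.
toy_out = np.array([[
    [[0.90, 0.10], [0.20, 0.30]],  # scores for class 0
    [[0.05, 0.80], [0.10, 0.30]],  # scores for class 1
    [[0.05, 0.10], [0.70, 0.40]],  # scores for class 2
]], dtype=np.float32)

# Drop the batch dimension, then pick the best-scoring class per pixel.
class_map = np.argmax(toy_out[0], axis=0)

print(class_map.shape)
print(class_map)
```

Taking the argmax along axis 0 of out[0] collapses the class dimension, leaving one class ID per pixel.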
- Run TF model inference:
# Add the batch dimension
preproc_img = np.expand_dims(preproc_img, 0)

# Initialize the TF session
tf_session = Session(graph=model_graph)

input_tensor_name = "ImageTensor:0"
output_tensor_name = "SemanticPredictions:0"

# Run inference (the batch dimension was already added above)
out = tf_session.run(
    output_tensor_name,
    feed_dict={input_tensor_name: preproc_img}
)

print("TF segmentation model prediction: \n")
print("* shape: ", out.shape)
The TF inference results are as follows:
TF segmentation model prediction:
* shape: (1, 513, 513)
The TensorFlow prediction contains the indices of the corresponding PASCAL VOC classes.
- Convert OpenCV predictions to color masks:
mask_height = segm_mask.shape[0]
mask_width = segm_mask.shape[1]

img_height = original_img_shape[0]
img_width = original_img_shape[1]

# Convert mask values into PASCAL VOC colors
processed_mask = np.stack([colors[color_id] for color_id in segm_mask.flatten()])

# Reshape the mask into a 3-channel image
processed_mask = processed_mask.reshape(mask_height, mask_width, 3)
processed_mask = cv2.resize(processed_mask, (img_width, img_height), interpolation=cv2.INTER_NEAREST).astype(
    np.uint8)

# Convert the colored mask from BGR to RGB
processed_mask = cv2.cvtColor(processed_mask, cv2.COLOR_BGR2RGB)
In this step, we map the class IDs in the segmentation mask to the colors of the predicted classes. Let’s take a look at the results:
Color legend (image missing)
OpenCV color mask (image missing)
- Convert TF predictions to color masks:
colors = np.array(colors)
processed_mask = colors[segm_mask[0]]

img_height = original_img_shape[0]
img_width = original_img_shape[1]

processed_mask = cv2.resize(processed_mask, (img_width, img_height), interpolation=cv2.INTER_NEAREST).astype(
    np.uint8)

# Convert the colored mask from BGR to RGB for compatibility with PASCAL VOC colors
processed_mask = cv2.cvtColor(processed_mask, cv2.COLOR_BGR2RGB)
The result is as follows:
TF color mask (image missing)
As a result, we get two equal segmentation masks.
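The two color-mapping variants used for the OpenCV and TF predictions produce the same image for the same class map. The following sketch compares them on a toy mask; the 3-color palette is hypothetical, since the real one is read from pascal-classes.txt.

```python
import numpy as np

# Hypothetical 3-class palette; the real palette comes from pascal-classes.txt.
colors = [(0, 0, 0), (128, 0, 0), (0, 128, 0)]
segm_mask = np.array([[0, 1], [2, 2]])

# Variant 1: look up a color per pixel, then reshape to H x W x 3.
stacked = np.stack([colors[class_id] for class_id in segm_mask.flatten()])
mask_a = stacked.reshape(segm_mask.shape[0], segm_mask.shape[1], 3)

# Variant 2: index the palette array with the whole mask at once.
mask_b = np.array(colors)[segm_mask]

masks_equal = np.array_equal(mask_a, mask_b)
print(mask_a.shape, masks_equal)
```

The indexing variant avoids the Python-level loop over all pixels, which matters for a 513 x 513 mask.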
Model Evaluation
The dnn_model_runner module provided in dnn/samples allows running the complete evaluation pipeline on the PASCAL VOC dataset and testing the execution of the DeepLab MobileNet model.
Evaluation Mode
The following command runs the module in evaluation mode:
python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_segm
The model will be read into an OpenCV cv.dnn_Net object. Evaluation results (pixel accuracy, mean IoU, inference time) for the TF and OpenCV models are written to the log file. Inference time values are also shown in a chart to summarize the obtained model information.
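For reference, pixel accuracy and mean IoU can both be derived from a confusion matrix. The sketch below shows one common way to compute them; it is not the dnn_model_runner implementation, and the function names and the toy masks are assumptions.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes):
    """Count pixels per (ground-truth class, predicted class) pair."""
    idx = gt.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(conf):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return np.trace(conf) / conf.sum()


def mean_iou(conf):
    """Average intersection over union across the classes that occur."""
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both masks
    return (intersection[present] / union[present]).mean()


gt = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
conf = confusion_matrix(gt, pred, num_classes=2)
acc = pixel_accuracy(conf)
miou = mean_iou(conf)
print(acc, miou)
```

Mean IoU penalizes the misclassified pixel in both affected classes, so it is typically lower than pixel accuracy on the same prediction.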
The necessary evaluation configuration is defined in test_config.py:
@dataclass
class TestSegmConfig:
    frame_size: int = 500
    img_root_dir: str = "./VOC2012"
    img_dir: str = os.path.join(img_root_dir, "JPEGImages/")
    img_segm_gt_dir: str = os.path.join(img_root_dir, "SegmentationClass/")
    # Reduced validation set: https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/data/pascal/seg11valid.txt
    segm_val_file: str = os.path.join(img_root_dir, "ImageSets/Segmentation/seg11valid.txt")
    color_file_cls: str = os.path.join(img_root_dir, "ImageSets/Segmentation/pascal-classes.txt")
These values can be modified based on the selected model pipeline.
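One detail to keep in mind when modifying such a configuration: defaults built with os.path.join in the class body are evaluated once, when the class is defined. The sketch below uses a reduced, hypothetical ToySegmConfig and a made-up /data/VOC2012 path to show that overriding only the root directory does not update the derived paths.

```python
import os
from dataclasses import dataclass, replace


@dataclass
class ToySegmConfig:
    img_root_dir: str = "./VOC2012"
    # Evaluated once, when the class body runs, using the default root above.
    img_dir: str = os.path.join(img_root_dir, "JPEGImages/")


default_cfg = ToySegmConfig()

# Changing only the root leaves the derived path untouched.
moved_root = replace(default_cfg, img_root_dir="/data/VOC2012")

# Derived fields have to be overridden explicitly as well.
moved_all = replace(
    moved_root, img_dir=os.path.join(moved_root.img_root_dir, "JPEGImages/")
)

print(moved_root.img_dir)
print(moved_all.img_dir)
```

Editing the default of img_root_dir directly in the class definition avoids this, because every derived default is then recomputed from the new root.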
Test Mode
The following command runs the module in test mode, which walks through the model inference steps:
python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_segm --test True --default_img_preprocess <True/False> --evaluate False
The default_img_preprocess key defines whether you want to parameterize the model test process with specific values or use the default values, for example for scale, mean or std.
The test configuration is represented by the TestSegmModuleConfig class in test_config.py:
@dataclass
class TestSegmModuleConfig:
    segm_test_data_dir: str = "test_data/sem_segm"
    test_module_name: str = "segmentation"
    test_module_path: str = "segmentation.py"
    input_img: str = os.path.join(segm_test_data_dir, "2007_000033.jpg")
    model: str = ""

    frame_height: str = str(TestSegmConfig.frame_size)
    frame_width: str = str(TestSegmConfig.frame_size)
    scale: float = 1.0
    mean: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    std: List[float] = field(default_factory=list)
    crop: bool = False
    rgb: bool = True
    classes: str = os.path.join(segm_test_data_dir, "pascal-classes.txt")
Default image preprocessing options are defined in default_preprocess_config.py:
tf_segm_input_blob = {
    "scale": str(1 / 127.5),
    "mean": ["127.5", "127.5", "127.5"],
    "std": [],
    "crop": "False",
    "rgb": "True"
}
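The default options are stored as strings, presumably so that they can be passed on as command-line arguments. A minimal sketch of turning such a dictionary back into typed values follows; parse_blob_params is a hypothetical helper, and mapping the "rgb" key to the swapRB argument of cv2.dnn.blobFromImage is an assumption.

```python
def parse_blob_params(cfg):
    """Turn the string-valued config into typed values (hypothetical helper)."""
    return {
        "scalefactor": float(cfg["scale"]),
        "mean": [float(v) for v in cfg["mean"]],
        "std": [float(v) for v in cfg["std"]],
        "crop": cfg["crop"] == "True",
        "swapRB": cfg["rgb"] == "True",  # assumed mapping of "rgb" to swapRB
    }


tf_segm_input_blob = {
    "scale": str(1 / 127.5),
    "mean": ["127.5", "127.5", "127.5"],
    "std": [],
    "crop": "False",
    "rgb": "True",
}

params = parse_blob_params(tf_segm_input_blob)
print(params)
```

Comparing against the literal "True" avoids the pitfall that bool("False") is True in Python.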
The basis of the model testing is represented in samples/dnn/segmentation.py. segmentation.py can be run on its own with the converted model passed in --input and the parameters for cv2.dnn.blobFromImage filled in.
To reproduce the OpenCV steps described in “Model Conversion Pipeline” from scratch using dnn_model_runner, execute the following command:
python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_segm --test True --default_img_preprocess True --evaluate False