Deep Neural Networks – Conversion of a TensorFlow segmentation model and launch with OpenCV v4.8.0

Goals

In this tutorial you will learn how to

  • Convert a TensorFlow (TF) segmentation model
  • Run the converted TensorFlow model using OpenCV
  • Evaluate TensorFlow and OpenCV DNN models

We will discuss the above points using the DeepLab architecture as an example.

Introduction

Apart from the graph optimization stage, the key concepts of the conversion pipeline are almost identical for TensorFlow classification and segmentation models. The first step in converting a TensorFlow model to cv.dnn.Net is to obtain a frozen TF model graph. A frozen graph combines the model graph structure with the retained values of the required variables (such as weights). A frozen graph is usually saved in a protobuf (.pb) file. To read the generated segmentation model .pb file with cv.dnn.readNetFromTensorflow, the graph needs to be modified with the TF graph transform tool.
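
As a quick illustration of the end goal, loading a frozen (and transformed) graph into OpenCV is a single call. A minimal sketch, assuming the optimized graph file name produced later in this tutorial:

import cv2

# Minimal sketch: read a frozen, transformed TF graph into a cv.dnn.Net object.
# The file name is the optimized graph produced later in this tutorial.
opencv_net = cv2.dnn.readNetFromTensorflow("optimized_frozen_inference_graph.pb")
print(opencv_net.getLayerNames())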

Practice

In this part we will cover the following points:

  1. Create a TF segmentation model conversion pipeline and run inference
  2. Evaluate and test TF segmentation models

If you only want to run an evaluation or test model pipeline, you can skip the “Model Conversion Pipeline” tutorial section.

Model Conversion Pipeline

The code for this subchapter is located in the dnn_model_runner module and can be executed through the following command line:

python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_deeplab

TensorFlow segmentation models can be found in the TensorFlow Research Models section, which contains model implementations based on published research papers. We will retrieve the archive containing the pre-trained TF DeepLabV3 model from the following link:

http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz

The complete process of obtaining frozen graphs is described in deeplab_retrievement.py:

import os
import tarfile
import urllib.request

# TF_FROZEN_GRAPH_NAME and FROZEN_GRAPH_PATH are constants defined elsewhere
# in the module

def get_deeplab_frozen_graph():
    # Define the location of the model to be downloaded
    models_url = 'http://download.tensorflow.org/models/'
    mobilenetv2_voctrainval = 'deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz'
    # Build the model download link
    model_link = models_url + mobilenetv2_voctrainval
    try:
        urllib.request.urlretrieve(model_link, mobilenetv2_voctrainval)
    except Exception:
        print("TF DeepLabV3 was not retrieved: {}".format(model_link))
        return
    tf_model_tar = tarfile.open(mobilenetv2_voctrainval)
    # Traverse the archived model files
    for model_tar_elem in tf_model_tar.getmembers():
        # Check whether the file contains the frozen graph
        if TF_FROZEN_GRAPH_NAME in os.path.basename(model_tar_elem.name):
            # Extract the frozen graph
            tf_model_tar.extract(model_tar_elem, FROZEN_GRAPH_PATH)
    tf_model_tar.close()

After running this script:

python -m dnn_model_runner.dnn_conversion.tf.segmentation.deeplab_retrievement

we will get frozen_inference_graph.pb in deeplab/deeplabv3_mnv2_pascal_trainval.

The extracted frozen_inference_graph.pb needs to be optimized before loading the network using OpenCV. To optimize the graph, we use TF TransformGraph with default parameters:

DEFAULT_OPT_GRAPH_NAME = "optimized_frozen_inference_graph.pb"
DEFAULT_INPUTS = "sub_7"
DEFAULT_OUTPUTS = "ResizeBilinear_3"
# Note: each transform must remain separated by a space after string concatenation
DEFAULT_TRANSFORMS = "remove_nodes(op=Identity)" \
                     " merge_duplicate_nodes" \
                     " strip_unused_nodes" \
                     " fold_constants(ignore_errors=true)" \
                     " fold_batch_norms" \
                     " fold_old_batch_norms"
def optimize_tf_graph(
        in_graph,
        out_graph=DEFAULT_OPT_GRAPH_NAME,
        inputs=DEFAULT_INPUTS,
        outputs=DEFAULT_OUTPUTS,
        transforms=DEFAULT_TRANSFORMS,
        is_manual=True,
        was_optimized=True
):
    #...
    tf_opt_graph = TransformGraph(
        tf_graph,
        inputs,
        outputs,
        transforms
    )
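
The #... above elides the graph I/O around the TransformGraph call. A hedged sketch of those parts (not necessarily the module's exact code), using the TF compat APIs already relied on by the snippet above:

import tensorflow as tf
from tensorflow.compat.v1 import GraphDef
from tensorflow.tools.graph_transforms import TransformGraph

def load_graph_def(graph_path):
    # Read a serialized frozen graph from a .pb file
    with tf.io.gfile.GFile(graph_path, 'rb') as graph_file:
        graph_def = GraphDef()
        graph_def.ParseFromString(graph_file.read())
    return graph_def

def save_graph_def(graph_def, graph_path):
    # Serialize the (optimized) graph back to a .pb file
    with tf.io.gfile.GFile(graph_path, 'wb') as graph_file:
        graph_file.write(graph_def.SerializeToString())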

To run the graph optimization process, execute the following command line:

python -m dnn_model_runner.dnn_conversion.tf.segmentation.tf_graph_optimizer --in_graph deeplab/deeplabv3_mnv2_pascal_trainval/frozen_inference_graph.pb

As a result, the deeplab/deeplabv3_mnv2_pascal_trainval directory will contain optimized_frozen_inference_graph.pb.

Once we have the model graph, let’s go over the steps listed below:

  1. Read the TF frozen_inference_graph.pb graph
  2. Read the optimized TF frozen graph using the OpenCV API
  3. Prepare input data
  4. Run inference
  5. Get a color mask from the prediction
  6. Visualize the results

These steps are implemented in the code below:

# Get the TF model graph from the obtained frozen graph
deeplab_graph = read_deeplab_frozen_graph(deeplab_frozen_graph_path)
# Read the DeepLab frozen graph using the OpenCV API
opencv_net = cv2.dnn.readNetFromTensorflow(opt_deeplab_frozen_graph_path)
print("OpenCV model was read successfully. Model layers: \n", opencv_net.getLayerNames())
# Get the preprocessed image
original_img_shape, tf_input_blob, opencv_input_img = get_processed_imgs("test_data/sem_segm/2007_000033.jpg")
# Get OpenCV DNN predictions
opencv_prediction = get_opencv_dnn_prediction(opencv_net, opencv_input_img)
# Get TF model predictions
tf_prediction = get_tf_dnn_prediction(deeplab_graph, tf_input_blob)
# Get the PASCAL VOC classes and colors
pascal_voc_classes, pascal_voc_colors = read_colors_info("test_data/sem_segm/pascal-classes.txt")
# Get the colored segmentation masks
opencv_colored_mask = get_colored_mask(original_img_shape, opencv_prediction, pascal_voc_colors)
tf_colored_mask = get_tf_colored_mask(original_img_shape, tf_prediction, pascal_voc_colors)
# Get the PASCAL VOC color legend
color_legend = get_legend(pascal_voc_classes, pascal_voc_colors)
cv2.imshow('TensorFlow color mask', tf_colored_mask)
cv2.imshow('OpenCV DNN color mask', opencv_colored_mask)
cv2.imshow('Color Legend', color_legend)
cv2.waitKey(0)  # keep the windows open until a key is pressed

We will use the following image from the PASCAL VOC validation dataset for model inference:

PASCAL VOC image (image missing)
The target segmentation result is:

PASCAL VOC ground truth (image missing)
For PASCAL VOC color decoding and its mapping to the prediction masks, we also need the pascal-classes.txt file, which contains the full list of PASCAL VOC classes and their corresponding colors.
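
As a minimal sketch, such a file can be parsed as follows, assuming one `<class_name> R G B` entry per line (the tutorial's read_colors_info helper may differ in details):

import numpy as np

# Minimal sketch: parse pascal-classes.txt, assuming "<name> R G B" lines
def parse_pascal_classes(filename):
    classes, colors = [], []
    with open(filename) as class_file:
        for line in class_file:
            name, r, g, b = line.split()
            classes.append(name)
            colors.append((int(r), int(g), int(b)))
    return classes, np.array(colors, dtype=np.uint8)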

Let’s take the pre-trained TF DeepLabV3 MobileNetV2 model as an example to examine each step in depth:

  • Read the TF frozen_inference_graph.pb graph:
# Init the DeepLab model graph
model_graph = tf.Graph()
# Obtain the frozen graph (GraphDef comes from tensorflow.compat.v1)
with tf.io.gfile.GFile(frozen_graph_path, 'rb') as graph_file:
    tf_model_graph = GraphDef()
    tf_model_graph.ParseFromString(graph_file.read())
with model_graph.as_default():
    tf.import_graph_def(tf_model_graph, name='')
  • Read an optimized TF frozen graph using the OpenCV API:
# Read DeepLab frozen graph using OpenCV API
opencv_net = cv2.dnn.readNetFromTensorflow(opt_deeplab_frozen_graph_path)
  • Prepare input data using the cv2.dnn.blobFromImage function:
# Read the image
input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
input_img = input_img.astype(np.float32)
# Preprocess the image for the TF model input
tf_preproc_img = cv2.resize(input_img, (513, 513))
tf_preproc_img = cv2.cvtColor(tf_preproc_img, cv2.COLOR_BGR2RGB)
# Define preprocessing parameters for OpenCV DNN
mean = np.array([1.0, 1.0, 1.0]) * 127.5
scale = 1 / 127.5
# Prepare the input blob to fit the model input:
# 1. Subtract the mean
# 2. Scale pixel values to the [-1, 1] range
input_blob = cv2.dnn.blobFromImage(
    image=input_img,
    scalefactor=scale,
    size=(513, 513),  # image target size
    mean=mean,
    swapRB=True,  # BGR -> RGB
    crop=False  # no center crop
)

Please note the preprocessing order in the cv2.dnn.blobFromImage function: the mean is subtracted first, and only then are the pixel values multiplied by the defined scale. Therefore, to reproduce the TF image preprocessing pipeline, we multiply mean by 127.5. Another important point concerns the image preprocessing of TF DeepLab: to pass an image into the TF model we only need to construct an input of the appropriate shape; the rest of the preprocessing is described in feature_extractor.py and is invoked automatically.
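
To make the order concrete, here is a small self-check (illustrative only, not part of the tutorial code) showing that blobFromImage computes (image - mean) * scalefactor in NCHW layout; with a uniform mean, the channel swap does not affect the subtraction:

import cv2
import numpy as np

# Illustrative self-check of the blobFromImage preprocessing order:
# blob = (resized_image - mean) * scalefactor, returned as NCHW.
img = np.random.randint(0, 256, (60, 80, 3)).astype(np.float32)
mean = np.array([127.5, 127.5, 127.5])
scale = 1 / 127.5

blob = cv2.dnn.blobFromImage(image=img, scalefactor=scale, size=(80, 60),
                             mean=mean, swapRB=True, crop=False)

manual = (cv2.cvtColor(img, cv2.COLOR_BGR2RGB) - mean) * scale  # HWC, RGB
manual = manual.transpose(2, 0, 1)[np.newaxis, ...]             # NCHW
print(np.allclose(blob, manual, atol=1e-5))  # expected: True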

  • Run OpenCV cv.dnn_Net inference:
# Set the OpenCV DNN input
opencv_net.setInput(preproc_img)
# Run the forward pass to obtain predictions
out = opencv_net.forward()
print("OpenCV DNN segmentation prediction: \n")
print("* shape: ", out.shape)
# Get the IDs of the predicted classes
out_predictions = np.argmax(out[0], axis=0)

After executing the above code, we will get the following output:

OpenCV DNN segmentation prediction:
* shape: (1, 21, 513, 513)

Each of the 21 prediction channels (21 is the number of PASCAL VOC classes) contains, for every pixel, a score indicating how likely that pixel is to belong to the corresponding PASCAL VOC class.
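
Since out_predictions above keeps only the winning class index per pixel, the confidence of that prediction can be recovered as well (an illustrative addition, not part of the original script):

# Illustrative addition: per-pixel confidence of the winning class,
# taken from the (1, 21, 513, 513) output "out" obtained above
confidences = np.max(out[0], axis=0)  # (513, 513), score of the argmax class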

  • Run TF model inference:
# Add a batch dimension to the preprocessed image
preproc_img = np.expand_dims(preproc_img, 0)
# Init the TF session (Session comes from tensorflow.compat.v1)
tf_session = Session(graph=model_graph)
input_tensor_name = "ImageTensor:0"
output_tensor_name = "SemanticPredictions:0"
# Run inference
out = tf_session.run(
    output_tensor_name,
    feed_dict={input_tensor_name: preproc_img}  # batch dimension already added above
)
print("TF segmentation model prediction: \n")
print("* shape: ", out.shape)

The TF inference results are as follows:

TF segmentation model prediction:
* shape: (1, 513, 513)

The TF prediction contains, for each pixel, the index of the corresponding PASCAL VOC class.

  • Convert the OpenCV prediction into a colored mask:
mask_height = segm_mask.shape[0]
mask_width = segm_mask.shape[1]
img_height = original_img_shape[0]
img_width = original_img_shape[1]
# Convert mask values into PASCAL VOC colors
processed_mask = np.stack([colors[color_id] for color_id in segm_mask.flatten()])
# Reshape the mask into a 3-channel image
processed_mask = processed_mask.reshape(mask_height, mask_width, 3)
processed_mask = cv2.resize(processed_mask, (img_width, img_height),
                            interpolation=cv2.INTER_NEAREST).astype(np.uint8)
# Convert the colored mask from BGR to RGB for compatibility with the PASCAL VOC colors
processed_mask = cv2.cvtColor(processed_mask, cv2.COLOR_BGR2RGB)

In this step, we map the class IDs predicted in the segmentation mask to the appropriate colors of the predicted classes. Let’s take a look at the results:

Color legend (image missing)
OpenCV color mask (image missing)

  • Convert the TF prediction into a colored mask:
colors = np.array(colors)
processed_mask = colors[segm_mask[0]]
img_height = original_img_shape[0]
img_width = original_img_shape[1]
processed_mask = cv2.resize(processed_mask, (img_width, img_height),
                            interpolation=cv2.INTER_NEAREST).astype(np.uint8)
# Convert color mask from BGR to RGB for compatibility with PASCAL VOC colors
processed_mask = cv2.cvtColor(processed_mask, cv2.COLOR_BGR2RGB)

The results are as follows:

TF color mask (image missing)
As a result, we get two identical segmentation masks.
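
This can also be verified programmatically (an illustrative check, assuming the mask variables computed in the pipeline above):

# Illustrative check that the two colored masks are identical
print(np.array_equal(opencv_colored_mask, tf_colored_mask))  # expected: True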

Model Evaluation

The dnn_model_runner module from dnn/samples allows running the full evaluation pipeline on the PASCAL VOC dataset and testing the execution of the DeepLab MobileNet model.

Evaluation Mode

The following line runs the module in evaluation mode:

python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_segm

The model will be read into an OpenCV cv.dnn_Net object. Evaluation results (pixel accuracy, mean IoU, inference time) for the TF and OpenCV models are written to log files. Inference time values are also shown in a chart to summarize the obtained model information.
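
For reference, the reported metrics can be defined as follows (a minimal sketch of the standard definitions, not the module's actual implementation):

import numpy as np

# Minimal sketch of the standard segmentation metrics, computed from a
# confusion matrix accumulated over all validation pixels; conf_mat[i, j]
# counts pixels of ground-truth class i predicted as class j.
def segm_metrics(conf_mat):
    tp = np.diag(conf_mat).astype(np.float64)
    pixel_accuracy = tp.sum() / conf_mat.sum()
    union = conf_mat.sum(axis=0) + conf_mat.sum(axis=1) - tp
    with np.errstate(divide='ignore', invalid='ignore'):
        per_class_iou = tp / union
    mean_iou = np.nanmean(per_class_iou)  # classes absent from both sides are skipped
    return pixel_accuracy, mean_iou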

The necessary evaluation configuration is defined in test_config.py:

import os
from dataclasses import dataclass

@dataclass
class TestSegmConfig:
    frame_size: int = 500
    img_root_dir: str = "./VOC2012"
    img_dir: str = os.path.join(img_root_dir, "JPEGImages/")
    img_segm_gt_dir: str = os.path.join(img_root_dir, "SegmentationClass/")
    # Reduced validation set: https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/data/pascal/seg11valid.txt
    segm_val_file: str = os.path.join(img_root_dir, "ImageSets/Segmentation/seg11valid.txt")
    color_file_cls: str = os.path.join(img_root_dir, "ImageSets/Segmentation/pascal-classes.txt")

These values can be modified based on the selected model pipeline.
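
For example, the fields can be overridden at construction time if the dataset lives elsewhere. A hedged sketch (the variable names are illustrative); note that the derived path defaults are evaluated once when the class is defined, so they do not follow an overridden img_root_dir automatically:

import os

# Hypothetical override, assuming TestSegmConfig is imported from test_config;
# the derived paths must be overridden together with img_root_dir
root = "/data/VOC2012"
config = TestSegmConfig(
    img_root_dir=root,
    img_dir=os.path.join(root, "JPEGImages/"),
    img_segm_gt_dir=os.path.join(root, "SegmentationClass/"),
    segm_val_file=os.path.join(root, "ImageSets/Segmentation/seg11valid.txt"),
    color_file_cls=os.path.join(root, "ImageSets/Segmentation/pascal-classes.txt"),
)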

Test Mode

The following line represents running the module in test mode, which provides the steps for model inference:

python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_segm --test True --default_img_preprocess <True/False> --evaluate False

Here the default_img_preprocess keyword defines whether you want to parameterize the model testing process with particular values or use the default values, for example, scale, mean, or std.

The test configuration is represented in the test_config.py TestSegmModuleConfig class:

import os
from dataclasses import dataclass, field
from typing import List

@dataclass
class TestSegmModuleConfig:
    segm_test_data_dir: str = "test_data/sem_segm"
    test_module_name: str = "segmentation"
    test_module_path: str = "segmentation.py"
    input_img: str = os.path.join(segm_test_data_dir, "2007_000033.jpg")
    model: str = ""
    frame_height: str = str(TestSegmConfig.frame_size)
    frame_width: str = str(TestSegmConfig.frame_size)
    scale: float = 1.0
    mean: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    std: List[float] = field(default_factory=list)
    crop: bool = False
    rgb: bool = True
    classes: str = os.path.join(segm_test_data_dir, "pascal-classes.txt")

Default image preprocessing options are defined in default_preprocess_config.py:

tf_segm_input_blob = {
    "scale": str(1 / 127.5),
    "mean": ["127.5", "127.5", "127.5"],
    "std": [],
    "crop": "False",
    "rgb": "True"
}

Model testing is based on samples/dnn/segmentation.py. segmentation.py can be executed autonomously with the converted model provided in --input and with the parameters populated for cv2.dnn.blobFromImage.
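
For illustration, a standalone run could look like the following. This is a hypothetical invocation assuming the standard samples/dnn argument names (--model, --input, --width, --height, --mean, --scale, --rgb, --classes); verify the flags against segmentation.py in your OpenCV version:

python samples/dnn/segmentation.py \
    --model deeplab/deeplabv3_mnv2_pascal_trainval/optimized_frozen_inference_graph.pb \
    --input test_data/sem_segm/2007_000033.jpg \
    --width 513 --height 513 \
    --mean 127.5 127.5 127.5 \
    --scale 0.00784 \
    --rgb \
    --classes test_data/sem_segm/pascal-classes.txt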

To reproduce the OpenCV steps described in “Model Conversion Pipeline” from scratch using dnn_model_runner, execute the following code:

python -m dnn_model_runner.dnn_conversion.tf.segmentation.py_to_py_segm --test True --default_img_preprocess True --evaluate False