Using Convolutional Neural Networks to Remove Moiré from Images

Abstract: This project shows how to use a convolutional neural network to detect recaptured images, mainly images that contain moiré patterns. Its main innovation is the network structure, which processes the high-frequency and low-frequency information of the image separately.

This article is shared from the HUAWEI CLOUD Community “A Brief Description and Code Implementation of Moiré Removal of Images”, author: Li Changan.

1. Preface

When the spatial frequency of the pixels on the photosensitive element is close to the spatial frequency of stripes in the scene, a new wavy interference pattern, known as moiré, may appear. The grid-like layout of the sensor is itself one such periodic pattern, so noticeable interference also arises when thin, stripe-like structures in the scene cross the sensor grid at small angles. The phenomenon is common with fine textures, such as cloth in fashion photography. Moiré can show up as variations in brightness or in color. Here, however, we only deal with the moiré produced when an image is recaptured.
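
The frequency relationship can be illustrated in one dimension. The sketch below is an illustration only, with assumed numbers (stripes at 95 cycles per unit length, a sensor taking 100 samples per unit length) and a hypothetical helper `sample_stripes`. It shows that the sampled fine stripes are indistinguishable from a coarse 5-cycle pattern, which is what appears as moiré.

```python
import math


def sample_stripes(stripe_freq, sample_freq, n_samples):
    """Sample a cosine stripe pattern of stripe_freq cycles per unit length
    at sample_freq evenly spaced points per unit length."""
    return [math.cos(2 * math.pi * stripe_freq * k / sample_freq)
            for k in range(n_samples)]


# Assumed values: fine stripes just below the sampling frequency of the sensor.
fine = sample_stripes(stripe_freq=95, sample_freq=100, n_samples=200)
# Stripes at the difference frequency |95 - 100| = 5 cycles per unit length.
coarse = sample_stripes(stripe_freq=5, sample_freq=100, n_samples=200)

# The two sample sequences agree up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(fine, coarse))
print(max_diff < 1e-9)
```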

Recapturing means photographing a picture that is displayed on a computer screen, that is, taking a picture with the camera pointed at the screen. This produces moiré patterns in the resulting picture.

The main processing ideas of the paper

A Haar wavelet transform is applied to the original image to obtain four downsampled sub-bands: the low-frequency approximation cA, the horizontal high-frequency detail cH, the vertical high-frequency detail cV, and the diagonal high-frequency detail cD.
Four independent CNN branches then apply convolution and pooling to the four sub-bands to extract features.
The pooled outputs of the three high-frequency branches are compared channel by channel and pixel by pixel, and the maximum is kept.
The result of the previous step is then multiplied element-wise with the pooled output of the cA branch (paddle.multiply in the code below).
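
The last two steps can be sketched on tiny feature maps. This is a plain-Python illustration with made-up values and a hypothetical helper `fuse_subbands`; in the network itself the same two operations are applied to the pooled branch outputs.

```python
def fuse_subbands(ll, lh, hl, hh):
    """Element-wise max over the three high-frequency maps, then
    element-wise product with the low-frequency map."""
    fused = []
    for ll_row, lh_row, hl_row, hh_row in zip(ll, lh, hl, hh):
        fused.append([
            a * max(h, v, d)
            for a, h, v, d in zip(ll_row, lh_row, hl_row, hh_row)
        ])
    return fused


# Made-up 2x2 feature maps for a single channel.
ll = [[1.0, 2.0], [3.0, 4.0]]
lh = [[0.1, 0.9], [0.0, 0.2]]
hl = [[0.5, 0.3], [0.4, 0.1]]
hh = [[0.2, 0.2], [0.6, 0.0]]

print(fuse_subbands(ll, lh, hl, hh))
```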

Paper address

2. Reproducing the network structure

This project reproduces the paper's moiré detection method and modifies the data processing part. The network structure follows the structure in the source code, which takes four downsampled sub-bands of the image as input rather than the three described in the paper; see the network definition below for the specific processing.

import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.nn import Conv2D, Dropout, Linear, MaxPool2D


class mcnn(nn.Layer):
    def __init__(self, num_classes=1000):
        super(mcnn, self).__init__()
        self.num_classes = num_classes

        # One 7x7 convolution and one max pooling per wavelet sub-band (LL, LH, HL, HH)
        self._conv1_LL = Conv2D(3, 32, 7, stride=2, padding=1)
        self._conv1_LH = Conv2D(3, 32, 7, stride=2, padding=1)
        self._conv1_HL = Conv2D(3, 32, 7, stride=2, padding=1)
        self._conv1_HH = Conv2D(3, 32, 7, stride=2, padding=1)
        self.pool_1_LL = MaxPool2D(kernel_size=2, stride=2, padding=0)
        self.pool_1_LH = MaxPool2D(kernel_size=2, stride=2, padding=0)
        self.pool_1_HL = MaxPool2D(kernel_size=2, stride=2, padding=0)
        self.pool_1_HH = MaxPool2D(kernel_size=2, stride=2, padding=0)

        # Shared trunk applied to the merged features
        self._conv2 = Conv2D(32, 16, 3, stride=2, padding=1)
        self.pool_2 = MaxPool2D(kernel_size=2, stride=2, padding=0)
        self.dropout2 = Dropout(p=0.5)
        self._conv3 = Conv2D(16, 32, 3, stride=2, padding=1)
        self.pool_3 = MaxPool2D(kernel_size=2, stride=2, padding=0)
        self._conv4 = Conv2D(32, 32, 3, stride=2, padding=1)
        self.pool_4 = MaxPool2D(kernel_size=2, stride=2, padding=0)
        self.dropout4 = Dropout(p=0.5)

        # in_features=64 assumes 512x384 inputs (32 channels x 2 x 1 after pool_4)
        self._fc1 = Linear(in_features=64, out_features=num_classes)
        self.dropout5 = Dropout(p=0.5)
        # in_features was hard-coded to 2, which only worked for num_classes == 2
        self._fc2 = Linear(in_features=num_classes, out_features=num_classes)

    def forward(self, inputs1, inputs2, inputs3, inputs4):
        # Branch-wise convolution + ReLU + pooling
        pool_x1_LL = self.pool_1_LL(F.relu(self._conv1_LL(inputs1)))
        pool_x1_LH = self.pool_1_LH(F.relu(self._conv1_LH(inputs2)))
        pool_x1_HL = self.pool_1_HL(F.relu(self._conv1_HL(inputs3)))
        pool_x1_HH = self.pool_1_HH(F.relu(self._conv1_HH(inputs4)))

        # Element-wise max over the three high-frequency branches,
        # then element-wise product with the low-frequency branch
        temp = paddle.maximum(pool_x1_LH, pool_x1_HL)
        max_LH_HL_HH = paddle.maximum(temp, pool_x1_HH)
        inp_merged = paddle.multiply(pool_x1_LL, max_LH_HL_HH)

        x2 = self._conv2(inp_merged)
        x2 = F.relu(x2)
        x2 = self.pool_2(x2)
        x2 = self.dropout2(x2)
        x3 = self._conv3(x2)
        x3 = F.relu(x3)
        x3 = self.pool_3(x3)
        x4 = self._conv4(x3)
        x4 = F.relu(x4)
        x4 = self.pool_4(x4)
        x4 = self.dropout4(x4)
        x4 = paddle.flatten(x4, start_axis=1, stop_axis=-1)
        x5 = self._fc1(x4)
        x5 = self.dropout5(x5)
        out = self._fc2(x5)
        return out


model_res = mcnn(num_classes=2)
paddle.summary(model_res, [(1, 3, 512, 384), (1, 3, 512, 384), (1, 3, 512, 384), (1, 3, 512, 384)])
---------------------------------------------------------------------------
 Layer (type)        Input Shape           Output Shape        Param #
===========================================================================
   Conv2D-1       [[1, 3, 512, 384]]    [1, 32, 254, 190]       4,736
   Conv2D-2       [[1, 3, 512, 384]]    [1, 32, 254, 190]       4,736
   Conv2D-3       [[1, 3, 512, 384]]    [1, 32, 254, 190]       4,736
   Conv2D-4       [[1, 3, 512, 384]]    [1, 32, 254, 190]       4,736
  MaxPool2D-1     [[1, 32, 254, 190]]   [1, 32, 127, 95]            0
  MaxPool2D-2     [[1, 32, 254, 190]]   [1, 32, 127, 95]            0
  MaxPool2D-3     [[1, 32, 254, 190]]   [1, 32, 127, 95]            0
  MaxPool2D-4     [[1, 32, 254, 190]]   [1, 32, 127, 95]            0
   Conv2D-5       [[1, 32, 127, 95]]    [1, 16, 64, 48]         4,624
  MaxPool2D-5     [[1, 16, 64, 48]]     [1, 16, 32, 24]             0
   Dropout-1      [[1, 16, 32, 24]]     [1, 16, 32, 24]             0
   Conv2D-6       [[1, 16, 32, 24]]     [1, 32, 16, 12]         4,640
  MaxPool2D-6     [[1, 32, 16, 12]]     [1, 32, 8, 6]               0
   Conv2D-7       [[1, 32, 8, 6]]       [1, 32, 4, 3]           9,248
  MaxPool2D-7     [[1, 32, 4, 3]]       [1, 32, 2, 1]               0
   Dropout-2      [[1, 32, 2, 1]]       [1, 32, 2, 1]               0
   Linear-1       [[1, 64]]             [1, 2]                    130
   Dropout-3      [[1, 2]]              [1, 2]                      0
   Linear-2       [[1, 2]]              [1, 2]                      6
===========================================================================
Total params: 37,592
Trainable params: 37,592
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 9.00
Forward/backward pass size (MB): 59.54
Params size (MB): 0.14
Estimated Total Size (MB): 68.68
---------------------------------------------------------------------------
{'total_params': 37592, 'trainable_params': 37592}
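
The spatial sizes in the summary follow from the usual output-size formula for convolution and pooling, floor((n + 2 * padding - kernel) / stride) + 1. The sketch below recomputes them in plain Python; `out_size` and `flattened_features` are hypothetical helper names, and the layer parameters are copied from the network definition. It also shows why `in_features=64` in the first fully connected layer is tied to a 512x384 input.

```python
def out_size(n, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def flattened_features(height, width, channels=32):
    """Number of features that reach the first fully connected layer."""
    # (kernel, stride, padding) for conv1/pool1 through conv4/pool4.
    layers = [(7, 2, 1), (2, 2, 0),
              (3, 2, 1), (2, 2, 0),
              (3, 2, 1), (2, 2, 0),
              (3, 2, 1), (2, 2, 0)]
    for kernel, stride, padding in layers:
        height = out_size(height, kernel, stride, padding)
        width = out_size(width, kernel, stride, padding)
    return channels * height * width


print(flattened_features(512, 384))  # 64, matching Linear-1 in the summary
print(flattened_features(448, 448))  # 128, so in_features would have to change
```

Any other input size, such as the 448x448 case above, needs a matching `in_features` value or the first linear layer will reject the flattened tensor.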

3. Data preprocessing

Unlike the source code, this project integrates the wavelet decomposition into the data loading step: the decomposition is performed online, rather than offline with the sub-band images saved to disk as in the source code. First, define the wavelet decomposition functions.

!pip install PyWavelets
import numpy as np
import pywt


def splitFreqBands(img, levRows, levCols):
    # Cut the transformed image into its four quadrants (sub-bands)
    halfRow = int(levRows / 2)
    halfCol = int(levCols / 2)
    LL = img[0:halfRow, 0:halfCol]
    LH = img[0:halfRow, halfCol:levCols]
    HL = img[halfRow:levRows, 0:halfCol]
    HH = img[halfRow:levRows, halfCol:levCols]
    return LL, LH, HL, HH


def haarDWT1D(data, length):
    # One-level 1-D Haar step, written back in place:
    # pairwise averages go to the first half, pairwise half-differences to the second half
    avg0 = 0.5
    avg1 = 0.5
    dif0 = 0.5
    dif1 = -0.5
    # Results are stored as uint8, so fractional parts are dropped and negative
    # differences cannot be represented; switch to float to preserve them
    temp = np.empty_like(data).astype(np.uint8)
    h = int(length / 2)
    for i in range(h):
        k = i * 2
        temp[i] = data[k] * avg0 + data[k + 1] * avg1
        temp[i + h] = data[k] * dif0 + data[k + 1] * dif1
    data[:] = temp


def fwdHaarDWT2D(img):
    # One-level 2-D Haar transform: transform every row, then every column,
    # and return the four sub-bands
    img = np.array(img)
    levRows = img.shape[0]
    levCols = img.shape[1]
    img = img.astype(np.uint8)
    for i in range(levRows):
        row = img[i, :]
        haarDWT1D(row, levCols)
        img[i, :] = row
    for j in range(levCols):
        col = img[:, j]
        haarDWT1D(col, levRows)
        img[:, j] = col
    return splitFreqBands(img, levRows, levCols)
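
To make the sub-bands concrete, here is a small float-based sketch of the same one-level Haar step on a 4x4 grayscale block. It is an illustration under assumptions, not the project's code: the helper names `haar_1d` and `haar_2d` are hypothetical, it works on nested lists rather than NumPy arrays, and it keeps floats so that negative detail coefficients survive.

```python
def haar_1d(values):
    """One-level Haar step: pairwise averages followed by pairwise half-differences."""
    half = len(values) // 2
    averages = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    details = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return averages + details


def haar_2d(image):
    """Transform every row, then every column, and split into four quadrants."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    full = [list(row) for row in zip(*cols)]  # transpose back to row-major
    h, w = len(full) // 2, len(full[0]) // 2
    ll = [row[:w] for row in full[:h]]
    lh = [row[w:] for row in full[:h]]
    hl = [row[:w] for row in full[h:]]
    hh = [row[w:] for row in full[h:]]
    return ll, lh, hl, hh


# Vertical stripes one pixel wide: columns alternate between 10 and 20.
block = [[10, 20, 10, 20] for _ in range(4)]
ll, lh, hl, hh = haar_2d(block)
print(ll)  # local averages
print(lh)  # detail across columns, negative for this block
```

With one-pixel vertical stripes, all of the stripe energy lands in the LH band as negative values, which is exactly the kind of information a uint8 buffer cannot hold.
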
!cd "data/data188843/" && unzip -q 'total_images.zip'
import os

recapture_keys = ['ValidationMoire']
original_keys = ['ValidationClear']


def get_image_label_from_folder_name(folder_name):
    """
    Infer the label name from the folder name.

    :param folder_name: name of an image folder
    :return: 'original', 'recapture', or 'unclear' if no key matches
    """
    for key in original_keys:
        if key in folder_name:
            return 'original'
    for key in recapture_keys:
        if key in folder_name:
            return 'recapture'
    return 'unclear'


label_name2label_id = {
    'original': 0,
    'recapture': 1,
}

src_image_dir = "data/data188843/total_images"
dst_file = "data/data188843/total_images/train.txt"

image_folder = [file for file in os.listdir(src_image_dir)]
print(image_folder)

image_anno_list = []
for folder in image_folder:
    label_name = get_image_label_from_folder_name(folder)
    # Skip entries that are not labeled image folders (for example train.txt on a re-run)
    if label_name not in label_name2label_id:
        continue
    label_id = label_name2label_id[label_name]
    folder_path = os.path.join(src_image_dir, folder)
    image_file_list = [file for file in os.listdir(folder_path)
                       if file.endswith(('.jpg', '.jpeg', '.JPG', '.JPEG', '.png'))]
    for image_file in image_file_list:
        # One annotation per line: "<folder>/<file>\t<label_id>"
        image_anno_list.append(folder + "/" + image_file + "\t" + str(label_id) + '\n')

# Make sure the directory that will hold train.txt exists
dst_path = os.path.dirname(dst_file)
if not os.path.exists(dst_path):
    os.makedirs(dst_path)
with open(dst_file, 'w') as fd:
    fd.writelines(image_anno_list)
import paddle
import numpy as np
import PIL.Image as Image
from paddle.vision import transforms

paddle.disable_static()

# Define data preprocessing
data_transforms = transforms.Compose([
    # size is (height, width); 512x384 matches the input size used in paddle.summary,
    # which the first fully connected layer (in_features=64) depends on
    transforms.Resize(size=(512, 384)),
    transforms.ToTensor(),  # transpose to CHW and scale to [0, 1] (img / 255)
    # transforms.Normalize(  # subtract the mean and divide by the standard deviation
    #     mean=[0.31169346, 0.25506335, 0.12432463],
    #     std=[0.34042713, 0.29819837, 0.1375536])
    # Calculation: output[channel] = (input[channel] - mean[channel]) / std[channel]
])


# Build Dataset
class MyDataset(paddle.io.Dataset):
    """
    Step 1: Inherit from the paddle.io.Dataset class
    """

    def __init__(self, train_img_list, val_img_list, train_label_list, val_label_list, mode='train'):
        """
        Step 2: Implement the constructor, define how the data is read, and select the training or test split
        """
        super(MyDataset, self).__init__()
        self.img = []
        self.label = []
        self.train_images = train_img_list
        self.test_images = val_img_list
        self.train_label = train_label_list
        self.test_label = val_label_list
        if mode == 'train':
            # Read the data of train_images
            for img, la in zip(self.train_images, self.train_label):
                self.img.append('/home/aistudio/data/data188843/total_images/' + img)
                self.label.append(paddle.to_tensor(int(la), dtype='int64'))
        else:
            # Read the data of test_images
            for img, la in zip(self.test_images, self.test_label):
                self.img.append('/home/aistudio/data/data188843/total_images/' + img)
                self.label.append(paddle.to_tensor(int(la), dtype='int64'))

    def load_img(self, image_path):
        # Read the image with Pillow and force three RGB channels
        image = Image.open(image_path).convert('RGB')
        return image

    def __getitem__(self, index):
        """
        Step 3: Implement the __getitem__ method, define how to obtain data for a given index, and return a single sample (the four sub-bands and the corresponding label)
        """
        image = self.load_img(self.img[index])
        # Online wavelet decomposition of the loaded image
        LL, LH, HL, HH = fwdHaarDWT2D(image)
        label = self.label[index]
        LL = data_transforms(LL)
        LH = data_transforms(LH)
        HL = data_transforms(HL)
        HH = data_transforms(HH)
        return LL, LH, HL, HH, np.array(label, dtype='int64')

    def __len__(self):
        """
        Step 4: Implement the __len__ method and return the total number of samples
        """
        return len(self.img)


image_file_txt = '/home/aistudio/data/data188843/total_images/train.txt'
with open(image_file_txt) as fd:
    lines = fd.readlines()

train_img_list = list()
train_label_list = list()
for line in lines:
    # Each line is "<folder>/<file>\t<label_id>"
    image_name, label_id = line.strip().split('\t')
    train_img_list.append(image_name)
    train_label_list.append(label_id)

# Build the dataset
train_dataset = MyDataset(mode='train', train_label_list=train_label_list, train_img_list=train_img_list,
                          val_img_list=train_img_list, val_label_list=train_label_list)

# Build the training set data loader
train_loader = paddle.io.DataLoader(train_dataset, batch_size=2, shuffle=True)
# Build the validation data loader (this example reuses the training set,
# so the reported validation accuracy is not a held-out estimate)
valid_loader = paddle.io.DataLoader(train_dataset, batch_size=2, shuffle=True)

print('==============train dataset===============')
for LL, LH, HL, HH, label in train_dataset:
    print('label: {}'.format(label))
    break
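
The loaders above validate on the same images they train on. If a held-out estimate is wanted, the annotation lists can be split before the datasets are built. The following is a minimal sketch, assuming an 80/20 split and a fixed seed; `split_train_val` is a hypothetical helper and the file names are made up.

```python
import random


def split_train_val(images, labels, val_ratio=0.2, seed=0):
    """Shuffle (image, label) pairs and split them into train and validation lists."""
    pairs = list(zip(images, labels))
    random.Random(seed).shuffle(pairs)
    n_val = int(len(pairs) * val_ratio)
    val_pairs, train_pairs = pairs[:n_val], pairs[n_val:]
    train_images = [img for img, _ in train_pairs]
    train_labels = [lab for _, lab in train_pairs]
    val_images = [img for img, _ in val_pairs]
    val_labels = [lab for _, lab in val_pairs]
    return train_images, train_labels, val_images, val_labels


# Made-up file names in the "<folder>/<file>" form used by train.txt.
images = (['ValidationClear/%d.jpg' % i for i in range(5)]
          + ['ValidationMoire/%d.jpg' % i for i in range(5)])
labels = ['0'] * 5 + ['1'] * 5
tr_img, tr_lab, va_img, va_lab = split_train_val(images, labels)
print(len(tr_img), len(va_img))
```

The four returned lists could then be passed to `MyDataset` as `train_img_list`, `train_label_list`, `val_img_list` and `val_label_list`, using `mode='train'` for the training dataset and any other mode for the validation dataset.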

4. Model training

model2 = paddle.Model(model_res)
model2.prepare(optimizer=paddle.optimizer.Adam(parameters=model2.parameters()),
               loss=nn.CrossEntropyLoss(),
               metrics=paddle.metric.Accuracy())
model2.fit(train_loader,
           valid_loader,
           epochs=5,
           verbose=1)
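
Once trained, the network outputs two logits per image. The sketch below shows one way to turn them into a label, assuming the label ids from the annotation step (0 for original, 1 for recapture). `softmax` and `predict_label` are hypothetical helpers written in plain Python, and the logits are made-up values rather than real model output.

```python
import math

LABEL_NAMES = {0: 'original', 1: 'recapture'}


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the most likely label name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABEL_NAMES[best], probs[best]


name, prob = predict_label([-0.3, 1.2])  # made-up logits for one image
print(name, round(prob, 3))
```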

Summary

This project shows how to use a convolutional neural network to detect recaptured images, mainly images with moiré patterns. Its main innovation is the network structure, which processes the high-frequency and low-frequency information of the image separately.

In this project, the CNN is trained using only a one-level wavelet decomposition. The effect of multilevel wavelet decomposition on accuracy remains to be explored, and the model could also be trained with more and harder examples and with deeper networks.
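
As a starting point for the multilevel idea, a decomposition with several levels can be obtained by applying the one-level transform again to the LL band. This is a plain-Python sketch under assumptions: the helpers `haar_level` and `multilevel_haar` are hypothetical, the image is a nested list of floats, and its sides are divisible by 2 at every level.

```python
def haar_level(image):
    """One-level 2-D Haar transform on a nested list; returns (LL, LH, HL, HH)."""
    h, w = len(image) // 2, len(image[0]) // 2
    ll, lh, hl, hh = ([[0.0] * w for _ in range(h)] for _ in range(4))
    for i in range(h):
        for j in range(w):
            a, b = image[2 * i][2 * j], image[2 * i][2 * j + 1]
            c, d = image[2 * i + 1][2 * j], image[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 4  # local average
            lh[i][j] = (a - b + c - d) / 4  # difference between columns
            hl[i][j] = (a + b - c - d) / 4  # difference between rows
            hh[i][j] = (a - b - c + d) / 4  # diagonal difference
    return ll, lh, hl, hh


def multilevel_haar(image, levels):
    """Repeatedly decompose the LL band; returns the final LL band and the
    (LH, HL, HH) detail bands of each level, finest level first."""
    details = []
    ll = image
    for _ in range(levels):
        ll, lh, hl, hh = haar_level(ll)
        details.append((lh, hl, hh))
    return ll, details


# 8x8 checkerboard of zeros and ones as a stand-in for a fine texture.
image = [[float((r + c) % 2) for c in range(8)] for r in range(8)]
ll, details = multilevel_haar(image, levels=2)
print(len(ll), len(ll[0]))                     # size of the final LL band
print(len(details[0][0]), len(details[1][0]))  # LH height at levels 1 and 2
```

Each extra level halves the side length of the bands, so a network that consumes several levels would need one set of input branches per level or some resizing scheme; which works better is left open here.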
