DCGAN–Keras implementation

Article directory

  • 1. Keras and tf.keras?
  • 2. The use of Model in keras
  • 3. Use Keras to implement DCGAN
    • 1. Import the necessary packages
    • 2. Specify the model input dimensions: the image size and the length of the noise vector
    • 3. Build generator
    • 4. Construct discriminator
    • 5. Build and compile DCGAN
    • 6. Train the model
    • 7. Display the generated image
    • 8. Run the model
  • Summary

1. Keras and tf.keras?

This post keeps things plain and simple.
You can read the first two articles first; they will make everything much clearer.
Keras is a high-level API, but if we want to customize the loss function or other components, we have to define them with TensorFlow operations.
tf.keras contains the full Keras API, so we are better off using tf.keras.
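
For example, here is a minimal sketch of a custom loss written with TensorFlow operations and used through tf.keras (the label-smoothing value is purely illustrative):

import tensorflow as tf

# Hypothetical custom loss: binary cross-entropy with smoothed "real" labels
def smoothed_bce(y_true, y_pred):
    y_true = y_true * 0.9  # smooth labels from 1.0 to 0.9 (illustrative choice)
    return tf.keras.losses.binary_crossentropy(y_true, y_pred)

# Any tf.keras model can then be compiled with it, e.g.:
# model.compile(loss=smoothed_bce, optimizer='adam')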

2. The use of Model in keras

See this blog post.
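
For quick reference, a minimal example of the functional Model API (the layer sizes here are arbitrary):

from tensorflow.keras import layers, Model

# Wire inputs to outputs explicitly, then wrap them in a Model
inputs = layers.Input(shape=(100,))
hidden = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(1, activation='sigmoid')(hidden)
model = Model(inputs, outputs)
model.summary()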

3. Use Keras to implement DCGAN

The MNIST handwritten digit dataset is used here.

I implemented it in Colab; the specific implementation is as follows:

1. Import the necessary packages

The code is as follows (example):

%matplotlib inline
 
import matplotlib.pyplot as plt
import numpy as np
 
from keras.datasets import mnist
from keras.layers import Activation, BatchNormalization, Dense, Dropout, Flatten, Reshape
from keras.layers import LeakyReLU
from keras.layers import Conv2D, Conv2DTranspose  # in newer Keras versions, import these from keras.layers rather than keras.layers.convolutional
from keras.models import Sequential
from keras.optimizers import Adam

2. Specify model input dimensions: image size and length of noise vector

The code is as follows (example):

img_rows = 28
img_cols = 28
channels = 1
 
# Dimensions of the input image
img_shape = (img_rows, img_cols, channels)
 
# length of noise vector Z
z_dim = 100

3. Build generator

Explanation: The generator is essentially the inverse of a convolutional network. Convolutional networks are typically used for image classification: during convolution, the width and height shrink while the number of channels grows. In the generator, the input is a random vector, and we need it to eventually produce an image; here the output is a 28×28×1 grayscale image.
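
To see the shape arithmetic concretely: with padding='same', a transposed convolution multiplies the spatial dimensions by its stride. A quick check (assuming TensorFlow 2.x):

import tensorflow as tf

x = tf.zeros((1, 7, 7, 256))  # a batch containing one 7×7×256 tensor
y = tf.keras.layers.Conv2DTranspose(128, kernel_size=3, strides=2, padding='same')(x)
print(y.shape)  # (1, 14, 14, 128): stride 2 doubles width and height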

All the steps together are as follows.
(1) Take a random noise vector and reshape it into a 7×7×256 tensor through a fully connected layer.
(2) Use transposed convolution to convert the 7×7×256 tensor into a 14×14×128 tensor.
(3) Apply batch normalization and the LeakyReLU activation function.
(4) Use transposed convolution to convert the 14×14×128 tensor into a 14×14×64 tensor. Note: the width and height stay the same; this is achieved by setting the stride parameter of Conv2DTranspose to 1.
(5) Apply batch normalization and the LeakyReLU activation function.
(6) Use transposed convolution to convert the 14×14×64 tensor into the output image size, 28×28×1.
(7) Apply the tanh activation function.

def build_generator(z_dim):
    # Sequential model
    model = Sequential()

    # Reshape the input into a 7*7*256 tensor through a fully connected layer
    model.add(Dense(256 * 7 * 7, input_dim=z_dim))
    model.add(Reshape((7, 7, 256)))

    # Convert the 7*7*256 tensor into a 14*14*128 tensor with a transposed convolutional layer
    model.add(Conv2DTranspose(128, kernel_size=3, strides=2, padding='same'))

    # Batch normalization
    model.add(BatchNormalization())

    # Leaky ReLU activation function
    model.add(LeakyReLU(alpha=0.01))

    # Convert the 14*14*128 tensor into a 14*14*64 tensor with a transposed convolutional layer
    model.add(Conv2DTranspose(64, kernel_size=3, strides=1, padding='same'))

    # Batch normalization
    model.add(BatchNormalization())

    # Leaky ReLU activation function
    model.add(LeakyReLU(alpha=0.01))

    # Convert the 14*14*64 tensor into a 28*28*1 tensor with a transposed convolutional layer
    model.add(Conv2DTranspose(1, kernel_size=3, strides=2, padding='same'))

    # Output layer with tanh activation function
    model.add(Activation('tanh'))

    return model
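
A quick way to confirm that the layers produce the shapes listed above is to print the model summary (illustrative usage):

# Sanity check: the last layer should output shape (None, 28, 28, 1)
build_generator(z_dim).summary()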

4. Construct discriminator

The discriminator is no different from an ordinary convolutional neural network used for image classification: it takes an image as input and outputs a value that judges whether the image is real.

All the steps together are as follows.
(1) Use a convolutional layer to convert the 28×28×1 input image into a 14×14×32 tensor.
(2) Apply the LeakyReLU activation function.
(3) Use a convolutional layer to convert the 14×14×32 tensor into a 7×7×64 tensor.
(4) Apply batch normalization and the LeakyReLU activation function.
(5) Use a convolutional layer to convert the 7×7×64 tensor into a 3×3×128 tensor.
(6) Apply batch normalization and the LeakyReLU activation function.
(7) Flatten the 3×3×128 tensor into a vector of size 3×3×128 = 1152.
(8) Use a fully connected layer with a sigmoid activation function to compute the probability that the input image is real.

def build_discriminator(img_shape):

    model = Sequential()

    # Convert the 28*28*1 input image into a 14*14*32 tensor with a convolutional layer
    model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=img_shape, padding='same'))

    # Leaky ReLU activation function
    model.add(LeakyReLU(alpha=0.01))

    # Convert the 14*14*32 tensor into a 7*7*64 tensor with a convolutional layer
    # (input_shape is only needed on the first layer, so it is omitted here)
    model.add(Conv2D(64, kernel_size=3, strides=2, padding='same'))

    # Batch normalization
    model.add(BatchNormalization())

    # Leaky ReLU activation function
    model.add(LeakyReLU(alpha=0.01))

    # Convert the 7*7*64 tensor into a 3*3*128 tensor with a convolutional layer
    model.add(Conv2D(128, kernel_size=3, strides=2, padding='same'))

    # Batch normalization
    model.add(BatchNormalization())

    # Leaky ReLU activation function
    model.add(LeakyReLU(alpha=0.01))

    # Output layer with sigmoid activation function
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    return model
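
The same shape check works here (illustrative usage):

# Sanity check: the last layer should output shape (None, 1), a single probability
build_discriminator(img_shape).summary()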

5. Build and compile DCGAN

def build_gan(generator, discriminator):
    model = Sequential()

    # Combine the generator and the discriminator
    model.add(generator)
    model.add(discriminator)

    return model

# Build and compile the discriminator (binary cross-entropy loss, Adam optimizer)
discriminator = build_discriminator(img_shape)
discriminator.compile(loss='binary_crossentropy',
                      optimizer=Adam(),
                      metrics=['accuracy'])

# Build the generator
generator = build_generator(z_dim)

# While the generator is trained, the discriminator's parameters are kept fixed
discriminator.trainable = False

# Build and compile the GAN model with the discriminator fixed, in order to train the generator
gan = build_gan(generator, discriminator)
gan.compile(loss='binary_crossentropy', optimizer=Adam())
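
One subtlety worth noting: in Keras, the trainable flag takes effect when a model is compiled. The discriminator was compiled before being frozen, so discriminator.train_on_batch still updates its weights, while gan (compiled after setting trainable to False) only updates the generator. A quick check of this:

# gan's trainable weights should be exactly the generator's weights
print(len(gan.trainable_weights) == len(generator.trainable_weights))  # expect: True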

6. Train the model

losses = []
accuracies = []
iteration_checkpoints = []


def train(iterations, batch_size, sample_interval):

    # Load the MNIST dataset
    (X_train, _), (_, _) = mnist.load_data()

    # Rescale grayscale pixel values from [0, 255] to [-1, 1]
    X_train = X_train / 127.5 - 1.0
    X_train = np.expand_dims(X_train, axis=3)

    # Labels for real images: all 1
    real = np.ones((batch_size, 1))

    # Labels for fake images: all 0
    fake = np.zeros((batch_size, 1))

    for iteration in range(iterations):

        # -------------------------
        #  Train the discriminator
        # -------------------------

        # Draw a random batch of real images
        idx = np.random.randint(0, X_train.shape[0], batch_size)
        imgs = X_train[idx]

        # Generate a batch of fake images
        z = np.random.normal(0, 1, (batch_size, z_dim))
        gen_imgs = generator.predict(z)

        # Train the discriminator
        d_loss_real = discriminator.train_on_batch(imgs, real)
        d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)
        d_loss, accuracy = 0.5 * np.add(d_loss_real, d_loss_fake)

        # ---------------------
        #  Train the generator
        # ---------------------

        # Generate a batch of fake images
        z = np.random.normal(0, 1, (batch_size, z_dim))
        gen_imgs = generator.predict(z)

        # Train the generator
        g_loss = gan.train_on_batch(z, real)

        if (iteration + 1) % sample_interval == 0:

            # Save losses and accuracies for plotting after training
            losses.append((d_loss, g_loss))
            accuracies.append(100.0 * accuracy)
            iteration_checkpoints.append(iteration + 1)

            # Print training progress
            print("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" %
                  (iteration + 1, d_loss, 100.0 * accuracy, g_loss))

            # Output a sample of generated images
            sample_images(generator)
7. Display the generated image

For completeness, the sample_images() function is included below; it outputs a 4×4 grid of generated images during training iterations.

def sample_images(generator, image_grid_rows=4, image_grid_columns=4):

    # Sample random noise (for a 4*4 grid of synthetic images)
    z = np.random.normal(0, 1, (image_grid_rows * image_grid_columns, z_dim))

    # Generate images from the random noise
    gen_imgs = generator.predict(z)

    # Rescale image pixel values to [0, 1]
    gen_imgs = 0.5 * gen_imgs + 0.5

    # Build the image grid
    fig, axs = plt.subplots(image_grid_rows,
                            image_grid_columns,
                            figsize=(4, 4),
                            sharey=True,
                            sharex=True)

    cnt = 0
    for i in range(image_grid_rows):
        for j in range(image_grid_columns):
            # Output one cell of the image grid
            axs[i, j].imshow(gen_imgs[cnt, :, :, 0], cmap='gray')
            axs[i, j].axis('off')
            cnt += 1

    # Display the grid immediately so samples appear during the training loop
    plt.show()

8. Run the model

# Set hyperparameters
iterations = 10000
batch_size = 128
sample_interval = 1000
 
# Train the model up to the specified number of iterations
train(iterations, batch_size, sample_interval)
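
The train() function stores the losses and accuracies at every sampling interval; here is a minimal sketch for plotting them after training, using the lists defined above:

# Plot the discriminator and generator losses recorded during training
losses_arr = np.array(losses)

plt.figure(figsize=(10, 4))
plt.plot(iteration_checkpoints, losses_arr[:, 0], label='Discriminator loss')
plt.plot(iteration_checkpoints, losses_arr[:, 1], label='Generator loss')
plt.xlabel('Iteration')
plt.ylabel('Loss')
plt.legend()
plt.show()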

Summary

Problems encountered:
1. Keras API updates (several imports have moved between versions).
2. The two convolution padding modes in TensorFlow, "SAME" and "VALID", differ from what I had learned in deep learning before (see the sketch below); for TensorFlow, see here.
3. Detailed explanation of the expand_dims() function in NumPy.
4. Usage of np.random.randint().
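
As a quick illustration of point 2, the two padding modes produce different output sizes for the same input (assuming TensorFlow 2.x):

import tensorflow as tf

x = tf.zeros((1, 28, 28, 1))
same = tf.keras.layers.Conv2D(8, kernel_size=3, strides=2, padding='same')(x)
valid = tf.keras.layers.Conv2D(8, kernel_size=3, strides=2, padding='valid')(x)
print(same.shape)   # (1, 14, 14, 8): 'same' pads so that output = ceil(28 / 2)
print(valid.shape)  # (1, 13, 13, 8): 'valid' adds no padding, output = floor((28 - 3) / 2) + 1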