PyTorch implementation of MNIST handwritten digit recognition (fully connected model)

  • This article is a learning-record blog post from the 365-day deep learning training camp
  • Original author: Classmate K

Article directory

  • Foreword
  • 1. Code and running results
    • 1. Import library
    • 2. Dataset
    • 3. Data loading
    • 4. Build model
    • 5. Training model
  • 2. Summary

Foreword

I hope to keep studying deep learning and to learn it well.

1. Code and running results

1. Import library

import torch
import numpy as np
from matplotlib import pyplot as plt
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision import datasets
import torch.nn as nn  # needed for nn.Module and nn.Linear in the model below
import torch.nn.functional as F

2. Dataset

URL: MNIST dataset

Training set, training set label

3. Data loading

# Define data conversion, convert image to tensor
transformation = transforms.Compose([
    transforms.ToTensor(),
])

# Create the training dataset object, specifying the storage path, train flag, transform, and download flag
train_ds = datasets.MNIST(
    r'./raw',
    train=True,
    transform=transformation,
    download=True
)

# Create the test dataset object, specifying the storage path, train flag, transform, and download flag
test_ds = datasets.MNIST(
    r'./raw',
    train=False,
    transform=transformation,
    download=True
)

# Create a training data loader, specify the training data set and batch size, and enable data shuffling
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)

# Create a test data loader, specify the test data set and batch size
test_dl = torch.utils.data.DataLoader(test_ds, batch_size=256)

# Get a batch of images and labels from the training data loader
imgs, labels = next(iter(train_dl))

Code analysis:

  1. transforms.Compose creates a transformation pipeline. Here it contains only transforms.ToTensor(), which converts each image into a tensor.
  2. datasets.MNIST creates the training and test dataset objects, specifying the storage path, the train flag, the transform, and the download flag.
  3. torch.utils.data.DataLoader creates the loaders for the training and test data, specifying the dataset and batch size. Shuffling is enabled only for the training loader.
  4. next(iter(train_dl)) fetches one batch of images and labels from the training loader, which is useful for checking shapes before training.

Summary: This code loads the MNIST dataset and prepares it for model training and testing.
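
To see what a loader actually yields without downloading anything, the following is a minimal sketch that swaps MNIST for a small random dataset. The names fake_images, fake_labels, fake_ds and fake_dl are made up for illustration; only the batch size and shuffle setting mirror train_dl above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 200 random "images" shaped like MNIST samples
fake_images = torch.rand(200, 1, 28, 28)
fake_labels = torch.randint(0, 10, (200,))
fake_ds = TensorDataset(fake_images, fake_labels)

# Same loader settings as train_dl in the article
fake_dl = DataLoader(fake_ds, batch_size=64, shuffle=True)

# First batch: [batch, channel, height, width] for images, [batch] for labels
imgs, labels = next(iter(fake_dl))
print(imgs.shape, labels.shape)

# Sizes of all batches; the last one holds the remainder (200 = 3 * 64 + 8)
batch_sizes = [x.shape[0] for x, _ in fake_dl]
print(batch_sizes)
```

Note that the last batch is smaller than batch_size unless drop_last=True is passed to DataLoader, which matters when per-batch values are averaged later.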

4. Build model

# Define the neural network model class, inherited from nn.Module
class Model(nn.Module):
    #Constructor, defines the layers of the network structure
    def __init__(self):
        super().__init__()
        # Define the first linear layer, the input size is 28*28, and the output size is 120
        self.liner_1 = nn.Linear(28*28, 120)
        # Define the second linear layer, the input size is 120, and the output size is 84
        self.liner_2 = nn.Linear(120, 84)
        # Define the third linear layer, the input size is 84, and the output size is 10
        self.liner_3 = nn.Linear(84, 10)

    #Define forward propagation function
    def forward(self, input):
        # Flatten the input image data into a one-dimensional vector with a size of 28*28
        x = input.view(-1, 28*28)
        # After the first linear layer, apply the ReLU activation function
        x = F.relu(self.liner_1(x))
        # After the second linear layer, apply the ReLU activation function
        x = F.relu(self.liner_2(x))
        # The third linear layer produces the final output (raw scores for the 10 classes)
        x = self.liner_3(x)
        return x

Code analysis:

  1. nn.Module is the base class for all neural network models in PyTorch.
  2. The structure of the neural network is defined in the __init__ constructor, including three linear layers (fully connected layers).
  3. The forward function defines the forward propagation process, in which the image data is processed by the linear layer and the ReLU activation function, and finally the output of the model is obtained.
  4. nn.Linear represents a linear layer, that is, a fully connected layer, which defines a linear transformation.
  5. F.relu is the ReLU activation function, used to introduce nonlinear characteristics.

Summary: The input image is processed by linear layers and activation functions, and features are gradually extracted and combined to finally obtain the output of the model. This model has an output size of 10 and is suitable for classification problems such as MNIST handwritten digit classification.
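
As a rough check of the shapes described above, here is a sketch that rebuilds the same three-layer stack with nn.Sequential (an assumed equivalent, not the article's Model class) and pushes a random batch through it. The variable names net and dummy are hypothetical.

```python
import torch
import torch.nn as nn

# Assumed equivalent of the article's Model, written with nn.Sequential for brevity
net = nn.Sequential(
    nn.Flatten(),             # [N, 1, 28, 28] -> [N, 784]
    nn.Linear(28 * 28, 120),
    nn.ReLU(),
    nn.Linear(120, 84),
    nn.ReLU(),
    nn.Linear(84, 10),        # raw scores (logits), one per digit class
)

dummy = torch.rand(4, 1, 28, 28)  # hypothetical batch of 4 random images
with torch.no_grad():
    out = net(dummy)
print(out.shape)

# Each Linear(in, out) holds in * out weights plus out biases
n_params = sum(p.numel() for p in net.parameters())
expected = (784 * 120 + 120) + (120 * 84 + 84) + (84 * 10 + 10)
print(n_params, expected)
```

The output has one row per image and one column per class, so torch.argmax over dim=1 gives the predicted digit for each image.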

5. Training model

# Define cross entropy loss function
loss_fn = torch.nn.CrossEntropyLoss()

#Define training function
def fit(epoch, model, trainloader, testloader):
    correct = 0 # Used to record the number of samples correctly classified in the training set
    total = 0 #Total number of samples in the training set
    running_loss = 0 # Used to record the loss during training

    # Traverse the training set
    for x, y in trainloader:
        y_pred = model(x) # Model prediction
        loss = loss_fn(y_pred, y) # Calculate loss
        optim.zero_grad() # Clear gradient
        loss.backward() # Backpropagation
        optim.step() # Update weights

        with torch.no_grad():
            y_pred = torch.argmax(y_pred, dim=1)
            correct += (y_pred == y).sum().item() # Count the number of correctly classified samples
            total += y.size(0) # Count the total number of samples
            running_loss += loss.item() # Accumulate the loss value

    epoch_loss = running_loss / len(trainloader) # Average loss per batch (loss.item() is already a batch mean)
    epoch_acc = correct / total # Calculate training set accuracy

    # Evaluate on the test set
    test_correct = 0
    test_total = 0
    test_running_loss = 0

    with torch.no_grad():
        for x, y in testloader:
            y_pred = model(x)
            loss = loss_fn(y_pred, y)
            y_pred = torch.argmax(y_pred, dim=1)
            test_correct += (y_pred == y).sum().item()
            test_total += y.size(0)
            test_running_loss += loss.item()

    epoch_test_loss = test_running_loss / len(testloader) # Average test loss per batch (loss.item() is already a batch mean)
    epoch_test_acc = test_correct / test_total # Calculate test set accuracy

    #Print training and testing indicator information
    print('epoch: ', epoch,
          'loss: ', round(epoch_loss, 3),
          'accuracy:', round(epoch_acc, 3),
          'test_loss: ', round(epoch_test_loss, 3),
          'test_accuracy:', round(epoch_test_acc, 3)
          )

    return epoch_loss, epoch_acc, epoch_test_loss, epoch_test_acc
    
# Instantiate the model defined above
model = Model()

# Define the optimizer: Adam with a learning rate of 0.001, optimizing the model's parameters
optim = torch.optim.Adam(model.parameters(), lr=0.001)

# Define the total number of training rounds
epochs = 10

# Used to store training and testing metrics for each epoch
train_loss = []
train_acc = []
test_loss = []
test_acc = []

# Loop training model, iterate for the specified number of rounds
for epoch in range(epochs):
    # Call the fit function for model training and evaluation
    epoch_loss, epoch_acc, epoch_test_loss, epoch_test_acc = fit(epoch,
                                                                 model,
                                                                 train_dl,
                                                                 test_dl)
    
    # Record the loss and accuracy of training and testing
    train_loss.append(epoch_loss)
    train_acc.append(epoch_acc)
    test_loss.append(epoch_test_loss)
    test_acc.append(epoch_test_acc)

Code analysis:

  1. torch.nn.CrossEntropyLoss() creates a cross-entropy loss function suitable for multi-class classification problems.
  2. The fit function runs one epoch: a training pass over the training set followed by an evaluation pass over the test set.
  3. During training, predictions are obtained through forward propagation, the loss is calculated, gradients are computed through backpropagation, and the optimizer updates the model weights.
  4. Record the loss during the training process, the number of correctly classified samples and the total number of samples, which are used to calculate the training set accuracy.
  5. Evaluate on the test set, and also record the loss during the test process, the number of correctly classified samples, and the total number of samples, which are used to calculate the test set accuracy.
  6. Print training and test metric information for each epoch, including loss and accuracy.
  7. torch.optim.Adam creates the Adam optimizer, which adjusts model parameters to minimize loss.
  8. epochs defines the total number of training epochs. After each epoch, the loss and accuracy of training and testing are recorded in the corresponding lists.
  9. Finally, four lists are obtained: train_loss, train_acc, test_loss, and test_acc. They record the training loss, training accuracy, test loss, and test accuracy of each epoch, respectively. These lists can be used to plot training and test curves to better understand the model's performance.
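
To make the loss in item 1 concrete, the sketch below computes cross-entropy for a single sample by hand in plain Python. The function cross_entropy and the logits are hypothetical illustrations. torch.nn.CrossEntropyLoss applies the same log-softmax formula to raw model outputs and, by default, averages it over the batch.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)                             # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    prob_target = exps[target] / sum(exps)
    return -math.log(prob_target)

# Hypothetical raw scores for a 3-class problem
logits = [2.0, 1.0, 0.1]
loss_correct = cross_entropy(logits, target=0)  # true class has the largest score
loss_wrong = cross_entropy(logits, target=2)    # true class has the smallest score
print(loss_correct, loss_wrong)

# argmax picks the predicted class, as torch.argmax(y_pred, dim=1) does for each row
predicted = max(range(len(logits)), key=lambda i: logits[i])
print(predicted)
```

The loss is small when the true class receives the highest score and grows as the model assigns it less probability, which is why minimizing it pushes accuracy up.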

2. Summary

  1. The code above uses a fully connected model and does not involve convolution, pooling, etc.; these will be added gradually in later posts.
  2. The role of transforms.ToTensor: ① converts each image into a tensor; ② normalizes pixel values to the range [0, 1].
  3. The representation of images in PyTorch: [batch, channel, height, width]
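
Points 2 and 3 can be illustrated with a NumPy sketch that mimics what ToTensor does to an 8-bit image, then builds a batch and flattens it for the first linear layer. This is an approximation written for illustration, not torchvision's actual implementation, and the random image is hypothetical.

```python
import numpy as np

# Hypothetical 28x28 grayscale image stored as [height, width, channel] with uint8 pixels
rng = np.random.default_rng(0)
img_hwc = rng.integers(0, 256, size=(28, 28, 1), dtype=np.uint8)

# Step 1: move the channel axis first -> [channel, height, width]
img_chw = np.transpose(img_hwc, (2, 0, 1))

# Step 2: convert to float32 and scale 0..255 into 0.0..1.0
img_float = img_chw.astype(np.float32) / 255.0

# A batch stacks samples along a new leading axis -> [batch, channel, height, width]
batch = np.stack([img_float] * 64)

# Flattening for the first linear layer, like input.view(-1, 28*28)
flat = batch.reshape(-1, 28 * 28)
print(img_float.shape, batch.shape, flat.shape)
```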