```python
torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True,
                            num_workers=4, pin_memory=True)
```
The `num_workers` parameter of the `DataLoader` class specifies the number of worker (child) processes used by the data loader. Increasing `num_workers` allows data to be read and preprocessed in parallel, which speeds up data loading.
Typically, raising `num_workers` improves data-loading efficiency, because loading and preprocessing run in multiple processes simultaneously. However, once `num_workers` exceeds a certain threshold, adding more processes no longer brings further improvement and may even degrade performance.
This is because increasing `num_workers` also increases the cost of inter-process communication. When `num_workers` is too large, that cost can outweigh the benefits of parallelization, resulting in worse performance.
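If worker startup and inter-process overhead are the bottleneck, two `DataLoader` options introduced in PyTorch 1.7, `persistent_workers` and `prefetch_factor`, can help amortize that cost. A minimal sketch, assuming `train_dataset` and `batch_size` are defined as in the test code below:

```python
import torch

# Keep worker processes alive across epochs and prefetch batches ahead of
# time, to amortize worker startup and inter-process communication costs.
loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=batch_size, shuffle=True,
    num_workers=4, pin_memory=True,
    persistent_workers=True,  # workers are not torn down after each epoch
    prefetch_factor=2)        # batches pre-loaded per worker (the default)
```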
Additionally, hardware limitations need to be taken into account. If your machine has only a few CPU cores, raising `num_workers` can likewise degrade performance, because each worker process competes for CPU core resources.
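A common heuristic (not a hard rule) is to cap `num_workers` at the number of CPU cores the machine reports, so each worker process can get a core of its own:

```python
import os

# Cap num_workers at the number of logical CPU cores; os.cpu_count() may
# return None on some platforms, hence the fallback to 0 (load in the main
# process only).
num_workers = min(8, os.cpu_count() or 0)
```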
Therefore, the `num_workers` setting needs to be tuned for your specific situation. A reasonable value is typically between 2 and 8, depending on factors such as your hardware configuration and the size of your dataset. In practice, the optimal setting can be found by trying different `num_workers` values.
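A minimal sketch of such a sweep, using a hypothetical `benchmark_num_workers` helper that simply iterates each loader once and times it (it assumes `dataset` is the `ImageFolder` built in the test code below):

```python
import time
import torch

def benchmark_num_workers(dataset, batch_size=4, worker_counts=(0, 2, 4, 8)):
    """Time one full pass over `dataset` for each num_workers value."""
    for n in worker_counts:
        loader = torch.utils.data.DataLoader(
            dataset, batch_size=batch_size, shuffle=True,
            num_workers=n, pin_memory=True)
        start = time.time()
        for _ in loader:  # iterate once over the whole dataset
            pass
        print(f"num_workers={n}: {time.time() - start:.2f}s per pass")
```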
In summary, when `num_workers` increases from 4 to 8, the performance difference between the two may be small, or there may even be no significant difference, if factors such as your hardware configuration and dataset size stay the same.
The test code is as follows:
```python
import time

import matplotlib.pyplot as plt
import torch
import torch.multiprocessing as mp
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.models as models

if __name__ == '__main__':
    mp.freeze_support()

    train_on_gpu = torch.cuda.is_available()
    if not train_on_gpu:
        print('CUDA is not available. Training on CPU...')
    else:
        print('CUDA is available! Training on GPU...')
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    batch_size = 4

    # Set transformations for data preprocessing
    transform = torchvision.transforms.Compose([
        torchvision.transforms.Resize((512, 512)),  # Resize the image to 512x512
        torchvision.transforms.ToTensor(),          # Convert to tensor
        torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])  # Normalize
    ])

    # Raw string so the backslashes in the Windows path are not treated as escapes
    dataset = torchvision.datasets.ImageFolder(
        r'C:\Users\ASUS\PycharmProjects\pythonProject1\cats_and_dogs_train',
        transform=transform)

    # Split into training and validation sets
    val_ratio = 0.2
    val_size = int(len(dataset) * val_ratio)
    train_size = len(dataset) - val_size
    train_dataset, val_dataset = torch.utils.data.random_split(dataset, [train_size, val_size])

    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size,
                                               shuffle=True, num_workers=4, pin_memory=True)
    val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size,
                                             shuffle=True, num_workers=4, pin_memory=True)

    # Frozen ResNet-18 backbone with a new two-class head. CrossEntropyLoss
    # applies log-softmax internally, so the head outputs raw logits.
    model = models.resnet18()
    num_classes = 2
    for param in model.parameters():
        param.requires_grad = False
    model.fc = nn.Sequential(
        nn.Dropout(),
        nn.Linear(model.fc.in_features, num_classes)
    )

    optimizer = optim.Adam(model.parameters(), lr=0.001)
    criterion = nn.CrossEntropyLoss().to(device)
    model.to(device)

    filename = "recognize_cats_and_dogs.pt"

    def save_checkpoint(epoch, model, optimizer, loss, filename):
        checkpoint = {
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'loss': loss,
        }
        torch.save(checkpoint, filename)

    num_epochs = 3
    train_loss = []
    for epoch in range(num_epochs):
        running_loss = 0
        correct = 0
        total = 0
        epoch_start_time = time.time()

        model.train()
        for i, (inputs, labels) in enumerate(train_loader):
            # Move data to the device
            inputs, labels = inputs.to(device), labels.to(device)
            # Forward pass
            outputs = model(inputs)
            # Compute loss and gradients
            loss = criterion(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            # Update model parameters
            optimizer.step()
            # Record loss and accuracy
            running_loss += loss.item()
            train_loss.append(loss.item())
            _, predicted = torch.max(outputs.data, 1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
        accuracy_train = 100 * correct / total

        # Compute the loss and accuracy on the validation set
        model.eval()
        with torch.no_grad():
            running_loss_val = 0
            correct_val = 0
            total_val = 0
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                running_loss_val += loss.item()
                _, predicted = torch.max(outputs.data, 1)
                correct_val += (predicted == labels).sum().item()
                total_val += labels.size(0)
            accuracy_val = 100 * correct_val / total_val

        # Report the loss and accuracy for each epoch
        epoch_end_time = time.time()
        epoch_time = epoch_end_time - epoch_start_time
        print("Epoch [{}/{}], Time: {:.4f}s, Train Loss: {:.4f}, Train Accuracy: {:.2f}%, "
              "Val Loss: {:.4f}, Val Accuracy: {:.2f}%"
              .format(epoch + 1, num_epochs, epoch_time,
                      running_loss / len(train_loader), accuracy_train,
                      running_loss_val / len(val_loader), accuracy_val))
        save_checkpoint(epoch, model, optimizer, loss, filename)

    plt.plot(train_loss, label='Train Loss')
    # Add legend and labels
    plt.legend()
    plt.xlabel('Iterations')  # one loss value is recorded per batch, not per epoch
    plt.ylabel('Loss')
    plt.title('Training Loss')
    # Display the plot
    plt.show()
```
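One detail worth noting about the code above: `pin_memory=True` mainly pays off when the host-to-GPU copies are made asynchronous with `non_blocking=True`; otherwise the transfers still synchronize. A small variant of the transfer step in the loop:

```python
# With pin_memory=True in the DataLoader, non_blocking=True lets the
# host-to-GPU copy overlap with computation already queued on the GPU.
inputs = inputs.to(device, non_blocking=True)
labels = labels.to(device, non_blocking=True)
```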
The results for different `num_workers` values are as follows: