P23 Loss function and backpropagation from torch.nn

1. General meaning

The smaller the loss, the better

Intuition: backpropagation adjusts the parameters of the network so that the final loss becomes smaller. The gradients of the loss with respect to the parameters are computed starting from the loss and moving backward through the network, the opposite order to the forward pass, which is why it is called backpropagation. A gradient can be thought of simply as a "slope".
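
To make the "slope" idea concrete, here is a minimal sketch with a single parameter. The values of w, x, target and the learning rate 0.1 are made up for illustration and are not from the course.

```python
import torch

# A single trainable parameter w; requires_grad=True asks autograd to track it.
w = torch.tensor(1.0, requires_grad=True)
x = torch.tensor(2.0)        # illustrative input
target = torch.tensor(6.0)   # illustrative target

# Forward pass: prediction, then loss.
loss = (w * x - target) ** 2           # (1*2 - 6)^2 = 16

# Backward pass: d(loss)/dw = 2 * (w*x - target) * x = 2 * (-4) * 2 = -16
loss.backward()
grad = w.grad.item()

# Move w a small step against the slope (learning rate 0.1 is an arbitrary choice).
with torch.no_grad():
    w_new = w - 0.1 * w.grad
new_loss = ((w_new * x - target) ** 2).item()

print(loss.item(), grad, w_new.item(), new_loss)
```

The slope is negative, so increasing w lowers the loss, and the loss after the step is smaller than before.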

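The loss function does two things:
1. It measures the gap between the actual output and the target.
2. It provides the basis for updating the parameters (backpropagation).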

2. Read the documentation

1. L1Loss

torch.nn.L1Loss(size_average=None, reduce=None, reduction='mean')

Look at the input and output shapes
Input: (N, *)
1. N: batch size
2. *: any number of additional dimensions
Target:
Same shape as the input
Output: a scalar by default; with reduction='none' it has the same shape as the input
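
The effect of reduction on the output shape can be checked with a small sketch. The tensors pred and true are made-up values of shape (2, 3), not data from the course.

```python
import torch
from torch.nn import L1Loss

# Illustrative tensors of shape (N, *) = (2, 3); the values are made up.
pred = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
true = torch.tensor([[1.0, 2.0, 5.0], [4.0, 4.0, 6.0]])

per_element = L1Loss(reduction="none")(pred, true)  # same shape as the input
mean_loss = L1Loss(reduction="mean")(pred, true)    # scalar: 3 / 6
sum_loss = L1Loss(reduction="sum")(pred, true)      # scalar: 0+0+2+0+1+0

print(per_element.shape, mean_loss.shape, sum_loss.shape)
```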

Look at the code
(1) Create new data
(2) Use reshape to expand the dimensions to four dimensions

import torch.nn
import torch
from torch.nn import L1Loss

inputs=torch.tensor([1,2,3])
targets=torch.tensor([1,2,5])
print(inputs.shape)

inputs=torch.reshape(inputs,[1,1,1,3])
targets=torch.reshape(targets,[1,1,1,3])

print(inputs.shape)

loss=L1Loss()
result=loss(inputs,targets)

print(result)

An error is reported, saying that the data type does not meet the requirements: the tensors were created from integers.
The loss needs floating-point tensors, so we set the type with dtype when creating them

inputs=torch.tensor([1,2,3],dtype=torch.float32)
targets=torch.tensor([1,2,5],dtype=torch.float32)
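
As a side note, a tensor that already exists can also be converted instead of being re-created. The following is a small sketch of the options I am aware of; the variable names are illustrative.

```python
import torch

ints = torch.tensor([1, 2, 3])           # Python ints give an integer tensor
print(ints.dtype)                        # torch.int64

# Two equivalent ways to get float32 from an existing tensor.
as_float = ints.float()
also_float = ints.to(torch.float32)

# Writing the literals with a decimal point gives the default float dtype directly.
from_literals = torch.tensor([1.0, 2.0, 3.0])
print(as_float.dtype, also_float.dtype, from_literals.dtype)
```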

This is the complete code:

import torch.nn
import torch
from torch.nn import L1Loss

inputs=torch.tensor([1,2,3],dtype=torch.float32)
targets=torch.tensor([1,2,5],dtype=torch.float32)
print(inputs.shape)

inputs=torch.reshape(inputs,[1,1,1,3])
targets=torch.reshape(targets,[1,1,1,3])

print(inputs.shape)

loss=L1Loss()
result=loss(inputs,targets)

print(result)

The result calculated by the loss function is 2/3 = 0.6667: the absolute differences are 0, 0 and 2, and the default reduction='mean' averages them.

Now we set reduction to 'sum'

loss=L1Loss(reduction="sum")
result=loss(inputs,targets)
print(result)

The result is 2 (0 + 0 + 2)
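
Both results can be reproduced by hand with plain tensor operations. This is only a cross-check sketch using the same inputs and targets as above.

```python
import torch
from torch.nn import L1Loss

inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)

abs_diff = torch.abs(inputs - targets)   # element-wise |input - target|
manual_mean = abs_diff.mean()            # (0 + 0 + 2) / 3
manual_sum = abs_diff.sum()              # 0 + 0 + 2

builtin_mean = L1Loss()(inputs, targets)
builtin_sum = L1Loss(reduction="sum")(inputs, targets)
print(manual_mean, builtin_mean)
print(manual_sum, builtin_sum)
```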

2. MSELoss
torch.nn.MSELoss(size_average=None, reduce=None, reduction='mean')

Look at the code

import torch
import torch.nn
from torch.nn import MSELoss


inputs=torch.tensor([1,2,3],dtype=torch.float32)
targets=torch.tensor([1,2,5],dtype=torch.float32)
print(inputs.shape)

inputs=torch.reshape(inputs,[1,1,1,3])
targets=torch.reshape(targets,[1,1,1,3])

print(inputs.shape)

loss=MSELoss()
result=loss(inputs,targets)
print(result)

The output is 1.3333, which matches the hand calculation: (0^2 + 0^2 + 2^2) / 3 = 4/3.
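
The same hand calculation, written as a short sketch with plain tensor operations and the same inputs and targets:

```python
import torch
from torch.nn import MSELoss

inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)

sq_diff = (inputs - targets) ** 2        # element-wise squared error
manual_mse = sq_diff.mean()              # (0 + 0 + 4) / 3

builtin_mse = MSELoss()(inputs, targets)
print(manual_mse, builtin_mse)
```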

3. CrossEntropyLoss
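Cross entropy: a loss function commonly used for classification

From the documentation: "It is useful when training a classification problem with C classes."

For an input x (one score per class) and a target class index, it is calculated as follows:
loss(x, class) = -x[class] + ln(sum_j exp(x[j]))
Example: three classes with scores [0.1, 0.2, 0.3] and target class 1, as used in the code below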

Look at the code
First look at the shapes required in the document

import torch
from torch.nn import CrossEntropyLoss

x=torch.tensor([0.1,0.2,0.3])  # x is one-dimensional here
y=torch.tensor([1])
print(x.shape)
# CrossEntropyLoss expects the input in two dimensions (batch_size, number of classes)
# and the target y as class indices, so we reshape x to (1, 3)
x=torch.reshape(x,(1,3))
print(x.shape)
loss_cross=CrossEntropyLoss()
result_loss=loss_cross(x,y)
print(result_loss)

Output:
torch.Size([3])
torch.Size([1, 3])
tensor(1.1019)

Calculation for the target class (index 1, the dog class):

Note that the log in the documentation is actually the natural logarithm ln.

loss = -0.2 + ln(exp(0.1) + exp(0.2) + exp(0.3)) = 1.1019

This matches the output, so the calculation is correct.
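
The hand calculation can be sketched in code as well. The softmax variant is an equivalent way of writing the same formula, added here for comparison; it is not part of the original notes.

```python
import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

x = torch.tensor([[0.1, 0.2, 0.3]])   # shape (1, 3): one sample, three classes
y = torch.tensor([1])                 # target class index
cls = y.item()

# loss = -x[class] + ln(sum_j exp(x[j]))
manual = -x[0, cls] + torch.log(torch.exp(x[0]).sum())

# Equivalent view: negative log of the softmax probability of the target class.
via_softmax = -torch.log(F.softmax(x, dim=1)[0, cls])

builtin = CrossEntropyLoss()(x, y)
print(manual, via_softmax, builtin)
```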

3.1 Loss function practice

1. First, copy in the neural network (built with Sequential) from the previous lesson

from torch.nn import Sequential
from torch import nn
from torch.nn import Conv2d,MaxPool2d,Flatten,Linear

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1=Sequential(
            Conv2d(in_channels=3,out_channels=32,kernel_size=5,stride=1,padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, 1, 2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, 1, 2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self,x):
        x=self.model1(x)
        return x
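
Before feeding real data, it can help to check the shapes with a dummy batch. The sketch below rebuilds the same layer stack inline (without the Tudui class) so that it runs on its own; the all-ones input is a stand-in for CIFAR10-sized images.

```python
import torch
from torch import nn

# Same layer stack as above, written inline so the snippet is self-contained.
model = nn.Sequential(
    nn.Conv2d(3, 32, 5, 1, 2),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 32, 5, 1, 2),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 5, 1, 2),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(1024, 64),
    nn.Linear(64, 10),
)

dummy = torch.ones((64, 3, 32, 32))   # a fake batch of 64 images, 3 x 32 x 32
features = model[:7](dummy)           # output of Flatten: 64 channels * 4 * 4
output = model(dummy)                 # one score per class for each image
print(features.shape, output.shape)
```

Each MaxPool2d(2) halves the spatial size (32, 16, 8, 4), which is where the 1024 input features of the first Linear layer come from.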

2. Load the CIFAR10 dataset

import torchvision
from torch.utils.data import DataLoader

dataset=torchvision.datasets.CIFAR10("./P23_dataset",train=False,transform=torchvision.transforms.ToTensor(),
                                     download=True)
dataloader=DataLoader(dataset,batch_size=64)

3. Use a for loop to take out imgs and targets, then pass imgs through the neural network

tudui=Tudui()

for data in dataloader:
    imgs,targets=data
    output=tudui(imgs)
    print(output)
    print(targets)

4. Complete code so far:

from torch.nn import Sequential
from torch import nn
from torch.nn import Conv2d,MaxPool2d,Flatten,Linear
import torchvision
from torch.utils.data import DataLoader

dataset=torchvision.datasets.CIFAR10("./P23_dataset",train=False,transform=torchvision.transforms.ToTensor(),
                                     download=True)
dataloader=DataLoader(dataset,batch_size=64)


class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1=Sequential(
            Conv2d(in_channels=3,out_channels=32,kernel_size=5,stride=1,padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, 1, 2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, 1, 2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self,x):
        x=self.model1(x)
        return x

tudui=Tudui()

for data in dataloader:
    imgs,targets=data
    output=tudui(imgs)
    print(output)
    print(targets)

With batch_size=64 each print shows a lot of data, so we change batch_size to 1
and look at the output again

tensor([[ 0.0835, -0.0427, -0.1136, 0.0751, -0.1241, 0.0557, 0.0384, 0.1432,
         -0.0817, 0.0711]], grad_fn=<AddmmBackward0>)
tensor([1])
tensor([[ 0.0749, -0.0500, -0.1154, 0.0736, -0.1187, 0.0532, 0.0397, 0.1437,
         -0.1071, 0.0638]], grad_fn=<AddmmBackward0>)
tensor([7])


You can see that each picture produces 10 outputs after passing through the neural network. These are the raw linear outputs (scores) of the network for the 10 classes.
As a viewer pointed out in the video's bullet comments (danmaku): because there is no softmax layer, no normalization and no activation function, the outputs are not probabilities.
tensor([1]) and tensor([7]) are the targets, that is, the index of the true class in the class list.
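
To see the difference between raw scores and probabilities, here is a small sketch that applies softmax to the first output row printed above. It also illustrates that cross entropy works on the raw scores and applies log-softmax internally, so no softmax layer is needed in the model for training.

```python
import torch
import torch.nn.functional as F

# The first output row printed above (raw scores, also called logits).
logits = torch.tensor([[0.0835, -0.0427, -0.1136, 0.0751, -0.1241,
                        0.0557, 0.0384, 0.1432, -0.0817, 0.0711]])
target = torch.tensor([1])

probs = F.softmax(logits, dim=1)      # non-negative, sums to 1 along dim=1
predicted = probs.argmax(dim=1)       # index of the highest score

# cross_entropy takes the raw logits; this equals NLL loss on log-probabilities.
loss_from_logits = F.cross_entropy(logits, target)
loss_by_hand = F.nll_loss(torch.log(probs), target)
print(probs.sum().item(), predicted.item(), loss_from_logits.item())
```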

5. Apply the cross entropy loss we learned today

loss=nn.CrossEntropyLoss()
for data in dataloader:
    imgs,targets=data
    output=tudui(imgs)
    result_loss=loss(output,targets)
    print(output)
    print(targets)
    print(result_loss)

This calculates the error between the actual output and the target
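
With a batch size larger than 1, the default reduction averages the per-sample losses. A minimal sketch, using random numbers as a stand-in for the network output and made-up targets:

```python
import torch
from torch import nn

torch.manual_seed(0)                     # fixed seed so the run is repeatable
output = torch.randn(4, 10)              # stand-in for network output: 4 samples, 10 classes
targets = torch.tensor([3, 8, 8, 0])     # made-up class indices

per_sample = nn.CrossEntropyLoss(reduction="none")(output, targets)  # one loss per sample
batch_loss = nn.CrossEntropyLoss()(output, targets)                  # scalar (mean)

print(per_sample, batch_loss)
```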

Complete code:

from torch.nn import Sequential
from torch import nn
from torch.nn import Conv2d,MaxPool2d,Flatten,Linear
import torchvision
from torch.utils.data import DataLoader

dataset=torchvision.datasets.CIFAR10("./P23_dataset",train=False,transform=torchvision.transforms.ToTensor(),
                                     download=True)
dataloader=DataLoader(dataset,batch_size=1)


class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1=Sequential(
            Conv2d(in_channels=3,out_channels=32,kernel_size=5,stride=1,padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, 1, 2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, 1, 2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self,x):
        x=self.model1(x)
        return x

tudui=Tudui()


loss=nn.CrossEntropyLoss()
for data in dataloader:
    imgs,targets=data
    output=tudui(imgs)
    result_loss=loss(output,targets)
    print(output)
    print(targets)
    print(result_loss)

3.2 Backpropagation practice

As described at the start, backpropagation computes the gradient of the loss with respect to each parameter, working backward through the network, so that the parameters can be adjusted to make the loss smaller.
1. Add backpropagation
Note that backward() is called on the loss value we computed: result_loss.backward()
It is not called on the loss function object (loss)

loss=nn.CrossEntropyLoss()
for data in dataloader:
    imgs,targets=data
    output=tudui(imgs)
    result_loss=loss(output,targets)
    result_loss.backward()
    print("ok")
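
One thing worth knowing about calling backward() repeatedly in a loop: as far as I know, PyTorch adds each new gradient to the one already stored, unless the gradients are cleared in between. The sketch below shows this with a small nn.Linear layer as a stand-in for the full network; the input values are made up.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(3, 2)                  # small stand-in for the full network
loss_fn = nn.CrossEntropyLoss()
x = torch.tensor([[0.1, 0.2, 0.3]])
y = torch.tensor([1])

loss_fn(layer(x), y).backward()
first = layer.weight.grad.clone()

loss_fn(layer(x), y).backward()          # same data again: the gradient is added on top
second = layer.weight.grad.clone()

layer.zero_grad()                        # clear stored gradients before the next pass
print(torch.allclose(second, 2 * first))
```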

2. Set a breakpoint and debug

Paused at the breakpoint, look at the variables pane:
tudui -> model1 -> Protected Attributes -> _modules lists the layers of our neural network.
Entry 0 is the first convolutional layer:

Under its weight there is a grad attribute, but it has no value yet (None)

Then we continue running.
The code executed is:

result_loss.backward()

After this line runs, grad contains data.

So backward() computes the gradients,
which are what is needed to adjust the parameters and reduce the loss.
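
The same observation can be made without a debugger. This is a minimal sketch with a much smaller, illustrative model (the names layer and head and the 8 x 8 input are assumptions, not the Tudui network):

```python
import torch
from torch import nn

layer = nn.Conv2d(3, 4, kernel_size=3, padding=1)   # illustrative first layer
head = nn.Sequential(layer, nn.Flatten(), nn.Linear(4 * 8 * 8, 10))
imgs = torch.ones((1, 3, 8, 8))                     # fake image batch
targets = torch.tensor([1])

# Sequential names its children "0", "1", ... like the entries seen in the debugger.
child_names = [name for name, _ in head.named_children()]

grad_before = layer.weight.grad          # None: nothing has been backpropagated yet
result_loss = nn.CrossEntropyLoss()(head(imgs), targets)
result_loss.backward()
grad_after = layer.weight.grad           # now a tensor with the same shape as the weight

print(child_names, grad_before, grad_after.shape)
```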

Final code:

from torch.nn import Sequential
from torch import nn
from torch.nn import Conv2d,MaxPool2d,Flatten,Linear
import torchvision
from torch.utils.data import DataLoader

dataset=torchvision.datasets.CIFAR10("./P23_dataset",train=False,transform=torchvision.transforms.ToTensor(),
                                     download=True)
dataloader=DataLoader(dataset,batch_size=1)


class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1=Sequential(
            Conv2d(in_channels=3,out_channels=32,kernel_size=5,stride=1,padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, 1, 2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, 1, 2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self,x):
        x=self.model1(x)
        return x

tudui=Tudui()


loss=nn.CrossEntropyLoss()
for data in dataloader:
    imgs,targets=data
    output=tudui(imgs)
    result_loss=loss(output,targets)
    result_loss.backward()
    print("ok")