Convolutional neural network (CNN) for time series prediction

Article directory

  • 1 Preparation
    • 1.1 Import the library
    • 1.2 Reading data
    • 1.3 Visualization
  • 2 Data preprocessing
    • 2.1 Divide training set and test set
    • 2.2 Divide features and labels
  • 3 Construct a one-dimensional convolutional neural network
  • 4 Training model
    • 4.1 Divide batches
    • 4.2 Set the loss function
    • 4.3 Parameter initialization
    • 4.4 Training model
  • 5 Model results
    • 5.1 Loss function curve
    • 5.2 True value and predicted value curve of test set

1 Preparation

1.1 Import the library

import numpy as np
import pandas as pd
import torch.nn as nn #used to build the network
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F
torch.set_default_tensor_type(torch.DoubleTensor)
#Set the default tensor type to double-precision floating point (torch.DoubleTensor), matching the float64 NumPy arrays used below
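
Note: torch.set_default_tensor_type is deprecated in recent PyTorch releases; on a newer version, the equivalent call is:

torch.set_default_dtype(torch.float64)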

1.2 Reading data

This article uses the monthly sunspot data set, which records the number of sunspots observed each month from January 1749 to October 2023.
The data can be downloaded from the URL below (you can also follow the blogger and send a private message to obtain the data set):
http://www.sidc.be/silso/datafiles

Data=pd.read_csv(r"E:\Code Learning\CSDN Blog\CNN\SN_m.csv")#Use a raw string so the backslashes in the Windows path are not treated as escape characters
start_time=pd.to_datetime("1749-01-01")
end_time=pd.to_datetime("2023-11-01")
time=pd.date_range(start=start_time,end=end_time,freq='M')#Generate a monthly (month-end) time series
Data['time']=time
#Set time as the index
data=Data.set_index('time',drop=True, append=False, inplace=False, verify_integrity=False)
series=np.array(data['Sunspots'])
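
A quick sanity check on what was loaded (the expected count of 3298 assumes the file covers January 1749 through October 2023, as described above):

print(data.head())#First few rows, indexed by month
print(series.shape)#Expected: (3298,)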

1.3 Visualization

It can be seen from the figure that this data set is strongly periodic (the roughly 11-year solar cycle).

plt.figure(figsize=(14,6))
plt.plot(data,label='Sunspots')
plt.grid()
plt.xlabel('Date')
plt.ylabel('Monthly sunspot number')
plt.legend()#The label is only shown once a legend is drawn
plt.show()



2 Data preprocessing

2.1 Divide training set and test set

The custom function train_test_split divides the sample into a training set and a test set, where series is the full sample and split_prop is the proportion assigned to the training set.

def train_test_split(series,split_prop):
    #split_prop is the proportion of the sample assigned to the training set
    split_point=int(split_prop*series.size)
    train=series[:split_point]
    test=series[split_point:]
    return train, test

Use 70% of the data set as the training set and the remaining 30% as the test set:

split_prop=0.7#Set the division ratio
train,test=train_test_split(series,split_prop)#divide
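
As a quick check, the resulting sizes can be printed (with 3298 samples in total, this split gives 2308 training and 990 test points):

print(train.size,test.size)#Expected: 2308 990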

2.2 Divide features and labels

The custom function data_process performs sliding-window sampling to obtain the features and labels of the training set and test set, where window_size is the sliding window size and step is the sliding step size. Fixed sliding-window sampling is illustrated in the figure:



import random
def data_process(train,test,window_size,step):
    #Convert the data to tensors and apply a sliding window to obtain short sequences
    train_tensor=torch.from_numpy(train)#Convert the training data to a tensor
    train_window_split=train_tensor.unfold(0,window_size,step)#Windowed view of the training set: one row per window
    train_set=train_window_split.numpy()
    test_tensor=torch.from_numpy(test)#Convert the test data to a tensor
    test_window_split=test_tensor.unfold(0,window_size,step)#Windowed view of the test set
    test_set=test_window_split.numpy()
    
    #Shuffle the short sequences of the training set
    train_temp1=train_set.tolist()#Convert to a list of windows
    #random.shuffle() reorders the elements (short sequences) of the list in place; it does not create a new list
    random.shuffle(train_temp1)#Shuffle the training windows
    train_temp2=np.array(train_temp1)#Back to an array, now with the windows in shuffled order
    
    #Split each short sequence into features and a label
    train_feature_array=train_temp2[:,:window_size-1]#The first window_size-1 values of each window are the features
    train_label_array=train_temp2[:,window_size-1:]#The last value of each window is its label (the true next value of the sequence)
    test_temp1=test_set.tolist()
    test_temp2=np.array(test_temp1)
    test_feature_array=test_temp2[:,:window_size-1]
    test_label_array=test_temp2[:,window_size-1:]
    
    #Convert the ndarrays (N-dimensional array objects) back into tensors
    train_feature_tensor=torch.from_numpy(train_feature_array)
    train_label=torch.from_numpy(train_label_array)
    test_feature_tensor=torch.from_numpy(test_feature_array)
    test_label=torch.from_numpy(test_label_array)
    
    #Add a channel dimension so the shape (N,1,window_size-1) matches the CNN's expected input
    train_feature=train_feature_tensor.reshape(train_feature_tensor.shape[0],1,train_feature_tensor.shape[1])
    test_feature=test_feature_tensor.reshape(test_feature_tensor.shape[0],1,test_feature_tensor.shape[1])
    return train_feature,train_label,test_feature,test_label
window_size=7#Set the sliding window size
step=1#Set the sliding step size
train_feature,train_label,test_feature,test_label=data_process(train,test,window_size,step)

The features and label values of the training set are as follows:
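
A quick way to inspect them is to print the shapes (the window counts below follow from the 2308/990 split with window_size=7 and step=1):

print(train_feature.shape,train_label.shape)#Expected: torch.Size([2302, 1, 6]) torch.Size([2302, 1])
print(test_feature.shape,test_label.shape)#Expected: torch.Size([984, 1, 6]) torch.Size([984, 1])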



3 Construct a one-dimensional convolutional neural network

The schematic diagram of the one-dimensional convolutional neural network designed in this article is shown in the figure:



The code to construct a one-dimensional convolutional neural network is as follows:

#Define a class MyConv that inherits from nn.Module (the base class of all neural network modules in PyTorch)
class MyConv(nn.Module):
    def __init__(self):#__init__ is the constructor of the class; self is the instance
        super(MyConv,self).__init__()#super() calls the constructor of the parent class
        #First one-dimensional convolution layer
        #nn.Sequential is a container that runs the stored modules in order
        self.conv1=nn.Sequential(
            nn.Conv1d(in_channels=1,out_channels=32,kernel_size=3,stride=1,padding=1),
            nn.ReLU(inplace=True)#Activation function ReLU
        )
        #nn.Conv1d applies a one-dimensional convolution over an input composed of several input planes
        #in_channels is the number of input channels, out_channels the number of channels produced by the convolution, kernel_size the convolution kernel size
        #stride is the convolution step size, and padding is the number of zeros added to each side of the input
        self.conv2=nn.Sequential(
            nn.Conv1d(in_channels=32,out_channels=64,kernel_size=2,stride=1,padding=1),
            nn.ReLU(inplace=True)
        )
        #Fully connected layers that reduce the 64 output channels to a single value
        self.fc1=nn.Linear(64,32)
        #nn.Linear defines a linear (fully connected) layer of the network
        #The first argument is the number of input neurons, the second the number of output neurons
        self.fc2=nn.Linear(32,1)
    def forward(self,X):
        #Because forward is defined, the network can be called as MyConv(data) instead of MyConv.forward(data)
        out=self.conv1(X)#One-dimensional convolution
        out=F.avg_pool1d(out,3)#Average pooling with kernel size 3
        out=self.conv2(out)#One-dimensional convolution
        out=F.avg_pool1d(out,3)#Average pooling
        out=out.squeeze()
        #squeeze() removes dimensions of size 1, e.g. [[[1,2,3],[3,4,5]]] -> [[1,2,3],[3,4,5]]:
        #a tensor of shape (1,2,3) becomes shape (2,3) after squeeze()
        #(note that this would also drop the batch dimension if the batch size were 1)
        out=self.fc1(out)
        out=self.fc2(out)
        return out
#Build the network
net=MyConv()
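
As a sanity check on the architecture (a sketch, using the window_size=7 set earlier), a dummy batch can be pushed through the network to confirm the output shape:

dummy=torch.randn(8,1,window_size-1)#A batch of 8 windows with 6 values each
#conv1: (8,1,6) -> (8,32,6); pool: -> (8,32,2); conv2: -> (8,64,3); pool: -> (8,64,1)
#squeeze: -> (8,64); fc1: -> (8,32); fc2: -> (8,1)
print(net(dummy).shape)#Expected: torch.Size([8, 1])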

4 Training model

4.1 Divide batches

Here are some classic neural network terms: batch, epoch, and iteration:

  • batch: when training a neural network, the samples are processed in groups. For example, 1,000 samples can be divided into 10 batches of 100 samples each, i.e. batch size = 100.
  • epoch: one pass through all the samples in the training set. The number of epochs is the number of times the entire data set is used for training.
  • iteration: one training step on a single batch of batch_size samples. For example, with 1,000 training samples and batch_size = 100, one pass over the whole training set corresponds to iteration = 10, epoch = 1.

The custom function data_iter divides the data into batches, where batch_size is the size of one batch, features the features, and labels the labels.

def data_iter(batch_size,features,labels):
    num_examples=len(features)
    indices=list(range(num_examples))
    #The windows were already shuffled in data_process, so the batches are taken in order here
    for i in range(0,num_examples,batch_size):
        j=torch.LongTensor(indices[i: min(i + batch_size,num_examples)])#The last batch may be smaller than batch_size
        yield features.index_select(0,j),labels.index_select(0,j)
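
For illustration, the first batch can be drawn and inspected (a sketch; 128 is the batch size set in section 4.3):

X,y=next(data_iter(128,train_feature,train_label))#Take one batch
print(X.shape,y.shape)#Expected: torch.Size([128, 1, 6]) torch.Size([128, 1])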

4.2 Set the loss function

The loss function measures the difference between the model's predicted value f(x) and the true value Y. It is mainly used in the training phase: after each batch of training data is fed into the model, forward propagation produces the predicted values, and the loss function computes the difference between the predictions and the true values, i.e. the loss value. The model then updates its parameters through backpropagation to reduce this loss, so that the predictions move closer to the true values, which is how learning takes place.

#Loss function: halved squared error, returned per sample
def square_loss(feature,label):
    return (net(feature)-label)**2/2#Shape (N,1); callers apply .sum() or .mean() as needed
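
PyTorch's built-in nn.MSELoss could serve the same purpose (a sketch, not the original code): it averages (prediction - label)**2 over the batch and omits the 1/2 factor, so the reported values differ by a constant factor.

mse=nn.MSELoss()
#l=mse(net(X),y) would then replace l=loss(X,y).sum() in the training loop below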

4.3 Parameter initialization

#Parameter initialization
for params in net.parameters(): #The parameters of the network are the connection weights W and the biases b
    torch.nn.init.normal_(params,mean=0,std=0.01)
    #Every parameter is drawn from a normal distribution with mean 0 and standard deviation 0.01
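
A common variant (an addition of mine, not from the original) initializes only the weight matrices this way and zeros the biases, using named_parameters() to tell them apart:

for name,param in net.named_parameters():
    if 'weight' in name:
        torch.nn.init.normal_(param,mean=0,std=0.01)#Weights: small random values
    else:
        torch.nn.init.zeros_(param)#Biases: zero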

lr=0.001#Learning rate
num_epochs=100#Number of training epochs
batch_size=128#Batch size
loss=square_loss#Loss function
optimizer=torch.optim.Adam(net.parameters(),lr)
#Create the optimizer: pass in the network's parameters and the learning rate

4.4 Training model

#Model training
train_loss=[]
test_loss=[]
for epoch in range(num_epochs):#Outer loop over epochs
    for X,y in data_iter(batch_size,train_feature,train_label):#Inner loop over the batches of one epoch
        l=loss(X,y).sum()
        #Total loss of this batch, i.e. the squared difference between the outputs and the true values
        optimizer.zero_grad()
        #Zero the gradients so that they do not accumulate across batches
        l.backward()#Backpropagation: compute the gradient of the loss with respect to each parameter (the weights W)
        optimizer.step()#Update the parameters (weights W) by one gradient step
    #Record the loss on the full training and test sets; no_grad() skips gradient tracking during evaluation
    with torch.no_grad():
        train_l=loss(train_feature,train_label).mean().item()
        test_l=loss(test_feature,test_label).mean().item()
    train_loss.append(train_l)
    test_loss.append(test_l)
    print('epoch %d,train loss %f,test loss %f'%(epoch + 1,train_l,test_l))

5 Model results

5.1 Loss function curve

#Plot the loss curves
x=np.arange(num_epochs)
plt.figure(figsize=(8,6))
plt.plot(x,train_loss,label='train_loss',linewidth=1.5)
plt.plot(x,test_loss,label='test_loss',linewidth=1.5)
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.grid()
plt.show()



5.2 True value and predicted value curve of test set

split_point=int(split_prop*series.size)
#split_prop is the proportion of the training set
#split_point is the boundary between the training set and the test set
test_time=time[split_point + window_size-1:]#Timestamps of the test labels (each label is the last value of its window)

#True test-set sequence
test_true=series[split_point + window_size-1:]
#Predicted test-set sequence; no_grad() because gradients are not needed at inference time
with torch.no_grad():
    test_predict=net(test_feature).squeeze().tolist()

#Plot the overall test-set curves
plt.figure(figsize=(14,12))
plt.subplot2grid((2,1),(0,0))
plt.plot(test_time,test_true,label='true')
plt.plot(test_time,test_predict,label='predict')
plt.legend(fontsize=15)
plt.grid()
plt.show()
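
To accompany the curves with a number, a simple error metric can be computed (an optional addition, not in the original code):

rmse=np.sqrt(np.mean((np.array(test_predict)-test_true)**2))#Root-mean-square error on the test set
print('test RMSE: %.4f'%rmse)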