LightTS: a lightweight sampling-oriented MLP network for multivariate time series forecasting

Paper: 2022 | Less is more: Fast multivariate time series forecasting with light sampling-oriented MLP structures [1]

Authors: Tianping Zhang, Yizhuo Zhang, Wei Cao, Jiang Bian, Xiaohan Yi, Shun Zheng, and Jian Li

Institution: Tsinghua University, Microsoft Research Asia

Code: https://github.com/thuml/Time-Series-Library/blob/main/models/LightTS.py

Citations: 22


In the previously covered TimesNet paper, Haixu Wu et al. converted the 1D temporal structure into a 2D structure to make it easier to extract information. LightTS likewise converts the 1D time series into a 2D structure, and does so very simply. I have seen friends in time-series competitions reshape a 1D series into 2D and then model it with convolution kernels to extract information, which is very similar to this paper; the difference is that two sampling schemes are considered here for organizing the 2D data, and MLPs are then used to extract the features.

First, assume the input time series has dimensions [B, T, N]; the authors apply two kinds of sampling:

- Continuous sampling: focuses on capturing short-term local patterns.

- Interval sampling: focuses on capturing long-term dependencies.

As shown in the figure below, it is easy to see that with chunk size C the data is rearranged into dimensions [B, C, T/C, N], where T/C is the number of chunks and N is the number of time series.

[Figure: continuous and interval sampling and the overall LightTS architecture, with the highway branch marked in red]
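To make the two sampling schemes concrete, here is a minimal runnable sketch on a toy series (the variable names and toy sizes are illustrative, not from the official repo). In both layouts the later temporal projection mixes along the C axis, i.e. within a column:

import torch

B, T, N = 1, 8, 1                 # batch, length, number of series
C = 2                             # chunk size, so T/C = 4 chunks
x = torch.arange(T).float().view(B, T, N)   # series: 0, 1, ..., 7

# continuous sampling: each column groups C consecutive points (local)
cont = x.reshape(B, T // C, C, N).permute(0, 3, 2, 1)   # [B, N, C, T/C]
print(cont[0, 0])
# tensor([[0., 2., 4., 6.],
#         [1., 3., 5., 7.]])  -> columns: (0,1), (2,3), (4,5), (6,7)

# interval sampling: each column groups points T/C apart (long-range)
intv = x.reshape(B, C, T // C, N).permute(0, 3, 1, 2)   # [B, N, C, T/C]
print(intv[0, 0])
# tensor([[0., 1., 2., 3.],
#         [4., 5., 6., 7.]])  -> columns: (0,4), (1,5), (2,6), (3,7)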

Note: in the paper, the final IEBlock directly outputs the prediction, but in the code, as shown by the red mark in the figure above, there is also a prediction highway from the input, which is added to the final IEBlock's output to form the final prediction.

After sampling, the data passes through an Information Exchange Block (IEBlock). This module is very simple: for each [B*N, C, T/C] input it applies a temporal projection (an MLP over the C dimension, producing [B*N, F', T/C]) and a channel projection (a linear layer over the T/C dimension); the two results of shape [B*N, F', T/C] are added and fed to an output MLP, giving a final output of shape [B*N, F, T/C].

[Figure: IEBlock structure]

Note: in the paper, the temporal projection output is fed serially into the channel projection; in the code, the channel projection is likewise applied to the temporal projection's output, but that output is also added back as a residual (skip connection) before being fed to the output projection.
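A compact way to see the difference (a hedged reading of the two variants; t_proj, c_proj, and o_proj are hypothetical stand-ins for the three projections, and the permute operations are omitted for clarity):

def ieblock_paper(x, t_proj, c_proj, o_proj):
    # paper figure: serial, temporal -> channel -> output
    return o_proj(c_proj(t_proj(x)))

def ieblock_code(x, t_proj, c_proj, o_proj):
    # repo code: channel projection applied to the temporal output,
    # added back as a residual before the output projection
    h = t_proj(x)
    return o_proj(h + c_proj(h))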

After the two sampling branches pass through their respective IEBlocks, each goes through a linear layer (chunk_proj) that collapses the chunk dimension; the results are concatenated and fed into a final IEBlock, which outputs the prediction.
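For reference, here is a sketch of how these submodules are built, following the linked Time-Series-Library implementation; it is the body of the model's _build method and should be read alongside the encoder code below. The hyperparameter names (d_model, enc_in, seq_len, pred_len) are assumptions based on that repo:

import torch.nn as nn

def _build(self):
    # IEBlocks for the two sampling branches: C -> F (= d_model // 4) features
    self.layer_1 = IEBlock(input_dim=self.chunk_size, hid_dim=self.d_model // 4,
                           output_dim=self.d_model // 4, num_node=self.num_chunks)
    self.chunk_proj_1 = nn.Linear(self.num_chunks, 1)  # collapse T/C -> 1
    self.layer_2 = IEBlock(input_dim=self.chunk_size, hid_dim=self.d_model // 4,
                           output_dim=self.d_model // 4, num_node=self.num_chunks)
    self.chunk_proj_2 = nn.Linear(self.num_chunks, 1)
    # final IEBlock mixes the concatenated 2*F features across the N series
    self.layer_3 = IEBlock(input_dim=self.d_model // 2, hid_dim=self.d_model // 2,
                           output_dim=self.pred_len, num_node=self.enc_in)
    # highway: per-series linear AR map from the input window to the horizon
    self.ar = nn.Linear(self.seq_len, self.pred_len)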

Experimental results:

[Figures: LightTS experimental results reported in the paper]

However, judging from the benchmarks in the TimesNet paper, LightTS still shows little advantage over DLinear:

[Figure: results comparison from the TimesNet paper]

Let's go straight to the code. I have written the dimension changes in the comments; combined with the model diagram above, it should be very clear (I will try it in a competition later):

def encoder(self, x):
    B, T, N = x.size() # [B, T, N]

    # highway (linear AR) prediction straight from the input
    # [B, T, N] -> [B, T_pred, N]
    highway = self.ar(x.permute(0, 2, 1))
    highway = highway.permute(0, 2, 1)

    # continuous sampling
    # [B, T, N] -> [B, T/C, C, N]
    x1 = x.reshape(B, self.num_chunks, self.chunk_size, N)
    # [B, T/C, C, N] -> [B, N, C, T/C]
    x1 = x1.permute(0, 3, 2, 1)
    # [B, N, C, T/C] -> [B*N, C, T/C]
    x1 = x1.reshape(-1, self.chunk_size, self.num_chunks)
    # [B*N, C, T/C] -> [B*N, F, T/C]
    x1 = self.layer_1(x1)
    # [B*N, F, T/C] -> [B*N, F]
    x1 = self.chunk_proj_1(x1).squeeze(dim=-1)

    # interval sampling
    # [B, T, N] -> [B, C, T/C, N]
    x2 = x.reshape(B, self.chunk_size, self.num_chunks, N)
    # [B, C, T/C, N] -> [B, N, C, T/C]
    x2 = x2.permute(0, 3, 1, 2)
    # [B, N, C, T/C] -> [B*N, C, T/C]
    x2 = x2.reshape(-1, self.chunk_size, self.num_chunks)
    # [B*N, C, T/C] -> [B*N, F, T/C]
    x2 = self.layer_2(x2)
    # [B*N, F, T/C] -> [B*N, F]
    x2 = self.chunk_proj_2(x2).squeeze(dim=-1)

    # concatenate the two branches
    x3 = torch.cat([x1, x2], dim=-1) # [B*N, 2*F]

    x3 = x3.reshape(B, N, -1) # [B, N, 2*F]
    x3 = x3.permute(0, 2, 1) # [B, 2*F, N]

    # final IEBlock across the N series
    out = self.layer_3(x3) # [B, T_pred, N]

    out = out + highway # [B, T_pred, N]
    return out
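For context, the forecasting entry point in the linked repo is essentially a thin wrapper around encoder (a simplified sketch; the real forward also dispatches on the task type and takes additional arguments):

def forecast(self, x_enc):
    return self.encoder(x_enc)

def forward(self, x_enc):
    dec_out = self.forecast(x_enc)          # [B, T_pred, N]
    return dec_out[:, -self.pred_len:, :]   # keep the last pred_len steps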

IEBlock:

import torch
import torch.nn as nn

class IEBlock(nn.Module):
    def __init__(self, input_dim, hid_dim, output_dim, num_node):
        super(IEBlock, self).__init__()

        self.input_dim = input_dim # C
        self.hid_dim = hid_dim
        self.output_dim = output_dim # F
        self.num_node = num_node # T/C

        self._build()

    def _build(self):
        # temporal projection: MLP over the C dimension, C -> F' (= hid_dim // 4)
        self.spatial_proj = nn.Sequential(
            nn.Linear(self.input_dim, self.hid_dim),
            nn.LeakyReLU(),
            nn.Linear(self.hid_dim, self.hid_dim // 4)
        )

        # channel projection: linear layer over the T/C dimension,
        # initialized to the identity so it starts as a no-op
        self.channel_proj = nn.Linear(self.num_node, self.num_node)
        torch.nn.init.eye_(self.channel_proj.weight)

        self.output_proj = nn.Linear(self.hid_dim // 4, self.output_dim)

    def forward(self, x):
        # [B*N, C, T/C] -> [B*N, T/C, F']
        x = self.spatial_proj(x.permute(0, 2, 1))
        # residual: [B*N, F', T/C] + channel projection of the same tensor
        x = x.permute(0, 2, 1) + self.channel_proj(x.permute(0, 2, 1))
        # [B*N, F', T/C] -> [B*N, T/C, F]
        x = self.output_proj(x.permute(0, 2, 1))
        # [B*N, T/C, F] -> [B*N, F, T/C]
        x = x.permute(0, 2, 1)

        return x
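A quick shape check for IEBlock with toy sizes (the numbers are arbitrary, chosen only to exercise the dimensions discussed above; uses the class and imports defined just before):

block = IEBlock(input_dim=24, hid_dim=128, output_dim=32, num_node=4)
x = torch.randn(8, 24, 4)   # [B*N, C, T/C] with B*N = 8, C = 24, T/C = 4
y = block(x)
print(y.shape)              # torch.Size([8, 32, 4]) -> [B*N, F, T/C]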

References

[1] Zhang, T., Zhang, Y., Cao, W., Bian, J., Yi, X., Zheng, S., & Li, J. (2022). Less is more: Fast multivariate time series forecasting with light sampling-oriented MLP structures. arXiv preprint arXiv:2207.01186.
