Time series trend analysis: cyclical, seasonal, holiday effects

Topic 1 (Additional) Time Series Trend Analysis

The Prophet model can capture several kinds of pattern in time series data, including periodicity (annual, quarterly, monthly, weekly), seasonality, and holiday effects. It also breaks its fit down into separate components (trend, seasonality, holiday effect), which helps explain the trends and changes in the data.

seasonal_decompose is a traditional time series decomposition tool suited to a single, simple seasonal pattern plus a trend. Prophet is a more advanced tool suited to more complex analysis, especially data that contains multiple seasonal patterns and holiday effects.

This article applies both decomposition methods, using stock time series data from 2023-01-01 to 2023-10-14 as the example, and walks through importing the data, generating the trend data, visualizing the results, and interpreting them.

More finance and big data case code is available from the WeChat official account “finance melatonin” by replying with the relevant keyword.

1. Prophet time series decomposition

1. Data preparation

Import the required libraries: baostock and prophet

import pandas as pd
import datetime
from prophet import Prophet
import baostock as bs
import numpy as np
import matplotlib.pyplot as plt

Import stock data

# Log in to baostock
lg = bs.login()
# Daily bars for sh.000001 (SSE Composite Index); adjustflag="3" means no price adjustment
rs = bs.query_history_k_data_plus("sh.000001", "date,code,open,high,low,close",
     start_date='2023-01-01', end_date='2023-10-14',
     frequency="d", adjustflag="3")
# Collect the result set row by row
data_list = []
while (rs.error_code == '0') & rs.next():
    # Get one record and append it to the list
    data_list.append(rs.get_row_data())
result = pd.DataFrame(data_list, columns=rs.fields)
# Log out of baostock
bs.logout()


Process the data: convert the date column to a datetime type, then extract the required columns into a new table with the column names Prophet expects (ds for the timestamp, y for the observation).

result['time'] = pd.to_datetime(result['date'])
# Create a DataFrame containing date timestamps and observations
# baostock returns prices as strings, so convert them to float
df = pd.DataFrame({'ds': result['time'],
                   'y': result['open'].astype(float)})
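The preparation step can be wrapped in a small helper. This is a minimal sketch, not part of the original code: the function name to_prophet_frame and the sample rows are hypothetical, and it assumes the raw table holds every value as a string, with an empty string where a price is missing.

```python
import pandas as pd

def to_prophet_frame(raw: pd.DataFrame, date_col: str = "date",
                     value_col: str = "open") -> pd.DataFrame:
    """Return a two-column frame (ds, y) in the layout Prophet expects."""
    out = pd.DataFrame({
        "ds": pd.to_datetime(raw[date_col]),
        # errors="coerce" turns empty or malformed strings into NaN
        "y": pd.to_numeric(raw[value_col], errors="coerce"),
    })
    # Drop unparseable rows and keep the observations in time order
    return out.dropna(subset=["y"]).sort_values("ds").reset_index(drop=True)

# Hypothetical rows in the shape the query returns (all values are strings)
sample = pd.DataFrame({
    "date": ["2023-01-04", "2023-01-03", "2023-01-05"],
    "open": ["3117.57", "3087.51", ""],
})
frame = to_prophet_frame(sample)
print(frame)
```

With the real data, df = to_prophet_frame(result) would stand in for the two statements above.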

2. Create model

# Create Prophet model
model = Prophet()

model.add_seasonality(name='weekly', period=5, fourier_order=6)
model.add_seasonality(name='monthly', period=20, fourier_order=6)

# Fit model
model.fit(df)

In the above code, model = Prophet() creates an instance of the Prophet model. The instance must exist before seasonal components, holidays, and so on can be added to it.

add_seasonality is a method of the Prophet model that adds a custom seasonal component, so the model can capture cycles beyond its built-in ones.
The main parameters of the add_seasonality method are as follows:

Parameter: Explanation
name: A label for the seasonal component, such as “weekly” or “yearly”.
period: The length of one cycle, in days. Use 7 for weekly seasonality and 365.25 for yearly seasonality (the fraction accounts for leap years).
fourier_order: The number of Fourier terms used to fit the component. Higher values make the component more flexible but can lead to overfitting. Starting at 3 and adjusting as needed is a reasonable default.
prior_scale: Controls how strongly the component is regularized. Larger values give the component more influence; smaller values dampen it. A typical starting point is 0.1, adjusted as needed.
mode: A string, either “additive” or “multiplicative”, chosen according to how the seasonal effect scales with the level of the data.
cap and floor: These are not arguments of add_seasonality. They are columns of the input DataFrame that give the upper and lower saturation limits of the series when the model uses logistic growth.

In the code above, I added two seasonal components to the model via the add_seasonality method:
“weekly” seasonality: a trading week has 5 days, so the period parameter is 5, and a 6th-order Fourier series is used to fit the pattern.
“monthly” seasonality: a month has roughly 20 to 22 trading days, so the period is set to 20, again with a 6th-order Fourier series.
These components are meant to capture the weekly and monthly patterns in the data. Note that Prophet measures period in days along the ds axis, so on a series indexed by calendar dates these values mean 5 and 20 calendar days and only approximate the trading week and month.
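To see what period and fourier_order control, the seasonal regressors can be built by hand. The sketch below is an illustration in plain NumPy, not Prophet's internal code: the function name fourier_features is hypothetical, and it assumes time is measured in whole days.

```python
import numpy as np

def fourier_features(t_days, period, fourier_order):
    """Build the sin/cos columns for one seasonal component.

    t_days: time in days; period: cycle length in days;
    fourier_order: number of harmonics N, which yields 2 * N columns.
    """
    t = np.asarray(t_days, dtype=float)
    columns = []
    for n in range(1, fourier_order + 1):
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

t = np.arange(40)  # 40 consecutive days
weekly = fourier_features(t, period=5, fourier_order=6)
monthly = fourier_features(t, period=20, fourier_order=6)
print(weekly.shape, monthly.shape)  # (40, 12) (40, 12)
```

Each component contributes 2 * fourier_order columns, and the model fits one coefficient per column. On whole-day timestamps a 5-day period can only resolve two distinct harmonics, so the higher-order columns for that component repeat earlier ones; a smaller fourier_order would likely give the same fit there.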

After fitting the model, call predict to obtain a table that contains the trend and the other components for each date. Here it is saved in the tt variable:

tt = model.predict(df)


The main columns of the generated table are explained below:

  1. trend: The trend component captured by the model, which represents the overall direction of the series. Because predict is called on df here, the values are the fitted trend at the historical dates; for future dates they would be trend forecasts.

  2. yhat_lower and yhat_upper: The lower and upper bounds of the prediction interval. They describe the model's uncertainty about each value and help you judge how far to trust the fit.

  3. additive_terms: The sum of all components that are added to the trend, such as additive seasonal patterns and holiday effects.

  4. monthly and weekly: The individual seasonal components, named after the components added with add_seasonality. They give the contribution of the monthly and weekly seasonality to each prediction.

  5. multiplicative_terms: The sum of all components that scale the trend rather than add to it. It is zero when every component is additive, as in this example.

  6. yhat: The overall predicted value, which combines the trend with the seasonal and other terms: yhat = trend * (1 + multiplicative_terms) + additive_terms.

Together, these columns show how the Prophet model builds each predicted value and how much each component contributes, which makes the results easier to interpret and visualize.
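The way the columns combine can be checked directly. The sketch below assumes the forecast table follows the relation yhat = trend * (1 + multiplicative_terms) + additive_terms; the function name recombine and the numbers in demo are hypothetical stand-ins for a few rows of tt.

```python
import numpy as np
import pandas as pd

def recombine(forecast: pd.DataFrame) -> pd.Series:
    """Rebuild yhat from the component columns of a forecast table."""
    return (forecast["trend"] * (1.0 + forecast["multiplicative_terms"])
            + forecast["additive_terms"])

# Hypothetical stand-in for a few rows of tt (all components additive)
demo = pd.DataFrame({
    "trend": [100.0, 101.0, 102.0],
    "weekly": [0.5, -0.2, 0.1],
    "monthly": [1.0, 0.4, -0.6],
    "multiplicative_terms": [0.0, 0.0, 0.0],
})
demo["additive_terms"] = demo["weekly"] + demo["monthly"]
demo["yhat"] = recombine(demo)
print(demo[["trend", "additive_terms", "yhat"]])
```

On the real table, np.allclose(recombine(tt), tt["yhat"]) is a quick sanity check that the components have been read correctly.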

3. Visualized results

model.plot_components(tt)

The component plots show how much each part of the model contributes. If the weekly panel fluctuates noticeably, the data has a week-to-week cycle. In the plot generated by plot_components here, the long-term trend, the weekly seasonality, and the monthly seasonality all fluctuate visibly, which indicates that the series contains both trend and seasonal structure.
Always read the vertical axis, though. If a component varies by only a few tenths, the change is negligible relative to the level of the series and there is no meaningful weekly pattern.
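One way to make "small enough to ignore" concrete is to compare each component's peak-to-peak range with the typical level of the series. This is a sketch under assumptions: component_ranges is a hypothetical helper, the demo numbers are invented, and no particular threshold is implied.

```python
import pandas as pd

def component_ranges(forecast: pd.DataFrame, components, level_col="yhat"):
    """Peak-to-peak range of each component, absolute and relative to the mean level."""
    level = forecast[level_col].abs().mean()
    rows = []
    for name in components:
        span = forecast[name].max() - forecast[name].min()
        rows.append({"component": name, "range": span,
                     "share_of_level": span / level})
    return pd.DataFrame(rows).set_index("component")

# Hypothetical values standing in for a few rows of the forecast table
demo = pd.DataFrame({
    "yhat": [3200.0, 3210.0, 3190.0, 3200.0],
    "weekly": [0.3, -0.2, 0.1, -0.3],
    "monthly": [12.0, -8.0, 4.0, -10.0],
})
summary = component_ranges(demo, ["weekly", "monthly"])
print(summary)
```

Calling component_ranges(tt, ["weekly", "monthly"]) on the real table gives numbers to set beside the component plots.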

Trend:
A trend component that fluctuates significantly means the long-term direction of the data is not constant: the series rises and falls over extended periods. Such fluctuations can be caused by many factors.

4. Result analysis

With the trend and seasonality extracted, compare the fitted values with the observed series and plot the residuals to check the fitting error:

# Residuals = observed opening price minus the in-sample fitted value
actual = result['open'].astype(float).to_numpy()
res = actual - tt['yhat'].to_numpy()

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(df['ds'], res)
ax.set_xlabel('date')
ax.set_title('residuals')

# Adjust layout
plt.tight_layout()

# Display graphics
plt.show()
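Alongside the residual plot, a few summary numbers make the fitting error easier to compare across models. The sketch below computes standard error measures; fit_errors is a hypothetical helper and the sample values are invented.

```python
import numpy as np

def fit_errors(actual, fitted):
    """Mean absolute error, root mean squared error, and mean absolute percentage error."""
    actual = np.asarray(actual, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    resid = actual - fitted
    return {
        "mae": float(np.mean(np.abs(resid))),
        "rmse": float(np.sqrt(np.mean(resid ** 2))),
        # Assumes no observed value is zero
        "mape_pct": float(np.mean(np.abs(resid / actual)) * 100.0),
    }

errors = fit_errors([100.0, 102.0, 98.0, 100.0], [101.0, 100.0, 99.0, 100.0])
print(errors)
```

For the model above, fit_errors(df['y'], tt['yhat']) would report the in-sample error. These are in-sample figures, so they say how well the model fits the history, not how well it forecasts.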



5. Holiday effect

Holidays such as May Day (May 1st) and National Day (October 1st) need special handling if their effect is to be modeled.
First build a table of the holidays to consider:

# Create holiday data frame
qingming = pd.DataFrame({
  'holiday': 'qingming',
  'ds': pd.to_datetime(['2023-04-05','2023-04-06']),
  'lower_window': 0,
  'upper_window': 1,
})
laodong = pd.DataFrame({
  'holiday': 'laodong',
  'ds': pd.to_datetime(['2023-04-29', '2023-05-03']),
  'lower_window': 0,
  'upper_window': 1,
})
duanwu = pd.DataFrame({
  'holiday': 'duanwu',
  'ds': pd.to_datetime(['2023-06-22', '2023-06-24']),
  'lower_window': 0,
  'upper_window': 1,
})
guoqing = pd.DataFrame({
  'holiday': 'guoqing',
  'ds': pd.to_datetime(['2023-09-29', '2023-10-06']),
  'lower_window': 0,
  'upper_window': 1,
})
holidays = pd.concat([qingming, laodong, duanwu, guoqing])
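Each frame above lists two dates, so only those two days (plus the day after each, through upper_window=1) are marked. If the two dates are meant as the first and last day of a holiday, every day in between can be marked as well. The sketch below makes that assumption explicit; holiday_span is a hypothetical helper, and the spans reuse the dates from the code above.

```python
import pandas as pd

def holiday_span(name, start, end, lower_window=0, upper_window=0):
    """One row per calendar day from start to end, inclusive."""
    return pd.DataFrame({
        "holiday": name,
        "ds": pd.date_range(start=start, end=end, freq="D"),
        "lower_window": lower_window,
        "upper_window": upper_window,
    })

# Assumption: each pair is (first day, last day) of the holiday
spans = {
    "laodong": ("2023-04-29", "2023-05-03"),
    "guoqing": ("2023-09-29", "2023-10-06"),
}
holidays_full = pd.concat(
    [holiday_span(name, start, end) for name, (start, end) in spans.items()],
    ignore_index=True,
)
print(holidays_full.groupby("holiday")["ds"].count())
```

The resulting table has the same four columns as holidays and can be passed to Prophet in the same way. Whether marking the full span improves the fit depends on the data, since an index has no rows on days when the market is closed.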


Creating the model is the same as before, except that the table is passed through the holidays parameter:

model = Prophet(holidays=holidays)
model.add_seasonality(name='weekly', period=5, fourier_order=6)
model.add_seasonality(name='monthly', period=20, fourier_order=6)

# Fit model
model.fit(df)

Generate chart:

tt = model.predict(df)
model.plot_components(tt)


The chart now has an additional holidays panel, which shows that relatively large fluctuations occurred around Tomb-Sweeping Day (Qingming) and May Day.

2. Traditional time series decomposition

Stock data are not well suited to traditional time series decomposition, because the series is not continuous: there are only five trading days per week, and none on holidays.
Bitcoin, which trades every day, is used for this part of the analysis instead.
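The gaps can be made visible with pandas. The sketch below compares a plain calendar index with a trading-day index for a few weeks around May Day; the closure dates are an assumption taken from the holiday table earlier in the article, and the variable names are illustrative.

```python
import pandas as pd

calendar_days = pd.date_range("2023-04-24", "2023-05-12", freq="D")
# Assumed market closures on weekdays during the May Day holiday
closed = pd.to_datetime(["2023-05-01", "2023-05-02", "2023-05-03"])
trading_days = pd.bdate_range("2023-04-24", "2023-05-12").difference(closed)

# A regular daily index has an inferable frequency; the trading index does not
daily_freq = pd.infer_freq(calendar_days)
trading_freq = pd.infer_freq(trading_days)
print(daily_freq, trading_freq)

# Calendar days with no observation (weekends plus the holiday closures)
missing = calendar_days.difference(trading_days)
print(len(missing))
```

seasonal_decompose needs either an explicit period or an index with a regular frequency, which is why an irregular trading calendar is awkward for it.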

result = pd.read_csv("bitcoin.csv")
result = result[-270:]

Process the Bitcoin data and set the row index to a standard date format:

result['date']= pd.to_datetime(result['date'])
result.set_index('date', inplace=True)

Extract the bitcoin_price column for decomposition:

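from statsmodels.tsa.seasonal import seasonal_decompose

# period=7 assumes daily rows with a weekly cycle; passing it explicitly
# avoids relying on the index frequency being inferred
result1 = seasonal_decompose(result['bitcoin_price'], model='additive', period=7)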
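The idea behind the additive decomposition can be sketched in a few lines of pandas: a centered moving average gives the trend, the average detrended value at each position in the cycle gives the seasonal pattern, and whatever is left is the residual. This is a simplified illustration for an odd period, not the statsmodels implementation; additive_decompose and the synthetic series are hypothetical.

```python
import numpy as np
import pandas as pd

def additive_decompose(series: pd.Series, period: int = 7):
    """Classical additive decomposition for an odd period."""
    # Trend: centered moving average over one full period (NaN at both ends)
    trend = series.rolling(window=period, center=True).mean()
    detrended = series - trend
    # Seasonal: mean detrended value at each position in the cycle
    position = np.arange(len(series)) % period
    means = detrended.groupby(position).mean()
    means = means - means.mean()  # center the pattern on zero
    seasonal = pd.Series(means.to_numpy()[position], index=series.index)
    # Residual: what trend and seasonal pattern do not explain
    resid = series - trend - seasonal
    return trend, seasonal, resid

# Synthetic daily series: linear trend plus a weekly pattern that sums to zero
idx = pd.date_range("2023-01-01", periods=70, freq="D")
pattern = np.array([3.0, 1.0, -2.0, -1.0, 0.0, -3.0, 2.0])
values = 100.0 + 0.5 * np.arange(70) + np.tile(pattern, 10)
trend, seasonal, resid = additive_decompose(pd.Series(values, index=idx))
print(seasonal.head(7).round(6).tolist())
```

On this noise-free series the seasonal component reproduces the weekly pattern and the residual is zero wherever the trend is defined. Real data leave a nonzero residual, which is what the residual panel in the plot shows.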

Visualize the raw data, the long-term trend, the seasonal component, and the residuals:

# Draw the raw series and its three components in four stacked panels
panels = [
    # (title, y-axis label, series, legend label)
    ('Data', 'Data Value', result['bitcoin_price'], 'bitcoin_price'),
    ('Trend Component', 'Trend Value', result1.trend, 'Trend'),
    ('Seasonal Component', 'Seasonal Value', result1.seasonal, 'Seasonal'),
    ('Residual Component', 'Residual Value', result1.resid, 'Residual'),
]

plt.figure(figsize=(12, 10))
for i, (title, ylabel, series, label) in enumerate(panels, start=1):
    plt.subplot(4, 1, i)
    plt.plot(result.index, series, label=label, color='b')
    plt.title(title)
    plt.xlabel('Date')
    plt.ylabel(ylabel)
    plt.grid(False)
    plt.legend()

plt.tight_layout()
plt.show()



The code and data used in this article can be obtained from the WeChat official account “finance melatonin” by replying with the relevant keyword.