Topic 1 (Additional) Time Series Trend Analysis
The Prophet model can capture several kinds of pattern in time series data, including periodicity (annual, quarterly, monthly, weekly) and holiday effects. It also decomposes the fitted series into separate components (trend, seasonality, and holiday effects), which helps explain trends and changes in the data.
seasonal_decompose (from statsmodels) is a traditional time series decomposition tool suited to simple trend and seasonality analysis. Prophet is a more flexible tool suited to more complex series, especially data with multiple seasonal patterns and holiday effects.
This article applies both decomposition methods. Taking stock data from 2023-01-01 to 2023-10-14 as the example, it walks through importing the data, generating the trend data, visualizing the results, and interpreting them.
For more finance and big-data case code, follow the WeChat official account (gzh) 'finance melatonin' and reply with the keyword.
1. Prophet time series decomposition
1. Data preparation
Import the required libraries: baostock and prophet
    import datetime

    import baostock as bs
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from prophet import Prophet
Import stock data
    # Log in to baostock and request daily bars for sh.000001
    lg = bs.login()
    rs = bs.query_history_k_data_plus(
        "sh.000001",
        "date,code,open,high,low,close",
        start_date='2023-01-01',
        end_date='2023-10-14',
        frequency="d",
        adjustflag="3",
    )

    # Collect the result rows one record at a time
    data_list = []
    while (rs.error_code == '0') & rs.next():
        data_list.append(rs.get_row_data())
    result = pd.DataFrame(data_list, columns=rs.fields)

    # Log out of baostock
    bs.logout()
Process the data: convert the date column to pandas datetime values and extract the required columns into a new table.
    result['time'] = pd.to_datetime(result['date'])

    # Create a DataFrame with the timestamps (ds) and observations (y);
    # the prices arrive as strings, so cast them to float
    df = pd.DataFrame({'ds': result['time'], 'y': result['open'].astype(float)})
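As a quick check on the preparation step, the sketch below applies the same conversion to a tiny hand-made sample. The rows are illustrative placeholders rather than real quotes, and the name `prepared` is only used here. It assumes the price field arrives as text, which is why the residual code later in this article converts it with float().

```python
import pandas as pd

# Illustrative rows shaped like the downloaded table (values are made up)
sample = pd.DataFrame({
    'date': ['2023-01-03', '2023-01-04', '2023-01-05'],
    'open': ['3087.51', '3117.57', '3132.76'],
})

# Prophet expects a datetime column named 'ds' and a numeric column named 'y'
prepared = pd.DataFrame({
    'ds': pd.to_datetime(sample['date']),
    'y': sample['open'].astype(float),
})
print(prepared.dtypes)
```

If either dtype comes out as object, the conversion did not take effect and the model input should be checked before fitting.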
2. Create model
    # Create Prophet model
    model = Prophet()
    model.add_seasonality(name='weekly', period=5, fourier_order=6)
    model.add_seasonality(name='monthly', period=20, fourier_order=6)

    # Fit model
    model.fit(df)
In the code above, model = Prophet() creates an instance of the Prophet model. Seasonal components, holidays, and so on are added to this instance before it is fitted.
add_seasonality is a method of the Prophet model that adds a custom seasonal component, so that the model can capture custom periodic patterns in the time series.
Some of the main parameters of the add_seasonality method are as follows:
Parameter | Explanation |
---|---|
name | Name of the seasonal component, such as “weekly” or “yearly”. |
period | Period length of the seasonal component, in days. To capture weekly seasonality, set it to 7; for yearly seasonality, set it to 365.25 (to account for leap years). |
fourier_order | Number of Fourier terms used to approximate the seasonal component. Increasing this value makes the component more flexible, but may also lead to overfitting. Generally, it is recommended to start with 3 and adjust as needed. |
prior_scale | Controls the strength of regularization on the component, and hence its smoothness. Increasing this value increases the influence of the component. Typically, you can start with 0.1 and adjust as needed. |
mode | String specifying how the component enters the model: “additive” or “multiplicative”, depending on the nature of the data. |
cap and floor | These are not arguments of add_seasonality. They are columns of the input data that give the upper and lower bounds of the series when the model uses logistic growth. |
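To make period and fourier_order more concrete, here is a minimal sketch of how a seasonal component can be expressed as sine and cosine terms. The function name `fourier_features` is made up for this example, and it is a simplified stand-in rather than Prophet's own code.

```python
import numpy as np

def fourier_features(t_days, period, order):
    """Return an (n, 2*order) matrix of sin/cos terms for a period given in days."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

t = np.arange(40)                           # 40 consecutive days
X = fourier_features(t, period=5, order=6)  # same settings as the 'weekly' term
print(X.shape)
```

Each extra order adds one sine and one cosine column, which is why a larger fourier_order gives a more flexible curve. On a daily grid, a 5-day period has only five distinct phases, so higher orders largely repeat the lower ones.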
In the code above, two seasonal components are added to the model via the add_seasonality method:
“weekly” seasonality: a trading week has 5 days, so the period parameter is set to 5, and a 6th-order Fourier series is used to fit the seasonal pattern.
“monthly” seasonality: a month has roughly 20 to 22 trading days, so the period is set to 20, again with a 6th-order Fourier series.
These components are intended to capture weekly and monthly patterns in the time series. Note that Prophet measures period in days along ds, so on calendar dates a period of 5 spans five calendar days rather than five trading sessions.
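Because period is expressed in days along ds, it is worth checking how a 5-step cycle lines up with the trading calendar. The sketch below uses plain Monday-to-Friday business days (exchange holidays are ignored, and all column names are illustrative). It compares the phase of a 5-step cycle counted in calendar days with the same phase counted in trading sessions.

```python
import pandas as pd

# Business days (Mon-Fri) in January 2023; exchange holidays are ignored here
days = pd.bdate_range('2023-01-02', '2023-01-27')

frame = pd.DataFrame({'ds': days})
frame['weekday'] = frame['ds'].dt.dayofweek                      # 0 = Monday
frame['calendar_day'] = (frame['ds'] - frame['ds'].iloc[0]).dt.days
frame['trading_index'] = range(len(frame))

# Phase within a 5-step cycle, measured two ways
frame['phase_calendar'] = frame['calendar_day'] % 5
frame['phase_trading'] = frame['trading_index'] % 5

mondays = frame[frame['weekday'] == 0]
print(mondays[['ds', 'calendar_day', 'phase_calendar', 'phase_trading']])
```

Counted in trading sessions, every Monday falls on the same phase. Counted in calendar days, the Mondays drift across phases because they are seven days apart. Whether period=5 or period=7 fits the analysis better depends on which of these two clocks the ds column follows.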
After fitting, predict returns a table (a DataFrame) containing the trend and the other components for each date; it is saved in the variable tt:
    tt = model.predict(df)
The generated table contains the following columns, which are the main outputs for interpreting the model:
- trend: the trend component captured by the model, representing the overall direction of the time series. The trend column holds the trend estimate at each time point passed to predict.
- yhat_lower and yhat_upper: the lower and upper bounds of the prediction's uncertainty interval. yhat_lower is the low estimate of the predicted value and yhat_upper is the high estimate; the width of this range shows how uncertain the model is.
- additive_terms: the sum of the components that are added to the trend, typically seasonal patterns and holiday effects.
- monthly and weekly: the individual seasonal components defined above. They represent the effect of monthly and weekly seasonality on the forecast, respectively.
- multiplicative_terms: the sum of the components that scale the trend multiplicatively. It matters when seasonal effects grow or shrink with the level of the series.
- yhat: the overall predicted value of the Prophet model, combining trend, seasonality, and other effects. yhat is the final forecast.
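The relationship between these columns can be written out directly. The sketch below assumes the combination rule yhat = trend * (1 + multiplicative_terms) + additive_terms, which is how I understand Prophet's predict step. The numbers are hypothetical and only the column names match the real output.

```python
import pandas as pd

# Hypothetical rows with the same column names as Prophet's forecast frame
forecast = pd.DataFrame({
    'trend': [3100.0, 3105.0, 3110.0],
    'additive_terms': [4.0, -2.5, 1.5],
    'multiplicative_terms': [0.0, 0.0, 0.0],
})

# Trend scaled by multiplicative effects, plus additive effects
forecast['yhat_rebuilt'] = (
    forecast['trend'] * (1 + forecast['multiplicative_terms'])
    + forecast['additive_terms']
)
print(forecast)
```

On a real forecast, comparing the rebuilt column with tt['yhat'] (for example with numpy.allclose) is a quick consistency check on the decomposition.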
Together, these columns show how the Prophet model builds its forecast and how much each component contributes to the predicted value. They can be used to interpret and visualize the results and to better understand the trend and seasonality in the data.
3. Visualized results
    model.plot_components(tt)
The weekly component plot shows whether the data has a week-to-week cycle: pronounced fluctuations suggest that it does. In the plot produced by the plot_components method here, the long-term trend, weekly seasonality, and monthly seasonality all fluctuate visibly, which suggests that the series contains a complex trend and seasonal structure.
Check the scale of the vertical axis, however. If a component varies by only a few tenths, the effect is small relative to the series and can be ignored; in that case there is no meaningful weekly pattern. Reading the plots this way helps gauge the influence of each component.
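One way to judge whether a component is "small" is to compare its peak-to-trough range with the average level of the series. The helper `relative_swing` below is a hypothetical illustration with made-up numbers. On real output the frame returned by predict would be passed in instead.

```python
import pandas as pd

def relative_swing(forecast, component, level_column='trend'):
    """Peak-to-trough range of a component as a fraction of the average level."""
    swing = forecast[component].max() - forecast[component].min()
    return swing / forecast[level_column].abs().mean()

# Hypothetical forecast rows; on real output, pass the frame returned by predict()
demo = pd.DataFrame({
    'trend': [3000.0, 3100.0, 3200.0, 3100.0],
    'weekly': [-0.3, 0.2, 0.3, -0.2],
})
share = relative_swing(demo, 'weekly')
print(f"weekly swing is {share:.4%} of the average level")
```

A swing of a few tenths on an index level in the thousands is a tiny fraction, which supports ignoring the component.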
Trend:
A trend component that fluctuates noticeably indicates that the long-term direction of the data changes over time, with distinct upward and downward phases. Such changes can be caused by various factors.
4. Result analysis
Use Prophet's fitted values, which include the extracted seasonality, to check the model's fitting error by plotting the residuals:
    # Actual opening prices (stored as strings) and fitted values, as arrays
    actual = result['open'].astype(float).to_numpy()
    fitted = tt['yhat'].to_numpy()

    # Residual = actual value minus fitted value
    res = actual - fitted

    # Plot the residuals over time
    fig, ax = plt.subplots(figsize=(12, 4))
    ax.plot(df['ds'], res)
    ax.set_xlabel('date')
    ax.set_title('residuals')

    # Adjust layout and display the figure
    plt.tight_layout()
    plt.show()
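A residual plot can be complemented with a few summary numbers. The sketch below computes MAE, RMSE, and MAPE. The function name `fit_errors` and the sample values are illustrative, not taken from the article's data.

```python
import numpy as np

def fit_errors(actual, fitted):
    """Return MAE, RMSE and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    resid = actual - fitted
    return {
        'mae': float(np.mean(np.abs(resid))),
        'rmse': float(np.sqrt(np.mean(resid ** 2))),
        'mape_pct': float(np.mean(np.abs(resid / actual)) * 100),
    }

# Illustrative numbers; with the article's variables this would be
# fit_errors(result['open'], tt['yhat'])
errors = fit_errors(['3100', '3120', '3090'], [3095.0, 3125.0, 3100.0])
print(errors)
```

Since these are in-sample errors, they describe how closely the model fits the history, not how well it would forecast unseen dates.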
5. Holiday effect
If special holidays such as May 1st (Labor Day) and October 1st (National Day) need to be considered, they have to be passed to the model explicitly.
First define the holidays to be considered and combine them into one table:
    # Create holiday data frames. Each date in ds is treated as a separate
    # holiday date; upper_window=1 extends the effect to the following day.
    qingming = pd.DataFrame({
        'holiday': 'qingming',
        'ds': pd.to_datetime(['2023-04-05', '2023-04-06']),
        'lower_window': 0,
        'upper_window': 1,
    })
    laodong = pd.DataFrame({
        'holiday': 'laodong',
        'ds': pd.to_datetime(['2023-04-29', '2023-05-03']),
        'lower_window': 0,
        'upper_window': 1,
    })
    duanwu = pd.DataFrame({
        'holiday': 'duanwu',
        'ds': pd.to_datetime(['2023-06-22', '2023-06-24']),
        'lower_window': 0,
        'upper_window': 1,
    })
    guoqing = pd.DataFrame({
        'holiday': 'guoqing',
        'ds': pd.to_datetime(['2023-09-29', '2023-10-06']),
        'lower_window': 0,
        'upper_window': 1,
    })
    holidays = pd.concat([qingming, laodong, duanwu, guoqing])
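Prophet treats every entry in ds as its own holiday date, extended by lower_window and upper_window. If the two dates in each frame are meant as the first and last day of a break (an assumption, since the text does not say), one way to cover the whole span is to list every day in between. The helper `holiday_span` is hypothetical.

```python
import pandas as pd

def holiday_span(name, start, end):
    """Build a Prophet-style holiday frame with one row per calendar day in [start, end]."""
    return pd.DataFrame({
        'holiday': name,
        'ds': pd.date_range(start, end, freq='D'),
        'lower_window': 0,
        'upper_window': 0,
    })

# Dates read as the start and end of each break (assumed interpretation)
laodong_span = holiday_span('laodong', '2023-04-29', '2023-05-03')
guoqing_span = holiday_span('guoqing', '2023-09-29', '2023-10-06')
holidays_span = pd.concat([laodong_span, guoqing_span], ignore_index=True)
print(holidays_span)
```

With every day listed explicitly, the windows can stay at zero. Widening upper_window would then model a spillover after the break rather than the break itself.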
Creating the model is the same as before, except that the holidays parameter is passed:
    model = Prophet(holidays=holidays)
    model.add_seasonality(name='weekly', period=5, fourier_order=6)
    model.add_seasonality(name='monthly', period=20, fourier_order=6)

    # Fit model
    model.fit(df)
Generate chart:
    tt = model.predict(df)
    model.plot_components(tt)
The chart now has an additional holidays panel, which shows relatively large fluctuations around Tomb-Sweeping Day (Qingming) and May Day.
2. Traditional time series decomposition
Stock data are not well suited to traditional time series decomposition because they are not continuous: there are only five trading days per week.
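The gaps are easy to see by measuring the spacing between consecutive dates. The sketch below uses generated Monday-to-Friday dates as a stand-in for a trading calendar, and the helper `largest_gap_days` is made up for this example.

```python
import pandas as pd

def largest_gap_days(dates):
    """Return the largest spacing, in days, between consecutive timestamps."""
    idx = pd.DatetimeIndex(dates).sort_values()
    return int(idx.to_series().diff().dt.days.max())

trading_days = pd.bdate_range('2023-01-02', '2023-01-31')   # Mon-Fri only
every_day = pd.date_range('2023-01-02', '2023-01-31')       # no gaps

print(largest_gap_days(trading_days))
print(largest_gap_days(every_day))
```

A largest gap above one day means the series is not evenly spaced on the calendar, which a fixed-period decomposition implicitly assumes.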
Bitcoin, which trades every day, is used for this analysis instead.
    # Load the Bitcoin data and keep the last 270 rows
    result = pd.read_csv("bitcoin.csv")
    result = result[-270:]
Process the data and set the row index to a standard date format:
    result['date'] = pd.to_datetime(result['date'])
    result.set_index('date', inplace=True)
Extract the bitcoin_price column for decomposition:
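    from statsmodels.tsa.seasonal import seasonal_decompose

    # Additive decomposition into trend, seasonal, and residual components
    result1 = seasonal_decompose(result['bitcoin_price'], model='additive')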
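To show what an additive decomposition computes, here is a minimal hand-written version for an odd period, run on synthetic data. It follows the classical recipe: a centered moving average for the trend, per-position averages for the seasonal part, and the remainder as the residual. It is a sketch, not statsmodels' implementation, and the function name and data are illustrative.

```python
import numpy as np
import pandas as pd

def additive_decompose(series, period):
    """Classical additive decomposition for an odd period (illustrative sketch)."""
    # Trend: centered moving average over one full period
    trend = series.rolling(window=period, center=True).mean()
    detrended = series - trend
    # Seasonal: average detrended value at each position in the cycle, centered on zero
    position = np.arange(len(series)) % period
    means = detrended.groupby(position).mean()
    means = means - means.mean()
    seasonal = pd.Series(means.to_numpy()[position], index=series.index)
    # Residual: whatever trend and seasonality leave unexplained
    resid = series - trend - seasonal
    return trend, seasonal, resid

# Synthetic daily series: linear trend plus a 7-day pattern (made-up data)
idx = pd.date_range('2023-01-01', periods=70, freq='D')
pattern = np.array([2.0, 1.0, 0.0, -1.0, -2.0, -1.0, 1.0])
y = pd.Series(100 + 0.5 * np.arange(70) + np.tile(pattern, 10), index=idx)

trend, seasonal, resid = additive_decompose(y, period=7)
print(seasonal.iloc[:7].round(3).tolist())
```

The moving average leaves the first and last few trend values undefined, which is also why decomposition plots usually start and end slightly inside the data range.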
Visualize the raw data, long-term trend, seasonal component, and residuals:
    # Raw series and each decomposed component, one panel per entry
    panels = [
        ('Data', 'Data Value', result['bitcoin_price'], 'Price'),
        ('Trend Component', 'Trend Value', result1.trend, 'Trend'),
        ('Seasonal Component', 'Seasonal Value', result1.seasonal, 'Seasonal'),
        ('Residual Component', 'Residual Value', result1.resid, 'Residual'),
    ]

    # Draw decomposed series
    plt.figure(figsize=(12, 10))
    for i, (title, ylabel, series, label) in enumerate(panels, start=1):
        plt.subplot(4, 1, i)
        plt.plot(result.index, series, label=label, color='b')
        plt.title(title)
        plt.xlabel('Date')
        plt.ylabel(ylabel)
        plt.grid(False)
        plt.legend()

    plt.tight_layout()
    plt.show()
The code and data used in this article can be obtained by following the WeChat official account (gzh) 'finance melatonin' and replying with the keyword.