Though I have discussed other components of time series data, I can describe most time series patterns in terms of two basic classes of components: trend and seasonality. The first is a general, systematic linear or nonlinear component that changes over time and does not repeat, or at least does not repeat within the time range captured by our data (e.g., a plateau followed by a period of exponential growth). The second may have a similar structure (e.g., a plateau followed by a period of exponential growth), but it repeats itself at systematic intervals over time. These two general classes of time series components may coexist in real-life data (Dell, 2015). As an example, consider the sales of a company, which can grow rapidly over the years yet still follow consistent seasonal patterns (e.g., as much as 25% of yearly sales are made in December each year, whereas only 4% are made in August).
Figure 1. International passenger data series (G)
This general pattern is well illustrated (see Figure 1) by the international passenger data series (G), described in the textbook Time Series Analysis: Forecasting and Control (Box, Jenkins, & Reinsel, 2008), which records monthly international airline passenger totals (measured in thousands) for twelve consecutive years, from 1960 through 1971. If you plot the successive monthly observations of airline passenger totals, a clear and almost linear trend emerges, indicating that the airline industry enjoyed steady growth over the years (approximately four times more passengers traveled in 1970 than in 1960). At the same time, the monthly figures follow an almost identical pattern each year (e.g., more people travel during holidays than during any other time of the year). This example data file also illustrates a very common type of pattern in time series data, where the amplitude of the seasonal changes increases with the overall trend (i.e., the variance is correlated with the mean over the segments of the series). This pattern, called multiplicative seasonality, indicates that the relative amplitude of seasonal changes is constant over time (Strickland, 2016), so the absolute amplitude is tied to the trend.
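One rough way to check for multiplicative seasonality is to compare the level and the spread of the series year by year. The sketch below is only illustrative: it assumes the AirPassengers dataset that ships with base R as a stand-in for Series G (its calendar labels differ from those used in this article), and the variable names are my own.

```r
# Assumption: AirPassengers (base R 'datasets' package) stands in for Series G.
y <- AirPassengers

# Year index for each monthly observation (the series starts in January).
yr <- start(y)[1] + (seq_along(y) - 1) %/% frequency(y)

# Level (mean) and spread (standard deviation) within each year.
yearly_mean <- sapply(split(as.numeric(y), yr), mean)
yearly_sd   <- sapply(split(as.numeric(y), yr), sd)

# A strong positive correlation between level and spread points to
# multiplicative rather than additive seasonality.
level_spread_cor <- cor(yearly_mean, yearly_sd)
print(level_spread_cor)

# With multiplicative seasonality, the relative spread (sd / mean)
# should stay roughly flat from year to year.
print(round(yearly_sd / yearly_mean, 2))
```

If the correlation were close to zero instead, an additive model would probably be the more natural starting point.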
Decomposing Time Series Data with R
One alternative for decomposing a time series is to use the stlplus package in R (Hafen, 2016), which implements seasonal-trend decomposition by loess (STL). The example below uses the airpass dataset from the TSA package in R (Chan & Ripley, 2012), which contains monthly total international airline passengers from January 1960 through December 1971. The remainder component represents the part of the series not explained by the other two components, trend and seasonality.
library(TSA); library(stlplus)  # TSA supplies airpass; stlplus supplies stlplus()
data(airpass)
Air_stl <- stlplus(airpass, t = as.vector(time(airpass)), n.p = 12, l.window = 13, t.window = 19, s.window = 35, s.degree = 1, sub.labels = substr(month.name, 1, 3))
plot(Air_stl, ylab = "Air Passengers (thousands)", xlab = "Time (years)")
The stlplus package produces the following charts:
- Time series decomposition by component over Time (Figure 2)
- Centered Seasonal plus Remainder versus Time per Month (hour, day, week, as appropriate) (Figure 3)
- Trend and Remainder over Time (Figure 4)
- Seasonal over Time by Month (hour, day, week, as appropriate) (Figure 5)
- Remainder over Time by Month (hour, day, week, as appropriate) (Figure 6)
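The charts after the first one are most likely drawn with the companion plotting functions of stlplus rather than with plot() itself. The following is a sketch under that assumption; the mapping of each function to a figure is my reading of the chart titles, not something stated in the package documentation, and the fit is repeated so that the block runs on its own.

```r
library(TSA)      # supplies the airpass dataset
library(stlplus)  # supplies stlplus() and the plot_* helpers

data(airpass)

# Same fit as in the article, repeated here so the block is self-contained.
Air_stl <- stlplus(airpass, t = as.vector(time(airpass)), n.p = 12,
                   l.window = 13, t.window = 19, s.window = 35, s.degree = 1,
                   sub.labels = substr(month.name, 1, 3))

# Decomposition by component (assumed to correspond to Figure 2)
plot(Air_stl, ylab = "Air Passengers (thousands)", xlab = "Time (years)")

# Centered seasonal plus remainder, one panel per month (assumed Figure 3)
plot_cycle(Air_stl)

# Trend and remainder over time (assumed Figure 4)
plot_trend(Air_stl)

# Seasonal component over time, by month (assumed Figure 5)
plot_seasonal(Air_stl)

# Remainder over time, by month (assumed Figure 6)
plot_rug(Air_stl)
```

These functions return lattice objects, so inside a function or a loop they need to be wrapped in print() to be drawn.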
Figure 2. Time series decomposition by component over time: raw data, seasonal component, trend, and remainder.
Figure 3. Centered Seasonal plus Remainder versus Time per Month
Figure 4. Trend and Remainder over Time
Figure 5. Seasonal over Time by Month
Figure 6. Remainder over Time by Month
The foregoing visual inspection of time series data is more informative than what the decompose() function in base R offers, since decompose() provides only a plot similar to Figure 2. Analyzing the patterns in time series data helps us specify the model on which we will base our forecasts.
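For comparison, here is a minimal base R sketch of two alternatives that handle a multiplicative series: decompose() with a multiplicative model, and stl() fitted on the log scale and then back-transformed. It assumes the AirPassengers dataset from base R as a stand-in for airpass, and s.window = "periodic" is an illustrative choice rather than a recommendation.

```r
# Assumption: AirPassengers (base R 'datasets' package) stands in for airpass.
y <- AirPassengers

# Classical decomposition with a multiplicative model:
# y = trend * seasonal * random
dec <- decompose(y, type = "multiplicative")
plot(dec)

# STL is additive, so fit it on the log scale, where a multiplicative
# structure becomes additive:
# log(y) = seasonal + trend + remainder
fit <- stl(log(y), s.window = "periodic")
plot(fit)

# Back-transform the components; on the original scale they act as
# multiplicative factors rather than additive terms.
comp <- exp(fit$time.series)

# Check that the three factors multiply back to the observed data.
recon <- comp[, "seasonal"] * comp[, "trend"] * comp[, "remainder"]
print(all.equal(as.numeric(recon), as.numeric(y)))
```

After the back-transform, a seasonal factor of 1.2 for a given month would read as "about 20% above trend" rather than as a fixed number of passengers.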
Box, G. E., Jenkins, G. M., & Reinsel, G. C. (2008). Time Series Analysis: Forecasting and Control (4th ed.). Hoboken, NJ: John Wiley & Sons. ISBN 978-0-470-27284-8
Chan, K.-S., & Ripley, B. (2012, November 13). TSA: Time Series Analysis. Retrieved from The Comprehensive R Archive Network (CRAN): https://cran.r-project.org/web/packages/TSA/index.html
Dell. (2015, revised May 8). Time Series Analysis. Statistics Textbook. Retrieved from http://documents.software.dell.com/Statistics/Textbook/Time-Series-Analysis
Strickland, J. (2016, February 24). Exponential Smoothing of Time Series Data in R. Retrieved from http://www.datasciencecentral.com/xn/detail/6448529:BlogPost:391378?xg_source=ac
Jeffrey Strickland, Ph.D.
Jeffrey Strickland, Ph.D., is the author of Predictive Analytics Using R and a Senior Analytics Scientist with Clarity Solution Group. He has performed predictive modeling, simulation, and analysis for the Department of Defense, NASA, the Missile Defense Agency, and the financial and insurance industries for over 20 years. Jeff is a Certified Modeling and Simulation Professional (CMSP) and an Associate Systems Engineering Professional (ASEP). He has published nearly 200 blogs on LinkedIn, is a frequently invited guest speaker, and is the author of 20 books, including:
- Operations Research using Open-Source Tools
- Discrete Event simulation using ExtendSim
- Crime Analysis and Mapping
- Missile Flight Simulation
- Mathematical Modeling of Warfare and Combat Phenomenon
- Predictive Modeling and Analytics
- Using Math to Defeat the Enemy
- Verification and Validation for Modeling and Simulation
- Simulation Conceptual Modeling
- System Engineering Process and Practices
The air passenger time series clearly follows a multiplicative model. Both stl() from the stats R package and stlplus() from the stlplus R package assume an additive model. Wouldn’t it be more appropriate to take the logarithm of the data and then back-transform to the original scale?
Thank you in advance