# Air Passengers (Seasonality)

## Introduction

In this example we will predict the monthly totals, in thousands, of international airline passengers using data from 1949 to 1960.

## Descriptive Analysis & Transformations

### Heteroskedasticity

An obvious feature of the plot of the data is that the time series is both non-stationary and heteroskedastic. However, to facilitate the fitting of an AR model we want a stationary, homoskedastic time series. With that goal in mind we could first consider a Box-Cox transformation to deal with the heteroskedasticity, but in this case a simple log transformation works just fine.

log_ap = air()
log_ap.Passengers = log.(log_ap.Passengers)
rename!(log_ap, :Passengers => :LogPassengers)
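
For reference, the Box-Cox family mentioned above is usually written as below. The symbols $x_t$ (the passenger counts) and $\lambda$ (the transformation parameter) are notation introduced here, not names from the code. The log transformation used in this example is the $\lambda = 0$ member of the family:

```latex
x_t^{(\lambda)} =
\begin{cases}
  \dfrac{x_t^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[6pt]
  \log x_t, & \lambda = 0
\end{cases}
```

Taking logs turns a seasonal swing that grows in proportion to the level of the series into one of roughly constant size, which is why it is often enough for data of this kind.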

### Stationary Behavior

We still have a clearly non-stationary time series displaying a marked trend and monthly seasonality.

stl_ap = stl(log_ap.LogPassengers,12,robust=true)
plot(stl_ap)

To facilitate the fitting of an AR model we want a stationary time series, and to obtain one we will difference the series once at lag 12, which deals with both the trend and the seasonality.

d12_log_ap = d(log_ap,1,12)
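
As a rough illustration of what a lag-12 difference does, here is a minimal sketch in plain Julia on synthetic data. The helper `seasonal_diff` and the toy series are hypothetical and not part of the package; the sketch assumes that `d(log_ap,1,12)` computes the same kind of lagged difference, $y_t = x_t - x_{t-12}$.

```julia
# Hypothetical helper (not from the package): difference of a vector at a given lag.
seasonal_diff(x::AbstractVector, lag::Integer) = x[lag+1:end] .- x[1:end-lag]

# Toy "log" series: linear trend of slope 0.01 plus a seasonal cycle of period 12.
t = 1:48
x = 0.01 .* t .+ 0.2 .* sin.(2π .* t ./ 12)

dx = seasonal_diff(x, 12)

# The first 12 observations are lost, so 36 values remain.
println(length(dx))

# The seasonal cycle cancels and the linear trend collapses to the constant 12 * 0.01.
println(all(isapprox.(dx, 0.12; atol=1e-10)))
```

In this toy case a linear trend in the logs leaves only a constant after differencing, which is one reason a model with a constant term can still be reasonable for a differenced series.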

There is a case for a second differencing of order one, since it further reduces the variance of the resulting time series, which indicates a potential remaining trend. However, this potential improvement might not be enough to justify the extra transformation, since the existing one already yields a time series that is stationary enough to consider an AR model with a constant term.
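
Written out, the two candidate transformations compare as follows, where $x_t$ denotes the log series and $\nabla_{12}$ and $\nabla$ are the lag-12 and lag-1 difference operators (notation introduced here for illustration):

```latex
\begin{aligned}
\nabla_{12} x_t        &= x_t - x_{t-12} \\
\nabla \nabla_{12} x_t &= (x_t - x_{t-12}) - (x_{t-1} - x_{t-13}) \\
                       &= x_t - x_{t-1} - x_{t-12} + x_{t-13}
\end{aligned}
```

The second form removes any constant level left in $\nabla_{12} x_t$, at the price of losing one more observation at the start of the series.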

### Evaluating Seasonal Differencing

splot(d12_log_ap)

After differencing at lag 12 we can also see that the seasonality has mostly disappeared, and we can continue our analysis with a reasonably stationary dataset.

### Autoregressive Behavior

plot(pacf(d12_log_ap[:,2]))

We can see clear autocorrelations with the values of the previous two months and with the values of the same months in the previous year. Beyond this there does not seem to be enough correlation to justify a more complex model, but we can always check this hypothesis by looking at the Information Criteria.
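
Under the usual AR convention, and assuming $y_t$ denotes the differenced log series and $\varepsilon_t$ a zero-mean noise term (symbols introduced here for illustration), this pattern suggests a model in which only a few lags matter. Taking lags 1, 2, 12 and 13, the four singled out later in this example, it would read:

```latex
y_t = \Phi_0 + \Phi_1 y_{t-1} + \Phi_2 y_{t-2}
    + \Phi_{12} y_{t-12} + \Phi_{13} y_{t-13} + \varepsilon_t
```

A full model of order 13, $y_t = \Phi_0 + \sum_{i=1}^{13} \Phi_i y_{t-i} + \varepsilon_t$, contains this one as the special case in which $\Phi_3, \dots, \Phi_{11}$ are zero.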

## Fitting an AR model

Let's then fit an AR model of order 13 with a constant, even though the series has been differenced, and see how it looks:

ar_tap = ar(d12_log_ap,13)
Multivariate Autoregressive Model

ar(X, order=13, constant=true)

Residuals Summary
┌───────────────────────┬────────────┬────────────┬──────────────┬─────────────┬───────────┬──────────┬────────────┐
│              Variable │        Min │         1Q │       Median │        Mean │        3Q │      Max │  H0 Normal │
├───────────────────────┼────────────┼────────────┼──────────────┼─────────────┼───────────┼──────────┼────────────┤
│ d[1,12]_LogPassengers │ -0.0964046 │ -0.0153334 │ -0.000811767 │ 1.92423e-17 │ 0.0180719 │ 0.114315 │ 0.00891643 │
└───────────────────────┴────────────┴────────────┴──────────────┴─────────────┴───────────┴──────────┴────────────┘
┌───────────────────────┬─────────────┬────────────┬──────────┬──────────┐
│              Variable │        Mean │   Variance │ Skewness │ Kurtosis │
├───────────────────────┼─────────────┼────────────┼──────────┼──────────┤
│ d[1,12]_LogPassengers │ 1.92423e-17 │ 0.00126498 │ 0.379074 │  1.15283 │
└───────────────────────┴─────────────┴────────────┴──────────┴──────────┘

Coefficients

Φ0
┌           ┐
│ 0.029 **  │
└           ┘
Φ1
┌          ┐
│ 0.51 *** │
└          ┘
Φ2
┌           ┐
│ 0.325 *** │
└           ┘
Φ3
┌            ┐
│ -0.096     │
└            ┘
Φ4
┌           ┐
│ 0.005     │
└           ┘
Φ5
┌          ┐
│ 0.15 ^   │
└          ┘
Φ6
┌           ┐
│ 0.007     │
└           ┘
Φ7
┌            ┐
│ -0.101     │
└            ┘
Φ8
┌           ┐
│ 0.096     │
└           ┘
Φ9
┌           ┐
│ 0.144 ^   │
└           ┘
Φ10
┌            ┐
│ -0.155 ^   │
└            ┘
Φ11
┌            ┐
│ -0.122     │
└            ┘
Φ12
┌            ┐
│ -0.306 *** │
└            ┘
Φ13
┌           ┐
│ 0.294 *** │
└           ┘
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘^’ 0.1 ‘ ’ 1  and ‘+’ if fixed

Σ2 Variance/Covariance Matrix
┌            ┐
│ 0.00125435 │
└            ┘

Information Criteria
┌──────────┬──────────┬─────────┬──────────┐
│      AIC │     AICC │     BIC │      H&Q │
├──────────┼──────────┼─────────┼──────────┤
│ -6.46902 │ -6.42252 │ 73.4075 │ -6.34477 │
└──────────┴──────────┴─────────┴──────────┘

Statistics
┌───────────────────────┬─────────────────┬──────────┬──────────┐
│              Variable │ Fisher's p-test │       R2 │    R2adj │
├───────────────────────┼─────────────────┼──────────┼──────────┤
│ d[1,12]_LogPassengers │     9.99201e-16 │ 0.669557 │ 0.633152 │
└───────────────────────┴─────────────────┴──────────┴──────────┘

### Fixing Coefficients

As expected, we see that $\Phi_1$, $\Phi_2$, $\Phi_{12}$ and $\Phi_{13}$ are highly significant, and we can also see some significance in $\Phi_0$. This supports the case for a further differencing of order one; however, since doing so worsens the normality profile of the residuals, we will keep the series as it is.

Since we want to know the values of these five parameters (the four lag coefficients and the constant) without the influence of the rest, we will now fit the model again, fixing all the other coefficients to zero.

Φ = ar_tap.Φ
fΦ = copy(ar_tap.Φ)
fΦ[3:11] .= 0 # fixing coefficients from 3 to 11 to zero.
dΦ = (Φ,fΦ)   # Tuple informing AR which coefficients to fix.
arf_tap = ar(d12_log_ap,13;dΦ)
Multivariate Autoregressive Model

ar(X, order=13, constant=true)

Residuals Summary
┌───────────────────────┬───────────┬────────────┬────────────┬─────────────┬───────────┬──────────┬───────────┐
│              Variable │       Min │         1Q │     Median │        Mean │        3Q │      Max │ H0 Normal │
├───────────────────────┼───────────┼────────────┼────────────┼─────────────┼───────────┼──────────┼───────────┤
│ d[1,12]_LogPassengers │ -0.112951 │ -0.0207809 │ 0.00194734 │ 6.04092e-17 │ 0.0205974 │ 0.116951 │ 0.0157718 │
└───────────────────────┴───────────┴────────────┴────────────┴─────────────┴───────────┴──────────┴───────────┘
┌───────────────────────┬─────────────┬────────────┬──────────┬──────────┐
│              Variable │        Mean │   Variance │ Skewness │ Kurtosis │
├───────────────────────┼─────────────┼────────────┼──────────┼──────────┤
│ d[1,12]_LogPassengers │ 6.04092e-17 │ 0.00137195 │ 0.211471 │  1.22265 │
└───────────────────────┴─────────────┴────────────┴──────────┴──────────┘

Coefficients

Φ0
┌          ┐
│ 0.03 **  │
└          ┘
Φ1
┌           ┐
│ 0.538 *** │
└           ┘
Φ2
┌           ┐
│ 0.298 **  │
└           ┘
Φ3
┌         ┐
│ 0.0 +   │
└         ┘
Φ4
┌         ┐
│ 0.0 +   │
└         ┘
Φ5
┌         ┐
│ 0.0 +   │
└         ┘
Φ6
┌         ┐
│ 0.0 +   │
└         ┘
Φ7
┌         ┐
│ 0.0 +   │
└         ┘
Φ8
┌         ┐
│ 0.0 +   │
└         ┘
Φ9
┌         ┐
│ 0.0 +   │
└         ┘
Φ10
┌         ┐
│ 0.0 +   │
└         ┘
Φ11
┌         ┐
│ 0.0 +   │
└         ┘
Φ12
┌           ┐
│ -0.42 *** │
└           ┘
Φ13
┌           ┐
│ 0.329 *** │
└           ┘
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘^’ 0.1 ‘ ’ 1  and ‘+’ if fixed

Σ2 Variance/Covariance Matrix
┌            ┐
│ 0.00136042 │
└            ┘

Information Criteria
┌──────────┬──────────┬─────────┬──────────┐
│      AIC │     AICC │     BIC │      H&Q │
├──────────┼──────────┼─────────┼──────────┤
│ -6.52421 │ -6.50396 │ 29.4763 │ -6.47983 │
└──────────┴──────────┴─────────┴──────────┘

Statistics
┌───────────────────────┬─────────────────┬──────────┬──────────┐
│              Variable │ Fisher's p-test │       R2 │    R2adj │
├───────────────────────┼─────────────────┼──────────┼──────────┤
│ d[1,12]_LogPassengers │             0.0 │ 0.641615 │ 0.630327 │
└───────────────────────┴─────────────────┴──────────┴──────────┘

Compared with the unrestricted fit, all four reported information criteria are lower (for example, AIC goes from -6.46902 to -6.52421), while R2adj barely changes (0.633152 to 0.630327). This is consistent with the earlier hypothesis that the coefficients for lags 3 to 11 add little to the model.

## Forecast of Transformed Series

fct = forecast(arf_tap,3*12)
plot(fct)

## Forecast Original Data

x0 = reshape(log_ap[:,2][1:12]',1,1,12)
fct2 = p(fct,x0)
fc = Forecast.transform(fct2,exp)
new_names = ["#Passengers"]
setnames!(fc,new_names)
plot(fc, title = "Forecast Air Passengers")
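
The steps above undo the transformations in reverse order: the seasonal differences are integrated back using the first 12 log values, and the result is then exponentiated. A minimal sketch of that idea in plain Julia on a synthetic series follows. The helper `seasonal_undiff` and the toy data are hypothetical, and the sketch is not meant to reproduce how `p` or `Forecast.transform` are implemented.

```julia
# Hypothetical helper (not from the package): rebuild a series from its
# lagged differences and the first `length(x0)` original values.
function seasonal_undiff(dx::AbstractVector, x0::AbstractVector)
    lag = length(x0)
    x = Vector{Float64}(undef, lag + length(dx))
    x[1:lag] .= x0
    for t in 1:length(dx)
        x[lag + t] = dx[t] + x[t]   # x[t] is the value one season earlier
    end
    return x
end

# Toy positive series with growth and a yearly cycle (synthetic data).
t = 1:36
passengers = 100 .* exp.(0.01 .* t) .* (1 .+ 0.1 .* sin.(2π .* t ./ 12))

# Forward direction: log, then lag-12 difference.
logp = log.(passengers)
dlogp = logp[13:end] .- logp[1:end-12]

# Backward direction: integrate with the first 12 log values, then exp.
rebuilt = exp.(seasonal_undiff(dlogp, logp[1:12]))
println(rebuilt ≈ passengers)
```

The same reasoning explains why `x0` holds 12 values: a lag-12 difference discards exactly one season of information, which has to be supplied again to get back to the original scale.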

The forecast values and their prediction intervals for the next three years are:

Forecast Information

Forecasting ar(X, order=13, constant=true)
Integrated with x0 size: (1, 1, 12)
Data transformed with function: exp

Mean Forecasting
┌────────────┬─────────────┐
│       Date │ #Passengers │
├────────────┼─────────────┤
│ 1961-01-01 │     454.304 │
│ 1961-02-01 │     426.743 │
│ 1961-03-01 │     478.838 │
│ 1961-04-01 │     496.637 │
│ 1961-05-01 │     527.339 │
│ 1961-06-01 │     589.878 │
│ 1961-07-01 │     689.917 │
│ 1961-08-01 │     685.093 │
│ 1961-09-01 │     569.652 │
│ 1961-10-01 │     512.763 │
│ 1961-11-01 │     444.652 │
│ 1961-12-01 │     491.849 │
│ 1962-01-01 │     514.382 │
│ 1962-02-01 │     484.529 │
│ 1962-03-01 │     533.494 │
│ 1962-04-01 │     570.526 │
│ 1962-05-01 │     591.496 │
│ 1962-06-01 │     670.841 │
│ 1962-07-01 │     779.482 │
│ 1962-08-01 │     769.824 │
│ 1962-09-01 │      643.18 │
│ 1962-10-01 │     579.931 │
│ 1962-11-01 │     497.531 │
│ 1962-12-01 │     552.245 │
│ 1963-01-01 │     577.851 │
│ 1963-02-01 │       543.4 │
│ 1963-03-01 │     603.194 │
│ 1963-04-01 │     635.318 │
│ 1963-05-01 │     668.186 │
│ 1963-06-01 │     749.916 │
│ 1963-07-01 │     876.551 │
│ 1963-08-01 │     865.866 │
│ 1963-09-01 │     722.028 │
│ 1963-10-01 │     650.952 │
│ 1963-11-01 │     560.941 │
│ 1963-12-01 │     620.993 │
└────────────┴─────────────┘

Prediction Intervals alpha at: (0.8, 0.95)

Upper:
┌────────────┬────────────────────┬────────────────────┐
│       Date │ upper1_#Passengers │ upper2_#Passengers │
├────────────┼────────────────────┼────────────────────┤
│ 1961-01-01 │            476.294 │            488.362 │
│ 1961-02-01 │            453.564 │            468.437 │
│ 1961-03-01 │            512.805 │            531.751 │
│ 1961-04-01 │            534.315 │            555.404 │
│ 1961-05-01 │            568.989 │            592.353 │
│ 1961-06-01 │            637.651 │            664.487 │
│ 1961-07-01 │            746.692 │            778.615 │
│ 1961-08-01 │            742.058 │            774.106 │
│ 1961-09-01 │            617.338 │            644.177 │
│ 1961-10-01 │            555.878 │             580.15 │
│ 1961-11-01 │            482.148 │            503.261 │
│ 1961-12-01 │            533.405 │            556.806 │
│ 1962-01-01 │            557.897 │            582.403 │
│ 1962-02-01 │            525.553 │            548.658 │
│ 1962-03-01 │            578.689 │            604.144 │
│ 1962-04-01 │            618.876 │            646.108 │
│ 1962-05-01 │            641.635 │            669.876 │
│ 1962-06-01 │            727.715 │            759.749 │
│ 1962-07-01 │            845.574 │              882.8 │
│ 1962-08-01 │            835.102 │            871.869 │
│ 1962-09-01 │            697.721 │            728.442 │
│ 1962-10-01 │            629.111 │            656.812 │
│ 1962-11-01 │            539.724 │            563.489 │
│ 1962-12-01 │            599.078 │            625.457 │
│ 1963-01-01 │            626.857 │            654.459 │
│ 1963-02-01 │            589.484 │            615.441 │
│ 1963-03-01 │            654.349 │            683.162 │
│ 1963-04-01 │            689.198 │            719.546 │
│ 1963-05-01 │            724.853 │            756.771 │
│ 1963-06-01 │            813.515 │            849.337 │
│ 1963-07-01 │            950.889 │             992.76 │
│ 1963-08-01 │            939.298 │            980.659 │
│ 1963-09-01 │            783.261 │            817.752 │
│ 1963-10-01 │            706.157 │            737.252 │
│ 1963-11-01 │            608.513 │            635.308 │
│ 1963-12-01 │            673.658 │            703.322 │
└────────────┴────────────────────┴────────────────────┘

Lower:
┌────────────┬────────────────────┬────────────────────┐
│       Date │ lower1_#Passengers │ lower2_#Passengers │
├────────────┼────────────────────┼────────────────────┤
│ 1961-01-01 │            433.329 │            422.621 │
│ 1961-02-01 │            401.509 │            388.761 │
│ 1961-03-01 │            447.121 │             431.19 │
│ 1961-04-01 │            461.616 │            444.088 │
│ 1961-05-01 │            488.737 │             469.46 │
│ 1961-06-01 │            545.684 │            523.646 │
│ 1961-07-01 │            637.458 │            611.323 │
│ 1961-08-01 │            632.502 │            606.316 │
│ 1961-09-01 │            525.649 │            503.748 │
│ 1961-10-01 │            472.993 │            453.204 │
│ 1961-11-01 │            410.072 │            392.868 │
│ 1961-12-01 │             453.53 │            434.469 │
│ 1962-01-01 │            474.261 │            454.305 │
│ 1962-02-01 │            446.707 │            427.896 │
│ 1962-03-01 │            491.829 │            471.106 │
│ 1962-04-01 │            525.953 │            503.785 │
│ 1962-05-01 │            545.274 │            522.287 │
│ 1962-06-01 │            618.411 │            592.337 │
│ 1962-07-01 │            718.556 │            688.255 │
│ 1962-08-01 │            709.649 │            679.722 │
│ 1962-09-01 │            592.902 │            567.897 │
│ 1962-10-01 │            534.596 │             512.05 │
│ 1962-11-01 │            458.637 │            439.294 │
│ 1962-12-01 │            509.073 │            487.602 │
│ 1963-01-01 │            532.677 │            510.211 │
│ 1963-02-01 │            500.919 │            479.792 │
│ 1963-03-01 │            556.038 │            532.586 │
│ 1963-04-01 │            585.651 │             560.95 │
│ 1963-05-01 │            615.949 │             589.97 │
│ 1963-06-01 │             691.29 │            662.133 │
│ 1963-07-01 │            808.024 │            773.944 │
│ 1963-08-01 │            798.175 │             764.51 │
│ 1963-09-01 │            665.582 │             637.51 │
│ 1963-10-01 │            600.062 │            574.753 │
│ 1963-11-01 │            517.088 │            495.279 │
│ 1963-12-01 │            572.445 │            548.301 │
└────────────┴────────────────────┴────────────────────┘