
ARIMA Modelling and Forecasting

By: Amar Kumar


Introduction
Stationarity of the time series

AR and MA Models

ARIMA Models (Auto-Regressive Integrated Moving Average)

Diagnostic Checks

Forecast using ARIMA models


Stationarity of the time series
If an AR model is not stationary, previous values of the error term will have a non-declining effect on the current value of the dependent variable: the coefficients on the corresponding MA process would not converge to zero as the lag length increases.
For an AR model to be stationary, the coefficients on the corresponding MA process must decline with lag length, converging to 0.
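This convergence condition can be made concrete. A minimal Python sketch (illustrative only; the helper name and alpha values are not from the slides): an AR(1) process x(t) = alpha * x(t-1) + error(t) has an infinite MA representation whose weight on the j-th lagged error is alpha to the power j.

```python
# Sketch: an AR(1) x(t) = alpha*x(t-1) + error(t) can be rewritten as an
# infinite MA process whose weight on the j-th lagged error is alpha**j.
# With |alpha| < 1 the weights die out (stationary); with alpha = 1 (a
# random walk) every past shock keeps full weight forever.

def ma_weights(alpha, n_lags):
    """Weights psi_j = alpha**j of the MA representation of an AR(1)."""
    return [alpha ** j for j in range(n_lags)]

print(ma_weights(0.5, 6))  # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
print(ma_weights(1.0, 6))  # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```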
Stationary Series (1/2)
A stationary series is homoscedastic: its variance is constant over time.

How to deal with non-stationarity:
Model the differences of the terms rather than the actual terms:
x(t) - x(t-1) = ARMA(p, q)
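A quick sketch of why differencing works (illustrative Python, not from the slides): a random walk is non-stationary, but its first difference is just the stationary white-noise shock series.

```python
import random

# Sketch: a random walk x(t) = x(t-1) + e(t) is non-stationary, but its
# first difference x(t) - x(t-1) = e(t) recovers the stationary shocks.
random.seed(42)
shocks = [random.gauss(0, 1) for _ in range(500)]

walk = [0.0]
for e in shocks:
    walk.append(walk[-1] + e)  # cumulate shocks: a non-stationary series

diffs = [b - a for a, b in zip(walk, walk[1:])]  # first differences

# Differencing recovers the white-noise shocks (up to float rounding).
print(all(abs(d - s) < 1e-9 for d, s in zip(diffs, shocks)))  # True
```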
Stationary Series (2/2)
A stationary series has:
1. A constant mean
2. A constant variance
3. An autocovariance that does not depend on time
Dickey-Fuller Test of Stationarity
X(t) = Rho * X(t-1) + Er(t)
Rho is the introduced coefficient.
Subtracting X(t-1) from both sides:
X(t) - X(t-1) = (Rho - 1) * X(t-1) + Er(t)
Ho: (Rho - 1) = 0, i.e. the series has a unit root.
If the null hypothesis is rejected, we have a stationary time series.
*Also called the unit root test
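The regression above can be sketched directly. This is a simplified no-intercept version in Python (the helper name and simulated series are illustrative; a real application would use something like R's adf.test, and the t-ratio is compared against Dickey-Fuller critical values, not normal ones):

```python
import math
import random

# Sketch of the Dickey-Fuller regression from the slide:
#   X(t) - X(t-1) = (Rho - 1) * X(t-1) + Er(t)

def df_stat(x):
    """Estimate (Rho - 1) and its t-ratio by least squares (no intercept)."""
    y = [x[t] - x[t - 1] for t in range(1, len(x))]  # first differences
    z = [x[t - 1] for t in range(1, len(x))]         # lagged levels
    slope = sum(a * b for a, b in zip(z, y)) / sum(a * a for a in z)
    resid = [b - slope * a for a, b in zip(z, y)]
    s2 = sum(r * r for r in resid) / (len(y) - 1)
    se = math.sqrt(s2 / sum(a * a for a in z))
    return slope, slope / se

random.seed(1)
ar, walk = [0.0], [0.0]
for _ in range(1000):
    ar.append(0.5 * ar[-1] + random.gauss(0, 1))  # stationary: Rho = 0.5
    walk.append(walk[-1] + random.gauss(0, 1))    # unit root:  Rho = 1

print(round(df_stat(ar)[0], 2))    # near -0.5: reject H0, stationary
print(round(df_stat(walk)[0], 2))  # near 0: cannot reject the unit root
```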


Auto-Regressive Time Series Model
x(t) = alpha * x(t-1) + error(t)
AR(1)
Mean-reverting behaviour
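The mean-reverting behaviour can be seen by tracking the expected path after a large shock: with |alpha| < 1 the process decays geometrically back toward its mean. A small sketch (alpha and the starting value are illustrative, not from the slides):

```python
# Sketch of mean reversion: with |alpha| < 1, the expected path of an AR(1)
# after a large shock decays geometrically back toward its mean (zero here).
alpha = 0.8   # illustrative value
x = 10.0      # start far above the mean
path = [x]
for _ in range(20):
    x = alpha * x  # expected value of the next step, ignoring new shocks
    path.append(x)

print(round(path[5], 3))   # 10 * 0.8**5 = 3.277 (rounded)
print(round(path[20], 3))  # about 0.115: almost back at the mean
```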
Moving Average Time Series Model
x(t) = beta * error(t-1) + error(t)
MA(1)
Noisy pattern
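An MA(1) remembers only one past shock, so its autocorrelation is beta/(1 + beta^2) at lag 1 and roughly zero at longer lags. A simulation sketch (the helper name and beta = 0.6 are illustrative, not from the slides):

```python
import random

# Sketch: an MA(1) process x(t) = beta*error(t-1) + error(t) remembers only
# one past shock, so its autocorrelation cuts off after lag 1.

def autocorr(x, lag):
    m = sum(x) / len(x)
    c0 = sum((v - m) ** 2 for v in x)
    ck = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, len(x)))
    return ck / c0

random.seed(7)
e = [random.gauss(0, 1) for _ in range(5001)]
beta = 0.6
x = [beta * e[t - 1] + e[t] for t in range(1, len(e))]

print(round(autocorr(x, 1), 2))  # theory: 0.6 / (1 + 0.36) = 0.44
print(round(autocorr(x, 2), 2))  # roughly 0
```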
AR Signature
MA Signature
AR or MA? It depends!
Whether a series displays AR or MA behaviour often depends on the extent to which it has been differenced.
An underdifferenced series has an AR signature (positive autocorrelation).
After one or more orders of differencing, the autocorrelation will become more negative and an MA signature will emerge.
Don't go too far: if the series already has zero or negative autocorrelation at lag 1, don't difference again.
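This rule of thumb can be demonstrated by watching the lag-1 autocorrelation as a random walk is differenced once and then, unnecessarily, a second time (illustrative Python sketch; the helper names are not from the slides):

```python
import random

# Sketch: the lag-1 autocorrelation tracks how much differencing is needed.
# A random walk (underdifferenced) shows an AR signature (near +1); one
# difference leaves white noise (near 0); a second, unnecessary difference
# creates an MA signature (near -0.5: overdifferenced).

def lag1_autocorr(x):
    m = sum(x) / len(x)
    c0 = sum((v - m) ** 2 for v in x)
    c1 = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    return c1 / c0

def diff(x):
    return [b - a for a, b in zip(x, x[1:])]

random.seed(3)
walk = [0.0]
for _ in range(4000):
    walk.append(walk[-1] + random.gauss(0, 1))

d1 = diff(walk)
d2 = diff(d1)

print(round(lag1_autocorr(walk), 2))  # near 1: difference it
print(round(lag1_autocorr(d1), 2))    # near 0: stop here
print(round(lag1_autocorr(d2), 2))    # near -0.5: went too far
```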
ARIMA Model
Box-Jenkins Methodology
A series which needs to be differenced to be made stationary is an integrated (I) series.
Lags of the stationarized series are called autoregressive (AR) terms.
Lags of the forecast errors are called moving average (MA) terms.
ARIMA(p, d, q)
p = the number of autoregressive terms
d = the number of nonseasonal differences
q = the number of moving-average terms
The ARIMA filtering box
Diagnostic Checks
With this approach we only test for autocorrelation, using the Ljung-Box statistic.
If there is evidence of autocorrelation, we need to go back to the identification stage and re-specify the model by adding more lags.
A criticism of this approach is that it fails to identify whether the model is too big or over-parameterised; it only tells us if it is too small.
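The Ljung-Box statistic itself is Q = T(T+2) * sum over k = 1..h of r_k^2 / (T - k), where r_k is the k-th residual autocorrelation; under the null of no autocorrelation Q is approximately chi-squared. A sketch (the helper name and simulated residuals are illustrative; 18.31 is the standard 5% chi-squared critical value for 10 degrees of freedom):

```python
import random

# Sketch of the Ljung-Box statistic on a model's residuals:
#   Q = T * (T + 2) * sum_{k=1..h} r_k^2 / (T - k)

def ljung_box(res, h):
    T = len(res)
    m = sum(res) / T
    c0 = sum((v - m) ** 2 for v in res)
    q = 0.0
    for k in range(1, h + 1):
        r_k = sum((res[t] - m) * (res[t - k] - m) for t in range(k, T)) / c0
        q += r_k * r_k / (T - k)
    return T * (T + 2) * q

random.seed(5)
white = [random.gauss(0, 1) for _ in range(500)]  # well-specified residuals
ar = [0.0]
for _ in range(500):
    ar.append(0.7 * ar[-1] + random.gauss(0, 1))  # leftover autocorrelation

print(round(ljung_box(white, 10), 1))  # small: passes the diagnostic check
print(round(ljung_box(ar, 10), 1))     # very large: re-specify the model
```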
Parsimonious Model
The aim of this type of modelling is to produce a model that is parsimonious, or as small as possible, whilst passing the diagnostic checks.
A parsimonious model is desirable because including irrelevant lags in the model increases the coefficient standard errors and therefore reduces the t-statistics.
Models that incorporate large numbers of lags tend not to forecast well, as they fit data-specific features, explaining much of the noise or random features in the data.
Measuring Forecast Accuracy
To determine how accurate a forecast is, the simplest method is to plot the forecast against the actual values as a direct comparison.
In addition, it may be worthwhile to compare the turning points; this is particularly important in finance.
There are a number of methods to determine the accuracy of a forecast, and often more than one is included in a set of results.
Mean Squared Error (MSE)
The MSE statistic can be defined as:

MSE = [1 / (T - (T1 - 1))] * sum from t = T1 to T of (y(t+s) - f(t,s))^2

where
T = total sample size
T1 = first out-of-sample forecast observation
f(t,s) = the s-step-ahead forecast made at time t
y(t+s) = the actual value at time t + s
MSE Example

Steps Ahead   Forecast   Actual   Squared Error
1             0.1        0.15     0.0025
2             0.25       0.20     0.0025
3             0.5        0.40     0.01

MSE = (0.0025 + 0.0025 + 0.01) / 3 = 0.005
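The same calculation in code, mirroring the table above (a Python sketch; the slides' own code is R):

```python
# Computing the slide's MSE example directly from the table values.
forecasts = [0.1, 0.25, 0.5]
actuals = [0.15, 0.20, 0.40]

errors_sq = [(a - f) ** 2 for f, a in zip(forecasts, actuals)]
mse = sum(errors_sq) / len(errors_sq)

print([round(e, 4) for e in errors_sq])  # [0.0025, 0.0025, 0.01]
print(round(mse, 3))                     # 0.005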


Forecast Accuracy
There are a number of other measures used, including:
Mean Absolute Error
Mean Average Prediction Error
Chow's test for predictive failure
Theil's U-statistic (where the forecast is compared to that of a benchmark model)
Root Mean Square Error (the square root of the MSE)
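A sketch computing two of these measures, MAE and RMSE, on the same three-step example used for the MSE slide (illustrative Python, not the slides' own code):

```python
import math

# Sketch: Mean Absolute Error and Root Mean Square Error on the
# three-step-ahead forecasts from the MSE example.
forecasts = [0.1, 0.25, 0.5]
actuals = [0.15, 0.20, 0.40]
errors = [a - f for f, a in zip(forecasts, actuals)]

mae = sum(abs(e) for e in errors) / len(errors)  # Mean Absolute Error
mse = sum(e * e for e in errors) / len(errors)
rmse = math.sqrt(mse)                            # Root Mean Square Error

print(round(mae, 4))   # (0.05 + 0.05 + 0.10) / 3 -> 0.0667
print(round(rmse, 4))  # sqrt(0.005) -> 0.0707
```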
Application of ARIMA model
library(tseries)
library(graphics)
data(AirPassengers)
# Plot the series, its log, and the differenced log
par(mfrow = c(2, 1))
plot(AirPassengers)
plot(log(AirPassengers))
plot(diff(log(AirPassengers)))
# Dickey-Fuller test on the differenced log series
adf.test(diff(log(AirPassengers)), alternative = "stationary", k = 0)
# ACF/PACF to identify the AR and MA orders
acf(log(AirPassengers))
acf(diff(log(AirPassengers)))
pacf(diff(log(AirPassengers)))
# Fit a seasonal ARIMA(0,1,1)(0,1,0)[12] model
fit <- arima(AirPassengers, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 0), period = 12))
# Forecast 24 months ahead with approximate 95% bounds
fore <- predict(fit, n.ahead = 24)
U <- fore$pred + 2 * fore$se
L <- fore$pred - 2 * fore$se
ts.plot(AirPassengers, fore$pred, U, L, col = c(1, 2, 4, 4), lty = c(1, 1, 2, 2))
legend("topleft", c("Actual", "Forecast", "Error Bounds (95% prediction interval)"),
       col = c(1, 2, 4), lty = c(1, 1, 2))
# Diagnostic checks on the residuals
res <- residuals(fit)
acf(res)
pacf(res)
qqnorm(res)
qqline(res)
adf.test(res, alternative = "stationary")
tsdiag(fit)
Box.test(res, lag = 12, type = "Ljung", fitdf = 1)
Conclusion
When using ARIMA models, whether the series is stationary or not determines how stable it is and how much differencing it needs.
The Box-Jenkins methodology is part art and part science.
Forecasting of time series is an important measure of how well a model works.
There are many measures of how accurate a forecast is; usually a number is calculated to determine if the forecast is acceptable, although all have faults.
