AUTOCORRELATION
Autocorrelation (serial correlation) means that the error terms are correlated over time; that is, the error term in one period is influenced by the error terms in previous periods.
This breaks one of the assumptions of OLS; namely that:
Cov(ei, ej) = 0 for i ≠ j
Lags most commonly show up in the error term, as noted above; however, there may be lags in the actual variables as well.
First-order autoregressive errors, AR(1), imply that each error term depends directly only on the error term from the previous period.
We can model this as:
Yt = B1 + B2Xt + et [Eq.1], where the model is AR(1)
et = Pe(t-1) + Vt [Eq.2]; the error term depends on some weighting of the previous
error term plus a random error ‘V’. Assume ‘V’ to have all the assumptions we bestow
upon ‘e’ normally – homoskedasticity, no serial correlation, normality, expected value
of 0.
Furthermore, assume |P| < 1.
By modelling in this way, the properties of the AR(1) model are:
Homoskedastic
E(et) = 0
Cov(et, e(t-k)) = σ²e P^k; this means that Corr(et, e(t-k)) = P^k; hence, the
effect of previous periods diminishes the further back you go.
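As a quick numerical check of the property Corr(et, e(t-k)) = P^k, the sketch below simulates AR(1) errors and compares sample autocorrelations with P^k. The function names, the seed, the sample size and the value P = 0.7 are illustrative assumptions, not part of the slides.

```python
import random

def simulate_ar1_errors(P, n, sigma_v=1.0, seed=0):
    """Simulate e_t = P * e_(t-1) + V_t with V_t ~ N(0, sigma_v^2)."""
    if abs(P) >= 1:
        raise ValueError("the AR(1) properties above require |P| < 1")
    rng = random.Random(seed)
    # Draw the first error from the stationary distribution, whose
    # standard deviation is sigma_v / sqrt(1 - P^2).
    e = [rng.gauss(0.0, sigma_v / (1.0 - P * P) ** 0.5)]
    for _ in range(n - 1):
        e.append(P * e[-1] + rng.gauss(0.0, sigma_v))
    return e

def sample_autocorr(e, k):
    """Sample correlation between e_t and e_(t-k)."""
    n = len(e)
    mean = sum(e) / n
    num = sum((e[t] - mean) * (e[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in e)
    return num / den

if __name__ == "__main__":
    P = 0.7
    errors = simulate_ar1_errors(P, n=20000)
    for k in (1, 2, 3):
        # Columns: lag, sample autocorrelation, theoretical value P^k
        print(k, round(sample_autocorr(errors, k), 3), round(P ** k, 3))
```

With a long simulated series, the sample values should sit close to P, P^2, P^3, shrinking as the lag grows.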
Sources of Autocorrelation:
Inappropriate functional form
Omitted variables
Incorrect time period
Inappropriately filtered data (e.g. seasonal adjustment)
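Effects of Autocorrelation (largely the same as Heteroskedasticity):
OLS is still linear and unbiased, but it is no longer best (it no longer has minimum variance).
The usual estimated standard errors (e.s.e.) are wrong, hence hypothesis tests and confidence intervals based on them are not valid.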
The Road to The Generalized Least Squares Estimator:
It is possible to calculate OLS e.s.e. that are robust to AR(1) errors, but as we know, the
OLS estimator itself will still not necessarily be ‘Best’. Therefore we must concentrate on
employing a better estimation technique.
Substitute Eq.2 into Eq.1:
Yt = B1 + B2Xt + Pe(t-1) + Vt
We now need to eliminate ‘e(t-1)’.
We achieve this by rearranging Eq.1 to get ‘et = Yt – B1 – B2Xt’, lagging it by
one period, and substituting into the above equation:
Yt = B1 + B2Xt + P[Y(t-1) – B1 – B2X(t-1)] + Vt
Yt – PY(t-1) = (1 – P)B1 + (Xt – PX(t-1))B2 + Vt
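This is the transformed model. Because ‘P’ multiplies B1 and B2, it is nonlinear in the parameters and is estimated by nonlinear least squares; however, there are two problems:
We lose the first observation (its variance is different from that of every other observation).
We do not know what ‘P’ is.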
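The transformation Yt – PY(t-1) = (1 – P)B1 + (Xt – PX(t-1))B2 + Vt can be sketched as follows, assuming for the moment that ‘P’ is known. The function names are hypothetical; the sketch only builds the transformed variables and shows that one observation is lost.

```python
def quasi_difference(series, P):
    """Return [z_t - P * z_(t-1)] for t = 2..n; the first observation is lost."""
    return [series[t] - P * series[t - 1] for t in range(1, len(series))]

def transform_model(y, x, P):
    """Build the transformed dependent variable, intercept column and regressor."""
    y_star = quasi_difference(y, P)
    x_star = quasi_difference(x, P)
    # The intercept B1 is now multiplied by (1 - P) in every row.
    const_star = [1.0 - P] * len(y_star)
    return y_star, const_star, x_star

if __name__ == "__main__":
    y = [2.0, 3.0, 5.0]
    x = [1.0, 2.0, 4.0]
    y_star, const_star, x_star = transform_model(y, x, P=0.5)
    print(y_star)      # [2.0, 3.5]
    print(const_star)  # [0.5, 0.5]
    print(x_star)      # [1.5, 3.0]
```

Three observations go in and only two come out, which is the first of the two problems listed above.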
Concerning the two problems:
Estimating ‘P’:
Simple enough, remember that et = Pe(t-1) + Vt
Remember also that et = Yt –B1 – B2Xt, and that this could be lagged.
Hence, substitute the latter into the former to get:
Yt –B1 – B2Xt = P[Y(t-1) –B1 – B2X(t-1)] + Vt
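Apply least squares estimation to this equation and bam, you get an estimate of P. One common way to do this in practice is to replace B1 and B2 with their OLS estimates, which amounts to regressing the OLS residuals on their own first lag.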
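A minimal sketch of estimating ‘P’ from et = Pe(t-1) + Vt follows. It assumes the unobserved errors are replaced by residuals from an OLS fit of Eq.1 (an assumption of this sketch, not stated on the slide) and regresses them on their first lag without an intercept. The function name is hypothetical.

```python
def estimate_P(residuals):
    """Least squares slope from regressing e_t on e_(t-1), no intercept."""
    n = len(residuals)
    if n < 2:
        raise ValueError("need at least two residuals")
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, n))
    den = sum(residuals[t - 1] ** 2 for t in range(1, n))
    if den == 0:
        raise ValueError("lagged residuals are all zero")
    return num / den

if __name__ == "__main__":
    # Each residual is exactly half the previous one, so the estimate is 0.5.
    print(estimate_P([8.0, 4.0, 2.0, 1.0]))  # 0.5
```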
Recovering the first observation:
The problem with the first observation is that it no longer has the same
variance as all the other observations.
Hence, it can be seen as a special case of Heteroskedasticity.
The variance is: Var(e1) = σ²e = σ²v / (1 – P²)
Hence, it can be fixed by using weighted least squares
estimation; multiply the original equation for the first observation by (1 – P²)^½:
$$\sqrt{1 - P^2}\,Y_1 = \sqrt{1 - P^2}\,B_1 + \sqrt{1 - P^2}\,B_2 X_1 + \sqrt{1 - P^2}\,e_1$$
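The sketch below puts the two pieces together: the first observation is scaled by (1 – P²)^½ and every later observation is quasi-differenced, so no observation is lost. This combined transformation is often called the Prais-Winsten transformation; that name, the function name and the example numbers are additions here, not from the slides. Scaling works because Var((1 – P²)^½ e1) = (1 – P²) σ²v / (1 – P²) = σ²v, the same variance as Vt.

```python
import math

def transform_with_first_obs(y, x, P):
    """Return rows (y*, intercept*, x*) for all n observations."""
    if abs(P) >= 1:
        raise ValueError("requires |P| < 1")
    w = math.sqrt(1.0 - P * P)
    # First observation: weighted, not differenced.
    rows = [(w * y[0], w, w * x[0])]
    # Remaining observations: subtract P times the previous period.
    for t in range(1, len(y)):
        rows.append((y[t] - P * y[t - 1], 1.0 - P, x[t] - P * x[t - 1]))
    return rows

if __name__ == "__main__":
    for row in transform_with_first_obs([10.0, 8.0], [5.0, 4.0], P=0.6):
        print(row)
```

Running least squares on these rows (with the intercept column in place of a column of ones) then uses all n observations.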
Detecting Autocorrelation:
3 Main tests (aside from graphical interpretation):
Durbin – Watson
Durbin – h Test
LM Test
Durbin – Watson:
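The test is not valid when the model includes a lagged dependent variable.
Assumes AR(1) errors.
The test statistic is approximately:
d ≈ 2(1 – r), where ‘r’ is the estimated value of ‘P’. So d near 2 suggests no autocorrelation, d near 0 suggests positive autocorrelation, and d near 4 suggests negative autocorrelation.
The Durbin – Watson decision regions are on the next slide. The exact distribution of d
depends on the values of the explanatory variables, so it is very difficult to work with;
instead, tables give lower and upper bounds, dL and dU, which we use to reject or not
reject the null hypothesis of no autocorrelation. It is easier to use p-values: reject if
the p-value < significance level.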
When consulting the tables, ‘K’ is the number of explanatory variables
in the model.
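For reference, a short sketch of computing the statistic from a list of residuals. It uses the standard definition d = Σ(et – e(t-1))² / Σet², which is not written out on the slide, alongside the approximation d ≈ 2(1 – r). The function names are hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    if den == 0:
        raise ValueError("residuals are all zero")
    return num / den

def dw_approximation(r):
    """Approximate d from the estimated autocorrelation r."""
    return 2.0 * (1.0 - r)

if __name__ == "__main__":
    print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 3.0, sign flips every period
    print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # 0.0, residuals never change
    print(dw_approximation(0.0))                  # 2.0, no autocorrelation
```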
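AUTOCORRELATION: DURBIN – WATSON BOUNDS TEST
[Figure: number line for d running from 0 to 4, marked at 0, dL, dU, 2, 4 – dU, 4 – dL and 4. The left-hand side tests H0: no positive autocorrelation; the right-hand side tests H0*: no negative autocorrelation.]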
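The bounds test can be written as a small decision rule over the points marked on the d axis. This is a sketch of the usual textbook regions (reject below dL, inconclusive between dL and dU, and the mirror image above 2); the function name and the bound values in the usage example are made up for illustration and do not come from a Durbin-Watson table.

```python
def dw_bounds_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using tabulated bounds dL and dU."""
    if not 0.0 <= d <= 4.0:
        raise ValueError("d must lie between 0 and 4")
    if d_lower > d_upper:
        raise ValueError("dL must not exceed dU")
    if d < d_lower:
        return "reject H0: evidence of positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4.0 - d_upper:
        return "do not reject: no evidence of autocorrelation"
    if d <= 4.0 - d_lower:
        return "inconclusive"
    return "reject H0*: evidence of negative autocorrelation"

if __name__ == "__main__":
    # dL = 1.2 and dU = 1.4 are placeholder values, not taken from a table.
    for d in (0.5, 1.3, 2.0, 2.7, 3.5):
        print(d, dw_bounds_decision(d, 1.2, 1.4))
```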
Durbin–h Test:
In the case of a lagged dependent variable, one may use an
alternative test, proposed by Durbin:
Lag the dependent variable, and then calculate the test statistic:
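$$h = P\sqrt{\frac{n}{1 - n\,\sigma^2_{\beta_i}}}$$
Where ‘n’ is the number of observations, ‘P’ is the estimated autocorrelation coefficient, and ‘sigma-squared beta-i’ is the variance of the parameter of the
lagged dependent variable.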
Under the null of no autocorrelation, this test statistic is approximately standard NORMALLY distributed in large samples, so the critical value
can be found using a ‘Z’ value (how odd).
You can calculate ‘P’ by using the formula d = 2(1-P), and
substituting in the DW test stat (if given).
You can calculate the variance by simply squaring the estimated
standard error.
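However, the test breaks down when n x (variance of the lagged dependent variable's parameter) ≥ 1, because the term under the square root is then negative or undefined.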
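A sketch of the h calculation, following the steps above: recover ‘P’ from the Durbin-Watson statistic via d = 2(1 – P), square the estimated standard error to get the variance, and guard against the case where the test breaks down. The function name and example numbers are hypothetical.

```python
import math

def durbin_h(d, n, se_lag_coef):
    """Durbin's h from the DW statistic d, sample size n, and the e.s.e.
    of the coefficient on the lagged dependent variable."""
    P = 1.0 - d / 2.0             # from d = 2(1 - P)
    variance = se_lag_coef ** 2   # variance = squared standard error
    denom = 1.0 - n * variance
    if denom <= 0:
        return None               # test breaks down: n x variance >= 1
    return P * math.sqrt(n / denom)

if __name__ == "__main__":
    # Compare |h| with a Z critical value such as 1.96 at the 5% level.
    print(durbin_h(d=1.0, n=100, se_lag_coef=0.05))
    print(durbin_h(d=1.0, n=100, se_lag_coef=0.2))   # None
```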
Breusch-Godfrey LM test:
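Can be used for autocorrelation of higher order than AR(1), and with lagged
dependent variables, so it is the most ‘robust’ of the three tests.
Calculated in a very similar fashion to the LM test for
Heteroskedasticity:
LM = (n – k) x R² ~ χ²(q)
Where ‘n – k’ is the number of sample observations minus the number of
explanatory variables, R² comes from the auxiliary regression below, and ‘q’ is the number of lagged residuals included (q = 1 for AR(1)).
This arises because the LM test estimates the significance of lagged
residuals in an auxiliary regression of the OLS residuals ‘U’ on the original explanatory variables and the lagged residuals:
Ut = a1 + a2Xt + c1U(t-1) + c2U(t-2) + ...
It will test the null that ALL ‘ci’ = 0.
Under the null, the test statistic has a chi-squared distribution with degrees of freedom equal to the number of lagged residuals; with a single lag, it has a single degree of
freedom.
ONLY HOLDS IN LARGE SAMPLES.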
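A sketch of the final step of the LM test for the single-degree-of-freedom case. It assumes the R² from the regression of the residuals on their lag has already been obtained elsewhere, and it uses the identity that a chi-squared variable with one degree of freedom is a squared standard normal to get a p-value from the standard library. The function names and example numbers are hypothetical.

```python
import math

def lm_statistic(n, k, r_squared):
    """LM = (n - k) x R^2, as on the slide."""
    return (n - k) * r_squared

def chi2_pvalue_1df(stat):
    """Upper-tail probability of a chi-squared(1) variable at `stat`."""
    if stat < 0:
        raise ValueError("statistic must be non-negative")
    return math.erfc(math.sqrt(stat / 2.0))

if __name__ == "__main__":
    lm = lm_statistic(n=100, k=2, r_squared=0.05)
    p = chi2_pvalue_1df(lm)
    # Reject the null of no autocorrelation if p < significance level.
    print(round(lm, 3), round(p, 4))
```

Since the chi-squared result only holds in large samples, the p-value is an approximation.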