Escolar Documentos
Profissional Documentos
Cultura Documentos
Class 9
Richard P. Waterman
Wharton
1 / 22
Table of contents I
1
Todays material
Collinearity
Definition
What is sxk (adjusted) ?
Review
May 19, 2016
2 / 22
Last time
3 / 22
Todays material
We will work over the material in Stine, 24.2 & 23.4, BAUR Ch. 5.
Collinearity Stine, 24.2. Chapter 5 in case book
Hypothesis testing for regression slopes Stine, 23.4 and Chapter 5 in
the case book.
4 / 22
Collinearity
Definition: correlation between the X-variables.
Consequence: it is difficult to establish which of the X-variables are
most important in the regression (they all look the same).
Visually the regression plane becomes very unstable (sausage in
space, legs on the table).
Key formula:
1
Multiple regression: SE (bk )
n sxk (adjusted)
Contrast this to simple regression where
1
Simple regression: SE (b1 )
n sx1
The difference is in whether the standard deviation of x is adjusted.
May 19, 2016
5 / 22
6 / 22
7 / 22
Interpretations
8 / 22
0.018157
1
RMSE
1
= 0.1427215
sx1
0.0072256
n
310
Of course, you can see the exact standard error on the output:
0.142952, which is really close to 0.1427215, proving our
approximation is very good.
9 / 22
10 / 22
11 / 22
Putting it together
sd
0.007226
0.002175
std.err slope
0.1429
0.4675
The standard error of the slope has increased because the standard
deviation of x has become much smaller.
Recall that the standard deviation of x is in the denominator of the
standard error formula.
The standard deviation fell from 0.007226 to 0.002175 and the
standard error increased from 0.1429 to 0.4675.
12 / 22
1
1
R 2 (Xk v .X1 ,
, Xk1 )
When all the xs are all uncorrelated the VIFs are all 1 (perfection).
As the collinearity increases so do the VIFs. VIFs above 10 are a
warning signal to take some action.
May 19, 2016
13 / 22
sxk
sxk (adjusted)
2
=
0.007226
0.002175
2
= 11.038.
output:
Note that the approximation (11.038) is close to the true value (11.0678),
so the VIFs interpretation as the ratio of the standard deviations is
justified.
1
You get the VIFs by right clicking in the parameter estimates table, choosing
columns, then VIF
May 19, 2016
14 / 22
leverage plots.
1
1
R 2 (Xk v .X1 ,
, Xk1 )
15 / 22
16 / 22
Recall the regression assumptions. If they are seriously broken, then these
tests could be misleading.
Three flavors. They all test whether slopes are equal to zero or not. They
differ in the number of slopes we are looking at simultaneously.
1
Test a subset of the regression coefficients (more than one, but not
all of them the Partial F-test).
17 / 22
Stine: p.613.
Look for the t-statistic.
The hypothesis test in English: does this variable add any explanatory
power to the model that already includes all the other X-variables?
Small p-value says YES, big p-value says NO.
18 / 22
Stine: p. 611.
It is sometimes called the whole model test.
Look for the F-statistic in the ANOVA table.
The hypothesis test in English: do any of the X-variables in the model
explain any of the variability in the Y-variable?
Small p-value says YES, big p-value says NO.
Note that the test does not identify which variables are important.
If you answer this question as NO then its back to the drawing board
none of your variables are any good!
19 / 22
20 / 22
H1 : at least one of 1 , 2 or 3 6= 0.
21 / 22
Review
22 / 22