Female_i = 1 if female, 0 if male

  Wage_i = β0 + δ0·Female_i + u_i

For males:   Wage_i = β0 + u_i
For females: Wage_i = β0 + δ0 + u_i
Regression for Females
-> sex = 0
Linear regression Number of obs= 12
F( 1, 10) = 2.21
Prob > F = 0.1683
R-squared = 0.2868
Root MSE = .38277
                          Robust
  gpa      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
  hsgpa     .869       .585     1.49   0.168     -.435      2.173
  _cons     .244      2.26      0.11   0.916    -4.785      5.273
Regression for Males
-> sex = 1
Linear regression Number of obs= 22
F( 1, 20) = 15.29
Prob > F = 0.0009
R-squared = 0.2236
Root MSE = .35021
                          Robust
  gpa      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
  hsgpa     .710       .182     3.91   0.001      .331      1.089
  _cons     .691       .711     0.97   0.342     -.791      2.173
GPA Example: Regression with Dummy Variables
regress gpa sex, robust
Linear regression Number of obs = 34
F( 1, 32) = 1.41
Prob > F = 0.2440
R-squared = 0.0443
Root MSE = .40364
                          Robust
  gpa      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
  sex     -.1764      .1486    -1.19   0.244    -.4792      .1264
  _cons    3.540      .1231    28.75   0.000    3.2889     3.7904

Estimated model:

  GPA-hat = 3.540 - 0.1764·SEX,   R² = 0.04
            (0.12)  (0.15)
Interpretation of Regression Coefficients with a
Binary Regressor
Consider a regression of Wage on a constant and Female:

  Wage_i = β0 + δ0·Female_i + u_i

For a male, the regression model is:

  Wage_i = β0 + u_i

β0 is the intercept for male workers.

For a female, the regression model is:

  Wage_i = β0 + δ0 + u_i

β0 + δ0 is the intercept for female workers; δ0 is the shift in the intercept.
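The intercept-shift logic can be checked numerically. A minimal sketch using Python/numpy rather than Stata, on simulated data (not the textbook sample): regressing Wage on a constant and a Female dummy recovers the two group means exactly.

```python
import numpy as np

# Illustration with simulated data: a regression of Wage on a constant
# and a Female dummy recovers the two group means.
rng = np.random.default_rng(0)
n = 1000
female = rng.integers(0, 2, n)                      # Female_i: 1 if female, 0 if male
wage = 12.0 - 2.5 * female + rng.normal(0, 1, n)    # true beta0 = 12, delta0 = -2.5

X = np.column_stack([np.ones(n), female])
beta0_hat, delta0_hat = np.linalg.lstsq(X, wage, rcond=None)[0]

# beta0_hat equals the male mean; beta0_hat + delta0_hat the female mean
assert np.isclose(beta0_hat, wage[female == 0].mean())
assert np.isclose(beta0_hat + delta0_hat, wage[female == 1].mean())
```

With a single binary regressor, OLS is saturated, so the fitted values are exactly the two group means.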
  Wage_i = β0 + δ0·Female_i + β1·Educ_i + u_i

For males:   Wage_i = β0 + β1·Educ_i + u_i
For females: Wage_i = (β0 + δ0) + β1·Educ_i + u_i
Copyright 2009 South-
Western/Cengage Learning
Example: Wage Discrimination
Consider a regression model:

  Wage_i = β0 + δ0·Female_i + β1·Educ_i + β2·Exper_i + u_i

Estimated model:

  Wage-hat = -1.57 - 1.81·Female + 0.572·Educ + 0.025·Exper
             (0.72) (0.26)        (0.049)      (0.012)

  δ0-hat = -1.81

In a simple regression:

  Wage-hat = 7.10 - 2.51·Female
             (0.21) (0.30)
Using Dummy Variables for Multiple Categories
4 groups: married men (MM), married women(MF), single
men (SM) and single women (SF)
Regression model (estimated):

  log(Wage)-hat = 0.321 + 0.213·MM_i - 0.198·MF_i - 0.110·SF_i + 0.079·Educ_i + 0.027·Exper_i
                  (0.100) (0.055)      (0.058)      (0.056)      (0.007)        (0.005)

Which one is the base category?
Example: Effects of Physical Attractiveness on Wage
3 groups: Below average(BA), above average(AA), and
average(A)
Regression model for men (estimated):

  log(Wage)-hat = 0.321 - 0.164·BA_i + 0.016·AA_i + other factors
                  (0.100) (0.046)      (0.033)

Which one is the base category?

Regression model for women (estimated):

  log(Wage)-hat = 0.200 - 0.124·BA_i + 0.035·AA_i + other factors
                  (0.100) (0.066)      (0.049)
Outline
Last Time: What is a dummy variable?
How to interpret coefficients in a regression with
a dummy variable(s)?
Can we show the coefficient on a dummy
variable on a graph?
Today: Interaction terms and heteroskedasticity
Why do we need interaction terms?
3 types of interaction terms
What are the consequences of and the solution
for heteroskedasticity?
Interaction Terms Involving Dummy Variables
Consider a regression model:
  log(Wage)-hat = 0.321 - 0.110·Female_i + 0.213·Married_i - 0.301·Female_i·Married_i + ...
                  (0.100) (0.056)          (0.055)           (0.072)
Interactions Between Independent
Variables: Test Score Example
- Perhaps the effect of class size reduction is bigger in districts
  where many students are still learning English,
- i.e., smaller classes help more if there are many English
  learners, who need individual attention.
- That is, ΔTestScore/ΔSTR might depend on PctEL.
- More generally, ΔY/ΔX1 might depend on X2.
- How do we model such interactions between X1 and X2?
- We first consider binary X's, then continuous X's.
(a) Interactions Between 2 Binary Variables
  Y_i = β0 + β1·D1_i + β2·D2_i + u_i

- D1_i, D2_i are binary.
- β1 is the effect of changing D1 = 0 to D1 = 1. In this specification,
  this effect doesn't depend on the value of D2.
- To allow the effect of changing D1 to depend on D2, include the
  interaction term D1_i·D2_i as a regressor:

  Y_i = β0 + β1·D1_i + β2·D2_i + β3·(D1_i·D2_i) + u_i
Interpreting the Coefficients
  Y_i = β0 + β1·D1_i + β2·D2_i + β3·(D1_i·D2_i) + u_i

General rule: compare the various cases:

  E(Y_i | D1_i = 0) = β0 + β2·D2                 (1)
  E(Y_i | D1_i = 1) = β0 + β1 + β2·D2 + β3·D2    (2)

Subtract (2) - (1):

  E(Y_i | D1_i = 1) - E(Y_i | D1_i = 0) = β1 + β3·D2

- The effect of a change in D1 depends on D2 (what we wanted).
- β3 is the difference in the effect of a change in D1 on Y
  between those who have D2 = 1 and those who have D2 = 0.
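The case comparison can be verified on simulated data. A sketch with illustrative numbers (not a textbook example): fitting the saturated interacted model with numpy, the effect of D1 within each D2 group matches β1 and β1 + β3 exactly, because the fitted values reproduce the four cell means.

```python
import numpy as np

# Two binary regressors plus their interaction: beta3 is the
# difference-in-differences across the four cells. Simulated data.
rng = np.random.default_rng(1)
n = 4000
d1 = rng.integers(0, 2, n)
d2 = rng.integers(0, 2, n)
y = 1.0 + 0.5 * d1 + 0.3 * d2 + 0.7 * d1 * d2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), d1, d2, d1 * d2])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# Effect of D1 within each D2 group, straight from cell means:
eff_d2_0 = y[(d1 == 1) & (d2 == 0)].mean() - y[(d1 == 0) & (d2 == 0)].mean()
eff_d2_1 = y[(d1 == 1) & (d2 == 1)].mean() - y[(d1 == 0) & (d2 == 1)].mean()

assert np.isclose(eff_d2_0, b1)        # beta1: effect of D1 when D2 = 0
assert np.isclose(eff_d2_1, b1 + b3)   # beta1 + beta3: effect when D2 = 1
```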
Example: ln(wage) vs. gender and
completion of a college degree
  Y_i = β0 + βF·DF_i + βC·DC_i + u_i

βF is the effect of being female on wages; βC is the effect of a
college education on wages. This regression does not allow the effect
of obtaining a college degree to depend on gender.

  Y_i = β0 + βF·DF_i + βC·DC_i + βFC·(DF_i·DC_i) + u_i

If βFC is statistically different from zero, then the effect of
education on earnings is gender specific: βFC shows by how much the
wage differential between those with a college degree and those
without one is larger for females relative to males.
Example: TestScore, STR, English learners
Let

  HiSTR = 1 if STR >= 20,  0 if STR < 20
  HiEL  = 1 if PctEL >= 10, 0 if PctEL < 10

  TestScore-hat = 664.1 - 18.2·HiEL - 1.9·HiSTR - 3.5·(HiSTR·HiEL)
                  (1.4)   (2.3)       (1.9)       (3.1)

- Effect of HiSTR when HiEL = 0 is -1.9.
- Effect of HiSTR when HiEL = 1 is -1.9 - 3.5 = -5.4.
- Class size reduction is estimated to have a bigger effect when
  the percent of English learners is large.
- This interaction isn't statistically significant: t = -3.5/3.1 = -1.13.
(b) Interaction between Continuous and
Binary Variables
  Y_i = β0 + β1·D_i + β2·X_i + u_i

- D_i is binary, X is continuous.
- The effect of X on Y (holding D constant) is β2, which does not
  depend on D.
- To allow the effect of X to depend on D, include the interaction
  term D_i·X_i as a regressor:

  Y_i = β0 + β1·D_i + β2·X_i + β3·(D_i·X_i) + u_i
(b) Interaction between Continuous and Binary Variables: 2 Regression Lines

  Y_i = β0 + β1·D_i + β2·X_i + β3·(D_i·X_i) + u_i

- For observations with D_i = 0 (the D = 0 group):

  Y_i = β0 + β2·X_i + u_i                     (the D = 0 regression line)

- For observations with D_i = 1 (the D = 1 group):

  Y_i = β0 + β1 + β2·X_i + β3·X_i + u_i
      = (β0 + β1) + (β2 + β3)·X_i + u_i       (the D = 1 regression line)

- The two regression lines have both different intercepts and
  different slopes.
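A quick numerical check of the two-regression-lines result, on simulated data: fitting the interacted model on the pooled sample reproduces exactly the lines obtained by running OLS separately within each D group.

```python
import numpy as np

# The interacted model fits two lines; pooled estimation matches
# separate within-group regressions. Simulated data, illustrative.
rng = np.random.default_rng(2)
n = 2000
d = rng.integers(0, 2, n)
x = rng.normal(0, 1, n)
y = 1.0 + 0.8 * d + 2.0 * x + 0.5 * d * x + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), d, x, d * x])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

def ols(xv, yv):
    # simple regression of yv on a constant and xv
    A = np.column_stack([np.ones(len(xv)), xv])
    return np.linalg.lstsq(A, yv, rcond=None)[0]

a0, a1 = ols(x[d == 0], y[d == 0])   # D = 0 line
c0, c1 = ols(x[d == 1], y[d == 1])   # D = 1 line

assert np.allclose([a0, a1], [b0, b2])            # intercept b0, slope b2
assert np.allclose([c0, c1], [b0 + b1, b2 + b3])  # intercept b0+b1, slope b2+b3
```

The fully interacted model is algebraically equivalent to estimating the two groups separately, which is why the match is exact.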
[Figure: interaction between continuous and binary variables - the two regression lines]
Interpreting the Coefficients
  Y_i = β0 + β1·D_i + β2·X_i + β3·(D_i·X_i) + u_i

General rule: compare the various cases:

  Y = β0 + β1·D + β2·X + β3·(D·X)                      (1)

Now change X:

  Y + ΔY = β0 + β1·D + β2·(X + ΔX) + β3·[D·(X + ΔX)]   (2)

Subtract (2) - (1):

  ΔY = β2·ΔX + β3·D·ΔX,  so  ΔY/ΔX = β2 + β3·D

- The effect of X depends on D (what we wanted).
- β3 = increment to the effect of X when D = 1.
Example: TestScore, STR, HiEL
(=1 if PctEL > 10)
  TestScore-hat = 682.2 - 0.97·STR + 5.6·HiEL - 1.28·(STR·HiEL)
                  (11.9)  (0.59)     (19.5)     (0.97)

- When HiEL = 0:  TestScore-hat = 682.2 - 0.97·STR
- When HiEL = 1:  TestScore-hat = 682.2 - 0.97·STR + 5.6 - 1.28·STR
                                = 687.8 - 2.25·STR
- Two regression lines: one for each HiEL group.
- Class size reduction is estimated to have a larger effect when
  the percent of English learners is large.
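Folding the HiEL = 1 terms into the intercept and slope is simple arithmetic on the estimates above:

```python
# Estimates from the slide: fold the HiEL = 1 terms into the
# intercept and the slope of the STR line.
intercept_hiel1 = 682.2 + 5.6      # HiEL coefficient shifts the intercept
slope_hiel1 = -0.97 - 1.28         # interaction coefficient shifts the slope

assert round(intercept_hiel1, 1) == 687.8
assert round(slope_hiel1, 2) == -2.25
```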
Example: Testing hypotheses
  TestScore-hat = 682.2 - 0.97·STR + 5.6·HiEL - 1.28·(STR·HiEL)
                  (11.9)  (0.59)     (19.5)     (0.97)

- The two regression lines have the same slope <=> the coefficient on
  STR·HiEL is zero: t = -1.28/0.97 = -1.32.
- The two regression lines have the same intercept <=> the coefficient
  on HiEL is zero: t = 5.6/19.5 = 0.29.
- The two regression lines are the same <=> the population coefficient
  on HiEL = 0 and the population coefficient on STR·HiEL = 0:
  F = 89.94 (p-value < .001)!
- We reject the joint hypothesis but neither individual hypothesis.
  How can this be? The regressors are highly correlated, which gives
  large standard errors on the individual coefficients.
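The "joint F rejects while neither t does" pattern is easy to reproduce. A sketch with two nearly collinear simulated regressors (illustrative data, homoskedasticity-only formulas for simplicity): the joint F statistic is large even though each coefficient is imprecisely estimated, so the individual t statistics are typically small.

```python
import numpy as np

# Two highly correlated regressors: individual SEs blow up, but the
# joint restriction beta1 = beta2 = 0 is clearly rejected.
rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.05, n)      # x2 almost collinear with x1
y = 1.0 + 0.5 * (x1 + x2) + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
k = X.shape[1]
s2 = resid @ resid / (n - k)
cov = s2 * np.linalg.inv(X.T @ X)     # homoskedasticity-only covariance
t1 = b[1] / np.sqrt(cov[1, 1])        # individual t-stats: typically small
t2 = b[2] / np.sqrt(cov[2, 2])

# Restricted model for the joint test: drop x1 and x2 entirely
b_r, *_ = np.linalg.lstsq(np.ones((n, 1)), y, rcond=None)
ssr_r = np.sum((y - b_r[0]) ** 2)
ssr_u = resid @ resid
F = ((ssr_r - ssr_u) / 2) / (ssr_u / (n - k))

assert F > 10.0   # the joint hypothesis is overwhelmingly rejected
```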
(c) Interaction between 2 Continuous
Variables
  Y_i = β0 + β1·X1_i + β2·X2_i + u_i

- X1, X2 are continuous.
- As specified, the effect of X1 doesn't depend on X2.
- As specified, the effect of X2 doesn't depend on X1.
- To allow the effect of X1 to depend on X2, include the interaction
  term X1_i·X2_i as a regressor:

  Y_i = β0 + β1·X1_i + β2·X2_i + β3·(X1_i·X2_i) + u_i
Interpreting the Coefficients:
  Y_i = β0 + β1·X1_i + β2·X2_i + β3·(X1_i·X2_i) + u_i

General rule: compare the various cases:

  Y = β0 + β1·X1 + β2·X2 + β3·(X1·X2)                        (1)

Now change X1:

  Y + ΔY = β0 + β1·(X1 + ΔX1) + β2·X2 + β3·[(X1 + ΔX1)·X2]   (2)

Subtract (2) - (1):

  ΔY = β1·ΔX1 + β3·X2·ΔX1,  or  ΔY/ΔX1 = β1 + β3·X2

- The effect of X1 depends on X2 (what we wanted).
- β3 = increment to the effect of X1 from a unit change in X2.
Example: TestScore, STR, PctEL
  TestScore-hat = 686.3 - 1.12·STR - 0.67·PctEL + .0012·(STR·PctEL)
                  (11.8)  (0.59)     (0.37)       (.019)

The estimated effect of class size reduction is nonlinear because the
size of the effect itself depends on PctEL:

  ΔTestScore/ΔSTR = -1.12 + .0012·PctEL

  PctEL    ΔTestScore/ΔSTR
    0      -1.12
   20      -1.12 + .0012×20 = -1.10
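The marginal-effect column in the table is just this arithmetic on the estimated coefficients:

```python
# Marginal effect of STR implied by the slide's estimates:
# d(TestScore)/d(STR) = -1.12 + 0.0012 * PctEL
def str_effect(pct_el):
    return -1.12 + 0.0012 * pct_el

assert str_effect(0) == -1.12
assert round(str_effect(20), 2) == -1.1
```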
Example: Hypothesis Tests
  TestScore-hat = 686.3 - 1.12·STR - 0.67·PctEL + .0012·(STR·PctEL)
                  (11.8)  (0.59)     (0.37)       (.019)

- Does the population coefficient on STR·PctEL = 0?
  t = .0012/.019 = .06  =>  can't reject the null at the 5% level.
- Does the population coefficient on STR = 0?
  t = -1.12/0.59 = -1.90  =>  can't reject the null at the 5% level.
- Do the coefficients on both STR and STR·PctEL = 0?
  F = 3.89 (p-value = .021)  =>  reject the null at the 5% level(!)
  (Why? High but imperfect multicollinearity.)
Heteroskedasticity and Homoskedasticity
- What?
- Consequences of homoskedasticity
- Implication for computing standard errors
Homoskedasticity
If var(u|X_i) is constant, that is, if the variance of the
conditional distribution of u given X does not depend on X, then u is
said to be homoskedastic. Otherwise, u is heteroskedastic.
Example: Earnings of male and female
college graduates
  Earnings_i = β0 + β1·Male_i + u_i

For women: Earnings_i = β0 + u_i
For men:   Earnings_i = β0 + β1 + u_i

Homoskedasticity: Var(u_i) does not depend on Male_i, i.e. the
variance of earnings is the same for men and for women.

- Equal group variances   = homoskedasticity
- Unequal group variances = heteroskedasticity
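A sketch of the group-variance comparison on simulated data (illustrative numbers, not CPS data): the error standard deviation is made larger for women, so the sample variance of earnings differs across groups, i.e. u is heteroskedastic.

```python
import numpy as np

# Simulated earnings where the error variance differs by group:
# sd 2 for women (male == 0), sd 1 for men (male == 1).
rng = np.random.default_rng(4)
n = 5000
male = rng.integers(0, 2, n)
u = rng.normal(0, np.where(male == 1, 1.0, 2.0), n)
earnings = 15.0 + 3.0 * male + u

var_women = earnings[male == 0].var()
var_men = earnings[male == 1].var()

# Unequal group variances -> heteroskedasticity
assert var_women > var_men
```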
Homoskedasticity in a picture:
- E(u|X=x) = 0 (u satisfies Least Squares Assumption #1)
- The variance of u does not depend on x: u is homoskedastic.
Heteroskedasticity in a picture:
- E(u|X=x) = 0 (u satisfies Least Squares Assumption #1)
- The variance of u does depend on x: u is heteroskedastic
A real-data example from labor economics: average
hourly earnings vs. years of education (Data source:
Current Population Survey)
Heteroskedastic or homoskedastic?
So far we have (without saying so) assumed that
u might be heteroskedastic.
Heteroskedasticity and homoskedasticity concern var(u|X=x).
Because we have not explicitly assumed homoskedastic errors,
we have implicitly allowed for heteroskedasticity.
The OLS estimators remain unbiased, consistent and
asymptotically Normal even when the errors are heteroskedastic.
What if the errors are in fact homoskedastic?
- If Assumptions 1-4 hold and the errors are homoskedastic, OLS
estimators are efficient (have the lowest variance) among all linear
estimators. (Gauss-Markov theorem).
- The formula for the variance of β1-hat under homoskedasticity:

  Var(β1-hat) = σu² / Σ_i (X_i - X̄)²

  Note: SE(β1-hat) = sqrt( Var-hat(β1-hat) )
We now have two formulas for standard errors for β1-hat:

- Homoskedasticity-only standard errors are valid only if the errors
  are homoskedastic.
- The "usual" standard errors, i.e. heteroskedasticity-robust standard
  errors, are valid whether or not the errors are heteroskedastic.
- The main advantage of the homoskedasticity-only standard errors is
  that the formula is simpler. But the disadvantage is that the
  formula is correct only if the errors are homoskedastic.
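The two formulas can be put side by side in code. A sketch using simulated heteroskedastic data and the simple-regression versions of the formulas (an HC0-style robust variance; illustrative, not a replication of Stata's robust option):

```python
import numpy as np

# Simple regression with error variance growing in |x|:
# compare homoskedasticity-only and robust standard errors.
rng = np.random.default_rng(5)
n = 1000
x = rng.normal(0, 1, n)
u = rng.normal(0, 1 + np.abs(x), n)    # heteroskedastic errors
y = 2.0 + 1.5 * x + u

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
uhat = y - X @ b
dx = x - x.mean()

# Homoskedasticity-only: s_u^2 / sum (x_i - xbar)^2
s2 = uhat @ uhat / (n - 2)
se_homo = np.sqrt(s2 / (dx @ dx))

# Robust (HC0): sum (x_i - xbar)^2 uhat_i^2 / (sum (x_i - xbar)^2)^2
se_robust = np.sqrt((dx**2 @ uhat**2) / (dx @ dx) ** 2)

# With variance increasing in |x|, the robust SE is larger here
assert se_robust > se_homo
```

The homoskedasticity-only formula understates the sampling variability in this design because large-|x| observations carry both more leverage and larger errors.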
Practical implications

- The homoskedasticity-only formula for the standard error of β1-hat