
PANEL DATA (Ch. 10)

The recommended exercise questions from the textbook:

Chapter 10: All except (10.6), (10.10).

[1]

What are panel data?

Panel data consist of observations on the same n entities at two or more time periods T. If the data set contains observations on the variables X and Y, then the data are denoted

$(X_{it}, Y_{it})$, $i = 1, \dots, n$ and $t = 1, \dots, T$,

where the first subscript, i, refers to the entity being observed, and the second subscript, t, refers to the date at which it is observed.

Balanced panel vs. unbalanced panel.

• Balanced panel: Variables are observed for each entity and each time period.

• Unbalanced panel: Some data are missing for at least one time period.

We consider the analysis of balanced panels, but the extension to unbalanced panels is straightforward.

[2]

Revisiting Omitted Variables Biases

Issue: Do alcohol taxes help decrease traffic deaths?

Data: fatality.wf1

• 48 U.S. states (excluding Alaska and Hawaii): N = 48.

• 1982-1988: T = 7.

• fatality rate = # of traffic accident deaths per 10,000 people.

• beertax = tax per case of beer ($).

Estimation results for the 1982 data (standard errors in parentheses):

FatalityRate = 2.01 + 0.15 BeerTax
              (0.15)  (0.13)

Estimation results for the 1988 data:

FatalityRate = 1.86 + 0.44 BeerTax
              (0.11)  (0.13)

What is going on here?

Consider a simple multiple regression model (for a given time t):

$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}$, $i = 1, \dots, N$,

where $Z_i$ is a time-invariant regressor.

What do $\beta_1$ and $\beta_2$ measure? $\beta_1$ measures the partial effect of $X_{it}$ on $Y_{it}$ with $Z_i$ held constant. Similarly, $\beta_2$ measures the partial effect of $Z_i$ on $Y_{it}$ with $X_{it}$ held constant.

What if you estimate $Y_{it} = \alpha_0 + \alpha_1 X_{it} + \text{error}_{it}$ instead?

$\hat{\alpha}_1 \xrightarrow{p} \beta_1 + \beta_2 \dfrac{\operatorname{cov}(X_{it}, Z_i)}{\operatorname{var}(X_{it})}$

Each state would have a different level of preference for alcohol (say, $Z_i$ = Pal).

Pal (Z) and BeerTax (X) could be positively related: $\operatorname{cov}(X_{it}, Z_i) > 0$.

Pal (Z) would have a positive partial effect on FatalityRate ($\beta_2 > 0$).

Thus, $\hat{\alpha}_1$ could be positive even if the true $\beta_1$ is negative.
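The omitted variable bias formula above can be checked numerically. The following is a minimal sketch with made-up numbers (the coefficients and the X and Z values are assumptions for illustration, not taken from fatality.wf1). The error term is set to zero so that the in-sample version of the formula holds exactly.

```python
# Illustrative numbers only; none of these values come from fatality.wf1.
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (dividing by the number of observations)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)

beta0, beta1, beta2 = 2.0, -1.0, 3.0   # assumed "true" coefficients
X = [0.2, 0.5, 0.9, 1.4, 2.0]          # hypothetical beer tax
Z = [0.1, 0.4, 0.5, 0.9, 1.1]          # hypothetical Pal, positively related to X

# u is set to zero, so Y is an exact linear function of X and Z.
Y = [beta0 + beta1 * x + beta2 * z for x, z in zip(X, Z)]

alpha1_hat = ols_slope(X, Y)                         # regression that omits Z
predicted = beta1 + beta2 * cov(X, Z) / cov(X, X)    # bias formula

print(alpha1_hat, predicted)
```

In this example the slope from the short regression is positive although the assumed β1 is negative, which is the sign reversal described above.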

How could we control Pal using panel data?


[3]

Panel Data with Two Time Periods

Two equations for 1982 and 1988:

$FatalityRate_{i,1988} = \beta_0 + \beta_1 BeerTax_{i,1988} + \beta_2 Z_i + u_{i,1988}$.

$FatalityRate_{i,1982} = \beta_0 + \beta_1 BeerTax_{i,1982} + \beta_2 Z_i + u_{i,1982}$.

Subtracting the second equation from the first:

$FatalityRate_{i,1988} - FatalityRate_{i,1982} = \beta_1 (BeerTax_{i,1988} - BeerTax_{i,1982}) + (u_{i,1988} - u_{i,1982})$.   (1)

No $Z_i$ in (1)! OLS on (1) will yield a consistent estimator of $\beta_1$.

Actual estimation results for (1), with an intercept included (standard errors in parentheses):

$FatalityRate_{1988} - FatalityRate_{1982} = -0.072 - 1.04\,(BeerTax_{1988} - BeerTax_{1982})$
                                            (0.065)  (0.36)

Comments on the before-and-after estimation results.

• As real beer tax increases by $1 per case, the traffic fatality rate falls by 1.04 deaths per 10,000 people.

This is a big effect, because mean traffic fatality rate is

approximately two.

• This before-and-after approach works well if T = 2. What should we do if T > 2?
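The before-and-after idea can be sketched in a few lines. The numbers below are hypothetical (they are not the fatality.wf1 data), and the errors are set to zero so that the point is easy to see: the time-invariant Z drops out of the differenced equation, and the slope of the differenced regression recovers the assumed β1.

```python
def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

beta0, beta1, beta2 = 2.0, -1.0, 3.0       # assumed "true" coefficients
Z = [0.1, 0.4, 0.5, 0.9, 1.1]              # unobserved, constant over time
X_1982 = [0.2, 0.5, 0.9, 1.4, 2.0]         # hypothetical beer tax in 1982
X_1988 = [0.3, 0.4, 1.2, 1.3, 2.4]         # hypothetical beer tax in 1988

# Outcomes for each year (errors set to zero for clarity).
Y_1982 = [beta0 + beta1 * x + beta2 * z for x, z in zip(X_1982, Z)]
Y_1988 = [beta0 + beta1 * x + beta2 * z for x, z in zip(X_1988, Z)]

# Changes between the two years: beta0 and beta2 * Z cancel out.
dX = [b - a for a, b in zip(X_1982, X_1988)]
dY = [b - a for a, b in zip(Y_1982, Y_1988)]

slope_diff = ols_slope(dX, dY)             # regression of the changes
slope_1988 = ols_slope(X_1988, Y_1988)     # cross-section regression, Z omitted

print(slope_diff, slope_1988)
```

The cross-section slope is contaminated by Z, while the slope from the changes is not.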


[4]

Fixed Effects Regression

(A) A simple regression model:

$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}$, $i = 1, \dots, n$, $t = 1, \dots, T$.   (1)

• Set $\alpha_i = \beta_0 + \beta_2 Z_i$. Then, we have

$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$,   (2)

which is called the “fixed effects regression model.”

• For the i’th cross-sectional entity, the regression line is (2). The slope coefficient $\beta_1$ is the same for all i, but the intercept terms $\alpha_i$ differ across i (while staying constant over time).

• Set:

$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \gamma_3 D3_i + \dots + \gamma_n Dn_i + u_{it}$,   (3)

where $i = 1, \dots, n$, $t = 1, \dots, T$ (nT observations),

$D2_i = 1$ if i is the 2nd entity; $D2_i = 0$ otherwise,

and the other dummy variables $D3_i, \dots, Dn_i$ are similarly defined.

• In (3), $\alpha_1 = \beta_0$, $\alpha_2 = \beta_0 + \gamma_2$, …, $\alpha_n = \beta_0 + \gamma_n$.

• The slope coefficient $\beta_1$ and the n other parameters ($\beta_0, \gamma_2, \dots, \gamma_n$) can be estimated by OLS on model (3).

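• “Entity-demeaned” OLS algorithm

$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$

$\bar{Y}_i = \beta_1 \bar{X}_i + \alpha_i + \bar{u}_i$, where $\bar{Y}_i = \frac{1}{T} \sum_{t=1}^{T} Y_{it}$ (and $\bar{X}_i$, $\bar{u}_i$ are defined similarly).

Subtracting the second equation from the first:

$(Y_{it} - \bar{Y}_i) = \beta_1 (X_{it} - \bar{X}_i) + (u_{it} - \bar{u}_i)$.   (4)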

• OLS estimator of β 1 from (4) = OLS estimator of β 1 from (3).
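The equivalence between the dummy-variable regression (3) and the entity-demeaned regression (4) can be illustrated on a toy panel. The sketch below uses hypothetical data (three entities, three periods) and a small hand-written least squares solver, so the helper names `solve` and `ols` are assumptions for this example, not part of any package.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * m for a, m in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    XtX = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(XtX, Xty)

# Hypothetical balanced panel: entity -> list of (x, y) pairs over T = 3 periods.
panel = {
    "A": [(1.0, 3.1), (2.0, 2.4), (3.0, 1.2)],
    "B": [(2.0, 5.0), (2.5, 4.9), (4.0, 3.1)],
    "C": [(0.5, 1.9), (1.5, 1.2), (1.0, 1.3)],
}
entities = list(panel)

# Model (3): regressors are [1, X, D2, ..., Dn].
rows, y = [], []
for j, e in enumerate(entities):
    for x, yy in panel[e]:
        dummies = [1.0 if j == m else 0.0 for m in range(1, len(entities))]
        rows.append([1.0, x] + dummies)
        y.append(yy)
beta1_dummies = ols(rows, y)[1]

# Model (4): subtract entity means, then regress without an intercept.
xd, yd = [], []
for e in entities:
    xs = [x for x, _ in panel[e]]
    ys = [v for _, v in panel[e]]
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    xd += [x - xbar for x in xs]
    yd += [v - ybar for v in ys]
beta1_demeaned = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(beta1_dummies, beta1_demeaned)
```

The two slope estimates agree up to floating-point rounding. The demeaned version avoids estimating the n intercepts explicitly, which matters when n is large.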

• Least squares assumptions for the fixed effects model:

(FEA.1) $E(u_{it} \mid X_{i1}, X_{i2}, \dots, X_{iT}, \alpha_i) = 0$.

(FEA.2) The data, $(X_{i1}, \dots, X_{iT}, Y_{i1}, \dots, Y_{iT})$, $i = 1, \dots, n$, are a random sample.

(FEA.3) $(X_{it}, \alpha_i)$ have nonzero finite fourth moments: Large outliers are unlikely.

(FEA.4) There is no perfect multicollinearity.

(FEA.5) No autocorrelation: $\operatorname{cov}(u_{it}, u_{is} \mid X_{i1}, \dots, X_{iT}, \alpha_i) = 0$ for all $t \neq s$.

For multiple regressions, $X_{it}$ should be replaced by the full list $X_{1,it}, \dots, X_{k,it}$.

• What happens if (FEA.5) is violated?


(B) Extension to multiple X’s.

The fixed effects regression model is

$Y_{it} = \beta_1 X_{1,it} + \dots + \beta_k X_{k,it} + \alpha_i + u_{it}$,   (5)

where $i = 1, \dots, n$, and $t = 1, \dots, T$.

Equivalently, the fixed effects model can be written as

$Y_{it} = \beta_0 + \beta_1 X_{1,it} + \dots + \beta_k X_{k,it} + \gamma_2 D2_i + \dots + \gamma_n Dn_i + u_{it}$.   (6)

“Entity-demeaned” algorithm

$(Y_{it} - \bar{Y}_i) = \beta_1 (X_{1,it} - \bar{X}_{1,i}) + \dots + \beta_k (X_{k,it} - \bar{X}_{k,i}) + (u_{it} - \bar{u}_i)$.   (7)

OLS estimators of $\beta_1, \dots, \beta_k$ from (7) = OLS estimators of $\beta_1, \dots, \beta_k$ from (6).

(C) Application to Traffic Deaths.

Fixed effects regression results:

FatalityRate = -0.66 BeerTax + StateFixedEffects.
               (0.20)

[5]

Time and Entity Fixed Effects Model

(1) Motivation.

• Return to our FatalityRate example:

$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + \beta_3 S_t + u_{it}$,

where $Y_{it}$ = FatalityRate; $X_{it}$ = BeerTax;

$Z_i$ = time-invariant preferences for alcohol or driving of the people in State i;

$S_t$ = time-specific effects (common to all states) such as overall automobile safety improvements.

(2) Time and Entity Fixed Effects Model:

• Let $B1_t = 1$ if t is the first time period, and $B1_t = 0$ otherwise. Define dummy variables $B2_t, \dots, BT_t$ similarly.

$Y_{it} = \beta_0 + \beta_1 X_{1,it} + \dots + \beta_k X_{k,it} + \gamma_2 D2_i + \dots + \gamma_n Dn_i + \delta_2 B2_t + \dots + \delta_T BT_t + u_{it}$.

• There are many regressors, but we can still get reasonably accurate estimates of $\beta_1, \dots, \beta_k$. The estimates of $\gamma_2, \dots, \gamma_n$ and $\delta_2, \dots, \delta_T$, however, are inaccurate.

(3) Application to traffic deaths

FatalityRate = -0.64 BeerTax + StateFixedEffects + TimeFixedEffects.
               (0.25)
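For a balanced panel with a single regressor, the entity and time dummies can be removed by a two-way demeaning: subtract the entity mean and the period mean, then add back the overall mean. The sketch below uses hypothetical numbers with the errors set to zero, so the assumed β1 is recovered exactly. It is an illustration of the idea, not a description of how Eviews computes the estimates.

```python
beta1 = -0.5                              # assumed "true" slope
alpha = [1.0, 2.5, 4.0]                   # hypothetical entity effects
delta = [0.0, 0.3, -0.2, 0.6]             # hypothetical time effects
x = [                                     # x[i][t]: n = 3 entities, T = 4 periods
    [0.2, 0.5, 0.4, 0.9],
    [1.0, 0.8, 1.5, 1.1],
    [0.3, 0.9, 0.7, 1.6],
]
n, T = len(x), len(x[0])

# Outcome with both kinds of effects (errors set to zero for clarity).
y = [[beta1 * x[i][t] + alpha[i] + delta[t] for t in range(T)] for i in range(n)]

def two_way_demean(v):
    """Subtract entity and period means, add back the grand mean (balanced panel)."""
    rows, cols = len(v), len(v[0])
    entity_mean = [sum(r) / cols for r in v]
    period_mean = [sum(v[i][t] for i in range(rows)) / rows for t in range(cols)]
    grand_mean = sum(entity_mean) / rows
    return [
        [v[i][t] - entity_mean[i] - period_mean[t] + grand_mean for t in range(cols)]
        for i in range(rows)
    ]

xd, yd = two_way_demean(x), two_way_demean(y)
sxy = sum(xd[i][t] * yd[i][t] for i in range(n) for t in range(T))
sxx = sum(xd[i][t] ** 2 for i in range(n) for t in range(T))
beta1_hat = sxy / sxx

print(beta1_hat)
```

This shortcut relies on the panel being balanced. With an unbalanced panel, including the dummy variables directly is the safer route.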

[6]

Drunk Driving Laws and Traffic Death

Would driving laws and economic conditions matter?


• Drinking-age and drunk-driving laws do not matter very much.

• Economic factors are important.

• Specification (4) of Table 10.1 is the base model.

• Average tax = $0.5/case, and average fatality rate = 2 per 10,000 people.

• As the tax increases by $0.5, the fatality rate drops by 0.45×0.5 = 0.225 (per 10,000).

But this result is somewhat imprecise. The 95% confidence interval for the effect of BeerTax is

-0.45 ± 1.96×0.22 = (-0.88, -0.02),

which is quite wide.
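The arithmetic behind these numbers can be reproduced directly. This short sketch uses the rounded coefficient and standard error quoted above and the usual normal critical value of 1.96.

```python
coef = -0.45        # BeerTax coefficient, specification (4), rounded
se = 0.22           # its standard error, rounded
z = 1.96            # 95% critical value of the standard normal

# Effect of a $0.50 tax increase on deaths per 10,000 people.
effect = coef * 0.5

# 95% confidence interval for the BeerTax coefficient.
lower = coef - z * se
upper = coef + z * se

print(effect)                              # -0.225
print(round(lower, 2), round(upper, 2))    # -0.88 -0.02
```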


[7]

Eviews Exercise

(1) Exercise with an artificial panel data set named “artificial_panel.xls.”

There are four variables in the Excel file: “country”, “year”, “y”, and “x”. Each variable has 12 observations, from the 3rd row to the 14th row. The data are artificial numbers for three countries: US, Japan, and Korea. Notice that the variable “country” is alphabetic, not numeric.

STEP 1:

Open artificial_panel.xls using Excel. Then, using your mouse, select the data and copy them.

STEP 2:

Open Eviews. Then, type the following in the Eviews command window (the narrow white window below the File, Edit, Object buttons):

create u 12 (enter)

Then, a workfile window will pop up.

Type the following in the Eviews command window:

alpha country (enter)
data year y x (enter)

The command “alpha” is used to create alphabetic variables, while “data” is for numeric variables.

Then, a spreadsheet will pop up.

Close the window by clicking on the X in the upper-right corner of the window. Eviews will ask you whether you want to delete Untitled Group. Click on the Yes button.

STEP 3:

On the workfile, click on the Show button. Then, a Show window will pop up. Type in the window:

country year y x

Click on OK. Then, a spreadsheet will pop up.

Click on the Edit+/- button and place your cursor on the first cell of the country column. Then press the right button on your mouse and paste the copied data.

Then, you will see that the data from the Excel file are pasted into the spreadsheet.

Close the spreadsheet by clicking on the X in the upper-right corner. Eviews will ask you whether you want to delete Untitled Group. Click on the Yes button.

STEP 4:

On the workfile, push the Save button. Choose the drive and folder where you want to save the file. Choose the file name “artificial_panel.wf1”.

Click on the Save button. Then, a “Workfile Save” window will pop up. Just click on the OK button.

Then, you will be back to the workfile.

STEP 5:

On the workfile, push the Proc button. Choose Structure/Resize Current Page…

Then you will have the Workfile Structure window. Choose Dated Panel. Then, you will have the following screen.

Type 2001 for Start date, 2004 for End date, country for Cross-section ID series, and year for Date series. Then, click on OK.

Then, you will be back to the workfile. Save it!!!

STEP 6:

Push the Objects/New Object button. Choose Equation and choose art_pan as the name of the object. Then, an Equation Estimation window will pop up. Type “y x” in the Equation specification box.

And click on Panel Options.

Choose “Fixed” for Cross-section, “Fixed” for Period, and “White (diagonal)” for Coef covariance method.

By choosing “Fixed” for Cross-section, you are doing regression with dummy variables for individual entities. By choosing “Fixed” for Period, you are adding time dummy variables into regression.

STEP 7:

Choose View/Fixed/Random Effects/Cross-section Effects. Then you will have:

Choose View/Fixed/Random Effects/Period Effects.

Choose View/Fixed/Random Effects Testing/Redundant Fixed Effects.

I found that the F and χ² statistics for the individual dummy variables and the time dummy variables are computed assuming that the error terms in the regression model are homoskedastic over i and t. So, the results are not reliable if the error terms are in fact heteroskedastic. If you would like to test whether time effects are statistically significant, I suggest estimating your model choosing None for Period but including the time dummy variables as regressors.

(2) Exercise with fatality.wf1.

-----------------------------------------------------------------------------------
variable name       variable label
-----------------------------------------------------------------------------------
state               State ID (FIPS) Code
year                Year
spircons            Spirits Consumption
unrate              Unemployment Rate
perinc              Per Capita Personal Income
emppop              Employment/Population Ratio
beertax             Tax on Case of Beer
sobapt              % Southern Baptist
mormon              % Mormon
mlda                Minimum Legal Drinking Age
dry                 % Residing in Dry Counties
yngdrv              % of Drivers Aged 15-24
vmiles              Ave. Mile per Driver
vmilespd            Ave. Mile per 1,000 Driver
breath              Prelim. Breath Test Law
jaild               Mandatory Jail Sentence
comserd             Mandatory Community Service
jailcom             jaild + comserd
allmort             # of Vehicle Fatalities (#VF)
mrall               Vehicle Fatality Rate (VFR) = #VF/Population
vfrall              10,000*mrall = VFR per 10,000 people
allnite             # of Night-time VF (#NVF)
mralln              Night-time VFR (NVFR)
allsvn              # of Single VF (#SVF)
a1517               #NVF, 15-17 year olds
mra1517n            NVFR, 15-17 year olds
a1820               #VF, 18-20 year olds
a1820n              #NVF, 18-20 year olds
mra1820             VFR, 18-20 year olds
mra1820n            NVFR, 18-20 year olds
a2124               #VF, 21-24 year olds
mra2124             VFR, 21-24 year olds
a2124n              #NVF, 21-24 year olds
mra2124n            NVFR, 21-24 year olds
aidall              # of alcohol-involved VF
da18                Dummy variable for drinking age = 18
da19                Dummy variable for drinking age = 19
da20                Dummy variable for drinking age = 20
lincperc            Log of per capita real income
mraidall            Alcohol-Involved VFR
pop                 Population
pop1517             Population, 15-17 year olds
pop1820             Population, 18-20 year olds
pop2124             Population, 21-24 year olds
miles               total vehicle miles (millions)
unus                U.S. unemployment rate
epopus              U.S. Emp/Pop Ratio
gspch               GSP Rate of Change
Dum1982, Dum1983,
Dum1984, ..., Dum1988   Year dummy variables
-----------------------------------------------------------------------------------

Estimation of specification (4) in Table 10.1 (p. 368).

Dependent Variable: VFRALL
Sample: 1982 1988
Cross-sections included: 48
Total panel (balanced) observations: 336
White diagonal standard errors & covariance (d.f. corrected)

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C               -2.327171     1.316419     -1.767804   0.0782
BEERTAX         -0.450272     0.222005     -2.028203   0.0435
DA18             0.027509     0.065473      0.420158   0.6747
DA19            -0.019096     0.039510     -0.483315   0.6293
DA20             0.030875     0.045689      0.675767   0.4998
JAILD            0.012644     0.031940      0.395866   0.6925
COMSERD          0.034135     0.114820      0.297289   0.7665
VMILESPD         0.008226     0.008368      0.983073   0.3264
LINCPERC         1.814889     0.472220      3.843312   0.0002
UNRATE          -0.063043     0.011616     -5.427345   0.0000
DUM1982          0.533926     0.075931      7.031706   0.0000
DUM1983          0.435841     0.070418      6.189300   0.0000
DUM1984          0.246723     0.050392      4.896067   0.0000
DUM1985          0.155325     0.043688      3.555327   0.0004
DUM1986          0.189843     0.040808      4.652090   0.0000
DUM1987          0.087532     0.032452      2.697246   0.0074

Effects Specification
Cross-section fixed (dummy variables)

R-squared             0.939540    Mean dependent var    2.040444
Adjusted R-squared    0.925809    S.D. dependent var    0.570194
Log likelihood        183.8646    F-statistic           68.42532
Durbin-Watson stat    1.733929    Prob(F-statistic)     0.000000

Testing significance of the individual and time dummy variables:

[Estimation choosing “Fixed” for Period and not using the time dummy variables as regressors.]

Redundant Fixed Effects Tests
Equation: MIN
Test cross-section and period fixed effects

Effects Test                         Statistic    d.f.        Prob.
Cross-section F                      44.772106    (47,273)    0.0000
Cross-section Chi-square            727.186063    47          0.0000
Period F                             19.685127    (6,273)     0.0000
Period Chi-square                   120.798386    6           0.0000
Cross-Section/Period F               40.398468    (53,273)    0.0000
Cross-Section/Period Chi-square     732.351587    53          0.0000

Testing significance of the time dummy variables:

[Estimation choosing “None” for Period and using the time dummy variables as regressors.]

Wald Test:
Equation: MIN

Test Statistic    Value       df          Probability
F-statistic       11.46715    (6, 273)    0.0000
Chi-square        68.80287    6           0.0000
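Two numbers in this output can be cross-checked by hand. The sketch below assumes the usual degrees-of-freedom accounting (observations minus all estimated coefficients) and the usual relation between the Wald chi-square and F statistics for q linear restrictions, chi-square = q × F. Both appear to match the reported values, but the exact Eviews formulas are not stated in these notes.

```python
n_obs = 336            # 48 states x 7 years
n_entity_terms = 48    # constant plus 47 state dummy variables
n_time_dummies = 6     # DUM1982, ..., DUM1987
n_regressors = 9       # BEERTAX, DA18, DA19, DA20, JAILD, COMSERD,
                       # VMILESPD, LINCPERC, UNRATE

# Denominator degrees of freedom of the F tests.
df_resid = n_obs - n_entity_terms - n_time_dummies - n_regressors

# Wald test of the q = 6 time dummy coefficients.
q = n_time_dummies
f_stat = 11.46715
chi2_stat = q * f_stat

print(df_resid)              # 273
print(round(chi2_stat, 4))   # 68.8029
```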


Comments on (FEA.5):

What if Assumption (FEA.5) fails, so that $\operatorname{corr}(u_{it}, u_{is} \mid X_{it}, X_{is}, \alpha_i) \neq 0$?

• The OLS panel data estimators of $\beta_1$ are still unbiased and consistent.

• The OLS standard errors will be wrong.

• Use “heteroskedasticity and autocorrelation-consistent standard errors” (clustered standard errors).

• The clustered SE formula is NOT the usual (hetero-robust) SE formula! [Appendix 10.2 (pp. 379 – 381)].

• The clustered SE might not be very accurate if N is small.

• Eviews can compute these! In Eviews, choose “White period” instead of “White (diagonal)”.
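The difference between the two formulas can be seen in a small sketch for the entity-demeaned regression with one regressor. The data are hypothetical, no degrees-of-freedom corrections are applied, and the function names are made up for this example, so the numbers are only meant to show the structure: the hetero-robust formula sums squared products observation by observation, while the clustered formula first sums the products within each entity and then squares.

```python
def within_fit(panel):
    """Entity-demean x and y, return slope, demeaned x, residuals, entity ids."""
    xd, yd, ids = [], [], []
    for entity, obs in panel.items():
        xbar = sum(x for x, _ in obs) / len(obs)
        ybar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            xd.append(x - xbar)
            yd.append(y - ybar)
            ids.append(entity)
    slope = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)
    resid = [b - slope * a for a, b in zip(xd, yd)]
    return slope, xd, resid, ids

def robust_se(xd, resid):
    """Heteroskedasticity-robust SE: each observation treated separately."""
    sxx = sum(a * a for a in xd)
    return sum((a * u) ** 2 for a, u in zip(xd, resid)) ** 0.5 / sxx

def clustered_se(xd, resid, ids):
    """Clustered SE: products x*u are summed within each cluster before squaring."""
    sxx = sum(a * a for a in xd)
    score = {}
    for a, u, g in zip(xd, resid, ids):
        score[g] = score.get(g, 0.0) + a * u
    return sum(s * s for s in score.values()) ** 0.5 / sxx

# Hypothetical balanced panel: entity -> list of (x, y) pairs.
panel = {
    "A": [(1.0, 3.1), (2.0, 2.4), (3.0, 1.2)],
    "B": [(2.0, 5.0), (2.5, 4.9), (4.0, 3.1)],
    "C": [(0.5, 1.9), (1.5, 1.2), (1.0, 1.3)],
}

slope, xd, resid, ids = within_fit(panel)
se_robust = robust_se(xd, resid)
se_cluster = clustered_se(xd, resid, ids)

print(slope, se_robust, se_cluster)
```

If every observation is placed in its own cluster, the clustered formula collapses to the hetero-robust one, which is a quick way to check an implementation.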


Estimation of specification (7) in Table 10.1 (p. 368).

Dependent Variable: VFRALL
Sample: 1982 1988
Cross-sections included: 48
Total panel (balanced) observations: 336
White period standard errors & covariance (d.f. corrected)

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C               -2.327171     1.915400     -1.214979   0.2254
BEERTAX         -0.450272     0.319805     -1.407961   0.1603
DA18             0.027509     0.075267      0.365483   0.7150
DA19            -0.019096     0.053288     -0.358351   0.7204
DA20             0.030875     0.054076      0.570957   0.5685
JAILD            0.012644     0.017699      0.714386   0.4756
COMSERD          0.034135     0.142797      0.239043   0.8113
VMILESPD         0.008226     0.007355      1.118432   0.2644
LINCPERC         1.814889     0.683535      2.655150   0.0084
UNRATE          -0.063043     0.013984     -4.508168   0.0000
DUM1982          0.533926     0.098541      5.418291   0.0000
DUM1983          0.435841     0.091540      4.761205   0.0000
DUM1984          0.246723     0.064103      3.848852   0.0001
DUM1985          0.155325     0.054832      2.832774   0.0050
DUM1986          0.189843     0.042774      4.438265   0.0000
DUM1987          0.087532     0.032445      2.697841   0.0074

Effects Specification
Cross-section fixed (dummy variables)

R-squared             0.939540    Mean dependent var    2.040444
Adjusted R-squared    0.925809    S.D. dependent var    0.570194
Durbin-Watson stat    1.733929    Prob(F-statistic)     0.000000

• Average tax = $0.5/case, and average fatality rate = 2 per 10,000 people.

• As the tax increases by $0.5, the fatality rate drops by 0.45×0.5 = 0.225 (per 10,000).

The 95% confidence interval for the effect of BeerTax is

-0.45 ± 1.96×0.32 = (-1.08, 0.18),

which is wider than (-0.88, -0.02) and now includes zero.
