4 FAQs:
What is the causal relationship of interest?
Descriptive research is less interesting b/c it doesn't answer questions of business or policy
What is the ideal experiment?
Often impossible to actually carry out
Guides the structure/goals of the actual experiment
Helps decide on fruitful research topics: if you can't satisfactorily resolve the question with an ideal experiment, it's a waste of time
Avoid FUQ'd questions (Fundamentally Unidentified Questions)
Example is the effect of school entry date on educational outcomes: older kids do better because they're older, even if they end up worse off at age 20
What is your identification strategy?
Used when data is not generated by a randomized trial, to approximate a real experiment
What is your mode of statistical inference?
Data rarely covers the whole population: how do you extrapolate?
Chapter 2: The Ideal Experiment
Random assignment is the best research design, but randomized trials are very expensive
Correlation doesn't imply causation: do hospitals make people sicker?
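$y_i = y_{0i} + (y_{1i} - y_{0i}) D_i$, where
$y_i$ is the measure of interest, $D_i$ is the treatment indicator for individual $i$ (1 if treated, 0 if not)
$y_{0i}$ is the base outcome for the patient (without the treatment effect)
$y_{1i} - y_{0i}$ is the causal effect of the treatment for the individual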
in real data, selection bias skews results
sick people go to hospitals, healthy people don't
random assignment gets rid of selection bias, letting us use the observed difference in outcomes as a reliable measure of causality
However, randomized trials are difficult and expensive, so most research exploits natural sources of random variation
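The hospital example can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the health scale, the constant effect of 0.5, the logistic selection rule, and the function name `simulate_hospital` are all hypothetical and chosen for illustration.

```python
import numpy as np

def simulate_hospital(n=200_000, effect=0.5, seed=0):
    """Compare naive treated-vs-untreated gaps under self-selection and random assignment."""
    rng = np.random.default_rng(seed)
    # y0: health without treatment (hypothetical scale, higher = healthier)
    y0 = rng.normal(loc=0.0, scale=1.0, size=n)
    y1 = y0 + effect  # constant treatment effect, an assumption of this sketch

    # Self-selection: sicker people (low y0) are more likely to go to hospital
    p_go = 1.0 / (1.0 + np.exp(2.0 * y0))
    d_selected = rng.random(n) < p_go
    # Random assignment: a coin flip unrelated to y0
    d_random = rng.random(n) < 0.5

    def naive_gap(d):
        y = np.where(d, y1, y0)  # observed outcome: y0 + (y1 - y0) * D
        return y[d].mean() - y[~d].mean()

    # Selection bias: difference in untreated outcomes between the two groups
    selection_bias = y0[d_selected].mean() - y0[~d_selected].mean()
    return naive_gap(d_selected), naive_gap(d_random), selection_bias

gap_selected, gap_random, bias = simulate_hospital()
print(f"self-selected gap: {gap_selected:.3f}")
print(f"randomized gap:    {gap_random:.3f}")
print(f"selection bias:    {bias:.3f}")
```

With self-selection the naive gap equals the true effect plus the selection bias, so hospitals can look harmful even though the simulated treatment helps; under the coin flip the gap should land close to the true effect.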
Regression is useful:
if treatment effect is same for everybody: $Y_i = \alpha + \rho D_i + \eta_i$
$\rho$ = treatment effect, $\alpha$ = base outcome ($E(y_{0i})$), $\eta_i$ = random variation from $E(y_{0i})$
selection bias = correlation between the regression error ($\eta_i$) & the regressor ($D_i$)
to find the treatment effect in a random experiment, regress $Y_i$ on $D_i$
Controlling for variables reduces the residual variance, making the estimate more precise
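A minimal sketch of the last two points, on simulated data. The covariate `x`, the coefficients, and the helper `ols` are hypothetical; the standard errors are the conventional homoskedastic ones, which is an assumption of the sketch rather than something the notes specify.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and conventional (homoskedastic) standard errors."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, se

rng = np.random.default_rng(1)
n = 5_000
x = rng.normal(size=n)                    # hypothetical pre-treatment covariate
d = (rng.random(n) < 0.5).astype(float)   # randomly assigned treatment
rho = 2.0                                 # true treatment effect (assumed constant)
y = 1.0 + rho * d + 3.0 * x + rng.normal(size=n)

ones = np.ones(n)
coef_short, se_short = ols(np.column_stack([ones, d]), y)     # regress Y on D only
coef_long, se_long = ols(np.column_stack([ones, d, x]), y)    # add the control

print(f"no controls:  rho_hat={coef_short[1]:.3f}, se={se_short[1]:.3f}")
print(f"with control: rho_hat={coef_long[1]:.3f}, se={se_long[1]:.3f}")
```

Both regressions target the same treatment effect because `d` is randomized; the control only soaks up residual variance, which shows up as a smaller standard error.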
Making Regression Make Sense
Without randomized experiments, regression just makes predictions, can't speak to causality
Predictive power summarized by the Conditional Expectation Function (CEF)
CEF is the population average of $Y_i$ when multiple covariates $X_{ki}$ are held fixed ($E[Y_i | X_i]$)
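$E[Y_i | X_i = x] = \int t \, f_y(t | X_i = x) \, dt$, where $f_y$ is the conditional density of $Y_i$
$E[Y_i] = E[E[Y_i | X_i]]$: that is, the mean of $y_i$ across the population is the unconditional expectation of the CEF (law of iterated expectations)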
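The law of iterated expectations can be checked numerically. A small sketch with a discrete covariate, so that the CEF is just a set of cell means; the data-generating process is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.integers(0, 4, size=n)              # discrete covariate with values 0..3
y = 1.0 + 0.5 * x**2 + rng.normal(size=n)   # hypothetical outcome

# CEF: the mean of y within each cell of x
values, counts = np.unique(x, return_counts=True)
cef = np.array([y[x == v].mean() for v in values])

# Averaging the CEF with weights P(X = x) recovers the unconditional mean
mean_of_cef = np.sum(cef * counts / n)
print(mean_of_cef, y.mean())
```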
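Its importance is that it breaks a random variable into two pieces: the CEF Decomposition Property
$Y_i = E[Y_i | X_i] + \varepsilon_i$
$\varepsilon_i$ is mean independent of $X_i$ ($E[\varepsilon_i | X_i] = 0$)
$\varepsilon_i$ is thus uncorrelated with any function of $X_i$
(use the CEF Decomposition Property, then break apart the terms of the expectation function: $E[\varepsilon_i | X_i] = E[Y_i | X_i] - E[E[Y_i | X_i] | X_i]$; the second term will reduce into $E[Y_i | X_i]$ by the law of iterated expectations)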
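A sketch of the decomposition on simulated data (the outcome equation is hypothetical). The noise spread is made to depend on x on purpose: the residual is mean independent of x, but not fully independent of it.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.integers(0, 4, size=n)
y = np.sin(x) + rng.normal(scale=1.0 + 0.5 * x, size=n)  # noise spread depends on x

cell_means = np.array([y[x == v].mean() for v in range(4)])
cef = cell_means[x]     # E[Y | X] evaluated at each observation
eps = y - cef           # CEF residual

# Mean independence: the residual averages to zero within every cell of x
resid_by_cell = np.array([eps[x == v].mean() for v in range(4)])
# Uncorrelated with functions of x, here x itself and exp(x)
cov_x = np.mean(eps * x)
cov_exp = np.mean(eps * np.exp(x))
# Not independent: the residual variance still changes with x
var_by_cell = np.array([eps[x == v].var() for v in range(4)])

print(resid_by_cell, cov_x, cov_exp)
print(var_by_cell)
```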
The CEF Prediction Property says that the CEF is the best predictor of $Y_i$ given $X_i$ b/c it solves a minimum mean squared error prediction problem
formal statement: $E[Y_i | X_i] = \arg\min_{m(X_i)} E[(Y_i - m(X_i))^2]$
$\arg\min f(m(X_i))$ = the value of $m(X_i)$ for which the function $f(m(X_i))$ is minimized
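To see the prediction property at work, the sketch below compares the mean squared error of three candidate predictors m(x) on simulated data: the cell-mean CEF, a straight line, and a constant. The nonlinear outcome equation is an assumption made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
x = rng.integers(0, 5, size=n)
y = 0.5 * x**2 + rng.normal(size=n)   # nonlinear hypothetical CEF

def mse(pred):
    """Mean squared prediction error of a candidate predictor."""
    return np.mean((y - pred) ** 2)

cef = np.array([y[x == v].mean() for v in range(5)])[x]   # cell means
slope, intercept = np.polyfit(x, y, deg=1)                # best straight line
linear = intercept + slope * x
constant = np.full(n, y.mean())                           # ignores x entirely

mse_cef, mse_linear, mse_const = mse(cef), mse(linear), mse(constant)
print(mse_cef, mse_linear, mse_const)
```

The CEF attains the smallest error of the three; the line and the constant are restricted choices of m(x), so they can only do as well or worse.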
Proof: variance of $Y_i$ is variance of CEF + variance of residual
variance of the residual = $E[\varepsilon_i^2]$, as variance is the expectation of the squared difference from the mean (which is 0 for the residual).
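A quick numerical check of this variance split, again with a discrete covariate and a made-up outcome equation:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
x = rng.integers(0, 3, size=n)
y = 2.0 * x + rng.normal(scale=1.0 + x, size=n)   # hypothetical outcome

cef = np.array([y[x == v].mean() for v in range(3)])[x]
eps = y - cef

var_y = np.var(y)
var_cef = np.var(cef)
var_resid = np.mean(eps ** 2)   # E[eps^2], since the residual has mean 0

print(var_y, var_cef + var_resid)
```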
linear regression is done by finding the $b$ that minimizes the following function:
$\beta = \arg\min_b E[(Y_i - X_i' b)^2]$
($X_i' b$ = $m(X_i)$ from the generalized CEF equation)
When we assume that the expectation of the error across $X_i$ is 0:
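A minimal sketch, assuming the condition meant here is $E[X_i e_i] = 0$ for the error $e_i = Y_i - X_i' b$ (the usual first-order condition of the least squares problem). Under that assumption the solution is $\beta = E[X_i X_i']^{-1} E[X_i Y_i]$, and the code computes its sample analog on hypothetical data.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50_000
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])      # regressors, including a constant
y = 1.0 + 2.0 * x1 + rng.normal(size=n)    # hypothetical data

# Sample analogs of E[X X'] and E[X Y]
Exx = X.T @ X / n
Exy = X.T @ y / n
beta = np.linalg.solve(Exx, Exy)

resid = y - X @ beta
moment = X.T @ resid / n    # sample analog of E[X e], zero by construction

print(beta, moment)
```

The same coefficients come out of `np.linalg.lstsq(X, y, rcond=None)`, which is the numerically safer route in practice.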
useless material outputs
MSW = refuse from municipalities (households, small businesses, institutions)