
Chpt 1

Types of Data:

 Cross sectional
 Time series
 Pooled cross sections (independent cross sections, usually at different times)
 Panel / longitudinal data (dependent cross sections, uses same individuals at different times)

Causality is difficult to determine: does y affect x, or does x affect y?

Ceteris paribus: how does y change when x changes, holding all else equal?

Zero conditional mean assumption: E(u|x)=0

Chpt 2

OLS is used for regression; it minimises the sum of squared residuals.

SST=SSE+SSR

(total) = (explained) + (unexplained)

R2=SSE/SST
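A minimal numerical sketch of the decomposition and of R-squared, using numpy's least-squares solver on simulated data (the data-generating process and all names are illustrative):

```python
import numpy as np

# Simulated data; the data-generating process and names are illustrative.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 1.5 * x + rng.normal(size=100)

X = np.column_stack([np.ones_like(x), x])    # regressor matrix with intercept
beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS: minimises the sum of squared residuals
y_hat = X @ beta
resid = y - y_hat

sst = np.sum((y - y.mean()) ** 2)        # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)    # explained sum of squares
ssr = np.sum(resid ** 2)                 # residual (unexplained) sum of squares

r2 = sse / sst
print(round(sst - (sse + ssr), 6), round(r2, 3))  # decomposition holds; 0 < R^2 < 1
```

The identity SST = SSE + SSR holds exactly (up to rounding) because the model includes an intercept, which makes the residuals orthogonal to the fitted values.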

Log models change the interpretation of coefficients to (approximate) percentage changes.

Gauss-Markov Assumptions:

SLR 1: linear in parameters

SLR 2: random sampling

SLR 3: sample variation in the explanatory variable

SLR 4: zero conditional mean assumption

Parameter estimators are unbiased when SLR 1-4 hold,

ie: the expected value of each estimator equals the true parameter value

Chpt 3

SLR 5: homoskedasticity, Var(u|x) = sigma squared; needed in conjunction with SLR 1-4 for the usual
variance formulas to be valid (expected value of the variance estimator equals the true variance).

The alternative is heteroskedasticity: the variance of the errors changes with x.

Gauss Markov Assumptions for MLR:

MLR 1: linear in parameters

MLR 2: random sampling

MLR 3: No perfect collinearity

MLR 4: Zero Conditional mean E(u|x)=0

MLR 5: homoskedasticity, Var(u|x) = sigma squared

MLR 1-4 needed for unbiasedness of OLS. Under MLR 1-5 the Gauss-Markov theorem applies: OLS
estimators are the best linear unbiased estimators (BLUE).

Partialling out interpretation (Frisch-Waugh theorem): the coefficient on an explanatory variable can
be found by regressing that variable on all the other regressors, then regressing y on the residuals
from that first regression.
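The partialling-out result can be verified numerically; a sketch on simulated data (the data-generating process is illustrative):

```python
import numpy as np

# Frisch-Waugh(-Lovell): the slope from regressing y on the residualised x2
# equals the multiple-regression coefficient on x2. Illustrative data.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # x2 correlated with x1
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2])
beta_full = np.linalg.lstsq(X_full, y, rcond=None)[0]

# Step 1: regress x2 on the other regressors and keep the residuals.
X_other = np.column_stack([ones, x1])
g = np.linalg.lstsq(X_other, x2, rcond=None)[0]
r = x2 - X_other @ g

# Step 2: regress y on those residuals; the slope equals beta_full[2].
b_partial = (r @ y) / (r @ r)
print(round(beta_full[2], 6), round(b_partial, 6))  # the two numbers agree
```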

Chpt 4

Omitted variable bias is usually worse than including unnecessary variables, which only inflate
variance through collinearity. An omitted variable causes no bias only if it is uncorrelated with all
included regressors (or irrelevant to y).

MLR 6: normality of the error term, u ~ N(0, sigma squared)

MLR 1-5 Gauss – Markov assumptions

MLR 1-6 Classical linear model assumptions

MLR 6 is needed for inference on parameters

Chpt 5

Reparameterise the model to test a linear combination of parameters; do not rely on individual t
statistics for a joint hypothesis.

F-test: compare the restricted and unrestricted models.

Chpt 6

So far, MLR 1-4 needed for unbiasedness of the estimators

MLR 1-5 needed for variance formulas

MLR 1-6 needed for inference

However, MLR 6 is not needed in large samples: the estimators are approximately normal by
asymptotics.

Exact percentage change in y in a log-level model: 100*[exp(B1*Δx1) − 1]
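In a log-level model the approximate effect 100·B1 and the exact effect 100·[exp(B1) − 1] diverge as B1 grows; a quick check for a one-unit change (B1 = 0.08 is an illustrative value):

```python
import math

# Approximate vs exact percentage effect in a log-level model (illustrative beta).
beta1 = 0.08
approx = 100 * beta1                    # rough reading: 8% per unit of x
exact = 100 * (math.exp(beta1) - 1)     # exact effect: about 8.33%
print(round(approx, 2), round(exact, 2))  # prints: 8.0 8.33
```

For small coefficients the two readings are close; the gap matters for large B1.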

plim is notation for convergence in probability: Pr(|xn − x| < ε) → 1 as n → ∞

Adjusted R square can be negative

Can predict using reparameterisation


Average partial effects change the interpretation of a coefficient. Replace x in the interaction term
with (x − c), where c is any value of interest, usually the mean.
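A sketch on simulated data showing that centring the interaction at the sample mean makes the coefficient on x1 the partial effect evaluated at mean x2 (names, coefficients, and the data-generating process are illustrative):

```python
import numpy as np

# Centring an interaction term: in the centred model the coefficient on x1
# is the effect of x1 at x2 = mean(x2). Illustrative data.
rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(loc=5.0, size=n)
y = 1.0 + 0.5 * x1 + 0.2 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)

ones = np.ones(n)
# Uncentred model: coefficient on x1 is the effect at x2 = 0 (often uninteresting).
X_raw = np.column_stack([ones, x1, x2, x1 * x2])
b_raw = np.linalg.lstsq(X_raw, y, rcond=None)[0]

# Centred model: replace x2 with (x2 - mean) in the interaction and level terms.
x2c = x2 - x2.mean()
X_cen = np.column_stack([ones, x1, x2c, x1 * x2c])
b_cen = np.linalg.lstsq(X_cen, y, rcond=None)[0]

# Algebraically, b_cen on x1 equals b_raw on x1 plus b_raw on the interaction
# times mean(x2); the interaction coefficient itself is unchanged.
print(round(b_cen[1], 4), round(b_raw[1] + b_raw[3] * x2.mean(), 4))
```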

Reparameterise the model to obtain the standard error of a prediction.

Chpt 7

Dummy variables used for qualitative information.

The Chow test is an application of the F test: it tests whether the coefficients differ across two groups.

Linear probability models arise when y is a dummy variable taking the values 0 or 1. The error term is
always heteroskedastic.

Chpt 8

MLR 5 (homoskedasticity) is needed for the usual variance formulas and hence for standard inference
on the parameters.

Heteroskedasticity-robust standard errors are used to compensate; these standard errors can then be
used for t and F tests.

Breusch Pagan test for heteroskedasticity, regress square of residuals against explanatory variables.
Null hypothesis is homoskedasticity.
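A manual sketch of the LM version of the Breusch-Pagan test (statistic = n·R² from the auxiliary regression, chi-squared under the null) on simulated heteroskedastic data; in practice a statistics package provides this directly, and the data-generating process here is illustrative:

```python
import numpy as np

# Breusch-Pagan by hand: regress squared OLS residuals on the regressors;
# LM = n * R^2 of that auxiliary regression. Illustrative heteroskedastic data.
rng = np.random.default_rng(3)
n = 500
x = rng.uniform(1.0, 5.0, size=n)
u = rng.normal(size=n) * x                 # error variance grows with x
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b

# Auxiliary regression of squared residuals on the explanatory variables.
u2 = resid ** 2
g = np.linalg.lstsq(X, u2, rcond=None)[0]
u2_hat = X @ g
r2_aux = np.sum((u2_hat - u2.mean()) ** 2) / np.sum((u2 - u2.mean()) ** 2)
lm = n * r2_aux
print(round(lm, 2))  # compare to chi-squared(1) critical value 3.84; large => reject
```

The White test follows the same pattern but uses fitted values and their squares as the auxiliary regressors.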

White test for heteroskedasticity, regress square of residuals against fitted values and square of
fitted values. Tests for a broader class of heteroskedasticity but uses more degrees of freedom.

Weighted least squares is used when there is heteroskedasticity and its form is known. If the variance
is sigma squared multiplied by some function h(x), where h(x) > 0 since variances are positive, then
divide the regression by the square root of h(x). Interpret the model as under OLS. If the original
model satisfies MLR 1-4 and Var(u|x) = sigma squared times h(x), the transformed model satisfies
MLR 1-5, so the WLS estimators are BLUE.

As a special case, when each data point is a group average, the variance is sigma squared divided by
the group size m, so h = 1/m. WLS then multiplies each observation of the original model by the
square root of m to remove the heteroskedasticity.

Estimated / feasible GLS is used when h(x) is unknown. Regress the log of the squared residuals on
the explanatory variables and obtain the fitted values g; set h(x) = exp(g). Rerun the regression using
weights 1/h(x), or transform each variable (including the intercept) by dividing by the square root of
h(x).
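A sketch of WLS by transformation when the variance form is taken as known (here h(x) = x by assumption; all names and the data-generating process are illustrative):

```python
import numpy as np

# WLS with known variance form Var(u|x) = sigma^2 * h(x): divide every term,
# including the intercept, by sqrt(h(x)) and run OLS on the transformed model.
rng = np.random.default_rng(4)
n = 400
x = rng.uniform(1.0, 4.0, size=n)
h = x                                       # assumed known: variance proportional to x
y = 1.0 + 2.0 * x + rng.normal(size=n) * np.sqrt(h)

w = 1.0 / np.sqrt(h)
X = np.column_stack([np.ones(n), x])
Xw = X * w[:, None]                         # transformed intercept and slope columns
yw = y * w
b_wls = np.linalg.lstsq(Xw, yw, rcond=None)[0]
print(np.round(b_wls, 2))                   # close to the true values (1, 2)
```

The transformed errors are homoskedastic, so the usual OLS formulas apply to the transformed model.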

If OLS and WLS produce very different estimates, then another assumption (probably ZCM) is likely
violated.

For linear probability model, use robust standard errors.

Chpt 9

Functional form misspecification tests with RESET.

To test whether misspecification is due to omitted logs, create a general model that nests both
specifications and use an F test. This works for any two non-nested models.

RESET test: add the square and cube of the fitted values to the regression. The null hypothesis is no
functional form misspecification, i.e. the added coefficients are zero. However, the test does not show
the source of the misspecification.
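A sketch of RESET on simulated data where the true relationship is quadratic but a linear model is fitted (the data-generating process is illustrative):

```python
import numpy as np

# RESET by hand: fit the candidate model, add yhat^2 and yhat^3, and F-test
# their joint significance. Illustrative data with a quadratic truth.
rng = np.random.default_rng(5)
n = 300
x = rng.uniform(0.0, 3.0, size=n)
y = 1.0 + x ** 2 + rng.normal(size=n)      # true model is quadratic in x

X = np.column_stack([np.ones(n), x])       # misspecified linear model
b = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ b
ssr_r = np.sum((y - yhat) ** 2)            # restricted SSR

Xu = np.column_stack([X, yhat ** 2, yhat ** 3])
bu = np.linalg.lstsq(Xu, y, rcond=None)[0]
ssr_u = np.sum((y - Xu @ bu) ** 2)         # unrestricted SSR

q, k = 2, Xu.shape[1]                      # two restrictions tested
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))
print(round(F, 2))  # large F => reject the null of no misspecification
```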

Proxy variables are used when data cannot be obtained on a variable. A lagged variable can be used
as a proxy.
Measurement error in the response variable is of little concern: the estimators remain unbiased and
consistent (though less precise). Measurement error in an explanatory variable violates ZCM under the
classical errors-in-variables assumptions, and the resulting attenuation bias makes the coefficient
biased towards zero, i.e. smaller in magnitude than the true value.

Complete case analysis: Stata ignores data rows where some variables are missing. If the missingness
is random, there is no bias.

Exogenous sample selection (selection based on x) causes no bias, while endogenous selection (based
on y) does.

Chpt 10

LRP (long-run propensity): the combined effect of all lags after a permanent change in the regressor.
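For a finite distributed lag model y_t = a0 + d0*z_t + d1*z_{t-1} + d2*z_{t-2} + u_t, the LRP is simply the sum of the lag coefficients; a tiny sketch with illustrative coefficients:

```python
import numpy as np

# After a permanent one-unit increase in z, the effect builds up lag by lag
# and settles at d0 + d1 + d2, the long-run propensity. Illustrative values.
d = [0.5, 0.3, 0.1]
effect_path = np.cumsum(d)    # cumulative effect after 0, 1, 2 periods
lrp = effect_path[-1]
print(effect_path)            # builds up to the LRP of 0.9
```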

TS1: linear in parameters

TS2: no perfect collinearity

TS3: ZCM (strict exogeneity: errors uncorrelated with the regressors in every time period;
contemporaneous exogeneity: only with current-period regressors)

TS4: Homoskedasticity

TS5: no serial correlation

TS1-3 needed for unbiasedness of the estimators.

For exponential time trends, log the dependent variable.

Including a time trend as a regressor helps avoid spurious regression when the variables are trending.

Seasonality and time trends are tested with F tests; the null hypothesis is that the corresponding
coefficients are zero.

Chpt 11
Difference-in-differences (DiD) estimator: found as the coefficient on the interaction (treated × post)
term.
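A sketch showing that the interaction coefficient equals the DiD of the four group means (all names, coefficients, and the data-generating process are illustrative):

```python
import numpy as np

# DiD as the coefficient on treated*post in a saturated dummy regression.
# Illustrative pooled cross-section data with a true treatment effect of 1.5.
rng = np.random.default_rng(7)
n = 400
treated = rng.integers(0, 2, size=n).astype(float)
post = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.5 * treated + 0.8 * post + 1.5 * treated * post + rng.normal(size=n)

X = np.column_stack([np.ones(n), treated, post, treated * post])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# Same number from the four group means:
# (treated,post - treated,pre) - (control,post - control,pre).
def group_mean(t, p):
    return y[(treated == t) & (post == p)].mean()

did = (group_mean(1, 1) - group_mean(1, 0)) - (group_mean(0, 1) - group_mean(0, 0))
print(round(b[3], 4), round(did, 4))  # identical: the regression is saturated
```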

ai is the unobserved heterogeneity that is time invariant; it can be removed from longitudinal (panel)
data by differencing.
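A two-period sketch of first-differencing: the fixed effect a_i is correlated with the regressor, but cancels in the difference (the data-generating process is illustrative):

```python
import numpy as np

# First-differencing removes the time-invariant heterogeneity a_i.
# Illustrative two-period panel with a true effect of 2.0.
rng = np.random.default_rng(6)
n = 250
a = rng.normal(size=n) * 3.0               # unobserved fixed effect
x1 = a + rng.normal(size=n)                # period-1 regressor, correlated with a
x2 = a + rng.normal(size=n) + 1.0          # period-2 regressor, correlated with a
y1 = 2.0 * x1 + a + rng.normal(size=n)
y2 = 2.0 * x2 + a + rng.normal(size=n)

dy, dx = y2 - y1, x2 - x1                  # a_i cancels in the difference
X = np.column_stack([np.ones(n), dx])
b = np.linalg.lstsq(X, dy, rcond=None)[0]
print(round(b[1], 2))                      # close to the true effect, 2.0
```

Running OLS on the levels instead would pick up the correlation between x and a_i and be biased.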

Chpt 12
