ECOMET2 Lecture Notes

Notes in ECOMET2
1. Review of Classical Linear Regression Model

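Drivers: independent variables (right-hand side)
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$
$Y_i$: real-world variable; stochastic or random variable
$u_i$: under the condition of uncertainty, we have to expect errors.
Xs: exogenous fixed variables
$i = 1, 2, 3, \dots, n$
$\beta_0$ = intercept; value of Y when all Xs are zero
In matrix form,
$Y = X\beta + u$, with dimensions $(n \times 1) = (n \times (k+1))\,((k+1) \times 1) + (n \times 1)$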
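Objective:
1. To find $\hat{\beta}$ (problem of estimation)
2. To perform inferences on $\hat{\beta}$ (problem of inference)
$E(Y) = \beta_0 + \sum_{j=1}^{k} \beta_j X_j$ (PRF)
$\hat{Y} = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j X_j$ (SRF)
$\beta_j$: marginal contribution of $X_j$ to Y, ceteris paribus.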
Inference covers:
i. Individual (statistical) test
ii. Joint (overall) test
iii. Goodness of fit
Individual test:
$H_0: \beta_j = 0$ vs $H_1: \beta_j\ (<, >, \neq)\ 0$; j = 1, 2, 3, ..., k
Alternative hypothesis = operational statement of our research agenda
Basis of decision: p-value, the true probability of incorrect rejection
$p < 0.05$: reject $H_0$
Joint test:
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (the Xs have nothing to do with Y)
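Interval estimate: $P[\hat{\beta}_j - t_{\alpha/2}\,se(\hat{\beta}_j) \le \beta_j \le \hat{\beta}_j + t_{\alpha/2}\,se(\hat{\beta}_j)] = 1 - \alpha$; j = 1, 2, ..., k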

$R^2$: proportion of the variation in Y collectively explained by the Xs
$\bar{R}^2$ (adjusted $R^2$): remedy for loss of degrees of freedom
Degrees of freedom (df): number of observations left after estimation
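The estimation and goodness-of-fit ideas above can be sketched for the one-regressor case. This is a minimal illustration, not part of the lecture: the function name `ols_simple` and the five data points are made up, and the adjusted $R^2$ uses the usual degrees-of-freedom correction with df = n - k - 1.

```python
def ols_simple(x, y):
    """Fit Y = b0 + b1*X by OLS; report R^2 and adjusted R^2 (illustrative helper)."""
    n = len(x)
    k = 1  # number of slope coefficients
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx                 # slope estimate
    b0 = y_bar - b1 * x_bar        # intercept estimate
    fitted = [b0 + b1 * xi for xi in x]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    r2 = 1 - rss / tss
    df = n - k - 1                 # observations left after estimating k + 1 parameters
    adj_r2 = 1 - (rss / df) / (tss / (n - 1))
    return {"b0": b0, "b1": b1, "r2": r2, "adj_r2": adj_r2, "df": df}


# Hypothetical sample, chosen only to make the arithmetic easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = ols_simple(x, y)
print(result)
```

Adjusted $R^2$ is never larger than $R^2$, which is the sense in which it penalizes the loss of degrees of freedom.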

Assumptions:
1) $E(u) = 0$: vanishing error expectation
2) $E(uu') = \sigma^2 I$: zero off-diagonal elements mean no autocorrelation
3) $var(u_i) = \sigma^2$ (constant variances): homoscedasticity
4) $E(u_i X_{ji}) = 0$: exogeneity assumption on the Xs
5) X has full column rank; the Xs should not be perfectly correlated (non-multicollinearity)
6) u is multivariate normal (MVN)
If assumptions 1-6 are valid, then by the Gauss-Markov theorem OLS is BLUE (the theorem itself needs only 1-5; normality supports exact t and F inference)

Unbiased, consistent, sufficient, and efficient
Limited Dependent Variable Models
CLRM requires Y to be quantitative (continuous),
measured on an interval or ratio scale
Binary Response Models (Y is dummy)
Differential intercept: right hand-side; dummy: left hand side
o LPM (Linear probability model)
o Logit model
o Probit model
Multinomial Response Model (Y is multinomial)
3 or more response categories
o Multinomial Logit model
o Multinomial Probit model
Nominal scale of measurement
Ordered Response Model (Y is ordinal)
o Ordered Logit model
o Ordered Probit model
Censored Response Model
(selectivity bias)
o Tobit model
o Heckit model
Discrete Dependent Variable Models
o Count models
Poisson model
Negative binomial model
o Duration models
Binary Response Models
$Y_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} + u_i$
$Y_i = 0$: failure; $Y_i = 1$: success; i = 1, 2, 3, ..., n
$P_i = \Pr(Y_i = 1 \mid X_{1i}, X_{2i}, \dots, X_{ki})$
The probability that success is attained
LPM: OLS even though Y is a dummy
Consequences: $\hat{P}_i = \hat{Y}_i$ (predicted probability)
ANCOVA: not all variables on the right-hand side are dummies
ANOVA: all variables on the right-hand side are dummies
$u_i$ is not normal
Bernoulli: only 1 trial; violates normality
$E(Y_i \mid X) = P_i$: the conditional mean of Y is the success probability
$var(u_i) = P_i(1 - P_i)$: heteroscedastic
$R^2$ is not a reliable measure of goodness of fit
$\hat{P}_i$ is not sure to fall within [0, 1]
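The last LPM consequence is easy to see numerically. The sketch below fits a straight line by OLS to a made-up binary outcome (the data and the helper name `fit_line` are illustrative assumptions) and lists the fitted "probabilities" that leave the unit interval.

```python
def fit_line(x, y):
    """OLS intercept and slope for a single regressor (illustrative helper)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: failures at low X, successes at high X.
x = list(range(10))
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

b0, b1 = fit_line(x, y)
p_hat = [b0 + b1 * xi for xi in x]            # LPM predicted probabilities
outside = [round(p, 3) for p in p_hat if p < 0 or p > 1]
print("fitted:", [round(p, 3) for p in p_hat])
print("outside [0, 1]:", outside)
```

Both tails of the fitted line escape [0, 1], which is one motivation for the logit and probit links that follow.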
$Z_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}$ [index function]
Score of the ith observation (utility score)
*The bigger the Z, the more probable the success
*The smaller the Z, the more probable the failure
$P_i = \int_{-\infty}^{Z_i} f(t)\,dt$, where f(t) is the link function
As Z becomes smaller, P also becomes smaller
Logistic Model: logistic cumulative distribution function
Logistic link (probability density function):
$f(t) = \dfrac{e^{-t}}{(1 + e^{-t})^2}$; $|t| < \infty$ => will result in the Logit Model
Standard normal link ($t \sim N(0, 1)$):
$f(t) = \dfrac{1}{\sqrt{2\pi}}\, e^{-t^2/2}$; $|t| < \infty$
For the logistic link:
$P_i = \int_{-\infty}^{Z_i} \dfrac{e^{-t}}{(1 + e^{-t})^2}\,dt$
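Let $u = 1 + e^{-t}$, so $du = -e^{-t}\,dt$
$P_i = -\int u^{-2}\,du = \left[u^{-1}\right]_{t=-\infty}^{t=Z_i} = \dfrac{1}{1 + e^{-Z_i}} = \dfrac{e^{Z_i}}{1 + e^{Z_i}}$
$1 - P_i = \dfrac{1}{1 + e^{Z_i}}$
$\dfrac{P_i}{1 - P_i} = e^{Z_i}$
$L_i = \ln\left(\dfrac{P_i}{1 - P_i}\right) = Z_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}$
$L_i$ = logit (natural logarithm of the odds ratio, i.e. the ratio of success probability to failure probability)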

Odds ratio also measures the probability that Y=1 relative to the probability that Y=0
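A short numerical check of the chain index -> probability -> odds -> logit may help. This is a sketch with hypothetical function names; it only uses the formulas $P = 1/(1 + e^{-Z})$ and $L = \ln(P/(1-P))$ from the derivation above.

```python
import math


def logistic_cdf(z):
    """P = 1 / (1 + exp(-Z)): maps any real index Z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """L = ln(P / (1 - P)): natural log of the odds."""
    return math.log(p / (1.0 - p))


for z in (-2.0, 0.0, 2.0):
    p = logistic_cdf(z)
    odds = p / (1.0 - p)          # equals exp(z)
    print(z, round(p, 4), round(odds, 4), round(logit(p), 4))
```

The logit of the probability returns the original index, so the model is linear in the log-odds even though it is nonlinear in the probability.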
Estimation using Maximum Likelihood Estimation
- estimating the betas (parameters) in such a way that the likelihood of the sample (joint probability) is maximized
1. get the 1st derivative of the likelihood function and set it to zero
2. get the 2nd derivative to confirm a maximum: the Hessian needs to be negative definite
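The two steps above can be sketched with Newton-Raphson on the logit log-likelihood for a single regressor. Everything here is an illustrative assumption: the function names, the ten data points, and the fixed number of iterations. The gradient is the first derivative; the matrix built from `h00`, `h01`, `h11` is the negative of the Hessian, so a positive determinant (with `h00 > 0`) means the Hessian is negative definite.

```python
import math


def logit_gradient(b0, b1, x, y):
    """Fitted probabilities and first derivatives of the log-likelihood."""
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
    return p, g0, g1


def fit_logit(x, y, iterations=25):
    """Newton-Raphson for a one-regressor logit (illustrative, no safeguards)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p, g0, g1 = logit_gradient(b0, b1, x, y)
        w = [pi * (1.0 - pi) for pi in p]
        # Entries of the negative Hessian (information matrix).
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(information) * gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical, non-separable sample.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]
b0_hat, b1_hat = fit_logit(x, y)
_, g0, g1 = logit_gradient(b0_hat, b1_hat, x, y)
print(round(b0_hat, 4), round(b1_hat, 4), g0, g1)
```

At the solution both gradient components are numerically zero, which is the first-order condition in step 1.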
$\hat{L}_i = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j X_{ji}$ (sample regression function)
$\hat{Z}_i = \hat{L}_i$ by maximum likelihood estimation
*test of joint hypothesis using F-statistic
Binary Response Models
Y = 0, 1 [failure, success]
$\hat{Z}_i = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j X_{ji}$ [index function]
$P_i = \Pr(Y_i = 1 \mid Z_i)$: probability that success is attained
LPM (OLS): Y is a dummy
Logit (logistic link)
Probit (standard normal link)
*The link function is a probability density function whose integral links the index from $(-\infty, \infty)$ to [0, 1]
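To see how the two links map the same index into [0, 1], the sketch below evaluates the logistic CDF and the standard normal CDF side by side. The function names are illustrative; the normal CDF is written with `math.erf`, using the identity $\Phi(z) = \tfrac{1}{2}\,[1 + \mathrm{erf}(z/\sqrt{2})]$.

```python
import math


def logistic_cdf(z):
    """Logit link: integral of the logistic density up to z."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z):
    """Probit link: area under the standard normal curve up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))
```

Both functions rise from 0 to 1 and cross 0.5 at Z = 0; the logistic curve has heavier tails, so it approaches 0 and 1 more slowly.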
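Logistic pdf for the logit model
$P_i = \int_{-\infty}^{Z_i} f(t)\,dt$
$f(t) = \dfrac{e^{-t}}{(1 + e^{-t})^2}$ where $-\infty < t < \infty$
$P_i = \dfrac{e^{Z_i}}{1 + e^{Z_i}}$ (Logit Model)
$\ln\left(\dfrac{P_i}{1 - P_i}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}$
The logit of the individual is a linear function of the individual's characteristics
Solved via MLE (there is already an estimation)
$\hat{L}_i = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j X_{ji} = \hat{Z}_i$
The estimated logit is a linear function of the individual's characteristics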
Used in forecasting:
If $\hat{P}_i \ge 0.5$: success -> $\hat{Y}_i = 1$
If $\hat{P}_i < 0.5$: failure -> $\hat{Y}_i = 0$
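$\dfrac{\partial P_i}{\partial X_{ji}}$ = marginal contribution of $X_{ji}$ to $P_i$, ceteris paribus [one for each j and i]
We will be able to determine the role of each success factor for the ith individual
$\ln\left(\dfrac{P_i}{1 - P_i}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}$ (implicit differentiation)
$\dfrac{1 - P_i}{P_i} \cdot \dfrac{(1 - P_i)\dfrac{\partial P_i}{\partial X_{ji}} - P_i(-1)\dfrac{\partial P_i}{\partial X_{ji}}}{(1 - P_i)^2} = \beta_j$
$\dfrac{1}{P_i(1 - P_i)} \cdot \dfrac{\partial P_i}{\partial X_{ji}} = \beta_j$
$\dfrac{\partial P_i}{\partial X_{ji}} = \beta_j P_i (1 - P_i)$; j = 1, 2, 3, ..., k; i = 1, 2, 3, ..., n marginal effects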
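The result $\partial P_i / \partial X_{ji} = \beta_j P_i (1 - P_i)$ can be checked against a numerical derivative. The coefficients below (`b0 = -1.0`, `b1 = 0.8`) and the function names are hypothetical values chosen for illustration only.

```python
import math


def logit_probability(b0, b1, x):
    """P = exp(Z) / (1 + exp(Z)) with Z = b0 + b1 * x."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(b0, b1, x):
    """Analytical marginal effect: b1 * P * (1 - P)."""
    p = logit_probability(b0, b1, x)
    return b1 * p * (1.0 - p)


b0, b1 = -1.0, 0.8      # hypothetical estimates
step = 1e-6
for x in (0.0, 1.25, 5.0):
    analytic = logit_marginal_effect(b0, b1, x)
    numeric = (logit_probability(b0, b1, x + step)
               - logit_probability(b0, b1, x - step)) / (2 * step)
    print(x, round(analytic, 6), round(numeric, 6))
```

The effect is largest where P = 0.5 (here at x = 1.25, where it equals b1 / 4) and shrinks in the tails, which is why the slope is not constant.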
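Probit Model: standard normal cumulative distribution function
Link function:
$f(t) = \dfrac{1}{\sqrt{2\pi}}\, e^{-t^2/2}$; $-\infty < t < \infty$
The higher the score, the higher the probability of success
*The slope gives the change in probability given a unit change in X (not a constant slope)
Loans approved not depending on default
$P_i = \Phi\left(\beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}\right)$: probit model, where $P_i$ is the area under the normal curve
Use MLE to estimate the betas
$\hat{Z}_i = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j X_{ji}$
$\hat{P}_i = \Phi(\hat{Z}_i)$; i = 1, 2, ..., n
Marginal effects: $\dfrac{\partial \hat{P}_i}{\partial X_{ji}} = \hat{\beta}_j \dfrac{1}{\sqrt{2\pi}}\, e^{-\hat{Z}_i^2/2} = \hat{\beta}_j f(\hat{Z}_i)$
Censored Regression Model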
Tobit Model (James Tobin)
Model:
$Y_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} + u_i$; i = 1, 2, 3, ..., n
$n = n_1 + n_2$ where:
$n_1$: observations on both Y and the Xs
$n_2$: observations only on the Xs (censored sample)
E.g. the amount spent on a car by the ith family is linearly dependent on the socio-economic status of the family ($X_{ji}$ = jth socio-economic variable)
-> OLS will result in biased and inconsistent estimates
Bias = sample selectivity bias; the solution is the procedure developed by James Tobin via sampling from a censored normal distribution.
*wage of workers = censored variable
- major substitute for OLS
Given: $(Y_i, X_{1i}, \dots, X_{ki})$; i = 1, 2, ..., n
Likelihood $L$: built from the marginal density function of the Y and Xs with respect to Y.
FOC: $\dfrac{\partial L}{\partial \beta_j} = 0$; j = 1, 2, ..., k
$\hat{\beta}_j = f(\text{sample})$; j = 1, 2, ..., k
SOC: $\dfrac{\partial^2 L}{\partial \beta^2} < 0$; Hessian
Heckit Model: James Heckman
- alternative to the tobit model in addressing the selectivity bias of a censored sample
Two-stage process:
1. Selection stage: implementation of the probit model
$Z_i = 0, 1$ (no, yes)
Index function is the probit model
$\hat{Z}_i$ for each i.
2. Consumption stage: $Y_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} + \theta \lambda_i + u_i$ where
$\lambda_i = \dfrac{\phi(\hat{Z}_i)}{\Phi(\hat{Z}_i)}$ = inverse Mills ratio
Truncated Regression Models
No observation on Y = no observation on the Xs; truncated: effectively taken out from the selection process.
Element of a random selection = inference
MLE via sample selection from a truncated normal
Multinomial Logit Model
$Y_i = j$; j = 1, 2, 3, ..., m
$P_i(j) = \Pr(Y_i = j \mid X_{1i}, X_{2i}, \dots, X_{ki})$; j = 1, 2, ..., m-1 (to avoid the dummy variable trap)
- Link between the Xs and a probability from 0 to 1.
Link: $Z_i(j) = \beta_{0j} + \sum_{l=1}^{k} \beta_{lj} X_{li}$
Link function is the logistic distribution
Link function for MLM
$P_i(j) = \dfrac{e^{Z_i(j)}}{1 + \sum_{h=1}^{m-1} e^{Z_i(h)}}$ where j = 1, 2, ..., m-1
$P_i(m) = \dfrac{1}{1 + \sum_{h=1}^{m-1} e^{Z_i(h)}}$; base category
Relative risk ratio (RRR) = $\dfrac{P_i(j)}{P_i(m)} = e^{Z_i(j)}$
- The probability of being in the jth category relative to the reference/base category
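The multinomial logit probabilities and the relative risk ratio can be sketched as follows. The index values `[0.5, -1.0]` are hypothetical scores for two non-base categories (so m = 3); the function name is illustrative.

```python
import math


def mnl_probabilities(z):
    """z: index values Z_i(j) for the m - 1 non-base categories.

    Returns the list of non-base probabilities and the base-category probability.
    """
    denom = 1.0 + sum(math.exp(v) for v in z)
    probs = [math.exp(v) / denom for v in z]
    base = 1.0 / denom
    return probs, base


probs, base = mnl_probabilities([0.5, -1.0])   # hypothetical index values
rrr = [p / base for p in probs]                # relative risk ratios vs the base
print([round(p, 4) for p in probs], round(base, 4))
print([round(r, 4) for r in rrr])
```

The probabilities over all m categories sum to one, and each relative risk ratio collapses to $e^{Z_i(j)}$ because the common denominator cancels.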
Panel Data Econometric Models

This represents real world phenomena(data in two dimensions)
Panel data: combination of cross-section and time-series data.
$Y_{it}$ = tth time-series observation for the ith individual on the Y variable
i: space dimension indicating microunits
t: time dimension; period in which Y is observed (behaves like a time series)
e.g. effect of education on income with data across individuals
i = 1, 2, 3, ..., n
t = 1, 2, 3, ..., T
unbalanced panel = missing observations
Advantages of using panel data
1. It allows us to account for unobserved heterogeneity (or effects) of cross-section or time-series observational units, which when neglected will result in OVB (Omitted Variable Bias)
2. more information, more variability, more degrees of freedom, less collinearity and more efficiency
3. allows us to model the dynamics of change (structural change)
4. provides a basis for modelling more sophisticated phenomena which pure CS or TS models cannot handle
5. mitigates aggregation bias
Panel Data Models
1. Fixed Effects Models
Assumption: effects are fixed parameters to be estimated
a. Naive model: all parameters are fixed (time and space invariant)
$Y_{it} = \beta_0 + \sum_{j=1}^{k} \beta_j X_{jit} + u_{it}$; i = 1, 2, ..., n and t = 1, 2, ..., T
df = nT - k
b. LSDV Models
i) Model 1: intercept is time invariant and slopes are fixed
$Y_{it} = \beta_{0i} + \sum_{j=1}^{k} \beta_j X_{jit} + u_{it}$
$\beta_{0i} = \beta_0 + \alpha_i$; $\alpha_i$ is the animal spirit (unobserved heterogeneity)
ii) Model 2: intercept is time varying and slopes are fixed
$Y_{it} = \beta_{0t} + \sum_{j=1}^{k} \beta_j X_{jit} + u_{it}$
$\beta_{0t} = \beta_0 + \delta_t$; $\delta_t$ is the animal spirit of the time period shock
iii) Model 3: intercept varies over time and space and slopes are fixed
$Y_{it} = \beta_{0it} + \sum_{j=1}^{k} \beta_j X_{jit} + u_{it}$
Where: $\beta_{0it} = \beta_0 + \alpha_i + \delta_t$; for every time period there is one delta
iv) Model 4: all parameters are time invariant; the slopes and intercepts vary across individual microunits.
$Y_{it} = \beta_{0i} + \sum_{j=1}^{k} \beta_{ji} X_{jit} + u_{it}$
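A small sketch of why unobserved heterogeneity matters, under made-up data: three individuals with different intercepts and a common slope of 2. The within (demeaning) transformation is used here as a stand-in for LSDV Model 1, since demeaning by individual yields the same slope as including one dummy per individual. The function names and the data are illustrative assumptions.

```python
def pooled_slope(panel):
    """Naive model: one intercept and one slope for all observations."""
    obs = [pair for series in panel.values() for pair in series]
    x_bar = sum(x for x, _ in obs) / len(obs)
    y_bar = sum(y for _, y in obs) / len(obs)
    num = sum((x - x_bar) * (y - y_bar) for x, y in obs)
    den = sum((x - x_bar) ** 2 for x, _ in obs)
    return num / den


def within_slope(panel):
    """Fixed effects slope: demean X and Y within each individual first."""
    num = 0.0
    den = 0.0
    for series in panel.values():
        x_bar = sum(x for x, _ in series) / len(series)
        y_bar = sum(y for _, y in series) / len(series)
        for x, y in series:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


# Hypothetical panel: y = alpha_i + 2 * x with alpha = 10, 0, -10.
effects = {"A": 10.0, "B": 0.0, "C": -10.0}
x_values = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}
panel = {i: [(x, effects[i] + 2.0 * x) for x in x_values[i]] for i in effects}

print("pooled:", pooled_slope(panel))   # distorted by the omitted individual effects
print("within:", within_slope(panel))   # recovers the common slope
```

Because the individual effects are correlated with X in this example, the pooled slope even has the wrong sign, while the within estimator recovers the true value.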
Simultaneous Equations Model (SEM)
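- Economic models for econometric data determined by 2 or more economic relationships
multi-equation model
- 1 equation per sector
- 1 sector will be represented by 1 endogenous variable
M sectors (or M endogenous variables)
K predetermined variables (exogenous + lagged endogenous)
$Y_i = f(Y\text{s}, \text{lagged } Y\text{s}, \text{other } X\text{s}) + u$
CLRM: $X_1, X_2, \dots, X_k$; i = 1, 2, 3, ..., n
SEM: Ys, lagged Ys, other Xs; i = 1, 2, 3, ..., n
Notations: (time based) (t = 1, 2, 3, ..., T) time series
1. $Y_t$ = endogenous variables (M endogenous variables in the SEM)
2. predetermined variables (X)
2 types:
1. truly exogenous variables
2. lagged Ys
K -> predetermined variables
$B Y_t + \Gamma X_t = u_t$
Dimensions: $B$ is $M \times M$; $Y_t$ is $M \times 1$; $\Gamma$ is $M \times K$; $X_t$ is $K \times 1$; $u_t$ is $M \times 1$
$B, \Gamma$ = structural parameters
For i = 1 (first equation):
$\beta_{11} Y_{1t} + \beta_{12} Y_{2t} + \dots + \beta_{1M} Y_{Mt} + \gamma_{11} X_{1t} + \gamma_{12} X_{2t} + \dots + \gamma_{1K} X_{Kt} = u_{1t}$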
Simultaneous bias (SB)
- use of OLS when the RHS variables include Ys
- OLS is biased and inconsistent
$B Y_t + \Gamma X_t = u_t$
$Y_t$: vector of endogenous variables
M = # of endogenous variables in the SEM
$X_t$: vector of predetermined variables
K = # of predetermined variables
Simultaneous bias -> OLS is inconsistent in the presence of endogenous variables among the RHS variables
Reduced form model (RFM): $Y_t = \Pi X_t + v_t$
OLS is BLUE for RFM
problem of identification
per equation:
$\Pi = -B^{-1}\Gamma$; $v_t = B^{-1} u_t$
$\pi_{ij} = \dfrac{\partial Y_{it}}{\partial X_{jt}}$ at t = 1, 2, ..., n; j = 1, 2, ..., K
how can we recover the Bs and $\Gamma$s from the $\Pi$s?
states of identifiability of an equation
o exactly identified (unique solution)
o over-identified (multiple solutions)
o unidentified (no solution)
Conditions of identifiability
- order condition (necessary)
- rank condition (necessary and sufficient)
Order condition
An equation is identifiable if the number of variables excluded from it is at least one less than the number of endogenous variables in the SEM.
Notationally,
M - # of endogenous variables in the SEM
K - # of predetermined variables in the SEM
m - # of endogenous variables in the equation
k - # of predetermined variables in the equation
$(M + K) - (m + k) \ge M - 1$
$K - k \ge m - 1$
> over
= exact
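The order condition $K - k \ge m - 1$ reduces to a simple comparison, sketched below. The function name and the sample inputs are illustrative; because the order condition is only necessary, the labels should be read as "passes the order condition as exact/over", with the rank condition still to be checked.

```python
def order_condition(K, k, m):
    """Classify one equation by the order condition.

    K: predetermined variables in the whole system
    k: predetermined variables included in this equation
    m: endogenous variables included in this equation
    """
    excluded = K - k      # predetermined variables left out of the equation
    needed = m - 1        # included endogenous variables minus one
    if excluded > needed:
        return "over"
    if excluded == needed:
        return "exact"
    return "unidentified"


# Illustrative inputs for a system with K = 3 predetermined variables.
print(order_condition(K=3, k=1, m=3))   # exact: 2 excluded, 2 needed
print(order_condition(K=3, k=0, m=3))   # over: 3 excluded, 2 needed
print(order_condition(K=3, k=3, m=3))   # unidentified: 0 excluded, 2 needed
```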
Rank condition
An equation is identifiable if and only if there exists at least one nonzero subdeterminant of order (M-1) x (M-1) formed from the coefficients of the variables excluded from the equation being evaluated.
L. Klein Closed Economy Basic Model
Equations: structural model
1. $C_t = a_1 + b_1 Y_t + c_1 T_t + d_1 R_t + u_{1t}$ (consumption function)
2. $I_t = a_2 + b_2 Y_t + c_2 R_t + u_{2t}$ (investment function)
3. $Y_t = C_t + I_t + G_t$ (identity)
4. $M_t = a_4 + b_4 Y_t + c_4 R_t + d_4 P_t + u_{4t}$ (liquidity preference)
5. $Y_t = a_5 + b_5 N_t + u_{5t}$ (production function)
6. $N_t = a_6 + b_6 W_t + c_6 P_t + u_{6t}$ (labor demand function)
7. $N_t = a_7 + b_7 W_t + c_7 P_t + u_{7t}$ (labor supply function)
Standard form:
$B Y_t + \Gamma X_t = u_t \Rightarrow Y_t = \Pi X_t + v_t$
Identification issue: how are we going to retrieve the $\beta$s and $\gamma$s from the $\pi$s?
Maddala System
Included = 1
Excluded = 0

Eqn#  C  I  N  P  R  Y  W  G  T  M  K-k  m-1  Status
1     1  0  0  0  1  1  0  0  1  0   2    2   Exact
2     0  1  0  0  1  1  0  0  0  0   3    2   Over
3     1  1  0  0  0  1  0  1  0  0   2    2   Exact
4     0  0  0  1  1  1  0  0  0  1   2    2   Exact
5     0  0  1  0  0  1  0  0  0  0   3    1   Over
6     0  0  1  1  0  0  1  0  0  0   3    2   Over
7     0  0  1  1  0  0  1  0  0  0   3    2   Over

K: number of predetermined variables
k: number of included predetermined variables
m: number of included endogenous variables
Identity: no coefficient; no problem

Status should be exact
All identification equations are almost exact
Rank Condition
An equation is identified if and only if there exists at least one non-singular (M-1) by (M-1) sub-matrix formed out of the coefficients of the variables excluded from it but included in the other equations.
Determinant not equal to zero = non-singular
The sub-matrix is singular if it has a row or column filled with all zeros.
Via Laplace expansion
[Worked 0/1 sub-matrix example: entries not recoverable from the source]
Hausman test for:
Simultaneity
Exogeneity
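Hausman test for Simultaneity
$Y_{1t} = \beta_0 + \beta_1 Y_{2t} + \beta_2 X_t + u_t$
Step 1: Get the reduced form residuals of $Y_{2t}$
$Y_{2t} = \alpha_0 + \alpha_1 X_t + v_t$
$\hat{v}_t$ = RF residual of $Y_{2t}$
Step 2: Augment the original model by $\hat{v}_t$
$Y_{1t} = \beta_0 + \beta_1 Y_{2t} + \beta_2 X_t + \delta \hat{v}_t + u_t$
If $\hat{\delta}$ is significant, reject $H_0: \delta = 0$ (simultaneity is present).
If it is not significant, OLS can still be used despite $Y_t$ (endogenous) variables on the RHS.
Hausman test for Exogeneity
Step 1: Get the fitted value ($\hat{Y}_{2t}$) of the endogenous variable on the right-hand side and augment the original model:
$Y_{1t} = \beta_0 + \beta_1 Y_{2t} + \beta_2 X_t + \delta \hat{Y}_{2t} + u_t$
If there are many restrictions: Wald test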
Simultaneous Equation Methods
- at this point the equations in the system are identified
2 approaches to solving a structural SEM
1. Full information (aka system approach)
- estimating all parameters in the SEM in one fell swoop
- all equations solved together simultaneously
- a risky proposition (solving the entire system); unpopular
- any specification error in one equation of the system will be transmitted to the entire system (contagious)
o OVB
o Wrong functional form
o Non-normality of the error, etc.
- Only do this if you have very high confidence in your model
Techniques include:
1. full information maximum likelihood (FIML)
sometimes you cannot formulate the likelihood function because of peculiarities due to transmitted errors
2. 3SLS (3-stage least squares)
3-dimensional model (L W H)
rarely used
3. seemingly unrelated regression (SURE)
errors of the equations are contemporaneously correlated
multiple-equation solving technique
4. joint generalized least squares (JGLS)
SEM version of GLS
2. Limited information (aka single-equation approach)
a. Solve one equation after the other
b. Not susceptible to errors in another equation
c. Specification error is confined to one equation only
Techniques used:
1. OLS
a. When systems are recursive
b. OLS is BLUE
c. Maligned; to be avoided
Whenever B is triangular:
$B = \begin{bmatrix} 1 & 0 & 0 \\ \beta_{21} & 1 & 0 \\ \beta_{31} & \beta_{32} & 1 \end{bmatrix}$
Matrix of endogenous variable coefficients
E.g. $Y_{1t} = a_1 + e_1 X_{1t} + f_1 X_{2t} + u_{1t}$
$Y_{2t} = a_2 + b_2 Y_{1t} + e_2 X_{1t} + f_2 X_{2t} + u_{2t}$
$Y_{3t} = a_3 + b_3 Y_{1t} + c_3 Y_{2t} + e_3 X_{1t} + f_3 X_{2t} + u_{3t}$
Each equation's right-hand-side Ys are determined in the earlier equations of the chain.
2. ILS (indirect least squares)
a. Applicable to exactly identified equations
b. Mathematical solution that exploits the one-to-one correspondence between the reduced-form $\pi$s and the structural $\beta$s and $\gamma$s
3. Limited information maximum likelihood
a. Single-equation counterpart of FIML
4. 2SLS
a. best thing that happened to SEM
b. Henri Theil, Robert Basmann
Stage 1: the $\hat{Y}$s are determined using the reduced form for their equivalents at the RHS.
Stage 2: the structural equation is estimated with the $\hat{Y}$ from the first stage proxying for the Y at the RHS.
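The two stages can be sketched on simulated data. Everything in this block is an assumption made for illustration: the data-generating process (true slope 2.0, a single instrument `z`, and an error `u` that also drives the right-hand-side endogenous variable), the seed, and the helper name `fit_line`. It is a one-instrument sketch, not a general 2SLS routine.

```python
import random


def fit_line(x, y):
    """OLS intercept and slope for a single regressor (illustrative helper)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]    # predetermined variable (instrument)
u = [random.gauss(0, 1) for _ in range(n)]    # structural error
e = [random.gauss(0, 1) for _ in range(n)]
y2 = [zi + 0.8 * ui + ei for zi, ui, ei in zip(z, u, e)]   # endogenous: depends on u
y1 = [1.0 + 2.0 * a + ui for a, ui in zip(y2, u)]          # structural equation

# OLS on the structural equation: biased because y2 is correlated with u.
_, b_ols = fit_line(y2, y1)

# Stage 1: regress the RHS endogenous variable on the predetermined variable.
a0, a1 = fit_line(z, y2)
y2_hat = [a0 + a1 * zi for zi in z]

# Stage 2: replace y2 by its fitted value in the structural equation.
_, b_2sls = fit_line(y2_hat, y1)

print("OLS slope :", round(b_ols, 3))
print("2SLS slope:", round(b_2sls, 3))
```

In this setup OLS overstates the slope (simultaneity bias), while the second-stage slope settles near the true value of 2.0.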
Dynamic Econometric Models
Models concerned with the consequences of economic actions over time
Static: $Y_t = \beta_0 + \sum_{j=1}^{k} \beta_j X_{jt} + u_t$
Dynamic: lapse of time before the impact is felt
e.g. Y: target variable
X: proxy variable
When consequences are rarely instantaneous
1. DL(p): distributed lag model
$Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \dots + \beta_p X_{t-p} + u_t$
$\alpha$: endowment (autonomous Y)
$\beta_0$: impact multiplier
$\beta_i$: intermediate multiplier; i = 1, 2, 3, ..., p
$Y_t = \alpha + \sum_{i=0}^{p} \beta_i X_{t-i} + u_t$
$\beta_s^{*} = \sum_{j=0}^{s} \beta_j$ = interim multiplier
$\beta = \sum_{i=0}^{p} \beta_i$ = long-run multiplier (total effect)
2. AR(q): autoregressive model
$Y_t = \mu + \sum_{j=1}^{q} \alpha_j Y_{t-j} + u_t$
$Y_t = \mu + \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \dots + \alpha_q Y_{t-q} + u_t$
What you are today is a function of what you were before
3. ARDL(q, p)
$Y_t = \mu + \sum_{i=0}^{p} \beta_i X_{t-i} + \sum_{j=1}^{q} \alpha_j Y_{t-j} + u_t$
What you are today is influenced by what you were before plus how the authorities molded you to be how you are today
Focus: DL(p) model
Finite DL model (p is finite)
Infinite DL model ($p \to \infty$)
$Y_t = \alpha + \sum_{i=0}^{p} \beta_i X_{t-i} + u_t$, DL(p)
OLS: there are no endogenous variables at the RHS (estimation method)
Alt-Tinbergen Method
o Sequential OLS
o Bottom-up
o Starts with a simple regression and builds up to a more complicated model
Hendry Top-down Method
o Starts with a big model
o AIC (choose the model p* with the smallest AIC)
P finite
Almon Model
Koyck model
Adaptive Expectation
Rational Expectation
Infinite DL Models
$Y_t = \alpha + \sum_{i=0}^{p} \beta_i X_{t-i} + u_t$, DL(p) with $p \to \infty$
$\beta_i$: intermediate multiplier; i = 0, 1, 2, 3, ..., p
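Koyck Model
$Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \dots + \beta_i X_{t-i} + \dots + u_t$
$\beta_i = \beta_0 \lambda^i$; i = 0, 1, 2, 3, ...
$\lambda$ = rate of decay ($0 < \lambda < 1$)
(1) $Y_t = \alpha + \beta_0 X_t + \beta_0 \lambda X_{t-1} + \beta_0 \lambda^2 X_{t-2} + \dots + \beta_0 \lambda^i X_{t-i} + \dots + u_t$ (infinite distributed lag model)
Lag (1) by 1 period, multiply the result by $\lambda$, and subtract the outcome from (1):
$Y_t - \lambda Y_{t-1} = \alpha(1 - \lambda) + \beta_0 X_t + u_t - \lambda u_{t-1}$
$Y_t = \alpha(1 - \lambda) + \beta_0 X_t + \lambda Y_{t-1} + u_t - \lambda u_{t-1}$
ARDL(1, 0)
Three issues:
Autocorrelation
Simultaneity bias
Non-linear in parameter
SEM: static/contemporaneous
$Y_t = \alpha^{*} + \beta_0 X_t + \lambda Y_{t-1} + v_t$
$Y_{t-1}$ is endogenous in the dynamic model
$\alpha(1 - \lambda)$: non-linear in parameter
Durbin-Watson cannot be used because of the presence of the lagged dependent variable $Y_{t-1}$ on the RHS.
We cannot use OLS.
Estimation of $\alpha$, $\beta_0$ and $\lambda$: OLS is out!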
Use 2 tests for autocorrelation
Durbin h-test: presence/absence of autocorrelation
$h = \hat{\rho}\sqrt{\dfrac{n}{1 - n\,var(\hat{\lambda})}}$; $\hat{\rho} = 1 - \dfrac{d}{2}$
d = DW statistic
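The Durbin h statistic is a one-line computation once d, n, and the estimated variance of the coefficient on $Y_{t-1}$ are available. The inputs below (d = 1.6, n = 100, variance 0.004) are hypothetical, and the function name is illustrative. Under the null of no autocorrelation, h is compared with standard normal critical values in large samples.

```python
import math


def durbin_h(d, n, var_lambda_hat):
    """Durbin h from the DW statistic d, sample size n, and var(lambda_hat)."""
    rho_hat = 1.0 - d / 2.0
    denom = 1.0 - n * var_lambda_hat
    if denom <= 0:
        # The statistic cannot be computed when n * var(lambda_hat) >= 1.
        raise ValueError("h is undefined when n * var(lambda_hat) >= 1")
    return rho_hat * math.sqrt(n / denom)


h = durbin_h(d=1.6, n=100, var_lambda_hat=0.004)   # hypothetical inputs
print(round(h, 4))
print("reject no autocorrelation at 5%:", abs(h) > 1.96)
```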
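Instrumental Variable / Proxy Method
Liviatan Method
$Y_{t-1}$ has to be proxied by $X_{t-1}$
SRF: $\hat{Y}_t = \hat{\alpha}^{*} + \hat{\beta}_0 X_t + \hat{\lambda} Y_{t-1}$
$RSS = \sum_{t} (Y_t - \hat{\alpha}^{*} - \hat{\beta}_0 X_t - \hat{\lambda} Y_{t-1})^2$
FOC: derivatives of RSS = 0 (normal equations)
$\hat{\beta}_i = \hat{\beta}_0 \hat{\lambda}^i$
$\beta_s^{*}$ = interim multiplier (cumulative effect)
$\beta_s^{*} = \sum_{j=0}^{s} \hat{\beta}_j$ = sum of $\hat{\beta}_0 + \hat{\beta}_1 + \dots + \hat{\beta}_s$ (cumulative)
$\beta$ = LR multiplier (total effect)
$\beta = \sum_{j=0}^{\infty} \hat{\beta}_j = \lim_{s \to \infty} \sum_{j=0}^{s} \hat{\beta}_j = \dfrac{\hat{\beta}_0}{1 - \hat{\lambda}}$ (infinite series function)
Median lag: amount of time (lapse of time) within which 50% of the total effect would be manifested
Mean lag: amount of time, on average, for the total effect to be perceptible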
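The multipliers and lag summaries can be computed directly from $\hat{\beta}_0$ and $\hat{\lambda}$. The sketch below uses hypothetical values (0.4 and 0.5) and, as an assumption not stated in the notes, the standard textbook results for the Koyck scheme: mean lag $= \lambda / (1 - \lambda)$ and median lag $= -\ln 2 / \ln \lambda$.

```python
import math


def koyck_summary(beta0, lam, horizon=5):
    """Lag weights, interim multipliers, and summary measures for the Koyck scheme."""
    weights = [beta0 * lam ** i for i in range(horizon + 1)]   # beta_i = beta0 * lam^i
    interim = []
    running = 0.0
    for w in weights:
        running += w
        interim.append(running)                # cumulative effect up to lag i
    long_run = beta0 / (1.0 - lam)             # limit of the interim multipliers
    mean_lag = lam / (1.0 - lam)               # standard Koyck result (assumption)
    median_lag = -math.log(2.0) / math.log(lam)
    return weights, interim, long_run, mean_lag, median_lag


weights, interim, long_run, mean_lag, median_lag = koyck_summary(0.4, 0.5)
print("weights:", [round(w, 4) for w in weights])
print("interim:", [round(v, 4) for v in interim])
print("long run:", long_run, "mean lag:", mean_lag, "median lag:", median_lag)
```

With $\lambda = 0.5$, half of the total effect of 0.8 arrives in the current period, so the interim multipliers climb quickly toward the long-run value.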
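Koyck Model
$Y_t = \alpha + \sum_{i=0}^{\infty} \beta_i X_{t-i} + u_t$, with $\beta_i = \beta_0 \lambda^i$; i = 0, 1, 2, 3, ...
ARDL(1, 0): $Y_t = \alpha^{*} + \beta_0 X_t + \lambda Y_{t-1} + v_t$
Problems:
1. $\alpha^{*} = \alpha(1 - \lambda)$
2. $v_t = u_t - \lambda u_{t-1}$ is autocorrelated (a first-order moving average of $u_t$)
3. $Y_{t-1}$ is endogenous
OLS is out because we cannot get BLUE due to these problems.
Estimation:
1. IV method of Liviatan: $X_{t-1}$ proxies for $Y_{t-1}$
2. 2SLS
For autocorrelation: use Durbin h
Use large T for problem (1)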
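ALMON MODEL
$Y_t = \alpha + \sum_{i=0}^{p} \beta_i X_{t-i} + u_t$
p = finite (Hendry top-down)
$\beta_i = a_0 + \sum_{j=1}^{m} a_j i^j$
- mth degree polynomial
Jan Kmenta
- provided the standard errors (ses) for the SRF
$\hat{\beta}_i = \hat{a}_0 + \sum_{j=1}^{m} \hat{a}_j i^j$
$var(\hat{\beta}_i) = \sum_{j=0}^{m} i^{2j}\, var(\hat{a}_j) + 2 \sum_{j<l} i^{j+l}\, cov(\hat{a}_j, \hat{a}_l)$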
Time series Econometrics

Basic Concepts
o Application of econometrics to TS data
o Data on variables captured and recorded in regular intervals of time (e.g. annual,
quarterly, monthly, etc.)
o Historical data-frequently/massively available
Challenges in using TS data in research
Autocorrelation
Spurious regression (nonsensical)
Random walk phenomenon
o For tomorrow's stock market price, the best prediction is today's closing price: random walk forecasting
ARCH effect (Autoregressive Conditional Heteroscedasticity): conditional volatility in the stock market
Forecasting
Non-stationarity of most variables
Stochastic process (SP): collection of random variables ordered in time
e.g. $\{Y_t\}$, t = 1, 2, 3, ..., T
$\{Y_1, Y_2, Y_3, \dots, Y_T\}$
DGP (Data Generating Process)
Unknown mechanism that generates realizations for a SP
Realization: historical data (TS data)
$\{Y_t\}$ (SP) -> $\{y_t\}$ (realization)
If $\{Y_t\}$ is a weakly stationary process, its first 2 moments are said to be time-invariant.
i.e. $E(Y_t) = \mu$
$var(Y_t) = E(Y_t - \mu)^2 = \sigma^2$
$cov(Y_t, Y_{t-k}) = E[(Y_t - \mu)(Y_{t-k} - \mu)] = \gamma_k$
If strongly stationary, time-invariant in all moments
White noise: most basic of all stationary SPs; building block of all TS models, $u_t$
Random walk: most basic of all non-stationary stochastic processes, $Y_t = Y_{t-1} + u_t$
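$Y_t - Y_{t-1} = u_t$ (the level $Y_t$ is non-stationary; its difference is stationary)
$\Delta Y_t = u_t$
$\Delta$ = differencing operator; $\Delta = 1 - L$
$L$ = lag operator with one property: $L Y_t = Y_{t-1}$
$\Delta Y_t = (1 - L) Y_t = Y_t - L Y_t = Y_t - Y_{t-1}$
$\Delta^2 Y_t = \Delta(\Delta Y_t)$
$= \Delta(Y_t - Y_{t-1})$
$= Y_t - Y_{t-1} - (Y_{t-1} - Y_{t-2})$
$= Y_t - 2Y_{t-1} + Y_{t-2} = (1 - L)^2 Y_t$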
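The differencing algebra can be verified on a short made-up series. The helper `difference` is an illustrative name; it applies $\Delta$ repeatedly and is compared with the expanded form $Y_t - 2Y_{t-1} + Y_{t-2}$.

```python
def difference(series, order=1):
    """Apply the differencing operator (1 - L) `order` times."""
    out = list(series)
    for _ in range(order):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


y = [3, 5, 4, 8, 13]                      # hypothetical level series
d1 = difference(y)                        # first difference
d2 = difference(y, order=2)               # second difference
expanded = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]

print(d1)
print(d2)
print(expanded)
```

Each round of differencing costs one observation, so a series of length T leaves T - d values after differencing d times.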
Unit roots: number of times a SP, say $\{Y_t\}$, has to be differenced to make it stationary
i.e. if $Y_t \sim I(d)$, or $Y_t$ is integrated of order d,
d: number of unit roots in $Y_t$
Integrated SPs
$Y_t \sim I(0)$: stationary
$Y_t \sim I(d)$; d: order of integration or # of unit roots in $Y_t$
1st difference of CPI is inflation
$Y_t = CPI_t$
$\Delta Y_t = Inf_t$
$\Delta Y_t = CPI_t - CPI_{t-1}$
$\Delta Inf_t = Inf_t - Inf_{t-1} \sim I(0)$
Random walks and white noise process
RWM
d = 0, 1, 2
$Y_t = Y_{t-1} + u_t$
$\Delta Y_t = u_t$
3 types of random walks
1. RWM
$Y_t = \rho Y_{t-1} + u_t$; $\rho = 1$
2. RWMD (with drift)
$Y_t = \mu + \rho Y_{t-1} + u_t$; $\rho = 1$ ($\mu$: drift)
3. RWDT (with deterministic trend)
$Y_t = \mu + \rho Y_{t-1} + \beta t + u_t$; $\rho = 1$
When two or more NS (non-stationary) variables are regressed on each other, the result is spurious unless they are cointegrated.
Granger-Newbold Rule (aka Classical)
symptom of spurious regression
If $R^2 > d$ (Durbin-Watson statistic), the regression is spurious
Unit root testing
provided by Dickey and Fuller
Dickey-Fuller Test
Auxiliary regression:
RWM:
$Y_t = \rho Y_{t-1} + u_t$
$\Delta Y_t = (\rho - 1) Y_{t-1} + u_t$
$\Delta Y_t = \delta Y_{t-1} + u_t$
$H_0: \delta = 0$ means $\rho = 1$ (unit root); $H_1: \delta < 0$ (stationary)
$t = \dfrac{\hat{\delta}}{se(\hat{\delta})}$ derived from the normal distribution is not valid
That's why they developed the tau ($\tau$) distribution (aka Dickey-Fuller distribution)
Shortcoming: $\Delta Y_t = \delta Y_{t-1} + u_t$
$u_t$ is highly correlated, but Dickey and Fuller assumed that it is white noise.
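The auxiliary regression can be sketched on two simulated series: a pure random walk and a stationary AR(1) with $\rho = 0.5$. The seed, sample size, and function name are assumptions for illustration. The ratio returned is the tau statistic, which must be compared with Dickey-Fuller critical values (roughly -1.95 at 5% for this no-constant case), not with the usual t table.

```python
import math
import random


def df_regression(y):
    """Regress dY_t on Y_{t-1} (no intercept); return delta_hat and its tau ratio."""
    dy = [curr - prev for prev, curr in zip(y, y[1:])]
    lag = y[:-1]
    sxx = sum(v * v for v in lag)
    delta = sum(l * d for l, d in zip(lag, dy)) / sxx
    resid = [d - delta * l for l, d in zip(lag, dy)]
    s2 = sum(r * r for r in resid) / (len(dy) - 1)   # residual variance
    se = math.sqrt(s2 / sxx)
    return delta, delta / se


random.seed(1)
shocks = [random.gauss(0, 1) for _ in range(2000)]
walk, ar = [0.0], [0.0]
for u in shocks:
    walk.append(walk[-1] + u)          # rho = 1: unit root
    ar.append(0.5 * ar[-1] + u)        # rho = 0.5: stationary

delta_walk, tau_walk = df_regression(walk)
delta_ar, tau_ar = df_regression(ar)
print("random walk:", round(delta_walk, 4), round(tau_walk, 2))
print("AR(1), 0.5 :", round(delta_ar, 4), round(tau_ar, 2))
```

For the stationary series the estimate of $\delta$ sits near $\rho - 1 = -0.5$ with a very negative tau; for the random walk it stays close to zero.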
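Augmented Dickey-Fuller Test (ADF)
RWM:
$\Delta Y_t = \delta Y_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta Y_{t-i} + \varepsilon_t$
RWMD:
$\Delta Y_t = \mu + \delta Y_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta Y_{t-i} + \varepsilon_t$
RWMDT:
$\Delta Y_t = \mu + \delta Y_{t-1} + \beta t + \sum_{i=1}^{p} \gamma_i \Delta Y_{t-i} + \varepsilon_t$
Level series: original time series that we are investigating
$u_t \sim AR(p)$: current error has long memory; current is related to past
$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \dots + \rho_p u_{t-p} + \varepsilon_t$
Alternative unit root tests: PP, KPSS, ADF-GLS, DP, NP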
Cointegration Analysis
It is a property of 2 or more non-stationary variables to be linked together by a long-run (LR) equilibrium relationship
Robert Engle and Clive Granger (1982)
Work on cointegration analysis to check if there is spurious regression
Augmented Engle-Granger Test (AEG)
If $Y_t, X_t \sim I(1)$ and if, from their SRF,
$\hat{Y}_t = \hat{\beta}_0 + \hat{\beta}_1 X_t$; $\hat{u}_t = Y_t - \hat{Y}_t \sim I(0)$
Then $Y_t$ and $X_t$ are cointegrated
Stage 1: ADF on $Y_t$ and $X_t$ to check that both are I(1)
We perform a unit root test on both variables and see to it that they are non-stationary.
Stage 2: Run OLS of $Y_t$ on $X_t$ to get the SRF and obtain $\hat{u}_t$; run ADF on $\hat{u}_t$, and if $\hat{u}_t \sim I(0)$, X & Y are cointegrated
Short-run dynamics
ECM: Error Correction Model
Granger Representation Theorem
- If the variables used in a regression are cointegrated, an ECM representation of the LR model is possible.
In k = 2:
$\Delta Y_t = \alpha_0 + \alpha_1 \Delta X_t + \alpha_2 \hat{u}_{t-1} + \varepsilon_t$
$\alpha_2$ = error correction coefficient
Cointegration and Error Correction
VAR(p): Christopher Sims
Vector Autoregression
Let $Y_t' = [Y_{1t}\ Y_{2t}\ Y_{3t}]$
All variables are endogenous
The As are matrices.
$Y_t = A_0 + A_1 Y_{t-1} + A_2 Y_{t-2} + \dots + A_p Y_{t-p} + u_t$
OLS per equation is BLUE
AR(p): current variable is related to its past values
$\Delta Y_t = A_0 + \Pi Y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta Y_{t-i} + u_t$
Johansen
Eigenvalue max test: individual test
Trace test: cumulative testing
Identification of the cointegration vectors
Cointegrated vector: economic theory; long-run equation
$\Pi = \alpha \beta'$, where $\alpha$ holds the adjustment coefficients and $\beta$ the cointegrating vectors
Johansen: more versatile and can be used for more than two variables
