
A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments
James H. Stock
Harvard University and the National Bureau of Economic Research, Cambridge, MA 02138

Jonathan H. Wright
Federal Reserve Board, Washington, DC 20551

Motohiro Yogo
Harvard University, Cambridge, MA 02138

Weak instruments arise when the instruments in linear instrumental variables (IV) regression are weakly
correlated with the included endogenous variables. In generalized method of moments (GMM), more
generally, weak instruments correspond to weak identification of some or all of the unknown parameters.
Weak identification leads to GMM statistics with nonnormal distributions, even in large samples, so that
conventional IV or GMM inferences are misleading. Fortunately, various procedures are now available
for detecting and handling weak instruments in the linear IV model and, to a lesser degree, in nonlinear
GMM.

KEY WORDS: Instrument relevance; Instrumental variables; Similar tests.

1. INTRODUCTION

A subtle but important contribution of Hansen and Singleton's (1982) and Hansen's (1982) original work on generalized method of moments (GMM) estimation was to recast the requirements for instrument exogeneity. In the linear simultaneous equations framework then prevalent, instruments are exogenous if they are excluded from the equation of interest; in GMM, instruments are exogenous if they satisfy a conditional mean restriction that, in Hansen and Singleton's (1982) application, was implied directly by a tightly specified economic model. Although these two requirements are the same mathematically, they have conceptually different starting points. The shift from debatable ["incredible," according to Sims (1980)] exclusion restrictions to first-order conditions derived from economic theory has been productive, and careful consideration of instrument exogeneity is now a standard part of solid empirical analysis using GMM.

But instrument exogeneity is only one of the two criteria necessary for an instrument to be valid. Recently, the other criterion—instrument relevance—has received increased attention by theoretical and applied researchers. It now appears that some, perhaps many, applications of GMM and instrumental variables (IV) regression have what is known as "weak instruments" or "weak identification," that is, instruments that are only weakly correlated with the included endogenous variables. Unfortunately, weak instruments pose considerable challenges to inference using GMM and IV methods.

This survey of weak instruments and weak identification has five themes:

1. If instruments are weak, then the sampling distributions of GMM and IV statistics are in general nonnormal, and standard GMM and IV point estimates, hypothesis tests, and confidence intervals are unreliable.

2. Empirical researchers often confront weak instruments. Finding exogenous instruments is hard work, and the features that make an instrument plausibly exogenous, such as occurring sufficiently far in the past to satisfy a first-order condition or the as-if random coincidence that lies behind a quasi-experiment, can also work to make the instrument weak.

3. It is not useful to think of weak instruments as a "small-sample" problem: Bound, Jaeger, and Baker (1995) provided an empirical example of weak instruments despite having 329,000 observations.

4. There are methods more robust to weak instruments than conventional GMM.

5. What to do about weak identification is a more difficult issue in nonlinear GMM than in linear IV regression, and much theoretical work remains.

This survey emphasizes the linear IV regression model with homoscedastic, serially uncorrelated errors, mainly because much more is known about weak instruments in this case. Section 2 provides a primer on weak instruments in linear IV regression, and Section 3 discusses some empirical applications that confront weak instruments. Sections 4–6 discuss recent econometric methods for handling weak instruments in the linear model with homoscedastic errors: detection of weak instruments (Sec. 4); methods that are fully robust to weak instruments, at least in large samples (Sec. 5); and partially robust methods that are somewhat simpler to use (Sec. 6). Section 7 turns to weak identification in GMM for nonlinear models and/or heteroscedastic or serially correlated errors. Section 8 concludes.

© 2002 American Statistical Association
Journal of Business & Economic Statistics
October 2002, Vol. 20, No. 4
DOI 10.1198/073500102288618658
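The first theme can be seen directly in a small simulation. The following sketch is our own illustration, not code from the survey (all function names and parameter choices are ours): it draws data from a linear IV model with one endogenous regressor whose structural and first-stage errors have correlation .99, and compares the sampling distribution of TSLS when the first-stage coefficient is tiny versus large.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K, beta, rho = 100, 1, 0.0, 0.99  # true beta = 0; corr(u, v) = .99

def tsls(y, Y, Z):
    """Two-stage least squares with a single endogenous regressor."""
    Yhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ Y)  # first-stage fitted values
    return float(Yhat @ y / (Yhat @ Y))

def simulate(pi, reps=2000):
    """Sampling distribution of TSLS when the first-stage coefficient is pi."""
    est = np.empty(reps)
    for r in range(reps):
        Z = rng.standard_normal((T, K))
        u = rng.standard_normal(T)
        v = rho * u + np.sqrt(1 - rho**2) * rng.standard_normal(T)
        Y = Z @ np.full(K, pi) + v   # first stage: weak when pi is near 0
        y = beta * Y + u             # structural equation
        est[r] = tsls(y, Y, Z)
    return est

weak = simulate(pi=0.05)   # concentration parameter roughly .25
strong = simulate(pi=2.0)  # concentration parameter roughly 400
# With a weak first stage, the TSLS estimates are pulled far toward
# plim(OLS) = .99 instead of centering on the true beta = 0.
print(np.median(weak), np.median(strong))
```

With the weak first stage, the median TSLS estimate lies much closer to the OLS probability limit than to the true coefficient, which is the unreliability described in theme 1.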
Although many of the key ideas of weak instruments have been understood for decades, most of the literature on solutions to the problem of weak instruments is quite recent, and this literature is expanding rapidly. We both fear and hope that much of the practical advice in this survey will soon be outdated.

2. A PRIMER ON WEAK INSTRUMENTS IN THE LINEAR REGRESSION MODEL

Many of the problems posed by weak instruments in the linear IV regression model are best explained in the context of the classical version of that model with fixed exogenous variables and iid normal errors. We therefore begin by using this model to show how weak instruments lead to the two-stage least squares (TSLS) estimator with a nonnormal sampling distribution, regardless of sample size. In general, exact distributions of IV statistics are not a practical basis for inference, and the section concludes with a synopsis of asymptotic methods designed to retain the insights gained from the finite-sample distribution theory.

2.1 The Linear Gaussian Instrumental Variables Regression Model With a Single Regressor

The linear IV regression model with a single endogenous regressor and no included exogenous variables is

    y = Yβ + u  (1)

and

    Y = Zπ + v,  (2)

where y and Y are T × 1 vectors of observations on endogenous variables, Z is a T × K matrix of instruments, and u and v are T × 1 vectors of disturbance terms. The instruments are assumed to be nonrandom (fixed). The errors (u_t, v_t), t = 1, …, T, are assumed to be iid N(0, Σ), where the elements of Σ are σ_u², σ_uv, and σ_v², and let ρ = σ_uv/(σ_u σ_v). Equation (1) is the structural equation, and β is the scalar parameter of interest. The reduced-form equation (2) relates the endogenous regressor to the instruments.

2.1.3. The Concentration Parameter. The concentration parameter, μ², is a unitless measure of the strength of the instruments and is defined as

    μ² = π′Z′Zπ/σ_v².  (3)

A useful interpretation of μ² is in terms of F, the F statistic for testing the hypothesis π = 0 in (2) (i.e., the "first-stage F statistic"). Let F̃ be the infeasible counterpart of F, computed using the true value of σ_v². Then KF̃ has a noncentral chi-squared distribution with degrees of freedom K and noncentrality parameter μ², and E(F̃) = μ²/K + 1. If the sample size is large, then F and F̃ are close, so E(F) ≈ μ²/K + 1. Thus larger values of μ²/K shift out the distribution of the first-stage F statistic, and F − 1 can be thought of as an estimator of μ²/K.

2.1.4. An Expression for the Two-Stage Least Squares Estimator. The TSLS estimator is β̂_TSLS = Y′P_Z y/Y′P_Z Y, where P_Z = Z(Z′Z)⁻¹Z′. Rothenberg (1984) presented a useful expression for the TSLS estimator that obtains by substituting Y′P_Z u = π′Z′u + v′P_Z u and Y′P_Z Y = π′Z′Zπ + 2π′Z′v + v′P_Z v into the expression for β̂_TSLS − β and collecting terms,

    β̂_TSLS − β = (σ_u/σ_v)(1/μ) · (z_u + S_uv/μ)/(1 + 2z_v/μ + S_vv/μ²),  (4)

where z_u = π′Z′u/[σ_u(π′Z′Zπ)^(1/2)], z_v = π′Z′v/[σ_v(π′Z′Zπ)^(1/2)], S_uv = v′P_Z u/(σ_v σ_u), and S_vv = v′P_Z v/σ_v². Under the assumptions of fixed instruments and normal errors, z_u and z_v are standard normal random variables with correlation ρ, and S_uv and S_vv are quadratic forms of normal random variables with respect to the idempotent matrix P_Z.

Because the distributions of z_u, z_v, S_uv, and S_vv do not depend on the sample size T, the sample size enters the distribution of β̂_TSLS only through the concentration parameter. If μ² is small, then the terms z_v, S_uv, and S_vv in (4) lead to a nonnormal distribution. In contrast, the leading term z_u dominates if μ² is large, yielding the usual normal approximation to the distribution of β̂_TSLS. Formally, μ² plays the role in (4) usually played by the number of observations: As μ² becomes large, the distribution of μ(β̂_TSLS − β) is increasingly well approximated by the N(0, σ_u²/σ_v²) distribution. For the normal approximation to the distribution of the TSLS estimator to be accurate, the concentration parameter must be large.

2.1.5. Bias of the Two-Stage Least Squares Estimator in the Unidentified Case. When μ² = 0 (equivalently, when π = 0), the instruments are not just weak, but irrelevant. In this case, the mean of the TSLS estimator is the probability limit of the ordinary least squares (OLS) estimator, plim(β̂_OLS). Specifically, when K ≥ 3 so that its mean exists, E(β̂_TSLS) − β = plim(β̂_OLS) − β = σ_uY/σ_Y². To derive this result, note that when π = 0, β̂_TSLS − β = v′P_Z u/v′P_Z v, σ_uv = σ_uY, and σ_v² = σ_Y². Because u = E(u|v) + η = (σ_uv/σ_v²)v + η with η and v independent, E(v′P_Z η) = 0 and the result follows.

When the instruments are relevant but weak, the TSLS estimator is biased toward plim(β̂_OLS). Specifically, define the "relative bias" of TSLS to be the bias of TSLS relative to the inconsistency of OLS, that is, [E(β̂_TSLS) − β]/[plim(β̂_OLS) − β]. When μ² is moderately large, the TSLS relative bias is approximately inversely proportional to μ²/(K − 2), a result that holds even if the errors are not normally distributed (Buse 1992).

2.1.6. Numerical Examples. Figures 1(a) and 1(b) show the pdf's of the TSLS estimator and its t statistic for various values of the concentration parameter when the true value of β is 0. The other parameter values mirror those of Nelson and Startz (1990a, b): K = 1, σ_u = σ_v = 1, and ρ = .99, so plim(β̂_OLS) = .99. For small values of μ²/K, such as Nelson and Startz's value of .25, the distributions are strikingly nonnormal, even bimodal. As μ²/K increases, the distributions approach the usual normal limit.

The dramatic Nelson–Startz results drew econometricians' attention to the problem of weak instruments. Their results build on a large literature on the exact distribution of IV
estimators under the assumptions of fixed instruments and iid normal errors (e.g., Sawa 1969; Richardson 1968). However, the results in this literature, comprehensively reviewed by Phillips (1984), are offputting and pose substantial computational challenges. Moreover, the assumptions of fixed instruments and normal errors are generally too restrictive to be appropriate in empirical application. To overcome these limitations, researchers have used asymptotic approximations, to which we now turn.

[Figure 1. pdf of the TSLS Estimator (a) and t Statistic (b) for μ²/K = 0, .25, 1, 10, 100; One Instrument (K = 1); and ρ = .99, Computed by Monte Carlo Simulation.]

2.2 Asymptotic Approximations

Conventional asymptotic approximations to finite-sample distributions are calculated for a fixed model in the limit that T → ∞, but sometimes this approach does not provide the most useful approximating distribution. This is the case for the weak instruments problem; as is evident in Figure 1, the usual fixed-model asymptotic normal approximations can be quite poor when the concentration parameter is small, even if the number of observations is large. For this reason, alternative asymptotic methods are used to analyze IV statistics in the presence of weak instruments. Three such methods are Edgeworth expansions, many-instrument asymptotics, and weak-instrument asymptotics. These methods aim to improve the quality of the approximations when the sample is large but μ²/K is not.

2.2.1. Edgeworth Expansions. An Edgeworth expansion is a representation of the distribution of the statistic of interest in powers of 1/√T. As Rothenberg (1984) pointed out in the fixed-instrument, normal-error model, an Edgeworth expansion in 1/√T with a fixed model is formally equivalent to an Edgeworth expansion in 1/μ. In this sense, Edgeworth expansions improve on the conventional normal approximation when μ is small enough for the term in 1/μ² to matter, but not so small that the terms in 1/μ³ and higher matter. Rothenberg (1984) suggested that the Edgeworth approximation is "excellent" for μ² > 50 and "adequate" for μ² as small as 10, as long as the number of instruments is small (less than μ).

2.2.2. Many-Instrument Asymptotics. Although the problems of many instruments and weak instruments might at first seem different, they are in fact related. With many strong instruments, the adjusted R² of the first-stage regression would be nearly 1, so a small first-stage adjusted R² indicates that the instruments, taken as a set, are weak. Bekker (1994) formalized this notion by developing asymptotic approximations for a sequence of models with fixed instruments and normal errors, in which the number of instruments, K, is proportional to the sample size and μ²/K converges to a constant, finite limit; similar approaches were taken by Anderson (1976), Kunitomo (1980), and Morimune (1983). Many-instrument asymptotic distributions are generally normal, and simulation evidence suggests that these approximations are good for both moderate and large values of K, although they cannot capture the nonnormality evident in the Nelson–Startz example of Figure 1. Distributions derived using this approach generally depend on the distribution of the errors (see Bekker and van der Ploeg 1999), so some procedures that are justified using many-instrument asymptotics require adjustments for nonnormal errors. However, rate and consistency results are more robust to nonnormality (see Chao and Swanson 2002).

2.2.3. Weak-Instrument Asymptotics. Like many-instrument asymptotics, weak-instrument asymptotics (Staiger and Stock 1997) involves a sequence of models chosen to keep μ²/K constant as T → ∞. However, unlike many-instrument asymptotics, K is held fixed. Technically, the sequence of models considered is the same as used to derive the local asymptotic power of the first-stage F test (a "Pitman drift" parameterization in which π is in a 1/√T neighborhood of 0). Staiger and Stock (1997) showed that under general conditions on the errors and with random instruments, many results that hold exactly in the fixed-instrument, normal-error model can be reinterpreted as holding asymptotically, with simplifications arising from the consistency of Z′Z/T and of the estimator for σ_v².

3. EMPIRICAL EXAMPLES

3.1 Estimating the Returns to Education

In an influential article, Angrist and Krueger (1991) proposed using the quarter of birth as an instrument to circumvent ability bias in estimating the returns to education. The date of birth, they argued, should be uncorrelated with ability, so that
quarter of birth is exogenous; because of mandatory schooling laws, quarter of birth should also be relevant. With large samples from the U.S. census, they estimated the returns to education by TSLS, using as instruments quarter of birth and its interactions with state and year of birth binary variables, for as many as 178 instruments.

Surprisingly, despite the large number of observations (329,000 or more), the instruments, taken together, are weak in some of the Angrist–Krueger regressions. This point was first made by Bound et al. (1995), who provided Monte Carlo results showing that in some specifications, similar point estimates and standard errors obtain if each individual's true quarter of birth is replaced by a randomly generated quarter of birth. Because the results with the randomly generated quarter of birth must be spurious, this suggests that the results with the true quarter of birth are misleading. The source of these misleading inferences is weak instruments; in some specifications, the first-stage F statistic is less than 2, suggesting that μ²/K might be 1 or less (recall that E(F) − 1 ≈ μ²/K). In these specifications, there are a few strong instruments (the quarter of birth binary variables) and many weak ones (their interactions with state and year), resulting in a combined set of instruments that is weak. An important conclusion is that it is not helpful to think of weak instruments as a "finite-sample" problem that can be ignored if one has many observations.

3.2 The Log-Linearized Euler Equation in the Consumption-Based Capital Asset-Pricing Model

The first empirical application of GMM was Hansen and Singleton's (1982) investigation of the consumption-based capital asset pricing model (CCAPM). In its log-linearized form, the first-order condition of the CCAPM with constant relative risk aversion can be written as

    E[(r_{t+1} + α − γΔc_{t+1}) | Z_t] = 0,  (5)

where γ is the coefficient of relative risk aversion (here also the inverse of the elasticity of intertemporal substitution), Δc_{t+1} is the growth rate of consumption, r_{t+1} is the log gross return on some asset, α is a constant, and Z_t is a vector of variables in the information set at time t (Hansen and Singleton 1983; see Campbell 2001 for a survey).

The coefficients of (5) can be estimated by GMM using Z_t as an instrument. One way to proceed is to use TSLS with r_{t+1} as the dependent variable; another is to apply TSLS with Δc_{t+1} as the dependent variable; and a third is to use a method, such as limited-information maximum likelihood (LIML), that is invariant to the normalization. Under standard fixed-model asymptotics, these estimators are asymptotically equivalent, so it should not matter which method is used. However, as discussed in detail by Neely, Roy, and Whiteman (2001) and Yogo (2002), this does matter greatly in practice, with point estimates of γ ranging from small (Hansen and Singleton 1982, 1983) to very large (Hall 1988; Campbell and Mankiw 1989).

The first-stage F statistics in these regressions are frequently less than 5 (Yogo 2002), and it appears that weak instruments can explain many of these seemingly contradictory results (Stock and Wright 2000; Neely et al. 2001). For an instrument to be strong, it must be a good predictor of either consumption growth or an asset return, depending on the normalization, but both are notoriously difficult to predict. So finding weak instruments in this application should not be a surprise.

3.3 Macroeconometric Examples

Weak identification can also be a concern in GMM estimation of macroeconomic equations with expectational terms. For example, Ma (2002) and Mavroeidis (2001) suggested that weak instruments can be an issue in GMM estimation of the hybrid New Keynesian Phillips curve (Fuhrer and Moore 1995; Gali and Gertler 1999). Other macroeconomic applications that confront weak identification include estimates of New Keynesian output equations (Fuhrer and Rudebusch 2002) and some structural vector autoregressions (Pagan and Robertson 1998).

4. DETECTION OF WEAK INSTRUMENTS

This section discusses methods for detecting weak instruments. In general, the linear IV regression model has n endogenous regressors, so that Y and v in (2) are T × n. The methods for detecting weak instruments (and the definition of the concentration parameter) depend on n. We first discuss inference based on the first-stage F statistic when there is a single endogenous regressor, then turn to the case of n > 1. The section concludes with an alternative approach to inference about weak instruments proposed by Hahn and Hausman (2002).

To keep things simple, the formulas in Sections 4–6 apply to the case in which there are no included exogenous regressors. These formulas and methods, however, generally extend to the case of included exogenous regressors by replacing y, Y, and Z by the residuals from their projection onto the included exogenous regressors and by modifying the degrees of freedom as needed. Unless noted otherwise, the methods discussed in Sections 4–6 do not require fixed instruments and normally distributed errors for their asymptotic justification.

4.1 The First-Stage F Statistic

Before discussing how to use the first-stage F statistic to detect weak instruments, we need to provide a precise definition of weak instruments.

4.1.1. A Definition of Weak Instruments. A practical approach is to define a set of instruments to be weak if μ²/K is small enough that inferences based on conventional normal approximating distributions are misleading. In this approach, the definition of weak instruments depends on the purpose to which the instruments are put, combined with the researcher's tolerance for departures from the usual standards of inference (i.e., bias, size of tests). For example, suppose that one is using TSLS and wants its bias to be small. Accordingly, one measure of whether a set of instruments is strong is whether μ²/K is sufficiently large so that the TSLS relative bias (as defined in Sec. 2) is at most (say) 10%; if not, then the instruments are
deemed weak. Alternatively, if interested in hypothesis testing, one could define instruments to be strong if μ²/K is large enough that a 5% hypothesis test rejects no more than (say) 15% of the time; otherwise, the instruments are weak. These two definitions—one based on relative bias and the other based on size—in general yield different threshold values of μ²/K; thus instruments might be weak if used for one application, but not if used for another.

Here we consider the two definitions of weak instruments in the previous paragraph: The TSLS relative bias could exceed 10%, or the actual size of the nominal 5% TSLS t test could exceed 15%. As shown by Stock and Yogo (2001), under weak-instrument asymptotics, each of these definitions implies a threshold value of μ²/K. If the actual value of μ²/K exceeds this threshold, then the instruments are strong (e.g., TSLS relative bias is <10%). Otherwise, the instruments are weak.

4.1.2. Ascertaining Whether Instruments Are Weak Using the First-Stage F Statistic. In the fixed-instrument, normal-error model, or, alternatively, under weak-instrument asymptotics, the distribution of the first-stage F statistic depends only on μ²/K and K. Hence the F statistic is useful for making inference about μ²/K. As Hall, Rudebusch, and Wilcox (1996) showed in Monte Carlo simulations, simply using F to test the hypothesis of nonidentification (π = 0) is an inadequate screen for problems caused by weak instruments. Instead, we follow Stock and Yogo (2001) and use F to test the null hypothesis that μ²/K is less than or equal to the weak-instrument threshold against the alternative that it exceeds the threshold.

For selected values of K, Table 1 reports weak-instrument threshold values of μ²/K and critical values of F for testing the null hypothesis that instruments are weak. For example, under the TSLS relative bias definition of weak instruments, if K = 5, then the threshold value of μ²/K is 5.82, and the test that μ²/K ≤ 5.82 rejects in favor of the alternative that μ²/K > 5.82 if F ≥ 10.83. Evidently the first-stage F statistic must be large, typically exceeding 10, for TSLS inference to be reliable.

Table 1. Selected Critical Values for Weak-Instrument Tests for TSLS Based on the First-Stage F Statistic

                        Relative bias > 10%               Actual size of 5% test > 15%
Number of            Threshold    F statistic 5%         Threshold    F statistic 5%
instruments K        μ²/K         critical value         μ²/K         critical value
1                    —            —                      1.82         8.96
2                    —            —                      4.62         11.59
3                    3.71         9.08                   6.36         12.83
5                    5.82         10.83                  9.20         15.09
10                   7.41         11.49                  15.55        20.88
15                   7.94         11.51                  21.69        26.80

NOTE: The second column contains the smallest values of μ²/K that ensure that the bias of TSLS is no more than 10% of the inconsistency of OLS. The third column contains the 5% critical values applicable when the first-stage F statistic is used to test the null that μ²/K is less than or equal to the value in the second column against the alternative that μ²/K exceeds that value. The final two columns present the analogous weak-instrument thresholds and critical values when weak instruments are defined so that the usual nominal 5% TSLS t test of the hypothesis β = β₀ has size potentially exceeding 15%. (Source: Stock and Yogo 2001.)

4.2 Extension of the First-Stage F Statistic to n > 1

When there are multiple endogenous regressors, the concentration parameter is an n × n matrix, Σ_VV^(−1/2)′ Π′Z′ZΠ Σ_VV^(−1/2), where Σ_VV is the covariance matrix of the vector of errors v_t. To avoid introducing new notation, we refer to the concentration parameter as μ² in both the scalar and matrix cases. The quality of the usual normal approximation is governed by the matrix μ²/K. Because the predicted values of Y from the first-stage regression can be highly correlated, for the usual normal approximations to be good, it is not sufficient that some elements of μ²/K are large. Rather, the matrix μ²/K must be large in the sense that its smallest eigenvalue is large.

From a statistical perspective, when n > 1, the n first-stage F statistics are not sufficient for the concentration matrix even with fixed regressors and normal errors (see Shea 1997 for a discussion). Instead, inference about μ² can be based on the n × n matrix analog of the first-stage F statistic,

    G_T = Σ̂_VV^(−1/2)′ Y′P_Z Y Σ̂_VV^(−1/2)/K,  (6)

where Σ̂_VV = Y′M_Z Y/(T − K), M_Z = I − P_Z, and I is a conformable identity matrix. Under weak-instrument asymptotics, E(G_T) → μ²/K + I. Cragg and Donald (1993) proposed using G_T to test for partial identification (cf. Choi and Phillips 1992)—specifically, testing the hypothesis that the matrix Π has rank L against the alternative that it has rank greater than L, where L < n. From the perspective of IV inference, mere instrument relevance is insufficient; instead, the instruments must be strong in the sense that μ²/K is large. Accordingly, Stock and Yogo (2001) considered the problem of testing the null hypothesis that a set of instruments is weak against the alternative that they are strong, where instruments are defined to be strong if conventional TSLS inference is reliable for any linear combination of the coefficients. By focusing on the worst-behaved linear combination, this approach is conservative but tractable, and Stock and Yogo provided tables of critical values, analogous to those in Table 1, based on the minimum eigenvalue of G_T.

4.3 A Test of the Null of Strong Instruments

The methods discussed so far have been tests of the null of weak instruments. Hahn and Hausman (2002) reversed the null and alternative and proposed a test of the null that the instruments are strong against the alternative that they are weak. They noted that when there is a single endogenous regressor (n = 1) and the instruments are strong, normalization of the regression (the choice of dependent variable) should not matter. Thus the TSLS estimator in the forward regression of y on Y and the inverse of the TSLS estimator in the reverse regression of Y on y are asymptotically equivalent [to order o_p(T^(−1/2))] with strong instruments, but this is not the case if the instruments are weak. Accordingly, Hahn and Hausman (2002) developed a statistic comparing the forward and reverse regression estimators (and their extensions when n = 2). They suggested that if this statistic rejects the null hypothesis, then a researcher should conclude that his or her instruments are weak. Otherwise, the researcher can treat the instruments as strong.
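The screening rule of Section 4.1.2 is straightforward to apply in practice. The sketch below is our own illustration (the simulated data and function names are ours); the critical values are hardcoded from Table 1 under the "actual size of the 5% test exceeds 15%" definition.

```python
import numpy as np

def first_stage_F(Y, Z):
    """F statistic for H0: pi = 0 in the first-stage regression Y = Z pi + v
    (no included exogenous regressors, matching Secs. 4-6)."""
    T, K = Z.shape
    pihat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
    Yhat = Z @ pihat
    e = Y - Yhat
    return float((Yhat @ Yhat / K) / (e @ e / (T - K)))

# 5% critical values from Table 1 (Stock and Yogo 2001),
# "size of 5% t test > 15%" definition, indexed by K
SIZE_CV = {1: 8.96, 2: 11.59, 3: 12.83, 5: 15.09, 10: 20.88, 15: 26.80}

rng = np.random.default_rng(1)
T, K = 1000, 5
Z = rng.standard_normal((T, K))
v = rng.standard_normal(T)

Y_strong = Z @ np.full(K, 0.30) + v  # mu^2/K roughly 90: strong first stage
Y_weak = Z @ np.full(K, 0.02) + v    # mu^2/K roughly 0.4: weak first stage

for Y in (Y_strong, Y_weak):
    F = first_stage_F(Y, Z)
    verdict = "strong" if F >= SIZE_CV[K] else "cannot reject weak"
    print(round(F, 2), verdict)
```

A first-stage F above the tabulated critical value rejects the null that the instruments are weak; an F below it does not certify weakness, it merely fails to rule it out.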
Stock, Wright, and Yogo: Weak Instruments and Identification in GMM 523

5. FULLY ROBUST INFERENCE 5.2.1. The Anderson–Rubin Statistic. More than 50 years
WITH WEAK INSTRUMENTS ago, Anderson and Rubin (1949) proposed testing the null
hypothesis  = 0 using the statistic
This section discusses hypothesis tests and confidence sets
for  that are fully robust to weak instruments, in the sense
y − Y0  PZ y − Y0 /K  
that these procedures have the correct size or coverage rates AR0  = =  (8)
regardless of the value of 2 (including 2 = 0) when the sam- y − Y0  MZ y − Y0 /T − K K
ple size is large (specifically, under weak-instrument asymp-
One definition of the LIML estimator is that it minimizes
totics). We focus on the case where n = 1, but these methods
AR.
generalize to joint inference about  when n > 1.
With fixed instruments and normal errors, the quadratic
Several fully robust tests have been proposed; consistent
forms in the numerator and denominator of (8) are indepen-
with earlier Monte Carlo studies, the results here suggest that
dent chi-squared random variables under the null hypothe-
none appears to dominate the others. Moreira (2001) provided
a theoretical explanation of this in the context of the fixed- sis, and AR0  has an exact FK T −K null distribution. Under
instrument, normal-error model. In that model, there is no uni- the more general conditions of weak-instrument asymptotics,
d
formly most powerful test of the hypothesis  = 0 , a result AR0 →5k2 /K under the null hypothesis, regardless of the
that also holds more generally under weak-instrument asymp- value of 2 /K. Thus the AR statistic provides a fully robust
totics. In this light, the various fully robust procedures repre- test of the hypothesis  = 0 .
sent trade-offs, with some working better than others, depend- The AR statistic is profligate in its use of overidentifying
ing on the true parameter values. restrictions in the sense that the numerator projects y − Y0 on
Z rather than on a subspace of Z, leading to a loss of power
5.1 A Family of Fully Robust Gaussian Tests relative to the infeasible power envelope when  is overi-
dentified. Moreover, the AR statistic can reject either because
Moreira (2001) considered the system (1) and (2) with fixed
 = 0 or because the instrument orthogonality conditions
instruments and normally distributed errors. Suppose that the reduced-form equation for y is y = Zπβ + w. Let Ω denote the covariance matrix of the reduced-form errors, (w_t, v_t)′, and for now suppose that Ω is known. We are interested in testing the hypothesis β = β₀.

Moreira (2001) showed that under these assumptions, the statistics

    S̄ = (Z′Z)^(−1/2) Z′Ȳb₀/(b₀′Ωb₀)^(1/2)  and  T̄ = (Z′Z)^(−1/2) Z′ȲΩ^(−1)a₀/(a₀′Ω^(−1)a₀)^(1/2)    (7)

are sufficient for β and π, where Ȳ = (y, Y), b₀ = (1, −β₀)′, and a₀ = (β₀, 1)′. Thus for the purpose of testing β = β₀, it suffices to consider test statistics that are functions of only S̄ and T̄, say g(S̄, T̄). Moreover, under the null hypothesis β = β₀, the distribution of T̄ depends on π, but the distribution of S̄ does not; thus, under the null hypothesis, T̄ is sufficient for π. It follows that a test of β = β₀ based on g(S̄, T̄) is similar if its critical value is computed from the conditional distribution of g(S̄, T̄) given T̄. Moreira (2001) also derived an infeasible power envelope for similar tests under the further assumption that π is known. In practice, π is not known; when K > 1, feasible tests cannot achieve the power envelope, and there is no uniformly most powerful test of β = β₀.

In practice, Ω is unknown, so the statistics in (7) cannot be computed. However, under weak-instrument asymptotics, Ω can be estimated consistently under the null and, moreover, the results in the preceding paragraph generalize to stochastic instruments and nonnormal errors. Accordingly, let Ŝ and T̂ denote S̄ and T̄ evaluated with Ω̂ = Ȳ′M_Z Ȳ/(T − K) replacing Ω, where M_Z = I − P_Z. We refer to Moreira's (2001) family of tests, based on statistics of the form g(Ŝ, T̂), as Gaussian similar tests.

5.2 Three Gaussian Similar Tests

We now turn to three Gaussian similar tests: the Anderson–Rubin (AR) statistic, the Kleibergen statistic, and the Moreira statistic.

fail, so inference based on the AR statistic differs from inference based on conventional GMM test statistics, for which the maintained hypothesis is that the instruments are valid. For these reasons, other statistics have been proposed for testing β = β₀ with the aim of improving power relative to AR(β₀) when β is overidentified.

5.2.2. Kleibergen's Statistic. Kleibergen (2001) proposed the statistic

    K(β₀) = (S̄′T̄)²/(T̄′T̄),    (9)

which, following Moreira (2001), we have written in terms of S̄ and T̄. If K = 1, then K(β₀) = AR(β₀). Kleibergen showed that under either conventional or weak-instrument asymptotics, K(β₀) has a χ²₁ null limiting distribution.

5.2.3. Moreira's Statistic. Moreira (2002) proposed testing β = β₀ using the conditional likelihood ratio test statistic

    M(β₀) = (1/2){S̄′S̄ − T̄′T̄ + [(S̄′S̄ + T̄′T̄)² − 4((S̄′S̄)(T̄′T̄) − (S̄′T̄)²)]^(1/2)}.    (10)

The (weak-instrument) asymptotic distribution of M(β₀) under the null, conditional on T̄ = τ, is nonstandard and depends on β₀ and τ. Moreira (2002) suggested computing the null distribution by Monte Carlo simulation.
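The mapping from data to the statistics in (7), (9), and (10) is mechanical once Ω̂ is formed. The following Python/NumPy sketch assembles them for simulated data; the simulated design, parameter values, and variable names are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K, beta, beta0 = 200, 4, 0.0, 0.0
Z = rng.standard_normal((T, K))
pi = np.full(K, 0.1)                                   # weak first stage (illustrative)
u, v = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=T).T
Y = Z @ pi + v                                         # first-stage equation
y = Y * beta + u                                       # structural equation

Ybar = np.column_stack([y, Y])                         # (y, Y)
P = Z @ np.linalg.solve(Z.T @ Z, Z.T)                  # P_Z
Omega = Ybar.T @ (Ybar - P @ Ybar) / (T - K)           # Ybar' M_Z Ybar / (T - K)

b0 = np.array([1.0, -beta0])
a0 = np.array([beta0, 1.0])

w, V = np.linalg.eigh(Z.T @ Z)                         # (Z'Z)^(-1/2) via eigendecomposition
ZtZ_mhalf = V @ np.diag(w ** -0.5) @ V.T

S = ZtZ_mhalf @ Z.T @ Ybar @ b0 / np.sqrt(b0 @ Omega @ b0)      # S-hat, eq. (7)
Oinv_a0 = np.linalg.solve(Omega, a0)
Tbar = ZtZ_mhalf @ Z.T @ Ybar @ Oinv_a0 / np.sqrt(a0 @ Oinv_a0)  # T-hat, eq. (7)

SS, TT, ST = S @ S, Tbar @ Tbar, S @ Tbar
AR = SS / K                                            # AR statistic in (S, T) form
K_stat = ST ** 2 / TT                                  # Kleibergen's statistic, eq. (9)
M_stat = 0.5 * (SS - TT + np.sqrt((SS + TT) ** 2 - 4 * (SS * TT - ST ** 2)))  # eq. (10)
```

By the Cauchy–Schwarz inequality K_stat never exceeds S̄′S̄, and the discriminant in (10) equals (S̄′S̄ − T̄′T̄)² + 4(S̄′T̄)², so M_stat is always nonnegative.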
524 Journal of Business & Economic Statistics, October 2002
Under weak-instrument asymptotics, these tests are conservative (i.e., their size is less than their significance level for some values of the parameters). Numerical analysis suggests that these tests tend to have lower power than the Gaussian similar tests.

5.4 Power Comparisons

The asymptotic power functions of the AR, Kleibergen, and Moreira tests depend on μ²/K, ρ [the correlation between u and v in (1) and (2)], and K, as well as on the true value of β. We consider two values of μ²/K: μ²/K = 1, which corresponds to very weak instruments (nearly unidentified), and μ²/K = 5, which corresponds to moderately weak instruments. The two values of ρ considered correspond to moderate endogeneity (ρ = .5) and very strong endogeneity (ρ = .99, as used in Fig. 1).

Figure 2 presents weak-instrument asymptotic power functions for K = 5 instruments, so the degree of overidentification is 4. The power depends on β − β₀ but not on β₀, so Figure 2 applies to general β₀. The shaded region is the area between Moreira's (2001) infeasible asymptotic Gaussian power envelope and the power function of the AR test; the challenge for newly proposed fully robust tests is to have power functions as close to the top of this region as possible. When μ²/K = 1 and ρ = .5, all tests have poor power for all values of the parameter space—a reassuring result given how weak the instruments are; moreover, all tests have power functions that are far from the infeasible power envelope. Notably, the power functions do not increase monotonically in |β − β₀|. When μ²/K = 5, the M test (but not the K test) approaches the infeasible envelope for both values of ρ.

Figure 3 presents the corresponding power functions for many instruments (K = 50). In all cases, the M test is within or toward the top of the shaded region; this is mainly (but not always) the case for the K test, which has a power function that, oddly, descends substantially on one side of β₀. As Figure 3 makes clear, when K is large, the AR test has relatively low power (arising from its inefficient use of overidentifying restrictions), and substantial power improvements are possible, particularly by using the M test.

[Figure 2. Weak-Instrument Asymptotic Power of Gaussian Similar Tests for K = 5 Instruments; panels plot power against β − β₀ for (a) μ²/K = 1, ρ = 0.5; (b) μ²/K = 1, ρ = 0.99; (c) μ²/K = 5, ρ = 0.5; (d) μ²/K = 5, ρ = 0.99. The upper boundary of the shaded area is the Gaussian power envelope, the lower boundary is the power of the AR test. The other two power functions are for Kleibergen's and Moreira's tests.]

5.5 Robust Confidence Sets

Due to the duality between hypothesis tests and confidence sets, these tests can be used to construct fully robust confidence sets. For example, a fully robust 95% confidence set can be constructed as the set of β₀ for which the AR statistic, AR(β₀), fails to reject at the 5% significance level. In general, this approach requires evaluating the test statistic for all points in the parameter space, although for some statistics the confidence interval can be obtained by solving a polynomial equation.

When the instruments are weak, these sets can have infinite volume. For example, because the AR statistic is a ratio of quadratics, it can have a finite maximum, and when μ² = 0, any point in the parameter space will be contained in the AR confidence set with probability 95%. This does not imply that
Stock, Wright, and Yogo: Weak Instruments and Identification in GMM 525
these methods waste information or are unnecessarily imprecise; rather, if instruments are weak, then there simply is limited information to use to make inferences about β. This point was made formally by Dufour (1997), who showed that under weak-instrument asymptotics, a confidence set for β must have infinite expected volume if it is to have nonzero coverage uniformly in the parameter space, as long as μ² is fixed and finite. This infinite expected volume condition is shared by confidence sets constructed using any of the fully robust methods of this section (see Zivot et al. 1998 for further discussion).
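The grid-inversion construction of Section 5.5 can be sketched in a few lines: evaluate the AR statistic at each candidate β₀ and retain the points that fall below the F critical value. The simulated design, grid limits, and names are illustrative assumptions only; with weak instruments the resulting set can be empty or cover the whole grid.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
T, K = 200, 4
Z = rng.standard_normal((T, K))
u, v = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=T).T
Y = Z @ np.full(K, 0.1) + v                      # weak first stage (illustrative)
y = 0.5 * Y + u

P = Z @ np.linalg.solve(Z.T @ Z, Z.T)            # P_Z

def ar_stat(b0):
    e = y - Y * b0                               # residual at the hypothesized beta_0
    num = e @ P @ e / K                          # quadratic form in P_Z
    den = (e @ e - e @ P @ e) / (T - K)          # quadratic form in M_Z
    return num / den

crit = f.ppf(0.95, K, T - K)                     # 5%-level F(K, T-K) critical value
grid = np.linspace(-5.0, 5.0, 1001)
conf_set = [b for b in grid if ar_stat(b) <= crit]
```

Because the AR statistic is a ratio of quadratics in β₀, the retained region need not be an interval, which is exactly the behavior the text describes.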
6. PARTIALLY ROBUST INFERENCE WITH WEAK INSTRUMENTS
Although the fully robust tests discussed in the previous section always control size, they can be difficult to compute. Moreover, for n > 1, they do not readily provide point estimates, and confidence intervals for individual elements of β must be obtained by conservative projection methods. The methods described in this section are relatively easy to compute, and inference proceeds using conventional normal fixed-model asymptotic approximations. These methods are partially robust to weak instruments in the sense that they are more reliable than TSLS when instruments are weak.

[Figure 3. Weak-Instrument Asymptotic Power of Gaussian Similar Tests for K = 50 Instruments; panels plot power against β − β₀ for (a) μ²/K = 1, ρ = 0.5; (b) μ²/K = 1, ρ = 0.99; (c) μ²/K = 5, ρ = 0.5; (d) μ²/K = 5, ρ = 0.99. See the legend to Figure 2.]

6.1 k-Class Estimators

The k-class estimator of β is β̂(k) = [Y′(I − kM_Z)Y]^(−1)[Y′(I − kM_Z)y]. This class includes TSLS (for which k = 1), LIML, and some alternatives that improve on TSLS when instruments are weak.

6.1.1. Limited-Information Maximum Likelihood. LIML is a k-class estimator where k = k_LIML is the smallest root of the determinantal equation |Ȳ′Ȳ − kȲ′M_Z Ȳ| = 0. Although the mean of the LIML estimator does not exist because its distribution has fat tails, its median is typically much closer to β than is the mean or median of TSLS. In the fixed-instrument, normal-error model, the bias of TSLS increases with K, but the bias of LIML does not (Rothenberg 1984). When the instruments are fixed and the errors are symmetrically distributed, LIML is the best median-unbiased k-class estimator to second order (Rothenberg 1983). Moreover, unlike TSLS, LIML is consistent under many-instrument asymptotics (Bekker 1994).

6.1.2. Fuller-k Estimator. Fuller (1977) proposed an alternative k-class estimator that sets k = k_LIML − b/(T − K), where b is a positive constant. With fixed instruments and normal errors, the Fuller-k estimator with b = 1 is best unbiased to second order (Rothenberg 1984). In Monte Carlo simulations, Hahn et al. (2001a) reported substantial reductions in bias and mean squared error (MSE) using Fuller-k estimators, relative to TSLS and LIML, when instruments are weak.

6.1.3. Bias-Adjusted Two-Stage Least Squares. Donald and Newey (2001) considered a bias-adjusted TSLS estimator (BTSLS), a k-class estimator with k = T/(T − K + 2), modifying an estimator previously proposed by Nagar (1959). Rothenberg (1984) showed that BTSLS is unbiased to second order in the fixed-instrument, normal-error model. Donald and Newey provided expressions for the second-order asymptotic
MSE of BTSLS, TSLS, and LIML as a function of the number of instruments K. In Monte Carlo simulations, these authors found that selecting the number of instruments to minimize the second-order MSE generally improves performance. Chao and Swanson (2001) derived the bias and MSE of TSLS under weak-instrument asymptotics, modified to allow the number of instruments to increase with the sample size. They reported improvements in Monte Carlo simulations by incorporating bias adjustments.
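All of the estimators of Section 6.1 share the single k-class formula and differ only in the choice of k. A minimal Python sketch (one endogenous regressor, no included exogenous regressors; the simulated design is an illustrative assumption) computes k_LIML from the determinantal equation as a generalized eigenvalue and then evaluates TSLS, LIML, Fuller-k, and BTSLS.

```python
import numpy as np
from scipy.linalg import eigvals

rng = np.random.default_rng(2)
T, K, b_fuller = 200, 10, 1.0
Z = rng.standard_normal((T, K))
u, v = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=T).T
Y = Z @ np.full(K, 0.3) + v
y = 1.0 * Y + u

P = Z @ np.linalg.solve(Z.T @ Z, Z.T)            # P_Z
M = np.eye(T) - P                                # M_Z

def kclass(k):
    A = np.eye(T) - k * M                        # I - k M_Z
    return (Y @ A @ y) / (Y @ A @ Y)             # beta-hat(k)

Ybar = np.column_stack([y, Y])
# k_LIML: smallest root of |Ybar'Ybar - k Ybar'M_Z Ybar| = 0,
# i.e., the smallest generalized eigenvalue of (Ybar'Ybar, Ybar'M_Z Ybar)
k_liml = eigvals(Ybar.T @ Ybar, Ybar.T @ M @ Ybar).real.min()

estimates = {
    "TSLS": kclass(1.0),
    "LIML": kclass(k_liml),
    "Fuller(b=1)": kclass(k_liml - b_fuller / (T - K)),
    "BTSLS": kclass(T / (T - K + 2)),
}
```

Since I − M_Z = P_Z, kclass(1.0) reproduces the usual TSLS formula exactly, and k_LIML is always at least 1 because Ȳ′Ȳ dominates Ȳ′M_Z Ȳ.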
6.1.4. Jackknife Instrumental Variables. Angrist, Imbens, and Krueger (1999) proposed the jackknife instrumental variables estimator (JIVE), β̂_JIVE = (Ŷ′Y)^(−1)Ŷ′y, where the ith row of Ŷ is Z_i π̂_(−i) and π̂_(−i) is the estimator of π computed using all but the ith observation. They showed that JIVE and TSLS are asymptotically equivalent under conventional fixed-model asymptotics. Calculations drawing on work of Chao and Swanson (2002) reveal that under weak-instrument asymptotics, JIVE is asymptotically equivalent to a k-class estimator with k = 1 + K/(T − K). Theoretical calculations (Chao and Swanson 2002) and Monte Carlo simulations (Angrist, Imbens, and Krueger 1999; Blomquist and Dahlberg 1999) indicate that JIVE improves on TSLS when there are many instruments.
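JIVE as defined in Section 6.1.4 can be sketched directly. Rather than running T separate leave-one-out regressions, the sketch below uses the standard hat-value shortcut Z_i′π̂_(−i) = (Z_i′π̂ − h_i Y_i)/(1 − h_i); that shortcut, like the simulated design, is an implementation choice of this sketch, not something specified in the text.

```python
import numpy as np

rng = np.random.default_rng(3)
T, K = 200, 10
Z = rng.standard_normal((T, K))
u, v = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=T).T
Y = Z @ np.full(K, 0.3) + v
y = 1.0 * Y + u

ZtZ_inv = np.linalg.inv(Z.T @ Z)
pi_hat = ZtZ_inv @ Z.T @ Y                       # full-sample first stage
h = np.einsum('ij,jk,ik->i', Z, ZtZ_inv, Z)      # leverage values h_i
Y_hat = (Z @ pi_hat - h * Y) / (1.0 - h)         # ith row equals Z_i' pi_hat_(-i)
beta_jive = (Y_hat @ y) / (Y_hat @ Y)
```

The shortcut is an exact algebraic identity for least squares, so each row of Y_hat agrees with the fitted value from a first stage that literally drops observation i.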
6.2 Comparisons

One way to assess how robust an estimator or test is to weak instruments is to characterize the size of its weak-instrument region. When n = 1, this can be done by computing the critical value of the first-stage F statistic testing (at the 5% level) the null hypothesis that μ²/K is too small to ensure a desired degree of reliability under weak-instrument asymptotics (i.e., the instruments are weak) against the alternative that it exceeds the threshold value of μ²/K (i.e., the instruments are strong). This is the approach taken in Table 1 for TSLS, and Figure 4 applies it to the other estimators discussed in this section. In Figure 4(a), the weak-instrument set is defined to be the set of μ²/K such that the relative bias of the estimator exceeds 10%; in Figure 4(b), the weak-instrument set is instead defined so that a nominal 5% test of β = β₀, based on the relevant t statistic, can have size exceeding 15%. In the context of Figure 4, the smaller the critical values, the more robust the procedure.

As Figure 4 shows, LIML, BTSLS, JIVE, and the Fuller-k estimator (with b = 1) generally have smaller critical values than TSLS. In this sense, these four estimators are more robust to weak instruments than TSLS. In contrast to TSLS, these critical values decrease as a function of K. For K ≥ 10, the critical values of the first-stage F statistic fall to 5 or less, well below those for TSLS. In this sense, these partially robust methods evidently provide relatively reliable alternatives in applications with weak instruments.

[Figure 4. Critical Values for Weak-Instrument Tests Based on the First-Stage F Statistic for the TSLS, LIML, BTSLS, JIVE, and Fuller-k Estimators As a Function of the Number of Instruments (K). The critical value is for a 5% test of the null hypothesis that the instruments are weak, defined as (a) the weak-instrument asymptotic relative bias of the estimator exceeds 10% and (b) the weak-instrument asymptotic size of the 5% Wald test can exceed 15%.]

7. GENERALIZED METHOD OF MOMENTS INFERENCE IN GENERAL NONLINEAR MODELS

It has been recognized for some time that the usual large-sample normal approximations to GMM statistics in general nonlinear models can provide poor approximations to exact sampling distributions in problems of applied interest. For example, Hansen et al. (1996) examined GMM estimators of various intertemporal asset pricing models using a Monte Carlo design calibrated to match U.S. data. They found that in many cases, inferences based on the usual normal approximations are misleading (see also Tauchen 1986; Kocherlakota 1990; Ferson and Foerester 1994; Smith 1999). As discussed in Section 3.2, weak instruments are a plausible source of these problems.

The methods of Sections 4–6 apply to the linear IV model with homoscedastic, serially uncorrelated errors. This section provides a nontechnical discussion of methods that apply when the errors are heteroscedastic or serially correlated and/or when the model is nonlinear, that is, extensions of the linear methods for iid data to general GMM. We begin by briefly discussing the problems posed by weak instruments in nonlinear GMM and suggest that a better term in this context is weak identification. We then briefly survey the quite incomplete literature on detection of weak identification and on procedures that are fully or partially robust to weak identification.
7.1 Weak Identification in Nonlinear GMM

In GMM, the n × 1 parameter vector θ is identified by the G conditional mean restrictions E[h(Y_t, θ₀)|Z_t] = 0, where θ₀ is the true value of θ and Z_t is a K-vector of instruments; this in turn implies E[φ_t(θ₀)] = 0, where φ_t(θ) = h(Y_t, θ) ⊗ Z_t. If the instruments are relevant, then E[h(Y_t, θ) ⊗ Z_t] ≠ 0 for θ ≠ θ₀, a necessary condition for θ to be identified. In the linear model, weak instruments arise when E[h(Y_t, θ) ⊗ Z_t] is nearly 0 for θ ≠ θ₀; that is, when Z_t is nearly uncorrelated with the model error term even at false values of θ. More generally, in nonlinear GMM, if E[h(Y_t, θ) ⊗ Z_t] is nearly 0 for θ ≠ θ₀, then θ can be thought of as being weakly identified.

Because there is no exact sampling theory for GMM estimators, formal treatments of weak identification in GMM rely on asymptotics. One approach is to use stochastic expansions in orders of T^(1/2); however, as in the linear case, the resulting approximations seem likely to be poor when identification is very weak. A second approach (Stock and Wright 2000) is to use an asymptotic nesting in which, loosely speaking, the GMM version of the concentration parameter is fixed as T → ∞. This yields a stochastic process representation of the limiting objective function, which in the linear case simplifies to the weak-instrument asymptotics discussed in Section 2.2.

7.2 Detecting Weak Identification

An implication of weak identification is that GMM estimators can exhibit a variety of pathologies. For example, two-step GMM estimators and iterated GMM point estimators can be quite different and can produce quite different confidence sets. If identification is weak, then GMM estimates can be sensitive to the addition of instruments or to changes in the sample. If these features are present in an empirical application, then they can be symptomatic of weak identification.

The only formal test for weak identification in nonlinear GMM of which we are aware is that proposed by Wright (2001). In the conventional asymptotic theory of GMM, the identification condition requires the gradient of φ_t(θ₀) to have full column rank. Wright (2001) proposed a test of the hypothesis of a complete failure of this rank condition. Thus Wright's test, like Cragg and Donald's (1993) in the linear model, is strictly a test for nonidentification or underidentification, not for weak identification.

7.3 Procedures That Are Fully Robust to Weak Identification

We are aware of only two fully robust methods for testing θ = θ₀ in nonlinear GMM: a nonlinear AR statistic and Kleibergen's statistic.

7.3.1. Nonlinear Anderson–Rubin Statistic. Because the numerator and denominator of the AR statistic (8) are evaluated at the true parameter value, it has a weak-instrument asymptotic F_K distribution even if the unknown parameters are poorly identified. This observation suggests tests of θ = θ₀ based on the nonlinear analog of the AR statistic, which is the so-called continuous-updating GMM objective function (Hansen et al. 1996) in which the weight matrix is evaluated at the same parameter value as the numerator:

    S_T^CU(θ) = [T^(−1/2) Σ_{t=1}^T φ_t(θ)]′ Ŵ(θ)^(−1) [T^(−1/2) Σ_{t=1}^T φ_t(θ)],    (11)

where Ŵ(θ) = T^(−1) Σ_{t=1}^T [φ_t(θ) − φ̄(θ)][φ_t(θ) − φ̄(θ)]′ and φ̄(θ) = T^(−1) Σ_{t=1}^T φ_t(θ). [If φ_t(θ) is serially correlated, then Ŵ(θ) is replaced by an estimator of the spectral density of φ_t(θ) at frequency 0.]

Under the null hypothesis θ = θ₀, S_T^CU(θ₀) is asymptotically χ²_GK distributed, whether identification is weak or strong (Stock and Wright 2000). If the instruments are relevant, then under the alternative that θ ≠ θ₀, the "numerator moments" of S_T^CU(θ₀) have nonzero expectation. A confidence set for θ is computed by inverting the S_T^CU(θ) statistic numerically (see Stock and Wright 2000; Ma 2002 for examples).

7.3.2. Kleibergen's Statistic. Kleibergen (2002) proposed testing the hypothesis θ = θ₀ using a generalization of K(β₀) and showed that the proposed statistic has a χ²_n distribution under both conventional asymptotics and the weak-identification asymptotics of Stock and Wright (2000). Kleibergen found in Monte Carlo simulations that his proposed statistic generally gives a more powerful test than S_T^CU(θ₀), consistent with the improvement of the K test over the AR test reported in Section 5.4.

7.4 Procedures That Are Partially Robust to Weak Identification

Because there are estimators that improve on TSLS when instruments are weak in the linear case, it stands to reason that there should be estimators that improve on two-step GMM in the nonlinear case. The limited work in this area to date has yielded some promising results. Two GMM estimators that appear to be partially robust to weak instruments are the continuous-updating estimator (CUE) (Hansen et al. 1996) and the generalized empirical likelihood (GEL) estimator (Smith 1997). The CUE minimizes S_T^CU(θ) in (11). In the linear model, the CUE is asymptotically equivalent to LIML under weak-instrument and conventional asymptotics if the errors are homoscedastic. GEL estimators represent a family of estimators that contain empirical likelihood (Owen 1988; DiCiccio, Hall, and Romano 1991), the CUE, and other estimators. The GEL estimators have good properties in stochastic expansions (Rothenberg 1999; Newey and Smith 2001). For example, all GEL estimators are like LIML, BTSLS, JIVE, and the Fuller-k estimator in the linear model, in the sense that their second-order bias is less than that of the two-step GMM estimator. Work on GEL estimators in the context of weak instruments is promising but young; the reader is referred to Imbens (2002) for further discussion.

8. CONCLUSIONS

Many of the extensions of GMM since Hansen's (1982) and Hansen and Singleton's (1982) seminal work can be seen as attempts to improve the performance of GMM in
circumstances of practical interest to empirical economists. One such circumstance is the presence of weak instruments or weak identification.

Despite the evolving nature of the literature, this survey suggests that there are some useful methods that practitioners can adopt to address concerns about weak instruments. In the linear IV model with homoscedastic errors and one endogenous regressor, applied researchers should at least use the tools of Section 4 to assess whether weak instruments potentially are a problem in a given application, for example, by checking the first-stage F statistic. If the first-stage F statistic is small, say <10, and if the errors appear to be homoscedastic and serially uncorrelated, then either a fully robust method (our preference) from Section 5 or a partially robust method from Section 6 can be used. Even if F > 10, it is prudent to check the results using LIML, BTSLS, JIVE, or the Fuller-k estimator, especially when the number of instruments is large. In the GMM case (i.e., the moments are nonlinear in the parameters and/or the errors are heteroscedastic or serially correlated), one or more of the methods of Sections 7.3 and 7.4 can be used.

There are a number of related topics that, because of space limitations, have not been discussed in this survey. Because we have focused on weak instruments, we did not discuss the problem of estimation when some instruments are strong and others are weak. In that circumstance, one way to proceed is to try to cull the weak instruments from the strong and to use only the strong (see Hall and Inoue 2001; Hall and Peixe 2001; Donald and Newey 2001). A second omitted topic is estimation of linear panel data models with a lagged dependent variable, in which instruments (lags) are weak if the lag coefficient is almost 1; recent work in this area includes that of Kiviet (1995), Alonso-Borrego and Arellano (1996), and Hahn et al. (2001b). A third omitted issue is combining weak instruments with a failure of exogeneity restrictions (as emphasized by Bound et al. 1995). On these and related topics, much work remains.

ACKNOWLEDGMENTS

The authors thank Joshua Angrist, Whitney Newey, Adrian Pagan, Marcelo Moreira, and Eric Swanson for comments on an earlier draft. This research was supported in part by National Science Foundation grant SBR-9730489.

[Received June 2002. Revised June 2002.]

REFERENCES

Alonso-Borrego, C., and Arellano, M. (1996), "Symmetrically Normalized Instrumental-Variable Estimation Using Panel Data," Journal of Business & Economic Statistics, 17, 36–49.

Anderson, T. W. (1976), "Estimation of Linear Functional Relationships: Approximate Distribution and Connections With Simultaneous Equations in Econometrics," Journal of the Royal Statistical Society, Ser. B, 38, 1–36.

Anderson, T. W., and Rubin, H. (1949), "Estimation of the Parameters of a Single Equation in a Complete System of Stochastic Equations," Annals of Mathematical Statistics, 20, 46–63.

Angrist, J. D., Imbens, G. W., and Krueger, A. B. (1999), "Jackknife Instrumental Variables Estimation," Journal of Applied Econometrics, 14, 57–67.

Angrist, J. D., and Krueger, A. B. (1991), "Does Compulsory School Attendance Affect Schooling and Earnings," Quarterly Journal of Economics, 106, 979–1014.

Bekker, P. A. (1994), "Alternative Approximations to the Distribution of Instrumental Variables Estimators," Econometrica, 62, 657–681.

Bekker, P. A., and van der Ploeg, J. (1999), "Instrumental Variable Estimation Based on Grouped Data," manuscript, University of Groningen, Dept. of Economics.

Blomquist, S., and Dahlberg, M. (1999), "Small Sample Properties of LIML and Jackknife IV Estimators: Experiments With Weak Instruments," Journal of Applied Econometrics, 14, 69–88.

Bound, J., Jaeger, D. A., and Baker, R. (1995), "Problems With Instrumental Variables Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variables Is Weak," Journal of the American Statistical Association, 90, 443–450.

Buse, A. (1992), "The Bias of Instrumental Variable Estimators," Econometrica, 60, 173–180.

Campbell, J. Y. (2001), "Consumption-Based Asset Pricing," in Handbook of the Economics of Finance, forthcoming.

Campbell, J. Y., and Mankiw, N. G. (1989), "Consumption, Income and Interest Rates: Reinterpreting the Time Series Evidence," NBER Macroeconomics Annual, 4, 185–216.

Chao, J., and Swanson, N. R. (2001), "Bias and MSE Analysis of the IV Estimator Under Weak Identification With Application to Bias Correction," unpublished manuscript, Purdue University.

Chao, J., and Swanson, N. R. (2002), "Consistent Estimation With a Large Number of Weak Instruments," unpublished manuscript, Purdue University.

Choi, I., and Phillips, P. C. B. (1992), "Asymptotic and Finite Sample Distribution Theory for IV Estimators and Tests in Partially Identified Structural Equations," Journal of Econometrics, 51, 113–150.

Cragg, J. G., and Donald, S. G. (1993), "Testing Identifiability and Specification in Instrumental Variable Models," Econometric Theory, 9, 222–240.

DiCiccio, T., Hall, P., and Romano, J. (1991), "Empirical Likelihood is Bartlett-Correctable," The Annals of Statistics, 19, 1053–1061.

Donald, S. G., and Newey, W. K. (2001), "Choosing the Number of Instruments," Econometrica, 69, 1161–1191.

Dufour, J. M. (1997), "Some Impossibility Theorems in Econometrics With Applications to Structural and Dynamic Models," Econometrica, 65, 1365–1387.

Ferson, W. E., and Foerester, S. R. (1994), "Finite Sample Properties of the Generalized Method of Moments in Tests of Conditional Asset Pricing Models," Journal of Financial Economics, 36, 29–55.

Fuhrer, J. C., and Moore, G. R. (1995), "Inflation Persistence," Quarterly Journal of Economics, 110, 127–159.

Fuhrer, J. C., and Rudebusch, G. D. (2002), "Estimating the Euler Equation for Output," unpublished manuscript, Federal Reserve Bank of Boston.

Fuller, W. (1977), "Some Properties of a Modification of the Limited Information Estimator," Econometrica, 45, 939–953.

Gali, J., and Gertler, M. (1999), "Inflation Dynamics: A Structural Econometric Analysis," Journal of Monetary Economics, 44, 195–222.

Hahn, J., and Hausman, J. (2002), "A New Specification Test for the Validity of Instrumental Variables," Econometrica, 70, 163–189.

Hahn, J., Hausman, J., and Kuersteiner, G. (2001a), "Higher Order MSE of Jackknife 2SLS," unpublished manuscript, Massachusetts Institute of Technology, Dept. of Economics.

Hahn, J., Hausman, J., and Kuersteiner, G. (2001b), "Bias Corrected Instrumental Variables Estimation for Dynamic Panel Models With Fixed Effects," unpublished manuscript, Massachusetts Institute of Technology, Dept. of Economics.

Hall, A. R., and Inoue, A. (2001), "A Canonical Correlations Interpretation of Generalized Method of Moments Estimation With Applications to Moment Selection," unpublished manuscript, North Carolina State University.

Hall, A. R., and Peixe, F. P. M. (2001), "A Consistent Method for the Selection of Relevant Instruments," unpublished manuscript, North Carolina State University.

Hall, A. R., Rudebusch, G. D., and Wilcox, D. W. (1996), "Judging Instrument Relevance in Instrumental Variables Estimation," International Economic Review, 37, 283–289.

Hall, R. E. (1988), "Intertemporal Substitution in Consumption," Journal of Political Economy, 96, 339–357.

Hansen, L. P. (1982), "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50, 1029–1054.

Hansen, L. P., Heaton, J., and Yaron, A. (1996), "Finite Sample Properties of Some Alternative GMM Estimators," Journal of Business & Economic Statistics, 14, 262–280.

Hansen, L. P., and Singleton, K. J. (1982), "Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models," Econometrica, 50, 1269–1286.

Hansen, L. P., and Singleton, K. J. (1983), "Stochastic Consumption, Risk Aversion and the Temporal Behavior of Asset Returns," Journal of Political Economy, 91, 249–265.

Imbens, G. W. (2002), "Generalized Method of Moments and Empirical Likelihood," Journal of Business & Economic Statistics, forthcoming.
Kiviet, J. F. (1995), "On Bias, Inconsistency, and Efficiency of Various Estimators in Dynamic Panel Data Models," Journal of Econometrics, 68, 1–268.

Kleibergen, F. (2001), "Pivotal Statistics for Testing Structural Parameters in Instrumental Variables Regression," Econometrica, forthcoming.

Kleibergen, F. (2002), "Testing Parameters in GMM Without Assuming That They Are Identified," unpublished manuscript, University of Amsterdam, Dept. of Economics.

Kocherlakota, N. (1990), "On Tests of Representative Consumer Asset Pricing Models," Journal of Monetary Economics, 26, 285–304.

Kunitomo, N. (1980), "Asymptotic Expansions of the Distributions of Estimators in a Linear Functional Relationship and Simultaneous Equations," Journal of the American Statistical Association, 75, 693–700.

Ma, A. (2002), "GMM Estimation of the New Phillips Curve," Economics Letters, 76, 411–417.

Mavroeidis, S. (2001), "Identification and Misspecification Issues in Forward Looking Monetary Models," unpublished manuscript, Oxford University, Dept. of Economics.

Moreira, M. J. (2001), "Tests with Correct Size When Instruments Can Be Arbitrarily Weak," unpublished manuscript, University of California Berkeley, Dept. of Economics.

Moreira, M. J. (2002), "A Conditional Likelihood Ratio Test for Structural Models," unpublished manuscript, University of California Berkeley, Dept. of Economics.

Morimune, K. (1983), "Approximate Distributions of k-class Estimators When the Degree of Overidentifiability Is Large Compared With the Sample Size," Econometrica, 51, 821–841.

Nagar, A. L. (1959), "The Bias and Moment Matrix of the General k-Class Estimators of the Parameters in Simultaneous Equations," Econometrica, 27, 575–595.

Neely, C. J., Roy, A., and Whiteman, C. H. (2001), "Risk Aversion Versus Intertemporal Substitution: A Case Study of Identification Failure in the Intertemporal Consumption Capital Asset Pricing Model," Journal of Business & Economic Statistics, 19, 395–403.

Nelson, C. R., and Startz, R. (1990a), "Some Further Results on the Exact Small Sample Properties of the Instrumental Variables Estimator," Econometrica, 58, 967–976.

Nelson, C. R., and Startz, R. (1990b), "The Distribution of the Instrumental Variable Estimator and Its t Ratio When the Instrument Is a Poor One," Journal of Business, 63, S125–S140.

Newey, W. K., and Smith, R. J. (2001), "Higher Order Properties of GMM and Generalized Empirical Likelihood Estimators," unpublished manuscript, Massachusetts Institute of Technology, Dept. of Economics.

Owen, A. (1988), "Empirical Likelihood Ratios Confidence Intervals for a Single Functional," Biometrika, 75, 237–249.

Pagan, A. R., and Robertson, J. C. (1998), "Structural Models of the Liquidity Effect," The Review of Economics and Statistics, 80, 202–217.

Phillips, P. C. B. (1984), "Exact Small Sample Theory in the Simultaneous Equations Model," in Handbook of Econometrics, Vol. 1, eds. Z. Griliches and M. D. Intriligator, Amsterdam: North-Holland.

Richardson, D. H. (1968), "The Exact Distribution of a Structural Coefficient Estimator," Journal of the American Statistical Association, 63, 1214–1226.

Rothenberg, T. J. (1983), "Asymptotic Properties of Some Estimators in Structural Models," in Studies in Econometrics, Time Series, and Multivariate Statistics, eds. S. Karlin, T. Amemiya, and L. A. Goodman, New York: Academic Press.

Rothenberg, T. J. (1984), "Approximating the Distribution of Econometric Estimators and Test Statistics," in Handbook of Econometrics, Vol. 2, eds. Z. Griliches and M. D. Intriligator, Amsterdam: North-Holland.

Rothenberg, T. J. (1999), "Higher Order Properties of Empirical Likelihood for Simultaneous Equations," unpublished manuscript, University of California Berkeley, Dept. of Economics.

Sawa, T. (1969), "The Exact Sampling Distribution of Ordinary Least Squares and Two-Stage Least Squares Estimators," Journal of the American Statistical Association, 64, 923–936.

Shea, J. (1997), "Instrument Relevance in Multivariate Linear Models: A Simple Measure," The Review of Economics and Statistics, 79, 348–352.

Sims, C. A. (1980), "Macroeconomics and Reality," Econometrica, 48, 1–48.

Smith, D. C. (1999), "Finite Sample Properties of Tests of the Epstein-Zin Asset Pricing Model," Journal of Econometrics, 93, 113–148.

Smith, R. (1997), "Alternative Semiparametric Likelihood Approaches to Generalized Method of Moments Estimation," Economic Journal, 107, 503–519.

Staiger, D., and Stock, J. H. (1997), "Instrumental Variables Regression With Weak Instruments," Econometrica, 65, 557–586.

Stock, J. H., and Wright, J. H. (2000), "GMM With Weak Identification," Econometrica, 68, 1055–1096.

Stock, J. H., and Yogo, M. (2001), "Testing for Weak Instruments in Linear IV Regression," unpublished manuscript, Harvard University.

Tauchen, G. (1986), "Statistical Properties of Generalized Method of Moments Estimators of Structural Parameters Obtained From Financial Market Data," Journal of Business & Economic Statistics, 4, 397–425.

Wang, J., and Zivot, E. (1998), "Inference on Structural Parameters in Instrumental Variables Regression With Weak Instruments," Econometrica, 66, 1389–1404.

Wright, J. H. (2001), "Detecting Lack of Identification in GMM," Econometric Theory, forthcoming.

Yogo, M. (2002), "Estimating the Elasticity of Intertemporal Substitution When Instruments Are Weak," unpublished manuscript, Harvard University, Dept. of Economics.

Zivot, E., Startz, R., and Nelson, C. R. (1998), "Valid Confidence Intervals and Inference in the Presence of Weak Instruments," International Economic Review, 39, 1119–1246.