Editor
H. Joseph Newton
Department of Statistics
Texas A&M University
College Station, Texas 77843
979-845-8817; fax 979-845-6077
jnewton@stata-journal.com
Editor
Nicholas J. Cox
Department of Geography
Durham University
South Road
Durham City DH1 3LE UK
n.j.cox@stata-journal.com
Associate Editors
Christopher F. Baum
Boston College
Jens Lauritsen
Odense University Hospital
Nathaniel Beck
New York University
Stanley Lemeshow
Ohio State University
Rino Bellocco
Karolinska Institutet, Sweden, and
Univ. degli Studi di Milano-Bicocca, Italy
J. Scott Long
Indiana University
Maarten L. Buis
Vrije Universiteit, Amsterdam
Thomas Lumley
University of Washington, Seattle
A. Colin Cameron
University of California, Davis
Roger Newson
Imperial College, London
Mario A. Cleves
Univ. of Arkansas for Medical Sciences
Austin Nichols
Urban Institute, Washington DC
William D. Dupont
Vanderbilt University
Marcello Pagano
Harvard School of Public Health
David Epstein
Columbia University
Sophia Rabe-Hesketh
University of California, Berkeley
Allan Gregory
Queen's University
J. Patrick Royston
MRC Clinical Trials Unit, London
James Hardin
University of South Carolina
Philip Ryan
University of Adelaide
Ben Jann
ETH Zürich, Switzerland
Mark E. Schaffer
Heriot-Watt University, Edinburgh
Stephen Jenkins
University of Essex
Jeroen Weesie
Utrecht University
Ulrich Kohler
WZB, Berlin
Nicholas J. G. Winter
University of Virginia
Frauke Kreuter
University of Maryland, College Park
Jeffrey Wooldridge
Michigan State University
Lisa Gilmore
Jennifer Neve and Deirdre Patterson
The Stata Journal publishes reviewed papers together with shorter notes or comments,
regular columns, book reviews, and other material of interest to Stata users. Examples
of the types of papers include 1) expository papers that link the use of Stata commands
or programs to associated principles, such as those that will serve as tutorials for users
first encountering a new field of statistics or a major new technique; 2) papers that go
beyond the Stata manual in explaining key features or uses of Stata that are of interest
to intermediate or advanced users of Stata; 3) papers that discuss new commands or
Stata programs of interest either to a wide spectrum of users (e.g., in data management
or graphics) or to some large segment of Stata users (e.g., in survey statistics, survival
analysis, panel analysis, or limited dependent variable modeling); 4) papers analyzing
the statistical properties of new or existing estimators and tests in Stata; 5) papers
that could be of interest or usefulness to researchers, especially in fields that are of
practical importance but are not often included in texts or other journals, such as the
use of Stata in managing datasets, especially large datasets, with advice from hard-won
experience; and 6) papers of interest to those who teach, including Stata with topics
such as extended examples of techniques and interpretation of results, simulations of
statistical concepts, and overviews of subject areas.
For more information on the Stata Journal, including information for authors, see the
web page
http://www.stata-journal.com
The Stata Journal is indexed and abstracted in the following:
Science Citation Index Expanded® (also known as SciSearch®)
CompuMath Citation Index®
Copyright Statement: The Stata Journal and the contents of the supporting files (programs, datasets, and
help files) are copyright © by StataCorp LP. The contents of the supporting files (programs, datasets, and
help files) may be copied or reproduced by any means whatsoever, in whole or in part, as long as any copy
or reproduction includes attribution to both (1) the author and (2) the Stata Journal.
The articles appearing in the Stata Journal may be copied or reproduced as printed copies, in whole or in part,
as long as any copy or reproduction includes attribution to both (1) the author and (2) the Stata Journal.
Written permission must be obtained from StataCorp if you wish to make electronic copies of the insertions.
This precludes placing electronic copies of the Stata Journal, in whole or in part, on publicly accessible web
sites, fileservers, or other locations where the copy may be accessed by anyone other than the subscriber.
Users of any of the software, ideas, data, or other materials published in the Stata Journal or the supporting
files understand that such use is made without warranty of any kind, by either the Stata Journal, the author,
or StataCorp. In particular, there is no warranty of fitness of purpose or merchantability, nor for special,
incidental, or consequential damages such as loss of profits. The purpose of the Stata Journal is to promote
free communication among Stata users.
The Stata Journal, electronic version (ISSN 1536-8734) is a publication of Stata Press. Stata and Mata are
registered trademarks of StataCorp LP.
without and with sample selection, respectively. The proposed estimators are √n-consistent
and asymptotically normal for the model parameters of interest under
weak assumptions on the distribution of the underlying error terms. Our Monte
Carlo simulations suggest that the efficiency losses of the semi-nonparametric and
the semiparametric maximum likelihood estimators relative to the maximum likelihood estimator of a correctly specified parametric probit model are rather small. On
the other hand, a comparison of these estimators in non-Gaussian designs suggests
that the semi-nonparametric and semiparametric maximum likelihood estimators substantially dominate the parametric probit maximum likelihood estimator.
Keywords: st0144, snp, snp2, snp2s, sml, sml2s, binary-choice models, semi-nonparametric approach, SNP estimation, semiparametric maximum likelihood, SML estimation, Monte Carlo simulation
1 Introduction
The parameters of discrete-choice models are typically estimated by maximum likelihood (ML) after imposing assumptions on the distribution of the underlying error
terms. If the distributional assumptions are correctly specified, then parametric ML
estimators are known to be consistent and asymptotically efficient. However, as discussed at length in the semiparametric literature, departures from the distributional
assumptions may lead to inconsistent estimation. This problem has motivated the development of several semiparametric estimation procedures which consistently estimate
the model parameters under less restrictive distributional assumptions. Semiparametric estimation of binary-choice models has been considered by Manski (1975); Cosslett
(1983); Gallant and Nychka (1987); Powell, Stock, and Stoker (1989); Horowitz (1992);
Ichimura (1993); and Klein and Spady (1993), among others.
© 2008 StataCorp LP   st0144
G. De Luca

In this article, I discuss the semi-nonparametric (SNP) approach of Gallant and
Nychka (1987), the semiparametric maximum likelihood (SML) approach of Klein and
Spady (1993), and a set of new Stata commands for semiparametric estimation of univariate and bivariate binary-choice models. The SNP approach of Gallant and Nychka
(1987), originally proposed for estimation of density functions, was adapted to estimation of univariate and bivariate binary-choice models by Gabler, Laisney, and Lechner
(1993) and De Luca and Peracchi (2007), respectively.1 A generalization of the SML
estimator of Klein and Spady (1993) for semiparametric estimation of bivariate binary-choice models with sample selection was provided by Lee (1995).
approaches differ from the parametric approach because they can handle a broader
class of error distributions. The SNP and SML approaches differ from each other in
how they approximate the unknown distributions. The SNP approach uses a flexible
functional form to approximate the unknown distribution while the SML approach uses
kernel functions.
The remainder of the article is organized as follows. In section 2, I briefly review
parametric specification and ML estimation of three binary-choice models of interest.
SNP and SML estimation procedures, the underlying identifiability restrictions, and the
asymptotic properties of the corresponding estimators are discussed in sections 3 and 4,
respectively. Section 5 describes the syntax and the options of the Stata commands,
while section 6 provides some examples. Monte Carlo evidence on the small-sample
performances of the SNP and SML estimators relative to the parametric probit estimator
is presented in section 7.
2 Parametric ML estimation

Consider first a univariate binary-choice model of the form

$$Y^{*} = \alpha + \beta' X + U \qquad (1)$$

$$Y = 1(Y^{*} \ge 0) \qquad (2)$$

where $Y^{*}$ is a latent variable for which only the binary indicator $Y$ is observed, $X$ is a $k$-vector of exogenous variables, $\theta = (\alpha, \beta')'$ is a $(k+1)$-vector of unknown parameters, and $U$ is a latent regression error. When $U$ has a standardized Gaussian distribution, the model is known as a probit model, and the log-likelihood function for a random sample of $n$ observations is

$$L(\theta) = \sum_{i=1}^{n} \Big[\, Y_i \ln \Pi_i(\theta) + (1 - Y_i) \ln\{1 - \Pi_i(\theta)\} \,\Big] \qquad (3)$$

where $\Pi_i(\theta) = \Phi(\alpha + \beta' X_i)$ and $\Phi(\cdot)$ is the standardized Gaussian distribution function.
Consider next a bivariate binary-choice model with full observability,

$$Y_j^{*} = \alpha_j + \beta_j' X_j + U_j, \qquad j = 1, 2 \qquad (4)$$

$$Y_j = 1(Y_j^{*} \ge 0), \qquad j = 1, 2 \qquad (5)$$
where the $Y_j^{*}$ are latent variables for which only the binary indicators $Y_j$ can be observed,
the $X_j$ are $k_j$-vectors of (not necessarily distinct) exogenous variables, the $\theta_j = (\alpha_j, \beta_j')'$
are $(k_j + 1)$-vectors of unknown parameters, and the $U_j$ are latent regression errors.
When $U_1$ and $U_2$ have a bivariate Gaussian distribution with zero means, unit variances, and correlation coefficient $\rho$, model (4)-(5) is known as a bivariate probit model.
Because $Y_1$ and $Y_2$ are fully observable, the vectors of parameters $\theta_1$ and $\theta_2$ can always
be estimated consistently by separate estimation of two univariate probit models, one
for $Y_1$ and one for $Y_2$. However, when the correlation coefficient $\rho$ is different from zero,
it is more efficient to estimate the two equations jointly by maximizing the log-likelihood
function
$$L(\theta) = \sum_{i=1}^{n} \sum_{r=0}^{1} \sum_{s=0}^{1} 1(Y_{1i} = r,\, Y_{2i} = s)\, \ln \Pi_{rs\,i}(\theta) \qquad (6)$$

where, for example,

$$\Pi_{00}(\theta) = \Pr(Y_1 = 0, Y_2 = 0) = 1 - \Phi(\delta_1) - \Phi(\delta_2) + \Phi_2(\delta_1, \delta_2; \rho)$$

$\Phi_2(\cdot, \cdot\,; \rho)$ is the bivariate Gaussian distribution function with zero means, unit
variances, and correlation coefficient $\rho$, and $\delta_j = \alpha_j + \beta_j' X_j$. An ML estimator $\hat\theta$
maximizes the log-likelihood function (6) over the parameter space $\Theta = \mathbb{R}^{k_1+k_2+2} \times (-1, 1)$.
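As a quick numerical check on these expressions, the four joint probabilities implied by the bivariate probit model must sum to one. The sketch below is our own illustration, not part of the package; the index values delta1 and delta2 are made up, and SciPy's bivariate Gaussian cdf plays the role of Φ2:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Cell probabilities of a bivariate probit with indexes delta1, delta2 and
# correlation rho; Phi2 is the bivariate Gaussian cdf with unit variances.
delta1, delta2, rho = 0.4, -0.7, 0.5
Phi2 = lambda a, b: multivariate_normal(mean=[0, 0],
                                        cov=[[1, rho], [rho, 1]]).cdf([a, b])

p00 = Phi2(-delta1, -delta2)                  # Pr(Y1=0, Y2=0)
p01 = norm.cdf(-delta1) - p00                 # Pr(Y1=0, Y2=1)
p10 = norm.cdf(-delta2) - p00                 # Pr(Y1=1, Y2=0)
p11 = 1 - norm.cdf(-delta1) - norm.cdf(-delta2) + p00   # Pr(Y1=1, Y2=1)
print(p00 + p01 + p10 + p11)                  # sums to 1 by construction
```

The form of Π00 used in the text follows from the symmetry Φ2(−a, −b; ρ) = 1 − Φ(a) − Φ(b) + Φ2(a, b; ρ).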
Consider a bivariate binary-choice model with sample selection where the indicator
Y1 is always observed, while the indicator Y2 is assumed to be observed only for the
subsample of n1 observations (with n1 < n) for which Y1 = 1. The model can be written
as
3. In the following, the suffix i and the explicit conditioning on the vector of covariates X1 and X2
are suppressed to simplify notation.
$$Y_j^{*} = \alpha_j + \beta_j' X_j + U_j, \qquad j = 1, 2 \qquad (7)$$

$$Y_1 = 1(Y_1^{*} \ge 0) \qquad (8)$$

$$Y_2 = 1(Y_2^{*} \ge 0) \quad \text{if } Y_1 = 1 \qquad (9)$$
When the latent regression errors $U_1$ and $U_2$ have a bivariate Gaussian distribution
with zero means, unit variances, and correlation coefficient $\rho$, model (7)-(9) is known
as a bivariate probit model with sample selection. Unlike the case of full observability,
the presence of sample selection has two important implications. First, ignoring the
potential correlation between the two latent regression errors may lead to inconsistent
estimates of $\theta_2 = (\alpha_2, \beta_2')'$ and inefficient estimates of $\theta_1 = (\alpha_1, \beta_1')'$. Second, identifiability of the model parameters requires imposing at least one exclusion restriction on
the two sets of exogenous covariates $X_1$ and $X_2$ (Meng and Schmidt 1985). Construction of the log-likelihood function for joint estimation of the overall vector of model
parameters $\theta = (\theta_1', \theta_2', \rho)'$ is straightforward after noticing that the data identify only
three possible events: $(Y_1 = 1, Y_2 = 1)$, $(Y_1 = 1, Y_2 = 0)$, and $(Y_1 = 0)$. Thus the
log-likelihood function for a random sample of $n$ observations is
$$L(\theta) = \sum_{i=1}^{n} \Big[\, Y_{1i} Y_{2i} \ln \Pi_{11\,i}(\theta) + Y_{1i}(1 - Y_{2i}) \ln \Pi_{10\,i}(\theta) + (1 - Y_{1i}) \ln \Pi_{0\,i}(\theta) \,\Big] \qquad (10)$$

where $\Pi_{0}(\theta) = \Pr(Y_1 = 0)$.
3 SNP estimation
The basic idea of SNP estimation is to approximate the unknown densities of the latent
regression errors by Hermite polynomial expansions and use the approximations to
derive a pseudo-ML estimator for the model parameters. Once we relax the Gaussian
distributional assumption, a semiparametric specification of the likelihood function is
needed. For the three binary-choice models considered in this article, semiparametric
specifications of the log-likelihood functions have the same form as (3), (6), and (10),
respectively, with the probability functions replaced by4
$$\Pi_{11}(\delta_1, \delta_2) = 1 - F_1(-\delta_1) - F_2(-\delta_2) + F(-\delta_1, -\delta_2)$$
$$\Pi_{10}(\delta_1, \delta_2) = F_2(-\delta_2) - F(-\delta_1, -\delta_2)$$
$$\Pi_{01}(\delta_1, \delta_2) = F_1(-\delta_1) - F(-\delta_1, -\delta_2)$$
$$\Pi_{00}(\delta_1, \delta_2) = F(-\delta_1, -\delta_2)$$

where $F_j$ is the unknown marginal distribution function of the latent regression error
$U_j$, $j = 1, 2$, and $F$ is the unknown joint distribution function of $(U_1, U_2)$.5

4. The marginal probability function is defined by $\Pi_1(\delta_1) = 1 - F_1(-\delta_1)$.
5. The probability functions underlying the probit specifications can be easily obtained from these
general expressions by exploiting the symmetry of the Gaussian distribution.
Following Gallant and Nychka (1987), we approximate the unknown joint density,
$f$, of the latent regression errors by a Hermite polynomial expansion of the form

$$f^{*}(u_1, u_2) = \frac{1}{\psi_R}\, \tau_R(u_1, u_2)^2\, \phi(u_1)\, \phi(u_2) \qquad (11)$$

where $\phi(\cdot)$ is the standardized Gaussian density, $\tau_R(u_1, u_2) = \sum_{h=0}^{R_1}\sum_{k=0}^{R_2} \gamma_{hk}\, u_1^h u_2^k$ is a
polynomial in $u_1$ and $u_2$ of order $R = (R_1, R_2)$, and

$$\psi_R = \int\!\!\int \tau_R(u_1, u_2)^2\, \phi(u_1)\, \phi(u_2)\, du_1\, du_2$$

is a normalization factor ensuring that $f^{*}$ integrates to one. Expanding the square, the approximation can be rewritten as

$$f^{*}(u_1, u_2) = \frac{1}{\psi_R} \sum_{h=0}^{2R_1}\sum_{k=0}^{2R_2} \gamma^{*}_{hk}\, u_1^h u_2^k\, \phi(u_1)\, \phi(u_2)$$

with $\gamma^{*}_{hk} = \sum_{r=a_h}^{b_h}\sum_{s=a_k}^{b_k} \gamma_{rs}\, \gamma_{h-r,\,k-s}$, where $a_h = \max(0, h - R_1)$, $a_k = \max(0, k - R_2)$,
$b_h = \min(h, R_1)$, and $b_k = \min(k, R_2)$. Integrating $f^{*}(u_1, u_2)$ alternatively with respect
to $u_2$ and $u_1$ gives the following approximations to the marginal densities $f_1$ and $f_2$:

$$f_1^{*}(u_1) = \int f^{*}(u_1, u_2)\, du_2 = \frac{1}{\psi_R} \sum_{h=0}^{2R_1}\sum_{k=0}^{2R_2} \gamma^{*}_{hk}\, m_k\, u_1^h\, \phi(u_1) = \frac{1}{\psi_R} \sum_{h=0}^{2R_1} \gamma^{*}_{1h}\, u_1^h\, \phi(u_1) \qquad (12)$$

$$f_2^{*}(u_2) = \int f^{*}(u_1, u_2)\, du_1 = \frac{1}{\psi_R} \sum_{h=0}^{2R_1}\sum_{k=0}^{2R_2} \gamma^{*}_{hk}\, m_h\, u_2^k\, \phi(u_2) = \frac{1}{\psi_R} \sum_{k=0}^{2R_2} \gamma^{*}_{2k}\, u_2^k\, \phi(u_2) \qquad (13)$$
6. Further details on the smoothness conditions defining this class of densities can be found in
Gallant and Nychka (1987, 369).
where $m_j$ denotes the $j$th moment of the standardized Gaussian distribution,
$\gamma^{*}_{1h} = \sum_{k=0}^{2R_2} \gamma^{*}_{hk}\, m_k$, $\gamma^{*}_{2k} = \sum_{h=0}^{2R_1} \gamma^{*}_{hk}\, m_h$, and
$\psi_R = \sum_{h=0}^{2R_1} \gamma^{*}_{1h}\, m_h = \sum_{k=0}^{2R_2} \gamma^{*}_{2k}\, m_k$. As for the bivariate density function, $\gamma^{*}_{10}$ and $\gamma^{*}_{20}$ are normalized to one
by imposing that $\gamma_{h0} = \gamma_{0k} = 0$ for all $h = 1, \ldots, R_1$ and $k = 1, \ldots, R_2$. Thus, if $\gamma^{*}_{1h} = 0$
for all $h \ge 1$, then $\psi_R = 1$, and so the approximation $f_1^{*}$ coincides with the standard
normal density. Similarly, the approximation $f_2^{*}$ coincides with the standard normal
density when $\gamma^{*}_{2k} = 0$ for all $k \ge 1$.
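To see the normalization at work, the univariate version of this construction can be sketched numerically (our own illustration in Python, not code from the snp command; the polynomial coefficients gamma are made up). Because psi is computed from the moments of the standard normal, the density tau(u)^2 phi(u) / psi integrates to one by construction:

```python
import numpy as np

def snp_density(gamma):
    """Return the univariate SNP density f*(u) = tau(u)^2 * phi(u) / psi
    for polynomial coefficients gamma = (gamma_0, ..., gamma_R)."""
    R = len(gamma) - 1
    def m(j):   # jth moment of N(0,1): 0 if j odd, (j-1)!! if j even
        return 0.0 if j % 2 else float(np.prod(np.arange(j - 1, 0, -2))) if j else 1.0
    # psi = E[tau(U)^2] under the standard normal, via Gaussian moments
    psi = sum(gamma[h] * gamma[k] * m(h + k)
              for h in range(R + 1) for k in range(R + 1))
    phi = lambda u: np.exp(-u ** 2 / 2) / np.sqrt(2 * np.pi)
    tau = lambda u: sum(g * u ** h for h, g in enumerate(gamma))
    return lambda u: tau(u) ** 2 * phi(u) / psi

f = snp_density([1.0, 0.3, -0.1, 0.05])   # illustrative gamma, R = 3
u = np.linspace(-10, 10, 20001)
du = u[1] - u[0]
print((f(u) * du).sum())                  # numerically close to 1
```

Setting all coefficients after gamma_0 to zero recovers the standard normal density, mirroring the nesting property described in the text.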
Adopting the SNP approximation to the density of the latent regression errors does
not guarantee that they have zero mean and unit variance. The zero-mean condition
implies that some location restriction needs to be imposed on either the distributions
of the error terms or the systematic part of the model. For the univariate model,
Gabler, Laisney, and Lechner (1993) impose restrictions on the SNP parameters to guarantee that the error term has zero mean. For the bivariate model, this approach is quite
complex. Therefore, we follow the alternative approach of Melenberg and van Soest
(1996) and set the two intercept coefficients $\alpha_1$ and $\alpha_2$ to their parametric estimates.
The parametric probit and the SNP estimates are not directly comparable because the
SNP approximation does not have unit variance. However, as shown in section 6, one
can compare the ratio of the estimated coefficients.

After accounting for the above restrictions, the total number of estimated parameters
is $(k_1 + R_1)$ in the univariate SNP model and $(k_1 + k_2 + R_1 R_2)$ in the bivariate SNP
model. Clearly, such models are not identified if the number of independent probabilities
is lower than the number of free parameters to be estimated.7
Subject to these identifiability restrictions, integrating the joint density (11) gives
the following approximation to the joint distribution function $F$:

$$F^{*}(u_1, u_2) = \Phi(u_1)\Phi(u_2) - \frac{1}{\psi_R} A_2(u_2)\Phi(u_1)\phi(u_2) - \frac{1}{\psi_R} A_3(u_1)\Phi(u_2)\phi(u_1) + \frac{1}{\psi_R} A_1(u_1, u_2)\phi(u_1)\phi(u_2)$$

Similarly, integrating the marginal densities (12) and (13) gives the following approximations to the marginal distribution functions $F_1$ and $F_2$:

$$F_1^{*}(u_1) = \Phi(u_1) - \frac{1}{\psi_R} A_3(u_1)\phi(u_1)$$

$$F_2^{*}(u_2) = \Phi(u_2) - \frac{1}{\psi_R} A_2(u_2)\phi(u_2)$$
7. For instance, a univariate SNP model with a single categorical variable $X$ is identified only if $X$
can take at least $(1 + R_1)$ different values. A bivariate SNP model with $X$ is identified only if $X$
can take at least $(2 + R_1 R_2)/3$ different values. A bivariate SNP model with sample selection, in
which $X_1$ and $X_2$ are two distinct categorical variables with $p_1$ and $p_2$ different values, is identified
only if $(2 + R_1 R_2) \le 2\, p_1 p_2$.
where

$$A_1(u_1, u_2) = \sum_{h=0}^{2R_1}\sum_{k=0}^{2R_2} \gamma^{*}_{hk}\, A_h(u_1)\, A_k(u_2)$$

$$A_2(u_2) = \sum_{h=0}^{2R_1}\sum_{k=0}^{2R_2} \gamma^{*}_{hk}\, m_h\, A_k(u_2)$$

$$A_3(u_1) = \sum_{h=0}^{2R_1}\sum_{k=0}^{2R_2} \gamma^{*}_{hk}\, m_k\, A_h(u_1)$$

and $A_h(\cdot)$ is the polynomial defined recursively by $A_0(u) = 0$, $A_1(u) = 1$, and
$A_h(u) = u^{h-1} + (h-1)A_{h-2}(u)$, so that $\int_{-\infty}^{u} s^h \phi(s)\, ds = m_h \Phi(u) - A_h(u)\phi(u)$.
4 SML estimation
The basic idea of the SML estimation procedure is that of maximizing a pseudo-log-likelihood function in which the unknown probability functions are locally approximated
by nonparametric kernel estimators.

Consider first SML estimation of a univariate binary-choice model. Before describing
the estimation procedure in detail, we discuss nonparametric identification of the vector
of parameters $\theta = (\alpha, \beta')'$. As for the SNP estimation procedure, the intercept coefficient $\alpha$
can be absorbed into the unknown distribution function of the error term and is not
separately identified. Furthermore, the slope coefficients $\beta$ can only be identified up to
a scale parameter. In this case, however, the scale normalization must be based on a
continuous variable with a nonzero coefficient, and it must be directly imposed on the
estimation process.8 Following Pagan and Ullah (1999), these location-scale normalizations
can be obtained by imposing the linear index restriction

$$\Pi(\theta) = \Pr(Y = 1 \mid X; \theta) = \Pr\{Y = 1 \mid \delta(X; \beta)\} = \Pi(\delta)$$

By Bayes' theorem, the latter probability can be written as

$$\Pi(\delta) = \frac{P\, g\{\delta(X;\beta) \mid Y = 1\}}{P\, g\{\delta(X;\beta) \mid Y = 1\} + (1 - P)\, g\{\delta(X;\beta) \mid Y = 0\}} \qquad (14)$$

where $P = \Pr(Y = 1)$ and $g(\cdot \mid Y = j)$ is the conditional density of the index given $Y = j$. Replacing these unknown functions with kernel estimators gives

$$\widehat\Pi(\delta) = \frac{\widehat g_1(\delta; h_n)}{\widehat g_1(\delta; h_n) + \widehat g_0(\delta; h_n)} \qquad (15)$$
For the bivariate model with sample selection, the conditional probability of $Y_2 = 1$ given $Y_1 = 1$ can be written analogously as

$$\Pi_{1|1}(\delta) = \frac{P_{1|1}\, g(\delta \mid Y_1 = 1, Y_2 = 1)}{P_{1|1}\, g(\delta \mid Y_1 = 1, Y_2 = 1) + (1 - P_{1|1})\, g(\delta \mid Y_1 = 1, Y_2 = 0)} \qquad (16)$$

where $P_{1|1} = \Pr(Y_2 = 1 \mid Y_1 = 1)$. Its kernel estimator, based on the subsample of $n_1$ selected observations and bandwidth $h_{n_1}$, is

$$\widehat\Pi_{1|1}(\delta) = \frac{\widehat g_{1|1}(\delta; h_{n_1})}{\widehat g_{1|1}(\delta; h_{n_1}) + \widehat g_{0|1}(\delta; h_{n_1})}$$
The probability functions entering the log-likelihood (10) are then estimated by

$$\widehat\Pi_{0}(\delta_1) = 1 - \widehat\Pi_{1}(\delta_1)$$

$$\widehat\Pi_{11}(\delta) = \widehat\Pi_{1}(\delta_1)\, \widehat\Pi_{1|1}(\delta)$$

$$\widehat\Pi_{10}(\delta) = \widehat\Pi_{1}(\delta_1)\, \{1 - \widehat\Pi_{1|1}(\delta)\}$$
Lee (1995) shows that, under mild regularity conditions, the resulting SML estimator is consistent and asymptotically normal.
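To make the kernel ratio (15) concrete, the following sketch (ours, with simulated data; not the internal code of sml) estimates Pr(Y = 1 | index) with Gaussian kernels and the default bandwidth n^(-1/6.5):

```python
import numpy as np

# Illustrative kernel estimate of Pr(Y = 1 | delta), as in (15): a ratio of
# Gaussian-kernel density estimates for the index on the Y=1 and Y=0 subsamples.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, n)
delta = 1.5 * x                                 # linear index, scale fixed
y = (delta + rng.standard_normal(n) > 0).astype(float)

h = n ** (-1 / 6.5)                             # default bandwidth rule of sml
def pi_hat(d):
    """Kernel-ratio estimate of Pr(Y = 1 | index = d)."""
    k = np.exp(-0.5 * ((d - delta) / h) ** 2)   # Gaussian kernel weights
    g1 = (y * k).sum()                          # proportional to P * g(d | Y=1)
    g0 = ((1 - y) * k).sum()                    # proportional to (1-P) * g(d | Y=0)
    return g1 / (g1 + g0)

# With Gaussian errors the true probability at index 0 is Phi(0) = 0.5
print(pi_hat(0.0))
```

The estimate is increasing in the index and close to the probit probability in this Gaussian design; only the ratio of slope coefficients, not their scale, would be identified in practice.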
5 Stata commands

5.1 SNP commands
The new Stata commands snp, snp2, and snp2s estimate the parameters of the SNP
binary-choice models considered in this article. In particular, snp fits a univariate
binary-choice model, snp2 fits a bivariate binary-choice model, while snp2s fits a bivariate binary-choice model with sample selection. The general syntax of these commands
is as follows:
snp depvar varlist [if] [in] [weight] [, noconstant offset(varname) order1(#)
    robust from(matname) dplot(filename) level(#) maximize_options]

snp2 equation1 equation2 [if] [in] [weight] [, order1(#) order2(#) robust
    from(matname) dplot(filename) level(#) maximize_options]

snp2s depvar varlist [if] [in] [weight],
    select(depvar_s = varlist_s [, offset(varname) noconstant])
    [order1(#) order2(#) robust from(matname) dplot(filename) level(#)
    maximize_options]

where each equation is specified as ([eqname:] depvar = varlist [, noconstant offset(varname)])
snp, snp2, and snp2s are implemented for Stata 9 by using ml model lf. These commands share the same features as all Stata estimation commands, including access to
the estimation results and the options for the maximization process (see [R] maximize).
fweights, pweights, and iweights are allowed (see [U] 14.1.6 weight). Most of the
options are similar to those of other Stata estimation commands. A description of the
options that are specific to our SNP commands is provided below.
5.2 SML commands
The new Stata commands sml and sml2s estimate the parameters of the SML models
discussed in this article. sml fits a univariate binary-choice model, and sml2s fits
a bivariate binary-choice model with sample selection. The general syntax of these
commands is as follows:
sml depvar varlist [if] [in] [weight] [, noconstant offset(varname)
    bwidth(#) from(matname) level(#) maximize_options]

sml2s depvar varlist [if] [in] [weight],
    select(depvar_s = varlist_s [, offset(varname) noconstant])
    [bwidth1(#) bwidth2(#) from(matname) level(#) maximize_options]
sml and sml2s are implemented for Stata 9 by using ml model d2 and ml model d0,
respectively. In this case, ml model lf cannot be used because SML estimators violate
the linear-form restriction.12 Unlike the SNP commands, pweight and robust are not
allowed with the sml and sml2s commands. Although this may be a drawback of our SML
routines, it is important to mention that SML estimators impose weaker distributional
assumptions than the SNP estimators, and they are also robust to heteroskedasticity
of a general but known form and to heteroskedasticity of an unknown form if it depends
on the underlying indexes (see Klein and Spady [1993]). A description of the options
that are specific to our SML commands is provided below.

11. As pointed out by an anonymous referee, for a finite R, the SNP model can be misspecified, and the
robust option accounts for this misspecification in estimating the covariance matrix of the SNP
estimator.
Options

bwidth(#) specifies the value of the bandwidth parameter h_n. The default is
h_n = n^(-1/6.5), where n is the overall sample size.

bwidth1(#) specifies the value of the bandwidth parameter h_n used for nonparametric
estimation of the selection probability Π̂_1(δ_1). The default is h_n = n^(-1/6.5), where n
is the overall sample size.

bwidth2(#) specifies the value of the bandwidth parameter h_{n1} used for nonparametric
estimation of the conditional probability Π̂_{1|1}(δ_1, δ_2). The default is h_{n1} = n_1^(-1/6.5),
where n_1 is the number of selected observations.
5.3 Further remarks

1. SNP and SML estimators typically require large samples. Furthermore, since the
log-likelihood functions of these estimators are not globally concave, it is good
practice to check for convergence to the global maximum rather than a local one
by using the from option.
2. Asymptotic properties of the SNP estimators require that the degree R of the
Hermite polynomial expansion increases with the sample size. In particular, snp
generalizes the probit model only if R ≥ 3 (see Gabler, Laisney, and Lechner
[1993]). For snp2 and snp2s, the error terms may have skewness and kurtosis
different from those of a Gaussian distribution only if R_1 ≥ 2 or R_2 ≥ 2. In
practice, the values of R, R_1, and R_2 may be selected either through a sequence of
likelihood-ratio tests or by model-selection criteria such as the Akaike information
criterion or the Bayesian information criterion (see the lrtest command).
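As an illustration of the model-selection route, a BIC comparison across candidate orders can be sketched as follows (the log-likelihood values and parameter counts below are invented for illustration; in practice they would come from the saved results of successive snp fits):

```python
import math

# Choose the SNP order R by the Bayesian information criterion.
# fits maps R -> (log likelihood, number of estimated parameters);
# the numbers are made up for this sketch.
n = 2000
fits = {3: (-640.2, 7), 4: (-632.1, 8), 5: (-631.8, 9)}

def bic(loglik, k):
    """BIC = -2 * loglik + k * ln(n); smaller is better."""
    return -2 * loglik + k * math.log(n)

best_R = min(fits, key=lambda R: bic(*fits[R]))
print(best_R)   # -> 4: the gain from R = 5 does not pay for the extra parameter
```

The same loop with the Akaike penalty 2k instead of k ln(n) gives the AIC variant mentioned in the text.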
3. SML estimation uses Gaussian kernels with a fixed bandwidth. Asymptotic properties of the SML estimators require the bandwidth parameters to satisfy the
restrictions n^(-1/6) < h_n < n^(-1/8) and n_1^(-1/6) < h_{n1} < n_1^(-1/8). In practice, one may
want to assess the sensitivity of the SML estimates to alternative bandwidth choices
within these ranges.

12. An extensive discussion of the alternative Stata ML models can be found in Gould, Pitblado, and
Sribney (2006).
4. The proposed estimators are more computationally demanding than the corresponding parametric estimators because of both the greater complexity of the
likelihood functions and the fact that they are written as ado-files. The number
of iterations required by the SNP estimators typically increases with the order of the
Hermite polynomial expansion. Convergence of the SML estimators usually requires
fewer iterations, but they are more computationally demanding because kernel
regression is conducted at each step of the maximization process. For both
types of estimators, estimation time further depends on the number of observations and the number of covariates.
6 Examples
This section provides illustrations of the SNP and SML commands using simulated data,
which allows us to have a benchmark for the estimation results. The Stata code for our
data-generating process is
. * Data generating process
. clear all
. set seed 1234
. matrix define sigma=(1,.5\.5,1)
. quietly drawnorm u1 u2, n(2000) corr(sigma) double
. generate double x1=(uniform()*2-1)*sqrt(3)
. generate double x2=(uniform()*2-1)*sqrt(3)
. generate double x3=invchi2(1,uniform())
. generate x4=(uniform()>.5)
. generate y1=(x1-x3+2*x4+u1>0)
. generate y2=(x2+.5*x3-1.5*x4+u2>0)
Error terms are generated from a bivariate Gaussian distribution with zero means, unit
variances, and a correlation coefficient equal to 0.5. The set of covariates includes four
variables: X_1 and X_2 are independently drawn from a uniform distribution on (−√3, √3),
standardized to have zero mean and unit variance; X_3 is drawn from a chi-squared
distribution with 1 degree of freedom; and X_4 is drawn from a Bernoulli distribution
with a probability of success equal to 0.5. To guarantee identifiability of the model
parameters, our data-generating process imposes one exclusion restriction in each
equation, namely, X_1 only enters the equation of Y_1, while X_2 only enters the equation of Y_2.
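For readers who want to replicate the experiment outside Stata, the same design can be sketched in Python (our translation of the commands above; the random draws differ from Stata's, so simulated values will not match the reported output):

```python
import numpy as np

# Python analogue of the Stata data-generating process: correlated Gaussian
# errors, covariates with the same marginal distributions, two latent indexes.
rng = np.random.default_rng(1234)
n = 2000
cov = np.array([[1.0, 0.5], [0.5, 1.0]])          # corr(U1, U2) = 0.5
u = rng.multivariate_normal([0.0, 0.0], cov, size=n)
x1 = rng.uniform(-1, 1, n) * np.sqrt(3)           # uniform, zero mean, unit variance
x2 = rng.uniform(-1, 1, n) * np.sqrt(3)
x3 = rng.chisquare(1, n)                          # chi-squared, 1 df
x4 = (rng.uniform(size=n) > 0.5).astype(int)      # Bernoulli(0.5)
y1 = (x1 - x3 + 2 * x4 + u[:, 0] > 0).astype(int)
y2 = (x2 + 0.5 * x3 - 1.5 * x4 + u[:, 1] > 0).astype(int)
print(y1.mean(), y2.mean())                       # both outcomes well inside (0, 1)
```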
The parametric probit estimates of the equation for Y1 are

Probit regression                                 Number of obs =    2000
                                                  LR chi2(3)    = 1492.07
                                                  Prob > chi2   =  0.0000
                                                  Pseudo R2     =  0.5407
------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.       z     P>|z|
-------------+----------------------------------------------------------
          x1 |   1.093143    .0539591    20.26    0.000
          x3 |  -1.153823    .0605857   -19.04    0.000
          x4 |   2.037333    .0988448    20.61    0.000
       _cons |   .1540149    .0596325     2.58    0.010
------------------------------------------------------------------------

and the normalized coefficient ratios are

       b3_b1:  _b[x3] / _b[x1]
       b4_b1:  _b[x4] / _b[x1]
------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.       z     P>|z|
-------------+----------------------------------------------------------
       b3_b1 |   -1.05551    .0527265   -20.02    0.000
       b4_b1 |   1.863739    .0887034    21.01    0.000
------------------------------------------------------------------------
Because of the different scale normalization, estimated coefficients of the probit model
are not directly comparable with those of the SNP and SML models. Here we compare
the ratio of the estimated coefficients by using the nlcom command.
The SNP estimates of the same model, with degree of the univariate Hermite polynomial expansion R = 4, are given by
                                                  Number of obs =    2000
                                                  Wald chi2(3)  =   36.86
                                                  Prob > chi2   =  0.0000
Log likelihood = -632.0571
------------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.600913   .2670701     5.99   0.000    1.077465     2.124361
          x3 |  -1.688176   .2840946    -5.94   0.000   -2.244991     -1.13136
          x4 |   2.990631   .4982394     6.00   0.000      2.0141     3.967163
       _cons |   .1540149    (Fixed)
-------------+----------------------------------------------------------------
SNP coefs:   |
           1 |  -.1277219   .0937248    -1.36   0.173   -.3114192     .0559754
           2 |   .1213818   .0883377     1.37   0.169   -.0517569     .2945206
           3 |   .0391799   .0232224     1.69   0.092   -.0063352     .0846951
           4 |   .0170504   .0201136     0.85   0.397   -.0223715     .0564722
------------------------------------------------------------------------------
Standard Deviation = 1.470122
Skewness           =  .226157
Kurtosis           = 2.720993

The normalized coefficient ratios are

------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.       z     P>|z|
-------------+----------------------------------------------------------
       b3_b1 |  -1.054508    .0532962   -19.79    0.000
       b4_b1 |   1.868078    .0853722    21.88    0.000
------------------------------------------------------------------------
. matrix b0=e(b)
Estimated coefficients and standard errors are very close to the corresponding probit
estimates. SNP coefficients are not significantly different from zero, and a likelihood-ratio test of the probit model against the SNP model does not reject the Gaussianity
assumption. Estimates of skewness and kurtosis are also close to the Gaussian values of
0 and 3, respectively. In general, however, a very large sample size of, say, 10,000 observations is necessary to obtain accurate estimates of these higher-order moments.13 The
post option in nlcom causes this command to behave like a Stata estimation command.
Below we use these normalized estimates of the snp command as starting values for the
sml command:
13. Simulation also indicates that while the skewness and kurtosis converge to those of the true error
distribution, the reported variance differs by a scale factor from the variance of the true error
distribution.
                                                  Number of obs =    2000
                                                  Wald chi2(2)  =  625.91
                                                  Prob > chi2   =  0.0000
------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.       z     P>|z|
-------------+----------------------------------------------------------
          x3 |  -1.065468    .0530123   -20.10    0.000
          x4 |   1.882163    .0899912    20.91    0.000
          x1 |   (offset)
------------------------------------------------------------------------
The parametric estimates of the bivariate probit model for (Y1, Y2) are

                                                  Number of obs =    2000
                                                  Wald chi2(6)  = 1242.93
                                                  Prob > chi2   =  0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1           |
          x1 |   1.101044   .0526561    20.91   0.000      .99784     1.204248
          x3 |  -1.151508   .0589227   -19.54   0.000   -1.266994    -1.036021
          x4 |   2.056867   .0982408    20.94   0.000    1.864318     2.249415
       _cons |   .1432707   .0589297     2.43   0.015    .0277707     .2587708
-------------+----------------------------------------------------------------
y2           |
          x2 |   1.045511   .0457917    22.83   0.000    .9557608     1.135261
          x3 |   .4806406   .0356907    13.47   0.000    .4106882      .550593
          x4 |  -1.551646   .0813526   -19.07   0.000   -1.711094    -1.392198
       _cons |   .0184006   .0544944     0.34   0.736   -.0884065     .1252077
-------------+----------------------------------------------------------------
     /athrho |   .6076755   .0732266     8.30   0.000    .4641541      .751197
-------------+----------------------------------------------------------------
         rho |   .5424888   .0516764                     .4334638     .6358625
------------------------------------------------------------------------------
LR test of rho = 0:                chi2(1) = 79.0783
The normalized coefficient ratios are

------------------------------------------------------------------------
             |      Coef.   Std. Err.       z     P>|z|
-------------+----------------------------------------------------------
y1           |
       x3/x1 |  -1.045833    .0506567   -20.65    0.000
       x4/x1 |   1.868106    .0861391    21.69    0.000
y2           |
       x3/x2 |   .4597184    .0332726    13.82    0.000
       x4/x2 |  -1.484103    .0749728   -19.80    0.000
------------------------------------------------------------------------

The snp2 estimates of the same model, with (R1, R2) = (3, 3), are given by
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1           |
          x1 |   1.554566   .1293488    12.02   0.000    1.301047     1.808085
          x3 |  -1.617294   .1431821   -11.30   0.000   -1.897926    -1.336662
          x4 |   2.971796   .2476522    12.00   0.000    2.486407     3.457186
-------------+----------------------------------------------------------------
y2           |
          x2 |   1.743243   .1550252    11.24   0.000    1.439399     2.047087
          x3 |   .7931422   .0845287     9.38   0.000     .627469     .9588155
          x4 |  -2.603726     .22436   -11.61   0.000   -3.043464    -2.163989
-------------+----------------------------------------------------------------
Intercepts:  |
      _cons1 |          0    (Fixed)
      _cons2 |          0    (Fixed)
-------------+----------------------------------------------------------------
SNP coefs:   |
       g_1_1 |   -.467446    .337281    -1.39   0.166   -1.128505     .1936125
       g_1_2 |  -.0437985   .0702888    -0.62   0.533    -.181562     .0939651
       g_1_3 |   .2417127   .0936064     2.58   0.010    .0582476     .4251778
       g_2_1 |   .0275117   .0667043     0.41   0.680   -.1032263     .1582497
       g_2_2 |   .1097933   .0351317     3.13   0.002    .0409364     .1786502
       g_2_3 |  -.0127886   .0201542    -0.63   0.526   -.0522901      .026713
       g_3_1 |   .1368238    .082591     1.66   0.098   -.0250516     .2986991
       g_3_2 |   .0312873   .0186252     1.68   0.093   -.0052175      .067792
       g_3_3 |  -.0309619   .0215084    -1.44   0.150   -.0731175     .0111938
------------------------------------------------------------------------------
y2 equation:
Standard Deviation = 1.426339
Variance           = 2.034443
Skewness           = .1274959
Kurtosis           = 2.554007
The normalized coefficient ratios are

------------------------------------------------------------------------
             |      Coef.   Std. Err.       z     P>|z|
-------------+----------------------------------------------------------
y1           |
       x3/x1 |  -1.040351    .0503783   -20.65    0.000
       x4/x1 |   1.911656    .0838935    22.79    0.000
y2           |
       x3/x2 |   .4549809    .0305409    14.90    0.000
       x4/x2 |  -1.493611    .0723319   -20.65    0.000
------------------------------------------------------------------------
By specifying the noconstant options, the intercept coefficients are normalized to zero,
and starting values are set to the estimates of the bivariate probit model with no intercept. Once differences in the scale of the error terms are taken into account, the
estimated coefficients of biprobit and snp2 seem to be very close. As explained in
section 3, the bivariate probit model is nested in the bivariate SNP model only if the
correlation coefficient ρ is equal to zero. Accordingly, a likelihood-ratio test for the
Gaussianity of the error terms cannot be used. Furthermore, it is important to notice that snp2 and snp2s do not provide standard errors and confidence intervals for
the estimated correlation coefficient. If this is a parameter of interest, inference can be
carried out via the bootstrap, although this alternative can be computationally demanding. The estimated correlation coefficient is indeed provided as an estimation output in
e(rho). Figure 1 shows the plots of the two estimated marginal densities obtained by
specifying the dplot option.
[Figure 1. Estimated marginal error densities for Eq. 1 and Eq. 2, produced by the dplot option.]
In the next example, we introduce selectivity in the equation for Y2 and present
parametric ML estimates of the resulting bivariate binary-choice model with sample
selection.
. quietly replace y2=. if y1<1
. heckprob y2 x2 x3 x4, select(y1=x1 x3 x4) nolog
Probit model with sample selection                Number of obs   =    2000
                                                  Censored obs    =     919
                                                  Uncensored obs  =    1081
                                                  Wald chi2(3)    =  260.84
                                                  Prob > chi2     =  0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y2           |
          x2 |   1.093629   .0681056    16.06   0.000    .9601448     1.227114
          x3 |   .4040404   .0848008     4.76   0.000    .2378339     .5702469
          x4 |  -1.609261   .1514939   -10.62   0.000   -1.906183    -1.312338
       _cons |    .046692   .1134511     0.41   0.681    -.175668      .269052
-------------+----------------------------------------------------------------
y1           |
          x1 |   1.089935   .0536583    20.31   0.000    .9847668     1.195103
          x3 |  -1.149565   .0601002   -19.13   0.000   -1.267359    -1.031771
          x4 |   2.044188   .0983742    20.78   0.000    1.851378     2.236998
       _cons |   .1475302   .0592512     2.49   0.013    .0313999     .2636604
-------------+----------------------------------------------------------------
     /athrho |   .6923304   .1696845     4.08   0.000    .3597548     1.024906
-------------+----------------------------------------------------------------
         rho |    .599477   .1087045                      .344998     .7718572
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0):                chi2(1) = 19.62

The normalized coefficient ratios are

------------------------------------------------------------------------
             |      Coef.   Std. Err.       z     P>|z|
-------------+----------------------------------------------------------
y1           |
       x3/x1 |  -1.054709    .0524603   -20.10    0.000
       x4/x1 |   1.875514    .0885771    21.17    0.000
y2           |
       x3/x2 |   .3694491    .0734796     5.03    0.000
       x4/x2 |  -1.471486    .1124007   -13.09    0.000
------------------------------------------------------------------------
The snp2s estimates of the same model with (R1 , R2 ) = (4, 3) are given by
                                                  Number of obs   =      2000
                                                  Wald chi2(3)    =    110.18
                                                  Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y2           |
          x2 |   1.837312   .1807611    10.16   0.000     1.483027    2.191597
          x3 |    .685451   .1467516     4.67   0.000     .3978231    .9730789
          x4 |  -2.681849   .2982197    -8.99   0.000    -3.266349   -2.097349
-------------+----------------------------------------------------------------
y1           |
          x1 |   1.582025   .1916435     8.26   0.000      1.20641    1.957639
          x3 |  -1.695678   .1991834    -8.51   0.000     -2.08607   -1.305286
          x4 |   3.011402   .3429733     8.78   0.000     2.339186    3.683617
-------------+----------------------------------------------------------------
Intercepts:  |
      _cons1 |   .1475302      Fixed
      _cons2 |    .046692      Fixed
-------------+----------------------------------------------------------------
SNP coefs:   |
       g_1_1 |  -.4411874   .4443872    -0.99   0.321     -1.31217    .4297954
       g_1_2 |  -.0775538   .1175507    -0.66   0.509     -.307949    .1528413
       g_1_3 |   .2447965    .099868     2.45   0.014     .0490589    .4405341
       g_2_1 |   .2157904   .2817236     0.77   0.444    -.3363777    .7679585
       g_2_2 |   .1268945   .0864948     1.47   0.142    -.0426323    .2964212
       g_2_3 |  -.0950307   .0726311    -1.31   0.191    -.2373851    .0473237
       g_3_1 |    .113475   .0886566     1.28   0.201    -.0602888    .2872388
       g_3_2 |   .0453493   .0369828     1.23   0.220    -.0271357    .1178343
       g_3_3 |  -.0294287   .0219723    -1.34   0.180    -.0724937    .0136362
       g_4_1 |  -.0449806   .0489556    -0.92   0.358    -.1409318    .0509705
       g_4_2 |   -.005844   .0185705    -0.31   0.753    -.0422415    .0305535
       g_4_3 |   .0187026   .0113096     1.65   0.098    -.0034638     .040869
------------------------------------------------------------------------------
Selection equation (y1):  Standard Deviation = 1.473068    Variance = 2.16993
                          Skewness = .1437971              Kurtosis = 2.862437
------------------------------------------------------------------------------
Scale-normalized coefficients:
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1           |
          x3 |   -1.07184   .0522701   -20.51   0.000    -1.174288   -.9693929
          x4 |   1.903511   .0850225    22.39   0.000      1.73687    2.070152
y2           |
          x3 |   .3730727   .0669923     5.57   0.000     .2417702    .5043752
          x4 |  -1.459659    .104647   -13.95   0.000    -1.664763   -1.254555
------------------------------------------------------------------------------
As a final example, we provide estimates obtained from the sml2s command by
setting hn = n^(-1/6.5) and hn1 = n^(-1/6.02).
. quietly summarize y2
. local bw2=1/(r(N)^(1/6.02))
. sml2s y2 x3 x4, select(y1=x3 x4, offset(x1)) offset(x2) bwidth2(`bw2') nolog
Two-stage SML estimator - Lee (1995)
Log likelihood = -1044.401                        Number of obs   =      2000
                                                  Wald chi2(2)    =    154.17
                                                  Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y2           |
          x3 |   .3691167   .0789709     4.67   0.000     .2143366    .5238969
          x4 |  -1.463612   .1178916   -12.41   0.000    -1.694675   -1.232549
          x2 |   (offset)
-------------+----------------------------------------------------------------
y1           |
          x3 |  -1.063704   .0488601   -21.77   0.000    -1.159468   -.9679396
          x4 |   1.889661   .0845938    22.34   0.000      1.72386    2.055462
          x1 |   (offset)
------------------------------------------------------------------------------
The first two lines provide a simple way to specify alternative values for the bandwidth
parameters. Here we implicitly assume that the number of nonmissing observations on
Y2 equals the size of the estimation sample. If this is not the case, because of missing
data on the covariates, the summarize command on the first line should be restricted
to the relevant estimation sample.
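The rule-of-thumb bandwidths used here are simple powers of the sample size; a minimal Python sketch (the `bandwidth` helper is ours, not part of the sml2s package) reproduces the computation done by the local macro above:

```python
def bandwidth(n, a):
    # Rule-of-thumb bandwidth h_n = n^(-1/a), i.e., 1/(n^(1/a)) as computed
    # by the local macro bw2 in the Stata code above.
    return n ** (-1.0 / a)

# The three exponents considered later in the Monte Carlo study.
for a in (6.02, 6.25, 6.5):
    print(a, bandwidth(2000, a))
```

Larger exponents give larger (slower-shrinking) bandwidths, which is the only difference between the three SML variants compared below.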
To investigate the finite-sample properties of the SNP and SML estimators, we conducted
a set of Monte Carlo simulations. The aim of this experiment is to assess both the
efficiency losses of these estimators relative to the parametric probit ML estimator
in the Gaussian case and the effectiveness of the proposed estimators under different
non-Gaussian distributional assumptions.
Overall, our Monte Carlo experiment consists of four simulation designs and three
sample sizes (500, 1000, and 2000). In each design, simulated data were generated from
the following bivariate latent regression model:

   Y1 = β11 X11 + β12 X12 + U1
   Y2 = β21 X21 + β22 X22 + U2

where the true parameters are β11 = β21 = β22 = 1 and β12 = 1. The regressors X11
and X21 were independently drawn from a uniform distribution with support (-1, 1),
while X12 and X22 were independently drawn from chi-squared distributions with 1
and 3 degrees of freedom, respectively. All the regressors were standardized to have
zero means and unit variances. The simulation designs differ in the distributional
assumptions made on the latent regression errors U1 and U2 (see table 1).
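The regressor standardizations just described can be checked by simulation; the following Python sketch (illustrative only, with our own `chi2_draw` helper) verifies that each standardized regressor has mean close to 0 and variance close to 1:

```python
import math
import random

# Simulation check of the regressor standardizations described above:
# a uniform draw on (-1, 1) scaled by sqrt(3), and chi-squared draws with
# 1 and 3 degrees of freedom, centered and scaled by their sd.
random.seed(42)
n = 100_000

def chi2_draw(df):
    # A chi-squared draw as the sum of df squared standard normals.
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

x_unif = [(random.random() * 2 - 1) * math.sqrt(3) for _ in range(n)]
x_chi1 = [(chi2_draw(1) - 1) / math.sqrt(2) for _ in range(n)]
x_chi3 = [(chi2_draw(3) - 3) / math.sqrt(6) for _ in range(n)]

for x in (x_unif, x_chi1, x_chi3):
    m = sum(x) / n
    v = sum((xi - m) ** 2 for xi in x) / (n - 1)
    print(round(m, 3), round(v, 3))  # each mean ~ 0, each variance ~ 1
```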
Table 1. Theoretical moments by simulation design

                           Design 1   Design 2   Design 3   Design 4
Skewness of U1                0         0.66        0         0.68
Skewness of U2                0         0.80        0         1.19
Kurtosis of U1                3         3           2.60      4.01
Kurtosis of U2                3         3           2.00      5.13
Correlation coefficient       0.5       0.5         0.5       0.5
In Design 1, the error terms were generated from a bivariate Gaussian distribution
with zero means, unit variances, and correlation coefficient ρ = 0.5. In Designs 2–4,
the error terms were generated from a mixture of two bivariate Gaussian distributions
with equal covariance matrices,

   f(U1, U2; π, θ) = π f1(U1, U2; m1, Σ) + (1 - π) f2(U1, U2; m2, Σ)

where π is the mixing probability, and the fj(·, ·; mj, Σ), j = 1, 2, are bivariate Gaussian
densities with mean mj = (mj1, mj2) and covariance matrix

   Σ = | σ11²   σ12  |
       | σ12    σ22² |
The theoretical moments of this bivariate mixture are

   E(Uj)  = π m1j + (1 - π) m2j
   E(Uj²) = σjj² + π m1j² + (1 - π) m2j²
   E(Uj³) = π (3 σjj² m1j + m1j³) + (1 - π) (3 σjj² m2j + m2j³)
   E(Uj⁴) = 3 σjj⁴ + π (6 σjj² m1j² + m1j⁴) + (1 - π) (6 σjj² m2j² + m2j⁴)
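The moment formulas above are easy to check numerically. The following Python sketch (the parameter values are illustrative, not those used in the article's designs) compares the closed-form raw moments of a scalar two-component Gaussian mixture with simulated moments, and converts them into skewness and kurtosis:

```python
import math
import random

def mixture_moments(p, m1, m2, s2):
    # Closed-form raw moments E(U), E(U^2), E(U^3), E(U^4) of a scalar
    # mixture: with probability p draw N(m1, s2), else N(m2, s2),
    # where s2 is the common variance (cf. the formulas in the text).
    e1 = p * m1 + (1 - p) * m2
    e2 = s2 + p * m1**2 + (1 - p) * m2**2
    e3 = p * (3 * s2 * m1 + m1**3) + (1 - p) * (3 * s2 * m2 + m2**3)
    e4 = 3 * s2**2 + p * (6 * s2 * m1**2 + m1**4) \
         + (1 - p) * (6 * s2 * m2**2 + m2**4)
    return e1, e2, e3, e4

def skew_kurt(e1, e2, e3, e4):
    # Skewness and kurtosis implied by the raw moments.
    var = e2 - e1**2
    mu3 = e3 - 3 * e1 * e2 + 2 * e1**3
    mu4 = e4 - 4 * e1 * e3 + 6 * e1**2 * e2 - 3 * e1**4
    return mu3 / var**1.5, mu4 / var**2

# Illustrative parameters (not from the article).
random.seed(1)
p, m1, m2, s2 = 0.3, 1.0, -0.5, 0.4
draws = [random.gauss(m1 if random.random() < p else m2, math.sqrt(s2))
         for _ in range(200_000)]
sim = [sum(d**k for d in draws) / len(draws) for k in (1, 2, 3, 4)]
exact = mixture_moments(p, m1, m2, s2)
```

As a sanity check, with p = 1 and m1 = 0, s2 = 1, `skew_kurt(*mixture_moments(1, 0, 0, 1))` returns skewness 0 and kurtosis 3, the Gaussian values.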
By varying the mixing probability π and the parameters of the two Gaussian components f1(U1, U2; m1, Σ) and f2(U1, U2; m2, Σ), one can then define a family of bivariate
mixtures with given skewness, kurtosis, and correlation coefficient.14 Table 1 gives the
skewness and kurtosis used in each design. The latent regression errors were generated from an asymmetric and mesokurtic distribution in Design 2, a symmetric and
platykurtic distribution in Design 3, and an asymmetric and leptokurtic distribution in
Design 4.15 Error terms were then standardized to have zero means, unit variances,
14. Although bivariate mixture distributions allow us to control the levels of skewness, kurtosis, and
correlation in each design, it is difficult to assess whether these error structures are nested
within the SNP model for a finite value of R. For this reason, our simulation design may
be biased against the SNP estimator.
15. To investigate the small-sample behavior of the three estimators under different levels of skewness
and kurtosis, error terms were always generated with stronger departures from Gaussianity in the
distribution of U2 .
and correlation coefficient ρ = 0.5 in each design. Stata code for the data-generating
process of the non-Gaussian designs is
. * Data generating process - non-Gaussian designs
. clear all
. set seed 1234
. matrix define var=(`v1',`cov'\`cov',`v2')
. matrix define mu1=(`mu11',`mu21')
. matrix define mu2=(`mu12',`mu22')
. quietly drawnorm u11 u21, n(`sample') m(mu1) cov(var) double
. quietly drawnorm u12 u22, n(`sample') m(mu2) cov(var) double
. quietly generate d1=(uniform()<`pi')
. quietly generate double u1=u11 if d1==1
. quietly generate double u2=u21 if d1==1
. quietly replace u1=u12 if d1==0
. quietly replace u2=u22 if d1==0
. local m1=(`pi'*`mu11') + (1-`pi')*(`mu12')
. local m2=(`pi'*`mu21') + (1-`pi')*(`mu22')
. local sd1=sqrt(`v1'+`pi'*(1-`pi')*(`mu12'-`mu11')^2)
. local sd2=sqrt(`v2'+`pi'*(1-`pi')*(`mu22'-`mu21')^2)
. quietly replace u1=(u1-`m1')/`sd1'
. quietly replace u2=(u2-`m2')/`sd2'
. quietly generate double x11=(uniform()*2-1)*sqrt(3)
. quietly generate double x21=(uniform()*2-1)*sqrt(3)
. quietly generate double x12=(invchi2(1,uniform())-1)/sqrt(2)
. quietly generate double x22=(invchi2(3,uniform())-3)/sqrt(6)
. quietly generate double y1s=`b11'*x11+`b12'*x12+u1
. quietly generate double y2s=`b21'*x21+`b22'*x22+u2
where the mixing probability pi and the set of parameters (mu11, mu12, mu21, mu22,
v1, v2, cov) are chosen to obtain the selected levels of skewness, kurtosis, and correlation coefficient (see Preston [1953]).
Throughout the study, comparability of the probit, SNP, and SML estimators is
obtained by imposing the scale normalization β11 = β21 = 1. For the parametric probit
and SNP estimators, the normalization is imposed on the estimation results by taking
the ratios of the estimated coefficients, β12/β11 and β22/β21, while for the SML estimator
the normalization is imposed directly in the estimation process by constraining the
coefficients of X11 and X21 to one. We always used the default starting values for the
SNP and SML estimators. Furthermore, SNP and SML estimation were performed with
prespecified values of R and hn, respectively. To save computational time, no check was
undertaken to investigate convergence to the global maximum rather than a local one,
and we used rule-of-thumb values for R and hn.
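As a concrete illustration, the ratio normalization can be applied to the parametric coefficient estimates reported in the illustrative output earlier in this section; the dictionaries below simply transcribe those numbers:

```python
# Scale normalization by coefficient ratios (see the text above): divide
# each equation's coefficients by the coefficient of its first regressor.
b_y1 = {"x1": 1.089935, "x3": -1.149565, "x4": 2.044188}   # selection eq.
b_y2 = {"x2": 1.093629, "x3": 0.4040404, "x4": -1.609261}  # main eq.

norm_y1 = {k: v / b_y1["x1"] for k, v in b_y1.items()}
norm_y2 = {k: v / b_y2["x2"] for k, v in b_y2.items()}

print(round(norm_y1["x3"], 4))  # -1.0547
print(round(norm_y2["x4"], 4))  # -1.4715
```

The resulting ratios match the scale-normalized estimates reported with the parametric output, which is exactly how the probit and SNP estimates are made comparable with the SML ones.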
Tables 2–4 focus on the univariate binary-choice model for Y2 and present summary
statistics for the simulation results from 1000 replications with sample sizes 500, 1000,
and 2000, respectively.16 The normalization restrictions imply that there is only one
free parameter in the model, whose true value is 1. SNP estimation was performed under
three alternative choices of the degree of the univariate Hermite polynomial expansion
(R = 3, 4, 5), while SML estimation was performed under three alternative values of the
bandwidth parameter, hn = n^(-1/6.02), n^(-1/6.25), and n^(-1/6.5). According
to our simulation results, the efficiency losses of the SNP and SML estimators in the
Gaussian design (Design 1) are rather small. In particular, the efficiency of the
SNP estimator relative to the probit estimator ranges between 74% and 89%, while the
efficiency of the SML estimator relative to the probit estimator ranges between
78% and 83%.17
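The summary statistics reported in the tables (see footnote 16) can be computed along the following lines; the `mc_summary` helper in this Python sketch is hypothetical, not code from the article:

```python
import math

def mc_summary(estimates, std_errors, true_value, z_crit=1.959964):
    # Monte Carlo summary statistics: mean estimate, standard deviation,
    # mean squared error around the true value, and the rejection rate of
    # the nominal 5% Wald test of H0: coefficient = true_value.
    n = len(estimates)
    mean = sum(estimates) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n - 1))
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    rr = sum(abs((e - true_value) / s) > z_crit
             for e, s in zip(estimates, std_errors)) / n
    return mean, sd, mse, rr

# Toy usage with four fictitious replications:
mean, sd, mse, rr = mc_summary([1.0, 1.2, 0.8, 1.1], [0.1] * 4, 1.0)
```

With unbiased estimates and correctly sized tests, `rr` should converge to 0.05 as the number of replications grows, which is the benchmark used in the discussion below.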
A comparison of the three estimators in the non-Gaussian designs further suggests
that the SNP and SML estimators substantially dominate the probit estimator, especially in
Designs 2 and 4, where the error terms are generated from asymmetric distributions. First,
the bias of the probit estimator is about 10% in Design 2 and about 6.5% in Design 4,
while the bias of the SNP and SML estimators never exceeds 1.5%. Second, the ratios between
the mean squared error (MSE) of the probit estimator and the MSEs of the two
semiparametric estimators range between 1.7 and 5.3 in Design 2, and between 1.2 and
3.3 in Design 4. As expected, the efficiency gains of the SNP and SML estimators relative to
the probit estimator increase as the sample size becomes larger. Third, the actual
rejection rate of the Wald test for the probit estimate being equal to the true value of
the parameter is quite far from the nominal value of 5%, while the actual rejection rates
of the Wald tests for the SNP and SML estimates converge to their nominal values as
the sample size becomes larger.
16. For each simulation design and sample size, we provide the average and standard deviation of
the estimates, the mean squared error of each comparable estimator, and the rejection rate of the
Wald test for each estimated coefficient being equal to its true value.
17. Results for the SNP estimator are consistent with the simulation results of Klein and Spady (1993),
who find a relative efficiency of 78% in different simulation designs.
Table 2. Simulation results for the univariate binary-choice model (n = 500)

Des. 1        Est      SD     MSE      RR
probit      1.0039   .1452   .0211   .0566
snp3        1.0041   .1581   .0250   .0627
snp4        1.0104   .1624   .0265   .0688
snp5        1.0074   .1684   .0284   .1021
sml1        1.0157   .1633   .0269   .0890
sml2        1.0204   .1634   .0271   .0829
sml3        1.0248   .1619   .0268   .0728

Des. 2        Est      SD     MSE      RR
probit       .9017   .1598   .0352   .1776
snp3        1.0039   .1446   .0209   .0737
snp4         .9930   .1443   .0209   .0898
snp5         .9931   .1438   .0207   .0979
sml1         .9855   .1405   .0199   .0838
sml2         .9876   .1387   .0194   .0757
sml3         .9900   .1377   .0191   .0676

Des. 3        Est      SD     MSE      RR
probit      1.0091   .1495   .0224   .0443
snp3        1.0206   .1690   .0290   .0622
snp4        1.0125   .1718   .0297   .0802
snp5        1.0162   .1773   .0317   .0970
sml1        1.0432   .1675   .0299   .0665
sml2        1.0487   .1684   .0307   .0675
sml3        1.0544   .1698   .0318   .0643

Des. 4        Est      SD     MSE      RR
probit       .9354   .1461   .0255   .1227
snp3         .9991   .1397   .0195   .0563
snp4        1.0027   .1389   .0193   .0724
snp5        1.0003   .1404   .0197   .0785
sml1         .9749   .1457   .0219   .1066
sml2         .9784   .1439   .0212   .0986
sml3         .9821   .1423   .0206   .0915
Table 3. Simulation results for the univariate binary-choice model (n = 1000)

Des. 1        Est      SD     MSE      RR
probit      1.0048   .1059   .0112   .0592
snp3        1.0010   .1120   .0125   .0602
snp4        1.0065   .1149   .0133   .0582
snp5        1.0040   .1189   .0142   .0772
sml1        1.0119   .1171   .0138   .0762
sml2        1.0158   .1168   .0139   .0702
sml3        1.0197   .1164   .0139   .0612

Des. 2        Est      SD     MSE      RR
probit       .9064   .1303   .0257   .2452
snp3        1.0104   .1054   .0112   .0701
snp4        1.0030   .1038   .0108   .0801
snp5        1.0011   .1032   .0107   .0761
sml1         .9977   .1044   .0109   .0801
sml2         .9994   .1037   .0107   .0771
sml3        1.0011   .1030   .0106   .0771

Des. 3        Est      SD     MSE      RR
probit      1.0022   .1029   .0106   .0484
snp3        1.0092   .1124   .0127   .0494
snp4        1.0038   .1149   .0132   .0575
snp5        1.0031   .1175   .0138   .0676
sml1        1.0294   .1130   .0136   .0494
sml2        1.0343   .1138   .0141   .0434
sml3        1.0396   .1151   .0148   .0464

Des. 4        Est      SD     MSE      RR
probit       .9353   .1109   .0165   .1595
snp3        1.0000   .0982   .0096   .0562
snp4        1.0074   .0975   .0096   .0632
snp5        1.0052   .0972   .0095   .0702
sml1         .9876   .1024   .0106   .1013
sml2         .9904   .1012   .0103   .0943
sml3         .9931   .1002   .0101   .0832
Table 4. Simulation results for the univariate binary-choice model (n = 2000)

Des. 1        Est      SD     MSE      RR
probit      1.0005   .0717   .0051   .0502
snp3         .9957   .0753   .0057   .0592
snp4        1.0009   .0762   .0058   .0532
snp5         .9991   .0776   .0060   .0633
sml1        1.0035   .0783   .0061   .0602
sml2        1.0067   .0779   .0061   .0582
sml3        1.0102   .0778   .0062   .0582

Des. 2        Est      SD     MSE      RR
probit       .8969   .1199   .0250   .4320
snp3        1.0030   .0687   .0047   .0530
snp4         .9981   .0676   .0046   .0540
snp5         .9974   .0677   .0046   .0560
sml1         .9932   .0694   .0049   .0600
sml2         .9944   .0687   .0048   .0530
sml3         .9956   .0681   .0047   .0520

Des. 3        Est      SD     MSE      RR
probit      1.0012   .0724   .0052   .0511
snp3        1.0047   .0783   .0061   .0521
snp4        1.0029   .0792   .0063   .0541
snp5        1.0012   .0806   .0065   .0581
sml1        1.0226   .0806   .0070   .0591
sml2        1.0266   .0813   .0073   .0611
sml3        1.0310   .0825   .0078   .0571

Des. 4        Est      SD     MSE      RR
probit       .9304   .0940   .0137   .2593
snp3         .9936   .0666   .0045   .0460
snp4        1.0017   .0645   .0042   .0440
snp5         .9995   .0644   .0042   .0450
sml1         .9889   .0694   .0049   .0761
sml2         .9908   .0684   .0048   .0711
sml3         .9928   .0676   .0046   .0641
18. As explained in section 5.2, the SML commands do not support the robust option for the Huber/White/sandwich estimator of the covariance matrix.
Table 5. Simulation results for the bivariate binary-choice model (n = 500)

Des. 1        Est      SD     MSE      RR
bipr1       1.0095   .1207   .0147   .0543
snp21       1.0086   .1315   .0174   .0834
bipr2       1.0025   .1413   .0200   .0492
snp22        .9933   .1507   .0228   .1055

Des. 2        Est      SD     MSE      RR
bipr1        .9684   .1193   .0152   .0775
snp21        .9756   .1279   .0170   .1016
bipr2        .9056   .1550   .0329   .1640
snp22        .9500   .1669   .0304   .2072

Des. 3        Est      SD     MSE      RR
bipr1       1.0163   .1209   .0149   .0437
snp21        .9988   .1307   .0171   .0863
bipr2       1.0096   .1441   .0208   .0416
snp22       1.0053   .1575   .0248   .1102

Des. 4        Est      SD     MSE      RR
bipr1        .9681   .1241   .0164   .1000
snp21        .9875   .1338   .0180   .1010
bipr2        .9376   .1416   .0239   .1230
snp22        .9583   .1362   .0203   .1360
Table 6. Simulation results for the bivariate binary-choice model (n = 1000)

Des. 1        Est      SD     MSE      RR
bipr1       1.0067   .0841   .0071   .0530
snp21       1.0081   .0898   .0081   .0670
bipr2       1.0056   .1014   .0103   .0590
snp22        .9962   .1082   .0117   .0910

Des. 2        Est      SD     MSE      RR
bipr1        .9599   .0902   .0098   .1020
snp21        .9673   .0919   .0095   .1070
bipr2        .9105   .1261   .0239   .2360
snp22        .9537   .1201   .0166   .1900

Des. 3        Est      SD     MSE      RR
bipr1       1.0135   .0843   .0073   .0453
snp21        .9975   .0897   .0081   .0866
bipr2       1.0021   .0985   .0097   .0433
snp22        .9975   .1029   .0106   .0816

Des. 4        Est      SD     MSE      RR
bipr1        .9615   .0906   .0097   .1160
snp21        .9908   .0894   .0081   .0880
bipr2        .9382   .1076   .0154   .1540
snp22        .9610   .0985   .0112   .1230
Table 7. Simulation results for the bivariate binary-choice model (n = 2000)

Des. 1        Est      SD     MSE      RR
bipr1       1.0011   .0566   .0032   .0521
snp21       1.0025   .0600   .0036   .0571
bipr2        .9991   .0701   .0049   .0470
snp22        .9926   .0706   .0050   .0691

Des. 2        Est      SD     MSE      RR
bipr1        .9574   .0701   .0067   .1510
snp21        .9661   .0702   .0061   .1250
bipr2        .9015   .1152   .0230   .4050
snp22        .9513   .0903   .0105   .1970

Des. 3        Est      SD     MSE      RR
bipr1       1.0125   .0578   .0035   .0481
snp21        .9984   .0609   .0037   .0601
bipr2       1.0015   .0706   .0050   .0531
snp22        .9990   .0691   .0048   .0621

Des. 4        Est      SD     MSE      RR
bipr1        .9540   .0737   .0075   .1660
snp21        .9897   .0621   .0040   .0770
bipr2        .9337   .0908   .0126   .2460
snp22        .9630   .0739   .0068   .1320
Finally, tables 8–10 provide simulation results for the bivariate binary-choice model
with sample selection for Y1 and Y2. In this case, selectivity was introduced by setting
Y2 to missing whenever Y1 = 0. As for the bivariate model without sample selection,
the normalization restrictions imply that there are two free parameters in the model,
one in the selection equation whose true value is 1 and one in the main equation whose
true value is 1. Here we compare the performance of the bivariate probit estimator
with sample selection, the SNP estimator with R1 = 4 and R2 = 3, and the SML
estimator with hn = n^(-1/6.5) and hn1 = n^(-1/6.5). Our simulation results suggest again
that the efficiency losses of the SNP and SML estimators with respect to a correctly specified
probit estimator are rather small in both equations (relative efficiencies of 87% and 80% in
the first equation, and 86% and 70% in the second equation). In the non-Gaussian cases, the
probit estimator is instead markedly biased and less efficient than the SNP and SML
estimators, especially in the presence of asymmetric distributions and relatively large
sample sizes. As before, the actual rejection rates of the Wald tests for the SNP and
SML estimates are better than those for the parametric probit estimates, but they are
still far from their nominal values of 5%. These coverage problems are likely
due to the incorrect choice of the degree of the Hermite polynomial expansion and of the
bandwidth parameters, respectively. Given the computational burden of our Monte
Carlo simulations, investigating the optimal choice of these parameters is beyond the
scope of this article. We leave this topic for future research.
Table 8. Simulation results for the bivariate binary-choice model with sample selection (n = 500)

Des. 1        Est      SD     MSE      RR
heckpr1     1.0114   .1228   .0152   .0557
snp2s1      1.0096   .1357   .0185   .0902
sml2s1      1.0106   .1399   .0197   .0952
heckpr2     1.0048   .2047   .0419   .0588
snp2s2       .9895   .2281   .0521   .1631
sml2s2      1.0112   .2506   .0629   .1581

Des. 2        Est      SD     MSE      RR
heckpr1      .9682   .1184   .0150   .0774
snp2s1       .9795   .1291   .0171   .0905
sml2s1      1.0066   .1338   .0179   .0905
heckpr2      .8944   .2178   .0586   .1347
snp2s2       .9490   .1980   .0418   .1236
sml2s2       .9667   .2435   .0604   .1759

Des. 3        Est      SD     MSE      RR
heckpr1     1.0184   .1254   .0161   .0483
snp2s1      1.0044   .1361   .0185   .0924
sml2s1      1.0185   .1371   .0192   .0903
heckpr2     1.0622   .2282   .0559   .0378
snp2s2      1.0019   .2221   .0493   .1082
sml2s2      1.0713   .2897   .0890   .1366

Des. 4        Est      SD     MSE      RR
heckpr1      .9688   .1233   .0162   .0993
snp2s1       .9951   .1338   .0179   .1003
sml2s1       .9990   .1462   .0214   .1274
heckpr2      .9133   .2127   .0527   .1474
snp2s2       .9617   .2132   .0469   .1685
sml2s2       .9477   .2477   .0641   .2247
Table 9. Simulation results for the bivariate binary-choice model with sample selection (n = 1000)

Des. 1        Est      SD     MSE      RR
heckpr1     1.0070   .0857   .0074   .0512
snp2s1      1.0082   .0906   .0083   .0592
sml2s1      1.0081   .0944   .0090   .0813
heckpr2     1.0073   .1382   .0192   .0572
snp2s2       .9830   .1534   .0238   .1155
sml2s2      1.0063   .1611   .0260   .1185

Des. 2        Est      SD     MSE      RR
heckpr1      .9593   .0897   .0097   .1030
snp2s1       .9766   .0895   .0086   .0890
sml2s1      1.0038   .0924   .0086   .0750
heckpr2      .8959   .1694   .0395   .1610
snp2s2       .9554   .1507   .0247   .1080
sml2s2       .9766   .1711   .0298   .1510

Des. 3        Est      SD     MSE      RR
heckpr1     1.0136   .0853   .0075   .0446
snp2s1      1.0009   .0920   .0085   .0791
sml2s1      1.0129   .0940   .0090   .0751
heckpr2     1.0583   .1665   .0311   .0538
snp2s2       .9974   .1538   .0237   .0974
sml2s2      1.0555   .1826   .0364   .1156

Des. 4        Est      SD     MSE      RR
heckpr1      .9613   .0910   .0098   .1234
snp2s1       .9947   .0888   .0079   .0782
sml2s1       .9973   .1013   .0103   .1224
heckpr2      .9056   .1563   .0333   .1635
snp2s2       .9662   .1413   .0211   .1254
sml2s2       .9471   .1668   .0306   .1956
Table 10. Simulation results for the bivariate binary-choice model with sample selection (n = 2000)

Des. 1        Est      SD     MSE      RR
heckpr1     1.0020   .0576   .0033   .0502
snp2s1      1.0045   .0618   .0038   .0592
sml2s1      1.0016   .0641   .0041   .0662
heckpr2     1.0032   .0961   .0093   .0431
snp2s2       .9836   .1028   .0108   .0953
sml2s2       .9985   .1158   .0134   .1003

Des. 2        Est      SD     MSE      RR
heckpr1      .9580   .0694   .0066   .1420
snp2s1       .9781   .0663   .0049   .0920
sml2s1      1.0026   .0658   .0043   .0840
heckpr2      .8880   .1471   .0342   .2500
snp2s2       .9561   .1082   .0136   .1230
sml2s2       .9784   .1189   .0146   .1250

Des. 3        Est      SD     MSE      RR
heckpr1     1.0117   .0585   .0036   .0518
snp2s1      1.0021   .0615   .0038   .0640
sml2s1      1.0080   .0629   .0040   .0701
heckpr2     1.0615   .1213   .0185   .0691
snp2s2      1.0015   .1037   .0108   .0813
sml2s2      1.0487   .1213   .0171   .0904

Des. 4        Est      SD     MSE      RR
heckpr1      .9549   .0729   .0074   .1600
snp2s1       .9939   .0619   .0039   .0650
sml2s1       .9950   .0693   .0048   .1130
heckpr2      .9033   .1316   .0267   .2470
snp2s2       .9721   .0983   .0104   .0990
sml2s2       .9572   .1224   .0168   .1900
Acknowledgments
I thank Franco Peracchi, David Drukker, Vince Wiggins, and two anonymous referees
for their valuable comments and suggestions. I also thank Claudio Rossetti for help in
programming the univariate SML estimator.
References
Meng, C. L., and P. Schmidt. 1985. On the cost of partial observability in the bivariate
probit model. International Economic Review 26: 71–85.
Pagan, A., and A. Ullah. 1999. Nonparametric Econometrics. Cambridge: Cambridge
University Press.
Powell, J. L., J. H. Stock, and T. M. Stoker. 1989. Semiparametric estimation of index
coefficients. Econometrica 57: 1403–1430.
Preston, E. J. 1953. A graphical method for the analysis of statistical distributions into
two normal components. Biometrika 40: 460–464.
Stewart, M. B. 2004. Semi-nonparametric estimation of extended ordered probit models.
Stata Journal 4: 27–39.
About the author
Giuseppe De Luca received his PhD in Econometrics and Empirical Economics from the
University of Rome Tor Vergata in the summer of 2007. Currently, he works at the Istituto
per lo Sviluppo della Formazione Professionale dei Lavoratori (ISFOL) as a research assistant
in econometrics. His research interests are microeconometric theory and applications,
nonparametric estimation methods, problems of nonresponse in sample surveys, and models
of labor supply.