Generalized Linear Models
Author(s): Charles E. McCulloch

Source: Journal of the American Statistical Association, Vol. 95, No. 452 (Dec., 2000), pp. 1320-1324
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2669780
Accessed: 12/11/2010 07:30
Generalized Linear Models

Charles E. MCCULLOCH
Charles E. McCulloch is Professor, Division of Biostatistics, University of California, San Francisco, CA 94143 (E-mail: chuck@biostat.ucsf.edu). This is BU-1449-M in the Biometrics Unit Technical Report Series at Cornell University and was supported by National Science Foundation grant DMS-9625476.

© 2000 American Statistical Association, Journal of the American Statistical Association, December 2000, Vol. 95, No. 452, Vignettes

1. INTRODUCTION AND SOME HISTORY

What is the difference, in absolute value, between logistic regression and discriminant analysis? I will not make you read this entire article to find the answer, which is 2. But you will have to read a bit further to find out why.

As most statisticians know, logistic regression and probit regression are commonly used techniques for modeling a binary response variable as a function of one or more predictors. These techniques have a long history, with the term "probit" traced by David (1995) back to Bliss (1934), and Finney (1952) attributing the origin of the technique itself to psychologists in the late 1800s. In its earliest incarnations, probit analysis was little more than a transformation technique; scientists realized that the sigmoidal shape often observed in plots of observed proportions of successes versus a predictor x could be rendered a straight line by
applying a transformation corresponding to the inverse of the normal cdf.

For example, Bliss (1934) described an experiment in which nicotine is applied to aphids and the proportion killed is recorded. (How is that for an early antismoking message?) Letting $\Phi^{-1}(\cdot)$ represent the inverse of the standard normal cdf, and $\hat{p}_i$ the observed proportion killed at dose $d_i$ of the nicotine, Bliss showed a plot of $\Phi^{-1}(\hat{p}_i)$ versus $\log d_i$. The plot seems to indicate that a two-segment linear regression model in $\log d_i$ is the appropriate model. In an article a year later, Bliss (1935) explained the methodology in more detail as a weighted linear regression of $\Phi^{-1}(\hat{p}_i)$ on the predictor $x_i$ using weights equal to $[n_i \phi(\hat{p}_i)^2]/\{\Phi(\hat{p}_i)[1 - \Phi(\hat{p}_i)]\}$, where $\phi(\cdot)$ represents the standard normal pdf and $n_i$ is the sample size for calculating $\hat{p}_i$. These weights can be easily derived as the inverse of the approximate variance found by applying the delta method to $\Phi^{-1}(\hat{p}_i)$.

This approach obviously has problems if an observed proportion is either 0 or 1. As a brief appendix to Bliss's work, Fisher (1935) outlined the use of maximum likelihood to obtain estimates using data in which $\hat{p}_i$ is either 0 or 1. Herein lies a subtle change: Fisher was no longer describing a model for the transformed proportions, but instead was directly modeling the mean of the binary response. Users of generalized linear models (GLMs) will recognize the distinction between a transformation and a link.

This technique of maximum likelihood is suggested only as a method of last resort when the observed proportions that are equal to 0 or 1 must be incorporated in the analysis. The computational burdens were simply too high for it to be used on a regular basis in that era.

So, by the 1930s, models of the following form were being posited, and fitting with the method of maximum likelihood was at least being entertained. With $p_i$ denoting the probability of a success for the ith observation, the model is given by

$Y_i \sim \text{indep. Bernoulli}(p_i)$

and

$p_i = \Phi(x_i'\beta)$,    (1)

where $x_i'$ denotes the ith row of the matrix of predictors. With a slight abuse of notation, and to make this look similar to a linear model, we can rewrite (1) as

$Y \sim \text{indep. Bernoulli}(p)$

and

$p = \Phi(X\beta)$    (2)

or, equivalently,

$\Phi^{-1}(p) = X\beta$.    (3)

By 1952, this had changed little. In that year Finney more clearly described the use of maximum likelihood for fitting these models in an appendix titled "Mathematical Basis of the Probit Method" and spent six pages in another appendix laying out the recommended computational method. This includes steps such as "34. Check steps 30-33" and the admonishment to the computer (a person!) that "A machine is not a complete safeguard against arithmetical errors, and carelessness will lead to wrong answers just as certainly as in nonmechanized calculations." This is clearly sage advice against overconfidence in output even from today's software.

Finney was practically apologetic about the effort required: "The chief hindrances to the more widespread adoption of the probit method ... (is) ... the apparent laboriousness of the computations." He recognized that his methods must be iterated until convergence to arrive at the maximum likelihood estimates but indicated that "with experience the first provisional line may often be drawn so accurately that only one cycle of the calculations is needed to give a satisfactory fit..."

With computations so lengthy, what iterative method of fitting was used? Finney recommended using "working probits," which he defined as (ignoring the shift of five units historically used to keep all the calculations positive):

$z_i = x_i'\beta + \dfrac{Y_i - \Phi(x_i'\beta)}{\phi(x_i'\beta)}$.    (4)

The working probits for a current value of $\beta$ were regressed on the predictors using weights the same as suggested by Bliss, namely $[\phi(p_i)^2]/\{\Phi(p_i)[1 - \Phi(p_i)]\}$, to get the new value of $\beta$.

When I first learned about the EM algorithm (Dempster, Laird, and Rubin 1977), I was struck by its similarity to Finney's algorithm. A common representation of (1) is via a threshold model. That is, hypothesize a latent variable $W_i$ such that

$W_i \sim \text{indep. } N(x_i'\beta, 1)$.    (5)

Then, using $Y_i = I_{\{W_i > 0\}}$ yields (1). To implement the EM algorithm, it is natural to regard the $W_i$ as missing data and fill them in. Once the $W_i$ are known, ordinary least squares can be used to get the new estimate of $\beta$. The E step fills in the $W_i$ using the formula

$E[W_i \mid Y_i] = x_i'\beta + \phi(x_i'\beta)\,\dfrac{Y_i - \Phi(x_i'\beta)}{\Phi(x_i'\beta)[1 - \Phi(x_i'\beta)]}$,    (6)

and the M step estimates $\beta$ as $(X'X)^{-1}X'W$.

Thus the term added to $x_i'\beta$ in the EM algorithm is the same as the term added using working probits, once they are multiplied by the weight. Practical usage of EM and working probits, however, shows that working probits invariably converges much more quickly than EM!

So as early as 1952 many of the key ingredients of GLMs are seen: using "working variates" and link functions, fitting using a method of iteratively weighted fits, and using likelihood methods. But lack of computational resources simply did not allow widespread use of such techniques.

Logistic regression was similarly hampered. Over a decade later, Cox (1966) stated that "since the maximization of a function of many variables may not be straightforward, even with an electronic computer, it is worth having
'simple' methods for solving maximum likelihood equations, especially for use when there are single observations at each x value, so that the linearizing transformation is not applicable." Note the need for simple methods despite the fact that "computers" in 1966 are now machines.

For the logistic regression model akin to (2), namely

$Y \sim \text{indep. Bernoulli}(p)$

and

$p = 1/(1 + \exp[-X\beta])$    (7)

or

$\log[p/(1 - p)] = X\beta$,

it is straightforward to show that the maximum likelihood equations are given by

$X'Y = X'p$.    (8)

Because $\beta$ enters $p$ in a nonlinear fashion in (8), it is not possible to analytically solve this equation for $\beta$. However, using the crude approximation (Cox 1966), $1/(1 + \exp[-t]) \approx 1/2 + t/6$, which clearly is applicable only for the midrange of the curve, (8) can be rewritten approximately as

$X'Y = X'\left(\tfrac{1}{2}\mathbf{1} + \tfrac{1}{6}X\beta\right) = \tfrac{1}{2}X'\mathbf{1} + \tfrac{1}{6}X'X\beta$.    (9)

This leads to

$X'\left(Y - \tfrac{1}{2}\mathbf{1}\right) = \tfrac{1}{6}X'X\beta$,    (10)

which we can solve as

$\beta = (X'X)^{-1}X'\,6\left(Y - \tfrac{1}{2}\mathbf{1}\right) = (X'X)^{-1}X'Y^*$,    (11)

where $Y_i^*$ is equal to 3 for a success and -3 for a failure. That is, we can approximate the logistic regression coefficients in a crude way by an ordinary least squares regression on a coded Y.

Logistic regression is often used as an alternate method for two-group discriminant analysis (Cox and Snell 1989), by using the (binary) group identifier as the "response" and the multivariate vectors as the "predictors." This is a useful alternative when the usual multivariate normality assumption for the multivariate vectors is questionable; for example, when one or more variables are binary or categorical.

When it is reasonable to assume multivariate normality, the usual Fisher discriminant function is given by $S^{-1}(\bar{X}_1 - \bar{X}_2)$, where $\bar{X}_i$ is the mean of the vectors for the ith group. If we code the successes and failures as 1 and -1, then $\bar{X}_1 - \bar{X}_2 = X'Y$. Thus we see that the difference between logistic regression and discriminant function analysis is 2, in absolute value.

2. ORIGINS

GLMs appeared on the statistical scene in the path-breaking article of Nelder and Wedderburn (1972). Even though virtually all of the pieces had previously existed, these authors were the first to put forth a unified framework showing the similarities between seemingly disparate methods, such as probit regression, linear models, and contingency tables. They recognized that fitting a probit regression by iterative fits using the "working probits," namely (4), could be generalized in a straightforward way to unify a whole collection of maximum likelihood problems. Replacing $\Phi^{-1}(\cdot)$ with a general "link" function, $g(\cdot)$, and defining a "working variate" via

$z = g(\mu) + (y - \mu)\,g'(\mu)$    (12)

gave, via iterative weighted least squares, a computational method for finding the maximum likelihood estimates. More formally, we can write the model as

$Y_i \sim \text{indep. } f_{Y_i}(y_i)$,

$f_{Y_i}(y_i) = \exp\{(y_i\theta_i - b(\theta_i))/a^2 - c(y_i, a)\}$,

$E[Y_i] = \mu_i$,

and

$g(\mu_i) = x_i'\beta$,    (13)

where $\theta_i$ is a known function of $\beta$ and $g(\cdot)$ is a known function that transforms (or links) the mean of $y_i$ (not $y_i$ itself!) to the linear predictor. The iterative algorithm is used to give maximum likelihood estimates of $\beta$.

More important, it made possible a style of thinking that freed the data analyst from the need to look for a transformation that simultaneously achieved linearity in the predictors and normality of the distribution (as in Box and Cox 1964).

I think of building GLMs by making three decisions:

1. What is the distribution of the data (for fixed values of the predictors and possibly after a transformation)?
2. What function of the mean will be modeled as linear in the predictors?
3. What will the predictors be?

What advantages does this approach have? First, it unifies what appear to be very different methodologies, which helps to understand, use, and teach the techniques. Second, because the right side of the equation is a linear model after applying the link, many of the concepts of linear models carry over to GLMs. For example, the issues of full-rank versus overparameterized models are similar.

The application of GLMs became a reality in the mid 1970s, when GLMs were incorporated into the statistics package GENSTAT and made available interactively in the GLIM software. Users of these packages could then handle linear regression, logistic and probit regression, Poisson regression, log-linear models, and regression with skewed continuous distributions, all in a consistent manner. Both packages are still widely used and are currently distributed by the Numerical Algorithms Group (www.nag.com). Of course, by now, most major statistical packages have facilities for GLMs; for example, SAS Proc GENMOD.
GLMs received a tremendous boost with the development of quasi-likelihood by Wedderburn in 1974. Using only the mean-to-variance relationship, Wedderburn showed how statistical inference could still be conducted. Perhaps surprisingly, given the paucity of assumptions, these techniques often retain full or nearly full efficiency (Firth 1987). Further, the important modification of overdispersion is allowed; that is, models with variance proportionally larger than predicted by the nominal distribution, say, a Poisson distribution. Such situations arise commonly in practice. Quasi-likelihood was put on a firmer theoretical basis by McCullagh (1983).

That year also saw the publication of the first edition of the now-classic book Generalized Linear Models (McCullagh and Nelder 1983). With a nice blend of theory, practice, and applications this text made GLMs more widely used and appreciated. A colleague once asked me what I thought of the book. I replied that it was absolutely wonderful and that the modeling and data analytic philosophy that it espoused was visionary. After going on for several minutes, I noticed that he looked perplexed. When I inquired why, he replied "I think it's terrible - it has no theorems." Perhaps that was the point.

3. MAJOR DEVELOPMENTS

GLMs are now a mature data-analytic methodology (e.g., Lindsey 1996) and have been developed in numerous directions. There are techniques for choosing link functions and diagnosing link failures (e.g., Mallick and Gelfand 1994; Pregibon 1980) as well as research on the consequences of link misspecification (e.g., Li and Duan 1989; Weisberg and Welsh 1994). There are techniques for outlier detection and assessment of case influence for model checking (e.g., Cook and Croos-Dabrera 1998; Pregibon 1981). There are methods of modeling the dispersion parameters as a function of covariates (e.g., Efron 1986) and for accommodating measurement error in the covariates (e.g., Buzas and Stefanski 1996; Stefanski and Carroll 1990), as well as ways to handle generalized additive models (Hastie and Tibshirani 1990).

An extremely important extension of GLMs is the approach pioneered by Liang and Zeger (Liang and Zeger 1986; Zeger and Liang 1986) known as generalized estimating equations (GEEs). In my opinion, GEEs made two valuable contributions: the accommodation of a wide array of correlated data structures and the popularization of the "robust sandwich estimator" of the variance-covariance structure. Current software implementations of GEEs are designed mostly to accommodate longitudinal data structures; that is, ones in which the data can be arranged as repeat measurements on a series of independent "subjects" (broadly interpreted, of course). The use of the "robust sandwich estimator," which basically goes back to Huber (1967) and Royall (1986), allows the specification of a "working" covariance structure. That is, the data analyst must specify a guess as to the correct covariance structure, but inferences remain asymptotically valid even if this structure is incorrectly specified (as it always is to some degree). Not surprisingly, the efficiency of inferences can be affected if the "working" structure is far from truth (e.g., Fitzmaurice 1995).

Distribution theory for modifications of exponential families for use in GLMs has been developed further by, for example, Jorgensen (1997), and the theory of quasi-likelihood is detailed in the book-length treatment of Heyde (1997).

4. LOOKING FORWARD

Anyone making predictions runs the risk of someone actually checking later to see whether the predictions are correct. So I am counting on the "Jean Dixon effect," defined by the Skeptic's Dictionary (http://skeptic.com) as "the tendency of the mass media to hype or exaggerate a few correct predictions by a psychic, guaranteeing that they will be remembered, while forgetting or ignoring the much more numerous incorrect predictions." Because likelihood and quasi-likelihood methods are based on large-sample approximations, an important area of development is the construction of tests and confidence intervals that are accurate in small- and moderate-sized samples. This may be through "small-sample asymptotics" (e.g., Jorgensen 1997; Skovgaard 1996) or via computationally intensive methods like the bootstrap (Davison and Hinkley 1997; Efron and Tibshirani 1993). The extension of GLMs to more complex correlation structures has been an area of active research and will see more development. Models for time series (e.g., Chan and Ledolter 1995), random-effects models (e.g., Breslow and Clayton 1993; Stiratelli, Laird, and Ware 1984), and spatial models (e.g., Heagerty and Lele 1998) have all been proposed. Unfortunately, likelihood analysis of many of the models has led to intractable, high-dimensional integrals. So, likewise, computing methods for these models will continue to be an ongoing area of development. Booth and Hobert (1999) and McCulloch (1997) used a Monte Carlo EM approach; Quintana, Liu, and del Pino (1999) used a stochastic approximation algorithm; and Heagerty and Lele (1998) took a composite likelihood tack.

Attempts to avoid likelihood analysis via techniques such as penalized quasi-likelihood (for a description see Breslow and Clayton 1993) have not been entirely successful. Approaches based on working variates (e.g., Schall 1991) and Laplace approximations (e.g., Wolfinger 1994) generate inconsistent estimates (Breslow and Lin 1994) and can be badly biased for distributions far from normal (i.e., the important case of Bernoulli-distributed data). Clearly, reliable, well-tested, general-purpose fitting algorithms need to be developed before these models will see regular use in practice.

The inclusion of random effects in GLMs raises several additional questions: What is the effect of misspecification of the random-effects distribution (e.g., Neuhaus, Hauck, and Kalbfleisch 1992) and how can it be diagnosed? What is the best way to predict the random effects and how can prediction limits be set, especially in small- and moderate-sized samples (e.g., Booth and Hobert 1998)? How can outlying random effects be identified or downweighted? All of
these are important practical questions that must be thoroughly investigated for regular data analysis.

The whole idea behind GLMs is the development of a strategy and philosophy for approaching statistical problems, especially those involving nonnormally distributed data, in a way that retains much of the simplicity of linear models. Areas in which linear models have been heavily used (e.g., simultaneous equation modeling in econometrics) have and will see adaptations for GLMs. As such, GLMs will continue in broad use and development for some time to come.

REFERENCES

Bliss, C. (1934), "The Method of Probits," Science, 79, 38-39.
Bliss, C. (1935), "The Calculation of the Dose-Mortality Curve," Annals of Applied Biology, 22, 134-167.
Booth, J. G., and Hobert, J. P. (1998), "Standard Errors of Prediction in Generalized Linear Mixed Models," Journal of the American Statistical Association, 93, 262-272.
Booth, J. G., and Hobert, J. P. (1999), "Maximizing Generalized Linear Mixed Model Likelihoods With an Automated Monte Carlo EM Algorithm," Journal of the Royal Statistical Society, Ser. B, 61, 265-285.
Box, G. E. P., and Cox, D. R. (1964), "An Analysis of Transformations" (with discussion), Journal of the Royal Statistical Society, Ser. B, 26, 211-252.
Breslow, N. E., and Clayton, D. G. (1993), "Approximate Inference in Generalized Linear Mixed Models," Journal of the American Statistical Association, 88, 9-25.
Breslow, N. E., and Lin, X. (1994), "Bias Correction in Generalized Linear Mixed Models With a Single Component of Dispersion," Biometrika, 82, 81-91.
Buzas, J. S., and Stefanski, L. A. (1996), "Instrumental Variable Estimation in Generalized Linear Measurement Error Models," Journal of the American Statistical Association, 91, 999-1006.
Chan, K. S., and Ledolter, J. (1995), "Monte Carlo EM Estimation for Time Series Models Involving Counts," Journal of the American Statistical Association, 90, 242-252.
Cook, R. D., and Croos-Dabrera, R. (1998), "Partial Residual Plots in Generalized Linear Models," Journal of the American Statistical Association, 93, 730-739.
Cox, D. R. (1966), "Some Procedures Connected With the Logistic Qualitative Response Curve," in Research Papers in Statistics, ed. F. N. David, New York: Wiley, pp. 55-72.
Cox, D. R., and Snell, E. J. (1989), Analysis of Binary Data, London: Chapman and Hall.
David, H. A. (1995), "First (?) Occurrence of Common Terms in Probability and Statistics," The American Statistician, 49, 121-133.
Davison, A. C., and Hinkley, D. V. (1997), Bootstrap Methods and Their Application, Cambridge, U.K.: Cambridge University Press.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977), "Maximum Likelihood From Incomplete Data via the EM Algorithm" (with discussion), Journal of the Royal Statistical Society, Ser. B, 39, 1-38.
Efron, B. (1986), "Double Exponential Families and Their Use in Generalized Linear Regression," Journal of the American Statistical Association, 81, 709-721.
Efron, B., and Tibshirani, R. J. (1993), An Introduction to the Bootstrap, New York: Chapman and Hall.
Finney, D. J. (1952), Probit Analysis, Cambridge, U.K.: Cambridge University Press.
Firth, D. (1987), "On the Efficiency of Quasi-Likelihood Estimation," Biometrika, 74, 233-245.
Fisher, R. A. (1935), Appendix to "The Calculation of the Dose-Mortality Curve" by C. Bliss, Annals of Applied Biology, 22, 164-165.
Fitzmaurice, G. M. (1995), "A Caveat Concerning Independence Estimating Equations With Multivariate Binary Data," Biometrics, 51, 309-317.
Hastie, T. J., and Tibshirani, R. J. (1990), Generalized Additive Models, London: Chapman and Hall.
Heagerty, P., and Lele, S. (1998), "A Composite Likelihood Approach to Binary Spatial Data," Journal of the American Statistical Association, 93, 1099-1111.
Heyde, C. C. (1997), Quasi-Likelihood and Its Application, New York: Springer.
Huber, P. J. (1967), "The Behaviour of Maximum Likelihood Estimators Under Nonstandard Conditions," in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, eds. L. M. LeCam and J. Neyman, pp. 221-233.
Jorgensen, B. (1997), The Theory of Dispersion Models, London: Chapman and Hall.
Li, K-C., and Duan, N. (1989), "Regression Analysis Under Link Violation," The Annals of Statistics, 17, 1009-1052.
Liang, K-Y., and Zeger, S. L. (1986), "Longitudinal Data Analysis Using Generalized Linear Models," Biometrika, 73, 13-22.
Lindsey, J. K. (1996), Applying Generalized Linear Models, New York: Springer.
Mallick, B. K., and Gelfand, A. E. (1994), "Generalized Linear Models With Unknown Link Functions," Biometrika, 81, 237-245.
McCullagh, P. (1983), "Quasi-Likelihood Functions," The Annals of Statistics, 11, 59-67.
McCullagh, P., and Nelder, J. (1983), Generalized Linear Models, London: Chapman and Hall.
McCulloch, C. (1997), "Maximum Likelihood Algorithms for Generalized Linear Mixed Models," Journal of the American Statistical Association, 92, 162-170.
Nelder, J. A., and Wedderburn, R. W. M. (1972), "Generalized Linear Models," Journal of the Royal Statistical Society, Ser. A, 135, 370-384.
Pregibon, D. (1980), "Goodness of Link Tests for Generalized Linear Models," Applied Statistics, 29, 15-24.
Pregibon, D. (1981), "Logistic Regression Diagnostics," The Annals of Statistics, 9, 705-724.
Quintana, R., Liu, J., and del Pino, G. (1999), "Monte Carlo EM With Importance Reweighting and Its Application in Random Effects Models," Computational Statistics and Data Analysis, 29, 429-444.
Royall, R. (1986), "Model Robust Inference Using Maximum Likelihood Estimators," International Statistical Review, 54, 221-226.
Schall, R. (1991), "Estimation in Generalized Linear Models With Random Effects," Biometrika, 78, 719-727.
Skovgaard, I. M. (1996), "An Explicit Large-Deviation Approximation to One-Parameter Tests," Bernoulli, 2, 145-165.
Stefanski, L. A., and Carroll, R. J. (1990), "Score Tests in Generalized Linear Measurement Error Models," Journal of the Royal Statistical Society, Ser. B, 52, 345-359.
Stiratelli, R., Laird, N., and Ware, J. H. (1984), "Random-Effects Models for Serial Observations With Binary Response," Biometrics, 40, 961-971.
Wedderburn, R. W. M. (1974), "Quasi-Likelihood Functions, Generalized Linear Models, and the Gauss-Newton Method," Biometrika, 61, 439-447.
Weisberg, S., and Welsh, A. H. (1994), "Adapting for the Missing Link," The Annals of Statistics, 22, 1674-1700.
Wolfinger, R. W. (1994), "Laplace's Approximation for Nonlinear Mixed Models," Biometrika, 80, 791-795.
Zeger, S., and Liang, K-Y. (1986), "Longitudinal Data Analysis for Discrete and Continuous Outcomes," Biometrics, 42, 121-130.
