1 Department of Quantitative Health Sciences, Cleveland Clinic, 9500 Euclid Ave, Wb4, Cleveland, Ohio 44195, U.S.A.
2 Division of Epidemiology, Department of Internal Medicine, University of Utah, 30 North 1900 East, Salt Lake City, Utah 84148, U.S.A.
email: lil2@ccf.org
Summary. In many longitudinal clinical studies, the level and progression rate of repeatedly measured biomarkers on each subject quantify the severity of the disease and that subject's susceptibility to progression of the disease. It is of scientific and clinical interest to relate such quantities to a later time-to-event clinical endpoint, such as patient survival. This is usually done with a shared parameter model. In such models, the longitudinal biomarker data and the survival outcome of each subject are assumed to be conditionally independent given subject-level severity or susceptibility (also called frailty in statistical terms). In this article, we study the case where the conditional distribution of the longitudinal data is modeled by a linear mixed-effect model, and the conditional distribution of the survival data is given by a Cox proportional hazards model. We allow unknown regression coefficients and time-dependent covariates in both models. The proposed estimators are maximizers of an exact correction to the joint log likelihood, with the frailties eliminated as nuisance parameters, an idea that originated from the correction of covariate measurement error in measurement error models. The corrected joint log likelihood is shown to be asymptotically concave and leads to consistent and asymptotically normal estimators. Unlike most published methods for joint modeling, the proposed estimation procedure does not rely on distributional assumptions on the frailties. The proposed method was studied in simulations and applied to a data set from the Hemodialysis (HEMO) Study.

Key words: Corrected score; Cox model; Joint modeling; Measurement error; Regression spline; Shared parameter model.
1. Introduction
In many longitudinal clinical studies, subject-specific estimates of the level and rate of change of longitudinal measurements of a biomarker are used to quantify the initial severity and subsequent course of disease progression. It is often of considerable interest to understand the relationship between these characteristics of the biomarker and long-term time-to-event clinical endpoints characterizing the ultimate disease outcome. For example, Schluchter, Konstan, and Davis (2002) studied the relationship between pulmonary function and survival among cystic fibrosis patients. Pulmonary function was measured by forced expiratory volume in one second (FEV1). Because FEV1 was measured with error, a single measurement is not representative of the severity of pulmonary disease. Repeated measurements of FEV1 were taken, and for each patient the trajectory of these measurements can be thought of as approximately linear, with the intercept and slope representing the level and progression of the pulmonary function damage. Schluchter et al. (2002) used a random intercept and slope model for the repeated FEV1 measurements. These subject-specific random effects (frailties) were then used as predictors of survival. In that paper, the frailties and the survival time were assumed to follow
likelihood score equation of the survival submodel. Because an exact correction does not exist for the partial likelihood (Augustin, 2004), the correction was instead applied to a first- or second-order approximation of the partial likelihood score equation. In this article, for the purpose of estimating both submodels, we work with their joint full likelihood instead of the likelihood from the survival submodel alone. We show that an exact correction to this joint log likelihood exists. To cope with the unspecified baseline hazard function of the Cox model, we approximate it with a regression spline.
Sections 2 and 3 describe the model and the method of estimation. The proposed estimators are shown to be the maximizers of the corrected joint log likelihood, which is asymptotically concave, thus assuring relatively straightforward computation. Simulations and the application to the HEMO data set are presented in Sections 4 and 5. The discussion is in Section 6. Detailed derivations and theoretical properties of the estimator, including consistency and asymptotic normality, are in the Web Appendix.
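Before turning to the model, the numerical device is worth making concrete: with a piecewise-linear approximation to the log baseline hazard, the cumulative hazard over a follow-up period can be evaluated by the trapezoid rule over the spline knots. The sketch below is our own standalone Python illustration; the helper names, knot positions, and hazard values are hypothetical, not the paper's code.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid rule: interval widths times averaged endpoint values."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def cumulative_hazard(log_hazard, follow_up, knots):
    """Approximate the integral of lambda(t) over [0, follow_up] using the
    grid formed by the spline knots that fall inside the follow-up period."""
    inner = [k for k in knots if 0.0 < k < follow_up]
    grid = np.array([0.0] + inner + [follow_up])
    return trapezoid(np.exp(log_hazard(grid)), grid)

# Hypothetical piecewise-linear log hazard interpolating values at the knots
knots = np.array([0.0, 1.0, 2.0, 3.0])
log_lam_at_knots = np.array([-1.0, -0.5, 0.0, 0.2])
log_hazard = lambda t: np.interp(t, knots, log_lam_at_knots)

H = cumulative_hazard(log_hazard, follow_up=2.5, knots=knots)
```

Because the exponential of a piecewise-linear function is smooth within each knot interval, the trapezoid rule over the knots is already accurate; refining the grid changes the value only slightly.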
2. Model and Notation
Suppose the study has n subjects indexed by i = 1, 2, ..., n. The ith subject has n_i repeated biomarker measurements W_ij taken at times t_i1 < t_i2 < ... < t_in_i. The longitudinal submodel is specified in two steps as follows. In the first step,

    W_ij = V_ij^T α + D_ij^T β_i + ε_ij.    (1)

This is a linear mixed model. V_ij is a p-vector of time-dependent covariates with fixed-effect coefficients α. β_i is a subject-specific random effect vector. D_ij is a q-vector. Often q = 2 and D_ij = (1, t_ij)^T, in which case β_i represents the random intercept and slope. When D_ij = 1, the model has only a random intercept, similar to that in Ratcliffe et al. (2004). Random effects with more than two elements may be considered; for example, a spline term in time may be included in D_ij to model a nonlinear trajectory over time. In this article we focus mainly on the case q = 2, though the methodology applies to the other cases with little modification. We assume that the residual ε_ij is normal with mean 0 and variance σ².
In the second step, we specify a model for β_i that incorporates time-independent covariates X_i:

    β_i = Γ X_i + b_i,    (2)
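To fix ideas, the two-step longitudinal submodel W_ij = V_ij^T α + D_ij^T β_i + ε_ij with β_i = Γ X_i + b_i can be simulated directly. The following standalone Python sketch generates one subject's data; all covariates and parameter values here are hypothetical illustrations, not the paper's simulation settings.

```python
import numpy as np

def simulate_subject(t, x, alpha, Gamma, Sigma, sigma, rng):
    """Simulate one subject's longitudinal data from the two-step model:
    W_ij = V_ij' alpha + D_ij' beta_i + eps_ij,   beta_i = Gamma x_i + b_i."""
    n_i = len(t)
    D = np.column_stack([np.ones(n_i), t])   # random intercept and slope (q = 2)
    V = np.column_stack([np.sin(t)])         # an illustrative time-dependent covariate
    b = rng.multivariate_normal(np.zeros(2), Sigma)
    beta = Gamma @ x + b                     # subject-specific intercept and slope
    W = V @ alpha + D @ beta + rng.normal(0.0, sigma, n_i)
    return W, V, D, beta

rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 7)                 # 7 visit times
x = np.array([1.0, 0.5])                     # time-independent covariates X_i
alpha = np.array([1.0])
Gamma = np.array([[1.0, 0.5], [-0.3, 0.2]])  # hypothetical 2 x 2 coefficient matrix
Sigma = np.diag([0.25, 0.04])
W, V, D, beta = simulate_subject(t, x, alpha, Gamma, Sigma, 0.3, rng)
```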
where Γ is a matrix of regression coefficients and the random effect b_i satisfies

    b_i ~ N(0, Σ).    (3)

The normality in (3) is used to derive the corrected likelihood below, but the resulting estimator does not rely on it (Theorem 1).
The survival submodel is a Cox proportional hazards model in which the frailties enter as covariates. Let Y_i denote the observed follow-up time of subject i, δ_i the event indicator, and Z_i a vector of time-independent covariates. The hazard is

    λ_i(t) = λ_0(t) exp{a_1^T Z_i + a_2^T β_i},    (4)

where λ_0(t) is an unspecified baseline hazard. We approximate the log baseline hazard by a piecewise-linear regression spline, log λ_0(t) ≈ Σ_{j=1}^J γ_j B_j(t). Write U_i(t) = (B_1(t), ..., B_J(t), Z_i^T, β_i^T)^T and θ = (γ^T, a_1^T, a_2^T)^T, so that log λ_i(t) = U_i(t)^T θ.
Conditional on β_i, the contribution of subject i to the joint log likelihood has three pieces: from the repeated measurements,

    LL_i(θ) = −(n_i/2) log(2πσ²) − ‖W_i − V_i α − D_i β_i‖² / (2σ²),    (5)

where W_i = (W_i1, ..., W_in_i)^T and V_i and D_i stack the V_ij^T and D_ij^T by rows; from the survival data,

    LS_i(θ) = δ_i U_i(Y_i)^T θ − ∫_0^{Y_i} λ_i(t) dt;    (6)

and from the frailty model (2)-(3),

    LM_i(θ) = −(q/2) log(2π) − (1/2) log|Σ| − (β_i − Γ X_i)^T Σ^{-1} (β_i − Γ X_i)/2.    (7)

Because the log hazard is piecewise linear in t, the integral in (6) can be evaluated by the trapezoid rule over the grid points 0 = τ_0 < τ_1 < ... formed by the spline knots. For subject i, let M_i be the smallest index with τ_{M_i} ≥ Y_i, so that the points min(τ_g, Y_i), g = 0, ..., M_i, partition [0, Y_i]. Then

    ∫_0^{Y_i} λ_i(t) dt ≈ Σ_{g=0}^{M_i} c_ig λ_i(min(τ_g, Y_i)),    (8)

where the trapezoid weights c_ig are determined by the interval lengths min(τ_{g+1}, Y_i) − τ_g. Let Y_ig = δ_i 1{g = M_i} denote the event indicator of subject i at grid point g, so that δ_i U_i(Y_i)^T θ = Σ_{g=0}^{M_i} Y_ig U_i(min(τ_g, Y_i))^T θ. Let U_1ig = (B_1(min(τ_g, Y_i)), ..., B_J(min(τ_g, Y_i)), Z_i^T)^T with coefficient vector θ_1 = (γ^T, a_1^T)^T, and U_2ig = β_i with coefficient vector θ_2 = a_2. Then equation (6) becomes, up to an additive constant,

    LS_i(θ) = Σ_{g=0}^{M_i} { Y_ig [θ_1^T U_1ig + θ_2^T β_i + log(c_ig)] − c_ig exp(θ_1^T U_1ig + θ_2^T β_i) },    (9)

which has the form of a Poisson log likelihood with offsets log(c_ig) (Cai, Hyndman, and Wand, 2002).
The frailties β_i are not observed. Let W̃_i = W_i − V_i α, β̂_i = (D_i^T D_i)^{-1} D_i^T W̃_i, and R̂_i = {I_n_i − D_i (D_i^T D_i)^{-1} D_i^T} W̃_i, so that

    W̃_i = D_i β̂_i + R̂_i.    (10)

Then:
(i) β̂_i and R̂_i are independent given β_i;
(ii) E(R̂_i) = 0;
(iii) β̂_i | β_i ~ N(β_i, σ² (D_i^T D_i)^{-1}); letting Λ_i Λ_i^T = σ² (D_i^T D_i)^{-1}, we can write β̂_i = β_i + Λ_i e_i, where e_i is a standard multivariate normal random vector independent of β_i and of the survival submodel;
(iv) an unbiased estimator of σ² is ‖R̂_i‖²/(n_i − q).

Based on these results, we can derive the following expressions, conditional on β_i, i = 1, 2, ..., n (details in Web Appendix A):

    Σ_{i=1}^n L̃L_i(θ) = { −Σ_{i=1}^n (n_i/2) log(2πσ²) − (1/(2σ²)) Σ_{i=1}^n (W_i − V_i α)^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] (W_i − V_i α) − nq/2 } (1 + o_p(1)),    (11)
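The quantities β̂_i and R̂_i above are just the per-subject least-squares fit and residual, so properties (i)-(iv) can be verified numerically. The following sketch is our own illustration with an arbitrary design matrix; it checks that β̂_i is unbiased for β_i and that ‖R̂_i‖²/(n_i − q) is unbiased for σ².

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
D = np.column_stack([np.ones_like(t), t])   # q = 2: intercept and slope
beta = np.array([2.0, -1.0])                # a fixed subject-level frailty
sigma = 0.5
n_i, q = D.shape

beta_hats, sig2_hats = [], []
for _ in range(20000):
    # W_tilde = W_i - V_i alpha, conditional on beta_i
    W_tilde = D @ beta + rng.normal(0.0, sigma, n_i)
    beta_hat = np.linalg.solve(D.T @ D, D.T @ W_tilde)  # (iii): N(beta, sigma^2 (D'D)^{-1})
    R_hat = W_tilde - D @ beta_hat                      # residual, (i)-(ii)
    beta_hats.append(beta_hat)
    sig2_hats.append(R_hat @ R_hat / (n_i - q))         # (iv)

beta_mc = np.mean(beta_hats, axis=0)
sig2_mc = np.mean(sig2_hats)
```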
    Σ_{i=1}^n L̃S_i(θ) = Σ_{i=1}^n Σ_{g=0}^{M_i} { Y_ig [θ_1^T U_1ig + θ_2^T β̂_i + log(c_ig)] − c_ig exp(θ_1^T U_1ig + θ_2^T β̂_i − σ² θ_2^T (D_i^T D_i)^{-1} θ_2 / 2) } (1 + o_p(1)),    (12)

where β̂_i = (D_i^T D_i)^{-1} D_i^T (W_i − V_i α) depends on the unknown α; and

    Σ_{i=1}^n L̃M_i(θ) = { −(nq/2) log(2π) − (n/2) log|Σ| − (1/2) Σ_{i=1}^n (β̂_i − Γ X_i)^T Σ^{-1} (β̂_i − Γ X_i) + (σ²/2) Σ_{i=1}^n tr[Σ^{-1} (D_i^T D_i)^{-1}] } (1 + o_p(1)).    (13)

In these expressions, we eliminated the unknown nuisance parameters β_i, i = 1, 2, ..., n, by making use of the conditional independence of W_i and (Y_i, δ_i) given β_i. This technique is closely related to the corrected score method in the measurement error literature (Nakamura, 1990). It views β̂_i as an error-prone measurement of β_i in the additive error model β̂_i = β_i + Λ_i e_i with heterogeneous error. The same idea was used in Wang (2006). In that paper, β̂_i is a statistic and involves no unknown parameters, so the standard corrected score method for the Cox partial likelihood (Nakamura, 1992) can be applied once this additive error relationship is derived. In our problem, because there are unknown parameters in the longitudinal submodel, β̂_i is no longer a statistic. Hence, correcting the partial likelihood of the survival submodel alone is not enough; we have to use the joint likelihood from both submodels. For this purpose, it is more natural to work with the full likelihood of the Cox model than with its partial likelihood. Another important distinction of our work from Wang (2006) is that, because an exact correction to the partial likelihood score does not exist (Augustin, 2004), their correction was applied to a first- or second-order approximation of the partial likelihood score, similar to Nakamura (1992). In this article, we have an exact correction to the joint log likelihood, as shown in equations (11)-(13).
Summing the three pieces derived above, we have the joint corrected log likelihood

    L̃(θ, σ², Σ) = Σ_{i=1}^n L̃L_i(θ) + Σ_{i=1}^n L̃S_i(θ) + Σ_{i=1}^n L̃M_i(θ),    (14)

where L̃L_i(θ), L̃S_i(θ), and L̃M_i(θ) denote the approximations on the right-hand sides of equations (11)-(13).
The unknown parameters are the regression parameters and the variance components. The variance components are nuisance parameters in most applications. We start with the case in which σ² is known. In this case, for a given value of Σ, the regression parameters ϑ = (α^T, vec(Γ)^T, θ^T)^T can be estimated by maximizing equation (14), or equivalently, by solving its score equation. The following theorem summarizes the properties of the estimator.

Theorem 1. Assume that σ² is known and that the trapezoid approximation and the regression spline approximation to the log baseline hazard are exact. Then, for a given value of Σ, L̃ regarded as a function of ϑ is asymptotically concave, and its score equation S(ϑ) = ∂L̃/∂ϑ = 0 is an unbiased estimating equation for ϑ. The estimator ϑ̂ is consistent and asymptotically normal as n → ∞ with all the n_i bounded above by a positive integer. These conclusions hold without any distributional assumptions on the b_i.
The proof is deferred to Web Appendix B. Theorem 1 shows that although the joint corrected log likelihood was derived under the assumption of normal frailties, the estimator remains consistent without that assumption. The consistency result holds even when the covariance matrix of the random effect, Σ, is misspecified (see Web Appendix B). Theorem 1 also suggests that when n is large enough, the estimator is uniquely defined and the computation involves maximization of a concave function, which is straightforward in many software packages. In our numerical simulations, we experienced no convergence problems.
The following is the practical estimation procedure, which incorporates the estimation of σ²:

(1) If there is no time-dependent covariate V, we can estimate σ² consistently by

    σ̂² = Σ_{i=1}^n W_i^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] W_i / Σ_{i=1}^n (n_i − q).

If there is a time-dependent covariate V, we first estimate α consistently from the following estimating equation, which is the derivative of the quadratic form in equation (11) with respect to α:

    0 = Σ_{i=1}^n V_i^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] (W_i − V_i α),

and then set

    σ̂² = Σ_{i=1}^n (W_i − V_i α̂)^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] (W_i − V_i α̂) / Σ_{i=1}^n (n_i − q).

(2) From σ̂² and any given Σ, we can estimate the regression parameters by solving the unbiased estimating equation S(ϑ) = 0, which is equivalent to maximizing equation (14).
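Step (1) is pure linear algebra: the estimating equation for α is linear in α, and σ̂² is a ratio of quadratic forms. The following standalone Python sketch carries out step (1) on simulated data; the covariates, sample size, and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data (illustrative): n subjects, random intercept and slope (q = 2)
n, sigma = 200, 0.4
alpha = np.array([1.5])                    # coefficient of time-dependent covariate V
subjects = []
for _ in range(n):
    t = np.sort(rng.uniform(0.0, 3.0, rng.integers(4, 9)))
    D = np.column_stack([np.ones_like(t), t])
    V = np.column_stack([np.cos(t)])       # a time-dependent covariate
    beta = rng.multivariate_normal([2.0, -1.0], np.diag([0.3, 0.1]))
    W = V @ alpha + D @ beta + rng.normal(0.0, sigma, len(t))
    subjects.append((W, V, D))

# Estimating equation 0 = sum_i V_i' M_i (W_i - V_i alpha), with M_i the
# projection off the columns of D_i; it is linear in alpha, so solve directly.
A = np.zeros((1, 1))
b = np.zeros(1)
for W, V, D in subjects:
    M = np.eye(len(W)) - D @ np.linalg.solve(D.T @ D, D.T)
    A += V.T @ M @ V
    b += V.T @ M @ W
alpha_hat = np.linalg.solve(A, b)

# Plug alpha_hat into the residual-based variance estimator
num = den = 0.0
for W, V, D in subjects:
    M = np.eye(len(W)) - D @ np.linalg.solve(D.T @ D, D.T)
    r = W - V @ alpha_hat
    num += r @ M @ r
    den += len(W) - D.shape[1]             # n_i - q
sigma2_hat = num / den
```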
[Figure 1 appears here: two panels plotted against time, 0 to 10.]
Figure 1. Median estimated baseline hazard and survival functions with three internal knots (dotted) and six internal knots (dashed). The true hazard/survival functions are in solid lines. The result is based on 200 simulation runs. The sample size is 250.
Table 1
Assessing the performance of the proposed method. n: sample size; emp SD: sample standard deviations of the 200 point estimators from the simulations; ave SE: average estimated standard errors of the point estimators; coverage: coverage probability of the 95% confidence interval.

n      Parameter (=true value)   Bias       Emp SD   Ave SE   Coverage (%)
250    α = 1                     0.00751    0.0389   0.0406   96.5
       Γ1 = 1                    0.00869    0.105    0.103    96.0
       Γ2 = 2                    0.00830    0.146    0.140    94.5
       Γ3 = 1                    0.00256    0.0601   0.0575   94.5
       Γ4 = 0.5                  0.00639    0.0791   0.0779   95.5
       a1 = 0.5 (Z)              0.0239     0.0948   0.0923   95.0
       a21 = 0.5 (β_i1)          0.0469     0.153    0.150    97.0
       a22 = 1 (β_i2)            0.103      0.360    0.354    96.5
1000   α = 1                     0.000473   0.0209   0.0206   95.5
       Γ1 = 1                    0.00268    0.0480   0.0520   97.0
       Γ2 = 2                    0.00119    0.0693   0.0705   96.0
       Γ3 = 1                    0.000801   0.0302   0.0290   93.0
       Γ4 = 0.5                  0.00229    0.0431   0.0391   93.0
       a1 = 0.5 (Z)              0.0157     0.0402   0.0413   95.0
       a21 = 0.5 (β_i1)          0.0122     0.0631   0.0619   96.5
       a22 = 1 (β_i2)            0.0417     0.148    0.143    95.5
Table 2
Comparing the naive method with the proposed joint modeling method. n: sample size; emp SD: sample standard deviations of the 200 point estimators from the simulations; MSE: mean squared error; coverage: coverage probability of the 95% confidence interval.

                          Bias                  Emp SD                MSE                   Coverage (%)
Parameter (=true value)   Naive      Joint      Naive      Joint      Naive      Joint      Joint
(a) n = 250
Γ1 = 1                    0.00197    0.00299    0.105      0.103      0.105      0.103      94.5
Γ2 = 2                    0.00370    0.00571    0.144      0.137      0.144      0.137      94.0
Γ3 = 1                    0.00591    0.00659    0.0561     0.0546     0.0564     0.0550     94.0
Γ4 = 0.5                  0.0104     0.0118     0.0800     0.0709     0.0807     0.0719     97.0
a21 = 0.5 (β_i1)          0.347      0.0196     0.0550     0.111      0.351      0.113      96.0
a22 = 1 (β_i2)            0.471      0.0552     0.140      0.258      0.491      0.264      95.5
(b) n = 1000
Γ1 = 1                    0.00455    0.00656    0.0510     0.0482     0.0512     0.0486     95.5
Γ2 = 2                    0.00244    0.00635    0.0740     0.0667     0.0740     0.0670     97.0
Γ3 = 1                    0.00277    0.00184    0.0284     0.0267     0.0285     0.0268     95.0
Γ4 = 0.5                  0.000365   0.00146    0.0400     0.0360     0.0400     0.0360     96.5
a21 = 0.5 (β_i1)          0.345      0.00752    0.0256     0.0517     0.346      0.0522     93.5
a22 = 1 (β_i2)            0.499      0.00493    0.0675     0.114      0.504      0.114      93.5
(c) n = 250
Γ1 = 1                    0.00769    0.00769    0.0985     0.0985     0.0988     0.0988     97.0
Γ2 = 2                    0.00421    0.00421    0.147      0.147      0.147      0.147      96.0
Γ3 = 1                    0.000279   0.000277   0.0529     0.0529     0.0529     0.0529     97.5
Γ4 = 0.5                  0.00147    0.00147    0.0843     0.0843     0.0843     0.0843     93.5
a1 = 0.5 (Z)              0.0304     0.0210     0.160      0.191      0.163      0.192      96.0
a21 = 0.5 (β_i1)          0.114      0.0377     0.0684     0.130      0.133      0.135      95.5
a22 = 1 (β_i2)            0.292      0.0526     0.136      0.287      0.322      0.292      96.0
(d) n = 1000
Γ1 = 1                    0.00510    0.00510    0.0520     0.0520     0.0522     0.0522     92.5
Γ2 = 2                    0.000560   0.000561   0.0757     0.0757     0.0757     0.0757     94.0
Γ3 = 1                    0.00260    0.00260    0.0268     0.0268     0.0269     0.0269     95.5
Γ4 = 0.5                  0.00364    0.00364    0.0411     0.0411     0.0413     0.0413     94.5
a1 = 0.5 (Z)              0.0299     0.0116     0.0795     0.0893     0.0849     0.0901     95.0
a21 = 0.5 (β_i1)          0.122      0.00513    0.0315     0.0509     0.126      0.0512     95.0
a22 = 1 (β_i2)            0.290      0.00196    0.0654     0.113      0.297      0.113      95.0
Table 3
Fitting the joint model using a regression spline with three or six internal knots. Emp SD: sample standard deviations of the 200 point estimators from the simulations.

                                       Bias                  Emp SD
Sample size   Parameter (=true value)  3 knots    6 knots    3 knots   6 knots
250           Γ1 = 1                   0.00616    0.00609    0.0956    0.0956
              Γ2 = 2                   0.000893   0.000774   0.125     0.125
              Γ3 = 1                   0.00612    0.00616    0.0607    0.0607
              Γ4 = 0.5                 0.00543    0.00548    0.0796    0.0796
              a21 = 0.5 (β_i1)         0.0292     0.0295     0.107     0.107
              a22 = 1 (β_i2)           0.0435     0.0444     0.212     0.212
1000          Γ1 = 1                   0.00681    0.00679    0.0524    0.0525
              Γ2 = 2                   0.0107     0.0106     0.0686    0.0686
              Γ3 = 1                   0.000871   0.000879   0.0296    0.0296
              Γ4 = 0.5                 0.00117    0.00117    0.0368    0.0368
              a21 = 0.5 (β_i1)         0.000986   0.000609   0.0564    0.0565
              a22 = 1 (β_i2)           0.00791    0.00875    0.101     0.101

5. Application
We now analyze the HEMO data introduced in Section 1. We included serum albumin measurements from biannual visits scheduled approximately 6 months apart. The data consist of 1628 patients with between 3 and 15 repeated albumin measurements each (including baseline). For the purpose of illustrating the proposed method, we excluded patients with
calibration. There are two frailty terms in the survival submodel: the random intercept and slope. They characterize the adjusted subject-specific level of albumin and the susceptibility to albumin decline. The treatment groups are included in the survival submodel as well. Table 4 presents the results from the joint modeling. The log baseline hazard was approximated by a piecewise-linear regression spline with six internal knots; the corresponding analysis using three internal knots gave similar results and is omitted. The results from the longitudinal submodel show that albumin follows an overall declining trend (the coefficient of time in years), and the results from the survival submodel show that both frailties significantly predict survival: a decreased level of albumin or a steeper declining slope is associated with an increased risk of death. Note that albumin measurements taken on Mondays and Tuesdays are generally lower than measurements taken on other weekdays (p = 0.0167). This confirms our conjecture that calibration of albumin is necessary for analyses involving longitudinal albumin measurements. The estimated baseline survival function is plotted in Figure 2.

Table 4
Applying the proposed method to the HEMO data set

Parameter              Estimator   S.E.     Lower CI   Upper CI   p-value
Longitudinal submodel
  Intercept            3.656       0.0151   3.626      3.685      <0.001
  High dose            0.00121     0.0163   −0.0308    0.0332     0.941
  High flux            −0.00700    0.0163   −0.0390    0.0250     0.668
  Time in years        −0.0581     0.0119   −0.0814    −0.0349    <0.001
  High dose by time    −0.0143     0.0142   −0.0421    0.0134     0.311
  High flux by time    −0.0103     0.0142   −0.0381    0.0175     0.468
  Mon/Tues             −0.0257     0.0107   −0.0467    −0.00466   0.0167
Survival submodel
  High dose            −0.0606     0.0900   −0.237     0.116      0.500
  High flux            −0.0689     0.0898   −0.245     0.107      0.443
  Frailty intercept    −1.464      0.179    −1.815     −1.113     <0.001
  Frailty slope        −3.708      0.378    −4.449     −2.968     <0.001

[Figure 2 appears here: estimated baseline survival (y-axis: Survival, 0.0 to 1.0) plotted against Time (Years).]
Figure 2. Estimated baseline survival function and its pointwise 95% confidence interval (gray area).

6. Discussion
We have proposed a semiparametric joint modeling approach for both longitudinal and survival data. A distinguishing feature of this method is that, unlike most published joint modeling methods, it does not require any distributional assumption for the frailties. Imposing such assumptions may improve the efficiency of estimation but could also lead to increased bias. When the sample size is large, as in the application we studied, bias considerations often outweigh concerns with efficiency. Our method is more general than the corrected score method proposed by Wang (2006) in that we allow unknown regression parameters in both the longitudinal and survival submodels. This generalization was necessary to handle the calibration of albumin in the HEMO study application. As discussed in Section 3.3, incorporating unknown parameters from the longitudinal submodel raises new challenges not addressed by the framework of Wang (2006). We therefore initiated a new line of attack, using an exact correction to the joint full log likelihood of both submodels instead of a correction to a second-order approximation of the Cox partial likelihood of the survival submodel alone. Many joint
modeling methods demand intensive computations, and convergence can be a problem. We have proven that our target function for maximization is asymptotically concave, and we did not experience nonconvergence in the simulations presented in Tables 1-3.
A limitation of the proposed method is that each subject must have at least three nonmissing repeated longitudinal measurements for it to work with the random intercept and slope model, as in the application. This is also a limitation of the method of Wang (2006). In studies designed to address a joint modeling problem, the design should stipulate adequate measurements early in the follow-up period so that sufficient repeated measurements are available prior to censoring for the large majority of subjects. Although methods that use parametric distributions for the random effects allow patients to have fewer than three repeated measurements, these subjects provide little information regarding their random effects.

Acknowledgements
This work was partially supported by grant 5U01DK053869-06 and grant 5U01DK045430-14 from the National Institute of Diabetes and Digestive and Kidney Diseases.
References
Augustin, T. (2004). An exact corrected log-likelihood function for Cox's proportional hazards model under measurement error and some extensions. Scandinavian Journal of Statistics 31, 43-50.
Cai, T., Hyndman, R. J., and Wand, M. P. (2002). Mixed model-based hazard estimation. Journal of Computational and Graphical Statistics 11, 784-798.
Carroll, R., Ruppert, D., and Stefanski, L. (1995). Nonlinear Measurement Error Models. New York: Chapman & Hall/CRC.
Hsieh, F., Tseng, Y.-K., and Wang, J.-L. (2006). Joint modeling of survival and longitudinal data: Likelihood approach revisited. Biometrics 62, 1037-1043.
Kooperberg, C., Stone, C. J., and Truong, Y. K. (1995). Hazard regression. Journal of the American Statistical Association 90, 78-94.
Li, E., Zhang, D., and Davidian, M. (2004). Conditional estimation for generalized linear models when covariates are subject-specific parameters in a mixed model for longitudinal measurements. Biometrics 60, 1-7.
Li, L. and Greene, T. (2008). Varying coefficients model with measurement error. Biometrics 64, 519-526.
Liu, M. and Ying, Z. (2007). Joint analysis of longitudinal data with informative right censoring. Biometrics 63, 363-371.
Nakamura, T. (1990). Corrected score function for errors-in-variables models: Methodology and application to generalized linear models. Biometrika 77, 127-137.
Nakamura, T. (1992). Proportional hazards model with covariates subject to measurement error. Biometrics 48, 829-838.
Nan, B., Lin, X., Lisabeth, L. D., and Harlow, S. (2005). A varying-coefficient Cox model for the effect of age at a marker event on age at menopause. Biometrics 61, 576-583.
Ratcliffe, S. J., Guo, W., and Ten Have, T. R. (2004). Joint modeling of longitudinal and survival data via a common frailty. Biometrics 60, 892-899.
Rocco, M., Dwyer, J., Larive, B., Greene, T., Cockram, D. B., Chumlea, W. C., Kusek, J. W., Leung, J., Burrowes, J. D., McLeroy, S. L., Poole, D., and Uhlin, L. for the HEMO Study Group. (2004). The effect of dialysis dose and membrane flux on nutritional parameters in hemodialysis patients: Results of the HEMO study. Kidney International 65, 2321-2334.
Schluchter, M. D., Greene, T., and Beck, G. J. (2001). Analysis of change in the presence of informative censoring: Application to a longitudinal clinical trial of progressive renal disease. Statistics in Medicine 20, 989-1007.
Schluchter, M. D., Konstan, M. W., and Davis, P. B. (2002). Jointly modelling the relationship between survival and pulmonary function in cystic fibrosis patients. Statistics in Medicine 21, 1271-1287.
Ten Have, T. R., Miller, M. E., Reboussin, B. A., and James, M. K. (2000). Mixed effects logistic regression models for longitudinal ordinal functional response data with multiple-cause drop-out from the longitudinal study of aging. Biometrics 56, 279-287.
Tseng, Y.-K., Hsieh, F., and Wang, J.-L. (2005). Joint modelling of accelerated failure time and longitudinal data. Biometrika 92, 587-603.
Tsiatis, A. A. and Davidian, M. (2004). Joint modeling of longitudinal and time-to-event data: An overview. Statistica Sinica 14, 809-834.
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York: Springer-Verlag.
Wang, C.-Y. (2006). Corrected score estimator for joint modeling of longitudinal and failure time data. Statistica Sinica 16, 235-253.
Wang, Y. and Taylor, J. M. G. (2001). Jointly modeling longitudinal and event time data with application to acquired immunodeficiency syndrome. Journal of the American Statistical Association 96, 895-905.
Wu, M. C. and Carroll, R. J. (1988). Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. Biometrics 44, 175-188.