1 Department of Quantitative Health Sciences, Cleveland Clinic, 9500 Euclid Ave, Wb4, Cleveland, Ohio 44195, U.S.A.
2 Division of Epidemiology, Department of Internal Medicine, University of Utah, 30 North 1900 East, Salt Lake City, Utah 84148, U.S.A.
email: lil2@ccf.org
Summary. In many longitudinal clinical studies, the level and progression rate of repeatedly measured biomarkers on each subject quantify the severity of the disease and that subject's susceptibility to progression of the disease. It is of scientific and clinical interest to relate such quantities to a later time-to-event clinical endpoint, such as patient survival. This is usually done with a shared parameter model. In such models, the longitudinal biomarker data and the survival outcome of each subject are assumed to be conditionally independent given subject-level severity or susceptibility (also called frailty in statistical terms). In this article, we study the case where the conditional distribution of the longitudinal data is modeled by a linear mixed-effect model, and the conditional distribution of the survival data is given by a Cox proportional hazards model. We allow unknown regression coefficients and time-dependent covariates in both models. The proposed estimators are maximizers of an exact correction to the joint log likelihood, with the frailties eliminated as nuisance parameters, an idea that originated from the correction of covariate measurement error in measurement error models. The corrected joint log likelihood is shown to be asymptotically concave and leads to consistent and asymptotically normal estimators. Unlike most published methods for joint modeling, the proposed estimation procedure does not rely on distributional assumptions on the frailties. The proposed method was studied in simulations and applied to a data set from the Hemodialysis (HEMO) Study.

Key words: Corrected score; Cox model; Joint modeling; Measurement error; Regression spline; Shared parameter model.
1. Introduction
In many longitudinal clinical studies, subject-specific estimates of the level and rate of change of longitudinal measurements of a biomarker are used to quantify the initial severity and subsequent course of disease progression. It is often of considerable interest to understand the relationship between these characteristics of the biomarker and long-term time-to-event clinical endpoints characterizing the ultimate disease outcome. For example, Schluchter, Konstan, and Davis (2002) studied the relationship between pulmonary function and survival among cystic fibrosis patients. Pulmonary function was measured by forced expiratory volume in one second (FEV1). Because FEV1 was measured with error, a single measurement is not representative of the severity of pulmonary disease. Repeated measurements of FEV1 were taken, and for each patient the trajectory of these measurements can be thought of as approximately linear, with the intercept and slope representing the level and progression of the pulmonary function damage. Schluchter et al. (2002) used a random intercept and slope model for the repeated FEV1 measurements. These subject-specific random effects (frailties) were then used as predictors of survival. In that paper, the frailties and the survival time were assumed to follow
likelihood score equation of the survival submodel. Because an exact correction does not exist for the partial likelihood (Augustin, 2004), the correction was instead applied to a first- or second-order approximation of the partial likelihood score equation. In this article, for the purpose of estimating both submodels, we work with their joint full likelihood instead of the likelihood from the survival submodel alone. We show that an exact correction to this joint log likelihood exists. To cope with the unspecified baseline hazard function of the Cox model, we approximate it with a regression spline.
Sections 2 and 3 describe the model and the method of estimation. The proposed estimators are shown to be the maximizers of the corrected joint log likelihood, which is asymptotically concave, thus assuring relatively straightforward computation. Simulations and the application to the HEMO data set are presented in Sections 4 and 5. The discussion is in Section 6. Detailed derivations and theoretical properties of the estimator, including consistency and asymptotic normality, are in the Web Appendix.
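Before turning to the model, the numerical device is worth making concrete: with a piecewise-linear approximation to the log baseline hazard, the cumulative hazard over a follow-up period can be evaluated by the trapezoid rule over the spline knots. The sketch below is our own standalone Python illustration; the helper names, knot positions, and hazard values are hypothetical, not the paper's code.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid rule: interval widths times averaged endpoint values."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def cumulative_hazard(log_hazard, follow_up, knots):
    """Approximate the integral of lambda(t) over [0, follow_up] using the
    grid formed by the spline knots that fall inside the follow-up period."""
    inner = [k for k in knots if 0.0 < k < follow_up]
    grid = np.array([0.0] + inner + [follow_up])
    return trapezoid(np.exp(log_hazard(grid)), grid)

# Hypothetical piecewise-linear log hazard interpolating values at the knots
knots = np.array([0.0, 1.0, 2.0, 3.0])
log_lam_at_knots = np.array([-1.0, -0.5, 0.0, 0.2])
log_hazard = lambda t: np.interp(t, knots, log_lam_at_knots)

H = cumulative_hazard(log_hazard, follow_up=2.5, knots=knots)
```

Because the exponential of a piecewise-linear function is smooth within each knot interval, the trapezoid rule over the knots is already accurate; refining the grid changes the value only slightly.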
2. Model and Notation
Suppose the study has n subjects indexed by i = 1, 2, ..., n. The ith subject has n_i repeated biomarker measurements W_ij taken at times t_i1 < t_i2 < ... < t_in_i. The longitudinal submodel is specified in two steps as follows. In the first step,

    W_ij = V_ij^T α + D_ij^T β_i + ε_ij.    (1)

This is a linear mixed model. V_ij is a p-vector of time-dependent covariates with fixed-effect coefficients α. β_i is a subject-specific random effect vector. D_ij is a q-vector. Often q = 2 and D_ij = (1, t_ij)^T, in which case β_i represents the random intercept and slope. When D_ij = 1, the model has only a random intercept, similar to that in Ratcliffe et al. (2004). Random effects with more than two elements may be considered; for example, a spline term in time may be included in D_ij to model a nonlinear trajectory over time. In this article we focus mainly on the case q = 2, though the methodology applies to the other cases with little modification. We assume that the residual ε_ij is normal with mean 0 and variance σ².
In the second step, we specify a model for β_i that incorporates time-independent covariates X_i:

    β_i = Γ X_i + b_i,    (2)
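To fix ideas, the two-step longitudinal submodel W_ij = V_ij^T α + D_ij^T β_i + ε_ij with β_i = Γ X_i + b_i can be simulated directly. The following standalone Python sketch generates one subject's data; all covariates and parameter values here are hypothetical illustrations, not the paper's simulation settings.

```python
import numpy as np

def simulate_subject(t, x, alpha, Gamma, Sigma, sigma, rng):
    """Simulate one subject's longitudinal data from the two-step model:
    W_ij = V_ij' alpha + D_ij' beta_i + eps_ij,   beta_i = Gamma x_i + b_i."""
    n_i = len(t)
    D = np.column_stack([np.ones(n_i), t])   # random intercept and slope (q = 2)
    V = np.column_stack([np.sin(t)])         # an illustrative time-dependent covariate
    b = rng.multivariate_normal(np.zeros(2), Sigma)
    beta = Gamma @ x + b                     # subject-specific intercept and slope
    W = V @ alpha + D @ beta + rng.normal(0.0, sigma, n_i)
    return W, V, D, beta

rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 7)                 # 7 visit times
x = np.array([1.0, 0.5])                     # time-independent covariates X_i
alpha = np.array([1.0])
Gamma = np.array([[1.0, 0.5], [-0.3, 0.2]])  # hypothetical 2 x 2 coefficient matrix
Sigma = np.diag([0.25, 0.04])
W, V, D, beta = simulate_subject(t, x, alpha, Gamma, Sigma, 0.3, rng)
```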
where Γ is a matrix of regression coefficients and the random effect b_i satisfies

    b_i ~ N(0, Σ).    (3)

The normality in (3) is used to derive the corrected likelihood below, but the resulting estimator does not rely on it (Theorem 1).
The survival submodel is a Cox proportional hazards model in which the frailties enter as covariates. Let Y_i denote the observed follow-up time of subject i, δ_i the event indicator, and Z_i a vector of time-independent covariates. The hazard is

    λ_i(t) = λ_0(t) exp{a_1^T Z_i + a_2^T β_i},    (4)

where λ_0(t) is an unspecified baseline hazard. We approximate the log baseline hazard by a piecewise-linear regression spline, log λ_0(t) ≈ Σ_{j=1}^J γ_j B_j(t). Write U_i(t) = (B_1(t), ..., B_J(t), Z_i^T, β_i^T)^T and θ = (γ^T, a_1^T, a_2^T)^T, so that log λ_i(t) = U_i(t)^T θ.
Conditional on β_i, the contribution of subject i to the joint log likelihood has three pieces: from the repeated measurements,

    LL_i(θ) = −(n_i/2) log(2πσ²) − ‖W_i − V_i α − D_i β_i‖² / (2σ²),    (5)

where W_i = (W_i1, ..., W_in_i)^T and V_i and D_i stack the V_ij^T and D_ij^T by rows; from the survival data,

    LS_i(θ) = δ_i U_i(Y_i)^T θ − ∫_0^{Y_i} λ_i(t) dt;    (6)

and from the frailty model (2)-(3),

    LM_i(θ) = −(q/2) log(2π) − (1/2) log|Σ| − (β_i − Γ X_i)^T Σ^{-1} (β_i − Γ X_i)/2.    (7)

Because the log hazard is piecewise linear in t, the integral in (6) can be evaluated by the trapezoid rule over the grid points 0 = τ_0 < τ_1 < ... formed by the spline knots. For subject i, let M_i be the smallest index with τ_{M_i} ≥ Y_i, so that the points min(τ_g, Y_i), g = 0, ..., M_i, partition [0, Y_i]. Then

    ∫_0^{Y_i} λ_i(t) dt ≈ Σ_{g=0}^{M_i} c_ig λ_i(min(τ_g, Y_i)),    (8)

where the trapezoid weights c_ig are determined by the interval lengths min(τ_{g+1}, Y_i) − τ_g. Let Y_ig = δ_i 1{g = M_i} denote the event indicator of subject i at grid point g, so that δ_i U_i(Y_i)^T θ = Σ_{g=0}^{M_i} Y_ig U_i(min(τ_g, Y_i))^T θ. Let U_1ig = (B_1(min(τ_g, Y_i)), ..., B_J(min(τ_g, Y_i)), Z_i^T)^T with coefficient vector θ_1 = (γ^T, a_1^T)^T, and U_2ig = β_i with coefficient vector θ_2 = a_2. Then equation (6) becomes, up to an additive constant,

    LS_i(θ) = Σ_{g=0}^{M_i} { Y_ig [θ_1^T U_1ig + θ_2^T β_i + log(c_ig)] − c_ig exp(θ_1^T U_1ig + θ_2^T β_i) },    (9)

which has the form of a Poisson log likelihood with offsets log(c_ig) (Cai, Hyndman, and Wand, 2002).
The frailties β_i are not observed. Let W̃_i = W_i − V_i α, β̂_i = (D_i^T D_i)^{-1} D_i^T W̃_i, and R̂_i = {I_n_i − D_i (D_i^T D_i)^{-1} D_i^T} W̃_i, so that

    W̃_i = D_i β̂_i + R̂_i.    (10)

Then:
(i) β̂_i and R̂_i are independent given β_i;
(ii) E(R̂_i) = 0;
(iii) β̂_i | β_i ~ N(β_i, σ² (D_i^T D_i)^{-1}); letting Λ_i Λ_i^T = σ² (D_i^T D_i)^{-1}, we can write β̂_i = β_i + Λ_i e_i, where e_i is a standard multivariate normal random vector independent of β_i and of the survival submodel;
(iv) an unbiased estimator of σ² is ‖R̂_i‖²/(n_i − q).

Based on these results, we can derive the following expressions, conditional on β_i, i = 1, 2, ..., n (details in Web Appendix A):

    Σ_{i=1}^n L̃L_i(θ) = { −Σ_{i=1}^n (n_i/2) log(2πσ²) − (1/(2σ²)) Σ_{i=1}^n (W_i − V_i α)^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] (W_i − V_i α) − nq/2 } (1 + o_p(1)),    (11)
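The quantities β̂_i and R̂_i above are just the per-subject least-squares fit and residual, so properties (i)-(iv) can be verified numerically. The following sketch is our own illustration with an arbitrary design matrix; it checks that β̂_i is unbiased for β_i and that ‖R̂_i‖²/(n_i − q) is unbiased for σ².

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
D = np.column_stack([np.ones_like(t), t])   # q = 2: intercept and slope
beta = np.array([2.0, -1.0])                # a fixed subject-level frailty
sigma = 0.5
n_i, q = D.shape

beta_hats, sig2_hats = [], []
for _ in range(20000):
    # W_tilde = W_i - V_i alpha, conditional on beta_i
    W_tilde = D @ beta + rng.normal(0.0, sigma, n_i)
    beta_hat = np.linalg.solve(D.T @ D, D.T @ W_tilde)  # (iii): N(beta, sigma^2 (D'D)^{-1})
    R_hat = W_tilde - D @ beta_hat                      # residual, (i)-(ii)
    beta_hats.append(beta_hat)
    sig2_hats.append(R_hat @ R_hat / (n_i - q))         # (iv)

beta_mc = np.mean(beta_hats, axis=0)
sig2_mc = np.mean(sig2_hats)
```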
    Σ_{i=1}^n L̃S_i(θ) = Σ_{i=1}^n Σ_{g=0}^{M_i} { Y_ig [θ_1^T U_1ig + θ_2^T β̂_i + log(c_ig)] − c_ig exp(θ_1^T U_1ig + θ_2^T β̂_i − σ² θ_2^T (D_i^T D_i)^{-1} θ_2 / 2) } (1 + o_p(1)),    (12)

where β̂_i = (D_i^T D_i)^{-1} D_i^T (W_i − V_i α) depends on the unknown α; and

    Σ_{i=1}^n L̃M_i(θ) = { −(nq/2) log(2π) − (n/2) log|Σ| − (1/2) Σ_{i=1}^n (β̂_i − Γ X_i)^T Σ^{-1} (β̂_i − Γ X_i) + (σ²/2) Σ_{i=1}^n tr[Σ^{-1} (D_i^T D_i)^{-1}] } (1 + o_p(1)).    (13)

In these expressions, we eliminated the unknown nuisance parameters β_i, i = 1, 2, ..., n, by making use of the conditional independence of W_i and (Y_i, δ_i) given β_i. This technique is closely related to the corrected score method in the measurement error literature (Nakamura, 1990). It views β̂_i as an error-prone measurement of β_i in the additive error model β̂_i = β_i + Λ_i e_i with heterogeneous error. The same idea was used in Wang (2006). In that paper, β̂_i is a statistic and involves no unknown parameters, so the standard corrected score method for the Cox partial likelihood (Nakamura, 1992) can be applied once this additive error relationship is derived. In our problem, because there are unknown parameters in the longitudinal submodel, β̂_i is no longer a statistic. Hence, correcting the partial likelihood of the survival submodel alone is not enough; we have to use the joint likelihood from both submodels. For this purpose, it is more natural to work with the full likelihood of the Cox model than with its partial likelihood. Another important distinction of our work from Wang (2006) is that, because an exact correction to the partial likelihood score does not exist (Augustin, 2004), their correction was applied to a first- or second-order approximation of the partial likelihood score, similar to Nakamura (1992). In this article, we have an exact correction to the joint log likelihood, as shown in equations (11)-(13).
Summing the three pieces derived above, we have the joint corrected log likelihood

    L̃(θ, σ², Σ) = Σ_{i=1}^n L̃L_i(θ) + Σ_{i=1}^n L̃S_i(θ) + Σ_{i=1}^n L̃M_i(θ),    (14)

where L̃L_i(θ), L̃S_i(θ), and L̃M_i(θ) denote the approximations on the right-hand sides of equations (11)-(13).
The unknown parameters are the regression parameters and the variance components. The variance components are nuisance parameters in most applications. We start with the case in which σ² is known. In this case, for a given value of Σ, the regression parameters ϑ = (α^T, vec(Γ)^T, θ^T)^T can be estimated by maximizing equation (14), or equivalently, by solving its score equation. The following theorem summarizes the properties of the estimator.

Theorem 1. Assume that σ² is known and that the trapezoid approximation and the regression spline approximation to the log baseline hazard are exact. Then, for a given value of Σ, L̃ regarded as a function of ϑ is asymptotically concave, and its score equation S(ϑ) = ∂L̃/∂ϑ = 0 is an unbiased estimating equation for ϑ. The estimator ϑ̂ is consistent and asymptotically normal as n → ∞ with all the n_i bounded above by a positive integer. These conclusions hold without any distributional assumptions on the b_i.
The proof is deferred to Web Appendix B. Theorem 1 shows that although the joint corrected log likelihood was derived under the assumption of normal frailties, the estimator remains consistent without that assumption. The consistency result holds even when the covariance matrix of the random effect, Σ, is misspecified (see Web Appendix B). Theorem 1 also suggests that when n is large enough, the estimator is uniquely defined and the computation involves maximization of a concave function, which is straightforward in many software packages. In our numerical simulations, we experienced no convergence problems.
The following is the practical estimation procedure, which incorporates the estimation of σ²:

(1) If there is no time-dependent covariate V, we can estimate σ² consistently by

    σ̂² = Σ_{i=1}^n W_i^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] W_i / Σ_{i=1}^n (n_i − q).

If there is a time-dependent covariate V, we first estimate α consistently from the following estimating equation, which is the derivative of the quadratic form in equation (11) with respect to α:

    0 = Σ_{i=1}^n V_i^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] (W_i − V_i α),

and then set

    σ̂² = Σ_{i=1}^n (W_i − V_i α̂)^T [I_n_i − D_i (D_i^T D_i)^{-1} D_i^T] (W_i − V_i α̂) / Σ_{i=1}^n (n_i − q).

(2) From σ̂² and any given Σ, we can estimate the regression parameters by solving the unbiased estimating equation S(ϑ) = 0, which is equivalent to maximizing equation (14).
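Step (1) is pure linear algebra: the estimating equation for α is linear in α, and σ̂² is a ratio of quadratic forms. The following standalone Python sketch carries out step (1) on simulated data; the covariates, sample size, and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data (illustrative): n subjects, random intercept and slope (q = 2)
n, sigma = 200, 0.4
alpha = np.array([1.5])                    # coefficient of time-dependent covariate V
subjects = []
for _ in range(n):
    t = np.sort(rng.uniform(0.0, 3.0, rng.integers(4, 9)))
    D = np.column_stack([np.ones_like(t), t])
    V = np.column_stack([np.cos(t)])       # a time-dependent covariate
    beta = rng.multivariate_normal([2.0, -1.0], np.diag([0.3, 0.1]))
    W = V @ alpha + D @ beta + rng.normal(0.0, sigma, len(t))
    subjects.append((W, V, D))

# Estimating equation 0 = sum_i V_i' M_i (W_i - V_i alpha), with M_i the
# projection off the columns of D_i; it is linear in alpha, so solve directly.
A = np.zeros((1, 1))
b = np.zeros(1)
for W, V, D in subjects:
    M = np.eye(len(W)) - D @ np.linalg.solve(D.T @ D, D.T)
    A += V.T @ M @ V
    b += V.T @ M @ W
alpha_hat = np.linalg.solve(A, b)

# Plug alpha_hat into the residual-based variance estimator
num = den = 0.0
for W, V, D in subjects:
    M = np.eye(len(W)) - D @ np.linalg.solve(D.T @ D, D.T)
    r = W - V @ alpha_hat
    num += r @ M @ r
    den += len(W) - D.shape[1]             # n_i - q
sigma2_hat = num / den
```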
[Figure 1 appears here: two panels plotted against time, 0 to 10.]
Figure 1. Median estimated baseline hazard and survival functions with three internal knots (dotted) and six internal knots (dashed). The true hazard/survival functions are in solid lines. The result is based on 200 simulation runs. The sample size is 250.
Table 1
Assessing the performance of the proposed method. n: sample size; emp SD: sample standard deviations of the 200 point estimators from the simulations; ave SE: average estimated standard errors of the point estimators; coverage: coverage probability of the 95% confidence interval.

n      Parameter (=true value)   Bias       Emp SD   Ave SE   Coverage (%)
250    α = 1                     0.00751    0.0389   0.0406   96.5
       Γ1 = 1                    0.00869    0.105    0.103    96.0
       Γ2 = 2                    0.00830    0.146    0.140    94.5
       Γ3 = 1                    0.00256    0.0601   0.0575   94.5
       Γ4 = 0.5                  0.00639    0.0791   0.0779   95.5
       a1 = 0.5 (Z)              0.0239     0.0948   0.0923   95.0
       a21 = 0.5 (β_i1)          0.0469     0.153    0.150    97.0
       a22 = 1 (β_i2)            0.103      0.360    0.354    96.5
1000   α = 1                     0.000473   0.0209   0.0206   95.5
       Γ1 = 1                    0.00268    0.0480   0.0520   97.0
       Γ2 = 2                    0.00119    0.0693   0.0705   96.0
       Γ3 = 1                    0.000801   0.0302   0.0290   93.0
       Γ4 = 0.5                  0.00229    0.0431   0.0391   93.0
       a1 = 0.5 (Z)              0.0157     0.0402   0.0413   95.0
       a21 = 0.5 (β_i1)          0.0122     0.0631   0.0619   96.5
       a22 = 1 (β_i2)            0.0417     0.148    0.143    95.5
Table 2
Comparing the naive method with the proposed joint modeling method. n: sample size; emp SD: sample standard deviations of the 200 point estimators from the simulations; MSE: mean squared error; coverage: coverage probability of the 95% confidence interval.

                          Bias                  Emp SD                MSE                   Coverage (%)
Parameter (=true value)   Naive      Joint      Naive      Joint      Naive      Joint      Joint
(a) n = 250
Γ1 = 1                    0.00197    0.00299    0.105      0.103      0.105      0.103      94.5
Γ2 = 2                    0.00370    0.00571    0.144      0.137      0.144      0.137      94.0
Γ3 = 1                    0.00591    0.00659    0.0561     0.0546     0.0564     0.0550     94.0
Γ4 = 0.5                  0.0104     0.0118     0.0800     0.0709     0.0807     0.0719     97.0
a21 = 0.5 (β_i1)          0.347      0.0196     0.0550     0.111      0.351      0.113      96.0
a22 = 1 (β_i2)            0.471      0.0552     0.140      0.258      0.491      0.264      95.5
(b) n = 1000
Γ1 = 1                    0.00455    0.00656    0.0510     0.0482     0.0512     0.0486     95.5
Γ2 = 2                    0.00244    0.00635    0.0740     0.0667     0.0740     0.0670     97.0
Γ3 = 1                    0.00277    0.00184    0.0284     0.0267     0.0285     0.0268     95.0
Γ4 = 0.5                  0.000365   0.00146    0.0400     0.0360     0.0400     0.0360     96.5
a21 = 0.5 (β_i1)          0.345      0.00752    0.0256     0.0517     0.346      0.0522     93.5
a22 = 1 (β_i2)            0.499      0.00493    0.0675     0.114      0.504      0.114      93.5
(c) n = 250
Γ1 = 1                    0.00769    0.00769    0.0985     0.0985     0.0988     0.0988     97.0
Γ2 = 2                    0.00421    0.00421    0.147      0.147      0.147      0.147      96.0
Γ3 = 1                    0.000279   0.000277   0.0529     0.0529     0.0529     0.0529     97.5
Γ4 = 0.5                  0.00147    0.00147    0.0843     0.0843     0.0843     0.0843     93.5
a1 = 0.5 (Z)              0.0304     0.0210     0.160      0.191      0.163      0.192      96.0
a21 = 0.5 (β_i1)          0.114      0.0377     0.0684     0.130      0.133      0.135      95.5
a22 = 1 (β_i2)            0.292      0.0526     0.136      0.287      0.322      0.292      96.0
(d) n = 1000
Γ1 = 1                    0.00510    0.00510    0.0520     0.0520     0.0522     0.0522     92.5
Γ2 = 2                    0.000560   0.000561   0.0757     0.0757     0.0757     0.0757     94.0
Γ3 = 1                    0.00260    0.00260    0.0268     0.0268     0.0269     0.0269     95.5
Γ4 = 0.5                  0.00364    0.00364    0.0411     0.0411     0.0413     0.0413     94.5
a1 = 0.5 (Z)              0.0299     0.0116     0.0795     0.0893     0.0849     0.0901     95.0
a21 = 0.5 (β_i1)          0.122      0.00513    0.0315     0.0509     0.126      0.0512     95.0
a22 = 1 (β_i2)            0.290      0.00196    0.0654     0.113      0.297      0.113      95.0
Table 3
Fitting the joint model using a regression spline with three or six internal knots. Emp SD: sample standard deviations of the 200 point estimators from the simulations.

                                       Bias                  Emp SD
Sample size   Parameter (=true value)  3 knots    6 knots    3 knots   6 knots
250           Γ1 = 1                   0.00616    0.00609    0.0956    0.0956
              Γ2 = 2                   0.000893   0.000774   0.125     0.125
              Γ3 = 1                   0.00612    0.00616    0.0607    0.0607
              Γ4 = 0.5                 0.00543    0.00548    0.0796    0.0796
              a21 = 0.5 (β_i1)         0.0292     0.0295     0.107     0.107
              a22 = 1 (β_i2)           0.0435     0.0444     0.212     0.212
1000          Γ1 = 1                   0.00681    0.00679    0.0524    0.0525
              Γ2 = 2                   0.0107     0.0106     0.0686    0.0686
              Γ3 = 1                   0.000871   0.000879   0.0296    0.0296
              Γ4 = 0.5                 0.00117    0.00117    0.0368    0.0368
              a21 = 0.5 (β_i1)         0.000986   0.000609   0.0564    0.0565
              a22 = 1 (β_i2)           0.00791    0.00875    0.101     0.101

5. Application
We now analyze the HEMO data introduced in Section 1. We included serum albumin measurements from biannual visits scheduled approximately 6 months apart. The data consist of 1628 patients with between 3 and 15 repeated albumin measurements each (including baseline). For the purpose of illustrating the proposed method, we excluded patients with
calibration. There are two frailty terms in the survival submodel: the random intercept and slope. They characterize the adjusted subject-specific level of albumin and the susceptibility to albumin decline. The treatment groups are included in the survival submodel as well. Table 4 presents the results from the joint modeling. The log baseline hazard was approximated by a piecewise-linear regression spline with six internal knots; the corresponding analysis using three internal knots gave similar results and is omitted. The results from the longitudinal submodel show that albumin follows an overall declining trend (the coefficient of time in years), and the results from the survival submodel show that both frailties significantly predict survival: a decreased level of albumin or a steeper declining slope is associated with an increased risk of death. Note that albumin measurements taken on Mondays and Tuesdays are generally lower than measurements taken on other weekdays (p = 0.0167). This confirms our conjecture that calibration of albumin is necessary for analyses involving longitudinal albumin measurements. The estimated baseline survival function is plotted in Figure 2.

Table 4
Applying the proposed method to the HEMO data set

Parameter              Estimator   S.E.     Lower CI   Upper CI   p-value
Longitudinal submodel
  Intercept            3.656       0.0151   3.626      3.685      <0.001
  High dose            0.00121     0.0163   −0.0308    0.0332     0.941
  High flux            −0.00700    0.0163   −0.0390    0.0250     0.668
  Time in years        −0.0581     0.0119   −0.0814    −0.0349    <0.001
  High dose by time    −0.0143     0.0142   −0.0421    0.0134     0.311
  High flux by time    −0.0103     0.0142   −0.0381    0.0175     0.468
  Mon/Tues             −0.0257     0.0107   −0.0467    −0.00466   0.0167
Survival submodel
  High dose            −0.0606     0.0900   −0.237     0.116      0.500
  High flux            −0.0689     0.0898   −0.245     0.107      0.443
  Frailty intercept    −1.464      0.179    −1.815     −1.113     <0.001
  Frailty slope        −3.708      0.378    −4.449     −2.968     <0.001

[Figure 2 appears here: estimated baseline survival (y-axis: Survival, 0.0 to 1.0) plotted against Time (Years).]
Figure 2. Estimated baseline survival function and its pointwise 95% confidence interval (gray area).

6. Discussion
We have proposed a semiparametric joint modeling approach for both longitudinal and survival data. A distinguishing feature of this method is that, unlike most published joint modeling methods, it does not require any distributional assumption for the frailties. Imposing such assumptions may improve the efficiency of estimation but could also lead to increased bias. When the sample size is large, as in the application we studied, bias considerations often outweigh concerns with efficiency. Our method is more general than the corrected score method proposed by Wang (2006) in that we allow unknown regression parameters in both the longitudinal and survival submodels. This generalization was necessary to handle the calibration of albumin in the HEMO study application. As discussed in Section 3.3, incorporating unknown parameters from the longitudinal submodel raises new challenges not addressed by the framework of Wang (2006). We therefore initiated a new line of attack, using an exact correction to the joint full log likelihood of both submodels instead of a correction to a second-order approximation of the Cox partial likelihood of the survival submodel alone. Many joint
modeling methods demand intensive computations, and convergence can be a problem. We have proven that our target function for maximization is asymptotically concave, and we did not experience nonconvergence in the simulations presented in Tables 1-3.
A limitation of the proposed method is that each subject must have at least three nonmissing repeated longitudinal measurements for it to work with the random intercept and slope model, as in the application. This is also a limitation of the method of Wang (2006). In studies designed to address a joint modeling problem, the design should stipulate adequate measurements early in the follow-up period so that sufficient repeated measurements are available prior to censoring for the large majority of subjects. Although methods that use parametric distributions for the random effects allow patients to have fewer than three repeated measurements, these subjects provide little information regarding their random effects.

Acknowledgements
This work was partially supported by grant 5U01DK053869-06 and grant 5U01DK045430-14 from the National Institute of Diabetes and Digestive and Kidney Diseases.
References
Augustin, T. (2004). An exact corrected log-likelihood function for Cox's proportional hazards model under measurement error and some extensions. Scandinavian Journal of Statistics 31, 43-50.
Cai, T., Hyndman, R. J., and Wand, M. P. (2002). Mixed model-based hazard estimation. Journal of Computational and Graphical Statistics 11, 784-798.
Carroll, R., Ruppert, D., and Stefanski, L. (1995). Nonlinear Measurement Error Models. New York: Chapman & Hall/CRC.
Hsieh, F., Tseng, Y.-K., and Wang, J.-L. (2006). Joint modeling of survival and longitudinal data: Likelihood approach revisited. Biometrics 62, 1037-1043.
Kooperberg, C., Stone, C. J., and Truong, Y. K. (1995). Hazard regression. Journal of the American Statistical Association 90, 78-94.
Li, E., Zhang, D., and Davidian, M. (2004). Conditional estimation for generalized linear models when covariates are subject-specific parameters in a mixed model for longitudinal measurements. Biometrics 60, 1-7.
Li, L. and Greene, T. (2008). Varying coefficients model with measurement error. Biometrics 64, 519-526.
Liu, M. and Ying, Z. (2007). Joint analysis of longitudinal data with informative right censoring. Biometrics 63, 363-371.
Nakamura, T. (1990). Corrected score function for errors-in-variables models: Methodology and application to generalized linear models. Biometrika 77, 127-137.
Nakamura, T. (1992). Proportional hazards model with covariates subject to measurement error. Biometrics 48, 829-838.
Nan, B., Lin, X., Lisabeth, L. D., and Harlow, S. (2005). A varying-coefficient Cox model for the effect of age at a marker event on age at menopause. Biometrics 61, 576-583.
Ratcliffe, S. J., Guo, W., and Ten Have, T. R. (2004). Joint modeling of longitudinal and survival data via a common frailty. Biometrics 60, 892-899.
Rocco, M., Dwyer, J., Larive, B., Greene, T., Cockram, D. B., Chumlea, W. C., Kusek, J. W., Leung, J., Burrowes, J. D., McLeroy, S. L., Poole, D., and Uhlin, L. for the HEMO Study Group. (2004). The effect of dialysis dose and membrane flux on nutritional parameters in hemodialysis patients: Results of the HEMO study. Kidney International 65, 2321-2334.
Schluchter, M. D., Greene, T., and Beck, G. J. (2001). Analysis of change in the presence of informative censoring: Application to a longitudinal clinical trial of progressive renal disease. Statistics in Medicine 20, 989-1007.
Schluchter, M. D., Konstan, M. W., and Davis, P. B. (2002). Jointly modelling the relationship between survival and pulmonary function in cystic fibrosis patients. Statistics in Medicine 21, 1271-1287.
Ten Have, T. R., Miller, M. E., Reboussin, B. A., and James, M. K. (2000). Mixed effects logistic regression models for longitudinal ordinal functional response data with multiple-cause drop-out from the longitudinal study of aging. Biometrics 56, 279-287.
Tseng, Y.-K., Hsieh, F., and Wang, J.-L. (2005). Joint modelling of accelerated failure time and longitudinal data. Biometrika 92, 587-603.
Tsiatis, A. A. and Davidian, M. (2004). Joint modeling of longitudinal and time-to-event data: An overview. Statistica Sinica 14, 809-834.
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York: Springer-Verlag.
Wang, C.-Y. (2006). Corrected score estimator for joint modeling of longitudinal and failure time data. Statistica Sinica 16, 235-253.
Wang, Y. and Taylor, J. M. G. (2001). Jointly modeling longitudinal and event time data with application to acquired immunodeficiency syndrome. Journal of the American Statistical Association 96, 895-905.
Wu, M. C. and Carroll, R. J. (1988). Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. Biometrics 44, 175-188.