discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/7617538
5 authors, including:
Rosemary Martino
University of Toronto
All content following this page was uploaded by Rosemary Martino on 18 February 2016.
American Journal of Gastroenterology ISSN 0002-9270
© 2005 by Am. Coll. of Gastroenterology doi: 10.1111/j.1572-0241.2005.50141.x
Published by Blackwell Publishing
ORIGINAL CONTRIBUTIONS
OBJECTIVES: To develop a measure of disease-specific health-related quality of life for achalasia for use as an
outcome measure in clinical trials.
METHODS: We generated a list of potential items for a measure of disease-specific health-related quality of life
for achalasia by semistructured interviews with seven persons with achalasia, and by expert opinion.
We then used factor analysis and item response theory methods for item reduction, using responses
on the long-form questionnaire from 70 persons with achalasia. The severity measure underlying the
item responses was constructed using a Rasch model.
RESULTS: We developed a 10-item measure of disease-specific health-related quality of life that sampled the
concepts of food tolerance, dysphagia-related behavior modifications, pain, heartburn, distress,
lifestyle limitation, and satisfaction. The measure was reliable (person separation reliability 0.79,
Cronbach's α 0.83), showed evidence of construct validity and good data-to-model fit (mean infit and
outfit statistics for items, 1.00 and 0.98, respectively), and had a wide effective measurement range
(able to discriminate between 87% of subjects with achalasia). The measure was recalibrated onto a
0-100 interval-level scale.
CONCLUSIONS: We describe a reliable measure of achalasia disease-specific health-related quality of life that has a
broad effective measurement range, interval-level properties, and evidence of construct validity. This
measure is appropriate for use as an outcome measure in clinical trials and other evaluative studies
on the effectiveness of treatment for achalasia.
(Am J Gastroenterol 2005;100:1668-1676)
Achalasia is a rare chronic disease of esophageal motor function characterized by failure of the lower esophageal sphincter (LES) to relax upon swallowing, and absence of peristalsis in the esophageal body. Persons with achalasia experience dysphagia, regurgitation, and chest pain. Treatment is directed toward relieving obstruction at the level of the LES, by surgical myotomy, balloon dilatation, or injection of botulinum toxin (1). These procedures vary with respect to effectiveness, safety and durability (2). With the introduction of minimally invasive surgical techniques, laparoscopic surgery for achalasia has become increasingly popular (3, 4), primarily due to a belief that laparoscopic surgery is more effective than endoscopic therapy (5) and is associated with less operative risk and disability than conventional open surgery. However, there is considerable difference of opinion among clinicians whether laparoscopic surgery or balloon dilatation is the best initial approach to treating achalasia (1, 2, 6).

Controversy regarding the optimal management of achalasia has been exacerbated by a paucity of high-quality evidence comparing the effectiveness of alternative treatments. The low incidence of achalasia (7) and scarcity of randomized trials comparing surgical myotomy and balloon dilatation (8) have made it difficult to compare the effectiveness of alternative therapeutic approaches. However, the biggest impediment to date has been the absence of a valid and reliable measure of achalasia disease-specific health-related quality of life. Since achalasia is not curable, the goal of treatment is to improve symptoms and quality of life of affected persons.

Although there are clinical measures of achalasia symptoms, such as dysphagia, these measures were not developed for use in patients with achalasia (9, 10). We sought to develop a disease-specific measure of health-related quality of life for use in patients with achalasia that was valid and reliable. Further, because we wanted to develop an instrument that would be useful as an outcome measure in clinical trials, we desired a measure with interval-level properties (11). In this article, we describe the development and initial validation of the measure.
A Measure of Achalasia Quality of Life 1669
Figure 1. Item probability curve for the question: "How often in the past month have you needed to drink water while eating to deal with
food caught in your esophagus?" The probability of responding to each of the item categories is plotted according to the recalibrated overall
score (0-100) on the measure.
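The curves in an item probability plot of this kind can be sketched as follows. This is a minimal illustration that assumes a partial credit parameterization of the Rasch model; the function name `category_probabilities` and the step values are hypothetical and are not estimates from this study.

```python
import math

def category_probabilities(theta, steps):
    """Partial credit model: probabilities of categories 0..len(steps).

    theta: person severity in logits
    steps: step parameters in logits (hypothetical values in this sketch)
    """
    # Category k has exponent sum_{j<=k} (theta - step_j); category 0 has 0.
    exponents = [0.0]
    for step in steps:
        exponents.append(exponents[-1] + (theta - step))
    weights = [math.exp(e) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical three-category item (e.g., never/rarely, sometimes, frequently).
steps = [-1.0, 1.0]
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    probs = category_probabilities(theta, steps)
    print(theta, [round(p, 3) for p in probs])
```

Under this parameterization the curves for two adjacent categories cross where the person severity equals the corresponding step parameter, which is the kind of intersection point the item probability curve displays.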
4 items on dysphagia symptoms scored yes or no; many respondents answered only 1 of 4 items), or items for which there was uniformity of response across respondents. Using this method, we were able to reduce the instrument to a 17-item measure. We also collapsed response categories within items, to ensure that there was the minimum of 10 responses per response category required for our modeling strategy. Further item reduction was accomplished by use of the Rasch model, by retaining one item of a group whose mean severity score across response categories overlapped. This method enabled us to retain those items that measured severity across different locations of the underlying scale.

Analysis of Fit
Both person and item trait level are mapped onto the same logit scale in the Rasch model. The probability of responding a certain way to each item can be estimated for each person. Fit of the model to the data can then be calculated, and is usually quantified using statistics called "infit" (a χ² fit statistic sensitive to lack of fit for persons responding to items close to their severity level) and "outfit" (a χ² fit statistic sensitive to lack of fit for persons responding to items far from their severity level). The expected value of mean-square infit and outfit statistics is 1.0, and a range of 0.6-1.4 is considered reasonable (16, 20). These statistics may also be standardized to a mean of 0 and standard deviation of 1.0.

Reliability
We estimated the scale reliability using the person separation reliability statistic (the proportion of variance not due to measurement error) and Cronbach's α (internal consistency reliability).

distance between -2 logits and -1 logits is the same as the distance between 0 logits and +1 logits, and the same as the distance between +2 logits and +3 logits. We used the logit scale to recalibrate the raw score on the final measure into a scale ranging from 0 to 100, with 0 indicating very low severity and 100 indicating the highest possible severity.

Effective Measurement Range
For each item the probability of responding a certain way can be modeled as a function of the severity of the underlying trait. For example, the probability of answering "never/rarely" to the question "How often in the past month have you needed to drink water while eating to deal with food caught in your esophagus?" should be higher among persons with milder achalasia than the probability of answering "sometimes" or "frequently/every time I eat." This relationship can be plotted as an item probability curve (Fig. 1). There is a score on the measure that corresponds to the point where the probability of responding "never/rarely" is the same as responding "sometimes"; this is the score where the probability curves for these two responses intersect. The score at which this occurs (i.e., the probability of responding to both categories is the same) is called the Thurstone Threshold. When all such thresholds for all items are plotted on the scale, the effective measurement range extends from the threshold closest to the lowest trait severity to the threshold closest to the highest trait severity level. We estimated the proportion of persons whose trait severity was more extreme than the levels at which the lowest and highest thresholds were located. The measure cannot reliably discriminate between persons with trait levels that are either all below the lowest threshold or all above the highest threshold.
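The mean-square fit statistics described under Analysis of Fit can be illustrated with a minimal sketch. It assumes a dichotomous (yes/no) Rasch model for simplicity, whereas the items in this measure have several response categories; the function names, person severities, and responses are hypothetical and are not data from the study.

```python
import math

def rasch_probability(theta, delta):
    """Probability of endorsing an item; theta and delta are in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def item_fit(responses, thetas, delta):
    """Return (infit, outfit) mean squares for one dichotomous item.

    responses: observed 0/1 answers, one per person
    thetas:    person severities in logits
    delta:     item location in logits
    """
    sq_resid_sum = 0.0  # sum of squared raw residuals
    var_sum = 0.0       # sum of model variances (information)
    z2_sum = 0.0        # sum of squared standardized residuals
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, delta)
        var = p * (1.0 - p)
        sq_resid_sum += (x - p) ** 2
        var_sum += var
        z2_sum += (x - p) ** 2 / var
    infit = sq_resid_sum / var_sum    # information-weighted mean square
    outfit = z2_sum / len(responses)  # unweighted mean square
    return infit, outfit

thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical person severities
print(item_fit([0, 0, 1, 1, 1], thetas, 0.0))  # orderly response pattern
print(item_fit([1, 0, 1, 1, 1], thetas, 0.0))  # surprise from a mild case
```

In the second call the unexpected response comes from the person located farthest from the item, so the outfit statistic rises more than the infit statistic, which matches the distinction drawn between the two.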
categories for items not included in the final instrument. We tested for an association between the instrument score and severity across response categories using the test for linear trend and t-tests where appropriate.

RESULTS
Study Subjects
After subsequent mailings to initial nonresponders, we received 70 usable questionnaires from the 121 mailed to subjects with a known address (58% response). The seven subjects who were interviewed for the item generation process also completed questionnaires. Characteristics of the respondents are presented in Table 1.

Figure 2 shows the location of the item responses for all 10 items on the recalibrated measure (scored 0-100, see below), as well as the recalibrated score for each person (depicted as tick marks along the horizontal). The items are spaced along the vertical axis in order of the mean location of the entire item on the measure, in decreasing order of disease severity.

Goodness of Fit
Goodness-of-fit statistics are presented in Table 2. The infit and outfit statistics for all 10 items were within a range indicating good fit of the model to the data. The infit statistic ranged from 0.78 to 1.34 and the outfit statistic ranged from 0.72 to 1.53 for all 10 items. The mean (mean-square) infit and outfit statistics were 1.00 and 0.98, respectively
Figure 2. Location of item responses and persons along the spectrum of achalasia severity. Items are listed along the vertical axis in
descending order of severity as measured by the item location parameter of the Rasch model. The horizontal axis has been recalibrated onto
a 0-100 interval-level scale.
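A recalibration of logits onto a 0-100 scale can be sketched as below. This assumes a simple linear rescaling between the lowest and highest attainable logit values, which is one way to preserve interval-level properties; the exact transformation used for this measure is not given here, and the function name and the endpoint values of -4 and +4 logits are hypothetical.

```python
def rescale_logit(theta, theta_min, theta_max):
    """Map a logit value linearly onto 0-100 (theta_min -> 0, theta_max -> 100)."""
    if theta_max <= theta_min:
        raise ValueError("theta_max must exceed theta_min")
    return 100.0 * (theta - theta_min) / (theta_max - theta_min)

# Hypothetical endpoints for illustration only.
LOW, HIGH = -4.0, 4.0
step_low = rescale_logit(-1.0, LOW, HIGH) - rescale_logit(-2.0, LOW, HIGH)
step_high = rescale_logit(3.0, LOW, HIGH) - rescale_logit(2.0, LOW, HIGH)
print(step_low, step_high)  # equal logit distances give equal score distances
```

Because the mapping is linear, a one-logit difference corresponds to the same number of points anywhere on the recalibrated scale.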
range, and has interval-level measurement properties. We developed this instrument for use as an outcome measure in evaluative research on the effectiveness of treatment for achalasia. Use of a common metric with good measurement properties will facilitate the design and interpretation of clinical trials of achalasia therapy, and will also provide a method to compare case series reported by different institutions.

The methods used in the development of this instrument allowed us to ensure that it had good coverage of the quality of life concepts that were important to persons with achalasia. Our methodology also allowed us to create an efficient instrument, by including items that sampled unique points on the continuum of the underlying measure of quality of life. Although the final measure included items that covered many aspects of achalasia, such as swallowing difficulty, pain, heartburn, food tolerance and diet-related behavior modifications, we found that the combination of these items represented a one-dimensional construct.

Table 2. Goodness-of-Fit Statistics for the 10 Measured Items

                                        Statistic
                      Infit                          Outfit
Measure               Mean Square   Standardized     Mean Square   Standardized
Mean                  1.00          0.2              0.98          0.2
Standard deviation    0.17          1.2              0.22          1.2
Maximum               1.34          1.9              1.53          2.6
Minimum               0.78          1.8              0.72          1.7

Standardized to a mean of 0 and standard deviation of 1.0.

Lack of an outcome measure with good measurement properties has been an important limitation of clinical trials of achalasia therapy. The only clinical trial ever undertaken comparing surgery with balloon dilatation used an ad hoc instrument with undefined measurement properties (8). In a recent trial comparing surgery with botulinum toxin injection, the principal outcome measure was a symptom score for dysphagia and regurgitation calculated by adding a severity score (0-6) to a frequency score (0-5) for each of these two symptoms (21). Treatment failure in this study was defined as a posttreatment score that was greater than the 10th percentile of scores obtained from an untreated population (22). Conversion of a continuous measure to a dichotomous outcome, as was done in this study, potentially results in a loss of important information on treatment outcome. Use of a continuous measure as an outcome measure may benefit the efficiency of a clinical trial, by reducing the sample size required to detect a treatment effect, and by allowing explicit estimation of a clinically important difference.

A potential limitation of our analysis was the fact that we developed and validated the measure on the same population. However, identifying a large population of subjects with achalasia for validation studies is difficult because of the rarity of the disease. Further validation of our measure in external populations is essential. Studies using this measure in other centers will contribute information to further assess the validity and reliability of the instrument in other settings.

Despite aggressive efforts to maximize response to our long-form questionnaire, we had a response rate of 58%.
Figure 3. Relation of raw scores to the measure. The raw scores did not have interval-level properties. While there was a linear relationship
between raw scores and interval-level recalibrated scores along the middle of the scale, a one-unit increment in the raw score substantially
underestimated the amount of real change among high and low levels of severity.
Several factors may explain this relatively low response, such as the fact that many subjects had not received treatment for several years, and the length of the questionnaire. Responders and nonresponders were not significantly different in terms of measurable characteristics, such as their age, gender, and the year of last visit. However, even if responders were significantly different from nonresponders in terms of outcome, it would not necessarily affect the validity of our instrument development process. For example, if our intention was to sample subjects in relation to treatment outcome, a large proportion of nonresponders would bias our estimate of treatment outcome substantially, since the probability of responding to a survey is likely to be influenced by treatment outcome. However, our methods relied on the empirical relationships between responses within individuals. Therefore, nonresponse would bias our final instrument only to the extent that responders and nonresponders differ in the relationships between their responses to different items on the instrument. It is not likely that the probability of response is strongly related to the interrelationships of items on the instrument.

In this study, we did not specifically intend to compare quality of life scores between patients treated with different modalities. In fact, most study subjects had more than one type of treatment. When subjects who had surgery (most of whom also had one or more dilatations) were compared with subjects who had not had surgery, the scores were slightly higher in the surgery group (54.9, SD 13.6) than the nonsurgery group (46.9, SD 18.0).

In conclusion, we developed a valid and reliable measure of achalasia disease-specific health-related quality of life with interval-level properties. It is scored on a 0-100 scale, with higher values indicating greater disease severity. Further work is required to evaluate the reliability and validity of the scale in different populations, and to determine the magnitude of a clinically meaningful change in the scale. This scale is suitable for use as an outcome measure in a clinical trial comparing the effectiveness of therapies for achalasia.

Table 3. Conversion of Raw Scores to Interval-Level Scores Recalibrated onto a 0 to 100 Scale

Raw Score   Converted Score
10          0
11          14
12          23
13          29
14          33
15          37
16          40
17          43
18          45
19          47
20          50
21          52
22          54
23          57
24          59
25          62
26          65
27          69
28          73
29          78
30          86
31          100
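The conversion in Table 3 can be applied programmatically. The sketch below copies the table values into a lookup and follows the scoring rule that a total is only valid when all 10 items are answered; the names `RAW_TO_INTERVAL` and `score_questionnaire` are hypothetical, and the code does not check the permitted range of each individual item.

```python
# Raw score -> converted 0-100 score, copied from Table 3.
RAW_TO_INTERVAL = {
    10: 0, 11: 14, 12: 23, 13: 29, 14: 33, 15: 37, 16: 40, 17: 43,
    18: 45, 19: 47, 20: 50, 21: 52, 22: 54, 23: 57, 24: 59, 25: 62,
    26: 65, 27: 69, 28: 73, 29: 78, 30: 86, 31: 100,
}

def score_questionnaire(item_scores):
    """Convert 10 item scores to the interval-level 0-100 score.

    item_scores: list of 10 integers; None marks a missing response.
    """
    if len(item_scores) != 10 or any(s is None for s in item_scores):
        raise ValueError("all 10 items must be answered")
    raw = sum(item_scores)
    if raw not in RAW_TO_INTERVAL:
        raise ValueError("raw score outside the 10-31 range")
    return RAW_TO_INTERVAL[raw]

print(score_questionnaire([1] * 10))  # lowest possible raw score

# Size of a one-unit raw-score step at different points of the scale.
for raw in (10, 19, 30):
    print(raw, RAW_TO_INTERVAL[raw + 1] - RAW_TO_INTERVAL[raw])
```

The step sizes show why raw totals are not interval-level: a one-unit change near either end of the raw scale corresponds to a much larger change in the converted score than a one-unit change near the middle.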
1674 Urbach et al.
Figure 4. Effective measurement range of the scale. Each bubble represents the recalibrated score of one person. The effective measurement
range extends from the lowest Thurstone Threshold to the highest Thurstone Threshold. Three persons (4.3%) scored better than the
lowest threshold, and six persons (8.6%) scored worse than the highest threshold. The measure cannot reliably discriminate between subjects
who score outside the effective measurement range.
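The proportions reported for the effective measurement range can be reproduced with a short calculation. In this sketch the counts (3 and 6 of 70 persons) come from the figure caption, while the individual scores, the threshold locations of 10 and 90, and the function name `coverage` are hypothetical placeholders chosen only to yield those counts.

```python
def coverage(scores, lowest_threshold, highest_threshold):
    """Return the fractions of persons below, above, and inside the range."""
    n = len(scores)
    below = sum(1 for s in scores if s < lowest_threshold)
    above = sum(1 for s in scores if s > highest_threshold)
    return below / n, above / n, (n - below - above) / n

# Hypothetical scores: 3 below, 61 inside, 6 above the assumed thresholds.
scores = [5.0] * 3 + [50.0] * 61 + [95.0] * 6
below, above, inside = coverage(scores, 10.0, 90.0)
print(round(100 * below, 1), round(100 * above, 1), round(100 * inside))
```

With 70 respondents, 3 and 6 persons outside the range leave 61 inside it, which corresponds to the 87% of subjects the measure is described as able to discriminate between.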
Table 4. Mean Scores Among Subjects Who Endorsed Responses for Items Not Included in the Final Instrument

Question and Responses: Mean Score (Number of Respondents); p Value

How difficult is it for you to eat the following foods?
                  No problem   Some difficulty   Much difficulty   p Value
Bagel             26.3 (10)    47.0 (31)         60.6 (27)         p < 0.001
Steak             29.3 (12)    48.6 (31)         61.1 (26)         p < 0.001
Soft vegetables   41.1 (39)    60.9 (29)         47.5 (1)          p < 0.001

I feel comfortable eating in a restaurant
Strongly agree   Agree      Neither agree nor disagree   Disagree    Strongly disagree   p Value
40.7 (16)        32.3 (9)   50.0 (12)                    55.6 (18)   64.6 (14)           p < 0.001

How much of a problem for you is reflux and/or regurgitation?
No problem   Mild problem   Moderate problem   Severe problem   Very severe problem   p Value
39.2 (21)    49.6 (24)      52.7 (15)          65.5 (5)         71.0 (4)              p < 0.001

How often in the past month have you made yourself vomit to deal with food caught in your esophagus?
Never       Rarely      Sometimes   Frequently   Every time I eat   p Value
41.8 (31)   50.8 (19)   58.9 (14)   69.2 (4)     89.3 (1)           p < 0.001

Scores are recalibrated onto the 0-100 interval-level scale.
p values are for the test for linear trend across groups.
This p value is for a t-test of the comparison of "no problem," with "some difficulty" and "much difficulty" combined.
ACKNOWLEDGMENTS
Dr. Urbach is a Career Scientist of the Ontario Ministry of Health and Long-Term Care, Health Research Personnel Development Program. This study was supported by a grant from the Canadian Association of General Surgeons and by the physicians of Ontario through the Physicians' Services Incorporated Foundation.

APPENDIX
A Measure of Achalasia Disease Severity
The raw score assigned to each item response is indicated in bold in square brackets (these scoring weights should not be included in a questionnaire given to patients). Total raw scores are calculated by summing the score for each item to yield a score between 10 and 31, and are only valid if there are no missing responses.

1. How much has achalasia limited the types of food you have been able to eat in the last month? (Please check one.)
- Not limited at all (I can eat and drink all the foods that I would like to.) [1]
- Somewhat limited (I can eat and drink most of the foods that I would like to.) [2]
- Moderately or severely limited (I can eat and drink very few of the foods that I would like to.) [3]

The following is a list of food types that may or may not cause you difficulty in swallowing. Please indicate which of the following types of foods you are able to swallow without experiencing any problems such as pain or food sticking as it goes down. If you are not sure if you can swallow a type of food, please make your best guess at whether you would be able to swallow it without problems. Please check one box for each type of food.

6. How often have you experienced pain when eating during the past month? (Please circle one.)
Never [1]   Rarely [2]   Sometimes [3]   Frequently/Every Time I Eat [4]

7. During the past month, how much of a problem for you was heartburn (a burning pain behind the lower part of the chest)? (Please circle one.)
No Problem [1]   Mild Problem [2]   Moderate Problem [3]   Severe Problem [4]   Very Severe Problem [5]

8. When you sit down to eat a meal, are you bothered by how long it takes you to finish eating? (Please check one.)
- No, I eat as quickly as I like. [1]
- Yes, I am bothered by how long it takes me to eat. [2]

9. Has having achalasia limited your lifestyle? (Please check one.)
- No, it is not at all limiting (My daily activities have not changed.) [1]
- Yes, it has limited my lifestyle (It has affected some areas, and I can no longer participate in all the activities I want to do.) [2]

10. How much do you agree with the following statement about how satisfied you are with your health in regard to achalasia? (Please circle the number that best describes your feelings.)
I am satisfied with my health in regard to achalasia.
botulinum toxin injection. J Gastrointest Surg 2001;5:192-205.
3. Swanstrom LL, Pennings J. Laparoscopic esophagomyotomy for achalasia. Surg Endosc 1995;9:286-92.
4. Ben-Meir A, Urbach DR, Khajanchee YS, et al. Quality of life before and after laparoscopic Heller myotomy for achalasia. Am J Surg 2001;181:471-4.
5. Patti MG, Fisichella PM, Perretta S, et al. Impact of minimally invasive surgery on the treatment of esophageal achalasia: A decade of change. J Am Coll Surg 2003;196:698-705.
6. Katz P. Achalasia: Two effective treatment options - Let the patient decide. Am J Gastroenterol 1994;89:969-70.
7. Podas T, Eaden J, Mayberry M, et al. Achalasia: A critical review of epidemiologic studies. Am J Gastroenterol 1998;93:2345-7.
8. Csendes A, Braghetto I, Henriquez A, et al. Late results of a prospective randomised study comparing forceful dilatation and oesophagomyotomy in patients with achalasia. Gut 1989;30:299-304.
9. Dakkak M, Bennett JR. A new dysphagia score with objective validation. J Clin Gastroenterol 1992;14:99-100.
10. Wallace KL, Middleton S, Cook IJ. Development and validation of a self-report symptom inventory to assess the severity of oral-pharyngeal dysphagia. Gastroenterology 2000;118:678-87.
11. Stucki G, Daltroy L, Katz JN, et al. Interpretation of change scores in ordinal clinical scales and health status measures: The whole may not equal the sum of its parts. J Clin Epidemiol 1996;49:711-7.
12. Bowling A. Measuring disease, 2nd ed. Buckingham: Open University Press; 2001.
13. Mellenbergh GJ. Generalized linear item response theory. Psychol Bull 1994;115:300-7.
14. Hays RD, Morales LS, Reise SP. Item response theory and health outcomes measurement in the 21st century. Med Care 2000;38:II-28 to II-42.
15. McHorney CA, Cohen AS. Equating health status measures with item response theory: Illustrations with functional status items. Med Care 2000;38:II-43 to II-59.
16. Cook KF, Rabeneck L, Campbell CJM, et al. Evaluation of a multidimensional measure of dyspepsia-related health for use in a randomized clinical trial. J Clin Epidemiol 1999;52:381-92.
17. Stucki G, Daltroy L, Katz JN, et al. Interpretation of change scores in ordinal clinical scales and health status measures: The whole may not equal the sum of the parts. J Clin Epidemiol 1996;49:711-7.
18. Masters GN. A Rasch model for partial credit scoring. Psychometrika 1982;47:149-7.
19. Reckase M. Unifactor latent trait models applied to multifactor tests: Results and implications. J Educ Stat 1979;4:207-30.
20. Wright BD, Linacre JM. Reasonable mean-square fit values. Rasch Measure Trans 1994;8:370.
21. Zaninotto G, Annese V, Costantini M, et al. Randomized controlled trial of botulinum toxin versus laparoscopic Heller myotomy for esophageal achalasia. J Gastrointest Surg 2004;239:364-70.
22. Zaninotto G, Costantini M, Molena D, et al. Treatment of esophageal achalasia with laparoscopic Heller myotomy and Dor partial anterior fundoplication: Prospective evaluation of 100 consecutive patients. J Gastrointest Surg 2000;4:282-9.