
This article was downloaded by: [University Library Utrecht]

On: 23 September 2013, At: 06:10


Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer
House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Clinical and Experimental Neuropsychology
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/ncen20

Parsimonious prediction of Wechsler Memory Scale, Fourth Edition scores: Immediate and delayed memory indexes
Justin B. Miller ᵃ, Bradley N. Axelrod ᵇ, Lisa J. Rapport ᵃ, Scott R. Millis ᶜ, Sarah VanDyke ᵇ, Christian Schutte ᵇ & Robin A. Hanks ᶜ

ᵃ Department of Psychology, Wayne State University, Detroit, MI, USA
ᵇ Psychology Section, John D. Dingell Department of Veterans Affairs Medical Center, Detroit, MI, USA
ᶜ Department of Physical Medicine and Rehabilitation, Wayne State University, School of Medicine, Detroit, MI, USA
Published online: 02 Mar 2012.

To cite this article: Justin B. Miller , Bradley N. Axelrod , Lisa J. Rapport , Scott R. Millis , Sarah VanDyke ,
Christian Schutte & Robin A. Hanks (2012) Parsimonious prediction of Wechsler Memory Scale, Fourth Edition scores:
Immediate and delayed memory indexes, Journal of Clinical and Experimental Neuropsychology, 34:5, 531-542, DOI:
10.1080/13803395.2012.665437
To link to this article: http://dx.doi.org/10.1080/13803395.2012.665437




JOURNAL OF CLINICAL AND EXPERIMENTAL NEUROPSYCHOLOGY
2012, 34 (5), 531-542

Parsimonious prediction of Wechsler Memory Scale, Fourth Edition scores: Immediate and delayed memory indexes

Justin B. Miller¹, Bradley N. Axelrod², Lisa J. Rapport¹, Scott R. Millis³, Sarah VanDyke², Christian Schutte², and Robin A. Hanks³

¹ Department of Psychology, Wayne State University, Detroit, MI, USA
² Psychology Section, John D. Dingell Department of Veterans Affairs Medical Center, Detroit, MI, USA
³ Department of Physical Medicine and Rehabilitation, Wayne State University, School of Medicine, Detroit, MI, USA


Research on previous versions of the Wechsler Memory Scale (WMS) found that index scores could be predicted using a parsimonious selection of subtests (e.g., Axelrod & Woodard, 2000). The release of the Fourth Edition (WMS-IV) requires a reassessment of these predictive formulas as well as the use of indices from the California Verbal Learning Test-II (CVLT-II). Complete WMS-IV and CVLT-II data were obtained from 295 individuals. Six regression models were fit using WMS-IV subtest scaled scores (Logical Memory [LM], Visual Reproduction [VR], and Verbal Paired Associates [VPA]) and CVLT-II substituted scores to predict Immediate Memory Index (IMI) and Delayed Memory Index (DMI) scores. All three predictions of IMI significantly correlated with the complete IMI (r = .92 to .97). Likewise, predicted DMI scores significantly correlated with complete DMI (r = .92 to .97). Statistical preference was indicated for the models using LM, VR, and VPA, in which 97% and 96% of the cases fell within two standard errors of measurement (SEMs) of full index scores, respectively. The present findings demonstrate that the IMI and DMI can be reliably estimated using two or three subtests from the WMS-IV, with preference for using three. In addition, evidence suggests little to no improvement in predictive accuracy with the inclusion of CVLT-II indices.

Keywords: Memory; Wechsler Memory Scale; Short form; Parsimonious prediction; Concordance correlation coefficient; Immediate memory; Delayed memory.

This research was supported by a grant from the National Institute on Disability and Rehabilitation Research, Department of Education (H133A080044), the Del Harder Foundation, and the Wayne State University Graduate School. The contents of this study do not necessarily represent the policies of the funding agencies. Justin B. Miller is now at the Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, CA, USA.
Address correspondence to Bradley N. Axelrod, John D. Dingell Department of Veterans Affairs, Psychology Section, Detroit, MI, USA (E-mail: bradley.axelrod@va.gov).

© 2012 Psychology Press, an imprint of the Taylor & Francis Group, an Informa business
http://www.psypress.com/jcen
http://dx.doi.org/10.1080/13803395.2012.665437

The release of the Fourth Edition of the Wechsler Memory Scale (WMS-IV; Wechsler, 2008b) introduces new measures (e.g., Designs, Spatial Addition, Symbol Span) along with the return of several familiar measures whose roots reach back to the inception of the scale in 1945 (e.g., Visual Reproduction, Logical Memory, Verbal Paired Associates). The WMS-IV has also introduced a new degree of flexibility by allowing substitution of scores from the Second Edition of the California Verbal Learning Test (CVLT-II; Delis, Kramer, Kaplan, & Ober, 2000). Specifically, clinicians who choose to use the CVLT-II as a measure of verbal list learning instead of Verbal Paired Associates can generate scaled scores by substituting the Trials 1-5 t score and long-delay free-recall z score from the CVLT-II for the immediate and delayed recall portions of Verbal Paired Associates, respectively. These scores were derived from a subset of 380 participants in the normative sample who were administered both the CVLT-II and the full WMS-IV; for greater detail regarding the derivation of these substituted scaled scores, the reader is referred to the technical and interpretive manual for the WMS-IV (Wechsler, 2008a).
The structure of the WMS-IV index scores has changed in the new edition. Whereas complete administration of the Third Edition (WMS-III) generated eight separate index scores, some of which have been retained in the Advanced Clinical Solutions supplement, the standard WMS-IV has simplified this profile and calculates just five indexes. Specifically, separate index scores are generated for auditory memory (AMI), visual memory (VMI), immediate memory (IMI), delayed memory (DMI), and visual working memory (VWMI). The most notable change is that a General Memory Quotient is no longer calculated. The number of available subtests has also been reduced from 17 to 11 for the WMS-IV. According to the manual, complete administration of all 11 subtests in the WMS-IV requires approximately 83 minutes at the 50th percentile of general test takers and 126 minutes at the 90th percentile in the special group sample. However, our experience suggests that these estimates may be low; a more realistic estimate is that the complete battery takes an average of approximately 120 minutes when all subtests are administered to individuals with cognitive impairment.
Research on the Revised (WMS-R) and Third (WMS-III) editions of the Wechsler Memory Scale found that index scores could be predicted using a parsimonious selection of subtests (Axelrod & Woodard, 2000; Woodard & Axelrod, 1995). Specifically, three-subtest short forms of the WMS-R resulted in 94% and 97% of the sample obtaining scores within 6 points of immediate and delayed summary scores, respectively. Similarly, three-subtest short forms for the WMS-III resulted in 92% and 96% of the sample falling within 6 points of immediate and delayed recall, respectively.
The release of the Fourth Edition (WMS-IV) requires a reassessment of these predictive formulas as well as the substitution of CVLT-II indices. The aim of the present study was to determine the extent to which a parsimonious selection of the 11 available subtests reliably predicted the composite index scores generated by full administration. Emphasis was placed on predicting the IMI and DMI from three of the four primary subtests used in calculation of the IMI and DMI and those measures that were carried over from the previous WMS version (i.e., Logical Memory, Visual Reproduction, and Verbal Paired Associates). Although it has been suggested that short forms may be inappropriate when specific measures for brief assessment exist (Kaufman & Kaufman, 2001), the reality remains that complete administration of an entire battery of memory measures is not always feasible for any number of reasons (e.g., limitations on resources, patient fatigue, third-party payers). This study sought to provide an option for those situations in which a brief assessment is warranted.
Of additional interest was the difference in predicted index scores when CVLT-II indices were substituted for Verbal Paired Associates. Given that the Logical Memory and Visual Reproduction subtests are nearly identical, and Verbal Paired Associates is highly similar, in the third and fourth editions, it was predicted that the present study would yield findings similar to those of Axelrod and Woodard (2000) and Woodard and Axelrod (1995), in that the IMI and DMI could be reliably predicted using less than the full battery of subtests included in the WMS-IV. Moreover, it was expected that predicted values from these equations would demonstrate adequate reliability to be considered useful in clinical settings. Although it is assumed that the intercorrelations among the individual subtests and the respective index scores are quite high, such a finding adds support to the assertion that the measure can effectively be shortened without compromising the integrity of the IMI or DMI.

METHOD
Participants
Archival data for the present study were obtained from the John D. Dingell Department of Veterans Affairs (VA) Medical Center in Detroit, Michigan and the Rehabilitation Institute of Michigan (RIM) in Detroit. The VA sample was drawn from the clinical archives of persons evaluated on an outpatient basis during the course of routine clinical care. The primary diagnostic concerns included dementia, mood disorders, traumatic brain injury, learning disability, attention deficit hyperactivity disorder, and other medical concerns, such as cognitive declines related to Parkinson's disease and vascular events.
The sample from RIM was collected as part of a research study investigating the ability of the WMS-IV to differentiate cognitive impairment resulting from actual moderate to severe traumatic brain injury (TBI) from feigned cognitive impairment. This sample included persons with bona fide TBI (n = 60) recruited from the pool of participants currently enrolled in the Southeastern Michigan Traumatic Brain Injury System (SEMTBIS) and healthy adults coached to feign cognitive impairment (n = 64). For a detailed description of this sample of participants and the coaching procedures, the reader is referred to Miller et al. (2011). Although cases simulating cognitive impairment do not represent actual cognitive functioning, the present study focused on the psychometric properties of the index scores rather than assessment of actual cognitive functioning. Furthermore, including individuals feigning cognitive impairment helps to ensure that an adequate range of scores is sampled and that index score calculation is independent from clinical presentation.
The sample included 295 cases: 124 from RIM and 171 from the VA. Mean age for the total sample was 43.4 years (SD = 13.3; range = 18 to 65), and mean years of education were 12.8 (SD = 2.0; range = 8 to 21). The sample was predominantly men (90.2%) with nearly equal representations of African Americans (48.5%) and Caucasians (49.8%). Fewer than 2% of the sample reported their ethnicity as Arabic, Hispanic, or Latino.

Materials
For all cases, the entire WMS-IV and CVLT-II were administered and scored according to the standardized procedures. The sample from RIM also completed several stand-alone symptom validity measures (e.g., Test of Memory Malingering; Tombaugh, 1996) and embedded measures of response bias (e.g., Reliable Digit Span; Greiffenstein, Gola, & Baker, 1995). The VA sample also completed the full WMS-IV as part of a comprehensive assessment of cognitive functioning in the context of a standard outpatient clinical evaluation.

Procedure
This study was reviewed and approved by the Human Investigation Committee at Wayne State University and the Veterans Affairs Clinical Investigation Committee. Persons recruited for the study conducted at RIM were contacted via telephone (Southeastern Michigan Traumatic Brain Injury System participants) or responded to printed advertisements and posted fliers (simulator participants); all persons in this sample who agreed to participate were evaluated at RIM or Wayne State University. Testing was completed in a single session, and participants were compensated $30. Because the archival data from the VA sample were collected as part of routine clinical care, these individuals were neither recruited nor compensated for participating. All tests were administered by a licensed psychologist, a trained neuropsychology technician, or advanced graduate students in clinical psychology.
The WMS-IV and CVLT-II were administered independently; the order of test administration varied among participants in both samples to preclude the introduction of order effect bias. All measures were scored according to standardized instructions provided in the test manuals. Primary WMS-IV index scores were calculated in two ways: with the standard calculation method using all four primary subtests, and with the CVLT-II substitution, in which the Total 1-5 t score (CVLT-T) replaced Verbal Paired Associates (VPA) Immediate and the Long-Delay Free-Recall z score (CVLT-Z) replaced VPA Delayed.

Analyses
The primary analytic strategy entailed use of linear regressions with one of two index scores as the outcome and a combination of several subtest scores as predictors. The primary index scores of interest were the Immediate Memory Index (IMI) and the Delayed Memory Index (DMI). The individual predictors included scaled scores from Visual Reproduction (VR), Logical Memory (LM), and Verbal Paired Associates (VPA) as well as scaled scores generated from substituting CVLT-II indices for VPA. The Design Memory subtest was not included in the analysis, to focus the present study on those subtests that are most frequently administered. Six separate models were evaluated. Three predicted IMI: (a) VR and LM (IMI-2); (b) VR, LM, and VPA (IMI-3W); and (c) VR, LM, and CVLT-T (IMI-3C). Three comparable models predicted DMI: (a) VR and LM (DMI-2); (b) VR, LM, and VPA (DMI-3W); and (c) VR, LM, and CVLT-Z (DMI-3C). Predicted values were calculated using the unstandardized regression coefficients from each resulting equation.
Model performance was evaluated by comparing R-squared values, and the contribution of individual predictors was assessed based on standardized regression coefficients. Predicted values from each regression equation were correlated with observed index score values using Pearson correlation coefficients to determine the association between actual and predicted values. The percentage of predicted index scores that fell within 1 standard error of measurement (SEM) of observed index scores was computed. As reported in the technical and interpretive manual for the WMS-IV, the average SEM for the adult battery (i.e., 16-69 years of age) is 3.53 points for the IMI and 3.71 points for the DMI (Wechsler, 2008a). Thus, for the present study, the SEM was rounded to 4 points for both indexes. To provide a clinically relevant metric, the percentage of scores falling within a full standard deviation (SD; 15 index points) was also computed. The Pearson correlations between predicted WMS-IV scores and obtained WMS-IV scores were compared across the different models, and the differences in correlations between models were evaluated using t tests following r-to-z transformations of the correlations.
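The discrepancy bands described above can be illustrated with a short sketch. It is not the authors' analysis code: the function name `discrepancy_bands`, the inclusive band edges, and the example scores are assumptions made purely for illustration, with the SEM defaulting to the rounded value of 4 points used in this study.

```python
from typing import Dict, Sequence


def discrepancy_bands(
    predicted: Sequence[float],
    observed: Sequence[float],
    sem: float = 4.0,
) -> Dict[str, float]:
    """Return the percentage of cases in each discrepancy band.

    A discrepancy is predicted minus observed. Treating band edges as
    inclusive (|difference| <= sem) is an assumption of this sketch.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("predicted and observed must be non-empty and equal in length")

    diffs = [p - o for p, o in zip(predicted, observed)]
    n = len(diffs)

    def pct(count: int) -> float:
        return 100.0 * count / n

    return {
        "within_1_sem": pct(sum(abs(d) <= sem for d in diffs)),
        "within_2_sem": pct(sum(abs(d) <= 2 * sem for d in diffs)),
        "over_by_more_than_2_sem": pct(sum(d > 2 * sem for d in diffs)),
        "under_by_more_than_2_sem": pct(sum(d < -2 * sem for d in diffs)),
    }


# Made-up scores for illustration only; these are not study data.
example = discrepancy_bands([100, 90, 80, 70], [98, 95, 80, 60])
print(example)
```

Reporting over- and underprediction separately, as the sketch does, makes it possible to see whether a model errs systematically in one direction.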
To evaluate the rate of agreement between the original and predicted values, Lin's Concordance Correlation Coefficient (CCC; King & Chinchilli, 2001; Lin, 1989) was also calculated. The CCC measures agreement on a continuous measure obtained by two persons or methods; it captures both precision and accuracy to determine how far the observed data deviate from the line of perfect concordance (i.e., the line at 45° on a square scatter plot). Lin's coefficient is expressed as the product of the Pearson correlation coefficient, which quantifies precision (i.e., the distance of each data point from the fitted model), and a bias correction factor (Cb) that measures accuracy (i.e., the distance of the resulting model from the optimal 45° diagonal originating at the origin; King & Chinchilli, 2001; Lin, 1989). Unlike standard Pearson statistics or paired-sample t tests, the CCC is capable of detecting systematic bias that may be present among comparisons (e.g., over- or underprediction) and is thus a preferred statistic for evaluating agreement between continuous variables. Bias correction factor values that differ from 1.00 indicate the presence of bias, with greater deviation indicating stronger bias (Lin, 1989). Values of the CCC can range from -1.0, indicating poor concordance, to 1.0, which would be observed in the presence of perfect agreement between values.
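As a minimal sketch of the decomposition just described, the function below computes the Pearson correlation, the bias correction factor, and their product (the CCC) from paired scores. It assumes the commonly cited form of Lin's coefficient based on population (1/n) moments; the function name `lin_ccc` and the example data are illustrative and are not taken from the study.

```python
import math
from typing import Sequence, Tuple


def lin_ccc(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (pearson_r, bias_correction_cb, ccc) for paired scores.

    Uses 1/n (population) variances and covariance, which is an
    assumption of this sketch.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("both variables must vary")

    # Shared denominator: both variances plus the squared mean difference.
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    pearson_r = cov_xy / math.sqrt(var_x * var_y)    # precision
    cb = 2 * math.sqrt(var_x * var_y) / denom        # accuracy (bias factor)
    ccc = 2 * cov_xy / denom                         # equals pearson_r * cb
    return pearson_r, cb, ccc


# Illustrative data: y is x shifted by a constant 2 points, so the
# correlation is perfect but agreement is not.
r, cb, ccc = lin_ccc([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(r, cb, ccc)
```

The shifted example shows why the CCC is preferred for agreement: a constant offset leaves the Pearson correlation at its maximum while lowering the bias correction factor and therefore the CCC.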

RESULTS
Descriptive statistics for predicted index score values for all six models are presented in Table 1, and model summary statistics for the three IMI and three DMI prediction equations are presented in Table 2. All regression coefficients in each of the models were significant at the p < .001 level. The frequencies of score discrepancies between predicted and actual values based on the SEM and standard deviation (SD) are presented in Table 3; these scores have been provided in order to evaluate whether any of the predictive models systematically over- or underpredict observed scores.
The IMI-3W model was statistically significant, F(3, 288) = 1,832.18, p < .001, with an R² of .95. Visual Reproduction Immediate (VR1) generated the largest standardized regression coefficient, and LM1 generated the smallest. This model also generated predicted scores that had the tightest distribution of the immediate memory models, with nearly two thirds of scores falling within 1 SEM of actual values and 96% of scores within 2 SEMs (i.e., 8 index score points). The IMI-3C model was also significant, F(3, 279) = 634.19, p < .001, with an R² of .87. Visual Reproduction Immediate again had the largest coefficient; however, the regression coefficient for CVLT-T was the smallest in this model. In relation to actual values, 44.9% of predicted values fell within 1 SEM and 79.9% of scores within 2 SEMs of actual values. The most parsimonious model, IMI-2, produced the smallest R² value of the three immediate memory models but was still statistically significant, F(2, 290) = 817.14, p < .001. This model produced the largest discrepancies between observed and predicted values, with only 40% of scores within 1 SEM and 23.3% of scores differing by more than 2 SEMs. Among individuals with an observed IMI that was within 1 SD of the mean in either direction (n = 125), the average prediction discrepancy for the three models was 0.70 points (SD = 5.4, range = -12.1 to 21.7). For individuals with an observed IMI between 1 and 2 SDs below the mean (n = 86), predicted values averaged 0.20 points lower than observed values (SD = 5.8, range = -16.49 to 17.11). Among individuals with observed IMI values more than 2 SDs below the mean (n = 73), the average discrepancy was 1.80 points lower than observed values (SD = 5.8, range = -17.54 to 10.64). Nine participants had observed IMI scores that were more than 1 SD above the mean.
Predicting the DMI yielded similar results to the three models for IMI, with the DMI-3W demonstrating the largest R². This model was significant overall, F(3, 287) = 1,727.41, p < .001. Predicted values from this model most closely approximated the distribution of observed scores, with 71.1% of scores falling within 1 SEM and fewer than 4% of scores falling more than 2 SEMs from actual values. As with the immediate memory models, replacing Verbal Paired Associates Delayed Recall (VPA2) with CVLT-Z for the DMI-3C produced a significant model, F(3, 277) = 563.26, p < .001. The R² value for this model, however, was considerably smaller than the model using VPA2 and slightly smaller than the R² value for the IMI model using the CVLT-II substitution (IMI-3C). This finding indicates that substitution of the CVLT-II Long-Delay Free Recall for VPA2 produces greater variability in subtest scores than does the standard calculation of the DMI using VPA or the substitution method for the IMI. The DMI-2 generated the smallest R² value of all six models; however, the model was still statistically significant, F(2, 289) = 769.42, p < .001. A total of 43% of scores from this model were within 1 SEM and 78% within 2 SEMs. Among individuals with an observed DMI that was within 1 SD of the mean in either direction (n = 108), the average prediction discrepancy for the three models was 1.3 points (SD = 5.1, range = -16.03 to 15.01). For individuals with an observed DMI between 1 and 2 SDs below the mean (n = 109), there was no difference on average between observed and predicted values (M = 0.0, SD = 4.9, range = -15.41 to 17.51). Among individuals with observed DMI values more than 2 SDs below the mean (n = 65), the average discrepancy was 2.4 points lower than observed DMI values (SD = 5.0, range = -16.08 to 10.53). Nine participants had observed DMI scores that were more than 1 SD above the mean.

TABLE 1
Descriptive statistics for full and short-form predicted WMS-IV Immediate and Delayed Memory Indexes

Model                               M       SD      Range of difference
Actual IMI                          83.0    18.0    NA
IMI-2 (LM1, VR1)                    82.9    16.7    -17.5 to 21.7
IMI-3C (LM1, VR1, CVLT-II1) ᵃ       83.2    16.9    -16.4 to 20.8
IMI-3W (LM1, VR1, VPA1)             82.9    17.6    -10.7 to 14.4
Actual DMI                          82.6    16.3    NA
DMI-2 (LM2, VR2)                    82.4    15.0    -16.1 to 17.5
DMI-3C (LM2, VR2, CVLT-II2) ᵇ       82.7    15.0    -15.4 to 16.3
DMI-3W (LM2, VR2, VPA2)             82.6    15.8    -10.7 to 11.8

Note. WMS-IV = Wechsler Memory Scale, Fourth Edition; IMI = Immediate Memory Index; DMI = Delayed Memory Index; LM1 = Logical Memory Immediate; LM2 = Logical Memory Delayed; VR1 = Visual Reproduction Immediate; VR2 = Visual Reproduction Delayed; CVLT-II1 = California Verbal Learning Test-II Immediate; CVLT-II2 = California Verbal Learning Test-II Delayed; VPA1 = Verbal Paired Associates Immediate; VPA2 = Verbal Paired Associates Delayed.
ᵃ IMI-3C employed a substitution of CVLT-II Trials 1-5 t score for Verbal Paired Associates (Immediate).
ᵇ DMI-3C employed a substitution of CVLT-II Long-Delay Free-Recall z score for Verbal Paired Associates (Delayed).

TABLE 2
Regression summary statistics for short-form prediction models of Immediate and Delayed Memory Indexes

                                  Predictors                 Model summary
Model                             B        SEB     β         R      R²     SEE
IMI-2                                                        .92    .85    7.00
  Logical Memory 1                2.78     0.14    0.54
  Visual Reproduction 1           2.61     0.13    0.52
  Constant                        43.76    1.05
IMI-3C                                                       .93    .87    6.49
  Logical Memory 1                2.32     0.15    0.44
  Visual Reproduction 1           2.48     0.13    0.49
  Substitution 1                  1.13     0.17    0.18
  Constant                        39.48    1.18
IMI-3W                                                       .98    .95    4.04
  Logical Memory 1                1.83     0.09    0.35
  Visual Reproduction 1           2.25     0.08    0.45
  Verbal Paired Associates 1      2.16     0.09    0.39
  Constant                        36.77    0.67
DMI-2                                                        .92    .84    6.49
  Logical Memory 2                2.76     1.22    0.57
  Visual Reproduction 2           2.55     0.12    0.55
  Constant                        46.55    0.99
DMI-3C                                                       .93    .86    6.10
  Logical Memory 2                2.39     0.13    0.49
  Visual Reproduction 2           2.35     0.13    0.50
  Substitution 2                  0.85     0.13    0.18
  Constant                        44.43    1.02
DMI-3W                                                       .97    .95    3.75
  Logical Memory 2                1.98     0.08    0.41
  Visual Reproduction 2           2.02     0.07    0.43
  Verbal Paired Associates 2      1.94     0.08    0.39
  Constant                        40.43    0.63

Note. SEB = Standard error of beta weights; SEE = Standard error of the estimate; IMI = Immediate Memory Index; IMI-3C = predicted IMI using LM, VR, and California Verbal Learning Test, 2nd edition (CVLT-II); IMI-3W = predicted IMI using LM, VR, and Verbal Paired Associates (VPA); DMI = Delayed Memory Index; DMI-2 = predicted DMI using LM and VR; DMI-3C = predicted DMI using LM, VR, and CVLT-II; DMI-3W = predicted DMI using LM, VR, and VPA. Substitution refers to use of California Verbal Learning Test-II indices in place of Verbal Paired Associates.

TABLE 3
Accuracy of predicted WMS-IV index scores in relation to the standard error of measurement and standard deviation of the indexes

                                            Predicted IMI              Predicted DMI
Range of discrepancy                        IMI-2   IMI-3C  IMI-3W     DMI-2   DMI-3C  DMI-3W
Standard error of measurement ᵃ
  (Predicted > actual) ≥ 8 points           12.6    9.9     1.0        9.6     10.3    1.4
  (Predicted > actual) < 4 points           16.6    18.7    15.8       19.5    14.9    12.7
  Within 1 SEM                              40.0    44.9    65.4       43.8    49.8    71.1
  (Predicted < actual) < 4 points           19.3    16.3    15.4       15.1    13.9    12.4
  (Predicted < actual) ≥ 8 points           10.8    10.2    2.4        12.0    11.0    2.4
Standard deviation
  (Predicted > actual) ≥ 15 points          1.4     0.7     0.0        0.7     0.0     0.0
  (Predicted > actual) within 15 points     49.7    47.1    52.7       49.8    50.7    48.8
  (Predicted < actual) within 15 points     47.6    51.4    47.3       49.1    48.9    51.2
  (Predicted < actual) ≥ 15 points          1.4     0.7     0.0        0.3     0.4     0.0

Note. Values are percentages of cases. WMS-IV = Wechsler Memory Scale, Fourth Edition; IMI = Immediate Memory Index; IMI-2 = predicted IMI using Logical Memory (LM) and Visual Reproduction (VR); IMI-3C = predicted IMI using LM, VR, and California Verbal Learning Test, 2nd edition (CVLT-II); IMI-3W = predicted IMI using LM, VR, and Verbal Paired Associates (VPA); DMI = Delayed Memory Index; DMI-2 = predicted DMI using LM and VR; DMI-3C = predicted DMI using LM, VR, and CVLT-II; DMI-3W = predicted DMI using LM, VR, and VPA.
ᵃ Standard errors of measurement (SEMs) = 3.5 and 3.7 points for IMI and DMI, respectively. Cutpoints rounded to the nearest whole point (SEM = 4). Standard deviations (SDs) for IMI and DMI = 15 points.
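To illustrate how the prediction equations summarized in Table 2 might be applied, the sketch below hard-codes the constants and unstandardized coefficients and returns a predicted index score from subtest scaled scores. It assumes that the first numeric column of Table 2 holds the unstandardized coefficients and that the "Substitution" predictor is the CVLT-II substituted scaled score; the names `COEFFICIENTS` and `predict_index` are hypothetical and not part of the original study.

```python
from typing import Dict, Optional

# Constants and unstandardized coefficients transcribed from Table 2
# (assumed to be the first numeric column). "third" is VPA for the 3W
# models and the CVLT-II substituted scaled score for the 3C models.
COEFFICIENTS: Dict[str, Dict[str, float]] = {
    "IMI-2": {"constant": 43.76, "LM": 2.78, "VR": 2.61},
    "IMI-3C": {"constant": 39.48, "LM": 2.32, "VR": 2.48, "third": 1.13},
    "IMI-3W": {"constant": 36.77, "LM": 1.83, "VR": 2.25, "third": 2.16},
    "DMI-2": {"constant": 46.55, "LM": 2.76, "VR": 2.55},
    "DMI-3C": {"constant": 44.43, "LM": 2.39, "VR": 2.35, "third": 0.85},
    "DMI-3W": {"constant": 40.43, "LM": 1.98, "VR": 2.02, "third": 1.94},
}


def predict_index(model: str, lm: float, vr: float, third: Optional[float] = None) -> float:
    """Predict an index score from scaled scores using one Table 2 equation."""
    coef = COEFFICIENTS[model]
    score = coef["constant"] + coef["LM"] * lm + coef["VR"] * vr
    if "third" in coef:
        if third is None:
            raise ValueError(f"{model} needs a third scaled score")
        score += coef["third"] * third
    elif third is not None:
        raise ValueError(f"{model} uses only LM and VR")
    return score


# Hypothetical examinee with scaled scores of 10 on every subtest.
print(round(predict_index("IMI-3W", lm=10, vr=10, third=10), 2))
print(round(predict_index("DMI-2", lm=10, vr=10), 2))
```

The function returns an unrounded value so that any rounding convention is left to the user; the study does not state one in this section.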
The rate of agreement between observed values and predicted values generated from each model was also high. Each Pearson correlation coefficient calculated between observed index score values and predicted values from each model was positive and significant (p < .001), indicating a high degree of precision in predicted values. The bias correction factors used in the calculation of the CCC each approximated 1.00, indicating near-perfect accuracy with very little evidence of systematic bias in any of the six models. Inspection of the generated scatter plots between observed and predicted values showed a slight trend of each model to overpredict at the higher end of the score distribution and underpredict at the lower end; this trend is most pronounced in the two-variable models and the models that substitute CVLT-II variables for VPA. The predictive models using LM, VR, and VPA showed near-perfect agreement. All measures of agreement are presented in Table 4; Figures 1-6 plot the data along with the line of perfect concordance (i.e., the line at 45° on a square scatter plot) and the data's reduced major axis (i.e., the summarized center of the observed data, plotted through the intersection of the means with a slope equal to the ratio of the standard deviations) to show the extent of bias present in each prediction model.
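The reduced major axis described above can be sketched in a few lines. The function name `reduced_major_axis`, the choice to place predicted scores on the x axis and observed scores on the y axis, and the example data are assumptions for illustration; the slope takes the sign of the covariance, which is positive for all models reported here.

```python
import statistics
from typing import Sequence, Tuple


def reduced_major_axis(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the reduced major axis.

    The line passes through (mean_x, mean_y) and has a slope equal to
    sd_y / sd_x, signed to match the direction of the association.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")

    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    if sd_x == 0:
        raise ValueError("x must vary")

    covariance_sign = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sd_y / sd_x
    if covariance_sign < 0:
        slope = -slope
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data only: y = 2x + 1 exactly, so the axis recovers that line.
slope, intercept = reduced_major_axis([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(slope, intercept)
```

When predicted and observed scores agree perfectly, the reduced major axis coincides with the 45° line of perfect concordance (slope 1, intercept 0); departures from that line show the direction and size of bias.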
Given that each of the six models accounted for a significant proportion of the variance in the respective index scores, and each predictive model generated values that were in high agreement with observed values, correlations of predicted to actual index scores were compared to each other using Fisher's r-to-z transformation, which translates Pearson correlation coefficients to a common metric to facilitate significance testing using a t distribution. This transformation allows for determination of the relative benefit of one prediction equation over the others. For immediate memory scores, the correlation of IMI-3W to actual IMI was significantly stronger, t(294) = 5.99, p < .001, than were the correlations of IMI to either IMI-2 or IMI-3C, which did not significantly differ from each other, t(294) = 0.83, p = .40. Similarly, actual DMI scores were most highly correlated with DMI-3W, significantly more strongly, t(294) = 6.15, p < .001, than with either DMI-2 or DMI-3C, which again did not differ from each other, t(294) = 0.83, p = .40. Both the IMI-3W and DMI-3W generated significantly stronger correlations than the other models and showed the highest rate of agreement with observed scores. All Pearson correlation coefficients between observed and predicted scores are presented in Table 4.

TABLE 4
Measures of agreement between observed and predicted index scores

          Precision (Pearson          Accuracy         Concordance correlation    95% confidence
Model     correlation coefficient)    (bias factor)    coefficient (CCC)          interval
IMI-2     .92                         1.00             .92                        [.90, .94]
IMI-3C    .93                         1.00             .93                        [.92, .95]
IMI-3W    .98                         1.00             .97                        [.97, .98]
DMI-2     .92                         1.00             .91                        [.89, .93]
DMI-3C    .93                         1.00             .92                        [.91, .94]
DMI-3W    .97                         1.00             .97                        [.97, .98]

Note. IMI = Immediate Memory Index; IMI-2 = predicted IMI using Logical Memory (LM) and Visual Reproduction (VR); IMI-3C = predicted IMI using LM, VR, and California Verbal Learning Test, 2nd edition (CVLT-II); IMI-3W = predicted IMI using LM, VR, and Verbal Paired Associates (VPA); DMI = Delayed Memory Index; DMI-2 = predicted DMI using LM and VR; DMI-3C = predicted DMI using LM, VR, and CVLT-II; DMI-3W = predicted DMI using LM, VR, and VPA.

Figure 1. Scatter plot of observed Immediate Memory Index (IMI) values and predicted IMI values using Logical Memory and Visual Reproduction subtests (IMI-2). [Axes: IMI-2 and IMI; lines shown: reduced major axis and line of perfect concordance.]
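For readers unfamiliar with the r-to-z transformation mentioned above, the sketch below shows the transformation itself and one standard use of it, an approximate confidence interval for a single correlation. It does not reproduce the t tests reported here, whose exact procedure for comparing correlations from the same sample is not specified in this section; the function names and the 1.96 critical value are assumptions for illustration.

```python
import math
from typing import Tuple


def fisher_z(r: float) -> float:
    """Fisher's r-to-z transformation: z = atanh(r)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def r_confidence_interval(r: float, n: int, z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate confidence interval for r via the z metric.

    Uses the usual large-sample standard error 1 / sqrt(n - 3) and
    back-transforms the limits with tanh.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Example using a correlation of .92 with the study's sample size of 295.
low, high = r_confidence_interval(0.92, 295)
print(fisher_z(0.92), low, high)
```

Working in the z metric is useful because the sampling distribution of z is approximately normal with a variance that depends only on sample size, which is not true of r itself when correlations are as high as those reported here.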
To evaluate the influence of test-taking strategy on the predictive abilities of the generated models,
predicted values were compared to original values for the subsample of participants simulating cognitive
impairment. As the purpose of the present paper was to determine the ability of a subset of test scores
to estimate the full index scores, it was expected that test-taking strategy would have no influence on
predictive accuracy. This expectation was confirmed by paired-sample t tests, which were nonsignificant
for IMI comparisons (all p values > .38). However, DMI predicted values for this subsample were all
significantly higher than the actual DMI values (all p values ≤ .02) by an average of 1.5 points
(DMI3C, M = 79.5, SD = 14.4) and 1.7 points (DMI3W, M = 79.7, SD = 14.9; DMI2, M = 79.7, SD = 14.3).

Figure 2. Scatter plot of observed Immediate Memory Index (IMI) values and predicted IMI values using Logical Memory, Visual
Reproduction, and California Verbal Learning Test, 2nd Edition subtest scaled scores (IMI3C). Axes: IMI and IMI3C; reference lines: reduced major axis and line of perfect concordance.

Figure 3. Scatter plot of observed Immediate Memory Index (IMI) values and predicted IMI values using Logical Memory, Visual
Reproduction, and Verbal Paired Associates subtest scaled scores (IMI3W). Axes: IMI and IMI3W; reference lines: reduced major axis and line of perfect concordance.

Figure 4. Scatter plot of observed Delayed Memory Index (DMI) values and predicted DMI values using Logical Memory and Visual
Reproduction subtests (DMI2). Axes: DMI and DMI2; reference lines: reduced major axis and line of perfect concordance.

Figure 5. Scatter plot of observed Delayed Memory Index (DMI) values and predicted DMI values using Logical Memory, Visual
Reproduction, and California Verbal Learning Test, 2nd Edition subtest scaled scores (DMI3C). Axes: DMI and DMI3C; reference lines: reduced major axis and line of perfect concordance.

Figure 6. Scatter plot of observed Delayed Memory Index (DMI) values and predicted DMI values using Logical Memory, Visual
Reproduction, and Verbal Paired Associates subtest scaled scores (DMI3W). Axes: DMI and DMI3W; reference lines: reduced major axis and line of perfect concordance.

Although statistically significant, the differences of less than 2 points are not clinically meaningful,
as demonstrated by negligible effect sizes for each of the three comparisons (Cohen's ds < .12).
Looking at the distribution of predicted scores among simulators based on the SEM reaffirms this
observation, as evidenced by the percentage of predicted DMI values that were greater than observed
scores by 2 or more SEMs (14.1%, 10.9%, and 1.6% for the DMI2, DMI3C, and DMI3W, respectively).
The proportion of predicted DMI values that were underestimated by the predictive
formulas by more than 2 SEMs was considerably smaller (DMI2 = 4.7%; DMI3C = 6.3%;
DMI3W = 1.6%). Among nonsimulators, a similar proportion of overprediction by 2 or more SEMs
was observed (DMI2 = 8.3%; DMI3C = 10.1%; DMI3W = 1.3%).
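As a rough illustration of the two summaries used above, the sketch below computes the share of cases over- or underpredicted by at least 2 SEMs, and a standardized mean difference. The function names, the example scores, and the SEM value are hypothetical. The effect-size formula (mean difference divided by the root mean square of the two standard deviations) is one common convention and is assumed here, not taken from the article.

```python
from statistics import mean, stdev


def sem_discrepancy_rates(observed, predicted, sem):
    """Return (over, under): share of cases predicted too high / too low by >= 2 SEM."""
    diffs = [p - o for o, p in zip(observed, predicted)]
    over = sum(d >= 2 * sem for d in diffs) / len(diffs)
    under = sum(d <= -2 * sem for d in diffs) / len(diffs)
    return over, under


def cohens_d(predicted, observed):
    """Standardized mean difference using the root mean square of the two SDs."""
    pooled_sd = ((stdev(predicted) ** 2 + stdev(observed) ** 2) / 2) ** 0.5
    return (mean(predicted) - mean(observed)) / pooled_sd


# Hypothetical scores and SEM, for illustration only.
observed = [80, 90, 100, 70]
predicted = [89, 90, 91, 71]
print(sem_discrepancy_rates(observed, predicted, sem=4.0))
print(round(cohens_d(predicted, observed), 3))
```

In practice the SEM would come from the test's technical manual for the index in question.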

DISCUSSION
The findings provide strong support for estimating WMS-IV index scores using parsimonious
subsets of the most frequently administered subtests. The three primary subtests of the WMS-IV
that are most familiar and most frequently used (Logical Memory, Visual Reproduction, and
Verbal Paired Associates) serve remarkably well to estimate both the Immediate Memory Index
(IMI) and Delayed Memory Index (DMI) of the WMS-IV. Furthermore, the present findings
demonstrate that the IMI and DMI can be reliably estimated with a level of predictive accuracy that
is considered acceptable for use in clinical practice. Estimations that used the CVLT-II substitution
generally performed less well than did estimations using the WMS-IV subtests. Although the most
accurate model uses three of the four original WMS subtests used to calculate the IMI and DMI,
the present findings also show that using just two of the subtests will still yield index scores that closely
approximate the actual values that would have been obtained if all four subtests were administered.
Such findings likely have particular relevance for clinicians facing increasing pressures and
restrictions on testing practices, or for those working with patient populations who may fatigue
easily and are unable to endure an extensive cognitive battery.
A comparison of the different short forms shows that the two-subtest versions and the versions that
added CVLT-II substituted scores are essentially interchangeable. Specifically, using Logical
Memory and Visual Reproduction alone to predict the Immediate Memory and Delayed Memory
index scores yields the same predictive accuracy as the models that added indices from the
CVLT-II. Therefore, the addition of CVLT-II indices contributes no additional predictive accuracy
over using just Logical Memory and Visual Reproduction.
In contrast, the addition of Verbal Paired Associates to Logical Memory and
Visual Reproduction significantly improved estimations of IMI and DMI. This finding is
anticipated because using Logical Memory, Visual Reproduction, and Verbal Paired Associates
together approximates most closely the original sample from which the Immediate and Delayed
Memory indexes were constructed. The predictive accuracy of the three-subtest version also raises
questions regarding the utility of the Designs subtest included in the full WMS-IV. The present
findings suggest that Designs plays a minimal role in the actual immediate or delayed memory
index scores, given that short forms using Logical Memory, Visual Reproduction, and Verbal Paired
Associates together account for approximately 95% of variability in these indices. What remains
unclear, however, is the extent to which the Designs subtest contributes to the auditory and visual
memory indexes. Although the Designs subtest may contribute little to the immediate and delayed
memory indexes, it may be an integral component of these other indexes. For clinicians who are only
interested in the IMI and DMI, however, omission of the Designs subtest is particularly appealing in
that it removes the subtest with the greatest time demand and the most materials, thus simplifying
administration.
Although the predicted DMI values were statistically higher than the actual values in the sample of
simulators, it is important to note that the magnitude of the actual differences (i.e., less than
2 points) is not clinically meaningful. Additionally, these values still showed
a strong positive relationship with actual values, which all fall at the extreme end of the
distribution. Consistent with the existing literature, the predictive accuracy of these models
deteriorates the farther scores fall from the center of the distribution (Axelrod & Woodard, 2000;
Spinks et al., 2009). For individuals who obtained an actual immediate memory index within two
standard deviations of the mean in either direction, predicted values were within one index point on
average. Individuals with an actual immediate memory index more than two standard deviations below
the mean, however, generated predicted values that differed by nearly two index score points on
average. A similar pattern was observed among delayed memory index scores: individuals who obtained
an actual delayed memory index within two standard deviations of the mean generated predicted values
that were generally within one index point, whereas individuals with an actual delayed memory index
more than two standard deviations below the mean generated predicted values that differed by more
than two index points on average. Although this difference is very small and not clinically
meaningful, it is still of theoretical interest. Thus, clinicians faced with a test-taker
demonstrating severely impaired levels of memory functioning, as evidenced by scores that are more
than two full standard deviations below the mean on individual subtests, should use these predictive
equations with this consideration in mind. Similar cautionary warnings likely apply to the opposite
end of the distribution as well; however, the proportion of individuals scoring more than one
standard deviation above the mean in the present sample was too
small to investigate this possibility adequately.
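The band-wise comparison described above can be reproduced with a few lines of code. The following is a minimal sketch, assuming index scores on the usual Wechsler metric (mean of 100, standard deviation of 15). The function name and the example scores are hypothetical and are not data from the study.

```python
def mean_abs_error_by_band(observed, predicted, center=100.0, sd=15.0):
    """Mean absolute prediction error, grouped by where the observed score falls."""
    bands = {"below_2sd": [], "within_2sd": [], "above_2sd": []}
    for obs, pred in zip(observed, predicted):
        z = (obs - center) / sd  # distance of the observed score from the mean, in SDs
        if z < -2:
            key = "below_2sd"
        elif z > 2:
            key = "above_2sd"
        else:
            key = "within_2sd"
        bands[key].append(abs(pred - obs))
    # Bands with no cases are reported as None instead of a misleading zero.
    return {k: (sum(v) / len(v) if v else None) for k, v in bands.items()}


# Hypothetical observed and predicted index scores.
observed = [60, 65, 100, 110]
predicted = [63, 64, 101, 110]
print(mean_abs_error_by_band(observed, predicted))
```

Returning None for an empty band mirrors the caveat in the text that too few high scorers were available to evaluate the upper tail.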
From a practical standpoint, clinicians who wish to adopt the present approach and obtain
Immediate and Delayed index scores without having to administer the entire WMS-IV battery can
do so by using the coefficients presented in Table 2. Specifically, once a model has
been selected, the unstandardized regression coefficients (B) are applied by multiplying
each subtest age-adjusted scaled score by its respective coefficient in Table 2, summing the
products, and adding the constant. The resulting value equals the predicted index score, which
approximates what would have been obtained had all four subtests been administered. The predicted
values that were generated to approximate index scores were obtained from administration of the
entire WMS-IV battery; also, measures not included in the original standardization process were used
to satisfy delay intervals. These limitations may have influenced performance on DMI subtests;
however, they are unlikely to have had significant influence, because during the standardization
process itself, the tests go through myriad changes (e.g., adding items, dropping items) before the
final product is published. Another important point to reiterate is that all predictive models
used age-adjusted scaled scores obtained from the test manual. Use of raw subtest scores in these
equations will result in drastically different values that will not approximate an index score.
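The arithmetic described above (multiply each scaled score by its coefficient, sum, add the constant) can be sketched as follows. The function name is hypothetical, and the coefficients and constant in the example are placeholders chosen for easy arithmetic; they are NOT the values in Table 2, which must be substituted before any real use. The range check assumes the standard 1 to 19 range of Wechsler scaled scores, as a guard against entering raw scores by mistake.

```python
def predict_index(scaled_scores, coefficients, constant):
    """Predicted index = constant + sum of (B * age-adjusted scaled score)."""
    if set(scaled_scores) != set(coefficients):
        raise ValueError("scores and coefficients must cover the same subtests")
    for name, score in scaled_scores.items():
        # Scaled scores run from 1 to 19; larger values suggest a raw score was entered.
        if not 1 <= score <= 19:
            raise ValueError(f"{name}: {score} is not a valid scaled score")
    return constant + sum(coefficients[k] * scaled_scores[k] for k in coefficients)


# PLACEHOLDER coefficients and constant, not the published Table 2 values.
placeholder_b = {"LM": 2.0, "VR": 2.0, "VPA": 2.0}
placeholder_constant = 40.0
scores = {"LM": 10, "VR": 10, "VPA": 10}
print(predict_index(scores, placeholder_b, placeholder_constant))
```

Keying scores and coefficients by subtest name, and not by position, reduces the chance of pairing a coefficient with the wrong subtest when switching between the two- and three-subtest models.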
The results of the present study are very consistent with previous short forms of the
Wechsler Memory Scale (e.g., Axelrod & Woodard, 2000; Woodard & Axelrod, 1995). As with the
WMS-III, there is continued evidence that the immediate memory index score can be reliably
predicted from two or three of the available subtests. Although there is no longer a General Memory
index such as that included in the WMS-III and studied by Axelrod and Woodard (2000), a similar
composite included in the present study is the delayed memory index, which also was effectively
predicted by fewer than the full battery of subtests. One striking similarity between the present
paper and earlier short forms is that the squared multiple correlations for each model are nearly
identical. Although the specific predictors are different, the three-variable models fit by Axelrod
and Woodard accounted for approximately 95% of the variance in index scores, whereas the two-variable
models accounted for only 87%. This pattern between the two- and three-variable models was virtually
identical in the present study for those models that only included Wechsler Memory Scale subtests
(i.e., those not including CVLT-II substitution). Moreover, the same shortcomings were also
identified in that the predictive formulas lose stability at the extreme ends of the normal
distribution, which is likely a factor of the predictive method as
opposed to a function of the test itself.
Comparing the contributions of the individual subtests between the WMS-III and the WMS-IV
reveals some specific differences that are important to note. For example, in each model studied
by Axelrod and Woodard (2000) for the WMS-III, Logical Memory had the largest contribution of
the included predictors. In the present study, however, this was only the case when Logical Memory
was paired with one other subtest (e.g., Visual Reproduction). In each of the three-variable
models for the WMS-IV, Visual Reproduction made the largest contribution to predicted index scores,
and it is unclear how this subtest contributed to WMS-III short forms as Visual Reproduction was
not included in the earlier studies.
A unique contribution of the present study was the inclusion of individuals coached to feign
cognitive impairment. Theoretically, the generation of index scores (complete or estimated from short
forms) should be independent of test-taking strategy, as they are a linear composite of observed
scores and therefore should not be affected by skill or psychological characteristics such as
motivation; the calculations should apply regardless of test-taking strategy. This notion was
supported in part by the fact that the pattern of findings among the coached simulators paralleled
that observed among the bona fide patients. At the same time, although the absolute differences were
relatively small and not clinically meaningful (i.e., less than 2 points), it is of some interest
that the short forms tended to slightly overpredict scores for simulators.
It is also important to note that the subsample of participants simulating cognitive impairment
performed quite poorly, which reinforces the notion that scores from these predictive equations
become less reliable at the extreme ends of the score distributions. Furthermore, a nearly
identical proportion of scores were overpredicted in the remainder of the sample for each of the
predictive methods. In that regard, the short forms seem to work in the direction of being robust to
feigned cognitive impairment rather than vulnerable to it and can thus be applied in the context of
suspected negative response bias.
The mixed nature of the clinical sample, which
included a variety of neuropsychological and
psychological conditions across a broad range of
age and education, represents both a strength and
a weakness of the present design. Also, although
there is no reason to expect that gender would
affect prediction accuracy, the present sample
overrepresented men. Future research should verify
that the patterns observed in this study hold true
across specific subgroups and conditions. The
exclusion of the Designs subtest also represents
a weakness of the present study, as its inclusion
may have allowed for even greater precision in the
resulting predicted values.
Original manuscript received 7 October 2011
Revised manuscript accepted 6 February 2012
First published online 2 March 2012

REFERENCES
Axelrod, B. N., & Woodard, J. L. (2000). Parsimonious
prediction of Wechsler Memory Scale-III memory
indices. Psychological Assessment, 12, 431-435.
Delis, D., Kramer, J., Kaplan, E., & Ober, B. (2000).
The California Verbal Learning Test (2nd ed.). San
Antonio, TX: Pearson.
Greiffenstein, M. F., Gola, T., & Baker, W. J. (1995).
MMPI-2 validity scales versus domain-specific measures in detection of factitious traumatic brain injury.
Clinical Neuropsychologist, 9, 230-240.
Kaufman, J. C., & Kaufman, A. S. (2001). Time for
the changing of the guard: A farewell to short forms
of intelligence tests. Journal of Psychoeducational
Assessment, 19, 245-267.
King, T. S., & Chinchilli, V. M. (2001). A generalized concordance correlation coefficient for continuous and
categorical data. Statistics in Medicine, 20, 2131-2147.
Lin, L. I. (1989). A concordance correlation coefficient
to evaluate reproducibility. Biometrics, 45, 255-268.
Miller, J. B., Millis, S. R., Rapport, L. J., Bashem, J. R.,
Hanks, R. A., & Axelrod, B. N. (2011). Detection of
insufficient effort using the advanced clinical solutions
for the Wechsler Memory Scale, Fourth Edition. The
Clinical Neuropsychologist, 25, 160-172.
Spinks, R., McKirgan, L. W., Arndt, S., Caspers,
K., Yucuis, R., & Pfalzgraf, C. J. (2009). IQ
estimate smackdown: Comparing IQ proxy measures to the WAIS-III. Journal of the International
Neuropsychological Society, 15, 590-596.
Tombaugh, T. (1996). Test of Memory Malingering.
North Tonawanda, NY: Multi-Health Systems.
Wechsler, D. (2008a). Technical and interpretive manual
for Wechsler Memory Scale (4th ed.). San Antonio,
TX: Pearson.
Wechsler, D. (2008b). Wechsler Memory Scale (4th ed.).
San Antonio, TX: Pearson.
Woodard, J. L., & Axelrod, B. N. (1995). Parsimonious
prediction of Wechsler Memory Scale-Revised memory indices. Psychological Assessment, 7, 445-449.
