Parsimonious Prediction of Wechsler Memory Scale


Publisher: Routledge
Journal of Clinical and Experimental Neuropsychology
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/ncen20
Parsimonious prediction of Wechsler Memory Scale, Fourth Edition scores: Immediate and delayed memory indexes
Justin B. Miller (a), Bradley N. Axelrod (b), Lisa J. Rapport (a), Scott R. Millis (c), Sarah VanDyke (b),
Christian Schutte (b) & Robin A. Hanks (c)
(a) Department of Psychology, Wayne State University, Detroit, MI, USA
(b) Psychology Section, John D. Dingell Department of Veterans Affairs Medical Center, Detroit, MI, USA
(c) Department of Physical Medicine and Rehabilitation, Wayne State University, School of Medicine, Detroit, MI, USA
Published online: 02 Mar 2012.
To cite this article: Justin B. Miller , Bradley N. Axelrod , Lisa J. Rapport , Scott R. Millis , Sarah VanDyke ,
Christian Schutte & Robin A. Hanks (2012) Parsimonious prediction of Wechsler Memory Scale, Fourth Edition scores:
Immediate and delayed memory indexes, Journal of Clinical and Experimental Neuropsychology, 34:5, 531-542, DOI:
10.1080/13803395.2012.665437
To link to this article: http://dx.doi.org/10.1080/13803395.2012.665437
JOURNAL OF CLINICAL AND EXPERIMENTAL NEUROPSYCHOLOGY
2012, 34 (5), 531-542
Research on previous versions of the Wechsler Memory Scale (WMS) found that index scores could be predicted
using a parsimonious selection of subtests (e.g., Axelrod & Woodard, 2000). The release of the Fourth Edition
(WMS-IV) requires a reassessment of these predictive formulas as well as the use of indices from the California
Verbal Learning Test-II (CVLT-II). Complete WMS-IV and CVLT-II data were obtained from 295 individuals.
Six regression models were fit using WMS-IV subtest scaled scores (Logical Memory [LM], Visual Reproduction
[VR], and Verbal Paired Associates [VPA]) and CVLT-II substituted scores to predict Immediate Memory
Index (IMI) and Delayed Memory Index (DMI) scores. All three predictions of IMI significantly correlated with
the complete IMI (r = .92 to .97). Likewise, predicted DMI scores significantly correlated with complete DMI
(r = .92 to .97). Statistical preference was indicated for the models using LM, VR, and VPA, in which 97% and
96% of the cases fell within two standard errors of measurement (SEMs) of full index scores, respectively. The
present findings demonstrate that the IMI and DMI can be reliably estimated using two or three subtests from
the WMS-IV, with preference for using three. In addition, evidence suggests little to no improvement in predictive
accuracy with the inclusion of CVLT-II indices.
Keywords: Memory; Wechsler Memory Scale; Short form; Parsimonious prediction; Concordance correlation
coefficient; Immediate memory; Delayed memory.
This research was supported by a grant from the National Institute on Disability and Rehabilitation Research, Department of
Education (H133A080044), the Del Harder Foundation, and the Wayne State University Graduate School. The contents of this study
do not necessarily represent the policies of the funding agencies. Justin B. Miller is now at the Semel Institute for Neuroscience and
Human Behavior, University of California, Los Angeles, CA, USA.
Address correspondence to Bradley N. Axelrod, John D. Dingell Department of Veterans Affairs, Psychology Section, Detroit, MI,
USA (E-mail: bradley.axelrod@va.gov).
© 2012 Psychology Press, an imprint of the Taylor & Francis Group, an Informa business
http://www.psypress.com/jcen
http://dx.doi.org/10.1080/13803395.2012.665437

The release of the Fourth Edition of the Wechsler Memory Scale (WMS-IV; Wechsler, 2008b)
introduces new measures (e.g., Designs, Spatial Addition, Symbol Span) along with the return
of several familiar measures whose roots reach back to the inception of the scale in
1945 (e.g., Visual Reproduction, Logical Memory, Verbal Paired Associates). The WMS-IV has also
introduced a new degree of flexibility by allowing for substitution of scores from the Second Edition
of the California Verbal Learning Test (CVLT-II; Delis, Kramer, Kaplan, & Ober, 2000). Specifically,
clinicians who choose to use the CVLT-II as a measure of verbal list learning instead of Verbal
Paired Associates can generate scaled scores by substituting the Trials 1-5 t score and long-delay
free-recall z score from the CVLT-II for the immediate and delayed memory recall portions of Verbal
Paired Associates, respectively. These scores were derived from a subset of 380 participants in the
normative sample who were administered both the CVLT-II and the full WMS-IV; for greater detail
regarding the derivation of these substituted scaled scores, the reader is referred to the technical and
interpretive manual for the WMS-IV (Wechsler, 2008a).
The structure of the WMS-IV index scores has changed in the new edition. Whereas complete
administration of the Third Edition (WMS-III) generated eight separate index scores, some of
which have been retained in the Advanced Clinical Solutions supplement, the standard WMS-IV has
simplified this profile and calculates just five indexes. Specifically, separate index scores are
generated for auditory memory (AMI), visual memory (VMI), immediate memory (IMI), delayed
memory (DMI), and visual working memory (VWMI). The most notable change is that a General
Memory Quotient is no longer calculated. The number of available subtests has also been reduced
from 17 to 11 for the WMS-IV. According to the manual, complete administration of all 11 subtests
in the WMS-IV requires approximately 83 minutes at the 50th percentile of general test takers and
126 minutes at the 90th percentile in the special group sample. However, our experience suggests
that these estimates may be low: when all subtests are administered to individuals with cognitive
impairment, the complete battery takes approximately 120 minutes on average.
Research on the Revised (WMS-R) and Third (WMS-III) editions of the Wechsler Memory
Scale found that index scores could be predicted using a parsimonious selection of subtests
(Axelrod & Woodard, 2000; Woodard & Axelrod, 1995). Specifically, three-subtest short
forms of the WMS-R resulted in 94% and 97% of the sample obtaining scores within 6 points of
immediate and delayed summary scores, respectively. Similarly, three-subtest short forms for the
WMS-III resulted in 92% and 96% of the sample falling within 6 points of immediate and delayed
recall, respectively.
The release of the Fourth Edition (WMS-IV) requires a reassessment of these predictive formulas
as well as the substitution of CVLT-II indices. The aim of the present study was to determine the extent
to which a parsimonious selection of the 11 available subtests reliably predicted the composite index
scores generated by full administration. Emphasis was placed on predicting the IMI and DMI from
three of the four primary subtests used in calculation of the IMI and DMI and those measures that
were carried over from the previous WMS version (i.e., Logical Memory, Visual Reproduction, and
Verbal Paired Associates). Although it has been suggested that short forms may be inappropriate
when specific measures for brief assessment exist (Kaufman & Kaufman, 2001), the reality remains
that complete administration of an entire battery of memory measures is not always feasible for any
number of reasons (e.g., limitations on resources, patient fatigue, third-party payers). This study
sought to provide an option for those situations in which a brief assessment is warranted.
Of additional interest was the difference in predicted index scores when CVLT-II indices were
substituted for Verbal Paired Associates. Given that the Logical Memory and Visual Reproduction
subtests are nearly identical, and Verbal Paired Associates is highly similar in the third and fourth
editions, it was predicted that the present study would yield findings similar to those of Axelrod
and Woodard (2000) and Woodard and Axelrod (1995), in that the IMI and DMI could be reliably
predicted using fewer than the full set of subtests included in the WMS-IV. Moreover, it was
expected that predicted values from these equations would demonstrate adequate reliability to be
considered useful in clinical settings. Although it is assumed that the intercorrelations among the
individual subtests and the respective index scores are quite high, such a finding adds
support to the assertion that the measure can effectively be shortened without compromising the
integrity of the IMI or DMI.
METHOD
Participants
Archival data for the present study were obtained
from the John D. Dingell Department of Veterans
Affairs (VA) Medical Center in Detroit, Michigan
and the Rehabilitation Institute of Michigan (RIM)
in Detroit. The VA sample was drawn from the
clinical archives of persons evaluated on an outpatient basis during the course of routine clinical
care. The primary diagnostic concerns included
dementia, mood disorders, traumatic brain injury,
learning disability, attention deficit hyperactivity
disorder, and other medical concerns, such as cognitive declines related to Parkinson's disease and
vascular events.
The sample from RIM was collected as part of a research study investigating the ability of
the WMS-IV to differentiate cognitive impairment resulting from actual moderate to severe
traumatic brain injury (TBI) from feigned cognitive impairment. This sample included persons with
bona fide TBI (n = 60) recruited from the pool of participants currently enrolled in the
Southeastern Michigan Traumatic Brain Injury System (SEMTBIS) and healthy adults coached
to feign cognitive impairment (n = 64). For a detailed description of this sample of participants,
and the coaching procedures, the reader is referred to Miller et al. (2011). Although cases simulating
cognitive impairment do not represent actual cognitive functioning, the present study focused on the
psychometric properties of the index scores rather than assessment of actual cognitive functioning.
Furthermore, including individuals feigning cognitive impairment helps to ensure that an adequate
range of scores is sampled and that index score calculation is independent from clinical presentation.
The sample included 295 cases; 124 from RIM
and 171 from the VA. Mean age for the total sample was 43.4 years (SD = 13.3; range = 18 to 65),
and mean years of education were 12.8 (SD = 2.0;
range = 8 to 21). The sample was predominantly men (90.2%) with nearly equal representations of African Americans (48.5%) and Caucasians
(49.8%). Fewer than 2% of the sample reported
their ethnicity as Arabic, Hispanic, or Latino.
Materials
For all cases, the entire WMS-IV and CVLT-II were administered and scored according to
the standardized procedures. The sample from RIM also completed several stand-alone symptom
validity measures (e.g., Test of Memory Malingering; Tombaugh, 1996) and embedded
measures of response bias (e.g., Reliable Digit Span; Greiffenstein, Gola, & Baker, 1995). The VA
sample also completed the full WMS-IV as part of a comprehensive assessment of cognitive
functioning in the context of a standard outpatient clinical evaluation.
Procedure
This study was reviewed and approved by the
Human Investigation Committee at Wayne State
University and the Veterans Affairs Clinical
Investigation Committee. Persons recruited for the
study conducted at RIM were contacted via telephone (Southeastern Michigan Traumatic Brain
Injury System participants) or responded to printed
advertisements and posted fliers (simulator participants); all persons in this sample who agreed
to participate were evaluated at RIM or Wayne
State University. Testing was completed in a single session, and participants were compensated $30.
Because the archival data from the VA sample
were collected as part of routine clinical care, these
individuals were neither recruited nor compensated
for participating. All tests were administered by
a licensed psychologist, trained neuropsychology
technician, or advanced graduate students in clinical psychology.
The WMS-IV and CVLT-II were administered independently; the order of test administration
varied among participants in both samples to preclude the introduction of order effect bias. All
measures were scored according to standardized instructions provided in the test manuals. Primary
WMS-IV index scores were calculated in two ways: with the standard method using all four
primary subtests, and with the CVLT-II Trials 1-5 total t score (CVLT-T) substituted for
Verbal Paired Associates (VPA) Immediate and the Long-Delay Free-Recall z score (CVLT-Z)
substituted for VPA Delayed.
Analyses
The primary analytic strategy entailed use of linear regressions with one of two index scores as the
outcome and a combination of several subtest scores as predictors. The primary index scores of interest
included the Immediate Memory Index (IMI) and the Delayed Memory Index (DMI). The individual
predictors included scaled scores from the Visual Reproduction (VR), Logical Memory (LM),
and Verbal Paired Associates (VPA) as well as scaled scores generated from substituting CVLT-II
indices for VPA. The Design Memory subtest was not included as part of the analysis to focus
the present study on those subtests that are most frequently administered. Six separate models were
evaluated. Three predicted IMI: (a) VR and LM (IMI-2); (b) VR, LM, and VPA (IMI-3W); and (c)
VR, LM, and CVLT-T (IMI-3C). Three comparable models predicted DMI: (a) VR and LM
(DMI-2); (b) VR, LM, and VPA (DMI-3W); and (c) VR, LM, and CVLT-Z (DMI-3C). Predicted
values were calculated using the unstandardized regression coefficients from each resulting equation.
Model performance was evaluated by comparing R-squared values, and the contribution of
individual predictors was assessed based on standardized regression coefficients. Predicted values
from each regression equation were correlated with observed index score values using Pearson
correlation coefficients to determine the association between actual and predicted values. The
percentage of predicted index scores that fell within 1 standard error of measurement (SEM) of observed
index scores was computed. As reported in the technical and interpretive manual for the WMS-IV, the
average SEM for the adult battery (i.e., 16-69 years of age) is 3.53 points for the IMI and 3.71 points for
the DMI (Wechsler, 2008a). Thus, for the present study, the SEM was rounded to 4 points for both
indexes. To provide a clinically relevant metric, the percentage of scores falling within a full standard
deviation (SD; 15 index points) was also computed. The Pearson correlations between predicted
WMS-IV scores and obtained WMS-IV scores were compared across the different models, and
the differences in correlations between models were evaluated using t tests following r-to-z
transformations of the correlations.
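The SEM-based and SD-based accuracy percentages described above can be illustrated with a minimal Python sketch. The function name `percent_within`, the example scores, and the choice to count a discrepancy exactly equal to the cutoff as "within" are assumptions made for illustration; the article does not state how boundary cases were handled.

```python
def percent_within(predicted, observed, tolerance):
    """Return the percentage of cases with |predicted - observed| <= tolerance.

    Treating the boundary as inclusive is an assumption of this sketch.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tolerance)
    return 100.0 * hits / len(predicted)


SEM_POINTS = 4   # SEM rounded to 4 index points, as in the article
SD_POINTS = 15   # one standard deviation of the index score metric

# Hypothetical observed and predicted index scores (not study data).
observed = [85, 92, 70, 101, 64]
predicted = [88, 90, 79, 100, 61]

within_1_sem = percent_within(predicted, observed, SEM_POINTS)
within_2_sem = percent_within(predicted, observed, 2 * SEM_POINTS)
within_1_sd = percent_within(predicted, observed, SD_POINTS)
print(within_1_sem, within_2_sem, within_1_sd)
```

The same function serves all three cutoffs, so the 1 SEM, 2 SEM, and 1 SD summaries differ only in the tolerance passed in.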
To evaluate the rate of agreement between the original value and predicted values, Lin's
Concordance Correlation Coefficient (CCC; King & Chinchilli, 2001; Lin, 1989) was also calculated.
The CCC is a method for measuring agreement on a continuous measure obtained by two persons
or methods; it measures both precision and accuracy to determine how far the observed data deviate
from the line of perfect concordance (i.e., the line at 45° on a square scatter plot). Lin's coefficient is
expressed as the product of the Pearson correlation coefficient, which quantifies precision
(i.e., the distance of each data point from the fitted model), and a bias correction factor (Cb) that
measures accuracy (i.e., the distance of the resulting model from the optimal 45° diagonal
originating at the origin; King & Chinchilli, 2001; Lin, 1989). Unlike standard Pearson statistics or
paired-sample t tests, the CCC is capable of detecting systematic bias that may be present among
comparisons (e.g., over- or underprediction) and is thus a preferred statistic for evaluating agreement
between continuous variables. Bias correction factor values that differ from 1.00 indicate the presence
of bias, with greater deviation indicating stronger bias (Lin, 1989). Values of the CCC can range from
-1.0, indicating poor concordance, to 1.0, which would be observed in the presence of perfect
agreement between values.
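The decomposition of the CCC into precision and accuracy can be sketched in Python as follows. This is a minimal illustration using the standard formulation from Lin (1989) with population (1/n) moments; the function name `lin_ccc` and the example values are assumptions, and the software actually used in the study is not reported here.

```python
import math


def lin_ccc(x, y):
    """Return (pearson_r, bias_correction_factor, ccc) for paired scores.

    Uses 1/n (population) variances and covariance, so that
    ccc = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be equal in length with at least 2 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    pearson_r = cov_xy / math.sqrt(var_x * var_y)            # precision
    bias_factor = (2 * math.sqrt(var_x * var_y)
                   / (var_x + var_y + (mean_x - mean_y) ** 2))  # accuracy (Cb)
    return pearson_r, bias_factor, pearson_r * bias_factor


# Hypothetical example: predictions shifted upward by a constant 1 point.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
r, cb, ccc = lin_ccc(observed, predicted)
print(r, cb, ccc)
```

In this example the Pearson correlation is perfect while the bias factor falls below 1, which is the kind of systematic over- or underprediction that the Pearson coefficient alone would not reveal.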
RESULTS
Descriptive statistics for predicted index score values for all six models are presented in Table 1,
and model summary statistics for the three IMI and three DMI prediction equations are presented
in Table 2. All regression coefficients in each of the models were significant at the p < .001 level.
The frequencies of score discrepancies between predicted and actual values based on the SEM and
standard deviation (SD) are presented in Table 3; these scores have been provided in order to
evaluate whether any of the predictive models systematically over- or underpredict observed scores.
The IMI-3W model was statistically significant, F(3, 288) = 1,832.18, p < .001, with an R² of
.95. Visual Reproduction Immediate (VR1) generated the largest standardized regression coefficient,
and LM1 generated the smallest. This model also generated predicted scores that had the tightest
distribution of the immediate memory models, with nearly two thirds of scores falling within 1 SEM of
actual values and 96% of scores within 2 SEMs (i.e., 8 index score points). The IMI-3C model was
also significant, F(3, 279) = 634.19, p < .001, with an R² of .87. Visual Reproduction Immediate again
had the largest coefficient; however, the regression coefficient for CVLT-T was the smallest in this
model. In relation to actual values, 44.9% of predicted values fell within 1 SEM and 79.9% of
scores within 2 SEMs of actual values. The most parsimonious model, IMI-2, produced the smallest
R² value of the three immediate memory models but was still statistically significant,
F(2, 290) = 817.14, p < .001. This model produced the largest discrepancies between observed and
predicted values with only 40% of scores within 1 SEM and 23.3% of scores differing by more than
2 SEMs. Among individuals with an observed IMI that was within 1 SD of the mean in either
direction (n = 125), the average prediction discrepancy for the three models was
0.70 points (SD = 5.4, range = -12.1 to 21.7). For individuals with an observed IMI between 1 and
2 SDs below the mean (n = 86), predicted values averaged 0.20 points lower than observed values
(SD = 5.8, range = -16.49 to 17.11). Among individuals with observed IMI values more than 2 SDs
below the mean (n = 73), the average discrepancy was 1.80 points lower than observed values
(SD = 5.8, range = -17.54 to 10.64). Nine participants had observed IMI scores that were more than
1 SD above the mean.
Predicting the DMI yielded similar results to the three models for IMI, with the DMI-3W
demonstrating the largest R². This model was significant overall, F(3, 287) = 1,727.41, p < .001.
Predicted values from this model most closely approximated the distribution of observed scores,
with 71.1% of scores falling within 1 SEM and fewer than 4% of scores falling more than 2 SEMs
from actual values. As with the immediate memory models, replacing Verbal Paired Associates
Delayed Recall (VPA2) with CVLT-Z for the DMI-3C produced a significant model,
F(3, 277) = 563.26, p < .001. The R² value for this model, however, was considerably
smaller than the model using VPA2 and slightly smaller than the R² value for the IMI model
using CVLT-T. This finding indicates that substitution of the CVLT-II Long-Delay Free Recall for
VPA2 produces greater variability in subtest scores than does the standard calculation of the DMI
using VPA or the substitution method for the IMI. The DMI-2 generated the smallest R² value of all
six models; however, the model was still statistically significant, F(2, 289) = 769.42, p < .001. A total
of 43% of scores from this model were within 1 SEM and 78% within 2 SEMs. Among individuals
with an observed DMI that was within 1 SD of the mean in either direction (n = 108), the average
prediction discrepancy for the three models was 1.3 points (SD = 5.1, range = -16.03 to 15.01). For
individuals with an observed DMI between 1 and 2 SDs below the mean (n = 109), there was no
difference on average between observed and predicted values (M = 0.0, SD = 4.9, range = -15.41 to
17.51). Among individuals with observed DMI values more than 2 SDs below the mean (n = 65),
the average discrepancy was 2.4 points lower than observed DMI values (SD = 5.0, range = -16.08 to
10.53). Nine participants had observed DMI scores that were more than 1 SD above the mean.

TABLE 1
Descriptive statistics for full and short-form predicted WMS-IV Immediate and Delayed Memory Indexes

Model                              M       SD      Range of difference
Actual IMI                         83.0    18.0    NA
IMI-2 (LM1, VR1)                   82.9    16.7    -17.5 to 21.7
IMI-3C (LM1, VR1, CVLT-II1)^a      83.2    16.9    -16.4 to 20.8
IMI-3W (LM1, VR1, VPA1)            82.9    17.6    -10.7 to 14.4
Actual DMI                         82.6    16.3    NA
DMI-2 (LM2, VR2)                   82.4    15.0    -16.1 to 17.5
DMI-3C (LM2, VR2, CVLT-II2)^b      82.7    15.0    -15.4 to 16.3
DMI-3W (LM2, VR2, VPA2)            82.6    15.8    -10.7 to 11.8

Note. WMS-IV = Wechsler Memory Scale, Fourth Edition; IMI = Immediate Memory Index; DMI = Delayed Memory Index;
LM1 = Logical Memory, Immediate; LM2 = Logical Memory, Delayed; VR1 = Visual Reproduction, Immediate; VR2 = Visual
Reproduction, Delayed; CVLT-II1 = California Verbal Learning Test-II, Immediate; CVLT-II2 = California Verbal Learning
Test-II, Delayed; VPA1 = Verbal Paired Associates, Immediate; VPA2 = Verbal Paired Associates, Delayed.
^a IMI-3C employed a substitution of CVLT-II Trials 1-5 t score for Verbal Paired Associates (Immediate).
^b DMI-3C employed a substitution of CVLT-II Long-Delay Free-Recall z score for Verbal Paired Associates (Delayed).

TABLE 2
Regression summary statistics for short-form prediction models of Immediate and Delayed Memory Indexes

                                        Predictors                  Model summary
Model     Predictor                     B        SEB     β          R      R²     SEE
IMI-2     Logical Memory 1              2.78     0.14    0.54       .92    .85    7.00
          Visual Reproduction 1         2.61     0.13    0.52
          Constant                      43.76    1.05
IMI-3C    Logical Memory 1              2.32     0.15    0.44       .93    .87    6.49
          Visual Reproduction 1         2.48     0.13    0.49
          Substitution 1                1.13     0.17    0.18
          Constant                      39.48    1.18
IMI-3W    Logical Memory 1              1.83     0.09    0.35       .98    .95    4.04
          Visual Reproduction 1         2.25     0.08    0.45
          Verbal Paired Associates 1    2.16     0.09    0.39
          Constant                      36.77    0.67
DMI-2     Logical Memory 2              2.76     1.22    0.57       .92    .84    6.49
          Visual Reproduction 2         2.55     0.12    0.55
          Constant                      46.55    0.99
DMI-3C    Logical Memory 2              2.39     0.13    0.49       .93    .86    6.10
          Visual Reproduction 2         2.35     0.13    0.50
          Substitution 2                0.85     0.13    0.18
          Constant                      44.43    1.02
DMI-3W    Logical Memory 2              1.98     0.08    0.41       .97    .95    3.75
          Visual Reproduction 2         2.02     0.07    0.43
          Verbal Paired Associates 2    1.94     0.08    0.39
          Constant                      40.43    0.63

Note. SEB = Standard error of beta weights; SEE = Standard error of the estimate; IMI = Immediate Memory Index; IMI-3C =
predicted IMI using LM, VR, and California Verbal Learning Test, 2nd edition (CVLT-II); IMI-3W = predicted IMI using LM,
VR, and Verbal Paired Associates (VPA); DMI = Delayed Memory Index; DMI-2 = predicted DMI using LM and VR;
DMI-3C = predicted DMI using LM, VR, and CVLT-II; DMI-3W = predicted DMI using LM, VR, and VPA. Substitution refers
to use of California Verbal Learning Test-II indices in place of Verbal Paired Associates.

TABLE 3
Accuracy of predicted WMS-IV index scores in relation to the standard error of measurement and standard deviation of the
indexes

                                              Predicted IMI                 Predicted DMI
Range of discrepancy                      IMI-2   IMI-3C   IMI-3W       DMI-2   DMI-3C   DMI-3W
Standard error of measurement^a
  (Predicted > actual) ≥ 8 points          12.6     9.9      1.0          9.6    10.3      1.4
  (Predicted > actual) < 4 points          16.6    18.7     15.8         19.5    14.9     12.7
  Within 1 SEM                             40.0    44.9     65.4         43.8    49.8     71.1
  (Predicted < actual) < 4 points          19.3    16.3     15.4         15.1    13.9     12.4
  (Predicted < actual) ≥ 8 points          10.8    10.2      2.4         12.0    11.0      2.4
Standard deviation
  (Predicted > actual) ≥ 15 points          1.4     0.7      0.0          0.7     0.0      0.0
  (Predicted > actual) within 15 points    49.7    47.1     52.7         49.8    50.7     48.8
  (Predicted < actual) within 15 points    47.6    51.4     47.3         49.1    48.9     51.2
  (Predicted < actual) ≥ 15 points          1.4     0.7      0.0          0.3     0.4      0.0

Note. WMS-IV = Wechsler Memory Scale, Fourth Edition; IMI = Immediate Memory Index; IMI-2 = predicted IMI using Logical
Memory (LM) and Visual Reproduction (VR); IMI-3C = predicted IMI using LM, VR, and California Verbal Learning Test, 2nd
edition (CVLT-II); IMI-3W = predicted IMI using LM, VR, and Verbal Paired Associates (VPA); DMI = Delayed Memory Index;
DMI-2 = predicted DMI using LM and VR; DMI-3C = predicted DMI using LM, VR, and CVLT-II; DMI-3W = predicted DMI
using LM, VR, and VPA.
^a Standard errors of measurement (SEMs) = 3.5 and 3.7 points for IMI and DMI, respectively. Cutpoints rounded to nearest
whole points of SEM = 4. Standard deviations (SDs) for IMI and DMI = 15 points.
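As an illustration of how a prediction equation is applied, the following Python sketch computes predicted index scores from subtest scaled scores using the unstandardized coefficients and constants listed for the two three-subtest WMS-IV models in Table 2. The assignment of each coefficient to its predictor follows the row order of that table as read here, and the dictionary `MODELS`, the function `predict_index`, and the example scaled scores are hypothetical names and values introduced only for this sketch; it is not a clinical scoring tool.

```python
# Unstandardized coefficients (B) and constants as read from Table 2.
# The mapping of values to predictors is this sketch's reading of the table.
MODELS = {
    "IMI-3W": {"constant": 36.77, "LM": 1.83, "VR": 2.25, "VPA": 2.16},
    "DMI-3W": {"constant": 40.43, "LM": 1.98, "VR": 2.02, "VPA": 1.94},
}


def predict_index(model_name, scaled_scores):
    """Return constant + sum(B * scaled score) for the named model."""
    coefficients = MODELS[model_name]
    total = coefficients["constant"]
    for predictor, weight in coefficients.items():
        if predictor == "constant":
            continue
        total += weight * scaled_scores[predictor]
    return total


# Hypothetical examinee with average scaled scores (10) on every subtest.
average_profile = {"LM": 10, "VR": 10, "VPA": 10}
predicted_imi = predict_index("IMI-3W", average_profile)
predicted_dmi = predict_index("DMI-3W", average_profile)
print(round(predicted_imi, 2), round(predicted_dmi, 2))
```

With every scaled score at the normative mean of 10, both predictions land close to the index mean of 100, which is a quick plausibility check on the coefficient mapping.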
The rate of agreement between observed values and predicted values generated from each model
was also high. Each Pearson correlation coefficient calculated between observed index score values and
predicted values from each model was positive and significant (p < .001), indicating a high degree of
precision in predicted values. The bias correction factors used in the calculation of the CCC each
approximated 1.00, indicating near-perfect accuracy with very little evidence of systematic bias
in any of the six models. Inspection of the generated scatter plots between observed and predicted
values showed a slight trend of each model to overpredict at the higher end of the score distribution
and underpredict at the lower end; this trend is most pronounced in the two-variable models and
the models that substitute CVLT-II variables for VPA. The predictive models using LM, VR, and
VPA showed near-perfect agreement. All measures of agreement are presented in Table 4; Figures 1-6
plot the data along with the line of perfect concordance (i.e., the line at 45° on a square scatter plot)
and the data's reduced major axis (i.e., summarized center of observed data plotted through the
intersection of the means with a slope equal to the ratio of standard deviations) to show the extent
of bias present in each prediction model.
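The reduced major axis described above (a line through the intersection of the means with a slope equal to the ratio of standard deviations) can be sketched in Python as follows. The function name `reduced_major_axis`, the choice of which variable is placed on which axis, and the example values are assumptions for illustration; the sketch also assumes a positive association, as was observed for all six models.

```python
import statistics


def reduced_major_axis(x, y):
    """Return (slope, intercept) of the reduced major axis of y on x.

    Slope is SD(y) / SD(x); the line passes through (mean(x), mean(y)).
    Assumes x and y are positively associated.
    """
    slope = statistics.stdev(y) / statistics.stdev(x)
    intercept = statistics.mean(y) - slope * statistics.mean(x)
    return slope, intercept


# Hypothetical predicted (x) and observed (y) values, not study data.
predicted = [1.0, 2.0, 3.0]
observed = [2.0, 4.0, 6.0]
slope, intercept = reduced_major_axis(predicted, observed)
print(slope, intercept)
```

A slope of 1 and an intercept of 0 would place the reduced major axis on the line of perfect concordance, so departures from those values summarize the bias visible in the scatter plots.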
Given that the fit of each of the six models accounted for a significant proportion of the variance
in the respective index scores, and each predictive model generated values that were in high
agreement with observed values, correlations of predicted to actual index scores were compared to
each other using Fisher's r-to-z transformation, which translates Pearson correlation coefficients to
a common metric to facilitate significance testing using a t distribution. This transformation allows
for determination of the relative benefit of one prediction equation over the others. For immediate
memory scores, the correlation of IMI-3W to actual IMI was significantly stronger, t(294) = 5.99,
p < .001, than were the correlations of IMI to either IMI-2 or IMI-3C, which did not significantly
differ from each other, t(294) = 0.83, p = .40. Similarly, actual DMI scores were most highly
correlated with DMI-3W, significantly more strongly, t(294) = 6.15, p < .001, than with
either DMI-2 or DMI-3C, which again did not differ from each other, t(294) = 0.83, p = .40. Both
the IMI-3W and DMI-3W generated significantly stronger correlations than the other models and
showed the highest rate of agreement with observed scores. All Pearson correlation coefficients between
observed and predicted scores are presented in Table 4.

TABLE 4
Measures of agreement between observed and predicted index scores

          Precision (Pearson          Accuracy         Concordance correlation    95% confidence
Model     correlation coefficient)    (bias factor)    coefficient (CCC)          interval
IMI-2     .92                         1.00             .92                        [.90, .94]
IMI-3C    .93                         1.00             .93                        [.92, .95]
IMI-3W    .98                         1.00             .97                        [.97, .98]
DMI-2     .92                         1.00             .91                        [.89, .93]
DMI-3C    .93                         1.00             .92                        [.91, .94]
DMI-3W    .97                         1.00             .97                        [.97, .98]

Note. IMI = Immediate Memory Index; IMI-2 = predicted IMI using Logical Memory (LM) and Visual Reproduction (VR);
IMI-3C = predicted IMI using LM, VR, and California Verbal Learning Test, 2nd edition (CVLT-II); IMI-3W = predicted IMI
using LM, VR, and Verbal Paired Associates (VPA); DMI = Delayed Memory Index; DMI-2 = predicted DMI using LM and VR;
DMI-3C = predicted DMI using LM, VR, and CVLT-II; DMI-3W = predicted DMI using LM, VR, and VPA.

Figure 1. Scatter plot of observed Immediate Memory Index (IMI) values and predicted IMI values using Logical Memory and Visual
Reproduction subtests (IMI-2). The plot shows the reduced major axis and the line of perfect concordance.
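The r-to-z transformation itself can be sketched in a few lines of Python. This example shows only the transformation, applied to the Pearson coefficients reported for the three IMI models in Table 4; it does not reproduce the t tests reported above, because the exact test for comparing correlations that share a variable is not specified in the text. The names `fisher_z` and `pearson_r` are introduced for illustration.

```python
import math


def fisher_z(r):
    """Fisher's r-to-z transformation: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


# Pearson correlations between predicted and observed IMI, as reported in Table 4.
pearson_r = {"IMI-2": 0.92, "IMI-3C": 0.93, "IMI-3W": 0.98}
z_scores = {model: fisher_z(r) for model, r in pearson_r.items()}

for model, z in z_scores.items():
    # math.tanh inverts the transformation and recovers the original r.
    print(model, round(z, 3), round(math.tanh(z), 2))
```

Because the transformation stretches the scale near 1.0, the gap between .98 and .93 is much larger in z units than the gap between .93 and .92, which is consistent with the pattern of significant and nonsignificant differences reported above.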
To evaluate the influence of test-taking strategy
on the predictive abilities of the generated models, predicted values were compared to original
values for the subsample of participants simulating cognitive impairment. As the purpose of the
present paper was to determine the ability of a subset of test scores to estimate the full index scores,
it was expected that test-taking strategy would
have no influence on predictive accuracy. This
MILLER ET AL.
40
40
60
80
IMI3C
Reduced major axis
100
120
60
80
IMI
100
120
140
Figure 2. Scatter plot of observed Immediate Memory Index (IMI) values and predicted IMI values using Logical Memory, Visual
Reproduction, and California Verbal Learning Tests, 2nd Edition subtest scaled scores (IMI3C).
40
60
80
IMI
100
120
140
538
40
60
Reduced major axis
80
IMI3W
100
120
Figure 3. Scatter plot of observed Immediate Memory Index (IMI) values and predicted IMI values using Logical Memory, Visual
Reproduction, and Verbal Paired Associates subtest scaled scores (IMI3W).
expectation was confirmed by paired-sample t tests,

which were nonsignificant for IMI comparisons (all
p values > .38). However, DMI predicted values for
this subsample were all significantly higher than
the actual DMI values (all p values .02) by

an average of 1.5 points (DMI3C, M = 79.5,
SD = 14.4) and 1.7 points (DMI3W, M = 79.7,
SD = 14.9; DMI2, M = 79.7, SD = 14.3).
Figure 4. Scatter plot of observed Delayed Memory Index (DMI) values and predicted DMI values using Logical Memory and Visual
Reproduction subtests (DMI2).
Figure 5. Scatter plot of observed Delayed Memory Index (DMI) values and predicted DMI values using Logical Memory, Visual
Reproduction, and California Verbal Learning Test, 2nd Edition subtest scaled scores (DMI3C).
Although statistically significant, the differences
of less than 2 points are not clinically meaningful, as demonstrated by negligible effect sizes for
each of the three comparisons (Cohen's ds < .12).
The distribution of predicted scores
among simulators relative to the SEM reaffirms this
observation, as evidenced by the percentage of predicted DMI values that were greater than observed
scores by 2 or more SEMs (14.1%, 10.9%, and
1.6% for the DMI2, DMI3C, and DMI3W,
respectively). The proportion of predicted DMI
values that were underestimated by the predictive
formulas by more than 2 SEMs was considerably
smaller (DMI2 = 4.7%; DMI3C = 6.3%;
DMI3W = 1.6%). Among nonsimulators, a similar proportion of overprediction by 2 or more SEMs
was observed (DMI2 = 8.3%; DMI3C = 10.1%;
DMI3W = 1.3%).
Figure 6. Scatter plot of observed Delayed Memory Index (DMI) values and predicted DMI values using Logical Memory, Visual
Reproduction, and Verbal Paired Associates subtest scaled scores (DMI3W).
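The two summaries used in this paragraph, a standardized effect size for the predicted-minus-observed difference and the percentage of cases mispredicted by at least 2 SEMs, can be sketched as follows. This is a minimal illustration under stated assumptions: the scores and the SEM value are hypothetical, Cohen's d is computed here with the pooled standard deviation (the paper does not state which variant was used), and the same inclusive threshold is applied to over- and underprediction.

```python
import statistics


def cohens_d(predicted, observed):
    """Mean difference divided by the pooled SD of the two score sets
    (an assumed variant; the source does not specify one)."""
    mean_diff = statistics.mean(predicted) - statistics.mean(observed)
    pooled_var = (statistics.variance(predicted) + statistics.variance(observed)) / 2
    return mean_diff / pooled_var ** 0.5


def pct_beyond_sem(predicted, observed, sem, k=2.0):
    """Percentage of cases over- and underpredicted by at least k SEMs."""
    diffs = [p - o for p, o in zip(predicted, observed)]
    over = sum(d >= k * sem for d in diffs)
    under = sum(d <= -k * sem for d in diffs)
    return 100 * over / len(diffs), 100 * under / len(diffs)


# Hypothetical index scores and SEM, for illustration only.
observed = [62, 70, 75, 81, 88, 94, 79, 66]
predicted = [64, 71, 74, 90, 89, 95, 80, 58]

over_pct, under_pct = pct_beyond_sem(predicted, observed, sem=4.0)
d = cohens_d(predicted, observed)
print(f"over: {over_pct:.1f}%, under: {under_pct:.1f}%, d = {d:.2f}")
```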
DISCUSSION
The findings provide strong support for estimating WMS-IV index scores using parsimonious
subsets of the most frequently administered subtests. The three primary subtests of the WMS-IV
that are most familiar and most frequently used
(Logical Memory, Visual Reproduction, and
Verbal Paired Associates) serve remarkably well
to estimate both the Immediate Memory Index
(IMI) and Delayed Memory Index (DMI) of the
WMS-IV. Furthermore, the present findings
demonstrate that the IMI and DMI can be reliably
estimated with a level of predictive accuracy that
is considered acceptable for use in clinical practice.
Estimations that used the CVLT-II substitution
generally performed less well than did estimations
using the WMS-IV subtests. Although the most
accurate model uses three of the four original
WMS subtests used to calculate the IMI and DMI,
the present findings also show that using just two of
the subtests will still yield index scores that closely
approximate the actual values that would have been
obtained if all four subtests were administered.
Such findings likely have particular relevance for
clinicians facing increasing pressures and restrictions on testing practices, or for those working with
patient populations who may fatigue easily and are
unable to endure an extensive cognitive battery.
A comparison of the different short forms shows that
the two-subtest versions and the versions that
added CVLT-II substituted scores are essentially interchangeable. Specifically, using Logical
Memory and Visual Reproduction alone to predict the Immediate Memory and Delayed Memory
index scores yields the same predictive accuracy as the models that added indices from the
CVLT-II. Therefore, the addition of CVLT-II
indices contributes no additional predictive accuracy over using just Logical Memory and Visual
Reproduction.
In contrast, adding Verbal Paired
Associates to Logical Memory and
Visual Reproduction significantly improved
estimations of IMI and DMI. This finding is
expected because using Logical Memory, Visual
Reproduction, and Verbal Paired Associates
together most closely approximates the original
sample from which the Immediate and Delayed
Memory indexes were constructed. The predictive
accuracy of the three-subtest version also raises
questions regarding the utility of the Designs
subtest included in the full WMS-IV. The present
findings suggest that Designs plays a minimal
role in the actual immediate or delayed memory
index scores, given that short forms using Logical
Memory, Visual Reproduction, and Verbal Paired
Associates together account for approximately
95% of variability in these indices. What remains
unclear, however, is the extent to which the Designs
subtest contributes to the auditory and visual
memory indexes. Although the Designs subtest
may contribute little to the immediate and delayed
memory indexes, it may be an integral component
of these other indexes. For clinicians who are only
interested in the IMI and DMI, however, omission
of the Designs subtest is particularly appealing in
that it removes the subtest with the greatest time
demand and the most materials, thus simplifying
administration.
Although the predicted DMI values were statistically higher than the actual values in the sample of simulators, it is important to note that the
magnitude of the actual differences (i.e., less than
2 points) is not clinically meaningful. Additionally,
it is important to note that these values still showed
a strong positive relationship with actual values,
which all fall at the extreme end of the distribution. Consistent with the existing literature, the
predictive accuracy of these models deteriorates
the further out scores get from the center of the
distribution (Axelrod & Woodard, 2000; Spinks
et al., 2009). For individuals who obtained an
actual immediate memory index within two standard deviations of the mean in either direction,
predicted values were within one index point on
average. Predicted values for individuals with an
actual immediate memory index that fell more than
two standard deviations below the mean, however,
generated predicted values that differed by nearly
two index score points on average. A similar pattern was observed among delayed memory index
scores, in which individuals who obtained an actual
delayed memory index within two standard deviations of the mean generated predicted values that
were generally within one index point. Individuals
with an actual delayed memory index that was
more than two standard deviations below the mean
generated predicted values that differed by more
than two index points on average, and although
this difference is very small and not clinically
meaningful, it is still of theoretical interest. Thus,
clinicians faced with a test-taker demonstrating
severely impaired levels of memory functioning as
evidenced by scores that are more than two full
standard deviations below the mean on individual subtests should use these predictive equations
with this consideration in mind. Similar cautionary warnings likely apply to the opposite end of
the distribution as well; however, the proportion of
individuals scoring more than one standard deviation above the mean in the present sample was too
small to investigate this possibility adequately.
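The pattern described above, in which prediction error grows as observed scores move away from the center of the distribution, can be checked by grouping absolute errors according to the observed score's distance from the mean. The sketch below assumes the conventional index-score metric (mean of 100, standard deviation of 15); the function name and the data are hypothetical.

```python
def mean_abs_error_by_band(observed, predicted, mean=100.0, sd=15.0, cutoff_sd=2.0):
    """Mean absolute prediction error for observed scores that fall below,
    within, or above +/- cutoff_sd standard deviations of the mean."""
    lower = mean - cutoff_sd * sd
    upper = mean + cutoff_sd * sd
    bands = {"below": [], "within": [], "above": []}
    for obs, pred in zip(observed, predicted):
        if obs < lower:
            key = "below"
        elif obs > upper:
            key = "above"
        else:
            key = "within"
        bands[key].append(abs(pred - obs))
    # An empty band is reported as None rather than as zero error.
    return {k: (sum(v) / len(v) if v else None) for k, v in bands.items()}


# Hypothetical observed and predicted index scores, for illustration only.
observed = [55, 62, 68, 85, 100, 112, 90]
predicted = [58, 60, 70, 86, 99, 112, 91]

errors = mean_abs_error_by_band(observed, predicted)
print(errors)
```

Reporting an empty band as None makes it explicit when a region of the distribution, such as the upper tail in the present sample, contains too few cases to evaluate.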
From a practical standpoint, clinicians who
wish to adopt the present approach and obtain
Immediate and Delayed index scores without having to administer the entire WMS-IV battery can
do so by utilizing the coefficients presented in
Table 2. Specifically, once the specific model has
been selected, the respective unstandardized regression coefficients (B) can be applied by multiplying the subtest age-adjusted scaled scores by their
respective coefficients in Table 2, summing the
totals, and adding the constant. The resulting value
equals the predicted index score, which approximates what would have been obtained had all four
subtests been administered. The predicted values
generated to approximate index scores
were derived from an administration of the entire
WMS-IV battery; in addition, measures not included in
the original standardization process were used to
fill the delay intervals. These limitations may have
influenced performance on DMI subtests; however,
they are unlikely to have had significant influence,
because during the standardization process itself,
the tests go through myriad changes (i.e., adding
items, dropping items) before the final product is
published. Another important point to reiterate
is that all predictive models utilized age-adjusted
scaled scores obtained from the test manual. Use
of raw subtest scores in these equations will result
in drastically different values that will not approximate an index score.
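The calculation described in this paragraph (multiply each age-adjusted scaled score by its unstandardized coefficient, sum the products, and add the constant) can be sketched as below. The coefficients and constant shown are placeholders chosen for illustration and are NOT the values from Table 2; the function name and subtest keys are likewise hypothetical. The range check assumes the standard 1 to 19 scaled-score metric and is meant to catch raw scores entered by mistake.

```python
def predict_index(scaled_scores, coefficients, constant):
    """Predicted index score = constant + sum(B_i * scaled_score_i)."""
    missing = set(coefficients) - set(scaled_scores)
    if missing:
        raise ValueError(f"missing subtest scores: {sorted(missing)}")
    for name in coefficients:
        # Age-adjusted scaled scores are assumed to range from 1 to 19.
        if not 1 <= scaled_scores[name] <= 19:
            raise ValueError(f"{name} is not a valid scaled score")
    return constant + sum(coefficients[n] * scaled_scores[n] for n in coefficients)


# Placeholder coefficients and constant; substitute the Table 2 values
# for the selected model before any real use.
example_b = {"logical_memory": 2.0, "visual_reproduction": 2.5, "verbal_paired_associates": 1.5}
example_constant = 40.0

scores = {"logical_memory": 10, "visual_reproduction": 10, "verbal_paired_associates": 10}
estimate = predict_index(scores, example_b, example_constant)
print(estimate)
```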
In comparison to previous short forms of the
Wechsler Memory Scale (e.g., Axelrod & Woodard,
2000; Woodard & Axelrod, 1995), the results of
the present study are highly consistent. As with the
WMS-III, there is continued evidence that the
immediate memory index score can be reliably predicted from two or three of the available subtests.
Although there is no longer a General Memory
index such as that included in the WMS-III and
studied by Axelrod and Woodard (2000), a similar composite included in the present study is the
delayed memory index, which also was effectively
predicted by fewer than the full battery of subtests.
One striking similarity between the present paper
and earlier short forms is that the squared multiple correlations for each model are nearly identical.
Although the specific predictors are different, the
three-variable models fit by Axelrod and Woodard
accounted for approximately 95% of the variance
in index scores, whereas the two-variable models
accounted for only 87%. This pattern between the
two- and three-variable models was virtually identical in the present study for those models that
only included Wechsler Memory Scale subtests
(i.e., those not including CVLT-II substitution).
Moreover, the same shortcomings were also identified in that the predictive formulas lose stability
at the extreme ends of the normal distribution,
which is likely a factor of the predictive method as
opposed to a function of the test itself.
Comparing the contributions of the individual
subtests between the WMS-III and the WMS-IV
reveals some specific differences that are important to note. For example, in each model studied
by Axelrod and Woodard (2000) for the WMS-III,
Logical Memory had the largest contribution of
the included predictors. In the present study, however, this was only the case when Logical Memory
was paired with one other subtest (e.g., Visual
Reproduction). In each of the three-variable models for the WMS-IV, Visual Reproduction made
the largest contribution to predicted index scores,
and it is unclear how this subtest contributed to
WMS-III short forms as Visual Reproduction was
not included in the earlier studies.
A unique contribution of the present study was
the inclusion of individuals coached to feign cognitive impairment. Theoretically, the generation of
index scores (complete or estimated from short
forms) should be independent of test-taking strategy as they are a linear composite of observed
scores and therefore should not be affected by
skill or psychological characteristics such as motivation; the calculations should apply regardless of
test-taking strategy. This notion was supported in
part by the fact that the pattern of findings among
the coached simulators paralleled that observed
among the bona fide patients. At the same time,
although the absolute differences were relatively
small and not clinically meaningful (i.e., less than
2 points), it is of some interest that the short
forms tended to slightly overpredict the scores of simulators.
It is also important to note that the subsample of participants simulating cognitive impairment performed quite poorly, which reinforces the
notion that scores from these predictive equations
become less reliable at the extreme ends of the
score distributions. Furthermore, a nearly identical proportion of scores were overpredicted in the
remainder of the sample for each of the predictive
methods. In that regard, the short forms seem to
work in the direction of being robust to feigned
cognitive impairment rather than vulnerable to it
and can thus be applied in the context of suspected
negative response bias.
The mixed nature of the clinical sample, which
included a variety of neuropsychological and
psychological conditions across a broad range of
age and education, represents both a strength and
a weakness of the present design. Also, although
there is no reason to expect that gender would
affect prediction accuracy, the present sample
overrepresented men. Future research should verify
that the patterns observed in this study hold true
across specific subgroups and conditions. The
exclusion of the Designs subtest also represents
a weakness of the present study, as its inclusion
may have allowed for even greater precision in the
resulting predicted values.
Original manuscript received 7 October 2011
Revised manuscript accepted 6 February 2012
First published online 2 March 2012
REFERENCES
Axelrod, B. N., & Woodard, J. L. (2000). Parsimonious
prediction of Wechsler Memory Scale-III memory
indices. Psychological Assessment, 12, 431-435.
Delis, D., Kramer, J., Kaplan, E., & Ober, B. (2000).
The California Verbal Learning Test (2nd ed.). San
Antonio, TX: Pearson.
Greiffenstein, M. F., Gola, T., & Baker, W. J. (1995).
MMPI-2 validity scales versus domain-specific measures in detection of factitious traumatic brain injury.
Clinical Neuropsychologist, 9, 230-240.
Kaufman, J. C., & Kaufman, A. S. (2001). Time for
the changing of the guard: A farewell to short forms
of intelligence tests. Journal of Psychoeducational
Assessment, 19, 245-267.
King, T. S., & Chinchilli, V. M. (2001). A generalized concordance correlation coefficient for continuous and
categorical data. Statistics in Medicine, 20, 2131-2147.
Lin, L. I. (1989). A concordance correlation coefficient
to evaluate reproducibility. Biometrics, 45, 255-268.
Miller, J. B., Millis, S. R., Rapport, L. J., Bashem, J. R.,
Hanks, R. A., & Axelrod, B. N. (2011). Detection of
insufficient effort using the advanced clinical solutions
for the Wechsler Memory Scale, Fourth Edition. The
Clinical Neuropsychologist, 25, 160-172.
Spinks, R., McKirgan, L. W., Arndt, S., Caspers,
K., Yucuis, R., & Pfalzgraf, C. J. (2009). IQ
estimate smackdown: Comparing IQ proxy measures to the WAIS-III. Journal of the International
Neuropsychological Society, 15, 590-596.
Tombaugh, T. (1996). Test of Memory Malingering.
North Tonawanda, NY: Multi-Health Systems.
Wechsler, D. (2008a). Technical and interpretive manual
for Wechsler Memory Scale (4th ed.). San Antonio,
TX: Pearson.
Wechsler, D. (2008b). Wechsler Memory Scale (4th ed.).
San Antonio, TX: Pearson.
Woodard, J. L., & Axelrod, B. N. (1995). Parsimonious
prediction of Wechsler Memory Scale-Revised memory indices. Psychological Assessment, 7, 445-449.
