
Practical Missing Data Analysis in SPSS
(v17 onwards)
Peter T. Donnan
Professor of Epidemiology and Biostatistics

Objectives
How to impute missing values
in SPSS, specifically MI
How to implement analyses
with multiple imputed values
Interpretation of the output
Practical tips

Example data
From trial of pedometers+advice vs
advice vs controls in sedentary elderly
women
Follow-up at 3 and 6 months
Main outcome measure of activity
from accelerometer counts
210 randomised / 170 at 3 months

Example data: Pedometer trial
Read in the SPSS data file Study databse.sav
Main outcome is 3-month activity: AccelVM2
Baseline activity: AccelVM1a

Trial arm represented by two dummy variables:
Grp1 = Pedometer vs. control
Grp2 = Advice vs. control
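
As an illustration of the dummy coding above, here is a minimal Python sketch. The arm labels ("pedometer", "advice", "control") and the function name are assumptions made for the example; the slides only give the variable names Grp1 and Grp2, with control as the reference arm.

```python
# Hypothetical arm labels; the source only names the dummy variables Grp1 and Grp2.
VALID_ARMS = {"pedometer", "advice", "control"}


def dummy_code(arm):
    """Return (Grp1, Grp2) for one participant, using control as the reference arm."""
    arm = arm.lower()
    if arm not in VALID_ARMS:
        raise ValueError(f"unknown trial arm: {arm!r}")
    grp1 = 1 if arm == "pedometer" else 0  # Grp1: pedometer vs. control
    grp2 = 1 if arm == "advice" else 0     # Grp2: advice vs. control
    return grp1, grp2


for arm in ("pedometer", "advice", "control"):
    print(arm, dummy_code(arm))
```

Controls are coded (0, 0), so each regression coefficient for Grp1 or Grp2 is read as a contrast against the control arm.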

Main analysis: Pedometer trial
Regression on 3-month activity, adjusting for baseline activity and two dummy variables representing trial arm contrasts
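
Written out, the regression described on this slide would take roughly the following form. The coefficient symbols and the subscript i for participant are notation assumed here; the variable names are those given in the slides.

```latex
\mathrm{AccelVM2}_i = \beta_0 + \beta_1\,\mathrm{AccelVM1a}_i
  + \beta_2\,\mathrm{Grp1}_i + \beta_3\,\mathrm{Grp2}_i + \varepsilon_i
```

Here \beta_2 and \beta_3 are the adjusted differences in 3-month activity for the pedometer and advice arms, each compared with control.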

Main analysis: Pedometer trial

Note that n = 170, with 40 missing in the complete case analysis, so there is potential for bias

Missing at Random (MAR)

Prob(Missing) is:
1) independent of unobserved data, but
2) dependent on observed data
Essentially, the observed data are a random sample of the full data within each stratum
MAR is a weaker assumption than MCAR (missing completely at random)
If MAR is assumed, many methods are available to impute missing values from the observed data.
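
The MAR idea can be illustrated with a small simulation. This is a sketch on made-up data, not the trial data: the variable names, the dropout probabilities (0.8 and 0.1), and the two strata (baseline below or above zero) are all assumptions chosen for illustration. Dropout depends only on the always-observed baseline, so the complete-case mean of the outcome is biased overall, while within a stratum the observed outcomes still behave like a random sample.

```python
import random
from statistics import mean


def simulate_mar(n=20000, seed=1):
    """Simulate (baseline, outcome, observed) rows with MAR dropout."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        baseline = rng.gauss(0.0, 1.0)            # always observed
        outcome = baseline + rng.gauss(0.0, 1.0)  # may be missing at follow-up
        # MAR: the dropout probability depends only on the observed baseline.
        p_missing = 0.8 if baseline < 0 else 0.1
        observed = rng.random() >= p_missing
        rows.append((baseline, outcome, observed))
    return rows


rows = simulate_mar()

# Overall: complete cases over-represent participants with high baseline values.
full_mean = mean(o for _, o, _ in rows)
cc_mean = mean(o for _, o, seen in rows if seen)

# Within one stratum (baseline < 0) the observed cases are a random subsample.
low = [(o, seen) for b, o, seen in rows if b < 0]
low_full = mean(o for o, _ in low)
low_observed = mean(o for o, seen in low if seen)

print(f"outcome mean, everyone:          {full_mean:.3f}")
print(f"outcome mean, complete cases:    {cc_mean:.3f}")
print(f"low stratum, everyone:           {low_full:.3f}")
print(f"low stratum, complete cases:     {low_observed:.3f}")
```

The gap between the first two printed means is the complete-case bias; the last two should be close to each other, which is what makes imputation from observed data feasible under MAR.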

Comparison of completers at 3 months and drop-outs

Characteristic                             Completers (n = 172)   Dropped out at 3 months (n = 32)   Chi-squared or t-test p-value
Age, mean (SD)                             77.1 (5.0)             78.5 (5.6)                         0.137
Accelerometer VM, mean (SD)                130695 (47991)         113381 (50444)                     0.065
Limb function, mean (SD)                   8.69 (2.25)            7.41 (2.86)                        0.028
NHS costs in previous 3 months, mean (SD)  199.59 (306.74)        404.29 (1289.54)                   0.402
Pedometer Group, N (%)                     58 (85.3%)             10 (14.7%)                         0.052
BCI Group, N (%)                           52 (77.6%)             15 (22.4%)
Control Group, N (%)                       62 (92.5%)             5 (7.5%)
Stairs difficult: Yes                      48 (76.2%)             15 (23.8%)                         0.033
Stairs difficult: No                       124 (87.9%)            17 (12.1%)

Execution of MI in SPSS
So, assuming MAR, we can use the available data to predict missing values in SPSS:
Analyze > Multiple Imputation > Impute Missing Data Values

Execution of MI in SPSS
Enter ALL variables you think are associated with missingness
Note default number of imputations = 5
Create new dataset to store results
Note icon indicating procedures that allow MI analysis

Execution of MI in SPSS
Automatic method lets SPSS choose
Custom gives more flexibility
Can include all 2-way interactions
Linear Regression model prediction

Execution of MI in SPSS
List of variables chosen
Define each variable for imputation or as predictor or BOTH
N.B. Recommend including the OUTCOME both as a predictor and as a variable to be imputed

Output of MI in SPSS

Note main interest is in outcome VM2, but other factors with missing values are also imputed

Step 2 - Using imputed datasets in analysis
Note the new dataset has the IMPUTATION number as its first column and contains, in order, the original dataset (n = 210) with IMPUTATION = 0 and, concatenated below it, a further 5 new datasets (each n = 210) now with imputed values, IMPUTATION = 1 to 5
Most analyses can now be implemented if the fossil shell spiral symbol is present
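
To make the stacked layout concrete, here is a minimal Python sketch that splits such a file back into the original data and the imputed copies. The column name `Imputation_`, the `id` column, the toy values, and the function name are assumptions for illustration; the slides only say the imputation number is the first column, with 0 for the original data.

```python
from collections import defaultdict


def split_by_imputation(rows, key="Imputation_"):
    """Group stacked rows into {imputation number: list of rows}."""
    groups = defaultdict(list)
    for row in rows:
        groups[int(row[key])].append(row)
    return dict(groups)


# Toy stacked file: 2 participants, original data plus 2 imputations (made-up values).
stacked = [
    {"Imputation_": 0, "id": 1, "AccelVM2": 120000.0},
    {"Imputation_": 0, "id": 2, "AccelVM2": None},      # missing in the original data
    {"Imputation_": 1, "id": 1, "AccelVM2": 120000.0},
    {"Imputation_": 1, "id": 2, "AccelVM2": 101000.0},  # imputed
    {"Imputation_": 2, "id": 1, "AccelVM2": 120000.0},
    {"Imputation_": 2, "id": 2, "AccelVM2": 96500.0},   # imputed
]

groups = split_by_imputation(stacked)
original = groups[0]
imputed = {m: rows for m, rows in groups.items() if m != 0}

print("rows in original data:", len(original))
print("imputed datasets:", sorted(imputed))
```

Observed values repeat unchanged in every copy; only the cells that were missing in imputation 0 differ between imputed datasets.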

Repeat main analysis

Need pooled results
Procedure exactly the same as before
SPSS will do the pooled analysis if the icon (above) is present in the drop-down menu

Pooled Analysis in SPSS

Results presented for the original data and for each imputed dataset separately
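
The slides do not spell out how the separate results are combined. The standard approach for multiply imputed data is Rubin's rules, which this sketch assumes: the pooled estimate is the mean of the per-imputation estimates, and the pooled variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The function name and the input numbers are made up for illustration and are not taken from the trial output.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from >= 2 imputations")
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)  # average sampling variance
    between = variance(estimates)                # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Made-up coefficient estimates and standard errors from 5 imputed datasets.
estimates = [10.0, 12.0, 11.0, 13.0, 9.0]
std_errors = [2.0, 2.0, 2.0, 2.0, 2.0]

pooled_b, pooled_se = pool_rubin(estimates, std_errors)
print(f"pooled estimate = {pooled_b:.3f}, pooled SE = {pooled_se:.3f}")
```

The pooled standard error is larger than any single-dataset standard error here because it also carries the uncertainty due to the imputation itself.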

Results of pooled analysis from 5 imputed datasets

Model (pooled)     B       SE     t        Sig.    Fraction missing
Constant           15607   7808   1.999    0.047   0.173
AccelVM1a          0.852   0.051  16.630   0.000   0.124
Pedometer Group    11310   6131   1.845    0.066   0.138
Advice only        17536   6526   2.687    0.009   0.266

Larger effect sizes in both groups
Greater power gives greater statistical significance

Interpretation
Compare pooled results with the original as a form of sensitivity analysis
If results are similar, this suggests the original results are fairly robust
Consider whether MAR is a reasonable assumption
Consider whether you have included all factors (including the outcome) related to the missingness in the imputation model, as this is a crucial assumption

Summary

SPSS now includes multiple imputation in its armoury

Consider assumptions of MI

Compare results under different assumptions to assess robustness of results

If the MAR assumption is reasonable, then MI provides results that are less biased than complete case analysis
