
ANOVA

Jeremy Sumner
Maths and Physics, University of Tasmania

KMA711, June 2016

Useful resources

http://rtutorialseries.blogspot.com.au/
http://www.r-tutor.com/elementary-statistics/analysis-variance
http://www.statmethods.net/stats/anova.html
http://www.r-bloggers.com/one-way-analysis-of-variance-anova/
http://www.stat.columbia.edu/martin/W2024/R3.pdf

Time to coagulation vs. diet

When is ANOVA a useful tool?

DATA:

Categorical explanatory variable(s)

Continuous response variable

e.g. Time to blood coagulation (response variable) under different diets (explanatory variable).

Analysis goals: Is the response variable significantly affected by the different levels (A, B, C, D) of the explanatory variable?
i.e. Does diet affect time to coagulation?

ANOVA basics and assumptions

ANOVA is pretty much the same idea as regression, except the explanatory variables are categorical/factors.

Coagulation Time vs. Diet example
DATA is Time = y_ij, for Diet i = A, B, C, D and observation j
NULL: different groups/levels for the categorical/factor variables make no difference to the response.
ASSUMPTION: ANOVA assumes the response variable is normally distributed with identical variance about the group means.
The idea is to compare the within-group to the between-group variation.
Simple ANOVA model: y_ij = μ_i + ε_ij, where the errors ε_ij are independent and N(0, σ²).

ANOVA in a nutshell

Question: How can we statistically quantify what we see in the box plots?
Answer:
1. Use the data to compute the summary statistic F
2. Assuming the NULL (no effect), use F to compute a p-value

What is a p-value again?

A p-value is the probability of observing the data (or more extreme) if the NULL hypothesis is true.
i.e. What is the probability of seeing box plots like this if diet makes no difference?

ANOVA basics
Time to coagulation by Diet

NULL: μ_A = μ_B = μ_C = μ_D
Groups normal about their mean
ALT: μ_i ≠ μ_j for some i, j
Boxes: within-group variation
Averages: between-group variation
Model: y_ij = μ_i + random error

ANOVA in a nutshell PART 2

Within variation less than between variation means the diet choice probably matters.
This implies a small p-value, so the NULL hypothesis is probably false.

Summary statistics
DATA assumptions
y_ij = μ_i + ε_ij = μ + τ_i + ε_ij
y_ij is the jth sample from the ith group
μ is the grand mean
μ_i = μ + τ_i are the group means
ε_ij ~ N(0, σ²)

Group   Data                      Dist
1       y_11, y_12, ..., y_1N1    N(μ_1, σ²)
2       y_21, y_22, ..., y_2N2    N(μ_2, σ²)
3       y_31, y_32, ..., y_3N3    N(μ_3, σ²)
...     ...                       ...
k       y_k1, y_k2, ..., y_kNk    N(μ_k, σ²)

ANOVA summary statistics

Sample means: ȳ_i = (1/N_i) Σ_{j=1}^{N_i} y_ij

Sample variances: s_i² = (1/(N_i − 1)) Σ_{j=1}^{N_i} (y_ij − ȳ_i)²

Group   Size    Mean    Variance
1       N_1     ȳ_1     s_1²
2       N_2     ȳ_2     s_2²
3       N_3     ȳ_3     s_3²
...     ...     ...     ...
k       N_k     ȳ_k     s_k²
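These group summaries map directly onto code. A minimal sketch (using two illustrative groups of coagulation-style numbers, values assumed) computing each group's size, mean, and variance, plus the pooled variance that feeds into MSE:

```python
from statistics import mean, variance

# Illustrative data: two groups of coagulation-style times (values assumed)
groups = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
}

# (N_i, ybar_i, s_i^2) for each group
summaries = {g: (len(y), mean(y), variance(y)) for g, y in groups.items()}

# Pooled variance: s_p^2 = sum_i (N_i - 1) s_i^2 / sum_i (N_i - 1)
sp2 = sum((n - 1) * s2 for n, _, s2 in summaries.values()) \
    / sum(n - 1 for n, _, _ in summaries.values())
```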

ANOVA in a nutshell PART 2

Within variation: MSE should roughly equal σ²
Between variation: MSGroups should roughly equal σ² under the NULL
F statistic: F = MSGroups / MSE ≈ 1?
p-value is the probability of observing an F at least this large under the NULL

Some equations and the F-test

Use the s_i² to obtain a pooled estimate of σ²:

MSE = SSE / (N − k)
    = [(N_1 − 1)s_1² + (N_2 − 1)s_2² + ... + (N_k − 1)s_k²] / Σ_i (N_i − 1)
    = Σ_{i=1}^k Σ_{j=1}^{N_i} (y_ij − ȳ_i)² / (N − k)
    = (total variation around group means) / (# data points − # of means computed)
    = s_p²

Under the NULL, ȳ_i ~ N(μ, σ²/N_i), so an independent estimate of σ²:

MSGroups = SSGroups / (k − 1)
         = Σ_{i=1}^k N_i (ȳ_i − ȳ)² / (k − 1)
         = (variation of group means about the grand mean) / (# groups − 1)

Total variation: SST = Σ_{i,j} (y_ij − ȳ)²

MIRACLE OF ANOVA: SST = SSGroups + SSE

Under the NULL, the ratio F = MSGroups / MSE should be F-distributed about 1, with numerator DF (k − 1) and denominator DF (N − k).
SSGroups is what is explained by the separate group means and SSE is what is left over.
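These sums of squares can be checked numerically. The slides don't list the raw Diet data, so this sketch assumes the classic blood-coagulation values (Box, Hunter & Hunter), which reproduce the SS and F values in the R output for the Diet data:

```python
from statistics import mean

# Assumed coagulation data: four diets (Box, Hunter & Hunter values)
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}
all_y = [y for ys in diets.values() for y in ys]
grand = mean(all_y)
k, N = len(diets), len(all_y)

ss_groups = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in diets.values())
sse = sum(sum((y - mean(ys)) ** 2 for y in ys) for ys in diets.values())
sst = sum((y - grand) ** 2 for y in all_y)
assert abs(sst - (ss_groups + sse)) < 1e-9  # the "miracle": SST = SSGroups + SSE

ms_groups = ss_groups / (k - 1)  # between-group mean square
mse = sse / (N - k)              # within-group mean square (pooled variance)
F = ms_groups / mse              # compare to F(k-1, N-k); p-value via pf() in R
```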

Simple ANOVA outputs

Generic table

Source    df       SS                              MS
Between   k − 1    SSGroups = Σ_i N_i (ȳ_i − ȳ)²   SSGroups / (k − 1)
Within    N − k    SSE = Σ_i (N_i − 1) s_i²        SSE / (N − k)
Total     N − 1    SST = Σ_{i,j} (y_ij − ȳ)²

F = MSGroups / MSE

R output for the Diet data

            Df Sum Sq Mean Sq F value   Pr(>F)
Diet         3    228    76.0   13.57 4.66e-05 ***
Residuals   20    112     5.6

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ANOVA as regression
Consider a regression model on 4 diets:
time = β_0·(diet A) + β_1·(diet B) + β_2·(diet C) + β_3·(diet D) + error
(diet A) is the indicator function: = 0 or 1
In R's default (treatment) coding: β_0 = μ_A, β_1 = (μ_B − μ_A), β_2 = (μ_C − μ_A), β_3 = (μ_D − μ_A)

Regression style outputs
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.100e+01  1.183e+00  51.554  < 2e-16 ***
DietB       5.000e+00  1.528e+00   3.273 0.003803 **
DietC       7.000e+00  1.528e+00   4.583 0.000181 ***
DietD      -3.333e-15  1.449e+00   0.000 1.000000

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
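With treatment coding the fitted coefficients are just differences of group means, which is easy to verify directly. A sketch (again assuming the Box, Hunter & Hunter coagulation values) recovering the Estimate column:

```python
from statistics import mean

# Assumed coagulation data (Box, Hunter & Hunter); diet A is R's baseline level
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

intercept = mean(diets["A"])  # beta_0 = mu_A
coefs = {d: mean(ys) - intercept for d, ys in diets.items() if d != "A"}
# DietD comes out exactly 0 here; R's -3.333e-15 is pure floating-point noise
```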

Plot checks for ANOVA assumptions

Visual checks for heteroscedasticity, non-linearity, normality, and leverage:

Test for ANOVA assumptions

Bartlett test for equal variance across groups:
bartlett.test(Time ~ Diet, data = d)
Essentially simultaneously compares the ratio of the pooled variance to the variance for each group.
This test is sensitive to departures from normality.
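Bartlett's statistic is simple enough to compute by hand. A sketch of the K-squared statistic that `bartlett.test` reports (data values assumed, as in the earlier sketches):

```python
import math
from statistics import variance

# Assumed coagulation groups (same illustrative numbers as earlier)
groups = [[62, 60, 63, 59], [63, 67, 71, 64, 65, 66],
          [68, 66, 71, 67, 68, 68], [56, 62, 60, 61, 63, 64, 63, 59]]
k = len(groups)
N = sum(len(g) for g in groups)
s2 = [variance(g) for g in groups]                                  # group variances
sp2 = sum((len(g) - 1) * v for g, v in zip(groups, s2)) / (N - k)   # pooled

# Bartlett's K-squared: compare to chi-square with k - 1 df
T = (N - k) * math.log(sp2) - sum((len(g) - 1) * math.log(v)
                                  for g, v in zip(groups, s2))
C = 1 + (sum(1 / (len(g) - 1) for g in groups) - 1 / (N - k)) / (3 * (k - 1))
K2 = T / C
```

A large K2 relative to the chi-square(k − 1) distribution signals unequal variances.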

Pairwise comparisons: A vs. B? A vs. C? B vs. D? ...

Which diet is best: A, B, C, or D?
Imagine doing multiple pairwise comparisons μ_i = μ_j using a t-test at a 95% = 100(1 − α)% confidence level
Type I error: chance of rejecting the null hypothesis when we shouldn't have, i.e. α = 0.05
We are guaranteed to stuff up 5% of the time!
Under multiple tests, we will eventually make a Type I error
Bonferroni: multiply each p-value by the # of tests k
Conservative: the chance of observing at least one of k events is less than the sum of the probs for each event, i.e. < kα
Holm: multiply the smallest p-value by k, the 2nd smallest by k − 1, etc.
Tukey HSD: corrects for multiple comparisons; returns confidence intervals for differences.
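The Bonferroni and Holm corrections are one-liners; here is a sketch of the Holm step-down adjustment just described (the three p-values in the example are hypothetical):

```python
def holm_adjust(pvals):
    """Holm: multiply the smallest p by k, the 2nd smallest by k - 1, etc.,
    keeping the adjusted values monotone and capped at 1.
    (Bonferroni would simply be min(1, k * p) for every p.)"""
    k = len(pvals)
    order = sorted(range(k), key=lambda i: pvals[i])
    adjusted = [0.0] * k
    running = 0.0
    for rank, i in enumerate(order):
        running = max(running, min(1.0, (k - rank) * pvals[i]))
        adjusted[i] = running
    return adjusted

# e.g. three hypothetical pairwise p-values
adj = holm_adjust([0.01, 0.04, 0.03])
```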

Contrast choice and data dredging

Standard ANOVA null hypothesis: μ_1 = μ_2 = ... = μ_k.

Contrasts: e.g. μ_1 − μ_2 = 0 and μ_1 − ½(μ_2 + μ_3) = 0.
Often the treatment structure will suggest useful contrasts.
Snail tissue, {LL, LH, HL, HH}: humidity L/H, and temp L/H.

Contrast                          Comparison
½(μ_LH + μ_HH) − ½(μ_HL + μ_LL)   Temperature matters
½(μ_HL + μ_HH) − ½(μ_LH + μ_LL)   Humidity matters
½(μ_LL + μ_HH) − ½(μ_LH + μ_HL)   "Same" matters

Be VERY careful if using data to suggest a contrast!
In the lab we look at planned and un-planned comparisons.

What to do when data is not normal?

Is it possible to apply a transformation to the response variable?

√y_ij or y_ij² or log(y_ij)?

Box-Cox:
y_ij^(λ) = (y_ij^λ − 1)/λ   if λ ≠ 0
         = log(y_ij)        if λ = 0

And worry a bit more to find the best sensible λ using ML.

Welch's method (use when the Bartlett test says unequal variances): essentially a series of t-tests, but it doesn't pool variances across groups.
In R: pairwise.t.test(d$Time, d$Diet, pool.sd=F), which returns adjusted p-values.
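Welch's statistic looks like this in code. A sketch (R and scipy get the p-value from a t distribution with the Welch-Satterthwaite df; the two groups below are assumed diet values):

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t: no pooled variance, so unequal group variances are fine."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximate degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# e.g. two assumed diet groups from the coagulation example
t_AB, df_AB = welch_t([62, 60, 63, 59], [63, 67, 71, 64, 65, 66])
```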

What to do when data is not normal?

To the rescue! Non-parametric tests

Careful! Valid for a wider range of distributions, but you lose power.
Skewed data or extreme outliers?

Analysis of medians:
Kruskal-Wallis: rank based; take the average rank for each group, and the variation in these rank-averages is analysed
Mood's median test: contingency table; greater than the grand median? less than? χ² test
Both assume the groups have the same shaped distribution, and Kruskal-Wallis is more powerful than Mood's
Pairwise Wilcoxon: assumes roughly symmetric distributions; rank based, Holm-adjusted p-values
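The Kruskal-Wallis statistic follows exactly the recipe above: pool, rank (mid-ranks for ties), and measure the spread of the per-group average ranks. A pure-Python sketch (no tie correction in this version):

```python
def kruskal_wallis_H(groups):
    """Kruskal-Wallis H: rank everything, average ranks per group, and
    measure the variation of those rank-averages. Compare to chi-square
    with (#groups - 1) df. (Sketch: omits the tie correction factor.)"""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    rank_sum = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid = (i + 1 + j) / 2          # mid-rank for the tied run i+1..j
        for _, gi in pooled[i:j]:
            rank_sum[gi] += mid
        i = j
    # H = 12/(n(n+1)) * sum_i N_i (Rbar_i - (n+1)/2)^2
    return 12 / (n * (n + 1)) * sum(
        len(g) * (rank_sum[gi] / len(g) - (n + 1) / 2) ** 2
        for gi, g in enumerate(groups))
```

In R the equivalent call is kruskal.test(Time ~ Diet, data = d).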

Why not always use a non-parametric test?

It's easier to reject the NULL using tests with strong assumptions.
Type I error, α: rejected the NULL and shouldn't have
Type II error, β: kept the NULL and should have rejected
power = (1 − β), i.e. the power to reject.
                            Weak assumptions        Strong assumptions
Null hypothesis (or model)  (e.g. non-parametric)   (e.g. simple ANOVA)
# parameters                many (!?)               few
p-values                    large                   small
power                       low                     high
fit                         good                    bad
Type I rate                 low                     high
Type II rate                high                    low
Sample size                 large                   small
Bias                        low                     high
Variance                    high                    low

Factorial ANOVA

Effect of sleeping tablets AND alcohol.

More than the sum of the parts = interaction
Model: Response ~ Sleep Tab + Alcohol + Sleep Tab:Alcohol
Factorial design: all treatments in all combinations.
E.g. 5 people given nothing, 5 given sleep tabs, 5 given alcohol, and 5 given sleep tabs & alcohol (ST:A).
In factorial ANOVA, the main effects are the effects of each variable separately, but now we also have an interaction effect.
General model: y_ijk = μ + α_i + β_j + γ_ij + ε_ijk with the ε_ijk independent and N(0, σ²).
γ_ij is the interaction term.
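For a balanced design the factorial sums of squares decompose cleanly into main effects, interaction, and error. A sketch with invented 2x2 sleep-tablet x alcohol numbers (chosen so the combined cell is worse than additive, i.e. a visible interaction):

```python
from statistics import mean
from itertools import product

# Invented balanced 2x2 data: (sleep_tab, alcohol) -> 2 replicates each
cells = {
    (0, 0): [10, 12], (0, 1): [15, 17],
    (1, 0): [16, 18], (1, 1): [28, 30],  # worse than additive: interaction
}
a_levels, b_levels, n = [0, 1], [0, 1], 2

grand = mean(y for ys in cells.values() for y in ys)
mean_a = {a: mean(y for (ai, bi), ys in cells.items() if ai == a for y in ys)
          for a in a_levels}
mean_b = {b: mean(y for (ai, bi), ys in cells.items() if bi == b for y in ys)
          for b in b_levels}
mean_ab = {k: mean(ys) for k, ys in cells.items()}

# Main effects, interaction, and error sums of squares
ss_a = len(b_levels) * n * sum((mean_a[a] - grand) ** 2 for a in a_levels)
ss_b = len(a_levels) * n * sum((mean_b[b] - grand) ** 2 for b in b_levels)
ss_ab = n * sum((mean_ab[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                for a, b in product(a_levels, b_levels))
sse = sum((y - mean_ab[k]) ** 2 for k, ys in cells.items() for y in ys)
sst = sum((y - grand) ** 2 for ys in cells.values() for y in ys)
```

The partition SST = SS_A + SS_B + SS_AB + SSE is the two-factor analogue of the one-way "miracle" (it only holds this cleanly for balanced designs).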

Take home messages

You should always double check that your data satisfies the assumptions of the method you are applying.
The more you can assume the better, as you can use a more powerful test and hence reduce Type II error.
For ANOVA there is a sequence of assumptions across groups: normal with identical variance ... normal without identical variance ... not normal but same shape ... completely nuts.
The equal-means null hypothesis is a good start, but if it's false we always want to know more; this is where contrasts come in.
Multiple tests lead to an increased chance of Type I error.
Contrasts are great, but p-values must be corrected for multiple testing, AND don't use the data to suggest contrasts.
