CVEN2002/2702
Week 12
This lecture
11. ANOVA
11.1 Introduction
Introduction
In Chapter 10, we introduced testing procedures for comparing the means of two different populations, having observed two random samples drawn from those populations (two-sample z- and t-tests).

In applications, however, it is common to want to detect a difference among the means of more than two populations.

Imagine the following context: four groups of students were subjected to different teaching techniques and tested at the end of a specified period of time. Do the data shown in the table below present sufficient evidence to indicate a difference in mean achievement for the four teaching techniques?
Tech. 1   Tech. 2   Tech. 3   Tech. 4
65        75        59        94
87        69        78        89
73        83        67        80
79        81        62        88
81        72        83
69        79        76
          90
Introduction: randomisation
To answer this question, we should first note that the method of dividing the students into 4 groups is of vital importance.

For instance, basic visual inspection of the data suggests that the members of group 4 scored higher than those in the other groups. Can we conclude from this that teaching technique 4 is superior? Perhaps the students in group 4 are just better learners.

⇒ it is essential that we divide the students into 4 groups in such a way as to make it very unlikely that one of the groups is inherently superior to the others (regardless of the teaching technique it will be subjected to)

⇒ the only reliable method for doing this is to divide the students in a completely random fashion, to balance out the effect of any nuisance variable that may influence the variable of interest

This kind of consideration is part of a very important area of statistical modelling called experimental design, which is not addressed in this course (Chapter 10 in the textbook). In this course, we will always assume that the division of the individuals into the groups was indeed done at random.
Introduction
The sample means and standard deviations for the four groups:

      Tech. 1   Tech. 2   Tech. 3   Tech. 4
      65        75        59        94
      87        69        78        89
      73        83        67        80
      79        81        62        88
      81        72        83
      69        79        76
                90
x̄     75.67     78.43     70.83     87.75
s     8.17      7.11      9.58      5.80

[Figure: side-by-side boxplots of the achievement scores (60 to 95) for the four teaching techniques]
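These summary statistics are easy to verify; below is a minimal Python/numpy sketch (the course itself refers to MATLAB; the Python version here is only an illustration), assuming the data are typed in from the table above.

```python
import numpy as np

# Achievement scores for the four teaching techniques (Slide 5)
groups = [
    [65, 87, 73, 79, 81, 69],          # Tech. 1
    [75, 69, 83, 81, 72, 79, 90],      # Tech. 2
    [59, 78, 67, 62, 83, 76],          # Tech. 3
    [94, 89, 80, 88],                  # Tech. 4
]

for i, g in enumerate(groups, start=1):
    x = np.asarray(g, dtype=float)
    # ddof=1 gives the sample standard deviation (divisor n_i - 1)
    print(f"Tech. {i}: mean = {x.mean():.2f}, sd = {x.std(ddof=1):.2f}")
# Tech. 1: mean = 75.67, sd = 8.17 ... Tech. 4: mean = 87.75, sd = 5.80
```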
Two small illustrative datasets make the key idea apparent. In both, the group means are the same (5.50 for Group 2 and 5.00 for Group 3); only the within-group variability differs:

Group 2:  5.51   5.50   5.50   5.49   5.50
Group 3:  5.01   5.00   4.99   4.98   5.02

Group 2:  6.31   3.54   4.73   7.20   5.72
Group 3:  4.52   6.93   4.48   5.55   3.52

In the first dataset, the within-group variability is tiny compared to the difference between the group means, so the data strongly suggest a genuine difference between the groups. In the second, the same difference in means could easily be produced by the large within-group variability alone, so the evidence for a difference is much weaker.
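A quick sketch applying the F-test of this lecture (via scipy.stats.f_oneway, introduced formally below) to the two datasets confirms this intuition:

```python
from scipy import stats

# Group 2 and Group 3 from the two illustrative datasets above
tight  = ([5.51, 5.50, 5.50, 5.49, 5.50], [5.01, 5.00, 4.99, 4.98, 5.02])
spread = ([6.31, 3.54, 4.73, 7.20, 5.72], [4.52, 6.93, 4.48, 5.55, 3.52])

for name, (g2, g3) in (("tight", tight), ("spread", spread)):
    f, p = stats.f_oneway(g2, g3)
    print(f"{name}: F = {f:.2f}, p-value = {p:.4f}")
# tight:  F is enormous, p ≈ 0   -> clear difference between the groups
# spread: F ≈ 1.4, p ≈ 0.28     -> no convincing evidence of a difference
```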
Analysis of Variance
Comparing (intelligently) the between-group variability and the within-group variability is the purpose of the Analysis of Variance
⇒ often shortened to the acronym ANOVA

Suppose that we have k different groups (k populations, or k sub-populations of a population) that we wish to compare.

Often, each group is called a treatment or treatment level (general terms that can be traced back to the early applications of this methodology in the agricultural sciences).

The response for each of the k treatments is the random variable of interest, say X.

Denote by $X_{ij}$ the jth observation ($j = 1, \ldots, n_i$) taken under treatment i
⇒ we have k independent samples (one sample from each of the treatments)
ANOVA samples
The k random samples are often presented as:

Treatment    1            2            ...   k
             $X_{11}$     $X_{21}$     ...   $X_{k1}$
             $X_{12}$     $X_{22}$     ...   $X_{k2}$
             ...          ...                ...
             $X_{1n_1}$   $X_{2n_2}$   ...   $X_{kn_k}$
Mean         $\bar X_1$   $\bar X_2$   ...   $\bar X_k$
St. Dev.     $S_1$        $S_2$        ...   $S_k$

where $\bar X_i$ and $S_i$ are the sample mean and standard deviation of the ith sample. The total number of observations is
$$n = n_1 + n_2 + \ldots + n_k$$
and the grand mean of all the observations, usually denoted $\bar X$, is
$$\bar X = \frac{1}{n} \sum_{i=1}^k \sum_{j=1}^{n_i} X_{ij} = \frac{n_1 \bar X_1 + n_2 \bar X_2 + \ldots + n_k \bar X_k}{n}$$
ANOVA model
The ANOVA model is the following:
$$X_{ij} = \mu_i + \epsilon_{ij}$$
where
$\mu_i$ is the mean response for the ith treatment ($i = 1, 2, \ldots, k$)
$\epsilon_{ij}$ is an individual random error component ($j = 1, 2, \ldots, n_i$)

As usual for errors, we will assume that the random variables $\epsilon_{ij}$ are normally and independently distributed with mean 0 and variance $\sigma^2$:
$$\epsilon_{ij} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) \text{ for all } i, j \qquad \Longrightarrow \qquad X_{ij} \sim \mathcal{N}(\mu_i, \sigma^2)$$

Important: the variance $\sigma^2$ is common to all treatments
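To make the model concrete, here is a minimal simulation sketch. The parameter values below are illustrative assumptions (loosely based on the teaching example), not estimates from the lecture data.

```python
import numpy as np

rng = np.random.default_rng(1)

mu    = [75, 78, 71, 88]   # hypothetical treatment means mu_i
sizes = [6, 7, 6, 4]       # group sizes n_i
sigma = 8.0                # common standard deviation: the key ANOVA assumption

# One draw from the model X_ij = mu_i + eps_ij with eps_ij ~ N(0, sigma^2)
samples = [m + sigma * rng.standard_normal(n) for m, n in zip(mu, sizes)]
```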
ANOVA hypotheses
We are interested in detecting differences between the different treatment means $\mu_i$, which are population parameters
⇒ hypothesis test!

The null hypothesis to be tested is
$$H_0: \mu_1 = \mu_2 = \ldots = \mu_k$$
versus the general alternative
$$H_a: \text{not all the means are equal}$$

Careful! The alternative hypothesis is that at least two of the means differ, not that they are all different!

As pointed out previously, the primary tool when testing for equality of the means is based on a comparison of the variances within the groups and between the groups.
Variability decomposition
The ANOVA partitions the total variability in the sample data, described by the total sum of squares
$$SS_{Tot} = \sum_{i=1}^k \sum_{j=1}^{n_i} (X_{ij} - \bar X)^2 \qquad (df = n - 1)$$
into a treatment sum of squares
$$SS_{Tr} = \sum_{i=1}^k n_i (\bar X_i - \bar X)^2 \qquad (df = k - 1)$$
and an error sum of squares
$$SS_{Er} = \sum_{i=1}^k \sum_{j=1}^{n_i} (X_{ij} - \bar X_i)^2 \qquad (df = n - k),$$
that is,
$$SS_{Tot} = SS_{Tr} + SS_{Er} \qquad \text{(Proof: exercise)}$$
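The decomposition can be checked numerically on the teaching-techniques data from Slide 5; a minimal numpy sketch:

```python
import numpy as np

groups = [np.asarray(g, dtype=float) for g in (
    [65, 87, 73, 79, 81, 69], [75, 69, 83, 81, 72, 79, 90],
    [59, 78, 67, 62, 83, 76], [94, 89, 80, 88])]

allx  = np.concatenate(groups)
gmean = allx.mean()                                            # grand mean
ss_tot = ((allx - gmean) ** 2).sum()
ss_tr  = sum(len(g) * (g.mean() - gmean) ** 2 for g in groups)
ss_er  = sum(((g - g.mean()) ** 2).sum() for g in groups)
print(ss_tot, ss_tr + ss_er)       # identical: SSTot = SSTr + SSEr
# ss_tr ≈ 712.6, ss_er ≈ 1196.6, ss_tot ≈ 1909.2
```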
Variability decomposition
The total sum of squares $SS_{Tot}$ quantifies the total amount of variation contained in the global sample.

The treatment sum of squares $SS_{Tr}$ quantifies the variation between the groups, that is, the variation between the means of the groups (giving more weight to groups with more observations).

The error sum of squares $SS_{Er}$ quantifies the variation within the groups.
[Figure: the same data (scores 60 to 95) viewed as the global sample, the treatment sample (the group means $\bar X_i$), and the error samples (the observations relative to their group means)]
Error mean square
The sample variance of the ith sample is
$$S_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (X_{ij} - \bar X_i)^2,$$
so that
$$SS_{Er} = \sum_{i=1}^k \sum_{j=1}^{n_i} (X_{ij} - \bar X_i)^2 = \sum_{i=1}^k (n_i - 1) S_i^2,$$
hence, since each group has variance $\sigma^2$,
$$E(SS_{Er}) = \sum_{i=1}^k (n_i - 1) E(S_i^2) = \sigma^2 \sum_{i=1}^k (n_i - 1) = (n - k)\,\sigma^2$$
⇒ the error mean square
$$MS_{Er} = \frac{SS_{Er}}{n - k}$$
is an unbiased estimator of $\sigma^2$, whether or not $H_0$ is true.
Treatment mean square
If $H_0$ is true, all observations have the same mean and each $\bar X_i \sim \mathcal{N}(\mu, \sigma^2/n_i)$
⇒ $\sqrt{n_1}\,\bar X_1, \sqrt{n_2}\,\bar X_2, \ldots, \sqrt{n_k}\,\bar X_k$ behave like a random sample, whose sample variance
$$\frac{1}{k-1} \sum_{i=1}^k n_i (\bar X_i - \bar X)^2 = \frac{SS_{Tr}}{k - 1}$$
estimates $\sigma^2$
⇒ the treatment mean square
$$MS_{Tr} = \frac{SS_{Tr}}{k - 1}$$
is an unbiased estimator of $\sigma^2$ when $H_0$ is true.
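A small Monte Carlo sketch checking these two facts, under the assumption that H0 holds (all means equal) and with an arbitrary σ = 8:

```python
import numpy as np

rng = np.random.default_rng(42)
k, sizes, sigma = 4, [6, 7, 6, 4], 8.0   # arbitrary design; H0 true
n = sum(sizes)

ms_tr_vals, ms_er_vals = [], []
for _ in range(20000):
    gs = [sigma * rng.standard_normal(m) for m in sizes]   # mu_i = 0 for all i
    allx = np.concatenate(gs)
    gmean = allx.mean()
    ss_tr = sum(m * (g.mean() - gmean) ** 2 for g, m in zip(gs, sizes))
    ss_er = sum(((g - g.mean()) ** 2).sum() for g in gs)
    ms_tr_vals.append(ss_tr / (k - 1))
    ms_er_vals.append(ss_er / (n - k))

# Both averages settle near sigma^2 = 64 when H0 is true
print(np.mean(ms_tr_vals), np.mean(ms_er_vals))
```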
ANOVA test
Thus we have two potential estimators of $\sigma^2$:
1. $MS_{Er}$, which always estimates $\sigma^2$
2. $MS_{Tr}$, which estimates $\sigma^2$ only when $H_0$ is true

Actually, if $H_0$ is not true, $MS_{Tr}$ tends to exceed $\sigma^2$, as we have
$$E(MS_{Tr}) = \sigma^2 + \text{true variance between the groups}$$
⇒ the idea of the ANOVA test now takes shape.

Suppose we have observed k samples $x_{i1}, x_{i2}, \ldots, x_{in_i}$, for $i = 1, 2, \ldots, k$, from which we can find through calculations the observed values $ms_{Tr}$ and $ms_{Er}$. Then:
if $ms_{Tr} \simeq ms_{Er}$, then $H_0$ is probably reasonable
if $ms_{Tr} \gg ms_{Er}$, then $H_0$ should be rejected
⇒ this will thus be a one-sided hypothesis test.

We need to determine what "$ms_{Tr} \gg ms_{Er}$" means, so as to obtain a hypothesis test at a given significance level.
Sampling distribution
It can be shown that, if $H_0$ is true, the ratio
$$F = \frac{MS_{Tr}}{MS_{Er}} = \frac{SS_{Tr}/(k-1)}{SS_{Er}/(n-k)}$$
follows the F-distribution with $k - 1$ and $n - k$ degrees of freedom: $F \sim \mathsf{F}_{k-1,\,n-k}$.
The F-distribution
A random variable following the $\mathsf{F}_{d_1,d_2}$ distribution has a density $f(x)$ that is positive for $x > 0$, so its support is $S_X = [0, +\infty)$.
Some F-distributions
[Figure: densities f(x) (left) and cdfs F(x) (right) of the F-distributions with $(d_1, d_2)$ = (3,10), (4,4), (100,6) and (4,100)]
If $X \sim \mathsf{F}_{d_1,d_2}$, then
$$E(X) = \frac{d_2}{d_2 - 2} \text{ for } d_2 > 2, \qquad \text{Var}(X) = \frac{2 d_2^2 (d_1 + d_2 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)} \text{ for } d_2 > 4$$
[Figure: density f(x) of the $\mathsf{F}_{d_1,d_2}$ distribution, with the quantiles $f_{d_1,d_2;\alpha}$ marked]

The quantiles satisfy
$$f_{d_1,d_2;\alpha} = \frac{1}{f_{d_2,d_1;1-\alpha}}$$
For any $d_1$ and $d_2$, the main quantiles of interest may be found in the F-distribution critical values tables.
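With software these quantiles are a one-liner; a sketch using scipy.stats.f (the values of d1, d2 and α below are arbitrary illustrations, and the quantile identity above is checked numerically):

```python
from scipy import stats

d1, d2, alpha = 3, 19, 0.05

# Upper quantile f_{d1,d2;1-alpha}: ppf is the inverse cdf
print(stats.f.ppf(1 - alpha, d1, d2))          # ≈ 3.1274

# Numerical check of f_{d1,d2;alpha} = 1 / f_{d2,d1;1-alpha}
print(stats.f.ppf(alpha, d1, d2))              # lower quantile...
print(1 / stats.f.ppf(1 - alpha, d2, d1))      # ...equals this reciprocal
```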
ANOVA test
The null hypothesis to test is $H_0: \mu_1 = \mu_2 = \ldots = \mu_k$
versus the general alternative $H_a$: not all the means are equal.

Evidence against $H_0$ is shown if $MS_{Tr} \gg MS_{Er}$, so we will reject $H_0$ whenever $MS_{Tr}$ is much larger than $MS_{Er}$. Since, under $H_0$,
$$\frac{MS_{Tr}}{MS_{Er}} \sim \mathsf{F}_{k-1,\,n-k},$$
the decision rule at significance level $\alpha$ is:
$$\text{reject } H_0 \text{ if } \frac{ms_{Tr}}{ms_{Er}} > f_{k-1,n-k;1-\alpha}$$
The p-value of the test is
$$p = P(X > f_0), \qquad \text{where } X \sim \mathsf{F}_{k-1,\,n-k} \text{ and } f_0 = \frac{ms_{Tr}}{ms_{Er}}$$
(the probability that the test statistic will take on a value that is at least as extreme as the observed value when $H_0$ is true, definition on Slide 21, Week 9)
⇒ from the F-distribution table, only bounds can be found for this p-value (use software to get an exact value)

[Figure: F density with the p-value p shown as the area to the right of the observed value $f_0$]
ANOVA table
The computations for this test are usually summarised in tabular form:

Source      degrees of freedom   sum of squares   mean square          F-statistic
Treatment   dfTr = k - 1         ssTr             msTr = ssTr/(k-1)    f0 = msTr/msEr
Error       dfEr = n - k         ssEr             msEr = ssEr/(n-k)
Total       dfTot = n - 1        ssTot
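A minimal sketch of a Python function assembling these quantities from a list of samples (anova_table is our own helper for illustration, not a library routine):

```python
import numpy as np
from scipy import stats

def anova_table(groups):
    """One-way ANOVA quantities for a list of samples (a minimal sketch)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n = sum(len(g) for g in groups)
    gmean = np.concatenate(groups).mean()                  # grand mean
    ss_tr = sum(len(g) * (g.mean() - gmean) ** 2 for g in groups)
    ss_er = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_tr, ms_er = ss_tr / (k - 1), ss_er / (n - k)
    f0 = ms_tr / ms_er
    p = stats.f.sf(f0, k - 1, n - k)                       # sf(x) = P(F > x)
    return {"ssTr": ss_tr, "ssEr": ss_er, "msTr": ms_tr,
            "msEr": ms_er, "f0": f0, "p": p}
```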
ANOVA: example
Example
Consider the data shown on Slide 5. Test at significance level $\alpha = 0.05$ the null hypothesis that there is no difference in mean achievement for the four teaching techniques.

We have $k = 4$, $n_1 = 6$, $n_2 = 7$, $n_3 = 6$ and $n_4 = 4$, with $\bar x_1 = 75.67$, $\bar x_2 = 78.43$, $\bar x_3 = 70.83$, $\bar x_4 = 87.75$ and $s_1 = 8.17$, $s_2 = 7.11$, $s_3 = 9.58$, $s_4 = 5.80$. Besides,
$$n = 6 + 7 + 6 + 4 = 23 \qquad \text{and} \qquad \bar x = \frac{1}{n} \sum_{i=1}^4 n_i \bar x_i = 77.35$$
ANOVA: example
From there, the ANOVA table can be easily completed:

Source       degrees of freedom   sum of squares    mean square     F-statistic
Treatments   dfTr = 3             ssTr = 712.59     msTr = 237.53   f0 = 3.77
Error        dfEr = 19            ssEr = 1196.63    msEr = 62.98
Total        dfTot = 22           ssTot = 1909.22
ANOVA: example
According to MATLAB, $f_{3,19;0.95} = 3.1274$ (in the table: $f_{3,20;0.95} = 3.10$)
⇒ the decision rule is: reject $H_0$ if $\frac{ms_{Tr}}{ms_{Er}} > 3.1274$

Here, $\frac{ms_{Tr}}{ms_{Er}} = 3.77$
⇒ reject $H_0$: at the 5% significance level, the data indicate a difference in mean achievement for the four teaching techniques. The p-value is $p = P(X > 3.77)$ for $X \sim \mathsf{F}_{3,19}$ (software gives $p \simeq 0.028$).
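The whole example can be reproduced in a few lines; a sketch using scipy, where f_oneway computes f0 and the exact p-value directly:

```python
from scipy import stats

g1 = [65, 87, 73, 79, 81, 69]
g2 = [75, 69, 83, 81, 72, 79, 90]
g3 = [59, 78, 67, 62, 83, 76]
g4 = [94, 89, 80, 88]

f0, p = stats.f_oneway(g1, g2, g3, g4)
print(f"f0 = {f0:.2f}, p-value = {p:.4f}")   # f0 ≈ 3.77, p ≈ 0.03
print(stats.f.ppf(0.95, 3, 19))              # critical value ≈ 3.1274
```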
For each treatment, $\bar X_i \sim \mathcal{N}(\mu_i, \sigma^2/n_i)$. The value of $\sigma^2$ is unknown; however, we have (numerous!) estimators for it.
[Figure: achievement scores (60 to 100) for each teaching technique (1 to 4)]
Successively testing $H_0^{(i,j)}: \mu_i = \mu_j$ (Slide 44, Week 9), that is, first $H_0^{(1,2)}: \mu_1 = \mu_2$, then $H_0^{(1,3)}: \mu_1 = \mu_3$, and so on up to $H_0^{(k-1,k)}: \mu_{k-1} = \mu_k$, with two-sample t-tests.
Pairwise t-tests on the teaching-techniques data:

t-test for $H_0: \mu_1 = \mu_2$ ⇒ p-value = 0.5276
t-test for $H_0: \mu_1 = \mu_3$ ⇒ p-value = 0.3691
t-test for $H_0: \mu_1 = \mu_4$ ⇒ p-value = 0.0346
t-test for $H_0: \mu_2 = \mu_3$ ⇒ p-value = 0.1293
t-test for $H_0: \mu_2 = \mu_4$ ⇒ p-value = 0.0537
t-test for $H_0: \mu_3 = \mu_4$ ⇒ p-value = 0.0139
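These are pooled-variance two-sample t-tests; a sketch reproducing the six p-values with scipy.stats.ttest_ind:

```python
from itertools import combinations
from scipy import stats

groups = {1: [65, 87, 73, 79, 81, 69], 2: [75, 69, 83, 81, 72, 79, 90],
          3: [59, 78, 67, 62, 83, 76], 4: [94, 89, 80, 88]}

for i, j in combinations(groups, 2):
    # Pooled-variance two-sample t-test of H0: mu_i = mu_j
    t, p = stats.ttest_ind(groups[i], groups[j], equal_var=True)
    print(f"H0: mu_{i} = mu_{j}  ->  p-value = {p:.4f}")
# matches the p-values listed above (0.5276, 0.3691, ..., 0.0139)
```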
Residuals analysis
The normality assumption can be checked by constructing a normal quantile plot of the residuals.

The assumption of equal variances in each group can be checked by plotting the residuals against the treatment level (that is, against $\bar x_i$)
⇒ the spread in the residuals should not depend in any way on $\bar x_i$

A rule of thumb is that, if the ratio of the largest sample standard deviation to the smallest one is smaller than 2, the assumption of equal population variances is reasonable.

The assumption of independence can be checked by plotting the residuals against time, if this information is available
⇒ no pattern, such as sequences of positive and negative residuals, should be observed

As in regression, the residuals are everything the model does not account for ⇒ no information should be observed in the residuals; they should look like random noise. A code sketch reproducing such diagnostic plots follows the figure below.
[Figure: normal quantile plot of the residuals (left) and residuals against the group means $\bar x_i$, 65 to 95 (right), for the teaching-techniques ANOVA]
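A sketch reproducing diagnostic plots of this kind with scipy and matplotlib, with the residuals computed as $x_{ij} - \bar x_i$ from the teaching-techniques data:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

groups = [np.asarray(g, dtype=float) for g in (
    [65, 87, 73, 79, 81, 69], [75, 69, 83, 81, 72, 79, 90],
    [59, 78, 67, 62, 83, 76], [94, 89, 80, 88])]

# Residuals: each observation minus its own group mean
resid  = np.concatenate([g - g.mean() for g in groups])
fitted = np.concatenate([np.full(len(g), g.mean()) for g in groups])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
stats.probplot(resid, dist="norm", plot=ax1)   # normal quantile plot
ax2.scatter(fitted, resid)                     # spread vs group mean
ax2.axhline(0, linestyle="--")
ax2.set_xlabel("group mean"); ax2.set_ylabel("residual")
plt.show()
```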
Another example: bending parameter measurements for lumber of three wood types, sampled across six mills (two of the three samples shown):

Hem (2)    Spruce (3)
381        440
401        210
175        230
185        400
374        386
390        410
The one-way ANOVA table for these data:

Source      degrees of freedom   sum of squares   mean square   F-statistic
Treatment   2                    7544             3772          0.33
Error       15                   172929           11529
Total       17                   180474

⇒ with f0 = 0.33, this one-way ANOVA detects no significant difference between the wood types.
[Figure: normal quantile plot of the residuals (left) and residuals (-150 to 100) against the group means, 280 to 350 (right), for the wood-type ANOVA]
Blocking factor
[Figure: bending parameter (150 to 450) against tree type, with the observations from each mill (Mill 1 to Mill 6) joined by lines]
Blocking factor
It is clear that, over and above the wood type, the mill from which the lumber was selected is another source of variability, in this example even more important than the main treatment of interest (the wood type).

This kind of extra source of variability is known as a blocking factor, as it essentially groups some observations into blocks across the initial groups ⇒ the samples are not independent! (assumption violation)
⇒ a potential blocking factor must be taken into account!

When a blocking factor is present, the initial error sum of squares, say $SS_{Er}^*$, that is, the whole amount of variability not due to the treatment, can in turn be partitioned into:
1. the variability due to the blocking factor, quantified by $SS_{Block}$
2. the true natural variability in the observations, $SS_{Er}$

We can write $SS_{Er}^* = SS_{Block} + SS_{Er}$, and thus
$$SS_{Tot} = SS_{Tr} + SS_{Block} + SS_{Er}$$
Blocking factor
The ANOVA table becomes (for b blocks):

Source      degrees of freedom   sum of squares   mean square                F-statistic
Treatment   k - 1                ssTr             msTr = ssTr/(k-1)          f0 = msTr/msEr
Block       b - 1                ssBlock          msBlock = ssBlock/(b-1)
Error       n - k - b + 1        ssEr             msEr = ssEr/(n-k-b+1)
Total       n - 1                ssTot
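For a balanced layout with one observation per treatment-block cell, the blocked table can be computed directly; a minimal numpy sketch (block_anova is our own helper for illustration, assuming the data are arranged as a k × b array):

```python
import numpy as np
from scipy import stats

def block_anova(x):
    """Randomised block ANOVA for a k x b array x (treatments in rows,
    blocks in columns, one observation per cell). A minimal sketch."""
    k, b = x.shape
    n = k * b                      # note n - k - b + 1 = (k-1)(b-1)
    gmean = x.mean()
    ss_tot   = ((x - gmean) ** 2).sum()
    ss_tr    = b * ((x.mean(axis=1) - gmean) ** 2).sum()   # treatment SS
    ss_block = k * ((x.mean(axis=0) - gmean) ** 2).sum()   # block SS
    ss_er    = ss_tot - ss_tr - ss_block                   # what is left over
    ms_tr = ss_tr / (k - 1)
    ms_er = ss_er / (n - k - b + 1)
    f0 = ms_tr / ms_er
    return f0, stats.f.sf(f0, k - 1, n - k - b + 1)
```

Applied to the full 3 wood types × 6 mills dataset, this would reproduce the blocked ANOVA table shown on the next slide.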
Blocking factor
In the previous example, we would have found:

Source      degrees of freedom   sum of squares   mean square   F-statistic
Treatment   2                    7544             3772          15.87
Block       5                    170552           34110
Error       10                   2378             238
Total       17                   180474
A large share of what the one-way ANOVA treated as error was in fact variability between the mills; the second ANOVA (with the blocking factor) adjusts for this effect.

The net effect is a substantial reduction in the genuine $MS_{Er}$, leading to a much larger F-statistic (increased from 0.33 to 15.87!)
⇒ with very little risk of being wrong ($p \simeq 0$), we can now conclude that there is a significant difference in the mean bending parameters for the three different wood types.

An analysis of the residuals in this second ANOVA would not show anything peculiar ⇒ valid conclusion.
Objectives
Now you should be able to:
conduct engineering experiments involving a treatment with a
certain number of levels
understand how the ANOVA is used to analyse the data from
these experiments
assess the ANOVA model adequacy with residual plots
understand the blocking principle and how it is used to isolate the
effect of nuisance factors
Recommended exercises: Q3, Q6 p.406, Q9 p.407, Q10, Q11 p.412,
Q13, Q15, Q17 p.413, Q19 p.414, Q22, Q23 p.415, Q35 p.428