
Lecture Notes

Chapter Ten: Analysis of Variance


Randall Miller

1. Elements of a Designed Experiment


Definition 10.1
The response variable is the variable of interest to be measured in the experiment. We also refer
to the response as the dependent variable.

Definition 10.2
Factors are those variables whose effect on the response is of interest to the experimenter.
Quantitative factors are measured on a numerical scale, whereas qualitative factors are not
(naturally) measured on a numerical scale.

Definition 10.3
Factor levels are the values of the factor utilized in the experiment.

Definition 10.4
The treatments of an experiment are the factor-level combinations utilized.

Definition 10.5
An experimental unit is the object on which the response and factors are observed or measured.

Definition 10.6
A designed experiment is an experiment in which the analyst controls the specification of the
treatments and the method of assigning the experimental units to each treatment. An
observational experiment is an experiment in which the analyst simply observes the treatments
and the response on a sample of experimental units.


2. The Completely Randomized Design


Definition 10.7
The completely randomized design is a design in which treatments are randomly assigned to the
experimental units or in which independent random samples of experimental units are selected for
each treatment.
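
As an illustration of the random assignment described in Definition 10.7, here is a minimal Python sketch. The function name `assign_completely_randomized`, the unit labels 1 to 12, and the treatment names "A", "B", "C" are hypothetical, and equal group sizes are assumed only to keep the example short.

```python
import random

def assign_completely_randomized(units, treatments, seed=None):
    """Randomly assign experimental units to treatments (equal group sizes assumed)."""
    units = list(units)
    if len(units) % len(treatments) != 0:
        raise ValueError("this sketch assumes the units divide evenly among treatments")
    rng = random.Random(seed)
    rng.shuffle(units)  # put the experimental units in random order
    size = len(units) // len(treatments)
    # Consecutive slices of the shuffled list go to successive treatments.
    return {t: units[i * size:(i + 1) * size] for i, t in enumerate(treatments)}

# Hypothetical example: 12 experimental units, 3 treatments.
assignment = assign_completely_randomized(range(1, 13), ["A", "B", "C"], seed=1)
```

Every unit ends up in exactly one treatment group, and which group it lands in is determined only by the shuffle.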

ANOVA F-test to Compare k Treatment Means: Completely Randomized Design


$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
$H_a$: At least two treatment means differ.

Test statistic: $F = \dfrac{\text{MST}}{\text{MSE}}$

Rejection region: $F > F_\alpha$, where $F_\alpha$ is based on $\nu_1 = (k - 1)$ numerator degrees of freedom (associated with MST) and $\nu_2 = (n - k)$ denominator degrees of freedom (associated with MSE).

Conditions required for a Valid ANOVA F-test: Completely Randomized Design


1. The samples are randomly selected in an independent manner from the k treatment
populations. (This can be accomplished by randomly assigning the experimental units to the
treatments.)
2. All k sampled populations have distributions that are approximately normal.
3. The k population variances are equal (i.e., $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$).

General ANOVA Summary Table for a Completely Randomized Design


Source        df       SS          MS                    F
Treatments    k - 1    SST         MST = SST/(k - 1)     MST/MSE
Error         n - k    SSE         MSE = SSE/(n - k)
Total         n - 1    SS(Total)

What Do You Do When the Assumptions are not Satisfied for the Analysis of
Variance for a Completely Randomized Design?
Answer: Use a nonparametric statistical method such as the Kruskal-Wallis H-test of Section 14.5.


Steps for Conducting an ANOVA for a Completely Randomized Design


1. Make sure that the design is truly completely randomized, with independent random samples
for each treatment.
2. Check the assumptions of normality and equal variances.
3. Create an ANOVA summary table that specifies the variability attributable to treatments
and error, making sure that these sources of variability lead to the calculation of the F-statistic for
testing the null hypothesis that the treatment means are equal in the population. Use a
statistical software package to obtain the numerical results. If no such package is available,
use the calculation formulas in Appendix B.
4. If the F-test leads to the conclusion that the means differ,
a. Conduct a multiple-comparisons procedure for as many of the pairs of means as you
wish to compare. (See Section 10.3.) Use the results to summarize the statistically
significant differences among the treatment means.
b. If desired, form confidence intervals for one or more individual treatment means.
5. If the F-test leads to the nonrejection of the null hypothesis that the treatment means are
equal, consider the following possibilities:
a. The treatment means are equal; that is, the null hypothesis is true.
b. The treatment means really differ, but other important factors affecting the response
are not accounted for by the completely randomized design. These factors inflate the
sampling variability, as measured by MSE, resulting in smaller values of the F-statistic.
Either increase the sample size for each treatment, or use a different
experimental design (as in Section 10.4) that accounts for the other factors affecting the
response.
[Note: Be careful not to automatically conclude that the treatment means are equal, since the
possibility of a Type II error must be considered if you accept $H_0$.]


Formulas for the Calculations in the Completely Randomized Design


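CM = Correction for mean
$$\text{CM} = \frac{(\text{Total of all observations})^2}{\text{Total number of observations}} = \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$

SS(Total) = Total sum of squares
$$\text{SS(Total)} = (\text{Sum of squares of all observations}) - \text{CM} = \sum_{i=1}^{n} y_i^2 - \text{CM}$$

SST = Sum of squares for treatments
= (Sum of squares of treatment totals, with each square divided by the number of observations for that treatment) - CM
$$\text{SST} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - \text{CM}$$

SSE = Sum of squares for error
$$\text{SSE} = \text{SS(Total)} - \text{SST}$$

MST = Mean square for treatments
$$\text{MST} = \frac{\text{SST}}{k - 1}$$

MSE = Mean square for error
$$\text{MSE} = \frac{\text{SSE}}{n - k}$$

F = Test statistic
$$F = \frac{\text{MST}}{\text{MSE}}$$

Where
n = Total number of observations
k = Number of treatments
$T_i$ = Total for treatment i (i = 1, 2, ..., k)
$n_i$ = Number of observations for treatment i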
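
The calculation formulas for the completely randomized design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `crd_anova` and the data values are hypothetical and not part of these notes, and the sketch stops at the F-statistic rather than looking up the critical value.

```python
def crd_anova(samples):
    """ANOVA for a completely randomized design via the shortcut formulas.

    samples: list of lists, one list of responses per treatment.
    """
    k = len(samples)                                 # number of treatments
    n = sum(len(s) for s in samples)                 # total number of observations
    all_obs = [y for s in samples for y in s]

    cm = sum(all_obs) ** 2 / n                       # correction for mean
    ss_total = sum(y ** 2 for y in all_obs) - cm     # SS(Total)
    sst = sum(sum(s) ** 2 / len(s) for s in samples) - cm   # sum of T_i^2 / n_i, minus CM
    sse = ss_total - sst
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {"CM": cm, "SS(Total)": ss_total, "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "F": mst / mse,
            "df": (k - 1, n - k)}

# Hypothetical data: three treatments, three observations each.
result = crd_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
# For these values: CM = 121, SS(Total) = 32, SST = 26, SSE = 6, F = 13
```

The observed F would then be compared with $F_\alpha$ for $(k - 1)$ and $(n - k)$ degrees of freedom, taken from an F table or from statistical software.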


3. Multiple Comparisons of Means


Determining the Number of Pairwise Comparisons of Treatment Means
In general, if there are k treatment means, there are

$$c = \frac{k(k - 1)}{2}$$

pairs of means that can be compared.
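
To see where the count $c = k(k - 1)/2$ comes from, the pairs can simply be listed. The following Python sketch uses four hypothetical treatment labels; the function name `pairwise_comparisons` is an illustrative assumption.

```python
from itertools import combinations

def pairwise_comparisons(treatments):
    """List every unordered pair of treatments that could be compared."""
    return list(combinations(treatments, 2))

# Hypothetical example with k = 4 treatments.
treatments = ["A", "B", "C", "D"]
pairs = pairwise_comparisons(treatments)
k = len(treatments)
c = k * (k - 1) // 2   # number of pairs given by the formula; 6 when k = 4
```

The length of `pairs` matches `c`, since each pair of distinct treatments is counted once regardless of order.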

Guidelines for Selecting a Multiple-Comparison Method in ANOVA


Method        Treatment sample sizes    Types of comparisons
Tukey         Equal                     Pairwise
Bonferroni    Equal or unequal          Pairwise
Scheffé       Equal or unequal          General contrasts

4. The Randomized Block Design


Definition 10.8
The randomized block design consists of a two-step procedure:
1. Matched sets of experimental units, called blocks, are formed, with each block consisting of k
experimental units (where k is the number of treatments). The b blocks should consist of
experimental units that are as similar as possible.
2. One experimental unit from each block is randomly assigned to each treatment, resulting in a
total of n = bk responses.
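
The two-step procedure of Definition 10.8 can be sketched in Python as follows. The function name `assign_randomized_block`, the block labels, and the unit labels are hypothetical; the sketch assumes the blocks have already been formed (step 1) and only carries out the random assignment within each block (step 2).

```python
import random

def assign_randomized_block(blocks, treatments, seed=None):
    """Within each block, randomly assign one experimental unit to each treatment.

    blocks: dict mapping a block label to its list of k experimental units.
    """
    rng = random.Random(seed)
    design = {}
    for label, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block must contain exactly k experimental units")
        order = list(treatments)
        rng.shuffle(order)                      # fresh random order for every block
        design[label] = dict(zip(units, order))
    return design

# Hypothetical example: b = 2 blocks, k = 3 treatments, so n = bk = 6 responses.
blocks = {"block 1": ["u1", "u2", "u3"], "block 2": ["u4", "u5", "u6"]}
design = assign_randomized_block(blocks, ["A", "B", "C"], seed=7)
```

Unlike the completely randomized design, the randomization here is restricted: every treatment appears exactly once in every block.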

ANOVA F-Test to Compare k Treatment Means: Randomized Block Design


$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
$H_a$: At least two treatment means differ.

Test statistic: $F = \dfrac{\text{MST}}{\text{MSE}}$

Rejection region: $F > F_\alpha$, where $F_\alpha$ is based on $(k - 1)$ numerator degrees of freedom and $(n - b - k + 1)$ denominator degrees of freedom.


Conditions Required for a Valid ANOVA F-Test: Randomized Block Design
1. The b blocks are randomly selected, and all k treatments are applied (in random order) to each
block.
2. The distributions of observations corresponding to all bk block-treatment combinations are
approximately normal.
3. The bk block-treatment distributions have equal variances.


General ANOVA Summary Table for a Randomized Block Design


Source        df               SS          MS     F
Treatments    k - 1            SST         MST    MST/MSE
Blocks        b - 1            SSB         MSB
Error         n - k - b + 1    SSE         MSE
Total         n - 1            SS(Total)

Steps for Conducting an ANOVA for a Randomized Block Design


1. Be sure that the design consists of blocks (preferably, blocks of homogeneous experimental units) and
that each treatment is randomly assigned to one experimental unit in each block.
2. If possible, check the assumptions of normality and equal variances for all block-treatment
combinations. [Note: This may be difficult to do, since the design will likely have only one
observation for each block-treatment combination.]
3. Create an ANOVA summary table that specifies the variability attributable to treatments,
blocks, and error, and that leads to the calculation of the F-statistic to test the null hypothesis
that the treatment means are equal in the population. Use a statistical software package or the
calculation formulas in Appendix B to obtain the necessary numerical ingredients.
4. If the F-statistic leads to the conclusion that the means differ, employ the Bonferroni or
Tukey procedure, or a similar procedure, to conduct multiple comparisons of as many of the
pairs of means as you wish. Use the results to summarize the statistically significant
differences among the treatment means. Remember that, in general, the randomized block
design cannot be employed to form confidence intervals for individual treatment means.
5. If the F-test leads to the nonrejection of the null hypothesis that the treatment means are
equal, several possibilities exist:
a. The treatment means are equal; that is, the null hypothesis is true.
b. The treatment means really differ, but other important factors affecting the response
are not accounted for by the randomized block design. These factors inflate the
sampling variability, as measured by MSE, resulting in smaller values of the F-statistic.
Either increase the sample size for each treatment, or conduct an experiment
that accounts for the other factors affecting the response (as is to be done in Section
10.5). Do not automatically reach the former conclusion, since the possibility of a
Type II error must be considered if you accept $H_0$.
6. If desired, conduct the F-test of the null hypothesis that the block means are equal. Rejection
of this hypothesis lends statistical support to the utilization of the randomized block design.

What Do You Do When the Assumptions Are Not Satisfied for the Analysis of
Variance for a Randomized Block Design?
Answer: Use a nonparametric statistical method such as the Friedman $F_r$ test of Section 14.6.


Formulas for the Calculations in the Randomized Block Design


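CM = Correction for mean
$$\text{CM} = \frac{(\text{Total of all observations})^2}{\text{Total number of observations}} = \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$

SS(Total) = Total sum of squares
$$\text{SS(Total)} = (\text{Sum of squares of all observations}) - \text{CM} = \sum_{i=1}^{n} y_i^2 - \text{CM}$$

SST = Sum of squares for treatments
= (Sum of squares of treatment totals, with each square divided by b, the number of observations for that treatment) - CM
$$\text{SST} = \frac{T_1^2}{b} + \frac{T_2^2}{b} + \cdots + \frac{T_k^2}{b} - \text{CM}$$

SSB = Sum of squares for blocks
= (Sum of squares of block totals, with each square divided by k, the number of observations for that block) - CM
$$\text{SSB} = \frac{B_1^2}{k} + \frac{B_2^2}{k} + \cdots + \frac{B_b^2}{k} - \text{CM}$$

SSE = Sum of squares for error
$$\text{SSE} = \text{SS(Total)} - \text{SST} - \text{SSB}$$

MST = Mean square for treatments
$$\text{MST} = \frac{\text{SST}}{k - 1}$$

MSB = Mean square for blocks
$$\text{MSB} = \frac{\text{SSB}}{b - 1}$$

MSE = Mean square for error
$$\text{MSE} = \frac{\text{SSE}}{n - k - b + 1}$$

F = Test statistic
$$F = \frac{\text{MST}}{\text{MSE}}$$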

Where
n = Total number of observations
b = Number of blocks
k = Number of treatments
$T_i$ = Total for treatment i (i = 1, 2, ..., k)
$B_i$ = Total for block i (i = 1, 2, ..., b)
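
The randomized block calculations can be sketched in Python in the same spirit. The function name `rbd_anova`, the row-by-column data layout, and the data values are hypothetical assumptions made for illustration only.

```python
def rbd_anova(table):
    """ANOVA for a randomized block design via the shortcut formulas.

    table[i][j] = response for block i under treatment j (b rows, k columns).
    """
    b = len(table)                  # number of blocks
    k = len(table[0])               # number of treatments
    n = b * k                       # total number of observations
    all_obs = [y for row in table for y in row]

    cm = sum(all_obs) ** 2 / n
    ss_total = sum(y ** 2 for y in all_obs) - cm
    treatment_totals = [sum(row[j] for row in table) for j in range(k)]   # T_1..T_k
    block_totals = [sum(row) for row in table]                            # B_1..B_b
    sst = sum(t ** 2 for t in treatment_totals) / b - cm
    ssb = sum(t ** 2 for t in block_totals) / k - cm
    sse = ss_total - sst - ssb

    mst = sst / (k - 1)
    msb = ssb / (b - 1)
    mse = sse / (n - k - b + 1)
    return {"SS(Total)": ss_total, "SST": sst, "SSB": ssb, "SSE": sse,
            "MST": mst, "MSB": msb, "MSE": mse,
            "F_treatments": mst / mse, "F_blocks": msb / mse}

# Hypothetical data: 3 blocks (rows) by 3 treatments (columns).
rbd = rbd_anova([[1, 3, 5],
                 [2, 4, 7],
                 [3, 6, 8]])
```

The ratio MSB/MSE is included because the steps for this design allow an optional F-test of the block means.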

5. Factorial Experiments
Definition 10.9
A complete factorial experiment is a factorial experiment in which every factor-level
combination is utilized. That is, the number of treatments in the experiment equals the total
number of factor-level combinations.

                         Factor B at b levels
Factor A       Level   1                  2                  3                  ...   b
at a levels    1       Trt. 1             Trt. 2             Trt. 3             ...   Trt. b
               2       Trt. b + 1         Trt. b + 2         Trt. b + 3         ...   Trt. 2b
               3       Trt. 2b + 1        Trt. 2b + 2        Trt. 2b + 3        ...   Trt. 3b
               ...     ...                ...                ...                ...   ...
               a       Trt. (a-1)b + 1    Trt. (a-1)b + 2    Trt. (a-1)b + 3    ...   Trt. ab
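
The numbering pattern in the layout above assigns treatment number (i - 1)b + j to the combination of level i of factor A and level j of factor B. A short Python sketch, with the hypothetical function name `treatment_number` and hypothetical values a = 3 and b = 4, reproduces the grid:

```python
def treatment_number(i, j, b):
    """Treatment number for level i of factor A and level j of factor B (both 1-based)."""
    return (i - 1) * b + j

# Hypothetical complete factorial: factor A at a = 3 levels, factor B at b = 4 levels.
a, b = 3, 4
layout = [[treatment_number(i, j, b) for j in range(1, b + 1)]
          for i in range(1, a + 1)]
# layout == [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
```

Because the experiment is complete, the last entry is treatment number ab, the total number of factor-level combinations.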

Procedure for Analysis of Two-Factor Factorial Experiment


1. Partition the total sum of squares into the treatment and error components (stage 1 of Figure
10.21). Use either a statistical software package or the calculation formulas in Appendix B to
accomplish the partitioning.
2. Use the F-ratio of the mean square for treatments to the mean square for error to test the null
hypothesis that the treatment means are equal.
a. If the test results in nonrejection of the null hypothesis, consider refining the
experiment by increasing the number of replications or introducing other factors.
Also, consider the possibility that the response is unrelated to the two factors.
b. If the test results in rejection of the null hypothesis, then proceed to step 3.
3. Partition the treatment sum of squares into the main effect and the interaction sum of squares
(stage 2 of Figure 10.21). Use either a statistical software package or the calculation
formulas in Appendix B to accomplish the partitioning.

4. Test the null hypothesis that factors A and B do not interact to affect the response by
comparing the F-ratio of the mean square for interaction to the mean square for error.
a. If the test results in nonrejection of the null hypothesis, proceed to step 5.
b. If the test results in rejection of the null hypothesis, conclude that the two factors
interact to affect the mean response. Then proceed to step 6a.
5. Conduct tests of two null hypotheses that the mean response is the same at each level of
factor A and factor B. Compute two F-ratios by comparing the mean square for each factor
main effect with the mean square for error.
a.
6. Compare the means:
a. If the test for interaction (step 4) is significant, use a multiple-comparison procedure
to compare any or all pairs of the treatment means.
b. If the test for one or both main effects (step 5) is significant, use a multiple-comparison
procedure to compare the pairs of means corresponding to the levels of
the significant factor(s).

Tests Conducted in Analyses of Factorial Experiments: Factorial Experiments, r Replicates per Treatment

Test for Treatment Means

$H_0$: No difference among the ab treatment means
$H_a$: At least two treatment means differ

Test statistic: $F = \dfrac{\text{MST}}{\text{MSE}}$

Rejection region: $F \ge F_\alpha$, based on $(ab - 1)$ numerator and $(n - ab)$ denominator degrees of freedom [Note: $n = abr$.]

Test for Factor Interaction


$H_0$: Factors A and B do not interact to affect the response mean
$H_a$: Factors A and B do interact to affect the response mean

Test statistic: $F = \dfrac{\text{MS}(AB)}{\text{MSE}}$

Rejection region: $F \ge F_\alpha$, based on $(a - 1)(b - 1)$ numerator and $(n - ab)$ denominator degrees of freedom


Test for Main Effect of Factor A


$H_0$: No difference among the a mean levels of factor A
$H_a$: At least two factor A mean levels differ

Test statistic: $F = \dfrac{\text{MS}(A)}{\text{MSE}}$

Rejection region: $F \ge F_\alpha$, based on $(a - 1)$ numerator and $(n - ab)$ denominator degrees of freedom

Test for Main Effect of Factor B


$H_0$: No difference among the b mean levels of factor B
$H_a$: At least two factor B mean levels differ

Test statistic: $F = \dfrac{\text{MS}(B)}{\text{MSE}}$

Rejection region: $F \ge F_\alpha$, based on $(b - 1)$ numerator and $(n - ab)$ denominator degrees of freedom

Conditions Required for Valid F-Tests in Factorial Experiments


1. The response distribution for each factor-level combination (treatment) is normal.
2. The response variance is constant for all treatments.
3. Random and independent samples of experimental units are associated with each treatment.

General ANOVA Summary Table for a Two-Factor Factorial Experiment with r


Replicates, where Factor A has a Levels and Factor B has b Levels
Source    df                SS          MS      F
A         a - 1             SSA         MSA     MSA/MSE
B         b - 1             SSB         MSB     MSB/MSE
AB        (a - 1)(b - 1)    SSAB        MSAB    MSAB/MSE
Error     ab(r - 1)         SSE         MSE
Total     n - 1             SS(Total)

Note: A + B + AB = Treatments from a completely randomized design; that is, SSA + SSB + SSAB = SST, with (a - 1) + (b - 1) + (a - 1)(b - 1) = ab - 1 degrees of freedom.


Formulas for the Calculations for a Two-Factor Factorial Experiment


CM = Correction for mean
$$\text{CM} = \frac{(\text{Total of all observations})^2}{\text{Total number of observations}} = \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$

SS(Total) = Total sum of squares
$$\text{SS(Total)} = (\text{Sum of squares of all observations}) - \text{CM} = \sum_{i=1}^{n} y_i^2 - \text{CM}$$

SS(A) = Sum of squares for main effects, factor A
= (Sum of squares of the totals $A_1, A_2, \ldots, A_a$, divided by the number of measurements in a single total, namely br) - CM
$$\text{SS}(A) = \frac{\sum_{i=1}^{a} A_i^2}{br} - \text{CM}$$

SS(B) = Sum of squares for main effects, factor B
= (Sum of squares of the totals $B_1, B_2, \ldots, B_b$, divided by the number of measurements in a single total, namely ar) - CM
$$\text{SS}(B) = \frac{\sum_{i=1}^{b} B_i^2}{ar} - \text{CM}$$

SS(AB) = Sum of squares for AB interaction
= (Sum of squares of the cell totals $AB_{11}, AB_{12}, \ldots, AB_{ab}$, divided by the number of measurements in a single total, namely r) - SS(A) - SS(B) - CM
$$\text{SS}(AB) = \frac{\sum_{j=1}^{b} \sum_{i=1}^{a} AB_{ij}^2}{r} - \text{SS}(A) - \text{SS}(B) - \text{CM}$$

Where
n = Total number of observations
a = Number of levels of factor A
b = Number of levels of factor B
r = Number of replicates (observations per treatment)
$A_i$ = Total for level i of factor A (i = 1, 2, ..., a)
$B_i$ = Total for level i of factor B (i = 1, 2, ..., b)
$AB_{ij}$ = Total for treatment (ij), i.e., for the ith level of factor A and the jth level of factor B
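
The two-factor calculations can be sketched in Python as well. The function name `factorial_anova`, the nested-list data layout, and the data values are hypothetical. The formulas listed for this design do not give SSE explicitly, so the sketch assumes the usual partition SSE = SS(Total) - SS(A) - SS(B) - SS(AB), with ab(r - 1) error degrees of freedom as in the summary table.

```python
def factorial_anova(cells):
    """Two-factor factorial ANOVA with r replicates via the shortcut formulas.

    cells[i][j] = list of the r responses at level i of factor A and level j of factor B.
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    n = a * b * r
    all_obs = [y for row in cells for cell in row for y in cell]

    cm = sum(all_obs) ** 2 / n
    ss_total = sum(y ** 2 for y in all_obs) - cm
    a_totals = [sum(sum(cell) for cell in row) for row in cells]                # A_1..A_a
    b_totals = [sum(sum(cells[i][j]) for i in range(a)) for j in range(b)]      # B_1..B_b
    ss_a = sum(t ** 2 for t in a_totals) / (b * r) - cm
    ss_b = sum(t ** 2 for t in b_totals) / (a * r) - cm
    ss_ab = sum(sum(cell) ** 2 for row in cells for cell in row) / r - ss_a - ss_b - cm
    sse = ss_total - ss_a - ss_b - ss_ab    # assumption: SSE obtained by subtraction

    mse = sse / (a * b * (r - 1))
    return {"SS(Total)": ss_total, "SS(A)": ss_a, "SS(B)": ss_b,
            "SS(AB)": ss_ab, "SSE": sse, "MSE": mse,
            "F_A": (ss_a / (a - 1)) / mse,
            "F_B": (ss_b / (b - 1)) / mse,
            "F_AB": (ss_ab / ((a - 1) * (b - 1))) / mse}

# Hypothetical 2 x 2 factorial with r = 2 replicates per treatment.
fact = factorial_anova([[[1, 3], [4, 6]],
                        [[5, 7], [12, 14]]])
# For these values: SS(A) = 72, SS(B) = 50, SS(AB) = 8, SSE = 8, MSE = 2
```

Following the procedure for this design, the interaction ratio `F_AB` would be examined before the main-effect ratios `F_A` and `F_B`.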
