
An Overview of Design of Experiments

Dr. Keerti Jain


• EXPERIMENTS
• A QUICK HISTORY OF DESIGN OF EXPERIMENTS
• WHY WE USE EXPERIMENTAL DESIGNS
• WHAT IS DESIGN OF EXPERIMENT
• HOW DESIGN OF EXPERIMENT CONTRIBUTES
• TERMINOLOGY
• ANALYSIS OF VARIANCE (ANOVA)
• BASIC PRINCIPLE OF DESIGN OF EXPERIMENTS
• SOME EXPERIMENTAL DESIGNS
5/21/19 2
EXPERIMENT
Experiments involve
manipulation of one or more
independent variables, and
observing the effect on some
outcome (dependent variable).
Experiments can be done in the
field or in a laboratory.
A QUICK HISTORY
OF
DESIGN OF
EXPERIMENTS
• The agricultural origins, 1918 – 1940s
• R. A. Fisher & his co-workers
• Profound impact on agricultural science
• Factorial designs, ANOVA
• The first industrial era, 1951 – late 1970s
• Box & Wilson, response surfaces
• Applications in the chemical & process industries

CONTD…
• The second industrial era, late 1970s – 1990
• Quality improvement initiatives in many companies
• TQM was an important idea and became a management goal
• Taguchi and robust parameter design, process robustness
• The modern era, beginning 1990
• Six Sigma, Lean Six Sigma
• Clinical trials, mathematical biology
• Algorithm design and analysis
• Networking, group testing, and cryptography

Why we use
Experimental
Designs
"All experiments are designed
experiments, it is just that some are poorly
designed and some are well-designed."
Experimental designs are used so that
the treatments may be assigned in an
organized manner to allow valid
statistical analysis to be carried out on
the resulting data.
What is Design of
Experiments
It is the logical planning (or construction) of
an experiment: a complete sequence of steps,
laid out ahead of time, to ensure that the
appropriate data will be obtained in a way
that permits an objective analysis of a
particular problem and leads to valid and
precise inferences in the most economical
and useful form.

Subject Matter of
Design of Experiments
It includes:
• Planning of the experiment
• Obtaining data from it
• Performing a statistical analysis of the data
obtained

HOW DESIGN OF
EXPERIMENT
CONTRIBUTES

• Reduce time to design/develop new products & processes
• Improve performance of existing processes
• Improve reliability and performance of products
• Achieve product & process robustness
• Evaluate materials and design alternatives, and set
component & system tolerances, etc.
TERMINOLOGY

• CONTROL GROUP
• TREATMENT GROUP
• FACTORS
• LEVEL
• RANDOMNESS
• REPLICATION
• BLOCKS
• EXPERIMENTAL ERROR
• ANOVA
• Control Group:- A group included in the experiment but not
exposed to the treatment. The performance of this group serves
as a baseline.
• Treatment Group:- The group in an experiment which receives
the specified treatment.
• Factor:- This term is used when an experiment involves more than
one variable. Each of these variables is called a factor.
• Level:- Refers to the degree or intensity of a factor.
• Randomness:- Refers to the property of purely chance events
that are not predictable.
• Replication:- The repetition of the treatment under consideration.
• Blocks:- Refers to homogeneous groups (categories) of subjects
within which the treatments are assigned.
EXPERIMENTAL
ERROR

Experimental error is the variation in the responses among
experimental units which are assigned the same treatment
and are observed under the same experimental conditions. It is
measured by SSE (or MSE).
Ideally, we would like experimental error to be zero.
This is impossible for at least one of the following reasons:
• There are inherent differences in the experimental units
before they receive treatments.
• There is variation in the devices that record the
measurements.
• There is variation in applying or setting the treatments.
• There are extraneous factors, other than the treatments,
which affect the response.


ANALYSIS OF
VARIANCE (ANOVA)
• This statistical technique was first developed by
R. A. Fisher and was extensively used for agricultural
experiments.
• It is mainly employed to compare the means of 3
or more samples, taking into account the variation
within each sample.
• ANOVA is the method used to estimate the contribution
made by each factor to the total variation.
ANOVA TABLE FORMAT
Source of         Sum of          Degrees of          Mean Squares        F
Variation (SV)    Squares (SS)    Freedom (df)        (MS)
Treatment         SSt             dft = nt - 1        MSTR = SSt / dft    MSTR / MSE
Error             SSr             dfe = dfT - dft     MSE = SSr / dfe
Total             SST             dfT = nT - 1

The Steps in Designing an
Experiment

• Step 1: Identify the problem or claim to be studied.
The statement of the problem needs to be as
specific as possible. As your text says, it must
"identify the response variable and the population
to be studied".
• Step 2: Determine the factors affecting the
response variable.
This is best done by an expert in the field, but we'll
be able to do this for most examples we'll be
looking at.

The Steps in Designing
an Experiment (Contd…)
• Step 3: Determine the number of
experimental units.
In general, more experimental units are
better. Unfortunately, time and money
will always be limiting factors, so we
have to decide on an appropriate number.

The Steps in Designing
an Experiment (Contd…)
• Step 4: Determine the level(s) of each factor.
We split factors up into three categories:
o Control: If possible, we try to fix the level of factors that
we're not interested in.
o Manipulate: This is the treatment - we manipulate the
levels of the variable that we think will affect the response
variable.
o Randomize: Often, there are factors we just can't
control. To mitigate their effect on the data, we randomize
the groups. By randomly assigning experimental units,
these factors should be equally spread among all groups.

The Steps in Designing
an Experiment (Contd…)
• Step 5: Conduct the experiment.

• Step 6: Test the claim.

• Step 7: Interpret the results

BASIC PRINCIPLE OF
DESIGN OF
EXPERIMENTS
• Randomization
• Replication
• Local Control (Blocking)

Complete and Incomplete
Block Designs

SOME
EXPERIMENTAL
DESIGNS
• Completely Randomized Design (CRD)
• Randomized Block Design (RBD)
• Latin Square Design (LSD)
• Factorial Designs
• Balanced Incomplete Block Design (BIBD)
• Nested Balanced Incomplete Block designs
(NBIBD)
• Balanced Incomplete Block Design with Nested
Rows and Columns
COMPLETELY RANDOMIZED DESIGN (CRD)
• The completely randomized design is the
simplest design: the treatments are assigned
to the experimental units completely at
random, so that every experimental unit has
an equal probability of receiving any given
treatment.
• For CRD, any difference among experimental
units receiving the same treatment is considered
experimental error.
• CRD is the simplest design to use.
• CRD is appropriate only for experiments with homogeneous
experimental units, such as laboratory experiments, where
environmental effects are relatively easy to control.
• The CRD is best suited for experiments with a small number of
treatments.
• For field experiments, where there is generally large variation
among experimental plots in such environmental factors as soil,
the CRD is rarely used.
• Every experimental unit has the same probability of receiving
any treatment.
• Treatments are assigned to experimental units completely at
random using a random number table, computer program, etc.
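The random assignment described above can be sketched in Python. This is only an illustration and not part of the original slides: the function name `assign_crd`, the unit labels 1 to 15 and the seed are assumptions chosen for the example, and equal replication of every treatment is assumed.

```python
import random


def assign_crd(units, treatments, seed=None):
    """Assign treatments to units completely at random (equal replication)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    # One label per unit: each treatment repeated `reps` times.
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)  # a fixed seed only makes the plan reproducible
    rng.shuffle(labels)        # every arrangement of the labels is equally likely
    return dict(zip(units, labels))


# Hypothetical layout: 15 units, 3 treatments, 5 replicates each.
plan = assign_crd(list(range(1, 16)), ["A", "B", "C"], seed=1)
for unit, treatment in plan.items():
    print(f"unit {unit:2d} -> treatment {treatment}")
```

Shuffling a list of labels, rather than drawing a treatment independently for each unit, keeps the number of replicates per treatment fixed while still leaving the allocation to chance.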
• In order to determine whether there is a significant
difference in the durability of 3 makes of computers,
samples of size 5 are selected from each make and
the frequency of repair during the first year is
observed. The results are as follows:
Makes
A     B     C
5     8     7
6     10    3
8     11    5
9     12    4
7     4     1
HYPOTHESIS

H0: The three makes of computers do not differ
significantly in durability.

H1: At least one of the makes of computers differs
significantly in durability.
TABLE FOR
CALCULATION

MAKE     Xij                    Ti     ni    Ti^2    Ti^2/ni   ∑X^2ij
A        5   6   8   9   7      35     5     1225    245       255
B        8   10  11  12  4      45     5     2025    405       445
C        7   3   5   4   1      20     5     400     80        100
TOTAL                           100    15    3650    730       800

Null Hypothesis :
H0: the 3 makes of computers do not differ in the durability

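• Correction factor: $CF = (\sum_i T_i)^2 / N = (100)^2 / 15 = 666.67$

• Total sum of squares: $SST = \sum_i \sum_j X_{ij}^2 - CF = 800 - 666.67 = 133.33$

• Sum of squares between makes: $SSM = \sum_i T_i^2 / n_i - CF = 730 - 666.67 = 63.33$

• Error sum of squares: $SSE = SST - SSM = 133.33 - 63.33 = 70$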
ANOVA TABLE
Sources of        Sum of     Degrees of    Mean Sum      F0
Variation         Squares    Freedom       of Squares
Between Makes     63.33      2             31.67         31.67 / 5.83 = 5.43
Within Makes      70         12            5.83
Total             133.33     14

From F – Tables, F5%(v1 = 2, v2 = 12) = 3.88

F0 > F5%, so the null hypothesis is rejected.
There is a significant difference between the makes of
computers.
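The hand calculation for the three makes of computers can be cross-checked with a short Python sketch. The function name `one_way_anova` and the dictionary layout are assumptions made for this illustration; the critical value 3.88 is simply copied from the F-table quoted above rather than computed.

```python
def one_way_anova(groups):
    """One-way ANOVA from raw data; `groups` maps a label to its observations."""
    values = [x for obs in groups.values() for x in obs]
    n_total = len(values)
    cf = sum(values) ** 2 / n_total                 # correction factor
    ss_total = sum(x * x for x in values) - cf      # total sum of squares
    ss_between = sum(sum(obs) ** 2 / len(obs) for obs in groups.values()) - cf
    ss_within = ss_total - ss_between               # error sum of squares
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_within": df_within,
        "f": ms_between / ms_within,
    }


makes = {"A": [5, 6, 8, 9, 7], "B": [8, 10, 11, 12, 4], "C": [7, 3, 5, 4, 1]}
result = one_way_anova(makes)
F_CRIT = 3.88  # tabulated F5%(2, 12) as quoted in the slides, not computed here
print({key: round(value, 2) for key, value in result.items()})
print("reject H0" if result["f"] > F_CRIT else "do not reject H0")
```

The function follows the same correction-factor route as the slides, so each intermediate quantity can be compared directly with the calculation table.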
ADVANTAGES OF CRD
• Very flexible design (i.e. the number of treatments and
replicates is limited only by the available number of
experimental units).
• Statistical analysis is simple compared to other
designs.
• Loss of information due to missing data is small
compared to other designs due to the larger number of
degrees of freedom for the error source of variation.
• Provides the maximum number of degrees of freedom for error.

DISADVANTAGES OF CRD
• If experimental units are not homogeneous
and you fail to minimize this variation
using blocking, there may be a loss of
precision.
• Usually the least efficient design unless
experimental units are homogeneous.
• Not suited for a large number of
treatments.
RANDOMIZED BLOCK DESIGN (RBD)
• Any experimental design in which the
randomization of treatments is restricted to
groups of experimental units within a
predefined block of units assumed to be
internally homogeneous is called a randomized
block design.
• It divides the group of experimental units into n
homogeneous groups of equal or unequal sizes.
• These homogeneous groups are called blocks.
• The treatments are then randomly assigned to
the experimental units within each block.
• A randomized block experiment is treated as a two-
factor experiment; the factors are blocks and treatments.
• The experimental units within a block are uniform.
• There is one observation per cell. It is assumed that there
is no interaction between blocks and treatments.
• The degrees of freedom for the interaction are used to
estimate error.
• Treatments are randomly assigned to the experimental units
of each block.
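Restricting the randomization to each block can be sketched as follows. The function name `assign_rbd` and the block and treatment labels are hypothetical and chosen only for illustration; the sketch assumes each block holds exactly one unit per treatment, in line with the one-observation-per-cell assumption above.

```python
import random


def assign_rbd(blocks, treatments, seed=None):
    """Randomize the order of treatments separately within each block."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)  # every block receives every treatment once
        rng.shuffle(order)        # independent shuffle for this block only
        plan[block] = order
    return plan


# Hypothetical example: 3 blocks, 4 treatments.
plan = assign_rbd(["Block 1", "Block 2", "Block 3"], ["T1", "T2", "T3", "T4"], seed=2)
for block, order in plan.items():
    print(block, order)
```

In contrast to a completely randomized design, no treatment label ever crosses a block boundary, so block-to-block differences cannot be confounded with treatment differences.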

Source of         Sum of          Degrees of               Mean Squares         F
Variation (SV)    Squares (SS)    Freedom (df)             (MS)
Blocks            SSb             dfb = nb - 1             MSB = SSb / dfb      MSB / MSErr
Treatment         SSt             dft = nt - 1             MSTR = SSt / dft     MSTR / MSErr
Error             SSe             dfe = dfT - dfb - dft    MSErr = SSe / dfe
Total             SST             dfT = nT - 1
Four doctors each test 4 treatments for a certain
disease and observe the number of days each
patient takes to recover. The results are:
Treatments
Doctor 1 2 3 4
A 10 14 19 20
B 11 15 17 21
C 9 12 16 19
D 8 13 17 20

TWO-WAY ANALYSIS: HYPOTHESES
H0A: There is no significant difference between the
doctors.
H1A: At least one of the doctors is significantly different.

H0B: There is no significant difference between the
treatments.
H1B: At least one of the treatments is significantly
different.

Table for
calculations
Doctor    1      2      3         4       Ti     k     Ti^2/k     ∑X^2ij
A         10     14     19        20      63     4     992.25     1057
B         11     15     17        21      64     4     1024       1076
C         9      12     16        19      56     4     784        842
D         8      13     17        20      58     4     841        922
Tj        38     54     69        80      241    16    3641.25    3897
Tj^2/h    361    729    1190.25   1600    ∑Tj^2/h = 3880.25
(k = number of observations per doctor, h = number of observations per treatment)
• $CF = (\sum_i T_i)^2 / N = (241)^2 / 16 = 3630.06$

• $SS_{Total} = \sum_i \sum_j X_{ij}^2 - CF = 3897 - 3630.06 = 266.94$

• $SSD = \sum_i T_i^2 / k - CF = 3641.25 - 3630.06 = 11.19$

• $SSt = \sum_j T_j^2 / h - CF = 3880.25 - 3630.06 = 250.19$

• $SSe = SS_{Total} - SSD - SSt = 266.94 - 11.19 - 250.19 = 5.56$


ANOVA TABLE
Source of      Sum of     Degrees of    Mean Sum      F0
Variation      Squares    Freedom       of Squares
Doctors        11.19      3             3.73          3.73 / 0.62 = 6.02
Treatments     250.19     3             83.40         83.40 / 0.62 = 134.52
Error          5.56       9             0.62          -
Total          266.94     15

From F – Tables, F5%(v1 = 3, v2 = 9) = 3.86

F0 > F5% for both doctors and treatments.

The difference between the doctors is significant and that
between the treatments is highly significant.
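The two-way analysis for the doctors example can be reproduced with the sketch below. The function name `two_way_anova` and the row/column layout (rows for doctors, columns for treatments) are assumptions made for illustration. Because the code keeps unrounded mean squares, its F ratios (about 6.03 and about 134.9) differ slightly from the values above, which divide mean squares already rounded to two decimals.

```python
def two_way_anova(table):
    """Two-way ANOVA with one observation per cell.

    Rows of `table` are blocks (doctors); columns are treatments.
    """
    n_rows, n_cols = len(table), len(table[0])
    n_total = n_rows * n_cols
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / n_total                          # correction factor
    ss_total = sum(x * x for row in table for x in row) - cf
    ss_rows = sum(sum(row) ** 2 for row in table) / n_cols - cf
    ss_cols = sum(sum(col) ** 2 for col in zip(*table)) / n_rows - cf
    ss_error = ss_total - ss_rows - ss_cols
    df_rows, df_cols = n_rows - 1, n_cols - 1
    df_error = df_rows * df_cols                             # interaction df used as error
    ms_error = ss_error / df_error
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df_error": df_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }


recovery_days = [
    [10, 14, 19, 20],  # doctor A
    [11, 15, 17, 21],  # doctor B
    [9, 12, 16, 19],   # doctor C
    [8, 13, 17, 20],   # doctor D
]
result = two_way_anova(recovery_days)
print({key: round(value, 2) for key, value in result.items()})
```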
ADVANTAGES OF RBD
• Complete flexibility: the design can have any number of
treatments and blocks.
• Provides more accurate results than the completely
randomized design due to grouping.
• Relatively easy statistical analysis, even with missing data.
• Some treatments may be replicated more times than others.
• Whole treatments or entire replicates may be deleted from the
analysis.

DISADVANTAGES OF RBD
• Not suitable for large numbers of treatments, because
blocks become too large and there is a possibility of
heterogeneity among the experimental units within a block.
• Interactions between block and treatment effects increase
error.
• There is a serious problem with the analysis if a
block-by-treatment interaction effect actually exists and no
replication within blocks has been included (solution:
use replication within blocks when possible).
LATIN SQUARE DESIGN (LSD)
• A Latin square is a square array of objects (letters A, B, C, …) such that
each object appears once and only once in each row and each column.
• Example - 4 x 4 Latin Square.
ABCD
BCDA
CDAB
DABC
• The Latin Square Design is for a situation in which there are two
extraneous sources of variation. If the rows and columns of a square are
thought of as levels of the two extraneous variables, then in a Latin
square each treatment appears exactly once in each row and column.
• With the Latin Square design we are able to control variation in two
directions.
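The 4 x 4 square shown above is a cyclic arrangement: each row is the previous row shifted one place to the left. A minimal sketch of that construction, together with a check of the defining property, is given below. The function names are assumptions made for this illustration, and the cyclic square is only one of many possible Latin squares.

```python
def cyclic_latin_square(symbols):
    """Build a t x t Latin square by cyclically shifting the first row."""
    t = len(symbols)
    return [[symbols[(i + j) % t] for j in range(t)] for i in range(t)]


def is_latin_square(square):
    """Check that every symbol appears exactly once in each row and each column."""
    t = len(square)
    symbols = set(square[0])
    if len(symbols) != t:
        return False
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


square = cyclic_latin_square("ABCD")
for row in square:
    print("".join(row))
print(is_latin_square(square))
```

For an actual experiment the rows, columns and treatment labels of such a square would normally be permuted at random before use; that step is left out here.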
CHARACTERISTICS OF LSD
• In LSD we have three factors:
Treatments, Rows and Columns.
• The number of treatments = the number of rows =
the number of columns = t (say).
• The row-column combinations are represented by cells
in a t x t array.
• The treatments are assigned to row-column
combinations using a Latin-square arrangement,
that is, each row contains every treatment
and each column contains every treatment.
• Every treatment occurs once in each row and
column.
Source of         Sum of          Degrees of                     Mean Squares          F
Variation (SV)    Squares (SS)    Freedom (df)                   (MS)
Treatment         SSt             dft = nt - 1                   MSTR = SSt / dft      MSTR / MSErr
Rows              SSr             dfr = nr - 1                   MSRow = SSr / dfr     MSRow / MSErr
Columns           SSc             dfc = nc - 1                   MSCol = SSc / dfc     MSCol / MSErr
Error             SSe             dfe = dfT - dft - dfr - dfc    MSErr = SSe / dfe
Total             SST             dfT = nT - 1
• The following data resulted from an experiment
to compare three burners B1, B2 and B3. LSD
was used as the tests were made on 3 engines
and were spread over 3 days.
         Engine 1    Engine 2    Engine 3
Day 1    B1 – 16     B2 – 17     B3 – 20
Day 2    B2 – 16     B3 – 21     B1 – 15
Day 3    B3 – 15     B1 – 12     B2 – 13

HYPOTHESIS
H0A: There is no significant difference between the
burners.
H1A: At least one of the burners is significantly different.
H0B: There is no significant difference between the
days.
H1B: At least one of the days is significantly different.
H0C: There is no significant difference between the
engines.
H1C: At least one of the engines is significantly different.
           E1         E2         E3         Ti     Ti^2/n         ∑X^2ij
Day 1      16 (B1)    17 (B2)    20 (B3)    53     936.33         945
Day 2      16 (B2)    21 (B3)    15 (B1)    52     901.33         922
Day 3      15 (B3)    12 (B1)    13 (B2)    40     533.33         538
Tj         47         50         48         145    ∑ = 2370.99    2405
Tj^2/n     736.33     833.33     768        ∑ = 2337.66
∑X^2ij     737        874        794        2405
Rearranging data values according to the Burners :
Burner    Xk             Tk    Tk^2/n
B1        16  15  12     43    616.33
B2        17  16  13     46    705.33
B3        20  21  15     56    1045.33
                         ∑ =   2366.99
• $CF = (\sum_i T_i)^2 / N = (145)^2 / 9 = 2336.11$

• $SS_{Total} = \sum_i \sum_j X_{ij}^2 - CF = 2405 - 2336.11 = 68.89$

• Days: $SSD_1 = \sum_i T_i^2 / n - CF = 2370.99 - 2336.11 = 34.88$

• Engines: $SSD_2 = \sum_j T_j^2 / n - CF = 2337.66 - 2336.11 = 1.55$

• Burners: $SSD_3 = \sum_k T_k^2 / n - CF = 2366.99 - 2336.11 = 30.88$

• $SSE = SS_{Total} - SSD_1 - SSD_2 - SSD_3 = 1.55$
ANOVA TABLE

S.V        S.S      d.f    M.S      F0
Days       34.88    2      17.44    17.44 / 0.775 = 22.50
Engines    1.55     2      0.775    0.775 / 0.775 = 1
Burners    30.88    2      15.44    15.44 / 0.775 = 19.92
Error      1.55     2      0.775
Total      68.89    8

• From F – Tables, F5%(v1 = 2, v2 = 2) = 19.00
• F0(19.92) > F5%: there is a significant difference between
the burners
• F0(22.50) > F5%: the difference between the days is
significant
• F0(1) < F5%: the difference between the engines is not
significant
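The Latin square analysis for the burner data can be checked with the sketch below. The function name `latin_square_anova` and the two parallel tables (observed values and burner labels) are assumptions made for this illustration. With unrounded sums of squares the F ratios come out at about 22.43 for days and about 19.86 for burners, slightly below the hand-calculated values, which carry rounding from the intermediate steps; the conclusions against the critical value 19.00 are unchanged.

```python
def latin_square_anova(values, layout):
    """ANOVA for a t x t Latin square.

    `values[i][j]` is the response in row i, column j;
    `layout[i][j]` is the treatment label applied in that cell.
    """
    t = len(values)
    n_total = t * t
    grand_total = sum(sum(row) for row in values)
    cf = grand_total ** 2 / n_total                      # correction factor
    ss_total = sum(x * x for row in values for x in row) - cf
    ss_rows = sum(sum(row) ** 2 for row in values) / t - cf
    ss_cols = sum(sum(col) ** 2 for col in zip(*values)) / t - cf
    # Accumulate treatment totals by walking the layout cell by cell.
    totals = {}
    for value_row, label_row in zip(values, layout):
        for x, label in zip(value_row, label_row):
            totals[label] = totals.get(label, 0) + x
    ss_trt = sum(total ** 2 for total in totals.values()) / t - cf
    ss_error = ss_total - ss_rows - ss_cols - ss_trt
    df = t - 1
    df_error = (t - 1) * (t - 2)
    ms_error = ss_error / df_error
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_trt": ss_trt,
        "ss_error": ss_error,
        "df_error": df_error,
        "f_rows": (ss_rows / df) / ms_error,
        "f_cols": (ss_cols / df) / ms_error,
        "f_trt": (ss_trt / df) / ms_error,
    }


# Rows are days, columns are engines, labels are burners.
values = [[16, 17, 20], [16, 21, 15], [15, 12, 13]]
layout = [["B1", "B2", "B3"], ["B2", "B3", "B1"], ["B3", "B1", "B2"]]
result = latin_square_anova(values, layout)
print({key: round(value, 2) for key, value in result.items()})
```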
ADVANTAGES OF LSD
• We can control variation in two directions. It
means LSD is more efficient than CRD and
RBD.
• Being an incomplete 3-way design, it is
economical compared with the corresponding complete
3-way design. Instead of t³ experimental units,
here only t² experimental units are sufficient.
• The analysis remains relatively simple even
with missing data.
DISADVANTAGES OF LSD
• The number of treatments is limited to the number of replicates,
which seldom exceeds 10.
• If we have fewer than 5 treatments, the df for controlling
random variation is relatively large and the df for error is
small.
• The number of treatments must equal the number of
replicates.
• The experimental error is likely to increase with the size of
the square.
• Interactions between rows and columns, rows and treatments,
and columns and treatments cannot be evaluated
separately.
FACTORIAL
EXPERIMENT
Factorial designs include two or more
factors, each having more than one level
or treatment. Participants typically are
randomized to a combination that
includes one treatment or level from each
factor.
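Listing every treatment combination of a factorial design is a direct application of the Cartesian product. The sketch below is illustrative only: the function name `full_factorial` and the two factors with their levels are hypothetical and do not come from the slides.

```python
from itertools import product


def full_factorial(factors):
    """Return every combination of one level from each factor."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# Hypothetical 2 x 3 factorial: two factors with 2 and 3 levels.
factors = {"temperature": ["low", "high"], "catalyst": ["X", "Y", "Z"]}
runs = full_factorial(factors)
for run in runs:
    print(run)
print(len(runs), "treatment combinations")
```

The number of combinations is the product of the numbers of levels, which is why full factorial designs grow quickly as factors are added.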

BALANCED
INCOMPLETE
BLOCK DESIGNS
(BIBD)
• Used in situations where the number of treatments exceeds
the number of units per block (or logistics do not allow for
assignment of all treatments to all blocks)
• Number of treatments: v
• Number of blocks: b
• Replicates per treatment: r < b
• Block size: k < v
• Total number of units: N = kb = rv
• All pairs of treatments appear together in λ = r(k-1)/(v-1)
blocks, for some integer λ
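The counting conditions listed above can be checked mechanically. The sketch below is an illustration rather than a construction method: the function names are assumptions, and the seven-block example (v = b = 7, r = k = 3, λ = 1) is a standard small design chosen for the demonstration, not one taken from the slides. Note that satisfying the counting conditions is necessary but does not by itself guarantee that a design exists.

```python
from itertools import combinations


def bibd_lambda(v, b, r, k):
    """Return lambda if (v, b, r, k) satisfy the counting conditions, else None."""
    if not (1 < k < v) or b * k != r * v:
        return None
    numerator = r * (k - 1)
    if numerator % (v - 1) != 0:  # lambda must be an integer
        return None
    return numerator // (v - 1)


def is_bibd(blocks, treatments):
    """Check equal block size (< v), equal replication and equal pair counts."""
    sizes = {len(block) for block in blocks}
    if len(sizes) != 1 or any(len(set(block)) != len(block) for block in blocks):
        return False
    if next(iter(sizes)) >= len(treatments):
        return False
    reps = {sum(t in block for block in blocks) for t in treatments}
    pair_counts = {
        sum(a in block and c in block for block in blocks)
        for a, c in combinations(treatments, 2)
    }
    return len(reps) == 1 and len(pair_counts) == 1 and 0 not in pair_counts


# Illustrative design with 7 treatments in 7 blocks of size 3.
blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]
print(bibd_lambda(v=7, b=7, r=3, k=3))
print(is_bibd(blocks, range(1, 8)))
```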

NESTED
DESIGNS
• In certain multifactor experiments, the
levels of one factor are similar but not
identical for different levels of another
factor (each level of the nested factor is
unique to one particular level of the other
factor). This is called a hierarchical or
nested design.
http://jrss.in/data/5I12.pdf
