Dr. Zou
One more concept
Dr. Zou (csueb) Randomization and Simple Comparative Experiments Jan 28, 2020 2 / 60
Example: A fascinating landmark study of placebo surgery
Moseley et al. (2002) showed that in this controlled trial involving patients
with osteoarthritis of the knee, the outcomes after arthroscopic lavage or
arthroscopic debridement were no better than those after a placebo
procedure.
Confounding occurs when the effect of one factor or treatment cannot
be distinguished from that of another factor or treatment. The two factors
or treatments are then said to be confounded. E.g., the factor "word
processing package" (A or B) and the factor "order in which the document
is entered" are confounded in our previous example.
More on responses
An example of multiple responses
Surrogate responses are responses that are supposed to be related to
and predictive of the primary response. They are often quicker to
follow up, easier to measure, and cheaper. For example, the fraction of
patients still alive after five years can serve as a surrogate for the
increase in life expectancy.
Randomization
Why do we need randomization?
Two treatments comparison example
Since surgery is a more invasive procedure, patients in better health
will be more willing to undergo it. Thus the drug therapy would likely
have a lower effect score because it gets the weaker patients, even if
the two treatments are equally effective.
(Confounding appears here: treatment is confounded with patient health.)
Randomization Schemes
How does this work? Randomization against confounding.
A quick test
Consider the paired design we saw last time: a company is evaluating two
different word processing packages (A and B) for use by its clerical staff.
The goal is to see how quickly a test document can be entered correctly
using the two programs. Suppose that 20 test secretaries each enter the
same document twice, once with each program. How would you apply
randomization in this case to avoid the previous confounding
factor, i.e., order?
A possible method: We randomly select 10 secretaries to enter the
document in the order A first and B second; the remaining 10
secretaries enter the document in the order B first and A second.
Later, when we perform the paired t-test, the order effect will be
averaged out.
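The allocation described above can be sketched in R. This is only an illustration: the secretary IDs 1 to 20, the seed, and the variable names are assumptions, not part of the original example.

```r
# Hypothetical IDs for the 20 test secretaries.
secretaries <- 1:20

# Arbitrary seed, used only so that the allocation is reproducible.
set.seed(2020)

# Randomly choose 10 secretaries for the order "A first, B second".
a_first <- sort(sample(secretaries, size = 10))

# The remaining 10 use the order "B first, A second".
b_first <- setdiff(secretaries, a_first)

a_first
b_first
```

Because the order groups are chosen at random, any order effect is balanced across the two packages rather than tied to one of them.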
Randomization techniques are used throughout experiments
Some examples:
If experimental units are not used simultaneously, you can randomize
the order in which they are used.
Simple Comparative Experiments
An example
An engineer is studying the formulation of a Portland cement mortar. He
has added a polymer latex emulsion during mixing to determine if this
impacts the curing time and tension bond strength of the mortar. The
experimenter prepared 20 experimental samples and randomly assigned 10
samples to receive the original formulation and 10 samples to receive the
modified formulation. When the cure process was completed, the
experimenter did find a very large reduction in the cure time for the
modified mortar formulation. Then he began to address the tension bond
strength of the mortar. If the new mortar formulation has an adverse
effect on bond strength, this could impact its usefulness.
Remark: see Chapter 2 of Montgomery’s book 6th edition for more details.
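A completely randomized assignment of this kind might be generated as follows. This is a sketch under assumptions: the sample IDs, the seed, and the extra randomized run order are illustrative and are not taken from Montgomery's text.

```r
# Arbitrary seed, for reproducibility of the illustration only.
set.seed(1)

# Shuffle 10 "original" and 10 "modified" labels over the 20 samples.
formulation <- sample(rep(c("original", "modified"), each = 10))
assignment <- data.frame(sample_id = 1:20, formulation = formulation)

# Optionally randomize the order in which the samples are tested as well.
assignment$run_order <- sample(20)

table(assignment$formulation)
```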
The crude average tension bond strength of the modified mortar is
$\bar{y}_1 = 16.76$ kgf/cm², compared with an average tension bond strength of
$\bar{y}_2 = 17.04$ kgf/cm² for the unmodified mortar. The average tension bond
strengths in these two samples differ by what seems to be a modest
amount. However, it is not obvious that this difference is large enough to
imply that the two formulations really are different. Perhaps this observed
difference in average strengths is the result of sampling fluctuation and the
two formulations are really identical.
1. Assumptions
Let $y_{11}, y_{12}, \ldots, y_{1n_1}$ represent the $n_1$ observations from the first
treatment (or the first factor level).
Let $y_{21}, y_{22}, \ldots, y_{2n_2}$ represent the $n_2$ observations from the second
treatment (or the second factor level).
Assumptions:
1 We will assume that these observations are independent of each
other.
In a word, the samples are drawn at random from two independent normal
populations.
2. A Model for the Data
$y_{ij}$ is the $j$th observation from factor level $i$ (or the $i$th treatment).
$\mu_i$ is the mean of the response at the $i$th factor level.
$\epsilon_{ij}$ are independent $N(0, \sigma_i^2)$ random errors.
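Read together, these definitions correspond to the usual means model for an experiment with two factor levels. The display below is a reconstruction under that assumption (it follows the notation of Montgomery's Chapter 2), not a transcription of the slide:

```latex
y_{ij} = \mu_i + \epsilon_{ij}, \qquad i = 1, 2, \quad j = 1, 2, \ldots, n_i,
\qquad \epsilon_{ij} \sim N(0, \sigma_i^2) \ \text{independently.}
```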
3. Statistical Hypotheses are derived from research questions
A statistical hypothesis is a statement either about the parameters of a
probability distribution or the parameters of a model. The hypothesis
reflects some conjecture about the problem situation. For example, in the
Portland cement experiment, we may think that the mean tension bond
strengths of the two mortar formulations are equal. This may be stated
formally as
$H_0: \mu_1 = \mu_2$
vs.
$H_1: \mu_1 \neq \mu_2$
where µ1 is the mean tension bond strength of the modified mortar and µ2
is the mean tension bond strength of the unmodified mortar.
Two kinds of errors
Sometimes it is more convenient to work with the power of the test, where
The general procedure in hypothesis testing is to specify a value of the
probability of type I error α, often called the significance level of the
test, and then design the test procedure so that the probability of type II
error β has a suitably small value.
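For reference, the conventional textbook definitions of the two error probabilities and of the power of a test are given below. They are stated as the standard definitions, not as a transcription of the slide:

```latex
\begin{aligned}
\alpha &= \Pr(\text{type I error}) = \Pr(\text{reject } H_0 \mid H_0 \text{ is true}) \\
\beta &= \Pr(\text{type II error}) = \Pr(\text{fail to reject } H_0 \mid H_0 \text{ is false}) \\
\text{Power} &= 1 - \beta = \Pr(\text{reject } H_0 \mid H_0 \text{ is false})
\end{aligned}
```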
4.1 The pooled Two-Sample t-Test
To test the null hypothesis $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$ in a
two-sided fashion, we compare the value of $t_0$ to the $t$ distribution
with $n_1 + n_2 - 2$ degrees of freedom.
If $|t_0| \geq t_{\alpha/2,\,n_1+n_2-2}$, where $t_{\alpha/2,\,n_1+n_2-2}$ is the upper $\alpha/2$
percentage point of the $t$ distribution, then we reject the null hypothesis
$H_0$ and conclude that the mean strengths of the two formulations of
Portland cement mortar differ.
Show some details related to the justification (rationale) of this approach:
1: Pivotal quantity; 2: Likelihood ratio test in Stat 640.
(Draw a plot and see this in R)
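The statistic $t_0$ itself does not appear in the extracted text. Assuming it is the usual pooled two-sample statistic, which is consistent with the pooled standard deviation $S_p$ and the $n_1 + n_2 - 2$ degrees of freedom used elsewhere in these slides, it would be:

```latex
t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}
```

Here $S_1^2$ and $S_2^2$ denote the two sample variances.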
4.2 Unpooled Two-Sample t-test
4.3 Which test should we choose in practice?
Calculations behind the software
Because the sample standard deviations are reasonably similar, judging from
both the boxplot and the summary statistics, it is not unreasonable to
conclude that the population standard deviations (or variances) are equal.
Therefore, we apply the pooled two-sample t-test to test the hypotheses
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Then the value of the test statistic is
$$t_0 = \frac{16.76 - 17.04}{0.284\sqrt{\frac{1}{10} + \frac{1}{10}}} = -2.20$$
Since $|t_0| = 2.20 > t_{0.025,18} = 2.101$, we would reject $H_0$ and conclude
that the mean tension bond strengths of the two formulations of Portland
cement mortar are different. One can conclude that the modified
formulation reduces the bond strength (just because we conducted a
two-sided test, this does not preclude drawing a one-sided conclusion when
the null hypothesis is rejected).
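This arithmetic can be reproduced in R from the summary statistics quoted on the slides (means 16.76 and 17.04, pooled standard deviation 0.284, 10 samples per group). A minimal sketch; the variable names are illustrative:

```r
# Summary statistics quoted in the slides.
ybar1 <- 16.76   # modified mortar
ybar2 <- 17.04   # unmodified mortar
sp    <- 0.284   # pooled standard deviation
n1    <- 10
n2    <- 10
alpha <- 0.05

# Pooled two-sample t statistic.
se <- sp * sqrt(1 / n1 + 1 / n2)
t0 <- (ybar1 - ybar2) / se

# Upper alpha/2 percentage point of t with n1 + n2 - 2 degrees of freedom.
df     <- n1 + n2 - 2
t_crit <- qt(1 - alpha / 2, df)

# Two-sided decision rule: reject H0 when |t0| >= critical value.
reject_h0 <- abs(t0) >= t_crit

round(c(t0 = t0, t_crit = t_crit), 3)
reject_h0
```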
The P-value approach (Fisher’s approach)
The P-value is the probability that the test statistic will take on a value
that is at least as extreme as the observed value of the statistic when the
null hypothesis H0 is true. Thus, a P-value conveys much information
about the weight of evidence against H0, and so a decision maker can
draw a conclusion at any specified level of significance.
1 The smaller the p-value, the stronger the evidence against the null
hypothesis H0 .
For our case, under $H_0$, P-value $= \Pr(|T_{18}| \geq |t_0|) = \Pr(|T_{18}| \geq 2.2) = 2\Pr(T_{18} \geq 2.2)$,
which in R is 2 * (1 - pt(2.2, 18)) = 0.041.
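As a small runnable version of this calculation (variable names are illustrative), the two-sided P-value can be computed from either tail of the t distribution:

```r
t0 <- 2.2   # absolute value of the observed statistic, rounded as in the slides
df <- 18

# Two-sided P-value from the upper tail, as in the slide's expression.
p_upper <- 2 * (1 - pt(t0, df))

# Equivalent form using the lower tail and the symmetry of the t distribution.
p_lower <- 2 * pt(-t0, df)

round(p_upper, 3)
```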
One-sided alternative hypotheses
In some problems, one may wish to reject $H_0$ only if one mean is larger
than the other. Thus, one would specify a one-sided alternative hypothesis
$H_1: \mu_1 > \mu_2$ and would reject $H_0$ only if $t_0 > t_{\alpha,\,n_1+n_2-2}$. If one wants to
reject $H_0$ only if $\mu_1$ is less than $\mu_2$, then the alternative hypothesis is
$H_1: \mu_1 < \mu_2$, and one would reject $H_0$ if $t_0 < -t_{\alpha,\,n_1+n_2-2}$.
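For the mortar example, the lower-tailed version of this rule could be checked as sketched below. The choice of the alternative that the first mean is smaller is an assumption made for illustration; the summary statistics are those quoted in the slides.

```r
# Observed pooled t statistic from the slides' summary statistics.
t0 <- (16.76 - 17.04) / (0.284 * sqrt(1 / 10 + 1 / 10))
df <- 18
alpha <- 0.05

# Upper alpha percentage point (not alpha/2, since the test is one-sided).
t_crit_one_sided <- qt(1 - alpha, df)

# H1: mu1 < mu2, so reject H0 when t0 < -t_crit.
reject_lower <- t0 < -t_crit_one_sided

# One-sided P-value: probability of a statistic at least this small under H0.
p_lower <- pt(t0, df)
```

The one-sided P-value is half of the two-sided one here because the observed statistic falls in the tail named by the alternative.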
Confidence Intervals (Whether there is a difference; how large that difference is, if possible.)
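Suppose that $\theta$ is an unknown parameter. An interval estimate of $\theta$ is given
by two statistics $L$ and $U$ satisfying
$\Pr(L \leq \theta \leq U) = 1 - \alpha$.
The interval $[L, U]$ is called a $100(1 - \alpha)\%$ confidence interval for $\theta$.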
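Letting $SE = S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, the statistic $\frac{\bar{y}_1 - \bar{y}_2 - (\mu_1 - \mu_2)}{SE} \sim t_{n_1+n_2-2}$. Then
$$\Pr\left(-t_{\alpha/2,\,n_1+n_2-2} \leq \frac{\bar{y}_1 - \bar{y}_2 - (\mu_1 - \mu_2)}{SE} \leq t_{\alpha/2,\,n_1+n_2-2}\right) = 1 - \alpha.$$
Rearranging, we have
$$\Pr\left(\bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\,n_1+n_2-2}\,SE \leq \mu_1 - \mu_2 \leq \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\,n_1+n_2-2}\,SE\right) = 1 - \alpha.$$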
In our case of Portland cement mortar, the 95% CI estimate for the
difference in mean tension bond strength between the two formulations is
found as follows:
$-0.55 \leq \mu_1 - \mu_2 \leq -0.01$.
1 Hypothesis testing: Note that because $\mu_1 - \mu_2 = 0$ is not included
in this interval, the data do not support the hypothesis that $\mu_1 = \mu_2$
at the 5 percent level of significance.
2 With 95% confidence, the true difference between the two
formulations lies between -0.55 and -0.01 kgf/cm². Whether a
difference of this size is of practical importance is a matter of
engineering judgment.
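The interval can be recomputed in R from the summary statistics given in the slides. This is a minimal sketch with illustrative variable names:

```r
# Summary statistics quoted in the slides.
ybar1 <- 16.76
ybar2 <- 17.04
sp    <- 0.284
n1    <- 10
n2    <- 10
alpha <- 0.05

se     <- sp * sqrt(1 / n1 + 1 / n2)
t_crit <- qt(1 - alpha / 2, df = n1 + n2 - 2)

# 95% confidence interval for mu1 - mu2.
ci <- (ybar1 - ybar2) + c(-1, 1) * t_crit * se

round(ci, 2)
```

Zero lies outside the interval exactly when the two-sided test at the same level rejects the null hypothesis, which is why the interval and the test agree.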
Checking assumptions in the t-Test
In using the t-test procedure we make the assumptions that both samples
are random samples that are drawn from independent populations that
can be described by a normal distribution and that the standard
deviations (or variances) of both populations are equal. The assumption of
independence is critical, and if the run order is randomized (and, if
appropriate, other experimental units and materials are selected at
random), this assumption will usually be satisfied. The equal variance and
normality assumptions are easy to check using a Quantile-Quantile Plot
(qq plot).
QQ plot
[Figure: QQ plot of the tension bond strength data. Red = unmodified, black = modified. Vertical axis: data points.]
Since they want to detect this change reliably rather than by chance, they
set the power to 0.99 (i.e., the probability of a Type II error is 0.01)
when the true difference in means is -0.5 kgf/cm². What sample size is
needed in each group at the significance level α = 0.05? Assume that
previous data show that the standard deviation of the units is 0.30.
power.t.test(delta = -0.5, sd = 0.30, power = 0.99)
Two-sample t test power calculation
n = 14.27349
delta = 0.5
sd = 0.3
sig.level = 0.05
power = 0.99
alternative = two.sided
NOTE: n is number in *each* group
Rounding up, 15 samples are needed in each group.
Summary of Sample size selection
A nonparametric approach: Two sample Permutation test (Randomization test)
In our example, the mean of the modified mortar, 16.76, is less than the
mean of the unmodified mortar, 17.04. We are interested in testing
$H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$, or more generally
$H_0$: no treatment effect vs. $H_1$: there is an effect.
1 If there is no treatment effect, then there is no difference between
the units in the treatment group and those in the control group.
2 When the null hypothesis is true (there is no treatment effect), the
observed groups were simply obtained by randomly splitting the 20
units into two groups.
3 If we pool the 20 observations and randomly reallocate them into two
groups, then we should expect a similar test result from the two
"new" groups.
Testing procedures:
3 Perform (2) for all $\binom{20}{10} = \frac{20!}{10!\,10!} = 184756$ possible combinations.
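A sketch of an exact two-sample permutation test in R follows. Steps (1) and (2) are not visible in the extracted text, so this assumes the test statistic is the difference in group means and that the P-value is two-sided. The data are simulated for illustration only and are not the mortar measurements; `perm_test` is a hypothetical helper name.

```r
# Exact permutation test: enumerate every way of choosing which pooled
# observations form "group 1" and recompute the difference in means.
perm_test <- function(x, y) {
  pooled   <- c(x, y)
  n1       <- length(x)
  observed <- mean(x) - mean(y)
  perm_diffs <- combn(length(pooled), n1, FUN = function(idx) {
    mean(pooled[idx]) - mean(pooled[-idx])
  })
  # Two-sided P-value: share of splits at least as extreme as observed
  # (small tolerance guards against floating-point ties).
  mean(abs(perm_diffs) >= abs(observed) - 1e-12)
}

# Simulated illustration data (NOT the real measurements).
set.seed(123)
modified_sim   <- rnorm(10, mean = 16.76, sd = 0.3)
unmodified_sim <- rnorm(10, mean = 17.04, sd = 0.3)

choose(20, 10)   # number of splits enumerated for 10 + 10 observations
p_perm <- perm_test(modified_sim, unmodified_sim)
p_perm
```

When full enumeration is too expensive, a common alternative is to sample a large number of random splits with `sample()` and use the same proportion as an approximate P-value.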
Remarks:
Paired Comparison Design
Paired Design
A Model for the Paired Data
Note that if we compute the jth paired difference
That is, we may make inferences about the difference in the mean
hardness readings of the two tips µ1 − µ2 by making inferences about the
mean of the differences µd . Notice that the additive effect of the
specimens βj cancels out when the observations are paired in this manner.
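The cancellation can be written out explicitly. The model itself is not visible in the extracted text, so the display below assumes the usual additive model for paired data, with $\beta_j$ the effect of specimen $j$:

```latex
\begin{aligned}
y_{ij} &= \mu_i + \beta_j + \epsilon_{ij}, \qquad i = 1, 2, \quad j = 1, \ldots, n \\
d_j &= y_{1j} - y_{2j} = (\mu_1 - \mu_2) + (\epsilon_{1j} - \epsilon_{2j}) \\
\mu_d &= E(d_j) = \mu_1 - \mu_2
\end{aligned}
```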
Testing $H_0: \mu_1 = \mu_2$ is equivalent to testing
$H_0: \mu_d = 0$
$H_1: \mu_d \neq 0$.
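The equivalence can be checked numerically. The sketch below uses simulated readings (not data from the course) and assumes the usual paired statistic, the mean difference divided by its standard error, with n - 1 degrees of freedom:

```r
# Simulated (not real) hardness readings: 10 specimens, each tested with both tips.
set.seed(42)
n <- 10
specimen_effect <- rnorm(n, mean = 0, sd = 2)   # shared by both tips on a specimen
tip1 <- 7.0 + specimen_effect + rnorm(n, sd = 0.5)
tip2 <- 7.1 + specimen_effect + rnorm(n, sd = 0.5)

# Paired differences and the one-sample t statistic computed by hand.
d <- tip1 - tip2
t0_manual <- mean(d) / (sd(d) / sqrt(n))

# Built-in versions: paired t-test, and one-sample t-test on the differences.
paired_fit     <- t.test(tip1, tip2, paired = TRUE)
one_sample_fit <- t.test(d, mu = 0)

c(manual = t0_manual, paired = unname(paired_fit$statistic))
```

All three routes give the same statistic, because the paired test is a one-sample test applied to the differences.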
Inferences About the Variances of Normal Distributions
One sample variance inference/test
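$H_0: \sigma^2 = \sigma_0^2$
$H_1: \sigma^2 \neq \sigma_0^2$
The test statistic $\chi_0^2 = \frac{(n-1)S^2}{\sigma_0^2} \sim \chi^2_{n-1}$ under $H_0$, where
$S^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 / (n - 1)$.
The null hypothesis is rejected if $\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ or if $\chi_0^2 < \chi^2_{1-\alpha/2,\,n-1}$.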
The $100(1 - \alpha)\%$ confidence interval on $\sigma^2$ is
$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \leq \sigma^2 \leq \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}.$$
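Base R has no built-in one-sample variance test, so the test and interval can be sketched by hand. The function name `var_test_one_sample` and the simulated data are assumptions for illustration. Note that the upper α/2 percentage point corresponds to `qchisq(1 - alpha / 2, ...)`, since `qchisq` takes lower-tail probabilities.

```r
# Hypothetical helper: chi-square test and confidence interval for one variance.
var_test_one_sample <- function(y, sigma0_sq, alpha = 0.05) {
  n    <- length(y)
  s_sq <- var(y)
  chi0 <- (n - 1) * s_sq / sigma0_sq
  upper_pt <- qchisq(1 - alpha / 2, df = n - 1)  # upper alpha/2 percentage point
  lower_pt <- qchisq(alpha / 2, df = n - 1)      # upper 1 - alpha/2 percentage point
  list(
    statistic = chi0,
    reject    = chi0 > upper_pt || chi0 < lower_pt,
    conf_int  = c((n - 1) * s_sq / upper_pt, (n - 1) * s_sq / lower_pt)
  )
}

# Simulated illustration data (not real measurements).
set.seed(7)
y <- rnorm(15, mean = 17, sd = 0.3)
res <- var_test_one_sample(y, sigma0_sq = 0.3^2)
res
```

The test rejects exactly when the hypothesized variance falls outside the confidence interval, which gives a quick consistency check on the two formulas.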
Two sample variance inference/test
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
The test statistic is the ratio of the sample variances, $F_0 = \frac{S_1^2}{S_2^2} \sim F_{n_1-1,\,n_2-1}$ under $H_0$.
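In R, you can use the code var.test(Modified, Unmodified).
The $100(1 - \alpha)\%$ confidence interval on $\sigma_1^2 / \sigma_2^2$ is
$$\frac{S_1^2}{S_2^2} F_{1-\alpha/2,\,n_2-1,\,n_1-1} \leq \frac{\sigma_1^2}{\sigma_2^2} \leq \frac{S_1^2}{S_2^2} F_{\alpha/2,\,n_2-1,\,n_1-1}.$$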
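A short check of `var.test` against a hand computation is sketched below, using simulated data with unequal group sizes so that the order of the degrees of freedom matters. The data and variable names are illustrative assumptions. In this sketch the interval multiplies the variance ratio by percentage points of the F distribution with n2 - 1 numerator and n1 - 1 denominator degrees of freedom, which is what agrees with the interval reported by `var.test`.

```r
# Simulated illustration data (not the mortar measurements).
set.seed(11)
x <- rnorm(8,  mean = 16.8, sd = 0.3)
y <- rnorm(12, mean = 17.0, sd = 0.3)
n1 <- length(x)
n2 <- length(y)
alpha <- 0.05

# Test statistic: ratio of the sample variances.
f0 <- var(x) / var(y)

# Confidence interval for sigma1^2 / sigma2^2 computed by hand.
# qf() takes lower-tail probabilities, so the upper alpha/2 point is qf(1 - alpha/2, ...).
ci_manual <- f0 * c(qf(alpha / 2,     n2 - 1, n1 - 1),
                    qf(1 - alpha / 2, n2 - 1, n1 - 1))

fit <- var.test(x, y)
fit$statistic
fit$conf.int
```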