Polytechnic University of The Philippines College of Engineering Department of Industrial Engineering

POLYTECHNIC UNIVERSITY OF THE PHILIPPINES
COLLEGE OF ENGINEERING
DEPARTMENT OF INDUSTRIAL ENGINEERING
MODULE 2
STATISTICAL INFERENCE FOR TWO SAMPLES
The previous module presented hypothesis tests for a single
population parameter (the mean, the variance, or a proportion).
This module extends the concepts learned in the first module to the
case of two independent populations.
1.0 Inference for a Difference in Means of Two Normal Distributions, Variances Known
The first case covered in this module is testing hypotheses on the
difference in means $\mu_1 - \mu_2$ of two normal distributions
when both variances, $\sigma_1^2$ and $\sigma_2^2$, are known.
ASSUMPTIONS:
1. Each sample is a random sample from the population it was
drawn from.
2. The two populations are independent.
3. Both populations are normally distributed.
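In the previous module, it was mentioned that a logical estimator
for a population mean, $\mu$, is the sample mean, $\bar{X}$. Thus, a logical
estimator for the difference between two population means,
$\mu_1 - \mu_2$, is the difference between the two sample means, $\bar{X}_1 - \bar{X}_2$.
The test statistic for this type of hypothesis test is therefore:
$$z_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
where $\mu_1 - \mu_2$ takes the value stated in the null hypothesis.
Consequently, the critical regions are defined as:
Alternative Hypotheses      Critical Region
$\mu_1 - \mu_2 \neq 0$      $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$
$\mu_1 - \mu_2 > 0$         $z_0 > z_{\alpha}$
$\mu_1 - \mu_2 < 0$         $z_0 < -z_{\alpha}$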
EXAMPLE: A product developer is interested in reducing the

drying time of a primer paint. Two formulations of the paint are
tested; formulation 1 is the standard chemistry, and formulation 2
has a new drying ingredient that should reduce the drying time.
From experience, it is known that the standard deviation of drying
time is 8 minutes, and this inherent variability should be unaffected
by the addition of the new ingredient. Ten specimens are painted
with formulation 1, and another 10 specimens are painted with
formulation 2; the 20 specimens are painted in random order. The
two sample average drying times are $\bar{x}_1 = 121$ minutes and $\bar{x}_2 = 112$
minutes, respectively. What conclusions can the product
developer draw about the effectiveness of the new ingredient,
using $\alpha = 0.05$?
ANSWER:
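1. The quantity of interest is the difference in mean drying times,
$\mu_1 - \mu_2$, and $\Delta_0 = 0$.
2. $H_0: \mu_1 - \mu_2 = 0$ or $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 - \mu_2 > 0$ or $H_1: \mu_1 > \mu_2$
3. $\alpha = 0.05$
4. The test statistic is
$$z_0 = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
where $\sigma_1^2 = \sigma_2^2 = (8)^2 = 64$ and $n_1 = n_2 = 10$.
5. C.R.: $z_0 > z_{0.05} = 1.645$
6. Computation: Since $\bar{x}_1 = 121$ minutes and $\bar{x}_2 = 112$ minutes, the
test statistic is
$$z_0 = \frac{(121 - 112) - 0}{\sqrt{\dfrac{8^2}{10} + \dfrac{8^2}{10}}} = 2.52$$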
7. Result: Since z0 = 2.52 > 1.645 , we reject the null hypothesis.
8. Conclusion: Adding the new ingredient to the paint significantly

reduces the drying time.
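The drying-time computation can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `two_sample_z` and the p-value line are illustrative additions, not part of the module.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Return z0 for H0: mu1 - mu2 = delta0 when both variances are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2 - delta0) / standard_error


alpha = 0.05
z0 = two_sample_z(121, 112, 8, 8, 10, 10)

# Upper-tail critical value z_alpha for the one-sided alternative mu1 > mu2.
z_crit = NormalDist().inv_cdf(1 - alpha)

# P(Z > z0): an extra check that is not computed in the worked example.
p_value = 1 - NormalDist().cdf(z0)

reject_h0 = z0 > z_crit
print(f"z0 = {z0:.2f}, z_crit = {z_crit:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The script reproduces $z_0 \approx 2.52$ and the critical value 1.645, so the decision to reject the null hypothesis matches the hand computation.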
2.0 Inference for the Difference in Means of Two Normal Distributions, Variances Unknown
In this section, the case of unknown variances is discussed. Since
the variances of the two populations, $\sigma_1^2$ and $\sigma_2^2$, are unknown, they
are estimated by the sample variances $S_1^2$ and $S_2^2$, respectively.
For large sample sizes, the test statistic can be treated as
approximately standard normal. For small sample sizes, the
populations are still assumed to be normal, but the test statistic
follows the t-distribution.
There are two cases to be considered: the first is wherein the

unknown variances are assumed to be equal and the second is
wherein they are assumed to be unequal.
Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
In this particular case, we assume that the two variances are equal.
Since they are equal, it is reasonable to combine the two sample
variances, $S_1^2$ and $S_2^2$, to form an estimator of the common variance $\sigma^2$. The
combination of the two sample variances is known as the
pooled estimator, given by the equation:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} \quad \text{(formal equation)}$$
If we rearrange the equation, we can see that the pooled
estimator is a weighted average of the sample variances,
with weights given by each sample's share of the degrees of freedom:
$$S_p^2 = \frac{n_1 - 1}{n_1 + n_2 - 2}S_1^2 + \frac{n_2 - 1}{n_1 + n_2 - 2}S_2^2 \quad \text{(weighted ave.)}$$
EXAMPLE: Compute the pooled estimator, $S_p^2$, given the
following values: $n_1 = 12$, $n_2 = 15$, $S_1^2 = 5.0$, $S_2^2 = 5.3$. Show that
the formal and the weighted average equations give the same
answer.
ANSWER:
Formal Equation
$$S_p^2 = \frac{(12 - 1)(5.0) + (15 - 1)(5.3)}{12 + 15 - 2} = 5.17$$
Weighted Average Method
The sample sizes given are $n_1 = 12$ and $n_2 = 15$, so the total
degrees of freedom is 25 (12 + 15 - 2 = 25). The weight of sample 1 is 0.44 or
44% (11/25 = 0.44) while the weight of sample 2 is 0.56 or
56% (14/25 = 0.56). Thus, using the weighted average method:
$$S_p^2 = 0.44(5.0) + 0.56(5.3) = 5.17$$
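The agreement between the two forms of the pooled estimator can be confirmed numerically. The sketch below assumes the weights $(n_i - 1)/(n_1 + n_2 - 2)$ from the weighted-average equation; the function names are illustrative.

```python
from math import isclose


def pooled_variance(n1, n2, s1_sq, s2_sq):
    """Formal equation: weighted sum of variances over the combined degrees of freedom."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_variance_weighted(n1, n2, s1_sq, s2_sq):
    """Weighted-average form: each weight is a sample's share of the degrees of freedom."""
    w1 = (n1 - 1) / (n1 + n2 - 2)
    w2 = (n2 - 1) / (n1 + n2 - 2)
    return w1 * s1_sq + w2 * s2_sq


formal = pooled_variance(12, 15, 5.0, 5.3)
weighted = pooled_variance_weighted(12, 15, 5.0, 5.3)
print(f"formal = {formal:.3f}, weighted = {weighted:.3f}, equal: {isclose(formal, weighted)}")
```

Both functions return the same value, about 5.17, because the second is only an algebraic rearrangement of the first.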
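With the pooled estimator $S_p^2$ as the estimate of $\sigma^2$, the
test statistic for this case is given as:
$$t_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
with $n_1 + n_2 - 2$ degrees of freedom.
Consequently, the critical regions are defined as:
Alternative Hypotheses      Critical Region
$\mu_1 - \mu_2 \neq 0$      $t_0 > t_{\alpha/2,\,n_1+n_2-2}$ or $t_0 < -t_{\alpha/2,\,n_1+n_2-2}$
$\mu_1 - \mu_2 > 0$         $t_0 > t_{\alpha,\,n_1+n_2-2}$
$\mu_1 - \mu_2 < 0$         $t_0 < -t_{\alpha,\,n_1+n_2-2}$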
EXAMPLE: Two catalysts are being analyzed to determine how
they affect the mean yield of a chemical process. Specifically,
catalyst 1 is currently in use, but catalyst 2 is acceptable. Since
catalyst 2 is cheaper, it should be adopted, provided it does not
change the process yield. A test is run in the pilot plant and results
in the data shown immediately after this problem. Is there any
difference between the mean yields? Use $\alpha = 0.05$ and assume
equal variances.
Observation #    Catalyst 1              Catalyst 2
1                91.50                   89.19
2                94.18                   90.95
3                92.18                   90.46
4                95.39                   93.21
5                91.79                   97.19
6                89.07                   97.04
7                94.72                   91.07
8                89.21                   92.75
Sample Mean      $\bar{x}_1 = 92.255$    $\bar{x}_2 = 92.733$
Sample S.D.      $s_1 = 2.39$            $s_2 = 2.98$
ANSWER:
1. The parameters of interest are $\mu_1$ and $\mu_2$, the mean process yield
using catalysts 1 and 2, respectively; specifically, we want to know
if $\mu_1 - \mu_2 = 0$.
2. $H_0: \mu_1 - \mu_2 = 0$ or $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 - \mu_2 \neq 0$ or $H_1: \mu_1 \neq \mu_2$
3. $\alpha = 0.05$
4. The test statistic to be used is:
$$t_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
5. C.R.: $t_0 > t_{0.025,14} = 2.145$ or $t_0 < -t_{0.025,14} = -2.145$
6. Computation:
Compute the pooled estimator first:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(7)(2.39)^2 + (7)(2.98)^2}{8 + 8 - 2} = 7.30$$
$$s_p = \sqrt{7.30} = 2.70$$
Thus:
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(92.255 - 92.733) - 0}{2.70\sqrt{\dfrac{1}{8} + \dfrac{1}{8}}} = -0.35$$
7. Result: Since $-2.145 < t_0 = -0.35 < 2.145$, we cannot reject the null
hypothesis.
8. Conclusion: We do not have strong evidence to conclude that

catalyst 2 results in a mean yield that is different from the mean
yield when catalyst 1 is used.
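The catalyst example can be reproduced from the raw observations. The sketch below uses only the Python standard library, which has no t-distribution quantile function, so the critical value 2.145 is taken from the example rather than computed. The function name `pooled_t` is illustrative.

```python
from math import sqrt
from statistics import mean, variance

catalyst_1 = [91.50, 94.18, 92.18, 95.39, 91.79, 89.07, 94.72, 89.21]
catalyst_2 = [89.19, 90.95, 90.46, 93.21, 97.19, 97.04, 91.07, 92.75]


def pooled_t(sample1, sample2, delta0=0.0):
    """Return (t0, degrees of freedom) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    dof = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp_sq = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / dof
    t0 = (mean(sample1) - mean(sample2) - delta0) / (sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2))
    return t0, dof


T_CRIT = 2.145  # t_{0.025,14}, the tabulated value quoted in the example
t0, dof = pooled_t(catalyst_1, catalyst_2)
reject_h0 = abs(t0) > T_CRIT  # two-sided test
print(f"t0 = {t0:.2f} with {dof} degrees of freedom, reject H0: {reject_h0}")
```

The statistic comes out at about -0.35; the negative sign only reflects that the sample mean for catalyst 1 is the smaller of the two, and its magnitude is well inside the critical values.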
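Case 2: $\sigma_1^2 \neq \sigma_2^2$
In some situations, we cannot reasonably assume that
the unknown variances $\sigma_1^2$ and $\sigma_2^2$ are equal. In this situation, we
cannot combine the two sample variances to form a pooled estimator.
Thus, the test statistic uses the two sample variances separately:
$$t_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
However, the degrees of freedom are not as simple as $n_1 + n_2 - 2$. To
compute the degrees of freedom, $\nu$, in the case $\sigma_1^2 \neq \sigma_2^2$, we use
the equation:
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$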
EXAMPLE: Arsenic concentration in public drinking water
supplies is a potential health risk. An article in the Arizona
Republic (Sunday, May 27, 2001) reported drinking water arsenic
concentrations in parts per billion (ppb) for 10 metropolitan Phoenix
communities and 10 communities in rural Arizona.
Metro Phoenix            Rural Arizona
Phoenix, 3               Rimrock, 48
Chandler, 7              Goodyear, 44
Gilbert, 25              New River, 40
Glendale, 10             Apache Junction, 38
Mesa, 15                 Buckeye, 33
Paradise Valley, 6       Nogales, 21
Peoria, 12               Black Canyon City, 20
Scottsdale, 25           Sedona, 12
Tempe, 15                Payson, 1
Sun City, 7              Casa Grande, 18
$\bar{x}_1 = 12.5$       $\bar{x}_2 = 27.5$
$s_1 = 7.63$             $s_2 = 15.3$
We wish to determine if there is any difference in mean arsenic

concentrations between metropolitan Phoenix communities and
communities in rural Arizona. Assume that the population
variances are not equal.
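1. The parameters of interest are the mean arsenic concentrations
for the two geographic regions, $\mu_1$ and $\mu_2$; furthermore, we are interested in
determining whether $\mu_1 - \mu_2 = 0$.
2. $H_0: \mu_1 - \mu_2 = 0$ or $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 - \mu_2 \neq 0$ or $H_1: \mu_1 \neq \mu_2$
3. $\alpha = 0.05$
4. The test statistic to be used is:
$$t_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
5. To compute the critical region (C.R.), we first need to
compute the degrees of freedom, $\nu$:
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left[\dfrac{(7.63)^2}{10} + \dfrac{(15.3)^2}{10}\right]^2}{\dfrac{\left[(7.63)^2/10\right]^2}{9} + \dfrac{\left[(15.3)^2/10\right]^2}{9}} = 13.2 \approx 13$$
Thus, C.R.: $t_0 > t_{0.025,13} = 2.160$ or $t_0 < -t_{0.025,13} = -2.160$
6. Computation:
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(12.5 - 27.5) - 0}{\sqrt{\dfrac{(7.63)^2}{10} + \dfrac{(15.3)^2}{10}}} = -2.77$$
7. Result: Since $t_0 = -2.77 < -2.160$, we reject the null hypothesis.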
8. Conclusion: There is sufficient evidence to conclude that mean

arsenic concentration is different between the two geographic
locations.
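The arsenic example, including the degrees-of-freedom formula, can be reproduced from the listed concentrations. This is a minimal standard-library sketch; the critical value 2.160 is taken from the example rather than computed, and the function name `welch_t` is illustrative.

```python
from math import sqrt
from statistics import mean, variance

metro_phoenix = [3, 7, 25, 10, 15, 6, 12, 25, 15, 7]
rural_arizona = [48, 44, 40, 38, 33, 21, 20, 12, 1, 18]


def welch_t(sample1, sample2, delta0=0.0):
    """Return (t0, nu) for the unequal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # s1^2 / n1
    v2 = variance(sample2) / n2  # s2^2 / n2
    t0 = (mean(sample1) - mean(sample2) - delta0) / sqrt(v1 + v2)
    # Approximate degrees of freedom from the equation in this section.
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, nu


T_CRIT = 2.160  # t_{0.025,13}, the tabulated value quoted in the example
t0, nu = welch_t(metro_phoenix, rural_arizona)
dof = int(nu)  # truncate to a whole number of degrees of freedom, as in the example
reject_h0 = abs(t0) > T_CRIT  # two-sided test
print(f"t0 = {t0:.2f}, nu = {nu:.1f} (use {dof}), reject H0: {reject_h0}")
```

The script gives $\nu \approx 13.2$ and a statistic of about -2.77, matching the hand computation; the statistic is negative because the metropolitan sample mean is the smaller one.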
3.0 Paired t-Test
A special case of the two-sample t-tests occurs when the

observations on the two populations of interest are collected in
pairs. In this situation we test if the mean of the differences is
equal to zero or not.
NOTE: For the paired t-test, we test if the mean of the differences
is equal to zero. For the previous cases in this module, we tested
whether the difference of the means is equal to zero.
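Thus, the null and alternative hypotheses may be written as:
$H_0: \mu_D = 0$
$H_1: \mu_D \neq 0$
with the test statistic being:
$$t_0 = \frac{\bar{d} - \Delta_0}{S_d/\sqrt{n}}$$
where $\bar{d}$ and $S_d$ are the sample mean and sample standard deviation
of the $n$ differences, and $\Delta_0 = 0$ is the hypothesized mean difference.
The statistic has $n - 1$ degrees of freedom.
Consequently, the critical regions are defined as:
Alternative Hypotheses    Critical Region
$\mu_D \neq 0$            $t_0 > t_{\alpha/2,\,n-1}$ or $t_0 < -t_{\alpha/2,\,n-1}$
$\mu_D > 0$               $t_0 > t_{\alpha,\,n-1}$
$\mu_D < 0$               $t_0 < -t_{\alpha,\,n-1}$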
EXAMPLE: An article in the Journal of Strain Analysis (1983,

Vol. 18, No. 2), compares several methods for predicting the shear
strength for steel plate girders. Data for two of these methods, the
Karlsruhe and Lehigh procedures, when applied to nine specific
girders, are shown in the table below. Determine whether there is
any difference (on the average) between the two methods.
Girder    Karlsruhe Method    Lehigh Method    Difference, $d_j$
S1/1      1.186               1.067            0.119
S2/1      1.151               0.992            0.159
S3/1      1.322               1.063            0.259
S4/1      1.339               1.062            0.277
S5/1      1.200               1.062            0.138
S6/2      1.402               1.178            0.224
S2/2      1.365               1.037            0.328
S2/3      1.537               1.086            0.451
S2/4      1.559               1.052            0.507
ANSWER:
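1. The parameter of interest is the difference in mean shear
strength between the two methods, $\mu_D = \mu_1 - \mu_2$.
2. $H_0: \mu_D = 0$
$H_1: \mu_D \neq 0$
3. $\alpha = 0.05$
4. The test statistic is:
$$t_0 = \frac{\bar{d} - \Delta_0}{S_d/\sqrt{n}}$$
with $\Delta_0 = 0$.
5. C.R.: $t_0 > t_{0.025,8} = 2.306$ or $t_0 < -t_{0.025,8} = -2.306$
6. Computation:
$$t_0 = \frac{\bar{d} - \Delta_0}{s_d/\sqrt{n}} = \frac{0.2736}{0.1356/\sqrt{9}} = 6.05$$
7. Result: Since $t_0 = 6.05 > 2.306$, we reject $H_0$.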
8. Conclusion: The two strength prediction methods yield different

results.
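The paired computation can be checked directly from the girder table. The sketch below uses only the Python standard library; the critical value 2.306 is the tabulated value from the example, and the function name `paired_t` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

karlsruhe = [1.186, 1.151, 1.322, 1.339, 1.200, 1.402, 1.365, 1.537, 1.559]
lehigh = [1.067, 0.992, 1.063, 1.062, 1.062, 1.178, 1.037, 1.086, 1.052]


def paired_t(first, second, delta0=0.0):
    """Return (t0, d_bar, s_d) for the paired t-test on first - second."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    d_bar = mean(differences)  # mean of the differences
    s_d = stdev(differences)   # sample standard deviation of the differences
    t0 = (d_bar - delta0) / (s_d / sqrt(n))
    return t0, d_bar, s_d


T_CRIT = 2.306  # t_{0.025,8}, the tabulated value quoted in the example
t0, d_bar, s_d = paired_t(karlsruhe, lehigh)
reject_h0 = abs(t0) > T_CRIT  # two-sided test
print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t0 = {t0:.2f}, reject H0: {reject_h0}")
```

Note that the test works on the nine differences as a single sample, which is why the degrees of freedom are $n - 1 = 8$ rather than $n_1 + n_2 - 2$.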
4.0 Inferences on the Variances of Two Normal Populations
We now introduce tests for two population variances.
Suppose that two independent normal populations are of interest,
where the population means and variances are unknown. We wish
to test hypotheses about the equality of the two variances, say
$H_0: \sigma_1^2 = \sigma_2^2$. Assume that two random samples of size $n_1$ from
population 1 and $n_2$ from population 2 are available, and let $S_1^2$ and
$S_2^2$ be the sample variances. We wish to test the hypotheses
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
To test the hypotheses above, we use the statistic
$$f_0 = \frac{s_1^2}{s_2^2}$$
which, when $H_0$ is true, has an F-distribution with $n_1 - 1$ numerator degrees of
freedom and $n_2 - 1$ denominator degrees of freedom.
Alternative Hypotheses                Critical Region
$H_1: \sigma_1^2 \neq \sigma_2^2$     $f_0 > f_{\alpha/2,\,n_1-1,\,n_2-1}$ or $f_0 < f_{1-\alpha/2,\,n_1-1,\,n_2-1}$
$H_1: \sigma_1^2 > \sigma_2^2$        $f_0 > f_{\alpha,\,n_1-1,\,n_2-1}$
$H_1: \sigma_1^2 < \sigma_2^2$        $f_0 < f_{1-\alpha,\,n_1-1,\,n_2-1}$
EXAMPLE: Oxide layers on semiconductor wafers are etched in
a mixture of gases to achieve the proper thickness. The variability
in the thickness of these oxide layers is a critical characteristic of
the wafer, and low variability is desirable for subsequent
processing steps. Two different mixtures of gases are being studied
to determine whether one is superior in reducing the variability of
the oxide thickness. Twenty wafers are etched in each gas. The
sample standard deviations of oxide thickness are $s_1 = 1.96$
angstroms and $s_2 = 2.13$ angstroms, respectively. Is there any
evidence to indicate that either gas is preferable? Use $\alpha = 0.05$.
ANSWER:
1. The parameters of interest are the variances of oxide thickness,
$\sigma_1^2$ and $\sigma_2^2$. We will assume that oxide thickness is a normal
random variable for both gas mixtures.
2. $H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
3. $\alpha = 0.05$
4. Test statistic:
$$f_0 = \frac{s_1^2}{s_2^2}$$
5. C.R.:
$f_0 > f_{0.025,19,19} = 2.53$ or $f_0 < f_{0.975,19,19} = \dfrac{1}{f_{0.025,19,19}} = 0.40$
6. Computation:
$$f_0 = \frac{1.96^2}{2.13^2} = \frac{3.84}{4.54} = 0.85$$
7. Result: Since $f_{0.975,19,19} = 0.40 < f_0 = 0.85 < f_{0.025,19,19} = 2.53$, we cannot reject the null
hypothesis.
8. Conclusion: There is no strong evidence to indicate that either

gas results in a smaller variance of oxide thickness.
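The variance-ratio computation is short enough to verify in a few lines. The Python standard library has no F-distribution quantile function, so this sketch takes the upper critical value 2.53 from the example and derives the lower one as its reciprocal, which is valid here because both samples have the same degrees of freedom. The function name `variance_ratio` is illustrative.

```python
def variance_ratio(s1, s2):
    """Return f0 = s1^2 / s2^2 from two sample standard deviations."""
    return s1**2 / s2**2


F_UPPER = 2.53         # f_{0.025,19,19}, the tabulated value quoted in the example
F_LOWER = 1 / F_UPPER  # f_{0.975,19,19} = 1 / f_{0.025,19,19}

f0 = variance_ratio(1.96, 2.13)
reject_h0 = f0 > F_UPPER or f0 < F_LOWER  # two-sided test
print(f"f0 = {f0:.2f}, lower = {F_LOWER:.2f}, upper = {F_UPPER:.2f}, reject H0: {reject_h0}")
```

Because $f_0 \approx 0.85$ lies between the two critical values, the script agrees with the decision not to reject the null hypothesis.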
5.0 Large-Sample Test on Two Population Proportions
We now consider the case where there are two binomial
parameters of interest, say, $p_1$ and $p_2$, and we wish to perform a
large-sample test of hypothesis to draw inferences about these
proportions. We will only be dealing with large-sample hypothesis
testing, since the normal distribution can be used to
approximate the binomial distribution for large sample sizes, n.
Suppose that two independent random samples of sizes $n_1$ and $n_2$
are taken from two populations. Furthermore, suppose that the
estimators of the population proportions, $\hat{P}_1 = X_1/n_1$ and
$\hat{P}_2 = X_2/n_2$, have approximately normal distributions. We are
interested in testing the hypotheses:
$H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$
using the test statistic:
$$Z = \frac{(\hat{P}_1 - \hat{P}_2) - (p_1 - p_2)}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where
$$\hat{P} = \frac{X_1 + X_2}{n_1 + n_2}$$
Alternative Hypotheses    Critical Region
$H_1: p_1 \neq p_2$       $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$
$H_1: p_1 > p_2$          $z_0 > z_{\alpha}$
$H_1: p_1 < p_2$          $z_0 < -z_{\alpha}$
EXAMPLE: Extracts of St. John's Wort are widely used to treat
depression. An article in the April 18, 2001 issue of the Journal of
the American Medical Association compared the efficacy of a
standard extract of St. John's Wort with a placebo in 200
outpatients diagnosed with major depression. Patients were
randomly assigned to two groups; one group received the St.
John's Wort, and the other received the placebo. After eight weeks,
19 of the placebo-treated patients showed improvement, whereas
27 of those treated with St. John's Wort improved. Is there any
reason to believe that St. John's Wort is effective in treating major
depression? Use $\alpha = 0.05$.
ANSWER:
1. The parameters of interest are $p_1$ and $p_2$, the proportions of
patients who improved following treatment with St. John's Wort
($p_1$) or the placebo ($p_2$).
2. $H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$
3. $\alpha = 0.05$
4. The test statistic is:
$$z_0 = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
5. C.R.: $z_0 > z_{0.025} = 1.96$ or $z_0 < -z_{0.025} = -1.96$
6. Computation: With $\hat{p}_1 = 27/100 = 0.27$, $\hat{p}_2 = 19/100 = 0.19$, and
$\hat{p} = (27 + 19)/(100 + 100) = 0.23$:
$$z_0 = \frac{(0.27 - 0.19) - 0}{\sqrt{0.23(0.77)\left(\dfrac{1}{100} + \dfrac{1}{100}\right)}} = 1.34$$
7. Result: Since $-1.96 < z_0 = 1.34 < 1.96$, we fail to reject the null
hypothesis.
8. Conclusion: There is insufficient evidence to support the claim
that St. John's Wort is effective in treating major depression.
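The two-proportion test can be reproduced from the counts in the problem. This is a minimal standard-library sketch that assumes 100 patients in each group, as in the computation in step 6; the function name `two_proportion_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return z0 for H0: p1 = p2 using the pooled proportion estimate."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error


alpha = 0.05
z0 = two_proportion_z(27, 100, 19, 100)       # St. John's Wort vs. placebo
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2} for the two-sided test
reject_h0 = abs(z0) > z_crit
print(f"z0 = {z0:.2f}, z_crit = {z_crit:.2f}, reject H0: {reject_h0}")
```

The script returns a statistic of roughly 1.34, in line with step 6 up to rounding, and a critical value of 1.96, so the null hypothesis is not rejected.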