Testing of Hypothesis
A hypothesis is an assumption or quantitative statement about a population parameter, which may be true or false. To reach a proper decision about such a statement, the technique of hypothesis testing is used. The procedure of testing the reliability or validity of a hypothesis by using a sample statistic is called testing of hypothesis, statistical hypothesis testing, or a test of significance.
For example: Suppose a manufacturer of light bulbs claims that the mean life of its product is 3000 hours. Using a sample of, say, 500 bulbs selected at random, a consumer or decision maker can test the manufacturer's claim by calculating the sample mean life, that is, how many hours the bulbs last on average. If the sample gives an average life of 2900 or 3100 hours, the decision maker will accept the manufacturer's claim. If the sample gives an average life of 1800 hours, the decision maker will reject the claim. This example shows that acceptance or rejection depends on the gap between the sample statistic and the population parameter.
Types of Hypothesis
Basically, there are two types of hypothesis:
1) Null hypothesis
2) Alternative hypothesis
1) Null Hypothesis
It is a hypothesis of no difference, which means that there is no significant difference between the sample statistic and the population parameter. In other words, if the difference between the true and expected value is set to zero, the hypothesis is called the null hypothesis. It is denoted by $H_0$. For example, if the population mean ($\mu$) has a specified value $\mu_0$, then we set up the null hypothesis as
$H_0: \mu = \mu_0$
($\mu$ = true value, $\mu_0$ = expected value)
If the manufacturer claims that the average life of a bulb is 3000 hours, then the null hypothesis is set up as $H_0: \mu = 3000$ hours.
2) Alternative Hypothesis
If the decision maker rejects the null hypothesis on the basis of sample information, he/she should accept another hypothesis which is complementary to the null hypothesis, known as the alternative hypothesis. In other words, if the difference between the true and expected value is not equal to zero, the hypothesis is called the alternative hypothesis. It is denoted by $H_1$.
If the null hypothesis is set up as $H_0: \mu = \mu_0$, then the alternative hypothesis may be any one of the following:
i) $H_1: \mu \neq \mu_0$, i.e. there is a significant difference between the sample statistic and the population parameter (two-tailed test).
ii) $H_1: \mu > \mu_0$, i.e. the population mean is greater than $\mu_0$ (right-tailed test).
iii) $H_1: \mu < \mu_0$, i.e. the population mean is less than $\mu_0$ (left-tailed test).
Decision        $H_0$ true       $H_0$ false
Accept $H_0$    No error         Type II error
Reject $H_0$    Type I error     No error
Type I Error
The error committed in rejecting $H_0$ when $H_0$ is true is called type I error. The probability of committing a type I error is called the size of the test or size of the critical region and is denoted by $\alpha$.
$\alpha$ = Prob.{type I error} = Prob.{Reject $H_0$ | $H_0$ is true}
Type II Error
The error committed in accepting $H_0$ when $H_0$ is false is called type II error. The probability of committing a type II error is denoted by $\beta$.
$\beta$ = Prob.{type II error} = Prob.{Accept $H_0$ | $H_0$ is false}
[Suppose we are going to buy a lot of 1000 apples, of which 50 are sour and the rest are sweet. Before buying, 20 apples are selected as a sample and tasted. By chance, all 20 selected apples are sour, so we drop the idea of buying the apples. Here, not buying the apples is the wrong decision. Such an erroneous decision is a type I error, as we are rejecting a true hypothesis, i.e. rejecting sweet apples.
Similarly, suppose 50 apples out of 1000 are sweet and the rest are sour. 20 apples are selected as a sample and tasted. By chance, all 20 selected apples are sweet, so we decide to buy the apples. Here, buying the apples is the wrong decision. Such an erroneous decision is a type II error, as we are accepting a false hypothesis, i.e. accepting sour apples.
If the hypothesis is true and we reject it, no great harm is done because we can wait for the next lot; this type of error simply leads to an opportunity loss. But if the hypothesis is false and we accept it, the result may be very harmful. A type II error is a very undesirable result and has a direct impact on the decision maker. In this sense, a type II error is more harmful than a type I error.]
Level of Significance
The maximum probability of committing a type I error is called the level of significance. In other words, the probability of rejecting a true null hypothesis is called the level of significance. It is denoted by $\alpha$.
$\alpha$ = Prob.{type I error} = Prob.{Rejecting $H_0$ when $H_0$ is true}
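As a rough numerical illustration of the two error probabilities, the sketch below evaluates $\alpha$ and $\beta$ for a two-tailed Z test of the light-bulb claim (mean life 3000 hours, sample of 500 bulbs). The population standard deviation of 300 hours and the alternative true mean of 2950 hours are assumptions chosen only for illustration; they do not come from the text, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution N(0, 1)

def error_probabilities(mu0, mu_true, sigma, n, alpha=0.05):
    """Return (alpha, beta) for a two-tailed Z test of H0: mu = mu0."""
    z_crit = std_normal.inv_cdf(1 - alpha / 2)    # two-tailed critical value
    shift = (mu_true - mu0) / (sigma / sqrt(n))   # mean of Z when mu = mu_true
    # beta = P(accept H0 | mu = mu_true) = P(-z_crit <= Z <= z_crit)
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return alpha, beta

# Hypothetical inputs: sigma = 300 and mu_true = 2950 are NOT from the text.
alpha, beta = error_probabilities(mu0=3000, mu_true=2950, sigma=300, n=500)
print(f"alpha = {alpha:.2f}, beta = {beta:.4f}")
```

When the true mean equals the hypothesised mean, the same function returns $\beta = 1 - \alpha$, which is simply the probability of correctly accepting a true $H_0$.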
The critical value of Z for a one-tailed test at level of significance $\alpha$ is the same as the critical value of Z for a two-tailed test at level of significance $2\alpha$. For example, the critical value Z = 1.645 applies to a one-tailed test at $\alpha$ = 5% and to a two-tailed test at $\alpha$ = 10%. The critical values of Z at the common levels of significance are tabulated below.
Critical Values ($Z_\alpha$) of Z
Critical values ($Z_\alpha$)    Level of significance ($\alpha$)
                        1%             5%             10%
Two-tailed test         |Z| = 2.58     |Z| = 1.96     |Z| = 1.645
Right-tailed test       Z = 2.33       Z = 1.645      Z = 1.28
Left-tailed test        Z = -2.33      Z = -1.645     Z = -1.28
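The tabulated critical values can be reproduced from the inverse of the standard normal distribution function. The following is a minimal sketch using Python's standard library; the helper name `critical_value` and its `tail` argument are illustrative choices, not notation from the text.

```python
from statistics import NormalDist

def critical_value(alpha, tail="two"):
    """Critical value of Z at level of significance alpha."""
    std_normal = NormalDist()  # N(0, 1)
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)  # compare with |Z|
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")

for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}:",
          f"two-tailed {critical_value(alpha, 'two'):.3f},",
          f"right-tailed {critical_value(alpha, 'right'):.3f},",
          f"left-tailed {critical_value(alpha, 'left'):.3f}")
```

The one-tailed value at $\alpha$ coincides with the two-tailed value at $2\alpha$ because both leave an area of $\alpha$ in a single tail.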
5. Draw Conclusion:
Draw the conclusion by comparing the calculated value and the tabulated value of the test statistic as follows:
If the absolute calculated value is less than or equal to the tabulated value at the chosen level of significance, the null hypothesis is accepted, which means that there is no significant difference between the sample statistic and the population parameter.
If the absolute calculated value is greater than the tabulated value at the chosen level of significance, the null hypothesis is rejected, which means that there is a significant difference between the sample statistic and the population parameter.
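$$Z = \frac{\bar{X} - \mu}{S.E.(\bar{X})} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
[calculated value of Z]
$Z \sim N(0,1)$, i.e. Z follows the normal distribution with mean 0 and standard deviation 1.
If $\sigma$ is unknown, we estimate it from the sample standard deviation s, i.e. $\hat{\sigma} = s$ for a large sample, and
$$Z = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$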
Step 4. Select the level of significance $\alpha$. The most commonly used level is $\alpha$ = 5% unless stated otherwise. Decide whether a one-tailed or two-tailed test is to be applied.
Step 5. Obtain the tabulated (critical) value of Z according to whether the alternative hypothesis is one-tailed or two-tailed.
Step 6. Draw the conclusion by comparing the calculated and tabulated values of Z.
If |calculated value of Z| $\leq$ tabulated value of Z, then $H_0$ is accepted. Thus, we may conclude that there is no significant difference between the sample mean ($\bar{X}$) and the population mean ($\mu$), or that the sample has been drawn from the given population.
If |calculated value of Z| > tabulated value of Z, then $H_0$ is rejected. Thus, we may conclude that there is a significant difference between the sample mean ($\bar{X}$) and the population mean ($\mu$), or that the sample has not been drawn from the given population.
Technique for identifying a one-tailed or two-tailed test in the test of significance for a single mean:
There is no hard and fast rule for identifying one-tailed and two-tailed tests. The type of test can be identified from the nature of the problem, i.e. whether the direction of the difference is specified or not.
If the direction is not specified, a two-tailed test is used. If the direction is specified, a one-tailed test is used. The direction of the difference is specified by words like less than, more than, lower than, smaller than, higher than, low, high, at least, at most, only, increase, decrease, etc.; in such cases a one-tailed test can be applied. The left-tailed and right-tailed tests can be identified in the following way:
i) If sample mean ($\bar{X}$) < population mean ($\mu$), we use the left-tailed test.
ii) If sample mean ($\bar{X}$) > population mean ($\mu$), we use the right-tailed test.
Eg.1: A sample of 400 male students is found to have a mean height of 171.38 cm. Can it be reasonably regarded as a sample from a large population with mean height 171.17 cm and standard deviation 3.30 cm?
Solution: Here, sample size (n) = 400, sample mean ($\bar{X}$) = 171.38 cm, population mean ($\mu$) = 171.17 cm, standard deviation ($\sigma$) = 3.30 cm
Step 1: Null hypothesis, $H_0: \mu = 171.17$ cm, i.e. the sample is from a population with mean height 171.17 cm and s.d. 3.30 cm.
Step 2: Alternative hypothesis, $H_1: \mu \neq 171.17$ cm, i.e. the sample is not from a large population with mean height 171.17 cm and standard deviation 3.30 cm.
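Step 3: Test statistic. Under $H_0$, the test statistic is
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{171.38 - 171.17}{3.30/\sqrt{400}} = \frac{0.21}{3.30/20} = \frac{0.21 \times 20}{3.30} = \frac{4.2}{3.30} = 1.27$$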
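The calculation in Eg.1 and the remaining steps (choice of level, critical value, conclusion) can be sketched in code as follows. The function name and return values are illustrative, and the 5% level with a two-tailed test is an assumption based on the default level stated in Step 4, since the example itself does not state a level.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Large-sample Z test for a single mean; returns (z, z_tab, accept_h0)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))          # calculated value of Z
    std_normal = NormalDist()
    if tail == "two":
        z_tab = std_normal.inv_cdf(1 - alpha / 2)  # tabulated (critical) value
        accept_h0 = abs(z) <= z_tab
    elif tail == "right":
        z_tab = std_normal.inv_cdf(1 - alpha)
        accept_h0 = z <= z_tab
    elif tail == "left":
        z_tab = std_normal.inv_cdf(alpha)
        accept_h0 = z >= z_tab
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, z_tab, accept_h0

# Figures of Eg.1; alpha = 0.05 is an assumed level of significance.
z, z_tab, accept_h0 = one_sample_z_test(171.38, 171.17, 3.30, 400)
print(f"Z = {z:.2f}, tabulated Z = {z_tab:.2f}, accept H0: {accept_h0}")
```

Under that assumption the calculated value (about 1.27) is below the two-tailed critical value 1.96, so $H_0$ is accepted and the sample can be regarded as drawn from the stated population.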
Solution,
Here, population mean ($\mu$) = 2000, sample mean ($\bar{X}$) = 2100, standard deviation ($\sigma$) = 150, sample size (n) = 100
Step 1: Null hypothesis, $H_0: \mu = 2000$, i.e. the mean breaking strength of the cables supplied by the manufacturer is 2000.
Step 2: Alternative hypothesis, $H_1: \mu > 2000$, i.e. the mean breaking strength of the cables supplied by the manufacturer has increased.
Step 3: Test statistic. Under $H_0$, the test statistic is
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{2100 - 2000}{150/\sqrt{100}} = \frac{100}{150/10} = \frac{100}{15} = 6.67$$
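A short check of this right-tailed calculation is sketched below. The 5% level of significance is an assumption (the default suggested in Step 4); the other figures are those given in the solution.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, sigma, n = 2000, 2100, 150, 100  # figures given in the solution
alpha = 0.05                                 # assumed level of significance

z_cal = (x_bar - mu0) / (sigma / sqrt(n))    # calculated value of Z
z_tab = NormalDist().inv_cdf(1 - alpha)      # right-tailed critical value
reject_h0 = z_cal > z_tab

print(f"Z = {z_cal:.2f}, tabulated Z = {z_tab:.3f}, reject H0: {reject_h0}")
```

Since 6.67 is far above the right-tailed critical value 1.645, $H_0$ would be rejected at the assumed 5% level, supporting the claim that the mean breaking strength has increased.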
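Consider two independent large samples of sizes $n_1$ and $n_2$ drawn from distributions with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, and let $\bar{X}_1$ and $\bar{X}_2$ be the sample means. If
$$\bar{X}_1 \sim N\left(\mu_1, \frac{\sigma_1^2}{n_1}\right) \quad \text{and} \quad \bar{X}_2 \sim N\left(\mu_2, \frac{\sigma_2^2}{n_2}\right),$$
then $\bar{X}_1 - \bar{X}_2$ is normally distributed with mean $\mu_1 - \mu_2$ and variance $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$, i.e.
$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
then
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S.E.(\bar{X}_1 - \bar{X}_2)} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
($\mu_1 - \mu_2 = 0$ as $\mu_1 = \mu_2$)
$Z \sim N(0,1)$, i.e. Z follows the normal distribution with mean 0 and standard deviation 1.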
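Here, the population variances $\sigma_1^2$ and $\sigma_2^2$ are assumed to be known. If they are unknown, their estimates provided by the corresponding sample variances $s_1^2$ and $s_2^2$ are used, i.e. $\hat{\sigma}_1^2 = s_1^2$ and $\hat{\sigma}_2^2 = s_2^2$.
Hence the test statistic becomes
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim N(0,1)$$
If both populations have a common variance $\sigma^2$, the test statistic is
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$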
If the common variance $\sigma^2$ is unknown, it is estimated by the combined sample variance, i.e.
$$\hat{\sigma}^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$$
Step 4. Select the level of significance $\alpha$. The most commonly used level is $\alpha$ = 5% unless stated otherwise. Decide whether a one-tailed or two-tailed test is to be applied.
Step 5. Obtain the tabulated (critical) value of Z according to whether the alternative hypothesis is one-tailed or two-tailed.
Step 6. Draw the conclusion by comparing the calculated and tabulated values of Z.
If |calculated value of Z| $\leq$ tabulated value of Z, then $H_0$ is accepted. Thus, we may conclude that there is no significant difference between the two sample means, or that the samples have been drawn from the same parent population.
If |calculated value of Z| > tabulated value of Z, then $H_0$ is rejected. Thus, we may conclude that there is a significant difference between the two sample means, or that the samples have not been drawn from the same parent population.
Eg.3: In a certain factory there are two independent processes manufacturing the same items. The average weight in a sample of 250
items produced from one process is found to be 120 grammes with a standard deviation of 12 grammes, while the corresponding
figures in a sample of 400 items from the other process are 124 and 14. Find the standard error of the difference of means and also
test whether the two mean weights differ significantly or not at 10 percent level of significance.
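Solution:
With the usual notation, we have $n_1$ = 250, $\bar{X}_1$ = 120 gm, $s_1$ = 12 gm, $n_2$ = 400, $\bar{X}_2$ = 124 gm, $s_2$ = 14 gm.
The standard error of the difference of means is given by
$$S.E.(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{12^2}{250} + \frac{14^2}{400}} = \sqrt{0.576 + 0.49} = \sqrt{1.066} = 1.032$$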
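Step 1. Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. the two mean weights do not differ significantly.
Step 2. Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$, i.e. the two mean weights differ significantly.
Step 3. Test statistic: Under the null hypothesis $H_0: \mu_1 = \mu_2$, when the sample sizes are large, the test statistic Z for the difference of means ($\bar{X}_1 - \bar{X}_2$) becomes
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{120 - 124}{\sqrt{\frac{12^2}{250} + \frac{14^2}{400}}} = \frac{-4}{\sqrt{0.576 + 0.49}} = \frac{-4}{\sqrt{1.066}} = \frac{-4}{1.032} = -3.88$$
[$\hat{\sigma}_1^2 = s_1^2$ and $\hat{\sigma}_2^2 = s_2^2$]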
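The standard error and the test statistic of Eg.3 can be checked with the short sketch below, which also carries the test through to a decision at the 10% level stated in the problem. The function name is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, x2, s1, s2, n1, n2):
    """Return (standard error, Z) for the difference of two large-sample means."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)  # sample variances estimate sigma^2
    return se, (x1 - x2) / se

se, z_cal = two_sample_z(120, 124, 12, 14, 250, 400)
z_tab = NormalDist().inv_cdf(1 - 0.10 / 2)  # two-tailed, 10% level
reject_h0 = abs(z_cal) > z_tab

print(f"S.E. = {se:.3f}, Z = {z_cal:.2f}, "
      f"tabulated Z = {z_tab:.3f}, reject H0: {reject_h0}")
```

Without intermediate rounding the statistic is about -3.87 rather than -3.88; either way |Z| exceeds the two-tailed critical value 1.645, so $H_0$ is rejected and the two mean weights differ significantly at the 10% level.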
Eg.4: The means of two samples of 1000 and 2000 individuals are 67.5 inches and 68 inches respectively. Can the samples be regarded as drawn from the same population with standard deviation 2.5 inches?
Solution:
With the usual notation, we have $n_1$ = 1000, $\bar{X}_1$ = 67.5 inches, $n_2$ = 2000, $\bar{X}_2$ = 68 inches.
Common population standard deviation $\sigma$ = 2.5 inches.
Step 1. Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. both samples are drawn from the same population.
Step 2. Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$, i.e. the samples are drawn from different populations.
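Step 3. Test statistic: Under the null hypothesis $H_0: \mu_1 = \mu_2$, the test statistic is given by
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{67.5 - 68}{\sqrt{2.5^2\left(\frac{1}{1000} + \frac{1}{2000}\right)}} = \frac{-0.5}{\sqrt{6.25 \times 0.0015}} = \frac{-0.5}{\sqrt{0.009375}} = \frac{-0.5}{0.09682} = -5.16$$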
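The arithmetic of Eg.4, where the common population standard deviation is known, can be verified with the sketch below. The 5% level of significance is an assumption (the default suggested in Step 4), since the example does not state one.

```python
from math import sqrt
from statistics import NormalDist

n1, x1 = 1000, 67.5   # first sample: size and mean (inches)
n2, x2 = 2000, 68.0   # second sample: size and mean (inches)
sigma = 2.5           # common population standard deviation
alpha = 0.05          # assumed level of significance

se = sigma * sqrt(1 / n1 + 1 / n2)           # S.E. of the difference of means
z_cal = (x1 - x2) / se                       # calculated value of Z
z_tab = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject_h0 = abs(z_cal) > z_tab

print(f"S.E. = {se:.5f}, Z = {z_cal:.2f}, reject H0: {reject_h0}")
```

With |Z| of about 5.16 against a critical value of 1.96, $H_0$ would be rejected at the assumed level, i.e. the samples cannot be regarded as drawn from the same population.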
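Step 1. Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. there is no significant difference between male and female salaries.
Step 2. Alternative hypothesis: $H_1: \mu_1 > \mu_2$, i.e. female graduates earn less than male graduates.
Step 3. Test statistic: Under the null hypothesis $H_0: \mu_1 = \mu_2$, when the sample sizes are large, the test statistic Z for the difference of means ($\bar{X}_1 - \bar{X}_2$) becomes
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{30000 - 29500}{\sqrt{\frac{600^2}{60} + \frac{500^2}{50}}} = \frac{500}{\sqrt{6000 + 5000}} = \frac{500}{\sqrt{11000}} = \frac{500}{104.88} = 4.77$$
[$\hat{\sigma}_1^2 = s_1^2$ and $\hat{\sigma}_2^2 = s_2^2$]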
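As a check on this right-tailed calculation, the sketch below recomputes the statistic from the figures appearing in the working (means 30000 and 29500, standard deviations 600 and 500, sample sizes 60 and 50). Treating the first sample as the male graduates follows from the stated alternative hypothesis; the 5% level is an assumption.

```python
from math import sqrt
from statistics import NormalDist

x1, s1, n1 = 30000, 600, 60   # first sample (male graduates, per H1)
x2, s2, n2 = 29500, 500, 50   # second sample (female graduates, per H1)
alpha = 0.05                  # assumed level of significance

se = sqrt(s1**2 / n1 + s2**2 / n2)       # sqrt(6000 + 5000)
z_cal = (x1 - x2) / se                   # calculated value of Z
z_tab = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value
reject_h0 = z_cal > z_tab

print(f"S.E. = {se:.2f}, Z = {z_cal:.2f}, reject H0: {reject_h0}")
```

Since 4.77 exceeds the right-tailed critical value 1.645, $H_0$ would be rejected at the assumed 5% level.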
Sampling of Attributes
If quantitative measurements such as height, weight, income, expenditure, etc. are taken on the sampling units in order to relate the sample mean to the population mean, it is called sampling of variables. If qualitative characteristics such as honesty, intelligence, poverty, literacy, etc. are observed on the sampling units in order to relate the sample proportion to the population proportion, it is called sampling of attributes. In sampling of attributes, the given population is divided into two mutually disjoint classes: one possesses the particular attribute under study and is termed success, and the other does not possess the attribute and is termed failure. Therefore, one may be interested in testing a hypothesis about the proportion of units possessing or not possessing a certain attribute in the population.
$$Z = \frac{p - E(p)}{S.E.(p)} = \frac{p - P}{\sqrt{\frac{PQ}{n}}}$$
where p = observed sample proportion of success = X/n, X = number of successes relating to the given attribute, n = sample size, E(p) = expected value of the sample proportion with E(p) = P, P = population proportion of success, and Q = population proportion of failure such that P + Q = 1.
Hence, for large samples, the test statistic is
$$Z = \frac{p - E(p)}{S.E.(p)} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} \sim N(0,1)$$
For a finite population of size N,
$$S.E.(p) = \sqrt{\frac{N - n}{N - 1} \cdot \frac{PQ}{n}}, \text{ if the population proportion } P \text{ is known.}$$
If P is unknown, then for large samples its estimate provided by the sample proportion p is used, and the unbiased estimate of S.E.(p) is given by
$$Est[S.E.(p)] = \sqrt{\frac{N - n}{N - 1} \cdot \frac{pq}{n}} \quad [\hat{P} = p \text{ for large samples}]$$
$$Z = \frac{p - E(p)}{S.E.(p)} = \frac{p - P}{\sqrt{\frac{PQ}{n}}}$$
[calculated value of Z]
$Z \sim N(0,1)$, i.e. Z follows the normal distribution with mean 0 and standard deviation 1.
Step 4. Select the level of significance $\alpha$. The most commonly used level is $\alpha$ = 5% unless stated otherwise. Decide whether a one-tailed or two-tailed test is to be applied.
Step 5. Obtain the tabulated (critical) value of Z according to whether the alternative hypothesis is one-tailed or two-tailed.
Step 6. Draw the conclusion by comparing the calculated and tabulated values of Z.
If |calculated value of Z| $\leq$ tabulated value of Z, then $H_0$ is accepted. Thus, we may conclude that there is no significant difference between the sample proportion (p) and the population proportion (P), or that the sample has been drawn from the given population.
If |calculated value of Z| > tabulated value of Z, then $H_0$ is rejected. Thus, we may conclude that there is a significant difference between the sample proportion (p) and the population proportion (P), or that the sample has not been drawn from the given population.
Eg.1: A sample of 600 persons selected at random from a large city shows that 53% are males. Is there reason to doubt the hypothesis that males and females are equal in number in the city?
Solution,
Here, sample size (n) = 600, sample proportion of males (p) = 53% = 0.53,
population proportion of males (P) = 0.5, Q = 1 - 0.5 = 0.5
Step 1: Null hypothesis, $H_0: P = 0.5$, i.e. the population proportions of males and females are the same.
Step 2: Alternative hypothesis, $H_1: P \neq 0.5$, i.e. the population proportions of males and females are different.
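Step 3: Test statistic. Under $H_0$, the test statistic is
$$Z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.53 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{600}}} = \frac{0.03}{\sqrt{\frac{0.25}{600}}} = \frac{0.03}{0.0204} = 1.47$$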
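The single-proportion statistic of this example can be reproduced with the sketch below, which also completes the comparison with the critical value. The function name is illustrative, and the two-tailed 5% level is an assumption following the default in Step 4.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(p, P, n):
    """Z statistic for a sample proportion p against a hypothesised proportion P."""
    Q = 1 - P
    return (p - P) / sqrt(P * Q / n)

z_cal = one_proportion_z(p=0.53, P=0.5, n=600)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed, assumed 5% level
accept_h0 = abs(z_cal) <= z_tab

print(f"Z = {z_cal:.2f}, tabulated Z = {z_tab:.2f}, accept H0: {accept_h0}")
```

Because 1.47 is below 1.96, $H_0$ is accepted under this assumption: the sample gives no reason to doubt that males and females are equal in number.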
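Sample proportion, $p = \frac{160}{200} = 0.8$
$$Z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.8 - 0.9}{\sqrt{\frac{0.9 \times 0.1}{200}}} = \frac{-0.1}{\sqrt{\frac{0.09}{200}}} = \frac{-0.1}{0.0212} = -4.72$$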
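If $p_1 = \frac{X_1}{n_1}$ and $p_2 = \frac{X_2}{n_2}$ are the sample proportions of two independent large samples drawn from populations with population proportions $P_1$ and $P_2$, then the sample proportions provide unbiased estimates of the corresponding population proportions, i.e.
$E(p_1) = P_1$ and $E(p_2) = P_2$, with
$$Var(p_1) = \frac{P_1 Q_1}{n_1} \quad \text{and} \quad Var(p_2) = \frac{P_2 Q_2}{n_2}$$
Since for large samples $p_1$ and $p_2$ are normally distributed, the difference ($p_1 - p_2$) is also normally distributed with mean ($P_1 - P_2$) and variance $Var(p_1) + Var(p_2)$.
Since the samples are independent,
$$E(p_1 - p_2) = E(p_1) - E(p_2) = P_1 - P_2 \quad \text{and} \quad Var(p_1 - p_2) = Var(p_1) + Var(p_2) = \frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}$$
Thus,
$$S.E.(p_1 - p_2) = \sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$$
$$Z = \frac{(p_1 - p_2) - E(p_1 - p_2)}{S.E.(p_1 - p_2)} = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}} \sim N(0,1)$$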
The steps in the test of significance for the difference of two proportions are as follows.
Step 1. Null hypothesis: $H_0: P_1 = P_2 = P$, i.e. the two population proportions are the same, or there is no significant difference between the sample proportions.
Step 2. Alternative hypothesis: $H_1: P_1 \neq P_2$, i.e. the two population proportions are not the same, or there is a significant difference between the sample proportions (two-tailed test), or
$H_1: P_1 > P_2$, i.e. the proportion of the first population is greater than the proportion of the second population (right-tailed test), or
$H_1: P_1 < P_2$, i.e. the proportion of the first population is less than the proportion of the second population (left-tailed test).
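Step 3. Test statistic: Under the null hypothesis $H_0: P_1 = P_2$, the test statistic Z for the difference of proportions ($p_1 - p_2$) becomes
$$Z = \frac{(p_1 - p_2) - E(p_1 - p_2)}{S.E.(p_1 - p_2)} = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}} = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
($P_1 - P_2 = 0$ as $P_1 = P_2$)
$Z \sim N(0,1)$, i.e. Z follows the normal distribution with mean 0 and standard deviation 1.
If the common population proportion P is unknown, then we use its unbiased estimate provided by both samples taken together, which is given by
$$\hat{P} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \quad \text{and} \quad \hat{Q} = 1 - \hat{P}$$
so that
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$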
Step 4. Select the level of significance $\alpha$. The most commonly used level is $\alpha$ = 5% unless stated otherwise. Decide whether a one-tailed or two-tailed test is to be applied.
Step 5. Obtain the tabulated (critical) value of Z according to whether the alternative hypothesis is one-tailed or two-tailed.
Step 6. Draw the conclusion by comparing the calculated and tabulated values of Z.
If |calculated value of Z| $\leq$ tabulated value of Z, then $H_0$ is accepted. Thus, we may conclude that there is no significant difference between the two sample proportions.
If |calculated value of Z| > tabulated value of Z, then $H_0$ is rejected. Thus, we may conclude that there is a significant difference between the two sample proportions.
Eg.3: In a public opinion poll of 400 men and 600 women, 70% of the men and 80% of the women expressed that they
were pro-choice. At 0.05 level of significance, can we conclude that the observed difference between the two
proportions is significant?
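Solution,
Here, sample proportion of men ($p_1$) = 70% = 0.7, sample proportion of women ($p_2$) = 80% = 0.8,
number of men ($n_1$) = 400, number of women ($n_2$) = 600
Now, estimated population proportion,
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{400 \times 0.7 + 600 \times 0.8}{400 + 600} = \frac{760}{1000} = 0.76$$
$$\hat{Q} = 1 - \hat{P} = 1 - 0.76 = 0.24$$
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.7 - 0.8}{\sqrt{0.76 \times 0.24\left(\frac{1}{400} + \frac{1}{600}\right)}} = \frac{-0.1}{\sqrt{\frac{0.1824}{200}\left(\frac{1}{2} + \frac{1}{3}\right)}} = \frac{-0.1}{\sqrt{\frac{0.1824 \times 0.83}{200}}} = \frac{-0.1}{0.0275} = -3.63$$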
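The pooled two-proportion calculation for the opinion poll can be sketched as follows, including the comparison at the 0.05 level stated in the problem. The function name is illustrative, and a two-tailed test is assumed because the question asks only whether the difference is significant.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1, n1, p2, n2):
    """Return (pooled estimate P_hat, Z) for the difference of two proportions."""
    P_hat = (n1 * p1 + n2 * p2) / (n1 + n2)  # combined estimate of P
    Q_hat = 1 - P_hat
    se = sqrt(P_hat * Q_hat * (1 / n1 + 1 / n2))
    return P_hat, (p1 - p2) / se

P_hat, z_cal = two_proportion_z(p1=0.7, n1=400, p2=0.8, n2=600)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed, 5% level
reject_h0 = abs(z_cal) > z_tab

print(f"P_hat = {P_hat:.2f}, Z = {z_cal:.2f}, reject H0: {reject_h0}")
```

Since |Z| of about 3.63 exceeds 1.96, $H_0$ is rejected: the observed difference between the two proportions is significant at the 0.05 level.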
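Now, estimated population proportion,
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{900 \times 0.75 + 1000 \times 0.8}{900 + 1000} = \frac{1475}{1900} \approx 0.77$$
$$\hat{Q} = 1 - \hat{P} = 1 - 0.77 = 0.23$$
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.75 - 0.8}{\sqrt{0.77 \times 0.23\left(\frac{1}{900} + \frac{1}{1000}\right)}} = \frac{-0.05}{\sqrt{0.1771 \times \frac{19}{9000}}} = \frac{-0.05}{\sqrt{\frac{3.3649}{9000}}} = \frac{-0.05}{0.0193} = -2.59$$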
Step 6: Decision. Since the calculated value |Z| = 2.59 is greater than the tabulated value 1.645 ($|Z_{cal}| > Z_{tab}$) at the 5% level of significance for a left-tailed test, the result is significant and the null hypothesis $H_0$ is rejected, which means that the rise in proportion is significant enough to indicate that the advertisement was effective.
Eg.5: Two groups A and B consist of 100 people each who have a disease. A serum is given to group A but not to group B. It is found that in groups A and B, 75 and 65 people respectively recover from the disease. Was the serum treatment effective in curing the disease?
Here, Group A: sample size ($n_1$) = 100, sample proportion ($p_1$) = 75/100 = 0.75
Group B: sample size ($n_2$) = 100, sample proportion ($p_2$) = 65/100 = 0.65
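Now, estimated population proportion,
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{75 + 65}{100 + 100} = \frac{140}{200} = 0.7$$
$$\hat{Q} = 1 - \hat{P} = 1 - 0.7 = 0.3$$
Step 1: Null hypothesis, $H_0: P_1 = P_2 = P$, i.e. the serum treatment was not effective to cure the disease.
Step 2: Alternative hypothesis, $H_1: P_1 > P_2$, i.e. the serum treatment was effective to cure the disease.
Step 3: Test statistic. Under $H_0$, the test statistic is
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.75 - 0.65}{\sqrt{0.7 \times 0.3\left(\frac{1}{100} + \frac{1}{100}\right)}} = \frac{0.1}{\sqrt{0.21 \times \frac{2}{100}}} = \frac{0.1}{\sqrt{\frac{0.42}{100}}} = \frac{0.1 \times 10}{0.649} = \frac{1}{0.649} = 1.54$$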
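The serum example can be carried through to a decision with the sketch below, working directly from the recovery counts. The 5% level of significance is an assumption (the default suggested in Step 4); the right-tailed test follows from the stated alternative hypothesis.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 75, 100   # group A (serum given): recoveries, group size
x2, n2 = 65, 100   # group B (no serum): recoveries, group size
alpha = 0.05       # assumed level of significance

p1, p2 = x1 / n1, x2 / n2
P_hat = (x1 + x2) / (n1 + n2)            # pooled estimate of P
Q_hat = 1 - P_hat
z_cal = (p1 - p2) / sqrt(P_hat * Q_hat * (1 / n1 + 1 / n2))
z_tab = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value
reject_h0 = z_cal > z_tab

print(f"P_hat = {P_hat:.2f}, Z = {z_cal:.2f}, "
      f"tabulated Z = {z_tab:.3f}, reject H0: {reject_h0}")
```

Because 1.54 is below the right-tailed critical value 1.645, $H_0$ is not rejected at the assumed level, so these data do not show the serum treatment to be effective.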