
QUANTITATIVE ANALYSIS FOR MANAGEMENT II

http://ganga.iiml.ac.in/~ggarg/QAM2.htm
QAM II by Gaurav Garg (IIM Lucknow)
COURSE OUTLINE
Sampling Distributions: Chi-Square, t and F distributions
Interval Estimation
Sample Size Decision
Testing of Hypothesis: single population and two populations
Measures of Association for Qualitative Data and Contingency Tables
Chi-Square Test for Goodness of Fit
Analysis of Variance: one-way and two-way
Nonparametric Tests
Multiple Regression Analysis
EVALUATION SCHEME AND EXAM DATES






Component       Weight   Date
QUIZ-1          10%      SEP 28, 2011 / 0830 - 0850 Hrs.
QUIZ-2          10%      OCT 12, 2011 / 0830 - 0850 Hrs.
MID TERM EXAM   25%      OCT 27 - 29, 2011
QUIZ-3          00%      NOV 16, 2011 / 0830 - 0850 Hrs.
ASSIGNMENT      10%      NOV 24, 2011 (due date: DEC 10, 2011)
END TERM EXAM   40%      DEC 14 - 19, 2011
IMPRESSION      05%      --
Three quizzes will be conducted, and the best two out of three will be considered.
Sampling Distributions


Concept of Sampling Distribution
Distributions of Sample Mean and Sample Proportion
Central Limit Theorem
t, Chi-Square and F distributions
Parameter and Statistic

Parameter:
A statistical measure computed using population observations.
Let $X_1, X_2, \ldots, X_N$ be the population units.
Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$

Statistic:
A statistical measure computed using sample observations.
Let $x_1, x_2, \ldots, x_n$ be the sample units.
Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample variance: $s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$ or $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
In practice, parameter values are not known.
They are estimated using sample observations.
Parameter values are fixed.
The value of a statistic varies from sample to sample.
Unbiased Estimate
If E(statistic) = parameter, then the statistic is said to be an unbiased estimate of the parameter.
The sample mean is an unbiased estimate of the population mean.
Let us consider the following population of size 4:
18, 20, 22, 24
Population mean $\mu$ = (18 + 20 + 22 + 24) / 4 = 21
Population variance $\sigma^2$ = [(18-21)^2 + (20-21)^2 + (22-21)^2 + (24-21)^2] / 4 = 5
Consider all possible samples of size 2 (drawn with replacement, so there are 4 x 4 = 16 ordered samples).
Obtain the sample mean and the sample variance of each sample.
The sample mean is an unbiased estimate of the population mean.
This means that the average of all sample means equals the population mean.
Sample     $\bar{x}$   $s^2$   $s_1^2$
18, 18     18          0       0
20, 18     19          1       2
22, 18     20          4       8
24, 18     21          9       18
18, 20     19          1       2
20, 20     20          0       0
22, 20     21          1       2
24, 20     22          4       8
18, 22     20          4       8
20, 22     21          1       2
22, 22     22          0       0
24, 22     23          1       2
18, 24     21          9       18
20, 24     22          4       8
22, 24     23          1       2
24, 24     24          0       0
Average    21          2.5     5

Here $\mu = 21$, $\sigma^2 = 5$ and $n = 2$, so
$$E(\bar{x}) = 21 = \mu, \qquad E(s^2) = 2.5 \neq \sigma^2, \qquad E(s_1^2) = 5 = \sigma^2,$$
where
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2.$$
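The enumeration of all samples can be reproduced with a short script. The following is a minimal sketch in Python (standard library only). The function and variable names are illustrative assumptions, and sampling is taken to be with replacement, as the 16 ordered samples imply.

```python
from itertools import product
from statistics import mean

population = [18, 20, 22, 24]
n = 2  # sample size used in the example

def sample_stats(sample):
    """Return (sample mean, s^2 with divisor n, s_1^2 with divisor n - 1)."""
    m = mean(sample)
    ss = sum((x - m) ** 2 for x in sample)  # sum of squared deviations
    return m, ss / len(sample), ss / (len(sample) - 1)

# All ordered samples of size n drawn with replacement: 4 * 4 = 16 samples.
samples = list(product(population, repeat=n))
stats = [sample_stats(s) for s in samples]

avg_mean = mean(row[0] for row in stats)   # average of the sample means
avg_s2 = mean(row[1] for row in stats)     # average of s^2
avg_s1_2 = mean(row[2] for row in stats)   # average of s_1^2

print(avg_mean, avg_s2, avg_s1_2)
```

The three averages should agree with the "Average" row of the table: the sample mean and s_1^2 average to the population mean and variance, while s^2 averages to a smaller value.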
Sampling Distributions
Unknown parameters are estimated using sample observations.
Parameter values are fixed.
The value of a statistic varies from sample to sample.
Each sample has some probability of being chosen.
Each value of a statistic is associated with a probability.
A statistic is a random variable.
The distribution of a statistic is called its sampling distribution.
The distribution of a statistic may not be the same as the distribution of the population.
Sampling Distribution of Mean
(or Distribution of Sample Mean)

Consider the previous example again.
[Figure: histogram of the population units; x-axis: x = 18, 20, 22, 24; each bar has height 0.25]
Each value occurs exactly once.
The population distribution is a discrete uniform distribution.
Samples                                    Sample mean   Frequency   Probability = relative frequency
(18, 18)                                   18            1           1/16
(20, 18), (18, 20)                         19            2           2/16
(22, 18), (18, 22), (20, 20)               20            3           3/16
(24, 18), (18, 24), (20, 22), (22, 20)     21            4           4/16
(20, 24), (24, 20), (22, 22)               22            3           3/16
(22, 24), (24, 22)                         23            2           2/16
(24, 24)                                   24            1           1/16
Total                                                                1
[Figure: histogram of the sample mean; x-axis: 18 to 24; bar heights 1/16, 2/16, 3/16, 4/16, 3/16, 2/16, 1/16 (no longer uniform)]
The value of the sample mean depends on the chosen sample.
Each sample is chosen with a certain probability.
So, each possible value of the sample mean is associated with some probability.
The distribution of the sample mean is the list of all its possible values along with the corresponding probabilities.

Sample mean   18     19     20     21     22     23     24
Probability   1/16   2/16   3/16   4/16   3/16   2/16   1/16

In other words, the statistic $T = \bar{x}$ (sample mean) can be considered as a random variable.
The distribution of T is given by the following table:

t       P(T=t)   t x P(T=t)   t^2 x P(T=t)
18      1/16     1.125        20.250
19      2/16     2.375        45.125
20      3/16     3.750        75.000
21      4/16     5.250        110.250
22      3/16     4.125        90.750
23      2/16     2.875        66.125
24      1/16     1.500        36.000
Total   1        21.000       443.500

$E(T) = 21$, $E(T^2) = 443.5$
$\mathrm{Var}(T) = E(T^2) - [E(T)]^2 = 443.5 - 441 = 2.5$
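The distribution of the sample mean and its moments can be checked with exact arithmetic. This is a minimal sketch in Python using `fractions.Fraction`; the variable names (`dist`, `e_t`, `e_t2`, `var_t`) are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]

# Sample mean of every ordered sample of size 2 (with replacement).
means = [Fraction(a + b, 2) for a, b in product(population, repeat=2)]

# P(T = t) is the relative frequency of t among the 16 equally likely samples.
counts = Counter(means)
dist = {t: Fraction(c, len(means)) for t, c in sorted(counts.items())}

e_t = sum(t * p for t, p in dist.items())        # E(T)
e_t2 = sum(t ** 2 * p for t, p in dist.items())  # E(T^2)
var_t = e_t2 - e_t ** 2                          # Var(T) = E(T^2) - [E(T)]^2

for t, p in dist.items():
    print(t, p)
print(e_t, e_t2, var_t)
```

Note that `Fraction` prints probabilities in lowest terms, so 2/16 appears as 1/8 and 4/16 as 1/4.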
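In general, $E(\bar{x})$ and $\mathrm{Var}(\bar{x})$ can also be obtained as follows (for independent sample units):
$$E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \frac{1}{n}\, n\mu = \mu$$
$$\mathrm{Var}(\bar{x}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(x_i) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$
So $E(\bar{x}) = \mu$ and $\mathrm{Var}(\bar{x}) = \sigma^2 / n$.
Common Notation:
$$\mu_{\bar{x}} = E(\bar{x}) = \mu, \qquad \sigma_{\bar{x}}^2 = \mathrm{Var}(\bar{x}) = \sigma^2 / n$$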
Standard Error
Different samples of the same size from the same population will yield different sample means.
A measure of the variability in the different values of the sample mean is given by the Standard Error of the sample mean:
$$\text{standard error}(\bar{x}) = \sigma_{\bar{x}} = \sqrt{\mathrm{Var}(\bar{x})} = \sigma / \sqrt{n}$$
The standard error of a statistic is the standard deviation of its distribution.
In our example, $\sigma_{\bar{x}} = \sqrt{2.5} = 1.5811$.
The standard error decreases when the sample size is increased.
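To make the effect of the sample size concrete, the sketch below evaluates sigma / sqrt(n) for the running example (sigma^2 = 5) at a few sample sizes. The function name and the larger values of n are illustrative assumptions, not part of the slides.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = math.sqrt(5)  # population standard deviation in the running example
for n in (2, 8, 32, 128):
    print(n, round(standard_error_of_mean(sigma, n), 4))
```

For n = 2 this reproduces the value 1.5811 from the example, and each fourfold increase in n halves the standard error.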
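Central Limit Theorem
When the population distribution is $N(\mu, \sigma)$, then $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$.
When the population distribution is not normal, then also $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$ approximately, provided $n \to \infty$.
In practice, this approximation is usually considered adequate for $n \geq 30$.
(In this notation, the second argument of N is the standard deviation.)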
[Figure: histogram of 1,800 randomly selected values from an exponential distribution, followed by histograms of the distribution of the sample mean for n = 2, n = 5 and n = 30]
[Figure: histogram of 1,800 randomly selected values from a uniform distribution (axes: X, Frequency), followed by histograms of the distribution of the sample mean (axes: $\bar{x}$, Frequency) for n = 2, n = 5 and n = 30]
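The behaviour shown in these plots can be explored with a small simulation. The sketch below is one possible setup, not the procedure used to draw the original figures: it assumes an exponential population with rate 1 (so mu = 1 and sigma = 1), 2,000 replications and a fixed seed, all chosen for illustration.

```python
import random
from statistics import mean, pstdev

def simulate_sample_means(n, reps=2000, rate=1.0, seed=0):
    """Draw `reps` samples of size n from an exponential population
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [mean(rng.expovariate(rate) for _ in range(n)) for _ in range(reps)]

# For an exponential population with rate 1, mu = 1 and sigma = 1,
# so the standard error of the sample mean is 1 / sqrt(n).
for n in (2, 5, 30):
    means = simulate_sample_means(n)
    print(n, round(mean(means), 3), round(pstdev(means), 3), round(1 / n ** 0.5, 3))
```

For each n, the simulated mean of the sample means should be close to 1 and their standard deviation close to 1 / sqrt(n). A histogram of `means` would be expected to look increasingly bell-shaped as n grows.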
Example:
Suppose a population has mean $\mu = 8$ and standard deviation $\sigma = 3$.
Suppose a random sample of size n = 36 is selected.
What is the probability that the sample mean is between 7.75 and 8.25?
Even if the population is not normally distributed, the central limit theorem can be used (n > 30).
So, the distribution of the sample mean is approximately N(8, 3/6), i.e., $\bar{x} \sim N(8, 3/6)$, and we need
$$P[7.75 < \bar{x} < 8.25] = \,?$$
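The probability asked for in this example can be evaluated numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36
se = sigma / sqrt(n)                 # standard error = 3/6 = 0.5
x_bar = NormalDist(mu=mu, sigma=se)  # approximate distribution of the sample mean

# P[7.75 < x_bar < 8.25], i.e. P[-0.5 < Z < 0.5] after standardising
prob = x_bar.cdf(8.25) - x_bar.cdf(7.75)
print(round(prob, 4))
```

Under the normal approximation the probability comes out at about 0.383.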
Sampling Distribution of Proportion
(or Distribution of Sample Proportion)
Let us consider a population that is divided into two mutually exclusive and collectively exhaustive classes.
One class possesses a particular attribute.
The other class does not possess that attribute.
For example, the people in a city could be divided into smokers and non-smokers.
Let
N = population size
X = no. of people out of N possessing a particular attribute
P = actual proportion of the people possessing a particular attribute = X/N

Let a sample be selected from this population.
n = sample size
x = no. of people in the sample possessing a particular attribute
p = x/n = sample proportion
X and P are population parameters.
x and p are sample statistics.
p provides an estimate of P.
Note that $x \sim B(n, P)$, so
E(x) = nP, Var(x) = nPQ (where Q = 1 - P).
This implies that
E(p) = E(x/n) = P,
Var(p) = Var(x/n) = nPQ/n^2 = PQ/n.
Standard error (p) = $\sqrt{\mathrm{Var}(p)} = \sqrt{PQ/n}$
When the sample size n is large enough,
$$Z = \frac{x - nP}{\sqrt{nPQ}} \sim N(0, 1) \quad \text{or} \quad Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0, 1)$$
This is a particular case of the central limit theorem.
In practice, this approximation is used for $n \geq 30$,
or when $nP \geq 5$ as well as $nQ \geq 5$.
Example:
Suppose the true proportion of voters who support the ABC party is 0.4.
What is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?
P = 0.4, Q = 1 - 0.4 = 0.6
n = 200.
Pr[ 0.40 < p < 0.45 ] = ?
We use
$$Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0, 1)$$
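The probability in this example follows by standardising both limits with the Z statistic. A minimal sketch in Python, using `statistics.NormalDist` for the standard normal c.d.f. (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

P, n = 0.4, 200
Q = 1 - P
se = sqrt(P * Q / n)  # standard error of the sample proportion

z_low = (0.40 - P) / se   # lower limit, equal to 0
z_high = (0.45 - P) / se  # upper limit, about 1.44
std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)
print(round(z_high, 3), round(prob, 3))
```

With these numbers the upper limit standardises to roughly 1.44, giving a probability of roughly 0.43.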
Finite Population Correction
For the application of the central limit theorem, we assumed that the sample size n is large.
If the population size N is small, the sample size n cannot be sufficiently large,
and we cannot apply the central limit theorem.
In this situation, we multiply the standard error by the Finite Population Correction (fpc), which is given by
$$\text{fpc} = \sqrt{\frac{N - n}{N - 1}}$$
Clearly, when $N \to \infty$, fpc $\to 1$.
Thus
$$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}} \cdot \text{fpc}\right) \quad \text{or} \quad \frac{\bar{x} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}} \sim N(0, 1).$$
And
$$\frac{p - P}{\sqrt{\dfrac{PQ}{n}} \cdot \text{fpc}} = \frac{p - P}{\sqrt{\dfrac{PQ}{n}} \sqrt{\dfrac{N - n}{N - 1}}} \sim N(0, 1).$$
The fpc should be used when n / N > 0.05.
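As a rough illustration of how the correction changes the standard error, the sketch below applies the fpc only when n / N > 0.05, following the rule stated above. The function names and the numbers (sigma = 3, n = 36, N = 200) are hypothetical and chosen only for illustration.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction: sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def standard_error_of_mean(sigma, n, N=None):
    """sigma / sqrt(n), multiplied by the fpc when N is given and n / N > 0.05."""
    se = sigma / sqrt(n)
    if N is not None and n / N > 0.05:
        se *= fpc(N, n)
    return se

# Hypothetical numbers, chosen only for illustration.
print(standard_error_of_mean(sigma=3, n=36, N=200))  # n / N = 0.18, fpc applied
print(standard_error_of_mean(sigma=3, n=36))         # population treated as infinite
```

Because the fpc is below 1 whenever n > 1, the corrected standard error is smaller than sigma / sqrt(n).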

Degrees of Freedom
The number of independent observations that make up a statistic is known as the degrees of freedom (d.f.) associated with that statistic.
The d.f. is the number of values in the final calculation of a statistic that are free to vary.
In general, d.f. of a statistic = (no. of independent observations) - (no. of parameters estimated)

Assume four numbers a, b, c and d such that a + b + c + d = m.
You are free to choose any three of the numbers at random,
but the 4th must be chosen so that it makes the total equal to m.
Thus your degrees of freedom are three.
Chi-Square Distribution
Let us take a sample $x_1, x_2, \ldots, x_n$ from $N(\mu, \sigma)$.
Define the statistic
$$\chi^2 = \sum_{i=1}^{n} \left(\frac{x_i - \mu}{\sigma}\right)^2$$
The symbol $\chi^2$ is read as Chi-Square, and the statistic has a Chi-Square distribution with n degrees of freedom and range $(0, \infty)$.
This distribution is denoted as $\chi^2_{(n)}$.
If we define the statistic as
$$\chi^2 = \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{\sigma}\right)^2$$
the distribution of this statistic is $\chi^2_{(n-1)}$.
If $X \sim \chi^2_{(k)}$, then $E(X) = k$ and $\mathrm{Var}(X) = 2k$.
[Figure: plot of the p.d.f. of the Chi-Square distribution with k d.f.]
The mode is at k - 2 (for $k \geq 2$).
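The mean and variance of the Chi-Square distribution can be checked by simulation. The sketch below builds each variate as a sum of k squared standard normal draws, which corresponds to the first definition above with mu = 0 and sigma = 1. The choice k = 4, the number of replications and the seed are assumptions for illustration.

```python
import random
from statistics import mean, pvariance

def chi_square_draws(k, reps=20000, seed=0):
    """Simulate chi-square variates with k d.f. as sums of k squared N(0, 1) draws."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(reps)]

k = 4
draws = chi_square_draws(k)
# Compare the simulated mean and variance with k and 2k.
print(round(mean(draws), 2), round(pvariance(draws), 2), k, 2 * k)
```

The simulated mean and variance should land close to k and 2k, with the variance estimate being the noisier of the two.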
Student's t Distribution
Let us take a sample $x_1, x_2, \ldots, x_n$ from $N(\mu, \sigma)$.
Define the statistic
$$T = \frac{\bar{x} - \mu}{s_1 / \sqrt{n}}, \quad \text{where } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2.$$
Then T follows Student's t distribution with (n-1) d.f. and range $(-\infty, \infty)$.
It is denoted as $T \sim t_{(n-1)}$.
If $T \sim t_{(k)}$, then
$$E(T) = 0, \qquad \mathrm{Var}(T) = \frac{k}{k - 2} \quad (k > 2).$$
pdf of Student's t distribution with k d.f.
[Figure: p.d.f. curves of t (df = 5), t (df = 13) and the standard normal (t with df = $\infty$), all centered at 0]
This distribution is symmetric about 0.
Mean = Median = Mode = 0
t-distributions are bell-shaped and symmetric, but have fatter tails than the normal.
Note: $t_{(n-1)} \to N(0, 1)$ as n increases.
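The T statistic and its variance k / (k - 2) can be illustrated by simulation. The sketch below is a minimal example, assuming samples of size n = 11 from N(0, 1) (so k = 10 d.f.), 10,000 replications and a fixed seed; the function names are illustrative. `statistics.stdev` uses the divisor n - 1, matching s_1.

```python
import random
from math import sqrt
from statistics import mean, pvariance, stdev

def t_statistic(sample, mu):
    """T = (x_bar - mu) / (s_1 / sqrt(n)), with s_1 using the divisor n - 1."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))

def simulate_t(n, reps=10000, mu=0.0, sigma=1.0, seed=0):
    """Compute T for `reps` independent normal samples of size n."""
    rng = random.Random(seed)
    return [t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
            for _ in range(reps)]

n = 11     # sample size, so k = n - 1 = 10 d.f.
k = n - 1
draws = simulate_t(n)
# Compare the simulated mean and variance with 0 and k / (k - 2).
print(round(mean(draws), 3), round(pvariance(draws), 3), k / (k - 2))
```

The simulated variance exceeds 1, reflecting the fatter tails relative to the standard normal. Increasing n moves k / (k - 2) towards 1.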
Snedecor's F Distribution
Let X and Y be two independent random variables such that
$$X \sim \chi^2_{(d_1)} \quad \text{and} \quad Y \sim \chi^2_{(d_2)}$$
Define the statistic
$$F = \frac{X / d_1}{Y / d_2}$$
F follows Snedecor's F distribution with $d_1$ and $d_2$ d.f. and range $(0, \infty)$.
It is denoted as $F \sim F_{(d_1, d_2)}$.
$$E(F) = \frac{d_2}{d_2 - 2}, \; d_2 > 2, \quad \text{and} \quad \mathrm{Var}(F) = \frac{2 d_2^2 (d_1 + d_2 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)}, \; d_2 > 4$$
[Figure: p.d.f. of Snedecor's F distribution with $d_1$ and $d_2$ d.f.]
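The moment formulas for F can be coded directly and compared against a simulated ratio of scaled Chi-Square variates. This is a sketch under stated assumptions: the function names, the degrees of freedom (5 and 12), the number of replications and the seed are all chosen for illustration.

```python
import random
from statistics import mean

def f_mean(d2):
    """E(F) = d2 / (d2 - 2), defined for d2 > 2."""
    if d2 <= 2:
        raise ValueError("E(F) requires d2 > 2")
    return d2 / (d2 - 2)

def f_variance(d1, d2):
    """Var(F) = 2 d2^2 (d1 + d2 - 2) / (d1 (d2 - 2)^2 (d2 - 4)), for d2 > 4."""
    if d2 <= 4:
        raise ValueError("Var(F) requires d2 > 4")
    return 2 * d2 ** 2 * (d1 + d2 - 2) / (d1 * (d2 - 2) ** 2 * (d2 - 4))

def chi_square(rng, k):
    """One chi-square variate with k d.f. (sum of k squared N(0, 1) draws)."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(k))

def simulate_f(d1, d2, reps=20000, seed=0):
    """Simulate F = (X / d1) / (Y / d2) from independent chi-square variates."""
    rng = random.Random(seed)
    return [(chi_square(rng, d1) / d1) / (chi_square(rng, d2) / d2)
            for _ in range(reps)]

d1, d2 = 5, 12
draws = simulate_f(d1, d2)
print(round(mean(draws), 3), f_mean(d2), round(f_variance(d1, d2), 3))
```

Note that E(F) depends only on the denominator d.f., which is why `f_mean` takes a single argument.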
Summary
Parameter and Statistic
Unbiasedness
Distribution of sample mean
Distribution of sample proportion
Central limit theorem
Finite population correction
Degrees of Freedom
Student's t, Chi-Square and Snedecor's F distributions