Statistical Inference
Simple Random Sampling:
Finite Population
Simple Random Sampling:
Infinite Population
Point Estimation
Sampling Error
The sampling error for the sample mean is $|\bar{x} - \mu|$.
Example: Victoria University Toronto
Victoria University receives 900 applications annually from prospective students. The application form contains a variety of information, including the individual's scholastic aptitude test (SAT) score and whether or not the individual desires on-campus housing.
The director of admissions would like to know the average SAT score for the 900 applicants and the proportion of applicants that want to live on campus.
We will now look at two alternatives for obtaining the desired information: conducting a census of the entire 900 applicants, or selecting a sample of 30 applicants.
Conducting a Census
If the relevant data for the entire 900 applicants were in the university's database, the population parameters of interest could be calculated using the formulas presented in the Descriptive Numbers chapter.
We will assume for the moment that conducting a census is practical in this example.
Population Mean SAT Score
$\mu = \frac{\sum x_i}{900} = 990$
Population Standard Deviation for SAT Score
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{900}} = 80$
Population Proportion Wanting On-Campus Housing
$p = \frac{648}{900} = .72$
Simple Random Sampling
Now suppose that the necessary data on the current year's applicants were not yet entered in the university's database.
Furthermore, the Director of Admissions must obtain estimates of the population parameters of interest for a meeting taking place in a few hours.
She decides a sample of 30 applicants will be used.
The applicants were numbered, from 1 to 900, as their applications arrived.
Simple Random Sampling:
Using a Random Number Table
Taking a Sample of 30 Applicants
The numbers we draw will be the numbers of the
applicants we will sample unless
the random number is greater than 900 or
the random number has already been used.
We will continue to draw random numbers until
we have selected 30 applicants for our sample.
(We will go through all of column 3 and part of
column 4 of the random number table, encountering
in the process five numbers greater than 900 and
one duplicate, 835.)
Use of Random Numbers for Sampling
3-Digit Random Number    Applicant Included in Sample
744                      No. 744
436                      No. 436
865                      No. 865
790                      No. 790
835                      No. 835
902                      Number exceeds 900
190                      No. 190
836                      No. 836
. . . and so on
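The table procedure above can be sketched in code. This is only an illustration under stated assumptions: Python's `random` module stands in for the printed random number table, the helper name `draw_sample` is hypothetical, and the entry 000 is discarded because no applicant has that number.

```python
import random

def draw_sample(population_size, sample_size, rng):
    """Mimic the random number table: read 3-digit numbers,
    skip values outside 1..population_size, and skip repeats."""
    selected = []
    while len(selected) < sample_size:
        number = rng.randint(0, 999)      # stand-in for one 3-digit table entry
        if number < 1 or number > population_size:
            continue                      # e.g. 902 exceeds 900, so it is discarded
        if number in selected:
            continue                      # duplicate: applicant already in the sample
        selected.append(number)
    return selected

rng = random.Random(7)  # fixed seed only so the sketch is reproducible
sample = draw_sample(population_size=900, sample_size=30, rng=rng)
print(sample)
```

The loop keeps drawing until 30 distinct, valid applicant numbers have been collected, which is the stopping rule described for the table.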
Sample Data
Simple Random Sampling:
Using a Computer
Taking a Sample of 30 Applicants
Computers can be used to generate random numbers for selecting random samples.
For example, Excel's function
= RANDBETWEEN(1,900)
can be used to generate random integers between 1 and 900 (inclusive).
Then we choose the 30 applicants corresponding to 30 distinct generated random numbers as our sample.
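For comparison, here is a rough Python analogue of the spreadsheet approach. It is a sketch, not the method used in the slides: `random.randint` plays the role of RANDBETWEEN and, like it, can repeat values, while `random.sample` draws without replacement. The seed is arbitrary.

```python
import random

rng = random.Random(2024)  # arbitrary seed, only for reproducibility

# Analogue of filling 30 cells with =RANDBETWEEN(1,900): values may repeat.
with_repeats = [rng.randint(1, 900) for _ in range(30)]

# Drawing without replacement guarantees 30 distinct applicant numbers.
applicant_ids = rng.sample(range(1, 901), k=30)

print(len(set(with_repeats)), "distinct values among 30 independent draws")
print(sorted(applicant_ids))
```

If independent draws are used, any repeated number has to be replaced by a fresh draw, just as duplicates are skipped when reading a random number table.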
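Point Estimation
$\bar{x}$ as Point Estimator of $\mu$
$\bar{x} = \frac{\sum x_i}{n} = \frac{29{,}910}{30} = 997$
$s$ as Point Estimator of $\sigma$
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{163{,}996}{29}} = 75.2$
$\bar{p}$ as Point Estimator of $p$
$\bar{p} = 20/30 \approx .67$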
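The three point estimates can be computed in a few lines. The records below are hypothetical (the sample data from the slides is not reproduced here), so the printed numbers will not match 997, 75.2, or .67; only the census value 990 is taken from the example.

```python
from statistics import mean, stdev

# Hypothetical records: (SAT score, wants on-campus housing). Not the slide data.
sample = [(1020, True), (950, True), (1090, False),
          (1120, True), (930, False), (1010, True)]

sat_scores = [sat for sat, _ in sample]
x_bar = mean(sat_scores)     # point estimate of the population mean
s = stdev(sat_scores)        # sample standard deviation, n - 1 divisor
p_bar = sum(wants for _, wants in sample) / len(sample)  # sample proportion

mu = 990                     # population mean from the census example
sampling_error = abs(x_bar - mu)

print(x_bar, round(s, 2), round(p_bar, 3), sampling_error)
```

Note that `stdev` divides by n - 1, matching the formula for $s$, whereas `pstdev` would divide by n as in the population formula.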
Sampling Distribution of $\bar{x}$
Process of Statistical Inference
Law of Large Numbers
As the number of randomly drawn observations in a sample increases, the mean of the sample $\bar{x}$ gets closer and closer to the population mean $\mu$.
Note: We often intuitively expect predictability over a few random observations, but this expectation is wrong. The law of large numbers only applies to really large numbers.
[Image: Settlers of Catan]
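A small simulation makes the point concrete. This sketch assumes a game-style setup of rolling two fair dice (population mean 7) and tracks the running sample mean; the seed and checkpoints are arbitrary choices.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible

def roll_two_dice():
    """Sum of two fair six-sided dice; the population mean is 7."""
    return rng.randint(1, 6) + rng.randint(1, 6)

total = 0
running_means = {}
checkpoints = {10, 100, 1_000, 10_000, 100_000}
for n in range(1, 100_001):
    total += roll_two_dice()
    if n in checkpoints:
        running_means[n] = total / n   # sample mean after n rolls

for n, m in sorted(running_means.items()):
    print(f"n = {n:>6}: sample mean = {m:.3f}  (population mean = 7)")
```

With only a handful of rolls the sample mean can sit noticeably away from 7; it settles down only as the number of rolls becomes large.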
What is a Sampling Distribution?
The sampling distribution of a statistic is the distribution of all possible values taken by the statistic when all possible samples of a fixed size n are taken from the population. It is a theoretical idea; we do not actually build it.
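For a very small population the sampling distribution can be built by brute force, which helps show what the definition means. The five-value population below is hypothetical, and samples are drawn without replacement.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]   # hypothetical tiny population
n = 2                           # fixed sample size

# Every possible sample of size n (without replacement) and its mean.
sample_means = [mean(s) for s in combinations(population, n)]

# Sampling distribution: each possible value of the statistic with its probability.
counts = Counter(sample_means)
distribution = {value: counts[value] / len(sample_means) for value in sorted(counts)}

print(len(sample_means), "possible samples")
print(distribution)
print("mean of sample means:", mean(sample_means))
print("population mean:", mean(population))
```

For the 900 applicants and n = 30 the number of possible samples is astronomically large, which is why the sampling distribution is treated as a theoretical object rather than enumerated.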
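Sampling distribution of sample mean
[Figure: sampling distribution of $\bar{x}$, shown as a histogram of some sample averages]
For any population with mean $\mu$ and standard deviation $\sigma$, the standard deviation of the sampling distribution of $\bar{x}$ is $\sigma/\sqrt{n}$.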
Sampling Distribution of $\bar{x}$
The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$.
Expected Value of $\bar{x}$
$E(\bar{x}) = \mu$
where:
$\mu$ = the population mean
Mean of sample mean
Mean of a sampling distribution of $\bar{x}$
There is no tendency for a sample mean to fall systematically above or below $\mu$, even if the distribution of the raw data is skewed. Thus, the sample mean is an unbiased estimate of the population mean: it will be correct on average over many samples.
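Sampling Distribution of $\bar{x}$
Standard Deviation of $\bar{x}$
Finite Population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
Infinite Population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
A finite population is treated as being infinite if $n/N < .05$.
$\sqrt{(N - n)/(N - 1)}$ is the finite population correction factor.
$\sigma_{\bar{x}}$ is referred to as the standard error of the mean.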
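The two formulas and the n/N rule of thumb can be combined in one small helper. This is a minimal sketch: the function name and the `threshold` parameter are illustrative choices, and the inputs are the SAT example values (sigma = 80, N = 900).

```python
import math

def standard_error_of_mean(sigma, n, N=None, threshold=0.05):
    """Standard error of the sample mean.

    N is the population size; None means an infinite population.
    The finite population correction is applied only when n/N >= threshold,
    mirroring the rule that a population with n/N < .05 is treated as infinite.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N >= threshold:
        se *= math.sqrt((N - n) / (N - 1))
    return se

se_30 = standard_error_of_mean(80, 30, N=900)    # n/N = .033, no correction
se_100 = standard_error_of_mean(80, 100, N=900)  # n/N = .111, correction applied

print(round(se_30, 2), round(se_100, 2))
```

The correction factor is always below 1, so it can only shrink the standard error; for small sampling fractions the shrinkage is negligible, which is the reason for the .05 rule.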
Standard deviation of sample mean
Standard deviation of a sampling distribution of $\bar{x}$
Form of the Sampling Distribution of $\bar{x}$
When the population has a normal distribution, the sampling distribution of $\bar{x}$ is normally distributed for any sample size.
For Normally Distributed Populations
When a variable in a population is normally
distributed, the sampling distribution of the sample
mean for all possible samples of size n is also
normally distributed.
[Figure: population distribution and sampling distribution of the sample mean for a normally distributed population]
The Central Limit Theorem
[Figure: a population with a strongly skewed distribution, and the sampling distributions of $\bar{x}$ for n = 2, n = 10, and n = 25 observations]
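The central limit theorem says that the sampling distribution of the sample mean approaches a normal shape as n grows, even when the population is strongly skewed. The simulation below is a sketch under assumptions not in the slides: the skewed population is taken to be exponential, and skewness (0 for a symmetric distribution) is used as a rough measure of how non-normal the sample means still are.

```python
import math
import random

rng = random.Random(42)  # fixed seed for reproducibility

def skewness(values):
    """Average cubed z-score; 0 for a perfectly symmetric distribution."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def simulate_sample_means(n, repetitions=5_000):
    """Means of `repetitions` samples of size n from an exponential population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(repetitions)]

skew_by_n = {n: skewness(simulate_sample_means(n)) for n in (2, 10, 25)}
for n, sk in skew_by_n.items():
    print(f"n = {n:>2}: skewness of sample means = {sk:.2f}")
```

The skewness of the simulated sample means shrinks as n increases, matching the progression from n = 2 to n = 25 in the figure.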
IQ scores: Population vs. Sample
In a large population of adults, the mean IQ is 112 with standard
deviation 20. Suppose 200 adults are randomly selected for a
market research campaign.
The distribution of the sample mean IQ is:
A) Exactly normal, mean 112, standard deviation 20
B) Approximately normal, mean 112, standard deviation 20
C) Approximately normal, mean 112, standard deviation 1.414
D) Approximately normal, mean 112, standard deviation 0.1
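Sampling Distribution of $\bar{x}$ for SAT Scores
[Figure: sampling distribution of $\bar{x}$, centered at $E(\bar{x}) = 990$]
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{30}} = 14.6$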
[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 14.6$; the area to the left of $\bar{x} = 1000$ is .7517]
[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 14.6$; the area to the left of $\bar{x} = 980$ is .2483]
[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 14.6$; the area between $\bar{x} = 980$ and $\bar{x} = 1000$ is .5034]
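The three areas in the figures can be reproduced with the standard library's `NormalDist`. This is a sketch assuming the normal approximation and no finite population correction. Because it uses the unrounded standard error and z-values, the result (about .5065) differs slightly from the table-based .5034, which rounds z to .68.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 990, 80, 30          # values from the SAT example
standard_error = sigma / sqrt(n)    # about 14.6

sampling_dist = NormalDist(mu=mu, sigma=standard_error)
p_upper = sampling_dist.cdf(1000)   # area to the left of 1000
p_lower = sampling_dist.cdf(980)    # area to the left of 980
p_within = p_upper - p_lower        # area between 980 and 1000

print(f"P(x_bar <= 1000) = {p_upper:.4f}")
print(f"P(x_bar <= 980)  = {p_lower:.4f}")
print(f"P(980 <= x_bar <= 1000) = {p_within:.4f}")
```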
Relationship Between the Sample Size
and the Sampling Distribution of $\bar{x}$
Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.
$E(\bar{x}) = \mu$ regardless of the sample size. In our example, $E(\bar{x})$ remains at 990.
Whenever the sample size is increased, the standard error of the mean $\sigma_{\bar{x}}$ is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased to:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{100}} = 8.0$
Note: Strictly speaking, the finite population correction should be used here, since n/N = 100/900 (about .11) exceeds .05. However, this does not affect the key result.
[Figure: sampling distributions of $\bar{x}$ centered at $E(\bar{x}) = 990$; with n = 100, $\sigma_{\bar{x}} = 8$; with n = 30, $\sigma_{\bar{x}} = 14.6$]
Recall that when n = 30, $P(980 \le \bar{x} \le 1000) = .5034$.
We follow the same steps to solve for $P(980 \le \bar{x} \le 1000)$ when n = 100 as we showed earlier when n = 30.
Now, with n = 100, $P(980 \le \bar{x} \le 1000) = .7888$.
Because the sampling distribution with n = 100 has a smaller standard error, the values of $\bar{x}$ have less variability and tend to be closer to the population mean than the values of $\bar{x}$ with n = 30.
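The effect of the sample size can be checked side by side. The sketch below ignores the finite population correction, as the slides do, and the helper name `prob_within` is an illustrative choice; small differences from the quoted .5034 and .7888 come from table rounding.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 990, 80   # population values from the SAT example

def prob_within(n, margin=10):
    """Return (standard error, P(mu - margin <= x_bar <= mu + margin))."""
    se = sigma / sqrt(n)
    dist = NormalDist(mu, se)
    return se, dist.cdf(mu + margin) - dist.cdf(mu - margin)

results = {n: prob_within(n) for n in (30, 100)}
for n, (se, prob) in results.items():
    print(f"n = {n:>3}: standard error = {se:5.2f}, P(within +/-10) = {prob:.4f}")
```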
[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 8$; the area between $\bar{x} = 980$ and $\bar{x} = 1000$ is .7888]
Sampling with Categorical Variables
Sampling Distribution of $\bar{p}$
Making Inferences about a Population Proportion
The sampling distribution of $\bar{p}$ is the probability distribution of all possible values of the sample proportion $\bar{p}$.
Expected Value of $\bar{p}$
$E(\bar{p}) = p$
where:
$p$ = the population proportion
Standard Deviation of $\bar{p}$
Finite Population: $\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$
Infinite Population: $\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}$
$\sigma_{\bar{p}}$ is referred to as the standard error of the proportion.
The sampling distribution of a sample proportion $\bar{p} = X/n$ is approximately normal (normal approximation of a binomial distribution) when the sample size is large enough.
Form of the Sampling Distribution of $\bar{p}$
The sampling distribution of $\bar{p}$ can be approximated by a normal distribution whenever the sample size is large enough to satisfy both conditions:
$np \ge 5$ and $n(1 - p) \ge 5$
Example: Victoria University
Recall that 72% of the prospective students applying to Victoria University desire on-campus housing.
What is the probability that a simple random sample of 30 applicants will provide an estimate of the population proportion of applicants desiring on-campus housing that is within plus or minus .05 of the actual population proportion?
For our example, with n = 30 and p = .72, the normal distribution is an acceptable approximation because:
$np = 30(.72) = 21.6 \ge 5$
and
$n(1 - p) = 30(.28) = 8.4 \ge 5$
[Figure: sampling distribution of $\bar{p}$, centered at $E(\bar{p}) = .72$]
$\sigma_{\bar{p}} = \sqrt{\frac{.72(1 - .72)}{30}} = .082$
Step 1: Calculate the z-value at the upper endpoint of the interval.
$z = (.77 - .72)/.082 = .61$
Step 2: Find the area under the curve to the left of the upper endpoint.
$P(z \le .61) = .7291$
Cumulative Probabilities for the Standard Normal Distribution
 z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
 .      .      .      .      .      .      .      .      .      .      .
.5    .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
.6    .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
.7    .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
.8    .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
.9    .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
 .      .      .      .      .      .      .      .      .      .      .
[Figure: sampling distribution of $\bar{p}$ with $\sigma_{\bar{p}} = .082$; the area to the left of $\bar{p} = .77$ is .7291]
Step 3: Calculate the z-value at the lower endpoint of the interval.
$z = (.67 - .72)/.082 = -.61$
Step 4: Find the area under the curve to the left of the lower endpoint.
$P(z \le -.61) = .2709$
[Figure: sampling distribution of $\bar{p}$ with $\sigma_{\bar{p}} = .082$; the area to the left of $\bar{p} = .67$ is .2709]
Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.
$P(-.61 \le z \le .61) = P(z \le .61) - P(z \le -.61)$
$= .7291 - .2709$
$= .4582$
The probability that the sample proportion of applicants wanting on-campus housing will be within +/-.05 of the actual population proportion:
$P(.67 \le \bar{p} \le .77) = .4582$
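Steps 1 to 5 can be condensed into a short calculation. This is a sketch using the normal approximation with the infinite-population standard error; it works with unrounded values, so the result may differ from .4582 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

p, n, margin = 0.72, 30, 0.05   # values from the on-campus housing example

# Check the large-sample conditions before using the normal approximation.
assert n * p >= 5 and n * (1 - p) >= 5

standard_error = sqrt(p * (1 - p) / n)   # about .082
dist = NormalDist(mu=p, sigma=standard_error)
p_within = dist.cdf(p + margin) - dist.cdf(p - margin)

print(f"standard error = {standard_error:.3f}")
print(f"P({p - margin:.2f} <= p_bar <= {p + margin:.2f}) = {p_within:.4f}")
```

With n = 30 there is less than an even chance that the sample proportion lands within .05 of the true value; a larger sample would shrink the standard error and raise this probability.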
[Figure: sampling distribution of $\bar{p}$ with $\sigma_{\bar{p}} = .082$; the area between $\bar{p} = .67$ and $\bar{p} = .77$ is .4582]
Readings
Textbook:
Chapter 7
Random Numbers