
3/13/2019

STATISTICS & PROBABILITY

9th week

Review
I. What was in the last lecture?
Continuous Probability Distribution
Sampling Distribution
Sampling Method

II. What's in this lecture?


Point Estimation
Interval Estimation/ Confidence Interval


Statistical Inference

Inference: use a random sample to learn something about a larger population.

• Statistical inference deals with drawing conclusions about population
  parameters from an analysis of the sample data.


Types of Inference

The two most important types of inference are:

• Estimation (this chapter):
  - Estimating or predicting the value of the parameter.
  - "What is (are) the most likely value(s) of $\mu$?"
• Hypothesis testing (next chapter):
  - Deciding about the value of a parameter based on some preconceived idea.
  - "Did the sample come from a population with $\mu = 5$?"

Example 1: Types of inference: Point Estimation, Interval Estimation, and Testing Hypotheses
• What is the degree of participation in community service? A student at a
  university questioned n = 40 students in the chemical engineering department
  about the amount of time they spent doing community service during the past
  month. The data on times, in hours, are presented in Table 1.

Descriptive statistics:

$\bar{x} = 4.55$
$s = 5.17$
$\text{median} = 4.0$
etc.


Example 1 (continued)
• However, the target of our study is not just the particular set of
  measurements recorded, but the vast population of hours of community
  service for all students at this university.
• The population distribution is unknown to us. Consequently, parameters such
  as the population mean $\mu$ and the population standard deviation $\sigma$
  are also unknown.
• If we take the view that the 40 observations represent a random sample from
  the population distribution of times, one goal of this study may be to
  "learn about $\mu$."

Type of Inference

More specifically, depending on the purpose of the study, we may wish to do
one, two, or all three of the following:
1. Estimate a single value for the unknown $\mu$ (point estimation).
2. Determine an interval of plausible values for $\mu$ (interval estimation).
3. Decide whether or not the mean time $\mu$ is 4.55 hours (hypothesis testing).


POINT ESTIMATION OF A POPULATION MEAN
• The object of point estimation is to calculate, from the sample data, a
  single number that is likely to be close to the unknown value of the
  parameter.
• A statistic intended for estimating a parameter is called a point
  estimator, or simply an estimator.
• The standard deviation of an estimator is called its standard error (S.E.).
• In Example 1, we would naturally compute the mean of the sample data.
  Employing the estimator $\bar{X}$ with the data of Table 1, we get the
  result $\bar{x} = 4.55$, which we call a point estimate, or simply an
  estimate, of $\mu$.
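The point estimate and its standard error can be sketched in a few lines of Python. This is only an illustration. It assumes the summary statistics of Example 1 (n = 40, s = 5.17), because the raw data of Table 1 are not reproduced here. The function name `estimated_standard_error` is hypothetical. Since $\sigma$ is unknown, s is substituted for it, so the result is an estimated standard error.

```python
from math import sqrt

def estimated_standard_error(s, n):
    """Estimated S.E. of the sample mean: s / sqrt(n) (s stands in for the unknown sigma)."""
    return s / sqrt(n)

# Summary statistics taken from Example 1 (assumed, Table 1 itself is not shown).
xbar = 4.55                      # point estimate of the population mean
se = estimated_standard_error(5.17, 40)

print(f"point estimate = {xbar}, estimated S.E. = {se:.3f}")
```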

Point Estimation


Confidence Intervals for a Population Mean
• A confidence interval, or interval estimator, is a range of numbers believed
  to include an unknown population parameter. Associated with the interval is
  a measure of the confidence we have that the interval does indeed contain
  the parameter of interest.
• Confidence intervals depend on sampling distributions.
• The shape of a sampling distribution depends on the sample size.
• We will learn different methods for large and small sample sizes:
  - For large sample sizes → normal distribution → Ch. 8
  - For small sample sizes → Student's t distribution → Ch. 9


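Illustration
• Let us review the results from the chapter on sampling distributions:
  1. Mean of $\bar{X}$: $E(\bar{X}) = \mu$
  2. $\mathrm{sd}(\bar{X}) = \dfrac{\sigma}{\sqrt{n}}$, so $\mathrm{S.E.}(\bar{X}) = \dfrac{\sigma}{\sqrt{n}}$
  3. For large n, $\bar{X}$ is nearly normally distributed, so

     $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

     is approximately standard normal.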
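These three results can be checked by simulation. The sketch below is an illustration only. It assumes an exponential population with $\mu = \sigma = 1$, chosen here because it is clearly non-normal, and it is not taken from the lecture. The function name, seed, and replication count are likewise arbitrary choices.

```python
import random
import statistics
from math import sqrt

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an Exponential(1) population; return their means."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

n = 40
means = simulate_sample_means(n=n, reps=5000)

mean_of_means = statistics.fmean(means)   # should be close to mu = 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)

print(mean_of_means, sd_of_means, 1 / sqrt(n))
```

A histogram of `means` would look roughly bell-shaped even though the population is strongly skewed, which is result 3.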


Define:
$(1 - \alpha)$ is the confidence coefficient
$(1 - \alpha) \cdot 100\%$ is the confidence level


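Confidence Intervals for a Population Mean
• "Fairly sure" means "with high probability", measured using the confidence
  coefficient, $1 - \alpha$. Usually, $1 - \alpha$ = 0.90, 0.95, or 0.99.
• For a large sample size, the $100(1-\alpha)\%$ confidence interval is

  $\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

  where $\bar{x}$ is the point estimator and $z_{\alpha/2}\,\sigma/\sqrt{n}$ is
  the margin of error. Written as an interval:

  $\left(\bar{x} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right)$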


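Confidence Intervals for a Population Mean
• In reality, the population standard deviation $\sigma$ is unknown. We
  require the sample size n to be large in order to dispense with the
  assumption of a normal population.
• When n is large, $\dfrac{\sigma}{\sqrt{n}} \approx \dfrac{s}{\sqrt{n}}$,
  where s is the sample standard deviation.
• The $100(1-\alpha)\%$ confidence interval for $\mu$ is given by

  $\bar{X} \pm z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$

  that is,

  $\left(\bar{X} - z_{\alpha/2}\,\dfrac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\right)$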


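Example

• Consider data with the summary statistics below:

  $n = 40$, $\bar{x} = 4.55$, and $s = 5.17$

  Compute (a) 90% and (b) 80% confidence intervals for the mean.

Solution:
a) $100(1-\alpha)\% = 90\%$ → $\alpha/2 = 0.05$ → $z_{\alpha/2} = 1.645$

   $\left(\bar{X} - z_{\alpha/2}\,\dfrac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\right) = \left(4.55 - 1.645\,\dfrac{5.17}{\sqrt{40}},\ 4.55 + 1.645\,\dfrac{5.17}{\sqrt{40}}\right) = (3.2,\ 5.9)$

b) $100(1-\alpha)\% = 80\%$ → $\alpha/2 = 0.1$ → $z_{\alpha/2} = 1.28$

   $\left(\bar{X} - z_{\alpha/2}\,\dfrac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\right) = \left(4.55 - 1.28\,\dfrac{5.17}{\sqrt{40}},\ 4.55 + 1.28\,\dfrac{5.17}{\sqrt{40}}\right) = (3.5,\ 5.6)$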
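The same calculation can be sketched in Python. This is an illustration, not part of the lecture. The function name `z_confidence_interval` is hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` instead of a printed normal table, so it carries more decimal places than 1.645 or 1.28.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, s, n, confidence):
    """Large-sample interval: xbar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 point of N(0, 1)
    margin = z * s / sqrt(n)                      # margin of error
    return xbar - margin, xbar + margin

ci90 = z_confidence_interval(4.55, 5.17, 40, 0.90)
ci80 = z_confidence_interval(4.55, 5.17, 40, 0.80)

print([round(v, 1) for v in ci90])
print([round(v, 1) for v in ci80])
```

Rounded to one decimal place, both intervals agree with the hand calculation. The 80% interval is narrower because less confidence is demanded.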


Example: Calculating a Confidence Interval
• The daily carbon monoxide (CO) emission from a large production plant will
  be measured on 35 randomly selected weekdays. The production process is
  always being modified, and the current mean value of daily CO emissions
  $\mu$ is unknown. Data collected over several years confirm that, for each
  year, the distribution of CO emission is normal with a standard deviation of
  0.8 ton. Suppose the sample mean is found to be $\bar{x} = 2.7$ tons.
• Construct a 95% confidence interval for the current daily mean emission
  $\mu$.

  $\left(2.7 - 1.96\,\dfrac{0.8}{\sqrt{35}},\ 2.7 + 1.96\,\dfrac{0.8}{\sqrt{35}}\right) = (2.43,\ 2.97)$


Determining Sample Size

• In order to determine how large a sample is needed for estimating a
  population mean, we must specify
  d = desired error margin
  $1 - \alpha$ = probability associated with the error margin
• Referring to the expression for a $100(1-\alpha)\%$ error margin, we then
  equate

  $z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = d$

  and solve for the required sample size:

  $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2$


Example: Determining a Sample Size
• A limnologist wishes to estimate the mean phosphate content per unit volume
  of lake water. It is known from studies in previous years that the standard
  deviation has a fairly stable value of $\sigma = 4$. How many water samples
  must the limnologist analyze to be 90% certain that the error of estimation
  does not exceed 0.8 milligrams?
• $\sigma = 4$
• $1 - \alpha = 0.9$ → $\alpha/2 = 0.05$
• From the normal table, $z_{0.05} = 1.645$
• $d = 0.8$

  $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2 = \left(\dfrac{1.645 \times 4}{0.8}\right)^2 = 67.65$, rounded up to $n = 68$
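The sample-size rule can be sketched as a small Python function. This is an illustration under the same assumptions as the example ($\sigma$ known, normal approximation). The name `required_sample_size` is hypothetical, and the result is always rounded up because a fractional observation cannot be collected.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, d, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= d."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 point of N(0, 1)
    return ceil((z * sigma / d) ** 2)             # round up to a whole observation

n_required = required_sample_size(sigma=4, d=0.8, confidence=0.90)
print(n_required)
```

Halving the error margin d roughly quadruples the required sample size, since d enters the formula squared.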


Exercise


Comparing Two
Treatments


Comparing Two Treatments

Treatment is used to refer to the things that are being compared.
For comparing two treatments, the two basic types of design are:
1. Independent samples (complete randomization).
2. Matched-pairs sample (randomization within each matched pair).

Example:
To compare the effectiveness of two drugs in curing a
disease, suppose that 8 patients are included in a clinical
study. Here, the time to cure is the response of interest.


Comparing Two Treatments


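INDEPENDENT RANDOM SAMPLES FROM TWO POPULATIONS

Population 1: mean $\mu_1$ and standard deviation $\sigma_1$
Population 2: mean $\mu_2$ and standard deviation $\sigma_2$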


Notations: Comparing Two Means

                 Mean          Variance        Standard Deviation
Population 1     $\mu_1$       $\sigma_1^2$    $\sigma_1$
Population 2     $\mu_2$       $\sigma_2^2$    $\sigma_2$

                             Sample size   Mean          Variance   Standard Deviation
Sample from Population 1     $n_1$         $\bar{X}$     $s_1^2$    $s_1$
Sample from Population 2     $n_2$         $\bar{Y}$     $s_2^2$    $s_2$


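Large-Sample Confidence Interval for $\mu_1 - \mu_2$
For large samples, point estimates and their margins of error, as well as
confidence intervals, are based on the standard normal (z) distribution.
When $n_1 \ge 30$ and $n_2 \ge 30$, the $100(1-\alpha)\%$ confidence interval
for $\mu_1 - \mu_2$ is given by

  $(\bar{X} - \bar{Y}) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

that is,

  $\left(\bar{X} - \bar{Y} - z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}},\ \ \bar{X} - \bar{Y} + z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}\right)$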


Example

Responses         Firefighter   Office Supervisor
Sample size       226           247
Sample mean       3.673         3.547
Sample std dev    0.7235        0.6089

Construct a 95% confidence interval for the difference in mean job
satisfaction scores.
Solution:

  $(\bar{X} - \bar{Y}) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = (3.673 - 3.547) \pm 1.96\sqrt{\dfrac{0.7235^2}{226} + \dfrac{0.6089^2}{247}}$

Therefore the 95% confidence interval for $\mu_1 - \mu_2$ is $0.126 \pm 0.121$,
or (0.005, 0.247).
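The two-sample interval can be sketched in Python from the summary table alone. This is an illustration. The function name `two_sample_z_interval` is hypothetical, and the critical value is taken from `statistics.NormalDist` instead of the rounded table value 1.96.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_interval(xbar1, s1, n1, xbar2, s2, n2, confidence):
    """Large-sample interval for mu1 - mu2 from summary statistics."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)   # S.E. of the difference of sample means
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se

# Firefighters (group 1) versus office supervisors (group 2).
low, high = two_sample_z_interval(3.673, 0.7235, 226, 3.547, 0.6089, 247, 0.95)
print(round(low, 3), round(high, 3))
```

Because the interval lies entirely above zero, the data suggest a slightly higher mean satisfaction score for firefighters at this confidence level.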


Exercise


Student’s t
Distribution


INFERENCES ABOUT MEAN: SMALL SAMPLE SIZE
• When the sample size is small (n < 30), the estimation procedures in the
  previous slides are not appropriate.
• When we take a sample from a normal population, the sample mean $\bar{x}$
  has a normal distribution for any sample size n, and

  $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

  has a standard normal distribution.
• The population standard deviation is almost always unknown.
• But if $\sigma$ is unknown, and we must use the sample standard deviation s
  to estimate it, the resulting statistic

  $\dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

  is not normal!

Student’s t Distribution
• Fortunately, this statistic does have a sampling distribution that is well
  known to statisticians, called the Student’s t distribution, with degrees
  of freedom (df) = n − 1:

  $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$


Student’s t Distribution

Note : t → Normal as n increases


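Small-Sample Confidence Interval for Population Mean $\mu$
• For a small sample size with n < 30, the $100(1-\alpha)\%$ confidence
  interval for $\mu$ is given by

  $\bar{X} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$

  that is,

  $\left(\bar{X} - t_{\alpha/2}\,\dfrac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\right)$

  where $t_{\alpha/2}$ is the upper $\alpha/2$ point of the t distribution
  with d.f. = n − 1.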


For a random sample of size n = 10, df = n − 1 = 9. What is $t_{0.025}$?

→ From the t table, $t_{0.025} = 2.262$


Example

Ten randomly selected students were each asked to list how many hours of
television they watched per month. The results are

  82 66 90 84 75
  88 80 94 110 91

Find a 90% confidence interval for the true mean number of hours of
television watched per month by students.


Solution
Calculating the sample mean and standard deviation, we have
n = 10, $\bar{x} = 86$, and s = 11.842.
We find that the value of $t_{\alpha/2}$ is 1.833 by looking in the t table in
the row corresponding to df = 9, in the column labeled $t_{.050}$.
The 90% confidence interval for $\mu$ is

  $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} = 86 \pm 1.833\,\dfrac{11.842}{\sqrt{10}} = 86 \pm 6.86 = (79.14,\ 92.86)$
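The solution can be reproduced from the raw data with a short Python sketch. This is an illustration. The function name `t_confidence_interval` is hypothetical, and the critical value 1.833 is passed in as read from the t table (df = 9, 90% confidence) because the standard library has no t-distribution quantile function.

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(data, t_crit):
    """Small-sample interval: xbar +/- t_crit * s / sqrt(n), with t_crit from a t table."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)                 # sample standard deviation (divides by n - 1)
    margin = t_crit * s / sqrt(n)
    return xbar, s, (xbar - margin, xbar + margin)

hours = [82, 66, 90, 84, 75, 88, 80, 94, 110, 91]
xbar, s, (low, high) = t_confidence_interval(hours, t_crit=1.833)

print(xbar, round(s, 3), round(low, 2), round(high, 2))
```

The same function applies to any small sample assumed to come from a normal population, provided the critical value is looked up for the matching degrees of freedom and confidence level.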

Example: Confidence interval for small sample
• The weights (pounds) of n = 8 female wolves captured in the Yukon-Charley
  Rivers National Reserve are
  57 84 90 71 77 68 73 71
• Treat these weights as a random sample from a normal distribution.
• Find a 90% confidence interval for the population mean weight of all female
  wolves living on the Yukon-Charley Rivers National Reserve.


Exercise


Comparing Two
Treatments
(Small sample)


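Estimating the Difference between Two Population Means
For small sample sizes with $n_1 < 30$ and $n_2 < 30$, the $100(1-\alpha)\%$
confidence interval for $\mu_1 - \mu_2$ is

  $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_{pooled}^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

where

  $s_{pooled}^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

and $t_{\alpha/2}$ is the critical value of t with df = $n_1 + n_2 - 2$.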


Example
A student recorded the mileage he obtained while
commuting to school in his car. He kept track of the
mileage for twelve different tanks of fuel, involving
gasoline of two different octane ratings. Compute the
95% confidence interval for the difference of mean
mileages. His data follow:

87 Octane           90 Octane
26.4, 27.6, 29.7    30.5, 30.9, 29.2
28.9, 29.3, 28.8    31.7, 32.8, 29.3


Example (solution)
Let 87 octane fuel be the first group and 90 octane fuel the second group, so
we have $n_1 = n_2 = 6$ and
$\bar{x}_1 = 28.45$, $s_1 = 1.228$, $\bar{x}_2 = 30.73$, $s_2 = 1.392$.
d.f. = $n_1 + n_2 - 2 = 10$. The critical value of $t_{0.025}$ is 2.228.

  $s_{pooled}^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{5(1.228^2) + 5(1.392^2)}{10} = \dfrac{5(1.508) + 5(1.938)}{10} = 1.723$

  $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_{pooled}^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = (28.45 - 30.73) \pm 2.228\sqrt{1.723\left(\dfrac{1}{6} + \dfrac{1}{6}\right)} = -2.28 \pm 1.688$

The 95% confidence interval for $\mu_1 - \mu_2$ is (−3.968, −0.592).
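The pooled calculation can be sketched in Python from the raw mileage data. This is an illustration. The function name `pooled_t_interval` is hypothetical, and the critical value 2.228 is passed in as read from the t table (df = 10, 95% confidence).

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_interval(sample1, sample2, t_crit):
    """Pooled-variance interval for mu1 - mu2 (assumes equal population variances)."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    diff = mean(sample1) - mean(sample2)
    margin = t_crit * sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, diff, (diff - margin, diff + margin)

octane_87 = [26.4, 27.6, 29.7, 28.9, 29.3, 28.8]
octane_90 = [30.5, 30.9, 29.2, 31.7, 32.8, 29.3]

pooled_var, diff, (low, high) = pooled_t_interval(octane_87, octane_90, t_crit=2.228)
print(round(pooled_var, 3), round(diff, 2), round(low, 2), round(high, 2))
```

The code works from unrounded sample means, so its endpoints can differ from the hand calculation in the third decimal place. The hand calculation rounds $\bar{x}_2$ to 30.73 before subtracting.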


Exercise


Key Concepts
I. Point Estimators

II. Large-Sample Interval Estimators


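Key Concepts
III. Small-Sample Interval Estimators

  $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$

  $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_{pooled}^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

  $s_{pooled}^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$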
