
Lesson 7:

Confidence intervals

Introduction

The difference between the present and the new commercial may be a
result of chance.

The example involves inferences about a population based on sample data. How can we analyse such problems?

Two equivalent methods:

1. Confidence intervals: estimating the value of a population parameter
2. Tests of significance: assessing evidence for a claim about a population

Remember the sampling distribution of $\bar{x}$

The sample mean $\bar{x}$ is a random variable (if you pick a different sample
you will probably get a different sample mean).

The distribution of $\bar{x}$:

$\bar{x}$ has mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

If the population distribution is normal, then the distribution of $\bar{x}$ is normal.

If the population distribution is not normal and n is large (in practice: $n \ge 30$), then $\bar{x}$ is approximately normal (the Central Limit Theorem).

We assume that the population standard deviation $\sigma$ is known.
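The properties above can be checked with a small simulation. This is only a sketch: the exponential population (mean 1, standard deviation 1), the sample size of 30 and the helper name `simulate_sample_means` are illustrative assumptions, not part of the lesson.

```python
import random
import statistics


def simulate_sample_means(draw, n, repetitions, seed=0):
    """Draw `repetitions` samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n))
            for _ in range(repetitions)]


def draw_exponential(rng):
    """Hypothetical skewed population: exponential with mean 1 and sd 1."""
    return rng.expovariate(1.0)


n = 30
means = simulate_sample_means(draw_exponential, n, repetitions=5000)

# The mean of the sample means should be close to mu = 1, and their
# standard deviation close to sigma / sqrt(n).
print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(1 / n ** 0.5, 3))
```

Even though the simulated population is far from normal, a histogram of `means` would already look roughly bell-shaped at n = 30.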

Statistical confidence
Example 6.5:
In a sample of 453 large firms the mean monthly premium paid
for health insurance is $\bar{x} = 405$ dollars. We assume that $\sigma$ is
known to be 112 dollars.

What can we say about the population mean $\mu$?


In repeated sampling, $\bar{x}$ is approximately $N(\mu, \sigma/\sqrt{n}) = N(\mu, 112/\sqrt{453}) \approx N(\mu, 5.27)$.

Statistical confidence
$\bar{x}$ is approximately $N(\mu, 5.27)$.
The 68-95-99.7 rule says that the probability is
about 0.95 that $\bar{x}$ is within 10.54 (two standard deviations, $2 \times 5.27$) of $\mu$.
[Figure: the 68-95-99.7 rule, a Normal curve with the horizontal axis marked from $\mu - 3\sigma$ to $\mu + 3\sigma$.]

Statistical confidence
To say that $\bar{x}$ lies within 10.54 of $\mu$ is the same as saying that $\mu$ is
within 10.54 of $\bar{x}$ (because distances are symmetrical).

So in 95% of all samples $\bar{x}$ is within 10.54 of $\mu$, or put differently: in 95%
of all samples $\mu$ is in the interval from $\bar{x} - 10.54$ to $\bar{x} + 10.54$
(with $\bar{x} = 405$).

We say that we are 95% confident that $\mu$ is between 394 and 416.

We call $405 \pm 10.54 \approx (394, 416)$ a 95% confidence interval for $\mu$.
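The arithmetic of Example 6.5 can be reproduced in a few lines. This is a minimal sketch using the figures quoted in the example and the two-standard-deviation approximation from the 68-95-99.7 rule; the variable names are illustrative. Direct computation gives a standard deviation of roughly 5.26 for the sample mean, slightly below the 5.27 quoted on the slides, which does not change the rounded interval.

```python
import math

x_bar = 405.0   # sample mean monthly premium (dollars)
sigma = 112.0   # population standard deviation, assumed known
n = 453         # number of firms in the sample

# Standard deviation of the sample mean.
standard_error = sigma / math.sqrt(n)

# 68-95-99.7 rule: about 95% of sample means fall within two
# standard deviations of mu.
margin = 2 * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"standard deviation of x_bar = {standard_error:.2f}")
print(f"approximate 95% interval = ({lower:.0f}, {upper:.0f})")
```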

Statistical confidence
We are 95% confident that $\mu$ is between 394 and 416.
Be sure you understand that $\mu$ in reality could be outside this interval!
(Although this only happens in 5% of all samples.)

We cannot know whether our sample is one of the 95% or one of the
unlucky 5%. The only thing we can say is: we arrived at the conclusion
that $\mu$ is between 394 and 416 by a process that gives correct results
95% of the time.

Implication: We don't need to take a lot of random samples to find the
sampling distribution and $\mu$ at its center. All we need is one sample; we then
rely on the properties of the sampling distribution of the mean to infer the
population mean $\mu$.

Confidence levels and critical values


A confidence level C (e.g. 95%) gives the probability that the
interval will capture the true parameter in repeated samples.

The critical value $z^*$ for a confidence level C is the value such that the area
under the standard Normal curve between $-z^*$ and $z^*$ is C.
Common critical values:

- $z^* = 2.576$ (for C = 99%)
- $z^* = 1.960$ (for C = 95%)
- $z^* = 1.645$ (for C = 90%)
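Critical values for any confidence level can be computed instead of read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_value` is an illustrative choice. Because the area between $-z^*$ and $z^*$ is C, the area to the left of $z^*$ is $(1 + C)/2$.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z* so that the area between -z* and z* under N(0, 1) is `confidence`."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Area to the left of z* is C plus half of the remaining tail area.
    return NormalDist().inv_cdf((1 + confidence) / 2)


for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {critical_value(c):.3f}")
```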

Confidence levels and critical values


Exercise: Find the critical value for a 70% confidence level.
End solution:
$z^* = 1.04$


Confidence intervals
A confidence interval always has the form
estimate $\pm$ margin of error
where the margin of error m shows the precision of the estimate.

The confidence interval for a population mean $\mu$ with a known $\sigma$ and
with a confidence level C is:

$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$

where $m = z^* \frac{\sigma}{\sqrt{n}}$ is the margin of error.

To derive the confidence interval we had to assume that the distribution
of $\bar{x}$ is normal. So the interval is exact when the population distribution
is normal and is approximately correct when n is large in other cases
(because of the central limit theorem).
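The interval formula translates directly into a small helper. This is a sketch under the lesson's assumption that $\sigma$ is known; the function name `z_confidence_interval` and its parameter names are illustrative, and the sample figures are those of Example 6.5.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)   # margin of error m
    return x_bar - margin, x_bar + margin


# Example 6.5: 453 firms, sample mean 405 dollars, sigma = 112 dollars.
low, high = z_confidence_interval(x_bar=405, sigma=112, n=453)
print(f"95% confidence interval: ({low:.1f}, {high:.1f})")
```

Using the exact critical value of about 1.96 rather than the rounded 2 from the 68-95-99.7 rule gives a slightly narrower interval than 405 ± 10.54.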


Confidence intervals
A confidence interval is an
interval that contains the true
parameter in C percent of the
samples.

Experiment with the applet:


http://bcs.whfreeman.com/psbe3e/#613741__630698__



Figure 6.5: Twenty-five samples from the
same population gave these 95%
confidence intervals.
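The repeated-sampling idea behind the figure can also be imitated in code when the applet is not available. The sketch below is an illustration only: the normal population with $\mu = 500$ and $\sigma = 2$, the sample size of 10 and the function name `coverage` are assumptions chosen for the demonstration.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, repetitions, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and compute its mean.
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / repetitions


rate = coverage(mu=500, sigma=2, n=10, confidence=0.95, repetitions=10_000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should be close to 0.95, while any single interval either contains $\mu$ or does not.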


Application: quality control


A company makes 500ml bottles of green tea. Assume we know
the standard deviation is $\sigma = 2$ml. In a sample of 10 bottles, the
mean content is $\bar{x} = 501.94$ml. Is this convincing evidence that the
mean fill of all bottles $\mu$ differs from 500ml?
Two common mistakes: do not conclude that
- the mean is 501.94 therefore it is proven that the mean differs
from 500ml.
- the difference of 1.94ml is small compared to 500ml so there is
nothing unusual going on.


Application: quality control


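So:
$\bar{x} = 501.94$
$n = 10$
$\sigma = 2$ml
Is this convincing evidence that the mean fill differs from 500ml?
We calculate a 95% confidence interval for $\mu$:

C.I. for $\mu$: $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 501.94 \pm 1.96 \times \frac{2}{\sqrt{10}} \approx (500.7, 503.2)$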

Since 500 is not included in the C.I., we are 95% confident that $\mu$ differs
from 500. So we can reject the hypothesis that the mean fill of all bottles
is 500ml.
Inferential analysis can be done either with confidence intervals or
significance tests (next lesson).
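The quality-control calculation can be written out step by step. This is a minimal sketch with the figures from the example and the rounded critical value 1.96; the variable names are illustrative.

```python
import math

x_bar = 501.94   # sample mean fill (ml)
sigma = 2.0      # population standard deviation, assumed known (ml)
n = 10           # number of bottles sampled
z_star = 1.96    # critical value for 95% confidence

margin = z_star * sigma / math.sqrt(n)
low, high = x_bar - margin, x_bar + margin

# If the target value lies outside the interval, the data give evidence
# at the 95% confidence level that the mean fill differs from the target.
target = 500.0
target_inside = low <= target <= high

print(f"95% C.I.: ({low:.1f}, {high:.1f})")
print(f"500 ml inside the interval: {target_inside}")
```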


Reducing the margin of error


Confidence interval for $\mu$: $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$

To reduce the margin of error (to make the confidence interval
narrower), there are only 3 ways:

1. Use a lower level of confidence (smaller C $\Rightarrow$ smaller $z^*$)
2. Reduce $\sigma$
3. Increase the sample size $n$
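The effect of each of the three options can be seen numerically. The sketch below starts from the green tea figures ($\sigma = 2$, $n = 10$, 95% confidence) and changes one input at a time; the alternative values (90% confidence, $\sigma = 1$, $n = 40$) and the function name `margin_of_error` are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Margin of error m = z* * sigma / sqrt(n) for a mean with known sigma."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sigma / math.sqrt(n)


base = margin_of_error(sigma=2.0, n=10, confidence=0.95)

lower_confidence = margin_of_error(sigma=2.0, n=10, confidence=0.90)  # option 1
smaller_sigma = margin_of_error(sigma=1.0, n=10, confidence=0.95)     # option 2
larger_sample = margin_of_error(sigma=2.0, n=40, confidence=0.95)     # option 3

print(f"baseline:         {base:.3f}")
print(f"lower confidence: {lower_confidence:.3f}")
print(f"smaller sigma:    {smaller_sigma:.3f}")
print(f"larger sample:    {larger_sample:.3f}")
```

Because n enters through its square root, the sample size has to be quadrupled to halve the margin of error.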


Choosing the minimum sample size


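It is possible to choose the sample size in a study so that inferences
about the population mean can be made with a predetermined level of
confidence and a predetermined margin of error:

$m = z^* \frac{\sigma}{\sqrt{n}} \quad \Rightarrow \quad n = \left( \frac{z^* \sigma}{m} \right)^2$

If the result is not a whole number, round up.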

Note: in practice $\sigma$ is usually unknown, but if n is large then the sample standard
deviation s will be close to the unknown population standard deviation $\sigma$. So
substituting s for $\sigma$ in the formula will result in an approximately correct minimum
sample size n.
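The sample-size formula is easy to wrap in a function. In this sketch the name `minimum_sample_size` and the example inputs (the green tea $\sigma$ of 2 ml with a hypothetical target margin of 0.5 ml) are illustrative assumptions; the result is rounded up because a fractional observation is not possible and rounding down would exceed the target margin.

```python
import math
from statistics import NormalDist


def minimum_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z* * sigma / sqrt(n) does not exceed `margin`."""
    if sigma <= 0 or margin <= 0:
        raise ValueError("sigma and margin must be positive")
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)


# Hypothetical target: estimate the mean fill to within 0.5 ml.
n_needed = minimum_sample_size(sigma=2.0, margin=0.5, confidence=0.95)
print("minimum sample size:", n_needed)
```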

Exercise 1
In a sample of 50 Belgian employees, the mean
gross monthly wage is $\bar{x} = 2400$ euro. The standard
deviation in the population is known to be $\sigma = 1000$ euro.

Reminder:
Confidence interval for $\mu$: $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
Minimum sample size: $n = \left( \frac{z^* \sigma}{m} \right)^2$

a. Find the 80% confidence interval for $\mu$.
b. Find the 95% confidence interval for $\mu$.
c. Find the 95% confidence interval for $\mu$ if the sample size
in this example were 500 instead of 50.
d. What sample size would you need to limit the margin of
error to 50 euro with 95% confidence?

End solutions:
a. (2219, 2581)
b. (2123, 2677)
c. (2312, 2488)
d. 1537 cases ($n = (1.96 \times 1000 / 50)^2 \approx 1536.6$, rounded up)
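The end solutions of Exercise 1 can be checked numerically. This is a sketch only: the helper names `z_star` and `interval` are illustrative, the interval endpoints are rounded to whole euros, and for part d the formula gives about 1536.6, which the code rounds up to the next whole case.

```python
import math
from statistics import NormalDist


def z_star(confidence):
    """Critical value for a two-sided interval with the given confidence."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def interval(x_bar, sigma, n, confidence):
    """Confidence interval for the mean (known sigma), rounded to whole euros."""
    m = z_star(confidence) * sigma / math.sqrt(n)
    return round(x_bar - m), round(x_bar + m)


x_bar, sigma = 2400, 1000

part_a = interval(x_bar, sigma, n=50, confidence=0.80)
part_b = interval(x_bar, sigma, n=50, confidence=0.95)
part_c = interval(x_bar, sigma, n=500, confidence=0.95)
part_d = math.ceil((z_star(0.95) * sigma / 50) ** 2)

print("a.", part_a)
print("b.", part_b)
print("c.", part_c)
print("d.", part_d)
```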

Exercise 2

Reminder: confidence interval for $\mu$: $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$

An opinion poll asks:

Will you vote for Barack Obama in the elections: Yes or No?

Let $x$ be a variable taking the values 0 (No) and 1 (Yes), so that $\bar{x}$ is
the proportion of Obama voters in the sample.

In a sample of 2400 respondents:

- $\bar{x} = 0.49$ (49% of the respondents in the sample say they will vote
for Obama)
- Assume that the standard deviation is $\sigma = 0.50$.

Estimate the margin of error in this poll for a 95% confidence level.

End solution:
the margin of error in this poll is approximately 2 percentage points.
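The poll's margin of error follows from the same formula as before. This is a minimal sketch with the figures given in the exercise and the rounded critical value 1.96; the variable names are illustrative.

```python
import math

n = 2400        # number of respondents
sigma = 0.50    # assumed standard deviation of the 0/1 variable
z_star = 1.96   # critical value for 95% confidence

margin = z_star * sigma / math.sqrt(n)
print(f"margin of error: {margin:.3f} ({margin * 100:.1f} percentage points)")
```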
