
10/17/2015

1-1

Advanced Probability and Statistics
Dr. Said Mirza Pahlevi

Copyright © 2004 by The McGraw-Hill Companies, Inc. Dr. Said Mirza Pahlevi – STMIK Pascasarjana Nusa Mandiri

1-2

What is Statistics?


1-3

When you have completed this chapter, you will be able to:

1. Explain what is meant by statistics.
2. Identify the role of statistics in the development of knowledge and everyday life.
3. Explain what is meant by descriptive statistics and inferential statistics.
4. Distinguish between a qualitative variable and a quantitative variable.

1-4

5. Distinguish between a discrete variable and a continuous variable.
6. Collect data from published and unpublished sources.
7. Distinguish among the nominal, ordinal, interval, and ratio levels of measurement.
8. Identify abuses of statistics.

1-5

9. Gain an overview of the art and science of statistics.

1-6
What is Meant by Statistics?

• In the more common usage, statistics refers to numerical information
• A collection of numerical information is called statistics (plural)

1-7
Example of Statistics

1-8
What is Meant by Statistics?

…it is the art and science of…
• collecting
• organizing
• presenting data
• drawing inferences from a sample of information about an entire population
as well as
• predicting and developing policy analysis

1-9
Question: What is Statistics?

Statistics: the art and science of collecting, organizing, and presenting data, and of drawing inferences from a sample of information about an entire population, as well as predicting and developing policy analysis.

1 - 12

in everyday life


1 - 13
Who uses Statistics?

Those using statistical techniques include:
• Marketers
• Investors
• Accountants
• Economists
• Sports people
• Consumers
• Hospitals
• Statisticians
• Quality Controllers
• Educators
• Politicians
• Physicians

1 - 14
Who uses Statistics?

• Weather Forecasters

1 - 15

Types of
Statistics

1 - 16
Types of Statistics

Descriptive: methods of collecting, organizing, presenting, and analyzing data.
Inferential: the science of making inferences about a population, based on sample information.

Statistics: the art and science of collecting, organizing, and presenting data, drawing inferences from a sample of information about an entire population, as well as predicting and developing policy analysis.

1 - 17
Identify the following as Descriptive or Inferential:

A. A Gallup poll found that 83% of the people in a survey knew which country won the gold medal in Men’s Hockey in 2002.
B. The accounting department of a firm will select a sample of invoices to check for accuracy of all the invoices of the company.
C. Wine tasters sip a few drops of wine to make a decision with respect to all the wine waiting to be released for sale.

1 - 18
The Method of Experimentation

We start off with particular observations from the real world and draw conclusions about the general patterns in the real world!
1. Define the experimental goal or a working hypothesis
2. Design an experiment
3. Collect data
4. Estimate the values/relations
5. Draw inferences
6. Predict and prepare policy analysis

1 - 19

A study was undertaken to estimate the average height of penguins in Antarctica.

Let’s review the steps they would take to prepare the estimate.

1 - 20


1 - 21

A population is a collection of all possible individuals, objects, or measurements of interest.

1 - 22

From

Take a

…which are deemed to be representative of the

What we now need is…



1 - 23

Take a measurement for each one in the sample.
Record each measurement.

1 - 24

…to put the data into a readable and understandable format!

Displaying Data Results

1 - 25

Two methods that can be used to ‘see’ what the data conveys are Tables and Graphs/Charts.

More on these in chapter 2…

1 - 26

Tables
… are an efficient method of displaying data and depicting data accurately.

More on these in chapter 2…

1 - 27

Line, Pie, and Bar charts
More on these in chapter 2…

1 - 28

Why take a sample instead of studying every member of the population?

• Costs of surveying the entire population may be too large or prohibitive
• Destruction of elements during investigation
• The sample results are adequate

More in chapter 8 …

1 - 29
Example of Using Sample


1 - 32

Data are everywhere

Statistical techniques are used to make many decisions that affect our lives.
The knowledge of statistical methods will help you understand how decisions are made and give you a better understanding of how they affect you.

1 - 33
Question: What Decision?


1 - 35
Types of Data

A Variable: a characteristic of a population or sample that is of interest to us

1 - 36
Types of Data

Variables
• Qualitative: categorical observations
• Quantitative: numerical observations

1 - 37
Variables

Qualitative – or Attribute
Country of Birth: U.K., Germany, Taiwan, China, India, Japan, Russia
Eye Colour: Blue, Brown, Hazel, Green, Red
Gender: Male, Female

1 - 38
Variables

Quantitative – Numeric
Minutes to end of Class: 55, 45, 30, 20, 0
Number of Children in a Family: 0, 1, 2, 3, 4
Number of Satisfied Maple Leafs Fans: 10, 20, 30, 40, …
Number of Two-Door Garages in a Street

1 - 39
Variables

Quantitative (numerical observations) … can be classified as either Discrete or Continuous

Discrete characteristics … can only assume certain values, and there are usually “gaps” between values
e.g. - Number of bedrooms in a house
     - Number of hammers sold (1, 2, 3, … etc.)

1 - 40
Variables

Quantitative (numerical observations) … can be classified as either Discrete or Continuous

Continuous characteristics … can assume any value within a specified range!
e.g. - Pressure in a tire
     - Weight of a beef chop
     - Height of students in a class

1 - 42
Summary of Types of Variables


1 - 44

Nominal

Ordinal

Interval

Ratio


1 - 45

Nominal
Data can only be classified into categories or counted and cannot be arranged in any particular order

Example: M & Ms
Category: Candy
Classification: By Colour only
(No natural order)

1 - 46

Nominal

Example: M & Ms
Mutually Exclusive:
…where an individual, object, or measurement is included in ONLY ONE CATEGORY
Exhaustive:
…where each individual, object, or measurement MUST APPEAR in one of the categories

1 - 47

Ordinal
…involves data arranged in some order, but the differences between data values cannot be determined or are meaningless!

Example: During a taste test of 4 soft drinks:
Mello Yello was ranked number 1.
Sprite number 2.
Seven Up number 3.
Orange Crush number 4.

1 - 48

Interval
…similar to the Ordinal Level, with the additional property that meaningful amounts of differences between data values can be determined.
There is no natural zero point.

Example
Temperature on the Celsius scale.

1 - 50

Ratio
…the Interval Level with an inherent zero starting point.
Differences and ratios are meaningful for this level of measurement.

Examples
• Monthly income of surgeons
• Distance travelled by manufacturer’s representatives per month

1 - 53
Sources of Statistical Information

Published Data
• Statistical Abstracts
• Weather
• Sports

1 - 54
Sources of Statistical Information

Internet
www.bps.go.id
www.bankofcanada.ca
Government of Canada & Provinces: www.gc.ca
www.theweathernetwork.com
www.mcgrawhill.ca/college/lind

1 - 56
Sources of Statistical Information

Commissioned surveys:
To develop information for the survey that they are doing, pollsters often contact the selected ‘sample population’.

For example: at home, over the telephone, by mail, by email, in the street, and at shopping malls!
How to collect data…

1 - 57


1 - 58
Caution

As you begin to study statistical methods, you are cautioned to take what you see published as “statistical facts” with a healthy grain of skepticism!

… an average may not be representative of all the data
… graphs can also be misleading
… be sure to study the sampling methods

For Example

1 - 59
Caution

Review the following three slides and notice the effect that the different scales have on your interpretation of the pattern between Crime and Unemployment Rates.

1 - 60

Chart 1-11A, 1986 - 1999
[Figure: vertical axis from 2000 to 3200; horizontal axis from 0 to 12]

1 - 61

Chart 1-11B, 1986 - 1999
[Figure: vertical axis from 0 to 3000; horizontal axis from 7 to 12]

1 - 62

Chart 1-11C, 1986 - 1999
[Figure: vertical axis from 2000 to 3200; horizontal axis from 7 to 12]
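The effect of the axis scales in Charts 1-11A to 1-11C can be illustrated with a little arithmetic. The sketch below is only an illustration: the two values are hypothetical (they are not read from the charts), and `drawn_ratio` is a helper name introduced here. It compares how much taller one plotted point looks than another when the vertical axis starts at 0 versus at 2000.

```python
def drawn_ratio(low_value, high_value, axis_start):
    """Ratio of plotted heights when the vertical axis starts at axis_start."""
    return (high_value - axis_start) / (low_value - axis_start)

# Hypothetical values, chosen only to fall inside the 2000-3200 axis range.
low, high = 2400, 3000

full_axis = drawn_ratio(low, high, axis_start=0)          # axis starts at 0
truncated_axis = drawn_ratio(low, high, axis_start=2000)  # axis starts at 2000

print(full_axis)       # 1.25
print(truncated_axis)  # 2.5
```

With the same two values, the truncated axis makes the higher point look 2.5 times as tall instead of 1.25 times, which is why the scale should always be checked before a pattern is interpreted.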
2-1

Describing Data: Frequency Distributions and Graphic Presentations

2-2

When you have completed this chapter, you will be able to:

Organize raw data into a frequency distribution

Produce a histogram, a frequency polygon, and a cumulative frequency polygon from quantitative data

Develop and interpret a stem-and-leaf display
2-3

Present qualitative data using graphical techniques such as a clustered bar chart, a stacked bar chart, and a pie chart

Detect graphic deceptions and use a graph to present data with clarity, precision, and efficiency
2-5

A Frequency Distribution is a grouping of data into non-overlapping classes (mutually exclusive)… showing the number of observations in each category or class.

The range of categories includes all values in the data set (collectively exhaustive classes).

2-6

Class Midpoint or Class Mark:
A point that divides a class into two equal parts, i.e. the average of the upper and lower class limits.
Midpoints shown on the slide: 12.5, 17.5, 22.5, 27.5, 32.5 (consecutive midpoints are 5 apart).

Class frequency:
The number of observations in each class.

Class interval:
The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class.
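The midpoint and class-interval definitions can be checked with a minimal sketch. It assumes the lower class limits 10, 15, 20, 25, 30 from the hours-studying example worked later in this chapter, and it treats the upper limit of each class as the lower limit of the next one ("10 to under 15"); the variable names are illustrative only.

```python
# Lower class limits, assumed from the hours-studying example in this chapter.
lower_limits = [10, 15, 20, 25, 30]

# Class interval: lower limit of the next class minus lower limit of this class.
class_interval = lower_limits[1] - lower_limits[0]

# Class midpoint: average of the lower and upper limits of each class.
midpoints = [(lower + (lower + class_interval)) / 2 for lower in lower_limits]

print(class_interval)  # 5
print(midpoints)       # [12.5, 17.5, 22.5, 27.5, 32.5]
```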
2-7

Dr. Tillman is Dean of the School of Business. He wishes to prepare a report showing the number of hours per week students spend studying. He selects a random sample of 30 students and determines the number of hours each student studied last week.

15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6.

Organize the data into a frequency distribution.

2-8
Frequency Distributions by hand

There are five steps that can be used to construct a Frequency Distribution:
1. Decide how many classes you wish to use.
2. Determine the class width.
3. Set up the individual class limits.
4. Tally the items into the classes.
5. Count the number of items in each class.
2-9
Decide how many classes you wish to use

Rule of Thumb:
For most data sets, you would want between 3 and 12 classes!
Use the “2 to the k” rule.
Choose k so that 2 raised to the power of k is greater than the number of data points (n), here 30.
In this case we need 2^k > 30 students.
2^5 = 32, so use k = about 5 classes
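The “2 to the k” rule is easy to express as a short function. This is a minimal sketch; the function name `classes_by_2k_rule` is introduced here for illustration and is not part of the original material.

```python
def classes_by_2k_rule(n: int) -> int:
    """Return the smallest k such that 2**k is greater than n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

k = classes_by_2k_rule(30)
print(k, 2 ** k)  # 5 32
```

For the 30 students in the example this gives k = 5, since 2^4 = 16 is too small and 2^5 = 32 exceeds 30.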

2 - 10
Determine the class width

Generally, the class width should be the same size for all classes.

Class width >= (Max - Min) / k

With k = 5: (33.8 – 10.3) / 5 = 4.7
Therefore, use a class size of 5 hours.
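The class-width calculation can be reproduced as follows. This is a sketch under one assumption: the raw width is rounded up to the next whole hour, which is how 4.7 becomes the "class size of 5 hours" in the example.

```python
import math

hours = [
    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6,
]
k = 5  # number of classes from the "2 to the k" rule

raw_width = (max(hours) - min(hours)) / k  # (33.8 - 10.3) / 5, about 4.7
class_width = math.ceil(raw_width)         # round up to a convenient width

print(round(raw_width, 1), class_width)  # 4.7 5
```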
2 - 11
Set up the individual class limits

Minimum value is 10.3 and class width is 5 hours; therefore, classes should start at 10 hours.
Lower class limits will be: 10, 15, 20, etc.

Classes         or   Classes
10.0 – 14.9          10.0 to under 15
15.0 – 19.9          15.0 to under 20
20.0 – 24.9          20.0 to under 25
25.0 – 29.9          25.0 to under 30
30.0 – 34.9          30.0 to under 35

2 - 12
Tally the items into the classes

Find the values that fall in the first class, 10.0 to under 15 (highlighted on the slide: 14.2, 13.5, 12.9, 13.7, 14.0, 12.9), …and so on with the remaining hours.

Classes              Tally
10.0 to under 15
15.0 to under 20
20.0 to under 25
25.0 to under 30
30.0 to under 35
2 - 13
Count the number of items in each class

Hours Studying x      Frequency f
10.0 to under 15       7
15.0 to under 20      12
20.0 to under 25       7
25.0 to under 30       3
30.0 to under 35       1
Total                 30
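The tally and count steps can be sketched in a few lines. The helper `tally` is a name introduced here for illustration; each class is treated as "lower limit to under lower limit + width", matching the class labels above. The second call uses the alternative lower limits 7.5, 12.5, … from the "Using different limits" slide to show how the distribution changes.

```python
hours = [
    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6,
]

def tally(data, lower_limits, width):
    """Count the values falling in each class [lower, lower + width)."""
    counts = [0] * len(lower_limits)
    for x in data:
        for i, lower in enumerate(lower_limits):
            if lower <= x < lower + width:
                counts[i] += 1
                break  # classes are mutually exclusive
    return counts

by_hand = tally(hours, [10, 15, 20, 25, 30], 5)
shifted = tally(hours, [7.5, 12.5, 17.5, 22.5, 27.5, 32.5], 5)

print(by_hand)  # [7, 12, 7, 3, 1]
print(shifted)  # [1, 12, 10, 5, 1, 1]
```

Both results sum to 30, so every observation lands in exactly one class (the classes are mutually exclusive and collectively exhaustive).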

2 - 14
Using different limits
…will give you a different distribution, e.g.

Hours Studying x      Frequency f
7.5 to under 12.5      1
12.5 to under 17.5    12
17.5 to under 22.5    10
22.5 to under 27.5     5
27.5 to under 32.5     1
32.5 to under 37.5     1
Total                 30
2 - 19

Relative Frequency Distribution

2 - 20
Relative Frequency Distribution

…shows the proportion (or percent) of observations in each class!

Hours Studying x      f     Relative f
10.0 to under 15       7     7/30 = 0.2333
15.0 to under 20      12    12/30 = 0.40
20.0 to under 25       7     7/30 = 0.2333
25.0 to under 30       3     3/30 = 0.10
30.0 to under 35       1     1/30 = 0.0333
Total                 30    30/30 = 1
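A relative frequency is simply each class frequency divided by the total. The sketch below recomputes the column from the class frequencies of the hours-studying example; the dictionary layout is an illustrative choice, not something prescribed by the slides.

```python
frequencies = {
    "10.0 to under 15": 7,
    "15.0 to under 20": 12,
    "20.0 to under 25": 7,
    "25.0 to under 30": 3,
    "30.0 to under 35": 1,
}

total = sum(frequencies.values())
relative = {label: f / total for label, f in frequencies.items()}

for label, share in relative.items():
    print(f"{label:<18} {frequencies[label]:>2}  {share:.4f}")
```

The relative frequencies always add up to 1 (100%), which is a quick check that no class was missed.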
2 - 21
Using different limits

Hours Studying x      f     Relative f
7.5 to under 12.5      1     1/30 = 0.0333
12.5 to under 17.5    12    12/30 = 0.40
17.5 to under 22.5    10    10/30 = 0.3333
22.5 to under 27.5     5     5/30 = 0.1667
27.5 to under 32.5     1     1/30 = 0.0333
32.5 to under 37.5     1     1/30 = 0.0333
Total                 30    30/30 = 1

2 - 22
Stem-and-leaf Displays
A statistical technique for displaying a set of data.
Each numerical value is divided into two parts:
1. the leading digits become the stem and
2. the trailing digits become the leaf.

…an advantage of the stem-and-leaf display over a frequency distribution is that we retain the value of each observation!
2 - 23
Stem-and-leaf Displays

A student achieved the following scores on the twelve accounting quizzes this semester:
86, 79, 92, 84, 69, 88, 91,
83, 96, 78, 82, 85.
Construct a stem-and-leaf chart to illustrate the results.

2 - 24
Stem-and-leaf Displays

First, find the lowest score.
86, 79, 92, 84, 69, 88,
91, 83, 96, 78, 82, 85.
Now find the lowest score for each of the next leading digits.
You should now have the following STEMS:

Lowest score   Stem
69             6
78             7
82             8
91             9
2 - 25
Stem-and-leaf Displays
86, 79, 92, 84, 69, 88,
91, 83, 96, 78, 82, 85.
Now, list the remaining ‘leaf’ scores!

Split   Stem   Leaf
69      6      9
78      7      8 9
82      8      2 3 4 5 6 8
91      9      1 2 6
All 12 Scores
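The construction can be sketched in code for two-digit scores: the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is introduced here for illustration, and the sketch assumes whole-number values.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted values by leading digit(s) (stem) and last digit (leaf)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)

scores = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
display = stem_and_leaf(scores)

for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is kept, the original scores can be read back from the display, which is the advantage over a frequency distribution noted earlier.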

2 - 26

The grades on a statistics exam for a sample of 40 students are as follows:

Stem   Leaf
3      68
4      1278
5      0125589
6      01112578889
7      0025667
8      46889
9      0246

Alpha-Numeric Grading
A+ = 90%-100%
A  = 80%-89%
B+ = 75%-79%
B  = 70%-74%
C+ = 65%-69%
C  = 60%-64%
D  = 55%-59%
F  = 0%-54%

How many students earned an A on this test? 5
What is the most common letter grade earned? F
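The two questions can be checked by expanding the stem-and-leaf display back into individual grades and applying the grading scale. This is a sketch; the names `stem_leaf`, `scale`, and `letter` are illustrative and not from the original slides.

```python
from collections import Counter

# Stem-and-leaf display from the slide: stem -> string of leaf digits.
stem_leaf = {
    3: "68", 4: "1278", 5: "0125589", 6: "01112578889",
    7: "0025667", 8: "46889", 9: "0246",
}
grades = [10 * stem + int(leaf) for stem, leaves in stem_leaf.items() for leaf in leaves]

# Lower cutoff of each letter grade, highest first.
scale = [(90, "A+"), (80, "A"), (75, "B+"), (70, "B"),
         (65, "C+"), (60, "C"), (55, "D"), (0, "F")]

def letter(score):
    """Return the letter grade for a percentage score."""
    for cutoff, name in scale:
        if score >= cutoff:
            return name

counts = Counter(letter(g) for g in grades)
print(len(grades))            # 40
print(counts["A"])            # 5
print(counts.most_common(1))  # [('F', 9)]
```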
2 - 29

Graphic Presentation of a Frequency Distribution

2 - 30
Graphic Presentation of a Frequency Distribution

The three commonly used graphic forms are:
• Histograms
• Frequency Polygons or Line Charts
• Cumulative Frequency Distributions
2 - 31
Graphic Presentation of a Frequency Distribution

A Histogram is a graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis.
The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.
[Figure: histogram; vertical axis: Frequency; horizontal axis: Class]

2 - 32
Graphic Presentation of a Frequency Distribution

Histogram

Hours Studying x      f
10.0 to under 15       7
15.0 to under 20      12
20.0 to under 25       7
25.0 to under 30       3
30.0 to under 35       1
[Figure: histogram of the table above; vertical axis from 0 to 14; horizontal axis: Hours spent studying, 10 to 35]
2 - 33
Graphic Presentation of a Frequency Distribution

A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.
[Figure: frequency polygon; vertical axis from 0 to 14; horizontal axis: class midpoints 7.5 to 27.5]

A cumulative frequency distribution is used to determine how many or what proportion of the data values are below or above a certain value.
[Figure: cumulative frequency distribution; vertical axis from 0 to 35; horizontal axis from 10 to 35]

2 - 34

Making a Histogram in Excel
2 - 35
Using Excel

Click on DATA ANALYSIS

See

Click on HISTOGRAM

2 - 36
Using Excel

The upper limits of the classes you have determined must now be entered from Column B (Excel calls these “bins”)

Complete INPUTTING of DATA
2 - 37
Using Excel

To remove the Legend on the right side…
Right mouse click and Click on Clear

To remove the spaces between the bars…
Right mouse click on one of the bars and Click on Format Data Series

2 - 38
Using Excel

Now, Click on the Options tab;
To reduce/remove the spaces between the bars, adjust the Gap width down to 0 and Click on OK.
2 - 39
Using Excel

Edit the size of the histogram, titles, etc. as appropriate.

Note that the upper limit values are included in each class – this explains the difference between this Excel Frequency Distribution and the one we did by hand.
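The difference described in the note can be reproduced without Excel. The sketch below assumes only what the note states, namely that a value equal to a bin's upper limit is counted in that bin; it does not call any Excel API, and `count_upper_inclusive` is a name introduced here.

```python
from bisect import bisect_left

hours = [
    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6,
]
bins = [15, 20, 25, 30, 35]  # upper class limits entered as "bins"

def count_upper_inclusive(data, bins):
    """Count values so that a value equal to a bin limit falls in that bin.

    Assumes every value is at most the last bin limit.
    """
    counts = [0] * len(bins)
    for x in data:
        counts[bisect_left(bins, x)] += 1  # first bin whose limit is >= x
    return counts

inclusive = count_upper_inclusive(hours, bins)
print(inclusive)  # [8, 11, 7, 3, 1]
```

Compared with the by-hand frequencies 7, 12, 7, 3, 1, the only change is the observation 15.0, which moves from the class "15.0 to under 20" into the bin ending at 15.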

2 - 40
Frequency Polygon or Line Chart for Hours Spent Studying

Hours Studying x      f
10.0 to under 15       7
15.0 to under 20      12
20.0 to under 25       7
25.0 to under 30       3
30.0 to under 35       1
[Figure: frequency polygon; vertical axis from 0 to 14; horizontal axis: Hours spent studying, 10 to 35]
Notice that the class midpoints (the plotted points) aren’t as “user friendly” in this distribution choice.
2 - 41
Cumulative Frequency Distribution For Hours Studying

Hours Studying x      f     Hours Studying   Cumulative f
10.0 to under 15       7    under 15          7
15.0 to under 20      12    under 20         19
20.0 to under 25       7    under 25         26
25.0 to under 30       3    under 30         29
30.0 to under 35       1    under 35         30

Graph…..
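The cumulative column is a running total of the class frequencies. A minimal sketch using the standard library's `itertools.accumulate` (variable names are illustrative):

```python
from itertools import accumulate

upper_limits = [15, 20, 25, 30, 35]
frequencies = [7, 12, 7, 3, 1]

# Running total: number of observations below each upper class limit.
cumulative = list(accumulate(frequencies))

for limit, total in zip(upper_limits, cumulative):
    print(f"under {limit}: {total}")
```

The last cumulative value equals the sample size (30), and the pairs (upper limit, cumulative frequency) are the points plotted on a cumulative frequency graph.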

2 - 42
Cumulative Frequency Distribution For Hours Studying

Hours Studying   Cumulative f
under 15          7
under 20         19
under 25         26
under 30         29
under 35         30
[Figure: cumulative frequency polygon; vertical axis from 0 to 35; horizontal axis: Hours spent studying, 10 to 35]
Notice that the limits are the plotted points.
2 - 43

Pie, Bar, and Line charts
… used primarily for Qualitative Data

2 - 44

Pie
…is useful for displaying a Relative Frequency Distribution
A circle is divided proportionally to the relative frequency and portions of the circle are allocated for the different groups.
2 - 45

Pie
200 runners were asked to indicate their favourite type of running shoe.

Type      # of runners selecting
Nike       92
Adidas     49
Reebok     37
Asics      13
Other       9

Draw a pie chart based on this information.

2 - 46

Pie
Relative Frequency Distribution for the running shoes

Type        #      %
Nike       92    46.0
Adidas     49    24.5
Reebok     37    18.5
Asics      13     6.5
Other       9     4.5
Total     200   100
[Figure: pie chart with slices Nike 46.0%, Adidas 24.5%, Reebok 18.5%, Asics 6.5%, Other 4.5%]
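The percentages in the table, and the slice angles needed to draw the pie by hand, follow directly from the counts. The sketch below is illustrative; converting each share to degrees (share of 360°) is a standard construction added here, not something stated on the slide.

```python
runners = {"Nike": 92, "Adidas": 49, "Reebok": 37, "Asics": 13, "Other": 9}
total = sum(runners.values())

# Relative frequency as a percentage, and the matching slice angle in degrees.
percent = {shoe: 100 * count / total for shoe, count in runners.items()}
angle = {shoe: 360 * count / total for shoe, count in runners.items()}

for shoe in runners:
    print(f"{shoe:<7} {runners[shoe]:>3}  {percent[shoe]:5.1f}%  {angle[shoe]:6.1f} degrees")
```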
2 - 47

Pie

Type        #      %
Nike       92    46.0
Adidas     49    24.5
Reebok     37    18.5
Asics      13     6.5
Other       9     4.5
Total     200   100

Using Excel, follow the steps in the Chart Wizard to construct a Pie Chart!

2 - 48
Bar

…can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).

Examples of…
(also known as a 3-D ‘column chart’)
2 - 49
Bar

Use bar charts also when the order in which qualitative data are presented is meaningful.

2 - 50
Bar

How could we chart this data?
2 - 51
Bar

Using Excel we can produce this…

Other formats…

2 - 52
Bar

Employment Rate in Canadian Cities

Canadian City    Employment Rate
Victoria         57.7
Vancouver        61.4
Edmonton         67.1
Winnipeg         66.7
Saskatoon        63.7
Regina           67.4
Thunder Bay      61.0
London           63.3
Kitchener        66.0
Hamilton         63.2
Toronto          65.1
Quebec           59.7
Sherbrooke       59.2
Montreal         60.4
Halifax          60.5
[Figure: bar chart; vertical axis: % employment, 52 to 70]
2 - 53
Bar

Employment Rate in Canadian Cities - by Province
[Figure: bar chart of the same city employment rates (Victoria 57.7 through Halifax 60.5); vertical axis: % employment, 52 to 70]

2 - 54
Bar

Did any of the previous Bar Charts adequately display all the information that was provided?

The following has been modified from the data found by Statistics Canada.
Does it do an effective job of displaying the StatCan data?
Clustered Bar 2 - 55

Comparison of Internet Use in 2000 and 2001

[Clustered bar chart; vertical axis: % of enterprises, 0 to 100; categories: Manufacturing, Wholesale trade, Retail trade; series: % of enterprises that use the Internet (2000 and 2001), % of enterprises with a Web site (2000 and 2001)]

Data Source: Statistics Canada

Stacked Bar 2 - 56

Full-Time University Faculty By Gender, Canada and Jurisdictions, 1987-88 and 1997-98

            Total               Full Professor      Associate Professor   Other
            1987-88   1997-98   1987-88   1997-98   1987-88   1997-98     1987-88   1997-98
Number      34,651    33,925    12,829    13,910    12,650    12,095      9,172     7,817
% Female    17        25        7         13        17        28          32        44
% Male      83        75        93        87        83        72          68        56

[Stacked bar chart: Canadian Full Time University Faculty; vertical axis: % of Total; bars for 1987-88 and 1997-98, split into % males and % females]

Data Source: Statistics Canada
2 - 57

Make sure that your charts are not overly cluttered

2 - 58

Shapes of Histograms

There are four typical shape characteristics
2 - 59

…a balanced effect!

Both ‘balanced’ or
‘have symmetry’


2 - 60

… occurs when the observations are graphed as being skewed or tilted more to one side of the centre of the observations than the other.

The skewness, if on the right side, is said to be ‘positive’.
The skewness, if on the left side, is said to be ‘negative’.
Modal Class 2 - 61

A modal class is the one with the largest number of observations

This is a uniModal Histogram

Modal Class 2 - 62

This is a biModal Histogram
2 - 63

Population distributions are often bell shaped.

Drawing a histogram helps verify the shape of the population in question.

Line 2 - 64

Line charts are particularly useful when the trend over time is to be emphasized

Examples: 3-D, in combination
Line 2 - 65

Time Plot: Monthly Steel Production

[Line chart; horizontal axis: Month, January 2000 to October 2002; vertical axis from 5.5 to 8.5]

Line 2 - 66

Employment Rate in Canadian Cities

[Line chart; vertical axis: % employment, 52 to 70]

Preparing a Line Chart for this type of data is not overly useful!
Line 2 - 67

Employment Rate in Canadian Cities

[Combination chart; vertical axis: % employment, 52 to 70]

Is this combination any better for displaying the data?

Line 2 - 68

Frequency Polygon and Ogive

[Two panels: Frequency Polygon (vertical axis 0.0 to 0.3) and Ogive (vertical axis 0.0 to 1.0); horizontal axis in both: Sales, 0 to 50]
30/10/2015

3-1

Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

3-2

When you have completed this chapter, you will be able to:
1. Calculate the arithmetic mean, the weighted mean, the median, the mode, and the geometric mean of a given data set.
2. Identify the relative positions of the arithmetic mean, median and mode for both symmetric and skewed distributions.
3. Point out the proper uses and common misuses of each measure.
4. Explain your choice of the measure of central tendency of data.
5. Explain the result of your analysis.

Five Measures of Central Tendency 3-3

arithmetic mean, median, mode, weighted mean, geometric mean

Average price of a house in Ottawa (2000) was $126 000
The average price of a house in Toronto in 1996 was $238,511 (StatCan)
The average income of two parent families with children in Canada was $65,847 in 1995 and $72,910 in 1999. (StatCan)
My grade point average for last semester was 4.0

Arithmetic Mean 3-4

…is the most widely used measure of location.

It is calculated by summing the values and dividing by the number of values
It requires at least interval-scale data
All values are used
It is unique
The sum of the deviations from the mean is 0

Population Mean 3-5

Formula: \[ \mu = \frac{\sum x}{N} \]

μ … is the population mean (pronounced mu)
N … is the total number of observations
x … is a particular value
Σ … indicates the operation of adding (sigma)

3-6

Terminology
Parameter
…is a measurable characteristic of a Population

Statistic
…is a measurable characteristic of a Sample

Population Mean 3-7

Formula: \[ \mu = \frac{\sum x}{N} \]

The Kiers family owns four cars. The following is the current mileage on each of the four cars:
56,000  23,000  42,000  73,000

Find the mean mileage for the cars.

\[ \mu = \frac{56\,000 + 23\,000 + 42\,000 + 73\,000}{4} = 48\,500 \]
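The population mean calculation above can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the slides.

```python
# Mileage of the four Kiers family cars, copied from the example above.
mileages = [56_000, 23_000, 42_000, 73_000]

# Population mean: the sum of all values divided by the number of values N.
population_mean = sum(mileages) / len(mileages)

print(population_mean)  # 48500.0
```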

Sample Mean 3-8

Formula: \[ \bar{x} = \frac{\sum x}{n} \]

x̄ … is the sample mean (read “x bar”)
n … is the number of sample observations
x … is a particular value
Σ … indicates the operation of adding (sigma)

3-9

A sample of five executives received the following bonuses last year ($000):
14.0  15.0  17.0  16.0  15.0
Determine the average bonus given last year:

\[ \bar{x} = \frac{\sum x}{n} = \frac{14 + 15 + 17 + 16 + 15}{5} = \frac{77}{5} = 15.4 \]

The average bonus given last year was $15 400

3 - 10
Properties of an Arithmetic Mean
…Every set of interval-level and ratio-level data has a mean
…All the values are included in computing the mean
…A set of data has a unique mean
…The mean is affected by unusually large or small data values
…The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero!
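The zero-sum property of the deviations can be illustrated with the executive bonus figures from the earlier sample-mean example. This is only a sketch: the variable names are assumptions, and a small tolerance is used because floating-point sums are rarely exactly zero.

```python
import math

# Executive bonuses ($000) from the sample-mean example.
bonuses = [14.0, 15.0, 17.0, 16.0, 15.0]

mean_bonus = sum(bonuses) / len(bonuses)

# Deviation of each value from the mean; these should cancel out.
deviations = [x - mean_bonus for x in bonuses]
total_deviation = sum(deviations)

print(mean_bonus)
print(math.isclose(total_deviation, 0.0, abs_tol=1e-9))  # True
```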

3 - 11
Arithmetic Mean as a Balance Point
Illustrate the mean of the values 3, 8 and 4.

\[ \bar{x} = \frac{3 + 8 + 4}{3} = \frac{15}{3} = 5 \]

3 - 12
The mean is affected by unusually
large or small data values


3 - 13

Determining the Mean in Excel

Using 3 - 14

See

Click on DATA
ANALYSIS

See…

Using 3 - 15

See

Highlight DESCRIPTIVE STATISTICS


…Click OK See…

Using 3 - 16

INPUT NEEDS
See

A3:A42

See…

Using 3 - 17

See Solution

Alternate solution…

Using 3 - 18

CLICK ON

CLICK ON PASTE FUNCTION


See

See…

Using 3 - 19

SCROLL DOWN TO STATISTICAL

See…

Using 3 - 20

HIGHLIGHT AVERAGE IN RIGHT MENU

See

See…

Using 3 - 21

See

The mean (average) is placed in the cell on the worksheet where your cursor was when you began.

3 - 22
Weighted Mean
The weighted mean of a set of numbers \( x_1, x_2, \ldots, x_n \) with corresponding weights \( w_1, w_2, \ldots, w_n \) is computed from the following formula:

\[ \mu_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} \]

3 - 23
Weighted Mean


3 - 24

During a one hour period on a hot Saturday afternoon cabana boy Chris served fifty drinks.
He sold:
…five drinks for $0.50
…fifteen for $0.75
…fifteen for $0.90
…fifteen for $1.15
Compute the weighted mean of the price of the drinks:

\[ \mu_w = \frac{5(\$0.50) + 15(\$0.75) + 15(\$0.90) + 15(\$1.15)}{5 + 15 + 15 + 15} = \frac{\$44.50}{50} = \$0.89 \]
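The weighted mean of the drink prices can be reproduced as follows. This sketch assumes the fourth price is $1.15, the figure used in the slide's own calculation; the list names are illustrative.

```python
# Drink prices in dollars and the number of drinks sold at each price.
# Assumption: the fourth price is 1.15, as in the slide's calculation.
prices = [0.50, 0.75, 0.90, 1.15]
counts = [5, 15, 15, 15]

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
weighted_mean = sum(w * x for w, x in zip(counts, prices)) / sum(counts)

print(round(weighted_mean, 2))  # 0.89
```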

3 - 25
The Median

The Median is the midpoint of the values after they have been ordered from the smallest to the largest

There are as many values above the median as below it in the data array

For an even number of values, the median is the arithmetic average of the two middle numbers

3 - 26
The Median


3 - 27

The ages for a sample of five college students are:
21, 25, 19, 20, 22
Arranging the data in ascending order gives:
19, 20, 21, 22, 25
Thus the median is 21

The heights of four basketball players, in inches, are:
76, 73, 80, 75
Arranging the data in ascending order gives:
73, 75, 76, 80
Thus the median is 75.5
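Both median examples can be checked with Python's standard `statistics` module, which sorts the data internally. A minimal sketch with illustrative variable names:

```python
import statistics

ages = [21, 25, 19, 20, 22]   # odd number of values
heights = [76, 73, 80, 75]    # even number of values

# Odd count: the middle value of the sorted data.
median_age = statistics.median(ages)
# Even count: the average of the two middle values.
median_height = statistics.median(heights)

print(median_age)     # 21
print(median_height)  # 75.5
```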

3 - 28
Properties of the Median

• There is a unique median for each data set
• It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur
• It can be computed for ratio-level, interval-level, and ordinal-level data

3 - 29
The Mode
The Mode
is the value of the observation that appears most frequently

The exam scores for ten students are:
81, 93, 84, 75, 68, 87, 81, 75, 81, 87
The score of 81 occurs the most often
…it is the Mode!
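The mode of the exam scores can be found with the standard `statistics` module. The sketch below also calls `multimode`, which would list every value tied for most frequent; here only one value qualifies.

```python
import statistics

scores = [81, 93, 84, 75, 68, 87, 81, 75, 81, 87]

# The single most frequent value.
mode_score = statistics.mode(scores)
# All values that share the highest frequency (useful for bimodal data).
all_modes = statistics.multimode(scores)

print(mode_score)  # 81
print(all_modes)   # [81]
```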

3 - 30
EXAMPLE


3 - 31
Disadvantages


3 - 32
Geometric Mean
The Geometric Mean (GM) of a set of n numbers is defined as the nth root of the product of the n numbers.
The geometric mean is used to average percents, indexes, and relatives.
The formula is:

\[ GM = \sqrt[n]{(x_1)(x_2)(x_3) \cdots (x_n)} \]

3 - 33
Geometric Mean
The interest rate on three bonds was 5, 21, and 4 percent
The Geometric Mean is:

\[ GM = \sqrt[3]{(5)(21)(4)} = 7.49 \]

The arithmetic mean is (5 + 21 + 4)/3 = 10.0
The GM gives a more conservative profit figure because it is not heavily weighted by the rate of 21 percent
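The bond example can be reproduced with `statistics.geometric_mean` (available from Python 3.8). Following the slide, this sketch averages the three percentages directly; the variable names are illustrative.

```python
import statistics

# Interest rates on the three bonds, in percent.
rates = [5, 21, 4]

geometric = statistics.geometric_mean(rates)  # cube root of 5 * 21 * 4
arithmetic = statistics.mean(rates)

print(round(geometric, 2))  # 7.49
print(arithmetic)           # 10
```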

3 - 34
Geometric Mean
continued…
Another use of the geometric mean is to determine the percent increase in sales, production or other business or economic series from one time period to another.
The formula is:

\[ GM = \sqrt[n]{\frac{\text{Value at end of period}}{\text{Value at beginning of period}}} - 1 \]

3 - 35
Geometric Mean
continued…

The total number of females enrolled in American colleges increased from 755,000 in 1992 to 835,000 in 2000.

\[ GM = \sqrt[8]{\frac{835{,}000}{755{,}000}} - 1 = 0.0127 \]

i.e. the Geometric Mean rate of increase is 1.27% per year.
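The growth-rate form of the geometric mean is easy to verify numerically. In this sketch the number of periods is taken as 2000 - 1992 = 8, matching the root used on the slide; the names are illustrative.

```python
start_value = 755_000   # enrolment in 1992
end_value = 835_000     # enrolment in 2000
periods = 2000 - 1992   # 8 annual periods

# nth root of (end / start), minus 1, gives the average rate per period.
growth_rate = (end_value / start_value) ** (1 / periods) - 1

print(round(growth_rate, 4))  # 0.0127
```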

3 - 36
EXAMPLE


3 - 37

Determining the Median, Mode or Geometric Mean in Excel

3 - 38
Using

Click
DATA ANALYSIS

See…

Using 3 - 39

Highlight
DESCRIPTIVE
STATISTICS

INPUT NEEDS

SUMMARY
STATISTICS

See SOLUTION

Using 3 - 40

Solution

The geometric mean doesn’t show up in summary statistics!

Alternate solution…

Using 3 - 41

CLICK ON

CLICK ON PASTE FUNCTION


See

See…

Using 3 - 42

SCROLL DOWN
to
STATISTICAL

See…

Using 3 - 43

HIGHLIGHT
MEDIAN
IN RIGHT MENU
See

OR…

Using 3 - 44

HIGHLIGHT
GEOMETRIC MEAN
See IN RIGHT MENU

OR…

Using 3 - 45

HIGHLIGHT
MODE
IN RIGHT MENU
See

See…

Using 3 - 46

The calculated values are placed in the cell on the worksheet where your cursor was when you began

3 - 47

The following table shows the expenditures of Canadians in 15 countries they visited in 1999

Countries Visited    Expenditures ($Cdn millions)
Australia            227
Cuba                 265
Dominican Rep.       122
France               506
Germany              183
Hong Kong            138
Ireland              114
Italy                283
Japan                150
Mexico               557
Netherlands          107
Spain                105
Switzerland          91
United Kingdom       1009
United States        8401

Source: Statistics Canada, Tourism and the Centre for Education Statistics

3 - 48

Is the mean or median expenditure a more accurate reflection of the “average” Canadian out-of-country expenditure?

What happens to the values of the mean and median when you remove the United States expenditures from the sample?
…if you remove both the UK and US from the sample?

Using 3 - 49

The mean is strongly affected by the inclusion of these two OUTLIERS
… therefore, the median is a more appropriate measure of “average” in this case
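The effect of the two outliers can be seen by recomputing the mean and median with and without them. The sketch below uses the expenditure figures from the table on slide 3 - 47; the helper `summarize` is a hypothetical name introduced only for this illustration.

```python
import statistics

# Expenditures ($Cdn millions) from the table on slide 3 - 47.
expenditures = {
    "Australia": 227, "Cuba": 265, "Dominican Rep.": 122, "France": 506,
    "Germany": 183, "Hong Kong": 138, "Ireland": 114, "Italy": 283,
    "Japan": 150, "Mexico": 557, "Netherlands": 107, "Spain": 105,
    "Switzerland": 91, "United Kingdom": 1009, "United States": 8401,
}

def summarize(excluded=()):
    """Return (mean, median) of the expenditures, skipping excluded countries."""
    values = [v for country, v in expenditures.items() if country not in excluded]
    return statistics.mean(values), statistics.median(values)

mean_all, median_all = summarize()
mean_no_us, median_no_us = summarize(excluded=("United States",))
mean_no_uk_us, median_no_uk_us = summarize(
    excluded=("United Kingdom", "United States")
)

print(round(mean_all, 1), median_all)            # 817.2 183
print(round(mean_no_us, 1), median_no_us)        # 275.5 166.5
print(round(mean_no_uk_us, 1), median_no_uk_us)  # 219.1 150
```

The mean falls from about 817 to about 219 as the outliers are removed, while the median moves only from 183 to 150.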

3 - 50
The Mean
of Grouped Data

• Generally, measures of location, such as the mean, and measures of dispersion, such as the standard deviation, are determined by using the individual values
• However, sometimes we are given only the frequency distribution and wish to estimate the mean or standard deviation.
• Next, we show how we can estimate the mean and standard deviation from data organized into a frequency distribution
• NOTE: a mean or standard deviation from grouped data is an estimate of the corresponding actual values.

3 - 51
The Mean
of Grouped Data

The mean of a sample of data organized in a frequency distribution is computed by the following formula:

\[ \bar{x} = \frac{\sum fM}{n} \]

where f is the class frequency, M is the class midpoint, and n is the total number of observations.

3 - 52


3 - 53
The Mean
of Grouped Data

A sample of ten movie theatres in a metropolitan area tallied the total number of movies showing last week.
Compute the mean number of movies showing per theatre.

The Mean of Grouped Data 3 - 54
Continued…

\[ \bar{x} = \frac{\sum fM}{n} \]

Movies Showing    Frequency f    Class Midpoint (M)    (f)(M)
1 to under 3      1              2                     2
3 to under 5      2              4                     8
5 to under 7      3              6                     18
7 to under 9      1              8                     8
9 to under 11     3              10                    30
Total             10                                   66

The Mean of Grouped Data 3 - 55
Continued…

Movies Showing    Frequency f    Class Midpoint (M)    (f)(M)
Total             10                                   66

Formula: \[ \bar{x} = \frac{\sum fM}{n} = \frac{66}{10} = 6.6 \]

The Mean of Grouped Data 3 - 56

Determine the average student study time

Hours Studying    Frequency f    Class Midpoint (M)    (f)(M)
10 to under 15    5              12.5                  62.5
15 to under 20    12             17.5                  210
20 to under 25    6              22.5                  135
25 to under 30    5              27.5                  137.5
30 to under 35    2              32.5                  65
Total             30                                   610

Formula: \[ \bar{x} = \frac{\sum fM}{n} = \frac{610}{30} = 20.33 \]
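The grouped-data mean for the study-time table can be sketched in Python. Each class is stored as (lower limit, upper limit, frequency), and the midpoint is taken as the average of the two limits; this representation is an assumption made for the illustration.

```python
# (lower limit, upper limit, frequency) for each class of the study-time table.
classes = [(10, 15, 5), (15, 20, 12), (20, 25, 6), (25, 30, 5), (30, 35, 2)]

n = sum(f for _, _, f in classes)

# Sum of frequency * class midpoint over all classes.
sum_fm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)

grouped_mean = sum_fm / n

print(n, sum_fm)               # 30 610.0
print(round(grouped_mean, 2))  # 20.33
```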

3 - 57
Symmetric Distribution

zero skewness
mode = median = mean


3 - 58
Right Skewed Distribution
Mean and Median are to the right of the Mode
Positively skewed

Mode < Median < Mean

3 - 59
Left Skewed Distribution
Mean and Median are to the left of the Mode

Negatively skewed

Mean < Median < Mode

3 - 60

Mean should not be used to represent the data

4-1


4-2

When you have completed this chapter, you will be able to:

1. Compute and interpret the range, the mean deviation, the variance, the standard deviation, and the coefficient of variation of ungrouped data
2. Compute and interpret the range, the variance, and the standard deviation from grouped data
3. Explain the characteristics, uses, advantages, and disadvantages of each measure

4-3

4. Understand Chebyshev’s theorem and the normal or empirical rule, as they relate to a set of observations
5. Compute and interpret percentiles, quartiles and the interquartile range
6. Construct and interpret box plots
7. Compute and describe the coefficient of skewness and kurtosis of a data distribution

4-4
Measures of Dispersion

• Range
• Mean Deviation
• Variance
• Standard Deviation


4-5
Why Dispersion-1?

The depth averages one meter. Want to cross it?

4-6
Why Dispersion-2?


4-7

Terminology
Range
…is the difference between the largest and the smallest value.

Only two values are used in its calculation.
It is influenced by an extreme value.
It is easy to compute and understand.

4-8

Terminology
Mean Deviation
…is the arithmetic mean of the absolute values of the deviations from the arithmetic mean.

For a population: \[ MD = \frac{\sum |x - \mu|}{N} \]

All values are used in the calculation.
It is not unduly influenced by large or small values.

4-9
For Samples


4 - 10

The weights of a sample of crates containing books for the bookstore (in kg) are:
103 97 101 106 103
Find the range and the mean deviation.

4 - 11

103 97 101 106 103

Find the mean weight:
\[ \bar{x} = \frac{\sum x}{n} = \frac{510}{5} = 102 \]

Find the mean deviation:
\[ MD = \frac{\sum |x - \bar{x}|}{n} = \frac{|103 - 102| + \cdots + |103 - 102|}{5} = \frac{1 + 5 + 1 + 4 + 1}{5} = \frac{12}{5} = 2.4 \]

Find the range: 106 – 97 = 9
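The crate-weight calculation can be written out directly. This is a minimal sketch with illustrative variable names:

```python
# Weights (kg) of the sample of crates.
weights = [103, 97, 101, 106, 103]

mean_weight = sum(weights) / len(weights)

# Mean deviation: average absolute distance of each value from the mean.
mean_deviation = sum(abs(x - mean_weight) for x in weights) / len(weights)

# Range: largest value minus smallest value.
value_range = max(weights) - min(weights)

print(mean_weight, mean_deviation, value_range)  # 102.0 2.4 9
```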

4 - 12

Terminology
Variance
…is the arithmetic mean of the squared deviations from the arithmetic mean.

• All values are used in the calculation.
• Unlike the range, it does not depend on the two extreme values alone, although squaring gives large deviations extra weight.
• The units are awkward…the square of the original units.

4 - 13
Computing the Variance

Formula … for a Population

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]

Formula … for a Sample

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]

Dividing by n tends to underestimate the population variance, hence the divisor n - 1.

4 - 14
The ages of the Dunn family are:
2, 18, 34, 42

What are the population mean and variance?

\[ \mu = \frac{\sum x}{N} = \frac{96}{4} = 24 \]

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{(2 - 24)^2 + \cdots + (42 - 24)^2}{4} = \frac{944}{4} = 236 \]

4 - 15
Population Standard Deviation

… is the square root of the population variance

From previous example…

\[ \sigma = \sqrt{\sigma^2} = \sqrt{236} = 15.36 \]
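The population variance and standard deviation for the Dunn family ages can be checked with `statistics.pvariance` and `statistics.pstdev`, both of which divide by N. A minimal sketch with illustrative names:

```python
import statistics

# Ages of the Dunn family, treated as a complete population.
ages = [2, 18, 34, 42]

mu = statistics.mean(ages)
sigma_squared = statistics.pvariance(ages, mu)  # divides by N
sigma = statistics.pstdev(ages, mu)             # square root of the variance

print(mu, sigma_squared)  # 24 236
print(round(sigma, 2))    # 15.36
```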

4 - 16
EXAMPLE
The hourly wages earned by a sample of five students are: $7, $5, $11, $8, $6.
Find the mean, variance, and Standard Deviation.

\[ \bar{x} = \frac{\sum x}{n} = \frac{37}{5} = 7.40 \]

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(7 - 7.4)^2 + \cdots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{5 - 1} = 5.30 \]

\[ s = \sqrt{s^2} = \sqrt{5.30} = 2.30 \]
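For sample data, `statistics.variance` and `statistics.stdev` use the n - 1 divisor. The sketch below applies them to the hourly wages; the variable names are illustrative.

```python
import statistics

# Hourly wages ($) of the sample of five students.
wages = [7, 5, 11, 8, 6]

sample_mean = statistics.mean(wages)
s_squared = statistics.variance(wages)  # divides by n - 1
s = statistics.stdev(wages)

print(sample_mean)          # 7.4
print(round(s_squared, 2))  # 5.3
print(round(s, 2))          # 2.3
```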

4 - 17
The Mean of Grouped Data

From chapter 3…

A sample of ten movie theatres in a metropolitan area tallied the total number of movies showing last week.
Compute the mean number of movies showing per theatre.

The Mean of Grouped Data 4 - 18
Continued…

\[ \bar{x} = \frac{\sum fM}{n} \]

Movies Showing    Frequency f    Class Midpoint (M)    (f)(M)
1 to under 3      1              2                     2
3 to under 5      2              4                     8
5 to under 7      3              6                     18
7 to under 9      1              8                     8
9 to under 11     3              10                    30
Total             10                                   66

The Mean of Grouped Data 4 - 19
Continued…

Movies Showing    Frequency f    Class Midpoint (M)    (f)(M)
Total             10                                   66

Formula: \[ \bar{x} = \frac{\sum fM}{n} = \frac{66}{10} = 6.6 \]

Now: Compute the variance and standard deviation.

4 - 20
Standard Deviation
for Grouped Data


4 - 21
Sample Variance for Grouped Data

Movies Showing    Freq. f    Class Midpoint (M)    (f)(M)    f(M²)
1 to under 3      1          2                     2         4
3 to under 5      2          4                     8         32
5 to under 7      3          6                     18        108
7 to under 9      1          8                     8         64
9 to under 11     3          10                    30        300
Total             10                               66        508

Sample Variance for Grouped Data 4 - 22

Movies Showing    Freq. f    Class Midpoint (M)    (f)(M)    f(M²)
Total             10                               66        508

\[ s^2 = \frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1} = \frac{508 - \frac{66^2}{10}}{9} = 8.04 \]

The variance is 8.04
The standard deviation is \( \sqrt{8.04} = 2.8 \)
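The grouped-data variance can be recomputed from the class midpoints and frequencies of the movie-theatre table. This sketch uses the same computational formula as the slide; the list names are illustrative.

```python
import math

# Class midpoints and frequencies from the movie-theatre table.
midpoints = [2, 4, 6, 8, 10]
frequencies = [1, 2, 3, 1, 3]

n = sum(frequencies)
sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))       # sum of f*M
sum_fm2 = sum(f * m * m for f, m in zip(frequencies, midpoints))  # sum of f*M^2

# Sample variance for grouped data, then its square root.
variance = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print(n, sum_fm, sum_fm2)  # 10 66 508
print(round(variance, 2))  # 8.04
print(round(std_dev, 2))   # 2.84
```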

4 - 23
Example


4 - 24


4 - 25
Interpretation and Uses of the Standard Deviation
Chebyshev’s Theorem:
For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least:

Formula: \[ 1 - \frac{1}{k^2} \]

where k is any constant greater than 1

4 - 26

Suppose that a wholesale plumbing supply company has a group of 50 sales vouchers from a particular day.
The amounts of these vouchers are:

How well does this data set fit Chebyshev’s Theorem?

4 - 27
Solution (continued)

Step 1
Determine the mean and standard deviation of the sample
Mean = $319
SD = $101.78

Step 2
Input k = 2 into Chebyshev’s theorem
\[ 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4} \]

i.e. At least .75 of the observations will fall within 2 SD of the mean.

4 - 28
Solution (continued)

Step 3
Using the mean and SD, find the range of data values within 2 SD of the mean
Mean = $319
SD = $101.78

\[ (\bar{x} - 2s,\ \bar{x} + 2s) = (319 - 2(101.78),\ 319 + 2(101.78)) = (115.44,\ 522.56) \]

Now, go back to the sample data, and see what proportion of the values fall between 115.44 and 522.56

4 - 29
Solution (continued)
Proportion of the values that fall between 115.44 and 522.56

We find that 48 of the 50 data values, or 96%, are in this range, which is certainly at least 75%, as the theorem requires!
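The three solution steps can be sketched in Python using only the summary figures quoted on the slides (mean $319, SD $101.78, 48 of 50 vouchers in range), since the individual voucher amounts are not reproduced in the text. The function name `chebyshev_lower_bound` is a hypothetical helper introduced for this illustration.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

# Summary figures quoted on the slides.
mean, sd, k = 319.0, 101.78, 2

bound = chebyshev_lower_bound(k)
lower, upper = mean - k * sd, mean + k * sd

# Observed proportion reported on the slide: 48 of the 50 vouchers.
observed = 48 / 50

print(bound)                               # 0.75
print(round(lower, 2), round(upper, 2))    # 115.44 522.56
print(observed >= bound)                   # True
```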

4 - 30
Interpretation and Uses of the Standard Deviation
Empirical Rule:
For any symmetrical, bell-shaped distribution:
…About 68% of the observations will lie within 1σ of the mean
…About 95% of the observations will lie within 2σ of the mean
…Virtually all the observations will be within 3σ of the mean

Bell-Shaped Curve 4 - 31

…showing the relationship between σ and μ

[Bell-shaped curve; horizontal axis marked at μ-3σ, μ-2σ, μ-1σ, μ, μ+1σ, μ+2σ, μ+3σ]

4 - 32

Suppose that a wholesale plumbing supply company has a group of 50 sales vouchers from a particular day.
The amounts of these vouchers are:

How well does this data set fit the Empirical Rule?

Solution 4 - 33

First check if the histogram has an approximate mound-shape

Not bad…so we’ll proceed!


We need to calculate the mean and standard deviation

4 - 34

Skewness
…is the measurement of the lack of symmetry of the distribution

…The coefficient of skewness can range from -3.00 up to +3.00

…A value of 0 indicates a symmetric distribution.

It is computed as follows:
\[ SK_1 = \frac{3(\text{Mean} - \text{Median})}{\text{SD}} \]

4 - 35

Skewness

\[ SK_1 = \frac{3(\text{Mean} - \text{Median})}{\text{SD}} \]

Following are the earnings per share for a sample of 15 software companies for the year 2000. The earnings per share are arranged from smallest to largest.
$0.09 0.13 0.41 0.51 1.12 1.20 1.49 3.18
3.50 6.36 7.83 8.92 10.13 12.99 16.40

Find the coefficient of skewness.
Mean = 4.95
Median = 3.18
SD = 5.22

\[ SK_1 = \frac{3(4.95 - 3.18)}{5.22} = 1.017 \]
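The coefficient of skewness can be recomputed from the 15 earnings-per-share values. This sketch assumes the SD on the slide is the sample standard deviation (n - 1 divisor), which reproduces the quoted 5.22; the variable names are illustrative.

```python
import statistics

# Earnings per share ($) for the 15 software companies.
eps = [0.09, 0.13, 0.41, 0.51, 1.12, 1.20, 1.49, 3.18,
       3.50, 6.36, 7.83, 8.92, 10.13, 12.99, 16.40]

mean = statistics.mean(eps)
median = statistics.median(eps)
sd = statistics.stdev(eps)  # sample standard deviation (assumption)

# Coefficient of skewness: 3 * (mean - median) / SD.
sk1 = 3 * (mean - median) / sd

print(round(mean, 2), median, round(sd, 2))  # 4.95 3.18 5.22
print(round(sk1, 2))                         # 1.02
```

The positive value agrees with the slide's 1.017 and indicates a right-skewed distribution.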

4 - 36
Positively Skewed Distribution
Mean and Median are to the right of the Mode

Mode < Median < Mean

4 - 37
Negatively Skewed Distribution
Mean and Median are to the left of the Mode

Mean < Median < Mode

4 - 38
Percentile


4 - 39
Example of Percentile

sort

First Quartile

Third Quartile


4 - 40

Interquartile Range
…is the distance between the third quartile Q3 and the first quartile Q1.
This distance will include the middle 50 percent of the observations.

Interquartile Range = Q3 - Q1

4 - 41

Example
For a set of observations the third quartile is 24 and the first quartile is 10.
What is the interquartile range?

The interquartile range is 24 - 10 = 14.
Fifty percent of the observations will occur between 10 and 24.

4 - 42

Box Plots

A box plot is a graphical display, based on quartiles, that helps to picture a set of data
Five pieces of data are needed to construct a box plot:
… the Minimum Value,
… the First Quartile,
… the Median,
… the Third Quartile, and
… the Maximum Value

4 - 43

Example
Based on a sample of 20 deliveries, Buddy’s Pizza determined the following information.
The minimum delivery time was 13 minutes and the maximum 30 minutes.
The first quartile was 15 minutes, the median 18 minutes, and the third quartile 22 minutes.
Develop a box plot for the delivery times.

Solution 4 - 44


4 - 45

Right skewed because:
• The dashed line to the right of the box from Q3 to the max time is longer than the dashed line from Q1 to the min value
• The distance from the first quartile to the median is smaller than the distance from the median to the third quartile

4 - 46

The following are the average rates of return for Stocks A and B over a six year period. In which of the following Stocks would you prefer to invest?

Why?
Stock A: 7 6 8 5 7 3
Stock B: 15 -10 18 10 -5 8

4 - 47

Find the Mean rate of return for


each of the two stocks:

Stock A: 7 6 8 5 7 3
Mean = 36/6 = 6
Stock B: 15 -10 18 10 -5 8
Mean = 36/6 = 6

4 - 48

Find the Range of Values of each stock:

Stock A: 7 6 8 5 7 3
8–3=5
Stock B: 15 -10 18 10 -5 8
18 – ( -10) = 28
Therefore, Stock B is riskier.
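The comparison of the two stocks, equal means but very different ranges, can be reproduced as follows. The helper `value_range` is a hypothetical name used only in this sketch.

```python
import statistics

# Annual rates of return (%) over the six-year period.
stock_a = [7, 6, 8, 5, 7, 3]
stock_b = [15, -10, 18, 10, -5, 8]

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

mean_a, mean_b = statistics.mean(stock_a), statistics.mean(stock_b)
range_a, range_b = value_range(stock_a), value_range(stock_b)

print(mean_a, mean_b)    # 6 6
print(range_a, range_b)  # 5 28
```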

4 - 49
Relative Dispersion
The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:

\[ CV = \frac{s}{\bar{x}} (100\%) \]

A standard deviation of 10 may be perceived as large when the mean value is 100, but only moderately large when the mean value is 500!
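The point about relative size can be made concrete by computing the coefficient of variation for the two cases mentioned above. The function name `coefficient_of_variation` is an illustrative choice, not something defined in the slides.

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean

# The same standard deviation of 10 against two different means.
cv_mean_100 = coefficient_of_variation(10, 100)
cv_mean_500 = coefficient_of_variation(10, 500)

print(cv_mean_100)  # 10.0
print(cv_mean_500)  # 2.0
```

Relative to its mean, the spread in the first case is five times as large as in the second.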

4 - 50

Example
Rates of return over the past 6 years for
two mutual funds are shown below.
Fund A: 8.3, -6.0, 18.9, -5.7, 23.6, 20
Fund B: 12, -4.8, 6.4, 10.2, 25.3, 1.4

Which one has a higher level of risk?


4 - 51

Solution

Let us use the Excel printout that is run from the
“Descriptive Statistics” sub-menu.

                      Fund A    Fund B
Mean                    9.85      8.42
Standard Error          5.38      4.20
Median                 13.60      8.30
Mode                    #N/A      #N/A
Standard Deviation     13.19     10.29
Sample Variance       173.88    105.81
Kurtosis               -2.21      0.90
Skewness               -0.44      0.61
Range                  29.60      30.1
Minimum                   -6      -4.8
Maximum                 23.6      25.3
Sum                     59.1      50.5
Count                      6         6

4 - 52

Solution

Is Fund A riskier because its standard deviation
(13.19, versus 10.29 for Fund B) is larger?

4 - 53

Solution

But the means of the two funds are different (9.85 for Fund A, 8.42 for Fund B).
Fund A has a higher rate of return, but it also has a larger standard deviation.
Therefore we need to compare the relative variability
using the coefficient of variation.

4 - 54

Solution

CV = (s / x̄) (100%)

Fund A: CV = (13.19 / 9.85) (100%) = 134%
Fund B: CV = (10.29 / 8.42) (100%) = 122%

So now we say that there is
more relative variability in Fund A
than in Fund B.
Therefore, Fund A is riskier.
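
The same comparison can be reproduced from the raw rates of return. The sketch below assumes the sample standard deviation (as in the Excel printout); the function name `coefficient_of_variation` is illustrative.

```python
from statistics import mean, stdev

# Rates of return over the past 6 years, from the example slide
fund_a = [8.3, -6.0, 18.9, -5.7, 23.6, 20]
fund_b = [12, -4.8, 6.4, 10.2, 25.3, 1.4]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(values) / mean(values) * 100


cv_a = coefficient_of_variation(fund_a)
cv_b = coefficient_of_variation(fund_b)
print(f"Fund A: CV = {cv_a:.0f}%")
print(f"Fund B: CV = {cv_b:.0f}%")
print("Riskier fund:", "A" if cv_a > cv_b else "B")
```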

5-1

A Survey of Probability Concepts


5-2

When you have completed this chapter, you will be able to:

1. Explain the terms random experiment, outcome, sample space,
permutations, and combinations.
2. Define probability.
3. Describe the classical, empirical, and subjective
approaches to probability.
4. Explain and calculate conditional probability
and joint probability.

5-3

5. Calculate probability using the rules of
addition and rules of multiplication.
6. Use a tree diagram to organize and
compute probabilities.
7. Calculate a probability using Bayes’ theorem.

Types of Statistics 5-4

Descriptive
Methods of…
collecting, organizing, presenting, and analyzing data

Inferential
Science of…
making inferences about a population, based on sample information.

Emphasis now to be on this!

5-5

Terminology
Probability
…is a measure of the
likelihood that an event in the future will happen!

 It can only assume a value between 0 and 1.


 A value near zero means the event is not
likely to happen; near one means it is likely.
 There are three definitions of probability:
classical, empirical, and subjective

5-6

Terminology
Random Experiment
…is a process
repetitive in nature
the outcome of any trial is uncertain
well-defined set of possible outcomes
each outcome has a probability
associated with it

5-7

Terminology
Outcome
…is a particular result of a
random experiment.
Sample Space
…is the collection or set of all
the possible outcomes of a
random experiment.
Event
…is the collection of one or more
outcomes of an experiment.

5-8
Example


Approaches to Assigning Probability 5-9

Subjective
…probability is based on whatever information is available

Objective
Classical Probability
…is based on the assumption that the outcomes of an experiment
are equally likely
Empirical Probability
…applies when the number of times the event happens
is divided by the number of observations

Subjective Probability 5 - 10

…. refers to the chance of occurrence


assigned to an event
by a particular individual
It is not computed objectively,
i.e., not from prior knowledge or from actual data…

…that the Toronto Maple Leafs will win the Stanley


Cup next season!
…that you will arrive to class on time tomorrow!


Classical Probability 5 - 11

Empirical Probability 5 - 12

Empirical Probability 5 - 13

Students measure the contents of their soft
drink cans… 10 cans are underfilled,
32 are filled correctly, and
8 are overfilled.
When the contents of the next can are measured,
what is the probability that it is…
(a) filled correctly?
P(C) = 32 / 50 = 64%
(b) not filled correctly?
P(~C) = 1 – P(C) = 1 – .64 = 36%
This is called the complement of C.
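
The empirical probability and its complement can be computed directly from the observed counts. A minimal sketch; the dictionary keys are illustrative labels, not names from the slides.

```python
from fractions import Fraction

# Observed counts from the soft drink example
counts = {"underfilled": 10, "correct": 32, "overfilled": 8}
total = sum(counts.values())  # 50 cans in all

# Empirical probability = times the event happened / number of observations
p_correct = Fraction(counts["correct"], total)
# Complement rule: P(~C) = 1 - P(C)
p_not_correct = 1 - p_correct

print(f"P(C)  = {float(p_correct):.0%}")
print(f"P(~C) = {float(p_not_correct):.0%}")
```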

5 - 14
Example


5 - 15
What do you think?


5 - 16
Random Experiment

The experiment is rolling the die...once!

The possible outcomes are the numbers…

1 2 3 4 5 6
An event is the occurrence of an even number
i.e. we collect the outcomes 2, 4, and 6.


5 - 17
Tree Diagrams

This is a useful device to show all the possible outcomes


of the experiment
and their corresponding probabilities

Consider the random experiment of flipping a coin twice.


5 - 18
Tree Diagrams
Origin   First Flip   Second Flip   Simple Event   Expressed as:
         H            H             HH             P(HH) = 0.25
         H            T             HT             P(HT) = 0.25
         T            H             TH             P(TH) = 0.25
         T            T             TT             P(TT) = 0.25
                                                   Total   1.00
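
The branches of this tree can also be listed by enumeration. The sketch below assumes a fair coin, so every branch is equally likely; the variable names are illustrative.

```python
from itertools import product

# Every path through the tree: first flip, then second flip
flips = list(product("HT", repeat=2))

# With a fair coin each of the four simple events is equally likely
p_each = 1 / len(flips)
tree = {"".join(path): p_each for path in flips}

for event, p in tree.items():
    print(f"P({event}) = {p}")
print("Total =", sum(tree.values()))
```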

5 - 19
Tree Diagrams
Menu
Appetizer: Soup or Juice
Entrée: Beef, Turkey, or Fish
Dessert: Pie or Ice Cream

Origin   Appetizer   Entrée   Dessert
         Soup        Beef     Pie
                              Ice Cream
                     Turkey   Pie
                              Ice Cream
                     Fish     Pie
                              Ice Cream
         Juice       Beef     Pie
                              Ice Cream
                     Turkey   Pie
                              Ice Cream
                     Fish     Pie
                              Ice Cream

5 - 20
Tree Diagrams
How many complete dinners are there?

12 (one for each end branch of the menu tree diagram)

5 - 21
Tree Diagrams

How many dinners include beef?

4
1. Soup, Beef, Pie
2. Soup, Beef, Ice Cream
3. Juice, Beef, Pie
4. Juice, Beef, Ice Cream

5 - 22
Tree Diagrams
What is the probability that a complete dinner will
include…
Juice? 6/12
Turkey? 4/12
Both beef and soup? 2/12
See next slide…

5 - 23
M * N Rule
If one thing can be done in M ways,
and if after this is done, something else
can be done in N ways,
then both things can be done in a
total of M*N different ways in that stated order!
Legend: Appetizer Entrée Dessert
Refer back to tree diagram example:
# different meals = 2 * 3 * 2 = 12
# meals with beef = 2 * 1 * 2 = 4
# meals with juice = 1 * 3 * 2 = 6
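
The M * N counts can be confirmed by listing every dinner in the tree diagram. This is a sketch with illustrative variable names; it assumes each complete dinner is one appetizer, one entrée, and one dessert.

```python
from itertools import product

appetizers = ["Soup", "Juice"]
entrees = ["Beef", "Turkey", "Fish"]
desserts = ["Pie", "Ice Cream"]

# Each complete dinner is one path through the tree
dinners = list(product(appetizers, entrees, desserts))

with_beef = [d for d in dinners if d[1] == "Beef"]
with_juice = [d for d in dinners if d[0] == "Juice"]
beef_and_soup = [d for d in dinners if d[0] == "Soup" and d[1] == "Beef"]

print("# different meals  =", len(dinners))
print("# meals with beef  =", len(with_beef))
print("# meals with juice =", len(with_juice))
print("P(beef and soup)   =", len(beef_and_soup), "/", len(dinners))
```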

5 - 24

When getting dressed, you have a


choice between wearing one of:
3 pairs of shoes
2 pairs of pants
5 shirts
Find the number of
different “outfits” possible

3 * 2 * 5 = 30


5 - 25
Probability
What is the probability of
drawing a red Ace
from a deck of well-shuffled cards?

P( Red Ace) = 2/52



5 - 26
Probability
Deck = 52 Cards
4 Suits

Clubs Diamonds Hearts Spades


13 cards in each

Key steps
Using Probability Analysis
1. Determine…the outcomes that meet our condition
2. List…all possible outcomes

5 - 27
Probability

P = probability …of getting four (4) aces

Deck = 52 Cards (the Population)

4 Suits x 13 cards
Clubs Diamonds Hearts Spades
13 cards in each

5 - 28
Probability
Deck = 52 Cards
4 Suits (13 cards in each)

Hearts Spades Diamonds Clubs

Each Suit has a…….


‘Honours’
cards

Scenarios

5 - 29
Deck = 52 Cards
4 Suits (13 cards in each)
Scenarios

Hearts Spades Diamonds Clubs

Scenario               Condition Outcomes   All Possible Outcomes
1. Draw an Ace                  4                    52
2. Draw a Black Ace             2                    52
3. Draw a Red Card             26                    52
5 - 30
Deck = 52 Cards
4 Suits (13 cards in each)
Scenarios

Hearts Spades Diamonds Clubs

4. Drawing…a Red Card or a black Queen

Condition Outcomes / All Possible Outcomes = 26/52 + 2/52 = 28/52

-or- P(Red) + P(Queen) - P(Red Queen)
= (26 + 4 - 2) / 52
= 28/52

Deck = 52 Cards 5 - 31

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing a Jack or
a King from a deck of well-shuffled cards?

P(Jack) = 4/52
P(King) = 4/52
P(Jack or King) = 4/52 + 4/52 = 8/52

Deck = 52 Cards 5 - 32

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing one card that
is both a Jack and a King from a deck of well-
shuffled cards?

These are MUTUALLY EXCLUSIVE events,


i.e. they can’t both happen at the same time!

P( Jack and King) = 0


Deck = 52 Cards 5 - 33

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing one card that
is both BLACK and a King from a deck of well-
shuffled cards?

P( Black and King) =2/52



Deck = 52 Cards 5 - 34

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing a card that is
either BLACK or a King from a deck of well-
shuffled cards?

Formula P(A or B) = P (A) + P(B) –P(Both)


This is called the Addition Rule

P( Black or King) = 26/52 + 4/52 - 2/52 = 28/52
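
The Addition Rule can be checked by counting cards in a full deck. A minimal sketch: the deck representation and the helpers `prob`, `is_black`, and `is_king` are illustrative, not part of the slides.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
deck = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs


def prob(event):
    """Classical probability: condition outcomes / all possible outcomes."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_black(card):
    return card[1] in ("Clubs", "Spades")


def is_king(card):
    return card[0] == "K"


# Direct count of cards that are black or a King
p_union = prob(lambda c: is_black(c) or is_king(c))
# Addition Rule: P(A or B) = P(A) + P(B) - P(Both)
p_rule = prob(is_black) + prob(is_king) - prob(lambda c: is_black(c) and is_king(c))

print(p_union, p_rule)  # Fraction reduces 28/52 to 7/13
```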



5 - 35


Deck = 52 Cards 5 - 36

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing a King given
that you have drawn a BLACK card?
Our sample space is now just the BLACK cards

P(King|Black ) = 2/26
This is called a CONDITIONAL probability

Deck = 52 Cards 5 - 37

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing a King given
that you have drawn a BLACK card?

Formula: P(A|B) = P(Both) / P(Given)

P(King|Black) = (2/52) / (26/52)
= (2/52) * (52/26)
= 2/26

Deck = 52 Cards 5 - 38

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing a King of Clubs
given that you have drawn a BLACK card?

Formula: P(A|B) = P(Both) / P(Given)
which can be rearranged as P(Both) = P(Given) P(A|B)

P(King of Clubs|Black) = (1/52) / (26/52)
= (1/52) * (52/26)
= 1/26

Deck = 52 Cards 5 - 39

4 Suits (13 cards in each)


Scenarios

Hearts Spades Diamonds Clubs


What is the probability of drawing a King of Clubs
given that you have drawn a CLUB?

P(King of Clubs given Club)
= P(King of Clubs|Club)
= (1/52) / (13/52)
= (1/52) * (52/13)
= 1/13
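
The three conditional probabilities above follow one pattern, P(A|B) = P(Both) / P(Given). The sketch below applies it to an enumerated deck; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
deck = list(product(RANKS, SUITS))


def prob(event):
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def conditional(event, given):
    """P(event | given) = P(both) / P(given)."""
    p_both = prob(lambda c: event(c) and given(c))
    return p_both / prob(given)


def is_black(card):
    return card[1] in ("Clubs", "Spades")


def is_club(card):
    return card[1] == "Clubs"


def is_king(card):
    return card[0] == "K"


def is_king_of_clubs(card):
    return card == ("K", "Clubs")


print("P(King|Black)          =", conditional(is_king, is_black))
print("P(King of Clubs|Black) =", conditional(is_king_of_clubs, is_black))
print("P(King of Clubs|Club)  =", conditional(is_king_of_clubs, is_club))
```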

5 - 40


5 - 41

Reading
Probabilities
from a Table
(Contingency Table)

Reading Probabilities 5 - 42
from a Table
A survey of undergraduate students in the School of
Business Management at Eton College revealed the
following regarding the gender and majors of the students:
Gender   Accounting   International   HR    TOTAL
Male        150           150          50    350
Female      175           160          65    400
TOTAL       325           310         115    750
What is the Probability of selecting a female student?

400/750 = 53.33%

Reading Probabilities 5 - 43
from a Table
What is the Probability of selecting a Human
Resources or International major?
Gender Accounting International HR TOTAL
Male 150 150 50 350
Female 175 160 65 400
325 310 115 750

P(HR or I) = P(HR) + P(I)


= 115/750 + 310/750 = 425/750
= 56.67%

Reading Probabilities 5 - 44
from a Table
What is the Probability of selecting a Female
or International major?

Gender Accounting International HR TOTAL


Male 150 150 50 350
Female 175 160 65 400
325 310 115 750

P(F or I) = P(F) + P(I) – P(F and I)


= 400/750 + 310/750 – 160/750
= 550/750 = 73.33%

Reading Probabilities 5 - 45
from a Table
What is the Probability of selecting a Female
Accounting student?
Gender Accounting International HR TOTAL
Male 150 150 50 350
Female 175 160 65 400
325 310 115 750

P(F and A) = 175/750 = 23.33%


Reading Probabilities 5 - 46
from a Table
What is the Probability of selecting a Female,
given that the person selected
is an International major?
Gender Accounting International HR TOTAL
Male 150 150 50 350
Female 175 160 65 400
325 310 115 750

P(F|I) = 160/310 = 51.6%



Reading Probabilities 5 - 47
from a Table
What is the Probability of selecting a Female,
given that the person selected
is an International major?
Formula: P(A|B) = P(Both) / P(Given)
P(F|I) = P(F and I) / P(I)
= (160/750) / (310/750)
= 160/310
= 51.6%

Reading Probabilities 5 - 48
from a Table
What is the Probability of selecting an
International major, given that the person
selected is a Female?
Gender Accounting International HR TOTAL
Male 150 150 50 350
Female 175 160 65 400
325 310 115 750

P(I|F) = 160/400 = 40%
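
All of the table readings so far can be reproduced from the cell counts. A sketch only: the dictionary layout and the helper `prob` are illustrative choices.

```python
from fractions import Fraction

# Cell counts from the survey table: (gender, major) -> students
table = {
    ("Male", "Accounting"): 150,
    ("Male", "International"): 150,
    ("Male", "HR"): 50,
    ("Female", "Accounting"): 175,
    ("Female", "International"): 160,
    ("Female", "HR"): 65,
}
total = sum(table.values())  # 750 students


def prob(condition):
    """Share of students whose (gender, major) satisfies the condition."""
    hits = sum(n for key, n in table.items() if condition(*key))
    return Fraction(hits, total)


p_f = prob(lambda g, m: g == "Female")
p_i = prob(lambda g, m: m == "International")
p_f_and_i = prob(lambda g, m: g == "Female" and m == "International")

p_f_or_i = p_f + p_i - p_f_and_i   # addition rule
p_f_given_i = p_f_and_i / p_i      # conditional probability
p_i_given_f = p_f_and_i / p_f

print(f"P(F)      = {float(p_f):.2%}")
print(f"P(F or I) = {float(p_f_or_i):.2%}")
print(f"P(F|I)    = {float(p_f_given_i):.1%}")
print(f"P(I|F)    = {float(p_i_given_f):.0%}")
```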



Reading Probabilities 5 - 49
from a Table

Notice the significant difference
…between F given I (51.6%) and I given F (40%)!

Terminology 5 - 50

Independent Events
Events are independent if the occurrence of
one event does not affect the probability of the other
Consider the random
experiment of flipping a coin twice.
Each flip is
independent of the other!

Find the probability
of flipping
2 Heads in a row

P(2H) = .5 * .5
= .25 or 25%

Terminology 5 - 51

Independent Events
Draw three cards with replacement
i.e., draw one card,
look at it,
put it back,
and repeat twice more.
Each draw is independent of the other
Find the probability of drawing 3 Queens in a row:
P(3Q) = 4/52 * 4/52 *4/52 = 0.00046 = most unlikely!
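
Because the card is put back each time, the three draws multiply. A short sketch using exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

# With replacement, every draw has the same chance of a Queen
p_queen = Fraction(4, 52)

# Independent events: multiply the individual probabilities
p_three_queens = p_queen ** 3

print(p_three_queens, "=", round(float(p_three_queens), 5))
```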

Dependent Events 5 - 52

Consider 2 events:
 Drawing a RED card from a deck of cards
 Drawing a HEART from a deck of cards

Are these two events considered to be independent?

If two events, A and B are independent, then


P(A|B) = P(A)

P(Red) = 26/52 = 1/2


P(Red|Heart) = 13/13 = 1
Therefore these are NOT independent events!
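
The test P(A|B) = P(A) can be applied to any pair of events on the deck. The sketch below repeats the Red/Heart check and, for contrast, adds a Red/King pair that is not on the slide; all helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
deck = list(product(RANKS, SUITS))


def prob(event):
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_independent(a, b):
    """Events are independent when P(A|B) equals P(A)."""
    p_a_given_b = prob(lambda c: a(c) and b(c)) / prob(b)
    return p_a_given_b == prob(a)


def is_red(card):
    return card[1] in ("Diamonds", "Hearts")


def is_heart(card):
    return card[1] == "Hearts"


def is_king(card):
    return card[0] == "K"


print("Red and Heart independent?", is_independent(is_red, is_heart))
print("Red and King independent? ", is_independent(is_red, is_king))
```

Knowing the card is a Heart changes the chance of Red from 1/2 to 1, while knowing it is a King leaves the chance of Red at 2/4 = 1/2.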

5 - 53

Bayes’ Theorem

5 - 54
Bayes’ Theorem

…is a method for revising a probability
given additional information!

Formula

P(A1|B) = P(A1) P(B|A1) / [P(A1) P(B|A1) + P(A2) P(B|A2)]

5 - 55
Bayes’ Theorem
Duff Cola Company recently received several
complaints that their bottles are under-filled.
A complaint was received today but the production
manager is unable to identify which of the two
Springfield plants (A1 or A2) filled this bottle.
What is the probability that the under-filled
bottle came from plant A1?

Plant   % of Total Production   % of Underfilled Bottles
A1               55                        3
A2               45                        4

5 - 56
Bayes’ Theorem
What is the probability that the under-filled
bottle came from plant A1?

Plant   % of Total Production   % of Underfilled Bottles
A1               55                        3
A2               45                        4

1. List the probabilities given:
P(plant A1) = .55
P(plant A2) = .45
P(Underfilled|A1) = .03
P(Underfilled|A2) = .04
2. Input values into formula and compute

5 - 57
Bayes’ Theorem
What is the probability that the under-filled
bottle came from plant A1?

1. List the probabilities given:
P(plant A1) = .55
P(plant A2) = .45
P(Underfilled|A1) = .03
P(Underfilled|A2) = .04
2. Input values into formula and compute:

P(A1|B) = P(A1) P(B|A1) / [P(A1) P(B|A1) + P(A2) P(B|A2)]
= .55(.03) / [.55(.03) + .45(.04)]
= .0165 / .0345
= .4783

The likelihood that the underfilled bottle came
from Plant A1 has been reduced from
55% to 47.83%
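
The two steps translate directly into code. A minimal sketch: the function `bayes_posterior` and the dictionary layout are illustrative, and the function works for any number of plants as long as the priors sum to 1.

```python
def bayes_posterior(priors, likelihoods, target):
    """Revise P(target) given the evidence, using Bayes' theorem.

    priors:      P(Ai) for each plant
    likelihoods: P(B|Ai), the chance of an underfilled bottle at each plant
    """
    # Denominator: total probability of the evidence B
    evidence = sum(priors[k] * likelihoods[k] for k in priors)
    return priors[target] * likelihoods[target] / evidence


# Step 1: list the probabilities given
priors = {"A1": 0.55, "A2": 0.45}
likelihoods = {"A1": 0.03, "A2": 0.04}

# Step 2: input values into the formula and compute
posterior_a1 = bayes_posterior(priors, likelihoods, "A1")
print(f"P(A1|Underfilled) = {posterior_a1:.4f}")
```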
6-1

Discrete Probability Distributions

6-2

When you have completed this chapter, you will be able to:
1. Define the terms probability distribution and
random variable.
2. Distinguish between discrete and
continuous random variables.
3. Calculate the mean, variance, and standard deviation of
a discrete probability distribution.
4. Describe the characteristics and compute probabilities
using the Poisson probability distribution.

6-3
Introduction
• Chapters 2 – 4 explained descriptive statistics
– Describing raw data by organizing it into a frequency
distribution and portraying the distribution in tables,
graphs, and charts
– Computing a measure of location, such as the
arithmetic mean, median, or mode, to locate a typical
value near the center of the distribution
– The range and the standard deviation are used to
describe the spread in the data
• These chapters focus on describing something that
has already happened.

6-4
Introduction
• Chapter 5 examined something that would probably
happen
– We note that this facet of statistics is called statistical
inference.
– The objective is to make inferences (statements) about a
population based on a number of observations, called a
sample, selected from the population
• Probability is a value between 0 and 1 inclusive, and
can be combined using rules of addition and
multiplication

6-5
Introduction
• Chapter 6 explains a probability distribution
that gives the entire range of values that can
occur based on an experiment
• A probability distribution is similar to a
relative frequency distribution.
• However, instead of describing the past, it
describes a likely future event.

6-6
Terminology
Random Variable
…is a numerical value determined by
the (random) outcome of an experiment.
Probability Distribution
…is the listing of all possible outcomes
of an experiment
and the corresponding probability.

Types of Probability Distributions 6-7

Discrete
Under this distribution the random variable has a
countable number of possible outcomes,
resulting from counting
Continuous
Under this distribution the random variable has an
infinite number of possible outcomes,
resulting from some type of measurement

Types of Probability Distributions 6-8

Examples
Discrete
Students in a class
Number of children in a family
Number of mortgages approved in a month
Continuous
Distance driven by an executive to get to work
The length of time of a particular phone call
The length of time of an afternoon nap!

6-9
Distinguishing features
of a
Discrete Distribution:
The sum of the probabilities of the various
outcomes is 1.00
The probability of a particular outcome
is between 0 and 1.00
The outcomes are mutually exclusive

6 - 10
Consider a random experiment in which
a coin is tossed three times
Let x be the number of Heads
Let H represent the outcome of a Head
Let T represent the outcome of Tails
Determine the probability distribution

Listing the possibilities 6 - 11

Heads Heads Heads
Heads Heads Tails
Heads Tails Heads
Tails Heads Heads
Heads Tails Tails
Tails Heads Tails
Tails Tails Heads
Tails Tails Tails

6 - 12
… the possible values of x
(number of heads) are 0,1,2,3.
6 - 13
Probability Distribution
Consider a random experiment in which a coin is
tossed three times.
Determine the probability distribution.
What is the probability of tossing 2 heads in 3 flips?

6 - 14
Probability Distribution
x       # of Outcomes   P(x)
0             1         1/8
1             3         3/8
2             3         3/8
3             1         1/8
Total         8         8/8 = 1
Mean of a Discrete Probability Distribution 6 - 15

The mean
reports the central location of the data
is denoted by the Greek symbol μ (mu)
is the long-run average value of
the random variable
is also referred to as its expected value, E(X),
in a probability distribution
is a weighted average

Mean of a Discrete Probability Distribution 6 - 16

Formula: μ = Σ[x P(x)]

Flip a coin three times
Let x be the number of heads

x       P(x)      xP(x)
0       1/8       0
1       3/8       3/8
2       3/8       6/8
3       1/8       3/8
Total   8/8 = 1   12/8 = 1.5
Variance of a Discrete Probability Distribution 6 - 17

…measures the amount of spread
(variation) of a distribution.
…denoted by the Greek
letter σ² (sigma squared).
…the standard deviation is
the square root of σ²

Variance of a Discrete Probability Distribution 6 - 18

Flip a coin three times.
Let x be the number of heads

Formula: σ² = Σ[(x - μ)² P(x)]

x       P(x)      xP(x)     x - μ    (x - μ)²   (x - μ)² P(x)
0       1/8       0         -1.5     2.25       .28125
1       3/8       3/8       -0.5     0.25       .09375
2       3/8       6/8        0.5     0.25       .09375
3       1/8       3/8        1.5     2.25       .28125
Total   8/8 = 1   μ = 1.5                       0.75
6 - 19

Dan Desch, owner of College Painters,
studied his records for the past 20 weeks
and reported the following
number of houses painted per week:

# of Painted Houses (x)   Weeks   P(x)
10                          5     5/20 = 0.25
11                          6     6/20 = 0.30
12                          7     7/20 = 0.35
13                          2     2/20 = 0.10
Total                      20     20/20 = 1.0

Determine the probability distribution and its mean and variance.

Computing the μ 6 - 20

Formula: μ = Σ[x P(x)]

# of Painted Houses (x)   P(x)            xP(x)
10                        5/20 = 0.25     2.5
11                        6/20 = 0.30     3.3
12                        7/20 = 0.35     4.2
13                        2/20 = 0.10     1.3
Total                     20/20 = 1.0     11.3
Computing the σ² 6 - 21

Formula: σ² = Σ[(x - μ)² P(x)]

x       P(x)   xP(x)   (x - μ)²                (x - μ)² P(x)
10      0.25   2.5     (10 - 11.3)² = 1.69     .4225
11      0.30   3.3     0.09                    .0270
12      0.35   4.2     0.49                    .1715
13      0.10   1.3     2.89                    .2890
Total   1.0    11.3                            0.91

6 - 22

Binomial
Probability
Distribution
6 - 23
Terminology
Bernoulli Trial
…is a random experiment in which the
number of possible outcomes
is precisely two!
For Example…
Heads / Tails
Right / Wrong
Course Grade… A / B or C or D or…!

Binomial Probability Distribution 6 - 24

The experiment consists of n Bernoulli or
binomial trials
The outcomes are classified into
one of two mutually exclusive categories,
…such as success or failure
The probability of success
stays the same for each trial
The trials are independent
Interested in the number of successes
Binomial Probability Distribution 6 - 25

Let…
n be the number of trials
x be the number of observed successes
π be the probability of success on each trial

Formula: P(x) = nCx π^x (1 - π)^(n - x)

6 - 26
Combination
P(x) = nCx π^x (1 - π)^(n - x) 6 - 27

The Department of Labour reports that
20% of the workforce
aged between 15 and 19 years is unemployed.
From a sample of 10 workers in this age group,
calculate the following probabilities:
 Exactly three are unemployed
 At least three are unemployed
 None are unemployed
 At least one is unemployed

6 - 28
DATA: 20% Unemployed & Sample of 10

 Exactly three are unemployed
P(3) = 10C3 (.20)^3 (1 - .20)^(10 - 3)
= (120)(.0080)(.2097)
= .2013 or 20.13%

 At least three are unemployed
P(x ≥ 3) = 10C3 (.20)^3 (.80)^(10 - 3) + ... + 10C10 (.20)^10 (.80)^0
= .2013 + .0881 + ... + .000
= .322 or 32.2%
P(x) = nCx π^x (1 - π)^(n - x) 6 - 29

DATA: 20% Unemployed & Sample of 10

 None are unemployed
P(0) = 10C0 (.20)^0 (1 - .20)^(10 - 0)
= (1)(1)(.1074)
= .1074 or 10.74%

 At least one is unemployed
…same as “all the time” (100%) except when
“none are unemployed”
i.e. P(x ≥ 1) = 1 – P(0) = 1 - .1074 = .8926 or 89.26%

Binomial Probability Distribution 6 - 30

Table 6.2   n = 10
                                 Probability
X    0.05  0.10  0.20  0.30  0.40  0.50  0.60  0.70  0.80  0.90  0.95
0    0.599 0.349 0.107 0.028 0.006 0.001 0.000 0.000 0.000 0.000 0.000
1    0.315 0.387 0.268 0.121 0.040 0.010 0.002 0.000 0.000 0.000 0.000
2    0.075 0.194 0.302 0.233 0.121 0.044 0.011 0.001 0.000 0.000 0.000
3    0.010 0.057 0.201 0.267 0.215 0.117 0.042 0.009 0.001 0.000 0.000
4    0.001 0.011 0.088 0.200 0.251 0.205 0.111 0.037 0.006 0.000 0.000
5    0.000 0.001 0.026 0.103 0.201 0.246 0.201 0.103 0.026 0.001 0.000
6    0.000 0.000 0.006 0.037 0.111 0.205 0.251 0.200 0.088 0.011 0.001
7    0.000 0.000 0.001 0.009 0.042 0.117 0.215 0.267 0.201 0.057 0.010
8    0.000 0.000 0.000 0.001 0.011 0.044 0.121 0.233 0.302 0.194 0.075
9    0.000 0.000 0.000 0.000 0.002 0.010 0.040 0.121 0.268 0.387 0.315
10   0.000 0.000 0.000 0.000 0.000 0.001 0.006 0.028 0.107 0.349 0.599
Binomial Probability Distribution 6 - 31

Using Appendix A
n = 10 (from text Appendix A)

Explanations
The X column represents the ‘Number Unemployed’
The column headings (0.05 to 0.95) represent the ‘Probability’ of success on each trial
Each table entry represents the ‘PROBABILITY’ of that X when n = 10

Alternate Solution… 6 - 32

Using Appendix A
Binomial Probability Distribution
DATA: 20% Unemployed & Sample of 10

 Exactly three are unemployed

X    0.05  0.10  0.20  0.30  … 0.80  0.90  0.95
0    0.599 0.349 0.107 0.028 … 0.000 0.000 0.000
1    0.315 0.387 0.268 0.121 … 0.000 0.000 0.000
2    0.075 0.194 0.302 0.233 … 0.000 0.000 0.000
3    0.010 0.057 0.201 0.267 … 0.001 0.000 0.000
4    0.001 0.011 0.088 0.200 … 0.006 0.000 0.000

P(3) = .201 or 20.1%
Alternate Solution… 6 - 33

Using Appendix A

Binomial Probability Distribution
DATA: 20% Unemployed & Sample of 10

• At least three are unemployed

Alternate Reasoning:
If ‘at least three are unemployed’ it follows that ‘at most seven are employed!’
You can turn this into a problem of 80% employment if you wish

Alternate Solution… 6 - 34

Using Appendix A

Binomial Probability Distribution
DATA: 20% Unemployed & Sample of 10

• At least three are unemployed

X 0.05 0.10 0.20
0 0.599 0.349 0.107
1 0.315 0.387 0.268
2 0.075 0.194 0.302
3 0.010 0.057 0.201
4 0.001 0.011 0.088
5 0.000 0.001 0.026
6 0.000 0.000 0.006
7 0.000 0.000 0.001
8 0.000 0.000 0.000
9 0.000 0.000 0.000
10 0.000 0.000 0.000

To account for the ‘at least 3 unemployed’, we must TOTAL the probabilities from 3 to 10, inclusively:
0.201 + 0.088 + 0.026 + 0.006 + 0.001 + 0.000 + 0.000 + 0.000 = 0.322 or 32.2%

Alternate Solution… 6 - 35

Using Appendix A

Binomial Probability Distribution
DATA: 20% Unemployed & Sample of 10

• None are unemployed

X 0.05 0.10 0.20 0.30 …0.80 0.90 0.95
0 0.599 0.349 0.107 0.028 …0.000 0.000 0.000
1 0.315 0.387 0.268 0.121 …0.000 0.000 0.000
2 0.075 0.194 0.302 0.233 …0.000 0.000 0.000
3 0.010 0.057 0.201 0.267 …0.001 0.000 0.000
4 0.001 0.011 0.088 0.200 …0.006 0.000 0.000

= .107 or 10.7%

Alternate Solution… 6 - 36

Using Appendix A

Binomial Probability Distribution
DATA: 20% Unemployed & Sample of 10

• At least one is unemployed
…at most nine are employed!

X 0.05 0.10 0.20
0 0.599 0.349 0.107
1 0.315 0.387 0.268
2 0.075 0.194 0.302
3 0.010 0.057 0.201
4 0.001 0.011 0.088
5 0.000 0.001 0.026
6 0.000 0.000 0.006
7 0.000 0.000 0.001
8 0.000 0.000 0.000
9 0.000 0.000 0.000
10 0.000 0.000 0.000

0.268 + 0.302 + 0.201 + 0.088 + 0.026 + 0.006 + 0.001 + 0.000 + 0.000 + 0.000 = .892 or 89.2%
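
The Appendix A lookups on these slides can be cross-checked by evaluating the binomial formula directly. The following is a minimal sketch, assuming the same data as the slides (n = 10, π = 0.20); the function name `binomial_pmf` is illustrative and not part of the course material.

```python
from math import comb


def binomial_pmf(x, n, pi):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


n, pi = 10, 0.20  # sample of 10 workers, 20% unemployed

exactly_three = binomial_pmf(3, n, pi)
at_least_three = sum(binomial_pmf(x, n, pi) for x in range(3, n + 1))
none = binomial_pmf(0, n, pi)
at_least_one = 1 - none  # complement rule

print(f"P(X = 3)  = {exactly_three:.3f}")
print(f"P(X >= 3) = {at_least_three:.3f}")
print(f"P(X = 0)  = {none:.3f}")
print(f"P(X >= 1) = {at_least_one:.4f}")
```

Summing the rounded table entries gives .892 for "at least one", while the complement rule with the unrounded P(0) gives .8926; the small gap is rounding in the table, not a different method.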

Mean µ and Variance σ² of a Binomial Probability Distribution 6 - 37

The Ontario Department of Labour reports that 20% of the workforce aged between 15 and 19 years is unemployed.
From a sample of 10 workers in this age group, calculate: µ and σ²

Formula µ = nπ
= 10(.20)
= 2.0

Formula σ² = nπ(1 - π)
= 10(.20)(.80)
= 1.60

Therefore, the standard deviation is √1.6 ≈ 1.3

Poisson Probability Distribution 6 - 38
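
As a quick check of the two formulas, here is a small sketch that evaluates µ = nπ and σ² = nπ(1 - π) for the unemployment example. The helper name `binomial_mean_variance` is an assumption made for illustration.

```python
from math import sqrt


def binomial_mean_variance(n, pi):
    """Return (mean, variance) of a binomial distribution with n trials."""
    mean = n * pi
    variance = n * pi * (1 - pi)
    return mean, variance


mean, variance = binomial_mean_variance(10, 0.20)
std_dev = sqrt(variance)  # standard deviation is the square root of the variance

print(f"mean = {mean:.1f}, variance = {variance:.2f}, std dev = {std_dev:.2f}")
```

The standard deviation works out to roughly 1.26, which the slide rounds to 1.3.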

Introduction 6 - 39

• When there is a large number of trials, but a small probability of success, binomial calculation becomes impractical
– Example: Number of deaths from horse kicks in the Army in different years

Poisson probability distribution 6 - 40

• Describes the number of times some event occurs during a specified interval.
– The interval may be time, distance, area, or volume
• It is often referred to as the “law of improbable events,” meaning that the probability, π, of a particular event’s happening is quite small


Applications 6 - 41

• It is used as a model to describe:
– the distribution of errors in data entry
– the number of scratches and other imperfections in newly painted car panels
– the number of defective parts in outgoing shipments
– the number of customers waiting to be served at a restaurant or waiting to get into an attraction at Disney World

Poisson Probability Distribution 6 - 42

The Binomial Distribution becomes more skewed to the right (positive) as the probability of success becomes smaller.
The limiting form of the Binomial Distribution where the probability of success π is small and n is large is called the Poisson Probability Distribution

Poisson Probability Distribution 6 - 43

Poisson Probability Distribution can be described mathematically using the formula:

P(x) = µ^x e^(-µ) / x!

Where… µ is the mean number of successes in a particular interval of time
e is the constant 2.71828
x is the number of successes

Poisson Probability Distribution 6 - 44

The mean number of successes… µ can be determined in binomial situations by… µ = nπ
where n is the number of trials and π the probability of a success
The variance σ² of the Poisson distribution is also equal to nπ


Example 6 - 45

The probability of an accident in a year is 0.00024. So in a town of 10,000 people, the mean number of persons involved in an accident is:
µ = nπ
= 10,000 x 0.00024 = 2.4

Poisson Probability Distribution 6 - 46

Poisson with µ = 2.4
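
To see how closely the Poisson formula tracks the binomial when n is large and π is small, the sketch below evaluates both for the accident example (n = 10,000, π = 0.00024, µ = 2.4). The function names are illustrative assumptions, not part of the slides.

```python
from math import comb, exp, factorial


def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)


def binomial_pmf(x, n, pi):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


n, pi = 10_000, 0.00024
mu = n * pi  # mean number of accidents in the town

print(" x  Poisson  Binomial")
for x in range(6):
    print(f"{x:2d}  {poisson_pmf(x, mu):.4f}   {binomial_pmf(x, n, pi):.4f}")
```

For this example the two columns agree to about three decimal places, which is the practical point of using the Poisson distribution as the limiting form.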


The Normal Probability Distribution

When you have completed this chapter, you will be able to:

1. Explain how probabilities are assigned to a continuous random variable.

2. Explain the characteristics of a normal probability distribution.

3. Define and calculate z value corresponding to any observation on a normal distribution.

4. Determine the probability a random observation is in a given interval on a normal distribution using the standard normal distribution.

5. Use the normal probability distribution to approximate the binomial probability distribution.

Continuous Random Variable

…the set of all the values in any interval is uncountable or infinite!
….we will now study the class of continuous probability distributions.

Recall…

Quantitative Variables (Numerical Observations) … can be classified as either Discrete or Continuous

Continuous Characteristics … can assume any value within a specified range!
e.g. - Pressure in a tire
- Weight of a beef chop
- Height of students in a class

Also Recall that…

Continuous Random Variable

…the Total sum of probabilities should always be 1!
When dealing with a Continuous Random Variable we assume that the probability that the variable will take on any particular value is 0!
Instead, Probabilities are assigned to intervals of values!

Continuous Probability Distributions

[Figure: curve f(x) over a Range of Values of x; the area under the curve between a and b is P(a<X<b)]

Relative Frequency Histogram



Probability Function

Data:
Heights of adult Canadian males…
a) n = 50, class interval = 2
b) n = 500, class interval = 1
c) n = 5000, class interval = 0.4
d) Probability function


[Figure: relative frequency histograms of heights for (A) n = 50, (B) n = 500 and (C) n = 5000, followed by the smooth Probability Function; vertical axes run from 0.00 to 0.08, horizontal axes from about 140 to 200]

…that not all probability functions are normal!

A probability function can have any shape, as long as it is non-negative (the curve is on or above the x-axis) and the total area under the curve is 1.

Normal Probability Distribution

The normal curve is bell-shaped and has a single peak at the exact centre of the distribution.
The arithmetic mean, median, and mode of the distribution are equal and located at the peak.
…thus half the area under the curve is above the mean and half is below it.

Normal Probability Distribution

The normal probability distribution is symmetrical about its mean.
The normal probability distribution is asymptotic, i.e. the curve gets closer and closer to the x-axis but never actually touches it.

Normal Probability Distribution

Summary
Normal curve is symmetrical
Theoretically, curve extends to infinity in both directions (-∞ to +∞)
Mean, median, and mode are equal

Normal Probability Distribution

part of a “family” of curves

How do the Standard Deviation values affect the shape of f(x)?
σ = 2
σ = 3
σ = 4

Normal Probability Distribution

part of a “family” of curves

How do the following Expected values affect the location of f(x)?

[Figure: normal curves centred at different expected values µ]

The Standard Normal Probability Distribution
… is a normal distribution with a mean of 0 and a standard deviation of 1.
Also called the Z Distribution

A z-value is the distance between a selected value (designated X) and the population mean µ, divided by the population Standard Deviation, σ.

A z-value is, therefore, a location on the standard normal curve.

Formula z = (X - µ) / σ

The Standard Normal Probability Distribution
Total Area under the curve is 100% or 1

Area = 0.5 (below the mean) Area = 0.5 (above the mean)

…since the mean is located at the peak of the curve, half the area under the curve is above the mean and half is below the mean

The Standard Normal Probability Distribution

The bi-monthly starting salaries of recent MBA graduates follow the normal distribution with a mean of $3,000 and a standard deviation of $300. What is the z-value for a salary of $3,300?

Formula z = (X - µ) / σ
z = ($3,300 - $3,000) / $300
z = 1.00

A z-value of 1 indicates that the value of $3,300 is one standard deviation above the mean of $3,000.

A z-value of 1 indicates that the value of $3,300 is one standard deviation above the mean of $3,000.

[Figure: normal curve marking $3000 at z = 0 and $3300 at z = 1]

The Standard Normal Probability Distribution

The bi-monthly starting salaries of recent MBA graduates follow the normal distribution with a mean of $3,000 and a standard deviation of $300. What is the z-value for a salary of $2,550?

Formula z = (X - µ) / σ
z = ($2,550 - $3,000) / $300
z = -1.50

A z-value of –1.50 indicates that the value of $2,550 is one and a half (1.5) standard deviations below the mean of $3,000.

The Standard
Normal Probability Distribution
A z-value of –1.50 indicates that the value of
$2,550 is one and a half standard deviations
below the mean of $3,000.
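
The two salary examples apply the same formula, z = (X - µ) / σ. A minimal sketch follows; the function name `z_value` is hypothetical and chosen only for illustration.

```python
def z_value(x, mu, sigma):
    """Distance of x from the mean mu, measured in standard deviations."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


mu, sigma = 3000, 300  # mean and standard deviation of the starting salaries

z_high = z_value(3300, mu, sigma)  # salary above the mean gives a positive z
z_low = z_value(2550, mu, sigma)   # salary below the mean gives a negative z

print(z_high, z_low)
```

The sign of the result tells you which side of the mean the observation lies on; the magnitude tells you how many standard deviations away it is.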

[Figure: normal curve marking $2550 at z = -1.50 and $3000 at z = 0]

The Standard Normal Probability Distribution
• … there is an infinite number of normal distributions, but only one table
• … each distribution is determined by µ & σ
• … work with µ & σ to calculate normal probabilities, by figuring out … how many standard deviations is x away from the mean?

… Z becomes the metre stick for measuring

How many σ is 40 away from µ?

If… µ = 50, σ = 8

[Figure: normal curve on an X-scale of 0 20 40 50 60 80 100]

40 is 10 units (-10) away from 50
40 is –1.25 σ away from 50
If 1 standard deviation (σ) is 8 units, then 10 units must be 1.25 SD

The Standard Normal Probability Distribution
the Z Distribution

A z-value is the distance between a selected value (designated X) and the population mean µ, divided by the population Standard Deviation, σ.

Formula z = (X - µ) / σ
X - µ … distance from the µ
… divide by σ to get z

z = (40 – 50) / 8 = – 10 / 8 = – 1.25

The Standard Normal Probability Distribution
the Z Distribution

- z-value: Left of mean
+ z-value: Right of mean

[Figure: normal curve on an X-scale of 0 20 40 50 60 80 100, with 0 on the Z-scale at X = 50]

The mean µ of a standard normal distribution is 0, and Standard Deviation σ is 1.

The Standard Normal Probability Distribution
Calculate P(50 < X < 60)
If… µ = 50, σ = 8
Transferring each value into z-scores,
z = (X - µ) / σ

[Figure: normal curve on an X-scale of 0 20 40 50 60 80 100, with z = 0 at X = 50 and z = 1.25 at X = 60]

Z1 = (60-50)/8 = 1.25
P(50 < X < 60) = P(0 < z < 1.25)

Normal Distribution Table

To find the area between a z score of 0 and a score of 1.25
The left-hand column provides the z-score to 1 decimal place; the top row provides the second decimal place.

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07
0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279
0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675
0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064
0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443
0.40 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808
0.50 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157
0.60 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486
0.70 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794
0.80 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078
0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340
1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577
1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790
1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980
1.30 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147

The entry in row 1.20, column 0.05 is 0.3944.
…thus


The Standard
Normal Probability Distribution
the Z Distribution
… 0.3944 is the area between the mean and a positive z score 1.25

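[Figure: normal curve on an X-scale of 0 20 40 50 60 80 100, with the area 0.3944 shaded between X = 50 (z = 0) and X = 60 (z = 1.25)]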
Therefore, the Probability of 50<X<60 is 39.44%
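
The table lookup can be reproduced numerically. The sketch below is one way to do it, assuming the standard normal CDF is computed from `math.erf`; the helper names `standard_normal_cdf` and `prob_between` are illustrative, not from the slides.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a normal variable with mean mu and std dev sigma."""
    z1 = (a - mu) / sigma  # convert each endpoint to a z-value
    z2 = (b - mu) / sigma
    return standard_normal_cdf(z2) - standard_normal_cdf(z1)


p = prob_between(50, 60, mu=50, sigma=8)
print(f"P(50 < X < 60) = {p:.4f}")
```

Because the CDF is not rounded to four decimals the way table entries are, results computed this way can differ from table-based answers in the last digit.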


The Standard Normal Probability Distribution
Calculate P(40 < X < 60)
If… µ = 50, σ = 8
Transferring each value into z-scores,
z = (X - µ) / σ

[Figure: normal curve on an X-scale of 0 20 40 50 60 80 100, with z = -1.25, 0 and 1.25 marked at X = 40, 50 and 60]

Z1 = (40-50)/8 = -1.25
Z2 = (60-50)/8 = 1.25
…and

The Standard
Normal Probability Distribution
… because both sides are symmetrical, the left side
(the area between the mean and a negative z score -1.25)
must have the same area………….. 0.3944.

Z2 = 1.25 gives 0.3944 (determined earlier)

[Figure: normal curve on an X-scale of 0 20 40 50 60 80 100, shaded between z = -1.25 and z = 1.25]

= for a Total area of: 0.3944 + 0.3944 or 0.7888


The Standard Normal Probability Distribution
the Z Distribution
Calculate P(40 ≤ X ≤ 60)

If… µ = 50, σ = 8

P(40 < X < 60) = P(40 ≤ X ≤ 60)
…since P(X = 40 or 60) is zero

Using the Normal Distribution Table

Practise Finding Areas

Practise using the Normal Distribution Table

…the area between:


Z1 = 0 and Z2 = 1.00

1 Locate Area on
the normal curve

2 Look up 1.00 in Table

0 1.00
Be sure to always sketch the curve, insert the
given values and shade the required area.


Normal Distribution Table

To find the area between a z score of 0 and a score of 1.00
The left-hand column provides the z-score to 1 decimal place; the top row provides the second decimal place.

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07
0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279
0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675
0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064
0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443
0.40 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808
0.50 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157
0.60 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486
0.70 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794
0.80 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078
0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340
1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577
1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790
1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980
1.30 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147

The Area between 0<Z<1.00 is 0.3413
Therefore, the Probability of 0<Z<1.00 is 34.13%

Practise using the Normal Distribution Table

…the area between:


Z1 = 0 and Z2 = 1.64

Locate Area on
1
the normal curve

2 Look up 1.64 in Table


0 1.64
The Area between 0<Z<1.64 is 0.4495
Therefore, the Probability of 0<Z<1.64 is 44.95%


Practise using the Normal Distribution Table

…the area between:


Z1 = 0 and Z2 = 0.47

1 Locate Area on
the normal curve

2 Look up 0.47 in Table


0 0.47
The Area between 0<Z<0.47 is 0.1808

Therefore, the Probability of 0<Z<0.47 is 18.08%


Practise using the Normal Distribution Table

…the area between:


Z1 = 0 and Z2 = -1.64
1 Locate Area on
the normal curve

2 Look up 1.64 in Table

The Area between -1.64<Z<0 is 0.4495 -1.64 0


Therefore, the Probability of -1.64<Z<0 is 44.95%

Because of symmetry, this area


is the same as between z of 0 and positive 1.64

Practise using the Normal Distribution Table

…the area between:


Z1 = 0 and Z2 = -2.22

1 Locate Area on
the normal curve

2 Look up 2.22 in Table

-2.22 0 2.22
The Area between –2.22<Z<0 is 0.4868

Therefore, the Probability of –2.22<Z<0 is 48.68%


Practise using the Normal Distribution Table

A z-value is a location on the


standard normal curve!

Therefore, a z-value(also called z-score)


can have a positive or negative value!

However, areas and probabilities


are always positive values!


Practise using the Normal Distribution Table

…the area between:


Z1 = 0 and Z2 = -2.96

1 Locate Area on
the normal curve

2 Look up 2.96 in Table

-2.96 0 2.96
The Area between –2.96<Z<0 is 0.4985
Therefore, the Probability of –2.96<Z<0 is 49.85%


Practise using the Normal Distribution Table

…the area between:


Z1 = -1.65 and Z2 = 1.65

1 Locate Area on
the normal curve

2 Look up 1.65 in Table

-1.65 0 1.65
The Area (a1) between –1.65<Z<0 is 0.4505
Add together
The Area (a2) between 0<Z<1.65 is 0.4505

Therefore, the required Total Area is 0.9010



Practise using the Normal Distribution Table

…the area between:


Z1 = -2.00 and Z2 = 1.00

Locate Area on
1
the normal curve
Look up – 2.00
2 then 1.00 in Table
-2.00 0 1.00
The Area (a1) between –2.00<Z<0 is 0.4772
Add together
The Area (a2) between 0<Z<1.00 is 0.3413
Therefore, the required Total Area is 0.8185

Practise using the Normal Distribution Table

…the area between:


Z1 = -0.44 and Z2 = 1.96

1 Locate Area on
the normal curve

2 Look up – 0.44 then 1.96 in Table

-0.44 0 1.96
The Area (a1) between –0.44<Z<0 is 0.1700
Add together
The Area (a2) between 0<Z<1.96 is 0.4750

Therefore, the required Total Area is 0.6450
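
The practice lookups above all follow one pattern: read the table entry for |z| (the area between 0 and z), then add the two entries when the z-values sit on opposite sides of the mean. Here is a hedged sketch of that pattern; `area_from_mean` stands in for the printed table and `area_straddling_mean` is a hypothetical helper name.

```python
from math import erf, sqrt


def area_from_mean(z):
    """Area between 0 and |z| under the standard normal curve (the table entry)."""
    return 0.5 * erf(abs(z) / sqrt(2))


def area_straddling_mean(z_left, z_right):
    """Area between a negative z and a positive z: add the two table entries."""
    if not (z_left <= 0 <= z_right):
        raise ValueError("expected z_left <= 0 <= z_right")
    return area_from_mean(z_left) + area_from_mean(z_right)


for z in (1.64, 0.47, -1.64, -2.22, -2.96):
    print(f"area between 0 and {z:+.2f}: {area_from_mean(z):.4f}")

for z1, z2 in ((-1.65, 1.65), (-2.00, 1.00), (-0.44, 1.96)):
    print(f"area between {z1:+.2f} and {z2:+.2f}: {area_straddling_mean(z1, z2):.4f}")
```

Taking the absolute value of z mirrors the symmetry argument on the slides: the area from the mean to -1.64 equals the area from the mean to +1.64. Computed totals may differ from the table-based ones in the fourth decimal because table entries are rounded before they are added.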



The Standard
Normal Probability Distribution
Total Area under the curve is 100% or 1

Area = 0.5 Area = 0.5

…since the mean is located at the peak of the curve,


half the area under the curve is
above the mean and half is below the mean

Practise using the Normal Distribution Table

…the area to the Left of Z1 = 2.00


Locate Area on
1
the normal curve

2 Look up 2.00 in Table


a1
Area = 0.5
3 Adjust as needed 0 2
The Area (a1) between 0<Z<2.0 is 0.4772
The Area to the left of Z1 is a1+.5 = 0.4772 + 0.5

Therefore, the required Total Area is 0.9772



Practise using the Normal Distribution Table

…the area to the Left of Z1 = 1.96

1 Locate Area on
the normal curve

2 a1
Look up 1.96 in Table Area = 0.5
3 Adjust as needed 0 1.96

The Area (a1) between 0<Z<1.96 is 0.4750


The Area to the left of Z1 is a1+.5= 0.4750 + 0.5

Therefore, the required Total Area is 0.9750



Practise using the Normal Distribution Table

…the area to the Left of Z1 = -1.64

1 Locate Area on
the normal curve
Area = 0.5
2 Look up 1.64 in Table
a1
3 Adjust as needed -1.64 0

The Area (a1) between –1.64<Z<0 is 0.4495


The Area to the left of Z1 is 0.5 - a1 = 0.5 - 0.4495

Therefore, the required Total Area is 0.0505



Practise using the Normal Distribution Table

…the area to the Left of Z1 = -0.95

1 Locate Area on
the normal curve
Area = 0.5
2 Look up 0.95 in Table a1

3 Adjust as needed -0.95


0

The Area (a1) between –0.95<Z<0 is 0.3289

The Area to the left of Z1 is 0.5 - a1 = 0.5 - 0.3289

Therefore, the required Total Area is 0.1711



Practise using the Normal Distribution Table

…the area to the Right of Z1 = -0.95

1 Locate Area on
the normal curve
Area = 0.5
2 Look up 0.95 in Table
a1

3 Adjust as needed -0.95


0

The Area (a1) between –0.95<Z<0 is 0.3289


The Area to the right of Z1 is 0.5 + a1 = 0.5 + 0.3289

Therefore, the required Total Area is 0.8289



Practise using the Normal Distribution Table

…the area to the Right of Z1 = 1.00

1 Locate Area on
the normal curve
A
Area = 0.5
2 Look up 1.00 in Table a1

3 Adjust as needed
0 1.00

The Area (a1) between 0<Z<1.00 is 0.3413

The Area to the right of Z1 is 0.5 - a1 = 0.5 - 0.3413

Therefore, the required Total Area is 0.1587



Practise using the Normal Distribution Table

…the area between: Z1 = 1.96 and Z2 = 2.58

1 Locate Area on
the normal curve a1 Find

2 Look up 1.96 then 2.58 in Table a2

0 1.96
3 Adjust as needed 2.58
The Area (a1) between 0<Z<1.96 is 0.4750
Subtract
The Area (a2) between 0<Z<2.58 is 0.4951

Therefore, the required Total Area is 0.0201



Practise using the Normal Distribution Table

…the area between: Z1 = 0.55 and Z2 = 1.96

1 Locate Area on
the normal curve
a1
2 Look up 0.55 then 1.96 in Table
a2
0.55
3 Adjust as needed 1.96

The Area (a1) between 0<Z<0.55 is 0.2088


Subtract
The Area (a2) between 0<Z<1.96 is 0.4750

Therefore, the required Total Area is 0.2662
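
The "Adjust as needed" step in the tail and same-side problems can be written out explicitly. The following sketch assumes the table entry is the area between 0 and |z|; all function names are illustrative.

```python
from math import erf, sqrt


def table_area(z):
    """Area between 0 and |z| under the standard normal curve."""
    return 0.5 * erf(abs(z) / sqrt(2))


def area_left_of(z):
    """Add the table entry to 0.5 for positive z, subtract it for negative z."""
    return 0.5 + table_area(z) if z >= 0 else 0.5 - table_area(z)


def area_right_of(z):
    """The right tail is whatever the left side does not cover."""
    return 1 - area_left_of(z)


def area_between_same_side(z1, z2):
    """Both z-values on the same side of the mean: subtract the table entries."""
    return abs(table_area(z2) - table_area(z1))


print(f"left of 2.00:   {area_left_of(2.00):.4f}")
print(f"left of -1.64:  {area_left_of(-1.64):.4f}")
print(f"right of -0.95: {area_right_of(-0.95):.4f}")
print(f"right of 1.00:  {area_right_of(1.00):.4f}")
print(f"1.96 to 2.58:   {area_between_same_side(1.96, 2.58):.4f}")
```

A useful habit from the slides carries over: sketch the curve first, so it is clear whether the table entry should be added to 0.5, subtracted from 0.5, or subtracted from another entry.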



Areas under the Normal Curve

• About 68 percent of the area under the normal curve is within one standard deviation of the mean.
µ +/- 1σ
• About 95 percent is within two standard deviations of the mean.
µ +/- 2σ
• Practically all are within three standard deviations of the mean.
µ +/- 3σ

Areas under the Normal Curve

Between:
µ - 1σ and µ + 1σ: 68.26%
µ - 2σ and µ + 2σ: 95.44%
µ - 3σ and µ + 3σ: 99.74%

Areas under the Normal Curve

The daily water usage per person in Newmarket, Ontario is normally distributed with a mean of 80 litres and a standard deviation of 10 litres.

About 68 percent of those living in Newmarket will use how many litres of water?
About 68% of the daily water usage will lie between 70 and 90 litres
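
The 68 / 95 / "practically all" figures can be checked without a table. This is a small sketch using the identity that the area within k standard deviations of the mean equals erf(k/√2); the name `area_within` is an assumption for illustration.

```python
from math import erf, sqrt


def area_within(k):
    """Area within k standard deviations of the mean of any normal curve."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")

# Water usage example: mean 80 litres, standard deviation 10 litres
mu, sigma = 80, 10
low, high = mu - sigma, mu + sigma
print(f"about 68% of daily usage lies between {low} and {high} litres")
```

Percentages obtained by doubling four-decimal table entries (for example 2 x 0.3413) can differ from these unrounded values by about 0.01 percentage points.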

Probability that…
a person from Newmarket, selected at random, will use between 80 and 88 litres per day, is…?

1 Locate Area on the normal curve

2 Calculate the Z scores

z1 = (X - µ) / σ = (80 - 80) / 10 = 0.00
z2 = (X - µ) / σ = (88 - 80) / 10 = 0.80

[Figure: normal curve with area a1 shaded between X = 80 and X = 88, i.e. between 0.0 and 0.80 on the Z-scale]

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07
0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279
0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675
0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064
0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443
0.40 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808
0.50 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157
0.60 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486
0.70 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794
0.80 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078
0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340
1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577
1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790
1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980
1.30 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147


Probability that…
a person from Newmarket, selected at random, will use between 80 and 88 litres per day, is…?

3 Look up 0.80 in Table A = .2881

The area under a normal curve between a z-value of 0 (X = 80) and a z-value of 0.80 (X = 88) is 0.2881; therefore, we conclude that 28.81% of the residents will use between 80 and 88 litres per day!

Probability that…
a person from Newmarket, selected at random, will use between 76 and 92 litres per day, is…?

1 Locate Area on the normal curve

2 Calculate the Z scores

z1 = (X - µ) / σ = (76 - 80) / 10 = -0.40
z2 = (X - µ) / σ = (92 - 80) / 10 = 1.20

[Figure: normal curve with areas a1 and a2 shaded between X = 76 and X = 92, i.e. between -.4 and 1.20 on the Z-scale]

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07
0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279
0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675
0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064
0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443
0.40 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808
0.50 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157
0.60 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486
0.70 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794
0.80 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078
0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340
1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577
1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790
1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980
1.30 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147


Probability that…
a person from Newmarket, selected at random, will use between 76 and 92 litres per day, is…?

3 Look up 0.40 then 1.20 in Table

a1 = .1554
a2 = .3849
Add together A = .5403
The area under a normal curve between a z-value of -0.40 (X = 76) and a z-value of 1.20 (X = 92) is 0.5403; therefore, we conclude that 54.03% of the residents will use between 76 and 92 litres per day!
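
Both water usage answers can be confirmed with Python's standard library. The sketch below uses `statistics.NormalDist` (available in Python 3.8 and later) with the mean and standard deviation stated in the example; the variable names are illustrative.

```python
from statistics import NormalDist

# Daily water usage per person: mean 80 litres, standard deviation 10 litres
usage = NormalDist(mu=80, sigma=10)

# P(a < X < b) is the difference of two cumulative probabilities
p_80_88 = usage.cdf(88) - usage.cdf(80)
p_76_92 = usage.cdf(92) - usage.cdf(76)

print(f"P(80 < X < 88) = {p_80_88:.4f}")
print(f"P(76 < X < 92) = {p_76_92:.4f}")
```

Working on the litres scale directly is equivalent to converting to z-values first; the library performs the standardisation internally.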


Professor Mann has determined that the scores in his statistics course are approximately normally distributed with a mean of 72 and a standard deviation of 5. He announces to the class that the top 15 percent of the scores will earn an A. What is the lowest score a student can earn and still receive an A?

1 Sketch given information onto the normal curve

Let X be the score that separates an A from a B.

[Figure: normal curve centred at µ = 72 with grade regions F, D, C, B, A from left to right; X marks the boundary between B and A]

2 Determine the Z score from the given areas

To begin, let X be the score that separates an A from a B.
If 15 percent of the students score more than X, then 35 percent must score between the mean of 72 and X

[Figure: normal curve with area a1 = .35 between µ = 72 and X, and area A = .15 to the right of X]

Now, use the table “backwards” to find the z-score corresponding to a1 = .35
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07
0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279
0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675
0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064
0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443
0.40 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808
0.50 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157
0.60 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486
0.70 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794
0.80 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078
0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340
1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577
1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790
1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980
1.30 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147


Normal Distribution Table

Search in the centre of the table for the area of 0.35

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07
0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279
0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675
0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064
0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443
0.40 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808
0.50 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157
0.60 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486
0.70 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794
0.80 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078
0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340
1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577
1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790
1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980
1.30 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147

The closest entry is 0.3508, in row 1.00 and column 0.04, so z = 1.04.
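
The slides stop at the table lookup, so the substitution step is sketched here under a stated assumption: rearranging z = (X - µ) / σ gives X = µ + zσ, and the table entry nearest 0.35 is 0.3508, at z = 1.04. The code also computes the unrounded cut-off with `statistics.NormalDist.inv_cdf` for comparison. Variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 72, 5   # mean and standard deviation of the course scores
top_share = 0.15    # top 15 percent earn an A

# Table-based answer: nearest table entry to 0.35 is 0.3508, at z = 1.04
z_table = 1.04
x_table = mu + z_table * sigma

# Unrounded answer: the z with 85% of the area to its left
z_exact = NormalDist().inv_cdf(1 - top_share)
x_exact = mu + z_exact * sigma

print(f"table:    z = {z_table:.2f}, cut-off score = {x_table:.1f}")
print(f"computed: z = {z_exact:.4f}, cut-off score = {x_exact:.2f}")
```

The two cut-offs differ only in the second decimal place, because the table forces z to the nearest hundredth.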

Using
the Normal Curve
to Approximate
the Binomial Distribution


The Normal Approximation to the Binomial

The normal distribution (a continuous distribution) yields a good approximation of the binomial distribution (a discrete distribution) for large values of n.
Use it when np and n(1 − p) are both greater than 5!


Binomial Experiment

• The experiment consists of n Bernoulli trials
• The outcomes are classified into one of two mutually exclusive categories, such as success or failure
• The probability of success stays the same for each trial
• The trials are independent
• We are interested in the number of successes


Example

• Suppose the management of the Santoni Pizza Restaurant found that 70 percent of its new customers return for another meal.
• For a week in which 80 new (first-time) customers dined at Santoni’s, what is the probability that 60 or more will return for another meal?


Solution 1


Using Binomial Distribution


Using Binomial Distribution

• We continue this process until we have the probability that all 80 customers return.
• Finally, we add the probabilities from 60 to 80.
• Solving the above problem in this manner is tedious.
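The summation that is tedious by hand is short in code. This is a minimal sketch for the Santoni example (n = 80, p = 0.70); the function names are illustrative and not part of the slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k_min, n, p):
    """Add the binomial probabilities from k_min up to n."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

exact = prob_at_least(60, 80, 0.70)
print(f"P(X >= 60) = {exact:.4f}")
```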


Using Binomial Distribution


Binomial Distribution Close to the Normal Distribution

The value 0.5 is called the continuity correction factor.


Continuity Correction Factor


Solution 2: Using Normal


Distribution


Using Normal Distribution


Using Normal Distribution

8- 1

Sampling
Methods
&
Central Limit Theorem


8- 2

When you have completed this chapter, you will be able to:
1. Explain under what conditions sampling is the
proper way to learn something about a population.
2. Describe methods for selecting a sample.
3. Define and construct a sampling distribution
of the sample mean.
4. Explain the central limit theorem.
5. Use the central limit theorem to find probabilities of
selecting possible sample means from
a specified population.

8- 3

We use sample information to make decisions or inferences about the population.

Two KEY steps:
1. Choice of a proper method for selecting sample data
2. Proper analysis of the sample data (more later)


8- 4

8- 5

KEY 1.
If a proper method for selecting the sample is NOT used, the SAMPLE will not be truly representative of the TOTAL population, and wrong conclusions can be drawn!



8- 6

Why Sample the Population?

Because…
• …of the physical impossibility of checking all items in the population; it would also be too time-consuming
• …studying all the items in a population would NOT be cost effective
• …the sample results are usually adequate
• …of the destructive nature of certain tests
8- 7

Techniques
• With replacement: each data unit in the population is allowed to appear in the sample more than once.
• Without replacement: each data unit in the population is allowed to appear in the sample no more than once.
• Probability sampling: each data unit in the population has a known likelihood of being included in the sample.
• Non-probability sampling: does not involve random selection; inclusion of an item is based on convenience.

Methods
8- 8

• Simple Random: each item (person) in the population has an equal chance of being included.
• Systematic Random: the items (people) of the population are arranged in some order. A random starting point is selected, and then every kth member of the population is selected for the sample.
• Stratified Random: the population is first divided into subgroups, called strata, and a sample is selected from each stratum.
• Cluster: the population is first divided into primary units; some units are then selected at random, and samples are drawn from the selected units.
8- 9

Simple Random Sample

• Can be done by using a table of random numbers
• Suppose a population consists of 845 employees of Nitra Industries.
• A sample of 52 employees is to be selected from that population.
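A pseudo-random number generator can play the role of the table of random numbers. The sketch below assumes the 845 employees are simply numbered 1 to 845; that numbering and the fixed seed are assumptions made so the example is reproducible.

```python
import random

rng = random.Random(2024)            # fixed seed so the draw can be repeated
employee_ids = range(1, 846)         # hypothetical IDs 1..845, one per employee
sample = rng.sample(employee_ids, 52)  # sampling without replacement

print(sorted(sample)[:10])           # first few selected IDs
```

Because `sample` draws without replacement, no employee can be selected twice, and every employee has the same chance of selection.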


8- 10

Systematic Random Sampling

• The sample is chosen by selecting a random starting point and then picking every ith element in succession from the sampling frame.
• The sampling interval, i, is determined by dividing the population size N by the sample size n and rounding to the nearest integer.

8- 11

Example

• There are 100,000 elements in the population and a sample of 1,000 is desired.
• In this case the sampling interval, i, is 100.
• A random number between 1 and 100 is selected. If, for example, this number is 23, the sample consists of elements 23, 123, 223, 323, 423, 523, and so on.
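The procedure in this example can be sketched in a few lines. The function name and its `start` and `seed` parameters are illustrative choices; elements are assumed to be numbered from 1 to N.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Pick every i-th element, where i = round(N / n), from a random start in 1..i."""
    interval = round(population_size / sample_size)
    if start is None:
        start = random.Random(seed).randint(1, interval)
    # Element numbers start, start + i, start + 2i, ... up to the population size.
    return list(range(start, population_size + 1, interval))[:sample_size]

sample = systematic_sample(100_000, 1_000, start=23)
print(sample[:6])  # [23, 123, 223, 323, 423, 523]
```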


8- 12

Stratified Random Sample

• It guarantees each group is represented in the sample. The groups are also called strata.
• For example, college students can be grouped as full time or part time, male or female, or traditional or nontraditional.
• Once the strata are defined, we can apply simple random sampling within each group or stratum to collect the sample.
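One way to picture this is to draw a simple random sample inside each stratum. The sketch below uses made-up strata (300 full-time and 100 part-time students) and samples the same fraction from each; the stratum sizes, the 10 percent fraction, and the function name are all assumptions for illustration.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the given fraction from every stratum."""
    rng = random.Random(seed)
    picked = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        picked[name] = rng.sample(members, k)
    return picked

# Hypothetical student population, split into two strata.
strata = {
    "full time": [f"FT{i}" for i in range(1, 301)],
    "part time": [f"PT{i}" for i in range(1, 101)],
}
picked = stratified_sample(strata, 0.10, seed=1)
print({name: len(units) for name, units in picked.items()})
```

Sampling the same fraction from each stratum keeps the strata in the sample in the same proportions as in the population.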

8- 13

Example
• We might study the advertising expenditures for the 352 largest companies in the United States.
• Suppose the objective of the study is to determine whether firms with high profit spent more of each sales dollar on advertising than firms with a low return or deficit. Let’s say that 50 firms are selected for intensive study.
• To make sure that the sample is a fair representation of the 352 companies, the companies are grouped on percent return on equity.


8- 14

Table showing the proportion

• If simple random sampling were used, firms in the 3rd and 4th strata would have a high chance of selection (probability of 0.87), while firms in the other strata would have a low chance of selection (probability of 0.13).

8- 15

Cluster Random Sample

• It is often employed to reduce the cost of sampling a population scattered over a large geographic area


8- 16

Example

• Suppose you want to determine the views of residents in Oregon about state and federal environmental protection policies.
• Selecting a random sample of residents in Oregon and personally contacting each one would be time consuming and very expensive.
• Instead, you could employ cluster sampling by subdividing the state into small units, either counties or regions.
• These are often called primary units.

8- 17

Selected Cluster
• Suppose you divided the state into 12 primary units, then selected four regions at random (2, 7, 4, and 12) and concentrated your efforts in these primary units.
• You could take a random sample of the residents in each of these regions and interview them.
• Note that this is a combination of cluster sampling and simple random sampling.


8- 18

Sampling Error

• Samples are used to estimate population characteristics.
• For example, the mean of a sample is used to estimate the population mean.
• However, since the sample is a part or portion of the population, it is unlikely that the sample mean would be exactly equal to the population mean.
• Similarly, it is unlikely that the sample standard deviation would be exactly equal to the population standard deviation.
• We can therefore expect a difference between a sample statistic and its corresponding population parameter.

8- 19

Terminology
“Sampling error” … is the difference between a sample statistic and its corresponding population parameter.
“Sampling distribution of the sample mean” … is a probability distribution consisting of all possible sample means of a given sample size selected from a population.

8- 20

Example

8- 21

Solution


8- 22

Solution

What is the
sampling error?

8- 23

Solution


8- 24

Solution

Solution 8- 25


8- 26

Relationships between the population distribution


and the sampling distribution of the sample mean

8- 27

The law firm of Hoya and Associates has five partners. At their weekly partners meeting each reported the number of hours they billed their clients last week:
Partner Hours
Dunn 22
Hardy 26
Kiers 30
Malinowski 26
Tillman 22
1. What is the sampling distribution of the sample mean for samples of size 2?
2. What is the mean of the sampling distribution?
3. What is the population mean?
4. What is the sampling error?


8- 28

If two partners are selected randomly, how many different samples are possible?

Partner Hours
Dunn 22
Hardy 26
Kiers 30
Malinowski 26
Tillman 22

5 objects taken 2 at a time, for a total of 10 samples, using 5C2.
8- 29

If two partners are selected randomly, how many different samples are possible?

Partner Hours
Dunn 22
Hardy 26
Kiers 30
Malinowski 26
Tillman 22

$_5C_2 = \dfrac{5!}{2!\,(5 - 2)!} = 10$ samples
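The count of possible samples can be confirmed by listing them. This short sketch uses the partner data from the slide; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

hours = {"Dunn": 22, "Hardy": 26, "Kiers": 30, "Malinowski": 26, "Tillman": 22}

# Every unordered pair of partners is one possible sample of size 2.
pairs = list(combinations(hours, 2))

print(comb(5, 2), len(pairs))
for a, b in pairs:
    print(a, "&", b)
```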

8- 30

Partners Samples of 2 Mean


1&2 (22+26)/2 = 24
1&3 (22+30)/2 = 26
1&4 (22+26)/2 = 24
1&5 (22+22)/2 = 22
2&3 (26+30)/2 = 28
2&4 (26+26)/2 = 26
2&5 (26+22)/2 = 24
3&4 (30+26)/2 = 28
3&5 (30+22)/2 = 26
4&5 (26+22)/2 = 24
8- 31

Example …continued

Organize the sample means into a Sampling Distribution

Sample means: 24, 26, 24, 22, 28, 26, 24, 28, 26, 24

Sample Mean  Frequency  Relative Frequency (Probability)
22  1  1/10
24  4  4/10
26  3  3/10
28  2  2/10
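The same sampling distribution can be built by enumerating all samples of size 2 and tallying their means. A minimal sketch, with illustrative variable names:

```python
from collections import Counter
from itertools import combinations

hours = [22, 26, 30, 26, 22]  # Dunn, Hardy, Kiers, Malinowski, Tillman

# Mean of every possible sample of two partners.
means = [(a + b) / 2 for a, b in combinations(hours, 2)]
freq = Counter(means)

for m in sorted(freq):
    print(f"{m:.0f}  {freq[m]}  {freq[m]}/{len(means)}")
```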

8- 32

Example …continued

Compute the mean of the sample means. Compare it with the population mean.

Sample Mean Frequency
22 1
24 4
26 3
28 2

$\mu_{\bar{X}} = \dfrac{22(1) + 24(4) + 26(3) + 28(2)}{10} = 25.2$
8- 33

Example …continued

Note: The population mean is the same as the mean of the sample means: 25.2 hours!

Partner Hours
Dunn 22
Hardy 26
Kiers 30
Malinowski 26
Tillman 22

$\mu = \dfrac{22 + 26 + 30 + 26 + 22}{5} = 25.2$

8- 34

The Sampling Error

Partners Samples of 2 Mean
1&2 (22+26)/2 = 24
1&3 (22+30)/2 = 26
1&4 (22+26)/2 = 24
1&5 (22+22)/2 = 22
2&3 (26+30)/2 = 28
2&4 (26+26)/2 = 26
2&5 (26+22)/2 = 24
3&4 (30+26)/2 = 28
3&5 (30+22)/2 = 26
4&5 (26+22)/2 = 24

What is the sampling error? For each sample it is $\bar{X} - \mu$.
The sum of these sampling errors over a large number of samples is close to zero.
The sample mean is an unbiased estimator of the population mean.
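For the Hoya example, every possible sample can be listed, so the claim about the sampling errors can be checked directly. The sketch below is illustrative; it computes each sample mean minus the population mean and adds the errors up.

```python
from itertools import combinations
from statistics import mean

hours = [22, 26, 30, 26, 22]
mu = mean(hours)  # population mean, 25.2

sample_means = [mean(pair) for pair in combinations(hours, 2)]
errors = [m - mu for m in sample_means]  # sampling error of each sample

print([round(e, 1) for e in errors])
print(round(sum(errors), 10))            # the errors cancel out
```

Over all 10 possible samples the positive and negative errors offset each other, which is what makes the sample mean an unbiased estimator of the population mean.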
8- 35

Central Limit Theorem

The sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed!

Sampling distribution of the sample mean:
Mean: $\mu_{\bar{X}} = \mu$
Variance: $\sigma_{\bar{X}}^2 = \sigma^2 / n$
Standard deviation (standard error of the mean): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

8- 36

Central Limit Theorem

• The approximation is more accurate for large samples than for small samples.
• This is one of the most useful conclusions in statistics.
• We can reason about the distribution of the sample mean with absolutely no information about the shape of the population distribution from which the sample is taken.
• In other words, the central limit theorem holds for any population distribution with a finite variance.
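A small simulation can make the theorem concrete. The sketch below draws repeated samples of size 30 from a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here only as an illustration) and compares the spread of the sample means with σ/√n. The seed and the number of trials are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)
n, trials = 30, 5000

# Each entry is the mean of one sample of n draws from the skewed population.
sample_means = [sum(rng.expovariate(1.0) for _ in range(n)) / n
                for _ in range(trials)]

print("mean of sample means:", round(mean(sample_means), 3))    # close to mu = 1
print("std dev of sample means:", round(pstdev(sample_means), 3))
print("sigma / sqrt(n):", round(1 / sqrt(n), 3))
```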

8- 37

Related Concepts

• If the population follows a normal probability distribution, then for any sample size the sampling distribution of the sample mean will also be normal.
• If the population distribution is symmetrical (but not normal), you will see the normal shape of the distribution of the sample mean emerge with samples as small as 10.
• If you start with a distribution that is skewed or has thick tails, it may require samples of 30 or more to observe the normality feature.
• Most statisticians consider a sample of 30 or more to be large enough for the central limit theorem to be employed.


8- 38

8- 39

Example


8- 40

Solution

8- 41

Solution


8- 42

Histogram of Mean Lengths of Service


for 25 Samples of Five Employees

8- 43


8- 44

Histogram of Mean Lengths of Service for


25 Samples of 20 Employees

8- 45

Using the Sampling Distribution of the Sample Mean
• We have a population about which we have some information.
• We take a sample from that population and wish to conclude whether the difference between the population parameter and the sample statistic (random sampling error) is due to chance.
• We can compute the probability that a sample mean will fall within a certain range.
• We know that the sampling distribution of the sample mean will follow the normal probability distribution.


8- 46

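Z-Value

$z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

The numerator, $\bar{X} - \mu$, is the sampling error. The denominator, $\sigma / \sqrt{n}$, is the standard error of the sampling distribution of the sample mean.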

8- 47

Example


8- 48

Find the Value of Z

8- 49

Solution


8- 50

Point Estimates
A point estimate is one value (a single point) that is used to estimate a population parameter

• sample mean
• sample standard deviation
• sample variance
• sample proportion
8- 51

Point Estimates

Population follows the normal distribution:
• The sampling distribution of the sample means also follows the normal distribution.
• To find the probability of a sample mean falling within a particular region, use $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Population does NOT follow the normal distribution:
• If the sample has at least 30 observations, the sampling distribution of the sample means WILL approximately follow the normal distribution.
• To find the probability of a sample mean falling within a particular region, use $z = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$


8- 52

Using the Sampling Distribution of the Sample Mean

Data…
Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form. A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes.

What is the standard error of the mean?

Formula: $s / \sqrt{n} = 80 / \sqrt{40} = 12.6$


8- 53

Using the Sampling Distribution of the Sample Mean

Data…
Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form. A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes.

What is the likelihood the sample mean is greater than 320 minutes?


8- 54

Using the Sampling Distribution of the Sample Mean

Data… * average of 330 minutes * random sample of 40 * standard deviation is 80 minutes
What is the likelihood the sample mean is greater than 320 minutes?

Step 1. Formula: $z = \dfrac{\bar{X} - \mu}{s / \sqrt{n}} = \dfrac{320 - 330}{80 / \sqrt{40}} = -0.79$

[Figure: normal curve with the area a1 between 320 and 330]
8- 55

Using the Sampling Distribution of the Sample Mean

Data… * average of 330 minutes * random sample of 40 * standard deviation is 80 minutes
What is the likelihood the sample mean is greater than 320 minutes?

Step 2. Look up 0.79 in the table: a1 = 0.2852
Required area = a1 + 0.5 = 0.2852 + 0.5 = 0.7852

[Figure: normal curve with the area a1 between 320 and 330]
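The two steps of this example can be reproduced numerically. The sketch below uses Python's standard `statistics.NormalDist`; the variable names are illustrative. Because it does not round z to two decimals before the lookup, the probability can differ from the table answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, s, n = 330, 80, 40           # mean, sample standard deviation, sample size
x_bar = 320

standard_error = s / sqrt(n)     # about 12.6
z = (x_bar - mu) / standard_error
p_greater = 1 - NormalDist().cdf(z)  # area to the right of z

print(round(standard_error, 1), round(z, 2), round(p_greater, 4))
```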

8- 56

END

Estimation and
Confidence Intervals
Chapter 9

McGraw-Hill/Irwin Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved.

Learning Objectives
LO1 Define a point estimate.
LO2 Define level of confidence.
LO3 Compute a confidence interval for the population
mean when the population standard deviation is
known.
LO4 Compute a confidence interval for a population mean
when the population standard deviation is unknown.
LO5 Compute a confidence interval for a population
proportion.

9-2

LO1 Define a point estimate.

Sampling
Why Use Sampling?
1. To contact the entire population is too time consuming.
2. The cost of studying all the items in the population is often too expensive.
3. The sample results are usually adequate.
4. Certain tests are destructive.
5. Checking all the items is physically impossible.

9-3

LO1 Define a point estimate.

Estimates
Point Estimate versus Confidence Interval Estimate
• A point estimate is a single value (point) derived from a sample and used
to estimate a population value.
• A confidence interval estimate is a range of values constructed from
sample data so that the population parameter is likely to occur within that
range at a specified probability. The specified probability is called the level
of confidence.

What are the factors that determine the width of a confidence interval?
1. The sample size, n.
2. The variability in the population, usually σ estimated by s.
3. The desired level of confidence.

9-4

LO2

How to Obtain z value for a Given Confidence Level

The 95 percent confidence refers to the middle 95 percent of the observations. Therefore, the remaining 5 percent are equally divided between the two tails.
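The z value for a confidence level can also be obtained from the inverse of the standard normal distribution. This is a small sketch using the standard library; the function name `z_for_confidence` is an illustrative choice.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """z value that leaves (1 - level) / 2 in each tail of the standard normal curve."""
    tail = (1 - level) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_for_confidence(level), 3))
```

For 95 percent confidence each tail holds 2.5 percent, which gives the familiar z value of 1.96.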

Following is a portion of Appendix B.1.

9-5

From the Central Limit Theorem

LO3 Compute a confidence interval for the population
mean when the population standard deviation is known.

Point Estimates and Confidence Intervals for a Mean, σ Known

$\bar{x} \pm z \dfrac{\sigma}{\sqrt{n}}$

$\bar{x}$ = sample mean
z = z-value for a particular confidence level
σ = the population standard deviation
n = the number of observations in the sample

1. The width of the interval is determined by the level of confidence and the size of the standard error of the mean.
2. The standard error is affected by two values:
- Standard deviation
- Number of observations in the sample
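To see how the level of confidence and the standard error set the width of the interval, here is a minimal sketch of the σ-known interval (sample mean plus or minus z times σ/√n). The input values (mean 100, σ = 15, n = 36) are made up for illustration and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, level=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)   # z times the standard error of the mean
    return x_bar - margin, x_bar + margin

# Hypothetical inputs chosen only to illustrate the calculation.
low, high = z_confidence_interval(x_bar=100, sigma=15, n=36)
print(round(low, 1), round(high, 1))
```

Raising the confidence level or the standard deviation widens the interval; increasing n narrows it.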

9-7

Example

Answer 1

Answer 2

How do we interpret these results?
• Suppose we select many samples of 256 store managers, perhaps several hundred.
• For each sample, we compute the mean and then construct a 95 percent confidence interval, as we did above.
• We could expect about 95 percent of these confidence intervals to contain the population mean.
• About 5 percent of the intervals would not contain the population mean annual income, which is µ.
• For a 95% confidence interval, about 95% of the similarly constructed intervals will contain the parameter being estimated.
• Also, 95% of the sample means for a specified sample size will lie within 1.96 standard errors of the hypothesized population mean.
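This repeated-sampling interpretation can be illustrated with a simulation. The sketch below assumes a normal population with a made-up mean of 50 and standard deviation of 10 (the slides do not give these values), draws 2,000 samples of 256 observations, and counts how often the 95 percent interval contains the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist

rng = random.Random(42)
mu, sigma, n, trials = 50.0, 10.0, 256, 2000   # hypothetical population and design

z = NormalDist().inv_cdf(0.975)
half_width = z * sigma / sqrt(n)

hits = 0
for _ in range(trials):
    x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    if x_bar - half_width <= mu <= x_bar + half_width:
        hits += 1                    # this interval contains the population mean

coverage = hits / trials
print(f"share of intervals containing mu: {coverage:.3f}")
```

The share should come out near 0.95, with some run-to-run variation if the seed is changed.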

LO4 Compute a confidence interval for the population mean
when the population standard deviation is not known.

Population Standard Deviation (σ) Unknown – The t-Distribution
• In most sampling situations the population standard deviation (σ) is not known.
• Examples:
  - The Dean of the Business College wants to estimate the mean number of hours full-time students work at paying jobs each week. He selects a sample of 30 students, contacts each student and asks them how many hours they worked last week.
  - The Dean of Students wants to estimate the distance the typical commuter student travels to class. She selects a sample of 40 commuter students, contacts each, and determines the one-way distance from each student’s home to the center of campus.

9-13

LO4 Compute a confidence interval for the population mean


when the population standard deviation is not known.

Characteristics of t-Distribution
1. It is, like the z distribution, a continuous distribution.
2. It is, like the z distribution, bell-shaped and symmetrical.
3. There is not one t distribution, but rather a family of t distributions. All t distributions have a mean of 0, but their standard deviations differ according to the sample size, n.
4. The t distribution is more spread out and flatter at the center than the standard normal distribution. As the sample size increases, however, the t distribution approaches the standard normal distribution.

9-14

9 - 16

When to use the z Distribution or the t Distribution

Population normal?
• NO: Is n 30 or more?
  - NO: Use a nonparametric test (see Ch16)
  - YES: Use the z distribution
• YES: Is the population standard deviation known?
  - NO: Use the t distribution
  - YES: Use the z distribution
LO4

Confidence Interval Estimates for the Mean

Use the z-distribution: if the population standard deviation is known or the sample is greater than 30.
Use the t-distribution: if the population standard deviation is unknown and the sample is less than 30.

• df = degrees of freedom
• 1 or 2 tail

9-17

Example

Answer
• The endpoints of the confidence interval are 0.256 and 0.384.
• Suppose we repeated this study 200 times, calculating the 95% confidence interval with each sample’s mean and standard deviation.
• About 190 of the intervals would include the population mean.
• About ten of the intervals would not include the population mean.
• This is the effect of sampling error.
• A further interpretation is to conclude that the population mean is in this interval.
• The manufacturer can be reasonably sure (95% confident) that the mean remaining tread depth is between 0.256 and 0.384 inches.
• Because the value of 0.30 is in this interval, it is possible that the mean of the population is 0.30.
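The arithmetic behind a t-based interval can be sketched as follows. The inputs are assumptions, since the example data are not reproduced in the text: a sample mean of 0.32 inch, a sample standard deviation of 0.09 inch, and n = 10, chosen here because they give the endpoints quoted above. The t value 2.262 is the standard two-tailed 95 percent value for 9 degrees of freedom, passed in by hand as if read from a t table.

```python
from math import sqrt

def t_confidence_interval(x_bar, s, n, t_value):
    """Confidence interval for the mean when sigma is unknown; t_value comes from a t table."""
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Assumed inputs (not given in the slides): x_bar = 0.32, s = 0.09, n = 10, df = 9.
low, high = t_confidence_interval(x_bar=0.32, s=0.09, n=10, t_value=2.262)
print(round(low, 3), round(high, 3))
```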
