
W4 L1

• Organizing & Graphing Quantitative Data


– Stem & Leaf Display
– Frequency Distribution
– Histogram
– Polygon
– Ogive
– Box Plot

SQQS1013 W4 L1 ZZ 1
Stem & Leaf Display
• A stem and leaf plot retains (most of) the raw numerical
data.
• Each value is divided into two parts: a stem and a leaf.
• Then the leaves for each stem are shown separately in a
display.
• Shows the data pattern (e.g. outliers).
• Can detect which value is frequently repeated (mode).

• Example 9:
25 12 9 10 5 12 23 7
36 13 11 12 31 28 37 6
14 41 38 44 13 22 18 19
For the value 07: 0 = first digit (stem), 7 = second digit (leaf)
Stem & Leaf Display
• Example 9:
25 12 9 10 5 12 23 7
36 13 11 12 31 28 37 6
14 41 38 44 13 22 18 19

• Step 1: Arrange data in ascending order.


05,06,07,09,10,11,12,12,12,13,13,14,18,19,22,23,25,28,
31,36,37,38,41,44

• Step 2: Separate the data according to the first digit.

05,06,07,09
10,11,12,12,12,13,13,14,18,19
22,23,25,28
31,36,37,38
41,44
Stem & Leaf Display
• Step 3: Plot the stem and leaf.

Stem | Leaf
0 | 5 6 7 9
1 | 0 1 2 2 2 3 3 4 8 9
2 | 2 3 5 8
3 | 1 6 7 8
4 | 1 4
Key: 1 | 0 means 10

• The stem is a.k.a. the leading digit; the leaf is a.k.a. the trailing digit.
• Min? Max? Most repeated value?
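The three steps above (sort, split by first digit, plot) can be sketched in a few lines of Python. This is only an illustration using the Example 9 data; the function name `stem_and_leaf` is a hypothetical helper, and it assumes non-negative two-digit values, so the stem is the tens digit and the leaf is the units digit.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    plot = defaultdict(list)
    for value in sorted(values):          # Step 1: ascending order
        stem, leaf = divmod(value, 10)    # Step 2: split leading/trailing digit
        plot[stem].append(leaf)
    return dict(plot)

# Data of Example 9
data = [25, 12, 9, 10, 5, 12, 23, 7,
        36, 13, 11, 12, 31, 28, 37, 6,
        14, 41, 38, 44, 13, 22, 18, 19]

plot = stem_and_leaf(data)
for stem in sorted(plot):                 # Step 3: print one row per stem
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Reading the rows answers the slide's questions: the minimum is the first leaf of the first stem, the maximum is the last leaf of the last stem, and the most repeated value shows up as a run of identical leaves on one stem.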
Frequency Distribution
• A frequency distribution for quantitative data lists all the classes and the
number of values that belong to each class.
• Data presented in the form of a frequency distribution are called grouped data.

Frequency Distribution
Class boundary (real class limit) is given by the midpoint of the upper
limit of one class and the lower limit of the next class.
• Divide the sum of these two limits by 2.
• e.g. $\frac{400 + 401}{2} = 400.5$ (class boundary)
Class width (class size):
• Class width = Upper boundary – Lower boundary
• e.g. Width of the first class = 600.5 – 400.5 = 200
Class midpoint (or mark):
• Class midpoint = $\frac{\text{Lower limit} + \text{Upper limit}}{2}$
• e.g. Midpoint of the 1st class = $\frac{401 + 600}{2} = 500.5$
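As a quick check of the three definitions, the worked numbers above can be reproduced with a minimal Python sketch. The function names are hypothetical helpers chosen for illustration, not part of the course material.

```python
def class_boundary(upper_limit, next_lower_limit):
    """Midpoint between one class's upper limit and the next class's lower limit."""
    return (upper_limit + next_lower_limit) / 2

def class_width(lower_boundary, upper_boundary):
    """Upper boundary minus lower boundary."""
    return upper_boundary - lower_boundary

def class_midpoint(lower_limit, upper_limit):
    """Average of the class's own lower and upper limits."""
    return (lower_limit + upper_limit) / 2

boundary = class_boundary(400, 401)    # boundary between the limits 400 and 401
width = class_width(400.5, 600.5)      # width of the first class
midpoint = class_midpoint(401, 600)    # midpoint of the first class
print(boundary, width, midpoint)
```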

Constructing Frequency Distribution Tables
1. To decide the number of classes, c, we use Sturges’ formula,
$c = 1 + 3.3 \log n$,
where n is the number of observations in the data set and log is the base-10 logarithm.
2. Class width, i:
$i \approx \frac{\text{Largest value} - \text{Smallest value}}{\text{Number of classes}} = \frac{\text{Range}}{c}$
• The class width is rounded up to a convenient number.
3. Lower limit of the first class (the starting point):
• Use the smallest value in the data set.
Constructing Frequency Distribution Tables
• Example 10:
• The following data give the total home runs hit by all
players of each of the 30 Major League Baseball teams
during the 2004 season.

Constructing Frequency Distribution Tables
• Number of classes, c = 1 + 3.3 log 30
= 1 + 3.3(1.48)
= 5.88 ≈ 6 classes (c may be rounded up or rounded down)
• Class width, $i \approx \frac{242 - 135}{6} \approx 17.8$, so i = 18 (i must always be rounded up)

• Starting point = 135
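The same calculation can be sketched in Python. This is a minimal illustration using only the values quoted in the example (n = 30, smallest value 135, largest value 242). The function names are hypothetical. Rounding c to the nearest integer is an assumption, since the slide allows c to be rounded either way.

```python
import math

def sturges_classes(n):
    """Number of classes from c = 1 + 3.3 log10(n), rounded to the nearest integer (assumption)."""
    return round(1 + 3.3 * math.log10(n))

def class_width(largest, smallest, c):
    """Range divided by the number of classes, always rounded up."""
    return math.ceil((largest - smallest) / c)

c = sturges_classes(30)
i = class_width(242, 135, c)

# Class limits starting from the smallest value, 135
start = 135
limits = [(start + k * i, start + (k + 1) * i - 1) for k in range(c)]
print(c, i, limits)
```

With a width of 18 and a starting point of 135, the six classes run from 135 – 152 up to 225 – 242, so the largest value, 242, falls in the last class.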

Constructing Frequency Distribution Tables
• Table 2.10: Frequency Distribution for Data of Table 2.9

Total Home Runs    Tally        f
135 – 152          |||| ||||    10
153 – 170          ||           2
171 – 188          ||||         5
189 – 206          |||| |       6
207 – 224          |||          3
225 – 242          ||||         4
                   Σf =         30

Relative Frequency and Percentage Distributions
• Example 11: Refer to Example 10.
Table 2.11: Relative Frequency and Percentage Distributions for Table 2.10.

Total Home Runs    Class Boundaries            Relative Frequency    %
135 – 152          134.5 to less than 152.5    0.3333                33.33
153 – 170          152.5 to less than 170.5    0.0667                6.67
171 – 188          170.5 to less than 188.5    0.1667                16.67
189 – 206          188.5 to less than 206.5    0.2000                20.00
207 – 224          206.5 to less than 224.5    0.1000                10.00
225 – 242          224.5 to less than 242.5    0.1333                13.33
Total                                          1.0                   100%
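The columns of Table 2.11 follow directly from the frequencies in Table 2.10: each relative frequency is f divided by Σf, and each percentage is the relative frequency times 100. A minimal Python sketch, assuming the class limits and frequencies as tabulated and a 0.5 adjustment for the class boundaries (the variable names are illustrative only):

```python
classes = [(135, 152), (153, 170), (171, 188), (189, 206), (207, 224), (225, 242)]
freqs = [10, 2, 5, 6, 3, 4]
total = sum(freqs)

# Class boundaries: half a unit below the lower limit and above the upper limit
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in classes]

relative = [round(f / total, 4) for f in freqs]
percent = [round(100 * f / total, 2) for f in freqs]

for (lo, hi), rel, pct in zip(boundaries, relative, percent):
    print(f"{lo} to less than {hi}: {rel:.4f}  {pct:.2f}%")
```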

Histograms
• A histogram is a graph in which classes are marked on
the horizontal axis and either the frequencies, relative
frequencies, or percentages are marked on the vertical
axis.
• The frequencies, relative frequencies or percentages are
represented by the heights of the bars.
• In a histogram, the bars are drawn adjacent to each other
and there is a space between the y-axis and the first bar.

Histograms
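Example 12: Refer to Example 10.

[Figure: histogram for the data of Example 10, with the classes on the horizontal axis and a space between the y-axis and the first bar.]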

Polygon
A graph formed by joining the midpoints of the tops of
successive bars in a histogram with straight lines is called a
polygon.
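Since the midpoint of the top of each bar sits at (class midpoint, frequency), the points a polygon joins can be listed without drawing the histogram. The sketch below assumes the class limits and frequencies of Table 2.10; the variable names are illustrative only.

```python
classes = [(135, 152), (153, 170), (171, 188), (189, 206), (207, 224), (225, 242)]
freqs = [10, 2, 5, 6, 3, 4]

# x-coordinate: class midpoint; y-coordinate: class frequency
points = [((lo + hi) / 2, f) for (lo, hi), f in zip(classes, freqs)]

for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the polygon.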

Polygon
• For a very large data set, as the number of classes is increased
(and the width of classes is decreased), the frequency polygon
eventually becomes a smooth curve called a frequency
distribution curve or simply a frequency curve.

Shape of Histogram
• Histogram shows
– centre, spread and skewness of the data
– presence of outliers
– multiple modes ?
• The most common shapes are:
• (1) Symmetric
• (2) Right skewed
• (3) Left skewed

Shape of Histogram
[Figure: histogram with multiple modes.]
[Figure: histogram with the tail labelled.]

Cumulative Frequency Distributions
• A cumulative frequency distribution gives the total
number of values that fall below the upper boundary of
each class.
• Example 14: Use the frequency distribution of Table 2.10 to build the
cumulative frequency distribution.
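A cumulative frequency column is a running total of the class frequencies, paired with each class's upper boundary. A minimal sketch for Table 2.10, assuming the frequencies and boundaries tabulated earlier (the variable names are illustrative only):

```python
from itertools import accumulate

upper_boundaries = [152.5, 170.5, 188.5, 206.5, 224.5, 242.5]
freqs = [10, 2, 5, 6, 3, 4]

# Running total: number of values below each upper boundary
cumulative = list(accumulate(freqs))

for boundary, cum in zip(upper_boundaries, cumulative):
    print(f"less than {boundary}: {cum}")
```

The last cumulative frequency always equals the total number of observations, here 30.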

Choosing the Class Limits
3 approaches for continuous data:
• 1st approach:
Class Limits    Values lie between
30-39           29.5-39.5
40-49           39.5-49.5
50-59           49.5-59.5
* The 1st approach shall be used throughout this course.
• 2nd approach:
Class Limits    Values lie between
30.0-39.9       29.95-39.95
40.0-49.9       39.95-49.95
50.0-59.9       49.95-59.95
This approach is used when the values in the data set are
rounded to 1 decimal place.
Choosing the Class Limits
• 3rd approach:
Class Limits    Values lie between*
30-40           30-39.99…
40-50           40-49.99…
50-60           50-59.99…
* Note that the value 40 is included in the 2nd class.

Ogive
• An ogive is a curve drawn for the cumulative frequency
distribution.
• Two types of ogive:
• (1) ogive less than
• (2) ogive greater than
• Steps:
– Build a table of cumulative frequency.
– Draw the x and y axes. Label x = class boundaries, y = cumulative
frequencies.
– Plot graph using the appropriate class boundary.
– Join the 1st appropriate class boundary to the consecutive points.
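The points to plot for each type of ogive can be sketched as follows, using the class boundaries and frequencies of Table 2.10. This assumes the usual convention that the "less than" ogive starts at 0 on the first lower boundary and the "greater than" ogive starts at the total there; the variable names are illustrative only.

```python
from itertools import accumulate

boundaries = [134.5, 152.5, 170.5, 188.5, 206.5, 224.5, 242.5]
freqs = [10, 2, 5, 6, 3, 4]
total = sum(freqs)

# "Less than" ogive: number of values below each boundary
less_than = [0] + list(accumulate(freqs))
# "Greater than" ogive: number of values above each boundary
greater_than = [total - below for below in less_than]

less_than_points = list(zip(boundaries, less_than))
greater_than_points = list(zip(boundaries, greater_than))
print(less_than_points)
print(greater_than_points)
```

Joining each list of points in order gives a rising curve for the "less than" ogive and a falling curve for the "greater than" ogive.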

Box Plot (Box and Whisker)
• Data is graphically described using 5 specific values:
min, K1 (first quartile), median, K3 (third quartile) and max.
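The five values can be computed with a short Python sketch, shown here on the Example 9 data from the stem and leaf display. K1 and K3 are read as the first and third quartiles. The quartile rule used (the median of the lower half and of the upper half of the sorted data) is an assumption; other conventions exist and can give slightly different quartiles. The function name is a hypothetical helper and expects at least two values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, K1, median, K3, max) using the median-of-halves rule (assumption)."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]                  # values below the median position
    upper = data[len(data) - half:]      # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Data of Example 9
data = [25, 12, 9, 10, 5, 12, 23, 7,
        36, 13, 11, 12, 31, 28, 37, 6,
        14, 41, 38, 44, 13, 22, 18, 19]

summary = five_number_summary(data)
print(summary)
```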

Multiple Box Plot
[Figure: multiple box plot.]

Box Plot
• Can provide answers to the following questions:
– Does the location differ between subgroups?
– Does the variation differ between subgroups?
– Are there any outliers?
• An effective tool for summarizing large quantities of
information.

Closure W4 L1
• Organizing & Graphing Quantitative Data
– Stem & Leaf Display
– Frequency Distribution
– Histogram
– Polygon
– Ogive
– Box Plot
• For each display: Suitability? Interpretation? Benefits?
• W4 L2 Lessons:
– Measures of Central Tendency. Read up! Exercise.
