
CIVIL ENGINEERING STATISTICS
BFC 34303
Chapter 1: Review on Descriptive Statistics
INTRODUCTION
These are the Mathematics marks of 30 students who are taking Test 1:

12, 23, 24, 45, 34, 48, 56, 63, 23, 44,
69, 78, 84, 95, 98, 67, 73, 69, 58, 70,
40, 88, 59, 47, 37, 15, 17, 36, 63, 38
WHAT IS STATISTICS ?
~ Statistics is the science that deals
with collecting, classifying, presenting,
describing, analyzing and interpreting
data to enable us to draw conclusions
and make reasonable decisions
~ Can be divided into 2 categories
(a) Descriptive statistics
(b) Inferential statistics
Descriptive statistics
~ The activities of collecting, classifying, presenting and
describing quantitative data
~ Methods for organizing (frequency table), representing
(graphs) and summarizing data (central tendency and
variability).
Inferential statistics
~ The part dealing with the techniques and methods for
interpreting the results obtained from descriptive
statistics
WHAT IS POPULATION ?
~ Population is the entire (complete)
collection of data whose properties are
analyzed. It contains all the subjects of
interest.
~ Can be of any size, its items need not
be uniform but must share at least one
measurable feature.
WHAT IS SAMPLE?
~ A portion of the population selected for study
~ A sample is any set of entities, cases,
subjects, items or experimental units
chosen from the population.
WHAT IS RANDOM SAMPLE?
~ A random sample is a sample
selected in such a way that each
element of the population has the
same chance of being selected
WHAT IS PARAMETER ?
~ Parameter is a numerical measurement
describing some characteristics of a
population
~ Eg: the population mean, $\mu$, and the population variance, $\sigma^2$

WHAT IS STATISTIC?
~ Statistic is a numerical measurement
describing some characteristics of a
sample
~ Eg: the sample mean, $\bar{x}$, and the sample variance, $S^2$
WHAT IS VARIABLE ?
~ Any measured characteristic or
attribute that differs for different
elements
~ For example, if the weights of 30
subjects were measured, then weight
would be a variable.
~ Can be classified as quantitative or
qualitative
WHAT IS QUANTITATIVE VARIABLE ?
~ The variable being studied is
numeric
~ Measured on an ordinal, interval,
or ratio scale
~ eg: If the time it took subjects to
respond were measured, then the
variable would be quantitative.
WHAT IS QUALITATIVE VARIABLE ?
~ The variable being studied is non-numeric
~ Also called a "categorical variable"
~ Measured on a nominal scale
~ eg: gender, educational level, eye
colour
~ eg: If five-year-old students were asked to
name their favourite colour, then the
variable would be qualitative.
WHAT IS DATA ?
~ A set of data is a collection of
observations, measurements or
information obtained
~ Can be classified as quantitative or
qualitative
~ Can be presented in various ways
WHAT IS QUANTITATIVE DATA ?
~ Quantitative data refers to
observations which can be measured
numerically or counted
~ Can be divided into discrete data
and continuous data
~ eg: length, time,
temperature and mass
WHAT IS QUALITATIVE DATA ?
~ Qualitative data are not in
numerical form but are instead recorded
as attributes
~ eg: race, marital status, age group, gender
Discrete data
~ is a set of data that can only take exact
and countable values
~ For example:
a) The number of students in a class.
b) The number of cars sold on any day
at a car dealership.
c) The number of persons in a family.
Continuous data
~ is a set of data that can take any value over
a certain interval and can be measured to a
certain degree of accuracy
(correct to a certain number of decimal places)
~ For example:
a) The weight of students in a class.
b) The time taken to complete an
examination.
c) The amount of soda in a 150 ml can.
d) The income of a family.
WHAT IS UNGROUPED DATA ?
~ (a) Raw data
(b) Not in the form of class intervals
(c) Frequency distribution that has
been arranged in order
~ Example:
(i) 3, 5, 6, 2, 5, 2, 4, 6, 5
(ii)
Number of books   0   1   2   3
Frequency         3   7   4   2
WHAT IS GROUPED DATA ?
~ The data are grouped into class
intervals before the frequency
distribution is constructed
~ The table constructed is called a
frequency distribution table
~ Example:
Height (cm)   150-155   155-160   160-165   165-170
Frequency        2         8         6         5
WHAT IS FREQUENCY DISTRIBUTION?

One method for simplifying and organizing data is to
construct a frequency distribution.

A frequency distribution is an organized tabulation
showing exactly how many individuals are located in
each category on the scale of measurement.
Examples:
Determine whether the data obtained is discrete or
continuous data.
(a) The number of books sold by a stationery shop
(b) The time taken to travel from Kuala Terengganu to Batu
Pahat
(c) The number of As in SPM
(d) The weight of FKAAS students
(e) The diameter of twenty spheres
REMARKS
All data are to be considered as
sample unless otherwise stated
in the questions.
Example: The number of male children in 20 families chosen
at random is as follows.
1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2

The above data is called raw data and it can be
summarized as a frequency distribution as shown:
Number of male children   0   1   2   3   4   5
Frequency                 2   5   7   3   2   1

The data shown in this frequency distribution table
is known as ungrouped data.
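The tally from raw data to frequency table can be checked with a short script. This is a minimal sketch, not part of the original notes; it reads the first two raw entries as 1 and 4, which is what the frequency table and the total of 20 families imply. The variable names are illustrative.

```python
from collections import Counter

# Raw data from the example: number of male children in 20 families.
male_children = [1, 4, 2, 0, 2, 3, 3, 2, 1, 4,
                 5, 2, 1, 2, 0, 1, 2, 3, 1, 2]

# Counter tallies how many times each distinct value occurs.
frequency = Counter(male_children)

print("Number of male children | Frequency")
for value in sorted(frequency):
    print(f"{value:>23} | {frequency[value]}")
```

The printed frequencies should match the table above, and they sum to the number of families.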
MEASURES OF LOCATION
(CENTRAL TENDENCY)
Mean, mode, median, quartile, percentile
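MEAN
Given a set of data $x_1, x_2, x_3, \ldots, x_n$,
the mean, $\bar{x}$, is defined as
$$\bar{x} = \frac{\text{sum of all observations}}{\text{number of observations}} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
For a set of data which can be represented in a
frequency distribution table, the mean is given by
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$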
CENTRAL TENDENCY
In general terms, central tendency
(mean, median, and mode) is a statistical
measure that determines a single value
that accurately describes the center of the
distribution and represents the entire
distribution of scores.

The goal of central tendency is to identify
the single value that is the best
representative for the entire set of data.
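Example:
Find the mean of the following data
1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2
Solution:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1 + 4 + 2 + \cdots + 3 + 1 + 2}{20} = \frac{41}{20} = 2.05$$
OR

x   0   1   2   3   4   5
f   2   5   7   3   2   1

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{2(0) + 5(1) + 7(2) + 3(3) + 2(4) + 1(5)}{20} = 2.05$$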
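Both routes to the mean can be reproduced numerically. The following is a minimal sketch, assuming the raw data begin with 1 and 4 as in the worked sum; the function names `mean` and `mean_from_frequency` are illustrative and do not come from the notes.

```python
def mean(values):
    """Mean of raw data: sum of all observations / number of observations."""
    return sum(values) / len(values)


def mean_from_frequency(table):
    """Mean from a frequency table given as {value x_i: frequency f_i}."""
    total_fx = sum(x * f for x, f in table.items())
    total_f = sum(table.values())
    return total_fx / total_f


raw = [1, 4, 2, 0, 2, 3, 3, 2, 1, 4, 5, 2, 1, 2, 0, 1, 2, 3, 1, 2]
table = {0: 2, 1: 5, 2: 7, 3: 3, 4: 2, 5: 1}

print(mean(raw))                   # mean from the raw data
print(mean_from_frequency(table))  # mean from the frequency table
```

The two functions agree because the frequency table is only a compressed form of the raw data.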
Example: To obtain grade A, Saleha must achieve an average
of at least 75 marks in four tests. If her average
mark for the first three tests is 70, calculate the
lowest mark she must get in her fourth test in order
to obtain grade A.
Solution:
Let the marks for the four tests be w, x, y, z.
Mean for w, x, y: 70, so w + x + y = 3(70) = 210.
Mean for w, x, y, z must be at least 75:
$$\frac{3(70) + z}{4} \ge 75$$
$$\frac{210 + z}{4} \ge 75$$
$$210 + z \ge 300$$
$$z \ge 90$$
So, the lowest mark she must get in her
fourth test in order to obtain grade A is 90.
MEDIAN
The median is the middle value of a set of data that is arranged in
order of magnitude.
Let $x_{(k)}$ be the $k$th observation in a set of data which has been
arranged in ascending or descending order.
For example, consider the following set of numbers
9 2 7 10 5 16
After arrangement, it becomes
2 5 7 9 10 16
Thus, the median lies between $x_{(3)} = 7$ and $x_{(4)} = 9$,
so the median is 8.
The median of a set of data $x_1, x_2, \ldots, x_n$ is denoted
by $x_{(m)}$ and may be calculated as:
$$x_{(m)} = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & \text{if } n \text{ is odd} \\[4pt] \frac{1}{2}\left[ x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right], & \text{if } n \text{ is even} \end{cases}$$
Example: Find the median for the following sets of data
a) 21, 24, 17, 28, 36, 20, 32
b) 3.56, 2.71, 5.48, 8.61, 4.35, 6.22

Solution:
a) The data arranged in ascending order:
17, 20, 21, 24, 28, 32, 36
Since n = 7, which is odd, the median is
$$x_{(m)} = x_{\left(\frac{n+1}{2}\right)} = x_{(4)} = 24$$
b) The data arranged in ascending order:
2.71, 3.56, 4.35, 5.48, 6.22, 8.61
Since n = 6, which is even, the median is
$$x_{(m)} = \frac{1}{2}\left[ x_{\left(\frac{6}{2}\right)} + x_{\left(\frac{6}{2}+1\right)} \right] = \frac{1}{2}\left[ x_{(3)} + x_{(4)} \right] = \frac{1}{2}(4.35 + 5.48) = 4.915$$
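The odd and even cases of the median formula translate directly into code. This is a minimal sketch; the function name `median` is illustrative, and the first value of set b) is taken as 2.71, as it appears in the sorted list of the solution.

```python
def median(values):
    """Median using the x_((n+1)/2) rule for odd n and the average of
    x_(n/2) and x_(n/2 + 1) for even n (positions are 1-based)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]   # subtract 1 for 0-based indexing
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median([21, 24, 17, 28, 36, 20, 32]))           # set a), n = 7
print(median([3.56, 2.71, 5.48, 8.61, 4.35, 6.22]))   # set b), n = 6
```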
MODE

The mode of a set of data is the value that
occurs most frequently.

The mode may not be unique, or there may be
no mode at all.
Example:
Find the mode for the following sets of data
a) 2, 3, 3, 4, 5, 28, 5, 5
b) 2, 3, 5, 8, 10
c) 0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5
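The three data sets illustrate a unique mode, no mode, and more than one mode. The sketch below is one possible way to check them; it assumes the convention that a data set in which every value occurs exactly once has no mode, and the function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return the list of modes, or an empty list when no value repeats.

    Assumption: data in which every value occurs once has no mode.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 3, 3, 4, 5, 28, 5, 5]))                       # set a)
print(modes([2, 3, 5, 8, 10]))                                # set b)
print(modes([0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5]))   # set c)
```

Under that convention, set a) has the single mode 5, set b) has no mode, and set c) has two modes, 0.4 and 0.7.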
QUARTILES
Quartiles divide a set of data which are arranged in
ascending order into 4 equal parts.
To find quartile ($Q_k$):
Let
$$r = \frac{k}{4}\, n$$
where: n = number of observations
k = quartile number for $Q_k$
(i) If r is an integer:
$$Q_k = \frac{1}{2}\left[ r\text{th observation} + (r+1)\text{th observation} \right]$$
(ii) If r is not an integer, then round up to the next
integer and take that observation.
$Q_2$ is also called the median.
Interquartile Range = $Q_3 - Q_1$
PERCENTILES
Percentiles divide a set of data which are arranged in
ascending order into 100 equal parts.
To find percentile ($P_k$):
Let
$$r = \frac{k}{100}\, n$$
where: n = number of observations
k = percentile number for $P_k$
(i) If r is an integer:
$$P_k = \frac{1}{2}\left[ r\text{th observation} + (r+1)\text{th observation} \right]$$
(ii) If r is not an integer, then round up to the next
integer and take that observation.

Notes: $Q_1 = P_{25}$, Median = $Q_2 = P_{50}$, $Q_3 = P_{75}$


Example:
Find the median, first quartile (Q1), third
quartile (Q3) and 40th percentile (P40) for the
following sets of data
a) 21, 24, 17, 28, 36, 20, 32
b) 3.5, 2.7, 5.4, 8.6, 4.3, 6.2, 9.9, 7.6
Solution:
a) The data arranged in ascending order:
17, 20, 21, 24, 28, 32, 36
Median, Q2:
$r = \frac{k}{4} n = \frac{2}{4}(7) = 3.5$ (not an integer)
Median = Q2 = 4th observation = 24
First quartile, Q1:
$r = \frac{k}{4} n = \frac{1}{4}(7) = 1.75$ (not an integer)
Q1 = 2nd observation = 20
Third quartile, Q3:
$r = \frac{k}{4} n = \frac{3}{4}(7) = 5.25$ (not an integer)
Q3 = 6th observation = 32
40th percentile, P40:
$r = \frac{k}{100} n = \frac{40}{100}(7) = 2.8$ (not an integer)
P40 = 3rd observation = 21
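The position rule for quartiles and percentiles can be written as one small function. This is a sketch of the rule as stated in these notes, which differs from the interpolation defaults of common statistics libraries; the names `quantile`, `quartile` and `percentile` are illustrative. Integer arithmetic is used to test whether r is an integer, which avoids floating-point surprises.

```python
def quantile(values, k, parts):
    """k-th cut point when ordered data are divided into `parts` equal parts.

    Rule from the notes, with r = (k / parts) * n and 0 < k < parts:
      r integer     -> average of the r-th and (r+1)-th observations
      r non-integer -> round r up and take that observation
    """
    ordered = sorted(values)
    n = len(ordered)
    if (k * n) % parts == 0:               # r is an integer
        r = k * n // parts
        return (ordered[r - 1] + ordered[r]) / 2
    r = k * n // parts + 1                 # round r up to the next integer
    return ordered[r - 1]


def quartile(values, k):
    return quantile(values, k, 4)


def percentile(values, k):
    return quantile(values, k, 100)


set_a = [21, 24, 17, 28, 36, 20, 32]
print(quartile(set_a, 2), quartile(set_a, 1),
      quartile(set_a, 3), percentile(set_a, 40))

set_b = [3.5, 2.7, 5.4, 8.6, 4.3, 6.2, 9.9, 7.6]
print(quartile(set_b, 2), quartile(set_b, 1),
      quartile(set_b, 3), percentile(set_b, 40))
```

For set a) the function reproduces the worked values 24, 20, 32 and 21. Set b) has n = 8, so the quartile positions are integers and the averaging branch is exercised.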
Example:

The following table shows the marks obtained
by 30 students in a Mathematics quiz, where
the maximum mark is 10.
Marks             2   3   4   5   6   7   8   9   10
No. of students   2   4   3   6   4   5   4   1   1

Find the mean, mode, median, first and
third quartiles, interquartile range and
the 60th percentile.
MEASURES OF DISPERSION
Range, variance, standard deviation
Example:
Data 1: 6, 7, 8, 6, 9, 6     mean = 7
Data 2: 5, 7, 2, 6, 13, 9    mean = 7

Most of the numbers in data 1 are around the
mean value.
Data 2 is more spread away from the mean.
The difference in the spread can be determined
by a measure of dispersion.
Variability
The goal for variability is to obtain a
measure of how spread out the
scores are in a distribution.
A measure of variability usually
accompanies a measure of central
tendency as basic descriptive
statistics for a set of scores.
Three common measures of
dispersion are:
Range
Variance
Standard deviation
Range = Largest value - Smallest value

REMARK
Range is not a good measure of dispersion because it
is influenced by the extreme values and the
calculation does not cover all observations.

Variance and standard deviation are the most useful and
widely used measures of dispersion. Although they are
influenced by the extreme values, the calculations
cover all the observations.
REMARK

Standard deviation measures how spread out the
values in a data set are.
If the data points are all close to the mean, then the
standard deviation is close to zero.
If many data points are far from the mean, then the
standard deviation is far from zero.
If all the data values are equal, then the standard
deviation is zero.
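VARIANCE AND STANDARD DEVIATION
Mean:
$$\bar{x} = \frac{\sum x_i}{n} \qquad \text{or} \qquad \bar{x} = \frac{\sum f_i x_i}{n}, \quad n = \sum f_i$$
Variance:
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \qquad \text{for } i = 1, 2, \ldots, n$$
Commonly used formulae:
$$S^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n - 1} \qquad\qquad S^2 = \frac{\sum f_i x_i^2 - n\bar{x}^2}{n - 1}$$
$$S^2 = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1} \qquad\qquad S^2 = \frac{\sum f_i x_i^2 - \dfrac{\left(\sum f_i x_i\right)^2}{n}}{n - 1}$$
Standard deviation:
$$S = \sqrt{\text{variance}} = \sqrt{S^2}$$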
Example:
Calculate the variance and standard deviation for the
following sets of sample data. Hence, determine which data
set is more dispersed about the mean.

Set 1: 16, 10, 9, 2, 5, 2, 7
Set 2: 10, 32, 8, 12, 14, 36, 20, 8, 40, 4, 32, 1

For Set 1:
x       x^2
2         4
2         4
5        25
7        49
9        81
10      100
16      256
Sum: $\sum x_i = 51$, $\sum x_i^2 = 519$, n = 7
$$S^2 = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{519 - \dfrac{51^2}{7}}{6} = 24.5714$$
$$S = \sqrt{24.5714} = 4.957$$

For Set 2:
$\sum x_i = 217$, $\sum x_i^2 = 5929$, n = 12
$$S^2 = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{5929 - \dfrac{217^2}{12}}{11} = 182.265$$
$$S = \sqrt{182.265} = 13.5$$
Hence, set 2 is more dispersed than set 1.
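The computational formula used in this example can be checked with a few lines of code. This is a minimal sketch; `sample_variance` is an illustrative name, and the result is cross-checked against `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import math
import statistics


def sample_variance(values):
    """S^2 = (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


set_1 = [16, 10, 9, 2, 5, 2, 7]
set_2 = [10, 32, 8, 12, 14, 36, 20, 8, 40, 4, 32, 1]

for name, data in (("Set 1", set_1), ("Set 2", set_2)):
    variance = sample_variance(data)
    # The standard library result should agree up to rounding error.
    assert math.isclose(variance, statistics.variance(data))
    print(name, round(variance, 4), round(math.sqrt(variance), 3))
```

The larger standard deviation of set 2 is what supports the conclusion that it is the more dispersed set.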
STEM-AND-LEAF DIAGRAMS
Used to display every data value in a data set.

The digit(s) in the greatest place value(s) of the data
values are the stems.

The digits in the next greatest place value are
the leaves.
To construct a stem-and-leaf diagram:
1. Place the stems in order vertically from smallest to
largest.
2. Place the leaves in order in each row from smallest
to largest.
3. Create a key for the stem-and-leaf diagram so that
people know how to interpret the diagram.
Shape of distribution
A perfectly symmetric curve is one in which both sides of
the distribution would exactly match each other if the figure
were folded over its central point.
[Figure: symmetric distribution]

A symmetric, bell-shaped distribution, a relatively common
occurrence, is called a normal distribution.

A distribution is said to be skewed to the right, or
positively skewed, when most of the data are
concentrated on the left of the distribution. The right tail
clearly extends farther from the distribution's centre than
the left tail.
[Figure: positively skewed distribution]

A distribution is said to be skewed to the left, or
negatively skewed, if most of the data are concentrated
on the right of the distribution. The left tail clearly extends
farther from the distribution's centre than the right tail.
[Figure: negatively skewed distribution]

Example:
[Figure: a stem-and-leaf plot turned on its side]
The distribution shows that most data are clustered at the right.
The left tail extends farther from the data centre than the right
tail. Therefore, the distribution is skewed to the left, or
negatively skewed.
Example:
Marks of a recent Mathematics test are as given below:
73, 42, 67, 78, 99, 84, 91, 82, 86, 94
Based on the marks given:
(a) Construct a stem-and-leaf diagram.
(b) What are the highest and lowest marks?
(c) Interpret the distribution.

Solution:
(a) Mathematics Test Mark
Stem | Leaf
4    | 2
5    |
6    | 7
7    | 3 8
8    | 2 4 6
9    | 1 4 9
Key: 9 | 9 means 99 marks
(b) Highest mark = 99, Lowest mark = 42
(c) Negatively skewed
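The construction steps for a stem-and-leaf diagram can be sketched in code for whole-number data. This is only an illustration under the assumption that the last digit is the leaf and the remaining digits are the stem; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem to its ordered leaves for non-negative integers.

    Assumption: the last digit is the leaf, the other digits are the stem.
    Stems with no data are kept so that gaps stay visible.
    """
    rows = defaultdict(list)
    for value in sorted(values):           # sorting puts the leaves in order
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    lowest, highest = min(rows), max(rows)
    return {stem: rows.get(stem, []) for stem in range(lowest, highest + 1)}


marks = [73, 42, 67, 78, 99, 84, 91, 82, 86, 94]
diagram = stem_and_leaf(marks)

print("Stem | Leaf")
for stem, leaves in diagram.items():
    print(f"{stem:>4} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 9 | 9 means 99 marks")
```

Keeping the empty stem 5 in the output matters for the interpretation, because the gap is part of what makes the long left tail visible.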
Example:
The heights of 20 people are as follows:
154, 143, 148, 139, 143, 147, 153,
162, 136, 147, 144, 143, 139, 142,
143, 156, 151, 164, 157, 149.
Construct a stem-and-leaf diagram and state the
shortest and tallest heights. Interpret the distribution.

Solution:
Stem | Leaf
13   | 6 9 9
14   | 2 3 3 3 3 4 7 7 8 9
15   | 1 3 4 6 7
16   | 2 4
Key: 13 | 6 means 136 cm
Shortest height = 136 cm
Tallest height = 164 cm
Positively skewed
Exercise:

The lengths of a straight line, in mm, as estimated by
22 students are given below:
10.5, 8.5, 8.6, 8.1, 7.3, 4.4, 6.6, 6.6, 7.9, 8.7,
8.3, 6.0, 8.7, 7.5, 7.9, 6.0, 9.1, 7.2, 8.4, 8.1,
8.6, 9.3
Construct a stem-and-leaf diagram based on the given
data. Interpret the distribution.
BOX-AND-WHISKER PLOTS
[Figure: a horizontal and a vertical box-and-whisker plot on a 0 to 70 scale, each marking min, Q1, Q2, Q3 and max]
To construct a box-and-whisker plot:

STEP 1: Determine the five-number summary
(min, Q1, Q2, Q3, max).
STEP 2: Draw a horizontal axis on which the numbers
obtained in step 1 can be located. Above this
axis, mark each value of the five-number summary
with a vertical line.
STEP 3: Connect the quartiles to each other to
make a box, and then connect the box
to the maximum and minimum lines.
STEP 4: Calculate the values of the upper and lower
inner fences to determine whether the data
contain any outliers.
Upper inner fence = Q3 + 1.5 (Q3 - Q1)
Lower inner fence = Q1 - 1.5 (Q3 - Q1)
[Figure: box-and-whisker plot on a 10 to 100 scale with min, Q1, Q2, Q3 and max all between the lower and upper inner fences]
The data lie within the upper and lower inner fences, so the data have no outlier.

[Figure: box-and-whisker plot on a 10 to 100 scale with one observation marked as an outlier outside the inner fences]
An observation that lies outside the fences is known as an outlier.

SHAPE OF DATA DISTRIBUTION
(SYMMETRY AND SKEWNESS)

Symmetrical distribution: the whiskers are
the same length and the median Q2 is in
the centre of the box.

Positively skewed distribution: the left
whisker is shorter than the right whisker
and the median is nearer to Q1.

Negatively skewed distribution: the left
whisker is longer than the right
whisker and the median is nearer to Q3.

[Figure: one box-and-whisker plot for each case, marking min, Q1, Q2, Q3 and max]
Example:
Data
40, 32, 61, 52, 65, 68, 41, 61, 70, 66, 57, 55, 45,
51, 62, 69, 31, 50, 72, 66, 41, 54, 65, 79, 66
(a) Display the data in a stem-and-leaf diagram.
(b) Find the first, second and third quartiles, and the upper and lower inner
fences.
(c) Construct a box-and-whisker plot for the above data.
Solution:
(a)
Stem | Leaf
3    | 1 2
4    | 0 1 1 5
5    | 0 1 2 4 5 7
6    | 1 1 2 5 5 6 6 6 8 9
7    | 0 2 9
Key: 5 | 4 means 54
(b) Number of observations, n = 25, min = 31, max = 79
$r = \frac{1}{4}(25) = 6.25$, so Q1 = the 7th observation = 50
$r = \frac{2}{4}(25) = 12.5$, so Q2 = the 13th observation = 61
$r = \frac{3}{4}(25) = 18.75$, so Q3 = the 19th observation = 66

Upper inner fence = Q3 + 1.5 (Q3 - Q1)
= 66 + 1.5(66 - 50)
= 90

Lower inner fence = Q1 - 1.5 (Q3 - Q1)
= 50 - 1.5(66 - 50)
= 26
(c)
[Figure: box-and-whisker plot on a 10 to 100 scale with min = 31, Q1 = 50, Q2 = 61, Q3 = 66, max = 79, lower inner fence = 26 and upper inner fence = 90]

No outlier. The data is negatively skewed (skewed to the left).


Example:
Stem | Leaf
5    | 1 9
6    | 2 3 3 4 4 4 4 4 5
6    | 8 8 8 9 9 9
7    | 0 2 2 3 6 7
Key: 5 | 9 means 59 °F

From the given stem-and-leaf diagram, construct a box-and-whisker
plot. Determine the outliers of the data.

Number of observations, n = 23, min = 51, max = 77
$r = \frac{1}{4}(23) = 5.75$, so Q1 = the 6th observation = 64 °F
$r = \frac{2}{4}(23) = 11.5$, so Q2 = the 12th observation = 68 °F
$r = \frac{3}{4}(23) = 17.25$, so Q3 = the 18th observation = 70 °F
Upper inner fence = Q3 + 1.5 (Q3 - Q1)
= 70 + 1.5(70 - 64)
= 79 °F

Lower inner fence = Q1 - 1.5 (Q3 - Q1)
= 64 - 1.5(70 - 64)
= 55 °F
[Figure: box-and-whisker plot on a 50 to 80 scale with min = 51, Q1 = 64, Q2 = 68, Q3 = 70, max = 77, lower inner fence = 55 and upper inner fence = 79]
From the boxplot, we can see that the minimum value
51 °F is outside the fence, so this value is an outlier.
Therefore the whiskers are drawn from 59 °F to 77 °F.
[Figure: the same plot redrawn with the outlier 51 marked separately and the whiskers running from 59 to 77]
The data is negatively skewed (skewed to the left).
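The fence calculation and the outlier check in this example can be sketched as follows. The quartile rule is the position rule from these notes rather than a library default, the temperatures are read off the stem-and-leaf diagram, and all names are illustrative.

```python
def quartile(ordered, k):
    """Q_k by the position rule in the notes, for data already sorted."""
    n = len(ordered)
    if (k * n) % 4 == 0:                   # r = k*n/4 is an integer
        r = k * n // 4
        return (ordered[r - 1] + ordered[r]) / 2
    # r rounded up is k*n//4 + 1; the 0-based index is one less than that.
    return ordered[k * n // 4]


# Temperatures (degrees F) read from the stem-and-leaf diagram.
temps = sorted([51, 59, 62, 63, 63, 64, 64, 64, 64, 64, 65, 68,
                68, 68, 69, 69, 69, 70, 72, 72, 73, 76, 77])

q1, q2, q3 = (quartile(temps, k) for k in (1, 2, 3))
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [t for t in temps if t < lower_fence or t > upper_fence]
inside = [t for t in temps if lower_fence <= t <= upper_fence]
whiskers = (min(inside), max(inside))      # whiskers stop at non-outliers

print(q1, q2, q3, lower_fence, upper_fence, outliers, whiskers)
```

Drawing the whiskers only to the most extreme values inside the fences is what moves the left whisker from 51 to 59 in the redrawn plot.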
GROUPED DATA
MEASURES OF LOCATION
Mean, mode, median, quartile, decile, percentile
MEAN of a frequency distribution

The mean of a set of grouped data given in
the form of a frequency distribution is
defined as
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
where
$\sum_{i=1}^{k} f_i$ = total frequency
$x_i$ = class mark
Example:
Find the mean for the following data
Class           Frequency, fi
0 ≤ x < 10           2
10 ≤ x < 20         17
20 ≤ x < 30         26
30 ≤ x < 40         10
40 ≤ x < 50          5
SOLUTION:
The class mark is the midpoint of the class, for example $x_1 = \frac{0 + 10}{2} = 5$.
Class           Class mark, xi   Frequency, fi   fi xi
0 ≤ x < 10            5                2            10
10 ≤ x < 20          15               17           255
20 ≤ x < 30          25               26           650
30 ≤ x < 40          35               10           350
40 ≤ x < 50          45                5           225
                                 Σ fi = 60    Σ fi xi = 1490
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{1490}{60} = 24.83$$
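The grouped mean can be verified by computing each class mark as the midpoint of its class. This is a minimal sketch; the tuple layout (lower end, upper end, frequency) and the function name `grouped_mean` are illustrative choices.

```python
# Each class is (lower end, upper end, frequency), taken from the example.
classes = [(0, 10, 2), (10, 20, 17), (20, 30, 26), (30, 40, 10), (40, 50, 5)]


def grouped_mean(classes):
    """Mean of grouped data: sum(f_i * x_i) / sum(f_i), x_i = class mark."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f


print(round(grouped_mean(classes), 2))
```

The result is an estimate, since every observation in a class is treated as if it sat at the class mark.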
MODE of a frequency distribution

$$\text{mode} = L_m + \left( \frac{d_1}{d_1 + d_2} \right) c$$
$L_m$ = lower boundary of the class containing the
mode
$d_1$ = the difference between the frequency of the mode
class and the frequency of the class
immediately before it
$d_2$ = the difference between the frequency of the mode
class and the frequency of the class
immediately after it
$c$ = size of the mode class
Example:
Find the mode of the frequency distribution given below:
Class      Frequency
15 - 19        1
20 - 24        4
25 - 29       22
30 - 34       35
35 - 39       20
40 - 44        8
SOLUTION:

The mode class is 30 - 34 and the
corresponding frequency is 35.

$L_m = 29.5$
$d_1 = 35 - 22 = 13$
$d_2 = 35 - 20 = 15$
$c = 5$
$$\text{mode} = L_m + \left( \frac{d_1}{d_1 + d_2} \right) c = 29.5 + \left( \frac{13}{13 + 15} \right) 5 = 31.8$$
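The mode formula for grouped data can be sketched as a small function. The class boundaries are taken as half a unit either side of the class limits, as in the solution; treating a missing neighbouring class as having frequency 0 is an assumption made here for classes at either end, and the function name is illustrative.

```python
def grouped_mode(boundaries, frequencies):
    """mode = L_m + d1 / (d1 + d2) * c for the class of highest frequency.

    boundaries: list of (lower boundary, upper boundary) per class.
    Assumption: a missing neighbouring class counts as frequency 0.
    """
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    before = frequencies[m - 1] if m > 0 else 0
    after = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    d1 = frequencies[m] - before
    d2 = frequencies[m] - after
    lower, upper = boundaries[m]
    return lower + d1 / (d1 + d2) * (upper - lower)


boundaries = [(14.5, 19.5), (19.5, 24.5), (24.5, 29.5),
              (29.5, 34.5), (34.5, 39.5), (39.5, 44.5)]
frequencies = [1, 4, 22, 35, 20, 8]

print(round(grouped_mode(boundaries, frequencies), 1))
```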
Mode from histogram
1. Draw a line from the upper left corner of the highest
vertical bar to the upper left corner of the next vertical bar.
2. Draw a line from the upper right corner of the highest
vertical bar to the upper right corner of the vertical bar before it.
3. The mode is estimated from the intersection point of both lines.
The histogram should be drawn on graph paper in order to obtain an
accurate answer.
[Figure: histogram of frequency against class boundaries, with the mode read off below the intersection point]



Example: For the data in example 2, find the mode
using the histogram
SOLUTION:
[Figure: histogram of frequency (0 to 35) against class boundaries 14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5]
Mode = 31.8
MEDIAN of a frequency distribution

NOTE:

The median of a frequency distribution cannot
be found in the same way as for ungrouped data
because the data have been grouped in
the form of classes. So, we will get an
estimated value of the median.
MEDIAN
$$m = L_m + \left( \frac{\frac{n}{2} - F_L}{f_m} \right) c$$
$L_m$ = lower boundary of the median class
$n$ = total frequency
$F_L$ = cumulative frequency of the class before the median class
$f_m$ = frequency of the median class
$c$ = size of the median class
Example:
Calculate the median for the following data
Class           Frequency, f
0 ≤ x < 5            7
5 ≤ x < 10          27
10 ≤ x < 15         35
15 ≤ x < 20         54
20 ≤ x < 25         63
25 ≤ x < 30         43
30 ≤ x < 35         25
35 ≤ x < 40         17
40 ≤ x < 45          9
45 ≤ x < 50          4
SOLUTION:
Class           Frequency, f   Cumulative frequency
0 ≤ x < 5            7               7
5 ≤ x < 10          27              34
10 ≤ x < 15         35              69
15 ≤ x < 20         54             123
20 ≤ x < 25         63             186
25 ≤ x < 30         43             229
30 ≤ x < 35         25             254
35 ≤ x < 40         17             271
40 ≤ x < 45          9             280
45 ≤ x < 50          4             284
                 Σ f = 284
The median class is 20 ≤ x < 25 with the
corresponding frequency 63.
$L_m = 20$, $n = \sum f = 284$, $F_L = 123$, $f_m = 63$, $c = 5$
Hence, the median is
$$m = L_m + \left( \frac{\frac{n}{2} - F_L}{f_m} \right) c = 20 + \left( \frac{\frac{1}{2}(284) - 123}{63} \right) 5 = 21.51$$
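The search for the median class and the interpolation inside it can be sketched in a few lines. This assumes equal class widths and uses illustrative names; the median class is taken as the first class whose cumulative frequency reaches n/2.

```python
def grouped_median(lower_boundaries, width, frequencies):
    """m = L_m + (n/2 - F_L) / f_m * c, assuming equal class widths."""
    n = sum(frequencies)
    cumulative = 0                          # F_L: total before current class
    for lower, f in zip(lower_boundaries, frequencies):
        if f > 0 and cumulative + f >= n / 2:   # first class reaching n/2
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must contain at least one positive value")


lower_boundaries = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]
frequencies = [7, 27, 35, 54, 63, 43, 25, 17, 9, 4]

print(round(grouped_median(lower_boundaries, 5, frequencies), 2))
```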
Quartile
Quartiles divide a set of data which are
arranged in ascending order into 4 equal
parts
Percentile
Percentiles divide a set of data which are
arranged in ascending order into 100 equal
parts
Decile
Deciles divide a set of data which are
arranged in ascending order into 10 equal
parts
For grouped data:
$$Q_k = L_k + \left( \frac{\frac{k}{4} n - F_L}{f_k} \right) c_k, \qquad k = 1, 2, 3$$
$$P_k = L_k + \left( \frac{\frac{k}{100} n - F_L}{f_k} \right) c_k, \qquad k = 1, 2, 3, \ldots, 99$$
$$D_k = L_k + \left( \frac{\frac{k}{10} n - F_L}{f_k} \right) c_k, \qquad k = 1, 2, 3, \ldots, 9$$

$L_k$ = lower boundary of the class where $Q_k$, $P_k$, $D_k$ lies
$n$ = total number of observations
$F_L$ = cumulative frequency before the class where $Q_k$, $P_k$, $D_k$ lies
$f_k$ = frequency of the class where $Q_k$, $P_k$, $D_k$ lies
$c_k$ = width of the class where $Q_k$, $P_k$, $D_k$ lies
Example:
Height (cm)   3-5   6-8   9-11   12-14   15-17   18-20
Frequency      1     2     11     10       5       1

From the above data, calculate:
(a) the first and third quartiles and the interquartile range
(b) the 10th and 90th percentiles
(c) the 5th decile, D5
Solution:
Class limits   Class boundaries   Frequency   Cumulative frequency
3-5              2.5-5.5              1              1
6-8              5.5-8.5              2              3
9-11             8.5-11.5            11             14
12-14           11.5-14.5            10             24
15-17           14.5-17.5             5             29
18-20           17.5-20.5             1             30

(a) First and third quartiles
$\frac{1}{4} n = \frac{1}{4}(30) = 7.5$, so Q1 is in the third class with boundaries (8.5 - 11.5).
Thus, $L_k = 8.5$, $f_k = 11$, $F_L = 3$, $c = 3$
$$Q_1 = P_{25} = 8.5 + \left( \frac{7.5 - 3}{11} \right) 3 = 9.73$$
$\frac{3}{4} n = \frac{3}{4}(30) = 22.5$, so Q3 is in the fourth class with boundaries (11.5 - 14.5).
Thus, $L_k = 11.5$, $f_k = 10$, $F_L = 14$, $c = 3$
$$Q_3 = P_{75} = 11.5 + \left( \frac{22.5 - 14}{10} \right) 3 = 14.05$$
Interquartile range:
$$Q_3 - Q_1 = 14.05 - 9.73 = 4.32$$

(b)
$$P_{10} = 5.5 + \left( \frac{3 - 1}{2} \right) 3 = 8.5$$
$$P_{90} = 14.5 + \left( \frac{27 - 24}{5} \right) 3 = 16.3$$
(c)
$$D_5 = P_{50} = \text{Median} = 11.5 + \left( \frac{15 - 14}{10} \right) 3 = 11.8$$
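Quartiles, deciles and percentiles of grouped data share one formula, so a single function parameterised by the number of parts can reproduce every answer in this example. This is a sketch with illustrative names; the class containing the quantile is taken as the first class whose cumulative frequency reaches (k / parts) n.

```python
def grouped_quantile(boundaries, frequencies, k, parts):
    """L_k + ((k/parts) * n - F_L) / f_k * c_k for grouped data.

    parts = 4 gives quartiles, 10 deciles, 100 percentiles.
    boundaries: list of (lower boundary, upper boundary) per class.
    """
    n = sum(frequencies)
    target = k * n / parts
    cumulative = 0                          # F_L for the current class
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("k must satisfy 0 < k <= parts")


boundaries = [(2.5, 5.5), (5.5, 8.5), (8.5, 11.5),
              (11.5, 14.5), (14.5, 17.5), (17.5, 20.5)]
frequencies = [1, 2, 11, 10, 5, 1]

q1 = grouped_quantile(boundaries, frequencies, 1, 4)
q3 = grouped_quantile(boundaries, frequencies, 3, 4)
p10 = grouped_quantile(boundaries, frequencies, 10, 100)
p90 = grouped_quantile(boundaries, frequencies, 90, 100)
d5 = grouped_quantile(boundaries, frequencies, 5, 10)

print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))
print(round(p10, 2), round(p90, 2), round(d5, 2))
```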
MEASURES OF DISPERSION
Range, interquartile range, variance, standard deviation

RANGE

Range = upper boundary of the last class
- lower boundary of the first class

INTERQUARTILE RANGE
Defined as the difference
between the third quartile and
the first quartile
Interquartile range = Q3 - Q1
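Variance and standard deviation

$$\text{Variance, } S^2 = \frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1}$$
$$\text{Standard deviation, } S = \sqrt{\text{Variance}} = \sqrt{S^2}$$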
Example:
Find the range, variance and standard deviation.
Class intervals   Frequency, f   Class mark, x    fx     fx^2
1-3                    5              2            10       20
4-6                    3              5            15       75
7-9                    2              8            16      128
10-12                  1             11            11      121
13-15                  6             14            84     1176
16-18                  4             17            68     1156
                   Σ f = 21                   Σ fx = 204   Σ fx^2 = 2676
Solution:
Range = upper boundary of the last class
- lower boundary of the first class
= 18.5 - 0.5 = 18

$$S^2 = \frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1} = \frac{2676 - \dfrac{204^2}{21}}{20} = 34.71$$
$$S = \sqrt{34.71} = 5.892$$
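The range, variance and standard deviation of this grouped data set can be recomputed from the class limits and frequencies. This is a minimal sketch with illustrative names; it assumes, as the solution does, that class boundaries lie half a unit outside the class limits.

```python
import math

# (lower class limit, upper class limit, frequency) from the example table.
rows = [(1, 3, 5), (4, 6, 3), (7, 9, 2), (10, 12, 1), (13, 15, 6), (16, 18, 4)]


def grouped_variance(rows):
    """S^2 = (sum(f x^2) - (sum(f x))^2 / sum(f)) / (sum(f) - 1)."""
    n = sum(f for _, _, f in rows)
    marks = [((lower + upper) / 2, f) for lower, upper, f in rows]
    sum_fx = sum(f * x for x, f in marks)
    sum_fx2 = sum(f * x * x for x, f in marks)
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


# Assumption: boundaries sit half a unit beyond the class limits.
data_range = (rows[-1][1] + 0.5) - (rows[0][0] - 0.5)
variance = grouped_variance(rows)

print(data_range, round(variance, 2), round(math.sqrt(variance), 3))
```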
Example:
Find the mean, variance and standard deviation.
