
Reading Material #5

Gabino P. Petilos, Ph.D.
FIC, EDRE 231, 2nd Sem 11-12


THE NORMAL CURVE
INTRODUCTION

In this section we consider a very important family of distributions studied in
statistics called the normal distribution. Its equation is given by

$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

where the parameters $\mu$ and $\sigma$ are the mean and standard deviation of the distribution, respectively.
We present in Figures 1a and 1b the graph of the equation for specific values of $\mu$ and $\sigma$.
Geometrically, the mean $\mu$ is the point on the x axis that is directly below the highest point
of the normal curve. The standard deviation $\sigma$ dictates the shape of the distribution.
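To make the equation concrete, the following is a minimal Python sketch that evaluates the height of the normal curve at a point. The function name `normal_pdf` and the sample values of the mean and standard deviation are assumptions chosen for illustration; they do not come from the text.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height y of the normal curve at x, for mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * math.exp(exponent)

# Illustrative values only: the same mean with a small and a large sigma.
narrow_peak = normal_pdf(0.0, mu=0.0, sigma=0.5)
wide_peak = normal_pdf(0.0, mu=0.0, sigma=2.0)

print("peak height, sigma = 0.5:", narrow_peak)
print("peak height, sigma = 2.0:", wide_peak)
```

With these values the curve with the smaller standard deviation has the higher peak at the mean, which agrees with the statement that the standard deviation dictates the shape of the distribution.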










For small values of $\sigma$, the distribution tends to be leptokurtic, that is, tall and narrow (Figure 1a), while for
large values of $\sigma$, the distribution tends to be platykurtic, that is, low and wide (Figure 1b). The normal
distribution is thus a family of curves determined by the parameters $\mu$ and $\sigma$. The normal
distribution has the following properties:

1. It is bell-shaped and is symmetric with respect to the vertical line that passes
through the highest point of the curve;
2. It is unimodal and the mean, median and mode are equal;
3. It is asymptotic with respect to the baseline, which means that the tails of the
distribution get closer and closer to the baseline without ever touching it.
4. The total area under the curve and above the baseline is always equal to 1.0.

In general, normal distributions may have equal means but different standard
deviations (Figure 2a); equal standard deviations but different means (Figure 2b); or different
means and different standard deviations (Figure 2c).






Figure 1a          Figure 1b

Figure 2a. Normal Distributions with equal means
but different standard deviations















The normal distribution is considered important for the following reasons [a]. First,
there are many real data, such as IQ scores or achievement scores based on standardized
tests, which can be described using the normal distribution. Second, the results of many kinds of
chance outcomes, such as tossing a coin many times, can be approximated using the normal
distribution. The third and most important reason is that many statistical inference
procedures based on normal distributions work well for other roughly symmetric
distributions.


THE EMPIRICAL RULE

Because the area under the normal curve and above the baseline is 1.0, we consider
the normal curve as the graphic picture of the proportion of scores in a distribution. We
state below a property common to all normal curves with a given mean $\mu$ and standard
deviation $\sigma$. This property, called the empirical rule, highlights one interpretation of
the standard deviation as a measure of distance.


Empirical Rule: If a given set of data is assumed to be normally distributed with a given mean
$\mu$ and standard deviation $\sigma$, then

a) about 68.27% of all the cases are expected to fall between $\mu - \sigma$ and
$\mu + \sigma$;
b) about 95.45% of all the cases are expected to fall between $\mu - 2\sigma$ and
$\mu + 2\sigma$; and
c) about 99.73%, or practically all, of the cases are expected to fall between
$\mu - 3\sigma$ and $\mu + 3\sigma$.
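The three percentages in the empirical rule can be checked numerically. The sketch below relies on the standard identity that the area of a normal curve within k standard deviations of the mean equals erf(k/√2); the helper name `area_within` is an assumption made for this illustration.

```python
import math

def area_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

# Percentages for 1, 2, and 3 standard deviations, rounded to two decimals.
percentages = {k: round(100 * area_within(k), 2) for k in (1, 2, 3)}
print(percentages)  # {1: 68.27, 2: 95.45, 3: 99.73}
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to every normal curve.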

[a] David S. Moore and George P. McCabe (2003). Introduction to the Practice of Statistics (4th Ed.). U.S.A.: W.H.
Freeman.
Figure 2b. Normal Distributions with different means
but equal standard deviations

Figure 2c. Normal Distributions with different means
and different standard deviations










Example 1. Suppose the Grade VI class consisting of 200 pupils posted a mean score of 84
with a standard deviation of 4 in the achievement test in English. Assuming
that the scores are normally distributed,
a. how many pupils are expected to score between 80 and 88?
b. within what two scores do we expect 95% of the pupils to score?

Solution:
a. Note that 80 = 84 - 4 = $\bar{X} - \text{S.D.}$ while 88 = 84 + 4 = $\bar{X} + \text{S.D.}$ According to the
empirical rule, about 68% of the scores are expected to fall between $\bar{X} - \text{S.D.}$ and
$\bar{X} + \text{S.D.}$ Hence about 68%(200) = 136 pupils are expected to score between 80
and 88.

b. Again from the empirical rule, we expect about 95% of the scores to fall
between the values $\bar{X} - 2\,\text{S.D.}$ and $\bar{X} + 2\,\text{S.D.}$ Since

$\bar{X} - 2\,\text{S.D.}$ = 84 - 2(4) = 76 and $\bar{X} + 2\,\text{S.D.}$ = 84 + 2(4) = 92,

about 95% of the pupils are expected to score between 76 and 92.
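The arithmetic of Example 1 can be reproduced with a few lines of Python. This is only a sketch; the variable names are assumptions, and the rounded percentages 68% and 95% are the ones used in the solution above.

```python
n_pupils = 200
mean_score = 84
sd = 4

# Part a: about 68% of the cases lie within one standard deviation of the mean.
low_a, high_a = mean_score - sd, mean_score + sd
expected_pupils = round(0.68 * n_pupils)

# Part b: about 95% of the cases lie within two standard deviations of the mean.
low_b, high_b = mean_score - 2 * sd, mean_score + 2 * sd

print(f"a. about {expected_pupils} pupils score between {low_a} and {high_a}")
print(f"b. about 95% of the pupils score between {low_b} and {high_b}")
```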



THE STANDARD SCORE

Suppose a student in a class obtained a grade of 84 in Mathematics and 89 in English.
Would it be logical to say that the student performed better in English than in Mathematics?
Judging from the grades alone, one might be tempted to say that the student did better in English
than in Mathematics.

Statistics gives us an objective way of comparing values. In the preceding
situation, the best way to compare the scores of the student is to determine his relative
standing in relation to the performance of the entire class for each subject. One of the ways
by which an objective comparison may be done is to compute the percentile rank of the
grades of the student in these subjects. Another way to compare the student's
grades in relation to the performance of the other students in the class is to create standard
scores or z-scores.

A standard score has no attached unit and is merely the number of standard
deviation units a value falls above (or below) the mean of the group. In symbols, if Z denotes
a standard score, then

$$Z = \frac{X - \bar{X}}{\text{S.D.}} \qquad \text{(Equation 1)}$$

A positive value of Z describes the distance of the corresponding raw score above the
mean in terms of standard deviation units. Similarly, a negative value of Z describes the
distance of the corresponding raw score below the mean also in terms of standard deviation
units. Thus, the higher the value of Z, the higher is the corresponding raw score. A raw
score which is equal to the mean has a corresponding Z value of 0.


Example 2. Suppose a set of scores has a mean of $\bar{X} = 84$ and $\text{S.D.} = 4$. Compute and
interpret the Z score corresponding to a) X = 80 and b) X = 90.

Solution: a) Using Equation 1, we have

$$Z = \frac{80 - 84}{4} = -1.0$$

which means that the score of 80 is 1 standard deviation unit below the mean.

b) If X = 90,

$$Z = \frac{90 - 84}{4} = \frac{6}{4} = 1.5$$

which means that the score is 1 and 1/2 standard deviation units above the mean.


Example 3. A student obtained a score of 45 in Math and a score of 60 in Science. If the
mean and standard deviation of the math scores are 26.4 and 7.2, respectively
while that of science are 42.5 and 12.6, respectively, in which subject is his
performance better?

Solution: To compare the performance of the student in the two subjects, we have to
convert his raw scores into standard scores. Thus, for Math,

$$Z = \frac{45 - 26.4}{7.2} = 2.58$$

while for Science,

$$Z = \frac{60 - 42.5}{12.6} = 1.39.$$

Hence, the student's score in Math is 2.58 standard deviation units above the mean while
his score in Science is only 1.39 units above the mean. Consequently, relative to
the entire class, the student did better in Math than in Science.
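The conversions in Examples 2 and 3 follow directly from Equation 1. A minimal sketch is given below; the function name `z_score` is an assumption introduced for this illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviation units that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Example 2: mean 84, S.D. 4.
z_80 = z_score(80, 84, 4)
z_90 = z_score(90, 84, 4)

# Example 3: compare one student's standing in Math and in Science.
z_math = z_score(45, 26.4, 7.2)
z_science = z_score(60, 42.5, 12.6)
better_subject = "Math" if z_math > z_science else "Science"

print(z_80, z_90)
print(round(z_math, 2), round(z_science, 2), better_subject)
```

Comparing the two z-scores, rather than the raw scores 45 and 60, is what places the Math performance ahead of the Science performance.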



THE STANDARD NORMAL DISTRIBUTION

If the variable X is normally distributed with mean $\mu$ and standard deviation $\sigma$, then
the standard score Z is also normally distributed with mean $\mu_Z = 0$ and standard deviation
$\sigma_Z = 1.0$. Since the area under the normal curve is equal to 1.0 and the curve is symmetric with
respect to the vertical line passing through the mean, the area to the right and the area to the left of
the line passing through the mean $\mu_Z = 0$ are each 0.5.








Let us illustrate this idea using an example. In the table below, we consider the
scores of six students in Mathematics denoted by X and the corresponding standard
scores denoted by Z.

         X       Z
        80     0
        77    -1.5
        83     1.5
        79    -0.5
        80     0
        81     0.5
Mean    80     0
S.D.     2     1

In the given table, the standardized values of X obtained using Equation 1 are
indicated in the column labeled Z. Note that the mean of the standard scores is 0 while its
standard deviation is 1.0. Any set of data when standardized or when converted into
standard scores will always have the property that the mean is 0 and the standard deviation
is 1.0.
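The property can be verified on the scores listed in the table. The sketch below assumes that the S.D. in the table is the sample standard deviation (divisor n - 1), since that choice reproduces the value 2 for the listed scores; the variable names are illustrative.

```python
import statistics

scores = [80, 77, 83, 79, 80, 81]

mean_x = statistics.mean(scores)
sd_x = statistics.stdev(scores)  # sample standard deviation (divisor n - 1), an assumption

# Standardize every score using Equation 1.
z_scores = [(x - mean_x) / sd_x for x in scores]

print(mean_x, sd_x)
print(z_scores)
print(statistics.mean(z_scores), statistics.stdev(z_scores))
```

The standardized values have mean 0 and standard deviation 1 as long as the same kind of standard deviation is used for standardizing and for checking.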

AREAS UNDER THE NORMAL CURVE

When dealing with the normal distribution, we will often be concerned about the
area under the curve. As it is impractical to tabulate the areas of all normal curves, a table of areas
for the standard normal distribution (where the mean $\mu$ = 0 and the standard deviation $\sigma$ = 1.0) is
provided in Table 5.1. The areas under any normal curve are then obtained by
converting the raw scores into standard scores using the formula

$$Z = \frac{X - \mu}{\sigma} \qquad \text{(Equation 2)}$$
Figure 3. Standard Normal Distribution ($\mu_Z = 0$, $\sigma_Z = 1$; area on each side of the mean = 0.5)

Table 5.1

AREAS UNDER THE NORMAL CURVE
Between the Mean Z = 0 and Positive Values of Z



EXAMPLE: To find the area under the curve between the mean Z=0 and a point 2.24 standard
deviations to the right of the mean, look up the value opposite 2.2 and under .04 in the
table; 0.4875 of the area under the curve lies between the mean and a z value of 2.24.

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2517 .2549
0.7 .2580 .2611 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3414 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4761 .4767
2.0 .4772 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4864 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4979 .4980 .4981
2.9 .4981 .4982 .4982 .4983 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4987 .4987 .4987 .4988 .4988 .4989 .4989 .4989 .4990 .4990

Example 4. Find the areas under the standard normal curve between the mean and each
given value of z:
a. z = 1.75
b. z = 2.33
c. z = -0.75

Solution: a) To find the area between the mean z = 0 and z = 1.75, we read the z value of
1.7 on the first column, then the z value of 0.05 on the first row of Table 5.1.
The intersection of the identified row and column yields the number 0.4599.
Thus the area from the mean up to the value of z = 1.75 is 0.4599 or 45.99% of
the total area.




b) For z = 2.33, we read the z value of 2.3 on the first column, then the z value of
0.03 on the first row of the table. The intersection of the identified row and
column yields the number 0.4901. Thus the area from the mean up to the
value of z = 2.33 is 0.4901 or 49.01% of the total area.





c) For negative values of z, we use the fact that the normal distribution is
symmetric with respect to the vertical line that passes through the mean z = 0. Hence the
area from the mean to z = -0.75 is equal to the area from the mean to z = 0.75.
We read the z value of 0.7 on the first column, then the z value of 0.05 on the
first row of the table. The intersection of the identified row and column yields
the number 0.2734. Thus the area from the mean to the value of z = -0.75 is
0.2734 or 27.34% of the total area.
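The table lookups in Example 4 can be cross-checked by computing the same areas directly. The sketch below uses the standard relation between the normal curve and the error function; the helper name `area_mean_to_z` is an assumption, and the computed values are expected to agree with the table entries only up to the table's four-decimal rounding.

```python
import math

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (z = 0) and z.

    The absolute value applies the symmetry of the curve to negative z.
    """
    return 0.5 * math.erf(abs(z) / math.sqrt(2.0))

for z in (1.75, 2.33, -0.75):
    print(z, round(area_mean_to_z(z), 4))
```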






Example 5. Find the area under the standard normal curve
a. to the left of z = 2.0
b. to the right of z = -1.0
c. to the right of z = 1.96
d. to the left of z = -2.65
e. between z = 1.5 and z = 2.75
f. between z = -1.0 and z = 2.0
Solution: a) The area to the left of z = 2.0 includes the area from z = 0 to z = 2.0 plus half of
the entire area under the normal curve. From the table, the area from the
mean up to z = 2.0 is 0.4772. Therefore, the entire area to the left of z = 2.0 is

0.5 + 0.4772 = 0.9772 or 97.72%.



b) The area to the right of z = -1.0 includes the area from the mean down to
z = -1.0 plus half of the entire area under the normal curve. By symmetry, the
area from the mean down to z = -1.0 is equal to the area from the mean up to
z = 1.0 which is 0.3414. Thus, the entire area to the right of z = -1.0 is

0.5 + 0.3414 = 0.8414 or 84.14%.



c) To find the area to the right of z = 1.96, we first note that the entire area to the
right of the mean is 0.5. If we subtract the area from the mean up to
z = 1.96 from 0.5, we get the desired area to the right of z = 1.96. Using the
table, the area from the mean up to z = 1.96 is 0.4750. Therefore, the area to
the right of z = 1.96 is

0.5 - 0.4750 = 0.025 or 2.5%.


d) The solution to this problem is similar to letter c) above. The area to the left of
z = -2.65 is equal to the area to the right of z = 2.65 by symmetry. Using the
normal table, the area from the mean up to the z value of 2.65 is 0.4960.
Therefore, the area to the right of z = 2.65 is given by

0.5 - 0.4960 = 0.004 or 0.4% which is also the area to the left of z = -2.65.





e) To find the area between z = 1.5 and z = 2.75, we first get the area from the
mean up to z = 2.75, then subtract the area from the mean up to z = 1.5.
Using the normal table, the area from the mean up to z = 2.75 is 0.4970 while
the area from the mean up to z = 1.5 is 0.4332. Therefore, the desired area is
given by

0.4970 - 0.4332 = 0.0638 or 6.38%.


f) Finally, the area from z = -1.0 to z = 2.0 can be obtained by adding the area
from the mean down to z = -1.0 and the area from the mean up to z = 2.0. By
symmetry, the area from the mean down to z = -1.0 is equal to the area from
the mean up to z = 1.0 which is 0.3414. Also, the area from the mean up to z =
2.0 is 0.4772. Therefore, the desired area is given by

0.3414 + 0.4772 = 0.8186 or 81.86%
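The six areas in Example 5 can also be obtained from the cumulative area to the left of a z value. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are assumptions, and small differences in the fourth decimal place relative to Table 5.1 are to be expected because the table entries are rounded.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution

left_of_2 = std_normal.cdf(2.0)                                # part a
right_of_minus_1 = 1.0 - std_normal.cdf(-1.0)                  # part b
right_of_1_96 = 1.0 - std_normal.cdf(1.96)                     # part c
left_of_minus_2_65 = std_normal.cdf(-2.65)                     # part d
between_1_5_and_2_75 = std_normal.cdf(2.75) - std_normal.cdf(1.5)   # part e
between_minus_1_and_2 = std_normal.cdf(2.0) - std_normal.cdf(-1.0)  # part f

for label, area in [("a", left_of_2), ("b", right_of_minus_1), ("c", right_of_1_96),
                    ("d", left_of_minus_2_65), ("e", between_1_5_and_2_75),
                    ("f", between_minus_1_and_2)]:
    print(label, round(area, 4))
```

Here `cdf(z)` returns the whole area to the left of z, so the "0.5 plus" and "0.5 minus" steps of the table method are already built in.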




Example 7. Find the z-score corresponding to the area of 0.4972 from the mean up to z.

Solution: In this problem, we are given the area between the mean and an unknown value
of z. To find the value of z, we locate a table entry equal to 0.4972.




Once the value of 0.4972 is located, we read the corresponding values of z at the
left end of its row (2.7) and at the top of its column (.07). The z value corresponding to
the given area is obtained by adding these two values. Hence, the desired
value of z is 2.7 + 0.07 = 2.77.


Example 8. Find the z-score whose area to the right is 5%.

Solution: Since the given area to the right is 0.05, the area from the mean up to this
unknown z value must be 0.45 or 0.4500 (why?).





We then locate a table entry equal to 0.4500. Note that this value does not
appear in the table. We take the two table entries that bracket the value 0.4500.
These values are 0.4495 and 0.4505. The corresponding z values for these table
entries are 1.64 and 1.65, respectively. Since 0.4500 is midway between these
table entries, the desired z value is the average of 1.64 and 1.65, which is 1.645.
(This procedure is called linear interpolation and is applied when the
given area is not found among the table entries.)
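Finding z from a given area is the inverse of the table lookup. As a rough check on Examples 7 and 8, the sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module, which takes the total area to the left of z; the variable names are assumptions for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Example 7: area from the mean up to z is 0.4972, so the area to the left is 0.5 + 0.4972.
z_example_7 = std_normal.inv_cdf(0.5 + 0.4972)

# Example 8: area to the right is 0.05, so the area to the left is 1 - 0.05.
z_example_8 = std_normal.inv_cdf(1.0 - 0.05)

print(round(z_example_7, 2), round(z_example_8, 3))
```

The second value agrees with the interpolated 1.645 to three decimal places, which suggests that linear interpolation between adjacent table entries is adequate here.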

APPLICATIONS OF THE NORMAL DISTRIBUTION

Since the total area under the normal curve is 1.0, areas under the normal curve are
often interpreted as probabilities. Hence, whenever the values of $\mu$ and $\sigma$ are given, we can
easily find probabilities or percentages of area under the normal curve using Equation 2.

Example 9. Scores on a college entrance test in a particular university are normally
distributed with a mean of 110 and standard deviation of 25. About what
percent of the students scored
a) above 120?
b) above 160?
c) below 85?

Solution: a) To find the percent or proportion of students scoring above 120, we first
standardize this score using the equation $Z = \frac{X - \mu}{\sigma}$. Thus,

$$Z = \frac{X - \mu}{\sigma} = \frac{120 - 110}{25} = 0.4.$$

Note that 120 > 110, hence the desired answer is the area under the normal
curve to the right of z = 0.4 which is 0.5 - 0.1554 = 0.3446. Hence,
approximately 34.46% of the students are expected to score above 120.

b) The standard score corresponding to a score of 160 is given by

$$Z = \frac{X - \mu}{\sigma} = \frac{160 - 110}{25} = 2.0.$$

Since 160 > 110, the desired answer is the area under the normal curve to the
right of z = 2.0 which is 0.5 - 0.4772 = 0.0228. Hence, approximately 2.28% of
the students are expected to score above 160.

c) The standard score corresponding to a score of 85 is given by

$$Z = \frac{X - \mu}{\sigma} = \frac{85 - 110}{25} = -1.0.$$

Since 85 < 110, the desired answer is the area under the normal curve to the
left of z = -1.0. By symmetry, the area to the left of z = -1.0 is equal to the area
to the right of z = 1.0 which is 0.5 - 0.3414 = 0.1586. Hence, approximately
15.86% of the students are expected to have scored lower than 85.
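The three parts of Example 9 can be checked without standardizing by hand, because `NormalDist` from Python's standard `statistics` module accepts the mean and standard deviation directly. This is a sketch only; the variable names are assumptions, and the results may differ from the table-based answers in the fourth decimal place.

```python
from statistics import NormalDist

entrance_test = NormalDist(mu=110, sigma=25)

above_120 = 1.0 - entrance_test.cdf(120)  # part a: area to the right of X = 120
above_160 = 1.0 - entrance_test.cdf(160)  # part b: area to the right of X = 160
below_85 = entrance_test.cdf(85)          # part c: area to the left of X = 85

print(round(above_120, 4), round(above_160, 4), round(below_85, 4))
```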


Example 10. With reference to Example 9, a) what score must a student get to belong to
the upper 10% of all the examinees? b) if the university decides to fail the
lower 5% of all the examinees, what would be the minimum passing score?

Solution: a) We first obtain the value of z whose area to the right is 0.10. This means that
the area from the mean up to this z value is 0.40 or 0.4000. Using the normal
table, an area of 0.4000 is between the table entries 0.3997 and 0.4015 with
corresponding z-values of 1.28 and 1.29, respectively. Since 0.4000 is closer
to 0.3997 than to 0.4015, we take 1.28 for our value of z.

To find the corresponding raw score, we solve for X in the equation $Z = \frac{X - \mu}{\sigma}$.
Hence,

$$X = \mu + Z\sigma = 110 + 1.28(25) = 142.$$

A student must have a score of at
least 142 to belong to the upper 10% of all examinees in the entrance test.

b) In this problem, we are given the area to the left of a given z value, which is
0.05. This means that the area from the mean down to the unknown z
value is 0.4500. The corresponding z value based on the table entry of
0.4500 is 1.645 (see Example 8). Since the score is below the mean, we attach
a minus sign to this z value. Hence the desired z-value is -1.645.
To find the corresponding raw score, we solve for X in the equation $Z = \frac{X - \mu}{\sigma}$.
Hence,

$$X = \mu + Z\sigma = 110 + (-1.645)(25) = 68.875.$$

Rounding up, the minimum passing score must be 69.
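Going from a given percentage back to a raw score combines the inverse lookup with the relation $X = \mu + Z\sigma$. The sketch below lets `NormalDist.inv_cdf` from Python's standard `statistics` module do both steps at once; the variable names are assumptions, and the results are expected to be close to, but not exactly equal to, the table-based answers.

```python
from statistics import NormalDist

entrance_test = NormalDist(mu=110, sigma=25)

# Part a: the upper 10% begins where 90% of the area lies to the left.
top_10_cutoff = entrance_test.inv_cdf(0.90)

# Part b: the lower 5% ends where 5% of the area lies to the left.
minimum_passing = entrance_test.inv_cdf(0.05)

print(round(top_10_cutoff, 2), round(minimum_passing, 2))
```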


The areas under the normal curve may be used to find the percentile rank associated
with a particular score in a given set of scores. Consider the following example.

Example 11. Still with reference to Example 9, what is the percentile rank of a student
whose score is
a) 130?
b) 98?

Solution: a) The percentile rank corresponding to the score of 130 can be obtained by
getting the percent of cases that fall below the value of X = 130. Converting
this raw score to a standard score, we get

$$Z = \frac{130 - 110}{25} = 0.8.$$

Using the normal table, the area from the mean up to z = 0.8 is 0.2881. Therefore, the total
area below the score of X = 130 is 0.5 + 0.2881 or 0.7881. Hence, the
percentile rank corresponding to a score of 130 is 78.81.
b) The percentile rank corresponding to the score of 98 can be obtained by
getting the percent of cases that fall below this value. Converting this raw
score to a standard score, we get

$$Z = \frac{98 - 110}{25} = -0.48.$$

Using the normal table, the area from the mean down to z = -0.48 is equal to 0.1844.
Therefore, the total area below the standard score of -0.48 is 0.5 - 0.1844 or
0.3156. Hence, the percentile rank corresponding to a score of 98 is 31.56.
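Since a percentile rank is the percent of cases falling below a score, it can be sketched as 100 times the cumulative area to the left of that score. The helper name `percentile_rank` is an assumption made for this illustration, and the normal model of Example 9 (mean 110, standard deviation 25) is reused.

```python
from statistics import NormalDist

entrance_test = NormalDist(mu=110, sigma=25)

def percentile_rank(score):
    """Percent of cases expected to fall below the given score under the normal model."""
    return 100.0 * entrance_test.cdf(score)

rank_130 = percentile_rank(130)
rank_98 = percentile_rank(98)

print(round(rank_130, 2), round(rank_98, 2))
```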
