
Lecture 6

1 of 17
MAT 212 Probability & Statistics for Science &
Engineering
Lecture 6
Continuous Probability Distributions

Introduction

We learnt earlier that in statistics there are two broad categories of data: discrete and continuous. In the last chapter we discussed how to handle discrete probability distributions, and some useful discrete distributions. In this chapter we will do a similar exercise for continuous variables. We recognize one fact at the outset. Because of the discrete nature of the variable, where needed, we performed a discrete summation for discrete distributions. For a continuous variable, a discrete summation is not possible; we must instead perform a continuous summation, i.e. integration.

There is a second, and more important, difference between discrete and continuous probability. Once again, because of the nature of the problem, it was quite possible for a discrete variable to have a particular, fixed, specified value, like 3, 27, 19, or 51. As a result, we ended up finding the probability of the random variable having these particular, fixed, specified values. The case for a continuous variable is a little more interesting. Suppose we are investigating the heights of people. We have established that the heights range from 1.3 m to 1.7 m, with average 1.45 m. If we choose one person at random, what is the probability that the height of the person is 1.45 m? To answer this question, we use the classic definition of probability that we learnt in lecture 3:
p(A) = (No. of ways the event can occur) / (No. of ways the experiment can proceed)
There is only one way the event can occur, i.e. the height of the person is exactly 1.45 m. When we say 1.45 m, we mean exactly 1.45 m, with no variation. In the denominator, there are infinitely many possible heights between 1.3 m and 1.7 m. Therefore, the probability turns out to be 0! In fact, this is a hallmark of continuous probability: the probability of the random variable having any one particular value is always zero.

Keeping this in mind, to understand how to handle the probability of a continuous variable, let us consider a distribution of a discrete variable, as shown in figure 1(a). The grouping of the variable is natural in this case. But if the variable is continuous, we can imagine the distribution to be a limiting case of the discrete distribution, with the width of the bars becoming narrower and narrower. In the limit, the distribution would be as shown in figure 1(b).

Figure 1. (a) A discrete and (b) a continuous distribution.

Keeping both figures in mind, we can develop a slightly different interpretation of probability for a discrete variable that is equally applicable to a continuous variable. Let us assume that for some value x = x0, p(x = x0) = p0. This corresponds to one of the bars of the histogram. Since the width of the bar has no physical significance, we can arbitrarily take the width to be unity, and the height to be p0. This means that the probability p0 is the area of the bar. In the case of a continuous variable, every number is a valid value of the random variable, and each number corresponds to a vertical line. We know from basic geometry that the area of a line is 0. Therefore, once again, we establish that for a continuous variable, the probability of the random variable having a particular value is 0.

As an outcome of the above discussion, we conclude that for a continuous variable we should find the probability of the variable lying within a range of values, rather than having a particular value, e.g. p(a < x < b). We now formally state the definition of a continuous distribution.

Definition. Let x be a continuous random variable. A function f such that

1. f(x) ≥ 0

2. ∫_(−∞)^(∞) f(x) dx = 1

3. p(a ≤ x ≤ b) = ∫_a^b f(x) dx

is called a probability density function for x.

Properties 1 and 2 are necessary and sufficient conditions for the function f(x) to be a continuous density. It is evident that the term density in the continuous case is just an extension of the word distribution presented in the discrete case, with summation replaced by integration. This is an important notion, as it will allow us to define all the other mathematical concepts.
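These three conditions can be checked numerically for any candidate density. A minimal sketch in Python (not part of the original notes; the density f(x) = 2x on (0, 1) and the helper `integrate` are our own illustrative choices):

```python
# Numerical check of the three density conditions, using the simple
# illustrative density f(x) = 2x on (0, 1).

def f(x):
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    # Midpoint rule: accurate enough to verify the conditions numerically.
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

# Condition 1: f(x) >= 0 at every sampled point.
assert all(f(x / 1000.0) >= 0.0 for x in range(-2000, 2000))

# Condition 2: the total area under f is 1.
print(round(integrate(f, -1.0, 2.0), 4))     # ~1.0

# Condition 3: a probability is an area, e.g. p(0.25 <= x <= 0.5).
print(round(integrate(f, 0.25, 0.5), 4))     # ~0.1875
```

The same check works for any density met later in this lecture by swapping in the corresponding f.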

Example 6.1
Lead is added to gasoline on purpose: it acts as a catalyst to accelerate the combustion of the fuel. But lead in the atmosphere is a major pollutant, so environmental regulators pay extra attention to the lead content of gasoline.

The lead concentration in gasoline currently ranges from 0.1 gms per liter to 0.5 gms per liter, increasing linearly. If a sample of gasoline is chosen at random, what is the probability that the lead concentration will be between 0.2 and 0.3 gms/l?

Our first exercise is to construct the probability density function, f. To do this, we recall conditions (1) and (2) shown above, and we observe the graphical nature of the distribution in figure 2(a). Essentially, we have to choose the value of the function between 0.1 and 0.5 in such a way that the area under the distribution is 1. With a little college algebra, we can derive the following expression for f(x):

f(x) = 12.5x − 1.25, 0.1 < x < 0.5
     = 0 otherwise
We can verify conditions (1) and (2) above with this function. The function is always positive in its given range of existence, i.e. between x = 0.1 and x = 0.5. We can also verify condition (2) as follows:

∫_(−∞)^(∞) f(x) dx = ∫_(−∞)^(0.1) f(x) dx + ∫_(0.1)^(0.5) f(x) dx + ∫_(0.5)^(∞) f(x) dx

= 0 + ∫_(0.1)^(0.5) (12.5x − 1.25) dx + 0

= [12.5x²/2 − 1.25x] evaluated from 0.1 to 0.5 = 1
The details of the integration are skipped here. To find the probability that has been asked for, we write

p(0.2 < x < 0.3) = ∫_(0.2)^(0.3) (12.5x − 1.25) dx = [12.5x²/2 − 1.25x] evaluated from 0.2 to 0.3 = 0.1875

This probability is the area under the function between x = 0.2 and x = 0.3, as shown in figure 2(b).

We explain one point here. In the example above, we were not very careful about the end points x = 0.2 and 0.3: while finding the probability, we never made clear whether the end points were included or not. The reason is that it would not have made any difference, because we have already established that in continuous probability, p(x = a) = 0. Therefore, p(a ≤ x ≤ b) = p(a ≤ x < b) = p(a < x ≤ b) = p(a < x < b).
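The hand integration above can be cross-checked numerically. A sketch in Python (not part of the original notes; helper names are ours, using the density derived in this example):

```python
# Numerical check of example 6.1: f(x) = 12.5x - 1.25 on (0.1, 0.5).

def f(x):
    return 12.5 * x - 1.25 if 0.1 < x < 0.5 else 0.0

def integrate(g, a, b, n=100_000):
    # Midpoint rule; exact for a linear integrand.
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

print(round(integrate(f, 0.1, 0.5), 4))   # total area, ~1.0
print(round(integrate(f, 0.2, 0.3), 4))   # p(0.2 < x < 0.3), ~0.1875
```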


Figure 2. Example 6.1: (a) the density f(x); (b) the probability p(0.2 < x < 0.3) as the area under f between x = 0.2 and 0.3.

Cumulative distribution

The idea of a cumulative distribution function is very useful in the continuous case. It is defined exactly as in the discrete case, although found by integration rather than summation. For a continuous variable, the cumulative distribution of a probability density function f(t) is defined as

F(x) = p(t ≤ x) = ∫_(−∞)^(x) f(t) dt

Graphically, the cumulative distribution is understood to be the area under the curve to the left of any point x.

Example 6.2
We will evaluate the cumulative distribution for the function given in example 6.1. For this example the density function was defined as

f(x) = 12.5x − 1.25, 0.1 < x < 0.5
     = 0 otherwise

The cumulative distribution has been defined as

F(x) = p(t ≤ x) = ∫_(−∞)^(x) f(t) dt

For x < 0.1, this integral is 0, since f(t) = 0. For 0.1 ≤ x ≤ 0.5,

F(x) = ∫_(0.1)^(x) (12.5t − 1.25) dt = 6.25x² − 1.25x + 0.0625

For x > 0.5, F(x) = 1.0. We now compile the complete result:

F(x) = 0.0 for x < 0.1
     = 6.25x² − 1.25x + 0.0625 for 0.1 ≤ x ≤ 0.5
     = 1.0 for x > 0.5

The probability density function and the corresponding cumulative distribution function are shown in figure 3.

In the previous problem, the probability p(0.2 < x < 0.3) was found by direct integration. Once we have evaluated the cumulative function, finding the probability becomes easy. The probability can be found as

p(0.2 < x < 0.3) = p(x < 0.3) − p(x < 0.2) = F(0.3) − F(0.2)

By direct substitution,

F(0.3) = 6.25(0.3)² − 1.25(0.3) + 0.0625 = 0.2500
F(0.2) = 6.25(0.2)² − 1.25(0.2) + 0.0625 = 0.0625

Therefore,

p(0.2 < x < 0.3) = 0.2500 − 0.0625 = 0.1875
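The closed-form F(x) lends itself directly to a short computation. A sketch (not part of the original notes; the function name `F` is ours):

```python
# The cumulative distribution of example 6.2, used to recover the
# probability of example 6.1 without any integration.

def F(x):
    if x < 0.1:
        return 0.0
    if x > 0.5:
        return 1.0
    return 6.25 * x * x - 1.25 * x + 0.0625

print(round(F(0.3) - F(0.2), 4))   # p(0.2 < x < 0.3) = 0.1875
```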


Expected values

In this section we define the expected value of a continuous probability density function. We will also define other quantities related to the expected value, such as the mean, variance and moment generating function. The definition of expected value for a continuous variable parallels that for a discrete variable, with the summation operation replaced by integration.

For a continuous density function f(x), the expected value of a function H(x) is defined as

E[H(x)] = ∫_(−∞)^(∞) H(x) f(x) dx

As in the discrete case, the mean, variance and moment generating function are defined as

μ = E(x) = ∫_(−∞)^(∞) x f(x) dx

σ² = E[(x − μ)²] = ∫_(−∞)^(∞) (x − μ)² f(x) dx = E(x²) − [E(x)]²

m_x(t) = E(e^(tx)) = ∫_(−∞)^(∞) e^(tx) f(x) dx

Example 6.3
We will find the mean and variance for the density function of example 6.1.

Figure 3. Example 6.2: the density function f(x) and the corresponding cumulative distribution F(x).

The given density function is

f(x) = 12.5x − 1.25, 0.1 < x < 0.5
     = 0 otherwise
Therefore,

E(x) = ∫ x f(x) dx = ∫_(0.1)^(0.5) x (12.5x − 1.25) dx = ∫_(0.1)^(0.5) (12.5x² − 1.25x) dx

= [12.5x³/3 − 1.25x²/2] evaluated from 0.1 to 0.5 = 0.3667 gms/l

To find the variance, we first find E(x²):

E(x²) = ∫ x² f(x) dx = ∫_(0.1)^(0.5) (12.5x³ − 1.25x²) dx

= [12.5x⁴/4 − 1.25x³/3] evaluated from 0.1 to 0.5 = 0.1433 (gms/l)²

Therefore,

σ² = E(x²) − [E(x)]² = 0.1433 − (0.3667)² = 0.00889 (gms/l)²

The standard deviation is

σ = √0.00889 = 0.0943 gms/l
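The mean and variance above can be confirmed by numeric integration. A sketch under the same density (not part of the original notes; helper names are ours):

```python
# Numeric mean and variance for the density of example 6.1.

def f(x):
    return 12.5 * x - 1.25 if 0.1 < x < 0.5 else 0.0

def integrate(g, a, b, n=200_000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * f(x), 0.1, 0.5)        # E(x)
ex2 = integrate(lambda x: x * x * f(x), 0.1, 0.5)     # E(x^2)
var = ex2 - mean ** 2
print(round(mean, 4), round(var, 5))   # ~0.3667, ~0.00889
```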

We will now study some special continuous random variables.


Uniform random variable

A random variable x is said to be uniformly distributed over the interval (α, β) if its probability density function is given by

f(x) = 1/(β − α), α < x < β
     = 0 otherwise

A graph of this function is given in figure 4. Note that the foregoing meets the requirements of a continuous density function stated in conditions (1) and (2) above. A uniform distribution arises in practice when we suppose that a particular random variable is equally likely to be near any value in the interval (α, β). The mean, variance and moment generating function are given below.

Figure 4. The uniform density: f(x) = 1/(β − α) over (α, β).

E(x) = μ = (α + β)/2

σ² = (β − α)²/12

m_x(t) = (e^(βt) − e^(αt)) / ((β − α)t)

Example 6.4
The current in a semiconductor diode is often described by the Shockley equation

I = I0 (e^(aV) − 1)

where V is the voltage across the diode, I0 is the reverse current, a is a constant, and I is the resulting diode current. Find E(I) if a = 5, I0 = 10⁻⁶, and V is uniformly distributed over (1, 3).

E(I) = E[I0 (e^(aV) − 1)] = I0 E(e^(aV) − 1) = I0 [E(e^(aV)) − 1]

= 10⁻⁶ [ (1/2) ∫_(1)^(3) e^(5v) dv − 1 ]

= 10⁻⁶ [ (e¹⁵ − e⁵)/10 − 1 ]

= 0.3269
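The closed form can be cross-checked by a Monte Carlo average, since E(I) is just the average of I0 (e^(aV) − 1) over many random draws of the uniformly distributed voltage. A sketch (not part of the original notes; the seed and sample size are arbitrary choices of ours):

```python
import math
import random

# Example 6.4 two ways: closed form vs a Monte Carlo average over V.
a, I0 = 5.0, 1e-6

closed_form = I0 * ((math.exp(15) - math.exp(5)) / 10.0 - 1.0)
print(round(closed_form, 4))    # ~0.3269

random.seed(1)
n = 200_000
mc = sum(I0 * (math.exp(a * random.uniform(1.0, 3.0)) - 1.0)
         for _ in range(n)) / n
print(round(mc, 3))             # close to 0.327, up to Monte Carlo noise
```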


Gamma, exponential, chi-square distributions

In this section we consider the gamma distribution. This distribution is especially important
in that it allows us to define two families of random variables, the exponential and chi-
squared, both of which are extensively used in applied statistics. The theoretical basis for the
gamma distribution is the gamma function, a mathematical function defined in terms of an
integral


Γ(α) = ∫_(0)^(∞) z^(α−1) e^(−z) dz,  α > 0

The gamma function has two important properties:
1. Γ(1) = 1
2. For α > 1, Γ(α) = (α − 1)Γ(α − 1)
We illustrate this through an example.


Example 6.5
Evaluate ∫_(0)^(∞) z³ e^(−z) dz.

To evaluate this integral using the techniques of elementary calculus would require successive applications of integration by parts. Using the gamma function, it can be evaluated very quickly:

∫_(0)^(∞) z³ e^(−z) dz = ∫_(0)^(∞) z^(4−1) e^(−z) dz = Γ(4)

Γ(4) = 3Γ(3) = 3·2Γ(2) = 3·2·1Γ(1) = 3·2·1 = 3! = 6
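Python's standard library exposes the gamma function as `math.gamma`, which can be used to confirm both properties (a quick sketch, not part of the original notes):

```python
import math

# math.gamma implements the gamma function; check Gamma(4) = 3! = 6
# and the recursion Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1).
print(math.gamma(4))                            # Gamma(4) = 3! = 6
print(math.gamma(4.5) - 3.5 * math.gamma(3.5))  # ~0, by the recursion
```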

A random variable x with density

f(x) = x^(α−1) e^(−x/β) / (Γ(α) β^α),  x > 0, α > 0, β > 0

is said to have a gamma distribution with parameters α and β.

The different properties of the distribution are:

Moment generating function, m_x(t) = (1 − βt)^(−α), t < 1/β
Mean, μ = E(x) = αβ
Variance, σ² = αβ²

The cumulative distribution function is given by

F(x) = Γ_(x/β)(α) / Γ(α)
Figure 5. Gamma distribution for different values of α (α = 1, 2, 3, 4, 5).

The function Γ_x(α) is called the incomplete gamma function, and is defined as

Γ_x(α) = ∫_(0)^(x) z^(α−1) e^(−z) dz
The graphical nature of the gamma distribution is shown in figure 5 for different values of α, and in figure 6 for different values of β. With the graphical representation of the gamma distribution, we are in a better position to interpret the parameters. The quantity α is called the shape parameter, which sets the overall shape of the distribution, and the quantity β is called the scale parameter, which stretches or compresses the distribution along the x axis.

A gamma distribution with an integer value of α is also called an Erlang distribution. The Erlang distribution is used very widely in teletraffic engineering.


Exponential distribution

A special case of the gamma distribution with α = 1 is the exponential distribution. The exponential density is therefore

f(x) = (1/β) e^(−x/β),  x > 0, β > 0

The graphical representation of the exponential distribution is shown in figures 5 and 6. The exponential distribution often arises in practice as the distribution of the amount of time until some specific event occurs. For instance, the amount of time (starting from now) until an earthquake occurs, or until a new epidemic breaks out, or until a telephone call you receive turns out to be a wrong number, are all random variables that tend in practice to have exponential distributions.

The cumulative distribution function is F(x) = 1 − e^(−x/β).

Figure 6. Gamma distribution for different values of β, with (α, β) = (1, 1), (1, 2), (1, 3), (3, 1), (3, 2), (3, 3).

The moment generating function, m_x(t) = 1/(1 − βt), t < 1/β
Mean, μ = β
Variance, σ² = β²

Example 6.6
Defective parts are produced at a factory on average every 5.3 minutes. If it is assumed that the time between the production of two defective parts follows an exponential distribution, find the probability that the time between the production of two defective items would be
(a) less than 3.5 minutes,
(b) more than 2 minutes,
(c) between 2.5 and 4.5 minutes.

Since we have been told that the distribution is exponential with mean β = 5.3, we can write the density as

f(x) = (1/5.3) e^(−x/5.3)

The cumulative distribution is

F(x) = 1 − e^(−x/5.3)

With these tools, we can start solving the problems:
(a) p(x < 3.5) = F(3.5) = 1 − e^(−3.5/5.3) = 0.4833
(b) p(x > 2.0) = 1 − F(2.0) = e^(−2.0/5.3) = 0.6857
(c) p(2.5 < x < 4.5) = F(4.5) − F(2.5) = e^(−2.5/5.3) − e^(−4.5/5.3) = 0.1961
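A short computation reproducing the three answers (not part of the original notes; the function name `F` is ours):

```python
import math

# Example 6.6: exponential waiting times with mean beta = 5.3 minutes.
def F(x, beta=5.3):
    # Cumulative distribution F(x) = 1 - e^(-x/beta).
    return 1.0 - math.exp(-x / beta)

print(round(F(3.5), 4))            # (a) p(x < 3.5) ~ 0.4833
print(round(1.0 - F(2.0), 4))      # (b) p(x > 2.0) ~ 0.6857
print(round(F(4.5) - F(2.5), 4))   # (c) p(2.5 < x < 4.5) ~ 0.1961
```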

Example 6.7
The waiting time at gas stations is assumed to follow an exponential distribution. At a particular gas station, the average waiting time is 3.8 minutes. Find the probability that
(a) the waiting time would be more than 1 minute,
(b) the waiting time would be more than 2 minutes,
(c) the waiting time would be more than 3 minutes,
(d) the waiting time would be more than 3 minutes, given that the waiting time is more than 2 minutes.

The probability density function is

f(x) = (1/3.8) e^(−x/3.8)

The cumulative distribution is

F(x) = 1 − e^(−x/3.8)

(a) p(x > 1.0) = 1 − F(1.0) = e^(−1.0/3.8) = 0.7686
(b) p(x > 2.0) = 1 − F(2.0) = e^(−2.0/3.8) = 0.5908
(c) p(x > 3.0) = 1 − F(3.0) = e^(−3.0/3.8) = 0.4540
(d) p(x > 3.0 | x > 2.0) = p(x > 3.0)/p(x > 2.0) = [1 − F(3.0)]/[1 − F(2.0)] = 0.4540/0.5908 = 0.7685


The last result requires a little interpretation. We were asked to find the probability that the waiting time would be more than 3 minutes, given that we have already waited for 2 minutes. That is, after a wait of 2 minutes, we have been asked for the probability of waiting at least 1 more minute. This turns out to be exactly the same as the probability of a waiting time of more than 1 minute when we are given no information about having already waited. It is as if the distribution has forgotten that it has already waited for 2 minutes. This is called the memoryless property of the exponential distribution. In general, the memoryless property is

p(x > s + t | x > s) = p(x > t)
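The memoryless property can be illustrated numerically with the β = 3.8 density of example 6.7 (a sketch, not part of the original notes; names are ours):

```python
import math

# Memoryless check: p(x > s+t | x > s) should equal p(x > t).
beta = 3.8

def tail(x):
    # p(X > x) = e^(-x/beta) for the exponential distribution.
    return math.exp(-x / beta)

s, t = 2.0, 1.0
conditional = tail(s + t) / tail(s)          # p(x > s+t | x > s)
print(round(conditional, 4), round(tail(t), 4))   # both ~0.7686
```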

The Poisson and exponential distributions have a very close relationship. Both are used to model queuing systems: the Poisson distribution models the rate at which events happen, while the exponential distribution models the time between two successive events. For example, we may model the number of customers arriving at a service center per unit time using a Poisson distribution; the time between the arrivals of two customers would then follow an exponential distribution. It should be understood that neither the Poisson behavior of the counts nor the exponential behavior of the gaps can be taken for granted for a given system. What can be taken for granted is their equivalence: if the arrival counts follow a Poisson distribution with mean arrival rate λ, the inter-arrival time follows an exponential distribution with mean inter-arrival time β = 1/λ; and conversely, if the inter-arrival time is exponentially distributed with mean β, the arrival counts follow a Poisson distribution with mean arrival rate λ = 1/β.

In fact, it is possible to derive the expression of the exponential distribution starting from the Poisson distribution. If events occur at rate λ, the waiting time T until the next event exceeds t exactly when no event occurs in an interval of length t. The Poisson probability of zero events in such an interval is e^(−λt), so p(T > t) = e^(−λt), and F(t) = 1 − e^(−λt), which is the exponential cumulative distribution with mean 1/λ.
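This duality can be seen in simulation: draw exponential gaps at rate λ, and the counts per unit window come out with mean λ while the gaps average 1/λ. A sketch (not part of the original notes; seed and sample size are arbitrary choices of ours):

```python
import random

# Simulate arrivals at rate lam = 6.5 per minute by drawing exponential
# gaps, then count arrivals per one-minute window.
random.seed(7)
lam = 6.5
gaps = [random.expovariate(lam) for _ in range(100_000)]
mean_gap = sum(gaps) / len(gaps)

counts, t, window, c = [], 0.0, 1.0, 0
for g in gaps:
    t += g
    while t > window:          # close every window the arrival skips past
        counts.append(c)
        c, window = 0, window + 1.0
    c += 1                     # the arrival lands in the current window

mean_count = sum(counts) / len(counts)
print(round(mean_gap, 3))      # ~1/6.5 ~ 0.154 minutes
print(round(mean_count, 2))    # ~6.5 arrivals per minute
```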

Example 6.8
Cars arrive at a toll plaza at a rate of 6.5 cars/min, and the arrivals are assumed to follow a Poisson distribution. Find the probability that (a) in any given minute 4 to 6 cars will arrive; (b) the gap between two successive arrivals is less than 15 secs.

(a) We have been asked to find p(4 ≤ x ≤ 6) = p(x = 4) + p(x = 5) + p(x = 6)

p = e^(−6.5) 6.5⁴/4! + e^(−6.5) 6.5⁵/5! + e^(−6.5) 6.5⁶/6! = 0.4146

(b) This is a problem of exponential distribution. Hence β = 1/6.5 = 0.1538 min = 9.23 sec. Therefore p(t < 15) = F(15) = 1 − e^(−15.0/9.23) = 0.8031
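Both parts can be reproduced in a few lines (a sketch, not part of the original notes; variable names are ours):

```python
import math

# Example 6.8: (a) Poisson count probability, (b) exponential gap.
lam = 6.5   # cars per minute

# (a) p(4 <= x <= 6) for a Poisson count with mean 6.5
p_a = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in (4, 5, 6))
print(round(p_a, 3))    # ~0.415

# (b) p(gap < 15 s); mean gap = 1/6.5 min, expressed in seconds
beta = 60.0 / lam       # ~9.23 s
p_b = 1.0 - math.exp(-15.0 / beta)
print(round(p_b, 4))    # ~0.8031
```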


Chi-square distribution

The gamma distribution gives rise to another important family of random variables, namely,
the chi-squared family. This distribution is used extensively in statistics. Among other things,
it provides the basis for making inferences about the variance of a population based on a
sample. We will just state the definition of this distribution here, and leave the issue of
showing its application for later.

Definition. Let x be a gamma random variable with β = 2 and α = γ/2, where γ is a positive integer. Then x is said to have a chi-squared distribution with γ degrees of freedom. We denote this variable by χ²_γ.

We will end the discussion of this distribution with the statement that, with this distribution, we will normally be finding probabilities of the form p(χ²_γ > r).


Normal distribution

The normal distribution underlies many of the statistical methods used in data analysis. It was first described by de Moivre in 1733 as the limiting form of the binomial distribution as the number of trials approaches infinity. It was described again by Laplace and Gauss about half a century later, as they were trying to model errors in astronomical measurements. This distribution is often referred to as the Gaussian distribution.

Definition. A random variable x with density

f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²))

is said to have a normal distribution with mean μ and standard deviation σ.

It is not within the scope of our lectures, but it can be proven, theoretically (using polar coordinates) as well as numerically, that

∫_(−∞)^(∞) (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)) dx = 1

The moment generating function is

m_x(t) = e^(μt + σ²t²/2)

The distribution is shown in figure 7.

Figure 7. The normal distribution.

Standard normal

It can be observed that the normal distribution depends upon the mean, μ, and standard deviation, σ, of the random variable. It is also understood that there are infinitely many possibilities for both the mean and the standard deviation. This means that it would be necessary to perform a separate integration for each pair of mean and standard deviation. To avoid this situation, we perform a transformation on the normal distribution using

z = (x − μ)/σ

This transformation converts every normal distribution into a normal distribution with mean 0 and standard deviation 1. The new variable z is called the standard normal. The standard normal denotes the deviation from the mean in units of the standard deviation. For example, if a particular value x0 of the random variable x corresponds to a standard normal value z0, then x0 is z0 standard deviations away from the mean of the random variable x.

The area under the standard normal curve is usually calculated numerically, and the values appear in tabular form. An example is shown in Table 1, which tabulates the cumulative area

∫_(−∞)^(z) (1/√(2π)) e^(−z²/2) dz

Table 1. Tabulated values of the standard normal.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-2.9 0.00187 0.00181 0.00176 0.00170 0.00165 0.00159 0.00154 0.00149 0.00145 0.00140
-2.8 0.00256 0.00249 0.00241 0.00234 0.00226 0.00219 0.00213 0.00206 0.00200 0.00193
-2.7 0.00348 0.00337 0.00327 0.00318 0.00308 0.00299 0.00290 0.00281 0.00273 0.00264
-2.6 0.00468 0.00454 0.00441 0.00428 0.00416 0.00404 0.00392 0.00380 0.00369 0.00358
-2.5 0.00623 0.00605 0.00588 0.00572 0.00556 0.00540 0.00525 0.00510 0.00495 0.00481

For the purpose of solving problems, values are read directly from the table.

Example 6.9
The Transient Voltage Suppressor diode 1.5KE6V8A is supposed to have a Reverse Standoff Voltage (V_RWM) of 5.8 V. A random sample of this diode collected from different manufacturers shows a mean of 5.83 V with a standard deviation of 0.18 V. If the reverse standoff voltage is known to follow a normal distribution, find the probability that for a diode selected at random, V_RWM would be found to be between 5.7 V and 5.9 V.


We have been given a problem with mean 5.83 and standard deviation 0.18, and we have been asked to find p(5.7 < x < 5.9). To find the probability, we convert the variable x into the standard normal z. In terms of the standard normal, we have to find

p( (5.7 − 5.83)/0.18 < z < (5.9 − 5.83)/0.18 ) = p(−0.72 < z < 0.39)
= p(z < 0.39) − p(z < −0.72)

From the standard normal table, this can be found to be

= 0.6517 − 0.2358 = 0.4159

Therefore, p(5.7 < x < 5.9) = 0.4159.
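Instead of a printed table, the standard normal area can be computed with the error function, since Φ(z) = (1 + erf(z/√2))/2. A sketch reproducing example 6.9 (not part of the original notes; the helper name `phi` is ours):

```python
import math

# Standard normal CDF via the error function.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 5.83, 0.18
p = phi((5.9 - mu) / sigma) - phi((5.7 - mu) / sigma)
print(round(p, 4))   # ~0.416, matching the table value 0.4159
```

The small difference from 0.4159 comes from the table's rounding of z to two decimal places.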

It is extremely instructive to pay attention to the values in the standard normal table. This
table can have many variations all related to each other. Because the normal curve is
symmetrical, the z values may start from 0.0 instead of some negative value. In this case, the
area would start with 0.5.

Sometimes, it may also be necessary to read the z value from the table.

Example 6.10
In very large classes, the grades of the students are known to follow a normal distribution. In a particular class of more than 500 students, the mean mark is 64 with standard deviation 9.1. If the professor has decided that he will give the top 15% the grade A, and the bottom 15% the grade F, what are the cut-off marks for the A and F grades?

Here we are required to find the z values such that the area under the curve to the left is 0.15, and the area to the right is 0.15.

For the area on the left, we look for 0.15 within the body of the table, and we find the corresponding z to be approximately −1.04. Therefore (x − μ)/σ = −1.04. We have been given μ = 64 and σ = 9.1. Therefore, we obtain x = 64 − 1.04 × 9.1 ≈ 54.5 as the cut-off mark for the F grade.

Similarly, considering (x − μ)/σ = +1.04 leads us to x = 64 + 1.04 × 9.1 ≈ 73.5 as the cut-off mark for the A grade.
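Reading the table "backwards" amounts to inverting Φ, which can be done numerically, e.g. by bisection, since Φ is increasing. A sketch (not part of the original notes; helper names are ours):

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi_inv(p, lo=-10.0, hi=10.0):
    # Bisection works because phi is strictly increasing.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

z = phi_inv(0.15)
print(round(z, 2))                   # ~ -1.04
mu, sigma = 64.0, 9.1
print(round(mu + z * sigma, 1), round(mu - z * sigma, 1))   # F and A cut-offs
```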


Natural processes and six sigma

It is very common to assume that natural, and even industrial, processes follow a normal distribution. As we saw in example 6.9, the reverse standoff voltage was assumed to follow a normal distribution. The normal distribution has a very interesting feature that becomes important in many real-life situations. From the standard normal table, we see that the area under the normal curve between z = −3 and z = +3 is 0.9974. This means that for the random variable under consideration, 99.7% of all values lie within −3 to +3 standard deviations of the mean. Roughly speaking, almost all the data should be within ±3σ of the mean.

Lecture 6
15 of 17
This can be used for many purposes. For example, after normalization, if a particular data point falls far outside (−3, +3), the validity of the data may be questioned, or the point may be considered an outlier. Another way to use this information: if it is possible to estimate the maximum and minimum values of the variable, we can divide the difference between the maximum and the minimum by 6 to estimate the standard deviation, and take their midpoint to estimate the mean. With these two, we have a rough distribution of the variable, which helps us find many other properties of the variable.

Example 6.11
For the Transient Voltage Suppressor diode 1.5KE6V8A we saw in example 6.9, suppose that the Maximum Peak Pulse Current (I_PP) ranges from 140 A to 145 A. Find the probability that for a diode selected at random, I_PP would be more than 143 A.

We will assume that the Maximum Peak Pulse Current follows a normal distribution. From the discussion on the range of a random variable and its relationship with the mean and standard deviation, we can estimate

Mean, μ = (140 + 145)/2 = 142.5
Standard deviation, σ = (145 − 140)/6 = 0.8333

The standard normal corresponding to x = 143 is z = (143 − 142.5)/0.8333 = 0.6.
From the normal table, p(z > 0.6) = 0.2743.
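The whole estimate-from-range procedure fits in one small function (a sketch, not part of the original notes; the function name is ours):

```python
import math

# Estimate mu and sigma from an assumed (min, max) range via the
# +/-3 sigma rule of thumb, then return the upper-tail probability.
def tail_prob(lo, hi, x):
    mu = (lo + hi) / 2.0
    sigma = (hi - lo) / 6.0
    z = (x - mu) / sigma
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))   # p(X > x)

print(round(tail_prob(140.0, 145.0, 143.0), 4))   # ~0.2743
```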

Example 6.12
In Example 4.1 of lecture 4 we saw the maximum and minimum temperatures in July in Dhaka from 1930 till 1990. The minimum maximum-temperature was 29.8°C and the maximum maximum-temperature was 32.4°C. What percentage of years had temperature more than 32°C?

We assume that the behavior of the maximum temperature over the years follows a normal distribution. The mean and the standard deviation can be estimated as

Mean, μ = (29.8 + 32.4)/2 = 31.1
Standard deviation, σ = (32.4 − 29.8)/6 = 0.433

Therefore, the standard normal for x = 32 is z = (32 − 31.1)/0.433 = 2.08.
From the standard normal table, p(z > 2.08) = 0.0188.
Inspecting the table shows that we have data for 54 years between 1931 and 1990. Therefore the number of years with temperature more than (or equal to) 32°C would be approximately 0.0188 × 54 = 1.01, or approximately 1 year.

In reality, there were 9 years!

Conclusion: our fundamental premise that the behavior of the maximum temperature over the years follows a normal distribution is incorrect!



Normal approximation of binomial distribution

Very often the normal distribution, though a continuous distribution, is used to approximate discrete distributions. This is especially true for the binomial distribution when the sample size, n, and the value of the random variable, x, are large. The connection between the two distributions is made through their means and standard deviations. We learnt in the last lecture that the mean of the binomial distribution is np, and the standard deviation is √(np(1 − p)). To approximate a binomial process using the normal distribution, we take the mean of the normal distribution to be np, and the standard deviation to be √(np(1 − p)).

Example 6.13
A study is performed to investigate the effect of stormy weather on the quality of cell phone audio. A sample of audio collected during stormy weather shows that 45% of the samples have more noise than the accepted level. In a randomly collected set of 30 samples during stormy weather, what is the probability that more than 10 samples would have noise above the accepted level?

In this problem, the probability that a sample collected during stormy weather has more noise than the accepted level is p = 0.45. The number of samples is n = 30, and the value of the random variable is x = 10. We have to find p(x > 10). We will use the normal approximation to the binomial distribution.

First, though, let us establish that this is, in fact, a problem of binomial distribution. A randomly collected sample of audio during stormy weather would be either clean or noisy. Therefore, the quality check is a Bernoulli trial, and hence this is a problem of binomial distribution.

To use the normal distribution for this problem, the mean is μ = 0.45 × 30 = 13.5, and the standard deviation is σ = √(30 × 0.45 × (1 − 0.45)) = 2.725. For the value x = 10, the standard normal is z = (10 − 13.5)/2.725 = −1.28. Therefore, p(x > 10) = p(z > −1.28) = 1 − 0.1003 = 0.8997.
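Because the binomial probabilities themselves are easy to compute, the quality of the approximation can be checked directly (a sketch, not part of the original notes; no continuity correction is applied, matching the example):

```python
import math

# Exact binomial tail p(x > 10) for n = 30, p = 0.45, vs the
# normal approximation used above.
n, p = 30, 0.45
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
            for k in range(11, n + 1))

mu = n * p
sigma = math.sqrt(n * p * (1 - p))
z = (10 - mu) / sigma
approx = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))   # p(z > -1.28)

print(round(exact, 4))    # exact binomial tail
print(round(approx, 4))   # ~0.90
```

The gap between the two values is what a continuity correction (using x = 10.5 instead of 10) is designed to shrink.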


Other distributions

Continuous distributions are extensively used in all branches and fields of engineering. Even when a problem is discrete in nature, continuous distributions are often still used for their flexibility and simplicity of use, since the use of most of these distributions is based upon tabulated values. We state just a few of these distributions below without going into any detail. All of these distributions have specialized applications, which are mentioned alongside them. Most of these distributions have a substantial theoretical basis, which is skipped here. All of these distributions appeared as the result of attempts to solve some fundamental problem.



Distribution: t-distribution
Properties: f(t) = Γ((n + 1)/2) / (√(nπ) Γ(n/2)) · (1 + t²/n)^(−(n+1)/2), with n degrees of freedom.
Remarks: Used widely in two applications. First, used instead of the normal distribution when the sample size is small. Also used for the comparison of two distributions.

Distribution: F distribution
Properties: F_(n,m) = (χ²_n / n) / (χ²_m / m), the F distribution with n and m degrees of freedom.
Remarks: The F-distribution arises frequently as the null distribution of a test statistic, especially in likelihood-ratio tests, perhaps most notably in the analysis of variance (ANOVA).

Distribution: Weibull distribution
Properties: f(x) = (k/λ)(x/λ)^(k−1) e^(−(x/λ)^k), where k is the shape parameter and λ is the scale parameter.
Remarks: Widely used in reliability engineering; also for modeling traffic systems. k is a measure of failure rate, and λ is a measure of average life.

Distribution: Logistic distribution
Properties: f(x) = e^(−(x−μ)/s) / ( s [1 + e^(−(x−μ)/s)]² )
Remarks: Used widely to model knowledge-based systems, such as neural networks.

Distribution: Rayleigh distribution
Properties: f(x) = (x/σ²) e^(−x²/(2σ²))
Remarks: A variation of the Weibull distribution. Used widely in communication engineering.

Distribution: Cauchy distribution (Lorentz distribution)
Properties: f(x) = (γ/π) / ((x − x0)² + γ²)
Remarks: Used widely in the study of resonance and spectroscopy.
