
UNIT 1 INTRODUCTION TO PROBABILITY
Structure
1.1 Introduction
Objectives
1.2 Why Probability?
1.2.1 Extent and Scope of Probability
1.2.2 History of Probability
1.3 Sample Space and Events
1.3.1 Counting Methods
1.3.2 Random Experiments and Sample Spaces
1.3.3 Sets and Events
1.3.4 Disjoint or Mutually Exclusive Sets
1.3.5 Classes of Events
1.4 Definitions of Probability
1.4.1 Relative Frequency Definition
1.4.2 Subjective Definition
1.5 Probability Models
1.5.1 Examples of Some Simple Probability Models
1.5.2 Model Building from Experimental and Theoretical Considerations
1.6 Summary
1.7 Solutions/Answers
1.1 INTRODUCTION
Probability theory plays a very important role in many areas of physical, social,
biological, engineering and management sciences. It lays the foundations for a
systematic study of mathematical statistics. Games of chance, number of accidents,
birth and death rates, system reliability and expected gain in a business venture are
some examples where the probability concepts are used.
In probability theory, we are usually interested in the occurrence or non-occurrence
of some events. There are several ways of defining the probability of an event.
Various definitions of probability are, in general, consistent with one another. In
this unit, you will learn about these definitions. You will also learn about ways
of calculating the probability of events in simple cases and methods of building
some simple probability models.
Objectives
After reading this unit, you should be able to
* perform operations with sets
* state probability axioms for events of a sample space
* evaluate the probability of events in simple cases, and
* specify and build some simple probability models.
1.2 WHY PROBABILITY?
What is Probability?
Initial studies in probability theory originated from calculation of gambling odds.
Probability Concepts
Slowly over a period of years, it found its applications in other areas where the
outcome of an individual random experiment cannot be predicted with certainty. In
many cases some events occur more often than others and it is desirable to attach a
quantitative measure to various events of a random experiment. For example, if you
toss five coins, you may like to assign a quantitative measure to the following two
events:
1. the total number of heads observed is more than the total number of tails,
and
2. either two or three heads are observed.
Similarly, before the birth of a baby, a doctor may like to assign a quantitative
measure to the event that the weight of the baby is less than 1.5 kg.
Probability is such a quantitative measure. It is measured on a scale from 0 to 1,
although sometimes in everyday life we also express it as a percentage after
multiplying it by 100. Thus, you may say that the chances that a team will win a
particular tournament are 35%. The probability "p" of an event also reflects the
degree of belief, which you have in the occurrence of that event. A high value of
"p" indicates that you are almost certain that the event will occur, whereas a low
value of "p" indicates that the event is almost impossible. Using appropriate
analysis, if you find that the probability that a given dam will develop a major
structural defect in the next 50 years is 0.001, then you are almost certain that this
event will not occur and that the design is a safe one. You have thus seen that the
probability of an event is a quantitative measure showing the degree of belief which
one has in the occurrence or non-occurrence of the event under consideration.
1.2.1 Extent and Scope of Probability
Probability theory has many applications in several areas of physical, social,
biological, engineering, business and management sciences. In order to cover such
diverse fields, we require its systematic study rather than studying it just as a game
of chance. There are several examples in experimental studies, where an experiment
can be repeated a large number of times under identical conditions. In such cases,
events exhibit a statistical regularity. Thus, if a coin is tossed a large number of
times, it is observed that the proportion of heads fluctuates around a fixed value p,
which can be called the probability of getting a head in a single toss of this coin.
The value of p may not necessarily be equal to 0.5. As you will see later in this unit,
this relative frequency interpretation of probability is not enough in many cases.
Nowadays, mathematicians study the subject from a theoretical point of view
without bothering about actual interpretation of the probability of an event. As you
know, any mathematical theory needs some basic assumptions and formal
definitions. Likewise, the modern theory of probability is based on some axioms.
Surprisingly, this axiomatic approach enables us to solve typical games of chance and
other similar problems as special cases and helps us in getting a clearer picture of the
overall situation.
As you know by now, probability is a quantitative measure. It, therefore, allows
us to compare the chances of occurrence of two or more events. Thus, an insurance
company may like to compare the survival probabilities of 40-year-old salaried
persons and 40-year-old self-employed persons in order to offer lower premiums to
one category. Such a quantification also serves as a theoretical basis for studying
mathematical statistics and further analysis. For example, a power generation
company may like to estimate the power requirements of a region after 10 years, or
a businessman may like to estimate his expected profit in a new company.
1.2.2 History of Probability
The history of probability theory dates back to the 17th century. During that period,
the classical theory of probability was propounded by several distinguished
scientists. Among others, Pascal, Fermat, Huygens and Jakob Bernoulli applied it to
games of chance and obtained numerical values of probability of various events by
using the classical theory.
The relative frequency definition of probability of events which show statistical
regularity in repeated experiments is due to Richard von Mises. He developed this
theory around the year 1921. At that time, the classical theory was prevalent and the
relative frequency approach provided a new dimension. This definition is quite
popular among engineers and experimental scientists as it gives a physical
interpretation to the occurrence of an event.
Both classical and relative frequency definitions give a method of assigning a
numerical value to the probability of an event. This can also be done simply by
interpreting probability as the degree of belief according to the subjective opinion
of a person. For example, the event that a particular team will win a tournament
may be assigned a probability 0.65 by an expert of the game. Such probabilities
when assigned are called subjective probabilities.
The modern theory of probability owes a great deal to the work of several Russian
mathematicians, notably A. Kolmogorov, who gave the axiomatic definition of
probability in the year 1933. According to this, the probability of an event satisfies
three axioms without any concern for its physical interpretation. As you will find
out shortly, it provides a solid foundation for a deeper study of probability and
statistics.
1.3 SAMPLE SPACE AND EVENTS
1.3.1 Counting Methods
In several cases, the total number of outcomes of a random experiment is finite
and one has to find this number. Further, the total number of outcomes which are
favourable for the occurrence of some event is also needed in many cases. You
will now learn a systematic way of calculating these numbers by simple
combinatorial arguments. The main argument is based on the following
multiplication rule.
The Multiplication Rule :
If an operation can be performed in 2 steps, out of which the first can be
performed in m different ways and then for each of these, the second can be
performed in n different ways, then the entire operation can be performed in
mn different ways.
Observe that if the m different ways of doing the first operation are a1, ..., am and
the n different ways of doing the second operation are b1, b2, ..., bn, then
symbolically the mn different ways of doing the entire operation can be written
as shown below :

( a1, b1 ), ( a1, b2 ), ..., ( a1, bn )
( a2, b1 ), ( a2, b2 ), ..., ( a2, bn )
......................
( am, b1 ), ( am, b2 ), ..., ( am, bn )
You may like to extend this result to an operation involving k steps.
Permutations and Combinations :
The words permutation and combination relate to the possible arrangements
which can be formed from a given set of objects. A permutation is an
arrangement of objects in a definite order, whereas a combination is the set of
objects itself without considering the order.
Suppose

A = { a1, a2, ..., an }

is a set of n distinct objects. The total number of different permutations of
these n objects is n ( n - 1 ) ( n - 2 ) .... 2 . 1. This product is usually denoted
by n! ( read as n factorial ).

Let nPr denote the total number of different permutations which can be formed
from the objects of the set A taken r at a time ( r ≤ n ), that is, different
ordered arrangements containing exactly r objects out of a1, a2, ..., an. Then

nPr = n! / ( n - r )!,

where 0! is defined as equal to 1.
In order to establish this result, we apply the multiplication rule successively in
r steps. Denote a typical permutation by an r-tuple ( x1, x2, ..., xr ), where xi
stands for the object appearing in the ith position. The first position can be
filled in n ways by any of the n objects. Suppose this object is aj1.
Consequently there are n - 1 remaining objects

a1, a2, ..., aj1-1, aj1+1, ..., an.

This gives ( n - 1 ) choices for x2. By the multiplication rule, the first two
positions can be filled in n ( n - 1 ) ways. After filling the first two positions,
we can fill the third with any one of the remaining ( n - 2 ) objects, and so on.
By successive use of the multiplication rule, we find the total number of
r-tuples as

n ( n - 1 ) ( n - 2 ) .... ( n - r + 1 ) = n! / ( n - r )!.
Example 1 :

Find the total number of 3 digit numbers which can be formed by using the
digits of the set { 1, 2, 3, 4, 5 } allowing (i) no repetition (ii) repetition.

Solution :

(i) In case no repetition is allowed, the total number of 3 digit numbers
which can be formed by using the digits of the set { 1, 2, 3, 4, 5 } is

5P3 = 5! / 2! = 60.

(ii) In case repetition is allowed, the total number of 3 digit numbers is

5 . 5 . 5 = 5^3 = 125.
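You may like to verify both counts by brute-force enumeration. The following Python sketch (Python is not part of this unit; it is given purely as an illustration) lists the arrangements directly:

```python
from itertools import permutations, product

digits = [1, 2, 3, 4, 5]

# (i) no repetition: ordered 3-tuples of distinct digits, i.e. 5P3 of them
no_repetition = list(permutations(digits, 3))

# (ii) repetition allowed: every 3-tuple over the digits, i.e. 5**3 of them
with_repetition = list(product(digits, repeat=3))

print(len(no_repetition), len(with_repetition))  # 60 125
```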
Let C( n, r ) denote the total number of different combinations which can be
formed from the objects of the set A taken r at a time ( r ≤ n ), that is,
different sets containing exactly r objects out of a1, a2, ..., an. For evaluating
C( n, r ), you may note that each set contains r objects, and the sets are not ordered.
Now every distinct combination of r objects leads to a total of r! different
permutations. Hence, we must have

nPr = C( n, r ) . r!

and hence

C( n, r ) = n! / ( r! ( n - r )! ).

This number of combinations is also known as the binomial coefficient
since for a positive integer n

( a + b )^n = Σ C( n, r ) a^r b^(n - r), the sum being over r = 0, 1, ..., n,

on using the binomial expansion of ( a + b )^n.
Example 2 :

Ten optional courses are available in a semester, and a student has to enrol for
3 of these courses. Find the number of possible combinations available to the
student.

Solution :

By definition, the required number is

C( 10, 3 ) = 10! / ( 3! 7! ) = 120.
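Modern languages provide these counting functions directly. For instance, in Python (illustrative only), math.comb and math.perm compute C( n, r ) and nPr, and the identity nPr = C( n, r ) . r! can be checked at the same time:

```python
from math import comb, perm, factorial

# number of ways to choose 3 of 10 courses (order irrelevant)
n_choices = comb(10, 3)

# cross-check against the identity nPr = C(n, r) * r!
assert perm(10, 3) == n_choices * factorial(3)

print(n_choices)  # 120
```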
Permutation involving alike objects :

Suppose we have a set of n objects in which several objects are alike. Suppose
that the objects are of k different types with ni objects of the ith type, i.e.

n1 + n2 + ... + nk = n.

We want to find the total number of arrangements which are possible with
these n objects. The total number of permutations is now not n! even though
we have n objects. This is because the objects which are alike can be permuted
among themselves without altering the arrangements. Thus the total number of
arrangements is

n! / ( n1! n2! ... nk! ).

This number is usually denoted by C( n ; n1, n2, ..., nk ) and is known as the
multinomial coefficient.
Example 3 :

Three red balls, two white balls and four black balls are arranged in a row. If
all balls of same colour are alike, then find the total number of distinct possible
arrangements.

Solution :

By using the concept of permutation involving alike objects, the required
number is :

9! / ( 3! 2! 4! ) = 1260.
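The count can be cross-checked by brute force, since nine objects are few enough to enumerate. A Python sketch (illustrative only):

```python
from math import factorial
from itertools import permutations

# 9!/(3! 2! 4!) for 3 red, 2 white and 4 black balls
count = factorial(9) // (factorial(3) * factorial(2) * factorial(4))

# brute-force check: the distinct orderings of the multiset of balls
balls = "RRRWWBBBB"
distinct = len(set(permutations(balls)))

print(count, distinct)  # 1260 1260
```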
E1

Find the total number of ways in which a committee of 3 persons can be
formed out of a group of 8 persons.
E2

From eight different flags, how many signals made up of three flags may be
obtained?
1.3.2 Random Experiments and Sample Spaces
As you know, an experiment is to do or observe something happen under certain
conditions. This results in some final outcome. In very few experiments can one
predict the final outcome with certainty; for example, a particle with an initial
velocity of u m/sec and an acceleration of f m/sec^2 will attain a velocity v = u + ft
after t seconds. Such experiments are called deterministic. However, in most such
cases ideal conditions are assumed and the element of chance is ignored for the sake
of simplicity.
On the other hand, in a random experiment all possible outcomes are known, but it
is not possible to predict the outcome of an individual experiment in advance. The
theory of probability deals with random experiments, which we shall simply call
experiments. Such experiments also include conceptual experiments, which need
not be actually carried out, such as tossing a coin an infinite number of times.
The collection of all possible outcomes of an experiment E is called the sample
space associated with E and is denoted by S. An individual outcome of E is called
an elementary event or a sample point or just a point and is denoted by s.
Sample spaces are conveniently classified according to the number of points they
contain. Thus a sample space S having a finite number of points is often termed as
a finite sample space. If S has a finite or a countable number of points, then we call
the sample space S discrete. Finally, a sample space with more than a countable
number of points, such as all points of the interval [ 0, 1 ], is called a continuous
sample space.
Example of Random Experiments and Sample Spaces
1)
The tossing of a coin once is a random experiment, since we cannot
predict the final outcome in advance. The final outcome may either be a
head H or a tail T, where the possibilities that the coin may stand on its
edge or roll away or may not fall back on ground are ignored.
Consequently
S = {H, T}.
2)
On throwing a die once and recording the number appearing on the
uppermost face, the sample space is S = {1, 2, 3, 4, 5, 6}.
3)
On tossing a coin 3 times, the sample space has 2^3 = 8 points which can
be conveniently represented by
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT},
where the symbol at ith place represents the outcome of the ith toss.
4)
Conceptually, we may continue tossing a coin an infinite number of
times. An elementary event now looks like HTHHTH ... ad infinitum, and
the sample space S contains all such infinite sequences of H's and T's.
Here, we can never observe even a single outcome of this experiment.
In spite of such limitations, such sample spaces are useful from a conceptual
point of view. You will later see that several important and useful results
of probability theory such as laws of large numbers, central limit
theorems etc. are concerned with such experiments.
5)

If the total number of accidents reported during a week in a factory is
observed, the sample space S can be taken as

S = { 0, 1, 2, ... },

the set of all non-negative integers.
6)

The sample space for the total rainfall recorded in a city during a year is
the set of all non-negative real numbers, that is

S = { x : x ≥ 0 }.

For the sake of convenience this is usually extended to R, the set of all
real numbers, even though a negative rainfall cannot be recorded. Such
extensions of S are often quite useful.
7)

Water level of two rivers above (or below) a fixed point is recorded. The
sample space is

S = R^2,

where R^2 is the set of all points of the real plane.
E3

Write down the sample space S associated with tossing a coin 4 times. Find the
total number of points in S.
E4

A coin is tossed till a head is obtained. However, the experiment is stopped if
no head is observed in the first four tosses. List all points of the sample space.
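For small experiments such as the one in E3, the sample space can be generated exhaustively on a computer. The Python sketch below (illustrative only) builds the space for four tosses of a coin:

```python
from itertools import product

# sample space for tossing a coin 4 times: all length-4 strings over {H, T}
S = ["".join(outcome) for outcome in product("HT", repeat=4)]

print(len(S))  # 16, i.e. 2**4 points
```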
1.3.3 Sets and Events
You now know about the sample space S. In general, we are only interested in some
subsets of S and not in the whole space S. All those subsets of S which are of our
interest are termed as events. Events will be described either by specifying some
property or by listing the various elements of S. For example, if a coin is tossed 3
times, then the subset
A = {HHT, HTH, THH} is an event, which can be described by the property that
exactly 2 heads appear in 3 tosses.
Events are thus subsets of points of S in which we are interested. In order to work
effectively with them we require some basic concepts of set theory. To refresh
your memory, the main set operations defined with reference to a fixed space S are
described here. In this, A, B, C, A1, A2, ... etc. are subsets of S, that is, these are
collections of points of S. The empty set, which contains no point of S, is denoted by ∅.
Inclusion
If all points of a set A are also points of a set B, then A is said to be a subset of
B. Symbolically, this is written as A ⊂ B or B ⊃ A and is read as "A is a
subset of B" or "A is included in B" or "B contains A", etc.
In this terminology, subsets of space S are simply termed as sets of S.
Equality
If A ⊂ B and B ⊂ A, then A and B are said to be equal sets and we write
A = B.
You may like to verify the following properties of the equality relation :

i) A = A (Reflexive)
ii) A = B ⇒ B = A (Symmetry)
iii) A = B, B = C ⇒ A = C (Transitivity)
Complement
The complement A′ (read as A complement) of any set A is the set of all those
points of S which do not belong to A. Thus

A′ = { s : s ∈ S and s ∉ A }.
Union
The union A1 ∪ A2 (read as A1 union A2) of two sets A1 and A2 is the set of
all those points of S which belong to either A1 or A2 (or both).
Thus A1 ∪ A2 = { s : s belongs to at least one of the sets A1 or A2 }. More
generally, the union of sets A1, A2, ..., An is denoted by A1 ∪ A2 ∪ ... ∪ An or
simply by ∪ Ai, and is defined by

∪ Ai = { s : s ∈ Ai for at least one i }.
Intersection

The intersection A1 ∩ A2 (read as A1 intersection A2) of two sets A1 and A2 is
the set of all those points of S which belong to both A1 and A2. Thus

A1 ∩ A2 = { s : s ∈ A1 and s ∈ A2 }.

Similarly, the intersection of sets A1, A2, ..., An is denoted by
A1 ∩ A2 ∩ ... ∩ An or simply by ∩ Ai, and is defined by

∩ Ai = { s : s ∈ Ai for all i }.

These definitions of union and intersection can be extended to any infinite
number of sets of S.
Difference

The difference A - B (read as A minus B) is the set of all those points of A
which do not belong to B. Thus

A - B = { s : s ∈ A and s ∉ B }.

You are urged to verify that

A′ = S - A and
A - B = A ∩ B′.
1.3.4 Disjoint or Mutually Exclusive Sets

If two sets A and B have no common points, then they are said to be disjoint or
mutually exclusive. Thus for mutually exclusive sets A and B we have

A ∩ B = ∅.
The above concepts of set theory can easily be understood in a pictorial manner by
means of a Venn diagram shown in Figure 1.1.
In this figure the set S is represented by a rectangle, and the points of S are the
points in the rectangle. Sets of S such as A1 and A2 are represented by the points
inside closed curves. In this figure, the set A1 ∪ A2 is shown by the shaded area.
Self Assessment Question : The following relations are valid among unions,
intersections and complements and you may like to verify some of them by means
of a Venn diagram.

i) A ∪ A = A, A ∩ A = A (Idempotent)
ii) A ∪ B = B ∪ A, A ∩ B = B ∩ A (Commutative)
iii) ( A ∪ B ) ∪ C = A ∪ ( B ∪ C ) = A ∪ B ∪ C
( A ∩ B ) ∩ C = A ∩ ( B ∩ C ) = A ∩ B ∩ C (Associative)
iv) A ∩ ( B ∪ C ) = ( A ∩ B ) ∪ ( A ∩ C )
A ∪ ( B ∩ C ) = ( A ∪ B ) ∩ ( A ∪ C ) (Distributive)
v) ( A ∪ B )′ = A′ ∩ B′
( A ∩ B )′ = A′ ∪ B′ (De Morgan's rules)
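Since sets are a built-in data type in many programming languages, you may also like to check these identities computationally. The Python sketch below (illustrative only; the small space S and the sets A, B are arbitrary choices) verifies De Morgan's rules and a distributive law:

```python
# verify some set identities on a small space S
S = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

def complement(X):
    # complement taken with respect to the fixed space S
    return S - X

# De Morgan's rules
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# a distributive law: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
C = {7, 8}
assert A & (B | C) == (A & B) | (A & C)

print("all identities hold")
```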
1.3.5 Classes of Events
As stated earlier, events are sets of S which are of our interest. Now, if we are
interested in occurrence of some event E, then we are also interested in the
non-occurrence of E. Thus the complement E' of E is also an event. Likewise,
unions, intersections, differences of events are also events. The collection of events
is called a class of events. For studying probability theory systematically, we need a
class of events A which is non-empty and is closed under countable set operations,
that is, if a countable number of union, intersection and complement operations are
performed with events of the class A, then the resulting event is also a member of the
class A. Such a class is called a σ-field (sigma field). It ensures that after
performing a finite or countable number of operations with events, we again get an
event. This implies that every such class A must have S, the sure event, and ∅, the
impossible event. It may not always be possible to provide a physical interpretation
to all events, as can be seen in the following example :
S = { 0, 1, 2, ..., 10 }.

The events of interest here may be as follows :

A = { 0, 1 }, B = { 2, 3 } and C = { 4, 5, ..., 10 }.

If A occurs, then there is enough power, if B occurs then some load-shedding is
required, and if C occurs then there is a total collapse and no power is generated.
Clearly, the class A of events which is of our interest also includes A′, the event that
there is not enough power. Similarly, we are also interested in B′, C′ etc. You may
like to verify that this class is actually given by

A = { ∅, A, B, C, A′, B′, C′, S },

even though a physical interpretation of some of these events, like ∅, B′, S etc. may
be difficult.
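That this class is closed under complements and unions can be checked mechanically. The Python sketch below (illustrative only; the sets are those of the example above) performs the check:

```python
from itertools import combinations

S = frozenset(range(11))          # S = {0, 1, ..., 10}
A = frozenset({0, 1})
B = frozenset({2, 3})
C = frozenset(range(4, 11))

events = {frozenset(), A, B, C, S - A, S - B, S - C, S}

# closed under complement and under pairwise union?
closed = (all(S - E in events for E in events) and
          all(E | F in events for E, F in combinations(events, 2)))
print(closed)  # True
```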
1.4 DEFINITIONS OF PROBABILITY
Like any other branch of mathematics, the word probability also requires a formal
definition. Over the years, several definitions of probability have emerged. You will
now learn about some of these definitions.
1.4.1 Relative Frequency Definition
This is a popular definition among experimental scientists and engineers. The main
idea behind this definition is the statistical regularity of events. As you have seen,
in a random experiment E one cannot predict whether an event A will occur or not.
However, it has been observed that for a large number of repetitions of experiment E
under identical conditions, the proportion of occurrences of event A fluctuates
around some fixed number p which lies between 0 and 1. In other words, the event
A occurs with some statistical regularity, even though the individual outcome
cannot be predicted. For example, if a coin is tossed a large number of times, then
the proportion of occurrence of heads tends to some fixed number p. We can then
say that the probability of occurrence of head in a single toss is p. A definition of
probability of an event A may be given as follows :
Let A be an event in an experiment E. Suppose it is possible to repeat the
experiment E under identical conditions. If in the first n repetitions of E the event A
occurs n( A ) times, then n( A ) is called the frequency of A and n( A )/n is called the
relative frequency of A. The probability of event A is then defined by

P( A ) = lim ( n → ∞ ) n( A )/n,

provided this limit exists.
In spite of its intuitive appeal, this definition has some obvious limitations. You
might have noticed that P( A ) is defined as a limiting value. Since we cannot
repeat E an infinite number of times, we can never find P( A ). In practice, we
repeat an experiment a large number of times and take the relative frequency n( A )/n
as an approximate value of P( A ). Thus, we may toss a given coin 50,000 times. If
head occurs (event A) 24,683 times, then an approximate value of P( A ) is
24,683/50,000 = 0.49366. It is clear that if the number of tossings is increased to
1,00,000, then this approximate value of P( A ) may change to some other number.
Thus, if the total number of heads in the first 1,00,000 tosses is 50,721, then
P( A ) = 0.50721 approximately.
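The fluctuation of the relative frequency n( A )/n around p can be seen in a small simulation. The Python sketch below (illustrative only; a fair coin with p = 0.5 is assumed, and the seed is arbitrary) prints the relative frequency of heads for increasing n:

```python
import random

random.seed(42)

# estimate P(head) for a fair coin by the relative frequency n(A)/n
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)   # the ratio settles near p = 0.5 as n grows
```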
As a consequence of this definition it follows that for any event A, 0 ≤ P( A ) ≤ 1.
Further, for the sure event S, which occurs in every repetition, P( S ) = 1, and for the
impossible event ∅, which never occurs, P( ∅ ) = 0. However, P( A ) = 0 does not
mean that A is an impossible event. For example, we may have n( A ) = [ √n ],
the integral part of √n. Then P( A ) = lim n( A )/n = 0, but the event A does occur
n( A ) times.
Further, for two disjoint events A and B, we have n( A ∪ B ) = n( A ) + n( B ), so
that P( A ∪ B ) = P( A ) + P( B ).
One major drawback of this definition is that it is not applicable to situations where
an experiment E cannot be repeated under identical conditions, or where E cannot
be repeated at all. For example, if five teams are participating in a tournament, we
cannot repeat the experiment for finding out the probability that a particular team
will win the tournament. Similar is the case with experiments of destructive nature.
This leads us to the next definition of probability.
1.4.2 Subjective Definition
You might have often come across statements like "the probability is 70% that it
will rain tomorrow" or "the odds are 2 : 7 that a particular horse will win a race".
These are the so called subjective probabilities of respective events. According to
this definition, the probability of an event is assigned by an individual according to
his subjective opinion or experience. These probabilities represent the degree of
confidence and belief which one has in the occurrence of an event. Clearly, such
probabilities can be calculated in all cases. It is another matter that different persons
may assign different probability values to the same event.
Axiomatic Definition
In the year 1933, Russian mathematician Kolmogorov gave the modern definition
of probability. In this definition, no attempt is made to give a physical interpretation
to the probability of an event A. It is totally mathematical and is based on three
axioms only. As you know, axioms are self-evident truths and cannot be proved or
disproved.
According to this definition, the probability P ( A ) of an eventA is a function. From
the earlier courses, you know that in order to define a function, we need a domain
space, a range space and a rule which assigns a value to every element of the
domain space. For the probability function, we also need the same things. Now the
domain space is a σ-field A of events, and the range space is the closed interval
[ 0, 1 ] of the real line.
In other words, starting with an experiment E and its associated sample space S, we
first define a σ-field A of events in which we are interested. The probability is then
a real valued function P( . ) defined on A such that the following axioms are
satisfied.

Axiom 1 : P( A ) ≥ 0 for every A ∈ A.
Axiom 2 : P( S ) = 1.
Axiom 3 : For any sequence of disjoint events A1, A2, ... belonging to A,

P( A1 ∪ A2 ∪ ... ) = P( A1 ) + P( A2 ) + ....
A major advantage of this definition is that it is not concerned with assigning
numerical values to probability of events. Such assignments could be done either by
the relative frequency or by any other method. The following properties of the
probability function P ( . ) are easy to prove and you may like to prove some of
them :
i) P( A ) ≤ 1 for every A ∈ A.
ii) P( ∅ ) = 0.
iii) For a finite number of disjoint events A1, A2, ..., Ak belonging to A,

P( A1 ∪ A2 ∪ ... ∪ Ak ) = P( A1 ) + P( A2 ) + ... + P( Ak ).
We now consider a very special case of this definition which is of considerable
practical use. Suppose that the sample space S of an experiment E has a finite
number N of points, say

S = { s1, s2, ..., sN }.

Suppose we are interested in all possible subsets of S, which is also called the
power set of S. This is our σ-field A. Different events belonging to this σ-field are

(1) - the impossible event ∅ containing no points of S,
(2) - the singleton events { s1 }, { s2 }, ..., { sN } containing just one point of S,
(3) - the doubleton events { s1, s2 }, { s1, s3 }, ..., { sN-1, sN } containing exactly
two different points of S, and so on, up to
( N + 1 ) - the sure event { s1, s2, ..., sN } = S, containing all N points of S.

You may like to show that the total number of events is 2^N. We assign an
equal probability 1/N to all singleton events { si }. In other words, all outcomes
s1, s2, ..., sN of the experiment E are assumed to be equally likely, with
probability of an individual outcome as 1/N. Next, if an event A consists of k
distinct points

si1, si2, ..., sik,

then its probability is assigned by Axiom 3, that is

P( A ) = P( { si1 } ) + P( { si2 } ) + ... + P( { sik } ) = k/N.

Thus if a sample space S has N points and an event A has k of them, then

P( A ) = k/N
= (Total number of points favourable to the occurrence of event A) /
(Total number of points in the sample space S).
This is also known as the classical definition of probability. It is thus seen that
the classical method is a very special case of the axiomatic definition. You
have seen that this special case is applicable, whenever the sample space S is
finite and all possible outcomes are equally likely. This is the situation in most
games of chance, where it is reasonable to assume that outcomes are equally
likely. We now illustrate its application by giving the following examples :
Example 4 :
A coin is tossed 3 times. Let A be the event that the total number of heads is 2
and B be the event that the second toss results in a tail. Find P( A ) and P( B ).
Solution :
Here the sample space S is given by :

S = { HHH, HHT, HTH, THH, HTT, THT, TTH, TTT }. Also A = { HHT, HTH, THH }
and B = { HTH, HTT, TTH, TTT }. Assuming all 8 points of S as equally likely
and using the definition of probability we get

P( A ) = 3/8 and P( B ) = 4/8 = 1/2.
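The counting in Example 4 can be reproduced by enumerating the sample space. A Python sketch (illustrative only):

```python
from itertools import product
from fractions import Fraction

S = list(product("HT", repeat=3))          # the 8 equally likely outcomes

A = [s for s in S if s.count("H") == 2]    # exactly two heads
B = [s for s in S if s[1] == "T"]          # second toss is a tail

print(Fraction(len(A), len(S)), Fraction(len(B), len(S)))  # 3/8 1/2
```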
Example 5 :
From the digits 1, 2, 3, 4, 5 and 6, first one digit is chosen at random and then a
second digit is chosen at random from the remaining five digits. Let A be the
event that the largest number chosen is 4 and B be the event that the total of
digits selected is 6. Obtain P( A ) and P( B ).
Solution :
Here the sample space S consists of ordered pairs

S = { ( 1, 2 ), ..., ( 1, 6 ), ( 2, 1 ), ( 2, 3 ), ..., ( 2, 6 ), ..., ( 6, 1 ), ( 6, 2 ), ..., ( 6, 5 ) }.

This has 30 points which are equally likely. Also

A = { ( 1, 4 ), ( 2, 4 ), ( 3, 4 ), ( 4, 1 ), ( 4, 2 ), ( 4, 3 ) }

and B = { ( 1, 5 ), ( 2, 4 ), ( 4, 2 ), ( 5, 1 ) }. This gives

P( A ) = 6/30 = 1/5 and P( B ) = 4/30 = 2/15.
E5
Describe any 2 mutually exclusive events of the sample space of Example 4.
E6
Find P( A ∪ B ) and P( A ∩ B ) for events described in Example 5.
E7
Find the probability of obtaining a total of 6 or 9 in a single throw of two dice.
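You can check your answer to E7 by enumerating the 36 equally likely outcomes of two dice, for instance with the following Python sketch (illustrative only):

```python
from itertools import product
from fractions import Fraction

# all 36 equally likely outcomes of a single throw of two dice
outcomes = list(product(range(1, 7), repeat=2))

# outcomes whose total is 6 or 9
favourable = [o for o in outcomes if sum(o) in (6, 9)]

print(Fraction(len(favourable), len(outcomes)))
```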
1.5 PROBABILITY MODELS
After learning about the various definitions of probability, it is now time for you to
know about some simple probability models. Like other mathematical models, these
models have been developed over a period of time. Initially for studying a random
phenomenon, the experimenter collects empirical data. This generates his insight
and he visualizes some patterns. These patterns lead to the formulation of a suitable
model for the random phenomenon under study. This model is then compared and
tested with the empirical data, and the process continues. Finally, a model which
describes the particular phenomenon best is chosen. You might have noticed that
the main stress is on choosing a probability model which describes the random
phenomenon reasonably well.
1.5.1 Examples of Some Simple Probability Models
Simple Random Sampling

Sampling represents one of the few cases where we deliberately introduce
randomness in an experiment. Suppose that there are units u1, u2, ..., uN in a
population, and we are interested in some characteristic X of this population. The
word population here refers to a population consisting of any object, such as people,
towns, lakes, hand pumps, electric bulbs etc. Now, even though it may be
physically possible to collect data on X for each unit ui, yet we may not want to do
so due to time or cost considerations. Instead we may conduct an experiment of
selecting a sample of size n of these N population units. For example, suppose that a
population of N = 8637 villages in a state is available, and our characteristic X is
the total number of households having electricity. We first list these villages serially
from 1, 2, ..., N. Let Xi denote the number of houses with electricity in the ith village
( i = 1, 2, ..., N ). The object may be to find the total number of houses having
electricity, that is to find

Xt = X1 + X2 + ... + XN.

Clearly, this is a time consuming task and hence even though it is possible to find Xt
exactly, yet we introduce randomness by collecting the data for a sample of size
n = 62 (say) villages only. Apart from saving time and money, it may give better
results than complete enumeration in some cases.
A sample of size n from a population u1, ..., uN may be defined as an ordered
arrangement of n of these units; for example, for N = 5 and n = 3, (u1, u2, u3) is
one such sample. As you might have guessed, a sample can be selected in many
ways. One such method gives a simple random sample (S.R.S.). A sample is called
a S.R.S. if the probability of selecting each particular sample is the same. Even a S.R.S.
can be further divided into two types.
Simple random sample with replacement. In this type of sampling, a unit
ui can be selected more than once in a sample. Intuitively, we can imagine
that units are selected one by one and after each selection, the chosen unit
is replaced. Thus each selection is made out of N units. In this case, there
are N^n possible samples, and the sample space representing the samples as
outcomes is the set of all ordered n-tuples of the units u1, ..., uN.
The probability of selecting each sample is now 1/N^n.
Simple random sample without replacement. In this type of sampling, all
those ordered samples in which a unit appears more than once are rejected
and an equal probability is assigned to the remaining samples. We can
again imagine that units are drawn one by one, but a unit once chosen is
removed from the population and hence the sample contains no
repetitions. Clearly, the number of such samples is
NPn = N(N − 1) ... (N − n + 1),
and the probability of selecting each sample is now 1/NPn. Intuitively, one feels that
there is no advantage in having a unit more than once in a sample. This is indeed so.
Thus a S.R.S. without replacement is better than a S.R.S. with replacement.
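These counts and selection schemes are easy to check by computer. The following sketch (the function names are ours, not from the text) uses Python's standard library to count the ordered samples under each scheme and to draw one sample of each kind:

```python
import math
import random

def samples_with_replacement(N, n):
    # Each of the n draws can be any of the N units: N^n ordered samples.
    return N ** n

def samples_without_replacement(N, n):
    # Ordered samples with no repeated unit: N(N-1)...(N-n+1) = NPn.
    return math.perm(N, n)

def draw_srs(units, n, with_replacement=True):
    # Draw one simple random sample of size n from the given units.
    if with_replacement:
        return [random.choice(units) for _ in range(n)]
    return random.sample(units, n)

# For N = 5 and n = 3, as in the text:
print(samples_with_replacement(5, 3))     # 125
print(samples_without_replacement(5, 3))  # 60
```

Each particular sample then has probability 1/125 or 1/60 under the respective scheme.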
A well shuffled deck of cards bearing numbers 1, 2, ..., N or tables of random
numbers may be used for selecting a simple random sample. The use of random
number tables eliminates personal biases of selection. You may like to prepare such
a random number table yourself for your own use. One possible method of making
such a table is described here.
Take two ordinary dice and label them as die No. 1 and die No. 2. On throwing
these dice and observing the uppermost face, we get a sample space with 36 points.
We discard any six of these points, for example, the outcomes (1,6), (2,5), (3,4),
(4,3), ( 5, 2) and (6, 1) when the total of uppermost faces is 7. This leaves us with
30 points. Now assign each of the digits 0, 1, ..., 9 to exactly 3 distinct points of this sample
space, so as to cover all 30 points. One such assignment is as follows:
Event                              Digit Assigned
E0 = {(1, 1), (1, 2), (2, 1)}            0
E1 = {(1, 3), (2, 2), (3, 1)}            1
E2 = {(1, 4), (2, 3), (3, 2)}            2
E3 = {(4, 1), (1, 5), (2, 4)}            3
E4 = {(3, 3), (4, 2), (5, 1)}            4
E5 = {(2, 6), (3, 5), (4, 4)}            5
E6 = {(5, 3), (6, 2), (3, 6)}            6
E7 = {(4, 5), (5, 4), (6, 3)}            7
E8 = {(4, 6), (5, 5), (6, 4)}            8
E9 = {(5, 6), (6, 5), (6, 6)}            9
Unless the dice are extremely unbalanced, it is reasonable to assume that all these
ten events Ei (i = 0, ..., 9) are equally likely. Now throw this pair of dice once
and observe whether or not any one of these 10 events occurs. Note that none of
these events occurs if the total of the uppermost faces is 7. If the event Ei occurs, then
write down the digit i. Repeat this experiment and write the corresponding digit
next to the previous one. Continue in this manner till you have 2500 digits, written in 50 rows with
each row containing 50 digits. The digits may be written horizontally or vertically.
These are then your random digits and you have a table of random numbers ready
for later use. This table may be read column-wise, row-wise, two or more columns
(rows) at a time, diagonally or in any manner as long as the random numbers once
used are not used again in the same draw.
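The dice method described above can also be simulated. In this sketch (the names are ours, not the text's) the dictionary encodes the digit assignment of the table, and throws totalling 7 are discarded:

```python
import random

# Map each non-7 dice outcome to a digit 0-9, three outcomes per digit,
# following the assignment given in the text.
DIGIT_OF = {
    (1, 1): 0, (1, 2): 0, (2, 1): 0,
    (1, 3): 1, (2, 2): 1, (3, 1): 1,
    (1, 4): 2, (2, 3): 2, (3, 2): 2,
    (4, 1): 3, (1, 5): 3, (2, 4): 3,
    (3, 3): 4, (4, 2): 4, (5, 1): 4,
    (2, 6): 5, (3, 5): 5, (4, 4): 5,
    (5, 3): 6, (6, 2): 6, (3, 6): 6,
    (4, 5): 7, (5, 4): 7, (6, 3): 7,
    (4, 6): 8, (5, 5): 8, (6, 4): 8,
    (5, 6): 9, (6, 5): 9, (6, 6): 9,
}

def random_digit():
    # Throw two dice, discarding totals of 7, until a digit results.
    while True:
        throw = (random.randint(1, 6), random.randint(1, 6))
        if throw in DIGIT_OF:          # totals of 7 are not in the table
            return DIGIT_OF[throw]

digits = [random_digit() for _ in range(100)]
```

Since each digit corresponds to exactly 3 of the 30 retained outcomes, the digits produced are (approximately) uniform over 0 to 9.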
Birthday Problem
An interesting application of S.R.S. with replacement is the famous birthday
problem. Suppose that the birthdays of n persons attending a meeting form a
S.R.S. with replacement of size n from the population of N = 365 days in the
year (excluding February 29). Then the probability P_n that all n people have
different birthdays is
P_n = N(N − 1)(N − 2) ... (N − n + 1) / N^n.
Simple calculations show that P_23 = 0.493. Thus for 23 persons, the
probability that at least two persons have a common birthday is greater than
1/2. Similarly, P_50 is approximately equal to 0.03, and hence in a gathering of
about 50 persons, you can be almost sure that at least two persons will have a
common birthday.
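The values P_23 and P_50 quoted above can be verified with a few lines of code (the function name is ours):

```python
def prob_all_distinct(n, N=365):
    # P_n = N(N-1)...(N-n+1) / N^n, the probability that all n birthdays differ.
    p = 1.0
    for i in range(n):
        p *= (N - i) / N
    return p

print(round(prob_all_distinct(23), 3))  # 0.493
print(round(prob_all_distinct(50), 3))  # 0.03
```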
Binomial Probability Model
Consider an experiment where a coin is tossed n times. The sample space S has
2^n points. A typical point is si = (H T T H ... T), an n-tuple of H's and
T's. The number of heads in a single point may be any of 0, 1, 2, ..., n. If all the
outcomes are equally likely, then we can assign an equal probability 1/2^n to all
these points. Suppose we are interested in the events
E_k = {si : si has exactly k H's and n − k T's}, k = 0, 1, 2, ..., n.
You are urged to verify that E_k contains exactly C(n, k) = n!/(k!(n − k)!) points
of S and that P(E_k) = C(n, k)/2^n.
Next, suppose that all 2^n points of S are not equally likely. However, to each
point si containing exactly k H's and n − k T's a probability p^k q^(n−k), where
0 ≤ p ≤ 1 and p + q = 1, may be assigned. Now E_k contains C(n, k)
points of S, each having the probability p^k q^(n−k). Thus
P(E_k) = C(n, k) p^k q^(n−k), k = 0, 1, 2, ..., n,
which reduces to the equally likely case for p = 1/2. Finally,
P(E_0) + P(E_1) + ... + P(E_n) = (p + q)^n = 1.
The probabilities of the events E_k are referred to as binomial probabilities. Thus the
assignment of probabilities in this manner gives rise to the binomial
probability model. This model occurs quite often in many situations, some of
which are described below:
a) Tossing n coins simultaneously.
b) Throwing two dice and labelling the outcome a success (H) if the total of the
uppermost faces is 7 and a failure (T) if the total is different from 7, and
repeating this experiment n times.
c) Drawing a S.R.S. with replacement of size n from a box containing
N items, such as radio valves, out of which D are defective and the remaining
(N − D) are non-defective.
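The binomial probabilities are straightforward to compute. Taking situation (b) as an illustration, a success is a total of 7 with two dice, so p = 6/36; the choice n = 10 below is our own example value:

```python
from math import comb

def binomial_pmf(k, n, p):
    # P(E_k) = C(n, k) * p^k * q^(n-k) with q = 1 - p.
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)

n, p = 10, 6 / 36
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(round(sum(probs), 10))  # 1.0, since the probabilities sum to (p + q)^n = 1
```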
Geometrical Probability Model
In all the examples considered so far, the sample space S is discrete, that is, it
has a finite or a countable number of points. However, when the sample space
is continuous, we cannot define probability in terms of number of points. But,
occasionally such probability problems can be solved geometrically, by using a
geometrical measure appropriate to the nature of the problem. Thus we can use
the measure of length in one dimension; area in two dimensions, volume in
three dimensions, etc. In geometrical probability problems, the total probability
of admissible region S is assumed to be unity and a uniform probability
measure directly proportional to the size of region is assumed.
Example 6 :
Let S = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 2} and the probability measure be
proportional to the area of the region. Let E be the event
E = {(x, y) ∈ S : x + y ≤ 1}.
Find P(E).
Solution :
Clearly E is a triangle bounded by the lines x = 0, y = 0 and x + y = 1,
and has an area equal to 1/2 units. Hence
P(E) = Area of E / Area of S = (1/2) / 2 = 1/4 = 0.25.
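The answer P(E) = 0.25 can also be approximated by simulation: sample points uniformly from S and count the fraction that falls in E. A sketch (the names are ours):

```python
import random

def estimate_p_e(trials=200_000, seed=1):
    # Sample (x, y) uniformly from S = [0, 1] x [0, 2] and count points
    # in E = {(x, y) : x + y <= 1}; the fraction estimates P(E) = 1/4.
    random.seed(seed)
    hits = 0
    for _ in range(trials):
        x = random.uniform(0.0, 1.0)
        y = random.uniform(0.0, 2.0)
        if x + y <= 1.0:
            hits += 1
    return hits / trials

print(estimate_p_e())  # close to 0.25
```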
1.5.2 Model Building from Experimental and Theoretical
Considerations
As stated in the beginning of this section, we are generally interested in finding the
probabilities of some events of interest. In actual practice, these probabilities are
determined by theoretical or practical considerations. As you will see, model
building is an extremely difficult task and different persons may arrive at different
probability models for the same case. We now illustrate the process by means of
some examples.
Examples of Model Building
1. A box contains 50 nails out of which 20 are 1 cm long, 25 are 2 cm long
and 5 are 4 cm long. A nail is drawn randomly and its length is measured.
The sample space may be taken as
S = {s1, s2, s3},
where s1, s2 and s3, respectively, represent that a 1, 2 and 4 cm long nail
has been chosen. On this sample space, you may specify a simple equally
likely model
P1({s1}) = P1({s2}) = P1({s3}) = 1/3
for assigning the probabilities to the three sample points, whereas your
friend may assign the probabilities
P2({s1}) = 0.4, P2({s2}) = 0.5, P2({s3}) = 0.1,
which is obtained by considering the number of nails of various sizes. A
natural question then is which probability model is correct and who is
right. Obviously, both you and your friend are assigning different
probability measures to events of S. If the draw is really random, then by
assumption your friend is right. But it is also possible that both of you
are wrong. This may happen if the draw is actually made by asking a
child to pick one nail from the box. A larger size nail will then have a
higher chance of being picked up than a smaller size nail. A model which
takes into account the size of the nail may then be much closer to reality.
Thus
P3({s1}) = (20 × 1) / (20 × 1 + 25 × 2 + 5 × 4) = 20/90 = 2/9.
Similarly P3({s2}) = 50/90 = 5/9 and P3({s3}) = 20/90 = 2/9. We thus see that
any one of these three models can be recommended by using theoretical
considerations, depending on the assumptions made.
For a practical method of assigning probabilities in this example, we have
to actually perform this experiment a large number of times, say 10000
times. If in these 10000 repetitions of the experiment, we observe the
outcome s1 on 2890, s2 on 5862 and s3 on 1248 occasions, then the
probability model assigned by this method is
P4({s1}) = 0.2890, P4({s2}) = 0.5862, P4({s3}) = 0.1248.
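The three theoretical models and the relative frequency approach for the nail example can be set side by side in code. The simulation below assumes the draws really follow model P2; everything else follows the text:

```python
import random

counts = {1: 20, 2: 25, 4: 5}   # length in cm -> number of nails
total = sum(counts.values())

# Model P2: each of the 50 nails equally likely.
p2 = {length: c / total for length, c in counts.items()}

# Model P3: chance proportional to the total length of nails of each size.
total_length = sum(length * c for length, c in counts.items())
p3 = {length: length * c / total_length for length, c in counts.items()}

# Model P4: relative frequencies from 10000 simulated draws (under P2 weights).
draws = random.choices(list(counts), weights=list(counts.values()), k=10_000)
p4 = {length: draws.count(length) / len(draws) for length in counts}

print(p2)  # {1: 0.4, 2: 0.5, 4: 0.1}
print(p3)  # approximately {1: 0.222, 2: 0.556, 4: 0.222}
```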
2)
Suppose that we are studying the number of customers arriving at a bank
service counter in a 10 minute interval. The number of arrivals are
non-negative integers 0, 1, 2, 3, ... . It can be shown that under certain
assumptions (known as Poisson process assumptions) about the arrivals,
the probability that there are x arrivals in a 10 minute interval has the form
P(x arrivals) = e^(−λ) λ^x / x!,  x = 0, 1, 2, 3, ...,
where λ is a parameter representing the average number of arrivals per 10
minute interval. Again it may be stressed that if the Poisson process
assumptions do not hold then we will not get the above expression
theoretically. In that case, we are again left with the relative frequency
approach.
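The Poisson probabilities can be evaluated directly; the value λ = 4 arrivals per 10 minute interval below is our own illustrative choice:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # P(x arrivals) = e^(-lam) * lam^x / x!
    return exp(-lam) * lam**x / factorial(x)

lam = 4.0
print(round(poisson_pmf(0, lam), 4))  # 0.0183, the chance of no arrivals
print(round(sum(poisson_pmf(x, lam) for x in range(50)), 6))  # 1.0
```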
You may have noticed by now that the examples discussed so far are that
of a discrete sample space. Such sample spaces are relatively easier to
handle since we have to deal with a finite or a countable number of points
of the sample space S. However, in many practical situations, the
experiment is of such a nature that the sample space is continuous. For
example, S may be an interval of the real line R or the entire real line itself.
Events of this sample space which are of general interest are intervals
of the form (a, b), [a, b], (−∞, x], (a, ∞), etc., or single points like {a}.
In this case, the σ-field A on which a probability function is defined is
the smallest σ-field containing all intervals of the form (−∞, x] for all x. It
can be obtained from other types of intervals as well. Events of this σ-field
are essentially obtained by applying countable set operations to the intervals
{(−∞, x] : x ∈ R}, and include singleton points {a} and open, closed and
semi-closed intervals. The sets of A are also called Borel sets of the real
line and A is also known as the Borel field. For defining a probability
measure on A, we start with a real valued integrable function f(x) defined
on R such that
i) f(x) ≥ 0 for all x, and
ii) ∫_{−∞}^{∞} f(x) dx = 1.
For an event A ∈ A, we then define the probability measure by
P(A) = ∫_A f(x) dx.
Different f(x) then give rise to different probability models. Example 3
below illustrates one such situation:
3)
Electric bulbs are produced in a factory. Our main object is to study the
life time X (in hours) of a bulb produced. The sample space representing
the life time is the positive half of the real line. Thus S = (0, ∞) and the
probability that a bulb will have a life between a and b is simply given by
P(a < X < b) = ∫_a^b f(x) dx,  0 < a < b < ∞.
The function f(x) may either be guessed from past records, or it may be estimated by
the method of histograms. The method of histograms is a practical method and
gives some idea about the function f(x). For this we start with n bulbs and observe
their life times in hours. Let xi (i = 1, ..., n) be the life time of the ith bulb. Now
classify this data into c mutually exclusive and exhaustive classes
(a_0, a_1], (a_1, a_2], ..., (a_{c−1}, a_c], where a_0 = 0. Usually the value of c is around
10 and the classes are of equal length. However, f(x) can be estimated with unequal
class intervals as well. Suppose that the frequency for the ith class (a_{i−1}, a_i] is
f_i (i = 1, ..., c). Thus f_i observations out of n belong to the ith class interval. The
data may then be summarized in a frequency table as below:
Sl. No.    Class Interval       Frequency
1          (a_0, a_1]           f_1
2          (a_1, a_2]           f_2
...        ...                  ...
c          (a_{c−1}, a_c]       f_c
Probability Concepts
A plot of the frequencies f_i against the class intervals is called a histogram. Histograms
are representations of collected data and closely approximate the function f(x) for
the phenomenon under study. In the above example, we may start with n = 1000
bulbs and obtain the frequency Table 1.1 given below:
Sl. No.    Class Interval    Frequency
1          (0, 200]          332
2          (200, 400]        220
3          (400, 600]        147
4          (600, 800]        103
5          (800, 1000]       67
6          (1000, 1200]      45
7          (1200, 1400]      30
8          (1400, 1600]      21
9          (1600, 1800]      13
10         (1800, 2000]      9
11         (2000, ∞)         13
Total                        1000
Table 1.1: Frequency table for life times of 1000 bulbs.
Figure 1.2: Histogram of the life times of Table 1.1 (class intervals from 0 to 2000 hours on the horizontal axis).
In subsequent units, you will learn methods of estimating f(x) and of testing whether
or not a specified f(x) represents the data adequately. For the above data, suppose
we somehow guess that
f(x) = 0.002 e^(−0.002x),  x > 0.
It is then possible to verify whether or not our assumption is correct. Assuming it to
be correct, we can use this probability model for evaluating the probabilities of various
events.
For example, the probability that the life X of a bulb chosen randomly will be
between 200 and 600 hours is then given by
P(200 < X < 600) = ∫_{200}^{600} 0.002 e^(−0.002x) dx = e^(−0.4) − e^(−1.2) ≈ 0.369.
Similarly, the probabilities of other events can be evaluated.
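Taking the guessed density to be the exponential f(x) = 0.002 e^(−0.002x), x > 0 (an assumption on our part, though it is consistent with Table 1.1: it predicts about 330 of the 1000 lifetimes in (0, 200] against the observed 332), the probabilities have a closed form:

```python
from math import exp

LAM = 0.002  # assumed rate; mean life 1 / LAM = 500 hours

def prob_between(a, b, lam=LAM):
    # P(a < X < b) = integral of lam * e^(-lam x) over (a, b)
    #              = e^(-lam a) - e^(-lam b).
    return exp(-lam * a) - exp(-lam * b)

print(round(prob_between(200, 600), 3))  # 0.369
# Compare with the relative frequency from Table 1.1: (220 + 147)/1000 = 0.367.
```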
It may be stressed that all situations may not be as simple as illustrated above.
There may be considerably more complicated cases, where appropriate probability
models may have to be specified. For instance, our object may be to study the total
volume of water in a reservoir on a weekly basis. On week t, let
X(t) = volume of water initially,
I(t) = random inflow of water,
R(t) = controlled release of water, and
L(t) = random loss of water due to seepage etc.
We can then form a stochastic equation
X(t + 1) = X(t) + I(t) − R(t) − L(t),
and attempt to estimate the function X(t), using theoretical and practical methods.
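A water-balance recursion of this kind is easy to explore by simulation. Everything numeric below (initial volume, release, and the inflow and loss distributions) is an assumed illustration, not data from the text:

```python
import random

def simulate_reservoir(weeks=52, x0=1000.0, release=40.0, seed=7):
    # Iterate the balance X(t+1) = X(t) + I(t) - R(t) - L(t),
    # truncating at zero so the volume never goes negative.
    random.seed(seed)
    x = x0
    history = [x]
    for _ in range(weeks):
        inflow = random.uniform(20.0, 80.0)  # random inflow I(t)
        loss = random.uniform(0.0, 10.0)     # random seepage loss L(t)
        x = max(0.0, x + inflow - release - loss)
        history.append(x)
    return history

levels = simulate_reservoir()
```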
1.6 SUMMARY
In this unit you have learnt about the sample space associated with a random
experiment. You know that a sample space is either discrete or continuous. Events
are simply subsets of sample spaces in which we are interested. Counting the points of
a sample space using permutation and combination methods is reviewed. Basic set
operations of union, intersection and complementation are defined.
You have also learnt about the relative frequency and axiomatic definitions of
probability. The former one is a practical definition whereas the latter one simply
specifies the probability of an event as a quantitative measure satisfying three basic
axioms. Systematic methods for building a probability model for the phenomenon
under study are described. Concepts of simple random sampling with and without
replacements are introduced and a method for prcparing a table of random numbers
is described.
1.7 SOLUTIONS/ANSWERS
E1
We observe that two committees are the same if they are made up of the same
members (regardless of the order in which they were chosen). So the required
number is given by
C(8, 3) = 8!/(3! 5!) = 56.
E2
We observe that in this problem order does make a difference, because a
particular order will represent a specific signal. Therefore, the required number
is given by
8P3 = 8!/5! = 336 signals.
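Both counts can be confirmed with the standard library:

```python
from math import comb, perm

# E1: committees of 3 from 8 people -- order does not matter, so use combinations.
print(comb(8, 3))  # 56

# E2: signals from 3 of 8 flags -- order matters, so use permutations.
print(perm(8, 3))  # 336
```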
E3
The random experiment E consists of tossing a coin four times and observing
the sequence of heads and tails obtained. The sample space S consists of all
possible sequences of the form ( al , a2, a3, a4 ) where each a; = H or T
depending on whether a head or a tail appeared on the ith toss. The total number
of points in this sample space will be 2^4 = 16.
E4
The outcome of a single toss is either a head (denoted by H) or a tail (denoted by
T). The sample space S consists of
S = {H, TH, TTH, TTTH, TTTT}.
E5
Let A be the event that all tossings result in "head" and B the event that
all tossings result in "tail". As A and B cannot occur together, A ∩ B = ∅ and
so A and B are mutually exclusive. (You should try to list other choices of A
and B as well.)
E7
The sample space S consists of 36 points as follows:
S = {(x, y) : x = 1, 2, 3, 4, 5, 6 and y = 1, 2, 3, 4, 5, 6}
  = {(1, 1), (1, 2), ..., (1, 6), ..., (6, 1), ..., (6, 6)}.
Let A be the event consisting of the outcomes (x, y) such that x + y = 6 and B be the
event consisting of the outcomes (x, y) such that x + y = 9. This gives
A = {(1, 5), (5, 1), (2, 4), (4, 2), (3, 3)},
B = {(3, 6), (6, 3), (5, 4), (4, 5)}.
Therefore, we have
P(A) = 5/36 and P(B) = 4/36 = 1/9.
Since A and B are mutually exclusive, the required probability is
P(A ∪ B) = P(A) + P(B) = 9/36 = 1/4.
