Contents

1 Introduction
2 Common Terms
3 Theory of Probability
  3.1 Definitions of Probability
      3.1.1 Classical Definition of Probability
      3.1.2 Empirical Definition of Probability
      3.1.3 Axiomatic Definition of Probability
5 Conditional Probability
  5.1 Introduction
  5.2 Definition and Properties
  5.3 Bayes Rule
  5.4 Independence of Events
6 Some Cautions

List of Tables

1 Laws of Set Theory used in Probability
1 Introduction
The idea of probability, chance or randomness is quite old, whereas its rigorous axiomatisation in mathematical terms occurred relatively recently. Many of the ideas of probability theory originated in the study of games of chance. In this century, the mathematical theory of probability has been applied to a wide variety of phenomena. For example, it has been used in genetic theory to understand mutation and gene sequences; in information management, to design and optimise operating systems and to model the lengths of various queues; in communication theory, to study noise in electrical devices and communication systems; in atmospheric research, to model turbulence; and in the actuarial sciences, where insurance companies rely heavily on the theory of probability to determine premiums.
In this article, we treat the basic ideas of probability and statistics. In the first part we explain the theory of probability, and subsequently we deal with problems of probability.
2 Common Terms
Probability theory is concerned with situations in which the outcomes occur randomly. Generally, such situations are called experiments, and the set of all possible outcomes is known as the sample space corresponding to the experiment. The sample space is generally denoted by S and an element of S is denoted by ω. For example, consider a commuter who passes through three traffic lights on the way to work and either continues (c) or stops (s) at each light. The sample space then consists of strings such as csc, where csc, for example, denotes the outcome that the commuter continues through the first light, stops at the second light and continues through the third light. As another example (Example B), if the experiment observes the number of jobs in a print queue that can hold at most N jobs, the sample space is
S = {0, 1, 2, 3, . . . , N}
Events, or subsets of the sample space, are usually denoted by uppercase Roman letters. For instance, in Example B, the event that the print queue has fewer than five jobs can be denoted by
A = {0, 1, 2, 3, 4}
The algebra of set theory is directly applicable to the events in probability theory. The union of two events A and B, denoted by A ∪ B, is defined as the event that either A occurs, or B occurs, or both occur. The intersection of two events, denoted by A ∩ B, is defined as the event that A and B both occur. The complement of event A, denoted by Ā, Ac or A′, is the event that A does not occur, and thus consists of all the elements of S which are not in A. The empty set, denoted by φ, is the set which has no elements, i.e. it is the event with no outcomes. If two events A and B satisfy A ∩ B = φ, then A and B are said to be disjoint events.
The laws of set theory are extensively used in probability theory. Table 1 gives some of the laws of set theory which are used frequently in statistics.
You are advised to check the validity of the laws using Venn diagrams.
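The laws can also be checked computationally; the sketch below verifies De Morgan's laws and a distributive law on a small sample space (the particular sets A and B are illustrative choices, not taken from Table 1):

```python
# Verify some set-theory laws used in probability on a small sample space.
S = {1, 2, 3, 4, 5, 6}          # sample space
A = {1, 3, 5}                   # an event
B = {1, 2, 3}                   # another event

complement = lambda X: S - X    # complement of X relative to S

# De Morgan's laws: (A ∪ B)' = A' ∩ B' and (A ∩ B)' = A' ∪ B'
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# Distributive law: A ∩ (B ∪ B') = (A ∩ B) ∪ (A ∩ B')
assert A & (B | complement(B)) == (A & B) | (A & complement(B))
print("set laws verified")
```

Any other pair of events from any finite sample space can be substituted; the identities hold for all of them.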
3 Theory of Probability
Suppose a trial results in n mutually exclusive, exhaustive and equally likely cases, and let m be the number of cases which are favorable to an event A. Then the probability of event A, denoted by P (A), is defined as:

P (A) = m/n    (1)
The classical definition introduces some concepts which are defined here. Consider an experiment which, though repeated under essentially identical conditions, does not give unique results but may result in any one of several possible outcomes. Then the experiment is known as a Trial and the outcomes are known as events or cases. The totality of possible outcomes in any trial is known as the exhaustive events. Exhaustive events correspond to the sample space of set theory. Events are said to be mutually exclusive if the occurrence of any one of them precludes (or prevents) the occurrence of all the other events. Outcomes of a trial are said to be equally likely if, taking into consideration all the relevant evidence, there is no reason to expect one event in preference to any other. Favorable events in a trial are those events whose occurrence leads to the occurrence of the defined event.
UNDERSTANDING THE CLASSICAL PROBABILITY DEFINITION

Let us consider that we toss a dice. Then the tossing of the dice is a Trial, and getting 1 (or 2 or 3 or 4 or 5 or 6) is an event. Thus the above trial leads to the following events, S = {1, 2, 3, 4, 5, 6}, which is the sample space of the outcomes. If we assume that the dice is unbiased, we do not know which event will occur, so we cannot favor any event over any other event. Hence all the events are said to be equally likely. Now if, after tossing the dice, we get 1 (say), then in this toss no other number (2, 3, 4, 5, 6) can occur. Hence the outcomes of the trial are mutually exclusive.

Now suppose we define a new event A as the outcome of an odd number when an unbiased dice is tossed. Then the outcomes which are favorable to event A are A = {1, 3, 5}, and the probability of event A is P (A) = 3/6 = 1/2.
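The counting in the dice example can be reproduced in a few lines (an illustration, not part of the original text): enumerate the sample space, count the favorable cases, and divide.

```python
# Classical probability: P(A) = favorable cases / total cases,
# for the event A = "an odd number" on an unbiased dice.
from fractions import Fraction

S = range(1, 7)                       # sample space {1, ..., 6}
A = [w for w in S if w % 2 == 1]      # favorable outcomes {1, 3, 5}

p = Fraction(len(A), len(S))          # exact rational arithmetic
print(p)                              # 1/2
```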
It can easily be seen that 0 ≤ m ≤ n and therefore the value of P (A) lies between 0 and 1, both inclusive. It can also be seen that the classical definition depends on a finite number of cases (n ≠ 0). Sometimes, expression (1) is also read as saying that "the odds in favor of event A are m : (n − m)" or that "the odds against event A are (n − m) : m".
If the sample space is infinite, then the classical definition fails to define probability. The classical definition also fails if the outcomes of a trial are not equally likely. For example, suppose a candidate appears for a test; we normally assume that the candidate is equally likely to fail or pass. However, if we already know that the candidate has more than a 50% chance of passing, then the outcomes are not equally likely and hence we cannot apply the classical definition. Given these limitations of the classical definition, we now look at other definitions of probability.
The empirical definition considers a trial repeated n times under essentially identical conditions, with the event A occurring m times in those n trials. It defines the probability of event A, denoted by P (A), as the limit of the relative frequency:

P (A) = lim (m/n) as n → ∞    (2)
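The limiting relative frequency m/n in (2) can be watched converging with a short simulation; the fair-dice setup and the seed are illustrative assumptions:

```python
import random

random.seed(0)                          # reproducible illustration
event_count = 0                         # m: occurrences of the event so far
for n in range(1, 100_001):
    if random.randint(1, 6) % 2 == 1:   # event A: an odd face of a fair dice
        event_count += 1
    if n in (100, 10_000, 100_000):
        print(n, event_count / n)       # m/n drifts toward 1/2
```

The printed ratios wander near 0.5 and settle as n grows, which is exactly what the empirical definition asserts.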
The axiomatic definition assigns to every event A a number P (A), called the probability of A, satisfying the following three axioms:

1. P (S) = 1

2. For any event A, P (A) ≥ 0

3. If A1, A2, A3, . . . are mutually disjoint events, then
   P (A1 ∪ A2 ∪ A3 ∪ . . .) = P (A1) + P (A2) + P (A3) + . . .
The first two axioms are obviously desirable. Since S consists of all possible outcomes, P (S) = 1. The second axiom simply states that the probability of any event A is defined and non-negative. Let us first understand the third axiom in terms of two events A1 and A2 which are disjoint, i.e. they have no outcome in common; then P (A1 ∪ A2) = P (A1) + P (A2). Thus what the third axiom says is that if there is a collection of events A1, A2, A3, . . . , An, . . ., defined on the sample space S, and they are all disjoint, then the probability of the union of these events is simply the sum of the probabilities of the individual events.
The following important properties derive from the axiomatic definition of proba-
bility:
4. If A and B are any two events defined on the sample space S which are not disjoint, then P (A ∪ B) = P (A) + P (B) − P (A ∩ B). This is also known as the Addition Law of probability.
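The Addition Law can be verified on a finite sample space where each probability is just a favorable-over-total count; the events A and B below are illustrative choices:

```python
from fractions import Fraction

S = set(range(1, 7))                    # one toss of a fair dice
P = lambda E: Fraction(len(E), len(S))  # classical probability of an event

A = {1, 2, 3}                           # two overlapping (non-disjoint) events
B = {3, 4}

# Addition Law: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert P(A | B) == P(A) + P(B) - P(A & B)
print(P(A | B))     # 2/3
```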
SOME PROPOSITIONS TO REMEMBER

PROPOSITION A: For a set of size n and a sample of size r, there are n^r different ordered samples with replacement, and nPr = n!/(n − r)! = n(n − 1)(n − 2) . . . (n − r + 1) different ordered samples without replacement.

PROPOSITION B: The number of unordered samples of r objects selected from n objects without replacement is nCr = n!/(r!(n − r)!).

COROLLARY: The number of orderings of n elements is n(n − 1)(n − 2) . . . 1 = n!.
Let us now consider a special case, in which we are not interested in ordered
samples, but in the constituents of the sample regardless of the order in which they
have been obtained. In particular, we ask the following question: If r objects are
taken from a set of n objects without replacement and disregarding order of selec-
tion, then how many different samples are possible? Now we know that the number
5 CONDITIONAL PROBABILITY 8
n n!
Cr =
r!(n − r)!
n(n − 1)(n − 2) . . . (n − r + 1)
=
r(r − 1)(r − 2) . . . 3.2.1
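Both counting formulas are available directly in Python's standard library: math.perm computes nPr and math.comb computes nCr. A small worked case with n = 5 and r = 2:

```python
import math

n, r = 5, 2

print(n ** r)            # ordered samples with replacement: 25
print(math.perm(n, r))   # ordered samples without replacement, 5P2 = 20
print(math.comb(n, r))   # unordered samples without replacement, 5C2 = 10
print(math.factorial(n)) # orderings of 5 elements: 120
```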
5 Conditional Probability
5.1 Introduction
To introduce conditional probability we take the help of an example. Digitalis therapy is often beneficial to patients who have suffered congestive heart failure, a type of cardiac disease. But giving Digitalis to patients has a serious side effect: the patient runs the risk of Digitalis toxicity, which can prove fatal. To improve the chance of a correct diagnosis, the concentration of Digitalis in the blood can be measured. A study was conducted on 135 cardiac patients to find the concentration of Digitalis in the blood of the patients. The table below gives the results, where the following notation is used: D+ congestive heart disease is present; D− congestive heart disease is not present; T + there is a high concentration of Digitalis in the blood; and T − there is a low concentration of Digitalis in the blood.
          D+    D−    Total
  T+      25    14     39
  T−      18    78     96
  Total   43    92    135
Thus, for example, 25 patients had a high concentration of Digitalis in the blood and the disease present. Assuming that the findings of the study hold for all cardiac patients, the probability of having congestive heart disease is 43/135 = 0.318. But suppose now that we have a patient who shows a high concentration of Digitalis in the blood. Then what would be the probability of the patient having congestive heart disease? To answer this question we can restrict our attention to the first row of the table. We see that out of the 39 cardiac patients who have a high concentration of Digitalis in the blood, 25 suffer from congestive heart disease. Thus the probability of having congestive heart disease, given that the patient has a high concentration of Digitalis in the blood, is 25/39 = 0.641.
Let us understand the results. The probability of having congestive heart disease amongst cardiac patients is P (D+) = 0.318, but when we get the additional information about the Digitalis concentration in the blood, the probability of having congestive heart disease becomes P (D+|T +) = 0.641, which is much higher than P (D+). P (D+) is the unconditional probability of having congestive heart disease, and P (D+|T +) is known as the conditional probability of having congestive heart disease given the information T +. Conditional probability can thus be looked upon as the probability of a particular event when some additional information about the occurrence (or non-occurrence) of another event is available.
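The two probabilities can be recomputed directly from the counts in the study's table:

```python
# Counts from the 135-patient Digitalis study.
n_Tplus_Dplus = 25      # high concentration and disease present
n_Tplus = 39            # all patients with high concentration (T+)
n_Dplus = 43            # all patients with the disease (D+)
n_total = 135

p_Dplus = n_Dplus / n_total                    # unconditional P(D+)
p_Dplus_given_Tplus = n_Tplus_Dplus / n_Tplus  # conditional P(D+|T+)

print(p_Dplus)               # 0.3185..., i.e. about 0.318
print(p_Dplus_given_Tplus)   # 0.6410..., i.e. about 0.641
```

Conditioning on T+ amounts to replacing the whole sample (135 patients) by the first row of the table (39 patients) and counting again.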
2. If A1 and A2 are two events such that A1 ∩A2 = φ then P (A1 ∪A2 |B) = P (A1|B) +
P (A2 |B).
Proof
Since the events Ai are exhaustive, we have

A1 ∪ A2 ∪ . . . ∪ An = S

Now we can write event B as B ∩ S (by the Identity Law of set theory), and as such we have:

B = B ∩ S = B ∩ (A1 ∪ A2 ∪ . . . ∪ An) = (B ∩ A1) ∪ (B ∩ A2) ∪ . . . ∪ (B ∩ An)

Since the Ai are mutually disjoint, so are the events B ∩ Ai, and therefore

P (B) = P (B ∩ A1) + P (B ∩ A2) + . . . + P (B ∩ An)

And from the multiplicative law of probability we know that P (B ∩ Ai) = P (Ai)P (B|Ai), hence

P (B) = P (A1)P (B|A1) + P (A2)P (B|A2) + . . . + P (An)P (B|An)

QED
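The law of total probability just proved can be sanity-checked numerically; the partition probabilities P(Ai) and conditionals P(B|Ai) below are arbitrary illustrative values, not taken from the text:

```python
from fractions import Fraction as F

# A partition A1, A2, A3 of S and the conditional probabilities P(B|Ai).
p_A = [F(1, 2), F(1, 3), F(1, 6)]        # exhaustive and disjoint: sums to 1
p_B_given_A = [F(1, 4), F(1, 2), F(1, 8)]

assert sum(p_A) == 1                      # the Ai really form a partition

# P(B) = sum over i of P(Ai) * P(B|Ai)
p_B = sum(pa * pb for pa, pb in zip(p_A, p_B_given_A))
print(p_B)      # 5/16
```

Exact rational arithmetic (Fraction) keeps the check free of floating-point noise.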
INDEPENDENCE OF MANY EVENTS

If there are more than two events, say A1, A2, A3, . . . , An, then for stochastic independence it is required that the following conditions are met:

P (Ai ∩ Aj) = P (Ai)P (Aj)
P (Ai ∩ Aj ∩ Ak) = P (Ai)P (Aj)P (Ak)
. . .
P (A1 ∩ A2 ∩ . . . ∩ An) = P (A1)P (A2) . . . P (An)

Note: For n events, (2^n − n − 1) conditions are required to be met for stochastic independence.
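These conditions can be checked mechanically. The sketch below, an illustrative construction with three fair coin tosses and the events "toss i shows heads", verifies all 2^3 − 3 − 1 = 4 conditions:

```python
from itertools import product, combinations
from fractions import Fraction

# Sample space: three fair coin tosses, all 8 outcomes equally likely.
S = list(product("HT", repeat=3))
P = lambda E: Fraction(len(E), len(S))

# A[i] = "toss i shows heads"
A = [{w for w in S if w[i] == "H"} for i in range(3)]

checked = 0
for k in range(2, len(A) + 1):            # every subset of 2 or more events
    for group in combinations(A, k):
        inter = set.intersection(*group)  # joint occurrence of the group
        prod = Fraction(1)
        for E in group:
            prod *= P(E)
        assert P(inter) == prod           # the independence condition
        checked += 1

print(checked)   # 4 conditions, i.e. 2**3 - 3 - 1
```

Note that checking only the pairwise conditions is not enough in general; all subsets of size two up to n must satisfy the product rule.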
6 Some Cautions
The concept of probability is not easily understood, and therefore it is at times used to confuse the population at large. Take for example the following quote from the Los Angeles Times (August 24, 1987), which talks about AIDS:

Several studies of people infected with the AIDS virus show that a single act of unprotected sex with an infected partner has a surprisingly low risk of infecting the other partner, probably one in 100 to one in 1000. For an average, consider the risk to be 1 in 500. Statistically, 500 acts of unprotected sex with an infected partner or 100 acts with five partners lead to a 100% probability of infection.
Have you spotted the flaw? There are many, but we will consider only a few. First and foremost, the report says that 500 acts with one infected partner will lead to infection with certainty. So suppose a person has 1000 acts of sex with an infected partner; then what is his probability of getting infected? According to the report's logic it is 2 (which, of course, is not possible theoretically).
Let us assume that the probability of infection is 1/500, as reported in the news. Now suppose a person has 500 acts of sex with an infected partner. What is his probability of getting infected? Let us work it out. Assume that the sexual acts are independent of each other and each act has a 1/500 probability of causing the infection. Then the probability of non-infection in each act is 1 − (1/500) = 499/500. So in 500 acts the probability of never getting infected is (499/500)^500 ≈ 0.37, and hence the probability of infection is about 0.63, far from the 100% claimed in the report.
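The arithmetic is easy to reproduce, assuming 500 independent acts each with infection probability 1/500:

```python
p_per_act = 1 / 500
p_no_infection = (1 - p_per_act) ** 500   # escaping infection in all 500 acts

print(round(p_no_infection, 3))           # 0.368
print(round(1 - p_no_infection, 3))       # 0.632, nowhere near the claimed 100%
```

(For large n, (1 − 1/n)^n approaches 1/e ≈ 0.368, which is why the answer sits near that value.)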
More than 80% of those questioned chose the second statement, despite the fact that the correct answer is the first statement.
Even hardened professionals have difficulty with probabilistic calculations. For example, the following question was asked of 100 doctors:
“In the absence of any special information, the probability that a woman has breast cancer is 1%. If the patient has breast cancer, the probability that the radiologist will correctly diagnose it is 80%. And if the patient has a benign lesion (no breast cancer), then the probability that the radiologist will incorrectly diagnose it as breast cancer is 10%.” Then what is the probability that a patient with a positive mammogram actually has breast cancer?
95 out of the 100 physicians estimated the probability to be about 75%. However, the correct probability, as given by Bayes rule, is 7.5% (you can check this). So even experts make mistakes.
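The 7.5% figure follows from Bayes rule with the numbers stated in the question:

```python
p_cancer = 0.01              # prior: P(C), a woman has breast cancer
p_pos_given_cancer = 0.80    # P(T+|C), correct diagnosis when cancer present
p_pos_given_benign = 0.10    # P(T+|not C), false positive on a benign lesion

# Law of total probability for the denominator, then Bayes rule.
p_pos = p_cancer * p_pos_given_cancer + (1 - p_cancer) * p_pos_given_benign
p_cancer_given_pos = p_cancer * p_pos_given_cancer / p_pos

print(round(p_cancer_given_pos, 3))   # 0.075, i.e. about 7.5%
```

The result is small because the 1% prior is overwhelmed by false positives from the 99% of women without cancer, which is exactly what the physicians' intuition missed.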
However, in spite of its misuse and frequent misinterpretation, probability is the cornerstone of all the sciences and also of various subjects in the humanities and management. So it is imperative that you have a clear idea of probability and understand its basics well.