
Introduction to Probability Theory

K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay

September 9, 2017

LECTURES 14-15

Example 0.1 (Bernoulli distribution) Let (Ω, F, P ) be a probability space
and A ∈ F with p = P (A). Tossing a p-coin gives such a probability space
with an event A. Set X = I_A , the indicator of A, and note that X takes the
two values 0 and 1.
The distribution function of X is given by

F (x) = 0 if x < 0
= 1 − p if 0 ≤ x < 1
= 1 if x ≥ 1 .

The distribution of X is given by

µ(B) = Σ_{k ∈ B ∩ {0,1}} p_k ,   B ∈ BR ,

where p_0 = 1 − p, p_1 = p. The above function F and the probability measure
µ on (R, BR ) are called respectively the Bernoulli distribution function and
the Bernoulli distribution. Thus X = I_A is an example of a Bernoulli(p)
random variable.

Example 0.2 (Binomial distribution with parameters (n, p)). Let X_1 , X_2 , · · · , X_n
be n independent Bernoulli(p) random variables defined on a probability
space. In fact, one can define such independent Bernoulli random variables as
follows: toss a p-coin n times independently and let X_k = 1 if the kth toss is
H and X_k = 0 if the kth toss is T. Set X = X_1 + · · · + X_n . Then

P {X = k} = P {X_i = 1 for exactly k indices i and X_i = 0 otherwise}
          = \binom{n}{k} p^k (1 − p)^{n−k} .

Hence the distribution function of X is given by

F (x) = 0 if x < 0
      = \binom{n}{0} (1 − p)^n if 0 ≤ x < 1
      = \binom{n}{0} (1 − p)^n + \binom{n}{1} p (1 − p)^{n−1} if 1 ≤ x < 2
      = Σ_{i=0}^{k} \binom{n}{i} p^i (1 − p)^{n−i} if k ≤ x < k + 1, k = 2, . . . , n − 1
      = 1 if x ≥ n ,

and the distribution of X is given by

µ_X (B) = Σ_{k ∈ B ∩ {0,1,··· ,n}} \binom{n}{k} p^k (1 − p)^{n−k} ,   B ∈ BR .

The above F and µ_X are called the Binomial (n, p) distribution function and
the Binomial (n, p) distribution respectively. A random variable with a Binomial
distribution as its law (distribution) is called a Binomial random variable. In
the beginning we have seen an example of a Binomial (n, p) random variable.
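As an illustration, here is a short Python sketch (with the illustrative values n = 10, p = 0.3, which are my own choices) that builds the Binomial (n, p) distribution function from the masses \binom{n}{k} p^k (1 − p)^{n−k} and compares it with the empirical frequencies of X = X_1 + · · · + X_n obtained by simulating n independent p-coin tosses.

```python
import random
from math import comb

n, p = 10, 0.3          # illustrative parameters
masses = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def F(x):
    """Binomial(n, p) distribution function: sum of the masses p_k over k <= x."""
    return sum(m for k, m in enumerate(masses) if k <= x)

# Simulate X = X_1 + ... + X_n with the X_i independent Bernoulli(p).
random.seed(0)
N = 100_000
samples = [sum(1 for _ in range(n) if random.random() < p) for _ in range(N)]

for x in [-1, 0, 2.5, 5, n]:
    empirical = sum(1 for s in samples if s <= x) / N
    print(f"x = {x:5}:  F(x) = {F(x):.4f}   empirical P{{X <= x}} = {empirical:.4f}")
```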

Example 0.3 (Poisson distribution with parameter λ). On (R, BR ) define the
probability measure

µ(B) = Σ_{k ∈ B ∩ {0,1,··· }} λ^k e^{−λ} / k! ,   B ∈ BR .

Then µ defines a probability measure on BR and is called the Poisson distribution
with parameter λ. The Poisson distribution function is given by

F (x) = 0 if x < 0
      = Σ_{k = 0,1,2,··· : k ≤ x} λ^k e^{−λ} / k! if x ≥ 0 .

Question: Is F indeed a distribution function? From the definition
of a distribution function (which I have given in the beginning of this chapter),
F is a distribution function if there exists a random variable X such that
F (x) = P {X ≤ x} for all x ∈ R.

I will give one such construction. Observe that X should take values
from {0, 1, 2, · · · } = {0} ∪ N. So take Ω = {0} ∪ N, F = P(Ω) and

P (A) = Σ_{k ∈ A} λ^k e^{−λ} / k! ,   A ⊆ Ω.

On this probability space, define X : Ω → R by X(ω) = ω. Then X is a
random variable and

P {X ≤ x} = Σ_{k ∈ {0}∪N : k ≤ x} λ^k e^{−λ} / k! = F (x) ,   x ≥ 0,

and P {X ≤ x} = 0 = F (x), x < 0, i.e. F is the distribution function of X.


Exercise: Give all the details.
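The construction above can also be checked numerically. The following Python sketch (with an illustrative λ and a truncation of Ω = {0} ∪ N at a large K, purely for computational purposes) computes P (A) = Σ_{k∈A} λ^k e^{−λ}/k!, builds F (x) = P {X ≤ x}, and checks that the total mass is essentially 1 and that F is non-decreasing.

```python
from math import exp, factorial

lam = 2.5                     # illustrative parameter lambda
K = 200                       # truncation level, for numerical purposes only

def mass(k):
    """Poisson mass lambda^k e^{-lambda} / k!."""
    return lam**k * exp(-lam) / factorial(k)

def P(A):
    """Probability of A, a subset of {0, 1, ..., K}, under the Poisson measure."""
    return sum(mass(k) for k in A)

def F(x):
    """Distribution function F(x) = P{X <= x} for X(omega) = omega."""
    if x < 0:
        return 0.0
    return P(range(0, min(int(x), K) + 1))

# The masses should add up to (essentially) 1, and F should be non-decreasing.
print("total mass:", P(range(K + 1)))
xs = [-1, 0, 0.5, 1, 2.7, 10, 50]
print([round(F(x), 4) for x in xs])
assert all(F(a) <= F(b) for a, b in zip(xs, xs[1:]))
```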

Example 0.4 (Geometric distribution) On (R, BR ) define the probability
measure µ by

µ({k}) = p(1 − p)^{k−1} ,   k = 1, 2, · · · ,

where 0 < p < 1. Then µ defines a probability measure on BR and is called the
geometric distribution with parameter p.
The student may add the details as in the previous example.

Example 0.5 (Uniform distribution on [0, 1)) The distribution function
given by

F (x) = 0 if x ≤ 0
      = x if 0 < x ≤ 1
      = 1 if x ≥ 1

is called the Uniform [0, 1) distribution function.

As discussed above, one first needs to check that the above is indeed a
distribution function, i.e. we need to construct a random variable X
such that P {X ≤ x} = F (x), x ∈ R.
To this end, we first describe a probability space. On (R, BR ), define the
probability measure µ such that

µ(B) = l(B ∩ [0, 1))

when the Borel set B is an interval, where l(B ∩ [0, 1)) denotes the length of
the interval B ∩ [0, 1) (taken to be 0 when the intersection is empty). Now take
(Ω, F, P ) as (R, BR , µ) and define X : Ω → R as X(ω) = ω, the identity function.
Then X is a random variable (exercise) and

P {X ≤ x} = P ((−∞, x]) = l((−∞, x] ∩ [0, 1)) = F (x) ,   x ∈ R.

The probability measure µ is called the uniform distribution on [0, 1).
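A minimal sketch of the above construction, assuming the usual convention that the length of an interval such as [a, b) is b − a (and 0 for the empty set): it evaluates l((−∞, x] ∩ [0, 1)) and checks that this agrees with the piecewise formula for F .

```python
def F_via_length(x):
    """l((-inf, x] ∩ [0, 1)): length of the interval (-inf, x] ∩ [0, 1)."""
    # The intersection is empty for x < 0, equals [0, x] for 0 <= x < 1, and [0, 1) for x >= 1.
    return min(max(x, 0.0), 1.0)

def F_piecewise(x):
    """Uniform [0, 1) distribution function as given in the notes."""
    if x <= 0:
        return 0.0
    return x if x <= 1 else 1.0

for x in [-2, 0, 0.25, 0.999, 1, 3]:
    assert abs(F_via_length(x) - F_piecewise(x)) < 1e-12
print("length formula and piecewise formula agree on the test points")
```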

Example 0.6 (Normal distribution with parameters µ, σ)
The distribution function F : R → R given by

F (x) = (1 / (√(2π) σ)) ∫_{−∞}^{x} e^{−(y−µ)² / (2σ²)} dy

is called the normal distribution function with parameters µ, σ. Again, to see
that F is indeed a distribution function, I will give another useful construction
of the 'normal' random variable. Let U be a uniform [0, 1) random variable

defined on (Ω, F, P ). We can see (exercise) that F is strictly increasing and
continuous with 0 < F (x) < 1 for all x ∈ R.
Define X = F^{−1} ◦ U . Then (exercise) X is a random variable on
(Ω, F, P ) and

P {X ≤ x} = P {F^{−1} ◦ U ≤ x}
          = P {U ≤ F (x)} = F (x) ,   since 0 < F (x) < 1.
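Here is an illustrative Python sketch of the construction X = F^{−1} ◦ U for the normal case: the distribution function is expressed through the error function, F (x) = ½(1 + erf((x − µ)/(σ√2))), and F^{−1} is evaluated by bisection (a simple numerical stand-in, since F^{−1} has no closed form); the parameter values and the Monte Carlo comparison are purely for illustration.

```python
import random
from math import erf, sqrt

mu, sigma = 1.0, 2.0          # illustrative parameters

def F(x):
    """Normal(mu, sigma) distribution function via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def F_inv(u, lo=-50.0, hi=50.0, iters=80):
    """Numerical inverse of F on (0, 1) by bisection (F is continuous and strictly increasing)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(0)
N = 20_000
X = [F_inv(random.random()) for _ in range(N)]   # X = F^{-1} o U

for x in [-2.0, 0.0, 1.0, 3.0]:
    empirical = sum(1 for v in X if v <= x) / N
    print(f"x = {x:4}:  F(x) = {F(x):.4f}   empirical = {empirical:.4f}")
```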

Example 0.7 (Exponential distribution with parameter λ > 0)
The distribution function given by

F (x) = 0 if x ≤ 0
      = 1 − e^{−λx} if x > 0

is called the exponential distribution function with parameter λ. The details of
this example are left as an exercise.
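For the exponential distribution, F^{−1} is available in closed form, F^{−1}(u) = − log(1 − u)/λ, so the inverse-transform construction of the previous example can be written out explicitly; the sketch below (with an illustrative λ of my choosing) compares the simulated X = F^{−1} ◦ U with F .

```python
import random
from math import exp, log

lam = 1.5                                  # illustrative rate parameter
random.seed(0)
N = 100_000

# X = F^{-1}(U) with F(x) = 1 - e^{-lambda x}, hence F^{-1}(u) = -log(1 - u) / lambda.
X = [-log(1.0 - random.random()) / lam for _ in range(N)]

for x in [0.0, 0.5, 1.0, 2.0]:
    F_x = 0.0 if x <= 0 else 1.0 - exp(-lam * x)
    empirical = sum(1 for v in X if v <= x) / N
    print(f"x = {x}:  F(x) = {F_x:.4f}   empirical = {empirical:.4f}")
```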

Remark 0.1 If F : R → R satisfies the following:

(1) lim_{x→−∞} F (x) = 0 and lim_{x→∞} F (x) = 1,

(2) F is increasing and right continuous,

then we can show that there exists a random variable X on a probability
space (Ω, F, P ) such that

P {X ≤ x} = F (x), x ∈ R.

In fact, we have seen this in the above examples. We have seen three
methods to construct a probability space and a random variable on it satisfying
P {X ≤ x} = F (x), x ∈ R. Two methods work only for special cases
which are easy (to construct), and the third method works for any F satisfying
(1) and (2), but the details are difficult (in fact the 'finer details' are beyond the
scope of this course).

Method I: This method is for a 'discrete' F satisfying (1) and (2), i.e.
F 'increases' only at jumps and hence all the jumps add up to 1. Examples of
such F 's are the Bernoulli, Binomial, Poisson, Geometric, etc. The precise
definition of a 'discrete' F is given in the subsection on 'classification of random
variables'.

Here one takes Ω = D = {x_i | i ∈ I}, the set of discontinuities of F (here
I is countable), F = P(Ω), and P defined by

P ({x_i }) = F (x_i ) − F (x_i −) ,   i ∈ I,

i.e. P ({x_i }) is the jump size at x_i . Now define X : Ω → R as X(ω) = ω.
Then

P {X ≤ x} = Σ_{i : x_i ≤ x} P {X = x_i } = Σ_{i : x_i ≤ x} P ({x_i }) = F (x).

(Instruction: Student should carefully look at how each equality follows)
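A small Python sketch of Method I, using as a hypothetical discrete F the one with jumps 1/4, 1/2, 1/4 at the points 0, 1, 2 (the pmf appearing in Example 4.0.26 mentioned later in these notes): Ω = D, P ({x_i }) is the jump size, X(ω) = ω, and F (x) is recovered as the sum of the jumps at points ≤ x.

```python
# Hypothetical discrete F: jumps of sizes 1/4, 1/2, 1/4 at the points 0, 1, 2.
jumps = {0.0: 0.25, 1.0: 0.50, 2.0: 0.25}   # P({x_i}) = F(x_i) - F(x_i-)

# Omega = D, F = power set of Omega (implicit), X(omega) = omega.
def prob(event):
    """P(A) = sum of the jump sizes over the points of the event A ⊆ Omega."""
    return sum(jumps[w] for w in event)

def F(x):
    """F(x) = P{X <= x} = sum over i with x_i <= x of P({x_i})."""
    return prob([w for w in jumps if w <= x])

print([F(x) for x in (-1, 0, 0.5, 1, 1.5, 2, 3)])
# -> [0, 0.25, 0.25, 0.75, 0.75, 1.0, 1.0]
```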

Method II: When F is strictly increasing, one can use the following.
Let U be a uniform (0, 1) random variable on a probability space (Ω, F, P ),
and define X = F^{−1} ◦ U . Then, as explained in the example of the Normal
distribution,

P {X ≤ x} = P {U ≤ F (x)} = F (x).

Method III: This method is very general and works for any F satisfying
(1) and (2). The method relies on defining a probability measure P on (Ω, F) =
(R, BR ) such that
P ((−∞, x]) = F (x), x ∈ R.
(Here note that P is nothing but the distribution µ corresponding to F .)
Now define X : Ω → R by X(ω) = ω. Then

P {X ≤ x} = P ((−∞, x]) = F (x), x ∈ R.

See the example of the Uniform distribution.

A Classification of random variables. Random variables can be classified,
using their distribution functions, according to 'continuity' properties.

Definition 5.3: A random variable X with distribution function F : R → R
is said to be a discrete random variable if

Σ_{x ∈ D} (F (x) − F (x−)) = 1 ,

where D is the set of discontinuities of F .


Here observe that a 'discrete' distribution F exhausts all the probability
mass through its jumps.

Lemma 0.1 If F is a discrete distribution then it is of the form

F (x) = Σ_{i : x_i ∈ D} p_i H_0 (x − x_i ) ,   x ∈ R,

where H_0 denotes the Heaviside function, defined by H_0 (x) = 0 if x < 0 and
H_0 (x) = 1 if x ≥ 0, p_i = P {X = x_i } = F (x_i ) − F (x_i −), and D is the set of
discontinuities of F .

Proof: Let D = {x_i | i ∈ I}, where the index set I is countable, and let X be
a random variable with distribution function F . Then it follows that
P {X = x} = 0 for all x ∉ D. Therefore

F (x) = P {X ≤ x} = Σ_{i ∈ I : x_i ≤ x} P {X = x_i } = Σ_{i ∈ I} p_i H_0 (x − x_i ).
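The representation in Lemma 0.1 is easy to verify numerically. The sketch below (with hypothetical jump points and masses of my choosing) compares Σ_i p_i H_0 (x − x_i ) with the direct definition F (x) = Σ_{i : x_i ≤ x} p_i .

```python
def H0(x):
    """Heaviside function: 0 for x < 0, 1 for x >= 0."""
    return 0.0 if x < 0 else 1.0

# Hypothetical discrete distribution: points x_i with masses p_i summing to 1.
points = [-1.0, 0.5, 2.0]
masses = [0.2, 0.5, 0.3]

def F_heaviside(x):
    """F(x) = sum_i p_i * H0(x - x_i), as in Lemma 0.1."""
    return sum(p * H0(x - xi) for xi, p in zip(points, masses))

def F_direct(x):
    """F(x) = P{X <= x} = sum of the masses at points <= x."""
    return sum(p for xi, p in zip(points, masses) if xi <= x)

for x in [-2, -1, 0, 0.5, 1, 2, 5]:
    assert abs(F_heaviside(x) - F_direct(x)) < 1e-12
print("Heaviside representation agrees with the direct definition")
```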

The distributions in Examples 0.1, 0.2, 0.3 correspond to discrete random
variables.
Definition 5.4 A random variable X whose distribution function F : R → R
is continuous is said to be a random variable with continuous distribution, or,
in short, a continuous random variable.
The distributions given in Examples 0.5, 0.6, 0.7 correspond to continuous
random variables.
Definition 5.5 (Probability mass function)
Let X be a discrete random variable with distribution function F : R → R.
Define f : R → R as follows:

f (x) = F (x) − F (x−) .

Then f is called the probability mass function (pmf) of X.
For example, the pmf of the discrete random variable given in Example
4.0.26 is given by

f (x) = 1/4 if x = 0, 2
      = 1/2 if x = 1
      = 0 otherwise .

It is left as an exercise for the student to write down the pmf of the random
variables in Examples 0.1, 0.2, 0.3.
The pmf of a continuous random variable is the zero function. Hence
the notion of pmf is useless for continuous random variables.
Definition 5.6 (Probability density function)
A continuous random variable X with distribution function F : R → R
is said to have a probability density function (pdf) if there exists a function
f : R → R such that

F (x) = ∫_{−∞}^{x} f (y) dy   for all x ∈ R.

If such an f exists, then it is called the pdf of X.
A continuous random variable with a pdf is simply called an absolutely
continuous random variable.
It is easy to see that if F is differentiable everywhere and its derivative,
denoted by F ′, is a continuous function, then the corresponding random
variable X has a pdf given by f = F ′. This is, however, not a necessary
condition, as the next example shows.

Example 0.8 Define F : R → R as follows:

F (x) = 0 if x < 0
      = x if 0 ≤ x < 1/2
      = 1/2 if 1/2 ≤ x < 1
      = x − 1/2 if 1 ≤ x < 3/2
      = 1 if x ≥ 3/2 .

The student can verify that F is the distribution function of the random
variable given by the random experiment of picking a point 'at random' from
[0, 1/2] ∪ [1, 3/2].
Then F is a distribution function corresponding to a continuous random
variable, but F is not differentiable at x = 1/2 and x = 1. The function

f (x) = 0 if x < 0
      = 1 if 0 ≤ x < 1/2
      = 0 if 1/2 ≤ x < 1
      = 1 if 1 ≤ x < 3/2
      = 0 if x ≥ 3/2

is the pdf of F .
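As a numerical check (with an illustrative Riemann-sum step size of my choosing), one can verify that this f is indeed a density for F : the integral of f over (−∞, x] reproduces F (x), even at the points x = 1/2 and x = 1 where F is not differentiable.

```python
def F(x):
    if x < 0:
        return 0.0
    if x < 0.5:
        return x
    if x < 1:
        return 0.5
    if x < 1.5:
        return x - 0.5
    return 1.0

def f(y):
    return 1.0 if (0 <= y < 0.5) or (1 <= y < 1.5) else 0.0

def integral_f(x, a=-1.0, steps=200_000):
    """Midpoint Riemann sum of f over [a, x]; f vanishes below a = -1."""
    if x <= a:
        return 0.0
    h = (x - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

for x in [-0.5, 0.25, 0.5, 0.75, 1.0, 1.2, 2.0]:
    print(f"x = {x}:  F(x) = {F(x):.4f}   integral of f = {integral_f(x):.4f}")
```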

Distribution function of a transformation of random variables: In this
subsection, we will see how one can write down the distribution function of
Y = ϕ ◦ X in terms of the distribution of X, where ϕ : R → R is Borel
measurable. Note that it is not possible to give an explicit formula in general,
but in some cases one will be able to do so. Here my plan is to give a general
recipe and to illustrate it through some examples. I will give, as an example,
one special class of transformations. Though it is possible to give an explicit
formula in many other cases, I will not do so; instead I will take some examples
and show you how to use the recipe.
General Recipe: The distribution of Y is given by

µ_Y (B) = P {Y ∈ B}
        = P {ϕ(X) ∈ B}
        = P {X ∈ ϕ^{−1} (B)}
        = µ_X (ϕ^{−1} (B)) ,   B ∈ BR .

Hence, by taking B = (−∞, y], y ∈ R, we get the following:

F_Y (y) = µ_X (ϕ^{−1} ((−∞, y])) ,   y ∈ R.

Hence, to compute the distribution function of Y in terms of the distribution
function of X, one needs to identify the set ϕ^{−1} ((−∞, y]). This I will illustrate
in the next example.
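The recipe can also be tried out numerically. In the sketch below, the placeholder choices X uniform on [0, 1) and ϕ(x) = e^x are mine; since this ϕ is strictly increasing, ϕ^{−1} ((−∞, y]) = (−∞, log y] for y > 0 (and ∅ for y ≤ 0), so F_Y (y) = F_X (log y), which is compared with a Monte Carlo estimate of P {ϕ(X) ≤ y}.

```python
import random
from math import exp, log

# Monte Carlo illustration of F_Y(y) = mu_X(phi^{-1}((-inf, y])).
# Placeholder choices: X uniform on [0, 1) and phi(x) = e^x (strictly increasing),
# so phi^{-1}((-inf, y]) = (-inf, log y] for y > 0 and F_Y(y) = F_X(log y).

random.seed(0)
N = 200_000
X = [random.random() for _ in range(N)]          # uniform [0, 1) samples

def F_X(x):
    """Uniform [0, 1) distribution function."""
    return min(max(x, 0.0), 1.0)

def F_Y_exact(y):
    return 0.0 if y <= 0 else F_X(log(y))

for y in [0.5, 1.0, 1.5, 2.0, 3.0]:
    empirical = sum(1 for x in X if exp(x) <= y) / N
    print(f"y = {y}:  F_Y(y) = {F_Y_exact(y):.4f}   empirical = {empirical:.4f}")
```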

Example 0.9 (Reading exercise) Let ϕ : R → R be a continuous function
which is increasing. Then ϕ^{−1} ((−∞, y]) = (−∞, sup ϕ^{−1} (y)]. This implies
that
F_Y (y) = F_X (sup ϕ^{−1} (y)) ,   y ∈ R.
In particular, if ϕ is strictly increasing, then F_Y (y) = F_X (ϕ^{−1} (y)).

Now we will see the proof of ϕ^{−1} ((−∞, y]) = (−∞, sup ϕ^{−1} (y)].

x ∈ ϕ^{−1} ((−∞, y]) ⇒ ϕ(x) ∈ (−∞, y]
                     ⇒ ϕ(x) ≤ y
                     ⇒ x ≤ z for all z ∈ ϕ^{−1} (y), or ϕ(x) = y
                     ⇒ x ∈ (−∞, sup ϕ^{−1} (y)].

The step ϕ(x) ≤ y ⇒ ( x ≤ z for all z ∈ ϕ^{−1} (y), or ϕ(x) = y ) follows from
the following argument: suppose there exists some z ∈ ϕ^{−1} (y) such that x > z;
then ϕ(x) ≥ ϕ(z) = y, and since also ϕ(x) ≤ y, this forces ϕ(x) = y.

Now we prove the reverse inclusion. Suppose x ≤ sup ϕ^{−1} (y). Then
either (I): x ≤ z for some z ∈ ϕ^{−1} (y), or (II): x > z for all z ∈ ϕ^{−1} (y)
and there exists a sequence z_n ∈ ϕ^{−1} (y) with z_n → x.

Now

(I) ⇒ ϕ(x) ≤ ϕ(z) = y, since ϕ is increasing
    ⇒ x ∈ ϕ^{−1} ((−∞, y]).

(II) ⇒ ϕ(x) = lim_{n→∞} ϕ(z_n ) = y (using the continuity of ϕ)
     ⇒ x ∈ ϕ^{−1} (y) ⊆ ϕ^{−1} ((−∞, y]).

This completes the proof of the reverse inclusion. Hence the proof is complete.
Example 0.10 ϕ(x) = x³. (Prototype for ϕ which is strictly increasing and
continuous.) Hence F_Y (y) = F_X (y^{1/3}). Here note that y^{1/3} = −(|y|)^{1/3} for y < 0.
Example 0.11 Let ϕ(x) = x² + 1. (Prototype for ϕ which is continuous but
not monotone, i.e. with a 'turning' point.) Then

ϕ^{−1} ((−∞, y]) = ∅ if y < 1
                 = {0} if y = 1
                 = [−√(y − 1), √(y − 1)] if y > 1.

Hence, for y ≥ 1,

F_Y (y) = µ_X ([−√(y − 1), √(y − 1)]) = F_X (√(y − 1)) − F_X ((−√(y − 1))−).
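A quick numerical illustration of this formula, assuming X is standard normal (my choice), so that F_X = Φ is continuous and F_X ((−√(y − 1))−) = Φ(−√(y − 1)): then F_Y (y) = Φ(√(y − 1)) − Φ(−√(y − 1)) for y ≥ 1, which can be compared with a Monte Carlo estimate of P {X² + 1 ≤ y}.

```python
import random
from math import erf, sqrt

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def F_Y(y):
    """F_Y(y) = F_X(sqrt(y-1)) - F_X(-sqrt(y-1)) for y >= 1 (F_X = Phi is continuous)."""
    if y < 1:
        return 0.0
    r = sqrt(y - 1.0)
    return Phi(r) - Phi(-r)

random.seed(0)
N = 200_000
X = [random.gauss(0.0, 1.0) for _ in range(N)]   # standard normal samples

for y in [0.5, 1.0, 1.5, 2.0, 5.0]:
    empirical = sum(1 for x in X if x * x + 1.0 <= y) / N
    print(f"y = {y}:  F_Y(y) = {F_Y(y):.4f}   empirical = {empirical:.4f}")
```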
Example 0.12 Let ϕ be the Heaviside function, i.e. ϕ(x) = 0 if x < 0 and
ϕ(x) = 1 if x ≥ 0. (Prototype for ϕ which is piece-wise continuous.) Then

ϕ^{−1} ((−∞, y]) = ∅ if y < 0
                 = (−∞, 0) if 0 ≤ y < 1
                 = R if y ≥ 1.

Hence

F_Y (y) = 0 if y < 0
        = F_X (0−) if 0 ≤ y < 1
        = 1 if y ≥ 1.
