
Introduction to Probability Theory

K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay

September 23, 2017



LECTURES 16-17

Chapter 6: Random vectors, joint distributions

In many problems, one often encounters multiple random objects; for example, one may be interested in the future prices of two different stocks in a stock market. Since the price of one stock can affect the price of the other, it is not advisable to analyze them separately. To model such phenomena, we need to introduce many random variables on a single platform (i.e., a probability space).
First we recall some elementary facts about n-dimensional Euclidean space. Let
$$\mathbb{R}^n = \underbrace{\mathbb{R} \times \cdots \times \mathbb{R}}_{n \text{ times}}$$
with the usual metric $d(x, y) = \|x - y\|$, where
$$\|x\| = \sqrt{\sum_{i=1}^{n} |x_i|^2}, \qquad x = (x_1, \dots, x_n) \in \mathbb{R}^n.$$

A subset $O$ of $\mathbb{R}^n$ is said to be open if for each $x \in O$ there exists an $\varepsilon > 0$ such that
$$B(x, \varepsilon) \subseteq O,$$
where
$$B(x, \varepsilon) = \{ y \in \mathbb{R}^n \mid \|x - y\| < \varepsilon \}.$$
A useful fact: any open set can be written as a countable union of sets of the form $\prod_{i=1}^{n} (a_i, b_i)$, $a_i < b_i$; such sets are called open rectangles.

Definition 6.1. The $\sigma$-field generated by all open sets in $\mathbb{R}^n$ is called the Borel $\sigma$-field of subsets of $\mathbb{R}^n$ and is denoted by $\mathcal{B}_{\mathbb{R}^n}$.

Some Borel sets: rectangles, triangles, lines, and points are all Borel sets.

Theorem 0.1 Let
$$\mathcal{I}_n = \{ (-\infty, x_1] \times \cdots \times (-\infty, x_n] \mid (x_1, \dots, x_n) \in \mathbb{R}^n \}.$$
Then
$$\sigma(\mathcal{I}_n) = \mathcal{B}_{\mathbb{R}^n}.$$

Proof. (Reading exercise) We prove the case n = 2; for n ≥ 3 the argument is similar. Note that
$$\mathcal{I}_2 \subseteq \mathcal{B}_{\mathbb{R}^2}.$$
Hence from the definition of $\sigma(\mathcal{I}_2)$ we have
$$\sigma(\mathcal{I}_2) \subseteq \mathcal{B}_{\mathbb{R}^2}.$$
Note that for $(x_1, x_2) \in \mathbb{R}^2$,
$$(-\infty, x_1) \times (-\infty, x_2) = \bigcup_{m=1}^{\infty} \Big( -\infty, x_1 - \tfrac{1}{m} \Big] \times \Big( -\infty, x_2 - \tfrac{1}{m} \Big] \in \sigma(\mathcal{I}_2).$$
For each $x_1, x_2, y_1, y_2 \in \mathbb{R}$ such that $x_1 < y_1$, $x_2 < y_2$, we have
$$(x_1, y_1) \times (x_2, y_2) = \big( (-\infty, y_1) \times (-\infty, y_2) \big) \setminus \Big[ \big( (-\infty, x_1] \times (-\infty, x_2] \big) \cup \big( (-\infty, x_1] \times (-\infty, y_2] \big) \cup \big( (-\infty, y_1] \times (-\infty, x_2] \big) \Big].$$
The right-hand side belongs to $\sigma(\mathcal{I}_2)$: the first product was shown above to be in $\sigma(\mathcal{I}_2)$, and the three products of half-lines are elements of $\mathcal{I}_2$. Hence all open rectangles are in $\sigma(\mathcal{I}_2)$. Since any open set in $\mathbb{R}^2$ can be written as a countable union of open rectangles, all open sets are in $\sigma(\mathcal{I}_2)$. Therefore, from the definition of $\mathcal{B}_{\mathbb{R}^2}$, we get
$$\mathcal{B}_{\mathbb{R}^2} \subseteq \sigma(\mathcal{I}_2).$$
This completes the proof. (The student is advised to try writing down the proof for n = 3.)

Definition 6.2. Let $\Omega$ be a nonempty set (sample space) and $\mathcal{F}$ be a $\sigma$-field of subsets of $\Omega$. A map $X : \Omega \to \mathbb{R}^n$ is called a random vector if $X^{-1}(B) \in \mathcal{F}$ for all $B \in \mathcal{B}_{\mathbb{R}^n}$.

The above definition is equivalent to the following (stated for n = 2): $X$ is a random vector with respect to $\mathcal{F}$ iff
$$\{X_1 \le x_1,\, X_2 \le x_2\} \in \mathcal{F} \quad \text{for all } x_1, x_2 \in \mathbb{R}.$$
From now on we set n = 2.
Exercise 6.1 Let $B$ be a Borel set in $\mathbb{R}$; show that $B \times \mathbb{R}$ is a Borel set in $\mathbb{R}^2$. Hint: use the following technique. Collect all Borel sets $B$ satisfying "$B \times \mathbb{R}$ is a Borel set in $\mathbb{R}^2$". Show that this collection is a $\sigma$-field and that it contains all open sets in $\mathbb{R}$. A simple argument then shows that the collection is all of $\mathcal{B}_{\mathbb{R}}$.

Theorem 0.2 $X : \Omega \to \mathbb{R}^2$ is a random vector iff $X_i$, $i = 1, 2$, are random variables, where $X_i$ denotes the $i$-th component of $X$.

Proof: Let $X$ be a random vector. For $B \in \mathcal{B}_{\mathbb{R}}$,
$$X_1^{-1}(B) = X^{-1}(B \times \mathbb{R}) \in \mathcal{F},$$
since $B \in \mathcal{B}_{\mathbb{R}}$ implies $B \times \mathbb{R} \in \mathcal{B}_{\mathbb{R}^2}$ (see Exercise 6.1). Therefore $X_1$ is a random variable. Similarly, we can show that $X_2$ is a random variable.

Conversely, suppose $X_1, X_2$ are random variables. For $x_1, x_2 \in \mathbb{R}$, set $B_i = (-\infty, x_i]$. Then
$$X^{-1}(B_1 \times B_2) = X_1^{-1}(B_1) \cap X_2^{-1}(B_2) \in \mathcal{F}. \tag{0.1}$$

Set
$$\mathcal{B} = \{ B \in \mathcal{B}_{\mathbb{R}^2} \mid X^{-1}(B) \in \mathcal{F} \}.$$
By (0.1),
$$\mathcal{I}_2 \subseteq \mathcal{B}. \tag{0.2}$$
For $B \in \mathcal{B}$ we have $X^{-1}(B) \in \mathcal{F}$. Hence
$$X^{-1}(B^c) = [X^{-1}(B)]^c \in \mathcal{F},$$
and thus $B^c \in \mathcal{B}$. Similarly,
$$B_1, B_2, \dots \in \mathcal{B} \;\Rightarrow\; X^{-1}(B_n) \in \mathcal{F} \ \text{for all } n \;\Rightarrow\; X^{-1}\Big( \bigcup_{n=1}^{\infty} B_n \Big) = \bigcup_{n=1}^{\infty} X^{-1}(B_n) \in \mathcal{F} \;\Rightarrow\; \bigcup_{n=1}^{\infty} B_n \in \mathcal{B}.$$
Hence $\mathcal{B}$ is a $\sigma$-field. Thus from (0.2) we have
$$\sigma(\mathcal{I}_2) \subseteq \mathcal{B}.$$
Therefore, from Theorem 0.1, we have $\mathcal{B} = \mathcal{B}_{\mathbb{R}^2}$. Hence $X$ is a random vector. This completes the proof.

Theorem 0.3 Let $X = (X_1, X_2)$ be a random vector defined on a probability space $(\Omega, \mathcal{F}, P)$. On $(\mathbb{R}^2, \mathcal{B}_{\mathbb{R}^2})$ define $\mu$ as follows:
$$\mu(B) = P\{X \in B\}, \qquad B \in \mathcal{B}_{\mathbb{R}^2}.$$
Then $\mu$ is a probability measure on $(\mathbb{R}^2, \mathcal{B}_{\mathbb{R}^2})$.

Proof. Since $\{X \in \mathbb{R}^2\} = \Omega$, we have
$$\mu(\mathbb{R}^2) = 1.$$
Let $B_1, B_2, \dots$ be pairwise disjoint elements of $\mathcal{B}_{\mathbb{R}^2}$. Then $X^{-1}(B_1), X^{-1}(B_2), \dots$ are pairwise disjoint and are in $\mathcal{F}$. Hence
$$\mu\Big( \bigcup_{n=1}^{\infty} B_n \Big) = P\Big( \bigcup_{n=1}^{\infty} X^{-1}(B_n) \Big) = \sum_{n=1}^{\infty} P\{X \in B_n\} = \sum_{n=1}^{\infty} \mu(B_n).$$
This completes the proof.
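Although $\mu$ is defined abstractly, it is easy to approximate numerically. The following is a minimal sketch (not part of the notes): it estimates $\mu(B) = P\{X \in B\}$ by relative frequencies, where the choice of a standard normal pair for $X$ and the unit disc for $B$ is purely an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative random vector X = (X1, X2): independent standard
# normal coordinates (an assumption for this demo, not from the notes).
n = 100_000
X = rng.standard_normal((n, 2))

# Estimate mu(B) = P{X in B} for the Borel set B = unit disc
# by the relative frequency of sample points falling in B.
inside = (X[:, 0]**2 + X[:, 1]**2) <= 1.0
print("estimated mu(B):", inside.mean())

# For this X, X1^2 + X2^2 is exponential with mean 2, so the exact
# value is 1 - exp(-1/2), about 0.3935.
print("exact mu(B):   ", 1 - np.exp(-0.5))
```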

Definition 6.3. The probability measure $\mu$ is called the distribution (or law) of the random vector $X$ and is denoted by $\mu_X$.

Definition 6.4. (Joint distribution function)
Let $X = (X_1, X_2)$ be a random vector. Then the function $F : \mathbb{R}^2 \to \mathbb{R}$ given by
$$F(x_1, x_2) = P\{X_1 \le x_1, X_2 \le x_2\}$$
is called the distribution function of $X$ (in other words, the joint distribution function of the random variables $X_1$ and $X_2$).
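As a quick aside (not from the notes), a joint distribution function can be approximated by empirical frequencies from a sample of the random vector. A sketch, where the pair of independent uniforms on (0, 1) is an arbitrary stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample of a random vector (X1, X2): independent uniforms on (0, 1),
# chosen only for illustration.
sample = rng.random((100_000, 2))

def F_hat(x1, x2):
    """Empirical joint distribution function: fraction of sample
    points with X1 <= x1 and X2 <= x2."""
    return np.mean((sample[:, 0] <= x1) & (sample[:, 1] <= x2))

# For independent uniforms, F(x1, x2) = x1 * x2 on [0, 1]^2.
print(F_hat(0.5, 0.8), 0.5 * 0.8)
```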

Theorem 0.4 Let $F$ be the joint distribution function of a random vector $X$. Then $F$ satisfies the following.
(i) (a)
$$\lim_{x_i \to -\infty} F(x_1, x_2) = 0, \quad i = 1, 2,$$
(b)
$$\lim_{x_1 \to \infty,\, x_2 \to \infty} F(x_1, x_2) = 1.$$
(ii) $F$ is right continuous in each argument, i.e.,
$$\lim_{y \downarrow x} F(y, x_2) = F(x, x_2) \quad \text{for all } x \in \mathbb{R} \text{ and all } x_2 \in \mathbb{R},$$
$$\lim_{y \downarrow x} F(x_1, y) = F(x_1, x) \quad \text{for all } x \in \mathbb{R} \text{ and all } x_1 \in \mathbb{R}.$$
(iii) $F$ is nondecreasing in each argument, i.e.,
$$x \le y \;\Rightarrow\; F(x, x_2) \le F(y, x_2) \ \text{for all } x_2, \qquad \text{and} \qquad x \le y \;\Rightarrow\; F(x_1, x) \le F(x_1, y) \ \text{for all } x_1.$$

The proof of the above theorem is an easy exercise for the student.

Given a random vector $X = (X_1, X_2)$, the distribution function of $X_1$, denoted by $F_{X_1}$, is called the marginal distribution function of $X_1$. The marginal distribution function $F_{X_2}$ of $X_2$ is defined similarly. Given the joint distribution function $F$ of $X$, one can recover the corresponding marginal distribution functions as follows:
$$F_{X_1}(x_1) = P\{X_1 \le x_1\} = P\{X_1 \le x_1, X_2 \in \mathbb{R}\} = \lim_{x_2 \to \infty} F(x_1, x_2).$$
Similarly,
$$F_{X_2}(x_2) = \lim_{x_1 \to \infty} F(x_1, x_2).$$
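For a discrete random vector the same recovery amounts to summing the joint pmf over the other coordinate. A minimal sketch with a made-up pmf table:

```python
import numpy as np

# Hypothetical joint pmf of (X1, X2), stored as a matrix with
# p[i, j] = P{X1 = i, X2 = j}; the values are illustrative only.
p = np.array([[0.10, 0.20, 0.05],
              [0.30, 0.15, 0.20]])

# Marginal pmfs: sum the joint pmf over the other coordinate,
# the discrete analogue of letting the other argument of F tend
# to infinity.
p_X1 = p.sum(axis=1)   # P{X1 = i}
p_X2 = p.sum(axis=0)   # P{X2 = j}
print(p_X1)            # [0.35 0.65]
print(p_X2)            # [0.4  0.35 0.25]
```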

Given only the marginal distribution functions of $X_1$ and $X_2$, it is in general impossible to construct the joint distribution function. Note that the marginal distribution functions do not contain information about the dependence of $X_1$ on $X_2$ and vice versa. One can characterize the independence of $X_1$ and $X_2$ in terms of the joint and marginal distribution functions, as in the following theorem. The proof is beyond the scope of this course.

Theorem 0.5 Let $X = (X_1, X_2)$ be a random vector with distribution function $F$. Then $X_1$ and $X_2$ are independent iff
$$F(x_1, x_2) = F_{X_1}(x_1)\, F_{X_2}(x_2), \qquad x_1, x_2 \in \mathbb{R}.$$
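In the discrete case the factorization can be checked directly on the pmf, since it is equivalent to $p(i, j) = p_{X_1}(i)\, p_{X_2}(j)$ for all $i, j$. A small sketch with made-up numbers:

```python
import numpy as np

# Illustrative joint pmf; X1 and X2 are independent iff the pmf
# matrix equals the outer product of its marginals.
p = np.array([[0.12, 0.18],
              [0.28, 0.42]])

p_X1 = p.sum(axis=1)
p_X2 = p.sum(axis=0)
print(np.allclose(p, np.outer(p_X1, p_X2)))  # True: this pmf factorizes
```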

Definition 6.5. (pmf of a discrete random vector) Let $X = (X_1, X_2)$ be a discrete random vector, i.e., $X_1, X_2$ are discrete random variables. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$f(x, y) = P\{X_1 = x, X_2 = y\}, \qquad x, y \in \mathbb{R}.$$
Then $f$ is called the pmf of $X$. As mentioned earlier, we also use the name joint pmf of the random variables $X_1$ and $X_2$.

Example 0.1 Consider the experiment of throwing a die twice independently. Let $X_1$ denote the number of 1's and $X_2$ denote the number of 3's. We write down the joint pmf of $X_1$ and $X_2$:

X1 \ X2 |   0   |   1   |   2
--------+-------+-------+-------
   0    |  4/9  |  2/9  |  1/36
   1    |  2/9  |  1/18 |   0
   2    |  1/36 |   0   |   0

We illustrate how the pmf is calculated. Let $D_i$ denote the face shown by the die in the $i$-th throw. Then
$$f(0, 0) = P\{X_1 = 0, X_2 = 0\} = P\{D_1 \notin \{1, 3\} \text{ and } D_2 \notin \{1, 3\}\} = \frac{4}{6} \times \frac{4}{6} = \frac{4}{9}.$$
The above is an example of a lattice distribution, i.e., 'probability masses' sit only at the 'lattice' points $(i, j)$, $i, j \in \mathbb{Z}$.
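A short simulation (a sketch, not part of the notes) can be used to check the table against relative frequencies:

```python
import numpy as np

rng = np.random.default_rng(2)

# Throw a fair die twice, n times; X1 = number of 1's, X2 = number
# of 3's. Estimate the joint pmf by relative frequencies.
n = 200_000
throws = rng.integers(1, 7, size=(n, 2))
x1 = (throws == 1).sum(axis=1)
x2 = (throws == 3).sum(axis=1)

for i in range(3):
    for j in range(3):
        est = np.mean((x1 == i) & (x2 == j))
        print(f"f({i},{j}) ~ {est:.4f}")
# Compare with the table: f(0,0) = 4/9 ~ 0.4444, f(1,1) = 1/18 ~ 0.0556.
```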

Definition 6.6. (pdf of a continuous random vector) Let $X = (X_1, X_2)$ be a continuous random vector (i.e., $X_1, X_2$ are continuous random variables) with joint distribution function $F$. If there exists a function $f : \mathbb{R}^2 \to \mathbb{R}$ such that
$$F(x_1, x_2) = \int_{-\infty}^{x_2} \int_{-\infty}^{x_1} f(x, y)\, dx\, dy \qquad \text{for all } x_1 \in \mathbb{R},\ x_2 \in \mathbb{R},$$
then $f$ is called the pdf of $X$, in other words, the joint pdf of the random variables $X_1$ and $X_2$.

Theorem 0.6 Let $(X, Y)$ be a continuous random vector with pdf $f$. Then
$$P\{(X, Y) \in A\} = \iint_A f(x, y)\, dx\, dy, \qquad A \in \mathcal{B}_{\mathbb{R}^2}.$$

Proof. (The proof is not part of the syllabus, but it is given as an optional reading exercise.)
Note that the L.H.S. of the equality corresponds to the law of $(X, Y)$.
Let $\mathcal{F}_0$ denote the set of all finite unions of rectangles in $\mathbb{R}^2$. Then $\mathcal{F}_0$ is a field (exercise for the student).

Set
$$\mu_1(B) = \iint_B f(x, y)\, dx\, dy, \quad B \in \mathcal{B}_{\mathbb{R}^2}, \qquad \text{and} \qquad \mu_2(B) = P\{(X, Y) \in B\}, \quad B \in \mathcal{B}_{\mathbb{R}^2}.$$
Then $\mu_1, \mu_2$ are probability measures on $\mathcal{B}_{\mathbb{R}^2}$ and $\mu_1 = \mu_2$ on $\mathcal{F}_0$. Hence, using the extension theorem, we have
$$\mu_1 = \mu_2 \text{ on } \mathcal{B}_{\mathbb{R}^2},$$
i.e.,
$$P\{(X, Y) \in A\} = \iint_A f(x, y)\, dx\, dy, \qquad A \in \mathcal{B}_{\mathbb{R}^2}.$$

Remark 0.1 The integral in Theorem 0.6 is in general not understood in the Riemann sense, but for all our computations the integral will be a Riemann integral. We will mostly consider Riemann integrable functions on domains which are either rectangles or of the form
$$A = \{(x, y) \mid a \le x \le b,\ \varphi_1(x) \le y \le \varphi_2(x)\},$$
for some continuous functions $\varphi_1, \varphi_2 : [a, b] \to \mathbb{R}$, $\varphi_1 \le \varphi_2$, or of the form
$$A = \{(x, y) \mid c \le y \le d,\ \psi_1(y) \le x \le \psi_2(y)\},$$
for some continuous functions $\psi_1, \psi_2 : [c, d] \to \mathbb{R}$, $\psi_1 \le \psi_2$. We call such domains elementary domains.
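In computations, the double integral over an elementary domain is evaluated as an iterated integral. A sketch (the uniform density on the unit square is an illustrative assumption):

```python
from scipy.integrate import dblquad

# P{(X, Y) in A} over the elementary domain
# A = {(x, y) | 0 <= x <= 1, 0 <= y <= x}, computed as an iterated
# integral. Assumed density: f = 1 on the unit square, 0 elsewhere.
def f(y, x):      # dblquad expects f(y, x), with y the inner variable
    return 1.0 if (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0) else 0.0

prob, err = dblquad(f, 0.0, 1.0, lambda x: 0.0, lambda x: x)
print(prob)       # 0.5, the area of the triangle under y = x
```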

Now we will give some important classes of multivariate distributions.

Example 0.2 (Uniform distribution on the disc of radius R) Consider the function
$$f(x, y) = \begin{cases} \dfrac{1}{\pi R^2} & \text{if } x^2 + y^2 \le R^2, \\[4pt] 0 & \text{otherwise.} \end{cases}$$
Then $f$ is called the joint density of the uniform distribution on $\{(x, y) \mid x^2 + y^2 \le R^2\}$. In fact, the student can try to construct a pair of random variables $(X, Y)$ on some probability space $(\Omega, \mathcal{F}, P)$ such that its joint distribution function is given by
$$P\{X \le x, Y \le y\} = \int_{-\infty}^{y} \int_{-\infty}^{x} f(u, v)\, du\, dv.$$
Any guess? Hint: look at the uniform distribution on (0, 1).
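Following the hint, one concrete construction (a sketch, not part of the notes) starts from independent uniforms on (0, 1) and uses rejection sampling:

```python
import numpy as np

rng = np.random.default_rng(3)

def uniform_on_disc(R, rng):
    """Return one point uniformly distributed on the disc of
    radius R, built from independent uniforms on (0, 1)."""
    while True:
        # (U, V) uniform on the square [-R, R] x [-R, R]
        u, v = R * (2.0 * rng.random(2) - 1.0)
        if u * u + v * v <= R * R:   # keep only points in the disc
            return u, v

pts = np.array([uniform_on_disc(2.0, rng) for _ in range(10_000)])
# Sanity check: the fraction landing in the inner disc of radius 1
# should be close to (pi * 1**2) / (pi * 2**2) = 0.25.
print(np.mean(pts[:, 0]**2 + pts[:, 1]**2 <= 1.0))
```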

Example 0.3 Let $X, Y$ be two random variables with joint pdf given by
$$f(x, y) = \frac{\sqrt{3}}{4\pi}\, e^{-(x^2 - xy + y^2)/2}, \qquad x, y \in \mathbb{R}.$$
If $f_X(\cdot), f_Y(\cdot)$ denote the marginal pdfs of $X$ and $Y$ respectively, then
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy = \frac{\sqrt{3}}{4\pi} \int_{-\infty}^{\infty} e^{-(x^2 - xy + y^2)/2}\, dy = \frac{\sqrt{3}}{4\pi}\, e^{-3x^2/8} \int_{-\infty}^{\infty} e^{-(y - x/2)^2/2}\, dy = \frac{\sqrt{3}}{2\sqrt{2\pi}}\, e^{-3x^2/8},$$
where we completed the square, $x^2 - xy + y^2 = (y - x/2)^2 + 3x^2/4$, and used $\int_{-\infty}^{\infty} e^{-u^2/2}\, du = \sqrt{2\pi}$. Therefore
$$X \sim N\Big(0, \frac{4}{3}\Big).$$
Here $X \sim N(\mu, \sigma^2)$ means $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$. Similarly,
$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx = \frac{\sqrt{3}}{2\sqrt{2\pi}}\, e^{-3y^2/8},$$
and therefore
$$Y \sim N\Big(0, \frac{4}{3}\Big).$$
Also note that $X$ and $Y$ are dependent, since
$$f(x, y) \neq f_X(x)\, f_Y(y);$$
see exercise.
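A quick numerical check of this example (a sketch, not part of the notes): integrate out y to recover the marginal and compare it with the N(0, 4/3) density, then exhibit the failure of factorization at one point.

```python
import numpy as np
from scipy.integrate import quad

C = np.sqrt(3) / (4 * np.pi)

def f(x, y):
    # Joint pdf of Example 0.3
    return C * np.exp(-(x**2 - x * y + y**2) / 2)

# Marginal of X at a few points vs. the N(0, 4/3) density.
for x in (0.0, 0.5, 1.0):
    fx, _ = quad(lambda y: f(x, y), -np.inf, np.inf)
    normal = np.sqrt(3) / (2 * np.sqrt(2 * np.pi)) * np.exp(-3 * x**2 / 8)
    print(x, fx, normal)              # the two columns agree

# Dependence: the joint pdf does not factor into the marginals.
fx1, _ = quad(lambda y: f(1.0, y), -np.inf, np.inf)
fy1, _ = quad(lambda x: f(x, 1.0), -np.inf, np.inf)
print(f(1.0, 1.0), fx1 * fy1)         # clearly different values
```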
