
Introduction to Probability Theory

K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay

September 9, 2017

LECTURES 14-15

Example 0.1 (Bernoulli distribution) Let (Ω, F, P ) be a probability space
and A ∈ F with p = P (A). Tossing a p-coin gives such a probability space
with an event A. Set X = I_A , the indicator of A, and note that X takes the
two values 0 and 1.
The distribution function of X is given by

F (x) = 0 if x < 0
= 1 − p if 0 ≤ x < 1
= 1 if x ≥ 1 .

The distribution of X is given by

µ(B) = Σ_{k ∈ B ∩ {0,1}} p_k ,   B ∈ BR ,

where p_0 = 1 − p, p_1 = p. The above function F and the probability measure
µ on (R, BR ) are called respectively the Bernoulli distribution function and
the Bernoulli distribution. Thus X = I_A is an example of a Bernoulli(p)
random variable.

Example 0.2 (Binomial distribution with parameters (n, p)). Let X_1 , X_2 , · · · , X_n
be n independent Bernoulli(p) random variables defined on a probability
space. In fact, one can define such independent Bernoulli random variables as
follows: toss a p-coin n times independently and let X_k = 1 if the kth toss is
H and X_k = 0 if the kth toss is T. Set X = X_1 + · · · + X_n . Then

P {X = k} = P {X_i = 1 for exactly k indices i and X_i = 0 otherwise}
          = \binom{n}{k} p^k (1 − p)^{n−k} .

Hence the distribution function of X is given by

F (x) = 0 if x < 0
      = \binom{n}{0} (1 − p)^n if 0 ≤ x < 1
      = \binom{n}{0} (1 − p)^n + \binom{n}{1} p (1 − p)^{n−1} if 1 ≤ x < 2
      = Σ_{i=0}^{k} \binom{n}{i} p^i (1 − p)^{n−i} if k ≤ x < k + 1, k = 2, . . . , n − 1
      = 1 if x ≥ n ,

and the distribution of X is given by

µ_X (B) = Σ_{k ∈ B ∩ {0,1,··· ,n}} \binom{n}{k} p^k (1 − p)^{n−k} ,   B ∈ BR .

The above F and µ_X are called the Binomial (n, p) distribution function and
the Binomial (n, p) distribution respectively. A random variable with a Binomial
distribution as its law (distribution) is called a Binomial random variable. In
the beginning we have seen an example of a Binomial (n, p) random variable.
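As an illustration, here is a short Python sketch (with the illustrative values n = 10, p = 0.3, which are my own choices) that builds the Binomial (n, p) distribution function from the masses \binom{n}{k} p^k (1 − p)^{n−k} and compares it with the empirical frequencies of X = X_1 + · · · + X_n obtained by simulating n independent p-coin tosses.

```python
import random
from math import comb

n, p = 10, 0.3          # illustrative parameters
masses = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def F(x):
    """Binomial(n, p) distribution function: sum of the masses p_k over k <= x."""
    return sum(m for k, m in enumerate(masses) if k <= x)

# Simulate X = X_1 + ... + X_n with the X_i independent Bernoulli(p).
random.seed(0)
N = 100_000
samples = [sum(1 for _ in range(n) if random.random() < p) for _ in range(N)]

for x in [-1, 0, 2.5, 5, n]:
    empirical = sum(1 for s in samples if s <= x) / N
    print(f"x = {x:5}:  F(x) = {F(x):.4f}   empirical P{{X <= x}} = {empirical:.4f}")
```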

Example 0.3 (Poisson distribution with parameter λ). On (R, BR ) define the
probability measure

µ(B) = Σ_{k ∈ B ∩ {0,1,··· }} λ^k e^{−λ} / k! ,   B ∈ BR .

Then µ defines a probability measure on BR and is called the Poisson distribution
with parameter λ. The Poisson distribution function is given by

F (x) = 0 if x < 0
      = Σ_{k = 0,1,2,··· : k ≤ x} λ^k e^{−λ} / k! if x ≥ 0 .

Question: Is F indeed a distribution function? From the definition
of a distribution function (which I have given in the beginning of this chapter),
F is a distribution function if there exists a random variable X such that
F (x) = P {X ≤ x} for all x ∈ R.

I will give one such construction. Observe that X should take values
from {0, 1, 2, · · · } = {0} ∪ N. So take Ω = {0} ∪ N, F = P(Ω) and

P (A) = Σ_{k ∈ A} λ^k e^{−λ} / k! ,   A ⊆ Ω.

On this probability space, define X : Ω → R by X(ω) = ω. Then X is a
random variable and

P {X ≤ x} = Σ_{k ∈ {0}∪N : k ≤ x} λ^k e^{−λ} / k! = F (x) ,   x ≥ 0,

and P {X ≤ x} = 0 = F (x), x < 0, i.e. F is the distribution function of X.


Exercise: Give all the details.
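The construction above can also be checked numerically. The following Python sketch (with an illustrative λ and a truncation of Ω = {0} ∪ N at a large K, purely for computational purposes) computes P (A) = Σ_{k∈A} λ^k e^{−λ}/k!, builds F (x) = P {X ≤ x}, and checks that the total mass is essentially 1 and that F is non-decreasing.

```python
from math import exp, factorial

lam = 2.5                     # illustrative parameter lambda
K = 200                       # truncation level, for numerical purposes only

def mass(k):
    """Poisson mass lambda^k e^{-lambda} / k!."""
    return lam**k * exp(-lam) / factorial(k)

def P(A):
    """Probability of A, a subset of {0, 1, ..., K}, under the Poisson measure."""
    return sum(mass(k) for k in A)

def F(x):
    """Distribution function F(x) = P{X <= x} for X(omega) = omega."""
    if x < 0:
        return 0.0
    return P(range(0, min(int(x), K) + 1))

# The masses should add up to (essentially) 1, and F should be non-decreasing.
print("total mass:", P(range(K + 1)))
xs = [-1, 0, 0.5, 1, 2.7, 10, 50]
print([round(F(x), 4) for x in xs])
assert all(F(a) <= F(b) for a, b in zip(xs, xs[1:]))
```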

Example 0.4 (Geometric distribution) On (R, BR ) define the probability
measure µ by

µ({k}) = p(1 − p)^{k−1} ,   k = 1, 2, · · · ,

where 0 < p < 1. Then µ defines a probability measure on BR and is called the
geometric distribution with parameter p.
The student may add the details as in the previous example.

Example 0.5 (Uniform distribution on [0, 1)) The distribution function
given by

F (x) = 0 if x ≤ 0
      = x if 0 < x ≤ 1
      = 1 if x ≥ 1

is called the Uniform [0, 1) distribution function.

As discussed above, one first needs to check that the above is indeed a
distribution function, i.e. we need to construct a random variable X
such that P {X ≤ x} = F (x), x ∈ R.
To this end, we first describe a probability space. On (R, BR ), define the
probability measure µ such that

µ(B) = l(B ∩ [0, 1))

when the Borel set B is an interval, where l(B ∩ [0, 1)) denotes the length of
the interval B ∩ [0, 1) (taken to be 0 when the intersection is empty). Now take
(Ω, F, P ) as (R, BR , µ) and define X : Ω → R as X(ω) = ω, the identity function.
Then X is a random variable (exercise) and

P {X ≤ x} = P ((−∞, x]) = l((−∞, x] ∩ [0, 1)) = F (x) ,   x ∈ R.

The probability measure µ is called the uniform distribution on [0, 1).
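A minimal sketch of the above construction, assuming the usual convention that the length of an interval such as [a, b) is b − a (and 0 for the empty set): it evaluates l((−∞, x] ∩ [0, 1)) and checks that this agrees with the piecewise formula for F .

```python
def F_via_length(x):
    """l((-inf, x] ∩ [0, 1)): length of the interval (-inf, x] ∩ [0, 1)."""
    # The intersection is empty for x < 0, equals [0, x] for 0 <= x < 1, and [0, 1) for x >= 1.
    return min(max(x, 0.0), 1.0)

def F_piecewise(x):
    """Uniform [0, 1) distribution function as given in the notes."""
    if x <= 0:
        return 0.0
    return x if x <= 1 else 1.0

for x in [-2, 0, 0.25, 0.999, 1, 3]:
    assert abs(F_via_length(x) - F_piecewise(x)) < 1e-12
print("length formula and piecewise formula agree on the test points")
```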

Example 0.6 (Normal distribution with parameters µ, σ)
The distribution function F : R → R given by

F (x) = (1 / (√(2π) σ)) ∫_{−∞}^{x} e^{−(y−µ)² / (2σ²)} dy

is called the normal distribution function with parameters µ, σ. Again, to see
that F is indeed a distribution function, I will give another useful construction
of the 'normal' random variable. Let U be a uniform [0, 1) random variable

defined on (Ω, F, P ). We can see (exercise) that F is strictly increasing and
continuous with 0 < F (x) < 1 for all x ∈ R.
Define X = F^{−1} ◦ U . Then (exercise) X is a random variable on
(Ω, F, P ) and

P {X ≤ x} = P {F^{−1} ◦ U ≤ x}
          = P {U ≤ F (x)} = F (x) ,   since 0 < F (x) < 1.
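Here is an illustrative Python sketch of the construction X = F^{−1} ◦ U for the normal case: the distribution function is expressed through the error function, F (x) = ½(1 + erf((x − µ)/(σ√2))), and F^{−1} is evaluated by bisection (a simple numerical stand-in, since F^{−1} has no closed form); the parameter values and the Monte Carlo comparison are purely for illustration.

```python
import random
from math import erf, sqrt

mu, sigma = 1.0, 2.0          # illustrative parameters

def F(x):
    """Normal(mu, sigma) distribution function via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def F_inv(u, lo=-50.0, hi=50.0, iters=80):
    """Numerical inverse of F on (0, 1) by bisection (F is continuous and strictly increasing)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(0)
N = 20_000
X = [F_inv(random.random()) for _ in range(N)]   # X = F^{-1} o U

for x in [-2.0, 0.0, 1.0, 3.0]:
    empirical = sum(1 for v in X if v <= x) / N
    print(f"x = {x:4}:  F(x) = {F(x):.4f}   empirical = {empirical:.4f}")
```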

Example 0.7 (Exponential distribution with parameter λ > 0)
The distribution function given by

F (x) = 0 if x ≤ 0
      = 1 − e^{−λx} if x > 0

is called the exponential distribution function with parameter λ. The details of
this example are left as an exercise.
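For the exponential distribution, F^{−1} is available in closed form, F^{−1}(u) = − log(1 − u)/λ, so the inverse-transform construction of the previous example can be written out explicitly; the sketch below (with an illustrative λ of my choosing) compares the simulated X = F^{−1} ◦ U with F .

```python
import random
from math import exp, log

lam = 1.5                                  # illustrative rate parameter
random.seed(0)
N = 100_000

# X = F^{-1}(U) with F(x) = 1 - e^{-lambda x}, hence F^{-1}(u) = -log(1 - u) / lambda.
X = [-log(1.0 - random.random()) / lam for _ in range(N)]

for x in [0.0, 0.5, 1.0, 2.0]:
    F_x = 0.0 if x <= 0 else 1.0 - exp(-lam * x)
    empirical = sum(1 for v in X if v <= x) / N
    print(f"x = {x}:  F(x) = {F_x:.4f}   empirical = {empirical:.4f}")
```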

Remark 0.1 If F : R → R satisfies the following:

(1) lim_{x→−∞} F (x) = 0 and lim_{x→∞} F (x) = 1,

(2) F is increasing and right continuous,

then we can show that there exists a random variable X on a probability
space (Ω, F, P ) such that

P {X ≤ x} = F (x), x ∈ R.

In fact, we have seen this in the above examples. We have seen three
methods to construct a probability space and a random variable on it satisfying
P {X ≤ x} = F (x), x ∈ R. Two methods work only for special cases
which are easy (to construct), and the third method works for any F satisfying
(1) and (2), but the details are difficult (in fact the 'finer details' are beyond the
scope of this course).

Method I: This method is for a 'discrete' F satisfying (1) and (2), i.e.
F 'increases' only at jumps and hence all the jumps add up to 1. Examples of
such F 's are the Bernoulli, Binomial, Poisson, Geometric, etc. The precise
definition of a 'discrete' F is given in the subsection on 'classification of random
variables'.

Here one takes Ω = D = {x_i | i ∈ I}, the set of discontinuities of F (here
I is countable), F = P(Ω), and P defined by

P ({x_i }) = F (x_i ) − F (x_i −) ,   i ∈ I,

i.e. P ({x_i }) is the jump size at x_i . Now define X : Ω → R as X(ω) = ω.
Then

P {X ≤ x} = Σ_{i : x_i ≤ x} P {X = x_i } = Σ_{i : x_i ≤ x} P ({x_i }) = F (x).

(Instruction: Student should carefully look at how each equality follows)
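A small Python sketch of Method I, using as a hypothetical discrete F the one with jumps 1/4, 1/2, 1/4 at the points 0, 1, 2 (the pmf appearing in Example 4.0.26 mentioned later in these notes): Ω = D, P ({x_i }) is the jump size, X(ω) = ω, and F (x) is recovered as the sum of the jumps at points ≤ x.

```python
# Hypothetical discrete F: jumps of sizes 1/4, 1/2, 1/4 at the points 0, 1, 2.
jumps = {0.0: 0.25, 1.0: 0.50, 2.0: 0.25}   # P({x_i}) = F(x_i) - F(x_i-)

# Omega = D, F = power set of Omega (implicit), X(omega) = omega.
def prob(event):
    """P(A) = sum of the jump sizes over the points of the event A ⊆ Omega."""
    return sum(jumps[w] for w in event)

def F(x):
    """F(x) = P{X <= x} = sum over i with x_i <= x of P({x_i})."""
    return prob([w for w in jumps if w <= x])

print([F(x) for x in (-1, 0, 0.5, 1, 1.5, 2, 3)])
# -> [0, 0.25, 0.25, 0.75, 0.75, 1.0, 1.0]
```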

Method II: When F is strictly increasing, one can use the following.
Let U be a uniform (0, 1) random variable on a probability space (Ω, F, P ),
and define X = F^{−1} ◦ U . Then, as explained in the example of the Normal
distribution,

P {X ≤ x} = P {U ≤ F (x)} = F (x).

Method III: This method is very general and works for any F satisfying
(1) and (2). The method relies on defining a probability measure P on (Ω, F) =
(R, BR ) such that
P ((−∞, x]) = F (x), x ∈ R.
(Here note that P is nothing but the distribution µ corresponding to F .)
Now define X : Ω → R by X(ω) = ω. Then

P {X ≤ x} = P ((−∞, x]) = F (x), x ∈ R.

See the example of the Uniform distribution.

A Classification of random variables. Random variables can be classified,
using their distribution functions, according to 'continuity' properties.

Definition 5.3: A random variable X with distribution function F : R → R
is said to be a discrete random variable if

Σ_{x ∈ D} (F (x) − F (x−)) = 1 ,

where D is the set of discontinuities of F .


Here observe that a 'discrete' distribution F exhausts all the probability
mass through its jumps.

Lemma 0.1 If F is a discrete distribution then it is of the form

F (x) = Σ_{i : x_i ∈ D} p_i H_0 (x − x_i ) ,   x ∈ R,

where H_0 denotes the Heaviside function, defined by H_0 (x) = 0 if x < 0 and
H_0 (x) = 1 if x ≥ 0, p_i = P {X = x_i } = F (x_i ) − F (x_i −), and D is the set of
discontinuities of F .

Proof: Let D = {x_i | i ∈ I}, where the index set I is countable, and let X be
a random variable with distribution function F . Then it follows that
P {X = x} = 0 for all x ∉ D. Therefore

F (x) = P {X ≤ x} = Σ_{i ∈ I : x_i ≤ x} P {X = x_i } = Σ_{i ∈ I} p_i H_0 (x − x_i ).
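The representation in Lemma 0.1 is easy to verify numerically. The sketch below (with hypothetical jump points and masses of my choosing) compares Σ_i p_i H_0 (x − x_i ) with the direct definition F (x) = Σ_{i : x_i ≤ x} p_i .

```python
def H0(x):
    """Heaviside function: 0 for x < 0, 1 for x >= 0."""
    return 0.0 if x < 0 else 1.0

# Hypothetical discrete distribution: points x_i with masses p_i summing to 1.
points = [-1.0, 0.5, 2.0]
masses = [0.2, 0.5, 0.3]

def F_heaviside(x):
    """F(x) = sum_i p_i * H0(x - x_i), as in Lemma 0.1."""
    return sum(p * H0(x - xi) for xi, p in zip(points, masses))

def F_direct(x):
    """F(x) = P{X <= x} = sum of the masses at points <= x."""
    return sum(p for xi, p in zip(points, masses) if xi <= x)

for x in [-2, -1, 0, 0.5, 1, 2, 5]:
    assert abs(F_heaviside(x) - F_direct(x)) < 1e-12
print("Heaviside representation agrees with the direct definition")
```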

The distributions in Examples 0.1, 0.2, 0.3 correspond to discrete random
variables.
Definition 5.4 A random variable X whose distribution function F : R → R
is continuous is said to be a random variable with continuous distribution, or,
in short, a continuous random variable.
The distributions given in Examples 0.5, 0.6, 0.7 correspond to continuous
random variables.
Definition 5.5 (Probability mass function)
Let X be a discrete random variable with distribution function F : R → R.
Define f : R → R as follows:

f (x) = F (x) − F (x−) .

Then f is called the probability mass function (pmf) of X.
For example, the pmf of the discrete random variable given in Example
4.0.26 is given by

f (x) = 1/4 if x = 0, 2
      = 1/2 if x = 1
      = 0 otherwise .

It is left as an exercise for the student to write down the pmf of the random
variables in Examples 0.1, 0.2, 0.3.
The pmf of a continuous random variable is the zero function. Hence
the notion of pmf is useless for continuous random variables.
Definition 5.6 (Probability density function)
A continuous random variable X with distribution function F : R → R
is said to have a probability density function (pdf) if there exists a function
f : R → R such that

F (x) = ∫_{−∞}^{x} f (y) dy   for all x ∈ R.

If such an f exists, then it is called the pdf of X.
A continuous random variable with a pdf is simply called an absolutely
continuous random variable.
It is easy to see that if F is differentiable everywhere and its derivative,
denoted by F ′, is a continuous function, then the corresponding random
variable X has a pdf given by f = F ′. This is, however, not a necessary
condition, as the next example shows.

Example 0.8 Define F : R → R as follows:

F (x) = 0 if x < 0
      = x if 0 ≤ x < 1/2
      = 1/2 if 1/2 ≤ x < 1
      = x − 1/2 if 1 ≤ x < 3/2
      = 1 if x ≥ 3/2 .

The student can verify that F is the distribution function of the random
variable given by the random experiment of picking a point 'at random' from
[0, 1/2] ∪ [1, 3/2].
Then F is a distribution function corresponding to a continuous random
variable, but F is not differentiable at x = 1/2 and x = 1. The function

f (x) = 0 if x < 0
      = 1 if 0 ≤ x < 1/2
      = 0 if 1/2 ≤ x < 1
      = 1 if 1 ≤ x < 3/2
      = 0 if x ≥ 3/2

is the pdf of F .
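As a numerical check (with an illustrative Riemann-sum step size of my choosing), one can verify that this f is indeed a density for F : the integral of f over (−∞, x] reproduces F (x), even at the points x = 1/2 and x = 1 where F is not differentiable.

```python
def F(x):
    if x < 0:
        return 0.0
    if x < 0.5:
        return x
    if x < 1:
        return 0.5
    if x < 1.5:
        return x - 0.5
    return 1.0

def f(y):
    return 1.0 if (0 <= y < 0.5) or (1 <= y < 1.5) else 0.0

def integral_f(x, a=-1.0, steps=200_000):
    """Midpoint Riemann sum of f over [a, x]; f vanishes below a = -1."""
    if x <= a:
        return 0.0
    h = (x - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

for x in [-0.5, 0.25, 0.5, 0.75, 1.0, 1.2, 2.0]:
    print(f"x = {x}:  F(x) = {F(x):.4f}   integral of f = {integral_f(x):.4f}")
```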

Distribution function of a transformation of random variables: In this
subsection, we will see how one can write down the distribution function of
Y = ϕ ◦ X in terms of the distribution of X, where ϕ : R → R is Borel
measurable. Note that it is not possible to give an explicit formula in general,
but in some cases one will be able to do so. Here my plan is to give a general
recipe and to illustrate it through some examples. I will give, as an example,
one special class of transformations. Though it is possible to give an explicit
formula in many other cases, I will not do so; instead I will take some examples
and show you how to use the recipe.
General Recipe: The distribution of Y is given by

µ_Y (B) = P {Y ∈ B}
        = P {ϕ(X) ∈ B}
        = P {X ∈ ϕ^{−1} (B)}
        = µ_X (ϕ^{−1} (B)) ,   B ∈ BR .

Hence, by taking B = (−∞, y], y ∈ R, we get the following:

F_Y (y) = µ_X (ϕ^{−1} ((−∞, y])) ,   y ∈ R.

Hence, to compute the distribution function of Y in terms of the distribution
function of X, one needs to identify the set ϕ^{−1} ((−∞, y]). This I will illustrate
in the next example.
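The recipe can also be tried out numerically. In the sketch below, the placeholder choices X uniform on [0, 1) and ϕ(x) = e^x are mine; since this ϕ is strictly increasing, ϕ^{−1} ((−∞, y]) = (−∞, log y] for y > 0 (and ∅ for y ≤ 0), so F_Y (y) = F_X (log y), which is compared with a Monte Carlo estimate of P {ϕ(X) ≤ y}.

```python
import random
from math import exp, log

# Monte Carlo illustration of F_Y(y) = mu_X(phi^{-1}((-inf, y])).
# Placeholder choices: X uniform on [0, 1) and phi(x) = e^x (strictly increasing),
# so phi^{-1}((-inf, y]) = (-inf, log y] for y > 0 and F_Y(y) = F_X(log y).

random.seed(0)
N = 200_000
X = [random.random() for _ in range(N)]          # uniform [0, 1) samples

def F_X(x):
    """Uniform [0, 1) distribution function."""
    return min(max(x, 0.0), 1.0)

def F_Y_exact(y):
    return 0.0 if y <= 0 else F_X(log(y))

for y in [0.5, 1.0, 1.5, 2.0, 3.0]:
    empirical = sum(1 for x in X if exp(x) <= y) / N
    print(f"y = {y}:  F_Y(y) = {F_Y_exact(y):.4f}   empirical = {empirical:.4f}")
```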

Example 0.9 (Reading exercise) Let ϕ : R → R be a continuous function
which is increasing. Then ϕ^{−1} ((−∞, y]) = (−∞, sup ϕ^{−1} (y)]. This implies
that
F_Y (y) = F_X (sup ϕ^{−1} (y)) ,   y ∈ R.
In particular, if ϕ is strictly increasing, then F_Y (y) = F_X (ϕ^{−1} (y)).

Now we will see the proof of ϕ^{−1} ((−∞, y]) = (−∞, sup ϕ^{−1} (y)].

x ∈ ϕ^{−1} ((−∞, y]) ⇒ ϕ(x) ∈ (−∞, y]
                     ⇒ ϕ(x) ≤ y
                     ⇒ x ≤ z for all z ∈ ϕ^{−1} (y), or ϕ(x) = y
                     ⇒ x ∈ (−∞, sup ϕ^{−1} (y)].

The step ϕ(x) ≤ y ⇒ ( x ≤ z for all z ∈ ϕ^{−1} (y), or ϕ(x) = y ) follows from
the following argument: suppose there exists some z ∈ ϕ^{−1} (y) such that x > z;
then ϕ(x) ≥ ϕ(z) = y, and since also ϕ(x) ≤ y, this forces ϕ(x) = y.

Now we prove the reverse inclusion. Suppose x ≤ sup ϕ^{−1} (y). Then
either (I): x ≤ z for some z ∈ ϕ^{−1} (y), or (II): x > z for all z ∈ ϕ^{−1} (y)
and there exists a sequence z_n ∈ ϕ^{−1} (y) with z_n → x.

Now

(I) ⇒ ϕ(x) ≤ ϕ(z) = y, since ϕ is increasing
    ⇒ x ∈ ϕ^{−1} ((−∞, y]).

(II) ⇒ ϕ(x) = lim_{n→∞} ϕ(z_n ) = y (using the continuity of ϕ)
     ⇒ x ∈ ϕ^{−1} (y) ⊆ ϕ^{−1} ((−∞, y]).

This completes the proof of the reverse inclusion. Hence the proof is complete.
Example 0.10 ϕ(x) = x³. (Prototype for ϕ which is strictly increasing and
continuous.) Hence F_Y (y) = F_X (y^{1/3}). Here note that y^{1/3} = −(|y|)^{1/3} for y < 0.
Example 0.11 Let ϕ(x) = x² + 1. (Prototype for ϕ which is continuous but
not monotone, i.e. with a 'turning' point.) Then

ϕ^{−1} ((−∞, y]) = ∅ if y < 1
                 = {0} if y = 1
                 = [−√(y − 1), √(y − 1)] if y > 1.

Hence, for y ≥ 1,

F_Y (y) = µ_X ([−√(y − 1), √(y − 1)]) = F_X (√(y − 1)) − F_X ((−√(y − 1))−).
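A quick numerical illustration of this formula, assuming X is standard normal (my choice), so that F_X = Φ is continuous and F_X ((−√(y − 1))−) = Φ(−√(y − 1)): then F_Y (y) = Φ(√(y − 1)) − Φ(−√(y − 1)) for y ≥ 1, which can be compared with a Monte Carlo estimate of P {X² + 1 ≤ y}.

```python
import random
from math import erf, sqrt

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def F_Y(y):
    """F_Y(y) = F_X(sqrt(y-1)) - F_X(-sqrt(y-1)) for y >= 1 (F_X = Phi is continuous)."""
    if y < 1:
        return 0.0
    r = sqrt(y - 1.0)
    return Phi(r) - Phi(-r)

random.seed(0)
N = 200_000
X = [random.gauss(0.0, 1.0) for _ in range(N)]   # standard normal samples

for y in [0.5, 1.0, 1.5, 2.0, 5.0]:
    empirical = sum(1 for x in X if x * x + 1.0 <= y) / N
    print(f"y = {y}:  F_Y(y) = {F_Y(y):.4f}   empirical = {empirical:.4f}")
```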
Example 0.12 Let ϕ be the Heaviside function, i.e. ϕ(x) = 0 if x < 0 and
ϕ(x) = 1 if x ≥ 0. (Prototype for ϕ which is piece-wise continuous.) Then

ϕ^{−1} ((−∞, y]) = ∅ if y < 0
                 = (−∞, 0) if 0 ≤ y < 1
                 = R if y ≥ 1.

Hence

F_Y (y) = 0 if y < 0
        = F_X (0−) if 0 ≤ y < 1
        = 1 if y ≥ 1.
