
Probability Theory

Taejin Kim
CUHK Business School
5-7 September, 2013
Taejin Kim (CUHK Business School) Probability Theory 5-7 September, 2013 1 / 59
Goal
Mathematical foundation for further study in quantitative finance
1. Probability
2. Stochastic Process
3. Econometrics
Contacts
Instructor
Taejin Kim
taejinkim@baf.cuhk.edu.hk
Tutor
Kelvin Lam
kelvinlam@baf.cuhk.edu.hk
Grading
Quiz (10%)
Assignments (20%)
Midterm (30%): 26 October
Final (40%): 30 November
References
No required textbook
References
Probability and Statistics, DeGroot and Schervish
Stochastic Calculus for Finance I: The Binomial Asset Pricing Model, Shreve
Stochastic Calculus for Finance II: Continuous-Time Models, Shreve
A Guide to Econometrics, Kennedy
Econometric Theory and Methods, Davidson and MacKinnon
Introductory Econometrics, Wooldridge
Prerequisite
Calculus
Linear Algebra
Elementary Probability Theory
Basic Statistics
Review of Elementary Probability Theory
Finite number of outcomes
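Sample space $\Omega = \{\omega_1, \dots, \omega_N\}$
Examples
1. A single coin toss: $\Omega = \{H, T\}$
2. $n$ tosses of a coin: $\Omega = \{\omega : \omega = (a_1, \dots, a_n),\ a_i = H \text{ or } T\}$
Event: a subset $A \subseteq \Omega$
Either $\omega \in A$ or $\omega \notin A$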
Review of Elementary Probability Theory
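Algebra $\mathcal{A}$: a collection of subsets of $\Omega$ for which
1. $\Omega \in \mathcal{A}$
2. $A, B \in \mathcal{A} \Rightarrow A \cup B \in \mathcal{A}$
3. $A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A}$
Show $A, B \in \mathcal{A} \Rightarrow A \cap B \in \mathcal{A}$
Examples
1. $\{\Omega, \emptyset\}$
2. $\{A, A^c, \Omega, \emptyset\}$
3. $\mathcal{A} = \{A : A \subseteq \Omega\}$, the collection consisting of all the subsets of $\Omega$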
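The three defining conditions can be checked mechanically on a small finite sample space. The following is a minimal sketch, not part of the original slides: the function names `is_algebra` and `power_set` and the two-toss sample space are assumptions chosen for illustration.

```python
from itertools import combinations

def is_algebra(omega, collection):
    """Check the three algebra conditions for a collection of subsets of a finite omega."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in collection}
    if omega not in sets:                      # condition 1: Omega belongs to the collection
        return False
    for a in sets:
        if omega - a not in sets:              # condition 3: closed under complements
            return False
        for b in sets:
            if a | b not in sets:              # condition 2: closed under (finite) unions
                return False
    return True

def power_set(omega):
    """All subsets of omega (example 3 on the slide)."""
    items = list(omega)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

omega = {"HH", "HT", "TH", "TT"}               # two coin tosses (illustrative choice)
A = {"HH", "HT"}                               # "first toss is a head"

trivial = [omega, set()]                       # example 1
generated_by_A = [A, omega - A, omega, set()]  # example 2
not_closed = [omega, A]                        # misses the complement of A

print(is_algebra(omega, trivial), is_algebra(omega, generated_by_A),
      is_algebra(omega, power_set(omega)), is_algebra(omega, not_closed))
```

Closure under intersection does not need a separate check: it follows from conditions 2 and 3 through $A \cap B = (A^c \cup B^c)^c$.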
Review of Elementary Probability Theory
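Probability: a weight assigned to each outcome $\omega_i$, denoted by $P(\omega_i)$, satisfying:
1. $0 \le P(\omega_i) \le 1$
2. $P(\omega_1) + \dots + P(\omega_N) = 1$
Probability of event $A \in \mathcal{A}$:
$$P(A) = \sum_{\{i : \omega_i \in A\}} P(\omega_i).$$
Properties
1. $P(\emptyset) = 0$, $P(\Omega) = 1$
2. If $A \cap B = \emptyset$, $P(A \cup B) = P(A) + P(B)$
3. $P(A^c) = 1 - P(A)$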
Review of Elementary Probability Theory
Probability space: $(\Omega, \mathcal{A}, P)$.
Example (Binomial distribution). Let a coin be tossed $n$ times and record the results as an ordered set $(a_1, \dots, a_n)$, where $a_i = 1$ for a head and $a_i = 0$ for a tail. The sample space is
$$\Omega = \{\omega : \omega = (a_1, \dots, a_n),\ a_i = 0, 1\}.$$
To each sample point $\omega = (a_1, \dots, a_n)$, we assign the probability
$$P(\omega) = p^{\sum a_i} q^{\,n - \sum a_i},$$
where the nonnegative numbers $p$ and $q$ satisfy $p + q = 1$. (Show $P(\Omega) = 1$.)
The space $\Omega$, together with the collection $\mathcal{A}$ of all its subsets and the probabilities $P(A) = \sum_{\omega \in A} P(\omega)$, defines a probability space.
Review of Elementary Probability Theory
Example (continued). Consider the events
$$A_k = \{\omega : \omega = (a_1, \dots, a_n),\ a_1 + \dots + a_n = k\}, \quad k = 0, 1, \dots, n.$$
The set of probabilities $(P(A_0), \dots, P(A_n))$ is called the binomial distribution.
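The construction can be reproduced by brute force: enumerate every sample point, assign it the weight $p^{\sum a_i} q^{n - \sum a_i}$, and add the weights over each event $A_k$. This is a small sketch under illustrative assumptions (the function name `binomial_distribution` and the values $n = 4$, $p = 1/3$ are not from the slides); it also compares the result with the familiar closed form $\binom{n}{k} p^k q^{n-k}$.

```python
from fractions import Fraction
from itertools import product
from math import comb

def binomial_distribution(n, p):
    """Return [P(A_0), ..., P(A_n)] by summing P(omega) over the whole sample space."""
    q = 1 - p
    probs = [Fraction(0)] * (n + 1)
    # Omega consists of all ordered tuples (a_1, ..., a_n) with a_i in {0, 1}
    for omega in product((0, 1), repeat=n):
        k = sum(omega)                     # number of heads in this sample point
        probs[k] += p**k * q**(n - k)      # P(omega) = p^(sum a_i) * q^(n - sum a_i)
    return probs

n, p = 4, Fraction(1, 3)                   # illustrative values
dist = binomial_distribution(n, p)
closed_form = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
print(dist)
```

Exact fractions are used so that the identity $P(\Omega) = 1$ holds without rounding error.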
Review of Elementary Probability Theory
The conditional probability of event $B$ given event $A$ with $P(A) > 0$ is
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}.$$
Properties
1. (Law of total probability) Consider a decomposition $D = \{A_1, \dots, A_n\}$ with $P(A_i) > 0$ for all $i$; that is, $A_i \cap A_j = \emptyset$ for all $i \ne j$ and $A_1 \cup \dots \cup A_n = \Omega$. Then,
$$P(B) = \sum_{i=1}^{n} P(B \mid A_i) P(A_i).$$
2. (Bayes' law)
$$P(A_i \mid B) = \frac{P(A_i) P(B \mid A_i)}{P(B)} = \frac{P(A_i) P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j) P(B \mid A_j)}.$$
Review of Elementary Probability Theory
Example. Suppose that two dice were rolled. What is the conditional
probability that the sum of the two numbers was less than 8 given
that it was odd?
Let A be the event that the sum is less than 8 and let B be the event
that the sum is odd.
$$P(A \cap B) = \frac{1}{3}, \qquad P(B) = \frac{1}{2}.$$
Hence,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{2}{3}.$$
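The two probabilities used above can be confirmed by counting the 36 equally likely outcomes. A short sketch (the helper names `prob`, `is_A`, `is_B` are illustrative, not from the slides):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))    # 36 equally likely rolls of two dice

def prob(event):
    """Probability of an event given as a predicate on an outcome (d1, d2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_A(o):
    return sum(o) < 8          # A: the sum is less than 8

def is_B(o):
    return sum(o) % 2 == 1     # B: the sum is odd

p_A_and_B = prob(lambda o: is_A(o) and is_B(o))
p_B = prob(is_B)
p_A_given_B = p_A_and_B / p_B
print(p_A_and_B, p_B, p_A_given_B)
```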
Review of Elementary Probability Theory
Example. The following table exhibits conditional probabilities $P(Y_2 \mid X_1)$.

          U_2    M_2    L_2
  U_1     .45    .48    .07
  M_1     .05    .70    .25
  L_1     .01    .50    .49

For example, $P(U_2 \mid U_1) = .45$. Suppose $P(U_1) = 10\%$, $P(M_1) = 40\%$, and $P(L_1) = 50\%$. What is $P(U_1 \mid U_2)$?
Applying the law of total probability, we have
$$P(U_2) = P(U_2 \mid U_1)P(U_1) + P(U_2 \mid M_1)P(M_1) + P(U_2 \mid L_1)P(L_1) = .45 \times .10 + .05 \times .40 + .01 \times .50 = .07$$
Thus,
$$P(U_1 \mid U_2) = \frac{P(U_2 \mid U_1)P(U_1)}{P(U_2)} = \frac{.45 \times .10}{.07} \approx .64$$
Glass and Hall (1954) reported this table for occupational mobility in England and Wales.
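The same calculation works for any cell of the table. The sketch below encodes the table and the assumed marginals from the example as exact fractions; the dictionary layout and the function names `total_probability` and `posterior` are illustrative choices, not part of the original slides.

```python
from fractions import Fraction as Fr

# cond[x][y] = P(y | x), rows and columns as in the table above
cond = {
    "U1": {"U2": Fr(45, 100), "M2": Fr(48, 100), "L2": Fr(7, 100)},
    "M1": {"U2": Fr(5, 100),  "M2": Fr(70, 100), "L2": Fr(25, 100)},
    "L1": {"U2": Fr(1, 100),  "M2": Fr(50, 100), "L2": Fr(49, 100)},
}
prior = {"U1": Fr(10, 100), "M1": Fr(40, 100), "L1": Fr(50, 100)}

def total_probability(y):
    """Law of total probability: P(y) = sum over x of P(y | x) P(x)."""
    return sum(cond[x][y] * prior[x] for x in prior)

def posterior(x, y):
    """Bayes' law: P(x | y) = P(y | x) P(x) / P(y)."""
    return cond[x][y] * prior[x] / total_probability(y)

p_U2 = total_probability("U2")
p_U1_given_U2 = posterior("U1", "U2")
print(p_U2, p_U1_given_U2, float(p_U1_given_U2))
```

The exact posterior is $9/14$, which rounds to the $.64$ reported above.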
Review of Elementary Probability Theory
Independence
Events $A$ and $B$ are independent if
$$P(A \cap B) = P(A) \cdot P(B).$$
Two algebras $\mathcal{A}_1$ and $\mathcal{A}_2$ are independent if all pairs of sets $A_1$ and $A_2$, belonging respectively to $\mathcal{A}_1$ and $\mathcal{A}_2$, are independent.
The sets $A_1, \dots, A_n$ are independent if for $k = 1, \dots, n$ and $1 \le i_1 < i_2 < \dots < i_k \le n$
$$P(A_{i_1} \cap \dots \cap A_{i_k}) = P(A_{i_1}) \cdots P(A_{i_k}).$$
Infinitely Many Outcomes
Can we use the framework described so far for a case of infinitely many outcomes?
Consider an experiment of tossing a coin infinitely many times. The sample space $\Omega_\infty$ is the set of infinite sequences of Hs and Ts. The problem is that the probability of any particular outcome is zero. Thus, we cannot define the probability of an event $A$ by summing up the (uncountably) infinitely many probabilities of the elements in $A$.
The solution is to assign probability to each event directly. The mathematical tool for this job is measure theory. A mathematical measure on a set is a systematic way to assign a number to each suitable subset of that set (Wikipedia). A probability measure is a special measure, assigning probability to each event in a probability space.
$\sigma$-algebra
In order to define a (probability) measure in a useful way, we need to extend the concept of algebra.
Definition. Let $\Omega$ be a nonempty set, and let $\mathcal{F}$ be a collection of subsets of $\Omega$. We say that $\mathcal{F}$ is a $\sigma$-algebra provided that:
1. the empty set $\emptyset$ belongs to $\mathcal{F}$,
2. whenever a set $A$ belongs to $\mathcal{F}$, its complement $A^c$ also belongs to $\mathcal{F}$,
3. whenever a sequence of sets $A_1, A_2, \dots$ belongs to $\mathcal{F}$, their union $\bigcup_{n=1}^{\infty} A_n$ also belongs to $\mathcal{F}$.
A $\sigma$-algebra is also closed under countable intersections, since $\bigcap_n A_n = \left(\bigcup_n A_n^c\right)^c$.
Algebra: finite union. $\sigma$-algebra: countable union.
The pair $(\Omega, \mathcal{F})$ is called a measurable space.
An infinite set is countably infinite if it can be put into one-to-one correspondence with the positive integers. Otherwise, it is called uncountably infinite. In the same spirit, $\bigcup_{n=1}^{\infty} A_n$ is called a countable union because it is the union of a countable collection $\{A_1, A_2, \dots\}$.
$\sigma$-algebra
Example. Let $\Omega = [0, 1]$. The $\sigma$-algebra obtained from closed intervals in $\Omega$ using countable unions and intersections is called the Borel $\sigma$-algebra of subsets of $[0, 1]$, denoted by $\mathcal{B}[0, 1]$. We can define the Borel $\sigma$-algebra of $\mathbb{R}$, denoted by $\mathcal{B}(\mathbb{R})$, in the same manner.
$(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is extremely important for measure theory. Also, we can show that $\mathcal{B}(\mathbb{R})$ is the smallest $\sigma$-algebra containing all closed (or open) intervals.
$\sigma$-algebra
(Bjork) The Borel algebra is an extremely complicated object and the reader should be aware of the following facts.
There is no constructive definition of the Borel algebra. In other words, it is not possible to give anything like a concrete description of what the typical Borel set looks like.
The Borel algebra is strictly included in the power algebra. Thus there exist subsets of $\mathbb{R}$ which are not Borel sets.
However, all subsets of $\mathbb{R}$ which ever turn up in practice are Borel sets. Reformulating this, one can say that it is enormously hard to construct a set which is not a Borel set. The pedestrian can therefore, and without danger, informally regard a Borel set as an arbitrary subset of $\mathbb{R}$.
We shall see that assigning probability to an event amounts to defining a measure for a subset of the sample space. A $\sigma$-algebra ensures that this measure can be properly defined, even when the probability space is uncountably infinite.
Measure
As I mentioned, a measure is a function assigning a number to subsets. We need more structure on this function.
Definition. A measure on $(\Omega, \mathcal{F})$ is a function $\mu : \mathcal{F} \to [0, \infty]$, such that
(i) $\mu(\emptyset) = 0$, and
(ii) (countable additivity) for any sequence of disjoint sets $A_1, A_2, \dots$ in $\mathcal{F}$,
$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n).$$
The simplest but very important example is the length of subsets of $\mathbb{R}$, starting from $\mu([a, b]) = b - a$. This is Lebesgue measure.
What is the Lebesgue measure of a single point, of $\{1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \dots\}$, and of $\mathbb{Q} \cap [0, 1]$?
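One possible line of reasoning for this question is sketched below. It assumes monotonicity of $\mu$ (if $B \subseteq A$ then $\mu(B) \le \mu(A)$, which follows from additivity); the symbols $\varepsilon$ and $c_n$ are notation introduced here, not in the slides.

```latex
% Single point: cover {x} by a shrinking interval and use monotonicity.
\mu(\{x\}) \le \mu([x - \varepsilon,\, x + \varepsilon]) = 2\varepsilon
\quad \text{for every } \varepsilon > 0
\;\Longrightarrow\; \mu(\{x\}) = 0.

% Countable set C = {c_1, c_2, ...}: apply countable additivity to the disjoint singletons.
\mu(C) = \mu\Big(\bigcup_{n=1}^{\infty} \{c_n\}\Big)
       = \sum_{n=1}^{\infty} \mu(\{c_n\}) = 0.
```

Both $\{1, \frac{1}{2}, \frac{1}{3}, \dots\}$ and $\mathbb{Q} \cap [0, 1]$ are countable, so under this argument all three sets have Lebesgue measure zero.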
Probability Measure
Definition. Let $\Omega$ be a nonempty set, and let $\mathcal{F}$ be a $\sigma$-algebra of subsets of $\Omega$. A probability measure $P$ is a function that, to every set $A \in \mathcal{F}$, assigns a number in $[0, 1]$, called the probability of $A$ and written $P(A)$. We require
(i) $P(\Omega) = 1$, and
(ii) whenever $A_1, A_2, \dots$ is a sequence of disjoint sets in $\mathcal{F}$, then
$$P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n).$$
The triple $(\Omega, \mathcal{F}, P)$ is called a probability space.
Definition. Let $(\Omega, \mathcal{F}, P)$ be a probability space. If a set $A \in \mathcal{F}$ satisfies $P(A) = 1$, we say that the event $A$ occurs almost surely.
Example. When tossing a coin infinitely many times we get at least one head almost surely.
Probability Measure
Probability measures have the following properties.
1. $P(\emptyset) = 0$.
2. $P(A^c) = 1 - P(A)$.
3. If $A_1, A_2, \dots, A_N$ are finitely many disjoint sets in $\mathcal{F}$, then
$$P\left(\bigcup_{n=1}^{N} A_n\right) = \sum_{n=1}^{N} P(A_n).$$
4. If $B \subseteq A$ then $P(B) \le P(A)$.
5. $P(A_1 \cup A_2 \cup \dots) \le P(A_1) + P(A_2) + \dots$
Lebesgue measure on $[0, 1]$ corresponds to uniform measure on $[0, 1]$: $P[a, b] = b - a$. So the probability of choosing a rational number at random on $[0, 1]$ is zero.
Random Variables
Definition. Let $(\Omega, \mathcal{F}, P)$ be a probability space. A random variable is a real-valued function $X$ defined on $\Omega$ with the property that for every Borel subset $B$ of $\mathbb{R}$, the subset of $\Omega$ given by
$$\{X \in B\} = \{\omega \in \Omega : X(\omega) \in B\}$$
is in the $\sigma$-algebra $\mathcal{F}$.
Example. The indicator $I_A(\omega)$ of an arbitrary (measurable) set $A \in \mathcal{F}$.
Distribution
Definition. Let $X$ be a random variable on $(\Omega, \mathcal{F}, P)$. The distribution measure of $X$ is the probability measure $\mu_X$ that assigns to each Borel subset $B$ of $\mathbb{R}$ the mass $\mu_X(B) = P\{X \in B\}$.
Do you see why we need the measurability condition on the previous slide?
Two different random variables can have the same distribution. A single random variable can have two different distributions (under two different probability measures).
Definition. The function
$$F(x) = P\{X \le x\}, \quad x \in \mathbb{R}$$
is called the cumulative distribution function of $X$.
Distribution
Example 1. The random variable $X$ takes, with probability one, one of the values in the sequence $x_1, x_2, \dots$. Define (probability mass function) $p_i = P\{X = x_i\}$, where $p_i \ge 0$ and $\sum_i p_i = 1$. The distribution measure is
$$\mu_X(B) = \sum_{\{i : x_i \in B\}} p_i, \quad B \in \mathcal{B}(\mathbb{R}).$$
Example 2. If, for $x \in \mathbb{R}$, there exists a nonnegative function $f(x)$ (density function), such that
$$P\{a \le X \le b\} = \int_a^b f(x)\,dx, \quad -\infty < a \le b < \infty,$$
then
$$\mu_X[a, b] = P\{a \le X \le b\}$$
is a distribution measure and $F(b) = \int_{-\infty}^{b} f(x)\,dx$.
Joint Distribution
Definition. Let $X$ and $Y$ be random variables. The pair of random variables $(X, Y)$ takes values in the plane $\mathbb{R}^2$, and the joint distribution measure of $(X, Y)$ is given by
$$\mu_{X,Y}(C) = P\{(X, Y) \in C\} \quad \text{for all Borel sets } C \subseteq \mathbb{R}^2.$$
This is a probability measure. The joint cumulative distribution function of $(X, Y)$ is
$$F_{X,Y}(a, b) = \mu_{X,Y}\big((-\infty, a] \times (-\infty, b]\big) = P\{X \le a, Y \le b\},$$
for all $a \in \mathbb{R}$, $b \in \mathbb{R}$. A nonnegative, Borel-measurable function $f_{X,Y}(x, y)$ is a joint density for the pair of random variables $(X, Y)$ if
$$\mu_{X,Y}(C) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} I_C(x, y) f_{X,Y}(x, y)\,dy\,dx,$$
for all Borel sets $C \subseteq \mathbb{R}^2$.
Joint Distribution
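Equivalently,
$$F_{X,Y}(a, b) = \int_{-\infty}^{a} \int_{-\infty}^{b} f_{X,Y}(x, y)\,dy\,dx,$$
for all $a \in \mathbb{R}$, $b \in \mathbb{R}$.
The (marginal) distribution measures of $X$ and $Y$ are
$$\mu_X(A) = P\{X \in A\} = \mu_{X,Y}(A \times \mathbb{R})$$
$$\mu_Y(B) = P\{Y \in B\} = \mu_{X,Y}(\mathbb{R} \times B)$$
for all Borel subsets $A, B \subseteq \mathbb{R}$. The (marginal) cumulative distribution functions are
$$F_X(a) = \mu_X(-\infty, a] = P\{X \le a\}$$
$$F_Y(b) = \mu_Y(-\infty, b] = P\{Y \le b\}$$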
Joint Distribution
If the joint density $f_{X,Y}$ exists, then the marginal densities exist and are given by
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy \quad \text{and} \quad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx.$$
The marginal densities are connected to marginal distribution
measures and cumulative distribution functions as before.
Joint Distribution
Example. A fair coin is tossed three times; let $X$ denote the number of heads on the first toss and $Y$ the total number of heads. The joint probability mass function of $X$ and $Y$ is given by:

          y = 0   y = 1   y = 2   y = 3
  x = 0   1/8     2/8     1/8     0
  x = 1   0       1/8     2/8     1/8

Compute $P(Y = 0)$, $P(Y = 1)$, $P(X = 0 \mid Y = 1)$, $P(X = 1 \mid Y = 1)$.
$$P(Y = 0) = P(Y = 0, X = 0) + P(Y = 0, X = 1) = \frac{1}{8}$$
$$P(Y = 1) = P(Y = 1, X = 0) + P(Y = 1, X = 1) = \frac{3}{8}$$
$$P(X = 0 \mid Y = 1) = \frac{2/8}{3/8} = \frac{2}{3}$$
$$P(X = 1 \mid Y = 1) = 1 - P(X = 0 \mid Y = 1) = \frac{1}{3}$$
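The table and the conditional probabilities can be regenerated by listing the eight equally likely toss sequences. A minimal sketch (the names `joint`, `marginal_y`, and `cond_x_given_y` are illustrative assumptions):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

tosses = list(product("HT", repeat=3))        # 8 equally likely sequences
joint = Counter()                             # joint[(x, y)] = P(X = x, Y = y)
for t in tosses:
    x = 1 if t[0] == "H" else 0               # X: number of heads on the first toss
    y = t.count("H")                          # Y: total number of heads
    joint[(x, y)] += Fraction(1, len(tosses))

def marginal_y(y):
    """P(Y = y), obtained by summing the joint pmf over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def cond_x_given_y(x, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    return joint[(x, y)] / marginal_y(y)

print(marginal_y(0), marginal_y(1), cond_x_given_y(0, 1), cond_x_given_y(1, 1))
```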
Expectation
Definition. Let $X$ be a random variable on $(\Omega, \mathcal{F}, P)$. The expectation (or expected value) of $X$ is defined to be
$$EX = \int_\Omega X(\omega)\,dP(\omega).$$
This is the Lebesgue integral. The integral we usually compute in calculus is the Riemann integral.
Lebesgue Integral
Let $X$ be a nonnegative random variable on $(\Omega, \mathcal{F}, P)$, and $\Pi = \{y_0, y_1, y_2, \dots\}$, where $0 = y_0 < y_1 < y_2 < \dots$. For each subinterval $[y_k, y_{k+1}]$, we set
$$A_k = \{\omega \in \Omega : y_k \le X(\omega) < y_{k+1}\}.$$
We define the lower Lebesgue sum to be
$$LS_\Pi^-(X) = \sum_{k=1}^{\infty} y_k P(A_k).$$
This lower sum converges as $\|\Pi\|$, the maximal distance between the $y_k$ partition points, approaches zero, and we define this limit to be the Lebesgue integral $\int_\Omega X\,dP$.
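To see the lower Lebesgue sum at work, the sketch below evaluates it for one concrete case chosen for illustration (not taken from the slides): $\Omega = [0, 1]$ with $P$ the Lebesgue measure, $X(\omega) = \omega^2$, and the uniform partition $y_k = k \cdot \text{mesh}$. Here $A_k = [\sqrt{y_k}, \sqrt{y_{k+1}})$, so $P(A_k)$ is available in closed form, and the limit should be $\int_0^1 \omega^2\,d\omega = 1/3$.

```python
from math import sqrt

def lower_lebesgue_sum(mesh):
    """Lower Lebesgue sum of X(w) = w**2 on [0, 1] for the partition y_k = k * mesh."""
    total = 0.0
    k = 1
    while k * mesh <= 1.0:                 # X never exceeds 1, so higher levels have P(A_k) = 0
        y_k, y_next = k * mesh, (k + 1) * mesh
        # A_k = {w : y_k <= w**2 < y_next} = [sqrt(y_k), sqrt(y_next)), cut off at w = 1
        p_A_k = min(sqrt(y_next), 1.0) - sqrt(y_k)
        total += y_k * p_A_k
        k += 1
    return total

coarse = lower_lebesgue_sum(0.1)
fine = lower_lebesgue_sum(0.001)
print(coarse, fine)
```

Note that the partition is placed on the range of $X$ (the $y$-axis), not on $\Omega$; refining it moves the lower sum up toward $1/3$.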
Lebesgue Integral
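[Figure: graph of $y = X(\omega)$ with the levels $y_k$ and $y_{k+1}$ marked on the $y$-axis, showing the set $A_k = \{\omega : y_k \le X(\omega) < y_{k+1}\}$.]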
Lebesgue Integral
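For a general $X$, we express $X$ in terms of two nonnegative random variables:
$$X = X^+ - X^-,$$
where
$$X^+(\omega) = \max\{X(\omega), 0\} \quad \text{and} \quad X^-(\omega) = \max\{-X(\omega), 0\}.$$
Then, we define
$$\int_\Omega X(\omega)\,dP(\omega) = \int_\Omega X^+(\omega)\,dP(\omega) - \int_\Omega X^-(\omega)\,dP(\omega).$$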
Lebesgue Integral
The Riemann integral partitions the domain, while the Lebesgue integral partitions the range. With an abstract sample space $\Omega$, there is no natural way to partition $\Omega$.
The Lebesgue integral can be defined for any arbitrary measure, not just for a probability measure.
The Riemann and Lebesgue integrals agree whenever the Riemann integral is defined. Here the Lebesgue integral is computed with respect to the Lebesgue measure.
The expectation $EX$ is well defined when $X$ is integrable ($EX^+ < \infty$ and $EX^- < \infty$) or $X \ge 0$ a.s.
Properties of Expectation
Let $X$ and $Y$ be random variables on a probability space $(\Omega, \mathcal{F}, P)$.
1. If $X$ takes only finitely many values $x_0, x_1, \dots, x_n$, then
$$EX = \sum_{k=0}^{n} x_k P\{X = x_k\}.$$
2. (Integrability) The random variable $X$ is integrable if and only if $E|X| < \infty$.
3. (Comparison) If $X \le Y$ almost surely and $EX$ and $EY$ are defined, then
$$EX \le EY.$$
In particular, if $X = Y$ a.s., $EX = EY$.
Properties of Expectation
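4. (Linearity) If $\alpha$ and $\beta$ are real constants and $X$ and $Y$ are integrable, or if $\alpha$ and $\beta$ are nonnegative constants and $X$ and $Y$ are nonnegative, then
$$E(\alpha X + \beta Y) = \alpha EX + \beta EY.$$
5. (Jensen's inequality) If $\varphi$ is a convex, real-valued function defined on $\mathbb{R}$, and if $E|X| < \infty$, then
$$\varphi(EX) \le E\varphi(X).$$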
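Properties 1, 4, and 5 are easy to check numerically for a random variable with finitely many values. The sketch below uses an arbitrary four-point distribution and the convex function $\varphi(x) = x^2$; the values, the weights, and the helper name `expect` are assumptions made for illustration.

```python
from fractions import Fraction as Fr

values = [-2, 0, 1, 4]                    # x_0, ..., x_3 (illustrative)
probs = [Fr(1, 4)] * 4                    # P{X = x_k} (illustrative, uniform)

def expect(g):
    """E g(X) = sum over k of g(x_k) P{X = x_k} (property 1 applied to g(X))."""
    return sum(g(x) * p for x, p in zip(values, probs))

def phi(x):
    return x * x                          # a convex function

mean = expect(lambda x: x)                # EX
jensen_lhs = phi(mean)                    # phi(EX)
jensen_rhs = expect(phi)                  # E phi(X)
affine = expect(lambda x: 2 * x + 3)      # E(2X + 3), for the linearity check
print(mean, jensen_lhs, jensen_rhs, affine)
```

For $\varphi(x) = x^2$ the gap $E\varphi(X) - \varphi(EX)$ is exactly the variance of $X$, which is why the inequality is strict whenever $X$ is not constant.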
Properties of Expectation
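Example. Consider the infinite independent coin-toss space $\Omega_\infty$. A generic element of $\Omega_\infty$ will be denoted $\omega = \omega_1 \omega_2 \dots$, where $\omega_n$ indicates the result of the $n$th coin toss. The probability measure $P$ corresponds to probability $\frac{1}{2}$ for head on each toss. Let
$$Y_k(\omega) = \begin{cases} 1 & \text{if } \omega_k = H, \\ 0 & \text{if } \omega_k = T. \end{cases}$$
Then,
$$EY_k = 1 \cdot P\{Y_k = 1\} + 0 \cdot P\{Y_k = 0\} = \frac{1}{2}.$$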
Properties of Expectation
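Example. Let $\Omega = [0, 1]$, and let $P$ be the Lebesgue measure on $[0, 1]$. Consider a random variable
$$X(\omega) = \begin{cases} 1 & \text{if } \omega \text{ is irrational}, \\ 0 & \text{if } \omega \text{ is rational}. \end{cases}$$
Then,
$$EX = 1 \cdot P\{\omega : \omega \text{ is irrational}\} + 0 \cdot P\{\omega : \omega \text{ is rational}\} = 1,$$
because there are only countably many rational numbers in $\Omega$ and $P$ satisfies the countable additivity property.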
Sequence of Random Variables
Recall that a sequence of real numbers $\{x_n\}$ is said to converge to $x$ if, for every $\varepsilon > 0$, there is an integer $N$ such that $n \ge N$ implies $|x_n - x| < \varepsilon$.
Now we consider a sequence of random variables and its limit. Let $X, X_1, X_2, \dots$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$.
Definition. The sequence $X_1, X_2, \dots$ of random variables converges in probability to $X$ ($X_n \xrightarrow{P} X$), if for every $\varepsilon > 0$
$$P\{|X_n - X| > \varepsilon\} \to 0, \quad n \to \infty.$$
There is a stronger version than this, in which the sequence is said to converge with probability one (or almost surely).
Sequence of Random Variables
Definition. The sequence $X_1, X_2, \dots$ of random variables converges in distribution (or in law) to $X$ ($X_n \xrightarrow{d} X$) if
$$F_{X_n}(x) \to F_X(x)$$
at each point $x$ of continuity of $F_X(x)$, where $F_X(x) = P\{X \le x\}$.
Sequence of Random Variables
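The relationships among the modes of convergence:
1. $X_n \xrightarrow{P} X \implies X_n \xrightarrow{d} X$
2. If $c \in \mathbb{R}$, then $X_n \xrightarrow{d} c \implies X_n \xrightarrow{P} c$
Slutsky Theorems
1. If $X_n \xrightarrow{d} X$ and $f : \mathbb{R} \to \mathbb{R}$ is continuous, then $f(X_n) \xrightarrow{d} f(X)$
2. If $X_n \xrightarrow{d} X$ and $(X_n - Y_n) \xrightarrow{P} 0$, then $Y_n \xrightarrow{d} X$
3. If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} c$, then $X_n + Y_n \xrightarrow{d} X + c$, $X_n Y_n \xrightarrow{d} cX$, and $X_n / Y_n \xrightarrow{d} X / c$ if $c \ne 0$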
Weak Laws of Large Numbers
The mean of a sample from a distribution converges in probability to the mean of the distribution. Let $\bar{X}_n = \frac{X_1 + \dots + X_n}{n}$.
(i.i.d.) Let $X_1, X_2, \dots$ be a sequence of independent, identically distributed (i.i.d.) random variables with $E|X_1| < \infty$. Then
$$\bar{X}_n \xrightarrow{P} EX_1.$$
(independent) Let $X_1, X_2, \dots$ be a sequence of independent random variables such that $EX_i < \infty$ and $\mathrm{Var}(X_i) = \sigma_i^2 < \infty$. Then,
$$(\bar{X}_n - E\bar{X}_n) \xrightarrow{P} 0$$
provided that
$$\lim_{n \to \infty} \frac{1}{n^2} \sum_{j=1}^{n} \sigma_j^2 = 0.$$
Central Limit Theorems
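(i.i.d.) Let $X_1, X_2, \dots$ be a sequence of independent, identically distributed (i.i.d.) random variables, each with mean $m$ and variance $\sigma^2$. Then,
$$\frac{\sqrt{n}\,(\bar{X}_n - m)}{\sigma} \xrightarrow{d} N(0, 1).$$
(independent) Let $X_1, X_2, \dots$ be a sequence of independent random variables with finite second moments. Let $m_j = EX_j$ and $\sigma_j^2 = \mathrm{Var}(X_j) > 0$. Assume that, for some $\delta > 0$,
$$\lim_{n \to \infty} \frac{\sum_{j=1}^{n} E|X_j - m_j|^{2+\delta}}{\left(\sum_{j=1}^{n} \sigma_j^2\right)^{1 + \frac{\delta}{2}}} = 0. \quad \text{(Lyapunov condition)}$$
Then
$$\frac{\sqrt{n}\,(\bar{X}_n - \bar{m}_n)}{\sqrt{\sum_{j=1}^{n} \sigma_j^2 / n}} \xrightarrow{d} N(0, 1),$$
where $\bar{m}_n = \frac{m_1 + \dots + m_n}{n}$.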
Central Limit Theorems
Example. Recall the infinite independent coin-toss space $\Omega_\infty$. With the probability measure chosen to correspond to probability $p$ of head on each toss, we define
$$Y_k(\omega) = \begin{cases} 1 & \text{if } \omega_k = H, \\ 0 & \text{if } \omega_k = T, \end{cases}$$
and
$$H_n = \sum_{k=1}^{n} Y_k,$$
so that $H_n$ is the number of heads obtained in the first $n$ tosses. The strong law of large numbers yields
$$\lim_{n \to \infty} \frac{H_n}{n} = p \quad \text{almost surely}.$$
The central limit theorem tells us
$$\frac{H_n - np}{\sqrt{np(1 - p)}} \xrightarrow{d} N(0, 1).$$
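Because $H_n$ is binomial, both limits can be examined with exact probabilities rather than simulation. The sketch below is an illustration under assumed values ($p = 0.3$, $\varepsilon = 0.05$, $z = 1$; the function names are hypothetical). It checks convergence in probability of $H_n / n$ to $p$, which is the weak-law statement and is implied by the almost sure convergence above, and it compares the distribution function of the standardized $H_n$ with the standard normal one.

```python
from math import erf, exp, lgamma, log, sqrt

def binom_pmf(n, k, p):
    """P{H_n = k}, computed through logarithms to avoid overflow for large n."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def prob_far_from_p(n, p, eps):
    """P{|H_n / n - p| > eps}, summed exactly over the binomial pmf."""
    return sum(binom_pmf(n, k, p) for k in range(n + 1) if abs(k / n - p) > eps)

def standardized_cdf(n, p, z):
    """P{(H_n - n p) / sqrt(n p (1 - p)) <= z}."""
    scale = sqrt(n * p * (1 - p))
    return sum(binom_pmf(n, k, p) for k in range(n + 1) if (k - n * p) / scale <= z)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p, eps = 0.3, 0.05                                   # illustrative values
tail_probs = [prob_far_from_p(n, p, eps) for n in (100, 400, 1600)]
cdf_gap = abs(standardized_cdf(1600, p, 1.0) - normal_cdf(1.0))
print(tail_probs, cdf_gap)
```

The tail probabilities shrink as $n$ grows, and the gap between the two distribution functions at $z = 1$ is already small for $n = 1600$.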
Computation of Expectations
Integration over $\mathbb{R}$ is more convenient than integration over the abstract space $\Omega$ in practice when computing expectations.
Since the distribution measure of $X$ is the probability measure $\mu_X$ defined on $\mathbb{R}$ by
$$\mu_X(B) = P\{X \in B\} \quad \text{for every Borel subset } B \text{ of } \mathbb{R},$$
we have the transition
$$\int_\Omega X(\omega)\,dP(\omega) \;\longrightarrow\; \int_{\mathbb{R}} x\,d\mu_X(x),$$
as stated in the next theorem.
Computation of Expectations
Theorem. Let $X$ be a random variable on $(\Omega, \mathcal{F}, P)$ and let $g$ be a Borel-measurable function on $\mathbb{R}$. Then
$$E|g(X)| = \int_{\mathbb{R}} |g(x)|\,d\mu_X(x),$$
and if this quantity is finite, then
$$Eg(X) = \int_{\mathbb{R}} g(x)\,d\mu_X(x).$$
A function $f$ is said to be Borel measurable if, for every Borel subset $B$ of $\mathbb{R}$, the set $\{x : f(x) \in B\}$ is also a Borel subset of $\mathbb{R}$. Every continuous and piecewise continuous function is Borel measurable.
Computation of Expectations
Suppose $X$ has a density $f$ (a nonnegative, Borel-measurable function on $\mathbb{R}$ such that $\mu_X(B) = \int_B f(x)\,dx$). In particular, if $f$ is bounded and continuous almost everywhere, as in most cases, we have
$$E|g(X)| = \int_{-\infty}^{\infty} |g(x)| f(x)\,dx,$$
and if this quantity is finite, then
$$Eg(X) = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$$
Probably you are familiar with this representation.
Note that these are Riemann integrals.
Change of Measure
Suppose there are only two possible states (outcomes) of the world at $t = 1$: $u$ and $d$ with (physical) probabilities 0.6 and 0.4, respectively. The riskless interest rate is $r = 25\%$. What is the price of a riskless bond that pays 5 at $t = 1$?
Suppose there is a risky asset that pays 7 in state $u$ and 2 in $d$, denoted by $S_1$. Its expected cash flow is $7(0.6) + 2(0.4) = 5$. What is the price of this asset? Generally, $P < \frac{ES_1}{1+r}$, because investors ask for a risk premium.
Suppose the price of the risky asset is $2 < \frac{ES_1}{1+r} = 4$. However, if we change the probability measure appropriately, we can price the asset in the risk neutral manner. In particular, if we assign a probability 0.1 to $u$ and 0.9 to $d$, then the price of the risky asset under risk neutral valuation is
$$P = \frac{\tilde{E}S_1}{1 + r} = \frac{7(0.1) + 2(0.9)}{1.25} = 2,$$
where a tilde denotes operations under the new probability measure.
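The numbers in this example can be reproduced directly, including where the probability 0.1 comes from: it is the value of $q$ that solves $\frac{7q + 2(1 - q)}{1 + r} = 2$. A small sketch with hypothetical helper names (`discounted_expectation`, `risk_neutral_prob_up`):

```python
from fractions import Fraction as Fr

payoff = {"u": Fr(7), "d": Fr(2)}              # S_1 in each state
physical = {"u": Fr(6, 10), "d": Fr(4, 10)}    # physical probabilities
r = Fr(1, 4)                                   # riskless rate, 25%

def discounted_expectation(prob):
    """E[S_1] / (1 + r) under the probabilities in `prob`."""
    return sum(prob[s] * payoff[s] for s in payoff) / (1 + r)

def risk_neutral_prob_up(price):
    """Solve price * (1 + r) = q * S_1(u) + (1 - q) * S_1(d) for q."""
    return (price * (1 + r) - payoff["d"]) / (payoff["u"] - payoff["d"])

bond_price = Fr(5) / (1 + r)                   # riskless bond paying 5 at t = 1
naive_value = discounted_expectation(physical) # discounted expected cash flow
q_up = risk_neutral_prob_up(Fr(2))             # market price of the risky asset is 2
risk_neutral = {"u": q_up, "d": 1 - q_up}
model_price = discounted_expectation(risk_neutral)
print(bond_price, naive_value, q_up, model_price)
```

With two states and one risky asset the equation has a single solution, so the new probabilities are pinned down by the market price and the riskless rate alone.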
Change of Measure
The new probability measure (called a risk neutral measure or a martingale measure) cannot be empirically estimated. However, this fictitious measure turns out to be very useful in pricing derivatives. As you will see, risk neutral valuation under the risk neutral measure always delivers the fair price of any possible asset given the states of the world.
Today we build a mathematical foundation for this technique.
Change of Measure
Theorem. Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $Z$ be an almost surely nonnegative random variable with $EZ = 1$. For $A \in \mathcal{F}$, define
$$\tilde{P}(A) = \int_A Z(\omega)\,dP(\omega).$$
Then $\tilde{P}$ is a probability measure. Furthermore,
$$\tilde{E}X = E[XZ],$$
provided that they exist. If $Z$ is almost surely strictly positive, we also have
$$EY = \tilde{E}\left[\frac{Y}{Z}\right],$$
for every nonnegative random variable $Y$.
Change of Measure
This theorem shows a way to change a probability measure and the relationship of expectations under the two measures.
Definition. Two probability measures $P$ and $\tilde{P}$ on $(\Omega, \mathcal{F})$ are said to be equivalent if they agree which sets in $\mathcal{F}$ have probability zero.
The probability measures $P$ and $\tilde{P}$ in the preceding theorem are equivalent when $Z$ is almost surely strictly positive. They agree on what is possible and what is impossible.
Change of Measure
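Example. $\Omega = [0, 1]$, $P[a, b] = b - a$, $\tilde{P}[a, b] = \int_a^b 2\omega\,d\omega = b^2 - a^2$. We can link $P$ and $\tilde{P}$:
$$\tilde{P}[a, b] = \int_a^b 2\omega\,dP(\omega).$$
Equivalently,
$$\tilde{P}(B) = \int_B 2\omega\,dP(\omega),$$
for every Borel set $B \subseteq \mathbb{R}$. This implies that $Z(\omega) = 2\omega$.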
Change of Measure
Definition. Let $(\Omega, \mathcal{F}, P)$ be a probability space, let $\tilde{P}$ be another probability measure on $(\Omega, \mathcal{F})$ that is equivalent to $P$, and let $Z$ be an almost surely positive random variable that relates $P$ and $\tilde{P}$ via
$$\tilde{P}(A) = \int_A Z(\omega)\,dP(\omega).$$
Then $Z$ is called the Radon-Nikodym derivative of $\tilde{P}$ with respect to $P$, and we write
$$Z = \frac{d\tilde{P}}{dP}.$$
Change of Measure
In a finite sample space,
$$Z(\omega) = \frac{\tilde{P}(\omega)}{P(\omega)}.$$
Example. Let $\Omega = \{\omega_1, \omega_2\}$, $X(\omega_1) = x_1 = -2$ and $X(\omega_2) = x_2 = 3$. Also, $P\{X = x_1\} = p_1 = 2/5$. Then, $EX = 1$.
We want to change the mean of $X$ to 0 under a new probability measure $Q$, where $Q\{X = x_1\} = q_1$. We require that
$$E^Q X = q_1(-2) + (1 - q_1)(3) = 0,$$
leading to $q_1 = 3/5$. Thus,
$$z_1 = q_1 / p_1 = 3/2, \quad z_2 = (1 - q_1)/(1 - p_1) = 2/3.$$
Verify $EZ = 1$.
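The verification asked for above, together with the identity $E^Q X = E[XZ]$ from the change-of-measure theorem, can be carried out with exact fractions. A minimal sketch (variable names are illustrative):

```python
from fractions import Fraction as Fr

x = [Fr(-2), Fr(3)]                     # x_1, x_2
p = [Fr(2, 5), Fr(3, 5)]                # p_1, p_2 = 1 - p_1
target_mean = 0                         # desired mean of X under Q

# Solve q_1 * x_1 + (1 - q_1) * x_2 = target_mean for q_1
q1 = (target_mean - x[1]) / (x[0] - x[1])
q = [q1, 1 - q1]

z = [qi / pi for qi, pi in zip(q, p)]   # Radon-Nikodym derivative: z_i = q_i / p_i

E_X = sum(xi * pi for xi, pi in zip(x, p))                 # mean under P
E_Z = sum(zi * pi for zi, pi in zip(z, p))                 # should equal 1
EQ_X = sum(xi * qi for xi, qi in zip(x, q))                # mean under Q
E_XZ = sum(xi * zi * pi for xi, zi, pi in zip(x, z, p))    # E[XZ] under P
print(q, z, E_X, E_Z, EQ_X, E_XZ)
```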
Change of Measure
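Example. Let $X$ be a standard normal random variable defined on $(\Omega, \mathcal{F}, P)$, and let $Y = X + \theta$ with $\theta > 0$. We want to define a new probability measure $\tilde{P}$ on $\Omega$ under which $Y$ is standard normal. First define the random variable
$$Z(\omega) = \exp\left\{-\theta X(\omega) - \frac{1}{2}\theta^2\right\} \quad \text{for all } \omega \in \Omega.$$
This random variable can serve as a Radon-Nikodym derivative because
1. $Z(\omega) > 0$ for all $\omega \in \Omega$, and
2. $EZ = 1$. (Show)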
Change of Measure
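Example (cont'd). A new probability measure $\tilde{P}$ is created by
$$\tilde{P}(A) = \int_A Z(\omega)\,dP(\omega).$$
Then,
$$\begin{aligned}
\tilde{P}\{Y \le b\} &= \int_{\{\omega : Y(\omega) \le b\}} Z(\omega)\,dP(\omega) \\
&= \int_\Omega I_{\{Y(\omega) \le b\}} Z(\omega)\,dP(\omega) \\
&= \int_\Omega I_{\{X(\omega) \le b - \theta\}} \exp\left\{-\theta X(\omega) - \frac{1}{2}\theta^2\right\} dP(\omega).
\end{aligned}$$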
Change of Measure
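Example (cont'd). Since $X$ has, under $P$, the standard normal density $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$,
$$\begin{aligned}
\int_\Omega I_{\{X(\omega) \le b - \theta\}} \exp\left\{-\theta X(\omega) - \frac{1}{2}\theta^2\right\} dP(\omega)
&= \int_{-\infty}^{\infty} I_{\{x \le b - \theta\}}\, e^{-\theta x - \frac{1}{2}\theta^2} \varphi(x)\,dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{b - \theta} e^{-\frac{1}{2}(x + \theta)^2}\,dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{b} e^{-\frac{1}{2}y^2}\,dy,
\end{aligned}$$
implying $Y$ is a standard normal random variable under $\tilde{P}$.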
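The two claims of this example, $EZ = 1$ and $\tilde{P}\{Y \le b\} = \Phi(b)$, can also be checked numerically by integrating $Z \cdot \varphi$ over the real line. The sketch below is only a numerical sanity check under assumed settings: $\theta = 0.7$, a midpoint rule on $[-12, 12]$, and helper names that do not appear in the slides.

```python
from math import erf, exp, pi, sqrt

def phi(x):
    """Standard normal density of X under P."""
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def normal_cdf(b):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(b / sqrt(2)))

def Z(x, theta):
    """Radon-Nikodym derivative written as a function of the value x of X."""
    return exp(-theta * x - 0.5 * theta**2)

def integrate(f, lo, hi, steps=20000):
    """Midpoint rule; accurate enough here because the integrand is smooth."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

theta = 0.7                                       # illustrative value of theta > 0
LOWER = -12.0                                     # stands in for minus infinity

E_Z = integrate(lambda x: Z(x, theta) * phi(x), LOWER, 12.0)

def p_tilde_Y_le(b):
    """P~{Y <= b}: integral of Z * phi over {x <= b - theta}, since Y = X + theta."""
    return integrate(lambda x: Z(x, theta) * phi(x), LOWER, b - theta)

print(E_Z, [(b, p_tilde_Y_le(b), normal_cdf(b)) for b in (-1.0, 0.0, 1.5)])
```

The agreement reflects the algebra in the derivation: $Z(x)\varphi(x)$ is the normal density centered at $-\theta$, so shifting by $\theta$ recovers the standard normal.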
Change of Measure
Theorem (Radon-Nikodym). Let $P$ and $\tilde{P}$ be equivalent probability measures defined on $(\Omega, \mathcal{F})$. Then there exists an almost surely positive random variable $Z$ such that $EZ = 1$ and
$$\tilde{P}(A) = \int_A Z(\omega)\,dP(\omega), \tag{1}$$
for every $A \in \mathcal{F}$.
This remarkable result is the converse of what we did. It implies that (1) is the only way to obtain an equivalent measure.
