
Advanced Probability

University of Cambridge, Part III of the Mathematical Tripos Michaelmas Term 2006

Grégory Miermont¹

¹ CNRS & Laboratoire de Mathématique, Équipe Probabilités, Statistique et Modélisation, Bât. 425, Université Paris-Sud, 91405 Orsay, France

Contents

1 Conditional expectation
  1.1 The discrete case
  1.2 Conditioning with respect to a σ-algebra
    1.2.1 The L² case
    1.2.2 General case
    1.2.3 Non-negative case
  1.3 Specific properties of conditional expectation
  1.4 Computing a conditional expectation
    1.4.1 Conditional density functions
    1.4.2 The Gaussian case

2 Discrete-time martingales
  2.1 Basic notions
    2.1.1 Stochastic processes, filtrations
    2.1.2 Martingales
    2.1.3 Doob's stopping times
  2.2 Optional stopping
  2.3 The convergence theorem
  2.4 Lᵖ convergence, p > 1
    2.4.1 A maximal inequality
  2.5 L¹ convergence
  2.6 Optional stopping in the UI case
  2.7 Backwards martingales

3 Examples of applications of discrete-time martingales
  3.1 Kolmogorov's 0-1 law, law of large numbers
  3.2 Branching processes
  3.3 The Radon-Nikodym theorem
  3.4 Product martingales
    3.4.1 Example: consistency of the likelihood ratio test

4 Continuous-parameter processes
  4.1 Theoretical problems
  4.2 Finite marginal distributions, versions
  4.3 The martingale regularization theorem
  4.4 Convergence theorems for continuous-time martingales
  4.5 Kolmogorov's continuity criterion

5 Weak convergence
  5.1 Definition and characterizations
  5.2 Convergence in distribution
  5.3 Tightness
  5.4 Lévy's convergence theorem

6 Brownian motion
  6.1 Wiener's theorem
  6.2 First properties
  6.3 The strong Markov property
  6.4 Martingales and Brownian motion
  6.5 Recurrence and transience properties
  6.6 The Dirichlet problem
  6.7 Donsker's invariance principle

7 Poisson random measures and processes
  7.1 Poisson random measures
  7.2 Integrals with respect to a Poisson measure
  7.3 Poisson point processes
    7.3.1 Example: the Poisson process
    7.3.2 Example: compound Poisson processes

8 ID laws and Lévy processes
  8.1 The Lévy-Khintchine formula
  8.2 Lévy processes

9 Exercises
  9.1 Conditional expectation
  9.2 Discrete-time martingales
  9.3 Continuous-time processes
  9.4 Weak convergence
  9.5 Brownian motion
  9.6 Poisson measures, ID laws and Lévy processes

Chapter 1 Conditional expectation


1.1 The discrete case

Let (Ω, F, P) be a probability space. If A, B ∈ F are two events such that P(B) > 0, we define the conditional probability of A given B by the formula

    P(A|B) = P(A ∩ B) / P(B).

We interpret this quantity as the probability of the event A given the fact that B is realized. The identity

    P(A|B) = P(B|A) P(A) / P(B)

is called Bayes' rule. More generally, if X ∈ L¹(Ω, F, P) is an integrable random variable, we define

    E[X|B] = E[X 1_B] / P(B),

the conditional expectation of X given B.

Example. Toss a fair die (probability space Ω = {1, 2, 3, 4, 5, 6} with P({ω}) = 1/6 for ω ∈ Ω) and let A = {the result is even}, B = {the result is less than or equal to 2}. Then P(A|B) = 1/2, P(B|A) = 1/3. If X(ω) = ω is the result, then E[X|A] = 4, E[X|B] = 3/2.

Let (B_i, i ∈ I) be a countable collection of disjoint events such that Ω = ⋃_{i∈I} B_i, and set G = σ(B_i, i ∈ I). If X ∈ L¹(Ω, F, P), we define a random variable

    X' = Σ_{i∈I} E[X|B_i] 1_{B_i},

with the convention that E[X|B_i] = 0 if P(B_i) = 0. The random variable X' is integrable, since

    E[|X'|] = Σ_{i∈I} P(B_i) |E[X|B_i]| = Σ_{i∈I} P(B_i) |E[X 1_{B_i}]| / P(B_i) ≤ E[|X|].

Moreover, it is straightforward to check:

1. X' is G-measurable, and

2. for every B ∈ G, E[1_B X'] = E[1_B X].

Example. If X ∈ L¹(Ω, F, P) and Y is a random variable with values in a countable set E, the above construction gives, by letting B_y = {Y = y}, y ∈ E, which partition Ω into σ(Y)-measurable events, a random variable

    E[X|Y] = Σ_{y∈E} E[X|Y = y] 1_{Y=y}.

Notice that the value taken by E[X|Y = y] when P(Y = y) = 0, which we have fixed to 0, is actually irrelevant to the definition of E[X|Y], since a random variable is always defined up to a set of zero measure. It is important to keep in mind that conditional expectations are always a priori defined only up to a zero-measure set.
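The discrete construction above can be checked numerically on the die example. The following is an illustrative sketch (the helper `cond_exp` and the use of exact fractions are our own choices, not part of the notes): it builds X' = Σ_i E[X|B_i] 1_{B_i} for the partition generated by the event A = {even}.

```python
from fractions import Fraction

# The fair-die example above: Omega = {1,...,6}, P({w}) = 1/6.
omega = [1, 2, 3, 4, 5, 6]
p = {w: Fraction(1, 6) for w in omega}

def cond_exp(X, partition):
    """E[X|G] for G generated by a finite partition: on each block B
    with P(B) > 0 the value is E[X 1_B] / P(B), and 0 if P(B) = 0."""
    out = {}
    for block in partition:
        pb = sum(p[w] for w in block)
        val = sum(X(w) * p[w] for w in block) / pb if pb > 0 else Fraction(0)
        for w in block:
            out[w] = val
    return out

X = lambda w: w                               # the result of the toss
even = [w for w in omega if w % 2 == 0]       # the event A
odd = [w for w in omega if w % 2 == 1]
ce = cond_exp(X, [even, odd])
print(ce[2], ce[1])  # 4 on A (as computed above), 3 on the complement
```

Averaging X' over Ω recovers E[X] = 7/2, an instance of property 2. with B = Ω.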

1.2 Conditioning with respect to a σ-algebra

We are now going to define the conditional expectation given a sub-σ-algebra of our probability space, using properties 1. and 2. of the previous paragraph. The definition is due to Kolmogorov.

Theorem 1.2.1 Let G ⊆ F be a sub-σ-algebra and X ∈ L¹(Ω, F, P). Then there exists a random variable X' with E[|X'|] < ∞ such that the following two characteristic properties are verified:

1. X' is G-measurable;

2. for every B ∈ G, E[1_B X'] = E[1_B X].

Moreover, if X'' is another such random variable, then X'' = X' a.s. We denote by E[X|G] ∈ L¹(Ω, G, P) the class of the random variable X'; it is called the conditional expectation of X given G.

Otherwise said, E[X|G] is the unique element of L¹(Ω, G, P) such that E[1_B X] = E[1_B E[X|G]] for every B ∈ G. Equivalently, an approximation argument allows us to replace 2. in the statement by:

2'. for every bounded G-measurable random variable Z, E[ZX'] = E[ZX].

Proof of the uniqueness. Suppose X' and X'' both satisfy the two conditions of the statement. Then B = {X'' > X'} ∈ G, and therefore

    0 = E[1_B (X'' − X)] = E[1_B (X'' − X')],

which shows X'' ≤ X' a.s.; the reverse inequality is obtained by symmetry. The existence will need two intermediate steps.


1.2.1 The L² case

We first consider L² variables. Suppose that X ∈ L²(Ω, F, P) and let G ⊆ F be a sub-σ-algebra. Notice that L²(Ω, G, P) is a closed vector subspace of the Hilbert space L²(Ω, F, P). Therefore, there exists a unique random variable X' ∈ L²(Ω, G, P) such that E[Z(X − X')] = 0 for every Z ∈ L²(Ω, G, P); namely, X' is the orthogonal projection of X onto L²(Ω, G, P). This proves the previous theorem in the case X ∈ L², and in fact E[·|G] : L² → L² is the orthogonal projection onto L²(Ω, G, P), and hence is linear.

It follows from the uniqueness statement that the conditional expectation has the following nice interpretation in the L² case: E[X|G] is the G-measurable random variable that best approximates X. It is useful to keep this intuitive idea in mind even in the general L¹ case, although the word "approximates" becomes fuzzier.

Notice that X' := E[X|G] ≥ 0 a.s. whenever X ≥ 0, since (notice {X' < 0} ∈ G)

    E[X 1_{X'<0}] = E[X' 1_{X'<0}],

and the left-hand side is non-negative while the right-hand side is non-positive, entailing P(X' < 0) = 0. Moreover, it holds that E[E[X|G]] = E[X], because it is the scalar product of X against the constant function 1 ∈ L²(Ω, G, P).
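The projection picture can be made concrete on a finite sample space. The setup below (a uniform 12-point space and a 3-block partition) is an illustrative toy of our own choosing, not from the notes; it checks the defining orthogonality relation E[Z(X − X')] = 0 and the identity E[E[X|G]] = E[X].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (our choice): a uniform 12-point sample space, with G
# generated by a partition into three 4-point blocks.
X = rng.normal(size=12)
blocks = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]

# The orthogonal projection of X onto L^2(G) averages X over each block.
X_proj = np.empty(12)
for B in blocks:
    X_proj[B] = X[B].mean()

# A G-measurable Z is constant on each block; E[Z (X - X_proj)] vanishes.
Z = np.repeat(rng.normal(size=3), 4)
print(abs(np.mean(Z * (X - X_proj))) < 1e-12)   # True: orthogonality
print(np.isclose(X_proj.mean(), X.mean()))      # True: E[E[X|G]] = E[X]
```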

1.2.2 General case

Now let X ≥ 0 be any non-negative random variable (not necessarily integrable). Then X ∧ n is in L² for every n ∈ ℕ, and X ∧ n increases to X pointwise. Therefore, the sequence E[X ∧ n|G] is (a.s.) increasing, because X ∧ n − X ∧ (n−1) ≥ 0 and by linearity of E[·|G] on L². It therefore increases a.s. to a limit, which we denote by E[X|G]. Notice that E[E[X ∧ n|G]] = E[X ∧ n], so that by the monotone convergence theorem, E[E[X|G]] = E[X]. In particular, if X is integrable, then so is E[X|G].

Proof of existence in Theorem 1.2.1. Let X ∈ L¹, and write X = X⁺ − X⁻ (where X⁺ = X ∨ 0 and X⁻ = (−X) ∨ 0). Then X⁺, X⁻ are non-negative integrable random variables, so E[X⁺|G] and E[X⁻|G] are finite a.s., and we may define E[X|G] = E[X⁺|G] − E[X⁻|G]. Now, let B ∈ G. Then E[(X⁺ ∧ n) 1_B] = E[E[X⁺ ∧ n|G] 1_B] by definition. The monotone convergence theorem allows us to pass to the limit (all integrated random variables are non-negative), and we obtain E[X⁺ 1_B] = E[E[X⁺|G] 1_B]. The same holds for X⁻, and by subtracting we see that E[X|G] indeed satisfies the characteristic properties 1. and 2.

The following properties are immediate consequences of the previous theorem and its proof.

Proposition 1.2.1 Let G ⊆ F be a σ-algebra and X, Y ∈ L¹(Ω, F, P). Then

1. E[E[X|G]] = E[X];

2. if X is G-measurable, then E[X|G] = X;

3. if X is independent of G, then E[X|G] = E[X];

4. if a, b ∈ ℝ then E[aX + bY|G] = aE[X|G] + bE[Y|G] (linearity);

5. if X ≥ 0 then E[X|G] ≥ 0 (positivity);

6. |E[X|G]| ≤ E[|X| | G], so that E[|E[X|G]|] ≤ E[|X|].

Important remark. Notice that all statements concerning conditional expectation are about L¹ variables, which are defined only up to a subset of Ω of zero probability, and hence are a.s. statements. This is of crucial importance, and recalls the fact encountered before that E[X|Y = y] can be assigned an arbitrary value whenever P(Y = y) = 0.

1.2.3 Non-negative case

In the course of proving the last theorem, we actually built an object E[X|G] as the a.s. increasing limit of E[X ∧ n|G] for any non-negative random variable X, not necessarily integrable. This random variable enjoys properties similar to those of the L¹ case, and we state them as in Theorem 1.2.1.

Theorem 1.2.2 Let G ⊆ F be a sub-σ-algebra, and let X ≥ 0 be a non-negative random variable. Then there exists a random variable X' ≥ 0 such that

1. X' is G-measurable, and

2. for every non-negative G-measurable random variable Z, E[ZX'] = E[ZX].

Moreover, if X'' is another such r.v., then X'' = X' a.s. We denote by E[X|G] the class of X' up to a.s. equality.

Proof. Any r.v. in the class of E[X|G] = lim sup_n E[X ∧ n|G] trivially satisfies 1. It also satisfies 2., since if Z is a non-negative G-measurable random variable, we obtain, by passing to the (increasing) limit in

    E[(X ∧ n)(Z ∧ n)] = E[E[X ∧ n|G](Z ∧ n)],

that E[XZ] = E[E[X|G]Z].

Uniqueness. If X', X'' are non-negative and satisfy properties 1. and 2., then for any a < b ∈ ℚ₊, letting B = {X' ≤ a < b ≤ X''} ∈ G, we obtain

    bP(B) ≤ E[X'' 1_B] = E[X 1_B] = E[X' 1_B] ≤ aP(B),

which entails P(B) = 0, so that P(X' < X'') = 0 by taking the countable union over a < b ∈ ℚ₊. Similarly, P(X' > X'') = 0.

The reader is invited to formulate and prove analogues of the properties of Proposition 1.2.1 for non-negative variables, and in particular that if 0 ≤ X ≤ Y then 0 ≤ E[X|G] ≤ E[Y|G] a.s.

The conditional expectation enjoys the following properties, which match those of the classical expectation.

Proposition 1.2.2 Let G ⊆ F be a σ-algebra.

1. If (X_n, n ≥ 0) is an increasing sequence of non-negative random variables with limit X, then (conditional monotone convergence theorem)

    E[X_n|G] ↑ E[X|G]  a.s.

2. If (X_n, n ≥ 0) is a sequence of non-negative random variables, then (conditional Fatou theorem)

    E[lim inf_n X_n|G] ≤ lim inf_n E[X_n|G]  a.s.

3. If (X_n, n ≥ 0) is a sequence of random variables converging a.s. to X, and if there exists Z ∈ L¹(Ω, F, P) such that sup_n |X_n| ≤ Z a.s., then (conditional dominated convergence theorem)

    lim_n E[X_n|G] = E[X|G],  a.s. and in L¹.

4. If φ : ℝ → (−∞, ∞] is a convex function and X ∈ L¹(Ω, F, P), and either φ is non-negative or φ(X) ∈ L¹(Ω, F, P), then (conditional Jensen inequality)

    E[φ(X)|G] ≥ φ(E[X|G])  a.s.

5. If 1 ≤ p < ∞ and X ∈ Lᵖ(Ω, F, P), then ‖E[X|G]‖_p ≤ ‖X‖_p. In particular, the linear operator X ↦ E[X|G] from Lᵖ(Ω, F, P) to Lᵖ(Ω, G, P) is continuous.

Proof. 1. Let X' be the increasing limit of E[X_n|G]. Let Z be a non-negative G-measurable random variable; then E[Z E[X_n|G]] = E[Z X_n], which by taking an increasing limit gives E[ZX'] = E[ZX], so X' = E[X|G].

2. We have E[inf_{k≥n} X_k|G] ≤ inf_{k≥n} E[X_k|G] for every n by monotonicity of the conditional expectation, and the result is obtained by passing to the limit and using 1.

3. Applying 2. to the non-negative random variables Z − X_n and Z + X_n, we get that E[Z − X|G] ≤ E[Z|G] − lim sup_n E[X_n|G] and that E[Z + X|G] ≤ E[Z|G] + lim inf_n E[X_n|G], giving the a.s. result. The L¹ result is a consequence of the dominated convergence theorem, since |E[X_n|G]| ≤ E[|X_n| | G] ≤ E[Z|G] a.s., and E[Z|G] is integrable.

4. A convex function is the upper envelope of its affine minorants, i.e.

    φ(x) = sup{ax + b : a, b ∈ ℝ, ay + b ≤ φ(y) for all y} = sup{ax + b : a, b ∈ ℚ, ay + b ≤ φ(y) for all y}.

The result is then a consequence of the linearity and positivity of the conditional expectation, and of the fact that ℚ is countable (this last fact is needed because conditional expectation is defined only a.s.).

5. One deduces from 4. and the previous proposition that

    ‖E[X|G]‖_p^p = E[|E[X|G]|^p] ≤ E[E[|X|^p | G]] = E[|X|^p] = ‖X‖_p^p,

if 1 ≤ p < ∞ and X ∈ Lᵖ(Ω, F, P). Thus ‖E[X|G]‖_p ≤ ‖X‖_p.
1.3 Specific properties of conditional expectation

The information contained in G can be factorized out of the conditional expectation:

Proposition 1.3.1 Let G ⊆ F be a σ-algebra, and let X, Y be real random variables such that either X, Y are non-negative, or X, XY ∈ L¹(Ω, F, P). Then, if Y is G-measurable, we have

    E[YX|G] = Y E[X|G].

Proof. Let Z be a non-negative G-measurable random variable. If X, Y are non-negative, then E[ZYX] = E[ZY E[X|G]] since ZY is non-negative, and the result follows by uniqueness. If X, XY are integrable, the same result follows by writing X = X⁺ − X⁻, Y = Y⁺ − Y⁻.

One has the tower property (restricting the information):

Proposition 1.3.2 Let G₁ ⊆ G₂ ⊆ F be σ-algebras. Then for every random variable X which is non-negative or integrable,

    E[E[X|G₂]|G₁] = E[X|G₁].

Proof. For a non-negative bounded G₁-measurable Z, Z is G₂-measurable as well, so that

    E[Z E[E[X|G₂]|G₁]] = E[Z E[X|G₂]] = E[ZX] = E[Z E[X|G₁]],

hence the result.

Proposition 1.3.3 Let G₁, G₂ be two sub-σ-algebras of F, and let X be a non-negative or integrable random variable. Then, if G₂ is independent of σ(σ(X), G₁),

    E[X|σ(G₁, G₂)] = E[X|G₁].

Proof. Let A ∈ G₁, B ∈ G₂; then

    E[1_{A∩B} E[X|σ(G₁, G₂)]] = E[1_A 1_B X] = E[1_B E[X 1_A|G₂]] = P(B) E[X 1_A] = P(B) E[1_A E[X|G₁]] = E[1_{A∩B} E[X|G₁]],

where we have used the independence property at the third and last steps. The proof is then completed by the monotone class theorem.

Proposition 1.3.4 Let X, Y be random variables and let G be a sub-σ-algebra of F such that Y is G-measurable and X is independent of G. Then for any non-negative measurable function f,

    E[f(X, Y)|G] = ∫ P(X ∈ dx) f(x, Y),

where P(X ∈ dx) is the law of X.

Proof. For any non-negative G-measurable random variable Z, X is independent of (Y, Z), so that the law P((X, Y, Z) ∈ dx dy dz) is equal to the product P(X ∈ dx) P((Y, Z) ∈ dy dz) of the law of X by the law of (Y, Z). Hence,

    E[Z f(X, Y)] = ∫∫ z f(x, y) P(X ∈ dx) P((Y, Z) ∈ dy dz) = ∫ P(X ∈ dx) E[Z f(x, Y)] = E[Z ∫ P(X ∈ dx) f(x, Y)],

where we used Fubini's theorem in two places. This shows the result.
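The tower property can be tested empirically when the conditioning σ-algebras are generated by discrete variables, so that conditional expectations reduce to class averages. The setup below (Y uniform on {0, 1, 2, 3}, G₁ = σ(Y mod 2) ⊆ G₂ = σ(Y), and the helper `cond_on`) is an illustrative choice of ours, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Illustrative setup: G1 = sigma(Y mod 2) is coarser than G2 = sigma(Y).
Y = rng.integers(0, 4, size=N)
X = Y**2 + rng.normal(size=N)

def cond_on(labels, V):
    """Empirical E[V | sigma(labels)]: average V over each label class."""
    out = np.empty(len(V))
    for v in np.unique(labels):
        m = labels == v
        out[m] = V[m].mean()
    return out

e2 = cond_on(Y, X)          # E[X | G2]
e12 = cond_on(Y % 2, e2)    # E[ E[X|G2] | G1 ]
e1 = cond_on(Y % 2, X)      # E[X | G1]
print(np.max(np.abs(e12 - e1)) < 1e-8)  # True: tower property holds
```

Averaging the fine class means, weighted by empirical frequencies, reproduces the coarse class means exactly, which is precisely Proposition 1.3.2 for finitely generated σ-algebras.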

1.4 Computing a conditional expectation

We give two concrete and important examples of computation of conditional expectations.

1.4.1 Conditional density functions

Suppose X, Y take values in ℝᵐ and ℝⁿ respectively, and that the law of (X, Y) has a density:

    P((X, Y) ∈ dx dy) = f_{X,Y}(x, y) dx dy.

Let f_Y(y) = ∫_{ℝᵐ} f_{X,Y}(x, y) dx, y ∈ ℝⁿ, be the density of Y. Then for all non-negative measurable h : ℝᵐ → ℝ and g : ℝⁿ → ℝ, we have

    E[h(X) g(Y)] = ∫_{ℝᵐ×ℝⁿ} h(x) g(y) f_{X,Y}(x, y) dx dy
                 = ∫_{ℝⁿ} g(y) f_Y(y) ( ∫_{ℝᵐ} h(x) (f_{X,Y}(x, y)/f_Y(y)) 1_{f_Y(y)>0} dx ) dy
                 = E[ψ(Y) g(Y)],

so E[h(X)|Y] = ψ(Y), where

    ψ(y) = (1/f_Y(y)) ∫_{ℝᵐ} h(x) f_{X,Y}(x, y) dx   if f_Y(y) > 0,

and ψ(y) = 0 otherwise. We interpret this result by saying that

    E[h(X)|Y] = ∫_{ℝᵐ} h(x) ν(Y, dx),

where ν(y, dx) = f_Y(y)⁻¹ f_{X,Y}(x, y) 1_{f_Y(y)>0} dx = f_{X|Y}(x|y) dx. The measure ν(y, dx) is called the conditional distribution given Y = y, and f_{X|Y}(x|y) is the conditional density function of X given Y = y. Notice that this function of x, y is defined only up to a zero-measure set.

1.4.2 The Gaussian case

Let (X, Y) be a Gaussian vector in ℝ². Take X' = aY + b, with a, b chosen so that Cov(X, Y) = a Var Y and a E[Y] + b = E[X]. In this case, Cov(Y, X − X') = 0, hence X − X' is independent of σ(Y) by properties of Gaussian vectors. Moreover, X − X' is centered, so for every B ∈ σ(Y) one has E[1_B X] = E[1_B X'], hence X' = E[X|Y].
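A simulation makes the Gaussian computation tangible. The particular pair below, X = 2Y + 1 + ε with Y, ε independent standard normals, is an illustrative choice of ours; the code recovers a and b from the samples and checks that the residual X − X' is centered and uncorrelated with Y, as in the argument above.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 400_000

# Illustrative Gaussian pair: Cov(X,Y) = 2, Var Y = 1, E[X] = 1.
Y = rng.normal(size=N)
X = 2.0 * Y + 1.0 + rng.normal(size=N)

a = np.cov(X, Y, ddof=0)[0, 1] / np.var(Y)   # a = Cov(X,Y)/Var(Y)
b = X.mean() - a * Y.mean()                  # b = E[X] - a E[Y]
resid = X - (a * Y + b)                      # X - X'

# X - X' is centered and uncorrelated with Y (hence, for a Gaussian
# vector, independent of sigma(Y)), so X' = aY + b is E[X|Y].
print(abs(resid.mean()) < 1e-8, abs(np.mean(resid * Y)) < 1e-6)  # True True
```

With the empirical moments plugged in, the two checks hold exactly up to floating-point error, by the same algebra as in the text; the estimate of a is close to the true value 2.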


Chapter 2 Discrete-time martingales


Before we focus entirely on discrete-time martingales, we start with a general discussion of stochastic processes, which covers both discrete- and continuous-time processes.

2.1 Basic notions

2.1.1 Stochastic processes, filtrations

Let (Ω, F, P) be a probability space. For a measurable state space (E, 𝓔) and a subset I ⊆ ℝ of times, or epochs, an E-valued stochastic process indexed by I is a collection (X_t, t ∈ I) of random variables. Most of the processes we will consider take values in ℝ, ℝᵈ, or ℂ, endowed with their Borel σ-algebras.

A filtration is a collection (F_t, t ∈ I) of sub-σ-algebras of F which is increasing (s ≤ t ⟹ F_s ⊆ F_t). Once a filtration is given, we call (Ω, F, (F_t)_{t∈I}, P) a filtered probability space. A process (X_t, t ∈ I) is adapted to the filtration (F_t, t ∈ I) if X_t is F_t-measurable for every t.

The intuitive idea is that F_t is the quantity of information available up to time t (the present). To give an informal example, if we are interested in the evolution of the stock market, we can take F_t to be the past history of the stock prices (or only some of them) up to time t. We let F_∞ = σ(⋃_{t∈I} F_t) ⊆ F be the information at the end of time.

Example. To every process (X_t, t ∈ I) one associates its natural filtration

    F_t^X = σ(X_s, s ≤ t),  t ∈ I.

Every process is adapted to its natural filtration, and F^X is the smallest filtration X is adapted to: F_t^X contains all the measurable events depending on (X_s, s ≤ t).

Last, a real-valued process (X_t, t ∈ I) is said to be integrable if E[|X_t|] < ∞ for all t ∈ I.

2.1.2 Martingales

Definition 2.1.1 Let (Ω, F, (F_t)_{t∈I}, P) be a filtered probability space. An ℝ-valued adapted integrable process (X_t, t ∈ I) is:

- a martingale if for every s ≤ t, E[X_t|F_s] = X_s;
- a supermartingale if for every s ≤ t, E[X_t|F_s] ≤ X_s;
- a submartingale if for every s ≤ t, E[X_t|F_s] ≥ X_s.

Notice that a (super-, sub-)martingale remains a (super-, sub-)martingale with respect to its natural filtration, by the tower property of conditional expectation.
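The standard first example of a martingale is the simple random walk. The sketch below checks the defining property E[X₆|F₅] = X₅ empirically; since the increments are independent, conditioning on F₅ reduces, for this check, to averaging over paths with the same value of X₅ (the simulation parameters are our own choices).

```python
import numpy as np

rng = np.random.default_rng(3)
paths, n = 100_000, 10

# Simple random walk X_n = Y_1 + ... + Y_n with P(Y_i = ±1) = 1/2:
# a martingale in its natural filtration.
steps = rng.choice([-1, 1], size=(paths, n))
X = np.cumsum(steps, axis=1)

# Empirical check of E[X_6 | F_5] = X_5: average X_6 over paths
# sharing a fixed value of X_5.
x5, x6 = X[:, 4], X[:, 5]
for v in (-1, 1, 3):
    m = x5 == v
    print(v, x6[m].mean())   # each average is close to v
```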

2.1.3 Doob's stopping times

Definition 2.1.2 Let (Ω, F, (F_t)_{t∈I}, P) be a filtered probability space. A stopping time (with respect to this space) is a random variable T : Ω → I ∪ {∞} such that {T ≤ t} ∈ F_t for every t ∈ I.

For example, constant times are (trivial) stopping times. If I = ℤ₊, the random variable n 1_A + ∞ 1_{A^c} is a stopping time if A ∈ F_n (with the convention ∞ · 0 = 0). The intuitive idea behind this definition is that T is a time at which a decision can be taken (given the information we have). For example, for a meteorologist having the weather information up to the present time, the first day of 2006 when the temperature is above 23°C is a stopping time, but the last day of 2006 when the temperature is above 23°C is not.

Example. If I ⊆ ℤ₊, the definition can be replaced by: {T = n} ∈ F_n for all n ∈ I.

When I is a subset of the integers, we will denote times by the letters n, m, k rather than t, s, r (so n ≥ 0 means n ∈ ℤ₊). Particularly important instances of stopping times in this case are the first entrance times. Let (X_n, n ≥ 0) be an adapted process and let A ∈ 𝓔. The first entrance time in A is

    T_A = inf{n ∈ ℤ₊ : X_n ∈ A} ∈ ℤ₊ ∪ {∞},

with the convention inf ∅ = ∞. It is a stopping time, since {T_A ≤ n} = ⋃_{0≤m≤n} X_m⁻¹(A).

On the contrary, the last exit time before some fixed N,

    L_A = sup{n ∈ {0, 1, . . . , N} : X_n ∈ A},

is in general not a stopping time.

As an immediate consequence of the definition, one gets:

Proposition 2.1.1 Let S, T, (T_n, n ∈ ℕ) be stopping times (with respect to some filtered probability space). Then S ∧ T, S ∨ T, inf_n T_n, sup_n T_n, lim inf_n T_n, lim sup_n T_n are stopping times.

Definition 2.1.3 Let T be a stopping time with respect to some filtered probability space (Ω, F, (F_t)_{t∈I}, P). We define F_T, the σ-algebra of events prior to time T, by

    F_T = {A ∈ F : A ∩ {T ≤ t} ∈ F_t for every t ∈ I}.

The reader is invited to check that this indeed defines a σ-algebra; it is interpreted as the set of events that are measurable with respect to the information available at time T: "4 days before the first day (T) in 2005 when the temperature was above 23°C, the temperature was below 10°C" is in F_T. If S, T are stopping times, one checks that

    S ≤ T ⟹ F_S ⊆ F_T.    (2.1)

Now suppose that I is countable. If (X_t, t ∈ I) is adapted and T is a stopping time, we let X_T 1_{T<∞} = X_{T(ω)}(ω) if T(ω) < ∞, and 0 otherwise. It is a random variable, as the composition of (ω, t) ↦ X_t(ω) and ω ↦ (ω, T(ω)), which are measurable (why?). We also let X^T = (X_{T∧t}, t ∈ I), and call it the process X stopped at T.

Proposition 2.1.2 Under these hypotheses,

1. X_T 1_{T<∞} is F_T-measurable;

2. the process X^T is adapted;

3. if moreover I = ℤ₊ and X is integrable, then X^T is integrable.

Proof. 1. Let A ∈ 𝓔. Then {X_T ∈ A} ∩ {T ≤ t} = ⋃_{s∈I, s≤t} {X_s ∈ A} ∩ {T = s}. Then notice that {T = s} = {T ≤ s} \ ⋃_{u<s} {T ≤ u} ∈ F_s.

2. For every t ∈ I, X_{T∧t} is F_{T∧t}-measurable, hence F_t-measurable since T ∧ t ≤ t, by (2.1).

3. If I = ℤ₊ and X is integrable, E[|X_n^T|] = Σ_{m<n} E[|X_m| 1_{T=m}] + E[|X_n| 1_{T≥n}] ≤ (n+1) sup_{0≤m≤n} E[|X_m|].

From now on until the end of the section (except in the paragraph on backwards martingales), we will suppose that E = ℝ and I = ℤ₊ (discrete-time processes).

2.2 Discrete-time martingales: optional stopping

We consider a filtered probability space (Ω, F, (F_n), P). All the above terminology (stopping times, adapted processes, and so on) will be with respect to this space.

We first introduce the so-called martingale transform, sometimes called the discrete stochastic integral with respect to a (super-, sub-)martingale X. We say that a process (C_n, n ≥ 1) is previsible if C_n is F_{n−1}-measurable for every n ≥ 1. A previsible process can be interpreted as a strategy: one bets at time n only with the knowledge accumulated up to time n − 1.

If (X_n, n ≥ 0) is adapted and (C_n, n ≥ 1) is previsible, we define an adapted process C · X by

    (C · X)_n = Σ_{k=1}^n C_k (X_k − X_{k−1}).

We can interpret this new process as follows: if X_n is a certain amount of money at time n and C_n is the bet of a player at time n, then (C · X)_n is the total winnings of the player at time n.

Proposition 2.2.1 In this setting, if X is a martingale and C is bounded, then C · X is a martingale. If X is a supermartingale (resp. submartingale) and C_n ≥ 0 for every n ≥ 1, then C · X is a supermartingale (resp. submartingale).

Proof. Suppose X is a martingale. Since C is bounded, the process C · X is trivially integrable. Since C_{n+1} is F_n-measurable,

    E[(C · X)_{n+1} − (C · X)_n | F_n] = E[C_{n+1}(X_{n+1} − X_n)|F_n] = C_{n+1} E[X_{n+1} − X_n|F_n] = 0.

The (super-, sub-)martingale cases are similar.

Theorem 2.2.1 (Optional stopping) Let (X_n, n ≥ 0) be a martingale (resp. super-, submartingale).

(i) If T is a stopping time, then X^T is also a martingale (resp. super-, submartingale).

(ii) If S ≤ T are bounded stopping times, then E[X_T|F_S] = X_S (resp. E[X_T|F_S] ≤ X_S, E[X_T|F_S] ≥ X_S).

(iii) If S ≤ T are bounded stopping times, then E[X_T] = E[X_S] (resp. E[X_T] ≤ E[X_S], E[X_T] ≥ E[X_S]).

Proof. (i) Let C_n = 1_{n≤T}; then C is a previsible non-negative bounded process, and it is immediate that (C · X)_n = X_{T∧n} − X_0. The first result follows from Proposition 2.2.1.

(ii) If now S, T are bounded stopping times with S ≤ T, and A ∈ F_S, we define C_n = 1_A 1_{S<n≤T}. Then C is a non-negative bounded previsible process, since A ∩ {S < n} = A ∩ {S ≤ n−1} ∈ F_{n−1} and {n ≤ T} = {T ≤ n−1}^c ∈ F_{n−1}. Moreover, X_S, X_T are integrable since S, T are bounded, and (C · X)_K = 1_A (X_T − X_S) as soon as K ≥ T a.s. Since C · X is a martingale, E[(C · X)_K] = E[(C · X)_0] = 0, so E[1_A X_T] = E[1_A X_S]. As this holds for every A ∈ F_S and X_S is F_S-measurable, it entails E[X_T|F_S] = X_S.

(iii) Follows by taking expectations in (ii).

Notice that the last two statements are not true in general for unbounded stopping times. For example, if (Y_n, n ≥ 1) are independent random variables taking values ±1 with probability 1/2 each, then X_n = Σ_{1≤i≤n} Y_i is a martingale. If T = inf{n ≥ 0 : X_n = 1}, then it is classical that T < ∞ a.s., but of course E[X_T] = 1 > 0 = E[X_0].
However, for non-negative supermartingales, Fatou's lemma entails:

Proposition 2.2.2 Suppose X is a non-negative supermartingale. Then for any stopping time T which is a.s. finite, we have E[X_T] ≤ E[X_0].
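Both sides of the optional stopping picture can be seen in simulation: for the random walk and T the first hitting time of +1, the bounded stopping time T ∧ 100 preserves the mean, while the unbounded T does not. The horizon truncation and parameters below are simulation artefacts of our choosing.

```python
import numpy as np

rng = np.random.default_rng(4)
paths, horizon = 10_000, 2_000

steps = rng.choice(np.array([-1, 1], dtype=np.int8), size=(paths, horizon))
X = steps.cumsum(axis=1, dtype=np.int16)

# T = first hitting time of +1 (clipped at the horizon, an artefact of
# finite simulation; in the theory, T < infinity a.s.).
hit = X == 1
hits = hit.any(axis=1)
T = np.where(hits, hit.argmax(axis=1), horizon - 1)

# Bounded stopping: E[X_{T ∧ 100}] = E[X_0] = 0, as in Theorem 2.2.1.
# Unbounded stopping fails: X_T = 1 on every path that ever hits +1.
print(X[np.arange(paths), np.minimum(T, 100)].mean())  # near 0
print(X[np.arange(paths), T][hits].mean())             # exactly 1.0
```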

Beware that this ≤ sign should not in general be turned into an = sign, even if X is a martingale! The very same proposition is actually true without the assumption that P(T < ∞) = 1, by the martingale convergence theorem 2.3.1 below.

2.3 Discrete-time martingales: the convergence theorem

The martingale convergence theorem is the most important result in this chapter.

Theorem 2.3.1 (Martingale convergence theorem) If X is a supermartingale which is bounded in L¹(Ω, F, P), i.e. such that sup_n E[|X_n|] < ∞, then X_n converges a.s. towards an a.s. finite limit X_∞.

An easy and important corollary is:

Corollary 2.3.1 A non-negative supermartingale converges a.s. towards an a.s. finite limit.

Indeed, for a non-negative supermartingale, E[|X_n|] = E[X_n] ≤ E[X_0] < ∞.

The proof of Theorem 2.3.1 relies on an estimate of the number of upcrossings of a submartingale between two levels a < b. If (x_n, n ≥ 0) is a real sequence and a < b are two real numbers, we define two integer-valued sequences S_k(x), T_k(x), k ≥ 1, recursively as follows. Let T_0(x) = 0 and, for k ≥ 0, let

    S_{k+1}(x) = inf{n ≥ T_k(x) : x_n < a},   T_{k+1}(x) = inf{n ≥ S_{k+1}(x) : x_n > b},

with the usual convention inf ∅ = ∞. The number N_n([a, b], x) = sup{k > 0 : T_k(x) ≤ n} is the number of upcrossings of x between a and b before time n; it increases as n → ∞ to the total number of upcrossings N([a, b], x) = sup{k > 0 : T_k(x) < ∞}. The key is the following simple analytic lemma:

Lemma 2.3.1 A real sequence x converges (in ℝ ∪ {±∞}) if and only if N([a, b], x) < ∞ for all rationals a < b.

Proof. If there exist rationals a < b such that N([a, b], x) = ∞, then lim inf_n x_n ≤ a < b ≤ lim sup_n x_n, so that x does not converge. If x does not converge, then lim inf_n x_n < lim sup_n x_n, so by taking two rationals a < b in between, we get the converse statement.

Theorem 2.3.2 (Doob's upcrossing lemma) Let X be a supermartingale, and let a < b be two reals. Then for every n ≥ 0,

    (b − a) E[N_n([a, b], X)] ≤ E[(X_n − a)⁻].

Proof. It is immediate by induction that S_k = S_k(X), T_k = T_k(X) defined as above are stopping times. Define a previsible process C, taking only the values 0 and 1, by

    C_n = Σ_{k≥1} 1_{S_k < n ≤ T_k}.

It is indeed previsible, since {S_k < n ≤ T_k} = {S_k ≤ n−1} ∩ {T_k ≤ n−1}^c ∈ F_{n−1}. Now, letting N_n = N_n([a, b], X), we have

    (C · X)_n = Σ_{i=1}^{N_n} (X_{T_i} − X_{S_i}) + (X_n − X_{S_{N_n +1}}) 1_{S_{N_n +1} ≤ n} ≥ (b − a) N_n − (X_n − a)⁻,

since each completed upcrossing contributes at least b − a, and on {S_{N_n +1} ≤ n} we have X_{S_{N_n +1}} < a, so the last term is at least −(X_n − a)⁻. Since C is a non-negative bounded previsible process, C · X is a supermartingale, so finally

    (b − a) E[N_n] − E[(X_n − a)⁻] ≤ E[(C · X)_n] ≤ 0,

hence the result.

Proof of Theorem 2.3.1. Since (x + y)⁻ ≤ |x| + |y|, we get from Theorem 2.3.2 that E[N_n] ≤ (b − a)⁻¹ E[|X_n| + |a|], and since N_n increases to N = N([a, b], X), we get by monotone convergence E[N] ≤ (b − a)⁻¹ (sup_n E[|X_n|] + |a|). In particular, N([a, b], X) < ∞ a.s. for all a < b ∈ ℚ, so

    P( ⋂_{a<b∈ℚ} {N([a, b], X) < ∞} ) = 1.

Hence the a.s. convergence to some X_∞, possibly infinite. Now Fatou's lemma gives E[|X_∞|] ≤ lim inf_n E[|X_n|] < ∞ by hypothesis, hence |X_∞| < ∞ a.s.

Exercise. In fact, from Theorem 2.3.2 it is clear that the condition sup_n E[X_n⁻] < ∞ is sufficient; prove that it actually implies boundedness in L¹ (provided X is a supermartingale, of course).
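The recursive definition of S_k and T_k translates directly into a small state machine counting upcrossings; the following sketch (our own, for illustration) implements N([a, b], x) for a finite sequence.

```python
def upcrossings(x, a, b):
    """N([a, b], x) for a finite sequence x: the number of completed
    passages from strictly below a to strictly above b, following the
    recursive definition of S_k and T_k above."""
    count, below = 0, False   # below = True once S_{k+1} has occurred
    for v in x:
        if not below and v < a:
            below = True      # reached S_{k+1}: the sequence went below a
        elif below and v > b:
            below = False     # reached T_{k+1}: an upcrossing is complete
            count += 1
    return count

x = [0.5, -1.0, 2.0, -1.5, 0.3, 2.5, 0.0]
print(upcrossings(x, 0, 1))   # 2 completed upcrossings of [0, 1]
```

Running this counter on simulated supermartingale paths and comparing the average count with (b − a)⁻¹ E[(X_n − a)⁻] gives a concrete check of Doob's upcrossing lemma.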

2.4 Doob's inequalities and Lᵖ convergence, p > 1

2.4.1 A maximal inequality

Proposition 2.4.1 Let X be a submartingale. Then, letting X̄_n = sup_{0≤k≤n} X_k, for every c > 0 and n ≥ 0,

    c P(X̄_n ≥ c) ≤ E[X_n 1_{X̄_n ≥ c}] ≤ E[X_n⁺].

Proof. Letting T = inf{k ≥ 0 : X_k ≥ c}, we obtain by optional stopping that

    E[X_n] ≥ E[X_{T∧n}] = E[X_n 1_{T>n}] + E[X_T 1_{T≤n}] ≥ E[X_n 1_{T>n}] + c P(T ≤ n).

Since {T ≤ n} = {X̄_n ≥ c}, the conclusion follows.

Theorem 2.4.1 (Doob's Lᵖ inequality) Let p > 1 and let X be a martingale. Then, letting X*_n = sup_{0≤k≤n} |X_k|, we have

    ‖X*_n‖_p ≤ (p/(p−1)) ‖X_n‖_p.

Proof. Since x ↦ |x| is convex, the process (|X_n|, n ≥ 0) is a non-negative submartingale. Applying Proposition 2.4.1 and Hölder's inequality shows that

    E[(X*_n)^p] = ∫_0^∞ p x^{p−1} P(X*_n ≥ x) dx
               ≤ ∫_0^∞ p x^{p−2} E[|X_n| 1_{X*_n ≥ x}] dx
               = p E[ |X_n| ∫_0^{X*_n} x^{p−2} dx ]
               = (p/(p−1)) E[|X_n| (X*_n)^{p−1}]
               ≤ (p/(p−1)) ‖X_n‖_p ‖(X*_n)^{p−1}‖_{p/(p−1)}
               = (p/(p−1)) ‖X_n‖_p E[(X*_n)^p]^{(p−1)/p},

which yields the result.

Theorem 2.4.2 Let X be a martingale and p > 1. Then the following statements are equivalent:

1. X is bounded in Lᵖ(Ω, F, P): sup_{n≥0} ‖X_n‖_p < ∞;

2. X converges a.s. and in Lᵖ to a random variable X_∞;

3. there exists some Z ∈ Lᵖ(Ω, F, P) such that X_n = E[Z|F_n].

Proof. 1. ⟹ 2. Suppose X is bounded in Lᵖ; then in particular it is bounded in L¹, so it converges a.s. to some finite X_∞ by Theorem 2.3.1. Moreover, X_∞ ∈ Lᵖ by an easy application of Fatou's theorem. Next, Doob's inequality ‖X*_n‖_p ≤ (p/(p−1)) ‖X_n‖_p ≤ C < ∞ entails ‖X*‖_p < ∞ by monotone convergence, where X* = sup_{n≥0} |X_n| is the monotone limit of X*_n. Since |X_n − X_∞| ≤ 2X* ∈ Lᵖ, dominated convergence entails that X_n converges to X_∞ in Lᵖ.

2. ⟹ 3. Since conditional expectation is continuous as a linear operator on Lᵖ spaces (Proposition 1.2.2), if X_m → X_∞ in Lᵖ, we have, for n ≤ m, X_n = E[X_m|F_n] → E[X_∞|F_n] as m → ∞, so Z = X_∞ is a suitable choice.

3. ⟹ 1. This is immediate by the conditional Jensen inequality.

A martingale of the form in 3. is said to be closed (in Lᵖ). Notice that in this case X_∞ = E[Z|F_∞], where F_∞ = σ(⋃_{n≥0} F_n). Indeed, ⋃_{n≥0} F_n is a π-system that generates F_∞, and moreover, if B ∈ F_N say is an element of this π-system, then E[1_B Z] = E[1_B E[Z|F_N]] = E[1_B X_N] → E[1_B X_∞] as N → ∞. Since X_∞ = lim sup_n X_n is F_∞-measurable, this gives the result.

Therefore, for p > 1, the map Z ∈ Lᵖ(Ω, F_∞, P) ↦ (E[Z|F_n], n ≥ 0) is a bijection between Lᵖ(Ω, F_∞, P) and the set of martingales that are bounded in Lᵖ.


2.5 Uniform integrability and convergence in L¹

The case of L¹ convergence is a little different from Lᵖ for p > 1, as one needs to suppose uniform integrability rather than mere boundedness in L¹. Notice that uniform integrability follows from boundedness in Lᵖ.

Theorem 2.5.1 Let X be a martingale. The following statements are equivalent:
1. (X_n, n ≥ 0) is uniformly integrable.
2. X_n converges a.s. and in L¹(Ω, F, P) to a limit X_∞.
3. There exists Z ∈ L¹(Ω, F, P) such that X_n = E[Z|F_n], n ≥ 0.

Proof. 1. ⟹ 2. Suppose X is uniformly integrable; then it is bounded in L¹, so by Theorem 2.3.1 it converges a.s. By properties of uniform integrability, it then converges in L¹.

2. ⟹ 3. This follows the same proof as above: Z = X_∞ is a suitable choice.

3. ⟹ 1. This is a straightforward consequence of the fact that {E[X|G] : G is a sub-σ-algebra of F} is U.I.; see example sheet 1.

As above, we then have E[Z|F_∞] = X_∞, and this theorem says that there is a one-to-one correspondence between U.I. martingales and L¹(Ω, F_∞, P).

Exercise. Show that if X is a U.I. supermartingale (resp. submartingale), then X_n converges a.s. and in L¹ to a limit X_∞, such that E[X_∞|F_n] ≤ X_n (resp. ≥) for every n.

2.6 Optional stopping in the case of U.I. martingales

We give an improved version of the optional stopping theorem, in which the boundedness condition on the stopping time is lifted and replaced by a uniform integrability condition on the martingale. Since a U.I. martingale has a well-defined limit X_∞, we unambiguously let X_T = X_T 1_{T<∞} + X_∞ 1_{T=∞} for any stopping time T.

Theorem 2.6.1 Let X be a U.I. martingale, and S, T be two stopping times with S ≤ T. Then E[X_T|F_S] = X_S.

Proof. We first check that X_T ∈ L¹: indeed, since |X_n| ≤ E[|X_∞| | F_n],

E[|X_T|] = Σ_{n=0}^∞ E[|X_n| 1_{T=n}] + E[|X_∞| 1_{T=∞}] ≤ Σ_{n∈Z₊∪{∞}} E[|X_∞| 1_{T=n}] = E[|X_∞|].

Next, if B ∈ F_T, then since B ∩ {T = n} ∈ F_n,

E[1_B X_∞] = Σ_{n∈Z₊∪{∞}} E[1_B 1_{T=n} X_∞] = Σ_{n∈Z₊∪{∞}} E[1_B 1_{T=n} X_n] = E[1_B X_T],

so that X_T = E[X_∞|F_T]. Finally, E[X_T|F_S] = E[E[X_∞|F_T]|F_S] = E[X_∞|F_S] = X_S, by the tower property.
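Theorem 2.6.1 can be watched at work on the classical gambler's-ruin example: a simple symmetric random walk started at a and stopped on hitting {0, b} is a bounded, hence U.I., martingale, so E[X_T] = a, which forces P(hit b before 0) = a/b. A minimal Monte Carlo sketch, with all parameters (a, b, sample size, seed) arbitrary illustrative choices:

```python
import random

def hit_prob(a=3, b=10, n_paths=20000, seed=7):
    """Simple symmetric random walk started at a, run until it hits 0 or b.
    The stopped walk is bounded, hence a U.I. martingale, so Theorem 2.6.1
    gives E[X_T] = a, i.e. P(hit b before 0) = a/b."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        x = a
        while 0 < x < b:
            x += rng.choice((-1, 1))
        hits += (x == b)
    return hits / n_paths

p = hit_prob()
assert abs(p - 0.3) < 0.02   # a/b = 3/10
```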


2.7 Backwards martingales

Backwards martingales are martingales whose time-set is Z₋. More precisely, given a filtration ... ⊂ G₋₂ ⊂ G₋₁ ⊂ G₀, a process (X_n, n ≤ 0) is a backwards martingale if E[X_{n+1}|G_n] = X_n for every n ≤ −1, as in the usual definition. They are somehow nicer than forward martingales, as they are automatically U.I.: indeed X₀ ∈ L¹, and E[X₀|G_n] = X_n for every n ≤ 0.

Adapting Doob's upcrossing theorem is a simple exercise: if N_m([a,b], X) is the number of upcrossings of a backwards martingale from a to b between times −m and 0, one has, considering the (forward) supermartingale (X_{−m+k}, 0 ≤ k ≤ m), that

(b−a) E[N_m([a,b], X)] ≤ E[(X₀ − a)⁻].

As m → ∞, N_m([a,b], X) increases to the total number of upcrossings of X from a to b, and this allows one to conclude that X_n converges a.s. as n → −∞ to a G₋∞-measurable random variable X₋∞, where G₋∞ = ⋂_{n≤0} G_n. We proved:

Theorem 2.7.1 Let X be a backwards martingale. Then X_n converges a.s. and in L¹ as n → −∞ to the random variable X₋∞ = E[X₀|G₋∞]. Moreover, if X₀ ∈ Lᵖ for some p > 1, then X is bounded in Lᵖ and converges in Lᵖ as n → −∞.


Chapter 3 Examples of applications of discrete-time martingales

3.1 Kolmogorov's 0−1 law, law of large numbers

Let (Y_n, n ≥ 1) be a sequence of independent random variables.

Theorem 3.1.1 (Kolmogorov's 0−1 law) The tail σ-algebra G_∞ = ⋂_{n≥0} G_n, where G_n = σ(Y_m, m ≥ n), is trivial: every A ∈ G_∞ has probability 0 or 1.

Proof. Let F_n = σ(Y_1, ..., Y_n), n ≥ 1. Let A ∈ G_∞. Then E[1_A|F_n] = P(A), since F_n is independent of G_{n+1}, hence of G_∞. Therefore, the martingale convergence theorem gives E[1_A|F_∞] = 1_A = P(A) a.s., since G_∞ ⊂ F_∞. Hence P(A) ∈ {0, 1}.

Suppose now that the Y_i are real-valued i.i.d. random variables in L¹. Let S_n = Σ_{k=1}^n Y_k, n ≥ 0, be the associated random walk.

Theorem 3.1.2 (LLN) A.s. as n → ∞, S_n/n → E[Y₁].

Proof. Let H_n = σ(S_n, S_{n+1}, ...) = σ(S_n, Y_{n+1}, Y_{n+2}, ...). We have E[S_n|H_{n+1}] = S_{n+1} − E[Y_{n+1}|S_{n+1}]. Now, by symmetry we have E[Y_{n+1}|S_{n+1}] = E[Y_k|S_{n+1}] for every 1 ≤ k ≤ n+1, so that it equals (n+1)^{−1} E[S_{n+1}|S_{n+1}] = S_{n+1}/(n+1). Finally, E[S_n/n | H_{n+1}] = S_{n+1}/(n+1), so that (S_n/n, n ≥ 1), read backwards in n, is a backwards martingale with respect to its natural filtration. Therefore, S_n/n converges a.s. and in L¹ to a limit which is a.s. constant by Kolmogorov's 0−1 law, so it must be equal to its mean value: E[S₁|H_∞] = E[S₁] = E[Y₁].
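Theorem 3.1.2 is easy to observe numerically. The sketch below is purely illustrative (the uniform law, sample size and seed are arbitrary choices): it draws i.i.d. variables with mean 1 and checks that S_n/n is close to E[Y₁].

```python
import random

def running_mean(n=100000, seed=3):
    """Draw i.i.d. Y_k uniform on [0, 2] (so E[Y_1] = 1) and return S_n / n."""
    rng = random.Random(seed)
    s = 0.0
    for _ in range(n):
        s += rng.uniform(0.0, 2.0)
    return s / n

m = running_mean()
assert abs(m - 1.0) < 0.01   # S_n/n is close to E[Y_1] = 1
```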

3.2 Branching processes

Let μ be a probability distribution on Z₊, and consider a Markov process (Z_n, n ≥ 0) in Z₊ whose one-step transitions are determined by the following rule: given Z_n = z, take z independent random variables Y_1, ..., Y_z with law μ, and let Z_{n+1} have the same distribution as Y_1 + ... + Y_z. In particular, 0 is an absorbing state for this process. This can be interpreted as follows: Z_n is the number of individuals present in a population, and at each generation each individual dies after giving birth to a μ-distributed number of sons, independently of the others. Notice that E[Z_{n+1}|F_n] = E[Z_{n+1}|Z_n] = m Z_n, where (F_n, n ≥ 0) is the natural filtration, and m = Σ_z z μ({z}) is the mean of μ. Therefore, supposing m ∈ (0, ∞):

Proposition 3.2.1 The process (m^{−n} Z_n, n ≥ 0) is a non-negative martingale.

Notice that the fact that the martingale converges a.s. to a finite value immediately implies that when m < 1, there exists some n with Z_n = 0, i.e. the population becomes extinct in finite time (since Z_n is integer-valued and Z_n/m^n converges while m^{−n} → ∞). One also guesses that when m > 1, Z_n should be of order m^n, so that the population should grow explosively, at least with a positive probability.

Exercise. Let φ(s) = Σ_{z∈Z₊} μ({z}) s^z be the generating function of μ; we suppose μ({1}) < 1. Show that if Z₀ = 1, then the generating function of Z_n is the n-fold composition of φ with itself. Show that the probability q of eventual extinction of the population satisfies φ(q) = q, and that q < 1 ⟺ m > 1. As a hint, φ is a convex function such that φ′(1) = m.

Notice that, still supposing Z₀ = 1, the martingale (M_n = Z_n/m^n, n ≥ 0) cannot be U.I. when m ≤ 1, since it converges to 0 a.s., so E[M_∞] < E[M₀]. This leaves open the question whether P(M_∞ > 0) > 0 in the case m > 1. We are going to address the problem in a particular case:

Proposition 3.2.2 Suppose m > 1, Z₀ = 1 and σ² = Var(μ) < ∞. Then the martingale M is bounded in L², and hence converges a.s. and in L² to a variable M_∞ with E[M_∞] = 1; in particular, P(M_∞ > 0) > 0.

Proof. We compute E[Z_{n+1}²|F_n] = Z_n² m² + Z_n σ². This shows that E[M_{n+1}²] = E[M_n²] + σ² m^{−n−2}, and therefore, since (m^{−n−2}, n ≥ 0) is summable, M is bounded in L² (this last statement is actually equivalent to m > 1).

Exercise. Show that under these hypotheses, {M_∞ > 0} and {lim_n Z_n = ∞} are equal, up to an event of vanishing probability.

3.3 A martingale approach to the Radon–Nikodym theorem

We begin with the following general remark. Let (Ω, F, (F_n), P) be a filtered probability space with F_∞ = F, and let Q be a finite non-negative measure on (Ω, F). Let P_n and Q_n denote the restrictions of P and Q to the measurable space (Ω, F_n). Suppose that for every n, Q_n has a density M_n with respect to P_n, namely Q_n(dω) = M_n(ω) P_n(dω), where M_n is an F_n-measurable non-negative function. We also sometimes write M_n = dQ_n/dP_n.


Then it is immediate that (M_n, n ≥ 0) is a martingale with respect to the filtered space (Ω, F, (F_n), P). Indeed, E[M_n] = Q(Ω) < ∞, and for A ∈ F_n,

E^P[M_{n+1} 1_A] = E^{P_{n+1}}[M_{n+1} 1_A] = Q_{n+1}(A) = Q_n(A) = E^{P_n}[M_n 1_A] = E^P[M_n 1_A],

where E^P, E^{P_n} denote expectations with respect to the probability measures P, P_n. A natural problem is to wonder whether the identity Q_n = M_n · P_n passes to the limit Q = M_∞ · P as n → ∞, where M_∞ is the a.s. limit of the non-negative martingale M.

Proposition 3.3.1 Under these hypotheses, there exists a non-negative random variable X := dQ/dP such that Q = X · P if and only if (M_n, n ≥ 0) is U.I.

Proof. If M is U.I., then we can pass to the limit in E[M_m 1_A] = Q(A) for A ∈ F_n and m → ∞, to obtain E[M_∞ 1_A] = Q(A) for every A ∈ ⋃_{n≥0} F_n. Since this last set is a π-system that generates F_∞ = F, we obtain M_∞ · P = Q by the theorem on uniqueness of measures. Conversely, if Q = X · P, then for A ∈ F_n we have Q(A) = E[M_n 1_A] = E[X 1_A], so that M_n = E[X|F_n], which shows that M is U.I.

The Radon–Nikodym theorem (in a particular case) states as follows.

Theorem 3.3.1 (Radon–Nikodym) Let (Ω, F) be a measurable space such that F is separable, i.e. generated by a countable set of events (F_k, k ≥ 1). Let P be a probability measure on (Ω, F) and Q a finite non-negative measure on (Ω, F). Then the following statements are equivalent.
(i) Q is absolutely continuous with respect to P, namely, for every A ∈ F, P(A) = 0 ⟹ Q(A) = 0.
(ii) ∀ε > 0, ∃δ > 0, ∀A ∈ F, P(A) ≤ δ ⟹ Q(A) ≤ ε.
(iii) There exists a non-negative random variable X such that Q = X · P.

The separability condition on F can actually be lifted; see Williams' book for the proof in the general case.

Proof. That (iii) implies (i) is straightforward. If (ii) is not satisfied, then we can find a sequence B_n of events and an ε > 0 such that P(B_n) ≤ 2^{−n} but Q(B_n) ≥ ε. But by the Borel–Cantelli lemma, P(lim sup_n B_n) = 0, while Q(lim sup_n B_n), as the decreasing limit of Q(⋃_{k≥n} B_k) as n → ∞, must be ≥ lim sup_n Q(B_n) ≥ ε. Hence (i) does not hold for the set A = lim sup_n B_n. So (i) implies (ii).

Let us now assume (ii). Let (F_n) be the filtration in which F_n is the σ-algebra generated by the events F_1, ..., F_n. Notice that any event of F_n is a disjoint union of non-empty atoms of the form ⋂_{1≤i≤n} G_i, where either G_i = F_i or its complement. We let A_n be the set of atoms of F_n, and set

M_n(ω) = Σ_{A∈A_n} (Q(A)/P(A)) 1_A(ω),

with the convention that 0/0 = 0. Then it is easy to check that M_n is a density for Q_n with respect to P_n, where P_n, Q_n denote restrictions to F_n as above. Indeed, if A ∈ A_n,

Q_n(A) = Q(A) = (Q(A)/P(A)) P(A) = E^{P_n}[M_n 1_A].
Therefore, (M_n, n ≥ 0) is a non-negative (F_n, n ≥ 0)-martingale, and M_n(ω) converges a.s. towards a limit M_∞(ω). Moreover, the last proposition tells us that it suffices to show that (M_n) is U.I. to conclude the proof. But note that we have E[M_n 1_{M_n ≥ a}] = Q(M_n ≥ a). So for ε > 0 fixed, with δ as in (ii), P(M_n ≥ a) ≤ E[M_n]/a = Q(Ω)/a ≤ δ for all n as soon as a is large enough, and this entails Q(M_n ≥ a) ≤ ε for every n. Hence (M_n) is U.I., and the result.

Example. Let Ω = [0, 1) be endowed with its Borel σ-field, which is generated by {I_{k,j} = [j 2^{−k}, (j+1) 2^{−k}), k ≥ 0, 0 ≤ j ≤ 2^k − 1}. The intervals I_{k,j}, 0 ≤ j ≤ 2^k − 1, are called the dyadic intervals of depth k; they generate a σ-algebra which we call F_k. We let λ(dω) be the Lebesgue measure on [0, 1). Let μ be a finite non-negative measure on [0, 1), and

M_n(ω) = 2^n Σ_{j=0}^{2^n−1} 1_{I_{n,j}}(ω) μ(I_{n,j});

then we obtain from the previous theorem that if μ is absolutely continuous with respect to λ, then μ = f · λ for some non-negative measurable f. We then see that λ-a.s., if I_k(x) = [2^{−k}⌊2^k x⌋, 2^{−k}(⌊2^k x⌋ + 1)) denotes the dyadic interval of depth k containing x,

2^k ∫_{I_k(x)} f(y) λ(dy) → f(x) as k → ∞.

This is a particular case of the Lebesgue differentiation theorem.
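The dyadic example can be made concrete. For the illustrative choice μ(dx) = 2x dx on [0, 1), the masses μ(I_{n,j}) = (2j+1) 4^{−n} are explicit, so M_n(x) = (2j+1) 2^{−n} with j = ⌊2^n x⌋, and the martingale converges to the density f(x) = 2x with error at most 2^{−n}:

```python
def dyadic_density(x, n):
    """M_n(x) = 2^n * mu(I_n(x)) for mu(dx) = 2x dx on [0, 1):
    mu([j 2^-n, (j+1) 2^-n)) = (2j+1) 4^-n, hence M_n(x) = (2j+1) 2^-n."""
    j = int(x * 2**n)   # index of the dyadic interval of depth n containing x
    return (2 * j + 1) / 2**n

# The dyadic martingale converges to the density: |M_n(x) - 2x| <= 2^-n.
for n in (2, 5, 10, 15):
    for x in (0.1, 0.333, 0.5, 0.9):
        assert abs(dyadic_density(x, n) - 2 * x) <= 2.0**(-n) + 1e-12
```

The bound 2^{−n} holds because 2x lies in [2j·2^{−n}, (2j+2)·2^{−n}) while M_n(x) is the midpoint (2j+1)·2^{−n} of that interval.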

3.4 Product martingales and likelihood ratio tests

Theorem 3.4.1 (Kakutani's theorem) Let (Y_n, n ≥ 1) be a sequence of independent non-negative random variables with mean 1. Let F_n = σ(Y_1, ..., Y_n). Then X_n = ∏_{1≤k≤n} Y_k, n ≥ 0, is an (F_n, n ≥ 0)-martingale, which converges a.s. to some X_∞ ≥ 0. Letting a_n = E[√Y_n], the following statements are equivalent:
1. X is U.I.
2. E[X_∞] = 1.
3. P(X_∞ > 0) > 0.
4. ∏_{n≥1} a_n > 0.

Proof. The fact that X is a (non-negative) martingale follows from E[X_{n+1}|F_n] = X_n E[Y_{n+1}|F_n] = X_n E[Y_{n+1}] = X_n. For the same reason, the process

M_n = ∏_{k=1}^n (√Y_k / a_k), n ≥ 0,

is a non-negative martingale with mean E[M_n] = 1, and E[M_n²] = ∏_{k=1}^n a_k^{−2}. Thus M is bounded in L² if and only if ∏_n a_n > 0 (notice that a_n ∈ (0, 1], e.g. by the Cauchy–Schwarz inequality E[1 · √Y_n] ≤ √(E[Y_n])). Now, with the standard notation X*_n = sup_{0≤k≤n} X_k, and since X_n = M_n² ∏_{1≤k≤n} a_k² ≤ M_n², Doob's L² inequality gives

E[X*_n] ≤ E[(M*_n)²] ≤ 4 E[M_n²],

which shows that if M is bounded in L², then X* = sup_{n≥0} X_n is integrable, hence X is U.I. since it is dominated by X*. We thus have obtained 4. ⟹ 1. ⟹ 2. ⟹ 3., where the second implication comes from the optional stopping theorem for U.I. martingales, and the implication 2. ⟹ 3. is trivial. On the other hand, if ∏_n a_n = 0, then since M_n converges a.s. to some M_∞ ≥ 0, X_n = M_n² ∏_{1≤k≤n} a_k² converges to 0, so that 3. does not hold. So 3. ⟹ 4., hence the result.

Note that, if Y_n > 0 a.s. for every n, the event {X_∞ = 0} is a tail event, so that 3. above is equivalent to P(X_∞ > 0) = 1 by Kolmogorov's 0−1 law.

As an example of application of this theorem, consider a σ-finite measured space (E, E, λ) and let Ω = E^N, F = E^{⊗N} be the product measurable space. We let X_n(ω) = ω_n, n ≥ 1, and F_n = σ(X_1, ..., X_n). One says that X is the canonical (E-valued) process. Now suppose given two families of probability measures (μ_n, n ≥ 1) and (ν_n, n ≥ 1) that admit densities dμ_n = f_n dλ, dν_n = g_n dλ with respect to λ. We suppose that f_n(x) g_n(x) > 0 for every n, x. Let P = ⊗_{n≥1} μ_n, resp. Q = ⊗_{n≥1} ν_n, denote the measures on (Ω, F) under which (X_n, n ≥ 1) is a sequence of independent random variables with respective laws μ_n (resp. ν_n). In particular, if A = ∏_{i=1}^n A_i × E^N is a measurable rectangle in F_n,

Q(A) = ∏_{i=1}^n ∫_{A_i} g_i(x_i) λ(dx_i) = ∫_{A_1×...×A_n} ∏_{i=1}^n (g_i(x_i)/f_i(x_i)) ∏_{i=1}^n f_i(x_i) λ(dx_i) = E^P[M_n 1_A],

where E^P denotes expectation with respect to P, and

M_n = ∏_{i=1}^n g_i(X_i)/f_i(X_i).

Since measurable rectangles of F_n form a π-system that generates F_n, the probability Q|_{F_n} is absolutely continuous with respect to P|_{F_n}, with density M_n, so that (M_n, n ≥ 1) is a non-negative martingale with respect to the filtered space (Ω, F, (F_n, n ≥ 0), P). Since here a_n = E^P[√(g_n(X_n)/f_n(X_n))] = ∫_E √(f_n(x) g_n(x)) λ(dx), Kakutani's theorem shows that M converges a.s. and in L¹ to its limit M_∞ if and only if

∏_{n≥1} ∫_E √(f_n(x) g_n(x)) λ(dx) > 0.

In this case, one has Q(A) = E^P[M_∞ 1_A] for every measurable rectangle of F, and Q is absolutely continuous with respect to P with density M_∞. In the opposite case, M_∞ = 0 a.s., so Proposition 3.3.1 shows that Q and P are carried by two disjoint measurable sets.

3.4.1 Example: consistency of the likelihood ratio test

In the case where μ_n = μ and ν_n = ν for every n, we see that M_∞ = 0 a.s. if (and only if) μ ≠ ν. This is called the consistency of the likelihood ratio test in statistics. Let us recall the background for the application of this test. Suppose given an i.i.d. sample X_1, X_2, ..., X_n with an unknown common distribution. Suppose one wants to test the hypothesis (H₀) that this distribution is P against the hypothesis (H₁) that it is Q, where P and Q have everywhere positive densities f, g with respect to some common σ-finite measure (for example, a normal distribution and a Cauchy distribution). Letting M_n = ∏_{1≤i≤n} g(X_i)/f(X_i), we use the test 1_{M_n ≤ 1} for acceptance of H₀ against H₁. Supposing H₀, we have M_∞ = 0 a.s., so the probability of rejection P(M_n > 1) converges to 0. Similarly, supposing H₁, we have M_n → +∞ a.s. (apply the previous argument to 1/M_n), so the probability of rejection goes to 1.
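This consistency is easy to watch in simulation. With f the N(0,1) density and g the N(1,1) density (an illustrative choice of the two hypotheses), log(g/f)(x) = x − 1/2, so under H₀ the log-likelihood ratio has mean −n/2 and drifts to −∞, making rejections vanishingly rare:

```python
import random

def log_lr(xs):
    """log of M_n = prod g(x_i)/f(x_i) for f = N(0,1), g = N(1,1) densities:
    log(g/f)(x) = x - 1/2."""
    return sum(x - 0.5 for x in xs)

rng = random.Random(5)
n, n_paths = 200, 500
reject_h0 = 0
for _ in range(n_paths):
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]   # sample drawn under H0
    reject_h0 += (log_lr(xs) > 0.0)                # the test rejects iff M_n > 1
rate = reject_h0 / n_paths
# Under H0, log M_n ~ N(-n/2, n), so P(M_n > 1) = Phi(-sqrt(n)/2) is tiny here.
assert rate < 0.01
```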

Chapter 4 Continuous-parameter stochastic processes

In this chapter, we consider the case when processes are indexed by a real interval I ⊂ R with non-empty interior; in many cases I will be R₊. This makes the whole study more involved, as we now stress. In all that follows, the state space E is assumed to be a metric space, usually E = R or E = R^d endowed with the Euclidean norm.

4.1 Theoretical problems when dealing with continuous-time processes

Although the definitions of filtrations, adapted processes, stopping times, martingales, super- and sub-martingales are unchanged compared to the discrete case (see the beginning of Chapter 2), the use of continuous time induces important measurability problems. Indeed, there is no reason why an adapted process (ω, t) ↦ X_t(ω) should be a measurable map, or even why the sample path t ↦ X_t(ω) should be measurable for any fixed ω. In particular, stopped processes like X_T 1_{T<∞} for a stopping time T have no reason to be random variables. Even worse, there are in general very few stopping times: for example, first entrance times inf{t ≥ 0 : X_t ∈ A} for measurable (or even open or closed) subsets A of the state space E need not be stopping times.

This is the reason why we add a priori requirements on the regularity of the random processes under consideration. A quite natural requirement is that they be continuous processes, i.e. that t ↦ X_t(ω) is continuous for a.e. ω, because a continuous function is determined by its values on a countable dense subset of I. More generally, we will consider processes that are right-continuous and admit left limits everywhere, a.s.; such processes are called càdlàg, and are also determined by the values they take on a countable dense subset of I (the notation càdlàg stands for the French "continu à droite, limites à gauche").

We let C(I, E), D(I, E) denote the spaces of continuous and càdlàg functions from I to E; we consider these sets as measurable spaces by endowing them with the product σ-algebra that makes the projections π_t : X ↦ X_t measurable for every t ∈ I. Usually, we will consider processes with values in R, or sometimes R^d for some d ≥ 1 in the chapter


on Brownian motion. The following proposition holds, of which (ii) is an analogue of 1., 2. in Proposition 2.1.2.

Proposition 4.1.1 Let (Ω, F, (F_t, t ∈ I), P) be a filtered probability space, and let (X_t, t ∈ I) be an adapted process with values in E.
(i) Suppose X is continuous (i.e. (X_t(ω), t ∈ I) ∈ C(I, E) for every ω). If A is a closed set, and inf I > −∞, then the random time T_A = inf{t ∈ I : X_t ∈ A} is a stopping time.
(ii) Let T be a stopping time, and suppose X is càdlàg. Then X_T 1_{T<∞} : ω ↦ X_{T(ω)}(ω) 1_{T(ω)<∞} is an F_T-measurable random variable. Moreover, the stopped process X^T = (X_{T∧t}, t ≥ 0) is adapted.

Proof. For (i), notice that if A is closed and X is continuous, then for every t ∈ I,

{T_A ≤ t} = { inf_{s∈I∩Q, s≤t} d(X_s, A) = 0 },

where d(x, A) = inf_{y∈A} d(x, y) is the distance from x to the set A. Indeed, if X_s ∈ A for some s ≤ t, then for q_n converging to s in Q ∩ I ∩ (−∞, t], X_{q_n} converges to X_s, so that d(X_{q_n}, A) converges to 0. Conversely, if there exist q_n ∈ Q ∩ I ∩ (−∞, t] such that d(X_{q_n}, A) converges to 0, then since inf I > −∞ we can extract a subsequence and assume q_n converges to some s ∈ I ∩ (−∞, t], and this s has to satisfy d(X_s, A) = 0 by continuity of X. Since A is closed, this implies X_s ∈ A, so that ω ∈ {T_A ≤ t}.

For (ii), first note that a random variable Z is F_T-measurable if Z 1_{T≤t} is F_t-measurable for every t ∈ I, by approximating Z by a finite sum of the form Σ_i λ_i 1_{A_i} with A_i ∈ F_T. Notice also that if T is a stopping time then, ⌈x⌉ denoting the smallest n ∈ Z₊ with n ≥ x, T_n = 2^{−n} ⌈2^n T⌉ is also a stopping time with T_n ≥ T, which decreases to T as n → ∞ (T_n = ∞ if T = ∞). Indeed, {T_n ≤ t} = {T ≤ 2^{−n} ⌊2^n t⌋} ∈ F_t (notice that ⌈x⌉ ≤ y if and only if x ≤ ⌊y⌋, where ⌊y⌋ is the largest n ∈ Z₊ with n ≤ y). Moreover, T_n takes values in the set D_n = {k 2^{−n}, k ∈ Z₊} ∪ {∞} of dyadic numbers of level n (or ∞). Therefore,

X_T 1_{T<∞} 1_{T≤t} = X_t 1_{T=t} + X_T 1_{T<t},

which by the càdlàg property is equal to

X_t 1_{T=t} + lim_{n→∞} X_{T_n∧t} 1_{T<t}.

The variables X_t 1_{T=t} and X_{T_n∧t} 1_{T<t} are F_t-measurable, because

X_{T_n∧t} = Σ_{d∈D_n, d≤t} X_d 1_{T_n=d} + X_t 1_{t<T_n};

hence the result. For the statement on the stopped process, notice that for every t, X_{T∧t} is F_{T∧t}-, hence F_t-measurable.

It turns out that (i) does not hold in general for càdlàg processes, although it is a very subtle problem to find counterexamples. See Rogers and Williams' book, Chapters II.74


and II.75. In particular, Lemma 75.1 therein shows that T_A is a stopping time if A is compact and X is an adapted càdlàg process, whenever the filtration (F_t, t ∈ I) satisfies the so-called usual conditions (see Section 4.3 for the definition of these conditions). You may check as an exercise that the times T_A for open sets A, associated with càdlàg processes, are stopping times with respect to the filtration (F_{t+}, t ∈ I), where

F_{t+} = ⋂_{s>t} F_s.

Somehow, the filtration (F_{t+}) foresees what will happen just after t.

4.2 Finite marginal distributions, versions

We now discuss the notion of the law of a process. If (X_t, t ∈ I) is a stochastic process, we can consider it as a random variable with values in the set E^I of maps f : I → E, where this last space is endowed with the product σ-algebra (the smallest σ-algebra that makes the projections f ∈ E^I ↦ f(t) measurable for every t ∈ I). It is then natural to consider the image measure of the probability P by the process X as the law of X. However, this measure is uneasy to manipulate, and the quantities of true interest are the following simpler objects.

Definition 4.2.1 Let (X_t, t ∈ I) be a process. For every finite J ⊂ I, the finite marginal distribution of X indexed by J is the law μ_J of the E^J-valued random variable (X_t, t ∈ J).

It is a nice fact that the finite marginal distributions {μ_J : J ⊂ I, #J < ∞} uniquely characterize the law of the process (X_t, t ∈ I) as defined above. Indeed, if X and Y are processes having the same finite marginal laws, then their distributions agree on the π-system of finite rectangles of the form ∏_{t∈J} A_t × ∏_{t∈I∖J} E for finite J ⊂ I, which generates the product σ-algebra; hence the distributions under consideration are equal. Notice that this uniqueness result does not imply the existence of a process with given marginal distributions.

The problem with (finite marginal) laws of processes is that they are powerless in dealing with properties of processes that involve more than countably many times, such as continuity or the càdlàg property. For example, if X is a continuous process, there are (many!) non-continuous processes that have the same finite marginal distributions as X: the finite marginal distributions just do not see the sample path properties of the process. This motivates the following definition.

Definition 4.2.2 If X and X′ are two processes defined on some common probability space (Ω, F, P), we say that X′ is a version of X if for every t, X_t(ω) = X′_t(ω) a.s.

In particular, two versions X and X′ of the same process share the same finite-dimensional distributions; however, this does not say that there exists an event of full probability on which X_t(ω) = X′_t(ω) for every t. This becomes true if both X and X′ are a priori known to be càdlàg, for instance.


Example. To explain these very abstract notions, suppose we want to find a process (X_t, 0 ≤ t ≤ 1) whose finite marginal laws are Dirac masses at 0, namely

μ_J({(0, 0, ..., 0)}) = P(X_s = 0, s ∈ J) = 1 (with #J zeros)

for every finite J ⊂ [0, 1]. Of course, the process X_t = 0, 0 ≤ t ≤ 1, satisfies this. However, the process X′_t = 1_{{U}}(t), 0 ≤ t ≤ 1, where U is a uniform random variable on [0, 1], is a version of X, and therefore has the same law as X. But of course it is not continuous, and P(X′_t = 0 ∀t ∈ [0, 1]) = 0. We thus want to consider it as a bad version of the zero process.

This example motivates the following way of dealing with processes: when considering a process whose finite marginal distributions are known, we first try to find as regular a version of the process as we can, before studying it. We will discuss two regularization theorems in this course, the martingale regularization theorem and Kolmogorov's continuity criterion, which are instances of situations in which there exists a regular (continuous or càdlàg) version of the stochastic process under consideration.

4.3 The martingale regularization theorem

We consider here a martingale (X_t, t ≥ 0) on some filtered probability space (Ω, F, (F_t, t ≥ 0), P). We let N be the set of events in F with probability 0,

F_{t+} = ⋂_{s>t} F_s, t ≥ 0,

and F̃_t = σ(F_{t+} ∪ N).

Theorem 4.3.1 Let (X_t, t ≥ 0) be a martingale. Then there exists a càdlàg process X̃ which is a martingale with respect to the filtered probability space (Ω, F, (F̃_t, t ≥ 0), P), such that for every t ≥ 0, X_t = E[X̃_t|F_t] a.s. If F̃_t = F_t for every t ≥ 0, X̃ is therefore a càdlàg version of X.

We say that (F_t, t ≥ 0) satisfies the usual conditions if F̃_t = F_t for every t, that is, N ⊂ F_0 and F_{t+} = F_t (a filtration satisfying this last condition for every t ≥ 0 is called right-continuous; notice that (F_{t+}, t ≥ 0) is right-continuous for every filtration (F_t, t ≥ 0)). As a corollary of Theorem 4.3.1, in the case when the filtration satisfies the usual conditions, a martingale admits a càdlàg version, so there is little to lose in considering that martingales are càdlàg.

Lemma 4.3.1 A function f : Q₊ → R admits a left and a right (finite) limit at every t ∈ R₊ if and only if, for all rationals a < b and every bounded I ⊂ Q₊, f is bounded on I and the number

N([a, b], I, f) = sup{n ≥ 0 : ∃ 0 ≤ s₁ < t₁ < ... < s_n < t_n, all in I, f(s_i) < a, f(t_i) > b, 1 ≤ i ≤ n}

of upcrossings of f from a to b is finite.


Proof of Theorem 4.3.1. We first show that X is a.s. bounded on bounded subsets of Q₊. Indeed, if I is such a subset and J = {a₁, ..., a_k} is a finite subset of I with a₁ < ... < a_k, then (M_l = X_{a_l}, 1 ≤ l ≤ k) is a martingale. Doob's maximal inequality applied to the submartingale |M| then shows that

c P(max_{1≤l≤k} |X_{a_l}| > c) ≤ E[|X_{a_k}|] ≤ E[|X_K|]

for any K > sup I. Therefore, taking a monotone limit over finite J ⊂ I with union I, we have

c P(sup_{t∈I} |X_t| > c) ≤ E[|X_K|].

This shows that P(sup_{t∈I} |X_t| < ∞) = 1, by letting c → ∞.

Let I still be a bounded subset of Q₊, and let a < b be rationals. By definition, we have N([a,b], I, X) = sup_{J⊂I finite} N([a,b], J, X). So let J ⊂ I be a finite subset of the form {a₁, a₂, ..., a_k} as above, and again let M_l = X_{a_l}, 1 ≤ l ≤ k. Doob's upcrossing lemma for this martingale gives

(b−a) E[N([a,b], J, X)] ≤ E[(X_{a_k} − a)⁻] ≤ E[(X_K − a)⁻]

for any K ≥ sup I, because ((X_t − a)⁻, t ≥ 0) is a submartingale due to the convexity of x ↦ (x − a)⁻. Taking the supremum over J shows that N([a,b], I, X) is a.s. finite, because E[|X_K|] < ∞. Letting K → ∞ along integers, this shows that N([a,b], I, X) is finite for every bounded subset I of Q₊ and all rationals a < b, for every ω in an event Ω₀ with probability 1. Therefore, by Lemma 4.3.1, we can define

X̃_t(ω) = lim_{s∈Q₊, s↓t} X_s(ω), ω ∈ Ω₀,

and X̃_t(ω) = 0 for every t for ω ∉ Ω₀. The process X̃ thus obtained is indeed adapted to the filtration (F̃_t, t ≥ 0). It remains to show that X̃ is an (F̃_t)-martingale, satisfies E[X̃_t|F_t] = X_t, and is càdlàg.

First, check that X remains an (F_t ∨ N, t ≥ 0)-martingale, because E[X|σ(G ∪ N)] = E[X|G] in L¹(Ω, σ(G ∪ N), P) for any integrable X and sub-σ-algebra G ⊂ F. Thus, we may suppose that N ⊂ F_t for every t. Let s < t in R₊, and let (s_n, n ≥ 0) be a strictly decreasing sequence of rationals converging to s, with s₀ < t. Then X̃_s = lim_n X_{s_n} = lim_n E[X_t|F_{s_n}], by definition, for ω ∈ Ω₀. Now, the process (X_{s_n}, n ≥ 0) is a backwards martingale with respect to the filtration (F_{s_n}, n ≥ 0). The backwards martingale convergence theorem thus shows that X̃_s = E[X_t|F_{s+}], and therefore E[X̃_t|F_t] = X_t. Moreover, taking a rational sequence (t_n) decreasing to t and using again the backwards martingale convergence theorem, (X_{t_n}) converges to X̃_t in L¹, so that X̃_s = E[X̃_t|F̃_s] for every s ≤ t.

The only thing that remains to prove is the càdlàg property. If t ∈ R₊ and X̃_s(ω) does not converge to X̃_t(ω) as s ↓ t, then |X̃_t − X̃_s| > ε for some ε > 0 and infinitely many s > t, so that if ω ∈ Ω₀, |X̃_t − X_u| > ε/2 for an infinite number of rationals u > t, contradicting ω ∈ Ω₀. The argument for showing that X̃ has left limits is similar.

From now on, when considering martingales in continuous time, we will always take their càdlàg version, provided the underlying filtration satisfies the usual hypotheses.


4.4 Doob's inequalities and convergence theorems for martingales in continuous time

Considering càdlàg martingales makes it straightforward to generalize the inequalities of Section 2.4 to the continuous case, by density arguments. We leave it to the reader to show the following theorems, which are analogues of the discrete-time case.

Proposition 4.4.1 (A.s. convergence) Let (X_t, t ≥ 0) be a càdlàg martingale which is bounded in L¹. Then X_t converges as t → ∞, a.s., to an (a.s.) finite limit X_∞.

To prove this, notice that convergence of X_t as t → ∞ to a (possibly infinite) limit is equivalent to the fact that the number of upcrossings of X from below a to above b over the time interval R₊ is finite for all rationals a < b. However, by the càdlàg property, it suffices to restrict our attention to the countable time set Q₊ rather than R₊. Indeed, for each upcrossing of X from a to b between times s < t, say, we can find rationals s′ > s, t′ > t, as close to s, t as wanted, such that X accomplishes an upcrossing from a to b between times s′, t′, and this implies that N(X, R₊, [a,b]) = N(X, Q₊, [a,b]) (possibly infinite). Then, use similar arguments to those used in the first part of the proof of Theorem 4.3.1.

Proposition 4.4.2 (Doob's inequalities) If (X_t, t ≥ 0) is a càdlàg martingale and X*_t = sup_{0≤s≤t} |X_s|, then for every c > 0, t ≥ 0,

c P(X*_t ≥ c) ≤ E[|X_t|].

Moreover, if p > 1 then

‖X*_t‖_p ≤ (p/(p−1)) ‖X_t‖_p.

To prove this, notice that X*_t = sup_{s∈{t}∪([0,t]∩Q)} |X_s| by the càdlàg property.

Proposition 4.4.3 (Lᵖ convergence) (i) If X is a càdlàg martingale and p > 1, then sup_{t≥0} ‖X_t‖_p < ∞ if and only if X converges a.s. and in Lᵖ to its limit X_∞, and this if and only if X is closed in Lᵖ, i.e. there exists Z ∈ Lᵖ such that E[Z|F_t] = X_t for every t, a.s. (one can then take Z = X_∞).
(ii) If X is a càdlàg martingale, then X is U.I. if and only if X converges a.s. and in L¹ to its limit X_∞, and this if and only if X is closed (in L¹).

Proposition 4.4.4 (Optional stopping) Let X be a càdlàg U.I. martingale. Then for all stopping times S ≤ T, one has E[X_T|F_S] = X_S a.s.

Proof. Let T_n be the stopping time 2^{−n} ⌈2^n T⌉ as defined in the proof of Proposition 4.1.1. The right-continuity of the paths of X shows that X_{T_n} converges to X_T a.s. Moreover, T_n takes values in the countable set D_n of dyadic rationals of level n (and ∞), so that

E[X_∞|F_{T_n}] = Σ_{d∈D_n} E[1_{T_n=d} X_∞ | F_{T_n}] = Σ_{d∈D_n} 1_{T_n=d} E[X_∞|F_d]

(you should check this carefully). Now, since X_t converges to X_∞ in L¹, X_d = E[X_t|F_d] = E[X_∞|F_d] a.s., and hence E[X_∞|F_{T_n}] = X_{T_n}. Passing to the limit as n → ∞ and using the backwards martingale convergence theorem, we obtain E[X_∞|F_{T+}] = X_T, where F_{T+} = ⋂_{n≥1} F_{T_n}, and therefore E[X_∞|F_T] = X_T by the tower property, since X_T is F_T-measurable. The theorem then follows as in Theorem 2.6.1.

4.5 Kolmogorov's continuity criterion

Theorem 4.5.1 (Kolmogorov's continuity criterion) Let (X_t, 0 ≤ t ≤ 1) be a stochastic process with real values. Suppose there exist p > 0, c > 0, ε > 0 such that for every s, t ∈ [0, 1],

E[|X_t − X_s|^p] ≤ c |t − s|^{1+ε}.

Then there exists a version X̃ of X which is a.s. continuous (and even α-Hölder continuous for any α ∈ (0, ε/p)).

Proof. Let D_n = {k 2^{−n}, 0 ≤ k ≤ 2^n} denote the dyadic numbers of [0, 1] of level n, so D_n increases as n increases. Then, letting α ∈ (0, ε/p), Markov's inequality gives, for 0 ≤ k < 2^n,

P(|X_{k2^{−n}} − X_{(k+1)2^{−n}}| > 2^{−nα}) ≤ 2^{nαp} E[|X_{k2^{−n}} − X_{(k+1)2^{−n}}|^p] ≤ c 2^{nαp} 2^{−n(1+ε)} = c 2^{−n(1+ε−αp)}.

Summing over D_n we obtain

P( sup_{0≤k<2^n} |X_{k2^{−n}} − X_{(k+1)2^{−n}}| > 2^{−nα} ) ≤ c 2^{−n(ε−αp)},

which is summable. Therefore, the Borel–Cantelli lemma shows that for a.a. ω, there exists N such that if n ≥ N, the supremum under consideration is ≤ 2^{−nα}. Otherwise said, a.s.,

sup_{n≥0} sup_{k∈{0,...,2^n−1}} |X_{k2^{−n}} − X_{(k+1)2^{−n}}| / 2^{−nα} = M(ω) < ∞.

We claim that this implies that for every s, t ∈ D = ⋃_{n≥0} D_n,

|X_s − X_t| ≤ M′(ω) |t − s|^α,

for some M′(ω) < ∞ a.s. Indeed, if s, t ∈ D, s < t, and if r is the least integer such that t − s > 2^{−r−1}, we can write [s, t) as a disjoint union of intervals of the form [u, u + 2^{−n}) with u ∈ D_n and n > r, in such a way that for every n > r at most two of these intervals have length 2^{−n}. This entails

|X_s − X_t| ≤ 2 Σ_{n≥r+1} M(ω) 2^{−nα} ≤ 2 (1 − 2^{−α})^{−1} M(ω) 2^{−(r+1)α} ≤ M′(ω) |t − s|^α,

where M′(ω) < ∞ a.s. Therefore, the process (X_t, t ∈ D) is a.s. uniformly continuous (and even α-Hölder continuous). Since D is an everywhere dense set in [0, 1], the latter process a.s. admits a unique continuous extension X̃ on [0, 1], which is also α-Hölder continuous (it is consistently defined by X̃_t = lim_n X_{t_n}, where (t_n, n ≥ 0) is any D-valued sequence converging to t). On the exceptional set where (X_d, d ∈ D) is not uniformly


continuous, we let $\widetilde{X}_t = 0$, $0 \le t \le 1$, so $\widetilde{X}$ is continuous. It remains to show that $\widetilde{X}$ is a version of $X$. To this end, we estimate by Fatou's lemma
$$E[|X_t - \widetilde{X}_t|^p] \le \liminf_{n \to \infty} E[|X_t - X_{t_n}|^p],$$
where $(t_n, n \ge 0)$ is any $D$-valued sequence converging to $t$. But since $E[|X_t - X_{t_n}|^p] \le c|t - t_n|^{1+\varepsilon}$, this converges to $0$ as $n \to \infty$. Therefore, $\widetilde{X}_t = X_t$ a.s. for every $t$. $\square$

The nice thing about this criterion is that it depends only on a control of the two-dimensional marginal distributions of the stochastic process. In fact, the very same proof gives the following alternative statement.

Corollary 4.5.1 Let $(X_d, d \in D)$ be a stochastic process indexed by the set $D$ of dyadic numbers in $[0,1]$. Assume that there exist $c, p, \varepsilon > 0$ such that for every $s, t \in D$,
$$E[|X_s - X_t|^p] \le c\,|s-t|^{1+\varepsilon}.$$
Then, almost surely, the process $(X_d, d \in D)$ has an extension $(X_t, t \in [0,1])$ that is continuous, and even Hölder continuous of any index $\alpha \in (0, \varepsilon/p)$.
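To make the hypothesis concrete: for a standard Brownian motion (constructed in Chapter 6), $E[|B_t - B_s|^4] = 3|t-s|^2$, so the criterion applies with $p = 4$, $c = 3$, $\varepsilon = 1$, and yields Hölder continuity of every index $\alpha < 1/4$. The following Monte Carlo sketch (the function name and sample sizes are our own choices, not part of the notes) estimates this fourth moment directly from Gaussian increments:

```python
import math
import random

random.seed(0)

def increment_moment(dt, p=4, n_samples=200_000):
    # An increment B_{s+dt} - B_s of Brownian motion is N(0, dt);
    # estimate E|B_{s+dt} - B_s|^p by direct sampling.
    sd = math.sqrt(dt)
    return sum(abs(random.gauss(0.0, sd)) ** p for _ in range(n_samples)) / n_samples

for dt in (0.5, 0.25, 0.125):
    m = increment_moment(dt)
    # Gaussian fourth moment: E|N(0, dt)|^4 = 3 dt^2, i.e. c = 3 and 1 + eps = 2
    assert abs(m - 3 * dt * dt) < 0.1 * (3 * dt * dt)
```

Note that $p = 2$ would only give $E[|B_t - B_s|^2] = |t-s|$, with no admissible $\varepsilon > 0$: higher moments are genuinely needed to apply the criterion to Brownian motion.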

Chapter 5 Weak convergence


5.1 Definition and characterizations

Let $(M, d)$ be a metric space, endowed with its Borel $\sigma$-algebra. All measures in this chapter will be measures on such a measurable space. Let $(\mu_n, n \ge 0)$ be a sequence of probability measures on $M$. We say that $\mu_n$ converges weakly to the non-negative measure $\mu$ if for every continuous bounded function $f : M \to \mathbb{R}$, one has $\mu_n(f) \to \mu(f)$. Notice that in this case, $\mu$ is automatically a probability measure since $\mu(1) = 1$, and the definition actually still makes sense if we suppose that the $\mu_n$ (resp. $\mu$) are just finite non-negative measures on $M$.

Examples. Let $(x_n, n \ge 0)$ be an $M$-valued sequence that converges to $x$. Then $\delta_{x_n}$ converges weakly to $\delta_x$, where $\delta_a$ is the Dirac mass at $a$. This is just saying that $f(x_n) \to f(x)$ for continuous functions $f$.

Let $M = [0,1]$ and $\mu_n = n^{-1} \sum_{0 \le k \le n-1} \delta_{k/n}$. Then $\mu_n(f)$ is the Riemann sum $n^{-1} \sum_{0 \le k \le n-1} f(k/n)$, which converges to $\int_0^1 f(x)\,dx$ if $f$ is continuous, which shows that $\mu_n$ converges weakly to Lebesgue measure on $[0,1]$.

In these two cases, notice that it is not true that $\mu_n(A)$ converges to $\mu(A)$ for every Borel set $A$. This pointwise convergence is stronger, but much more rigid, than weak convergence. For example, $\delta_{x_n}$ does not converge in that sense to $\delta_x$ unless $x_n = x$ eventually. See e.g. Chapter III in Stroock's book for a discussion of the various existing notions of convergence for measures.

Theorem 5.1.1 Let $(\mu_n, n \ge 0)$ be a sequence of probability distributions. The following assertions are equivalent:
1. $\mu_n$ converges weakly to $\mu$
2. For every open subset $G$ of $M$, $\liminf_n \mu_n(G) \ge \mu(G)$ (open sets can lose mass)
3. For every closed subset $F$ of $M$, $\limsup_n \mu_n(F) \le \mu(F)$ (closed sets can gain mass)
4. For every Borel subset $A$ of $M$ with $\mu(\partial A) = 0$, $\lim_n \mu_n(A) = \mu(A)$ (mass is lost or gained through the boundary).
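The Riemann-sum example above is easy to check numerically. This short sketch (function names are ours) evaluates $\mu_n(f) = n^{-1}\sum_{0 \le k \le n-1} f(k/n)$ for a continuous test function and watches it approach $\int_0^1 f$:

```python
import math

def mu_n(f, n):
    # mu_n(f) = (1/n) * sum_{k=0}^{n-1} f(k/n), the integral of f
    # against the discrete measure mu_n of the example
    return sum(f(k / n) for k in range(n)) / n

f = math.cos                  # continuous and bounded on [0, 1]
exact = math.sin(1.0)         # integral of cos over [0, 1]
errs = [abs(mu_n(f, n) - exact) for n in (10, 100, 1000)]
assert errs[0] > errs[1] > errs[2]   # the approximation improves with n
assert errs[2] < 1e-3
```

By contrast, $\mu_n(A) \to \mu(A)$ fails for, say, $A = \mathbb{Q} \cap [0,1]$: every $\mu_n(A) = 1$, while Lebesgue measure gives $0$, which is why weak convergence only tests against continuous functions.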


Proof. 1. $\Rightarrow$ 2. Let $G$ be an open subset with nonempty complement $G^c$. The distance function $d(x, G^c)$ is continuous, and positive if and only if $x \in G$. Let $f_M(x) = 1 \wedge (M\, d(x, G^c))$. Then $f_M$ increases to $1_G$ as $M \to \infty$. Now, $\mu_n(f_M) \le \mu_n(G)$ and $\mu_n(f_M)$ converges to $\mu(f_M)$, so that $\liminf_n \mu_n(G) \ge \mu(f_M)$ for every $M$, and by monotone convergence, letting $M \to \infty$, one gets the result.

2. $\Leftrightarrow$ 3. is obvious by taking complementary sets.

2., 3. $\Rightarrow$ 4. Let $A^\circ$ and $\overline{A}$ respectively denote the interior and the closure of $A$. Since $\mu(\partial A) = \mu(\overline{A} \setminus A^\circ) = 0$, we obtain $\mu(A^\circ) = \mu(A) = \mu(\overline{A})$. Then
$$\limsup_n \mu_n(\overline{A}) \le \mu(\overline{A}), \qquad \liminf_n \mu_n(A^\circ) \ge \mu(A^\circ),$$
and since $A^\circ \subseteq A \subseteq \overline{A}$, this gives the result.

4. $\Rightarrow$ 1. Let $f : M \to \mathbb{R}_+$ be a continuous bounded non-negative function. Then, using Fubini's theorem,
$$\int_M f(x)\,\mu_n(dx) = \int_M \mu_n(dx) \int_0^\infty 1_{\{t \le f(x)\}}\,dt = \int_0^K \mu_n(\{f \ge t\})\,dt,$$
where $K$ is any upper bound for $f$. Now $\{f \ge t\} := \{x : f(x) \ge t\}$ is a closed subset of $M$, whose boundary is included in $\{f = t\}$, because $\{f > t\}$ is open and included in $\{f \ge t\}$, and their difference is $\{f = t\}$. However, there can be at most a countable set of numbers $t$ such that $\mu(\{f = t\}) > 0$, because
$$\{t : \mu(\{f = t\}) > 0\} = \bigcup_{n \ge 1} \{t : \mu(\{f = t\}) \ge n^{-1}\},$$
and the $n$-th set on the right-hand side has at most $n$ elements. Therefore, for Lebesgue-almost all $t$, $\mu(\partial\{f \ge t\}) = 0$, and therefore 4. and dominated convergence over the finite interval $[0, K]$, where the integrated quantities are bounded by $1$, show that $\mu_n(f)$ converges to $\int_0^K \mu(\{f \ge t\})\,dt = \mu(f)$. The case of functions taking values of both signs is immediate. $\square$

As a consequence, one obtains the following important criterion for weak convergence of measures on $\mathbb{R}$. Recall that the distribution function of a non-negative finite measure $\mu$ on $\mathbb{R}$ is the càdlàg function defined by $F_\mu(x) = \mu((-\infty, x])$, $x \in \mathbb{R}$.

Proposition 5.1.1 Let $\mu_n$, $n \ge 0$, and $\mu$ be probability measures on $\mathbb{R}$. Then the following statements are equivalent:
1. $\mu_n$ converges weakly to $\mu$
2. for every $x \in \mathbb{R}$ such that $F_\mu$ is continuous at $x$, $F_{\mu_n}(x)$ converges to $F_\mu(x)$ as $n \to \infty$.

Proof. The continuity of $F_\mu$ at $x$ exactly says that $\mu(\partial A_x) = 0$, where $A_x = (-\infty, x]$, so 1. $\Rightarrow$ 2. is immediate by Theorem 5.1.1.


Conversely, let $G$ be an open subset of $\mathbb{R}$, which we write as a countable union $\bigcup_k (a_k, b_k)$ of disjoint open intervals. Then
$$\mu_n(G) = \sum_k \mu_n((a_k, b_k)), \qquad (5.1)$$
while for every $k$ and $a_k < a' < b' < b_k$,
$$\mu_n((a_k, b_k)) = F_{\mu_n}(b_k{-}) - F_{\mu_n}(a_k) \ge F_{\mu_n}(b') - F_{\mu_n}(a').$$
If we take $a', b'$ to be continuity points of $F_\mu$, we then obtain $\liminf_n \mu_n((a_k, b_k)) \ge F_\mu(b') - F_\mu(a')$. Letting $a' \downarrow a_k$, $b' \uparrow b_k$ along continuity points of $F_\mu$ (such points always form a dense set in $\mathbb{R}$) gives $\liminf_n \mu_n((a_k, b_k)) \ge \mu((a_k, b_k))$. On the other hand, applying Fatou's lemma to (5.1) yields $\liminf_n \mu_n(G) \ge \sum_k \liminf_n \mu_n((a_k, b_k))$, whence $\liminf_n \mu_n(G) \ge \mu(G)$. $\square$
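Why Proposition 5.1.1 restricts to continuity points of $F_\mu$ is visible on the simplest example, $\mu_n = \delta_{1/n} \to \delta_0$ weakly. The sketch below (function names ours) shows that $F_{\mu_n}(x) \to F_\mu(x)$ everywhere except at the discontinuity $x = 0$:

```python
def F_n(x, n):
    # distribution function of the Dirac mass at 1/n
    return 1.0 if x >= 1.0 / n else 0.0

def F(x):
    # distribution function of the Dirac mass at 0
    return 1.0 if x >= 0.0 else 0.0

# convergence at every continuity point of F (every x != 0)
for x in (-0.5, 0.01, 0.5):
    assert F_n(x, 10 ** 6) == F(x)

# but not at x = 0: F_n(0) = 0 for every n, while F(0) = 1
assert all(F_n(0.0, n) == 0.0 for n in (1, 10, 100))
assert F(0.0) == 1.0
```

So pointwise convergence of distribution functions can genuinely fail on the (at most countable) set of atoms of the limit, without spoiling weak convergence.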

5.2 Convergence in distribution for random variables

If $(X_n, n \ge 0)$ is a sequence of random variables with values in a metric space $(M, d)$, defined on possibly different probability spaces $(\Omega_n, \mathcal{F}_n, P_n)$, we say that $X_n$ converges in distribution to a random variable $X$ on $(\Omega, \mathcal{F}, P)$ if the law of $X_n$ converges weakly to that of $X$. Otherwise said, $X_n$ converges in distribution to $X$ if for every continuous bounded function $f$, $E[f(X_n)]$ converges to $E[f(X)]$.

The two following examples are the probabilistic counterparts of the examples discussed at the beginning of the previous section.

Examples. If $(x_n)$ is a sequence in $M$ that converges to $x$, then $x_n$ converges as $n \to \infty$ to $x$ in distribution, if the $x_n$, $n \ge 0$, and $x$ are considered as (constant) random variables!

If $U$ is a uniform random variable on $[0,1)$ and $U_n = n^{-1} \lfloor nU \rfloor$, we see that $U_n$ has law $\mu_n$ and converges in distribution to $U$.

In the two cases we just discussed, the variables under consideration even converge a.s., which directly entails convergence in distribution; see the example sheets.

The notion of convergence in distribution is related to the other notions of convergence for random variables as follows. See Example sheet 3 for the proof.

Proposition 5.2.1 1. If $(X_n, n \ge 1)$ is a sequence of random variables that converges in probability to some random variable $X$, then $X_n$ converges in distribution to $X$.
2. If $(X_n, n \ge 1)$ is a sequence of random variables that converges in distribution to some constant random variable $c$, then $X_n$ converges to $c$ in probability.

Using Proposition 5.1.1, we can discuss the following


Example: the central limit theorem. The central limit theorem says that if $(X_n, n \ge 1)$ is a sequence of i.i.d. random variables in $L^2$ with $m = E[X_1]$ and $\sigma^2 = \mathrm{Var}(X_1)$, then for every $a < b$ in $\mathbb{R}$,
$$P\left(a \le \frac{S_n - mn}{\sigma\sqrt{n}} \le b\right) \xrightarrow[n \to \infty]{} \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2}\,dx,$$
where $S_n = X_1 + \ldots + X_n$. This is exactly saying that $(S_n - mn)/(\sigma\sqrt{n})$ converges in distribution as $n \to \infty$ to a Gaussian $N(0,1)$ random variable.
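A quick simulation illustrates the statement; we take the $X_i$ uniform on $[0,1]$ (so $m = 1/2$, $\sigma^2 = 1/12$) and compare the frequency of $\{-1 \le (S_n - mn)/(\sigma\sqrt{n}) \le 1\}$ with the Gaussian value $\approx 0.6827$. Sample sizes and tolerances are our own choices:

```python
import math
import random
import statistics

random.seed(42)
n, trials = 200, 10_000
m, sigma = 0.5, math.sqrt(1.0 / 12.0)   # mean and sd of Uniform(0, 1)

# sample the normalized sum (S_n - m n) / (sigma sqrt(n)) repeatedly
samples = []
for _ in range(trials):
    s = sum(random.random() for _ in range(n))
    samples.append((s - m * n) / (sigma * math.sqrt(n)))

freq = sum(1 for z in samples if -1.0 <= z <= 1.0) / trials
assert abs(statistics.mean(samples)) < 0.05      # centred
assert abs(freq - 0.6827) < 0.02                 # P(-1 <= N(0,1) <= 1)
```

By Proposition 5.1.1, checking such interval probabilities at continuity points is exactly checking weak convergence of the laws.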

5.3 Tightness

Definition 5.3.1 Let $\{\mu_i, i \in I\}$ be a family of probability measures on $M$. This family is said to be tight if for every $\varepsilon > 0$, there exists a compact subset $K \subseteq M$ such that
$$\sup_{i \in I} \mu_i(M \setminus K) \le \varepsilon,$$
i.e. most of the mass of $\mu_i$ is contained in $K$, uniformly in $i \in I$.

Proposition 5.3.1 (Prokhorov's theorem) Suppose that the sequence of probability measures $(\mu_n, n \ge 0)$ on $M$ is tight. Then there exists a subsequence $(\mu_{n_k}, k \ge 0)$ along which $\mu_n$ converges weakly to some limiting probability measure $\mu$.

The proof is considerably eased when $M = \mathbb{R}$, which we will suppose. For the general case, see Billingsley's book Convergence of Probability Measures. Notice that in particular, if $(\mu_n, n \ge 0)$ is a sequence of probability measures on a compact space, then there exists a subsequence $\mu_{n_k}$ weakly converging to some $\mu$.

Proof. Let $F_n$ be the distribution function of $\mu_n$. Then it is easy, by a diagonal extraction argument, to find an extraction $(n_k, k \ge 0)$ and a non-decreasing function $F : \mathbb{Q} \to [0,1]$ such that $F_{n_k}(r) \to F(r)$ as $k \to \infty$ for every rational $r$. The function $F$ is extended to $\mathbb{R}$ as a càdlàg non-decreasing function by the formula $F(x) = \lim_{r \downarrow x,\, r \in \mathbb{Q}} F(r)$. It is then elementary, by a monotonicity argument, to show that $F_{n_k}(x) \to F(x)$ for every $x$ which is a continuity point of $F$.

To conclude, we must check that $F$ is the distribution function of some measure $\mu$. But the tightness shows that for every $\varepsilon > 0$, there exists $A > 0$ such that $F_n(A) \ge 1 - \varepsilon$ and $F_n(-A) \le \varepsilon$ for every $n$. By further choosing $A$ so that $F$ is continuous at $A$ and $-A$, we see that $F(A) \ge 1 - \varepsilon$ and $F(-A) \le \varepsilon$, whence $F$ has limits $0$ and $1$ at $-\infty$ and $+\infty$. By a standard corollary of Carathéodory's theorem, there exists a probability measure $\mu$ having $F$ as its distribution function. $\square$

Remark. The fact that $F_n$ converges up to extraction to a function $F$, which need not be a probability distribution function unless the tightness hypothesis is verified, is a particular case of Helly's theorem, and says that, up to extraction, a family of probability laws $\mu_n$ converges vaguely to a possibly defective measure $\mu$ (i.e. of mass $\le 1$), i.e. $\mu_n(f) \to \mu(f)$


for every continuous $f$ with compact support. The problem that could appear is that some of the mass of the $\mu_n$'s could go to infinity: for example, $\delta_n$ converges vaguely to the zero measure as $n \to \infty$, and does not converge weakly. This phenomenon of mass going away is exactly what Prokhorov's theorem prevents from happening.

In many situations, showing that a sequence of random variables $X_n$ converges in distribution to a limiting $X$ with law $\mu$ is done in two steps. One first shows that the sequence $(\mu_n, n \ge 0)$ of laws of the $X_n$ forms a tight sequence. Then, one shows that the limit of $\mu_n$ along any subsequence cannot be other than $\mu$. This will be illustrated in the next section.
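The escaping-mass example $\mu_n = \delta_n$ is worth making explicit; the sketch below checks that $\delta_n(f) \to 0$ for a compactly supported $f$, while the constant function $1$ (bounded and continuous, but not compactly supported) detects that no mass survives in the limit:

```python
def f_compact(x):
    # continuous "tent" function supported on [-1, 1]
    return max(0.0, 1.0 - abs(x))

# delta_n(f) = f(n) -> 0: the vague limit of delta_n is the zero measure
assert all(f_compact(n) == 0.0 for n in range(1, 6))

# but delta_n(1) = 1 for every n, so delta_n does not converge weakly
# to the zero measure: mass 1 has escaped to infinity
g = lambda x: 1.0
assert all(g(n) == 1.0 for n in range(1, 6))
```

Tightness rules out exactly this behaviour: it confines all but $\varepsilon$ of the mass to one compact set, uniformly in $n$.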

5.4 Lévy's convergence theorem

In this section, we let $d$ be a positive integer and consider only random variables with values in the state space $\mathbb{R}^d$. Recall that the characteristic function of an $\mathbb{R}^d$-valued random variable $X$ is the function $\Phi_X : \mathbb{R}^d \to \mathbb{C}$ defined by
$$\Phi_X(\xi) = E[\exp(i \langle \xi, X \rangle)].$$
It is a continuous function on $\mathbb{R}^d$, such that $\Phi_X(0) = 1$. Moreover, it induces an injective mapping from (distributions of) random variables to complex-valued functions defined on $\mathbb{R}^d$, in the sense that two random variables with distinct distributions have distinct characteristic functions. The following theorem is extremely useful in practice.

Theorem 5.4.1 (Lévy's convergence theorem) Let $(X_n, n \ge 0)$ be a sequence of random variables.
(i) If $X_n$ converges in distribution to a random variable $X$, then $\Phi_{X_n}(\xi)$ converges to $\Phi_X(\xi)$ for every $\xi \in \mathbb{R}^d$.
(ii) If $\Phi_{X_n}(\xi)$ converges to $\Phi(\xi)$ for every $\xi \in \mathbb{R}^d$, where $\Phi$ is a function which is continuous at $0$, then $\Phi$ is a characteristic function, i.e. there exists a random variable $X$ such that $\Phi = \Phi_X$, and moreover, $X_n$ converges in distribution to $X$.

Corollary 5.4.1 If $(X_n, n \ge 0)$, $X$ are random variables in $\mathbb{R}^d$, then $X_n$ converges in distribution to $X$ as $n \to \infty$ if and only if $\Phi_{X_n}$ converges to $\Phi_X$ pointwise.

The proof of (i) in Lévy's theorem is immediate since the function $x \mapsto \exp(i \langle \xi, x \rangle)$ is continuous and bounded from $\mathbb{R}^d$ to $\mathbb{C}$. For the proof of (ii), we will need to show that the hypotheses imply the tightness of the sequence of laws of the $X_n$, $n \ge 0$. To this end, the following bound is very useful.

Lemma 5.4.1 Let $X$ be a random variable with values in $\mathbb{R}^d$. Then for any norm $\|\cdot\|$ on $\mathbb{R}^d$, there exists a constant $C > 0$ (depending on $d$ and on the choice of the norm) such that
$$P(\|X\| \ge K) \le C K^d \int_{[-K^{-1}, K^{-1}]^d} (1 - \Phi_X(u))\,du.$$


Proof. Let $\mu$ be the distribution of $X$. Using Fubini's theorem and a simple recursion, it is easy to check that
$$\frac{1}{(2\lambda)^d} \int_{[-\lambda, \lambda]^d} (1 - \Phi_X(u))\,du = \int_{\mathbb{R}^d} \mu(dx) \left(1 - \prod_{i=1}^d \frac{\sin(\lambda x_i)}{\lambda x_i}\right).$$
Now, the continuous function $\mathrm{sinc} : t \in \mathbb{R} \mapsto t^{-1} \sin t$ (with value $1$ at $0$) is such that there exists $0 < c < 1$ with $|\mathrm{sinc}\, t| \le c$ for every $|t| \ge 1$, so that $f : u \in \mathbb{R}^d \mapsto \prod_{i=1}^d \sin(u_i)/u_i$ satisfies $|f(u)| \le c$ as soon as $\|u\|_\infty \ge 1$. Therefore, $1 - f$ is a non-negative continuous function which is $\ge 1 - c$ when $\|u\|_\infty \ge 1$. Letting $C = 2^d (1-c)^{-1}$ entails that $C(1 - f(u)) \ge 1_{\{\|u\|_\infty \ge 1\}}$. Putting things together, one gets the result for the norm $\|\cdot\|_\infty$, and the general result follows from the equivalence of norms in finite-dimensional vector spaces. $\square$

Proof of Lévy's theorem. Suppose $\Phi_{X_n}$ converges pointwise to a limit $\Phi$ that is continuous at $0$. Then, $|1 - \Phi_{X_n}|$ being bounded above by $2$, fixing $\varepsilon > 0$, the dominated convergence theorem shows that for any $K > 0$,
$$\lim_{n \to \infty} K^d \int_{[-K^{-1}, K^{-1}]^d} (1 - \Phi_{X_n}(u))\,du = K^d \int_{[-K^{-1}, K^{-1}]^d} (1 - \Phi(u))\,du.$$
By taking $K$ large enough, we can make this limiting value $< \varepsilon/(2C)$, where $C$ is the constant of the lemma, because $\Phi$ is continuous at $0$; it follows by the lemma that for every $n$ large enough, $P(\|X_n\| \ge K) \le \varepsilon$. Up to increasing $K$, this then holds for every $n$, showing tightness of the family of laws of the $X_n$.

Therefore, up to extracting a subsequence, we see from Prokhorov's theorem that $X_n$ converges in distribution to a limiting $X$, so that $\Phi_{X_n}$ converges pointwise to $\Phi_X$ along this subsequence (by part (i)). This is possible only if $\Phi_X = \Phi$, showing that $\Phi$ is a characteristic function. Moreover, this shows that the law of $X$ is the only possible probability measure which is the weak limit of the laws of the $X_n$ along some subsequence, so $X_n$ must converge to $X$ in distribution. More precisely, if $X_n$ did not converge in distribution to $X$, we could find a continuous bounded $f$, some $\varepsilon > 0$ and a subsequence $X_{n_k}$ such that for all $k$,
$$|E[f(X_{n_k})] - E[f(X)]| > \varepsilon. \qquad (5.2)$$
But since the laws of $(X_{n_k}, k \ge 0)$ are tight, we could find a further subsequence along which $X_{n_k}$ converges in distribution to some $X'$, which by (i) would satisfy $\Phi_{X'} = \Phi = \Phi_X$ and thus have the same distribution as $X$, contradicting (5.2). $\square$
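Lévy's theorem is the standard route to the central limit theorem, and in the simplest case the computation is fully explicit: for Rademacher steps $P(X = \pm 1) = 1/2$, the characteristic function of $n^{-1/2}(X_1 + \ldots + X_n)$ is $\cos(u/\sqrt{n})^n$, which converges pointwise to $e^{-u^2/2}$, the characteristic function of $N(0,1)$. A numerical sketch (function name ours):

```python
import math

def phi_normalized_sum(u, n):
    # E[exp(iu X / sqrt(n))] = cos(u / sqrt(n)) for one Rademacher step,
    # and independence turns the characteristic function of the sum
    # into an n-th power
    return math.cos(u / math.sqrt(n)) ** n

for u in (0.5, 1.0, 2.0):
    limit = math.exp(-u * u / 2.0)   # characteristic function of N(0, 1)
    assert abs(phi_normalized_sum(u, 10 ** 6) - limit) < 1e-4
```

The limit $e^{-u^2/2}$ is continuous at $0$, so part (ii) of the theorem applies and identifies the limiting law as Gaussian.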

Chapter 6 Introduction to Brownian motion


6.1 History up to Wiener's theorem

This chapter is devoted to the construction and some properties of one of probability theory's most fundamental objects. Brownian motion earned its name after R. Brown, who observed around 1827 that tiny particles of pollen in water have an extremely erratic motion. It was observed by physicists that this was due to an important number of random shocks undertaken by the particles from the (much smaller) water molecules in motion in the liquid. A. Einstein established in 1905 the first mathematical basis of Brownian motion, by showing that it must be an isotropic Gaussian process. The first rigorous mathematical construction of Brownian motion is due to N. Wiener in 1923, using Fourier theory.

In order to motivate the introduction of this object, we first begin with a microscopic depiction of Brownian motion. Suppose $(X_n, n \ge 1)$ is a sequence of i.i.d. $\mathbb{R}^d$-valued random variables with mean $0$ and covariance matrix $\sigma^2\,\mathrm{Id}$, where $\mathrm{Id}$ is the identity matrix in $d$ dimensions, for some $\sigma^2 > 0$. Namely, if $X_1 = (X_1^1, \ldots, X_1^d)$,
$$E[X_1^i] = 0, \qquad E[X_1^i X_1^j] = \sigma^2 \delta_{ij}, \qquad 1 \le i, j \le d.$$
We interpret $X_n$ as the spatial displacement resulting from the shocks due to water molecules during the $n$-th time interval, and the fact that the covariance matrix is scalar stands for an isotropy assumption (no direction of space is privileged). From this, we let $S_n = X_1 + \ldots + X_n$, and we embed this discrete-time process into continuous time by letting
$$B_t^{(n)} = n^{-1/2} S_{[nt]}, \qquad t \ge 0.$$
Let $|\cdot|$ be the Euclidean norm on $\mathbb{R}^d$ and for $t > 0$ and $x \in \mathbb{R}^d$, define
$$p_t(x) = \frac{1}{(2\pi t)^{d/2}} \exp\left(-\frac{|x|^2}{2t}\right),$$
which is the density of the Gaussian distribution $N(0, t\,\mathrm{Id})$ with mean $0$ and covariance matrix $t\,\mathrm{Id}$. By convention, the Gaussian law $N(m, 0)$ is the Dirac mass at $m$.
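The rescaled walk $B_t^{(n)} = n^{-1/2} S_{[nt]}$ is easy to simulate; the sketch below (Rademacher steps with $\sigma = 1$ for concreteness, sizes our own choices) checks that $B_t^{(n)}$ is centred with variance close to $\sigma^2 t$:

```python
import math
import random
import statistics

random.seed(1)

def B_n(t, n):
    # B^(n)_t = n^{-1/2} (X_1 + ... + X_[nt]) with steps X_j = +-1
    k = int(n * t)
    return sum(random.choice((-1.0, 1.0)) for _ in range(k)) / math.sqrt(n)

t, n, trials = 0.7, 500, 5000
samples = [B_n(t, n) for _ in range(trials)]
assert abs(statistics.mean(samples)) < 0.06            # mean 0
assert abs(statistics.pvariance(samples) - t) < 0.06   # variance ~ sigma^2 t
```

The exact variance is $[nt]/n$, which converges to $t$: this is the covariance structure the limiting process must have.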


Proposition 6.1.1 Let $0 \le t_1 < t_2 < \ldots < t_k$. Then the finite marginal distributions of $B^{(n)}$ with respect to the times $t_1, \ldots, t_k$ converge weakly as $n \to \infty$. More precisely, if $F$ is a bounded continuous function, and letting $x_0 = 0$, $t_0 = 0$,
$$E\left[F\big(B_{t_1}^{(n)}, \ldots, B_{t_k}^{(n)}\big)\right] \xrightarrow[n \to \infty]{} \int_{(\mathbb{R}^d)^k} F(x_1, \ldots, x_k) \prod_{1 \le i \le k} p_{\sigma^2(t_i - t_{i-1})}(x_i - x_{i-1})\,dx_i.$$
Otherwise said, $(B_{t_1}^{(n)}, \ldots, B_{t_k}^{(n)})$ converges in distribution to $(G_1, G_2, \ldots, G_k)$, a random vector whose law is characterized by the fact that $(G_1, G_2 - G_1, \ldots, G_k - G_{k-1})$ are independent centered Gaussian random variables with respective covariance matrices $\sigma^2 (t_i - t_{i-1})\,\mathrm{Id}$.

Proof. With the notations of the statement, we first check that $(B_{t_1}^{(n)}, B_{t_2}^{(n)} - B_{t_1}^{(n)}, \ldots, B_{t_k}^{(n)} - B_{t_{k-1}}^{(n)})$ is a sequence of independent random variables. Indeed, one has, for $1 \le i \le k$,
$$B_{t_i}^{(n)} - B_{t_{i-1}}^{(n)} = \frac{1}{\sqrt{n}} \sum_{j = [nt_{i-1}]+1}^{[nt_i]} X_j,$$
and the independence follows from the fact that $(X_j, j \ge 1)$ is an i.i.d. family. Even better, we have the identity in distribution for the $i$-th increment
$$B_{t_i}^{(n)} - B_{t_{i-1}}^{(n)} \overset{d}{=} \sqrt{\frac{[nt_i] - [nt_{i-1}]}{n}}\, \cdot\, \frac{1}{\sqrt{[nt_i] - [nt_{i-1}]}} \sum_{j=1}^{[nt_i] - [nt_{i-1}]} X_j,$$
and the central limit theorem shows that this converges in distribution to the Gaussian law $N(0, \sigma^2 (t_i - t_{i-1})\,\mathrm{Id})$. Summing up our study, and introducing characteristic functions, we have shown that for every $\xi = (\xi_j, 1 \le j \le k)$,
$$E\left[\exp\Big(i \sum_{j=1}^k \big\langle \xi_j, B_{t_j}^{(n)} - B_{t_{j-1}}^{(n)} \big\rangle\Big)\right] = \prod_{j=1}^k E\left[\exp\big(i \big\langle \xi_j, B_{t_j}^{(n)} - B_{t_{j-1}}^{(n)} \big\rangle\big)\right] \xrightarrow[n \to \infty]{} \prod_{j=1}^k E\big[\exp\big(i \langle \xi_j, G_j - G_{j-1} \rangle\big)\big] = E\left[\exp\Big(i \sum_{j=1}^k \langle \xi_j, G_j - G_{j-1} \rangle\Big)\right],$$
where $(G_1, \ldots, G_k)$ is distributed as in the statement of the proposition (and $G_0 = 0$). By Lévy's convergence theorem, we deduce that the increments of $B^{(n)}$ between the times $t_i$ converge in distribution to the increments of the sequence $(G_i)$, which is easily seen to be equivalent to the statement. $\square$

This gives the clue that $B^{(n)}$ should converge to a process $B$ whose increments are independent and Gaussian, with covariances dictated by the above formula. This will be set in a rigorous way at the end of this section, with Donsker's invariance theorem.


Definition 6.1.1 An $\mathbb{R}^d$-valued stochastic process $(B_t, t \ge 0)$ is called a standard Brownian motion if it is a continuous process that satisfies the following conditions:
(i) $B_0 = 0$ a.s.,
(ii) for every $0 = t_0 \le t_1 \le t_2 \le \ldots \le t_k$, the increments $(B_{t_1} - B_{t_0}, B_{t_2} - B_{t_1}, \ldots, B_{t_k} - B_{t_{k-1}})$ are independent, and
(iii) for every $t, s \ge 0$, the law of $B_{t+s} - B_t$ is Gaussian with mean $0$ and covariance $s\,\mathrm{Id}$.

The term "standard" refers to the fact that $B_1$ is normalized to have covariance $\mathrm{Id}$, and to the choice $B_0 = 0$. The characteristic properties (i), (ii), (iii) exactly amount to saying that the finite-dimensional marginals of a Brownian motion are given by the formula of Proposition 6.1.1. Therefore the law of Brownian motion is uniquely determined. We now show Wiener's theorem: Brownian motion exists!

Theorem 6.1.1 (Wiener) There exists a Brownian motion on some probability space.

Proof. We will first prove the theorem in dimension $d = 1$ and construct a process $(B_t, 0 \le t \le 1)$ satisfying the properties of a Brownian motion. Let $D_0 = \{0, 1\}$, $D_n = \{k 2^{-n}, 0 \le k \le 2^n\}$ for $n \ge 1$, and $D = \bigcup_{n \ge 0} D_n$ be the set of dyadic rational numbers in $[0,1]$. On some probability space $(\Omega, \mathcal{F}, P)$, let $(Z_d, d \in D)$ be a collection of independent random variables, all having the Gaussian distribution $N(0,1)$ with mean $0$ and variance $1$. We are first going to construct the process $(B_d, d \in D)$ so that $B_d$ is a linear combination of the $Z_d$'s for every $d$. It is a well-known and important fact that if random variables $X_1, X_2, \ldots$ are linear combinations of independent centered Gaussian random variables, then $X_1, X_2, \ldots$ are independent if and only if they are pairwise uncorrelated, namely $\mathrm{Cov}(X_i, X_j) = E[X_i X_j] = 0$ for every $i \ne j$.

We set $B_0 = 0$ and $B_1 = Z_1$. Inductively, given $(B_d, d \in D_{n-1})$, we build $(B_d, d \in D_n)$ in such a way that
- $(B_d, d \in D_n)$ satisfies (i), (ii), (iii) in the definition of Brownian motion (where the instants under consideration are taken in $D_n$),
- the random variables $(Z_d, d \in D \setminus D_n)$ are independent of $(B_d, d \in D_n)$.

To this end, take $d \in D_n \setminus D_{n-1}$, and let $d_- = d - 2^{-n}$ and $d_+ = d + 2^{-n}$, so that $d_-, d_+$ are consecutive dyadic numbers in $D_{n-1}$. Then write
$$B_d = \frac{B_{d_-} + B_{d_+}}{2} + \frac{Z_d}{2^{(n+1)/2}}.$$
Then $B_d - B_{d_-} = (B_{d_+} - B_{d_-})/2 + Z_d/2^{(n+1)/2}$ and $B_{d_+} - B_d = (B_{d_+} - B_{d_-})/2 - Z_d/2^{(n+1)/2}$. Now notice that $N_d := (B_{d_+} - B_{d_-})/2$ and $N_d' := Z_d/2^{(n+1)/2}$ are, by the induction hypothesis, two independent centered Gaussian random variables with variance $2^{-n-1}$. From this, one deduces $\mathrm{Cov}(N_d + N_d', N_d - N_d') = \mathrm{Var}(N_d) - \mathrm{Var}(N_d') = 0$, so that


the increments $B_d - B_{d_-}$ and $B_{d_+} - B_d$ are independent with variance $2^{-n}$, as they should be. Moreover, these increments are independent of the increments $B_{d' + 2^{-n+1}} - B_{d'}$ for $d' \in D_{n-1}$, $d' \ne d_-$, and of $Z_{d''}$, $d'' \in D_n \setminus D_{n-1}$, $d'' \ne d$, so they are independent of the increments $B_{d' + 2^{-n}} - B_{d'}$ for $d' \in D_n$, $d' \notin \{d_-, d\}$. This allows the induction argument to proceed one step further. Thus, we have a process $(B_d, d \in D)$ satisfying the properties of Brownian motion at dyadic times.

Let $s \le t \in D$, and notice that for every $p > 0$, since $B_t - B_s$ has the same law as $\sqrt{t-s}\, N$, where $N$ is a standard Gaussian random variable, $E[|B_t - B_s|^p] = |t-s|^{p/2} E[|N|^p]$. Since a Gaussian random variable admits moments of all orders, it follows from Corollary 4.5.1 that $(B_d, d \in D)$ a.s. admits a continuous extension $(B_t, 0 \le t \le 1)$. Up to modifying $B$ on the exceptional event where such an extension does not exist, replacing it by the $0$ function for instance, we see that $B$ can be supposed to be continuous for every $\omega$.

We now check that $(B_t, t \in [0,1])$ thus constructed has the properties of Brownian motion. Let $0 = t_0 < t_1 < \ldots < t_k$, and let $0 = t_0^n < t_1^n < \ldots < t_k^n$ be dyadic numbers such that $t_i^n$ converges to $t_i$ as $n \to \infty$. Then by continuity, $(B_{t_1^n}, \ldots, B_{t_k^n})$ converges a.s. to $(B_{t_1}, \ldots, B_{t_k})$ as $n \to \infty$, while on the other hand, $(B_{t_j^n} - B_{t_{j-1}^n}, 1 \le j \le k)$ are independent Gaussian random variables with variances $(t_j^n - t_{j-1}^n, 1 \le j \le k)$, so it is not difficult, using Lévy's theorem, to see that this converges in distribution to independent Gaussian random variables with respective variances $t_j - t_{j-1}$, which thus is the distribution of $(B_{t_j} - B_{t_{j-1}}, 1 \le j \le k)$, as wanted.

It is now easy to construct a Brownian motion indexed by $\mathbb{R}_+$: simply take independent standard Brownian motions $(B_t^i, 0 \le t \le 1)$, $i \ge 0$, as we just constructed, and let
$$B_t = \sum_{i=0}^{[t]-1} B_1^i + B_{t - [t]}^{[t]}, \qquad t \ge 0.$$
It is easy to check that this has the wanted properties. Finally, it is straightforward to build a Brownian motion in $\mathbb{R}^d$, by taking $d$ independent copies $B^1, \ldots, B^d$ of $B$ and checking that $((B_t^1, \ldots, B_t^d), t \ge 0)$ is a Brownian motion in $\mathbb{R}^d$. $\square$

Let $W = C(\mathbb{R}_+, \mathbb{R}^d)$ be the Wiener space of continuous functions, endowed with the product $\sigma$-algebra $\mathcal{W}$ (or the Borel $\sigma$-algebra associated with the compact-open topology). Let $X_t(w) = w(t)$, $t \ge 0$, denote the canonical process ($w \in W$).

Proposition 6.1.2 (Wiener's measure) There exists a unique measure $W_0(dw)$ on $(W, \mathcal{W})$ such that $(X_t, t \ge 0)$ is a standard Brownian motion on $(W, \mathcal{W}, W_0(dw))$.

Proof. Let $(B_t, t \ge 0)$ be a standard Brownian motion defined on some probability space $(\Omega, \mathcal{F}, P)$. The distribution of $B$, i.e. the image measure of $P$ under the random variable


$B : \Omega \to W$, is a measure $W_0(dw)$ satisfying the conditions of the statement. Uniqueness is obvious because such a measure is determined by the finite-dimensional marginals of Brownian motion. $\square$

For $x \in \mathbb{R}^d$, we also let $W_x(dw)$ be the image measure of $W_0$ under $(w_t, t \ge 0) \mapsto (x + w_t, t \ge 0)$. A (continuous) process with law $W_x(dw)$ is called a Brownian motion started at $x$. We let $(\mathcal{F}_t^B, t \ge 0)$ be the natural filtration of $(B_t, t \ge 0)$.

Notice that Kolmogorov's continuity criterion shows that a standard Brownian motion is also a.s. locally Hölder continuous of any exponent $\alpha < 1/2$, since it is for every $\alpha < 1/2 - 1/p$ for some integer $p$.
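The inductive construction in the proof of Wiener's theorem is easy to implement: each new dyadic point receives the average of its two neighbours plus an independent Gaussian correction of variance $2^{-(n+1)}$. A sketch under these conventions (function name ours):

```python
import random

random.seed(7)

def dyadic_brownian(levels):
    # build (B_d, d in D_levels) by midpoint refinement: B_0 = 0, B_1 = Z_1,
    # and B_d = (B_{d-} + B_{d+}) / 2 + Z_d / 2^{(n+1)/2} for d in D_n \ D_{n-1}
    B = {0.0: 0.0, 1.0: random.gauss(0.0, 1.0)}
    for n in range(1, levels + 1):
        step = 2.0 ** (-n)
        for k in range(1, 2 ** n, 2):        # odd k: the new points of D_n
            d = k * step
            B[d] = 0.5 * (B[d - step] + B[d + step]) \
                + random.gauss(0.0, 1.0) / 2 ** ((n + 1) / 2)
    return B

B = dyadic_brownian(10)
assert len(B) == 2 ** 10 + 1   # all of D_10
assert B[0.0] == 0.0
```

Kolmogorov's criterion (Corollary 4.5.1) is what upgrades this dyadic skeleton to a continuous path on all of $[0,1]$.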

6.2 First properties

The following few basic (and fundamental) invariance properties of Brownian motion are left as an exercise.

Proposition 6.2.1 Let $B$ be a standard Brownian motion in $\mathbb{R}^d$.
1. If $U \in O(d)$ is an orthogonal matrix, then $UB = (U B_t, t \ge 0)$ is again a Brownian motion. In particular, $-B$ is a Brownian motion.
2. If $\lambda > 0$, then $(\lambda^{-1/2} B_{\lambda t}, t \ge 0)$ is a standard Brownian motion (scaling property).
3. For every $t \ge 0$, the shifted process $(B_{t+s} - B_t, s \ge 0)$ is a Brownian motion independent of $\mathcal{F}_t^B$ (simple Markov property).

We now turn to less trivial path properties of Brownian motion. We begin with

Theorem 6.2.1 (Blumenthal's 0-1 law) Let $B$ be a standard Brownian motion. The $\sigma$-algebra $\mathcal{F}_{0+}^B = \bigcap_{\varepsilon > 0} \mathcal{F}_\varepsilon^B$ is trivial, i.e. constituted of events of probability $0$ or $1$.
Proof. Let $0 < t_1 < t_2 < \ldots < t_k$ and $A \in \mathcal{F}_{0+}^B$. Then if $F$ is a continuous bounded function $(\mathbb{R}^d)^k \to \mathbb{R}$, we have, by continuity of $B$ and the dominated convergence theorem,
$$E[1_A F(B_{t_1}, \ldots, B_{t_k})] = \lim_{\varepsilon \downarrow 0} E[1_A F(B_{t_1}^{(\varepsilon)}, \ldots, B_{t_k}^{(\varepsilon)})],$$
where $B^{(\varepsilon)} = (B_{t+\varepsilon} - B_\varepsilon, t \ge 0)$. On the other hand, since $A$ is $\mathcal{F}_\varepsilon^B$-measurable for any $\varepsilon > 0$, the simple Markov property shows that this is equal to
$$P(A) \lim_{\varepsilon \downarrow 0} E[F(B_{t_1}^{(\varepsilon)}, \ldots, B_{t_k}^{(\varepsilon)})],$$
which is $P(A)\, E[F(B_{t_1}, \ldots, B_{t_k})]$, using again dominated convergence and continuity of $B$ and $F$. This entails that $\mathcal{F}_{0+}^B$ is independent of $\sigma(B_s, s \ge 0) = \mathcal{F}_\infty^B$. However, $\mathcal{F}_\infty^B$ contains $\mathcal{F}_{0+}^B$, so that the latter $\sigma$-algebra is independent of itself, and $P(A) = P(A \cap A) = P(A)^2$, entailing the result. $\square$


Proposition 6.2.2 (i) For $d = 1$ and $t \ge 0$, let $S_t = \sup_{0 \le s \le t} B_s$ and $I_t = \inf_{0 \le s \le t} B_s$ (these are random variables because $B$ is continuous). Then almost surely, for every $\varepsilon > 0$, one has $S_\varepsilon > 0$ and $I_\varepsilon < 0$. In particular, there exists a zero of $B$ in any interval of the form $(0, \varepsilon)$, $\varepsilon > 0$.
(ii) A.s.,
$$\sup_{t \ge 0} B_t = +\infty, \qquad \inf_{t \ge 0} B_t = -\infty.$$
(iii) Let $C$ be an open cone in $\mathbb{R}^d$ with non-empty interior and origin at $0$, i.e. a set of the form $\{tu : t > 0, u \in A\}$, where $A$ is a non-empty open subset of the unit sphere of $\mathbb{R}^d$. If $H_C = \inf\{t > 0 : B_t \in C\}$ is the first hitting time of $C$, then $H_C = 0$ a.s.

Proof. (i) The probability that $B_t > 0$ is $1/2$ for every $t$, so $P(S_t > 0) \ge 1/2$, and therefore if $t_n$, $n \ge 0$, is any sequence decreasing to $0$, $P(\limsup_n \{B_{t_n} > 0\}) \ge \limsup_n P(B_{t_n} > 0) = 1/2$. Since the event $\limsup_n \{B_{t_n} > 0\}$ is in $\mathcal{F}_{0+}^B$, Blumenthal's law shows that its probability must be $1$. The same is true for the infimum by considering the Brownian motion $-B$.
(ii) Let $S_\infty = \sup_{t \ge 0} B_t$. By scaling invariance, for every $\lambda > 0$, $S_\infty = \sup_{t \ge 0} B_t$ has the same law as $\sup_{t \ge 0} \lambda^{-1/2} B_{\lambda t} = \lambda^{-1/2} S_\infty$. This is possible only if $S_\infty \in \{0, \infty\}$ a.s.; however, it cannot be $0$ by (i).
(iii) The cone $C$ is invariant under multiplication by a positive scalar, so that $P(B_t \in C)$ is the same as $P(B_1 \in C)$ for every $t$, by the scaling invariance of Brownian motion. Now, if $C$ has nonempty interior, it is straightforward to check that $P(B_1 \in C) > 0$, and one concludes similarly as above. Details are left to the reader. $\square$

6.3 The strong Markov property

We now want to prove an important analogue of the simple Markov property, where deterministic times are replaced by stopping times. To begin with, we extend a little the definition of Brownian motion, by allowing it to start from a random location, and by working with filtrations that are larger than the natural filtration of standard Brownian motions.

We say that $B$ is a Brownian motion (started at $B_0$) if $(B_t - B_0, t \ge 0)$ is a standard Brownian motion which is independent of $B_0$. Otherwise said, it is the same as the definition of a standard Brownian motion, except that we do not require that $B_0 = 0$. If we want to express this on the Wiener space with the Wiener measure, we have, for every measurable functional $F : W \to \mathbb{R}_+$, $E[F(B_t, t \ge 0)] = E[F(B_t - B_0 + B_0, t \ge 0)]$, and since $(B_t - B_0, t \ge 0)$ has law $W_0$, this is
$$\int_{\mathbb{R}^d} P(B_0 \in dx) \int_W W_0(dw)\, F(x + w(t), t \ge 0) = \int_{\mathbb{R}^d} P(B_0 \in dx)\, W_x(F) = E[W_{B_0}(F)],$$


where, as above, $W_x$ is the image of $W_0$ under the translation $w \mapsto x + w$, and $W_{B_0}(F)$ is the random variable $W_{B_0(\omega)}(F)$. Using Proposition 1.3.4 actually shows that $E[F(B) \mid B_0] = W_{B_0}(F)$.

Let $(\mathcal{F}_t, t \ge 0)$ be a filtration. We say that a Brownian motion $B$ is an $(\mathcal{F}_t)$-Brownian motion if $B$ is adapted to $(\mathcal{F}_t)$, and if $B^{(t)} = (B_{t+s} - B_t, s \ge 0)$ is independent of $\mathcal{F}_t$ for every $t \ge 0$. For instance, if $(\mathcal{F}_t)$ is the natural filtration of a 2-dimensional Brownian motion $(B_t^1, B_t^2, t \ge 0)$, then $(B_t^1, t \ge 0)$ is an $(\mathcal{F}_t)$-Brownian motion. If $B'$ is a standard Brownian motion and $X$ is a random variable independent of $B'$, then $B = (X + B_t', t \ge 0)$ is a Brownian motion (started at $B_0 = X$), and it is an $(\mathcal{F}_t^B) = (\sigma(X) \vee \mathcal{F}_t^{B'})$-Brownian motion. A Brownian motion is always an $(\mathcal{F}_t^B)$-Brownian motion. If $B$ is a standard Brownian motion, then the completed filtration $\mathcal{F}_t = \mathcal{F}_t^B \vee \mathcal{N}$ ($\mathcal{N}$ being the set of events of probability $0$) can be shown to be right-continuous, i.e. $\mathcal{F}_{t+} = \mathcal{F}_t$ for every $t \ge 0$, and $B$ is an $(\mathcal{F}_t)$-Brownian motion.

Let $(B_t, t \ge 0)$ be an $(\mathcal{F}_t)$-Brownian motion in $\mathbb{R}^d$ and $T$ be an $(\mathcal{F}_t)$-stopping time. We let $B_t^{(T)} = B_{T+t} - B_T$ for every $t \ge 0$ on the event $\{T < \infty\}$, and $0$ otherwise. Then

Theorem 6.3.1 (Strong Markov property) Conditionally on $\{T < \infty\}$, the process $B^{(T)}$ is a standard Brownian motion, which is independent of $\mathcal{F}_T$. Otherwise said, conditionally given $\mathcal{F}_T$ and $\{T < \infty\}$, the process $(B_{T+t}, t \ge 0)$ is an $(\mathcal{F}_{T+t})$-Brownian motion started at $B_T$.

Proof. Suppose first that $T < \infty$ a.s. Let $A \in \mathcal{F}_T$, and consider times $t_1 < t_2 < \ldots < t_k$. We want to show that for every bounded continuous function $F$ on $(\mathbb{R}^d)^k$,
$$E[1_A F(B_{t_1}^{(T)}, \ldots, B_{t_k}^{(T)})] = P(A)\, E[F(B_{t_1}, \ldots, B_{t_k})]. \qquad (6.1)$$
Indeed, taking $A = \Omega$ entails that $B^{(T)}$ is a Brownian motion, while letting $A$ vary in $\mathcal{F}_T$ entails the independence of $(B_{t_1}^{(T)}, \ldots, B_{t_k}^{(T)})$ and $\mathcal{F}_T$ for every $t_1, \ldots, t_k$, hence of $B^{(T)}$ and $\mathcal{F}_T$.

Now, suppose first that $T$ takes its values in a countable subset $E$ of $\mathbb{R}_+$. Then
$$E[1_A F(B_{t_1}^{(T)}, \ldots, B_{t_k}^{(T)})] = \sum_{s \in E} E[1_{A \cap \{T = s\}} F(B_{t_1}^{(s)}, \ldots, B_{t_k}^{(s)})] = \sum_{s \in E} P(A \cap \{T = s\})\, E[F(B_{t_1}, \ldots, B_{t_k})],$$
where we used the simple Markov property and the fact that $A \cap \{T = s\} \in \mathcal{F}_s$ by definition.

Back to the general case, we can apply this result to the stopping time $T_n = 2^{-n} \lceil 2^n T \rceil$. Since $T_n \ge T$, it holds that $\mathcal{F}_T \subseteq \mathcal{F}_{T_n}$, so that we obtain, for $A \in \mathcal{F}_T$,
$$E[1_A F(B_{t_1}^{(T_n)}, \ldots, B_{t_k}^{(T_n)})] = P(A)\, E[F(B_{t_1}, \ldots, B_{t_k})]. \qquad (6.2)$$
Now, by a.s. continuity of $B$, it holds that $B_t^{(T_n)}$ converges a.s. to $B_t^{(T)}$ as $n \to \infty$, for every $t \ge 0$. Since $F$ is bounded, the dominated convergence theorem allows us to pass to the limit in (6.2), obtaining (6.1).

Finally, if $P(T = \infty) > 0$, check that (6.1) remains true when replacing $A$ by $A \cap \{T < \infty\}$, and divide by $P(\{T < \infty\})$. $\square$


An important example of application of the strong Markov property is the so-called reection principle. Recall that St = sup0st Bs . Theorem 6.3.2 (Reection principle) Let (Bt , t 0) be an (Ft )-Brownian motion started at 0, and T be an (Ft )-stopping time. Then, the process Bt = Bt 1{tT } + (2BT Bt )1{t>T } , is also an (Ft )-Brownian motion started at 0. Proof. By the strong Markov property, the processes (Bt , 0 t T ) and B (T ) are independent. Moreover, B (T ) is a standard Brownian motion, and hence has same law as B (T ) . Therefore, the pair ((Bt , 0 t T ), B (T ) ) has same law as ((Bt , 0 t T ), B (T ) ). On the other hand, the trajectory B is a measurable G((Bt , 0 t T ), B (T ) ), where G(X, Y ) is the concatenation of the paths X, Y . The conclusion follows from the fact that G((Bt , 0 t T ), B (T ) ) = B. Corollary 6.3.1 (Sometimes also called the reection principle) Let 0 < b and a b, then for every t 0, P (St b, Bt a) = P (Bt 2b a). Proof. Let Tx = inf{t 0 : Bt x} be the rst entrance time of Bt in [x, ) for x > 0. Then Tx is an (FtB )-stopping time for every x by (i), Proposition 4.1.1. Notice that Tx < a.s. since S = a.s., where S = limt St . Now by continuity of B, BTx = x for every x. By the reection principle applied to T = Tb , we obtain (with the denition B given in the statement of the reection principle) P (St b, Bt a) = P (Tb t, 2b Bt 2b a) = P (Tb t, Bt 2b a), since 2b Bt = Bt as soon as t Tb . On the other hand, the event {Bt 2b a} is contained in {Tb t} since 2b a b. Therefore, we obtain P (St b, Bt a) = P (Bt 2b a), and the result follows since B is a Brownian motion. Notice also that the probability under consideration is equal to P (St > b, Bt < a) = P (Bt > 2b a), i.e. the inequalities can be strict or not. 
Indeed, for the right-hand side, this is due to the fact that the distribution of Bt is non-atomic, and for the left-hand side, this boils down to showing that for every x > 0, T̃x := inf{t ≥ 0 : Bt > x} = Tx a.s., which is a straightforward consequence of the strong Markov property at time Tx, combined with Proposition 6.2.2.

Corollary 6.3.2 The random variable St has the same law as |Bt|, for every fixed t ≥ 0. Moreover, for every x > 0, the random time Tx has the same law as (x/B1)².

Proof. As a ↑ b, the probability P(St ≥ b, Bt ≤ a) converges to P(St ≥ b, Bt < b), and this is equal to P(Bt ≥ b) by Corollary 6.3.1. Therefore,

P(St ≥ b) = P(St ≥ b, Bt < b) + P(Bt ≥ b) = 2P(Bt ≥ b) = P(|Bt| ≥ b),

because {Bt ≥ b} ⊂ {St ≥ b}, and this gives the result. We leave the computation of the distribution of Tx as an exercise.
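The identity P(St ≥ b) = P(|Bt| ≥ b) from Corollary 6.3.2 lends itself to a quick numerical sanity check. The sketch below (an illustration, not part of the course material) discretizes Brownian motion on [0,1] with Euler steps; the running maximum of the discretized path slightly underestimates S1, so the two empirical probabilities agree only up to a small discretization bias.

```python
import math
import random

def max_vs_endpoint(n_steps=300, runs=4000, b=1.0, seed=1):
    """Empirical P(S_1 >= b) and P(|B_1| >= b) for discretized Brownian
    motion; Corollary 6.3.2 predicts that the two probabilities coincide."""
    random.seed(seed)
    dt = 1.0 / n_steps
    sq = math.sqrt(dt)
    hit_max = hit_abs = 0
    for _ in range(runs):
        x = m = 0.0
        for _ in range(n_steps):
            x += sq * random.gauss(0.0, 1.0)
            m = max(m, x)
        hit_max += (m >= b)
        hit_abs += (abs(x) >= b)
    return hit_max / runs, hit_abs / runs
```

For b = 1 both values should be close to P(|B1| ≥ 1) = 2(1 − Φ(1)) ≈ 0.317.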


6.4 Some martingales associated to Brownian motion

One of the nice features of Brownian motion is that there is a tremendous number of martingales associated with it.

Proposition 6.4.1 Let (Bt, t ≥ 0) be an (Ft)-Brownian motion.
(i) If d = 1 and B0 ∈ L¹, the process (Bt, t ≥ 0) is an (Ft)-martingale.
(ii) If d = 1 and B0 ∈ L², the process (Bt² − t, t ≥ 0) is an (Ft)-martingale.
(iii) In any dimension, let u = (u1, ..., ud) ∈ C^d. If E[exp⟨u, B0⟩] < ∞, the process M = (exp(⟨u, Bt⟩ − t u²/2), t ≥ 0) is an (Ft)-martingale, where u² is a notation for Σ_{i=1}^d u_i².

Notice that in (iii), we are dealing with C-valued processes. The conditional expectation E[X | G] of a random variable X ∈ L¹(C) is defined as E[Re X | G] + i E[Im X | G], and we say that an integrable process (Xt, t ≥ 0) with values in C, adapted to a filtration (Ft), is a martingale if its real and imaginary parts are. Notice that the hypothesis on B0 in (iii) is automatically satisfied whenever u = iv with v ∈ R^d, i.e. when u is purely imaginary.

Proof. (i) If s ≤ t, E[Bt − Bs | Fs] = E[B̃_{t−s}] = 0, where B̃_u = B_{u+s} − Bs has mean 0 and is independent of Fs, by the simple Markov property. The integrability of the process is obvious by the hypothesis on B0.
(ii) Integrability is an easy exercise, using that Bt − B0 is independent of B0. We have, for s ≤ t, Bt² = (Bt − Bs)² + 2Bs(Bt − Bs) + Bs². Taking conditional expectation given Fs and using the simple Markov property gives E[Bt² | Fs] = (t − s) + Bs², hence the result.
(iii) Integrability comes from the fact that E[exp(λBt)] = exp(tλ²/2) whenever B is a standard Brownian motion, and from E[exp⟨u, Bt⟩] = E[exp⟨u, (Bt − B0) + B0⟩] = E[exp⟨u, Bt − B0⟩] E[exp⟨u, B0⟩] < ∞. For s ≤ t, Mt = exp(⟨u, Bt − Bs⟩) exp(⟨u, Bs⟩ − t u²/2). We use the Markov property again, together with the fact that E[exp⟨u, Bt − Bs⟩] = exp((t − s) u²/2); for purely imaginary u = iv this is exp(−(t − s)|v|²/2), the characteristic function of a Gaussian law with mean 0 and covariance (t − s) Id.
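As a quick illustration (not part of the original notes), one can check (ii) and (iii) numerically at a single time: both E[Bt² − t] and E[exp(uBt − tu²/2)] must equal their values at t = 0, namely 0 and 1.

```python
import math
import random

def martingale_means(u=0.7, t=1.0, runs=20000, seed=2):
    """Monte Carlo estimates of E[B_t^2 - t] and E[exp(u*B_t - t*u^2/2)]
    for a standard Brownian motion started at 0; the martingale property
    forces the values 0 and 1 respectively."""
    random.seed(seed)
    s1 = s2 = 0.0
    for _ in range(runs):
        b = math.sqrt(t) * random.gauss(0.0, 1.0)  # B_t ~ N(0, t)
        s1 += b * b - t
        s2 += math.exp(u * b - t * u * u / 2.0)
    return s1 / runs, s2 / runs
```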
From this, one can show that

Proposition 6.4.2 Let (Bt, t ≥ 0) be a standard Brownian motion and Tx = inf{t ≥ 0 : Bt = x}. Then for x, y > 0, one has

P(T_{−y} < T_x) = x/(x + y),   E[T_x ∧ T_{−y}] = xy.

Proposition 6.4.3 Let (Bt, t ≥ 0) be an (Ft)-Brownian motion. Let f : R+ × R^d → C be continuously differentiable in the variable t and twice continuously differentiable in x, and suppose that f and its derivatives of all orders are bounded. Then

M_t = f(t, Bt) − f(0, B0) − ∫₀ᵗ ( ∂f/∂s + ½ Δf )(s, Bs) ds,   t ≥ 0,

is an (Ft)-martingale, where Δ = Σ_{i=1}^d ∂²/∂x_i² is the Laplacian operator acting on the spatial coordinate of f.
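The exit formulas of Proposition 6.4.2 can be checked against a simple random walk, for which the analogous identities hold exactly (a sketch for illustration, not part of the notes): started at 0 and absorbed at −y or x, the walk hits −y first with probability x/(x+y), and the mean absorption time is xy.

```python
import random

def ruin_stats(x=3, y=2, runs=20000, seed=3):
    """Simple +/-1 random walk started at 0, absorbed at -y or x.
    Returns the empirical probability of hitting -y before x and the
    empirical mean absorption time; compare with x/(x+y) and x*y."""
    random.seed(seed)
    hits_low = 0
    total_steps = 0
    for _ in range(runs):
        pos = steps = 0
        while -y < pos < x:
            pos += 1 if random.random() < 0.5 else -1
            steps += 1
        hits_low += (pos == -y)
        total_steps += steps
    return hits_low / runs, total_steps / runs
```

With x = 3, y = 2, the targets are 3/5 = 0.6 and 3·2 = 6.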


This is the first symptom of the famous Itô formula, which says what this martingale actually is.

Proof. Integrability is trivial from the boundedness of f, as is adaptedness, since Mt is a function of (Bs, 0 ≤ s ≤ t). Let s, t ≥ 0. We estimate

E[M_{t+s} | Ft] = Mt + E[ f(t+s, B_{t+s}) − f(t, Bt) − ∫ₜ^{t+s} ( ∂f/∂u + ½ Δf )(u, Bu) du | Ft ].

On the one hand, E[f(t+s, B_{t+s}) − f(t, Bt) | Ft] = E[f(t+s, B_{t+s} − Bt + Bt) | Ft] − f(t, Bt), and since B_{t+s} − Bt is independent of Ft with law N(0, s Id), using Proposition 1.3.4, this is equal to

∫_{R^d} f(t+s, Bt + x) p(s, x) dx − f(t, Bt),   (6.3)

where p(s, x) = (2πs)^{−d/2} exp(−|x|²/(2s)) is the probability density function of N(0, s Id). On the other hand, if we let L = ∂/∂t + Δ/2,

E[ ∫ₜ^{t+s} Lf(u, Bu) du | Ft ] = E[ ∫₀ˢ Lf(u + t, B_{t+u} − Bt + Bt) du | Ft ].
This expression is of the form E[F((B^(t), Bt)) | Ft], where F is measurable and B^(t)_s = B_{t+s} − Bt, s ≥ 0, is independent of Ft by the simple Markov property, and has law W0(dw), the Wiener measure. If (Xt, t ≥ 0) is the canonical process Xt(w) = w_t, then this last expression rewrites, by Proposition 1.3.4,

E[ ∫ₜ^{t+s} Lf(u, Bu) du | Ft ] = ∫_W W0(dw) ∫₀ˢ du Lf(u + t, X_u(w) + Bt)
  = ∫₀ˢ du ∫_W W0(dw) Lf(u + t, X_u(w) + Bt)
  = ∫₀ˢ du ∫_{R^d} dx p(u, x) Lf̃(u, x + Bt),

where f̃(u, x) = f(u + t, x), and we made use of Fubini's theorem. Next, the boundedness of Lf entails that this is equal to

lim_{ε↓0} ∫_ε^s du ∫_{R^d} dx p(u, x) Lf̃(u, x + Bt).
From the expression for L, we can split this into two parts. Using an integration by parts in the time variable,

∫_ε^s du ∫_{R^d} dx p(u, x) (∂f̃/∂u)(u, x + Bt)
  = ∫_{R^d} dx p(s, x) f̃(s, x + Bt) − ∫_{R^d} dx p(ε, x) f̃(ε, x + Bt) − ∫_ε^s du ∫_{R^d} dx (∂p/∂u)(u, x) f̃(u, x + Bt).
Similarly, integrating by parts twice in the space variable yields

∫_ε^s du ∫_{R^d} dx p(u, x) (Δ/2) f̃(u, x + Bt) = ∫_ε^s du ∫_{R^d} dx (Δp/2)(u, x) f̃(u, x + Bt).
Now, p(t, x) satisfies the heat equation (∂/∂t − Δ/2) p = 0. Therefore, the integral terms cancel each other, and it remains

E[ ∫ₜ^{t+s} Lf(u, Bu) du | Ft ] = ∫_{R^d} dx p(s, x) f̃(s, x + Bt) − lim_{ε↓0} ∫_{R^d} dx p(ε, x) f̃(ε, x + Bt),

which by dominated convergence is exactly (6.3), the last limit being f̃(0, Bt) = f(t, Bt). This shows that E[M_{t+s} − Mt | Ft] = 0.
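For a concrete instance of Proposition 6.4.3 (an illustrative sketch, not from the notes), take d = 1 and f(t, x) = cos x, so that Lf = −½ cos x and, for B0 = 0, M_t = cos B_t − 1 + ½ ∫₀ᵗ cos B_s ds. The martingale property forces E[M_t] = 0, which a crude Euler/Riemann discretization can confirm.

```python
import math
import random

def ito_martingale_mean(t=1.0, n_steps=400, runs=5000, seed=4):
    """Estimate E[M_t] for f(t, x) = cos(x), B_0 = 0:
    M_t = cos(B_t) - 1 + (1/2) * int_0^t cos(B_s) ds,
    the time integral being approximated by a left Riemann sum."""
    random.seed(seed)
    dt = t / n_steps
    sq = math.sqrt(dt)
    acc = 0.0
    for _ in range(runs):
        b = 0.0
        integral = 0.0
        for _ in range(n_steps):
            integral += math.cos(b) * dt
            b += sq * random.gauss(0.0, 1.0)
        acc += math.cos(b) - 1.0 + 0.5 * integral
    return acc / runs
```

One can also check the two terms separately, since E[cos B_t] = e^{−t/2} by (iii) of Proposition 6.4.1.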

6.5 Recurrence and transience properties of Brownian motion

From this section on, we are going to introduce a bit of extra notation. We will suppose that the reference measurable space (Ω, F) on which (Bt, t ≥ 0) is defined is endowed with probability measures Px, x ∈ R^d, such that under Px, (Bt − x, t ≥ 0) is a standard Brownian motion. A possibility is to choose the Wiener space and endow it with the measures Wx, so that the canonical process (Xt, t ≥ 0) is a Brownian motion started at x under Wx. We let Ex be the expectation associated with Px. In the sequel, B(x, r) and B̄(x, r) will respectively denote the open and closed Euclidean balls with center x and radius r, in R^d for some d ≥ 1.

Theorem 6.5.1 (i) If d = 1, Brownian motion is point-recurrent, in the sense that under P0 (or any Py, y ∈ R), a.s., {t ≥ 0 : Bt = x} is unbounded for every x ∈ R.
(ii) If d = 2, Brownian motion is neighborhood-recurrent, in the sense that under Px, a.s., {t ≥ 0 : |Bt − y| ≤ ε} is unbounded for every y ∈ R², ε > 0. However, points are polar, in the sense that for every x ∈ R²,

P0(Hx = ∞) = 1,

where Hx = inf{t > 0 : Bt = x} is the hitting time of x.
(iii) If d ≥ 3, Brownian motion is transient, in the sense that a.s. under P0, |Bt| → ∞ as t → ∞.

Proof. (i) is a consequence of (ii) in Proposition 6.2.2. For (ii), let 0 < ε < R be real numbers and f be a C^∞ function, bounded with all its derivatives, that coincides with x ↦ log |x| on D_{ε,R} = {x ∈ R² : ε ≤ |x| ≤ R}.


Then one can check that Δf = 0 on the interior of D_{ε,R}, and therefore, if we let S = inf{t ≥ 0 : |Bt| = ε} and T = inf{t ≥ 0 : |Bt| = R}, then S, T and H = S ∧ T are stopping times, and by Proposition 6.4.3 the stopped process (log |B_{t∧H}|, t ≥ 0) is a (bounded) martingale. If ε < |x| < R, we thus obtain that Ex[log |B_H|] = log |x|. Since H ≤ T < ∞ a.s. (Brownian motion is a.s. unbounded), and since |B_S| = ε on {S < T} and |B_T| = R on {T < S}, the left-hand side is (log ε) Px(S < T) + (log R) Px(S > T). Therefore,

Px(S < T) = (log R − log |x|) / (log R − log ε).   (6.4)

Letting ε → 0 shows that the probability of hitting 0 before hitting the boundary of the ball with radius R is 0, and therefore, letting R → ∞, the probability of ever hitting 0 (starting from x ≠ 0) is 0. The announced result (for x ≠ 0) is then obtained by translation. We thus have that P0(Hx < ∞) = 0 for every x ≠ 0. Next, we have P0(∃t ≥ a : Bt = 0) = P0(∃s ≥ 0 : B_{s+a} − B_a + B_a = 0), and the Markov property at time a shows that

P0(∃t ≥ a : Bt = 0) = ∫_{R²} P0(B_a ∈ dy) P0(∃s ≥ 0 : B_s + y = 0) = ∫_{R²} P0(B_a ∈ dy) Py(∃s ≥ 0 : B_s = 0) = 0,

because the law of B_a under P0 is a Gaussian law that does not charge the point 0 (we have been using the notation P(X ∈ dx) for the law of the random variable X). On the other hand, letting R → ∞ first in (6.4), we get that the probability of hitting the ball with center 0 and radius ε is 1 for every ε > 0, starting from any point: Px(∃t ≥ 0 : |Bt| ≤ ε) = 1. Thus, for every n ∈ Z+, a similar application of the Markov property at time n gives

Px(∃t ≥ n : |Bt| ≤ ε) = ∫_{R²} Px(B_n ∈ dy) Py(∃t ≥ 0 : |Bt| ≤ ε) = 1.

Hence the result. For (iii): since the first three components of a Brownian motion in R^d form a Brownian motion in R³, it is clearly sufficient to treat the case d = 3. So assume d = 3. Let f be a C^∞ function, bounded with all its derivatives, that coincides with x ↦ 1/|x| on D_{ε,R}, defined as previously but for d = 3. Then Δf = 0 on the interior of D_{ε,R}, and the same argument as above shows that for x ∈ D_{ε,R}, defining S, T as above,

Px(S < T) = (|x|⁻¹ − R⁻¹) / (ε⁻¹ − R⁻¹).

This converges to ε/|x| as R → ∞, which is thus the probability of ever visiting B̄(0, ε) when starting from x (with |x| ≥ ε). Define two sequences of stopping times Sk, Tk, k ≥ 1, by S1 = inf{t ≥ 0 : |Bt| ≤ ε}, and

Tk = inf{t ≥ Sk : |Bt| ≥ 2ε},   S_{k+1} = inf{t ≥ Tk : |Bt| ≤ ε}.


If Sk is finite, then Tk is also finite, because Brownian motion is an a.s. unbounded process, so {Sk < ∞} = {Tk < ∞} up to a zero-probability event. The strong Markov property at time Tk gives

Px(S_{k+1} < ∞ | Sk < ∞) = Px(S_{k+1} < ∞ | Tk < ∞) = Px(∃s ≥ Tk : |B_s − B_{T_k} + B_{T_k}| ≤ ε | Tk < ∞) = ∫_{R³} Px(B_{T_k} ∈ dy | Tk < ∞) Py(∃s ≥ 0 : |B_s| ≤ ε),

where Px(B_{T_k} ∈ dy | Tk < ∞) is the law of B_{T_k} under the probability measure Px(· | Tk < ∞). Since |B_{T_k}| = 2ε on the event {Tk < ∞}, the last probability equals ε/|y| = 1/2. Finally, we obtain by induction that Px(Sk < ∞) ≤ Px(S1 < ∞) 2^{−k+1}, and the Borel-Cantelli lemma entails that a.s., Sk = ∞ for some k. Therefore, Brownian motion in dimension 3 a.s. eventually leaves the ball of radius ε for good, and letting ε = n → ∞ along Z+ gives the result.

Remark. If B(x, ε) is the Euclidean ball with center x and radius ε, notice that the property in (ii) implies that {t ≥ 0 : Bt ∈ B(x, ε)} is unbounded for every x ∈ R² and every ε > 0, almost surely (indeed, one can cover R² by a countable union of balls of a fixed radius). In particular, the trajectory of a 2-dimensional Brownian motion is everywhere dense. On the other hand, it will a.s. never hit a fixed countable family of points (except maybe at time 0), such as the points with rational coordinates!
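The two exit formulas in the proof make the recurrence/transience dichotomy quantitative; the helpers below (an illustration, not part of the notes) evaluate them and their limits.

```python
import math

def exit_prob_2d(eps, R, r):
    """Formula (6.4): probability, started at |x| = r, of hitting the circle
    of radius eps before the circle of radius R (eps < r < R), dimension 2."""
    return (math.log(R) - math.log(r)) / (math.log(R) - math.log(eps))

def exit_prob_3d(eps, R, r):
    """The analogous probability in dimension 3."""
    return (1.0 / r - 1.0 / R) / (1.0 / eps - 1.0 / R)
```

As R → ∞, exit_prob_2d tends to 1 (neighborhood recurrence), while exit_prob_3d tends to eps/r < 1 (transience); as eps → 0 with R fixed, exit_prob_2d tends to 0 (points are polar).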

6.6 Brownian motion and the Dirichlet problem

Let D be a connected open subset of R^d, for some d ≥ 2; we will say that D is a domain. Let ∂D be the boundary of D. We denote by Δ the Laplacian on R^d. Suppose given a measurable function g : ∂D → R. A solution of the Dirichlet problem with boundary condition g on ∂D is a function u : D̄ → R of class C²(D) ∩ C(D̄), such that

Δu = 0 on D,   u|∂D = g.   (6.5)

A solution of the Dirichlet problem is the mathematical counterpart of the following physical problem: given an object made of homogeneous material, such that the temperature g(y) is imposed at the point y of its boundary, the solution u(x) of the Dirichlet problem gives the temperature at the point x of the object when equilibrium is attained. As we will see, it is possible to give a probabilistic resolution of the Dirichlet problem with the help of Brownian motion; this is essentially due to Kakutani. We let Ex be the expectation under the law Px of Brownian motion in R^d started at x. In the remainder of the section, let T = inf{t ≥ 0 : Bt ∉ D} be the first exit time from D. It is a stopping time, as it is the first entrance time in the closed set D^c. We will assume that the domain D is such that Px(T < ∞) = 1, to avoid complications. Hence B_T is a well-defined random variable. In the sequel, |·| is the Euclidean norm on R^d. The goal of this section is to prove the following.


Theorem 6.6.1 Suppose that g ∈ C(∂D, R) is bounded, and assume that D satisfies a local exterior cone condition (l.e.c.c.), i.e. for every y ∈ ∂D there exists a nonempty open convex cone C with origin at y such that C ∩ B(y, r) ⊂ D^c for some r > 0. Then the function u : x ↦ Ex[g(B_T)] is the unique bounded solution of the Dirichlet problem (6.5). In particular, if D is bounded and satisfies the l.e.c.c., then u is the unique solution of the Dirichlet problem.

We start with a uniqueness statement.

Proposition 6.6.1 Let g be a bounded function in C(∂D, R), and set u(x) = Ex[g(B_T)]. If v is a bounded solution of the Dirichlet problem, then v = u. In particular, we obtain uniqueness when D is bounded.

Notice that we make no assumption on the regularity of ∂D here, besides the fact that T < ∞ a.s.

Proof. Let v be a bounded solution of the Dirichlet problem. For every N ≥ 1, introduce the reduced set D_N = {x ∈ D : |x| < N and d(x, ∂D) > 1/N}. Notice it is an open set, which however need not be connected. We let T_N be the first exit time of D_N. By Proposition 6.4.3, the process

M_t = v_N(B_t) − v_N(B_0) − ∫₀ᵗ ½ Δv_N(B_s) ds,   t ≥ 0,

is a martingale, where v_N is a C² function that coincides with v on D_N, and which is bounded with all its partial derivatives (this may look innocent, but the fact that such a function exists is highly non-trivial; the use of such a function could be avoided by a stopped analog of Proposition 6.4.3). Moreover, the martingale stopped at T_N is M_{t∧T_N} = v(B_{t∧T_N}) − v(B_0), because Δv = 0 inside D, and it is bounded (because v is bounded), hence uniformly integrable. By optional stopping at T_N, we get that for every x ∈ D_N,

0 = Ex[M_{T_N}] = Ex[v(B_{T_N})] − v(x).   (6.6)

Now, as N → ∞, B_{T_N} converges to B_T a.s., by continuity of paths and the fact that T < ∞ a.s. Since v is bounded, we can use dominated convergence as N → ∞, and get that for every x ∈ D, v(x) = Ex[v(B_T)] = Ex[g(B_T)], hence the result.

For every x ∈ R^d and r > 0, let σ_{x,r} be the uniform probability measure on the sphere S_{x,r} = {y ∈ R^d : |y − x| = r}. It is the unique probability measure on S_{x,r} that is invariant under the isometries of S_{x,r}. We say that a locally bounded measurable function h : D → R is harmonic on D if, for every x ∈ D and every r > 0 such that the closed ball B̄(x, r) with center x and radius r is contained in D,

h(x) = ∫_{S_{x,r}} h(y) σ_{x,r}(dy).


Proposition 6.6.2 Let h be harmonic on a domain D. Then h ∈ C^∞(D, R), and Δh = 0 on D.

Proof. Let x ∈ D and δ > 0 be such that B̄(x, δ) ⊂ D. Then let φ ∈ C^∞(R+, R) be non-negative, not identically zero, with compact support contained in [0, δ). We have, for 0 < r < δ,

h(x) = ∫_{S_{0,r}} h(x + y) σ_{0,r}(dy).

Multiplying by φ(r) r^{d−1} and integrating in r gives

c h(x) = ∫_{R^d} φ(|z|) h(x + z) dz,

where c > 0 is some constant, and where we have used the fact that

∫_{R^d} f(z) dz = C ∫_{R+} r^{d−1} dr ∫_{S_{0,1}} f(ry) σ_{0,1}(dy)

for some C > 0. Therefore, c h(x) = ∫_{R^d} φ(|z − x|) h(z) dz, and by differentiation under the integral sign we easily get that h is C^∞. Next, by translation we may suppose that 0 ∈ D, and show only that Δh(0) = 0. We may apply Taylor's formula to h, obtaining, as x → 0,

h(x) = h(0) + ⟨∇h(0), x⟩ + Σ_{i=1}^d (x_i²/2) ∂²h/∂x_i²(0) + Σ_{i<j} x_i x_j ∂²h/∂x_i∂x_j(0) + o(|x|²).

Now, integration over S_{0,r} for r small enough yields

∫_{S_{0,r}} h(y) σ_{0,r}(dy) = h(0) + (C_r/2) Δh(0) + o(r²),

where C_r = ∫_{S_{0,r}} y_1² σ_{0,r}(dy), as the reader may check that all the other integrals up to the second order vanish, by symmetry. Since the left-hand side is h(0), we obtain Δh(0) = 0.

Therefore, harmonic functions are solutions of certain Dirichlet problems.

Proposition 6.6.3 Let g be a bounded measurable function on ∂D, and let T = inf{t ≥ 0 : Bt ∉ D}. Then the function h : x ∈ D ↦ Ex[g(B_T)] is harmonic on D, and hence Δh = 0 on D.

Proof. For all Borel subsets A1, ..., A_k of R^d and times t1 < ... < t_k, the map x ↦ Px(B_{t_1} ∈ A1, ..., B_{t_k} ∈ A_k) is measurable by Fubini's theorem, once one has written the explicit formula for this probability. Therefore, by the monotone class theorem, x ↦ Ex[F] is measurable for every integrable random variable F which is measurable with respect to the product σ-algebra on C(R+, R^d). Moreover, h is bounded by assumption. Now, let S = inf{t ≥ 0 : |Bt − x| ≥ r} be the first exit time of B from the ball of center x and radius r, where r is such that B̄(x, r) ⊂ D. Then by (ii), Proposition 6.2.2, S < ∞ a.s. By the strong Markov property, B̃ = (B_{S+t}, t ≥ 0) is an (F_{S+t})-Brownian motion started at B_S. Moreover, the first exit time of D for B̃ is T̃ = T − S, and B̃_{T̃} = B_T, so that

Ex[g(B_T)] = Ex[g(B̃_{T̃})] = ∫_{R^d} Px(B_S ∈ dy) Ey[g(B_T) 1_{T<∞}],

and we recognize ∫ Px(B_S ∈ dy) h(y) in the last expression. Since B starts from x under Px, the rotation invariance of Brownian motion shows that B_S − x has a distribution on the sphere of center 0 and radius r which is invariant under the orthogonal group. We conclude that the distribution of B_S is the uniform measure on the sphere of center x and radius r, and therefore that h is harmonic on D.

It remains to understand whether the function u of Theorem 6.6.1 is actually a solution of the Dirichlet problem. Indeed, it is not the case in general that u(x) has limit g(y) as x ∈ D, x → y, and the reason is that some points of ∂D may be invisible to Brownian motion. The reader can convince himself, for example, that if D = B(0, 1) \ {0} is the open ball of R² with center 0 and radius 1, whose origin has been removed, and if g = 1_{{0}}, then no solution of the Dirichlet problem with boundary condition g exists. The probabilistic reason for that is that Brownian motion does not see the boundary point 0. This is the reason why we have to make regularity assumptions on ∂D in the following theorem.

Proof of Theorem 6.6.1. It remains to prove that under the l.e.c.c., u is continuous up to the boundary, i.e. u(x) converges to g(y) as x ∈ D converges to y ∈ ∂D. In order to do that, we need a preliminary lemma. Recall that T is the first exit time of D for the Brownian path.

Lemma 6.6.1 Let D be a domain satisfying the l.e.c.c., and let y ∈ ∂D. Then for every ε > 0, Px(T < ε) → 1 as x → y, x ∈ D.

Proof. Let Cy = y + C be a nonempty open convex cone with origin at y such that for some δ > 0, Cy ∩ B(y, δ) ⊂ D^c (we leave as an exercise the case when only a neighborhood of this cone around y is contained in D^c). Then it is an elementary geometrical fact that for every δ > 0 small enough, there exist η > 0 and a nonempty open convex cone C′ with origin at 0, such that x + (C′ ∩ B(0, δ)) ⊂ Cy for every x ∈ B(y, η). Now, by (iii) in Proposition 6.2.2, if H_{C′} = inf{t > 0 : Bt ∈ C′ ∩ B(0, δ)}, then P0(H_{C′} < ε) = 1 for every ε > 0.
Since hitting x + (C′ ∩ B(0, δ)) implies hitting Cy and therefore leaving D, we obtain, after translating by x, that for every ε, α > 0, Px(T < ε) can be made ≥ 1 − α for x belonging to a sufficiently small neighborhood of y in D.

We can now finish the proof of Theorem 6.6.1. Let y ∈ ∂D. We want to estimate the quantity Ex[g(B_T)] − g(y) for x ∈ D close to y. For ε, η > 0, let

A_{ε,η} = { sup_{0≤t≤ε} |B_t − x| ≥ η/2 }.


This event decreases to ∅ as ε → 0, because B has continuous paths. Now, for any ε, η > 0,

Ex[|g(B_T) − g(y)|] ≤ Ex[|g(B_T) − g(y)| ; {T ≤ ε} ∩ A^c_{ε,η}] + Ex[|g(B_T) − g(y)| ; {T ≤ ε} ∩ A_{ε,η}] + Ex[|g(B_T) − g(y)| ; {T > ε}].

Fix α > 0. We are going to show that each of these three quantities can be made < α/3 for x close enough to y. Since g is continuous at y, for some η > 0, |y − z| < η with z ∈ ∂D implies |g(y) − g(z)| < α/3. Moreover, on the event {T ≤ ε} ∩ A^c_{ε,η}, we know that |B_T − x| < η/2, and thus |B_T − y| ≤ η as soon as |x − y| ≤ η/2. Therefore, for every ε > 0, the first quantity is less than α/3 for x ∈ B(y, η/2). Next, if M is an upper bound for |g|, the second quantity is bounded by 2M Px(A_{ε,η}); hence, by now choosing ε small enough, this is < α/3. Finally, with ε, η fixed as above, the third quantity is bounded by 2M Px(T > ε). By the previous lemma, this is < α/3 as soon as x ∈ B(y, r) ∩ D for some r > 0. Therefore, for any x ∈ B(y, r ∧ η/2) ∩ D, |u(x) − g(y)| < α. This entails the result.

Corollary 6.6.1 A function u : D → R is harmonic in D if and only if it is in C²(D, R) and satisfies Δu = 0.

Proof. One direction is Proposition 6.6.2. Conversely, let u be of class C²(D) with zero Laplacian, and let x ∈ D. Let r be such that B̄(x, r) ⊂ D, and notice that the restriction of u to B̄(x, r) is a bounded solution of the Dirichlet problem on B(x, r) with boundary values u|_{∂B(x,r)}. Since B(x, r) satisfies the l.e.c.c., this restriction is the unique such solution, which is also given by the harmonic function of Theorem 6.6.1. Therefore, u is harmonic on D.
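The probabilistic representation u(x) = Ex[g(B_T)] immediately suggests a Monte Carlo solver. The sketch below (illustrative, with an arbitrarily chosen test case) runs discretized Brownian paths from a point inside the unit disk of R² until they exit, with boundary data g(y) = y_1; since the coordinate function is harmonic, the exact solution is u(x) = x_1.

```python
import math
import random

def dirichlet_mc(x0=0.5, y0=0.0, dt=1e-3, runs=2000, seed=6):
    """Monte Carlo for u(x) = E_x[g(B_T)] on the open unit disk of R^2,
    with boundary condition g(y) = y_1. Exact answer: u(x0, y0) = x0."""
    random.seed(seed)
    sq = math.sqrt(dt)
    total = 0.0
    for _ in range(runs):
        x, y = x0, y0
        while x * x + y * y < 1.0:
            x += sq * random.gauss(0.0, 1.0)
            y += sq * random.gauss(0.0, 1.0)
        r = math.hypot(x, y)  # small overshoot: project back onto the circle
        total += x / r
    return total / runs
```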

6.7 Donsker's invariance principle

The following theorem completes the description of Brownian motion as a limit of rescaled centered random walks, as depicted at the beginning of the chapter, and strengthens the convergence of finite-dimensional marginals into convergence in distribution. We endow C([0,1], R) with the supremum norm, and recall (see the exercises on continuous-time processes) that the product σ-algebra associated with it coincides with the Borel σ-algebra associated with this norm. We say that a function F : C([0,1]) → R is continuous if it is continuous with respect to this norm.

Theorem 6.7.1 (Donsker's invariance principle) Let (Xn, n ≥ 1) be a sequence of R-valued integrable independent random variables with common law μ, such that

∫ x μ(dx) = 0   and   ∫ x² μ(dx) = σ² ∈ (0, ∞).

Let S0 = 0 and Sn = X1 + ... + Xn, and define a continuous process that interpolates linearly between the values of S, namely

S_t = (1 − {t}) S_{[t]} + {t} S_{[t]+1},   t ≥ 0,


where [t] denotes the integer part of t and {t} = t − [t]. Then S^[N] := ((σ²N)^{−1/2} S_{Nt}, 0 ≤ t ≤ 1) converges in distribution to a standard Brownian motion between times 0 and 1, i.e. for every bounded continuous function F : C([0,1]) → R,

E[F(S^[N])] → E0[F(B)]   as N → ∞.

Notice that this is much stronger than what Proposition 6.1.1 says. Despite the slight difference of framework between these two results (one uses a càdlàg continuous-time version of the random walk, and the other uses an interpolated continuous version), Donsker's invariance principle is stronger. For instance, one can infer from this theorem that the random variable N^{−1/2} sup_{0≤n≤N} Sn converges to sup_{0≤t≤1} Bt in distribution, because f ↦ sup f is a continuous operation on C([0,1], R); Proposition 6.1.1 would be powerless to address this issue.

The proof we give here is an elegant demonstration that makes use of a coupling of the random walk with the Brownian motion, called the Skorokhod embedding theorem. It is, however, specific to dimension d = 1. Suppose we are given a Brownian motion (Bt, t ≥ 0) on some probability space (Ω, F, P). Let

μ+(dx) = P(X1 ∈ dx) 1_{x≥0},   μ−(dy) = P(−X1 ∈ dy) 1_{y>0}

define two nonnegative measures. Assume that (Ω, F, P) is a rich enough probability space so that we can further define on it, independently of (Bt, t ≥ 0), a sequence of independent identically distributed R²-valued random variables ((Yn, Zn), n ≥ 1) with distribution

P((Yn, Zn) ∈ dx dy) = C (x + y) μ+(dx) μ−(dy),

where C > 0 is the appropriate normalizing constant that makes this expression a probability measure. Next, consider F0 = σ((Yn, Zn), n ≥ 1) and Ft = F0 ∨ F^B_t, so that (Ft, t ≥ 0) is a filtration such that B is an (Ft)-Brownian motion. We define a sequence of random times by T0 = 0, T1 = inf{t ≥ 0 : Bt ∈ {−Z1, Y1}}, and recursively,

Tn = inf{t ≥ T_{n−1} : Bt − B_{T_{n−1}} ∈ {−Zn, Yn}}.

By (ii) in Proposition 6.2.2, these times are a.s. finite, and they are stopping times with respect to the filtration (Ft). We claim that

Lemma 6.7.1 The sequence (B_{T_n}, n ≥ 0) has the same law as (Sn, n ≥ 0). Moreover, the inter-times (Tn − T_{n−1}, n ≥ 1) form an independent sequence of random variables with the same distribution, and expectation E[T1] = σ².

Proof.
By repeated application of the strong Markov property at the times Tn, n ≥ 1, and the fact that the (Yn, Zn), n ≥ 1, are independent with the same distribution, we obtain that the processes (B_{t+T_{n−1}} − B_{T_{n−1}}, 0 ≤ t ≤ Tn − T_{n−1}) are independent with the same distribution. The fact that the differences B_{T_n} − B_{T_{n−1}}, n ≥ 1, and Tn − T_{n−1}, n ≥ 1, form sequences of independent and identically distributed random variables follows from this observation. It therefore remains to check that B_{T_1} has the same law as X1, and that E[T1] = σ². Remember from Proposition 6.4.2 that given (Y1, Z1), the probability that B_{T_1} = Y1 is Z1/(Y1 + Z1), as


follows from the optional stopping theorem. Therefore, for every non-negative measurable function f, by first conditioning on (Y1, Z1), we get

E[f(B_{T_1})] = E[ (Z1/(Y1+Z1)) f(Y1) + (Y1/(Y1+Z1)) f(−Z1) ]
  = ∫_{R+×R+} C (x+y) μ+(dx) μ−(dy) ( (y/(x+y)) f(x) + (x/(x+y)) f(−y) )
  = C′ ∫_{R+} ( f(x) μ+(dx) + f(−x) μ−(dx) ) = C′ E[f(X1)],

for C′ = C ∫ x μ+(dx), which can only be equal to 1 (take f = 1). Here, we have used the fact that ∫ x μ+(dx) = ∫ x μ−(dx), which amounts to saying that X1 is centered. For E[T1], recall from Proposition 6.4.2 that E[inf{t ≥ 0 : Bt ∈ {−y, x}}] = xy, so by a similar conditioning argument as above,

E[T1] = ∫_{R+×R+} C (x+y) xy μ+(dx) μ−(dy) = σ²,

where we again used that C ∫ x μ+(dx) = 1.

Proof of Donsker's invariance principle. Without loss of generality we take σ² = 1 (otherwise replace Xn by Xn/σ). We suppose given a Brownian motion B. For N ≥ 1, define B^(N)_t = N^{1/2} B_{N^{−1}t}, t ≥ 0, which is a Brownian motion by scaling invariance. Perform the Skorokhod embedding construction on B^(N) to obtain times T^(N)_n, n ≥ 0, and let S^(N)_n = B^(N) evaluated at T^(N)_n. Then by Lemma 6.7.1, (S^(N)_n, n ≥ 0) is a random walk with the same law as (Sn, n ≥ 0). We interpolate linearly between integers to obtain a continuous process (S^(N)_t, t ≥ 0). Finally, let S̃^(N)_t = N^{−1/2} S^(N)_{Nt}, t ≥ 0, and T̃^(N)_n = N^{−1} T^(N)_n.

We are going to show that the supremum norm of (B_t − S̃^(N)_t, 0 ≤ t ≤ 1) converges to 0 in probability. By the law of large numbers, Tn/n converges a.s. to E[T1] = 1 as n → ∞. Thus, by a monotonicity argument, N^{−1} sup_{0≤n≤N} |Tn − n| converges to 0 a.s. as N → ∞. As a consequence, this supremum converges to 0 in probability, meaning that for every ε > 0,

P( sup_{0≤n≤N} |T̃^(N)_n − n/N| ≥ ε ) → 0.

On the other hand, for every t ∈ [n/N, (n+1)/N], there exists some u ∈ [T̃^(N)_n, T̃^(N)_{n+1}] with B_u = S̃^(N)_t, because S̃^(N)_{n/N} = B at time T̃^(N)_n for every n, and by the intermediate value theorem, S̃^(N) and B being continuous. Therefore, the event {sup_{0≤t≤1} |S̃^(N)_t − B_t| > δ} is contained in the union K_N ∪ L_N, where

K_N = { sup_{0≤n≤N} |T̃^(N)_n − n/N| > ε }

and

L_N = { ∃ t ∈ [0,1], ∃ u ∈ [t − ε − 1/N, t + ε + 1/N] : |B_t − B_u| > δ }.

We already know that P(K_N) → 0 as N → ∞. For L_N: since B is a.s. uniformly continuous on [0,1], by taking ε small enough and then N large enough, we can make P(L_N) as small as wanted. Therefore, we have shown that

P( sup_{0≤t≤1} |S̃^(N)_t − B_t| > δ ) → 0.

Therefore, (S̃^(N)_t, 0 ≤ t ≤ 1) converges in probability, for the uniform norm, to (B_t, 0 ≤ t ≤ 1), which entails convergence in distribution, by Proposition 5.2.1. This concludes the proof.
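As an illustration of the strength of the theorem (a sketch, not part of the notes), the continuous-mapping consequence mentioned above can be tested numerically: for a centered walk with unit variance, P(N^{−1/2} max_{n≤N} Sn ≤ 1) should approach P(sup_{0≤t≤1} Bt ≤ 1) = P(|B1| ≤ 1) = 2Φ(1) − 1 ≈ 0.683, by the reflection principle.

```python
import math
import random

def max_cdf_at_one(N=400, runs=5000, seed=7):
    """Empirical P(max_{0<=n<=N} S_n <= sqrt(N)) for a walk with standard
    Gaussian steps; Donsker predicts a limit close to 2*Phi(1) - 1."""
    random.seed(seed)
    count = 0
    for _ in range(runs):
        s = m = 0.0
        for _ in range(N):
            s += random.gauss(0.0, 1.0)
            m = max(m, s)
        count += (m <= math.sqrt(N))
    return count / runs
```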

Chapter 7 Poisson random measures and Poisson processes


7.1 Poisson random measures

Let (E, E) be a measurable space, and let μ be a non-negative σ-finite measure on (E, E). We denote by E* the set of σ-finite atomic measures on E, i.e. the set of σ-finite measures taking values in Z+ ∪ {∞} (in fact, we will only consider measures that can be put in the form Σ_{i∈I} δ_{x_i} with I countable and x_i ∈ E, i ∈ I). The set E* is endowed with the σ-algebra E* = σ(X_A, A ∈ E), where X_A(m) = m(A) for m ∈ E* and A ∈ E. Otherwise said, for every A ∈ E, the mapping m ↦ m(A) from E* to Z+ ∪ {∞} is measurable with respect to E*. For λ > 0, we denote by P(λ) the Poisson distribution with parameter λ, which assigns mass e^{−λ} λ^n/n! to the integer n.

Definition 7.1.1 A Poisson random measure on (E, E) with intensity μ is a random variable M with values in E* such that, if (A_k, k ≥ 1) is a sequence of disjoint sets in E with μ(A_k) < ∞ for every k,
(i) the random variables M(A_k), k ≥ 1, are independent, and
(ii) the law of M(A_k) is P(μ(A_k)), for k ≥ 1.

Notice that properties (i) and (ii) completely characterize the law of the random variable M. Indeed, events which are either empty or of the form

{m ∈ E* : m(A1) = i1, ..., m(A_k) = i_k},

with pairwise disjoint A1, ..., A_k ∈ E, μ(A_j) < ∞, 1 ≤ j ≤ k, and where (i1, ..., i_k) are integers, form a π-system that generates E*. If now M is a Poisson random measure with intensity μ, on some probability space (Ω, F, P), then

P(M(A1) = i1, ..., M(A_k) = i_k) = Π_{j=1}^k e^{−μ(A_j)} μ(A_j)^{i_j} / i_j!.

Hence the uniqueness of the law of a random measure satisfying (i), (ii). Existence is stated in the next proposition.


Proposition 7.1.1 For every σ-finite non-negative measure μ on (E, E), there exists a Poisson random measure on (E, E) with intensity μ.

Proof. Suppose first that λ = μ(E) < ∞. We let N be a Poisson random variable with parameter λ, and X1, X2, ... be independent random variables with law μ/μ(E), independent of N. Finally, we let

M(ω) = Σ_{i=1}^{N(ω)} δ_{X_i(ω)}.

Now, if N is Poisson with parameter λ and (Y_i, i ≥ 1) are independent, independent of N, with P(Y_i = j) = p_j, 1 ≤ j ≤ k, it holds that Σ_{i=1}^N 1_{Y_i=j}, 1 ≤ j ≤ k, are independent with respective laws P(λ p_j), 1 ≤ j ≤ k. It follows that M is a Poisson random measure with intensity μ: for disjoint A1, ..., A_k in E with finite μ-measure, we let Y_i = j whenever X_i ∈ A_j, defining independent random variables with P(Y_i = j) = μ(A_j)/μ(E), so that the M(A_j) = Σ_{i=1}^N 1_{Y_i=j}, 1 ≤ j ≤ k, are independent random variables with respective laws P(μ(E) μ(A_j)/μ(E)) = P(μ(A_j)).

In the general case, since μ is σ-finite, there is a partition of E into disjoint measurable sets E_k, k ≥ 1, with finite μ-measure. We can construct independent Poisson random measures M_k on E_k with intensity μ(· ∩ E_k), for k ≥ 1. We claim that

M(A) = Σ_{k≥1} M_k(A ∩ E_k),   A ∈ E,

defines a Poisson random measure with intensity μ. This is an easy consequence of the property that if Z1, Z2, ... are independent Poisson variables with respective parameters λ1, λ2, ..., then the sum Z1 + Z2 + ... is Poisson with parameter λ1 + λ2 + ... (with the convention that P(∞) is a Dirac mass at ∞).
denes a Poisson random measure with intensity . This is an easy consequence of the property that if Z1 , Z2 , . . . are independent Poisson variables with respective parameters 1 , 2 , . . ., then the sum Z1 + Z2 + . . . is Poisson with parameter 1 + 2 + . . . (with the convention that P() is a Dirac mass at ). From the construction, we obtain the following important property of Poisson random measures: Proposition 7.1.2 Let M be a Poisson random measure on E with intensity , and let A E be such that (A) < . Then M (A) has law P((A)), and given M (A) = k, the restriction M |A has same law as k Xi , where (X1 , X2 , . . . , Xk ) are independent with i=1 law ( A)/(A). Moreover, if A, B E are disjoint, then the restrictions M |A , M |B are independent. Last, any Poisson random measure can be written in the form M (dx) = iI xi (dx) where I is a countable index-set and the xi , i I are random variables.

7.2 Integrals with respect to a Poisson measure

Proposition 7.2.1 Let M be a Poisson random measure on E with intensity μ. Then for every measurable f : E → R+, the quantity

M(f) := ∫_E f(x) M(dx)

defines a random variable, and

E[exp(−M(f))] = exp( − ∫_E μ(dx) (1 − exp(−f(x))) ).
Moreover, if f : E → R is measurable and in L¹(μ), then f ∈ L¹(M) a.s., ∫_E f(x) M(dx) defines a random variable, and

E[exp(iM(f))] = exp( ∫_E μ(dx) (exp(if(x)) − 1) ).
The first formula is sometimes called the Laplace functional formula, or Campbell's formula. Notice that by replacing f by af, differentiating the formula with respect to a and letting a ↓ 0, one gets the first-moment formula

E[M(f)] = ∫_E f(x) μ(dx),

whenever f ≥ 0, or f is integrable w.r.t. μ (in this case, consider first f+, f−). Similarly,

Var M(f) = ∫_E f(x)² μ(dx)

(for this, first notice that the restrictions of M to {f ≥ 0} and {f < 0} are independent).
(for this, rst notice that the restrictions of M to {f 0} and {f < 0} are independent). Proof. Let En , n 0 be a measurable partition of E into sets with nite -measure. First assume that f = 1A for A E, (A) < . Then M (A) is a random variable by denition of M , and this extends to any A E by considering A En , n 0 and summation. Since any measurable non-negative function is the increasing limit of nite linear combinations of such indicator functions, we obtain that M (f ) is a random variable as a limit of random variables. Moreover, a similar argument shows that M (f 1En ), n 0 are independent random variables. Next, assume f 0. The number Nn of atoms of M that fall in En has law P((En )) and given Nn = k, the atoms can be supposed to be independent random variables with law ( En )/(En ). Therefore,

E[exp(M (f 1En ))] =


k=0

e(En )

(En )k k!

En

(dx) f (x) e (En )

= exp
En

(dx)(1 exp(f (x)))

From the independence of the variables M (f 1En ), we can then take products over n 0 (i.e. apply monotone convergence) and obtain the wanted formula. From this, we obtain the rst moment formula for functions f 0. If f is a measurable function from E R, applying the result to |f | shows that if f L1 (), then M (|f |) < a.s. so M (f ) is well-dened for almost every , and denes a random variable as it is equal to M (f + ) M (f ). To establish the last formula of the theorem, in the case where f L1 (), follows by the same kind of arguments: rst, we establish the formula for f 1En in place of f . Then, to obtain the result, we must show that An (dx)(eif (x) 1) converges to E (dx)(eif (x) 1), where An = E0 En . But |eif (x) 1| |f (x)|, whence the function under consideration is integrable with respect to , giving the result (| E\An g(x)(dx)| E\An |g(x)|(dx) decreases to 0 whenever g is integrable).
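Campbell's formula can also be verified numerically. In the sketch below (an illustration, not from the notes), E = [0,1], μ = λ·Lebesgue, and f(x) = x; since ∫₀¹ (1 − e^{−x}) dx = e^{−1}, the formula predicts E[exp(−M(f))] = exp(−λ/e) ≈ 0.3317 for λ = 3.

```python
import math
import random

def sample_poisson(lam, rng):
    """Poisson(lam) variate via Knuth's product-of-uniforms method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def laplace_functional(lam=3.0, runs=20000, seed=9):
    """Monte Carlo estimate of E[exp(-M(f))] for f(x) = x under a Poisson
    random measure with intensity lam*dx on [0,1]; target: exp(-lam/e)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        n = sample_poisson(lam, rng)
        total += math.exp(-sum(rng.random() for _ in range(n)))
    return total / runs
```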


CHAPTER 7. POISSON RANDOM MEASURES AND PROCESSES

7.3 Poisson point processes

We now show how Poisson random measures can be used to define certain stochastic processes. Let (E, E) be a measurable space, and consider a σ-finite measure G on (E, E). Let μ be the product measure dt ⊗ G(dx) on R₊ × E, where dt is the Lebesgue measure on (R₊, B(R₊)). Otherwise said, μ is the unique measure such that μ([0, t] × A) = tG(A) for t ≥ 0 and A ∈ E.

A Lévy process (X_t, t ≥ 0) (with values in R) is a process with independent and stationary increments, i.e. such that for every 0 = t_0 ≤ t_1 ≤ … ≤ t_k, the random variables (X_{t_i} − X_{t_{i−1}}, 1 ≤ i ≤ k) are independent, with respective laws those of X_{t_i − t_{i−1}}, 1 ≤ i ≤ k. Equivalently, X is a Lévy process if and only if X^{(t)} = (X_{t+s} − X_t, s ≥ 0) has the same law as X and is independent of F_t^X = σ(X_s, 0 ≤ s ≤ t) for every t ≥ 0 (simple Markov property).

Proposition 7.3.1 A Poisson random measure M whose intensity μ is of the above form is called a Poisson point process. If f is a measurable G-integrable function on E, then the process

N_t^f = ∫_{[0,t]×E} f(x) M(ds, dx), t ≥ 0,

is a Lévy process. Moreover, the process

M_t^f = ∫_{[0,t]×E} f(x) M(ds, dx) − t ∫_E f(x) G(dx), t ≥ 0,

is a martingale with respect to the filtration F_t = σ(M([0, s] × A), s ≤ t, A ∈ E), t ≥ 0. If moreover f ∈ L²(G), the process

(M_t^f)² − t ∫_E f(x)² G(dx), t ≥ 0,

is an (F_t)-martingale.

Proof. For s ≤ t, we have N_t^f − N_s^f = ∫_{(s,t]×E} f(x) M(du, dx). Moreover, it is easy to check that M(du, dx) 1_{u∈(s,t]} has the same law as the image of M(du, dx) 1_{u∈(0,t−s]} under the map (u, x) ↦ (s + u, x) from R₊ × E to itself, and is independent of M(du, dx) 1_{u∈[0,s]}. We obtain that N^f has stationary and independent increments. The fact that M^f is a martingale is a straightforward consequence of the first moment formula and the simple Markov property. The last statement comes from writing (M_t^f)² = (M_t^f − M_s^f + M_s^f)² and expanding, then using the variance formula and the simple Markov property.

7.3.1 Example: the Poisson process

Let X₁, X₂, … be a sequence of independent exponential random variables with parameter θ, and define 0 = T_0 ≤ T_1 ≤ … by T_n = X₁ + … + X_n. We let

N_t = Σ_{n=1}^∞ 1_{T_n ≤ t}, t ≥ 0,

be the càdlàg process that counts the number of times T_n that are ≤ t. The process (N_t, t ≥ 0) is called the (homogeneous) Poisson process with intensity θ. This is the so-called Markovian description of the Poisson process, which is a jump-hold Markov process. The following alternative description makes use of Poisson random measures. We give the statement without proof, which can be found in textbooks, or makes a good exercise (first notice that with both definitions, N is a process with stationary and independent increments).

Proposition 7.3.2 Let θ > 0, and let M be a Poisson random measure with intensity θ dt on R₊. Then the process

N_t = M([0, t]), t ≥ 0,

is a Poisson process with intensity θ. The set of atoms of the measure M itself is sometimes also called a Poisson (point) process with intensity θ.
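The equivalence of the two descriptions can be checked numerically. The sketch below (an illustration of ours, not part of the notes) builds N_t from exponential gaps, the Markovian description, and compares its empirical statistics with those of the Poisson(θt) law predicted by Proposition 7.3.2.

```python
import math
import random

def poisson_process_count(theta, t, rng):
    # Markovian description: T_n = X_1 + ... + X_n with X_i ~ Exp(theta);
    # N_t counts how many T_n fall in [0, t].
    s, n = 0.0, 0
    while True:
        s += rng.expovariate(theta)
        if s > t:
            return n
        n += 1

def empirical_stats(theta, t, reps, seed=1):
    # Empirical mean of N_t and empirical P(N_t = 0) over many runs.
    rng = random.Random(seed)
    counts = [poisson_process_count(theta, t, rng) for _ in range(reps)]
    return sum(counts) / reps, counts.count(0) / reps
```

For θ = 2 and t = 1.5 one should observe mean ≈ θt = 3 and P(N_t = 0) ≈ e^{−3}.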

7.3.2 Example: compound Poisson processes

A compound Poisson process with intensity μ is a process of the form

N_t = ∫_{[0,t]×R} x M(ds, dx), t ≥ 0,

where M is a Poisson random measure with intensity dt ⊗ μ(dx) and μ is a finite measure on R. Alternatively, if we write M in the form Σ_{i∈I} δ_{(t_i, x_i)}, for every t ≥ 0 we can write Δ_t = x_i whenever t = t_i and Δ_t = 0 otherwise. As an exercise, one can prove that this is a.s. well-defined, i.e. that a.s., for every t ≥ 0, the set {i ∈ I : t_i = t} has at most one element. With this notation, we can write

N_t = Σ_{0≤s≤t} Δ_s, t ≥ 0

(notice that there is a.s. a finite set of times s ∈ [0, t] such that Δ_s ≠ 0, so the sum is meaningful).

There is a Markov jump-hold description for these processes as well: if N is a Poisson process with parameter θ = μ(R) and jump times 0 < T₁ < T₂ < …, and if Y₁, Y₂, … is a sequence of i.i.d. random variables, independent of N and with law μ/θ, then the process

Σ_{n≥1} Y_n 1_{T_n ≤ t}, t ≥ 0,

is a compound Poisson process with intensity μ. This comes from the following marking property of Poisson measures: suppose we have a description of a Poisson random measure M(dx) with intensity μ as Σ_{i∈I} δ_{X_i}(dx), where (X_i, i ∈ I) is a countable family of random variables. If (Y_i, i ∈ I) is a family of i.i.d. random variables with law ν, and independent of M, then M′ = Σ_{i∈I} δ_{(X_i, Y_i)} is a Poisson random measure with intensity the product measure μ ⊗ ν.

We let CP(μ) be the law of N₁; it is called the compound Poisson distribution with intensity μ. It can be written in the form

CP(μ) = Σ_{n≥0} e^{−μ(R)} μ^{*n}/n!,

where μ^{*n} is the n-fold convolution of the measure μ. Recall that the convolution μ * ν of two finite measures μ, ν on R is the unique measure which is characterized by

μ * ν(A) = ∫∫ 1_A(x + y) μ(dx) ν(dy), A ∈ B_R,

and that if μ, ν are probability measures, then μ * ν is the law of the sum of two independent random variables with respective laws μ, ν. The characteristic function of CP(μ) is given by

φ_{CP(μ)}(u) = exp(−μ(R)(1 − φ_{μ/μ(R)}(u))),

where φ_{μ/μ(R)} is the characteristic function of μ/μ(R).
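The characteristic function formula for CP(μ) lends itself to a Monte Carlo check. The sketch below is ours, not from the notes: it takes μ(dx) = λ ν(dx) with ν = N(0,1), for which the formula predicts φ_{CP(μ)}(u) = exp(−λ(1 − e^{−u²/2})).

```python
import cmath
import math
import random

def sample_poisson(lam, rng):
    # Knuth's product-of-uniforms method for a Poisson(lam) variable.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p < threshold:
            return k
        k += 1

def compound_poisson_sample(lam, rng):
    # N_1 for intensity mu(dx) = lam * N(0,1)(dx): a Poisson(lam) number
    # of i.i.d. standard Gaussian jumps.
    return sum(rng.gauss(0.0, 1.0) for _ in range(sample_poisson(lam, rng)))

def empirical_char(lam, u, reps, seed=2):
    # Empirical characteristic function of CP(lam * N(0,1)) at u.
    rng = random.Random(seed)
    return sum(cmath.exp(1j * u * compound_poisson_sample(lam, rng))
               for _ in range(reps)) / reps
```

For λ = 2 and u = 1 the predicted value is exp(−2(1 − e^{−1/2})) ≈ 0.455, real by symmetry of the jump law.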

Chapter 8 Infinitely divisible laws and Lévy processes

In this chapter, we consider only random variables and processes with values in R.

8.1 Infinitely divisible laws and Lévy-Khintchine formula
Definition 8.1.1 Let μ be a probability measure on (R, B_R). We say that μ is infinitely divisible (ID) if for every n ≥ 1, there exists a probability distribution μ_n such that if X₁, …, X_n are independent with law μ_n, then their sum X₁ + … + X_n has law μ. Otherwise said, for every n, there exists μ_n such that μ_n^{*n} = μ, where * stands for the convolution operation for measures. Yet otherwise said, the characteristic function φ of μ is such that for every n ≥ 1, there exists another characteristic function φ_n with φ_n^n = φ. We stress that it is not the existence of a function whose n-th power is φ which is problematic, but really that this function is a characteristic function.

To start with, let us mention examples of ID laws. Constant random variables are ID. The Gaussian N(m, σ²) is the convolution of n laws N(m/n, σ²/n), so it is ID. The Poisson law P(λ) is also ID, as the convolution of n laws P(λ/n). More generally, a compound Poisson law CP(μ) is ID, as the n-th convolution power of CP(μ/n). It is a bit harder to see, but however true, that exponential and geometric distributions are ID. However, the uniform distribution on [0, 1], or the Bernoulli distribution with parameter p ∈ (0, 1), are not ID. Suppose indeed that an ID law μ has a support which is bounded above and below by M > 0. Then the support of μ_n is bounded by M/n, but then its variance is ≤ M²/n², which shows that the variance of μ is ≤ M²/n for every n, hence μ is a Dirac mass.

The main goal of this chapter is to give a structural theorem for ID laws, the Lévy-Khintchine formula. Say that a triple (a, q, π) is a Lévy triple if a ∈ R, q ≥ 0, and π is a σ-finite measure on R such that π({0}) = 0 and ∫(x² ∧ 1) π(dx) < ∞.

In particular, π(1_{|x|>ε}) < ∞ for every ε > 0.

Theorem 8.1.1 (Lévy-Khintchine formula) Let μ be an ID law. Then there exists a unique Lévy triple (a, q, π) such that if φ is the characteristic function of μ, φ(u) = e^{ψ(u)}, where ψ is the characteristic exponent given by

ψ(u) = iau − (q/2)u² + ∫_R (e^{iux} − 1 − iux 1_{|x|<1}) π(dx).

We re-obtain the constants for q = 0 and π = 0, the normal laws for π = 0, and the compound Poisson laws for a = q = 0 and π a finite measure.

Lemma 8.1.1 The characteristic function of an ID law never vanishes, and therefore the characteristic exponent ψ with ψ(0) = 0 is well-defined and unique.

Proof. If μ is ID, then φ = φ_n^n for all n, where φ_n is the characteristic function of some law μ_n. Therefore |φ_n| = |φ|^{1/n}, and as n → ∞, we see that |φ_n| converges pointwise to 1_{φ≠0}. However, φ is continuous and takes the value 1 at 0, so it is non-zero in a neighborhood of 0, and 1_{φ≠0} equals 1 (hence is continuous) in a neighborhood of 0. By Lévy's convergence theorem, this shows that μ_n weakly converges to some distribution, which has no choice but to be δ₀. In particular, φ never vanishes. To conclude, it is a standard topology exercise that a continuous function f : R → C that never vanishes and such that f(0) = 1 can be uniquely lifted into a continuous function g : R → C with g(0) = 0, so that e^g = f.

As a corollary, notice that φ_n, the n-th root of φ, can itself be written in the form e^{ψ_n} for a unique continuous ψ_n satisfying ψ_n(0) = 0, so that ψ_n = ψ/n. It also entails the uniqueness of μ_n such that μ_n^{*n} = μ.

Lemma 8.1.2 An ID law is the weak limit of compound Poisson laws.

Proof. Let φ_n be the characteristic function of μ_n, as defined above. Since (1 − (1 − φ_n))^n = φ and φ_n → 1 pointwise, we obtain that n(1 − φ_n) → −ψ pointwise, taking the complex logarithm in a neighborhood of 1. In fact, this convergence even holds uniformly on compact neighborhoods of 0, a fact that we will need later on. Exponentiating gives exp(−n(1 − φ_n)) → φ. However, on the left-hand side we can recognize the characteristic function of a compound Poisson law with intensity nμ_n.

Proof of the Lévy-Khintchine formula. We must now prove that the limit ψ of −n(1 − φ_n) has the form given in the statement of the theorem. First of all, we make a technical modification of the statement, replacing the 1_{|x|<1} in the statement by a continuous function h such that 1_{|x|<1} ≤ h ≤ 1_{|x|≤2}. This will just modify the value of a in the statement.


Let ν_n(dx) = (1 ∧ x²) nμ_n(dx), which is a sequence of measures with finite total mass. Suppose we know that the sequence (ν_n, n ≥ 1) is tight and (ν_n(R), n ≥ 1) is bounded, and let ν be the limit of ν_n along some subsequence (n_k). Then

∫_R (e^{iux} − 1) nμ_n(dx) = ∫_R (e^{iux} − 1) ν_n(dx)/(1 ∧ x²)
  = ∫_R (e^{iux} − 1 − iuxh(x)) ν_n(dx)/(1 ∧ x²) + iu ∫_R (xh(x)/(1 ∧ x²)) ν_n(dx)    (8.1)
  = ∫_R θ(u, x) ν_n(dx) + iu a_n,

where

θ(u, x) = (e^{iux} − 1 − iuxh(x))/(1 ∧ x²) if x ≠ 0, and θ(u, 0) = −u²/2,

and a_n = ∫_R (xh(x)/(1 ∧ x²)) ν_n(dx). Now, for each fixed u, θ(u, ·) is a continuous bounded function, and therefore, along the subsequence (n_k), ∫_R θ(u, x) ν_n(dx) converges to ∫_R θ(u, x) ν(dx). Since the left-hand side in (8.1) converges to ψ(u), this implies that a_{n_k} converges to some a ∈ R. Therefore, if q = ν({0}), we obtain that

ψ(u) = iua − (q/2)u² + ∫_R (e^{iux} − 1 − iuxh(x)) π(dx),

where π(dx) = 1_{x≠0} (1 ∧ x²)^{−1} ν(dx) is a measure that is σ-finite, integrates 1 ∧ x², and does not charge 0. Hence the result.

So, let us prove that (ν_n, n ≥ 1) is tight and that the total masses are bounded. First, x² 1_{|x|≤1} ≤ C(1 − cos x) for some C > 0, so

ν_n([−1, 1]) = ∫_R x² 1_{|x|≤1} nμ_n(dx) ≤ C ∫_R (1 − cos x) nμ_n(dx),

which converges to −C Re ψ(1) as n → ∞. Second, adapting Lemma 5.4.1, since ν_n 1_{|x|≥1} = nμ_n 1_{|x|≥1}, for some C > 0 and every K ≥ 1,

ν_n(|x| ≥ K) ≤ CK ∫_{|x|≤K⁻¹} n(1 − Re φ_n(x)) dx →_n −CK ∫_{|x|≤K⁻¹} Re ψ(x) dx,

where the limit can be taken because the convergence of the integrand is uniform on compact neighborhoods of 0, as stressed in the proof of Lemma 8.1.2. Now the limit can be made as small as wanted for K large enough, because ψ is continuous and ψ(0) = 0. This entails the result. The uniqueness statement will be proved in the next section.
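Lemma 8.1.2 can be illustrated numerically for the standard Gaussian, whose n-th convolution root has characteristic function φ_n(u) = exp(−u²/(2n)); the compound Poisson characteristic function exp(−n(1 − φ_n(u))) should converge to φ(u) = exp(−u²/2). A deterministic sketch (ours, not part of the notes):

```python
import math

def gaussian_char(u):
    # Characteristic function of N(0,1).
    return math.exp(-u * u / 2.0)

def cp_approx_char(u, n):
    # Characteristic function of the compound Poisson law with intensity
    # n * mu_n, where mu_n is the n-th convolution root of N(0,1):
    # exp(-n (1 - phi_n(u))) with phi_n(u) = exp(-u^2 / (2n)).
    phi_n = math.exp(-u * u / (2.0 * n))
    return math.exp(-n * (1.0 - phi_n))
```

The error at a fixed u decreases as n grows, in line with the pointwise convergence in the lemma.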

8.2 Lévy processes

In this section, all the Lévy processes under consideration start at X₀ = 0. Lévy processes are closely related to ID laws: indeed, if X is a Lévy process, then the random variable X₁ can be written as a sum of i.i.d. variables

X₁ = Σ_{k=1}^n (X_{k/n} − X_{(k−1)/n}),

hence is ID. In fact, (laws of) càdlàg Lévy processes are in one-to-one correspondence with ID laws, as we show in this section. The first problem we address is that the mapping X ↦ X₁ is injective from the set of (laws of) càdlàg Lévy processes to the set of ID laws.

Proposition 8.2.1 Let μ be an ID law. Then there exists at most one càdlàg Lévy process (X_t, t ≥ 0) such that X₁ has law μ. Moreover, if μ has Lévy triple (a, q, π) with associated characteristic exponent

ψ(u) = iau − (q/2)u² + ∫_R (e^{iux} − 1 − iux 1_{|x|<1}) π(dx),

then the law of such a process X is entirely characterized by the formula

E[exp(iuX_t)] = exp(tψ(u)).

Proof. If X is as in the statement, then for n ≥ 1, X_{1/n} must have ψ/n as characteristic exponent, by uniqueness of the characteristic exponent of the n-th root of an ID law. From this we deduce easily that E[exp(iuX_t)] = exp(tψ(u)) for every t ∈ Q₊ and u ∈ R. Since X is càdlàg, we deduce the result for every t ∈ R₊ by approximating t by ⌈2^n t⌉/2^n. Therefore, the one-dimensional marginal distributions of X are uniquely determined by μ. It is then easy to check that the finite-dimensional marginal distributions of Lévy processes are in turn determined by their one-dimensional marginal distributions, because the increments (X_{t_j} − X_{t_{j−1}}, 1 ≤ j ≤ k), for any 0 = t_0 ≤ t_1 ≤ … ≤ t_k, are independent with respective laws those of (X_{t_j − t_{j−1}}, 1 ≤ j ≤ k). Hence the result.

The next theorem is a kind of converse to this theorem, and gives an explicit construction of the càdlàg Lévy process whose law at time 1 is a given ID law μ. Let (a, q, π) be a Lévy triple associated to an ID law μ. Consider a Poisson random measure M on R₊ × R with intensity dt ⊗ π(dx), and let (t, Δ_t) = (t, x) if M has an atom of the form (t, x), and Δ_t = 0 otherwise. For any n ≥ 1, consider the martingale

Y_t^n = ∫_{[0,t]×R} 1_{n⁻¹≤|y|<1} y M(ds, dy) − t ∫_R y 1_{n⁻¹≤|y|<1} π(dy), t ≥ 0,

associated by Proposition 7.3.1 with the Poisson measure M(dt, dx) 1_{n⁻¹≤|x|<1}. Notice also that this last measure always has a finite number of atoms in [0, t] × R for each t ≥ 0, because π(dx) 1_{|x|>n⁻¹} is a finite measure by assumption on π, so that

Y_t^n = Σ_{0≤s≤t} Δ_s 1_{n⁻¹≤|Δ_s|<1} − t ∫_R y 1_{n⁻¹≤|y|<1} π(dy), t ≥ 0.    (8.2)

Independently of M, let B_t be a standard Brownian motion. Finally notice that

Y_t^0 = Σ_{0≤s≤t} Δ_s 1_{|Δ_s|≥1}, t ≥ 0,    (8.3)

is a compound Poisson process with intensity π(dx) 1_{|x|>1}. We let F_t be the σ-algebra generated by {B_s, Y_s^0, Y_s^n, n ≥ 1; 0 ≤ s ≤ t}.

Theorem 8.2.1 (Lévy-Itô theorem) Let μ be an ID law, with Lévy triple (a, q, π), and let B, Y^0, Y^n, n ≥ 1, denote the processes associated with this triple as explained above. Then there exists a càdlàg square-integrable (F_t)-martingale Y such that for every t ≥ 0,

E[ sup_{0≤s≤t} |Y_s^n − Y_s|² ] →_{n→∞} 0.

Moreover, the process

X_t = at + √q B_t + Y_t^0 + Y_t, t ≥ 0,

is a Lévy process such that X₁ has distribution μ.

This theorem, which is extremely useful in practice, is an explicit construction of any càdlàg Lévy process, out of four independent ingredients: a deterministic drift, a Brownian motion, and a jump part made of a compound Poisson process and a compensated L² càdlàg martingale. The compensation by a drift in the formula defining Y^n is crucial, because the identity function is in general not in L¹(π), so that ∫_{[0,t]×[−1,1]} x M(ds, dx) is in general ill-defined.

Proof. For every n > m > 0, the process Y^n − Y^m is a càdlàg martingale, and Doob's L² inequality gives

E[ sup_{0≤s≤t} |Y_s^n − Y_s^m|² ] ≤ 4 E[|Y_t^n − Y_t^m|²] = 4t ∫_R y² 1_{n⁻¹≤|y|<m⁻¹} π(dy) ≤ 4t ∫_R y² 1_{0<|y|<m⁻¹} π(dy),

where we used the last statement of Proposition 7.3.1 for the second equality. Since ∫ y² 1_{0<|y|<1} π(dy) < ∞, this can be made as small as wanted for m large enough. In particular, for every t, (Y_t^n)_n is a Cauchy sequence in L², and thus converges to a limit Y_t in L². The process (Y_t, t ≥ 0) then defines a martingale, as is checked by passing to the limit as n → ∞ in E[Y_t^n | F_s] = Y_s^n. Moreover, by passing to the limit as n → ∞, we obtain that sup_{0≤s≤t} |Y_s − Y_s^m|² converges in L² to 0 as m → ∞, for every t ≥ 0. By extracting along a subsequence, we may assume that the convergence is almost sure, so that Y is the a.s. uniform limit over compacts of càdlàg processes, hence is also a càdlàg process (in fact, admits a càdlàg version).

Therefore, the process X defined in the statement is indeed a càdlàg process, and it is easy to show that it is a Lévy process, being a pointwise L² limit of Lévy processes. The last thing that remains to be proved is that X₁ has law μ. But from the independence of the components used to build X, we obtain that if X_t^n = at + √q B_t + Y_t^0 + Y_t^n,

E[exp(iuX₁^n)] = exp( iua − (q/2)u² + ∫_R (e^{iuy} − 1) 1_{|y|≥1} π(dy) + ∫_R (e^{iuy} − 1 − iuy) 1_{n⁻¹≤|y|<1} π(dy) ).

By passing to the limit as n → ∞, we obtain that X₁ has the characteristic function associated to μ.
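The Lévy-Itô decomposition suggests a direct way to simulate X₁ when the Lévy measure is finite, in which case no small-jump compensation is needed. The sketch below is a toy example of ours, not the general construction: it takes π(dx) = λ·Uniform(−1,1)(dx), which is symmetric, so E[X₁] = a and Var X₁ = q + λ/3.

```python
import math
import random

def sample_poisson(lam, rng):
    # Knuth's product-of-uniforms method for a Poisson(lam) variable.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p < threshold:
            return k
        k += 1

def levy_x1(a, q, lam, rng):
    # X_1 = a + sqrt(q) B_1 + a Poisson(lam) number of Uniform(-1,1) jumps.
    x = a + math.sqrt(q) * rng.gauss(0.0, 1.0)
    for _ in range(sample_poisson(lam, rng)):
        x += rng.uniform(-1.0, 1.0)
    return x

def moments(a, q, lam, reps, seed=3):
    # Empirical mean and variance of X_1.
    rng = random.Random(seed)
    xs = [levy_x1(a, q, lam, rng) for _ in range(reps)]
    m = sum(xs) / reps
    return m, sum((x - m) ** 2 for x in xs) / reps
```

With a = 0.5, q = 1 and λ = 2 one expects E[X₁] = 0.5 and Var X₁ = 1 + 2/3.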


Proof of the uniqueness in Theorem 8.1.1. Let μ be an ID law with Lévy triple (a, q, π), and let X be the unique càdlàg Lévy process such that X₁ has law μ, given by Proposition 8.2.1. Then Theorem 8.2.1 shows that the jumps of X between times 0 and t, i.e. the process (Δ_s, 0 ≤ s ≤ t) defined by Δ_s = X_s − X_{s−}, s ≥ 0, are the atoms of a Poisson random measure M with intensity tπ on R. This intensity is determined by the law of M through the first moment formula tπ(A) = E[M(A)] of Proposition 7.2.1. Then, by defining Y^0 and Y^n by the formulas (8.2), (8.3), and letting Y = lim_n Y^n along a subsequence according to which the limit is almost sure, uniformly on compacts, we obtain that X − Y^0 − Y is a (scaled) Brownian motion with drift, with the same law as B′ = (at + √q B_t, t ≥ 0). We can recover a as the expectation of B′₁, and q as its variance. Finally, we see that μ uniquely determines its Lévy triple.

Chapter 9 Exercises

Warmup

The exercises of this section are designed to help remind you of basic concepts of probability theory (random variables, expectation, classical probability distributions, Borel-Cantelli lemmas). The last one is a longer exercise that contains the basic results on uniform integrability that are needed in this course.

Exercise 9.0.1 Remind yourself what the following classical discrete distributions are: Bernoulli with parameter p ∈ [0, 1], binomial with parameters (n, p) ∈ N × [0, 1], geometric with parameter p ∈ [0, 1], Poisson with parameter λ ≥ 0. Do so with the following classical distributions on R: uniform on [a, b], exponential with mean 1/λ, gamma with (positive) parameters (a, λ) (mean a/λ, variance a/λ²), beta with (positive) parameters (a, b), Gaussian with mean m and variance σ², Cauchy with parameter a.

Exercise 9.0.2 Compute the distribution of 1/N², where N is a standard Gaussian N(0, 1) random variable. What is the distribution of N/N′, where N, N′ are two independent such random variables?

Exercise 9.0.3 Show that for any countable set I and I-indexed family (X_i, i ∈ I) of non-negative random variables, sup_{i∈I} E[X_i] ≤ E[sup_{i∈I} X_i]. Show that these two quantities are equal if for every i, j ∈ I there exists some k ∈ I such that X_i ∨ X_j ≤ X_k.

Exercise 9.0.4 Fix α > 0, and let (Z_n, n ≥ 0) be a sequence of independent random variables with values in {0, 1}, whose laws are characterized by

P(Z_n = 1) = 1/n^α = 1 − P(Z_n = 0).

Show that Z_n converges to 0 in L¹. Show that lim sup_n Z_n is 0 a.s. if α > 1 and 1 a.s. if α ≤ 1.


Exercise 9.0.5 Let (X_n, n ≥ 1) be a sequence of independent exponential random variables with mean 1. Show that lim sup_n (log n)⁻¹ X_n = 1 a.s.

Exercise 9.0.6 Let N be an N(0, 1) random variable. Show that

P(N > x) ≤ (1/(x√(2π))) exp(−x²/2).

Show in fact that as x → ∞,

P(N > x) = (1/(x√(2π))) exp(−x²/2)(1 + o(1)).

Let (Y_n, n ≥ 1) be a sequence of independent such Gaussian variables. Show that lim sup_n (2 log n)^{−1/2} Y_n = 1 a.s.

Exercise 9.0.7 The basics of uniform integrability
Let (E, A, μ) be a measured space, with μ(E) < ∞. If f is a measurable non-negative function, we let μ(f) be a shorthand for ∫_E f dμ. A family of R-valued functions (f_i, i ∈ I) in L¹(E, A, μ) is said to be uniformly integrable (U.I. in short) if the following holds:

sup_{i∈I} μ(|f_i| 1_{|f_i|>a}) →_{a→∞} 0.

You may think of (E, A, μ) and the f_i as being a probability space and random variables.
1. Show that a U.I. family is bounded in L¹(E, A, μ). Show that the converse is not true.
2. Show that a finite family of integrable functions is U.I.
3. Let G : R₊ → R₊ be a measurable function such that lim_{x→∞} x⁻¹G(x) = +∞. Show that for every C > 0, the family {f ∈ L¹(E, A, μ) : μ(G(|f|)) ≤ C} is U.I. Deduce that a family of measurable functions that is bounded in L^p(E, A, μ) for some p > 1 is U.I.
4. (Harder) Show that the converse is true: if (f_i, i ∈ I) is a U.I. family, then there exists a function G as in 3. so that (f_i, i ∈ I) is included in a set of the form of the previous displayed expression. (Hint: consider an increasing positive sequence (a_n, n ≥ 0) such that sup_{i∈I} μ(|f_i| 1_{|f_i|≥a_n}) ≤ 2⁻ⁿ for every n.)
5. Let (f_i, i ∈ I) be a family that is bounded in L¹(E, A, μ). Show that (i) and (ii) below are equivalent:
(i) (f_i, i ∈ I) is U.I.
(ii) ∀ε > 0, ∃δ > 0 s.t. A ∈ A, μ(A) < δ ⟹ sup_{i∈I} μ(|f_i| 1_A) < ε.
6. Show that if (f_i, i ∈ I) and (g_j, j ∈ J) are two U.I. families, then (f_i + g_j, i ∈ I, j ∈ J) is also U.I.


7. Let (f_n, n ≥ 0) be a sequence of L¹ functions that converges in measure to a measurable function f, i.e. for every ε > 0,

μ({|f − f_n| > ε}) →_{n→∞} 0.

Show that (f_n, n ≥ 0) converges in L¹ to f if and only if (f_n, n ≥ 0) is U.I. Hint: for the necessary condition, you might find it useful to consider sets such as {|f − f_n| > 1}, {ε < |f − f_n| ≤ 1} and {|f − f_n| ≤ ε}.

Remark. This shows that a sequence of random variables converging in probability (or a.s.) to some other random variable has an upgraded L¹ convergence if and only if it is uniformly integrable.

9.1 Conditional expectation

Exercise 9.1.1 Let X, Y be two random variables in L¹ such that

E[X|Y] = Y and E[Y|X] = X.

Show that X = Y a.s. As a hint, you may want to consider quantities like E[(X − Y)1_{X>c, Y≤c}] + E[(X − Y)1_{X≤c, Y≤c}].

Exercise 9.1.2 Let X, Y be two independent Bernoulli random variables with parameter p ∈ (0, 1). Let Z = 1_{X+Y=0}. Compute E[X|Z], E[Y|Z].

Exercise 9.1.3 Let X ≥ 0 be a random variable on a probability space (Ω, F, P), and let G ⊂ F be a sub-σ-algebra. Show that X > 0 implies that E[X|G] > 0, up to an event of zero probability. Show that {E[X|G] > 0} is actually the smallest G-measurable event that contains the event {X > 0}, up to zero probability events.

Exercise 9.1.4 Check that the sum Z of two independent exponential random variables X, Y with parameter λ > 0 (mean 1/λ) has a gamma distribution with parameters (2, λ), whose density with respect to Lebesgue measure is λ²x exp(−λx)1_{x≥0}. Show that for every non-negative measurable h,

E[h(X)|Z] = (1/Z) ∫₀^Z h(u) du.

Conversely, let Z be a random variable with a Γ(2, λ) distribution, and suppose X is a random variable whose conditional distribution given Z is uniform on [0, Z]. Namely, for every Borel non-negative function h, E[h(X)|Z] = Z⁻¹ ∫₀^Z h(x) dx a.s. Show that X and Z − X are independent, with exponential law.
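Exercise 9.1.4 says in particular that U = X/Z is uniform on [0, 1] and independent of Z. This is easy to check by simulation (a sketch of ours, not part of the exercise): the first two moments of U and its covariance with Z should be close to 1/2, 1/3 and 0.

```python
import random

def ratio_and_sum(lam, rng):
    # X, Y i.i.d. Exp(lam); return (U, Z) = (X/(X+Y), X+Y).
    x = rng.expovariate(lam)
    y = rng.expovariate(lam)
    return x / (x + y), x + y

def check_stats(lam, reps, seed=4):
    # Empirical E[U], E[U^2] and Cov(U, Z).
    rng = random.Random(seed)
    pairs = [ratio_and_sum(lam, rng) for _ in range(reps)]
    us = [u for u, _ in pairs]
    zs = [z for _, z in pairs]
    mu = sum(us) / reps
    m2 = sum(u * u for u in us) / reps
    mz = sum(zs) / reps
    cov = sum((u - mu) * (z - mz) for u, z in pairs) / reps
    return mu, m2, cov
```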


Exercise 9.1.5 Suppose given a, b > 0, and let X, Y be two random variables with values in Z₊ and R₊ respectively, whose joint distribution is characterized by the formula

P(X = n, Y ≤ t) = b ∫₀^t ((ay)^n/n!) exp(−(a + b)y) dy.

Let n ∈ Z₊ and h : R₊ → R₊ be a measurable function; compute E[h(Y)|X = n]. Then compute E[Y/(X + 1)], E[1_{X=n}|Y] and E[X|Y].

Exercise 9.1.6 Let (X, Y₁, …, Y_n) be a random vector with components in L². Show that the best approximation of X in the L² norm by an affine combination of the (Y_i, 1 ≤ i ≤ n), say of the form β₀ + Σ_{i=1}^n β_i(Y_i − E[Y_i]), is given by β₀ = E[X] and any solution (β₁, …, β_n) of the linear system

Cov(X, Y_j) = Σ_{i=1}^n β_i Cov(Y_i, Y_j), 1 ≤ j ≤ n.

This affine combination is called the linear regression of X with respect to (Y₁, …, Y_n). If (X, Y₁, …, Y_n) is a Gaussian random vector, show that E[X|Y₁, …, Y_n] equals the linear regression of X with respect to (Y₁, …, Y_n).

Exercise 9.1.7 Let X ∈ L¹(Ω, F, P). Show that the family

{E[X|G] : G is a sub-σ-algebra of F}

is uniformly integrable.

Exercise 9.1.8 Conditional independence
Let G ⊂ F be a sub-σ-algebra. Two random variables X, Y are said to be independent conditionally on G if for every non-negative measurable f, g,

E[f(X)g(Y)|G] = E[f(X)|G] E[g(Y)|G].

What are two random variables independent conditionally on {∅, Ω}? On F?
1. Show that X, Y are independent conditionally on G if and only if for every non-negative G-measurable random variable Z, and every f, g non-negative measurable functions,

E[f(X)g(Y)Z] = E[f(X)Z E[g(Y)|G]],

and this if and only if for every measurable non-negative g,

E[g(Y)|G ∨ σ(X)] = E[g(Y)|G].

Comment on the case G = {∅, Ω}.
2. Suppose given three random variables X, Y, Z with a positive density p(x, y, z). Suppose X, Y are independent conditionally on σ(Z). Show that there exist measurable positive functions r, s so that p(x, y, z) = q(z)r(x, z)s(y, z), where q is the density of Z, and conversely.
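The linear regression of Exercise 9.1.6 can be checked numerically for n = 2 regressors, solving the stated 2×2 linear system by Cramer's rule. This is a sketch of ours; the data-generating model (X = 1 + Y₁ + 2Y₂ + noise) and all names are illustrative assumptions.

```python
import random

def regression_coeffs(xs, y1s, y2s):
    # Solve Cov(X, Y_j) = sum_i beta_i Cov(Y_i, Y_j), j = 1, 2, by
    # Cramer's rule; beta_0 = E[X] as in the exercise.
    n = len(xs)
    mean = lambda v: sum(v) / n
    def cov(u, v):
        mu, mv = mean(u), mean(v)
        return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    a11, a12, a22 = cov(y1s, y1s), cov(y1s, y2s), cov(y2s, y2s)
    b1, b2 = cov(xs, y1s), cov(xs, y2s)
    det = a11 * a22 - a12 * a12
    return mean(xs), (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det

def simulate(reps, seed=5):
    # Gaussian toy model: X = 1 + Y1 + 2*Y2 + 0.5*eps, all N(0,1) inputs.
    rng = random.Random(seed)
    y1s = [rng.gauss(0.0, 1.0) for _ in range(reps)]
    y2s = [rng.gauss(0.0, 1.0) for _ in range(reps)]
    xs = [1.0 + y1 + 2.0 * y2 + 0.5 * rng.gauss(0.0, 1.0)
          for y1, y2 in zip(y1s, y2s)]
    return regression_coeffs(xs, y1s, y2s)
```

Since the vector is Gaussian, the recovered coefficients also describe E[X|Y₁, Y₂], consistently with the last part of the exercise.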


9.2 Discrete-time martingales

Exercise 9.2.1 Let (X_n, n ≥ 0) be an integrable process with values in a countable subset E ⊂ R. Show that X is a martingale with respect to its natural filtration if and only if for every n and every i₀, …, i_n ∈ E,

E[X_{n+1} | X₀ = i₀, …, X_n = i_n] = i_n.

Exercise 9.2.2 Let (X_n, n ≥ 1) be a sequence of independent random variables with respective laws given by

P(X_n = −n²) = 1/n², P(X_n = n²/(n² − 1)) = 1 − 1/n².

Let S_n = X₁ + … + X_n. Show that S_n/n → 1 a.s. as n → ∞, and deduce that (S_n, n ≥ 0) is a martingale which converges a.s. to +∞.

Exercise 9.2.3 Let (Ω, F, (F_n), P) be a filtered probability space. Let A ∈ F_n for some n, and let m, m′ ≥ n. Show that m1_A + m′1_{Aᶜ} is a stopping time. Show that an adapted process (X_n, n ≥ 0) with respect to some filtered probability space is a martingale if and only if it is integrable, and for every bounded stopping time T, E[X_T] = E[X₀].

Exercise 9.2.4 Let X be a martingale (resp. supermartingale) on some filtered probability space, and let T be an a.s. finite stopping time. Prove that E[X_T] = E[X₀] (resp. E[X_T] ≤ E[X₀]) if either one of the following conditions holds:
1. X is bounded (∃M > 0 : ∀n ≥ 0, |X_n| ≤ M a.s.).
2. X has bounded increments (∃M > 0 : ∀n ≥ 0, |X_{n+1} − X_n| ≤ M a.s.) and E[T] < ∞.

Exercise 9.2.5 Let (X_n, n ≥ 0) be a non-negative supermartingale. Show the maximal inequality for a > 0:

a P( max_{0≤k≤n} X_k ≥ a ) ≤ E[X₀].

Exercise 9.2.6 Let T be an (F_n, n ≥ 0)-stopping time such that for some integer N > 0 and ε > 0,

P(T ≤ N + n | F_n) ≥ ε, for every n ≥ 0.

Show that E[T] < ∞. Hint: find bounds for P(T > kN).

80

CHAPTER 9. EXERCISES

Exercise 9.2.7 Your winnings per unit stake on game n are ε_n, where (ε_n, n ≥ 0) is a sequence of independent random variables with

P(ε_n = 1) = p, P(ε_n = −1) = 1 − p = q,

where p ∈ (1/2, 1). Your stake C_n on game n must lie between 0 and Z_{n−1}, where Z_{n−1} is your fortune at time n − 1. Your object is to maximize the expected interest rate E[log(Z_N/Z₀)], where N is a given integer representing the length of the game, and Z₀, your fortune at time 0, is a given constant. Let F_n = σ{ε₁, …, ε_n}. Show that if C is any previsible strategy, that is if C_n is F_{n−1}-measurable for all n, then log Z_n − nα is a supermartingale, where α denotes the entropy

α = p log p + q log q + log 2,

so that E[log(Z_N/Z₀)] ≤ Nα, but that, for a certain strategy, log Z_n − nα is a martingale. What is the best strategy?

Exercise 9.2.8 Pólya's urn
Consider an urn that initially contains two balls, one black, one white. One picks at random one of the balls with equal probability, checks the color, replaces the ball in the urn and adds another ball of the same color. Then resume the procedure. After step n, n + 2 balls are in the urn, of which B_n + 1 are black and n + 1 − B_n are white.
1. Show that ((n + 2)⁻¹(B_n + 1), n ≥ 0) is a martingale with respect to a certain filtration you should indicate. Show that it converges a.s. and in L^p for all p ≥ 1 to a [0, 1]-valued random variable X_∞.
2. Show that for every k, the process

(B_n + 1)(B_n + 2)…(B_n + k) / ((n + 2)(n + 3)…(n + k + 1)), n ≥ 1,

is a martingale. Deduce the value of E[X_∞^k], and finally the law of X_∞.
3. Re-obtain this result by directly showing that P(B_n = k) = (n + 1)⁻¹ for every n ≥ 1, 0 ≤ k ≤ n. As a hint, let Y_i be the indicator that the i-th picked ball is black, and compute P(Y_i = a_i, 1 ≤ i ≤ n) for any (a_i, 1 ≤ i ≤ n) ∈ {0, 1}ⁿ.
4. Show that for 0 < θ < 1, (N_n^θ, n ≥ 0) is a martingale, where

N_n^θ = ((n + 1)!/(B_n!(n − B_n)!)) θ^{B_n}(1 − θ)^{n−B_n}.
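Item 3 of Exercise 9.2.8, the uniform law P(B_n = k) = 1/(n + 1), is easy to check by simulation (a sketch of ours, not part of the exercise):

```python
import random

def polya_bn(n, rng):
    # n steps of the urn started from one black and one white ball;
    # returns B_n, the number of black draws (the urn then holds B_n + 1 black).
    black, total, b = 1, 2, 0
    for _ in range(n):
        if rng.random() * total < black:
            black += 1
            b += 1
        total += 1
    return b

def empirical_law(n, reps, seed=6):
    # Empirical distribution of B_n over many runs.
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(reps):
        counts[polya_bn(n, rng)] += 1
    return [c / reps for c in counts]
```

For n = 5 each of the six values 0, …, 5 should have empirical frequency close to 1/6.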

Exercise 9.2.9 Bayes' urn
Let U be a uniform random variable on [0, 1], and conditionally on U, let X₁, X₂, … be independent Bernoulli random variables with parameter U. Let B_n = Σ_{i=1}^n X_i. Show that for every n, (B₁, …, B_n) has the same law as the sequence (B₁, …, B_n) in the previous exercise. Show that θ ↦ N_n^θ is a conditional density function of U given B₁, …, B_n.

Exercise 9.2.10 Monkey typing ABRACADABRA
A monkey types a text at random on a keyboard, so that each new letter is picked


uniformly at random among the 26 letters of the roman alphabet. Let X_n be the n-th letter of the monkey's masterpiece, and let T be the first time when the monkey has typed the exact word ABRACADABRA:

T = inf{n ≥ 0 : (X_{n−10}, X_{n−9}, …, X_n) = (A, B, R, A, C, A, D, A, B, R, A)}.

Show that E[T] < ∞. The goal is to give the exact value of E[T]. For this, suppose that just before each time n, a player P_n comes and bets 1 gold coin (GC) that X_n will be A. If he loses, he leaves the game, and if he wins, he earns 26 GC, which he entirely plays on X_{n+1} being B. If he loses, he leaves, else he earns 26² GC which he bets on X_{n+2} being R, and so on. Show that

E[T] = 26¹¹ + 26⁴ + 26.

(Hint: use Exercise 9.2.4.) Why is that larger than the average first time the monkey has typed ABRACADABRI?

Exercise 9.2.11 Let (X_n, n ≥ 0) be a sequence of [0, 1]-valued random variables which satisfy the following property. First, X₀ = a a.s. for some a ∈ (0, 1), and for n ≥ 0,

P(X_{n+1} = X_n/2 | F_n) = 1 − X_n, P(X_{n+1} = (1 + X_n)/2 | F_n) = X_n,

where F_n = σ{X_k, 0 ≤ k ≤ n}. Here, we have denoted P(A|G) = E[1_A|G].
1. Prove that (X_n, n ≥ 0) is a martingale that converges in L^p for every p ≥ 1.
2. Check that E[(X_{n+1} − X_n)²] = E[X_n(1 − X_n)]/4. Then determine E[X_∞(1 − X_∞)] and deduce the law of X_∞.

Exercise 9.2.12 Let (X_n, n ≥ 0) be a martingale in L². Show that its increments (X_{n+1} − X_n, n ≥ 0) are pairwise orthogonal. Conclude that X is bounded in L² if and only if

Σ_{n≥0} E[(X_{n+1} − X_n)²] < ∞,

and that X_n converges in L² in this case, without using the L² convergence theorem for martingales.

Exercise 9.2.13 Wald's identity
Let (X_n, n ≥ 0) be a sequence of independent and identically distributed real integrable random variables, which are not a.s. 0. We let S_n = X₁ + … + X_n be the associated random walk, and recall that (S_n − nE[X₁], n ≥ 0) is a martingale. Let T be an (F_n)-stopping time.
1. Show that

E[|S_{T∧n} − S_T|] ≤ Σ_{k=n+1}^∞ E[|X_k| 1_{T≥k}] ≤ E[|X₁|] E[T 1_{T≥n+1}].


Deduce that if E[T] < ∞, then S_{T∧n} converges to S_T in L¹. Deduce that if E[T] < ∞, then E[S_T] = E[X₁]E[T].
2. Suppose E[X₁] = 0 and T_a = inf{n ≥ 0 : S_n > a} for some a > 0. Show that E[T_a] = ∞.
3. Let now a < 0 < b and T_{a,b} = inf{n ≥ 0 : S_n < a or S_n > b}. Assume that E[X₁] ≠ 0. By discussing separately the cases where X₁ is bounded or not, prove that E[T_{a,b}] < ∞ and that E[S_{T_{a,b}}] = E[X₁]E[T_{a,b}].
4. Assume that E[X₁] = 0. Show that E[T_{a,b}] < ∞. Hint: consider again separately the cases when X₁ is bounded and unbounded. In the bounded case, think how far (S_n², n ≥ 0) is from being a martingale.

Exercise 9.2.14 The gambler's ruin
Let 0 < K < N be integers. Consider a sequence of independent random variables (X_n, n ≥ 1) with P(X_n = 1) = p = 1 − P(X_n = −1), where p ∈ (0, 1/2) ∪ (1/2, 1). Let S_n = K + X₁ + … + X_n and define

T₀ = inf{n ≥ 1 : S_n = 0}, T_N = inf{n ≥ 1 : S_n = N}.

Show that T := T₀ ∧ T_N is a.s. finite (and in fact has finite expectation). Then show that, letting q = 1 − p,

M_n = (q/p)^{S_n}, N_n = S_n − (p − q)n, n ≥ 0,

define two martingales with respect to the natural filtration of (S_n, n ≥ 1). Compute P(T₀ < T_N) and E[S_T], E[T]. What happens to this exercise if p = 1/2?

Exercise 9.2.15 Azuma-Hoeffding inequality
1. Let Y be a random variable taking values in [−c, c] for some c > 0, and such that E[Y] = 0. Show that for every θ ∈ R,

E[e^{θY}] ≤ cosh(θc) ≤ exp(θ²c²/2).

As a hint, the convexity of z ↦ e^{θz} entails that

e^{θy} ≤ ((y + c)/(2c)) e^{θc} + ((c − y)/(2c)) e^{−θc}.

Also, state and prove a conditional version of this fact.
2. Let M be a martingale with M₀ = 0, and such that there exists a sequence (c_n, n ≥ 0) of positive real numbers such that |M_n − M_{n−1}| ≤ c_n for every n. Show that for x ≥ 0,

P( sup_{0≤k≤n} M_k ≥ x ) ≤ exp( −x²/(2 Σ_{k=1}^n c_k²) ).

As a hint, notice that (e^{θM_n}, n ≥ 0) is a submartingale, and optimize over θ.
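The Azuma-Hoeffding bound of Exercise 9.2.15 can be compared with the empirical tail of a simple random walk, whose increments satisfy |M_k − M_{k−1}| ≤ 1. A sketch of ours (all names illustrative):

```python
import math
import random

def azuma_bound(x, cs):
    # exp(-x^2 / (2 * sum c_k^2)), the bound from Exercise 9.2.15.
    return math.exp(-x * x / (2.0 * sum(c * c for c in cs)))

def srw_max_tail(n, x, reps, seed=7):
    # Empirical P(max_{0<=k<=n} M_k >= x) for a +-1 simple random walk.
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        s, top = 0, 0
        for _ in range(n):
            s += 1 if rng.random() < 0.5 else -1
            top = max(top, s)
        if top >= x:
            hits += 1
    return hits / reps
```

For n = 100 and x = 25 the bound is exp(−625/200) ≈ 0.044, while the true probability is noticeably smaller, as expected for a non-asymptotic exponential bound.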


Exercise 9.2.16 A discrete Girsanov theorem
Let Ω be the space of real-valued sequences (ω_n, n ≥ 0) such that lim sup_{n→∞} ω_n = +∞ and lim inf_{n→∞} ω_n = −∞. We say that such sequences oscillate. Let F_n = σ{X_k, 0 ≤ k ≤ n}, where X_k(ω) = ω_k is the k-th projection, and F = F_∞. Show that p = 1/2 is the only real in (0, 1) such that there exists a probability measure P_p on (Ω, F) that makes (X_n, n ≥ 0) a simple random walk with step distributions P_p(X_1 = 1) = p = 1 − P_p(X_1 = −1). Let P_{p,n} be the unique probability measure on (Ω, F_n) that makes (X_k, 0 ≤ k ≤ n) a simple random walk with these step distributions. If p ∈ (0, 1) \ {1/2}, identify the martingale

M_n = dP_{p,n} / dP_{1/2,n}.

Find a finite stopping time T such that E_{1/2}[M_T] < 1.

Exercise 9.2.17
Let f : [0, 1] → R be a Lipschitz function, i.e. |f(x) − f(y)| ≤ K|x − y| for some K > 0 and every x, y. Let f_n be the function obtained by interpolating linearly between the values of f taken at numbers of the form k2^{−n}, 0 ≤ k ≤ 2^n, and let M_n = f_n′ be its (a.e. defined) derivative.
1. Show that M_n is a martingale in some filtration.
2. Deduce that there exists an integrable function g : [0, 1] → R such that f(x) = f(0) + ∫_0^x g(y) dy for almost every 0 ≤ x ≤ 1.

Exercise 9.2.18 Doob's decomposition of submartingales
Let (X_n, n ≥ 0) be a submartingale.
1. Show that there exists a unique martingale M_n and a unique previsible process (A_n, n ≥ 0) such that A_0 = 0, A is increasing and X = M + A.
2. Show that M, A are bounded in L1 if and only if X is, and that A_∞ < ∞ a.s. in this case (and even that E[A_∞] < ∞), where A_∞ is the increasing limit of A_n as n → ∞.

Exercise 9.2.19
Let (X_n, n ≥ 0) be a U.I. submartingale.
1. Show that if X = M + A is the Doob decomposition of X, then M is U.I.
2. Show that for every pair of stopping times S, T with S ≤ T, E[X_T | F_S] ≥ X_S.

Exercise 9.2.20 Quadratic variation
Let (X_n, n ≥ 0) be a square-integrable martingale.
1. Show that there exists a unique increasing previsible process starting at 0, which we denote by (⟨X⟩_n, n ≥ 0), so that (X_n² − ⟨X⟩_n, n ≥ 0) is a martingale.
2.
Let C be a bounded previsible process. Compute ⟨C · X⟩.
3. Let T be a stopping time; show that ⟨X^T⟩ = ⟨X⟩^T.
4. (Harder) Show that ⟨X⟩_∞ < ∞ implies that X_n converges as n → ∞, up to a zero probability event. Is the converse true? Show that it is when sup_{n≥0} |X_{n+1} − X_n| ≤ K a.s. for some K > 0.
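The ruin probability asked for in Exercise 9.2.14 has a classical closed form: if the walk starts at K and r = q/p, then P(T_0 < T_N) = (r^K − r^N)/(1 − r^N). The Python sketch below is only an illustration (it assumes the start-at-K reading of the statement) and compares this formula with a Monte Carlo estimate.

```python
import random

def ruin_prob_exact(p, K, N):
    """Classical gambler's-ruin formula: P(hit 0 before N | start at K), p != 1/2."""
    r = (1 - p) / p
    return (r**K - r**N) / (1 - r**N)

def ruin_prob_mc(p, K, N, trials, rng):
    """Monte Carlo estimate of the same probability."""
    ruined = 0
    for _ in range(trials):
        s = K
        while 0 < s < N:
            s += 1 if rng.random() < p else -1
        ruined += (s == 0)
    return ruined / trials

rng = random.Random(1)
p, K, N = 0.6, 3, 10
exact = ruin_prob_exact(p, K, N)     # ~ 0.284 for these parameters
approx = ruin_prob_mc(p, K, N, 4000, rng)
assert abs(exact - approx) < 0.03
```

The exact value can be derived by optional stopping applied to the martingale M_n = (q/p)^{S_n} of the exercise, which is the intended route.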

9.3 Continuous-time processes

Exercise 9.3.1 Gaussian processes
A real-valued process (X_t, t ≥ 0) is called a Gaussian process if for every t_1 < t_2 < ... < t_k, the random vector (X_{t_1}, ..., X_{t_k}) is a Gaussian random vector. Show that the law of a Gaussian process is uniquely characterized by the numbers E[X_t], t ≥ 0, and Cov(X_s, X_t) for s, t ≥ 0.

Exercise 9.3.2
Let T be an exponential random variable with parameter λ > 0. Define

Z_t = 1_{t≥T},  F_t = σ{Z_s, 0 ≤ s ≤ t},  M_t = 1 − e^{λt} if t < T, and M_t = 1 if t ≥ T.

Show that E[|M_t|] < ∞ for every t ≥ 0, and that E[M_t 1_{T>r}] = E[M_s 1_{T>r}] for every r ≤ s ≤ t. Deduce that (M_t, t ≥ 0) is a càdlàg (F_t)-martingale. Is M bounded in L1? Is it uniformly integrable? Is M_T in L1?

Exercise 9.3.3 Hazard function
Let T be a random variable in (0, ∞) that admits a strictly positive continuous density f on (0, ∞). Let F(t) = P(T ≤ t). Define
A_t = ∫_0^t f(s)/(1 − F(s)) ds,  t ≥ 0,

to be the hazard function of T. Show that A_T has the law of an exponential random variable with parameter 1. As a hint, consider the distribution function P(A_T ≤ t), t ≥ 0, and write it in terms of the inverse function A^{−1}. By letting Z_t = 1_{t≥T}, t ≥ 0, and F_t = σ{Z_s, 0 ≤ s ≤ t}, prove that (Z_t − A_{T∧t}, t ≥ 0) is a càdlàg martingale with respect to (F_t, t ≥ 0).

The next exercises are designed to (hopefully) help those of you who want to gain better insight into the nature of filtrations and events related to continuous-time processes.

Exercise 9.3.4
Let C_1 be the product σ-algebra on Ω = C([0, 1], R), i.e. the smallest σ-algebra that makes the maps X_t : ω ↦ ω(t) measurable for every t ∈ [0, 1]. Let C_2 be the (more natural?) Borel σ-algebra on C([0, 1], R), when endowed with the uniform norm and the associated topology. Show that C_1 = C_2.

Exercise 9.3.5
Let I be a nonempty real interval. Let Ω = R^I be the set of all functions defined on I, endowed with the product σ-algebra F, i.e. the smallest σ-algebra with respect to which X_t : ω ↦ ω(t) is measurable for every t. Show that

G = ⋃_{J ⊂⊂ I} σ(X_s, s ∈ J)

is a σ-algebra, where J ⊂⊂ I stands for "J ⊂ I and J is countable". Deduce that G = F. Show that the set

{ω : s ↦ X_s(ω) is continuous}

is not measurable with respect to F.
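Exercise 9.3.3 can be checked numerically. In the Python sketch below (an illustration only; the particular law of T is a choice, not part of the exercise) we take P(T > t) = e^{−t²}, so that f(t)/(1 − F(t)) = 2t and A_t = t², and verify that A_T = T² behaves like an Exp(1) random variable.

```python
import math
import random

rng = random.Random(2)

# Example law: P(T > t) = exp(-t^2) on (0, inf), with density f(t) = 2t exp(-t^2);
# its hazard function is A_t = \int_0^t 2s ds = t^2.
def sample_T():
    u = rng.random()                      # inverse-transform sampling
    return math.sqrt(-math.log(1 - u))

n = 20000
a_vals = [sample_T() ** 2 for _ in range(n)]   # samples of A_T = T^2

mean = sum(a_vals) / n
tail = sum(1 for a in a_vals if a > 1.0) / n
# If A_T ~ Exp(1): E[A_T] = 1 and P(A_T > 1) = e^{-1} ~ 0.368.
assert abs(mean - 1.0) < 0.05
assert abs(tail - math.exp(-1)) < 0.02
```

Any other choice of density gives the same conclusion, which is exactly the content of the exercise.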

9.4 Weak convergence

Exercise 9.4.1
Let (X_n, n ≥ 1) be a sequence of independent random variables with uniform distribution on [0, 1]. Let M_n = max(X_1, ..., X_n). Show that n(1 − M_n) converges in distribution as n → ∞, and determine the limit law.

Exercise 9.4.2
Let (X_n, n ∈ N ∪ {∞}) be random variables defined on some probability space (Ω, F, P), with values in a metric space (M, d).
1. Suppose that X_n → X_∞ a.s. as n → ∞. Show that X_n converges to X_∞ in distribution.
2. Suppose that X_n converges in probability to X_∞. Show that X_n converges in distribution to X_∞. Hint: use the fact that (X_n, n ≥ 0) converges in probability to X_∞ if and only if every subsequence of (X_n, n ≥ 0) admits a further subsequence converging a.s. to X_∞.
3. Show that if X_n converges in distribution to a constant X_∞ = c, then X_n converges in probability to c.

Exercise 9.4.3
Suppose given sequences (X_n, n ≥ 0) and (Y_n, n ≥ 0) of real-valued random variables, and two further random variables X, Y, such that X_n, Y_n converge in distribution to X, Y respectively. Is it true that (X_n, Y_n) converges in distribution to (X, Y)? Show that this is true in the following cases:
1. For every n, X_n and Y_n are independent, as are X and Y.
2. Y is a.s. constant (Hint: use 3. in the previous exercise).

Exercise 9.4.4
Let m be a probability measure on R. Define, for every n ≥ 0,

m_n(dx) = Σ_{k∈Z} m([k2^{−n}, (k + 1)2^{−n})) δ_{k2^{−n}}(dx),

where δ_z(dx) denotes the Dirac mass at z. Show that m_n converges weakly to m.

Exercise 9.4.5
1. Let (X_n, n ≥ 1) be independent exponential random variables with mean 1. Define


S_n = X_1 + ... + X_n, and determine without computation the limit of P(S_n ≤ n) as n → ∞ (Hint: which theorem could be useful here?).
2. Determine, also without computation, the limit of exp(−n) Σ_{k=0}^n n^k / k!.
Hint: recall that the Poisson law with parameter λ > 0 is the probability distribution on Z_+ that puts mass e^{−λ} λ^n / n! on the integer n. Then if X, Y are two independent random variables with Poisson laws of parameters λ, μ, then X + Y has a Poisson law with parameter λ + μ. Using this, make the formula look like question 1.

Exercise 9.4.6
Let (Y_n, n ≥ 0) be a sequence of random variables such that Y_n follows a Gaussian N(m_n, σ_n²) law, and suppose that Y_n converges weakly to some Y as n → ∞. Show that there exist m ∈ R and σ² ≥ 0 such that m_n → m, σ_n² → σ², and that Y is Gaussian N(m, σ²). Hint: use characteristic functions, and first show that the variances converge.

Exercise 9.4.7
Let d ≥ 1.
1. Show that a finite family of probability measures on R^d is tight.
2. Assuming Prokhorov's theorem for probability measures on R^d, show that if (μ_n, n ≥ 0) is a sequence of non-negative measures on R^d which is tight (for every ε > 0 there is a compact K ⊂ R^d such that sup_{n≥0} μ_n(R^d \ K) < ε) and such that sup_{n≥0} μ_n(R^d) < ∞, then there exists a subsequence (n_k) along which μ_{n_k} converges weakly to a limit μ (i.e. μ_{n_k}(f) converges to μ(f) for every bounded continuous f).
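Question 2 of Exercise 9.4.5 can be checked by direct computation: exp(−n) Σ_{k=0}^n n^k/k! is P(Poisson(n) ≤ n), and the central limit theorem suggests the limit 1/2. The Python sketch below (an illustration only) evaluates the sum in log space to avoid underflow for large n.

```python
import math

def poisson_cdf_at_mean(n):
    """exp(-n) * sum_{k=0}^{n} n^k / k!, computed stably in log space."""
    logs = [-n + k * math.log(n) - math.lgamma(k + 1) for k in range(n + 1)]
    m = max(logs)                                    # log-sum-exp trick
    return math.exp(m) * sum(math.exp(l - m) for l in logs)

# By the CLT applied to a sum of n i.i.d. Poisson(1) variables,
# this quantity tends to 1/2 as n -> infinity.
vals = [poisson_cdf_at_mean(n) for n in (10, 100, 1000)]
assert all(v > 0.5 for v in vals)   # the median of Poisson(n) is at most n
assert abs(vals[-1] - 0.5) < 0.02
```

The values decrease towards 1/2 at rate of order n^{−1/2}, consistent with the CLT approximation.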

9.5 Brownian motion

Exercise 9.5.1
Recall that a Gaussian process (X_t, t ≥ 0) in R^d is a process such that for every t_1 < t_2 < ... < t_k in R_+, the vector (X_{t_1}, ..., X_{t_k}) is a Gaussian random vector. Show that the (standard) Brownian motion in R^d is the unique Gaussian process (B_t, t ≥ 0) with E[B_t] = 0 for every t ≥ 0 and Cov(B_s, B_t) = (s ∧ t) Id for every s, t ≥ 0.

Exercise 9.5.2
Let B be a standard real-valued Brownian motion.
1. Show that a.s.,

lim sup_{t↓0} B_t/√t = +∞,  lim inf_{t↓0} B_t/√t = −∞.

2. Show that B_n/n → 0 a.s. as n → ∞. Then show that a.s., for n large enough, sup_{t∈[n,n+1]} |B_t − B_n| ≤ √n, and conclude that B_t/t → 0 a.s. as t → ∞.

3. Show that the process

B̃_t = t B_{1/t} if t > 0,  B̃_0 = 0,

is a standard Brownian motion (Hint: use Exercise 9.5.1).
4. Use this to show that

lim sup_{t→∞} B_t/√t = +∞,  lim inf_{t→∞} B_t/√t = −∞.

Exercise 9.5.3 Around hitting times
Let (B_t, t ≥ 0) be a standard real-valued Brownian motion.
1. Let T_x = inf{t ≥ 0 : B_t = x} for x ∈ R. Prove that T_x has the same distribution as (x/B_1)², and compute its probability density function.
2. For x, y > 0, show that

P(T_{−y} < T_x) = x/(x + y),  E[T_x ∧ T_{−y}] = xy.

3. Show that if 0 < x < y, the random variable T_y − T_x has the same law as T_{y−x}, and is independent of F_{T_x} (where (F_t, t ≥ 0) is the natural filtration of B).
Hint: the three questions are independent.

Exercise 9.5.4
Let (B_t, t ≥ 0) be a standard real-valued Brownian motion. Compute the joint distribution of (B_t, sup_{0≤s≤t} B_s) for t ≥ 0.

Exercise 9.5.5
Let (B_t, t ≥ 0) be a standard Brownian motion, and let 0 ≤ a < b.
1. Compute the mean and variance of

X_n := Σ_{k=1}^{2^n} ( B_{a+k(b−a)2^{−n}} − B_{a+(k−1)(b−a)2^{−n}} )².

2. Show that X_n converges a.s. and give its limit.
3. Deduce that a.s. there exists no interval [a, b] with a < b such that B is Hölder continuous with exponent γ > 1/2 on [a, b], i.e. sup_{a≤s<t≤b} (|B_t − B_s|/|t − s|^γ) < ∞.

Exercise 9.5.6
Let (B_t, t ≥ 0) be a standard Brownian motion. Define G_1 = sup{t ≤ 1 : B_t = 0} and D_1 = inf{t ≥ 1 : B_t = 0}.
1. Are these random variables stopping times? Show that G_1 has the same distribution as 1/D_1.
2. By applying the Markov property at time 1, compute the law of D_1. Deduce that of G_1 (it is called the arcsine law).
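Exercise 9.5.5 is the quadratic variation of Brownian motion: X_n has mean b − a and variance 2(b − a)²/2^n, so X_n → b − a. The Python sketch below (an illustration only) simulates the 2^n dyadic increments directly, which is enough to see the limit without building a whole Brownian path.

```python
import random

rng = random.Random(3)
a, b, n = 1.0, 3.0, 14          # 2^14 dyadic increments of [a, b]
h = (b - a) / 2**n              # each increment is N(0, h)
x_n = sum(rng.gauss(0.0, h**0.5) ** 2 for _ in range(2**n))
# E[X_n] = b - a and Var(X_n) = 2 (b-a)^2 / 2^n -> 0, so X_n -> b - a.
assert abs(x_n - (b - a)) < 0.1
```

The standard deviation at this resolution is about (b − a)√(2/2^n) ≈ 0.02, so the tolerance 0.1 is comfortable.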


Exercise 9.5.7
Let (B_t, t ≥ 0) be a standard Brownian motion, and let (F_t, t ≥ 0) be its natural filtration. Determine all the polynomials f(t, x) of degree less than or equal to 3 in x such that (f(t, B_t), t ≥ 0) is a martingale.

Exercise 9.5.8
Let (B_t, t ≥ 0) be a standard Brownian motion in R³. We let R_t = 1/|B_t|.
1. Show that (R_t, t ≥ 1) is bounded in L².
2. Show that E[R_t] → 0 as t → ∞.
3. Show that (R_t, t ≥ 1) is a supermartingale. Deduce that |B_t| → ∞ as t → ∞, a.s.

Exercise 9.5.9 Zeros of Brownian motion
Let (B_t, t ≥ 0) be a standard real-valued Brownian motion. Let Z = {t ≥ 0 : B_t = 0} be the set of zeros of B.
1. Show that Z is closed, unbounded and has zero Lebesgue measure a.s.
2. By using the stopping times D_q = inf{t ≥ q : B_t = 0} for q ∈ Q_+, show that Z has no isolated point a.s.

Exercise 9.5.10
Let W_0(dw) denote Wiener's measure on Ω_0 = {w ∈ C([0, 1]) : w(0) = 0}, and define a new probability measure W_0^{(a)} on Ω_0 by

dW_0^{(a)}/dW_0 (w) = exp(a w(1) − a²/2).

1. Show that under W_0^{(a)}, the canonical process X_t : w ↦ w(t) remains Gaussian, and give its distribution.
2. Show that W_0({f ∈ Ω_0 : ‖f‖ < ε}) > 0 for every ε > 0, where ‖f‖ = sup_{0≤t≤1} |f(t)|.
3. Show that for every non-empty open set U ⊂ Ω_0, one has W_0(U) > 0. Hint: first note that any such U contains the ε-neighborhood of a piecewise linear function f, for some ε > 0.

Exercise 9.5.11 Brownian bridge
Let (B_t, 0 ≤ t ≤ 1) be a standard Brownian motion. For any y ∈ R, we let (Z_t^y = yt + (B_t − tB_1), 0 ≤ t ≤ 1), and call it the Brownian bridge from 0 to y. Let W_0^y be the law of (Z_t^y, 0 ≤ t ≤ 1) on C([0, 1]). Show that for any non-negative measurable function F : C([0, 1]) → R_+, setting f(y) = W_0^y(F), we have

E[F(B) | B_1] = f(B_1), a.s.

Hint: find a simple argument entailing that B_1 is independent of the process (B_t − tB_1, 0 ≤ t ≤ 1).
Explain why we can interpret W_0^y as the law of a Brownian motion conditioned to hit y at time 1.


Exercise 9.5.12
Show that the Dirichlet problem on D = B(0, 1) \ {0} in R^d, with boundary conditions g(x) = 0 for |x| = 1 and g(x) = 1 for x = 0, has no solution for d ≥ 2.

Exercise 9.5.13 Dirichlet problem in the upper-half plane
Let H = {(x, y) ∈ R² : y > 0}. Let (B_t, t ≥ 0) be a Brownian motion started from x under the probability measure P_x, and let T = inf{t ≥ 0 : B_t ∉ H}.
1. Determine the law of B_T under P_x whenever x ∈ H.
2. Show that if u is a bounded continuous function on the closure of H which is harmonic on H, then

u(x, y) = (1/π) ∫_R u(z, 0) y / ((x − z)² + y²) dz.
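The Poisson formula of Exercise 9.5.13 can be checked numerically on a specific bounded harmonic function: u(x, y) = e^{−y} cos x satisfies u_xx + u_yy = 0 on H and has boundary values cos z. The Python quadrature sketch below (an illustration only; the truncation limit and step size are ad hoc choices) verifies that the Poisson integral reproduces u at an interior point.

```python
import math

def poisson_integral(x, y, boundary, lim=500.0, step=0.01):
    """Midpoint-rule approximation of (1/pi) \\int_R boundary(z) y / ((x-z)^2 + y^2) dz."""
    n = int(2 * lim / step)
    total = 0.0
    for i in range(n):
        z = -lim + (i + 0.5) * step       # midpoint of the i-th cell
        total += boundary(z) * y / ((x - z) ** 2 + y ** 2)
    return total * step / math.pi

# u(x, y) = exp(-y) cos(x) is bounded and harmonic on H, with u(z, 0) = cos z.
x, y = 0.5, 1.0
lhs = poisson_integral(x, y, math.cos)
rhs = math.exp(-y) * math.cos(x)
assert abs(lhs - rhs) < 1e-2
```

The kernel y/(π((x − z)² + y²)) is the Cauchy density centered at x with parameter y, which is exactly the law of B_T asked for in question 1.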

9.6 Poisson measures, ID laws and Lévy processes

Exercise 9.6.1
Prove that the Poisson law with parameter λ > 0 is the weak limit of the binomial law with parameters (n, λ/n) as n → ∞. A factory makes 500,000 light bulbs in a day. On average, 4 of these are defective. Estimate the probability that on some given day, 2 of the produced light bulbs were defective.

Exercise 9.6.2 The bus paradox
Why do we always feel we are waiting a very long time before buses arrive? This exercise gives an indication of why... well, if buses arrive according to a Poisson process.
1. Suppose buses have been circulating in a city day and night since forever, the counterpart being that drivers do not operate on a timetable. Rather, the times of arrival of buses at a given bus stop are the atoms of a Poisson measure on R with intensity λ dt, where dt is Lebesgue measure on R. A customer arrives at a fixed time t at the bus stop. Let S, T be the two consecutive atoms of the Poisson measure satisfying S < t < T. Show that the average time E[T − S] that elapses between the arrival of the last bus before time t and that of the first bus after time t is 2/λ. Explain why this is twice the average time between consecutive buses. Can you see why this is so?
2. Suppose instead that buses start circulating at time 0, so that arrivals of buses at the stop are now the jump times of a Poisson process with intensity λ on R_+. If the customer arrives at time t, show that the average elapsed time between the bus before his arrival (time S) and the bus after it (time T) is λ^{−1}(2 − e^{−λt}) (with the convention S = 0 if no atom has fallen in [0, t]).

Exercise 9.6.3
Prove Proposition 7.3.2.

Exercise 9.6.4
Check the marking property of Poisson random measures: if M(dx) = Σ_{i∈I} δ_{x_i}(dx) is a Poisson random measure on (E, E) with intensity μ, and if (y_i, i ∈ I) are i.i.d. random variables with law ν on some measurable space (F, F), independent of M, then Σ_{i∈I} δ_{(x_i, y_i)}(dx dy) is a Poisson random measure on E × F with intensity μ ⊗ ν.

Exercise 9.6.5 Brownian motion and the Cauchy process
Let (B_t = (B_t¹, B_t²), t ≥ 0) be a standard Brownian motion in R² (i.e. B_0 = 0). Recall that the Cauchy law with parameter a > 0 has probability density function a/(π(a² + x²)), x ∈ R. We let

C_a = inf{t ≥ 0 : B_t² = a},  a ≥ 0.
Prove that the process (B_{C_a}¹, a ≥ 0) is a Lévy process such that B_{C_a}¹ has a Cauchy law with parameter a for every a > 0. Does this remind you of a previous exercise?
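For the light-bulb estimate in Exercise 9.6.1, the Poisson approximation gives P(2 defective) ≈ e^{−4}·4²/2! ≈ 0.1465, and the exact Binomial(500000, 4/500000) mass is numerically indistinguishable from it. The Python sketch below (an illustration only) computes both, using log-gamma to handle the huge binomial coefficient.

```python
import math

def binom_pmf(n, p, k):
    """Binomial(n, p) probability of exactly k successes, via log-gamma for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

n, lam, k = 500_000, 4.0, 2
p_binom = binom_pmf(n, lam / n, k)
p_poisson = math.exp(-lam) * lam**k / math.factorial(k)
# With 500,000 daily bulbs averaging 4 defective, the chance of exactly 2
# defective bulbs is essentially the Poisson(4) mass at 2, about 0.1465.
assert abs(p_binom - p_poisson) < 1e-4
assert abs(p_poisson - 0.1465) < 1e-3
```

The total-variation bound λ²/n ≈ 3·10^{−5} for the Poisson approximation explains why the two numbers agree to this accuracy.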

Index
Blumenthal's 0-1 law, 47
Branching process, 23
Brownian motion, 45
  (Ft)-Brownian motion, 48
  finite marginal distributions, 45
  standard, 45
càdlàg, 29
Central limit theorem, 40
Characteristic exponent, 70
Compound Poisson distribution, 68
Compound Poisson process, 67
Conditional convergence theorems, 9
Conditional density functions, 11
Conditional expectation
  discrete case, 5
  for L1 random variables, 6
  for non-negative random variables, 8
Conditional Jensen inequality, 9
Convergence in distribution, 39
Dirichlet problem, 55
Donsker's invariance principle, 59
Doob's Lp inequality, 19, 34
Doob's maximal inequality, 18
Doob's upcrossing lemma, 17
Exterior cone condition, 55
Filtration, 13
  filtered space, 13
  natural filtration, 13
Finite marginal distributions, 31
First entrance time, 14, 30
First hitting times for Brownian motion, 50, 51
Harmonic function, 56
Infinitely divisible distribution, 69
Intensity measure, 63
Kakutani's product-martingales theorem, 26
Kolmogorov's 0-1 law, 23
Kolmogorov's continuity criterion, 35
Laplace functional, 65
Last exit time, 14
Law of large numbers, 23
Lévy process, 66, 71
Lévy triple, 69
Lévy's convergence theorem, 41
Lévy-Itô theorem, 73
Lévy-Khintchine formula, 70
Likelihood ratio test, 28
Martingale, 14
  backwards, 21
  closed, 19
  complex-valued, 51
  convergence theorem
    almost-sure, 17, 34
    for backwards martingales, 21
    in L1, 20, 34
    in Lp, p > 1, 19, 34
  regularization, 32
  uniformly integrable, 20
Martingale transform, 15
Optional stopping
  for discrete-time martingales, 16
  for uniformly integrable martingales, 20, 34
Poisson point process, 66
Poisson random measure, 63
Prokhorov's theorem, 40
Radon-Nikodym theorem, 25
Recurrence, 53
Reflection principle, 49
Scaling property of Brownian motion, 47
Separable σ-algebra, 25
Simple Markov property
  for Brownian motion, 47
  for Lévy processes, 66
Skorokhod's embedding, 60
Stochastic process, 13
  adapted, 13
  continuous-time, 29
  discrete-time, 15
  integrable, 13
  previsible, 15
  stopped process, 15
Stopping time
  definition, 14
  measurable events before T, 15
Strong Markov property, 49
Submartingale, 14
Supermartingale, 14
Taking out what is known, 10
Tightness, 40
Tower property, 10
Transience, 53
Versions of a process, 31
Weak convergence, 37
Wiener space, 46
Wiener's measure, 46, 48
Wiener's theorem, 45
