Calin o An Informal Introduction To Stochastic Calculus With

An Informal Introduction to
Stochastic Calculus
with Applications
9620_9789814678933_tp.indd 1 4/6/15 9:39 am

November 24, 2014 14:49 BC: 9370 The (1 + 1)-Nonlinear Universe of the Parabolic Map Universe page vi
An Informal Introduction to
Stochastic Calculus
with Applications
Ovidiu Calin
Eastern Michigan University, USA
World Scientific
NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TA I P E I CHENNAI
9620_9789814678933_tp.indd 2 4/6/15 9:39 am

Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Library of Congress Cataloging-in-Publication Data

Calin, Ovidiu.
An informal introduction to stochastic calculus with applications / by Ovidiu Calin (Eastern Michigan
University, USA).
pages cm
Includes bibliographical references and index.
ISBN 978-9814678933 (hardcover : alk. paper) -- ISBN 978-9814689915 (pbk : alk. paper)
1. Stochastic analysis. 2. Calculus. I. Title. II. Title: Introduction to stochastic calculus with applications.
QA274.2.C35 2015
519.2'2--dc23
2015014680
British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.
Copyright 2015 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or
mechanical, including photocopying, recording or any information storage and retrieval system now known or to
be invented, without written permission from the publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.
Printed in Singapore
RokTing - An Informal Introduction to Stochastic Calculus.indd 1 8/5/2015 3:41:36 PM

May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page v
Preface
Deterministic Calculus has been proved extremely useful in the last few hun-
dred years for describing the dynamics laws for macro-objects, such as plan-
ets, projectiles, bullets, etc. However, at the micro-scale, the picture looks
completely dierent, since at this level the classical laws of Newtonian me-
chanics cease to function normally. Micro-particles behave dierently, in
the sense that their state cannot be determined accurately as in the case of
macro-objects; their position or velocity can be described using probability
densities rather than exact deterministic variables. Consequently, the study
of nature at the micro-scale level has to be done with the help of a special
tool, called Stochastic Calculus. The fact that nature at a small scale has a
non-deterministic character makes Stochastic Calculus a useful and important
tool for the study of Quantum Mechanics.
In fact, all branches of science involving random functions can be ap-
proached by Stochastic Calculus. These include, but they are not limited to,
signal processing, noise ltering, stochastic control, optimal stopping, elec-
trical circuits, nancial markets, molecular chemistry, population evolution,
etc.
However, all these applications assume a strong mathematical background,
which takes a long time to develop. Stochastic Calculus is not an easy theory
to grasp and, in general, requires acquaintance with probability, analysis and
measure theory. This fact makes Stochastic Calculus almost always absent
from the undergraduate curriculum. However, many other subjects studied at
this level, such as biology, chemistry, economics, or electrical circuits, might be
more completely understood if a minimum knowledge of Stochastic Calculus
is assumed.
The attribute informal, present in the title of the book, refers to the fact
that the approach is at an introductory level and not at its maximum math-
ematical detail. Many proofs are just sketched, or done naively without
putting the reader through a theory with all the bells and whistles.
The goal of this work is to informally introduce elementary Stochastic
Calculus to senior undergraduate students in Mathematics, Economics and
Business majors. The authors goal was to capture as much as possible of the
v
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page vi
vi An Informal Introduction to Stochastic Calculus with Applications
spirit of elementary Calculus, which the students have already been exposed
to in the beginning of their majors. This assumes a presentation that mimics
similar properties of deterministic Calculus as much as possible, which facili-
tates the understanding of more complicated concepts of Stochastic Calculus.
The reader of this text will get the idea that deterministic Calculus is just
a particular case of Stochastic Calculus and that Itos integral is not a too
much harder concept than the Riemannian integral, while solving stochastic
dierential equations follows relatively similar steps as solving ordinary dif-
ferential equations. Moreover, modeling real life phenomena with Stochastic
Calculus rather than with deterministic Calculus brings more light, detail and
signicance to the picture.
The book can be used as a text for a one semester course in stochastic
calculus and probabilities, or as an accompanying text for courses in other
areas such as nance, economics, chemistry, physics, or engineering.
Since deterministic Calculus books usually start with a brief presentation
of elementary functions, and then continue with limits, and other properties
of functions, we employed here a similar approach, starting with elementary
stochastic processes, dierent types of limits and pursuing with properties
of stochastic processes. The chapters regarding dierentiation and integration
follow the same pattern. For instance, there is a product rule, a chain-type rule
and an integration by parts in Stochastic Calculus, which are modications of
the well-known rules from elementary Calculus.
In order to make the book available to a wider audience, we sacriced rigor
and completeness for clarity and simplicity, emphasizing mainly on examples
and exercises. Most of the time we assumed maximal regularity conditions for
which the computations hold and the statements are valid. Many complicated
proofs can be skipped at the rst reading without aecting later understand-
ing. This will be found attractive by both Business and Economics students,
who might get lost otherwise in a very profound mathematical textbook where
the forests scenery is obscured by the sight of the trees. A ow chart indicat-
ing the possible order the reader can follow can be found at the end of this
preface.
An important feature of this textbook is the large number of solved prob-
lems and examples which will benet both the beginner as well as the advanced
student.
This book grew from a series of lectures and courses given by the author
at Eastern Michigan University (USA), Kuwait University (Kuwait) and Fu-
Jen University (Taiwan). The student body was very varied. I had math,
statistics, computer science, economics and business majors. At the initial
stage, several students read the rst draft of these notes and provided valuable
feedback, supplying a list of corrections, which is far from exhaustive. Finding
any typos or making comments regarding the present material are welcome.
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page vii
Preface vii
Heartfelt thanks go to the reviewers who made numerous comments and

observations contributing to the quality of this book, and whose time is very
much appreciated.
Finally, I would like to express my gratitude to the World Scientic Pub-
lishing team, especially Rok-Ting Tan and Ying-Oi Chiew for making this
endeavor possible.
O. Calin Michigan, January 2015

November 24, 2014 14:49 BC: 9370 The (1 + 1)-Nonlinear Universe of the Parabolic Map Universe page vi
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page ix
List of Notations and Symbols
The following notations have been frequently used in the text.

(, F, P ) Probability space
Sample space
F -eld
X Random variable
Xt Stochastic process
as lim Xt The almost sure limit of Xt
t
ms lim Xt The mean square limit of Xt
t
p lim Xt The limit in probability of Xt
t
Ft Filtration
t , dWt
Nt , W White noise
dt
W t , Bt Brownian motion
Wt , Bt Jumps of the Brownian motion during time interval t
dWt , dBt Innitesimal jumps of the Brownian motion
V (Xt ) Total variation of Xt
V (2) (Xt ), X, Xt Quadratic variation of Xt
FX (x) Probability distribution function of X
pX (x) Probability density function of X
p(x, y; t) Transition density function
E[ ] Expectation operator
E[X|G] Conditional expectation of X with respect to G
V ar(X) Variance of the random variable X
Cov(X, Y ) Covariance of X and Y
(X, Y ), Corr(X, Y ) Correlation of X and Y
AX , F X
-algebras generated by X
ix
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page x
x An Informal Introduction to Stochastic Calculus with Applications
( ) Gamma function
B( , ) Beta function
Nt Poisson process
Sn Waiting time for Poisson process
Tn Interarrival time for Poisson process
1 2 The minimum between 1 and 2 (= min{1 , 2 })
1 2 The maximum between 1 and 2 (= max{1 , 2 })
n Sequence superior limit (= supn1 n )
n Sequence inferior limit (= inf n1 n )
Drift rate
Volatility, standard deviation

xk , Partial derivative with respect to xk
xk
Rn n-dimensional Euclidean space

x Euclidean norm (= x21 + + x2n )
f Laplacian of f
1A , A The characteristic function of A

f L2 2
The L -norm (=
b 2
a f (t) dt)
2
L [0, T ] Squared integrable functions on [0, T ]
C 2 (Rn ) Functions twice dierentiable with second derivative continuous
C02 (Rn ) Functions with compact support of class C 2
Rt Bessel process
t The mean square estimator of t
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page xi
Contents
Preface v
List of Notations and Symbols ix
1 A Few Introductory Problems 1

1.1 Stochastic Population Growth Models . . . . . . . . . . . . . . 1
1.2 Pricing Zero-coupon Bonds . . . . . . . . . . . . . . . . . . . . 3
1.3 Noisy Pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Diusion of Particles . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Cholesterol Level . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.6 Electron Motion . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.7 White Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.8 Bounded and Quadratic Variation . . . . . . . . . . . . . . . . 6
2 Basic Notions 11
2.1 Probability Space . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Sample Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Events and Probability . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Integration in Probability Measure . . . . . . . . . . . . . . . . 14
2.6 Two Convergence Theorems . . . . . . . . . . . . . . . . . . . . 15
2.7 Distribution Functions . . . . . . . . . . . . . . . . . . . . . . . 16
2.8 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.9 Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.9.1 The best approximation of a random variable . . . . . . 20
2.9.2 Change of measure in an expectation . . . . . . . . . . . 21
2.10 Basic Distributions . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.11 Sums of Random Variables . . . . . . . . . . . . . . . . . . . . 25
2.12 Conditional Expectations . . . . . . . . . . . . . . . . . . . . . 27
2.13 Inequalities of Random Variables . . . . . . . . . . . . . . . . . 32
2.14 Limits of Sequences of Random Variables . . . . . . . . . . . . 37
xi
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page xii
xii An Informal Introduction to Stochastic Calculus with Applications
2.15 Properties of Mean-Square Limit . . . . . . . . . . . . . . . . . 40

2.16 Stochastic Processes . . . . . . . . . . . . . . . . . . . . . . . . 41
3 Useful Stochastic Processes 45

3.1 The Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 Geometric Brownian Motion . . . . . . . . . . . . . . . . . . . . 51
3.3 Integrated Brownian Motion . . . . . . . . . . . . . . . . . . . . 53
3.4 Exponential Integrated Brownian Motion . . . . . . . . . . . . 57
3.5 Brownian Bridge . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.6 Brownian Motion with Drift . . . . . . . . . . . . . . . . . . . . 58
3.7 Bessel Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.8 The Poisson Process . . . . . . . . . . . . . . . . . . . . . . . . 61
3.9 Interarrival Times . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.10 Waiting Times . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.11 The Integrated Poisson Process . . . . . . . . . . . . . . . . . . 66
3.12 Submartingales . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4 Properties of Stochastic Processes 73

4.1 Stopping Times . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2 Stopping Theorem for Martingales . . . . . . . . . . . . . . . . 77
4.3 The First Passage of Time . . . . . . . . . . . . . . . . . . . . . 79
4.4 The Arc-sine Laws . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.5 More on Hitting Times . . . . . . . . . . . . . . . . . . . . . . . 88
4.6 The Inverse Laplace Transform Method . . . . . . . . . . . . . 91
4.7 The Theorems of Levy and Pitman . . . . . . . . . . . . . . . 98
4.8 Limits of Stochastic Processes . . . . . . . . . . . . . . . . . . . 104
4.9 Mean Square Convergence . . . . . . . . . . . . . . . . . . . . . 104
4.10 The Martingale Convergence Theorem . . . . . . . . . . . . . . 106
4.11 Quadratic Variation . . . . . . . . . . . . . . . . . . . . . . . . 108
4.11.1 The quadratic variation of Wt . . . . . . . . . . . . . . . 109
4.12 The Total Variation of Brownian Motion . . . . . . . . . . . . . 111
4.12.1 The quadratic variation of Nt t . . . . . . . . . . . . 112
5 Stochastic Integration 117

5.1 Nonanticipating Processes . . . . . . . . . . . . . . . . . . . . . 117
5.2 Increments of Brownian Motions . . . . . . . . . . . . . . . . . 118
5.3 The Ito Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
5.4 Examples of Ito Integrals . . . . . . . . . . . . . . . . . . . . . 120
5.4.1 The case Ft = c, constant . . . . . . . . . . . . . . . . . 120
5.4.2 The case Ft = Wt . . . . . . . . . . . . . . . . . . . . . 121
5.5 Properties of the Ito Integral . . . . . . . . . . . . . . . . . . . 122
5.6 The Wiener Integral . . . . . . . . . . . . . . . . . . . . . . . . 127
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page xiii
Contents xiii
5.7 Poisson Integration . . . . . . . . . . . . . . . . . . . . . . . . . 129

5.8 The case Ft = Mt . . . . . . . . . . . . . . . . . . . . . . . . . . 130
T
5.9 The Distribution Function of XT = 0 g(t) dNt . . . . . . . . . 136
6 Stochastic Dierentiation 139

6.1 Basic Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
6.2 Itos Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.2.1 Ito diusions . . . . . . . . . . . . . . . . . . . . . . . . 145
6.2.2 Itos formula for Poisson processes . . . . . . . . . . . . 146
6.2.3 Itos multidimensional formula . . . . . . . . . . . . . . 147
7 Stochastic Integration Techniques 149

7.1 Notational Conventions . . . . . . . . . . . . . . . . . . . . . . 149
7.2 Stochastic Integration by Parts . . . . . . . . . . . . . . . . . . 152
7.3 The Heat Equation Method . . . . . . . . . . . . . . . . . . . . 157
7.4 Table of Usual Stochastic Integrals . . . . . . . . . . . . . . . . 163
8 Stochastic Dierential Equations 165

8.1 Denitions and Examples . . . . . . . . . . . . . . . . . . . . . 165
8.2 The Integration Technique . . . . . . . . . . . . . . . . . . . . . 167
8.3 Exact Stochastic Equations . . . . . . . . . . . . . . . . . . . . 172
8.4 Integration by Inspection . . . . . . . . . . . . . . . . . . . . . 175
8.5 Linear Stochastic Dierential Equations . . . . . . . . . . . . . 178
8.6 Stochastic Equations with respect to a Poisson Process . . . . . 183
8.7 The Method of Variation of Parameters . . . . . . . . . . . . . 184
8.8 Integrating Factors . . . . . . . . . . . . . . . . . . . . . . . . . 187
8.9 Existence and Uniqueness . . . . . . . . . . . . . . . . . . . . . 189
8.10 Finding Mean and Variance . . . . . . . . . . . . . . . . . . . . 191
9 Applications of Brownian Motion 199

9.1 The Directional Derivative . . . . . . . . . . . . . . . . . . . . . 199
9.2 The Generator of an Ito Diusion . . . . . . . . . . . . . . . . . 200
9.3 Dynkins Formula . . . . . . . . . . . . . . . . . . . . . . . . . . 203
9.4 Kolmogorovs Backward Equation . . . . . . . . . . . . . . . . 204
9.5 Exit Time from an Interval . . . . . . . . . . . . . . . . . . . . 205
9.6 Transience and Recurrence of Brownian Motion . . . . . . . . . 206
9.7 Application to Parabolic Equations . . . . . . . . . . . . . . . . 211
9.7.1 Deterministic characteristics . . . . . . . . . . . . . . . . 211
9.7.2 Stochastic characteristics . . . . . . . . . . . . . . . . . 212
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page xiv
xiv An Informal Introduction to Stochastic Calculus with Applications
10 Girsanovs Theorem and Brownian Motion 215

10.1 Examples of Martingales . . . . . . . . . . . . . . . . . . . . . . 215
10.2 How to Recognize a Brownian Motion . . . . . . . . . . . . . . 221
10.3 Time Change for Martingales . . . . . . . . . . . . . . . . . . . 225
10.4 Girsanovs Theorem . . . . . . . . . . . . . . . . . . . . . . . . 232
11 Some Applications of Stochastic Calculus 241

11.1 White Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
11.2 Stochastic Kinematics . . . . . . . . . . . . . . . . . . . . . . . 244
11.3 Radioactive Decay . . . . . . . . . . . . . . . . . . . . . . . . . 247
11.4 Noisy Pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . 250
11.5 Stochastic Population Growth . . . . . . . . . . . . . . . . . . 255
11.6 Pricing Zero-coupon Bonds . . . . . . . . . . . . . . . . . . . . 261
11.7 Finding the Cholesterol Level . . . . . . . . . . . . . . . . . . . 262
11.8 Photons Escape Time from the Sun . . . . . . . . . . . . . . . 263
11.9 Filtering Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 265
12 Hints and Solutions 277
Bibliography 309
Index 313
May 15, 2015 14:45 BC: 9620 An Informal Introduction to Stochastic Calculus Driverbook page 1
Chapter 1
A Few Introductory Problems
Even if deterministic Calculus is an excellent tool for modeling real life prob-
lems, however, when it comes to random exterior inuences, Stochastic Cal-
culus is the one which can allow for a more accurate modeling of the problem.
In real life applications, involving trajectories, measurements, noisy signals,
etc., the eects of many unpredictable factors can be averaged out, via the
Central Limit Theorem, as a normal random variable. This is related to the
Brownian motion, which was introduced to model the irregular movements of
pollen grains in a liquid.
In the following we shall discuss a few problems involving random pertur-
bations, which serve as motivation for the study of the Stochastic Calculus
introduced in next chapters. We shall come back to some of these problems
and solve them partially or completely in Chapter 11.
1.1 Stochastic Population Growth Models

Exponential growth model Let P (t) denote the population at time t. In
the time interval t the population increases by the amount P (t) = P (t +
t)P (t). The classical model of population growth suggests that the relative
percentage increase in population is proportional with the time interval, i.e.
P (t)
= rt,
P (t)
where the constant r > 0 denotes the population growth. Allowing for in-
nitesimal time intervals, the aforementioned equation writes as
dP (t) = rP (t)dt.
This dierential equation has the solution P (t) = P0 ert , where P0 is the initial
population size. The evolution of the population is driven by its growth rate
1
2 An Informal Introduction to Stochastic Calculus with Applications
40 100
K
80
30
60
20
40
10
20
0 2 4 6 8 10 0 2 4 6 8 10 12
a b
Figure 1.1: (a) Noisy population with exponential growth. (b) Noisy population
with logistic growth.
r. In real life this rate is not constant. It might be a function of time t, or even
more general, it might oscillate irregularly around some deterministic average
function a(t):
rt = a(t) + noise.
In this case, rt becomes a random variable indexed over time t. The associated
equation becomes a stochastic dierential equation

dP (t) = a(t) + noise P (t)dt. (1.1.1)
Solving an equation of type (1.1.1) is a problem of Stochastic Calculus, see
Fig. 1.1(a).
Logistic growth model The previous exponential growth model allows the
population to increase indenitely. However, due to competition, limited space
and resources, the population will increase slower and slower. This model
was introduced by P.F. Verhust in 1832 and rediscovered by R. Pearl in the
twentieth century. The main assumption of the model is that the amount
of competition is proportional with the number of encounters between the
population members, which is proportional with the square of the population
size
dP (t) = rP (t)dt kP (t)2 dt. (1.1.2)
The solution is given by the logistic function
P0 K
P (t) = ,
P0 + (K P0 )ert
where K = r/k is the saturation level of the population. One of the stochastic
variants of equation (1.1.2) is given by
dP (t) = rP (t)dt kP (t)2 dt + (noise)P (t),
A Few Introductory Problems 3
where R is a measure of the size of the noise in the system. This equation is
used to model the growth of a population in a stochastic, crowded environment,
see Fig. 1.1(b).
1.2 Pricing Zero-coupon Bonds

A bond is a nancial instrument which pays back at the end of its lifetime, T ,
an amount equal to B, and provides some periodical payments, called coupons.
If the coupons are equal to zero, the bond is called a zero-coupon bond or a
discount bond. Using the time value of money, the price of a bond at time t
is B(t) = Bet(T t) , where r is the risk-free interest rate. The bond satises
the ordinary dierential equation
dB(t) = rB(t)dt
with the nal condition B(T ) = B. In a noisy market the constant interest
rate r is replaced by rt = r(t) + noise, a fact that makes the bond pricing
more complicated. This treatment can be achieved by Stochastic Calculus.
1.3 Noisy Pendulum

The free oscillations of a simple pendulum of unit mass can be described by

the nonlinear equation (t) = k2 sin (t), where (t) is the angle between
the string and the vertical direction. If the pendulum is moving under the
inuence of a time dependent exterior force F = F (t), then the equation of
+ k2 sin (t) = F (t). We
the pendulum with forced oscillations is given by (t)
may encounter the situation when the force is not deterministic and we have
F (t) = f (t) + (noise).
How does the noisy force inuence the deviation angle (t)? Stochastic Cal-
culus can be used to answer this question.
1.4 Diusion of Particles

Consider a owing uid with the velocity eld v(x). A particle that moves
with the uid has a trajectory (t) described by the equation (t) = v (t) .
A small particle, that is also subject to molecular
bombardments, will be

described by an equation of the type (t) = v (t) + (noise), where the
constant > 0 determines the size of the noise and controls the diusion of
the small particle in the uid.
Now consider a drop of ink (which is made out of a very large number of
tiny particles) left to diuse in a liquid. Each ink particle performs a noisy
trajectory in the liquid. Let p(x, t) represent the density of particles that
arrive about x at time t. After some diusion time, the darker regions of
the liquid represent the regions with higher density p(x, t), while the lighter
regions correspond to smaller density p(x, t). Knowing the density p(x, t)
provides control over the dynamics of the diusion process and can be used to
nd the probability that an ink particle reaches a certain region.
1.5 Cholesterol Level

The blood cholesterol level at time t is denoted by C(t). This depends on the
intaken food fat as well as organism absorption and individual production of
cholesterol. The rate of change of the cholesterol level is given by
dC(t)
= a C0 C(t) + bE,
dt
where C0 is the natural level of cholesterol and E denotes the daily rate of
intaken cholesterol; the constants a and b model the production and absorption
of cholesterol in the organism. The solution of this linear dierential equation
is
b
C(t) = C0 eat + C0 + E (1 eat ),
a
which in the long run tends to the saturation level of cholesterol C0 + ab E.
Due to either observation errors or variations in the intake amount of food,
the aforementioned equation will get the following noisy form
dC(t)
= a C0 C(t) + bE + noise.
dt
This equation can be explicitly solved using Stochastic Calculus. Furthermore,
we can also nd the probability that the cholesterol level is over the allowed
organism limit.
1.6 Electron Motion

Consider an electron situated at the initial distance x(0) from the origin,
which moves with a unit speed towards the origin. Its coordinate x(t) R3 is
supposed to satisfy the equation
dx(t) x(t)
=
dt |x(t)|
3
14
x0
12 2
10
1
6 3 2 1 1 2 3
4 xt 1
2 o 2
0 2 4 6 8 10 3
a b
Figure 1.2: (a) The trajectory of the electron x(t) tends towards the origin.
(b) White noise.
Like in the case of the pollen grain, whose motion is agitated by the neigh-
boring molecules, we assume that the electron is subject to bombardment by
some aether particles, which makes its movement unpredictable, with con-
stant tendency to go towards the origin, see Fig. 1.2 (a). Then its equation
becomes
dx(t) x(t)
= + noise
dt |x(t)|
This type of description of electrons is usually seen in stochastic mechanics.

This theory can be found in Fenyes [18] and Nelson [36].
1.7 White Noise
All aforementioned problems involved a noise inuence. This noise is di-

rectly related to the trajectory of a small particle which diuses in a liquid
due to the molecular bombardments (just consider the last example in the case
of a static uid, v = 0). This was observed rst time by Brown [8] in 1828,
and was called Brownian motion and it is customarily denoted by Bt . It has
a very irregular, continuous trajectory, which from the mathematical point
of view is nowhere dierentiable. A satisfactory explanation of the Brownian
motion was given by Einstein [17] in 1905. A dierent but likewise succesful
decription of the Brownian motion was done by Langevin [32] in 1908.
The adjective white comes from signal processing, where it refers to the
fact that the noise is completely unpredictable, i.e. it is not biased towards
any specic frequency,1 see Fig. 1.2. If Nt denotes the white noise at
time t, the trajectory (t) of a diused particle satises
(t) = Nt , t 0.
The solution depends on the Brownian motion starting at x, i.e (t) = x + Bt .

Therefore the white noise is the instantaneous rate of change of the Brownian
motion, and can be written informally as
dBt
Nt = .
dt
This looks contradictory, since Bt is not dierentiable. However, there is a
way of making sense of the previous formula by considering the derivative in
the following generalized sense:

Nt f (t) dt = Bt f (t) dt,
R R
for any compact supported, smooth function f . From this point of view, the
white noise Nt is a generalized function or a distribution. We shall get back to
the notion of white noise in section 11.1.
1.8 Bounded and Quadratic Variation

The graph of a C 1 -dierentiable function, dened on a compact interval, has
nite length. Unlike the case of dierentiable functions, trajectories of Brow-
nian motions, or other stochastic processes, are not of nite length. We may
say that to a certain extent, the role of the length in this case is played by
the quadratic variation. This is actually a measure of the roughness of the
process, see Fig. 1.3. This section will introduce these notions in an informal
way. We shall cover these topics in more detail later in sections 4.11 and 4.12.
Let f : [a, b] R be a continuously dierentiable function, and consider
the partition a = x0 < x1 < < xn = b of the interval [a, b]. A smooth curve
y = f (x), a x b, is rectiable (has length) if the sum of the lengths of the
line segments with vertices at P0 (x0 , f (x0 )), , Pn (xn , f (xn )) is bounded by
a given constant, which is independent of the number n and the choice of the
division points xi . Assuming the division is equidistant, x = (b a)/n, the
curve length becomes
1
The white light is an equal mixture of radiations of all visible frequencies.
14 14 14
12 12 12
10 10 10
8 8 8
6 6 6
4 4 4
2 2 2
0 2 4 6 8 10 0 2 4 6 8 10 0 2 4 6 8 10
a b c
Figure 1.3: (a) Smooth. (b) Rough. (c) Very rough.

n1
n1
= sup |Pk Pk+1 | = sup (x)2 + (f )2 .
xi xi
k=0 k=0
Furthermore, if f is continuously dierentiable, the computation can be con-

tinued as

f 2 b

n1
= lim 1+ x = 1 + f (x)2 dx,
n x a
k=0
where we used that the limit of an increasing sequence is equal to its superior
limit.
Denition 1.8.1 The function f (x) has bounded variation on the interval
[a, b] if for any division a = x0 < x1 < < xn = b the sum

n1
|f (xk+1 ) f (xk )|
k=0
is bounded above by a given constant.
The total variation of f on [a, b] is dened by

n1
V (f ) = sup |f (xk+1 ) f (xk )|. (1.8.3)
xi
k=0
The amount V (f ) measures in a certain sense the roughness of the function.

If f is a constant function, then V (f ) = 0. If f is a stair-type function, then
V (f ) is the sum of the absolute value of its jumps.
We note that if f is continuously dierentiable, then the total variation

can be written as an integral

n1
V (f ) = sup |f (xk+1 ) f (xk )|
xi
k=0

n1
|f (xk+1 ) f (xk )| b
= lim x = |f (x)| dx.
n xk+1 xk a
k=0
The next result states a relation between the length of the graph and the
total variation of a function f , which is not dierentiable.
Proposition 1.8.2 Let f : [a, b] R be a function. Then the graph y = f (x)

has length if and only if V (f ) < .
Proof: Consider the simplifying notations (f )k = f (xk+1 ) f (xk ) and

x = xk+1 xk . Taking the summation in the double inequality

|(f )k | (x)2 + |(f )k |2 x + |(f )k |
and then applying the sup yields
V (f ) (b a) + V (f ),
which implies the desired conclusion.

By virtue of the previous result, the functions with innite total variations
have graphs of innite lengths.
The following informal computation shows that the Brownian motion has
innite total variation (a real proof of this fact is given in section 4.12)

n1 b b
dBt
V (Bt ) = sup |Btk+1 Btk | = dt = |Nt | dt = ,
tk a dt a
k=0
since the area under the curve t |Nt | is innite.

We can try to model a ner roughness of the function using the quadratic
variation of f on the interval [a, b]

n1
(2)
V (f ) = sup |f (xk+1 ) f (xk )|2 . (1.8.4)
xi
k=0
where the sup is taken over all divisions a = x0 < x1 < < xn = b.
It is worth noting that if f has bounded total variation, V (f ) < , then

V (2) (f )
= 0. This comes from the following inequality
V (2) (f ) max |f (xk+1 ) f (xk )|V (f ) 0, as |x| 0.

xk
The total variation for the Brownian motion does not provide much in-
formation. It turns out that the correct measure for the roughness of the
Brownian motion is the quadratic variation

n1
V (2) (Bt ) = sup |B(tk+1 ) B(tk )|2 . (1.8.5)
ti
k=0
It will be shown that V (2) (Bt ) is equal to the time interval b a.

Chapter 2
Basic Notions
2.1 Probability Space

The modern theory of probability stems from the work of Kolmogorov [28],
published in 1933. Kolmogorov associates a random experiment with a prob-
ability space, which is a triplet, (, F, P ), consisting of the set of outcomes,
, a -eld, F, with Boolean algebra properties, and a probability measure,
P . In the following sections, each of these elements will be discussed in more
detail.
2.2 Sample Space

A random experiment in the theory of probability is an experiment whose out-
comes cannot be determined in advance. When an experiment is performed,
all possible outcomes form a set called the sample space, which will be denoted
by .
For instance, ipping a coin produces the sample space with two states
= {H, T }, while rolling a die yields a sample space with six states =
{1, , 6}. Choosing randomly a number between 0 and 1 corresponds to a
sample space, which is the entire segment = (0, 1).
In nancial markets one can regard as the states of the world, by this,
we mean all possible states the world might have. The number of states of the
world that aect the stock market is huge. These would contain all possible
values for the vector parameters that describe the world, which is practically
innite.
All subsets of the sample space form a set denoted by 2 . The reason
for this notation is that the set of parts of can be put into a bijective
correspondence with the set of binary functions f : {0, 1}. The number
of elements of this set is 2|| , where || denotes the cardinal of . If the set is
11
nite, || = n, then 2 has 2n elements. If is innitely countable (i.e. can

be put into a bijective correspondence with the set of natural numbers), then
2|| is innite and its cardinal is the same as that of the real number set R.
Remark 2.2.1 Pick a natural number at random. Any subset of the sample
space corresponds to a sequence formed with 0 and 1. For instance, the subset
{1, 3, 5, 6} corresponds to the sequence 10101100000 . . . having 1 on the 1st,
3rd, 5th and 6th places and 0 in rest. It is known that the number of these
sequences is innite and can be put into a bijective correspondence with the
real number set R. This can also be written as |2N | = |R|, and stated by saying
that the set of all subsets of natural numbers N has the same cardinal as the
real numbers set R.
2.3 Events and Probability

The set of parts 2 satises the following properties:
1. It contains the empty set ;
2. If it contains a set A, then it also contains its complement A = \A;
3. It is closed with regard to unions, i.e., if A1 , A2 , . . . is a sequence of sets,

then their union A1 A2 also belongs to 2 .
Any subset F of 2 that satises the previous three properties is called a -

eld. The sets belonging to F are called events. This way, the complement of
an event, or the union of events is also an event. We say that an event occurs
if the outcome of the experiment is an element of that subset.
The chance of occurrence of an event is measured by a probability function
P : F [0, 1] which satises the following two properties:
1. P () = 1;
2. For any mutually disjoint events A1 , A2 , F,
P (A1 A2 ) = P (A1 ) + P (A2 ) + .
The triplet (, F, P ) is called a probability space. This is the main setup

in which the probability theory works.
Example 2.3.1 In the case of a coin ipping, the probability space has the
following elements: = {H, T }, F = {, {H}, {T }, {H, T }} and P is dened
by P () = 0, P ({H}) = 12 , P ({T }) = 12 , P ({H, T }) = 1.
Basic Notions 13

Figure 2.1: If any set X 1 (a, b) is known, then the random variable X :
R is 2 -measurable.
Example 2.3.2 Consider a nite sample space = {s1 , . . . , sn }, with the -

eld F = 2 , and probability given by P (A) = |A|/n, A F. Then (, F, P )
is a probability space.
Example 2.3.3 Let = [0, 1] and consider the -eld B([0, 1]) given by the
set of all open or closed intervals on [0, 1], or any unions, intersections, and
complementary sets. Dene P (A) = (A), where stands for the Lebesgue
measure (in particular, if A = (a, b), then P (A) = b a is the length of the
interval). It can be shown that (, B([0, 1]), P ) is a probability space.
2.4 Random Variables

Since the -eld F provides the knowledge about which events are possible on
the considered probability space, then F can be regarded as the information
component of the probability space (, F, P ). A random variable X is a
function that assigns a numerical value to each state of the world, X : R,
such that the values taken by X are known to someone who has access to the
information F. More precisely, given any two numbers a, b R, then all the
states of the world for which X takes values between a and b forms a set that
is an event (an element of F), i.e.
{ ; a < X() < b} F.
Another way of saying this is that X is an F-measurable function.

Example 2.4.1 Let X() be the number of people who want to buy houses,
given the state of the market . Is X measurable? This would mean that
given two numbers, say a = 10, 000 and b = 50, 000, we know all the market
situations for which there are at least 10, 000 and at most 50, 000 people
willing to purchase houses. Many times, in theory, it makes sense to assume
that we have enough knowledge so that we can assume X is measurable.
Example 2.4.2 Consider the experiment of ipping three coins. In this case
is the set of all possible triplets, which can be made with H and T . Consider
the random variable X which gives the number of tails obtained. For instance
X(HHH) = 0, X(HHT ) = 1, etc. The sets
{; X() = 0} = {HHH}, {; X() = 1} = {HHT, HT H, T HH},

{; X() = 3} = {T T T }, {; X() = 2} = {HT T, T HT, T T H}
belong to 2 , and hence X is a random variable.
2.5 Integration in Probability Measure

The notion of expectation is based on integration on measure spaces. In this
section we recall briey the denition of an integral with respect to the prob-
ability measure P . For more insight on measurable functions and integration
theory the reader is referred to the classical text of Halmos [21].
Let X : R be a random variable on the probability space (, F, P ). A
partition (i )1in of is a family of subsets i , with i F, satisfying
1. i j = , for i = j;
n
2. i = .
i
Each i is an event and its associated probability is P (i ). Consider the
1, if A
characteristic function of a set A dened by A () = .
0, if
/A
More properties of A can be found in Exercise 2.12.9. The integral will be
dened in the following three steps:
(i) A simple function is a sum of characteristic functions f = ni ci i ,
ci R. This means f () = ck for k . The integral of the simple function
f is dened by
n
f dP = ci P (i ).
i
(ii) If X : R is a random variable, then from the measure theory

it is known that there is a sequence of simple functions (fn )n1 satisfying
Basic Notions 15
limn fn () = X(). Furthermore, if X 0, then we may assume that

fn fn+1 . Then we dene

X dP = lim fn dP.
n
(iii) If X is not non-negative, we can write X = X + X with X + =

sup{X, 0} 0 and X = sup{X, 0} 0. Dene

+
X dP = lim X dP lim X dP,
n n
where we assume that at least one of the integrals is nite.

From now on, the integral notations X dP or X() dP () will be used
interchangeably. In the rest of the chapter the integral notation will be used
informally, without requiring a direct use of the previous denition.
Two widely used properties of the integral dened above are:
Linearity: For any two random variables X and Y and a, b R

(aX + bY ) dP = a X dP + b Y dP ;

Positivity: If X 0 then

X dP 0.

2.6 Two Convergence Theorems

During future computations we shall often need to swap the limit symbol with
the integral. There are two basic measure theory results that allow doing this.
We shall state these results below and use them whenever needed.
Theorem 2.6.1 (The monotone convergence theorem) Let (, F, P ) be
a probability space and (fn )n1 a sequence of measurable functions, fn :
[0, ) such that:
(i) 0 fk () fk+1 (), , k 1;
(ii) the sequence is pointwise convergent
f () = lim fn (), .
n
Then
(1) f is measurable;

(2) lim fn dP = f dP.
n
Theorem 2.6.2 (The dominated convergence theorem) Let (, F, P ) be

a probability space and (fn )n1 a sequence of measurable functions, fn :
R. Assume that:
(i) (fn ) is pointwise convergent
f () = lim fn (), ;
n

(ii) there is an integrable function g, (i.e. |g| dP < ) such that
|fn ()| g(), , n 1.
Then f is integrable and

lim fn dP = f dP.
n
2.7 Distribution Functions

Let X be a random variable on the probability space (, F, P ). The distribu-
tion function of X is the function FX : R [0, 1] dened by
FX (x) = P (; X() x).
It is worth observing that since X is a random variable, then the set {; X()
x} belongs to the information set F.
The distribution function is non-decreasing and satises the limits
lim FX (x) = 0, lim FX (x) = 1.

x x+
If we have
d
F (x) = p(x),
dx X
then we say that p(x) is the probability density function of X.
It is important to note the following relation among distribution function,
probability and probability density function of the random variable X
x
FX (x) = P (X x) = dP () = p(u) du. (2.7.1)
{Xx}
The probability density function p(x) has the following properties:

(i) p(x) 0

(ii) p(u) du = 1.
Basic Notions 17
The rst one is a consequence of the fact that the distribution function
FX (x) is non-decreasing. The second follows from (2.7.1) by making x

p(u) du = dP = P () = 1.

As an extension of formula (2.7.1) we have for any F-measurable function

h
h(X()) dP () = h(x)p(x) dx. (2.7.2)
R
Another useful propety, which follows from the Fundamental Theorem of

Calculus is
b
P (a < X < b) = P (; a < X() < b) = p(x) dx.
a
In the case of discrete random variables the aforementioned integral is replaced

by the following sum

P (a < X < b) = P (X = x).
a<x<b
For more details the reader is referred to a traditional probability book, such
as Wackerly et al. [13].
2.8 Independence
Roughly speaking, two random variables X and Y are independent if the
occurrence of one of them does not change the probability density of the other.
More precisely, if for any two open intervals A, B R, the events
E = {; X() A}, F = {; Y () B}
are independent, i.e., P (E F ) = P (E)P (F ), then X and Y are called inde-

pendent random variables.
Proposition 2.8.1 Let X and Y be independent random variables with den-

sity functions pX (x) and pY (y). Then the joint density function of (X, Y ) is
given by pX,Y (x, y) = pX (x) pY (y).
Proof: Using the independence of sets, we have1
pX,Y (x, y) dxdy = P (x < X < x + dx, y < Y < y + dy)

= P (x < X < x + dx)P (y < Y < y + dy)
= pX (x) dx pY (y) dy
= pX (x)pY (y) dxdy.
Dropping the factor dxdy yields the desired result. We note that the converse
also holds true.
The -algebra generated by a random variable X : R is the -algebra
generated by the unions, intersections and complements of events of the form
{; X() (a, b)}, with a < b real numbers. This will be denoted by AX .
Two -elds G and H included in F are called independent if
P (G H) = P (G)P (H), G G, H H.
The random variable X and the -eld G are called independent if the
algebras AX and G are independent.
2.9 Expectation
A random variable X : R is called integrable if

|X()| dP () = |x|p(x) dx < ,
R
where p(x) denotes the probability density function of X. The previous iden-
tity is based on changing the domain of integration from to R.
The expectation of an integrable random variable X is dened by

E[X] = X() dP () = x p(x) dx.
R
Customarily, the expectation of X is denoted by and is called the mean. In

general, for any measurable function h : R R, we have

E[h(X)] = h X() dP () = h(x)p(x) dx.
R
In the case of a discrete random variable X the expectation is dened as

E[X] = xk P (X = xk ).
k1
1
x+dx
We are using the useful approximation P (x < X < x + dx) = x
p(u) du = p(x)dx.
Basic Notions 19
Proposition 2.9.1 The expectation operator E is linear, i.e. for any inte-
grable random variables X and Y
1. E[cX] = cE[X], c R;
2. E[X + Y ] = E[X] + E[Y ].
Proof: It follows from the fact that the integral is a linear operator.
Proposition 2.9.2 Let X and Y be two independent integrable random vari-

ables. Then
E[XY ] = E[X]E[Y ].
Proof: This is a variant of Fubinis theorem, which in this case states that a
double integral is a product of two simple integrals. Let pX , pY , pX,Y denote
the probability densities of X, Y and (X, Y ), respectively. Since X and Y are
independent, by Proposition 2.8.1 we have

E[XY ] = xypX,Y (x, y) dxdy = xpX (x) dx ypY (y) dy = E[X]E[Y ].
Denition 2.9.3 The covariance of two random variables is dened by
Cov(X, Y ) = E[XY ] E[X]E[Y ].
The variance of X is given by
V ar(X) = Cov(X, X).
Proposition 2.9.2 states that if X and Y are independent, then Cov(X, Y ) =

0. It is worth to note that the converse in not necessarily true, see Exercise
2.9.6. However, the converse holds true if both X and Y are assumed normally
distributed.
Exercise 2.9.4 Show that

(a) Cov(X, Y ) = E[(X X )(Y Y )], where X = E[X] and Y = E[Y ];
(b) V ar(X) = E[(X X )2 ];
From Exercise 2.9.4 (b), we have V ar(X) 0, so, there is a real number
> 0 such that V ar(X) = 2 . The number is called standard deviation.
Exercise 2.9.5 Let and denote the mean and the standard deviation of
the random variable X. Show that
E[X 2 ] = 2 + 2 .
Exercise 2.9.6 Consider two random variables with the following table of
joint probabilities:
Y \X 1 0 1
1 1/16 3/16 1/16
0 3/16 0 3/16
1 1/16 3/16 1/16
Show the following:
(a) E[X] = E[Y ] = E[XY ] = 0;
(b) Cov(X, Y ) = 0;
(c) P (0, 0) = PX (0)PY (0);
(d) X and Y are not independent.
The covariance can be standardized in the following way. Let X and Y be

the standard deviations of X and Y , respectively. The correlation coecient
of X and Y is dened as
Cov(X, Y )
(X, Y ) = .
X Y
Exercise 2.9.7 (a) Prove that for any random variables A and B we have
E[AB]2 E[A2 ]E[B 2 ].
(b) Use part (a) to show that for any random variables X and Y we have
1 (X, Y ) 1.
(c) What can you say about the random variables X and Y if (X, Y ) = 1?
2.9.1 The best approximation of a random variable

Let X be a random variable. We would like to approximate X by a single
(nonrandom) number x. The best value of x is chosen in the sense of the
least squares, i.e. x is picked such that the expectation of the error square
(X x)2 is minimum. Denote = E[X] and 2 = V ar(X). Since
E[(X x)2 ] = E[X 2 ] 2xE[X] + x2

= 2 + 2 2x + x2
= 2 + (x )2 ,
Basic Notions 21
the minimum is obtained for x = , and in this case
min E[(X x)2 ] = 2 .

x
It follows that the mean, , is the best approximation of the random variable
X in the least squares sense.
2.9.2 Change of measure in an expectation

Let P, Q : F R be two probability measures on , such that there is an
integrable random variable f : R, such that dQ = f dP . This means

Q(A) = dQ = f () dP (), A F.
A A
Denote by EP and EQ the expectations with respect to the measures P and

Q, respectively. Then we have

E [X] =
Q
X() dQ() = X()f () dP () = EP [f X].

Exercise 2.9.8 Let g : [0, 1] [0, ) be a integrable function with

1
g(x) dx = 1.
0

Consider Q : B([0, 1]) R, given by Q(A) = g(x) dx. Show that Q is a
A
probability measure on ( = [0, 1], B([0, 1])).
2.10 Basic Distributions

We shall recall a few basic distributions, which are most often seen in appli-
cations.
Normal distribution A random variable X is said to have a normal distri-
bution if its probability density function is given by
1 2 2
p(x) = e(x) /(2 ) ,
2
with and > 0 constant parameters, see Fig. 2.2(a). The mean and variance
are given by
E[X] = , V ar[X] = 2 .
If X has a normal distribution with mean and variance 2 , we shall write
X N (, 2 ).
Exercise 2.10.1 Let , R. Show that if X is normal distributed, with

X N (, 2 ), then Y = X + is also normal distributed, with
Y N ( + , 2 2 ).
Log-normal distribution Let X be normally distributed with mean and

variance 2 . Then the random variable Y = eX is said to be log-normal
distributed. The mean and variance of Y are given by
2
E[Y ] = e+ 2
2 2
V ar[Y ] = e2+ (e 1).
The density function of the log-normal distributed random variable Y is given

by
1 (ln x)2
p(x) = e 22 , x > 0,
x 2
see Fig. 2.2(b).
Denition 2.10.2 The moment generating

function of a random variable X
is the function mX (t) = E[etX ] = etx p(x) dx, where p(x) is the probability
density function of X, provided the integral exists.
The name comes from the fact that the nth moments of X, given by n =
E[X n ], are generated by the derivatives of mX (t)
dn mX (t)
= n .
dtn |t=0
It is worth noting the relation between the Laplace transform
and the mo-
ment generating function, in the case x 0, L(p(x))(t) = etx p(x) dx =
0
mX (t).
Exercise 2.10.3 Find the moment generating function for the exponential
distribution p(x) = ex , x 0, > 0.
Exercise 2.10.4 Show that if X and Y are two independent random vari-
ables, then mX+Y (t) = mX (t)mY (t).
Exercise 2.10.5 Given that the moment generating function of a normally

2 2
distributed random variable X N (, 2 ) is m(t) = E[etX ] = et+t /2 , show
that
Basic Notions 23
0.4
0.5
0.3 0.4
0.3
0.2
0.2
0.1
0.1
4 2 0 2 4 0 2 4 6 8
a b
3.5
3, 9 8, 3
3.0
0.20
3, 2 2.5
0.15
2.0
4, 3
0.10 1.5
1.0
0.05
0.5
0 5 10 15 20 0.0 0.2 0.4 0.6 0.8 1.0
c d
Figure 2.2: (a) Normal distribution. (b) Log-normal distribution. (c) Gamma
distributions. (d) Beta distributions.
2 2 /2
(a) E[Y n ] = en+n , where Y = eX .
(b) Show that the mean and variance of the log-normal random variable
Y = eX are
2 2 2
E[Y ] = e+ /2 , V ar[Y ] = e2+ (e 1).
Gamma distribution A random variable X is said to have a gamma distri-

bution with parameters > 0, > 0 if its density function is given by
x1 ex/
p(x) = , x 0,
()
where () denotes the gamma function

() = y 1 ey dy.
0
It is worth noting that for = n, integer, we have (n) = (n 1)!. The

gamma distribution is provided in Fig. 2.2(c). The mean and variance are
given by
E[X] = , V ar[X] = 2 .
The case = 1 is known as the exponential distribution, see Fig. 2.3(a). In

this case
1
p(x) = ex/ , x 0.

The particular case when = n/2 and = 2 becomes the 2 distribution
with n degrees of freedom. This characterizes also a sum of n independent
standard normal distributions.
Beta distribution A random variable X is said to have a beta distribution
with parameters > 0, > 0 if its probability density function is of the form
x1 (1 x)1
p(x) = , 0 x 1,
B(, )
where B(, ) denotes the beta function.2 See see Fig. 2.2(d) for two partic-
ular density functions. In this case

E[X] = , V ar[X] = .
+ ( + )2 ( + + 1)
Poisson distribution A discrete random variable X is said to have a Poisson

probability distribution if
k
P (X = k) = e , k = 0, 1, 2, . . . ,
k!
with > 0 parameter, see Fig. 2.3(b). In this case E[X] = and V ar[X] = .
Pearson 5 distribution Let , > 0. A random variable X with the density
function
1 e/x
p(x) = , x0
() (x/)+1
is said to have a Pearson 5 distribution3 with positive parameters and . It
can be shown that

2
, if > 1 , if > 2
E[X] = 1 V ar(X) = ( 1)2 ( 2)
,
otherwise, , otherwise.
2 ()()
Two denition formulas for the beta functions are B(, ) = (+)
and B(, ) =
1
0
y (1 y)
1 1
dy.
3
The Pearson family of distributions was designed by Pearson between 1890 and 1895.
There are several Pearson distributions, this one being distinguished by the number 5.
Basic Notions 25
0.10
0.35
0.08
15, 0 k 30
0.30
0.25
0.06
0.20
3
0.15 0.04
0.10
0.02
0.05
0 2 4 6 8 10 5 10 15 20 25 30
a b
Figure 2.3: (a) Exponential distribution. (b) Poisson distribution.

The mode of this distribution is equal to .
+1
The Inverse Gaussian distribution Let , > 0. A random variable X
has an inverse Gaussian distribution with parameters and if its density
function is given by
2
(x)
p(x) = e 22 x , x > 0. (2.10.3)
2x3
We shall write X IG(, ). Its mean, variance and mode are given by

3 92 3
E[X] = , V ar(X) = , M ode(X) = 1+ ,
42 2
where the mode denotes the value xm for which p(x) is maximum, i.e., p(x0 ) =
maxx p(x). This distribution will be used to model the time instance when a
Brownian motion with drift exceeds a certain barrier for the rst time.
2.11 Sums of Random Variables

Let X be a positive random variable with probability density f . We note rst
that for any s > 0

sX
E[e ]= esx f (x) dx = L(f (x))(s), (2.11.4)
0
where L denotes the Laplace transform.

The following result provides the relation between the convolution and the
probability density of a sum of two random variables.
Theorem 2.11.1 Let X and Y be two positive, independent random variables

with probability densities f and g. Let h be the probability density of the sum
X + Y . Then
t t
h(t) = (f g)(t) = f (t )g( ) d = f ( )g(t ) d.
0 0
Proof: Since X and Y are independent, we have
E[es(X+Y ) ] = E[esX ] E[esY ].
Using (2.11.4) this can be written in terms of Laplace transforms as
L(h)(s) = L(f )(s)L(g)(s).
Using Exercise 2.11.2, the density h can be written as the desired convolution
h(t) = (f g)(t).
Exercise 2.11.2 If F (s) = L(f (t))(s), G(s) = L(g(t))(s) both exist for s >
a 0, then
H(s) = F (s)G(s) = L(h(t))(s),
for
t t
h(t) = (f g)(t) = f (t )g( ) d = f ( )g(t ) d.
0 0
Using the associativity of the convolution
(f g) k = f (g k) = f g k
we obtain that if f , g and k are the probability densities of the positive,
independent random variables X, Y and Z, respectively, then f g k is the
probability density of the sum X + Y + Z. The aforementioned result can be
easily extended to the sum of n random variables.
Example 2.11.3 Consider two independent, exponentially distributed ran-
dom variables X and Y . We shall investigate the distribution of the sum
X + Y . Consider f (t) = g(t) = et in Theorem 2.11.1 and obtain the
probability density of the sum
t
h(t) = e(t ) e d = 2 tet , t 0,
0
which is Gamma distributed, with parameters = 2 and = 1/.
Exercise 2.11.4 Consider the independent, exponentially distributed random
variables X 1 e1 t and Y 2 e2 t , with 1 = 2 . Show that the sum is
distributed as
1 2
X +Y (e2 t e1 t ), t 0.
1 2
Basic Notions 27
Figure 2.4: The orthogonal projection of the random variable X on the space
SG is the conditional expectation Y = E[X|G].
2.12 Conditional Expectations

Let X be a random variable on the probability space (, F, P ), and G be a
-eld contained in F. Since X is F-measurable, the expectation of X, given
the information F must be X itself, a fact that can be written as E[X|F] = X
(for details see Example 2.12.5).
On the other hand, the information G does not completely determine X.
The random variable that makes a prediction for X based on the information
G is denoted by E[X|G], and is called the conditional expectation of X given
G. This is dened as the random variable Y = E[X|G], which is the best
approximation of X in the least squares sense, i.e.
E[(X Y )2 ] E[(X Z)2 ], (2.12.5)
for any G-measurable random variable Z.

The set of all square integrable random variables on forms a Hilbert
space with the inner product
X, Y = E[XY ],
see Exercise 2.12.10. This denes the norm X2 = E[X 2 ], which induces the
distance d(X, Y ) = X Y . Denote by SG the set of all G-measurable random
variables on . We shall show that the element of SG that is the closest to
X in the aforementioned distance is the conditional expectation Y = E[X|G],
see Fig. 2.4. Let X denote the orthogonal projection of X on the space SG .
This satises
E[(X X )(Z X )] = 0, Z SG . (2.12.6)
The Pythagorean relation
X X 2 + Z X 2 = X Z2 , Z SG
implies the inequality
X X 2 X Z2 , Z SG ,
which is equivalent to
E[(X X )2 ] E[(X Z)2 ], Z SG ,
which yields X = E[X|G], so the conditional expectation is the orthogonal

projection of X on the space SG . The uniqueness of this projection is a con-
sequence of the Pythagorean relation. The orthogonality relation (2.12.6) can
be written equivalently as
E[(X Y )U ] = 0, U SG .
Therefore, the conditional expectation Y satises the identity
E[XU ] = E[Y U ], U SG .
In particular, if we choose U = A , the characteristic function of a set A G,

then the foregoing relation yields

X dP = Y dP, A G.
A A
We arrive at the following equivalent characterization of the conditional ex-

pectations.
The conditional expectation of X given G is a random variable satisfying:
1. E[X|G] is G-measurable;

2. A E[X|G] dP = A X dP, A G.
Exercise 2.12.1 Consider the probability space (, F, P ), and let G be a -

eld included in F. If X is a G-measurable random variable such that

X dP = 0 A G,
A
then X = 0 a.s.
It is worth mentioning here an equivalent famous result, which relates to

conditional expectations:
Basic Notions 29
Theorem 2.12.2 (Radon-Nikodym) Let (, F, P ) be a probability space

and G be a -eld included in F. Then for any random variable X there is a
G-measurable random variable Y such that

X dP = Y dP, A G. (2.12.7)
A A
Radon-Nikodyms theorem states the existence of Y . In fact this is unique

almost surely by the application of Exercise 2.12.1.
Example 2.12.3 Show that if G = {, }, then E[X|G] = E[X].
Proof: We need to show that E[X] satises conditions 1 and 2. The rst one
is obviously satised since any constant is G-measurable. The latter condition
is checked on each set of G. We have

X dP = E[X] = E[X] dP = E[X]dP

X dP = E[X]dP.

Example 2.12.4 Show that E[E[X|G]] = E[X], i.e. all conditional expecta-
tions have the same mean, which is the mean of X.
Proof: Using the denition of expectation and taking A = in the second

relation of the aforementioned denition, yields

E[E[X|G]] = E[X|G] dP = XdP = E[X],

which ends the proof.
Example 2.12.5 The conditional expectation of X given the total informa-

tion F is the random variable X itself, i.e.
E[X|F] = X.
Proof: The random variables X and E[X|F] are both F-measurable (from
the denition of the random variable). From the denition of the conditional
expectation we have

E[X|F] dP = X dP, A F.
A A
Exercise 2.12.1 implies that E[X|F] = X almost surely.

General properties of the conditional expectation are stated below without
proof. The proof involves more or less simple manipulations of integrals and
can be taken as an exercise for the reader.
Proposition 2.12.6 Let X and Y be two random variables on the probability

space (, F, P ). We have
1. Linearity:
E[aX + bY |G] = aE[X|G] + bE[Y |G], a, b R;
2. Factoring out the measurable part:
E[XY |G] = XE[Y |G]
if X is G-measurable. In particular, E[X|G] = X.

3. Tower property (the least information wins):
E[E[X|G]|H] = E[E[X|H]|G] = E[X|H], if H G;
4. Positivity:
E[X|G] 0, if X 0;
5. Expectation of a constant is a constant:
E[c|G] = c.
6. An independent condition drops out:
E[X|G] = E[X],
if X is independent of G.
Exercise 2.12.7 Prove the property 3 (tower property) given in the previous
proposition.
Exercise 2.12.8 Toss a fair coin 4 times. Each toss yields either H (heads)
or T (tails) with equal probability.
(a) How many elements does the sample space have?
(b) Consider the events A = {Two of the 4 tosses are H}, B = {The rst
toss is H}, and C = {3 of the 4 tosses are H}. Compute P (A), P (B) and
P (C).
(c) Compute P (A B) and P (B C).
(d) Are the events A and B independent?
(e) Are the events B and C independent? Find P (B|C).
Basic Notions 31
(f ) Consider the following information sets (-algebras)
F = {we know the outcomes of the rst two tosses}
G = {we know the outcomes of the tosses but not the order}.
How can you state in words the information set F G?

(g) Prove or disprove: (i) A G, (ii) B F, and (iii) C G.
(h)Dene the random variables
X = number of H number of T
Y = number of T before the rst H.
Show that X is G-measurable while Y is not G-measurable.

(i) Find the expectations E[X], E[Y ] and E[X|G].
Exercise 2.12.9 Let X be a random variable on the probability space (, F, P ),

which is independent of the -eld G F. Consider the characteristic function
1, if A
of a set A dened by A () =
0, if
/ A.
Show the following:
(a) A is G-measurable for any A G;
(b) P (A) = E[A ];
(c) X and A are independent random variables;
(d) E[A X] = E[X]P (A) for any A G;
(e) E[X|G] = E[X].
Exercise 2.12.10 Let L2 (, F, P ) be the space of square integrable random

variables on the probability space (, F, P ). Dene the following scalar product
on L2 (, F, P )
X, Y = E[XY ].
(a) Show that L2 (, F, P ) becomes a Hilbert space;

(b) Show that if is a random variable in L2 (, F, P ) and G is a -eld
contained in F, then E[|G] is the orthogonal projection of onto the subspace
of L2 (, F, P ) consisting of G-measurable random variables.
Figure 2.5: Jensens inequality (E[X]) < E[(X)] for a convex function .
2.13 Inequalities of Random Variables

This section prepares the reader for the limits of sequences of random variables
and limits of stochastic processes. We recall rst that an innite dierentiable
function f (x), has a Taylor series at a if
f (a) f (a) f (a)

f (x) = f (a) + (x a) + (x a)2 + (x a)3 + . . . ,
1! 2! 3!
where x belongs to an interval neighborhood of a.
Exercise 2.13.1 Let f (x) be a function that is n + 1 times dierentiable on

an interval I, containing a. Show that there is a I such that for any x I
f (a) f (n) (a) f (n+1) (a)

f (x) = f (a) + (x a) + + (x a)n + ( a)n+1 .
1! n! (n + 1)!
We shall start with a classical inequality result regarding expectations.
Theorem 2.13.2 (Jensens inequality) Let : R R be a convex func-

tion and let X be an integrable random variable on the probability space (, F, P ).
If (X) is integrable, then
(E[X]) E[(X)].
Proof: We shall assume twice dierentiable with continuous. Let =

E[X]. Expand in a Taylor series about , see Exercise 2.13.1, and get
1
(x) = () + ()(x ) + ()( )2 ,
2
Basic Notions 33
with in between x and . Since is convex, 0, and hence
(x) () + ()(x ),

which means the graph of (x) is above the tangent line at x, (x) . Replac-
ing x by the random variable X, and taking the expectation yields
E[(X)] E[() + ()(X )] = () + ()(E[X] )

= () = (E[X]),
which proves the result.

Fig. 2.5 provides a graphical interpretation of Jensens inequality. If the
distribution of X is symmetric, then the distribution of (X) is skewed, with
(E[X]) < E[(X)].
It is worth noting that the inequality is reversed for concave. We shall
present next a couple of applications.
A random variable X : R is called square integrable if

E[X 2 ] = |X()|2 dP () = x2 p(x) dx < .
R
Application 2.13.3 If X is a square integrable random variable, then it is

integrable.
Proof: Jensens inequality with (x) = x2 becomes
E[X]2 E[X 2 ].
Since the right side is nite, it follows that E[X] < , so X is integrable.
Application 2.13.4 If mX (t) denotes the moment generating function of the

random variable X with mean , then
mX (t) et .
Proof: Applying Jensens inequality with the convex function (x) = ex

yields
eE[X] E[eX ].
Substituting tX for X implies that
eE[tX] E[etX ]. (2.13.8)

Using the denition of the moment generating function mX (t) = E[etX ] and
that E[tX] = tE[X] = t, then (2.13.8) leads to the desired inequality.
The variance of a square integrable random variable X is dened by
V ar(X) = E[X 2 ] E[X]2 .
By Application 2.13.3 we have V ar(X) 0, so that there is a constant X > 0,

called standard deviation, such that
2
X = V ar(X).
Exercise 2.13.5 Prove the following identity:
V ar[X] = E[(X E[X])2 ].
Exercise 2.13.6 Prove that a non-constant random variable has a nonzero

standard deviation.
Exercise 2.13.7 Prove the following extension of Jensens inequality: If is

a convex function, then for any -eld G F we have
(E[X|G]) E[(X)|G].
Exercise 2.13.8 Show the following:

(a) |E[X]| E[|X|];
(b) |E[X|G]| E[|X| |G], for any -eld G F;
(c) |E[X]|r E[|X|r ], for r 1;
(d) |E[X|G]|r E[|X|r |G], for any -eld G F and r 1.
Theorem 2.13.9 (Markovs inequality) For any , p > 0, we have the

following inequality:
1
P (; |X()| ) E[|X|p ].
p
Proof: Let A = {; |X()| }. Then

E[|X| ] =
p
|X()| dP ()
p
|X()| dP ()
p
p dP ()

A A
= p
dP () = P (A) = P (|X| ).
p p
A
Dividing by p leads to the desired result.

Basic Notions 35
Theorem 2.13.10 (Tchebychevs inequality) If X is a random variable

with mean and variance 2 , then
2
P (; |X() | ) .
2
Proof: Let A = {; |X() | }. Then

= V ar(X) = E[(X ) ] = (X ) dP (X )2 dP
2 2 2

A
2 dP = 2 P (A) = 2 P (; |X() | ).
A
Dividing by 2 leads to the desired inequality.

The next result deals with exponentially decreasing bounds on tail distri-
butions.
Theorem 2.13.11 (Cherno bounds) Let X be a random variable. Then

for any > 0 we have
E[etX ]
1. P (X ) , t > 0;
et
E[etX ]
2. P (X ) , t < 0.
et
Proof: 1. Let t > 0 and denote Y = etX . By Markovs inequality
E[Y ]
P (Y et ) .
et
Then we have
P (X ) = P (tX t) = P (etX et )
E[Y ] E[etX ]
= P (Y et ) t = .
e et
2. The case t < 0 is similar.
In the following we shall present an application of the Cherno bounds for
the normal distributed random variables.
Let X be a random variable normally distributed with mean and variance
2
. It is known that its moment generating function is given by
1 2 2
m(t) = E[etX ] = et+ 2 t
.
Using the rst Cherno bound we obtain

m(t) 1 2 2
P (X ) t
= e()t+ 2 t , t > 0,
e
which implies
1
min[( )t + t2 2 ]
P (X ) e t>0 2 .
It is easy to see that the quadratic function f (t) = ( )t + 12 t2 2 has the

minimum value reached for t = . Since t > 0, needs to satisfy > .
2
Then ( )2
min f (t) = f 2
= .
t>0 2 2
Substituting into the previous formula, we obtain the following result:
Proposition 2.13.12 If X is a normally distributed variable, X N (, 2 ),

then for any >
( )2

P (X ) e 2 2 .
Exercise 2.13.13 Let X be a Poisson random variable with mean > 0.

(a) Show that the moment generating function of X is m(t) = e(e 1) ;
t
(b) Use a Cherno bound to show that

t 1)tk
P (X k) e(e , t > 0.
Markovs, Tchebychevs and Chernos inequalities will be useful later

when computing limits of random variables.
Proposition 2.13.14 Let X be a random variable and f and g be two func-

tions, both increasing or decreasing. Then
E[f (X)g(X)] E[f (X)]E[g(X)]. (2.13.9)
Proof: For any two independent random variables X and Y , we have

f (X) f (Y ) g(X) g(Y ) 0.
Applying expectation yields
E[f (X)g(X)] + E[f (Y )g(Y )] E[f (X)]E[f (Y )] + E[f (Y )]E[f (X)].
Considering Y as an independent copy of X we obtain
2E[f (X)g(X)] 2E[f (X)]E[g(X)].

Basic Notions 37
Exercise 2.13.15 Show the following inequalities:

(a) E[X 2 ] E[X]2 ;
(b) E[X sinh(X)] E[X]E[sinh(X)];
(c) E[X 6 ] E[X]E[X 5 ];
(d) E[X 6 ] E[X 3 ]2 .
Exercise 2.13.16 For any n, k 1, show that
E[X 2(n+k+1) ] E[X 2k+1 ]E[X 2n+1 ].
2.14 Limits of Sequences of Random Variables

Consider a sequence (Xn )n1 of random variables dened on the probability
space (, F, P ). There are several ways of making sense of the limit expression
X = lim Xn . This is the subject treated in the following sections.
n
Almost Sure Limit The sequence Xn converges almost surely to X, if for
all states of the world , except a set of probability zero, we have
lim Xn () = X().
n
More precisely, this means

P ; lim Xn () = X() = 1,
n
and we shall write as-lim Xn = X. An important example where this type of

n
limit occurs is the Strong Law of Large Numbers:
If Xn is a sequence of independent and identically distributed random vari-
X1 + + Xn
ables with the same mean , then as-lim = .
n n
This result ensures that the sample mean tends to the (unknown) popu-
lation mean almost surely as n , a fact that makes it very useful in
statistics.
Mean Square Limit Another possibility of convergence is to look at the
mean square deviation of Xn from X. We say that Xn converges to X in the
mean square if
lim E[(Xn X)2 ] = 0.
n
More precisely, this should be interpreted as

2
lim Xn () X() dP () = 0.
n
This limit will be abbreviated by ms-lim Xn = X. The mean square conver-

n
gence is useful when dening the Ito integral.
Proposition 2.14.1 Consider a sequence Xn of random variables such that

there is a constant k with E[Xn ] k and V ar(Xn ) 0 as n . Show
that ms-lim Xn = k.
n
Proof: Since we have
E[|Xn k|2 ] = E[Xn2 2kXn + k2 ] = E[Xn2 ] 2kE[Xn ] + k2

= E[Xn2 ] E[Xn ]2 + E[Xn ]2 2kE[Xn ] + k2
2
= V ar(Xn ) + E[Xn ] k ,
the right side tends to 0 when taking the limit n .
Exercise 2.14.2 Show the following relation

2
E[(X Y )2 ] = V ar[X] + V ar[Y ] + E[X] E[Y ] 2Cov(X, Y ).
Exercise 2.14.3 If Xn tends to X in mean square, with E[X 2 ] < , show

that:
(a) E[Xn ] E[X] as n ;
(b) E[Xn2 ] E[X 2 ] as n ;
(c) V ar[Xn ] V ar[X] as n ;
(d) Cov(Xn , X) V ar[X] as n .
Exercise 2.14.4 If Xn tends to X in mean square, show that E[Xn |H] tends
to E[X|H] in mean square.
Limit in Probability The random variable X is the limit in probability of

Xn if for n large enough the probability of deviation from X can be made
smaller than any arbitrary . More precisely, for any > 0

lim P ; |Xn () X()| = 1.
n
This can also be written as

lim P ; |Xn () X()| > = 0.
n
This limit is denoted by p-lim Xn = X.

n
It is worth noting that both almost certain convergence and convergence
in mean square imply the convergence in probability.
Proposition 2.14.5 The convergence in mean square implies the convergence

in probability.
Basic Notions 39
Proof: Let ms-lim Yn = Y . Let > 0 be arbitrarily xed. Applying Markovs

n
inequality with X = Yn Y , p = 2 and = , yields
1
0 P (|Yn Y | ) E[|Yn Y |2 ].
2
The right side tends to 0 as n . Applying the Squeeze Theorem we obtain
lim P (|Yn Y | ) = 0,
n
which means that Yn converges stochastically to Y .
Example 2.14.6 Let Xn be a sequence of random variables such that E[|Xn |]

0 as n . Prove that p-lim Xn = 0.
n
Proof: Let > 0 be arbitrarily xed. We need to show

lim P ; |Xn ()| = 0. (2.14.10)
n
From Markovs inequality (see Theorem 2.13.9) we have

E[|Xn |]
0 P ; |Xn ()| .

Using the Squeeze Theorem we obtain (2.14.10).
Remark 2.14.7 The conclusion still holds true even in the case when there
is a p > 0 such that E[|Xn |p ] 0 as n .
Limit in Distribution We say the sequence Xn converges in distribution to

X if for any continuous bounded function (x) we have
lim E[(Xn )] = E[(X)].

n
We make the remark that this type of limit is even weaker than the stochastic
convergence, i.e. it is implied by it.
An application of the limit in distribution is obtained if we consider (x) =
itx
e . In this case the expectation becomes the Fourier transform of the prob-
ability density
E[(X)] = eitx p(x) dx = p(t),
and it is called the characteristic function of the random variable X. It follows

that if Xn converges in distribution to X, then the characteristic function of
Xn converges to the characteristic function of X. From the properties of the
Fourier transform, the probability density of Xn approaches the probability

density of X.
It can be shown that the convergence in distribution is equivalent to
lim Fn (x) = F (x),

n
whenever F is continuous at x, where Fn and F denote the distribution func-

tions of Xn and X, respectively. This is the reason that this convergence bears
its name.
Exercise 2.14.8 Consider the probability space (, F, P ), with = [0, 1], F

the -algebra of Borel sets, and P the Lebesgue measure (see Example 2.3.3).
Dene the sequence Xn by

0, if < 1/2 1, if < 1/2
X2n = X2n+1 =
1, if 1/2, 0, if 1/2.
Show that Xn converges in distribution, but does not converge in probability.
2.15 Properties of Mean-Square Limit

This section deals with the main properties of the mean-square limit, which
will be useful in later applications regarding the Ito integral.
Lemma 2.15.1 If ms-lim Xn = 0 and ms-lim Yn = 0, then

n n
ms-lim (Xn + Yn ) = 0.
n
Proof: It follows from the inequality
(x + y)2 2x2 + 2y 2 .
The details are left to the reader.
Proposition 2.15.2 If the sequences of random variables Xn and Yn converge

in the mean square, then
1. ms-lim (Xn + Yn ) = ms-lim Xn + ms-lim Yn

n n n
2. ms-lim (cXn ) = c ms-lim Xn , c R.
n n
Proof: 1. Let ms-lim Xn = X and ms-lim Yn = Y . Consider the sequences

n n
Xn= Xn X and Yn = Yn Y . Then ms-lim Xn = 0 and ms-lim Yn = 0.
n n
Applying Lemma 2.15.1 yields
Basic Notions 41
ms-lim (Xn + Yn ) = 0.

n
This is equivalent to
ms-lim (Xn X + Yn Y ) = 0,
n
which becomes
ms-lim (Xn + Yn ) = X + Y .
n
2. The second relation can be proved in a similar way and is left as an exercise
to the reader.
Remark 2.15.3 It is worthy to note that

ms-lim (Xn Yn ) = ms-lim (Xn ) ms-lim (Yn ).
n n n
Counter-examples can be found, see Exercise 2.15.5.
Exercise 2.15.4 Use a computer algebra system to show the following:

x2 2
(a) 5+1
dx = 1 51/2 ;
x 4
0
x4
(b) dx = ;
x5 + 1
0
1 1 1 4
(c) dx = .
0 x5 + 1 5 5 5
Exercise 2.15.5 Let X be a random variable with the probability density func-
tion
5 1
p(x) = , x 0.
(1/5)(4/5) x5 + 1
(a) Show that E[X 2 ] < and E[X 4 ] = ;
1
(b) Construct the sequences of random variables Xn = Yn = n X. Show
that ms-lim Xn = 0, ms-lim Yn = 0, but ms-lim (Xn Yn ) = .
n n n
2.16 Stochastic Processes

A stochastic process on the probability space (, F, P ) is a family of random
variables Xt parameterized by t T, where T R. If T is an interval we
say that Xt is a stochastic process in continuous time. If T = {1, 2, 3, . . . }
we shall say that Xt is a stochastic process in discrete time. The latter case
describes a sequence of random variables. The reader interested in these type
of processes can consult Brzezniak and Zastawniak [9].
The aforementioned types of convergence can be easily extended to con-
tinuous time. For instance, Xt converges almost surely to X as t if

P ; lim Xt () = X() = 1.
t
The evolution in time of a given state of the world given by the function
t Xt () is called a path or realization of Xt . The study of stochastic
processes using computer simulations is based on retrieving information about
the process Xt given a large number of its realizations.
Next we shall structure the information eld F with an order relation
parameterized by the time t. Consider that all the information accumulated
until time t is contained by the -eld Ft . This means that Ft contains the
information containing events that have already occurred until time t, and
which did not. Since the information is growing in time, we have
Fs Ft F
for any s, t T with s t. The family Ft is called a ltration.

A stochastic process Xt is said to be adapted to the ltration Ft if Xt is
Ft - measurable, for any t T. This means that the information at time t
determines the value of the random variable Xt .
Example 2.16.1 Here there are a few examples of ltrations:

1. Ft represents the information about the evolution of a stock until time
t, with t > 0.
2. Ft represents the information about the evolution of a Black-Jack game
until time t, with t > 0.
3. Ft represents the medical information of a patient until time t.
Example 2.16.2 If X is a random variable, consider the conditional expec-

tation
Xt = E[X|Ft ].
From the denition of conditional expectation, the random variable Xt is Ft -
measurable, and can be regarded as the measurement of X at time t using
the information Ft . If the accumulated knowledge Ft increases and eventually
equals the -eld F, then X = E[X|F], i.e. we obtain the entire random
variable. The process Xt is adapted to Ft .
Example 2.16.3 Don Joe goes to a doctor to get an estimation of how long
he still has to live. The age at which he will pass away is a random variable,
denoted by X. Given his medical condition today, which is contained in Ft ,
the doctor can infer an average age, which is the average of all random in-
stances that agree with the information to date; this is given by the conditional
expectation Xt = E[X|Ft ]. The stochastic process Xt is adapted to the medical
knowledge Ft .
Basic Notions 43
We shall dene next an important type of stochastic process.4
Denition 2.16.4 A process Xt , t T, is called a martingale with respect to

the ltration Ft if
1. Xt is integrable for each t T;
2. Xt is adapted to the ltration Ft ;
3. Xs = E[Xt |Fs ], s < t.
Remark 2.16.5 The rst condition states that the unconditional forecast is
nite E[|Xt ]] = |Xt | dP < . Condition 2 says that the value Xt is known,

given the information set Ft . This can also be stated by saying that Xt is
Ft -measurable. The third relation asserts that the best forecast of unobserved
future values is the last observation on Xt .
Example 2.16.6 Let Xt denote Mr. Li Zhus salary after t years of work at
the same company. Since Xt is known at time t and it is bounded above, as all
salaries are, then the rst two conditions hold. Being honest, Mr. Zhu expects
today that his future salary will be the same as todays, i.e. Xs = E[Xt |Fs ],
for s < t. This means that Xt is a martingale.
Exercise 2.16.7 If X is an integrable random variable on (, F, P ), and Ft

is a ltration. Prove that Xt = E[X|Ft ] is a martingale.
Exercise 2.16.8 Let Xt and Yt be martingales with respect to the ltration

Ft . Show that for any a, b, c R the process Zt = aXt + bYt + c is an Ft -
martingale.
Exercise 2.16.9 Let Xt and Yt be martingales with respect to the ltration

Ft .
(a) Is the process Xt Yt always a martingale with respect to Ft ?
(b) What about the processes Xt2 and Yt2 ?
Exercise 2.16.10 Two processes Xt and Yt are called conditionally uncorre-

lated, given Ft , if
E[(Xt Xs )(Yt Ys )|Fs ] = 0, 0 s < t < .
Let Xt and Yt be martingale processes. Show that the process Zt = Xt Yt is

a martingale if and only if Xt and Yt are conditionally uncorrelated. Assume
that Xt , Yt and Zt are integrable.
4
The concept of martingale was introduced by Levy in 1934.
In the following, if Xt is a stochastic process, the minimum amount of

information resulted from knowing the process Xs until time t is denoted by
Ft = (Xs ; s t). This is the -algebra generated by the events {; Xs ()
(a, b)}, for any real numbers a < b and s t.
In the case of a discrete process, the minimum amount of information
resulted from knowing the process Xk until time n is Fn = (Xk ; k n), the
-algebra generated by the events {; Xk () (a, b)}, for any real numbers
a < b and k n.
Exercise 2.16.11 Let Xn , n 0 be a sequence of integrable independent

random variables, with E[Xn ] < , for all n 0. Let S0 = X0 , Sn =
X0 + + Xn . Show the following:
(a) Sn E[Sn ] is an Fn -martingale.
(b) If E[Xn ] = 0 and E[Xn2 ] < , n 0, then Sn2 V ar(Sn ) is an
Fn -martingale.
Exercise 2.16.12 Let Xn , n 0 be a sequence of independent, integrable

random variables such that E[Xn ] = 1 for n 0. Prove that Pn = X0 X1
Xn is an Fn -martingale.
Exercise 2.16.13 (a) Let X be a normally distributed random variable with

mean = 0 and variance 2 . Prove that there is a unique = 0 such that
E[eX ] = 1.
(b) Let (Xi )i0 be a sequence of identically normally
n distributed random vari-
ables with mean = 0. Consider the sum Sn = j=0 Xj . Show that Zn = eSn
is a martingale, with dened in part (a).
In section 10.1 we shall encounter several processes which are martingales.

Chapter 3
Useful Stochastic Processes
This chapter deals with the most common used stochastic processes and their
basic properties. The two main basic processes are the Brownian motion and
the Poisson process. The other processes described in this chapter are derived
from the previous two. For more advanced topics on the Brownian motion,
the reader may consult Freedman [19], Hida [22], Knight [27], Karatzas and
Shreve [26], or Morters and Peres [34].
3.1 The Brownian Motion

The observation rst made by the botanist Robert Brown in 1827, that small
pollen grains suspended in water have a very irregular and unpredictable state
of motion, led to the denition of the Brownian motion, which is formalized
in the following.
Denition 3.1.1 A Brownian motion process is a stochastic process Bt , t

0, which satises
1. The process starts at the origin, B0 = 0;
2. Bt has independent increments;
3. The process Bt is continuous in t;
4. The increments Bt Bs are normally distributed with mean zero and
variance |t s|,
Bt Bs N (0, |t s|).
The process Xt = x + Bt has all the properties of a Brownian motion that

starts at x. Condition 4 states that the increments of a Brownian motion are
stationary, i.e. the distribution of Bt Bs depends only on the time interval
ts
P (Bt+s Bs a) = P (Bt B0 a) = P (Bt a).
45
It is worth noting that even if Bt is continuous, it is nowhere dierentiable.

From condition 4 we get that Bt is normally distributed with mean E[Bt ] = 0
and V ar[Bt ] = t
Bt N (0, t).
This implies also that the second moment is E[Bt2 ] = t. Let 0 < s < t. Since
the increments are independent, we can write
E[Bs Bt ] = E[(Bs B0 )(Bt Bs ) + Bs2 ] = E[Bs B0 ]E[Bt Bs ] + E[Bs2 ] = s.
Consequently, Bs and Bt are not independent.

Condition 4 also has a physical explanation. A pollen grain suspended in
water is kicked about by a very large number of water molecules. The inuence
of each molecule on the grain is independent of the other molecules. These
eects are averaged out into a resultant increment of the grain coordinate.
According to the Central Limit Theorem, this increment has to be normally
distributed.
If the exterior stochastic activity on the pollen grain is represented at
time t by the noise Nt , then the cummulative eect on the grain during the
t
time interval [0, t] is represented by the integral Wt = 0 Ns ds, which is the
Brownian motion.
There are three distinct classical constructions of the Brownian motion,
due to Wiener [47], Kolmogorov [28] and Levy [33]. However, the existence of
the Brownian motion process is beyond the goal of this book.
It is worth noting that the processes with stationary and independent
increments form a special class of stochastic processes, called Levy processes;
so, in particular, Brownian motions are Levy processes.
Proposition 3.1.2 A Brownian motion process Bt is a martingale with re-

spect to the information set Ft = (Bs ; s t).
Proof: The integrability of Bt follows from Jensens inequality
E[|Bt |]2 E[Bt2 ] = V ar(Bt ) = |t| < .
Bt is obviously Ft -measurable. Let s < t and write Bt = Bs + (Bt Bs ). Then
E[Bt |Fs ] = E[Bs + (Bt Bs )|Fs ]

= E[Bs |Fs ] + E[Bt Bs |Fs ]
= Bs + E[Bt Bs ] = Bs + E[Bts B0 ] = Bs ,
where we used that Bs is Fs -predictable (from where E[Bs |Fs ] = Bs ) and that
the increment Bt Bs is independent of previous values of Bt contained in the
information set Ft = (Bs ; s t).
Useful Stochastic Processes 47
A process with similar properties as the Brownian motion was introduced

by Wiener.
Denition 3.1.3 A Wiener process Wt is a process adapted to a ltration Ft
such that
1. The process starts at the origin, W0 = 0;
2. Wt is a squared integrable Ft -martingale with
E[(Wt Ws )2 ] = t s, s t;
3. The process Wt is continuous in t.
Since Wt is a martingale, its increments satisfy
E[Wt Ws ] = E[Wt Ws |Fs ] = E[Wt |Fs ] Ws = Ws Ws = 0,
and hence E[Wt ] = 0. It is easy to show that
V ar[Wt Ws ] = |t s|, V ar[Wt ] = t.
Exercise 3.1.4 Show that a Brownian process Bt is a Wiener process.
The only property Bt has and Wt seems not to have is that the increments are
normally distributed. However, it can be shown that there is no distinction
between these two processes, as the famous Levy theorem states, see section
10.2. From now on, the notations Bt and Wt will be used interchangeably.
Innitesimal relations In stochastic calculus we often need to use innites-
imal notation and its properties. If dWt denotes the innitesimal increment
of a Wiener process in the time interval dt, the aforementioned properties
become dWt N (0, dt), E[dWt ] = 0, and E[(dWt )2 ] = dt.
Proposition 3.1.5 If Wt is a Brownian motion with respect to the informa-
tion set Ft , then Yt = Wt2 t is a martingale.
Proof: Yt is integrable since
E[|Yt |] E[Wt2 + t] = 2t < , t > 0.
Let s < t. Using that the increments Wt Ws and (Wt Ws )2 are independent
of the information set Fs and applying Proposition 2.12.6 yields
E[Wt2 |Fs ] = E[(Ws + Wt Ws )2 |Fs ]
= E[Ws2 + 2Ws (Wt Ws ) + (Wt Ws )2 |Fs ]
= E[Ws2 |Fs ] + E[2Ws (Wt Ws )|Fs ] + E[(Wt Ws )2 |Fs ]
= Ws2 + 2Ws E[Wt Ws |Fs ] + E[(Wt Ws )2 |Fs ]
= Ws2 + 2Ws E[Wt Ws ] + E[(Wt Ws )2 ]
= Ws2 + t s,
and hence E[Wt2 t|Fs ] = Ws2 s, for s < t.

The following result states the memoryless property of Brownian motion1
Wt .
Proposition 3.1.6 The conditional distribution of Wt+s , given the present

Wt and the past Wu , 0 u < t, depends only on the present.
Proof: Using the independent increment assumption, we have
P (Wt+s c|Wt = x, Wu , 0 u < t)

= P (Wt+s Wt c x|Wt = x, Wu , 0 u < t)
= P (Wt+s Wt c x)
= P (Wt+s c|Wt = x).
Since Wt is normally distributed with mean 0 and variance t, its density

function is
1 x2
t (x) = e 2t .
2t
Then its distribution function is
x
1 u2
Ft (x) = P (Wt x) = e 2t du.
2t
The probability that Wt is between the values a and b is given by

b
1 u2
P (a Wt b) = e 2t du, a < b.
2t a
Even if the increments of a Brownian motion are independent, their values

are still correlated.
Proposition 3.1.7 Let 0 s t. Then

1. Cov(Ws , Wt ) = s;

s
2. Corr(Ws, Wt ) = .
t
1
These type of processes are called Markov processes.
Proof: 1. Using the properties of covariance
Cov(Ws , Wt ) = Cov(Ws , Ws + Wt Ws )
= Cov(Ws , Ws ) + Cov(Ws , Wt Ws )
= V ar(Ws ) + E[Ws (Wt Ws )] E[Ws ]E[Wt Ws ]
= s + E[Ws ]E[Wt Ws ]
= s,
since E[Ws ] = 0.
We can also arrive at the same result starting from the formula
Cov(Ws , Wt ) = E[Ws Wt ] E[Ws ]E[Wt ] = E[Ws Wt ].
Using that conditional expectations have the same expectation, factoring out
the predictable part, and using that Wt is a martingale, we have
E[Ws Wt ] = E[E[Ws Wt |Fs ]] = E[Ws E[Wt |Fs ]]

= E[Ws Ws ] = E[Ws2 ] = s,
so Cov(Ws , Wt ) = s.
2. The correlation formula yields
Cov(Ws , Wt ) s s
Corr(Ws , Wt ) = = = .
(Wt )(Ws ) s t t
Remark 3.1.8 Removing the order relation between s and t, the previous
relations can also be stated as
Cov(Ws , Wt ) = min{s, t};

min{s, t}
Corr(Ws , Wt ) = .
max{s, t}
The following exercises state the translation and the scaling invariance
properties of the Brownian motion.
Exercise 3.1.9 For any t0 0, show that the process Xt = Wt+t0 Wt0 is a
Brownian motion. It can also be stated that the Brownian motion is translation
invariant.
Exercise 3.1.10 For any > 0, show that the process Xt = 1 Wt is a

Brownian motion. This says that the Brownian motion is invariant by scaling.
Exercise 3.1.11 Let 0 < s < t < u. Show the following multiplicative prop-
erty
Corr(Ws , Wt )Corr(Wt , Wu ) = Corr(Ws , Wu ).
Exercise 3.1.12 Find the expectations E[Wt3 ] and E[Wt4 ].
Exercise 3.1.13 (a) Use the martingale property of Wt2 t to nd

E[(Wt2 t)(Ws2 s)];
(b) Evaluate E[Wt2 Ws2 ];
(c) Compute Cov(Wt2 , Ws2 );
(d) Find Corr(Wt2 , Ws2 ).
Exercise 3.1.14 Consider the process Yt = tW 1 , t > 0, and dene Y0 = 0.

t
(a) Find the distribution of Yt ;
(b) Find the probability density of Yt ;
(c) Find Cov(Ys , Yt );
(d) Find E[Yt Ys ] and V ar(Yt Ys ) for s < t.
It is worth noting that the process Yt = tW 1 , t > 0 with Y0 = 0 is a Brownian

t
motion, see Exercise 10.2.10 .
Exercise 3.1.15 The process Xt = |Wt | is called a Brownian motion reected

at the origin. Show that
(a) E[|Wt |] = 2t/;
(b) V ar(|Wt |) = (1 2 )t.
Exercise 3.1.16 Let 0 < s < t. Find E[Wt2 |Fs ].
Exercise 3.1.17 Let 0 < s < t. Show that

(a) E[Wt3 |Fs ] = 3(t s)Ws + Ws3 ;
(b) E[Wt4 |Fs ] = 3(t s)2 + 6(t s)Ws2 + Ws4 .
t
Exercise 3.1.18 Show that E Wu du|Fs = (t s)Ws .
s
Exercise 3.1.19 Show that the process

t
Xt = Wt3 3 Ws ds
0
is a martingale with respect to the information set Ft = {Ws ; s t}.

50 100 150 200 250 300 350

4
3
0.5
1.0
50 100 150 200 250 300 350
a b
Figure 3.1: (a) Three simulations of the Brownian motion Wt . (b) Two sim-
ulations of the exponential Brownian motion eWt .
Exercise 3.1.20 Show that the following processes are Brownian motions
(a) Xt = WT WT t , 0 t T ;
(b) Yt = Wt , t 0.
Exercise 3.1.21 Let Wt and W t be two independent Brownian motions and

be a constant with || 1.

(a) Show that the process Xt = Wt + 1 2 W t is continuous and has
the distribution N (0, t);
(b) Is Xt a Brownian motion?
Exercise 3.1.22 Let Y be a random variable distributed as N (0, 1). Consider

the process Xt = tY . Is Xt a Brownian motion?
3.2 Geometric Brownian Motion

The geometric Brownian motion with drift and volatility is the process
2
Xt = eWt +( 2
)t
, t 0.
In the standard case, when = 0 and = 1, the process becomes Xt =

eWt t/2 , t 0. This driftless process is always a martingale, see Exercise 3.2.4.
The following result will be useful later in the chapter.
2 t/2
Lemma 3.2.1 E[eWt ] = e , for 0.
Proof: Using the denition of expectation

1 x2
E[eWt
] = e t (x) dx =
x
e 2t +x dx
2t
2 t/2
= e ,
where we have used the integral formula

ax2 +bx b2
e dx = e 4a , a>0
a
1
with a = 2t and b = .
Proposition 3.2.2 The exponential Brownian motion Xt = eWt is log-normally

distributed with mean et/2 and variance e2t et .
Proof: Since Wt is normally distributed, then Xt = eWt will have a log-normal

distribution. Using Lemma 3.2.1 we have
E[Xt ] = E[eWt ] = et/2

E[Xt2 ] = E[e2Wt ] = e2t ,
and hence the variance is
V ar[Xt ] = E[Xt2 ] E[Xt ]2 = e2t (et/2 )2 = e2t et .
A few simulations of the process Xt are contained in Fig. 3.1(b).

The distribution function of Xt = eWt can be obtained by reducing it to
the distribution function of a Brownian motion as in the following.
FXt (x) = P (Xt x) = P (eWt x)

= P (Wt ln x) = FWt (ln x)
ln x
1 u2
= e 2t du.
2t
The density function of the geometric Brownian motion Xt = eWt is given by

1

2
e(ln x) /(2t) , if x > 0,
d
p(x) = F (x) = x 2t
dx Xt

0, elsewhere.
E[eWt Ws ] = e
ts
2 , s < t.
Figure 3.2: To be used in the proof of formula (3.3.1); the area of the blocks
can be counted in two equivalent ways, horizontally and vertically.
Exercise 3.2.4 Let Xt = eWt .

(a) Show that Xt is not a martingale.
(b) Show that e 2 Xt is a martingale.
t
1 2
(c) Show that for any constant c R, the process Yt = ecWt 2 c t is a
martingale.
Exercise 3.2.5 If Xt = eWt , nd Cov(Xs , Xt )

(a) by direct computation;
(b) by using Exercise 3.2.4 (b).

2Wt2 (1 4t)1/2 , 0 t < 1/4
E[e ]=
, otherwise.
3.3 Integrated Brownian Motion

The stochastic process
t
Zt = Ws ds, t0
0
is called the integrated Brownian motion. Obviously, Z0 = 0.

Let 0 = s0 < s1 < < sk < sn = t, with sk = kt

n. Then Zt can be
written as a limit of Riemann sums

n
Ws 1 + + Ws n
Zt = lim Wsk s = t lim ,
n n n
k=1
where s = sk+1 sk = nt . Since Wsk are not independent, we rst need

to transform the previous expression into a sum of independent normally dis-
tributed random variables. A straightforward computation shows that
Ws 1 + + Ws n
= n(Ws1 W0 ) + (n 1)(Ws2 Ws1 ) + + (Wsn Wsn1 )
= X1 + X2 + + Xn . (3.3.1)
This formula becomes clear if one sums the area of the blocks in Fig. 3.2
horizontally and then vertically. Since the increments of a Brownian motion
are independent and normally distributed, we have

X1 N 0, n2 s

X2 N 0, (n 1)2 s

X3 N 0, (n 2)2 s
..
.

Xn N 0, s .
Recall now the following well known theorem on the addition formula for
Gaussian random variables.
Theorem 3.3.1 If Xj are independent random variables normally distributed
with mean j and variance j2 , then the sum X1 + + Xn is also normally
distributed with mean 1 + + n and variance 12 + + n2 .
Then
n(n + 1)(2n + 1)
X1 + +Xn N 0, (1+22 +32 + +n2 )s = N 0, s ,
6
t
with s = . Using (3.3.1) yields
n
Ws + + Ws n (n + 1)(2n + 1)
t 1 N 0, t3 .
n 6n2
Taking the limit with n , we get
t3
Zt N 0, .
3
Proposition 3.3.2 The integrated Brownian motion Zt has a normal distri-

bution with mean 0 and variance t3 /3.
Remark 3.3.3 The aforementioned limit was taken heuristically, without

specifying the type of the convergence. In order to make this work, the fol-
lowing result is usually used:
If Xn is a sequence of normal random variables that converges in mean
square to X, then the limit X is normally distributed, with E[Xn ] E[X]
and V ar(Xn ) V ar(X), as n .
The mean and the variance can also be computed in a direct way as follows.
By Fubinis theorem we have
t t
E[Zt ] = E[ Ws ds] = Ws ds dP
0 R 0
t t
= Ws dP ds = E[Ws ] ds = 0,
0 R 0
since E[Ws ] = 0. Then the variance is given by
V ar[Zt ] = E[Zt2 ] E[Zt ]2 = E[Zt2 ]

t t t t
= E Wu du Wv dv = E Wu Wv dudv
0 0 0 0
t t
= E[Wu Wv ] dudv = min{u, v} dudv
0 0 [0,t][0,t]

= min{u, v} dudv + min{u, v} dudv, (3.3.2)
D1 D2
where
D1 = {(u, v); u > v, 0 u t}, D2 = {(u, v); u < v, 0 u t}.
The rst integral can be evaluated using Fubinis theorem

min{u, v} dudv = v dudv
D1 D1
t
u t
u2 t3
= v dv du = du = .
0 0 0 2 6
Similarly, the latter integral is equal to

t3
min{u, v} dudv = .
D2 6
Substituting in (3.3.2) yields

t3 t3 t3
V ar[Zt ] = + = .
6 6 3
For another computation of the variance of Zt , see Exercise 5.6.2.
Exercise 3.3.4 (a) Prove that the moment generating function of Zt is given
by
2 3
m(u) = eu t /6 .
(b) Use the rst part to nd the mean and variance of Zt .
Exercise 3.3.5 Let s < t. Show that the covariance of the integrated Brown-
ian motion is given by
t s
Cov Zs , Zt = s2 , s < t.
2 6
(a) Cov(Zt , Zt Zth ) = 12 t2 h + o(h), where o(h) denotes a quantity such
that limh0 o(h)/h = 0;
t2
(b) Cov(Zt , Wt ) = .
2
E[eWs +Wu ] = e
emin{s,u} .
u+s
2
t
Exercise 3.3.8 Consider the process Xt = eWs ds.
0
(a) Find the mean of Xt ;
(b) Find the variance of Xt .
In the next exercises Ft denotes the -eld generated by the Brownian

motion Wt .
t
Exercise 3.3.9 Consider the process Zt = Wu du, t > 0.
0
(a) Show that E[ZT |Ft ] = Zt + Wt (T t), for any t < T ;
(b) Prove that the process Mt = Zt tWt is an Ft -martingale.
t
Exercise 3.3.10 Let Yt = Wu2 du, t > 0.
0
T 1
(a) Show that E Ws2 ds|Ft = Wt2 (T t) + (T t)2 , for any t < T ;
t 2
(b) Prove that the process Mt = Yt tWt2 + t2 /2 is an Ft -martingale.
10
4
8
6
2
0.2 0.4 0.6 0.8 1.0 2
2 2 4 6 8 10
2
4
4
a b
Figure 3.3: (a) Brownian bridge pinned down at 0 and 1. (b) Brownian motion
with drift Xt = t + Wt , with positive drift > 0.
3.4 Exponential Integrated Brownian Motion

t
If Zt = 0 Ws ds denotes the integrated Brownian motion, the process
Vt = eZt
is called exponential integrated Brownian motion. The process starts at V0 =
e0 = 1. Since Zt is normally distributed, then Vt is log-normally distributed.
We compute the mean and the variance in a direct way. Using Exercises 3.2.5
and 3.3.4 we have
t3
E[Vt ] = E[eZt ] = m(1) = e 6
4t3 2t3
E[Vt2 ] = E[e2Zt ] = m(2) = e 6 =e 3
2t3 t3
V ar(Vt ) = E[Vt2 ] E[Vt ]2 = e 3 e3
t+3s
Cov(Vs , Vt ) = e 2 .
(T t)3
Exercise 3.4.1 Show that E[VT |Ft ] = Vt e(T t)Wt + 3 for t < T .
3.5 Brownian Bridge

The process Xt = Wt tW1 is called the Brownian bridge xed at both 0 and
1, see Fig. 3.3(a). Since we can also write
Xt = Wt tWt tW1 + tWt
= (1 t)(Wt W0 ) t(W1 Wt ),
using that the increments Wt W0 and W1 Wt are independent and normally
distributed, with
Wt W0 N (0, t), W1 Wt N (0, 1 t),
it follows that Xt is normally distributed with
E[Xt ] = (1 t)E[(Wt W0 )] tE[(W1 Wt )] = 0

V ar[Xt ] = (1 t)2 V ar[(Wt W0 )] + t2 V ar[(W1 Wt )]
= (1 t)2 (t 0) + t2 (1 t)
= t(1 t).
This can also be stated by saying that the Brownian bridge tied at
0 and 1 is
a Gaussian process with mean 0 and variance t(1 t), so Xt N 0, t(1 t) .
Exercise 3.5.1 Let Xt = Wt tW1 , 0 t 1 be a Brownian bridge xed at

0 and 1. Let Yt = Xt2 . Show that Y0 = Y1 = 0 and nd E[Yt ] and V ar(Yt ).
3.6 Brownian Motion with Drift

The process Yt = t + Wt , t 0, is called Brownian motion with drift, see
Fig. 3.3(b). The process Yt tends to drift o at a rate . It starts at Y0 = 0
and it is a Gaussian process with mean
E[Yt ] = t + E[Wt ] = t
and variance
V ar[Yt ] = V ar[t + Wt ] = V ar[Wt ] = t.
Exercise 3.6.1 Find the distribution and the density functions of the process
Yt .
3.7 Bessel Process

This section deals with the process satised by the Euclidean distance from
the origin to a particle following a Brownian motion in Rn . More precisely, if
W1 (t), , Wn (t) are independent Brownian motions, consider the n-dimensional
Brownian motion W (t) = (W1 (t), , Wn (t)), n 2. The process

Rt = dist(O, W (t)) = W1 (t)2 + + Wn (t)2
is called the n-dimensional Bessel process, see Fig. 3.4.

The probability density of this process is given by the following result.
Proposition 3.7.1 The probability density function of Rt , t > 0 is given by

10
Wt
5
Rt
10 5 0 5 10
5
10
Figure 3.4: The Bessel process Rt = |W (t)| for n = 2.
2

2
n1 e 2t , 0;
(2t)n/2(n/2)
pt () =

0, <0
with
n
n ( 2 1)! for n even;
=
2
( n2 1)( n2 2) 32 12 , for n odd.
Proof: Since the Brownian motions W1 (t), . . . , Wn (t) are independent, their
joint density function is
1 2 2
fW1 Wn (x) = fW1 (x) fWn (x) = e(x1 ++xn )/(2t) , t > 0.
(2t)n/2
In the next computation we shall use the following formula of integration

that follows from the use of polar coordinates

n1
f (x) dx = (S ) r n1 g(r) dr, (3.7.3)
{|x|} 0
where f (x) = g(|x|) is a function on Rn with spherical symmetry, and where

2 n/2
(Sn1 ) = is the area of the (n 1)-dimensional sphere in Rn . Let
(n/2)
0. The distribution function of Rt is

FR () = P (Rt ) = fW1 Wn (x) dx1 dxn
{Rt }

1 2 2
= n/2
e(x1 ++xn )/(2t) dx1 dxn
x21 ++x2n 2 (2t)

1 2 2
= r n1 n/2
e(x1 ++xn )/(2t) d dr
0 S(0,1) (2t)

(Sn1 ) n1 r2 /(2t)
= r e dr.
(2t)n/2 0
Dierentiating yields
d (Sn1 ) n1 2
pt () = FR () = e 2t
d (2t)n/2
2
2 n1 2t
= e , > 0, t > 0.
(2t)n/2 (n/2)
It is worth noting that in the 2-dimensional case the aforementioned density

becomes a particular case of a Weibull distribution with parameters m = 2
and = 2t, called Walds distribution
1 x2
pt (x) = xe 2t , x > 0, t > 0.
t
Exercise 3.7.2 Let P (Rt t) be the probability of a 2-dimensional Brownian

motion being inside of the disk D(0, ) at time t > 0. Show that
2 2 2
1 < P (Rt t) < .
2t 4t 2t
Exercise 3.7.3 Let Rt be a 2-dimensional Bessel process. Show that

(a) E[Rt ] = 2t/2;

(b) V ar(Rt ) = 2t 1 4 .
Rt
Exercise 3.7.4 Let Xt = , t > 0, where Rt is a 2-dimensional Bessel
t
process. Show that Xt 0 as t in mean square.
3.8 The Poisson Process

A Poisson process describes the number of occurrences of a certain event before
time t, such as: the number of electrons arriving at an anode until time t; the
number of cars arriving at a gas station until time t; the number of phone calls
received on a certain day until time t; the number of visitors entering a museum
on a certain day until time t; the number of earthquakes that occurred in Chile
during the time interval [0, t]; the number of shocks in the stock market from
the beginning of the year until time t; the number of twisters that might hit
Alabama from the beginning of the century until time t.
The denition of a Poisson process is stated more precisely in the following.
Its graph looks like a stair-type function with unit jumps, see Fig. 3.5.
Denition 3.8.1 A Poisson process is a stochastic process Nt , t 0, which

satises
1. The process starts at the origin, N0 = 0;
2. Nt has independent increments;
3. The process Nt is right continuous in t, with left hand limits;
4. The increments Nt Ns , with 0 < s < t, have a Poisson distribution
with parameter (t s), i.e.
k (t s)k (ts)
P (Nt Ns = k) = e .
k!
It can be shown that condition 4 in the previous denition can be replaced by
the following two conditions:
P (Nt Ns = 1) = (t s) + o(t s) (3.8.4)

P (Nt Ns 2) = o(t s), (3.8.5)
where o(h) denotes a quantity such that limh0 o(h)/h = 0. Then the prob-
ability that a jump of size 1 occurs in the innitesimal interval dt is equal to
dt, and the probability that at least 2 events occur in the same small interval
is zero. This implies that the random variable dNt may take only two values,
0 and 1, and hence satises
P (dNt = 1) = dt (3.8.6)
P (dNt = 0) = 1 dt. (3.8.7)
Exercise 3.8.2 Show that if condition 4 is satised, then conditions (3.8.4)

and (3.8.5) hold.
2 4 6 8 10 12 14
Figure 3.5: The Poisson process Nt .
Exercise 3.8.3 Which of the following expressions are o(h)?

(a) f (h) = 3h2 + h;

(b) f (h) = h + 5;
(c) f (h) = h ln |h|;
(d) f (h) = heh .
Condition 4 also states that Nt has stationary increments. The fact that
Nt Ns is stationary can be stated as

n
(t)k
P (Nt+s Ns n) = P (Nt N0 n) = P (Nt n) = et .
k!
k=0
From condition 4 we get the mean and variance of increments
E[Nt Ns ] = (t s), V ar[Nt Ns ] = (t s).
In particular, the random variable Nt is Poisson distributed with E[Nt ] = t

and V ar[Nt ] = t. The parameter is called the rate of the process. This
means that the events occur at the constant rate , with > 0.
Since the increments are independent, we have for 0 < s < t
E[Ns Nt ] = E[(Ns N0 )(Nt Ns ) + Ns2 ]

= E[Ns N0 ]E[Nt Ns ] + E[Ns2 ]
= s (t s) + (V ar[Ns ] + E[Ns ]2 )
= 2 st + s. (3.8.8)
As a consequence we have the following result:
Proposition 3.8.4 Let 0 s t. Then

1. Cov(Ns , Nt ) = s;

s
2. Corr(Ns, Nt ) = .
t
Proof: 1. Using (3.8.8) we have
Cov(Ns , Nt ) = E[Ns Nt ] E[Ns ]E[Nt ]

= 2 st + s st
= s.
2. Using the formula for the correlation yields

Cov(Ns , Nt ) s s
Corr(Ns , Nt ) = 1/2
= = .
(V ar[Ns ]V ar[Nt ]) (st)1/2 t
It worth noting the similarity with Proposition 3.1.7.
Proposition 3.8.5 Let Nt be Ft -adapted. Then the process Mt = Nt t is

an Ft -martingale.
Proof: Let s < t and write Nt = Ns + (Nt Ns ). Then
E[Nt |Fs ] = E[Ns + (Nt Ns )|Fs ]

= E[Ns |Fs ] + E[Nt Ns |Fs ]
= Ns + E[Nt Ns ]
= Ns + (t s),
where we used that Ns is Fs -measurable (and hence E[Ns |Fs ] = Ns ) and

that the increment Nt Ns is independent of previous values of Ns and the
information set Fs . Subtracting t yields
E[Nt t|Fs ] = Ns s,
or E[Mt |Fs ] = Ms . Since it is obvious that Mt is integrable and Ft -adapted,

it follows that Mt is a martingale.
It is worth noting that the Poisson process Nt is not a martingale. The
martingale process Mt = Nt t is called the compensated Poisson process.
Exercise 3.8.6 Compute E[Nt2 |Fs ] for s < t. Is the process Nt2 an Fs -
martingale?
Exercise 3.8.7 (a) Show that the moment generating function of the random
variable Nt is
mNt (x) = et(e 1) .
x
(b) Deduce the expressions for the rst few moments
E[Nt ] = t
E[Nt2 ] = 2 t2 + t
E[Nt3 ] = 3 t3 + 32 t2 + t
E[Nt4 ] = 4 t4 + 63 t3 + 72 t2 + t.
(c) Show that the rst few central moments are given by
E[Nt t] = 0
E[(Nt t)2 ] = t
E[(Nt t)3 ] = t
E[(Nt t)4 ] = 32 t2 + t.
Exercise 3.8.8 Find the mean and variance of the process Xt = eNt .
Exercise 3.8.9 (a) Show that the moment generating function of the random
variable Mt is
mMt (x) = et(e x1) .
x
(b) Let s < t. Verify that
E[Mt Ms ] = 0,
E[(Mt Ms )2 ] = (t s),
E[(Mt Ms )3 ] = (t s),
E[(Mt Ms )4 ] = (t s) + 32 (t s)2 .
Exercise 3.8.10 Let s < t. Show that
V ar[(Mt Ms )2 ] = (t s) + 22 (t s)2 .
3.9 Interarrival Times

For each state of the world, , the path t Nt () is a step function that
exhibits unit jumps. Each jump in the path corresponds to an occurrence of
a new event. Let T1 be the random variable which describes the time of the
1st jump. Let T2 be the time between the 1st jump and the second one. In
general, denote by Tn the time elapsed between the (n 1)th and nth jumps.
The random variables Tn are called interarrival times.
Proposition 3.9.1 The random variables Tn are independent and exponen-

tially distributed with mean E[Tn ] = 1/.
Proof: We start by noticing that the events {T1 > t} and {Nt = 0} are the
same, since both describe the situation that no events occurred until time t.
Then
P (T1 > t) = P (Nt = 0) = P (Nt N0 = 0) = et ,
and hence the distribution function of T1 is
FT1 (t) = P (T1 t) = 1 P (T1 > t) = 1 et .
Dierentiating yields the density function

d
fT1 (t) = FT (t) = et .
dt 1
It follows that T1 is has an exponential distribution, with E[T1 ] = 1/.
In order to show that the random variables T1 and T2 are independent, it
suces to show that
P (T2 t) = P (T2 t|T1 = s),
i.e. the distribution function of T2 is independent of the values of T1 . We note

rst that from the independent increments property

P 0 jumps in (s, s + t], 1 jump in (0, s] = P (Ns+t Ns = 0, Ns N0 = 1)
= P (Ns+t Ns = 0)P (Ns N0 = 1)

= P 0 jumps in (s, s + t] P 1 jump in (0, s] .
Then the conditional distribution of T2 is
F (t|s) = P (T2 t|T1 = s) = 1 P (T2 > t|T1 = s)

P (T2 > t, T1 = s)
= 1
P (T1 = s)

P 0 jumps in (s, s + t], 1 jump in (0, s]
= 1
P (T1 = s)

P 0 jumps in (s, s + t] P 1 jump in (0, s]
= 1
P 1 jump in (0, s]

= 1 P 0 jumps in (s, s + t]
= 1 P (Ns+t Ns = 0) = 1 et ,
which is independent of s. Then T2 is independent of T1 and exponentially

distributed. A similar argument for any Tn leads to the desired result.
3.10 Waiting Times

The random variable Sn = T1 + T2 + + Tn is called the waiting time until
the nth jump. The event {Sn t} means that there are n jumps that occurred
before or at time t, i.e. there are at least n events that happened up to time t;
the event is equal to {Nt n}. Hence the distribution function of Sn is given
by

(t)k
FSn (t) = P (Sn t) = P (Nt n) = et
k!
k=n
Dierentiating we obtain the density function of the waiting time Sn
d et (t)n1
fSn (t) = FSn (t) =
dt (n 1)!
Writing
tn1 et
fSn (t) = ,
(1/)n (n)
it turns out that Sn has a gamma distribution with parameters = n and
= 1/. It follows that
n n
E[Sn ] = , V ar[Sn ] =
2
The relation lim E[Sn ] = states that the expectation of the waiting time
n
is unbounded as n .
d et (t)n1
Exercise 3.10.1 Prove that FSn (t) =
dt (n 1)!
Exercise 3.10.2 Using that the interarrival times T1 , T2 , are independent
and exponentially distributed, compute directly the mean E[Sn ] and variance
V ar(Sn ).
3.11 The Integrated Poisson Process

The function u Nu is continuous with the exception of a set of countable
jumps of size 1. It is known that such functions are Riemann integrable, so it
makes sense to dene the process
t
Ut = Nu du,
0
called the integrated Poisson process. The next result provides a relation
between the process Ut and the partial sum of the waiting times Sk .
Figure 3.6: The Poisson process Nt and the waiting times S1 , S2 , Sn . The
area of the shaded rectangle is n(Sn+1 t).
Proposition 3.11.1 The integrated Poisson process can be expressed as

Nt
Ut = tNt Sk .
k=1
Let Nt = n. Since Nu is equal to k between the waiting times Sk and Sk+1 ,

the process Ut , which is equal to the area of the subgraph of Nu between 0
and t, can be expressed as
t
Ut = Nu du = 1 (S2 S1 ) + 2 (S3 S2 ) + + n(Sn+1 Sn ) n(Sn+1 t).
0
Since Sn < t < Sn+1 , the dierence of the last two terms represents the area
of the last rectangle, which has the length t Sn and the height n. Using
associativity, a computation yields
1 (S2 S1 ) + 2 (S3 S2 ) + + n(Sn+1 Sn ) = nSn+1 (S1 + S2 + + Sn ).
Substituting in the aforementioned relation, we get
Ut = nSn+1 (S1 + S2 + + Sn ) n(Sn+1 t)

= nt (S1 + S2 + + Sn )

Nt
= tNt Sk ,
k=1
where we replaced n by Nt .
The conditional distribution of the waiting times is provided by the fol-
lowing useful result.
Theorem 3.11.2 Given Nt = n, the waiting times S1 , S2 , , Sn have the

joint density function given by
n!
f (s1 , s2 , , sn ) = , 0 < s1 s2 sn < t.
tn
This is the same as the density of an ordered sample of size n from a uni-
form distribution on the interval (0, t). A naive explanation of this re-
sult is as follows. If we know that there will be exactly n events during the
time interval (0, t), since the events can occur at any time, each of them can
be considered uniformly distributed, with the density f (sk ) = 1/t. Since it
makes sense to consider the events independent, taking into consideration all
n! possible permutations, the joint density function becomes f (s1 , , sn ) =
n!
n!f (s1 ) f (sn ) = n .
t
Exercise 3.11.3 Find the following means
(a) E[Ut ].

Nt

(b) E Sk .
k=1

(a) S1 + S2 + + Sn = nT1 + (n 1)T2 + 2Tn1 + Tn ;
Nt n(n + 1)
(b) E Sk |Nt = n = ;
2
k=1
n + 1
(c) E Ut |Nt = n = n t .
2
t3
Exercise 3.11.5 Show that V ar(Ut ) = .
3
Exercise 3.11.6 Can you apply a similar proof as in Proposition 3.3.2 to
show that the integrated Poisson process Ut is also a Poisson process?
Exercise 3.11.7 Let Y : N be a discrete random variable. Show that for

any random variable X we have

E[X] = E[X|Y = y]P (Y = y).
y0
Exercise 3.11.8 Use Exercise 3.11.7 to solve Exercise 3.11.3 (b).
Exercise 3.11.9 (a) Let Tk be the kth interarrival time. Show that

E[eTk ] = , > 0.
+
(b) Let n = Nt . Show that
Ut = nt [nT1 + (n 1)T2 + + 2Tn1 + Tn ].
(c) Find the conditional expectation

E eUt Nt = n .
(Hint: If we know that there are exactly n jumps in the interval [0, T ], it makes
sense to consider the arrival time of the jumps Ti independent and uniformly
distributed on [0, T ]).
(d) Find the expectation
E eUt .
3.12 Submartingales
A stochastic process Xt on the probability space (, F, P ) is called a sub-
martingale with respect to the ltration Ft if:

(a) |Xt | dP < (Xt integrable);
(b) Xt is known if Ft is given (Xt is adaptable to Ft );
(c) E[Xt+s |Ft ] Xt , t, s 0 (future predictions exceed the present value).
Example 3.12.1 We shall prove that the process Xt = t + Wt , with > 0

is a submartingale.
The integrability follows from the inequality |Xt ()| t + |Wt ()| and in-
tegrability of Wt . The adaptability of Xt is obvious, and the last property
follows from the computation:
E[Xt+s |Ft ] = E[t + Wt+s |Ft ] + s > E[t + Wt+s |Ft ]

= t + E[Wt+s |Ft ] = t + Wt = Xt ,
where we used that Wt is a martingale.

Example 3.12.2 We shall show that the square of the Brownian motion, Wt2 ,
is a submartingale.
Using that Wt2 t is a martingale, we have

2 2
E[Wt+s |Ft ] = E[Wt+s (t + s)|Ft ] + t + s = Wt2 t + t + s
= Wt2 + s Wt2 .
The following result supplies examples of submartingales starting from

martingales or submartingales.
Proposition 3.12.3 (a) If Xt is a martingale and a convex function such
that (Xt ) is integrable, then the process Yt = (Xt ) is a submartingale.
(b) If Xt is a submartingale and an increasing convex function such that
(Xt ) is integrable, then the process Yt = (Xt ) is a submartingale.
(c) If Xt is a martingale and f (t) is an increasing, integrable function, then
Yt = Xt + f (t) is a submartingale.
Proof: (a) Using Jensens inequality for conditional probabilities, Exercise
2.13.7, we have

E[Yt+s |Ft ] = E[(Xt+s )|Ft ] E[Xt+s |Ft ] = (Xt ) = Yt .
(b) From the submartingale property and monotonicity of we have

E[Xt+s |Ft ] (Xt ).
Then apply a similar computation as in part (a).

(c) We shall check only the forecast property, since the other properties are
obvious.
E[Yt+s |Ft ] = E[Xt+s + f (t + s)|Ft ] = E[Xt+s |Ft ] + f (t + s)

= Xt + f (t + s) Xt + f (t) = Yt , s, t > 0.
Corollary 3.12.4 (a) Let Xt be a right continuous martingale. Then Xt2 ,

|Xt |, eXt are submartingales.
(b) Let > 0. Then et+Wt is a submartingale.
Proof: (a) Results from part (a) of Proposition 3.12.3.

(b) It follows from Example 3.12.1 and part (b) of Proposition 3.12.3.
The following result provides important inequalities involving submartin-

gales, see for instance Doob [14].
Proposition 3.12.5 (Doobs Submartingale Inequality) (a) Let Xt be a

non-negative submartingale. Then
E[Xt ]
P (sup Xs x) , x > 0.
st x
(b) If Xt is a right continuous submartingale, then for any x > 0
E[Xt+ ]
P (sup Xt x) ,
st x
where Xt+ = max{Xt , 0}.
Exercise 3.12.6 Let x > 0. Show the inequalities:

t
(a) P (sup Ws2 x) .
st x

2t/
(b) P (sup |Ws | x) .
st x
supst |Ws |
Exercise 3.12.7 Show that p-lim = 0.
t t
Exercise 3.12.8 Show that for any martingale Xt we have the inequality
E[Xt2 ]
P (sup Xt2 > x) , x > 0.
st x
It is worth noting that Doobs inequality implies Markovs inequality. Since

sup Xs Xt , then P (Xt x) P (sup Xs x). Then Doobs inequality
st st
E[Xt ]
P (sup Xs x)
st x
implies Markovs inequality (see Theorem 2.13.9)
E[Xt ]
P (Xt x) .
x
Exercise 3.12.9 Let Nt denote the Poisson process and consider the infor-
mation set Ft = {Ns ; s t}.
(a) Show that Nt is a submartingale;
(b) Is Nt2 a submartingale?
Exercise 3.12.10 It can be shown that for any 0 < < we have the in-
equality
N 2 4
t
E 2
t
t
Nt
Using this inequality prove that mslim = .
t t
The following famous inequality involving expectations was also found by
Doob. The proof can be found for instance in Chung and Williams [12].
Theorem 3.12.11 (Doobs inequality) If Xt is a continuous martingale,

then
E sup Xt2 4E[Xt2 ].
0tT
Exercise 3.12.12 Use Doobs inequality to show

E sup Wt2 4T.
0tT
Exercise 3.12.13 Find Doobs inequality for the martingale Xt = Wt2 t.

Chapter 4
Properties of Stochastic
Processes
This chapter presents detailed properties specic to stochastic processes, such

as stopping times, hitting times, bounded variation, quadratic variation as
well as some results regarding convergence and optimal stopping.
4.1 Stopping Times

Consider the probability space (, F, P ) and the ltration (Ft )t0 , i.e. an
ascending sequence of -elds
Fs Ft F, s < t.
Assume that the decision to stop playing a game before or at time t is deter-
mined by the information Ft available at time t. Then this decision can be
modeled by a random variable : [0, ], which satises
{; () t} Ft , t 0.
This means that given the information set Ft , we know whether the event
{; () t} had occurred or not. We note that the possibility = is also
included, since the decision to continue the game for ever is a possible event.
A random variable with the previous properties is called a stopping time.
The next example illustrates a few cases when a decision is or is not a stopping
time. In order to accomplish this, think of the situation that is the time
when some random event related to a given stochastic process occurs rst.
Example 4.1.1 Let Ft be the information available until time t regarding the
evolution of a stock. Assume the price of the stock at time t = 0 is $50 per
share. The following decisions are stopping times:
73
(a) Sell the stock when it reaches for the rst time the price of $100 per
share;
(b) Buy the stock when it reaches for the rst time the price of $10 per
share;
(c) Sell the stock at the end of the year;
(d) Sell the stock either when it reaches for the rst time $80 or at the end
of the year.
(e) Keep the stock either until the initial investment doubles or until the
end of the year;
The following decision is not a stopping time:
(f ) Sell the stock when it reaches the maximum level it will ever be.
Part (f ) is not a stopping time because it requires information about the

future that is not contained in Ft . In part (e) there are two conditions; the
latter one has the occurring probability equal to 1.
Exercise 4.1.2 Show that any positive constant, = c, is a stopping time

with respect to any ltration.
Exercise 4.1.3 Let () = inf{t > 0; |Wt ()| K}, with K > 0 constant.
Show that is a stopping time with respect to the ltration Ft = (Ws ; s t).
The random variable is called the rst exit time of the Brownian motion Wt
from the interval (K, K). In a similar way one can dene the rst exit time
of the process Xt from the interval (a, b):
() = inf{t > 0; Xt ()
/ (a, b)} = inf{t > 0; Xt () b or Xt () a)}.
Let X0 < a. The rst entry time of Xt in the interval [a, b] is dened as
() = inf{t > 0; Xt () [a, b]}.
If we let b = , we obtain the rst hitting time of the level a
() = inf{t > 0; Xt () a)}.
We shall deal with hitting times in more detail in section 4.3.
Exercise 4.1.4 Let Xt be a continuous stochastic process. Prove that the rst
exit time of Xt from the interval (a, b) is a stopping time.
We shall present in the following some properties regarding operations

with stopping times. Consider the notations 1 2 = max{1 , 2 }, 1 2 =
min{1 , 2 }, n = supn1 n and n = inf n1 n .
Properties of Stochastic Processes 75
Proposition 4.1.5 Let 1 and 2 be two stopping times with respect to the
ltration Ft . Then
1. 1 2
2. 1 2
3. 1 + 2
are stopping times.
Proof: 1. We have
{; 1 2 t} = {; 1 t} {; 2 t} Ft ,
since {; 1 t} Ft and {; 2 t} Ft . Then 1 2 is a stopping time.

2. The event {; 1 2 t} Ft if and only if {; 1 2 > t} Ft .
{; 1 2 > t} = {; 1 > t} {; 2 > t} Ft ,
since {; 1 > t} Ft and {; 2 > t} Ft , as the -algebra Ft is closed to

complements.
3. We note that 1 + 2 t if there is a c (0, t) such that
1 c, 2 t c.
Using that the rational numbers are dense in R, we can write

{; 1 + 2 t} = {; 1 c} {; 2 t c} Ft ,
0<c<t,cQ
since
{; 1 c} Fc Ft , {; 2 t c} Ftc Ft .
It follows that 1 + 2 is a stopping time.

A ltration Ft is called right-continuous if Ft = Ft+ 1 , for t > 0.
n
n=1
This means that the information available at time t is a good approximation
for any future innitesimal information Ft+
; or, equivalently, nothing more
can be learned by peeking innitesimally far into the future. If denote by

Ft+ = Ft+ 1 , then the right-continuity can be written conveniently as
n
n=1
Ft = Ft+ .
s
Exercise 4.1.6 (a) Let Ft = {Ws ; s t} and Gt = { 0 Wu du; s t},
where Wt is a Brownian motion. Is Ft right-continuous? What about Gt ?
(b) Let Nt = {Ns ; s t}, where Nt is a Poisson motion. Is Nt right-
continuous?
The next result states that in the case of a right-continuous ltration the
inequality { t} from the denition of the stopping time can be replaced by
a strict inequality.
Proposition 4.1.7 Let Ft be right-continuous. The following are equivalent:

(a) is a stoping time;
(b) { < t} Ft , for all t 0.
Proof: (a) = (b) Let be a stopping time. Then { t n1 } Ft 1

n
Ft , and hence n1 { t n1 } Ft . Then writing
1
{ < t} = { t } Ft
n
n1
it follows that { < t} Ft , i.e. is a stopping time.

(b) = (a) It follows from
1
{ t} = { < t + } Ft+ 1 = Ft+ = Ft .
n n
n1 n1
Proposition 4.1.8 Let Ft be right-continuous and (n )n1 be a sequence of

bounded stopping times. Then supn n and inf n are stopping times.
Proof: The fact that n = supn n is a stopping time follows from

{; n t} {; n t} Ft .
n1
In order to show that n = inf n is a stopping time we shall proceed as in

the following. Using that Ft is right-continuous and closed to complements,
using proposition 4.1.7, it suces to show that {; n t} Ft . This follows
from
{; n t} = {; n > t} Ft .
n1
Exercise 4.1.9 Let be a stopping time.

(a) Let c 1 be a constant. Show that c is a stopping time.
(b) Let f : [0, ) R be a continuous, increasing function satisfying
f (t) t. Prove that f ( ) is a stopping time.
(c) Show that e is a stopping time.
Exercise 4.1.10 Let be a stopping time and c > 0 a constant. Prove that
+ c is a stopping time.
Exercise 4.1.11 Let a be a constant and dene = inf{t 0; Wt = a}. Is

a stopping time?
Exercise 4.1.12 Let be a stopping time. Consider the following sequence

n = (m + 1)2n if m2n < (m + 1)2n (stop at the rst time of the form
k2n after ). Prove that n is a stopping time.
4.2 Stopping Theorem for Martingales

The next result states that in a fair game, the expected nal fortune of a
gambler, who is using a stopping time to quit the game, is the same as the
expected initial fortune. From the nancial point of view, the theorem says
that if you buy an asset at some initial time and adopt a strategy of deciding
when to sell it, then the expected price at the selling time is the initial price;
so one cannot make money by buying and selling an asset whose price is a
martingale. Fortunately, the price of a stock is not a martingale, and people
can still expect to make money buying and selling stocks.
If (Mt )t0 is an Ft -martingale, then taking the expectation in
E[Mt |Fs ] = Ms , s < t
and using Example 2.12.4 yields
E[Mt ] = E[Ms ], s < t.
In particular, E[Mt ] = E[M0 ], for any t > 0. The next result states necessary
conditions under which this identity holds if t is replaced by any stopping time
. The reader can skip the proof at the rst reading.
Theorem 4.2.1 (Optional Stopping Theorem) Let (Mt )t0 be a right con-
tinuous Ft -martingale and be a stopping time with respect to Ft such that
is bounded, i.e. N < such that N .
Then E[M ] = E[M0 ]. If Mt is an Ft -submartingale, then E[M ] E[M0 ].
Proof: Consider the following convenient notation for the indicator function
of a set
1, () > t;
1{ >t} () =
0, () t.
Taking the expectation in relation
M = M t + (M Mt )1{ >t} ,
see Exercise 4.2.3, yields
E[M ] = E[M t ] + E[M 1{ >t} ] E[Mt 1{ >t} ].
Since M t is a martingale, see Exercise 4.2.4 (b), then E[M t ] = E[M0 ]. The
previous relation becomes
E[M ] = E[M0 ] + E[M 1{ >t} ] E[Mt 1{ >t} ], t > 0.
Taking the limit yields
E[M ] = E[M0 ] + lim E[M 1{ >t} ] lim E[Mt 1{ >t} ]. (4.2.1)

t t
We shall show that both limits are equal to zero.

Since |M 1{ >t} | |M |, t > 0, and M is integrable, see Exercise 4.2.4
(a), by the dominated convergence theorem we have

lim E[M 1{ >t} ] = lim M 1{ >t} dP = lim M 1{ >t} dP = 0.
t t t
For the second limit

lim E[Mt 1{ >t} ] = lim Mt 1{ >t} dP = 0,
t t
since for t > N the integrand vanishes. Hence relation (4.2.1) yields E[M ] =
E[M0 ].
It is worth noting that the previous theorem is a special case of the more
general Optional Stopping Theorem of Doob:
Theorem 4.2.2 Let Mt be a right continuous martingale and , be two

bounded stopping times, with . Then M , M are integrable and
E[M |F ] = M a.s.
In particular, taking expectations, we have
E[M ] = E[M ] a.s.
In the case when Mt is a submartingale then E[M ] E[M ] a.s.

2.5
Wt
2.0
a
1.5
Ta
50 100 150 200 250 300 350
Figure 4.1: The rst hitting time Ta given by WTa = a.
M = M t + (M Mt )1{ >t} ,
where 1{ >t} is the indicator function of the set { > t}.
Exercise 4.2.4 Let Mt be a right continuous martingale and be a bounded

stopping time. Show that
(a) M is integrable;
(b) M t is a martingale.
Exercise 4.2.5 Show that letting = 0 in Theorem 4.2.2 yields Theorem

4.2.1.
4.3 The First Passage of Time

The rst passage of time is a particular type of hitting time, which is useful
in nance when studying barrier options and lookback options. For instance,
knock-in options enter into existence when the stock price hits for the rst
time a certain barrier before option maturity. A lookback option is priced
using the maximum value of the stock until the present time. The stock price
is not a Brownian motion, but it depends on one. Hence the need for studying
the hitting time for the Brownian motion.
The rst result deals with the rst hitting time for a Brownian motion to
reach the barrier a R, see Fig. 4.1.
Lemma 4.3.1 Let Ta be the rst time the Brownian motion Wt hits a. Then
the distribution function of Ta is given by

2 y 2 /2
P (Ta t) = e dy.
2 |a|/ t
Proof: If A and B are two events, then
P (A) = P (A B) + P (A B)
= P (A|B)P (B) + P (A|B)P (B). (4.3.2)
Let a > 0. Using formula (4.3.2) for A = {; Wt () a} and B = {; Ta ()

t} yields
P (Wt a) = P (Wt a|Ta t)P (Ta t)

+P (Wt a|Ta > t)P (Ta > t) (4.3.3)
If Ta > t, the Brownian motion did not reach the barrier a yet, so we must
have Wt < a. Therefore
P (Wt a|Ta > t) = 0.
If Ta t, then WTa = a. Since the Brownian motion is a Markov process,

it starts fresh at Ta . Due to symmetry of the density function of a normal
variable, Wt has equal chances to go up or go down after the time interval
t Ta . It follows that
1
P (Wt a|Ta t) = .
2
Substituting into (4.3.3) yields
P (Ta t) = 2P (Wt a)

2 x2 /(2t) 2 y 2 /2
= e dx = e dy.
2t a 2 a/ t
If a < 0, symmetry implies that the distribution of Ta is the same as that of

Ta , so we get

2 y 2 /2
P (Ta t) = P (Ta t) = e dy.
2 a/ t
Remark 4.3.2 The previous proof is based on a more general principle called
the reection principle: If is a stopping time for the Brownian motion Wt ,
then the Brownian motion reected at is also a Brownian motion.
Theorem 4.3.3 Let a R be xed. Then the Brownian motion hits a (in a
nite amount of time) with probability 1.
Proof: The probability that Wt hits a (in a nite amount of time) is

2 y 2 /2
P (Ta < ) = lim P (Ta t) = lim e dy
t t 2 |a|/ t

2 2
= ey /2 dy = 1,
2 0
where we used the well known integral

2 1 y2 /2 1
ey /2 dy = e dy = 2.
0 2 2
The previous result stated that the Brownian motion hits the barrier a
almost surely. The next result shows that the expected time to hit the barrier
is innite.
Proposition 4.3.4 The random variable Ta has a Pearson 5 distribution

given by
|a| a2 3
p(t) = e 2t t 2 , t > 0.
2
a2
It has the mean E[Ta ] = and the mode .
3
Proof: Dierentiating in the formula of distribution function1

2 y 2 /2
FTa (t) = P (Ta t) = e dy
2 a/ t
yields the following probability density function
dFTa (t) a a2 3
p(t) = = e 2t t 2 , t > 0.
dt 2
This is a Pearson 5 distribution with parameters = 1/2 and = a2 /2. The

expectation is

a 1 a2
E[Ta ] = tp(t) dt = e 2t dt.
0 2 0 t
(t)
d
1
One may use Leibnizs formula f (u) du = f ((t)) (t) f ((t)) (t).
dt (t)
0.4
0.3
0.2
0.1
2
a3
1 2 3 4
Figure 4.2: The distribution of the rst hitting time Ta .
a2 a2
Using the inequality e 2t > 1 , t > 0, we have the estimation
2t

a 1 a3 1
E[Ta ] > dt 3/2
dt = , (4.3.4)
2 0 t 2 2 0 t
1
1
since dt is divergent and
0 t 0 t3/2 dt is convergent.
The mode of Ta is given by

a2 a2
= 1 = .
+1 2( 2 + 1) 3
Remark 4.3.5 The distribution has a peak at a2 /3. Then if we need to pick
a small time interval [t dt, t + dt] in which the probability that the Brownian
motion hits the barrier a is maximum, we need to choose t = a2 /3, see Fig. 4.2.
Remark 4.3.6 Formula (4.3.4) states that the expected waiting time for Wt
to reach the barrier a is innite. However, the expected waiting time for the
Brownian motion Wt to hit either a or a is nite, see Exercise 4.3.9.
Corollary 4.3.7 A Brownian motion process returns to the origin in a nite

amount time with probability 1.
Proof: Choose a = 0 and apply Theorem 4.3.3.
Exercise 4.3.8 Try to apply the proof of Lemma 4.3.1 for the following stochas-
tic processes
(a) Xt = t + Wt , with , > 0 constants;
t
(b) Xt = Ws ds.
0
Where is the diculty?
Exercise 4.3.9 Let a > 0 and consider the hitting time
a = inf{t > 0; |Wt | a}.
Prove that E[a ] = a2 .
Exercise 4.3.10 (a) Show that the distribution function of the process
Xt = max Ws
s[0,t]
is given by
a/ t
2 2 /2
P (Xt a) = ey dy,
2 0
and the probability density is
2 x2
pt (x) = e 2t , x 0.
2t
2
(b) Show that E[Xt ] = 2t/ and V ar(Xt ) = t 1 .

Exercise 4.3.11 (a) Show that the probability density of the absolute value
of a Brownian motion Xt = |Wt |, t 0, is given by
2 x2
pt (x) = e 2t , x 0.
2t
(b) Consider the processes Xt = |Wt | and Yt = |Bt |, with Wt and Bt indepen-
dent Brownian motions. Use Theorem 2.11.1 to obtain the probability density
of the sum process Zt = Xt + Yt .
The fact that a Brownian motion returns to the origin or hits a barrier
almost surely is a property characteristic to the rst dimension only. The
next result states that in larger dimensions this is no longer possible.
Theorem 4.3.12 Let (a, b) R2 . The 2-dimensional Brownian motion W (t) =

W1 (t), W2 (t) (with W1 (t) and W2 (t) independent) hits the point (a, b) with
probability zero. The same result is valid for any n-dimensional Brownian
motion, with n 2.
However, if the point (a, b) is replaced by the disk
D
(x0 ) = {x R2 ; |x x0 | },
then there is a dierence in the behavior of the Brownian motion from n = 2

to n > 2, as pointed out by the next two results:

Theorem 4.3.13 The 2-dimensional Brownian motion W (t) = W1 (t), W2 (t)
hits the disk D
(x0 ) with probability one.
Theorem 4.3.14 Let n > 2. The n-dimensional Brownian motion W (t) hits
the ball D
(x0 ) with probability
|x | 2n
0
P = < 1.

The previous results can be stated by saying that that Brownian motion
is transient in Rn , for n > 2. If n = 2 the previous probability equals 1. We
shall come back with proofs to the aforementioned results in a later chapter
(see section 9.6).
Remark 4.3.15 If life spreads according to a Brownian motion, the afore-

mentioned results explain why life is more extensive on earth rather than in
space. The probability for a form of life to reach a planet of radius R situated
at distance d is Rd . Since d is large the probability is very small, unlike in the
plane, where the probability is always 1.
Exercise 4.3.16 Is the one-dimensional Brownian motion transient or recur-

rent in R?
4.4 The Arc-sine Laws

In this section we present a few results which provide certain probabilities
related with the behavior of a Brownian motion in terms of the arc-sine of
a quotient of two time instances. These results are generally known as the
Arc-sine Laws.
The following result will be used in the proof of the rst Arc-sine Law.
Proposition 4.4.1 (a) If X : N is a discrete random variable, then for

any subset A , we have

P (A) = P (A|X = x)P (X = x).
xN
(b) If X : R is a continuous random variable, then

P (A) = P (A|X = x)dP = P (A|X = x)fX (x) dx.
0.4
0.2
t1 t2
50 100 150 200 250 300 350
0.2
a
0.4
Wt
0.6
Figure 4.3: The event A(a; t1 , t2 ) in the Arc-sine Law.
Proof: (a) The sets X 1 (x) = {X = x} = {; X() = x} form a partition of

the sample space , i.e.:

(i) = x X 1 (x);
(ii) X 1 (x) X 1 (y) = for x = y.

Then A = A X 1 (x) = A {X = x} , and hence
x x

P (A) = P A {X = x}
x
P (A {X = x})
= P ({X = x})
x
P ({X = x})

= P (A|X = x)P (X = x).
x
(b) In the case when X is continuous, the sum is replaced by an integral and
the probability P ({X = x}) by fX (x)dx, where fX is the density function of
X.
The zero set of a Brownian motion Wt is dened by {t 0; Wt = 0}. Since
Wt is continuous, the zero set is closed with no isolated points almost surely.
The next result deals with the probability that the zero set does not intersect
the interval (t1 , t2 ).
Theorem 4.4.2 (The Arc-sine Law) The probability that a Brownian mo-
tion Wt does not have any zeros in the interval (t1 , t2 ) is equal to

2 t1
0, t1 t t2 ) = arcsin
P (Wt = .
t2
Proof: The proof follows Ross [43]. Let A(a; t1 , t2 ) denote the event that the
Brownian motion Wt takes on the value a between t1 and t2 . In particular,
A(0; t1 , t2 ) denotes the event that Wt has (at least) a zero between t1 and
t2 . Substituting A = A(0; t1 , t2 ) and X = Wt1 into the formula provided by
Proposition 4.4.1

P (A) = P (A|X = x)fX (x) dx
yields

P A(0; t1 , t2 ) = P A(0; t1 , t2 )|Wt1 = x fWt (x) dx (4.4.5)
1

1 x2
= P A(0; t1 , t2 )|Wt1 = x e 2t1 dx.
2t1
Using the properties of Wt with respect to time translation and symmetry we
have

P A(0; t1 , t2 )|Wt1 = x = P A(0; 0, t2 t1 )|W0 = x

= P A(x; 0, t2 t1 )|W0 = 0

= P A(|x|; 0, t2 t1 )|W0 = 0

= P A(|x|; 0, t2 t1 )

= P T|x| t2 t1 ,
the last identity stating that Wt hits |x| before t2 t1 . Using Lemma 4.3.1
yields

2 y
2
P A(0; t1 , t2 )|Wt1 = x = e 2(t2 t1 ) dy.

2(t2 t1 ) |x|
Substituting into (4.4.5) we obtain
x2
1 2 2
2(t yt )
P A(0; t1 , t2 ) = e 2 1 dy e 2t1 dx
2t1 2(t2 t1 ) |x|
2 2
1 y x
= e 2(t2 t1 ) 2t1 dydx.
t1 (t2 t1 ) 0 |x|
The above integral can be evaluated to get (see Exercise 4.4.3 )

2 t1
P A(0; t1 , t2 ) = 1 arcsin .
t2

Using P (Wt = 0, t1 t t2 ) = 1 P A(0; t1 , t2 ) we obtain the desired
result.
Exercise 4.4.3 Use polar coordinates to show

2

1 x2 2 t1
2(t yt ) 2t
e 2 1 1 dydx = 1 arcsin .
t1 (t2 t1 ) 0 |x| t2
Exercise 4.4.4 Find the probability that a 2-dimensional Brownian motion

W (t) = W1 (t), W2 (t) stays in the same quadrant for the time interval t
(t1 , t2 ).
Exercise 4.4.5 Find the probability that a Brownian motion Wt does not take
the value a in the interval (t1 , t2 ).
Exercise 4.4.6 Let a = b. Find the probability that a Brownian motion Wt

does not take any of the values {a, b} in the interval (t1 , t2 ). Formulate and
prove a generalization.
We provide below without proof a few similar results dealing with arc-sine
probabilities, whose proofs can be found for instance in Kuo [30]. The rst
result deals with the amount of time spent by a Brownian motion on the
positive half-axis.
t
Theorem 4.4.7 (Arc-sine Law of L evy) Let L+ +
t = 0 sgn Ws ds be the
amount of time a Brownian motion Wt is positive during the time interval
[0, t]. Then
+ 2
P (Lt ) = arcsin .
t
The next result deals with the Arc-sine Law for the last exit time of a
Brownian motion from 0.
Theorem 4.4.8 (Arc-sine Law of exit from 0) Let t = sup{0 s
t; Ws = 0}. Then

2
P (t ) = arcsin , 0 t.
t
The Arc-sine Law for the time the Brownian motion attains its maximum
on the interval [0, t] is given by the next result.
Theorem 4.4.9 (Arc-sine Law of maximum) Let Mt = max Ws and de-
0st
ne
t = sup{0 s t; Ws = Mt }.
Then
2 s
P (t s) = arcsin , 0 s t, t > 0.
t
4.5 More on Hitting Times

In this section we shall deal with results regarding hitting times of Brownian
motion with drift. These type of results are useful in Mathematical Finance
when nding the value of barrier options.
Theorem 4.5.1 Let Xt = t + Wt denote a Brownian motion with nonzero

drift rate , and consider , > 0. Then
e2 1
P (Xt goes up to before down to ) = .
e2 e2
Proof: Let T = inf{t > 0; Xt or Xt } be the rst exit time of Xt

from the interval (, ), which is a stopping time, see Exercise 4.1.4. The
exponential process
c2
Mt = ecWt 2 t , t0
is a martingale, see Exercise 3.2.4(c). Then E[Mt ] = E[M0 ] = 1. By the
Optional Stopping Theorem (see Theorem 4.2.1), we get E[MT ] = 1. This can
be written as
1 2 1 2
1 = E[ecWT 2 c T ] = E[ecXT (c+ 2 c )T
]. (4.5.6)
Choosing c = 2 yields E[e2XT ] = 1. Since the random variable XT takes

only the values and , if we let p = P (XT = ), the previous relation
becomes
e2 p + e2 (1 p ) = 1.
Solving for p yields
e2 1
p = (4.5.7)
e2 e2
Noting that
p = P (Xt goes up to before down to )
leads to the desired answer.

It is worth noting how the previous formula changes in the case when the
drift rate is zero, i.e. when = 0, and Xt = Wt . The previous probability is
computed by taking the limit 0 and using LHospitals rule
e2 1 2e2
lim 2 2
= lim 2 2
= .
0 e e 0 2e + 2e +
Hence

P (Wt goes up to before down to ) = .
+
Taking the limit we recover the following result
P (Wt hits ) = 1.
If = we obtain
1
P (Wt goes up to before down to ) = ,
2
which shows that the Brownian motion is equally likely to go up or down an

amount in a given time interval.
If T and T denote the times when the process Xt reaches and , respec-
tively, then the aforementioned probabilities can be written using inequalities.
For instance, the rst identity becomes
e2 1
P (T T ) = .
e2 e2
Exercise 4.5.2 Let Xt = t + Wt denote a Brownian motion with nonzero

drift rate , and consider > 0.
(a) If > 0 show that
P (Xt goes up to ) = 1.
(b) If < 0 show that
P (Xt goes up to ) = e2 < 1.
Formula (a) can be written equivalently as
P (sup(Wt + t) ) = 1, 0,
t0
while formula (b) becomes
P (sup(Wt + t) ) = e2 , < 0,
t0
or
P (sup(Wt t) ) = e2 , > 0,
t0
which is known as one of the Doobs inequalities. This can also be described
in terms of stopping times as follows. Dene the stopping time
= inf{t > 0; Wt t }.
Using
P ( < ) = P sup(Wt t)
t0
yields the identities
P ( < ) = e2 , > 0,
P ( < ) = 1, 0.
Exercise 4.5.3 Let Xt = t + Wt denote a Brownian motion with nonzero

drift rate , and consider > 0. Show that the probability that Xt never hits
is given by
1 e2 , if > 0
0, if < 0.
Recall that T is the rst time when the process Xt hits or .
Exercise 4.5.4 (a) Show that
e2 + e2
E[XT ] = .
e2 e2
(b) Find E[XT2 ];
(c) Compute V ar(XT ).
The next result deals with the time one has to wait (in expectation) for
the process Xt = t + Wt to reach either or .
Proposition 4.5.5 The expected value of T is
e2 + e2
E[T ] = .
(e2 e2 )
Proof: Using that Wt is a martingale, with E[Wt ] = E[W0 ] = 0, applying the
Optional Stopping Theorem, Theorem 4.2.1, yields
0 = E[WT ] = E[XT T ] = E[XT ] E[T ].
Then by Exercise 4.5.4(a) we get
E[XT ] e2 + be2
E[T ] = = .
(e2 e2 )
Exercise 4.5.6 Take the limit 0 in the formula provided by Proposition

4.5.5 to nd the expected time for a Brownian motion to hit either or .
Exercise 4.5.7 Find E[T 2 ] and V ar(T ).
Exercise 4.5.8 (Walds identities) Let T be a nite and bounded stopping

time for the Brownian motion Wt . Show that:
(a) E[WT ] = 0;
(b) E[WT2 ] = E[T ].
The previous techniques can also be applied to right continuous martin-

gales. Let a > 0 and consider the hitting time of the Poisson process for the
barrier a
= inf{t > 0; Nt a}.
Proposition 4.5.9 The expected waiting time for Nt to reach the barrier a
is E[ ] = a .
Proof: Since Mt = Nt t is a right continuous martingale, by the Optional

Stopping Theorem E[M ] = E[M0 ] = 0. Then E[N ] = 0 and hence
E[ ] = 1 E[N ] = a .
4.6 The Inverse Laplace Transform Method

In this section we shall use the Optional Stopping Theorem in conjunction
with the inverse Laplace transform to obtain the probability density functions
for hitting times.
The case of standard Brownian motion Let x > 0. The rst hitting time
1 2
= Tx = inf{t > 0; Wt x} is a stopping time. Since Mt = ecWt 2 c t , t 0,
is a martingale, with E[Mt ] = E[M0 ] = 1, by the Optional Stopping Theorem,
see Theorem 4.2.1, we have
E[M ] = 1.
1 2
This can be written equivalently as E[ecW e 2 c ] = 1. Using W = x, we get
1 2
E[e 2 c ] = ecx .
1 2
It is worth noting that c > 0. This is implied from the fact that e 2 c
<1
and , x > 0.
Substituting s = 12 c2 , the previous relation becomes

E[es ] = e 2sx
. (4.6.8)
This relation has a couple of useful applications.

Proposition 4.6.1 The moments of the rst hitting time are all innite
E[ n ] = , n 1.
Proof: The nth moment of can be obtained by dierentiating and taking

s=0
dn
E[es ] = E[( )n es ] = (1)n E[ n ].
dsn
s=0 s=0
Using (4.6.8) yields
dn 2sx
E[ n ] = (1)n e .
dsn
s=0
Since by induction we have
dn 2sx Mk xnk n1
n 2sx
e = (1) e ,
dsn 2rk /2 s(n+k)/2
k=0
with Mk , rk positive integers, it easily follows that E[ n ] = .

For instance, in the case n = 1, we have
d 2sx
2sx x
E[ ] = e = lim e = +.
ds s0 +
2 2sx
s=0
Another application involves the inverse Laplace transform to get the prob-
ability density. This way we can retrieve the result of Proposition 4.3.4.
Proposition 4.6.2 The probability density of the hitting time is given by
|x| x2
p(t) = e 2t , t > 0. (4.6.9)
2t3
Proof: Let x > 0. The expectation

E[es ] = es p( ) d = L{p( )}(s)
0
is the Laplace transform of p( ). Applying the inverse Laplace transform

yields

p( ) = L1 {E[es ]}( ) = L1 {e 2sx
}( )
x x2
= e 2 , > 0.
2 3
In the case x < 0 we obtain

x x2
p( ) = e 2 , > 0,
2 3
which leads to (4.6.9).

The computation on the inverse Laplace transform L1 {e 2sx }( ) is be-
yond the goal of this book. The reader can obtain the value of this inverse
Laplace transform using the Mathematica software. However, the more math-
ematically interested reader is referred to consult the method of complex in-
tegration in a book on inverse Laplace transforms.
Another application of formula (4.6.8) is the following inequality.
Proposition 4.6.3 (Cherno bound) Let be the rst hitting time when
the Brownian motion Wt hits the barrier x, x > 0. Then
x2
P ( ) e 2 , > 0.
Proof: Let s = t in part 2 of Theorem 2.13.11 and use (4.6.8) to get
E[etX ] E[esX ]
P ( ) t
= s
= esx 2s , s > 0.
e e

Then P ( ) emins>0 f (s) , where f (s) = s x 2s. Since f (s) = x2s ,
x 2
then f (s) reaches its minimum at the critical point s0 = 2 2 . The minimum
value is
x2
min f (s) = f (s0 ) = .
s>0 2
Substituting in the previous inequality leads to the required result.
The case of Brownian motion with drift Consider the Brownian motion
with drift Xt = t + Wt , with , > 0. Let
= inf{t > 0; Xt x}
denote the rst hitting time of the barrier x, with x > 0. We shall compute
the distribution of the random variable and its rst two moments.
Applying the Optional Stopping Theorem (Theorem 4.2.1) to the martin-
1 2
gale Mt = ecWt 2 c t yields
E[M ] = E[M0 ] = 1.
Using that W = 1 (X ) and X = x, the previous relation becomes

c 1 2
E[e( + 2 c )
] = e x .
c
(4.6.10)
c 1 2
Substituting s = + c and completing to a square yields
2
2 2
2s + = c + .
2
Solving for c we get the solutions

2 2
c = + 2s + 2 , c= 2s + .
2
Assume c < 0. Then substituting the second solution into (4.6.10) yields
1

2 2
E[es ] = e 2 (+ 2s + )x .
1

s 2 (+ 2s2 +2 )x
This relation is contradictory since e < 1 while e > 1,
where we used that s, x, > 0. Hence it follows that c > 0. Substituting the
rst solution into (4.6.10) leads to
1

s ( 2s2 +2 )x
E[e ] = e 2 .
We arrive at the following result:
Proposition 4.6.4 Assume , x > 0. Let be the time the process Xt =

t + Wt hits x for the rst time. Then we have
1

s ( 2s2 +2 )x
E[e ] = e 2 , s > 0. (4.6.11)
Proposition 4.6.5 Let be the time the process Xt = t + Wt hits x, with

x > 0 and > 0.
(a) Then the density function of is given by
x (x )2
p( ) = e 2 2 , > 0. (4.6.12)
2 3/2
(b) The mean and variance of are
x x 2
E[ ] = , V ar( ) = .
3
Proof: (a) Let p( ) be the density function of . Since

E[es ] = es p( ) d = L{p( )}(s)
0
is the Laplace transform of p( ), applying the inverse Laplace transform yields

1

1 s 1 ( 2s2 +2 )x
p( ) = L {E[e ]} = L {e 2 }
x (x )2
= e 2 2 , > 0.
2 3/2
It is worth noting that the computation of the previous inverse Laplace trans-
form is non-elementary; however, it can be easily computed using the Mathe-
matica software.
(b) The moments are obtained by dierentiating the moment generating
function and taking the value at s = 0

d d 1
E[es ] = e 2 ( 2s + )x
2 2
E[ ] =
ds ds
s=0
s=0
x 1
( 2s2 +2 )x
= e 2

2 + 2s s=0
x
= .

d2 s d2 12 ( 2s2 +2 )x
E[ 2 ] = (1)2 E[e ] = e

ds2 ds2
s=0

s=0
x( 2 + x 2 + 2s 2 ) 12 ( 2s2 +2 )x
= e
(2 + 2s 2 )3/2 s=0
x 2 x2
= + 2.
3
Hence
x 2
V ar( ) = E[ 2 ] E[ ]2 = .
3
It is worth noting that we can arrive at the formula E[ ] = x in the following
heuristic way. Taking the expectation in the equation + W = x yields
E[ ] = x, where we used that E[W ] = 0 for any nite stopping time
(see Exercise 4.5.8 (a)). Solving for E[ ] yields the aforementioned formula.
Even if the computations are more or less similar to the previous result,
we shall treat next the case of the negative barrier in its full length. This
is because of its particular importance in being useful in practical problems,
such as pricing perpetual American puts.
Proposition 4.6.6 Assume , x > 0. Let be the time the process Xt =

t + Wt hits x for the rst time.
(a) We have
1
s (+ 2s2 +2 )x
E[e ]=e 2 , s > 0. (4.6.13)
(b) Then the density function of is given by
x (x+ )2
p( ) = e 2 2 , > 0. (4.6.14)
2 3/2
(c) The mean of is
x 2x
E[ ] = e 2 .

Proof: (a) Consider the stopping time = inf{t > 0; Xt = x}. By the
Optional Stopping Theorem (Theorem 4.2.1) applied to the martingale Mt =
c2
ecWt 2 t yields
c2 c c2
1 = M0 = E[M ] = E ecW 2 = E e (X ) 2
c c c2 c c2
= E e x 2 = e x E e( + 2 ) .
c
Therefore
c c2
E e( + 2 ) = e x .
c
(4.6.15)

2 2
If let s = c
+ 2 , then solving for c yields c =
c
2s + 2 , but only the
negative solution works out; this comes from the fact that both terms of the
equation (4.6.15) have to be less than 1. Hence (4.6.15) becomes
1
2 2
E[es ] = e 2 (+ 2s + )x , s > 0.
(b) Relation (4.6.13) can be written equivalently as

1

2 2
L p( ) = e 2 (+ 2s + )x .
Taking the inverse Laplace transform, and using Mathematica software to

compute it, we obtain
1 2 2 x (x+ )2
p( ) = L 1
e 2 (+ 2s + )x ( ) = e 2 2 , > 0.
2 3/2
(c) Dierentiating and evaluating at s = 0 we obtain
d 2 x 2 x 2x
E[ ] = E[es ] = e 2 x 2 = e 2 .
ds s=0 2
Exercise 4.6.7 Assume the hypothesis of Proposition 4.6.6 are satised. Find
V ar( ).
Exercise 4.6.8 Find the modes of distributions (4.6.12) and (4.6.14). What
do you notice?
Exercise 4.6.9 Let Xt = 2t + 3Wt and Yt = 2t + Wt .

(a) Show that the expected times for Xt and Yt to reach any barrier x > 0
are the same.
(b) If Xt and Yt model the prices of two stocks, which one would you like
to own?
Exercise 4.6.10 Does 4t + 2Wt hit 9 faster (in expectation) than 5t + 3Wt
hits 14?
Exercise 4.6.11 Let be the rst time the Brownian motion with drift Xt =
t + Wt hits x, where , x > 0. Prove the inequality
x2 +2 2
P ( ) e 2
+x
, > 0.
The double barrier case In the following we shall consider the case of double
barrier. Consider the Brownian motion with drift Xt = t + Wt , > 0. Let
, > 0 and dene the stopping time
T = inf{t > 0; Xt or Xt }.
Relation (4.5.6) states

1 2
E[ecXT e(c+ 2 c )T
] = 1.
Since the random variables T and XT are independent (why?), we have
1 2
E[ecXT ]E[e(c+ 2 c )T
] = 1.
Using E[ecXT ] = ec p + ec (1 p ), with p given by (4.5.7), then
1 2 1
E[e(c+ 2 c )T
]=
ec p + ec (1 p )
If we substitute s = c + 12 c2 , then
1
E[esT ] = (4.6.16)
e(+ 2s+2 ) p + e(+ 2s+2 ) (1 p )
The probability density of the stopping time T is obtained by taking the inverse
Laplace transform of the right side expression

1
p(T ) = L1 2
2
( ),
e(+ 2s+ ) p + e(+ 2s+ ) (1 p )
an expression which is not feasible for having a closed form solution. However,
expression (4.6.16) would be useful for computing the price for double barrier
derivatives.
Exercise 4.6.12 Use formula (4.6.16) to nd the expectation E[T ].
Exercise 4.6.13 Let Tx = inf{t > 0; |Wt | x}, for x 0.

(a) Show that E[ecWTx ] = cosh(cx), for any c > 0.
(b) Prove that

E[eTx ] = sech( 2x), 0.
Exercise 4.6.14 Denote by Mt = Nt t the compensated Poisson process

and let c > 0 be a constant.
(a) Show that
Xt = ecMt t(e c1)
c
is an Ft -martingale, with Ft = (Nu ; u t).

(b) Let a > 0 and T = inf{t > 0; Mt > a} be the rst hitting time of the level
a. Use the Optional Stopping Theorem to show that
E[esT ] = e(s)a , s > 0,
where : [0, ) [0, ) is the inverse function of f (x) = ex x 1.

(c) Show that E[T ] = .
(d) Can you use the inverse Laplace transform to nd the probability density
function of T ?
4.7 The Theorems of L

evy and Pitman
This section presents two of the most famous results on Brownian motions,
which are Levys and Pitmans theorems. They deal with surprising equiva-
lences in law involving a Brownian motion and its maximum and minimum.
Let Wt be a standard Brownian motion and denote its running maximum
by
Mt = max Ws .
0st
2.5
2b 2.0
Wt
2ba
1.5
b 1.0
0.5
a
O Wt
0.2 0.4
Tb 0.6 0.8
t 1.0
0.5
Figure 4.4: The reection principle for a Brownian motion.
From Exercise 4.3.10 the probability density of Mt is given by

2 x2
pt (x) = e 2t , x 0.
2t
Lemma 4.7.1 The joint density function of (Wt , Mt ) is given by

2(2b a) (2ba)2
f (a, b) = e 2t , a b, b 0.
2t3/2
Proof: Let a b. Assume Mt b, i.e. Ws takes values larger than or equal to
b in the interval [0, t]. By the reection principle, see Fig. 4.4, the Brownian
motion reected at the stopping time Tb is also a Brownian motion. The
probabilities to increase or decrease by an amount of at least b a are then
equal
P (Wt 2b a | Mt b) = P (Wt a | Mt b).
Multiplying by P (Mt b) and using the conditional probability formula yields
P (Wt 2b a, Mt b) = P (Wt a, Mt b). (4.7.17)
Conditions Wt 2b a and 2b a b imply Mt b, so
P (Mt b | Wt 2b a) = 1.
Then the left side of (4.7.17) can be computed as
P (Wt 2b a, Mt b) = P (Wt 2b a)P (Mt b | Wt 2b a)
= P (Wt 2b a)

1 x2
= e 2t dx.
2t 2ba
Substituting into (4.7.17) implies

1 x2
P (Wt a, Mt b) = e 2t dx.
2t 2ba
The associated probability density f (a, b) can be obtained by dierentiation

P Wt (a, a a), Mt (b, b + b)
f (a, b) = lim lim
a0 b0 a b

2 1 x2

= e 2t dx
ab 2t 2ba
2(2b a) (2ba)2
= e 2t .
2t3
The following equivalence in law was found by Levy [33] in 1948.

Theorem 4.7.2 The processes Xt = Mt Wt and |Wt |, t 0 have the same
probability law.
Proof: The probability density of |Wt | is given by Exercise 4.3.11 (a)
2 x2
pt (x) = e 2t , x 0. (4.7.18)
2t
Using a direct computation and Lemma 4.7.1 we shall show that Xt = Mt Wt
also has the density (4.7.18). For u 0 we have

P (Xt u) = P (Mt Wt u) = f (x, y) dx dy
{0yxu,y0}
= I1 I2 ,
with
I1 = f (x, y) dx dy, I2 = f (x, y) dx dy.
{yxu,y0} {0yx}
This writes the integral over a strip as a dierence of integrals over the interior
of two angles, see Fig. 4.5(a). A great simplication of computation is done
by observing that the second integral vanishes
x
I2 = f (x, y) dy dx
0 0
x
2 (2yx)2
= (2y x)e 2t dy dx
2t3/2 0
x0
1 z2
= ze 2t dz dx = 0,
2t3/2 0 x
3.0 3.0
2.5 2.5
A
2.0 2.0
1.5 1.5
u yx
y xu
1.0
y x u 2 1.0
0.5
yx 0.5
C O B
1.0 0.5 0.5 1.0 1.0 0.5 0.5 1.0
u O
u u
0.5 0.5
a b
Figure 4.5: (a) The integration strip for the proof of Levys theorem. (b) The
domains DAOC , DAOB , and DABC for the proof of Pitmans theorem.
as an integral of an odd function over a symmetric interval. Next we shall

compute the rst integral
u+x
I1 = f (x, y) dy dx
u 0
u+x
2 (2yx)2
= (2y x)e 2t dy dx
2t3/2 u 0
(2u+x)/2t
2 2
= zez dz dx,
2t u x/ 2t
where we substituted z = 2y x. Changing the variable in r = z 2 , we get

(2u+x)2 /(2t)
1
I1 = er dr dx
2t u x /(2t)
2

1 x2 (2u+x)2
= e 2t e 2t dx
2t u
2

1 1 v2
x2t
= e dx e 2t dv
2t u 2t u
u u
1 x2 2 x2
2t
= e dx = e 2t dx.
2t u 2t 0
2
u x2
Therefore P (Xt u) = I1 = 2t 0 e
2t dx. Dierentiate with respect to u
we obtain the probability density of Xt

u
d d 2 x2
pXt (u) = P (Xt u) = e 2t dx
du du 2t 0
2 u2
= e 2t , u 0,
2t
which matches relation (4.7.18).

Denote the running minimum of a Brownian motion Wt by
mt = min Ws .
0st
Corollary 4.7.3 Let Bt be a Brownian motion. Then the processes Yt =

Bt mt and |Bt |, t 0 have the same probability law.
Proof: Consider the reected Brownian motion Wt = Bt . Then |Wt | = |Bt |
and Bt mt = Mt Wt . Applying Theorem 4.7.2 for Wt implies the desired
result.
It is worth concluding that the processes |Wt |, Mt and Mt Wt have the
same law.
Before getting to the second result of this section, we shall recall a few
results. The 3-dimensional Bessel process Rt is the process satised by the Eu-
clidean distance from the origin to a 3-dimensional Brownian motion W (t) =
(W1 (t), W2 (t), W2 (t)) in R3

Rt = W1 (t)2 + W2 (t)2 + W2 (t)3 .
Its probability density is given by Proposition 3.7.1

2 2
pt () = 2 e 2t , > 0. (4.7.19)
2t3/2
In the following we present another striking identity in law, similar to Levys,
which was found by Pitman [38] in 1975.
Theorem 4.7.4 The process Zt = 2Mt Wt , t 0 is distributed as a 3-
dimensional Bessel process Rt .
Proof: We shall follow the same idea as in the proof of Theorem 4.7.2. Let
u 0 and consider the domains, see Fig. 4.5(b).
DAOC = {2y x u, y 0, x y}
DAOB = {0 x, x y}
DABC = {2y x u, y 0}.
The probability function of Zt can be evaluated as

P (Zt u) = P (2Mt Wt u) = f (x, y) dx dy (4.7.20)
DAOC

= f (x, y) dx dy f (x, y) dx dy
DABC DAOB
= I1 (u) I2 (u).
We note rst that the second integral vanishes, as an integral of an odd func-
tion over a symmetric interval
u x
2 (2yx)2
I2 (u) = (2y x)e 2t dy dx
0 0 2t3/2
u x
1 z2
= ze 2t dz dx = 0.
2t3/2 0 x
The rst integral is computed using the substitutions z = 2y x and r = z 2
u (x+u)/2
2 (2yx)2
I1 (u) = (2y x)e 2t dy dx
u 0 2t3/2
u u u u2
1 z2 1
e 2t dr dx
r
= ze 2t dz dx =
2t3/2 2 2t3/2 u x2
u
u
x
2 2
t
e 2t e 2t dx
x u
=
2t3/2 u
u
1 x2 2 u2
= e 2t dx ue 2t
2t u 2t
u 2
2 u2
e 2t dx ue 2t .
x
=
2t 0
Dierentiate using the Fundamental Theorem of Calculus and the product

rule to obtain
d 2 u2
I1 (u) = u2 e 2t .
du 2 t3/2
Using (4.7.21) yields
d 2 u2
P (Zt u) = u2 e 2t ,
du 2 t3/2
which retrieves the density function (4.7.19) of a 3-dimensional Bessel process,

and hence the theorem is proved.
Corollary 4.7.5 The process Wt 2mt , t 0 is distributed as a 3-dimensional

Bessel process Rt .
Exercise 4.7.6 Prove Corollary 4.7.5.

4.8 Limits of Stochastic Processes

Let (Xt )t0 be a stochastic process. We can make sense of the limit expression
X = lim Xt , in a similar way as we did in section 2.14 for sequences of random
t
variables. We shall rewrite the denitions for the continuous case.
Almost Certain Limit The process Xt converges almost certainly to X, if
for all states of the world , except a set of probability zero, we have
lim Xt () = X().
t
We shall write ac-lim Xt = X. It is also sometimes called strong convergence.

t
Mean Square Limit We say that the process Xt converges to X in the mean
square if
lim E[(Xt X)2 ] = 0.
t
In this case we write ms-lim Xt = X.

t
Limit in Probability The stochastic process Xt converges in probability to

X if

lim P ; |Xt () X()| > = 0.
t
This limit is abbreviated by p-lim Xt = X.

t
It is worth noting that, like in the case of sequences of random variables,
both almost certain convergence and convergence in mean square imply the
convergence in probability, which implies the limit in distribution.
Limit in Distribution We say that Xt converges in distribution to X if for
any continuous bounded function (x) we have
lim E[(Xt )] = E[(X)].

t
It is worth noting that the stochastic convergence implies the convergence in

distribution.
4.9 Mean Square Convergence

The following property is a reformulation of Proposition 2.14.1 in the contin-
uous setup, the main lines of the proof remaining the same.
Proposition 4.9.1 Consider a stochastic process Xt such that E[Xt ] k, a

constant, and V ar(Xt ) 0 as t . Then ms-lim Xt = k.
t
It is worthy to note that the previous statement holds true if the limit to
innity is replaced by a limit to any other number.
Next we shall provide a few applications.
Application 4.9.2 If > 1/2, then
Wt
ms-lim = 0.
t t
Wt E[Wt ]
Proof: Let Xt =
. Then E[Xt ] = = 0, and
t t
1 t 1
V ar[Xt ] = 2 V ar[Wt ] = 2 = 21 ,
t t t
1
for any t > 0. Since 21 0 as t , applying Proposition 4.9.1 yields
t
Wt
ms-lim = 0.
t t
Wt
Corollary 4.9.3 We have ms-lim = 0.
t t
t
Application 4.9.4 Let Zt = 0 Ws ds. If > 3/2, then
Zt
ms-lim = 0.
t t
Zt E[Zt ]
Proof: Let Xt =
. Then E[Xt ] = = 0, and
t t
1 t3 1
V ar[Xt ] = 2 V ar[Zt ] = 2 = 23 ,
t 3t 3t
1
for any t > 0. Since 23 0 as t , applying Proposition 4.9.1 leads
3t
to the desired result.
Application 4.9.5 For any p > 0, c 1 we have

eWt ct
ms-lim = 0.
t tp
eWt ct eWt
Proof: Consider the process Xt = = . Since
tp tp ect
E[eWt ] et/2 1 1
E[Xt ] = = = 1 0, as t
e(c 2 )t t
p
t e ct p
t e ct p
V ar[eWt ] e2t et 1 1 1
V ar[Xt ] = = = 0,
t2p e2ct t2p e2ct t2p e2t(c1) et(2c1)
as t , Proposition 4.9.1 leads to the desired result.
Application 4.9.6 Show that

max Ws
0st
ms-lim = 0.
t t
max Ws
0st
Proof: Let Xt = . Since by Exercise 4.3.10
t
E[ max Ws ] = 0
0st

V ar( max Ws ) = 2 t,
0st
then
E[Xt ] = 0

2 t
V ar[Xt ] = 0, t .
t2
Apply Proposition 4.9.1 to get the desired result.
Remark 4.9.7 One of the strongest results regarding limits of Brownian mo-
tions is called the law of iterated logarithms and was rst proved by Lamperti:
Wt
lim sup = 1,
t 2t ln(ln t)
almost certainly.
Exercise 4.9.8 Find a stochastic process Xt such that the following both con-
ditions are satised:
(i) ms-lim Xt = 0
t
(ii) ms-lim Xt2 = 0.
t
Exercise 4.9.9 Let Xt be a stochastic process. Show that
ms-lim Xt = 0 ms-lim |Xt | = 0.

t t
4.10 The Martingale Convergence Theorem

We state now, without proof, a result which is a powerful way of proving the
almost sure convergence. We start with the discrete version:
Theorem 4.10.1 Let Xn be a martingale with bounded means

M > 0 such that E[|Xn |] M, n 1. (4.10.21)
Then there is a random variable L such that as- lim Xn = L, i.e.
n

P ; lim Xn () = L() = 1.
n
Since E[|Xn |]2 E[Xn2 ], the condition (4.10.21) can be replaced by its stronger
version
M > 0 such that E[Xn2 ] M, n 1.
The following result deals with the continuous version of theMartingale

Convergence Theorem. Denote the innite knowledge by F = t Ft .
Theorem 4.10.2 Let Xt be an Ft -martingale such that

M > 0 such that E[|Xt |] < M, t > 0.
Then there is an F -measurable random variable X such that Xt X
a.s. as t .
The next exercise involves a process that is as-convergent but is not ms-
convergent.
Exercise 4.10.3 It is known that Xt = eWt t/2 is a martingale. Since
E[|Xt |] = E[eWt t/2 ] = et/2 E[eWt ] = et/2 et/2 = 1,
by the Martingale Convergence Theorem there is a random variable L such
that Xt L a.s. as t .
(a) What is the limit L? How did you make your guess?
(b) Show that
2
E[|Xt 1|2 ] = V ar(Xt ) + E(Xt ) 1 .
(c) Show that Xt does not converge in the mean square to 1.

(d) Prove that the sequence Xt is as-convergent but it is not ms-convergent.
Exercise 4.10.4 Let Xt = Wt + 1, where Wt is a Brownian motion and
consider
T = inf{t > 0; Xt 0}.
(a) Is T a stopping time?
(b) Is Yt = XT t a continuous martingale?
(c) Show that E[Yt ] = 1, t > 0.
(d) Verify the limit as-limt Yt = 0.
(e) Is this contradicting the Optional Stopping Theorem?
4.11 Quadratic Variation

In order to gauge the roughness of stochastic processes we shall consider the
sum of k-powers of consecutive increments in the process, as the norm of the
partition decreases to zero.
Denition 4.11.1 Let Xt be a continuous stochastic process on , and k > 0

(k)
be a constant. The kth variation process of Xt , denoted by X, Xt is dened
by the following limit in probability
(k)

n1
X, Xt () = p- lim |Xti+1 () Xti ()|k ,
maxi |ti+1 ti |
i=0
where 0 = t1 < t2 < < tn = t.

(1)
If k = 1, the process X, Xt is called the total variation process of Xt .
(2)
For k = 2, X, Xt = X, Xt is called the quadratic variation process of
Xt .
It can be shown that the quadratic variation exists and is unique (up to in-
distinguishability) for continuous square integrable martingales Xt , i.e. mar-
tingales satisfying X0 = 0 a.s. and E[Xt2 ] < , for all t 0. Furthermore,
the quadratic variation, X, Xt , of a square integrable martingale Xt is an
increasing continuous process satisfying
(i) X, X0 = 0;
(ii) Xt2 X, Xt is a martingale.
Next we introduce a symmetric and bilinear operation.
Denition 4.11.2 The quadratic covariation of two continuous square inte-
grable martingales Xt and Yt is dened as
1
X, Y t = X + Y, X + Y t X Y, X Y t .
4
Exercise 4.11.3 Prove that:
(a) X, Y t = Y, Xt ;
(b) aX + bY, Zt = aX, Zt + bY, Zt .
Exercise 4.11.4 Let Mt and Nt be square integrable martingales. Prove that

the process Mt Nt M, N t is a martingale.
Exercise 4.11.5 Prove that the total variation on the interval [0, t] of a Brow-
nian motion is innite a.s.
We shall encounter in the following a few important examples that will be

useful when dealing with stochastic integrals.
4.11.1 The quadratic variation of Wt

The next result states that the quadratic variation of the Brownian motion
Wt on the interval [0, T ] is T . More precisely, we have:
Proposition 4.11.6 Let T > 0 and consider the equidistant partition 0 =

t0 < t1 < tn1 < tn = T . Then

n1
ms- lim (Wti+1 Wti )2 = T. (4.11.22)
n
i=0
Proof: Consider the random variable

n1
Xn = (Wti+1 Wti )2 .
i=0
Since the increments of a Brownian motion are independent, Proposition 5.2.1

yields

n1
n1
E[Xn ] = E[(Wti+1 Wti )2 ] = (ti+1 ti )
i=0 i=0
= tn t0 = T ;

n1
n1
2
V ar(Xn ) = V ar[(Wti+1 Wti ) ] = 2(ti+1 ti )2
i=0 i=0
T 2 2T 2
= n2 = ,
n n
where we used that the partition is equidistant. Since Xn satises the condi-
tions
E[Xn ] = T, n 1;
V ar[Xn ] 0, n ,
by Proposition 4.9.1 we obtain ms-lim Xn = T , or

n

n1
ms- lim (Wti+1 Wti )2 = T. (4.11.23)
n
i=0
Since the mean square convergence implies convergence in probability, it

follows that

n1
p- lim (Wti+1 Wti )2 = T,
n
i=0
i.e. the quadratic variation of Wt on [0, T ] is T .
Exercise 4.11.7 Prove that the quadratic variation of the Brownian motion
Wt on [a, b] is equal to b a.
The Fundamental Relation dWt2 = dt The relation discussed in this section

can be regarded as the fundamental relation of Stochastic Calculus. We shall
start by recalling relation (4.11.23)

n1
ms- lim (Wti+1 Wti )2 = T. (4.11.24)
n
i=0
The right side can be regarded as a regular Riemann integral

T
T = dt,
0
while the left side can be regarded as a stochastic integral with respect to dWt2
T
n1
2
(dWt ) = ms- lim (Wti+1 Wti )2 .
0 n
i=0

T T
(dWt )2 = dt, T > 0.
0 0
The dierential form of this integral equation is
dWt2 = dt.
In fact, this expression also holds in the mean square sense, as it can be inferred
from the next exercise.

(a) E[dWt2 dt] = 0;
(b) V ar(dWt2 dt) = o(dt);
(c) ms-lim (dWt2 dt) = 0.
dt0
Roughly speaking, the process dWt2 , which is the square of innitesimal in-
crements of a Brownian motion, is deterministic. This relation plays a central
role in Stochastic Calculus and will be useful when dealing with Itos lemma.
The following exercise states that dt dWt = 0, which is another important
stochastic relation useful in Itos lemma.
Exercise 4.11.9 Consider the equidistant partition 0 = t0 < t1 < tn1 <
tn = T . Show that

n1
ms- lim (Wti+1 Wti )(ti+1 ti ) = 0. (4.11.25)
n
i=0
4.12 The Total Variation of Brownian Motion

The total variation of a Brownian motion Wt is dened as

n1
V (Wt ) = sup |Wtk+1 Wtk |,
tk
k=0
for all partitions 0 = t0 < t1 < < tn1 < tn = T . Without losing generality,
we may assume the partition is equidistant, i.e. tk+1 tk = Tn . Equivalently,
V (Wt ) = lim Yn ,
n
where

n1
Yn = |Wtk+1 Wtk |.
k=0
Using Exercise 3.1.15 and the independent increments of the Brownian motion
provides the mean and variance of the random variable Yn

n1
2
n1
2nT
= E[Yn ] = E[|Wtk+1 Wtk |] = (tk+1 tk ) =

k=0 k=0

n1 2
n1
2 = V ar(Yn ) = V ar[|Wtk+1 Wtk |] = 1 (tk+1 tk )

k=0 k=0
2
= 1 T.

Since
{; Yn () < k} {; |Yn () | > k}
Tchebychevs inequality, Theorem 2.13.10, provides
1
P (Yn < k) P (|Yn | > k) < , k 1.
k2
This states that the probability of the left tail is smaller than the probability
of both tails, the second being bounded by k12 . Using the probability of the
complement event, the foregoing relation implies
1
P (Yn k) 1 , k 1.
k2
Substituting for and yields

2nT 2 1
P Yn k 1 T 1 2, k 1.
k

Considering k = n, we get
1
P Yn C n 1 , n 1,
n
where

2T 2
C= 1 T > 0.

Then for any constant M > 0, there is an integer n such that C n M and
hence 1
P (Yn M ) P Yn C n 1 , n n0 .
n
Taking the limit over n yields
lim P (Yn M ) = 1.
n
Hence, the total variation of a Brownian motion is innite, almost surely
P (V (Wt ) = ) = 1.
The rest of this chapter deals with similar properties regarding quadratic
variation of compensated Poisson process and can be skipped at a rst reading.
4.12.1 The quadratic variation of Nt t

The following result deals with the quadratic variation of the compensated
Poisson process Mt = Nt t.
Proposition 4.12.1 Let a < b and consider the partition a = t0 < t1 < <
tn1 < tn = b. Then

n1
ms lim (Mtk+1 Mtk )2 = Nb Na , (4.12.26)
n 0
k=0
where n := sup (tk+1 tk ).

0kn1
Proof: For the sake of simplicity we shall use the following notations
tk = tk+1 tk , Mk = Mtk+1 Mtk , Nk = Ntk+1 Ntk .
The relation we need to prove can also be written as

n1

ms-lim (Mk )2 Nk = 0.
n
k=0
Let
Yk = (Mk )2 Nk = (Mk )2 Mk tk .
It suces to show that
n1

E Yk = 0, (4.12.27)
k=0
n1

lim V ar Yk = 0. (4.12.28)
n
k=0
The rst identity follows from the properties of Poisson processes (see Exercise
3.8.9)
n1

n1
n1
E Yk = E[Yk ] = E[(Mk )2 ] E[Nk ]
k=0 k=0 k=0

n1
= (tk tk ) = 0.
k=0
For the proof of the identity (4.12.28) we need to rst nd the variance of Yk .
V ar[Yk ] = V ar[(Mk )2 (Mk + tk )] = V ar[(Mk )2 Mk ]
= V ar[(Mk )2 ] + V ar[Mk ] 2Cov[Mk2 , Mk ]
= tk + 22 t2k + tk

2 E[(Mk )3 ] E[(Mk )2 ]E[Mk ]
= 22 (tk )2 ,
where we used Exercise 3.8.9 and the fact that E[Mk ] = 0. Since Mt is a
process with independent increments, then Cov[Yk , Yj ] = 0 for i = j. Then
n1

n1
n1
V ar Yk = V ar[Yk ] + 2 Cov[Yk , Yj ] = V ar[Yk ]
k=0 k=0 k=j k=0

n1
n1
2 2 2
= 2 (tk ) 2 n tk = 22 (b a)n ,
k=0 k=0
n1

and hence V ar Yn 0 as n 0. According to the Proposition
k=0
2.14.1, we obtain the desired limit in mean square.
The previous result states that the quadratic variation of the martingale
Mt between a and b is equal to the jump of the Poisson process between a and
b.
The Relation dMt2 = dNt Recall relation (4.12.26)

n1
ms- lim (Mtk+1 Mtk )2 = Nb Na . (4.12.29)
n
k=0
The right side can be regarded as a Riemann-Stieltjes integral
b
Nb Na = dNt ,
a
while the left side can be regarded as a stochastic integral with respect to
(dMt )2
b
n1
(dMt )2 := ms- lim (Mtk+1 Mtk )2 .
a n
k=0
b b
(dMt )2 = dNt ,
a a
for any a < b. The equivalent dierential form is
(dMt )2 = dNt . (4.12.30)
The Relations dt dMt = 0, dWt dMt = 0 In order to show that dt dMt = 0

in the mean square sense, we need to prove the limit

n1
ms- lim (tk+1 tk )(Mtk+1 Mtk ) = 0. (4.12.31)
n
k=0
This can be thought of as a vanishing integral of the increment process dMt

with respect to dt
b
dMt dt = 0, a, b R.
a
Denote

n1
n1
Xn = (tk+1 tk )(Mtk+1 Mtk ) = tk Mk .
k=0 k=0
In order to show (4.12.31) it suces to prove that
1. E[Xn ] = 0;
2. lim V ar[Xn ] = 0.
n
Using the additivity of the expectation and Exercise 3.8.9 (b)
n1

n1
E[Xn ] = E tk Mk = tk E[Mk ] = 0.
k=0 k=0
Since the Poisson process Nt has independent increments, the same property
holds for the compensated Poisson process Mt . Then tk Mk and tj Mj
are independent for k = j, and using the properties of variance we have
n1

n1
n1
2
V ar[Xn ] = V ar tk Mk = (tk ) V ar[Mk ] = (tk )3 ,
k=0 k=0 k=0
where we used
V ar[Mk ] = E[(Mk )2 ] (E[Mk ])2 = tk ,
see Exercise 3.8.9 (b). If we let n = max tk , then

k

n1
n1
V ar[Xn ] = (tk )3 n 2 tk = (b a)n 2 0
k=0 k=0
as n . Hence we proved the stochastic dierential relation
dt dMt = 0. (4.12.32)
For showing the relation dWt dMt = 0, we need to prove
ms- lim Yn = 0, (4.12.33)

n
where we have denoted

n1
n1
Yn = (Wk+1 Wk )(Mtk+1 Mtk ) = Wk Mk .
k=0 k=0
Since the Brownian motion Wt and the process Mt have independent incre-
ments and Wk is independent of Mk , we have

n1
n1
E[Yn ] = E[Wk Mk ] = E[Wk ]E[Mk ] = 0,
k=0 k=0
where we used E[Wk ] = E[Mk ] = 0. Using also E[(Wk )2 ] = tk ,

E[(Mk )2 ] = tk , and invoking the independence of Wk and Mk , we
get
V ar[Wk Mk ] = E[(Wk )2 (Mk )2 ] (E[Wk Mk ])2
= E[(Wk )2 ]E[(Mk )2 ] E[Wk ]2 E[Mk ]2
= (tk )2 .
Then using the independence of the terms in the sum, we get

n1
n1
V ar[Yn ] = V ar[Wk Mk ] = (tk )2
k=0 k=0

n1
n tk = (b a)n 0,
k=0
as n . Since Yn is a random variable with mean zero and variance

decreasing to zero, it follows that Yn 0 in the mean square sense. Hence we
proved that
dWt dMt = 0. (4.12.34)
Exercise 4.12.2 Show the following stochastic dierential relations:
(a) dt dNt = 0; (b) dWt dNt = 0; (c) dt dWt = 0;
2
(d) (dNt ) = dNt ; 2
(e) (dMt ) = dNt ; (f ) (dMt )4 = dNt .
The relations proved in this section are useful when developing stochastic
models for a stock price that exhibits jumps modeled by a Poisson process.
We can represent all these rules in the following multiplication table:
dt dWt dNt dMt

dt 0 0 0 0
dWt 0 dt 0 0
dNt 0 0 dNt dNt
dMt 0 0 dNt dNt
Chapter 5
Stochastic Integration
This chapter deals with one of the most useful stochastic integrals, called the
Ito integral. This type of integral was introduced in 1944 by the Japanese
mathematician Ito [24], [25], and was originally motivated by a construction
of diusion processes. We shall keep the presentation to a maximum sim-
plicity, integrating with respect to a Brownian motion or Poisson process only.
The reader interested in details regarding a larger class of integrators may con-
sult Protter [40] or Kuo [30]. For a more formal introduction into stochastic
integration see Revuz and Yor [41].
Here is a motivation
b for studying an integral of stochastic type. The Rie-
mann integral a F (x) dx represents the work done by the force F between
positions x = a and x = b. The element F (x) dx represents the work done by
the force for the innitesimal displacement dx. Similarly, F (t) dWt represents
the work done by F during an innitesimal b Brownian jump dWt . The cum-
mulative eect is described by the object a F (t) dWt , which will be studied
in this chapter. This represents the work eect of the force F done along
the trajectory of a particle modeled by a Brownian motion during the time
interval [a, b].
5.1 Nonanticipating Processes

Consider the Brownian motion Wt . A process Ft is called a nonanticipating
process if Ft is independent of any future increment Wt Wt , for any t and
t with t < t . Consequently, the process Ft is independent of the behavior of
the Brownian motion in the future, i.e. it cannot anticipate the future. For
instance, Wt , eWt , Wt2 Wt + t are examples of nonanticipating processes,
while Wt+1 or 12 (Wt+1 Wt )2 are not.
Nonanticipating processes are important because the Ito integral concept
can be easily applied to them.
117
If Ft denotes the information known until time t, where this information is

generated by the Brownian motion {Ws ; s t}, then any Ft -adapted process
Ft is nonanticipating.
5.2 Increments of Brownian Motions

In this section we shall discuss a few basic properties of the increments of a
Brownian motion, which will be useful when computing stochastic integrals.
Proposition 5.2.1 Let Wt be a Brownian motion. If s < t, we have

1. E[(Wt Ws )2 ] = t s.
2. V ar[(Wt Ws )2 ] = 2(t s)2 .
Proof: 1. Using that Wt Ws N (0, t s), we have
E[(Wt Ws )2 ] = E[(Wt Ws )2 ] (E[Wt Ws ])2 = V ar(Wt Ws ) = t s.
2. Dividing by the standard deviation yields the standard normal random

Wt Ws (Wt Ws )2
variable N (0, 1). Its square, is 2 -distributed with 1
ts ts
degree of freedom.1 Its mean is 1 and its variance is 2. This implies
(W W )2
= 1 = E[(Wt Ws )2 ] = t s;
t s
E
ts
(W W )2
= 2 = V ar[(Wt Ws )2 ] = 2(t s)2 .
t s
V ar
ts
Remark 5.2.2 The innitesimal version of the previous result is obtained by

replacing t s with dt
1. E[dWt2 ] = dt;
2. V ar[dWt2 ] = 2dt2 = 0.

(a) E[(Wt Ws )4 ] = 3(t s)2 ;
(b) E[(Wt Ws )6 ] = 15(t s)3 .
1
A 2 -distributed random variable with n degrees of freedom has mean n and variance
2n.
Stochastic Integration 119
5.3 The Ito Integral

The Ito integral will be dened in a way that is similar to the Riemann in-
tegral. The Ito integral is taken with respect to innitesimal increments of
a Brownian motion, dWt , which are random variables, while the Riemann
integral considers integration with respect to the deterministic innitesimal
changes dt. It is worth noting that the Ito integral is a random variable, while
the Riemann integral is just a real number. Despite this fact, we shall see that
there are several common properties and relations between these two types of
integrals.
Consider 0 a < b and let Ft = f (Wt , t) be a nonanticipating process
satisfying the non-explosive condition
b
E Ft2 dt < . (5.3.1)
a
The role of the previous condition will be made more clear when we discuss
the martingale property of the Ito integral, see Proposition 5.5.7. Divide the
interval [a, b] into n subintervals using the partition points
a = t0 < t1 < < tn1 < tn = b,
and consider the partial sums

n1
Sn = Fti (Wti+1 Wti ).
i=0
We emphasize that the intermediate points are the left endpoints of each
interval, and this is the way they should always be chosen. Since the process
Ft is nonanticipative, the random variables Fti and Wti+1 Wti are always
independent; this is an important feature in the denition of the Ito integral.
The Ito integral is the limit of the partial sums Sn
b
ms-lim Sn = Ft dWt ,
n a
provided the limit exists. It can be shown that the choice of partition does
not inuence the value of the Ito integral. This is the reason why, for practical
purposes, it suces to assume the intervals equidistant, i.e.
(b a)
ti+1 ti = , i = 0, 1, , n 1.
n
The previous convergence is taken in the mean square sense, i.e.
b 2
lim E Sn Ft dWt = 0.
n a
Existence of the Ito integral

b
It is known that the Ito stochastic integral Ft dWt exists if the process
a
Ft = f (Wt , t) satises the following properties:
1. The paths t Ft () are continuous on [a, b] for any state of the world
;
2. The process Ft is nonanticipating for t [a, b];
b
3. E Ft2 dt < .
a
For instance, the following stochastic integrals make sense:

T T b
cos(Wt )
Wt2 dWt , sin(Wt ) dWt , dWt .
0 0 a t
5.4 Examples of Ito Integrals

As in the case of the Riemann integral, using the denition is not an ecient
way of computing integrals. The same philosophy applies to Ito integrals. We
shall compute in the following two simple Ito integrals. In later sections we
shall introduce more ecient methods for computing these types of stochastic
integrals.
5.4.1 The case Ft = c, constant

In this case the partial sums can be computed explicitly

n1
n1
Sn = Fti (Wti+1 Wti ) = c(Wti+1 Wti )
i=0 i=0
= c(Wb Wa ),
and since the answer does not depend on n, we have

b
c dWt = c(Wb Wa ).
a
In particular, taking c = 1, a = 0, and b = T , since the Brownian motion

starts at 0, we have the following formula:
T
dWt = WT .
0
5.4.2 The case Ft = Wt

We shall integrate the process Wt between 0 and T . Considering an equidistant
kT
partition, we take tk = , k = 0, 1, , n 1. The partial sums are given by
n

n1
Sn = Wti (Wti+1 Wti ).
i=0
Since
1
xy = [(x + y)2 x2 y 2 ],
2
letting x = Wti and y = Wti+1 Wti yields
1 2 1 1
Wti (Wti+1 Wti ) = W W 2 (Wti+1 Wti )2 .
2 ti+1 2 ti 2
Then after pair cancelations the sum becomes
1 2 1 2 1
n1 n1 n1
Sn = Wti+1 Wti (Wti+1 Wti )2
2 2 2
i=0 i=0 i=0
1 2 1
n1
= W (Wti+1 Wti )2 .
2 tn 2
i=0
Using tn = T , we get
1
n1
1
Sn = WT2 (Wti+1 Wti )2 .
2 2
i=0
Since the rst term on the right side is independent of n, using Proposition
4.11.6, we have
1
n1
1 2
ms- lim Sn = WT ms- lim (Wti+1 Wti )2 (5.4.2)
n 2 n 2
i=0
1 2 1
= W T. (5.4.3)
2 T 2
We have now obtained the following explicit formula of a stochastic inte-
gral:
T
1 1
Wt dWt = WT2 T.
0 2 2
In a similar way one can obtain
b
1 1
Wt dWt = (W 2 Wa2 ) (b a).
a 2 b 2
It is worth noting that the right side contains random variables depending on
the limits of integration a and b.
Exercise 5.4.1 Show the following identities:

T
(a) E[ 0 dWt ] = 0;
T
(b) E[ 0 Wt dWt ] = 0;
T
T2
(c) V ar[ Wt dWt ] = .
0 2
5.5 Properties of the Ito Integral

We shall start with some properties which are similar to those of the Rieman-
nian integral.
Proposition 5.5.1 Let f (Wt , t), g(Wt , t) be nonanticipating processes and

c R. Then we have
1. Additivity:
T T T
[f (Wt , t) + g(Wt , t)] dWt = f (Wt , t) dWt + g(Wt , t) dWt .
0 0 0
2. Homogeneity:
T T
cf (Wt , t) dWt = c f (Wt , t) dWt .
0 0
3. Partition property:
T u T
f (Wt , t) dWt = f (Wt , t) dWt + f (Wt , t) dWt , 0 < u < T.
0 0 u
Proof: 1. Consider the partial sum sequences

n1
Xn = f (Wti , ti )(Wti+1 Wti )
i=0

n1
Yn = g(Wti , ti )(Wti+1 Wti ).
i=0
T T
Since ms-lim Xn = f (Wt , t) dWt and ms-lim Yn = g(Wt , t) dWt , using
n 0 n 0
Proposition 2.15.2 yields
T
f (Wt , t) + g(Wt , t) dWt
0

n1
= ms-lim f (Wti , ti ) + g(Wti , ti ) (Wti+1 Wti )
n
i=0
n1

n1
= ms-lim f (Wti , ti )(Wti+1 Wti ) + g(Wti , ti )(Wti+1 Wti )
n
i=0 i=0
= ms-lim (Xn + Yn ) = ms-lim Xn + ms-lim Yn
n n n
T T
= f (Wt , t) dWt + g(Wt , t) dWt .
0 0
The proofs of parts 2 and 3 are left as an exercise for the reader.
Some other properties, such as monotonicity, do not hold in general. It

Tto have a non-negative random variable Ft for which the random
is possible
variable 0 Ft dWt has negative values. More precisely, let Ft = 1. Then
T
Ft > 0 but 0 1 dWt = WT is not always positive. The probability to be
negative is P (WT < 0) = 1/2.
Some of the random variable properties of the Ito integral are given by the
following result:
Proposition 5.5.2 We have
1. Zero mean:
b
E f (Wt , t) dWt = 0.
a
2. Isometry:
b 2 b
E f (Wt , t) dWt =E f (Wt , t)2 dt .
a a
3. Covariance:
b b b
E f (Wt , t) dWt g(Wt , t) dWt = E f (Wt , t)g(Wt , t) dt .
a a a
We shall discuss the previous properties giving rough reasons of proof. The
detailed proofs are beyond the goal of this book.
b
1. The Ito integral I = a f (Wt , t) dWt is the mean square limit of the partial

sums Sn = n1 i=0 fti (Wti+1 Wti ), where we denoted fti = f (Wti , ti ). Since
f (Wt , t) is a nonanticipative process, then fti is independent of the increments

Wti+1 Wti , and hence we have
n1
n1

E[Sn ] = E fti (Wti+1 Wti ) = E[fti (Wti+1 Wti )]
i=0 i=0

n1
= E[fti ]E[(Wti+1 Wti )] = 0,
i=0
because the increments have mean zero. Applying the Squeeze Theorem in
the double inequality
2
0 E[Sn I] E[(Sn I)2 ] 0, n
yields E[Sn ] E[I] 0. Since E[Sn ] = 0 it follows that E[I] = 0, i.e. the Ito
integral has zero mean.
2. Since the square of the sum of partial sums can be written as
n1
2
Sn2 = fti (Wti+1 Wti )
i=0

n1
= ft2i (Wti+1 Wti )2 + 2 fti (Wti+1 Wti )ftj (Wtj+1 Wtj ),
i=0 i=j
using the independence yields

n1
E[Sn2 ] = E[ft2i ]E[(Wti+1 Wti )2 ]
i=0

+2 E[fti ]E[(Wti+1 Wti )]E[ftj ]E[(Wtj+1 Wtj )]
i=j

n1
= E[ft2i ](ti+1 ti ),
i=0
b b
which are the Riemann sums of the integral E[ft2 ] dt =E ft2 dt , where
a a
the last identity follows from Fubinis theorem. Hence E[Sn2 ] converges to the
aforementioned integral.
3. Consider the partial sums

n1
n1
Sn = fti (Wti+1 Wti ), Vn = gtj (Wtj+1 Wtj ).
i=0 j=0
Their product is
n1
n1

Sn Vn = fti (Wti+1 Wti ) gtj (Wtj+1 Wtj )
i=0 j=0

n1
n1
= fti gti (Wti+1 Wti )2 + fti gtj (Wti+1 Wti )(Wtj+1 Wtj ).
i=0 i=j
Using that ft and gt are nonanticipative and that
E[(Wti+1 Wti )(Wtj+1 Wtj )] = E[Wti+1 Wti ]E[Wtj+1 Wtj ] = 0, i = j

E[(Wti+1 Wti )2 ] = ti+1 ti ,
it follows that

n1
E[Sn Vn ] = E[fti gti ]E[(Wti+1 Wti )2 ]
i=0

n1
= E[fti gti ](ti+1 ti ),
i=0
b
which is the Riemann sum for the integral E[ft gt ] dt. a
b
From 1 and 2 it follows that the random variable a f (Wt , t) dWt has mean
zero and variance
b b
V ar f (Wt , t) dWt = E f (Wt , t)2 dt .
a a
From 1 and 3 it follows that

b b b
Cov f (Wt , t) dWt , g(Wt , t) dWt = E[f (Wt , t)g(Wt , t)] dt.
a a a
Corollary 5.5.3 (Cauchys integral inequality) Let ft = f (Wt , t) and gt =

g(Wt , t). Then
b 2 b b
2
E[ft gt ] dt E[ft ] dt E[gt2 ] dt .
a a a
Proof: It follows from the previous theorem and from the correlation formula
|Cov(X, Y )|
|Corr(X, Y )| = 1.
[V ar(X)V ar(Y )]1/2
Let Ft be the information set at time t. This implies that fti and Wti+1
Wti are known at time t, for any ti+1 t. It follows that the partial sum

n1
Sn = fti (Wti+1 Wti ) is Ft -measurable. The following result, whose proof
i=0
is omitted for technical reasons, states that this is also valid after taking the
limit in the mean square:
t
Proposition 5.5.4 The Ito integral 0 fs dWs is Ft -measurable.
The following two results state that if the upper limit of an Ito integral is
replaced by the parameter t we obtain a continuous martingale.
Proposition 5.5.5 For any s < t we have

t s
E f (Wu , u) dWu |Fs = f (Wu , u) dWu .
0 0
Proof: Using part 3 of Proposition 5.5.2 we get

t
E f (Wu , u) dWu |Fs
0
s t
= E f (Wu , u) dWu + f (Wu , u) dWu |Fs
0 s
s t
= E f (Wu , u) dWu |Fs + E f (Wu , u) dWu |Fs . (5.5.4)
0 s
s
Since 0 f (Wu , u) dWu is Fs -measurable (see Proposition 5.5.4), by part 2 of
Proposition 2.12.6
s s
E f (Wu , u) dWu |Fs = f (Wu , u) dWu .
0 0
t
Since s f (Wu , u) dWu contains only information between s and t, it is in-
dependent of the information set Fs , so we can drop the condition in the
expectation; using that Ito integrals have zero mean we obtain
t t
E f (Wu , u) dWu |Fs = E f (Wu , u) dWu = 0.
s s
Substituting into (5.5.4) yields the desired result.
t
Proposition 5.5.6 Consider the process Xt = 0 f (Ws , s) dWs . Then Xt is
continuous, i.e. for almost any state of the world , the path t Xt ()
is continuous.
Proof: A rigorous proof is beyond the purpose of this book. We shall provide
just a rough sketch. Assume the process f (Wt , t) satises E[f (Wt , t)2 ] < M ,
for some M > 0. Let t0 be xed and consider h > 0. Consider the increment
Yh = Xt0 +h Xt0 . Using the aforementioned properties of the Ito integral we
have
t0 +h
E[Yh ] = E[Xt0 +h Xt0 ] = E f (Wt , t) dWt = 0
t0
t0 +h 2 t0 +h
E[Yh2 ] = E f (Wt , t) dWt = E[f (Wt , t)2 ] dt
t0 t0
t0 +h
< M dt = M h.
t0
The process Yh has zero mean for any h > 0 and its variance tends to 0 as
h 0. Using a convergence theorem yields that Yh tends to 0 in mean square,
as h 0. This is equivalent to the continuity of Xt at t0 .
t
Proposition 5.5.7 Let Xt = 0 f (Ws , s) dWs , with E 0 f 2 (Ws , s) ds <
. Then Xt is a continuous Ft -martingale.
Proof: We shall check in the following the properties of a martingale.

Integrability: Using properties of Ito integrals

2
t 2 t 2
E[Xt ] = E f (Ws , s) dWs =E f (Ws , s) ds
0 0
2
<E f (Ws , s) ds < ,
0
and then from the inequality E[|Xt |]2 E[Xt2 ] we obtain E[|Xt |] < , for all
t 0.
Measurability: Xt is Ft -measurable from Proposition 5.5.4.
Forecast: E[Xt |Fs ] = Xs for s < t by Proposition 5.5.5.
Continuity: See Proposition 5.5.6.
5.6 The Wiener Integral

The Wiener integral is a particular case of the Ito stochastic integral. It
is obtained by replacing the nonanticipating stochastic process f (Wt , t) by
b
the deterministic function f (t). The Wiener integral a f (t) dWt is the mean
square limit of the partial sums

n1
Sn = f (ti )(Wti+1 Wti ).
i=0
All properties of Ito integrals also hold for Wiener integrals. The Wiener
integral is a random variable with zero mean
b
E f (t) dWt = 0
a
and variance
b 2 b
E f (t) dWt = f (t)2 dt.
a a
However, in the case of Wiener integrals we can say something more about
their distribution.
b
Proposition 5.6.1 The Wiener integral I(f ) = a f (t) dWt is a normal ran-
dom variable with mean 0 and variance
b
V ar[I(f )] = f (t)2 dt := f 2L2 .
a
Proof: Since increments Wti+1 Wti are normally distributed with mean 0
and variance ti+1 ti , then
f (ti )(Wti+1 Wti ) N (0, f (ti )2 (ti+1 ti )).
Since these random variables are independent, by Theorem 3.3.1, their sum is
also normally distributed, with

n1 n1

Sn = f (ti )(Wti+1 Wti ) N 0, f (ti )2 (ti+1 ti ) .
i=0 i=0
Taking n and max ti+1 ti 0, the normal distribution tends to

i
b
N 0, f (t)2 dt .
a
The previous convergence holds in distribution, and it still needs to be shown

in the mean square. However, we shall omit this essential proof detail.
t
Exercise 5.6.2 Let Zt = 0 Ws ds.
(a) Use integration by parts to show that
t
Zt = (t s) dWs .
0
(b) Use the properties of Wiener integrals to show that

t3
V ar(Zt ) = .
3
T 1
Exercise 5.6.3 Show that the random variable X = dWt is normally
1 t
distributed with mean 0 and variance ln T .
T
Exercise 5.6.4 Let Y = 1 t dWt . Show that Y is normally distributed
with mean 0 and variance (T 2 1)/2.
t
Exercise 5.6.5 Find the distribution of the integral 0 ets dWs .
t t
Exercise 5.6.6 Show that Xt = 0 (2t u) dWu and Yt = 0 (3t 4u) dWu are
Gaussian processes with mean 0 and variance 73 t3 .
t
1
Exercise 5.6.7 Show that ms- lim u dWu = 0.
t0 t 0
t bu

Exercise 5.6.8 Find all constants a, b such that Xt = 0 a + t dWu is
normally distributed with variance t.
Exercise 5.6.9 Let n be a positive integer. Prove that

t tn+1
Cov Wt , un dWu = .
0 n+1
Formulate and prove a more general result.
5.7 Poisson Integration

In this section we deal with the integration with respect to the compensated
Poisson process Mt = Nt t, which is a martingale. Consider 0 a < b and
let Ft = F (t, Mt ) be a nonanticipating process with
b
E Ft2 dt < .
a
Consider the partition
a = t0 < t1 < < tn1 < tn = b
of the interval [a, b], and associate the partial sums

n1
Sn = Fti (Mti+1 Mti ),
i=0
where Fti is the left-hand limit at ti . We note that the intermediate points
are the left-handed limit to the endpoints of each interval. Since the process Ft
is nonanticipative, the random variables Fti and Mti+1 Mti are independent.
The integral of Ft with respect to Mt is the mean square limit of the

partial sum Sn
T
ms-lim Sn = Ft dMt ,
n 0
provided the limit exists. More precisely, this convergence means that
b 2
lim E Sn Ft dMt = 0.
n a
b
Exercise 5.7.1 Let c be a constant. Show that c dMt = c(Mb Ma ).
a
5.8 The case Ft = Mt

We shall integrate the process Mt between 0 and T with respect to Mt .
kT
Consider the equidistant partition points tk = , k = 0, 1, , n 1. Then
n
the partial sums are given by

n1
Sn = Mti (Mti+1 Mti ).
i=0
1
Using xy = [(x + y)2 x2 y 2 ], by letting x = Mti and y = Mti+1 Mti ,
2
we get
1 1 1
Mti (Mti+1 Mti ) = (Mti+1 Mti + Mti )2 Mt2i (Mti+1 Mti )2 .
2 2 2
Let J be the set of jump instances between 0 and T . Using that Mti = Mti
for ti
/ J, and Mti = 1 + Mti for ti J yields

Mti+1 , if ti
/J
Mti+1 Mti + Mti =
Mti+1 1, if ti J.
Splitting the sum, canceling in pairs, and applying the dierence of squares
formula we have
1 1 2 1
n1 n1 n1
Sn = (Mti+1 Mti + Mti )2 Mti (Mti+1 Mti )2
2 2 2
i=0 i=0 i=0
1 1 1 1 2
= (Mti+1 1)2 + Mt2i+1 Mt2i Mti
2 2 2 2
ti J ti J
/ ti J
/ ti J
1
n1
(Mti+1 Mti )2
2
i=0
1 1 1
n1
= (Mti+1 1)2 Mt2i + Mt2n (Mti+1 Mti )2
2 2 2
ti J i=0
1
= (Mti+1 1 Mti )(Mti+1 1 + Mti )
2
t J
!" #
i
=0
1 1
n1
+ Mt2n (Mti+1 Mti )2
2 2
i=0
1 2 1
n1
= Mtn (Mti+1 Mti )2 .
2 2
i=0
Hence we have arrived at the following formula

T
1 2 1
Mt dMt = M NT .
0 2 T 2
Similarly, one can obtain

b
1 1
Mt dMt = (Mb2 Ma2 ) (Nb Na ).
a 2 2
b
Exercise 5.8.1 (a) Show that E Mt dMt = 0.
a
b
(b) Find V ar Mt dMt .
a
Remark 5.8.2 (a) Let be a xed state of the world and assume the sample
path t Nt () has a jump in the interval (a, b). Even if beyond the scope of
this book, it can be shown that the integral
b
Nt () dNt
a
does not exist in the Riemann-Stieltjes sense.

(b) Let Nt denote the left-hand limit of Nt . It can be shown that Nt is

measurable, while Nt is not.
The previous remarks provide the reason why in the following we shall
b
work with Mt instead of Mt : the integral Mt dNt might not exist, while
b a
Mt dNt does exist.

a

T t
1
Nt dMt = (Nt2 Nt ) Nt dt.
0 2 0
Exercise 5.8.4 Find the variance of

T
Nt dMt .
0
The following integrals with respect to a Poisson process Nt are considered

in the Riemann-Stieltjes sense. The following result can be found in Bertoin
[6].
Proposition 5.8.5 For any continuous function f we have

t t
(a) E f (s) dNs = f (s) ds;
0 0
t 2 t t 2
2 2
(b) E f (s) dNs = f (s) ds + f (s) ds ;
0 0 0
t t f (s)
(c) E e 0 f (s) dNs = e 0 (e 1) ds .
Proof: (a) Consider the equidistant partition 0 = s0 < s1 < < sn = t,

with sk+1 sk = s. Then
t n1

E f (s) dNs = lim E f (si )(Nsi+1 Nsi )
0 n
i=0

n1
= lim f (si )E Nsi+1 Nsi
n
i=0

n1 t
= lim f (si )(si+1 si ) = f (s) ds.
n 0
i=0
(b) Using that Nt is stationary and has independent increments, we have

respectively
E[(Nsi+1 Nsi )2 ] = E[Ns2i+1 si ] = (si+1 si ) + 2 (si+1 si )2
= s + 2 (s)2 ,
E[(Nsi+1 Nsi )(Nsj+1 Nsj )] = E[(Nsi+1 Nsi )]E[(Nsj+1 Nsj )]
= (si+1 si )(sj+1 sj ) = 2 (s)2 .
Applying the expectation to the formula
n1
2
n1
f (si )(Nsi+1 Nsi ) = f (si )2 (Nsi+1 Nsi )2
i=0 i=0

+2 f (si )f (sj )(Nsi+1 Nsi )(Nsj+1 Nsj )
i=j
yields
n1
2
E f (si )(Nsi+1 Nsi )
i=0

n1
= f (si )2 (s + 2 (s)2 ) + 2 f (si )f (sj )2 (s)2
i=0 i=j

n1 n1

2 2
= f (si ) s + f (si )2 (s)2 + 2 f (si )f (sj )(s)2
i=0 i=0 i=j

n1 n1
2
= f (si )2 s + 2 f (si ) s
i=0 i=0
t t 2
f (s)2 ds + 2 f (s) ds , as n .
0 0
(c) Using that Nt is stationary with independent increments and has the
k 1)t
moment generating function E[ekNt ] = e(e , we have
t
E e 0 f (s) dNs
n1 n1
$
f (si )(Nsi+1 Nsi )
= lim E e i=0 = lim E ef (si )(Nsi+1 Nsi )
n n
i=0
$
n1 $
n1
= lim E ef (si )(Nsi+1 Nsi ) = lim E ef (si )(Nsi+1 si )
n n
i=0 i=0
$
n1
f (si ) 1)(s
n1
i+1 si ) (ef (si ) 1)(si+1 si )
= lim e(e = lim e i=0
n n
i=0
t f (s) 1) ds
= e 0 (e .
t
Since f is continuous, the Poisson integral f (s) dNs can be computed
0
in terms of the waiting times Sk
t
Nt
f (s) dNs = f (Sk ).
0 k=1
This formula can be used to give a proof for the previous result. For instance,
taking the expectation and using conditions over Nt = n, yields
t
Nt
n
E f (s) dNs = E f (Sk ) = E f (Sk )|Nt = n P (Nt = n)
0 k=1 n0 k=1
n t
(t)n
= f (x) dx et
t 0 n!
n0

1 (t)n
t
= et f (x) dx
0 t (n 1)!
n0
t t
= et f (x) dx et = f (x) dx.
0 0
Exercise 5.8.6 Solve parts (b) and (c) of Proposition 5.8.5 using a similar
idea with the one presented above.
t 2 t
E f (s) dMs = f (s)2 ds,
0 0
where Mt = Nt t is the compensated Poisson process.
Exercise 5.8.8 Prove that

t t
V ar f (s) dNs = f (s)2 dNs .
0 0
Exercise 5.8.9 Find t
E e 0 f (s) dMs .
Proposition 5.8.10 Let Ft = (Ns ; 0 s t). Then for any constant c, the
process
Mt = ecNt +(1e )t ,
c
t0
is an Ft -martingale.
Proof: Let s < t. Since Nt Ns is independent of Fs and Nt is stationary,
we have
E[ec(Nt Ns ) |Fs ] = E[ec(Nt Ns ) ] = E[ecNts ]
c 1)(ts)
= e(e .
On the other side, taking out the deterministic part yields
E[ec(Nt Ns ) |Fs ] = ecNs E[ecNt |Fs ].
Equating the last two relations we arrive at
E[ecNt +(1e )t |Fs ] = ecNs +(1e
c c )s
,
which is equivalent to the martingale condition E[Mt |Fs ] = Ms .
We shall present an application of the previous result. Consider the waiting
time until the nth jump, Sn = inf{t > 0; Nt = n}, which is a stopping time,
and the ltration Ft = (Ns ; 0 s t). Since
Mt = ecNt +(1e
c )t
is an Ft -martingale, by the Optional Stopping Theorem (Theorem 4.2.1) we

have E[MSn ] = E[M0 ] = 1, which is equivalent to E[e(1e )Sn ] = ecn . Sub-
c
stituting s = (1 ec ), then c = ln(1 + s ). Since s, > 0, then c > 0. The

previous expression becomes
x
E[esSn ] = en ln(1+ ) =
s
.
+s
Since the expectation on the left side is the Laplace transform of the proba-
bility density of Sn , then
x
p(Sn ) = L1 {E[esSn ]} = L1
+s
et tn1 n
= ,
(n)
which shows that Sn has a gamma distribution.
T
5.9 The Distribution Function of XT = 0 g(t) dNt
In this section we consider the function g(t) continuous. Let S1 < S2 < <
SNt denote the waiting times until time t. Since the increments dNt are equal
to 1 at Sk and 0 otherwise, the integral can be written as
T
XT = g(t) dNt = g(S1 ) + + g(SNt ).
0
T
The distribution function of the random variable XT = 0 g(t) dNt can be
obtained conditioning over the Nt

P (XT u) = P (XT u|NT = k) P (NT = k)
k0

= P (g(S1 ) + + g(SNt ) u|NT = k) P (NT = k)
k0

= P (g(S1 ) + + g(Sk ) u) P (NT = k). (5.9.5)
k0
Considering S1 , S2 , , Sk independent and uniformly distributed over the

interval [0, T ], we have

1 vol(Dk )
P g(S1 ) + + g(Sk ) u = k
dx1 dxk = ,
Dk T Tk
where
Dk = {g(x1 ) + g(x2 ) + + g(xk ) u} {0 x1 , , xk T }.
Substituting back in (5.9.5) yields

P (XT u) = P (g(S1 ) + + g(Sk ) u) P (NT = k)
k0
vol(Dk ) k T k k vol(Dk )
T T
= e = e . (5.9.6)
Tk k! k!
k0 k0
In general, the volume of the k-dimensional solid Dk is not easy to obtain.

However, there are simple cases when this can be computed explicitly.
A Particular Case T We shall do an explicit computation of the partition
function of XT = 0 s2 dNs . In this case the solid Dk is the intersection

between the k-dimensional ball of radius u centered at the origin and the
k-dimensional cube [0, T ]k . There are three possible shapes for Dk , which

depend on the size of u:

(a) if 0 u < T , then Dk is a 21k -part of a k-dimensional sphere;

(b) if T u < T k, then Dk has a complicated shape;

(c) if T k u, then Dk is the entire k-dimensional cube, and then vol(Dk ) =
T k.
k/2 Rk
Since the volume of the k-dimensional ball of radius R is given by ,
( k2 + 1)
then the volume of Dk in case (a) becomes
k/2 uk/2
vol(Dk ) = .
2k ( k2 + 1)

(2 u)k/2
P (XT u) = eT , 0 u < T.
k0
k!( k2 + 1)

It is worth noting that for u , the inequality T k u is satised
for all k 0; hence relation (5.9.6) yields
k T k
lim P (XT u) = eT = ekT ekT = 1.
u k!
k0
The computation in case (b) is more complicated and will be omitted.

T
Exercise 5.9.1 Calculate the expectation E 0 eks dNs and the variance

T
V ar 0 eks dNs .
T
Exercise 5.9.2 Compute the distribution function of Xt = 0 s dNs .
Exercise 5.9.3 The following stochastic dierential equation has been used
in [2] to model the depreciation value of a car with stochastic repair payments
dVt = kVt dt dNt ,
where k > 0 is the depreciation rate, > 0 is the average repair payment, and
Nt is a Poisson process with rate .
(a) Show that the solution is given by
t
Vt = V0 ekt ekt eks dNs ;
0
(b) Consider the stopping time
= inf{t > 0; Vt K}.
Show that
kV0 +
E[ek ] = .
+ kK
Chapter 6
Stochastic Dierentiation
Most stochastic processes are not dierentiable. For instance, the Brownian
motion process Wt is a continuous process which is nowhere dierentiable.
Hence, derivatives like dW t
dt do not make sense in stochastic calculus. The only
quantities allowed to be used are the innitesimal changes of the process, in
our case, dWt .
The innitesimal change of a process The change in the process Xt be-
tween instances t and t + t is given by Xt = Xt+t Xt . When t is
innitesimally small, we obtain the innitesimal change of a process Xt
dXt = Xt+dt Xt .
Sometimes it is useful to use the equivalent formula Xt+dt = Xt + dXt .
6.1 Basic Rules

The following rules are the analog of some familiar dierentiation rules from
the elementary Calculus.
The constant multiple rule If Xt is a stochastic process and c is a constant,
then
d(c Xt ) = c dXt .
The verication follows from a straightforward application of the innitesimal
change formula
d(c Xt ) = c Xt+dt c Xt = c(Xt+dt Xt ) = c dXt .
The sum rule If Xt and Yt are two stochastic processes, then
d(Xt + Yt ) = dXt + dYt .
139
The verication is as in the following:
d(Xt + Yt ) = (Xt+dt + Yt+dt ) (Xt + Yt )

= (Xt+dt Xt ) + (Yt+dt Yt )
= dXt + dYt .
The dierence rule If Xt and Yt are two stochastic processes, then
d(Xt Yt ) = dXt dYt .
The proof is similar to the one for the sum rule.

The product rule If Xt and Yt are two stochastic processes, then
d(Xt Yt ) = Xt dYt + Yt dXt + dXt dYt .
The proof is as follows:
d(Xt Yt ) = Xt+dt Yt+dt Xt Yt

= Xt (Yt+dt Yt ) + Yt (Xt+dt Xt ) + (Xt+dt Xt )(Yt+dt Yt )
= Xt dYt + Yt dXt + dXt dYt ,
where the second identity is veried by direct computation.

If the process Xt is replaced by the deterministic function f (t), then the
aforementioned formula becomes
d(f (t)Yt ) = f (t) dYt + Yt df (t) + df (t) dYt .
Since in most practical cases the process Yt satises the equation
dYt = a(t, Wt )dt + b(t, Wt )dWt , (6.1.1)
using relations dt dWt = dt2 = 0, the last term vanishes
df (t) dYt = f (t)dtdYt = 0,
and hence
d(f (t)Yt ) = f (t) dYt + Yt df (t).
This relation looks like the usual product rule.
The quotient rule If Xt and Yt are two stochastic processes, then
X Y dX X dY dX dY Xt
+ 3 (dYt )2 .
t t t t t t t
d = 2
Yt Yt Yt
Stochastic Dierentiation 141
The proof follows from Itos formula and will be addressed in section 6.2.3.
When the process Yt is replaced by the deterministic function f (t), and Xt
is a process satisfying an equation of type (6.1.1), then the previous formula
becomes
X f (t)dX X df (t)
t t t
d = 2
.
f (t) f (t)
Example 6.1.1 We shall show that
d(Wt2 ) = 2Wt dWt + dt.
Applying the product rule and the fundamental relation (dWt )2 = dt, yields
d(Wt2 ) = Wt dWt + Wt dWt + dWt dWt = 2Wt dWt + dt.
Example 6.1.2 Show that
d(Wt3 ) = 3Wt2 dWt + 3Wt dt.
Applying the product rule and the previous exercise yields
d(Wt3 ) = d(Wt Wt2 ) = Wt d(Wt2 ) + Wt2 dWt + d(Wt2 ) dWt

= Wt (2Wt dWt + dt) + Wt2 dWt + dWt (2Wt dWt + dt)
= 2Wt2 dWt + Wt dt + Wt2 dWt + 2Wt (dWt )2 + dt dWt
= 3Wt2 dWt + 3Wt dt,
where we used (dWt )2 = dt and dt dWt = 0.
Example 6.1.3 Show that d(tWt ) = Wt dt + t dWt .
Using the product rule and dt dWt = 0, we get
d(tWt ) = Wt dt + t dWt + dt dWt

= Wt dt + t dWt .
t
Example 6.1.4 Let Zt = 0 Wu du be the integrated Brownian motion. Show
that
dZt = Wt dt.
The innitesimal change of Zt is

t+dt
dZt = Zt+dt Zt = Ws ds = Wt dt,
t
since Ws is a continuous function in s.

t
Example 6.1.5 Let At = 1t Zt = 1t 0 Wu du be the average of the Brownian
motion on the time interval [0, t]. Show that
1 1
dAt = Wt Zt dt.
t t
We have
1 1 1
dAt = d Zt + dZt + d dZt
t t t
1 1 1 2
= 2
Zt dt + Wt dt + 2 Wt dt
!"#
t t t
=0
1 1
= Wt Zt dt.
t t
t
Exercise 6.1.6 Let Gt = 1t 0 eWu du be the average of the geometric Brown-
ian motion on [0, t]. Find dGt .
6.2 Itos Formula

Itos formula is the analog of the chain rule from elementary Calculus. We
shall start by reviewing a few concepts regarding function approximations.
Let f be a twice continuously dierentiable function of a real variable
x. Let x0 be xed and consider the changes x = x x0 and f (x) =
f (x)f (x0 ). It is known from Calculus that the following second order Taylor
approximation holds
1
f (x) = f (x)x + f (x)(x)2 + O(x)3 .
2
When x is innitesimally close to x0 , we replace x by the dierential dx and
obtain
1
df (x) = f (x)dx + f (x)(dx)2 + O(dx)3 . (6.2.2)
2
In elementary Calculus, all terms involving terms of equal or higher order to
dx2 are neglected; then the aforementioned formula becomes
df (x) = f (x)dx.
Now, if we consider x = x(t) to be a dierentiable function of t, substituting

into the previous formula we obtain the dierential form of the well known
chain rule
df x(t) = f x(t) dx(t) = f x(t) x (t) dt.
We shall present a similar formula for the stochastic environment. In this

case the deterministic function x(t) is replaced by a stochastic process Xt .
The composition between the dierentiable function f and the process Xt is
a process denoted by Ft = f (Xt ).
Neglecting the increment powers higher than or equal to (dXt )3 , the ex-
pression (6.2.2) becomes
1 2
dFt = f Xt dXt + f Xt dXt . (6.2.3)
2
In the computation of dXt we may take into the account stochastic relations
such as dWt2 = dt, or dt dWt = 0.
Theorem 6.2.1 (Itos formula) Let Xt be a stochastic process satisfying
dXt = bt dt + t dWt ,
with bt () and t () measurable processes. Let Ft = f (Xt ), with f twice

continuously dierentiable. Then
2
dFt = bt f (Xt ) + t f (Xt ) dt + t f (Xt ) dWt . (6.2.4)
2
Proof: We shall provide an informal proof. Using relations dWt2 = dt and

dt2 = dWt dt = 0, we have
2
(dXt )2 = bt dt + t dWt
= b2t dt2 + 2bt t dWt dt + t2 dWt2
= t2 dt.

1 2
dFt = f Xt dXt + f Xt dXt
2
1
= f Xt bt dt + t dWt + f Xt t2 dt
2
2
= bt f (Xt ) + t f (Xt ) dt + t f (Xt ) dWt .
2
Remark 6.2.2 Itos formula can also be written under the following equiva-
lent integral form
t t
1 2
Ft = F0 + bs f (Xs ) + s f (Xs ) ds + s f (Xs ) dWs .
0 2 0
In the case Xt = Wt we obtain the following consequence:
Corollary 6.2.3 Let Ft = f (Wt ). Then
1
dFt = f (Wt )dt + f (Wt ) dWt . (6.2.5)
2
Particular cases In the following we shall present the most often used cases:
1. If f (x) = x , with constant, then f (x) = x1 , f (x) = ( 1)x2 .
Then (6.2.5) becomes the following useful formula
1
d(Wt ) = ( 1)Wt2 dt + Wt1 dWt .
2
A couple of useful cases easily follow:
d(Wt2 ) = 2Wt dWt + dt

d(Wt3 ) = 3Wt2 dWt + 3Wt dt.
2. If f (x) = ekx , with k constant, f (x) = kekx , f (x) = k2 ekx . Therefore
1
d(ekWt ) = kekWt dWt + k2 ekWt dt.
2
In particular, for k = 1, we obtain the increments of a geometric Brownian

motion
1
d(eWt ) = eWt dWt + eWt dt.
2
3. If f (x) = sin x, then
1
d(sin Wt ) = cos Wt dWt sin Wt dt.
2
Exercise 6.2.4 Use the previous rules to nd the following increments

(a) d(Wt eWt )
(b) d(3Wt2 + 2e5Wt )
2
(c) d(et+Wt )

(d) d (t + Wt )n
1 t
(e) d Wu du
t 0
1 t
(f ) d eWu du , where is a constant.
t 0
In the case when the function f = f (t, x) is also time dependent, the analog
of (6.2.2) is given by
1
df (t, x) = t f (t, x)dt + x f (t, x)dx + x2 f (t, x)(dx)2 + O(dx)3 + O(dt)2 .
2
(6.2.6)
Substituting x = Xt yields
1
df (t, Xt ) = t f (t, Xt )dt + x f (t, Xt )dXt + x2 f (t, Xt )(dXt )2 . (6.2.7)
2
If Xt is a process satisfying an equation of type (6.1.1), then we obtain an
extra-term in formula (6.2.4)
b(Wt , t)2 2
dFt = t f (t, Xt ) + a(Wt , t)x f (t, Xt ) + x f (t, Xt ) dt
2
+b(Wt , t)x f (t, Xt ) dWt . (6.2.8)
d(tWt2 ) = (t + Wt2 )dt + 2tWt dWt .
Exercise 6.2.6 Find the following increments
(a) d(tWt ) (c) d(t2 cos Wt )

(b) d(et Wt ) (d) d(sin t Wt2 ).
6.2.1 Ito diusions

Consider the process Xt given by
dXt = b(Xt , t)dt + (Xt , t)dWt . (6.2.9)
A process Xt = (Xti ) Rn satisfying this relation is called an Ito diusion in
Rn . Equation (6.2.9) models the position of a small particle that moves under
the inuence of a drift force b(Xt , t), and is subject to random deviations. This
situation occurs in the physical world when a particle suspended in a moving
liquid is subject to random molecular bombardments. The amount 12 T is
called the diusion coecient and describes the difussion of the particle.
Exercise 6.2.7 Consider the time-homogeneous Ito diusion in Rn
dXt = b(Xt )dt + (Xt )dWt .
Show that:
E[(Xti xi )]
(a) bi (x) = lim .
t0 t
E[(Xti xi )(Xtj xj )]
(b) ( T )ij (x) = lim .
t0 t
6.2.2 Itos formula for Poisson processes

Consider the process Ft = F (Mt ), where Mt = Nt t is the compensated
Poisson process. Itos formula for the process Ft takes the following integral
form. For a proof the reader can consult Kuo [30].
Proposition 6.2.8 Let F be a twice dierentiable function. Then for any
a < t we have
t
Ft = Fa + F (Ms ) dMs + F (Ms ) F (Ms )Ms ,
a a<st
where Ms = Ms Ms and F (Ms ) = F (Ms ) F (Ms ).

We shall apply the aforementioned result for the case Ft = F (Mt ) = Mt2 .
We have
t
2 2
Mt = Ma + 2 Ms dMs + Ms2 Ms
2
2Ms (Ms Ms ) . (6.2.10)
a a<st
Since the jumps in Ns are of size 1, we have (Ns )2 = Ns . Since the

dierence of the processes Ms and Ns is continuous, then Ms = Ns . Using
these formulas we have

Ms2 Ms
2
2Ms (Ms Ms ) = (Ms Ms ) Ms + Ms 2Ms
= (Ms Ms )2 = (Ms )2 = (Ns )2
= Ns = Ns Ns .

Since the sum of the jumps between s and t is a<st Ns = Nt Na , formula
(6.2.10) becomes
t
Mt2 = Ma2 + 2 Ms dMs + Nt Na . (6.2.11)
a
The dierential form is
d(Mt2 ) = 2Mt dMt + dNt ,
d(Mt2 ) = (1 + 2Mt ) dMt + dt,
since dNt = dMt + dt.
T
1
Mt dMt = (MT2 NT ).
0 2
Exercise 6.2.10 Use Itos formula for the Poisson process to nd the condi-
tional expectation E[Mt2 |Fs ] for s < t.
6.2.3 Itos multidimensional formula

If the process Ft depends on several Ito diusions, say Ft = f (t, Xt , Yt ), then
a similar formula to (6.2.8) leads to
f f f
dFt = (t, Xt , Yt )dt + (t, Xt , Yt )dXt + (t, Xt , Yt )dYt
t x y
1 2f 2 1 2f
+ (t, X ,
t tY )(dX t ) + (t, Xt , Yt )(dYt )2
2 x2 2 y 2
2f
+ (t, Xt , Yt )dXt dYt .
xy
Example 6.2.11 (Harmonic function of a Brownian motion) In the case
when Ft = f (Xt , Yt ), with Xt = Wt1 , Yt = Wt2 independent Brownian motions,
we have
f f 1 2f 1 2f
dFt = dWt1 + dWt2 + (dW 1 2
) + (dWt2 )2
x y 2 x2 t
2 y 2
2f
+ dWt1 dWt2
xy
f f 1 2f 2f
= dWt1 + dWt2 + + dt.
x y 2 x2 y 2
The expression
1 2f 2f
f = +
2 x2 y 2
is called the Laplacian of f . We can rewrite the previous formula as
f f
dFt = dWt1 + dWt2 + f dt.
x y
A function f with f = 0 is called harmonic. The aforementioned formula in
the case of harmonic functions takes the simple form
f f
dFt = dWt1 + dWt2 . (6.2.12)
x y
Exercise 6.2.12 Let Wt1 , Wt2 be two independent Brownian motions. If the
function f is harmonic, show that Ft = f (Wt1 , Wt2 ) is a martingale. Is the
converse true?
Exercise 6.2.13 Use the previous formulas to nd dFt in the following cases
(a) Ft = (Wt1 )2 + (Wt2 )2
(b) Ft = ln[(Wt1 )2 + (Wt2 )2 ].

Exercise 6.2.14 Consider the Bessel process Rt = (Wt1 )2 + (Wt2 )2 , where
Wt1 and Wt2 are two independent Brownian motions. Prove that
1 W1 W2
dRt = dt + t dWt1 + t dWt2 .
2Rt Rt Rt
Example 6.2.15 (The product rule) Let Xt and Yt be two Ito diusions.
Show that
d(Xt Yt ) = Yt dXt + Xt dYt + dXt dYt .
Consider the function f (x, y) = xy. Since x f = y, y f = x, x2 f = y2 f = 0,

x y f = 1, then Itos multidimensional formula yields

d(Xt Yt ) = d f (X, Yt ) = x f dXt + y f dYt
1 1
+ x2 f (dXt )2 + y2 f (dYt )2 + x y f dXt dYt
2 2
= Yt dXt + Xt dYt + dXt dYt .
Example 6.2.16 (The quotient rule) Let Xt and Yt be two Ito diusions.
Show that
X Y dX X dY dX dY Xt
+ 3 (dYt )2 .
t t t t t t t
d =
Yt Yt2 Yt
Consider the function f (x, y) = xy . Since x f = y1 , y f = yx2 , x2 f = 0,

y f = yx2 , y2 f = 2x 1
y 3 , x y = y 2 , then applying Itos multidimensional
formula yields
X
t
d = d f (X, Yt ) = x f dXt + y f dYt
Yt
1 1
+ x2 f (dXt )2 + y2 f (dYt )2 + x y f dXt dYt
2 2
Yt dXt Xt dYt dXt dYt Xt
= 2 + 3 (dYt )2 .
Yt Yt
Chapter 7
Stochastic Integration
Techniques
Computing a stochastic integral starting from the denition of the Ito integral
is not only dicult, but also rather inecient. Like in elementary Calculus,
several methods can be developed to compute stochastic integrals. We tried
to keep the analogy with elementary Calculus as much as possible. The inte-
gration by substitution is more complicated in the stochastic environment and
we have considered only a particular case of it, which we called the method of
heat equation.
7.1 Notational Conventions

The intent of this section is to discuss some equivalent integral notations for
a stochastic dierential equation. Consider a process Xt whose increments
satisfy the stochastic dierential equation dXt = f (t, Wt )dWt . This can be
written equivalently in the integral form as
t t
dXs = f (s, Ws )dWs . (7.1.1)
a a
If we consider the partition 0 = t0 < t1 < < tn1 < tn = t, then the left
side becomes
t
n1
dXs = ms-lim (Xtj+1 Xtj ) = Xt Xa ,
a n
j=0
after canceling the terms in pairs. Substituting into formula (7.1.1) yields the
equivalent form t
Xt = Xa + f (s, Ws )dWs .
a
149
t
This can also be written as dXt = d f (s, Ws )dWs , since Xa is a constant.
a
Using dXt = f (t, Wt )dWt , the previous formula can be written in the following
two equivalent ways:
(i) For any a < t, we have
t
d f (s, Ws )dWs = f (t, Wt )dWt . (7.1.2)
a
(ii) If Yt is a stochastic process, such that Yt dWt = dFt , then

b
Yt dWt = Fb Fa .
a
These formulas are equivalent ways of writing the stochastic dierential equa-
tion (7.1.1), and will be useful in future computations. A few applications
follow.
Example 7.1.1 Verify the stochastic formula

t
W2 t
Ws dWs = t .
0 2 2
t Wt2 t
Let Xt = 0 Ws dWs and Yt = . From Itos formula
2 2
W 2 t 1 1
dYt = d t
d = (2Wt dWt + dt) dt = Wt dWt ,
2 2 2 2
and from formula (7.1.2) we get
t
dXt = d Ws dWs = Wt dWt .
0
Hence dXt = dYt , or d(Xt Yt ) = 0. Since the process Xt Yt has zero

increments, then Xt Yt = c, constant. Taking t = 0, yields
0 W 2 0
0
c = X0 Y0 = Ws dWs = 0,
0 2 2
and hence c = 0. It follows that Xt = Yt , which veries the desired relation.
Example 7.1.2 Verify the formula

t
t 2 t 1 t 2
sWs dWs = Wt W ds.
0 2 2 2 0 s
Stochastic Integration Techniques 151
t t 2
Consider the stochastic processes Xt = 0 sWs dWs , Yt = Wt 1 , and
t 2
Zt = 12 0 Ws2 ds. Formula (7.1.2) yields
dXt = tWt dWt

1 2
dZt = W dt.
2 t
Applying Itos formula, we get
t t 1 t2
2 2
dYt = d Wt = d(tWt ) d
2 2 2 4
1 1
= (t + Wt2 )dt + 2tWt dWt tdt
2 2
1 2
= W dt + tWt dWt .
2 t
We can easily see that
dXt = dYt dZt .
This implies d(Xt Yt + Zt ) = 0, i.e. Xt Yt + Zt = c, constant. Since
X0 = Y0 = Z0 = 0, it follows that c = 0. This proves the desired relation.
Example 7.1.3 Show that

t
1
(Ws2 s) dWs = Wt3 tWt .
0 3
Consider the function f (t, x) = 13 x3 tx, and let Ft = f (t, Wt ). Since t f =

x, x f = x2 t, and x2 f = 2x, then Itos formula provides
1
dFt = t f dt + x f dWt + x2 f (dWt )2
2
2 1
= Wt dt + (Wt t) dWt + 2Wt dt
2
= (Wt2 t)dWt .
From formula (7.1.2) we get

t t
1
(Ws2 s) dWs = dFs = Ft F0 = Ft = Wt3 tWt .
0 0 3
t t
2 2 1
(a) ln Ws dWs = Wt (ln Wt 2) ds;
0 0 Ws
t
s t
(b) e 2 cos Ws dWs = e 2 sin Wt ;
0

t s t
(c) e 2 sin Ws dWs = 1 e 2 cos Wt ;
0
t
eWs 2 dWs = eWt 2 1;
s t
(d)
0
t t
1
(e) cos Ws dWs = sin Wt +
sin Ws ds;
0 2 0
t
1 t
(f ) sin Ws dWs = 1 cos Wt cos Ws ds.
0 2 0
7.2 Stochastic Integration by Parts

Consider the process Ft = f (t)g(Wt ), with f and g dierentiable. Using the
product rule yields
dFt = df (t) g(Wt ) + f (t) dg(Wt )

1
= f (t)g(Wt )dt + f (t) g (Wt )dWt + g (Wt )dt
2
1
= f (t)g(Wt )dt + f (t)g (Wt )dt + f (t)g (Wt )dWt .
2
Writing the relation in the integral form, we obtain the rst integration by
parts formula:
b b b
1 b
f (t)g (Wt ) dWt = f (t)g(Wt ) f (t)g(Wt ) dt f (t)g (Wt ) dt.
a a a 2 a
This formula is to be used when integrating a product between a function

of t and a function of the Brownian motion Wt , for which an antiderivative
is known. The following two particular cases are important and useful in
applications.
1. If g(Wt ) = Wt , the aforementioned formula takes the simple form
b t=b b

f (t) dWt = f (t)Wt f (t)Wt dt. (7.2.3)
a t=a a
It is worth noting that the left side is a Wiener integral.

2. If f (t) = 1, then the formula becomes
b t=b 1 b

g (Wt ) dWt = g(Wt ) g (Wt ) dt. (7.2.4)
a t=a 2 a
T
Application 7.2.1 Consider the Wiener integral IT = t dWt . From the
0
general theory, see Proposition 5.6.1, it is known that I is a random variable
normally distributed with mean 0 and variance
T
T3
V ar[IT ] = t2 dt = .
0 3
Recall the denition of integrated Brownian motion
t
Zt = Wu du.
0
Formula (7.2.3) yields a relationship between I and the integrated Brownian

motion T T
IT = t dWt = T WT Wt dt = T WT ZT ,
0 0
and hence IT +ZT = T WT . This relation can be used to compute the covariance
between IT and ZT .
Cov(IT + ZT , IT + ZT ) = V ar[T WT ]
V ar[IT ] + V ar[ZT ] + 2Cov(IT , ZT ) = T 2 V ar[WT ]
T 3 /3 + T 3 /3 + 2Cov(IT , ZT ) = T 3
Cov(IT , ZT ) = T 3 /6,
where we used that V ar[ZT ] = T 3 /3. The processes It and Zt are not inde-
pendent. Their correlation coecient is 0.5 as the following calculation shows
Cov(IT , ZT ) T 3 /6
Corr(IT , ZT ) = 1/2 = T 3 /3
V ar[IT ]V ar[ZT ]
= 1/2.
x2
Application 7.2.2 If we let g(x) = 2 in formula (7.2.4), we get
b
Wb2 Wa2 1
Wt dWt = (b a).
a 2 2
It is worth noting that letting a = 0 and b = T , we retrieve a formula that was

proved by direct methods in chapter 3
T
WT2 T
Wt dWt = .
0 2 2
x3
Similarly, if we let g(x) = 3 in (7.2.4) yields
b
b
Wt3 b
Wt2 dWt = Wt dt.
a 3 a a
Application 7.2.3 Choosing f (t) = et and g(x) = sin x, we shall compute

T
the stochastic integral 0 et cos Wt dWt using the formula of integration by
parts
T T
t
e cos Wt dWt = et (sin Wt ) dWt
0 0
T T
1 T t
t
= e sin Wt
t
(e ) sin Wt dt e (sin Wt ) dt
0 0 2 0
T
1 T t
= e sin WT
T t
e sin Wt dt + e sin Wt dt
0 2 0

1 T t
= eT sin WT e sin Wt dt.
2 0
The particular case = 12 leads to the following exact formula of a stochastic
integral
T
t T
e 2 cos Wt dWt = e 2 sin WT . (7.2.5)
0
In
T atsimilar way, we can obtain an exact formula for the stochastic integral
0 e sin Wt dWt as follows
T T
e sin Wt dWt =
t
et (cos Wt ) dWt
0 0
T T
1 T t
= et cos Wt + et cos Wt dt e cos Wt dt.
0 0 2 0
1
Taking = 2 yields the closed form formula
T t T
e 2 sin Wt dWt = 1 e 2 cos WT . (7.2.6)
0
A consequence of the last two formulas and of Eulers formula

eiWt = cos Wt + i sin Wt ,
is
T
e 2 +iWt dWt = i(1 e 2 +iWT ).
t T
The proof details are left to the reader.

A general form of the integration by parts formula In general, if Xt

and Yt are two Ito diusions, from the product formula
Integrating between the limits a and b
b b b b
a a a a
From the Fundamental Theorem

b
d(Xt Yt ) = Xb Yb Xa Ya ,
a
so the previous formula takes the following form of integration by parts

b b b
Xt dYt = Xb Yb Xa Ya Yt dXt dXt dYt .
a a a
This formula is of theoretical value. In practice, the term dXt dYt needs to be
computed using the rules dWt2 = dt, and dt dWt = 0.
Exercise 7.2.4 (a) Use integration by parts to get

T T
1 1 Wt
2 dW t = tan (W T ) + 2 2 dt, T > 0.
0 1 + Wt 0 (1 + Wt )
(b) Show that

T
Wt
E[tan1 (WT )] = E dt.
0 (1 + Wt2 )2
(c) Prove the double inequality

3 3 x 3 3
, x R.
16 (1 + x2 )2 16
(d) Use part (c) to obtain
T
3 3 Wt 3 3
T 2 2 dt 16 T.
16 0 (1 + Wt )
(e) Use part (d) to get

3 3 1 3 3
T E[tan (WT )] T.
16 16
(f ) Does part (e) contradict the inequality

< tan1 (WT ) < ?
2 2
Exercise 7.2.5 (a) Show the relation

T
1 T Wt
Wt
e dWt = e WT
1 e dt.
0 2 0
(b) Use part (a) to nd E[eWt ].
Exercise 7.2.6 (a) Use integration by parts to show

T
1 T Wt
Wt eWt dWt = 1 + WT eWT eWT e (1 + Wt ) dt;
0 2 0
(b) Use part (a) to nd E[Wt eWt ];
(c) Show that Cov(Wt , eWt ) = tet/2 ;

t
(d) Prove that Corr(Wt, eWt ) = , and compute the limits as t 0
et 1
and t .
Exercise 7.2.7 (a) Let T > 0. Show the following relation using integration
by parts
T T
2Wt 2 1 Wt2
2 dW t = ln(1 + W T ) 2 2 dt.
0 1 + Wt 0 (1 + Wt )
(b) Show that for any real number x the following double inequality holds
1 1 x2
1.
8 (1 + x2 )2
(c) Use part (b) to show that

1 T
1 Wt2
T dt T.
8 0 (1 + Wt2 )2
(d) Use parts (a) and (c) to get
T
E[ln(1 + WT2 )] T.
8
(e) Use Jensens inequality to get
E[ln(1 + WT2 )] ln(1 + T ).
Does this contradict the upper bound provided in (d)?
Exercise 7.2.8 Use integration by parts to show

t
1 2 1 t 1
arctan Ws dWs = Wt arctan Wt ln(1 + Wt ) ds.
0 2 2 0 1 + Ws2
Exercise 7.2.9 (a) Using integration by parts prove the identity

t t
1
Ws eWs dWs = 1 + eWt (Wt 1) (1 + Ws )eWs ds;
0 2 0
(b) Use part (a) to compute E[Wt eWt ].
Exercise 7.2.10 Check the following formulas using integration by parts

t
1 t
(a) Ws sin Ws dWs = sin Wt Wt cos Wt (sin Ws + Ws cos Ws ) ds;
0 2 0
t
Ws 1 t
(b) 2
dWs = 1 + Wt 1 (1 + Ws2 )3/2 ds.
0 1 + W 2 2 0
s
7.3 The Heat Equation Method

In elementary Calculus, integration by substitution is the inverse application
of the chain rule. In the stochastic environment, this will be the inverse
application of Itos formula. This is dicult to apply in general, but there is
a particular case of great importance.
Let (t, x) be a solution of the equation
1
t + x2 = 0. (7.3.7)
2
This is called the heat equation without sources. The non-homogeneous equa-
tion
1
t + x2 = G(t, x) (7.3.8)
2
is called the heat equation with sources. The function G(t, x) represents the
density of heat sources, while the function (t, x) is the temperature at the
point x at time t in a one-dimensional wire. If the heat source is time inde-
pendent, then G = G(x), i.e. G is a function of x only.
Example 7.3.1 Find all solutions of the equation (7.3.7) of type (t, x) =
a(t) + b(x).
Substituting into equation (7.3.7) yields
1
b (x) = a (t).
2
Since the left side is a function of x only, while the right side is a function of
variable t, the only case where the previous equation is satised is when both
sides are equal to the same constant C. This is called a separation constant.
Therefore a(t) and b(x) satisfy the equations
1
a (t) = C, b (x) = C.
2
Integrating yields a(t) = Ct + C0 and b(x) = Cx2 + C1 x + C2 . It follows
that
(t, x) = C(x2 t) + C1 x + C3 ,
with C0 , C1 , C2 , C3 arbitrary constants.
Example 7.3.2 Find all solutions of the equation (7.3.7) of the type (t, x) =
a(t)b(x).
Substituting into the equation and dividing by a(t)b(x) yields
a (t) 1 b (x)

+ = 0.
a(t) 2 b(x)
a (t) b (x)

There is a separation constant C such that = C and = 2C.
a(t) b(x)
There are three distinct cases to discuss:
1. C = 0. In this case a(t) = a0 and b(x) = b1 x + b0 , with a0 , a1 , b0 , b1 real
constants. Then
(t, x) = a(t)b(x) = c1 x + c0 , c0 , c1 R
is just a linear function in x.

2
2. C > 0. Let > 0 such that 2C = 2 . Then a (t) = 2 a(t) and
b (x) = 2 b(x), with solutions
2 t/2
a(t) = a0 e
b(x) = c1 ex + c2 ex .
The general solution of (7.3.7) is

2 t/2
(t, x) = e (c1 ex + c2 ex ), c1 , c2 R.
2
3. C < 0. Let > 0 such that 2C = 2 . Then a (t) = 2 a(t) and
b (x) = 2 b(x). Solving yields
2 t/2
a(t) = a0 e
b(x) = c1 sin(x) + c2 cos(x).
The general solution of (7.3.7) in this case is

2
(t, x) = e t/2 c1 sin(x) + c2 cos(x) , c1 , c2 R.
In particular, the functions x, x2 t, ext/2 , ext/2 , et/2 sin x and et/2 cos x,
or any linear combination of them, are solutions of the heat equation (7.3.7).
However, there are other solutions which are not of the previous type.
Exercise 7.3.3 Prove that (t, x) = 13 x3 tx is a solution of the heat equation

(7.3.7).
2 /(2t)
Exercise 7.3.4 Show that (t, x) = t1/2 ex is a solution of the heat
equation (7.3.7) for t > 0.
x
Exercise 7.3.5 Let = u(), with = , t > 0. Show that satises the
2 t
heat equation (7.3.7) if and only if u + 2u = 0.

2 2
Exercise 7.3.6 Let erf c(x) = er dr. Show that = erf c(x/(2 t))
x
is a solution of the equation (7.3.7).
2
1 e 4t
x
Exercise 7.3.7 (the fundamental solution) Show that (t, x) = 4t
,
t > 0 satises the equation (7.3.7).
Sometimes it is useful to generate new solutions for the heat equation from
other solutions. Below we present a few ways to accomplish this:
(i) by linear combination: if 1 and 2 are solutions, then a1 1 + a1 2 is
a solution, where a1 , a2 are constants.
(ii) by translation: if (t, x) is a solution, then (t , x ) is a solution,
where (, ) is a translation vector.
(iii) by ane transforms: if (t, x) is a solution, then (t, 2 x) is a
solution, for any constant .
n+m
(iv) by dierentiation: if (t, x) is a solution, then n m (t, x) is a
x t
solution.
(v) by convolution: if (t, x) is a solution, then so are
b
(t, x )f () d
a
b
(t , x)g(t) dt.
a
For more detail on the subject the reader can consult Widder [46] and Cannon
[11].
Theorem 7.3.8 Let (t, x) be a solution of the heat equation (7.3.7) and
denote f (t, x) = x (t, x). Then
b
f (t, Wt ) dWt = (b, Wb ) (a, Wa ).
a
Proof: Let Ft = (t, Wt ). Applying Itos formula we get

1
dFt = x (t, Wt ) dWt + t + x2 dt.
2
Since t + 12 x2 = 0 and x (t, Wt ) = f (t, Wt ), we have
dFt = f (t, Wt ) dWt .
Writing in the integral form, yields

b b
f (t, Wt ) dWt = dFt = Fb Fa = (b, Wb ) (a, Wa ).
a a

T
1 1
Wt dWt = WT2 T.
0 2 2
Choose the solution of the heat equation (7.3.7) given by (t, x) = x2 t.

Then f (t, x) = x (t, x) = 2x. Theorem 7.3.8 yields
T T T

2Wt dWt = f (t, Wt ) dWt = (t, x) = WT2 T.
0 0 0
Dividing by 2 leads to the desired result.

T
1
(Wt2 t) dWt = WT3 T WT .
0 3
Consider the function (t, x) = 13 x3 tx, which is a solution of the heat equa-
tion (7.3.7), see Exercise 7.3.3. Then f (t, x) = x (t, x) = x2 t. Applying
Theorem 7.3.8 yields
T T T
2 1
(Wt t) dWt = f (t, Wt ) dWt = (t, Wt ) = WT3 T WT .
0 0 0 3
Application 7.3.11 Let > 0. Prove the identities

T 2 t 1 2 T WT
e 2
Wt
dWt = e 2 1 .
0
2 t
Consider the function (t, x) = e 2 x , which is a solution of the homoge-
neous heat equation (7.3.7), see Example 7.3.2. Then f (t, x) = x (t, x) =
2 t
e 2
x
. Apply Theorem 7.3.8 to get
T T T
2
2 t x 2 T
e dWt = f (t, Wt ) dWt = (t, Wt ) = e 2 WT 1.
0 0 0
Dividing by the constant ends the proof.

In particular, for = 1 the aforementioned formula becomes
T
e 2 +Wt dWt = e 2 +WT 1.
t T
(7.3.9)
0
Application 7.3.12 Let > 0. Prove the identity

T 2 t 1 2 T
e 2 cos(Wt ) dWt = e 2 sin(WT ).
0
2 t
From the Example 7.3.2 we know that (t, x) = e 2 sin(x) is a solution of the
heat equation. Applying Theorem 7.3.8 to the function f (t, x) = x (t, x) =
2 t
e 2 cos(x), yields
T T T
2 t
e 2 cos(Wt ) dWt = f (t, Wt ) dWt = (t, Wt )
0 0 0

2
t T 2
T
= e 2 sin(Wt ) = e 2 sin(WT ).
0
Divide by to end the proof.

If we choose = 1 we recover a result already familiar to the reader from
section 7.2
T
t T
e 2 cos(Wt ) dWt = e 2 sin WT . (7.3.10)
0
Application 7.3.13 Let > 0. Show that

T 2 t 1 2 T

e 2 sin(Wt ) dWt = 1 e 2 cos(WT ) .
0
2 t
Choose (t, x) = e 2 cos(x) to be a solution of the heat equation. Apply
2 t
Theorem 7.3.8 for the function f (t, x) = x (t, x) = e 2 sin(x) to get
T T
2 t
()e 2 sin(Wt ) dWt = (t, Wt )
0 0
T
2 T 2 T
= e 2 cos(Wt ) = e 2 cos(WT ) 1,
0
and then divide by .
Application 7.3.14 Let 0 < a < b. Show that

Wb2
b 3 Wt2 1 Wa2
1
t 2 Wt e 2t dWt = a 2 e 2a b 2 e 2b . (7.3.11)
a
2
From Exercise 7.3.4 we have that (t, x) = t1/2 ex /(2t) is a solution of the
2
homogeneous heat equation. Since f (t, x) = x (t, x) = t3/2 xex /(2t) ,
applying Theorem 7.3.8 yields the desired result. The reader can easily ll in
the details.
Integration techniques will be used when solving stochastic dierential
equations in the next chapter.
Exercise 7.3.15 Find the value of the following stochastic integrals

1
(a) et cos( 2Wt ) dWt
0
3
(b) e2t cos(2Wt ) dWt
0
4
(c) et+ 2Wt dWt .
0
Exercise 7.3.16 Let (t, x) be a solution of the following non-homogeneous

heat equation with time-dependent and uniform heat source G(t)
1
t + x2 = G(t).
2
Denote f (t, x) = x (t, x). Show that
b b
f (t, Wt ) dWt = (b, Wb ) (a, Wa ) G(t) dt.
a a
How does the formula change if the heat source G is constant?

7.4 Table of Usual Stochastic Integrals

Now we present a user-friendly table, which enlists integral identities developed
in this chapter. This table is far too complicated to be memorized in full.
However, the rst couple of identities in this table are the most memorable,
and should be remembered.
Let a < b and 0 < T . Then we have:
b
1. dWt = Wb Wa ;
a
T
WT2 T
2. Wt dWt = ;
0 2 2
T
WT2
3. (Wt2 t) dWt = T WT ;
0 3
T T
4. t dWt = T WT Wt dt, 0 < T ;
0 0

T
W3 T
5. Wt2 dWt = T Wt dt;
0 3 0
T t T
6. e 2 cos Wt dWt = e 2 sin WT ;
0
T t T
7. e 2 sin Wt dWt = 1 e 2 cos WT ;
0
T
e 2 +Wt dWt = e 2 +WT 1;
t T
8.
0
T 2 t 1 2 T
9. e 2 cos(Wt ) dWt = e 2 sin(WT );
0
T 2 t 1 2 T

10. e 2 sin(Wt ) dWt = 1 e 2 cos(WT ) ;
0
T 2 t 1 2 T WT
11. e 2
Wt
dWt = e 2 1 ;
0
b Wt2 2 Wb2
3 1 Wa 1
12. t 2 Wt e 2t dWt = a 2 e 2a b 2 e 2b ;
a
T T
1
13. cos Wt dWt = sin WT + sin Wt dt;
0 2 0
T
1 T
14. sin Wt dWt = 1 cos WT cos Wt dt;
0 2 0
t
15. d f (s, Ws ) dWs = f (t, Wt ) dWt ;
a
b
16. Yt dWt = Fb Fa , if Yt dWt = dFt ;
a
b b
17. f (t) dWt = f (t)Wt |ba f (t)Wt dt;
a a
b b 1 b

18. g (Wt ) dWt = g(Wt ) g (Wt ) dt.
a a 2 a
Chapter 8
Stochastic Dierential
Equations
If deterministic Calculus was developed mainly to put into dierential equa-

tions form the fundamental principles which govern all evolution phenomena,
then Stochastic Calculus plays a similar role for the case of noisy evolution
systems, which provide a more realistic description of the real world.
This chapter deals with several analytic techniques of solving stochastic
dierential equations. The number of these techniques is limited and follows
quite closely the methods used in the ordinary dierential equations treated,
for instance, in classical books of Arnold [3] or Boyce and DiPrima [7].
8.1 Denitions and Examples

Let Xt be a continuous stochastic process. If small changes in the process
Xt can be written as a linear combination of small changes in t and small
increments of the Brownian motion Wt , we may write
dXt = a(t, Wt , Xt )dt + b(t, Wt , Xt ) dWt (8.1.1)
and call it a stochastic dierential equation. In fact, this dierential relation

has the following integral meaning:
t t
Xt = X0 + a(s, Ws , Xs ) ds + b(s, Ws , Xs ) dWs , (8.1.2)
0 0
where the last integral is taken in the Ito sense. Relation (8.1.2) is taken as the
denition for the stochastic dierential equation (8.1.1). However, since it is
convenient to use stochastic dierentials informally, we shall approach stochas-
tic dierential equations by analogy with the ordinary dierential equations,
165
and try to present the same methods of solving equations in the new stochastic
environment.
Most of the stochastic dierential equations considered here describe dif-
fusions, and are of the type
dXt = a(t, Xt )dt + b(t, Xt )dWt , X0 = , (8.1.3)
with a(t, x) and b(t, x) measurable functions. The functions a(t, x) and b(t, x)
are called the drift rate and the volatility of the process Xt , respectively. Given
these two functions as input, one may seek for the solution Xt of the stochastic
dierential equation as an output. The desired outputs Xt are the so-called
strong solutions. The precise denition of this concept is given in the following.
The beginner can skip this denition; all solutions in this book will be solutions
in the strong sense anyway.
Denition 8.1.1 A process Xt is a strong solution for the stochastic equation

(8.1.3) on the probability space (, F, P ) if it satises the following properties:
(i) Xt is adapted to the augmented ltration Ft generated by the Brownian
motion Wt and the initial condition ;
(ii) P (X0 = ) = 1;
(iii) For any 0 t < we have
t
|a(s, Xs )| + b2 (s, Xs ) ds < ;
0
(iv) The formula

t t
Xt = X0 + a(s, Xs ) ds + b(s, Xs ) dWs
0 0
holds almost surely.
A few comments regarding the previous denition. Part (i) states that
given the information induced by and the history of the Brownian motion
until time t, one can determine the value Xt . Part (ii) states that X0 takes
the value with probability 1. Part (iii) deals with a non-explosive condition
for the coecients. Part (iv) states that Xt veries the associated integral
equation.
We shall start with an example.
Example 8.1.2 (The Brownian bridge) Let a, b R. Show that the pro-
cess t
1
Xt = a(1 t) + bt + (1 t) dWs , 0 t < 1
0 1s
Stochastic Dierential Equations 167
is a solution of the stochastic dierential equation

b Xt
dXt = dt + dWt , 0 t < 1, X0 = a.
1t
We shall perform a routine verication to show that Xt is a solution. First we
b Xt
compute the quotient :
1t
t
1
b Xt = b a(1 t) bt (1 t) dWs
0 1s
t
1
= (b a)(1 t) (1 t) dWs ,
0 1s
and dividing by 1 t yields

b Xt t
1
=ba dWs . (8.1.4)
1t 0 1s
Using
t
1 1
d dWs = dWt ,
0 1s 1t
the product rule provides
t t 1
1
dXt = a d(1 t) + bdt + d(1 t) dWs + (1 t)d dWs
0 1s 0 1s
t
1
= ba dWs dt + dWt
0 1 s
b Xt
= dt + dWt ,
1t
where the last identity comes from (8.1.4). We just veried that the process
Xt is a solution of the given stochastic equation. The question of how this
solution was obtained in the rst place, is the subject of study for the next few
sections.
8.2 The Integration Technique

We shall start with the simple case when both the drift and the volatility are
just functions of time t.
Proposition 8.2.1 The solution Xt of the stochastic dierential equation
dXt = a(t)dt + b(t)dWt

t t
is Gaussian distributed with the mean X0 + 0 a(s) ds and the variance 0 b2 (s) ds.
Proof: Integrating in the equation yields

t t t
Xt X0 = dXs = a(s) ds + b(s) dWs .
0 0 0
t
Using the property of Wiener integrals, 0 b(s) dWs is Gaussian distributed
t
with mean 0 and variance 0 b2 (s) ds. Then Xt is Gaussian (as a sum between
a deterministic function and a Gaussian), with
t t
E[Xt ] = E[X0 + a(s) ds + b(s) dWs ]
0 0
t t
= X0 + a(s) ds + E b(s) dWs
0 0
t
= X0 + a(s) ds,
0
t t
V ar[Xt ] = V ar[X0 + a(s) ds + b(s) dWs ]
0 0

t
= V ar b(s) dWs
0
t
= b2 (s) ds,
0
which ends the proof.
Exercise 8.2.2 Solve the following stochastic dierential equations for t 0

and determine the mean and the variance of the solution:
(a) dXt = cos t dt sin t dWt , X0 = 1.

(b) dXt = et dt + t dWt , X0 = 0.
(c) dXt = 1+tt 3/2 dW , X = 1.
2 dt + t t 0
If the drift and the volatility depend only on variables t and Wt , the
stochastic dierential equation
dXt = a(t, Wt )dt + b(t, Wt )dWt , t0
denes a stochastic process that can be expressed in terms of Ito integrals
t t
Xt = X0 + a(s, Ws ) ds + b(s, Ws ) dWs .
0 0
There are several cases when both integrals can be computed explicitly. In
order to compute the second integral we shall often use the table of usual
stochastic integrals provided in section 7.4.
Example 8.2.3 Find the solution of the stochastic dierential equation
dXt = dt + Wt dWt , X0 = 1.
Integrate between 0 and t and get

t t
W2 t
Xt = 1 + ds + Ws dWs = 1 + t + t
0 0 2 2
1
= (W 2 + t) + 1.
2 t
Example 8.2.4 Solve the stochastic dierential equation
dXt = (Wt 1)dt + Wt2 dWt , X0 = 0.

t
Let Zt = 0 Ws ds denote the integrated Brownian motion process. Integrating
the equation between 0 and t yields
t t t
Xt = dXs = (Ws 1)ds + Ws2 dWs
0 0 0
1
= Zt t + Wt3 Zt
3
1 3
= W t,
3 t
t
where we used that 0 Ws2 dWs = 13 Wt3 Zt .
dXt = t2 dt + et/2 cos Wt dWt , X0 = 0,
and nd E[Xt ] and V ar(Xt ).
Integrating yields
t t
Xt = s2 ds + es/2 cos Ws dWs
0 0
t3
= + et/2 sin Wt , (8.2.5)
3
where we used (7.3.10). Even if the process Xt is not Gaussian, we can still
compute its mean and variance. Since Ito integrals have zero expectation,
t t t t3
E[Xt ] = E s2 ds + es/2 cos Ws dWs = s2 ds = .
0 0 0 3
Another variant of computation is using Itos formula
1
d(sin Wt ) = cos Wt dWt sin Wt dt
2
Integrating between 0 and t yields
t t
1
sin Wt = cos Ws dWs sin Ws ds,
0 2 0
where we used that sin W0 = sin 0 = 0. Taking the expectation in the previous
relation yields
t 1
t
E[sin Wt ] = E cos Ws dWs E[sin Ws ] ds.
0 2 0
From the properties of the Ito integral, the rst expectation on the right side
is zero. Denoting (t) = E[sin Wt ], we obtain the integral equation
t
1
(t) = (s) ds.
2 0
Dierentiating yields the dierential equation
1
(t) = (t)
2
with the solution (t) = ket/2 . Since k = (0) = E[sin W0 ] = 0, it follows

that (t) = 0. Hence
E[sin Wt ] = 0.
Taking expectation in (8.2.5) leads to
t3 t3
E[Xt ] = E + et/2 E[sin Wt ] = .
3 3
Since the variance of deterministic functions is zero,
t3
V ar[Xt ] = V ar + et/2 sin Wt = (et/2 )2 V ar[sin Wt ]
3
et
= et E[sin2 Wt ] = (1 E[cos 2Wt ]). (8.2.6)
2
In order to compute the last expectation we use Itos formula
d(cos 2Wt ) = 2 sin 2Wt dWt 2 cos 2Wt dt

and integrate to get

t t
cos 2Wt = cos 2W0 2 sin 2Ws dWs 2 cos 2Ws ds.
0 0
Taking the expectation and using that Ito integrals have zero expectation,
yields t
E[cos 2Wt ] = 1 2 E[cos 2Ws ] ds.
0
If we denote m(t) = E[cos 2Wt ], the previous relation becomes an integral
equation t
m(t) = 1 2 m(s) ds.
0
Dierentiate and get
m (t) = 2m(t),
with the solution m(t) = ke2t . Since k = m(0) = E[cos 2W0 ] = 1, we have
m(t) = e2t . Substituting into (8.2.6) yields
et et et
V ar[Xt ] = (1 e2t ) = = sinh t.
2 2
In conclusion, the solution Xt has the mean and the variance given by
t3
E[Xt ] = , V ar[Xt ] = sinh t.
3
Example 8.2.6 Solve the following stochastic dierential equation
et/2 dXt = dt + eWt dWt , X0 = 0,
and nd the distribution of the solution Xt and its mean and variance.
Dividing by et/2 , integrating between 0 and t, and using formula (7.3.9) yields
t t
s/2
Xt = e ds + es/2+Ws dWs
0 0
t/2 t/2 Wt
= 2(1 e )+e e 1
t/2
= 1+e Wt
(e 2).
Since eWt is a geometric Brownian motion, using Proposition 3.2.2 yields
E[Xt ] = E[1 + et/2 (eWt 2)] = 1 2et/2 + et/2 E[eWt ]

= 2 2et/2 .
V ar(Xt ) = V ar[1 + et/2 (eWt 2)] = V ar[et/2 eWt ] = et V ar[eWt ]
= et (e2t et ) = et 1.
The process Xt has the following distribution:

F (y) = P (Xt y) = P 1 + et/2 (eWt 2) y
W 1
t
= P Wt ln 2 + et/2 (y 1) = P ln 2 + et/2 (y 1)
t t
1
= N ln 2 + et/2 (y 1) ,
t
u
1 2
where N (u) = es /2 ds is the distribution function of a standard
2
normal distributed random variable.

2
dXt = dt + t3/2 Wt eWt /(2t) dWt , X1 = 1.
Integrating between 1 and t and applying formula (7.3.11) yields

t t
2
Xt = X1 + ds + s3/2 Ws eWs /(2s) dWs
1 1
2 1 2
= 1 + t 1 eW1 /2 eWt /(2t)
t1/2
2 1 2
= t eW1 /2 eWt /(2t) , t 1.
t1/2
Exercise 8.2.8 Solve the following stochastic dierential equations by the
method of integration
1
(a) dXt = (t sin Wt )dt + (cos Wt )dWt , X0 = 0;
2
1
(b) dXt = ( cos Wt 1)dt + (sin Wt )dWt , X0 = 0;
2
1
(c) dXt = (sin Wt + Wt cos Wt )dt + (Wt sin Wt )dWt , X0 = 0.
2
8.3 Exact Stochastic Equations

The stochastic dierential equation
dXt = a(t, Wt )dt + b(t, Wt )dWt (8.3.7)
is called exact if there is a dierentiable function f (t, x) such that
1
a(t, x) = t f (t, x) + x2 f (t, x) (8.3.8)
2
b(t, x) = x f (t, x). (8.3.9)
Assume the equation is exact. Then substituting in (8.3.7) yields

1
dXt = t f (t, Wt ) + x2 f (t, Wt ) dt + x f (t, Wt )dWt .
2
Applying Itos formula, the previous equation becomes

dXt = d f (t, Wt ) ,
which implies Xt = f (t, Wt ) + c, with c constant.

Solving the partial dierential equations system (8.3.8)(8.3.9) requires the
following steps:
1. Integrating partially with respect to x in the second equation to obtain
f (t, x) up to an additive function T (t);
2. Substitute into the rst equation and determine the function T (t);
3. The solution is Xt = f (t, Wt ) + c, with c determined from the initial
condition on Xt .
Example 8.3.1 Solve the stochastic dierential equation as an exact equation
dXt = et (1 + Wt2 )dt + (1 + 2et Wt )dWt , X0 = 0.
In this case a(t, x) = et (1 + x2 ) and b(t, x) = 1 + 2et x. The associated system

is
1
et (1 + x2 ) = t f (t, x) + x2 f (t, x)
2
1 + 2et x = x f (t, x).
Integrating partially in x in the second equation yields

f (t, x) = (1 + 2et x) dx = x + et x2 + T (t).
Then t f = et x2 + T (t) and x2 f = 2et . Substituting in the rst equation

yields
et (1 + x2 ) = et x2 + T (t) + et .
This implies T (t) = 0, or T = c constant. Hence f (t, x) = x + et x2 + c, and
Xt = f (t, Wt ) = Wt + et Wt2 + c. Since X0 = 0, it follows that c = 0. The
solution is Xt = Wt + et Wt2 .
Example 8.3.2 Find the solution of

dXt = 2tWt3 + 3t2 (1 + Wt ) dt + (3t2 Wt2 + 1)dWt , X0 = 0.
The coecient functions are a(t, x) = 2tx3 + 3t2 (1+ x) and b(t, x) = 3t2 x2 + 1.
The associated system is given by
1
2tx3 + 3t2 (1 + x) = t f (t, x) + x2 f (t, x)
2
3t2 x2 + 1 = x f (t, x).
Integrating partially in the second equation yields

f (t, x) = (3t2 x2 + 1) dx = t2 x3 + x + T (t).
Then t f = 2tx3 + T (t) and x2 f = 6t2 x, and substituting into the rst
equation we get
1
2tx3 + 3t2 (1 + x) = 2tx3 + T (t) + 6t2 x.
2
After cancelations we get T (t) = 3t2 , so T (t) = t3 + c. Then
f (t, x) = t2 x3 + x + t3 + c.
The solution process is given by Xt = f (t, Wt ) = t2 Wt3 + Wt + t3 + c. Using

X0 = 0 we get c = 0. Hence the solution is Xt = t2 Wt3 + Wt + t3 .
The next result deals with a condition regarding the closeness of the
stochastic dierential equation.
Theorem 8.3.3 If the stochastic dierential equation (8.3.7) is exact, then

the coecient functions a(t, x) and b(t, x) satisfy the condition
1
x a = t b + x2 b. (8.3.10)
2
Proof: If the stochastic equation is exact, there is a function f (t, x) satisfying

the system (8.3.8)(8.3.9). Dierentiating the rst equation of the system with
respect to x yields
1
x a = t x f + x2 x f.
2
Substituting b = x f yields the desired relation.
Remark 8.3.4 The equation (8.3.10) has the meaning of a heat equation.
The function b(t, x) represents the temperature measured at x at the instance
t, while x a is the density of heat sources. The function a(t, x) can be regarded
as the potential from which the density of heat sources is derived by taking
the gradient in x.
It is worth noting that equation (8.3.10) is just a necessary condition for

exactness. This means that if this condition is not satised, then the equation
is not exact. In that case we need to try a dierent method to solve the
equation.
Example 8.3.5 Is the stochastic dierential equation
dXt = (1 + Wt2 )dt + (t4 + Wt2 )dWt
exact?
Collecting the coecients, we have a(t, x) = 1 + x2 , b(t, x) = t4 + x2 . Since

x a = 2x, t b = 4t3 , and x2 b = 2, the condition (8.3.10) is not satised, and
hence the equation is not exact.
Exercise 8.3.6 Solve the following exact stochastic dierential equations

(a) dXt = et dt + (Wt2 t)dWt , X0 = 1;
(b) dXt = (sin t)dt + (Wt2 t)dWt , X0 = 1;
2 Wt 2t
(c) dXt = t dt + e dWt , X0 = 0;
(d) dXt = tdt + et/2 (cos W t ) dWt , X0 = 1.
Exercise 8.3.7 Verify the closeness condition and then solve the following
exact stochastic dierential equations

(a) dXt = Wt + 32 Wt2 dt + (t + Wt3 )dWt , X0 = 0;
(b) dXt = 2tWt dt + (t2 + Wt )dWt , X0 = 0;

(c) dXt = et Wt + 12 cos Wt dt + (et + sin Wt )dWt , X0 = 0;
(d) dXt = eWt (1 + 2t )dt + teWt dWt , X0 = 2.
8.4 Integration by Inspection

When solving a stochastic dierential equation by inspection we look for op-
portunities to apply the product or the quotient formulas:
d(f (t)Yt ) = f (t) dYt + Yt df (t)
X f (t)dX X df (t)
t t t
d = .
f (t) f (t)2
For instance, if a stochastic dierential equation can be written as
dXt = f (t)Wt dt + f (t)dWt ,

the product rule brings the equation into the exact form

dXt = d f (t)Wt ,
which after integration leads to the solution
Xt = X0 + f (t)Wt .
Example 8.4.1 Solve
dXt = (t + Wt2 )dt + 2tWt dWt , X0 = a.
We can write the equation as
dXt = Wt2 dt + t(2Wt dWt + dt),
which can be contracted to
dXt = Wt2 dt + td(Wt2 ).
Using the product rule we can bring it to the exact form
dXt = d(tWt2 ),
with the solution Xt = tWt2 + a.
dXt = (Wt + 3t2 )dt + tdWt .
If we rewrite the equation as
dXt = 3t2 dt + (Wt dt + tdWt ),
we note the exact expression formed by the last two terms Wt dt + tdWt =
d(tWt ). Then
dXt = d(t3 ) + d(tWt ),
which is equivalent to d(Xt ) = d(t3 + tWt ). Hence Xt = t3 + tWt + c, c R.
e2t dXt = (1 + 2Wt2 )dt + 2Wt dWt .

Multiply by e2t to get

dXt = e2t (1 + 2Wt2 )dt + 2e2t Wt dWt .
After regrouping, this becomes
dXt = (2e2t dt)Wt2 + e2t (2Wt dWt + dt).
Since d(e2t ) = 2e2t dt and d(Wt2 ) = 2Wt dWt +dt, the previous relation becomes
dXt = d(e2t )Wt2 + e2t d(Wt2 ).
By the product rule, the right side becomes exact
dXt = d(e2t Wt2 ),
and hence the solution is Xt = e2t Wt2 + c, c R.
Example 8.4.4 Solve the equation
t3 dXt = (3t2 Xt + t)dt + t6 dWt , X1 = 0.
The equation can be written as
t3 dXt 3Xt t2 dt = tdt + t6 dWt .
Divide by t6
t3 dXt Xt d(t3 )
= t5 dt + dWt .
(t3 )2
Applying the quotient rule yields
X t4
t
d 3 = d + dWt .
t 4
Integrating between 1 and t, yields
Xt t4
= + Wt W1 + c
t3 4
so
1
Xt = ct3 + t3 (Wt W1 ), c R.
4t
Using X1 = 0 yields c = 1/4 and hence the solution is
1 3 1
Xt = t + t3 (Wt W1 ), c R.
4 t
Exercise 8.4.5 Solve the following stochastic dierential equations by the in-
spection method
(a) dXt = (1 + Wt )dt + (t + 2Wt )dWt , X0 = 0;
2 3
(b) t dXt = (2t Wt )dt + tdWt , X1 = 0;
t/2 1
(c) e dXt = 2 Wt dt + dWt , X0 = 0;
(d) dXt = 2tWt dWt + Wt dt,2 X0 = 0;

1
(e) dXt = 1 + 2t Wt dt + t dWt , X1 = 0.
8.5 Linear Stochastic Dierential Equations

Consider the stochastic dierential equation with the drift term linear in Xt

dXt = (t)Xt + (t) dt + b(t, Wt )dWt , t 0.
This can also be written as
dXt (t)Xt dt = (t)dt + b(t, Wt )dWt .

t
Let A(t) = 0 (s) ds. Multiplying by the integrating factor eA(t) , the left
side of the previous equation becomes an exact expression

eA(t) dXt (t)Xt dt = eA(t) (t)dt + eA(t) b(t, Wt )dWt

A(t)
d e Xt = eA(t) (t)dt + eA(t) b(t, Wt )dWt .
Integrating yields
t t
A(t) A(s)
e Xt = X0 + e
(s) ds + eA(s) b(s, Ws ) dWs
0 0
t t
A(s)
Xt = X0 eA(t)
+eA(t)
e (s) ds + eA(s) b(s, Ws ) dWs .
0 0
The rst integral within the previous parentheses is a Riemann integral, and
the latter one is an Ito stochastic integral. Sometimes, in practical applications
these integrals can be computed explicitly.
When b(t, Wt ) = b(t), the latter integral becomes a Wiener integral. In
this case the solution Xt is Gaussian with mean and variance given by
t
E[Xt ] = X0 eA(t) + eA(t) eA(s) (s) ds
0
t
V ar[Xt ] = e2A(t) e2A(s) b(s)2 ds.
0
Another important particular case is when (t) = = 0, (t) = are

constants and b(t, Wt ) = b(t). The equation in this case is

dXt = Xt + dt + b(t)dWt , t 0,
and the solution takes the form

t

Xt = X0 e t
+ (et 1) + e(ts) b(s) dWs .
0
Example 8.5.1 Solve the linear stochastic dierential equation
dXt = (2Xt + 1)dt + e2t dWt .
Write the equation as
dXt 2Xt dt = dt + e2t dWt
and multiply by the integrating factor e2t to get

d e2t Xt = e2t dt + dWt .
Integrate between 0 and t and multiply by e2t , to obtain

t t
Xt = X0 e2t + e2t e2s ds + e2t dWs
0 0
1
= X0 e + (e2t 1) + e2t Wt .
2t
2
dXt = (2 Xt )dt + et Wt dWt .
Multiplying by the integrating factor et yields
et (dXt + Xt dt) = 2et dt + Wt dWt .
Since et (dXt + Xt dt) = d(et Xt ), integrating between 0 and t we get

t t
t t
e Xt = X0 + 2e dt + Ws dWs .
0 0
Dividing by et and performing the integration yields

1
Xt = X0 et + 2(1 et ) + et (Wt2 t).
2
1
dXt = ( Xt + 1)dt + et cos Wt dWt .
2
Write the equation as
1
dXt Xt dt = dt + et cos Wt dWt
2
and multiply by the integrating factor et/2 to get
d(et/2 Xt ) = et/2 dt + et/2 cos Wt dWt .
Integrating yields
t t
et/2 Xt = X0 + es/2 ds + es/2 cos Ws dWs .
0 0
Multiply by et/2 and use formula (7.3.10) to obtain the solution
Xt = X0 et/2 + 2(et/2 1) + et sin Wt .
Exercise 8.5.4 Solve the following linear stochastic dierential equations

(a) dXt = (4Xt 1)dt + 2dWt ;
(b) dXt = (3Xt 2)dt + e3t dWt ;
(c) dXt = (1 + Xt )dt + et Wt dWt ;
(d) dXt = (4Xt + t)dt + e4t dWt ;

(e) dXt = t + 12 Xt dt + et sin Wt dWt ;
(f ) dXt = Xt dt + et dWt .
In the following we present an important example of stochastic dierential

equation, which can be solved by the method presented in this section.
Proposition 8.5.5 (The mean-reverting Ornstein-Uhlenbeck process)

Let m and be two constants. Then the solution Xt of the stochastic equation
dXt = (m Xt )dt + dWt (8.5.11)
is given by
t
Xt = m + (X0 m)et + est dWs . (8.5.12)
0
Xt is Gaussian with mean and variance given by
E[Xt ] = m + (X0 m)et

2
V ar(Xt ) = (1 e2t ).
2
Proof: Adding Xt dt to both sides and multiplying by the integrating factor
et we get
d(et Xt ) = met dt + et dWt ,
which after integration yields

t
et Xt = X0 + m(et 1) + es dWs .
0
Hence
t
t t t
Xt = X0 e +me
+ e es dWs
0
t
= m + (X0 m)et + est dWs .
0
Since Xt is the sum between a predictable function and a Wiener integral,

then we can use Proposition 5.6.1 and it follows that Xt is Gaussian, with
t
t
E[Xt ] = m + (X0 m)e + E est dWs = m + (X0 m)et
0
t t
V ar(Xt ) = V ar est dWs = 2 e2t e2s ds
0 0
e2t 1 1
= 2 e2t = 2 (1 e2t ).
2 2
The name mean-reverting comes from the fact that
lim E[Xt ] = m.
t
The variance also tends to zero exponentially, lim V ar[Xt ] = 0. According to

t
Proposition 4.9.1, the process Xt tends to m in the mean square sense.
Proposition 8.5.6 (The Brownian bridge) For a, b R xed, the stochas-

tic dierential equation
b Xt
dXt = dt + dWt , 0 t < 1, X0 = a
1t
has the solution

t
1
Xt = a(1 t) + bt + (1 t) dWs , 0 t < 1. (8.5.13)
0 1s
The solution has the property lim Xt = b, almost certainly.

t1
Proof: If we let Yt = b Xt the equation becomes linear in Yt

1
dYt + Yt dt = dWt .
1t
1
Multiplying by the integrating factor (t) = 1t yields
Y 1
t
d = dWt ,
1t 1t
which leads by integration to
t
Yt 1
=c dWs .
1t 0 1s
Making t = 0 yields c = a b, so

b Xt t
1
=ab dWs .
1t 0 1s
Solving for Xt yields

t
1
Xt = a(1 t) + bt + (1 t) dWs , 0 t < 1.
0 1s
t 1
Let Ut = (1 t) 0 1s dWs . First we notice that

1
t
E[Ut ] = (1 t)E dWs = 0,
0 1s
t
2
t 1 2 1
V ar(Ut ) = (1 t) V ar dWs = (1 t) 2
ds
0 1s 0 (1 s)
1
= (1 t)2 1 = t(1 t).
1t
In order to show as-limt1 Xt = b, we need to prove

P ; lim Xt () = b = 1.
t1
Since Xt = a(1 t) + bt + Ut , it suces to show that

P ; lim Ut () = 0 = 1. (8.5.14)
t1
We evaluate the probability of the complementary event

P ; lim Ut () = 0 = P ; |Ut ()| > , t ,
t1
for some > 0. Since by Markovs inequality

V ar(Ut ) t(1 t)
P ; |Ut ()| > ) < 2
=
2
holds for any 0 t < 1, choosing t 1 implies that

P ; |Ut ()| > , t) = 0,
which implies (8.5.14).

The process (8.5.13) is called the Brownian bridge because it joins X0 = a
with X1 = b. Since Xt is the sum between a deterministic linear function in t
and a Wiener integral, it follows that it is a Gaussian process, with mean and
variance
E[Xt ] = a(1 t) + bt
V ar(Xt ) = V ar(Ut ) = t(1 t).
It is worth noting that the variance is maximum at the midpoint t = (b a)/2

and zero at the end points a and b.
Exercise 8.5.7 Show that the Brownian bridge (8.5.13) satises Xt

ms b as
t 1.
Exercise 8.5.8 Find Cov(Xs , Xt ), 0 < s < t for the following cases:
(a) Xt is a mean reverting Ornstein-Uhlenbeck process;
(b) Xt is a Brownian bridge process.
8.6 Stochastic Equations with respect to a Poisson

Process
Similar techniques can be applied in the case when the Brownian motion
process Wt is replaced by a Poisson process Nt with constant rate . For
instance, the stochastic dierential equation
dXt = 3Xt dt + e3t dNt

X0 = 1
can be solved multiplying by the integrating factor e3t to obtain
d(e3t Xt ) = dNt .
Integrating yields e3t Xt = Nt + 1, so the solution is Xt = e3t (1 + Nt ).

The following equation
dXt = (m Xt )dt + dNt
is similar to the equation dening the mean-reverting Ornstein-Uhlenbeck pro-

cess. As we shall see, in this case, the process is no more mean-reverting, but
it reverts to a certain constant. A similar method yields the solution
t
t t
Xt = m + (X0 m)e + e es dNs .
0
Since from Proposition 5.8.5 and Exercise 5.8.8 we have

t
t s
E e dNs = es ds = (et 1)
0 0
t
t s
V ar e dNs = e2s ds = (e2t 1),
0 0 2
it follows that
E[Xt ] = m + (X0 m)et + (1 et ) m +

2
V ar(Xt ) = (1 e2t ).
2
It is worth noting that in this case the process Xt is not Gaussian any more.
8.7 The Method of Variation of Parameters

Let us start by considering the following stochastic equation
dXt = Xt dWt , (8.7.15)
with constant. This is the equation which, in physics, is known to model

the linear noise. Dividing by Xt yields
dXt
= dWt .
Xt
Switch to the integral form

dXt
= dWt ,
Xt
and integrate blindly to get ln Xt = Wt + c, with c an integration constant.
This leads to the pseudo-solution
Xt = eWt +c .
The nomination pseudo stands for the fact that Xt does not satisfy the
initial equation. We shall nd a correct solution by letting the parameter c be
a function of t. In other words, we are looking for a solution of the following
type:
Xt = eWt +c(t) , (8.7.16)
where the function c(t) is subject to be determined. Using Itos formula we
get
dXt = d(eWt +c(t) ) = eWt +c(t) (c (t) + 2 /2)dt + eWt +c(t) dWt
= Xt (c (t) + 2 /2)dt + Xt dWt .
Substituting the last term from the initial equation (8.7.15) yields
dXt = Xt (c (t) + 2 /2)dt + dXt ,
which leads to the equation
c (t) + 2 /2 = 0
2
with the solution c(t) = 2 t + k. Substituting into (8.7.16) yields
2
Xt = eWt 2
t+k
.
The value of the constant k is determined by taking t = 0. This leads to

X0 = ek . Hence we have obtained the solution of the equation (8.7.15)
2
Xt = X0 eWt 2
t
.
Example 8.7.1 Use the method of variation of parameters to solve the stochas-
dXt = Xt dt + Xt dWt ,
with and constants.
After dividing by Xt we bring the equation into the equivalent integral form

dXt
= dt + dWt .
Xt
Integrate on the left blindly and get
ln Xt = t + Wt + c,
where c is an integration constant. We arrive at the following pseudo-

solution
Xt = et+Wt +c .
Assume the constant c is replaced by a function c(t), so we are looking for a
solution of the form
Xt = et+Wt +c(t) . (8.7.17)
Apply Itos formula and get
2
dXt = Xt + c (t) + dt + Xt dWt .
2
Subtracting the initial equation yields
2
c (t) + dt = 0,
2
2 2
which is satised for c (t) = 2 , with the solution c(t) = 2 t + k, k R.
Substituting into (8.7.17) yields the solution
2 2 2
Xt = et+Wt 2
t+k
= e( 2
)t+Wt +k
= X0 e( 2
)t+Wt
.
Exercise 8.7.2 Use the method of variation of parameters to solve the equa-
tion
dXt = Xt Wt dWt
by following the next two steps:
(a) Divide by Xt and integrate blindly to get the pseudo-solution
Wt2
2t +c
Xt = e 2 ,
with c constant.
(b) Consider c = c(t, Wt ) and nd a solution of type
Wt2
2t +c(t,Wt )
Xt = e 2 .
Example 8.7.3 (Langevin equation) Solve dXt = qXt dt + dWt , with q

constant.
We start solving the associated deterministic equation
dXt = qXt dt,

which has the solution Xt = Ceqt , with C constant. Now, look for a solution
of the type Xt = C(t)eqt , and determine the function C(t) such that the
initial equation is satised. Comparing

dXt = d C(t)eqt = qC(t)eqt dt + eqt dC(t)
= qXt dt + eqt dC(t)
with the initial stochastic dierential equation of Xt implies that C(t) satises
dC(t) = eqt dWt .

t
Integrating we obtain C(t) = C(0) + 0 eqs dWs , and hence Xt = C(0)eqt +
t
eqt 0 eqs dWs . It is not hard to see that C(0) = X0 , which enables us to
write the nal solution as
t
qt qt
Xt = X0 e + e eqs dWs .
0
Exercise 8.7.4 (the mean reverting Orstein-Uhlenbeck process) Use the

method of variation of constants to solve
dXt = ( Xr )dt + dWt ,
where and are constants.
8.8 Integrating Factors

The method of integrating factors can be applied to a class of stochastic dif-
ferential equations of the type
dXt = f (t, Xt )dt + g(t)Xt dWt , (8.8.18)
where f and g are continuous deterministic functions. The integrating factor

is given by t
1 t 2
t = e 0 g(s) dWs + 2 0 g (s) ds .
The equation can be brought into the following exact form
d(t Xt ) = t f (t, Xt )dt.
Substituting Yt = t Xt , we obtain that Yt satises the deterministic dierential

equation
dYt = t f (t, Yt /t )dt,
which can be solved by either integration or as an exact equation. We shall
exemplify the method of integrating factors with a few examples.

dXt = rdt + Xt dWt , (8.8.19)
with r and constants.
1 2
The integrating factor is given by t = e 2 tWt . Using Itos formula, we can
easily check that
dt = t (2 dt dWt ).
Using dt2 = dt dWt = 0, (dWt )2 = dt we obtain
dXt dt = 2 t Xt dt.
Multiplying by t , the initial equation becomes
t dXt t Xt dWt = rt dt,
and adding and subtracting 2 t Xt dt from the left side yields
t dXt t Xt dWt + 2 t Xt dt 2 t Xt dt = rt dt.
This can be written as
t dXt + Xt dt + dt dXt = rt dt,
which, by virtue of the product rule, becomes
d(t Xt ) = rt dt.
Integrating yields t
t Xt = 0 X0 + r s ds
0
and hence the solution is
t
1 r
Xt = X0 + s ds
t t 0
t
Wt 12 2 t 1 2
= X0 e +r e 2 (ts)+(Wt Ws ) ds.
0
Exercise 8.8.2 Let be a constant. Solve the following stochastic dierential

equations by the method of integrating factors
(a) dXt = Xt dWt ;
(b) dXt = Xt dt + Xt dWt ;
1
(c) dXt = dt + Xt dWt , X0 > 0.
Xt
Exercise 8.8.3 Let Xt be thesolution of the stochastic equation dXt = Xt dWt ,
with constant. Let At = 1t 0 Xs dWs be the stochastic average of Xt . Find
t
the stochastic equation satised by At , the mean and variance of At .

8.9 Existence and Uniqueness

The following theorem is the analog of Picards result from ordinary dier-
ential equations. It states the existence and uniqueness of strong solutions
of stochastic dierential equations. The proof of the next theorem can be
found in ksendal [37]. For more results regarding existence and uniqueness
of strong solutions the reader is referred to Krylov and Zvonkin [29].
Theorem 8.9.1 (Existence and Uniqueness) Consider the stochastic dif-

ferential equation
dXt = b(t, Xt )dt + (t, Xt )dWt , X0 = c
where c is a constant and b and are continuous functions on [0, T ] R

satisfying
1. |b(t, x)| + |(t, x)| C(1 + |x|); x R, t [0, T ]
2. |b(t, x) b(t, y)| + |(t, x) (t, y)| K|x y|, x, y R, t [0, T ]
with C, K positive constants. Let Ft = {Ws ; s t}. Then there is a unique
solution process Xt that is continuous and Ft -adapted and satises
T
E Xt2 dt < .
0
The rst condition says that the drift and volatility increase no faster
than a linear function in x. This condition ensures that the solution Xt does
not explode in nite time, i.e. does not tend to for nite t. The second
conditions states that the functions are Lipschitz in the second argument; this
condition guarantees the solution uniqueness.
The following example deals with an exploding solution. Consider the
nonlinear stochastic dierential equation
dXt = Xt3 dt + Xt2 dWt , X0 = 1/a, (8.9.20)
where a is a nonzero constant. It is clear that condition 1. does not hold, since
the drift increases cubically.
We shall look for a solution of the type Xt = f (Wt ). Itos formula yields
1
dXt = f (Wt )dWt + f (Wt )dt.
2
Equating the coecients of dt and dWt in the last two equations yields
f (Wt ) = Xt2 = f (Wt ) = f (Wt )2 (8.9.21)

1
f (Wt ) = Xt3 = f (Wt ) = 2f (Wt )3 . (8.9.22)
2
We note that equation (8.9.21) implies (8.9.22) by dierentiation. So it suces

to solve only the ordinary dierential equation
f (x) = f (x)2 , f (0) = 1/a.
Separating and integrating we have

df 1
= ds = f (x) = .
f (x)2 ax
Hence a solution of equation (8.9.20) is
1
Xt = .
a Wt
Let Ta be the rst time the Brownian motion Wt hits a. Then the process Xt
is dened only for 0 t < Ta . Ta is a random variable with P (Ta < ) = 1
and E[Ta ] = , see section 4.3.
Example 8.9.2 Show that that the following stochastic dierential equations
have a unique (strong) solution, without solving the equations explicitly:
(a) dXt = Xt dt + dWt (Langevin equation);
(b) dXt = (m Xt ) dt + dWt (Mean reverting Ornstein-Uhlenbeck pro-
cess);
(c) dXt = Xt dWt (Linear noise);

(d) dXt = dt + X2t dWt (Squared Bessel process);
(e) dXt = Xt dt + Xt dWt (Geometric Brownian motion);

(f ) dXt = (0 + 1 Xt ) dt + X2t dWt (CIR process),
with m, , 0 , 1 and positive constants.
Example 8.9.3 Consider the stochastic dierential equation

1

dXt = 1+ Xt2 + Xt dt + 1 + Xt2 dWt , X0 = x0 .
2
(a) Solve the equation;

(b) Show that there is a unique solution.
8.10 Finding Mean and Variance

For most practical purposes, the most important information one needs to
know about a process is its mean and variance. These can be found directly
from the stochastic equation in some particular cases without solving explicitly
the equation. We shall deal with this problem in the present section.
Consider the process Xt satisfying (6.1.1). Then taking the expectation
in (8.1.2) and using the property of the Ito integral as a zero mean random
variable yields t
E[Xt ] = X0 + E[a(s, Ws , Xs )] ds. (8.10.23)
0
Applying the Fundamental Theorem of Calculus we obtain
d
E[Xt ] = E[a(t, Wt , Xt )].
dt
We note that Xt is not dierentiable, but its expectation E[Xt ] is. This equa-
tion can be solved exactly in a few particular cases.
1. If a(t, Wt , Xt ) = a(t), then dt d
E[Xt ] = a(t) with the exact solution
t
E[Xt ] = X0 + 0 a(s) ds.
2. If a(t, Wt , Xt ) = (t)Xt + (t), with (t) and (t) continuous determin-
istic function, then
d
E[Xt ] = (t)E[Xt ] + (t),
dt
which is a linear dierential equation in E[Xt ]. Its solution is given by
t
E[Xt ] = eA(t) X0 + eA(s) (s) ds , (8.10.24)
0
t
where A(t) = 0 (s) ds. It is worth noting that the expectation E[Xt ] does
not depend on the volatility term b(t, Wt , Xt ).
Exercise 8.10.1 If dXt = (2Xt + e2t )dt + b(t, Wt , Xt )dWt , then show that
E[Xt ] = e2t (X0 + t).
Proposition 8.10.2 Let Xt be a process satisfying the stochastic equation

dXt = (t)Xt dt + b(t)dWt .
Then the mean and variance of Xt are given by
E[Xt ] = eA(t) X0
t
2A(t)
V ar[Xt ] = e eA(s) b2 (s) ds,
0
t
where A(t) = 0 (s) ds.
Proof: The expression of E[Xt ] follows directly from formula (8.10.24) with
= 0. In order to compute the second moment we rst compute
(dXt )2 = b2 (t) dt;

d(Xt2 ) = 2Xt dXt + (dXt )2

= 2Xt (t)Xt dt + b(t)dWt + b2 (t)dt

= 2(t)Xt2 + b2 (t) dt + 2b(t)Xt dWt ,
where we used Itos formula. If we let Yt = Xt2 , the previous equation becomes

dYt = 2(t)Yt + b2 (t) dt + 2b(t) Yt dWt .
Applying formula (8.10.24) with (t) replaced by 2(t) and (t) by b2 (t),
yields t

E[Yt ] = e2A(t) Y0 + e2A(s) b2 (s) ds ,
0
t
E[Xt2 ] =e 2A(t) 2
X0 + e2A(s) b2 (s) ds .
0
It follows that the variance is

t
V ar[Xt ] = E[Xt2 ] (E[Xt ])2 = e2A(t) e2A(s) b2 (s) ds.
0
Remark 8.10.3 We note that the previous equation is of linear type. This
shall be solved explicitly in a future section.
The mean and variance for a given stochastic process can be computed by
working out the associated stochastic equation. We shall provide next a few
examples.
Example 8.10.4 Find the mean and variance of ekWt , with k constant.
From Itos formula

1
d(ekWt ) = kekWt dWt + k2 ekWt dt,
2
and integrating yields
t t
1
ekWt = 1 + k ekWs dWs + k2 ekWs ds.
0 2 0
Taking the expectations we have

t
1
E[ekWt ] = 1 + k2 E[ekWs ] ds.
2 0
If we let f (t) = E[ekWt ], then dierentiating the previous relations yields the
dierential equation
1
f (t) = k2 f (t)
2
2 t/2
with the initial condition f (0) = E[ekW0 ] = 1. The solution is f (t) = ek ,
and hence
2
E[ekWt ] = ek t/2 .
The variance is
2 t/2 2t
V ar(ekWt ) = E[e2kWt ] (E[ekWt ])2 = e4k ek
2 2
= ek t (ek t 1).
Example 8.10.5 Find the mean of the process Wt eWt .
We shall set up a stochastic dierential equation for Wt eWt . Using the product
formula and Itos formula yields
d(Wt eWt ) = eWt dWt + Wt d(eWt ) + dWt d(eWt )

1
= eWt dWt + (Wt + dWt )(eWt dWt + eWt dt)
2
1 Wt Wt Wt Wt
= ( Wt e + e )dt + (e + Wt e )dWt .
2
Integrating and using that W0 eW0 = 0 yields
t t
Wt 1 Ws Ws
Wt e = ( Ws e + e ) ds + (eWs + Ws eWs ) dWs .
0 2 0
Since the expectation of an Ito integral is zero, we have

t
1
E[Wt e ] =
Wt
E[Ws eWs ] + E[eWs ] ds.
0 2
Let f (t) = E[Wt eWt ]. Using E[eWs ] = es/2 , the previous integral equation
becomes t
1
f (t) = ( f (s) + es/2 ) ds.
0 2
Dierentiating yields the following linear dierential equation
1
f (t) = f (t) + et/2
2
with the initial condition f (0) = 0. Multiplying by et/2 yields the following
exact equation
(et/2 f (t)) = 1.
The solution is f (t) = tet/2 . Hence we obtained that
E[Wt eWt ] = tet/2 .
Exercise 8.10.6 Find (a) E[Wt2 eWt ]; (b) E[Wt ekWt ].
Example 8.10.7 Show that for any integer k 0 we have
(2k)! k
E[Wt2k ] = t , E[Wt2k+1 ] = 0.
2k k!
In particular, E[Wt4 ] = 3t2 , E[Wt6 ] = 15t3 .
From Itos formula we have

n(n 1) n2
d(Wtn ) = nWtn1 dWt + Wt dt.
2
Integrate and get

t
n(n 1) t
Wtn = n Wsn1 dWs + Wsn2 ds.
0 2 0
Since the expectation of the rst integral on the right side is zero, taking the
expectation yields the following recursive relation

n(n 1) t
E[Wt ] =
n
E[Wsn2 ] ds.
2 0
Using the initial values E[Wt ] = 0 and E[Wt2 ] = t, the method of mathematical
induction implies that E[Wt2k+1 ] = 0 and E[Wt2k ] = (2k)!2k k!
tk . The details are
left to the reader.
Exercise 8.10.8 (a) Is Wt4 3t2 an Ft -martingale?

(b) What about Wt3 ?
Example 8.10.9 Find E[sin Wt ].
From Itos formula

1
d(sin Wt ) = cos Wt dWt sin Wt dt,
2
then integrating yields

t t
1
sin Wt = cos Ws dWs sin Ws ds.
0 2 0
Taking expectations we arrive at the integral equation

1 t
E[sin Wt ] = E[sin Ws ] ds.
2 0
Let f (t) = E[sin Wt ]. Dierentiating yields the equation f (t) = 12 f (t) with
f (0) = E[sin W0 ] = 0. The unique solution is f (t) = 0. Hence
E[sin Wt ] = 0.
Exercise 8.10.10 Let be a constant. Show that

(a) E[sin(Wt )] = 0;
2
(b) E[cos(Wt )] = e t/2 ;
2
(c) E[sin(t + Wt )] = e t/2 sin t;
2
(d) E[cos(t + Wt )] = e t/2 cos t.
Exercise 8.10.11 Use the previous exercise and the denition of expectation
to show that

2 1/2
(a) ex cos x dx = 1/4 ;
e

2 2
(b) ex /2 cos x dx = .
e
Exercise 8.10.12 Using expectations show that

ax2 +bx b b2 /(4a)
(a) xe dx = e ;
a 2a

2 ax2 +bx 1 b2 b2 /(4a)

(b) x e dx = 1+ e ;
a 2a 2a
(c) Can you apply a similar method to nd a closed form expression for
the integral
2 +bx
xn eax dx?

Exercise 8.10.13 Using the result given by Example 8.10.7 show that
3 /2
(a) E[cos(tWt )] = et ;
(b) E[sin(tWt )] = 0;
(c) E[etWt ] = 0.
For general drift rates we cannot nd the mean, but in the case of concave
drift rates we can nd an upper bound for the expectation E[Xt ]. The following
classical result will be used for this purpose.
Lemma 8.10.14 (Gronwalls inequality) Let f (t) be a non-negative con-
tinuous function satisfying the inequality
t
f (t) C + M f (s) ds
0
for 0 t T , with C, M constants. Then
f (t) CeM t , 0 t T.
Proof: The proof follows Revuz and Yor [41]. Iterating the integral inequality
one gets
t
f (t) C + M f (s) ds
0
t s
C +M C+M f (u) du ds
0 0
t s
= C + M Ct + M 2 f (u) du ds
0 0
t
2
= C + M Ct + M t f (u) du.
0
Working inductively, we obtain the following inequality

t2 tn
f (t) C + M Ct + M 2 C + + M n C
2 n!

M n+1 tn t
+ f (u) du. (8.10.25)
n! 0
The last term tends to 0 as n , since

t
0 f (u) du t max f (u),
0 0ut
M n+1 tn
lim = 0.
n n!
Taking the limit in (8.10.25) it is not hard to obtain

M n tn
f (t) C = CeM t .
n!
k=0
Proposition 8.10.15 Let Xt be a continuous stochastic process such that
dXt = a(Xt )dt + b(t, Wt , Xt ) dWt ,
with the function a() satisfying the following conditions

1. a(x) 0, for 0 x T ;
2. a (x) < 0, for 0 x T ;
3. a (0) = M .
Then E[Xt ] X0 eM t , for 0 Xt T .
Proof: From the mean value theorem there is (0, x) such that
a(x) = a(x) a(0) = (x 0)a () xa (0) = M x, (8.10.26)
where we used that a (x) is a decreasing function. Applying Jensens inequality

for concave functions yields
E[a(Xt )] a(E[Xt ]).
Combining with (8.10.26) we obtain E[a(Xt )] M E[Xt ]. Substituting in the

identity (8.10.23) implies
t
E[Xt ] X0 + M E[Xs ] ds.
0
Applying Gronwalls inequality we obtain E[Xt ] X0 eM t .
Exercise 8.10.16 State the previous result in the particular case when a(x) =
sin x, with 0 x .
Not in all cases can the mean and the variance be obtained directly from
the stochastic equation. In these cases one may try to produce closed form
solutions. Some of these techniques were developed in the previous sections of
the current chapter.
Chapter 9
Applications of Brownian
Motion
This chapter deals with a surprising relation between stochastic dierential

equations and second order partial dierential equations. This will provide
a way of computing solutions of parabolic dierential equations, which is a
deterministic problem, by means of studying the transition probability density
of the underlying stochastic process.
9.1 The Directional Derivative

Consider a smooth curve x : [0, ) Rn , starting at x(0) = x0 with the
initial velocity v = x (0). Then the derivative of a function f : Rn R in the
direction v is dened by

f x(t) f (x) d
Dv f (x0 ) = lim = f x(t) .
t 0 t dt
t=0+
Applying the chain rule yields

n
f dxk (t)
n
f
(x0 )vk (0) = f (x0 ), v,
dt
Dv f (x0 ) = (x(t)) =
xk xk
k=1 t=0+
k=1
where f stands for the gradient of f and , denotes the scalar product.
The linear dierential operator Dv is called the directional derivative with
respect to the vector v. In the next section we shall extend this denition
to the case when the curve x(t) is replaced by an Ito diusion Xt ; in this
case the corresponding directional derivative will be a second order partial
dierential operator.
199
9.2 The Generator of an Ito Diusion

Let (Xt )t0 be a stochastic process with X0 = x0 . We shall consider an
operator that describes innitesimally the rate of change of a function which
depends smoothly on Xt .
More precisely, the generator of the stochastic process Xt is the second
order partial dierential operator A dened by
Ex [f (Xt )] f (x)
Af (x) = lim ,
t 0 t
for any smooth function (at least of class C 2 ) with compact support, i.e.
f : Rn R, f C02 (Rn ). Here Ex stands for the expectation operator given
the initial condition X0 = x, i.e.,

E [f (Xt )] = E[f (Xt )|X0 = x] =
x
f (y)pt (x, y) dy,
Rn
where pt (x, y) = p(x, y; t, 0) is the transition density of Xt , given X0 = x (the

initial value X0 is a deterministic value x).
In the following we shall nd the generator associated with the Ito diusion
dXt = b(Xt )dt + (Xt )dW (t), t 0, X0 = x, (9.2.1)

where W (t) = W1 (t), . . . , Wm (t) is an m-dimensional Brownian motion, with
b : Rn Rn and : Rn Rnm measurable functions.
The main tool used in deriving the formula for the generator A is Itos
formula in several variables. If Ft = f (Xt ), then using Itos formula we have
f 1 2f
dFt = (Xt ) dXti + (Xt ) dXti dXtj , (9.2.2)
xi 2 xi xj
i i,j
where Xt = (Xt1 , , Xtn ) satises the Ito diusion (9.2.1) on components,

i.e.,
dXti = bi (Xt )dt + [(Xt ) dW (t)]i

= bi (Xt )dt + ik dWk (t). (9.2.3)
k
Using the stochastic relations dt2 = dt dWk (t) = 0 and dWk (t) dWr (t) = kr dt,
a computation provides
Applications of Brownian Motion 201

dXti dXtj = bi dt + ik dWk (t) bj dt + jk dWk (t)
k k

= ik dWk (t) jr dWr (t)
k r

= ik jr dWk (t)dWr (t) = ik jk dt
k,r k
= ( T )ij dt.
Therefore
dXti dXtj = ( T )ij dt. (9.2.4)
Substituting (9.2.3) and (9.2.4) into (9.2.2) yields
% &
1 2f f
dFt = (Xt )( T )ij + bi (Xt ) (Xt ) dt
2 xi xj xi
i,j i
f
+ (Xt )ik (Xt ) dWk (t).
xi
i,k
Integrate and obtain
t% f
&
1 2f T
Ft = F0 + ( )ij + bi (Xs ) ds
0 2 xi xj xi
i,j i
t f
+ ik (Xs ) dWk (s).
0 x i
k i
Since F0 = f (X0 ) = f (x) and Ex (f (x)) = f (x), applying the expectation

operator in the previous relation we obtain
% &
t 2f f
1
Ex [Ft ] = f (x) + Ex ( T )ij + bi (Xs ) ds . (9.2.5)
0 2 xi xj xi
i,j i
t
Using the commutativity between the operator Ex and the integral 0, apply-
ing lHospital rule (see Exercise 9.2.7), yields
Ex [Ft ] f (x) 1 2 f (x) f (x)

lim = ( T )ij + bk .
t 0 t 2 xi xj xk
i,j k
We conclude the previous computations with the following result.

Theorem 9.2.1 The generator of the Ito diusion (9.2.1) is given by
1 2
A= ( T )ij + bk . (9.2.6)
2 xi xj xk
i,j k
The matrix is called dispersion and the product T is called diusion

matrix. These names are related with their physical signicance. Substituting
(9.2.6) in (9.2.5) we obtain the following formula
t
E [f (Xt )] = f (x) + E
x x
Af (Xs ) ds , (9.2.7)
0
for any f C02 (Rn ).
Exercise 9.2.2 Find the generator operator associated with the n-dimensional
Brownian motion.
Exercise 9.2.3 Find the Ito diusion corresponding to the generator Af (x) =
f (x) + f (x).
Exercise 9.2.4 Let G = 12 (x21 + x21 x22 ) be the Grushins operator.

(a) Find the diusion process associated with the generator G .
(b) Find the diusion and dispersion matrices and show that they are de-
generate.
Exercise 9.2.5 Let Xt and Yt be two one-dimensional independent Ito diu-

sions with innitesimal generators AX and AY . Let Zt = (Xt , Yt ) with the
innitesimal generator AZ . Show that AZ = AX + AY .
Exercise 9.2.6 Let Xt be an Ito diusion with innitesimal generator AX .

Consider the process Yt = (t, Xt ). Show that the innitesimal generator of Yt
is given by AY = t + AX .
Exercise 9.2.7 Let Xt be an Ito diusion with X0 = x, and a smooth

function. Using lHospital rule, show that

1 t
lim Ex (Xs ) ds = (x).
t0+ t 0
9.3 Dynkins Formula

Formula (9.2.7) holds under more general conditions, when t is a stopping
time. First we need the following result, which deals with a continuity-type
property in the upper limit of an Ito integral.
Lemma 9.3.1 Let g be a bounded measurable function and be a stopping

time for Xt with E[ ] < . Then
k
lim E g(Xs ) dWs = E g(Xs ) dWs ; (9.3.8)
k 0 0
k
lim E g(Xs ) ds = E g(Xs ) ds . (9.3.9)
k 0 0
Proof: Let |g| < K. Using the properties of Ito integrals, we have
k 2 2
E g(Xs ) dWs g(Xt ) dWs =E g(Xs ) dWs
0 0 k

= E g2 (Xs ) ds K 2 E[ k] 0, k .
k
Since E[X 2 ] E[X]2 , it follows that

k
E g(Xs ) dWs g(Xt ) dWs 0, k ,
0 0
which is equivalent to relation (9.3.8).

The second relation can be proved similarly and is left as an exercise for
the reader.
Exercise 9.3.2 Assume the hypothesis of the previous lemma. Let 1{s< } be
the characteristic function of the interval (, )

1, if u <
1{s< } (u) =
0, otherwise.
Show that
k k
(a) g(Xs ) dWs = 1{s< } g(Xs ) dWs ,
0 0
k k
(b) g(Xs ) ds = 1{s< } g(Xs ) ds.
0 0
Theorem 9.3.3 (Dynkins formula) Let f C02 (Rn ), and Xt be an Ito

diusion starting at x. If is a stopping time with E[ ] < , then

Ex [f (X )] = f (x) + Ex Af (Xs ) ds , (9.3.10)
0
where A is the innitesimal generator of Xt .
Proof: Replace t by k and f by 1{s< } f in (9.2.7) and obtain

k
E[1s< f (Xk )] = 1{s< } f (x) + E A(1{s< } f )(Xs ) ds ,
0
which can be written as

k
E[f (Xk )] = 1{s< } f (x) + E 1{s< } (s)A(f )(Xs ) ds
0
k
= 1{s< } f (x) + E A(f )(Xs ) ds . (9.3.11)
0
Since by Lemma 9.3.1
E[f (Xk )] E[f (X )], k

k
E A(f )(Xs ) ds E A(f )(Xs ) ds , k ,
0 0
using Exercise 9.3.2 and relation (9.3.11) yields (9.3.10).
Exercise 9.3.4 Write Dynkins formula for the case of a function f (t, Xt ).
Use Exercise 9.2.6.
More details in this direction can be found in Dynkin [16]. In the following
sections we shall present a few important results of stochastic calculus that
can be obtained as direct consequences of Dynkins formula.
9.4 Kolmogorovs Backward Equation

For any function f C02 (Rn ) let v(t, x) = Ex [f (Xt )], given that X0 = x. The
operator E denotes the expectation given the initial condition X0 = x. Then
v(0, x) = f (x), and dierentiating in Dynkins formula (9.2.7)
t
v(t, x) = f (x) + Ex [Af (Xs )] ds
0
provides
v
= Ex [Af (Xt )].
t
Since we are allowed to dierentiate inside of an integral, the operators A and
Ex commute
Ex [Af (Xt )] = AEx [f (Xt )].
Therefore
v
= Ex [Af (Xt )] = AEx [f (Xt )] = Av(t, x).
t
Hence, we arrive at the following result.
Theorem 9.4.1 (Kolmogorovs backward equation) For any f C02 (Rn )

the function v(t, x) = Ex [f (Xt )] satises the following Cauchys problem
v
= Av, t>0
t
v(0, x) = f (x),
where A denotes the generator of the Itos diusion (9.2.1).
Solving Kolmogorovs backward equation is a problem of partial dierential

equations. The reader interested in several methods for solving this equation
can consult the book of Calin et al. [10].
9.5 Exit Time from an Interval

Let Xt = x0 + Wt be a one-dimensional Brownian motion starting at x0 , with
x0 (a, b). Consider the exit time of the process Xt from the strip (a, b)
= inf{t > 0; Xt (a, b)}.
Assuming E[ ] < 0, applying Dynkins formula yields

1 d2
E f (X ) = f (x0 ) + E f (Xs ) ds . (9.5.12)
0 2 dx2
Choosing f (x) = x in (9.5.12) we obtain
E[X ] = x0 . (9.5.13)
Exercise 9.5.1 Prove relation (9.5.13) using the Optional Stopping Theorem
for the martingale Xt .
Let pa = P (X = a) and pb = P (X = b) be the exit probabilities of Xt

from the interval (a, b). Obviously, pa + pb = 1, since the probability that the
Brownian motion will stay forever inside the bounded interval is zero. Using
the expectation denition, relation (9.5.13) yields
apa + b(1 pa ) = x0 .
Solving for pa and pb we get the following exit probabilities

b x0
pa = (9.5.14)
ba
x0 a
pb = 1 pa = . (9.5.15)
ba
It is worth noting that if b then pa 1 and if a then pb 1.
This can be stated by saying that a Brownian motion starting at x0 reaches
any level (below or above x0 ) with probability 1.
Next we shall compute the mean of the exit time, E[ ]. Choosing f (x) = x2
in (9.5.12) yields
E[(X )2 ] = x20 + E[ ].
From the denition of the mean and formulas (9.5.14)-(9.5.15) we obtain
b x0 x0 a
E[ ] = a2 pa + b2 pb x20 = a2 + b2 x0
ba ba
1 2
= ba ab2 + x0 (b a)(b + a) x20
ba
= ab + x0 (b + a) x20
= (b x0 )(x0 a). (9.5.16)
Exercise 9.5.2 (a) Show that the equation x2 (b a)x + E[ ] = 0 cannot

have complex roots;
(b a)2
(b) Prove that E[ ] ;
4
(c) Find the point x0 (a, b) such that the expectation of the exit time, E[ ],
is maximum.
9.6 Transience and Recurrence of Brownian Motion

We shall consider rst the expectation of the exit time from a ball. Then we
shall extend it to an annulus and compute the transience probabilities.

1. Consider the process Xt = a + W (t), where W (t) = W1 (t), . . . , Wn (t)
is an n-dimensional Brownian motion, and a = (a1 , . . . , an ) Rn is a xed
Figure 9.1: The Brownian motion Xt in the ball B(0, R).
vector, see Fig. 9.1. Let R > 0 be such that R > |a|. Consider the exit time
of the process Xt from the ball B(0, R)
= inf{t > 0; |Xt | R}. (9.6.17)
Assuming E[ ] < and letting f (x) = |x|2 = x21 + + x2n in Dynkins

1
formula

E f (X ) = f (x) + E f (Xs ) ds
0 2
yields

2 2
R = |a| + E n ds ,
0
and hence
R2 |a|2
E[ ] = . (9.6.18)
n
In particular, if the Brownian motion starts from the center, i.e. a = 0, the
expectation of the exit time is
R2
E[ ] = .
n
We make a few remarks:
(i) Since R2 /2 > R2 /3, the previous relation implies that it takes longer for
a Brownian motion to exit a disk of radius R rather than a ball of the same
radius.
(ii) The probability that a Brownian motion leaves the interval (R, R) is
twice the probability that a 2-dimensional Brownian motion exits the disk
B(0, R).
Exercise 9.6.1 Prove that E[ ] < , where is given by (9.6.17).
Exercise 9.6.2 Apply the Optional Stopping Theorem for the martingale Wt =
Wt2 t to show that E[ ] = R2 , where
= inf{t > 0; |Wt | R}
is the rst exit time of the Brownian motion from (R, R).
2. Let b Rn such that b

/ B(0, R), i.e. |b| R, and consider the annulus
Ak = {x; R < |x| < kR}
where k > 0 such that b Ak . Consider the process Xt = b + W (t) and let
k = inf{t > 0; Xt
/ Ak }
be the rst exit time of Xt from the annulus Ak . Let f : Ak R be dened

by '
ln |x|, if n = 2
f (x) = 1
, if n > 2.
|x|n2
A straightforward computation shows that f = 0. Substituting into Dynkins
formula
k 1
E f (Xk ) = f (b) + E f (Xs ) ds
0 2
yields
E f (Xk ) = f (b). (9.6.19)
This can be stated by saying that the value of f at a point b in the annulus is
equal to the expected value of f at the rst exit time of a Brownian motion
starting at b.
Since |Xk | is a random variable with two outcomes, we have

E f (Xk ) = pk f (R) + qk f (kR),
where pk = P (|Xk | = R), qk = P (|XXk |) = kR and pk +qk = 1. Substituting

in (9.6.19) yields
pk f (R) + qk f (kR) = f (b). (9.6.20)
There are two distinguished cases:
(i) If n = 2 we obtain
pk ln R qk (ln k + ln R) = ln b.
Using pk = 1 qk , solving for pk yields
ln( Rb )
pk = 1 .
ln k
Hence
P ( < ) = lim pk = 1,
k
where = inf{t > 0; |Xt | R} is the rst time Xt hits the ball B(0, R).
Hence in R2 a Brownian motion hits with probability 1 any ball. This is
stated equivalently by saying that the Brownian motion is recurrent in R2 .
(ii) If n > 2 the equation (9.6.20) becomes
pk qk 1
n2
+ n2 n2 = n2 .
R k R b
Taking the limit k yields
R n2
lim pk = < 1.
k b
Then in Rn , n > 2, a Brownian motion starting outside of a ball hits it with
a probability less than 1. This is usually stated by saying that the Brownian
motion is transient.
3. We shall recover the previous results using the n-dimensional Bessel process

Rt = dist 0, W (t) = W1 (t)2 + + Wn (t)2 .
Consider the process Yt = + Rt , with 0 < R, see section 3.7. It can be
shown that the generator of Yt is the Bessel operator of order n, see Example
10.2.4
1 d2 n1 d
A= 2
+ .
2 dx 2x dx
Consider the exit time
= {t > 0; Yt R}.
Applying Dynkins formula

E f (Y ) = f (Y0 ) + E (Af )(Ys ) ds
0
Figure 9.2: The Brownian motion Xt in the annulus Ar,R .

for f (x) = x2 yields R2 = 2 + E 0 n ds . This leads to
R2 2
E[ ] =
n
which recovers (9.6.18) with = |a|.
In the following assume n 3 and consider the annulus
Ar,R = {x Rn ; r < |x| < R}.
Consider the stopping time = inf{t > 0; Xt / Ar,R } = inf{t > 0; Yt /

(r, R)}, where |Y | = (r, R). Applying Dynkins formula for f (x) = x 2n
0
yields E f (Y = f (). This can be written as
pr r 2n + pR R2n = 2n ,
where
pr = P (|Xt | = r), pR = P (|Xt | = R), pr + pR = 1.
Solving for pr and pR yields
R n2 r n2
1 1
pr = R n2 , pR = n2 .
r 1 r
R 1
The transience probability is obtained by taking the limit to innity
2n Rn2 1 r n2
pr = lim pr,R = lim = ,
R R r 2n Rn2 1
where pr is the probability that a Brownian motion starting outside the ball
of radius r will hit the ball, see Fig. 9.2.
Exercise 9.6.3 Solve the equation 12 f (x) + n1

2x f (x) = 0 by looking for a
solution of monomial type f (x) = xk .
9.7 Application to Parabolic Equations

This section deals with solving rst and second order parabolic equations using
the integral of the cost function along a certain characteristic solution. The
rst order equations are related to predictable characteristic curves, while the
second order equations depend on stochastic characteristic curves.
9.7.1 Deterministic characteristics

Let (s) be the solution of the following one-dimensional ODE
dX(s)
= a s, X(s) , tsT
ds
X(t) = x,
and dene the cumulative cost between t and T along the solution
T

u(t, x) = c s, (s) ds, (9.7.21)
t
where c denotes a continuous cost function. Dierentiate both sides with

respect to t
T

u t, (t) = c s, (s) ds
t t t

t u + x u (t) = c t, (t) .
Hence (9.7.21) is a solution of the following nal value problem
t u(t, x) + a(t, x)x u(t, x) = c(t, x)

u(T, x) = 0.
It is worth mentioning that this is a variant of the method of characteristics.1

The curve given by the solution (s) is called a characteristic curve.
Exercise 9.7.1 Using the previous method solve the following nal boundary
problems:
(a)
t u + xx u = x
u(T, x) = 0.
(b)
t u + txx u = ln x, x>0
u(T, x) = 0.
9.7.2 Stochastic characteristics

Consider the Ito diusion
dXs = a(s, Xs )ds + b(s, Xs )dWs , tsT

Xt = x,
and dene the stochastic cumulative cost function

T
u(t, Xt ) = c(s, Xs ) ds, (9.7.22)
t
with the conditional expectation

u(t, x) = E u(t, Xt )|Xt = x
T
= E c(s, Xs ) ds|Xt = x .
t
Taking increments on both sides of (9.7.22) yields

T
du(t, Xt ) = d c(s, Xs ) ds.
t
Applying Itos formula on one side and the Fundamental Theorem of Calculus
on the other, we obtain
1
t u(t, x)dt + x u(t, Xt )dXt + x2 u(t, t, Xt )dXt2 = c(t, Xt )dt.
2
1
This is a well known method of solving linear partial dierential equations.
Taking the expectation E[ |Xt = x] on both sides yields

1
t u(t, x)dt + x u(t, x)a(t, x)dt + x2 u(t, x)b2 (t, x)dt = c(t, x)dt.
2
Hence, the expected cost
T
u(t, x) = E c(s, Xs ) ds|Xt = x
t
is a solution of the following second order parabolic equation

1
t u + a(t, x)x u + b2 (t, x)x2 u(t, x) = c(t, x)
2
u(T, x) = 0.
This represents the probabilistic interpretation of the solution of a parabolic

equation.
Exercise 9.7.2 Solve the following nal boundary problems:

(a)
1
t u + x u + x2 u = x
2
u(T, x) = 0.
(b)
1
t u + x u + x2 u = ex ,
2
u(T, x) = 0.
(c)
1
t u + xx u + 2 x2 x2 u = x,
2
u(T, x) = 0.
Chapter 10
Girsanovs Theorem and

Brownian Motion
After setting the basis in martingales, we shall prove Girsanovs theorem,

which is the main tool used in practice to eliminate drift. Then we present
Levys theorem with applications as well as as the time change for Brownian
motions.
10.1 Examples of Martingales

In this section we shall use the knowledge acquired in the previous chapters to
present a few important examples of martingales and some of their particular
cases. Some of these results will be useful later in the proof of Girsanovs
theorem.
Example 10.1.1 If v(s) is a continuous function on [0, T ], then

t
Xt = v(s) dWs
0
The integrability of Xt follows from
t 2
2 2
E[|Xt |] E[Xt ] = E v(s) dWs
0
t t
= E v(s)2 ds = v(s)2 ds < .
0 0
We note the continuity of v(s) can be replaced by the weaker condition v

L2 [0, T ].
215
Xt is obviously adapted to the information set Ft induced by the Brownian

motion Wt . Taking out the predictable part leads to
s t

E[Xt |Fs ] = E v( ) dW + v( ) dW Fs
0 s
t

= Xs + E v( ) dW Fs = Xs ,
s
t
where we used that s v( ) dW is independent of Fs and the conditional
expectation equals the usual expectation
t t

E v( ) dW Fs = E v( ) dW = 0.
s s
t
Example 10.1.2 Let Xt = v(s) dWs be a process as in Example 10.1.1.
0
Then
t
Mt = Xt2 v 2 (s) ds
0
The process Xt satises the stochastic equation dXt = v(t)dWt . By Itos

formula
d(Xt2 ) = 2Xt dXt + (dXt )2 = 2v(t)Xt dWt + v 2 (t)dt. (10.1.1)
Integrating between s and t yields

t t
2 2
Xt Xs = 2 X v( ) dW + v 2 ( ) d.
s s
Then separating the deterministic from the random part, we have

t

E[Mt |Fs ] = E Xt 2
v 2 ( ) d Fs
0
t s

2
= E Xt Xs 2 2 2
v ( ) d + Xs v ( ) d Fs
2
s 0
s t

= Xs 2 2 2
v ( ) d + E Xt Xs 2
v 2 ( ) d Fs
0 s
t
= Ms + 2E X v( ) dW Fs = Ms ,
s
t
where we used relation (10.1.1) and that X v( ) dW is independent of the
s
information set Fs .
Girsanovs Theorem and Brownian Motion 217
The integrability of Mt can be inferred from the following computation.

Since taking the expectation in
t t
Xt2 = 2 v( )X dW + v 2 ( ) d
0 0
yields t
E[Xt2 ] = v 2 ( ) d,
0
which leads to the following estimation
t t
E[|Mt |] E[Xt2 ] + v 2 ( ) d 2 v 2 ( ) d < .
0 0
In the following we shall mention a few particular cases.

1. If v(s) = 1, then Xt = Wt . In this case Mt = Wt2 t is an Ft -martingale.
t
2. If v(s) = s, then Xt = 0 s dWs , and hence
t 2 t3
Mt = s dWs
0 3
Example 10.1.3 Let u : [0, T ] R be a continuous function. Then

t t
u(s) dWs 12 u2 (s) ds
Mt = e 0 0
is an Ft -martingale for 0 t T .
Using Exercise 10.1.8 we obtain E[Mt ] = 1, so Mt is integrable. Consider

t
1 t 2
now the process Ut = u(s) dWs u (s) ds. Then
0 2 0
1
dUt = u(t)dWt u2 (t)dt
2
(dUt )2 = u(t)dt.
Then Itos formula yields

1
dMt = d(eUt ) = eUt dUt + eUt (dUt )2
2
1 2 1
= e u(t)dWt u (t)dt + u2 (t)dt
Ut
2 2
= u(t)Mt dWt .
Integrating between s and t yields

t
Mt = Ms + u( )M dW .
s
t
Since u( )M dW is independent of Fs , then
s
t
t
E u( )M dW |Fs = E u( )M dW = 0,
s s
and hence t
E[Mt |Fs ] = E[Ms + u( )M dW |Fs ] = Ms .
s
Remark 10.1.4 The condition that u(s) is continuous on [0, T ] can be relaxed
by asking only
t
2
u L [0, T ] = {u : [0, T ] R; measurable and |u(s)|2 ds < }.
0
It is worth noting that the conclusion still holds if the function u(s) is replaced
by a stochastic process u(t, ) satisfying Novikovs condition
1 T 2
E e 2 0 u (s,) ds < .
The previous process has a distinguished importance in the theory of martin-
gales and will be useful in the proof of Girsanov theorem.
Denition 10.1.5 Let u L2 [0, T ] be a deterministic function. Then the
stochastic process t
1 t 2
Mt = e 0 u(s) dWs 2 0 u (s) ds
is called the exponential process induced by u.
Particular cases of exponential processes In the following we shall con-

sider a few cases of particular interest:
2
1. Let u(s) = , constant, then Mt = eWt 2 t is an Ft -martingale.
2. Let u(s) = s. Integrating in d(tWt ) = tdWt Wt dt yields
t t
s dWs = tWt Ws ds.
0 0
t
Let Zt = 0 Ws ds be the integrated Brownian motion. Then
t t
s dWs 12 s2 ds
Mt = e 0 0
t3
= etWt 6 Zt
Example 10.1.6 Let Xt be a solution of dXt = u(t)dt + dWt , with u(s)

bounded function. Consider the exponential process
t t
Mt = e 0 0 . (10.1.2)
Then Yt = Mt Xt is an Ft -martingale.
Applying Itos formula we obtain dMt = u(t)Mt dWt . Then
dMt dXt = u(t)Mt dt.
The product rule yields
dYt = Mt dXt + Xt dMt + dMt dXt

= Mt u(t)dt + dWt Xt u(t)Mt dWt u(t)Mt dt

= Mt 1 u(t)Xt dWt .
Integrating between s and t leads to

t

Yt = Ys + M 1 u( )X dW .
s
t
Since M 1 u( )X dW is independent of Fs , we have
s

t t

E M 1 u( )X dW |Fs = E M 1 u( )X dW = 0,
s s
and hence
E[Yt |Fs ] = Ys .
1
Exercise 10.1.7 Prove that (Wt + t)eWt 2 t is an Ft -martingale.
Exercise 10.1.8 Let h be a continuous function. Using the properties of the

Wiener integral and log-normal random variables, show that
t
1 t 2
E e 0 h(s) dWs = e 2 0 h(s) ds .
Exercise 10.1.9 Let Mt be the exponential process (10.1.2). Use the previous
exercise to show that for any t > 0
t
u(s)2 ds
(a) E[Mt ] = 1 (b) E[Mt2 ] = e 0 .
Exercise 10.1.10 Let Ft = {Wu ; u t}. Show that the following processes
are Ft -martingales:
(a) et/2 cos Wt ;
(b) et/2 sin Wt .
Recall that
nthe Laplacian of a twice dierentiable function f is dened by
2
f (x) = j=1 xj f .
Example 10.1.11 Consider the smooth function f : Rn R, such that

(i) f = 0;

(ii) E |f (Wt )| < , t > 0 and x R.
Then the process Xt = f (Wt ) is an Ft -martingale.
Exercise 10.1.12 Let W1 (t) and W2 (t) be two independent Brownian mo-
tions. Show that Xt = eW1 (t) cos W2 (t) is a martingale.
Proposition 10.1.13 Let f : Rn R be a smooth function such that

(i) E |f (Wt )| < ;

t
(ii) E 0 |f (Ws )| ds < .
t
Then the process Xt = f (Wt ) 12 0 f (Ws ) ds is a martingale.
Proof: For 0 s < t we have

1 t
E Xt |Fs = E f (Wt )|Fs E f (Wu ) du|Fs
2 0
s t
1 1
= E f (Wt )|Fs f (Wu ) du E f (Wu )|Fs du.
2 0 s 2
Let p(t, y, x) be the probability density function of Wt . Integrating by parts
and using that p satises the Kolmogorovs backward equation, we have
1
1
E f (Wu )|Fs = p(u s, Ws , x) f (x) dx
2 2

1
= x p(u s, Ws , x)f (x) dx
2

= p(u s, Ws , x)f (x) dx.
u
Then, using the Fundamental Theorem of Calculus, we obtain
t t
1
E f (Wu )|Fs du = p(u s, Ws , x)f (x) dx du
2 u
s s
= p(t s, Ws , x)f (x) dx lim p(, Ws , x)f (x) dx

0

= E f (Wt )|Fs (x = Ws )f (x) dx

= E f (Wt )|Fs f (Ws ).

1 s
E Xt |Fs = E f (Wt )|Fs f (Wu ) du E f (Wt )|Fs + f (Ws )
2 0
s
1
= f (Ws ) f (Wu ) du
2 0
= Xs .
Hence Xt is an Ft -martingale.
Exercise 10.1.14 Use Proposition 10.1.13 to show that the following pro-
cesses are martingales:
(a) Xt = Wt2 t;
t
(b) Xt = Wt3 3 0 Ws ds;
1
t
(c) Xt = n(n1) Wtn 12 0 Wsn2 ds;
t
(d) Xt = ecWt 12 c2 0 eWs ds, with c constant;
t
(e) Xt = sin(cWt ) + 12 c2 0 sin(cWs ) ds, with c constant.
Exercise 10.1.15 Let f : Rn R be a function such that

(i) E |f (Wt )| < ;
(ii) f = f , constant.
t
Show that the process Xt = f (Wt )
2 0 f (Ws ) ds is a martingale.
10.2 How to Recognize a Brownian Motion

Many processes are disguised Brownian motions. How can we recognize them?
We already know that a Brownian motion Bt has the following properties:
(i) Is a continuous martingale with respect to {Bs ; s t};
(ii) Has the quadratic variation B, Bt = t, for t 0.
We state, without proof, a classical theorem due to Paul Levy, which is
a reciprocal of the foregoing result. The following theorem is a useful tool
to show that a one-dimensional process is a Brownian motion. For a more
general result and a proof, the reader can consult Durrett [15].
Theorem 10.2.1 (L evy) If Xt is a continuous martingale with respect to

the ltration Ft , with X0 = 0 and X, Xt = t, for all t 0, then Xt is an
Ft -Brownian motion.
Example 10.2.2 Let Bt be a Brownian motion and consider the process

t
Xt = sgn(Bs ) dBs ,
0
where
1, if x > 0
sgn(x) =
1, if x 0.
We note that dXt = sgn(Bt ) dBt and hence (dXt )2 = dt. Since Xt is a
continuous martingale (because it is an Ito integral) and its quadratic variation
is given by
t t
X, Xt = (dXs )2 = ds = t,
0 0
then Levys theorem implies that Xt is a Brownian motion.
Example 10.2.3 (The squared Bessel process) Let W1 (t), , Wn (t) be

n one-dimensional independent Brownian motions, and consider the
n-dimensional Bessel process

Rt = W1 (t)2 + + Wn (t)2 , n 2.
Dene the process

n t
Wi (s)
t = dWi (s).
0 Rs
i=1
Since the set {; Rt () = 0} has probability zero, the division by Rs does
not cause any problems almost surely. As a sum of Ito integrals, t is an
Ft -martingale, with the quadratic variation given by
n t t 2
Wi (s)2 Rs
, t = 2
ds = 2
ds = t.
0 Rs 0 Rs
i=1
By Levys theorem, t is an Ft -Brownian motion. It satises the following

equation
n
Wi (t)
dt = dWi (t). (10.2.3)
Rt
i=1
From Itos formula and an application of (10.2.3) we get

n
d(Rt2 ) = 2Wi (t) dWi (t) + n dt
i=1

n
Wi (t)
= 2Rt dWi (t) + n dt
Rt
i=1
= 2Rt dt + n dt.
Hence, the squared Bessel process, Zt = Rt2 , satises the following stochas-

dZt = 2 Zt dt + n dt, (10.2.4)
with t Brownian motion.
Example 10.2.4 (The Bessel process) From Example 10.2.3 we recall that
1/2
(dZt )2 = 4Rt2 dt. Then, using Rt = Zt , Itos formula yields
1 1/2 1 3/2
dRt = Zt dZt Zt (dZt )2
2 8
1 1 1 1
= (2Rt dt + ndt) dt
2 Rt 2 Rt
n1
= dt + dt.
2Rt
Hence the Bessel process satises the following stochastic dierential equation
n1
dRt = dt + dt, (10.2.5)
2Rt
where t is a Brownian motion. It is worth noting that the innitesimal gen-
erator of Rt is the operator
1 n1
A = x2 + x , (10.2.6)
2 2x
which is the Bessel operator of order n.
Example 10.2.5 Let f : R2 R be a continuous twice dierentiable func-

tion, and consider the process Xt = f (W1 (t), W2 (t)), with W1 (t), W2 (t) inde-
pendent one-dimensional Brownian motions. Itos formula implies
f f 1 2f 2f
dXt = dW1 (t) + dW2 (t) + + dt.
x1 x2 2 x21 x22
Then Xt is a continuous martingale if f is harmonic, i.e.
2f 2f
+ = 0. (10.2.7)
x21 x22
Then f 2 f 2
(dXt )2 = + dt = |f |2 dt,
x1 x2
so we have t t
2
X, Xt = (dXt ) = |f |2 ds.
0 0
Then the condition X, Xt = t, for any t 0, implies |f | = 1, i.e.
f 2 f 2
+ = 1. (10.2.8)
x1 x2
Equation (10.2.8) is called the eiconal equation. We shall show that if a func-
tion f satises both equations (10.2.7) and (10.2.8), then it is a linear func-
tion. Or, equivalently, a harmonic solution of the eiconal equation is a linear
function.
From equation (10.2.8) there is a continuous function = (x1 , x2 ) such
that
f f
= cos , = sin . (10.2.9)
x1 x2
The closeness condition implies
(cos ) (sin )
= ,
x2 x1

cos + sin = 0. (10.2.10)
x1 x2
Dierentiating in (10.2.9) with respect to x1 and x2 yields
2f 2f
= sin , = cos .
x21 x1 x22 x2
Adding and using (10.2.7) we obtain

cos sin = 0. (10.2.11)
x2 x1
Relations (10.2.10) and (10.2.11) can be written as a system

cos sin x1 0

= ,

sin cos 0
x2

with the solution = 0, = 0. Therefore, is a constant function. This
x1 x2
implies that f is linear, i.e. f (x1 , x2 ) = c1 x1 + c2 x2 , with ci R.
Exercise 10.2.6 If Wt and W t are two independent Brownian motions and

[1,
1] is a constant, use Levys theorem to show that the process Xt =
Wt + 1 2 W t is a Brownian motion (use that dWt dW t = 0).
t
Exercise 10.2.7 Consider the process Xt = 0 f (s) dWs , with f square inte-
grable function.
(a) Show t that Xt is a continuous martingale with the quadratic variation
2
X, Xt = 0 f (s) ds;
(b) Apply Levys theorem to nd all functions f (s) for which Xt is a Brow-
nian motion.
t
Exercise 10.2.8 Let Xt = et 0 es dWs , with Ws Brownian motion.
(a) Show that X, Xt = t for any t 0;
(b) Is Xt a Brownian motion?
The next exercise states that a Brownian motion is preserved by an or-

thogonal transform.
Exercise 10.2.9 Let Wt = (Wt1 , Wt2 ) be a Brownian motion in the plane (i.e.
Wti one-dimensional independent Brownian motions) and dene the process
Bt = (Bt1 , Bt2 ) by
Bt1 = cos Wt1 + sin Wt2

Bt2 = sin Wt1 + cos Wt2 ,
for a xed angle R.

(a) Show that Bt1 , Bt2 are Brownian motions;
(b) Prove that Bt1 , Bt2 are independent processes.
Exercise 10.2.10 Consider the process Yt = tW 1 , if t > 0 and Y0 = 0, where

t
Wt is a Brownian motion. Prove that Yt is a Brownian motion.
10.3 Time Change for Martingales

A martingale satisfying certain properties can always be considered as a Brow-
nian motion running at a modied time clock. The next result is provided
without proof. The interested reader can consult Karatzas and Shreve [26].
Theorem 10.3.1 (Dambis, Dubins and Schwarz, 1965) Let Mt be a con-

tinuous, square integrable Ft -martingale satisfying limt M, M t = , a.s.
Then Mt can be written as a time-transformed Brownian motion as
Mt = BM,M t ,
where Bt is a one-dimensional Brownian motion. Moreover, if we dene for

each s 0 the stopping time
T (s) = inf{t 0; M, M t > s},
then the time-changed process
Bs = MT (s)
is a Gs -Brownian motion, where Gs = FT (s) , 0 s.
Example 10.3.2 (Scaled Brownian motion) Let Wt be a Brownian mo-

tion and consider the process Xt = cWt , which is a continuous martingale.
Assume c = 0. Since (dXt )2 = (c dWt )2 = c2 dt, the quadratic variation of Xt
is t t
X, Xt = (dXs )2 = c2 ds = c2 t , t .
0 0
Then there is a Brownian motion W .t such that Xt = W .c2 t . This can also be
1. 2 .s = cWs/c2 . There-
written as c Wc2 t = Wt . Substituting s = c t, yields W
fore, if Ws is a Brownian motion, then the process cWs/c2 is also a Brownian
motion. In particular, if c = 1, then Ws is a Brownian motion.
Example 10.3.3 This is an application of the previous example. Let T > 0.

Using the scaling property of the Brownian motion and a change of variables,
we have the following identities in law
T T n T
Wtn dt = cWt/c2 dt = c n n
Wt/c 2 dt
0 0 0
T /c2 T /c2
= c n
Wsn c2 ds =c n+2
Wsn ds,
0 0

for any c > 0. Then set c = T and obtain the following identity
T 1
law 1+n/2
Wtn dt = T Wsn ds.
0 0
This relation can be easily veried for n = 1, when both sides are normally
distributed as N (0, T 3 /3), see section 3.3.
Example 10.3.4 It is known that the process Xt = Wt2 t is a continuous

martingale with respect to F = {Ws ; s t}. An application of Itos formula
yields
dXt = 2Wt dWt ,
so the quadratic variation is

t t
X, Xt = (dXs )2 = 4 Ws2 ds := t .
0 0
Therefore, there is a Brownian motion W .t . This is equiv-

.t such that Xt = W
alent to stating that given a Brownian motion Wt , its square can be written
as
.t .
Wt2 = t + W
Example 10.3.5 (Brownian Bridge) The Brownian bridge is provided by

t
1
formula (8.5.13). The Ito integral Mt = dWs can be written as
0 1s
1
a Brownian motion as in the following. Since (dMt )2 = dt, the
(1 t)2
quadratic variation becomes
t
1 t
M, M t = 2
ds = .
0 (1 s) 1t
Then there is a Brownian motion Bt such that Mt = B t , and hence

1t
t
1
(1 t) dWs = (1 t)B t = B
t(1t) ,
0 1s 1t
is also a Brownian motion. It follows that the Brownian bridge for-

where B
mula (8.5.13) can be written equivalently as
Xt = a(1 t) + bt + B
t(1t) , o t 1.
This process has the characteristics of a Brownian motion, while satisfying the
boundary conditions X0 = a and X1 = b.
Example 10.3.6 (Lampertis property) Let Bt be a Brownian motion and

consider the integrated geometric Brownian motion
t
At = e2Bs ds, t 0.
0
We note that the process At is continuous and strictly increasing in t, with

A0 = 0 and lim At = . Therefore, there is an inverse process Tu , i.e.
t
ATu = u, u 0.
Applying the chain rule yields the derivative

dTu
= e2BTu . (10.3.12)
du
Itos formula provides
1
deBt = eBt dBt + eBt dt,
2
which can be written equivalently as
t
1
eBt = 1 + eBs ds + Mt , (10.3.13)
2 0
t
where Mt = 0 eBs dBs . Since Mt is a continuous martingale, with
t t
2
M, M t = (dMs ) = e2Bs ds = At ,
0 0
by Theorem 10.3.1 there is a Brownian motion Wt such that Mt = WAt , or

equivalently, MTu = Wu , u 0. Then replacing t by Tu in equation (10.3.13)
yields
BTu 1 Tu Bs
e =1+ e ds + Wu , (10.3.14)
2 0
which can be written, after applying the chain rule, in dierential notation as
1 dTu
deBTu = eBTu du + dWu .
2 du
Substituting the derivative of Tu from (10.3.12), the foregoing relation becomes
1
deBTu = du + dWu .
2eBTu
Denoting Ru = eBTu , then
1
dRu = du + dWu ,
2Ru
which is equation (10.2.5) for n = 2, i.e. Ru is a Bessel process. Substituting
u = At and t = Tu in Ru = eBTu yields
eBt = RAt .
Hence, the geometric Brownian motion eBt is a time-transformed Bessel

process in the plane, RAt . For a generalized Lamperti property in the case of
a Brownian motion with drift, see Yor [48]. See also Lamperti [31].
Lemma 10.3.7 Let (t) be a continuous dierentiable, with (t) > 0, (0) =
0 and limt (t) = . Then given the one-dimensional Brownian motion
Bt , there is another one-dimensional Brownian motion Wt such that
t
W(t) = (s) dBs , t 0. (10.3.15)
0
t
Proof: The process Xt = 0 (s) dBs is a continuous martingale, with the
quadratic variation
t t t
2 2
X, Xt = (dXs ) =
( (s) dBs ) = (s) ds = (t).
0 0 0
Then by Theorem 10.3.1 there is a Brownian motion Wt such that W(t) = Xt .
Example 10.3.8 If (t) = c2 t, c = 0, then Lemma 10.3.7 yields Wc2 t = cBt .

This is equivalent to stating that for a given Brownian motion Bt , the process
Ws = cBs/c2 , s0
is also a Brownian motion.
Remark 10.3.9 The right side of (10.3.15) is a Wiener integral. Hence, under
certain conditions, a Wiener integral becomes a time-scaled Brownian motion.
Exercise 10.3.10 Let Bt be a Brownian motion. Prove that there is another

Brownian motion Wt such that
t 2
a s
Wea2t 1 = a e 2 dBs .
0
Exercise 10.3.11 (Ornstein-Uhlenbeck process) Consider the equation
dXt = qXt dt + dWt , X0 = 0,
with q and constants, with q > 0.

(a) Show that the solution is given by
t
Xt = eqt equ dWu ;
0
(b) Show that there is a Brownian motion Bt such that
Xt = eqt B(e2qt 1)/(2q) .

Exercise 10.3.12 Consider the equation

dXt = Xt dt + dWt , X0 = 0,
2
t such that
with constant. Show that there is a Brownian motion B
Xt = et/2 B
et .
Theorem 10.3.13 (Time change formula for Ito integrals) Let be a

function as in Lemma 10.3.7, and F a continuous function. Then given a
Brownian motion Bt , there is another Brownian motion Wt such that
1 ()
F (u) dWu = F ((t)) (t) dBt , a.s. (10.3.16)
1 ()
Proof: First, we will prove formula (10.3.16) informally, and then we will
check that the identity holds in law. Formula (10.3.15) can be written in the
equivalent dierential form as

dW(t) = (t) dBt .
Then for a continuous function g we have

b b
g(t)dW(t) = g(t) (t) dBt . (10.3.17)
a a
Using a change of variable, the left side integral becomes

b (b)
g(t)dW(t) = g(1 (u)) dWu . (10.3.18)
a (a)
Relations (10.3.17) and (10.3.18) imply

(b) b
1
g( (u)) dWu = g(t) (t) dBt .
(a) a
Substituting F = g 1 , = (a), and = (b) yields

1 ()
F (u) dWu = F ((t)) (t) dBt . (10.3.19)
1 ()
Each of the sides of formula (10.3.16)

1 ()
X= F (u) dWu , Y = F ((t)) (t) dBt
1 ()
is a random variable, which is given as a Wiener integral, so they are both

normally distributed. Therefore, in order to show that they are equal in law,
it suces to show that they have the same rst two moments.
From the properties of Wiener integrals we have that E[Xt ] = E[Yt ] = 0
and

V ar(X) = F 2 (u) du

1 () 2
V ar(Y ) = F ((t)) (t) dt
1 ()
1 ()
= F ((t))2 (t) dt = F 2 (u) du,
1 ()
so V ar(X) = V ar(Y ). Hence X = Y in law.
Example 10.3.14 For (t) = tan t, formula (10.3.16) becomes

tan1
F (u) dWu = F (tan t) sec t dBt .
tan1
1
If F (u) = and = 0, then we obtain
1 + u2
tan1
1 1
2
dW u = dBt ,
0 1+u 0 sec t
which, after substituting v = tan1 , implies
tan v v
1
dWu = cos t dBt .
0 1 + u2 0
Making v
2 yields
/2
1
dWu = cos t dBt .
0 1 + u2 0
For the sake of completeness, we include next a stochastic variant of Fu-

binis theorem. For a more general variant of this theorem, the reader is
referred to Ikeda and Watanabe [23].
Theorem 10.3.15 (Stochastic Fubini) If f : R+ R+ R is a bounded

measurable function, then
t T T t
f (s, r) dr dWs = f (s, t) dWs dr.
0 0 0 0
10.4 Girsanovs Theorem

In this section we shall present and prove a version of Girsanovs theorem,
which will suce for the purpose of proposed applications. The main usage of
Girsanovs theorem is the reduction of drift. Consequently, Girsanovs theorem
applies in nance where it shows that in the study of security markets the
dierences between the mean rates of return can be removed. For a gentle
introduction into this subject the interested reader can consult Baxter and
Rennie [5], or Neftici [35].
We shall recall rst a few basic notions. Let (, F, P ) be a probability
space. When dealing with an Ft -martingale on the aforementioned probability
space, the ltration Ft is considered to be the -algebra generated by the given
Brownian motion Wt , i.e. Ft = {Wu ; 0 u s}. By default, a martingale
is considered with respect to the probability measure P , in the sense that the
expectations involve an integration with respect to P

E [X] =
P
X()dP ().

We have not used the upper script until now since there was no doubt which
probability measure was used. In this section we shall also use another prob-
ability measure given by
dQ = MT dP,
where MT is an exponential process. This means that Q : F R is given by

Q(A) = dQ = MT dP, A F.
A A
Since MT > 0, M0 = 1, using the martingale property of Mt yields
Q(A) > 0, A = ;

Q() = MT dP = EP [MT ] = EP [MT |F0 ] = M0 = 1,

which shows that Q is a probability on F, and hence (, F, Q) becomes a

probability space. Furthermore, if X is a random variable, then

E [X] =
Q
X() dQ() = X()MT () dP ()

= E [XMT ].
P
The following result will play a central role in proving Girsanovs theorem:
Lemma 10.4.1 Let Xt be the Ito process
dXt = u(t)dt + dWt , X0 = 0, 0 t T,
with u(s) a bounded function. Consider the exponential process

t t
Mt = e 0 0 .
Then Xt is an Ft -martingale with respect to the measure
dQ() = MT ()dP ().
Proof: We need to prove that Xt is an Ft -martingale with respect to Q, so it

suces to show the following three properties:
1. Integrability of Xt . This part usually follows from standard manipulations
of norms estimations. We shall do it here in detail. Integrating in the equation
of Xt between 0 and t provides
t
Xt = u(s) ds + Wt . (10.4.20)
0
We start with an estimation of the expectation with respect to P

t 2 t
EP [Xt2 ] = EP u(s) ds + 2 u(s) ds Wt + Wt2
0 0
t 2 t
= u(s) ds + 2 u(s) ds EP [Wt ] + EP [Wt2 ]
0 0
t 2
= u(s) ds + t < , 0 t T,
0
where the last inequality follows from the norm estimation

t t t 1/2
u(s) ds |u(s)| ds t |u(s)|2 ds
0 0 0
T 1/2
t |u(s)|2 ds = T 1/2 uL2 [0,T ] .
0
Next we obtain an estimation with respect to Q

2
EQ [|Xt |]2 = |Xt |MT dP |Xt |2 dP MT2 dP

= EP [Xt2 ] EP [MT2 ] < ,
T u2 2
u(s)2 ds
since EP [Xt2 ] < and EP [MT2 ] = e 0 =e L [0,T ] , see Exercise 10.1.9.
2. Ft -mesurability of Xt . This follows from equation (10.4.20) and the fact

that Wt is Ft -measurable.
3. Conditional expectation of Xt . From Examples 10.1.3 and 10.1.6 recall that
for any 0 t T :
(i) Mt is an Ft -martingale with respect to probability measure P ;
(ii) Xt Mt is an Ft -martingale with respect to probability measure P.
We need to verify that
EQ [Xt |Fs ] = Xs , s t,
which can be written as

Xt dQ = Xs dQ, A Fs .
A A
Since dQ = MT dP , the previous relation becomes

Xt MT dP = Xs MT dP, A Fs .
A A
This can be written in terms of conditional expectation as

EP [Xt MT |Fs ] = EP [Xs MT |Fs ]. (10.4.21)
We shall prove this identity by showing that both terms are equal to Xs Ms .
Since Xs is Fs -predictable and Mt is a martingale, the right side term becomes
EP [Xs MT |Fs ] = Xs EP [MT |Fs ] = Xs Ms , s T.
Let s < t. Using the tower property (see Proposition 2.12.6, part 3), the left
side term becomes

EP [Xt MT |Fs ] = EP EP [Xt MT |Ft ]|Fs = EP Xt EP [MT |Ft ]|Fs

= EP Xt Mt |Fs = Xs Ms ,
where we used that Mt and Xt Mt are martingales and Xt is Ft -measurable.
Hence (10.4.21) holds and Xt is an Ft -martingale with respect to the proba-
bility measure Q.
Proposition 10.4.2 Consider the process

t
Xt = u(s) ds + Wt , 0 t T,
0
with u L2 [0, T ] a deterministic function, and let dQ = MT dP . Then

EQ [Xt2 ] = t.
t
Proof: Denote U (t) = 0 u(s) ds. Then
EQ [Xt2 ] = EP [Xt2 MT ] = EP [U 2 (t)MT + 2U (t)Wt MT + Wt2 MT ]

= U 2 (t)EP [MT ] + 2U (t)EP [Wt MT ] + EP [Wt2 MT ]. (10.4.22)
From Exercise 10.1.9 (a) we have EP [MT ] = 1. In order to compute EP [Wt MT ]

we use the tower property and the martingale property of Mt
EP [Wt MT ] = EP [EP [Wt MT |Ft ]] = EP [Wt EP [MT |Ft ]]

= EP [Wt Mt ]. (10.4.23)
Using the product rule
d(Wt Mt ) = Mt dWt + Wt dMt + dWt dMt

= Mt u(t)Mt Wt dWt u(t)Mt dt,
where we used dMt = u(t)Mt dWt . Integrating between 0 and t yields

t t

Wt Mt = Ms u(s)Ms Ws dWs u(s)Ms ds.
0 0
Taking the expectation and using the property of Ito integrals we have
t t
E[Wt Mt ] = u(s)E[Ms ] ds = u(s) ds = U (t). (10.4.24)
0 0
EP [Wt MT ] = U (t). (10.4.25)
For computing EP [Wt2 MT ] we proceed in a similar way
EP [Wt2 MT ] = EP [EP [Wt2 MT |Ft ]] = EP [Wt2 EP [MT |Ft ]]

= EP [Wt2 Mt ]. (10.4.26)
Using the product rule yields
d(Wt2 Mt ) = Mt d(Wt2 ) + Wt2 dMt + d(Wt2 )dMt

= Mt (2Wt dWt + dt) Wt2 u(t)Mt dWt

(2Wt dWt + dt) u(t)Mt dWt

= Mt Wt 2 u(t)Wt dWt + Mt 2u(t)Wt Mt dt.
Integrate between 0 and t

t t

Wt2 Mt = [Ms Ws 2 u(s)Ws ] dWs + Ms 2u(s)Ws Ms ds,
0 0
and take the expected value to get

t
2

E [Wt Mt ] =
P
E[Ms ] 2u(s)E[Ws Ms ] ds
0
t

= 1 + 2u(s)U (s) ds
0
= t + U 2 (t),
where we used (10.4.24). Substituting into (10.4.26) yields
EP [Wt2 MT ] = t + U 2 (t). (10.4.27)
Substituting (10.4.25) and (10.4.27) into relation (10.4.22) yields
EQ [Xt2 ] = U 2 (t) 2U (t)2 + t + U 2 (t) = t, (10.4.28)
which ends the proof of the proposition.
Now we are prepared to prove one of the most important results of Stochas-
tic Calculus.
Theorem 10.4.3 (Girsanovs Theorem) Let u L2 [0, T ] be a determin-
istic function. Then the process
t
Xt = u(s) ds + Wt , 0tT
0
is a Brownian motion with respect to the probability measure Q given by

T T
u(s) dWs 12 u(s)2 ds
dQ = e 0 0 dP.
Proof: In order to prove that Xt is a Brownian motion on the probability space
(, F, Q) we shall apply Levys characterization theorem, see Theorem 10.2.1.
Lemma 10.4.1 implies that the process Xt satises the following properties:
1. X0 = 0;
2. Xt is continuous in t;
3. Xt is a square integrable Ft -martingale on the space (, F, Q). Using
Proposition 10.4.2, the martingale property of Wt , and the additivity and the
tower property of expectations yields
EQ [(Xt Xs )2 ] = EQ [Xt2 ] 2EQ [Xt Xs ] + EQ [Xs2 ]
= t 2EQ [Xt Xs ] + s
= t 2EQ [EQ [Xt Xs |Fs ]] + s
= t 2EQ [Xs EQ [Xt |Fs ]] + s
= t 2EQ [Xs2 ] + s
= t 2s + s = t s.
4. The quadratic variation of Xt is

t t
2
X, Xt = (dXs ) = ds = t.
0 0
Choosing u(s) = , constant, we obtain the following consequence.
Corollary 10.4.4 Let Wt be a Brownian motion on the probability space

(, F, P ). Then the process
Xt = t + Wt , 0tT
is a Brownian motion on the probability space (, F, Q), where
1 2 T W
dQ = e 2 T
dP.
This result states that a Brownian motion with drift can be viewed as a regular
Brownian motion under a certain change of the probability measure.
Exercise 10.4.5 Show that for any random variable X on (, F) we have
EP [X] = EQ [XMT1 ].
Exercise 10.4.6 Show that:

(a) EP [f (t + Wt )] = EQ [f (Bt ) MT1 ] for any continuous function f ;
T T
(b) EP [ 0 et+Wt dt] = EQ [ 0 eBt dt MT1 ], where Bt is a Q-Brownian mo-
tion.
Proposition 10.4.7 (Reduction of drift formulas) Let Wt be a Brown-

ian motion and f a measurable function. Then
2 t
(i) E[f (t + Wt )] = e 2 E[f (Wt )eWt ];
2 t
(ii) E[f (Wt )] = e 2 E[f (t + Wt )eWt ].
Proof: (i) Let Wt be a Brownian motion on the space (, F, P ). By Girsanovs

theorem, the process Xt = t + Wt can be considered as a Brownian motion
on (, F, Q) and we have
2 T
E[f (t + Wt )] = EP [f (t + Wt )] = EQ [f (Xt )MT1 ] = EQ [f (Xt )e 2
+WT
]
2 T
+(XT T )
= EQ [f (Xt )e 2 ]
2
2T
= e EQ [f (Xt )eXt e(XT Xt ) ]
2 T
= e 2 EQ [f (Xt )eXt ] EQ [e(XT Xt ) ]
2 T 2
= e 2 EQ [f (Xt )eXt ]e 2
(T t)
2 t
= e 2 EQ [f (Xt )eXt ]
2
2 t
= e E[f (Wt )eWt ].
(ii) We apply Girsanovs theorem for the Q-Brownian motion Xt = t + Wt
and obtain
2 T
EQ [f (Xt )] = EP [f (t + Wt )MT ] = EP [f (t + Wt )e 2
WT
]
2
= EP [f (t + Wt )eWt e 2T (WT Wt )
]
2
2T
= e EP [f (t + Wt )eWt ] EP [e(WT Wt ) ]
2 t
= e 2 EP [f (t + Wt )eWt ].
Replacing Xt by Wt in the rst term yields the desired formula.
Exercise 10.4.8 Use the reduction of drift formulas and Example 8.10.7 to
show
(2k)! 2 t
(a) E[(t + Wt )2k eWt ] = k tk e 2 ;
2 k!
(b) E[(t + Wt )2k+1 eWt ] = 0.
Exercise 10.4.9 Use the reduction of drift formula to show
2
(a) E[sin(t + Wt )] = e t/2 sin t;
2
(b) E[cos(t + Wt )] = e t/2 cos t.
Exercise 10.4.10 Use the reduction of drift formulas to nd
(a) E[cos(t + Wt )eWt ];
(b) E[sin(t + Wt )eWt ].
Exercise 10.4.11 Let Wt be a Brownian motion on the space (, F, P ), and
dQ = MT dP . Show that
t
EQ [et+Wt ] = e 2
(a) by a direct computation;
(b) using Girsanovs theorem.
Exercise 10.4.12 Let Xt = t + Wt , with Wt a P -Brownian motion.

(a) Find EP [Xt ] and EQ [Xt ];
(b) Find EP [Xt2 ] and EQ [Xt2 ].
Exercise 10.4.13 Use Jensens inequality to show that for any convex, mea-
surable function f we have
2 t
E[f (t + Wt )eWt ] f (0)e 2 .
Exercise 10.4.14 Use the reduction of drift formulas to show

2 t
(a) E[Wt eWt ] = te 2 ;
2 t
(b) E[Wt2 eWt ] = (t + 2 t2 )e 2 .
t2
Exercise 10.4.15 Consider the stochastic process Xt = 2 + Wt , where Wt
is a Brownian motion on (, F, P ).
(a) Find the probability measure dQ such that Xt becomes a Q-Brownian
motion;
(b) Compute explicitly EP [Xt MT ];
(c) Use EP [Xt MT ] = EQ [Xt ] = 0 to nd a formula for
t
EP [Wt e 0
s dWs
].
Remark 10.4.16 Girsanovs theorem can be used to compute, at least in

theory, expectations of the form
t
E[f (Wt )e 0
g(s) dWs
],
with f and g continuous functions.
Exercise 10.4.17 Use the drift reduction formula to express V ar[f (t+Wt )].
Chapter 11
Some Applications of
Stochastic Calculus
In this chapter we shall present a few applications of stochastic calculus to a

few applied domains of mathematics. The main idea is that some parameters,
which in the case of deterministic Calculus are kept constant during the evo-
lution of the process, in this case are inuenced by the exterior white noise,
which is modeled by the informal derivative of a Brownian motion, dW dt . This
t
way, the ordinary dierential equations become stochastic dierential equa-

tions, and their solutions are stochastic processes. For more applications of
the white noise in chemistry and electricity one can consult the book of Gar-
diner [20]. For further applications to queueing theory the reader is referred
to Ross [43]. For nancial economics applications, see Sondermann [45], and
for stochastical modeling of oil prices, see Postali and Picchetti [39]. For an
application to car pricing in a stochastic environment see Alshamary and Calin
[1] and [2].
11.1 White Noise

The white noise is used in applications as an idealization of a random noise
that is independent at dierent times and has a very large uctuation at any
time. It can be successfully applied to problems involving an outside noise
inuence, such as trajectory of small particles which diuse in a liquid due
to the molecular bombardments, or signal processing, where it models the
completely unpredictable static inuence. The fact that the noise is not
biased towards any specic frequency, gives it its name white noise. In
this chapter we shall study a few applications of the white noise in kinematics,
population growth, radioactive decay and ltering problems.
The white noise will be denoted by Nt and considered as a stochastic pro-
241
cess. The eect of the noise during the time interval dt is normally distributed
with mean zero and variance dt, and is given as an innitesimal jump of a
Brownian motion, Nt dt = dBt . Thus, it is convenient sometimes to represent
the white noise informally as a derivative of a Brownian motion,
dBt
Nt = .
dt
Since Bt is nowhere dierentiable, the aforementioned derivative does not make
sense classically. However, it makes sense in the following generalized sense:

Nt f (t) dt = Bt f (t) dt,
R R
for any compact supported, smooth function f . Hence, from this point of view,
the white noise Nt is a generalized function or a distribution. In the following
we shall state its relation with the Dirac distribution 0 , which is dened in
the generalized sense as

0 (t)f (t) dt = f (0),
R
for any compact supported, smooth function f .

In order to study the white noise Nt , we should investigate rst the process
(
) 1
Xt = B(t + ) B(t) , t 0,

which models the rate of change of a Brownian motion B(t). Since we have
(
) 1
E[Xt ] = E[B(t + )] E[B(t)] = 0

(
) 1 1
V ar(Xt ) = 2
(t + t) = ,

(
)
the limiting process Nt = lim Xt will have zero mean and innite variance.

0
Consider s < t and choose > 0 small enough, such that (s, s + )
(t, t + ) = . Using the properties of Brownian motions, the dierences
B(s + ) B(s) and B(t + ) B(t) are independent. Hence, the random
variables Ns and Nt are independent for s < t.
In the following we shall compute the covariance of the process Nt . For
reasons which will be clear later we shall extend the parameter t to take values
in the entire real line R. This can be done by dening the Brownian motion
B(t) as
Some Applications of Stochastic Calculus 243

W1 (t), if t 0
B(t) =
W2 (t), if t < 0,
where W1 (t) and W2 (t) are two independent Brownian motions. First, we
(
)
compute the covariance of the process Xt . Assume s < t. Then using the
formula E[B(u)B(v)] = min{u, v}, we have
(
) (
)
Cov(Xs(
) , Xt ) = E[Xs(
) Xt ]
1
= 2 E B(s + ) B(s) B(t + ) B(t)

1
= 2 E B(s + )B(t + ) E B(s + )B(t)

E B(s)B(t + ) + E B(s)B(t)
1
= 2 s + min{s + , t} s + s

1
= 2 s + s + min{, t s}

1 t s 1 ts
= 1 min 1, = max 1 ,0 .

For any s, t we can derive the more general formula
1 |t s|
(
)
Cov(Xs(
) , Xt ) = max 1 ,0 . (11.1.1)

Consider the test function
1

1 , if | |
1

( ) = max 1 , 0 =

0, if | | > ,
1
which veries
( ) 0,
(0) = and

1

( ) d = 1 d = 1.
R

Therefore, we have
lim
( ) = 0 ( ),

0
where 0 is the Dirac distribution centered at 0. In fact, the above limit has
the following meaning

lim
( )f ( ) d = f (0),

0 R
for any test function f .

Since the covariance formula (11.1.1) can also be expressed as
(
)
Cov(Xs(
) , Xt ) =
(t s),
then
(
)
Cov(Ns , Nt ) = lim Cov(Xs(
) , Xt ) = lim
(t s) = 0 (t s).

0
0
We arrive at the following denition of the white noise.
Denition 11.1.1 A white noise is a generalized stochastic process Nt , which

is stationary and Gaussian, with mean and covariance given by
E[Nt ] = 0
Cov(Ns , Nt ) = 0 (t s).
11.2 Stochastic Kinematics

During a race, a cyclist has average speed m. However, his speed varies in
time. Sometimes the cyclist exceeds the speed m, but he gets tired after a
while and slows down. If the cyclists speed decreases under the mean m, then
he recuperates the muscle power and is able to speed up again. The cyclists
instantaneous velocity vt satises a mean reverting process described by the
equation
dvt = a(m vt )dt + dWt ,
where and a are two positive constants that correspond to the volatility and
rate at which the velocity is pulled towards the mean m. The solution is given
by t
vt = m + (v0 m)eat + eat eas dWs . (11.2.2)
0
Since the last term is a Wiener integral, the speed vt is normally distributed
with mean and variance
E[vt ] = m + (v0 m)eat

2
V ar(vt ) = (1 e2at ).
2a
The expectation of the speed as of time u is given by the conditional expec-
tation given the information Fu available at time u
u
E[vt |Fu ] = m + (v0 m)eat + eat eas dWs .
0
The cyclists stochastic coordinate at time t is obtained integrating the velocity

t
xt = x0 + vs ds
0
t s
as as
= x0 + [m + (v0 m)e + e eau dWu ] ds
0 0
t s
1 eat
= x0 + mt + (v0 m) + ea(us) dWu ds.
a 0 0
The expected coordinate is

t
E[xt ] = x0 + E[vs ] ds
0
t
= x0 + [m + (v0 m)eat ] ds
0
1 eat
= x0 + mt + (v0 m)
a
1 eat
= xunif (t) + (v0 m) .
a
The term xunif (t) = x0 + mt denotes the coordinate the cyclist would have
if moving at the constant velocity m. The dierence v0 m provides the
following upper and lower bounds
(1 at/2)|v0 m|t |E[xt ] xunif (t)| |v0 m|t.
This shows that the error between the expected coordinate and the coordinate
of a uniform move is at most linear in time and is controlled by the dierence
v0 m. Therefore, the expected coordinate is the classical coordinate, E[xt ] =
xunif (t), if and only if v0 = m.
The acceleration at is obtained as the derivative of velocity vt with respect
to time
dvt
at =
dt
t
at at dWt
= a(v0 m)e ae eas dWs +
0 dt
t
dW
= a0 eat aeat
t
eas dWs + , (11.2.3)
0 dt
where a0 is the initial acceleration. The rst term is a deterministic function,
the second term is a normally distributed random variable of zero mean, while
dWt dW
t
is the white noise term. Since E = 0, the long run limit of the
dt dt
expectation becomes E[at ] = a0 e at 0, as t .
Let M denote the mass of the cyclist and Ft = M at be the muscle force
developed at time t. The work done by the cyclist between instances 0 and t
is given by
xt xt t
W = Fs dxs = M as dxs = M as vs ds,
x0 x0 0
where the velocity and the acceleration are given by (11.2.2) and (11.2.3).
Computing the exact expression of W is tedious. However, using the properties
of Ito integrals one can compute E[W], see Exercise 11.2.1.
Since the square of velocity is given by
2 t
vt2 = m + (v0 m)e at
+ 2e at
m + (v0 m)eat
eas dWs
0
t 2
2 2at
+ e eas dWs ,
0
the expectation of the kinetic energy becomes

M v2 M 2 t 2
E t
= m + (v0 m)eat + 0 + 2 e2at E eas dWs
2 2 0
M 2 2
= m + (v0 m)eat + (1 e2at ) .
2 2a
1 eat 2 1 e2at 2
E[W] = M a0 m + a0 (v0 m) + t .
a 2 2a 2
Exercise 11.2.2 A snowake in falling motion is described by the equation
dvt = g dt + vt dWt , v0 = 0,
where g and are positive constants.

(a) Find E[vt ];
(b) Compute E[at ];
(c) Solve the equation to nd a formula for the velocity vt ;
(d) Find a formula for the acceleration at .
Exercise 11.2.3 A particle with the initial velocity v0 = 1 m/sec decelerates

in a noisy way according to the equation
dvt = 2vt dt + 0.3 dWt .

(a) What is the probability that at the time t = 3 sec the velocity of the
particle is less than 0.1 m/sec?
(b) Find the work done by the environment on the particle in order to
decelerate it from v0 = 1 m/sec to v = 0.5 m/sec.
Exercise 11.2.4 A snowball rolls downhill with the velocity given by the equa-
tion
dvt = 0.6 vt dt + 0.2 dWt , v0 = 0.
(a) Find the velocity vt ;
(b) What is the probability that the velocity is greater than 10 m/sec at
t = 20 sec?
11.3 Radioactive Decay

Consider a radioactive atom which contains N (t) nuclei at time t. Assume the
number of nuclei which decay during the time interval t is Poisson distributed
n (t)n
P N (t) N (t + t) = n = et ,
n!
where the constant stands for the decay rate. For a small time interval t
the previous formula becomes

P N (t) N (t + t) = 1 = t,
i.e. the probability of the occurrence of one decay in a small time interval is
proportional with the time interval. The probability of the complementary
event is
P N (t) N (t + t) = 0 = 1 t. (11.3.4)
Divide the interval [0, t] into n equidistant subintervals
0 = t0 < t1 < t2 < < tn1 < tn = t,
with t = tk+1 tk = t/n. The event of not having any decays during the
interval [0, t] can be expressed as

n1
{N (0) N (t) = 0} = {N (tk ) N (tk+1 ) = 0}.
k=0
Since the increments of a Poisson process are independent, we have

n1 t n
P N (0)N (t) = 0 = P N (tk )N (tk+1 ) = 0 = (1t)n = 1 .
n
k=0
Let n and obtain

P N (0) N (t) = 0 = et .
For N (t) large enough, this probability represents the fraction of nuclei that
survived the decay during the time interval [0, t]. Since the percentage of nuclei
that are still alive after time t is represented by the quotient N (t)/N (0), we
have
N (t)
= et .
N (0)
This relation, written as N (t) = N (0)et , is the law of radioactive decay.
Now we shall develop a dierential equation for N (t). Relation (11.3.4)
states that the fraction of nuclei that resist the decay during the time interval
[t, t + t] is
N (t + t)
= 1 t.
N (t)
Cross multiplying and subtracting N (t) yields
N (t) N (t + t) = N (t)t. (11.3.5)
Assuming that the period of observation is innitely ne, t dt, the equa-
tion becomes
dN (t) = N (t) dt.
This describes the kinetics of the radioactive decay, stating that the change
in the number of nuclei dN (t) during the time interval dt is proportional with
the number of nuclei N (t). Solving the aforementioned equation, we obtain
again the law of radioactive decay
N (t) = N (0)ekt .
Noisy radioactive decay In real life relation (11.3.5) does not hold exactly,
and some errors of measurement or counting are involved. These will be added
as a noisy term
N (t) N (t + t) = N (t)t + noise.
For t small, this becomes a stochastic dierential equation
dN (t) = N (t) dt + dWt ,
with positive constant and Wt Brownian motion. The obtained equation is
called Langevins equation. We shall solve it as a linear stochastic dierential
equation. Multiplying by the integrating factor et yields
d(et N (t)) = et dWt .
Integrating yields t
et N (t) = N (0) + es dWs .
0
Hence the solution is
t
t t
N (t) = N (0)e + e es dWs . (11.3.6)
0
This is the Ornstein-Uhlenbeck process. Since the last term is a Wiener inte-
gral, by Proposition 8.2.1 we have that N (t) is Gaussian with the mean
t
t
E[N (t)] = N (0)e +E e(st) dWs = N (0)et
0
and variance
t 2
V ar[N (t)] = V ar e(st) dWs = (1 e2t ).
0 2
Using Exercise 10.3.11 we can write the Gaussian term as a Brownian

motion under a time change as
t
es dWs = B(e2t 1)/(2) ,
0
with Bt Brownian motion. Hence the solution can also be written as
N (t) = N (0)et + et B(e2t 1)/(2) .

Using the expansion
e2t = 1 + 2t + o(t2 ), t 0,
then (e2t 1)/(2) = t + o(t2 ), and hence the following approximation holds
for t small
N (t) = N (0)et + et Bt .
Exercise 11.3.1 Let N (t) be a noisy radioactive decay. Dene the half time
h as
1
h = inf{t > 0; N (t) N (0)}.
2
(a) Prove that E[eh ] = 2;
ln 2
(b) Use Jensens inequality to show that E[h] .
Exercise 11.3.2 Consider a machine which consists initially of N distinct

parts. Assume the number of parts which get defective during the time interval
t is Poisson distributed
n (t)n
P N (t) N (t + t) = n = et .
n!
(a) What is the probability that all parts still function perfectly at time t?
(b) Find the time t such that the 90% of the machine functions perfectly
at time t.
Exercise 11.3.3 A living organism has initially N0 cells. Assume the number
of cells which die during the time interval t is Poisson distributed
n (t)n
P N (t) N (t + t) = n = et .
n!
The organism dies when at least 30% of its cells are dead. Find an approxi-
mation of the death time of the organism.
11.4 Noisy Pendulum

The small oscillations of a free simple pendulum can be described by the linear
equation
= k2 (t),
(t) (11.4.7)
where (t) is the angle between the string and the vertical direction. If the
exterior perturbations are modeled by a white noise process, Nt , then the
pendulum equation under small deviations writes as
+ k 2 (t) = Nt ,
(t) (11.4.8)
where k and are constants and the noise is given informally as Nt = dBt
dt .
The general solution of equation (11.4.8) can be expressed as the sum
(t) = p (t) + 0 (t), (11.4.9)
where p (t) is a particular solution of (11.4.8) and 0 (t) is the solution of the
associated homogeneous equation (11.4.7).
Standard ODE methods provide
0 (t) = c1 cos(kt) + c2 sin(kt),
with c1 , c2 R. The particular solution p (t) can be obtained by the method

of variable coecients. We are looking for a particular solution
p (t) = u1 (t) cos(kt) + u2 (t) sin(kt), (11.4.10)

where u1 (t) and u2 (t) are two dierentiable functions, which will be determined
later. Assuming
u1 (t) cos(kt) + u2 (t) sin(kt) = 0, (11.4.11)
dierentiating yields
p (t) = u1 (t) cos(kt) + u2 (t) sin(kt) ku1 (t) sin(kt) + ku2 (t) cos(kt)
= ku1 (t) sin(kt) + ku2 (t) cos(kt).
Then the second derivative is
p (t) = ku1 (t) sin(kt) + ku2 (t) cos(kt) (11.4.12)

2 2
k u1 (t) cos(kt) k u2 (t) sin(kt).
Substituting (11.4.12) and (11.4.10) into (11.4.8) yields
dBt
ku1 (t) sin(kt) + ku2 (t) cos(kt) = , (11.4.13)
dt
dBt
where we used the informal notation for the white noise Nt = . Equations
dt
(11.4.11) and (11.4.13) yield the ODEs system in u1 and u2
u1 (t) cos(kt) + u2 (t) sin(kt) = 0 (11.4.14)

dBt
u1 (t) sin(kt) + u2 (t) cos(kt) = . (11.4.15)
k dt
The reduction method and integration produces the following solutions
t

u1 (t) = sin(ks) dBs (11.4.16)
k 0
t

u2 (t) = cos(ks) dBs . (11.4.17)
k 0
These represent the eect of the white noise dBs along the solutions trajec-
tories sin(ks) and cos(ks). From the properties of Wiener integrals, it follows
that u1 (t) and u2 (t) have normal distributions with the mean, variances and
covariance given by
E[u1 (t)] = E[u2 (t)] = 0

2 t 2 2 t sin(2kt)
V ar[u1 (t)] = sin (ks) ds =
k2 0 k2 2 4k
t
2
2
2
t sin(2kt)
V ar[u2 (t)] = cos (ks) ds = +
k2 0 k2 2 4k
Cov[u1 (t), u2 (t)] = E[u1 (t)u2 (t)]

t
t
= E sin(ks) dBs , cos(ks) dBs
k 0 k 0
t
2 2 sin2 (kt)
= 2 sin(ks) cos(ks) ds = 3 .
k 0 k 2
The particular solution (11.4.10) becomes

t t
p (t) = sin(ks) dBs cos(kt) + cos(ks) dBs sin(kt).
k 0 k 0
(11.4.18)
Hence the general solution for the pendulum equation given by (11.4.9) is

t t
(t) = c1 sin(ks) dBs cos(kt) + c2 + cos(ks) dBs sin(kt),
k 0 k 0
where the constants c1 and c2 depend on the initial data as

(0)
c1 = (0), c2 = .
k
It is worth noting that (t) is not normally distributed. This follows from
the fact that Cov[u1 (t), u2 (t)] = 0 implies that u1 (t) and u2 (t) are not inde-
pendent. However, we are able to compute the mean and variance as in the
following
E[(t)] = 0 (t) = c1 cos(kt) + c2 sin(kt)
V ar[(t)] = V ar[p (t)] = cos2 (kt)V ar(u1 (t)) + sin2 (kt)V ar(u2 (t))
+2 sin(kt) cos(kt)Cov(u1 (t), u2 (t))
2t 2
= sin(2kt).
2k2 4k3
We shall present in the following another method for nding Xt , involving

an integrating factor. Considering X1 (t) = (t), X2 (t) = (t), the pendulum
equation
= k 2 (t) + dBt
(t)
dt
becomes a rst order system of stochastic dierential equations
dX1 (t) = X2 (t) dt

dX2 (t) = k2 X1 (t) dt + dBt .
Denoting / 0 / 0 / 0
X1 0 1 0
Xt = , A= , K= ,
X2 k2 0
the aforementioned system becomes a linear matrix stochastic dierential
equation
dXt = AXt dt + K dBt .
Multiplying by the integrating factor eAt yields the exact equation

d eAt Xt = eAt K dBt .
Integrating we obtain the solution

t
At
Xt = e X0 + eA(ts) K dBs . (11.4.19)
0
Since A2 = k2 I2 , a computation of the exponential of At involving a power

series provides
An tn A2n t2n A2n+1 t2n+1
eAt = = +
n! (2n)! (2n + 1)!
n0 n0 n0
(1) k t2n
n 2n 1 (1)n k2n+1 t2n+1
= I2 + A
(2n)! k (2n + 1)!
n0 n0
1
= cos(kt)I2 + sin(kt)A
/ k 0 / 0
1
cos(kt) 0 0 k sin(kt)
= +
0 cos(kt) k sin(kt) 0
/ 1
0
cos(kt) k sin(kt)
= .
k sin(kt) cos(kt)
The expectation of Xt is given by

/ 1
0/ 0
cos(kt) sin(kt) (0)
E[Xt ] = e X0 =
At k
k sin(kt) cos(kt)
(0)
/ 0
cos(kt)(0) + k1 sin(kt)(0)

= .
k sin(kt)(0) + cos(kt)(0)
Considering each component separately, implies

1
E[(t)] = cos(kt)(0) + sin(kt)(0)
k

E[(t)]
= k sin(kt)(0) + cos(kt)(0).
Since / 0

sin(k(t s))
eA(ts) K = k ,
cos(k(t s))
the integral term of (11.4.19) can be computed as
t

t k sin(k(t s)) dBs
0
eA(ts) K dBs =
t
.

0
cos(k(t s)) dBs
0
Since any electronic circuit is mathematically equivalent to a pendulum
equation, a similar method of study can be applied to it. It is worth noting
that the analysis of noise in electronic circuits was developed as early as 1920s
by Rice [42] and Schottky [44].
Exercise 11.4.1 Consider an electric circuit, in which the charge Qt at time
t satises the equation
d2 Qt dQt
2
+3 + 2Qt = Nt ,
dt dt
dWt
where the external force is just the inuence of the white noise Nt = .
dt
(a) Find the solution of the homogeneous equation;
(b) Show that a particular solution is given by
t
Qpt = e(ts) e2(ts) dWs .
0
(c) Find the general solution, Qt , and show that E[Qt ] satises the homo-
geneous equation.
Exercise 11.4.2 (a) Solve the following stochastic dierential equation
dXt = Yt dt + dWt1
dYt = Xt dt + dWt2 ,
where (Wt1 , Wt2 ) is a 2-dimensional Brownian motion and and are con-
stants.
(b) Use part (a) to nd a solution for the following stochastic pendulum
equation
2 + W
t + t = W 1.
t t
11.5 Stochastic Population Growth

This section presents a few population growth models driven by noisy growth
rates. This implies that the population size is stochastic.
Exponential growth model The population at time t, denoted by Pt , sat-
ises the growth equation
dPt = rt Pt dt, (11.5.20)
where rt is the stochastic growth rate. Assume that rt oscillates irregularly
around some deterministic average function a(t)
rt = a(t) + noise.
If the size of the white noise is , then

dBt
noise = Nt = .
dt
Substituting in (11.5.20) yields the following SDE
dPt = a(t)Pt dt + Pt dBt , (11.5.21)
where > 0 is a constant and Bt is a Brownian motion. The equation (11.5.21)

can be reduced to an exact equation multiplying by the integrating factor
1 2
t = e 2 tBt . Itos formula provides
dt = t ( 2 dt dt),
and hence dt dPt = 2 Yt dt. Denoting Yt = t Pt and applying the product

rule yields
dYt = d(t Pt ) = dt Pt + t dPt + dt dPt

= ( 2 Yt + a(t)Yt 2 Yt )dt + (Yt Yt )dBt
= a(t)Yt dt.
Hence the process Yt satises the deterministic equation
dYt = a(t)Yt dt
with the initial condition Y0 = 0 P0 = P0 . Integrating yields

t
a(s) ds
Yt = P0 e 0 .
Solving for Pt we obtain

t
a(s) ds 2 t/2+Bt
Pt = P0 e 0 . (11.5.22)
2 t/2
Using E[eBt ] = e , we obtain
t t
a(s) ds 2 t/2
E[Pt ] = P0 e 0 E[eBt ] = P0 e 0 a(s) ds
.
It is worth noting that the function P (t) = E[Pt ] satises the deterministic
equation
dP (t) = a(t) P (t)dt.
The population Pt provided by formula (11.5.22) is log-normally distributed.
Pt
In fact, ln is normally distributed
P0
Pt
ln N (m(t), 2 t),
P0
with the mean given by
t
2t
m(t) = a(s)ds .
0 2
Exercise 11.5.1 Consider the population given by the formula (11.5.22).

(a) Find the probability distribution function of Pt .
(b) Find the probability density function of Pt .
Exercise 11.5.2 A bacteria population has an intrinsic growth rate of r =

0.08 and noise size = 0.01 per day. If the population starts with 10, 000
bacteria, nd the probability that there are more than 11, 000 bacteria after 2
days.
Exercise 11.5.3 A population has a noisy growth rate given by rt = t2 + dW t

dt .
Find the doubling time T , which satises
E[PT ] = 2P0 .
Population growth in a stochastic and crowded environment In the

previous exponential growth model the population can increase indenitely. A
more realistic model was obtained by P.F. Verhust in 1832 (and rediscovered
by R. Pearl in the twentieth century), who assumed that due to competition
the population also tends to decrease at a rate proportional with the number
of encounters between the population members, which is proportional with
the square of the population size
dP (t) = rP (t)dt kP (t)2 dt. (11.5.23)

The constant r is the intrinsic growth rate, i.e. the relative rate at which the
population would increase if there were no restrictions on the population. The
positive constant k reects the damping eect on the population growth caused
by competition for resources between the members of the same population.
The solution of the equation (11.5.23) is given by the logistic function
P0 K
P (t) = , (11.5.24)
P0 + (K P0 )ert
where K = r/k is the saturation level, or carrying capacity of the environment.
This represents the equilibrium level to which the population, regardless of its
initial size, will tend in the long run
K = lim P (t).
t
One of the stochastic models for the population growth in a stochastic and
competitive environment is obtained keeping in equation (11.5.23) the rate k
constant, while considering a noisy intrinsic rate of growth
dPt = (r + Nt )Pt dt kPt2 dt. (11.5.25)

This equation can be written equivalently as
dPt = rPt dt kPt2 dt + Pt dBt , (11.5.26)
where the positive constant measures the size of the noise of the system.
Rewriting the equation as
dPt = kPt (K Pt )dt + Pt dBt ,
1 2
and multiplying by the integrating factor t = e 2 tBt leads to the exact
equation
d(t Pt ) = kt Pt (K Pt )dt.
Substituting Yt = t Pt yields the equation
dYt = kYt (K 1
t Yt )dt. (11.5.27)
ert
In order to solve (11.5.27) we shall make the new substitution Zt = . Since
Yt
(dYt )2 = 0, Itos formula provides
ert 1
dZt = r dt 2 ert dYt
Yt Yt
1
= rZt dt 2 ert kYt (K 1
t Yt )dt
Yt
= rZt dt kZt (K 1
t Yt )dt
= kZt Yt 1
t dt
= kert 1
t dt,
where we used that r = kK and Zt Yt = ert . The process Zt satises the

integrable equation
dZt = kert 1
t dt
with the solution t

Zt = Z0 + k ers 1
s ds.
0
1 1
Since Z0 = = , substituting back we obtain the following expression for
Y0 P0
the population
2
ert ert t/2+Bt
Pt = 1
t Yt = 1 = t .
+ k 0 ers 1
t 1
Zt s ds
P0
In the following we shall manipulate the previous expression in order to make

it look as close as possible to the logistic equation (11.5.24).
2 2 2
ert t/2+Bt P0 ert e t/2+Bt P0 Kert e t/2+Bt
Pt = t = = t
1
+ k 0 ers 1 1 + kP0 0 ers 1 K + rP0 0 ers 1
t
P0 s ds s ds s ds
2 t/2+B
P0 Ke t
= t
Kert + rP0 ert rs 1
0 e s ds
2
P0 Ke t/2+Bt
= t
(K P0 )ert + P0 1 + r 0 ers 1
s ds ert
If in the previous formula we let = 0, and hence s = 1, we obtain exactly

the expression (11.5.24).
A more sophisticated model is obtained if in equation (11.5.23) both rates
r and k are noisy, with the size of the noise proportional with the rates as
follows
dBt dBt
rt = r + r , kt = k + k .
dt dt
We note that both rates are driven by the same uncertainty source. Substi-
tuting in the equation yields the following SDE
dPt = (rPt kPt2 )dt + (rPt kPt2 )dBt .
It can be shown that this equation has a unique strong solution, but the
discussion of this subject is beyond the level of this textbook.
Population growth in a stochastic catastrophic environment In the
previous model the population tends to decrease due to competition and lim-
ited space. In the present model the population decreases suddenly due to
some unexpected catastrophic events, such as earthquakes, wars, diseases,

natural calamities, etc. The SDE satised by the population in this case is
dPt = rPt dt Pt dNt , (11.5.28)
where Nt is a Poisson process with rate . The positive constant is a

dPt
measure of the size of the drop in the instantaneous relative change and
Pt
takes values in (0, 1).
We shall construct a solution as in the following. Let Sk denote the kth
jumping time for the Poisson process, i.e. NSk = k and NSk = k1. Consider
t [0, S1 ). Since there are no jumps in this interval the population satises
the stochastic dierential equation
dPt = rPt dt, 0 t < S1 ,
with the solution given by the usual formula Pt = P0 ert , for t [0, S1 ). In
particular, when t = S1 , we have
PS1 = P0 erS1 . (11.5.29)
Since at the jumping time S1 we have
dPS1 PS PS1
= 1 = ,
PS1 PS1
then PS1 = (1 )PS1 . Combining with formula (11.5.29) yields
PS1 = (1 )PS1 = P0 erS1 (1 ). (11.5.30)
Because there are no jumps in the interval [S1 , S2 ) the following dierential
equation holds
dPt = rPt dt, S1 t < S2 ,
with the solution
Pt = PS1 er(tS1 ) .
Combining with (11.5.30) yields
Pt = PS1 er(tS1 ) = P0 erS1 (1 )er(tS1 ) = P0 ert (1 ), S1 t < S2 .
The eect of passing over a jump is to multiply the solution by the factor
(1 ). Using this observation we obtain inductively
Pt = P0 ert (1 )n , Sn t < Sn+1 .

Replacing n by Nt we arrive at the following expression for the population in

a stochastic catastrophic environment
Pt = P0 ert (1 )Nt . (11.5.31)
In the next paragraph we shall compute the mean of Pt using conditional

expectations

E[Pt ] = P0 ert E[(1 )Nt ] = P0 ert E[(1 )Nt |Nt = n]P (Nt = n)
n0
n tn t
= P0 ert (1 )n P (Nt = n) = P0 ert (1 )n e
n!
n0 n0
rt t (1)t (r)t
= P0 e e e = P0 e .
To conclude, if r < , the population tends to decrease and then disappear

in the long run. A population with r = has on average a constant size.
Next we shall evaluate the probability of the event {Pt x} for a given
x > 0. We consider the following transformation of events
x rt
{Pt x} = {P0 e(r)t x} = {P0 ert (1 )Nt x} = {(1 )Nt e }
P0

ln Px0 rt
= Nt ,
ln(1 )

ln x
P0 rt
where we used that ln(1 ) < 0. Denoting yt = , the probability
ln(1 )
can now be evaluated as in the following
k tk
P (Pt x) = et .
k!
kyt
Exercise 11.5.4 An ant colony of 1, 000 ants grows at the intrinsic rate r =
0.30 per month. However, each rainfall kills 2% of the ant population, and it
rains 5 times per year.
(a) Write the stochastic dierential equation for the ant population size;
(b) What is the probability that there are at least 2, 000 ants in the colony
at the end of the rst year?
(c) What is the expected size of the colony after 2 years?
Population growth with stochastic harvesting Besides the eect of expo-

nential growth at the intrinsic rate r, we shall also assume that the population
is harvested at the stochastic rate Ct = + dW t
dt , where > 0 is the mean
harvesting rate and measures the size of the noise. The population at time
t, denoted by Pt satises the following equation
dPt = (rPt Ct )dt,
which can be written equivalently as
dPt = rPt dt dt dWt . (11.5.32)
Multiplying by ert and solving it as a linear stochastic dierential equation

yields the solution
t

Pt = ert P0 + ert ers dWs . (11.5.33)
r 0
This implies that Pt is normally distributed with the mean and variance given
by

E[Pt ] = ert P0 (11.5.34)
r
2 2rt
V ar[Pt ] = (e 1). (11.5.35)
2r
Exercise 11.5.5 Show that the stochastic process (11.5.33) is the solution of
equation (11.5.32).
Exercise 11.5.6 Prove formulas (11.5.34) and (11.5.35).
11.6 Pricing Zero-coupon Bonds

A bond is a nancial instrument which pays back at the end of its lifetime, T ,
an amount equal to B. If the contract does not provide any payments until
maturity, the bond is called a zero-coupon bond or a discount bond. The value
of the bond at time t is denoted by B(t). In the case when the yield r is
constant, i.e., if
dB(t)
= r dt,
B(t)
T
then the bond value is given by the familiar expression B(t) = Be t r ds =
Ber(T t) . We shall assume next the case when the yield of the bond is
aected by the noise in the market
dWt
rt = r + , r > 0,
dt
where is a positive constant that controls the size of the noise. This leads
to the following stochastic process
T T T
Bt = Be t rs ds
= Be t r ds
e t dWs
= Ber(T t) e(WT Wt ) .
The bond value is the conditional expectation of Bt given the information in

the market until time t
E[Bt |Ft ] = Ber(T t) E[e(WT Wt ) |Ft ]

= Ber(T t) E[e(WT Wt ) ]
1 2 (T t)
= Ber(T t) e 2
1 2 )(T t)
= Be(r 2 .
According to the previous model, we note that in the case when the market
2
noise is small, r > 2 , the bond value appreciates, having the maximum value,
2
B, at t = T . If the noise is large, 2 > r, the bond depreciates and has to be
sold as soon as possible.
Exercise 11.6.1 Use Itos formula to show that the process Bt satises the
equation
2
dB(t) = r+ B(t)dt + B(t)dWt
2
B(T ) = B.
11.7 Finding the Cholesterol Level

The amount of cholesterol in the blood of a person at time t is described by a
stochastic process denoted by Ct . The body cholesterol is either manufactured
by the body or it is absorbed from the intaken food. If E denotes the daily
rate at which cholesterol is eaten and C0 stands for the normal level of choles-
terol for a health person, then Ct satises the following stochastic dierential
equation
dCt = a(C0 Ct )dt + bEdt + dWt , (11.7.36)
where a is a production parameter, b is the absorption parameter, and is the
size of the noise in the measurement. Solving (11.7.36) as a linear equation
we obtain the solution
t
b
Ct = C0 eat + (C0 + E)(1 eat ) + eat eas dWs .
a 0
From the properties of Wiener integrals it follows that Ct is normally dis-

tributed, with mean
b
E[Ct ] = C0 eat + (C0 + E)(1 eat )
a
and variance
2
V ar[Ct ] = (1 e2at ).
2a
If the diet is kept for a long time, then the cholesterol level becomes the normal
random variable
b 2
Ct N C0 + E, , t .
a 2a
In order to evaluate the health of a particular person who is on a given diet
(i.e. when E is kept constant for a long time) we have to nd the probability
that the cholesterol level is over a given acceptable level M . Using the long
run normality of Ct , this probability can be evaluated as

a 1 (xC0 2 ab E)
2
P (Ct > M ) = e dx. (11.7.37)

M
Exercise 11.7.1 Assuming that the intaken food does not have a constant
cholesterol level, the term E is replaced by E + noise. Write and solve the
corresponding stochastic dierential equation.
Exercise 11.7.2 The normal cholesterol level in the blood is C0 = 200 (mil-
ligrams per deciliter), the production parameter is a = 0.1 and the absorption
parameter for a particular person is b = 0.15. What is the maximum daily
intake E of cholesterol such that the long run level of cholesterol is less than
220 with a probability of 95%?
11.8 Photons Escape Time from the Sun

This section deals with the computation of a rough approximation of the time
taken by a photon of light to travel from the center of the sun to its surface.
Fusion reactions occur in the core of the sun due to high heat and pressure.
These reactions release high energy photons (gamma rays), which are absorbed
in only a few millimeters by the solar plasma particles and then are emitted
again in a random direction, see Fig. 11.1.
The mathematical model for such a photon emitted and randomly redi-
rected by plasma particles is a Brownian motion process. If Xt denotes the
Xt
photon's trajectory
solar plasma
Figure 11.1: The photon bounces back and forth in its eort to emerge to the
suns surface.
coordinates vector in R3 of the photon at time t, where the center of the sun
is assigned the zero coordinate, then
Xt = a + Wt , (11.8.38)
where a = X0 is the location where the photon was initially created, is
the dispersion function and Wt = (Wt1 , Wt2 , Wt3 ) is a 3-dimensional Brownian
process in R3 . The dispersion is given by Einsteins formula
2 = 2kT /(6),
where k is Boltzmanns constant, T denotes the absolute temperature, is
the diameter of the photon, and is the viscosity of the solar plasma. For the
sake of simplicity we shall assume in the following that is constant.
The time when the photon reaches the surface of the sun is a random
variable denoted by . We shall use a similar method as the one described in
section 9.6 to nd the expected time E[ ]. If RS denotes the radius of the sun,
the time necessary for a photon to emerge to the suns surface is the exit time
= inf{t > 0; |Xt | RS }. (11.8.39)
Since the innitesimal generator of the process (11.8.38) is the operator
2 2 2
= ( + x22 + x23 ),
2 2 x1
using the function f (x) = |x|2 = x21 + x22 + x23 in Dynkins formula
2
E f (X ) = f (x) + E f (Xs ) ds
0 2
yields

RS2 2
= |a| + E 3 2 ds ,
0
and hence
RS2 |a|2
E[ ] = . (11.8.40)
3 2
In particular, if the photon is emitted from the suns center, the expected
emerging time to the surface is
RS2
E[ ] = . (11.8.41)
3 2
Using the numerical values for the suns radius and photons diusion given
by RS = 6.955 105 km and 2 = 0.0025 km2 /sec (this corresponds to a 50
meters per second radial photon displacement), formula (11.8.41) yields the
approximate value E[ ] 2 million years. Some other sources compute slightly
dierent values, but the idea is that it takes a really long time for a photon to
leave the suns interior. This is huge compared with the only 8 minutes spent
by the photon on its way to earth.
It is worth noting that if a star has its radius 100 times the suns radius,
then the expected emerging time multiplies by a factor of 104 . This means
E[ ] 2 1010 years (20 billion years), which is longer than the entire age of
the universe ( 14 billion years)! Hence, it is possible that a photon created
in the center of the star has not found its way out to the surface yet. Since the
life span of a star is usually around 10 billion years, the photon will probably
not get out of the star during its life span.
11.9 Filtering Theory

In early 1960s Kalman and Bucy found a procedure for estimating the state of a
signal that satises a noisy linear dierential equation based on a series of noisy
observations. This is known now under the name of Kalman-Bucy lter. This
theory has useful applications in signal processing of aerospace tracking, GPS
location systems, radars, MRI medical imaging, statistical quality control,
and any other applications dealing with reducing or ltering noise out of an
observed system. For more examples on this subject as well as the complete
proofs, the reader can consult ksendal [37].
Assume t is the input process, i.e., a process that describes the state of a
stochastic system at time t, which needs to be observed. This process has some
noise built in and its evolution satises a given linear stochastic dierential
equation
dt = a(t)t dt + b(t)dWt ,
noise
t St
input signal observed signal
Figure 11.2: The input signal t and the observed signal St .
where a(t) and b(t) are given deterministic functions and Wt is a Brownian
motion, independent of the initial value 0 . We assume that t is observed
continuously with the actual observations St = t + noise, see Fig. 11.2. If
dBt
the white noise is given by noise = (t) , then
dt
St dt = t dt + (t)dBt . (11.9.42)
Introducing the cumulative observation process

t
Qt = Su du
0
and using that the information (-algebras) induced by the processes Qt and
St are the same,
FtQ = FtS ,
it follows that equation (11.9.42) can be replaced by the more mathematically
useful formula
dQt = t dt + (t)dBt , (11.9.43)
with Bt independent of Wt . From now on, Qt will be considered as the ob-
servation process instead of St . It is easier to work with the cummulative
observation process Qt rather than the actual observations St .
The ltering problem can now be stated as:
Given the observations Qs , 0 s t, satisfying equation (11.9.43), nd
the best estimate t of the state t .
One of the best estimators t , which is mathematically tractable, is the
one which minimizes the mean square error
R(t) = E[(t t )2 ].
This means that for any other square integrable random variable Y , which is
measurable with respect to the eld FtQ , we have the inequality
E[(t t )2 ] E[(t Y )2 ].
It turns out that the best estimator t coincides with the conditional expecta-
tion of t given the information induced by Qs , 0 s t, namely,
t = E[t |FtQ ].
The computation of the best estimator t is provided by the following central

result in ltering theory.
Theorem 11.9.1 (Kalman-Bucy lter) Let the state of a system t satisfy

the equation
dt = a(t)t dt + b(t)dWt , (11.9.44)
where a(t) and b(t) are deterministic functions. Assume that the random
variable 0 and the Brownian motion Wt are independent, and let E[0 ] = ,
V ar[0 ] = 2 . Assume the observation Qt satisfy the equation
dQt = (t)t dt + (t)dBt , Q0 = 0, (11.9.45)
with deterministic functions (t), (t) and Brownian motion Bt , independent

of Wt and 0 .
Then the conditional expectation t = E[t |FtQ ] is the solution of the linear
stochastic dierential equation
dt = U (t)t dt + V (t)dQt , 0 = , (11.9.46)
with
(t) 2 (t)
V (t) = R(t), U (t) = a(t) R(t),
2 (t) 2 (t)
and R(t) satisfying the deterministic Riccati equation
dR(t) 2 (t)
= b2 (t) + 2a(t)R(t) 2 R2 (t), R(0) = 2 .
dt (t)
Moreover, the least mean square error is given by R(t) = E[(t t )2 ].
The process t is called the Kalman-Bucy lter of the ltering problem

(11.9.44)-(11.9.45). Furthermore, if
lim R(t) = 0,
t
we say that t is an exact asymptotic estimation of t .

Explicit solutions In the following we shall deal with the closed form solution
of equation (11.9.46).
t
First, we shall solve (11.9.44) as a linear equation.
Denote A(t) = e 0 a(s) ds and multiply by eA(t) to get
d(t eA(t) ) = b(t)eA(t) dWt .
Integrating and solving for t we obtain

t
A(t)
t = 0 e + eA(t)A(s) b(s) dWs . (11.9.47)
0
The mean and variance of the input process t are
E[t ] = E[0 ]eA(t) = eA(t) ;

t
V ar[t ] = e2[A(t)A(s)] b2 (s) ds,
0
where for the second identity we used the properties of Wiener integrals.
Now, equation (11.9.46) can be solved as a linear equation. After multi-
plying by the factor t
(t)1 = exp{ U (s) ds}
0
the equation becomes
d((t)1 t ) = (t)1 V (t)dQt .
Integrating, we obtain the lter solution

t

t = t + (t) (s)1 V (s) dQs
0
t t
1
= t + (t) (s) V (s)(s)s ds + t s (s)1 V (s)(s) dBs ,
0 0
with t given by (11.9.47).

It is worth noting that the expectation of the lter is given by
E[t ] = E[E[t |FtQ ]] = E[t ]

= eA(t) .
Example 11.9.2 (Noisy observations of a random variable) Assume

is a random variable which needs to be measured. Its known mean and vari-
ance are given by E[] = and V ar[] = 2 . For instance, we may consider
to be the heart rate per minute, or the cholesterol blood level, or the systolic
blood pressure for a particular person. Observing any of the aforementioned

variables involve measurement errors that are described by the noise factor.
The actual observations are given by a process St = + noise, which can be
written in terms of the cumulative observations as
dQt = dt + dBt ,
with > 0 constant. This is equivalent to Qt = t + Bt , where the noise

factor represented by the Brownian motion Bt is assumed independent of .
The ltering problem in this case means we have to nd a stochastic process
t which is the best approximation of up to time t, given the measurements
Qs = s + Bs , 0 s t.
In order to nd the lter t consider the constant process t = and associate

the ltering problem
dt = 0
dQt = dt + dBt .
Since in this case (t) = 1, (t) = , a(t) = 0, and b(t) = 0, the Riccati
equation becomes
dR(t) 1
= 2 R2 (t),
dt
2
with the solution R(t) = depending on the parameter C. Using the
t C 2
initial condition R(0) = 2 , we obtain
2 2
R(t) = .
2 t + 2
Then the coecients V (t) and U (t) take the following form
1 2
V (t) = R(t) =
2 2 t + 2
1 2
U (t) = 2 R(t) = 2 .
t + 2
Equation (11.9.46) becomes
2 2
dt + t dt =
dQt .
2 t + 2 2 t + 2
Multiplying by the integrating factor
t 2
0 2 s+ 2 ds 2 t + 2
e =
2
we obtain the exact dierential equation

2t + 2 2
d t = 2 dQt ,
2
which after integration provides the following formula for the best estimate
2 2
t = 0 +
Qt
2t + 2 2 t + 2
1
= ( 2 + 2 Qt ).
t + 2
2
It is worth noting that the foregoing formula implies that the best estimate
of the random variable , given the continuous observations Qs , 0 s t,
depends only on the last observation Qt . In the case of discrete observations,
the best estimate will depend on each observation, see the next example.
Example 11.9.3 Consider a random variable , with E[] = and V ar() =

2 , which is observed n times with the results
S 1 = + 1
S 2 = + 2

S n = + n ,
where j is an error random variable, independent of , with E[j ] = 0, and

V ar(j ) = m2 . We shall assume in the beginning that the errors j are inde-
pendent. The goal is to nd the best estimate of given the observations Sj ,
1 j n,
= E[|S1 , , Sn ].
Consider the ane space generated by the observations Sj

n
L(S) = {0 + cj Sj ; 0 , cj R}.
j=1
It makes sense to look for the best estimator as an element L(S) such
that the distance E[( )2 ] is minimized. This occurs when the estimator
is the orthogonal projection of on the space L(S). This means we have to
determine the constants 0 , cj such that the following n + 1 conditions are
satised
= E[]
E[]
E[( )Sj ] = 0, j = 1, , n.
Let = 0 + c1 S1 + + cn Sn . Since
= 0 + (c1 + + cn )
= E[]
it follows that 0 = c0 , with
c0 + c1 + c2 + cn = 1. (11.9.48)
Therefore, belongs to the convex hull of {, S1 , , Sn }, i.e.
= c0 + c1 S1 + + cn Sn .
A computation provides
j ] E[Sj ]
E[( )Sj ] = E[S
n
= c0 E[Sj ] + ck E[Sk Sj ] E[( + j )]
k=1

n
= c0 2 + ck E[( + k )( + j )] E[ 2 ] E[]E[j ]
k=1

n
n
2 2 2
= c0 + E[ ] ck + m ck kj E[ 2 ]
k=1 k=1
2 2
= c0 + E[ ](1 c0 ) + m2 cj E[ 2 ]
= c0 (2 E[ 2 ]) + m2 cj
= c0 2 + m2 cj .
The orthogonality condition E[( )Sj ] = 0 implies
c0 2
cj = , j = 1, , n. (11.9.49)
m2
Substituting in (11.9.48) provides an equation for c0 , which has the solution
m2
c0 = ,
m2 + n 2
and hence (11.9.49) becomes
2
cj = .
m2 + n 2
In conclusion, the best estimator of , given n observations S1 , , Sn , is given
by

n
= c0 + ck Sk
k=1
m2 2
n
= + Sk .
m2 + n 2 m2 + n 2
k=1
It is worth noting that each observation is

equally weighted. If the number of
1
observations is large, n , then n nk=1 Sk . This means that the best
approximation is given by the average of the observations.
All previous formulas are valid for the case of independent observations j .
Assume now that the covariance matrix
ij = Cov(k , j ) = E[k j ]
is not necessarily diagonal. A similar computation leads in this case to the

formula

n
2
E[( )Sj ] = c0 + ck kj .
k=1
The orthogonality formula becomes

n
ck kj = c0 2 ,
k=1
which is a linear system in ck . Under the assumption that the matrix (kj ) is
non-singular, let (kj ) be its inverse matrix, so the solution of the aforemen-
tioned linear system is

n
ck = c0 2 kj , k = 1, , n. (11.9.50)
j=1
Substituting into (11.9.48) and solving for the coecient c0 yields

1
c0 = n .
1+ 2 k,j=1
kj
Then formula (11.9.50) provides the other coecients

n kj
j=1
ck = 2 n , k = 1, , n.
+ k,j=1 kj
Hence, the best estimator of in terms of the covariance matrix is given by

n

n
j=1
kj
= + Sk .
1 + 2 nk,j=1 kj 2 + nk,j=1 kj
k=1
Example 11.9.4 Consider that the observed state is the blood cholesterol
level, Ct , which is given by the stochastic equation (11.7.36)
dCt = a(C0 Ct )dt + bEdt + dWt , (11.9.51)

where a is a production parameter, b is the absorption parameter, and is the

size of the noise in the measurement. The observation process is given by
dQt = Ct dt + dBt , Q0 = 0.
We need to nd the best estimate C 1t of the cholesterol level Ct , given the

noisy observations Qs , s t. However, Kalman-Bucy lter cannot be applied
directly to (11.9.51). We need to perform rst the substitution
b
t = Ct C0 E,
a
which transforms (11.9.51) into the linear stochastic dierential equation
dt = at dt + dWt .
b
Let Zt = Qt C0 + E t. We note that the -algebras induced by Zt and
a
Qt are the same, FtZ = FtQ , and hence we can consider Zt as an observation
process instead of Qt , satisfying
dZt = t dt + dBt , Z0 = 0.
The Riccati equation associated with the ltering problem having the input
process t and observations Zt is given by
dR(t) 1
= 2aR(t) 2 R2 (t) + 2 , R(0) = 0,
dt
where we used that E[0 ] = ab E and V ar(0 ) = 0. The solution is given by
1 (1 e2Kt )
R(t) = ,
1 12 e2Kt
where

1 = a 2 a2 2 + 2

2 = a 2 + a2 2 + 2

K = a2 + (/)2 .
Solving for the estimation process t , we obtain the formula

t
b t H(s) ds 1 t
t = Ee 0 + 2 e s H(u) du R(s) dZs ,
a 0
where
1
H(s) = a + R(s).
2
Using that
lim R(t) = 2 ,
t
we obtain the following long run behavior for the estimation
t
b 2
lim t = EeKt + 2 eKt eKs dZs .
t a 0
1t and Qt as follows
This can be transformed in terms of C
t
1t = C0 + (1 eKt ) b E 2 (C0 + b E) + 2 eKt
lim C eKs dQs .
t a K 2 a 2 0
Example 11.9.5 A modern application of ltering theory is in GPS location.

The input signal is the true position, while the observation process is the GPS
position. The position is measured by an inertial navigation system (INS).
This has transducers that measure acceleration, which leads to the vehicle po-
sition by double integration and specifying its initial position. Since there are
always errors in the acceleration measurement, this leads to noise in the posi-
tion location.
The INS position can be checked often by observing the GPS receiver.
Hence, the INS position estimate can be corrected by using periodic GPS ob-
servations, which are also susceptible to error of measurement. Finding the
optimal way to use the GPS measurements in order to correct the INS esti-
mates is a stochastic ltering problem, see Bain and Crisan [4].
Exercise 11.9.6 Consider the ltering problem

1+t
dt = dt + dWt , = 0, = 1
2
1 1
dQt = t dt + dBt , Q0 = 0.
1+t 1+t
(a) Show that the associated Riccati equation is given by
dR(t)
= 1 (1 + t)Rt Rt2 , R(0) = 1.
dt
(b) Verify that the solution of the Riccati equation is
1
R(t) = .
1+t
(b) Compute the best estimator for t and show that

1 1 (t+t2 /2) t 1 2
t = e 2 (1 + s)e 2 (s+s /2) dQs .
1+t 0
Exercise 11.9.7 In a Kalman-Bucy lter the state and the observations Qt

are given by dt = dWt , with 0 being normally distributed with mean 1 and
variance 2, and dQt = t dt + 2dBt , with Q0 = 0. Find the best estimate t
for t .
Chapter 12
Hints and Solutions
Here we give the hints and solutions to selected exercises. The reader should
be able to derive the solutions to the rest based on what he has learnt from
the examples in the chapters.
Chapter 2
Exercise 2.9.7 (a) For any random variables A,B, and variable R, inte-
grating in the inequality
2
A() + B() 0,
implies 2
A() + B() dP () 0.

After expanding and collecting the powers of , this can be written as

2 2
A () dP () + 2 A()B() dP () + B 2 () dP () 0.

Substituting

a= A2 () dP (), b = 2 A()B() dP () , c = B 2 () dP (),

the previous inequality becomes
a2 + b + c 0, R.
This occurs when b2 4ac, which in this case becomes
E[AB]2 E[A2 ]E[B 2 ].
277
(b) Substitute A = X X and B = Y B and obtain

E[(X X )(Y Y )] E[(X X )2 ] E[(Y Y )2 ]
Cov(X, Y ) X Y .
(c) They are proportional.

Exercise 2.10.1 Let X N (, 2 ). Then the distribution function of Y is
y
FY (y) = P (Y < y) = P (X + < y) = P X <

2
y
1 (x)2 1 y z(+)

= e 22 dx = e 22 2 dz
2 2
y (z )2
1
= e 2( )2 dz,
2
with = + , = .
2 2 /2
Exercise 2.10.5 (a) Making t = n yields E[Y n ] = E[enX ] = en+n .
(b) Let n = 1 and n = 2 in (a) to get the rst two moments and then use the
formula of variance.
Exercise 2.12.7 The tower property

E E[X|G]|H = E[X|H], HG
is equivalent to

E[X|G] dP = X dP, A H.
A A
Since A G, the previous relation holds by the denition of E[X|G].

Exercise 2.12.8 (a) || = 24 = 16;
(b) P (A) = 3/8, P (B) = 1/2, P (C) = 1/4;
3 3
(c) P (A B) = 16 , P (B C) = 16 ;
3
(d) P (A)P (B) = 16 = P (A B), so A, B independent;
2 3
(e) P (B)P (C) = 16 = 16 = P (B C), so B, C independent; P (B|C) = 34 ;
(f ) we know the outcomes of the rst two tosses but we do not know the order
of the last two tosses;
(g) A G (true), B F (true), C G (true);
(i) E[X] = 0, E[Y ] = 1516 , E[X|G] = X, since X is G-measurable.
Exercise 2.12.9 (a) Direct application of the denition.

(b) P (A) = A dP = A () dP () = E[A ].
Hints and Solutions 279
(d) E[A X] = E[A ]E[X] = P (A)E[X].

(e) We have the sequence of equivalencies

E[X|G] = E[X] E[X] dP = X dP, A G
A A
E[X]P (A) = X dP E[X]P (A) = E[A ],

A
which follows from (d).

Exercise 2.13.5 If = E[X] then
E[(X E[X])2 ] = E[X 2 2X + 2 ] = E[X 2 ] 22 + 2

= E[X 2 ] E[X]2 = V ar[X].
Exercise 2.13.6 From Exercise 2.13.5 we have V ar(X) = 0 X = E[X], i.e.

X is a constant.
Exercise 2.13.7 The same proof as the one in Jensens inequality.
Exercise 2.13.8 It follows from Jensens inequality or using properties of
integrals.
k e
= e(e 1) ;
t
Exercise 2.13.13 (a) m(t) = E[etX ] = k etk k!
(b) It follows from the rst Cherno bound.
Exercise 2.13.16 Choose f (x) = x2k+1 and g(x) = x2n+1 .
Exercise 2.14.2 By direct computation we have
E[(X Y )2 ] = E[X 2 ] + E[Y 2 ] 2E[XY ]

= V ar(X) + E[X]2 + V ar[Y ] + E[Y ]2 2E[X]E[Y ]
+2E[X]E[Y ] 2E[XY ]
= V ar(X) + V ar[Y ] + (E[X] E[Y ])2 2Cov(X, Y ).
Exercise 2.14.3 (a) Since
E[(X Xn )2 ] E[X Xn ]2 = (E[Xn ] E[X])2 0,
the Squeeze Theorem yields limn (E[Xn ] E[X]) = 0.

(b) Writing
Xn2 X 2 = (Xn X)2 2X(X Xn ),
and taking the expectation we get
E[Xn2 ] E[X 2 ] = E[(Xn X)2 ] 2E[X(X Xn )].

The right side tends to zero since
E[(Xn X)2 ] 0

|E[X(X Xn )]| |X(X Xn )| dP

1/2 1/2
X 2 dP (X Xn )2 dP

= E[X 2 ]E[(X Xn )2 ] 0.
(c) It follows from part (b).

(d) Apply Exercise 2.14.2.
Exercise 2.14.4 Using Jensens inequality we have
2 2
E E[Xn |H] E[X|H] = E E[Xn X|H]

E E[(Xn X)2 |H]
= E[(Xn X)2 ] 0,
as n 0.
Exercise 2.16.7 The integrability of Xt follows from

E[|Xt |] = E |E[X|Ft ]| E E[|X| |Ft ] = E[|X|] < .
Xt is Ft -predictable by the denition of the conditional expectation. Using

the tower property yields

E[Xt |Fs ] = E E[X|Ft ]|Fs = E[X|Fs ] = Xs , s < t.
Exercise 2.16.8 Since
E[|Zt |] = E[|aXt + bYt + c|] |a|E[|Xt |] + |b|E[|Yt |] + |c| <
then Zt is integrable. For s < t, using the martingale property of Xt and Yt

we have
E[Zt |Fs ] = aE[Xt |Fs ] + bE[Yt |Fs ] + c = aXs + bYs + c = Zs .
Exercise 2.16.9 In general the answer is no for both (a) and (b). For instance,
if Xt = Yt the process Xt2 is not a martingale, since the Jensens inequality
2
E[Xt2 |Fs ] E[Xt |Fs ] = Xs2
is not necessarily an identity. For instance Bt2 is not a martingale, with Bt the
Brownian motion process.
Exercise 2.16.10 It follows from the identity
E[(Xt Xs )(Yt Ys )|Fs ] = E[Xt Yt Xs Ys |Fs ].
Exercise 2.16.11 (a) Let Yn = Sn E[Sn ]. We have
Yn+k = Sn+k E[Sn+k ]

k
k
= Yn + Xn+j E[Xn+j ].
j=1 j=1
Using the properties of expectation we have

k
k
E[Yn+k |Fn ] = Yn + E[Xn+j |Fn ] E[E[Xn+j ]|Fn ]
j=1 j=1

k
k
= Yn + E[Xn+j ] E[Xn+j ]
j=1 j=1
= Yn .
(b) Let Zn = Sn2 V ar(Sn ). The process Zn is an Fn -martingale i
E[Zn+k Zn |Fn ] = 0.
Let U = Sn+k Sn . Using the independence we have

2

Zn+k Zn = (Sn+k Sn2 ) V ar(Sn+k V ar(Sn )

= (Sn + U )2 Sn2 V ar(Sn+k V ar(Sn )
= U 2 + 2U Sn V ar(U ),
so
E[Zn+k Zn |Fn ] = E[U 2 ] + 2Sn E[U ] V ar(U )

= E[U 2 ] (E[U 2 ] E[U ]2 )
= 0,
since E[U ] = 0.
Exercise 2.16.12 Let Fn = (Xk ; k n). Using the independence
E[|Pn |] = E[|X0 |] E[|Xn |] < ,

so |Pn | integrable. Taking out the predictable part we have
E[Pn+k |Fn ] = E[Pn Xn+1 Xn+k |Fn ] = Pn E[Xn+1 Xn+k |Fn ]

= Pn E[Xn+1 ] E[Xn+k ] = Pn .
Exercise 2.16.13 (a) Since the random variable Y = X is normally dis-

tributed with mean and variance 2 2 , then
1 2 2
E[eX ] = e+ 2
.
Hence E[eX ] = 1 i + 12 2 2 = 0 which has the nonzero solution =

2/ 2 .
(b) Since eXi are independent, integrable and satisfy E[eXi ] = 1, by Exer-
cise 2.16.12 we get that the product Zn = eSn = eX1 eXn is a martingale.
Chapter 3
Exercise 3.1.4 Bt starts at 0 and is continuous in t. By Proposition 3.1.2
Bt is a martingale with E[Bt2 ] = t < . Since Bt Bs N (0, |t s|), then
E[(Bt Bs )2 ] = |t s|.
Exercise 3.1.9 It is obvious that Xt = Wt+t0 Wt0 satises X0 = 0 and
that Xt is continuous in t. The increments are normal distributed Xt Xs =
Wt+t0 Ws+t0 N (0, |t s|). If 0 < t1 < < tn , then 0 < t0 < t1 + t0 <
< tn + t0 . The increments Xtk+1 Xtk = Wtk+1 +t0 Wtk +t0 are obviously
independent and stationary.
Exercise 3.1.10 Let s < t. Then we have
1 1
Xt Xs = (Wt Ws ) N 0, (t s) = N (0, t s).

The other properties are obvious.
Exercise 3.1.11 Apply Property 3.1.7.
Exercise 3.1.12 Using the moment generating function, we get E[Wt3 ] = 0,
E[Wt4 ] = 3t2 .
Exercise 3.1.13 (a) Let s < t. Then

E[(Wt2 t)(Ws2 s)] = E E[(Wt2 t)(Ws2 s)]|Fs

= E (Ws2 s)E[(Wt2 t)]|Fs

= E (Ws2 s)2 = E Ws4 2sWs2 + s2
= E[Ws4 ] 2sE[Ws2 ] + s2 = 3s2 2s2 + s2 = 2s2 .
(b) Using part (a) we have
2s2 = E[(Wt2 t)(Ws2 s)]

= E[Ws2 Wt2 ] sE[Wt2 ] tE[Ws2 ] + ts
= E[Ws2 Wt2 ] st.
Therefore E[Ws2 Wt2 ] = ts + 2s2 .

(c) Cov(Wt2 , Ws2 ) = E[Ws2 Wt2 ] E[Ws2 ]E[Wt2 ] = ts + 2s2 ts = 2s2 .
2s2
(d) Corr(Wt2 , Ws2 ) = 2ts = st , where we used
V ar(Wt2 ) = E[Wt4 ] E[Wt2 ]2 = 3t2 t2 = 2t2 .
Exercise 3.1.14 (a) The distribution function of Yt is given by
F (x) = P (Yt x) = P (tW1/t x) = P (W1/t x/t)

x/t x/t
2
= 1/t (y) dy = t/(2)ety /2 dy
0 0

x/ t
1 2
= eu /2 du.
0 2
(b) The probability density of Yt is obtained by dierentiating F (x)

x/ t
d 1 2 1 2
p(x) = F (x) = eu /2 du = ex /2 ,
dx 0 2 2
and hence Yt N (0, t).
(c) Using that Yt has independent increments we have
Cov(Ys , Yt ) = E[Ys Yt ] E[Ys ]E[Yt ] = E[Ys Yt ]

= E Ys (Yt Ys ) + Ys2 = E[Ys ]E[Yt Ys ] + E[Ys2 ]
= 0 + s = s.
(d) Since
Yt Ys = (t s)(W1/t W0 ) s(W1/s W1/t )
E[Yt Ys ] = (t s)E[W1/t ] sE[W1/s W1/t ] = 0,
and
1 1 1
V ar(Yt Ys ) = E[(Yt Ys )2 ] = (t s)2 + s2 ( )
t s t
(t s)2 + s(t s)
= = t s.
t
Exercise 3.1.15 (a) Applying the denition of expectation we have

1 x2 1 x2
E[|Wt |] = |x| e 2t dx = 2x e 2t dx
2t 0 2t

1 y
= e 2t dy = 2t/.
2t 0
(b) Since E[|Wt |2 ] = E[Wt2 ] = t, we have
2t 2
V ar(|Wt |) = E[|Wt |2 ] E[|Wt |]2 = t =t 1 .

Exercise 3.1.16 By the martingale property of Wt2 t we have
E[Wt2 |Fs ] = E[Wt2 t|Fs ] + t = Ws2 + t s.
Exercise 3.1.17 (a) Expanding
(Wt Ws )3 = Wt3 3Wt2 Ws + 3Wt Ws2 Ws3
and taking the expectation
E[(Wt Ws )3 |Fs ] = E[Wt3 |Fs ] 3Ws E[Wt2 ] + 3Ws2 E[Wt |Fs ] Ws3
= E[Wt3 |Fs ] 3(t s)Ws Ws3 ,
so
E[Wt3 |Fs ] = 3(t s)Ws + Ws3 ,
since
E[(Wt Ws )3 |Fs ] = E[(Wt Ws )3 ] = E[Wts
3
] = 0.
(b) Hint: Start from the expansion of (Wt Ws )4 .

Exercise 3.2.3 Using that eWt Ws is stationary, we have
1
E[eWt Ws ] = E[eWts ] = e 2 (ts) .
Exercise 3.2.4 (a)
E[Xt |Fs ] = E[eWt |Fs ] = E[eWt Ws eWs |Fs ]

= eWs E[eWt Ws |Fs ] = eWs E[eWt Ws ]
= eWs et/2 es/2 .
(b) This can also be written as
E[et/2 eWt |Fs ] = es/2 eWs ,

which shows that et/2 eWt is a martingale.

(c) From the stationarity we have
1 2
E[ecWt cWs ] = E[ec(Wt Ws ) ] = E[ecWts ] = e 2 c (ts)
.
Then for any s < t we have
E[ecWt |Fs ] = E[ec(Wt Ws ) ecWs |Fs ] = ecWs E[ec(Wt Ws ) |Fs ]

1 2 1 2
= ecWs E[ec(Wt Ws ) ] = ecWs e 2 c (ts)
= Ys e 2 c t .
1 2
Multiplying by e 2 c t
yields the desired result.
Exercise 3.2.5 (a) Using Exercise 3.2.3 we have
Cov(Xs , Xt ) = E[Xs Xt ] E[Xs ]E[Xt ] = E[Xs Xt ] et/2 es/2

= E[eWs +Wt ] et/2 es/2 = E[eWt Ws e2(Ws W0 ) ] et/2 es/2
= E[eWt Ws ]E[e2(Ws W0 ) ] et/2 es/2 = e e2s et/2 es/2
ts
2
e(t+s)/2 .
t+3s
= e 2
(b) Using Exercise 3.2.4 (b), we have

E[Xs Xt ] = E E[Xs Xt |Fs ] = E Xs E[Xt |Fs ]

= et/2 E Xs E[et/2 Xt |Fs ] = et/2 E Xs es/2 Xs ]
= e(ts)/2 E[Xs2 ] = e(ts)/2 E[e2Ws ]
= e(ts)/2 e2s = e
t+3s
2 ,
and continue like in part (a).

Exercise 3.2.6 Using the denition of the expectation we have

2 2 1 2 x2
E[e2Wt ] = e2x t (x) dx = e2x e 2t dx
2t
1 14t 2 1
= e 2t x dx = ,
2t 1 4t
1 4t > 0.
if

Otherwise, the integral is innite. We used the standard integral
2
eax = /a, a > 0.
Exercise 3.3.4 (a) It follows from the fact that Zt is normally distributed;
(b) Dierentiate the moment generating function and evaluate it at u = 0.
Exercise 3.3.5 Using the denition of covariance we have

Cov Zs , Zt = E[Zs Zt ] E[Zs ]E[Zt ] = E[Zs Zt ]
s t s t
= E Wu du Wv dv = E Wu Wv dudv
0 0 0 0
s t s t
= E[Wu Wv ] dudv = min{u, v} dudv
0 0 0 0
t s
= s2 , s < t.
2 6
Exercise 3.3.6 (a) Using Exercise 3.3.5
Cov(Zt , Zt Zth ) = Cov(Zt , Zt ) Cov(Zt , Zth )
t3 t th
= (t h)2 ( )
3 2 6
1 2
= t h + o(h).
2
t
(b) Using Zt Zth = th Wu du = hWt + o(h),
1
Cov(Zt , Wt ) = Cov(Zt , Zt Zth )
h
1 1 2 1
= t h + o(h) = t2 .
h 2 2
Exercise 3.3.7 Let s < u. Since Wt has independent increments, taking the
expectation in
eWs +Wt = eWt Ws e2(Ws W0 )
we obtain
E[eWs +Wt ] = E[eWt Ws ]E[e2(Ws W0 ) ] = e e2s
us
2
= e 2 es = e 2 emin{s,t} .
u+s u+s
t t
Exercise 3.3.8 (a) E[Xt ] = 0 E[eWs ] ds = 0 E[es/2 ] ds = 2(et/2 1)
(b) Since V ar(Xt ) = E[Xt2 ] E[Xt ]2 , it suces to compute E[Xt2 ]. Using
Exercise 3.3.7 we have
t t t t
2
E[Xt ] = E e ds
Wt
e du = E
Wu
eWs eWu dsdu
0 0 0 0
t t t t
E[eWs +Wu ] dsdu = e 2 emin{s,t} dsdu
u+s
=
0 0 0 0
u+s u+s
s
= e 2 e duds + e 2 eu duds
D
1 D2
1
4 3
e2t 2et/2 +
u+s
= 2 e 2 eu duds = ,
D2 3 2 2
where D1 {0 s < u t} and D2 {0 u < s t}. In the last identity we

applied Fubinis theorem. For nding the variance, use the formula V ar(Xt ) =
E[Xt2 ] E[Xt ]2 .
Exercise 3.3.9 (a) Splitting the integral at t and taking out the measurable
part, we have
T t T
E[ZT |Ft ] = E[ Wu du|Ft ] = E[ Wu du|Ft ] + E[ Wu du|Ft ]
0 0 t
T
= Zt + E[ Wu du|Ft ]
t
T
= Zt + E[ (Wu Wt + Wt ) du|Ft ]
t
T
= Zt + E[ (Wu Wt ) du|Ft ] + Wt (T t)
t
T
= Zt + E[ (Wu Wt ) du] + Wt (T t)
t
T
= Zt + E[Wu Wt ] du + Wt (T t)
t
= Zt + Wt (T t),
since E[Wu Wt ] = 0.
(b) Let 0 < t < T . Using (a) we have

E[ZT T WT |Ft ] = E[ZT |Ft ] T E[WT |Ft ]
= Zt + Wt (T t) T Wt
= Zt tWt .
Exercise 3.3.10 (a)

T T
2
E Ws ds|Ft = E (Ws Wt + Wt )2 ds|Ft
t t
T
= E (Ws Wt )2 ds|Ft
t
T T
+2E Wt (Ws Wt ) ds|Ft + E Wt2 ds|Ft
t t
T
= E (Ws Wt )2 ds
t
T
+2Wt E (Ws Wt ) ds + E Wt2 (T t)|Ft
t
T t T t
= E Wu2 du + 2Wt E Wu du + Wt2 (T t)
0 0
1
= (T t)2 + Wt2 (T t).
2
(b) Using part (a) we have
T 1
E YT |Ft = E Ws2 ds T WT2 + T 2 |Ft
t 2
T 1
= E Ws2 ds|Ft T E WT2 |Ft + T 2
t 2
t
1 1
= Ws2 ds + (T t)2 + Wt2 (T t) T Wt2 T (T t) + T 2
0 2 2
t
1
= Ws2 ds + tWt2 + t2 = Ys .
0 2
Exercise 3.4.1
t T
E[VT |Ft ] = E e 0 Wu du+ t Wu du |Ft
t T
= e 0 Wu du E e t Wu du |Ft
t T
= e 0 Wu du E e t (Wu Wt ) du+(T t)Wt |Ft
T
= Vt e(T t)Wt E e t (Wu Wt ) du |Ft
T
= Vt e(T t)Wt E e t (Wu Wt ) du
T t
= Vt e(T t)Wt E e 0 W d
3
1 (T t)
= Vt e(T t)Wt e 2 3 .
Exercise 3.6.1
F (x) = P (Yt x) = P (t + Wt x) = P (Wt x t)

xt
1 u2
= e 2t du;
0 2t
1 (xt)2
f (x) = F (x) = e 2t .
2t

1 x2
P (Rt ) = xe 2t dx,
0 t
use the inequality

x2 x2
1 < e 2t < 1
2t
to get the desired result.
Exercise 3.7.3 (a) The mean can be evaluated as

1 2 x2
E[Rt ] = xpt (x) dx = x e 2t dx
0 0 t
3 1 z
1 1/2 y
= y e 2t dy = 2t z 2 e dz
2t 0 0

3 2t
= 2t = .
2 2
(b) Since E[Rt2 ] = E[W1 (t)2 + W2 (t)2 ] = 2t, then
2t
V ar(Rt ) = 2t = 2t 1 .
4 4
Exercise 3.7.4

E[Xt ] 2t
E[Xt ] = = = 0, t ;
t 2t 2t
1 2
V ar(Xt ) = V ar(Rt ) = (1 ) 0, t .
t2 t 4
By Proposition 2.14.1 we get Xt 0, t in mean square.
Exercise 3.8.2
P (Nt Ns = 1) = (t s)e(ts)

= (t s) 1 (t s) + o(t s)
= (t s) + o(t s).
P (Nt Ns 1) = 1 P (Nt Ns = 0) + P (Nt Ns = 1)

= 1 e(ts) (t s)e(ts)

= 1 1 (t s) + o(t s)

(t s) 1 (t s) + o(t s)
= 2 (t s)2 = o(t s).
Exercise 3.8.6 Write rst as

Nt2 = Nt (Nt Ns ) + Nt Ns
= (Nt Ns )2 + Ns (Nt Ns ) + (Nt t)Ns + tNs ,
then
E[Nt2 |Fs ] = E[(Nt Ns )2 |Fs ] + Ns E[Nt Ns |Fs ] + E[Nt t|Fs ]Ns + tNs
= E[(Nt Ns )2 ] + Ns E[Nt Ns ] + E[Nt t]Ns + tNs
= (t s) + 2 (t s)2 + tNs + Ns2 sNs + tNs
= (t s) + 2 (t s)2 + 2(t s)Ns + Ns2
= (t s) + [Ns + (t s)]2 .
Hence E[Nt2 |Fs ] = Ns2 and hence the process Nt2 is not an Fs -martingale.
Exercise 3.8.7 (a)

mNt (x) = E[exNt ] = exk P (Nt = k)
k0
k tk
= exk et
k!
k0
= et ete = et(e
x x 1)
.
(b) E[Nt2 ] = mNt (0) = 2 t2 + t. Similarly for the other relations.

Exercise 3.8.8 E[Xt ] = E[eNt ] = mNt (1) = et(e1) .
Exercise 3.8.9 (a) Since exMt = ex(Nt t) = etx exNt , the moment generat-
ing function is
mMt (x) = E[exMt ] = etx E[exNt ]

= etx et(e
x 1) x x1)
= et(e .
(b) For instance

E[Mt3 ] = m
Mt (0) = t.
Since Mt is a stationary process, E[(Mt Ms )3 ] = (t s).

Exercise 3.8.10
V ar[(Mt Ms )2 ] = E[(Mt Ms )4 ] E[(Mt Ms )2 ]2

= (t s) + 32 (t s)2 2 (t s)2
= (t s) + 22 (t s)2 .
t t
t t2
Exercise 3.11.3 (a) E[Ut ] = E 0 Nu du = 0 E[Nu ] du = 0 u du = 2 .

Nt
t2 t2
(b) E Sk = E[tNt Ut ] = tt = .
2 2
k=1
Exercise 3.11.4 (a) Use associativity of addition.

(b) Use that T1 , , Tn are independent, exponentialy distributed, with E[Tk ] =
1/.
Exercise 3.11.6 The proof cannot go through because a product between a
constant and a Poisson process is not a Poisson process.
Exercise 3.11.7 Let pX (x) be the probability density of X. If p(x, y) is the
joint probability density of X and Y , then pX (x) = y p(x, y). We have

E[X|Y = y]P (Y = y) = xpX|Y =y (x|y)P (Y = y) dx
y0 y0
xp(x, y)
= P (Y = y) dx
P (Y = y)
y0

= x p(x, y) dx
y0

= xpX (x) dx = E[X].
Exercise 3.11.9 (a) Since Tk has an exponential distribution with parameter

E[eTk ] = ex ex dx = .
0 +
(b) We have
Ut = T2 + 2T3 + 3T4 + + (n 2)Tn1 + (n 1)Tn + (t Sn )n
= T2 + 2T3 + 3T4 + + (n 2)Tn1 + (n 1)Tn + nt
n(T1 + T2 + + Tn )
= nt [nT1 + (n 1)T2 + + Tn ].
(c) Using that the arrival times Sk , k = 1, 2, . . . n, have the same distribution
as the order statistics U(k) corresponding to n independent random variables
uniformly distributed on the interval (0, t), we get
Nt

E eUt Nt = n = E[e(tNt k=1 Sk ) |Nt = n]
n n
= ent E[e i=1 U(i)
] = ent E[e i=1 Ui
]
nt
= e E[e ] E[e
U1
] Un
t
1 1 t xn
= ent ex1 dx1 e dxn
t 0 t 0
(1 et )n
=
n tn
(d) Using Exercise 3.11.7 we have

E eUt ] = P (Nt = n)E eUt Nt = n
n0
e tn n (1 et )n
=
n! n tn
n0
t )/
= e(1e
Exercise 3.12.6 Use Doobs inequalityfor the submartingales Wt2 and |Wt |,
and use that E[Wt2 ] = t and E[|Wt |] = 2t/, see Exercise 3.1.15 (a).
Exercise 3.12.7 Divide by t in the inequality from Exercise 3.12.6 part (b).
Exercise 3.12.10 Let = n and = n + 1. Then
N 2 4(n + 1)
t
E sup
ntn=1 t n2
The result follows by taking n in the sequence of inequalities

N 2 N 2 4(n + 1)
t t
0E E sup
t ntn=1 t n2
Chapter 4
Exercise 4.1.2 We have

, if c t
{; () t} =
, if c > t
and use that , Ft .

Exercise 4.1.3 First we note that

{; () t} = {; |Ws ()| K}. (12.0.1)
0<s<t
This can be shown by double inclusion. Let As = {; |Ws ()| K}.

Let {; () t}, so inf{s > 0; |Ws ()| K} t. Then exists
|Wu ()| K, and hence Au .
u t such that
Let 0<s<t {; |Ws ()| K}. Then there is 0 < s < t such that
|Ws ()| K. This implies () s and since s t it follows that () t.
Since Wt is continuous, then (12.0.1) can also be written as

{; () t} = {; |Wr ()| K},
0<r<t,rQ
which implies {; () t} Ft since {; |Wr ()| K} Ft , for 0 < r t.

Hence is a stopping time. It is worth noting that P ( < ) = 1.

P ({; () < }) = P {; |Ws ()| > K} > P ({; |Ws ()| > K})
0<s

1 y2 2K
= 1 e 2s dy > 1 1, s .
|x|<K 2s 2s
1 1
Exercise 4.1.4 Let Km = [a + m, b m ]. We can write

{; t} = {; Xr
/ Km } Ft ,
m1 r<t,rQ
since {; Xr
/ Km } = {; Xr K m } Fr Ft .
Exercise 4.1.6 (a) No. The event A = {; Wt () has a local maximum at time t}
is not in {Ws ; s t} but is in {Ws ; s t + }.
Exercise 4.1.9 (a) We have {; c t} = {; t/c} Ft/c Ft .

(b) {; f ( ) t} = {; f 1 (t)} = Ff 1 (t) Ft , since f 1 (t) t.
(c) Apply (b) with f (x) = ex .
2
Exercise 4.1.11 If we let G(n) = {x; |x a| < n1 }, then {a} = n1 G(n).
Then n = inf{t 0; Wt G(n)} are stopping times. Since supn n = , then
is a stopping time.
Exercise 4.2.3 The relation is proved by verifying two cases:

(i) If {; > t} then ( t)() = t and the relation becomes
M () = Mt () + M () Mt ().
(ii) If {; t} then ( t)() = () and the relation is equivalent to

the obvious relation
M = M .
Exercise 4.2.5 Taking the expectation in E[M |F ] = M yields E[M ] =

E[M ], and then make = 0.
Exercise 4.3.9 Since Mt = WT2 t is a martingale with E[Mt ] = 0, by the

Optional Stopping Theorem we get E[Ma ] = E[M0 ] = 0, so E[W2a a ] = 0,
from where E[a ] = E[W2a ] = a2 , since Wa = a.
Exercise 4.3.10 (a) We have
F (a) = P (Xt a) = 1 P (Xt > a) = 1 P ( max Ws > a)

0st

2 y 2 /2
= 1 P (Ta t) = 1 e dy
2 |a|/ t

2 y 2 /2 2 y 2 /2
= e dy e dy
2 0 2 |a|/ t
|a|/t
2 2
= ey /2 dy.
2 0
2
(b) The density function is p(a) = F (a) = 2 ea /(2t) , a > 0. Then
2t

2 x2 /(2t) 2t
E[Xt ] = xp(x) dx = xe dy =
0 2t 0

2 2 4t 2
E[Xt2 ] = x2 ex /(2t) dx = y 2 ey dy
2t 0 0

2 1 1/2 u 1
= u e du = (3/2) = t
2t 2 0 2t
2
V ar(Xt ) = E[Xt2 ] E[Xt ]2 = t 1 .

2
2 e 4t
x
Exercise 4.3.11 (b) h(x) = t
2N x
2t
1 .
Exercise 4.3.16 It is recurrent since P (t > 0 : a < Wt < b) = 1.

1 1 t1
P (Wt > 0; t1 t t2 ) = P (Wt =
0; t1 t t2 ) = arcsin ,
2 t2
using the independence

1 t1 2
P (Wt1 > 0, Wt2 ) = P (Wt1 > 0)P (Wt2 > 0) = 2 arcsin .
t2
The probability

for Wt = (Wt1 , Wt2 ) to be in one of the quadrants is equal to
4 t1 2
arcsin .
2 t2
P (Xt goes up to ) = P (Xt goes up to before down to )

e2 1
= lim 2 = 1.
e e2
Exercise 4.5.3
P (Xt never hits ) = P (Xt goes up to before down to )

e2 1
= lim 2 .
e e2
Exercise 4.5.4 (a) Use that E[XT ] = p (1 p ); (b) E[XT2 ] = 2 p +

e2 1
2 (1 p ), with p = e2 e2
; (c) Use V ar(T ) = E[T 2 ] E[T ]2 .
Exercise 4.5.7 Since Mt = Wt2 t is a martingale, with E[Mt ] = 0, by the
Optional Stopping Theorem we get E[WT2 T ] = 0. Using WT = XT T
yields
E[XT2 2T XT + 2 T 2 ] = E[T ].
Then
E[T ](1 + 2E[XT ]) E[XT2 ]
E[T 2 ] = .
2
Substitute E[XT ] and E[XT2 ] from Exercise 4.5.4 and E[T ] from Proposition
4.5.5.
Exercise 4.6.11 See the proof of Proposition 4.6.3.
Exercise 4.6.12
d 1 E[XT ]
E[T ] = E[esT ]|s=0 = (p + (1 p )) = .
ds
Exercise 4.6.14 (b) Applying the Optional Stopping Theorem
E[ecMT T (e
c c1)
] = E[X0 ] = 1
E[ecaT f (c) ] = 1
E[eT f (c) ] = eac .
Let s = f (c), so c = (s). Then E[esT ] = ea(s) .

(c) Dierentiating and taking s = 0 yields
E[T ] = aea(0) (0)

1
= a = ,
f (0)
so E[T ] = .

(d) The inverse Laplace transform L1 ea(s) cannot be represented by
elementary functions.
Exercise 4.9.9 Use E[(|Xt | 0)2 ] = E[|Xt |2 ] = E[(Xt 0)2 ].

Exercise 4.10.3 (a) L = 1.
(b) A computation shows
E[(Xt 1)2 ] = E[Xt2 2Xt + 1] = E[Xt2 ] 2E[Xt ] + 1

= V ar(Xt ) + (E[Xt ] 1)2 .
(c) Since E[Xt ] = 1, we have E[(Xt 1)2 ] = V ar(Xt ). Since
V ar(Xt ) = et E[eWt ] = et (e2t et ) = et 1,
then E[(Xt 1)2 ] does not tend to 0 as t .

Exercise 4.11.4 Since Mt and Nt are martingales, then Mt + Nt and Mt Nt
are also martingales. Then (Mt +Nt )2 M +N, M +N t and (Mt Nt )2 M
N, M N t are martingales. Subtracting, yields Mt Nt M, N t martingale.
Exercise 4.11.8 (a) E[(dWt )2 dt2 ] = E[(dWt )2 ] dt2 = 0.
(b) V ar((dWt )2 dt) = E[(dWt2 dt)2 ] = E[(dWt )4 2dtdWt + dt2 ]

= 3st2 2dt 0 + dt2 = 4dt2 .
Exercise 4.12.2 (a) dt dNt = dt(dMt + dt) = dt dMt + dt2 = 0

(b) dWt dNt = dWt (dMt + dt) = dWt dMt + dWt dt = 0.
Chapter 5
Exercise 5.2.3 (a) Use either the denition or the moment generation func-
tion to show that E[Wt4 ] = 3t2 . Using stationarity, E[(Wt Ws )4 ] = E[Wts
4
]=
2
3(t s) .
T
Exercise 5.4.1 (a) E[ dWt ] = E[WT ] = 0.
T 0
1 1
(b) E[ Wt dWt ] = E[ WT2 T ] = 0.
0 2 2
T
T
1 1 1 T2
(c) V ar( Wt dWt ) = E[( Wt dWt )2 ] = E[ WT2 + T 2 T WT2 ] = .
0 0 4 4 2 2
T
Exercise 5.6.3 X N (0, 1 1t dt) = N (0, ln T ).
T
Exercise 5.6.4 Y N (0, 1 tdt) = N 0, 12 (T 2 1)
Exercise
t 5.6.5 Normally distributed with zero mean and variance
1
e2(ts) ds = (e2t 1).
0 2
Exercise 5.6.6 Using the property of Wiener integrals, both integrals have
7t3
zero mean and variance .
3
Exercise 5.6.7 The mean is zero and the variance is t/3 0 as t 0.
Exercise 5.6.8 Since it is a Wiener integral, Xt is normally distributed with
zero mean and variance
t
bu 2 b2
a+ du = (a2 + + ab)t.
0 t 3
b2
Hence a2 + 3 + ab = 1.
t
Exercise 5.6.9 Since both Wt and 0 f (s) dWs have the mean equal to zero,
t t t t
Cov Wt , f (s) dWs = E[Wt , f (s) dWs ] = E[ dWs f (s) dWs ]
0 0 0 0
t t
= E[ f (u) ds] = f (u) ds.
0 0
The general result is

t t
Cov Wt , f (s) dWs = f (s) ds.
0 0
Choosing f (u) = un yields the desired identity.

Exercise 5.8.6 Apply the expectation to

Nt 2
Nt
Nt
f (Sk ) = f 2 (Sk ) + 2 f (Sk )f (Sj ).
k=1 k=1 k=j
Exercise 5.9.1 We have

T kT
E eks dNs = (e 1)
0 k
T 2kT
V ar eks dNs = (e 1).
0 2k
Chapter 6
t
Exercise 6.1.6 Let Xt = 0 eWu du. Then
X tdX X dt teWt dt Xt dt 1 Wt
t t t
dGt = d = = = e Gt dt.
t t2 t2 t
Exercise 6.2.4
(a) eWt (1 + 12 Wt )dt + eWt (1 + Wt )dWt ;
(b) (6Wt + 10e5Wt )dWt + (3 + 25e5Wt )dt;
2 2
(c) 2et+Wt (1 + Wt2 )dt + 2et+Wt Wt dWt ;

(d) n(t + Wt )n2 (t + Wt + n12 )dt + (t + W t )dW t ;

1 1 t
(e) Wt Wu du dt;
t t 0
t
1
(f ) eWt eWu du dt.
t t 0
Exercise 6.2.5
d(tWt2 ) = td(Wt2 ) + Wt2 dt = t(2Wt dWt + dt) + Wt2 dt = (t + Wt2 )dt + 2tWt dWt .
Exercise 6.2.6 (a) tdWt + Wt dt;

(b) et (Wt dt + dWt );
(c) (2 t/2)t cos Wt dt t2 sin Wt dWt ;
(d) (sin t + Wt2 cos t)dt + 2 sin t Wt dWt ;
Exercise 6.2.9 It follows from (6.2.11).
Exercise 6.2.10 Take the conditional expectation in
t
Mt2 = Ms2 +2 Mu dMu + Nt Ns
s
and obtain
t
E[Mt2 |Fs ] = Ms2 + 2E[ Mu dMu |Fs ] + E[Nt |Fs ] Ns
s
= Ms2 + E[Mt + t|Fs ] Ns
= Ms2 + Ms + t Ns
= Ms2 + (t s).
Exercise 6.2.12 Integrating in (6.2.12) yields

t t
f f
Ft = Fs + dWt1 + dWt2 .
s x s y
One can check that E[Ft |Fs ] = Fs .

Exercise 6.2.13 (a) dFt = 2Wt1 dWt1 + 2Wt2 dWt2 + 2dt;
2Wt1 dWt1 + 2Wt2 dWt2
(b) dFt = .
(Wt1 )2 + (Wt2 )2
f
Exercise 6.2.14 Consider the function f (x, y) = x2 + y 2 . Since =
x
x f y 1
, = , f = , we get
2
x +y 2 y 2
x +y 2 2 x + y2
2
f f W1 W2 1
dRt = dWt1 + dWt2 + f dt = t dWt1 + t dWt2 + dt.
x y Rt Rt 2Rt
Chapter 7
Exercise 7.2.4 (a) Use integration formula with g(x) = tan1 (x).
T T T
1 1 1 1 2Wt
dWt = (tan ) (Wt ) dWt = tan WT + dt.
0 1 + Wt2 0 2 0 (1 + Wt )2
1 T
(b) Use E 2 dW t = 0.
0 1 + Wt
x
(c) Use Calculus to nd minima and maxima of the function (x) = .
(1 + x2 )2
Exercise 7.2.5 (a) Use integration by parts with g(x) = ex and get
T T
1
Wt
e dWt = e WT
1 eWt dt.
0 2 0
(b) Applying the expectation we obtain

T
1
E[eWT ] = 1 + E[eWt ] dt.
2 0
If let (T ) = E[eWT ], then satises the integral equation

T
1
(T ) = 1 + (t) dt.
2 0
Dierentiating yields the ODE (T ) = 12 (T ), with (0) = 1. Solving yields

(T ) = eT /2 .
Exercise 7.2.6 (a) Apply integration by parts with g(x) = (x 1)ex to get
T T T
1
Wt eWt dWt = g (Wt ) dWt = g(WT ) g(0) g (Wt ) dt.
0 0 2 0
(b) Applying the expectation yields

1 T Wt
E[WT e WT
] = E[e ] 1 +
WT
E[e ] + E[Wt eWt ] dt
2 0
T
1
= eT /2 1 + et/2 + E[Wt eWt ] dt.
2 0
Then (T ) = E[WT eWT ] satises the ODE (T ) (T ) = eT /2 with (0) = 0.

Exercise 7.2.7 (a) Use integration by parts with g(x) = ln(1 + x2 ).
(e) Since ln(1 + T ) T , the upper bound obtained in (e) is better than the
one in (d), without contradicting it.
Exercise 7.3.3 By straightforward computation.
Exercise 7.3.4 By computation.

Exercise 7.3.15 (a) e2 sin( 2W1 ); (b) 12 e6 sin(2W3 ); (c) 1 (e 2W4 4
2
1).
Exercise 7.3.16 Apply Itos formula to get
1
d(t, Wt ) = (t (t, Wt ) + x2 (t, Wt ))dt + x (t, Wt )dWt
2
= G(t)dt + f (t, Wt ) dWt .
Integrating between a and b yields

b b
(t, Wt )|ba = G(t) dt + f (t, Wt ) dWt .
a a
Chapter 8
t
Exercise 8.2.2 (a) Xt = 1 + sin t 0 sin s dWs , E[Xt ] = 1 + sin t, V ar[Xt ] =
t 2 1
0 (sin s) ds = 2
t
4 sin(2t);
t 2
(b) Xt = e 1 + 0 s dWs , E[Xt ] = et 1, V ar[Xt ] = t2 ;
t
4
(c) Xt = 1 + 12 ln(1 + t2 ) + 0 s3/2 dWs , E[Xt ] = 1 + 12 ln(1 + t2 ), V ar(Xt ) = t4 .
t
Exercise 8.3.6 (a) Xt = 13 Wt3 tWt + et ; (b) Xt = 13 Wt3 tWt cos t;

3 2
(c) Xt = eWt 2 + t3 ; (d) Xt = et/2 sin Wt + t2 + 1.
t
Exercise 8.3.7 (a) Xt = 14 Wt4 + tWt ; (b) Xt = 12 Wt2 + t2 Wt 2t ;

(c) Xt = et Wt cos Wt + 1; (d) Xt = teWt + 2.
Exercise 8.4.5 (a)
dXt = (2Wt dWt + dt) + Wt dt + tdWt

= d(Wt2 ) + d(tWt ) = d(tWt + Wt2 )
so Xt = tWt + Wt2 .
(b) We have
1 1
dXt = (2t Wt )dt + dWt
t2 t
1 1
= 2tdt + d( Wt ) = d(t2 + Wt ),
t t
so Xt = t2 + 1t Wt 1 W1 .
1
(c) dXt = et/2 Wt dt + et/2 dWt = d(et/2 Wt ), so Xt = et/2 Wt .
2
(d) We have
dXt = t(2Wt dWt + dt) tdt + Wt2 dt

1
= td(Wt2 ) + Wt2 dt d(t2 )
2
t2
= d tWt2 ,
2
2
so Xt = tWt t2 .

(e) dXt = dt + d( tWt ) = d(t + tWt ), so Xt = t + tWt W1 .
t
4t 1 4t
Exercise 8.5.4 (a) Xt = X0 e + (1 e ) + 2 e4(ts) dWs ;
4 0
3t 2 3t 3t
(b) Xt = X0 e + (1 e ) + e Wt ;
3
1 t
(c) Xt = e (X0 + 1 + Wt2 ) 1;
t
2 2
t 1
(d) Xt = X0 e4t (1 e4t ) + e4t Wt ;
4 16
(e) Xt = X0 et/2 2t 4 + 5et/2 et cos Wt ;
(f ) Xt = X0 et + et Wt .
t t
dWs + 12 2 ds
Exercise 8.8.2 (a) The integrating factor is t = e 0 0 =
2
Wt + 2 t
e , which transforms the equation in the exact form d(t Xt ) = 0.
2
Then t Xt = X0 and hence Xt = X0 eWt 2
t
.
2
Wt + 2 t
(b) t = e , d(t Xt ) = t Xt dt, dYt = Yt dt, Yt = Y0 et , t Xt = X0 et ,
2
(1 2 )t+Wt
Xt = X0 e .
Exercise 8.8.3 t dAt = dXt At dt, E[At ] = 0, V ar(At ) = E[A2t ] =

1 t 2 X02
t2 0 E[Xs ] ds = t .
t t
Exercise 8.10.1 Integrating yields Xt = X0 + 0 (2Xs + e2s ) ds + 0 b dWs .
Taking the expectation we get
t
E[Xt ] = X0 + (2E[Xs ] + e2s ) ds.
0
Dierentiating we obtain f (t) = 2f (t) + e2t , where f (t) = E[Xt ], with f (0) =
X0 . Multiplying by the integrating factor e2t yields (e2t f (t)) = 1. Inte-
grating yields f (t) = e2t (t + X0 ).
Exercise 8.10.6 (a) Using product rule and Itos formula, we get
1
d(Wt2 eWt ) = eWt (1 + 2Wt + Wt2 )dt + eWt (2Wt + Wt2 )dWt .
2
Integrating and taking expectations yields
t
1
E[Wt2 eWt ] = E[eWs ] + 2E[Ws eWs ] + E[Ws2 eWs ] ds.
0 2
Since E[eWs ] = et/2 , E[Ws eWs ] = tet/2 , if let f (t) = E[Wt2 eWt ], we get by
dierentiation
1
f (t) = et/2 + 2tet/2 + f (t), f (0) = 0.
2
Multiplying by the integrating factor et/2 yields (f (t)et/2 ) = 1 + 2t. Inte-

grating yields the solution f (t) = t(1 + t)et/2 .
(b) Similar method.
Exercise 8.10.8 (a) Using Exercise 3.1.17
E[Wt4 3t2 |Ft ] = E[Wt4 |Ft ] 3t2

= 3(t s)2 + 6(t s)Ws2 + Ws4 3t2
= (Ws4 3s2 ) + 6s2 6ts + 6(t s)Ws2 = Ws4 3s2 .
Hence Wt4 3t2 is not a martingale.

(b) E[Wt3 |Fs ] = Ws3 + 3(t s)Ws = Ws3 , and hence Wt3 is not a martingale.
Exercise 8.10.10 (a) Similar method as in Example 8.10.9.

(b) Applying Itos formula
1
d(cos(Wt )) = sin(Wt )dWt 2 cos(Wt )dt.
2
2
Let f (t) = E[cos(Wt )]. Then f (t) = 2 f (t), f (0) = 1. The solution is
2
f (t) = e 2 t .
(c) Since sin(t + Wt ) = sin t cos(Wt ) + cos t sin(Wt ), taking the expectation
and using (a) and (b) yields
2
E[sin(t + Wt )] = sin tE[cos(Wt )] = e 2
t
sin t.
(d) Similarly starting from cos(t + Wt ) = cos t cos(Wt ) sin t sin(Wt ).

Exercise 8.10.11 From Exercise 8.10.10 (b) we have E[cos(Wt )] = et/2 .
From the denition of expectation

1 x2
E[cos(Wt )] = cos x e 2t dx.
2t
Then choose t = 1/2 and t = 1 to get (a) and (b), respectively.

Exercise 8.10.12 (a) Using a standard method involving Itos formula we
2
can get E(Wt ebWt ) = bteb t/2 . Let a = 1/(2t). We can write

ax2 +bx 1 x2
xe dx = 2t xebx e 2t dx
2t

bWt
b2 t/2 b b2 /(4a)
= 2tE(Wt e ) = 2tbte = e .
a 2a
The same method for (b) and (c).

Wt2n t2n E[Wt2n ]t2n

E[cos(tWt )] = E (1)n = (1)n
(2n)! (2n)!
n0 n0
t2n (2n)!tn t3n
= (1)n = (1)n
(2n)! 2n n! 2n n!
n0 n0
t3 /2
= e .
(b) Similar computation using E[Wt2n+1 ] = 0.

Chapter 9
n
Exercise 9.2.2 A = 12 = 1
2
2
k=1 k .
X1 (t) = x01 + W1 (t)
t
0
X2 (t) = x2 + X1 (s) dW2 (s)
0
t
0 0
= x2 + x1 W2 (t) + W1 (s) dW2 (s).
0
Exercise 9.5.2 (c) E[ ] is maximum if bx0 = x0 a, i.e. when x0 = (a+b)/2.

The maximum value is (b a)2 /4.
Exercise 9.6.1 Let k = min(k, ) and k . Apply Dynkins formula
for k to show that
1
E[k ] (R2 |a|2 ),
n
and take k .
Exercise 9.6.3 x0 and x2n .
Exercise 9.7.1 (a) We have a(t, x) = x, c(t, x) = x, (s) = xest and
T
u(t, x) = t xest ds = x(eT t 1).
2 2
(b) a(t, x) = tx, c(t, x) = ln x, (s) = xe(s t )/2 and
T
2 2 T t2
u(t, x) = ln xe(s t )/2 ds = (T t) ln x + (T + t) .
t 6 3
Exercise 9.7.2 (a) u(t, x) = x(T t) + 12 (T t)2 .

3
(b) u(t, x) = 23 ex e 2 (T t) 1 .
(c) We have a(t, x) = x, b(t, x) = x, c(t, x) = x. The associated diusion is
dXs = Xs ds + Xs dWs , Xt = x, which is the geometric Brownian motion
1 2 )(st)+(W
Xs = xe( 2 s Wt )
, s t.
The solution is
T 1 2 )(st)+(W

u(t, x) = E xe( 2 s Wt )
ds
t
T 1
= x e( 2 (st))(st) E[e(st) ] ds
t
T 1 2 (st)/2
= x e( 2 (st))(st) e ds
t
T
x (T t)
= x e(st) ds = e 1 .
t
Chapter 10
Exercise 10.1.7 Apply Example 10.1.6 with u = 1.
t t
Exercise 10.1.8 Xt = 0 h(s) dWs N (0, 0 h2 (s) ds). Then eXt is log-
1 1
t
h(s)2 ds
normal with E[eXt ] = e 2 V ar(Xt ) = e 2 0 .
Exercise 10.1.9 (a) Using Exercise 10.1.8 we have
t t
u(s) dWs 12 u(s)2 ds
E[Mt ] = E[e 0 e 0 ]
t t t t
12 u(s)2 ds 1
u(s)2 ds 1
u(s)2 ds
= e 0 E[e 0
u(s) dWs
] = e 2 0 e2 0 = 1.
(b) Similar computation as (a).

Exercise 10.1.10 (a) Applying the product and Itos formulas we get
d(et/2 cos Wt ) = et/2 sin Wt dWt .
Integrating yields
t
et/2
cos Wt = 1 es/2 sin Ws dWs ,
0
which is an Ito integral, and hence a martingale; (b) Similarly.

Exercise 10.1.12 Use that the function f (x1 , x2 ) = ex1 cos x2 satises f =
0.
Exercise 10.1.14 (a) f (x) = x2 ; (b) f (x) = x3 ; (c) f (x) = xn /(n(n 1));
(d) f (x) = ecx ; (e) f (x) = sin(cx).
t a2 s
Exercise 10.3.10 Let Xt = a 0 e 2 dBs . The quadratic variation is
t
2
X, Xt = (dXs )2 = aa t 1.
0
Then apply Theorem 10.3.1.

Exercise 10.3.11
t (a) By direct computation;
(b) Let Xt = 0 equ dWu . The quadratic variation is
t
e2qt 1
X, Xt = e2qs ds = .
0 2q
Applying Theorem 10.3.1 yields the result.

Exercise 10.4.15 (c) We evaluate EQ [Xt ] in two ways. On one side EQ [Xt ] =
0, because Xt is a Q-Brownian motion. On the other side, using Girsanov
theorem
t2 T 2 T 3

EQ [Xt ] = EP [Xt MT ] = EP + Wt e 0 s dWs e 6
2
t2 2
6 T 3
t T
= +e EP [Wt e 0 s dWs
]EP [e t s dWs
]
2
t2 t t3
= + EP [Wt e 0 s dWs ]e 6 .
2
Equating to zero yields
t t2 t3
EP [Wt e 0
s dWs
]= e 6 .
2
Chapter 11
Exercise 11.2.2 (a) Integrating yields
t
vt = gt + vs dWs ,
0
so E[vt ] = gt.
(b) at = g + vt Nt , E[at ] = g.
1 2
(c) Multiply by the integrating factor t = eWt + 2 t and obtain the exact
equation
d(t vt ) = t g dt.
Integrating we get
t
Wt 12 2 t 1 2
vt = ge eWs + 2 s ds.
0
Exercise 11.2.3 (a) Solving as a linear equation yields

t
2t 2t
vt = e + 0.3e e2s dWs ,
0
and hence vt is normally distributed with mean = e2t and variance 2 =

0.09 4t ). In our case = 0.00247875 and 2 = 0.0224999, = 0.15.
4 (1 e
Then v
3
P (v3 < 0.5) = P < 0.650 = 0.74.

Exercise 11.3.1 (a) Substitute t = h in formula

t
t t
N (t) = N (0)e + e es dWs .
0
Then take the expectation in

h
1 h
N (0) + es dWs = e N (0)
0 2
and obtain N (0) = 12 E[eh ]N (0), which implies the desired result.
(b) Jensens inequality for the random variable h becomes E[eh ] eE[h] .
This can be written as 2 eE[h] .
Exercise 11.3.2 (a) Use that N (t) = N (0)et . (b) t = (ln 0.9)/.
Exercise 11.4.1 (a) Q0 (t) = c1 et + c2 e2t . (c) Qt = Qpt + Q0 (t).
Exercise 11.4.2 (a) Let Zt = (Xt , Yt )T and write the equation as dZt =
AZt + KdWt and solve it as a linear equation. (b) Use substitutions Xt = t ,
Yt = t .
Exercise 11.5.1 (a) Pt is log-normally distributed, with
t 2

P (Pt x) = P P0 e 0 a(s)ds t/2+Bt

1 x t 1 t
= P Bt ln + a(s)ds
P0 2 0
1 t
x t 1
= FBt ln + a(s)ds ,
P0 2 0
2
1 e 2t
u
where FBt = 2t
.
Exercise 11.5.3 The noisy rate is rt = a(t) + dWt

dt , so a(t) = t2 . The expec-
T
tation is given by E[PT ] = P0 e 0 a(s) ds
= P0 eT 3 /3 . Then T = (3 ln 2)1/3 .
Exercise 11.7.1 dCt = a(C0 Ct )dt + bEdt + bdBt + dWt .
Exercise 11.7.2 Use formula (11.7.37).
Bibliography
[1] B. Alshamary and O. Calin. Stochastic optimization approach of car value

depreciation. Applied Stochastic Models in Business and Industry, vol.29,
3, 2012, pp.208-223.
[2] B. Alshamary and O. Calin. Pricing a stochastic car value depreciation

deal. Applied Stochastic Models in Business and Industry, vol.30, 4, 2014,
pp. 509-516.
[3] V.I. Arnold. Ordinary Dierential Equations. MIT Press, Cambridge,

MA, London, 1973.
[4] A. Bain and D. Crisan. Fundamentals of Stochastic Filtering. Springer,

2009.
[5] M. Baxter and A. Renie. Financial Calculus. Cambridge University Press,

1996.
[6] J. Bertoin. Levy Processes. Cambridge University Press, 121, 1996.
[7] W.E. Boyce and C. DiPrima. Elementary Dierential Equations and

Boundary Value Problems, 6th ed. John Wiley and Sons, Inc., 1997.
[8] R. Brown. A brief account of microscopical observations made in the

months of June, July and August, 1827, on the particles contained in
the pollen of plants; and on the general existence of active molecules in
organic and inorganic bodies. Phil. Mag. 4, pp. 161-173, 1828.
[9] Z. Brzezniak and T. Zastawniak. Basic Stochastic Processes. Springer,

London, Berlin, Heidelberg, 1999.
[10] O. Calin, D.C. Chang, K. Furutani, and C. Iwasaki. Heat Kernels for
Elliptic and Sub-elliptic Operators. Birkhauser, Applied and Numerical
Harmonic Analysis, 2011.
[11] J. R. Cannon. The One-dimensional Heat Equation. Encyclopedia of

Mathematics and Its Applications, 23, Cambridge University Press, 2008.
309
[12] K.L. Chung and R. Williams. Introduction to Stochastic Integration.

Birkhauser, 1990.
[13] W. D. D. Wackerly, Mendenhall, and R. L. Scheaer. Mathematical Statis-

tics with Applications, 7th ed. Brooks/Cole, 2008.
[14] J.L. Doob. Stochastic Processes. John Wiley and Sons, 1953.
[15] R. Durrett. Stochastic Calculus. CRC Press, 1996.
[16] E. B. Dynkin. Markov Processes I, II. Springer-Verlag, 1965.
[17] A. Einstein. Uber die von der molekularkinetischen Theorie der

Warme geforderte Bewegung von in ruhenden Flussigkeiten suspendierten
Teilchen. Annalen der Physik, 322 (8), pp. 549-560, 1905.
[18] I. Fenyes. Eine wahrscheinlichkeitstheoretische Begr

undung und Inter-
pretation der Quantenmechanik. Zeitschrift f ur Physik, 132, 1952, pp.
81-106.
[19] D. Freedman. Brownian Motion and Diusion. Springer-Verlag, 1983.
[20] C. W. Gardiner. Handbook of Stochastic Processes, 3nd ed. Springer,

2004.
[21] P.R. Halmos. Measure Theory. Van Nostrand Company, Inc., 1950.
[22] T. Hida. Brownian Motion. Springer-Verlag, 1980.
[23] N. Ikeda and S. Watanabe. Stochastic Dierential Equations and Difus-

sion Processes. North-Holland (2nd ed., 1989.
[24] K. Ito. Stochastic Integral. Proc. Imp. Acad. Tokyo, 20, 1944, pp. 519-524.
[25] K. Ito. On a stochastic integral equation. Proc. Imp. Acad. Tokyo, 22,
1946, pp. 32-35.
[26] I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus,

2nd ed. Springer-Verlag, 1991.
[27] F. Knight. Essentials of Brownian Motion and Diusion. AMS, 1981,

1989.
[28] A. N. Kolmogorov. Grundbegrie der Wahrscheinlichkeitsrechnung.

Ergeb. Math. 2, 1933.
[29] N.V. Krylov and A.K.Zvonkin. On strong solutions of stochastic dier-

ential equations. Sel. Math. Sov. I, pp. 19-61, 1981.
Bibliography 311
[30] H.H. Kuo. Introduction to Stochastic Integration. Springer, Universitext,

2006.
[31] J. Lamperti. Probability. W. A. Benjamin, Inc., 1966.
[32] P. Langevin. Sur la theorie du mouvement brownien. C. R. Acad. Sci.

(Paris) 146, pp.530-533, 1908.
[33] P. Levy. Processus Stochastiques et Mouvement Brownien. Gauthier-

Villars, Paris, 1948.
[34] P. Morters and Y. Peres. Brownian Motion. Cambridge University Press,

2010.
[35] S. Neftici. Mathematics of Financial Derivatives. Academic Press, 1996.
[36] E. Nelson. Dynamical Theories of Brownian Motion. Princeton University

Press, 2nd ed., 1967.
[37] B. ksendal. Stochastic Dierential Equations, An Introduction with Ap-

plications, 6th ed. Springer-Verlag Berlin Heidelberg New-York, 2003.
[38] J.W. Pitman. One-dimensional Brownian motion and the three-

dimensional Bessel process. Adv. Appl. Probab. 7 pp. 511-526, 1975.
[39] F.A.A. Postali and P. Picchetti. Geometric Brownian Motion and struc-
tural breaks in oil prices: A quantitative analysis. Energy Economics, 28,
pp. 506-522, 2006.
[40] P. Protter. Stochastic Integration and Dierential Equations. 2nd ed.

Springer-Verlag, 2004.
[41] D. Revuz and M. Yor. Continuous Martingales and Brownian Motion,

3th ed. Springer, 2005.
[42] S. O. Rice. Mathematical Analysis of Random Noise. Bell Syst. Tech.J.,

23, 24 pp. 282-332, 46-156, 1944, 1945.
[43] S. M. Ross. Stochastic Processes. Second ed., John Wiley & Sons, Inc.,
1996.
[44] W. Schottky. Uber spontane Stromschwankungen in verschiedenen Elek-

trizitatsleitern. Ann. Phys. 57, pp. 451-567, 1918.
[45] D. Sondermann. Introduction to Stochastic Calculus for Finance.

Springer, Lecture Notes in Economics and Mathematical Systems, 2006.
[46] D.V. Widder. The Heat Equation. Academic Press, London, 1975.
[47] N. Wiener. Dierential Space. J. Math. Phys. 58, 1923, pp.131-174.
[48] M. Yor. On some exponential functionals of Brownian motion. Adv.

Appl. Prob., 24, pp. 509-531, 1992.
Index
Ft -adapted, 63 crowded environment, 256

cumulative cost, 211
rst passage of time, 79
Pearson distribution, 81 Dambis, Dubins and Schwarz, 225
Dirac distribution, 242
almost certain limit, 104 directional derivative, 199
almost sure limit, 37 discount bond, 261
annulus, 208 distribution, 6, 242
Arc-sine Law, 84 distribution function, 16, 60
dominated convergence theorem, 16
barrier, 82 Doobs
Bessel process, 58, 209, 223, 228 martingale inequality, 72
Beta distribution, 24 submartingale inequality, 71
Boltzmann, 264 double barrier, 97
bounded variation, 6 drift rate, 166
Brown, Robert, 45 Dynkins formula, 203, 204, 208
Brownian
bridge, 57, 166, 181, 183, 227 eiconal equation, 224
motion, 45, 58, 102, 190 equivalence in law, 100
motion with drift, 58 Euclidean distance, 58, 102
event, 12
catastrophic environment, 258 exact equation, 172
Central Limit Theorem, 1, 46 existence, 189
characteristic exit time, 206
curve, 212 expectation, 18
function, 14 exponential
Cherno bounds, 35, 93 Brownian motion, 52
cholesterol level, 262 growth model, 255
compensated Poisson process, 63, 98 integrated Brownian motion, 57
conditional process, 218
distribution, 68
expectations, 27 ltering theory, 265
correlation coecient, 20 Fubinis theorem, 19, 231
cost function, 211 Fundamental Theorem of Calculus, 191,
covariance, 19, 63 220
313
Gamma distribution, 23 limit

generator, 200 in distribution, 39, 104
of an Ito diusion, 200 in probability, 38, 104
geometric Brownian motion, 51 Linear stochastic dierential equation,
Girsanovs theorem, 215, 232, 236 178
Gronwalls inequality, 196 Lipschitz, 189
log-normal distribution, 22
hitting time, 88 logistic
function, 257
independence, 17
model, 2
independent
lookback option, 79
increments, 45
information eld, 42 Markovs inequality, 34, 183
input process, 265 martingale, 43
integrated martingale convergence theorem, 106
Brownian motion, 53 mean square limit, 37, 104
Poisson process, 66 mean-reverting, 181
integrating factor, 187 measurable function, 13
integration, 14 moment generating function, 22
interarrival times, 64 monotone convergence theorem, 15
Inverse Gaussian distribution, 25
Ito noisy observations, 268
diusion, 145, 199 noisy pendulum, 250
integral, 169 normal distribution, 21
process, 233 Novikovs condition, 218
Itos
formula, 170, 188, 192, 200 Optional Stopping Theorem, 77, 205,
208
Jensens inequality, 32, 46, 197, 239, Ornstein-Uhlenbeck process, 180, 183,
249 184, 229, 249
Kalman-Bucy lter, 265, 267 Pearl, 256

Kolmogorovs backward equation, 204, Pearson distribution, 24
220 pendulum, 3
photon, 264
Levy, 98 Picard, 189
processes, 46 Pitman, 98
theorem, 221 Poisson
Lampertis property, 227 distribution, 24
Langevins equation, 248 process, 61, 183
Laplace transform, 91 population growth, 1, 255
Laplacian, 220 probability
Leibnizs formula, 81 density function, 16
Index 315
distribution function, 16 Strong Law of Large Numbers, 37

space, 11 symmetric, 108
quadratic variation, 108 time change, 225, 230, 249

total variation, 111
radioactive decay, 247 transience, 206
Radon-Nikodym theorem, 29 transient, 84
random variable, 13
recurrence, 206 uniform distribution, 68
reection principle, 80, 99 uniqueness, 189
running
maximum, 98 variation of parameters, 184
minimum, 102 Verhust, 256
volatility, 166
sample space, 11
scaled Brownian motion, 226 waiting times, 66
squared Bessel process, 222 Walds
standard deviation, 19 distribution, 60
stationary, 45 identities, 91
stochastic Weibull, 60
harvesting, 261 white noise, 5, 241
kinematics, 244 Wiener
processes, 41 integral, 244
stopping process, 47
theorem for martingales, 77
zero-coupon bonds, 3, 261
time, 73

Calin o An Informal Introduction To Stochastic Calculus With

Enviado por

Dados do documento

Título original

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

Calin o An Informal Introduction To Stochastic Calculus With

Enviado por

Direitos autorais:

Formatos disponíveis

An Informal Introduction to

9620_9789814678933_tp.indd 1 4/6/15 9:39 am

9620_9789814678933_tp.indd 2 4/6/15 9:39 am

Library of Congress Cataloging-in-Publication Data

British Library Cataloguing-in-Publication Data

Copyright 2015 by World Scientific Publishing Co. Pte. Ltd.

RokTing - An Informal Introduction to Stochastic Calculus.indd 1 8/5/2015 3:41:36 PM

vi An Informal Introduction to Stochastic Calculus with Applications

Heartfelt thanks go to the reviewers who made numerous comments and

O. Calin Michigan, January 2015

List of Notations and Symbols

The following notations have been frequently used in the text.

x An Informal Introduction to Stochastic Calculus with Applications

List of Notations and Symbols ix

1 A Few Introductory Problems 1

xii An Informal Introduction to Stochastic Calculus with Applications

2.15 Properties of Mean-Square Limit . . . . . . . . . . . . . . . . . 40

3 Useful Stochastic Processes 45

4 Properties of Stochastic Processes 73

5 Stochastic Integration 117

5.7 Poisson Integration . . . . . . . . . . . . . . . . . . . . . . . . . 129

6 Stochastic Dierentiation 139

7 Stochastic Integration Techniques 149

8 Stochastic Dierential Equations 165

9 Applications of Brownian Motion 199

xiv An Informal Introduction to Stochastic Calculus with Applications

10 Girsanovs Theorem and Brownian Motion 215

11 Some Applications of Stochastic Calculus 241

12 Hints and Solutions 277

A Few Introductory Problems

1.1 Stochastic Population Growth Models

2 An Informal Introduction to Stochastic Calculus with Applications

A Few Introductory Problems 3

1.2 Pricing Zero-coupon Bonds

1.3 Noisy Pendulum

F (t) = f (t) + (noise).

1.4 Diusion of Particles

4 An Informal Introduction to Stochastic Calculus with Applications

1.5 Cholesterol Level

1.6 Electron Motion

A Few Introductory Problems 5

This type of description of electrons is usually seen in stochastic mechanics.

1.7 White Noise

All aforementioned problems involved a noise inuence. This noise is di-

6 An Informal Introduction to Stochastic Calculus with Applications

The solution depends on the Brownian motion starting at x, i.e (t) = x + Bt .

1.8 Bounded and Quadratic Variation

A Few Introductory Problems 7

Figure 1.3: (a) Smooth. (b) Rough. (c) Very rough.

Furthermore, if f is continuously dierentiable, the computation can be con-

is bounded above by a given constant.

The total variation of f on [a, b] is dened by

The amount V (f ) measures in a certain sense the roughness of the function.

8 An Informal Introduction to Stochastic Calculus with Applications

We note that if f is continuously dierentiable, then the total variation

Proposition 1.8.2 Let f : [a, b] R be a function. Then the graph y = f (x)

Proof: Consider the simplifying notations (f )k = f (xk+1 ) f (xk ) and

and then applying the sup yields

which implies the desired conclusion.

since the area under the curve t |Nt | is innite.

A Few Introductory Problems 9

It is worth noting that if f has bounded total variation, V (f ) < , then

V (2) (f ) max |f (xk+1 ) f (xk )|V (f ) 0, as |x| 0.