If a concave function f(·) is defined on an open interval of the real line, then f(·) is continuous and is continuously differentiable almost everywhere on that interval (′ denotes a derivative).
f′(·) is non-increasing if f(·) is concave. So, if f(·) is concave and twice differentiable, then f″(·) is non-positive.
Generally, we will be concerned with f(·) such that f′(·) ≥ 0.
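[Figure: Jensen's inequality. A concave U(w) is plotted against wealth w, with w_0 − δ, w_0, and w_0 + δ marked on the horizontal axis. The point U(w_0) = U(½(w_0 + δ) + ½(w_0 − δ)) lies above E[U(w_0 + ε̃)].]
This compares \(U(w_0) = U(E[w_0 + \tilde\varepsilon])\) with \(E[U(w_0 + \tilde\varepsilon)] = \tfrac12 U(w_0 + \delta) + \tfrac12 U(w_0 - \delta)\).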
Thus it is the concavity of U( ) that causes the agent to be unwilling to accept the fair gamble.
Intuitively, risk aversion derives from a downside loss causing a reduction in utility that is greater than
the increase in utility from an equivalent upside gain (f ′ ( ) is non-increasing).
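The comparison can be checked numerically. This is a minimal sketch, assuming log utility and illustrative values for w_0 and δ (neither is specified in the notes); the function name is hypothetical.

```python
import math

def expected_utility_fair_gamble(u, w0, delta):
    """E[U(w0 + eps)] for a fair gamble paying +delta or -delta with probability 1/2 each."""
    return 0.5 * u(w0 + delta) + 0.5 * u(w0 - delta)

# Illustrative numbers (assumptions, not from the notes).
w0, delta = 100.0, 20.0

u_sure = math.log(w0)                                           # U(w0)
u_gamble = expected_utility_fair_gamble(math.log, w0, delta)    # E[U(w0 + eps)]

# The utility lost on the downside exceeds the utility gained on the upside.
loss_from_downside = math.log(w0) - math.log(w0 - delta)
gain_from_upside = math.log(w0 + delta) - math.log(w0)

print(u_sure > u_gamble, loss_from_downside > gain_from_upside)
```

Any other strictly concave, increasing function could be passed in place of `math.log`.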
The two definitions provided above naturally lead to the following theorem.
The concave functions we are concerned with are of course utility functions. In finance, we commonly
simplify things and deal with utility of wealth. In this course we will consider, directly or indirectly, the
implications of the investment problem of maximizing agents. In the standard two-date, consumption-
investment problem, agents control two types of variables: (1) at the first date they invest their after
consumption wealth in the marketed securities; and (2) at the second date they sell these securities/assets
to buy consumption goods. Two components make the problem interesting: time and risk.
Advanced Financial Engineering
Eduardo Mendes Machado
Thus, the investment decision consists of forming a portfolio that transfers wealth from one date to
the next, and the consumption decision allocates the resulting wealth among the various goods (in a
multi-period problem, these goods include savings).
If a complete set of state contingent claims (futures contracts) is available, then the date 2
consumption goods can be purchased at date 1 and both allocations can be made simultaneously. If
there is some incompleteness in the market, then the decision must take place in two distinct steps.
Since it is also possible to consider the complete markets single-step problem as a two-step problem
we will generally think of the problem in two pieces – investment then consumption.
If there is a single consumption good in the economy/model, the utility of wealth function is just a
standard utility function over consumption of the single good.
If multiple goods exist, we are dealing with a derived utility of wealth function U(W) where U(W) ≡
Max V(c) s.t. p′c = W where c is a vector of consumption goods, p is a price vector, and W the
level of the budget constraint. U(W) is just the envelope of V(c) for different levels of final wealth,
W. In this case, we think of the date 2 consumption allocations as specifying a derived utility of
wealth function that the investment decision seeks to maximize. We can then concentrate on the
investment decision itself. To illustrate in a simple two good example, consider:
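[Figure: two panels. Left: indifference curves over goods x1 and x2 with budget lines for wealth levels W1, W2, W3 and the corresponding optimal bundles 1, 2, 3; x1 is the numeraire. Right: the derived utility of wealth U(W) plotted against W at W1, W2, W3.]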
(1) Arrow’s measure of risk aversion – what is the “compensation” required for a risk averse agent
to accept a gamble?
This defines a probability (or an expected payoff/return) based “risk premium.” In this sense, it is
related to the finance or asset pricing view.
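If \(\tilde\varepsilon \equiv +\delta\) with probability \(p\) and \(-\delta\) with probability \(1-p\), then the question can be stated as: for what value of \(p\) (> ½) is \(E[U(w_0 + \tilde\varepsilon)] = U(w_0)\)?
Or, for what \(p\) is \(pU(w_0 + \delta) + (1-p)U(w_0 - \delta) = U(w_0)\)?
To solve, take a Taylor series expansion of the left-hand side of this last equation at \(w_0\):
\[ p\left[U(w_0) + \delta U'(w_0) + \tfrac12\delta^2 U''(w_0) + o(\delta^2)\right] + (1-p)\left[U(w_0) - \delta U'(w_0) + \tfrac12\delta^2 U''(w_0) + o(\delta^2)\right] = U(w_0) \]
Collecting terms gives \((2p-1)\,\delta U'(w_0) + \tfrac12\delta^2 U''(w_0) \approx 0\), so
\[ p \approx \tfrac12 + \tfrac14\,\delta A(w_0), \qquad A(w_0) \equiv -\frac{U''(w_0)}{U'(w_0)}. \]
Note that for linear U( ) (so that \(U'' = 0\)) this holds at p = ½, as would be expected.
Note that if \(U'(w_0) > 0\) and \(U''(w_0) < 0\) (utility is increasing and concave) then \(A(w_0) > 0\).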
Note: A(wo) is a property of the preference relation and not its utility function representation U( ).
A(wo) relates to the curvature of the utility function at wo (think of the Jensen’s inequality picture). So, clearly U″( ) belongs, but why is 1/U′( ) there?
Since utility functions are unique only up to a positive affine transformation, 1/U′( ) is a standardization used to make sure A(wo) is truly a property of the preference relation and not merely of U( ). Note that A(wo) is a local measure (at wo) and that the result is strictly true only for ‘small’ gambles.
Let’s interpret p: What we see is that the agent must be compensated for bearing the risk of \(\tilde\varepsilon\): \(E_p(w_0 + \tilde\varepsilon) = p(w_0 + \delta) + (1-p)(w_0 - \delta) > w_0\), or \(E_p(\tilde\varepsilon) > 0\), iff \(A(w_0) > 0\).
Since \(A(w_0) = -U''(w_0)/U'(w_0)\) and \(U'(w_0) > 0\), then \(A(w_0) > 0\) iff \(U''(w_0) < 0\) (i.e. iff U( ) is concave).
Thus, the amount by which we adjust p from ½ – the fair gamble level – is proportional to A(wo) (a positive amount for risk averse preferences), the Arrow-Pratt coefficient of ‘absolute risk aversion,’ and to δ, a metric for the amount of risk in (the size of) the simple gamble.
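For a gamble proportional to wealth (\(\pm\delta w_0\)), the same argument gives \(p \approx \tfrac12 + \tfrac14\,\delta R(w_0)\), where \(R(w_0) = -w_0\,\dfrac{U''(w_0)}{U'(w_0)}\) is the coefficient of ‘relative risk aversion.’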
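The Taylor expansion of pU(w_0 + δ) + (1 − p)U(w_0 − δ) = U(w_0) leads to the small-gamble approximation p ≈ ½ + ¼ δ A(w_0). The sketch below compares that approximation with the exact indifference probability, assuming log utility (so A(w) = 1/w) and illustrative numbers that are not from the notes.

```python
import math

def indifference_probability(u, w0, delta):
    """Exact p solving p*U(w0 + delta) + (1 - p)*U(w0 - delta) = U(w0)."""
    return (u(w0) - u(w0 - delta)) / (u(w0 + delta) - u(w0 - delta))

# Illustrative values (assumptions): a small gamble relative to wealth.
w0, delta = 100.0, 1.0

A = 1.0 / w0                                   # absolute risk aversion of log utility
p_exact = indifference_probability(math.log, w0, delta)
p_approx = 0.5 + 0.25 * delta * A              # second-order Taylor approximation

print(p_exact, p_approx)
```

Increasing `delta` relative to `w0` widens the gap, which is the sense in which A(w_0) is a local measure.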
(2) Pratt’s measure of risk aversion – what payment would a risk averse agent make to avoid a fair
gamble (insurance premium)?
\(\tilde\varepsilon \equiv +\delta\) with probability ½ and \(-\delta\) with probability ½ (note that in this case the gamble is fixed as “fair”).
The question is then: for what value of \(\pi_i\) is it true that \(E[U(w_0 + \tilde\varepsilon)] = U(w_0 - \pi_i)\)?
Taking a Taylor series expansion of both sides (around \(w_0\)) and solving for \(\pi_i\) (do this) gives:
\[ \pi_i \approx \tfrac12\,\delta^2 A(w_0) = \tfrac12 A(w_0)\,\mathrm{Var}(\tilde\varepsilon) \]
(A(wo) is again an approximation to measuring risk aversion for small δ, and in this simple setting \(\mathrm{Var}(\tilde\varepsilon)\) measures the “size” of the gamble or the amount of risk.)
\(w_0 - \pi_i\) defines the certainty equivalent wealth for \(w_0 + \tilde\varepsilon\) given U( ).
Again, R(wo) would appear if the gamble considered were a proportional one.
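As a rough numerical check of the insurance premium approximation π_i ≈ ½ A(w_0) Var(ε̃), the sketch below uses log utility, for which the certainty equivalent of a ±δ fair gamble has a closed form. The utility choice and the numbers are assumptions made for illustration.

```python
import math

# Illustrative values (assumptions, not from the notes).
w0, delta = 100.0, 10.0

# Log utility: log(w0 - pi) = 0.5*log(w0 + delta) + 0.5*log(w0 - delta),
# so the certainty equivalent is the geometric mean of the two outcomes.
certainty_equivalent = math.sqrt((w0 + delta) * (w0 - delta))
pi_exact = w0 - certainty_equivalent

A = 1.0 / w0                        # absolute risk aversion of log utility at w0
variance = delta ** 2               # Var of a +/- delta fair gamble
pi_approx = 0.5 * A * variance      # Pratt's small-gamble approximation

print(pi_exact, pi_approx)
```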
(3) Also consider the question: for what \(\pi_c\) – compensating wealth premium – is it true that:
\(E[U(w_0 + \pi_c + \tilde\varepsilon)] = U(w_0)\) for a fair gamble \(\tilde\varepsilon\)?
We can show that \(\pi_c\) depends on A( ). Since “we can,” do so as an exercise. What do you find?
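[Figure: two panels illustrating (2) and (3). Panel (2): the insurance premium π_i, where E[U(w_0 + ε̃)] = U(w_0 − π_i), with w_0 − δ, w_0 − π_i, w_0, and w_0 + δ on the wealth axis. Panel (3): the compensating premium π_c, where E[U(w_0 + π_c + ε̃)] = U(w_0), with w_0 + π_c − δ, w_0, w_0 + π_c, and w_0 + π_c + δ on the wealth axis.]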
Each is determined by the nature of the curvature of U( ) at w0, i.e. its concavity.
Practice exercise: Assume U( ) is twice continuously differentiable. Show this holds iff A(w) is non-increasing in w
Practice exercise: Show that if an individual is decreasingly risk averse then U′″(w) > 0 if U( ) is thrice differentiable.
Definition: Individual 1 is strictly more risk averse than individual 2 if, for all simple gambles \(\tilde\varepsilon\) and all \(w_0\), the insurance premium (\(\pi_i\)) individual 1 would pay to avoid the gamble is strictly larger than that which individual 2 would pay. (Or, if 1 always chooses a safe investment over a simple gamble whenever 2 does.)
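Theorem: Consider two strictly increasing concave utility functions U1 and U2 – the following are equivalent:
1) \(A_1(w) > A_2(w)\ \forall w\) “1 is more risk averse than 2”
2) \(\exists\, G(\cdot)\) with \(G'(\cdot) > 0\) and \(G''(\cdot) < 0\) such that \(U_1(w) = G[U_2(w)]\) (i.e., U1( ) is a “concavification” of U2( ))
3) \(\pi_{i1} > \pi_{i2}\ \forall w_0\) and \(\tilde\varepsilon\)
Proof:
(3) ⇔ (1): we know \(\pi_{ij} \approx \tfrac12 A_j(w)\,\mathrm{Var}(\tilde\varepsilon)\) for j = 1, 2, where \(\pi_{ij}\) is the insurance premium agent j will pay to avoid the gamble \(\tilde\varepsilon\).
Thus, \(\pi_{i1} > \pi_{i2}\ \forall w\) and \(\tilde\varepsilon\) iff \(A_1(w) > A_2(w)\ \forall w\).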
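(2) ⇒ (3):
\[ \begin{aligned}
U_1(w - \pi_2) &= G(U_2(w - \pi_2)) && \text{from the definition of } G(\cdot)\\
&= G[E(U_2(w + \tilde\varepsilon))] && \text{from the definition of } \pi_2\\
&> E[G(U_2(w + \tilde\varepsilon))] && \text{Jensen's inequality (G is strictly concave)}\\
&= E[U_1(w + \tilde\varepsilon)] && \text{from the definition of } G(\cdot)\\
&= U_1(w - \pi_1) && \text{from the definition of } \pi_1
\end{aligned} \]
Since U1( ) is strictly increasing, \(U_1(w - \pi_2) > U_1(w - \pi_1)\) implies \(\pi_1 > \pi_2\).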
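A small numerical illustration of the theorem, assuming two CARA utilities U_j(w) = −exp(−a_j w) with a_1 > a_2 (an illustrative choice, not from the notes). For this pair the concavifying map is G(u) = −(−u)^(a_1/a_2), and the exact insurance premium is ln(cosh(a_j δ))/a_j.

```python
import math

def cara(a):
    """CARA utility U(w) = -exp(-a*w), with constant absolute risk aversion a."""
    return lambda w: -math.exp(-a * w)

def cara_inverse(a):
    """Inverse of the CARA utility: the wealth level that delivers a given utility."""
    return lambda util: -math.log(-util) / a

def insurance_premium(u, u_inv, w, delta):
    """pi solving U(w - pi) = E[U(w + eps)] for a fair +/- delta gamble."""
    expected_u = 0.5 * u(w + delta) + 0.5 * u(w - delta)
    return w - u_inv(expected_u)

# Illustrative parameters (assumptions): agent 1 is more risk averse than agent 2.
a1, a2 = 2.0, 1.0
u1, u2 = cara(a1), cara(a2)

def G(util):
    """Increasing, strictly concave map with U1(w) = G(U2(w)) for this CARA pair."""
    return -((-util) ** (a1 / a2))

w, delta = 1.0, 0.5
pi1 = insurance_premium(u1, cara_inverse(a1), w, delta)
pi2 = insurance_premium(u2, cara_inverse(a2), w, delta)

print(pi1, pi2)
```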
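Constant relative risk aversion, R(w) = R, means
\[ -\frac{U''(w)}{U'(w)} = \frac{R}{w} \]
Integrating once gives \(-\log U'(w) = R\log(w) + I_1\), so
\[ U'(w) = e^{-(R\log(w) + I_1)} = e^{-I_1}\,w^{-R} \]
Case 1: R = 1
\[ U(w) = e^{-I_1}\log(w) + I_2 \]
Case 2: R ≠ 1
\[ U'(w) = e^{-I_1}\,w^{-R}, \qquad U(w) = e^{-I_1}\,\frac{w^{1-R}}{1-R} + I_2 \]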
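The two cases can be checked by finite differences. This is a sketch only: the integration constants are normalized to I_1 = I_2 = 0, and the step size and test points are arbitrary choices.

```python
import math

def crra_utility(w, R):
    """Power/log utility with the integration constants set to I1 = I2 = 0."""
    return math.log(w) if R == 1 else w ** (1 - R) / (1 - R)

def relative_risk_aversion(u, w, h=1e-3):
    """R(w) = -w U''(w)/U'(w), estimated with central finite differences."""
    first = (u(w + h) - u(w - h)) / (2 * h)
    second = (u(w + h) - 2 * u(w) + u(w - h)) / h ** 2
    return -w * second / first

# Estimated R(w) should match the chosen R at every wealth level.
estimates = {
    (R, w): relative_risk_aversion(lambda x, R=R: crra_utility(x, R), w)
    for R in (0.5, 1, 3)
    for w in (1.0, 2.0, 10.0)
}
print(estimates)
```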
The risk tolerance measure is \(1/A(w)\), the inverse of risk aversion:
\[ T(w) = \frac{1}{A(w)} = -\frac{U'(w)}{U''(w)} \]
Linear risk tolerance implies \(T(w) = -\dfrac{U'(w)}{U''(w)} = aw + b\), a linear function of w.
HARA is simply \(A(w) = -\dfrac{U''(w)}{U'(w)} = \dfrac{1}{aw + b}\), a hyperbolic function of w.
Case 1: a = 0
If a = 0, this is simply CARA utility: \(A(w) = \dfrac{1}{b} = A\)
Case 2: b = 0
If b = 0, this is simply CRRA utility: \(A(w) = \dfrac{1}{aw}\), so \(wA(w) = \dfrac{1}{a} = R\)
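General Case: a, b ≠ 0
In this case, we have…
\[ -\log(U'(w)) = \frac{1}{a}\log(aw + b) + I_1 \quad \text{as long as } aw + b \text{ is positive} \]
Assume a > 0, \(w > -\dfrac{b}{a}\).
Then
\[ U'(w) = e^{-\frac{1}{a}\log(aw + b)}\,e^{-I_1} = e^{-I_1}(aw + b)^{-\frac{1}{a}} \]
and,
\[ U(w) = e^{-I_1}\log(aw + b) + I_2 \quad \text{if } a = 1 \]
\[ U(w) = e^{-I_1}\,\frac{1}{a}\,\frac{(aw + b)^{1 - \frac{1}{a}}}{1 - \frac{1}{a}} + I_2 \quad \text{if } a \neq 1 \]
For simplicity’s sake, let \(w_s = -\dfrac{b}{a}\) and \(R^* = \dfrac{1}{a}\), so base utility can be written…
\[ U(w) = \log(w - w_s) \quad \text{if } R^* = 1\ (a = 1) \qquad \text{Generalized log utility} \]
\[ U(w) = \frac{(w - w_s)^{1 - R^*}}{1 - R^*} \quad \text{if } R^* \neq 1\ (a \neq 1) \qquad \text{Generalized power utility} \]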
These two base utility functions are generalized log and generalized power utility functions.
They are defined only for w > ws. ws is thought of as a subsistence level of wealth, below which
utility equals negative infinity.
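To see the linear risk tolerance property concretely, the sketch below estimates T(w) = −U′(w)/U″(w) for the generalized power utility by finite differences and compares it with aw + b, using a = 1/R* and b = −w_s/R*. The parameter values are illustrative assumptions.

```python
import math

def generalized_power_utility(w, w_s, r_star):
    """Base HARA utility, defined only for w > w_s (the subsistence level)."""
    x = w - w_s
    return math.log(x) if r_star == 1 else x ** (1 - r_star) / (1 - r_star)

def risk_tolerance(u, w, h=1e-2):
    """T(w) = -U'(w)/U''(w), estimated with central finite differences."""
    first = (u(w + h) - u(w - h)) / (2 * h)
    second = (u(w + h) - 2 * u(w) + u(w - h)) / h ** 2
    return -first / second

# Illustrative parameters (assumptions, not from the notes).
w_s, r_star = 10.0, 2.0
a, b = 1.0 / r_star, -w_s / r_star        # implied by w_s = -b/a and R* = 1/a

def u(w):
    return generalized_power_utility(w, w_s, r_star)

comparison = [(w, risk_tolerance(u, w), a * w + b) for w in (20.0, 50.0, 100.0)]
print(comparison)
```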
Risk – A general notion of risk we will study later, but for now, a quick introduction
General notion – Risk is defined as any property of a set of random outcomes that is disliked by a risk
averse agent.
This is pretty general and seemingly broad enough to be almost useless. You’d be surprised. Commonly
we are in a situation of trying to focus this idea enough to make it fit within a standard economic model
and get a useful result to come out.
Rothschild & Stiglitz presented the idea this way: If uncertain outcomes \(\tilde X\) and \(\tilde Y\) have the same location (the same mean), \(\tilde X\) is said to be weakly less risky than \(\tilde Y\) for a class of utility functions \(\mathcal{U}\) if no individual with a utility function in \(\mathcal{U}\) prefers \(\tilde Y\) to \(\tilde X\). That is,
\[ E[U(\tilde X)] \ge E[U(\tilde Y)] \quad \forall\, U(\cdot) \in \mathcal{U}. \]
Risk, defined in this way, clearly depends on the class of utility functions considered. At the most
general level U is taken to be the set of all risk averse (concave) utility functions.
Practice exercise: Let U be the set of quadratic utility functions so (wlog) we can write: \(U(z) = z - \dfrac{bz^2}{2}\). What can we use to measure risk for this class of utility functions?
Theorem: \(\tilde X\) is weakly less risky than \(\tilde Y\) iff \(\tilde Y\) is distributed like \(\tilde X + \tilde\varepsilon\), where \(\tilde\varepsilon\) is a fair game with respect to \(\tilde X\). (That is, \(E(\tilde\varepsilon \mid X) = 0\ \forall X\).) The “fair game” property is not as strong as independence but stronger than a lack of correlation. Why would that be a requirement?
Proof (sufficiency): \(\tilde Y\) is distributed like \(\tilde X\) plus noise – the proof should depend on concavity.
\[ \begin{aligned}
E[U(\tilde Y)] &= E[U(\tilde X + \tilde\varepsilon)] && \text{due to the equivalence of the distributions}\\
&= E[E(U(\tilde X + \tilde\varepsilon) \mid X)] && \text{just conditional expectations}\\
&\le E[U(E(\tilde X + \tilde\varepsilon \mid X))] && \text{Jensen's inequality (i.e. U( ) is concave)}\\
&= E[U(\tilde X)]
\end{aligned} \]
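Example:
\[ \tilde X = \begin{cases} w_1 & \text{with probability } p\\ w_2 & \text{with probability } 1-p \end{cases} \qquad \text{and} \qquad \tilde Y = \begin{cases} w_1 + e & \text{with probability } .5p\\ w_1 - e & \text{with probability } .5p\\ w_2 & \text{with probability } 1-p \end{cases} \]
Since \(\tilde e\) is actuarially fair, any risk averse agent should prefer to avoid the second gamble contained in \(\tilde Y\).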
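The example can be evaluated directly. The sketch below uses made-up values for w1, w2, p, and e, and two concave utilities chosen only for illustration; it confirms that the means agree while expected utility is lower under Ỹ.

```python
import math

def expected_value(outcomes, f=lambda x: x):
    """E[f(X)] for a discrete distribution given as (outcome, probability) pairs."""
    return sum(prob * f(x) for x, prob in outcomes)

# Illustrative values (assumptions, not from the notes).
w1, w2, p, e = 10.0, 20.0, 0.4, 3.0

X = [(w1, p), (w2, 1 - p)]
Y = [(w1 + e, 0.5 * p), (w1 - e, 0.5 * p), (w2, 1 - p)]   # X plus fair noise at w1

same_mean = abs(expected_value(X) - expected_value(Y)) < 1e-12
preferences = {
    name: expected_value(X, u) > expected_value(Y, u)
    for name, u in (("log", math.log), ("sqrt", math.sqrt))
}
print(same_mean, preferences)
```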
We now show that the Arrow-Pratt measure does not properly account for this situation in comparing
risk aversions.
The Arrow-Pratt measure approaches the problem by saying that U1( ) is more risk averse than U2( ) if
U1( ) prefers a riskless payoff r to a gamble \(\tilde y\) whenever U2( ) does. Ross instead directly compares the
gambles \(\tilde X\) and \(\tilde Y\) given above. The first approach assumes complete insurance is possible (can always
evaluate a risky position against a riskless payoff); the stronger measures were developed under the
assumption that this is not possible. They highlight how the possibility of perfect insurance simplifies
this and many other issues.
The Ross Theorem – the following are equivalent:
(1) \(\displaystyle \inf_w \frac{U_1''(w)}{U_2''(w)} \ge \sup_w \frac{U_1'(w)}{U_2'(w)}\) or \(\displaystyle \frac{U_1''(w)}{U_2''(w)} \ge \lambda \ge \frac{U_1'(w)}{U_2'(w)}\ \ \forall w\)
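Proof: First, let’s find π: \(E[U_1(\tilde w + \tilde e)] = E[U_1(\tilde w - \pi_1)]\) where
\[ \tilde w = \begin{cases} w_1 & \text{with probability } p\\ w_2 & \text{with probability } 1-p \end{cases} \qquad \text{and} \qquad \tilde w + \tilde e = \begin{cases} w_1 + e & \text{with probability } .5p\\ w_1 - e & \text{with probability } .5p\\ w_2 & \text{with probability } 1-p \end{cases} \]
\[ E[U_1(\tilde w + \tilde e)] = p\,\tfrac12\left[U_1(w_1 + e) + U_1(w_1 - e)\right] + (1-p)U_1(w_2) \approx pU_1(w_1) + (1-p)U_1(w_2) + \tfrac12 pU_1''(w_1)e^2 \]
Note that the effect of e is only on the second order term (and only at w1).
As compared to the A-P measure of risk aversion, or the related insurance premium, we get a contamination of π by the different wealth levels that develop (w1 and w2) out of the lack of perfect insurance.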
\(\pi_1 \ge \pi_2\) since \(U_1' \ge 0\)
(1) ⇒ (2): Define G( ) by U1( ) = λU2( ) + G( ), where U1( ) and U2( ) are scaled so (1) holds for some λ > 0.
G′( ) = U1′( ) – λU2′( ); then, since λ ≥ U1′ / U2′ from (1), we know G′( ) ≤ 0.
U1″ / U2″ = λ + G″( )/U2″( ) ≥ λ from (1)
or, G″( ) ≤ 0 (since U2″( ) < 0).
Note - The Ross ordering implies the A-P ordering, simply let w1 = w2 and rearrange the Ross
definition – but we’ll see that the A-P ordering does not imply the Ross ordering.
Example:
Suppose wealth is distributed \(\tilde w + \tilde e\) as above.
We know that
\[ \pi_i \approx \frac{-\tfrac12\, pU_i''(w_1)\,e^2}{pU_i'(w_1) + (1-p)U_i'(w_2)} \]
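The approximation can be compared with the exact premium, which solves E[U(w̃ − π)] = E[U(w̃ + ẽ)]. The sketch below finds the exact value by bisection, assuming log utility and illustrative numbers (neither comes from the notes); the helper names are hypothetical.

```python
import math

def eu_with_noise(u, w1, w2, p, e):
    """E[U(w + e)]: the fair +/- e gamble is added only in the w1 state."""
    return p * 0.5 * (u(w1 + e) + u(w1 - e)) + (1 - p) * u(w2)

def eu_after_premium(u, w1, w2, p, premium):
    """E[U(w - premium)]: the premium is paid in both states."""
    return p * u(w1 - premium) + (1 - p) * u(w2 - premium)

def exact_premium(u, w1, w2, p, e, tol=1e-12):
    """Bisection on [0, e]; for increasing concave u the premium lies in this range."""
    target = eu_with_noise(u, w1, w2, p, e)
    low, high = 0.0, e
    while high - low > tol:
        mid = 0.5 * (low + high)
        if eu_after_premium(u, w1, w2, p, mid) > target:
            low = mid      # premium still too small
        else:
            high = mid
    return 0.5 * (low + high)

# Illustrative values (assumptions). For log utility U' = 1/w and U'' = -1/w**2.
w1, w2, p, e = 10.0, 20.0, 0.5, 1.0
pi_exact = exact_premium(math.log, w1, w2, p, e)
pi_approx = (-0.5 * p * (-1 / w1 ** 2) * e ** 2) / (p / w1 + (1 - p) / w2)

print(pi_exact, pi_approx)
```

Note how the denominator mixes marginal utilities at both wealth levels, which is the "contamination" the notes describe.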
But, for w1-w2 sufficiently large, so that w2-w1 is very negative (i.e. far from perfect insurance),
we end up with:
then, for p small enough, \(\pi_1 < \pi_2\) and U1( ) prefers some gambles that U2( ) does not
despite U1( ) being more risk averse under the A-P measure.
The marginal value of insurance is determined by the second order effect -U″(w1)e². The cost of
insurance, on the other hand (for very low p), is valued at the margin by U′(w2), the marginal utility of the
likely state. The premium is thus determined by a tradeoff between the benefits of insurance at w1 and
the costs at w2.
The example forces these wealth levels far apart. The A-P measure cannot consider the two separate
wealth levels and so it fails to order properly the required premiums. Note that the Ross measure
controls for this nicely in this example. See representation (1) in the Theorem.
2 Pricing by Arbitrage
Here we will take a first look at a financial market using a simple state space model. We first develop
some structure then examine the implications of the absence of arbitrage.
Often in finance problems, uncertainty is characterized by the use of a set of random variables with a
particular joint distribution, perhaps something like \(\tilde y \sim N(\mu, \Sigma)\).
Here, we characterize uncertainty by considering a state space tableau of payoffs on the primitive
assets. We assume that there are a finite number of states of nature and that each security has its
payoffs written explicitly as a function of the realized state of nature.
We index states by s = 1, 2, …, S (not a problem for S = ∞ but intuition can be lost as we look at
this for the first time) and assets by i = 1, 2, …, N.
The 2-date investment problem can be characterized by the tableau of per share dollar payoffs on
the N assets in each of the S states at date 2 (Y) and a set of current prices (v).
We want to impose some structure on Y right off. In the investment decision, the agent can make
choices only over outcomes (states) which can be distinguished by different patterns of payoffs on
the marketed assets. Thus, for the investment decision, if there are states with identical payoffs on
all of the assets, then we cannot distinguish between the two so we can collapse them into a single
state (that is, the payoff matrix should not have 2 identical rows).
Example:
1 3 2 3 1 1 1 2
1 3
2 1 2 1 3 2 2 2
2 1 2 1 but, 3 1 2 and
2 2 1
are both fine from this perspective.
3 2
3 2 3 2
1 3 3 1
To complete the description of the “technology” or opportunities of the model, we use the vector \(v = (v_1, v_2, \ldots, v_N)'\) to represent the current price per share of each asset.
The decision maker’s choice variable is a portfolio (an N × 1 vector) n where ni is the number of
shares of asset i held in the portfolio.
The price of this portfolio is n′v, so the budget constraint the decision maker is faced with can be
written n′v = W0 where W0 is the initial (after date 1 consumption) investable wealth.
…where, for now, we will assume only that u(·) is increasing. Y is the technological restriction
in this model and we are interested in the space of payoffs spanned by the columns of Y, i.e.
what different returns patterns can be generated by trading in the marketed securities (n is a
vector in \(\mathbb{R}^N\)). This, the budget constraint, and any other restrictions on n (short sales
restrictions etc.) define the opportunity set for the agent.
It will often be convenient to describe a market by a returns tableau rather than a payoff tableau plus
initial prices.
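\[ Z \equiv \begin{pmatrix} z_{11} & z_{12} & \cdots & z_{1N}\\ z_{21} & z_{22} & & \vdots\\ \vdots & & \ddots & \\ z_{S1} & z_{S2} & \cdots & z_{SN} \end{pmatrix} \qquad \text{S states and N assets, again an } S \times N \text{ matrix} \]
…where \([z_{si}] = [y_{si}][\mathrm{diag}(v_i)]^{-1}\), or \(z_{si} = y_{si}/v_i\). Thus, Z is a matrix of gross returns.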
Z can also be thought of as a payoff tableau where the number of shares of each asset is adjusted so
that all current prices per share (vi) are $1.
Note: The construction of Z (from Y) requires only that the initial prices are not all zero (i.e. we are not
modeling a market of futures contracts). Zero prices on some assets are not a problem as long as
there is at least one asset whose current price is non-zero. If so we can construct a new set of assets
by adding the payoff of this asset to that of all of the others.
Since we are ultimately interested in the set of returns possibilities spanned by the columns of Y or Z
and we do not change this by forming a new basis for this space, we will usually assume that it is not
the case that all prices are zero and that any assets with a zero price have been transformed as just
described. This is not a big stretch since in most applications the primitive securities have limited
liability which implies a non-zero price as we will see. Negative prices are not a problem, but,
conceptually, is this really an ‘asset’?
We often normalize by initial wealth (W0) and consider \(w_i = (n_i v_i)/W_0\), so the vector w is a vector of portfolio weights. The budget constraint is then written \(1'w = \sum_{i=1}^{N} w_i = 1\), and Zw is the vector of gross returns (per dollar invested) on the portfolio w (often written as \(Z_w\)).
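The relations z_si = y_si / v_i and w_i = n_i v_i / W0 can be illustrated on a small made-up market. The payoff tableau, prices, and share holdings below are hypothetical; the sketch checks that Zw equals Yn / W0 state by state.

```python
from fractions import Fraction

def returns_tableau(Y, v):
    """Gross returns z_si = y_si / v_i from per-share payoffs Y (S x N) and prices v."""
    return [[Fraction(y, price) for y, price in zip(row, v)] for row in Y]

def portfolio_weights(n, v):
    """w_i = n_i * v_i / W0, where W0 = n'v is the cost of the share holdings n."""
    W0 = sum(shares * price for shares, price in zip(n, v))
    return [Fraction(shares * price, W0) for shares, price in zip(n, v)], W0

# Hypothetical market: 3 states, 2 assets.
Y = [[2, 3], [4, 1], [6, 2]]     # per-share dollar payoffs by state
v = [2, 1]                       # current prices per share
n = [10, 5]                      # shares held

Z = returns_tableau(Y, v)
w, W0 = portfolio_weights(n, v)

return_per_dollar = [sum(z * wi for z, wi in zip(row, w)) for row in Z]              # Zw
payoff_per_dollar = [Fraction(sum(y * ni for y, ni in zip(row, n)), W0) for row in Y]  # Yn / W0

print(Z, w, return_per_dollar)
```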
An ‘arbitrage portfolio’ (ω in the text) is a nontrivial vector of dollar commitments that sum to zero. That is, an arbitrage or zero investment portfolio is a vector α where α ≠ 0 with \(1'\alpha = 0 = \sum_{i=1}^{N} \alpha_i\). We distinguish this from a positive investment portfolio, η.
If all prices, all vi, are positive, as is usual, then it must be that \(\alpha_j < 0\) for some (at least one) asset j as the long positions in α are financed with short positions in other assets.
No normalization is done for an arbitrage portfolio, and so we talk of α′ = (5, -5) as being the same portfolio as α′ = (20, -20). Thus, an arbitrage portfolio is scale free. So, for any positive scalar γ, if α is an arbitrage portfolio, then γα is the same arbitrage portfolio.
Don’t confuse an arbitrage portfolio with an arbitrage opportunity (which we will discuss shortly). If
an arbitrage opportunity exists, it is often possible to exploit it with an arbitrage portfolio – and it
can thus be ‘run’ at an unlimited scale, its profits being limited only by the supply of assets or (more
likely) price reaction to trading – but they are different concepts.
Equivalently, if there exists an arbitrage portfolio α (which must be nontrivial, α ≠ 0, you will recall)
with payoff exactly equal to zero in all states (Zα = 0, 1′α = 0), then there exists a duplicable
portfolio. (Why is this equivalent?)
Clearly, this is a property of the set of returns as a whole. If w1 and w2 duplicate each other, then for
any portfolio w, ŵ = w + w1 – w2 = w + ( w) will duplicate w.
Duplicability is, therefore, usually expressed as a redundancy in the primitive assets, i.e. the columns
of Z. A redundant asset is one such that if it were removed from the market there would be no
change in the space of returns possibilities spanned by the marketed securities.
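If \(\hat Z = (Z', 1)' = \begin{pmatrix} z_{11} & \cdots & z_{1N}\\ \vdots & & \vdots\\ z_{S1} & \cdots & z_{SN}\\ 1 & \cdots & 1 \end{pmatrix}\) has rank less than N, then some assets are redundant.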
Loosely…we want to have N linearly independent assets (really the columns of Ẑ ) or else we have a
redundancy. If we don’t have this situation, then any asset or portfolio is duplicable, but only the assets
contributing to the collinearity are redundant assets.
The row of 1’s in Ẑ is important to ensure that we consider only ‘equivalent investment’ duplications of assets (otherwise we could label an arbitrage opportunity as a redundancy). Consider the following example: \(Z = \begin{pmatrix} 0 & 0 & 1\\ 1 & 2 & 0 \end{pmatrix}\). None of the assets is redundant since \(|\hat Z| = -1\). Thus, the augmented returns tableau has full rank of N = 3. Clearly, assets 1 and 2 are linearly dependent but the augmented assets 1 and 2 are not. As far as the opportunities for the investor go, both assets are very important in this economy and we would substantially alter an investor’s opportunity set by eliminating either asset. We will generally assume that redundant assets have been eliminated from Z, meaning the augmented tableau has full column rank, N (though Z may only have column rank N-1).
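The determinant in this example is easy to verify. The sketch below appends the row of ones to Z and computes the 3 × 3 determinant by cofactor expansion; the helper name is hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

Z = [[0, 0, 1],
     [1, 2, 0]]                   # 2 states, 3 assets (from the example)
Z_hat = Z + [[1, 1, 1]]           # augmented tableau: append the row of ones

determinant = det3(Z_hat)
print(determinant)                # nonzero, so the augmented tableau has rank 3
```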
LeRoy and Werner consider an alternative definition: essentially imposing a restriction of the absence of
arbitrage before they consider the issue of redundant assets (their definition is actually in terms of Y but
that is not of consequence).
Definition: The right inverse of Z (S × N) is a matrix R (N × S) such that \(ZR = I_S\), and the left inverse of Z is a matrix L (N × S) such that \(LZ = I_N\).
Suppose there are no redundant assets, Rank(Z) = N, and N ≤ S; find L such that \(LZ = I_N\).
\((Z'Z)^{-1}(Z'Z) = I_N\) if \((Z'Z)^{-1}\) exists, and if Rank(Z) = N then it does exist (in fact, it’s iff).
Then a left inverse is given by \(L \equiv (Z'Z)^{-1}Z'\) (it is the unique inverse when N = S).
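A minimal sketch of the construction L = (Z′Z)⁻¹Z′, using exact fractions on a 3-state, 2-asset returns tableau chosen for illustration. The helper functions are hypothetical and only handle the 2 × 2 inverse needed here.

```python
from fractions import Fraction

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    """Exact inverse of a 2 x 2 integer matrix; fails if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

Z = [[2, 1], [2, 3], [1, 2]]      # illustrative tableau: S = 3 states, N = 2 assets, rank 2
Zt = transpose(Z)
L = matmul(inverse_2x2(matmul(Zt, Z)), Zt)   # L = (Z'Z)^{-1} Z', an N x S matrix

print(matmul(L, Z))               # should be the 2 x 2 identity
```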
Again, since they add nothing to the uncertainty spanned by the marketed returns matrix (i.e. the returns
that can be generated by an investor’s portfolio) we usually assume that all redundant assets have been
removed from Z (or Y).
Definition: If you can construct a portfolio (from the assets in Z) that pays off only in one state, then
that state is said to be insurable. More precisely, state s is insurable if there exists a solution \(\eta_s\) to \(Z\eta_s = \iota_s\). \(\iota_s\) is the Arrow-Debreu security for state s (such as for s = 2: \(\iota_2 = (0, 1, 0)'\)), and \(1'\eta_s\) is the cost of insurance, per dollar received, against the occurrence of state s.
Theorem: A state is insurable iff the asset returns in that state are linearly independent from the
asset returns in all other states (i.e. the sth row of Z is linearly independent from the other rows of Z).
Said another way: zs. is not collinear with z1., z2., …, zs-1., zs+1., …, zS.
Proof: If \(z_{s\cdot}\) is linearly dependent on the returns in other states, then \(z_{s\cdot} = \sum_{\sigma \ne s} \lambda_\sigma z_{\sigma\cdot}\).
That is, \(z_{s\cdot}\) can be written as a linear combination of the asset returns in other states, \(z_{\sigma\cdot}\), σ ≠ s, for some set of scalars \(\lambda_\sigma\).
So, for every portfolio η, we have \(z_{s\cdot}'\eta = \sum_{\sigma \ne s} \lambda_\sigma z_{\sigma\cdot}'\eta\).
But, if state s is insurable, then for \(\eta_s\) the right hand side of the equality must be zero, term by term, and cannot sum to one as required (by the left hand side). So, if the sth row of Z is linearly dependent on the other rows of Z, that state is not insurable.
If \(z_{s\cdot}\) is linearly independent from the other rows, then Rank(Z, \(\iota_s\)) = Rank(Z) and by the rule of rank, a solution \(\eta_s\) to \(Z\eta_s = \iota_s\) exists.
If all states are insurable, the market is said to be complete. This requires that all the rows of Z are
mutually linearly independent. So, for our example above:
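\[ Z = \begin{pmatrix} 1 & 0\\ 2 & 1 \end{pmatrix}, \qquad \eta_1 = \begin{pmatrix} 1\\ -2 \end{pmatrix}, \qquad \eta_2 = \begin{pmatrix} 0\\ 1 \end{pmatrix} \]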
so, the market is complete (which is one of the ways this was a very special case). What is the cost
of insurance in each state? What’s going on here? What do these insurance costs mean? This
illustrates basic dominance (another special aspect of this example).
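For a square, full-rank Z the insuring portfolios are the columns of Z⁻¹, since Zη_s = ι_s. The sketch below computes them for the 2 × 2 example with exact fractions and reports 1′η_s, the cost of insurance in each state; the helper name is hypothetical.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Exact inverse of a 2 x 2 integer matrix; fails if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

Z = [[1, 0],
     [2, 1]]                      # rows are states, columns are assets

Z_inv = inverse_2x2(Z)
# Column s of Z^{-1} is the portfolio eta_s that pays 1 in state s and 0 elsewhere.
eta = [[row[s] for row in Z_inv] for s in range(2)]
payoffs = [[sum(z * x for z, x in zip(row, eta_s)) for row in Z] for eta_s in eta]
insurance_costs = [sum(eta_s) for eta_s in eta]   # 1'eta_s for each state

print(eta, insurance_costs)
```

A negative cost for one state is a sign that something unusual (here, dominance) is present in this tableau.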
We call R the gross riskless or risk free return and R-1 = rf the risk free rate (net return).
Often, no riskless asset or portfolio will exist. In this case there is a ‘shadow’ riskless return for the
economy that will depend upon other aspects of the economy we haven’t highlighted so far
(preferences for example). The shadow risk free return can be bounded in this context: below by \(\underline{R}\), the largest return that can be guaranteed for some portfolio (i.e. the best lower bound for any portfolio’s return), and above by \(\bar{R}\), the smallest of the maximum returns across portfolios (i.e. the worst maximum return on any portfolio).
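Example – 2 assets and 3 states: \(Z = \begin{pmatrix} 2 & 1\\ 2 & 3\\ 1 & 2 \end{pmatrix}\). With 2 assets, any positive investment portfolio \(w = (w_1, w_2)'\) can be written as \(w = (w_1, 1 - w_1)'\) (simply from 1′w = 1) so it fits easily in a picture.
The returns on any portfolio are given by Zw – or as a function of w1 in this 2 asset case:
\[ Zw = \begin{pmatrix} 2 & 1\\ 2 & 3\\ 1 & 2 \end{pmatrix} \begin{pmatrix} w_1\\ 1 - w_1 \end{pmatrix} = \begin{pmatrix} 2w_1 + (1 - w_1)\\ 2w_1 + 3(1 - w_1)\\ w_1 + 2(1 - w_1) \end{pmatrix} = \begin{pmatrix} w_1 + 1\\ 3 - w_1\\ 2 - w_1 \end{pmatrix} = \begin{pmatrix} Z_{w1}\\ Z_{w2}\\ Z_{w3} \end{pmatrix} \]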
We can graph the returns in the states {1, 2, 3} as a function of w1. Clearly, there can be no risk free
portfolio with 1′w = 1 as there is no point (w1) where the three lines meet.
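[Figure: the state returns \(Z_{w1}\), \(Z_{w2}\), \(Z_{w3}\) plotted against w1, with w1 = ½ and w1 = 1 marked on the horizontal axis and the bounds \(\underline{R}\) and \(\bar{R}\) on the vertical axis.]
At w1 < ½ the lowest return is in state 1. For w1 > ½ the lowest return on any portfolio is in state 3. \(Z_{w1}\) increases in w1 and \(Z_{w3}\) decreases in w1, so the max of this min is at w1 = ½ and this identifies \(\underline{R}\) = 1.50.
Similarly, \(\bar{R}\) = 2.0 at w1 = 1.0 from dominance again.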
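The two bounds can be recovered with a simple grid search over w1. This is a sketch only: the grid range and spacing are arbitrary assumptions, and a linear program would be the more general tool.

```python
def state_returns(w1):
    """Returns of the portfolio (w1, 1 - w1) in states 1, 2, 3 for the example Z."""
    return [2 * w1 + (1 - w1), 2 * w1 + 3 * (1 - w1), w1 + 2 * (1 - w1)]

# Grid of candidate weights, allowing short positions (range is an assumption).
grid = [i / 1000 for i in range(-2000, 3001)]

# Lower bound: the largest return that some portfolio can guarantee (max of the min).
r_lower, w_at_lower = max((min(state_returns(w1)), w1) for w1 in grid)

# Upper bound: the smallest of the maximum returns across portfolios (min of the max).
r_upper, w_at_upper = min((max(state_returns(w1)), w1) for w1 in grid)

print(r_lower, w_at_lower, r_upper, w_at_upper)
```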
If we were to introduce an asset with Z1 = Z2 = Z3 = 1.20, what would be true?
Definition: Dominance
A (positive investment) portfolio (or asset) w1 dominates portfolio w2 if Zw1 ≥ Zw2 (strict in at least one
element).
Recall that any portfolio has 1′w =1 (spreads $1 around). So, the initial price of the two positions is
the same. Such a circumstance would be an example of an arbitrage opportunity. It is a special
example in that it requires dominance of one asset over another state by state.
It is clear that no investor preferring more to less would ever hold a positive amount of a dominated
asset (any investor with strictly increasing preferences would hold instead the dominating asset). It is
further true that no non-satiated investor can find an internal optimal portfolio (a finite optimum) if
w1 and w2 are both available, even if neither would be held by the investor.
This illustrates the idea of an arbitrage opportunity. α = (w1 – w2) is an arbitrage portfolio that exploits
the opportunity. It also illustrates why the absence of arbitrage (AA) is necessary for equilibrium with non-satiated agents;
there can be no equilibrium if no agent has a finite optimum.
From our example above, \(Z = \begin{pmatrix} 1 & 0\\ 2 & 1 \end{pmatrix}\).
Asset one dominates asset two; this lies behind the negative insurance price. In this example, any portfolio \(\alpha = (k, -k)'\) with k > 0 represents an arbitrage opportunity (and a special kind).
Dominance is a special form of arbitrage, others are easily defined.
Let α be a risk free arbitrage: 1′α = 0 and Zα = k1.
Then, for a riskless portfolio w with return R, w + γα is a positive investment portfolio (1′(w + γα) = 1) with return
Z(w + γα) = (R + γk)1, and a judicious choice of γ gets you any desired level of riskless
return.
Conversely, if there exists a riskless asset, a riskless arbitrage opportunity exists whenever there exists
a solution to 1′w = 1 and Zw = k1 where k ≠ R.
More generally:
Definition: 1st Type (arbitrage in LeRoy & Werner)
An arbitrage opportunity of the 1st type is defined as any η such that 1′η ≤ 0 and Zη ≥ 0 with Zη ≠ 0.
This says that no (or possibly negative) investment buys a limited liability payoff that is strictly
positive in at least one state at the future date.
The existence of arbitrage opportunities (of the 1st and/or 2nd types) can be succinctly stated as: arbitrage opportunities exist if there exists some η (or some n), for which the following is true:
\[ \begin{pmatrix} -1'\\ Z \end{pmatrix}\eta \ge 0 \quad \text{or,} \quad \begin{pmatrix} -v'n\\ Yn \end{pmatrix} \ge 0 \]
where each side is an (S+1) × 1 vector and at least one inequality is strict.
When there is no such η (or n) then there are said to be no arbitrage opportunities (of the 1st or 2nd types) in the market represented by Z (or Y): an Absence of Arbitrage.
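Example:
\(Z = \begin{pmatrix} 2 & 1\\ 2 & 3\\ 1 & 2 \end{pmatrix}\); think of the elements of Z as a $ payoff per dollar invested.
Then Z′p = 1 ⇒
\[ \begin{aligned} 2p_1 + 2p_2 + p_3 &= 1\\ p_1 + 3p_2 + 2p_3 &= 1 \end{aligned} \]
(when you solve this system of equations you will find)
\[ p_1 = \tfrac14 + \tfrac14 p_3, \qquad p_2 = \tfrac14 - \tfrac34 p_3, \qquad p_3 = p_3 \]
is a set of supporting prices as a function of p3.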
By construction, if we multiply either of the primitive securities in Z, or any portfolio of these
securities, by the vector p we will get the current price of the security/portfolio, 1.
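The one-parameter family of supporting prices can be verified with exact arithmetic. The sketch below checks that Z′p = 1 for several values of p3 and shows that all three state prices are strictly positive only for 0 < p3 < 1/3; the function names are hypothetical.

```python
from fractions import Fraction

Z = [[2, 1], [2, 3], [1, 2]]      # rows are states, columns are assets

def supporting_prices(p3):
    """State prices solving Z'p = 1, parameterized by the free component p3."""
    p3 = Fraction(p3)
    return [Fraction(1, 4) + p3 / 4, Fraction(1, 4) - 3 * p3 / 4, p3]

def asset_prices(p):
    """Z'p: the current price of each asset implied by state prices p."""
    return [sum(Z[s][i] * p[s] for s in range(3)) for i in range(2)]

def strictly_positive(p):
    return all(x > 0 for x in p)

p = supporting_prices(Fraction(1, 5))
print(p, asset_prices(p), strictly_positive(p))
```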
Linearity of the pricing rule is the same as the lack of monopoly power in the financial market. (1) The
cost of money in state r is independent of how much is purchased in state s so there are no economies of
scope (although the possible payoffs in states r and s in portfolios of the traded assets may be tied by any
incompleteness in Z).
(2) The cost of $2 in state s is just exactly twice the cost of $1 in state s so there are no economies of
scale that exist.
Our goal is to derive implications for p from the absence of arbitrage opportunities.
An important implication is going to be ∃ p > 0 such that Z′p = 1. (Why?)
What does p look like? What would happen if we were to multiply an Arrow-Debreu security (in Y)
by the vector p? The elements of p can be seen as state (or insurance) prices. Since ps is the current
cost of purchasing a dollar in state s and zero elsewhere, ps ≤ 0 is a clear arbitrage.
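Example: To more completely show the link between a positive price vector and arbitrage:
Let \(Z = \begin{pmatrix} z_{11} & z_{12}\\ 0 & z_{22} \end{pmatrix}\). Each asset’s current price is one dollar since we are working with Z.
\[ Z'p = \begin{pmatrix} z_{11} & 0\\ z_{12} & z_{22} \end{pmatrix} \begin{pmatrix} p_1\\ p_2 \end{pmatrix} = \begin{pmatrix} z_{11}p_1\\ z_{12}p_1 + z_{22}p_2 \end{pmatrix} = \begin{pmatrix} 1\\ 1 \end{pmatrix} \qquad \begin{matrix} (1)\\ (2) \end{matrix} \]
Suppose p1 < 0. Then from (1) \(p_1 = 1/z_{11}\), so \(p_1 < 0 \Rightarrow z_{11} < 0\).
In this case, just shorting asset 1 is an arbitrage (receive one dollar now) since \(z_{11} < 0\) implies that a short position in asset 1 has a payoff > 0. Similarly, from equation (2):
\[ p_2 = \frac{1 - p_1 z_{12}}{z_{22}} = \frac{1 - z_{12}/z_{11}}{z_{22}} = \frac{z_{11} - z_{12}}{z_{22}\,z_{11}} \]
Consider the portfolio w: \(w_1 = \dfrac{z_{12}}{z_{12} - z_{11}}\), \(w_2 = \dfrac{-z_{11}}{z_{12} - z_{11}}\). Then \(w_1 + w_2 = 1\) and,
\[ Zw = \begin{pmatrix} z_{11} & z_{12}\\ 0 & z_{22} \end{pmatrix} \begin{pmatrix} \dfrac{z_{12}}{z_{12} - z_{11}}\\[2mm] \dfrac{-z_{11}}{z_{12} - z_{11}} \end{pmatrix} = \begin{pmatrix} 0\\ \dfrac{1}{p_2} \end{pmatrix} \]
So, if p2 < 0 shorting the portfolio w is an arbitrage opportunity.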
The secret behind this example…This is a complete market which implies that we can form
any returns pattern on the assets we would like. This makes the relation easy to see since we
can then form portfolios that pay off only in one state (i.e. an Arrow-Debreu security) where
the payoff in a state and the state price are explicitly related.
The law of one price is equivalent to the existence of a supporting price vector but places no
restrictions on it. Consider the following problem: define 1 = – *,
If the law of one price holds, clearly the solution to this problem is a minimum of zero. A finite
solution to the primal problem implies that the dual program is feasible. The dual to this
minimization problem is written:
Proof: This version of the proof uses Stiemke’s Lemma or the “Theorem of the alternative”:
“Let A be a matrix in \(\mathbb{R}^{N \times M}\). Then one and only one of the following is true
(a) There exists an \(x \in \mathbb{R}^M_{++}\) s.t. Ax = 0
(b) There exists an \(n \in \mathbb{R}^N\) with A′n ≥ 0, A′n ≠ 0.”
(This is based on a separating hyperplane argument if you are familiar with them from
your economics classes.)
Let \(A = \begin{pmatrix} Y' & -v \end{pmatrix}\), an N × (S+1) matrix in which Y′ is N × S and v is N × 1, so M = S+1 and N = N.
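Suppose (a) is true. Then, using Stiemke’s Lemma, ∃ x ≫ 0 such that Ax = 0.
Write this as:
\[ \begin{pmatrix} Y_{11} & Y_{21} & \cdots & Y_{S1} & -v_1\\ Y_{12} & Y_{22} & \cdots & Y_{S2} & -v_2\\ \vdots & & & \vdots & \vdots\\ Y_{1N} & Y_{2N} & \cdots & Y_{SN} & -v_N \end{pmatrix} \begin{pmatrix} x_1\\ \vdots\\ x_{S+1} \end{pmatrix} = 0 \]
Row by row:
\[ \begin{aligned} \text{1st row:}\quad & \sum_{s=1}^{S} Y_{s1}x_s - v_1 x_{S+1} = 0\\ \text{2nd row:}\quad & \sum_{s=1}^{S} Y_{s2}x_s - v_2 x_{S+1} = 0\\ & \ \vdots\\ \text{Nth row:}\quad & \sum_{s=1}^{S} Y_{sN}x_s - v_N x_{S+1} = 0 \end{aligned} \]
Now we know \(x \in \mathbb{R}^{S+1}_{++}\), so all the \(x_s\) are strictly positive, in particular \(x_{S+1} > 0\). Divide both sides of each equation by \(x_{S+1}\) and rearrange:
\[ \sum_{s=1}^{S} Y_{s1}\frac{x_s}{x_{S+1}} = v_1, \quad \ldots, \quad \sum_{s=1}^{S} Y_{sN}\frac{x_s}{x_{S+1}} = v_N \]
Now define \(\dfrac{x_s}{x_{S+1}} = p_s > 0\ \forall s\):
\[ \sum_{s=1}^{S} Y_{s1}p_s = v_1, \quad \ldots, \quad \sum_{s=1}^{S} Y_{sN}p_s = v_N \]
or, Y′p = v with p > 0
So, (a) is equivalent to ∃ p > 0 s.t. Y′p = v and the alternative (b) cannot hold.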
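Now, suppose that (b) holds. Recall, we let \(A = \begin{pmatrix} Y' & -v \end{pmatrix}\). (b) says ∃ \(n \in \mathbb{R}^N\) s.t. A′n ≥ 0, A′n ≠ 0.
\[ A' = \begin{pmatrix} Y\\ -v' \end{pmatrix} \quad \text{is } (S+1) \times N \]
So, there exists an n such that
\[ A'n = \begin{pmatrix} Yn\\ -v'n \end{pmatrix} \ge 0 \quad \text{(an } (S+1) \times 1 \text{ vector, not identically zero)} \]
This implies that Yn ≥ 0 and v′n ≤ 0, where at least one of these inequalities is strict. This is just the
general definition of the existence of arbitrage opportunities of the first and/or second types.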
Therefore, we have shown that either (a) holds which is the existence of a strictly positive supporting
price vector and (b), the existence of arbitrage opportunities, does not or (b) holds which is the presence
of arbitrage and (a) does not – proving the equivalence of the absence of arbitrage and the existence of a
strictly positive price vector.
Now, part (3) of the Theorem $\Rightarrow$ part (1) (or not 1 $\Rightarrow$ not 3):
An investor’s maximization problem can be written: $\max_n \sum_s \pi_s u_s(W_0 - v'n, (Yn)_s)$. So, maximize expected utility of current consumption and date 2 wealth. This allows for state dependent utility using $u_s(\cdot,\cdot)$ for generality, requiring only increasing preferences so that $\sum_s \pi_s u_s(\cdot,\cdot)$ increases in all arguments (current spending and future wealth in all states).
If there is an arbitrage opportunity $n^*$, then the investor’s problem cannot have a finite optimum if preferences are increasing, since for any $n$: $\sum_s \pi_s u_s(W_0 - v'(n + kn^*), [Y(n + kn^*)]_s)$ is strictly increasing in $k$. With an arbitrage opportunity the investor can increase current consumption or future wealth, in at least one future state, without sacrificing current consumption or wealth in other contingencies (other states) and will thus desire to do so without bound.
Now, (1) $\Rightarrow$ (3): If there is no arbitrage then there exists a positive supporting price vector $p$. Let $W_0 = 0$. Consider
$$u_s(c_0, c_{1s}) = -\exp[-(W_0 - v'n)] - \frac{p_s}{\pi_s}\exp[-(Yn)_s].$$
Each $u_s(\cdot)$ is strictly increasing and strictly concave, infinitely differentiable and additively separable. Using the fact that $p$ is strictly positive and the relation $v' = p'Y$ you can show (and you should) that with this utility function the FOCs for a maximum (which are necessary and sufficient by concavity) are satisfied at $n = 0$; therefore this investor would find an interior optimum.
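The claim that the first-order conditions hold at $n = 0$ can be checked numerically. The following is a minimal sketch, not part of the original notes: the payoff matrix `Y`, the prices `v`, and the state price vector `p` are illustrative values (chosen so that $Y'p = v$), and the function names are hypothetical. Because the utility weights are $p_s/\pi_s$, the probabilities cancel in expected utility, so they do not appear in the code.

```python
import math

# Illustrative market (an assumption for this sketch): 3 states, 2 assets.
Y = [[2.0, 1.0], [2.0, 3.0], [1.0, 2.0]]   # payoffs, rows = states, columns = assets
v = [1.0, 1.0]                              # asset prices
p = [7 / 24, 1 / 8, 1 / 6]                  # a strictly positive vector with Y'p = v
W0 = 0.0

def payoffs(n):
    """State-by-state payoff (Yn)_s of the portfolio n."""
    return [sum(Y[s][i] * n[i] for i in range(2)) for s in range(3)]

def expected_utility(n):
    """sum_s pi_s u_s = -exp(-(W0 - v'n)) - sum_s p_s exp(-(Yn)_s); pi_s cancels."""
    c0 = W0 - sum(vi * ni for vi, ni in zip(v, n))
    return -math.exp(-c0) - sum(ps * math.exp(-y) for ps, y in zip(p, payoffs(n)))

def gradient(n):
    """Derivative of expected utility with respect to each holding n_i."""
    c0 = W0 - sum(vi * ni for vi, ni in zip(v, n))
    y = payoffs(n)
    return [-v[i] * math.exp(-c0)
            + sum(p[s] * Y[s][i] * math.exp(-y[s]) for s in range(3))
            for i in range(2)]

print(gradient([0.0, 0.0]))            # both components are zero up to rounding
print(expected_utility([0.0, 0.0]))    # value at the candidate optimum
```

Since the objective is strictly concave in $n$ for this market, a vanishing gradient at $n = 0$ identifies the unique interior optimum.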
More intuitively…the absence of arbitrage means that consumption at date 1 or in any state at date 2 has
a positive price, i.e. it can only be increased at the expense of consumption elsewhere, either at t=1 or in
some t=2 state. So, pick any strictly concave strictly increasing utility function with u′(-∞) = ∞ and u′(∞)
= 0 and standard convex programming arguments show that this function will have an interior optimum
due to the tradeoffs implied by the positive state prices.
Example: Let $Z = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$. Thus, our set of assets is a simple gamble. Clearly no arbitrage opportunities exist. What is the supporting price vector?
In general, the supporting state prices are not unique. Z is an S×N matrix, Z′p = 1 places only (if N
< S) N restrictions on S prices leaving S – N degrees of freedom and we will commonly have at least
one non-positive vector p that satisfies this equation, even if there is an Absence of Arbitrage.
If state $s$ is insurable, then the element $p_s$ of $p$ is unique across all supporting vectors $p$.
Recall that a state being insurable means: $\exists\, \eta_s$ s.t. $Z\eta_s = \iota_s$, where $\iota_s$ is the vector with a 1 in position $s$ and zeros elsewhere.
Thus, for all supporting vectors $p$ (not necessarily positive)
$$p'(Z\eta_s) = p'\iota_s \;\Rightarrow\; (p'Z)\eta_s = p'\iota_s \;\Rightarrow\; 1'\eta_s = p_s,$$
indicating that $p_s$ is the price of insurance against state $s$ occurring.
This is true for all supporting vectors $p$, and if the law of one price holds (and it does since there is a supporting price vector) then $1'\eta_s$ is the same for all $p$ such that $Z'p = 1$ and all $\eta_s$ such that $Z\eta_s = \iota_s$, so that $p_s$ is unique.
If $Z$ has full row rank, i.e. all rows of $Z$ are linearly independent so all states are insurable (i.e. the market is complete), then $p$ itself is unique (requires $N \ge S$). Full row rank implies $ZZ'$ is invertible, so $Z$ has the right inverse $R = Z'(ZZ')^{-1}$ (with $ZR = I$), and $p$ is unique:
$$p'Z = 1' \;\Rightarrow\; p'Z\,(Z'(ZZ')^{-1}) = 1'(Z'(ZZ')^{-1}) \;\Rightarrow\; p'I = p' = 1'(Z'(ZZ')^{-1}) = 1'R,$$
which is uniquely determined by $R$.
Riskless Asset:
If there exists a riskless asset, the sum of all state prices must equal $\frac{1}{R} = \frac{1}{1 + r_f}$ for all supporting price vectors $p$.
Assume there exists a $w_R$ such that $1'w_R = 1$ and $Zw_R = R1$. Then,
$$p'(Zw_R) = (p'Z)w_R = 1'w_R = 1$$
and,
$$p'(Zw_R) = p'(R1) = R(p'1) = R\Big(\sum_s p_s\Big)$$
so,
$$1 = R\Big(\sum_s p_s\Big) \;\Rightarrow\; \sum_s p_s = \frac{1}{R} \quad \forall\, p \text{ s.t. } Z'p = 1$$
Example: re-examined
Let $Z = \begin{bmatrix} 2 & 1 \\ 2 & 3 \\ 1 & 2 \end{bmatrix}$
From $Z'p = 1$ we found $p_1 = \frac14 + \frac14 p_3$, $p_2 = \frac14 - \frac34 p_3$, and $p_3 = p_3$.
For $p > 0$ we require $0 < p_3 < 1/3$. Look at the ‘edges’ of this range for $p_3$:
For $p_3$ “near” zero: $p_3 \to 0$, $p_1 \to 1/4$, $p_2 \to 1/4$. And, $\sum_s p_s \to 1/2 \Rightarrow R \to 2$.
For $p_3$ “near” 1/3: $p_3 \to 1/3$, $p_1 \to 1/3$, $p_2 \to 0$. And, $\sum_s p_s \to 2/3 \Rightarrow R \to 1.5$.
Recall that earlier using this example we found no riskless asset but bounds on the shadow riskless return given by $\bar R = 2$ and $\underline R = 1.5$. Thus, these bounds are ‘tight’ in the sense that we can find a vector $p$ with $Z'p = 1$ such that $\frac{1}{\sum_s p_s} = R$ for any $R$ in the interval $\underline R < R < \bar R$.
This is generally true; here it occurs because $\sum_s p_s = \frac12 + \frac12 p_3$ is continuous and monotonic in $p_3$.
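The one-parameter family of state prices in this example is easy to verify with exact arithmetic. The sketch below is an illustration only; the function names are hypothetical. It checks that each member of the family prices both assets at 1 and reports the implied shadow riskless return $1/\sum_s p_s$.

```python
from fractions import Fraction as F

# Payoff matrix from the example: rows are states, columns are assets.
Z = [[F(2), F(1)],
     [F(2), F(3)],
     [F(1), F(2)]]

def state_prices(p3):
    """Member of the family solving Z'p = 1, indexed by the free price p3."""
    p3 = F(p3)
    return [F(1, 4) + p3 / 4, F(1, 4) - 3 * p3 / 4, p3]

def asset_prices(p):
    """Z'p: the price of each asset implied by the state prices p."""
    return [sum(Z[s][i] * p[s] for s in range(3)) for i in range(2)]

def shadow_riskless_return(p):
    """R = 1 / (sum of state prices)."""
    return 1 / sum(p)

# Walk p3 across the admissible range 0 < p3 < 1/3.
for p3 in (F(1, 100), F(1, 6), F(33, 100)):
    p = state_prices(p3)
    print(p3, asset_prices(p), float(shadow_riskless_return(p)))
```

As $p_3$ moves from near 0 to near 1/3, the implied return falls monotonically from just under 2 toward 1.5, which is the sense in which the bounds are tight.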
Representation Theorem:
The following are equivalent:
(1) The existence of a positive linear pricing rule.
(2) The existence of positive ‘risk neutral’ probabilities and an associated riskless rate.
(3) The existence of a positive state price density.
$Y'p = v$ or $Z'p = 1$:
$$\sum_{s=1}^S p_s (Yn)_s = \sum_{i=1}^N n_i v_i, \qquad \sum_{s=1}^S p_s (Zw)_s = 1$$
To show the equivalence between (1) and (2) simply set $\pi^*_s = \frac{p_s}{\sum_k p_k}$ and $R^* = \frac{1}{\sum_k p_k}$, or $p_s = \frac{\pi^*_s}{R^*}$.
Clearly, if the economy has a riskless asset then $R^* = R$ for all valid $p$.
Here, as in the proof of existence of the state prices, we simply require that all traders believe that
the same set of states are possible. For equivalence we require that the same states have positive
probability under both measures.
Every investor agrees on the set of valid p’s (if all believe the same set of states are possible) so all
will necessarily agree on the set of valid risk neutral probabilities. Thus, all investors price assets the
same under both approaches.
The use of risk neutral probabilities can also be thought of as developing a market based certainty
equivalent measure for any risky asset. Since it is a “certainty equivalent,” the proper discount rate is
the associated risk free rate.
“Stochastic Discount Factor”: if we define $p_s = 1/R_s$, with $R_s$ a discount rate appropriate for state $s$ cash flows, we see that the standard MBA presentation of valuation is an aggregated version of the linear pricing rule. There, $v = \frac{E(\text{CashFlow})}{1 + r(\text{risk})}$. Here, we explicitly recognize the state dependence of the cash flows, and discounting each at its appropriate state dependent rate eliminates the need to risk adjust our discount rate. The size of the cash flows state-by-state does this for us, as opposed to considering only expected cash flow and a risk adjusted discount rate that applies to the expectation. This suggests that the relation between state contingent payoffs for an asset, state probabilities, and state prices will be important in the risk adjusting of a discount rate.
$$\sum_{i=1}^N v_i n_i = \sum_s \pi_s \lambda_s (Yn)_s, \qquad 1 = \sum_s \pi_s \lambda_s (Zw)_s$$
Equivalence follows from defining $\lambda_s = \frac{p_s}{\pi_s}$ or $p_s = \pi_s \lambda_s$.
Note: $E(\lambda) = 1/R$ if there exists a riskless asset or $1/R^*$ if not.
Clearly, $\lambda$ is positive for all states $s$ iff $p$ is positive. This representation is most valuable when we move to a continuum of states since $p(s)$ and $\pi(s)$ may be zero for all states $s$ yet $\lambda(s)$ may be well defined.
$$E(Zw) = \frac{1 - \operatorname{cov}(\lambda, Zw)}{E(\lambda)} = R - R\cdot\operatorname{cov}(\lambda, Zw) \quad \text{(assume there exists a riskless asset)}$$
So, if $\operatorname{cov}(\lambda, Zw) = 0$, the asset has no risk premium. In other words, the same message we see in other asset pricing frameworks is illustrated here: (1) some risk is not priced, (2) the expected return on risky assets is the risk free return plus a premium, and (3) marginal risk is determined by covariances. Note also that the correlation between the state price density (state prices and probabilities) and payoffs appears as was suggested above.
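The covariance form of the pricing relation can be checked on the three-state example. This is a sketch under stated assumptions: the probabilities `pi` are invented for illustration (the notes do not give any), the state price vector is one member of the admissible family ($p_3 = 1/6$), and since that market has no riskless asset the code uses the general form $E(z) = [1 - \operatorname{cov}(\lambda, z)]/E(\lambda)$.

```python
from fractions import Fraction as F

Z = [[F(2), F(1)], [F(2), F(3)], [F(1), F(2)]]   # returns, rows = states, columns = assets
p = [F(7, 24), F(1, 8), F(1, 6)]                 # one valid state price vector (p3 = 1/6)
pi = [F(1, 2), F(1, 4), F(1, 4)]                 # ASSUMED probabilities, not from the notes

# State price density: lambda_s = p_s / pi_s.
lam = [ps / pis for ps, pis in zip(p, pi)]

def expect(x):
    """Expectation under the assumed probabilities pi."""
    return sum(pis * xs for pis, xs in zip(pi, x))

def cov(x, y):
    return expect([a * b for a, b in zip(x, y)]) - expect(x) * expect(y)

def column(i):
    return [row[i] for row in Z]

def sdf_implied_mean(i):
    """(1 - cov(lambda, z_i)) / E[lambda]."""
    return (1 - cov(lam, column(i))) / expect(lam)

for i in range(2):
    z = column(i)
    pricing = expect([l * r for l, r in zip(lam, z)])   # E[lambda z_i], should equal 1
    print(i, pricing, expect(z), sdf_implied_mean(i))
```

The direct expected return and the value implied by the covariance with $\lambda$ coincide exactly, whatever strictly positive probabilities are assumed.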
Idiosyncratic Risk – Illustration:
Suppose payoffs can be written as: $a + f(\tilde x_1, \ldots, \tilde x_N) + \tilde\varepsilon$ where $E[f(\cdot)] = E[\tilde\varepsilon] = 0$ (all expectations are reflected in $a$). Then, the price of any asset is given by:
$$v = \frac{E^*(a + f + \varepsilon)}{R^*} = \frac{a + E^*(f) + E^*(\varepsilon)}{R^*}$$
If, under the risk neutral probabilities, the expectation of $\tilde\varepsilon$ is also zero ($E^*[\varepsilon] = 0$) then the risk or variability of this component of returns does not affect this asset’s price, i.e. the risk of $\tilde\varepsilon$ does not imply a differential expected return.
$$v = E[(a + f + \varepsilon)\lambda] = \frac{a}{R} + E[f\lambda] + E[\varepsilon\lambda] \quad \text{(assuming there exists a riskless asset)}$$
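Consider
$$E[\lambda\varepsilon] = \sum_s \pi_s \lambda_s \varepsilon_s = \sum_s \pi_s \frac{p_s}{\pi_s}\varepsilon_s = \sum_s p_s\varepsilon_s = R^{-1}\sum_s \frac{p_s}{\sum_k p_k}\varepsilon_s = R^{-1}\sum_s \pi^*_s\varepsilon_s = R^{-1}E^*[\varepsilon],$$
which is zero iff $E^*[\varepsilon] = 0$.
Risk is not priced (it carries no risk premium) if it is uncorrelated with the state price density, $\lambda$.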
Note – Almost none of what we have said so far has had to do with actual or subjective probabilities of
the states – a seemingly strange omission when talking about asset pricing.
The message is this…Dominance and arbitrage are dependent upon possibilities not probabilities – state
by state comparisons (with no regard for the likelihood of each state). Though this is, in some sense,
blunt, it carries us a long way.
Assuming a state independent von Neumann-Morgenstern utility function of wealth (as is standard) we simplify the problem by substituting the technology constraint into the maximand:
$$\max_{w,\,c_0} E[U(c_0, (W_0 - c_0)w'\tilde z)] \quad \text{subject to: } 1'w = 1$$
Writing $\theta$ for the Lagrange multiplier on the budget constraint, the first-order conditions of the Lagrangian $L$ are:
(1) $\frac{\partial L}{\partial c_0} = E[U_1(\cdot) - U_2(\cdot)\,w^{*\prime}\tilde z] = 0$
(2) $\frac{\partial L}{\partial w} = E[U_2(\cdot)(W_0 - c_0)\tilde z] - \theta 1 = 0$
((2) is a vector of equations, $\frac{\partial L}{\partial w_i}$, one for each asset $i$, that hold at the optimal $w$)
(3) $\frac{\partial L}{\partial \theta} = 1 - 1'w^* = 0$
(1′) from (1) and (2′), where (2′) is (2) premultiplied by $w^{*\prime}$: $E[U_1(\cdot)] = E[U_2(\cdot)\,w^{*\prime}\tilde z] = \frac{\theta}{W_0 - c_0}$
(1′) gives $\theta$ as proportional to expected marginal utility of current consumption and to the expected marginal utility of invested wealth or savings (i.e. $(W_0 - c_0)$).
The Message: consume from endowed wealth until the expected marginal utility of current consumption and savings (the second part of (1′) is the derivative of the Lagrangian with respect to $(W_0 - c_0)$) are equal (that’s (1′)), then allocate savings across the assets until each asset gives an equal contribution to expected marginal utility of future wealth (that’s the set of equations that make up (2)).
In the study of finance, we concentrate on the second problem of allocating after consumption wealth
among assets and often assume the consumption decision can be made independently (or is taken as
given), so the concentration is on versions of (2) – become familiar with it!
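Define $u(z) \equiv u(w'\tilde z; W_0, c_0) \equiv U(c_0, W_1) = U(c_0, (W_0 - c_0)w'\tilde z)$; by taking $W_0$ and $c_0$ as given we are isolating the investment or portfolio choice decision. Then,
$$u'(z) = \frac{\partial U}{\partial (w'\tilde z)} = (W_0 - c_0)U_2(\cdot)$$
Asset by asset, equation (2) becomes $E[u'(w^{*\prime}\tilde z)\tilde z_i] = \theta \;\; \forall$ assets $i$, where $\theta$ is the Lagrange multiplier on the budget constraint; we might think of this as normalizing after consumption (date t = 0) wealth (savings) to 1.
$$L = E[u(w'\tilde z)] + \theta(1 - 1'w)$$
$$\frac{\partial L}{\partial w} = E[u'(w^{*\prime}\tilde z)\tilde z] - \theta 1 = 0 \quad \text{(again, a vector of equations, one for each asset)}$$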
The expression says essentially that the expected return on any asset is related to the covariance of the marginal utility of the return of an optimal portfolio with the return on that asset.
This expression might already look familiar from our prior discussions. Rewrite the equations, with $\theta$ the Lagrange multiplier on the budget constraint:
$$E[u'(\tilde z^*)\tilde z_i] = \theta \quad \forall\, i$$
as $\sum_s \pi_s \frac{u'(z^*_s)}{\theta} z_{si} = 1$. Then
$$p_s = \frac{\pi_s u'_s}{\theta} > 0$$
is a supporting price vector; since each component is positive we can alternatively write $\lambda_s = \frac{u'_s}{\theta}$ (state price density), so we have seen the FOC before.
Or, write $E[\lambda z_i] = 1$ as $1 = E[\lambda]E[z_i] + \operatorname{Cov}(\lambda, z_i)$ for another way to see that $\lambda$ (the state price density) is proportional to marginal utility of return on an optimal portfolio (see the similar equation above) or equals the marginal rate of substitution between consumption at time 0 and consumption in state $s$ if we can identify $\theta$ as before.
Example: Risk neutral agent and there exists a riskless asset. If $u(\cdot)$ is risk neutral (linear), $u'(\cdot)$ is a constant. So, $p_s = \pi_s\frac{u'}{\theta}$, with $\theta$ the Lagrange multiplier from the FOC. Thus, state prices are proportional to the actual probabilities.
$\sum_s p_s = \frac{u'}{\theta} = \frac{1}{R}$ and since $\pi^*_s = \frac{p_s}{\sum_k p_k}$ we know that $\pi^*_s = \pi_s$. The risk neutral probabilities equal the actual probabilities, so the true expected return on all assets must be $R$.
This should not be too surprising. From $\sum_{s=1}^S \pi_s\frac{u'}{\theta}\tilde z_{si} = 1 \;\forall i$ and $\frac{u'}{\theta} = \frac{1}{R}$ we can see that the FOC is, in this example, equivalent to $\sum_s \pi_s z_{si} = E(\tilde z_i) = R \;\forall i$.
So, $E[z_i] = R$ under the actual probabilities if there is a single risk neutral agent in the market (facing no trading restrictions: short sales constraints or position restrictions). A risk neutral agent doesn’t require compensation for risk and buys or sells each asset until his FOC holds for each asset. Note here, the behavior of the agents determines the relation between prices and payoffs, whereas before we examined the relation between given prices and payoffs considering only that this relation did not allow for arbitrage opportunities.
Example: Now, assume $u(\cdot)$ is quadratic: $u(z) = az + \frac12 bz^2$ with $b < 0$ for concavity (we must also constrain the range of $z$ to ensure monotonicity). In this case, $u'(z) = a + bz$ is linear. The FOC $E[u'(z^*)z_i] = \theta$ (with $\theta$ the Lagrange multiplier) then reads $a\bar z_i + b[\operatorname{Cov}(z^*, z_i) + \bar z^*\bar z_i] = \theta$, or
$$\bar z_i = \frac{\theta}{a + b\bar z^*} - \frac{b}{a + b\bar z^*}\operatorname{Cov}(z^*, z_i).$$
Thus, the expected return on any asset is linearly related to its covariance with an optimal portfolio’s returns; here, wealth/return on an optimal portfolio is a sufficient statistic for marginal utility. (There is a negative sign in the expected return, but recall that $b < 0$.)
If there is a riskless asset, $R = \frac{\theta}{a + b\bar z^*}$ since the riskless asset has zero covariance with all $z_i$.
So, the expected return relation becomes $\bar z_i = R - \frac{Rb}{\theta}\operatorname{Cov}(z^*, z_i)$, which must hold for any asset or portfolio. Thus it must hold for $z^*$, so we write: $\bar z^* = R - \frac{Rb}{\theta}\operatorname{Var}(z^*)$. Thus $-\frac{Rb}{\theta} = \frac{\bar z^* - R}{\operatorname{Var}(z^*)}$.
Substituting this into the equation for the expected return on asset $i$ we get
$$\bar z_i = R + \frac{\operatorname{Cov}(z^*, z_i)}{\operatorname{Var}(z^*)}(\bar z^* - R) = R + \beta_i(\bar z^* - R)$$
Except for the * rather than an M (to indicate the “market” portfolio), this is the CAPM pricing relation. We will see what allows the switch from * to M later.
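More generally, with $\bar u' = E[u'(z^*)]$ and $\theta$ the Lagrange multiplier:
$$E[u'(z^*)z_i] = \theta$$
$$\operatorname{Cov}(u', z_i) + \bar u'\bar z_i = \theta$$
$$\bar z_i = \frac{\theta}{\bar u'} - \frac{\operatorname{Cov}(u', z_i)}{\bar u'} \qquad \text{Why is there a negative sign?}$$
So, it is (the negative of) the covariance with marginal utility of return on an optimal portfolio that is important.
With a riskless asset: $R = \frac{\theta}{\bar u'}$ and $\bar z_i = R - \frac{R}{\theta}\operatorname{Cov}(u', z_i)$. Again this must hold for $z^*$, so $\bar z^* = R - \frac{R}{\theta}\operatorname{Cov}(u', z^*)$ and we can solve to find that $\bar z_i = R + \frac{\operatorname{Cov}(u', z_i)}{\operatorname{Cov}(u', z^*)}(\bar z^* - R)$. This illustrates one of the main challenges in deriving useful asset pricing equations.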
When we consider the Capital Asset Pricing Model (“CAPM”), we will look at a full portfolio problem
with many assets. For now, let’s consider Properties of Simple Portfolios:
1 risky asset with return $\tilde z$ and one riskless asset with return $R$.
Let $w$ be the portfolio weight on the risky asset.
Define, for convenience, the excess return as $\tilde x \equiv \tilde z - R$.
Alternatively, the 2nd order condition is: $E[u''(w\tilde x + R)\tilde x^2] < 0$ for all risk averse investors, so the FOC is decreasing in $w$. So, if at $w = 0$ the FOC is positive, it is zero at some $w^* > 0$.
In the simple portfolio problem, we have full insurance available, so using the Arrow-Pratt measure of
risk aversion is appropriate. If agent A is more risk averse than agent B, how do you think they should
behave towards holding the risky asset in this simple setting?
Consider uA and uB such that there exists some G with G′>0 and G″<0 and uA=G(uB). So, A is more
risk averse than B in the A-P sense.
Now, consider a single investor, so hold the utility function constant, but look at different wealth levels.
Why? Because A(W0) depends on W0. Do we get what we think we should?
Thus, to increase initial wealth we can increase $W_0$ or equivalently increase $z$ and $R$ together: $\tilde z \to \tilde z + a$ and $R \to R + a$. Thus, $\tilde x = \tilde z - R$ stays the same. So, we can leave $\tilde x$ unchanged and increase $R$ (i.e. a comparative static change in $R$ leaving the distribution of $\tilde x$ fixed is equivalent to a change in $W_0$). We decrease the price of the safe asset and leave the reward for bearing risk the same (thus decreasing the price of the risky asset as well).
…the denominator is negative, the numerator is negative or positive depending on whether A(·) is
decreasing or increasing. What if A(·) is constant? Thus, decreasing absolute risk aversion is sufficient
for the optimal amount of the risky asset, w*, to be increasing in initial wealth. The economic
interpretation…the wealth effect (driven by the dependence of A(.) on wealth) drives this move. There
is no substitution effect as both the price of the safe and risky assets have declined.
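$$\left.\frac{\partial w^*}{\partial R}\right|_{\tilde z} = \frac{E[A(\cdot)u'(\cdot)\tilde x](1 - w^*) + E[u'(\cdot)]}{E[u''(\cdot)\tilde x^2]} = (1 - w^*)\left.\frac{\partial w^*}{\partial R}\right|_{\tilde x} + \text{a negative term}$$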
Since the distribution of $z$ remains fixed, the wealth effect is not as powerful as when the distribution of $x$ was held fixed and we now see opposing forces at work. From above, if $w^* \le 1$ and $A(\cdot)$ is increasing, this entire expression is negative. So, a sufficient condition for an increase in $R$ (holding the distribution of $z$ fixed) to decrease the demand for the risky asset is increasing absolute risk aversion. The income and substitution effects work in the same direction. There are no simple sufficient conditions for a positive reaction towards $w^*$ with a general utility function. Intuitively, if $A(\cdot)$ is decreasing, since we have allowed the reward for risk bearing to decline, we would need $A(\cdot)$ to decrease fast enough to offset the substitution effect (that negative term) in order to get $\left.\frac{\partial w^*}{\partial R}\right|_{\tilde z} > 0$.
Finally, consider a comparative static change in the expected return on the risky asset; shift the distribution by adding a constant. Replace $\tilde z$ with $\tilde z + \mu$ (so $\tilde x = \tilde z - R$ becomes $\tilde x + \mu$) and see what happens as $\mu$ changes from zero holding $R$ fixed.
$$\left.\frac{\partial w^*}{\partial \mu}\right|_{R} = \frac{-E[u''(\cdot)\,w^*(\tilde x + \mu)] - E[u'(\cdot)]}{E[u''(\cdot)(\tilde x + \mu)^2]}$$
Evaluate this at $\mu = 0$.
Since we know $w^* > 0$ (if $E(x) > 0$), the first term in the numerator is negative iff $A(\cdot)$ is decreasing. The 2nd term in the numerator and the denominator are negative. So, decreasing absolute risk aversion is sufficient but not necessary for $w^*$ to be increasing in $\mu$. We have increased the reward for risk bearing (decreased the price of the risky asset), so for a decreasing absolute risk aversion investor the income and substitution effects work together. It is not necessary since we could allow for “slowly” increasing $A(\cdot)$ and still have $w^*$ increasing.
Example: CARA Utility $u(z) = -e^{-Az}$ with $A > 0$ ($A$ is constant so there is no wealth effect)
The problem is: $\max_w E[u(\tilde z)] = E[-e^{-A(w\tilde x + R)}]$, so the budget constraint is in the maximand. We have normalized $W_0 = 1$.
Let $\tilde x \sim N(a, b)$ (note: $b > 0$ since it’s a variance and presume $a > 0$ as is common).
Given that $\tilde x$ is normally distributed (and so $w\tilde x + R$ is normal), we can write the problem as:
$$\max_w E[u(\tilde z)] = -\exp\left[-A(wa + R) + \frac{A^2w^2b}{2}\right],$$
by using the moment generating function for normal random variables. Know this trick.
Note: $A(wa + R)$ is the mean of the random variable of interest, $A(w\tilde x + R)$, and $A^2w^2b$ is its variance.
Take minus the log of minus $E[u(z)]$ (a monotonic transformation) and the problem becomes:
$$\max_w A(wa + R) - \frac{A^2w^2b}{2}$$
The FOC is: $Aa - A^2w^*b = 0$ or, $w^* = \frac{a}{Ab}$
Note: (1) $\frac{\partial w^*}{\partial A} < 0$
(2) $w^* > 0$ iff $a > 0$ (as before)
(3) $w^*$ is independent of $R$ (since $A(\cdot)$ is constant) and increasing in $a$
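The closed-form CARA-normal solution can be sanity-checked by maximizing the transformed objective numerically. The sketch below is illustrative only: the parameter values are arbitrary and the ternary-search helper is a hypothetical convenience, not something used in the notes.

```python
def objective(w, A, a, b, R):
    """Transformed CARA-normal objective: A(wa + R) - A^2 w^2 b / 2."""
    return A * (w * a + R) - 0.5 * A**2 * w**2 * b

def ternary_argmax(f, lo, hi, iters=200):
    """Maximize a strictly concave function of one variable on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def optimal_weight(A, a, b):
    """Closed-form solution of the FOC: w* = a / (A b)."""
    return a / (A * b)

A, a, b = 2.0, 0.06, 0.04   # illustrative risk aversion, mean and variance of excess return
for R in (1.0, 1.05, 1.10):
    w_num = ternary_argmax(lambda w: objective(w, A, a, b, R), -10.0, 10.0)
    print(R, round(w_num, 5), optimal_weight(A, a, b))
```

The numerical maximizer agrees with $a/(Ab)$ for every $R$ tried, consistent with notes (1) to (3): the riskless return shifts the objective but not its maximizer.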
The amount of wealth $w^*$ in the risky asset depends on the mean and variance of excess returns (why?) and on preferences, parameterized by $A(\cdot)$, the measure of absolute risk aversion.
We see that $\frac{\partial w^*}{\partial R} = 0$, which implies that the dollar amount invested in the risky asset is independent of initial wealth. So, for all initial wealth levels the same dollar amount is put in the risky asset; the rest is made up of a (positive or negative) position in the riskless asset.
So,
$$w^* = \frac{pR(k + h) - kR}{hk} = \frac{R[ph - (1 - p)k]}{hk} = \frac{R\bar x}{hk} > 0 \;\text{ if } \bar x > 0 \text{ as before.}$$
Note that now $w^*$ is not independent of initial wealth (here $R$), a general result for CRRA utility. $\frac{\partial w^*}{\partial R}$ is in fact a constant, so the same proportion of initial wealth is put into the risky asset for all wealth levels. Similarly, $w^*$ increases in $R$ (since $A(\cdot)$ is decreasing) and $\bar x$.
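The formula for $w^*$ above is what one obtains under the following setup, which is an assumption of this sketch since the problem statement is not shown here: log utility, and an excess return equal to $h$ with probability $p$ and $-k$ with probability $1 - p$, so that terminal wealth is $R + w\tilde x$. Under that assumption the first-order condition can be checked directly.

```python
import math

def log_optimal_weight(p, h, k, R):
    """w* = R [p h - (1 - p) k] / (h k), the formula in the text."""
    return R * (p * h - (1 - p) * k) / (h * k)

def expected_log_utility(w, p, h, k, R):
    """ASSUMED objective: p ln(R + w h) + (1 - p) ln(R - w k)."""
    return p * math.log(R + w * h) + (1 - p) * math.log(R - w * k)

def foc(w, p, h, k, R):
    """Derivative of the assumed objective with respect to w."""
    return p * h / (R + w * h) - (1 - p) * k / (R - w * k)

p, h, k = 0.6, 0.2, 0.1     # illustrative gamble with positive mean excess return
for R in (1.05, 2.10):
    w = log_optimal_weight(p, h, k, R)
    print(R, w, w / R, foc(w, p, h, k, R))
```

Doubling $R$ doubles $w^*$, so the ratio $w^*/R$ is unchanged, which is the constant-proportion property noted in the text.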
That is, if $f(z_1)$ and $g(z_2)$ are the marginal density functions for the returns on portfolios 1 and 2, respectively, then portfolio 1 FOSD’s portfolio 2 if:
$$\int_{-\infty}^{x} f(z_1)\,dz_1 \le \int_{-\infty}^{x} g(z_2)\,dz_2 \quad \forall\, x$$
FOSD: $\tilde z_2 \overset{d}{=} \tilde z_1 + \tilde\xi$ where $\tilde\xi$ is a non-positive random variable (equality in distribution)
Dominance: $\tilde z_2 = \tilde z_1 + \tilde\xi$ where $\tilde\xi$ is a non-positive random variable (these are realizations)
This requirement of equality of distributions illustrates the important difference between FOSD and dominance. The probabilities of the states are irrelevant when considering dominance, yet they are crucial in FOSD (though the states themselves are not).
In other words, if $\begin{bmatrix} 2 & 1 \\ 0 & 0 \end{bmatrix}$ represents the market returns (rows are states, columns are assets), then asset 1 dominates asset 2 (also, 1 FOSD’s 2; why?), so there is an arbitrage opportunity.
If $\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$ represents the market, then there is no dominance or arbitrage.
However, if prob(state 1) $\ge$ 0.5, then asset 1 FOSD’s asset 2.
Investors don’t disagree about dominance as long as they view the same set of states as possible. They
can, however, disagree about FOSD if they have different probability assessments.
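The difference between dominance (a state-by-state comparison) and FOSD (a comparison of distributions) can be made concrete with the two small markets above. The following sketch is illustrative; the function names are hypothetical and the FOSD test is the weak form, comparing the two distribution functions at every outcome.

```python
def dominates(z1, z2):
    """State-by-state dominance: z1 >= z2 in every state, strictly in at least one."""
    return all(a >= b for a, b in zip(z1, z2)) and any(a > b for a, b in zip(z1, z2))

def cdf(outcomes, probs, x):
    """P(return <= x) for a discrete return with one outcome per state."""
    return sum(pr for z, pr in zip(outcomes, probs) if z <= x)

def fosd(z1, z2, probs, tol=1e-12):
    """Weak FOSD of asset 1 over asset 2: CDF of 1 never above CDF of 2."""
    points = sorted(set(z1) | set(z2))
    return all(cdf(z1, probs, x) <= cdf(z2, probs, x) + tol for x in points)

# Market [[2, 1], [0, 0]]: asset 1 pays (2, 0), asset 2 pays (1, 0) across the two states.
print(dominates([2, 0], [1, 0]), fosd([2, 0], [1, 0], [0.3, 0.7]))

# Market [[2, 0], [0, 1]]: asset 1 pays (2, 0), asset 2 pays (0, 1).
for prob_s1 in (0.4, 0.5, 0.6):
    probs = [prob_s1, 1 - prob_s1]
    print(prob_s1, dominates([2, 0], [0, 1]), fosd([2, 0], [0, 1], probs))
```

In the second market dominance never holds, while the FOSD verdict flips as the assumed probability of state 1 crosses 0.5, so two investors with different beliefs can disagree about it.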
An immediate result is that no non-satiated investor will ever hold all of his/her wealth in a/the risky asset, $\tilde z_2$, that is first order stochastically dominated. This is because $\tilde z_2$ and $\tilde z_1 + \tilde\xi$ have the same distributions, so: $E[u(z_2)] = E[u(z_1 + \xi)] < E[u(z_1)]$ by the non-positivity of $\tilde\xi$ and the strict monotonicity of $u(\cdot)$.
Think of $\tilde\xi$ as a vector of all zeros and a loss of one dollar in one state; the idea is why take the chance of throwing away a dollar when the two portfolios “cost” the same.
Alternatively, we can compare 2 “normal” distributions, where one is the other minus a constant:
Investors may hold FOSD’ed assets as part of their optimal portfolios (but will never hold FOSD’ed
portfolios). Recall dominated assets cannot exist or no investor has an optimal portfolio.
Why? Think of our example $\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$: when prob($s_1$) $\ge$ prob($s_2$), asset 1 FOSDs asset 2.
Asset 2, however, is a hedge against the risk of asset 1. It will be held in a positive amount by all strictly
risk averse investors regardless of the probabilities of the states. It is 2’s negative correlation with 1 that
gives it value – however, if you had to choose between them, nobody would choose asset 2.
CAPM Example – negative beta assets are useful and held in portfolios, but you would never put all of
your wealth in a negative beta portfolio. Why?
4 CAPM
What we want to do here is build some intuitions by presenting a special case of more general future
results on the risk/return relation.
Mean-Variance Analysis: The basis for an equilibrium pricing relation known as the Capital Asset Pricing
Model (“CAPM”) which relates a measure of the risk of an asset to the expected return of that asset.
Foundation: The risk of a portfolio can be measured by the variance of that portfolio’s return. The idea is that there is a derived utility function over mean return and variance of return so that these two parameters completely determine expected utility, $E[u(\tilde z)] = v(\bar z_p, \sigma_p^2)$, where generally we think that $v_1 > 0$ and $v_2 < 0$. This second condition is necessary for consistency with the maximization of expected utility; the first, in the absence of a riskless asset, is not.
Notation: The matrix $Z$ again represents our market, or we can use the random vector $\tilde z$. Note: if there exists a riskless asset, it is not included in $Z$; we keep track of it “on the side.” $\bar z$ is a vector of expected returns and $\Sigma$ is the variance-covariance matrix of $Z$. For any portfolio $w$:
Expected Returns: $\bar z_w = w'\bar z + w_0 R = \sum_{i=0}^N w_i\bar z_i$; $\;\bar z_0 \equiv R$; $\;\sum_{i=0}^N w_i = 1$
Variance of return: $\sigma_w^2 = w'\Sigma w = \sum_{i=1}^N\sum_{j=1}^N w_i w_j\sigma_{ij}$
Covariance Vector: $\sigma_w = \Sigma w$
Given our derived mean/variance utility function, our problem is to describe the mean-variance efficient
set of portfolios: the set of portfolios with the largest mean for a given level of variance. Any agent with
such a utility function will choose a portfolio from this set.
We will find that it is easier to work with the larger set – the minimum-variance portfolios – the set of
portfolios with the smallest variance (or std. dev.) for each given level of expected return.
We first consider the problem of deriving the minimum variance set in the absence of a riskless asset.
To derive the minimum variance set we minimize (by choice of a portfolio w) portfolio variance (a
quadratic function of w) subject to the linear (in w) constraints – that the portfolio’s expected return be
at a given level and the budget constraint. The problem is nicely behaved and the first-order conditions
are necessary and sufficient due to this special structure. Symbolically:
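Lagrangian:
$$L = \tfrac12 w'\Sigma w + \lambda(1 - 1'w) + \gamma(\mu - \bar z'w)$$
FOCs:
(1) $\frac{\partial L}{\partial w} = \Sigma w^* - \lambda 1 - \gamma\bar z = 0$
(2) $\frac{\partial L}{\partial \lambda} = 1 - 1'w^* = 0$
(3) $\frac{\partial L}{\partial \gamma} = \mu - \bar z'w^* = 0$
Rewrite (1):
(1′) $w^* = \lambda\Sigma^{-1}1 + \gamma\Sigma^{-1}\bar z$ (note this is really $w^*(\mu)$)
Now solve for $\lambda(\mu)$ and $\gamma(\mu)$ (the whole problem is based on a given $\mu$)
(2′) $1 = 1'w^* = \lambda\,1'\Sigma^{-1}1 + \gamma\,1'\Sigma^{-1}\bar z$
(3′) $\mu = \bar z'w^* = \lambda\,\bar z'\Sigma^{-1}1 + \gamma\,\bar z'\Sigma^{-1}\bar z$
Define: $A \equiv 1'\Sigma^{-1}1 > 0$, $\;B \equiv 1'\Sigma^{-1}\bar z$, $\;C \equiv \bar z'\Sigma^{-1}\bar z > 0$, $\;\Delta \equiv AC - B^2 > 0$ (as long as $\bar z \ne k1$)
$$\lambda(\mu) = \frac{C - \mu B}{\Delta}, \qquad \gamma(\mu) = \frac{\mu A - B}{\Delta}$$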
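This lets us find the equation of the minimum variance set.
$$\sigma^2(\mu) = w^{*\prime}\Sigma w^* = w^{*\prime}\Sigma(\lambda\Sigma^{-1}1 + \gamma\Sigma^{-1}\bar z) = \lambda\,w^{*\prime}1 + \gamma\,w^{*\prime}\bar z = \lambda + \gamma\mu$$
Substituting for $\lambda$ and $\gamma$ and rearranging we find:
$$\sigma^2(\mu) = \frac{A\mu^2 - 2B\mu + C}{\Delta}, \qquad \Delta = AC - B^2.$$
This is the equation of a parabola in (σ², μ) space. If we write this equation with μ as a function of σ, i.e. in (σ, μ) space, it is the equation of a hyperbola.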
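Now we want to locate a specific portfolio or two. We can locate the global minimum variance portfolio, $w_g$, by noting that this portfolio solves the minimization problem with a slack return constraint: $\gamma = 0$.
From $w^* = \lambda\Sigma^{-1}1 + \gamma\Sigma^{-1}\bar z$, if $\gamma = 0$, then $w_g = \lambda\Sigma^{-1}1$. Also, $\gamma = \frac{\mu_g A - B}{\Delta} = 0$ or, $\mu_g A - B = 0$. So, $\mu_g = \frac{B}{A}$. Thus, $B > 0$ iff $\mu_g > 0$, which is what we think of as the typical case.
If $\mu_g = \frac{B}{A}$ then $\lambda = \frac{1}{A}$, which we find from either $\lambda = \frac{C - \mu_g B}{\Delta}$ or from $1'w_g = 1$.
So, we have:
$$w_g = \frac{\Sigma^{-1}1}{A} = \frac{\Sigma^{-1}1}{1'\Sigma^{-1}1}, \qquad \mu_g = \frac{B}{A}, \qquad \sigma_g^2 = \frac{1}{A} \;\text{ from } w_g'\Sigma w_g.$$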
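Alternatively, we take the derivative of the variance equation to find its unconstrained minimum:
$$\frac{\partial\sigma^2}{\partial\mu} = \frac{\partial}{\partial\mu}\left[\frac{A\mu^2 - 2B\mu + C}{\Delta}\right] = \frac{2A\mu - 2B}{\Delta} = 0 \;\text{ at } \mu_g, \quad\text{so } \mu_g = \frac{B}{A}.$$
Then,
$$\lambda = \frac{C - \mu_g B}{\Delta} = \frac{C - \frac{B^2}{A}}{AC - B^2} = \frac{1}{A}, \qquad \gamma = \frac{\mu_g A - B}{\Delta} = \frac{B - B}{\Delta} = 0$$
and,
$$w_g = \frac{1}{A}\Sigma^{-1}1 + 0\cdot\Sigma^{-1}\bar z = \frac{\Sigma^{-1}1}{1'\Sigma^{-1}1} \;\text{ as before.}$$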
From $w^* = \lambda\Sigma^{-1}1 + \gamma\Sigma^{-1}\bar z$ we can see that all minimum variance portfolios can be formed through portfolio combinations of 2 distinct portfolios. Since $w_g$ corresponds to one of these portfolios, it is natural to look at the ‘other one’ implicit in the equation for $w^*$.
That is, define $w_d \equiv \frac{\Sigma^{-1}\bar z}{1'\Sigma^{-1}\bar z} = \frac{\Sigma^{-1}\bar z}{B}$
We just saw that all min-var portfolios are portfolio combinations of wg and wd and all portfolio
combinations of these portfolios are min-var portfolios. If investors want to hold any min-var portfolio
they don’t need access to all tradable assets; they only need access to 2 mutual funds, wg and wd.
wg is the global minimum variance portfolio – can we similarly locate wd?
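$w_d$ has expected return: $\mu_d = \bar z'w_d = \frac{\bar z'\Sigma^{-1}\bar z}{B} = \frac{C}{B}$
$w_d$ has variance of return: $\sigma_d^2 = w_d'\Sigma w_d = \frac{\bar z'\Sigma^{-1}\Sigma\Sigma^{-1}\bar z}{B^2} = \frac{\bar z'\Sigma^{-1}\bar z}{B^2} = \frac{C}{B^2}$
Further, we know that $w_d$ is on the upper limb of the hyperbola (in the “normal” case) since:
$$\bar z_d - \bar z_g = \frac{C}{B} - \frac{B}{A} = \frac{\Delta}{AB}$$
We know that $A$ and $\Delta = AC - B^2$ are positive, so $\bar z_d - \bar z_g$ is positive if $B > 0$. This is true if $\mu_g > 0$, which is “normal.”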
We may also use any two distinct min-var portfolios in place of wg and wd – they will “sketch out” or
span the entire min-var set, i.e. the minimum variance set is derived from portfolio combinations of any
two distinct minimum variance portfolios.
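The frontier algebra can be verified on a small numerical case. The sketch below uses exact fractions and a three-asset covariance matrix and mean vector that are invented for illustration; the helper names are hypothetical. It builds $A$, $B$, $C$, $\Delta = AC - B^2$, the two funds $w_g$ and $w_d$, and a frontier portfolio for a target mean.

```python
from fractions import Fraction as F

def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination on exact fractions."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Illustrative inputs (assumptions for this sketch).
Sigma = [[F(4, 100), F(1, 100), F(0)],
         [F(1, 100), F(9, 100), F(2, 100)],
         [F(0), F(2, 100), F(16, 100)]]
zbar = [F(105, 100), F(110, 100), F(120, 100)]
ones = [F(1)] * 3

Si1, Siz = solve(Sigma, ones), solve(Sigma, zbar)     # Sigma^{-1} 1 and Sigma^{-1} zbar
A, B, C = dot(ones, Si1), dot(ones, Siz), dot(zbar, Siz)
Delta = A * C - B * B

w_g = [x / A for x in Si1]    # global minimum variance portfolio
w_d = [x / B for x in Siz]    # the 'other' fund

def variance(w):
    return dot(w, [dot(row, w) for row in Sigma])

def frontier_portfolio(mu):
    """w*(mu) = lambda Sigma^{-1} 1 + gamma Sigma^{-1} zbar."""
    lam, gam = (C - mu * B) / Delta, (mu * A - B) / Delta
    return [lam * x + gam * y for x, y in zip(Si1, Siz)]

def frontier_variance(mu):
    return (A * mu * mu - 2 * B * mu + C) / Delta

mu = F(115, 100)
w = frontier_portfolio(mu)
print([float(x) for x in w], float(variance(w)), float(frontier_variance(mu)))
```

With exact arithmetic the portfolio variance and the parabola formula agree to the last digit, and $w_g$ and $w_d$ have the means and variances derived above.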
$w_a = (1-a)w_g + aw_d$
$w_b = (1-b)w_g + bw_d$
$w_a$ and $w_b$ are min-var portfolios since they are portfolio combinations of $w_g$ and $w_d$ by construction.
From $w^* = \lambda A\,w_g + \gamma B\,w_d$ we can get:
$$w^* = \frac{\lambda A + b - 1}{b - a}\,w_a + \frac{1 - a - \lambda A}{b - a}\,w_b$$
by solving the equations for $w_a$ and $w_b$ for $w_g$ and $w_d$ and substituting these into the equation for $w^*$ and remembering that $\lambda A + \gamma B = 1$.
Since the coefficients on $w_a$ and $w_b$ sum to one, the proposition is proved.
The portfolio weight of any asset is linear in $\mu$ along the min-var frontier. So, as we increase $\mu$, the portfolio weight on any asset either linearly increases or decreases when we stay in the set of minimum variance portfolios.
$$w^* = \lambda A\,w_g + \gamma B\,w_d = \frac{AC - \mu AB}{\Delta}w_g + \frac{\mu AB - B^2}{\Delta}w_d$$
$$= \frac{AC - B^2}{\Delta}w_g + \frac{B^2 - \mu AB}{\Delta}w_g + \frac{\mu AB - B^2}{\Delta}w_d = w_g + \frac{\mu AB - B^2}{\Delta}(w_d - w_g)$$
So,
$$w_i^* = w_{ig} + \frac{\mu AB - B^2}{\Delta}(w_{id} - w_{ig})$$
If $B > 0$, and since $\Delta > 0$ so that $\frac{AB}{\Delta} > 0$, assets represented more heavily in $w_d$ than in $w_g$ are held in larger and larger amounts in min-var portfolios as we increase $\mu$ above $\mu_g$.
Also note that for each asset there is one min-var portfolio in which it has zero weight. In all portfolios below (above if $w_{id} - w_{ig}$ is negative) it is sold short.
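$$\operatorname{Cov}(\tilde z_g, \tilde z_p) = w_g'\Sigma w_p = \frac{1'\Sigma^{-1}}{1'\Sigma^{-1}1}\Sigma w_p = \frac{1'w_p}{1'\Sigma^{-1}1} = \frac{1}{A} > 0$$
This is true for any asset or portfolio. Note $1/A$ is also $w_g$’s covariance with itself (i.e. its variance).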
Consider any two min-var portfolios: $w_a = (1-a)w_g + aw_d$ and $w_b = (1-b)w_g + bw_d$ (without loss of generality). Let’s assume $a \ne 0$ and $b \ne 0$ so we are not looking at $w_g$ in either case.
$$\operatorname{Cov}(\tilde z_a, \tilde z_b) = (1-a)(1-b)\sigma_g^2 + ab\,\sigma_d^2 + [a(1-b) + b(1-a)]\sigma_{dg}$$
$$= \frac{(1-a)(1-b)}{A} + \frac{abC}{B^2} + \frac{a + b - 2ab}{A} = \frac{1}{A} + \frac{ab\Delta}{AB^2}$$
This is the covariance between any two min-var portfolios; it is completely determined by the choice of $a$ and $b$.
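Fix any $a \ne 0$ (again so $w_a \ne w_g$); by choosing $b$ we can get $\sigma_{ab}$ to be any value in $(-\infty, \infty)$. Thus, moving $b$ you pass through zero:
$$0 = \frac{1}{A} + \frac{ab\Delta}{AB^2} \;\Rightarrow\; 0 = B^2 + ab\Delta$$
or, $b = -\frac{B^2}{a\Delta} \ne 0$ (recall $\Delta = 0$ only if $\bar z = k1$)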
So, if a > 0 then b < 0, and if wa is on the upper limb of the hyperbola, then the min-var portfolio with a
zero covariance with wa is on the lower limb and vice versa.
Look at where wg and wd are and the weight on wd implied by a > 0 and b < 0.
In fact, it can easily be shown, by finding the slope of the frontier at a particular point and finding the
interception of the tangent line with the z axis, that the expected return on a portfolio with zero
covariance with a frontier portfolio is located at this interception.
Try this for wd for homework.
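The zero-covariance partner of a frontier portfolio can also be checked numerically. As a hedged illustration, the sketch below reuses an invented three-asset example (covariance matrix and means are assumptions, helper names are hypothetical), picks $a = 1/2$, and confirms that $b = -B^2/(a\Delta)$ gives a min-var portfolio uncorrelated with $w_a$.

```python
from fractions import Fraction as F

def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination on exact fractions."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

# Illustrative inputs (assumptions for this sketch).
Sigma = [[F(4, 100), F(1, 100), F(0)],
         [F(1, 100), F(9, 100), F(2, 100)],
         [F(0), F(2, 100), F(16, 100)]]
zbar = [F(105, 100), F(110, 100), F(120, 100)]
ones = [F(1)] * 3

Si1, Siz = solve(Sigma, ones), solve(Sigma, zbar)
A, B, C = dot(ones, Si1), dot(ones, Siz), dot(zbar, Siz)
Delta = A * C - B * B
w_g, w_d = [x / A for x in Si1], [x / B for x in Siz]

def blend(a):
    """Min-var portfolio w_a = (1 - a) w_g + a w_d."""
    return [(1 - a) * g + a * d for g, d in zip(w_g, w_d)]

def cov(w1, w2):
    return dot(w1, [dot(row, w2) for row in Sigma])

def zero_cov_partner(a):
    """b = -B^2 / (a Delta), the weight giving zero covariance with w_a."""
    return -B * B / (a * Delta)

a = F(1, 2)
b = zero_cov_partner(a)
print(float(b), cov(blend(a), blend(b)))
print(float(dot(zbar, blend(a))), float(dot(zbar, blend(b))))   # expected returns
```

With $a > 0$ the partner weight $b$ is negative, so the two portfolios sit on opposite limbs of the hyperbola, as described above.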
The covariance of a min-var portfolio with any other asset or portfolio:
$$\operatorname{Cov}(\tilde z_m, \tilde z_p) = w_m'\Sigma w_p = m\,w_g'\Sigma w_p + (1 - m)\,w_d'\Sigma w_p$$
$$= \left[\frac{m\,1'\Sigma^{-1}}{A} + \frac{(1 - m)\,\bar z'\Sigma^{-1}}{B}\right]\Sigma w_p = \frac{m}{A} + \frac{(1 - m)\bar z_p}{B}$$
Thus, the covariance of the return on any asset or portfolio $\tilde z_p$ with the return on a min-var portfolio $\tilde z_m$ is a function of the expected return on the arbitrary portfolio alone (and, of course, which min-var portfolio is chosen). Picture!
Comes from $w_g = \frac{\Sigma^{-1}1}{A}$ and $w_d = \frac{\Sigma^{-1}\bar z}{B}$, so for any portfolio $p$ with $1'w_p = 1$ we see that $\sigma_{pd} = \frac{\bar z_p}{B}$ and (of course) $\sigma_{pg} = \frac{1}{A}$. The covariance of the return on any min-var portfolio and any other portfolio is some combination of these two covariances.
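FOCs:
(1) $\frac{\partial L}{\partial w} = \Sigma w^* - \gamma(\bar z - R1) = 0$
(2) $\frac{\partial L}{\partial \gamma} = \mu - (\bar z - R1)'w^* - R = 0$
From (1), $w^* = \gamma\Sigma^{-1}(\bar z - R1)$. Solve for $\gamma$ using $\mu - R = (\bar z - R1)'w^*$, so $\gamma$ solves $\mu - R = \gamma(\bar z - R1)'\Sigma^{-1}(\bar z - R1) = \gamma[C - 2RB + R^2A]$, i.e. $\gamma = \frac{\mu - R}{C - 2RB + R^2A}$.
$$\sigma^2 = w^{*\prime}\Sigma w^* = w^{*\prime}\Sigma\,\gamma\Sigma^{-1}(\bar z - R1) = \gamma\,w^{*\prime}(\bar z - R1) = \gamma(\mu - R)$$
So, $\sigma^2 = \frac{(\mu - R)^2}{C - 2RB + R^2A}$, again a parabola in (σ², μ) space.
In (σ, μ) space this is: $\mu = R \pm \sigma[C - 2RB + R^2A]^{1/2}$,
a pair of rays originating at $R$: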
Once again, all min-var portfolios are portfolio combinations of any 2 min-var portfolios. One natural
choice is, of course, the riskless asset, and for the other, use the frontier (min-var) portfolio that includes
none of the riskless asset. (How do we know that there is one and only one? How would you prove this?) This
risky asset only portfolio on the min-var frontier is called the ‘tangency portfolio.’
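$w_{0t} = 0$ so, $w_t = \frac{\Sigma^{-1}(\bar z - R1)}{B - AR}$ from $1'w_t = 1$
$$\bar z_t = \bar z'w_t = \frac{\bar z'\Sigma^{-1}\bar z - R\,\bar z'\Sigma^{-1}1}{B - AR} = \frac{C - BR}{B - AR}$$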
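$$\sigma_t^2 = w_t'\Sigma w_t = \frac{C - 2RB + R^2A}{(B - AR)^2}$$
The interesting, but not surprising result, is that $w_t$ is also on the risky asset only frontier. We can use $\bar z_t$ and $\sigma_t^2$ to show this.
On the risky asset only frontier we know $\sigma^2 = \frac{A\mu^2 - 2B\mu + C}{\Delta}$ with $\Delta = AC - B^2$. Evaluating at $\mu = \bar z_t = \frac{C - BR}{B - AR}$:
$$\sigma^2 = \frac{A\left(\frac{C - BR}{B - AR}\right)^2 - 2B\left(\frac{C - BR}{B - AR}\right) + C}{\Delta} = \frac{A(C - BR)^2 - 2B(C - BR)(B - AR) + C(B - AR)^2}{\Delta(B - AR)^2}$$
$$= \frac{\Delta(AR^2 - 2BR + C)}{\Delta(B - AR)^2} = \frac{AR^2 - 2BR + C}{(B - AR)^2}, \text{ which is } \sigma_t^2.$$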
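$w_t$ is the portfolio on the risky asset frontier whose tangent line hits the $\bar z$ axis at R,
i.e. $w_t$ is in “both” min-var sets.
[Figure: the risky-asset-only frontier (hyperbola) with the global min-var portfolio $w_g$ at $\bar z_g = B/A$, and the ray from R on the $\bar z$ axis tangent to the frontier at $w_t$ (expected return $\bar z_t$).]
This is the picture for the case $R < B/A = \bar z_g$. Why do we think this is natural? If $R > B/A$, the
tangency is between the lower limb of the hyperbola and the lower ray. Agents hold the riskless asset
and are long (short) $w_t$ if $R < (>)\ B/A$.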
Any portfolio on the pair of rays can be formed from portfolio combinations of the two portfolios wt
and the risk free asset – called “2 fund money separation.”
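The claim that the tangency portfolio sits in both min-var sets can be checked numerically. The sketch below is only an illustration: the three expected returns, the variances and R are made-up numbers, and the covariance matrix is assumed diagonal so that its inverse can be written down without a linear algebra library. Exact fractions are used so the equalities hold without rounding.

```python
from fractions import Fraction as F

# Hypothetical inputs, not taken from the notes. Sigma is assumed diagonal,
# so Sigma^{-1} is just 1/variance asset by asset.
zbar = [F(11, 10), F(6, 5), F(13, 10)]   # expected (gross) returns
var = [F(1, 25), F(9, 100), F(4, 25)]    # diagonal of Sigma
R = F(21, 20)                            # riskless (gross) return

inv = [1 / v for v in var]                        # diagonal of Sigma^{-1}
A = sum(inv)                                      # 1' Sigma^{-1} 1
B = sum(i * z for i, z in zip(inv, zbar))         # 1' Sigma^{-1} zbar
C = sum(i * z * z for i, z in zip(inv, zbar))     # zbar' Sigma^{-1} zbar

# Tangency portfolio: w_t = Sigma^{-1}(zbar - R 1) / (B - A R)
w_t = [i * (z - R) / (B - A * R) for i, z in zip(inv, zbar)]
z_t = sum(w * z for w, z in zip(w_t, zbar))       # expected return of w_t
var_t = sum(w * w * v for w, v in zip(w_t, var))  # w_t' Sigma w_t

# Variance of the risky-asset-only frontier portfolio with the same mean
frontier_var = (A * z_t**2 - 2 * B * z_t + C) / (A * C - B**2)

print(sum(w_t) == 1, var_t == frontier_var)
```

With these inputs R lies below B/A, so this is the "natural" case in which the tangency is on the upper limb.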
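What if $R = B/A$? No tangency:
Use
$$w^* = \lambda\Sigma^{-1}(\bar z - R\mathbf{1}) = \lambda\Sigma^{-1}\left(\bar z - \tfrac{B}{A}\mathbf{1}\right) = \frac{\mu - R}{C - 2RB + R^2A}\,\Sigma^{-1}\left(\bar z - \tfrac{B}{A}\mathbf{1}\right).$$
Look at $\mathbf{1}'w^*$:
$$\mathbf{1}'w^* = \frac{\mu - R}{C - 2RB + R^2A}\,\mathbf{1}'\Sigma^{-1}\left(\bar z - \tfrac{B}{A}\mathbf{1}\right) = \frac{\mu - R}{C - 2RB + R^2A}\left(B - \tfrac{B}{A}A\right) = 0.$$
Thus, $\mathbf{1}'w^* = 0$ so $w_0^* = 1$. We see agents holding the riskless asset and an arbitrage portfolio of the
risky assets.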
Since in the presence of a riskless asset all the min-var portfolios are perfectly correlated, the same
(scaled) result holds for them (substitute the relation wa = k wt).
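$\Sigma w_t$ is the vector of covariances of the returns on the individual assets with the return $\tilde z_t$ since
the covariance of a random variable with a weighted sum of random variables is the weighted sum of the
individual covariances.
$$\bar z - R\mathbf{1} = \frac{\boldsymbol{\sigma}_t}{\sigma_t^2}(\bar z_t - R) = \boldsymbol{\beta}_t(\bar z_t - R), \qquad \boldsymbol{\sigma}_t \equiv \Sigma w_t$$
Or, for each asset:
$$\bar z_i - R = \frac{\sigma_{it}}{\sigma_t^2}(\bar z_t - R) = \beta_{it}(\bar z_t - R) \quad\text{or}\quad \bar z_i = R + \beta_{it}(\bar z_t - R)$$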
The story I always tell my MBA students goes like this: The market pays you for two things (1)
surrendering your capital (delaying consumption) for which you get the “rental rate” R and (2) taking a
part of the total or aggregate risk that the market must distribute across all investors (in this model
aggregate risk is measured by $\sigma_t^2$, why?) for which you get a risk premium. That’s what this equation
says. In particular, your risk premium is determined not by the total risk you hold but by the portion of
the aggregate risk you hold. That portion is measured by beta, and the excess return on the tangency
portfolio is the price per unit of risk.
Because, when there exists a riskless asset, all min-var portfolios are perfectly correlated with the
tangency portfolio (each is just a blend of the riskless asset and the tangency portfolio), exactly the same
result holds for every min-var portfolio (any portfolio on the rays).
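A small numerical sketch of the beta pricing relation, under assumptions that are not in the notes: three assets with made-up expected returns, a diagonal covariance matrix (so the inverse is elementwise), and an arbitrary scale factor k for the second min-var portfolio on the ray. It checks that each asset's excess return equals its beta times the excess return on the tangency portfolio, and that the same premia come out when the reference portfolio is a blend of the riskless asset and the tangency portfolio.

```python
from fractions import Fraction as F

# Hypothetical inputs; Sigma is assumed diagonal for simplicity.
zbar = [F(11, 10), F(6, 5), F(13, 10)]
var = [F(1, 25), F(9, 100), F(4, 25)]
R = F(21, 20)

inv = [1 / v for v in var]
A = sum(inv)
B = sum(i * z for i, z in zip(inv, zbar))

w_t = [i * (z - R) / (B - A * R) for i, z in zip(inv, zbar)]
z_t = sum(w * z for w, z in zip(w_t, zbar))
var_t = sum(w * w * v for w, v in zip(w_t, var))

# Covariance of each asset with the tangency portfolio: (Sigma w_t)_i
cov_t = [v * w for v, w in zip(var, w_t)]
beta_t = [c / var_t for c in cov_t]
premia = [b * (z_t - R) for b in beta_t]

# A portfolio on the ray: fraction k in w_t, the rest in the riskless asset.
k = F(1, 2)
z_a = R + k * (z_t - R)
beta_a = [k * c / (k * k * var_t) for c in cov_t]   # cov with a / var of a
premia_a = [b * (z_a - R) for b in beta_a]

print(premia == [z - R for z in zbar], premia_a == premia)
```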
When there is no riskless asset, then we can use any min-var portfolio and a special companion for it to
most simply express expected returns.
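Recall: $w_g = \frac{\Sigma^{-1}\mathbf{1}}{A}$, $w_d = \frac{\Sigma^{-1}\bar z}{B}$
Let $w_a = (1 - a)w_g + a\,w_d$, $a \neq 0$, $a \neq 1$
$$\boldsymbol{\sigma}_a = \Sigma w_a = \frac{1 - a}{A}\mathbf{1} + \frac{a}{B}\bar z \quad\text{and,}$$
$$\sigma_a^2 = w_a'\Sigma w_a = \frac{1 - a}{A} + \frac{a\,\bar z_a}{B}$$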
For any other portfolio $w_p$ (not necessarily min-var; we require only that it be a positive investment
portfolio, $\mathbf{1}'w_p = 1$), we find the covariance:
$$\sigma_{pa} = w_p'\Sigma w_a = \frac{1 - a}{A} + \frac{a\,\bar z_p}{B}.$$
As we saw before, the covariance of any asset/portfolio’s (p) return with a frontier portfolio’s return is a
function of the portfolio’s (p’s) expected return. Reversing this interpretation, this indicates that $\bar z_p$ is a
linear function of $\tilde z_p$’s covariance with any min-var portfolio.
Solve the equations for $\sigma_a^2$ and $\sigma_{pa}$ for $\frac{1-a}{A}$ and $\frac{a}{B}$, substitute the resulting relations into $\boldsymbol{\sigma}_a = \Sigma w_a$:
$$\bar z = \frac{\bar z_p\sigma_a^2 - \bar z_a\sigma_{ap}}{\sigma_a^2 - \sigma_{ap}}\,\mathbf{1} + \frac{\bar z_a - \bar z_p}{\sigma_a^2 - \sigma_{ap}}\,\boldsymbol{\sigma}_a$$
Now, make a clever choice of p – make it a portfolio (could be the frontier portfolio although this is not
necessary) with a zero covariance with a (i.e. $\sigma_{pa} = 0$). Then, the above simplifies to:
$$\bar z = \bar z_z\mathbf{1} + \frac{\boldsymbol{\sigma}_a}{\sigma_a^2}(\bar z_a - \bar z_z) = \bar z_z\mathbf{1} + \boldsymbol{\beta}_a(\bar z_a - \bar z_z),$$
where the subscript z is for “zero beta” portfolio.
Thus, when there is no riskless asset, expected returns are linearly related to an asset’s beta with a
reference portfolio if that reference portfolio is a minimum variance portfolio.
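To see the zero-beta relation at work, here is a minimal sketch with invented numbers (three assets, a diagonal covariance matrix, and a = 5 for the reference portfolio; none of these come from the notes). It builds the frontier portfolio that has zero covariance with the reference portfolio and checks that every asset's expected return is linear in its beta.

```python
from fractions import Fraction as F

# Hypothetical inputs; Sigma is assumed diagonal so Sigma^{-1} is elementwise.
zbar = [F(11, 10), F(6, 5), F(13, 10)]
var = [F(1, 25), F(9, 100), F(4, 25)]

inv = [1 / v for v in var]
A = sum(inv)
B = sum(i * z for i, z in zip(inv, zbar))
C = sum(i * z * z for i, z in zip(inv, zbar))

w_g = [i / A for i in inv]                         # global min-var portfolio
w_d = [i * z / B for i, z in zip(inv, zbar)]       # the portfolio w_d

def mix(weight):
    """Frontier portfolio (1 - weight) * w_g + weight * w_d."""
    return [(1 - weight) * g + weight * d for g, d in zip(w_g, w_d)]

def mean(w):
    return sum(x * z for x, z in zip(w, zbar))

def cov(w1, w2):
    return sum(x * v * y for x, v, y in zip(w1, var, w2))

a = F(5)                                           # hypothetical choice of a
w_a = mix(a)
z_a, var_a = mean(w_a), cov(w_a, w_a)

# Zero covariance with a requires (1 - a)/A + a * z_z / B = 0.
z_z = -B * (1 - a) / (a * A)
weight_z = (z_z - B / A) / (C / B - B / A)         # frontier weight giving mean z_z
w_z = mix(weight_z)

beta_a = [v * w / var_a for v, w in zip(var, w_a)]  # (Sigma w_a)_i / sigma_a^2
fitted = [z_z + b * (z_a - z_z) for b in beta_a]

print(cov(w_z, w_a) == 0, fitted == zbar)
```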
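[Figure: min-var frontier with expected return $\bar z$ on the vertical axis, showing $w_a$ (expected return $\bar z_a$), $w_g$, the zero-beta portfolio $w_z$ (expected return $\bar z_z$), and a line labeled “Capital Market Line.”]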
This looks like the “Black CAPM” named for Fisher Black.
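A special case is $w_d$. Since $w_d$ is located as it is (with $\bar z_z = 0$), if $w_d$ is used as the reference portfolio (a =
1 above), it must be that $\bar z = \boldsymbol{\beta}_d\,\bar z_d$. We can verify this using $\Sigma w_d = \frac{\bar z}{B}$.
Then, $\boldsymbol{\sigma}_d = \frac{\bar z}{B}$, $\sigma_d^2 = \frac{\bar z_d}{B}$, and $\bar z = B\,\boldsymbol{\sigma}_d = \frac{\bar z_d}{\sigma_d^2}\,\boldsymbol{\sigma}_d = \boldsymbol{\beta}_d\,\bar z_d$.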
We have shown that if the reference portfolio is a min-var portfolio, the expected return on any asset is
a linear function of the covariance between that asset’s return and the return on the reference portfolio.
Important point: The linearity property of expected returns and beta or covariance holds only if the
benchmark portfolio is in the min-var set.
Since 1′wp = 1, 1 and wp is a portfolio combination of wg and wd and so is in the min-var set. So,
the vector of expected returns is linear in the betas with a reference portfolio if and only if the reference
portfolio is a minimum variance portfolio. Roll critique.
Variance Decomposition
The expected returns relation is telling us that only a part of an individual asset’s variance (or risk) is
priced, only that part that covaries with the return on a min-var portfolio. It is instructive to decompose
the variance of any asset or portfolio to see just how this happens. For any portfolio wp let wm be the
min-variance portfolio with the same expected return. Write:
$$w_p = w_g + (w_m - w_g) + (w_p - w_m) = w_g + \omega_s + \omega_d,$$
where $\omega_s$ and $\omega_d$ are arbitrage (zero-investment) portfolios.
The value of this decomposition is that the returns on these portfolios are orthogonal (mutually), i.e., the
covariance between any pair is zero:
$$\sigma_{gs} = w_g'\Sigma\omega_s = w_g'\Sigma(w_m - w_g) = 0,$$
$$\sigma_{sd} = \omega_s'\Sigma\omega_d = w_m'\Sigma(w_p - w_m) - w_g'\Sigma(w_p - w_m) = 0$$
The second term is zero as above and the first is also zero since the covariance of any portfolio with wm
(a min-var portfolio) is completely determined by that portfolio’s expected return and wm and wp have the
same expected return by construction.
Thus, from $w_p = w_g + \omega_s + \omega_d$ and the mutual orthogonality of the terms we are able to decompose
the variance of the portfolio p as follows:
$$\sigma_p^2 = \sigma_g^2 + \sigma_s^2 + \sigma_d^2$$
Labels:
$\sigma_g^2$ = “unavoidable risk” – global min-var portfolio risk
$\sigma_s^2$ = “systematic risk” – added risk that provides added compensation ($\bar z_p - \bar z_g$)
$\sigma_d^2$ = “diversifiable risk” – due to being off the min-var frontier, added risk that
brings no added expected return.
Thus, only $\sigma_g^2$ and $\sigma_s^2$ contribute to expected return. $\sigma_d^2$ can be large, small, or zero without
affecting expected return. Draw a picture and see!
Alternatively: $w_p = w_m + \omega_d$
Recall, $w_m$ is chosen so $\bar z_p = \bar z_m$. Let a be any min-var portfolio. Then
$$\bar z_p = \bar z_z + \beta_{ap}(\bar z_a - \bar z_z) \quad\text{and,}\quad \bar z_m = \bar z_z + \beta_{am}(\bar z_a - \bar z_z)$$
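The three-way variance decomposition can be verified on a toy example. The numbers below are hypothetical (three assets, diagonal covariance matrix, an arbitrary off-frontier portfolio w_p); the arbitrage portfolios s and d of the text are called omega_s and omega_d in the code.

```python
from fractions import Fraction as F

# Hypothetical inputs; Sigma is assumed diagonal.
zbar = [F(11, 10), F(6, 5), F(13, 10)]
var = [F(1, 25), F(9, 100), F(4, 25)]

inv = [1 / v for v in var]
A = sum(inv)
B = sum(i * z for i, z in zip(inv, zbar))
C = sum(i * z * z for i, z in zip(inv, zbar))

w_g = [i / A for i in inv]
w_d = [i * z / B for i, z in zip(inv, zbar)]

def mean(w):
    return sum(x * z for x, z in zip(w, zbar))

def cov(w1, w2):
    return sum(x * v * y for x, v, y in zip(w1, var, w2))

w_p = [F(1, 2), F(1, 4), F(1, 4)]                  # arbitrary portfolio, sums to 1
z_p = mean(w_p)

# Min-var portfolio with the same expected return as w_p
weight_m = (z_p - B / A) / (C / B - B / A)
w_m = [(1 - weight_m) * g + weight_m * d for g, d in zip(w_g, w_d)]

omega_s = [m - g for m, g in zip(w_m, w_g)]        # "systematic" arbitrage portfolio
omega_d = [p - m for p, m in zip(w_p, w_m)]        # "diversifiable" arbitrage portfolio

var_p = cov(w_p, w_p)
var_g, var_s, var_d = cov(w_g, w_g), cov(omega_s, omega_s), cov(omega_d, omega_d)

print(var_p == var_g + var_s + var_d)
```

In this example var_d is strictly positive because w_p is off the frontier, yet w_p and w_m have the same expected return.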
We can’t use the results so far to price assets since you recall to get here we started by assuming we
knew the expected returns vector to find the min-var portfolios – a bit circular.
Also, even if investors are not mean-variance optimizers, the min-var portfolios exist (as long as mean
and variance are well defined) and the absence of arbitrage provides the pricing results we just saw.
Thus, there is no economic content until we can identify one of the min-var portfolios. This is really
what the CAPM does. The mutual fund theorem says if all investors are mean-variance optimizers we
can, in equilibrium, identify a min-var portfolio.
Assume:
(1) Each investor chooses his/her portfolio to maximize a derived utility function over $\bar z$
and $\sigma^2$: $v(\bar z, \sigma^2)$ where $v_2 < 0$ and $v_1 > 0$ and v is concave
(2) Investors have a common time horizon and homogeneous beliefs about $\bar z$ and $\Sigma$
(3) Each asset is infinitely divisible
(4) Unconstrained trading in the riskless asset
These assumptions are sufficient for all investors to hold mean-variance efficient portfolios. Think
about what each says.
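FOC:
$$\frac{\partial v}{\partial w} = v_1(\cdot)(\bar z - R\mathbf{1}) + 2v_2(\cdot)\Sigma w^* = 0$$
So,
$$w^* = -\frac{v_1(\cdot)}{2v_2(\cdot)}\Sigma^{-1}(\bar z - R\mathbf{1}) = -\frac{v_1(\cdot)(B - RA)}{2v_2(\cdot)}\,w_t$$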
Note that w* is proportional to the tangency portfolio and that it is chosen by each investor. Thus, the
aggregate demand for each risky asset is in proportion to its representation in wt.
The (positive or negative) remainder of each investor’s wealth is invested in R. Since, in equilibrium,
demand equals supply, it must be that the market portfolio of all risky assets wm is proportional to wt. In
other words, since all investors hold wt and R, the market portfolio – wm – the wealth weighted aggregate
of all investors’ holdings must be some version of this. Thus wm is a min-var portfolio.
If the riskless asset is in zero net supply, wt = wm. If the riskless asset is in positive net supply, then the
market portfolio is located to the left of wt on the capital market line.
We can now write: $\bar z - R\mathbf{1} = \boldsymbol{\beta}_m(\bar z_m - R)$ since the market portfolio $w_m$ is on the min-var frontier.
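FOCs:
$$\frac{\partial L}{\partial w} = v_1(\cdot)\bar z + 2v_2(\cdot)\Sigma w^* - \lambda\mathbf{1} = 0$$
$$\frac{\partial L}{\partial \lambda} = 1 - \mathbf{1}'w^* = 0$$
So,
$$w^* = -\frac{v_1(\cdot)}{2v_2(\cdot)}\Sigma^{-1}\bar z + \frac{\lambda}{2v_2(\cdot)}\Sigma^{-1}\mathbf{1} = -\frac{v_1(\cdot)B}{2v_2(\cdot)}\,w_d + \frac{\lambda A}{2v_2(\cdot)}\,w_g$$
Further, since $\mathbf{1}'w^* = 1$, $w^*$ is a portfolio combination of $w_d$ and $w_g$. Thus, all investors again hold a
portfolio combination of $w_d$ and $w_g$ or min-var portfolios. Aggregate demand is therefore a portfolio
combination of $w_d$ and $w_g$ and so $w_m$ must be a min-var portfolio itself.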
Thus, we can write: $\bar z = \bar z_z\mathbf{1} + \boldsymbol{\beta}_m(\bar z_m - \bar z_z)$, where
$\bar z_z$ is the expected return on a portfolio uncorrelated with the market portfolio and now this is the Black
CAPM.
When there is no riskless asset, we would like to use the same idea as we did before and say that more
risk averse investors (in the Arrow-Pratt sense) hold less of the market portfolio. Because the holdings
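are in two risky assets, we can’t in general say this. However,
$$w^* = -\frac{v_1(\cdot)}{2v_2(\cdot)}\Sigma^{-1}\bar z + \frac{\lambda}{2v_2(\cdot)}\Sigma^{-1}\mathbf{1}$$
says that more risk averse investors ($-v_2$ is bigger)
hold less of $w_d$ and so more of $w_g$. We cannot, however, say things about the market portfolio,
individual assets, or other pairs of min-var portfolios.
(Note: $\mathbf{1}'w^* = 1 \Rightarrow \frac{\lambda}{2v_2(\cdot)} = \frac{1}{A} + \frac{v_1(\cdot)}{2v_2(\cdot)}\frac{B}{A}$, so $\frac{\lambda}{2v_2(\cdot)} \uparrow$ as $(-v_2(\cdot)) \uparrow$.)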
Note: The CAPM has given us 2 measures of risk (1) the variance of a portfolio which
determines the efficiency of a given portfolio (macro risk if you will) and (2) beta which
measures the systematic risk of individual assets (micro level risk).
Note: Mean-variance analysis generates the separation results and the pricing results. The
equilibrium analysis simply identifies the market portfolio as being a minimum variance
portfolio.
Portfolio combinations of normal random variables are normal and normal random variables are
completely characterized by their means and variances. Thus, each asset or portfolio is characterized by
$\bar z$ and $\sigma^2$. Thus, for any u(z), E[u(z)] is characterized by $\bar z$ and $\sigma^2$.
The skewness of a portfolio is given by: $m_p^3 = \sum_{i=1}^N\sum_{j=1}^N\sum_{k=1}^N w_i w_j w_k m_{ijk}$
We consider that investors have derived utility functions: $v(\bar z_p, \sigma_p^2, m_p^3)$ - could come from a cubic
utility of returns where $\bar z_p$ and $m_p^3$ are liked and $\sigma_p^2$ is disliked.
To solve this problem, we could try to hold $\bar z_p$ and $\sigma_p^2$ fixed and max $m_p^3$ (analogously to the CAPM
approach), but this just gives a big mess.
Instead, start with an investor at his optimum $\bar z_o, \sigma_o^2, m_o^3$ (“o” for optimal) then consider perturbing
him away from this optimum by having him sell some small amount (w) of his optimal portfolio and buy
this amount of individual asset i. The resulting portfolio p has:
$$\bar z_p = (1 - w)\bar z_o + w\bar z_i$$
$$\sigma_p^2 = (1 - w)^2\sigma_o^2 + 2w(1 - w)\sigma_{io} + w^2\sigma_i^2$$
$$m_p^3 = (1 - w)^3 m_o^3 + 3(1 - w)^2 w\,m_{ioo} + 3(1 - w)w^2 m_{oii} + w^3 m_i^3$$
Note: Only $\sigma_{io}$ and $m_{ioo}$ are important in the expected returns relation. As in the CAPM, $\sigma_i^2$ is not
important. Also, $m_{iio}$ and $m_i^3$ are not important since they provide no tradeoffs at the margin:
$$\left.\frac{\partial m_p^3}{\partial w}\right|_{w=0} = 3(m_{ioo} - m_o^3).$$
$m_{iio}$ and $m_i^3$ don’t appear, so they don’t affect the optimum.
$m_{ioo}$ contributes since skewness $m_o^3 = \sum_i w_i^* m_{ioo}$ so it is asset i’s contribution to the skewness of the
optimal portfolio, just as $\sigma_o^2 = \sum_i w_i^*\sigma_{io}$ implies asset i’s covariance with the optimal portfolio
determines asset i’s contribution to portfolio variance.
Also note: $\frac{\partial \bar z_i}{\partial m_{ioo}} = -\frac{3v_3}{v_1}$. If $v_3 > 0$ (so positive skewness is liked), this term is negative so there is a
substitution between $\bar z$ and $m_{ioo}$ at the optimum.
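Then,
$$\bar z_i = \bar z_o + (\bar z_o - R)\frac{\sigma_{io} - \sigma_o^2}{\sigma_o^2} + \frac{3v_3}{v_1}\frac{m_o^3}{\sigma_o^2}(\sigma_{io} - \sigma_o^2) - \frac{3v_3}{v_1}(m_{ioo} - m_o^3)$$
$$= R + \beta_{io}(\bar z_o - R) + \frac{3v_3}{v_1}m_o^3(\beta_{io} - \gamma_{io})$$
where $\beta_{io} = \frac{\sigma_{io}}{\sigma_o^2}$ and $\gamma_{io} = \frac{m_{ioo}}{m_o^3}$
If some investor’s optimal portfolio is the market portfolio (by happenstance) then the 1st two terms
duplicate the CAPM.
Finally, consider the asset with returns, $\tilde z_z$, that are uncorrelated with $\tilde z_o$ and that has the smallest
variance of all such assets.
Here we need the further identification since not all zero-beta assets have the same expected
return as they did in the CAPM. Then,
$$\bar z_z = R - \frac{3v_3}{v_1}m_o^3\gamma_{zo} \quad\text{and}\quad \frac{3v_3}{v_1} = -\frac{\bar z_z - R}{m_o^3\gamma_{zo}}$$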
Any zero-beta asset will do – the difference will be accounted for by different values of $\bar z_z$ and $\gamma_{zo}$.
Note, however, that we are still left with the need to identify some investor’s optimal portfolio if we
were to try to apply this pricing relation.
First, Risk: The CAPM makes the assumption that the variance of the return on a portfolio measures
its risk.
Consider the idea that risk is the combination of properties of a set of random outcomes that change the
evaluation of $E[u(\tilde x)]$ away from $u(\bar x)$ for concave utility u( ). Two aspects of this definition to
highlight are: (1) that what alters this evaluation (or is, on net, “disliked”) is, of course, dependent upon
the utility function used in the evaluation. Thought of this way, risk is necessarily a property defined for
a class of utility functions. This definition of risk also (2) conveys only the notion of dispersion of
outcomes. This requires that we correct for the mean (location) when we talk about risk. It also means
that all other aspects (good and bad) of the distribution (beyond location) are lumped into “risk.”
More formally…
Definition: If uncertain outcomes $\tilde x$ and $\tilde y$ have the same location (expectation), then $\tilde x$ is said to
be weakly less risky than $\tilde y$ for the class of utility functions U if no individual with a utility function in U
prefers $\tilde y$ to $\tilde x$. That is: $E[u(\tilde x)] \ge E[u(\tilde y)]\ \forall u \in U$. $\tilde x$ is strictly less risky if the inequality is strict for
some $u \in U$.
For some restricted classes of utility functions (some U), this ordering is complete (i.e. for all pairs of
random variables $\tilde x$ and $\tilde y$, with the same mean, either $\tilde x$ is weakly less risky than $\tilde y$ or $\tilde y$ is weakly
less risky than $\tilde x$ or both).
Example: Quadratic utility: as we have seen, variance is the measure of risk. Any quadratic utility
function can be written as:
$u(z) = z - \frac{b}{2}z^2$; we restrict z to be $z \le \frac{1}{b}$ so that $u'(z) \ge 0$
Expected utility (for any distribution for which mean and variance are defined) is written:
$$E[u(z)] = \bar z - \frac{b}{2}(\bar z^2 + \sigma^2).$$
So, for any $\tilde x$ and $\tilde y$ with $\bar x = \bar y$, $E[u(\tilde x)] \ge E[u(\tilde y)]$ iff $\sigma_x^2 \le \sigma_y^2$.
The completeness of this ordering is unfortunately not a common property across different (broader) classes
of utility functions. Consider the example introduced by Ingersoll:
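$$\tilde x = \begin{cases} 0 & \text{with prob. } \tfrac12 \\ 4 & \text{with prob. } \tfrac12 \end{cases} \qquad \tilde y = \begin{cases} 1 & \text{with prob. } \tfrac78 \\ 9 & \text{with prob. } \tfrac18 \end{cases}$$
Both have mean 2, with variances 4 and 7 respectively. Thus, for any investor with quadratic utility, $\tilde x$ is preferred to $\tilde y$ (it provides higher expected utility).
However, for $u(x) = \sqrt{x}$, $\tilde y$ provides higher expected utility (1.25 versus 1). Thus, for any class of utility function that
includes both quadratic and square root utility, $\tilde x$ and $\tilde y$ cannot be ranked.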
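The numbers behind this example are easy to reproduce. The sketch below computes the means, variances, and expected utilities for the two lotteries; the quadratic coefficient b = 1/10 is an arbitrary choice (an assumption, picked so that all outcomes stay below 1/b), while the lotteries themselves are the ones in the text.

```python
from fractions import Fraction as F
from math import sqrt

# Lotteries as lists of (outcome, probability)
x = [(F(0), F(1, 2)), (F(4), F(1, 2))]
y = [(F(1), F(7, 8)), (F(9), F(1, 8))]

def expect(lottery, u):
    """Expected value of u(outcome) under the lottery."""
    return sum(p * u(v) for v, p in lottery)

def mean(lottery):
    return expect(lottery, lambda v: v)

def variance(lottery):
    return expect(lottery, lambda v: v * v) - mean(lottery) ** 2

b = F(1, 10)                       # hypothetical quadratic-utility parameter

def quadratic(z):
    return z - b / 2 * z * z

print(mean(x), mean(y), variance(x), variance(y))
print(expect(x, quadratic), expect(y, quadratic))   # x preferred
print(expect(x, sqrt), expect(y, sqrt))             # y preferred
```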
Further, note that it is not the case that variance is the appropriate measure of risk whenever the
ordering is complete. Consider the class of cubic utility functions defined as $u(z) = z - cz^3$ with
c > 0, where we restrict outcomes to be bounded between zero and $(3c)^{-1/2}$.
Expected utility is written (translating the non-central moment to the central moments):
$$E[u(z)] = \bar z - c\,E(z^3) = \bar z - c(m^3 + \bar z^3 + 3\sigma^2\bar z),$$
where $m^3$ is the third central moment $E[(z - \bar z)^3]$ that we examined before. (Note that for this class
of utility functions variance and skewness are both disliked.)
If $\bar x = \bar y$ and $\tilde x$ is preferred to $\tilde y$ then it must be that $c[3\bar x(\sigma_y^2 - \sigma_x^2) + (m_y^3 - m_x^3)] \ge 0$.
Thus, $3\bar z\sigma^2 + m^3$ is the proper measure of risk for this class of utility functions. So, even for classes of
utility functions that imply a complete ordering of random variables, variance is not a universal measure
of risk.
For the general class of all risk averse (concave) utility functions, the ordering is obviously incomplete.
We need to find a way to reduce the scope of the problem if we are to say more, i.e., restrictions on
distributions or restrictions on utility functions (which are rarely thought to be particularly interesting or
valuable).
Mean Preserving Spreads: For illustration, let’s take a look at a particular relation between $\tilde x$ and $\tilde y$
that allows a comparison for all u( ) in U (the class of all risk averse utility functions).
Intuitively, if we take $\tilde x$ and add mean zero noise to it we should wind up with something less attractive
to risk averse agents. It turns out it’s not quite that simple. Why?
Definition:
$$s(x) = \begin{cases} \alpha & \text{for } c < x < c + t \\ -\alpha & \text{for } c' < x < c' + t \\ -\beta & \text{for } d < x < d + t \\ \beta & \text{for } d' < x < d' + t \\ 0 & \text{elsewhere} \end{cases}$$
where: $\alpha(c' - c) = \beta(d' - d)$, $t > 0$, $\alpha > 0$, $\beta > 0$, $c + t < c' < d - t$, and $d + t < d'$.
From a uniform distribution we might get the following:
So, we have four (non-overlapping) intervals of non-zero value with the middle two negative and the
outside two positive.
Note that: $\int s(x)\,dx = 0$ so what is added is subtracted elsewhere
and, $\int x\,s(x)\,dx = 0$: “mean” zero (really addition of s(x) doesn’t change mean)
Thus, if f(x) is a density function for x, then f(x) + s(x) is also a density function that gives the same
expected value. (As long as you don’t violate non-negativity for the resulting density.)
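The two integral conditions can be checked exactly for a concrete spread. Everything numeric below is a hypothetical choice made to satisfy the restrictions in the definition: heights of 1/16, interval width t = 1, left endpoints 0, 2, 5 and 7, applied to a uniform density on [0, 8].

```python
from fractions import Fraction as F

# Hypothetical parameters for s(x); they satisfy the definition's restrictions.
alpha = beta = F(1, 16)
t = F(1)
c, c_prime, d, d_prime = F(0), F(2), F(5), F(7)

restrictions_ok = (
    alpha * (c_prime - c) == beta * (d_prime - d)
    and c + t < c_prime < d - t
    and d + t < d_prime
)

# (left endpoint, height) of each of the four intervals of width t
pieces = [(c, alpha), (c_prime, -alpha), (d, -beta), (d_prime, beta)]

def moment(k):
    """Exact integral of x**k * s(x) dx, computed piece by piece."""
    return sum(h * ((lo + t) ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for lo, h in pieces)

mass = moment(0)            # integral of s(x): should be 0
mean_shift = moment(1)      # integral of x s(x): should be 0
second_shift = moment(2)    # change in E[x^2], hence in the variance

# Starting density f(x) = 1/8 on [0, 8]; f + s must stay non-negative.
lowest_density = F(1, 8) - max(alpha, beta)

print(restrictions_ok, mass, mean_shift, second_shift, lowest_density >= 0)
```

Since the mean is unchanged, the positive shift in the second moment is exactly the increase in variance produced by this spread.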
An exercise you should work through in Ingersoll’s text shows that if $\tilde y$ has density g(·) = f(·) + s(·),
where s(·) is a mean preserving spread, then $\tilde y$ is riskier than $\tilde x$ (whose density is f(·)) for the class of all
concave utility functions. The proof simply compares expected utility of $\tilde x$ and $\tilde y$ for general concave
utility functions (lots of Jensen’s inequality applications). Transitivity implies this works for a series of
MPS’s as well.
Theorem 1: For the concave class of utility functions, outcome $\tilde x$ is weakly less risky than outcome
$\tilde y$ iff $\tilde y$ is distributed like $\tilde x + \tilde\varepsilon$ where $\tilde\varepsilon$ is a fair game with respect to $\tilde x$.
That is: $\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$ and $E[\tilde\varepsilon \mid x] = 0\ \forall x$
Note that the law of iterated expectations tells us $\bar x = \bar y$ in this case.
Proof (sufficiency): If $\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$ …
$$\begin{aligned} E[u(\tilde y)] &= E[u(\tilde x + \tilde\varepsilon)] && \text{equivalence of distributions} \\ &= E\{E[u(\tilde x + \tilde\varepsilon) \mid x]\} && \text{law of iterated expectations} \\ &\le E\{u[E(\tilde x + \tilde\varepsilon \mid x)]\} && \text{Jensen’s inequality} \\ &= E[u(\tilde x)] && \text{fair game property} \end{aligned}$$
Given this result, we can easily see that variance is a valid measure of risk for the class of all concave
utility functions when we restrict ourselves to normal random variables.
For any $\tilde y \sim N(\mu, b)$ and $\tilde x \sim N(\mu, a)$, with $b \ge a$, we can always write:
$\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$ where $\tilde\varepsilon \sim N(0, b - a)$ is independent of $\tilde x$. Since independence is stronger than
the fair game property, $\tilde\varepsilon$ is therefore also a fair game with respect to $\tilde x$.
When location differs, we can correct by simply de-meaning and comparing $\tilde x - \bar x$ vs. $\tilde y - \bar y$ or
comparing $\tilde y$ vs. $\tilde x + (\bar y - \bar x)$. This, however, shifts around the distributions and is itself somewhat
cumbersome and even a little odd when our goal is to describe a risk/return tradeoff.
A closely related concept to riskier is second order stochastic dominance – “SSD”. This concept
incorporates the correction for location within the comparison.
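$\tilde x$ displays weak 2nd order stochastic dominance over $\tilde y$ if:
$$\tilde y \overset{d}{=} \tilde x + \tilde\xi + \tilde\varepsilon \quad\text{where } \tilde\xi \le 0 \text{ and } E[\tilde\varepsilon \mid x + \xi] = 0\ \forall x, \xi$$
Proof (sufficiency):
Theorem 1 provides $E[u(\tilde x + \tilde\xi)] \ge E[u(\tilde x + \tilde\xi + \tilde\varepsilon)]$ for all concave utility functions.
Because $\tilde\xi$ is non-positive, 1st order stochastic dominance provides that:
$E[u(\tilde x)] \ge E[u(\tilde x + \tilde\xi)]$ for strictly increasing utility functions.
Thus $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all increasing concave utility functions (since $\tilde y$ and $\tilde x + \tilde\xi + \tilde\varepsilon$ have
the same distributions they have the same expected utility).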
The relation between riskiness and SSD is strong but they are not identical.
If $\bar x \ge \bar y$ and $\tilde x - \bar x$ is less risky than $\tilde y - \bar y$ we know that $\tilde x$ 2nd order stochastically dominates $\tilde y$ since
we can write $\tilde y \overset{d}{=} \tilde x + (\bar y - \bar x) + \tilde\varepsilon$ with $E(\tilde\varepsilon \mid x) = 0$ (from “riskier”) so that $E(\tilde\varepsilon \mid x + (\bar y - \bar x)) = 0$ where
$\bar y - \bar x$ is a degenerate non-positive random variable.
However, if $\tilde x$ SSD $\tilde y$ while we know that $\bar x \ge \bar y$ we cannot say that $\tilde y - \bar y$ is riskier than $\tilde x - \bar x$. For
one thing, they may not even be ranked. Further if they are, it is possible that $\tilde x - \bar x$ is riskier
than $\tilde y - \bar y$. Let $\tilde y \sim U(0,1)$ and $\tilde x \sim U(2,8)$. $\tilde x$ dominates $\tilde y$, and so it FSDs and SSDs $\tilde y$, but
$\tilde x - \bar x$ is riskier than $\tilde y - \bar y$.
The difference lies in the correction for location. In SSD it is built in, we compare $\tilde x$ and $\tilde y$, while in
“riskier” we compare $\tilde x - \bar x$ to $\tilde y - \bar y$. An $\tilde x$ with a big mean and dispersion versus a $\tilde y$ with a small
mean and small dispersion may be evaluated differently than $\tilde x - \bar x$ vs $\tilde y - \bar y$.
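For the uniform example, the "riskier" ranking of the de-meaned variables can be checked with the standard integral condition on distribution functions: the running integral of G minus F must be non-negative everywhere and zero at the upper endpoint, with G the distribution function of the riskier variable. The sketch below does this on a grid for x − x̄ ~ U(−3, 3) and y − ȳ ~ U(−1/2, 1/2). The grid step of 1/100 is an arbitrary choice; because both distribution functions are piecewise linear with kinks on grid points, the midpoint rule is exact here.

```python
from fractions import Fraction

def uniform_cdf(s, lo, hi):
    """Distribution function of a uniform variable on [lo, hi]."""
    if s <= lo:
        return Fraction(0)
    if s >= hi:
        return Fraction(1)
    return (s - lo) / (hi - lo)

a, b = Fraction(-3), Fraction(3)        # common support after de-meaning
h = Fraction(1, 100)                    # grid step (arbitrary)
n = int((b - a) / h)

running = Fraction(0)
deltas = []                             # running integral of G - F at each grid point
for k in range(n):
    mid = a + (k + Fraction(1, 2)) * h
    G = uniform_cdf(mid, Fraction(-3), Fraction(3))          # x - xbar, candidate "riskier"
    F_ = uniform_cdf(mid, Fraction(-1, 2), Fraction(1, 2))   # y - ybar
    running += (G - F_) * h
    deltas.append(running)

var_x = Fraction((8 - 2) ** 2, 12)      # variance of U(2, 8)
var_y = Fraction((1 - 0) ** 2, 12)      # variance of U(0, 1)

print(min(deltas) >= 0, deltas[-1] == 0, var_x, var_y)
```

At the same time every outcome of U(2, 8) exceeds every outcome of U(0, 1), which is why the raw variables are ranked the other way by dominance.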
After a deviation to review the general portfolio problem and results on portfolio choice, where this
is heading is a general study of the efficient set of portfolios. SSD will play a large role.
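Table: x and y are random variables defined on [a, b] (see Ingersoll pg 123)

| Concept | Utility Condition | Random Variable Condition | Distributional Condition |
|---|---|---|---|
| Dominance | $E[u(\tilde x + \tilde w)] \ge E[u(\tilde y + \tilde w)]$ for any random variable $\tilde w$, for all increasing utility | $\tilde y = \tilde x + \tilde\xi$, $\tilde\xi \le 0$ (Note: $=$, not $\overset{d}{=}$) | $\tilde x \ge \tilde y$, outcome by outcome |
| FOSD | $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all increasing utility functions | $\tilde y \overset{d}{=} \tilde x + \tilde\xi$, $\tilde\xi \le 0$ | F(x), G(y) are distribution functions: $G(t) \ge F(t)\ \forall t$, i.e. $\text{Prob}(x \ge t) \ge \text{Prob}(y \ge t)\ \forall t$ |
| SOSD | $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all increasing concave utility functions | $\tilde y \overset{d}{=} \tilde x + \tilde\xi + \tilde\varepsilon$, $\tilde\xi \le 0$, $E[\tilde\varepsilon \mid x + \xi] = 0\ \forall x, \xi$ | $\Delta(t) \equiv \int_a^t [G(s) - F(s)]\,ds \ge 0\ \forall t$; “on average” G > F |
| Riskier | $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all concave utility functions | $\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$, $E[\tilde\varepsilon \mid x] = 0$ | $\Delta(t) \ge 0\ \forall t$, $\Delta(b) = 0$ |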
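Now, Portfolio choice: To quickly review: The general portfolio problem can be cast as: an agent
chooses a portfolio to maximize his/her expected utility of returns:
$$\max_w\ E[u(\tilde z_w)], \qquad \tilde z_w = \tilde z'w = \sum_{i=1}^N w_i\tilde z_i$$
s.t. $\mathbf{1}'w = 1$ (i.e. $\sum_i w_i = 1$)
FOCs (with $\gamma$ the Lagrange multiplier on the budget constraint):
$$\frac{\partial L}{\partial w} = 0:\quad E[u'(\tilde z^*)\tilde z] = \gamma\mathbf{1} \quad\text{with } \gamma > 0 \text{ and } \tilde z^* = \tilde z'w^*$$
$$\frac{\partial L}{\partial w_i} = 0:\quad E[u'(\tilde z^*)\tilde z_i] = \gamma \quad\text{for all assets } i$$
$$\frac{\partial L}{\partial \gamma} = 0:\quad \mathbf{1}'w^* = 1$$
The concavity of u(·) and linearity of the constraint implies that the FOCs are necessary and sufficient
for a maximum.
The FOC must hold for the riskless asset, if it exists: $E[u'(\tilde z^*)R] = \gamma$
So, we can subtract this from the general condition to find: $E[u'(\tilde z^*)(\tilde z_i - R)] = 0\ \forall i$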
in the presence of a riskless asset. This will help us examine some results concerning portfolio
formation in this general context.
Theorem: If a solution w* exists for a strictly concave utility function and a set of assets Z then the
probability distribution of its return is unique and if there are no redundant assets then the portfolio w*
is unique as well.
θ* = Zw*
If there are no redundant assets Z has a unique left inverse $L = (Z'Z)^{-1}Z'$ and $w^* = L\theta^*$ is then also
unique.
We can say something specific about choices made by agents in this very general context but not much:
Theorem: The optimal portfolio for a strictly risk averse, non-satiated investor will be the riskless
security if and only if $\bar z_j = R\ \forall j = 1, 2, …, N$.
Proof: When a riskless asset exists, w* satisfies the FOC: $E[u'(\tilde z^*)(\tilde z_j - R)] = 0$
If $\bar z_j = R\ \forall j$ then $z^* = R\mathbf{1}$ (the riskless return in every state) will satisfy the FOC, since then
$E[u'(R)(\tilde z_j - R)] = u'(R)(\bar z_j - R) = 0$. Since u(·) is strictly concave we know the probability
distribution of the return is unique. If there are no redundant assets the portfolio is unique. (Strictly risk
averse agents hold only risk free portfolios or the risk free asset. We also see why the “strictly concave”
is required for uniqueness of the solution in the theorem above.)
Finally, Efficient portfolios: the efficient set of portfolios – those portfolios for which there are no
other portfolios with the same or greater expected return and less risk. Alternatively, those portfolios
which are not second order stochastically dominated.
More specifically…
Definition: A portfolio w is efficient if $w \in E$, where
$$E \equiv \{\hat w \in \mathbb{R}^N \mid (\exists u \in U)\ (E[u(\tilde z'\hat w)] = \max_{w \in \mathbb{R}^N,\ \mathbf{1}'w = 1} E[u(\tilde z'w)])\}$$
i.e. if there exists a utility function $u \in U$ (strictly monotone, concave class), for which w solves the
investor’s problem:
$\max_w E[u(\tilde z'w)]$
such that $\mathbf{1}'w = 1$ budget constraint
$w \in \mathbb{R}^N$ w is a portfolio of the traded assets – a restriction whose strength
is determined by the structure of Z
We use U, the strictly monotone, concave class, because if we use the monotone, concave class E is
trivially all $w \in \mathbb{R}^N$ with $\mathbf{1}'w = 1$ since this class allows a constant utility function, making it
uninteresting and a useless definition.
The following theorem shows that: Efficient portfolios are those for which there are no other portfolios
with the same or greater expected return and less risk (those that are not SSDed). But, not the set for
which there is not a less risky portfolio with the same mean return – this set conceptually includes the
lower limb of the CAPM hyperbola.
Theorem: For some efficient portfolio k with returns given by $\tilde z_e^k$, if $\tilde z_e^k - \bar z_e^k$ is riskier than
$\tilde z_w - \bar z_w$ (for any portfolio w), then $\bar z_e^k \ge \bar z_w$.
Proof: If $\tilde z_e^k - \bar z_e^k$ is riskier than $\tilde z_w - \bar z_w$, then
$E[u(\tilde z_e^k)] \le E[u(\tilde z_w - \bar z_w + \bar z_e^k)]\ \forall u \in U$ (by the definition of riskier).
If $\bar z_e^k < \bar z_w$, then
$E[u(\tilde z_w - \bar z_w + \bar z_e^k)] < E[u(\tilde z_w)]$ so, $E[u(\tilde z_e^k)] < E[u(\tilde z_w)]\ \forall u \in U$ (monotonicity).
But, this contradicts the assumption that $\tilde z_e^k$ is an efficient portfolio’s return, so it must be
that $\bar z_e^k \ge \bar z_w$ holds.
e
Now let’s consider an alternative way to characterize the efficient set using a simple complete markets
example with two states (so we can draw it). Since the market is complete, we can find a portfolio that
has any distribution of wealth across the two states that we would like. Therefore, we can make W1 and
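W2 (wealth in states 1 and 2: $W_o z_1^*$ and $W_o z_2^*$) the choice variables.
$$\max_{W_1, W_2}\ \pi_1 u(W_1) + \pi_2 u(W_2)$$
subject to: $p_1W_1 + p_2W_2 = W_o$ (budget constraint is written using the unique state prices)
FOCs, with $\gamma$ the multiplier on the budget constraint: $\pi_1 u'(W_1) - \gamma p_1 = 0$ and $\pi_2 u'(W_2) - \gamma p_2 = 0$ or, $\frac{\pi_2 u'(W_2)}{p_2} = \gamma$
Or, $\frac{u'(W_s)}{\gamma} = \frac{p_s}{\pi_s} \equiv \lambda_s$ = state price density …just as we saw before.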
Note: Under risk neutrality, state prices are proportional to the actual probabilities (or the state price
density is constant across the states). Thus, again we see that a single risk neutral agent in the economy
(who faces no restrictions on short sales or borrowing) trades to set prices in such a way that $\lambda_s$ is
constant. Any risk averse agents in the economy (in equilibrium) will then also trade so that $u'(W_s)$ is
constant across all states. Thus, their optimal choice must be to hold riskless positions. Said another
way: the risk neutral agent trades so that there is no reward for risk bearing (all assets have E(z)=R) so
no risk averse agent bears any risk.
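Examine the set of efficient (optimal) portfolios for this two state example:
[Figure: the ($W_1$, $W_2$) plane with the 45° line and the points r (the riskless asset, on the 45° line), d and f. Two lines pass through r:
the constant expected wealth line $\pi_1W_1 + \pi_2W_2 = k$, or $W_2 = \frac{k}{\pi_2} - \frac{\pi_1}{\pi_2}W_1$, with slope $-\frac{\pi_1}{\pi_2}$;
the budget constraint $p_1W_1 + p_2W_2 = W_o$, or $W_2 = \frac{W_o}{p_2} - \frac{p_1}{p_2}W_1$, with slope $-\frac{p_1}{p_2}$.
Simply assume: $\frac{p_1}{p_2} > \frac{\pi_1}{\pi_2}$.]
Line with slope $= -\frac{p_1}{p_2}$ is the budget constraint
Line with slope $= -\frac{\pi_1}{\pi_2}$ is a line of constant expected wealth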
Draw an indifference curve for an arbitrary risk averse utility function. How do we know that the
constant expected wealth line is tangent to the indifference curve at the 45° line?
It must be – from r outward on the line, you stay at the same expected wealth but add risk if you move
in either direction. Thus, the constant expected wealth line through r must be tangent as it must be
below the indifference curve away from r on either side.
Consider point d in the picture. By concavity of the utility function, d is no better than r (same
expected wealth but more risk) and by strict monotonicity f is dominated by d since d has larger W1
and W2. So, by analogy all points on the budget line below r are dominated by r, the riskless asset
(they all give less expected wealth and more risk).
Rational choices are above the point r on the budget line in this picture. Thus, only choices with W2 ≥
W1 are optimal. Moving in this direction gets more risk and higher (not lower) expected wealth.
Where will the indifference curve be tangent to the budget constraint?
Depends on u but we know it will be above point r because we assumed the budget line had a
steeper slope than the constant expected value line (that is, $-\frac{\pi_1}{\pi_2} > -\frac{p_1}{p_2}$ or, $\frac{p_1}{p_2} > \frac{\pi_1}{\pi_2}$
or, $\frac{p_1}{\pi_1} > \frac{p_2}{\pi_2}$ or, $\lambda_1 > \lambda_2$, where $\lambda_s = p_s/\pi_s$).
Thus, the state price per unit probability of wealth in state one is greater than in state 2. Choosing
W2 ≥ W1 is, in this sense, a cost minimizing choice (for a given distribution of wealth, choosing W2
≥ W1 has lower cost than the reverse). Given state independent vN-M utility we will show later that
cost minimization in this way is really the only restriction imposed by maximizing behavior.
We can also find the result: if $\lambda_1 > \lambda_2$ then W2 ≥ W1 from the FOC of the investor’s decision problem
in this example:
$u'(W_s) = \gamma\lambda_s$, where $\gamma$ is the Lagrange multiplier on the budget constraint.
So, if $\lambda_1 > \lambda_2$ we require $u'(W_1) > u'(W_2)$, or, since $u'$ is weakly decreasing, $W_1 < W_2$.
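A worked instance of the two-state problem may help. The sketch assumes log utility (so marginal utility is 1/W) and made-up probabilities, state prices and initial wealth; none of these are from the notes. With log utility the FOC and the budget constraint give optimal wealth equal to initial wealth divided by the state price density, so the state with the higher state price per unit probability gets less wealth.

```python
from fractions import Fraction as F

# Hypothetical two-state economy
pi = [F(1, 2), F(1, 2)]          # state probabilities
p = [F(3, 5), F(2, 5)]           # state prices
W0 = F(1)                        # initial wealth

lam = [ps / pis for ps, pis in zip(p, pi)]   # state price density p_s / pi_s

# Log utility (an assumption): u'(W) = 1/W, so the FOC reads 1/W_s = gamma * lam_s.
# Substituting into the budget constraint gives gamma = 1/W0.
gamma = 1 / W0
W = [1 / (gamma * l) for l in lam]

cost = sum(ps * Ws for ps, Ws in zip(p, W))            # should equal W0
expected_wealth = sum(pis * Ws for pis, Ws in zip(pi, W))
riskless_wealth = W0 / sum(p)                          # wealth at point r

print(lam, W, cost, expected_wealth, riskless_wealth)
```

Here lam is (6/5, 4/5), the optimum puts wealth 5/6 in state 1 and 5/4 in state 2, and expected wealth exceeds the riskless amount, which is the "above r" region of the picture.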
Proof (sketch): Suppose w is an efficient portfolio, then it follows from the FOC
($u'((Zw)_s) = \gamma\lambda_s$) and the concavity of u(·) that $(Zw)_r > (Zw)_s \Rightarrow \lambda_s \ge \lambda_r$ (the 2nd inequality is weak since we
don’t require strict concavity). The contra-positive of this is: $\lambda_r > \lambda_s \Rightarrow (Zw)_r \le (Zw)_s$. Now suppose that
$\lambda_r > \lambda_s \Rightarrow (Zw)_r \le (Zw)_s$ or equivalently $(Zw)_r > (Zw)_s \Rightarrow \lambda_s \ge \lambda_r$. Graphically this says:
[Figure: $\lambda_s$ plotted against $(Zw)_s$; axis label Zw.]
Choose any function g(·) which has $g((Zw)_s) = \lambda_s$ for $\lambda_s > 0$. We know that this function is positive and
weakly decreasing so re-label it $u'((Zw)_s) = \lambda_s$.
Integrate this function and you get a strictly monotone concave function u(·) such that it satisfies the
FOC at the candidate w. Note that we have found a u(·) scaled so that the Lagrange multiplier equals
1. But, we can always do this since u(·) is unique only up to a positive affine transformation. The
constant of integration from the step $u' \to u$ is the “rest” of this transformation. This is the
equivalence between maximizing behavior and cost minimization.
Problem: Look back to the two state complete markets example. Prove that if state prices are proportional to the actual
probabilities then any choice not on the 45° line is riskier than a bundle on the 45° line.
Instead of the formal proof of sufficiency for the last theorem, to provide a little intuition, look at the
situation of a complete market where $\pi_s = \frac{1}{S}\ \forall s = 1,…,S$ (an unnecessary simplification). Now, since
there are S states, there are S! ways to order or assign the lottery outcomes to states (which state
generates greater wealth than which). This means that there is some cheapest way to order the lottery.
Suppose that one of the cheapest ways does not assign outcomes in reverse order to the state price
density. Then $\exists$ states r, s such that $\lambda_r > \lambda_s$ but $w_r > w_s$. Now, switch the outcomes for states r and s.
The change in cost is:
$$(p_rw_s + p_sw_r) - (p_rw_r + p_sw_s) = (w_s - w_r)(p_r - p_s) = (w_s - w_r)(\lambda_r - \lambda_s)/S$$
which is a negative number – implying a cost decrease with no change in expected utility for state
independent utility of wealth – and so a contradiction.
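The swapping argument can be confirmed by brute force on a small case. The sketch below uses S = 4 equally likely states with a made-up state price density and a made-up lottery with payoffs 1 to 4 (both are assumptions), enumerates all S! assignments, and checks that the cheapest one puts the smallest payoff in the state with the highest state price density.

```python
from fractions import Fraction as F
from itertools import permutations

S = 4
lam = [F(8, 5), F(6, 5), F(4, 5), F(2, 5)]   # hypothetical density, lam_1 > ... > lam_4
p = [l / S for l in lam]                      # p_s = pi_s * lam_s with pi_s = 1/S
outcomes = [1, 2, 3, 4]                       # hypothetical lottery payoffs

def cost(assignment):
    """Cost of delivering assignment[s] in state s."""
    return sum(ps * w for ps, w in zip(p, assignment))

cheapest = min(permutations(outcomes), key=cost)

# One swap: states r = 0, s = 1 start with w_r > w_s although lam_r > lam_s.
before = (2, 1, 3, 4)
after = (1, 2, 3, 4)
change = cost(after) - cost(before)
formula = (before[1] - before[0]) * (lam[0] - lam[1]) / S

print(cheapest, change, formula)
```

Since all states are equally likely, every assignment has the same expected utility under state independent utility, so the cheapest one is the only candidate for an optimum.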
same orderings on their returns as that of an efficient portfolio are efficient, i.e., some strictly increasing
concave utility function sees that portfolio as optimal.
If N < S, the market is incomplete, $\lambda$ is not unique (there are N equations in S unknowns in the
supporting equation) so not all efficient portfolios need have the same orderings of returns. An example
from the text will help illustrate this – here it has been changed by putting things in terms of the $\lambda_s$
where the book uses marginal utilities – see the FOC for the translation:
Consider the market characterized by:
$$Z = \begin{pmatrix} 0.6 & 2.4 \\ 1.2 & 1.5 \\ 3.0 & 0.6 \end{pmatrix} \quad\text{where } \pi_1 = \pi_2 = \pi_3 = \tfrac{1}{3}$$
S = 3 ⇒ 3! or 6 potential orderings of returns. With 2 assets, only 4 orderings are feasible.
The feasible set is (market possibilities – state with lowest return is 1st):
$w_1 > \frac{3}{5}$: (1, 2, 3)
$\frac{3}{7} < w_1 < \frac{3}{5}$: (2, 1, 3)
$\frac{1}{3} < w_1 < \frac{3}{7}$: (2, 3, 1)
$w_1 < \frac{1}{3}$: (3, 2, 1)
(Write returns in each state as a function of w1 and 1 – w1 then graph $(Zw)_s$, s = 1, 2, 3 against w1.)
Graph these three lines:
[Figure: the state returns $(Zw)_1$, $(Zw)_2$, $(Zw)_3$ plotted against $w_1$.]
The reverse of the final ordering is not feasible (see above), we can’t find a portfolio of these two assets
that gives returns that are lowest in state 1 and highest in state 2. The first three represent all the
portfolios with $w_1 > \frac{1}{3}$. So, any such portfolio is optimal for some agent. Of the feasible orderings, only
(3, 2, 1) ($w_1 < \frac{1}{3}$) isn’t efficient. This isn’t a practical way to approach the issue but it helps us
understand the next set of theorems and the discussion of systematic risk to come.
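The feasible orderings for this market can be enumerated directly. The sketch takes the matrix Z from the example (rows are states, columns are the two assets), writes the return in each state as a function of w1, and records the ordering of states from lowest to highest return over a grid of w1 values. The grid range and step are arbitrary choices.

```python
from fractions import Fraction as F

# Z from the example: rows are states 1..3, columns are assets 1 and 2.
Z = [[F(3, 5), F(12, 5)],
     [F(6, 5), F(3, 2)],
     [F(3), F(3, 5)]]

def state_returns(w1):
    """Portfolio return in each state with weight w1 on asset 1."""
    return [row[0] * w1 + row[1] * (1 - w1) for row in Z]

def ordering(w1):
    """States listed from lowest to highest return; None if there is a tie."""
    r = state_returns(w1)
    if len(set(r)) < 3:
        return None
    return tuple(sorted(range(1, 4), key=lambda s: r[s - 1]))

grid = [F(k, 100) for k in range(-200, 301)]      # w1 from -2 to 3
found = {ordering(w) for w in grid} - {None}

print(sorted(found))
print(ordering(F(1)), ordering(F(1, 2)), ordering(F(2, 5)), ordering(F(0)))
```

The ties occur exactly at the crossing points w1 = 3/5, 3/7 and 1/3 of the three return lines, which are the boundaries listed in the feasible set.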
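In complete markets $\lambda$, and so the ordering of the $\lambda_s$’s across states, is unique. Thus, we have:
$$E = M \cap \{z \mid z_{r_1} \ge z_{s_1}\} \cap \{z \mid z_{r_2} \ge z_{s_2}\} \cap \dots \cap \{z \mid z_{r_L} \ge z_{s_L}\}$$
where a pair of states (r, s) is considered in these restrictions, $(r, s) = (r_i, s_i)$ for some i, if and only if $\lambda_r < \lambda_s$, and L is the number of distinct pairs (r, s) with $\lambda_r < \lambda_s$.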
If we label states so that $\lambda_1 \le \lambda_2 \le \lambda_3 \le \dots$ then E is the intersection of the sets of returns patterns with $z_1 \ge z_2$ if $\lambda_2 > \lambda_1$ and the set with $z_2 \ge z_3$ if $\lambda_3 > \lambda_2$ etc., all intersected with the set of marketed assets. We then have the following:
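Proof: In this case $\lambda_s$ is constant across states. Thus, all orderings of returns across states are efficient.
From the FOC we know $z_r \ge z_s$ is efficient when $\lambda_r \le \lambda_s$, but also $z_s \ge z_r$ is efficient when $\lambda_r \ge \lambda_s$.
Thus, both orderings are efficient if $\lambda_r = \lambda_s$.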
Alternatively: All assets have the same expected return and so all possible portfolios may be held by
maximizing risk neutral agents.
Definition: A market is said to exhibit k-fund separation (“kfs”) if there exist $z^1, z^2, \dots, z^k \in M$ such that
$$E \subseteq \Big\{z \,\Big|\, z = \sum_{i=1}^{k} w_i z^i,\ \sum_{i=1}^{k} w_i = 1\Big\}$$
Proof: From 2fs, E is contained in a line, and since E is connected, E is convex, because connectedness implies convexity in $\mathbb{R}^1$.
Thus, we now have three special cases where we know E is convex and so we know that the market
portfolio is an efficient portfolio. However, one is uninteresting from a risk-return perspective and the
other two are actually incompatible (see Dybvig & Ingersoll). Now, we present a counter-example from
Dybvig & Ross that shows E is not convex in general.
Theorem: The efficient set is not necessarily convex and kfs (with $k \ge 3$) does not guarantee convexity.
Proof: Shown by counter-example. Assume 3 assets and 4 states. We could have less trivial 3fs by
splitting one state into many indistinguishable states and introducing fair gambles with respect to these
new states as new primary assets.
Let $\pi_1 = \pi_2 = \pi_3 = \pi_4 = \tfrac{1}{4}$ and let
$$Z = \begin{pmatrix} 66 & 82 & 72 \\ 44 & 52 & 48 \\ 52 & 38 & 48 \\ 50 & 50 & 48 \end{pmatrix}$$
The valid set of price vectors is: $(\mathbb{R}^4_+ \setminus \{0\}) \cap \mathrm{span}\{(1, 68, 40, 59), (21, 28, 40, 39)\}$
We can divide by 8,088 or 6,648 respectively to get $Z'p = \mathbf{1}$.
However, $\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2 \Rightarrow (74, 48, 45, 50)$ is not efficient since it isn’t in the opposite order to either valid $p(\lambda)$.
Alternatively, consider $w^* = (-1, -.4, 2.4)$. Because vN-M agents view equally probable states symmetrically, we know that for every strictly monotone vN-M (state independent) utility function that:
$$\begin{aligned} E[u(Zw^*)] &= E[u(74, 50.4, 48, 45.2)] \\ &= E[u(74, 48, 45.2, 50.4)] && \text{(state independent utility)} \\ &> E[u(74, 48, 45, 50)] && \text{(strictly monotone utility)} \\ &= E[u(\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2)] \end{aligned}$$
Thus, a convex combination of two efficient portfolios w1 and w2 is not an efficient portfolio since it is
dominated by another portfolio for every monotone state independent utility agent. Thus, E is not a
convex set.
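The arithmetic behind this counter-example can be checked directly. The sketch below is only a verification aid, not part of the original argument. It assumes that $w_1$ and $w_2$ are the portfolios holding only asset 1 and only asset 2 (their average then gives the stated payoff (74, 48, 45, 50)), and it takes $w^* = (-1, -0.4, 2.4)$, the weights that sum to one and reproduce the stated payoff (74, 50.4, 48, 45.2). Only the two spanning price vectors are tested, not every valid combination of them.

```python
from fractions import Fraction as F

# Rows are the four equally likely states, columns the three assets.
Z = [(66, 82, 72),
     (44, 52, 48),
     (52, 38, 48),
     (50, 50, 48)]

def payoff(w):
    """State-by-state return of portfolio w."""
    return [sum(F(z) * F(x) for z, x in zip(row, w)) for row in Z]

def asset_prices(p):
    """Z'p: the price of each asset under the state prices p."""
    return [sum(row[j] * ps for row, ps in zip(Z, p)) for j in range(3)]

def reverse_ordered(z, p):
    """True if a higher state price never comes with a strictly higher return."""
    n = len(z)
    return all(z[r] >= z[s] for r in range(n) for s in range(n) if p[r] < p[s])

p1, p2 = (1, 68, 40, 59), (21, 28, 40, 39)
w1, w2 = (1, 0, 0), (0, 1, 0)          # assumed: the two single-asset portfolios
mix = (F(1, 2), F(1, 2), 0)
w_star = (-1, F(-2, 5), F(12, 5))

z_mix = payoff(mix)
z_star = payoff(w_star)

# With equally likely states and state independent utility, only the sorted
# outcomes matter, so compare them outcome by outcome.
dominates = all(a >= b for a, b in zip(sorted(z_star), sorted(z_mix)))
```

Both price vectors price all three assets equally (8,088 and 6,648 respectively), each single-asset portfolio is reverse ordered against one of them, and the 50/50 mix is reverse ordered against neither.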
[Figure: E is the shaded area, with $w_1$, $w_2$ and $\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2$ marked.]
Analogous to the CAPM and beta, we have seen that the way in which the risk or variability of an asset’s returns affects the risk/expected return of an individual’s optimal portfolio is through its correlation with the state price density (rather than its correlation with $Z_m$).
o Equivalently, through the correlation between an investor’s marginal utility of the return on his/her optimal portfolio and any asset’s return.
That is, if an asset’s returns are ordered across the states of nature inversely to the marginal utility of z* ($\lambda$), then the asset has much the same correlation with the marginal utility of z* as z* itself does, and much of that asset’s variability contributes to the value of the portfolio. Think of the CAPM and correlation with the market return.
Systematic risk is the notion of an individual asset’s contribution to the risk of an efficient portfolio
(i.e. what portion of an asset’s risk is priced).
This measure possesses a portfolio property: the b of a portfolio is the weighted average of the b’s of the assets in the portfolio (using the portfolio weights).
The ordering “asset i is riskier than asset j” by this measure is a complete ordering, and the ordering is independent of the efficient portfolio chosen. Therefore, if we can identify an efficient portfolio and measure marginal utility, we could correct the problem we had before of an incomplete ordering on riskiness.
If there is a riskless asset, we can write excess expected returns as being proportional to b. Rearrange the FOC to write:
$$E[u'_k z_i] = R\,E[u'_k]$$
Now rewrite the left hand side and rearrange the equation:
$$\mathrm{Cov}[u'_k, z_i] = R\,E[u'_k] - E[u'_k]\,\bar z_i = -E[u'_k](\bar z_i - R)$$
$$b_i^k \equiv \frac{\mathrm{Cov}(u'_k, z_i)}{\mathrm{Cov}(u'_k, z_e^k)} = \frac{\bar z_i - R}{\bar z_e^k - R} \quad \text{or,} \quad \bar z_i - R = b_i^k(\bar z_e^k - R)$$
And, since we know $\bar z_e^k > R$, $\bar z_i - R$ is positively proportional to $b_i^k$.
Suppose there is no riskless asset – What is the equivalent of a zero-beta asset here?
The relation can come more quickly from the standard E[λ zi] = 1 for all assets i.
Our problem is that b is in terms of $u'_k$, which is not something we can easily measure. In order to see the generality of this result, let’s examine some special (familiar) cases.
The consumption CAPM (Breeden) spills out of this, as well. What we are seeing is that it all depends on what is a sufficient statistic for marginal utility or $\lambda$. With quadratic utility or multivariate normal returns, the return on an efficient portfolio is sufficient for $u'(z_e^k)$. The assumptions of the CCAPM imply aggregate consumption is a sufficient statistic for marginal utility.
Nonsystematic Risk:
Consider the $\varepsilon_{ik}$ from our conceptual regression. $\varepsilon_{ik}$ depends upon both the benchmark portfolio and the utility function chosen. Thus, the $\varepsilon_{ik}$ we identified is not an unequivocal measure of nonsystematic risk that will be agreed upon by all investors.
The one exception is a complete market or an effectively complete market. There we know that the marginal utilities of all investors are exactly proportional ($\lambda$ is unique), thus for all efficient portfolios and all utility functions the $u'_k$s are perfectly correlated and $\varepsilon_{ik} = \varepsilon_{ij}$ for all investors. (In a Pareto efficient market all investors see the same state prices, so no valuable trades can be created. Thus, the marginal utilities must all be proportional.)
Sufficient conditions for $\varepsilon$-risk to be nonsystematic if it is uncorrelated with the market return (not true generally) are a Pareto efficient market and that $\varepsilon$ is a fair game with respect to $z_m$.
That is, suppose we write $\tilde z_i = \tilde x_i + \tilde\varepsilon_i$ where $E[\tilde\varepsilon_i \mid z_m] = 0$ and that the market is effectively complete. Then $\tilde\varepsilon_i$ is uncorrelated with $u'_m(z_m)$ and so uncorrelated with any $u'_k$. $\varepsilon$ is therefore recognized as nonsystematic risk by all investors and will “have no price”.
In particular models the notion of nonsystematic risk being risk that is uncorrelated with the state price
density can be a handy representation.
Sufficiency:
If all investors (k) have $u_k = a_k + b_k u$ with $b_k > 0\ \forall k$, then the FOCs of all investors look alike:
$$E[u'_k(\tilde z_k^*)(\tilde z_i - \tilde z_j)] = 0 \quad \forall k, i, j$$
$$b_k E[u'(\tilde z_k^*)(\tilde z_i - \tilde z_j)] = 0 \quad \forall k, i, j.$$
Note that the individual preference parameter does not affect the FOC so:
$$E[u'(\tilde z^*)(\tilde z_i - \tilde z_j)] = 0 \quad \forall k, i, j,$$
i.e., the FOC is the same for all investors so all hold the same portfolio: $z_k^* = z^*\ \forall k$.
Necessity:
If all investors hold the same portfolio, regardless of how assets’ returns are distributed, then it must be that $E[u'_k(z^*) z_i] = \eta_k\ \forall k, i$ for all possible asset returns distributions (with $\eta_k$ investor k’s multiplier).
If the assets have returns distributions that are Dirac delta functions, then the FOC is:
$$u'_k(z^*)\, z_i = \eta_k \quad \forall k, i$$
So, asset by asset we have:
$$\frac{u'_k(z^*)}{\eta_k} = \frac{1}{z_i} \ \text{ for each investor k or j, so } \ \frac{u'_k(z^*)}{\eta_k} = \frac{u'_j(z^*)}{\eta_j}.$$
Thus it is necessary that there is some $\gamma_{kj}$ such that
$$u'_k(z^*) = \gamma_{kj}\, u'_j(z^*)$$
i.e., for this to hold for a unique z* (regardless of its mean), the marginal utility of z* can differ only by a multiplicative constant. So, $u'_k = \gamma_k u'$. Integrating gives: $u_k(z) = \delta_k + \gamma_k u(z)\ \forall k$.
Alternatively, for every investor to hold the same portfolio, regardless of the returns distribution, we must have that for all returns z:
$$\frac{u'_a(z)}{u'_a(1)} = \frac{u'_b(z)}{u'_b(1)} \quad \text{or} \quad u'_a(z) = \frac{u'_a(1)}{u'_b(1)}\, u'_b(z)$$
i.e., all investors must have the same marginal rate of substitution across realizations of returns. If this is true across all investors for all levels of return, z, we get 1fs.
Integrate:
$$\int_1^r u'_a(s)\,ds = u_a(r) - u_a(1) = \frac{u'_a(1)}{u'_b(1)} \int_1^r u'_b(s)\,ds = \frac{u'_a(1)}{u'_b(1)}\,[u_b(r) - u_b(1)] \quad \forall r$$
So $u_a$ is a positive affine transformation of $u_b$. Clearly all will hold the same portfolio if the utility functions are all the same.
Now, look at a (complete) market with one riskless asset and one risky asset:
$$\tilde z = \begin{cases} r & \text{with prob } \tfrac{1}{2} \\ 1 & \text{with prob } \tfrac{1}{2} \end{cases}$$
Then, if the ratio of the marginal utilities is not the same for all investors, so $u'_a \ne \gamma u'_b$, the investors see different relative state prices and so choose different portfolios in this market.
Cass & Stiglitz worked this out considering utility of wealth. By considering utility of returns, we implicitly assumed initial wealth was the same for all investors. The Cass & Stiglitz results hold for all $W_o > 0$ and so all investors must have an affine transformation of one CRRA utility function.
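When does $\max E[u(W_o Zw)]$ s.t. $\mathbf{1}'w = 1$ look the same for all $W_o$?
If, for example:
$$u(W) = \frac{W^{1-\gamma}}{1-\gamma}, \ \text{ then } \ E[u(W_o Zw)] = E\left[\frac{(W_o Zw)^{1-\gamma}}{1-\gamma}\right] = W_o^{1-\gamma}\, \frac{E[(Zw)^{1-\gamma}]}{1-\gamma}$$
So, the optimal portfolio is independent of $W_o$. $u(W) = \ln(W)$ also works.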
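A small numerical sketch can make the wealth-independence concrete. Everything below is an assumption chosen for illustration: a two-state, equally likely economy with one riskless and one risky asset, power utility with relative risk aversion 3, and a plain grid search standing in for a proper optimizer.

```python
# Hypothetical market: two equally likely states, one riskless and one risky asset.
R = 1.02                 # riskless return
RISKY = (1.30, 0.85)     # risky return in each state
GAMMA = 3.0              # relative risk aversion in u(W) = W**(1 - GAMMA) / (1 - GAMMA)

def expected_utility(w_risky, W0):
    """Expected power utility of end-of-period wealth for a given risky weight."""
    total = 0.0
    for z in RISKY:
        wealth = W0 * ((1 - w_risky) * R + w_risky * z)
        total += 0.5 * wealth ** (1 - GAMMA) / (1 - GAMMA)
    return total

def best_weight(W0, steps=2000):
    """Grid search over risky weights in [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: expected_utility(w, W0))

# The maximizing weight should not change with initial wealth.
weights = {W0: best_weight(W0) for W0 in (1.0, 10.0, 250.0)}
```

Initial wealth only rescales expected utility by the positive factor $W_o^{1-\gamma}$, so the same grid point wins for every $W_o$.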
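Sufficiency: with no redundant assets, $w^m$ is unique and $E(\varepsilon_i) = 0\ \forall i$. So, any holding other than $w^m$ has the same expected return and more risk than $w^m$. That is,
$$\tilde z_m = Zw^m = \sum_i w_i^m \tilde x + \sum_i w_i^m \tilde\varepsilon_i = \tilde x$$
Any other portfolio has returns of:
$$\tilde z_k = \tilde x + \sum_i w_i^k \tilde\varepsilon_i = \tilde z_m + \tilde\varepsilon_k$$
where $E[\varepsilon_k \mid z_m] = \sum_i w_i^k E[\varepsilon_i \mid x] = 0$.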
So, all portfolios have the same expected return and $z_m$ is less risky than all other portfolios k, thus only $w^m$ will be held by an investor with concave utility.
Necessity: (Need to show $\tilde z_i = \tilde x + \tilde\varepsilon_i$, not $\tilde z_i \overset{d}{=} \tilde x + \tilde\varepsilon_i$.)
Let $z_m$ be the returns on the assumed optimal portfolio. By leaving the $e_i$ unspecified, we can, without loss of generality, write $\tilde z_i = \tilde z_m + \tilde e_i$. We do know, however, that $E[e_i] = 0\ \forall i$ must be true since if not some assets will have different expected returns and some investors will hold different portfolios, trading off risk and return in different ways.
The FOC of any investor, regardless of utility function, must hold for all zi and for zm. So,
Since $z_m$ must be optimal for all monotonic, concave utility functions, i.e. any positive, decreasing $u'(z)$, by the fair game lemma, this zero covariance implies $E[e_i \mid z_m] = E[e_i] = 0$.
($\mathrm{cov}(x, g(y)) = 0\ \forall g(\cdot) \Rightarrow E[x \mid y] = E[x]$.)
So, we have:
$$\tilde z_i = \tilde z_m + \tilde e_i \ \text{ with } \ E[e_i \mid z_m] = 0$$
Finally, we required, by construction, that the assumed optimal portfolio has no e-risk. So, there exists $w^m$ with $\sum_i w_i^m e_i = 0$.
The story is simply that there is a single source of systematic risk and all assets have the same exposure
to it. Only in this way is there no risk-return tradeoff (must be no such tradeoff so that only one
portfolio will be held regardless of the utility function involved) and only with the fair game property do
you get nice results on riskiness for the idiosyncratic component of returns.
It’s simple: all assets must have the same expected return, $\bar z_i = \bar z_m\ \forall i$, an uninteresting case.
The only known class of utility functions that permits non-degenerate 2fs is the quadratic class.
Quadratic utility investors hold portfolio combinations of two efficient funds.
Since the riskless asset is one of the funds, the other need include only risky assets and we, therefore, want to find classes of utility functions for which the risky assets are held in a fixed proportion, i.e. find U for which $w_{ik}/w_{jk} = \gamma_{ij}$ (constant) for all agents k where $i, j \ne 0$.
In other words, we want all investors holding a levered position in the same (market) portfolio of
risky assets.
The investor’s FOC: $E[u'_k(z^*)(z_i - R)] = 0$ for all i determines the optimal portfolio of risky assets.
A sufficient condition for this is that all utility functions are in the HARA (or LRT) class:
$$\frac{1}{A_k} = a + bz \ \text{ with b constant for all k.}$$
Or, $u'_k(Z) = (A_k + B_k Z)^{-c}$ $\left(u'(W) = (A_k + B_k W/W_o)^{-c}\right)$
where c is constant across investors, and $B_k$ and c must have the same sign for concave utility.
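U: Recall that the class of utility functions includes the following.
(1) $\dfrac{(W - \hat W_k)^{\gamma}}{\gamma}$ with $\gamma < 1$ ($\gamma \ne 0$), $\gamma = 1 - c$, $\hat W_k = -\dfrac{W_{ok} A_k}{B_k}$; $\hat W_k \equiv$ subsistence level of wealth
(2) $-\dfrac{(\hat W_k - W)^{\gamma}}{\gamma}$ with $\gamma > 1$; $\hat W_k \equiv$ satiation level of wealth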
The optimal wk * is clearly independent of Bk and depends only on c. Thus, all investors hold
the same w * since c is common to all (all investors have the same standard CRRA utility function, look
at Ŵk above) and it degenerates to 1fs.
Case 2: General –
Consider that investor k first places an arbitrary amount, 1-ak, in the riskless asset and ak in a
portfolio of all risky assets and the riskless asset. This is without loss of generality since the second
portfolio contains the riskless asset and there are no short sales restrictions.
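Let the amount $a_k$ be split up as $\alpha_{ok}, \alpha_k$ with $\sum_{i=0}^{N} \alpha_{ik} = 1$.
The final portfolio weights are therefore written:
$$w_{ok} = 1 - a_k + a_k \alpha_{ok}$$
$$w_{ik} = a_k \alpha_{ik}, \quad i = 1, \dots, N$$
and the FOC is:
$$E\Big[\Big(A_k + B_k(1 - a_k)R + B_k \sum_{j=0}^{N} a_k \alpha_{jk} \tilde z_j\Big)^{-c} (\tilde z_i - R)\Big] = 0$$
Now since $a_k$ is arbitrary, choose something clever and set $a_k = 1 + \dfrac{A_k}{R B_k}$.
So, $1 - a_k = -\dfrac{A_k}{R B_k}$.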
Thus, $\alpha_{ok}^*$ and $\alpha_{ik}^*$, i = 1,…,N, are now independent of $A_k$ and $B_k$: $\alpha_{ik} = \alpha_i$ for all k.
This demonstrates that the same $\alpha_o$ and $\alpha$ are optimally chosen by all investors with the same utility parameter c, the FOC holds, and the constraints $\sum_{i=0}^{N} w_i = 1 \Leftrightarrow \sum_{i=0}^{N} \alpha_i = 1$ also hold.
All investors hold the same portfolio of risky assets, thus it must be the market portfolio.
The individual utility parameters $A_k$ and $B_k$ simply determine the amount of leverage.
$$w_{ok} = 1 - a_k + a_k \alpha_{ok} = 1 - a_k + a_k \alpha_o = -\frac{A_k}{R B_k} + \alpha_o\left(1 + \frac{A_k}{R B_k}\right)$$
For $\gamma < 1$: $\hat W_k = -\dfrac{W_{ok} A_k}{B_k}$ is the subsistence level of wealth for “next period”.
$u'(\hat W_k) = \infty$ or $u(\hat W_k) = -\infty$, so investors hold a portfolio that insures $W > \hat W_k$.
then divide the remaining wealth among all assets (including the riskless asset) according to the optimal portfolio weights, $\alpha_o, \alpha$, where these weights are the optimal choice for an investor with utility $u(W) = W^{\gamma}/\gamma$, which depends only on $\gamma$ (or c).
Example:
$$u(w) = -\frac{(w - \hat w_k)^2}{2}, \quad \hat w_k > W$$
For $\gamma > 1$:
The demand is as given, but these demand functions represent utility minimizing portfolios; we must bound wealth below $\hat W_k$. So, $\alpha_i\left(W_{ok} - \dfrac{\hat W_k}{R}\right)$ must be negative.
That is, the investor takes a short position in ($\alpha_o, \alpha$).
For exponential utility:
$$w_{ik} W_{ok} = \frac{\alpha_i}{\delta_k}, \qquad w_{ok} W_{ok} = \frac{\alpha_o}{\delta_k} + W_{ok} - \frac{1}{\delta_k}$$
So, the investor’s demand for the risky assets is constant for all $W_o$. The $\alpha_i$ are optimal for an investor with absolute risk aversion parameter $\delta = 1/W_{ok}$, $u(W) = -\exp(-W/W_o)$, so there is unitary absolute risk aversion on utility of returns $u(z) = -\exp(-z)$.
There are two residual risk free portfolios that have distinct exposure to “Y” risk. Without this last
condition it would degenerate to 1fs.
$w^1$ and $w^2$ are the two funds.
The systematic risk of any asset is given by $\tilde X + b_i \tilde Y$
This combination of conditions implies ε is idiosyncratic risk for all investors and that all desirable
risk/return combinations can be achieved by trading in w1 and w2.
Two Fund Money Separation
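The necessary and sufficient conditions are:
There exist $b_i$, $\tilde Y$, $\tilde\varepsilon_i$, $w^m$, R such that
$$\tilde z_i = R + b_i \tilde Y + \tilde\varepsilon_i \quad \forall i$$
$$E[\tilde\varepsilon_i \mid Y] = 0 \quad \forall i, Y$$
and there exists $w^m$ such that $\mathbf{1}'w^m = 1$, $\sum_i w_i^m \tilde\varepsilon_i \equiv 0$ and $\sum_i w_i^m b_i \ne 0$.
Let $\hat w^m = (0, w_1^m, \dots, w_N^m)'$ denote the portfolio weights on what will be the market portfolio of risky assets
and,
let $\hat b = (0, b_1, \dots, b_N)'$ denote the augmented vector of sensitivity levels to the single risk factor.
For any portfolio w define $\gamma \equiv \dfrac{w'\hat b}{b_m}$, with $b_m \equiv \hat w^{m\prime}\hat b$, as the relative level of systematic risk of portfolio w (relative to the market portfolio).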
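Sufficiency: The story is again that all desirable risk return tradeoffs can be achieved with R and $Z_m$ (both of which are devoid of ε-risk) and that nobody holds ε-risk.
Write any portfolio as $w = (1 - \gamma)\, i_1 + \gamma\, \hat w^m + \alpha$, where $i_1 = (1, 0, \dots, 0)'$ puts all wealth in the riskless asset and $\alpha$ is the remainder.
Since $\gamma$ was chosen so that $w'\hat b = \gamma b_m$ (i.e. $\gamma$ in $w^m$ and $(1 - \gamma)$ in R mimics the systematic return of w) we know $\alpha'\hat b = 0$ must hold. α is an arbitrage portfolio with no systematic risk exposure.
(Or: by construction, $w = \gamma\, \hat w^m + (1 - \gamma)\, i_1 + \alpha$. Post-multiply by $\hat b$: $w'\hat b = \gamma b_m + (1 - \gamma) \cdot 0 + \alpha'\hat b$.
And, since we chose $\gamma$ so that $w'\hat b = \gamma b_m$, it must be true that $\alpha'\hat b = 0$. Thus, $\alpha$ doesn’t contribute to systematic risk. And since $E(\varepsilon) = 0$, α doesn’t contribute to expected returns either.)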
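All risk-return tradeoffs can be accomplished with R and $w^m$.
We need only show that:
$$E[u(w'\tilde z)] = E[u((1-\gamma)R + \gamma \tilde z_m + \tilde\varepsilon_\alpha)] \le E[u((1-\gamma)R + \gamma \tilde z_m)] = E[u(\tilde q)]$$
where $\tilde\varepsilon_\alpha \equiv \sum_i \alpha_i \tilde\varepsilon_i$. The two portfolios w and q have the same expected return since $E[\varepsilon_i] = 0\ \forall i$, so we simply need to show that $\tilde q + \tilde\varepsilon_\alpha$ is riskier than $\tilde q$,
i.e. show that $E[\varepsilon_\alpha \mid q] = 0$ (or $E[\varepsilon_\alpha \mid q] \le 0$).
$$E[\varepsilon_\alpha \mid q] = \sum_i \alpha_i E[\varepsilon_i \mid q]$$
But, since knowing q implies knowing $z_m$ implies knowing Y and vice versa, this is equivalent to:
$$\sum_i \alpha_i E[\varepsilon_i \mid Y] = 0$$
Thus, $\tilde q + \tilde\varepsilon_\alpha$ is riskier than $\tilde q$ in a R-S sense, and no investor with a concave utility function holds the arbitrage portfolio; all hold some combination of R and $w^m$.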
Necessity:
The general FOC is: $E[u'(\tilde q)(z_i - R)] = 0$ if q is the assumed optimal portfolio’s returns.
If $\tilde q$ allows 2fms, then $q = \gamma z_m + (1-\gamma)R$ where $w^m$ is some portfolio of the risky assets ($\mathbf{1}'w^m = 1$).
Define $\tilde Y \equiv \dfrac{\tilde z_m - R}{b_m}$ and $b_i \equiv b_m \beta_i$
and you have that all returns $z_i$ can be written as: $\tilde z_i = R + b_i \tilde Y + \tilde e_i$
So,
$$0 = \beta_i E[u'(q)(z_m - R)] + E[u'(q) e_i] \quad \forall i \ \text{ and for all } u(\cdot) \text{ in the monotone concave class.}$$
By the assumption of separation, the first term is zero, so it must be that $E[u'(q) e_i] = 0\ \forall i, u(\cdot)$.
As before,
$$0 = E[u'(q) e_i] = \mathrm{Cov}(u'(q), e_i) \ \text{ since } E[e_i] = 0 \quad \forall i, u(\cdot)$$
so the fair game property implies that the FOC (assuming separation) holding implies that the general FOC holds, which proves necessity.
For 2fms, we also know that $w^m$ must, in equilibrium, be the market portfolio. So, the market portfolio is efficient and our general pricing formulas can be applied using the restrictions on allowed utility functions and recognizing that $z^* = z_m$.
$$\bar z_i - R = \frac{\mathrm{Cov}(u'(z_m), z_i)}{\mathrm{Cov}(u'(z_m), z_m)}\,(\bar z_m - R)$$
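2fms Distribution Based: Again, the risky asset portfolio must be the market portfolio. From our conditions $\tilde z_m = R + b_m \tilde Y$ and $\tilde z_i = R + b_i \tilde Y + \tilde\varepsilon_i$ we get:
$$\bar z_i = R + b_i \bar Y \ \text{ and } \ \bar z_m = R + b_m \bar Y, \ \text{ or } \ \bar z_i - R = \frac{b_i}{b_m}(\bar z_m - R)$$
and that
$$\mathrm{Cov}(z_i, z_m) = b_i b_m \mathrm{Var}(Y) \ \text{ and } \ \mathrm{Var}(z_m) = b_m^2 \mathrm{Var}(Y)$$
So, $\dfrac{b_i}{b_m} = \dfrac{\mathrm{Cov}(z_i, z_m)}{\mathrm{Var}(z_m)} \equiv \beta_i$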
So, 2fms, if the variance of Y is defined, allows CAPM pricing. This had to follow since all optimal
portfolios, being combinations of R and wm, can be completely described by their means and variances
alone.
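The link between the factor loadings and the CAPM beta can be checked on a toy economy. The sketch below is an illustration only: the factor distribution, the residual shock, the loadings and the market weights are all hypothetical, chosen so that the residuals cancel in the market portfolio ($\varepsilon_2 = -\varepsilon_1$ with equal weights) and are independent of Y, which makes them a fair game with respect to Y.

```python
from fractions import Fraction as F
from itertools import product

R = F(1)                                                              # riskless return (hypothetical)
Y_DIST = {F(-1, 10): F(1, 4), F(1, 10): F(1, 2), F(3, 10): F(1, 4)}  # factor outcome -> probability
E_DIST = {F(-1, 20): F(1, 2), F(1, 20): F(1, 2)}                      # residual shock, independent of Y
b = (F(1, 2), F(3, 2))                                                # loadings of the two risky assets
w_m = (F(1, 2), F(1, 2))                                              # market weights on the risky assets

def outcomes():
    """Yield (probability, z_1, z_2, z_m) over the joint distribution."""
    for (y, py), (e, pe) in product(Y_DIST.items(), E_DIST.items()):
        z1 = R + b[0] * y + e
        z2 = R + b[1] * y - e          # eps_2 = -eps_1, so residuals cancel in the market
        yield py * pe, z1, z2, w_m[0] * z1 + w_m[1] * z2

def mean(k):
    """Expected value of column k of the outcome tuples."""
    return sum(o[0] * o[k] for o in outcomes())

def cov(j, k):
    """Covariance between columns j and k of the outcome tuples."""
    mj, mk = mean(j), mean(k)
    return sum(o[0] * (o[j] - mj) * (o[k] - mk) for o in outcomes())

b_m = w_m[0] * b[0] + w_m[1] * b[1]
betas = [cov(i, 3) / cov(3, 3) for i in (1, 2)]      # Cov(z_i, z_m) / Var(z_m)
premia = [mean(i) - R for i in (1, 2)]               # expected excess returns
```

Exact rational arithmetic is used so that the equalities $\beta_i = b_i / b_m$ and $\bar z_i - R = \beta_i(\bar z_m - R)$ can be compared without rounding error.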
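From 2fs, we get the Black-CAPM pricing equation.
Without a riskless asset:
m is one portfolio
let portfolio “0” be the other portfolio
$$Z_o = X + b_o Y, \qquad Z_m = X + Y \ \text{ where } b_m \equiv 1$$
$$X = Z_o - b_o Y \ \text{ and } \ Y = Z_m - X$$
$$X = Z_o - b_o Z_m + b_o X \ \Rightarrow \ X = \frac{Z_o - b_o Z_m}{1 - b_o}$$
$$Y = Z_m - X = Z_m - \frac{Z_o - b_o Z_m}{1 - b_o} = (Z_m - Z_o)(1 - b_o)^{-1}$$
$$\bar z_i = \bar X + b_i \bar Y = (1 - b_o)^{-1}(\bar Z_o - b_o \bar Z_m) + b_i(\bar Z_m - \bar Z_o)(1 - b_o)^{-1} = (1 - b_o)^{-1}\big(\bar Z_o - b_o \bar Z_m + b_i(\bar Z_m - \bar Z_o)\big)$$
$$\bar z_i = \bar Z_o\left(1 - \frac{b_i - b_o}{1 - b_o}\right) + \bar Z_m\, \frac{b_i - b_o}{1 - b_o}$$
from $z_i = X + b_i Y + \varepsilon_i$
$$\mathrm{Cov}(z_i, z_m) = \mathrm{Var}(X) + b_i \mathrm{Var}(Y) + (1 + b_i)\mathrm{Cov}(X, Y)$$
or, choosing portfolio 0 so that $\mathrm{Cov}(z_o, z_m) = 0$,
$$b_o = -\frac{\mathrm{Var}(X) + \mathrm{Cov}(X, Y)}{\mathrm{Var}(Y) + \mathrm{Cov}(X, Y)}$$
So, $\bar z_i = \bar z_o(1 - \beta_i) + \beta_i \bar z_m = \bar z_o + \beta_i(\bar z_m - \bar z_o)$
Thus, the Black-CAPM equation occurs for the same reason as in the case of 2fms.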
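The step from the loadings to the Black-CAPM beta, $\beta_i = (b_i - b_o)/(1 - b_o)$, can be confirmed numerically. The following sketch assumes a hypothetical four-state joint distribution for (X, Y), normalizes $b_m = 1$ as in the derivation, and leaves out the mean-zero residuals since they do not affect the expected returns or the covariance with the market.

```python
from fractions import Fraction as F

# Joint distribution of (X, Y) as (probability, x, y); the numbers are hypothetical.
STATES = [(F(1, 4), F(1), F(0)),
          (F(1, 4), F(21, 20), F(1, 10)),
          (F(1, 4), F(1), F(1, 5)),
          (F(1, 4), F(11, 10), F(-1, 10))]

def mean(b_i):
    """Expected return of an asset z = X + b_i * Y (mean-zero residual omitted)."""
    return sum(p * (x + b_i * y) for p, x, y in STATES)

def cov(b_i, b_j):
    """Covariance between X + b_i * Y and X + b_j * Y."""
    mi, mj = mean(b_i), mean(b_j)
    return sum(p * (x + b_i * y - mi) * (x + b_j * y - mj) for p, x, y in STATES)

def var_cov_xy():
    """Var(X), Var(Y) and Cov(X, Y) from the joint distribution."""
    ex = sum(p * x for p, x, y in STATES)
    ey = sum(p * y for p, x, y in STATES)
    vx = sum(p * (x - ex) ** 2 for p, x, y in STATES)
    vy = sum(p * (y - ey) ** 2 for p, x, y in STATES)
    cxy = sum(p * (x - ex) * (y - ey) for p, x, y in STATES)
    return vx, vy, cxy

vx, vy, cxy = var_cov_xy()
b_m = F(1)                             # market loading, normalized to one
b_o = -(vx + cxy) / (vy + cxy)         # loading of the zero-beta portfolio

def beta(b_i):
    """Market beta of an asset with loading b_i."""
    return cov(b_i, b_m) / cov(b_m, b_m)
```

With these numbers the zero-beta loading works out to 3/20, and any other loading can be passed to `beta` to compare against $(b_i - b_o)/(1 - b_o)$.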
K-Fund Separation
Investors will hold combinations of no more than K risky mutual funds and the riskless asset if:
(1) $\tilde z_i = R + \sum_{k=1}^{K} b_{ik} \tilde Y_k + \tilde\varepsilon_i$
(2) $E[\tilde\varepsilon_i \mid Y_1, \dots, Y_K] = 0 \quad \forall i$
(5) Rank(A) = K
Each element of A is a portfolio weighted average of the b’s on one $Y_k$ for a given fund l,
i.e. $a_{lk}$ is fund l’s factor loading on factor k ($Y_k$).
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1K} \\ \vdots & \ddots & \vdots \\ a_{K1} & \cdots & a_{KK} \end{pmatrix}$$
Condition (1) describes a K factor model of the returns generating process. Y’s are factors and b’s are
factor loadings.
Condition (2) says the ε’s are idiosyncratic risk that no investor wants to hold.
Conditions (3) and (4) say that each mutual fund is a well diversified portfolio and that there are K such
funds.
Condition (5) says that we have K linearly independent (non-collinear) funds. Thus, a rotation of the factors can set A to a diagonal matrix (i.e. there is one portfolio combination of the funds that creates a fund with no ε-risk and $b_i > 0$ for only one i) and we can therefore get a portfolio with any combination of factor loadings by using the K funds (i.e. form any possible risk-return tradeoff using the funds).
Proof: Basically the same as before – show that no one holds ε-risk and can accomplish any desired
risk/return combination using the K funds.
This must then hold for the K funds as well; use the “rotated factors” so that each of the funds has a non-zero factor loading on only one of the factors.
Let $\bar z^k$ be the expected return on the mutual fund with a non-zero loading only on factor k. And,
let $b^k = \sum_i w_i^k b_{ik}$ be that loading.
Then,
$$\bar z^k = R + b^k \bar Y_k$$
$$\bar z_i = R + \sum_k b_{ik}\frac{(\bar z^k - R)}{b^k}.$$
Furthermore,
$$\mathrm{Cov}(z_i, z^k) = b_{ik}\, b^k\, \mathrm{Var}(Y_k)$$
$$\mathrm{Var}(z^k) = (b^k)^2\, \mathrm{Var}(Y_k)$$
So,
$$\frac{\mathrm{Cov}(z_i, z^k)}{\mathrm{Var}(z^k)} = \frac{b_{ik}}{b^k} \equiv \beta_{ik}$$
and
$$\bar z_i = R + \sum_k \beta_{ik}(\bar z^k - R).$$
where,
(2) $E(\tilde\varepsilon) = 0$
(3) $E(\tilde f) = 0$
(4) $E(ff') = I$
(5) $E(f\varepsilon') = 0$
(6) $E(\varepsilon\varepsilon') = D$, a diagonal matrix
A brief survey of history tells us that, prior to the development of the CAPM, the standard intuition was:
$$E(\tilde z_i) = R_f + \theta_i$$
People invest in risky assets if you give them a positive expected return in addition to compensation
for the time value of money, Rf.
This implies that if you want a preference based theory, you should use concave preferences since
risk averse investors are the ones that will demand a positive premium for holding risk.
Portfolio theory and the equilibrium arguments in the CAPM gave an identification of the risk premium, $\theta_i = \lambda\beta_i$, that said only market risk, the projection of an asset’s returns on the market portfolio’s returns (the part of the asset’s return that is correlated with the market portfolio’s return), is priced; you ignore all the rest.
The supposed intuition of the CAPM is that idiosyncratic risk can be diversified away leaving only
systematic (market) risk to be priced.
But,…
Idiosyncratic risk in the CAPM framework is defined with reference to the market portfolio, it’s the
residual of the regression or projection of an asset’s return on the market’s return.
No further assumptions about the $\varepsilon$’s are made, i.e. they could be highly correlated. In fact, they are linearly dependent since when weighted by the market value weights they must sum to zero. So, in any large portfolio, we cannot use the law of large numbers to say this portfolio has negligible idiosyncratic risk, contrary to our intuition.
The exception, of course, is the market (tangency) portfolio. But, then the intuition that
diversification leads to pricing based on the market portfolio is circular at best.
The APT is also a static model but it can be seen as a static version of most inter-temporal models in
finance in which the factors represent innovations in the underlying state variables.
The APT directly assumes a return structure in which the systematic and idiosyncratic components
of returns are defined a priori. Thus, the standard notion of diversification is directly used.
The APT is based upon the absence of arbitrage and pricing is done in terms of the exogenous
factors.
One view of pricing is that from the absence of arbitrage (and so it will be true in any equilibrium model with increasing preferences) we know that:
$$\exists\, p = (p_1, \dots, p_S)' > 0 \ \text{ s.t. } \ Yp = v$$
In a complete market we can formally identify the p’s as A-D state prices. In an incomplete market,
the p’s still exist but technically we can’t identify them as A-D prices. This approach is currently
being pursued in interesting ways.
Traditionally, however, people have been looking for a pricing relation like: $\bar z_i = R_f + \theta_i$
Think like an economist for a moment. Y is your technology – it determines how money is moved
from state to state and across time in this economy.
Either restriction we know translates to more mean and less variance being goods (i.e. you consider
u(mean, variance) as your preference restriction in the economy). This provides the 2fs and as we
have seen the simplification of the general pricing equation.
The APT is a look at how we can restrict Y – the technology rather than preferences.
We can interpret the returns generating model as a rank restriction on the Y matrix – this will be
illustrated more concretely in what follows. In other words the factor model restricts the nature of
what can constitute the “total risk” of the capital market.
As an illustration let’s develop an arbitrage based approach to the CAPM pricing equation or the SML.
Suppose:
$$\tilde z_i = \bar z_i + \beta_i \tilde f + \tilde\varepsilon_i \quad \text{(a 1-factor model)}$$
$$\tilde Z w = w'\tilde z = w'(\bar z + \beta \tilde f + \tilde\varepsilon) = w'\bar z + (w'\beta)\tilde f + w'\tilde\varepsilon$$
Set $w'\beta = 0$, i.e. choose a portfolio with no systematic risk. This assumes it is not the case that all assets have the same level of systematic risk. That is, there exist assets i, j s.t. $\beta_i \ne \beta_j$ (or that the vectors $\mathbf{1}$ and $\beta$ are not collinear). Note, since relative pricing is the goal this is a minimal assumption, on the order of $\exists\, i, j$ with $\bar z_i \ne \bar z_j$, made in the CAPM derivation.
(2) Suppose the portfolio w is also well diversified. That is, in some metric you specify the $w_i$’s are close to 1/N for all assets i.
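(3) Assume either an upper bound on the variances of the $\varepsilon$’s or more simply assume that $\sigma^2_{\varepsilon_i} = \sigma^2\ \forall i$, i.e. the $\varepsilon$’s all have the same (finite) variance. Also assume that $E(\tilde\varepsilon_i \tilde\varepsilon_j) = 0$ if $i \ne j$ (uncorrelated residuals).
The variance of $w'\varepsilon$ is then “close to” $\sigma^2/N$ if $w_i$ is “close to” $1/N\ \forall i$.
For example, if $w_i = O(1/N) \Rightarrow \mathrm{Var}(w'\varepsilon) = O(\sigma^2/N)$.
Or, $w'\varepsilon \to 0$ as $N \to \infty$: $E(w'\varepsilon) = 0$ and $\mathrm{Var}(w'\varepsilon) \to 0$ as $N \to \infty$, thus we can ignore the $\varepsilon$’s.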
(This is where we see the “bite” of equation (6) and where all the difficulty comes in the APT.)
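How quickly the residual risk disappears can be seen with a few lines of arithmetic. This is a sketch under the stated assumptions only (uncorrelated residuals with a common variance); the value of the variance and the long/short weighting scheme are hypothetical.

```python
SIGMA2 = 0.04  # common residual variance (hypothetical value)

def residual_variance(weights, sigma2=SIGMA2):
    """Var(w'eps) when the eps_i are uncorrelated with equal variance sigma2."""
    return sigma2 * sum(w * w for w in weights)

def arbitrage_weights(n):
    """Zero-net-investment weights of order 1/n: alternate long and short positions."""
    return [(1 if i % 2 == 0 else -1) / n for i in range(n)]

for n in (10, 100, 1000, 10000):
    w = arbitrage_weights(n)
    # Residual variance falls like sigma^2 / n as the portfolio grows.
    print(n, residual_variance(w))
```

Because every weight has magnitude 1/n, the sum of squared weights is 1/n, so the residual variance is the common variance divided by n.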
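Now, since we set $w'\beta = 0$ we have: $Zw = w'\bar z + (w'\beta) f + w'\varepsilon \approx w'\bar z$ for “large N”.
Further, picking an asset with β = 1 we see that $\lambda = \bar z^1 - R$, where $\lambda$ is the factor risk premium and $\bar z^1$ is the expected return on an asset with a factor loading equal to 1.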
Thus, $\bar z_i = R + \beta_i(\bar z^1 - R)$
And, we get the standard intuition that for a risky asset the expected return is made up of the time value of money and a risk premium.
In the returns equation $\tilde z_i = \bar z_i + \beta_i \tilde f + \tilde\varepsilon_i$, economics tells us that $\bar z_i$ and $\beta_i$ should be related. We found:
$$\bar z_i = R + \beta_i(\bar z^1 - R)$$
Here, it is a linear combination of a vector of ones and something else (f, the factor).
Is it always a vector of ones?
Now, go back and see what’s wrong.
$\bar z_j = R + (\bar z^1 - R)\beta_j$ is only an approximate pricing result: the rank restriction is only approximate when there is residual risk. There is a whole literature on “what is close” (Absence of Asymptotic Arbitrage) and on how big any pricing errors can be.
Since there is no restriction placed on preferences other than monotonicity, we don’t or can’t say anything about these questions.
where b, c, and $\mathbf{1}$ are not collinear (so that our factor loadings are “sufficiently different”, a term we will use now and make more precise later).
Again, this is a singular matrix where the last 2 rows are not collinear.
So, it must be that:
$$\bar z_i - R = a_i - R = \lambda_1 b_i + \lambda_2 c_i \quad \forall i$$
Again, let $\bar z^i$ be the return on a portfolio with one unit of factor i risk.
Unavoidable Risk:
Consider the two factor model:
$$\tilde z_i = a_i + b_i \tilde f_1 + \tilde f_2, \qquad \varepsilon_i \equiv 0$$
Any positive investment portfolio has returns:
$$Zw = w'a + w'b\, f_1 + f_2$$
Clearly, we do not have the factor loadings on factor 2 sufficiently different across assets – this is an
extreme version.
However, any portfolios with the same b are perfectly correlated and must have the same expected
return from the absence of arbitrage (here simple dominance).
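Thus,
$$0 = \delta_1(a_i - a^o) + \delta_2(a_j - a^o) = \delta_1\left[a_i - a^o - \frac{b_i}{b_j}(a_j - a^o)\right]$$
(by solving $\delta_1 b_i + \delta_2 b_j = 0$ for $\delta_2$)
This must, then, hold for any choice of $\delta_1$ (and its accompanying $\delta_2$). So,
$$\frac{a_i - a^o}{b_i} = \frac{a_j - a^o}{b_j} \equiv \lambda$$
The linear pricing relation is then (by simply rearranging this expression):
$$a_i = a^o + \lambda b_i \ \text{ or,}$$
$$\bar z_i = a_i = \bar z^o + b_i(\bar z^1 - \bar z^o)$$
where $\bar z^o = a^o$
$\bar z^1$ = expected return on a portfolio with b = 1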
Note: The riskless rate does not appear here, even if a riskless asset exists. This occurs because we cannot create a risk free asset with these risky assets; the absence of arbitrage therefore does not provide a relation between R and the expected return on risky assets. In particular, $\bar z^o \ne R$.
In fact, we expect $\bar z^o > R$ is the usual relation since you must compensate investors for their exposure to the unavoidable factor 2 risk. Similarly, $\bar z^1$ can be greater than or less than $\bar z^o$.
An interesting question to ask at this point is what can we learn from the pricing relation? Can we put it in the context of other results we have seen?
Recall that $\lambda = u'(Z_e^k)/\eta$ (i.e. the state price density is equal to the marginal utility of future return on the optimal portfolio scaled by the Lagrange multiplier or the marginal utility of current consumption). Thus the general pricing model tells us that pricing is determined by the covariance of an asset’s return with the marginal utility of the return of an optimal portfolio, or its covariance with the state price density.
Recall also that when there was a set of circumstances in which there was a simple sufficient statistic for
this marginal utility we were able to derive standard pricing equations like the CAPM (quadratic utility or
normally distributed returns) or the CCAPM (when consumption or consumption growth is sufficient
for marginal utility) from the general formula.
To make this point in a slightly different way consider the CAPM. Given the model:
(1) $\lambda = a + b z_m$ and $1 = E[\lambda z_i]$
We can find constants γ and δ such that
(2) $E(z_i) = \gamma + \delta\beta_{im}$.
Conversely, given (2) we can find constants a and b such that (1) holds.
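One direction of this equivalence can be worked through on a small example. Expanding $1 = E[\lambda z_i] = E[\lambda]E[z_i] + b\,\mathrm{Cov}(z_m, z_i)$ suggests $\gamma = 1/E[\lambda]$ and $\delta = -b\,\mathrm{Var}(z_m)/E[\lambda]$. The sketch below checks that on a hypothetical four-state economy; the state probabilities, the market return, the asset payoffs and the choice b = -1 are all made up for illustration.

```python
from fractions import Fraction as F

PROB = [F(1, 4)] * 4                        # four equally likely states (hypothetical)
z_m = [F(4, 5), F(1), F(6, 5), F(7, 5)]     # market return by state
payoffs = [[F(1), F(1), F(1), F(1)],        # arbitrary payoffs for three other assets
           [F(0), F(1), F(2), F(5)],
           [F(3), F(1), F(2), F(1)]]

def E(x):
    """Expectation of a state-contingent quantity."""
    return sum(p * v for p, v in zip(PROB, x))

def cov(x, y):
    """Covariance of two state-contingent quantities."""
    return E([a * b for a, b in zip(x, y)]) - E(x) * E(y)

# Pick b, then choose a so that the market itself is priced: E[(a + b z_m) z_m] = 1.
b = F(-1)
a = (1 - b * E([v * v for v in z_m])) / E(z_m)
sdf = [a + b * v for v in z_m]              # linear state price density, relation (1)

# Turn payoffs into returns by dividing by their price, so that E[sdf * z_i] = 1.
returns = []
for x in payoffs:
    price = E([m * v for m, v in zip(sdf, x)])
    returns.append([v / price for v in x])

gamma = 1 / E(sdf)
delta = -b * cov(z_m, z_m) / E(sdf)
betas = [cov(z, z_m) / cov(z_m, z_m) for z in returns]
```

Each asset's expected return `E(z)` can then be compared with `gamma + delta * beta`, which is relation (2) with these particular constants.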
Thus these are equivalent pricing relations. This also gives us an idea of how we might look for or select factors. The factors should be things that help model marginal utility. Thus select variables that will proxy for how “happy” people are: $z_m$, the business cycle, production, consumption, input and energy prices, etc.
$$\sum_{i=1}^{n} w_i^n = 0 \quad \forall n \quad \text{(all are arbitrage portfolios)}$$
$$\sum_{i=1}^{n} w_i^n \bar z_i \ge \delta > 0 \quad \forall n \quad \text{(expected return is bounded away from zero)}$$
$$w^{n\prime}\, \Sigma\, w^n = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i^n w_j^n \sigma_{ij} \to 0 \quad \text{(risk disappears)}$$
For the variance all that is technically required is that some infinite sub-sequence has variance with a limit of zero.
We focus on variance because as $n \to \infty$, risk $\to 0$. So, as $n \to \infty$ you get a riskless payoff and mean-variance analysis is a good approximation.
One way this can be done is to set $\gamma_n = (w^{n\prime}\, \Sigma\, w^n)^{-1/4}$
(2) AAO is not a preference free idea. A riskless arbitrage opportunity guarantees infinite wealth and an infinite certainty equivalent utility of return, i.e. $u(Zw_n) \to u(\infty)$, so an unbounded position is taken.
Chebyshev’s inequality ($\mathrm{Prob}[|\tilde x - \bar x| \ge t\sigma] \le t^{-2}$) tells us that an AAO does guarantee infinite wealth with probability one; however, an infinite certainty equivalent wealth is not guaranteed. That is, not all investors take a position in an AAO.
A counter-example in the text demonstrates this and we can see that an opportunity to increase wealth
with no investment and vanishingly small risk does not guarantee an increase in expected utility – in the
example, no investor invests in the AAO.
If a utility function is bounded below by a quadratic function, then an AAO is a good deal. But, few “nice” utility functions are bounded in this way.
$$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} v_i^2 = \lim_{n\to\infty} \frac{1}{n}\|v_n\|^2 = 0$$
where $\|v_n\|$ is the Euclidean norm (length) of the vector v: $\|v_n\| = \sqrt{\sum_{i=1}^{n} v_i^2}$
Proof: Select n assets and number them 1,…n. “Regress” their expected returns on the bik and call the
regression coefficients k . (This is really a “population exercise,” formally it is a projection of the ai on
the space spanned by the matrix B and the vector 1 , the constant in the regression. If there is a multi-
collinearity problem, prespecify as many of the k as is necessary to remove the problem.)
Now, consider the arbitrage portfolio: $w_i^n = \dfrac{v_i}{\|v_n\| \sqrt{n}}$
The payoff on this portfolio is:
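$$\left(\sqrt{n}\, \|v_n\|\right)^{-1} \sum_i v_i \tilde{z}_i = \left(\sqrt{n}\, \|v_n\|\right)^{-1} \sum_i v_i \Big(a_i + \sum_k b_{ik} \tilde{f}_k + \tilde{\varepsilon}_i\Big) = \left(\sqrt{n}\, \|v_n\|\right)^{-1} \sum_i v_i (a_i + \tilde{\varepsilon}_i)$$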
Now suppose that the theorem is false so that $\sum_i v_i^2 / n \not\to 0$ in the limit.
Then $\|v_n\| / \sqrt{n}$ cannot go to zero and an AAO exists.
So, if no AAO’s exist, then $\|v_n\| / \sqrt{n} \to 0$ and $\|v_n\|^2 / n = \sum_i v_i^2 / n \to 0$ as well. QED.
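The arbitrage portfolio used in this proof can be checked numerically. The sketch below assumes a single factor and six made-up assets (the expected returns `a` and loadings `b` are hypothetical). It computes the pricing errors as regression residuals, forms weights proportional to them, and confirms zero net investment, zero factor exposure, and an expected profit equal to the norm of the errors divided by the square root of n.

```python
import math


def pricing_errors(a, b):
    """Residuals v_i from regressing expected returns a_i on a constant and b_i."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sxy = sum((bi - b_bar) * (ai - a_bar) for ai, bi in zip(a, b))
    sxx = sum((bi - b_bar) ** 2 for bi in b)
    lam1 = sxy / sxx               # factor premium (regression slope)
    lam0 = a_bar - lam1 * b_bar    # intercept
    return [ai - lam0 - lam1 * bi for ai, bi in zip(a, b)]


def arbitrage_weights(v):
    """Weights w_i = v_i / (||v|| * sqrt(n))."""
    n = len(v)
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm * math.sqrt(n)) for x in v]


a = [0.08, 0.11, 0.07, 0.13, 0.10, 0.09]  # hypothetical expected returns
b = [0.5, 1.0, 0.4, 1.5, 1.1, 0.7]        # hypothetical factor loadings
n = len(a)
v = pricing_errors(a, b)
w = arbitrage_weights(v)
v_norm = math.sqrt(sum(x * x for x in v))

net_investment = sum(w)
factor_exposure = sum(wi * bi for wi, bi in zip(w, b))
expected_profit = sum(wi * ai for wi, ai in zip(w, a))
sum_sq_weights = sum(wi * wi for wi in w)  # equals 1/n, so residual risk shrinks
print(net_investment, factor_exposure, expected_profit, v_norm / math.sqrt(n))
```

Because the squared weights sum to 1/n, the residual variance of this portfolio is at most the largest residual variance divided by n.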
The derived no arbitrage condition is $\frac{1}{N} \sum_i \big(a_i - \lambda_0 - \sum_k b_{ik} \lambda_k\big)^2 \to 0$. Each term in this sum is non-negative. The average term is zero, thus all but a finite number must be negligible.
More precisely, if we order the assets by the size of their absolute pricing error, $|v_1| \ge |v_2| \ge \dots \ge |v_n|$, then for any $\epsilon > 0$, no matter how small, there exists a finite N such that fewer than N assets are mispriced by more than $\epsilon$:
$$|v_1| \ge \dots \ge |v_{N-1}| \ge \epsilon > |v_N| \ge \dots$$
With an infinite number of assets, the probability (picking one at random) of getting one which has an error of more than $\epsilon$ is zero, as the assets with this size error or worse are a finite set, which has measure zero in an infinite set of assets.
Thus, the linear model prices most assets correctly and all with a negligible mean squared error. It can,
however, be arbitrarily bad at pricing a finite number of assets.
Theorem 2: Under the conditions of Theorem 1, the pricing error must satisfy the following:
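$$\lim_{n \to \infty} \frac{1}{n} \sum_i \frac{v_i^2}{\sigma_i^2} = 0$$
Where,
$$\sum_i \frac{v_i}{\sigma_i} \frac{b_{ik}}{\sigma_i} = 0 \ \ \forall k \quad \text{and} \quad \sum_i \frac{v_i}{\sigma_i} \frac{1}{\sigma_i} = 0$$
(here $\sigma_i$ is taken to denote the residual standard deviation of asset i)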
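Consider the portfolio:
$$w_i^n = \frac{v_i / \sigma_i^2}{\left(n \sum_i v_i^2 / \sigma_i^2\right)^{1/2}}$$
If $\frac{1}{n} \sum_i v_i^2 / \sigma_i^2 \not\to 0$ in the limit, then $\left(\frac{1}{n} \sum_i v_i^2 / \sigma_i^2\right)^{1/2}$ cannot go to zero and an AAO exists.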
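So, if no AAO exists, $\left(\frac{1}{n} \sum_i v_i^2 / \sigma_i^2\right)^{1/2}$ and $\frac{1}{n} \sum_i v_i^2 / \sigma_i^2$ both $\to 0$ as $n \to \infty$.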
We must show $T'\lambda$ is a vector with one non-zero entry and that the transformed model is valid.
The transformed model is proper since:
$E(\hat{f}') = E(f'T) = E(f')T = 0'$ and,
$E(\hat{f}\hat{f}') = E(T'ff'T) = T'E(ff')T = T'IT = I$ for any orthogonal $T$.
Now, choose a matrix $T = \big(\lambda(\lambda'\lambda)^{-1/2},\ X\big)$,
where $X$ is any $k \times (k-1)$ matrix whose columns are mutually orthogonal and all are orthogonal to $\lambda$.
By construction, $T$ is orthogonal: $T'T = I$.
So, $I = T'T = \big(T'\lambda(\lambda'\lambda)^{-1/2},\ T'X\big)$
And,
$T'\lambda = \big((\lambda'\lambda)^{1/2}, 0, 0, \dots, 0\big)'$
The lesson being that in any equilibrium, only a portion of the uncertainty (risk) brings compensation.
This can be true even when unpriced risk is common to many or all assets. We can’t make any
meaningful economic statements until priced and unpriced sources of risk are identified. The CAPM by
its preference restriction does this. In particular, we must say something further about the nature of the
factors in the returns generating model or we cannot attach any significance to the size or the sign of the $\lambda_k$.
Fully Diversified Portfolios:
We would like to say that $\lambda_k = \bar{z}^k - R$. The problem is that all such $\tilde{z}^k$ may not be priced the same since pricing is only approximate. So, $\lambda_k$ would not be well defined.
A fully diversified portfolio is the limit of a sequence of positive net investment portfolios whose
weights satisfy:
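$$\lim_{n \to \infty} n \sum_{i=1}^{n} (w_i(n))^2 \le c < \infty$$ i.e. the $w_i$ vanish for most assets.
The important feature is of course that they have no residual risk in the limit:
$$\lim_{n \to \infty} \sum_{i=1}^{n} w_i^2 \sigma_i^2 \le \lim_{n \to \infty} \Big(n \sum_{i=1}^{n} w_i^2\Big) \frac{\bar{\sigma}^2}{n} = 0$$
where $\bar{\sigma}^2$ is taken to denote an upper bound on the residual variances.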
There may also exist less than fully diversified portfolios with no residual risk.
Theorem 3: The expected returns on all fully diversified portfolios are given correctly with zero error (in the sense that $(\sum_i w_i v_i)^2 \to 0$, so the error on any fully diversified portfolio is negligible) by any linear pricing model satisfying Theorem 1.
Proof:
The expected return on the nth portfolio in a sequence is:
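$$a_F = \lambda_0 + \sum_{k}^{K} \lambda_k \sum_i w_i b_{ik} + \sum_i w_i v_i = \lambda_0 + \sum_{k}^{K} b_{Fk} \lambda_k + v_F$$
Consider
$$v_F^2 = \Big(\sum_i w_i v_i\Big)^2 \le \sum_i w_i^2 \sum_i v_i^2 = \Big(n \sum_i w_i^2\Big) \frac{\sum_i v_i^2}{n}$$
(From the Cauchy-Schwarz inequality, $(E[XY])^2 \le E(X^2)E(Y^2)$; the 1st term is bounded and the second term goes to zero from Theorem 1.)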
So, for a fully diversified portfolio, we know $\lambda_0$ is the return on a portfolio with no factor risk, but it also has no residual risk since it is fully diversified. So, $\lambda_0 = R$ must hold.
Also, for a fully diversified portfolio with one source of factor risk, i.e. $b_{ik} = 1$ and $b_{ij} = 0 \ \forall j \ne k$, the expected return is exactly $\bar{z}^k = R + \lambda_k$.
So, $\lambda_k = \bar{z}^k - R$ is now well defined.
Define $Q_n \equiv \hat{B}_n' \hat{B}_n / n$ and assume the sequence $Q_n$ has a limit. $Q_n$ is $(k+1) \times (k+1)$ independent of n and has a limit if each element does. We know these are all bounded since the factor loadings are bounded.
Consider the sequence of portfolio formation problems using n = k+1, k+2,… assets:
$\min \tfrac{1}{2} \sum_i w_i^2$ such that $\sum_i w_i = 1$, $\sum_i w_i b_{ik} = 1$, $\sum_i w_i b_{ij} = 0 \ \forall j \ne k$
So, form well diversified portfolios with single factor risk (positive investment):
$$L = \tfrac{1}{2} w'w + \gamma'(c - \hat{B}_n' w)$$
where $\gamma$ is a vector of Lagrange multipliers and c is a vector with a 1 in the 1st and kth positions and zeros elsewhere.
By construction, this portfolio has a single source of factor risk and is fully diversified if:
$$n w'w = n c'(\hat{B}_n'\hat{B}_n)^{-1} \hat{B}_n'\hat{B}_n (\hat{B}_n'\hat{B}_n)^{-1} c = c' Q_n^{-1} c$$ is bounded.
Since c and $Q_n$ are finite in size this can be unbounded only if $Q_n$ becomes singular in the limit. $Q_n$ is non-singular for finite n since $\hat{B}_n$ is of full column rank.
If c is the vector $\iota_1 = (1, 0, \dots, 0)'$ the portfolio has no factor risk and is fully diversified.
Its expected return must be $\lambda_0 = R$.
This makes precise our loose “sufficiently different” phrase in an economy with residual risk.
A sufficient condition:
At least one investor chooses to hold a fully diversified portfolio and each asset’s idiosyncratic risk is
a fair game with respect to the factors:
$$E(\tilde{\varepsilon}_i \mid f_1, \dots, f_k) = 0 \quad \forall i$$
Exact pricing follows from the separation result we saw previously.
By symmetry, all investors hold assets 2, 3, … equally, in the limit this duplicates the riskless asset, so
a=R in any equilibrium.
Suppose $\tilde{\varepsilon}_1 \sim N(0, \sigma^2)$ and utility is exponential, $u(Z) = -e^{-\delta Z}$.
We do have…
Theorem 5: A sufficient condition for the arbitrage model to provide exact pricing in the limit is if:
(i) Returns are given by a factor model with $E(\tilde{\varepsilon}_i \mid f_1, \dots, f_k) = 0$.
(ii) The market proportion or supply of each asset is negligible.
(iii) The loadings on each factor are spread evenly among the assets.
(iv) No investor takes an unboundedly large position in any asset ($w_i^* \le \bar{w}_i$).
(v) Marginal utility is bounded above zero.
Then, the pricing relation
$$a_i = \lambda_0 + \sum_k b_{ik} \lambda_k + v_i$$
holds and the pricing errors converge to zero in the sense that:
$$\lim_{n \to \infty} \sum_{i=1}^{n} v_i^2 = 0$$
i.e. all errors must be negligible in the limit (it’s not $\lim_{n \to \infty} \sum_{i=1}^{n} v_i^2 / n = 0$)
“Proof”:
(ii) and (iii) describe a “fully diversified economy”
(ii) - no asset is a big portion of the market portfolio
(iii) - assures that each factor affects many assets and that each factor must therefore make an identifiable contribution to market risk (no multi-collinearity in the columns of $B_n$)
In a fully diversified economy, all investors can simultaneously hold fully diversified portfolios with
factor loadings spread among their portfolios in any form.
(iv) and (v) guarantee an interior optimum to each investor’s choice problem
(iv) - says risk aversion doesn’t vanish so variance is always disliked
(v) - nonsatiation – expected returns are always liked
(i) - the residuals don’t matter for pricing in the limit, so you get a bound on the $v_i$. The condition $E(\tilde{\varepsilon}_i \mid f_1, \dots, f_k) = 0$ has the same role as in the discussion of the separating distributions.
Production: There are n productive units (or “Lucas trees” as they have come to be known)
exogenously producing (no production decision is made nor is any scarce resource used in
production) different (weakly positive) amounts of the same good. $y_{it}$ is the stochastic output of unit i at time t and $y_t = (y_{1t}, \dots, y_{nt})$ is the vector of output levels of the economy at time t.
The good is perishable (none can be saved so there is no riskless asset), so aggregate consumption at time t must satisfy: $0 \le c_t \le \sum_{i=1}^{n} y_{it} = \mathbf{1}' y_t$ (consumption is weakly positive).
$c_t$ is a choice variable (recall: there is no investment of the consumption good in production).
$y_t$ follows an exogenously specified Markov process defined by the transition function:
$F(y', y) = \Pr\{y_{t+1} \le y' \mid y_t = y\}$; the agent cannot manipulate this in any way.
Each productive unit has one (perfectly divisible) share of stock outstanding which trades on a
competitive market.
Ownership of a share at time t (“the beginning of date t”) entitles the owner to the time t production of the associated “tree”. Shares then trade at (“the end of”) time t at ex-dividend prices $p_t = (p_{1t}, \dots, p_{nt})$ to assign ownership of the time t+1 production.
Denote a consumer’s beginning of period t holdings as $z_t = (z_{1t}, \dots, z_{nt})$; thus at time t the agent chooses $c_t$ and $z_{t+1}$ (what to eat and what to buy at prices $p_t = (p_{1t}, \dots, p_{nt})$).
The representative agent nature of the model and the assumption of increasing utility imply that, in equilibrium, the following must be true: $c_t = \sum_{i=1}^{n} y_{it}$ (consume or it vanishes, there is no other use) and $z_{t+1} = (1, 1, \dots, 1) \ \forall t$ (demand must equal supply in equilibrium).
Because the only uncertainty in the model is introduced via “production” and because utility is
recursive – at each date the choice problem looks the same – price should be a fixed function of the
“useful history” of production. Given that production, yt, follows a Markov process the useful
history of production is summarized in the current state. Thus, the price vector at the end of date t
is a fixed function of the time t production: pt = p(yt).
Therefore, knowing the transition function governing production $F(y', y)$ and the price function $p(y_t)$ will allow determination of the process followed by price.
Similarly, $c_t(\cdot)$ (current consumption) and $z_{t+1}(\cdot)$ (future holdings), which are the agent’s time t choice variables, depend on current holdings ($z_t$), current production ($y_t$), and what you think the price will be ($p_t$) at each future point in time. So we have fixed functions $c_t = c_t(z_t, y_t, p_t)$ and $z_{t+1} = z_{t+1}(z_t, y_t, p_t)$.
To “close” the model - Lucas uses rational expectations – the idea that the anticipated or
hypothesized price function used in the agent’s optimization problem is the same as the realized
market clearing price function (the true price function).
Write the agent’s optimization problem as:
$$\max_{c_0, x} \ u(c_0) + E[\beta u(x' y_1)] \quad \text{s.t.} \quad c_0 + p'x \le y_0'\mathbf{1} + p'\mathbf{1}$$
To transform this result into something familiar, recall that everything here is in real terms and that
there is a single consumption good.
So,
$W_1 = c_1$ (wealth at t=1 results only from “dividends”)
$W_1 = R_1^W (W_0 - c_0)$
where the return to aggregate wealth is: $R_1^W = \sum_{i=1}^{n} w_i R_1^i = \sum_{i=1}^{n} w_i \dfrac{y_{i1}}{p_i}$ with $\sum_i w_i = 1$.
$R_1^W$ is the return on aggregate wealth or the return on the market portfolio.
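Now write:
$$\beta \frac{(c_1 - c^*)}{(c_0 - c^*)} = \beta \frac{[R_1^W (W_0 - c_0) - c^*]}{(c_0 - c^*)} = \frac{-\beta c^*}{c_0 - c^*} + \frac{\beta (W_0 - c_0)}{c_0 - c^*} R_1^W = a_0 + b_0 R_1^W$$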
So, the state price density (stochastic discount factor or marginal rate of substitution) can be written
as a linear function of the return to aggregate wealth or the return on the market portfolio – which
leads to CAPM pricing as we know….and it should given the assumption of quadratic utility. We
could instead just note that there must be 1fs in this model and proceed from that direction using
the pricing results we developed previously.
which basically says that at the optimum the cost of buying marginally more of any asset, $u'(\sum_i y_i)\, p_i(y)$ (cost put in utility terms), will equal the expected marginal benefit of the purchase (again in utility terms). The same interpretation as the FOC given above.
This simply uses market clearing conditions, rewrites the expectation, and restores (for clarity) some
time subscripts that Lucas removes due to the stationarity of the decision variables. Note this holds
not just for times 0 and 1 but for any t and t+1.
Now write:
$$p_i(y_0) = E\left[\beta \frac{u'(c_1)}{u'(c_0)} \big(y_{i1} + p_i(y_1)\big)\right]$$
Something that looks a lot like an old friend, but not quite – the difference derives from the multi-
period nature of the problem used here. We can think of the expression as pointing out that in the
infinite horizon model you derive two benefits from owning assets, dividends and capital gains.
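Now recall that in this problem, price is a fixed function of aggregate output (it is independent of time):
$$p_i(y_0) = E\left[\beta \frac{u'(c_1)}{u'(c_0)} y_{i1}\right] + E\left[\beta \frac{u'(c_1)}{u'(c_0)} p_i(y_1)\right]$$
$$= E\left[\beta \frac{u'(c_1)}{u'(c_0)} y_{i1}\right] + E\left[\beta^2 \frac{u'(c_2)}{u'(c_0)} \big(y_{i2} + p_i(y_2)\big)\right]$$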
More familiarly,
$$p_i(y_0) = \sum_{t=1}^{\infty} E[m_{0t}\, y_{it}]$$
(to anticipate the notation in our next lecture we write m as the stochastic discount factor instead of λ)
where $m_{0t} = \beta^t \dfrac{u'(c_t)}{u'(c_0)} = m_{01}\, m_{12}\, m_{23}\, m_{34} \cdots m_{t-1,t}$
So, the multi-period version of our standard pricing representation is just a natural extension of the static
model, today’s price is today’s value of the entire stream of dividends that are expected to be received
from ownership of the asset. Valuation at each date is done by multiplying the payoff by the state price
density or stochastic discount factor for that date relative to today.
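As a rough illustration of this present-value representation, the sketch below prices a single tree under strong simplifying assumptions that are not in the notes: power utility with marginal utility c to the power minus gamma, no uncertainty, dividends growing at a constant gross rate, and consumption equal to the dividend. The function names and parameter values are hypothetical. It checks the truncated sum against the geometric-series closed form and checks the one-period recursion, price today equals the one-period discount factor times next period's dividend plus price.

```python
def sdf(beta, gamma, c0, ct, t):
    """Multi-period discount factor m_{0t} = beta**t * u'(c_t) / u'(c_0) for power utility."""
    return beta ** t * (ct / c0) ** (-gamma)


def tree_price(beta, gamma, y0, growth, horizon):
    """Sum of m_{0t} * y_t for t = 1..horizon, with y_t = y0 * growth**t and c_t = y_t."""
    price = 0.0
    for t in range(1, horizon + 1):
        yt = y0 * growth ** t
        price += sdf(beta, gamma, y0, yt, t) * yt
    return price


beta, gamma, growth = 0.95, 2.0, 1.02  # hypothetical parameter values
q = beta * growth ** (1 - gamma)       # per-period ratio of the geometric series
print(tree_price(beta, gamma, 1.0, growth, 2000), q / (1 - q))
```

With a finite horizon the recursion holds exactly when next period's price is computed over a horizon that is one period shorter.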
Proposition 1:
This tells us that for each function p(y) we might choose, there exists a unique value function v(z, y; p) that satisfies condition (i), i.e.:
$$v(z, y) = \max_{c, x} \left\{ u(c) + \beta \int v(x, y')\, dF(y', y) \right\}$$
$$\text{s.t.} \quad c + p(y) \cdot x \le y \cdot z + p(y) \cdot z, \quad c \ge 0, \quad 0 \le x \le \bar{z}$$
The value function v() represents the optimal consumption and investment decisions for the agent given
the expected price function p(y).
Proposition 1 also establishes that for each output vector y, the value function v(z, y; p) is an increasing
function of z – so we have a well-behaved problem.
Proposition 2:
Gives us the derivative of v(z, y; p) with respect to z
With this derivative and the equilibrium conditions in (ii), we can derive the stochastic Euler equation (6)
that represents the solution to the problem.
Then, he notes that (6) does not involve the particular value function v(z, y; p) used in its own derivation.
Thus, (6) must hold for any equilibrium price function (see proposition 1). Conversely, if p*(y) solves (6)
and v(z, y; p*) is constructed as in proposition 1 to be the unique value function associated with p* then
the pair p*(y) and v(z, y; p*) represent an equilibrium. Thus, all equilibrium price functions solve (6) and
any solution of (6) is an equilibrium price function.
Proposition 3:
This provides that there is exactly one solution to (6) and so exactly one equilibrium price function for
this economy.
In the endowment economy asset prices adjust to conform to the consumption pattern.
Example:
The two date structure of the last example is unattractive because most investors do not set their
portfolios, consume, and then die. We can derive our old friend the CAPM in an intertemporal context
by substituting for the 2nd date utility function with a quadratic value function.
Thus, we suppose the investor cares about current consumption and the wealth she carries forward.
To simplify things (also necessary for the quadratic example) we are also going to place an extra restriction on the transition function $F(y', y)$ by requiring the return to aggregate wealth $R^W$ to be i.i.d. each period.
If such an investor considers whether to buy marginally more of an asset for price $p_{i0}$ and receive an increase in its payoff at time 1 (of $y_{i1}$ per unit), the “FOC” associated with this decision is:
$$p_{i0}\, u'(c_0) = \beta E_0[v'(W_1)(y_{i1} + p_{i1})]$$
Which gives us:
$$p_{i0} = E\left[\beta \frac{v'(W_1)}{u'(c_0)} (y_{i1} + p_{i1})\right] \quad \text{or,} \quad m_{01} = \beta \frac{v'(W_1)}{u'(c_0)}$$
Noting that at an optimum the marginal value of an extra penny consumed must equal the marginal value of a penny saved, or $u'(c_0) = v'(W_0 - c_0)$, we can write
$$m_{01} = \beta \frac{v'(W_1)}{v'(W_0 - c_0)}$$
Now, impose the restriction that the value function is quadratic: $v(W_1) = -\dfrac{\eta}{2}(W_1 - W^*)^2$
Then, as we did above, we can write $m_{01} = a_0 + b_0 R_1^W$, which tells us that we have a case where the CAPM holds period by period if our value function makes any sense.
Now, let’s see where this assumed quadratic value function comes from:
To derive the CAPM result we used two assumptions: (1) the value function depends only on wealth
and (2) it’s a quadratic function of wealth. The first told us pricing would be done in terms of wealth
or returns on aggregate wealth and the second told us it would be a linear function of the returns on
aggregate wealth that sets prices.
We really want to start from:
$u(c, c, c, \dots) = E_t \sum_{j=0}^{\infty} \beta^j u(c_{t+j})$ as in the Lucas paper.
Let’s continue to talk in terms of returns to wealth or the return on the market portfolio – this is just
thinking in terms of all the n assets together and of each “y” as a dividend.
So, paying the price pt gets you yt+1 (dividend) and pt+1 (capital gain)
Now, let’s define a value function. As Lucas does, we’ll label it $v(W_t)$ rather than v(z, y). We drop the z since we will suppress the portfolio choice in what comes and we substitute W for y since we are putting things in terms of wealth.
$$v(W_t) = \max_{c, x} E_t \sum_{j=0}^{\infty} \beta^j u(c_{t+j})$$
This is the power of using a value function in dynamic programming – it allows us to express an
infinite-period problem as if it were a two-date problem; if we’re careful.
Then, we plug this optimal consumption back into the equation for v(Wt) – if the guess was right, we
will get a quadratic function for v(Wt) and be able to determine the unknown parameters.
The “two-date” problem for consumption choice (continue to submerge x since it won’t affect what we want to learn):
$$v(W_t) = \max_{c_t} \left\{ -\tfrac{1}{2}(c_t - c^*)^2 - \beta \tfrac{\eta}{2} E_t (W_{t+1} - W^*)^2 \right\}$$
subject to $W_{t+1} = R_{t+1}^W (W_t - c_t)$
Substitute the constraint into the maximand and take the derivative with respect to $c_t$.
$$0 = -(c_t - c^*) + \beta \eta E_t\big[(R_{t+1}^W (W_t - c_t) - W^*) R_{t+1}^W\big]$$ at the optimum
Let $\hat{c}_t$ represent the optimum, so:
$$\hat{c}_t = c^* + \beta \eta E_t\big[(R_{t+1}^W (W_t - \hat{c}_t) - W^*) R_{t+1}^W\big]$$
$$\hat{c}_t = \frac{c^* + \beta \eta E[(R_{t+1}^W)^2]\, W_t - \beta \eta W^* E(R_{t+1}^W)}{1 + \beta \eta E[(R_{t+1}^W)^2]}$$
Thus $\hat{c}_t$ is linear in $W_t$. Since $v(W_t)$ is a quadratic function of $W_t$ and $\hat{c}_t$, and a quadratic function of a linear function is quadratic, the value function is indeed a quadratic function of $W_t$. The complete solution of the problem solves for the unknown parameters $\eta$ and $W^*$.
This is, essentially, what Lucas does in a much more general framework and with much greater
precision.
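The consumption rule from the quadratic "two-date" problem can be sanity-checked numerically. The sketch below assumes the first order condition and the closed-form rule in the form given in Cochrane's treatment of the quadratic value function, and uses a made-up three-point distribution for the return on wealth along with hypothetical parameter values. It verifies that the rule sets the first order condition to zero and that consumption is linear in wealth.

```python
def optimal_consumption(W, c_star, W_star, beta, eta, returns, probs):
    """Closed-form consumption rule for the quadratic two-date problem."""
    ER = sum(p * r for p, r in zip(probs, returns))
    ER2 = sum(p * r * r for p, r in zip(probs, returns))
    k = beta * eta
    return (c_star + k * ER2 * W - k * W_star * ER) / (1 + k * ER2)


def foc_residual(c, W, c_star, W_star, beta, eta, returns, probs):
    """Value of -(c - c*) + beta*eta*E[(R(W - c) - W*) R]; zero at the optimum."""
    expectation = sum(p * (r * (W - c) - W_star) * r for p, r in zip(probs, returns))
    return -(c - c_star) + beta * eta * expectation


# Hypothetical parameters and a made-up i.i.d. return distribution.
params = dict(c_star=10.0, W_star=20.0, beta=0.95, eta=0.5,
              returns=[0.9, 1.05, 1.2], probs=[0.25, 0.5, 0.25])
c_hat = optimal_consumption(15.0, **params)
print(c_hat, foc_residual(c_hat, 15.0, **params))
```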
$R_f = \dfrac{1}{E[m]}$. Demonstrating this is simple.
If a risk free asset exists, m must price it:
$1 = E[m R_f] \Rightarrow R_f = \dfrac{1}{E[m]}$ since $R_f$ is a constant.
Let’s use this relation to consider the economics behind risk free rates:
Assume $u(c) = \dfrac{1}{1-\gamma} c^{1-\gamma}$, power utility for illustration (log utility as $\gamma \to 1$), so $u'(c) = c^{-\gamma}$.
(B) Real rates are high when consumption growth is high. If consumption growth is high, it
requires a high rate of interest to induce investors to consume less now in return for more
consumption later. This interpretation considers Rf to be determined by the consumption
pattern which follows production (as in Lucas).
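(C) Real rates are more sensitive to changes in consumption growth when $\gamma$ is large:
$$\frac{\partial R_f}{\partial (c_{t+1}/c_t)} = \frac{\gamma}{\beta} \left(\frac{c_{t+1}}{c_t}\right)^{\gamma - 1}$$ (increasing in $\gamma$)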
If $\gamma$ is large, then utility is more concave (for this utility function, $\gamma$ measures both relative risk aversion, aversion to consumption changing across states of nature, and intertemporal substitution, aversion to consumption changing across time periods). With more concave utility, the investor wants very badly to maintain a smooth consumption stream across time
and across states of nature. This implies investors are less willing to change the consumption
stream across time in response to interest rate incentives. Thus, it takes a bigger change in Rf
to induce the investor to a given consumption growth. Said differently, consumption is less
sensitive to interest rates when the desire for a smooth consumption path is high.
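Points (B) and (C) can be illustrated with a short calculation. The sketch assumes power utility and no uncertainty, so that the gross risk free rate is consumption growth raised to the power gamma, divided by beta. The parameter values are hypothetical. It shows that the rate rises with consumption growth and that its sensitivity to growth is larger when gamma is larger.

```python
def risk_free_rate(beta, gamma, growth):
    """Gross risk free rate R_f = 1 / (beta * growth**(-gamma)) under certainty."""
    return growth ** gamma / beta


def sensitivity(beta, gamma, growth):
    """Derivative of R_f with respect to gross consumption growth."""
    return gamma * growth ** (gamma - 1) / beta


beta = 0.98  # hypothetical subjective discount factor
for gamma in (2.0, 5.0):
    print(gamma, risk_free_rate(beta, gamma, 1.02), sensitivity(beta, gamma, 1.02))
```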
Why? Investors don’t like uncertainty about consumption. Assets that pay off more in states where you are wealthy and less in states where you are poor add to the volatility of consumption. You require a lower price on such an asset to hold it, i.e. it provides compensation for the risk, or a risk premium.
In terms of returns:
$$1 = E[m z_i] = E[m]E[z_i] + Cov(m, z_i)$$
so,
$$1 = \frac{E[z_i]}{R_f} + Cov(m, z_i) \quad \text{or,} \quad E[z_i] = R_f - R_f\, Cov(m, z_i)$$
$$E[z_i] = R_f - \frac{Cov(u'(c_{t+1}), z_i)}{E[u'(c_{t+1})]}$$
All assets have an expected return that is the risk free rate plus a premium that is positive for assets whose returns covary positively with consumption.
If Cov(m,y) = 0, then $v(y) = \dfrac{E[y]}{R_f}$, no matter how volatile is y (how large is $\sigma^2(y)$). Recall $\sigma^2(y)$ has no first order effect on consumption volatility.
For any random payoff y, consider the decomposition: $y = proj(y \mid m) + \varepsilon$ where $proj(y \mid m)$ is the projection of y on m. This is the portion of y’s volatility that is perfectly correlated with m (it’s like $\beta m$ where $\beta$ is a regression coefficient from a regression of y on m with no intercept).
$$proj(y \mid m) = \frac{E[my]}{E[m^2]}\, m$$
Further, we know that $E[\varepsilon] = E[\varepsilon m] = 0$ by construction.
Then, the value of the projection of y on m is equal to the value of y itself.
$$v(proj(y \mid m)) = v\left(\frac{E[my]}{E[m^2]}\, m\right) = E\left[m^2 \frac{E[my]}{E[m^2]}\right] = E[my] = v(y)$$
So, $\varepsilon$ must have a price of zero: $v(\varepsilon) = 0$; its expectation is zero and it is orthogonal to m.
Note that here the $\varepsilon$’s can be uncorrelated across assets (as is assumed in the APT) or not (as allowed in the CAPM). But, since this is based on the absence of arbitrage and the APT is as well, and the CAPM is an equilibrium model, we knew that would follow.
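The decomposition can be verified in a small discrete example. The four equally likely states, the discount factor values, and the payoff values below are made up for illustration. The sketch computes the projection of the payoff on m, confirms the residual is orthogonal to m (so it has zero price), and confirms the projection has the same price as the payoff.

```python
probs = [0.25, 0.25, 0.25, 0.25]  # hypothetical state probabilities
m = [1.10, 1.00, 0.95, 0.85]      # hypothetical discount factor, state by state
y = [0.5, 1.0, 1.5, 3.0]          # hypothetical payoff, state by state


def expect(x):
    """Expectation of a state-by-state list under probs."""
    return sum(p * xi for p, xi in zip(probs, x))


def value(x):
    """Price of payoff x: v(x) = E[m x]."""
    return expect([mi * xi for mi, xi in zip(m, x)])


coef = value(y) / value(m)                   # E[my] / E[m^2], since value(m) = E[m*m]
proj = [coef * mi for mi in m]               # projection of y on m
eps = [yi - pi for yi, pi in zip(y, proj)]   # residual
print(value(y), value(proj), value(eps))
```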
(4) Expected Return/Beta Representations
Assume there exists a riskless asset:
$$1 = E[m z_i] \ \ \forall i \ \Rightarrow \ 1 = E[m]E[z_i] + Cov(m, z_i)$$
$$E[z_i] = R_f - R_f\, Cov(m, z_i) = R_f + \frac{Cov(m, z_i)}{Var(m)} \left(-\frac{Var(m)}{E[m]}\right) = R_f + \beta_{i,m} \lambda_m$$
where $\lambda_m$ is the price per unit risk (and a function of Var(m)) and $\beta_{i,m}$ measures the quantity of risk for asset i.
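A quick numerical check of this beta representation, using made-up states, discount factor values, and payoffs. The asset is priced with the discount factor, converted to a gross return, and its expected return is compared with the risk free rate plus beta times the price of risk.

```python
probs = [0.25, 0.25, 0.25, 0.25]  # hypothetical state probabilities
m = [1.10, 1.00, 0.95, 0.85]      # hypothetical discount factor by state
payoff = [0.5, 1.0, 1.5, 3.0]     # hypothetical asset payoff by state


def expect(x):
    return sum(p * xi for p, xi in zip(probs, x))


def cov(x, y):
    ex, ey = expect(x), expect(y)
    return expect([(xi - ex) * (yi - ey) for xi, yi in zip(x, y)])


price = expect([mi * xi for mi, xi in zip(m, payoff)])  # price = E[m * payoff]
z = [xi / price for xi in payoff]                       # gross return, so E[m z] = 1
rf = 1.0 / expect(m)                                    # R_f = 1 / E[m]
beta_im = cov(m, z) / cov(m, m)                         # quantity of risk
lambda_m = -cov(m, m) / expect(m)                       # price of risk (negative)
print(expect(z), rf + beta_im * lambda_m)
```

Here the payoff is high when m is low, so beta is negative and the asset earns a positive premium over the risk free rate.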
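To relate this to the underlying variables of interest, recall $m = \beta \left(\dfrac{c_{t+1}}{c_t}\right)^{-\gamma}$ assuming power utility.
Performing a Taylor’s series expansion of $E[z_i] = R_f + \dfrac{Cov(m, z_i)}{Var(m)} \left(-\dfrac{Var(m)}{E[m]}\right)$ with $m = \beta \left(\dfrac{c_{t+1}}{c_t}\right)^{-\gamma}$ around consumption growth, $\dfrac{c_{t+1}}{c_t}$, gives:
$$E[z_i] = R_f + \beta_{i,\Delta c} \lambda_{\Delta c}$$
where $\beta_{i,\Delta c} = \dfrac{Cov(z_i, \Delta c)}{Var(\Delta c)}$, $\lambda_{\Delta c} = \gamma\, Var(\Delta c)$, and $\Delta c = \dfrac{c_{t+1}}{c_t}$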
In the continuous time limit this approximation becomes precise.
This says that expected returns increase linearly with an asset’s beta with consumption growth. This
is the consumption CAPM relation. This falls directly out of the power utility framework.
The price of risk in this case depends upon the risk aversion coefficient of the investor and the
variance of consumption growth (the fundamental risk facing the investor).
The more risk averse are agents or the more risky their consumption, the larger is the expected
return required to induce them to hold risky assets (assets that covary positively with consumption).
Now consider the following: If instead of the SDF (m) being a function of consumption growth we assume $m = a + b z_{Mkt}$, then we know that for any asset i:
$$1 = E[m z_i] = E[a z_i + b z_{Mkt} z_i]$$
or,
$$1 = a \bar{z}_i + b \bar{z}_i \bar{z}_{Mkt} + b\, Cov(z_i, z_{Mkt})$$
or,
$$\bar{z}_i = \frac{1}{a + b \bar{z}_{Mkt}} - \frac{b}{a + b \bar{z}_{Mkt}} Cov(z_i, z_{Mkt}) = R_f - b R_f\, Cov(z_i, z_{Mkt})$$
recognizing that $R_f = 1 / E[m]$.
This must hold for all assets i and also for $z_{Mkt}$. So,
and,
$$b = -\frac{(\bar{z}_{Mkt} - R_f)}{R_f\, Var(z_{Mkt})}$$
So,
$$\bar{z}_i = R_f + \frac{Cov(z_i, z_{Mkt})}{Var(z_{Mkt})} (\bar{z}_{Mkt} - R_f)$$
So if the SDF is a linear function of the returns on the market portfolio we get our familiar CAPM
pricing relation.
One comment: Cochrane likes to use returns in specifying m since then it has a neat interpretation. But,
if m=a +bzMkt it may not always be positive. He’s a little loose with this – a stochastic discount factor
that is not strictly positive can correctly price the assets it is just not the strictly positive SDF guaranteed
by the absence of arbitrage. If we instead rely on the law of one price we know there is a SDF, but it is
not restricted to be positive. In the absence of arbitrage there must be a strictly positive SDF ($m(s) = k e^{-a z_{Mkt}(s)} > 0$) giving a beta pricing representation.
[Figure: mean/standard deviation frontier, with Asset i plotted against $\sigma(z)$]
(B) Only if $|\rho_{m,z_i}| = 1$ does asset i lie on the minimum variance efficient frontier. Thus all portfolios on the frontier are perfectly correlated with the SDF, m. Returns on the upper limb have $\rho_{m,z_i} = -1$, so are perfectly negatively correlated with m and thus, perfectly positively correlated with consumption growth. They are maximally risky and so demand the highest
expected return per unit variance. The converse is true for assets on the lower limb.
(C) All frontier returns are also perfectly correlated with each other. They are all perfectly
correlated with m. Therefore, we know we can span the return of any frontier portfolio
using any two distinct frontier returns. For example, pick any single frontier return $z^{MV1}$ (not $R_f$). Note that $R_f$ is also on the frontier. Any other frontier return $z^{MV2}$ can be written $z^{MV2} = R_f + a(z^{MV1} - R_f)$ for some constant a.
Show this is true as a homework problem.
(D) Since each return on the frontier is perfectly correlated with m, we can find constants a, b, d,
e such that for any minimum variance return:
$m = a + b z^{MV}$ and, $z^{MV} = d + e m$
What does this mean? It means that any mean-variance efficient portfolio contains all the
pricing information in m. For example, in the CAPM the market contains all the pricing
information in m, which we knew already. Its return is a sufficient statistic for m or marginal
utility. Thus, zMkt can serve as m – requires zMkt is on the frontier. Given any mean-variance
efficient return and the risk free rate, we can find a SDF that prices all assets, and vice versa.
See problems 1-3 in Cochrane.
(E) Given m, we can also construct a single-beta representation such that expected returns are expressed in a single-beta model using the return on any mean-variance efficient portfolio (except $R_f$):
$$E[z_i] = R_f + \beta_{i,MV} (E[z^{MV}] - R_f)$$
(F) All asset returns can be decomposed into a “priced” or systematic component and a “non-
priced” or idiosyncratic component. The priced component is perfectly correlated with m
and any frontier return and so this component would “plot on the frontier” (see picture on
previous page). The unpriced component is uncorrelated with m and generates no expected
return or risk adjustment. (Recall the decomposition we did in our development of the
CAPM.)
Note: Assets “inside” the frontier are not “worse” than assets on the frontier. The frontier
and its interior characterize equilibrium asset returns. Rational investors are happy to hold
all assets. You just don’t put all your wealth in an inefficient asset, but you are happy to put
small amounts of wealth in many such assets.
(6) The Slope of the Mean/Standard Deviation Frontier and “The Equity Premium Puzzle”
The ratio of the mean excess return to standard deviation is known as the Sharpe Ratio:
$$\text{Sharpe Ratio} = \frac{E[z_i] - R_f}{\sigma(z_i)}$$
This is more interesting and a better indication of performance than mean return alone. For example, if you borrow at the risk free rate and invest the proceeds into some risky security, you increase expected return but you don’t increase the Sharpe ratio since $\sigma_p$ increases at the same rate as the excess return $E[z_p] - R_f$.
The slope of the mean/standard deviation frontier is the maximal Sharpe ratio. It tells us how much
more mean return you can get by taking on added (priced) volatility.
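The leverage argument is easy to confirm numerically. In the sketch below the function names are illustrative, and `leverage` is the fraction of own wealth placed in the risky security (values above one mean borrowing at the risk free rate). The figures 9%, 1%, and 16% are used only as sample inputs.

```python
def sharpe_ratio(mean, rf, sigma):
    """Mean excess return per unit of standard deviation."""
    return (mean - rf) / sigma


def levered(mean, rf, sigma, leverage):
    """Mean and standard deviation of a position that invests `leverage` times
    own wealth in the risky security, financed by borrowing at rf."""
    return rf + leverage * (mean - rf), leverage * sigma


mean, rf, sigma = 0.09, 0.01, 0.16
lev_mean, lev_sigma = levered(mean, rf, sigma, 2.0)
print(sharpe_ratio(mean, rf, sigma), sharpe_ratio(lev_mean, rf, lev_sigma))
```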
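From $E[z_i] = R_f - \rho_{m,z_i} \dfrac{\sigma(m)}{E[m]} \sigma(z_i)$
We can see that:
$$\frac{|E[z_i] - R_f|}{\sigma(z_i)} \le \frac{\sigma(m)}{E[m]} = \sigma(m) R_f$$ for all assets i.
For $z^{MV}$ (that is, for frontier returns), since their correlation with m is one in absolute value:
$$\frac{|E[z^{MV}] - R_f|}{\sigma(z^{MV})} = \frac{\sigma(m)}{E[m]} = \sigma(m) R_f.$$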
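Thus the slope of the frontier is governed by the volatility of m and this slope we know determines the risk premium.
Consider again the power utility framework: $u'(c) = c^{-\gamma}$ and so $m = \beta \left(\dfrac{c_{t+1}}{c_t}\right)^{-\gamma}$
Then:
$$\frac{|E[z^{MV}] - R_f|}{\sigma(z^{MV})} = \frac{\sigma\left[(c_{t+1}/c_t)^{-\gamma}\right]}{E\left[(c_{t+1}/c_t)^{-\gamma}\right]}$$
$\sigma(m)$ increases if consumption is more volatile or if $\gamma$ is large.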
If consumption growth is lognormal, “it can be shown,” using the transformation above, that:
$$\frac{|E[z^{MV}] - R_f|}{\sigma(z^{MV})} = \sqrt{e^{\gamma^2 \sigma^2(\Delta Ln(c_{t+1}))} - 1} \approx \gamma\, \sigma(\Delta Ln(c_{t+1}))$$
This shows more directly that the slope of the mean/standard deviation frontier is higher if
consumption growth is more volatile or if risk aversion is higher.
In post-war data (50 years) for the U.S., $E[z_{Mkt}] \approx 9\%$, $\sigma(z_{Mkt}) \approx 16\%$, and $R_f \approx 1\%$ (all in real terms).
Aggregate consumption growth has had a mean about equal to 1% and a standard deviation of about 1%. We can plug these values into the above equation to get:
$$\frac{9\% - 1\%}{16\%} = 0.50 \approx \gamma (.01)$$
This implies a risk aversion coefficient of roughly 50!
This is an order of magnitude too high to be believable. The interpretation is that consumption is not
volatile enough to explain asset returns unless investors have risk aversion coefficients much larger than
we think they are. This is the point of the famous Mehra-Prescott paper.
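The arithmetic behind the puzzle, using the figures quoted above, can be reproduced in a few lines. The exact inversion of the lognormal formula is included as an extra check; it assumes the Sharpe ratio equals the square root of e raised to gamma squared times the variance of log consumption growth, minus one, and gives a value close to the linear approximation.

```python
import math

mean_mkt, sigma_mkt, rf = 0.09, 0.16, 0.01  # post-war U.S. figures quoted in the text
sigma_dc = 0.01                             # std. dev. of consumption growth

sharpe = (mean_mkt - rf) / sigma_mkt        # slope of the frontier
gamma_approx = sharpe / sigma_dc            # from sharpe ~ gamma * sigma_dc
# Exact inversion of sharpe = sqrt(exp(gamma**2 * sigma_dc**2) - 1):
gamma_exact = math.sqrt(math.log(1.0 + sharpe ** 2)) / sigma_dc

print(sharpe, gamma_approx, gamma_exact)
```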
Possible Explanations:
1. People are much more risk averse than we think.
2. Stock returns are largely a result of unexpected good fortune over the last 50 years and are not
indicative of expectations.
3. There may be real problems with measures of consumption.
4. Something is deeply wrong with the unconditional model.
This question/issue has been a central focus of the asset pricing literature for the last 15-20 years.
The law of one price implies there is a discount factor (SDF) linear in the factors that prices the assets:
$m = a + b f_1 + d f_2$.
It is also true that if returns are given as above, we can use 1=E[mz] to derive the beta pricing relation:
$$z_i = \bar{z}_i + \beta_{i1} f_1 + \beta_{i2} f_2$$
The law of one price implies we can use m to price both sides of this equation:
$$1 = E[z_i m] = \bar{z}_i \bar{m} + \beta_{i1} E[m f_1] + \beta_{i2} E[m f_2]$$
$$\bar{z}_i = R_f - R_f \beta_{i1} E[m f_1] - R_f \beta_{i2} E[m f_2]$$
$$= R_f + \beta_{i1} (-R_f E[m f_1]) + \beta_{i2} (-R_f E[m f_2])$$
$$= R_f + \beta_{i1} \lambda_1 + \beta_{i2} \lambda_2$$
Now, form portfolios with returns:
$$z_1 = \bar{z}_1 + 1 f_1 + 0 f_2$$
$$z_2 = \bar{z}_2 + 0 f_1 + 1 f_2$$
The above pricing equation must hold for these portfolios, so:
$\bar{z}_1 = R_f + \lambda_1$ so, $\lambda_1 = \bar{z}_1 - R_f$ and,
$\bar{z}_2 = R_f + \lambda_2$ so, $\lambda_2 = \bar{z}_2 - R_f$
So, this representation takes us a ways towards developing the APT. See Cochrane Ch. 9.
Now, if investors are risk neutral (or if consumption is constant), $d_{t+1} = 0$, and $\beta \approx 1$ (the time horizon is very short), then the FOC becomes:
$$p_t = E_t[p_{t+1}]$$
or, we can write:
$$p_{t+1} = p_t + \varepsilon_{t+1}$$
So, prices follow a martingale (or a random walk if we further assume that $\sigma^2(\varepsilon_{t+1})$ is constant across time) and expected returns should be constant across time.
The basic FOC really tells us that prices (adjusting for dividends) scaled by marginal utility follow a
martingale. For short horizons prices should be close to a martingale since consumption and risk
aversion don’t vary much over a day. Returns over longer horizons may be predictable.
we see that variation in $E[z_{t+1}] - R_{ft}$ could come from variation in $\sigma_t(z_{t+1})$. However, this is not borne out in the data, as variables correlated with mean changes are not correlated with variance changes (and vice versa).
Predictable excess returns can then come from changes in aggregate risk, $\sigma_t(\Delta Ln(c_{t+1}))$, or risk aversion, $\gamma_t$, or perhaps from $\rho_t(m_{t+1}, z_{t+1})$.
The literature doesn’t consider $\rho_t(m_{t+1}, z_{t+1})$ much. But, it is natural to think that $\sigma_t(\Delta Ln(c_{t+1}))$ and $\gamma_t$ change over the business cycle, which is the horizon over which returns are relatively predictable (which is not short).
Now, suppose that you can purchase the stream $\{d_{t+j}\}$ for $p_t$; the FOC gives us the pricing formula directly (developed in the Lucas paper):
$$p_t = E_t \sum_{j=1}^{\infty} \beta^j \frac{u'(c_{t+j})}{u'(c_t)} d_{t+j} = E_t \sum_{j=1}^{\infty} m_{t,t+j}\, d_{t+j}$$
Now, by noticing that this holds at t and at t+1, and noting that multiplying $p_{t+1}$ by $\beta \dfrac{u'(c_{t+1})}{u'(c_t)}$ puts the equation for $p_{t+1}$ in terms of $m_{t,t+j}$ rather than $m_{t+1,t+1+j}$, we can then get to:
So, the two date solution and the multi-period solution are equivalent.
10 Option Pricing
Background
Options are side bets between investors concerning the future price level of an underlying asset
(which we will refer to as a stock for simplicity) relative to a fixed benchmark. Even more generally,
an option is a bet about the future realization of a random outcome (the weather, for example). As options
are created when two investors take opposite sides of the “bet,” they are in zero net supply.
A call option is a financial security which gives its owner the right (but not the obligation) to buy an
underlying asset (stock) for a pre-specified price (this is the fixed benchmark called the strike or
exercise price, k) on (or before) the expiration date (T) of the option contract. A “European” call
option can be exercised only on the expiration date whereas an American call can be exercised at any
time up to and including the expiration date.
A put option gives its owner the right (but not the obligation) to sell an underlying asset at the strike
price on or before the expiration of the put option contract.
Given the transactional complexity (at least for the first time you discuss options) of a call option,
we will first identify several relevant prices in the hopes of avoiding confusion (we will concentrate
our discussion on call options):
c or ct is the current price/value of the call option
cT is the payoff/value of the call at the expiration date
s or st is the current price/value of the underlying asset
sT is the price/value of the underlying asset at T
k is the contractually specified strike price
Similarly,
p or pt is the current price of the put option
pT is the payoff of the put at expiration
The basic option pricing literature concentrates on finding ct as a function of st and other variables in
a variety of circumstances.
Payoffs at Expiration
A call option has value to its owner at T only if the price of the underlying asset sT is above the strike
price k. Thus buying a call is a bet the price will end up above this benchmark. If it is, the payoff on
the call is sT - k; if not the payoff is not sT - k (which is negative if sT < k) but rather 0 (because the
owner has a choice, an option). The owner of the option simply lets it expire unexercised when the
exercise value would be negative. Note that the buyer of an option purchases a choice (a right), whereas the
seller of an option incurs an obligation.
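Graphically, on the expiration date:
[Figure: payoff of the call at expiration plotted against sT, with the strike k marked on the horizontal axis and the call price ct and -ct marked on the vertical axis.]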
The dotted line is the “profit” of the position which was commonly but incorrectly examined.
A put option works in the “opposite” way. It allows you to sell for k. So, you (the owner) are
interested in (are betting on) states in which the true price is low (lower than k). Note, however, that
you do not have the same interest in a low price as does the seller of a call option.
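$$p_T = \begin{cases} k - s_T & \text{if } s_T \le k \\ 0 & \text{if } s_T > k \end{cases}$$
[Figure: payoff of the put at expiration plotted against sT, with the strike k marked on the horizontal axis.]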
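The two payoff rules can be written as one-line functions. This is a minimal sketch; the function names and the sample strike of 50 are illustrative choices, not part of the text.

```python
def call_payoff(s_T, k):
    """Call payoff at expiration: exercise only when s_T is above the strike k."""
    return max(s_T - k, 0.0)


def put_payoff(s_T, k):
    """Put payoff at expiration: exercise only when s_T is below the strike k."""
    return max(k - s_T, 0.0)


k = 50.0  # illustrative strike
for s_T in (25.0, 50.0, 100.0):
    print(s_T, call_payoff(s_T, k), put_payoff(s_T, k))
```

The `max(..., 0.0)` is the owner's choice: the option is simply left unexercised when exercising would pay a negative amount.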
Put-Call Parity
The law of one price allows us to find a particular relation between the current value of a put, the
current value of a call, the current value of the underlying stock, and a risk free bond’s value.
The time T payoffs of these 2 strategies as a function of the price of the underlying are:
[Figure: time T payoffs of the two strategies plotted against sT; each payoff equals sT - k, which is -k at sT = 0 and zero at sT = k.]
Now, since their future payoffs are equivalent, cT - pT = sT – k, in all states of nature (note that the
only uncertainty for both positions concerns the future, time T, price of the underlying stock), their
current prices must also be equal.
So, $c_t - p_t = s_t - \dfrac{k}{R_f}$
Thus, $p_t = c_t - s_t + \dfrac{k}{R_f}$
so we can find $p_t$ if we know $c_t$, hence our concentration, as is traditional, on call options.
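Put-call parity is easy to verify numerically. The sketch below first checks the payoff identity cT - pT = sT - k state by state, then backs out a put price from a call price. The numbers (c = 20, s = 50, k = 50, Rf = 1.25) are assumptions chosen to match the numerical example used later in the section; `put_from_call` is a hypothetical helper name.

```python
def put_from_call(c, s, k, R_f):
    """European put price implied by put-call parity: p = c - s + k / R_f."""
    return c - s + k / R_f


k = 50.0
for s_T in (0.0, 25.0, 50.0, 75.0, 100.0):
    long_call_short_put = max(s_T - k, 0.0) - max(k - s_T, 0.0)
    stock_less_loan = s_T - k  # hold the stock, repay a loan of k at T
    assert long_call_short_put == stock_less_loan

p = put_from_call(c=20.0, s=50.0, k=50.0, R_f=1.25)
print(p)  # 20 - 50 + 40 = 10.0
```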
Proposition (1) – American and European put and call option prices are always at least weakly positive.
This comes from the limited liability of the payoff equations for options.
Proposition (2) – The final payoffs on options are weakly positive and as given above. Options are
never exercised out of the money as the exercise decision is a choice of the owner.
Proposition (3) – American calls and puts must always sell for at least their exercise values, $c_t \ge s_t - k$
(exercise “value” not “price”). Otherwise an immediate arbitrage is available.
Proposition (4) – For two American calls (puts) written on the same underlying with the same strike
price, the call (put) with the greater time to maturity is at least as valuable as the “shorter”
contract. With a “longer” American contract, the owner can do all she could with a shorter
contract and more (i.e. exercise before, on, or after the expiration of the shorter contract), so the
current value of the extra right must be weakly positive.
Proposition (5) – An American call (put) is worth at least as much as a European call (put) with the
same characteristics. Again, the added rights have value. Here, the right is to exercise not only
at maturity, but before T as well.
Proposition (6) – Call option values are non-increasing in the exercise price and put option values are
non-decreasing in the exercise price
Consider calls – if you have two calls, one with a low exercise price and one with a high exercise
price, then whenever the second can be profitably exercised, the first can also be profitably
exercised and for a strictly greater payoff. Also, the first can be profitably exercised in some
states in which the second with the higher exercise price cannot be. The first call must have a
higher current value.
To address this question we examine a portfolio formed by selling 3 calls, buying 2 shares of the
underlying stock, and borrowing $40 now (so at T we pay back $50 = $40 x Rf).
Since, in “all” states of nature the payoff on the portfolio is zero (riskless hedge), its current price must
be zero by the law of one price. So 3c0 - 60 = $0, or c0 = $20.
The law of one price again says c0 = $20 (-3c0 = -$60) and here we see that an appropriately levered
position in the underlying (1) replicates the payoff on the call option (2). Replicating the payoff on the
option is the approach used in the early option pricing literature. This implies that the call option is
redundant, i.e., is already in the asset span. Ross (1976) looks at the usefulness of call options when this
is not true.
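The arithmetic of the riskless hedge can be reproduced in a few lines. The setup of the example is not shown in this section, so the inputs below (s = 50, k = 50, the stock moving to 100 or 25, Rf = 1.25) are assumptions; they are chosen because they reproduce the figures quoted above (borrow $40, repay $50, c0 = $20).

```python
# Assumed inputs, consistent with the numbers quoted in the text.
s, k, R_f = 50.0, 50.0, 1.25
s_up, s_down = 100.0, 25.0
shares, calls_sold, borrowed = 2, 3, 40.0


def hedge_payoff(s_T):
    """Time-T payoff: long 2 shares, short 3 calls, repay the loan with interest."""
    return shares * s_T - calls_sold * max(s_T - k, 0.0) - borrowed * R_f


payoffs = [hedge_payoff(s_up), hedge_payoff(s_down)]

# A zero payoff in every state means a zero cost today:
# shares * s - borrowed - calls_sold * c0 = 0.
c0 = (shares * s - borrowed) / calls_sold
print(payoffs, c0)  # [0.0, 0.0] 20.0
```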
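[Figure: one-period binomial tree for the stock price: s moves to us with probability q and to ds with probability 1-q.]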
One Period Analysis:
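If the call expires in one period, then its price process looks like:
[Figure: one-period binomial tree for the call: c moves to cu = Max(us - k, 0) with probability q and to cd = Max(ds - k, 0) with probability 1-q.]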
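Now, as in the numerical example, form a portfolio containing $\Delta$ shares of stock and the (dollar)
amount $B$ in riskless bonds. The initial cost is $\Delta s + B$ and its payoff distribution is:
$$\Delta s + B \;\longrightarrow\; \begin{cases} \Delta us + R_f B & (\text{pr } q) \\ \Delta ds + R_f B & (\text{pr } 1-q) \end{cases}$$
Choose $\Delta$ and $B$ so that these payoffs equal $c_u$ and $c_d$, that is, $\Delta us + R_f B = c_u$ and $\Delta ds + R_f B = c_d$, which gives
$$\Delta = \frac{c_u - c_d}{(u-d)s} \qquad B = \frac{u c_d - d c_u}{(u-d)R_f}$$
With $\Delta$ and $B$ chosen this way, we have a replicating portfolio and $c = \Delta s + B$ as long as this is not less
than $s - k$. If this is less than $s - k$, then $c = s - k$ (recall the American option cannot sell for less than
its exercise value or it would represent an arbitrage opportunity).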
$$c = \Delta s + B = \left[\frac{R_f - d}{u - d}\, c_u + \frac{u - R_f}{u - d}\, c_d\right] \Big/ R_f$$
(again, or s - k)
This is a risk neutral pricing equation where the risk neutral probabilities are given by:
$$p^* = \frac{R_f - d}{u - d} \qquad 1 - p^* = \frac{u - R_f}{u - d}$$
Note that the risk neutral probabilities depend only on parameters of the stock price process and the risk
free rate. We could also represent the pricing equation using state prices or a state price density (or
stochastic discount factor) by adjusting for the actual probabilities and taking an expectation:
$c_t = E[m_{t,T}\, c_T]$. Question: where is q?
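Finally,
$$\begin{aligned} c &= [p^* c_u + (1-p^*) c_d] / R_f \\ &= [p^*\,\mathrm{Max}(us - k, 0) + (1-p^*)\,\mathrm{Max}(ds - k, 0)] / R_f \quad (\text{or } s - k) \\ &\ge [p^*(us - k) + (1-p^*)(ds - k)] / R_f \\ &= s - k/R_f \;\ge\; s - k \quad \text{if } R_f \ge 1 \text{, as above} \end{aligned}$$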
Note: $\Delta > 0$ and $B < 0$, so the replicating portfolio is again a levered position in the underlying. $\Delta$ is
referred to as the “delta” of the option: the change in option value for a given change in the price of
the underlying stock. The general form of the pricing formula is $\Delta$ times the current price of the
underlying less the amount borrowed (which we can represent as the present value of the exercise price
times some factor) necessary to form the replicating portfolio.
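The one-period result can be sketched in code. The function below computes the replicating portfolio and the risk neutral price side by side. The function name and the sample inputs (s = 50, k = 50, u = 2, d = 0.5, Rf = 1.25) are assumptions chosen to agree with the earlier numerical example.

```python
def one_period_call(s, k, u, d, R_f):
    """One-period binomial call: replicating portfolio and risk neutral price."""
    c_u, c_d = max(u * s - k, 0.0), max(d * s - k, 0.0)
    delta = (c_u - c_d) / ((u - d) * s)          # shares of stock held
    B = (u * c_d - d * c_u) / ((u - d) * R_f)    # bond position (negative = borrowing)
    p_star = (R_f - d) / (u - d)                 # risk neutral probability of "up"
    c_replication = max(delta * s + B, s - k)    # cannot sell below exercise value
    c_risk_neutral = (p_star * c_u + (1.0 - p_star) * c_d) / R_f
    return delta, B, p_star, c_replication, c_risk_neutral


delta, B, p_star, c_rep, c_rn = one_period_call(s=50.0, k=50.0, u=2.0, d=0.5, R_f=1.25)
print(delta, B, p_star, c_rep, c_rn)
```

With these inputs the delta is 2/3 (two shares for every three calls), the borrowing is 40/3 per call, and both routes give a price of 20. Note that the function takes no argument for the actual probability q: it does not enter the price.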
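[Figure: two-period binomial tree. The stock moves from s to us (probability q) or ds (probability 1-q), and then to u²s, uds, or d²s; the call is worth cu or cd after one period and cuu = Max(u²s - k, 0), cud = Max(uds - k, 0), or cdd = Max(d²s - k, 0) at expiration.]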
Given what we have just derived and that, under our assumptions for Rf and the stock price process, the
environment does not change from period to period, we can automatically write:
$$c_u = [p^* c_{uu} + (1-p^*) c_{ud}] / R_f$$
and,
$$c_d = [p^* c_{ud} + (1-p^*) c_{dd}] / R_f$$
Careful, the forms are the same but not everything is. For example, examine the Δ at each node of the
tree using a numerical example.
At time 0 we could again form a portfolio of $\Delta$ shares and $B$ in bonds to replicate the call value $c_u$ if $s \to us$
or $c_d$ if $s \to ds$. Again, the forms of $\Delta$ and $B$ are unchanged but their values are not.
$$\Delta = \frac{c_u - c_d}{(u-d)s} \ge 0 \qquad B = \frac{u c_d - d c_u}{(u-d)R_f} \le 0$$
Simply substitute the new $c_u$ and $c_d$ from above to get the $\Delta s + B$ representation for the current call price.
As before, $c = \mathrm{Max}(\Delta s + B,\; s - k)$.
Alternatively, we can again write the current value of the call as:
$$c = [p^* c_u + (1-p^*) c_d] / R_f$$
again, provided this is at least s-k; otherwise the value is s-k.
Substitution allows us to derive the risk neutral probability representation of the call price:
$$\begin{aligned} c &= [p^{*2} c_{uu} + 2p^*(1-p^*) c_{ud} + (1-p^*)^2 c_{dd}] / R_f^2 \\ &= [p^{*2}\,\mathrm{Max}(u^2 s - k, 0) + 2p^*(1-p^*)\,\mathrm{Max}(uds - k, 0) + (1-p^*)^2\,\mathrm{Max}(d^2 s - k, 0)] / R_f^2 \end{aligned}$$
which we can see is always greater than s-k using the same process as was used above.
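A numerical example makes the point about Δ changing from node to node concrete. This is a sketch under assumed inputs (s = 50, k = 50, u = 2, d = 0.5, Rf = 1.25, the same illustrative values as before); `node` is a hypothetical helper that applies the Δ and B formulas at one node of the tree.

```python
def node(s, c_up, c_down, u, d, R_f):
    """Delta, B and replication value at a node, given next period's call values."""
    delta = (c_up - c_down) / ((u - d) * s)
    B = (u * c_down - d * c_up) / ((u - d) * R_f)
    return delta, B, delta * s + B


s, k, u, d, R_f = 50.0, 50.0, 2.0, 0.5, 1.25  # assumed inputs

# Call values at expiration after two periods.
c_uu, c_ud, c_dd = (max(x - k, 0.0) for x in (u * u * s, u * d * s, d * d * s))

# Work backward: the two date-1 nodes first, then date 0.
delta_u, B_u, c_u = node(u * s, c_uu, c_ud, u, d, R_f)
delta_d, B_d, c_d = node(d * s, c_ud, c_dd, u, d, R_f)
delta_0, B_0, c_0 = node(s, c_u, c_d, u, d, R_f)

print("up node:  ", delta_u, B_u, c_u)
print("down node:", delta_d, B_d, c_d)
print("date 0:   ", delta_0, B_0, c_0)
```

Under these inputs the delta is 0.8 at date 0, rises to 1 after an up move and falls to 0 after a down move, so the replicating portfolio has to be rebalanced even though the formulas for Δ and B keep the same form.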
The n period problem: By extension (of the last equation) we can write:
$$c = \left[\sum_{j=0}^{n} \frac{n!}{j!(n-j)!}\, p^{*j} (1-p^*)^{n-j}\, \mathrm{Max}(u^j d^{n-j} s - k, 0)\right] \Big/ R_f^n$$
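Now, let a be the smallest non-negative integer such that with a up moves and n-a down moves
the option finishes in the money:
$$u^a d^{n-a} s > k$$
This can also be stated as: a is the smallest non-negative integer greater than
$\mathrm{Ln}(k / s d^n) / \mathrm{Ln}(u/d)$.
If a > n the option can never finish in the money and c=0. Otherwise, every term with j < a is zero and, for
j ≥ a, $\mathrm{Max}(u^j d^{n-j} s - k, 0) = u^j d^{n-j} s - k$, so the sum splits into two pieces:
$$c = s\,\Phi[a; n, p^{*\prime}] - k R_f^{-n}\,\Phi[a; n, p^*]$$
where $p^* = \dfrac{R_f - d}{u - d}$ and $p^{*\prime} = \dfrac{u}{R_f}\, p^*$.
Again, $c = \Delta s + B$, where the $\Phi$’s are the complementary binomial distribution function: the probability
of getting a or more u’s in n tries if the probability of u is $p^{*\prime}$ (or $p^*$).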
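As a check on the algebra, the sketch below prices the call both ways: with the full n-period sum and with the complementary binomial form. The function names are illustrative, and a is found by direct search rather than from the logarithm formula, to avoid rounding problems when the ratio of logs is exactly an integer.

```python
from math import comb


def binomial_call_sum(s, k, u, d, R_f, n):
    """n-period call value from the risk neutral sum over j up moves."""
    p = (R_f - d) / (u - d)
    total = sum(
        comb(n, j) * p**j * (1 - p) ** (n - j) * max(u**j * d ** (n - j) * s - k, 0.0)
        for j in range(n + 1)
    )
    return total / R_f**n


def comp_binomial(a, n, p):
    """Probability of a or more successes in n trials with success probability p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(a, n + 1))


def binomial_call_closed(s, k, u, d, R_f, n):
    """Same value written as s * Phi[a; n, p'] - k * R_f**(-n) * Phi[a; n, p]."""
    p = (R_f - d) / (u - d)
    p_prime = (u / R_f) * p
    # Smallest number of up moves that leaves the call in the money (n + 1 if none).
    a = next((j for j in range(n + 1) if u**j * d ** (n - j) * s > k), n + 1)
    if a > n:
        return 0.0
    return s * comp_binomial(a, n, p_prime) - k * R_f ** (-n) * comp_binomial(a, n, p)


print(binomial_call_sum(50.0, 50.0, 2.0, 0.5, 1.25, 2))
print(binomial_call_closed(50.0, 50.0, 2.0, 0.5, 1.25, 2))
```

Both calls use the assumed two-period inputs from the earlier example and should print the same value.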
Here we let the “n” from the n-period problem get large. In doing so, we want to let the length of a
period of time go to zero. We need to take some care in doing this so that we don’t wind up with
ridiculous parameter values that say the stock price is expected to change by 20% over an instant in
time rather than over a year’s time.
Let h represent the elapsed time between successive stock price changes (this is what we will let go to
zero), and let T be the fixed length of calendar time to expiration (fixed number of “units” of time),
and n is the number of periods of length h prior to expiration (so $h = T/n$). We want to see what
happens as $n \to \infty$ or $h \to 0$.
We first need to adjust Rf. Rf is one plus the rate of interest over a “unit” of calendar time, so over T
units, $R_f^T$ is the riskless return.
Denote by $\hat{R}_f$ one plus the riskless rate over a period of length h. Then, over the time to expiration
there are n such periods. So, $\hat{R}_f^n$ is the total return until expiration (date T).
We therefore require: $\hat{R}_f^n = R_f^T$, so $\hat{R}_f = R_f^{T/n}$.
Over each discrete period, in the n period model we assumed the stock price would experience a one
plus rate of return of u with probability q and d with probability 1-q.
It’s easier to work with Ln(u) or Ln(d) which gives the continuously compounded rate of return over
each period. The continuously compounded return is a binomial random variable with realization
Ln(u) with probability q and Ln(d) with probability (1-q) in each period.
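Let j be the number of up moves in the n periods and s* the stock price at expiration, so that
$\mathrm{Ln}(s^*/s) = j\,\mathrm{Ln}(u) + (n-j)\,\mathrm{Ln}(d) = j\,\mathrm{Ln}(u/d) + n\,\mathrm{Ln}(d)$. Then,
$$E[\mathrm{Ln}(s^*/s)] = \mathrm{Ln}(u/d)\,E(j) + n\,\mathrm{Ln}(d)$$
and,
$$\mathrm{Var}[\mathrm{Ln}(s^*/s)] = [\mathrm{Ln}(u/d)]^2\,\mathrm{Var}(j)$$
The expected outcome for each trial is q (the probability of an “up”), so in n trials $E(j) = nq$ and $\mathrm{Var}(j) = nq(1-q)$.
So,
$$E[\mathrm{Ln}(s^*/s)] = [q\,\mathrm{Ln}(u/d) + \mathrm{Ln}(d)]\,n \equiv \hat{\mu} n$$
$$\mathrm{Var}[\mathrm{Ln}(s^*/s)] = q(1-q)[\mathrm{Ln}(u/d)]^2\,n \equiv \hat{\sigma}^2 n$$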
If $\mu T$ and $\sigma^2 T$ are the empirical values of the stock’s expected return and variance over the time until
expiration (T periods till expiration with $\mu$ as the expected return over each unit of time and $\sigma^2$ the
variance over each unit of time), then we want:
$$[q\,\mathrm{Ln}(u/d) + \mathrm{Ln}(d)]\,n \to \mu T \qquad \text{and} \qquad q(1-q)[\mathrm{Ln}(u/d)]^2\,n \to \sigma^2 T \qquad \text{as } n \to \infty$$
So, the mean is correct for all n and the variance converges in the limit (as $n \to \infty$).
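One standard parameter choice with exactly these properties is $u = e^{\sigma\sqrt{T/n}}$, $d = 1/u$, and $q = \tfrac{1}{2} + \tfrac{1}{2}(\mu/\sigma)\sqrt{T/n}$. It is offered here as an assumption, since the choice itself is not shown above. The sketch below plugs it into the mean and variance expressions; the values μ = 0.10, σ = 0.20, T = 1 are illustrative.

```python
from math import exp, log, sqrt


def assumed_parameters(mu, sigma, T, n):
    """Assumed choice of u, d, q for n periods of length h = T / n."""
    h = T / n
    u = exp(sigma * sqrt(h))
    d = exp(-sigma * sqrt(h))
    q = 0.5 + 0.5 * (mu / sigma) * sqrt(h)
    return u, d, q


def log_return_moments(u, d, q, n):
    """Mean and variance of Ln(s*/s) over n periods of the binomial process."""
    mean = (q * log(u / d) + log(d)) * n
    var = q * (1.0 - q) * log(u / d) ** 2 * n
    return mean, var


mu, sigma, T = 0.10, 0.20, 1.0  # illustrative values
for n in (10, 100, 10000):
    mean, var = log_return_moments(*assumed_parameters(mu, sigma, T, n), n)
    print(n, mean, var)
```

Under this choice the mean equals μT for every n, while the variance works out to σ²T - μ²T²/n, which approaches σ²T as n grows.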
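Since $\hat{R}_f^{-n} = R_f^{-T}$,
$$c = s\,\Phi[a; n, p^{*\prime}] - k R_f^{-T}\,\Phi[a; n, p^*]$$
which means the binomial model converges to the Black-Scholes model, which is written
$$c = s N(x) - k R_f^{-T} N(x - \sigma\sqrt{T})$$
where $x = \dfrac{\mathrm{Ln}(s / k R_f^{-T})}{\sigma\sqrt{T}} + \tfrac{1}{2}\sigma\sqrt{T}$, if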
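The convergence can be illustrated numerically. The sketch below prices a call with the n-period binomial sum and with the Black-Scholes formula as written above, with $R_f$ as one plus the per-unit-time rate. It assumes the up and down moves $u = e^{\sigma\sqrt{T/n}}$ and $d = 1/u$, which is one common choice and is not confirmed by the text; the inputs s = 100, k = 100, Rf = 1.05, σ = 0.20, T = 1 are illustrative.

```python
from math import comb, erf, exp, log, sqrt


def norm_cdf(z):
    """Standard normal distribution function N(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def black_scholes_call(s, k, R_f, sigma, T):
    """c = s N(x) - k R_f**(-T) N(x - sigma sqrt(T))."""
    pv_k = k * R_f ** (-T)
    x = log(s / pv_k) / (sigma * sqrt(T)) + 0.5 * sigma * sqrt(T)
    return s * norm_cdf(x) - pv_k * norm_cdf(x - sigma * sqrt(T))


def binomial_call(s, k, R_f, sigma, T, n):
    """n-period binomial call with assumed u, d and per-period return R_f**(T/n)."""
    h = T / n
    u, d = exp(sigma * sqrt(h)), exp(-sigma * sqrt(h))
    R_hat = R_f**h
    p = (R_hat - d) / (u - d)  # risk neutral probability of an up move
    total = sum(
        comb(n, j) * p**j * (1 - p) ** (n - j) * max(u**j * d ** (n - j) * s - k, 0.0)
        for j in range(n + 1)
    )
    return total / R_hat**n


bs = black_scholes_call(100.0, 100.0, 1.05, 0.20, 1.0)
for n in (5, 50, 500):
    print(n, binomial_call(100.0, 100.0, 1.05, 0.20, 1.0, n), bs)
```

The binomial values should approach the Black-Scholes value as n increases. As in the one-period case, the actual probability q does not appear anywhere in the calculation.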