Advanced Financial Engineering

Eduardo Mendes Machado
1 Risk and Risk Aversion

The most interesting aspect of Asset Pricing, the focus of this course, considers how securities markets
price risk (the time dimension alone is largely mechanical although there are interesting interactions
between the two). For this question to be interesting, it must be that there is a positive price for risk –
i.e. investors require some compensation for exposing their portfolios to risk (this certainly appears to be
true from the data). Theoretically, this in turn requires that investors dislike risk or that they are risk
averse. For intuition’s sake, we will review some of the relevant concepts.
Definition: Let $\succsim$ be a preference relation with an expected utility representation. $\succsim$ is said to
exhibit or display risk aversion if, for any simple gamble $\tilde{\delta}$ with expected value $g$, denoted $\tilde{\delta}_g$, the
relation weakly prefers the fixed value $g$ to the simple gamble: $g \succsim \tilde{\delta}_g$ for all $g$ and $\tilde{\delta}_g$. The weak
preference allows for indifference so “weak risk aversion” includes risk neutrality.
(Strict risk aversion, risk neutrality, and risk seeking (weak or strict) are defined analogously.)
Example: A simple gamble: Consider a random payoff $\tilde{\delta}_g$ which pays $\delta_1 > 0$ with probability $p$, $0 \le p \le 1$,
or $\delta_2 \ne \delta_1$ with probability $1 - p$. The expected value of $\tilde{\delta}_g$ is
$p\delta_1 + (1-p)\delta_2 = E(\tilde{\delta}_g) = g$. This gamble is said to be ‘fair’ if $E[\tilde{\delta}_g] = g = 0$. We can
alternatively define a risk averse agent as one who is unwilling or indifferent to taking any fair
gamble, and strictly risk averse if unwilling to accept any fair gamble. In the above definition, a
risk averse individual (weakly) prefers to receive the amount $E(\tilde{\delta}_g) = g$ rather than face the bet
$\tilde{\delta}_g$.
Definition: A function $f(\cdot): W \to \mathbb{R}$ (reals) is concave if $f(az + (1-a)y) \ge a f(z) + (1-a) f(y)$ $\forall z, y \in W$
and all $a \in [0, 1]$. (f is affine if f(z) = bz + c, where b and c are constants.)
If a concave function f( ) is defined on an open interval of the real line then f( ) is continuous and is
continuously differentiable almost everywhere on that interval. (′ denotes a derivative.)
 f′( ) is non-increasing if f( ) is concave. So, if f( ) is concave and twice differentiable then
f″( ) is non-positive.
 Generally, we will be concerned with f( ) such that f′( ) > 0.
The definition of concavity leads naturally to Jensen’s Inequality:

$f(E(\tilde{x})) \ge E(f(\tilde{x}))$ if f( ) is a concave function and $\tilde{x}$ is a random variable.
(Intuitively, just think of a and (1 – a) as probabilities in the definition of concavity.)
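Illustration: Consider a fair gamble defined by $\delta_1 = -\delta_2 = \delta$ and p = ½. Label f( ) = U( ).
[Figure: a concave utility function U(w) plotted against wealth, w. The horizontal axis marks wo – δ, wo, and wo + δ; the vertical axis marks U(wo) = U(½(wo+δ) + ½(wo–δ)), which lies above $E[U(w_o + \tilde{\delta})]$.]
This compares $U(w_o) = U(E[w_o + \tilde{\delta}])$ with $E[U(w_o + \tilde{\delta})] = \tfrac{1}{2}U(w_o + \delta) + \tfrac{1}{2}U(w_o - \delta)$.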
Thus it is the concavity of U( ) that causes the agent to be unwilling to accept the fair gamble.
Intuitively, risk aversion derives from a downside loss causing a reduction in utility that is greater than
the increase in utility from an equivalent upside gain (f ′ ( ) is non-increasing).
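As a quick numerical check of this argument, the following Python sketch evaluates both sides of Jensen's inequality for one fair gamble. The log utility, the wealth level of 100, and the gamble size of 20 are illustrative assumptions, not values taken from these notes.

```python
import math

def expected_utility(u, w0, delta, p=0.5):
    """Expected utility of wealth w0 plus a gamble paying +delta w.p. p, -delta w.p. 1-p."""
    return p * u(w0 + delta) + (1 - p) * u(w0 - delta)

# Illustrative (assumed) numbers: log utility, w0 = 100, delta = 20, p = 1/2.
w0, delta = 100.0, 20.0

u_sure = math.log(w0)                                # U(E[w0 + gamble]) = U(w0)
u_gamble = expected_utility(math.log, w0, delta)     # E[U(w0 + gamble)]

# A risk neutral (linear) agent is exactly indifferent to the same fair gamble.
linear_gamble = expected_utility(lambda w: w, w0, delta)

print(f"U(w0) = {u_sure:.6f}, E[U(w0 + gamble)] = {u_gamble:.6f}")
```

With a concave U the first number exceeds the second, so the agent declines the fair gamble; with a linear U the two coincide.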
The two definitions provided above naturally lead to the following theorem.
Theorem: An agent is strictly risk averse iff U( ) is strictly concave.

Proof: “An agent is strictly risk averse if U( ) is strictly concave”
Assume U( ) is strictly concave.
Thus, for any x and y with x ≠ y it’s true that $U(\lambda x + (1-\lambda)y) > \lambda U(x) + (1-\lambda)U(y)$ $\forall \lambda \in (0, 1)$.
Now, consider any arbitrarily chosen fair gamble, (δ1, δ2, p).
Since the gamble is fair, we know that for any wo
wo = wo + pδ1 + (1-p)δ2 = p(wo+δ1) + (1-p)(wo+δ2)
Now, use the strict concavity of U( ) and label λ = p, x = wo+δ1, y = wo+δ2
Then, U(wo) = U(p(wo+δ1) + (1-p)(wo+δ2)) = U(λx + (1-λ)y)
> λU(x) + (1 – λ)U(y) = pU(wo+δ1) + (1-p)U(wo+δ2) = $E(U(w_o + \tilde{\delta}))$,
where the inequality follows from strict concavity. Strict concavity of U( ) implies this agent
always strictly prefers not to accept any fair gamble and so is strictly risk averse.
“An agent is strictly risk averse only if U( ) is strictly concave”

Assume the agent is strictly risk averse and so is unwilling to accept fair gambles.
Therefore,
U(wo) > pU(wo+δ1) + (1-p)U(wo+δ2) (eqn ☼)
holds for any fair gamble (δ1, δ2, p) and any wo.
Let $(x, y, \lambda) \in \mathbb{R}^2 \times (0,1)$ be arbitrarily chosen.
Simply let wo = λx + (1-λ)y, δ1 = x – wo, δ2 = y – wo, and p = λ.
Then eqn ☼ becomes the equation for strict concavity and we are done if this gamble is in fact
fair: pδ1 + (1-p)δ2 = λ(x-wo) + (1-λ)(y-wo) = λx + (1-λ)y - wo = wo - wo = 0.
The concave functions we are concerned with are of course utility functions. In finance, we commonly
simplify things and deal with utility of wealth. In this course we will consider, directly or indirectly, the
implications of the investment problem of maximizing agents. In the standard two-date, consumption-
investment problem, agents control two types of variables: (1) at the first date they invest their after
consumption wealth in the marketed securities; and (2) at the second date they sell these securities/assets
to buy consumption goods. Two components make the problem interesting: time and risk.
 Thus, the investment decision consists of forming a portfolio that transfers wealth from one date to
the next, and the consumption decision allocates the resulting wealth among the various goods (in a
multi-period problem, these goods include savings).
 If a complete set of state contingent claims (futures contracts) is available, then the date 2
consumption goods can be purchased at date 1 and both allocations can be made simultaneously. If
there is some incompleteness in the market, then the decision must take place in two distinct steps.
 Since it is also possible to consider the complete markets single-step problem as a two-step problem
we will generally think of the problem in two pieces – investment then consumption.
 If there is a single consumption good in the economy/model, the utility of wealth function is just a
standard utility function over consumption of the single good.
 If multiple goods exist, we are dealing with a derived utility of wealth function U(W) where U(W) ≡
Max V(c) s.t. p′c = W where c is a vector of consumption goods, p is a price vector, and W the
level of the budget constraint. U(W) is just the envelope of V(c) for different levels of final wealth,
W. In this case, we think of the date 2 consumption allocations as specifying a derived utility of
wealth function that the investment decision seeks to maximize. We can then concentrate on the
investment decision itself. To illustrate in a simple two good example, consider:
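[Figure: two panels. Top panel: indifference curves 1, 2, and 3 in (x1, x2) space, reached at wealth levels W1, W2, and W3. Bottom panel: the derived utility U(W) plotted against W at W1, W2, and W3. x1 is the numeraire.]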
 If V(c) is increasing and p > 0 then U(W) is increasing

 If U(W) depends only on W it is called “state independent.” We commonly assume this is true. If
it also depends on relative prices – i.e. the price vector p – it is “state dependent.”
Arrow-Pratt Measures of Risk Aversion

We consider 3 approaches to measuring risk aversion. All evaluate a riskless wealth level versus a simple
gamble.
(1) Arrow’s measure of risk aversion – what is the “compensation” required for a risk averse agent
to accept a gamble?
This defines a probability (or an expected payoff/return) based “risk premium.” In this sense, it is
related to the finance or asset pricing view.
If $\tilde{\delta} \equiv +\delta$ with probability $p$
and $-\delta$ with probability $1-p$
Then, the question can be stated as: for what value of p (> ½) is $E[U(w_o + \tilde{\delta})] = U(w_o)$?
Or, for what p is $pU(w_o + \delta) + (1-p)U(w_o - \delta) = U(w_o)$?
To solve, take a Taylor series expansion of the left-hand side of this last equation at wo:
$$p\,[U(w_o) + \delta U'(w_o) + \tfrac{1}{2}\delta^2 U''(w_o) + o(\delta^2)] + (1-p)\,[U(w_o) - \delta U'(w_o) + \tfrac{1}{2}\delta^2 U''(w_o) + o(\delta^2)] = U(w_o)$$
We can drop the $o(\delta^2)$ terms as inconsequential if δ is ‘small,’ leaving:
$$U(w_o) + (2p-1)\,\delta\,U'(w_o) + \tfrac{1}{2}\delta^2 U''(w_o) \approx U(w_o)$$
Note that for linear U( ) (so that U″ = 0) this holds at p = ½ as would be expected.
Solving for p, we find that $p \approx \tfrac{1}{2} - \tfrac{1}{4}\,\delta\,\frac{U''(w_o)}{U'(w_o)}$.
Define the measure of absolute risk aversion, $A(w_o) = -\frac{U''(w_o)}{U'(w_o)}$; then $p \approx \tfrac{1}{2} + \tfrac{1}{4}\,\delta\,A(w_o)$.
Note that if U′(wo) > 0 and U″(wo) < 0 (utility is increasing and concave) then A(wo) > 0.
Note: A(wo) is a property of the preference relation $\succsim$ and not its utility function representation U( ).
A(wo) relates to the curvature of the utility function at wo (think of the Jensen’s inequality picture). So,
clearly U″( ) belongs, but why is 1/U′( ) there?
Since utility functions are unique only up to a positive affine transformation, 1/U′( ) is a standardization
used to make sure A(wo) is truly a property of $\succsim$ and not merely of U( ). Note that A(wo) is a local
measure (at wo) and that the result is strictly true only for ‘small’ gambles.
Let’s interpret p: What we see is that the agent must be compensated for bearing the risk of $\tilde{\delta}$:
$E_p(w_o + \tilde{\delta}) = p(w_o + \delta) + (1-p)(w_o - \delta) > w_o$, or, $E_p(\tilde{\delta}) > 0$, iff A(wo) > 0.
Since $A(w_o) = -\frac{U''(w_o)}{U'(w_o)}$ and U′(wo) > 0, then A(wo) > 0 iff U″(wo) < 0 (i.e. iff U( ) is concave).
Thus, the amount by which we adjust p from ½ (the fair gamble level) is proportional to A(wo), the
Arrow-Pratt coefficient of ‘absolute risk aversion’ (a positive amount for risk averse preferences), and to δ,
a metric for the amount of risk in (the size of) the simple gamble.
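The small-gamble approximation can be compared with the exact compensating probability, which follows from solving pU(wo+δ) + (1-p)U(wo–δ) = U(wo) for p. The Python sketch below assumes CARA utility with A = 2, wo = 1, and δ = 0.1; these numbers and the helper name `arrow_probability` are illustrative choices, not part of the notes.

```python
import math

def arrow_probability(u, w0, delta):
    """Exact p solving p*U(w0+delta) + (1-p)*U(w0-delta) = U(w0)."""
    low, mid, high = u(w0 - delta), u(w0), u(w0 + delta)
    return (mid - low) / (high - low)

# Assumed example: CARA utility U(w) = -exp(-A*w), so A(w) = A at every wealth level.
A = 2.0
cara = lambda w: -math.exp(-A * w)
w0, delta = 1.0, 0.1

p_exact = arrow_probability(cara, w0, delta)
p_approx = 0.5 + 0.25 * delta * A      # the approximation p = 1/2 + (1/4)*delta*A(w0)

print(f"exact p = {p_exact:.6f}, approximate p = {p_approx:.6f}")
```

Both values lie above ½, and they move closer together as δ shrinks, which is the sense in which A(wo) is a local measure.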
If we had instead considered a proportional (to initial wealth) lottery:

$\tilde{\delta} \equiv +\delta w_o$ with probability $p$
and $-\delta w_o$ with probability $1-p$
then the above exercise would give us:
$p \approx \tfrac{1}{2} + \tfrac{1}{4}\,\delta\,R(w_o)$, where $R(w_o) = -w_o\frac{U''(w_o)}{U'(w_o)}$ is the coefficient of ‘relative risk aversion.’
(2) Pratt’s measure of risk aversion – what payment would a risk averse agent make to avoid a fair
gamble (insurance premium)?
$\tilde{\delta} \equiv +\delta$ with probability = ½ (note that in this case the gamble is fixed as “fair”)
and $-\delta$ with probability = ½
The question is then, for what value of πi is it true that $E[U(w_o + \tilde{\delta})] = U(w_o - \pi_i)$?
Taking a Taylor’s series expansion of both sides (around wo) and solving for πi (do this) gives:
$$\pi_i \approx \tfrac{1}{2}\delta^2 A(w_o) = \tfrac{1}{2}A(w_o)\,\mathrm{Var}(\tilde{\delta})$$
(A(wo) is again an approximation to measuring risk aversion for small δ, and in this simple setting $\mathrm{Var}(\tilde{\delta})$
measures the “size” of the gamble or the amount of risk.)
wo – πi defines the certainty equivalent wealth for $w_o + \tilde{\delta}$ given U( ).
Again, R(wo) would appear if the gamble considered were a proportional one.
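To see how tight the Pratt approximation is, the Python sketch below computes the exact insurance premium from the certainty equivalent and compares it with ½δ²A(wo). Log utility (for which A(w) = 1/w), wo = 100, and δ = 10 are assumed purely for illustration, and `insurance_premium` is a hypothetical helper name.

```python
import math

def insurance_premium(u, u_inverse, w0, delta):
    """Exact pi_i solving E[U(w0 + gamble)] = U(w0 - pi_i) for a fair +/-delta gamble."""
    expected_u = 0.5 * u(w0 + delta) + 0.5 * u(w0 - delta)
    certainty_equivalent = u_inverse(expected_u)   # this is w0 - pi_i
    return w0 - certainty_equivalent

# Assumed example: log utility, so A(w) = 1/w.
w0, delta = 100.0, 10.0
pi_exact = insurance_premium(math.log, math.exp, w0, delta)
pi_approx = 0.5 * delta**2 * (1.0 / w0)   # (1/2) * A(w0) * Var(gamble)

print(f"exact premium = {pi_exact:.5f}, approximation = {pi_approx:.5f}")
```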
(3) Also consider the question: for what πc, the compensating wealth premium, is it true that:
$E[U(w_o + \pi_c + \tilde{\delta})] = U(w_o)$ for a fair gamble $\tilde{\delta}$?
We can show that πc depends on A( ). Since “we can,” do so as an exercise. What do you find?
Consider the relation between the three versions:

(1) $E_p[U(w_o + \tilde{\delta})] = U(w_o)$
Read Ep( ) as the expectation using p as the probability of the good outcome in the gamble $\tilde{\delta}$.
[Figure: wealth axis marking wo – δ, wo, and wo + δ.]
(2) $E_{1/2}[U(w_o + \tilde{\delta})] = U(w_o - \pi_i)$
[Figure: wealth axis marking wo – δ, wo – πi, wo, and wo + δ.]
(3) $E_{1/2}[U(w_o + \pi_c + \tilde{\delta})] = U(w_o)$
[Figure: wealth axis marking wo + πc – δ, wo, wo + πc, and wo + πc + δ.]
Each is determined by the nature of the curvature of U( ) at wo, i.e. its concavity.
Comparing Risk Aversion

Definition: A decision maker is decreasingly risk averse if $\forall$ gambles $\tilde{\delta}$ and $\forall z, w, w' \in \mathbb{R}$ with
w′ > w:
If $E[U(w + \tilde{\delta})] > U(w + z)$ then $E[U(w' + \tilde{\delta})] > U(w' + z)$
(Increasing risk aversion can be defined analogously.)
Practice exercise: Assume U( ) is twice continuously differentiable. Show this holds iff A(w) is non-increasing in w
Practice exercise: Show that if an individual is decreasingly risk averse then U′″(w) > 0 if U( ) is thrice differentiable.
Definition: Individual 1 is strictly more risk averse than individual 2 if $\forall$ simple gambles $\tilde{\delta}$ on $\mathbb{R}$
and $\forall w_o$ the insurance premium (πi) individual 1 would pay to avoid the gamble $\tilde{\delta}$ is strictly larger than
that which individual 2 would pay. (Or, if 1 always chooses a safe investment over a simple gamble
whenever 2 does.)
Theorem: Consider two strictly increasing concave utility functions U1 and U2 – the following are
equivalent:
1) A1(w) > A2(w) $\forall w$ “1 is more risk averse than 2”
2) $\exists\, G(\cdot)$ with G′( ) > 0 and G″( ) < 0 such that U1(w) = G[U2(w)]
(i.e., U1( ) is a “concavification” of U2( ))
3) $\pi_{i1} > \pi_{i2}$ $\forall w_o$ and $\tilde{\delta}$
Proof:
(3) ⇔ (1): we know $\pi_{ij} \approx \tfrac{1}{2}A_j(w)\,\mathrm{Var}(\tilde{\delta})$ for j = 1, 2,
where πij is the insurance premium agent j will pay to avoid the gamble $\tilde{\delta}$.
Thus, $\pi_{i1} > \pi_{i2}$ $\forall w$ and $\tilde{\delta}$ iff A1(w) > A2(w) $\forall w$.
(2) ⇒ (3): U1(w – π2) = G(U2(w – π2)) from the definition of G( )
= $G[E(U_2(w + \tilde{\delta}))]$ from the definition of π2
> $E[G(U_2(w + \tilde{\delta}))]$ Jensen’s inequality (G is strictly concave)
= $E[U_1(w + \tilde{\delta})]$ from the definition of G( )
= U1(w – π1) from the definition of π1
So, π1 > π2 since U1′ > 0 (that is, U1 is an increasing function)
(1) ⇒ (2): Define G( ) = U1[U2⁻¹( )]; then

U1(w) = G(U2(w)) by definition
Since U1 and U2 are strictly increasing, U1′ = G′(U2)U2′ ⇒ G′ > 0
So, G( ) must be increasing since U1 and U2 are
Similarly, U1″ = G″(U2)(U2′)² + G′(U2)U2″

We can write this as:
$$\frac{U_1''}{U_1'} = \frac{G''(U_2)(U_2')^2}{G'(U_2)U_2'} + \frac{G'(U_2)U_2''}{G'(U_2)U_2'} = \frac{G''U_2'}{G'} + \frac{U_2''}{U_2'}$$
⇒ $A_1(w) - A_2(w) = -\frac{G''U_2'}{G'}$, thus (1) ⇔ G″ < 0,
so strict concavity of G follows from (1).
Examples of Commonly Used Utility Functions
 Constant Absolute Risk Aversion – CARA

No wealth effects: $A(w) = -\frac{U''(w)}{U'(w)} = A$, a positive constant independent of wealth
So, given this, what is U(w)?
$-\log(U'(w)) = Aw + I_1$
$U'(w) = e^{-(Aw + I_1)}$
$U(w) = -\frac{1}{A}e^{-(Aw + I_1)} + I_2 = \frac{e^{-I_1}}{A}\left(-e^{-Aw}\right) + I_2$
For all I1 and I2 these are positive affine transformations of the negative exponential utility function
$U(w) = -e^{-Aw}$ ⇒ CARA Utility
 Constant Relative Risk Aversion – CRRA

Now, $-\frac{wU''}{U'} = R$, a positive constant independent of wealth
$-\frac{U''}{U'} = \frac{R}{w}$
$-\log(U'(w)) = R\log(w) + I_1$ as long as w > 0
$U'(w) = e^{-(R\log(w) + I_1)} = e^{-I_1}w^{-R}$
Case 1: R = 1
$U(w) = e^{-I_1}\log(w) + I_2$
A positive affine transformation of the base utility function U(w) = log(w)

⇒ CRRA utility implies log utility if the level of relative risk aversion, R = 1.
Case 2: R ≠ 1
$U'(w) = e^{-I_1}w^{-R}$
$U(w) = e^{-I_1}\frac{w^{1-R}}{1-R} + I_2$
So, the base utility function is $U(w) = \frac{w^{1-R}}{1-R}$

⇒ CRRA utility implies power utility if R ≠ 1. Note, we could alternatively take the limit as R
→ 1 to derive case 1.
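The defining properties of these two families can be checked numerically. The Python sketch below approximates A(w) = -U″(w)/U′(w) by central finite differences and confirms that it is flat in wealth for negative exponential utility, while wA(w) is flat for power utility. The parameter values (A = 0.5, R = 3), the step size, and the helper name are assumptions made for illustration.

```python
import math

def absolute_risk_aversion(u, w, h=1e-3):
    """Approximate A(w) = -U''(w)/U'(w) using central finite differences."""
    first = (u(w + h) - u(w - h)) / (2 * h)
    second = (u(w + h) - 2 * u(w) + u(w - h)) / h**2
    return -second / first

A, R = 0.5, 3.0                                  # assumed illustrative parameters
cara = lambda w: -math.exp(-A * w)               # negative exponential utility
crra = lambda w: w ** (1 - R) / (1 - R)          # power utility, R != 1

wealth_levels = (1.0, 5.0, 10.0)
cara_A = [absolute_risk_aversion(cara, w) for w in wealth_levels]
crra_R = [w * absolute_risk_aversion(crra, w) for w in wealth_levels]

print("CARA absolute risk aversion:", cara_A)    # approximately A at every wealth level
print("CRRA relative risk aversion:", crra_R)    # approximately R at every wealth level
```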
 Linear Risk Tolerance (a.k.a. “Hyperbolic Absolute Risk Aversion – HARA”)

LRT or HARA encompasses all of the above
The risk tolerance measure is $\frac{1}{A(w)}$, the inverse of risk aversion:
$T(w) = \frac{1}{A(w)} = -\frac{U'(w)}{U''(w)}$
Linear risk tolerance implies $T(w) = -\frac{U'(w)}{U''(w)} = aw + b$, a linear function of w.
HARA is simply $A(w) = -\frac{U''(w)}{U'(w)} = \frac{1}{aw + b}$, a hyperbolic function of w.
Case 1: a = 0
If a = 0, this is simply CARA utility: $A(w) = \frac{1}{b} = A$
Case 2: b = 0
If b = 0, this is simply CRRA utility: $\frac{1}{aw} = A(w)$ ⇒ $\frac{1}{a} = wA(w) = R$
General Case: a, b ≠ 0
In this case, we have…
$-\log(U'(w)) = \frac{1}{a}\log(aw + b) + I_1$ as long as aw+b is positive
Assume a > 0, $w > -\frac{b}{a}$
Then $U'(w) = e^{-\frac{1}{a}\log(aw + b)}\,e^{-I_1} = e^{-I_1}(aw + b)^{-1/a}$
and,
$U(w) = e^{-I_1}\log(aw + b) + I_2$ if a = 1
$U(w) = e^{-I_1}\,\frac{1}{a}\,\frac{(aw + b)^{1 - \frac{1}{a}}}{1 - \frac{1}{a}} + I_2$ if a ≠ 1
For simplicity’s sake, let $w_s = -\frac{b}{a}$ and $R^* = \frac{1}{a}$, so base utility can be written…
$U(w) = \log(w - w_s)$ if R* = 1 (a = 1) ⇒ Generalized log utility
$U(w) = \frac{(w - w_s)^{1 - R^*}}{1 - R^*}$ if R* ≠ 1 (a ≠ 1) ⇒ Generalized power utility
These two base utility functions are generalized log and generalized power utility functions.
They are defined only for w > ws. ws is thought of as a subsistence level of wealth, below which
utility equals negative infinity.
Risk – A general notion of risk we will study later, but for now, a quick introduction
General notion – Risk is defined as any property of a set of random outcomes that is disliked by a risk
averse agent.
This is pretty general and seemingly broad enough to be almost useless. You’d be surprised. Commonly
we are in a situation of trying to focus this idea enough to make it fit within a standard economic model
and get a useful result to come out.
Rothschild & Stiglitz presented the idea this way: If uncertain outcomes $\tilde{X}$ and $\tilde{Y}$ have the same
location (the same mean), $\tilde{X}$ is said to be weakly less risky than $\tilde{Y}$ for a class of utility functions $\mathcal{U}$ if no
individual with a utility function in $\mathcal{U}$ prefers $\tilde{Y}$ to $\tilde{X}$. That is,
$E[U(\tilde{X})] \ge E[U(\tilde{Y})]$ $\forall\, U(\cdot) \in \mathcal{U}$.
Risk, defined in this way, clearly depends on the class of utility functions considered. At the most
general level $\mathcal{U}$ is taken to be the set of all risk averse (concave) utility functions.
Practice exercise: Let $\mathcal{U}$ be the set of quadratic utility functions so (wlog) we can write:
$U(z) = z - \frac{bz^2}{2}$. What can we use to measure risk for this class of utility functions?
Theorem: $\tilde{X}$ is weakly less risky than $\tilde{Y}$ iff $\tilde{Y}$ is distributed like $\tilde{X} + \tilde{\varepsilon}$, where $\tilde{\varepsilon}$ is a fair game with
respect to $\tilde{X}$. (That is, $E(\tilde{\varepsilon}\,|\,X) = 0$ $\forall X$.) The “fair game” property is not as strong as independence
but stronger than a lack of correlation. Why would that be a requirement?
Proof (sufficiency): $\tilde{Y}$ is distributed like $\tilde{X}$ plus noise, so the proof should depend on concavity.
$E[U(\tilde{Y})] = E[U(\tilde{X} + \tilde{\varepsilon})]$ due to the equivalence of the distributions
$= E[E(U(\tilde{X} + \tilde{\varepsilon})\,|\,X)]$ just conditional expectations
$\le E[U(E(\tilde{X} + \tilde{\varepsilon}\,|\,X))]$ Jensen’s inequality (i.e. U( ) is concave)
$= E[U(\tilde{X})]$
Stronger measures of Risk Aversion – introduced by Ross

We might think that if $\tilde{Y}$ is distributed as $\tilde{X}$ plus noise, where the conditional expectation of the noise
is zero, and if U1( ) is at least as risk averse as U2( ) as measured by A-P risk aversion, then if U2( ) prefers
$\tilde{X}$ to $\tilde{Y}$, so will U1( ).
Example:
$\tilde{X} = w_1$ with probability $p$, and $w_2$ with probability $1-p$;
$\tilde{Y} = w_1 + e$ with probability $.5p$, $w_1 - e$ with probability $.5p$, and $w_2$ with probability $1-p$.
Since $\tilde{e}$ is actuarially fair, any risk averse agent should prefer to avoid the second gamble contained in $\tilde{Y}$.
We now show that the Arrow-Pratt measure does not properly account for this situation in comparing
risk aversions.
The Arrow-Pratt measure approaches the problem by saying that U1( ) is more risk averse than U2( ) if
U1( ) prefers a riskless payoff r to a gamble $\tilde{y}$ whenever U2( ) does. Ross instead directly compares the
gambles $\tilde{X}$ and $\tilde{Y}$ given above. The first approach assumes complete insurance is possible (can always
evaluate a risky position against a riskless payoff); the stronger measures were developed under the
assumption that this is not possible. They highlight how the possibility of perfect insurance simplifies
this and many other issues.
The Ross Theorem – the following are equivalent:
(1) $\inf_w \frac{U_1''(w)}{U_2''(w)} \ge \sup_w \frac{U_1'(w)}{U_2'(w)}$, or $\frac{U_1''(w)}{U_2''(w)} \ge \lambda \ge \frac{U_1'(w)}{U_2'(w)}$ $\forall w$
(2) $\exists\, G, \lambda$ with G′ ≤ 0, G″ ≤ 0, and λ > 0 such that U1(w) = λU2(w) + G(w)
(3) $\forall\, \tilde{w}, \tilde{e}$ with $E(\tilde{e}\,|\,w) = 0$, π1 ≥ π2, where $E[U_i(\tilde{w} + \tilde{e})] = E[U_i(\tilde{w} - \pi_i)]$
Proof: First, let’s find π1: $E[U_1(\tilde{w} + \tilde{e})] = E[U_1(\tilde{w} - \pi_1)]$ where
$\tilde{w} = w_1$ with probability $p$, and $w_2$ with probability $1-p$;
$\tilde{w} + \tilde{e} = w_1 + e$ with probability $.5p$, $w_1 - e$ with probability $.5p$, and $w_2$ with probability $1-p$.
$E[U_1(\tilde{w} + \tilde{e})] = p\,\tfrac{1}{2}[U_1(w_1 + e) + U_1(w_1 - e)] + (1-p)U_1(w_2)$
$\approx pU_1(w_1) + (1-p)U_1(w_2) + \tfrac{1}{2}pU_1''(w_1)e^2$
Note that the effect of e is only on the second order term (and only at w1).
$E[U_1(\tilde{w} - \pi_1)] = pU_1(w_1 - \pi_1) + (1-p)U_1(w_2 - \pi_1)$
$\approx pU_1(w_1) + (1-p)U_1(w_2) - p\pi_1U_1'(w_1) - (1-p)\pi_1U_1'(w_2)$
Now equate these two approximations to find:
$$\pi_1 \approx \frac{-\tfrac{1}{2}pU_1''(w_1)e^2}{pU_1'(w_1) + (1-p)U_1'(w_2)} \quad \text{vs. A-P} \quad \pi_1 \approx \frac{-\tfrac{1}{2}U_1''(w_o)e^2}{U_1'(w_o)}$$
As compared to the A-P measure of risk aversion, or the related insurance premium, we
get a contamination of π by the different wealth levels that develop (w1 and w2) out of
the lack of perfect insurance.
(2) ⇒ (3): U1 = λU2 + G, λ > 0, G′, G″ ≤ 0

E[U1(w – π1)] = E[U1(w+e)]
= λE[U2(w+e)] + E[G(w+e)]
= λE[U2(w+e)] + E[E(G(w+e)|w)] conditional expectations
≤ λE[U2(w+e)] + E[G(w)] Jensen’s inequality
= λE[U2(w – π2)] + E[G(w)] definition of π2
≤ λE[U2(w – π2)] + E[G(w – π2)] G is decreasing
= E[U1(w – π2)]
⇒ π1 ≥ π2 since U1′ ≥ 0
(1) ⇒ (2): Define G( ) by U1( ) = λU2( ) + G( ), where U1( ) and U2( ) are scaled so (1) holds
for some λ > 0
G′( ) = U1′( ) – λU2′( ); then since λ ≥ U1′ / U2′ from (1) we know G′( ) ≤ 0
U1″ / U2″ = λ + G″( )/U2″( ) ≥ λ from (1)
⇒ G″( ) ≤ 0 since λ > 0 and U2″( ) < 0
(3) ⇒ (1): π1 ≥ π2 only if $\forall\, w_1, w_2, p$
$$\frac{-pU_1''(w_1)}{pU_1'(w_1) + (1-p)U_1'(w_2)} \ge \frac{-pU_2''(w_1)}{pU_2'(w_1) + (1-p)U_2'(w_2)}$$
or,
$$\frac{U_1''(w_1)}{U_2''(w_1)} \ge \frac{pU_1'(w_1) + (1-p)U_1'(w_2)}{pU_2'(w_1) + (1-p)U_2'(w_2)}$$
Now, if for some w1 and w2,
$$\frac{U_1'(w_2)}{U_2'(w_2)} > \frac{U_1''(w_1)}{U_2''(w_1)}$$
then, for p sufficiently small we have a contradiction.

So, $\forall$ w1 and w2, we must have:
$$\frac{U_1''(w_1)}{U_2''(w_1)} \ge \frac{U_1'(w_2)}{U_2'(w_2)}$$
or, $\inf_w \frac{U_1''(w)}{U_2''(w)} \ge \sup_w \frac{U_1'(w)}{U_2'(w)}$
Note - The Ross ordering implies the A-P ordering, simply let w1 = w2 and rearrange the Ross
definition – but we’ll see that the A-P ordering does not imply the Ross ordering.
Example: ~~
Suppose wealth is distributed w e as above
 12 pU i ' ' ( w1 )e 2
We know that  i 
pU i ' ( w1 )  (1  p )U i ' ( w2 )
Now let U1( ) = -e-mw U2( ) = -e-nw with m > n

 U1 ' '  U 2 ''
So, m = > = n these are the A-P measures
U1 ' U2'
But, for w1-w2 sufficiently large, so that w2-w1 is very negative (i.e. far from perfect insurance),
we end up with:
 U 1 ' ' ( w1 )  U 2 ' ' ( w1 )

= m e m ( w2  w1 ) < n e n ( w2  w1 ) =
U 1 ' ( w2 ) U 2 ' ( w2 )
[14]
then, for p small enough,  1 <  2 and U1( ) prefers some gambles that U2( ) does not
despite U1( ) being more risk averse under the A-P measure.
Why does this example work?

Even though U1( ) is uniformly more risk averse in the A-P sense, the lottery places a low likelihood on
the event w1 (p is small), the state in which insurance against e risk is meaningful.
The marginal value of insurance is determined by the second order effect -U″(w1)e². The cost of
insurance, on the other hand (for very low p), is valued at the margin by U′(w2), the marginal utility of the
likely state. The premium is thus determined by a tradeoff between the benefits of insurance at w1 and
the costs at w2.
The example forces these wealth levels far apart. The A-P measure cannot consider the two separate
wealth levels and so it fails to order properly the required premiums. Note that the Ross measure
controls for this nicely in this example. See representation (1) in the Theorem.
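The reversal in this example can be reproduced numerically. For negative exponential utility the premium has a closed form, because E[U(w – π)] = e^{kπ}E[U(w)]. The Python sketch below uses that fact; the specific numbers (m = 2, n = 1, e = 0.1, p = 0.01, w1 = 6, w2 = 1) and the function name are illustrative assumptions chosen to put the two wealth levels far apart.

```python
import math

def cara_premium(k, w1, w2, e, p):
    """Exact premium pi solving E[U(w - pi)] = E[U(w + e)] for U(w) = -exp(-k*w).

    Wealth is w1 with probability p and w2 with probability 1 - p; the fair
    noise (+e or -e, equally likely) is added only in the w1 state.
    """
    base = p * math.exp(-k * w1) + (1 - p) * math.exp(-k * w2)    # equals -E[U(w)]
    extra = p * (math.cosh(k * e) - 1.0) * math.exp(-k * w1)      # utility cost of the noise
    return math.log1p(extra / base) / k

m, n = 2.0, 1.0          # agent 1 is more risk averse in the Arrow-Pratt sense
e, p = 0.1, 0.01         # assumed gamble size and (small) probability of state w1

# Far from perfect insurance: the noisy state w1 is far above the likely state w2.
pi1_far = cara_premium(m, w1=6.0, w2=1.0, e=e, p=p)
pi2_far = cara_premium(n, w1=6.0, w2=1.0, e=e, p=p)

# Benchmark with w1 = w2, where the Arrow-Pratt ordering applies.
pi1_same = cara_premium(m, w1=1.0, w2=1.0, e=e, p=p)
pi2_same = cara_premium(n, w1=1.0, w2=1.0, e=e, p=p)

print(f"w1 far from w2: pi1 = {pi1_far:.3e}, pi2 = {pi2_far:.3e}")
print(f"w1 equal to w2: pi1 = {pi1_same:.3e}, pi2 = {pi2_same:.3e}")
```

With the wealth levels far apart the A-P more risk averse agent pays the smaller premium, while with w1 = w2 the usual ordering is restored.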
Thought exercise: What happens as we move outside this example’s structure?
2 Pricing by Arbitrage
Here we will take a first look at a financial market using a simple state space model. We first develop
some structure then examine the implications of the absence of arbitrage.
 Often in finance problems, uncertainty is characterized by the use of a set of random variables with a
particular joint distribution, perhaps something like $\tilde{y} \sim N(\mu, \Sigma)$.
 Here, we characterize uncertainty by considering a state space tableau of payoffs on the primitive
assets. We assume that there are a finite number of states of nature and that each security has its
payoffs written explicitly as a function of the realized state of nature.
 We index states by s = 1, 2, …, S (not a problem for S = ∞ but intuition can be lost as we look at
this for the first time) and assets by i = 1, 2, …, N.
 The 2-date investment problem can be characterized by the tableau of per share dollar payoffs on
the N assets in each of the S states at date 2 (Y) and a set of current prices (v).
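$$Y \equiv \begin{pmatrix} y_{11} & y_{12} & \cdots & y_{1N} \\ y_{21} & y_{22} & \cdots & y_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ y_{S1} & y_{S2} & \cdots & y_{SN} \end{pmatrix}$$
S states and N assets ⇒ S × N matrix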
We want to impose some structure on Y right off. In the investment decision, the agent can make
choices only over outcomes (states) which can be distinguished by different patterns of payoffs on
the marketed assets. Thus, for the investment decision, if there are states with identical payoffs on
all of the assets, then we cannot distinguish between the two so we can collapse them into a single
state (that is, the payoff matrix should not have 2 identical rows).
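Example: the payoff matrix
$$\begin{pmatrix} 1 & 3 \\ 2 & 1 \\ 2 & 1 \\ 3 & 2 \end{pmatrix} \Rightarrow \begin{pmatrix} 1 & 3 \\ 2 & 1 \\ 3 & 2 \end{pmatrix}$$
collapses to three states because its second and third rows are identical, but a payoff matrix whose rows are all distinct is fine from this perspective, even when individual entries repeat across rows or columns.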
 To complete the description of the “technology” or opportunities of the model, we use the vector
$v = (v_1, v_2, \ldots, v_N)'$ to represent the current price per share of each asset.
 The decision maker’s choice variable is a portfolio (an N × 1 vector) n where ni is the number of
shares of asset i held in the portfolio.
 The final payoff on the portfolio n is an (S × 1) vector, $\tilde{y}_n = Yn$, that gives the dollar payoff in each
state for the chosen portfolio.
 The price of this portfolio is n′v, so the budget constraint the decision maker is faced with can be
written n′v = W0 where W0 is the initial (after date 1 consumption) investable wealth.
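 Thus, we might think of solving the following economic problem:
$$\max_n\; E[u(\tilde{y}_n)] \quad \text{subject to} \quad \tilde{y}_n = Yn, \;\; v'n = W_0, \;\; n \in \mathbb{R}^N$$
or,
$$\max_n\; \sum_{s=1}^{S} \pi_s\, u(y_{ns}) \quad \text{subject to} \quad y_{ns} = \sum_{i=1}^{N} n_i Y_{si}, \; s = 1, \ldots, S, \;\; \sum_{i=1}^{N} v_i n_i = W_0, \; \text{and} \; n \in \mathbb{R}^N.$$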
…where, for now, we will assume only that u(·) is increasing. Y is the technological restriction
in this model and we are interested in the space of payoffs spanned by the columns of Y, i.e.
what different returns patterns can be generated by trading in the marketed securities (n is a
vector in $\mathbb{R}^N$). This, the budget constraint, and any other restrictions on n (short sales
restrictions etc.) define the opportunity set for the agent.
 It will often be convenient to describe a market by a returns tableau rather than a payoff tableau plus
initial prices.
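$$Z \equiv \begin{pmatrix} z_{11} & z_{12} & \cdots & z_{1N} \\ z_{21} & z_{22} & \cdots & z_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ z_{S1} & z_{S2} & \cdots & z_{SN} \end{pmatrix}$$
S states and N assets ⇒ again an S × N matrix
…where $[z_{si}] = [y_{si}][\mathrm{diag}(v_i)]^{-1}$, or $z_{si} = y_{si}/v_i$. Thus, Z is a matrix of gross returns.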
 Z can also be thought of as a payoff tableau where the number of shares of each asset is adjusted so
that all current prices per share (vi) are $1.
 Note: The construction of Z (from Y) requires only that all initial prices are not zero (i.e. we are not
modeling a market of futures contracts). Zero prices on some assets are not a problem as long as
there is at least one asset whose current price is non-zero. If so we can construct a new set of assets
by adding the payoff of this asset to that of all of the others.
 Since we are ultimately interested in the set of returns possibilities spanned by the columns of Y or Z
and we do not change this by forming a new basis for this space, we will usually assume that it is not
the case that all prices are zero and that any assets with a zero price have been transformed as just
described. This is not a big stretch since in most applications the primitive securities have limited
liability which implies a non-zero price as we will see. Negative prices are not a problem, but,
conceptually, is this really an ‘asset’?
Portfolios in Returns Space:

 We consider the vector η where ηi = nivi is the dollar amount committed to each asset i.
Thus, 1′η = Wo is the budget constraint and Zη is the vector of dollar payoffs on the portfolio η,
the equivalent of Yn.
 We often normalize η by initial wealth (W0) and consider wi = (nivi)/W0 so the vector w is a vector
of portfolio weights. The budget constraint is then written $1'w = \sum_{i=1}^{N} w_i = 1$ and Zw is the vector
of gross returns (per dollar invested) on the portfolio w (often written as $\tilde{Z}_w$).
Simple Numerical Example:

 2 0  2
Suppose Y =   and v =   .
 4 1 1
1 0
Then we can write Z =   (this is a very special case).
 2 1
 2
Consider a commitment of W0 = $5 divided as  =   (not meant to be in any way optimal).
 3
 2  0  2
Then the payoff is Z =   =   this is return per dollar times the number of dollars invested
 4  3 7
or simply the dollar payoff.
1
This could also be written in terms of Y and (since i = nivi) n =   (one share of asset 1 costs $2 and
3
 2
3 shares of asset 2 costs $3) so that the dollar payoff is again given as Yn =   .
7
 25   25 
As portfolio weights, this is w =   with gross returns per dollar invested of Zw =   , then you need to
 35   75 
know the number of dollars invested ($5) to determine the dollar payoff.
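The arithmetic of this example can be replayed in a few lines of Python. The sketch below builds Z from Y and v and checks that the share, dollar, and weight descriptions of the portfolio give the same payoffs; exact fractions are used so the weights 2/5 and 3/5 are not rounded. The helper name `matvec` is an illustrative choice.

```python
from fractions import Fraction

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

Y = [[2, 0], [4, 1]]          # dollar payoffs per share: rows are states, columns are assets
v = [2, 1]                    # current price per share of each asset

# Gross returns: z_si = y_si / v_i
Z = [[Fraction(y, price) for y, price in zip(row, v)] for row in Y]

n = [1, 3]                                            # shares held
eta = [shares * price for shares, price in zip(n, v)] # dollars committed: [2, 3]
W0 = sum(eta)                                         # initial wealth: 5
w = [Fraction(dollars, W0) for dollars in eta]        # portfolio weights: [2/5, 3/5]

print("Z      =", Z)
print("Y n    =", matvec(Y, n))        # dollar payoff by state
print("Z eta  =", matvec(Z, eta))      # the same dollar payoff
print("Z w    =", matvec(Z, w))        # gross return per dollar invested
```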
 An ‘arbitrage portfolio’ α (ω in the text) is a nontrivial vector of dollar commitments that sum to
zero. That is, an arbitrage or zero investment portfolio is a vector α where α ≠ 0 with $1'\alpha = 0 = \sum_{i=1}^{N} \alpha_i$.
We distinguish this from a positive investment portfolio, η.
 If all prices, all vi, are positive, as is usual, then it must be that αj < 0 for some (at least one) asset j
as the long positions in α are financed with short positions in other assets.
 No normalization is done for an arbitrage portfolio, and so we talk of α′ = (5, -5) as being the
same portfolio as α′ = (20, -20). Thus, an arbitrage portfolio is scale free. So, for any scalar γ, if
α is an arbitrage portfolio, then γα is the same arbitrage portfolio.
 Don’t confuse an arbitrage portfolio with an arbitrage opportunity (which we will discuss shortly). If
an arbitrage opportunity exists, it is often possible to exploit it with an arbitrage portfolio – and it
can thus be ‘run’ at an unlimited scale, its profits being limited only by the supply of assets or (more
likely) price reaction to trading – but they are different concepts.
More on the Construction of Z:

Note that the characteristics of the market will be very important for our analysis, so it pays to spend
some time on its construction up front.
(1) Redundant Assets

Definition: Let w1 be a specific positive investment (1′w1 = 1) portfolio. We call this portfolio
duplicable if there exists a distinct (w2 ≠ w1) portfolio w2 (1′w2 = 1) with Zw1 = Zw2.
 Equivalently, if there exists an arbitrage portfolio α (which must be nontrivial, α ≠ 0, you will recall)
with payoff exactly equal to zero in all states (Zα = 0, 1′α = 0), then there exists a duplicable
portfolio. (Why is this equivalent?)
 Clearly, this is a property of the set of returns as a whole. If w1 and w2 duplicate each other, then for
any portfolio w, ŵ = w + w1 – w2 = w + α (≠ w) will duplicate w.
 Duplicability is, therefore, usually expressed as a redundancy in the primitive assets, i.e. the columns
of Z. A redundant asset is one such that if it were removed from the market there would be no
change in the space of returns possibilities spanned by the marketed securities.
 Redundancy is formally stated as: If the “augmented” returns tableau
$$\hat{Z} = (Z', 1)' = \begin{pmatrix} z_{11} & \cdots & z_{1N} \\ \vdots & \ddots & \vdots \\ z_{S1} & \cdots & z_{SN} \\ 1 & \cdots & 1 \end{pmatrix}$$
has rank less than N, then some assets are redundant.
Loosely…we want to have N linearly independent assets (really the columns of Ẑ ) or else we have a
redundancy. If we don’t have this situation, then any asset or portfolio is duplicable, but only the assets
contributing to the collinearity are redundant assets.
• The row of 1’s in Ẑ is important to ensure that we consider only ‘equivalent investment’ duplications
of assets (otherwise we could label an arbitrage opportunity as a redundancy). Consider the
following example:
$$Z = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 2 & 0 \end{pmatrix}.$$
None of the assets is redundant since |Ẑ| = -1. Thus, the
augmented returns tableau has full rank of N = 3. Clearly, assets 1 and 2 are linearly dependent but
the augmented assets 1 and 2 are not. As far as the opportunities for the investor go, both assets are
very important in this economy and we would substantially alter an investor’s opportunity set by
eliminating either asset. We will generally assume that redundant assets have been eliminated from Z,
meaning the augmented tableau has full column rank, N (though Z may only have column rank N-1).
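The rank comparison in this example can be checked mechanically. The following is a minimal sketch, not part of the original notes: the helper `rank` is a hypothetical name, and it uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a small matrix by Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivots found so far
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # find a row at or below position r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

Z = [[0, 0, 1],
     [1, 2, 0]]            # S = 2 states (rows), N = 3 assets (columns)
Z_hat = Z + [[1, 1, 1]]     # augmented tableau: append the row of ones

rank_Z = rank(Z)            # column rank N - 1: assets 1 and 2 are collinear
rank_Z_hat = rank(Z_hat)    # full column rank N: no redundant assets
```

Here Z alone has rank 2 while the augmented tableau has rank 3, which is the distinction the example draws.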
LeRoy and Werner consider an alternative definition: essentially imposing a restriction of the absence of
arbitrage before they consider the issue of redundant assets (their definition is actually in terms of Y but
that is not of consequence).
Definition: The right inverse of Z (S×N) is an N×S matrix R such that ZR = I_S (S×S), and the left
inverse of Z is an N×S matrix L such that LZ = I_N (N×N).
If Rank(Z) = N = S, then Z is a square matrix and has an inverse. If N ≠ S, then Z can’t have an
inverse. Z, however, may have a left (if Z has full column rank) or a right (if Z has full row rank) inverse
in this case. Their definition: Lack of redundant assets ⇔ ∃ a left inverse for Z.
Suppose there are no redundant assets, Rank(Z) = N, and N ≤ S; find L such that LZ = I_N.
(Z′Z)⁻¹(Z′Z) = I_N if (Z′Z)⁻¹ exists, and if Rank(Z) = N then it does exist (in fact, it’s iff).
Then L ≡ (Z′Z)⁻¹Z′ is a left inverse of Z.
Pick any portfolio w of the marketed securities generating return Zw.

Now, apply the left inverse:
Zw = Zw so LZw = LZw or
w = LZw (since LZ = IN)
So, the portfolio that generates the returns Zw is uniquely defined by the left inverse of Z, L.
Thus, there can be no redundant assets. Again this is a bit more restrictive than the definition used by
Ingersoll but has a bit easier intuition.
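As a rough numerical check of the left-inverse argument, the sketch below builds L = (Z′Z)⁻¹Z′ for an illustrative 3-state, 2-asset returns matrix and recovers a portfolio from its returns. The matrix, the portfolio w, and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative tableau: S = 3 states, N = 2 assets, full column rank
Z = [[F(2), F(1)],
     [F(2), F(3)],
     [F(1), F(2)]]
Zt = transpose(Z)
L = matmul(inverse_2x2(matmul(Zt, Z)), Zt)   # L = (Z'Z)^{-1} Z'

w = [[F(1, 4)], [F(3, 4)]]                   # any portfolio with 1'w = 1
payoff = matmul(Z, w)                        # state-by-state returns Zw
w_recovered = matmul(L, payoff)              # L(Zw) = w
```

Because LZ = I_N, applying L to the returns Zw returns the original weights, so no second portfolio can generate the same returns.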
Now, further suppose S = N and Rank(Z) = S = N

We know that (AB)⁻¹ = B⁻¹A⁻¹ if A and B are invertible square matrices.
Now, L = (Z′Z)⁻¹Z′
= (Z⁻¹(Z′)⁻¹)Z′
= Z⁻¹((Z′)⁻¹Z′)
= Z⁻¹ (the same is true of R)
So, if N = S and Rank(Z) = N = S, then L = R = Z⁻¹.
Again, since they add nothing to the uncertainty spanned by the marketed returns matrix (i.e. the returns
that can be generated by an investor’s portfolio) we usually assume that all redundant assets have been
removed from Z (or Y).
(2) Insurable States
Definition: If you can construct a portfolio (from the assets in Z) that pays off only in one state, then
that state is said to be insurable. More precisely, state s is insurable if there exists a solution η_s to
Zη_s = i_s, where i_s is the Arrow-Debreu security for state s (such as for s = 2: i_2 = (0, 1, 0, …, 0)′),
and 1′η_s is the cost of insurance, per dollar received, against the occurrence of state s.
Theorem: A state is insurable iff the asset returns in that state are linearly independent from the
asset returns in all other states (i.e. the sth row of Z is linearly independent from the other rows of Z).
Said another way: zs. is not collinear with z1., z2., …, zs-1., zs+1., …, zS.
Proof: If z_{s·} is linearly dependent on the returns in other states, then
$$z_{s\cdot} = \sum_{\sigma \neq s} \lambda_\sigma z_{\sigma\cdot}$$
That is, z_{s·} can be written as a linear combination of the asset returns in other states, z_{σ·} (σ ≠ s),
for some set of scalars λ_σ.
So, for every portfolio η, we have
$$z_{s\cdot}\,\eta = \sum_{\sigma \neq s} \lambda_\sigma z_{\sigma\cdot}\,\eta$$
But, if state s is insurable, then for some η_s the right hand side of the equality must be zero, term
by term, and cannot sum to one as required (by the left hand side). So, if the sth row of Z is
linearly dependent on the other rows of Z, that state is not insurable.
If z_{s·} is linearly independent from the other rows, then Rank(Z, i_s) = Rank(Z) and by the rule of
rank, a solution η_s to Zη_s = i_s exists.
Rule of Rank: A non-homogeneous system of equations x1c1 + x2c2 + … + xNcN = b has a
solution iff Rank(c1, c2, …, cN) = Rank(c1, c2, …, cN, b).
 If all states are insurable, the market is said to be complete. This requires that all the rows of Z are
mutually linearly independent. So, for our example above:
1 0  1  0
Z =    1 =    2 =  
 2 1   2 1
so, the market is complete (which is one of the ways this was a very special case). What is the cost
of insurance in each state? What’s going on here? What do these insurance costs mean? This
illustrates basic dominance (another special aspect of this example).
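The insurance portfolios and their costs for this 2×2 example can be computed directly. This is a small sketch under the assumption that Z is square and invertible, so that the insurance portfolios are simply the columns of Z⁻¹; the variable names are illustrative.

```python
from fractions import Fraction as F

Z = [[F(1), F(0)],
     [F(2), F(1)]]            # rows = states, columns = assets

# 2x2 inverse by the adjugate formula
(a, b), (c, d) = Z
det = a * d - b * c
Z_inv = [[d / det, -b / det],
         [-c / det, a / det]]

# eta_s solves Z eta_s = i_s, i.e. eta_s is column s of Z^{-1}
etas = [[Z_inv[0][s], Z_inv[1][s]] for s in range(2)]

# cost of insurance per dollar received in state s is 1'eta_s
costs = [sum(eta) for eta in etas]
```

The first cost comes out negative: insuring state 1 means buying asset 1 and shorting two units of asset 2, which generates cash today. That is the symptom of dominance the text asks about.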
 Connection: Suppose that all states are insurable: Rank(Z) = S.

This requires N ≥ S (there may be redundant securities). Then Z has a right inverse R:
ZR = I_S (S×S). What is R?
ZZ′(ZZ′)⁻¹ = I_S if (ZZ′)⁻¹ exists, which it will iff Rank(Z) = S, i.e. full row rank.
So ZR = I_S where R = Z′(ZZ′)⁻¹.

A complete market is a special structure, not an innocuous transformation as is assuming a lack of
redundant assets. We draw an explicit distinction between complete and incomplete markets.
(3) Riskless Asset or Portfolio
Definition: A positive investment portfolio w with the same return in every state is called a riskless
asset. That is Zw = R1 with 1′w = 1 (this is a very special asset). The existence of a riskless asset (or
lack thereof) has a large impact on many issues in finance.
 We call R the gross riskless or risk free return and R-1 = rf the risk free rate (net return).
 Often, no riskless asset or portfolio will exist. In this case there is a ‘shadow’ riskless return for the
economy that will depend upon other aspects of the economy we haven’t highlighted so far
(preferences for example). The shadow risk free return can be bounded in this context, below by the
largest return that can be guaranteed for some portfolio (i.e. the best lower bound for any portfolio’s
return) and above by the largest return that can be achieved by every portfolio (i.e. the worst
maximum return on any portfolio).
Or, the lower bound is R̲ = max_w [min_s (Zw)_s]
and the upper bound is R̄ = min_w [max_s (Zw)_s]; these bounds are determined by dominance.
2 1
  w 
Example – 2 assets and 3 states: Z =  2 3  With 2 assets, any positive investment portfolio w =  1 
1 2  w2 
 
 w 
can be written as w =  1  (simply from 1′w = 1) so it fits easily in a picture.
1  w1 
The returns on any portfolio are given by Zw – or as a function of w1 in this 2 asset case:
2 1  2 w1  (1  w1 )   w1  1   Z w1 
   w1       
 2 3    =  2 w1  3(1  w1 )  =  3  w1  =  Z w 2 
 1 2  1  w1   w  2(1  w )  2  w  Z 
   1 1   1  w3 
We can graph the returns in the states {1, 2, 3} as a function of w1. Clearly, there can be no risk free
portfolio with 1′w = 1 as there is no point (w1) where the three lines meet.
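[Figure: the state returns (Zw)_1, (Zw)_2 and (Zw)_3 plotted against w1, with the lower bound R̲ marked at 2w1 = 1 and the upper bound R̄ marked at w1 = 1.]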
At w1 < ½ the lowest return is in state 1. For w1 > ½ the lowest return on any portfolio is in
state 3. (Zw)_1 increases in w1 and (Zw)_3 decreases in w1 so the max of this min is at w1 = ½ and this
identifies the lower bound R̲ = 1.50.
Similarly, the upper bound is R̄ = 2.0 at w1 = 1.0 from dominance again.
If we were to introduce an asset with Z1 = Z2 = Z3 = 1.20, what would be true?
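The two bounds in this example can be reproduced with a simple search over w1. The sketch below is illustrative only: the grid range and step are assumptions, and a grid search stands in for the graphical argument rather than a general solution method.

```python
from fractions import Fraction as F

Z = [[2, 1],
     [2, 3],
     [1, 2]]                      # 3 states, 2 assets

def state_returns(w1):
    """Returns (Zw)_s in each state for the portfolio w = (w1, 1 - w1)."""
    return [row[0] * w1 + row[1] * (1 - w1) for row in Z]

# Assumed search grid: w1 from -2 to 3 in steps of 1/100 (short sales allowed)
grid = [F(k, 100) for k in range(-200, 301)]

# Lower bound: the best guaranteed return (max over w of the min over states)
lower_w1 = max(grid, key=lambda w1: min(state_returns(w1)))
R_lower = min(state_returns(lower_w1))

# Upper bound: the smallest achievable best-case return (min over w of the max)
upper_w1 = min(grid, key=lambda w1: max(state_returns(w1)))
R_upper = max(state_returns(upper_w1))
```

With this grid the search lands on w1 = ½ for the lower bound and w1 = 1 for the upper bound, matching the values read off the picture.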
(4) Dominance and Arbitrage
 An arbitrage opportunity is an investment strategy that guarantees a positive payoff in some

contingency (date 1 wealth or at some date 2 state or states) with no possibility of a negative payoff
in any contingency (i.e. a money pump).
 The modern study of arbitrage is the study of the implications of the absence of arbitrage
opportunities in the financial market for the pricing of assets.
 The assumption of the absence of arbitrage opportunities is an appealing place to start for a number
of reasons:
(1) It is a more “primitive” concept than is ‘equilibrium.’ We will show that AA is a
necessary but not sufficient condition for equilibrium.
(2) Only relatively few rational agents are needed to bid away arbitrage opportunities as
opposed to all agents optimizing as in standard equilibria.
(3) We need only assume increasing preferences.
Definition: Dominance
A (positive investment) portfolio (or asset) w1 dominates portfolio w2 if Zw1 ≥ Zw2 (strict in at least one
element).
 Recall that any portfolio has 1′w =1 (spreads $1 around). So, the initial price of the two positions is
the same. Such a circumstance would be an example of an arbitrage opportunity. It is a special
example in that it requires dominance of one asset over another state by state.
 It is clear that no investor preferring more to less would ever hold a positive amount of a dominated
asset (any investor with strictly increasing preferences would hold instead the dominating asset). It is
further true that no non-satiated investor can find an internal optimal portfolio (a finite optimum) if
w1 and w2 are both available, even if neither would be held by the investor.
Proof: Suppose w1 dominates w2 and w is any portfolio.
Define X ≡ Z(w1 - w2) ≥ 0 (by dominance, with strict inequality in at least one state).
Any portfolio w is dominated by w + γ(w1 - w2) ∀ γ > 0.
Note that 1′(w + γ(w1 - w2)) = 1, so this is a feasible portfolio, and that
Z(w + γ(w1 - w2)) = Zw + γX ≥ Zw. So, w + γ(w1 - w2) is a portfolio that dominates w.
Furthermore, the portfolio return is increasing in γ, so because it is costless to do so and it
increases the portfolio return in at least one state, every investor with increasing preferences
seeks to always increase the scale of his position in (w1 – w2). Thus, no investor has an internal
optimal portfolio. Alternatively, dominated assets can’t exist – price pressure will quickly
eliminate the dominance.
• This illustrates the idea of an arbitrage opportunity. α = (w1 – w2) is an arbitrage portfolio that exploits
the opportunity. It also illustrates why AA is necessary for equilibrium with non-satiated agents;
there can be no equilibrium if no agent has a finite optimum.
From our example above,
$$Z = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$$
Asset one dominates asset two; this lies behind the negative insurance price. In this example, for any
k > 0, the portfolio α = (k, −k)′ represents an arbitrage opportunity (and a special kind).
 Dominance is a special form of arbitrage, others are easily defined.
Definition: Riskless Arbitrage – not only arbitrage, but a certain payoff
A riskless arbitrage opportunity is any η such that 1′η ≤ 0 and Zη = 1; then η guarantees a riskless
payoff when no (or possibly negative) investment is required.
• The existence of such an η does not imply the existence of a riskless asset:
$$\text{e.g. let } Z = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}; \text{ then } \eta = \begin{pmatrix} 1 \\ -1 \end{pmatrix} \text{ is a riskless arbitrage opportunity: } Z\eta = 1 \text{ and } 1'\eta = 0.$$
But, for any w with 1′w = 1…
$$Zw = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} w_1 \\ 1 - w_1 \end{pmatrix} = \begin{pmatrix} w_1 \\ 1 + w_1 \end{pmatrix} \neq \begin{pmatrix} R \\ R \end{pmatrix}$$
for any w1 or R, so no positive investment portfolio with a riskless payoff exists.
 If a riskless asset is available in an economy with a riskless arbitrage opportunity, it is possible to

create a risk free asset with any level of return.
Proof: If w is a riskless asset, then 1′w = 1 and Zw = R1.
Let η be a riskless arbitrage with 1′η = 0 and Zη = k1.
Then w + γη is a positive investment portfolio (1′(w + γη) = 1) with return
Z(w + γη) = (R + γk)1, and a judicious choice of γ gets you any desired level of riskless
return.
• Conversely, if there exists a riskless asset, a riskless arbitrage opportunity exists whenever there exists
a solution to 1′w = 1 and Zw = k1 where k ≠ R.
More generally:
Definition: 1st Type (arbitrage in LeRoy & Werner)
An arbitrage opportunity of the 1st type is defined as any η such that 1′η ≤ 0 and Zη ≥ 0 with Zη ≠ 0.
• This says that no (or possibly negative) investment buys a limited liability payoff that is strictly
positive in at least one state at the future date.
Definition: 2nd Type (strong arbitrage in LeRoy & Werner)
An arbitrage opportunity of the 2nd type is defined as any η such that 1′η < 0 and Zη ≥ 0.
• This says that you can generate money now and have at worst a limited liability payoff at the
future date.
Exercise – In equation (33) in Ingersoll, there is an arbitrage opportunity of the second type. Describe how to implement
it.
The existence of arbitrage opportunities (of the 1st and/or 2nd types) can be succinctly stated as: arbitrage
opportunities exist if there exists some η (or some n), for which the following is true:
$$\begin{pmatrix} -1' \\ Z \end{pmatrix}\eta \geq 0 \quad\text{or,}\quad \begin{pmatrix} -v'n \\ Yn \end{pmatrix} \geq 0,$$
where each vector is (S+1)×1 and at least one element is strictly positive.
When there is no such η (or n) then there are said to be no arbitrage opportunities (of the 1st or 2nd
types) in the market represented by Z (or Y), i.e. an Absence of Arbitrage.
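The two definitions translate into a short check on a candidate position η. The function below is a sketch with a hypothetical name; it only tests a given η against the definitions and does not search for arbitrage opportunities.

```python
def arbitrage_types(Z, eta):
    """Return (is_first_type, is_second_type) for position eta in market Z.

    Z is a list of rows (states) by columns (assets) of returns per dollar,
    so the cost of eta is 1'eta and its payoff is Z eta.
    """
    cost = sum(eta)
    payoff = [sum(z * e for z, e in zip(row, eta)) for row in Z]
    limited_liability = all(x >= 0 for x in payoff)
    # 1st type: cost <= 0, payoff >= 0 and strictly positive in some state
    first = limited_liability and cost <= 0 and any(x > 0 for x in payoff)
    # 2nd type: cash is generated today and the payoff is never negative
    second = limited_liability and cost < 0
    return first, second

dominance_market = [[1, 0], [2, 1]]
no_arbitrage_market = [[2, 1], [2, 3], [1, 2]]

long_short = arbitrage_types(dominance_market, [1, -1])      # zero cost, payoff (1, 1)
insure_state_1 = arbitrage_types(dominance_market, [1, -2])  # cost -1, payoff (1, 0)
not_arbitrage = arbitrage_types(no_arbitrage_market, [1, -1])
```

In the dominance example the zero-cost position (1, −1) is of the first type only, while (1, −2) qualifies under both definitions.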
__
Definition: A supporting state price vector (or linear pricing rule):

An S×1 vector p (more generally, a pricing function p( )) is said to support the market Z
if Z′p = 1 (or Y′p = v). That is, the vector p “prices” all assets correctly in that it relates future payoffs
to current prices (Z and 1 or Y and v).
Example:
$$Z = \begin{pmatrix} 2 & 1 \\ 2 & 3 \\ 1 & 2 \end{pmatrix}$$
Think of the elements of Z as a $ payoff per dollar invested.
Then Z′p = 1 ⇒ 2p1 + 2p2 + p3 = 1
p1 + 3p2 + 2p3 = 1 (when you solve this system of equations you will find) ⇒
p1 = ¼ + ¼ p3
p2 = ¼ - ¾ p3
p3 = p3 is a set of supporting prices as a function of p3
By construction, if we multiply either of the primitive securities in Z, or any portfolio of these
securities, by the vector p we will get the current price of the security/portfolio, 1.
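A quick way to confirm the one-parameter family of supporting prices is to price both assets for several values of p3. The sketch below assumes the Z of this example; the function names are illustrative.

```python
from fractions import Fraction as F

Z = [[2, 1],
     [2, 3],
     [1, 2]]                       # 3 states, 2 assets

def supporting_prices(p3):
    """State prices (p1, p2, p3) solving Z'p = 1, parameterized by p3."""
    p3 = F(p3)
    return [F(1, 4) + F(1, 4) * p3, F(1, 4) - F(3, 4) * p3, p3]

def asset_prices(p):
    """Current price of each asset implied by p: the vector Z'p."""
    return [sum(Z[s][i] * p[s] for s in range(3)) for i in range(2)]

# Every member of the family prices both assets at 1, positive or not
checks = {p3: asset_prices(supporting_prices(p3))
          for p3 in (F(0), F(1, 10), F(1, 3), F(1))}
```

Note that p3 = 1 gives p2 < 0, so the family includes supporting vectors that are not strictly positive.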
Linearity of the pricing rule: is the same as the lack of monopoly power in the financial market. (1) The
cost of money in state r is independent of how much is purchased in state s so there are no economies of
scope (although the possible payoffs in states r and s in portfolios of the traded assets may be tied by any
incompleteness in Z).
(2) The cost of $2 in state s is just exactly twice the cost of $1 in state s so there are no economies of
scale that exist.
 Our goal is to derive implications for p from the absence of arbitrage opportunities.
• An important implication is going to be: ∃ p > 0 such that Z′p = 1. (Why?)
What does p look like? What would happen if we were to multiply an Arrow-Debreu security (in Y)
by the vector p? The elements of p can be seen as state (or insurance) prices. Since ps is the current
cost of purchasing a dollar in state s and zero elsewhere, ps ≤ 0 is a clear arbitrage.
Example: To more completely show the link between a positive price vector and arbitrage:
$$\text{Let } Z = \begin{pmatrix} z_{11} & z_{12} \\ 0 & z_{22} \end{pmatrix}$$
Each asset’s current price is $1 since we are working with Z.
$$Z'p = \begin{pmatrix} z_{11} & 0 \\ z_{12} & z_{22} \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} z_{11}p_1 \\ z_{12}p_1 + z_{22}p_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \begin{matrix} (1) \\ (2) \end{matrix}$$
Suppose p1 < 0. Then from (1) p1 = 1/z11 so p1 < 0 ⇒ z11 < 0.
In this case, just shorting asset 1 is an arbitrage (receive $1 now) since z11 < 0 implies that a short
position in asset 1 has a payoff > 0. Similarly, from equation (2):
$$p_2 = \frac{1 - p_1 z_{12}}{z_{22}} = \frac{1 - z_{12}/z_{11}}{z_{22}} = \frac{z_{11} - z_{12}}{z_{22}\,z_{11}}$$
Consider the portfolio w: w1 = z12/(z12 − z11), w2 = −z11/(z12 − z11). Then w1 + w2 = 1 and,
$$Zw = \begin{pmatrix} z_{11} & z_{12} \\ 0 & z_{22} \end{pmatrix}\begin{pmatrix} \dfrac{z_{12}}{z_{12} - z_{11}} \\[2mm] \dfrac{-z_{11}}{z_{12} - z_{11}} \end{pmatrix} = \begin{pmatrix} 0 \\ 1/p_2 \end{pmatrix}$$
So, if p2 < 0 shorting the portfolio w is an arbitrage opportunity.
 The secret behind this example…This is a complete market which implies that we can form
any returns pattern on the assets we would like. This makes the relation easy to see since we
can then form portfolios that pay off only in one state (i.e. an Arrow-Debreu security) where
the payoff in a state and the state price are explicitly related.
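A numerical instance makes the construction concrete. The values z11 = 1, z12 = 3, z22 = 1 below are hypothetical, chosen only so that p2 comes out negative.

```python
from fractions import Fraction as F

# Hypothetical returns for the triangular market Z = [[z11, z12], [0, z22]]
z11, z12, z22 = F(1), F(3), F(1)

# State prices from Z'p = 1
p1 = 1 / z11
p2 = (z11 - z12) / (z22 * z11)

# Portfolio that pays off only in state 2
w = [z12 / (z12 - z11), -z11 / (z12 - z11)]
payoff = [z11 * w[0] + z12 * w[1],    # state 1
          z22 * w[1]]                 # state 2

# Shorting w brings in $1 today and pays -payoff at the future date
short_payoff = [-x for x in payoff]
```

With these numbers p2 = −2, the portfolio w costs $1 and pays (0, −½), so shorting it collects $1 now and receives ½ in state 2: an arbitrage.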
The Law of One Price

This is a weak (less restrictive) version of the absence of arbitrage, i.e., it requires less of the market. It
states that perfect substitutes must have the same price, i.e., two distinct assets with identical payoffs
must have the same current price. Clearly, this requires a redundancy in the primitive assets. It is a
consequence of the absence of arbitrage but does not imply it.
 Formally:
If n ≠ n* and Yn = Yn* then v′n = v′n* or,
If η ≠ η* and Zη = Zη* then 1′η = 1′η*
 The law of one price is equivalent to the existence of a supporting price vector but places no
restrictions on it. Consider the following problem: define η¹ = η – η*,
Choose η¹ to Min 1′η¹ subject to Zη¹ = 0
(find the smallest difference in cost between two portfolios with identical payoffs)
If the law of one price holds, clearly the solution to this problem is a minimum of zero. A finite
solution to the primal problem implies that the dual program is feasible. The dual to this
minimization problem is written:
Maxp 0′p subject to Z′p = 1

(where p is the vector of Lagrange multipliers from the primal problem)
Because the primal problem has a finite solution this dual problem is feasible, i.e. there is a finite
p that solves the maximization problem. Thus, if the law of one price holds, some p exists with
Z′p = 1, i.e. there exists a supporting price vector. The constraint in the primal problem is an
equality constraint so p is unconstrained in the dual. Now if we consider the existence of a
supporting state price vector the maximization problem is feasible and its objective must be zero.
The theorem of duality implies then that the minimization problem has an objective of zero so
the law of one price holds. Alternatively, if η ≠ η* and Zη = Zη* but 1′η ≠ 1′η* then no
vector p can exist such that p′Zη = 1′η and p′Zη* = 1′η* (which are not equal). See also
Theorem 2 in Ingersoll.
Considering an absence of arbitrage we get The Fundamental Theorem of Asset Pricing.

Theorem: The following are equivalent:
(1) The absence of arbitrage.
(2) The existence of a strictly positive supporting price vector.
(3) The existence of an internal optimum (portfolio) for some agent with strictly increasing
preferences.
Proof: This version of the proof uses Stiemke’s Lemma or the “Theorem of the alternative”:
“Let A be a matrix in ℝ^{N×M}. Then one and only one of the following is true
(a) There exists an x ∈ ℝ^M_{++} s.t. Ax = 0
(b) There exists an n ∈ ℝ^N with A′n ≥ 0 and A′n ≠ 0.”
(This is based on a separating hyperplane argument if you are familiar with them from
your economics classes.)
Let A = [Y′ −v], an N×(S+1) matrix (Y′ is N×S and v is N×1), so M = S+1 and N = N.
Suppose (a) is true. Then, by Stiemke’s Lemma, ∃ x ∈ ℝ^{S+1}_{++} such that Ax = 0.
Write this as:
$$\begin{pmatrix} Y_{11} & Y_{21} & \cdots & Y_{S1} & -v_1 \\ Y_{12} & Y_{22} & \cdots & Y_{S2} & -v_2 \\ \vdots & & & \vdots & \vdots \\ Y_{1N} & Y_{2N} & \cdots & Y_{SN} & -v_N \end{pmatrix}\begin{pmatrix} x_1 \\ \vdots \\ x_{S+1} \end{pmatrix} = 0$$
Now explicitly write out the multiplication [Y′ −v]x:
$$\text{1st row:}\quad \sum_{s=1}^{S} Y_{s1}x_s - v_1 x_{S+1} = 0$$
$$\text{2nd row:}\quad \sum_{s=1}^{S} Y_{s2}x_s - v_2 x_{S+1} = 0$$
$$\vdots$$
$$\text{Nth row:}\quad \sum_{s=1}^{S} Y_{sN}x_s - v_N x_{S+1} = 0$$
Now we know x ∈ ℝ^{S+1}_{++} so all the x_s are strictly positive, in particular x_{S+1} > 0. Divide both
sides of each equation by x_{S+1} and rearrange:
$$\sum_{s=1}^{S} Y_{s1}\frac{x_s}{x_{S+1}} = v_1, \quad \ldots, \quad \sum_{s=1}^{S} Y_{sN}\frac{x_s}{x_{S+1}} = v_N$$
Now define p_s = x_s/x_{S+1} > 0 ∀ s, so that
$$\sum_{s=1}^{S} Y_{s1}p_s = v_1, \quad \ldots, \quad \sum_{s=1}^{S} Y_{sN}p_s = v_N$$
or, Y′p = v with p > 0.
So, (a) is equivalent to ∃ p > 0 s.t. Y′p = v and the alternative (b) cannot hold.
Now, suppose that (b) holds. Recall, we let A = [Y′ −v]. (b) says ∃ n ∈ ℝ^N s.t. A′n ≥ 0, with at least one strict inequality.
$$A' = \begin{pmatrix} Y \\ -v' \end{pmatrix}_{(S+1)\times N}$$
So, there exists an n such that A′n ≥ 0 or
$$\begin{pmatrix} Yn \\ -v'n \end{pmatrix}_{(S+1)\times 1} \geq 0$$
This implies that Yn ≥ 0 and v′n ≤ 0, where at least one of these inequalities is strict. This is just the
general definition of the existence of arbitrage opportunities of the first and/or second types.
Therefore, we have shown that either (a) holds which is the existence of a strictly positive supporting
price vector and (b), the existence of arbitrage opportunities, does not or (b) holds which is the presence
of arbitrage and (a) does not – proving the equivalence of the absence of arbitrage and the existence of a
strictly positive price vector.
Now, part (3) of the Theorem ⇒ part (1) (or not 1 ⇒ not 3):
An investor’s maximization problem can be written: Max_n Σ_s π_s u_s(W0 − v′n, (Yn)_s). So, maximize
expected utility of current consumption and date 2 wealth. This allows for state dependent utility using
u_s(·,·) for generality; requiring only increasing preferences so Σ_s π_s u_s(·,·) increases in all arguments
(current spending and future wealth in all states).
If there is an arbitrage opportunity n*, then the investor’s problem cannot have a finite optimum if
preferences are increasing since for any n: Σ_s π_s u_s(W0 − v′(n + kn*), [Y(n + kn*)]_s) is strictly increasing
in k. With an arbitrage opportunity the investor can increase current consumption or future wealth, in at
least one future state, without sacrificing current consumption or wealth in other contingencies (other
states) and will thus desire to do so without bound.
Now, (1) ⇒ (3): If there is no arbitrage then there exists a positive supporting price vector p. Let W0 = 0.
Consider u_s(c0, c1s) = −exp[−(W0 − v′n)] − (p_s/π_s) exp[−(Yn)_s], where c0 = W0 − v′n and c1s = (Yn)_s.
Each u_s(·) is strictly increasing and strictly
concave, infinitely differentiable and additively separable. Using the fact that p is strictly positive and
the relation v′ = p′Y you can show (and you should) that with this utility function the FOCs for a
maximum (which are necessary and sufficient by concavity) are satisfied at n = 0, therefore this investor
would find an internal optimum.
More intuitively…the absence of arbitrage means that consumption at date 1 or in any state at date 2 has
a positive price, i.e. it can only be increased at the expense of consumption elsewhere, either at t=1 or in
some t=2 state. So, pick any strictly concave strictly increasing utility function with u′(-∞) = ∞ and u′(∞)
= 0 and standard convex programming arguments show that this function will have an interior optimum
due to the tradeoffs implied by the positive state prices.
1
Example: Let Z =   . Thus, our set of assets is a simple gamble. Clearly no arbitrage
 2
opportunities exist. What is the supporting price vector?
Z′p = 1  p1 + 2p2 = 1 or p1 = 1 – 2p2

For any p2 with 0 < p2 < ½ these define strictly positive supporting state prices.
 .8   .5 
p1 =   and p2 =   both support this market.
 .1   .25 
  1
It is also clear that p3 =   also supports this market. What does this mean?
1
The result doesn’t say that there are no non-positive supporting price vectors, only that the
existence of one strictly positive price vector is equivalent to the absence of arbitrage.
 In general, the supporting state prices are not unique. Z is an S×N matrix, Z′p = 1 places only (if N
< S) N restrictions on S prices leaving S – N degrees of freedom and we will commonly have at least
one non-positive vector p that satisfies this equation, even if there is an Absence of Arbitrage.
 If state s is insurable, then the element ps of p is unique across all supporting vectors p.
Recall that a state being insurable means: ∃ η_s s.t. Zη_s = i_s
• Thus, for all supporting vectors p (not necessarily positive)
p′(Zη_s) = p′i_s
(p′Z)η_s = p′i_s
1′η_s = p_s indicating that p_s is the price of insurance against state s occurring.
• This is true for all supporting vectors p and if the law of one price holds (and it does since there is a
supporting price vector) then 1′η_s = p_s is the same for all p such that Z′p = 1 and all η_s such that
Zη_s = i_s so that p_s is unique.
 If Z has full row rank, i.e. all rows of Z are linearly independent so all states are insurable (i.e. the
market is complete) then p itself is unique (requires N ≥ S). Full row rank implies there exists a
right inverse, R = Z′(ZZ′)⁻¹, for Z so p is unique:
p′Z = 1′ ⇒ p′Z(Z′(ZZ′)⁻¹) = 1′(Z′(ZZ′)⁻¹)
p′I = p′ = 1′(Z′(ZZ′)⁻¹) = 1′R which is uniquely determined by R.
Riskless Asset:
If there exists a riskless asset, the sum of all state prices must equal 1/R = 1/(1 + rf) for all supporting price
vectors p.
Assume there exists a wR such that 1′wR = 1 and ZwR = R1. Then,
p′(ZwR) = (p′Z)wR
= 1′wR
= 1 and,
p′(ZwR) = p′(R1)
= R(p′1)
= R(Σ_s p_s) so,
1 = R(Σ_s p_s) ⇒ Σ_s p_s = 1/R ∀ p s.t. Z′p = 1
Example: re-examined
$$\text{Let } Z = \begin{pmatrix} 2 & 1 \\ 2 & 3 \\ 1 & 2 \end{pmatrix}$$
From Z′p = 1 we found p1 = ¼ + ¼ p3, p2 = ¼ − ¾ p3 and p3 = p3.
For p > 0 we require 0 < p3 < 1/3. Look at the ‘edges’ of this range for p3…
For p3 “near” zero: p3 ≈ 0, p1 ≈ ¼, p2 ≈ ¼. And, Σ_s p_s ≈ ½ ⇒ R ≈ 2
For p3 “near” 1/3: p3 ≈ 1/3, p1 ≈ 1/3, p2 ≈ 0. And, Σ_s p_s ≈ 2/3 ⇒ R ≈ 1.5
Recall that earlier using this example we found no riskless asset but bounds on the shadow
riskless return given by: R̄ = 2 and R̲ = 1.5. Thus, these bounds are ‘tight’ in the sense that we
can find a vector p with Z′p = 1 such that 1/Σ_s p_s = R for any R in the interval R̲ < R < R̄.
This is generally true; here it occurs because Σ_s p_s = ½ + ½ p3 is continuous and monotonic
in p3.
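The tightness of the bounds can be seen by sweeping p3 across the positive range and computing the implied shadow riskless return 1/Σ_s p_s. This is an illustrative sketch; the function name and the particular grid of p3 values are assumptions.

```python
from fractions import Fraction as F

def implied_R(p3):
    """Shadow riskless return 1 / sum(p) for the supporting vector indexed by p3."""
    p = [F(1, 4) + F(1, 4) * p3, F(1, 4) - F(3, 4) * p3, p3]
    assert all(x > 0 for x in p), "p3 must lie strictly between 0 and 1/3"
    return 1 / sum(p)

# p3 = 1/30, 2/30, ..., 9/30, all inside the open interval (0, 1/3)
values = [implied_R(F(k, 30)) for k in range(1, 10)]
```

Every value falls strictly between 1.5 and 2 and the sequence decreases as p3 rises, consistent with Σ_s p_s = ½ + ½ p3 being monotonic in p3.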
Representation Theorem:
The following are equivalent:
(1) The existence of a positive linear pricing rule.
(2) The existence of positive ‘risk neutral’ probabilities and an associated riskless rate.
(3) The existence of a positive state price density.
(1) Linear Pricing Rule, our basic representation
Any asset or portfolio is correctly priced by p:
Y′p = v or Z′p = 1
So, for any portfolio n or w…
p′Yn = v′n, p′Zw = 1′w = 1
or,
$$\sum_{s=1}^{S} p_s (Yn)_s = \sum_{i=1}^{N} n_i v_i, \qquad \sum_{s=1}^{S} p_s (Zw)_s = 1$$
(2) Risk Neutral Probabilities (indicated via the *)
The current price of any asset or portfolio is given by the expected payoff under the risk
neutral probabilities discounted by the associated risk free rate. So,
$$v'n = \frac{1}{R^*}E^*(Yn) = \frac{1}{R^*}\pi^{*\prime}Yn, \qquad 1 = \frac{1}{R^*}E^*(Zw)$$
or,
$$\sum_{i=1}^{N} n_i v_i = \frac{1}{R^*}\sum_s \pi^*_s (Yn)_s, \qquad 1 = \frac{1}{R^*}\sum_s \pi^*_s (Zw)_s$$
To show the equivalence between (1) and (2) simply set π*_s = p_s/Σ_σ p_σ and R* = 1/Σ_s p_s, or p_s = π*_s/R*.
Clearly, if the economy has a riskless asset then R* = R for all valid p.
 Here, as in the proof of existence of the state prices, we simply require that all traders believe that
the same set of states are possible. For equivalence we require that the same states have positive
probability under both measures.
 Every investor agrees on the set of valid p’s (if all believe the same set of states are possible) so all
will necessarily agree on the set of valid risk neutral probabilities. Thus, all investors price assets the
same under both approaches.
 The use of risk neutral probabilities can also be thought of as developing a market based certainty
equivalent measure for any risky asset. Since it is a “certainty equivalent,” the proper discount rate is
the associated risk free rate.
 “Stochastic Discount Factor” – if we define ps = 1/Rs as a discount rate appropriate for state s cash
flows, we see that the standard MBA presentation of valuation is an aggregated version of the linear
pricing rule. There, v = E(CashFlow)/(1 + r(risk)). Here, we explicitly recognize the state dependence of the
cash flows, and discounting each at its appropriate state dependent rate eliminates the need to risk
adjust our discount rate. The size of the cash flows state-by-state does this for us as opposed to
considering only expected cash flow and a risk adjusted discount rate that applies to the expectation.
This suggests that the relation between state contingent payoffs for an asset, state probabilities, and
state prices will be important in the risk adjusting of a discount rate.
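The change of representation from state prices to risk neutral probabilities is a one-line normalization. The sketch below uses the supporting vector with p3 = 1/5 from the 3-state example and an equally weighted portfolio; both choices, and the function name, are assumptions made for illustration.

```python
from fractions import Fraction as F

def risk_neutral(p):
    """Map state prices p to (risk neutral probabilities, associated R*)."""
    total = sum(p)
    return [ps / total for ps in p], 1 / total

Z = [[2, 1],
     [2, 3],
     [1, 2]]
p = [F(3, 10), F(1, 10), F(1, 5)]     # supporting prices with p3 = 1/5

pi_star, R_star = risk_neutral(p)

# Price a portfolio as the discounted risk neutral expectation of its return
w = [F(1, 2), F(1, 2)]
payoff = [row[0] * w[0] + row[1] * w[1] for row in Z]
price = sum(q * x for q, x in zip(pi_star, payoff)) / R_star
```

The probabilities sum to one, R* is 5/3, and the portfolio is priced at 1, the same answer the linear pricing rule p′Zw gives.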
(3) State Price Density
The current price of any asset or portfolio is given by the expectation of the product of the state price
density (λ) and the asset’s payoff.
v′n = E(λYn), 1 = E(λZw) (Note no *’s)
$$\sum_{i=1}^{N} v_i n_i = \sum_s \pi_s \lambda_s (Yn)_s, \qquad 1 = \sum_s \pi_s \lambda_s (Zw)_s$$
Equivalence follows from defining λ_s = p_s/π_s or π_s λ_s = p_s.
Note: E(λ) = 1/R if there exists a riskless asset or 1/R* if not.
• Clearly, λ is positive for all states s iff p is positive. This representation is most valuable when we
move to a continuum of states since p(s) and π(s) may be zero for all states s yet λ(s) may be well
defined.
• Note: we can write 1 = E(λZw) as: 1 = E(λ)E(Zw) + cov(λ, Zw) or,
E(Zw) = (1 − cov(λ, Zw))/E(λ) = R – R·cov(λ, Zw) (assume there exists a riskless asset)
• So, if cov(λ, Zw) = 0, the asset has no risk premium. In other words, the same message we see in
other asset pricing frameworks is illustrated here: (1) some risk is not priced, (2) the expected return
on risky assets is the risk free return plus a premium, and (3) marginal risk is determined by
covariances. Note also that the correlation between the state price density (state prices and
probabilities) and payoffs appears as was suggested above.
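The covariance form of the pricing equation can be verified on a small market. Everything in the sketch below is hypothetical: two equally likely states, a riskless return of 1.1, and a risky asset returning 2 or ½; the state prices are those that price both assets at 1.

```python
from fractions import Fraction as F

pi = [F(1, 2), F(1, 2)]          # assumed actual state probabilities
R = F(11, 10)                    # assumed gross riskless return
z = [F(2), F(1, 2)]              # assumed risky asset return by state

# State prices solving p1 + p2 = 1/R and 2*p1 + (1/2)*p2 = 1
p = [F(4, 11), F(6, 11)]

# State price density: lambda_s = p_s / pi_s
lam = [ps / q for ps, q in zip(p, pi)]

def E(x):
    """Expectation under the actual probabilities."""
    return sum(q * v for q, v in zip(pi, x))

cov = E([l * v for l, v in zip(lam, z)]) - E(lam) * E(z)
expected_return = R - R * cov
```

The covariance is negative (the asset pays more in the cheaper state), so the expected return of 1.25 exceeds the riskless return of 1.1.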
Idiosyncratic Risk – Illustration:
Suppose payoffs can be written as: a + f(x̃1, …, x̃N) + ε̃ where E[f(·)] = E[ε̃] = 0 (all expectations
are reflected in a). Then, the price of any asset is given by:
$$v = \frac{E^*(a + f + \varepsilon)}{R^*} = \frac{a + E^*(f) + E^*(\varepsilon)}{R^*}$$
If, under the risk neutral probabilities, the expectation of ε̃ is also zero (E*[ε] = 0) then the risk or
variability of this component of returns does not affect this asset’s price, i.e. the risk of ε̃ does not
imply a differential expected return.
Looked at another way (using the state price density, and assuming there exists a riskless asset):
$$v = E[(a + f + \varepsilon)\lambda] = \frac{a}{R} + E[f\lambda] + E[\varepsilon\lambda]$$
Consider…
$$E[\lambda\varepsilon] = \sum_s \pi_s \lambda_s \varepsilon_s = \sum_s \pi_s \frac{p_s}{\pi_s}\varepsilon_s = R^{-1}\sum_s \frac{p_s}{\sum_\sigma p_\sigma}\varepsilon_s = R^{-1}\sum_s \pi^*_s \varepsilon_s = R^{-1}E^*[\varepsilon]$$
which is 0 iff E*[ε] = 0.
So, if E*[ε] = 0 then E[λε] = E[λ]E[ε] + cov(λ, ε) = 0 and so it must be that
cov(λ, ε) = 0 since we assumed that E[ε] = 0.
• Risk is not priced – it carries no risk premium – if it is uncorrelated with the state price
density, λ.
Note – Almost none of what we have said so far has had to do with actual or subjective probabilities of
the states – a seemingly strange omission when talking about asset pricing.
The message is this…Dominance and arbitrage are dependent upon possibilities not probabilities – state
by state comparisons (with no regard for the likelihood of each state). Though this is, in some sense,
blunt, it carries us a long way.
3 The General Portfolio Problem

The General Consumption Portfolio Problem:
Max_{w,c0} E[U(c0, c1)] subject to: c1s = (W0 − c0)w′z_{s·} ∀ s
1′w = 1 (a scalar)
What are the constraints? – Technology & Budget

Why isn’t it generally possible to allow our agent unrestricted choice over consumption levels c0 and c1s?
We may alternatively write the technology constraint as a vector constraint:
c1 = (W0 − c0)Zw or W̃1 = (W0 − c0)w′z̃ if we write in terms of wealth at time 1.
Assuming a state independent von Neumann-Morgenstern utility function of wealth (as is standard) we
simplify the problem by substituting the technology constraint into the maximand:
Max_{w,c0} E[U(c0, (W0 − c0)w′z̃)] subject to: 1′w = 1
so L = E[U(c0, (W0 − c0)w′z̃)] + μ(1 − 1′w), where μ is the Lagrange multiplier on the budget constraint.
We have a concave objective function with a linear constraint.

The First Order Conditions, which are necessary and sufficient given this structure, are written
(with μ the Lagrange multiplier on 1′w = 1):
(1) ∂L/∂c0 = E[U1(·) − U2(·)w*′z̃] = 0
(2) ∂L/∂w = E[U2(·)(W0 − c0)z̃] − μ1 = 0
((2) is a vector of equations, one for each asset i, ∂L/∂w_i, that hold at the optimal w)
(3) ∂L/∂μ = 1 − 1′w* = 0
Now, rearrange some equations…
(2′) premultiply (2) by w*′ and use (3): E[U2(·)(W0 − c0)w*′z̃] = μ
(1′) from (1) and (2′): E[U1(·)] = E[U2(·)w*′z̃] = μ/(W0 − c0)
(1′) gives μ (the Lagrange multiplier on 1′w = 1) as proportional to expected marginal utility of
current consumption and to the expected marginal utility of invested wealth or savings (i.e. (W0 – c0)).
The Message: consume from endowed wealth until the expected marginal utility of current consumption
and savings (the second part of (1′) is the derivative of the Lagrangian with respect to (W0 – c0)) are equal
(that’s (1′)), then allocate savings across the assets until each asset gives an equal contribution to
expected marginal utility of future wealth (that’s the set of equations that make up (2)).
In the study of finance, we concentrate on the second problem of allocating after consumption wealth
among assets and often assume the consumption decision can be made independently (or is taken as
given), so the concentration is on versions of (2) – become familiar with it!
To simplify: Consider a derived utility function over returns defined as
u(z̃) ≡ u(w′z̃; W0, c0) ≡ U(c0, W̃1) = U(c0, (W0 − c0)w′z̃); by taking W0 and c0 as given we are
isolating the investment or portfolio choice decision. Then,
u′(z) = ∂U/∂(w′z̃) = (W0 − c0)U2(·)
Asset by asset, equation (2) becomes E[u′(w*′z̃)z̃_i] = μ ∀ assets i (μ being the Lagrange multiplier
on 1′w = 1) – we might think of this as normalizing after consumption (date t = 0) wealth (savings) to 1.
Alternatively, the problem can be stated (using utility of returns):
Max_w E[u(w′z̃)] subject to 1′w = 1
L = E[u(w′z̃)] + μ(1 − 1′w), with μ the Lagrange multiplier
The FOCs are:
∂L/∂w = E[u′(w*′z̃)z̃] − μ1 = 0 (again, a vector of equations, one for each asset)
plus, 1′w* = 1 again
Each equation is: E[u′(z̃*)z̃_i] = μ ∀ assets i
This is an expression we will see again and again.
The expression says essentially that the expected return on any asset is related to the covariance of the
marginal utility of the return of an optimal portfolio with the return on that asset:
$E[u'(z^*)\,z_i] = \lambda \;\Rightarrow\; \lambda = E[u'(z^*)]\,E[z_i] + \mathrm{Cov}(u'(z^*), z_i)$
This expression might already look familiar (from our prior discussions). Rewrite the equations:
$E[u'(\tilde{z}^*)\,\tilde{z}_i] = \lambda \;\; \forall i$
by assuming a discrete number of states (with probabilities $\pi_s$) and writing out the expectation:
$\sum_{s=1}^{S} \pi_s\,u'(z^*)_s\,z_{si} = \lambda \;\; \forall i$ or,
$\sum_{s=1}^{S} \dfrac{\pi_s u'_s}{\lambda}\,z_{si} = 1 \;\; \forall i$; this is a supporting price equation with…
$p_s = \dfrac{\pi_s u'_s}{\lambda} > 0$. Since each component is positive we can alternatively write
$\Lambda_s = \dfrac{u'_s}{\lambda}$ (state price density), so we have seen the FOC before.
Or, write $E[\Lambda z_i] = 1$ as $1 = E[\Lambda]\,E[z_i] + \mathrm{Cov}(\Lambda, z_i)$ for another way to see that Λ (the state price
density) is proportional to marginal utility of return on an optimal portfolio (see the similar equation
above) or equals the marginal rate of substitution between consumption at time 0 and consumption in
state s if we can identify λ as before.
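The supporting price equation can be checked numerically. The following is a minimal sketch in Python, not part of the derivation above: the two-state, two-asset market, the probabilities, the choice of log utility, the bisection bracket, and every function name are assumptions made purely for illustration.

```python
# Illustrative two-state, two-asset market (all numbers are hypothetical).
probs = [0.5, 0.5]                      # actual state probabilities pi_s
returns = [[1.30, 1.00],                # returns[s][i]: gross return of asset i in state s
           [0.80, 1.05]]


def marginal_utility(z):
    """u'(z) for log utility, u(z) = ln(z) (an assumed utility function)."""
    return 1.0 / z


def portfolio_returns(w):
    """State-by-state return of the portfolio with weights (w, 1 - w)."""
    return [w * r[0] + (1.0 - w) * r[1] for r in returns]


def foc_gap(w):
    """E[u'(z*) (z_1 - z_2)]: zero exactly when both assets share the same lambda."""
    zs = portfolio_returns(w)
    return sum(p * marginal_utility(z) * (r[0] - r[1])
               for p, z, r in zip(probs, zs, returns))


def solve_weight(lo=0.0, hi=4.0, tol=1e-12):
    """Bisection on the decreasing FOC gap; the bracket suits these numbers only."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc_gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


w_star = solve_weight()
z_star = portfolio_returns(w_star)
# lambda = E[u'(z*) z_i], computed here from asset 1.
lam = sum(p * marginal_utility(z) * r[0] for p, z, r in zip(probs, z_star, returns))
# Supporting state prices p_s = pi_s u'_s / lambda.
state_prices = [p * marginal_utility(z) / lam for p, z in zip(probs, z_star)]
# Each asset's payoff valued at the state prices should equal 1.
asset_values = [sum(q * r[i] for q, r in zip(state_prices, returns)) for i in range(2)]
```

Under these assumptions both entries of `asset_values` should equal one up to rounding, and the state prices are strictly positive, as the FOC requires.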
If there is a riskless asset, $E[u'(z^*)\,R] = \lambda$ so, $E[u'(z^*)(z_i - R)] = 0 \;\; \forall i$
and, $\sum_{s=1}^{S} \dfrac{\pi_s u'_s}{\lambda}\,R = 1$ or $\sum_{s=1}^{S} p_s = E[\Lambda] = \dfrac{1}{R}$ as before.
Example: Risk neutral agent and there exists a riskless asset. If u(·) is risk neutral (linear), u′(·) is a
constant. So, $p_s = \pi_s u'/\lambda$. Thus, state prices are proportional to the actual probabilities.
$\sum_s p_s = \dfrac{u'}{\lambda} = \dfrac{1}{R}$ and since $\pi^*_s = \dfrac{p_s}{\sum_s p_s}$ we know that $\pi^*_s = \pi_s$. The risk neutral probabilities equal
the actual probabilities, so the true expected return on all assets must be R.
This should not be too surprising. From $\sum_{s=1}^{S} \dfrac{\pi_s u'_s}{\lambda}\,\tilde{z}_{si} = 1 \;\; \forall i$ and $\dfrac{u'}{\lambda} = \dfrac{1}{R}$ we can see that the
FOC is, in this example, equivalent to $\sum_s \pi_s z_{si} = E(\tilde{z}_i) = R \;\; \forall i$.
So, $E[z_i] = R$ under the actual probabilities if there is a single risk neutral agent in the market (facing no
trading restrictions: short sales constraints or position restrictions). A risk neutral agent doesn’t require
compensation for risk and buys or sells each asset until his FOC holds for each asset. Note that here the
behavior of the agents determines the relation between prices and payoffs, whereas before we
examined the relation between given prices and payoffs considering only that this relation did not allow
for arbitrage opportunities.
Now consider the state price density:
$\Lambda_s = \dfrac{p_s}{\pi_s} = \dfrac{u'}{\lambda}$ = a constant, so, for each state s, $E[\Lambda] = E\left[\dfrac{u'}{\lambda}\right] = \dfrac{u'}{\lambda} = \dfrac{1}{R}$. We write $1 = E(\Lambda z_i)$ when
Λ is not a constant so this example again illustrates why Λ is called a “stochastic discount factor.”
Example: Now, assume u(·) is quadratic: $u(z) = az + \tfrac{1}{2}bz^2$ with b < 0 for concavity (we must also
constrain the range of z to ensure monotonicity). In this case, u′(z) = a + bz is linear.
The equation $E[u'(\tilde{z}^*)\,\tilde{z}_i] = \lambda$ becomes
$E[(a + bz^*)\,z_i] = \lambda$ or,
$a\bar{z}_i + b[\mathrm{Cov}(z^*, z_i) + \bar{z}^*\bar{z}_i] = \lambda$ or,
$\bar{z}_i = \dfrac{\lambda - b\,\mathrm{Cov}(z^*, z_i)}{a + b\bar{z}^*}$
Thus, the expected return on any asset is linearly related to its covariance with an optimal portfolio’s
returns – here, wealth/return on an optimal portfolio is a sufficient statistic for marginal utility. (There
is a negative sign in the expected return, but recall that b < 0.)

If there is a riskless asset, $R = \dfrac{\lambda}{a + b\bar{z}^*}$ since the riskless asset has zero covariance with all $z_i$.
So, the expected return relation becomes $\bar{z}_i = R - \dfrac{Rb}{\lambda}\mathrm{Cov}(z^*, z_i)$, which must hold for any asset or
portfolio. Thus it must hold for $z^*$, so we write: $\bar{z}^* = R - \dfrac{Rb}{\lambda}\mathrm{Var}(z^*)$. Thus $-\dfrac{Rb}{\lambda} = \dfrac{\bar{z}^* - R}{\mathrm{Var}(z^*)}$.
Substituting this into the equation for the expected return on asset i we get
$\bar{z}_i = R + \dfrac{\mathrm{Cov}(z^*, z_i)}{\mathrm{Var}(z^*)}(\bar{z}^* - R) = R + \beta_i(\bar{z}^* - R)$
Except for the * rather than an M (to indicate the “market” portfolio), this is the CAPM pricing relation.
We will see what allows the switch from * to M later.
More generally:
$E[u'(z^*)\,z_i] = \lambda$
$\mathrm{Cov}(u', z_i) + \bar{u}'\bar{z}_i = \lambda$
$\bar{z}_i = \dfrac{\lambda - \mathrm{Cov}(u', z_i)}{\bar{u}'}$ Why is there a negative sign?
So, it is (the negative of) the covariance with marginal utility of return on an optimal portfolio that is
important.
With a riskless asset: $R = \dfrac{\lambda}{\bar{u}'}$ and $\bar{z}_i = R - \dfrac{R}{\lambda}\mathrm{Cov}(u', z_i)$. Again this must hold for $z^*$ so
$\bar{z}^* = R - \dfrac{R}{\lambda}\mathrm{Cov}(u', z^*)$ and we can solve to find that $\bar{z}_i = R + \dfrac{\mathrm{Cov}(u', z_i)}{\mathrm{Cov}(u', z^*)}(\bar{z}^* - R)$. This illustrates one
of the main challenges in deriving useful asset pricing equations.
When we consider the Capital Asset Pricing Model (“CAPM”), we will look at a full portfolio problem
with many assets. For now, let’s consider Properties of Simple Portfolios:
1 risky asset with return $\tilde{z}$ and one riskless asset with return R
Let w be the portfolio weight on the risky asset.
Define, for convenience, the excess return as $\tilde{x} = \tilde{z} - R$
The investor’s problem is:
$\max_w E[u(w'\tilde{z})]$ subject to 1′w = 1 (i.e. the portfolio weights are w and 1 – w).
Substitute the budget constraint into the maximand to get:
$\max_w E[u(w\tilde{z} + (1 - w)R)] = E[u(w\tilde{x} + R)]$
The FOC is:
$E[u'(w^*\tilde{x} + R)\,\tilde{x}] = 0$ or,
$E[u'(w^*z + (1 - w^*)R)(z - R)] = 0$, just a simple version of the general FOC.
Evaluate this FOC at w = 0 ⇒ $E[u'(R)\,\tilde{x}] = u'(R)\,E(x)$ (since u′(R) is a constant). This is greater
than or less than zero as E(x) is greater than or less than zero (or as E(z) is greater than or less than R).
We usually assume that E(z) > R so that the FOC is greater than zero at w = 0. Now examine the FOC: since
u′ is a decreasing function of return, as w is increased above 0 the small realizations of x are weighted
more heavily than the large realizations of x in the expectation, reducing the value of the FOC. So, if
E(x) > 0, the FOC is positive at w = 0 and some w > 0 solves it. Therefore, all risk averse investors have demand
for the risky asset of the same sign: positive if $\bar{z} > R$ (and negative if $\bar{z} < R$).
Alternatively, the 2nd order condition is: $E[u''(wx + R)\,x^2] < 0$ for all risk averse investors, so the
FOC is decreasing in w. So, if at w = 0, the FOC is positive, it is zero at some w* > 0.
In the simple portfolio problem, we have full insurance available, so using the Arrow-Pratt measure of
risk aversion is appropriate. If agent A is more risk averse than agent B, how do you think they should
behave towards holding the risky asset in this simple setting?
Consider uA and uB such that there exists some G with G′>0 and G″<0 and uA=G(uB). So, A is more
risk averse than B in the A-P sense.
The FOC for A is written: $E[u_A'(wx + R)\,\tilde{x}] = E[G'(u_B)\,u_B'(\cdot)\,\tilde{x}]$
...evaluate this at $w_B^*$ – if G′ is a constant (G(uB) = g×uB+k) this FOC equals zero since $w_B^*$ solves
$E[u_B'(\cdot)\,\tilde{x}] = 0$ (i.e. if uA is an affine transformation of uB they represent the same preferences and so
have the same optimum).
Since G″ < 0 (i.e. G′ is decreasing), low (negative) x and so low uB(wx+R) are weighted more heavily
than are high x (relative to B’s FOC). Thus, the FOC for A evaluated at $w_B^*$ is less than zero:
$E[u_A'(w_B^*\tilde{x} + R)\,\tilde{x}] < 0$. The same reasoning as before tells us that some $w_A^* < w_B^*$ sets A’s
FOC to zero. Thus, an investor who is more risk averse holds less of the risky asset.
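A small numerical sketch can make this comparison concrete. It is only an illustration under stated assumptions: power (CRRA) utilities with coefficients 2 and 5 stand in for uB and uA (the more curved one is a concave transformation of the other), the excess return takes two hypothetical values, and the function names and bisection bracket are invented for this example.

```python
def foc(w, gamma, R=1.05, h=0.30, k=-0.20, p=0.6):
    """E[u'(w x + R) x] for power utility, where u'(z) = z**(-gamma).

    x equals h with probability p and k with probability 1 - p (hypothetical values).
    """
    return p * h * (w * h + R) ** (-gamma) + (1 - p) * k * (w * k + R) ** (-gamma)


def optimal_weight(gamma, lo=0.0, hi=5.0, tol=1e-12):
    """Bisection: the FOC is positive at w = 0 (since E[x] > 0) and decreasing in w.

    The upper bracket keeps w * k + R positive for the numbers assumed above.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, gamma) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


w_B = optimal_weight(gamma=2.0)   # investor B: less risk averse
w_A = optimal_weight(gamma=5.0)   # investor A: more risk averse
```

With these inputs both weights are positive and `w_A` comes out below `w_B`, in line with the argument above.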
Now, consider a single investor, so hold the utility function constant, but look at different wealth levels.
Why? Because A(W0) depends on W0. Do we get what we think we should?
Write final wealth as $W_o(w\tilde{x} + R) = W_o(wz + (1 - w)R)$ (ignore current consumption).
Thus, to increase initial wealth we can increase W0 or equivalently increase z and R together: $\tilde{z} \to \tilde{z} + a$
and $R \to R + a$. Thus, $\tilde{x} = \tilde{z} - R$ stays the same. So, we can leave $\tilde{x}$ unchanged and increase R (i.e. a
comparative static change in R leaving the distribution of $\tilde{x}$ fixed is equivalent to a change in W0). We
decrease the price of the safe asset and leave the reward for bearing risk the same (thus decreasing the
price of the risky asset as well).
Look at our FOC – we normalized W0 to be equal to 1 – and do a comparative static change in R.
Think of $E[u'(w^*\tilde{x} + R)\,\tilde{x}] = 0$ as a level set of the form F(w,R) = 0.
The implicit function theorem tells us that: $\dfrac{\partial w^*}{\partial R} = -\dfrac{\partial F/\partial R}{\partial F/\partial w}$
$\left.\dfrac{\partial w^*}{\partial R}\right|_{\tilde{x}} = -\dfrac{E[u''(\cdot)\,\tilde{x}]}{E[u''(\cdot)\,\tilde{x}^2]} = \dfrac{E[A(\cdot)\,u'(\cdot)\,\tilde{x}]}{E[u''(\cdot)\,\tilde{x}^2]}$, where A(·) = −u″(·)/u′(·) is absolute risk aversion.
…the denominator is negative, the numerator is negative or positive depending on whether A(·) is
decreasing or increasing. What if A(·) is constant? Thus, decreasing absolute risk aversion is sufficient
for the optimal amount of the risky asset, w*, to be increasing in initial wealth. The economic
interpretation…the wealth effect (driven by the dependence of A(.) on wealth) drives this move. There
is no substitution effect as the prices of both the safe and risky assets have declined.
Now, suppose that we instead hold the distribution of z fixed.

What will an increase in R do?
There is still an increase in wealth; however, it will now also decrease the reward for risk bearing
(introduce a substitution effect) and reduce the investor’s interest in holding the risky asset.
The FOC: E[u′(wz + (1-w)R)(z-R)] = 0
$\left.\dfrac{\partial w^*}{\partial R}\right|_{\tilde{z}} = -\dfrac{E[u''(wz + (1 - w)R)(z - R)](1 - w) - E[u'(\cdot)]}{E[u''(\cdot)(z - R)^2]}$
$= \dfrac{E[A(\cdot)\,u'(\cdot)\,\tilde{x}](1 - w^*) + E[u'(\cdot)]}{E[u''(\cdot)\,\tilde{x}^2]}$
$= (1 - w^*)\left.\dfrac{\partial w^*}{\partial R}\right|_{\tilde{x}}$ + a negative term
Since the distribution of z remains fixed the wealth effect is not as powerful as when the distribution of
x was held fixed and we now see opposing forces at work. From above, if w* ≤ 1 and A(·) is increasing,
this entire expression is negative. So, a sufficient condition for an increase in R (holding the distribution
of z fixed) to decrease the demand for the risky asset is increasing absolute risk aversion. The income
and substitution effects work in the same direction. There are no simple sufficient conditions for a
positive reaction towards w* with a general utility function. Intuitively, if A(·) is decreasing, since we
have allowed the reward for risk bearing to decline, we would need A(·) to decrease fast enough to offset
the substitution effect (that negative term) in order to get $\left.\dfrac{\partial w^*}{\partial R}\right|_{\tilde{z}} > 0$.
Finally, consider a comparative static change in the expected return on the risky asset; shift the
distribution by adding a constant. Replace $\tilde{z}$ with $\tilde{z} + \delta$, or $R + \tilde{x} + \delta$ (so $\tilde{x}$ becomes $\tilde{x} + \delta$),
and see what happens as δ changes from zero holding R fixed.
The FOC is now: $E[u'(R + w^*(\tilde{x} + \delta))(\tilde{x} + \delta)] = 0$
$\left.\dfrac{\partial w^*}{\partial\delta}\right|_{R} = -\dfrac{E[u''(\cdot)\,w^*(\tilde{x} + \delta) + u'(\cdot)]}{E[u''(\cdot)(\tilde{x} + \delta)^2]}$
Evaluate this at δ = 0:
$= \dfrac{E[A(\cdot)\,u'(\cdot)\,w^*\tilde{x} - u'(\cdot)]}{E[u''(\cdot)\,\tilde{x}^2]}$
Since we know w* > 0 (if E(x) > 0), the first term in the numerator is negative iff A(·) is decreasing. The
2nd term in the numerator and the denominator are negative. So, decreasing absolute risk aversion is
sufficient but not necessary for w* to be increasing in δ. We have increased the reward for risk bearing
(decreased the price of the risky asset) so for an investor with decreasing absolute risk aversion, the income and
substitution effects work together. It is not necessary since we could allow for “slowly” increasing A(·)
and still have w* increasing.
Example: CARA Utility $u(z) = -e^{-Az}$ with A > 0 (A is constant so there is no wealth effect)
The problem is: $\max_w E[u(z)] = E[-e^{-A(w\tilde{x} + R)}]$ …so the budget constraint is in the maximand.
We have normalized W0 = 1
Let $\tilde{x} \sim N(a, b)$ (note: b > 0 since it’s a variance and presume a > 0 as is common).
Given that $\tilde{x}$ is normally distributed (and so wx+R is normal), we can write the problem as:
$\max_w E[u(z)] = -\exp\left\{-A(wa + R) + \dfrac{A^2w^2b}{2}\right\}$,
by using the moment generating function for normal random variables. (Know this trick.)
Note: −A(wa+R) is the mean of the exponent −A(wx+R) and A²w²b is its variance.
Take minus the log of minus E[u(z)] (a monotonic transformation) and the problem becomes:
$\max_w A(wa + R) - \dfrac{A^2w^2b}{2}$
The FOC is: Aa – A²w*b = 0 or, $w^* = \dfrac{a}{Ab}$
Note: (1) $\dfrac{\partial w^*}{\partial A} < 0$
(2) w* > 0 iff a > 0 (as before)
(3) w* is independent of R (since A(·) is constant) and increasing in a
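The closed form can be sanity-checked against a brute-force search. The sketch below is illustrative only: the parameter values, the grid, and the function names are assumptions, and the expected utility is evaluated with the normal moment generating function exactly as in the example.

```python
import math


def cara_optimal_weight(A, a, b):
    """Closed-form w* = a / (A b) from the FOC A a - A^2 w* b = 0."""
    return a / (A * b)


def expected_utility(w, A, a, b, R):
    """E[-exp(-A (w x + R))] for x ~ N(a, b), using the normal MGF."""
    return -math.exp(-A * (w * a + R) + 0.5 * A**2 * w**2 * b)


def grid_argmax(A, a, b, R, step=0.01, n=301):
    """Best weight on a coarse grid 0, step, 2*step, ... (a crude numerical check)."""
    grid = [i * step for i in range(n)]
    return max(grid, key=lambda w: expected_utility(w, A, a, b, R))


# Hypothetical parameters: risk aversion A, mean a and variance b of the excess return.
A, a, b = 2.0, 0.08, 0.04
w_star = cara_optimal_weight(A, a, b)

# The grid search should land on the same weight for two different riskless returns,
# illustrating that w* does not depend on R under CARA utility.
w_grid_low_R = grid_argmax(A, a, b, R=1.03)
w_grid_high_R = grid_argmax(A, a, b, R=1.10)
```

Doubling A in `cara_optimal_weight` halves the weight, which is note (1) in numerical form.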
The amount of wealth w* in the risky asset depends on the mean and variance of excess returns (why?)
and on preferences – parameterized by A(·), the measure of absolute risk aversion.
We see that $\dfrac{\partial w^*}{\partial R} = 0$, which implies that the dollar amount invested in the risky asset is independent of
initial wealth. So, for all initial wealth levels the same dollar amount is put in the risky asset; the rest is
made up of a (positive or negative) position in the riskless asset.
Example: CRRA utility

We’ll use log utility for simplicity
Also, let $\tilde{x}$ = h > 0 with probability p
= k < 0 with probability 1-p (Why must k<0?)
And, writing utility of returns we have again normalized W0 = 1 to simplify the problem.
u(z) = Ln(wx + R) so,
$\max_w E[\ln(w\tilde{x} + R)]$
The FOC is: $E\left[\dfrac{\tilde{x}}{w^*\tilde{x} + R}\right] = 0$
Write out the expectation (substitute in h and k for x) and rearrange
$p\,\dfrac{h}{w^*h + R} + (1 - p)\,\dfrac{k}{w^*k + R} = 0$ or,
p(w*k + R)h + (1 - p)(w*h + R)k = 0
So, $w^* = \dfrac{pR(k - h) - kR}{hk} = \dfrac{-R[ph + (1 - p)k]}{hk} = \dfrac{-R\bar{x}}{hk} > 0$ if $\bar{x} > 0$ as before.
Note that now w* is not independent of initial wealth (here R), a general result for CRRA utility.
$\dfrac{\partial w^*}{\partial R}$ is in fact a constant, so the same proportion of initial wealth is put into the risky asset for all wealth
levels. Similarly, w* increases in R (since A(·) is decreasing) and $\bar{x}$.
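As a quick check on the algebra, the closed form can be plugged back into the FOC. This is a minimal sketch; the numbers for R, h, k and p are hypothetical and the function names are made up for the example.

```python
def log_utility_weight(R, h, k, p):
    """Closed-form w* = -R * E[x] / (h k) for the two-point excess return."""
    mean_x = p * h + (1 - p) * k
    return -R * mean_x / (h * k)


def foc(w, R, h, k, p):
    """E[x / (w x + R)], which should vanish at the optimum."""
    return p * h / (w * h + R) + (1 - p) * k / (w * k + R)


R, h, k, p = 1.05, 0.30, -0.20, 0.6          # hypothetical inputs with E[x] = 0.1 > 0
w_star = log_utility_weight(R, h, k, p)
foc_at_optimum = foc(w_star, R, h, k, p)

# Scaling R (the stand-in for initial wealth) scales w* by the same factor.
w_star_double = log_utility_weight(2 * R, h, k, p)
```

Here `foc_at_optimum` should be zero up to rounding and `w_star_double` should be twice `w_star`, the constant-proportion property noted above.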
Stochastic Dominance – a bit of a tangent

One portfolio dominates another if it always outperforms the other – state by state. To First Order
Stochastically Dominate (“FOSD”) a second portfolio, a portfolio need not always outperform the second;
rather, we require that the 1st portfolio’s probability of exceeding any given level of return is at least as large as
that of the second portfolio.
That is…if f(z1) and g(z2) are the marginal density functions for the returns on portfolios 1 and 2,
respectively, then portfolio 1 FOSD’s portfolio 2 if:
$\int_{-\infty}^{x} f(z_1)\,dz_1 \le \int_{-\infty}^{x} g(z_2)\,dz_2 \;\; \forall x$
or, $F(x) \le G(x) \;\; \forall x$ using their cumulative distribution functions.
We can write this in terms of random variables as:
$\tilde{z}_2 \overset{d}{=} \tilde{z}_1 + \tilde{\xi}$ where $\tilde{\xi}$ is a non-positive random variable
Dominance would be written:
$\tilde{z}_2 = \tilde{z}_1 + \tilde{\xi}$ where $\tilde{\xi}$ is a non-positive random variable (these are realizations)
This requirement of equality of distributions illustrates the important difference between FOSD and
dominance. The probabilities of the states are irrelevant when considering dominance, yet they are
crucial in FOSD (though the states themselves are not).
Consider two assets with returns {0, 2} and {0, 1}.

If prob(z1=0) ≤ prob(z2=0) so that prob(z1=2) ≥ prob(z2=1), then asset 1 FOSD’s asset 2. Only in
the case that z1=2 and z2=1 in the same state of nature (i.e. they are perfectly correlated) does asset 1
dominate asset 2.
In other words, if $\begin{pmatrix} 2 & 1 \\ 0 & 0 \end{pmatrix}$ (rows are states, columns are assets) represents the market returns, then asset 1 dominates asset 2 (also, 1 FOSD’s 2
– why?), so there is an arbitrage opportunity.
If $\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$ represents the market, then there is no dominance or arbitrage.
However, if prob(state 1) ≥ 0.5, then asset 1 FOSD’s asset 2.
Investors don’t disagree about dominance as long as they view the same set of states as possible. They
can, however, disagree about FOSD if they have different probability assessments.
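The distinction between dominance and FOSD in the two-asset example can be sketched in a few lines of Python. The helper names and the two probability vectors are assumptions chosen for illustration; the payoffs are the ones from the second market above.

```python
def cdf(outcomes, probs, x):
    """P(return <= x) for a discrete distribution."""
    return sum(p for z, p in zip(outcomes, probs) if z <= x)


def fosd(out1, probs1, out2, probs2):
    """True if distribution 1 first order stochastically dominates distribution 2,
    i.e. F(x) <= G(x) everywhere; for discrete returns the support points suffice."""
    points = sorted(set(out1) | set(out2))
    return all(cdf(out1, probs1, x) <= cdf(out2, probs2, x) + 1e-12 for x in points)


def dominates(payoffs1, payoffs2):
    """State-by-state dominance: never pays less and pays more in some state."""
    pairs = list(zip(payoffs1, payoffs2))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


# Asset 1 pays (2, 0) and asset 2 pays (0, 1) across states 1 and 2.
asset1, asset2 = [2, 0], [0, 1]
fosd_when_state1_likely = fosd(asset1, [0.6, 0.4], asset2, [0.6, 0.4])
fosd_when_state1_unlikely = fosd(asset1, [0.4, 0.6], asset2, [0.4, 0.6])
statewise = dominates(asset1, asset2)
```

Only the probabilities change between the two FOSD calls, yet the answer flips, while `statewise` ignores the probabilities altogether.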
An immediate result is that no non-satiated investor will ever hold all of his/her wealth in a/the risky
asset, $\tilde{z}_2$, that is first order stochastically dominated. This is because $\tilde{z}_2$ and $\tilde{z}_1 + \tilde{\xi}$ have the same
distributions, so: $E[u(z_2)] = E[u(z_1 + \xi)] < E[u(z_1)]$ by the non-positivity of $\tilde{\xi}$ and the strict
monotonicity of u(·).
Think of $\tilde{\xi}$ as a vector of all zeros and a loss of one dollar in one state – the idea is why take the chance of throwing
away a dollar when the two portfolios “cost” the same.
Alternatively, we can compare 2 “normal” distributions, where one is the other minus a constant:
Investors may hold FOSD’ed assets as part of their optimal portfolios (but will never hold FOSD’ed
portfolios). Recall dominated assets cannot exist or no investor has an optimal portfolio.
Why? Think of our example $\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$: when prob(s1) ≥ prob(s2), asset 1 FOSDs asset 2.
Asset 2, however, is a hedge against the risk of asset 1. It will be held in a positive amount by all strictly
risk averse investors regardless of the probabilities of the states. It is 2’s negative correlation with 1 that
gives it value – however, if you had to choose between them, nobody would choose asset 2.
CAPM Example – negative beta assets are useful and held in portfolios, but you would never put all of
your wealth in a negative beta portfolio. Why?
4 CAPM
What we want to do here is build some intuitions by presenting a special case of more general future
results on the risk/return relation.
Mean-Variance Analysis: The basis for an equilibrium pricing relation known as the Capital Asset Pricing
Model (“CAPM”) which relates a measure of the risk of an asset to the expected return of that asset.
Foundation: The risk of a portfolio can be measured by the variance of that portfolio’s return. The
idea is that there is a derived utility function over mean return and variance of return so that these two
parameters completely determine expected utility – $E[u(z)] = v(\bar{z}_p, \sigma_p^2)$ – where generally we think
that v1 > 0 and v2 < 0. This second condition is necessary for consistency with the maximization of
expected utility, the first, in the absence of a riskless asset, is not.
Notation: The matrix Z again represents our market, or we can use the random vector $\tilde{z}$. Note: if
there exists a riskless asset, it is not included in Z; we keep track of it “on the side.” $\bar{z}$ is a vector of
expected returns and Σ is the variance-covariance matrix of Z. For any portfolio w:
Expected Returns: $\bar{z}_w = w'\bar{z} + w_o R = \sum_{i=0}^{N} w_i\bar{z}_i$; $\bar{z}_o = R$; $\sum_{i=0}^{N} w_i = 1$
Variance of return: $\sigma_w^2 = w'\Sigma w = \sum_{i=1}^{N}\sum_{j=1}^{N} w_i w_j \sigma_{ij}$
Covariance Vector: $\sigma_w = \Sigma w$
Given our derived mean/variance utility function, our problem is to describe the mean-variance efficient
set of portfolios: the set of portfolios with the largest mean for a given level of variance. Any agent with
such a utility function will choose a portfolio from this set.

[Figure: Mean-Variance Efficient Portfolios]
We will find that it is easier to work with the larger set – the minimum-variance portfolios – the set of
portfolios with the smallest variance (or std. dev.) for each given level of expected return.

[Figure: Minimum Variance Portfolios]
We first consider the problem of deriving the minimum variance set in the absence of a riskless asset.
To derive the minimum variance set we minimize (by choice of a portfolio w) portfolio variance (a
quadratic function of w) subject to the linear (in w) constraints – that the portfolio’s expected return be
at a given level and the budget constraint. The problem is nicely behaved and the first-order conditions
are necessary and sufficient due to this special structure. Symbolically:
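$\min_w \tfrac{1}{2}w'\Sigma w$ (objective is quadratic in w)
subject to 1′w = 1 (λ) (budget constraint – linear in w)
$\bar{z}'w = \mu$ (γ) (a fixed expected return – linear in w)
Lagrangian:
$L = \tfrac{1}{2}w'\Sigma w + \lambda(1 - \mathbf{1}'w) + \gamma(\mu - \bar{z}'w)$
FOCs:
(1) $\partial L/\partial w = \Sigma w^* - \lambda\mathbf{1} - \gamma\bar{z} = 0$
(2) $\partial L/\partial\lambda = 0 \;\Rightarrow\; \mathbf{1}'w^* = 1$
(3) $\partial L/\partial\gamma = 0 \;\Rightarrow\; \bar{z}'w^* = \mu$
Rewrite (1)…
(1′) $w^* = \lambda\Sigma^{-1}\mathbf{1} + \gamma\Sigma^{-1}\bar{z}$ (note this is really w*(μ))
Now solve for λ(μ) and γ(μ) (the whole problem is based on a given μ)
(2′) $\mathbf{1}'w^* = 1 = \lambda\mathbf{1}'\Sigma^{-1}\mathbf{1} + \gamma\mathbf{1}'\Sigma^{-1}\bar{z}$
(3′) $\bar{z}'w^* = \mu = \lambda\bar{z}'\Sigma^{-1}\mathbf{1} + \gamma\bar{z}'\Sigma^{-1}\bar{z}$
Define: $A \equiv \mathbf{1}'\Sigma^{-1}\mathbf{1} > 0$, $B \equiv \mathbf{1}'\Sigma^{-1}\bar{z}$,
$C \equiv \bar{z}'\Sigma^{-1}\bar{z} > 0$, $\Delta \equiv AC - B^2 > 0$ (as long as $\bar{z} \ne k\mathbf{1}$)

so, (2″) 1 = λA + γB (3″) μ = λB + γC
$\lambda(\mu) = \dfrac{C - \mu B}{\Delta}$, $\gamma(\mu) = \dfrac{\mu A - B}{\Delta}$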
This lets us find the equation of the minimum variance set.
$\sigma^2(\mu) = w^{*\prime}\Sigma w^* = w^{*\prime}\Sigma(\lambda\Sigma^{-1}\mathbf{1} + \gamma\Sigma^{-1}\bar{z})$
$= \lambda w^{*\prime}\mathbf{1} + \gamma w^{*\prime}\bar{z}$
$= \lambda + \gamma\mu$
Substituting for λ and γ and rearranging we find:
$\sigma^2(\mu) = \dfrac{A\mu^2 - 2B\mu + C}{\Delta}$.
This is the equation of a parabola in (σ², μ) space. If we write this equation with μ as a function of σ, i.e. in
(σ, μ) space, it is the equation of a hyperbola.
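The steps above can be traced numerically. The following Python sketch uses only the standard library; the three-asset expected returns, the covariance matrix, the target mean, and the helper names (`solve`, `dot`, `min_var_portfolio`) are all assumptions made for illustration, not data from the course.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical inputs: expected returns and covariance matrix of three risky assets.
zbar = [1.05, 1.10, 1.15]
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
ones = [1.0, 1.0, 1.0]

inv_ones = solve(sigma, ones)      # Sigma^{-1} 1
inv_zbar = solve(sigma, zbar)      # Sigma^{-1} zbar
A, B, C = dot(ones, inv_ones), dot(ones, inv_zbar), dot(zbar, inv_zbar)
delta = A * C - B * B


def min_var_portfolio(mu):
    """w*(mu) = lambda Sigma^{-1} 1 + gamma Sigma^{-1} zbar."""
    lam = (C - mu * B) / delta
    gam = (mu * A - B) / delta
    return [lam * x + gam * y for x, y in zip(inv_ones, inv_zbar)]


mu = 1.12                                              # assumed target expected return
w = min_var_portfolio(mu)
variance = dot(w, [dot(row, w) for row in sigma])      # w' Sigma w computed directly
frontier_variance = (A * mu**2 - 2 * B * mu + C) / delta
```

The weights should sum to one, deliver the target mean, and have a directly computed variance that matches the frontier equation.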
Now we want to locate a specific portfolio or two. We can locate the global minimum variance
portfolio by noting that this portfolio solves the minimization problem with a slack return constraint: γ
= 0.
From $w^* = \lambda\Sigma^{-1}\mathbf{1} + \gamma\Sigma^{-1}\bar{z}$, if γ = 0, then $w_g = \lambda_g\Sigma^{-1}\mathbf{1}$. Also, $\gamma = \dfrac{\mu_g A - B}{\Delta} = 0$
or, $\mu_g A - B = 0$. So, $\mu_g = \dfrac{B}{A}$. Thus, B > 0 iff $\mu_g > 0$, which is what we think of as the typical case.
If $\mu_g = \dfrac{B}{A}$ then $\lambda_g = \dfrac{1}{A}$, which we find from either $\lambda = \dfrac{C - \mu_g B}{\Delta}$ or from $\mathbf{1}'w_g = 1$.
So, we have:
$w_g = \dfrac{\Sigma^{-1}\mathbf{1}}{A} = \dfrac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}$, $\mu_g = \dfrac{B}{A}$, and $\sigma_g^2 = \dfrac{1}{A}$ from wg′ Σ wg.
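Alternatively, we take the derivative of the variance equation to find its unconstrained minimum:
$\dfrac{\partial\sigma^2}{\partial\mu} = \dfrac{\partial}{\partial\mu}\left[\dfrac{A\mu^2 - 2B\mu + C}{\Delta}\right] = \dfrac{2A\mu - 2B}{\Delta} = 0$ at $\mu_g$.
$\mu_g = \dfrac{B}{A}$
Then, $\lambda = \dfrac{C - \mu_g B}{\Delta} = \dfrac{C - B^2/A}{AC - B^2} = \dfrac{1}{A}$,
$\gamma = \dfrac{\mu_g A - B}{\Delta} = \dfrac{B - B}{\Delta} = 0$ and,
$w_g = \lambda\Sigma^{-1}\mathbf{1} + \gamma\Sigma^{-1}\bar{z} = \dfrac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}$ as before.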
Two Fund Separation:
From $w^* = \lambda\Sigma^{-1}\mathbf{1} + \gamma\Sigma^{-1}\bar{z}$ we can see that all minimum variance portfolios can be formed through
portfolio combinations of 2 distinct portfolios. Since wg corresponds to one of these portfolios, it is
natural to look at the ‘other one’ implicit in the equation for w*.
That is, define $w_d \equiv \dfrac{\Sigma^{-1}\bar{z}}{\mathbf{1}'\Sigma^{-1}\bar{z}} = \dfrac{\Sigma^{-1}\bar{z}}{B}$
then, $w^* = (\lambda A)w_g + (\gamma B)w_d$ and
$\mathbf{1}'w^* = 1 = (\lambda A)\mathbf{1}'w_g + (\gamma B)\mathbf{1}'w_d = \lambda A + \gamma B = \dfrac{AC - \mu AB + \mu AB - B^2}{\Delta} = 1$
which verifies the proposition. Note that varying μ ∈ (−∞, ∞) and observing that λ and γ are monotonic
(linear) functions of μ also shows that all portfolio combinations of wg and wd are minimum variance
portfolios.
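Two fund separation is easy to verify on a toy example. In the sketch below the assets are assumed to be uncorrelated, so that applying Σ⁻¹ reduces to an element-wise division; that simplification, the numbers, and the function names are all assumptions made to keep the illustration short.

```python
# Hypothetical inputs; the covariance matrix is taken to be diagonal.
zbar = [1.05, 1.10, 1.15]
variances = [0.04, 0.09, 0.16]

inv_ones = [1.0 / v for v in variances]                 # Sigma^{-1} 1
inv_zbar = [z / v for z, v in zip(zbar, variances)]     # Sigma^{-1} zbar
A = sum(inv_ones)
B = sum(inv_zbar)
C = sum(z * x for z, x in zip(zbar, inv_zbar))
delta = A * C - B * B

w_g = [x / A for x in inv_ones]     # global minimum variance portfolio
w_d = [x / B for x in inv_zbar]     # the "other" fund


def multipliers(mu):
    """lambda(mu) and gamma(mu) from the two constraint equations."""
    return (C - mu * B) / delta, (mu * A - B) / delta


def frontier_weights(mu):
    """Direct solution w*(mu) = lambda Sigma^{-1} 1 + gamma Sigma^{-1} zbar."""
    lam, gam = multipliers(mu)
    return [lam * x + gam * y for x, y in zip(inv_ones, inv_zbar)]


def two_fund_weights(mu):
    """The same portfolio built as (lambda A) w_g + (gamma B) w_d."""
    lam, gam = multipliers(mu)
    return [lam * A * g + gam * B * d for g, d in zip(w_g, w_d)]


mu = 1.12                                   # assumed target expected return
direct = frontier_weights(mu)
blended = two_fund_weights(mu)
lam, gam = multipliers(mu)
fund_shares = (lam * A, gam * B)            # amounts placed in w_g and w_d
```

The two constructions agree asset by asset, and the two fund shares add up to one.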
We just saw that all min-var portfolios are portfolio combinations of wg and wd and all portfolio
combinations of these portfolios are min-var portfolios. If investors want to hold any min-var portfolio
they don’t need access to all tradable assets; they only need access to 2 mutual funds, wg and wd.
wg is the global minimum variance portfolio – can we similarly locate wd?
wd has expected return: $\mu_d = \bar{z}'w_d = \dfrac{\bar{z}'\Sigma^{-1}\bar{z}}{B} = \dfrac{C}{B}$
wd has variance of return: $\sigma_d^2 = w_d'\Sigma w_d = \dfrac{\bar{z}'\Sigma^{-1}\Sigma\Sigma^{-1}\bar{z}}{B^2} = \dfrac{\bar{z}'\Sigma^{-1}\bar{z}}{B^2} = \dfrac{C}{B^2}$
We verify that wd is a min-var portfolio by checking that it satisfies the equation $\sigma^2 = \lambda + \gamma\mu$:
$\sigma_d^2 = \dfrac{C - \mu_d B}{\Delta} + \dfrac{\mu_d A - B}{\Delta}\mu_d = \dfrac{C - C}{\Delta} + \dfrac{(AC/B - B)(C/B)}{\Delta} = \dfrac{C(AC - B^2)}{B^2\Delta} = \dfrac{C}{B^2}$
Further, we know that wd is on the upper limb of the hyperbola (in the “normal” case) since:
$\bar{z}_d - \bar{z}_g = \dfrac{C}{B} - \dfrac{B}{A} = \dfrac{\Delta}{AB}$. We know that A & Δ > 0
so $\bar{z}_d - \bar{z}_g$ is positive if B > 0. This is true if μg > 0 which is “normal.”
We may also use any two distinct min-var portfolios in place of wg and wd – they will “sketch out” or
span the entire min-var set, i.e. the minimum variance set is derived from portfolio combinations of any
two distinct minimum variance portfolios.
Let wa and wb be defined as: wa = awg + (1-a)wd
wb = bwg + (1-b)wd
with a ≠ b. wa and wb are min-var portfolios since they are portfolio combinations of wg and wd by construction.
From $w^* = \lambda A\,w_g + \gamma B\,w_d$ we can get:
$w^* = \dfrac{\lambda A - b}{a - b}\,w_a + \dfrac{a - \lambda A}{a - b}\,w_b$
by solving the equations for wa and wb for wg and wd and substituting these into the equation for w* and
remembering that λA + γB = 1.
Since the coefficients on wa and wb sum to one, the proposition is proved.
The portfolio weight of any asset is linear in μ along the min-var frontier. So, as we increase μ, the
portfolio weight on any asset either linearly increases or decreases when we stay in the set of minimum
variance portfolios.
$w^* = \lambda A\,w_g + \gamma B\,w_d$
$= \dfrac{AC - \mu AB}{\Delta}w_g + \dfrac{\mu AB - B^2}{\Delta}w_d$
$= \dfrac{AC - B^2}{\Delta}w_g + \dfrac{B^2}{\Delta}w_g - \dfrac{\mu AB}{\Delta}w_g + \dfrac{\mu AB}{\Delta}w_d - \dfrac{B^2}{\Delta}w_d$
$= w_g + \dfrac{(\mu AB - B^2)}{\Delta}(w_d - w_g)$
So,
$w_i^* = w_{ig} + \dfrac{(\mu AB - B^2)}{\Delta}(w_{id} - w_{ig})$
If B > 0, and since Δ > 0 (so μA − B > 0 for μ > μg), assets represented more heavily in wd than in wg are held in
larger and larger amounts in min-var portfolios as we increase μ above μg.
Also note that for each asset there is one min-var portfolio in which it has zero weight. In all portfolios
below (above if wid-wig is negative) it is sold short.
Covariance Properties of the Min-Var Portfolios:

Consider wg: if wg has different covariances with 2 distinct portfolios, then some combination of the three
will have a lower variance than does wg itself. But, we know this can’t happen. So, wg must therefore
have the same covariance with every asset or portfolio.
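Consider any portfolio, wp. Then,
$\mathrm{Cov}(\tilde{z}_g, \tilde{z}_p) = w_g'\Sigma w_p = \dfrac{\mathbf{1}'\Sigma^{-1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}\Sigma w_p = \dfrac{\mathbf{1}'w_p}{\mathbf{1}'\Sigma^{-1}\mathbf{1}} = \dfrac{1}{A} > 0$
This is true for any asset or portfolio. Note 1/A is also wg’s covariance with itself (i.e. its variance).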
Consider any two min-var portfolios: wa = (1-a)wg + awd and wb = (1-b)wg + bwd (without loss of
generality). Let’s assume a ≠ 0 and b ≠ 0 so we are not looking at wg in either case.
$\mathrm{Cov}(\tilde{z}_a, \tilde{z}_b) = (1 - a)(1 - b)\sigma_g^2 + ab\,\sigma_d^2 + [a(1 - b) + b(1 - a)]\sigma_{dg}$
$= (1 - a)(1 - b)\dfrac{1}{A} + ab\dfrac{C}{B^2} + (a + b - 2ab)\dfrac{1}{A}$
$= \dfrac{1}{A} + \dfrac{ab\Delta}{AB^2} \equiv \sigma_{ab}$
This is the covariance between any two min-var portfolios; it is completely determined by the choice of a
and b.
Fix any a ≠ 0 (again so wa ≠ wg); by choosing b we can get $\sigma_{ab}$ to be any value in (−∞, ∞). Thus, moving b
you pass through zero:
$0 = \dfrac{1}{A} + \dfrac{ab\Delta}{AB^2} \;\Rightarrow\; 0 = B^2 + ab\Delta$
or, $b = -\dfrac{B^2}{a\Delta} \ne 0$ (recall Δ = 0 only if $\bar{z} = k\mathbf{1}$)
So, if a > 0 then b < 0, and if wa is on the upper limb of the hyperbola, then the min-var portfolio with a
zero covariance with wa is on the lower limb and vice versa.
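A short sketch can confirm the zero-covariance companion numerically. As before, this is only an illustration: the assets are assumed uncorrelated (a diagonal Σ), and the inputs, the choice a = 1.5, and the function names are hypothetical.

```python
zbar = [1.05, 1.10, 1.15]          # hypothetical expected returns
variances = [0.04, 0.09, 0.16]     # hypothetical variances; covariances assumed zero

A = sum(1.0 / v for v in variances)
B = sum(z / v for z, v in zip(zbar, variances))
C = sum(z * z / v for z, v in zip(zbar, variances))
delta = A * C - B * B

w_g = [1.0 / (v * A) for v in variances]                 # Sigma^{-1} 1 / A
w_d = [z / (v * B) for z, v in zip(zbar, variances)]     # Sigma^{-1} zbar / B


def blend(a):
    """Min-var portfolio (1 - a) w_g + a w_d."""
    return [(1 - a) * g + a * d for g, d in zip(w_g, w_d)]


def covariance(w1, w2):
    """w1' Sigma w2 for the diagonal Sigma assumed above."""
    return sum(x * v * y for x, v, y in zip(w1, variances, w2))


a = 1.5
b = -B * B / (a * delta)            # weight on w_d in the zero-covariance companion
cov_ab = covariance(blend(a), blend(b))

# The general formula sigma_ab = 1/A + a b Delta / (A B^2), checked at an arbitrary b.
b_other = 0.7
formula_value = 1.0 / A + a * b_other * delta / (A * B * B)
direct_value = covariance(blend(a), blend(b_other))
```

With a > 0 the companion weight `b` is negative, and `cov_ab` is zero up to rounding. Because the assumed expected returns are close together, Δ is small and the companion is a very highly levered position.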
 Look at where wg and wd are and the weight on wd implied by a > 0 and b < 0.
In fact, it can easily be shown, by finding the slope of the frontier at a particular point and finding the
interception of the tangent line with the z axis, that the expected return on a portfolio with zero
covariance with a frontier portfolio is located at this interception.
Try this for wd for homework.
The covariance of a min-var portfolio with any other asset or portfolio:
Cov(zm, zp) = ? Let wm = mwg + (1-m)wd
wp is any asset or portfolio. Then,…
$\mathrm{Cov}(z_m, z_p) = w_m'\Sigma w_p$
$= m\,w_g'\Sigma w_p + (1 - m)\,w_d'\Sigma w_p$
$= \dfrac{m}{A} + (1 - m)\dfrac{\bar{z}'\Sigma^{-1}}{\mathbf{1}'\Sigma^{-1}\bar{z}}\Sigma w_p$
$= \dfrac{m}{A} + (1 - m)\dfrac{\bar{z}_p}{B}$
Thus, the covariance of the return on any asset or portfolio $\tilde{z}_p$ with the return on a min-var portfolio
$\tilde{z}_m$ is a function of the expected return on the arbitrary portfolio alone (and, of course, which min-var
portfolio is chosen). Picture!
This comes from $\sigma_g = \Sigma w_g = \dfrac{1}{A}\mathbf{1}$ and $\sigma_d = \Sigma w_d = \dfrac{\Sigma\Sigma^{-1}\bar{z}}{B} = \dfrac{\bar{z}}{B}$, so for any portfolio p with 1′wp = 1 we see that
$\sigma_{pd} = \dfrac{\bar{z}_p}{B}$ and (of course) $\sigma_{pg} = \dfrac{1}{A}$. The covariance of the return on any min-var portfolio and any
other portfolio is some combination of these two covariances.
With a Riskless Asset:

When we assume that there is a riskless asset in the economy, we change the minimization problem in
two ways: we express expected returns in terms of excess returns; and, it is also convenient to consider
that there is no budget constraint for investment in the risky assets – the balance can always be made up
of borrowing or lending via the riskless asset.
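Write the Lagrangian as:
$L = \tfrac{1}{2}w'\Sigma w + \gamma(\mu - R - (\bar{z} - R\mathbf{1})'w)$ Note: no budget constraint for choice of w
FOCs:
(1) $\partial L/\partial w = \Sigma w^* - \gamma(\bar{z} - R\mathbf{1}) = 0$
(2) $\partial L/\partial\gamma = 0 \;\Rightarrow\; (\bar{z} - R\mathbf{1})'w^* = \mu - R$
(1′) $w^* = \gamma\Sigma^{-1}(\bar{z} - R\mathbf{1})$; we also require wo* = 1-1′w*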
Solve for γ using $\mu - R = (\bar{z} - R\mathbf{1})'w^*$ so, γ solves
$\mu - R = (\bar{z} - R\mathbf{1})'w^* = \gamma(\bar{z} - R\mathbf{1})'\Sigma^{-1}(\bar{z} - R\mathbf{1})$
Then, $\mu - R = \gamma(C - 2RB + R^2A)$, using our existing definitions for A, B, and C.
The equation of the min-var set is:
$\sigma^2 = w^{*\prime}\Sigma w^* = w^{*\prime}\Sigma\,\gamma\Sigma^{-1}(\bar{z} - R\mathbf{1})$
$= \gamma w^{*\prime}(\bar{z} - R\mathbf{1})$
$= \gamma(\mu - R)$
So, $\sigma^2 = \dfrac{(\mu - R)^2}{C - 2RB + R^2A}$, again a parabola in (σ², μ) space
In (σ, μ) space this is: $\mu = R \pm \sigma[C - 2RB + R^2A]^{1/2}$,
a pair of rays originating at R:
[Figure: the pair of rays from R; labels: Mean-Variance Efficient Set, “Capital Market Line”, R]
Once again, all min-var portfolios are portfolio combinations of any 2 min-var portfolios. One natural
choice is, of course, the riskless asset, and for the other, use the frontier (min-var) portfolio that includes
none of the riskless asset. (How do we know that there is one and only one? How would you prove this?) This
risky asset only portfolio on the min-var frontier is called the ‘tangency portfolio.’
$w_{ot} = 0$ so, $w_t = \dfrac{\Sigma^{-1}(\bar{z} - R\mathbf{1})}{B - AR}$ from 1′wt = 1
The expected return and variance of return of this portfolio are:
$\bar{z}_t = \bar{z}'w_t = \dfrac{\bar{z}'\Sigma^{-1}\bar{z} - R\,\bar{z}'\Sigma^{-1}\mathbf{1}}{B - AR} = \dfrac{C - BR}{B - AR}$
$\sigma_t^2 = w_t'\Sigma w_t = \dfrac{C - 2RB + R^2A}{(B - AR)^2}$
The interesting, but not surprising, result is that wt is also on the risky asset only frontier. We can use $\bar{z}_t$
and $\sigma_t^2$ to show this.
On the risky asset only frontier we know: $\sigma^2 = \lambda + \gamma\mu$. Evaluating at $\mu = \bar{z}_t = \dfrac{C - BR}{B - AR}$:
$\lambda + \gamma\bar{z}_t = \dfrac{1}{\Delta}\left[C - 2B\left(\dfrac{C - BR}{B - AR}\right) + A\left(\dfrac{C - BR}{B - AR}\right)^2\right]$
$= \dfrac{C(B - AR)^2 - 2B(C - BR)(B - AR) + A(C - BR)^2}{\Delta(B - AR)^2}$
$= \dfrac{AR^2 - 2BR + C}{(B - AR)^2}$ (the numerator above equals $\Delta(AR^2 - 2BR + C)$), which is $\sigma_t^2$.
wt is the portfolio on the risky asset frontier whose tangent line hits the z axis at R,
i.e. wt is in “both” min-var sets.
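The claim that wt sits in both min-var sets can be checked with a few lines of Python. The sketch assumes uncorrelated risky assets (a diagonal Σ) and hypothetical numbers, with the riskless return chosen below B/A; the variable names are made up for the example.

```python
zbar = [1.05, 1.10, 1.15]          # hypothetical expected returns
variances = [0.04, 0.09, 0.16]     # hypothetical variances; covariances assumed zero
R = 1.02                           # hypothetical riskless return, below B/A here

A = sum(1.0 / v for v in variances)
B = sum(z / v for z, v in zip(zbar, variances))
C = sum(z * z / v for z, v in zip(zbar, variances))
delta = A * C - B * B

# Tangency portfolio: Sigma^{-1} (zbar - R 1) / (B - A R).
w_t = [(z - R) / (v * (B - A * R)) for z, v in zip(zbar, variances)]

mean_t = sum(w * z for w, z in zip(w_t, zbar))
var_t = sum(w * w * v for w, v in zip(w_t, variances))

# Closed-form moments of the tangency portfolio.
mean_formula = (C - B * R) / (B - A * R)
var_formula = (C - 2 * R * B + R * R * A) / (B - A * R) ** 2

# Variance implied by the risky-asset-only frontier at the same mean.
risky_frontier_var = (A * mean_t**2 - 2 * B * mean_t + C) / delta
```

The directly computed variance, the closed form, and the risky-asset-only frontier value should all coincide.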

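[Figure: the risky-asset frontier with wg at expected return B/A and wt at expected return $\bar{z}_t$; the tangent line at wt meets the expected return axis at R]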
This is the picture for the case $R < B/A = \bar{z}_g$. Why do we think this is natural? If $R > B/A$, the
tangency is between the lower limb of the hyperbola and the lower ray. Agents hold the riskless asset
and are long (short) wt if R < (>) B/A.
Any portfolio on the pair of rays can be formed from portfolio combinations of the two portfolios wt
and the risk free asset – called “2 fund money separation.”
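What if R = B/A? No tangency:
Use $w^* = \gamma\Sigma^{-1}(\bar{z} - R\mathbf{1})$
$= \gamma\Sigma^{-1}\left(\bar{z} - \dfrac{B}{A}\mathbf{1}\right)$
$= \dfrac{\mu - R}{C - 2RB + R^2A}\,\Sigma^{-1}\left(\bar{z} - \dfrac{B}{A}\mathbf{1}\right)$
Look at 1′w*:
$\mathbf{1}'w^* = \dfrac{\mu - R}{C - 2RB + R^2A}\left(\mathbf{1}'\Sigma^{-1}\bar{z} - \dfrac{B}{A}\mathbf{1}'\Sigma^{-1}\mathbf{1}\right)$
$= \dfrac{\mu - R}{C - 2RB + R^2A}\left(B - \dfrac{B}{A}A\right) = 0$
Thus, 1′w* = 0 so wo* = 1. We see agents holding the riskless asset and an arbitrage portfolio of the
risky assets.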
Covariance Properties of Min-Var Portfolios with a Riskless Asset

When there is a riskless asset the min-var portfolios are all levered positions in the portfolio wt, so all
such portfolios wa = k wt for some constant k (recall 1′wa ≠ 1 in general). The covariance properties are simple in
this case. First note this implies that all the minimum variance portfolios are perfectly correlated.
Secondly, we see that (again) the covariance between any asset or portfolio and a min-var portfolio is
determined by the expected return of the asset or portfolio in question:
$\sigma_t = \Sigma w_t = \dfrac{\bar{z} - R\mathbf{1}}{B - AR}$
and
$\sigma_{t,p} = w_p'\Sigma w_t = \dfrac{w_p'(\bar{z} - R\mathbf{1})}{B - AR} = \dfrac{\bar{z}_p - R}{B - AR}$ for any wp such that 1′wp = 1.
Since in the presence of a riskless asset all the min-var portfolios are perfectly correlated, the same
(scaled) result holds for them (substitute the relation wa = k wt).
Where does all this math get us?
The Expected Returns Relation:

When there is a riskless asset, it is straightforward to show that the expected excess return on any asset is
proportional to its covariance with $\tilde{z}_t$, the return on the tangency portfolio. Let’s see how this works.
$\sigma_t = \Sigma w_t$ is the vector of covariances of the returns on the individual assets with the return $\tilde{z}_t$, since
the covariance of a random variable with a weighted sum of random variables is the weighted sum of the
individual covariances.
Substitute the equation for wt to see:
$\sigma_t = \Sigma w_t = \dfrac{\Sigma\Sigma^{-1}(\bar{z} - R\mathbf{1})}{B - AR} = \dfrac{\bar{z} - R\mathbf{1}}{B - AR}$.
Premultiply this by wt′ to find (note this says the variance of $\tilde{z}_t$ is made up of the weighted sum of the
covariances of $\tilde{z}_t$ with the returns on the individual assets, nicely reminding us about the contribution
each asset makes to the variance of $\tilde{z}_t$):
$\sigma_t^2 = w_t'\Sigma w_t = w_t'\sigma_t = \dfrac{w_t'(\bar{z} - R\mathbf{1})}{B - AR} = \dfrac{\bar{z}_t - R}{B - AR}$
Then:
$\dfrac{\sigma_t}{\sigma_t^2} = \dfrac{\bar{z} - R\mathbf{1}}{\bar{z}_t - R} \;\Rightarrow\; \bar{z} - R\mathbf{1} = \dfrac{\sigma_t}{\sigma_t^2}(\bar{z}_t - R) = \beta_t(\bar{z}_t - R)$
Or, for each asset:
$\bar{z}_i - R = \dfrac{\sigma_{it}}{\sigma_t^2}(\bar{z}_t - R) = \beta_{it}(\bar{z}_t - R)$ or $\bar{z}_i = R + \beta_{it}(\bar{z}_t - R)$
The story I always tell my MBA students goes like this: The market pays you for two things: (1)
surrendering your capital (delaying consumption), for which you get the “rental rate” R, and (2) taking a
part of the total or aggregate risk that the market must distribute across all investors (in this model
aggregate risk is measured by $\sigma_t^2$; why?), for which you get a risk premium. That’s what this equation
says. In particular, your risk premium is determined not by the total risk you hold but by the portion of
the aggregate risk you hold. That portion is measured by the beta, and the excess return on
the tangency portfolio is the price per unit of risk.
Because, when there exists a riskless asset, all min-var portfolios are perfectly correlated with the
tangency portfolio (each is just a blend of the riskless asset and the tangency portfolio), exactly the same
result holds for every min-var portfolio (any portfolio on the rays).
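The expected returns relation above can be checked numerically. The sketch below is only an illustration: the covariance matrix, expected returns and riskless return are made-up inputs (not from these notes), and `solve` is a small helper written here so that the example needs nothing beyond the standard library. It builds $w_t$ from $\Sigma^{-1}(\bar z - R\mathbf{1})$ and compares $\bar z_i - R$ with $\beta_{it}(\bar z_t - R)$ using exact rational arithmetic.

```python
from fractions import Fraction as F

def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical inputs: covariance matrix, expected gross returns, riskless return.
Sigma = [[F(4, 100), F(1, 100), F(0)],
         [F(1, 100), F(9, 100), F(2, 100)],
         [F(0), F(2, 100), F(16, 100)]]
zbar = [F(108, 100), F(112, 100), F(115, 100)]
R = F(105, 100)

excess = [z - R for z in zbar]
x = solve(Sigma, excess)                     # Sigma^{-1}(zbar - R 1)
w_t = [xi / sum(x) for xi in x]              # sum(x) equals B - A R
sigma_t = [dot(row, w_t) for row in Sigma]   # covariances with the tangency return
var_t = dot(w_t, sigma_t)
zbar_t = dot(w_t, zbar)
beta = [s / var_t for s in sigma_t]
predicted = [b * (zbar_t - R) for b in beta]

print(predicted == excess)
```

Because the arithmetic is exact, the comparison holds with equality rather than up to rounding error.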
When there is no riskless asset, we can use any min-var portfolio and a special companion for it to
express expected returns most simply.
Recall: $w_g = \dfrac{\Sigma^{-1}\mathbf{1}}{A}$ and $w_d = \dfrac{\Sigma^{-1}\bar z}{B}$.
Let $w_a = (1 - a)w_g + a w_d$, with $a \neq 0$, $a \neq 1$. Then
$$\sigma_a \equiv \Sigma w_a = \frac{1 - a}{A}\mathbf{1} + \frac{a}{B}\bar z$$
and
$$\sigma_a^2 = w_a'\Sigma w_a = \frac{1 - a}{A} + \frac{a\bar z_a}{B}.$$
For any other portfolio $w_p$ (not necessarily min-var; we require only that it be a positive investment
portfolio, $\mathbf{1}'w_p = 1$), we find the covariance:
$$\sigma_{pa} = w_p'\Sigma w_a = \frac{1 - a}{A} + \frac{a\bar z_p}{B}.$$
As we saw before, the covariance of any asset/portfolio’s (p) return with a frontier portfolio’s return is a
function of the portfolio’s (p’s) expected return. Reversing this interpretation, this indicates that $\bar z_p$ is a
linear function of $\tilde z_p$’s covariance with any min-var portfolio.
Solve the equations for $\sigma_a^2$ and $\sigma_{pa}$ for $\frac{1-a}{A}$ and $\frac{a}{B}$, and substitute the
resulting relations into $\sigma_a$:
$$\bar z = \frac{\bar z_p\sigma_a^2 - \bar z_a\sigma_{pa}}{\sigma_a^2 - \sigma_{pa}}\,\mathbf{1}
+ \frac{\bar z_a - \bar z_p}{\sigma_a^2 - \sigma_{pa}}\,\sigma_a$$
Now, make a clever choice of p: make it a portfolio (it could be the frontier portfolio, although this is not
necessary) with a zero covariance with a (i.e. $\sigma_{pa} = 0$). Then, the above simplifies to:
$$\bar z = \bar z_z\mathbf{1} + \frac{\sigma_a}{\sigma_a^2}(\bar z_a - \bar z_z) = \bar z_z\mathbf{1} + \beta_a(\bar z_a - \bar z_z)$$
where the subscript z is for the “zero beta” portfolio.
Thus, when there is no riskless asset, expected returns are linearly related to an asset’s beta with a
reference portfolio if that reference portfolio is a minimum variance portfolio.
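[Figure: the min-var frontier in ($\sigma$, $\bar z$) space, marking $w_g$, the reference portfolio $w_a$ (expected
return $\bar z_a$) and its zero-beta companion $w_z$ (expected return $\bar z_z$); a line in the figure is labeled
“Capital Market Line”.]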
This looks like the “Black CAPM” named for Fisher Black.
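A special case is $w_d$. Since $w_d$ is located as it is (with $\bar z_z = 0$), if $w_d$ is used as the reference
portfolio (a = 1 above), it must be that $\bar z = \beta_d\bar z_d$. We can verify this using $w_d = \dfrac{\Sigma^{-1}\bar z}{B}$.
Then $\sigma_d = \Sigma w_d = \dfrac{\bar z}{B}$ and $\sigma_d^2 = w_d'\sigma_d = \dfrac{\bar z_d}{B}$, so
$\bar z = B\sigma_d = \dfrac{\sigma_d}{\sigma_d^2}\bar z_d = \beta_d\bar z_d$.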
We have shown that if the reference portfolio is a min-var portfolio, the expected return on any asset is
a linear function of the covariance between that asset’s return and the return on the reference portfolio.
Important point: The linearity property of expected returns and beta or covariance holds only if the
benchmark portfolio is in the min-var set.
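Proof: Assume $\bar z = \alpha\mathbf{1} + \gamma\sigma_p = \alpha\mathbf{1} + \gamma\Sigma w_p$ for an arbitrary portfolio p with $\mathbf{1}'w_p = 1$.
Then:
$$w_p = \frac{1}{\gamma}\Sigma^{-1}\bar z - \frac{\alpha}{\gamma}\Sigma^{-1}\mathbf{1} = \frac{B}{\gamma}w_d - \frac{\alpha A}{\gamma}w_g$$
Since $\mathbf{1}'w_p = 1$, the two coefficients sum to one, $\frac{B}{\gamma} - \frac{\alpha A}{\gamma} = 1$, so $w_p$ is a portfolio
combination of $w_g$ and $w_d$ and so is in the min-var set. So,
the vector of expected returns is linear in the betas with a reference portfolio if and only if the reference
portfolio is a minimum variance portfolio. This equivalence underlies the Roll critique.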
Variance Decomposition
The expected returns relation is telling us that only a part of an individual asset’s variance (or risk) is
priced: only that part that covaries with the return on a min-var portfolio. It is instructive to decompose
the variance of any asset or portfolio to see just how this happens. For any portfolio $w_p$ let $w_m$ be the
min-variance portfolio with the same expected return. Write:
$$w_p = w_g + (w_m - w_g) + (w_p - w_m) = w_g + \omega_s + \omega_d,$$
where $\omega_s$ and $\omega_d$ are arbitrage (zero-investment) portfolios.
The value of this decomposition is that the returns on these portfolios are mutually orthogonal, i.e., the
covariance between any pair is zero:
$$\sigma_{gs} = w_g'\Sigma\omega_s = w_g'\Sigma(w_m - w_g) = 0,$$
since this is the difference between the covariance of $\tilde z_m$ and $\tilde z_g$ and the covariance of $\tilde z_g$
and $\tilde z_g$ (both equal $\sigma_g^2 = 1/A$).
$$\sigma_{gd} = w_g'\Sigma\omega_d = w_g'\Sigma(w_p - w_m) = 0$$
following the same logic as above.
$$\sigma_{sd} = \omega_s'\Sigma\omega_d = w_m'\Sigma(w_p - w_m) - w_g'\Sigma(w_p - w_m) = 0$$
The second term is zero as above and the first is also zero since the covariance of any portfolio with $w_m$
(a min-var portfolio) is completely determined by that portfolio’s expected return, and $w_m$ and $w_p$ have the
same expected return by construction.
Thus, from $w_p = w_g + \omega_s + \omega_d$ and the mutual orthogonality of the terms we are able to decompose
the variance of the portfolio p as follows:
$$\sigma_p^2 = \sigma_g^2 + \sigma_s^2 + \sigma_d^2 \quad\text{(all the covariances are zero)}$$
Labels:
$\sigma_g^2$ = “unavoidable risk”: global min-var portfolio risk
$\sigma_s^2$ = “systematic risk”: added risk that provides added compensation ($\bar z_p - \bar z_g$)
$\sigma_d^2$ = “diversifiable risk”: due to being off the min-var frontier, added risk that
brings no added expected return.
Thus, only $\sigma_g^2$ and $\sigma_s^2$ contribute to expected return. $\sigma_d^2$ can be large, small, or zero without
affecting expected return. Draw a picture and see!
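A small numerical sketch of the decomposition may help. Everything concrete here is an assumption made for illustration: the covariance matrix, the expected returns and the portfolio `w_p` are hypothetical, and `solve` is a helper defined in the block. The code finds the min-var portfolio with the same mean as `w_p`, forms the two arbitrage portfolios, and checks orthogonality and the variance identity exactly.

```python
from fractions import Fraction as F

def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cov(u, S, v):
    """u' S v for weight vectors u, v and covariance matrix S."""
    return dot(u, [dot(row, v) for row in S])

# Hypothetical inputs.
Sigma = [[F(4, 100), F(1, 100), F(0)],
         [F(1, 100), F(9, 100), F(2, 100)],
         [F(0), F(2, 100), F(16, 100)]]
zbar = [F(108, 100), F(112, 100), F(115, 100)]
w_p = [F(1, 2), F(1, 4), F(1, 4)]            # an arbitrary portfolio, weights sum to 1

s1 = solve(Sigma, [F(1)] * 3)
w_g = [v / sum(s1) for v in s1]              # Sigma^{-1} 1 / A
s2 = solve(Sigma, zbar)
w_d = [v / sum(s2) for v in s2]              # Sigma^{-1} zbar / B

# Frontier portfolio with the same expected return as w_p.
zbar_p = dot(w_p, zbar)
a = (zbar_p - dot(w_g, zbar)) / (dot(w_d, zbar) - dot(w_g, zbar))
w_m = [(1 - a) * g + a * d for g, d in zip(w_g, w_d)]

omega_s = [m - g for m, g in zip(w_m, w_g)]
omega_d = [p - m for p, m in zip(w_p, w_m)]

cross = [cov(w_g, Sigma, omega_s), cov(w_g, Sigma, omega_d), cov(omega_s, Sigma, omega_d)]
var_p = cov(w_p, Sigma, w_p)
parts = cov(w_g, Sigma, w_g) + cov(omega_s, Sigma, omega_s) + cov(omega_d, Sigma, omega_d)

print(cross, var_p == parts)
```

Changing `w_p` while keeping its expected return fixed alters only the diversifiable piece, which is the point of the decomposition.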
Alternatively: $w_p = w_m + \omega_d$
Recall, $w_m$ is chosen so $\bar z_p = \bar z_m$. Let a be any min-var portfolio. Then
$\bar z_p = \bar z_z + \beta_{ap}(\bar z_a - \bar z_z)$ and $\bar z_m = \bar z_z + \beta_{am}(\bar z_a - \bar z_z)$.
Since $\bar z_p = \bar z_m$ it must be that $\mathrm{Cov}(\tilde z_a, \tilde z_m) = \mathrm{Cov}(\tilde z_a, \tilde z_p)$.
Thus, $\omega_d = w_p - w_m$ must have zero covariance with any min-var portfolio (such as a). This is the part
of portfolio p’s return that doesn’t covary with the return on any min-var portfolio. Expected returns
are driven by covariances with min-var portfolios; since $\omega_d$ does not contribute to $w_p$’s covariance with
$w_a$, it does not contribute to expected return (you should remember this).
Equilibrium: The Capital Asset Pricing Model – “CAPM”

To this point, this has been a standard minimization problem and the consequences of the solution – all
math.
We can’t use the results so far to price assets: recall that to get here we started by assuming we
knew the expected returns vector in order to find the min-var portfolios, which is a bit circular.
Also, even if investors are not mean-variance optimizers, the min-var portfolios exist (as long as mean
and variance are well defined) and the absence of arbitrage provides the pricing results we just saw.
Thus, there is no economic content until we can identify one of the min-var portfolios. This is really
what the CAPM does. The mutual fund theorem says if all investors are mean-variance optimizers we
can, in equilibrium, identify a min-var portfolio.
Assume:
(1) Each investor chooses his/her portfolio to maximize a derived utility function over $\bar z$
and $\sigma^2$, $v(\bar z, \sigma^2)$, where $v_2 < 0$, $v_1 > 0$, and v is concave
(2) Investors have a common time horizon and homogeneous beliefs about $\bar z$ and $\Sigma$
(3) Each asset is infinitely divisible
(4) Unconstrained trading in the riskless asset
These assumptions are sufficient for all investors to hold mean-variance efficient portfolios. Think
about what each says.
Assume a riskless asset exists:

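The maximization problem for each investor is:
$$\max_w \; v\big(R + w'(\bar z - R\mathbf{1}),\; w'\Sigma w\big)$$
FOC:
$$\frac{\partial v}{\partial w} = v_1(\cdot)(\bar z - R\mathbf{1}) + 2v_2(\cdot)\Sigma w^* = 0$$
So,
$$w^* = \frac{-v_1(\cdot)}{2v_2(\cdot)}\Sigma^{-1}(\bar z - R\mathbf{1}) = \frac{-v_1(\cdot)(B - RA)}{2v_2(\cdot)}\,w_t$$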
Note that w* is proportional to the tangency portfolio and that it is chosen by each investor. Thus, the
aggregate demand for each risky asset is in proportion to its representation in wt.
The (positive or negative) remainder of each investor’s wealth is invested in R. Since, in equilibrium,
demand equals supply, it must be that the market portfolio of all risky assets wm is proportional to wt. In
other words, since all investors hold wt and R, the market portfolio – wm – the wealth weighted aggregate
of all investor’s holdings must be some version of this. Thus wm is a min-var portfolio.
If the riskless asset is in zero net supply, wt = wm. If the riskless asset is in positive net supply, then the
market portfolio is located to the left of wt on the capital market line.
We can now write: $\bar z - R\mathbf{1} = \beta_m(\bar z_m - R)$ since the market portfolio $w_m$ is on the min-var frontier.
Further, we can note something about the holdings of investors:

If a riskless asset is available, the mutual fund theorem (all investors effectively hold only
two assets) allows us to say that more risk averse investors hold less of $w_t$ or $w_m$.
Indeed, $w^* = \dfrac{-v_1(\cdot)}{2v_2(\cdot)}\Sigma^{-1}(\bar z - R\mathbf{1})$ says just this: $\dfrac{-v_1(\cdot)}{2v_2(\cdot)}$ is smaller the more
risk averse the individual investor is, and this tells us how aggressively the investor holds $w_t$.
(more risk averse $\Rightarrow$ $(-v_2)$ is bigger)
If there is no riskless asset, the same intuitions hold. Write:
$$\max_w \; v(w'\bar z,\; w'\Sigma w) + \lambda(1 - w'\mathbf{1})$$
FOCs:
$$\frac{\partial}{\partial w}: \quad v_1(\cdot)\bar z + 2v_2(\cdot)\Sigma w^* - \lambda\mathbf{1} = 0$$
$$\frac{\partial}{\partial\lambda}: \quad 1 - \mathbf{1}'w^* = 0$$
So,
$$w^* = \frac{-v_1(\cdot)}{2v_2(\cdot)}\Sigma^{-1}\bar z + \frac{\lambda}{2v_2(\cdot)}\Sigma^{-1}\mathbf{1}
= \frac{-v_1(\cdot)B}{2v_2(\cdot)}\,w_d + \frac{\lambda A}{2v_2(\cdot)}\,w_g$$
Further, since $\mathbf{1}'w^* = 1$, $w^*$ is a portfolio combination of $w_d$ and $w_g$. Thus, all investors again hold a
portfolio combination of $w_d$ and $w_g$, i.e. min-var portfolios. Aggregate demand is therefore a portfolio
combination of $w_d$ and $w_g$ and so $w_m$ must be a min-var portfolio itself.
Thus, we can write:
$\bar z = \bar z_z\mathbf{1} + \beta_m(\bar z_m - \bar z_z)$ or, for each asset: $\bar z_i = \bar z_z + \beta_i(\bar z_m - \bar z_z)$
$\bar z_z$ is the expected return on a portfolio uncorrelated with the market portfolio, and this is now the Black
CAPM.
When there is no riskless asset, we would like to use the same idea as we did before and say that more
risk averse investors (in the Arrow-Pratt sense) hold less of the market portfolio. Because the holdings
are in two risky assets, we can’t in general say this. However,
$$w^* = \frac{-v_1(\cdot)}{2v_2(\cdot)}\Sigma^{-1}\bar z + \frac{\lambda}{2v_2(\cdot)}\Sigma^{-1}\mathbf{1}$$
says that more risk averse investors ($(-v_2)$ is bigger)
hold less of $w_d$ and so more of $w_g$. We cannot, however, say things about the market portfolio,
individual assets, or other pairs of min-var portfolios.
(Note: $\mathbf{1}'w^* = 1 \Rightarrow \dfrac{\lambda}{2v_2(\cdot)} = \dfrac{1}{A} + \dfrac{v_1(\cdot)}{2v_2(\cdot)}\dfrac{B}{A}$, so $\dfrac{\lambda}{2v_2(\cdot)} \uparrow$ as $(-v_2(\cdot)) \uparrow$.)
Note: The CAPM has given us 2 measures of risk (1) the variance of a portfolio which
determines the efficiency of a given portfolio (macro risk if you will) and (2) beta which
measures the systematic risk of individual assets (micro level risk).
Note: Mean-variance analysis generates the separation results and the pricing results. The
equilibrium analysis simply identifies the market portfolio as being a minimum variance
portfolio.
Consistency of Mean-Variance Analysis and Expected Utility Maximization
(1) Quadratic Utility

We can write, without loss of generality, $u(z) = z - \dfrac{bz^2}{2}$ (with b > 0) since utility is unique only up to a
positive affine transformation.
Expected Utility is then:
$$E[u(\tilde z)] = \bar z - \frac{b}{2}(\bar z^2 + \sigma^2) \equiv v(\bar z, \sigma^2)$$
…a function of mean return and variance of returns alone.
There are two drawbacks to quadratic utility:

(a) satiation – u(·) is decreasing in return after some point; and
(b) increasing absolute risk aversion.
Neither, we believe, is consistent with reality.
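As a quick check of the quadratic case, the sketch below uses a made-up three-point return distribution and a made-up value of b (both are assumptions for illustration). It confirms that expected quadratic utility computed outcome by outcome equals $\bar z - \frac{b}{2}(\bar z^2 + \sigma^2)$, and it illustrates the two drawbacks: utility falls past $z = 1/b$, and absolute risk aversion $-u''/u' = b/(1 - bz)$ rises with z.

```python
from fractions import Fraction as F

b = F(1, 2)                                  # hypothetical curvature, so 1/b = 2
outcomes = [F(9, 10), F(1), F(13, 10)]       # hypothetical returns, all below 1/b
probs = [F(1, 4), F(1, 2), F(1, 4)]

def u(z):
    return z - b * z * z / 2

def expect(f):
    return sum(p * f(z) for p, z in zip(probs, outcomes))

mean = expect(lambda z: z)
var = expect(lambda z: (z - mean) ** 2)

direct = expect(u)                                # E[u(z)] outcome by outcome
via_moments = mean - b / 2 * (mean ** 2 + var)    # mean-variance form

def abs_risk_aversion(z):
    """-u''(z)/u'(z) for quadratic utility, valid for z < 1/b."""
    return b / (1 - b * z)

print(direct == via_moments)
print(u(F(3)) < u(F(2)))                                    # satiation past 1/b
print(abs_risk_aversion(F(1)) > abs_risk_aversion(F(1, 2)))  # increasing ARA
```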
(2) Multivariate Normal Returns for Assets
Portfolio combinations of jointly normal random variables are normal, and normal random variables are
completely characterized by their means and variances. Thus, each asset or portfolio is characterized by
$\bar z$ and $\sigma^2$. Thus, for any u(z), E[u(z)] is characterized by $\bar z$ and $\sigma^2$.
There are (at least) two drawbacks to this approach as well:

(a) limited liability; and
(b) derivatives.
Both imply that a (joint) normal distribution for returns on all the assets is not a reasonable
representation of reality.
The Three Moment Problem
Notation: We consider the 3rd central moment, skewness: $m^3 \equiv E[(\tilde z - \bar z)^3]$
Co-skewness is defined as: $m_{ijk} \equiv E[(\tilde z_i - \bar z_i)(\tilde z_j - \bar z_j)(\tilde z_k - \bar z_k)]$
The skewness of a portfolio is given by: $m_p^3 = \sum_{i=1}^N\sum_{j=1}^N\sum_{k=1}^N w_i w_j w_k m_{ijk}$
We consider that investors have derived utility functions $v(\bar z_p, \sigma_p^2, m_p^3)$ (these could come from a cubic
utility of returns), where $\bar z_p$ and $m_p^3$ are liked and $\sigma_p^2$ is disliked.
To solve this problem, we could try to hold $\bar z_p$ and $\sigma_p^2$ fixed and maximize $m_p^3$ (analogously to the CAPM
approach), but this just gives a big mess.
Instead, start with an investor at his optimum $\bar z_o$, $\sigma_o^2$, $m_o^3$ (“o” for optimal), then consider perturbing
him away from this optimum by having him sell some small amount (w) of his optimal portfolio and buy
this amount of individual asset i. The resulting portfolio p has:
$$\bar z_p = (1 - w)\bar z_o + w\bar z_i$$
$$\sigma_p^2 = (1 - w)^2\sigma_o^2 + 2w(1 - w)\sigma_{io} + w^2\sigma_i^2$$
$$m_p^3 = (1 - w)^3 m_o^3 + 3(1 - w)^2 w\,m_{ioo} + 3(1 - w)w^2 m_{iio} + w^3 m_i^3$$
Since the investor starts at an optimum, v must reach a maximum at w = 0, so $\partial v/\partial w$ vanishes there:
$$0 = \left.\frac{\partial v}{\partial w}\right|_{w=0} = v_1(\cdot)(\bar z_i - \bar z_o) + 2v_2(\cdot)(\sigma_{io} - \sigma_o^2) + 3v_3(\cdot)(m_{ioo} - m_o^3)$$
$$\bar z_i = \bar z_o - \frac{2v_2}{v_1}(\sigma_{io} - \sigma_o^2) - \frac{3v_3}{v_1}(m_{ioo} - m_o^3)$$
This must hold for any asset i $\Rightarrow$ $\bar z_i$ depends only on covariance and co-skewness of asset i with the
optimal portfolio (and some preference parameters which we will get rid of).
Note: Only $\sigma_{io}$ and $m_{ioo}$ are important in the expected returns relation. As in the CAPM, $\sigma_i^2$ is not
important. Also, $m_{iio}$ and $m_i^3$ are not important since they provide no tradeoffs at the margin:
$$\left.\frac{\partial m_p^3}{\partial w}\right|_{w=0} = 3(m_{ioo} - m_o^3).$$
$m_{iio}$ and $m_i^3$ don’t appear, so they don’t affect the optimum.
$m_{ioo}$ contributes since skewness $m_o^3 = \sum_i w_i^* m_{ioo}$, so it is asset i’s contribution to the skewness of the
optimal portfolio, just as $\sigma_o^2 = \sum_i w_i^*\sigma_{io}$ implies asset i’s covariance with the optimal portfolio
determines asset i’s contribution to portfolio variance.
Also note $\dfrac{\partial\bar z_i}{\partial m_{ioo}} = -\dfrac{3v_3}{v_1}$: If $v_3 > 0$ (so positive skewness is liked), this term is negative so there is a
substitution between $\bar z$ and $m_{ioo}$ at the optimum.
If a riskless asset exists, the relation must hold for it:
$$R = \bar z_o + \frac{2v_2}{v_1}\sigma_o^2 + \frac{3v_3}{v_1}m_o^3$$
and so we can write:
$$-\frac{2v_2}{v_1} = \frac{\bar z_o - R}{\sigma_o^2} + \frac{3v_3}{v_1}\frac{m_o^3}{\sigma_o^2}.$$
Then,
$$\bar z_i = \bar z_o + (\bar z_o - R)\frac{\sigma_{io} - \sigma_o^2}{\sigma_o^2}
+ \frac{3v_3}{v_1}\frac{m_o^3}{\sigma_o^2}(\sigma_{io} - \sigma_o^2) - \frac{3v_3}{v_1}(m_{ioo} - m_o^3)$$
$$= R + \beta_i^o(\bar z_o - R) + \frac{3v_3}{v_1}m_o^3(\beta_i^o - \gamma_i^o)$$
where $\beta_i^o \equiv \dfrac{\sigma_{io}}{\sigma_o^2}$ and $\gamma_i^o \equiv \dfrac{m_{ioo}}{m_o^3}$.
If some investor’s optimal portfolio is the market portfolio (by happenstance) then the 1st two terms
duplicate the CAPM.
Finally, consider the asset with returns, $\tilde z_z$, that are uncorrelated with $\tilde z_o$ and that has the smallest
variance of all such assets.
Here we need the further identification since not all zero-beta assets have the same expected
return as they did in the CAPM. Then,
$$\bar z_z = R + \frac{3v_3}{v_1}m_o^3(-\gamma_z^o) \quad\text{and}\quad \frac{3v_3}{v_1} = \frac{\bar z_z - R}{m_o^3(-\gamma_z^o)}$$
We can then further develop the expected returns relation to:
$$\bar z_i = R + \beta_i^o(\bar z_o - R) + \frac{\gamma_i^o - \beta_i^o}{\gamma_z^o}(\bar z_z - R)$$
The intuition?
The intuition?
Any zero-beta asset will do: the difference will be accounted for by different values of $\bar z_z$ and $\gamma_z^o$.
Note, however, that we are still left with the need to identify some investor’s optimal portfolio if we
were to try to apply this pricing relation.
5 Generalized Risk and Return

After having considered the special, well-known, and well (over?) used case of mean-variance analysis,
let’s step back to the general problem again.
First, Risk: The CAPM makes the assumption that the variance of the return on a portfolio measures
its risk.
Consider the idea that risk is the combination of properties of a set of random outcomes that change the
evaluation of $E[u(\tilde x)]$ away from $u(\bar x)$ for concave utility u(·). Two aspects of this definition to
highlight: (1) what alters this evaluation (or is, on net, “disliked”) is, of course, dependent upon
the utility function used in the evaluation. Thought of this way, risk is necessarily a property defined for
a class of utility functions. (2) This definition of risk conveys only the notion of dispersion of
outcomes. This requires that we correct for the mean (location) when we talk about risk. It also means
that all other aspects (good and bad) of the distribution (beyond location) are lumped into “risk.”
More formally…
Definition: If uncertain outcomes $\tilde x$ and $\tilde y$ have the same location (expectation), then $\tilde x$ is said to
be weakly less risky than $\tilde y$ for the class of utility functions U if no individual with a utility function in U
prefers $\tilde y$ to $\tilde x$. That is: $E[u(\tilde x)] \ge E[u(\tilde y)]$ $\forall u \in U$. $\tilde x$ is strictly less risky if the
inequality is strict for some $u \in U$.
For some restricted classes of utility functions (some U), this ordering is complete (i.e. for all pairs of
random variables $\tilde x$ and $\tilde y$ with the same mean, either $\tilde x$ is weakly less risky than $\tilde y$, or $\tilde y$ is weakly
less risky than $\tilde x$, or both).
Example: Quadratic utility: as we have seen, variance is the measure of risk. Any quadratic utility
function can be written as:
$u(z) = z - \dfrac{b}{2}z^2$, where we restrict z to $z < \dfrac{1}{b}$ so that $u'(z) > 0$.
Expected utility (for any distribution for which mean and variance are defined) is written:
$$E[u(\tilde z)] = \bar z - \frac{b}{2}(\bar z^2 + \sigma^2).$$
So, for any $\tilde x$ and $\tilde y$ with $\bar x = \bar y$, $E[u(\tilde x)] > E[u(\tilde y)]$ iff $\sigma_x^2 < \sigma_y^2$.
The completeness of this ordering is unfortunately not a common property across different (broader) classes
of utility functions. Consider the example introduced by Ingersoll:
$\tilde x$ = 0 with prob. 1/2, or 4 with prob. 1/2
$\tilde y$ = 1 with prob. 7/8, or 9 with prob. 1/8
For these simple distributions we easily calculate that $\bar x = \bar y = 2$, and $\sigma_x^2 = 4 < \sigma_y^2 = 7$.
Thus, for any investor with quadratic utility, $\tilde x$ is preferred to $\tilde y$ (it provides higher expected utility).
However, for $u(x) = \sqrt{x}$, $\tilde y$ provides higher expected utility ($E[\sqrt{\tilde y}] = 1.25$ versus $E[\sqrt{\tilde x}] = 1$).
Thus, for any class of utility functions that includes both quadratic and square root utility, $\tilde x$ and
$\tilde y$ cannot be ranked.
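The numbers in Ingersoll’s example are easy to reproduce. In the sketch below the two distributions are the ones given above; the quadratic coefficient b = 1/10 is an assumed value, chosen only so that every outcome stays below 1/b.

```python
from fractions import Fraction as F
from math import sqrt

# (outcome, probability) pairs from the example
x = [(F(0), F(1, 2)), (F(4), F(1, 2))]
y = [(F(1), F(7, 8)), (F(9), F(1, 8))]

def expect(dist, f):
    return sum(p * f(v) for v, p in dist)

def mean(dist):
    return expect(dist, lambda v: v)

def var(dist):
    m = mean(dist)
    return expect(dist, lambda v: (v - m) ** 2)

b = F(1, 10)                      # assumed; 1/b = 10 exceeds the largest outcome

def quad(v):
    return v - b * v * v / 2

print(mean(x), mean(y), var(x), var(y))
print(expect(x, quad), expect(y, quad))   # quadratic utility ranks x above y
print(expect(x, sqrt), expect(y, sqrt))   # square root utility ranks y above x
```

The two utility functions disagree about the ranking even though the means match, which is exactly the incompleteness the text describes.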
Further, note that it is not the case that variance is the appropriate measure of risk whenever the
ordering is complete. Consider the class of cubic utility functions defined as $u(z) = z - cz^3$ with
c > 0, where we restrict outcomes to be bounded between zero and $(3c)^{-1/2}$.
Expected utility is written (translating the non-central moment to the central moments):
$$E[u(\tilde z)] = \bar z - cE[\tilde z^3] = \bar z - c(m^3 + \bar z^3 + 3\sigma^2\bar z),$$
where $m^3$ is the third central moment $E[(\tilde z - \bar z)^3]$ that we examined before. (Note that for this class
of utility functions variance and skewness are both disliked.)
If $\bar x = \bar y$ and $\tilde x$ is preferred to $\tilde y$ then it must be that $c[3\bar x(\sigma_y^2 - \sigma_x^2) + (m_y^3 - m_x^3)] > 0$.
Thus, $3\bar z\sigma^2 + m^3$ is the proper measure of risk for this class of utility functions. So, even for classes of
utility functions that imply a complete ordering of random variables, variance is not a universal measure
of risk.
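To see the cubic risk measure disagree with variance, consider the following sketch. The two distributions and the value c = 1/50 are hypothetical, constructed here for illustration: both have mean 2 and all outcomes lie below $(3c)^{-1/2} \approx 4.08$. The first has the smaller variance but a large positive third moment, so the measure $3\bar z\sigma^2 + m^3$ ranks it as riskier.

```python
from fractions import Fraction as F

c = F(1, 50)   # hypothetical; outcomes must lie in [0, (3c)^(-1/2)]

# (outcome, probability) pairs, both with mean 2
x = [(F(38, 25), F(4, 5)), (F(98, 25), F(1, 5))]   # lower variance, positive skew
y = [(F(1), F(1, 2)), (F(3), F(1, 2))]             # higher variance, symmetric

def expect(dist, f):
    return sum(p * f(v) for v, p in dist)

def moments(dist):
    """Return (mean, variance, third central moment)."""
    m = expect(dist, lambda v: v)
    v2 = expect(dist, lambda v: (v - m) ** 2)
    m3 = expect(dist, lambda v: (v - m) ** 3)
    return m, v2, m3

def cubic_risk(dist):
    m, v2, m3 = moments(dist)
    return 3 * m * v2 + m3

def expected_utility(dist):
    return expect(dist, lambda z: z - c * z ** 3)

for dist in (x, y):
    m, v2, m3 = moments(dist)
    # non-central third moment in terms of central moments
    assert expect(dist, lambda z: z ** 3) == m3 + m ** 3 + 3 * v2 * m

print(moments(x), moments(y))
print(cubic_risk(x) > cubic_risk(y), expected_utility(y) > expected_utility(x))
```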
For the general class of all risk averse (concave) utility functions, the ordering is obviously incomplete.
We need to find a way to reduce the scope of the problem if we are to say more, i.e., restrictions on
distributions or restrictions on utility functions (which are rarely thought to be particularly interesting or
valuable).
Mean Preserving Spreads: For illustration, let’s take a look at a particular relation between $\tilde x$ and $\tilde y$
that allows a comparison for all u(·) in U (the class of all risk averse utility functions).
Intuitively, if we take $\tilde x$ and add mean zero noise to it we should wind up with something less attractive
to risk averse agents. It turns out it’s not quite that simple. Why?
A random variable $\tilde x$ is less risky than $\tilde y$ if $g(\tilde y)$ can be obtained from $f(\tilde x)$ by the application of a
series of mean preserving spreads (MPS) (where f(·) and g(·) are density functions).
Definition:
$$s(x) = \begin{cases}
\alpha & \text{for } c < x < c + t \\
-\alpha & \text{for } c' < x < c' + t \\
-\beta & \text{for } d < x < d + t \\
\beta & \text{for } d' < x < d' + t \\
0 & \text{elsewhere}
\end{cases}$$
where: $\alpha(c' - c) = \beta(d' - d)$, $\alpha > 0$, $\beta > 0$, $t > 0$, $c + t < c' < d - t$, and $d + t < d'$.
From a uniform distribution we might get the following:
[Figure: a uniform density with a mean preserving spread added.]
So, we have four (non-overlapping) intervals of non-zero value with the middle two negative and the
outside two positive.
Note that: $\int_{-\infty}^{\infty} s(x)\,dx = 0$, so what is added is subtracted elsewhere,
and $\int_{-\infty}^{\infty} x\,s(x)\,dx = 0$: “mean” zero (really, the addition of s(x) doesn’t change the mean).
Thus, if f(x) is a density function for x, then f(x) + s(x) is also a density function that gives the same
expected value. (As long as you don’t violate non-negativity for the resulting density.)
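The two integral conditions can be verified for a concrete spread. The parameter values below are made up for illustration (the names `c1` and `d1` stand in for $c'$ and $d'$), and the base density is assumed to be uniform on [0, 4]. Because s(x) is piecewise constant, its moments can be integrated exactly.

```python
from fractions import Fraction as F

# Hypothetical MPS parameters satisfying the definition's restrictions.
alpha = beta = F(1, 10)
t = F(1, 2)
c, c1, d, d1 = F(0), F(1), F(2), F(3)
assert alpha * (c1 - c) == beta * (d1 - d)
assert c + t < c1 < d - t and d + t < d1

# (height, left endpoint) of the four intervals, each of width t
pieces = [(alpha, c), (-alpha, c1), (-beta, d), (beta, d1)]

def moment(k):
    """Exact integral of x**k * s(x) over the real line."""
    return sum(h * ((lo + t) ** (k + 1) - lo ** (k + 1)) / (k + 1) for h, lo in pieces)

base_height = F(1, 4)          # uniform density on [0, 4]

print(moment(0))               # total mass added: zero
print(moment(1))               # change in the mean: zero
print(moment(2))               # change in E[x^2], hence in the variance: positive
print(base_height - max(alpha, beta) >= 0)   # f + s stays non-negative
```

With the mean unchanged, the positive second moment of s(x) is the increase in variance caused by the spread.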
An exercise you should work through in Ingersoll’s text shows that if $\tilde y$ has density g(·) = f(·) + s(·),
where s(·) is a mean preserving spread, then $\tilde y$ is riskier than $\tilde x$ (whose density is f(·)) for the class of all
concave utility functions. The proof simply compares expected utility of $\tilde x$ and $\tilde y$ for general concave
utility functions (lots of Jensen’s inequality applications). Transitivity implies this works for a series of
MPS’s as well.
Rothschild & Stiglitz Theorems on Risk:

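The notion that $\tilde y$ differing from $\tilde x$ by the addition of a series of MPS’s implies more risk is very
intuitive: $\tilde y$ has a more disperse, “noisier” distribution than $\tilde x$. This can be made more precise, more
general, and less cumbersome to deal with.
Theorem 1: For the concave class of utility functions, outcome $\tilde x$ is weakly less risky than outcome
$\tilde y$ iff $\tilde y$ is distributed like $\tilde x + \tilde\varepsilon$, where $\tilde\varepsilon$ is a fair game with respect to $\tilde x$.
That is: $\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$ and $E[\tilde\varepsilon \mid x] = 0$ $\forall x$.
Note that the law of iterated expectations tells us $\bar x = \bar y$ in this case.
Proof (sufficiency): If $\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$ …
$$\begin{aligned}
E[u(\tilde y)] &= E[u(\tilde x + \tilde\varepsilon)] && \text{equivalence of distributions} \\
&= E\{E[u(\tilde x + \tilde\varepsilon) \mid x]\} && \text{law of iterated expectations} \\
&\le E\{u[E(\tilde x + \tilde\varepsilon \mid x)]\} && \text{Jensen’s inequality} \\
&= E[u(\tilde x)] && \text{fair game property}
\end{aligned}$$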
Given this result, we can easily see that variance is a valid measure of risk for the class of all concave
utility functions when we restrict ourselves to normal random variables.
For any $\tilde y \sim N(\mu, b)$ and $\tilde x \sim N(\mu, a)$, with b > a, we can always write:
$\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$ where $\tilde\varepsilon \sim N(0, b - a)$ is independent of $\tilde x$. Since independence is stronger than
the fair game property, $\tilde\varepsilon$ is therefore also a fair game with respect to $\tilde x$.
We can also show the related result that if $\tilde y$ is riskier than $\tilde x$ we know it must have a larger variance:
this must be true because the quadratic utility functions are members of the concave class and for
quadratic utility functions variance measures risk.
Alternatively: If $E[\tilde\varepsilon \mid x] = 0$ then $\mathrm{Cov}(\tilde x, \tilde\varepsilon) = 0$ (in fact $\mathrm{Cov}(g(\tilde x), \tilde\varepsilon) = 0$ for all functions g(·)).
Therefore, $\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$ tells us that:
$$\begin{aligned}
\mathrm{Var}(\tilde y) &= \mathrm{Var}(\tilde x + \tilde\varepsilon) \\
&= \mathrm{Var}(\tilde x) + \mathrm{Var}(\tilde\varepsilon) + 2\,\mathrm{Cov}(\tilde x, \tilde\varepsilon) \\
&= \mathrm{Var}(\tilde x) + \mathrm{Var}(\tilde\varepsilon) \\
&\ge \mathrm{Var}(\tilde x)
\end{aligned}$$
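A small discrete example makes the fair game construction concrete. The joint distribution below is invented for illustration: the noise is mean zero conditional on each value of x, but its size depends on x, so it is a fair game without being independent. Square root utility is used as one arbitrary member of the concave class.

```python
from fractions import Fraction as F
from math import sqrt

# Hypothetical joint outcomes (x, eps, probability); E[eps | x] = 0 for each x.
joint = [(F(1), F(-1, 2), F(1, 4)), (F(1), F(1, 2), F(1, 4)),
         (F(4), F(-2), F(1, 4)), (F(4), F(2), F(1, 4))]

def expect(f):
    return sum(p * f(x, e) for x, e, p in joint)

mean_x = expect(lambda x, e: x)
mean_y = expect(lambda x, e: x + e)              # y is distributed like x + eps
var_x = expect(lambda x, e: (x - mean_x) ** 2)
var_e = expect(lambda x, e: e ** 2)              # E[eps] = 0, so this is Var(eps)
var_y = expect(lambda x, e: (x + e - mean_y) ** 2)
cov_xe = expect(lambda x, e: (x - mean_x) * e)

eu_x = expect(lambda x, e: sqrt(x))
eu_y = expect(lambda x, e: sqrt(x + e))

print(mean_x == mean_y, cov_xe == 0, var_y == var_x + var_e)
print(eu_y < eu_x)    # the noisier variable gives lower expected utility
```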
When location differs, we can correct by simply de-meaning and comparing $\tilde x - \bar x$ vs. $\tilde y - \bar y$, or
comparing $\tilde y$ vs. $\tilde x + (\bar y - \bar x)$. This, however, shifts around the distributions and is itself somewhat
cumbersome and even a little odd when our goal is to describe a risk/return tradeoff.
A closely related concept to riskier is second order stochastic dominance, “SSD”. This concept
incorporates the correction for location within the comparison.
$\tilde x$ displays weak 2nd order stochastic dominance over $\tilde y$ if:
$\tilde y \overset{d}{=} \tilde x + \tilde\xi + \tilde\varepsilon$ where $\tilde\xi \le 0$
and $E[\tilde\varepsilon \mid x + \xi] = 0$ $\forall x, \xi$.
Theorem 2: $\tilde y$ is preferred to $\tilde x$ by no strictly increasing concave utility function iff $\tilde x$ SSD $\tilde y$.
Note: SSD incorporates the correction for location, so the class of utility functions for which this holds
must be narrower.
Proof (sufficiency):
Theorem 1 provides $E[u(\tilde x + \tilde\xi)] \ge E[u(\tilde x + \tilde\xi + \tilde\varepsilon)]$ for all concave utility functions.
Because $\tilde\xi$ is non-positive, 1st order stochastic dominance provides that:
$E[u(\tilde x)] \ge E[u(\tilde x + \tilde\xi)]$ for strictly increasing utility functions.
Thus $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all increasing concave utility functions (since $\tilde y$ and $\tilde x + \tilde\xi + \tilde\varepsilon$ have
the same distributions they have the same expected utility).
The relation between riskiness and SSD is strong but they are not identical.
If $\bar x \ge \bar y$ and $\tilde x - \bar x$ is less risky than $\tilde y - \bar y$ we know that $\tilde x$ 2nd order stochastically dominates $\tilde y$, since
we can write $\tilde y \overset{d}{=} \tilde x + (\bar y - \bar x) + \tilde\varepsilon$ with $E(\tilde\varepsilon \mid x) = 0$ (from “riskier”) so that
$E(\tilde\varepsilon \mid x + (\bar y - \bar x)) = 0$, where $\xi = \bar y - \bar x$ is a degenerate non-positive random variable.
However, if $\tilde x$ SSD $\tilde y$, while we know that $\bar x \ge \bar y$ we cannot say that $\tilde y - \bar y$ is riskier than $\tilde x - \bar x$. For
one thing, they may not even be ranked. Further, if they are, it is possible that $\tilde x - \bar x$ is riskier
than $\tilde y - \bar y$. Let $\tilde y \sim U(0,1)$ and $\tilde x \sim U(2,8)$. $\tilde x$ dominates $\tilde y$, and so it FSDs and SSDs $\tilde y$, but
$\tilde x - \bar x$ is riskier than $\tilde y - \bar y$.
The difference lies in the correction for location. In SSD it is built in, we compare $\tilde x$ and $\tilde y$, while in
“riskier” we compare $\tilde x - \bar x$ to $\tilde y - \bar y$. An $\tilde x$ with a big mean and dispersion versus a $\tilde y$ with a small
mean and small dispersion may be evaluated differently than $\tilde x - \bar x$ vs. $\tilde y - \bar y$.
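The uniform example can be checked with the distributional conditions. The sketch below is a numerical illustration, not a proof: it tests first order dominance on a grid, compares variances, and evaluates the integral condition for “riskier” on the de-meaned variables with a midpoint sum. The grid sizes and tolerance are arbitrary choices made here.

```python
def unif_cdf(t, lo, hi):
    """CDF of a uniform random variable on [lo, hi]."""
    return min(1.0, max(0.0, (t - lo) / (hi - lo)))

# First order dominance of x ~ U(2, 8) over y ~ U(0, 1): G(t) >= F(t) for all t.
grid = [i / 100 for i in range(-100, 901)]
fsd = all(unif_cdf(t, 0, 1) >= unif_cdf(t, 2, 8) for t in grid)

var_x = (8 - 2) ** 2 / 12
var_y = (1 - 0) ** 2 / 12

# "Riskier" on de-meaned variables: x - mean ~ U(-3, 3), y - mean ~ U(-0.5, 0.5).
# The running integral of [CDF of the riskier - CDF of the less risky] should be
# non-negative everywhere and return to zero at the upper end of the support.
n = 600
h = 6 / n
delta = 0.0
min_delta = 0.0
for k in range(n):
    s = -3 + (k + 0.5) * h
    delta += (unif_cdf(s, -3, 3) - unif_cdf(s, -0.5, 0.5)) * h
    min_delta = min(min_delta, delta)

print(fsd, var_x > var_y)
print(min_delta > -1e-9, abs(delta) < 1e-9)
```

So $\tilde x$ dominates in the stochastic dominance sense while its de-meaned version is the riskier one, as the text claims.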
 After a deviation to review the general portfolio problem and results on portfolio choice, where this
is heading is a general study of the efficient set of portfolios. SSD will play a large role.
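Table: x and y are random variables defined on [a, b] (see Ingersoll pg 123). F and G are the distribution
functions of $\tilde x$ and $\tilde y$, and $\Delta(t) \equiv \int_a^t [G(s) - F(s)]\,ds$.

Concept: Dominance
  Utility condition: $E[u(\tilde x + \tilde w)] \ge E[u(\tilde y + \tilde w)]$ for any random variable $\tilde w$, for all increasing utility functions
  Random variable condition: $\tilde y = \tilde x + \tilde\xi$, $\tilde\xi \le 0$ (Note: $=$, not $\overset{d}{=}$)
  Distributional condition: outcome by outcome

Concept: FOSD
  Utility condition: $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all increasing utility functions
  Random variable condition: $\tilde y \overset{d}{=} \tilde x + \tilde\xi$, $\tilde\xi \le 0$
  Distributional condition: $G(t) \ge F(t)$ $\forall t$, i.e. Prob(x ≥ t) ≥ Prob(y ≥ t)

Concept: SOSD
  Utility condition: $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all increasing concave utility functions
  Random variable condition: $\tilde y \overset{d}{=} \tilde x + \tilde\xi + \tilde\varepsilon$, $\tilde\xi \le 0$, $E[\tilde\varepsilon \mid x + \xi] = 0$ $\forall x, \xi$
  Distributional condition: $\Delta(t) \ge 0$ $\forall t$ (“on average” G > F)

Concept: Riskier
  Utility condition: $E[u(\tilde x)] \ge E[u(\tilde y)]$ for all concave utility functions
  Random variable condition: $\tilde y \overset{d}{=} \tilde x + \tilde\varepsilon$, $E[\tilde\varepsilon \mid x] = 0$
  Distributional condition: $\Delta(t) \ge 0$ $\forall t$ and $\Delta(b) = 0$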
Now, Portfolio choice: To quickly review: The general portfolio problem can be cast as: an agent
chooses a portfolio to maximize his/her expected utility of returns:
$$\max_w \; E[u(\tilde z_w)], \qquad \tilde z_w \equiv \sum_{i=1}^N w_i\tilde z_i \;\; \Big(\text{in state } s:\; \sum_{i=1}^N w_i Z_{si}\Big)$$
s.t. $\mathbf{1}'w = 1$ (i.e. $\sum_i w_i = 1$)
Some other restrictions we might (but won’t) use include:

• $w \ge 0$ - no short sales
• $\tilde z \ge 0$ - limited liability for the primitive assets
 z*  1 - non-negative wealth constraint (bankruptcy)
Form the Lagrangian:
$$L = E[u(\tilde z_w)] + \lambda(1 - \mathbf{1}'w)$$
FOCs:
$$\frac{\partial L}{\partial w}: \quad E[u'(\tilde z^*)\tilde z] = \lambda\mathbf{1} \quad\text{with } \lambda > 0 \text{ and } \tilde z^* = \tilde z'w^*$$
or, for each asset:
$$\frac{\partial L}{\partial w_i}: \quad E[u'(\tilde z^*)\tilde z_i] = \lambda \quad\text{for all assets } i$$
$$\frac{\partial L}{\partial\lambda}: \quad \mathbf{1}'w^* = 1$$
The concavity of u(·) and linearity of the constraint imply that the FOCs are necessary and sufficient
for a maximum.
The FOC must hold for the riskless asset, if it exists: $E[u'(\tilde z^*)R] = \lambda$
So, we can subtract this from the general condition to find: $E[u'(\tilde z^*)(\tilde z_i - R)] = 0$ $\forall i$
in the presence of a riskless asset. This will help us examine some results concerning portfolio
formation in this general context.
Theorem: If a solution w* exists for a strictly concave utility function and a set of assets Z then the
probability distribution of its return is unique and if there are no redundant assets then the portfolio w*
is unique as well.
Proof: For each feasible w define new control variables $\theta_s$ by $\theta = Zw$, or $\theta_s = \sum_i w_i Z_{si}$,
s = 1, ..., S, and define the function v(·), a derived utility function over state returns:
$$v(\theta_1, \ldots, \theta_S) \equiv E[u(\theta_s)] = \sum_s \pi_s u\Big(\sum_i w_i Z_{si}\Big)$$
The portfolio choice problem can then be written:
$$\max_\theta \; v(\theta_1, \ldots, \theta_S) \quad\text{s.t. } \exists\, w \text{ with } \theta_s = \sum_i w_i Z_{si} \;\forall s$$
This is a standard convex optimization problem with an objective function that is strictly concave in the
choice variables $\theta$ (note that $E[u(\tilde z_w)]$ need not be strictly concave in w when there are redundant
assets) and a set of linear constraints. Thus, the optimum (θ*),
if it exists, is unique. The vector θ* describes the distribution of returns across the states, so write:
θ* = Zw*
If there are no redundant assets, Z has a left inverse $L \equiv (Z'Z)^{-1}Z'$ and $w^* = L\theta^*$ is then also
unique.
We can say something specific about choices made by agents in this very general context but not much:
Theorem: The optimal portfolio for a strictly risk averse, non-satiated investor will be the riskless
security if and only if $\bar z_j = R$ $\forall j$ = 1, 2, …, N.
Proof: When a riskless asset exists, w* satisfies the FOC: $E[u'(\tilde z^*)(\tilde z_j - R)] = 0$
If $\bar z_j = R$ $\forall j$ then $\tilde z^* = R$ will satisfy the FOC. Since u(·) is strictly concave we know the probability
distribution of the return is unique. If there are no redundant assets the portfolio is unique. (Strictly risk
averse agents hold only risk free portfolios or the risk free asset. We also see why the “strictly concave”
is required for uniqueness of the solution in the theorem above.)
If $\tilde z^* = R$ we can write the FOC as $u'(R)E(\tilde z_j - R) = 0$.
$u'(R) > 0$ from strictly monotone u(·), so if $\tilde z^* = R$ is optimal, it must be that $\bar z_j = R$ $\forall j$.
This generalizes our earlier two-asset (one risky, one riskless) result.
Finally, Efficient portfolios: the efficient set of portfolios – those portfolios for which there are no
other portfolios with the same or greater expected return and less risk. Alternatively, those portfolios
which are not second order stochastically dominated.
More specifically…
Definition: A portfolio w is efficient if $w \in E$, where
$$E \equiv \Big\{\hat w \in R^N \;\Big|\; (\exists u \in U)\Big(E[u(\tilde z'\hat w)] = \max_{w \in R^N,\; \mathbf{1}'w = 1} E[u(\tilde z'w)]\Big)\Big\}$$
i.e. if there exists a utility function $u \in U$ (strictly monotone, concave class), for which w solves the
investor’s problem:
$\max_w E[u(\tilde z'w)]$
such that $\mathbf{1}'w = 1$ (budget constraint)
$w \in R^N$ (w is a portfolio of the traded assets, a restriction whose strength
is determined by the structure of Z)
We use U, the strictly monotone, concave class, because if we use the monotone, concave class, E is
trivially all $w \in R^N$ with $\mathbf{1}'w = 1$, since this class allows a constant utility function, making the
definition uninteresting and useless.
The following theorem shows that: Efficient portfolios are those for which there are no other portfolios
with the same or greater expected return and less risk (those that are not SSDed). But, not the set for
which there is not a less risky portfolio with the same mean return – this set conceptually includes the
lower limb of the CAPM hyperbola.
Theorem: For some efficient portfolio k with returns given by $\tilde z_e^k$, if $\tilde z_e^k - \bar z_e^k$ is riskier than
$\tilde z_w - \bar z_w$ (for any portfolio w), then $\bar z_e^k \ge \bar z_w$.
Note: It is not true in general that $\bar z_e^k < \bar z_w$ implies that $\tilde z_w - \bar z_w$ is riskier than
$\tilde z_e^k - \bar z_e^k$ since “riskier” is an incomplete ordering.
Proof: If $\tilde z_e^k - \bar z_e^k$ is riskier than $\tilde z_w - \bar z_w$, then
$E[u(\tilde z_e^k)] \le E[u(\tilde z_w - \bar z_w + \bar z_e^k)]$ $\forall u \in U$ (by the definition of riskier).
If $\bar z_e^k < \bar z_w$, then
$E[u(\tilde z_w - \bar z_w + \bar z_e^k)] < E[u(\tilde z_w)]$ so, $E[u(\tilde z_e^k)] < E[u(\tilde z_w)]$ $\forall u \in U$ (monotonicity).
But, this contradicts the assumption that $\tilde z_e^k$ is an efficient portfolio’s return, so it must be
that $\bar z_e^k \ge \bar z_w$ holds.
e
From this, we get two corollaries:

• Corollary 1: $\bar z_e^k \ge R$ for all efficient portfolios since all efficient portfolios are weakly riskier
than the risk free asset.
• Corollary 2: The “riskier” an efficient portfolio is, the higher is its expected return. In other
words, there is a risk/return tradeoff.
Now let’s consider an alternative way to characterize the efficient set using a simple complete markets
example with two states (so we can draw it). Since the market is complete, we can find a portfolio that
has any distribution of wealth across the two states that we would like. Therefore, we can make W1 and
$W_2$ (wealth in states 1 and 2: $W_o z_1^*$ and $W_o z_2^*$) the choice variables.
$$\max_{W_1, W_2} \; \pi_1 u(W_1) + \pi_2 u(W_2) = E[u(W)]$$
subject to: $p_1W_1 + p_2W_2 = W_o$ (budget constraint is written using the unique state prices)
$$L = \pi_1 u(W_1) + \pi_2 u(W_2) + \lambda(W_o - p_1W_1 - p_2W_2)$$
The FOCs give:
$$\pi_1 u'(W_1) - \lambda p_1 = 0 \quad\text{or,}\quad p_1 = \frac{\pi_1 u'(W_1)}{\lambda}$$
$$\pi_2 u'(W_2) - \lambda p_2 = 0 \quad\text{or,}\quad p_2 = \frac{\pi_2 u'(W_2)}{\lambda}$$
Or, $\dfrac{p_s}{\pi_s} = \dfrac{u'(W_s)}{\lambda}$ = state price density …just as we saw before.
s 
Note: Under risk neutrality, state prices are proportional to the actual probabilities (equivalently, the state price density is constant across the states). Thus, again we see that a single risk neutral agent in the economy (who faces no restrictions on short sales or borrowing) trades to set prices in such a way that $\lambda$ is constant. Any risk averse agents in the economy will then (in equilibrium) also trade so that $u'(W_s)$ is constant across all states. Thus, their optimal choice must be to hold riskless positions. Said another way: the risk neutral agent trades so that there is no reward for risk bearing (all assets have $E(z) = R$), so no risk averse agent bears any risk.
Examine the set of efficient (optimal) portfolios for this two state example:
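[Figure: two-state diagram with $W_1$ on the horizontal axis and $W_2$ on the vertical axis. It shows the 45° line, the riskless asset r on the 45° line, the budget constraint, a line of constant expected wealth through r, and the points d and f.]
The line of constant expected wealth is $\pi_1 W_1 + \pi_2 W_2 = k$, or $W_2 = \dfrac{k}{\pi_2} - \dfrac{\pi_1}{\pi_2} W_1$, with slope $-\dfrac{\pi_1}{\pi_2}$.
The budget constraint is $p_1 W_1 + p_2 W_2 = W_0$, or $W_2 = \dfrac{W_0}{p_2} - \dfrac{p_1}{p_2} W_1$, with slope $-\dfrac{p_1}{p_2}$.
Simply assume: $-\dfrac{p_1}{p_2} < -\dfrac{\pi_1}{\pi_2}$, so the budget constraint is the steeper of the two lines.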
Draw an indifference curve for an arbitrary risk averse utility function. How do we know that the
constant expected wealth line is tangent to the indifference curve at the 45° line?
It must be – from r outward on the line, you stay at the same expected wealth but add risk if you move
in either direction. Thus, the constant expected wealth line through r must be tangent as it must be
below the indifference curve away from r on either side.
Which of the possible consumption bundles will be chosen by a rational agent?
• Consider point d in the picture. By concavity of the utility function, d is no better than r (same expected wealth but more risk), and by strict monotonicity f is dominated by d since d has larger $W_1$ and $W_2$. So, by analogy, all points on the budget line below r are dominated by r, the riskless asset (they all give less expected wealth and more risk).
• Rational choices are above the point r on the budget line in this picture. Thus, only choices with $W_2 \ge W_1$ are optimal. Moving in this direction gets more risk and higher (not lower) expected wealth.
Where will the indifference curve be tangent to the budget constraint?
• It depends on u, but we know it will be above point r because we assumed the budget line had a steeper slope than the constant expected value line (that is, $-\dfrac{p_1}{p_2} < -\dfrac{\pi_1}{\pi_2}$, or $\dfrac{p_1}{p_2} > \dfrac{\pi_1}{\pi_2}$, or $\dfrac{p_1}{\pi_1} > \dfrac{p_2}{\pi_2}$, or $\lambda_1 > \lambda_2$).
• Thus, the state price per unit probability of wealth in state 1 is greater than in state 2. Choosing $W_2 \ge W_1$ is, in this sense, a cost minimizing choice (for a given distribution of wealth, choosing $W_2 \ge W_1$ has lower cost than the reverse). Given state independent vN-M utility we will show later that cost minimization in this way is really the only restriction imposed by maximizing behavior.
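We can also find the result that if $\lambda_1 > \lambda_2$ then $W_2 \ge W_1$ from the FOC of the investor’s decision problem in this example:
$$u'(W_s) = \mu \lambda_s$$
where $\mu > 0$ is the Lagrange multiplier on the budget constraint. So, if $\lambda_1 > \lambda_2$ we require $u'(W_1) > u'(W_2)$, or, since $u'$ is weakly decreasing, $W_1 < W_2$ (and so in particular $W_2 \ge W_1$).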
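As a minimal numerical sketch of this result, assume log utility, $u(W) = \ln W$ (an illustrative choice, not one made in the notes). The FOC $\pi_s / W_s = \mu p_s$ and the budget constraint then give $\mu = 1/W_0$ and $W_s = W_0 / \lambda_s$. The probabilities, state prices, and initial wealth below are hypothetical values picked so that $\lambda_1 > \lambda_2$.

```python
# Two-state complete market with log utility (an assumed, illustrative case).
pi = [0.5, 0.5]   # hypothetical state probabilities
p = [0.6, 0.3]    # hypothetical state prices, chosen so that p1/pi1 > p2/pi2
W0 = 1.0          # hypothetical initial wealth

# State price density: lambda_s = p_s / pi_s.
lam = [p_s / pi_s for p_s, pi_s in zip(p, pi)]

# For u(W) = ln W the FOC pi_s / W_s = mu * p_s plus the budget constraint
# imply mu = 1 / W0 and W_s = W0 / lambda_s.
W = [W0 / l for l in lam]

# The optimal bundle must exhaust the budget.
budget = sum(p_s * W_s for p_s, W_s in zip(p, W))

print("lambda:", lam)
print("optimal wealth:", W)
print("cost of bundle:", budget)
```

Wealth is larger in the state with the lower state price density, which is the cost minimizing pattern described above.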
We can generalize this two state example:

Theorem (Dybvig, J of B ‘88): Assume a complete market and equally probable states. For the strictly monotone, concave class of utility functions U, w is an efficient portfolio if and only if, for all states r, s:
$$\lambda_r > \lambda_s \Rightarrow Z_{ws} \ge Z_{wr}$$
Proof (sketch): Suppose w is an efficient portfolio. Then it follows from the FOC ($u'(Z_{ws}) = \mu \lambda_s$, with $\mu$ the Lagrange multiplier) and the concavity of u(·) that $Z_{wr} > Z_{ws} \Rightarrow \lambda_s \ge \lambda_r$ (the second inequality is weak since we don’t require strict concavity). The contrapositive of this is: $\lambda_r > \lambda_s \Rightarrow Z_{wr} \le Z_{ws}$. Now suppose that $\lambda_r > \lambda_s \Rightarrow Z_{wr} \le Z_{ws}$, or equivalently $Z_{wr} > Z_{ws} \Rightarrow \lambda_s \ge \lambda_r$. Graphically, this says that $\lambda$ is weakly decreasing in $Z_w$:

[Figure: $\lambda$ plotted against $Z_w$.]
Choose any function g(·) which has $g(Z_{ws}) = \lambda_s$ for $\lambda_s > 0$. We know that this function is positive and weakly decreasing, so re-label it $u'(Z_{ws}) = \lambda_s$.
Integrate this function and you get a strictly monotone concave function u(·) that satisfies the FOC at the candidate w. Note that we have found a u(·) scaled so that the Lagrange multiplier $\mu$ equals 1. But we can always do this since u(·) is unique only up to a positive affine transformation. The constant of integration from the step $\int u' = u$ is the “rest” of this transformation. This is the equivalence between maximizing behavior and cost minimization.
Problem: Look back to the two state complete markets example. Prove that if state prices are proportional to the actual probabilities then any choice not on the 45° line is riskier than a bundle on the 45° line.
Instead of the formal proof of sufficiency for the last theorem, to provide a little intuition, look at the situation of a complete market where $\pi_s = 1/S$ for all $s = 1, \dots, S$ (an unnecessary simplification). Now, since there are S states, there are S! ways to order or assign the lottery outcomes to states (which state generates greater wealth than which). This means that there is some cheapest way to order the lottery. Suppose that one of the cheapest ways does not assign outcomes in reverse order to the state price density. Then there exist states r, s such that $\lambda_r > \lambda_s$ but $w_r > w_s$. Now, switch the outcomes for states r and s. The change in cost is:
$$(p_r w_s + p_s w_r) - (p_r w_r + p_s w_s) = (w_s - w_r)(p_r - p_s) = (w_s - w_r)(\lambda_r - \lambda_s)/S$$
(using $p_s = \pi_s \lambda_s = \lambda_s / S$), which is a negative number. This implies a cost decrease with no change in expected utility for state independent utility of wealth, and so a contradiction.
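The swapping argument can be checked by brute force in a small case. The sketch below uses hypothetical state price densities and lottery outcomes with four equally probable states; it enumerates all S! assignments and compares the cheapest one with the reverse-order rule (largest outcome to the state with the smallest $\lambda$).

```python
from itertools import permutations

S = 4
lam = [1.6, 0.4, 1.2, 0.8]        # hypothetical state price densities
p = [l / S for l in lam]          # state prices p_s = pi_s * lambda_s with pi_s = 1/S
outcomes = [5.0, 9.0, 2.0, 7.0]   # hypothetical lottery outcomes to be assigned to states

def cost(assignment):
    """Cost today of receiving assignment[s] in state s."""
    return sum(p_s * w_s for p_s, w_s in zip(p, assignment))

# Brute force over all S! ways of assigning outcomes to states.
cheapest = min(permutations(outcomes), key=cost)

# Reverse-order rule: the smallest outcome goes to the state with the largest lambda.
states_by_lam_desc = sorted(range(S), key=lambda s: lam[s], reverse=True)
rule = [0.0] * S
for s, w in zip(states_by_lam_desc, sorted(outcomes)):
    rule[s] = w

print("cheapest assignment:", list(cheapest))
print("reverse-order rule: ", rule)
```

Because every assignment is a permutation of the same outcomes over equally probable states, all of them give the same expected utility; only the cost differs.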
Example – Verifying the efficiency of a given portfolio:

The last theorem can be generalized and Ingersoll uses the general result to build a numerical example
illustrating the idea that verifying the efficiency of a given portfolio requires only that we check the
ordering of returns it assigns across states. It is, therefore, also true that all portfolios which impose the
same orderings on their returns as that of an efficient portfolio are efficient, i.e., some strictly increasing
concave utility function sees that portfolio as optimal.
If the matrix Z has no redundant assets then the rank of Z is $N \le S$.

When N = S the market is complete and thus there is a unique vector $\lambda$ that supports Z. Thus there is only one ordering of the state contingent payoffs that can represent an efficient portfolio.
If N < S, the market is incomplete and $\lambda$ is not unique (there are N equations in S unknowns in the supporting equation), so not all efficient portfolios need have the same orderings of returns. An example from the text will help illustrate this. Here it has been changed by putting things in terms of the $\lambda_s$ where the book uses marginal utilities; see the FOC for the translation:
0.6 2.4
1
Consider the market characterized by: Z  1.2 1.5  where  1   2   3 
3
3.0 0.6
S=3  3! or 6 potential orderings of returns. With 2 assets, only 4 orderings are feasible.
w1  3 5 (1, 2,3)
3  w  3 
7 1 5 ( 2,1,3)  st
The feasible set is:  Market possibilities – state with lowest return is 1
1  w  3 ( 2,3 ,1)
3 1 7 
w1  3
1 (3,2,1)
(Write returns in each state as a function of w1 and 1 – w1 then graph Zws s = 1, 2, 3 against w1.)
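The feasible orderings can be recovered numerically from the matrix Z above. The sketch below is one way to do it under illustrative choices: it evaluates the state returns on a grid of exact fractions for $w_1$ (the grid range and step are arbitrary) and records the ordering of states from lowest to highest return, skipping the three crossover weights where two states tie.

```python
from fractions import Fraction as F

# Rows are states, columns are assets (the matrix Z from the example).
Z = [[F(6, 10), F(24, 10)],
     [F(12, 10), F(15, 10)],
     [F(3), F(6, 10)]]

def returns(w1):
    """State-by-state portfolio return with weight w1 on asset 1 and 1 - w1 on asset 2."""
    return [row[0] * w1 + row[1] * (1 - w1) for row in Z]

def ordering(w1):
    """States (1-based) listed from lowest return to highest return."""
    z = returns(w1)
    return tuple(s + 1 for s in sorted(range(3), key=lambda s: z[s]))

# Weights at which two states have equal returns: 1/3, 3/7 and 3/5.
crossovers = {F(1, 3), F(3, 7), F(3, 5)}
grid = [F(k, 100) for k in range(-200, 301)]   # arbitrary grid from -2 to 3
feasible = {ordering(w) for w in grid if w not in crossovers}

print(sorted(feasible))
```

Only four of the six possible orderings appear, matching the feasible set listed above.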
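Now look at an optimizing investor’s FOC to consider efficiency:

$$\sum_s \pi_s u'(z_s^*) z_{si} = \mu \quad \text{for } i = 1, 2. \qquad \text{Let } u'(z_s^*) \equiv u'_s.$$
Since $\lambda_s = \dfrac{u'_s}{\mu}$, this FOC can also be written $E(\lambda z_i) = 1$ or $\sum_s \pi_s \lambda_s z_{si} = 1$ for i = 1, 2.

Here, these two equations are:

$$0.2\lambda_1 + 0.4\lambda_2 + 1.0\lambda_3 = 1$$
$$0.8\lambda_1 + 0.5\lambda_2 + 0.2\lambda_3 = 1$$
This has solutions of:
$$\lambda_1 = \tfrac{21}{11}\lambda_3 - \tfrac{5}{11}, \qquad \lambda_2 = -\tfrac{38}{11}\lambda_3 + \tfrac{30}{11}, \qquad \lambda_3 = \lambda_3$$
with $\tfrac{5}{21} < \lambda_3 < \tfrac{15}{19}$ so that all three are positive. These represent efficient portfolios.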
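Graph these three lines, $\lambda_1$, $\lambda_2$ and $\lambda_3$, as functions of $\lambda_3$:
[Figure: the three lines $\lambda_1$, $\lambda_2$, $\lambda_3$ plotted against $\lambda_3$.]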
Thus, four orderings are efficient.

From high $\lambda$ to low (i.e. the state with the highest $\lambda$ is listed first):
(2, 3, 1), (2, 1, 3), (1, 2, 3), (1, 3, 2)
The reverse of the final ordering is not feasible (see above): we can’t find a portfolio of these two assets that gives returns that are lowest in state 1 and highest in state 2. The first three represent all the portfolios with $w_1 > \tfrac{1}{3}$. So, any such portfolio is optimal for some agent. Of the feasible orderings, only (3, 2, 1) ($w_1 < \tfrac{1}{3}$) isn’t efficient. This isn’t a practical way to approach the issue but it helps us understand the next set of theorems and the discussion of systematic risk to come.
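The four efficient orderings can be reproduced by sweeping the free parameter $\lambda_3$ through the one-parameter family of supporting state price densities. The sketch below does this on an arbitrary grid of exact fractions, keeps only strictly positive and tie-free $\lambda$ vectors, and records the ordering of states from highest to lowest $\lambda$. The helper names are illustrative.

```python
from fractions import Fraction as F

Z = [[F(6, 10), F(24, 10)],
     [F(12, 10), F(15, 10)],
     [F(3), F(6, 10)]]
pi = F(1, 3)   # equal state probabilities

def lambdas(l3):
    """Supporting state price densities as functions of the free parameter lambda_3."""
    l1 = F(21, 11) * l3 - F(5, 11)
    l2 = F(30, 11) - F(38, 11) * l3
    return [l1, l2, l3]

def supports(lam):
    """Check the pricing equations E[lambda * z_i] = 1 for both assets."""
    return all(sum(pi * lam[s] * Z[s][i] for s in range(3)) == 1 for i in range(2))

efficient = set()
for k in range(1, 1000):                      # arbitrary grid on (0, 1)
    lam = lambdas(F(k, 1000))
    if min(lam) <= 0 or len(set(lam)) < 3:    # need positive prices; skip exact ties
        continue
    order = tuple(s + 1 for s in sorted(range(3), key=lambda s: lam[s], reverse=True))
    efficient.add(order)

print(sorted(efficient))
```

Reading each tuple as "lowest return first" for the supported portfolio, three of these orderings coincide with feasible return orderings and the fourth, (1, 3, 2), does not.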
Homework problem: Now change the probabilities to $\pi_1 = \tfrac{1}{2}$, $\pi_2 = \tfrac{1}{3}$, $\pi_3 = \tfrac{1}{6}$.

Which portfolios are optimal portfolios? Does the resulting change in the answer make sense? That is, explain the difference between the two cases using the characteristics of the two assets. Relate this to the mean variance efficient portfolios. Is this reasonable?
Convexity of the Efficient Set:

In studying the relation of the CAPM to the more general model it is interesting to ask: is the efficient set convex? Why? To answer this question we use the technology developed above.
• Again, label the set of efficient portfolios E and call the set of marketed assets M (the feasible set of portfolios).
• We define an Arrow-Debreu world, or complete market, as one in which the positive supporting state prices are unique.
In complete markets $\lambda$, and so the ordering of the $\lambda_s$’s across states, is unique. Thus, we have:
$$E = M \cap \{z \mid z_{r_1} \le z_{s_1}\} \cap \{z \mid z_{r_2} \le z_{s_2}\} \cap \dots \cap \{z \mid z_{r_L} \le z_{s_L}\}$$
where a pair of states (r, s) is considered in these restrictions, $(r, s) = (r_i, s_i)$ for some i, if and only if $\lambda_r > \lambda_s$, and L is the number of distinct pairs $(\lambda_r, \lambda_s)$ with $\lambda_r > \lambda_s$.
If we label states so that $\lambda_1 \le \lambda_2 \le \lambda_3 \le \dots$ then E is the intersection of the set of returns patterns with $z_1 \ge z_2$ if $\lambda_2 > \lambda_1$, the set with $z_2 \ge z_3$ if $\lambda_3 > \lambda_2$, etc., all intersected with the set of marketed assets. We then have the following:
Theorem: If M is a complete market then E is a convex set.
Proof: E is the intersection of convex sets and so is convex.
Theorem: In a risk neutral market E = M, so E is necessarily convex.
Proof: In this case $\lambda_s$ is constant across states. Thus, all orderings of returns across states are efficient. From the FOC we know $z_r > z_s \Rightarrow \lambda_r \le \lambda_s$, but also $z_s > z_r \Rightarrow \lambda_r \ge \lambda_s$. Thus, both orderings are efficient if $\lambda_r = \lambda_s$.
Alternatively: All assets have the same expected return and so all possible portfolios may be held by maximizing risk neutral agents.
In general, $E = M \cap \left( \bigcup_{\lambda} E_\lambda \right)$. This second set is the union over all valid $\lambda$s of $E_\lambda$.

That is, $E_\lambda = \bigcap_l \{z \mid z_{r_l} \le z_{s_l}\}$ where $(r, s) = (r_l, s_l)$ for some l if and only if $\lambda_{r_l} > \lambda_{s_l}$ for a given $\lambda$.
In incomplete markets we must account for the multiple orderings possible because there are different valid $\lambda$s.
Two technical results are stated without proof:

1. For U and a finite state space, E is the union of a finite number of closed convex sets and hence
is closed.
2. E is connected – i.e. E cannot be represented as the union of two separate (disjoint) sets.
Definition: A market is said to exhibit k-fund separation (“kfs”) if there exist $z^1, z^2, \dots, z^k \in M$ such that
$$E \subseteq \left\{ z \;\middle|\; z = \sum_{i=1}^{k} w_i z^i, \; \sum_{i=1}^{k} w_i = 1 \right\}$$
i.e. the efficient set is contained in a set spanned by k “mutual funds”.
Theorem: E is convex whenever M exhibits two-fund separation (2fs).
Proof: From 2fs, E is contained in a line, and since E is connected, E is convex, because connectedness implies convexity in $\mathbb{R}^1$.
Thus, we now have three special cases where we know E is convex and so we know that the market
portfolio is an efficient portfolio. However, one is uninteresting from a risk-return perspective and the
other two are actually incompatible (see Dybvig & Ingersoll). Now, we present a counter-example from
Dybvig & Ross that shows E is not convex in general.
Theorem: The efficient set is not necessarily convex and kfs (with $k \ge 3$) does not guarantee convexity.
Proof: Shown by counter-example. Assume 3 assets and 4 states. We could have less trivial 3fs by
splitting one state into many indistinguishable states and introducing fair gambles with respect to these
new states as new primary assets.
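Let $\pi_1 = \pi_2 = \pi_3 = \pi_4 = \tfrac{1}{4}$ and let
$$Z = \begin{pmatrix} 66 & 82 & 72 \\ 44 & 52 & 48 \\ 52 & 38 & 48 \\ 50 & 50 & 48 \end{pmatrix}$$
(rows are states, columns are assets).
The valid set of price vectors is: $\left( \mathbb{R}^4_+ \setminus \{0\} \right) \cap \mathrm{span}\{(1, 68, 40, 59), (21, 28, 40, 39)\}$.
We can divide by 8,088 or 6,648 respectively to get $Z'p = \mathbf{1}$.
Thus, we have two orderings on p and, since $\pi_s = \tfrac{1}{4}$ for all s, on $\lambda$.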
Look at $z_1 = Zw_1$ and $z_2 = Zw_2$ with $w_1 = (1, 0, 0)'$ and $w_2 = (0, 1, 0)'$.

Since $z_1 = (66, 44, 52, 50)$ has an opposite ordering to (1, 68, 40, 59) and $z_2 = (82, 52, 38, 50)$ has an opposite ordering to (21, 28, 40, 39), both asset 1 and asset 2 are efficient “portfolios”.
However, $Z(\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2) = (74, 48, 45, 50)$ is not efficient since it isn’t in the opposite order to either valid p ($\lambda$).
Alternatively, consider $w^* = (-1, -0.4, 2.4)'$. Because vN-M agents view equally probable states symmetrically, we know that for every strictly monotone vN-M (state independent) utility function:
$$E[u(Zw^*)] = E[u(74, 50.4, 48, 45.2)]$$
$$= E[u(74, 48, 45.2, 50.4)] \quad \text{(state independent utility)}$$
$$> E[u(74, 48, 45, 50)] \quad \text{(strictly monotone utility)}$$
$$= E[u(Z(\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2))]$$
Thus, a convex combination of two efficient portfolios $w_1$ and $w_2$ is not an efficient portfolio since it is dominated by another portfolio for every monotone state independent utility agent. Thus, E is not a convex set.
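The arithmetic in this counter-example is easy to verify directly. The sketch below recomputes the payoffs with exact fractions, checks the "opposite ordering" condition against the two spanning price vectors given above, and confirms that the payoff of $w^*$ dominates that of the midpoint portfolio once states are relabelled. The function names are illustrative.

```python
from fractions import Fraction as F

# Rows are the 4 equally probable states, columns are the 3 assets.
Z = [[66, 82, 72], [44, 52, 48], [52, 38, 48], [50, 50, 48]]
p_a = [1, 68, 40, 59]     # first spanning price vector (unnormalized)
p_b = [21, 28, 40, 39]    # second spanning price vector (unnormalized)

def payoff(w):
    """State-by-state return of portfolio w."""
    return [sum(F(z) * F(x) for z, x in zip(row, w)) for row in Z]

def opposite_order(z, p):
    """True if a strictly higher state price always comes with a weakly lower return."""
    n = len(z)
    return all(z[s] >= z[r] for r in range(n) for s in range(n) if p[r] > p[s])

z1 = payoff([1, 0, 0])
z2 = payoff([0, 1, 0])
mid = payoff([F(1, 2), F(1, 2), 0])
z_star = payoff([F(-1), F(-2, 5), F(12, 5)])

# With equally probable states and state independent utility only the sorted
# payoffs matter, so compare them outcome by outcome.
dominates = all(a >= b for a, b in zip(sorted(z_star), sorted(mid)))

print("z1, z2 efficient:", opposite_order(z1, p_a), opposite_order(z2, p_b))
print("midpoint ordered against p_a or p_b:", opposite_order(mid, p_a), opposite_order(mid, p_b))
print("w* dominates midpoint:", dominates)
```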
[Figure: the efficient set E is the shaded area, with $w_1$, $w_2$ and the midpoint $\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2$ marked.]
Systematic & Non-Systematic Risk

• How do we understand these concepts using the Rothschild-Stiglitz notion of riskiness and our definition of efficiency?
• Analogous to the CAPM and beta, we have seen that the way in which the risk or variability of an asset’s returns affects the risk/expected return of an individual’s optimal portfolio is through its correlation with the state price density $\lambda$ (rather than its correlation with $Z_m$).
    o Equivalently, through the correlation between an investor’s marginal utility of the return on his/her optimal portfolio and any asset’s return.
• That is, if an asset’s returns have an inverse ordering across the states of nature to that of the marginal utility of z* ($\lambda$), then it has a similar correlation with the marginal utility of z* as does z* itself, and much of that asset’s variability contributes to the value of the portfolio. Think of the CAPM and correlation with the market return.
• Systematic risk is the notion of an individual asset’s contribution to the risk of an efficient portfolio (i.e. what portion of an asset’s risk is priced).
Consider marginal utility of returns as a random variable: $\tilde u'_k \equiv u'(\tilde z_e^k)$.
Now, consider a ‘conceptual’ regression:
$$\tilde z_i = \alpha_{ik} + \delta_{ik} \tilde u'_k + \tilde\varepsilon_{ik}$$
which defines the systematic risk and non-systematic risk of asset i with respect to efficient portfolio k and utility function u(·).
Consider $\delta_{ik} = \dfrac{\mathrm{Cov}(\tilde u'_k, \tilde z_i)}{\mathrm{Var}(\tilde u'_k)}$.
Normalize $\delta_{ik}$ by $\delta_{kk} = \dfrac{\mathrm{Cov}(\tilde u'_k, \tilde z_e^k)}{\mathrm{Var}(\tilde u'_k)}$ to remove any influence of the scale of the utility function we are using, so define
$$b_{ik} \equiv \frac{\delta_{ik}}{\delta_{kk}} = \frac{\mathrm{Cov}(\tilde u'_k, \tilde z_i)}{\mathrm{Cov}(\tilde u'_k, \tilde z_e^k)}.$$
$b_{ik}$ provides us with a measure of the systematic risk of asset i with respect to efficient portfolio k that is independent of the scale of the utility function, u(·).
• This measure possesses a portfolio property: the b of a portfolio is the weighted average of the b’s of the assets in the portfolio (using the portfolio weights).
• The ordering “asset i is riskier than asset j” by this measure is a complete ordering and the ordering is independent of the efficient portfolio chosen. Therefore, if we can identify an efficient portfolio and measure marginal utility we could correct the problem we had before of an incomplete ordering on riskiness.
• If there is a riskless asset, we can write excess expected returns as being proportional to b. Rearrange the FOC to write:
$$E[\tilde u'_k \tilde z_i] = R \, E[\tilde u'_k]$$
Now rewrite the left hand side as $\mathrm{Cov}[\tilde u'_k, \tilde z_i] + E[\tilde u'_k] \bar z_i$ and rearrange the equation:
$$\mathrm{Cov}[\tilde u'_k, \tilde z_i] = R \, E[\tilde u'_k] - E[\tilde u'_k] \bar z_i = -E[\tilde u'_k](\bar z_i - R)$$
This holds for all assets, including $\tilde z_e^k$, so:
$$b_{ik} = \frac{\mathrm{Cov}(\tilde u'_k, \tilde z_i)}{\mathrm{Cov}(\tilde u'_k, \tilde z_e^k)} = \frac{\bar z_i - R}{\bar z_e^k - R} \quad \text{or} \quad \bar z_i - R = b_{ik}(\bar z_e^k - R)$$
And, since we know $\bar z_e^k \ge R$, $\bar z_i - R$ is positively proportional to $b_{ik}$.
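The pricing relation $\bar z_i - R = b_{ik}(\bar z_e^k - R)$ can be checked numerically in a small finite-state market. The sketch below is an illustration under stated assumptions: the state returns of the efficient portfolio and the payoff of asset i are random hypothetical numbers, and marginal utility is assumed to be $u'(z) = z^{-2}$. Asset i and the riskless asset are priced so that the investor's FOC holds for them, which is what makes portfolio k optimal for this u.

```python
import numpy as np

rng = np.random.default_rng(0)
S = 6
pi = np.full(S, 1.0 / S)               # hypothetical state probabilities
z_e = rng.uniform(0.7, 1.6, size=S)    # hypothetical return on efficient portfolio k
x_i = rng.uniform(0.5, 2.0, size=S)    # hypothetical payoff of asset i

def E(v):
    """Expectation under the state probabilities."""
    return float(pi @ v)

def cov(a, b):
    return E(a * b) - E(a) * E(b)

u_prime = z_e ** -2.0                  # assumed marginal utility u'(z) = z^(-2) at z_e
mu = E(u_prime * z_e)                  # multiplier: z_e satisfies its own FOC E[u' z_e] = mu
R = mu / E(u_prime)                    # riskless return from R * E[u'] = mu
price_i = E(u_prime * x_i) / mu        # price of asset i under lambda = u'/mu
z_i = x_i / price_i                    # return on asset i, so that E[u' z_i] = mu

b_ik = cov(u_prime, z_i) / cov(u_prime, z_e)
lhs = E(z_i) - R
rhs = b_ik * (E(z_e) - R)

print("excess return of asset i:", lhs)
print("b_ik times excess return of k:", rhs)
```

The two printed numbers agree up to floating point error, and the efficient portfolio earns more than R because its return is negatively correlated with marginal utility.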
Suppose there is no riskless asset – What is the equivalent of a zero-beta asset here?
The relation can come more quickly from the standard E[λ zi] = 1 for all assets i.
Our problem is that b is in terms of $u'_k$, which is not something we can easily measure. In order to see the generality of this result, let’s examine some special (familiar) cases.
• If utility is quadratic, $u'$ is linear in $\tilde z_e^k$. Then $b_{ik}$ becomes
$$b_{ik} = \frac{\mathrm{Cov}(\tilde z_e^k, \tilde z_i)}{\mathrm{Var}(\tilde z_e^k)} \equiv \beta_{ik}.$$
Then, all we need is the efficiency of the market portfolio. We get that from the assumed quadratic utility since, as we have seen and will see again, it generates 2fs, which implies E is convex. So, the market portfolio is in E.
• If asset returns are multivariate normal, Stein’s lemma says
$$\mathrm{Cov}(u'(\tilde z_e^k), \tilde z_i) = E[u''(\tilde z_e^k)] \, \mathrm{Cov}(\tilde z_e^k, \tilde z_i)$$
So, $b_{ik}$ again becomes $b_{ik} = \dfrac{\mathrm{Cov}(\tilde z_e^k, \tilde z_i)}{\mathrm{Var}(\tilde z_e^k)}$. See the comment above. Normality also implies 2fs.
• The consumption CAPM (Breeden) spills out of this as well. What we are seeing is that it all depends on what is a sufficient statistic for marginal utility or $\lambda$. With quadratic utility or multivariate normal returns, the return on an efficient portfolio is sufficient for $u'(\tilde z_e^k)$. The assumptions of the CCAPM imply aggregate consumption is a sufficient statistic for marginal utility.
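The Stein's lemma step can be sanity-checked numerically. The sketch below verifies the special case $\tilde z_i = \tilde z_e^k$, that is $\mathrm{Cov}(u'(z), z) = E[u''(z)] \, \mathrm{Var}(z)$ for normal z, using Gauss-Hermite quadrature instead of simulation. The mean, standard deviation, and the exponential form of marginal utility are assumptions made only for this illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Quadrature nodes and weights for the weight function exp(-x^2 / 2);
# normalizing the weights turns them into standard normal probabilities.
nodes, weights = hermegauss(60)
prob = weights / weights.sum()

mean_e, sd_e = 1.1, 0.2            # hypothetical mean and standard deviation of z_e
z_e = mean_e + sd_e * nodes        # z_e ~ N(mean_e, sd_e^2) evaluated at the nodes

a = 3.0                            # assumed risk aversion in u(z) = -exp(-a z) / a
u1 = np.exp(-a * z_e)              # u'(z)
u2 = -a * np.exp(-a * z_e)         # u''(z)

def E(v):
    """Expectation under the normal distribution of z_e."""
    return float(prob @ v)

cov_u1_ze = E(u1 * z_e) - E(u1) * E(z_e)   # left side: Cov(u'(z_e), z_e)
stein_rhs = E(u2) * sd_e ** 2              # right side: E[u''(z_e)] Var(z_e)

print(cov_u1_ze, stein_rhs)
```

For an asset of the form $z_i = c + \beta z_e + \text{noise}$ with noise independent of $z_e$, both covariances scale by the same $\beta$, which is why $b_{ik}$ collapses to the usual beta.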
Nonsystematic Risk:
Consider the $\tilde\varepsilon_{ik}$ from our conceptual regression. $\tilde\varepsilon_{ik}$ depends upon both the benchmark portfolio and the utility function chosen. Thus, the $\tilde\varepsilon_{ik}$ we identified is not an unequivocal measure of nonsystematic risk that will be agreed upon by all investors.
The one exception is a complete market or an effectively complete market. There we know that the marginal utilities of all investors are exactly proportional ($\lambda$ is unique), thus for all efficient portfolios and all utility functions the $u'_k$s are perfectly correlated and $\varepsilon_{ik} = \varepsilon_{ij}$ for all investors. (In a Pareto efficient market all investors see the same state prices, so no valuable trades can be created. Thus, the marginal utilities must all be proportional.)
Sufficient conditions for $\varepsilon$-risk to be nonsystematic when it is uncorrelated with the market return (which is not true generally) are a Pareto efficient market and that $\varepsilon$ is a fair game with respect to $z_m$.
• That is, if we write $\tilde z_i = \tilde x_i + \tilde\varepsilon_i$ where $E[\tilde\varepsilon_i \mid z_m] = 0$, and the market is effectively complete (which implies that E is convex), then $\varepsilon_i$ is unequivocally nonsystematic risk.

• $\tilde x_i$ may be systematic, nonsystematic, or a combination of both types of risk. The market is (effectively) complete so E is convex and the market portfolio is efficient. There exists, therefore, a $u_m$ for which $z_m$ is the efficient portfolio, and from the Pareto efficiency of the market we know $u'_k = a \, u'_m(z_m)$ for all investors k. Since $\varepsilon$ is a fair game with respect to $z_m$ it is also uncorrelated with $u'_m(z_m)$ and so uncorrelated with any $u'_k$. $\varepsilon$ is therefore recognized as nonsystematic risk by all investors and will “have no price”.
In particular models the notion of nonsystematic risk being risk that is uncorrelated with the state price
density can be a handy representation.
6 Portfolio Separation Results

One Fund Separation – 1fs:

1fs: Taste based separation – using utility of returns
• The necessary and sufficient condition for taste based one fund separation is that all investors have the same utility of return function up to a positive affine transformation.
Sufficiency:
If all investors (k) have $u_k = a_k + b_k u$ with $b_k > 0$ for all k, then the FOCs of all investors look alike:
$$E[u'_k(\tilde z_k^*)(\tilde z_i - \tilde z_j)] = 0 \quad \forall k, i, j$$
$$\Rightarrow \; b_k E[u'(\tilde z_k^*)(\tilde z_i - \tilde z_j)] = 0 \quad \forall k, i, j.$$
Note that the individual preference parameter does not affect the FOC, so:
$$E[u'(\tilde z^*)(\tilde z_i - \tilde z_j)] = 0 \quad \forall k, i, j,$$
i.e., the FOC is the same for all investors, so all hold the same portfolio: $z_k^* = z^*$ for all k.
Necessity:
If all investors hold the same portfolio, regardless of how assets’ returns are distributed, then it must be that $E[u'_k(z^*) z_i] = \mu_k$ for all k, i, for all possible asset returns distributions, where $\mu_k$ is investor k’s Lagrange multiplier.
If the assets have returns distributions that are Dirac delta functions, then the FOC is:
$$u'_k(z^*) z_i = \mu_k \quad \forall k, i$$
So, asset by asset we have:
$$\frac{u'_k(z^*)}{\mu_k} = \frac{1}{z_i} \;\text{ for each investor k or j, so }\; \frac{u'_k(z^*)}{\mu_k} = \frac{u'_j(z^*)}{\mu_j}.$$
Thus it is necessary that there is some constant $\theta_{kj}$ such that
$$u'_k(z^*) = \theta_{kj} u'_j(z^*)$$
i.e., for this to hold for a unique z* (regardless of its mean), the marginal utility of z* can differ only by a multiplicative constant. So, $u'_k = b_k u'$. Integrating gives: $u_k(z) = a_k + b_k u(z)$ for all k.
Alternatively: for every investor to hold the same portfolio, regardless of the returns distribution, we must have that for all returns z:
$$\frac{u'_a(z)}{u'_a(1)} = \frac{u'_b(z)}{u'_b(1)} \quad \text{or} \quad u'_a(z) = \frac{u'_a(1)}{u'_b(1)} u'_b(z)$$
i.e., all investors must have the same marginal rate of substitution across realizations of returns. If this is true across all investors for all levels of return, z, we get 1fs.
Integrate:
$$\int_1^r u'_a(s) \, ds = u_a(r) - u_a(1) = \frac{u'_a(1)}{u'_b(1)} \int_1^r u'_b(s) \, ds = \frac{u'_a(1)}{u'_b(1)} [u_b(r) - u_b(1)] \quad \forall r$$
So $u_a$ is a positive affine transformation of $u_b$. Clearly all will hold the same portfolio when the utility functions are all the same up to such a transformation.
Now, look at a (complete) market with one riskless asset and one risky asset:
$$\tilde z = \begin{cases} r & \text{with prob } \tfrac{1}{2} \\ 1 & \text{with prob } \tfrac{1}{2} \end{cases}$$
From a’s FOC (for simplicity, normalize $u_a$ so that the Lagrange multiplier $\mu = 1$):
$$\tfrac{1}{2} u'_a(z_{w^*}(r)) = P_r$$
$$\tfrac{1}{2} u'_a(z_{w^*}(1)) = P_1$$
Then, if the ratio of the marginal utilities is not the same for all investors, so $u'_a \ne \theta \cdot u'_b$ for any constant $\theta$, the investors see different relative state prices and so choose different portfolios in this market.
Cass & Stiglitz worked this out considering utility of wealth. By considering utility of returns, we implicitly assumed initial wealth was the same for all investors. The Cass & Stiglitz results hold for all $W_0 > 0$ and so all investors must have an affine transformation of one CRRA utility function.
• When does $\max E[u(W_0 Zw)]$ s.t. $\mathbf{1}'w = 1$ look the same for all $W_0$?
• If, for example, $u(W) = \dfrac{W^{1-\rho}}{1-\rho}$, then
$$E[u(W_0 Zw)] = E\left[ \frac{(W_0 Zw)^{1-\rho}}{1-\rho} \right] = W_0^{1-\rho} \, \frac{E[(Zw)^{1-\rho}]}{1-\rho}$$
So, the optimal portfolio is independent of $W_0$. $u(W) = \ln(W)$ also works.
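A small grid search illustrates this wealth independence. The sketch below assumes a hypothetical three-state market with one risky and one riskless asset and a relative risk aversion of 3; none of these numbers come from the notes. It finds the utility maximizing weight on the risky asset for several levels of initial wealth.

```python
import numpy as np

pi = np.array([0.3, 0.4, 0.3])        # hypothetical state probabilities
Z = np.array([[0.8, 1.05],            # rows: states; columns: risky asset, riskless asset
              [1.1, 1.05],
              [1.5, 1.05]])
rho = 3.0                             # assumed coefficient of relative risk aversion

def expected_utility(w1, W0):
    """E[u(W0 * Zw)] for CRRA utility with weight w1 on the risky asset."""
    wealth = W0 * (Z[:, 0] * w1 + Z[:, 1] * (1.0 - w1))
    return float(pi @ (wealth ** (1.0 - rho) / (1.0 - rho)))

grid = np.linspace(0.0, 1.5, 1501)    # candidate weights; wealth stays positive on this grid

def best_weight(W0):
    values = [expected_utility(w, W0) for w in grid]
    return float(grid[int(np.argmax(values))])

for W0 in (0.5, 1.0, 200.0):
    print(W0, best_weight(W0))
```

Changing $W_0$ rescales every expected utility by the same positive factor $W_0^{1-\rho}$, so the maximizing weight on the grid does not move.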
1fs: Distribution of Returns based:

The obvious example would be if the $\tilde z_i$ are i.i.d. All risk averse investors (risk neutral investors are indifferent) set $w_i = 1/N$ for all i: expected returns are equal and this minimizes risk.
i.i.d. returns are not necessary for 1fs. One aspect of this example that is required is that all assets must have the same mean. Why?
The necessary and sufficient conditions for 1fs are:

(1) $\tilde z_i = \tilde x + \tilde\varepsilon_i$ with $E[\tilde\varepsilon_i \mid x] = 0$ for all i and x, and
(2) there exists a $w^m$ such that $\mathbf{1}'w^m = 1$ and $\sum_i w_i^m \tilde\varepsilon_i \equiv 0$.
• Sufficiency: with no redundant assets, $w^m$ is unique and $E(\varepsilon_i) = 0$ for all i. So, any holding other than $w^m$ has the same expected return and more risk than $w^m$. That is,
$$\tilde z_m = Zw^m = \sum_i w_i^m \tilde x + \sum_i w_i^m \tilde\varepsilon_i = \tilde x$$
Any other portfolio has returns of:
$$\tilde z_k = \tilde x + \sum_i w_{ik} \tilde\varepsilon_i = \tilde z_m + \tilde\varepsilon_k$$
where $E[\varepsilon_k \mid z_m] = \sum_i w_{ik} E[\varepsilon_i \mid x] = 0$.
So, all portfolios have the same expected return and $z_m$ is less risky than all other portfolios k; thus only $w^m$ will be held by an investor with concave utility.
• Necessity: (We need to show $\tilde z_i = \tilde x + \tilde\varepsilon_i$, not merely equality in distribution, $\tilde z_i \overset{d}{=} \tilde x + \tilde\varepsilon_i$.)
Let $z_m$ be the returns on the assumed optimal portfolio. By leaving the $e_i$ unspecified, we can, without loss of generality, write $\tilde z_i = \tilde z_m + \tilde e_i$. We do know, however, that $E[e_i] = 0$ for all i must be true, since if not some assets will have different expected returns and some investors will hold different portfolios, trading off risk and return in different ways.
The FOC of any investor, regardless of utility function, must hold for all $z_i$ and for $z_m$. So,
$$0 = E[u'(z_m)(z_i - z_m)] = E[u'(z_m) e_i] = \mathrm{Cov}(u'(z_m), e_i) \quad \forall i, \; \forall u'(\cdot),$$
where the last equality holds since $E(e_i) = 0$.
Since $z_m$ must be optimal for all monotonic, concave utility functions, i.e. any positive, decreasing $u'(z)$, by the fair game lemma this zero covariance implies $E[e_i \mid z_m] = E[e_i] = 0$.
(Fair game lemma: $\mathrm{cov}(x, g(y)) = 0 \;\; \forall g(\cdot) \Rightarrow E[x \mid y] = E[x]$.)
So, we have:
$$\tilde z_i = \tilde z_m + \tilde e_i \quad \text{with} \quad E[e_i \mid z_m] = 0$$
Finally, we required, by construction, that the assumed optimal portfolio has no e-risk. So, there exists $w^m$ with $\sum_i w_i^m e_i \equiv 0$.
The story is simply that there is a single source of systematic risk and all assets have the same exposure
to it. Only in this way is there no risk-return tradeoff (must be no such tradeoff so that only one
portfolio will be held regardless of the utility function involved) and only with the fair game property do
you get nice results on riskiness for the idiosyncratic component of returns.
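{i.i.d. returns fit within this restriction. We see this as follows:

Define: $Y \equiv \dfrac{1}{N} \sum_{i=1}^{N} \tilde z_i$ and $\varepsilon_i \equiv \tilde z_i - Y$.
Since the $\tilde z_i$ are i.i.d., $E[\varepsilon_i \mid Y] = k(Y)$ independent of i. So,
$$k(Y) = \frac{1}{N} \sum_{i=1}^{N} E[\varepsilon_i \mid Y] = \frac{1}{N} \sum_{i=1}^{N} E[\tilde z_i - Y \mid Y] = E\left[ \tfrac{1}{N} \sum_i \tilde z_i - Y \,\middle|\, Y \right] = E[Y - Y \mid Y] = 0$$
and the other condition we need holds by construction. }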
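The fair game property in the i.i.d. case can be verified by exact enumeration in a tiny example. The sketch below assumes three assets whose returns are i.i.d. draws from a hypothetical three-point distribution with equal probabilities, so all joint outcomes are equally likely. It computes $E[\varepsilon_i \mid Y]$ for every value of Y and every asset, and also checks that the equal weighted portfolio carries no $\varepsilon$-risk.

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

values = [F(9, 10), F(12, 10), F(2)]        # hypothetical equally likely return values
N = 3                                       # number of i.i.d. assets
states = list(product(values, repeat=N))    # all joint outcomes, equally likely

sums = defaultdict(lambda: [F(0)] * N)      # running sums of eps_i within each value of Y
counts = defaultdict(int)                   # number of joint outcomes with each value of Y
for z in states:
    Y = sum(z) / N                          # return on the equal weighted portfolio
    counts[Y] += 1
    sums[Y] = [t + (zi - Y) for t, zi in zip(sums[Y], z)]

# Conditional means E[eps_i | Y] for each realized Y and each asset i.
cond_mean = {Y: [t / counts[Y] for t in sums[Y]] for Y in counts}

# Equal weights remove eps-risk state by state: sum_i (1/N) * eps_i = 0.
no_eps_risk = all(sum(zi - sum(z) / N for zi in z) == 0 for z in states)

for Y in sorted(cond_mean):
    print(Y, cond_mean[Y])
```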
Pricing in equilibrium: Taste Based

The one fund, from supply = demand, must be the market portfolio. Thus, we know the market portfolio is efficient and pricing follows from $\bar z_i - R = b_{im}(\bar z_m - R)$, with:
$$b_{im} = \frac{\mathrm{Cov}(u'(z_m), z_i)}{\mathrm{Cov}(u'(z_m), z_m)}$$
where u is the common (base) utility function.
So we indeed simplify pricing, but this level of investor homogeneity is extreme.
Pricing in equilibrium: Distribution Based
It’s simple: all assets must have the same expected return, $\bar z_i = \bar z_m$ for all i, an uninteresting case.
Two Fund Separation – 2fs

2fs: Taste Based:
An obvious sufficient condition is that all investors’ utility functions can be represented as a positive
affine transformation of one of two “base” utility functions (if we use u(w) instead of u(z) the base utility
needs to be CRRA class). Agents then all hold one of two funds in a degenerate example of 2fs where
portfolio combinations of the two funds are not held.
The only known class of utility functions that permits non-degenerate 2fs is the quadratic class.
Quadratic utility investors hold portfolio combinations of two efficient funds.
Two Fund Money Separation – Taste Based

(This analysis restricts returns in that we assume there exists a riskless asset. This is minimal, but it is still a restriction on the distribution of returns that accompanies the utility restriction.)
• Here we assume that the riskless asset is one of the two “funds”, usually labeled Asset 0. We can make this assumption without loss of generality as long as utility functions with infinite risk aversion are contained in the class of utility functions (U) we consider.
• Since the riskless asset is one of the funds, the other need include only risky assets and we, therefore, want to find classes of utility functions for which the risky assets are held in a fixed proportion, i.e. find U for which $\dfrac{w_{ik}}{w_{jk}}$ is the same constant for all agents k, where $i, j \ne 0$.
• In other words, we want all investors holding a levered position in the same (market) portfolio of risky assets.
• The investor’s FOC, $E[u'_k(z^*)(z_i - R)] = 0$ for all i, determines the optimal portfolio of risky assets.
• A sufficient condition for this is that all utility functions are in the HARA (or LRT) class, i.e. risk tolerance is linear in the return: $\dfrac{1}{A(z)} = a_k + bz$ with b constant for all k, where $A(z)$ denotes absolute risk aversion.
Or, $u'_k(Z) = (A_k + B_k Z)^{-c}$ (equivalently, $u'(W) = (A_k + B_k W / W_0)^{-c}$),
where c is constant across investors, and $B_k$ and c must have the same sign for concave utility.
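U: Recall that the class of utility functions includes the following.
(1) $\dfrac{(W - \hat W_k)^{\gamma}}{\gamma}$ with $\gamma < 1$ ($\gamma \ne 0$), $\gamma = 1 - c$, $\hat W_k = -\dfrac{W_{0k} A_k}{B_k}$; here $\hat W_k$ is the subsistence level of wealth
(2) $-\dfrac{(\hat W_k - W)^{\gamma}}{\gamma}$ with $\gamma > 1$; here $\hat W_k$ is the satiation level of wealth
(3) $\ln(W - \hat W_k)$, $\gamma = 0$ (c = 1)
(4) $-\exp(-\delta_k W)$, $\gamma = -\infty$ ($c = \infty$); in this case $A_k = 1$, $B_k = \delta_k / c$, and $c \to \infty$, so the satiation level of wealth is infinite in this case.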
Proof: Under our assumption for u′, the FOC is:

$$E\left[ \left( A_k + B_k \sum_{j=0}^{N} w_{jk} z_j \right)^{-c} (z_i - R) \right] = 0, \quad i = 1, \dots, N$$
Case 1: If $A_k = 0$ the FOC becomes:

$$B_k^{-c} \, E\left[ \left( \sum_{j=0}^{N} w_{jk} z_j \right)^{-c} (z_i - R) \right] = 0, \quad i = 1, \dots, N$$
The optimal $w_k^*$ is clearly independent of $B_k$ and depends only on c. Thus, all investors hold the same $w^*$ since c is common to all (all investors have the same standard CRRA utility function, look at $\hat W_k$ above) and it degenerates to 1fs.
Case 2: General.
Consider that investor k first places an arbitrary amount, $1 - a_k$, in the riskless asset and $a_k$ in a portfolio of all risky assets and the riskless asset. This is without loss of generality since the second portfolio contains the riskless asset and there are no short sales restrictions.
Let the amount $a_k$ be split up as $\alpha_{0k}, \alpha_k$ with $\sum_{i=0}^{N} \alpha_{ik} = 1$.
The final portfolio weights are therefore written:
$$w_{0k} = 1 - a_k + a_k \alpha_{0k}$$
$$w_{ik} = a_k \alpha_{ik}, \quad i = 1, \dots, N$$
The FOC is now written:
$$E\left[ \left( A_k + B_k (1 - a_k) R + B_k \sum_{j=0}^{N} a_k \alpha_{jk} \tilde z_j \right)^{-c} (\tilde z_i - R) \right] = 0$$
Now since $a_k$ is arbitrary, choose something clever and set $a_k = 1 + \dfrac{A_k}{R B_k}$.
So, $1 - a_k = -\dfrac{A_k}{R B_k}$, and the first two terms inside the brackets cancel.
So the FOC becomes:
$$(a_k B_k)^{-c} \, E\left[ \left( \sum_{j=0}^{N} \alpha_{jk} z_j \right)^{-c} (z_i - R) \right] = 0, \quad i = 0, \dots, N$$
Thus, $\alpha_{0k}^*$ and $\alpha_{ik}^*$, i = 1,…,N, are now independent of $A_k$ and $B_k$: $\alpha_{ik} = \alpha_i$ for all k.
This demonstrates that the same $\alpha_0$ and $\alpha$ are optimally chosen by all investors with the same utility parameter c: the FOC holds, and the constraints $\sum_{i=0}^{N} w_i = 1 \Leftrightarrow \sum_{i=0}^{N} \alpha_i = 1$ also hold.
For the risky assets, $w_{ik} = a_k \alpha_{ik} = a_k \alpha_i$.

So, $\dfrac{w_{ik}}{w_{jk}} = \dfrac{\alpha_i}{\alpha_j}$ is independent of k for all investors k, as was desired.
All investors hold the same portfolio of risky assets; thus it must be the market portfolio.
The individual utility parameters $A_k$ and $B_k$ simply determine the amount of leverage:
$$w_{0k} = 1 - a_k + a_k \alpha_{0k} = 1 - a_k + a_k \alpha_0 = -\frac{A_k}{R B_k} + \alpha_0 \left( 1 + \frac{A_k}{R B_k} \right)$$
Interpreting Utility (an aside):
For $\gamma < 1$: $\hat W_k = -\dfrac{W_{0k} A_k}{B_k}$ is the subsistence level of wealth for “next period”.
$u'(\hat W_k) = \infty$ or $u(\hat W_k) = -\infty$, so investors hold a portfolio that insures $W > \hat W_k$.
For $\gamma > 1$: $\hat W_k$ is the satiation level of wealth.

For $W > \hat W_k$, utility is either declining or convex for $\gamma > 1$, so we confine our analysis to $W < \hat W_k$.
From $w_{0k} = 1 - a_k + a_k \alpha_{0k}$, $w_{ik} = a_k \alpha_{ik}$, $\hat W_k = -\dfrac{W_{0k} A_k}{B_k}$ and $a_k = 1 + \dfrac{A_k}{R B_k}$, we can find the dollar demand for the assets as:
$$w_{0k} W_{0k} = \left[ 1 - \left( 1 + \frac{A_k}{R B_k} \right) + \alpha_{0k} + \alpha_{0k} \frac{A_k}{R B_k} \right] W_{0k}$$
$$= -\frac{A_k W_{0k}}{R B_k} + \alpha_{0k} \left( W_{0k} + \frac{A_k W_{0k}}{R B_k} \right)$$
$$= \frac{\hat W_k}{R} + \alpha_0 \left( W_{0k} - \frac{\hat W_k}{R} \right)$$
$$w_{ik} W_{0k} = \alpha_{ik} \left( W_{0k} + \frac{A_k W_{0k}}{R B_k} \right) = \alpha_i \left( W_{0k} - \frac{\hat W_k}{R} \right)$$
This holds for generalized power and generalized log utility agents. Such investors place the present value of $\hat W_k$, that is $\hat W_k / R$, into the riskless asset to ensure a minimum subsistence level of wealth. They then divide the remaining wealth among all assets (including the riskless asset) according to the optimal portfolio weights, $\alpha_0, \alpha$, where these weights are the optimal choice for an investor with utility $u(W) = \dfrac{W^{\gamma}}{\gamma}$, which depends only on $\gamma$ (or c).
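The dollar demand formula can be illustrated for generalized log utility, $u(W) = \ln(W - \hat W)$, in a market with one riskless and one risky asset. The sketch below uses hypothetical returns and probabilities and solves the investor's FOC by bisection; the function name and the particular wealth levels are illustrative. It then compares the dollar demand for the risky asset with $\alpha_1 (W_0 - \hat W / R)$, where $\alpha_1$ is the risky weight chosen by a plain log investor.

```python
pi = [0.5, 0.5]      # hypothetical state probabilities
z = [1.4, 0.8]       # hypothetical risky asset returns in the two states
R = 1.05             # hypothetical riskless return
excess = [zs - R for zs in z]

def risky_dollars(W0, W_hat, iters=200):
    """Dollar demand for the risky asset under u(W) = ln(W - W_hat), by bisection on the FOC."""
    surplus = W0 * R - W_hat                 # end-of-period wealth above subsistence if all riskless
    if surplus <= 0:
        raise ValueError("subsistence wealth cannot be guaranteed")

    def foc(x):
        # E[(z - R) / (W - W_hat)] with W = W0 * R + x * (z - R)
        return sum(p * e / (surplus + x * e) for p, e in zip(pi, excess))

    lo, hi = 0.0, surplus / -min(excess)     # hi is where wealth hits W_hat in the bad state
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_1 = risky_dollars(1.0, 0.0)            # risky weight of a plain log investor with W0 = 1
demand = risky_dollars(100.0, 42.0)          # investor with W0 = 100 and subsistence level 42
predicted = alpha_1 * (100.0 - 42.0 / R)

print(alpha_1, demand, predicted)
```

The investor behaves as if 42 / R dollars were first set aside in the riskless asset and only the remaining wealth were invested with the plain log weights.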

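Example:
$$u(w) = -\frac{(w - \hat w_k)^2}{2}$$
[Figure: u(w) plotted against W, with the satiation level $\hat w_k$ marked on the wealth axis.]
For log utility let $\gamma \to 0$ in the generalized power utility function $(w - \hat w_k)^{\gamma} / \gamma$, after subtracting the constant $1/\gamma$ (which does not change preferences) so that the limit exists. By L’Hôpital’s rule,
$$\lim_{\gamma \to 0} \frac{(w - \hat w_k)^{\gamma} - 1}{\gamma} = \lim_{\gamma \to 0} \frac{e^{\gamma \ln(w - \hat w_k)} - 1}{\gamma} = \lim_{\gamma \to 0} \frac{\ln(w - \hat w_k) \, e^{\gamma \ln(w - \hat w_k)}}{1} = \ln(w - \hat w_k)$$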
For $\gamma > 1$:
The demand is as given, but these demand functions represent utility minimizing portfolios, and we must bound wealth below $\hat W_k$. So, the factor $\left( W_{0k} - \dfrac{\hat W_k}{R} \right)$ multiplying $\alpha_i$ must be negative.
That is, the investor takes a short position in ($\alpha_0, \alpha$).
For exponential utility:
$$w_{ik} W_{0k} = \frac{\alpha_i}{\delta_k}, \qquad w_{0k} W_{0k} = \frac{\alpha_0}{\delta_k} + W_{0k} - \frac{1}{\delta_k}$$
So, the investor’s dollar demand for the risky assets is constant for all $W_0$. The $\alpha_i$ are optimal for an investor with absolute risk aversion parameter $1/W_{0k}$, $u(W) = -\exp(-W / W_0)$, so there is unitary absolute risk aversion on utility of returns, u(z) = –exp(–z).
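The wealth independence of the exponential investor's risky asset demand can be checked in the same kind of two-asset market. The sketch below assumes hypothetical returns, probabilities, and an absolute risk aversion of 2; it solves the FOC by bisection for two levels of initial wealth and compares the result with the closed form available in the two-state case.

```python
import math

pi = [0.5, 0.5]      # hypothetical state probabilities
z = [1.4, 0.8]       # hypothetical risky asset returns in the two states
R = 1.05             # hypothetical riskless return
delta = 2.0          # assumed absolute risk aversion in u(W) = -exp(-delta * W)
excess = [zs - R for zs in z]

def risky_dollars(W0, iters=200):
    """Dollar demand for the risky asset, by bisection on E[u'(W)(z - R)] = 0."""
    def foc(x):
        # W = W0 * R + x * (z - R); u'(W) is proportional to exp(-delta * W)
        return sum(p * e * math.exp(-delta * (W0 * R + x * e)) for p, e in zip(pi, excess))

    lo, hi = 0.0, 50.0                       # bracket chosen wide enough for these numbers
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d_small = risky_dollars(1.0)
d_large = risky_dollars(10.0)

# Two-state closed form: equate the two terms of the FOC and solve for x.
closed_form = math.log((pi[0] * excess[0]) / (pi[1] * -excess[1])) / (delta * (excess[0] - excess[1]))

print(d_small, d_large, closed_form)
```

Initial wealth only multiplies the FOC by the positive factor $\exp(-\delta W_0 R)$, so it drops out of the solution; any extra wealth goes into the riskless asset.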
2fs – Distribution Based

The necessary and sufficient conditions for 2fs are:
$\exists\ b_i, w^1, w^2, \tilde{X}, \tilde{Y}, \tilde{\varepsilon}_i$ such that
\[ \tilde{z}_i = \tilde{X} + b_i \tilde{Y} + \tilde{\varepsilon}_i \]
\[ E[\tilde{\varepsilon}_i \mid \tilde{X} + \gamma \tilde{Y}] = 0 \quad \forall\, i, \gamma \]
\[ \sum_i w^1_i \tilde{\varepsilon}_i = 0, \qquad \sum_i w^1_i = 1 \]
\[ \sum_i w^2_i \tilde{\varepsilon}_i = 0, \qquad \sum_i w^2_i = 1 \]
\[ \sum_i w^1_i b_i \neq \sum_i w^2_i b_i \]
• There are two residual risk free portfolios that have distinct exposure to “Y” risk. Without this last condition it would degenerate to 1fs.
• $w^1$ and $w^2$ are the two funds.
The systematic risk of any asset is given by $\tilde{X} + b_i \tilde{Y}$.
• This combination of conditions implies ε is idiosyncratic risk for all investors and that all desirable risk/return combinations can be achieved by trading in $w^1$ and $w^2$.
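Example: Normal returns with no riskless asset
Let $\tilde{Y} = \tilde{Z}_m - \tilde{X}$; then we get the familiar results:
ε is “diversified” away by holding portfolio combinations of $Z_m$ and $X$.
[Figure: mean-variance diagram, expected return $\bar{Z}$ against $\sigma^2$, marking $\bar{Z}_m$ and $\bar{X}$.]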
Two Fund Money Separation
• The necessary and sufficient conditions are:
$\exists\ b_i, \tilde{Y}, \tilde{\varepsilon}_i, w^m, R$ such that
\[ \tilde{z}_i = R + b_i \tilde{Y} + \tilde{\varepsilon}_i \]
\[ E[\tilde{\varepsilon}_i \mid \tilde{Y}] = 0 \quad \forall\, i, Y \]
\[ \exists\, w^m \ \text{such that}\ 1'w^m = 1, \quad \sum_i w^m_i \tilde{\varepsilon}_i = 0 \quad \text{and} \quad \sum_i w^m_i b_i \neq 0 \]
• Introduce some notation:
Let $\omega = (0, w^m_1, \dots, w^m_N)'$ denote the portfolio weights on what will be the market portfolio of risky assets, and
let $b = (0, b_1, \dots, b_N)'$ denote the augmented vector of sensitivity levels to the single risk factor.
For any portfolio w define $\lambda = \dfrac{w'b}{\omega'b}$ as the relative level of systematic risk of portfolio w (relative to the market portfolio).
Sufficiency: The story is again that all desirable risk return tradeoffs can be achieved with R and Zm
(both of which are devoid of ε-risk) and that nobody holds ε-risk.
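Suppose that the portfolio w is optimal for some $u \in U$.
Then, with $\lambda = w'b / \omega'b$ and $\omega = (0, w^m_1, \dots, w^m_N)'$ the augmented market weights, we can always write w as:
\[ w = (1 - \lambda) \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 0 \\ w^m_1 \\ \vdots \\ w^m_N \end{pmatrix} + \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_N \end{pmatrix} \]
where $1'\alpha = 0$ (from $1'w = (1 - \lambda) 1'i_1 + \lambda 1'\omega + 1'\alpha$, with $i_1 = (1, 0, \dots, 0)'$).
Since $\lambda$ was chosen so that $w'b = \lambda \omega'b$ (i.e. $\lambda$ in $w^m$ and $(1 - \lambda)$ in R mimics the systematic return of w), we know $\alpha'b = 0$ must hold. α is an arbitrage portfolio with no systematic risk exposure.
(Or: by construction, $w = (1 - \lambda) i_1 + \lambda \omega + \alpha$. Post-multiply by b: $w'b = (1 - \lambda) \cdot 0 + \lambda \omega'b + \alpha'b$. And, since we chose $\lambda$ so that $w'b = \lambda \omega'b$, it must be true that $\alpha'b = 0$. Thus, α doesn’t contribute to systematic risk. And since $E(\alpha'\tilde{\varepsilon}) = 0$, α doesn’t contribute to expected returns either.)
• All risk-return tradeoffs can be accomplished with R and $w^m$.
• We need only show that:
\[ E[u(w'\tilde{z})] = E[u((1 - \lambda) R + \lambda \tilde{z}_m + \alpha'\tilde{z})] \le E[u((1 - \lambda) R + \lambda \tilde{z}_m)] = E[u(\tilde{q})] \]
where $\tilde{q} = (1 - \lambda) R + \lambda \tilde{z}_m$, i.e. no investor holds the arbitrage portfolio α.
• The two portfolios w and q have the same expected return since $E[\tilde{\varepsilon}_i] = 0\ \forall i$, so we simply need to show that $\tilde{q} + \alpha'\tilde{\varepsilon}$ is riskier than $\tilde{q}$,
i.e. show that $E[\alpha'\tilde{\varepsilon} \mid \tilde{q}] = 0$ (or, equivalently, $E[\alpha'\tilde{z} \mid \tilde{q}] = 0$, since $\alpha'\tilde{z} = \alpha'\tilde{\varepsilon}$):
\[ E[\alpha'\tilde{\varepsilon} \mid \tilde{q}] = \sum_i \alpha_i E[\tilde{\varepsilon}_i \mid \tilde{q}] \]
But, since knowing q implies knowing Y and vice versa, this is equivalent to:
\[ \sum_i \alpha_i E[\tilde{\varepsilon}_i \mid \tilde{Y}] = 0 \]
Thus, $\tilde{q} + \alpha'\tilde{\varepsilon}$ is riskier than $\tilde{q}$ in a R-S sense, and no investor with a concave utility function holds the arbitrage portfolio α: all hold some combination of R and $w^m$.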
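The Rothschild-Stiglitz step used in the sufficiency argument (adding noise with zero conditional mean lowers expected utility for every concave u) can be checked on a small discrete example. This is only a sketch: the payoff distribution for q, the noise size, and the three utility functions are hypothetical choices, and the noise is taken to be an independent fair coin flip, which is a special case of mean independence.

```python
import math


def expected_utility(outcomes, u):
    """outcomes is a list of (probability, payoff) pairs."""
    return sum(p * u(x) for p, x in outcomes)


def add_fair_noise(outcomes, size):
    """Attach an independent +/- size coin flip (mean zero) to every payoff."""
    noisy = []
    for p, x in outcomes:
        noisy.append((p / 2, x + size))
        noisy.append((p / 2, x - size))
    return noisy


# Hypothetical distribution for the return q on a mix of R and the market fund.
q = [(0.3, 0.90), (0.4, 1.05), (0.3, 1.25)]
q_plus_noise = add_fair_noise(q, 0.10)

utilities = {
    "log": math.log,
    "exponential": lambda x: -math.exp(-x),
    "sqrt": math.sqrt,
}
results = {name: (expected_utility(q, u), expected_utility(q_plus_noise, u))
           for name, u in utilities.items()}
mean_q = sum(p * x for p, x in q)
mean_noisy = sum(p * x for p, x in q_plus_noise)

for name, (eu_q, eu_noisy) in results.items():
    print(name, eu_q, eu_noisy)
```

The two distributions have the same mean, yet each concave utility ranks q above the noisy version, which is the sense in which no risk-averse investor wants the arbitrage portfolio α.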
Necessity:
The general FOC is: $E[u'(\tilde{q})(\tilde{z}_i - R)] = 0$ if $\tilde{q}$ is the assumed optimal portfolio’s return.
• If $\tilde{q}$ allows 2fms, $\tilde{q} = \lambda \tilde{z}_m + (1 - \lambda) R$ where $w^m$ is some portfolio of the risky assets ($1'w^m = 1$).
• If separation holds, the maximization problem is
\[ \max_{\lambda} E[u(\tilde{q})] \]
i.e. just choose the balance between R and $w^m$.
• The FOC of this problem is $E[u'(\tilde{q})(\tilde{z}_m - R)] = 0$.
This condition must imply, if separation holds, that $E[u'(\tilde{q})(\tilde{z}_i - R)] = 0\ \forall i$.
• Without loss of generality, write $\tilde{z}_i - R = \delta_i (\tilde{z}_m - R) + e_i$
and let $\delta_i = \dfrac{\bar{z}_i - R}{\bar{z}_m - R}$ so that $E[e_i] = 0\ \forall i$.
We can always do this if we place no further restrictions on the e’s.
The conditions:
Is there a w with $1'w = 1$ and $w'e = 0$?
Yes: by construction $\tilde{z}_m = \tilde{z}'w^m$ has no e-risk.
So, $w^m$ is a/the well diversified portfolio of risky assets.
Define $\tilde{Y} = \dfrac{\tilde{z}_m - R}{b_m}$ and $b_i = b_m \delta_i$,
and you have that all returns $\tilde{z}_i$ can be written as: $\tilde{z}_i = R + b_i \tilde{Y} + e_i$
Now, show that $E[e_i \mid Y] = E[e_i \mid z_m] = 0$ is required.
The general FOC is: $E[u'(\tilde{q})(\tilde{z}_i - R)] = 0$ or,
\[ E[u'(\tilde{q})(\delta_i (\tilde{z}_m - R) + e_i)] = 0 \]
So,
\[ 0 = \delta_i E[u'(\tilde{q})(\tilde{z}_m - R)] + E[u'(\tilde{q}) e_i] \quad \forall i \]
and for all u(·) in the monotone concave class.
By the assumption of separation, the first term is zero, so it must be that $E[u'(\tilde{q}) e_i] = 0\ \forall i$, u(·).
As before,
\[ 0 = E[u'(\tilde{q}) e_i] = Cov(u'(\tilde{q}), e_i) \quad \text{since } E[e_i] = 0 \quad \forall i, u(\cdot) \]
And so, by the fair game lemma,
\[ E[e_i \mid q] = E[e_i] = 0 \quad \forall i \]
and,
\[ E[e_i \mid q] = E[e_i \mid Z_m] = E[e_i \mid Y] = 0 \quad \forall i \]
(all are informationally equivalent). So the fair game property is exactly what is needed for the FOC under separation to imply the general FOC, which proves necessity.
Equilibrium with 2fs:
Taste Based – for quadratic utility, we know the result
For 2fms - we also know that wm must, in equilibrium, be the market portfolio. So, the market
portfolio is efficient and our general pricing formulas can be applied using the restrictions on allowed
utility functions and recognizing that z*=zm.
\[ \bar{z}_i = R + \frac{Cov(u'(\tilde{z}_m), \tilde{z}_i)}{Cov(u'(\tilde{z}_m), \tilde{z}_m)} (\bar{z}_m - R) \]
2fms Distribution Based - Again, the risky asset portfolio must be the market portfolio. From our conditions $\tilde{z}_m = R + b_m \tilde{Y}$ and $\tilde{z}_i = R + b_i \tilde{Y} + \tilde{\varepsilon}_i$ we get:
\[ \bar{z}_i = R + b_i \bar{Y} \ \text{ and } \ \bar{z}_m = R + b_m \bar{Y}, \ \text{ or } \ \bar{z}_i = R + \frac{b_i}{b_m} (\bar{z}_m - R) \]
and that
\[ Cov(z_i, z_m) = b_i b_m Var(Y) \ \text{ and } \ Var(z_m) = b_m^2 Var(Y) \]
So, $\dfrac{b_i}{b_m} = \dfrac{Cov(z_i, z_m)}{Var(z_m)} = \beta_i$.
So, 2fms, if the variance of Y is defined, allows CAPM pricing. This had to follow since all optimal portfolios, being combinations of R and $w^m$, can be completely described by their means and variances alone.
From 2fs, we get the Black-CAPM pricing equation.
Without a riskless asset:
m is one portfolio; let portfolio “0” be the other portfolio.
\[ Z_o = X + b_o Y, \qquad Z_m = X + Y \ \text{ where } b_m = 1 \]
\[ X = Z_o - b_o Y \ \text{ and } \ Y = Z_m - X \]
\[ X = Z_o - b_o Z_m + b_o X \ \Rightarrow \ X = \frac{Z_o - b_o Z_m}{1 - b_o} \]
\[ Y = Z_m - X = Z_m - \frac{Z_o - b_o Z_m}{1 - b_o} = (Z_m - Z_o)(1 - b_o)^{-1} \]
\[ \bar{z}_i = \bar{X} + b_i \bar{Y} = (1 - b_o)^{-1} (\bar{Z}_o - b_o \bar{Z}_m) + b_i (\bar{Z}_m - \bar{Z}_o)(1 - b_o)^{-1} = (1 - b_o)^{-1} (\bar{Z}_o - b_o \bar{Z}_m + b_i (\bar{Z}_m - \bar{Z}_o)) \]
\[ \bar{z}_i = \bar{Z}_o \left(1 - \frac{b_i - b_o}{1 - b_o}\right) + \bar{Z}_m \left(\frac{b_i - b_o}{1 - b_o}\right) \]
From $\tilde{z}_i = \tilde{X} + b_i \tilde{Y} + \tilde{\varepsilon}_i$,
\[ Cov(z_i, z_m) = Var(X) + b_i Var(Y) + (1 + b_i) Cov(X, Y) \]
Now, if “0” is a zero-beta portfolio, so $Cov(z_o, z_m) = 0$:
\[ Cov(z_o, z_m) = Var(X) + b_o Var(Y) + (1 + b_o) Cov(X, Y) = 0 \]
or,
\[ b_o = -\frac{Var(X) + Cov(X, Y)}{Var(Y) + Cov(X, Y)} \]
Substitute this $b_o$ into the $\bar{z}_i$ equation to get:
\[ \frac{b_i - b_o}{1 - b_o} = \frac{b_i Var(Y) + (1 + b_i) Cov(X, Y) + Var(X)}{Var(Y) + 2 Cov(X, Y) + Var(X)} = \beta_i \]
So, $\bar{z}_i = \bar{z}_o (1 - \beta_i) + \beta_i \bar{z}_m = \bar{z}_o + \beta_i (\bar{z}_m - \bar{z}_o)$
Thus, the Black-CAPM equation occurs for the same reason as in the case of 2fms.
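The algebra behind the zero-beta substitution can be checked numerically. The sketch below is only an illustration: the function names and the second moments of X and Y are hypothetical. It compares the coefficient $(b_i - b_o)/(1 - b_o)$ with the market beta computed directly from $Cov(z_i, z_m)/Var(z_m)$.

```python
def zero_beta_loading(var_x, var_y, cov_xy):
    """Loading b_o that makes Cov(z_o, z_m) = 0 when z_m = X + Y."""
    return -(var_x + cov_xy) / (var_y + cov_xy)


def beta_from_loadings(b_i, b_o):
    """Coefficient on the market in the two-fund pricing equation."""
    return (b_i - b_o) / (1.0 - b_o)


def beta_from_moments(b_i, var_x, var_y, cov_xy):
    """Cov(z_i, z_m) / Var(z_m) for z_i = X + b_i*Y + eps, z_m = X + Y."""
    cov_im = var_x + b_i * var_y + (1.0 + b_i) * cov_xy
    var_m = var_x + var_y + 2.0 * cov_xy
    return cov_im / var_m


# Hypothetical second moments of the two common components.
var_x, var_y, cov_xy = 0.02, 0.05, 0.01
b_o = zero_beta_loading(var_x, var_y, cov_xy)

pairs = [(beta_from_loadings(b_i, b_o),
          beta_from_moments(b_i, var_x, var_y, cov_xy))
         for b_i in (-0.5, 0.0, 0.5, 1.0, 2.0)]
for from_loadings, from_moments in pairs:
    print(from_loadings, from_moments)
```

The two columns agree for every loading, the zero-beta portfolio itself gets a coefficient of zero, and the market ($b_i = 1$) gets a coefficient of one.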
K-Fund Separation
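Investors will hold combinations of no more than K risky mutual funds and the riskless asset if:
(1) $\tilde{z}_i = R + \sum_{k=1}^{K} b_{ik} \tilde{Y}_k + \tilde{\varepsilon}_i$
(2) $E[\tilde{\varepsilon}_i \mid Y_1, \dots, Y_K] = 0 \quad \forall i$
(3), (4) $\exists\, w^k$ for k = 1,…,K such that $1'w^k = 1$ and $\sum_i w^k_i \tilde{\varepsilon}_i = 0$
(5) Rank(A) = K
where $a_{lk} = \sum_{i=1}^{N} w^l_i b_{ik}$ for l = 1,…,K and all k, and
\[ A = \begin{pmatrix} a_{11} & \cdots & a_{1K} \\ \vdots & \ddots & \vdots \\ a_{K1} & \cdots & a_{KK} \end{pmatrix} \]
Each element of A is a portfolio weighted average of the b’s on one $Y_k$ for a given fund l,
i.e. $a_{lk}$ is fund l’s factor loading on factor k ($Y_k$).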
Condition (1) describes a K factor model of the returns generating process. Y’s are factors and b’s are
factor loadings.
Condition (2) says the ε’s are idiosyncratic risk that no investor wants to hold.
Conditions (3) and (4) say that each mutual fund is a well diversified portfolio and that there are K such
funds.
Condition (5) says that we have K linearly independent (non-collinear) funds. Thus, a rotation of the factors can set A to a diagonal matrix (i.e. for each factor there is a combination of the funds that has no ε-risk and a non-zero loading on that factor only), and we can therefore get a portfolio with any combination of factor loadings by using the K funds (i.e. reach any possible risk-return tradeoff using the funds).
Proof: Basically the same as before – show that no one holds ε-risk and can accomplish any desired
risk/return combination using the K funds.
Pricing Under kfs:
We know: $\bar{z}_i = R + \sum_k b_{ik} \bar{Y}_k$ for all assets i.
This must then hold for the K funds as well; use the “rotated factors” so that each of the funds has a non-zero factor loading on only one of the factors.
Let $\bar{z}^k$ be the expected return on the mutual fund with a non-zero loading only on factor k. And,
let $b^k = \sum_i w^k_i b_{ik}$ be that loading.
Then,
\[ \bar{z}^k = R + b^k \bar{Y}_k \]
And, substituting for $\bar{Y}_k$,
\[ \bar{z}_i = R + \sum_k b_{ik} \frac{(\bar{z}^k - R)}{b^k}. \]
Furthermore (taking the rotated factors to be uncorrelated),
\[ Cov(z_i, z^k) = b_{ik} b^k Var(Y_k), \qquad Var(z^k) = (b^k)^2 Var(Y_k) \]
So,
\[ \frac{Cov(z_i, z^k)}{Var(z^k)} = \frac{b_{ik}}{b^k} = \beta_{ik} \]
and
\[ \bar{z}_i = R + \sum_k \beta_{ik} (\bar{z}^k - R). \]
Note – this pricing result is not dependent upon separation.

 The expected returns on assets whose common variation depends on K factors must be
linearly related to the assets’ responses/loadings on the factors if just one risk-averse
investor holds an ε-risk free portfolio.
7 The Arbitrage Pricing Theory

APT – Ross’s Color Commentary (or as close as I can come to reproducing it)
The model is:
(1) $\tilde{z} = a + B \tilde{f} + \tilde{\varepsilon}$
where,
(2) $E(\tilde{\varepsilon}) = 0$
(3) $E(\tilde{f}) = 0$
(4) $E(\tilde{f} \tilde{f}') = I$
(5) $E(\tilde{\varepsilon} \tilde{f}') = 0$
(6) $E(\tilde{\varepsilon} \tilde{\varepsilon}') = D$, a diagonal matrix
• Note that (1), (2), and (3) tell us that $a = \bar{z}$
• The model says we can describe all systematic risk with this linear factor model
• Only (6) has any ‘bite’ to it: it requires uncorrelated residuals given this structure
• The theory seeks to explain relative pricing in the asset market as is done in Modigliani-Miller theory and Black-Scholes using the absence of arbitrage
• The pricing equation: $\bar{z}_i = R + \sum_{k=1}^{K} \beta_{ik} (\bar{z}^k - R)$. The pricing is in terms of exogenous factors and the intuition is that the ε’s carry no premium since they can be diversified away in “large” portfolios.
Compare & Contrast:

The CAPM is a static model considering the general portfolio choice problem. It is an equilibrium
model that explains pricing in terms of an endogenous market aggregate (the market portfolio). In other
words, you consider risk premia as being determined by the projection of zi on zm.
A brief survey of history tells us that, prior to the development of the CAPM, the standard intuition was:
\[ E(\tilde{z}_i) = R_f + \pi_i \]
• People invest in risky assets if you give them a positive expected return in addition to compensation for the time value of money, $R_f$.
• This implies that if you want a preference based theory, you should use concave preferences since risk averse investors are the ones that will demand a positive premium for holding risk.
• $\pi_i$ was thought to be a function of an asset’s own variance of return, $\sigma_i^2$.
• Portfolio theory and the equilibrium arguments in the CAPM gave an identification of the risk premium, $\pi_i = \lambda \beta_i$, that said only market risk, the projection of an asset’s returns on the market portfolio’s returns (the part of the asset’s return that is correlated with the market portfolio’s return), is priced; you ignore all the rest.
 The supposed intuition of the CAPM is that idiosyncratic risk can be diversified away leaving only
systematic (market) risk to be priced.
But,…
 Idiosyncratic risk in the CAPM framework is defined with reference to the market portfolio, it’s the
residual of the regression or projection of an asset’s return on the market’s return.
• No further assumptions about the ε’s are made, i.e. they could be highly correlated. In fact, they are linearly dependent since when weighted by the market value weights they must sum to zero. So, in any large portfolio, we cannot use the law of large numbers to say this portfolio has negligible idiosyncratic risk, contrary to our intuition.
 The exception, of course, is the market (tangency) portfolio. But, then the intuition that
diversification leads to pricing based on the market portfolio is circular at best.
 The APT is also a static model but it can be seen as a static version of most inter-temporal models in
finance in which the factors represent innovations in the underlying state variables.
 The APT directly assumes a return structure in which the systematic and idiosyncratic components
of returns are defined a priori. Thus, the standard notion of diversification is directly used.
 The APT is based upon the absence of arbitrage and pricing is done in terms of the exogenous
factors.
Pricing Models – In General:

One starts with some specification of a technology: a payoff (or returns) matrix and a vector of current prices,
\[ Y = [Y_{si}], \qquad v = (v_1, \dots, v_N)' \]
…and the assumption of the absence of arbitrage (or something stronger like equilibrium).
• One view of pricing is that from the absence of arbitrage (and so it will be true in any equilibrium model with increasing preferences) we know that:
\[ \exists\, p = (p_1, \dots, p_S)' > 0 \ \text{ s.t. } \ Y'p = v \]
• In a complete market we can formally identify the p’s as A-D state prices. In an incomplete market, the p’s still exist but technically we can’t identify them as A-D prices. This approach is currently being pursued in interesting ways.
• Traditionally, however, people have been looking for a pricing relation like: $\bar{z}_i = R_f + \pi_i$
 Think like an economist for a moment. Y is your technology – it determines how money is moved
from state to state and across time in this economy.
 How do we explain prices in this world? Well, what’s missing so far?

Answer: Preferences of the agents in this economy.
 For illustration, how does the CAPM follow in this framework?
 We restrict attention to (or specify):

(1) Multivariate normality (this is sufficient – a necessary condition is 2fs). Note – this is really a
joint restriction on Y and v since we specify normality for Z=Y/v.
or,
(2) We can restrict preferences to be quadratic.
 Either restriction we know translates to more mean and less variance being goods (i.e. you consider
u(mean, variance) as your preference restriction in the economy). This provides the 2fs and as we
have seen the simplification of the general pricing equation.
 The APT is a look at how we can restrict Y – the technology rather than preferences.
 We can interpret the returns generating model as a rank restriction on the Y matrix – this will be
illustrated more concretely in what follows. In other words the factor model restricts the nature of
what can constitute the “total risk” of the capital market.
 By considering only restrictions on the technology (restricting ourselves only to increasing

preferences) we may only consider relative prices in the asset market and we give up (by not further
restricting preferences) the ability to talk about some bigger issues – i.e. what are the risk premia or
what is their relation to other aggregate or macro variables.
As an illustration let’s develop an arbitrage based approach to the CAPM pricing equation or the SML.
Suppose:
\[ \tilde{z}_i = \bar{z}_i + \beta_i \tilde{f} + \tilde{\varepsilon}_i \qquad \text{(a 1-factor model)} \]
where, written this way, $\tilde{f}$ is necessarily the unexpected innovation of some exogenous factor, $\tilde{f} = \tilde{F} - E[\tilde{F}]$.
Assume there are N risky assets and one riskless asset.
Also assume:
(1) $E(\tilde{\varepsilon}_i \mid \tilde{f}) = 0 \quad \forall i$
Consider a positive investment portfolio w ($1'w = 1$):
\[ \tilde{Z}_w = w'\tilde{z} = w'(\bar{z} + \beta \tilde{f} + \tilde{\varepsilon}) = w'\bar{z} + (w'\beta) \tilde{f} + w'\tilde{\varepsilon} \]
Set $w'\beta = 0$, i.e. choose a portfolio with no systematic risk. This assumes it is not the case that all assets have the same level of systematic risk. That is, there exist assets i, j s.t. $\beta_i \neq \beta_j$ (or that the vectors 1 and β are not collinear). Note, since relative pricing is the goal this is a minimal assumption, on the order of $\exists\, i, j$ with $\bar{z}_i \neq \bar{z}_j$, made in the CAPM derivation.
(2) Suppose the portfolio w is also well diversified. That is, in some metric you specify the $w_i$’s are close to 1/N for all assets i.
(3) Assume either an upper bound on the variances of the ε’s or more simply assume that $\sigma^2_{\varepsilon_i} = \sigma^2\ \forall i$, i.e. the ε’s all have the same (finite) variance. Also assume that $E(\tilde{\varepsilon}_i \tilde{\varepsilon}_j) = 0$ if $i \neq j$ (uncorrelated residuals).
The variance of $w'\tilde{\varepsilon}$ is then “close to” $\sigma^2 / N$ if $w_i$ is “close to” $1/N\ \forall i$.
For example, if $w_i = O(1/N)$ then $Var(w'\tilde{\varepsilon}) = O(\sigma^2 / N)$. Or,
$w'\tilde{\varepsilon} \to 0$ as $N \to \infty$: $E(w'\tilde{\varepsilon}) = 0$ and $Var(w'\tilde{\varepsilon}) \to 0$ as $N \to \infty$, thus we can ignore the ε’s.
(This is where we see the “bite” of equation (6) and where all the difficulty comes in the APT.)
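The diversification claim is easy to tabulate. The sketch below assumes uncorrelated residuals with a common variance (the value 0.04 and the function name are hypothetical) and contrasts an equally weighted portfolio with one that keeps half its wealth in a single asset.

```python
def residual_variance(weights, sigma2):
    """Var(w'eps) for uncorrelated residuals with common variance sigma2."""
    return sigma2 * sum(w * w for w in weights)


sigma2 = 0.04  # hypothetical common residual variance

equal_weight = []
concentrated = []
for n in (10, 100, 1000, 10000):
    equal = [1.0 / n] * n
    lumpy = [0.5] + [0.5 / (n - 1)] * (n - 1)   # half of wealth in asset 1
    equal_weight.append((n, residual_variance(equal, sigma2)))
    concentrated.append((n, residual_variance(lumpy, sigma2)))

for (n, v_eq), (_, v_lumpy) in zip(equal_weight, concentrated):
    print(n, v_eq, v_lumpy)
```

The equally weighted residual variance falls like $\sigma^2/N$, while the concentrated portfolio never gets below $\sigma^2/4$, which is why "well diversified" has to be part of the assumption.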
Now, since we set $w'\beta = 0$ we have: $\tilde{Z}_w = w'\bar{z} + (w'\beta) \tilde{f} + w'\tilde{\varepsilon} \approx w'\bar{z}$ for “large N”.
Economics tells us that $w'\bar{z}$ must equal $R = 1 + r_f$:
$\forall\, w$ s.t. $1'w = 1$, $w'\beta = 0$, $w'\tilde{\varepsilon} \approx 0$: a well diversified, positive investment portfolio with no systematic risk must have expected return $w'\bar{z} = R$.
Similarly, $\forall\, \eta$ s.t. $1'\eta = 0$, $\eta'\beta = 0$ and η well diversified, we must have $\eta'\bar{z} = 0$.
If no money is invested, and no risk exposure assumed, we should expect no return.
Since both conditions hold, it must be that $\bar{z}$ is a linear combination of 1 and β:
\[ \bar{z} = \gamma 1 + \lambda \beta \]
Since this must also work for R (our portfolio w):
\[ R = w'\bar{z} = \gamma\, w'1 + \lambda\, w'\beta = \gamma \ \Rightarrow \ \gamma = R \]
And, we can write: $\bar{z} = R 1 + \lambda \beta$
Further, picking an asset with β = 1 we see that $\lambda = \bar{z}^1 - R$ where $\bar{z}^1$ is the expected return on an asset with a factor loading equal to 1.
The development in the text is also intuitive:
• Pick two assets with $\beta_1 \neq \beta_2$.
• Consider a residual risk free, positive investment portfolio ($1'w = 1$) of the two assets with no systematic risk, $w_1 \beta_1 + w_2 \beta_2 = 0$:
\[ \tilde{z}_w = w'\bar{z} + w'\beta \tilde{f} + w'\tilde{\varepsilon} \]
\[ \tilde{z}_w = w'\bar{z} = R \ \text{ or, } \ w'(\bar{z} - R 1) = 0 \]
• In matrix form,
\[ \begin{pmatrix} \bar{z}_1 - R & \bar{z}_2 - R \\ \beta_1 & \beta_2 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
• Since this matrix is singular, it must be that the first row is collinear with the second: $(\bar{z}_i - R) = \lambda \beta_i$
• Or, $\bar{z}_i = R + \lambda \beta_i$, showing our intuition was correct. (Draw a picture!)
Again considering an asset or portfolio with β = 1:
\[ \bar{z}^1 = R + \lambda \ \text{ or, } \ \lambda = \bar{z}^1 - R \]
Thus, $\bar{z}_i = R + \beta_i (\bar{z}^1 - R)$
And, we get the standard intuition that for a risky asset the expected return is made up of the time value of money and a risk premium.
In the returns equation $\tilde{z}_i = \bar{z}_i + \beta_i \tilde{f} + \tilde{\varepsilon}_i$, economics tells us that $\bar{z}_i$ and $\beta_i$ should be related. We found:
\[ \bar{z}_i = R + \beta_i (\bar{z}^1 - R) \]
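The two-asset argument can be traced with numbers. In this sketch the riskless return, factor premium and betas are hypothetical values; the function solves for the zero-beta combination of two assets and confirms that, when expected returns are linear in beta, that combination earns exactly R.

```python
def zero_beta_weights(beta1, beta2):
    """Weights (w1, w2) with w1 + w2 = 1 and w1*beta1 + w2*beta2 = 0."""
    if beta1 == beta2:
        raise ValueError("the two betas must differ")
    w1 = -beta2 / (beta1 - beta2)
    return w1, 1.0 - w1


# Hypothetical riskless return, factor premium and loadings.
R, lam = 1.03, 0.06
beta1, beta2 = 0.5, 1.5
zbar1 = R + lam * beta1
zbar2 = R + lam * beta2

w1, w2 = zero_beta_weights(beta1, beta2)
portfolio_beta = w1 * beta1 + w2 * beta2
portfolio_mean = w1 * zbar1 + w2 * zbar2
implied_lambda = (zbar1 - R) / beta1

print(w1, w2, portfolio_beta, portfolio_mean, implied_lambda)
```

Here the zero-beta portfolio is long 1.5 of the low-beta asset and short 0.5 of the high-beta asset, and the premium per unit of beta recovered from either asset is the same λ.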
What restrictions drive these results?

(1) the absence of arbitrage
(2) a one factor model of the returns generating process
Forget ε for the moment.
State-by-state: $z_{si} = \bar{z}_i + \beta_i f_s$
If this holds, then each column of the Z matrix is a linear combination of 2 column vectors:
for column j: $z_j = \bar{z}_j 1 + \beta_j f$
(i.e. with no residual risk, Rank(Z) = 2)
This illustrates the interpretation of the APT factor-model as a rank restriction on Z.
Here, it is a linear combination of a vector of ones and something else (f, the factor).
Is it always a vector of ones?
Now, go back and see what’s wrong.
$\bar{z}_j = R + (\bar{z}^1 - R) \beta_j$ is only an approximate pricing result; the rank restriction is only approximate when there is residual risk. There is a whole literature on “what is close” (Absence of Asymptotic Arbitrage) and on how big any pricing errors can be.
The APT brings in exogenously specified factors.

What are they?
What else is missing?
What are the risk premia?
i.e. How big is $\bar{z}^1 - R$, and to what does it relate?
Since there is no restriction placed on preferences other than monotonicity, we don’t or can’t say
anything about these questions.
Two Factor Model (with no idiosyncratic risk):

Let returns be described by:
\[ \tilde{z}_i = a_i + b_i \tilde{f}_1 + c_i \tilde{f}_2 \]
where b, c, and 1 are not collinear (so that our factor loadings are “sufficiently different”, a term we will use now and make more precise later).
Form a portfolio of 3 assets; its return is:
\[ \tilde{Z}_w = w'a + \tilde{f}_1 w'b + \tilde{f}_2 w'c \]
Choose w so that $w'b = w'c = 0$:
\[ \Rightarrow \ \tilde{Z}_w = w'a = R \ \text{ or, } \ w'(a - R 1) = 0 \]
We can write this as:
\[ \begin{pmatrix} a_1 - R & a_2 - R & a_3 - R \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
Again, this is a singular matrix where the last 2 rows are not collinear.
So, it must be that:
\[ \bar{z}_i - R = a_i - R = \lambda_1 b_i + \lambda_2 c_i \quad \forall i \]
Again, let $\bar{z}^k$ be the expected return on a portfolio with one unit of factor k risk.
We identify $\bar{z}^1 - R = \lambda_1$ and $\bar{z}^2 - R = \lambda_2$ and the familiar pricing formula is complete.
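A small numerical version of the three-asset argument: the sketch below (hypothetical loadings and premia, with a hand-written Cramer's rule solver so that it needs only the standard library) finds the portfolio with $1'w = 1$ and $w'b = w'c = 0$ and checks that it earns R when expected returns satisfy the two-factor pricing relation.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("1, b and c are collinear: no unique portfolio")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


# Hypothetical riskless return, factor premia and loadings.
R, lam1, lam2 = 1.02, 0.05, 0.03
b = [0.5, 1.0, 1.5]
c = [1.0, 0.2, 0.8]
a = [R + lam1 * bi + lam2 * ci for bi, ci in zip(b, c)]

# Rows impose: weights sum to one, no factor-1 risk, no factor-2 risk.
w = solve3([[1.0, 1.0, 1.0], b, c], [1.0, 0.0, 0.0])
print(w, dot(w, a))
```

If the expected returns in `a` were perturbed away from the linear relation, the same portfolio would be riskless but earn something other than R, which is the arbitrage the argument rules out.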
Unavoidable Risk:
Consider the two factor model:
\[ \tilde{z}_i = a_i + b_i \tilde{f}_1 + \tilde{f}_2, \qquad \tilde{\varepsilon}_i \equiv 0 \]
Any positive investment portfolio has returns:
\[ \tilde{Z}_w = w'a + w'b\, \tilde{f}_1 + \tilde{f}_2 \]
Clearly, we do not have the factor loadings on factor 2 sufficiently different across assets – this is an
extreme version.
However, any portfolios with the same b are perfectly correlated and must have the same expected
return from the absence of arbitrage (here simple dominance).
Define ao as the expected return on any portfolio with b=0.

This is well defined by the argument just given.
Form 2 portfolios, $w^+$ and $w^*$, with 3 assets (i, j, o).
Let:
\[ w_i^+ = w_i^* + \delta_1, \qquad w_j^+ = w_j^* + \delta_2, \qquad w_o^+ = w_o^* - \delta_1 - \delta_2 \]
By construction, the two portfolios have the properties that:
\[ a^+ = a^* + \delta_1 (a_i - a_o) + \delta_2 (a_j - a_o) \]
\[ b^+ = b^* + \delta_1 b_i + \delta_2 b_j \]
Now, choose $\delta_1 b_i + \delta_2 b_j = 0$. Then, $b^+ = b^*$, and it must be that $a^+ = a^*$.
Thus,
\[ 0 = \delta_1 (a_i - a_o) + \delta_2 (a_j - a_o) = \delta_1 \left[ a_i - a_o - \frac{b_i}{b_j} (a_j - a_o) \right] \]
(by solving $\delta_1 b_i + \delta_2 b_j = 0$ for $\delta_2$)
This must, then, hold for any choice of $\delta_1$ (and its accompanying $\delta_2$). So,
\[ \frac{a_i - a_o}{b_i} = \frac{a_j - a_o}{b_j} = \lambda \]
The linear pricing relation is then (by simply rearranging this expression):
\[ a_i = a_o + b_i \lambda \ \text{ or, } \ \bar{z}_i = a_i = \bar{z}^o + b_i (\bar{z}^1 - \bar{z}^o) \]
where $\bar{z}^o = a_o$ and
$\bar{z}^1$ = expected return on a portfolio with b = 1.
Note: The riskless rate does not appear here, even if a riskless asset exists. This occurs because we cannot create a risk free asset with these risky assets; the absence of arbitrage therefore does not provide a relation between R and the expected return on risky assets. In particular, $\bar{z}^o \neq R$.
In fact, we expect $\bar{z}^o > R$ is the usual relation since you must compensate investors for their exposure to the unavoidable factor 2 risk. Similarly, $\bar{z}^1$ can be greater than or less than $\bar{z}^o$.
The Pricing Equation:
An interesting question to ask at this point is what can we learn from the pricing relation? Can we put it in the context of other results we have seen?
Recall that $\Lambda = \dfrac{u'(\tilde{Z}_e^k)}{\lambda}$ (i.e. the state price density is equal to the marginal utility of future return on the optimal portfolio scaled by the Lagrange multiplier or the marginal utility of current consumption). Thus the general pricing model tells us that pricing is determined by the covariance of an asset’s return with the marginal utility of the return of an optimal portfolio, or its covariance with the state price density.
Recall also that when there was a set of circumstances in which there was a simple sufficient statistic for
this marginal utility we were able to derive standard pricing equations like the CAPM (quadratic utility or
normally distributed returns) or the CCAPM (when consumption or consumption growth is sufficient
for marginal utility) from the general formula.
To make this point in a slightly different way consider the CAPM. Given the model:
(1) $\Lambda = a + b z_m$ and $1 = E[\Lambda z_i]$
We can find constants γ and δ such that
(2) $E(z_i) = \gamma + \delta \beta_{im}$.
Conversely, given (2) we can find constants a and b such that (1) holds.
This simple result follows from the standard decomposition of $1 = E[\Lambda z_i]$:
\[ E(z_i) = \frac{1}{E(\Lambda)} - \frac{Cov(\Lambda, z_i)}{E(\Lambda)} \]
Write this as
\[ E(z_i) = \frac{1}{E(\Lambda)} + \frac{Cov(\Lambda, z_i)}{Var(\Lambda)} \left( -\frac{Var(\Lambda)}{E(\Lambda)} \right) = \frac{1}{E(\Lambda)} + \frac{Cov(z_m, z_i)}{b\, Var(z_m)} \left( -\frac{Var(\Lambda)}{E(\Lambda)} \right) = \frac{1}{E(\Lambda)} + \beta_{im} \left( -\frac{Var(\Lambda)}{b\, E(\Lambda)} \right) = \gamma + \delta \beta_{im} \]
where the second equality follows from $\Lambda = a + b z_m$. Further, if a riskless asset exists, note that
\[ \gamma = \frac{1}{E(\Lambda)} = R. \]
From the last equation for expected return we can obviously go backward and show that there exists an
a and a b such that (1) holds. Thus the standard CAPM is equivalent to there being a state price density
that prices all assets that is linear in the return on the market portfolio. Note this may not be a strictly
positive state price density if the return on the market can be extreme.
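The equivalence between a state price density that is linear in the market return and a single-beta pricing relation can be verified on a small discrete state space. Everything in this sketch is a hypothetical construction: three states, a made-up market return, a slope b chosen by hand with the intercept a solved so that the market itself is priced, and two arbitrary payoffs converted to returns by dividing by their model price. The symbol Λ for the state price density is written `sdf` in the code.

```python
probs = [0.2, 0.5, 0.3]
z_m = [0.85, 1.05, 1.30]          # hypothetical market return by state


def mean(x):
    return sum(p * xi for p, xi in zip(probs, x))


def cov(x, y):
    mx, my = mean(x), mean(y)
    return mean([(xi - mx) * (yi - my) for xi, yi in zip(x, y)])


# Linear state price density sdf = a + b*z_m with E[sdf * z_m] = 1.
b = -2.0
a = (1.0 - b * mean([z * z for z in z_m])) / mean(z_m)
sdf = [a + b * z for z in z_m]

gamma = 1.0 / mean(sdf)
delta = -cov(sdf, sdf) / (b * mean(sdf))

payoffs = {"asset_1": [1.0, 2.0, 4.0], "asset_2": [3.0, 2.0, 1.0], "market": z_m}
check = {}
for name, x in payoffs.items():
    price = mean([s * xi for s, xi in zip(sdf, x)])
    z = [xi / price for xi in x]              # return, so E[sdf * z] = 1
    beta = cov(z, z_m) / cov(z_m, z_m)
    check[name] = (mean(z), gamma + delta * beta)

for name, (expected_return, beta_pricing) in check.items():
    print(name, expected_return, beta_pricing)
```

Each expected return matches $\gamma + \delta \beta_{im}$ to rounding error. With these particular numbers the density is positive in every state, but a more extreme market return would push `a + b * z_m` below zero, which is the caveat noted in the text.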
Not too surprisingly, the same kind of result holds for the APT model as well. The pricing model there
is a multi-factor beta pricing model. From Cochrane (pg 107):
Given the model:
(1) $\Lambda = a + b'f$ and $1 = E(\Lambda z_i)$
where b and f are vectors, we can find a γ and a vector δ such that:
(2) $E(z_i) = \gamma + \delta'\beta_i$
where the $\beta_i$ are the multiple regression coefficients of $z_i$ on f and a constant. Conversely, given (2) we can find an a and a vector b such that (1) holds.
Thus these are equivalent pricing relations. This also gives us an idea of how we might look for or select
factors. The factors should be things that help model marginal utility. Thus select variables that will
proxy for how “happy” people are: zm, the business cycle, production, consumption, input and energy
prices, etc.
Asymptotic Arbitrage Opportunities (AAO):

An AAO exists if there exists a sequence of arbitrage portfolios $w^n$, n = 2,… (here we are thinking of a sequence of economies with the number of assets, n, increasing) such that:
\[ \sum_{i=1}^{n} w_i^n = 0 \quad \forall n \qquad \text{(all are arbitrage portfolios)} \]
\[ \sum_{i=1}^{n} w_i^n \bar{z}_i \ge \delta > 0 \quad \forall n \qquad \text{(expected return is bounded away from zero)} \]
\[ w^{n\prime} \Sigma w^n = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i^n w_j^n \sigma_{ij} \to 0 \qquad \text{(risk disappears)} \]
For the variance all that is technically required is that some infinite sub-sequence has variance with a
limit of zero.
We focus on variance because as $n \to \infty$ risk $\to 0$. So, as $n \to \infty$ you get a riskless payoff and mean-variance analysis is a good approximation.
This is an extension of the notion of a riskless arbitrage with two cautions:

(1) In the simple case, a riskless arbitrage opportunity has arbitrary scale, so infinite profit comes with an unbounded position. With AAO, you must be careful how you increase scale as risk must still vanish in the limit. That is, if $w^n$ is a sequence defining an AAO, then $\gamma w^n$ is as well for any $\gamma > 0$. However, γ cannot become unbounded in an arbitrary fashion; it must depend on n.
One way this can be done is to set $\gamma_n = (w^{n\prime} \Sigma w^n)^{-1/4}$.
Then, $\hat{w}^n = \gamma_n w^n$ is an AAO with infinite profit in the limit:
\[ \sum_{i=1}^{n} \hat{w}_i^n \bar{z}_i = \gamma_n \sum_{i=1}^{n} w_i^n \bar{z}_i \ge \gamma_n \delta \to \infty \quad (\text{as } n \to \infty) \]
\[ \sum_{i=1}^{n} \sum_{j=1}^{n} \hat{w}_i^n \hat{w}_j^n \sigma_{ij} = \gamma_n^2 \sum_{i=1}^{n} \sum_{j=1}^{n} w_i^n w_j^n \sigma_{ij} = \left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_i^n w_j^n \sigma_{ij} \right)^{1/2} \to 0 \quad (\text{as } n \to \infty) \]
(2) AAO is not a preference free idea. A riskless arbitrage opportunity guarantees infinite wealth and an infinite certainty equivalent utility of return, i.e. $u(Z_{w^n}) \to u(\infty)$, so an unbounded position is taken.
Chebyshev’s inequality ($\text{Prob}[|\tilde{x} - \bar{x}| \ge t\sigma] \le t^{-2}$) tells us that an AAO does guarantee infinite wealth with probability one; however, an infinite certainty equivalent wealth is not guaranteed. That is, not all investors take a position in an AAO.
A counter-example in the text demonstrates this and we can see that an opportunity to increase wealth
with no investment and vanishingly small risk does not guarantee an increase in expected utility – in the
example, no investor invests in the AAO.
If a utility function is bounded below by a quadratic function, then an AAO is a good deal. But, few
“nice” utility functions are bounded in this way.
The Counter Example:

Let $u(z) = -\exp(-z)$
Realized returns are $\{1 - n,\ 1 + n,\ 1\}$
Probabilities: $\{n^{-3},\ n^{-3},\ 1 - 2n^{-3}\}$
\[ \bar{z}_n = 1 + n^{-3} [(1 - n) + (1 + n) - 2] = 1 > 0 \quad \forall n \]
\[ \sigma_n^2 = n^{-3} [(-n)^2 + n^2] = \frac{2}{n} \to 0 \ \text{ as } n \to \infty \]
\[ Eu(z_n) = -n^{-3} (e^{n-1} + e^{-n-1} - 2e^{-1}) - e^{-1} < -n^{-3} e^{n-1} \]
As $n \to \infty$, the limit of this is $-\infty$, so investing in this AAO reduces utility to $-\infty$.
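The three moments in the counterexample can be evaluated directly. The sketch below (function name and the chosen values of n are arbitrary) computes the mean, variance and expected exponential utility of the payoff for several n.

```python
import math


def aao_counterexample(n):
    """Mean, variance and expected utility of the payoff for a given n."""
    p = n ** -3
    outcomes = [(p, 1.0 - n), (p, 1.0 + n), (1.0 - 2.0 * p, 1.0)]
    mean = sum(q * z for q, z in outcomes)
    var = sum(q * (z - mean) ** 2 for q, z in outcomes)
    eu = sum(q * -math.exp(-z) for q, z in outcomes)
    return mean, var, eu


rows = [(n,) + aao_counterexample(n) for n in (5, 10, 20, 40)]
for n, mean, var, eu in rows:
    print(n, mean, var, eu)
```

The mean stays at 1 and the variance shrinks like 2/n, yet expected utility heads toward minus infinity because the exponential penalty on the rare outcome 1 - n grows faster than its probability $n^{-3}$ shrinks.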
Pricing With Idiosyncratic Risk:

Theorem 1: If the returns on the risky assets are given by a k-factor model with bounded residual risk ($\sigma^2_{\varepsilon_i} \le \bar{\sigma}^2\ \forall i$) and there are no AAO’s, then: There exists a linear pricing model which gives expected returns with a mean squared error of zero. That is, $\exists\ \lambda_o, \lambda_1, \dots, \lambda_K$ (dependent upon n) such that:
\[ v_i \equiv a_i - \lambda_o - \sum_{k=1}^{K} b_{ik} \lambda_k \]
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} v_i^2 = \lim_{n \to \infty} \frac{1}{n} \| v^n \|^2 = 0 \]
where $\| v^n \|$ is the Euclidean norm (the length) of the vector v: $\| v^n \|^2 = \sum_{i=1}^{n} v_i^2$
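Proof: Select n assets and number them 1,…,n. “Regress” their expected returns on the $b_{ik}$ and call the regression coefficients $\lambda_k$. (This is really a “population exercise”; formally it is a projection of the $a_i$ on the space spanned by the matrix B and the vector 1, the constant in the regression. If there is a multicollinearity problem, prespecify as many of the $\lambda_k$ as is necessary to remove the problem.)
The residuals $v_i$ of this “regression” are given by:
\[ a_i = \lambda_o + \sum_{k=1}^{K} b_{ik} \lambda_k + v_i \]
From the orthogonality property of the projection, we know:
\[ \sum_i v_i = 0 \ \text{ and } \ \sum_i v_i b_{ik} = 0 \quad \forall k \]
Now, consider the arbitrage portfolio:
\[ w_i^n = \frac{v_i}{\| v^n \| \sqrt{n}} \]
The payoff on this portfolio is:
\[ (\sqrt{n}\, \| v^n \|)^{-1} \sum_i v_i \tilde{z}_i = (\sqrt{n}\, \| v^n \|)^{-1} \sum_i v_i \Big( a_i + \sum_k b_{ik} \tilde{f}_k + \tilde{\varepsilon}_i \Big) = (\sqrt{n}\, \| v^n \|)^{-1} \sum_i v_i (a_i + \tilde{\varepsilon}_i) \]
Expected profit is:
\[ (\sqrt{n}\, \| v^n \|)^{-1} \sum_i v_i a_i = (\sqrt{n}\, \| v^n \|)^{-1} \Big[ \lambda_o \sum_i v_i + \sum_k \lambda_k \sum_i v_i b_{ik} + \sum_i v_i^2 \Big] = (\sqrt{n}\, \| v^n \|)^{-1} \sum_i v_i^2 = \frac{\| v^n \|}{\sqrt{n}} \]
Variance of profit is:
\[ (n \| v^n \|^2)^{-1} \sum_i v_i^2 \sigma^2_{\varepsilon_i} \le \frac{\bar{\sigma}^2}{n} \]
Now suppose that the theorem is false so that $\dfrac{\sum_i v_i^2}{n} \not\to 0$ in the limit.
Then, $\dfrac{\| v^n \|}{\sqrt{n}}$ cannot go to zero and an AAO exists.
So, if no AAO’s exist, then
\[ \frac{\| v^n \|}{\sqrt{n}} \to 0 \ \text{ and } \ \frac{\| v^n \|^2}{n} = \frac{\sum_i v_i^2}{n} \to 0 \ \text{ as well. QED.} \]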
The derived no arbitrage condition is $\frac{1}{N} \sum_i (a_i - \lambda_o - \sum_k b_{ik} \lambda_k)^2 \to 0$. Each term in this sum is non-negative. The average term is zero, thus all but a finite number must be negligible.
More precisely, if we order the assets by the size of their absolute pricing error, $|v_1| \ge \dots \ge |v_n| \ge \dots$, then for any $\delta$, no matter how small, there exists a finite N such that fewer than N assets are mispriced by more than $\delta$:
\[ |v_1| \ge \dots \ge |v_{N-1}| \ge \delta > |v_N| \ge \dots \]
With an infinite number of assets, the probability (picking one at random) of getting one which has an error of more than $\delta$ is zero, as the assets with this size error or worse are a finite set which has measure zero in an infinite set of assets.
Thus, the linear model prices most assets correctly and all with a negligible mean squared error. It can,
however, be arbitrarily bad at pricing a finite number of assets.
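The construction used in the proof of Theorem 1 can be reproduced for a single factor. The sketch below is an illustration under made-up inputs: six hypothetical loadings, expected returns that deviate slightly from an exact linear relation, and a closed-form one-regressor projection standing in for the general projection on B and 1.

```python
import math


def project(a, b):
    """Projection of a on a constant and b; returns (lam0, lam1, residuals)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sbb = sum((x - mean_b) ** 2 for x in b)
    sab = sum((x - mean_b) * (y - mean_a) for x, y in zip(b, a))
    lam1 = sab / sbb
    lam0 = mean_a - lam1 * mean_b
    residuals = [y - lam0 - lam1 * x for x, y in zip(b, a)]
    return lam0, lam1, residuals


# Hypothetical loadings and expected returns with small pricing deviations.
b = [0.4, 0.8, 1.0, 1.3, 1.7, 2.0]
mispricing = [0.004, -0.003, 0.002, -0.005, 0.001, 0.001]
a = [1.03 + 0.05 * bi + mi for bi, mi in zip(b, mispricing)]

lam0, lam1, v = project(a, b)
n = len(a)
norm_v = math.sqrt(sum(x * x for x in v))

# Arbitrage portfolio from the proof: w_i = v_i / (||v|| * sqrt(n)).
w = [x / (norm_v * math.sqrt(n)) for x in v]
net_investment = sum(w)
factor_exposure = sum(wi * bi for wi, bi in zip(w, b))
expected_profit = sum(wi * ai for wi, ai in zip(w, a))

print(net_investment, factor_exposure, expected_profit, norm_v / math.sqrt(n))
```

The portfolio costs nothing and carries no factor risk, and its expected profit equals $\| v^n \| / \sqrt{n}$, the root mean squared pricing error, so a mean squared error that stays away from zero as n grows would hand investors an AAO.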
The Magnitude of the Residual Variance and the Pricing Bound:

Why doesn’t the residual variance enter the pricing bound equation? We might expect that assets with
small residual variance should be priced very closely, after all we know that among assets with no
residual risk the pricing is exact. This is confirmed in the following theorem (Dybvig ’83).
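Theorem 2: Under the conditions of Theorem 1, the pricing error must satisfy the following:
\[ \lim_{n \to \infty} \frac{1}{n} \sum_i \frac{v_i^2}{\sigma_i^2} = 0 \]
Proof: Look at a weighted regression.
Regress $\dfrac{a_i}{\sigma_i}$ on $\dfrac{b_{ik}}{\sigma_i}$ (and $\dfrac{1}{\sigma_i}$).
The residuals are given as:
\[ \frac{a_i}{\sigma_i} = \frac{\lambda_o}{\sigma_i} + \sum_k \lambda_k \frac{b_{ik}}{\sigma_i} + \frac{v_i}{\sigma_i} \]
Where,
\[ \sum_i \frac{v_i}{\sigma_i} \frac{b_{ik}}{\sigma_i} = 0 \quad \forall k \ \text{ and } \ \sum_i \frac{v_i}{\sigma_i} \frac{1}{\sigma_i} = 0 \]
Consider the portfolio:
\[ w_i^n = \frac{v_i / \sigma_i^2}{\left[ n \sum_i v_i^2 / \sigma_i^2 \right]^{1/2}} \]
Profit on the portfolio is given by:
\[ \left[ n \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{-1/2} \sum_i \frac{v_i}{\sigma_i^2} \Big( a_i + \sum_k b_{ik} \tilde{f}_k + \tilde{\varepsilon}_i \Big) = \left[ n \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{-1/2} \sum_i \frac{v_i}{\sigma_i^2} (a_i + \tilde{\varepsilon}_i) \]
Expected profit is:
\[ \left[ n \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{-1/2} \sum_i \frac{v_i}{\sigma_i^2} \Big( \lambda_o + \sum_k b_{ik} \lambda_k + v_i \Big) = \left[ n \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{-1/2} \sum_i \frac{v_i^2}{\sigma_i^2} = \left[ \frac{1}{n} \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{1/2} \]
Variance of profit is:
\[ \left[ n \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{-1} \sum_i \frac{v_i^2}{\sigma_i^4} \sigma_i^2 = \left[ n \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{-1} \sum_i \frac{v_i^2}{\sigma_i^2} = \frac{1}{n} \]
Now, suppose that the theorem is false. That is, $\dfrac{1}{n} \sum_i \dfrac{v_i^2}{\sigma_i^2} \not\to 0$ in the limit; then $\left[ \dfrac{1}{n} \sum_i \dfrac{v_i^2}{\sigma_i^2} \right]^{1/2}$ cannot go to zero and an AAO exists.
So, if no AAOs exist,
\[ \left[ \frac{1}{n} \sum_i \frac{v_i^2}{\sigma_i^2} \right]^{1/2} \ \text{ and } \ \frac{1}{n} \sum_i \frac{v_i^2}{\sigma_i^2} \ \text{ both } \to 0 \ \text{ as } n \to \infty. \]
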
Priced & Unpriced Risk:

We must specify the factors more fully before we can identify what risk is priced because: we can from
any k-factors create a different set of uncorrelated factors for which only one receives a positive factor
risk premium.
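The model is:

$$z = a + \Lambda f + \varepsilon \quad \text{with} \quad a = \lambda_0 \mathbf{1} + \Lambda\lambda$$
Now, look at an orthogonal transformation of f:

$$\hat{f} = T'f \quad \text{where } TT' = I, \text{ so } f = T\hat{f}$$
Then, we can write:

$$z = a + \Lambda T\hat{f} + \varepsilon = a + \hat{\Lambda}\hat{f} + \varepsilon$$
where $\Lambda = \hat{\Lambda}T'$ and
$$a = \lambda_0 \mathbf{1} + \hat{\Lambda}T'\lambda$$
We must show $T'\lambda$ is a vector with one non-zero entry and that the transformed model is valid.
The transformed model is proper since:
$$E(\hat{f}') = E(f'T) = E(f')T = 0 \quad \text{and}$$
$$E(\hat{f}\hat{f}') = E(T'ff'T) = T'E(ff')T = T'IT = I \quad \text{for any orthogonal } T.$$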
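Now, choose the $k \times k$ matrix
$$T = \left( \lambda(\lambda'\lambda)^{-1/2},\; X \right)$$
where X is any $k \times (k-1)$ matrix whose columns have unit length, are mutually orthogonal, and are all orthogonal to $\lambda$.
By construction, T is orthogonal: $T'T = I$.
So, $I = T'T = \left( T'\lambda(\lambda'\lambda)^{-1/2},\; T'X \right)$
And, reading off the first column,
$$T'\lambda = \left( (\lambda'\lambda)^{1/2}, 0, 0, \ldots, 0 \right)'$$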
The lesson being that in any equilibrium, only a portion of the uncertainty (risk) brings compensation.
This can be true even when unpriced risk is common to many or all assets. We can’t make any
meaningful economic statements until priced and unpriced sources of risk are identified. The CAPM by
its preference restriction does this. In particular, we must say something further about the nature of the
factors in the returns generating model or we cannot attach any significance to the size or the sign of the
$\lambda_k$.
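As an illustration of the rotation argument, here is a small sketch for k = 2 with assumed premia $\lambda = (0.03, 0.04)$ and an assumed loading row $b = (0.8, 1.2)$; the helper name `rotate_premia` is hypothetical. It builds $T = (\lambda/|\lambda|, x)$, shows that all of the premium is loaded on the first rotated factor, and checks that the asset's total risk premium is unchanged.

```python
import math

def rotate_premia(lam):
    """For k = 2, return the orthogonal matrix T = (lam/|lam|, x) and T'lam."""
    l1, l2 = lam
    norm = math.hypot(l1, l2)
    col1 = (l1 / norm, l2 / norm)        # lambda scaled to unit length
    col2 = (-l2 / norm, l1 / norm)       # unit vector orthogonal to lambda
    T = [[col1[0], col2[0]],
         [col1[1], col2[1]]]
    rotated = [col1[0] * l1 + col1[1] * l2,
               col2[0] * l1 + col2[1] * l2]
    return T, rotated

lam = (0.03, 0.04)                        # assumed factor premia
T, new_premia = rotate_premia(lam)

# Loadings of one asset on the original factors and on the rotated factors
# (b_hat = b T, one row of Lambda-hat = Lambda T).
b = (0.8, 1.2)
b_hat = [b[0] * T[0][j] + b[1] * T[1][j] for j in range(2)]

premium_original = b[0] * lam[0] + b[1] * lam[1]
premium_rotated = b_hat[0] * new_premia[0] + b_hat[1] * new_premia[1]

# T'T should be the identity matrix.
TtT = [[sum(T[r][i] * T[r][j] for r in range(2)) for j in range(2)] for i in range(2)]
```

The second rotated factor still moves returns (its loading `b_hat[1]` is not zero) but it earns no premium, which is the sense in which the split between priced and unpriced risk depends on how the factors are specified.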
Fully Diversified Portfolios:
We would like to say that $\lambda_k = (\bar{z}^k - R)$. The problem is that all such $\tilde{z}^k$ may not be priced the same
since pricing is only approximate. So, $\lambda_k$ would not be well defined.
A fully diversified portfolio is the limit of a sequence of positive net investment portfolios whose
weights satisfy:
$$\lim_{n \to \infty} n \sum_{i=1}^{n} (w_i(n))^2 \le c < \infty$$
i.e. the $w_i$ vanish for most assets.
$w_i = O(1/n)$ for all but a finite number of assets in a fully diversified portfolio.
The important feature is of course that they have no residual risk in the limit:
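$$\lim_{n \to \infty} \sum_{i=1}^{n} w_i^2 \sigma_i^2 \le \lim_{n \to \infty} \Big(n \sum_{i=1}^{n} w_i^2\Big) \frac{\sigma^2}{n} = 0$$
where $\sigma^2$ is an upper bound on the residual variances.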
There may also exist less than fully diversified portfolios with no residual risk.
Theorem 3: The expected returns on all fully diversified portfolios are given correctly with zero error
(in the sense that $(\sum_i w_i v_i)^2 \to 0$, so the error on any fully diversified portfolio is negligible) by any linear
pricing model satisfying Theorem 1.
Proof:
The expected return on the nth portfolio in a sequence is:
$$a_F = \lambda_0 + \sum_k \sum_i w_i b_{ik}\lambda_k + \sum_i w_i v_i = \lambda_0 + \sum_k b_{Fk}\lambda_k + v_F$$
If $\sum_i w_i v_i = v_F$ vanishes as $n \to \infty$, the pricing is exact.
Consider
$$v_F^2 = \Big(\sum_i w_i v_i\Big)^2 \le \sum_i w_i^2 \sum_i v_i^2 = \Big(n \sum_i w_i^2\Big)\Big(\frac{\sum_i v_i^2}{n}\Big)$$
(The inequality is the Cauchy-Schwarz inequality, $(E[XY])^2 \le E(X^2)E(Y^2)$. The 1st term is bounded and the
second term goes to zero from Theorem 1.)
So, for a fully diversified portfolio, we know $\lambda_0$ is the return on a portfolio with no factor risk, but it
also has no residual risk since it is fully diversified. So, $\lambda_0 = R$ must hold.
Also, for a fully diversified portfolio with one source of factor risk,
i.e. $b_{ik} = 1$ and $b_{ij} = 0 \;\forall j \ne k$, the expected return is exactly $\bar{z}^k = R + \lambda_k$.
So, $\lambda_k = \bar{z}^k - R$ is now well defined.
Interpreting the factor premiums:

Define $\hat{\Lambda}_n = (\mathbf{1}, \Lambda_n)$, the augmented factor loading matrix, an $n \times (k+1)$ matrix.
Assume for now that it has rank k+1 for large n. $\Lambda_n$ must have rank k or there is unavoidable risk
(i.e. columns of $\Lambda_n$ are collinear), and none of the columns of $\Lambda_n$ should be collinear with a constant
vector. In either case, you cannot form vectors with one source of factor risk.
Define $Q_n = \hat{\Lambda}_n'\hat{\Lambda}_n / n$ and assume the sequence $Q_n$ has a limit. $Q_n$ is $(k+1) \times (k+1)$, a size independent of n, and
has a limit if each element does. We know these are all bounded since the factor loadings are bounded.
Consider the sequence of portfolio formation problems using n = k+1, k+2,… assets:
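$$\min_w \; \tfrac{1}{2}\sum_i w_i^2 \quad \text{such that} \quad \sum_i w_i = 1, \quad \sum_i w_i b_{ik} = 1, \quad \sum_i w_i b_{ij} = 0 \;\; \forall j \ne k$$
So, form well diversified portfolios with single factor risk (positive investment):
$$L = \tfrac{1}{2} w'w + \gamma'(c - \hat{\Lambda}'w)$$
where $\gamma$ is a vector of Lagrange multipliers and c is a vector with a 1 in the 1st position and in the position of factor k, and
zeros elsewhere.
The FOCs are:

(1) $0 = w - \hat{\Lambda}\gamma$
(2) $0 = c - \hat{\Lambda}'w$
Pre-multiply (1) by $\hat{\Lambda}'$: $0 = \hat{\Lambda}'w - \hat{\Lambda}'\hat{\Lambda}\gamma = c - \hat{\Lambda}'\hat{\Lambda}\gamma$

Or, $\gamma = (\hat{\Lambda}'\hat{\Lambda})^{-1}c$
Or, from (1) again, $w = \hat{\Lambda}(\hat{\Lambda}'\hat{\Lambda})^{-1}c$
By construction, this portfolio has a single source of factor risk and is fully diversified if:
$$n\,w'w = n\,c'(\hat{\Lambda}'\hat{\Lambda})^{-1}\hat{\Lambda}'\hat{\Lambda}(\hat{\Lambda}'\hat{\Lambda})^{-1}c = c'Q_n^{-1}c \text{ is bounded.}$$
Since c and $Q_n$ are finite in size this can be unbounded only if $Q_n$ becomes singular in the limit. $Q_n$ is
non-singular for finite n since $\hat{\Lambda}_n$ is of full column rank.
If c is the vector $\iota_1$ (a 1 in the first position and zeros elsewhere) the portfolio has no factor risk and is fully diversified.
Its expected return must be $\lambda_0 = R$.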
The portfolio interpretation:

$$a_i = R + \sum_k b_{ik}(\bar{z}^k - R)$$
where $\bar{z}^k$ is the expected return on a portfolio with one unit of factor k risk and no other risk ($b_{ij} = 0$
$\forall j \ne k$ and fully diversified) is valid if $Q_n$ is non-singular in the limit.
This makes precise our loose “sufficiently different” phrase in an economy with residual risk.
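A minimal sketch of the portfolio formation problem for the one-factor case, with randomly drawn (assumed) loadings; `single_factor_portfolio` is a hypothetical helper that inverts the 2x2 matrix $\hat{\Lambda}'\hat{\Lambda}$ by hand. It returns the weights $w = \hat{\Lambda}(\hat{\Lambda}'\hat{\Lambda})^{-1}c$ together with $c'Q_n^{-1}c$, which should equal $n\,w'w$.

```python
import random

def single_factor_portfolio(b, c):
    """Minimum sum-of-squares weights for one factor.

    L = (1, b) is the n x 2 augmented loading matrix and c = (c0, c1)
    holds the required net investment and the required factor loading."""
    n = len(b)
    m00, m01, m11 = float(n), sum(b), sum(x * x for x in b)   # entries of L'L
    det = m00 * m11 - m01 * m01
    g0 = (m11 * c[0] - m01 * c[1]) / det                      # gamma = (L'L)^{-1} c
    g1 = (m00 * c[1] - m01 * c[0]) / det
    w = [g0 + g1 * x for x in b]                              # w = L gamma
    quad_form = n * (c[0] * g0 + c[1] * g1)                   # c' Q_n^{-1} c
    return w, quad_form

random.seed(1)
n = 500
b = [random.uniform(0.0, 2.0) for _ in range(n)]              # assumed loadings

# Unit investment with unit loading on the factor.
w_factor, q_factor = single_factor_portfolio(b, (1.0, 1.0))
# Unit investment with no factor risk (c = iota_1).
w_zero, q_zero = single_factor_portfolio(b, (1.0, 0.0))

n_ww_factor = n * sum(x * x for x in w_factor)
loading_factor = sum(x * y for x, y in zip(w_factor, b))
loading_zero = sum(x * y for x, y in zip(w_zero, b))
```

With loadings that are spread across assets, `q_factor` stays moderate as n grows; it would blow up only if the loadings became nearly constant across assets, which is the singular-$Q_n$ case.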
Exact Pricing in the Linear Model:

“Under what conditions does the total error converge to zero – when is pricing exact for all assets?”
 A sufficient condition:
At least one investor chooses to hold a fully diversified portfolio and each asset’s idiosyncratic risk is
a fair game with respect to the factors:
$$E(\varepsilon_i \mid f_1, \ldots, f_k) = 0 \quad \forall i$$
Exact pricing follows from the separation result we saw previously.
 When will some investor hold a fully diversified portfolio?

Is it correct to say that idiosyncratic risk is not priced and so all risk averse investors hold well-
diversified portfolios?
No – this is circular reasoning.
The idea is that if the risk is diversified away by investors, then in equilibrium it will not be priced –
the argument above says it’s not priced so no one holds it.
Example: One asset’s idiosyncratic risk is priced – a zero factor world:

$$\tilde{z}_1 = a_1 + \tilde{\varepsilon}_1$$
$$\tilde{z}_i = a + \tilde{\varepsilon}_i, \quad i \ge 2$$
Let the $\tilde{\varepsilon}_i$, $i \ge 2$, be i.i.d. If a = R, assets 2, 3, 4, … are all priced “correctly.”
However, if $a_1 \ne R$, asset 1 is not – can this be sustained in an equilibrium? Yes.
By symmetry, all investors hold assets 2, 3, … equally, in the limit this duplicates the riskless asset, so
a=R in any equilibrium.
Suppose $\varepsilon_1 \sim N(0, \sigma^2)$ and utility is
$u(Z) = -e^{-\delta Z}$
We know that the portfolio weight on asset 1 is:

$$w = \frac{a_1 - R}{\delta\sigma^2}$$
with 1 - w put in the “riskless” asset.

So, $a_1 \ne R$ is consistent with equilibrium.

Here, the $\varepsilon_1$ risk of asset 1 is a significant portion of all market risk.
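The weight quoted above follows from the usual exponential-utility, normal-returns calculation. A short worked version, assuming the rest of the portfolio behaves like a riskless asset paying R and writing $\delta$ for the coefficient of absolute risk aversion:

```latex
\begin{aligned}
Z &= R + w(a_1 - R) + w\varepsilon_1, \qquad \varepsilon_1 \sim N(0, \sigma^2) \\
E\!\left[-e^{-\delta Z}\right]
  &= -\exp\!\Big\{-\delta\big[R + w(a_1 - R)\big] + \tfrac{1}{2}\delta^2 w^2 \sigma^2\Big\} \\
\max_w \;\; & \delta\big[R + w(a_1 - R)\big] - \tfrac{1}{2}\delta^2 w^2\sigma^2
  \;\;\Rightarrow\;\; \delta(a_1 - R) - \delta^2 w \sigma^2 = 0 \\
w &= \frac{a_1 - R}{\delta\sigma^2}
\end{aligned}
```

Read the other way, if asset 1 is in positive supply w, market clearing requires $a_1 = R + \delta\sigma^2 w$, so its idiosyncratic risk earns a premium.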
We do have…
Theorem 5: A sufficient condition for the arbitrage model to provide exact pricing in the limit is if:
(i) returns are given by a factor model with $E(\varepsilon_i \mid f_1, \ldots, f_k) = 0$.
(ii) The market proportion or supply of each asset is negligible
(iii) The loadings on each factor are spread evenly among the assets
(iv) No investor takes an unboundedly large position in any asset ( w *i  w i )
(v) Marginal utility is bounded above zero. Then, the pricing relation
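$$a_i = \lambda_0 + \sum_k b_{ik}\lambda_k \quad \forall i$$
holds and the pricing errors converge to zero in the sense that:
$$\lim_{n \to \infty} \sum_{i=1}^{n} v_i^2 = 0$$
i.e. all errors must be negligible in the limit (it’s not merely $\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n} v_i^2 = 0$)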
“Proof”:
(ii) and (iii) describe a “fully diversified economy”
(ii) - no asset is a big portion of the market portfolio
(iii) - assures that each factor affects many assets and that each factor must therefore make an
identifiable contribution to market risk – no multi-collinearity in the columns of $\Lambda_n$
In a fully diversified economy, all investors can simultaneously hold fully diversified portfolios with
factor loadings spread among their portfolios in any form.
(iv) and (v) guarantee an interior optimum to each investor’s choice problem
(iv) - says risk aversion doesn’t vanish so variance is always disliked
(v) - nonsatiation – expected returns are always liked
(i) - the residuals don’t matter for pricing in the limit, so you get a bound on the $v_i$; the condition $E(\varepsilon_i \mid f_1, \ldots, f_k) = 0$ has the same
role as in the discussion of the separating distributions.
8 Stochastic Discount Factors - Lucas

THE MODEL
 Lucas develops an infinite horizon, discrete time, representative agent model of an “endowment
economy” in order to examine the basis for and behavior of price movements.
 Utility of consumption is assumed to be given by:
$$E_t\left[\sum_{j=0}^{\infty} \beta^j u(c_{t+j})\right]$$
So, the assumption of additively separable utility is used but $u(\cdot)$ is reasonably general.
Where,
$c_t$ is the date t consumption level (consumption of the single good)
$\beta$ is the periodic discount factor ($0 < \beta < 1$)
$u(\cdot)$ is the periodic utility function over the single consumption good. $u: R^+ \to R^+$ is
continuously differentiable, bounded, increasing, and strictly concave with u(0)=0.
 Production: There are n productive units (or “Lucas trees” as they have come to be known)
exogenously producing (no production decision is made nor is any scarce resource used in
production) different (weakly positive) amounts of the same good. $y_{it}$ is the stochastic output of unit
i at time t and $y_t = (y_{1t}, \ldots, y_{nt})$ is the vector of output levels of the economy at time t.
 The good is perishable (none can be saved so there is no riskless asset), so aggregate consumption at
time t must satisfy: $0 \le c_t \le \sum_{i=1}^{n} y_{it} = \mathbf{1}'y_t$ (consumption is weakly positive).
 ct is a choice variable (recall: there is no investment of the consumption good in production).
 yt follows an exogenously specified Markov process defined by the transition function:
$F(y', y) = \Pr\{y_{t+1} \le y' \mid y_t = y\}$; the agent cannot manipulate this in any way.
 Each productive unit has one (perfectly divisible) share of stock outstanding which trades on a
competitive market.
 Ownership of a share at time t (“the beginning of date t”) entitles the owner to the time t production
of the associated “tree” – shares then trade at (“the end of”) time t at ex-dividend prices
$p_t = (p_{1t}, \ldots, p_{nt})$ to assign ownership of the time t+1 production.
 Denote a consumer’s beginning of period t holdings as $z_t = (z_{1t}, \ldots, z_{nt})$; thus at time t the agent
chooses $c_t$ and $z_{t+1}$ (what to eat and what to buy at prices $p_t = (p_{1t}, \ldots, p_{nt})$).
 The representative agent nature of the model and the assumption of increasing utility imply that, in
equilibrium, the following must be true: $c_t = \sum_{i=1}^{n} y_{it}$ (consume or it vanishes, there is no other use)
and $z_{t+1} = \mathbf{1} = (1, 1, \ldots, 1) \;\forall t$ (demand must equal supply in equilibrium).
 Because the only uncertainty in the model is introduced via “production” and because utility is
recursive – at each date the choice problem looks the same – price should be a fixed function of the
“useful history” of production. Given that production, yt, follows a Markov process the useful
history of production is summarized in the current state. Thus, the price vector at the end of date t
is a fixed function of the time t production: pt = p(yt).
 Therefore, knowing the transition function governing production $F(y', y)$ and the price function
p(yt) will allow determination of the process followed by price.
 Similarly, $c_t(\cdot)$ (current consumption) and $z_{t+1}(\cdot)$ (future holdings), which are the agent’s time t
choice variables, depend on current holdings ($z_t$), current production ($y_t$), and what you think the
price will be ($p_t$) at each future point in time. So we have fixed functions $c_t = c_t(z_t, y_t, p_t)$ and
$z_{t+1} = z_{t+1}(z_t, y_t, p_t)$.
 To “close” the model - Lucas uses rational expectations – the idea that the anticipated or
hypothesized price function used in the agent’s optimization problem is the same as the realized
market clearing price function (the true price function).
Definition: An equilibrium is a continuous function $p(y): E^{n+} \to E^{n+}$ ($E^{n+}$ is the subset of n-dimensional
space with non-negative elements) and a continuous bounded value function
$v(z, y): E^{n+} \times E^{n+} \to R^+$ such that:
(i) $$v(z, y) = \max_{c, x}\left\{ u(c) + \beta \int v(x, y')\, dF(y', y) \right\}$$
s.t. $c + p(y) \cdot x \le y \cdot z + p(y) \cdot z$
with $c \ge 0$ and $0 \le x \le \bar{z}$
where $\bar{z}$ is a vector with all n elements exceeding 1.0
(ii) for each y, $v(\mathbf{1}, y)$ is attained by $c = \sum_i y_i$ and $x = \mathbf{1}$

i.e. – an equilibrium is (i) a price function and a value function such that the agent’s consumption and
investment (in shares) decisions at each point in time maximize the value function (total discounted
expected utility) subject to the budget constraint and (ii) that the optimal consumption and investment
decisions given that price function are market clearing choices.
Let’s build up to the solution slowly:

 Examine first a one-period (two date) version of his model because we have seen a version of this before –
the agent has endowment yo – generated by his/her endowed initial holding of all the assets (perhaps
thought of as coming from “last period”) – the uncertainty is described by a distribution function
governing future production F(y) with density dF(y)/dy = f(y) (here we need not condition on the
previous production level as only y1 is uncertain).
 Utility of consumption for the 2 dates is $u(c_0) + \beta u(c_1)$

 The budget constraint is given by: $c_0 + p \cdot x \le y_0 + p \cdot \mathbf{1}$ (what you eat and buy must cost weakly
less than your dividends and proceeds from asset sales). p is the vector of prices of the assets that pay off $y_1$. And $c_1$
must satisfy $c_1 \le x \cdot y_1$.
 Market clearing is simply $c_0 = \sum_{i=1}^{n} y_{i0} = \mathbf{1}y_0'$, $x = \mathbf{1}$, and $c_1 = \sum_i y_{i1}$ (which comes for free)
 Equilibrium: A vector $p = (p_1, \ldots, p_n)$, a consumption choice $c_0$, and an investment choice x

that maximize:
$$u(c_0) + E[\beta u(c_1)] \quad \text{s.t.} \quad c_0 + p \cdot x \le y_0 + p \cdot \mathbf{1}, \text{ and } c_1 \le x \cdot y_1$$
and markets clear: $c_0 = \mathbf{1}y_0'$, $x = \mathbf{1}$, and $c_1 = \mathbf{1}y_1'$
 Write the agent’s optimization problem as:
$$\max_{c_0, x} \; u(c_0) + E[\beta u(x \cdot y_1)] \quad \text{s.t.} \quad c_0 + p \cdot x \le y_0 + p \cdot \mathbf{1}$$
 The Lagrangian is:

$$L = u(c_0) + E[\beta u(x \cdot y_1)] - \gamma\,(c_0 + p \cdot x - y_0 - p \cdot \mathbf{1})$$
 The FOC’s are:

$c_0$: $\; u'(c_0) - \gamma = 0$
$x$: $\; E[\beta u'(x \cdot y_1)\, y_1] - \gamma p = 0$
Budget: $\; c_0 + p \cdot x \le y_0 + p \cdot \mathbf{1}$ (which will hold with equality at the optimum)
 Solving the FOC’s and imposing the market clearing provides:

$\gamma = u'(c_0)$ so $E[\beta u'(x \cdot y_1)\, y_1] = E[\beta u'(c_1)\, y_1] = u'(c_0)\, p$
$x = \mathbf{1}$
$c_0 = \mathbf{1}y_0'$ and $c_1 = \mathbf{1}y_1'$
$$p_i = \frac{E[\beta u'(c_1)\, y_{i1}]}{u'(c_0)} = E\left[\frac{\beta u'(c_1)}{u'(c_0)}\, y_{i1}\right] \quad \text{or} \quad E\left[\frac{\beta u'(c_1)}{u'(c_0)}\, \frac{y_{i1}}{p_i}\right] = 1$$
Does the solution for p look familiar? (the expectation of the state price density times the return
on any asset must equal 1 or price is the discounted value of future payoff)
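To make the pricing formula concrete, here is a small sketch with two trees and three equally simple date-1 states. Power utility $u'(c) = c^{-\gamma}$ is an assumption made only for illustration (it is not the bounded utility of the model), and the outputs, probabilities and the function name `two_date_prices` are all made up.

```python
def two_date_prices(beta, gamma, y0, states, probs):
    """Prices p_i = E[beta * u'(c1)/u'(c0) * y_i1] with u'(c) = c**(-gamma).

    Market clearing is imposed: c0 = sum(y0) and c1 = sum(y1) in each state."""
    c0 = sum(y0)
    sdf = [beta * (sum(y1) / c0) ** (-gamma) for y1 in states]   # state-by-state m
    prices = [
        sum(q * m * y1[i] for q, m, y1 in zip(probs, sdf, states))
        for i in range(len(y0))
    ]
    return prices, sdf

beta, gamma = 0.95, 2.0
y0 = (1.0, 1.0)                                    # date-0 output of each tree
states = [(1.2, 1.0), (0.8, 1.0), (1.0, 1.1)]      # date-1 output in each state
probs = [0.3, 0.3, 0.4]

prices, sdf = two_date_prices(beta, gamma, y0, states, probs)

# Expected discounted gross return on each tree: E[m * y_i1 / p_i].
euler_checks = [
    sum(q * m * y1[i] / prices[i] for q, m, y1 in zip(probs, sdf, states))
    for i in range(len(y0))
]

# With gamma = 0 (risk neutrality) price is just the discounted expected payoff.
risk_neutral_prices, _ = two_date_prices(beta, 0.0, y0, states, probs)
```

Each entry of `euler_checks` equals one by construction, which is the "expectation of the state price density times the return" statement in the text.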
 As a specific example of this, consider the following utility function:

$$u(c_0, c_1) = -\tfrac{1}{2}(c_0 - c^*)^2 - \tfrac{1}{2}\beta E[(c_1 - c^*)^2]$$
(the negative signs appear because c* is the satiation level of consumption).
 Then, $u'(c_0) = -(c_0 - c^*)$

$\beta u'(c_1) = -\beta(c_1 - c^*)$
 From above, we know:

$$p = E\left[\frac{\beta(c_1 - c^*)}{(c_0 - c^*)}\, y_1\right]$$
 To transform this result into something familiar, recall that everything here is in real terms and that
there is a single consumption good.
So,
$W_1 = c_1$: wealth at t=1 results only from “dividends”
$$W_1 = R_1^W (W_0 - c_0)$$
where the return to aggregate wealth is: $R_1^W = \sum_{i=1}^{n} w_i R_1^i = \sum_{i=1}^{n} w_i \left(\frac{y_{i1}}{p_i}\right)$
with $\sum_i w_i = 1$
 $R_1^W$ is the return on aggregate wealth or the return on the market portfolio.
 Now write:
$$\frac{\beta(c_1 - c^*)}{(c_0 - c^*)} = \frac{\beta[R_1^W(W_0 - c_0) - c^*]}{(c_0 - c^*)} = \frac{-\beta c^*}{c_0 - c^*} + \frac{\beta(W_0 - c_0)}{c_0 - c^*}\, R_1^W = a_0 + b_0 R_1^W$$
 So, the state price density (stochastic discount factor or marginal rate of substitution) can be written
as a linear function of the return to aggregate wealth or the return on the market portfolio – which
leads to CAPM pricing as we know….and it should given the assumption of quadratic utility. We
could instead just note that there must be 1fs in this model and proceed from that direction using
the pricing results we developed previously.
Infinite Horizon Model –

 Equation (6) in the Lucas paper is the equivalent of the FOC in the static model:
$$u'\Big(\sum_j y_j\Big)\, p_i(y) = \beta \int u'\Big(\sum_j y_j'\Big)\,(y_i' + p_i(y'))\, dF(y', y)$$
which basically says that at the optimum the cost of buying marginally more of any asset,
$u'(\sum_j y_j)\, p_i(y)$ (cost put in utility terms), will equal the expected marginal benefit of the purchase
(again in utility terms). The same interpretation as the FOC given above.
 Let’s rewrite this equation to move toward a more familiar presentation:
$$u'(c_0)\, p_i(y_0) = \beta E[u'(c_1)(y_{i1} + p_i(y_1))]$$
This simply uses market clearing conditions, rewrites the expectation, and restores (for clarity) some
time subscripts that Lucas removes due to the stationarity of the decision variables. Note this holds
not just for times 0 and 1 but for any t and t+1.
 Now write:
$$p_i(y_0) = E\left[\frac{\beta u'(c_1)}{u'(c_0)}\,(y_{i1} + p_i(y_1))\right]$$
Something that looks a lot like an old friend, but not quite – the difference derives from the multi-
period nature of the problem used here. We can think of the expression as pointing out that in the
infinite horizon model you derive two benefits from owning assets, dividends and capital gains.
 Now recall that in this problem, price is a fixed function of aggregate output – it is independent of
time:
$$\begin{aligned}
p_i(y_0) &= E\left[\frac{\beta u'(c_1)}{u'(c_0)}\, y_{i1}\right] + E\left[\frac{\beta u'(c_1)}{u'(c_0)}\, p_i(y_1)\right] \\
&= E\left[\frac{\beta u'(c_1)}{u'(c_0)}\, y_{i1}\right] + E\left[\frac{\beta u'(c_1)}{u'(c_0)}\, \frac{\beta u'(c_2)}{u'(c_1)}\,(y_{i2} + p_i(y_2))\right] \\
&= E\left[\frac{\beta u'(c_1)}{u'(c_0)}\, y_{i1}\right] + E\left[\frac{\beta^2 u'(c_2)}{u'(c_0)}\,(y_{i2} + p_i(y_2))\right]
\end{aligned}$$
 This ultimately results in:

$$p_i(y_0) = \sum_{t=1}^{\infty} E\left[\beta^t \frac{u'(c_t)}{u'(c_0)}\, y_{it}\right]$$
 More familiarly,

$p_i(y_0) = \sum_{t=1}^{\infty} E[m_{0t}\, y_{it}]$ (to anticipate the notation in our next lecture we write m as the
stochastic discount factor instead of λ)
where $m_{0t} = \beta^t \dfrac{u'(c_t)}{u'(c_0)} = m_{01} \cdot m_{12} \cdot m_{23} \cdot m_{34} \cdots m_{t-1,t}$
So, the multi-period version of our standard pricing representation is just a natural extension of the static
model: today’s price is today’s value of the entire stream of dividends that are expected to be received
from ownership of the asset. Valuation at each date is done by multiplying the payoff by the state price
density or stochastic discount factor for that date relative to today.
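A sketch of how the recursive pricing equation pins down the price function, assuming a single tree whose output follows a two-state Markov chain and power utility $u'(c) = c^{-\gamma}$ (both assumptions are for illustration only, and `lucas_tree_prices` is a hypothetical name). The price function is found by iterating $p(y) = \beta E[(u'(y')/u'(y))(y' + p(y')) \mid y]$ from $p = 0$, which accumulates the discounted dividend stream term by term.

```python
def lucas_tree_prices(beta, gamma, y, P, tol=1e-12, max_iter=100000):
    """Fixed point of p(y_s) = beta * sum_t P[s][t] * (y_t/y_s)**(-gamma) * (y_t + p(y_t)).

    y lists the output level in each state and P[s][t] is the probability of
    moving from state s to state t; consumption equals output in equilibrium."""
    k = len(y)
    p = [0.0] * k
    for _ in range(max_iter):
        new_p = [
            beta * sum(P[s][t] * (y[t] / y[s]) ** (-gamma) * (y[t] + p[t])
                       for t in range(k))
            for s in range(k)
        ]
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    raise RuntimeError("price iteration did not converge")

beta = 0.95
y = [0.9, 1.1]                        # low and high output states (assumed)
P = [[0.8, 0.2], [0.3, 0.7]]          # assumed transition probabilities

log_prices = lucas_tree_prices(beta, 1.0, y, P)    # log utility
crra_prices = lucas_tree_prices(beta, 3.0, y, P)   # more concave utility
```

With log utility the marginal-utility-weighted dividend $u'(y_t)y_t$ is constant, so the sum collapses to $p(y) = \frac{\beta}{1-\beta}\,y$; the log-utility output reproduces that closed form.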
Consider briefly the solution process used by Lucas:
Proposition 1:
This tells us that for each function p(y) we might choose, there exists a unique value function
v(z, y; p) that satisfies condition (i) – i.e.:

$$v(z, y) = \max_{c, x}\left\{ u(c) + \beta \int v(x, y')\, dF(y', y) \right\}$$
s.t. $c + p(y) \cdot x \le y \cdot z + p(y) \cdot z$, $\; c \ge 0$, $\; 0 \le x \le \bar{z}$
The value function v() represents the optimal consumption and investment decisions for the agent given
the expected price function p(y).
Proposition 1 also establishes that for each output vector y, the value function v(z, y; p) is an increasing
function of z – so we have a well-behaved problem.
Proposition 2:
Gives us the derivative of v(z, y; p) with respect to z
With this derivative and the equilibrium conditions in (ii), we can derive the stochastic Euler equation (6)
that represents the solution to the problem.
Then, he notes that (6) does not involve the particular value function v(z, y; p) used in its own derivation.
Thus, (6) must hold for any equilibrium price function (see proposition 1). Conversely, if p*(y) solves (6)
and v(z, y; p*) is constructed as in proposition 1 to be the unique value function associated with p* then
the pair p*(y) and v(z, y; p*) represent an equilibrium. Thus, all equilibrium price functions solve (6) and
any solution of (6) is an equilibrium price function.
Proposition 3:
This provides that there is exactly one solution to (6) and so exactly one equilibrium price function for
this economy.
In the endowment economy asset prices adjust to conform to the consumption pattern.
Example:
The two date structure of the last example is unattractive because most investors do not set their
portfolios, consume, and then die. We can derive our old friend the CAPM in an intertemporal context
by substituting for the 2nd date utility function with a quadratic value function.
 Consider the two date quadratic “utility” function:

$$u = u(c_0) + \beta E_0[v(W_1)]$$
 Thus, we suppose the investor cares about current consumption and the wealth she carries forward.
To simplify things (also necessary for the quadratic example) we are also going to place an extra
restriction on the transition function $F(y', y)$ by requiring the return to aggregate wealth $R^W$ to be
i.i.d. each period.
 If such an investor considers whether to buy marginally more of an asset for price pio and receive an
increase in its payoff at time 1 (of yi1 per unit), the “FOC” associated with this decision is:
$$p_{i0}\, u'(c_0) = \beta E_0[v'(W_1)(y_{i1} + p_{i1})]$$
Which gives us:
$$p_{i0} = E\left[\frac{\beta v'(W_1)}{u'(c_0)}\,(y_{i1} + p_{i1})\right] \quad \text{or,} \quad m_{01} = \frac{\beta v'(W_1)}{u'(c_0)}$$
Noting that at an optimum the marginal value of an extra penny consumed must
equal the marginal value of a penny saved, or
$$u'(c_0) = v'(W_0 - c_0), \quad \text{we can write} \quad m_{01} = \frac{\beta v'(W_1)}{v'(W_0 - c_0)}$$

 Now, impose the restriction that the value function is quadratic: $v(W_1) = -\frac{\eta}{2}(W_1 - W^*)^2$
 Then, as we did above, we can write $m_{01} = a_0 + b_0 R_1^W$, which tells us that we have a case where the
CAPM holds period by period if our value function makes any sense.
Now, let’s see where this assumed quadratic value function comes from:
 To derive the CAPM result we used two assumptions: (1) the value function depends only on wealth
and (2) it’s a quadratic function of wealth. The first told us pricing would be done in terms of wealth
or returns on aggregate wealth and the second told us it would be a linear function of the returns on
aggregate wealth that sets prices.
We really want to start from:
$u(c_t, c_{t+1}, c_{t+2}, \ldots) = E_t \sum_{j=0}^{\infty} \beta^j u(c_{t+j})$ as in the Lucas paper.
 Let’s continue to talk in terms of returns to wealth or the return on the market portfolio – this is just
thinking in terms of all the n assets together and of each “y” as a dividend.
 So, paying the price pt gets you yt+1 (dividend) and pt+1 (capital gain)
 Now, let’s define a value function – as Lucas does, we’ll label it v(Wt) rather than v(z, y). We drop the
z since we will suppress the portfolio choice in what comes and we substitute W for y since we are
putting things in terms of wealth.

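$$v(W_t) = \max_{c, x} \; E_t \sum_{j=0}^{\infty} \beta^j u(c_{t+j})$$
 Now, break out current consumption:

$$v(W_t) = \max_{c_t, x_t}\left\{ u(c_t) + \beta E_t\left[ \max_{c, x} \; E_{t+1} \sum_{j=1}^{\infty} \beta^{j-1} u(c_{t+j}) \right] \right\}$$
or,
$$v(W_t) = \max_{c_t, x_t}\left\{ u(c_t) + \beta E_t[v(W_{t+1})] \right\}$$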
 This is the power of using a value function in dynamic programming – it allows us to express an
infinite-period problem as if it were a two-date problem; if we’re careful.
 Now – if u () is quadratic, is v() quadratic?

The answer is that this is one special case in which this is true. We can show this by first guessing
that v(Wt+1) is quadratic with unknown parameters, then using the definition of v(Wt) and solving the
two-date problem to find the optimal consumption choice.
 Then, we plug this optimal consumption back into the equation for v(Wt) – if the guess was right, we
will get a quadratic function for v(Wt) and be able to determine the unknown parameters.
 Let $u(c_t) = -\tfrac{1}{2}(c_t - c^*)^2$ and guess that

$$v(W_{t+1}) = -\frac{\eta}{2}(W_{t+1} - W^*)^2 \quad \text{with } \eta \text{ and } W^* \text{ unknown}$$
 The “two-date” problem for consumption choice (continue to submerge x since it won’t affect what
we want to learn)
$$v(W_t) = \max_{c_t}\left\{ -\tfrac{1}{2}(c_t - c^*)^2 - \frac{\beta\eta}{2} E_t (W_{t+1} - W^*)^2 \right\}$$
subject to $W_{t+1} = R_{t+1}^W (W_t - c_t)$
 Substitute the constraint into the maximand and take the derivative with respect to ct.
$$0 = -(c_t - c^*) + \beta\eta E_t[(R_{t+1}^W(W_t - c_t) - W^*)\, R_{t+1}^W] \quad \text{at the optimum}$$
 Let $\hat{c}_t$ represent the optimum, so:
$$\hat{c}_t - c^* = \beta\eta E_t[(R_{t+1}^W(W_t - \hat{c}_t) - W^*)\, R_{t+1}^W]$$
 Solve for $\hat{c}_t$:

$$\hat{c}_t\left[1 + \beta\eta E[(R_{t+1}^W)^2]\right] = c^* + \beta\eta E[(R_{t+1}^W)^2]\, W_t - \beta\eta W^* E(R_{t+1}^W)$$
$$\hat{c}_t = \frac{c^* + \beta\eta E[(R_{t+1}^W)^2]\, W_t - \beta\eta W^* E(R_{t+1}^W)}{1 + \beta\eta E[(R_{t+1}^W)^2]}$$
 Note: $\hat{c}_t$ is linear in $W_t$ so the value function using $\hat{c}_t$ is:

$$v(W_t) = -\tfrac{1}{2}(\hat{c}_t - c^*)^2 - \frac{\beta\eta}{2} E\left[(R_{t+1}^W(W_t - \hat{c}_t) - W^*)^2\right]$$
showing that v(Wt) is a quadratic function of Wt and $\hat{c}_t$. Since a quadratic function of a linear
function is quadratic, the value function is indeed a quadratic function of Wt. The complete
solution of the problem solves for the unknown parameters $\eta$ and W*.
 This is, essentially, what Lucas does in a much more general framework and with much greater
precision.
9 Stochastic Discount Factor – Modern Approach

Classic Issues in Finance from the Stochastic Discount Factor (“SDF”) Perspective
The basic pricing equation $v = E[mY]$ or $1 = E[mZ]$ allows us to illustrate many of the classic lessons
of finance in a simple way. (For today’s discussion assume there exists a riskless asset.)
(1) The time value of money or the risk free rate
$R_f = \dfrac{1}{E[m]}$. Demonstrating this is simple.
If a risk free asset exists – m must price it.
$1 = E[mR_f] \;\Rightarrow\; R_f = \dfrac{1}{E[m]}$ since Rf is a constant.
Let’s use this relation to consider the economics behind risk free rates:
Assume $u(c) = \frac{1}{1-\gamma}c^{1-\gamma}$, power utility, for illustration (log utility as $\gamma \to 1$), so $u'(c) = c^{-\gamma}$.
From earlier discussions we know if there is no uncertainty about ct+1, then
$$R_f = \frac{1}{E[m]} = \frac{1}{\beta\, u'(c_{t+1})/u'(c_t)} = \frac{1}{\beta\, c_{t+1}^{-\gamma}/c_t^{-\gamma}} = \frac{1}{\beta}\left(\frac{c_{t+1}}{c_t}\right)^{\gamma}$$
So:
(A) Real interest rates are higher when people are impatient ($\beta$ is low). It takes a high real rate
to get impatient people to save rather than consume. The impatience parameter can
reasonably be treated as an exogenous parameter of the economy.
(B) Real rates are high when consumption growth is high. If consumption growth is high, it
requires a high rate of interest to induce investors to consume less now in return for more
consumption later. This interpretation considers Rf to be determined by the consumption
pattern which follows production (as in Lucas).
Alternatively: If Rf is high, consumers save more, increasing consumption growth. In this

interpretation Rf is looked at as being determined by the production technology and that
consumption adjusts to it. Actually, all three are endogenously determined.
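(C) Real rates are more sensitive to changes in consumption growth when $\gamma$ is large:
$$\frac{\partial R_f}{\partial (c_{t+1}/c_t)} = \frac{\gamma}{\beta}\left(\frac{c_{t+1}}{c_t}\right)^{\gamma - 1} \quad \Rightarrow \quad \text{increasing in } \gamma$$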
If $\gamma$ is large, then utility is more concave (for this utility function, $\gamma$ measures both relative
risk aversion – aversion to consumption changing across states of nature – and intertemporal
substitution – aversion to consumption changing across time periods). With more concave
utility, the investor wants very badly to maintain a smooth consumption stream across time
and across states of nature. This implies investors are less willing to change the consumption
stream across time in response to interest rate incentives. Thus, it takes a bigger change in Rf
to induce the investor to a given consumption growth. Said differently, consumption is less
sensitive to interest rates when the desire for a smooth consumption path is high.
Now introduce uncertainty: Let $r_f = \ln(R_f)$, $\beta = e^{-\delta}$, $\Delta\ln(c_{t+1}) = \ln(c_{t+1}) - \ln(c_t)$

Assume consumption growth is lognormally distributed.
$$R_f = \frac{1}{E\left[\beta\left(\dfrac{c_{t+1}}{c_t}\right)^{-\gamma}\right]}$$
The combination of power utility and a lognormal distribution implies this expectation can be
written:
$$r_f = \delta + \gamma E_t(\Delta\ln(c_{t+1})) - \frac{\gamma^2}{2}\sigma_t^2(\Delta\ln(c_{t+1}))$$
This comes from: if z is normal, $E(e^z) = e^{E(z) + \frac{1}{2}\sigma^2(z)}$. Note that $\beta\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} = e^{-\delta}e^{-\gamma\ln(c_{t+1}/c_t)}$
and that since $\frac{c_{t+1}}{c_t}$ is lognormal, $\ln\left(\frac{c_{t+1}}{c_t}\right)$ is normal.
So, Rf is high when:

(A) Impatience is high (high $\delta$ or low $\beta$)
(B) $E_t\,\Delta\ln(c_{t+1})$ - consumption growth is expected to be high
(C) higher $\gamma$ makes Rf more sensitive to expected consumption growth, $E_t\,\Delta\ln(c_{t+1})$
(D) $\sigma^2(\Delta\ln(c_{t+1}))$ captures precautionary savings (it enters with a negative sign), the impact of uncertainty in this
model. When consumption growth is very volatile, people with power utility are more concerned
with low consumption states than they are pleased by high consumption states so they save to avoid
the dips at sacrifice of current consumption. This is the concavity in power utility and it implies that
people want to save more, driving the risk free rate down.
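The closed form for the log risk free rate can be checked against a direct numerical evaluation of $-\ln E[\beta (c_{t+1}/c_t)^{-\gamma}]$. The sketch below assumes illustrative parameter values and uses a plain trapezoid rule over the normal density of log consumption growth; the function names are hypothetical.

```python
import math

def log_riskfree_closed_form(delta, gamma, mu, sigma):
    """r_f = delta + gamma * mu - 0.5 * gamma^2 * sigma^2."""
    return delta + gamma * mu - 0.5 * gamma ** 2 * sigma ** 2

def log_riskfree_numeric(delta, gamma, mu, sigma, steps=20001, width=10.0):
    """r_f = -ln E[exp(-delta) * exp(-gamma * g)] with g ~ N(mu, sigma^2),
    integrating over mu +/- width * sigma with the trapezoid rule."""
    lo = mu - width * sigma
    h = 2.0 * width * sigma / (steps - 1)
    total = 0.0
    for i in range(steps):
        g = lo + i * h
        density = math.exp(-0.5 * ((g - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * math.exp(-delta - gamma * g) * density * h
    return -math.log(total)

# Assumed values: 2% time preference, gamma = 2, 2% mean and 2% volatility
# of log consumption growth.
delta, gamma, mu, sigma = 0.02, 2.0, 0.02, 0.02
closed = log_riskfree_closed_form(delta, gamma, mu, sigma)
numeric = log_riskfree_numeric(delta, gamma, mu, sigma)

# Doubling volatility lowers the rate through the precautionary savings term.
closed_high_vol = log_riskfree_closed_form(delta, gamma, mu, 2.0 * sigma)
```

The gap `closed - closed_high_vol` equals $\frac{\gamma^2}{2}(4\sigma^2 - \sigma^2)$, the extra precautionary savings effect.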
(2) Risk Corrections

$$v(Y) = E[mY] = E[m]E[Y] + Cov(m, Y) = \frac{E[Y]}{R_f} + Cov(m, Y)$$
Current price equals the expected payoff discounted by Rf, plus a risk correction.
Substituting for m from consumption/investment problem:

$$v_t(y_{i,t+1}) = \frac{E[y_{i,t+1}]}{R_f} + \frac{Cov(\beta u'(c_{t+1}),\, y_{i,t+1})}{u'(c_t)}$$
Since $u'$ is decreasing in ct+1, an asset sells for a lower price if its payoff covaries positively with future
consumption or consumption growth (negatively with marginal utility of consumption).
Why? Investors don’t like uncertainty about consumption. Assets that pay off more in states where you are
wealthy and less in those where you are poor add to the volatility of consumption. A lower price is required on
such an asset for you to hold it – i.e. it provides compensation for the risk or a risk premium.
In terms of returns:
$$1 = E[mz_i] = E[m]E[z_i] + Cov(m, z_i)$$
so,
$$1 = \frac{E[z_i]}{R_f} + Cov(m, z_i) \quad \text{or,} \quad E[z_i] = R_f - R_f\, Cov(m, z_i)$$
$$E[z_i] = R_f - \frac{Cov(u'(c_{t+1}),\, z_i)}{E[u'(c_{t+1})]}$$
All assets have an expected return that is the risk free rate plus a premium that is positive for assets
whose returns covary positively with consumption.
Why covariance matters rather than variance…

Remember that an investor really cares about volatility of consumption, not the volatility of individual
assets or even the volatility of the return on her portfolio. Consider what happens to the volatility of
consumption if an investor buys a little more, w, of a payoff y:
$$\sigma^2(c + wy) = \sigma^2(c) + w^2\sigma^2(y) + 2w\, Cov(c, y)$$
At the margin, i.e. for small w, the change in volatility of consumption comes from the Cov(c, y) term so
the way in which y contributes to what matters is via its covariance with consumption, not via its own
variance.
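A small discrete-state sketch of the risk correction. The three states, their probabilities, and the power-utility discount factor $m = \beta g^{-\gamma}$ over consumption growth g are all assumed for illustration, as are the function names. Two payoffs with the same mean and the same variance are priced; only their covariance with m differs.

```python
def expectation(probs, values):
    return sum(p * v for p, v in zip(probs, values))

def covariance(probs, x, y):
    ex, ey = expectation(probs, x), expectation(probs, y)
    return sum(p * (a - ex) * (b - ey) for p, a, b in zip(probs, x, y))

def price_with_risk_correction(probs, m, payoff):
    """Return (E[m y], E[y]/R_f, Cov(m, y)) for a payoff y."""
    rf = 1.0 / expectation(probs, m)
    direct = expectation(probs, [mi * yi for mi, yi in zip(m, payoff)])
    discounted = expectation(probs, payoff) / rf
    correction = covariance(probs, m, payoff)
    return direct, discounted, correction

probs = [0.25, 0.5, 0.25]
growth = [0.97, 1.02, 1.06]                  # consumption growth in each state
beta, gamma = 0.98, 3.0
m = [beta * g ** (-gamma) for g in growth]   # high in bad states, low in good

procyclical = [0.8, 1.0, 1.2]                # pays most when consumption is high
countercyclical = [1.2, 1.0, 0.8]            # pays most when consumption is low

pro = price_with_risk_correction(probs, m, procyclical)
counter = price_with_risk_correction(probs, m, countercyclical)
```

The procyclical payoff has a negative risk correction and so the lower price, even though both payoffs are equally volatile.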
(3) Idiosyncratic risk is not priced

One interpretation of what we just saw is that assets with very volatile payoffs or returns need not have
large risk corrections. It is only the portion of payoff that is correlated with the stochastic discount
factor that implies a risk correction is required. The portion of payoff uncorrelated with m receives no
such risk correction, even if this volatility is large. (Where have we seen that before?)
If Cov(m,y) = 0, then $v(y) = \frac{E[y]}{R_f}$, no matter how volatile is y (how large is $\sigma^2(y)$). Recall $\sigma^2(y)$ has
no first order effect on consumption volatility.
For any random payoff y, consider the decomposition: $y = proj(y \mid m) + \varepsilon$ where $proj(y \mid m)$ is the
projection of y on m. This is the portion of y’s volatility that is perfectly correlated with m (it’s like
$\beta m$ where $\beta$ is a regression coefficient from a regression of y on m with no intercept).
$$proj(y \mid m) = \frac{E[my]}{E[m^2]}\, m,$$
further we know that $E[\varepsilon m] = 0$ by construction.
Then, the value of the projection of y on m is equal to the value of y itself.
$$v(proj(y \mid m)) = v\left(\frac{E[my]}{E[m^2]}\, m\right) = E\left[m^2\, \frac{E[my]}{E[m^2]}\right] = E[my] = v(y)$$
So, $\varepsilon$ must have a price of zero: $v(\varepsilon) = E[m\varepsilon] = 0$, because it is orthogonal to m.
Note that here the $\varepsilon$’s can be uncorrelated across assets (as is assumed in the APT) or not (as allowed in
the CAPM). But, since this is based on the absence of arbitrage and the APT is as well, and the CAPM
is an equilibrium model, we knew that would follow.
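The decomposition can be illustrated with a few lines of arithmetic. In this sketch the states, probabilities, discount factor values and payoff are made up, and the helper names are hypothetical; it projects a payoff on m and prices the two pieces separately.

```python
def expectation(probs, values):
    return sum(p * v for p, v in zip(probs, values))

def price(probs, m, payoff):
    """v(y) = E[m y]."""
    return expectation(probs, [mi * yi for mi, yi in zip(m, payoff)])

def project_on_sdf(probs, m, y):
    """Split y into proj(y|m) = (E[my]/E[m^2]) m and the residual."""
    coef = price(probs, m, y) / expectation(probs, [mi * mi for mi in m])
    proj = [coef * mi for mi in m]
    resid = [yi - pi for yi, pi in zip(y, proj)]
    return proj, resid

probs = [0.3, 0.4, 0.3]
m = [1.05, 0.95, 0.85]          # assumed discount factor in each state
y = [0.5, 1.0, 2.5]             # assumed payoff in each state

proj, resid = project_on_sdf(probs, m, y)
value_y = price(probs, m, y)
value_proj = price(probs, m, proj)
value_resid = price(probs, m, resid)
resid_variance = (expectation(probs, [r * r for r in resid])
                  - expectation(probs, resid) ** 2)
```

The residual is far from constant (`resid_variance` is positive) yet it is worth nothing, and the projection carries the entire value of y.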
[ 125 ]
(4) Expected Return/Beta Representations
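Assume there exists a riskless asset:
$$1 = E[m z_i] \;\; \forall i \;\Rightarrow\; 1 = E[m]E[z_i] + \mathrm{Cov}(m, z_i)$$
$$\Rightarrow\; E[z_i] = R_f - R_f\,\mathrm{Cov}(m, z_i) = R_f + \frac{\mathrm{Cov}(m, z_i)}{\mathrm{Var}(m)}\left(-\frac{\mathrm{Var}(m)}{E[m]}\right) = R_f + \beta_{i,m}\lambda_m$$
where $\lambda_m$ is the price per unit risk (and a function of Var(m)) and $\beta_{i,m}$ measures the quantity of risk for
asset i.

To relate this to the underlying variables of interest, recall $m = \beta\left(\dfrac{c_{t+1}}{c_t}\right)^{-\gamma}$ assuming power utility.

Performing a Taylor’s series expansion of
$$E[z_i] = R_f + \frac{\mathrm{Cov}(m, z_i)}{\mathrm{Var}(m)}\left(-\frac{\mathrm{Var}(m)}{E[m]}\right) \quad\text{with}\quad m = \beta\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}$$
around consumption growth, $c_{t+1}/c_t$, gives:
$$E[z_i] \approx R_f + \beta_{i,\Delta c}\,\lambda_{\Delta c} \quad\text{where}\quad \beta_{i,\Delta c} = \frac{\mathrm{Cov}(z_i, \Delta c)}{\mathrm{Var}(\Delta c)}, \quad \lambda_{\Delta c} = \gamma\,\mathrm{Var}(\Delta c), \quad\text{and}\quad \Delta c = \frac{c_{t+1}}{c_t}.$$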
In the continuous time limit this approximation becomes precise.
• This says that expected returns increase linearly with an asset’s beta with consumption growth. This
is the consumption CAPM relation. It falls directly out of the power utility framework.
• The price of risk in this case depends upon the risk aversion coefficient of the investor and the
variance of consumption growth (the fundamental risk facing the investor).
• The more risk averse agents are, or the more risky their consumption, the larger is the expected
return required to induce them to hold risky assets (assets that covary positively with consumption).
Now consider the following: If instead of the SDF (m) being a function of consumption growth we
assume $m = a + b z_{Mkt}$, then we know that for any asset i:
$$1 = E[m z_i] = E[a z_i + b z_{Mkt} z_i]$$
or, writing $\bar z$ for an expected return,
$$1 = a\bar z_i + b\,\bar z_i\,\bar z_{Mkt} + b\,\mathrm{Cov}(z_i, z_{Mkt})$$
or,
$$\bar z_i = \frac{1}{a + b\bar z_{Mkt}} - \frac{b}{a + b\bar z_{Mkt}}\,\mathrm{Cov}(z_i, z_{Mkt}) = R_f - bR_f\,\mathrm{Cov}(z_i, z_{Mkt})$$
recognizing that $R_f = 1/E[m]$.
This must hold for all assets i and also for $z_{Mkt}$. So,
$$\bar z_{Mkt} = R_f - bR_f\,\mathrm{Var}(z_{Mkt})$$
and,
$$b = -\frac{\bar z_{Mkt} - R_f}{R_f\,\mathrm{Var}(z_{Mkt})}$$
So,
$$\bar z_i = R_f + \frac{\mathrm{Cov}(z_i, z_{Mkt})}{\mathrm{Var}(z_{Mkt})}\,(\bar z_{Mkt} - R_f)$$
So if the SDF is a linear function of the returns on the market portfolio we get our familiar CAPM
pricing relation.
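The algebra above can be verified with a small numerical sketch. It assumes a hypothetical three-state economy (the probabilities, the market returns, the risk free rate, and the payoff x are all illustrative assumptions), picks a and b so that m = a + b z_Mkt prices both the riskless asset and the market, and then checks the CAPM relation for an arbitrary asset.

```python
# Hypothetical three-state economy; all numbers are illustrative assumptions.
probs = [0.3, 0.4, 0.3]
z_mkt = [0.85, 1.10, 1.30]   # gross market return by state
rf = 1.02                    # gross risk free return


def expect(x):
    return sum(p * xi for p, xi in zip(probs, x))


def cov(x, y):
    return expect([xi * yi for xi, yi in zip(x, y)]) - expect(x) * expect(y)


zbar_mkt = expect(z_mkt)
var_mkt = cov(z_mkt, z_mkt)

# b from the text; a then follows from E[m] = 1/rf.
b = -(zbar_mkt - rf) / (rf * var_mkt)
a = 1.0 / rf - b * zbar_mkt
m = [a + b * z for z in z_mkt]

# Price an arbitrary payoff with m and convert it to a gross return.
x = [3.0, 1.0, 4.0]
p_x = expect([mi * xi for mi, xi in zip(m, x)])
z_i = [xi / p_x for xi in x]

beta_i = cov(z_i, z_mkt) / var_mkt
lhs = expect(z_i)
rhs = rf + beta_i * (zbar_mkt - rf)
print(f"E[z_i] = {lhs:.6f}, CAPM prediction = {rhs:.6f}")
```

In this particular example m happens to be positive in every state; with a more dispersed market return the same linear form can turn negative in some states.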
One comment: Cochrane likes to use returns in specifying m since then it has a neat interpretation. But,
if $m = a + b z_{Mkt}$ it may not always be positive. He is a little loose with this: a stochastic discount factor
that is not strictly positive can correctly price the assets; it is just not the strictly positive SDF guaranteed
by the absence of arbitrage. If we instead rely on the law of one price we know there is a SDF, but it is
not restricted to be positive. In the absence of arbitrage there must be a strictly positive SDF
(for example, $m(s) = k e^{-a z_{Mkt}(s)} > 0$) giving a beta pricing representation.
(5) Mean-Variance Frontier

All assets priced by a SDF must obey:
$$\left|E[z_i] - R_f\right| \le \frac{\sigma(m)}{E[m]}\,\sigma(z_i)$$
This follows again from:

$$1 = E[m z_i] = E[m]E[z_i] + \rho_{m,z_i}\,\sigma(z_i)\,\sigma(m) \quad\text{(using correlation instead of covariance)}$$
$$\Rightarrow\; E[z_i] = R_f - \rho_{m,z_i}\,\frac{\sigma(m)}{E[m]}\,\sigma(z_i)$$
And since $-1 \le \rho \le 1$, the result follows.
We can see several things based on this simple relation:
(A) Since $-1 \le \rho \le 1$, means and variances of all assets must lie in the wedge shaped region
bounded by the minimum variance frontier derived in the development of the CAPM. Thus,
the frontier is of general interest without assuming M-V preferences.
[Figure: mean/standard deviation frontier. Vertical axis E[z], horizontal axis σ(z). The frontier is a wedge starting at Rf with slope σ(m)/E[m]. Asset i lies inside the wedge; its systematic risk and its idiosyncratic risk are marked (see note (F) below).]
(B) Only if $|\rho| = 1$ does asset i lie on the minimum variance efficient frontier. Thus all portfolios
on the frontier are perfectly correlated with the SDF, m. Returns on the upper limb have
$\rho_{m,z_i} = -1$, so are perfectly negatively correlated with m and thus, perfectly positively
correlated with consumption growth. They are maximally risky and so demand the highest
expected return per unit variance. The converse is true for assets on the lower limb.
(C) All frontier returns are also perfectly correlated with each other. They are all perfectly
correlated with m. Therefore, we know we can span the return of any frontier portfolio
using any two distinct frontier returns. For example, pick any single frontier return $z_{MV1}$
(not $R_f$). Note that $R_f$ is also on the frontier. Any other frontier return $z_{MV2}$ can be written
$z_{MV2} = R_f + a\,(z_{MV1} - R_f)$ for some constant a.
Show this is true as a homework problem.
(D) Since each return on the frontier is perfectly correlated with m, we can find constants a, b, d,
e such that for any minimum variance return:
$$m = a + b z_{MV} \quad\text{and}\quad z_{MV} = d + e m$$
What does this mean? It means that any mean-variance efficient portfolio contains all the
pricing information in m. For example, in the CAPM the market contains all the pricing
information in m, which we knew already. Its return is a sufficient statistic for m or marginal
utility. Thus zMkt can serve as m, provided that zMkt is on the frontier. Given any mean-variance
efficient return and the risk free rate, we can find a SDF that prices all assets, and vice versa.
See problems 1-3 in Cochrane.
(E) Given m, we can also construct a single-beta representation such that expected returns are
expressed in a single-beta model using the return on any mean-variance efficient portfolio
(except $R_f$):
$$E[z_i] = R_f + \beta_{i,MV}\,(E[z_{MV}] - R_f)$$
(F) All asset returns can be decomposed into a “priced” or systematic component and a “non-
priced” or idiosyncratic component. The priced component is perfectly correlated with m
and any frontier return and so this component would “plot on the frontier” (see picture on
previous page). The unpriced component is uncorrelated with m and generates no expected
return or risk adjustment. (Recall the decomposition we did in our development of the
CAPM.)
Note: Assets “inside” the frontier are not “worse” than assets on the frontier. The frontier
and its interior characterize equilibrium asset returns. Rational investors are happy to hold
all assets. You just don’t put all your wealth in an inefficient asset, but you are happy to put
small amounts of wealth in many such assets.
(6) The Slope of the Mean/Standard Deviation Frontier and “The Equity Premium Puzzle”
The ratio of the mean excess return to standard deviation is known as the Sharpe Ratio:
$$\text{Sharpe Ratio} = \frac{E[z_i] - R_f}{\sigma(z_i)}$$
This is more interesting and a better indication of performance than mean return alone. For example, if
you borrow at the risk free rate and invest the proceeds into some risky security, you increase expected
return but you don’t increase the Sharpe ratio since $\sigma_p$ increases at the same rate as $E[z_p] - R_f$.
The slope of the mean/standard deviation frontier is the maximal Sharpe ratio. It tells us how much
more mean return you can get by taking on added (priced) volatility.
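The leverage argument can be illustrated with a short sketch. The helper names and the numbers (a risky asset with a 9% mean return and 16% standard deviation, a 1% risk free rate, and a portfolio weight of 2 financed by borrowing) are assumptions chosen for illustration.

```python
def levered_stats(mean_ret, sigma, rf, w):
    """Mean and standard deviation of a portfolio holding weight w in the
    risky asset and 1 - w at the risk free rate (w > 1 means borrowing)."""
    mean_p = rf + w * (mean_ret - rf)
    sigma_p = abs(w) * sigma
    return mean_p, sigma_p


def sharpe(mean_ret, sigma, rf):
    """Mean excess return per unit of standard deviation."""
    return (mean_ret - rf) / sigma


mean_ret, sigma, rf = 0.09, 0.16, 0.01   # illustrative values
mean_p, sigma_p = levered_stats(mean_ret, sigma, rf, w=2.0)

print("unlevered:", mean_ret, sigma, sharpe(mean_ret, sigma, rf))
print("levered:  ", mean_p, sigma_p, sharpe(mean_p, sigma_p, rf))
```

Doubling the position doubles both the mean excess return and the standard deviation, so the ratio is unchanged.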
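Let $z_{MV}$ be a frontier return.
From
$$\left|E[z_i] - R_f\right| \le \frac{\sigma(m)}{E[m]}\,\sigma(z_i)$$
we can see that:
$$\frac{\left|E[z_i] - R_f\right|}{\sigma(z_i)} \le \frac{\sigma(m)}{E[m]} = \sigma(m)\,R_f \quad\text{for all assets } i.$$
For $z_{MV}$ (that is, for frontier returns), since their correlation with m is one in absolute value:
$$\frac{\left|E[z_{MV}] - R_f\right|}{\sigma(z_{MV})} = \frac{\sigma(m)}{E[m]} = \sigma(m)\,R_f.$$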
Thus the slope of the frontier is governed by the volatility of m and this slope we know determines the
risk premium.

Consider again the power utility framework: $u'(c) = c^{-\gamma}$ and so $m = \beta\left(\dfrac{c_{t+1}}{c_t}\right)^{-\gamma}$.
Then:
$$\frac{\left|E[z_{MV}] - R_f\right|}{\sigma(z_{MV})} = \frac{\sigma\!\left[\left(c_{t+1}/c_t\right)^{-\gamma}\right]}{E\!\left[\left(c_{t+1}/c_t\right)^{-\gamma}\right]}$$
$\sigma(m)$ increases if consumption is more volatile or if $\gamma$ is large.
If consumption growth is lognormal, “it can be shown,” using the transformation above, that:
$$\frac{\left|E[z_{MV}] - R_f\right|}{\sigma(z_{MV})} = \sqrt{e^{\gamma^2\sigma^2(\Delta \mathrm{Ln}(c_{t+1}))} - 1} \approx \gamma\,\sigma(\Delta \mathrm{Ln}(c_{t+1}))$$
This shows more directly that the slope of the mean/standard deviation frontier is higher if
consumption growth is more volatile or if risk aversion is higher.
In post-war data (50 years) for the U.S., $E[z_{Mkt}] \approx 9\%$, $\sigma(z_{Mkt}) \approx 16\%$, and $R_f \approx 1\%$ (all in real
terms).
Aggregate consumption growth has had a mean about equal to 1% and a standard deviation of about
1%. We can plug these values into the above equation to get:
$$\frac{9\% - 1\%}{16\%} = 0.50 \approx \gamma\,(0.01)$$
This implies a risk aversion coefficient of roughly 50!
This is an order of magnitude too high to be believable. The interpretation is that consumption is not
volatile enough to explain asset returns unless investors have risk aversion coefficients much larger than
we think they are. This is the point of the famous Mehra-Prescott paper.
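The arithmetic behind the puzzle can be reproduced directly. The sketch below uses the round numbers quoted above and treats the market's Sharpe ratio as if it were the slope of the frontier (an assumption; the frontier slope is at least this large). It solves for the implied risk aversion under both the linear approximation and the lognormal expression.

```python
import math

# Round numbers quoted in the text (real terms).
mean_mkt, sigma_mkt, rf = 0.09, 0.16, 0.01
sigma_dc = 0.01                      # std. dev. of consumption growth

sharpe_mkt = (mean_mkt - rf) / sigma_mkt

# Linear approximation: Sharpe ~ gamma * sigma(dc).
gamma_approx = sharpe_mkt / sigma_dc

# Lognormal expression: Sharpe = sqrt(exp(gamma^2 sigma^2) - 1),
# inverted as gamma = sqrt(ln(1 + Sharpe^2)) / sigma.
gamma_lognormal = math.sqrt(math.log(1.0 + sharpe_mkt ** 2)) / sigma_dc

print(f"Sharpe ratio:            {sharpe_mkt:.3f}")
print(f"gamma (approximation):   {gamma_approx:.1f}")
print(f"gamma (lognormal form):  {gamma_lognormal:.1f}")
```

Both versions give a coefficient in the high 40s to 50, so the conclusion does not hinge on the approximation.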
Possible Explanations:
1. People are much more risk averse than we think.
2. Stock returns are largely a result of unexpected good fortune over the last 50 years and are not
indicative of expectations.
3. There may be real problems with measures of consumption.
4. Something is deeply wrong with the unconditional model.
This question/issue has been a central focus of the asset pricing literature for the last 15-20 years.
(7) Exact Factor Structure in the APT

If returns are given by a 2-factor model (for example):
$$z_i = \bar z_i + \beta_{i1} f_1 + \beta_{i2} f_2$$
The law of one price implies there is a discount factor (SDF) linear in the factors that prices the assets:
$$m = a + b f_1 + d f_2.$$
It is also true that if returns are given as above, we can use 1 = E[mz] to derive the beta pricing relation.
The law of one price implies we can use m to price both sides of the return equation:
$$1 = E[z_i m] = \bar z_i\,E[m] + \beta_{i1}E[m f_1] + \beta_{i2}E[m f_2]$$
$$\bar z_i = R_f - R_f\,\beta_{i1}E[m f_1] - R_f\,\beta_{i2}E[m f_2]$$
$$= R_f + \beta_{i1}\left(-R_f E[m f_1]\right) + \beta_{i2}\left(-R_f E[m f_2]\right)$$
$$= R_f + \beta_{i1}\lambda_1 + \beta_{i2}\lambda_2$$
Now, form portfolios with returns:
$$z_1 = \bar z_1 + 1\cdot f_1 + 0\cdot f_2$$
$$z_2 = \bar z_2 + 0\cdot f_1 + 1\cdot f_2$$
The above pricing equation must hold for these portfolios, so:
$$\bar z_1 = R_f + \lambda_1 \;\text{ so }\; \lambda_1 = \bar z_1 - R_f, \quad\text{and}\quad \bar z_2 = R_f + \lambda_2 \;\text{ so }\; \lambda_2 = \bar z_2 - R_f$$
So, this representation takes us a ways towards developing the APT. See Cochrane Ch. 9.
(8) Random Walks and Time Varying Expected Returns

The predictability of future returns has become an issue of increasing interest. If we look across time
rather than across assets we can examine this issue. From the basic first-order condition:
$$p_t\,u'(c_t) = E_t\!\left[\beta\,u'(c_{t+1})\,(p_{t+1} + d_{t+1})\right]$$
Now, if investors are risk neutral (or if consumption is constant), $d_{t+1} = 0$, and $\beta \approx 1$ (the time horizon is
very short), then the FOC becomes:
$$p_t = E_t[p_{t+1}]$$
or, we can write:
$$p_{t+1} = p_t + \varepsilon_{t+1}$$
So, prices follow a martingale (or a random walk if we further assume that $\sigma^2(\varepsilon_{t+1})$ is constant across
time) and expected returns should be constant across time.
The basic FOC really tells us that prices (adjusting for dividends) scaled by marginal utility follow a
martingale. For short horizons prices should be close to a martingale since consumption and risk
aversion don’t vary much over a day. Returns over longer horizons may be predictable.
For long horizons we write our expected returns relation:
$$E_t[z_{t+1}] - R_{ft} = -\frac{\mathrm{Cov}_t(m_{t+1}, z_{t+1})}{E_t[m_{t+1}]}$$
$$= -\frac{\sigma_t(m_{t+1})}{E_t[m_{t+1}]}\,\sigma_t(z_{t+1})\,\rho_t(m_{t+1}, z_{t+1})$$
$$\approx -\gamma_t\,\sigma_t(\Delta \mathrm{Ln}(c_{t+1}))\,\sigma_t(z_{t+1})\,\rho_t(m_{t+1}, z_{t+1})$$
Now, looking at the last line, we see that variation in $E_t[z_{t+1}] - R_{ft}$ could come from variation in $\sigma_t(z_{t+1})$. However, this is not borne
out in the data, as variables correlated with mean changes are not correlated with variance changes (and
vice versa).
Predictable excess returns can then come from changes in aggregate risk, $\sigma_t(\Delta \mathrm{Ln}(c_{t+1}))$, or risk
aversion, $\gamma_t$, or perhaps from $\rho_t(m_{t+1}, z_{t+1})$.
The literature doesn’t consider $\rho_t(m_{t+1}, z_{t+1})$ much. But it is natural to think that $\sigma_t(\Delta \mathrm{Ln}(c_{t+1}))$ and
$\gamma_t$ change over the business cycle, which is the horizon over which returns are relatively predictable
(which is not short).
(9) Present Value Statement:

Price can be related to a stream of cash flows rather than just next period’s price/cash flow.
We can use a longer term objective function:
$$E_t \sum_{j=0}^{\infty} \beta^j\,u(c_{t+j})$$
Now, suppose that you can purchase the stream $\{d_{t+j}\}$ for $p_t$; the FOC gives us the pricing formula directly
(developed in the Lucas paper):
$$p_t = E_t \sum_{j=1}^{\infty} \beta^j\,\frac{u'(c_{t+j})}{u'(c_t)}\,d_{t+j} = E_t \sum_{j=1}^{\infty} m_{t,t+j}\,d_{t+j}$$
Now notice that this holds at t and at t+1, and that multiplying $p_{t+1}$ by $\beta\,\dfrac{u'(c_{t+1})}{u'(c_t)}$ puts the equation
for $p_{t+1}$ in terms of $m_{t,t+j}$ rather than $m_{t+1,t+1+j}$. We can then get to:
$$p_t = E_t[m_{t+1}\,(d_{t+1} + p_{t+1})]$$
from the multi-period relation.
So, the two date solution and the multi-period solution are equivalent.
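The step from the multi-period formula to the two-date formula can be written out explicitly. The following is a sketch that uses only the definitions above, with $m_{t+1} \equiv m_{t,t+1} = \beta\,u'(c_{t+1})/u'(c_t)$, and relies on the law of iterated expectations:

```latex
\begin{aligned}
p_{t+1} &= E_{t+1}\sum_{j=1}^{\infty} \beta^{j}\,
           \frac{u'(c_{t+1+j})}{u'(c_{t+1})}\, d_{t+1+j} \\[4pt]
m_{t+1}\,p_{t+1}
        &= E_{t+1}\sum_{j=1}^{\infty} \beta^{j+1}\,
           \frac{u'(c_{t+1+j})}{u'(c_{t})}\, d_{t+1+j}
         = E_{t+1}\sum_{j=2}^{\infty} m_{t,t+j}\, d_{t+j} \\[4pt]
E_t\!\left[m_{t+1}\,(d_{t+1} + p_{t+1})\right]
        &= E_t\!\left[m_{t,t+1}\, d_{t+1}\right]
         + E_t\sum_{j=2}^{\infty} m_{t,t+j}\, d_{t+j}
         = E_t\sum_{j=1}^{\infty} m_{t,t+j}\, d_{t+j} = p_t
\end{aligned}
```

The second line is where the rescaling by $\beta\,u'(c_{t+1})/u'(c_t)$ converts discount factors dated from t+1 into discount factors dated from t.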
From the multi-period relation we can write:
$$p_t = \sum_{j=1}^{\infty} \frac{E_t(d_{t+j})}{R_{ft,t+j}} + \sum_{j=1}^{\infty} \mathrm{Cov}_t(d_{t+j}, m_{t,t+j})$$
and the same message about pricing risk comes out.
10 Option Pricing
Background
• Options are side bets between investors concerning the future price level of an underlying asset
(which we will refer to as a stock for simplicity) relative to a fixed benchmark. More generally, an
option can be a bet on the future realization of any random outcome (the weather, for example). As
options are created when two investors take opposite sides of the “bet,” they are in zero net supply.
• A call option is a financial security which gives its owner the right (but not the obligation) to buy an
underlying asset (stock) for a pre-specified price (this is the fixed benchmark called the strike or
exercise price, k) on (or before) the expiration date (T) of the option contract. A “European” call
option can be exercised only on the expiration date whereas an American call can be exercised at any
time up to and including the expiration date.
• A put option gives its owner the right (but not the obligation) to sell an underlying asset at the strike
price on or before the expiration of the put option contract.
• Given the transactional complexity (at least the first time you discuss options) of a call option,
we will first identify several relevant prices in the hopes of avoiding confusion (we will concentrate
our discussion on call options):
    c or $c_t$ is the current price/value of the call option
    $c_T$ is the payoff/value of the call at the expiration date
    s or $s_t$ is the current price/value of the underlying asset
    $s_T$ is the price/value of the underlying asset at T
    k is the contractually specified strike price
Similarly,
    p or $p_t$ is the current price of the put option
    $p_T$ is the payoff of the put at expiration
• The basic option pricing literature concentrates on finding $c_t$ as a function of $s_t$ and other variables in
a variety of circumstances.
Payoffs at Expiration
• A call option has value to its owner at T only if the price of the underlying asset $s_T$ is above the strike
price k. Thus buying a call is a bet the price will end up above this benchmark. If it is, the payoff on
the call is $s_T - k$; if not, the payoff is not $s_T - k$ (which is negative if $s_T < k$) but rather 0 (because the
owner has a choice, an option). The owner of the option simply lets it expire unexercised when the
exercise value would be negative. Note that the buyer of an option purchases a choice (right), the
seller of an option incurs an obligation.
• We write the payoff of the call option as:
$$c_T = \begin{cases} s_T - k & \text{if } s_T > k \text{ (in the money)} \\ 0 & \text{if } s_T \le k \text{ (out of the money)} \end{cases}$$
or, $c_T = \mathrm{Max}(s_T - k, 0)$
Once again, the owner of a call option benefits if the price of the underlying asset ends up above the
strike price at the expiration of the option.
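Graphically, on the expiration date:
[Figure: call payoff $c_T$ plotted against $s_T$ in two panels, “Buy (Own)” and “Sell (Write)”. The owner’s payoff is zero up to k and rises one-for-one with $s_T$ above k; the writer’s payoff is the mirror image. A dotted line in each panel shifts the payoff by the premium $c_t$ (marked $-c_t$ for the owner and $c_t$ for the writer).]
The dotted line is the “profit” of the position which was commonly but incorrectly examined.
• A put option works in the “opposite” way. It allows you to sell for k. So, you (the owner) are
interested in (are betting on) states in which the true price is low (lower than k). Note, however, that
you do not have the same interest in a low price as does the seller of a call option.
$$p_T = \begin{cases} k - s_T & \text{if } s_T < k \\ 0 & \text{if } s_T \ge k \end{cases}$$
or, $p_T = \mathrm{Max}(k - s_T, 0)$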
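[Figure: put payoff $p_T$ plotted against $s_T$ in two panels, “Own” and “Sell (Write)”. The owner’s payoff starts at k when $s_T = 0$ and falls one-for-one to zero at $s_T = k$; the writer’s payoff is the mirror image. The premium $p_t$ is marked on the vertical axis.]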
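The call and put payoffs described above can be tabulated with a few lines of code. The function names and the strike of 50 are illustrative assumptions; the writer's payoff is simply the negative of the owner's.

```python
def call_payoff(s_T, k):
    """Payoff at expiration to the owner of a call: Max(s_T - k, 0)."""
    return max(s_T - k, 0.0)


def put_payoff(s_T, k):
    """Payoff at expiration to the owner of a put: Max(k - s_T, 0)."""
    return max(k - s_T, 0.0)


k = 50.0  # illustrative strike price
print(f"{'s_T':>6} {'own call':>9} {'write call':>11} {'own put':>8} {'write put':>10}")
for s_T in (0.0, 25.0, 50.0, 75.0, 100.0):
    c, p = call_payoff(s_T, k), put_payoff(s_T, k)
    print(f"{s_T:6.0f} {c:9.1f} {-c:11.1f} {p:8.1f} {-p:10.1f}")
```

The low-price rows show the asymmetry noted in the text: the writer of a call merely keeps a payoff of zero when the price is low, while the owner of a put gains k − s_T.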
Put-Call Parity
• The law of one price allows us to find a particular relation between the current value of a put, the
current value of a call, the current value of the underlying stock, and a risk free bond’s value.
• Consider the following two strategies:

(1) Buy a call and write (sell) a put on the same underlying, with the same strike price k, and
the same expiration date T;
(2) Buy the underlying stock and borrow so you must pay back k at time T.
• The time T payoffs of these 2 strategies as a function of the price of the underlying are:
[Figure: payoff diagrams against $s_T$. Strategy (1): the call (zero below k, rising above k) plus the written put (rising from −k at $s_T = 0$ to zero at k) sum to a straight line from −k through zero at $s_T = k$. Strategy (2): the stock ($s_T$) plus the borrowing (a constant −k) sum to the same straight line.]
• Now, since their future payoffs are equivalent, $c_T - p_T = s_T - k$, in all states of nature (note that the
only uncertainty for both positions concerns the future, time T, price of the underlying stock), their
current prices must also be equal.
So, $c_t - p_t = s_t - \dfrac{k}{R_f}$
Thus, $p_t = c_t - s_t + \dfrac{k}{R_f}$
so we can find $p_t$ if we know $c_t$; hence our concentration, as is traditional, on call options.
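Put-call parity can be checked in two steps: the payoff identity at expiration, and the implied put price. The sketch below is illustrative; the function name and the inputs (a call worth 20 on a stock at 50, strike 50, and a gross riskless return of 1.25 over the life of the option) are assumptions chosen so the numbers are round.

```python
def put_from_call(c_t, s_t, k, rf):
    """Put price implied by put-call parity: p_t = c_t - s_t + k / R_f,
    where rf is the gross riskless return over the life of the option."""
    return c_t - s_t + k / rf


# Step 1: the payoffs of (call - put) and (stock - k) agree in every state.
k = 50.0
for s_T in (0.0, 20.0, 50.0, 80.0, 120.0):
    call_minus_put = max(s_T - k, 0.0) - max(k - s_T, 0.0)
    stock_minus_loan = s_T - k
    assert call_minus_put == stock_minus_loan

# Step 2: equal payoffs imply equal current prices.
p_t = put_from_call(c_t=20.0, s_t=50.0, k=50.0, rf=1.25)
print("implied put price:", p_t)
```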
Restrictions on Option Prices:

Ingersoll presents several results – the more intuitive of which are replicated here – these illustrate some
important intuitions/features of option prices.
Proposition (1) – American and European put and call option prices are always at least weakly positive.
This comes from the limited liability of the payoff equations for options.
Proposition (2) – The final payoffs on options are weakly positive and as given above. Options are
never exercised out of the money as the exercise decision is a choice of the owner.
Proposition (3) – American calls and puts must always sell for at least their exercise values, e.g. $c_t \ge s_t - k$ for a call
(exercise “value” not “price”). Otherwise an immediate arbitrage is available.
Proposition (4) – For two American calls (puts) written on the same underlying with the same strike
price, the call (put) with the greater time to maturity is at least as valuable as the “shorter”
contract. With a “longer” American contract, the owner can do all she could with a shorter
contract and more (i.e. exercise before, on, or after the expiration of the shorter contract), the
current value of the extra right must be weakly positive.
Proposition (5) – An American call (put) is worth at least as much as a European call (put) with the
same characteristics. Again, the added rights have value. Here, the right is to exercise not only
at maturity, but before T as well.
Proposition (6) – Call option values are non-increasing in the exercise price and put option values are
non-decreasing in the exercise price.
Consider calls – if you have two calls, one with a low exercise price and one with a high exercise
price, then whenever the second can be profitably exercised, the first can also be profitably
exercised and for a strictly greater payoff. Also, the first can be profitably exercised in some
states in which the second with the higher exercise price cannot be. The first call must have a
higher current value.
Early Exercise of American Call Options

• We can show that one should never exercise an American call option early if the underlying asset
pays no dividends.
• From the payoff equation we know that at expiration $c_T = \mathrm{Max}(s_T - k, 0) \ge s_T - k$.
• Thus, the current price of the call must satisfy the restriction $c_t \ge s_t - \dfrac{k}{R_f}$. Since the risk free rate
is strictly positive (and thus $R_f > 1$), $c_t > s_t - k$. (The difference $c_t - (s_t - k) > 0$ is often called the
“option” value: the added value from keeping the option alive rather than exercising it. This result
strengthens Proposition 3 above by making the inequality strict.)
• This says: what you get if you exercise the call early is strictly less than what you get if you sell the
call. This suggests that in simple situations, we can concentrate not only on calls but on European calls.
CR&R consider American calls, which leads to some of the complications.
Binomial Option Pricing Example (from CR&R): The riskless hedge

• We could always proceed with $c_t = E[m_{t,T}\,c_T]$, but it is instructive to go through a more elaborate
analysis. In the end, of course, $c_t = E[m_{t,T}\,c_T]$ will hold.
• Suppose the distribution of the payoff on a stock over the next period is given by
$$s_0 = 50 \to \begin{cases} s_1 = 100 & \text{with probability } q \\ s_1 = 25 & \text{with probability } 1 - q \end{cases}$$
Also assume that $R_f = 1.25$, k = 50 (the call is currently at the money), and T = 1.
The question of the day: What is $c_0$?
To address this question we examine a portfolio formed by selling 3 calls, buying 2 shares of the
underlying stock, and borrowing $40 now (so at T we pay back $50 = $40 x Rf).

Payoff           Now         if s1 = 25    if s1 = 100
Sell 3 calls     3c0              0           -150
Buy 2 shares    -100             50            200
Borrow            40            -50            -50
Total            3c0 - 60         0              0

Since, in “all” states of nature the payoff on the portfolio is zero (riskless hedge), its current price must
be zero by the law of one price. So 3c0 - 60 = $0, or c0 = $20.
Alternatively: A replicating portfolio:

Buy 2 shares and borrow $40 (to again pay back $50) and compare this to buying 3 calls.

Payoff               Now      if s1 = 25    if s1 = 100
(1) Buy 2 shares    -100          50            200
    Borrow            40         -50            -50
    Total            -60           0            150

(2) Buy 3 calls     -3c0           0            150

The law of one price again says c0 = $20 (-3c0 = -$60) and here we see that an appropriately levered
position in the underlying (1) replicates the payoff on the call option (2). Replicating the payoff on the
option is the approach used in the early option pricing literature. This implies that the call option is
redundant, i.e., is already in the asset span. Ross (1976) looks at the usefulness of call options when this
is not true.
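The numerical example can be reproduced by solving for the position that replicates a single call. The sketch below uses the numbers from the example; the variable names are assumptions. Scaling the one-call position by three recovers the "2 shares, borrow $40" portfolio.

```python
# Inputs from the example.
s0, s_up, s_down = 50.0, 100.0, 25.0
rf, k = 1.25, 50.0

# Call payoffs in the two states.
c_up = max(s_up - k, 0.0)
c_down = max(s_down - k, 0.0)

# Replicate ONE call with `delta` shares and `bond` dollars in riskless bonds:
#   delta * s_up   + rf * bond = c_up
#   delta * s_down + rf * bond = c_down
delta = (c_up - c_down) / (s_up - s_down)
bond = (c_down - delta * s_down) / rf      # negative means borrowing

c0 = delta * s0 + bond                     # law of one price

print(f"shares per call: {delta:.4f}, bonds per call: {bond:.4f}")
print(f"call value c0:   {c0:.4f}")
print(f"for 3 calls:     {3 * delta:.4f} shares, {3 * bond:.4f} in bonds")
```

Note that the probability q never enters the calculation.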
The General Binomial Formula:

Assume: The price of the underlying stock follows a multiplicative binomial process over discrete
periods. The gross return on the stock can have one of two values – u (for up) with probability q and d
(for down) with probability 1 - q. So, if s is the current price of the stock, its distribution looks like:
$$s \to \begin{cases} us & \text{with probability } q \\ ds & \text{with probability } 1 - q \end{cases}$$
Also, assume the periodic risk free rate is constant: $R_f > 1$.

We require $u > R_f > d > 0$. Why?
One Period Analysis:
If the call expires in one period, then its price process looks like:
$$c \to \begin{cases} c_u = c_T(us) = \mathrm{Max}(us - k, 0) & \text{with probability } q \\ c_d = c_T(ds) = \mathrm{Max}(ds - k, 0) & \text{with probability } 1 - q \end{cases}$$
Now, as in the numerical example, form a portfolio containing $\Delta$ shares of stock and the (dollar)
amount B in riskless bonds. The initial cost is $\Delta s + B$ and its payoff distribution is:
$$\Delta s + B \to \begin{cases} \Delta us + R_f B & \text{with probability } q \\ \Delta ds + R_f B & \text{with probability } 1 - q \end{cases}$$
Now, choose $\Delta$ and B to replicate the payoff of the call:
$$\Delta us + BR_f = c_u \quad\text{and}\quad \Delta ds + BR_f = c_d$$
Solve the above to find:
$$\Delta = \frac{c_u - c_d}{us - ds} = \frac{c_u - c_d}{(u - d)s} \ge 0 \text{ (explain)}, \qquad B = \frac{uc_d - dc_u}{(u - d)R_f} = \frac{k}{R_f}\left[\frac{uc_d - dc_u}{(u - d)k}\right] \le 0$$
With $\Delta$ and B chosen this way, we have a replicating portfolio and $c = \Delta s + B$ as long as this is not less
than s – k. If this is less than s – k, then c = s – k (recall the American option cannot sell for less than
its exercise value or it would represent an arbitrage opportunity).
Then (unless this is less than s - k),
$$c = \Delta s + B = \frac{c_u - c_d}{u - d} + \frac{uc_d - dc_u}{(u - d)R_f} = \frac{\left(\dfrac{R_f - d}{u - d}\right)c_u + \left(\dfrac{u - R_f}{u - d}\right)c_d}{R_f} \quad\text{(again, or } s - k\text{)}$$
This is a risk neutral pricing equation where the risk neutral probabilities are given by:
$$p^* = \frac{R_f - d}{u - d}, \qquad 1 - p^* = \frac{u - R_f}{u - d}$$
Note that the risk neutral probabilities depend only on parameters of the stock price process and the risk
free rate. We could also represent the pricing equation using state prices or a state price density (or
stochastic discount factor) by adjusting for the actual probabilities and taking an expectation:
$c_t = E[m_{t,T}\,c_T]$. Question: where is q?
Finally,
$$c = \frac{p^* c_u + (1 - p^*)c_d}{R_f} = \frac{p^*\,\mathrm{Max}(us - k, 0) + (1 - p^*)\,\mathrm{Max}(ds - k, 0)}{R_f} \quad\text{(or } s - k\text{)}$$
$$\ge \frac{p^*(us - k) + (1 - p^*)(ds - k)}{R_f} = s - \frac{k}{R_f} > s - k \quad\text{if } R_f > 1 \text{, as above.}$$
Note: $\Delta \ge 0$ and $B \le 0$, so the replicating portfolio is again a levered position in the underlying. $\Delta$ is
referred to as the “delta” of the option: the change in option value for a given change in the price of
the underlying stock. The general form of the pricing formula is $\Delta$ times the current price of the
underlying less the amount borrowed (which we can represent as the present value of the exercise price
times some factor) necessary to form the replicating portfolio.
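The one-period formulas translate directly into a small function. This is a sketch under the assumptions of the text (multiplicative binomial stock price, constant gross riskless return); the function name and the returned dictionary are illustrative choices. The actual probability q is deliberately not an argument, which is one way to see the answer to "where is q?".

```python
def one_period_call(s, k, u, d, rf):
    """One-period binomial call value with its replicating portfolio.

    s: current stock price, k: strike, u/d: gross up/down returns,
    rf: gross riskless return. Requires u > rf > d > 0.
    """
    if not (u > rf > d > 0):
        raise ValueError("need u > rf > d > 0 to rule out arbitrage")
    c_u = max(u * s - k, 0.0)
    c_d = max(d * s - k, 0.0)
    delta = (c_u - c_d) / ((u - d) * s)            # shares held
    bond = (u * c_d - d * c_u) / ((u - d) * rf)    # dollars in bonds (<= 0)
    p_star = (rf - d) / (u - d)                    # risk neutral probability
    replicated = delta * s + bond
    risk_neutral = (p_star * c_u + (1.0 - p_star) * c_d) / rf
    price = max(risk_neutral, s - k)               # American floor, as in the text
    return {"delta": delta, "bond": bond, "p_star": p_star,
            "replicated": replicated, "risk_neutral": risk_neutral,
            "price": price}


# The numerical example from the text: u = 100/50, d = 25/50.
result = one_period_call(s=50.0, k=50.0, u=2.0, d=0.5, rf=1.25)
for name, value in result.items():
    print(f"{name:>12}: {value:.4f}")
```

The replicating-portfolio value and the risk neutral value agree, as the algebra above requires.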
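The Two-Period Problem
The stock price tree (up with probability q, down with probability 1-q at each step) is:
$$s \to \{us,\; ds\} \to \{u^2 s,\; uds,\; d^2 s\}$$
and the corresponding call price tree is:
$$c \to \{c_u,\; c_d\} \to \{c_{uu} = \mathrm{Max}(u^2 s - k, 0),\; c_{ud} = \mathrm{Max}(uds - k, 0),\; c_{dd} = \mathrm{Max}(d^2 s - k, 0)\}$$
Given what we have just derived and that, under our assumptions for Rf and the stock price process, the
environment does not change from period to period, we can automatically write:
$$c_u = \frac{p^* c_{uu} + (1 - p^*)c_{ud}}{R_f}$$
and,
$$c_d = \frac{p^* c_{ud} + (1 - p^*)c_{dd}}{R_f}$$
Careful, the forms are the same but not everything is. For example, examine the Δ at each node of the
tree using a numerical example.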
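One such numerical example is sketched below. It extends the earlier one-period numbers (s = 50, k = 50, u = 2, d = 0.5, Rf = 1.25) to two periods; these inputs and the helper name `replicate` are assumptions for illustration, and the early-exercise floor is ignored because it does not bind here.

```python
def replicate(s, c_up, c_down, u, d, rf):
    """Delta, bond position, and value of the portfolio that replicates
    next period's option values (c_up, c_down) from stock price s."""
    delta = (c_up - c_down) / ((u - d) * s)
    bond = (u * c_down - d * c_up) / ((u - d) * rf)
    return delta, bond, delta * s + bond


s, k, u, d, rf = 50.0, 50.0, 2.0, 0.5, 1.25

# Terminal payoffs.
c_uu = max(u * u * s - k, 0.0)
c_ud = max(u * d * s - k, 0.0)
c_dd = max(d * d * s - k, 0.0)

# Work backwards through the tree.
delta_u, bond_u, c_u = replicate(u * s, c_uu, c_ud, u, d, rf)
delta_d, bond_d, c_d = replicate(d * s, c_ud, c_dd, u, d, rf)
delta_0, bond_0, c_0 = replicate(s, c_u, c_d, u, d, rf)

print(f"up node:   delta = {delta_u:.2f}, B = {bond_u:.2f}, c_u = {c_u:.2f}")
print(f"down node: delta = {delta_d:.2f}, B = {bond_d:.2f}, c_d = {c_d:.2f}")
print(f"root:      delta = {delta_0:.2f}, B = {bond_0:.2f}, c   = {c_0:.2f}")
```

The formula for Δ is the same at every node, but its value differs across nodes, so the replicating portfolio has to be rebalanced as the stock price moves.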
At time 0 we could again form a portfolio of $\Delta$ shares and B in bonds to replicate the call value $c_u$ if $s \to us$
or $c_d$ if $s \to ds$. Again, the forms of $\Delta$ and B are unchanged but their values are not.
$$\Delta = \frac{c_u - c_d}{(u - d)s} \ge 0, \qquad B = \frac{uc_d - dc_u}{(u - d)R_f} \le 0$$
Simply substitute the new $c_u$ and $c_d$ from above to get the $\Delta s + B$ representation for the current call price.
As before, $c = \mathrm{Max}(\Delta s + B,\; s - k)$.
Alternatively, we can again write the current value of the call as:
$$c = \frac{p^* c_u + (1 - p^*)c_d}{R_f}$$
again, if this is greater than s-k; it will be s-k otherwise.
Substitution allows us to derive the risk neutral probability representation of the call price:
$$c = \frac{p^{*2} c_{uu} + 2p^*(1 - p^*)c_{ud} + (1 - p^*)^2 c_{dd}}{R_f^2}$$
$$= \frac{p^{*2}\,\mathrm{Max}(u^2 s - k, 0) + 2p^*(1 - p^*)\,\mathrm{Max}(uds - k, 0) + (1 - p^*)^2\,\mathrm{Max}(d^2 s - k, 0)}{R_f^2}$$
which we can see is always greater than s-k using the same process as was used above.
• The n period problem: By extension (of the last equation) we can write:
$$c = \frac{1}{R_f^n}\sum_{j=0}^{n} \frac{n!}{j!(n - j)!}\,p^{*j}(1 - p^*)^{n-j}\,\mathrm{Max}(u^j d^{n-j} s - k, 0)$$
• Now, let a be the smallest non-negative integer such that with a up moves and n-a down moves
the option finishes in the money:
$$u^a d^{n-a} s > k$$
• This can also be stated as: a is the smallest non-negative integer greater than
$\mathrm{Ln}(k/(sd^n))\,/\,\mathrm{Ln}(u/d)$.
If a > n the option can never finish in the money and c = 0.
If 0 ≤ a ≤ n then for all j < a, $\mathrm{Max}(u^j d^{n-j} s - k, 0) = 0$ and for all j ≥ a,
$\mathrm{Max}(u^j d^{n-j} s - k, 0) = u^j d^{n-j} s - k$, so we can simplify our formula as:
$$c = \frac{1}{R_f^n}\sum_{j=a}^{n} \frac{n!}{j!(n - j)!}\,p^{*j}(1 - p^*)^{n-j}\,[u^j d^{n-j} s - k]$$
Rewrite this as:
$$c = s\left[\sum_{j=a}^{n} \frac{n!}{j!(n - j)!}\,p^{*j}(1 - p^*)^{n-j}\,\frac{u^j d^{n-j}}{R_f^n}\right] - \frac{k}{R_f^n}\left[\sum_{j=a}^{n} \frac{n!}{j!(n - j)!}\,p^{*j}(1 - p^*)^{n-j}\right]$$
or,
$$c = s\,\Phi[a; n, p^{*\prime}] - \frac{k}{R_f^n}\,\Phi[a; n, p^*]$$
where $p^* = \dfrac{R_f - d}{u - d}$, $p^{*\prime} = \dfrac{u}{R_f}\,p^*$, and
a is the smallest non-negative integer greater than $\mathrm{Ln}(k/(sd^n))\,/\,\mathrm{Ln}(u/d)$.
Again, $c = \Delta s + B$, where the $\Phi$’s are the complementary binomial distribution function: the probability
of getting a or more u’s in n tries if the probability of u is $p^{*\prime}$ (or $p^*$).
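The n-period formula can be sketched in code and checked against the direct risk neutral sum. The function names are assumptions, and the search for a uses a simple loop on the defining inequality rather than the logarithm expression, which avoids rounding problems when the ratio of logarithms is exactly an integer. The early-exercise floor is ignored.

```python
import math


def binom_tail(a, n, p):
    """Complementary binomial distribution: P(at least a ups in n trials)."""
    return sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(a, n + 1))


def min_up_moves(s, k, u, d, n):
    """Smallest non-negative integer a with u^a d^(n-a) s > k (n + 1 if none)."""
    a = 0
    while a <= n and u ** a * d ** (n - a) * s <= k:
        a += 1
    return a


def call_closed_form(s, k, u, d, rf, n):
    """c = s * Phi[a; n, p*'] - k * rf^(-n) * Phi[a; n, p*]."""
    a = min_up_moves(s, k, u, d, n)
    if a > n:
        return 0.0
    p = (rf - d) / (u - d)
    p_prime = (u / rf) * p
    return s * binom_tail(a, n, p_prime) - k / rf ** n * binom_tail(a, n, p)


def call_by_sum(s, k, u, d, rf, n):
    """Direct risk neutral expectation of the terminal payoff, discounted."""
    p = (rf - d) / (u - d)
    total = sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
                * max(u ** j * d ** (n - j) * s - k, 0.0)
                for j in range(n + 1))
    return total / rf ** n


two_period = call_closed_form(50.0, 50.0, 2.0, 0.5, 1.25, n=2)
ten_period = call_closed_form(50.0, 50.0, 1.2, 0.9, 1.02, n=10)
print(two_period, call_by_sum(50.0, 50.0, 2.0, 0.5, 1.25, n=2))
print(ten_period, call_by_sum(50.0, 50.0, 1.2, 0.9, 1.02, n=10))
```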
The Continuous Time Limit:
• Here we let the “n” from the n-period problem get large. In doing so, we want to let the length of a
period of time go to zero. We need to take some care in doing this so that we don’t wind up with
ridiculous parameter values that say the stock price is expected to change by 20% over an instant in
time rather than over a year’s time.
• Let h represent the elapsed time between successive stock price changes (this is what we will let go to
zero), let T be the fixed length of calendar time to expiration (fixed number of “units” of time),
and let n be the number of periods of length h prior to expiration (so h = T/n). We want to see what
happens as $n \to \infty$ or $h \to 0$.
• We first need to adjust Rf. Rf is one plus the rate of interest over a “unit” of calendar time, so over T
units, $R_f^T$ is the riskless return.
• Denote by $\hat R_f$ one plus the riskless rate over a period of length h. Then, over the time to expiration
there are n such periods. So, $\hat R_f^n$ is the total return until expiration (date T).
• We therefore require: $\hat R_f^n = R_f^T$, so $\hat R_f = R_f^{T/n}$.
• We also need to adjust u and d for changing n as well.
• Over each discrete period, in the n period model we assumed the stock price would experience a one
plus rate of return of u with probability q and d with probability 1-q.
• It’s easier to work with Ln(u) or Ln(d) which gives the continuously compounded rate of return over
each period. The continuously compounded return is a binomial random variable with realization
Ln(u) with probability q and Ln(d) with probability (1-q) in each period.
• Over n periods the continuously compounded return is additive (s* denotes the stock price at expiration):
$$\mathrm{Ln}(s^*/s) = j\,\mathrm{Ln}(u) + (n - j)\,\mathrm{Ln}(d) = j\,\mathrm{Ln}(u/d) + n\,\mathrm{Ln}(d)$$
where j is the (random) number of up moves (an “up” is a 1) in the n periods until expiration.
• Then,
$$E[\mathrm{Ln}(s^*/s)] = \mathrm{Ln}(u/d)\,E(j) + n\,\mathrm{Ln}(d)$$
and,
$$\mathrm{Var}[\mathrm{Ln}(s^*/s)] = [\mathrm{Ln}(u/d)]^2\,\mathrm{Var}(j)$$
• The expected outcome for each trial is q (the probability of an “up”), so in n trials E(j) = nq.
• Since the variance of the outcome of each trial is $q(1 - q)^2 + (1 - q)(0 - q)^2 = q(1 - q)$,
Var(j) = nq(1-q).
So,
$$E[\mathrm{Ln}(s^*/s)] = [q\,\mathrm{Ln}(u/d) + \mathrm{Ln}(d)]\,n \equiv \hat\mu n$$
$$\mathrm{Var}[\mathrm{Ln}(s^*/s)] = q(1 - q)\,[\mathrm{Ln}(u/d)]^2\,n \equiv \hat\sigma^2 n$$
If $\mu T$ and $\sigma^2 T$ are the empirical values of the stock’s expected return and variance over the time until
expiration (T periods till expiration with $\mu$ as the expected return over each unit of time and $\sigma^2$ the
variance over each unit of time), then we want:
$$[q\,\mathrm{Ln}(u/d) + \mathrm{Ln}(d)]\,n \to \mu T \quad\text{and}\quad q(1 - q)\,[\mathrm{Ln}(u/d)]^2\,n \to \sigma^2 T \quad\text{as } n \to \infty$$
• These conditions will follow if we define:
$$u = e^{\sigma\sqrt{T/n}}, \qquad d = e^{-\sigma\sqrt{T/n}}, \qquad q = \frac{1}{2} + \frac{1}{2}\,\frac{\mu}{\sigma}\sqrt{\frac{T}{n}}$$
By substitution you can see that for any n, $\hat\mu n = \mu T$
and $\hat\sigma^2 n = \left[\sigma^2 - \mu^2\,(T/n)\right]T \to \sigma^2 T$ as $n \to \infty$.
So, the mean is correct for all n and the variance converges in the limit (as $n \to \infty$).
• Our n period option pricing formula is:
$$c = s\,\Phi[a; n, p^{*\prime}] - k\hat R_f^{-n}\,\Phi[a; n, p^*]$$
• Since $\hat R_f^{-n} = R_f^{-T}$,
$$c = s\,\Phi[a; n, p^{*\prime}] - kR_f^{-T}\,\Phi[a; n, p^*]$$
which means the binomial model converges to the Black-Scholes model, which is written
$$c = sN(x) - kR_f^{-T}N(x - \sigma\sqrt{T}) \quad\text{where}\quad x = \frac{\mathrm{Ln}(s/kR_f^{-T})}{\sigma\sqrt{T}} + \frac{1}{2}\sigma\sqrt{T},$$
if
$$\Phi[a; n, p^{*\prime}] \to N(x)$$
and,
$$\Phi[a; n, p^*] \to N(x - \sigma\sqrt{T}).$$
And they do, as shown by the central limit theorem results developed in the paper. Note again that the
option price can be expressed as delta times the stock price less the amount that must be borrowed to
replicate the option.
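The convergence can be illustrated numerically. The sketch below builds the binomial tree with $u = e^{\sigma\sqrt{T/n}}$, $d = 1/u$ and $\hat R_f = R_f^{T/n}$ as defined above, and compares it with the Black-Scholes expression written in terms of the gross rate Rf. The function names and the inputs (s = k = 100, Rf = 1.05, σ = 0.2, T = 1) are illustrative assumptions, and n is kept at 1000 or below so the binomial coefficients stay within floating point range.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s, k, rf, sigma, T):
    """c = s N(x) - k rf^(-T) N(x - sigma sqrt(T)), rf a gross rate per unit time."""
    pv_k = k * rf ** (-T)
    x = math.log(s / pv_k) / (sigma * math.sqrt(T)) + 0.5 * sigma * math.sqrt(T)
    return s * norm_cdf(x) - pv_k * norm_cdf(x - sigma * math.sqrt(T))


def binomial_call(s, k, rf, sigma, T, n):
    """European call on an n-step tree with the u, d, and per-period rate above."""
    h = T / n
    u = math.exp(sigma * math.sqrt(h))
    d = 1.0 / u
    rf_hat = rf ** h                      # gross riskless return per step
    p = (rf_hat - d) / (u - d)            # risk neutral probability
    total = 0.0
    for j in range(n + 1):
        payoff = max(u ** j * d ** (n - j) * s - k, 0.0)
        total += math.comb(n, j) * p ** j * (1.0 - p) ** (n - j) * payoff
    return total / rf_hat ** n


bs = black_scholes_call(100.0, 100.0, 1.05, 0.2, 1.0)
for n in (10, 100, 500):
    print(f"n = {n:4d}: binomial = {binomial_call(100.0, 100.0, 1.05, 0.2, 1.0, n):.4f}")
print(f"Black-Scholes:     {bs:.4f}")
```

As with the one-period formula, neither μ nor q appears anywhere in the pricing code; only σ, the riskless rate, and the contract terms matter.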
