J. Bertolai
Contents
Game Theory: Overview
  An example
  Economic Theory and Game Theory
References
Game Theory: Overview

An example

Remark (Game Theory). Game theory provides predictions about how individuals will behave under a given set of rules (an institution).

Prediction 1. (Confess, Confess) is a good prediction of the individuals' behavior.
the best for 1 is
  - confess, if 1 expects that 2 will not confess
  - not confess, if 1 expects that 2 will confess
the best for 2 is
  - confess, if 2 expects that 1 will not confess
  - not confess, if 2 expects that 1 will confess
Prediction 2. There are two good predictions of the individuals' behavior: (not Confess, Confess) and (Confess, not Confess), the "Nash equilibria".
                           Prisoner 2
                    not Confess   Confess
Prisoner 1   not Confess   −1,−1     −3,−3
             Confess       −3,−3     −6,−6

                  Prisoners' Dilemma
the best for 1 is
  - not confess, if 1 expects that 2 will not confess
  - not confess, if 1 expects that 2 will confess
the best for 2 is
  - not confess, if 2 expects that 1 will not confess
  - not confess, if 2 expects that 1 will confess
Prediction 3. There is only one good prediction of the individuals' behavior:
                           Prisoner 2
                    not Confess   Confess
Prisoner 1   not Confess    a,a       b,c
             Confess        c,b       d,d

                  Prisoners' Dilemma
Question 2. What can we expect about the prisoners' behavior?

the best response for 1 is
  - not confess, if 1 expects that 2 will not confess and a ≥ c
  - confess, if 1 expects that 2 will not confess and a ≤ c
  - not confess, if 1 expects that 2 will confess and b ≥ d
  - confess, if 1 expects that 2 will confess and b ≤ d
- the best response for 2 is analogous, given the symmetry of the game
                           Prisoner 2
                    not Confess   Confess
Prisoner 1   not Confess    a,a       b,c
             Confess        c,b       d,d

                  Prisoners' Dilemma
will be a good prediction (Nash equilibrium):
  - (nC, nC) if a − c ≥ 0
  - (nC, C) if a − c ≤ 0 and b − d ≥ 0
  - (C, nC) if a − c ≤ 0 and b − d ≥ 0
  - (C, C) if b − d ≤ 0
[Figure: Nash equilibrium regions in the plane (x, y) = (a − c, b − d)]
Formalization:

                           Prisoner 2
                    not Confess   Confess
Prisoner 1   not Confess    a,a       b,c
             Confess        c,b       d,d

                  Prisoners' Dilemma

the set of players (prisoners) is given by I := {1, 2}
the set of best responses of i to the conjecture s−i is given by

Theorem (Nash). In the n-player normal-form game G = {S1 , S2 , · · · , Sn ; u1 , u2 , · · · , un }, if n is finite and each Si is finite, then there is at least one Nash equilibrium (possibly involving mixed strategies).
Mechanism Design

Question 3. What should the prisoners' possible sentences (a, b, c and d) be when society wants them to reveal the truth (C,C) and
  - no one can be imprisoned without any confession for more than 1 year (a ≥ −1);
  - no one can be imprisoned on witness testimony for more than 2 years (b ≥ −2);
  - no one can be imprisoned on a confession for more than 10 years (c ≥ −10 and d ≥ −10); and
                    P2
               nC        C
P1      nC    a,a       b,c
        C     c,b       d,d

         Prisoners' Dilemma
That is, how should one design the optimal mechanism for truth revelation?
  - if there were evidence that both prisoners are guilty, society would choose (a, b, c, d) = (−1, −2, −10, −10)
  - since there is no evidence of guilt, no one will confess the crime if (a, b, c, d) = (−1, −2, −10, −10)
                    P2
               nC        C
P1      nC    a,a       b,c
        C     c,b       d,d

         Prisoners' Dilemma

[Figure: Nash equilibrium regions in the plane (x, y) = (a − c, b − d), with regions (nC, nC), (nC, C), (C, nC) and (C, C)]
Remark 1. "Society designs the mechanism to induce confession (C,C), but it may end up in a situation (nC,nC) worse than expected."

x = a∗ − c∗ = 9 > 0 and y = b∗ − d∗ = 0 ≥ 0

and, therefore, there are two Nash equilibria: (nC, nC) and (C, C).
[Figure: Nash equilibrium regions in the plane (x, y) = (a − c, b − d)]
- c1 : consumption in period t = 1
- c2 : consumption in period t = 2
- 2R units of resources at date 2 if liquidated in the long run
The payoff matrix is

                            Depositor 2
                     run              not run
Depositor 1   run    r,r              D, 2r − D
              not run   2r − D, D     R, R

                Bank Run Game
Graphically:

[Figure: equilibrium regions as a function of y = r − D]
Computationally:
def payoffs(x, y):
    """For each pair of strategies s1=x and s2=y, this function
    returns the payoff of player 1 and the payoff of player 2."""
    if x == 'correr':              # 'correr' = run on the bank
        if y == 'correr':
            z = [r, r]
        else:
            z = [D, 2*r - D]
    else:
        if y == 'correr':
            z = [2*r - D, D]
        else:
            z = [R, R]
    return z
- The following program uses the functions payoffs() and NE() to compute the set of Nash equilibria.
  * two cases will be studied
  * in both cases, the initial endowment is D = 1 and the long-run return is 20%, i.e., R = 1.2
  * in the first case, the short-run return is 10%, i.e., r = 1.1
  * in the second case, the short-run return is −10%, i.e., r = 0.9
D, r, R = 1, 1.1, 1.2
eqs = NE()
print('The set of Nash equilibria is:')
print('{ ', end='')
for i, eq in enumerate(eqs):
    if i < len(eqs) - 1:
        print('{}, '.format(eq), end='')
    else:
        print('{} '.format(eq), end='')
print('}')

D, r, R = 1, 0.9, 1.2
eqs = NE()
print('The set of Nash equilibria is:')
print('{ ', end='')
for i, eq in enumerate(eqs):
    if i < len(eqs) - 1:
        print('{}, '.format(eq), end='')
    else:
        print('{} '.format(eq), end='')
print('}')
Remark 2. The model's prediction in this case is:
Economics is a science which studies human behavior as a relationship between ends and scarce means which
have alternative uses
which specifies a consumption bundle (x_i^1 , x_i^2 ) ∈ R²₊ for each individual i ∈ {1, 2, . . . , I}. The allocation is said to be feasible if

    ∑_{i=1}^{I} x_i^j ≤ ω^j ,   ∀j ∈ {1, 2}.
distribution of resources among the individuals

Edgeworth Box (I = 2)

Pareto improvements: why expect that they are not exploited?
Market clearing: aggregate demand equals aggregate supply

    ∑_{i=1}^{I} x_i^{j∗} = ∑_{i=1}^{I} ω_i^j ,   j = 1, 2
Theorem 2 (The First Fundamental Welfare Theorem). Every allocation resulting from a competitive equilibrium is Pareto optimal.

if markets are complete:

Theorem 3 (The Second Fundamental Welfare Theorem). Every Pareto-optimal allocation can be achieved (sustained or decentralized) as a competitive equilibrium.

the outcome from strategy profile (not Confess, not Confess) Pareto dominates the outcome from (Confess, Confess)
Another equilibrium concept: Nash equilibrium
Strategic interdependence:
Each individual’s welfare depends not only on his own actions but also on the actions of the other
individuals
The actions that are best for an individual to take may depend on what he expects the other players to
do
Nash equilibrium
Mechanism Design
Question 5 (Choosing among games). How to design games in order to implement optimal allocations?
Principal-Agent problems
– moral hazard
– adverse selection
– fiscal policy
– monetary policy
– regulation
Cap. 1 - Static Games of Complete Information
Static Games of Complete Information:
Static:
Complete information:
Each individual’s welfare depends not only on his own actions but also on the actions of the other
individuals
The actions that are best for an individual to take may depend on what he expects the other players to
do
the combination of strategies chosen by players determines a payoff for each player
Actions and payoffs
If neither confesses, they will be convicted of a minor offense and sentenced to one month in jail
If both confess then both will be sentenced to jail for six months
If one confesses but the other does not, then the confessor will be released immediately but the other
will be sentenced to nine months in jail
Matrix representation
Implicitly assumed that each player does not like to stay in jail
                           Prisoner 2
                    not Confess   Confess
Prisoner 1   not Confess   −1,−1     −9,0
             Confess       0,−9      −6,−6

                  Prisoners' Dilemma
General case:
The normal form representation
(a) Players
(b) Strategies
(c) Payoffs
We write “player i” where i is the name of the player and I is the collection of names
The set Si is called strategy space and may have any structure: finite, countable, metric space, vector
space
The collection (si )i∈I = (s1 , · · · , sn ) is called a strategy profile and is denoted by s
Given an agent j and a profile s, we denote by (s−j ; s′j ) the new profile σ = (σi )i∈I defined by

    σi = s′j   if i = j
    σi = si    if i ≠ j
Payoffs:
– he plays strategy si
– and any other player j plays strategy sj
G = (Si , ui )i∈I
- Si is a set
- ui is a function from S = ∏_{j∈I} Sj to [−∞, +∞]
– rational players
– who are fully knowledgeable about
* the structure of the game
* and each others’ rationality?
Simultaneous moves: In a normal form game the players choose their strategies simultaneously
It suffices that each choose his or her action without knowledge of the others’ choices
– Prisoners’ dilemma: the prisoners may reach decisions at arbitrary times but it must be in separate
cells
– Bidders in a sealed-bid auction
Definition 7 (Strictly dominated strategies). Consider a normal form game (Si , ui )i∈I . The strategy s′i is strictly dominated by the strategy s′′i if, for every feasible combination of the other players' strategies, player i's payoff from playing s′i is strictly less than the payoff from playing s′′i . Formally,

    ui (s′i , s−i ) < ui (s′′i , s−i )   ∀ s−i ∈ ∏_{k≠i} Sk .
Remark 4 (The prisoners' dilemma). For a prisoner, playing not Confess is strictly dominated by playing Confess.

                           Prisoner 2
                    not Confess   Confess
Prisoner 1   not Confess   −1,−1     −9,0
             Confess       0,−9      −6,−6

                  Prisoners' Dilemma
Assume we are player 1
Iterated elimination
Question 8. Can we use the idea that "rational players do not play strictly dominated strategies" to find a solution to other games?
                       Player 2
                Left    Middle   Right
Player 1  Up     1, 0    1, 2     0, 1
          Down   0, 3    0, 1     2, 0
- for Player 1, neither Up nor Down is strictly dominated
- for Player 2, Right is strictly dominated by Middle
  – then Player 1 can eliminate Right from Player 2's strategy set
then both players can play the game as if it were the following game
                       Player 2
                Left    Middle
Player 1  Up     1, 0    1, 2
          Down   0, 3    0, 1
                       Player 2
                Left    Middle
Player 1  Up     1, 0    1, 2
Definition 8 (Iterated elimination of strictly dominated strategies). This process is called iterated elimination of strictly dominated strategies.

Proposition 1. The set of strategy profiles that survives iterated elimination of strictly dominated strategies is independent of the order of deletion.
Drawbacks:
(i) Each step requires a further assumption about what the players know about each other’s rationality
(ii) this process often produces a very imprecise prediction about the play of the game
L C R
U 0, 4 4, 0 5, 3
M 4, 0 0, 4 5, 3
D 3, 5 3, 5 6, 6
The process produces no prediction whatsoever about the play of the game
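Both claims above can be checked mechanically. Below is a sketch of IESDS for two-player games (the dictionary layout and strategy labels are my own, not from the notes): it repeatedly deletes strictly dominated pure strategies until none remains.

```python
def iesds(payoffs, S1, S2):
    """Iterated elimination of strictly dominated strategies (2 players).
    payoffs[(s1, s2)] = (u1, u2); returns the surviving strategy sets."""
    S1, S2 = list(S1), list(S2)
    changed = True
    while changed:
        changed = False
        for s in list(S1):      # is s strictly dominated for player 1?
            if any(all(payoffs[(t, s2)][0] > payoffs[(s, s2)][0] for s2 in S2)
                   for t in S1 if t != s):
                S1.remove(s)
                changed = True
        for s in list(S2):      # is s strictly dominated for player 2?
            if any(all(payoffs[(s1, t)][1] > payoffs[(s1, s)][1] for s1 in S1)
                   for t in S2 if t != s):
                S2.remove(s)
                changed = True
    return S1, S2

# the 2x3 game above collapses to the single profile (Up, Middle)
g1 = {('Up','Left'): (1,0), ('Up','Middle'): (1,2), ('Up','Right'): (0,1),
      ('Down','Left'): (0,3), ('Down','Middle'): (0,1), ('Down','Right'): (2,0)}
print(iesds(g1, ['Up','Down'], ['Left','Middle','Right']))   # (['Up'], ['Middle'])

# the 3x3 game above survives intact: no strategy is strictly dominated
g2 = {('U','L'): (0,4), ('U','C'): (4,0), ('U','R'): (5,3),
      ('M','L'): (4,0), ('M','C'): (0,4), ('M','R'): (5,3),
      ('D','L'): (3,5), ('D','C'): (3,5), ('D','R'): (6,6)}
print(iesds(g2, ['U','M','D'], ['L','C','R']))
```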
Question 9. Is there a stronger solution concept than IESDS which produces much tighter predictions in a very broad class of games?
Motivation:
Suppose that game theory makes a unique prediction about the strategy each player will choose
in order for this prediction to be compatible with incentives (or correct) it is necessary that
- each player's predicted strategy must be that player's best response to the predicted strategies of the other players
A strategy profile s∗ = (s∗i )i∈I is a Nash equilibrium of G if for each player i, the strategy s∗i is player
i’s best response to the strategies specified in s∗ for the other players.
Interpretation

If the theory offers as a prediction a strategy profile that is not a Nash equilibrium, then there exists at least one player that will have an incentive to deviate from the theory's prediction.
Remark 7. If a convention is to develop about how to play a given game, then the strategies prescribed by the convention must be a Nash equilibrium; otherwise at least one player will not abide by the convention.
Examples
In a 2-player game we can compute the set of NE as follows: for each strategy of the opponent, underline the payoff(s) of the player's best response(s). A pair of strategies (profile) is a NE if both corresponding payoffs are underlined in the matrix.
L C R
U 0, 4 4, 0 5, 3
M 4, 0 0, 4 5, 3
D 3, 5 3, 5 6, 6
                       Player 2
                Left    Middle   Right
Player 1  Up     1, 0    1, 2     0, 1
          Down   0, 3    0, 1     2, 0
                           Prisoner 2
                    not Confess   Confess
Prisoner 1   not Confess   −1,−1     −9,0
             Confess       0,−9      −6,−6

                  Prisoners' Dilemma
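The underlining procedure just described can be sketched in code (the dictionary encoding is my own, not from the notes): mark each player's best-response payoffs and keep the cells marked twice.

```python
def pure_nash(payoffs, S1, S2):
    """Pure-strategy Nash equilibria by the 'underlining' method:
    payoffs[(s1, s2)] = (u1, u2)."""
    # player 1's best payoff against each column, player 2's against each row
    best1 = {s2: max(payoffs[(s1, s2)][0] for s1 in S1) for s2 in S2}
    best2 = {s1: max(payoffs[(s1, s2)][1] for s2 in S2) for s1 in S1}
    return [(s1, s2) for s1 in S1 for s2 in S2
            if payoffs[(s1, s2)][0] == best1[s2]      # u1 underlined
            and payoffs[(s1, s2)][1] == best2[s1]]    # u2 underlined

pd = {('nC','nC'): (-1,-1), ('nC','C'): (-9,0),
      ('C','nC'): (0,-9), ('C','C'): (-6,-6)}
print(pure_nash(pd, ['nC','C'], ['nC','C']))   # [('C', 'C')]
```

Applied to the prisoners' dilemma above it recovers (Confess, Confess) as the unique pure-strategy NE.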
Proposition 2. If IESDS eliminates all but the strategy profile s∗ = (s∗i )i∈I , then s∗ is the unique NE of the game.

Theorem 4. If the strategy profile s∗ is a NE, then s∗ survives iterated elimination of strictly dominated strategies.

there can be strategy profiles that survive IESDS but which are not NE

Question 10. Is NE too strong? Can we be sure that a Nash equilibrium exists?
A classic example:
The battle of the sexes
A man (Pat) and a woman (Chris) are trying to decide on an evening’s entertainment
while at separate workplaces, Pat and Chris must choose to attend either the opera or a rock concert

both players would rather spend the evening together than apart
                   Pat
             Opera    Rock
Chris  Opera    2,1     0,0
       Rock     0,0     1,2
In some games with multiple NE, one equilibrium stands out as the compelling solution
Theory’s effort:
identify such a compelling equilibrium in different classes of games
In the example above,
1.2 Applications
Cournot model of duopoly
– we assume c < a
The strategies available to each firm are the different quantities in Qi = [0, ∞); an element of Qi is denoted qi
we have

    πi (qi ; qj∗ ) = qi [(a − c − qj∗ ) − qi ]   if qi < a − qj∗
    πi (qi ; qj∗ ) = −c qi                       if qi ≥ a − qj∗
    qi∗ = (1/2)(a − qj∗ − c)
which yields
    qi∗ = (a − c)/3 ,   ∀i ∈ I
obs.: this is consistent with the assumption qi∗ ∈ (0, a − c)
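A quick numerical sanity check of this equilibrium (the parameter values a = 10, c = 1 are illustrative, not from the notes): iterating the best-response map converges to qi∗ = (a − c)/3.

```python
a, c = 10.0, 1.0        # illustrative parameters with c < a

def best_response(qj):
    # maximizer of qi * [(a - c - qj) - qi] over qi >= 0
    return max(0.0, (a - qj - c) / 2)

q = 0.0
for _ in range(100):    # best-response iteration contracts to the fixed point
    q = best_response(q)

print(q, (a - c) / 3)   # both close to 3.0
```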
Interpretation
Each firm would like to be a monopolist in this market
it would choose qi to maximize πi (qi , 0). The solution is

    qm = (a − c)/2
- if the two firms colluded and split the monopoly quantity, q1 + q2 = qm
– at this price, each firm has an incentive to deviate by increasing the production
– in spite of the fact that such a deviation drives down the market price, the profit obtained still
increases
Graphical solution
    R2 (q1 ) = (1/2)(a − q1 − c)

Likewise,

    R1 (q2 ) = (1/2)(a − q2 − c)
The two best response functions intersect only once,
a−c
Proposition 3. The monopoly quantity qm = (a − c)/2 strictly dominates any higher quantity.

We can then consider the game G^(3) = (Q_i^(3) , πi )i∈I with

    Q_i^(3) = [0, qm ]

while

    πi (qm + x, qj ) = [qm + x] [ (a − c)/2 − x − qj ]
                     = πi (qm , qj ) − x(x + qj )
Proposition 4. Given that quantities exceeding qm = (a − c)/2 have been eliminated, the quantity qm /2 strictly dominates any lower quantity.

Formally,

Proof.

    πi (qm /2, qj ) = (qm /2) [ (3/4)(a − c) − qj ]

and

    πi (qm /2 − x, qj ) = [qm /2 − x] [ (3/4)(a − c) + x − qj ]
                        = πi (qm /2, qj ) − x [ (a − c)/2 + x − qj ]
After these two steps, the quantities remaining in each firm's strategy space are those in the interval

    [ (a − c)/4 , (a − c)/2 ]

in the limit (we need countably many steps), these intervals converge to the single point qi∗ = (a − c)/3, but that's it
Remark 10. IESDS yields only the imprecise prediction that each firm's quantity will not exceed the monopoly quantity.
Q−i : sum of the quantities chosen by the firms other than i

    πi (qi ; Q−i ) = qi (a − qi − Q−i − c)   if qi + Q−i < a
    πi (qi ; Q−i ) = −c qi                   if qi + Q−i ≥ a

In effect, we know that Q−i ∈ [0, 2qm ] = [0, a − c]. Fix qi ∈ [0, qm ] and recall that

    πi (qi ; Q−i ) = qi (a − qi − Q−i − c)   if qi + Q−i < a
    πi (qi ; Q−i ) = −c qi                   if qi + Q−i ≥ a.
Bertrand (1883) suggested that firms actually choose prices, rather than quantities as in Cournot’s model
Remark 11. This is an unrealistic demand function.
Nash equilibrium:
the price pair (p∗1 , p∗2 ) is a Nash equilibrium if

    p∗i = (1/2)(a + b p∗j + c)

i.e.,

    p∗1 = (1/2)(a + b p∗2 + c)   and   p∗2 = (1/2)(a + b p∗1 + c)

if b < 2 then the unique Nash equilibrium is

    p∗1 = p∗2 = (a + c)/(2 − b)
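This fixed point can be checked numerically (the parameters a = 1.0, b = 0.5, c = 0.2 are illustrative assumptions, not from the notes):

```python
a, b, c = 1.0, 0.5, 0.2   # illustrative parameters with b < 2

def best_response(pj):
    # each firm's optimal price given the rival's price pj
    return (a + b * pj + c) / 2

p1 = p2 = 0.0
for _ in range(200):      # contraction since b/2 < 1
    p1, p2 = best_response(p2), best_response(p1)

print(p1, (a + c) / (2 - b))   # both close to 0.8
```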
Final-offer arbitration
the parties believe that x is randomly distributed according to a probability measure µ on the Borel sets of [0, 1)
F : [0, 1) → [0, 1] is differentiable, with derivative f
We assume that
FOC for the union’s problem
Therefore,
wu∗ + wf∗
1
F =
2 2
The average of the offers must equal
wu∗ + wf∗
1
F =
2 2
1
wu∗ − wf∗ = w∗ +w∗
u f
f 2
An example:

Suppose the arbitrator's preferred settlement is normally distributed with mean m and variance σ², i.e.,

    f (x) = (1/√(2πσ²)) exp( −(x − m)²/(2σ²) )

Then

    (wu∗ + wf∗ )/2 = m   and   wu∗ − wf∗ = 1/f (m) = √(2πσ²)
In equilibrium,
- the parties' offers are centered around m
- neither party can afford to make an offer far from the mean
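The normal example can be verified numerically; m = 50 and σ = 10 below are illustrative values, not from the notes.

```python
import math

m, sigma = 50.0, 10.0     # illustrative mean and standard deviation

def f(x):
    # Normal(m, sigma^2) density
    return math.exp(-(x - m)**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

gap = 1 / f(m)            # equilibrium distance between the offers
w_u = m + gap / 2         # union's offer
w_f = m - gap / 2         # firm's offer

print((w_u + w_f) / 2)                                       # the offers are centered at m
print(abs(gap - math.sqrt(2 * math.pi * sigma**2)) < 1e-9)   # True
```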
each summer, all the farmers graze their goats on the village green
during the spring, the farmers simultaneously choose how many goats to own
Let
v(G): the value (per goat) to a farmer of grazing a goat on the green when the total number of goats is G
34
v : [0, Gmax ] → R+

the strategy space is Gi = [0, ∞) (we could have chosen Gi = [0, Gmax ])

the payoff to farmer i is

    πi (gi , g−i ) = gi v(gi + σ[g−i ]) − c gi

where σ[g−i ] = ∑_{k≠i} gk
    v(G∗ ) + (1/n) G∗ v′(G∗ ) − c = 0

where G∗ denotes ∑_{i∈I} gi∗
Social optimum
A social planner decides how many goats the “society” should graze on the village green
35
the planner should solve

    max_{G≥0} { G v(G) − G c }

the FOC is

    v(Gs ) + Gs v′(Gs ) − c = 0
Remark 12. Too many goats are grazed in the Nash equilibrium, compared to the social optimum.

When a farmer considers the effect of adding one more goat, he focuses on his own payoff.
He does not care about the effect of his action on the other farmers
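With a concrete value function the overgrazing result can be checked directly; the linear specification v(G) = a − G and the numbers below are assumptions for illustration only.

```python
a, c, n = 10.0, 2.0, 5    # illustrative: v(G) = a - G, marginal cost c, n farmers

# symmetric Nash equilibrium total: solves v(G) + (G/n) v'(G) - c = 0 with v' = -1
G_nash = n * (a - c) / (n + 1)
# social optimum: solves v(G) + G v'(G) - c = 0
G_social = (a - c) / 2

print(G_nash, G_social)   # the equilibrium total exceeds the social optimum
```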
– Each player has a penny and must choose whether to display it with heads or tails facing up
– If the two pennies match then player i2 wins player i1 ’s penny
– If the pennies do not match then i1 wins i2 ’s penny
                       Player i2
                Heads       Tails
Player i1  Heads   −1,1       1,−1
           Tails    1,−1     −1,1
Proposition 6. There is no Nash equilibrium in pure strategies.
Poker, battle
Mixed strategies
Definition 11. A mixed strategy p = (p(si ))si∈Si of player i is a vector in R^Si satisfying

    p(si ) ≥ 0  ∀si ∈ Si   and   ∑_{si∈Si} p(si ) = 1
if p assigns probability 1 to some pure strategy ŝi , then p is denoted Dirac(ŝi ) or 1ŝi and (abusing notation) is identified with the pure strategy ŝi
Interpretation
A family p−i = (pj )j6=i of mixed strategies pj ∈ ∆(Sj ) can represent
Notation 5. The expected value of agent i’s payoff if he plays si believing that the other players will play
according to p−i is denoted by
ui (si , p−i )
and is defined by
    ui (si , p−i ) ≡ Ep−i [ui (si )] = ∑_{s−i ∈S−i} p−i (s−i ) ui (si , s−i ),   where p−i (s−i ) = ∏_{j≠i} pj (sj )
Notation 6. If pi is a mixed strategy in ∆(Si ) we let p = (pj )j∈I , and the expected value

    Ep [ui ] = ∑_{s∈S} ( ∏_{j∈I} pj (sj ) ) ui (si1 , . . . , sin )
             = ∑_{s∈S} pi1 (si1 ) · · · pin (sin ) ui (si1 , . . . , sin )

is denoted by ui (p). Observe that

    ui (p) = ∑_{si ∈Si} pi (si ) ui (si , p−i )
there is no belief that player i could hold about the strategies the other players will choose such that it would be optimal to play si when

    ∀ p−i ∈ ∏_{j≠i} ∆(Sj ),   si ∉ arg max { Ep−i [ui (s)] : s ∈ Si }

In other words, for every belief p−i that agent i could hold about the others, playing si is not optimal.
Proposition 7. Assume that the pure strategy si is strictly dominated by the pure strategy σi . Then there is no belief that player i could hold about the strategies the other players will choose such that it would be optimal to play si . More precisely, for every family p−i = (pj )j≠i of mixed strategies pj ∈ ∆(Sj ), we have

    ui (σi , p−i ) > ui (si , p−i ).

In this case, the strategy σi improves the expected payoff independently of the belief p−i agent i holds about the other players' actions.
    ui1 (B, pi2 ) = ui1 (1B , pi2 ) = 1 < 3/2 = ui1 (pi1 , pi2 )

(Footnote: sometimes one may find the notations pi1 = (1/2) Dirac(T ) + (1/2) Dirac(M ) or pi1 = (1/2) 1T + (1/2) 1M .)
Remark 14. The strategy B is strictly dominated by the mixed strategy pi1 = (1/2, 1/2, 0).

Remark 15. A given pure strategy can be a best response to a mixed strategy even if the pure strategy is not a best response to any other pure strategy.

                   Player i2
                L        R
Player i1  T     3,−     0,−
           M     0,−     3,−
           B     2,−     2,−

The pure strategy B is not a best response for player i1 to either L or R by player i2, but B is the best response for player i1 to the mixed strategy pi2 by player i2 provided that

    1/3 < pi2 (L) < 2/3
Definition 13. A profile of mixed strategies p∗ = (p∗i )i∈I is a Nash equilibrium of the game G if each player's mixed strategy is a best response to the other players' mixed strategies:

    ∀i ∈ I,  p∗i ∈ arg max { ui (pi , p∗−i ) : pi ∈ ∆(Si ) }.
The family p−i = (pj )j6=i represents player i’s uncertainty about which strategy each player j will choose
fix a family pi−i = (pij )j≠i of mixed strategies representing player i's beliefs about the other players' strategies

denote by Si∗ (pi−i ) the set of player i's pure-strategy best responses, defined by
if pi is a mixed strategy in ∆(Si ), we denote by supp pi its support, defined by supp pi := {si ∈ Si : pi (si ) > 0}

p∗i is a best response to pi−i if and only if the support of p∗i is a subset of the set of pure strategies that are best responses to pi−i , i.e.,

Theorem 7. A profile of mixed strategies p∗ = (p∗i )i∈I is a Nash equilibrium of the game G if and only if, for every player i, every pure strategy in the support of p∗i is a best response to the other players' mixed strategies.
Interpretation 8. Players
Matching pennies
Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1
41
Assume that player i1 believes that player i2 will play the mixed strategy pi2 = (q, 1 − q) and that i1 himself plays pi1 = (r, 1 − r); given this belief we have
                       Player i2
                Heads       Tails
Player i1  Heads   −1,1       1,−1
           Tails    1,−1     −1,1
Observe that
ui1 (pi1 , pi2 ) = (2q − 1) + r(2 − 4q)
- if q < 1/2 then r∗(q) = 1 and i1's best response is to play the pure strategy Heads
- if q > 1/2 then r∗(q) = 0 and i1's best response is to play the pure strategy Tails
- if q = 1/2 then r∗(q) = [0, 1] and any mixed strategy is a best response, i.e., i1 is indifferent between Heads and Tails
                       Player i2
                Heads       Tails
Player i1  Heads   −1,1       1,−1
           Tails    1,−1     −1,1
assume now that player i2 plans to choose a mixed strategy pi2 = (q, 1 − q), i.e.,

Observe that

    ui2 (pi1 , pi2 ) = q(4r − 2) + (1 − 2r)

- if r < 1/2 then q∗(r) = 0 and i2's best response is to play the pure strategy Tails
- if r > 1/2 then q∗(r) = 1 and i2's best response is to play the pure strategy Heads
- if r = 1/2 then q∗(r) = [0, 1] and any mixed strategy is a best response, i.e., i2 is indifferent between Heads and Tails
Player i2's best response (q∗(r), 1 − q∗(r)) to i1's strategy (r, 1 − r)

We can draw in the same picture the best response correspondence of each player.

The pair defined by p∗i1 = (r̂, 1 − r̂) and p∗i2 = (q̂, 1 − q̂) is a Nash equilibrium if and only if

    r̂ ∈ r∗(q̂)   and   q̂ ∈ q∗(r̂)
The unique Nash equilibrium of Matching Pennies is then p∗i1 = p∗i2 = (1/2, 1/2).
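This equilibrium, both players putting probability 1/2 on Heads, can be verified directly in a few lines (the dictionary encoding is my own):

```python
# Payoffs of player i1 in Matching Pennies (i2's payoffs are the negatives).
U1 = {('H','H'): -1, ('H','T'): 1, ('T','H'): 1, ('T','T'): -1}

def eu1(p1_heads, p2_heads):
    # expected payoff of i1 when i1 plays Heads with prob p1_heads, i2 with p2_heads
    return sum(U1[(s1, s2)]
               * (p1_heads if s1 == 'H' else 1 - p1_heads)
               * (p2_heads if s2 == 'H' else 1 - p2_heads)
               for s1 in 'HT' for s2 in 'HT')

# against the 1/2-1/2 mix, every strategy of i1 earns the same payoff (0),
# so no deviation is profitable; by symmetry the same holds for i2
print(eu1(1.0, 0.5), eu1(0.0, 0.5), eu1(0.5, 0.5))
```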
(q, 1 − q) the mixed strategy in which Pat plays Opera with probability q
(r, 1 − r) the mixed strategy in which Chris plays Opera with probability r
3. Pat plays the mixed strategy where Opera is chosen with probability 1/3 and Chris plays the mixed
strategy where Opera is chosen with probability 2/3
                   Player i2
               Left     Right
Player i1  Up     x,−      y,−
           Down   z,−      w,−
We discuss the four following cases
(ii) x < z and y < w
let q ′ = (w − y)/(x − z + w − y)
In case (iii) U p is optimal for q > q ′ and Down for q < q ′ , whereas in case (iv) the reverse is true
For cases involving x = z or y = w the best response correspondences are L-shaped (two adjacent sides
of the unit square)
Then we can perform analogous computations and get the same 4 best-response correspondences
Fix any of the four best response correspondence for player i1
We obtain the following qualitative features that can result: There can be
Theorem 9 (Nash). Consider a game G = (Si , ui )i∈I with finitely many players. If for each player i the set of pure strategies Si is finite, then there exists at least one Nash equilibrium in mixed strategies.
General existence result
(1) the set Si is a compact, convex and non-empty subset of R^{ni} for some ni ∈ N
(2) the payoff function s ↦ ui (s) is continuous on S = ∏_{i∈I} Si
(3) for each s−i ∈ S−i , the function si ↦ ui (si , s−i ) is quasi-concave, in the sense that
Cap. 2 - Dynamic games of complete information
2.1 Dynamic games of complete and perfect information
Theory: Backwards induction
Important words
in this chapter we analyze dynamic games with complete and perfect information
An example:
Consider the following 2-move game
1. player i1 chooses between giving player i2 $1,000 and giving player i2 nothing
2. player i2 observes player i1 ’s move and then chooses whether or not to explode a grenade that will kill
both players
Suppose that player i2 threatens to explode the grenade unless player i1 pays the $1,000
but player i1 should not believe the threat, because it is not credible:
The framework
We analyze in this chapter the following class of dynamic games with complete and perfect information
1. player i1 chooses an action ai1 from a feasible set Ai1
2. player i2 observes ai1 and then chooses an action ai2 from a feasible set Ai2

and, moreover,
- all previous moves are observed before the next move is chosen
- the players' payoffs from each feasible combination of moves are common knowledge
Backwards induction
We solve a game from this class by backwards induction as follows:
when player i2 gets the move at the second stage of the game
assume that for each ai1 ∈ Ai1 , player i2 ’s optimization problem has a unique solution, denoted by Ri2 (ai1 )
player i1 will anticipate player i2 ’s reaction to each action ai1 that i1 might take
assume that the previous optimization problem for i1 also has a unique solution, denoted by a∗i1
Definition 14. The pair of actions (a∗i1 , Ri2 (a∗i1 )) is called the backwards-induction outcome of this game.
player i1 anticipates that player i2 will respond optimally to any action ai1 that i1 might choose, by
playing Ri2 (ai1 )
player i1 gives no credence to threats by player i2 to respond in ways that will not be in i2 ’s self-interest
when the second stage arrives
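The two-step procedure just described can be sketched on a small example; the action labels and payoff numbers below are invented for illustration, not from the notes.

```python
A1 = ['L', 'R']                        # stage-1 actions of player i1
A2 = ['l', 'r']                        # stage-2 actions of player i2
u = {('L','l'): (3, 1), ('L','r'): (1, 2),     # u[(a1, a2)] = (u1, u2)
     ('R','l'): (2, 1), ('R','r'): (0, 0)}

def R2(a1):
    # player i2's best reply at stage 2, given i1's observed move a1
    return max(A2, key=lambda a2: u[(a1, a2)][1])

# player i1 anticipates R2 and optimizes at stage 1
a1_star = max(A1, key=lambda a1: u[(a1, R2(a1))][0])
outcome = (a1_star, R2(a1_star))
print(outcome)    # ('R', 'l'): i1 avoids L because i2 would answer with r
```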
A 3-move game
Consider the following 3-move game: player i1 moves twice
1. Player i1 chooses L or R
2. Player i2 observes i1's choice and then chooses L′ or R′
3. Player i1 observes i2's choice (and recalls his own choice in the first stage) and then chooses L′′ or R′′
Let’s compute the backwards induction outcome of this game
at the second stage, player i2 anticipates that if the game reaches the third stage then i1 will play L′′
at the first stage, player i1 anticipates that if the game reaches the second stage then i2 will play L′
the first stage choice for player i1 is L, thereby ending the game
At some points in the history of the U.S. automobile industry, for example, General Motors has seemed to play such a leadership role.
πi (qi , qj ) = qi [P (Q) − c]
where
where
- P (Q) = [a − Q]+ is the market-clearing price when the aggregate quantity on the market is Q = qi1 + qi2
- c is the constant marginal cost of production (no fixed costs)
firm i1 should anticipate that the quantity choice qi1 will be met with the reaction Ri2 (qi1 )
The backwards induction outcome of the Stackelberg duopoly game is (qi∗1 , qi∗2 ) where

    qi∗1 = (a − c)/2   and   qi∗2 = Ri2 (qi∗1 ) = (a − c)/4
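A grid-search sanity check of this outcome (a = 10, c = 1 are illustrative values, not from the notes):

```python
a, c = 10.0, 1.0

def R2(q1):
    # follower's best reply (the Cournot reaction function)
    return max(0.0, (a - q1 - c) / 2)

def leader_profit(q1):
    Q = q1 + R2(q1)
    return q1 * (max(a - Q, 0.0) - c)

# the leader optimizes anticipating R2; a grid search approximates the argmax
grid = [i * (a - c) / 10000 for i in range(10001)]
q1_star = max(grid, key=leader_profit)

print(q1_star, R2(q1_star))   # close to (a - c)/2 = 4.5 and (a - c)/4 = 2.25
```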
Interpretation
in the Nash equilibrium of the Cournot game (simultaneous moves) each firm produces (a − c)/3
– thus aggregate quantity in the backwards induction outcome of the Stackelberg game, 3(a − c)/4, is
greater than in the Cournot-Nash equilibrium
– so the market clearing price is lower in the Stackelberg game
in the Stackelberg game, i1 could have chosen its Cournot quantity, (a − c)/3
in the Stackelberg game, i1 could have achieved its Cournot profit level but chose to do otherwise
so i1 ’s profit in the Stackelberg game must exceed its profit in the Cournot game
but because the market clearing price is lower in the Stackelberg game
therefore, the fact that i1 is better off implies that i2 is worse off
Remark 18. In game theory, having more information can make a player worse off. More precisely, having it known to the other players that one has more information can make a player worse off.
Leontief (1946) proposed the following model of the relationship between a firm and a monopoly union
– strictly increasing (i.e., R′ > 0)
– strictly concave (i.e., R′′ < 0) and
– satisfies Inada’s condition at 0 and ∞, i.e.,
First, we can characterize the firm’s best response L∗ (w) in stage 2 to an arbitrary wage demand w by
the union in stage 1
R′ (L∗ (w)) = w
Fixing the wage level w′, the optimal L is such that the isoprofit curve through (L, w′) is tangent to the constraint {(L, w′) : L ≥ 0}
Higher indifference curves represent higher utility levels for the union
The union can solve the firm’s second stage problem as well as the firm can solve it
The union should anticipate that the firm’s reaction to the wage demand w will be to choose the employ-
ment level L∗ (w)
The union would like to choose the wage demand w that yields the outcome (w, L∗ (w)) that is on the
highest possible indifference curve
The solution to the union's problem, w∗, is the wage demand such that the union's indifference curve through the point (L∗ (w∗ ), w∗ ) is tangent to the curve {(L∗ (w), w) : w > 0} at that point.
Inefficiency

Both the union's utility and the firm's profit would be increased if (L, w) were in the shaded region.
Repeated games
Espinosa and Rhee (1989) propose one answer to this puzzle
Based on the fact that the union and the firm negotiate repeatedly over time
There may exist an equilibrium of such a repeated game in which the union’s choice of w and the firm’s
choice of L lie in the shaded region
Sequential bargaining
And so on
Each offer takes one period, and the players are impatient
– they discount payoffs received in later periods by a factor δ ∈ (0, 1) per period
Discount factor
The discount factor δ reflects the time-value of money
A dollar received at the beginning of one period can be put in the bank to earn interest, say at rate r per
period
– So this dollar will be worth 1 + r dollars at the beginning of the next period
Equivalently, a dollar to be received at the beginning of the next period is worth only 1/(1 + r) of a dollar
now
Remark 19. The value today of a future payoff is called the present value of that payoff.
(1a) At the beginning of the first period, player i1 proposes to take a share s1 of the dollar, leaving 1 − s1 for
player i2
(2a) At the beginning of the second period, i2 proposes that player i1 take a share s2 of the dollar, leaving 1 − s2 for i2
Player i1 can receive s in the third period by rejecting i2 ’s offer of s2 this period
Thus, i1 will
– accept s2 if s2 ≥ δs
– reject s2 if s2 < δs
We assume that each player will accept an offer if indifferent between accepting and rejecting
(Footnote: st always goes to player i1 regardless of who made the offer.)
Remark 20. If play reaches the second period, player i2 will offer s∗2 and player i1 will accept.
Then i1 knows that i2 can receive 1 − s∗2 in the second period by rejecting i1 ’s offer of s1 this period
The value this period of receiving 1 − s∗2 next period is only δ(1 − s∗2 )
– which is less than the 1 − δ(1 − s∗2 ) = 1 − δ(1 − δs) available from the former option
Except that the exogenous settlement in step (3) is replaced by an infinite sequence of steps (3a), (3b),
(4a), (4b), and so on
Because the game could go on infinitely, there is no last move at which to begin such an analysis
A solution was proposed by Shaked and Sutton (1984)
The game beginning in the third period (should it be reached) is identical to the game as a whole (beginning
in the first period)
Suppose that there is a backwards induction outcome of the game as a whole in which players i1 and i2
receive the payoffs s and 1 − s
We can use these payoffs in the game beginning in the third period, should it be reached
And then work backwards to the first period, as in the 3-period model, to compute a new backwards
induction outcome for the game as a whole
In this new backwards induction outcome, i1 will offer the settlement (f (s), 1 − f (s)) in the first period
and i2 will accept, where
f (s) = 1 − δ(1 − δs)
Let sH be the highest payoff player i1 can achieve in any backwards induction outcome of the game as a
whole
Using sH as the third-period payoff to player i1 , this will produce a new backwards induction outcome in
which player i1 ’s first-period payoff is f (sH )
The only value of s that satisfies f (s) = s is 1/(1 + δ), which will be denoted by s∗
Actually we can prove that (s∗ , 1 − s∗ ) is the unique backwards-induction outcome of the game as a whole
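The fixed-point claim can be checked numerically (δ = 0.9 below is an illustrative discount factor):

```python
delta = 0.9               # illustrative discount factor

def f(s):
    # first-period share of i1 when the third-period share would be s
    return 1 - delta * (1 - delta * s)

s = 0.5
for _ in range(500):      # f is a contraction with modulus delta**2 < 1
    s = f(s)

print(s, 1 / (1 + delta))   # both close to 0.5263...
```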
The moves in all previous stages are observed before the next stage begins
– The game has imperfect information
1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
2. Players i3 and i4 observe the outcome of the first stage, (ai1 , ai2 ), and then simultaneously choose actions
ai3 and ai4 from feasible sets Ai3 and Ai4 , respectively
The feasible action sets of players i3 and i4 in the second stage, Ai3 and Ai4 , could be allowed to depend
on the outcome of the first stage, (ai1 , ai2 )
In particular, there may be values of (ai1 , ai2 ) that end the game
One could allow for a longer sequence of stages either by allowing players to move in more than one stage
or by adding players
The first step in working backwards from the end of the game involves solving a simultaneous-move game
between players i3 and i4 in stage 2, given the outcome of stage 1
We will assume that for each feasible outcome (ai1 , ai2 ) of the first game, the second-stage game that
remains between players i3 and i4 has a unique Nash equilibrium denoted by (âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))
If i1 and i2 anticipate that the second-stage behavior of i3 and i4 will be given by the functions âi3 and
âi4
Then the first-stage interaction between i1 and i2 amounts to the following simultaneous-move game
1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
2. Payoffs are
ui (ai1 , ai2 , âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))
Suppose (a∗i1 , a∗i2 ) is the unique Nash equilibrium of this simultaneous-move game
We will call
(a∗i1 , a∗i2 , a∗i3 , a∗i4 ), where a∗i3 ≡ âi3 (a∗i1 , a∗i2 ) and a∗i4 ≡ âi4 (a∗i1 , a∗i2 ),
the subgame-perfect outcome of this two-stage game
Players i1 and i2 should not believe a threat by players i3 and i4 that the latter will respond with actions
that are not a Nash equilibrium in the remaining second-stage game
Because when play actually reaches the second stage at least one of i3 and i4 will not want to carry out
such a threat exactly because it is not a best response
Suppose player i1 is also player i3 and that player i1 does not play a∗i1 in the first stage
Player i4 may then want to reconsider the assumption that player i3 (i.e., player i1 ) will play âi3 (ai1 , ai2 )
in the second stage
Bank runs
Two investors have each deposited D with a bank, which has invested these deposits (a total of 2D) in a long-term project
If the bank is forced to liquidate its investment before the project matures, a total of α(2D) can be recovered, where
1/2 < α < 1
If the bank allows the investment to reach maturity, the project will pay out a total of β(2D), where
β>1
There are two dates at which investors can make withdrawals from the bank
If both investors make withdrawals at date 1 then each receives αD and the game ends
If only one investor makes a withdrawal at date 1 then that investor receives D, the other receives (2α − 1)D, and the game ends
Finally, if neither investor makes a withdrawal at date 1 then the project matures and the investors make
withdrawal decisions at date 2
If both investors make withdrawals at date 2 then each receives βD > D and the game ends
If only one investor makes a withdrawal at date 2 then that investor receives (2β − 1)D > βD, the other
receives D, and the game ends
Finally if neither investor makes a withdrawal at date 2 then the bank returns βD to each investor and
the game ends
There is a unique Nash equilibrium in the date-2 game: both investors withdraw, leading to a payoff of (βD, βD)
Since there is no discounting, we can simply substitute this payoff into the normal-form game at date 1
Date 1
The original 2-period bank-runs game has two subgame perfect outcomes: both investors withdraw at date 1 (a bank run), and neither investor withdraws at date 1 while both withdraw at date 2
Thus this model does not predict when bank runs will occur, but it shows that runs can arise as an equilibrium phenomenon
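The backward-induction logic above can be sketched in a few lines. The parameter values are illustrative, and the payoff for the case in which exactly one investor withdraws at date 1 (the withdrawer receives D, the other (2α − 1)D) follows the standard specification of this model and is an assumption here:

```python
# Sketch: enumerate pure Nash equilibria of the bank-run game by backward
# induction. Parameters and the lone-withdrawer date-1 payoffs (D to the
# withdrawer, (2a-1)D to the other) are illustrative assumptions.
D, a, b = 1.0, 0.7, 1.5          # deposit, 1/2 < a < 1, b > 1

def pure_nash(payoffs):
    """payoffs[(r, c)] = (u1, u2); return the pure-strategy NE profiles."""
    rows = {r for r, _ in payoffs}
    cols = {c for _, c in payoffs}
    return [(r, c) for r, c in payoffs
            if payoffs[(r, c)][0] == max(payoffs[(rr, c)][0] for rr in rows)
            and payoffs[(r, c)][1] == max(payoffs[(r, cc)][1] for cc in cols)]

# Date-2 game: the project has matured ('w' = withdraw, 'n' = not).
date2 = {('w', 'w'): (b * D, b * D),
         ('w', 'n'): ((2 * b - 1) * D, D),
         ('n', 'w'): (D, (2 * b - 1) * D),
         ('n', 'n'): (b * D, b * D)}
ne2 = pure_nash(date2)                   # unique NE: both withdraw
cont = date2[('w', 'w')]                 # continuation payoff (bD, bD)

# Date-1 game: ('n','n') is followed by the date-2 equilibrium payoff.
date1 = {('w', 'w'): (a * D, a * D),
         ('w', 'n'): (D, (2 * a - 1) * D),
         ('n', 'w'): ((2 * a - 1) * D, D),
         ('n', 'n'): cont}
print(pure_nash(date1))   # [('w', 'w'), ('n', 'n')]: run and no-run
```

The two pure equilibria of the date-1 game are exactly the two subgame perfect outcomes discussed above.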
Tariffs and imperfect international competition
Consider two identical countries, each with a government that chooses a tariff, a firm that produces for home consumption and for export, and consumers who buy on the home market
If the total quantity on the market in country i is Qi , then the market-clearing price is
Pi (Qi ) = [a − Qi ]+ ≡ max{a − Qi , 0}
The firm in country i (called firm i) produces hi for home consumption and ei for export; the quantity on market i is therefore
Qi = hi + ej
The firms have a constant marginal cost c and no fixed costs (we assume that c < a)
Ci (hi , ei ) ≡ c(hi + ei )
Timing
1. The governments simultaneously choose tariff rates ti and tj
2. The firms observe the tariff rates and simultaneously choose quantities for home consumption and for export, (hi , ei )
Profit to firm i is πi (ti , tj , hi , ei , hj , ej ); the welfare of government i is the sum of consumers' surplus on market i, firm i's profit, and tariff revenue:
Wi (ti , tj , hi , ei , hj , ej ) ≡ (1/2)Qi² + πi (ti , tj , hi , ei , hj , ej ) + ti ej
Solução
Suppose the governments have chosen the tariffs ti1 and ti2
Assume that (h∗i1 , e∗i1 , h∗i2 , e∗i2 ) is a Nash equilibrium in the remaining game between firms i1 and i2
Then, for each i, (h∗i , e∗i ) must solve firm i's profit-maximization problem given the other firm's quantities, yielding
h∗i (ti ) = (a − c + ti )/3 and e∗i (tj ) = (a − c − 2tj )/3
In the Cournot game, both firms were choosing the quantity (a − c)/3,
– but this result was derived under the assumption of symmetric marginal costs
In the equilibrium described above, the governments’ tariff choices make marginal costs asymmetric
In equilibrium the function h∗i increases in ti and e∗j decreases (at a faster rate) in ti
Having solved the second-stage game that remains between the two firms after the governments choose
tariff rates
We can now represent the first-stage interaction between the two governments as the following simultaneous-
move game
First, the governments simultaneously choose tariff rates ti1 and ti2
We now solve for the Nash equilibrium of this game between the governments
Wi∗ (ti , tj ) ≡ Wi (ti , tj , h∗i (ti ), e∗i (tj ), h∗j (tj ), e∗j (ti ))
If (t∗i , t∗j ) is a Nash equilibrium of this game between governments then, for each i, the tariff t∗i must solve max{Wi∗ (ti , t∗j ) : ti ≥ 0}
Observe that if ti and t∗j belong to (0, (a − c)/2) then Wi∗ (ti , t∗j ) equals
[2(a − c) − ti ]²/18 + (a − c + ti )²/9 + (a − c − 2t∗j )²/9 + ti (a − c − 2ti )/3
A solution is
t∗i = (a − c)/3
for each i, independent of t∗j
In this model, choosing a tariff rate of (a − c)/3 is a dominant strategy for each government
We then obtain the firms' quantity choices in the second stage:
h∗i (t∗i ) = 4(a − c)/9 and e∗i (t∗j ) = (a − c)/9
so that the aggregate quantity on market i is Qi = 5(a − c)/9, below the zero-tariff quantity 2(a − c)/3
The consumers’ surplus on market i is lower when the governments choose their dominant strategy tariffs
than it would be if they chose zero tariffs
Joint welfare is strictly decreasing in each tariff, so (0, 0) solves
arg max{Wi∗1 (ti1 , ti2 ) + Wi∗2 (ti1 , ti2 ) : ti1 ≥ 0 and ti2 ≥ 0}
There is an incentive for the governments to sign a treaty in which they commit to zero tariffs
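A quick numerical check of the whole derivation, with illustrative values of a and c, confirms both the second-stage quantities and the dominant-strategy tariff:

```python
# Numerical sketch of the tariff game (a and c are illustrative values).
# Second stage: h_i = (a - c + t_i)/3, e_i = (a - c - 2*t_j)/3.
# First stage: each government's best tariff should be (a - c)/3
# regardless of the rival's tariff (a dominant strategy).
a, c = 10.0, 4.0

def second_stage(ti, tj):
    hi = (a - c + ti) / 3          # home quantity rises in own tariff
    ei = (a - c - 2 * tj) / 3      # exports fall in the rival's tariff
    return hi, ei

def welfare_i(ti, tj):
    hi, ei = second_stage(ti, tj)
    hj, ej = second_stage(tj, ti)
    Qi = hi + ej                               # total quantity on market i
    profit = (a - Qi - c) * hi + (a - (hj + ei) - c - tj) * ei
    return 0.5 * Qi ** 2 + profit + ti * ej    # CS + profit + tariff revenue

grid = [k * (a - c) / 2000 for k in range(1001)]   # tariffs in [0, (a-c)/2]
for tj in (0.0, 1.0, 2.5):                          # any rival tariff
    best = max(grid, key=lambda ti: welfare_i(ti, tj))
    assert abs(best - (a - c) / 3) < 0.01           # dominant strategy
print("t* =", (a - c) / 3)   # t* = 2.0
```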
Tournaments
The workers’ wages therefore can depend on their outputs but not directly on their effort levels
Suppose the boss decides to induce effort by having the workers compete in a tournament
The winner of the tournament is the worker with the higher output
u(w, e) = w − g(e)
The boss is player i1 whose action ai1 is choosing the wages to be paid in the tournament, wH and wL
There is no player i2
Workers observe the wages chosen in the first stage and then simultaneously choose actions ai3 and ai4 ,
namely effort choices ej1 and ej2
Since outputs (and so also wages) are functions not only of the players' actions but also of the noise terms εj1 and εj2 , we work with the players' expected payoffs according to the density f
Let (e∗j1 , e∗j2 ) be a Nash equilibrium of the remaining game between the workers
πj (wH , wL , ej , e∗k ) = wH Pr{yj (ej ) > yk (e∗k )} + wL Pr{yj (ej ) < yk (e∗k )} − g(ej )
= (wH − wL ) Pr{yj (ej ) > yk (e∗k )} + wL − g(ej )
The worker j chooses ej such that the marginal disutility of extra effort, g ′ (ej ), equals the marginal gain
from extra effort
Since worker j's output is yj (ej ) = ej + εj , we have
Prob{yj (ej ) > yk (e∗k )} = ∫R [1 − F (e∗k − ej + z)]f (z) dz
Differentiating and imposing symmetry (ej = e∗k = e∗ (wH , wL )) in the first-order condition, we get
(wH − wL ) ∫R f (z)² dz = g ′ (e∗ (wH , wL ))
– a bigger prize for winning (i.e., a larger value of wH − wL ) induces more effort
– when output is very noisy (f widely dispersed, so that ∫R f (z)² dz is small), induced effort is low, because the outcome of the tournament is likely to be determined by luck rather than effort
Suppose that if the workers agree to participate in the tournament (rather than accept alternative em-
ployment)
Then they will respond to the wages wH and wL by playing the symmetric Nash equilibrium previously
exhibited
We ignore the possibility of asymmetric equilibria and of an equilibrium with “corner” solutions
Suppose that the workers’ alternative employment opportunity would provide utility Ua
In the symmetric N E each worker wins the tournament with probability 1/2
Prob{yj (e∗ (wH , wL )) > yk (e∗ (wH , wL ))} = 1/2
If the boss intends to induce the workers to participate in the tournament then he must choose wages
(wH , wL ) that satisfy
(1/2)wH + (1/2)wL − g(e∗ (wH , wL )) ≥ Ua (IR)
The boss chooses wages to maximize expected profit
The participation constraint (IR) must be binding at the optimum, i.e., (w∗H , w∗L ) must solve the boss's expected-profit maximization with (IR) holding with equality
We denote by f ∗ the function mapping the prize δ ≡ wH − wL into the effort it induces:
∀δ ≥ 0, f ∗ (δ) = [g ′ ]−1 (δξ) where ξ = ∫R f (z)² dz
Assume that the low wage implied by the binding constraint (IR) is positive:
Ua + g(f ∗ (δ ∗ )) − δ ∗ /2 > 0
Writing the boss's expected profit as a function Ψ of the prize, the optimal prize δ ∗ satisfies
Ψ′ (δ ∗ ) = 0
which reduces to
g ′ (e∗ (w∗H , w∗L )) = 1
Remember that
(w∗H − w∗L ) ∫R f (z)² dz = g ′ (e∗ (w∗H , w∗L ))
Therefore the optimal wages satisfy
(w∗H − w∗L ) ∫R f (z)² dz = 1
w∗H + w∗L = 2Ua + 2g([g ′ ]−1 (1))
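The first-order condition for effort can be checked numerically under illustrative functional forms, g(e) = e²/2 and normally distributed noise; both forms are assumptions here, not taken from the text:

```python
# Sketch of the tournament FOC under the assumptions g(e) = e^2/2 and
# eps ~ N(0, sigma^2), so that int f(z)^2 dz = 1/(2*sigma*sqrt(pi)) and
# the symmetric equilibrium effort is e* = (wH - wL)/(2*sigma*sqrt(pi)).
import math

wH, wL, sigma = 3.0, 1.0, 1.0
xi = 1 / (2 * sigma * math.sqrt(math.pi))
e_star = (wH - wL) * xi                   # g'(e) = e, so e* = (wH - wL)*xi

def pdf(z):
    return math.exp(-z * z / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def cdf(x):
    return 0.5 * (1 + math.erf(x / (sigma * math.sqrt(2))))

def expected_payoff(e, e_opp, n=2000):
    """(wH - wL)*Prob{win} + wL - g(e), Prob{win} by midpoint integration."""
    lo, hi = -8 * sigma, 8 * sigma
    h = (hi - lo) / n
    zs = [lo + (k + 0.5) * h for k in range(n)]
    p = sum((1 - cdf(e_opp - e + z)) * pdf(z) for z in zs) * h
    return (wH - wL) * p + wL - e * e / 2

# e* should be (approximately) a best response to the opponent playing e*
grid = [k / 100 for k in range(201)]      # candidate efforts in [0, 2]
best = max(grid, key=lambda e: expected_payoff(e, e_star))
print(round(e_star, 4), round(best, 4))
```

The grid maximizer lands next to the closed-form effort, confirming that the FOC characterizes a best response under these assumptions.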
Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4
Suppose the outcome of the first play is observed before the second play begins
Suppose the payoff for the entire game is simply the sum of the payoffs from the two stages (no discounting)
This game, called the two-stage Prisoners’ Dilemma belongs to the class of games analyzed in the previous
chapter
The action spaces Ai3 and Ai4 are identical to Ai1 and Ai2
The payoffs are ui (ai1 , ai2 , ai3 , ai4 ), the sum of player i's payoffs from the two stage games
For each possible outcome of the first-stage game, (ai1 , ai2 ), the second-stage game that remains between
players i3 and i4 has a unique NE (âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))
In the two-stage Prisoners’ Dilemma the unique equilibrium of the second-stage game is (L1 , L2 ), regardless
of the first-stage outcome
We analyze the first-stage Prisoners' Dilemma by taking into account that the outcome of the game remaining in the second stage will be the NE (L1 , L2 ) with payoff (1, 1)
Thus the players’ first-stage interaction amounts to the one-shot game below
Player i2
L2 R2
L1 2,2 6,1
Player i1
R1 1,6 5,5
The unique subgame perfect outcome of the two-stage Prisoners’ Dilemma is (L1 , L2 ) in the first-stage,
followed by (L1 , L2 ) in the second-stage
Cooperation, i.e., (R1 , R2 ), cannot be achieved in either stage of the subgame perfect outcome
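The reduction of the two-stage game to a one-shot game can be sketched directly from the payoff tables above:

```python
# Sketch: the two-stage Prisoners' Dilemma collapses to a one-shot game
# once the anticipated second-stage NE payoff (1,1) is added to every
# first-stage cell; the unique NE of the induced game is (L1, L2).
stage = {('L', 'L'): (1, 1), ('L', 'R'): (5, 0),
         ('R', 'L'): (0, 5), ('R', 'R'): (4, 4)}

def pure_nash(g):
    acts = ['L', 'R']
    return [(r, c) for (r, c) in g
            if g[(r, c)][0] == max(g[(rr, c)][0] for rr in acts)
            and g[(r, c)][1] == max(g[(r, cc)][1] for cc in acts)]

second_stage_ne = stage[('L', 'L')]          # unique NE of the stage game
first = {cell: (u1 + second_stage_ne[0], u2 + second_stage_ne[1])
         for cell, (u1, u2) in stage.items()}

print(first[('L', 'L')], pure_nash(first))   # (2, 2) [('L', 'L')]
```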
The payoff of player k is uk ((ai )i∈I ) where ai is chosen from the action set Ai
Definição 15. Given a static game G, let G(T ) denote the finitely repeated game in which G is played
T times:
the outcomes of all preceding plays are observed before the next play begins
the payoffs for G(T ) are simply the (discounted) sum of the payoffs from the T stage games
Proposição 9. If the stage game G has a unique NE then, for any finite T , the repeated game G(T ) has
a unique subgame-perfect outcome: the NE of G is played in every stage
Player i2
L2 M2 R2
L1 1,1 5,0 0,0
Player i1 M1 0,5 4,4 0,0
R1 0,0 0,0 3,3
We will show that there is a subgame perfect outcome of this repeated game in which the non-equilibrium profile (M1 , M2 ) is played in the first stage
We assume that in the first-stage, players anticipate that the second-stage outcome will be a NE of the
stage game
We have for this specific stage game, several Nash equilibria in the second stage
Players may anticipate that different first-stage outcomes will be followed by different stage-game equilibria
in the second stage
For example, suppose that players anticipate that (R1 , R2 ) will be the second-stage outcome if the first-
stage outcome is (M1 , M2 )
Players anticipate that (L1 , L2 ) will be the second-stage outcome if any of the eight other first-stage
outcomes occurs
The players’ first stage interaction then amounts to the following one-shot game
Player i2
L2 M2 R2
L1 2,2 6,1 1,1
Player i1 M1 1,6 7,7 1,1
R1 1,1 1,1 4,4
There are three pure-strategy Nash equilibria (L1 , L2 ), (M1 , M2 ) and (R1 , R2 )
Every NE of this one-shot game corresponds to a subgame perfect outcome of the original repeated game
Denote by ((w, x), (y, z)) the outcome of the repeated game
The NE (L1 , L2 ) in the one-shot game above corresponds to the subgame-perfect outcome ((L1 , L2 ), (L1 , L2 ))
in the repeated game
The NE (R1 , R2 ) in the one-shot game above corresponds to the subgame-perfect outcome ((R1 , R2 ), (L1 , L2 ))
in the repeated game
These two subgame-perfect outcomes of the repeated game simply concatenate NE outcomes from the
stage game
The third NE of the one-shot game, (M1 , M2 ), corresponds to the subgame-perfect outcome ((M1 , M2 ), (R1 , R2 )) in the repeated game
Because the anticipated second-stage outcome is (R1 , R2 ) following (M1 , M2 ), cooperation can be achieved
in the first stage of a subgame perfect outcome of the repeated game
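A minimal enumeration, using the stage-game payoffs and the anticipation rule described above, recovers the three pure equilibria of the induced first-stage game:

```python
# Sketch: anticipate (R1,R2) after (M1,M2) and (L1,L2) after any other
# first-stage outcome; add the continuation payoffs cell by cell and
# enumerate the pure NE of the induced one-shot game.
stage = {('L', 'L'): (1, 1), ('L', 'M'): (5, 0), ('L', 'R'): (0, 0),
         ('M', 'L'): (0, 5), ('M', 'M'): (4, 4), ('M', 'R'): (0, 0),
         ('R', 'L'): (0, 0), ('R', 'M'): (0, 0), ('R', 'R'): (3, 3)}

def continuation(cell):                      # anticipated second-stage NE
    return stage[('R', 'R')] if cell == ('M', 'M') else stage[('L', 'L')]

first = {cell: (u1 + continuation(cell)[0], u2 + continuation(cell)[1])
         for cell, (u1, u2) in stage.items()}

def pure_nash(g):
    acts = ['L', 'M', 'R']
    return [(r, c) for (r, c) in g
            if g[(r, c)][0] == max(g[(rr, c)][0] for rr in acts)
            and g[(r, c)][1] == max(g[(r, cc)][1] for cc in acts)]

print(pure_nash(first))   # [('L', 'L'), ('M', 'M'), ('R', 'R')]
```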
if G is a static game of complete information with multiple Nash equilibria then there may be subgame
perfect outcomes of the repeated game G(T ) in which, for any t < T , the outcome in stage t is not a NE
of G
Credible threats or promises about future behavior can influence current behavior
In deriving the subgame perfect outcome ((M1 , M2 ), (R1 , R2 )) we assumed that the players anticipate that
(R1 , R2 ) will be the second-stage outcome if the first-stage outcome is (M1 , M2 ) and that (L1 , L2 ) will be
the second-stage outcome if any of the eight other first-stage outcomes occurs
But playing (L1 , L2 ) in the second stage, with its payoff of (1, 1), may seem silly when (R1 , R2 ), with its
payoff of (3, 3), is also available as a NE of the remaining stage game
They might reason that “bygones are bygones” and that the unanimously preferred stage-game equilibrium (R1 , R2 ) should be played instead
If (R1 , R2 ) is to be the second-stage outcome after every first-stage outcome, then the incentive to play
(M1 , M2 ) in the first stage is destroyed
Indeed, in that case, the payoff (3, 3) has been added to each cell of the stage game
To suggest a solution to this renegotiation problem, we consider the following modification of the stage
game
Player i2
L2 M2 R2 P2 Q2
L1 1,1 5,0 0,0 0,0 0,0
M1 0,5 4,4 0,0 0,0 0,0
Player i1 R1 0,0 0,0 3,3 0,0 0,0
P1 0,0 0,0 0,0 4,1/2 0,0
Q1 0,0 0,0 0,0 0,0 1/2,4
The players unanimously prefer (R1 , R2 ) to (L1 , L2 ), in other words, (R1 , R2 ) Pareto dominates (L1 , L2 )
There is no NE (x, y) such that the players unanimously prefer (x, y) to (P1 , P2 ), or (Q1 , Q2 ), or (R1 , R2 )
We say that (P1 , P2 ), (Q1 , Q2 ), and (R1 , R2 ) belong to the Pareto frontier of the payoffs to Nash equilibria
of the stage game
Suppose that the stage game is played twice, with the first-stage outcome observed before the second
stage begins
Suppose that players anticipate that the second-stage outcome will be as follows:
– (R1 , R2 ) if the first-stage outcome is (M1 , M2 )
– (P1 , P2 ) if the first-stage outcome is (M1 , w) with w ≠ M2
– (Q1 , Q2 ) if the first-stage outcome is (z, M2 ) with z ≠ M1
– (R1 , R2 ) if the first-stage outcome is (z, w) with z ≠ M1 and w ≠ M2
The players’ first stage interaction then amounts to the following one-shot game
Player i2
L2 M2 R2 P2 Q2
L1 4,4 11/2,4 3,3 3,3 3,3
M1 4,11/2 7,7 4,1/2 4,1/2 4,1/2
Player i1 R1 3,3 1/2,4 6,6 3,3 3,3
P1 3,3 1/2,4 3,3 7,7/2 3,3
Q1 3,3 1/2,4 3,3 3,3 7/2,7
More importantly, the difficulty raised in the previous example does not arise here
In the previous example, the only way to “punish” a player for deviating in the first stage from col-
laboration was to play a Pareto dominated equilibrium in the second stage, thereby also punishing the
punisher
A static game G is repeated infinitely, with the outcomes of all previous stages observed before the current
stage begins
We will define
– a player’s strategy
– a subgame
– a subgame perfect Nash equilibrium (SPNE)
The main theme is that credible threats or promises about future behavior can influence current behavior
– We will illustrate that even if the stage game G has a unique NE, there may be subgame perfect
outcomes of the infinitely repeated game in which no stage’s outcome is the NE of G
Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4
The discount factor δ = 1/(1 + r) is the value today of a dollar to be received one stage later, where r is the interest rate per stage
Given the discount factor δ and a player’s payoffs from an infinite sequence of stage games, we can compute
the present value of the payoffs
Definição 16. Given the discount factor δ, the present value of the infinite sequence of payoffs (πt )t≥1
is
π1 + δπ2 + δ²π3 + · · · = ∑t≥1 δt−1 πt
We can also use δ to interpret a game that ends after a random number of repetitions
Suppose that after each stage is played a (weighted) coin is flipped to determine whether the game will
end
If the probability is p that the game ends immediately, then a payoff π to be received in the next stage (if it is played) is worth only
[(1 − p)/(1 + r)] π
before this stage's coin flip occurs
Likewise, a payoff π to be received two stages from now (if both it and the intervening stage are played) is worth only
[(1 − p)²/(1 + r)²] π
Letting δ = (1 − p)/(1 + r), the present value
π1 + δπ2 + δ²π3 + . . .
reflects both the time-value of money and the possibility that the game will end
Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4
Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ
We will show that cooperation (i.e., (R1 , R2 )) can occur in every stage of a subgame-perfect outcome of
the infinitely repeated game
– Even though the only NE in the stage game is noncooperation (i.e., (L1 , L2 ))
– if the players cooperate today then they play a high-payoff equilibrium tomorrow
– otherwise they play a low-payoff equilibrium tomorrow
We do not need to add artificially the high-payoff equilibrium that might be played tomorrow
Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage
Player i cooperates until someone fails to cooperate, which triggers a switch to noncooperation forever
after
If both players adopt this trigger strategy then the outcome of the infinitely repeated game will be (R1 , R2 )
in every stage
We will first show that if δ is close enough to 1 then it is a NE of the infinitely repeated game for both
players to adopt this strategy
We propose to provide rigorous definitions of the following concepts for both finitely and infinitely repeated
games
1. a strategy in a repeated game
Definição 17. Given a stage game G = {Ai , ui }i∈I , let G(T, δ) denote the finitely repeated game in
which
the stage game G is played T times
for each t, the outcomes of the t − 1 preceding plays are observed before stage t begins
each player's payoff in G(T, δ) is the present value of the player's payoffs from the sequence of stage games
Definição 18. Given a stage game G = {Ai , ui }i∈I , let G(∞, δ) denote the infinitely repeated game
in which
the stage game G is played infinitely often
for each t, the outcomes of the t − 1 preceding plays are observed before stage t begins
each player's payoff in G(∞, δ) is the present value of the player's payoffs from the infinite sequence of stage games
Definição 19. In the finitely repeated game G(T, δ) or the infinitely repeated game G(∞, δ), a player’s
strategy specifies the action the player will take in each stage, for each possible history of plays through
the previous stages
A history through stage t is a sequence
st = (s1 , s2 , . . . , st )
where
∀ 1 ≤ τ ≤ t, sτ = (ai,τ )i∈I ∈ S ≡ ∏i∈I Ai
A strategy for player i maps histories (including the empty history ∅) into actions:
fi : S(T ) → Ai in G(T, δ) and fi : S(∞) → Ai in G(∞, δ)
where S(T ) and S(∞) denote the corresponding sets of histories
Consider a finitely repeated game G(T, δ)
Definição 20. A strategy profile f ∗ = (fi∗ )i∈I is a NE of the repeated game G(T, δ) if for each player i, fi∗ is a best response to the other players' strategies f−i∗ , i.e., πi (fi∗ , f−i∗ ) ≥ πi (fi , f−i∗ ) for every strategy fi
Definição 21. A strategy profile f ∗ = (fi∗ )i∈I is a NE of the infinitely repeated game G(∞, δ) if for each player i, fi∗ is a best response to f−i∗ , i.e., πi (fi∗ , f−i∗ ) ≥ πi (fi , f−i∗ ) for every strategy fi
Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4
Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ
Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage
fi∗ (∅) = Ri , and for every t ≥ 1 and history st ,
fi∗ (st ) = Ri if st = ((R1 , R2 ), . . . , (R1 , R2 )) and fi∗ (st ) = Li otherwise
Proposição 10. If δ ≥ 1/4 then the profile f ∗ defined above is a NE of the infinitely repeated Prisoners’
Dilemma
Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4
πi (fi∗ , fj∗ ) = 4 + 4δ + · · · + 4δt + · · · = 4/(1 − δ)
Now fix another strategy fi ≠ fi∗ and assume first that fi (∅) ≠ fi∗ (∅), i.e., fi (∅) = Li
It follows that the outcome at the first stage is O1 (fi , fj∗ ) = (Li , Rj )
The outcome at the second stage is then O2 (fi , fj∗ ) = (ai,2 , Lj ) for some action ai,2 ∈ {Ri , Li }
This implies
πi (fi , fj∗ ) ≤ 5 + δ + · · · + δt + . . .
and therefore
πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −1 + 3 · δ/(1 − δ) = (4δ − 1)/(1 − δ)
Assume now that fi (∅) = fi∗ (∅) = Ri but fi (sco1 ) = Li , where sco1 ≡ (R1 , R2 )
Observe that the value of fi (s1 ) for an outcome s1 different from sco1 is irrelevant for the payoff ui (O2 (fi , fj∗ ))
It follows that the outcome at the second stage is O2 (fi , fj∗ ) ≡ (fi (sco1 ), fj∗ (sco1 )) = (Li , Rj )
The outcome at the third stage is then O3 (fi , fj∗ ) = (ai,3 , Lj ) for some action ai,3 ∈ {Ri , Li }
This implies
πi (fi , fj∗ ) ≤ 4 + 5δ + δ2 + · · · + δt + . . .
and therefore
πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ 0 − 1 · δ + 3 · δ²/(1 − δ)
= δ/(1 − δ) · [3δ − (1 − δ)] = δ(4δ − 1)/(1 − δ)
Finally, suppose fi first deviates from fi∗ at some stage t
Observe that the value of fi (st−1 ) for an outcome history st−1 different from sco,t−1 is irrelevant for the payoff ui (Ot (fi , fj∗ ))
This implies
πi (fi , fj∗ ) ≤ 4 + 4δ + · · · + 4δt−2 + 5δt−1 + δt + δt+1 + . . .
and therefore
πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −δt−1 + 3 · δt /(1 − δ)
= δt−1 /(1 − δ) · [3δ − (1 − δ)] = δt−1 (4δ − 1)/(1 − δ)
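The comparison between the trigger strategy and the most profitable deviation can be sketched numerically, confirming the δ = 1/4 cutoff:

```python
# Sketch: present value of following the trigger strategy vs. the best
# one-shot deviation in the infinitely repeated Prisoners' Dilemma.
# The gain is (4*delta - 1)/(1 - delta), which changes sign at delta = 1/4.
def cooperate_value(delta):
    return 4 / (1 - delta)            # (R1, R2) in every stage

def deviate_value(delta):
    return 5 + delta / (1 - delta)    # 5 today, then (L1, L2) forever

for delta in (0.1, 0.25, 0.5, 0.9):
    gain = cooperate_value(delta) - deviate_value(delta)
    assert abs(gain - (4 * delta - 1) / (1 - delta)) < 1e-9
    print(delta, round(gain, 4))      # negative, zero, positive, positive
```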
Theory: Subgames
Definição 22. Given a history st , the repeated game in which G is played T − t more times after st is denoted by G(T − t, δ, st ) and is called the subgame beginning at stage t + 1 following history st
Definição 23. Given a history st , the repeated game in which G is played infinitely often after st is denoted by G(∞, δ, st ) and is called the subgame beginning at stage t + 1 following history st
Definição 24. Given a strategy fi in G(T, δ) and a history st , the strategy induced by fi in the subgame G(T − t, δ, st ) is the strategy fi (·|st ) defined by
fi (σ τ |st ) = fi (st , σ τ )
for every history σ τ of G(T − t, δ, st )
Definição 25. Given a strategy fi in G(∞, δ) and a history st , the strategy induced by fi in the subgame G(∞, δ, st ) is the strategy fi (·|st ) defined by
fi (σ τ |st ) = fi (st , σ τ )
for every history σ τ of G(∞, δ, st )
Definição 26. A subgame perfect equilibrium of a finitely repeated game G(T, δ) is a strategy profile f ∗ =
(fi∗ )i∈I which constitutes a NE of every subgame, i.e.,
f ∗ is a NE of G(T, δ)
the strategy profile f ∗ (·|st ) ≡ (fi∗ (·|st ))i∈I is a NE of the subgame G(T − t, δ, st )
Observação 24. Many possible histories st are off the equilibrium path, i.e., they differ from the outcome history (O1 (f ∗ ), . . . , Ot (f ∗ ))
Definição 27. A subgame perfect equilibrium of an infinitely repeated game G(∞, δ) is a strategy profile
f ∗ = (fi∗ )i∈I which constitutes a NE of every subgame, i.e.,
f ∗ is a NE of G(∞, δ)
the strategy profile f ∗ (·|st ) ≡ (fi∗ (·|st ))i∈I is a NE of the subgame G(∞, δ, st )
– To be subgame perfect, the players’ strategies must first be a NE and must then pass an additional
test
The notion of subgame perfect equilibrium eliminates Nash equilibria in which the players’ threats or
promises are not credible
Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4
Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ
Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage
fi∗ (∅) = Ri
Proposição 11. If δ ≥ 1/4 then the profile f ∗ defined above is a SPNE of the infinitely repeated Prisoners’
Dilemma
We only have to prove that for every T ≥ 1 and every possible history sT , the profile f ∗ (·|sT ) is a NE of
the subgame G(∞, δ, sT )
Suppose first that sT = (sco1 , sco2 , . . . , scoT ), i.e., both players cooperated in every stage
Therefore we can reproduce the arguments of the proof that f ∗ is a NE of G(∞, δ) to show that f ∗ (·|sT )
is a NE of G(∞, δ, sT )
Suppose now that sT ≠ (sco1 , sco2 , . . . , scoT )
The strategy fj∗ (·|sT ) is given by fj∗ (σ t |sT ) = Lj for any history σ t of the subgame G(∞, δ, sT )
The payoff πi (f ∗ (·|sT )) for agent i of the strategy fi∗ (·|sT ) is
ui (O1 (fi∗ (·|sT ), fj∗ (·|sT ))) + δui (O2 (fi∗ (·|sT ), fj∗ (·|sT ))) + . . .
We get that
πi (f ∗ (·|sT )) = 1 + δ + δ² + · · · = 1/(1 − δ)
The payoff to agent i from any other strategy gi is
ui (O1 (gi , fj∗ (·|sT ))) + δui (O2 (gi , fj∗ (·|sT ))) + . . .
Since for every t there exists an action ai,t ∈ {Li , Ri } such that Ot (gi , fj∗ (·|sT )) = (ai,t , Lj )
We get that
πi (gi , fj∗ (·|sT )) ≤ 1 + δ + δ2 + · · · = πi (f ∗ (·|sT ))
Definição 28. A profile of payoffs x = (xi )i∈I is called feasible in the stage game G if it is a convex
combination of the pure strategy payoffs of G, i.e., if there exists a family (π k )k∈K of pure strategy payoffs
π k = (πik )i∈I such that
x = ∑k∈K αk π k
where αk ≥ 0 and ∑k∈K αk = 1.
Average payoff
V (π) = π1 + δπ2 + δ2 π3 + . . .
If a payoff π̄ were received in every stage, the present value would be π̄/(1 − δ)
The average payoff of an infinite sequence π = (π1 , π2 , . . . ) of payoffs is the payoff π̄ that should be received
in every stage in order to achieve the same present value
π̄/(1 − δ) = V (π)
Definição 29. Given a discount factor δ, the average payoff of an infinite sequence of payoffs π =
(π1 , π2 , . . . ) is
(1 − δ) ∑t≥1 δt−1 πt
Since the average payoff is just a rescaling of the present value, maximizing the average payoff is equivalent
to maximizing the present value
The advantage of the average payoff over the present value is that the former is directly comparable to
the payoffs from the stage game
Teorema 14 (Folk Theorem: Friedman (1971)). Let G be a finite static game of complete information, let (ei )i∈I denote the payoffs from a Nash equilibrium of G, and let x = (xi )i∈I denote any feasible profile of payoffs of G.
If xi > ei for every player i and δ is sufficiently close to one,
then there exists a subgame perfect NE of the infinitely repeated game G(∞, δ) that achieves x = (xi )i∈I as the profile of average payoffs
Collusion between Cournot duopolists
Cournot duopoly
The equilibrium aggregate quantity, 2(a − c)/3, exceeds the monopoly quantity qm ≡ (a − c)/2; each firm produces the Cournot quantity qC ≡ (a − c)/3
Both firms would be better off if each produced half the monopoly quantity, qi = qm /2
Consider the infinitely repeated game based on this Cournot stage game when both firms have the discount factor δ
We propose to compute the values of δ for which the following trigger strategy is a subgame perfect Nash equilibrium: produce qm /2 in the first period; in period t, produce qm /2 if both firms have produced qm /2 in every previous period, and otherwise produce the Cournot quantity qC
The profit to one firm when both produce qm /2 is (a − c)2 /8, which will be denoted by πm /2
The profit to one firm when both produce qC is (a − c)2 /9, which will be denoted by πC
If firm i is going to produce qm /2 this period then the quantity that maximizes firm j’s profit this period
solves
arg max{[a − c − qj − (qm /2)]qj : qj ≥ 0}
The solution is qj = 3(a − c)/8, with associated profit of πd ≡ 9(a − c)2 /64
(1/(1 − δ)) · πm /2 ≥ πd + (δ/(1 − δ)) · πC
This yields δ ≥ 9/17
For the same reasons as in the infinitely repeated Prisoners' Dilemma, this NE is subgame perfect
We first determine, for a given value of δ, the most profitable quantity the firms can produce if they both
play trigger strategies that switch forever to the Cournot quantity after any deviation
But for any value of δ it is a SPNE simply to repeat the Cournot quantity forever
This implies that the most profitable quantity that triggers strategies can support is between qm /2 and
qC
If firm i is going to produce q ∗ this period, then the quantity that maximizes firm j’s profit this period
solves
arg max{(a − c − qj − q ∗ )qj : qj ≥ 0}
The solution is qj = (a − c − q ∗ )/2, with associated profit
πd ≡ (a − c − q ∗ )²/4
It is a NE for both firms to play the trigger strategy given before provided that
(1/(1 − δ)) · π ∗ ≥ πd + (δ/(1 − δ)) · πC
where π ∗ ≡ (a − c − 2q ∗ )q ∗ denotes each firm's profit when both produce q ∗
Solving the resulting quadratic in q ∗ shows that the lowest value of q ∗ for which the trigger strategies are
a SPNE is
q ∗ (δ) = [(9 − 5δ)/(3(9 − δ))] (a − c)
limδ→9/17 q ∗ (δ) = qm /2 and limδ→0 q ∗ (δ) = qC
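A numerical check (with illustrative a and c) that q∗(δ) makes the incentive constraint hold with equality and has the stated limits:

```python
# Sketch: verify that q*(delta) = (9 - 5*delta)*(a - c)/(3*(9 - delta))
# makes the trigger-strategy constraint bind, and that it interpolates
# between q_m/2 (delta -> 9/17) and q_C (delta -> 0). a, c illustrative.
a, c = 10.0, 4.0
piC = (a - c) ** 2 / 9                     # Cournot profit per firm

def q_star(delta):
    return (9 - 5 * delta) * (a - c) / (3 * (9 - delta))

def slack(q, delta):
    """LHS - RHS of pi*/(1-d) >= pi_d + d*pi_C/(1-d) when both produce q."""
    pi = (a - c - 2 * q) * q               # per-firm profit at (q, q)
    pid = (a - c - q) ** 2 / 4             # best one-period deviation profit
    return pi / (1 - delta) - pid - delta * piC / (1 - delta)

for delta in (0.1, 0.3, 0.5):
    assert abs(slack(q_star(delta), delta)) < 1e-9   # constraint binds
assert abs(q_star(9 / 17) - (a - c) / 4) < 1e-9      # q_m / 2
assert abs(q_star(0.0) - (a - c) / 3) < 1e-9         # q_C
print(q_star(0.1), q_star(0.5))
```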
We now explore the second approach, which involves threatening to administer the strongest credible
punishment
We propose to show that Abreu’s approach can achieve the monopoly outcome in our model when δ = 1/2
(which is less than 9/17)
Let V (x) denote the present value of receiving π(x) ≡ (a − c − 2x)x this period (both firms producing x) and half the monopoly profit forever after:
V (x) = π(x) + (δ/(1 − δ)) · πm /2
1−δ 2
If firm i is going to produce x this period, then the quantity that maximizes firm j’s profit this period
solves
arg max{(a − c − qj − x)qj : qj ≥ 0}
The solution is qj = (a − c − x)/2, with associated profit
πdp (x) ≡ (a − c − x)²/4
(1/(1 − δ)) · πm /2 ≥ πd + δV (x) (1)
and
V (x) ≥ πdp (x) + δV (x) (2)
For δ = 1/2, condition (2) is satisfied provided
x/(a − c) ∈ [3/10, 1/2]
For δ = 1/2, the two-phase strategy achieves the monopoly outcome as a SPNE provided that
x/(a − c) ∈ [3/8, 1/2]
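The two incentive conditions can be verified by scanning punishment quantities x at δ = 1/2 (a and c are illustrative; a small tolerance keeps the boundary points):

```python
# Sketch: for delta = 1/2, condition (2) alone should hold for
# x/(a-c) in [3/10, 1/2], and conditions (1) and (2) together for
# x/(a-c) in [3/8, 1/2].
a, c, delta = 10.0, 4.0, 0.5
pim = (a - c) ** 2 / 4                      # monopoly profit
pid = 9 * (a - c) ** 2 / 64                 # deviation profit from q_m/2

def V(x):                                   # punishment-phase value
    pix = (a - c - 2 * x) * x               # per-firm profit when both play x
    return pix + delta / (1 - delta) * pim / 2

def cond1(x):                               # no deviation from collusion
    return pim / 2 / (1 - delta) >= pid + delta * V(x) - 1e-12

def cond2(x):                               # no deviation from punishment
    pidp = (a - c - x) ** 2 / 4
    return V(x) >= pidp + delta * V(x) - 1e-12

ratios = [k / 1000 for k in range(501)]     # x/(a-c) in [0, 1/2]
ok2 = [r for r in ratios if cond2(r * (a - c))]
both = [r for r in ratios if cond1(r * (a - c)) and cond2(r * (a - c))]
print(min(ok2), max(ok2), min(both), max(both))   # 0.3 0.5 0.375 0.5
```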
Extensive-form representation
It may seem that static games must be represented in normal form and dynamic games in extensive form
In fact, either form can be used for either kind of game, although for some games one of the two forms is more convenient to analyze
We will discuss how static games can be represented using extensive form and how dynamic games can
be represented using normal form
Normal-form representation
The normal-form representation of a game specifies
1. the players in the game
2. the strategies available to each player
3. the payoff received by each player for each combination of strategies that could be chosen by the players
Extensive-form representation
The extensive-form representation of a game specifies
1. the players in the game
2. when each player has the move, what each player can do at each opportunity to move, and what each player knows at each opportunity to move
3. the payoff received by each player for each combination of moves that could be chosen by the players
We propose to describe such games using game trees rather than words
Example: consider the following class of two-stage games of complete and perfect information
1. Player 1 chooses an action a1 from the feasible set A1 = {L, R}
2. Player 2 observes a1 and then chooses an action a2 from the set A2 = {L′ , R′ }
3. Payoffs are u1 (a1 , a2 ) and u2 (a1 , a2 ), as shown in the following game tree
We can extend in a straightforward manner the previous game tree to represent any dynamic game of
complete and perfect information:
To represent a dynamic game in normal form, we need to translate the information in the extensive form
into the description of each player’s strategy
Normal-form representation of dynamic games
Definição 31. A strategy for a player is a complete plan of action: it specifies a feasible action for the
player in every contingency in which the player might be called on to act
We could not apply the notion of Nash equilibrium to dynamic games of complete information if we
allowed a player’s strategy to leave the actions in some contingencies unspecified
For player j to compute a best response to payer i’s strategy, j may need to consider how i would act in
every contingency, not just in the contingencies i thinks likely to arise
In the previous game, player 2 has two actions but four strategies, because there are two contingencies (player 1 having played L or R) in which player 2 might be called upon to act
Strategies of player 2: (L′ , L′ ), (L′ , R′ ), (R′ , L′ ), (R′ , R′ ), where (x, y) means play x after L and play y after R
Player 1, in contrast, has two actions and only two strategies
The reason is that player 1 has only one contingency in which he might be called upon to act
Player 1's strategy space is equivalent to the action space A1 = {L, R}
We can now derive the normal-form representation of the game from its extensive-form representation
Player 2
(L′ , L′ ) (L′ , R′ ) (R′ , L′ ) (R′ , R′ )
L 3, 1 3, 1 1, 2 1, 2
Player 1
R 2, 1 0, 0 2, 1 0, 0
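Enumerating the pure NE of this 2 × 4 normal form (primes on player 2's actions are dropped in the code) shows two equilibria, only one of which is consistent with backwards induction:

```python
# Sketch: pure NE of the normal form above. The NE (R, (R',L')) matches
# backwards induction; the NE (L, (R',R')) relies on a non-credible plan.
p1_strats = ['L', 'R']
p2_strats = [('L', 'L'), ('L', 'R'), ('R', 'L'), ('R', 'R')]  # (after L, after R)

# payoffs (u1, u2) at each terminal node of the game tree
u = {('L', 'L'): (3, 1), ('L', 'R'): (1, 2),
     ('R', 'L'): (2, 1), ('R', 'R'): (0, 0)}

def payoff(a1, s2):
    a2 = s2[0] if a1 == 'L' else s2[1]      # player 2 follows her plan
    return u[(a1, a2)]

ne = [(a1, s2) for a1 in p1_strats for s2 in p2_strats
      if payoff(a1, s2)[0] == max(payoff(b1, s2)[0] for b1 in p1_strats)
      and payoff(a1, s2)[1] == max(payoff(a1, t2)[1] for t2 in p2_strats)]
print(ne)   # [('L', ('R', 'R')), ('R', ('R', 'L'))]
```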
Extensive-form of static games
We turn to showing how a static (i.e., simultaneous-move) game can be represented in extensive form
It suffices that each player chooses a strategy without knowledge of the other's choice
Alternatively, player 2 could move first and player 1 could then move without observing 2’s action
To represent that some player ignores the previous moves, we introduce the notion of a player’s information
set
Definição 32. An information set for a player is a collection of decision nodes satisfying
(i) the player has the move at every node in the information set
(ii) when the play of the game reaches a node in the information set, the player does not know which node
in the information set has (or has not) been reached
Part (ii) implies that the player must have the same set of feasible actions at each decision node in an information set
In an extensive-form game, we will indicate that a collection of decision nodes constitutes an information
set by connecting the nodes by a dotted line
(Figure: the Prisoners' Dilemma in extensive form, with the prisoners' decision nodes connected by an information set; Fink = confess)
We propose a second example of the use of an information set in representing ignorance of a previous play
Perfect and imperfect information
We previously defined perfect information to mean that at each move in the game the player with the move knows the full history of the play of the game thus far
Equivalently, a game has perfect information if every information set is a singleton
Imperfect information means that there is at least one non-singleton information set
Subgames
We extend this definition to general dynamic games of complete information in terms of the game's extensive-form representation
Definição 33. A subgame in an extensive-form game
(a) begins at a decision node n that is a singleton information set but is not the game’s first decision node
(b) includes all the decision and terminal nodes following n in the game tree but no nodes that do not follow
n
(c) does not cut any information sets, i.e., if a decision node n′ follows n in the game tree, then all other
nodes in the information set containing n must also follow n, and so must be included in the subgame
Subgames: example
There are two subgames, one beginning at each of player 2’s decision nodes
Subgames: example
There is only one subgame: it begins at player 3’s decision node following R by player 1 and R′ by player
2
Because of part (c), a subgame does not begin at either of player 2’s decision nodes, even though both of
these nodes are singleton information sets
Subgame perfect Nash equilibrium
Definição 34. A profile of strategies of a dynamic game with complete information is a subgame perfect
Nash equilibrium if it is a Nash equilibrium of the initial game and the players’ strategies restricted to every
subgame constitute a Nash equilibrium of the subgame
We already encountered two game solutions for dynamic games: backwards induction outcome and subgame
perfect outcome
The difference is that a SPNE is a collection of strategies, and a strategy is a complete plan of action, whereas an outcome describes what will happen only in the contingencies that are expected to arise, not in every contingency that might arise
Equilibrium vs outcome
Consider the standard two-stage game of complete and perfect information defined as follows
Assume that the previous optimization problem for player 1 also has a unique solution, denoted by a∗1
The pair of actions (a∗1 , R2 (a∗1 )) is the backwards induction outcome of this game
For player 1 a strategy coincides with an action since there is only one contingency in which player 1 can
be called upon to act – the beginning of the game
In this game, the subgames begin with player 2’s move in the second stage
Remark 25. The profile of strategies f ∗ ≡ (a∗1 , R2 ) is a SPNE
We have to show that f ∗ = (a∗1 , R2 ) is a NE and that the restriction to each subgame is also a NE
– Being a NE reduces to requiring that player 2’s action be optimal in every subgame
– This is exactly the problem that the best-response function R2 solves
Consider the standard two-stage game of complete but imperfect information defined as follows:
Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
Players i3 and i4 observe the outcome of the first stage, (ai1 , ai2 ), and then simultaneously choose actions
ai3 and ai4 from feasible sets Ai3 and Ai4 , respectively
We will assume that for each feasible outcome (ai1 , ai2 ) of the first game, the second-stage game that
remains between players i3 and i4 has a unique NE denoted by
Assume that (a∗i1 , a∗i2 ) is the unique NE of the first-stage interaction between i1 and i2 defined by the following simultaneous-move game
1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
2. Payoffs are
ui (ai1 , ai2 , âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))
Proposition 12. In the two-stage game of complete but imperfect information defined above, the subgame perfect outcome is
(a∗i1 , a∗i2 , âi3 (a∗i1 , a∗i2 ), âi4 (a∗i1 , a∗i2 ))
but the subgame perfect Nash equilibrium is

                        Player 2
            (L′, L′)  (L′, R′)  (R′, L′)  (R′, R′)
Player 1  L   3, 1      3, 1      1, 2      1, 2
          R   2, 1      0, 0      2, 1      0, 0
The second one corresponds to a non-credible threat of player 2
If the threat works then 2 is not given the opportunity to carry out the threat
Cap. 3 - Static Games of Incomplete Information
3.1 Static Bayesian games and Bayesian Nash equilibrium
Introduction
In a game of complete information the players’ payoff functions are common knowledge
In a game of incomplete information, at least one player is uncertain about another player's payoff function
The classic example is a sealed-bid auction:
– each bidder knows his or her own valuation for the good being sold
– but each bidder does not know any other bidder's valuation
– bids are submitted in sealed envelopes, so the players' moves can be thought of as simultaneous
An example:
Consider a Cournot duopoly model with inverse demand given by P (Q) = a − Q, where Q = q1 + q2 is
the aggregate quantity in the market
Information is asymmetric: firm 1's marginal cost c is common knowledge, while firm 2's marginal cost is privately known to firm 2, being cH with probability θ and cL with probability 1 − θ
Firm 2 may want to choose a different (and presumably lower) quantity if its marginal cost is high than if it is low
Firm 1 should anticipate that firm 2 may tailor its quantity to its cost in this way
Let q2 (cH ) and q2 (cL ) denote firm 2’s quantity choices as a function of its cost
Let q1 denote firm 1’s single quantity choice
Firm 1 knows that firm 2’s cost is high with probability θ and should anticipate that firm 2’s quantity
choice will be q2 (cH ) or q2 (cL ), depending on firm 2’s cost
so firm 1 chooses q1 to maximize
f1 (q1 , q2∗ ) ≡ θ[(a − q1 − q2∗ (cH )) − c]q1 + (1 − θ)[(a − q1 − q2∗ (cL )) − c]q1
q2∗ (cH ) = (a − 2cH + c)/3 + [(1 − θ)/6](cH − cL )
q2∗ (cL ) = (a − 2cL + c)/3 − [θ/6](cH − cL )
and
q1∗ = (a − 2c + θcH + (1 − θ)cL )/3
Consider the Cournot equilibrium under complete information with costs c1 and c2
q̂i = (a − 2ci + cj )/3
In the incomplete information case, q2∗ (cH ) is greater than (a − 2cH + c)/3 and q2∗ (cL ) is less than (a −
2cL + c)/3
This occurs because firm 2 not only tailors its quantity to its cost but also responds to the fact that firm 1 cannot do so
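As a quick numerical check, the closed-form quantities above can be verified against both firms' first-order conditions (a minimal sketch; the parameter values are illustrative assumptions, not from the text):

```python
# Verify the Bayesian Nash equilibrium of the asymmetric-information Cournot game.
a, c, cH, cL, theta = 10.0, 2.0, 4.0, 1.0, 0.4  # assumed parameter values

# Closed-form equilibrium quantities derived above
q2H = (a - 2*cH + c)/3 + (1 - theta)/6 * (cH - cL)
q2L = (a - 2*cL + c)/3 - theta/6 * (cH - cL)
q1 = (a - 2*c + theta*cH + (1 - theta)*cL)/3

# First-order conditions: each type of firm 2 best-responds to q1,
# and firm 1 best-responds to firm 2's expected quantity
br2H = (a - q1 - cH)/2
br2L = (a - q1 - cL)/2
br1 = (a - c - (theta*q2H + (1 - theta)*q2L))/2

assert abs(q2H - br2H) < 1e-12 and abs(q2L - br2L) < 1e-12
assert abs(q1 - br1) < 1e-12
```

The asserts confirm that the displayed formulas are a fixed point of the two best-response maps.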
Recall that the normal-form representation of a game of complete information is G = (Si , ui )i∈I
We can write G = (Ai , ui )i∈I where Ai is player i’s action space and ui (ai , a−i ) is player i’s payoff
We should represent the idea that each player knows his or her payoff function but may be uncertain
about the other players’ payoff functions
Let player i’s possible payoff functions be represented by ui (ai , a−i ; ti ) where ti is called player i’s type
The type ti belongs to a set of possible types Ti also called type space
For example, suppose that player i has two possible payoff functions
We would say that player i has two types ti1 and ti2
Player i’s type space is Ti = {ti1 , ti2 } and player i’s two payoff functions are ui (a; ti1 ) and ui (a; ti2 )
We can also represent the possibility that the player might have different sets of feasible actions
– Suppose for example that player i’s set of feasible actions is {a, b} with probability q and {a, b, c}
with probability 1 − q
– We can say that i has two types: ti1 and ti2 where the probability of ti1 is q
– We can define i’s feasible set of actions to be {a, b, c} for both types but define the payoff from taking
action c to be −∞ for type ti1
Another example
Firm 2 has two possible cost functions and thus two possible profit or payoff functions:
π2 (q1 , q2 ; cL ) = [(a − q1 − q2 ) − cL ]q2
and
π2 (q1 , q2 ; cH ) = [(a − q1 − q2 ) − cH ]q2
We say that firm 2's type space is T2 = {cL , cH } and that firm 1's type space is T1 = {c}
Saying that player i knows his or her own payoff function is equivalent to saying that player i knows his
or her type
Saying that player i may be uncertain about the other players' payoff functions is equivalent to saying that player i may be uncertain about the types of the other players, denoted by t−i = (t1 , . . . , ti−1 , ti+1 , . . . , tn )
We use T−i to denote the set of all possible values of t−i , i.e.,
T−i ≡ ∏_{j≠i} Tj = T1 × · · · × Ti−1 × Ti+1 × · · · × Tn
We use the probability distribution π(t−i |ti ) to denote player i’s belief about the other players’ types, t−i ,
given player i’s knowledge of his or her own type, ti
In many applications, the players’ types are independent, in which case π(t−i |ti ) does not depend on ti ,
so we can write player i’s beliefs as π(t−i )
Each firm’s chance of success depends in part on how difficult the technology is to develop, which is not
known
Each firm knows only whether it has succeeded and not whether the other has
If firm 1 has succeeded, then it is more likely that the technology is easy to develop and so also more
likely that firm 2 has succeeded
Firm 1’s belief about firm 2’s type depends on firm 1’s knowledge of its own type
The normal-form representation of a static Bayesian game is
G = (Ai , Ti , pi , ui )i∈I
where player i's payoff function is ui (ai , a−i ; ti )
Player i's type ti is privately known by player i and determines player i's payoff function ui (ai , a−i ; ti )
Player i's belief about the other players' types is obtained by Bayes' rule:
pi (t−i |ti ) = pi (ti , t−i ) / pi (ti )
Since player i observes his own type, we do not need to define the probability pi on the whole space T : all that matters is the map ti ↦ pi (·|ti ) from Ti to Prob(T−i )
Following Harsanyi (1968) we will assume that the timing of a static Bayesian game is as follows
1. Nature draws a type vector t = (ti )i∈I where ti is drawn from the set of possible types Ti
2. Nature reveals ti to player i, but not to any other player
3. The players simultaneously choose actions, player i choosing ai from the feasible set Ai
We can interpret a game of incomplete information as a game of imperfect information since at some move
in the game the player with the move does not know the complete history of the game thus far
Indeed, nature reveals player i’s type to player i but not to player j in step (2)
Player j does not know the complete history of the game when actions are chosen in step (3)
For some games, player i's payoff may depend not only on the actions (ai , a−i ) and his own type ti but also on all the other types t−i
We will assume that it is common knowledge that in step (1) of the timing, nature draws a type vector t = (ti )i∈I according to a common prior probability distribution p ∈ Prob(T )
When nature reveals ti to player i, he can compute the belief pi (t−i |ti ) using Bayes’ rule
The other players can compute the various beliefs π(·|ti ) that player i might hold, depending on i’s type
ti
We will frequently assume that players' types are independent, i.e., there exists qi ∈ Prob(Ti ) such that p(t) = q1 (t1 ) · · · qn (tn )
In this case the other players know i’s belief about their types
In order to define an equilibrium concept for static Bayesian games, we must first define the player’s
strategies
Recall that a player’s strategy is a complete plan of action, specifying a feasible action in every contingency
in which the player might be called on to act
Given the timing of a static Bayesian game, in which nature begins the game by drawing the players' types, a (pure) strategy for player i must specify a feasible action for each of player i's possible types
Definition 36. A strategy for player i in the static Bayesian game G = (Ai , Ti , pi , ui )i∈I is a function
si : Ti → Ai
which specifies for each type ti ∈ Ti an action si (ti ) from the feasible set Ai
In a Bayesian game the strategy spaces are constructed from the type and action spaces
Player i’s set of possible (pure) strategies, Si , is the set of all possible functions with domain Ti and range
Ai
Si ≡ (Ai )^Ti
It may seem unnecessary to require player i's strategy to specify a feasible action for each of player i's possible types
Once nature has drawn a particular type and revealed it to a player, it may seem that the player need
not be concerned with the actions he should have taken had nature drawn some other type
But what the other players will do depends on what they think player i would do, for each ti in Ti (since they do not observe ti )
In deciding what to do once one type has been drawn, player i will have to think about what he would
have done if each other types in Ti had been drawn
When player j has to decide what to do, he should think about what player i may do for each possible
types in Ti , since player j cannot observe player i’s type
Definition 37. In the static game G = (Ai , Ti , pi , ui )i∈I the profile of strategies
s∗ = (s∗i )i∈I
is a (pure strategy) Bayesian Nash equilibrium (BNE) if for each player i and for each of i's types ti in Ti , the action s∗i (ti ) belongs to
arg max { Σ_{t−i ∈ T−i} ui (ai , s∗−i (t−i ); (ti , t−i )) pi (t−i |ti ) : ai ∈ Ai }
In a static BNE, no player wants to change his strategy, even if the change involves only one action by
one type
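For a finite Bayesian game, the condition in Definition 37 can be checked by brute force. The sketch below enumerates all type-contingent strategies in a small two-player game whose payoffs are invented for illustration:

```python
from itertools import product

# A tiny static Bayesian game (payoffs invented for illustration):
# player 1 has a single type; player 2 is "tough" or "weak" with prior 1/2.
A1, A2, T2 = ["U", "D"], ["L", "R"], ["tough", "weak"]
prior = {"tough": 0.5, "weak": 0.5}

def u1(a1, a2):
    return {("U", "L"): 2, ("U", "R"): 0, ("D", "L"): 0, ("D", "R"): 1}[(a1, a2)]

def u2(a1, a2, t2):
    base = {("U", "L"): 1, ("U", "R"): 0, ("D", "L"): 0, ("D", "R"): 2}[(a1, a2)]
    return base if t2 == "weak" else -base  # the "tough" type reverses preferences

def is_bne(a1, s2):
    # player 1 best-responds in expectation to the type-contingent strategy s2
    exp1 = lambda a: sum(prior[t] * u1(a, s2[t]) for t in T2)
    if any(exp1(a) > exp1(a1) for a in A1):
        return False
    # each type of player 2 best-responds to a1, as Definition 37 requires
    return all(u2(a1, s2[t], t) >= u2(a1, a, t) for t in T2 for a in A2)

equilibria = [(a1, s2)
              for a1 in A1
              for s2 in (dict(zip(T2, choice)) for choice in product(A2, repeat=len(T2)))
              if is_bne(a1, s2)]
print(equilibria)
```

Note that player 2's strategy is a function from types to actions, so the search runs over |A2|^|T2| = 4 strategies for player 2.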
We can show that in a finite static Bayesian game (i.e., a game in which Ai and Ti are finite sets) there
exists a BNE, perhaps in mixed strategies
Proposition 13. Assume that (s∗i )i∈I is a Bayesian Nash equilibrium. Then
s∗i ∈ arg max { Ep [ui (si , s∗−i )] : si ∈ Si = (Ai )^Ti }
where
Ep [ui (si , s∗−i )] ≡ Σ_{t∈T} ui (si (ti ), s∗−i (t−i )) p(t)
3.2 Applications
Mixed strategies revisited
Player j’s mixed strategy represents player i’s uncertainty about j’s choice of a pure strategy
Player j’s choice in turn depends on the realization of a small amount of private information
More precisely, a mixed-strategy NE in a game of complete information can (almost always) be interpreted
as a pure-strategy BNE in a closely related game with a little bit of incomplete information
The crucial feature of a mixed-strategy NE is not that player j chooses a strategy randomly, but rather
that player i is uncertain about player j’s choice
                 Pat
            Opera   Fight
Chris Opera  2, 1    0, 0
      Fight  0, 0    1, 2
Suppose that Chris and Pat are not quite sure of each other’s payoffs
For instance, suppose that Chris’s payoff if both attend the Opera is 2 + tc , where tc is privately known
by Chris
Pat’s payoff if both attend the Fight is 2 + tp , where tp is privately known by Pat
The parameters tc and tp are independent draws from a uniform distribution on [0, x], where x should be thought of as small relative to 2
All the other payoffs are the same
G = {Ac , Ap ; Tc , Tp ; pc , pp ; uc , up }
where
                 Pat
            Opera        Fight
Chris Opera  2 + tc , 1   0, 0
      Fight  0, 0         1, 2 + tp
Chris plays Opera if tc exceeds the critical value c and plays Fight otherwise, i.e.,
s∗c (tc ) = Opera if tc > c, and s∗c (tc ) = Fight if tc ≤ c
Pat plays Fight if tp exceeds the critical value p and plays Opera otherwise, i.e.,
s∗p (tp ) = Fight if tp > p, and s∗p (tp ) = Opera if tp ≤ p
For a given value of x, we will determine values of c and p such that these strategies are a BNE
Given Pat’s strategy s∗p , Chris’s expected payoff from playing Opera is
p h pi p
uc (Opera, s∗p ; tc ) = (2 + tc ) + 1 − · 0 = (2 + tc )
x x x
p h pi p
uc (F ight, s∗p ; tc ) = ·0+ 1− ·1 =1−
x x x
116
Playing Opera is optimal (best response) if and only if
x
tc ≥ −3≡c
p
Given Chris’s strategy s∗c , Pat’s expected payoff from playing F ight is
h ci c c
up (F ight, s∗c ; tp ) = 1 − · 0 + (2 + tp ) = (2 + tp )
x x x
Yields p = c and p2 + 3p − x = 0
equal √
−3 + 9 + 4x
1−
2x
which approaches 2/3 as x approaches zero
As the incomplete information disappears (x → 0), the players’ behavior in this pure-strategy BNE of the
incomplete information game approaches the players’ behavior in the mixed-strategy NE in the original game
of complete information
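The limit can be checked numerically; a minimal sketch of the computation above:

```python
from math import sqrt

# The cutoff solves c**2 + 3*c - x = 0 (with p = c); the probability that
# Chris plays Opera (and that Pat plays Fight) is 1 - c/x.
def opera_prob(x):
    c = (-3 + sqrt(9 + 4*x)) / 2  # positive root of c**2 + 3c - x = 0
    return 1 - c / x

for x in (1.0, 0.1, 0.001):
    print(x, opera_prob(x))  # decreases toward 2/3 as x shrinks

assert abs(opera_prob(1e-6) - 2/3) < 1e-3
```

As the private payoff perturbations vanish, the pure-strategy cutoff behavior reproduces the 2/3 mixing probability of the complete-information game.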
An auction
Bidder i has a valuation vi for the good
– If bidder i gets the good and pays the price p, then i’s payoff is vi − p
The two players’ valuations are independently and uniformly distributed on [0, 1]
The higher bidder wins the good and pays the price she bid; the other bidder gets and pays nothing
A profile (b̃1 , b̃2 ) is a BNE if for each player i and each valuation vi ∈ [0, 1], the value b̃i (vi ) belongs to
arg max { (vi − bi ) pi {bi > b̃j } + (1/2)(vi − bi ) pi {bi = b̃j } : bi ≥ 0 }
Recall that
pi {bi > b̃j } = λ{vj ∈ [0, 1] : bi > b̃j (vj )}
and
pi {bi = b̃j } = λ{vj ∈ [0, 1] : bi = b̃j (vj )}
We propose to look for a linear equilibrium of the form
b̃i : vi ↦ ai + ci vi
where ci > 0
We are not restricting the players’ strategy spaces to include only linear strategies
We allow players to choose arbitrary strategies but ask whether there is an equilibrium that is linear
Suppose player j uses the linear strategy b̃j = aj + cj Id, where Id : [0, 1] → [0, 1] is defined by Id(v) = v
Player i's best bid satisfies aj ≤ b̃i (vi ) ≤ aj + cj , and for such bids
λ{bi > aj + cj Id} = λ[0, (bi − aj )/cj ) = (bi − aj )/cj
If 0 < aj < 1 then there are some values of vi such that vi < aj , in which case b̃i is not linear
b̃i (vi ) = aj /2 + (1/2)vi
The function b̃i takes the form ai + ci Id where ai = aj /2 and ci = 1/2
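A quick numerical check of the fixed point (the case aj = 0, cj = 1/2, i.e., both players bid half their valuation) with a simple grid search:

```python
# If player j bids half his valuation, player i's expected payoff from a bid b
# is (v_i - b) * P(b > v_j / 2) = (v_i - b) * min(2b, 1), with v_j uniform on [0, 1].
def expected_payoff(vi, b):
    return (vi - b) * min(2*b, 1.0)

def best_bid(vi, grid=10000):
    bids = [k / grid for k in range(grid + 1)]
    return max(bids, key=lambda b: expected_payoff(vi, b))

for vi in (0.2, 0.5, 0.9):
    assert abs(best_bid(vi) - vi/2) < 1e-3  # bidding half the valuation is optimal
```

So bidding half one's valuation is indeed a best response to the opponent doing the same, confirming (b̃, b̃) with b̃ = Id/2 as an equilibrium.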
An auction: uniqueness
We propose to prove that there is a unique symmetric BNE which is the linear equilibrium already derived
We propose to prove that there is a single function b̃ such that (b̃, b̃) is a BNE
Since players’ valuations typically will be different, their bids typically will be different, even if both use
the same strategy
Suppose that player j adopts a strategy b̃ and assume that b̃ is strictly increasing and differentiable
If bi ∈ Im(b̃) then
{bi > b̃} = [0, b̃−1 (bi ))
implying the first-order condition
−b̃−1 (b̃(vi )) + (vi − b̃(vi )) [b̃−1 ]′ (b̃(vi )) = 0
Observe that
[Id × b̃]′ (vi ) = vi b̃′ (vi ) + b̃(vi )
This leads to
[Id × b̃ − (1/2) Id²]′ = 0
Observe that λ{bi = b̃} = 0
Therefore there exists a constant k such that
vi b̃(vi ) = (1/2)vi² + k for all vi ∈ [0, 1]
2
A player’s action should be individually rational: No player should bid more than his valuation
Thus, we require b̃ ≤ Id
A double auction
We consider the case in which a buyer and a seller each have private information about their valuations
The buyer’s valuation for the seller’s good is vb , the seller’s valuation is vs
These valuations are private information and are drawn from independent uniform distribution on [0, 1]
If the buyer gets the good for price p then his utility is vb − p; if there is no trade the buyer’s utility is
zero
If the seller sells the good for price p then his utility is p − vs ; if there is no trade the seller’s utility is
zero
A strategy for the buyer is a function p̃b : vb ↦ p̃b (vb ) specifying the price the buyer will offer for each of his possible valuations
A strategy for the seller is a function p̃s : vs ↦ p̃s (vs ) specifying the price the seller will demand for each of his possible valuations
A profile of strategies (p̃b , p̃s ) is a BNE if the two following conditions hold
where
{pb ≥ p̃s } = {vs ∈ [0, 1] : pb ≥ p̃s (vs )}
where
{ps ≤ p̃b } = {vb ∈ [0, 1] : ps ≤ p̃b (vb )}
There are many BNE; we propose to exhibit one in which trade occurs at a single price, if it occurs at all
Trade would be efficient for all (vs , vb ) pairs such that vb ≥ vs , but does not occur in the two shaded
regions
If the buyer’s valuation is vb , his best reply p̃b (vb ) should solve
max_{pb ≥ 0}  [ vb − (1/2)( pb + (as + pb )/2 ) ] · [pb − as ]⁺ / cs
whose solution is
p̃b (vb ) = (2/3)vb + (1/3)as
Thus, if the seller plays a linear strategy, then the buyer’s best response is also linear
If the seller’s valuation is vs , his best reply p̃s (vs ) should solve
max_{ps ≥ 0}  [ (1/2)( ps + E[p̃b | ps ≤ p̃b ] ) − vs ] · ( cb − [ps − ab ]⁺ ) / cb
Thus, if the buyer plays a linear strategy, then the seller’s best response may also be linear
Assume ab ≤ cb /2
If the players’ linear strategies are to be best response to each other then we get
cb = 2/3,  cs = 2/3,  ab = as /3,  and  as = (ab + cb )/3
so that
p̃b (vb ) = (2/3)vb + 1/12  and  p̃s (vs ) = (2/3)vs + 1/4
Trade occurs if and only if the pair (vs , vb ) is such that p̃b (vb ) ≥ p̃s (vs ), i.e., iff vb ≥ vs + (1/4)
In both cases, the one-price and linear equilibria, the most valuable trade (vs = 0 and vb = 1) does occur
The linear equilibrium, in contrast, misses all trades worth less than 1/4 but achieves all trades worth at least 1/4
This suggests that the linear equilibrium may dominate the one-price equilibria, in terms of the expected
gains the players receive
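This comparison can be made concrete by numerically integrating the realized gains from trade over the unit square of valuations (a sketch; the one-price equilibrium is taken at price x = 1/2):

```python
# Expected gains from trade, E[(vb - vs) * 1{trade}], by midpoint-rule
# integration over (vs, vb) uniform on [0, 1] x [0, 1].
N = 400

def expected_gains(trade):
    total = 0.0
    for i in range(N):
        vs = (i + 0.5) / N
        for j in range(N):
            vb = (j + 0.5) / N
            if trade(vs, vb):
                total += vb - vs
    return total / N**2

linear = expected_gains(lambda vs, vb: vb >= vs + 0.25)    # linear equilibrium
one_price = expected_gains(lambda vs, vb: vs <= 0.5 <= vb)  # one-price, x = 1/2
first_best = expected_gains(lambda vs, vb: vb >= vs)        # all efficient trades

print(linear, one_price, first_best)
assert one_price < linear < first_best
```

The computed values approach 9/64, 1/8 and 1/6 respectively, so the linear equilibrium captures more expected surplus than the one-price equilibrium, though less than the first best.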
One may wonder if it is possible to find other BNE for which the players might do even better
Myerson and Satterthwaite (JET 1983) show that, for these uniform valuation distributions, the linear equilibrium yields higher expected gains for the players than any other Bayesian Nash equilibrium of the double auction
This implies that there is no BNE of the double auction in which trade occurs if and only if it is efficient
(i.e., if and only if vb ≥ vs )
It is common for one of the players to be able to set the terms under which a process of strategic interaction will unfold
– the government is the seller
– but it is also the agent who determines the rules of the game (the rules of the privatization)
An entrepreneur who decides to sell his firm can likewise set the terms under which the negotiation with potential buyers will take place
The Central Bank, in its auctions of government bonds, establishes the rules of those auctions itself
The fact that some players hold private information is essential for the player who sets the rules to be able to maximize his payoff
When selling a firm (or a Central Bank bond), one generally does not know the value the buyer is willing to pay
Question 11. How can one design a mechanism so that a given objective is achieved?
The player who has the power to design the mechanism is free to define the rules of the game
1. individual rationality: the player responsible for the design cannot resort to any coercion
– the players involved in the mechanism must play voluntarily
2. incentive compatibility: the player responsible for the design must hold reasonable expectations about the other players' behavior
– the players involved in the mechanism will not play anything that is not an equilibrium of the mechanism consistent with their own interests
Example
The possible types of buyers of the firm, {a, b}, are common knowledge among the players
1. Ask the buyer what his type is or, equivalently, how much he is willing to pay for the firm
2. Set a price that the buyer, whatever his type, will be willing to pay
In the first case, the buyer will always declare that he is of type b, the type that values the firm less
The result will be the same as when the government sets a price acceptable to either of the two buyer types
There are alternatives that can produce better results for the government
The government can establish a mechanism in which the sale is assured whenever a price above v̂ = 17 is offered
If a lower value v is offered (it would then be v = b), the probability that the sale actually goes through would be 50%
Since v̂ > v, the buyer of type t = b will prefer to run the risk of not completing the purchase, making the lower offer b
If the high-valuation buyer t = a pays v̂, his surplus (payoff) is a − v̂ = 13
If instead he makes the low offer, his expected surplus is
(1/2)(a − b) + (1/2) · 0 = (a − b)/2 = (30 − 10)/2 = 10
Thus it is a better deal for the high-valuation buyer t = a to pay the higher price set by the government and secure the acquisition of the firm
The government will sell the firm to a high-valuation buyer with probability p
For the low-valuation buyer, there is a 50% chance that the government sells, but also a 50% chance that the government cancels the sale
This mechanism is better for the government than the single-price mechanism for both types
For an offer equal to β, there is a probability (1 − θ) that the government cancels the privatization
The high-valuation buyer (t = a) prefers to buy the firm paying the value α rather than run the risk of offering the low value, provided that
a − α ≥ θ(a − β), i.e., a ≥ (α − θβ)/(1 − θ)
The low-valuation buyer (t = b) prefers to run the risk of offering the low value β rather than pay the high value if
θ(b − β) ≥ b − α, i.e., b ≤ (α − θβ)/(1 − θ)
The inequality
b ≤ (α − θβ)/(1 − θ) ≤ a
is called the incentive compatibility constraint
Thanks to it, each buyer type prefers to select the payment best suited to his type
Setting the value α too high may induce the high-valuation buyer to take the risk of offering the lower value
We must add the constraint that no buyer type can be coerced into acquiring the firm (i.e., into participating in the mechanism)
The expected profit of each buyer type must exceed the opportunity cost (of not participating in the mechanism), i.e.,
a ≥ α and b ≥ β
Given the constraints
b ≤ (α − θβ)/(1 − θ) ≤ a
the government's expected revenue is
pα + (1 − p)θβ
1. incentive compatibility
b ≤ (α − θβ)/(1 − θ) ≤ a  (IC)
2. individual rationality
a ≥ α and b ≥ β  (IR)
At the optimum these constraints bind:
a = (α − θβ)/(1 − θ)
and
b = β
1. If b < pa the government should set θ = 0, i.e., not sell the firm if the offered price is below α = a
2. If b > pa the government should set θ = 1, i.e., sell with certainty, provided it obtains the minimum value for the firm
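A short computation illustrates the corner solution (the prior p of the high-valuation type is an assumed parameter; a and b follow the example above):

```python
# Government's expected revenue when (IC) binds for type a and (IR) binds
# for type b: alpha = (1 - theta)*a + theta*b and beta = b.
a, b = 30.0, 10.0

def revenue(theta, p):
    alpha = (1 - theta)*a + theta*b
    beta = b
    return p*alpha + (1 - p)*theta*beta

# revenue simplifies to p*a + theta*(b - p*a): linear in theta,
# so the optimum sits at a corner, theta = 0 or theta = 1
for p in (0.2, 0.5):  # hypothetical priors on the high-valuation type
    grid = [revenue(t/100, p) for t in range(101)]
    best_theta = max(range(101), key=lambda t: grid[t]) / 100
    assert best_theta == (1.0 if b > p*a else 0.0)
```

With p = 0.2 we have b > pa, so selling for sure (θ = 1) is optimal; with p = 0.5 we have b < pa and the government holds out for the high price (θ = 0).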
Revelation principle
When the mechanism assures the purchase only at a higher price (θ > 0), it leads the players to reveal their true characteristics (types) indirectly, through their decisions
The same result can be obtained through a mechanism in which players are induced to announce their true characteristics
Instead of offering the firm at a high but certain value, or at a low but uncertain one, the government could simply have asked the buyer what his true type was
announcing that, if the reported type were t = a, the firm would be offered to him with certainty, but at the higher value
and that, if the reported type were t = b, the firm would be offered at the lower value, but the sale would have a 50% chance of actually going through
The result of this so-called direct mechanism would be the same, even though the form of the game is different
The players would announce their true type to the government, which would then assign the appropriate payoffs
Fix a set of players I and payoff functions
ui : A × T → [−∞, ∞)
where ui (ai , a−i ; t) is the payoff received by player i if he chooses ai , given that the other players choose a−i and the players' types are t = (ti , t−i )
∀i ∈ I, Bi = Ti
Definition 40. A direct mechanism (vi )i∈I is said to be incentive compatible (or truth telling) if telling the truth is a Bayesian Nash equilibrium, i.e., if the strategies si (ti ) = ti , for every i ∈ I, constitute a BNE
Theorem 16 (Revelation principle). Every payoff profile (πi∗ )i∈I obtained in a BNE of any mechanism (Ai , ui )i∈I can be obtained through an incentive compatible direct mechanism, i.e., there exists a direct mechanism (vi )i∈I which is incentive compatible and for which (πi∗ )i∈I is the payoff profile of its truth-telling BNE:
πi∗ (ti ) = E^pi [vi (ti , Id−i ; (ti , ·))] = ∫_{T−i} vi (ti , t−i ; (ti , t−i )) pi (t−i |ti )
Proof. Set
vi (τi , τ−i ; t) ≡ ui (s∗i (τi ), s∗−i (τ−i ); t)
where (s∗i )i∈I is the BNE of the mechanism (Ai , ui )i∈I leading to the payoff profile (πi∗ )i∈I
Cap. 4 - Dynamic games of incomplete information
4.1 Introduction to Perfect Bayesian equilibrium
Consider the following dynamic game of complete but imperfect information
If player 1 chooses either L or M then player 2 learns that R was not chosen
Player 2 then chooses between two actions, L′ and R′ , after which the game ends
              Player 2
             L′      R′
Player 1  L  2, 1    0, 0
          M  0, 2    0, 1
          R  1, 3    1, 3
To determine whether these Nash equilibria are subgame perfect, we should define the game’s subgames
The game in consideration has no subgames since a subgame should begin at a decision node that is a
singleton
– So player 1 should not be induced to play R by 2’s threat to play R′ if given the move
One way to strengthen the equilibrium concept so as to rule out the SPNE (R, R′ ) is to impose the following
requirements
Requirement 17. At each information set, the player with the move must have a belief about which node in
the information set has been reached by the play of the game.
For a non-singleton information set, a belief is a probability over the nodes in the information set
For a singleton information set, the player’s belief puts probability one on the single decision node
Requirement 18. Given their beliefs, the players’ strategies must be sequentially rational in the sense that
at each information set the action taken by the player with the move (and the player’s subsequent strategy) must
be optimal given the player’s belief at that information set and the other players’ subsequent strategies
A “subsequent strategy” is a complete plan of action covering every contingency that might arise after
the given information set has been reached
Requirement 1 implies that if the play of the game reaches player 2’s non-singleton information set then
player 2 must have a belief about which node has been reached
– Given this belief, player 2's expected payoff from playing R′ is p · 0 + (1 − p) · 1 = 1 − p
– and from playing L′ is p · 1 + (1 − p) · 2 = 2 − p
Since 2 − p > 1 − p for any value of p, Requirement 2 prevents player 2 from choosing R′
Requiring that each player have a belief and act optimally given this belief suffices to eliminate the
implausible equilibrium (R, R′ )
Requirements 1 and 2 insist that the players have beliefs and act optimally given these beliefs
But these beliefs may not be reasonable
In order to impose further requirements on the players’ beliefs, we introduce the distinction between
information sets that are on the equilibrium path and those that are off the equilibrium path
An information set is on the equilibrium path if it will be reached with positive probability when the game is played according to the equilibrium strategies
It is off the equilibrium path if it is certain not to be reached when the game is played according to the equilibrium strategies
“Equilibrium” can mean Nash, subgame perfect, Bayesian, or perfect Bayesian equilibrium
Requirement 19. At information sets on the equilibrium path, beliefs are determined by Bayes’ rule and the
players’ equilibrium strategies
Indeed, given player 1’s equilibrium strategy (namely L), player 2 knows which node in the information
set has been reached
To illustrate Requirement 3, suppose that there were a mixed-strategy equilibrium in which player 1 plays
L with probability q1 , M with probability q2 , and R with probability 1 − q1 − q2
p = q1 /(q1 + q2 )
The crucial new feature of this equilibrium concept is due to Kreps and Wilson (Econometrica 1982)
An equilibrium no longer consists of just a strategy for each player but now also includes a belief for each
player at each information set at which the player has the move
Requirement 3 imposes that players hold reasonable beliefs on the equilibrium path
We will introduce Requirement 4 which imposes that agents’ beliefs are reasonable off the equilibrium
path
Requirement 20. At information sets off the equilibrium path, beliefs are determined by Bayes’ rule and the
players’ equilibrium strategies where possible.
We will provide a more precise statement of “where possible” in each of the economic applications analyzed
subsequently
Definition 41. A perfect Bayesian equilibrium consists of strategies and beliefs satisfying Requirements 1 through 4.
This game has one subgame beginning at player 2’s singleton information set
These strategies and the belief p = 1 for player 3 satisfy Requirements 1 through 3
They also trivially satisfy Requirement 4, since there is no information set off this equilibrium path, and
so constitute a PBE
Consider the strategies (A, L, L′ ), together with the belief p = 0
– Player 3 has a belief and acts optimally given it, and players 1 and 2 act optimally given the
subsequent strategies of the other players
Thus, Requirements 1 through 3 do not guarantee that the player’s strategies are a SPNE
The problem is that player 3’s belief (p = 0) is inconsistent with player 2’s strategy, L
– but Requirements 1 through 3 impose no restrictions on 3’s belief because 3’s information set is not
reached if the game is played according to the specified strategies
Requirement 4, however, forces player 3’s belief to be determined by player 2’s strategy:
So the strategies (A, L, L′ ) and the belief p = 0 do not satisfy Requirements 1 through 4
Player 2 now has a third possible action, A′ , which ends the game
If player 1’s equilibrium strategy is A then player 3’s information set is off the equilibrium path
But now Requirement 4 may not determine 3’s belief from 2’s strategy
But if 2’s strategy is to play L with probability q1 , R with probability q2 , and A′ with probability 1−q1 −q2 ,
where q1 + q2 > 0, then Requirement 4 dictates that 3’s belief be
p = q1 /(q1 + q2 )
Concluding remarks
In a PBE, Requirements 1 and 2 are equivalent to insisting that no player’s strategy be strictly dominated
beginning at any information set
Nash and Bayesian Nash equilibrium do not share this feature at information sets off the equilibrium path
Even SPNE does not share this feature at some information sets off the equilibrium path, such as infor-
mation sets that are not contained in any subgame
In a PBE, players cannot threaten to play strategies that are strictly dominated beginning at any infor-
mation set off the equilibrium path
Such an equilibrium often cannot be constructed by working backwards through the game tree, as we did
to construct a SPNE
Requirement 2 determines a player’s action at a given information set based in part on the player’s belief
at that information set
If either Requirement 3 or 4 applies at this information set, then it determines the player's belief from the players' actions higher up the game tree
But Requirement 2 determines these actions higher up the game tree based in part on the players’ subse-
quent strategies, including the action at the original information set
This circularity implies that a single pass working backwards through the tree will not suffice to compute
a PBE
A signaling game is a dynamic game of incomplete information involving two players: a Sender (S) and a Receiver (R)
1. Nature draws a type ti for the Sender from a finite set of feasible types T = {t1 , · · · , tI } according to a probability distribution p ∈ Prob(T ) with full support, i.e., p(ti ) > 0 for every ti
2. The Sender observes ti and then chooses a message mj from a finite set of feasible messages M =
{m1 , · · · , mJ }
3. The Receiver observes mj but not ti and then chooses an action ak from a finite set of actions A =
{a1 , · · · , aK }
In many applications, the sets T , M and A are intervals on the real line, rather than finite sets
One may allow the set of feasible messages to depend on the type Nature draws
One may allow the set of feasible actions to depend on the message the Sender chooses
Job-market signaling
Corporate investment and capital structure
In Myers and Majluf’s (JFE 1984) model of corporate investment and capital structure
Monetary policy
– there could be an action by the Receiver before the Sender chooses the message in step 2
– there could be an action by the Sender after (or while) the Receiver chooses the action in step 3
the Federal Reserve has private information about its willingness to accept inflation in order to increase
employment
the type is the Fed’s willingness to accept inflation in order to increase employment
A player’s strategy is a complete plan of action:
– a strategy specifies a feasible action in every contingency in which the player might be called upon
to act
In a signaling game:
– a pure strategy for the Sender is a function ti 7→ m(ti ) specifying which message will be chosen for
each type that Nature might draw
– a pure strategy for the Receiver is a function mj 7→ a(mj ) specifying which action will be chosen for
each message that the Sender might send
In the simple signaling game depicted earlier, the Sender and the Receiver both have four pure strategies
We translate the informal statements of Requirements 1 through 3 into a formal definition of a PBE in a
signaling game
Requirement 1 is trivial when applied to the Sender since his choice occurs at a singleton information set
The Receiver, in contrast, chooses an action after observing the Sender’s message but without knowing
the Sender’s type
– There is one information set for each message the Sender might choose
– Each such information set has one node for each type Nature might have drawn
Requirement 21 (1). After observing any message mj from M , the Receiver must have a belief about which
types could have sent mj
Requirement 22 (2R). For each mj in M , the Receiver’s action a∗ (mj ) must maximize the Receiver’s expected
utility, given the belief µ(·|mj )
Requirement 2 also applies to the Sender, but the Sender has complete information
Requirement 23 (2S). For each ti in T , the Sender’s message m∗ (ti ) must maximize the Sender’s utility,
given the Receiver’s strategy a∗ (mj )
Given the Sender’s strategy ti 7→ m∗ (ti ), let Tj denote the set of types that send the message mj
Tj ≡ {ti ∈ T : m∗ (ti ) = mj }
or equivalently⁶
Tj = [m∗ ]−1 (mj )
Given a message mj ,
– if Tj is non-empty then the information set corresponding to the message mj is on the equilibrium
path
– otherwise, mj is not sent (at equilibrium) by any type and so the corresponding information set is
off the equilibrium path
For messages on the equilibrium path, one should apply Requirement 3 to the Receiver’s strategy
Requirement 24 (3). For each mj ∈ M , if there exists ti ∈ T such that m∗ (ti ) = mj , then the Receiver’s
belief at the information set corresponding to mj must follow from Bayes’ rule and the Sender’s strategy:
µ(ti|mj) = p(ti | [m∗]−1(mj)) = p(ti) / Σ_{τi∈Tj} p(τi)
⁶ Rigorously, we should write [m∗]−1({mj}).
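A minimal sketch of this on-path belief computation; the prior and the Sender's strategy below are hypothetical numbers, not from the text. Exact rationals are used so the posterior can be read off directly.

```python
from fractions import Fraction

# Hypothetical three-type example: prior p (full support) and a pure
# Sender strategy m* together determine on-path beliefs via Bayes' rule.
prior = {"t1": Fraction(1, 2), "t2": Fraction(3, 10), "t3": Fraction(1, 5)}
m_star = {"t1": "mL", "t2": "mL", "t3": "mR"}

def posterior(mj):
    Tj = [t for t, m in m_star.items() if m == mj]   # Tj = [m*]^(-1)(mj)
    total = sum(prior[t] for t in Tj)                # sum over tau_i in Tj
    return {t: prior[t] / total for t in Tj}

print(posterior("mL"))  # {'t1': Fraction(5, 8), 't2': Fraction(3, 8)}
```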
Definition 42. A pure-strategy perfect Bayesian equilibrium in a signaling game consists of strategies
– m∗ : ti 7→ m∗(ti)
– a∗ : mj 7→ a∗(mj)
and beliefs µ(·|mj) satisfying Signaling Requirements (1), (2R), (2S), and (3)
If the Sender’s strategy is pooling or separating then we call the equilibrium pooling or separating,
respectively
There are four possible pure-strategy perfect Bayesian equilibria in this two-type, two-message game
Pooling on L
Pooling on R
A simple signaling game: pooling on L
So the Receiver’s belief (p, 1 − p) at this information set is determined by Bayes’ rule and the Sender’s
strategy
The Sender's type t1 earns a payoff of 1 and type t2 earns a payoff of 2
To determine whether both “Sender types” are willing to choose L, we need to specify how the Receiver
would react to R
If the Receiver’s response to R is u, i.e., a∗ (R) = u then type t1 ’s payoff from playing R is 2, which exceeds
t1 ’s payoff of 1 from playing L
But if the Receiver’s response to R is d, i.e., a∗ (R) = d then t1 and t2 earn payoffs of 0 and 1 from playing
R, whereas they earn 1 and 2 from playing L
To get the pooling equilibrium on L, the Receiver’s response to R must be d, i.e., a∗ (R) = d
One has to check that a∗(R) = d is an optimal action with respect to the Receiver's belief at the information set corresponding to R
Observe that
Eµ(·|R) [UR (·, R, d)] = q × 0 + (1 − q) × 2 = 2(1 − q)
and
Eµ(·|R) [UR (·, R, u)] = q × 1 + (1 − q) × 0 = q
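Numerically, comparing these two expected payoffs gives the Receiver's off-path best response: d is optimal exactly when 2(1 − q) ≥ q, i.e., when q ≤ 2/3. A small sketch using the payoff numbers from the example:

```python
# Receiver's expected payoffs at the off-path information set R,
# given belief (q, 1 - q):
#   E[U_R | d] = q*0 + (1-q)*2 = 2(1-q),   E[U_R | u] = q*1 + (1-q)*0 = q.

def best_response_at_R(q: float) -> str:
    payoff_d = 2 * (1 - q)
    payoff_u = q
    return "d" if payoff_d >= payoff_u else "u"

print(best_response_at_R(0.5))  # d
print(best_response_at_R(0.9))  # u
```

So the pooling-on-L equilibrium can be supported by any off-path belief with q ≤ 2/3.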
m∗ (t) = L, ∀t ∈ {t1 , t2 }
a∗(m) = u if m = L, and a∗(m) = d if m = R
and the beliefs m 7→ µ(·|m) defined by
µ(·|m) = (0.5, 0.5) if m = L, and µ(·|m) = (q, 1 − q) if m = R
But t1 can earn 1 by playing L, since the Receiver’s best response to L is u for any value of p
Remark 27. There is no equilibrium in which the Sender plays m∗(t) = R for any t in T.
So both beliefs are determined by Bayes’ rule and the Sender’s strategy
It remains to check whether the Sender’s strategy is optimal given the Receiver’s strategy a
It is not:
– earning t2 a payoff of 2,
– which exceeds t2 ’s payoff of 1 from playing R
So both beliefs are determined by Bayes’ rule and the Sender’s strategy
a∗ (m) = u, ∀m ∈ {L, R}
Job market signaling
We restate Spence’s (QJE 1973) model as an extensive-form game and describe some of its perfect Bayesian
equilibria
1. Nature determines a worker’s productive ability, η, which can be either high H or low L. The
probability that η = H is q
2. The worker learns his or her ability and then chooses a level of education, e ≥ 0
3. Two firms observe the worker’s education but not the worker’s ability, and then simultaneously make
wage offers to the worker
4. The worker accepts the higher of the two wage offers, flipping a coin in case of a tie
Payoffs
The payoff to the firm that employs the worker is y(η, e) − w, where y(η, e) is the output of a worker with ability η who has obtained education e
The payoff to the firm that does not employ the worker is zero
Assumption on production
We allow for the possibility that output increases not only with ability but also with education:
ye(η, e) = ∂y/∂e (η, e) ≥ 0
where ye(η, e) is the marginal productivity of education for a worker of ability η at education e
Interpretation of education
Thus, the game could apply to a cohort of high school graduates, or to a cohort of college graduates or
MBAs
Under this interpretation, e measures the number and kind of courses taken and the caliber of grades and
distinctions earned during an academic program of fixed length
Tuition costs (if they exist at all) are independent of e, so the cost function c(η, e) measures non-monetary
(or psychic) costs
Students of lower ability find it more difficult to achieve high grades at a given school, and also more
difficult to achieve the same grades at a more competitive school
Firms' use of education as a signal thus reflects the fact that firms hire and pay more to the best graduates of a given school and to the graduates of the best schools
Assumption on costs
The crucial assumption in Spence’s model is that low-ability workers find signaling more costly than do
high-ability workers
More precisely, we assume that the marginal cost of education is higher for low-ability than for high-ability
workers
∀e, ce (L, e) > ce (H, e)
where ce(η, e) = ∂c/∂e (η, e) denotes the marginal cost of education for a worker of ability η at education e
Consider a worker believing that with education e1 he would get paid wage w1
We investigate the increase in wages that would be necessary to compensate this worker for an increase
in education from e1 to e2
– Low-ability workers find it more difficult to acquire the extra education and so require a larger
increase in wages to compensate them for it
∆w = w2 − w1 = ∫_{e1}^{e2} ∂c/∂e (η, e) de
The graphical statement of this assumption is that low-ability workers have steeper indifference curves
than do high-ability workers
Spence also assumes that competition among firms will drive expected profits to zero
One can build this assumption into our model by replacing the two firms in stage 3 with a single player
called the market
The market makes a single wage offer w and has the payoff
−[y(η, e) − w]2
Doing so would make the model belong to the class of one-Receiver signaling games defined previously
To maximize its expected payoff, as required by Signaling Requirement 2R, the market would offer a wage
equal to the expected output of a worker with education e, given the market’s belief about the worker’s
ability after observing e
w̃(e) = µ(H|e) × y(H, e) + [1 − µ(H|e)] × y(L, e)    (W)
where µ(H|e) is the market's assessment of the probability that the worker's ability is H
The purpose of having two firms bidding against each other in Stage 3 is to achieve the same result without
resorting to a fictitious player called the market
Firms’ beliefs
To guarantee that firms will always offer a wage equal to the worker’s expected output
We need to impose that, after observing education choice e, both firms hold the same belief about the
worker’s ability, again denoted µ(H|e)
Signaling Requirement 3 determines the belief that both firms must hold after observing a choice of e that
is on the equilibrium path
The assumption is that the firms also share a common belief after observing a choice of e that is off the
equilibrium path
Given this assumption, it follows that in any PBE the firms both offer the wage w̃(e) given in (W )
First, consider temporarily that the worker’s ability is common knowledge among all the players, rather
than privately known by the worker
Competition between the two firms in Stage 3 implies that a worker of ability η with education e earns
the wage
ŵ(η, e) = y(η, e)
lim_{e→0+} [∂y/∂e (η, e) − ∂c/∂e (η, e)] > 0
and
lim_{e→∞} [∂y/∂e (η, e) − ∂c/∂e (η, e)] < 0
Then the maximization problem has a unique solution e∗(η) satisfying
∂y/∂e (η, e∗(η)) = ∂c/∂e (η, e∗(η))
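A numerical sketch of this complete-information benchmark, under hypothetical functional forms y(η, e) = η√e and c(η, e) = e²/(2η) (our assumptions, chosen to satisfy the two limit conditions; they are not from the text):

```python
import math

# The first-order condition  dy/de = dc/de,  i.e.
#   eta / (2*sqrt(e)) = e / eta,
# is solved by bisection on the strictly decreasing difference.

def e_star(eta: float) -> float:
    f = lambda e: eta / (2 * math.sqrt(e)) - e / eta  # marginal benefit - marginal cost
    lo, hi = 1e-9, 100.0                              # f(lo) > 0, f(hi) < 0
    for _ in range(200):                              # bisection
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

# For these functional forms the closed form is e* = (eta**2 / 2) ** (2/3)
print(round(e_star(2.0), 4))  # 1.5874
```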
We propose to strengthen the assumption
Assume that
inf_{e≥0} ye(H, e) ≥ max_{e≥0} ye(L, e)
We now return to the assumption that the worker’s ability is private information
– The additional effort c[L, e∗ (H)] − c[L, e∗ (L)] needed to obtain the education level e∗ (H) is not
compensated by the additional wage w∗ (H) − w∗ (L)
– The additional effort c[L, e∗ (H)] − c[L, e∗ (L)] needed to obtain the education level e∗ (H) is compen-
sated by the additional wage w∗ (H) − w∗ (L)
The low-ability worker has no incentives to pretend being a high-ability worker by choosing e∗ (H), i.e.,
The low-ability worker has incentives to pretend being a high-ability worker by choosing e∗ (H), i.e.,
– pooling
– separating
– hybrid
Requirement 3 then implies that the firm’s belief after observing ep must be the prior belief
This in turn implies that the wage offered by the firm after observing ep must be
wp = q × y(H, ep ) + (1 − q) × y(L, ep )
1. to specify the firm's belief µ(·|e) for out-of-equilibrium education choices e ≠ ep (Requirement 1)
2. these beliefs will then determine the firm's strategy e 7→ w̃(e) through Equation (W) (Requirement 2R)
3. to show that both worker-types’ best response to the firm’s strategy w̃ is to choose e = ep (Require-
ment 2S)
Pooling equilibrium
One possibility is that the firms believe that any education level other than ep implies that the worker
has low ability
∀e ≠ ep , µ(H|e) = 0
The refinement we will introduce in a subsequent chapter will rule out the beliefs analyzed here
Consider the following example
The low-ability worker’s indifference curve through the point [e∗ (L), w∗ (L)] lies below that type’s indif-
ference curve through (ep , wp )
This implies that the education ep is optimal for the low-ability worker
The high-ability worker’s indifference curve through the point (ep , wp ) lies above the wage function w =
y(L, e)
– This implies that the education ep is optimal for the high-ability worker
– This is because the solution e∗H to the maximization problem
In the previous example, many other pooling perfect Bayesian equilibria exist
Others involve the same education choice but different beliefs off the equilibrium path
Let ê denote a level of education between ep and e′
If we substitute ep by ê, then the resulting belief and strategy for the firms, together with the strategy e(η) = ê for both worker types, form another pooling PBE
These belief and strategy for the firms and the strategy (e(L) = ep , e(H) = ep ) for the worker form a
third pooling PBE
Signaling Requirement 3 then determines the firms’ belief after observing either of these two education
levels
µ[H|e∗ (L)] = 0 and µ[H|e∗ (H)] = 1
and
w̃(e∗ (H)) = w∗ (H) = y[H, e∗ (H)]
1. to specify the firms’ belief µ(H|e) for out-of-equilibrium education choices, i.e., values of e other than
e∗ (L) and e∗ (H)
2. which then determines the rest of the firms’ strategy w̃ through Equation (W )
3. to show that the best response for a worker of ability η to the firms’ strategy w̃ is to choose e∗ (η)
Consider the belief that the worker has high ability if e is at least e∗ (H) but has low ability otherwise
µ(H|e) = 0 for e < e∗(H), and µ(H|e) = 1 for e ≥ e∗(H)
Recall that e∗ (H) is the high-ability worker’s best response to the wage function e 7→ y(H, e)
Since y(L, e) ≤ y(H, e) we get that e∗ (H) is still a best response to the wage function w̃
Recall that e∗(L) is the low-ability worker's best response to the wage function e 7→ y(L, e) on the whole real line; this implies that it is also a best response on the interval [0, e∗(H)) since e∗(L) < e∗(H)
Observe that
f′(e) = ye(H, e) − ce(L, e) ≤ ye(H, e) − ce(H, e) ≤ 0
so f is decreasing and f(e∗(H)) is the highest payoff the low-ability worker can achieve among all choices of e ≥ e∗(H)
Now the high-ability worker cannot earn the high wage y(H, ·) simply by choosing the education e∗ (H)
that he should choose under complete information
To signal his ability, the high-ability worker must choose es , where es > e∗(H) is defined by
w∗(L) − c(L, e∗(L)) = y(H, es) − c(L, es)
This is because the low-ability worker would mimic any value of e between e∗(H) and es , and would trick the firms into believing that the worker has high ability
Actually this is the only equilibrium that survives the refinement we will introduce in a subsequent chapter
We propose the following specification of the firms' out-of-equilibrium beliefs that supports this equilibrium behavior:
µ(H|e) = 0 for e < es , and µ(H|e) = 1 for e ≥ es
Let us compute the best response of the low-ability worker
We already know that e∗ (L) is a best response among all choices of e < es
One should find the worker’s best response to the firms’ strategy among all choices of e ≥ es , i.e.,
Observe that
g′(e) = ye(H, e) − ce(L, e) ≤ ye(H, e) − ce(H, e) ≤ 0 for e ≥ e∗(H)
Therefore, the worker's best response to the firms' strategy among all choices of e ≥ es is es
Since
w∗(L) − c(L, e∗(L)) = y(H, es) − c(L, es)
the low-ability worker is exactly indifferent between e∗(L) and es , so e∗(L) is a best response to the firms' strategy
– Alternatively, we could increase es by an arbitrarily small amount so that the low-ability worker would strictly prefer e∗(L)
What about the worker’s best response among all choices of e < es ?
Let π ∗ (L) be the payoff of the low-ability worker at point (e∗ (L), w∗ (L))
This is the equation of the indifference curve IL of the low-ability worker passing through (e∗ (L), w∗ (L))
This is the equation of the indifference curve IH of the high-ability worker passing through (es , w̃(es ))
By definition of es we have
W (L, es ) = W (H, es )
Observe that
∂W/∂e (H, e) − ∂W/∂e (L, e) = ce(H, e) − ce(L, e) < 0
implying that the function e 7→ W(H, e) − W(L, e) is strictly decreasing
∀e ≥ 0, W (L, e) ≥ y(L, e)
It follows that the indifference curve of the high-ability worker passing through (es , w̃(es )) is always above
the production function y(L, e), implying that any payoff among e < es is inferior to the one obtained at
es
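The threshold es can be computed numerically as the root, above e∗(H), of the indifference equation y(L, e∗(L)) − c(L, e∗(L)) = y(H, es) − c(L, es). The sketch below assumes hypothetical functional forms y(η, e) = η√e and c(η, e) = e²/(2η) with L = 1 and H = 2 (our assumptions, not the text's):

```python
import math

L, H = 1.0, 2.0
e_star = lambda eta: (eta**2 / 2) ** (2/3)        # complete-information optimum
y = lambda eta, e: eta * math.sqrt(e)
c = lambda eta, e: e**2 / (2 * eta)

target = y(L, e_star(L)) - c(L, e_star(L))        # low type's separating payoff
g = lambda e: y(H, e) - c(L, e) - target          # mimicry gain; g(e_s) = 0

# Bisection above e*(H): g(lo) > 0 and g(hi) < 0 for these parameters
lo, hi = e_star(H), 50.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)

e_s = (lo + hi) / 2
print(round(e_s, 3))
```

As the text argues, the computed e_s lies strictly above e∗(H): the high type must over-invest in education to deter mimicry.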
There are other separating equilibria that involve a different education choice by the high-ability worker
There are other separating equilibria that involve the education choices e∗ (L) and es but differ off the
equilibrium path
Hybrid equilibrium
We analyze the case of a hybrid equilibrium in which the low-ability worker randomizes
The low-ability worker randomizes between choosing eh with probability π and choosing eL with proba-
bility 1 − π
Signaling Requirement 3 then determines the firms’ belief after observing eh and eL
Since the high-ability worker always chooses eh but the low-ability worker does so only with probability π, observing eh makes it more likely that the worker has high ability, so µ(H|eh ) > q
Second, as π approaches zero, the low-ability worker almost never pools with the high-ability worker so
µ(H|eh ) approaches 1
Third, as π approaches one, the low-ability worker almost always pools with the high-ability worker so
µ(H|eh ) approaches the prior belief q
When the low-ability worker separates from the high-ability worker by choosing eL
But choosing e∗ (L) would yield the payoff of at least y[L, e∗ (L)] − c[L, e∗ (L)]
For the low-ability worker to be willing to randomize between separating at e∗(L) and pooling at eh , the wage wh ≡ w̃(eh ) must make that worker indifferent between the two
Recall that Equation (W ) and the definition of the belief µ(·|eh ) imply
wh = [q/(q + (1 − q)π)] × y(H, eh ) + [(1 − q)π/(q + (1 − q)π)] × y(L, eh )
For a given value of eh , if Equation (P ) yields wh < y(H, eh ) then there is a unique possible value for π
consistent with a hybrid equilibrium in which the low-ability worker randomizes between e∗ (L) and eh
If wh > y(H, eh ), then there does not exist a hybrid equilibrium involving eh
Observe that Equation (P ) yields wh < y(H, eh ) if and only if eh < es where es is the education chosen
by the high-ability worker in the separating equilibrium
r × y(H, eh ) + (1 − r) × y(L, eh ) = wh
π = q(1 − r) / [r(1 − q)]
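The algebra linking the belief r = µ(H|eh) to the randomization probability π can be checked with exact rationals (the numbers below are hypothetical):

```python
from fractions import Fraction

def pi_from_belief(q, r):
    # From Bayes' rule r = q / (q + (1 - q) * pi), solve for pi:
    #   pi = q(1 - r) / (r(1 - q)),
    # which lies in (0, 1] exactly when r lies in [q, 1).
    return q * (1 - r) / (r * (1 - q))

q, r = Fraction(1, 2), Fraction(4, 5)
pi = pi_from_belief(q, r)
print(pi)                      # 1/4
print(q / (q + (1 - q) * pi))  # 4/5  (Bayes' rule reproduces r)
```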
The separating equilibrium described previously is the limit of the hybrid equilibria considered here
To complete the description of the hybrid PBE, we should define the firms' beliefs off the equilibrium path and check the workers' best responses
The firms' strategy is then
w̃(e) = y(L, e) for e < eh , and w̃(e) = r × y(H, e) + (1 − r) × y(L, e) for e ≥ eh
Consider an entrepreneur who has started a company but needs outside financing to undertake an attrac-
tive new project
The entrepreneur has private information about the profitability of the existing company
The payoff of the new project cannot be disentangled from the payoff of the existing company
Suppose the entrepreneur offers a potential investor an equity stake in the firm in exchange for the
necessary financing
Suppose that the profit of the existing company can be either high or low: π ∈ {H, L} with H > L > 0
The potential investor’s opportunity cost is r, i.e., there is an alternative investment possibility with rate
of return r
The new project is attractive in the sense that the NPV is positive, i.e., R > I(1 + r)
2. The entrepreneur learns π and then offers the potential investor an equity stake s, where 0 ≤ s ≤ 1
3. The investor observes s but not π and then decides either to accept or to reject the offer
4. Payoffs:
If the investor rejects the offer then the investor’s payoff is I(1 + r) and the entrepreneur’s payoff is
π
If the investor accepts s then the investor’s payoff is s(π + R) and the entrepreneur’s is (1 − s)(π + R)
Suppose that after receiving the offer s the investor believes that the probability that π = L is q(s)
The entrepreneur prefers to receive the financing at the cost of an equity stake of s if and only if
s ≤ R/(π + R)    (PC-E)
In a pooling PBE, the investor’s belief must be q(spo ) = p after receiving the equilibrium offer spo
I(1 + r) / [pL + (1 − p)H + R] ≤ R/(H + R)    (NC-p)
If p is close enough to one, however, the necessary condition (NC-p) holds only if
R − I(1 + r) ≥ [I(1 + r)/R] × (H − L)    (sNC-p)
In a pooling equilibrium, the high-profit type must subsidize the low-profit type
Setting q(spo ) = p yields that the investor agrees to finance the project if and only if
spo ≥ I(1 + r) / [pL + (1 − p)H + R] > I(1 + r) / (H + R)
If the investor were certain that π = H, then he would accept the smaller equity stake
s_H^{sy} = I(1 + r) / (H + R)
The larger equity stake required in a pooling equilibrium may be so expensive that the high-profit firm
would prefer to forego the new project
A pooling equilibrium exists if p is close to zero, so that the cost of subsidization is small
Or if the profit from the new project outweighs the cost of subsidization
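A sketch of this feasibility check with hypothetical parameter values (our numbers, not the text's): the investor accepts any stake weakly above its break-even level, and the high-profit type participates only if s ≤ R/(H + R).

```python
# Pooling is feasible when the investor's minimum acceptable stake does
# not exceed the largest stake the high type is willing to give up.

def pooling_feasible(p, L, H, R, I, r) -> bool:
    s_min = I * (1 + r) / (p * L + (1 - p) * H + R)  # investor accepts s >= s_min
    s_max_H = R / (H + R)                            # high type accepts s <= s_max_H
    return s_min <= s_max_H

# Hypothetical numbers; the project has positive NPV since R > I(1 + r)
L, H, R, I, r = 10.0, 100.0, 60.0, 40.0, 0.1
print(pooling_feasible(0.9, L, H, R, I, r))  # False: p near one, costly subsidy
print(pooling_feasible(0.1, L, H, R, I, r))  # True: p near zero, cheap subsidy
```

This matches the comparative statics in the text: pooling survives when p (the probability of the low type) is small, so the high type's subsidy to the low type is small.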
In such an equilibrium, investment is inefficiently low: the new project is certain to be profitable, but the
high-profit type foregoes the investment
Financing terms that are attractive to the high-profit type are even more attractive to the low-profit type
As Myers and Majluf (J. Fin. Econ. 1984) observe, the forces in this model push firms toward either debt or internal sources of funds
Actually, Myers and Majluf analyze a large firm (with shareholders and a manager) rather than an
entrepreneur (who is both the manager and the sole shareholder)
We consider the possibility that the entrepreneur can offer debt as well as equity
If the entrepreneur does not declare bankruptcy then the investor’s payoff is D and the entrepreneur’s is
π+R−D
If the entrepreneur does declare bankruptcy then the investor’s payoff is π + R and the entrepreneur’s is
zero
Since L > 0, there is always a pooling equilibrium: both profit-types offer the debt contract D = I(1 + r),
which the investor accepts
If L were sufficiently negative that R + L < I(1 + r), then the low-profit type could not repay this debt
so the investor would not accept the contract
A similar argument would apply if L and H represented expected rather than certain profits
Suppose that the type π means that the existing company's profit will be π + K or π − K, each with probability 1/2
If L − K + R < I(1 + r) then there is probability 1/2 that the low profit type will not be able to repay
the debt D = I(1 + r) so the investor will not accept the contract
Monetary policy
Consider a sequential-move game in which employers and workers negotiate nominal wages
After the negotiation, the monetary authority chooses the money supply, which in turn determines the
rate of inflation
If wage contracts cannot be perfectly indexed, employers and workers will try to anticipate inflation in
setting the wage
Once an imperfectly indexed nominal wage has been set, actual inflation above the anticipated level of
inflation will erode the real wage
The monetary authority therefore faces a trade-off between the costs of inflation and the benefits of
reduced unemployment and increased output that follow from surprise inflation
We follow Barro and Gordon (J. Mon. Econ. 1983) and analyze a reduced-form version of this model in the following game
First, employers form an expectation of inflation, π e . Second, the monetary authority observes this expectation and chooses actual inflation, π
The payoff to employers is −(π − π e )2 : employers simply want to anticipate inflation correctly; they achieve their maximum payoff when π = π e
The monetary authority would like inflation to be zero but output (y) to be at its efficient level (y ∗ )
U (π, y) = −cπ 2 − (y − y ∗ )2
where the parameter c > 0 reflects the monetary authority’s trade-off between its two goals
Suppose the actual output is the following function of target output and surprise inflation
ỹ(π, π e ) = by ∗ + d(π − π e )
We first compute the monetary authority’s optimal choice of π given employers’ expectation π e :
π∗(π e ) = [d/(c + d²)] × [(1 − b)y∗ + dπ e ]
Since employers anticipate that the monetary authority will choose π ∗ (π e ), employers choose π e to maxi-
mize −[π ∗ (π e ) − π e ]2 , which yields π ∗ (π e ) = π e , or
π e = d(1 − b)y∗/c ≡ πs
In this subgame-perfect outcome, the monetary authority is expected to inflate and does so
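A minimal numerical sketch of this one-shot outcome, with hypothetical parameter values:

```python
# Monetary authority's best response and the rational-expectations
# fixed point pi_s of the Barro-Gordon stage game.

def pi_star(pi_e, c, b, d, y_star):
    # argmax over pi of  -c*pi**2 - (y_tilde - y_star)**2,
    # where y_tilde = b*y_star + d*(pi - pi_e)
    return d / (c + d**2) * ((1 - b) * y_star + d * pi_e)

c, b, d, y_star = 1.0, 0.5, 1.0, 2.0
pi_s = d * (1 - b) * y_star / c        # fixed point: pi*(pi_s) = pi_s
print(pi_s)                            # 1.0
print(pi_star(pi_s, c, b, d, y_star))  # 1.0  (confirms the fixed point)
```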
We consider a two-period version of the previous model and we add private information
In the two-period model, each player’s payoff is the sum of the player’s one-period payoffs
where πt is actual inflation in period t and πte is employers’ expectation (at end of period t−1 or beginning
of period t) of inflation in period t
We now assume that the parameter c is privately known by the monetary authority: c ∈ {S, W }
3. The monetary authority observes π1e and then chooses actual first-period inflation, π1
4. Employers observe π1 but not c, and then form π2e , their expectation of second-period inflation
5. The monetary authority observes π2e and then chooses actual second-period inflation, π2
Employers’ first-period expectation of inflation and the monetary authority’s second-period choice of
inflation precede and follow the signaling game
If the monetary authority’s type is c then its optimal choice of π2 given the expectation π2e is
π2∗(π2e , c) ≡ [d/(c + d²)] × [(1 − b)y∗ + dπ2e ]
If employers begin the second period believing that the probability that c = W is q, then they will form
the expectation π2e (q) that maximizes
In a pooling equilibrium, both types choose the same first-period inflation π1∗
On the equilibrium path, employers begin the second period believing that the probability that c = W is
p and so form the expectation π2e (p)
Then the monetary authority of type c chooses its optimal second-period inflation given this expectation,
namely π2∗ [π2e (p), c], thus ending the game
In a separating equilibrium, the two types choose different first-period inflation levels, say πW and πS
After observing πW , employers begin the second period believing that c = W and so form the expectation
π2e (1) solution of the equation
π2e (1) = π2∗(π2e (1), W ), i.e., π2e (1) = d(1 − b)y∗/W
In equilibrium, the weak type then chooses π2∗ [π2e (1), W ] and the strong type π2∗ [π2e (0), S], ending the game
3. in particular, to check that neither type has an incentive to mimic the other's equilibrium behavior
The weak type might be tempted to choose πS in the first period, thereby inducing π2e (0) as the employers' second-period expectation
Even if πS is uncomfortably low for the weak type, the ensuing expectation π2e (0) might be so low that
the weak type receives a huge payoff from the unanticipated inflation
In a separating equilibrium, the strong type’s first period inflation must be low enough that the weak
type is not tempted to mimic the strong type
But the Sender’s messages are just talk: costless, non-binding, non-verifiable claims
– a worker who simply announced “My ability is high” would not be believed
Stein (Am. Econ. Rev. 1989) shows that policy announcements by the Federal Reserve can be informative
but cannot be too precise
Matthews (Quarterly J. Econ. 1989) studies how a veto threat by the president can influence which bill
gets through Congress
One can also ask how to design environments to take advantage of cheap talk
Farrell and Gibbons (1991) show that in some settings unionization improves social welfare because it
facilitates communication from the work force to management
In Spence’s job market model, cheap talk cannot be informative because all the Sender’s types have the
same preferences over the Receiver’s possible actions:
Let’s illustrate why uniformity of preferences over the Receiver’s possible actions vitiates cheap talk
Suppose there were a pure-strategy equilibrium in which one subset of Sender-types, T1 , sends one message, m1 , while another subset, T2 , sends another message, m2
In equilibrium, the Receiver will interpret mi as coming from Ti and so will take the optimal action given
this belief; denote this action by ai
– If one type prefers a1 to a2 , then all types have this preference and will send m1 rather than m2
3. the Receiver's preferences over actions must not be completely opposed to the Sender's preferences
Suppose that the Receiver prefers low actions when the Sender’s type is low and high actions when
the Sender’s type is high
If low Sender-types prefer low actions and high types high actions, then communication can occur
If the Sender has the opposite preference then communication cannot occur because the Sender would
like to mislead the Receiver
Crawford and Sobel (Econometrica 1982) analyze an abstract model that satisfies these three necessary
conditions: they show that
– more communication can occur through cheap talk when the players’ preferences are more closely
aligned
– perfect communication cannot occur unless the players’ preferences are perfectly aligned
Each of the economic applications (cheap talk by the Fed, veto threats, information transmission in debate, union voice) involves a complicated model of an economic environment
An abstract cheap-talk game
The timing of the simplest cheap-talk game is identical to the timing of a signaling game; only payoffs differ
1. Nature draws a type ti for the Sender from a set T = {t1 , ..., tI } of feasible types according to a probability
distribution p with full support, i.e., p(t) > 0 for every t ∈ T
2. The Sender observes ti and then chooses a message mj from a set of feasible messages M = {m1 , ..., mJ }
3. The Receiver observes mj (but not ti ) and then chooses an action ak from a set of feasible actions A = {a1 , ..., aK }
The key feature of such a game is that the message has no direct effect on either the Sender’s or the
Receiver’s payoff
The only way the message can matter is through its information content
By changing the Receiver’s belief about the Sender’s type, a message can change the Receiver’s action
Because the simplest cheap-talk and signaling games have the same timing, the definitions of PBE in the two games are identical
If the Receiver will ignore all messages then pooling is a best response for the Sender; and if the Sender
is pooling then a best response for the Receiver is to ignore all messages
More formally, let a denote the Receiver's optimal action in a pooling equilibrium, i.e., a solves
max_{ak∈A} Σ_{ti∈T} p(ti ) UR (ti , ak )
The interesting question therefore is whether non-pooling equilibria exist
The payoffs are given in the following table (this is not a game in normal form!)
                    Sender's type
                    tL       tH
Receiver's    aL    x, 1     y, 0
action        aH    z, 0     w, 1
The first payoff in each cell is the Sender’s and the second the Receiver’s
To illustrate the first necessary condition, suppose both Sender-types have the same preferences over
actions
To illustrate the third necessary condition, suppose the players’ preferences are completely opposed
The type, message and action spaces are continuous
When the Sender's type is t, the Receiver's optimal action is a = t but, according to the Sender's preferences, the optimal action is⁷ a = t + b
Different Sender-types have different preferences over the Receiver’s actions (higher types prefer higher
actions)
We will prove the existence of partially pooling equilibria of the following form
All the types in a given interval send the same message, but types in different intervals send different
messages
Given the value of the preference-similarity parameter b, there is a maximum number of intervals (or
“steps”) that can occur in equilibrium
This maximum number is denoted by n∗ (b), and partially pooling equilibria exist for each n ∈ {1, 2, ..., n∗ (b)}
n∗(b) approaches infinity as b approaches zero: perfect communication cannot occur unless the players' preferences are perfectly aligned
We characterize these partially pooling equilibria, starting with a two-step equilibrium, i.e., n = 2
Suppose all the types in [0, x1 ) send one message while those in [x1 , 1] send another
⁷ Actually, it is min{1, t + b}.
After receiving the message from the types in [0, x1 ), the Receiver will believe that the Sender's type is uniformly distributed on [0, x1 ), so the Receiver's optimal action will be x1 /2
After receiving the message from the types in [x1 , 1], the Receiver's optimal action will be (x1 + 1)/2
For the types in [0, x1 ) to be willing to send their message, it must be that all these types prefer the action
x1 /2 to the action (x1 + 1)/2
The Sender-type t
– prefers x1 /2 to (x1 + 1)/2 if the midpoint between these two actions exceeds that type’s optimal
action, t + b
– prefers (x1 + 1)/2 to x1 /2 if t + b exceeds the midpoint
For a two-step equilibrium to exist, x1 must be the type t whose optimal action t + b exactly equals the
midpoint between the two actions
x1 + b = (1/2) [ x1 /2 + (x1 + 1)/2 ]
or x1 = (1/2) − 2b
For b ≥ 1/4 the players’ preferences are too dissimilar to allow even this limited communication
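The boundary condition can be checked numerically. The sketch below is our own (function names are ours), assuming, as in the uniform–quadratic specification, that each Sender-type's payoff is a symmetric quadratic loss around its ideal action t + b:

```python
# Two-step cheap-talk equilibrium: boundary-type indifference check (sketch)

def boundary_type(b):
    """Boundary x1 of the two-step partition: x1 = 1/2 - 2b (valid when b < 1/4)."""
    return 0.5 - 2 * b

def loss(t, b, a):
    """Quadratic loss of Sender-type t (ideal action t + b) from action a."""
    return (a - (t + b)) ** 2

b = 0.1
x1 = boundary_type(b)        # 0.3
low_action = x1 / 2          # Receiver's action after the message from [0, x1)
high_action = (x1 + 1) / 2   # Receiver's action after the message from [x1, 1]

# The boundary type is exactly indifferent between the two induced actions...
assert abs(loss(x1, b, low_action) - loss(x1, b, high_action)) < 1e-12
# ...and for b >= 1/4 the formula gives x1 <= 0: no two-step equilibrium exists
assert boundary_type(0.25) <= 0
```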
We still have to address the issue of messages that are off the equilibrium path
Let the Sender’s strategy be that all types t < x1 send the message 0
Let the Receiver’s out-of-equilibrium belief after observing any message from (0, x1 ) be that t is uniformly
distributed on [0, x1 )
And after receiving any message from (x1 , 1] be that t is uniformly distributed on [x1 , 1]
To make the boundary type xk indifferent between the steps [xk−1 , xk ) and [xk , xk+1 ), each step must be
4b longer than the previous one
Therefore, if the first step has length d, the n step-lengths d, d + 4b, ..., d + 4(n − 1)b must sum to one:
n × d + n(n − 1) × 2b = 1 (NC)
Given any n such that n(n − 1) × 2b < 1, there exists a value of d that solves (NC)
The largest possible number of steps in such an equilibrium, n∗ (b), is the largest value n such that
n(n − 1) × 2b < 1
i.e., the largest integer strictly smaller than (1/2) [ 1 + √(1 + 2/b) ]
Observe that n∗ (b) = 1 for b ≥ 1/4: no communication is possible if the players’ preferences are too
dissimilar
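The bound n∗ (b) can be sketched numerically (helper names below are our own):

```python
import math

def n_star(b):
    """Largest number of steps n with n(n-1)*2b < 1 (a sketch of n*(b))."""
    n = 1
    while (n + 1) * n * 2 * b < 1:
        n += 1
    return n

def n_star_closed(b):
    """Closed form: largest integer strictly below (1/2)(1 + sqrt(1 + 2/b))."""
    return math.ceil(0.5 * (1 + math.sqrt(1 + 2 / b))) - 1

assert n_star(0.25) == 1          # preferences too dissimilar: babbling only
assert n_star(0.1) == 2
for b in (0.3, 0.25, 0.1, 0.05, 0.01, 0.001):
    assert n_star(b) == n_star_closed(b)
```

Note that n∗ (b) grows without bound only as b → 0, in line with the remark below.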
Moreover, n∗ (b) approaches infinity only as b approaches zero: perfect communication cannot occur unless
the players’ preferences are perfectly aligned
The amount that union members earn if not employed, called union’s reservation wage, is denoted by wr
– The firm might have superior knowledge concerning new products in the planning stage
2. If the firm rejects this offer the game proceeds to the second period
A more realistic model might allow the bargaining to continue until an offer is accepted
Or might force the parties to submit to binding arbitration after a prolonged strike
We refer to Sobel and Takahashi (Rev. Econ. Stud. 1983) for an infinite-horizon analysis
We begin by sketching the unique PBE of this game
If its first-period offer is rejected, the union updates its belief about the firm’s profit
w2∗ = π1∗ /2 = [(2 − δ)/(2(4 − 3δ))] πH < w1∗
If the firm’s profit, π, exceeds w2∗ then the firm accepts the offer; otherwise, it rejects it
We will refer interchangeably to one firm with many possible profit types and to many firms each with
its own profit level
The union’s second-period belief reflects the fact that high-profit firms accepted the first-period offer
In equilibrium, low-profit firms tolerate a one-period strike in order to convince the union that they are
low-profit and so induce the union to offer a lower second-period wage
Firms with very low profits find even the lower second-period offer intolerably high and so reject it, too
In this simplified game, the union has the move at three information sets: the union’s strategy consists
of three wage offers
These three moves occur at three non-singleton information sets, at which the union’s beliefs are denoted
respectively
In the full game, a strategy for the union is a
1. first-period offer w1
2. a second-period offer function w1 ↦ w̃2 (w1 ) that specifies the offer w2 to be made after each possible
offer w1 is rejected
There is one second-period information set for each different first-period wage offer the union might make
So there is a continuum of such information sets, rather than two in the simplified game
With both the lone first-period and the continuum of second-period information sets, there is one decision
node for each possible value of π (so a continuum of such nodes, rather than two for the simplified game)
At each information set, the union’s belief is a probability distribution over these nodes
The union’s second-period belief, after observing the first-period offer w1 has been rejected, is denoted by
µ2 (·|w1 )
Let A1 (w1 |π) equal one if the firm would accept the first-period offer w1 when its profit is π, and zero if
the firm would reject w1 under these circumstances
Let A2 (w2 |π, w1 ) equal one if the firm would accept the second-period offer w2 when its profit is π and
the first-period offer was w1 , and zero if the firm would reject w2 under these circumstances
and
A2 : (w2 , w1 , π) ↦ A2 (w2 |π, w1 ) ∈ {0, 1}
Since the firm has complete information throughout the game, its beliefs are trivial
The strategies (w̃1 , w̃2 ) and (A1 , A2 ), and the beliefs (µ1 , µ2 ) form a PBE if they satisfy Requirements 2,
3 and 4
The simplest step of the argument is to apply Requirement 2 to the firm’s second-period decision A2 (w2 |π, w1 )
Since this is the last move of the game, the optimal decision for the firm is to accept w2 if and only if
π ≥ w2 ; the value of w1 is irrelevant
A2 (w2 |π, w1 ) = 1 if π ≥ w2 , and 0 if π < w2
Given the strategy A2 , we can apply Requirement 2 to the union’s second-period choice of a wage offer
w2 should maximize the union’s expected payoff, given the union’s belief µ2 and the firm’s subsequent
strategy A2
Suppose the union believes that the firm’s profit is uniformly distributed on [0, π1 ], where for the moment
π1 is arbitrary
where
Prob{firm accepts w} = (π1 − w)/π1
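Given a belief uniform on [0, π1 ], the union's second-period offer maximizes w × (π1 − w)/π1 , which yields w2 = π1 /2, consistent with w2∗ = π1∗ /2 below. A brute-force sketch (names are ours):

```python
# Union's second-period problem: max_w  w * (pi1 - w) / pi1  (illustrative sketch)

def expected_wage_bill(w, pi1):
    """Union's expected second-period payoff when it offers w."""
    return w * (pi1 - w) / pi1

pi1 = 1.0
grid = [i / 10000 for i in range(10001)]                    # offers in [0, pi1]
w_best = max(grid, key=lambda w: expected_wage_bill(w, pi1))

# the grid optimum agrees with the first-order condition w2 = pi1 / 2
assert abs(w_best - pi1 / 2) < 1e-3
```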
Assume that the union offers w1 in the first period and the firm expects the union to offer w2 in the
second period
– π − w1 from accepting w1
– δ(π − w2 ) from rejecting w1 and accepting w2
– zero from rejecting both offers
π ≥ (w1 − δw2 )/(1 − δ) ≡ π ∗ (w1 , w2 )
Thus for arbitrary values of w1 and w2 , firms with π ≥ max{π ∗ (w1 , w2 ), w1 } will accept w1 and the other
firms will reject
Since Requirement 2 dictates that the firm act optimally given the players’ subsequent strategies, we can
derive A1 (w1 |π) by replacing the arbitrary wage w2 by w̃2 (w1 ), i.e.,
A1 (w1 |π) = 1 if π ≥ max{π ∗ (w1 , w̃2 (w1 )), w1 }, and 0 if π < max{π ∗ (w1 , w̃2 (w1 )), w1 }
We can derive µ2 , the union’s second-period belief at the information set reached if the first period offer
w1 is rejected
Requirement 4 dictates that the union’s belief be determined by Bayes’ rule and the firm’s strategy
Thus, given the first part of the firm’s strategy A1 just derived
The union’s belief must be that the types remaining in the second period are uniformly distributed on
[0, π̂1 (w1 , w̃2 )] where
π̂1 (w1 , w̃2 ) ≡ max{π ∗ (w1 , w̃2 (w1 )), w1 }
It follows that w̃2 (w1 ) solves the implicit equation for w2 as a function of w1 :
w̃2 (w1 ) = w1 /(2 − δ)
Therefore, the union's second-period belief at the information set reached if the first-period offer w1 is
rejected is that the types remaining in the second period are uniformly distributed on [0, π̃(w1 )). In summary:
A1 (w1 |π) = 1 ⇔ π ≥ π̃(w1 ) = 2w1 /(2 − δ)
µ2 (·|w1 ) = [1/π̃(w1 )] λ[0,π̃(w1 ))
w̃2 (w1 ) = w1 /(2 − δ)
A2 (w2 |π, w1 ) = 1 ⇔ π ≥ w2
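As a consistency check: substituting w̃2 (w1 ) into π ∗ gives exactly 2w1 /(2 − δ), and since this exceeds w1 it is the binding acceptance cutoff. A numeric sketch (helper names are ours):

```python
# Consistency check of the bargaining thresholds (sketch; symbols follow the text)

def pi_star(w1, w2, delta):
    """Profit level making the firm indifferent between accepting w1 and waiting."""
    return (w1 - delta * w2) / (1 - delta)

def w2_tilde(w1, delta):
    """Union's second-period offer after w1 is rejected."""
    return w1 / (2 - delta)

def pi_tilde(w1, delta):
    """Acceptance threshold: A1(w1|pi) = 1 iff pi >= 2*w1/(2 - delta)."""
    return 2 * w1 / (2 - delta)

for delta in (0.1, 0.5, 0.9):
    for w1 in (0.2, 0.5, 0.8):
        # pi*(w1, w2~(w1)) collapses to 2*w1/(2-delta), which exceeds w1
        assert abs(pi_star(w1, w2_tilde(w1, delta), delta) - pi_tilde(w1, delta)) < 1e-12
        assert pi_tilde(w1, delta) > w1
```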
where Π2 (π1 ) is the discounted value of the union's second-period payoff, conditional on the firm rejecting
the offer w1 , i.e.,
Π2 (π1 ) = δw̃2 (w1 ) × µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ]
Observe that
µ1 {A1 (w1 |·) = 1} = µ1 {π ≥ π̃(w1 )} = (πH − π̃(w1 ))/πH
Observe that
µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ] = µ2 ([w̃2 (w1 ), πH ]|w1 )
If its first period offer is rejected, the union updates its belief about the firm’s profit: the union believes
that π is uniformly distributed on [0, π1∗ ]
w2∗ = π1∗ /2 = [(2 − δ)/(2(4 − 3δ))] πH < w1∗
If the firm’s profit, π, exceeds w2∗ then the firm accepts the offer; otherwise, it rejects it
Any finitely repeated game based on this stage game has a unique SPNE
– The Nash equilibrium of the stage game is played in every stage, after every history
A great deal of experimental evidence suggests that cooperation occurs frequently during finitely repeated
Prisoners’ Dilemmas
Kreps, Milgrom, Roberts, and Wilson (J. Econ. Theory 1982) show that a reputation model offers an
explanation of this evidence
Rather than assume that one player has private information about his or her payoffs
We will assume that the player has private information about his or her feasible strategies
We will assume that with probability p the Row player can play only the Tit-for-Tat strategy
– This strategy begins the repeated game by cooperating and thereafter mimics the opponent’s previous
play
While with probability 1 − p the Row player can play any of the strategies available in the complete-
information repeated game (including Tit-for-Tat)
Under this formulation, if the Row player ever deviates from the Tit-for-Tat strategy then it becomes
common knowledge that Row is rational
– i.e., even if the Column player has only a tiny suspicion that the Row player might not be rational
KMRW show that there is an upper bound on the number of stages in which either player finks in
equilibrium
This upper bound depends on p and on the stage-game payoffs but not on the number of stages in the
repeated game
Thus, in any equilibrium of a long enough repeated game, the fraction of stages in which both players
cooperate is large
1. If the Row player deviates from Tit-for-Tat then it becomes common knowledge that Row is rational
2. Given an assumption on the stage-game payoffs to be imposed below, the Column player’s best response
against Tit-for-Tat would be to cooperate until the last stage of the game
Rather than assume that p is small and analyze long repeated games
We will assume that p is large enough that there exists an equilibrium in which both players cooperate in
all but the last two stages of a (possibly short) repeated game
The timing is
3. Row and Column play the Prisoners’ Dilemma for a second and last time
Row \ Column     Cooperate     Fink
Cooperate        1, 1          b, a
Fink             a, b          0, 0
Recall that finking (F) strictly dominates cooperating (C) in the stage game, both for rational Row and
for Column
Since Column will surely fink in the last stage of this two-period game of incomplete information, there is
no reason for the rational Row to cooperate in the first stage
This move is then mimicked by Tit-for-Tat in the second period
p · 1 + (1 − p) · b
Since Tit-for-Tat and the rational Row choose different moves in the first period
Column will begin the second period knowing whether Row is Tit-for-Tat or rational
p · a + (1 − p) · 0
This reflects Column’s uncertainty about Row’s type when deciding whether to cooperate or fink in the
first period
p · a + (1 − p) · 0
p + (1 − p)b ≥ 0 (C-1)
If Column and the rational Row both cooperate in the first period
Then the equilibrium path for the second and third periods will be given by the equilibrium of the previous
two-period game with X = C
We will derive sufficient conditions for Column and the rational Row to cooperate in the first period and
get the following three-period path, called “cooperation equilibrium”
In this equilibrium
The payoff to the rational Row is 1 + a
[p · 1 + (1 − p) · 1] + [p · 1 + (1 − p)b] + [p · a + (1 − p) · 0] = 1 + p + (1 − p)b + pa
Thus, the total payoff to the rational Row from finking in the first period is a
The rational Row has no incentive to deviate from the strategy of the cooperation equilibrium
Having finked in the first period, Column must then decide whether to fink or cooperate in the second
period
If Column finks in the second period, then Tit-for-Tat will fink in the third period
Column’s payoff from this deviation is a
This is less than Column’s expected payoff in the cooperation equilibrium provided that
1 + p + (1 − p)b + pa ≥ a
Given (C-1), a sufficient condition for Column not to play this deviation is
1 + pa ≥ a (C-2)
Alternatively, Column could deviate by finking in the first period but cooperating in the second
This is less than Column’s expected payoff in the cooperation equilibrium provided that
1 + p + (1 − p)b + pa ≥ a + b + pa
Given (C-1), a sufficient condition for Column not to play this deviation is
a+b≤1 (C-3)
Then the cooperation equilibrium is the equilibrium path of a PBE of the three-period Prisoners’ Dilemma
For a given value of p, the payoffs a and b satisfy these three conditions if they belong to the shaded region
As p approaches zero, this shaded region vanishes
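The region of stage-game payoffs satisfying (C-1)–(C-3) can be sketched on a grid (the ranges a ∈ (1, 3] and b ∈ [−2, 0) below are our own choice of plotting window, consistent with a > 1 and b < 0):

```python
# Region of (a, b) satisfying (C-1)-(C-3), and how it shrinks as p -> 0 (sketch)

def satisfies_conditions(a, b, p):
    c1 = p + (1 - p) * b >= 0      # (C-1)
    c2 = 1 + p * a >= a            # (C-2)
    c3 = a + b <= 1                # (C-3)
    return c1 and c2 and c3

def region_size(p, steps=100):
    """Fraction of a grid over a in (1, 3], b in [-2, 0) meeting all conditions."""
    count = 0
    for i in range(1, steps + 1):
        a = 1 + 2 * i / steps          # a > 1: finking tempts against a cooperator
        for j in range(1, steps + 1):
            b = -2 * j / steps         # b < 0: being finked on is costly
            count += satisfies_conditions(a, b, p)
    return count / steps ** 2

assert region_size(0.5) > region_size(0.2) > region_size(0.05)
assert region_size(0.001) < 0.01   # the region (nearly) vanishes as p -> 0
```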
We observed that in such an equilibrium no player’s strategy can be strictly dominated beginning at any
information set
We now consider two further requirements on beliefs off the equilibrium path
Since PBE prevents player i from playing a strategy that is strictly dominated beginning at any information
set, it is not reasonable for player j to believe that i would play such a strategy
To make this idea more concrete, consider the following dynamic game with incomplete information
In (L, L′ ), player 2’s information set is on the equilibrium path, so Requirement 3 dictates that p = 1
In (R, R′ ), this information set is off the equilibrium path but Requirement 4 puts no restriction on p
We thus require only that 2’s belief p make the action R′ optimal – i.e., p ≤ 1/2
The key feature of this example is that M is a strictly dominated strategy for player 1
Thus, it is not reasonable for player 2 to believe that 1 might have played M
Therefore, the PBE (R, R′ , p ≤ 1/2) is not reasonable, leaving (L, L′ , p = 1) as the only PBE satisfying
this requirement
If L were strictly dominated (for instance, if player 1's payoff of 3 were, say, 3/2)
Then the same argument would imply that it is not reasonable for p to be positive, but this would
contradict the earlier result that p must be one
In such a case, the new requirement would not restrict player 2’s out-of-equilibrium beliefs
We will require that player j should not believe that player i might have played a strategy that is strictly
dominated beginning at any information set
– We expand the game in such a way that player 2 has a move preceding 1’s move and has two choices
at this initial move
– Either end the game or give the move to 1 at 1’s information set
– Now M is not any more strictly dominated because if 2 ends the game at the initial node then L,
M , and R all yield the same payoff
Definition
The strategy s′i is strictly dominated beginning at this information set if there exists another
strategy si such that
– for every belief that i could hold at the given information set
– for each possible combination of the other players’ subsequent strategies8
8. A "subsequent strategy" is a complete plan of action covering every contingency that might arise after the given information set
has been reached.
Player i’s expected payoff from taking the action specified by si at the given information set and playing
the subsequent strategy specified by si
is strictly greater than the expected payoff from taking the action and playing the subsequent strategy
specified by s′i
Requirement (5). If possible, each player's beliefs off the equilibrium path should place zero probability
on nodes that are reached only if another player plays a strategy that is strictly dominated beginning at some
information set
The qualification “If possible” in Requirement 5 covers the case that would arise in the previous game if
R dominated both M and L (as would occur if player 1's payoff of 3 were 3/2)
In such a case, Requirement 1 dictates that player 2 have a belief, but it is not possible for this belief to
place zero probability on the nodes following both M and L
The Sender strategy (m′ , m′′ ) means that type t1 chooses a message m′ and type t2 chooses the message
m′′ , i.e., the Sender strategy m̃ = (m′ , m′′ ) is given by
m̃(t) = m′ if t = t1 , and m′′ if t = t2
The Receiver strategy (a′ , a′′ ) means that the Receiver chooses action a′ following L and a′′ following R,
i.e., the Receiver strategy ã = (a′ , a′′ ) is given by
ã(m) = a′ if m = L, and a′′ if m = R
We can check that the strategies and beliefs
The key feature of this signaling game, however, is that it makes no sense for t1 to play R
– The strategies in which t1 plays R are strictly dominated beginning at the Sender’s information set
corresponding to t1
– Showing that (R, L) and (R, R) are strictly dominated beginning at this information set amounts to
exhibiting an alternative strategy for the Sender that yields a higher payoff for t1 for each strategy
the Receiver could play
– (L, R) is such a strategy: it yields at worst 2 for t1 , whereas (R, L) and (R, R) yield at best 1
The t1 -node in the Receiver’s information set following R can be reached only if the Sender plays a strategy
that is strictly dominated
Furthermore, the t2 -node in the Receiver’s information set following R can be reached by a strategy that
is not strictly dominated beginning at an information set, namely (L, R)
Since {(L, L), (u, d), p = 0.5, q} is a PBE only if q ≥ 1/2, such an equilibrium cannot satisfy Requirement
5
Definition 43. In a signaling game, the message mj from M is dominated for type ti from T if there
exists another message mj ′ from M such that ti 's lowest possible payoff from mj ′ is greater than ti 's highest
possible payoff from mj :
min_{ak ∈ A} US (ti , mj ′ , ak ) > max_{ak ∈ A} US (ti , mj , ak )
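A minimal sketch of this dominance test on made-up numbers (the payoff table below is hypothetical, chosen to mirror the (L, R) argument above, where the dominated message yields at best 1 while the alternative yields at worst 2):

```python
# Dominated-message test from Definition 43, on hypothetical payoffs (sketch)

# U_S[(type, message)] -> payoffs over the Receiver's actions (made-up numbers)
U_S = {
    ('t1', 'L'): [2, 3],   # worst payoff from L is 2
    ('t1', 'R'): [0, 1],   # best payoff from R is 1 < 2: R is dominated for t1
    ('t2', 'L'): [1, 1],
    ('t2', 'R'): [0, 2],   # not dominated for t2: best from R (2) beats worst from L (1)
}

def is_dominated(t, m, m_alt):
    """m is dominated for type t by m_alt if min payoff from m_alt > max payoff from m."""
    return min(U_S[(t, m_alt)]) > max(U_S[(t, m)])

assert is_dominated('t1', 'R', 'L') is True
assert is_dominated('t2', 'R', 'L') is False
```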
Signaling Requirement (5). If the information set following mj is off the equilibrium path and mj is domi-
nated for type ti then (if possible) the Receiver’s belief µ(ti |mj ) should place zero probability on type ti
satisfies Signaling Requirement 5 trivially because there are no information sets off this equilibrium path
Suppose now that the Receiver’s payoffs when type t2 plays R are reversed:
Now
{(L, L), (u, d), p = 0.5, q}
So
{(L, L), (u, d), p = 0.5, q = 0}
In some games, there are perfect Bayesian equilibria that seem unreasonable but nonetheless satisfy
Requirement 5
1. the “Beer and Quiche” signaling game, which illustrates that unreasonable perfect Bayesian equilibria
can satisfy Signaling Requirement 5
2. a stronger version of Signaling Requirement 5, called the Intuitive Criterion
3. the application of the Intuitive Criterion to Spence’s job-market signaling game
The Sender’s message is the choice of whether to have beer or quiche for breakfast
The Receiver’s action is the choice of whether or not to duel with the Sender
– the wimpy type would prefer to have quiche for breakfast, the surly type would prefer to have beer
– both types would prefer not to duel with the Receiver (and care about this more than about which
breakfast they have)
– the Receiver would prefer to duel with the wimpy type but not to duel with the surly type
In this game,
{m∗ , a∗ , p = 0.1, q}
with
m∗ (t) = Quiche for every t, and a∗ (m) = not duel if m = Quiche, duel if m = Beer
This equilibrium satisfies Signaling Requirement 5, because Beer is not dominated for either Sender type
The Receiver’s belief off the equilibrium path does seem suspicious
If the Receiver unexpectedly observes Beer then the Receiver concludes that the Sender is at least as
likely to be wimpy as surly (i.e., q ≥ 1/2) even though
(a) the wimpy type cannot possibly improve on the equilibrium payoff of 3 by having Beer rather than
Quiche
(b) the surly type could improve on the equilibrium payoff of 2, by receiving the payoff of 3 that would
follow if the Receiver held a belief q < 1/2
Given (a) and (b), one might expect the surly type to choose Beer and then make the following speech:
Seeing me choose Beer should convince you that I am the surly type:
– choosing Beer could not possibly have improved the lot of the wimpy type, by (a)
– if choosing Beer will convince you that I am the surly type then doing so will improve my lot, by (b)
If such a speech is believed, it dictates that q = 0, which is incompatible with this pooling PBE
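Points (a) and (b) can be checked mechanically. The sketch below uses the standard payoff specification (preferred breakfast worth 1, avoiding the duel worth 2), which is consistent with the equilibrium payoffs of 3 and 2 quoted above but is our own concrete parametrization:

```python
# Beer-Quiche: equilibrium-dominance check (sketch; payoffs are the standard
# parametrization -- preferred breakfast worth 1, no duel worth 2)

def sender_payoff(t, m, duel):
    breakfast = 1 if (t == 'wimpy') == (m == 'Quiche') else 0
    return breakfast + (0 if duel else 2)

# Pooling equilibrium: both types have Quiche and the Receiver does not duel
U_eq = {'wimpy': sender_payoff('wimpy', 'Quiche', duel=False),   # 3
        'surly': sender_payoff('surly', 'Quiche', duel=False)}   # 2

best_from_beer = {t: max(sender_payoff(t, 'Beer', d) for d in (False, True))
                  for t in ('wimpy', 'surly')}

# (a) Beer cannot improve on the wimpy type's equilibrium payoff...
assert best_from_beer['wimpy'] < U_eq['wimpy']     # 2 < 3
# (b) ...but the surly type gains if Beer deters the duel
assert best_from_beer['surly'] > U_eq['surly']     # 3 > 2
```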
Definition 44. Given a PBE in a signaling game, the message mj from M is equilibrium-dominated
for type ti from T if ti 's equilibrium payoff, denoted by U ∗ (ti ), is greater than ti 's highest possible payoff
from mj :
U ∗ (ti ) > max_{ak ∈ A} US (ti , mj , ak )
Signaling Requirement (6). If the information set following mj is off the equilibrium path and mj is
equilibrium-dominated for type ti then (if possible) the Receiver’s belief µ(ti |mj ) should place zero probability on
type ti . This is possible provided mj is not equilibrium-dominated for all types in T
“Beer and Quiche” shows that a message mj can be equilibrium-dominated for ti without being dominated
for ti
– because in interpreting a deviation – i.e., in forming the belief µ(ti |mj ) – the Receiver asks whether the
Sender's past behavior could have been rational
– whereas backwards induction assumes that future behavior will be rational
there is an enormous number of pooling, separating and hybrid perfect Bayesian equilibria in this model
– ti = L chooses e∗ (L)
– ti = H chooses es
Because
y[L, e∗ (L)] − c[L, e∗ (L)] > w(e) − c(L, e) ∀e > es
For any PBE with e(H) = ê, ê > es , a deviation would be to choose e ∈ [es , ê)
in any equilibrium that satisfies Signaling Requirement 5, type-H’s utility must be at least
y(H, es ) − c(H, es )
is below the high-ability worker’s indifference curve through the point [es , y(H, es )]
no pooling equilibria satisfy Signaling Requirement 5
– type-H worker cannot achieve the utility y(H, es ) − c(H, es ) in such an equilibrium
no hybrid equilibrium in which the type-H worker does the randomizing satisfies Signaling Requirement 5
– the point (e, w) at which pooling occurs in such an equilibrium lies below the wage function w =
q · y(H, e) + (1 − q) · y(L, e)
no hybrid equilibrium in which the type-L worker does the randomizing satisfies Signaling Requirement 5
– the point (e, w) at which pooling occurs in such an equilibrium must be on the type-L’s indifference
curve through the point [e∗ (L), w∗ (L)]
– and so lies below the type-H’s indifference curve through the point [es , y(H, es )]
Remark 29. In this case, there is only one PBE satisfying Signaling Requirement 5.
pooling and hybrid equilibria in which the type-H worker does the randomizing can satisfy this requirement
– if the pooling occurs at a point (e, w) in the shaded region of the figure
– even the highest wage that could be paid to a worker with education e, y(H, e)
– yields an (e, w) point below the type-L’s indifference curve through the point (ep , wp )
Education choices between e′ and e′′ are not equilibrium-dominated for the type-H worker
– if such a choice convinces the firms that the worker has high ability,
– then the firms will offer the wage y(H, e)
– which will make type-H better off than in the indicated pooling equilibrium
Which in turn implies that the indicated pooling equilibrium cannot satisfy Signaling Requirement 6
Remark 30. This argument can be repeated for all the pooling and hybrid equilibria in the shaded
region in the figure
so the only PBE that satisfies Signaling Requirement 6 is the separating equilibrium previously discussed
Special Topics
Financial Instability (Bank runs)
To be written.
[(mk ≻wi ml ) ∨ (ml ≻wi mk )] ∧ ¬[(mk ≻wi ml ) ∧ (ml ≻wi mk )], ∀k, l ∈ N (1)
(mj ≻wi mk ) ∧ (mk ≻wi ml ) ⇒ (mj ≻wi ml ), ∀j, k, l ∈ N (2)
[(wk ≻mi wl ) ∨ (wl ≻mi wk )] ∧ ¬[(wk ≻mi wl ) ∧ (wl ≻mi wk )], ∀k, l ∈ N (3)
(wj ≻mi wk ) ∧ (wk ≻mi wl ) ⇒ (wj ≻mi wl ), ∀j, k, l ∈ N (4)
**Note:** In the notation above, ∨ denotes the disjunction "or", ∧ the conjunction "and", and ¬ the negation "not"
Therefore, the preference relation of ai (denoted by ≻ai ) can be represented by a permutation of the
set N .
**Example 1:** Suppose n = 3, a = w and i = 2. Then N = {1, 2, 3} and ai = w2 . The set of possible preference
relations of w2 is
Rw2 = {(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)}. (5)
The preference relation ≻w2 = (2, 1, 3), for example, means that agent w2 considers * option m2 strictly
better than option m1 . * option m2 strictly better than option m3 . * option m1 strictly better than option m3 .
That is, * whenever m2 is a feasible option for w2 , w2 will choose m2 . * w2 will choose m1 only
when m2 is not available and m1 is available. * w2 will choose m3 only when neither m2 nor m1 is
available.
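The set Rw2 in equation (5) can be generated directly (a one-off sketch):

```python
# All possible strict preference relations for w2 in Example 1 (sketch)
from itertools import permutations

R_w2 = sorted(permutations(range(1, 4)))   # all strict orders over {m1, m2, m3}

assert len(R_w2) == 6        # 3! orders, matching equation (5)
assert (2, 1, 3) in R_w2     # the relation discussed in the example
```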
**Example 2 (full description at [ams.org](http://www.ams.org/samplings/feature-column/fc-2015-03)):**
Suppose n = 4 and that the preference relations of mi and wi are represented by the i-th row of the matrices
M and W , respectively:

M = | w1 w2 w3 w4 |        W = | m4 m3 m1 m2 |
    | w1 w4 w3 w2 |            | m2 m4 m1 m3 |     (6)
    | w2 w1 w3 w4 |            | m4 m1 m2 m3 |
    | w4 w2 w3 w1 |            | m3 m2 m1 m4 |
x0 (m) = min{≻m^(i) : wi ∈ W}, ∀m ∈ M, (7)
where ≻m^(i) denotes the i-th entry of ≻m .
From this set, w chooses its preferred element, min{≻w^(i) : mi ∈ Mw^0 }, and rejects the others.
Step 0.2: Consider an arbitrary element m ∈ M. The set of elements of W that have not rejected
m is given by
Wm^0 = W \ {w ∈ W : (x0 (m) = w) ∧ (m ≠ min{≻w^(i) : mi ∈ Mw^0 })} ⊆ W. (9)
xk (m) = min{≻m^(i) : wi ∈ Wm^(k−1) }, ∀m ∈ M. (10)
From this set, w chooses its preferred element, min{≻w^(i) : mi ∈ Mw^k }, and rejects the others.
Step k.2: Consider an arbitrary element m ∈ M. The set of elements of W that have not yet
rejected m is given by
Wm^k = Wm^(k−1) \ {w ∈ W : (xk (m) = w) ∧ (m ≠ min{≻w^(i) : mi ∈ Mw^(k−1) })} ⊆ W.
Final step: This process continues until the iteration k at which xk (m) = xk−1 (m) for every m ∈ M. At this
step of the algorithm, define x(m) = xk (m) for every m ∈ M.
Python implementation:
import numpy as np

def initiate_Wmt():
    "Initialize the set of $w$'s that have not yet rejected $m$ as the set of all $w$'s"
    Wm_t = {}
    for m in N:
        Wm_t[m-1] = set()
        for w in N:
            Wm_t[m-1].add(w)
    print('Wm_t =', Wm_t)
    return Wm_t

def compute_xt(Wm_t):
    "Given the set $W_m^t$ of $w$'s that have not yet rejected $m$, compute $x_t(m)$"
    x_t = np.zeros((1, n), dtype=np.int8)
    for m in N:
        tt = n
        for w in Wm_t[m-1]:
            for ii in range(tt):
                # print(m, w, ii, tt, M[m-1, :])
                if M[m-1, ii] == w:
                    x_t[0, m-1] = w
                    tt = ii
                    break
    print('x_t =', x_t)
    return x_t

def update_Mwt(x_t):
    "Given the assignment $x_t$, compute the set of $m$'s assigned to each $w$"
    Mw_t = {}
    for w in N:
        Mw_t[w-1] = set()
        for m in N:
            if x_t[0, m-1] == w:
                Mw_t[w-1].add(m)
    print('Mw_t =', Mw_t)
    return Mw_t

def compute_xxt(Mw_t):
    "Given the set of $m$'s assigned to $w$, compute $w$'s acceptance, $xx_t$"
    xx_t = np.zeros((1, n), dtype=np.int8)
    for w in N:
        tt = n
        for m in Mw_t[w-1]:
            for ii in range(tt):
                # print(w, m, ii, tt, W[w-1, :])
                if W[w-1, ii] == m:
                    xx_t[0, w-1] = m
                    tt = ii
                    break
    print('xx_t =', xx_t)
    return xx_t

def update_Wmt(x_t, xx_t, Wmt):
    "Based on the assignment $x_t$ and the acceptances $xx_t$, update the set of $w$'s that have not yet rejected $m$"
    Wm_t = {}
    for m in N:
        Wm_t[m-1] = set()
        for w in Wmt[m-1]:
            if x_t[0, m-1] != w or m == xx_t[0, w-1]:
                Wm_t[m-1].add(w)
    print('Wm_t =', Wm_t)
    return Wm_t
print('M =')
print(M, end='\n\n')
print('W =')
print(W, end='\n\n')

# Main loop (reconstructed: the original fragment showed only the update steps,
# so the initialization and while-loop below are our own sketch, chosen to
# reproduce the iteration order of the printed output)
Wm_t = initiate_Wmt()
x_t = compute_xt(Wm_t)
Mw_t = update_Mwt(x_t)
xx_t = compute_xxt(Mw_t)
norm, it = 1, 0
while norm > 0:
    it += 1
    print('Iteration', it)
    old_xt = x_t
    Wm_t = update_Wmt(x_t, xx_t, Wm_t)  # Update $W_m^t$
    x_t = compute_xt(Wm_t)              # Update $x_t$
    Mw_t = update_Mwt(x_t)              # Update $M_w^t$
    xx_t = compute_xxt(Mw_t)            # Update $xx_t$
    norm = max(abs(x_t - old_xt)[0])    # Update norm
    print('norm =', norm, end='\n\n')
Parameter definition...
N = [1 2 3 4]
M =
[[1 2 3 4]
 [1 4 3 2]
 [2 1 3 4]
 [4 2 3 1]]
W =
[[4 3 1 2]
 [2 4 1 3]
 [4 1 2 3]
 [3 2 1 4]]
Object initialization...
Wm_t = {0: {1, 2, 3, 4}, 1: {1, 2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3, 4}}
x_t = [[1 1 2 4]]
Mw_t = {0: {1, 2}, 1: {3}, 2: set(), 3: {4}}
xx_t = [[1 3 0 4]]
Iteration 1
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3, 4}}
x_t = [[1 4 2 4]]
Mw_t = {0: {1}, 1: {3}, 2: set(), 3: {2, 4}}
xx_t = [[1 3 0 2]]
norm = 3
Iteration 2
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3}}
x_t = [[1 4 2 2]]
Mw_t = {0: {1}, 1: {3, 4}, 2: set(), 3: {2}}
xx_t = [[1 4 0 2]]
norm = 2
Iteration 3
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[1 4 1 2]]
Mw_t = {0: {1, 3}, 1: {4}, 2: set(), 3: {2}}
xx_t = [[3 4 0 2]]
norm = 1
Iteration 4
Wm_t = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[2 4 1 2]]
Mw_t = {0: {3}, 1: {1, 4}, 2: set(), 3: {2}}
xx_t = [[3 4 0 2]]
norm = 1
Iteration 5
Wm_t = {0: {3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[3 4 1 2]]
Mw_t = {0: {3}, 1: {4}, 2: {1}, 3: {2}}
xx_t = [[3 4 1 2]]
norm = 1
Iteration 6
Wm_t = {0: {3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[3 4 1 2]]
Mw_t = {0: {3}, 1: {4}, 2: {1}, 3: {2}}
xx_t = [[3 4 1 2]]
norm = 0
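As a sanity check, the final matching x = (m1 → w3, m2 → w4, m3 → w1, m4 → w2) can be verified to be stable, i.e., to admit no blocking pair. The checker below is our own addition (its helper names are not part of the algorithm above):

```python
# Stability check of the final matching x_t = [3 4 1 2] (sketch)
import numpy as np

M = np.array([[1, 2, 3, 4], [1, 4, 3, 2], [2, 1, 3, 4], [4, 2, 3, 1]])
W = np.array([[4, 3, 1, 2], [2, 4, 1, 3], [4, 1, 2, 3], [3, 2, 1, 4]])
match = {1: 3, 2: 4, 3: 1, 4: 2}          # m -> w, from the last iteration above

def rank(prefs, i, j):
    """Position of j in row i of the preference matrix (0 = most preferred)."""
    return int(np.where(prefs[i - 1] == j)[0][0])

def is_stable(match):
    partner_of_w = {w: m for m, w in match.items()}
    for m, w in match.items():
        for w_alt in range(1, 5):
            # (m, w_alt) blocks if both strictly prefer each other to their partners
            if rank(M, m, w_alt) < rank(M, m, w) and \
               rank(W, w_alt, m) < rank(W, w_alt, partner_of_w[w_alt]):
                return False
    return True

assert is_stable(match)
```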
References
J. Bertrand. Review of Recherches sur le principe mathématique de la théorie des richesses. Journal des Savants,
499, 1883.
I.-K. Cho and D. M. Kreps. Signaling games and stable equilibria. The Quarterly Journal of Economics, 102
(2):179–221, 1987.
A.-A. Cournot. Recherches sur les principes mathématiques de la théorie des richesses par Augustin Cournot.
chez L. Hachette, 1838.
D. W. Diamond and P. H. Dybvig. Bank runs, deposit insurance, and liquidity. Journal of Political Economy,
91(3):401–419, 1983.
M. P. Espinosa and C. Rhee. Efficient wage bargaining as a repeated game. The Quarterly Journal of Economics,
104(3):565–588, 1989.
J. W. Friedman. A non-cooperative equilibrium for supergames. The Review of Economic Studies, 38(1):1–12,
1971.
D. Gale and L. S. Shapley. College admissions and the stability of marriage. The American Mathematical
Monthly, 69(1):9–15, 1962.
R. Gibbons. Game Theory for Applied Economists. Princeton University Press, 1992. ISBN 9781400835881.
J. C. Harsanyi. Games with incomplete information played by "Bayesian" players, I–III. Part II. Bayesian
equilibrium points. Management Science, pages 320–334, 1968.
J. C. Harsanyi. Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points.
International Journal of Game Theory, 2(1):1–23, 1973.
D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson. Rational cooperation in the finitely repeated prisoners'
dilemma. Journal of Economic Theory, 27(2):245–252, 1982.
E. P. Lazear and S. Rosen. Rank-order tournaments as optimum labor contracts. Journal of Political Economy,
89(5):841–864, 1981.
W. Leontief. The pure theory of the guaranteed annual wage contract. Journal of Political Economy, 54(1):
76–79, 1946.
J. F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1):
48–49, 1950.
J. Sobel and I. Takahashi. A multistage model of bargaining. The Review of Economic Studies, 50(3):411–426,
1983.