
Lecture Notes (Gibbons, 1992) - Game Theory

J. Bertolai

September 26, 2017

Contents
Game Theory: An Overview
    An example
    Economic Theory and Game Theory

Ch. 1 - Static Games of Complete Information
    1.1 Normal form games and Nash equilibrium
    1.2 Applications
    1.3 Mixed strategies and existence of equilibrium

Ch. 2 - Dynamic Games of Complete Information
    2.1 Dynamic games of complete and perfect information
    2.2 Two-stage games of complete but imperfect information
    2.3 Repeated games
    2.4 Dynamic games of complete but imperfect information

Ch. 3 - Static Games of Incomplete Information
    3.1 Static Bayesian games and Bayesian Nash equilibrium
    3.2 Applications
    3.3 The Revelation Principle

Ch. 4 - Dynamic Games of Incomplete Information
    4.1 Introduction to Perfect Bayesian equilibrium
    4.2 Signaling Games
    4.3 Other applications of Signaling Games
    4.4 Refinements of Perfect Bayesian Equilibrium

Special Topics
    Financial Instability (Bank runs)
    Stable Marriages (Matching)

References
Game Theory: An Overview

An example

Remark (Game Theory). Game theory provides predictions about how individuals will behave under a given set of rules (an institution).

Consider the following game:


                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

Question 1. What can we expect about the prisoners' behavior?

• confessing is best for Prisoner 1, no matter what Prisoner 2 does

• confessing is best for Prisoner 2, no matter what Prisoner 1 does

Prediction 1. (Confess, Confess) is a good prediction of the individuals' behavior.

"Equilibrium in Dominant Strategies"

Consider the following (new) game:


                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −3,  0
             Confess        0, −3   −6, −6

                   Prisoner's Dilemma

• the best choice for Prisoner 1 is: confess, if 1 expects that 2 will not confess; not confess, if 1 expects that 2 will confess

• the best choice for Prisoner 2 is: confess, if 2 expects that 1 will not confess; not confess, if 2 expects that 1 will confess

Prediction 2. There are two good predictions for the individuals' behavior: (not Confess, Confess) and (Confess, not Confess).

"Nash Equilibria"

Consider the following (new) game:

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −3, −3
             Confess       −3, −3   −6, −6

                   Prisoner's Dilemma

• the best choice for Prisoner 1 is: not confess, whether 1 expects 2 to confess or not

• the best choice for Prisoner 2 is: not confess, whether 2 expects 1 to confess or not

Prediction 3. There is only one good prediction for the individuals' behavior:

"Nash Equilibrium": (not Confess, not Confess)

Consider the following game (the general case):

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess    a, a     b, c
             Confess        c, b     d, d

                   Prisoner's Dilemma

Question 2. What can we expect about the prisoners' behavior?



• the best response for Prisoner 1 is:

  – not confess, if 1 expects that 2 will not confess and a ≥ c
  – confess, if 1 expects that 2 will not confess and a ≤ c
  – not confess, if 1 expects that 2 will confess and b ≥ d
  – confess, if 1 expects that 2 will confess and b ≤ d

• the best response for Prisoner 2 is analogous, given the symmetry of the game

• a good prediction (a Nash equilibrium) will be:

  – (nC, nC) if a − c ≥ 0
  – (nC, C) if a − c ≤ 0 and b − d ≥ 0
  – ( C, nC) if a − c ≤ 0 and b − d ≥ 0
  – ( C, C) if b − d ≤ 0
[Figure: Nash equilibria in the (x, y) = (a − c, b − d) plane: (nC, C) and (C, nC) in the region x ≤ 0, y ≥ 0; (nC, nC) wherever x ≥ 0; (C, C) wherever y ≤ 0; both (nC, nC) and (C, C) in the region x ≥ 0, y ≤ 0.]

Formalization:

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess    a, a     b, c
             Confess        c, b     d, d

                   Prisoner's Dilemma

• the set of players (prisoners) is given by I := {1, 2}

• the strategy set of player i ∈ I is given by Si := {nC, C}





• for each (s1, s2) ∈ S1 × S2, the payoff of player i ∈ I is

  ui(s1, s2) =  a  if (si, s−i) = (nC, nC)
                b  if (si, s−i) = (nC, C)
                c  if (si, s−i) = ( C, nC)
                d  if (si, s−i) = ( C, C)

• the set of player i's best responses to the conjecture s−i is given by

  Ri(s−i) := arg max_{σ∈Si} ui(σ, s−i)

Definition 1. The strategy profile (s1, s2) is a Nash equilibrium if s1 ∈ R1(s2) and s2 ∈ R2(s1). Equivalently, each si is a fixed point of Ri ∘ R−i:

  s1 ∈ R1(R2(s1)) and s2 ∈ R2(R1(s2))
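Definition 1 can be checked mechanically. The sketch below, in the spirit of the Python examples used later in these notes, computes the best-response sets Ri and enumerates the Nash equilibria of the general 2×2 game; the function names and the test payoffs are illustrative assumptions, not part of the text.

```python
# Sketch: best responses and Nash equilibria of the general 2x2 game above.
S = ('nC', 'C')  # strategy set of each player

def u(si, s_other, a, b, c, d):
    """Payoff of a player choosing si when the other chooses s_other."""
    return {('nC', 'nC'): a, ('nC', 'C'): b,
            ('C', 'nC'): c, ('C', 'C'): d}[(si, s_other)]

def R(s_other, a, b, c, d):
    """Best-response set R_i(s_-i): the argmax over own strategies."""
    best = max(u(s, s_other, a, b, c, d) for s in S)
    return {s for s in S if u(s, s_other, a, b, c, d) == best}

def nash(a, b, c, d):
    """Profiles (s1, s2) with s1 in R1(s2) and s2 in R2(s1)."""
    return [(s1, s2) for s1 in S for s2 in S
            if s1 in R(s2, a, b, c, d) and s2 in R(s1, a, b, c, d)]

# Original Prisoner's Dilemma: (a, b, c, d) = (-1, -9, 0, -6)
print(nash(-1, -9, 0, -6))  # -> [('C', 'C')]
```

Running it on the three example games reproduces Predictions 1-3 above.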

Theorem 1 (Nash et al. (1950)). In the n-player normal-form game

  G = {S1, S2, · · · , Sn; u1, u2, · · · , un},

• if n is finite and Si is finite for every i,

then there is at least one Nash equilibrium (possibly involving mixed strategies).

Mechanism Design

Question 3. What should the prisoners' possible sentences (a, b, c and d) be when society wants them to reveal the truth (C, C) and

• no one can be imprisoned for more than 1 year without some confession (a ≥ −1);

• no one can be imprisoned for more than 2 years on the basis of witness testimony (b ≥ −2);

• no one can be imprisoned for more than 10 years on the basis of a confession (c ≥ −10 and d ≥ −10); and

• society wishes to maximize −a − b − c − d?

                    P2
               nC        C
P1   nC      a, a      b, c
     C       c, b      d, d

     Prisoner's Dilemma

That is, how should the optimal truth-revealing mechanism be designed?

• if there were evidence of both prisoners' guilt, society would choose (a, b, c, d) = (−1, −2, −10, −10)

• since there is no evidence of guilt, no one will confess the crime if (a, b, c, d) = (−1, −2, −10, −10)

[Figure: under (a, b, c, d) = (−1, −2, −10, −10), the point (x, y) = (a − c, b − d) = (9, 8) lies in the region of the diagram where (nC, nC) is the unique Nash equilibrium.]

Question 4. How can the prisoners be persuaded to confess?

• the two individuals both confess only when y = b − d ≤ 0

  – this restriction is called the incentive constraint

• sentences are bounded (a ≥ −1, b ≥ −2 and c, d ≥ −10)

  – this restriction is called the feasibility constraint

• the optimal mechanism (a, b, c, d) solves

  max_{(a,b,c,d)} −(a + b + c + d)  s.t.  a ≥ −1,  b ≥ −2,  c, d ≥ −10,  b ≤ d,

that is, it is given by m∗ := (a∗, b∗, c∗, d∗) = (−1, −2, −10, −2)
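As a sanity check, the optimization above can be brute-forced. The sketch below assumes, for simplicity, that sentences are integer numbers of years in [−10, 0] (the text allows real values, but the optimum happens to be an integer vector):

```python
from itertools import product

# Enumerate integer sentence vectors (a, b, c, d) in [-10, 0]^4 satisfying
# the feasibility constraints and the incentive constraint b <= d,
# and keep the one maximizing -(a + b + c + d).
best, best_val = None, float('-inf')
for a, b, c, d in product(range(-10, 1), repeat=4):
    if a >= -1 and b >= -2 and c >= -10 and d >= -10 and b <= d:
        val = -(a + b + c + d)
        if val > best_val:
            best, best_val = (a, b, c, d), val

print(best, best_val)  # -> (-1, -2, -10, -2) 15
```

The search confirms m∗ = (−1, −2, −10, −2): a and c hit their feasibility bounds, while the incentive constraint b ≤ d forces d up to −2.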

Observation 1. "Society designs the mechanism to induce confession (C, C), but may end up in a situation (nC, nC) worse than expected."

Proof. Under the optimal mechanism m∗ = (−1, −2, −10, −2) we have

  x = a∗ − c∗ = 9 ≥ 0 and y = b∗ − d∗ = 0 ≤ 0

and, therefore, there are two Nash equilibria: (nC, nC) and (C, C).

[Figure: the point (x, y) = (9, 0) lies on the boundary of the diagram where both (nC, nC) and (C, C) are Nash equilibria.]

Application: Bank runs in equilibrium

• There are two individuals in the economy, called depositors.

• The depositors live for 3 periods, t = 0, 1, 2:

  – initial period (date 0)
  – short run (date 1)
  – long run (date 2)

They derive utility u(c1 + c2) = c1 + c2, where

  – c1 is consumption in period t = 1
  – c2 is consumption in period t = 2

and each has an initial endowment (at t = 0) of D units of resources.

• The individuals participate in a three-period banking arrangement:

  – investment decision (date 0)
  – short run (date 1)
  – long run (date 2)

• At date zero both deposit D units of resources in the bank.

• The bank receives 2D units of resources and places them in an investment that yields:

  – 2r units of resources at date 1 if liquidated in the short run
  – 2R units of resources at date 2 if liquidated in the long run

• If both depositors withdraw at date 1,

  – the investment is liquidated in the short run
  – each depositor receives r

• If only one of the depositors withdraws at date 1,

  – the investment is liquidated in the short run
  – the depositor who withdraws at date 1 receives D
  – the depositor who withdraws at date 2 receives 2r − D

• If both depositors withdraw at date 2,

  – the investment is liquidated only in the long run
  – each depositor receives R

The payoff matrix is

                           Depositor 2
                      correr          não correr
Depositor 1  correr        r, r          D, 2r − D
             não correr    2r − D, D     R, R

             Bank Run Game ("correr" = run, i.e. withdraw at date 1; "não correr" = not run, i.e. wait until date 2)

Computing the equilibrium (or equilibria)

• Graphically:

[Figure: equilibrium regions in the (x, y) = (R − D, r − D) plane, as in the general diagram, with C = correr and nC = não correr.]

• Computationally:
def payoffs(x, y):
    """For each strategy pair s1=x and s2=y, returns
    the payoff of player 1 and the payoff of player 2."""
    if x == 'correr':
        if y == 'correr':
            z = [r, r]
        else:
            z = [D, 2*r - D]
    else:
        if y == 'correr':
            z = [2*r - D, D]
        else:
            z = [R, R]
    return z

def NE(S=('correr', 'nao correr')):
    equilibrios = []
    # For each strategy profile s = (s1, s2)
    for s1 in S:
        for s2 in S:
            # check whether s is an equilibrium
            v = payoffs(s1, s2)
            eq = True
            for t in S:
                if payoffs(t, s2)[0] > v[0] or payoffs(s1, t)[1] > v[1]:
                    eq = False
                    break
            if eq:
                equilibrios.append('({},{})'.format(s1, s2))
    return equilibrios

– The program below uses the functions payoffs() and NE() to compute the set of Nash equilibria.

  * two cases will be studied
  * in both cases, the initial endowment is D = 1 and the long-run return is 20%, i.e., R = 1.2
  * in the first case, the short-run return is 10%, i.e., r = 1.1
  * in the second case, the short-run return is −10%, i.e., r = 0.9

– Case I: (D, R, r) = (1, 1.2, 1.1)

D, r, R = 1, 1.1, 1.2
eqs = NE()
print('The set of Nash equilibria is:')
print('{ ', end='')
for i, eq in enumerate(eqs):
    if i < len(eqs) - 1:
        print('{}, '.format(eq), end='')
    else:
        print('{} '.format(eq), end='')
print('}')

The program above produces the following output:

The set of Nash equilibria is:
{ (nao correr,nao correr) }

In this case, there is a single Nash equilibrium: (nao correr, nao correr). In this equilibrium,

  * all depositors wait to withdraw at t = 2
  * the investment project reaches maturity
  * the economy exploits the long-run return of 20%

– Case II: (D, R, r) = (1, 1.2, 0.9)

D, r, R = 1, 0.9, 1.2
eqs = NE()
print('The set of Nash equilibria is:')
print('{ ', end='')
for i, eq in enumerate(eqs):
    if i < len(eqs) - 1:
        print('{}, '.format(eq), end='')
    else:
        print('{} '.format(eq), end='')
print('}')

The program above produces the following output:

The set of Nash equilibria is:
{ (correr,correr), (nao correr,nao correr) }

In this case, another Nash equilibrium arises: (correr, correr). In this equilibrium,

  * all depositors rush to withdraw at t = 1
  * the investment project is liquidated before maturity
  * the economy fails to exploit the long-run return of 20%

Observation 2. The model's prediction in this case is:

  – either the economy will benefit from the long-run return (banking stability)
  – or the economy will be worse off than without the banking arrangement (banking instability)

but there is no certainty about which equilibrium will emerge.

Economic Theory and Game Theory

Economic Theory: the study of alternative ways of allocating scarce resources.

"Economics is a science which studies human behavior as a relationship between ends and scarce means which have alternative uses."

• the ultimate end of economics is the individual and his well-being

Example: how scarcity determines the optimal allocation

Consider an economy inhabited by I individuals

• with only J = 2 goods, whose endowments are ω^j ≥ 0, and no production

Definition 2. An allocation is a vector

  x = ((x_1^1, x_1^2), (x_2^1, x_2^2), . . . , (x_I^1, x_I^2)) ∈ R_+^{2I}

that specifies a consumption bundle (x_i^1, x_i^2) ∈ R_+^2 for each individual i ∈ {1, 2, . . . , I}. The allocation is said to be feasible if

  Σ_{i=1}^{I} x_i^j ≤ ω^j,  ∀j ∈ {1, 2}.
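Feasibility is easy to verify numerically. A minimal sketch (the endowments and bundles below are illustrative assumptions, not from the text):

```python
# Sketch: checking feasibility of an allocation (Definition 2).
def feasible(x, omega):
    """x[i][j] = consumption of good j by individual i;
    omega[j] = aggregate endowment of good j."""
    J = len(omega)
    return all(sum(xi[j] for xi in x) <= omega[j] for j in range(J))

omega = (10.0, 6.0)            # endowments of goods 1 and 2 (assumed values)
x = [(4.0, 2.0), (5.0, 3.0)]   # bundles of individuals 1 and 2 (assumed values)
print(feasible(x, omega))  # -> True, since 4+5 <= 10 and 2+3 <= 6
```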

• distribution of resources among the individuals

• Edgeworth box (I = 2)

Efficiency vs. Equity: what is a socially optimal allocation?

Definition 3. A feasible allocation x ∈ R_+^{2I} is Pareto optimal if there is no other feasible allocation x′ ∈ R_+^{2I} such that

  ui(x′i) ≥ ui(xi) for every i ∈ {1, 2, . . . , I}

and ui(x′i) > ui(xi) for some i.

• a minimal and consensual property

  – minimal: every optimal allocation must be Pareto optimal
  – consensual: there can be no waste under the optimal allocation

• Edgeworth box and the Contract Curve

Economics and Efficiency

Efficiency as a prediction criterion: Pareto efficiency is the rule, not the exception

• a point of broad consensus among economists

• Pareto improvements: why expect them to go unexploited?

• the main prediction criterion of economic theory

An equilibrium concept: competitive equilibrium – markets and prices

Definition 4. An allocation x∗ ∈ R_+^{2I} and a price vector p∗ = (p1∗, p2∗) ∈ R_+^2 constitute a competitive equilibrium if

• Utility maximization: for each consumer i, xi∗ solves

  max_{x∈R_+^2} ui(x)  s.t.  p1∗ x^1 + p2∗ x^2 ≤ p1∗ ωi^1 + p2∗ ωi^2

• Market clearing: aggregate demand equals aggregate supply,

  Σ_{i=1}^{I} x_i^{j∗} = Σ_{i=1}^{I} ω_i^j,  j = 1, 2

Theorem 2 (The First Fundamental Welfare Theorem). Every allocation resulting from a competitive equilibrium is Pareto optimal,

• if markets are complete:

  – every good is traded in a market
  – at a publicly known price

• if individuals are price takers:

  – they act in a perfectly competitive manner

Theorem 3 (The Second Fundamental Welfare Theorem). Every Pareto optimal allocation can be attained (supported or decentralized) as a competitive equilibrium,

• if the endowment of resources is suitably arranged

• if individuals' preferences are convex

• if individuals act as price takers

• if markets are complete

Game theory revolution

But what about the Prisoners' Dilemma?

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

• the outcome from the strategy profile (not Confess, not Confess) Pareto dominates the outcome from (Confess, Confess)

• however, (Confess, Confess) is the more reasonable outcome to expect

Another equilibrium concept: Nash equilibrium

Strategic interdependence:

• Each individual's welfare depends not only on his own actions but also on the actions of the other individuals

• The actions that are best for an individual to take may depend on what he expects the other players to do

Even more equilibrium concepts and some refinements:

• Equilibrium in Dominant Strategies

• Nash equilibrium

• Subgame-Perfect Nash equilibrium

• Bayesian Nash equilibrium

• Perfect Bayesian Nash equilibrium

• The Intuitive Criterion (Cho and Kreps (1987))

Mechanism Design

Question 5 (Choosing among games). How can games be designed so as to implement optimal allocations?

• Principal-Agent problems

  – moral hazard
  – adverse selection

• The Social Planner's problems (the social optimum)

  – fiscal policy
  – monetary policy
  – regulation

• The Revelation Principle
Ch. 1 - Static Games of Complete Information

Static Games of Complete Information:

• Static:

  – players simultaneously choose actions
  – the payoffs players receive depend on the combination of actions just chosen

• Complete information:

  – each player's payoff function is common knowledge among all players
  – example: auctions

Question 6. What is a game?

Definition 5. A game is a formal representation of a situation in which a number of individuals interact in a setting of strategic interdependence.

• Each individual's welfare depends not only on his own actions but also on the actions of the other individuals

• The actions that are best for an individual to take may depend on what he expects the other players to do

1.1 Normal form games and Nash equilibrium

Normal-form representation of games

In the normal-form representation of a game,

• each player simultaneously chooses a strategy

• the combination of strategies chosen by the players determines a payoff for each player

Example 1 (The prisoners' dilemma).

The environment

• Two suspects are arrested and charged with a crime

• The police lack sufficient evidence to convict the suspects, unless at least one confesses

• The suspects are held in separate cells

• The police explain the consequences that will follow from the actions they could take

Actions and payoffs

• If neither confesses, both will be convicted of a minor offense and sentenced to one month in jail

• If both confess, both will be sentenced to jail for six months

• If one confesses but the other does not, the confessor will be released immediately but the other will be sentenced to nine months in jail

  – six for the crime
  – three for obstructing justice

Matrix representation

• Each player has 2 strategies: Confess or not Confess

• It is implicitly assumed that each player dislikes staying in jail

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

General case:
The normal form representation

(a) Players

(b) Strategies

(c) Payoffs

Players: A finite set I of players

• We write "player i", where i is the name of the player and I is the collection of names

• We denote by n the number of players, i.e., n = #I

• The set I may be written I = {1, 2, · · · , n}

Strategies: The set of strategies available to player i is denoted by Si

• An element si ∈ Si is called a strategy (play or action)

• The set Si is called the strategy space and may have any structure: finite, countable, metric space, vector space

• The collection (si)i∈I = (s1, · · · , sn) is called a strategy profile and denoted by s

• Given an agent j and a profile s, we denote by (s−j; s′j) the new profile σ = (σi)i∈I defined by

  σi = s′j if i = j, and σi = si if i ≠ j

so that, for 1 < j < n,

  (s−j; s′j) = (s1, . . . , sj−1, s′j, sj+1, . . . , sn)

Payoffs:

• The payoff of player i is a function

  ui : Π_{j∈I} Sj → [−∞, +∞],  s ↦ ui(s)

where ui(s) is the payoff of player i when

  – he plays strategy si
  – and every other player j plays strategy sj

• We use the following notations interchangeably:

  ui(s) = ui((sj)j∈I) = ui(s−i; si) = ui(s1, s2, . . . , sn)

Definition 6. A game in normal form is a family

  G = (Si, ui)i∈I

where, for each i ∈ I,

• Si is a set

• ui is a function from S = Π_{j∈I} Sj to [−∞, +∞]

Observation 3. We now describe how to solve a game-theoretic problem.

ˆ What should we expect to observe in a game played by

– rational players

17
– who are fully knowledgeable about
* the structure of the game
* and each others’ rationality?

Simultaneous moves: In a normal form game the players choose their strategies simultaneously

ˆ This does not imply that they act simultaneously

ˆ It suffices that each choose his or her action without knowledge of the others’ choices

– Prisoners’ dilemma: the prisoners may reach decisions at arbitrary times but it must be in separate
cells
– Bidders in an sealed-bid auction

Iterated elimination of strictly dominated strategies

Definition 7 (Strictly dominated strategies). Consider a normal form game (Si, ui)i∈I.

• Let s′i and s′′i be two strategies in Si.

Strategy s′i is strictly dominated by strategy s′′i if,

• for each possible combination of the other players' strategies,

player i's payoff from playing s′i is strictly less than his payoff from playing s′′i.

• Formally,

  ui(s′i, s−i) < ui(s′′i, s−i),  ∀s−i ∈ Π_{k≠i} Sk

Rationality: Rational players do not play strictly dominated strategies.

Observation 4 (The prisoners' dilemma). For a prisoner, playing not Confess is strictly dominated by playing Confess.

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

Assume we are player 1.

• If player 2 chooses Confess,

  – we prefer to play Confess and stay 6 months in jail
  – rather than play not Confess and stay 9 months in jail

• If player 2 chooses not Confess,

  – we prefer to play Confess and go free
  – rather than play not Confess and stay 1 month in jail

• A rational player will not choose to play not Confess

  – therefore, a rational player will choose to play Confess

The outcome reached by the two prisoners is (Confess, Confess).

• This results in a worse payoff for both players than (not Confess, not Confess) would

  – this inefficiency is a consequence of the lack of coordination

• The same happens in many other situations

  – the arms race
  – the free-rider problem in the provision of public goods

Iterated elimination

Question 8. Can we use the idea that "rational players do not play strictly dominated strategies" to find a solution to other games?

• Consider a game (in normal form) with two players,

  I = {1, 2}

• Player 1 has two available strategies,

  S1 = {Up, Down}

• Player 2 has three available strategies,

  S2 = {Left, Middle, Right}

• The payoffs are given by the following matrix

                      Player 2
               Left    Middle    Right
Player 1  Up    1, 0     1, 2     0, 1
          Down  0, 3     0, 1     2, 0

• for Player 1,

  – Up is not strictly dominated by Down
  – Down is not strictly dominated by Up

• for Player 2,

  – Right is strictly dominated by Middle
  – Player 2 will never play Right

• if Player 1 knows that Player 2 is rational,

  – then Player 1 can eliminate Right from Player 2's strategy set

• then both players can play the game as if it were the following game

                      Player 2
               Left    Middle
Player 1  Up    1, 0     1, 2
          Down  0, 3     0, 1

• for Player 1, the strategy Down is now strictly dominated by Up

• if Player 2 knows that

  – Player 1 is rational; and
  – Player 1 knows that Player 2 is rational,

then Player 2 can eliminate Down from S1

• now the game is as follows

                      Player 2
               Left    Middle
Player 1  Up    1, 0     1, 2

• for Player 2, the strategy Left is strictly dominated by Middle

Observation 5. By iterated elimination of strictly dominated strategies,

• the outcome of the game is (Up, Middle)

Definition 8 (Iterated elimination of strictly dominated strategies). The process just described is called iterated elimination of strictly dominated strategies.
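The process can be sketched in Python for two-player games with finitely many pure strategies (a minimal implementation; the dictionary-based encoding of payoffs is an assumption of this sketch, not notation from the text):

```python
# Sketch of iterated elimination of strictly dominated (pure) strategies
# for a two-player game with payoff dictionaries u1, u2 keyed by (s1, s2).
def iesds(S1, S2, u1, u2):
    S1, S2 = list(S1), list(S2)
    changed = True
    while changed:
        changed = False
        # remove any strategy of player 1 strictly dominated by another
        for s in S1[:]:
            if any(all(u1[(t, s2)] > u1[(s, s2)] for s2 in S2)
                   for t in S1 if t != s):
                S1.remove(s); changed = True
        # same for player 2
        for s in S2[:]:
            if any(all(u2[(s1, t)] > u2[(s1, s)] for s1 in S1)
                   for t in S2 if t != s):
                S2.remove(s); changed = True
    return S1, S2

# The 2x3 example from the text
S1, S2 = ['Up', 'Down'], ['Left', 'Middle', 'Right']
u1 = {('Up','Left'): 1, ('Up','Middle'): 1, ('Up','Right'): 0,
      ('Down','Left'): 0, ('Down','Middle'): 0, ('Down','Right'): 2}
u2 = {('Up','Left'): 0, ('Up','Middle'): 2, ('Up','Right'): 1,
      ('Down','Left'): 3, ('Down','Middle'): 1, ('Down','Right'): 0}
print(iesds(S1, S2, u1, u2))  # -> (['Up'], ['Middle'])
```

The elimination order (Right, then Down, then Left) matches the steps worked through above.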

Proposition 1. The set of strategy profiles that survives iterated elimination of strictly dominated strategies is independent of the order of deletion.

Drawbacks:

(i) Each step requires a further assumption about what the players know about each other's rationality

• to apply the process for an arbitrary number of steps, we need to assume that it is common knowledge that the players are rational

Definition 9. Players' rationality is common knowledge if

  – all the players are rational
  – all the players know that all the players are rational
  – and so on, ad infinitum

(ii) the process often produces a very imprecise prediction about the play of the game

Consider the following game

            L       C       R
    U     0, 4    4, 0    5, 3
    M     4, 0    0, 4    5, 3
    D     3, 5    3, 5    6, 6

• there are no strictly dominated strategies to be eliminated

• the process produces no prediction whatsoever about the play of the game

Question 9. Is there a stronger solution concept than IESDS, one which produces much tighter predictions in a very broad class of games?

Nash equilibrium: motivation and definition

Motivation:
Suppose that game theory makes a unique prediction about the strategy each player will choose.

• in order for this prediction to be compatible with incentives (i.e., correct), it is necessary that

  – each player be willing to choose the strategy predicted by the theory
  – each player's predicted strategy be that player's best response to the predicted strategies of the other players

• such a prediction could be called strategically stable or self-enforcing

  – no single player wants to deviate from his or her predicted strategy

Definition 10 (Nash Equilibrium). Consider a game G = (Si, ui)i∈I.

• A strategy profile s∗ = (s∗i)i∈I is a Nash equilibrium of G if, for each player i, the strategy s∗i is player i's best response to the strategies specified in s∗ for the other players.

Formally, s∗ = (s∗i)i∈I is a Nash equilibrium if

  ∀i ∈ I,  s∗i ∈ arg max{ui(si, s∗−i) : si ∈ Si}

Observation 6. The set arg max{ui(si, s∗−i) : si ∈ Si} need not be a singleton.

Interpretation
If the theory offers as a prediction

• a profile s′ = (s′i)i∈I that is not a Nash equilibrium,

then

• there exists at least one player who has an incentive to deviate from the theory's prediction.

Observation 7. If a convention is to develop about how to play a given game, then the strategies prescribed by the convention must be a Nash equilibrium; otherwise, at least one player will not abide by the convention.

Examples
In a 2-player game we can compute the set of NE as follows:

• for each player,

  – and for each strategy of this player,
    * determine the other player's best response to it
    * underline the corresponding payoff in the matrix

• a pair of strategies (a profile) is a NE if both corresponding payoffs are underlined in the matrix

            L       C       R
    U     0, 4    4, 0    5, 3
    M     4, 0    0, 4    5, 3
    D     3, 5    3, 5    6, 6

                      Player 2
               Left    Middle    Right
Player 1  Up    1, 0     1, 2     0, 1
          Down  0, 3     0, 1     2, 0

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma
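The underlining test can be mechanized: a profile is a NE exactly when each player's payoff at that profile attains the maximum over that player's own deviations. A sketch for the 3×3 game above (the encoding is an assumption of this sketch):

```python
# Sketch: find pure-strategy Nash equilibria by checking, for each profile,
# that each payoff would be "underlined" (i.e., is a best response).
def pure_nash(S1, S2, u1, u2):
    eqs = []
    for s1 in S1:
        for s2 in S2:
            br1 = u1[(s1, s2)] >= max(u1[(t, s2)] for t in S1)
            br2 = u2[(s1, s2)] >= max(u2[(s1, t)] for t in S2)
            if br1 and br2:
                eqs.append((s1, s2))
    return eqs

# The 3x3 game from the text
S1, S2 = ['U', 'M', 'D'], ['L', 'C', 'R']
u1 = {('U','L'): 0, ('U','C'): 4, ('U','R'): 5,
      ('M','L'): 4, ('M','C'): 0, ('M','R'): 5,
      ('D','L'): 3, ('D','C'): 3, ('D','R'): 6}
u2 = {('U','L'): 4, ('U','C'): 0, ('U','R'): 3,
      ('M','L'): 0, ('M','C'): 4, ('M','R'): 3,
      ('D','L'): 5, ('D','C'): 5, ('D','R'): 6}
print(pure_nash(S1, S2, u1, u2))  # -> [('D', 'R')]
```

Note that (D, R) is a NE even though no strategy in this game is strictly dominated, illustrating Question 9.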

Nash equilibrium: a stronger solution

Consider a game G = (Si, ui)i∈I.

Proposition 2. If IESDS eliminates all but the strategy profile s∗ = (s∗i)i∈I, then s∗ is the unique NE of the game.

Theorem 4. If the strategy profile s∗ is a NE, then s∗ survives iterated elimination of strictly dominated strategies.

Observation 8. NE is a stronger solution concept than IESDS.

• there can be strategy profiles that survive IESDS but are not NE

• all NE survive IESDS

Question 10. Is NE too strong? Can we be sure that a Nash equilibrium exists?

• existence: Nash et al. (1950), for any finite game

• multiple equilibria: next example

A classic example:
The battle of the sexes
A man (Pat) and a woman (Chris) are trying to decide on an evening's entertainment.

• while at their workplaces, Pat and Chris must choose to attend either the opera or a rock concert

• both players would rather spend the evening together than apart

                   Pat
             Opera     Rock
Chris  Opera  2, 1     0, 0
       Rock   0, 0     1, 2

• there are two NE: (Opera, Opera) and (Rock, Rock)

In some games with multiple NE, one equilibrium stands out as the compelling solution

• in particular, a convention can develop

Theory's effort:
identify such a compelling equilibrium in different classes of games.
In the example above,

• the NE concept loses much of its appeal as a prediction of play

  – both equilibria seem equally compelling
  – neither can develop into a convention

1.2 Applications
Cournot model of duopoly

A model of duopoly: Cournot (1838)

• two firms, 1 and 2, producing the same good (a homogeneous product)

• q1 and q2: the quantities produced by the firms, respectively

• Q = q1 + q2: aggregate quantity on the market

• P(Q) = [a − Q]+: market-clearing price under Q

• C(qi) = c qi: total cost to firm i of producing quantity qi

  – there is no fixed cost
  – the marginal cost is constant at c
  – we assume c < a

• firms choose their quantities simultaneously

We now translate the problem into a normal form game

• There are two players, the two firms: I = {1, 2}

• The strategies available to each firm are the different quantities, Qi = [0, ∞)

• An element of Qi is denoted qi

  – one could reduce the set Qi to [0, a], since P(Q) = 0 for Q ≥ a

• The payoff of firm i under a profile (qi, qj) is its profit, defined by

  πi(qi, qj) = qi[P(qi + qj) − c] = qi([a − (qi + qj)]+ − c)

The game is then G = (Qi, πi)i∈I

Nash equilibrium (NE)

A strategy for finding a NE is to look for a necessary condition (and then check that it is sufficient)

• if (q1∗, q2∗) is a Nash equilibrium, then

  ∀i ∈ I,  qi∗ ∈ arg max{πi(qi; qj∗) : qi ≥ 0}

• we have

  πi(qi; qj∗) = qi[(a − c − qj∗) − qi]   if qi < a − qj∗
                −qi c                    if qi ≥ a − qj∗

• all strategies qi ≥ a − qj∗ are strictly dominated by qi = 0. Therefore,

  qi∗ ∈ arg max{πi(qi; qj∗) : 0 ≤ qi < a − qj∗}

and the first order condition is necessary and sufficient

• the derivative of the objective function is

  ∂πi/∂qi (qi, qj∗) = a − c − qj∗ − 2qi

• assuming that qi∗ ∈ (0, a − c) for each firm i, we have

  qi∗ = (a − qj∗ − c)/2

which yields

  qi∗ = (a − c)/3,  ∀i ∈ I

obs.: this is consistent with the assumption qi∗ ∈ (0, a − c)
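The equilibrium can also be found numerically by iterating the best-response map Ri(qj) = (a − c − qj)/2. A minimal sketch; the parameter values a = 10, c = 1 are illustrative assumptions:

```python
# Sketch: best-response iteration converging to the Cournot equilibrium.
a, c = 10.0, 1.0  # assumed demand intercept and marginal cost

def br(q):
    """Best response to the rival's quantity q (interior case)."""
    return max(0.0, (a - c - q) / 2.0)

q1, q2 = 0.0, 0.0
for _ in range(100):
    q1, q2 = br(q2), br(q1)  # each firm best-responds to the other

print(q1, q2)  # -> 3.0 3.0, i.e. (a - c)/3 for each firm
```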

Observation 9. There is a unique Nash equilibrium, called the Cournot equilibrium.

Interpretation
Each firm would like to be a monopolist in this market

• it would choose qi to maximize πi(qi, 0). The solution is

  qm = (a − c)/2

and the associated profit is

  πi(qm; 0) = (a − c)²/4

With the two firms,

• aggregate profits would be maximized by setting

  q1 + q2 = qm

which would occur with qi = qm/2

The problem with the strategy profile (qm/2, qm/2):

• the market price P(qm) is too high

  – at this price, each firm has an incentive to deviate by increasing its production
  – even though such a deviation drives down the market price, the deviating firm's profit still increases

In the Cournot equilibrium,

• the aggregate quantity is higher

• so the associated price is lower

and the temptation to increase output is reduced,

• just enough that each firm i is just deterred from increasing qi

Graphical solution

• if q1 < a − c, then firm 2's best response is

  R2(q1) = (a − q1 − c)/2

Likewise,

• if q2 < a − c, then firm 1's best response is

  R1(q2) = (a − q2 − c)/2

• The two best response functions intersect only once, at the equilibrium profile (q1∗, q2∗)

Cournot duopoly and iterated elimination

A third approach: iterated elimination of strictly dominated strategies

• if a unique strategy profile survives, then it is a Nash equilibrium

Proposition 3. The monopoly quantity qm = (a − c)/2 strictly dominates any higher quantity.

We can then consider the game G(3) = (Qi(3), πi)i∈I with

  Qi(3) = [0, qm]

Proof. Step 1: Assume qm + x + qj < a, with x > 0. Then

  πi(qm; qj) = qm[(a − c)/2 − qj]

while

  πi(qm + x; qj) = (qm + x)[(a − c)/2 − x − qj] = πi(qm; qj) − x(x + qj) < πi(qm; qj)

Step 2: Assume qm + x + qj ≥ a. Then the price is zero at qm + x, so

  πi(qm + x; qj) = −c(qm + x) < πi(qm; qj),

since πi(qm; qj) ≥ −c qm.

Proposition 4. Given that quantities exceeding qm = (a − c)/2 have been eliminated, the quantity qm/2 strictly dominates any lower quantity.

Formally,

• for any x ∈ (0, qm/2] we have

  πi[qm/2, qj] > πi[qm/2 − x, qj],  ∀qj ∈ [0, (a − c)/2]

Proof.

  πi(qm/2, qj) = (qm/2)[3(a − c)/4 − qj]

and

  πi(qm/2 − x, qj) = (qm/2 − x)[3(a − c)/4 + x − qj]
                   = πi(qm/2, qj) − x[(a − c)/2 + x − qj]
                   < πi(qm/2, qj),

since qj ≤ (a − c)/2.

After these two steps, the quantities remaining in each firm's strategy space are those in the interval

  [(a − c)/4, (a − c)/2]

• repeating these arguments leads to ever smaller intervals of remaining quantities

• in the limit (countably many steps are needed), these intervals converge to the single point qi∗ = (a − c)/3
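The interval dynamics can be traced numerically. In the sketch below (with illustrative values a = 10, c = 1, so qm = 4.5 and (a − c)/3 = 3), each round maps the surviving interval [lo, hi] to [R(hi), R(lo)], where R(q) = (a − c − q)/2 is the best-response function:

```python
# Sketch: the shrinking intervals produced by iterated elimination
# in the Cournot duopoly.
a, c = 10.0, 1.0  # assumed parameter values

def R(q):
    """Best response to an opponent quantity q."""
    return (a - c - q) / 2.0

lo, hi = 0.0, R(0.0)      # after the first elimination step: [0, qm]
intervals = [(lo, hi)]
for _ in range(60):
    lo, hi = R(hi), R(lo)  # eliminate below R(hi) and above R(lo)
    intervals.append((lo, hi))

print(intervals[1])  # -> (2.25, 4.5), i.e. [qm/2, qm] as in the text
print(lo, hi)        # both endpoints converge to (a - c)/3 = 3.0
```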

Cournot duopoly and iterated elimination

If we add one or more firms to Cournot's model,

• then the first step of elimination continues to hold

• but that's it

Observation 10. With more than two firms, IESDS yields only the imprecise prediction that each firm's quantity will not exceed the monopoly quantity.

Example: Three firms

• Q−i: the sum of the quantities chosen by the firms other than i

  πi(qi; Q−i) = qi(a − qi − Q−i − c)   if qi + Q−i < a
                −c qi                  if qi + Q−i ≥ a

ˆ it is again true that qm strictly dominates any higher quantity

∀x > 0; πi (qm ; Q−i ) > πi (qm + x; Q−i ); ∀Q−i > 0

Each firm reduces its strategy set to [0, qm ], but

ˆ no further strategies can be eliminated

Proposição 5. No quantity qi ∈ [0, qm ] is strictly dominated

Proof. For each qi ∈ [0, qm ] there is a Q−i such that

qi ∈ arg max{πi (qi′ , Q−i ) : qi′ ∈ [0, qm ]}

Indeed, we know that Q−i ∈ [0, 2qm ] = [0, a − c]. Fix qi ∈ [0, qm ] and recall that

πi (qi ; Q−i ) = qi (a − qi − Q−i − c) if qi + Q−i < a
πi (qi ; Q−i ) = −cqi if qi + Q−i ≥ a.

Then, the FOC a − 2qi − Q−i − c = 0 is satisfied at qi when Q−i = a − c − 2qi ∈ [0, a − c].
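The supporting-belief construction in the proof can be checked numerically; the values a = 10 and c = 1 below are illustrative.

```python
a, c = 10.0, 1.0
qm = (a - c) / 2.0   # monopoly quantity

def profit(qi, Q_other):
    # three-firm Cournot payoff, with Q_other the sum of rivals' quantities
    return qi * (a - qi - Q_other - c) if qi + Q_other < a else -c * qi

grid = [k * qm / 1000.0 for k in range(1001)]      # candidate quantities
results = {}
for qi in [0.0, qm / 4, qm / 2, 3 * qm / 4, qm]:
    Q_other = a - c - 2.0 * qi                     # feasible: lies in [0, 2*qm]
    results[qi] = max(grid, key=lambda q, Q=Q_other: profit(q, Q))

# each qi is recovered as the grid argmax against its supporting belief
```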

Bertrand model of duopoly

We consider a different model of how two duopolists might interact

ˆ Bertrand (1883) suggested that firms actually choose prices, rather than quantities as in Cournot’s model

We consider the case of differentiated products

ˆ firms 1 and 2 choose prices p1 and p2 , respectively

ˆ the quantity that consumers demand from firm i’s product is

qi (pi , pj ) = [a − pi + bpj ]+ where 0 < b < 2

ˆ b reflects the extent to which


firm i’s product is a substitute for firm j’s product

Observação 11. This is an unrealistic demand function

ˆ demand for firm i’s product is positive

ˆ even when firm i charges an arbitrarily high price,

provided firm j also charges a high enough price.

ˆ there are no fixed costs of production

ˆ marginal cost of production is constant at a value c ∈ (0, a)

The normal form game:

ˆ the set of players is I = {1, 2}

ˆ the strategy set Pi of player i is Pi = [0, ∞)

ˆ the payoff function corresponds to profits:

πi (pi , pj ) = qi (pi , pj )[pi − c] = [a − pi + bpj ]+ (pi − c)

The game is G = (Pi , πi )i∈I

Nash equilibrium:
the price pair (p∗1 , p∗2 ) is a Nash equilibrium if,

ˆ for each firm i the price p∗i solves

max{πi (pi , p∗j ) : 0 ≤ pi < ∞} = max{[a − pi + bp∗j ][pi − c] : c < pi < a + bp∗j }

ˆ the objective function’s derivative is

∂πi /∂pi (pi , p∗j ) = a + c + bp∗j − 2pi
and, therefore, the solution to firm i’s optimization problem is

p∗i = (a + bp∗j + c)/2

If (p∗1 , p∗2 ) is a Nash equilibrium, one must have

p∗1 = (a + bp∗2 + c)/2 and p∗2 = (a + bp∗1 + c)/2

ˆ if b < 2 then the unique Nash equilibrium is

p∗1 = p∗2 = (a + c)/(2 − b)
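The fixed point can be verified by iterating the two reaction functions; a = 10, b = 1, c = 2 below are illustrative values satisfying 0 < b < 2.

```python
a, b, c = 10.0, 1.0, 2.0        # illustrative differentiated-Bertrand parameters

p1 = p2 = 0.0
for _ in range(200):
    # each firm best-responds to the other's current price
    p1, p2 = (a + b * p2 + c) / 2.0, (a + b * p1 + c) / 2.0

closed_form = (a + c) / (2.0 - b)   # the unique Nash equilibrium price
```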

Final-offer arbitration

Consider a firm and a union which dispute wages

(i) the firm and the union simultaneously make offers

ˆ the firm offers the wage wf


ˆ the union offers the wage wu

(ii) an arbitrator chooses one of the two offers as the settlement

ˆ x: ideal settlement arbitrator would like to impose

The decision rule is as follows:

ˆ after observing the parties’ offers, wf and wu ,


the arbitrator simply chooses the offer that is closer to x

Provided that wf < wu

ˆ the arbitrator chooses wf if x < (wf + wu )/2

ˆ the arbitrator chooses wu if x > (wf + wu )/2

ˆ the arbitrator flips a coin if x = (wf + wu )/2

The arbitrator knows x but the parties do not

ˆ the parties believe that x is randomly distributed according to a probability measure µ on the Borel sets of [0, ∞)

ˆ the cumulative probability distribution is denoted by F

F (x̄) ≡ Pr{x ≤ x̄} = µ[0, x̄]

ˆ F : [0, ∞) → [0, 1] is differentiable, with derivative f

ˆ f represents the density function, i.e.,


∫[0,∞) h(x)µ(dx) = ∫[0,∞) h(x)f (x)dx

for every Borel measurable function h : [0, ∞) → R+

Given the offers wf and wu , the parties believe that

ˆ wf is chosen with probability

Pr{wf chosen} = µ[0, (wf + wu )/2] = F ((wf + wu )/2)

ˆ wu is chosen with probability

Pr{wu chosen} = µ((wf + wu )/2, ∞) = 1 − F ((wf + wu )/2)

and, therefore, expected wage settlement is given by

E(w) = wf × Pr{wf chosen} + wu × Pr{wu chosen}
= wf F ((wf + wu )/2) + wu [1 − F ((wf + wu )/2)]

We assume that

ˆ the firm wants to minimize E(w)

ˆ the union wants to maximize E(w)

If (wf∗ , wu∗ ) is a Nash equilibrium, then

ˆ wf∗ must solve

min{wf F ((wf + wu∗ )/2) + wu∗ [1 − F ((wf + wu∗ )/2)] : 0 ≤ wf < ∞}

ˆ wu∗ must solve

max{wf∗ F ((wf∗ + wu )/2) + wu [1 − F ((wf∗ + wu )/2)] : 0 ≤ wu < ∞}

Suppose that (wf∗ , wu∗ ) is strictly positive

ˆ FOC for the firm’s problem

(1/2)(wu∗ − wf∗ ) × f ((wu∗ + wf∗ )/2) = F ((wu∗ + wf∗ )/2)

ˆ FOC for the union’s problem

(1/2)(wu∗ − wf∗ ) × f ((wu∗ + wf∗ )/2) = 1 − F ((wu∗ + wf∗ )/2)

Therefore,

F ((wu∗ + wf∗ )/2) = 1/2
The average of the offers must equal

ˆ the median of the arbitrator’s preferred settlement

F ((wu∗ + wf∗ )/2) = 1/2

The gap between the offers must equal

ˆ the inverse of the value of the density function

ˆ at the median of the arbitrator’s preferred settlement

wu∗ − wf∗ = 1/f ((wu∗ + wf∗ )/2)

An example:
Suppose the arbitrator’s preferred settlement is normally distributed with mean m and variance σ 2 , i.e.,

f (x) = (1/√(2πσ 2 )) exp(−(x − m)2 /(2σ 2 ))

ˆ the median of the distribution equals the mean m

– the normal distribution is symmetric around its mean,

The necessary conditions are then translated into

(wu∗ + wf∗ )/2 = m and wu∗ − wf∗ = 1/f (m) = √(2πσ 2 )

The Nash equilibrium offers are

wu∗ = m + √(πσ 2 /2) and wf∗ = m − √(πσ 2 /2)
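A small numerical check of these two conditions under the normal distribution (m = 5 and σ = 2 are illustrative):

```python
import math

m, sigma = 5.0, 2.0   # illustrative mean and standard deviation

def f(x):             # normal density
    return math.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def F(x):             # normal CDF via the error function
    return 0.5 * (1 + math.erf((x - m) / (sigma * math.sqrt(2))))

half_gap = math.sqrt(math.pi * sigma ** 2 / 2)
wu_star, wf_star = m + half_gap, m - half_gap

median_cond = F((wu_star + wf_star) / 2)   # should equal 1/2
gap_cond = (wu_star - wf_star) * f(m)      # should equal 1
```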

In equilibrium,

ˆ the parties’ offers are centered around m

– m: expectation of the arbitrator’s preferred settlement

ˆ the gap between the offers increases with σ 2

– σ 2 : parties’ uncertainty about the arbitrator’s preferred settlement

A more aggressive offer

ˆ lower offer by the firm

ˆ higher offer by the union

yields a better payoff if it is chosen by the arbitrator

ˆ but is less likely to be chosen

When there is more uncertainty (i.e., σ 2 higher)

ˆ the parties can afford to be more aggressive

When there is hardly any uncertainty, in contrast,

ˆ neither party can afford to make an offer far from the mean

The problem of the Commons

ˆ consider the n farmers in a village: I = {1, · · · , n}

ˆ each summer, all the farmers graze their goats on the village green

ˆ during the spring, the farmers simultaneously choose how many goats to own

Let

ˆ gi : number of goats owned by farmer i

ˆ G = g1 + · · · + gn : total number of goats in the village

ˆ c > 0: the cost of buying and caring for a goat

ˆ v(G): the value (per goat) to a farmer of grazing a goat on the green

ˆ goats are continuously divisible

ˆ v : [0, Gmax ] → R+ is

– twice continuously differentiable


– v ′ < 0 and v ′′ < 0

The normal-form representation:

ˆ a strategy for farmer i is gi

ˆ the strategy space is Gi = [0, ∞) (we could have chosen Gi = [0, Gmax])

ˆ the payoff to farmer i

– from grazing gi goats


– when the numbers of goats of the other farmers are g−i

is
πi (gi , g−i ) = gi v(gi + σ[g−i ]) − cgi

where σ[g−i ] = Σk≠i gk

If (gi∗ )i∈I is a Nash equilibrium

ˆ then gi∗ is a solution to

max{gi v(gi + σ[g∗−i ]) − cgi : gi ≥ 0}

ˆ if gi∗ > 0, then the FOC is

v(gi∗ + σ[g∗−i ]) + gi∗ v ′ (gi∗ + σ[g∗−i ]) − c = 0

Summing over all farmers and dividing by n, we get

v(G∗ ) + (1/n)G∗ v ′ (G∗ ) − c = 0

where G∗ denotes Σi∈I gi∗
Social optimum

A social planner decides how many goats the “society” should graze on the village green

ˆ the planner should solve

max{Gv(G) − Gc : G ≥ 0}

independently of how the social profit is divided

ˆ the FOC is
v(Gs ) + Gs v ′ (Gs ) − c = 0

Lema 1. One must have G∗ > Gs .

Observação 12. Too many goats are grazed in the Nash equilibrium, compared to the social optimum

ˆ The common resource is overutilized

When a farmer considers the effect of adding one more goat, he focuses on

ˆ the cost of production: c

ˆ the additional benefit: v(gi + σ[g∗−i ])

ˆ the harm to his other goats: gi v ′ (gi + σ[g∗−i ])

He does not care about the effect of his action on the other farmers

ˆ this is the reason we have G∗ v ′ (G∗ )/n and not Gs v ′ (Gs )
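The overgrazing result can be illustrated with a concrete functional form satisfying the assumptions; v(G) = √(Gmax − G) with Gmax = 100, c = 1, and n = 10 below is a hypothetical choice, not from the text.

```python
import math

Gmax, c, n = 100.0, 1.0, 10   # illustrative parameters

v = lambda G: math.sqrt(Gmax - G)                  # v' < 0 and v'' < 0
dv = lambda G: -1.0 / (2.0 * math.sqrt(Gmax - G))

def solve(weight):
    # bisection on phi(G) = v(G) + weight*G*v'(G) - c, which is decreasing
    phi = lambda G: v(G) + weight * G * dv(G) - c
    lo, hi = 1e-9, Gmax - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

G_nash = solve(1.0 / n)    # equilibrium condition v(G*) + G* v'(G*)/n = c
G_social = solve(1.0)      # planner condition v(Gs) + Gs v'(Gs) = c
# G_nash > G_social: the common resource is overutilized in equilibrium
```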

1.3 Mixed strategies and existence of equilibrium


Non-existence: Matching pennies
Consider the following game

ˆ There are two players I = {i1 , i2 }

ˆ Each player’s strategy space is Si = {Heads, T ails}

ˆ The payoff of the game is as follows:

– Each player has a penny and must choose whether to display it with heads or tails facing up
– If the two pennies match then player i2 wins player i1 ’s penny
– If the pennies do not match then i1 wins i2 ’s penny

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

36
Proposição 6. There is no Nash equilibrium in pure strategies.

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

If the players’ strategies

ˆ match then player i1 prefers to switch strategies

ˆ do not match then i2 prefers to switch

This situation occurs in many games

ˆ Poker, battle

To overcome this difficulty, we introduce the notion of a mixed strategy

Mixed strategies

A mixed strategy is a probability measure (distribution) over the strategies in Si

ˆ A strategy in Si is called a pure strategy

ˆ The set of mixed strategies is denoted by P rob(Si ) or ∆Si

Definição 11. A mixed strategy p = (p(si ))si ∈Si of player i is a vector in RSi satisfying

∀si ∈ Si , psi = p(si ) ≥ 0 and Σsi ∈Si p(si ) = 1

ˆ if the mixed strategy p is such that there exists ŝi ∈ Si satisfying

∀si ∈ Si , p(si ) = 0 if si ≠ ŝi and p(si ) = 1 if si = ŝi

then p is denoted Dirac(ŝi ) or 1ŝi and (abusing notation) is identified with the pure strategy ŝi

Interpretation
A family p−i = (pj )j≠i of mixed strategies pj ∈ ∆(Sj ) can represent

ˆ agent i’s uncertainty about

ˆ which strategy each other agent j will play

Notation 5. The expected value of agent i’s payoff if he plays si believing that the other players will play
according to p−i is denoted by
ui (si , p−i )

and is defined by

ui (si , p−i ) ≡ Ep−i [ui (si )] = Σs−i ∈S−i [Πj≠i pj (sj )] ui (si , s−i )

where Πj≠i pj (sj ) = p−i (s−i )

Notation 6. If pi is a mixed strategy in ∆(Si ) we let p = (pj )j∈I and the expected value

Ep [ui ] = Σs∈S [Πj∈I pj (sj )] ui (si1 , . . . , sin ) = Σs∈S pi1 (si1 ) · · · pin (sin ) ui (si1 , . . . , sin )

is denoted by
ui (p)

Observe that

ui (p) = Σsi ∈Si pi (si ) ui (si , p−i )
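A small worked example of these formulas (the 2×2 payoff numbers are made up for illustration):

```python
from itertools import product

S1, S2 = ['T', 'B'], ['L', 'R']
u1 = {('T', 'L'): 3, ('T', 'R'): 0, ('B', 'L'): 1, ('B', 'R'): 2}
p1 = {'T': 0.25, 'B': 0.75}    # a mixed strategy for player i1
p2 = {'L': 0.5, 'R': 0.5}      # a mixed strategy for player i2

# u_i(p): sum over pure profiles, weighted by the product of probabilities
u1_p = sum(p1[s1] * p2[s2] * u1[(s1, s2)] for s1, s2 in product(S1, S2))

# the decomposition u_i(p) = sum over s_i of p_i(s_i) * u_i(s_i, p_-i)
u1_given = {s1: sum(p2[s2] * u1[(s1, s2)] for s2 in S2) for s1 in S1}
decomposed = sum(p1[s1] * u1_given[s1] for s1 in S1)
```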

Definição 12. We say that

ˆ there is no belief that player i could hold about the strategies the other players will choose

ˆ such that it would be optimal to play si

when
∀p−i ∈ Πj≠i ∆(Sj ), si ∉ arg max{Ep−i [ui (si )] : si ∈ Si }

In other words,

ˆ for every belief p−i that agent i could hold about the others,

ˆ there exists a pure strategy ŝi ∈ Si such that

Ep−i [ui (si )] < Ep−i [ui (ŝi )]

ˆ be careful, the strategy ŝi may depend on the belief p−i .

Proposição 7. Assume that the pure strategy si is strictly dominated by the pure strategy σi

∀s−i ∈ S−i , ui (si , s−i ) < ui (σi , s−i )

Then

ˆ there is no belief that player i could hold about the strategies the other players will choose such that
it would be optimal to play si .

More precisely, for every family p−i = (pj )j≠i of mixed strategies pj ∈ ∆(Sj ), we have

Ep−i [ui (si )] < Ep−i [ui (σi )]

In this case, the strategy σi improves the expected payoff independently of the belief p−i agent i holds about
the other players’ actions

Observação 13. The converse may not be true

Consider the following game


Player i2
L R
T 3,− 0,−
Player i1 M 0,− 3,−
B 1,− 1,−
For any belief pi2 agent i1 may have about i2 ’s strategies, the strategy B is never a best response
ˆ if pi2 (L) > 1/2 then i1 ’s best response is T

ˆ if pi2 (L) < 1/2 then i1 ’s best response is M

ˆ if pi2 (L) = 1/2 then i1 ’s best response is either T or M


However, the strategy B is not strictly dominated by another pure strategy

Consider the mixed strategy pi1 defined by

pi1 (T ) = 1/2, pi1 (M ) = 1/2 and pi1 (B) = 0

Such a probability will be denoted by1


pi1 = (1/2, 1/2, 0)

For any belief pi2 agent i1 may have about i2 ’s strategies,

ui1 (B, pi2 ) = ui1 (1B , pi2 ) = 1 < 3/2 = ui1 (pi1 , pi2 )

1
Sometimes one may find the notations pi1 = (1/2)Dirac(T ) + (1/2)Dirac(M ) or pi1 = (1/2)1T + (1/2)1M .
Observação 14. The strategy B is strictly dominated by the mixed strategy pi1 = (1/2, 1/2, 0)
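The dominance claim can be checked against a grid of beliefs; the payoffs are those of the table above, while the grid itself is just for illustration.

```python
u = {('T', 'L'): 3, ('T', 'R'): 0,
     ('M', 'L'): 0, ('M', 'R'): 3,
     ('B', 'L'): 1, ('B', 'R'): 1}          # player i1's payoffs

mix = {'T': 0.5, 'M': 0.5, 'B': 0.0}        # the dominating mixed strategy

min_gap = min(
    sum(mix[s] * (q * u[(s, 'L')] + (1 - q) * u[(s, 'R')]) for s in mix)
    - (q * u[('B', 'L')] + (1 - q) * u[('B', 'R')])
    for q in [k / 100.0 for k in range(101)]   # beliefs p(L) on a grid
)
# the mixed strategy beats B by 3/2 - 1 = 1/2 for every belief
```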

Observação 15. A given pure strategy can be a best response to a mixed strategy

ˆ even if the pure strategy is not a best response to any other pure strategy

Player i2
L R
T 3,− 0,−
Player i1 M 0,− 3,−
B 2,− 2,−

ˆ The pure strategy B is not a best response for player i1 to either L or R by player i2

ˆ but B is the best response for player i1 to the mixed strategy pi2 by player i2 provided that

1/3 < pi2 (L) < 2/3

Existence of Nash equilibrium

Nash equilibrium with mixed strategies


We fix a game G = (Si , ui )i∈I

Definição 13. A profile of mixed strategies p∗ = (p∗i )i∈I is a Nash equilibrium of the game G if

ˆ each player’s mixed strategy is a best response to the other players’ mixed strategies,

∀i ∈ I, p∗i ∈ arg max{ui (pi , p∗−i ) : pi ∈ ∆(Si )}.

The family p−i = (pj )j≠i represents player i’s uncertainty about which strategy each player j will choose

Observação 16. Fix three players i, j and k.


What player j believes about the possible strategies played by player i coincides with what player k believes

Consider an abstract game G = (Si , ui )i∈I

ˆ fix a family pi−i = (pij )j≠i of mixed strategies representing player i’s beliefs about the other players’ strategies

ˆ denote by Si∗ (pi−i ) the set of player i’s pure-strategy best responses, defined by

Si∗ (pi−i ) ≡ arg max{ui (si , pi−i ) : si ∈ Si }

– assume that Si is finite, then Si∗ (pi−i ) is non-empty

ˆ if pi is a mixed strategy in ∆(Si ), we denote by supp pi its support defined by

supp pi = {pi > 0} = {si ∈ Si : pi (si ) > 0}

Proposição 8. A mixed strategy p∗i is a best response to pi−i , i.e.,

p∗i ∈ arg max{ui (pi , pi−i ) : pi ∈ ∆(Si )}

if and only if the support of p∗i is a subset of the set of pure-strategy best responses to pi−i , i.e.,

{si ∈ Si : p∗i (si ) > 0} ≡ supp p∗i ⊂ Si∗ (pi−i )

In other words the set


arg max{ui (pi , pi−i ) : pi ∈ ∆(Si )}

of best responses to pi−i coincides with

P rob(Si∗ (pi−i )) = ∆(Si∗ (pi−i ))

NE with mixed strategies: An equivalent definition

Teorema 7. A profile of mixed strategies p∗ = (p∗i )i∈I is a Nash equilibrium of the game G if and only if

ˆ for every player i every pure strategy in the support of p∗i is a best response to the other players’ mixed
strategies

∀i ∈ I, supp p∗i ⊂ arg max{ui (si , p∗−i ) : si ∈ Si }

Interpretation 8. Players

ˆ have identical beliefs about other players’ possible actions or strategies

ˆ choose best response strategies consistent with these beliefs

Matching pennies

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

ˆ suppose that player i1 believes that player i2 will play

– Heads with probability q and


– Tails with probability 1 − q

ˆ given this belief we have

ui1 (Heads, (q, 1 − q)) = 1 − 2q and ui1 (T ails, (q, 1 − q)) = 2q − 1

ˆ player i1 ’s best response(s) is

– Heads if q < 1/2


– T ails if q > 1/2
– Heads and T ails if q = 1/2

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

Fix now a mixed strategy pi1 = (r, 1 − r) for player i1 , i.e.,

pi1 (Heads) = r and pi1 (T ails) = 1 − r

ˆ If agent i1 believes that i2 is playing the mixed strategy pi2 = (q, 1 − q)

ˆ Then we can compute the set of best responses

βi1 (pi2 ) ≡ arg max{ui1 (pi1 , pi2 ) : pi1 ∈ ∆(Si1 )}

ˆ Remember that we must have


βi1 (pi2 ) = P rob(Si∗1 (pi2 ))

ˆ since Si1 = {Heads, T ails}, there are only three possibilities

βi1 (pi2 ) = {Heads}, βi1 (pi2 ) = {T ails} or βi1 (pi2 ) = ∆(Si1 )

Observe that
ui1 (pi1 , pi2 ) = (2q − 1) + r(2 − 4q)

The mixed strategy pi1 = (r, 1 − r) solves

pi1 ∈ arg max{ui1 (qi1 , pi2 ) : qi1 ∈ P rob(Si1 )}

if and only if r belongs to the set

r ∗ (q) = arg max{(2q − 1) + r(2 − 4q) : r ∈ [0, 1]}

ˆ if q < 1/2 then r ∗ (q) = 1 and i1 ’s best response is to play the pure strategy Heads

ˆ if q > 1/2 then r ∗ (q) = 0 and i1 ’s best response is to play the pure strategy T ails

ˆ if q = 1/2 then r ∗ (q) = [0, 1] and any mixed strategy is a best response, i.e., i1 is indifferent between
Heads and T ails

Observação 17. The object q 7→ r ∗ (q) is called a correspondence.

Player i1 ’s best response (r ∗ (q), 1 − r ∗ (q)) to i2 ’s strategy (q, 1 − q)

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

ˆ assume now that player i2 plans to choose a mixed strategy pi2 = (q, 1 − q) i.e.,

pi2 (Heads) = q and pi2 (T ails) = 1 − q

ˆ if agent i2 believes that i1 is playing the mixed strategy pi1 = (r, 1 − r)

ˆ then we can compute the set of best responses

arg max{ui2 (pi1 , qi2 ) : qi2 ∈ ∆(Si2 )}

Observe that
ui2 (pi1 , pi2 ) = q(4r − 2) + (1 − 2r)

A mixed strategy pi2 = (q, 1 − q) is a best response to pi1 if and only if q belongs to the set

q ∗ (r) = arg max{q(4r − 2) + (1 − 2r) : q ∈ [0, 1]}

ˆ if r < 1/2 then q ∗ (r) = 0 and i2 ’s best response is to play the pure strategy T ails

ˆ if r > 1/2 then q ∗ (r) = 1 and i2 ’s best response is to play the pure strategy Heads

ˆ if r = 1/2 then q ∗ (r) = [0, 1] and any mixed strategy is a best response, i.e., i2 is indifferent between
Heads and T ails

Player i2 ’s best response (q ∗ (r), 1 − q ∗ (r)) to i1 ’s strategy (r, 1 − r)

Permuting q and r we get the following graph

We can draw in the same picture the best response correspondence of each player

A Nash equilibrium is a pair (p∗i1 , p∗i2 ) such that

p∗i ∈ arg max{ui (pi , p∗j ) : pi ∈ ∆(Si )}

ˆ The pair defined by p∗i1 = (r̂, 1 − r̂) and p∗i2 = (q̂, 1 − q̂)
is a Nash equilibrium if and only if
r̂ ∈ r ∗ (q̂) and q̂ ∈ q ∗ (r̂)

ˆ The unique Nash equilibrium of the Matching Pennies is then

p̂i1 = (1/2, 1/2) and p̂i2 = (1/2, 1/2)
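This equilibrium can be verified directly: against an opponent who mixes 1/2–1/2, every own mix earns expected payoff 0, so no deviation is profitable.

```python
U1 = {('H', 'H'): -1, ('H', 'T'): 1, ('T', 'H'): 1, ('T', 'T'): -1}

def ev1(r, q):
    # i1 plays Heads with prob. r, i2 plays Heads with prob. q
    return (r * q * U1[('H', 'H')] + r * (1 - q) * U1[('H', 'T')]
            + (1 - r) * q * U1[('T', 'H')] + (1 - r) * (1 - q) * U1[('T', 'T')])

def ev2(r, q):
    return -ev1(r, q)   # Matching Pennies is zero-sum

r_eq = q_eq = 0.5
best_dev_1 = max(ev1(r, q_eq) for r in [0.0, 0.25, 0.5, 0.75, 1.0])
best_dev_2 = max(ev2(r_eq, q) for q in [0.0, 0.25, 0.5, 0.75, 1.0])
# both maxima equal the equilibrium payoff of 0: deviations gain nothing
```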

The battle of the sexes


Pat
Opera Fight
Opera 2,1 0,0
Chris
Fight 0,0 1,2
Denote by

ˆ (q, 1 − q) the mixed strategy in which Pat plays Opera with probability q

ˆ (r, 1 − r) the mixed strategy in which Chris plays Opera with probability r

3 NE with mixed strategies

1. Pat and Chris play the pure strategy Opera

2. Pat and Chris play the pure strategy F ight

3. Pat plays the mixed strategy where Opera is chosen with probability 1/3 and Chris plays the mixed
strategy where Opera is chosen with probability 2/3
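The mixed equilibrium follows from the indifference conditions, which a short computation confirms (payoffs as in the table above):

```python
q = 1.0 / 3.0   # Pat's prob. of Opera, making Chris indifferent (2q = 1 - q)
r = 2.0 / 3.0   # Chris's prob. of Opera, making Pat indifferent (r = 2(1 - r))

chris_opera = 2 * q + 0 * (1 - q)   # Chris's payoff from Opera vs Pat's mix
chris_fight = 0 * q + 1 * (1 - q)   # ... and from Fight
pat_opera = 1 * r + 0 * (1 - r)     # Pat's payoff from Opera vs Chris's mix
pat_fight = 0 * r + 2 * (1 - r)     # ... and from Fight
```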

2 players with 2 pure strategies


Consider the problem of defining player i1 ’s best response (r, 1 − r) when player i2 plays (q, 1 − q)

Player i2
Left Right
Up x,− y,−
Player i1
Down z,− w,−
We discuss the four following cases

(i) x > z and y > w

(ii) x < z and y < w

(iii) x > z and y < w

(iv) x < z and y > w

Then we turn to the remaining cases involving x = z or y = w

ˆ In case (i), the pure strategy U p strictly dominates Down

ˆ In case (ii), the pure strategy Down strictly dominates U p

ˆ In cases (iii) and (iv), neither U p nor Down is strictly dominated

ˆ let q ′ = (w − y)/(x − z + w − y)

ˆ In case (iii) U p is optimal for q > q ′ and Down for q < q ′ , whereas in case (iv) the reverse is true

ˆ In both cases, any value of r is optimal when q = q ′

ˆ Observe that q ′ = 1 if x = z and q ′ = 0 if y = w

ˆ For cases involving x = z or y = w the best response correspondences are L-shaped (two adjacent sides
of the unit square)

ˆ If we add arbitrary payoffs for player i2

ˆ Then we can perform analogous computations and get the same 4 best-response correspondences

ˆ Fix any of the four best response correspondence for player i1

ˆ Fix any of the four best response correspondence for player i2

ˆ Checking all 16 possible pairs, there is always at least one intersection

We obtain the following qualitative features that can result: There can be

ˆ a single pure strategy Nash equilibrium

ˆ a single mixed strategy equilibrium

ˆ 2 pure strategy equilibria and a single mixed strategy equilibrium

Nash existence result

Teorema 9 (Nash). Consider a game G = (Si , ui )i∈I with a finite set of players I. If for each player i the set of
pure strategies Si is finite then there exists at least one Nash equilibrium with mixed strategies.

General existence result

Teorema 10. Consider a game G = (Si , ui )i∈I


and assume that for each player i,

(1) the set Si is a compact, convex and non-empty subset of Rni for some ni ∈ N
(2) the payoff function s → ui (s) is continuous on S = Πi∈I Si

(3) for each s−i ∈ S−i , the function si → ui (si , s−i ) is quasi-concave in the sense that

∀si ∈ Si , {s′i ∈ Si : ui (s′i , s−i ) ≥ ui (si , s−i )} is convex

Then there exists at least one pure strategy Nash equilibrium

Cap. 2 - Dynamic games of complete information
2.1 Dynamic games of complete and perfect information
Theory: Backwards induction

Important words

ˆ we introduce dynamic games

ˆ restrict our attention to games with complete information

– the players’ payoff functions are common knowledge

ˆ in this chapter we analyze dynamic games with complete but also perfect information

– at each move in the game


– the player with the move knows the full history of the play of the game thus far

ˆ the central issue in dynamic games is credibility

An example:
Consider the following 2-move game

1. player i1 chooses between giving player i2 $1,000 and giving player i2 nothing

2. player i2 observes player i1 ’s move and then chooses whether or not to explode a grenade that will kill
both players

Suppose that player i2 threatens to explode the grenade unless player i1 pays the $1, 000

ˆ if player i1 believes the threat, then


player i1 ’s best response is to pay the $1, 000

ˆ but player i1 should not believe the threat, because it is not credible:

– if player i2 were given the opportunity to carry out the threat


– player i2 would choose not to carry it out

ˆ player i1 should pay player i2 nothing

The framework
We analyze in this chapter the following class of dynamic games with complete and perfect information

ˆ there are 2 players and 2 moves

ˆ first, player i1 moves

ˆ then player i2 observes player i1 ’s move

ˆ then player i2 moves and the game ends

Description of a specific class of games

1. player i1 chooses an action ai1 from a feasible set Ai1

2. player i2 observes ai1 and then chooses an action ai2 from a feasible set Ai2

3. payoffs are ui1 (ai1 , ai2 ) and ui2 (ai1 , ai2 )

Other dynamic games with complete and perfect information


The key features of a dynamic game of complete and perfect information are that

1. the moves occur in sequence

2. all previous moves are observed before the next move is chosen

3. the players’ payoffs from each feasible combination of moves are common knowledge

Backwards induction
We solve a game from this class by backwards induction as follows:

ˆ when player i2 gets the move at the second stage of the game

ˆ he will face the following problem

max{ui2 (ai1 , ai2 ) : ai2 ∈ Ai2 } (with ai1 given)

ˆ assume that for each ai1 ∈ Ai1 , player i2 ’s optimization problem has a unique solution, denoted by Ri2 (ai1 )

– this is player i2 ’s reaction (or best response) to player i1 ’s action

ˆ recall that payoffs are common knowledge

– therefore player i1 can solve i2 ’s problem as well as i2 can

ˆ player i1 will anticipate player i2 ’s reaction to each action ai1 that i1 might take

ˆ thus player i1 ’s problem at the first stage amounts to

max{ui1 (ai1 , Ri2 (ai1 )) : ai1 ∈ Ai1 }

ˆ assume that the previous optimization problem for i1 also has a unique solution, denoted by a∗i1

Definição 14. The pair of actions (a∗i1 , Ri2 (a∗i1 )) is called the backwards induction outcome of this game

Backwards induction and credible threats

ˆ the backwards induction outcome does not involve non-credible threats

ˆ player i1 anticipates that player i2 will respond optimally to any action ai1 that i1 might choose, by
playing Ri2 (ai1 )

ˆ player i1 gives no credence to threats by player i2 to respond in ways that will not be in i2 ’s self-interest
when the second stage arrives
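For finite action sets, the two-stage procedure can be written down directly; the action names and payoff tables below are hypothetical.

```python
A1, A2 = ['a', 'b'], ['x', 'y']                     # hypothetical action sets
u1 = {('a', 'x'): 3, ('a', 'y'): 1, ('b', 'x'): 2, ('b', 'y'): 4}
u2 = {('a', 'x'): 1, ('a', 'y'): 2, ('b', 'x'): 1, ('b', 'y'): 0}

# stage 2: player i2's best response to each observed action of i1
R2 = {a1: max(A2, key=lambda a2: u2[(a1, a2)]) for a1 in A1}

# stage 1: i1 anticipates R2 and maximizes u1(a1, R2(a1))
a1_star = max(A1, key=lambda a1: u1[(a1, R2[a1])])
outcome = (a1_star, R2[a1_star])
```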

A 3-move game
Consider the following 3-move game: player i1 moves twice

1. Player i1 chooses L or R

ˆ L ends the game with payoffs of 2 to i1 and 0 to i2

2. Player i2 observes i1 ’s choice:


if i1 chose R then i2 can choose between L′ and R′

ˆ L′ ends the game with payoffs of 1 to both players

3. Player i1 observes i2 ’s choice2 :


if the earlier choices were R and R′ then i1 chooses L′′ or R′′ , both of which end the game

ˆ L′′ with payoffs of 3 to player i1 and 0 to player i2


ˆ R′′ with payoffs of 0 to player i1 and 2 to player i2

2
And recalls his own choice in the first stage

Let’s compute the backwards induction outcome of this game

ˆ we begin at the third stage, i.e., player i1 ’s second move

– the strategy L′′ is optimal

ˆ at the second stage, player i2 anticipates that if the game reaches the third stage then i1 will play L′′

– payoff of 1 from action L′

– payoff of 0 from action R′

at the second stage, the optimal action for player i2 is L′

ˆ at the first stage, player i1 anticipates that if the game reaches the second stage then i2 will play L′

– payoff of 2 from action L

– payoff of 1 from action R

the first stage choice for player i1 is L, thereby ending the game

Stackelberg model of duopoly

Von Stackelberg (1934) proposed a dynamic model of duopoly

ˆ a dominant (leader) firm moves first

ˆ a subordinate (follower) firm moves second

At some points in the history of the U.S. automobile industry, for example, General Motors has seemed to play
such a leadership role

ˆ as in the Cournot model, Stackelberg assumes that firms choose quantities

Timing of the game

1. firm i1 chooses the quantity qi1

2. firm i2 observes qi1 and then chooses a quantity qi2

3. the payoff to firm i is given by the profit function

πi (qi , qj ) = qi [P (Q) − c]

where

ˆ P (Q) = [a − Q]+ is the market-clearing price when the aggregate quantity on the market is Q =
q i1 + q i2
ˆ c is the constant marginal cost of production (no fixed costs)

Solving by backwards induction

ˆ we first compute i2 ’s reaction to an arbitrary quantity of i1

Ri2 (qi1 ) ≡ arg max{πi2 (qi1 , qi2 ) : qi2 ≥ 0}

which yields

Ri2 (qi1 ) = (a − c − qi1 )/2 if qi1 < a − c
Ri2 (qi1 ) = 0 if qi1 ≥ a − c

ˆ second, i1 can solve i2 ’s problem as well as i2 can solve it

ˆ firm i1 should anticipate that the quantity choice qi1 will be met with the reaction Ri2 (qi1 )

ˆ Firm i1 ’s problem in the first stage of the game amounts to

arg max{πi1 (qi1 , Ri2 (qi1 )) : qi1 ≥ 0}

ˆ The backwards induction outcome of the Stackelberg duopoly game is (qi∗1 , qi∗2 ) where

qi∗1 = (a − c)/2 and qi∗2 = Ri2 (qi∗1 ) = (a − c)/4
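A numerical backwards-induction check of this outcome (a = 10 and c = 1 are illustrative):

```python
a, c = 10.0, 1.0   # illustrative parameters

def R2(q1):
    # follower's reaction function from the second stage
    return max((a - c - q1) / 2.0, 0.0)

def leader_profit(q1):
    Q = q1 + R2(q1)
    return q1 * (max(a - Q, 0.0) - c)

# grid search over the leader's first-stage quantity
grid = [k * (a - c) / 10000.0 for k in range(10001)]
q1_star = max(grid, key=leader_profit)
q2_star = R2(q1_star)
# recovers q1* = (a - c)/2 and q2* = (a - c)/4
```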

Interpretation

ˆ in the Nash equilibrium of the Cournot game (simultaneous moves) each firm produces (a − c)/3

– thus aggregate quantity in the backwards induction outcome of the Stackelberg game, 3(a − c)/4, is
greater than in the Cournot-Nash equilibrium
– so the market clearing price is lower in the Stackelberg game

ˆ in the Stackelberg game, i1 could have chosen its Cournot quantity, (a − c)/3

– in which case i2 would have responded with its Cournot quantity

ˆ in the Stackelberg game, i1 could have achieved its Cournot profit level but chose to do otherwise

ˆ so i1 ’s profit in the Stackelberg game must exceed its profit in the Cournot game

ˆ but because the market clearing price is lower in the Stackelberg game

ˆ the aggregate profits are lower wrt. the Cournot outcome

ˆ therefore, the fact that i1 is better off implies that i2 is worse off

Observação 18. In game theory, having more information can make a player worse off.

ˆ more precisely, having it known to the other players that one has more information can make a player
worse off

ˆ in the Stackelberg game, the information in question is i1 ’s quantity

ˆ firm i2 knows i1 ’s action qi1

ˆ and firm i1 knows that i2 knows qi1

Wages and employment in a unionized firm

Leontief (1946) proposed the following model of the relationship between a firm and a monopoly union

ˆ The union is the monopoly seller of labor to the firm

ˆ The union has exclusive control over wages

ˆ But the firm has exclusive control over employment

ˆ The union’s utility function is U (w, L) where

– w is the wage the union demands from the firm


– L is employment

ˆ We assume that (w, L) → U (w, L) is increasing in both w and L

ˆ The firm’s profit function is


π(w, L) ≡ R(L) − wL

– R(L): revenue the firm can earn if it employs L workers

ˆ We assume that L 7→ R(L) is

– twice continuously differentiable

– strictly increasing (i.e., R′ > 0)
– strictly concave (i.e., R′′ < 0) and
– satisfies Inada’s conditions at 0 and ∞, i.e.,

limL→0+ R′ (L) = ∞ and limL→∞ R′ (L) = 0

Timing of the game

1. The union makes a wage demand, w

2. The firm observes and accepts w and then chooses employment, L

3. Payoffs are U (w, L) and π(w, L)

Backwards induction outcome of the game

ˆ First, we can characterize the firm’s best response L∗ (w) in stage 2 to an arbitrary wage demand w by
the union in stage 1

ˆ Given w the firm chooses L∗ (w) to solve

L∗ (w) ≡ arg max{π(w, L) : L ≥ 0}

ˆ If w > 0 then there is a unique solution L∗ (w) satisfying

R′ (L∗ (w)) = w

Firm’s isoprofit curves:

Fixing the wage level w′ on the vertical line

ˆ the firm’s choice of L is a point on the horizontal line {(L, w′ ) : L ≥ 0}

ˆ Holding L fixed, the firm does better when w is lower

– optimal L is such that the isoprofit curve through (L, w) is tangent to the constraint {(L, w′ ) : L ≥ 0}

Union’s indifference curves

ˆ Holding L fixed, the union does better when w is higher

ˆ Higher indifference curves represent higher utility levels for the union

We turn to the union’s problem at stage 1

ˆ The union can solve the firm’s second stage problem as well as the firm can solve it

ˆ The union should anticipate that the firm’s reaction to the wage demand w will be to choose the employ-
ment level L∗ (w)

ˆ Thus, the union’s problem at stage 1 amounts to solve

arg max{U (w, L∗ (w)) : w > 0}

ˆ The union would like to choose the wage demand w that yields the outcome (w, L∗ (w)) that is on the
highest possible indifference curve

The solution to the union’s problem, w∗ , is the wage demand such that

ˆ the union’s indifference curve through the point (L∗ (w∗ ), w∗ ) is tangent to the curve {(L∗ (w), w) : w > 0} at
that point
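A worked example under hypothetical functional forms, R(L) = √L and U (w, L) = (w − w0 )L with reservation wage w0 (neither functional form is in the text):

```python
w0 = 1.0   # hypothetical reservation wage

def L_star(w):
    # firm's stage-2 choice: R'(L) = 1/(2*sqrt(L)) = w  =>  L = 1/(4 w^2)
    return 1.0 / (4.0 * w * w)

def union_objective(w):
    # stage-1 objective, anticipating the firm's reaction L_star(w)
    return (w - w0) * L_star(w)

grid = [1.0 + k / 10000.0 for k in range(40001)]   # wage demands in [1, 5]
w_star = max(grid, key=union_objective)
# under these functional forms the optimal wage demand is w* = 2*w0
```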

Inefficiency

ˆ The backwards induction outcome (w∗ , L∗ (w∗ )) is inefficient

ˆ Both the union’s utility and the firm’s profit would be increased if (L, w) were in the shaded region

Repeated games
Espinosa and Rhee (1989) propose one answer to this puzzle

ˆ Based on the fact that the union and the firm negotiate repeatedly over time

ˆ There may exist an equilibrium of such a repeated game in which the union’s choice of w and the firm’s
choice of L lie in the shaded region

Sequential bargaining

ˆ Two players are bargaining over one dollar

ˆ They alternate in making offers

ˆ First player i1 makes a proposal that i2 can accept or reject

ˆ If i2 rejects then i2 makes a proposal that i1 can accept or reject

ˆ And so on

ˆ Each offer takes one period, and the players are impatient

– they discount payoffs received in later periods by a factor δ ∈ (0, 1) per period

Discount factor
The discount factor δ reflects the time-value of money

ˆ A dollar received at the beginning of one period can be put in the bank to earn interest, say at rate r per
period

– So this dollar will be worth 1 + r dollars at the beginning of the next period

ˆ Equivalently, a dollar to be received at the beginning of the next period is worth only 1/(1 + r) of a dollar
now

Let δ = 1/(1 + r). Then, a payoff π to be received

ˆ in the next period is worth only δπ now

ˆ two periods from now is worth only δ 2 π now, and so on

Observação 19. The value today of a future payoff is called the present value of that payoff.

The 3-period case


Timing of 3-period bargaining game

(1a) At the beginning of the first period, player i1 proposes to take a share s1 of the dollar, leaving 1 − s1 for
player i2

(1b) Player i2 either

ˆ accepts the offer:


the game ends and the payoffs s1 to i1 and 1 − s1 to i2 are immediately received
ˆ rejects the offer,
play continues to the second period

(2a) At the beginning of the second period, i2 proposes that player i1 take a share s2 of the dollar,3 leaving
1 − s2 for i2

(2b) Player i1 either

ˆ accepts the offer:


the game ends and the payoffs s2 to i1 and 1 − s2 to i2 are immediately received
ˆ rejects the offer:
play continues to the third period

(3) At the beginning of the third period,

ˆ i1 receives a share s of the dollar


ˆ i2 receives a share 1 − s of the dollar

where s ∈ (0, 1) is exogenously given

Backwards induction outcome


We first compute i2 ’s optimal offer if the second period is reached

ˆ Player i1 can receive s in the third period by rejecting i2 ’s offer of s2 this period

ˆ But the value this period of receiving s next period is only δs

ˆ Thus, i1 will

– accept s2 if s2 ≥ δs
– reject s2 if s2 < δs

ˆ We assume that each player will accept an offer if indifferent between accepting and rejecting

ˆ Player i2 ’s decision problem in the second period amounts to choosing between

– receiving 1 − δs this period by offering s2 = δs to player i1


– receiving 1 − s next period by offering player i1 any s2 < δs

ˆ The discounted value of the latter decision is δ(1 − s),

– which is less than 1 − δs available from the former option

ˆ So player i2 ’s optimal second-period offer is s∗2 = δs

3. st always goes to player i1 regardless of who made the offer

Remark 20. If play reaches the second period, player i2 will offer s∗2 and player i1 will accept.

ˆ Since i1 can solve i2 ’s second-period problem as well as player i2 can

ˆ Then i1 knows that i2 can receive 1 − s∗2 in the second period by rejecting i1 ’s offer of s1 this period

ˆ The value this period of receiving 1 − s∗2 next period is only δ(1 − s∗2 )

ˆ Thus player i2 will accept i1 ’s offer of s1 this period ⇔

1 − s1 ≥ δ(1 − s∗2 ) or s1 ≤ 1 − δ(1 − s∗2 )

ˆ Player i1 ’s first-period decision problem therefore amounts to choosing between

– receiving 1 − δ(1 − s∗2 ) this period by offering 1 − s1 = δ(1 − s∗2 ) to i2


– receiving s∗2 next period by offering 1 − s1 < δ(1 − s∗2 ) to i2

ˆ The discounted value of the latter option is δs∗2 = δ2 s

– which is less than the 1 − δ(1 − s∗2 ) = 1 − δ(1 − δs) available from the former option

ˆ Thus player i1 ’s optimal first-period offer is

s∗1 = 1 − δ(1 − s∗2 ) = 1 − δ(1 − δs)

Remark 21. The backwards induction outcome of this 3-period game is

ˆ i1 offers the settlement (s∗1 , 1 − s∗1 ) to i2 , who accepts
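The backwards-induction formulas above can be checked with a short numerical sketch (the values δ = 0.9 and s = 0.5 are illustrative, not from the text):

```python
# Backwards induction in the 3-period bargaining game, a minimal sketch.
# delta (discount factor) and s (exogenous third-period share) are illustrative.
delta, s = 0.9, 0.5

s2_star = delta * s                  # i2's optimal second-period offer to i1
s1_star = 1 - delta * (1 - s2_star)  # i1's optimal first-period share

# i1 prefers settling now over receiving s2_star one period later:
assert s1_star > delta * s2_star
```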

The infinite horizon case

ˆ The timing is as described previously

ˆ Except that the exogenous settlement in step (3) is replaced by an infinite sequence of steps (3a), (3b),
(4a), (4b), and so on

– Player i1 makes the offer in odd-numbered period


– Player i2 in even-numbered

ˆ Bargaining continues until one player accepts an offer

ˆ We would like to solve backwards

ˆ Because the game could go on infinitely, there is no last move at which to begin such an analysis

A solution was proposed by Shaked and Sutton (1984)

ˆ The game beginning in the third period (should it be reached) is identical to the game as a whole (beginning
in the first period)

ˆ In both cases (game beginning in the 3rd period or as a whole)

– player i1 makes the first offer


– the players alternate in making subsequent offers
– the bargaining continues until one player accepts an offer

ˆ Suppose that there is a backwards induction outcome of the game as a whole in which players i1 and i2
receive the payoffs s and 1 − s

ˆ We can use these payoffs in the game beginning in the third period, should it be reached

ˆ And then work backwards to the first period, as in the 3-period model, to compute a new backwards
induction outcome for the game as a whole

ˆ In this new backwards induction outcome, i1 will offer the settlement (f (s), 1 − f (s)) in the first period
and i2 will accept, where
f (s) = 1 − δ(1 − δs)

ˆ Let sH be the highest payoff player i1 can achieve in any backwards induction outcome of the game as a
whole

ˆ Using sH as the third-period payoff to player i1 , this will produce a new backwards induction outcome in
which player i1 ’s first-period payoff is f (sH )

ˆ Since s 7→ f (s) = 1 − δ + δ2 s is increasing, the payoff f (sH ) must coincide with sH

ˆ The only value of s that satisfies f (s) = s is 1/(1 + δ), which will be denoted by s∗

ˆ Actually we can prove that (s∗ , 1 − s∗ ) is the unique backwards-induction outcome of the game as a whole

– In the first period, i1 offers the settlement (s∗ , 1 − s∗ )


– Player i2 accepts
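The fixed-point argument of Shaked and Sutton can be illustrated by iterating the map f until convergence (a minimal sketch; δ = 0.9 is an illustrative value):

```python
# Fixed point of f(s) = 1 - delta*(1 - delta*s), a minimal sketch.
# delta = 0.9 is illustrative; f is a contraction with modulus delta**2 < 1.
delta = 0.9
f = lambda s: 1 - delta * (1 - delta * s)

s = 0.5  # arbitrary starting guess
for _ in range(200):
    s = f(s)  # iteration converges to the unique fixed point

s_star = 1 / (1 + delta)  # the closed-form solution of f(s) = s
```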

2.2 Two-stage games of complete but imperfect information


Theory: Subgame perfection

ˆ We continue to assume that play proceeds in a sequence of stages

ˆ The moves in all previous stages are observed before the next stage begins

ˆ However, we now allow there to be simultaneous moves within each stage

– The game has imperfect information

We will analyze the following simple game:

1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively

2. Players i3 and i4 observe the outcome of the first stage, (ai1 , ai2 ), and then simultaneously choose actions
ai3 and ai4 from feasible sets Ai3 and Ai4 , respectively

3. Payoffs are ui (ai1 , ..., ai4 )

ˆ The feasible action sets of players i3 and i4 in the second stage, Ai3 and Ai4 , could be allowed to depend
on the outcome of the first stage, (ai1 , ai2 )

ˆ In particular, there may be values of (ai1 , ai2 ) that end the game

ˆ One could allow for a longer sequence of stages either by allowing players to move in more than one stage
or by adding players

ˆ In some applications, players i3 and i4 are players i1 and i2

ˆ In other applications, either player i2 or player i4 is missing

ˆ We solve the game by using an approach in the spirit of backwards induction

ˆ The first step in working backwards from the end of the game involves solving a simultaneous-move game
between players i3 and i4 in stage 2, given the outcome of stage 1

ˆ We will assume that for each feasible outcome (ai1 , ai2 ) of the first-stage game, the second-stage game that
remains between players i3 and i4 has a unique Nash equilibrium denoted by (âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ If i1 and i2 anticipate that the second-stage behavior of i3 and i4 will be given by the functions âi3 and
âi4

ˆ Then the first-stage interaction between i1 and i2 amounts to the following simultaneous-move game

1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
2. Payoffs are
ui (ai1 , ai2 , âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ Suppose (a∗i1 , a∗i2 ) is the unique Nash equilibrium of this simultaneous-move game

ˆ We will call
(a∗i1 , a∗i2 , a∗i3 , a∗i4 )

the subgame-perfect outcome of this two-stage game, where

a∗i3 = âi3 (a∗i1 , a∗i2 ) and a∗i4 = âi4 (a∗i1 , a∗i2 )

Attractive feature 11.

ˆ Players i1 and i2 should not believe a threat by players i3 and i4 that the latter will respond with actions
that are not a Nash equilibrium in the remaining second-stage game

ˆ Because when play actually reaches the second stage at least one of i3 and i4 will not want to carry out
such a threat exactly because it is not a best response

Unattractive feature 12.

ˆ Suppose player i1 is also player i3 and that player i1 does not play a∗i1 in the first stage

ˆ Player i4 may then want to reconsider the assumption that player i3 (i.e., player i1 ) will play âi3 (ai1 , ai2 )
in the second stage

Bank runs

Diamond and Dybvig (1983)

ˆ Two investors have each deposited D with a bank

ˆ The bank has invested the deposits 2D in a long-term project

ˆ If the bank is forced to liquidate its investment before the project matures, a total of α(2D) can be
recovered, where
1/2 < α < 1

ˆ If the bank allows the investment to reach maturity, the project will pay out a total of β(2D), where
β>1

ˆ There are two dates at which investors can make withdrawals from the bank

– date 1 is before the bank’s investment matures


– date 2 is after

ˆ For simplicity we assume that there is no discounting

ˆ If both investors make withdrawals at date 1 then each receives αD and the game ends

ˆ If only one investor makes a withdrawal at date 1

– then that investor receives the whole deposit D,


– the other receives (2α − 1)D,
– and the game ends

ˆ Finally, if neither investor makes a withdrawal at date 1 then the project matures and the investors make
withdrawal decisions at date 2

ˆ If both investors make withdrawals at date 2 then each receives βD > D and the game ends

ˆ If only one investor makes a withdrawal at date 2 then that investor receives (2β − 1)D > βD, the other
receives D, and the game ends

ˆ Finally if neither investor makes a withdrawal at date 2 then the bank returns βD to each investor and
the game ends

withdraw don’t withdraw


withdraw αD, αD D, (2α − 1)D
don’t withdraw (2α − 1)D, D next stage

withdraw don’t withdraw


withdraw βD, βD (2β − 1)D, D
don’t withdraw D, (2β − 1)D βD, βD

ˆ To analyze this game, we work backwards

ˆ Consider the normal-form game at date 2

ˆ The strategy withdraw strictly dominates don’t withdraw

ˆ There is a unique Nash equilibrium in this game: both investors withdraw, leading to a payoff of (βD, βD)

ˆ Since there is no discounting, we can simply substitute this payoff into the normal-form game at date 1

Date 1

withdraw don’t withdraw


withdraw αD, αD D, (2α − 1)D
don’t withdraw (2α − 1)D, D βD, βD
This one-period version of the two-period game has two pure-strategy Nash equilibria:

1. both investors withdraw, leading to a payoff of (αD, αD)

2. both investors do not withdraw, leading to a payoff of (βD, βD)

ˆ The original 2-period bank runs game has two subgame perfect outcomes

1. both investors withdraw at date 1, yielding payoffs of (αD, αD)


2. both investors do not withdraw at date 1 but do withdraw at date 2, yielding payoffs of (βD, βD)

ˆ The first of these outcomes can be interpreted as a run on the bank

ˆ If investor i1 believes that investor i2 will withdraw at t = 1

– then investor i1 ’s best response is to withdraw,


– even though both investors would be better off if they waited until date 2 to withdraw

Remark 22. Since there are two subgame perfect equilibria,

ˆ this model does not predict when bank runs will occur,

ˆ but does show that they can occur as an equilibrium phenomenon
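The two equilibria of the date-1 reduced game can be verified with a brute-force best-response check (a sketch; α = 0.7, β = 1.5, D = 1 are illustrative values satisfying 1/2 < α < 1 and β > 1):

```python
# Pure-strategy NE of the date-1 reduced game, a minimal sketch.
# alpha, beta, D are illustrative values; 0 = withdraw, 1 = don't withdraw.
alpha, beta, D = 0.7, 1.5, 1.0
W, N = 0, 1

# payoffs[(a1, a2)] = (payoff to investor 1, payoff to investor 2)
payoffs = {
    (W, W): (alpha * D, alpha * D),
    (W, N): (D, (2 * alpha - 1) * D),
    (N, W): ((2 * alpha - 1) * D, D),
    (N, N): (beta * D, beta * D),  # continuation value of the date-2 NE
}

def is_nash(a1, a2):
    u1, u2 = payoffs[(a1, a2)]
    best1 = all(u1 >= payoffs[(d, a2)][0] for d in (W, N))
    best2 = all(u2 >= payoffs[(a1, d)][1] for d in (W, N))
    return best1 and best2

nash = [profile for profile in payoffs if is_nash(*profile)]  # two pure NE
```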

Tariffs and imperfect international competition

ˆ Consider two identical countries, denoted by i1 and i2

ˆ Each country has

– a government that chooses a tariff rate


– a firm that produces output for both home consumption and export
– consumers who buy on the home market from either the home firm or the foreign firm

ˆ If the total quantity on the market in country i is Qi , then the market clearing price is

Pi (Qi ) = [a − Qi ]+

ˆ The firm in country i (called firm i) produces hi for home consumption and ei for export, in particular
we have
Qi = hi + ej

ˆ The firms have a constant marginal cost c and no fixed costs (we assume that c < a)

ˆ The total cost of production for firm i is

Ci (hi , ei ) ≡ c(hi + ei )

ˆ The firms also incur tariff costs on exports

– if firm i exports ei to country j


– when government j has set the tariff rate tj

then firm i must pay tj ei to government j

Timing

1. The governments simultaneously choose tariff rates, ti1 and ti2

2. The firms observe the tariff rates and simultaneously choose quantities for home consumption and for
export (hi , ei )

3. Payoffs are profit to firms and total welfare to governments

ˆ Profit to firm i is

πi (ti , tj , hi , ei , hj , ej ) ≡ [a − (hi + ej )]+ hi + [a − (ei + hj )]+ ei − c(hi + ei ) − tj ei

ˆ Total welfare to government i,

Wi (ti , tj , hi , ei , hj , ej ) ≡ (1/2)Q2i + πi (ti , tj , hi , ei , hj , ej ) + ti ej

where total welfare is the sum of

– consumers’ surplus enjoyed by the consumers in country i,


– the profit earned by the firm i, and
– the tariff revenue collected by government i from firm j

Solution

ˆ Suppose the governments have chosen the tariffs ti1 and ti2

ˆ Assume that (h∗i1 , e∗i1 , h∗i2 , e∗i2 ) is a Nash equilibrium in the remaining game between firms i1 and i2

ˆ Then, for each i, (h∗i , e∗i ) must solve

arg max{πi (ti , tj , hi , ei , h∗j , e∗j ) : hi ≥ 0 and ei ≥ 0}

ˆ Firm i is maximizing profits on market i and market j

– h∗i must solve


arg max{hi [a − (hi + e∗j )]+ − chi : hi ≥ 0}

– e∗i must solve


arg max{ei [a − (ei + h∗j )]+ − (c + tj )ei : ei ≥ 0}

ˆ Assuming e∗j ≤ a − c, we have

h∗i = (a − e∗j − c)/2

ˆ Assuming h∗j ≤ a − c − tj , we have

e∗i = (a − h∗j − c − tj )/2
ˆ We obtain four equations with four unknowns

ˆ If ti ≤ (a − c)/2 for each player i, then the solutions are

h∗i (ti ) = (a − c + ti )/3 and e∗i (tj ) = (a − c − 2tj )/3

ˆ In the Cournot game, both firms were choosing the quantity (a − c)/3,

– but this result was derived under the assumption of symmetric marginal costs

ˆ In the equilibrium described above, the governments’ tariff choices make marginal costs asymmetric

– On market i, firm i’s marginal cost is c but firm j’s is c + ti


– Since firm j’s cost is higher it wants to produce less
– If firm j is going to produce less, then the market-clearing price will be higher, so firm i wants to
produce more

ˆ In equilibrium the function h∗i increases in ti and e∗j decreases (at a faster rate) in ti

ˆ Having solved the second-stage game that remains between the two firms after the governments choose
tariff rates

ˆ We can now represent the first-stage interaction between the two governments as the following simultaneous-
move game

ˆ First, the governments simultaneously choose tariff rates ti1 and ti2

ˆ Second, payoffs are


Wi (ti , tj , h∗i (ti ), e∗i (tj ), h∗j (tj ), e∗j (ti ))

ˆ We now solve for the Nash equilibrium of this game between the governments

ˆ We denote by (ti , tj ) 7→ Wi∗ (ti , tj ) the function defined by

Wi∗ (ti , tj ) ≡ Wi (ti , tj , h∗i (ti ), e∗i (tj ), h∗j (tj ), e∗j (ti ))

ˆ If (t∗i , t∗j ) is a Nash equilibrium of this game between governments then, for each i, the tariff t∗i must solve

arg max{Wi∗ (ti , t∗j ) : ti ≥ 0}

ˆ We propose to show that there exists a solution

(t∗i , t∗j ) ∈ (0, (a − c)/2) × (0, (a − c)/2)

ˆ Observe that if ti and t∗j belong to (0, (a − c)/2) then Wi∗ (ti , t∗j ) equals

(2(a − c) − ti )2 /18 + (a − c + ti )2 /9 + (a − c − 2t∗j )2 /9 + ti (a − c − 2ti )/3

ˆ A solution is t∗i = (a − c)/3 for each i, independent of t∗j

ˆ In this model, choosing a tariff rate of (a − c)/3 is a dominant strategy for each government

ˆ We then obtain the following firms’ quantity choices for the second-stage

h∗i (t∗i ) = 4(a − c)/9 and e∗i (t∗j ) = (a − c)/9

ˆ Thus, the subgame-perfect outcome of this tariff game is

t∗i1 = t∗i2 = (a − c)/3, h∗i1 = h∗i2 = 4(a − c)/9 and e∗i1 = e∗i2 = (a − c)/9
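The subgame-perfect outcome above can be verified numerically against the firms' second-stage best-response conditions (a sketch; a = 10 and c = 1 are illustrative values):

```python
# Numerical check of the tariff game's subgame-perfect outcome, a sketch.
# a = 10, c = 1 are illustrative; the formulas are those derived above.
a, c = 10.0, 1.0

t_star = (a - c) / 3       # dominant-strategy tariff
h_star = 4 * (a - c) / 9   # home-consumption quantity
e_star = (a - c) / 9       # export quantity

# Second-stage best responses: h_i = (a - e_j - c)/2, e_i = (a - h_j - c - t_j)/2
assert abs(h_star - (a - e_star - c) / 2) < 1e-12
assert abs(e_star - (a - h_star - c - t_star) / 2) < 1e-12
```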

ˆ If the governments had chosen tariff rates equal to 0

ˆ Then the aggregate quantity on each market would have been

Qi = 2(a − c)/3

just as in the Cournot model

ˆ The consumers’ surplus on market i is lower when the governments choose their dominant strategy tariffs
than it would be if they chose zero tariffs

ˆ In fact, zero tariffs is socially optimal, i.e., it is the solution of

arg max{Wi∗1 (ti1 , ti2 ) + Wi∗2 (ti1 , ti2 ) : ti1 ≥ 0 and ti2 ≥ 0}

ˆ There is an incentive for the governments to sign a treaty in which they commit to zero tariffs

Tournaments

Lazear and Rosen (1981)

ˆ Consider two workers J = {j1 , j2 } and their boss

ˆ Worker j produces output yj = ej + εj

– ej is effort and εj is noise

ˆ Production proceeds as follows:

1. The workers simultaneously choose non-negative effort levels: ej ≥ 0


2. The noise terms εj1 and εj2 are independently drawn from a density f : R → [0, ∞) with zero mean
3. The workers’ outputs are observed but their effort choices are not

ˆ The workers’ wages therefore can depend on their outputs but not directly on their effort levels

ˆ Suppose the boss decides to induce effort by having the workers compete in a tournament

ˆ The winner of the tournament is the worker with the higher output

– wH : wage earned by the winner of the tournament


– wL : the wage earned by the loser

ˆ The payoff to a worker from earning wage w and expending effort e is

u(w, e) = w − g(e)

– g(e): disutility under the effort level e


– g : [0, ∞) → [0, ∞) is twice continuously differentiable and satisfies g′ > 0 (strictly increasing) and
g ′′ > 0 (strictly convex)

ˆ The payoff to the boss is yj1 + yj2 − wH − wL

ˆ The boss is player i1 whose action ai1 is choosing the wages to be paid in the tournament, wH and wL

ˆ There is no player i2

ˆ Worker j1 is player i3 and worker j2 is player i4

ˆ Workers observe the wages chosen in the first stage and then simultaneously choose actions ai3 and ai4 ,
namely effort choices ej1 and ej2

ˆ Since outputs (and so also wages) are functions not only of the players actions but also of the noise term
εj1 and εj2 , we work with the players’ expected payoffs according to the density f

ˆ Suppose that the boss has chosen the wages wH and wL

ˆ Let (e∗j1 , e∗j2 ) be a Nash equilibrium of the remaining game between the workers

ˆ For each j, e∗j must solve


arg max{πj (wH , wL , ej , e∗k ) : ej ≥ 0}

where πj (wH , wL , ej , e∗k ) is the expected profit defined by

πj (wH , wL , ej , e∗k ) = wH Pr{yj (ej ) > yk (e∗k )} + wL Pr{yj (ej ) < yk (e∗k )} − g(ej )
= (wH − wL ) Pr{yj (ej ) > yk (e∗k )} + wL − g(ej )

where yj (ej ) = ej + εj and yk (e∗k ) = e∗k + εk

ˆ Assume e∗j is strictly positive

ˆ The first-order condition of the maximization problem is

(wH − wL ) ∂ Pr{yj (ej ) > yk (e∗k )}/∂ej = g′ (ej )

ˆ The worker j chooses ej such that the marginal disutility of extra effort, g ′ (ej ), equals the marginal gain
from extra effort

Observe that by Bayes’ rule

Prob{yj (ej ) > yk (e∗k )} = Prob{εj > e∗k + εk − ej } = ∫ Prob{εj > e∗k + z − ej | εk = z}f (z)dz

Since εj and εk are independent we have

Prob{εj > e∗k + z − ej | εk = z} = Prob{εj > e∗k + z − ej }

implying that4

Prob{yj (ej ) > yk (e∗k )} = ∫ [1 − F (e∗k − ej + z)]f (z)dz

The first order condition becomes


(wH − wL ) ∫ f (e∗k − ej + z)f (z)dz = g′ (ej )

ˆ If we look for symmetric Nash equilibria

e∗j = e∗k = e∗ (wH , wL )

we get

(wH − wL ) ∫R f (z)2 dz = g′ (e∗ (wH , wL ))

ˆ Since g is convex, g ′ is increasing

– a bigger prize for winning (i.e., a larger value of wH − wL ) induces more effort

Remark 23. Holding the prize constant,

ˆ it is not worthwhile to work hard when output is very noisy,

ˆ because the outcome of the tournament is likely to be determined by luck rather than effort

ˆ If εj is normally distributed with variance σ2 , then

∫ f (z)2 dz = 1/(2σ√π)

which decreases in σ, so e∗ (wH , wL ) decreases in σ
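The closed form ∫ f(z)2 dz = 1/(2σ√π) for normal noise can be checked by numerical integration (a sketch; σ = 2 is an illustrative value):

```python
import math

# Check that the integral of f(z)^2 equals 1/(2*sigma*sqrt(pi)) for the
# N(0, sigma^2) density, a numerical sketch; sigma = 2 is illustrative.
sigma = 2.0

def f(z):
    return math.exp(-z * z / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# composite trapezoidal rule on [-10*sigma, 10*sigma]
n = 20000
lo, hi = -10 * sigma, 10 * sigma
h = (hi - lo) / n
integral = (sum(f(lo + k * h) ** 2 for k in range(1, n))
            + 0.5 * (f(lo) ** 2 + f(hi) ** 2)) * h

closed_form = 1 / (2 * sigma * math.sqrt(math.pi))  # decreasing in sigma
```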

ˆ We work backwards to the first stage of the game


4. F is the cumulative distribution of f

ˆ Suppose that if the workers agree to participate in the tournament (rather than accept alternative em-
ployment)

ˆ Then they will respond to the wages wH and wL by playing the symmetric Nash equilibrium previously
exhibited

ˆ We ignore the possibility of asymmetric equilibria and of an equilibrium with “corner” solutions

ˆ Suppose that the workers’ alternative employment opportunity would provide utility Ua

ˆ In the symmetric NE each worker wins the tournament with probability 1/2

Prob{yj (e∗ (wH , wL )) > yk (e∗ (wH , wL ))} = 1/2

ˆ If the boss intends to induce the workers to participate in the tournament then he must choose wages
(wH , wL ) that satisfy
(1/2)wH + (1/2)wL − g(e∗ (wH , wL )) ≥ Ua (IR)
ˆ The boss chooses wages to maximize expected profit

2e∗ (wH , wL ) + E[ε1 + ε2 ] − (wH + wL ) = 2e∗ (wH , wL ) − (wH + wL )

subject to the restriction (IR)

ˆ Assume there exists a solution (w∗H , w∗L ) to the maximization problem with w∗L > 0

ˆ The participation restriction (IR) must be binding at the optimum, i.e., (w∗H , w∗L ) must be a solution to

wH + wL = 2Ua + 2g(e∗ (wH , wL )) (IRb)

ˆ Expected profit becomes


2[e∗ (w∗H , w∗L ) − Ua − g(e∗ (w∗H , w∗L ))]

ˆ The choice (w∗H , w∗L ) of the boss solves

max{e∗ (wH , wL ) − g(e∗ (wH , wL )) : wH ≥ wL ≥ 0}

under the binding restriction (IRb)

ˆ We denote by f ∗ the function defined by

∀δ ≥ 0, f ∗ (δ) = [g′ ]−1 (δξ) where ξ = ∫R f (z)2 dz

ˆ Observe that, from the FOC


e∗ (wH , wL ) = f ∗ (wH − wL )

ˆ We propose to replace the pair of variable (wH , wL ) by (δ, wL ) where δ = wH − wL

ˆ It follows that the choice (δ∗ , w∗L ) of the boss solves

max{f ∗ (δ) − g(f ∗ (δ)) : (δ, wL ) ≥ 0}

under the restriction

wL = Ua + g(f ∗ (δ)) − δ/2 (IRt)
ˆ Since the choice variable wL does not enter the objective function, the maximization problem is equivalent
to the following one

max{f ∗ (δ) − g(f ∗ (δ)) : δ ≥ 0}

under the restriction

wL = Ua + g(f ∗ (δ)) − δ/2 ≥ 0 (IR’)
ˆ Since we assumed that w∗L > 0

ˆ At the solution δ∗ the restriction (IR’) is not binding

Ua + g(f ∗ (δ∗ )) − δ∗ /2 > 0

ˆ It follows that the choice δ∗ of the boss satisfies the FOC

Ψ′ (δ∗ ) = 0

where the function Ψ is defined by


Ψ(δ) ≡ f ∗ (δ) − g(f ∗ (δ))

ˆ Thus the optimal induced effort e∗ (w∗H , w∗L ) satisfies

g′ (e∗ (w∗H , w∗L )) = 1

ˆ Remember that

(w∗H − w∗L ) ∫ f (z)2 dz = g′ (e∗ (w∗H , w∗L ))

ˆ Therefore the optimal wages satisfy
(w∗H − w∗L ) ∫ f (z)2 dz = 1

ˆ The pair (w∗H , w∗L ) is determined by the participation equation

w∗H + w∗L = 2Ua + 2g([g′ ]−1 (1))
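As a worked sketch of the two optimality conditions, suppose (illustratively, not from the text) that g(e) = e2 /2, the noise is normal with σ = 1, and Ua = 2; then [g′ ]−1 (1) = 1 and the two equations pin down the wages:

```python
import math

# Worked sketch of the boss's optimum under illustrative assumptions (not from
# the text): g(e) = e^2/2, normal noise with sigma = 1, outside option Ua = 2.
sigma, Ua = 1.0, 2.0
xi = 1 / (2 * sigma * math.sqrt(math.pi))  # integral of f(z)^2 for N(0, sigma^2)

effort = 1.0                             # optimal induced effort solves g'(e) = e = 1
prize = 1 / xi                           # wH - wL, from (wH - wL) * xi = g'(e*) = 1
total = 2 * Ua + 2 * (effort ** 2 / 2)   # wH + wL, from the binding (IR)

wH = (total + prize) / 2
wL = (total - prize) / 2  # positive here, consistent with the assumption wL* > 0
```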

2.3 Repeated games


Theory: Two-stage repeated games

Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4

ˆ Consider the Prisoners’ Dilemma described above

ˆ Suppose the two players play this simultaneous-move game twice

ˆ Suppose the outcome of the first play is observed before the second play begins

ˆ Suppose the payoff for the entire game is simply the sum of the payoffs from the two stages (no discounting)

This game, called the two-stage Prisoners’ Dilemma, belongs to the class of games analyzed in the previous
section

ˆ Players i3 and i4 are identical to players i1 and i2

ˆ The action spaces Ai3 and Ai4 are identical to Ai1 and Ai2

ˆ The payoff
ui (ai1 , ai2 , ai3 , ai4 )

is the sum of the payoffs from each stage

ˆ For each possible outcome of the first-stage game, (ai1 , ai2 ), the second-stage game that remains between
players i3 and i4 has a unique NE (âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ In the two-stage Prisoners’ Dilemma the unique equilibrium of the second-stage game is (L1 , L2 ), regardless
of the first-stage outcome

ˆ To compute the subgame perfect outcome of this game

ˆ We analyze the first-stage Prisoners’ Dilemma by taking into account that the outcome of the game
remaining in the second stage will be the NE (L1 , L2 ) with payoff (1, 1)

ˆ Thus the players’ first-stage interaction amounts to the one-shot game below

Player i2
L2 R2
L1 2,2 6,1
Player i1
R1 1,6 5,5

ˆ The above game has a unique NE (L1 , L2 )

ˆ The unique subgame perfect outcome of the two-stage Prisoners’ Dilemma is (L1 , L2 ) in the first-stage,
followed by (L1 , L2 ) in the second-stage

ˆ Cooperation, i.e., (R1 , R2 ), cannot be achieved in either stage of the subgame perfect outcome

ˆ This argument holds more generally

ˆ Let G = {Ai , ui }i∈I denote a static game of complete information

ˆ The payoff of player k is uk ((ai )i∈I ) where ai is chosen from the action set Ai

Definition 15. Given a static game G, let G(T ) denote the finitely repeated game in which G is played
T times:

ˆ the outcomes of all preceding plays are observed before the next play begins

ˆ the payoffs for G(T ) are simply the (discounted) sum of the payoffs from the T stage games

Proposition 9. If the stage game G has a unique NE then, for any finite T , the repeated game G(T ) has
a unique subgame-perfect outcome: the NE of G is played in every stage
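For the two-stage Prisoners’ Dilemma above, this can be checked by brute force (a minimal sketch; actions are encoded as 0 = L, 1 = R):

```python
# Brute-force check of the two-stage Prisoners' Dilemma analysis, a sketch.
# Actions 0 = L, 1 = R; payoffs from the stage game above.
stage = {(0, 0): (1, 1), (0, 1): (5, 0), (1, 0): (0, 5), (1, 1): (4, 4)}

def pure_nash(g):
    """All pure-strategy NE of a 2x2 bimatrix game given as a dict."""
    ne = []
    for (a1, a2), (u1, u2) in g.items():
        if all(u1 >= g[(d, a2)][0] for d in (0, 1)) and \
           all(u2 >= g[(a1, d)][1] for d in (0, 1)):
            ne.append((a1, a2))
    return ne

assert pure_nash(stage) == [(0, 0)]  # unique stage NE: (L1, L2)

# First-stage game when the unique stage NE is anticipated in stage 2:
v1, v2 = stage[(0, 0)]
first = {a: (u1 + v1, u2 + v2) for a, (u1, u2) in stage.items()}
assert pure_nash(first) == [(0, 0)]  # so (L1, L2) is played in both stages
```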

Player i2
L2 M2 R2
L1 1,1 5,0 0,0
Player i1 M1 0,5 4,4 0,0
R1 0,0 0,0 3,3

ˆ Consider the stage-game described above

ˆ There are two pure-strategy NE: (L1 , L2 ) and (R1 , R2 )

ˆ Suppose this stage game is played twice

ˆ We will show that there is a subgame perfect outcome of this repeated game in which the strategy (M1 , M2 )
is played in the first stage

ˆ We assume that in the first-stage, players anticipate that the second-stage outcome will be a NE of the
stage game

ˆ We have for this specific stage game, several Nash equilibria in the second stage

ˆ Players may anticipate that different first-stage outcomes will be followed by different stage-game equilibria
in the second stage

ˆ For example, suppose that players anticipate that (R1 , R2 ) will be the second-stage outcome if the first-
stage outcome is (M1 , M2 )

ˆ Players anticipate that (L1 , L2 ) will be the second-stage outcome if any of the eight other first-stage
outcomes occurs

ˆ The players’ first stage interaction then amounts to the following one-shot game

Player i2
L2 M2 R2
L1 2,2 6,1 1,1
Player i1 M1 1,6 7,7 1,1
R1 1,1 1,1 4,4

ˆ There are three pure-strategy Nash equilibria (L1 , L2 ), (M1 , M2 ) and (R1 , R2 )

ˆ Every NE of this one-shot game corresponds to a subgame perfect outcome of the original repeated game

Player i2
L2 M2 R2
L1 2,2 6,1 1,1
Player i1 M1 1,6 7,7 1,1
R1 1,1 1,1 4,4

ˆ Denote by ((w, x), (y, z)) the outcome of the repeated game

ˆ (w, x) is the first-stage outcome and (y, z) the second-stage outcome

ˆ The NE (L1 , L2 ) in the one-shot game above corresponds to the subgame-perfect outcome ((L1 , L2 ), (L1 , L2 ))
in the repeated game

ˆ The NE (R1 , R2 ) in the one-shot game above corresponds to the subgame-perfect outcome ((R1 , R2 ), (L1 , L2 ))
in the repeated game

ˆ These two subgame-perfect outcomes of the repeated game simply concatenate NE outcomes from the
stage game

Player i2
L2 M2 R2
L1 2,2 6,1 1,1
Player i1 M1 1,6 7,7 1,1
R1 1,1 1,1 4,4

ˆ The third NE of the one shot game, (M1 , M2 ) corresponds to the subgame-perfect outcome ((M1 , M2 ), (R1 , R2 ))
in the repeated game

ˆ This is a qualitatively different result

ˆ Because the anticipated second-stage outcome is (R1 , R2 ) following (M1 , M2 ), cooperation can be achieved
in the first stage of a subgame perfect outcome of the repeated game

This illustrates a more general point:

ˆ if G is a static game of complete information with multiple Nash equilibria then there may be subgame
perfect outcomes of the repeated game G(T ) in which, for any t < T , the outcome in stage t is not a NE
of G

Credible threats or promises about future behavior can influence current behavior

ˆ Subgame perfection may not embody a strong enough definition of credibility

ˆ In deriving the subgame perfect outcome ((M1 , M2 ), (R1 , R2 )) we assumed that the players anticipate that
(R1 , R2 ) will be the second-stage outcome if the first-stage outcome is (M1 , M2 ) and that (L1 , L2 ) will be
the second-stage outcome if any of the eight other first-stage outcomes occurs

ˆ But playing (L1 , L2 ) in the second stage, with its payoff of (1, 1), may seem silly when (R1 , R2 ), with its
payoff of (3, 3), is also available as a NE of the remaining stage game

ˆ It would seem natural for the players to “renegotiate” by introspection

ˆ They might reason that “bygones are bygones” and that the unanimously preferred stage-game equilibrium
(R1 , R2 ) should be played instead

ˆ If (R1 , R2 ) is to be the second-stage outcome after every first-stage outcome, then the incentive to play
(M1 , M2 ) in the first stage is destroyed

ˆ Indeed, in that case, the payoff (3, 3) has been added to each cell of the stage game

ˆ So Li is player i’s best response to Mj

ˆ To suggest a solution to this renegotiation problem, we consider the following modification of the stage
game

Player i2
L2 M2 R2 P2 Q2
L1 1,1 5,0 0,0 0,0 0,0
M1 0,5 4,4 0,0 0,0 0,0
Player i1 R1 0,0 0,0 3,3 0,0 0,0
P1 0,0 0,0 0,0 4,1/2 0,0
Q1 0,0 0,0 0,0 0,0 1/2,4

ˆ There are four pure-strategy NE

– (L1 , L2 ) and (R1 , R2 ), and now also (P1 , P2 ) and (Q1 , Q2 )

ˆ The players unanimously prefer (R1 , R2 ) to (L1 , L2 ), in other words, (R1 , R2 ) Pareto dominates (L1 , L2 )

ˆ There is no NE (x, y) such that the players unanimously prefer (x, y) to (P1 , P2 ), or (Q1 , Q2 ), or (R1 , R2 )

ˆ We say that (P1 , P2 ), (Q1 , Q2 ), and (R1 , R2 ) belong to the Pareto frontier of the payoffs to Nash equilibria
of the stage game

ˆ Suppose that the stage game is played twice, with the first-stage outcome observed before the second
stage begins

ˆ Suppose that players anticipate that the second-stage outcome will be as follows

– (R1 , R2 ) if the first-stage outcome is (M1 , M2 )


– (P1 , P2 ) if the first-stage outcome is (M1 , w) where w is anything but M2
– (Q1 , Q2 ) if the first-stage outcome is (x, M2 ) where x is anything but M1
– (R1 , R2 ) if the first-stage outcome is (y, z) where y is anything but M1 and z is anything but M2

ˆ The players’ first stage interaction then amounts to the following one-shot game

Player i2
L2 M2 R2 P2 Q2
L1 4,4 11/2,4 3,3 3,3 3,3
M1 4,11/2 7,7 4,1/2 4,1/2 4,1/2
Player i1 R1 3,3 1/2,4 6,6 3,3 3,3
P1 3,3 1/2,4 3,3 7,7/2 3,3
Q1 3,3 1/2,4 3,3 3,3 7/2,7

ˆ ((M1 , M2 ), (R1 , R2 )) is a subgame perfect outcome of the repeated game

ˆ More importantly, the difficulty raised in the previous example does not arise here

ˆ In the previous example, the only way to “punish” a player for deviating in the first stage from col-
laboration was to play a Pareto dominated equilibrium in the second stage, thereby also punishing the
punisher

ˆ Here, in contrast, there are three equilibria in the Pareto frontier

– One to reward good behavior by both players in the first stage


– Two others to be used not only to punish a player who deviates in the first stage but also to reward
the punisher
– In the second stage, the punisher cannot be persuaded to renegotiate the punishment
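The construction above can be verified computationally: build the induced first-stage game from the 5x5 stage game and the anticipated second-stage equilibria, and check that (M1 , M2 ) is a Nash equilibrium of it (a sketch):

```python
# Check that (M1, M2) is a NE of the induced first-stage game, a sketch.
# Actions 0..4 stand for L, M, R, P, Q; payoffs are the 5x5 stage game above.
L, M, R, P, Q = range(5)
z = (0.0, 0.0)
stage = [[(1, 1), (5, 0), z, z, z],
         [(0, 5), (4, 4), z, z, z],
         [z, z, (3, 3), z, z],
         [z, z, z, (4, 0.5), z],
         [z, z, z, z, (0.5, 4)]]

def continuation(a1, a2):
    """Anticipated second-stage equilibrium after the first-stage outcome."""
    if (a1, a2) == (M, M):
        return stage[R][R]  # reward cooperation with (R1, R2)
    if a1 == M:
        return stage[P][P]  # punish player 2, reward the punisher
    if a2 == M:
        return stage[Q][Q]  # punish player 1, reward the punisher
    return stage[R][R]

def first_stage(a1, a2):
    u, v = stage[a1][a2], continuation(a1, a2)
    return (u[0] + v[0], u[1] + v[1])

is_ne = (all(first_stage(M, M)[0] >= first_stage(d, M)[0] for d in range(5)) and
         all(first_stage(M, M)[1] >= first_stage(M, d)[1] for d in range(5)))
```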

Theory: Infinitely repeated games

ˆ A static game G is repeated infinitely, with the outcomes of all previous stages observed before the current
stage begins

ˆ We will define

– a player’s strategy
– a subgame
– a subgame perfect Nash equilibrium (SPNE)

ˆ The main theme is that credible threats or promises about future behavior can influence current behavior

– We will illustrate that even if the stage game G has a unique NE, there may be subgame perfect
outcomes of the infinitely repeated game in which no stage’s outcome is the NE of G

ˆ Suppose the following Prisoners’ Dilemma is to be repeated infinitely

Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4

ˆ The discount factor δ = 1/(1 + r) is the value today of a dollar to be received one stage later, where r is
the interest rate per stage

ˆ Given the discount factor δ and a player’s payoffs from an infinite sequence of stage games, we can compute
the present value of the payoffs

Definition 16. Given the discount factor δ, the present value of the infinite sequence of payoffs (πt )t≥1
is

π1 + δπ2 + δ2 π3 + · · · = ∑t≥1 δt−1 πt

ˆ We can also use δ to interpret a game that ends after a random number of repetitions

ˆ Suppose that after each stage is played a (weighted) coin is flipped to determine whether the game will
end

ˆ If the probability is p that the game ends immediately, then a payoff π to be received in the next stage
(if it is played) is worth only

[(1 − p)/(1 + r)]π
before this stage’s coin flip occurs

ˆ Likewise, a payoff π to be received two stages from now is worth only

[(1 − p)2 /(1 + r)2 ]π

ˆ Let δ = (1 − p)/(1 + r) then the present value

π1 + δπ2 + δ2 π3 + . . .

reflects both the time-value of money and the possibility that the game will end

Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4

ˆ Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ

ˆ We will show that cooperation (i.e., (R1 , R2 )) can occur in every stage of a subgame-perfect outcome of
the infinitely repeated game

– Even though the only NE in the stage game is noncooperation (i.e., (L1 , L2 ))

ˆ The argument is as follows:

– if the players cooperate today then they play a high-payoff equilibrium tomorrow
– otherwise they play a low-payoff equilibrium tomorrow

ˆ We do not need to add artificially the high-payoff equilibrium that might be played tomorrow

ˆ It is the strategy “continuing to cooperate tomorrow and thereafter”

ˆ Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage

ˆ Player i’s strategy is

– Play Ri in the first stage


– In stage t,
* if the outcome of all t − 1 preceding stages has been (R1 , R2 ) then play again Ri
* otherwise, play Li

ˆ This strategy is an example of a trigger strategy

ˆ Player i cooperates until someone fails to cooperate, which triggers a switch to noncooperation forever
after

ˆ If both players adopt this trigger strategy then the outcome of the infinitely repeated game will be (R1 , R2 )
in every stage

ˆ We will first show that if δ is close enough to 1 then it is a NE of the infinitely repeated game for both
players to adopt this strategy

ˆ We will also show that such a NE is subgame perfect

We propose to provide rigorous definitions of the following concepts for both finitely and infinitely repeated
games
1. a strategy in a repeated game

2. a subgame in a repeated game

3. a subgame-perfect Nash equilibrium (SPNE)

Definition 17. Given a stage game G = {Ai , ui }i∈I , let G(T, δ) denote the finitely repeated game in
which

ˆ G is played T times and the players share the discount factor δ

ˆ for each t , the outcomes of the t − 1 preceding plays are observed before the stage t begins

ˆ each player’s payoff in G(T, δ) is the present value of the player’s payoffs from the sequence of stage
games

Definition 18. Given a stage game G = {Ai , ui }i∈I , let G(∞, δ) denote the infinitely repeated game
in which

ˆ G is repeated forever and the players share the discount factor δ

ˆ for each t, the outcomes of the t − 1 preceding plays are observed before the stage t begins

81
ˆ each player’s payoff in G(∞, δ) is the present value of the player’s payoffs from the infinite sequence
of stage games

Definition 19. In the finitely repeated game G(T, δ) or the infinitely repeated game G(∞, δ), a player’s
strategy specifies the action the player will take in each stage, for each possible history of plays through
the previous stages

A history of plays up to stage t + 1 is a family

st = (s1 , s2 , . . . , st )

where

∀1 ≤ τ ≤ t, sτ = (ai,τ )i∈I ∈ S ≡ ∏_{i∈I} Ai

ˆ A strategy for player i in G(T, δ) is a function

fi : S(T ) → Ai

– The set of strategies for agent i in G(T, δ) is denoted by Fi (T )

– S(T ) = {∅} ∪ ∪_{t=1}^{T−1} S^t where S^t = S × · · · × S (t times)
ˆ A strategy for player i in G(∞, δ) is a function

fi : S(∞) → Ai

– The set of strategies for agent i in G(∞, δ) is denoted by Fi (∞)

– S(∞) = {∅} ∪ ∪_{t≥1} S^t where S^t = S × · · · × S (t times)
Interpretation 13.

– fi (∅) is the action of player i at stage 1


– fi (s1 , s2 , . . . , st ) is the action of player i at stage t + 1 , if the history of past plays is (s1 , s2 , . . . , st )

82
Consider a finitely repeated game G(T, δ)

Definition 20. A strategy profile f ∗ = (fi∗ )i∈I is a NE of the repeated game G(T, δ) if for each player i,
fi∗ is a best response to f−i∗ , i.e.,

fi∗ ∈ arg max{πi (fi , f−i∗ ) : fi ∈ Fi (T )}

Consider an infinitely repeated game G(∞, δ)

Definition 21. A strategy profile f ∗ = (fi∗ )i∈I is a NE of the infinitely repeated game G(∞, δ) if for each
player i, fi∗ is a best response to f−i∗ , i.e.,

fi∗ ∈ arg max{πi (fi , f−i∗ ) : fi ∈ Fi (∞)}

                      Player i2
                   L2         R2
Player i1   L1    1,1        5,0
            R1    0,5        4,4

ˆ Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ

ˆ Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage

ˆ Player i’s strategy is

– Play Ri in the first stage


– In stage t,
* if the outcome of all t − 1 preceding stages has been (R1 , R2 ) then play again Ri
* otherwise, play Li

ˆ In other words, we denote by fi∗ the trigger strategy defined by

fi∗ (∅) = Ri

and for any history st = (s1 , s2 , . . . , st ) up to stage t, the strategy at stage t + 1 is

fi∗ (st ) = Ri  if ∀τ ∈ {1, . . . , t}, sτ = (R1 , R2 )
fi∗ (st ) = Li  if ∃τ ∈ {1, . . . , t}, sτ ≠ (R1 , R2 )
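A minimal sketch of this trigger strategy as a function of the history, with actions encoded as the strings "R" and "L" (an encoding chosen here, not in the text):

```python
def trigger_strategy(history):
    """Trigger strategy f_i*: play R until any past outcome differs from (R, R),
    then play L forever.  A history is a tuple of stage outcomes (a_1, a_2)."""
    if all(outcome == ("R", "R") for outcome in history):
        return "R"  # includes the empty history: f_i*(emptyset) = R_i
    return "L"

print(trigger_strategy(()))                        # R  (first stage)
print(trigger_strategy((("R", "R"), ("R", "R"))))  # R  (cooperation so far)
print(trigger_strategy((("R", "R"), ("L", "R"))))  # L  (someone defected)
```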

83
Proposition 10. If δ ≥ 1/4 then the profile f ∗ defined above is a NE of the infinitely repeated Prisoners’
Dilemma

                      Player i2
                   L2         R2
Player i1   L1    1,1        5,0
            R1    0,5        4,4

Proof. ˆ Fix a player i; we shall prove that

fi∗ ∈ arg max{πi (fi , fj∗ ) : fi ∈ Fi (∞)}

ˆ We first compute πi (fi∗ , fj∗ )

ˆ Observe that the outcome Ot (f ∗ ) at stage t is (Ri , Rj ), implying that

πi (fi∗ , fj∗ ) = 4 + 4δ + 4δ² + · · · = 4/(1 − δ)

ˆ Now fix another strategy fi ≠ fi∗ and assume that fi (∅) ≠ fi∗ (∅), i.e., fi (∅) = Li

ˆ It follows that the outcome at the first stage is O1 (fi , fj∗ ) = (Li , Rj )

ˆ The outcome at the second stage is then O2 (fi , fj∗ ) = (ai,2 , Lj ) for some action ai,2 ∈ {Ri , Li }

ˆ Actually, for every stage t > 1 we have

Ot (fi , fj∗ ) = (ai,t , Lj )

for some action ai,t ∈ {Ri , Li }

ˆ This implies
πi (fi , fj∗ ) ≤ 5 + δ + δ² + · · · = 5 + δ/(1 − δ)

ˆ and therefore

πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −1 + 3δ/(1 − δ) = (4δ − 1)/(1 − δ)

ˆ We have thus proved that πi (fi∗ , fj∗ ) ≥ πi (fi , fj∗ )


Now fix a strategy fi ≠ fi∗ satisfying fi (∅) = fi∗ (∅), i.e., fi (∅) = Ri

ˆ Observe that the value of fi (s1 ) for an outcome s1 different from s^co_1 ≡ (R1 , R2 ) is irrelevant for the
payoff ui (O2 (fi , fj∗ ))

ˆ Assume that fi (s^co_1 ) ≠ fi∗ (s^co_1 ), i.e., fi (s^co_1 ) = Li

ˆ It follows that the outcome at the second stage is O2 (fi , fj∗ ) ≡ (fi (s^co_1 ), fj (s^co_1 )) = (Li , Rj )

84
ˆ The outcome at the third stage is then O3 (fi , fj∗ ) = (ai,3 , Lj ) for some action ai,3 ∈ {Ri , Li }

ˆ Actually, for every stage t > 2 we have

Ot (fi , fj∗ ) = (ai,t , Lj )

for some action ai,t ∈ {Ri , Li }

ˆ This implies
πi (fi , fj∗ ) ≤ 4 + 5δ + δ² + δ³ + · · · = 4 + 5δ + δ²/(1 − δ)

ˆ and therefore

πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −δ + 3δ²/(1 − δ)
                              = [δ/(1 − δ)][3δ − (1 − δ)] = δ(4δ − 1)/(1 − δ)

ˆ We have thus proved that πi (fi∗ , fj∗ ) ≥ πi (fi , fj∗ )

Now fix a strategy fi ≠ fi∗ and a stage t > 2

ˆ Assume that fi (sτ ) = fi∗ (sτ ) for every history sτ with τ ≤ t − 1

ˆ This implies that every outcome Oτ (fi , fj∗ ) coincides with s^co_τ ≡ (R1 , R2 )

ˆ We denote by s^co,τ the history (s^co_1 , . . . , s^co_τ )

ˆ Observe that the value of fi (st−1 ) for an outcome history st−1 different from s^co,t−1 is irrelevant for the
payoff ui (Ot (fi , fj∗ ))

Assume that fi (s^co,t−1 ) ≠ fi∗ (s^co,t−1 ), i.e., fi (s^co,t−1 ) = Li

It follows that the outcome at stage t is

Ot (fi , fj∗ ) ≡ (fi (s^co,t−1 ), fj (s^co,t−1 )) = (Li , Rj )

The outcome at stage t + 1 is then


Ot+1 (fi , fj∗ ) = (ai,t+1 , Lj )

for some action ai,t+1 ∈ {Ri , Li }

ˆ Actually, for every stage T > t we have

OT (fi , fj∗ ) = (ai,T , Lj )

for some action ai,T ∈ {Ri , Li }

ˆ This implies
πi (fi , fj∗ ) ≤ 4 + 4δ + · · · + 4δ^{t−2} + 5δ^{t−1} + δ^t + δ^{t+1} + . . .

85
ˆ and therefore

πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −δ^{t−1} + 3δ^t /(1 − δ)
                              = δ^{t−1} [3δ − (1 − δ)]/(1 − δ) = δ^{t−1} (4δ − 1)/(1 − δ)

ˆ We have thus proved that πi (fi∗ , fj∗ ) ≥ πi (fi , fj∗ )
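The three cases of the proof all reduce to the sign of 4δ − 1. A numerical sketch (truncating the infinite sums at a long horizon) comparing cooperation against the deviation bound used in the first case:

```python
def pv(stage_payoff, delta, horizon=5000):
    """Truncated present value of a per-stage payoff function (stage index from 0)."""
    return sum(delta ** t * stage_payoff(t) for t in range(horizon))

cooperate = lambda t: 4                 # (R1, R2) in every stage
deviate = lambda t: 5 if t == 0 else 1  # upper bound on payoffs after defecting at stage 1

for delta in (0.2, 0.25, 0.3):
    gain = pv(cooperate, delta) - pv(deviate, delta)
    print(delta, round(gain, 6))  # negative below 1/4, zero at 1/4, positive above
```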

86
Theory: Subgames

Definition 22.

ˆ Given a finitely repeated game G(T, δ)

ˆ Given a history st with t < T

The repeated game in which G is played T − t times after st is denoted by G(T − t, δ, st ) and is called the
subgame beginning at stage t + 1 following history st

Definition 23.

ˆ Given an infinitely repeated game G(∞, δ)

ˆ Given a history st with t ≥ 1

The repeated game in which G is played infinitely many times after st is denoted by G(∞, δ, st ) and is called
the subgame beginning at stage t + 1 following history st

Definition 24.

ˆ Given a strategy fi of a finitely repeated game G(T, δ)

ˆ Given a history st with t < T

We denote by fi (·|st ) the strategy of the subgame G(T − t, δ, st ) defined by

fi (σ τ |st ) = fi (st , σ τ )

for every history σ τ of G(T − t, δ, st ) with τ < T − t.

Definition 25.

ˆ Given a strategy fi of an infinitely repeated game G(∞, δ)

ˆ Given a history st with t ≥ 1

We denote by fi (·|st ) the strategy of the subgame G(∞, δ, st ) defined by

fi (σ τ |st ) = fi (st , σ τ )

87
for every history σ τ of G(∞, δ, st )

Definition 26. A subgame perfect equilibrium of a finitely repeated game G(T, δ) is a strategy profile f ∗ =
(fi∗ )i∈I which constitutes a NE of every subgame, i.e.,

ˆ f ∗ is a NE of G(T, δ)

ˆ for every stage t < T , for every history st ,

the strategy profile f ∗ (·|st ) ≡ (fi∗ (·|st ))i∈I is a NE of the subgame G(T − t, δ, st )

Remark 24.

ˆ Many possible histories st are off the equilibrium path, i.e., they differ from the outcome history

(O1 (f ∗ ), . . . , Ot (f ∗ ))

ˆ Requiring a NE after such histories captures the idea of credible threats or promises

Definition 27. A subgame perfect equilibrium of an infinitely repeated game G(∞, δ) is a strategy profile
f ∗ = (fi∗ )i∈I which constitutes a NE of every subgame, i.e.,

ˆ f ∗ is a NE of G(∞, δ)

ˆ for every stage t ≥ 1, for every history st ,

the strategy profile f ∗ (·|st ) ≡ (fi∗ (·|st ))i∈I is a NE of the subgame G(∞, δ, st )

ˆ Subgame perfect Nash equilibrium is a refinement of NE

– To be subgame perfect, the players’ strategies must first be a NE and must then pass an additional
test

ˆ The notion of subgame perfect equilibrium eliminates Nash equilibria in which the players’ threats or
promises are not credible

                      Player i2
                   L2         R2
Player i1   L1    1,1        5,0
            R1    0,5        4,4

88
ˆ Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ

ˆ Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage

ˆ Player i’s strategy is

– Play Ri in the first stage


– In stage t,
* if the outcome of all t − 1 preceding stages has been (R1 , R2 ) then play again Ri
* otherwise, play Li

ˆ In other words, we denote by fi∗ the trigger strategy defined by

fi∗ (∅) = Ri

and for any history st = (s1 , s2 , . . . , st ) up to stage t, the strategy at stage t + 1 is


fi∗ (st ) = Ri  if ∀τ ∈ {1, . . . , t}, sτ = (R1 , R2 )
fi∗ (st ) = Li  if ∃τ ∈ {1, . . . , t}, sτ ≠ (R1 , R2 )

Proposition 11. If δ ≥ 1/4 then the profile f ∗ defined above is a SPNE of the infinitely repeated Prisoners’
Dilemma

Proof. We already proved that f ∗ is a NE of the game G(∞, δ)

ˆ We only have to prove that for every T ≥ 1 and every possible history sT , the profile f ∗ (·|sT ) is a NE of
the subgame G(∞, δ, sT )

If no agent deviated up to period T , i.e.,

sT = (s^co_1 , s^co_2 , . . . , s^co_T )

ˆ Then for every history σ t of the subgame G(∞, δ, sT ), we have

∀i ∈ I, fi∗ (σ t |sT ) = fi∗ (σ t )

ˆ Therefore we can reproduce the arguments of the proof that f ∗ is a NE of G(∞, δ) to show that f ∗ (·|sT )
is a NE of G(∞, δ, sT )

Assume now that at least one agent deviated, i.e.,

sT ≠ (s^co_1 , s^co_2 , . . . , s^co_T )

ˆ The strategy fj∗ (·|sT ) is given by fj∗ (σ t |sT ) = Lj for any history σ t of the subgame G(∞, δ, sT )

89
ˆ The payoff πi (f ∗ (·|sT )) for agent i of the strategy fi∗ (·|sT ) is

ui (O1 (fi∗ (·|sT ), fj∗ (·|sT ))) + δui (O2 (fi∗ (·|sT ), fj∗ (·|sT ))) + . . .

ˆ Since Ot (fi∗ (·|sT ), fj∗ (·|sT )) = (Li , Lj ) for every t

ˆ We get that
πi (f ∗ (·|sT )) = 1 + δ + δ² + · · · = 1/(1 − δ)

Fix another strategy gi of player i in the subgame G(∞, δ, sT )

ˆ The payoff πi (gi , fj∗ (·|sT )) for agent i of the strategy gi is

ui (O1 (gi , fj∗ (·|sT ))) + δui (O2 (gi , fj∗ (·|sT ))) + . . .

ˆ Since for every t there exists an action ai,t ∈ {Li , Ri } such that Ot (gi , fj∗ (·|sT )) = (ai,t , Lj )

ˆ We get that
πi (gi , fj∗ (·|sT )) ≤ 1 + δ + δ² + · · · = πi (f ∗ (·|sT ))

Folk Theorem: Friedman (1971)


Consider an abstract stage game G = (Ai , ui )i∈I and the associated infinitely repeated game G(∞, δ)

Definition 28. A profile of payoffs x = (xi )i∈I is called feasible in the stage game G if it is a convex
combination of the pure strategy payoffs of G, i.e., if there exists a family (π k )k∈K of pure strategy payoffs
π k = (πik )i∈I such that

x = Σ_{k∈K} αk π k

where αk ≥ 0 and Σ_{k∈K} αk = 1.
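For the Prisoners' Dilemma above, the pure-strategy payoff profiles are (1,1), (5,0), (0,5) and (4,4). A brute-force sketch that checks (approximate) feasibility by searching over a grid of convex weights:

```python
import itertools

PURE = [(1, 1), (5, 0), (0, 5), (4, 4)]  # pure-strategy payoff profiles of the PD

def is_feasible(x, n=50, tol=0.05):
    """Approximate feasibility check: grid search over convex weights on PURE."""
    for a, b, c in itertools.product(range(n + 1), repeat=3):
        d = n - a - b - c
        if d < 0:
            continue
        w = (a / n, b / n, c / n, d / n)  # nonnegative weights summing to 1
        p1 = sum(wk * pk[0] for wk, pk in zip(w, PURE))
        p2 = sum(wk * pk[1] for wk, pk in zip(w, PURE))
        if abs(p1 - x[0]) <= tol and abs(p2 - x[1]) <= tol:
            return True
    return False

print(is_feasible((3, 3)))  # True: (1/3)(1,1) + (2/3)(4,4) = (3,3)
print(is_feasible((5, 5)))  # False: outside the convex hull
```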

Feasible payoffs: Prisoners’ dilemma

90
Average payoff

ˆ Consider an infinite sequence π = (π1 , π2 , . . . ) of payoffs for every stage

ˆ The present value V (π) is defined by

V (π) = π1 + δπ2 + δ2 π3 + . . .

ˆ If a payoff π̄ were received in every stage, the present value would be π̄/(1 − δ)

ˆ The average payoff of an infinite sequence π = (π1 , π2 , . . . ) of payoffs is the payoff π̄ that should be received
in every stage in order to achieve the same present value

π̄/(1 − δ) = V (π)

Definition 29. Given a discount factor δ, the average payoff of an infinite sequence of payoffs π =
(π1 , π2 , . . . ) is

(1 − δ) Σ_{t≥1} δ^{t−1} πt
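A quick sketch of this rescaling (illustrative streams, truncated at a long horizon): a constant stream averages to itself, which is what makes the average directly comparable to stage payoffs.

```python
def average_payoff(payoffs, delta):
    """(1 - delta) times the (truncated) present value of the stream."""
    pv = sum(delta ** (t - 1) * pi for t, pi in enumerate(payoffs, start=1))
    return (1 - delta) * pv

delta, horizon = 0.9, 2000
print(round(average_payoff([4.0] * horizon, delta), 6))          # 4.0: constant stream
print(round(average_payoff([5.0] + [1.0] * horizon, delta), 6))  # 5 - 4*delta = 1.4
```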

ˆ Since the average payoff is just a rescaling of the present value, maximizing the average payoff is equivalent
to maximizing the present value

ˆ The advantage of the average payoff over the present value is that the former is directly comparable to
the payoffs from the stage game

91
Theorem 14 (Folk Theorem: Friedman (1971)). Let

ˆ G = (Ai , ui )i∈I be a static game of complete information

ˆ e = (ei )i∈I denote the profile of payoffs from a NE of G

ˆ x = (xi )i∈I denote any other feasible profile of payoffs from G

If

ˆ xi > ei for every player i and

ˆ δ is sufficiently close to one,

then there exists a subgame perfect NE of the infinitely repeated game G(∞, δ) that achieves x = (xi )i∈I as
the profile of average payoffs

Folk Theorem: Prisoners’ dilemma

92
Collusion between Cournot duopolists

Cournot duopoly

ˆ Recall the static Cournot game

ˆ If the aggregate quantity on the market is Q = q1 + q2

ˆ Then the market clearing price is P (Q) = [a − Q]+

ˆ Each firm has a marginal cost of c > 0 and no fixed costs

ˆ Firms choose quantities simultaneously

ˆ In the unique NE, each firm produces the quantity (a − c)/3

ˆ This quantity is called the Cournot quantity and is denoted by qC

ˆ The equilibrium aggregate quantity, 2(a − c)/3, exceeds the monopoly quantity, qm ≡ (a − c)/2

ˆ Both firms would be better off if each produced half the monopoly quantity, qi = qm /2

Infinitely repeated Cournot duopoly

ˆ Consider the infinitely repeated game based on this Cournot stage game when both firms have the discount
factor δ

ˆ Consider the following trigger strategy for each firm

– Produce half the monopoly quantity, qm /2, in the first period


– In stage t,
* produce qm /2 if both firms have produced qm /2 in each of the t − 1 previous stages;
* otherwise, produce the Cournot quantity, qC

ˆ We propose to compute the values of δ for which it is a subgame perfect Nash equilibrium to play the
previous trigger strategy

ˆ The profit to one firm when both produce qm /2 is (a − c)²/8, which will be denoted by πm /2

ˆ The profit to one firm when both produce qC is (a − c)²/9, which will be denoted by πC

ˆ If firm i is going to produce qm /2 this period then the quantity that maximizes firm j’s profit this period
solves
arg max{[a − c − qj − (qm /2)]qj : qj ≥ 0}

ˆ The solution is qj = 3(a − c)/8, with associated profit of πd ≡ 9(a − c)²/64

ˆ It is a NE for both firms to play the trigger strategy when

[1/(1 − δ)] × (πm /2) ≥ πd + [δ/(1 − δ)] πC

93
ˆ This yields δ ≥ 9/17
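With a − c normalized to 1 (so πm/2 = 1/8, πC = 1/9, πd = 9/64), this incentive constraint can be checked numerically:

```python
# Normalize a - c = 1: half the monopoly profit, Cournot profit, deviation profit
half_pi_m, pi_C, pi_d = 1 / 8, 1 / 9, 9 / 64

def collusion_gain(delta):
    """LHS minus RHS of the no-deviation condition for the trigger strategy."""
    return half_pi_m / (1 - delta) - (pi_d + delta / (1 - delta) * pi_C)

print(abs(collusion_gain(9 / 17)) < 1e-12)  # True: the condition binds at delta = 9/17
print(collusion_gain(0.6) > 0)              # True: collusion sustainable
print(collusion_gain(0.5) > 0)              # False: not sustainable
```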

ˆ For the same reasons as in the infinitely repeated Prisoners’ Dilemma, this NE is subgame perfect

ˆ We propose to study what the firms can achieve if δ < 9/17

ˆ We first determine, for a given value of δ, the most profitable quantity the firms can produce if they both
play trigger strategies that switch forever to the Cournot quantity after any deviation

ˆ We know that such trigger strategies cannot support a quantity as low as qm /2

ˆ But for any value of δ it is a SPNE simply to repeat the Cournot quantity forever

ˆ This implies that the most profitable quantity that trigger strategies can support is between qm /2 and
qC

ˆ Consider the following trigger strategy

– Produce q ∗ in the first period


– In stage t,
* produce q ∗ if both firms have produced q ∗ in each of t − 1 previous periods;
* otherwise, produce the Cournot quantity, qC

ˆ The profit of one firm if both play q ∗ is π ∗ ≡ (a − c − 2q ∗ )q ∗

ˆ If firm i is going to produce q ∗ this period, then the quantity that maximizes firm j’s profit this period
solves
arg max{(a − c − qj − q ∗ )qj : qj ≥ 0}

ˆ The solution is qj = (a − c − q ∗ )/2 with associated profit

πd ≡ (a − c − q ∗ )²/4

ˆ It is a NE for both firms to play the trigger strategy given before provided that

[1/(1 − δ)] π ∗ ≥ πd + [δ/(1 − δ)] πC

ˆ Solving the resulting quadratic in q ∗ shows that the lowest value of q ∗ for which the trigger strategies are
a SPNE is
q ∗ (δ) = [(9 − 5δ)/(3(9 − δ))] (a − c)

ˆ The function δ ↦ q ∗ (δ) is decreasing and satisfies

lim_{δ→9/17} q ∗ (δ) = qm /2   and   lim_{δ→0} q ∗ (δ) = qC
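A quick numerical check of q∗(δ) and its limits, again normalizing a − c = 1 (so qm = 1/2 and qC = 1/3):

```python
q_m = 0.5     # monopoly quantity with a - c normalized to 1
q_C = 1 / 3   # Cournot quantity

def q_star(delta):
    """Lowest quantity sustainable by Cournot-reversion trigger strategies."""
    return (9 - 5 * delta) / (3 * (9 - delta))

print(abs(q_star(9 / 17) - q_m / 2) < 1e-12)  # True: reaches half the monopoly quantity
print(abs(q_star(0) - q_C) < 1e-12)           # True: no patience gives the Cournot quantity
print(q_star(0.3) < q_star(0.2))              # True: decreasing in delta
```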

94
ˆ We now explore the second approach, which involves threatening to administer the strongest credible
punishment

ˆ We propose to show that Abreu’s approach can achieve the monopoly outcome in our model when δ = 1/2
(which is less than 9/17)

ˆ Consider the following “two phase” strategy

– Produce half the monopoly quantity, qm /2, in the first period


– In stage t,
* produce qm /2 if both firms produced qm /2 in period t − 1
* produce qm /2 if both firms produced x in period t − 1
* Otherwise produce x

ˆ This strategy involves a one-period punishment phase

ˆ And a (potentially infinite) collusive phase in which the firm produces qm /2

ˆ The profit to one firm if both produce x is π(x) ≡ (a − c − 2x)x

ˆ Let V (x) denote the present value of receiving π(x) this period and half the monopoly profit forever after:

V (x) = π(x) + [δ/(1 − δ)] × (πm /2)

ˆ If firm i is going to produce x this period, then the quantity that maximizes firm j’s profit this period
solves
arg max{(a − c − qj − x)qj : qj ≥ 0}

ˆ The solution is qj = (a − c − x)/2, with associated profit

πdp (x) ≡ (a − c − x)²/4

Theorem 15. The “two-phase” strategy is a SPNE if and only if

[1/(1 − δ)] × (πm /2) ≥ πd + δV (x)     (1)

and

V (x) ≥ πdp (x) + δV (x)     (2)

ˆ For δ = 1/2, condition (1) is satisfied provided

x/(a − c) ∉ (1/8, 3/8)

95
ˆ For δ = 1/2, condition (2) is satisfied provided

x/(a − c) ∈ (3/10, 1/2)

ˆ For δ = 1/2, the two-phase strategy achieves the monopoly outcome as a SPNE provided that

x/(a − c) ∈ (3/8, 1/2)
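The two conditions at δ = 1/2 can be verified numerically over x/(a − c), with a − c normalized to 1 (the function names below are chosen here for illustration):

```python
# a - c normalized to 1; delta = 1/2
pi_m, pi_d, delta = 1 / 4, 9 / 64, 0.5

def pi_both(x):
    return (1 - 2 * x) * x   # per-firm profit when both produce x

def pi_dp(x):
    return (1 - x) ** 2 / 4  # profit from the best deviation against x

def V(x):
    return pi_both(x) + delta / (1 - delta) * pi_m / 2

def two_phase_spne(x):
    cond1 = pi_m / 2 / (1 - delta) >= pi_d + delta * V(x)  # no deviation while colluding
    cond2 = V(x) >= pi_dp(x) + delta * V(x)                # no deviation while punishing
    return cond1 and cond2

print(two_phase_spne(0.40))  # True: 3/8 < 0.40 < 1/2
print(two_phase_spne(0.25))  # False: inside (1/8, 3/8), the collusive-phase condition fails
```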

2.4 Dynamic games of complete but imperfect information


Extensive-form representation of games

Extensive-form representation

ˆ It may seem that static games must be represented in normal form and dynamic games in extensive
form

– This is not the case

ˆ Any game can be represented in either normal or extensive form

ˆ Although for some games one of the two forms is more convenient to analyze

ˆ We will discuss how static games can be represented using extensive form and how dynamic games can
be represented using normal form

Normal-form representation
The normal-form representation of a game specifies

1. the players in the game

2. the strategies available to each player

3. the payoff received by each player for each combination of strategies that could be chosen by the players

Definition 30. The extensive-form representation of a game specifies

1. the players in the game

2. (a) when each player has the move


(b) what each player can do at each of his or her opportunities to move
(c) what each player knows at each of his or her opportunities to move

96
3. the payoff received by each player for each combination of moves that could be chosen by the players

ˆ We already analyzed several games represented in extensive form

ˆ We propose to describe such games using game trees rather than words

Example: consider the following class of two-stage games of complete and perfect information

1. Player 1 chooses an action a1 from the feasible set A1 = {L, R}

2. Player 2 observes a1 and then chooses an action a2 from the set A2 = {L, R}

3. Payoffs are u1 (a1 , a2 ) and u2 (a1 , a2 ), as shown in the following game tree

ˆ We can extend in a straightforward manner the previous game tree to represent any dynamic game of
complete and perfect information:

– players move in sequence


– all previous moves are common knowledge before the next move is chosen
– the players’ payoffs from each feasible combination of moves are common knowledge

ˆ We propose to derive the normal-form representation of the previous dynamic game

ˆ To represent a dynamic game in normal form, we need to translate the information in the extensive form
into the description of each player’s strategy

97
Normal-form representation of dynamic games

Definition 31. A strategy for a player is a complete plan of action: it specifies a feasible action for the
player in every contingency in which the player might be called on to act

ˆ We could not apply the notion of Nash equilibrium to dynamic games of complete information if we
allowed a player’s strategy to leave the actions in some contingencies unspecified

ˆ For player j to compute a best response to player i’s strategy, j may need to consider how i would act in
every contingency, not just in the contingencies i thinks likely to arise

ˆ In the previous game, player 2 has two actions but four strategies

ˆ This is because there are two contingencies

Strategies of player 2:

Strategy 1 If player 1 plays L then play L′ , if player 1 plays R then play L′

f2 (a1 ) = L′ if a1 = L, and f2 (a1 ) = L′ if a1 = R

This strategy may be denoted by (L′ , L′ )

Strategy 2 If player 1 plays L then play L′ , if player 1 plays R then play R′

f2 (a1 ) = L′ if a1 = L, and f2 (a1 ) = R′ if a1 = R

This strategy may be denoted by (L′ , R′ )

Strategy 3 If player 1 plays L then play R′ , if player 1 plays R then play L′

f2 (a1 ) = R′ if a1 = L, and f2 (a1 ) = L′ if a1 = R

This strategy may be denoted by (R′ , L′ )

Strategy 4 If player 1 plays L then play R′ , if player 1 plays R then play R′

f2 (a1 ) = R′ if a1 = L, and f2 (a1 ) = R′ if a1 = R

This strategy may be denoted by (R′ , R′ )

ˆ Player 1 has two actions but only two strategies: play L or R

ˆ The reason is that player 1 has only one contingency in which he might be called upon to act

98
ˆ Player 1’s strategy space is equivalent to the action space A1 = {L, R}
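The strategy count can be checked mechanically: a strategy for player 2 is a function from A1 to A2, so there are |A2|^|A1| of them. A small sketch:

```python
import itertools

A1 = ["L", "R"]     # player 1's actions (and strategies)
A2 = ["L'", "R'"]   # player 2's actions

# A strategy for player 2 assigns an action to each contingency (each a1 in A1),
# so there are |A2| ** |A1| = 4 of them
strategies_2 = list(itertools.product(A2, repeat=len(A1)))
print(len(A1), len(strategies_2))  # 2 4
print(strategies_2)
```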

Recall the extensive-form representation

We can now derive the normal-form representation of the game from its extensive-form representation

                            Player 2
                (L′ , L′ )   (L′ , R′ )   (R′ , L′ )   (R′ , R′ )
Player 1   L     3, 1         3, 1         1, 2         1, 2
           R     2, 1         0, 0         2, 1         0, 0
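From this normal form, pure-strategy Nash equilibria can be found by brute force; a sketch (strategy labels such as "L'R'" are an encoding chosen here):

```python
payoffs = {   # (s1, s2) -> (u1, u2), from the table above
    ("L", "L'L'"): (3, 1), ("L", "L'R'"): (3, 1), ("L", "R'L'"): (1, 2), ("L", "R'R'"): (1, 2),
    ("R", "L'L'"): (2, 1), ("R", "L'R'"): (0, 0), ("R", "R'L'"): (2, 1), ("R", "R'R'"): (0, 0),
}
S1 = ["L", "R"]
S2 = ["L'L'", "L'R'", "R'L'", "R'R'"]

def is_nash(s1, s2):
    """True when neither player can gain by a unilateral deviation."""
    u1, u2 = payoffs[(s1, s2)]
    best1 = all(u1 >= payoffs[(t1, s2)][0] for t1 in S1)
    best2 = all(u2 >= payoffs[(s1, t2)][1] for t2 in S2)
    return best1 and best2

# The two pure NE: (L, (R', R')) and (R, (R', L'))
print([(s1, s2) for s1 in S1 for s2 in S2 if is_nash(s1, s2)])
```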
Extensive-form of static games

ˆ We turn to showing how a static (i.e., simultaneous-move) game can be represented in extensive form

ˆ In a static game players do not need to act simultaneously

ˆ It suffices that each choose a strategy without knowledge of the other’s choice

ˆ We can represent a simultaneous game between players 1 and 2 as follows

1. Player 1 chooses an action a1 from the feasible set A1


2. Player 2 does not observe player 1’s move but chooses an action a2 from the feasible set A2
3. Payoffs are u1 (a1 , a2 ) and u2 (a1 , a2 )

ˆ Alternatively, player 2 could move first and player 1 could then move without observing 2’s action

To represent that some player ignores the previous moves, we introduce the notion of a player’s information
set

Definition 32. An information set for a player is a collection of decision nodes satisfying

(i) the player has the move at every node in the information set

(ii) when the play of the game reaches a node in the information set, the player does not know which node
in the information set has (or has not) been reached

(iii) it is the largest set satisfying (i) and (ii)

ˆ Part (ii) implies that the player must have the same set of feasible actions at each decision node in an
information set

99
ˆ In an extensive-form game, we will indicate that a collection of decision nodes constitutes an information
set by connecting the nodes by a dotted line

Extensive-form of the Prisoners’ Dilemma

ˆ Fink = confess

Information set: an example

ˆ We propose a second example of the use of an information set in representing ignorance of a previous play

ˆ Consider the following dynamic game of complete but imperfect information

1. Player 1 chooses an action a1 from the feasible set A1 = {L, R}


2. Player 2 observes a1 and then chooses an action a2 from the feasible set A2 = {L′ , R′ }
3. Player 3 observes whether or not (a1 , a2 ) = (R, R′ ) and then chooses an action a3 from the feasible
set A3 = {L′′ , R′′ }

ˆ Player 3 has two information sets

1. a singleton information set following R by player 1 and R′ by player 2


2. a non-singleton information set that includes every other node at which player 3 has the move

100
Perfect and imperfect information

ˆ We previously defined perfect information to mean that at each move in the game the player with the
move knows the full history of the play of the game thus far

ˆ An equivalent definition is that every information set is a singleton

ˆ Imperfect information means that there is at least one non-singleton information set

ˆ The extensive-form representation of a simultaneous-move game (such as the Prisoners’ Dilemma) is a


game of imperfect information

Subgame-perfect Nash equilibrium

Subgames

ˆ We gave a formal definition of a subgame for repeated games

ˆ We extend this definition to general dynamic games of complete information in terms of the game’s
extensive-form representation

Definition 33. A subgame in an extensive-form game is a game that

(a) begins at a decision node n that is a singleton information set but is not the game’s first decision node

(b) includes all the decision and terminal nodes following n in the game tree but no nodes that do not follow
n

(c) does not cut any information sets, i.e., if a decision node n′ follows n in the game tree, then all other
nodes in the information set containing n must also follow n, and so must be included in the subgame

Subgames: example

ˆ There are two subgames, one beginning at each of player 2’s decision nodes

101
Subgames: example

ˆ There are no subgames

Subgames: example

ˆ There is only one subgame: it begins at player 3’s decision node following R by player 1 and R′ by player
2

ˆ Because of part (c), a subgame does not begin at either of player 2’s decision nodes, even though both of
these nodes are singleton information sets

102
Subgame perfect Nash equilibrium

Definition 34. A profile of strategies of a dynamic game with complete information is a subgame perfect
Nash equilibrium if it is a Nash equilibrium of the initial game and the players’ strategies restricted to every
subgame constitute a Nash equilibrium of the subgame

ˆ We already encountered two game solutions for dynamic games: backwards induction outcome and subgame
perfect outcome

ˆ The difference is that a SPNE is a collection of strategies and a strategy is a complete plan of actions

ˆ Whereas an outcome describes what will happen only in the contingencies that are expected to arise, not
in every contingency that might arise

Equilibrium vs outcome
Consider the standard two-stage game of complete and perfect information defined as follows

1. Player 1 chooses an action a1 from a feasible set A1

2. Player 2 observes a1 and then chooses an action a2 from a feasible set A2

3. Payoffs are u1 (a1 , a2 ) and u2 (a1 , a2 )

Assume that for each a1 in A1 , player 2’s optimization problem

arg max{u2 (a1 , a2 ) : a2 ∈ A2 }

has a unique solution, denoted by R2 (a1 )

ˆ Player 1’s problem at the first stage amounts to

arg max{u1 (a1 , R2 (a1 )) : a1 ∈ A1 }

ˆ Assume that the previous optimization problem for player 1 also has a unique solution, denoted by a∗1

ˆ The pair of actions (a∗1 , R2 (a∗1 )) is the backwards induction outcome of this game
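This backwards-induction construction can be sketched as code; the payoffs below are those of the example game tree used earlier in these notes, and uniqueness of the maximizers is assumed, as in the text:

```python
def backward_induction(A1, A2, u1, u2):
    """Player 2 best-responds to each a1; player 1 anticipates R2 (uniqueness assumed)."""
    def R2(a1):
        return max(A2, key=lambda a2: u2(a1, a2))
    a1_star = max(A1, key=lambda a1: u1(a1, R2(a1)))
    return a1_star, R2(a1_star)

# Payoffs of the example game tree used earlier in the notes
A1, A2 = ["L", "R"], ["L'", "R'"]
U = {("L", "L'"): (3, 1), ("L", "R'"): (1, 2), ("R", "L'"): (2, 1), ("R", "R'"): (0, 0)}
u1 = lambda a1, a2: U[(a1, a2)][0]
u2 = lambda a1, a2: U[(a1, a2)][1]

print(backward_induction(A1, A2, u1, u2))  # ('R', "L'")
```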

ˆ To define a SPNE we need to construct strategies

ˆ For player 1 a strategy coincides with an action since there is only one contingency in which player 1 can
be called upon to act – the beginning of the game

ˆ A strategy for player 2 is a function a1 ↦ f2 (a1 ) from A1 to A2

– R2 (a∗1 ) is an action but not a strategy


– the best response function R2 is a possible strategy for player 2

ˆ In this game, the subgames begin with player 2’s move in the second stage

ˆ There is one subgame for each player 1’s feasible action a1

103
Remark 25. The profile of strategies f ∗ ≡ (a∗1 , R2 ) is a SPNE

ˆ We have to show that f ∗ = (a∗1 , R2 ) is a NE and that the restriction to each subgame is also a NE

ˆ Subgames are simply single-person decision problems

– Being a NE reduces to requiring that player 2’s action be optimal in every subgame
– This is exactly the problem that the best-response function R2 solves

ˆ Now we have to prove that f ∗ is a Nash equilibrium

ˆ Recall that a∗1 satisfies

u1 (a∗1 , R2 (a∗1 )) ≥ u1 (a1 , R2 (a1 )) ∀a1 ∈ A1

implying that a∗1 is a best response to R2

ˆ R2 is a best response to a∗1 since

u2 (a∗1 , R2 (a∗1 )) ≥ u2 (a∗1 , f2 (a∗1 ))

for every strategy f2 : A1 → A2

Consider the standard two-stage game of complete but imperfect information defined as follows:

ˆ Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively

ˆ Players i3 and i4 observe the outcome of the first stage, (ai1 , ai2 ), and then simultaneously choose actions
ai3 and ai4 from feasible sets Ai3 and Ai4 , respectively

ˆ Payoffs are ui (ai1 , ..., ai4 )

ˆ We will assume that for each feasible outcome (ai1 , ai2 ) of the first game, the second-stage game that
remains between players i3 and i4 has a unique NE denoted by

(âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ Assume that (a∗i1 , a∗i2 ) is the unique NE of the first-stage interaction between i1 and i2 defined by the
following simultaneous-move game

1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
2. Payoffs are
ui (ai1 , ai2 , âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

Proposition 12. In the two-stage game of complete but imperfect information defined above, the subgame
perfect outcome is
(a∗i1 , a∗i2 , âi3 (a∗i1 , a∗i2 ), âi4 (a∗i1 , a∗i2 ))

104
but the subgame perfect Nash equilibrium is

(a∗i1 , a∗i2 , âi3 , âi4 )

Subgame perfect Nash equilibrium and credible threats


Consider the following dynamic game with complete and perfect information

ˆ The backwards induction outcome of the game is (R, L′ )

ˆ The SPNE is the profile (R, f2 ) where f2 : {L, R} → {L′ , R′ } is defined by

f2 (L) = R′ and f2 (R) = L′

Recall that the normal-form representation of this game is given by

                            Player 2
                (L′ , L′ )   (L′ , R′ )   (R′ , L′ )   (R′ , R′ )
Player 1   L     3, 1         3, 1         1, 2         1, 2
           R     2, 1         0, 0         2, 1         0, 0

ˆ There are two NE : (R, (R′ , L′ )) and (L, (R′ , R′ ))

ˆ The first one corresponds to the SPNE (R, f2 )

105
ˆ The second one corresponds to a non-credible threat of player 2

ˆ Player 2 is threatening to play R′ if player 1 plays R

ˆ If the threat works then 2 is not given the opportunity to carry out the threat

ˆ The threat should not work because it is not credible:

– if player 2 were given the opportunity to carry it out,


– then player 2 would decide to play L′ rather than R′

ˆ Observe that the players’ strategies do not constitute a NE in one of the subgames

106
Cap. 3 - Static Games of Incomplete Information
3.1 Static Bayesian games and Bayesian Nash equilibrium
Introduction

ˆ In a game of complete information the players’ payoff functions are common knowledge

ˆ In a game of incomplete information, at least one player is uncertain about another player’s payoff function

ˆ An example of a static game of incomplete information is a sealed-bid auction:

– each bidder knows his or her own valuation for the good being sold
– but each bidder does not know any other bidder’s valuation
– bids are submitted in sealed envelopes, so the players’ moves can be thought of as simultaneous

Cournot competition under asymmetric information

An example:

ˆ Consider a Cournot duopoly model with inverse demand given by P (Q) = a − Q, where Q = q1 + q2 is
the aggregate quantity in the market

ˆ Firm 1’s cost function is C1 (q1 ) = cq1

ˆ Firm 2’s cost function is

C2 (q2 ) = cH q2   with probability θ
C2 (q2 ) = cL q2   with probability 1 − θ

where cL < cH

ˆ Information is asymmetric:

– Firm 2 knows its cost function and firm 1’s


– Firm 1 knows its cost function and only that firm 2’s marginal cost is cH with probability θ and cL
with probability 1 − θ
– Firm 2 could be a new entrant to the industry or could have just invented a new technology

ˆ All of this is common knowledge

– firm 1 knows that firm 2 has superior information


– firm 2 knows that firm 1 knows this, and so on

ˆ Firm 2 may want to choose a different (and presumably lower) quantity if its marginal cost is high than
if it is low

ˆ Firm 1 should anticipate that firm 2 may tailor its quantity to its cost in this way

ˆ Let q2 (cH ) and q2 (cL ) denote firm 2’s quantity choices as a function of its cost

107
ˆ Let q1 denote firm 1’s single quantity choice

ˆ If firm 2’s cost is high, it will choose q2 (cH ) to solve

arg max{[(a − q1 − q2 ) − cH ]q2 : q2 ≥ 0}

ˆ If firm 2’s cost is low, it will choose q2 (cL ) to solve

arg max{[(a − q1 − q2 ) − cL ]q2 : q2 ≥ 0}

ˆ Firm 1 knows that firm 2’s cost is high with probability θ and should anticipate that firm 2’s quantity
choice will be q2 (cH ) or q2 (cL ), depending on firm 2’s cost

ˆ Firm 1 chooses q1 to solve


arg max{f1 (q1 , q2∗ ) : q1 ≥ 0}

where
f1 (q1 , q2∗ ) ≡ θ[(a − q1 − q2∗ (cH )) − c]q1 + (1 − θ)[(a − q1 − q2∗ (cL )) − c]q1

so as to maximize expected profits

ˆ The first order conditions are

q2∗ (cH ) = [a − q1∗ − cH ]/2
q2∗ (cL ) = [a − q1∗ − cL ]/2

and

q1∗ = {θ[a − q2∗ (cH ) − c] + (1 − θ)[a − q2∗ (cL ) − c]}/2
ˆ We assume that parameters are such that these FOCs characterize the solutions to the optimization
problems

The solutions to the three FOCs are

q2∗ (cH ) = (a − 2cH + c)/3 + [(1 − θ)/6](cH − cL )

q2∗ (cL ) = (a − 2cL + c)/3 − (θ/6)(cH − cL )

and

q1∗ = [a − 2c + θcH + (1 − θ)cL ]/3
ˆ Consider the Cournot equilibrium under complete information with costs c1 and c2

ˆ Provided that c1 and c2 are such that both equilibrium quantities are positive, firm i produces in equilibrium the quantity

q̂i = (a − 2ci + cj )/3

108
ˆ In the incomplete information case, q2∗ (cH ) is greater than (a − 2cH + c)/3 and q2∗ (cL ) is less than (a −
2cL + c)/3

ˆ This occurs because firm 2 not only tailors its quantity to its costs

ˆ But also responds to the fact that firm 1 cannot do so
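As a numeric sanity check, the following Python sketch plugs hypothetical parameter values (a = 10, c = 1, cH = 3, cL = 2, θ = 1/2; these numbers are my own, not from the text) into the closed-form solutions and verifies the three first-order conditions as well as the comparison with the complete-information quantities.

```python
# Verify the asymmetric-information Cournot solution numerically.
a, c, cH, cL, th = 10.0, 1.0, 3.0, 2.0, 0.5   # hypothetical parameters

q1 = (a - 2*c + th*cH + (1 - th)*cL) / 3
q2H = (a - 2*cH + c) / 3 + (1 - th)/6 * (cH - cL)
q2L = (a - 2*cL + c) / 3 - th/6 * (cH - cL)

# Each quantity must satisfy its first-order condition (mutual best responses)
assert abs(q2H - (a - q1 - cH)/2) < 1e-9
assert abs(q2L - (a - q1 - cL)/2) < 1e-9
assert abs(q1 - (th*(a - q2H - c) + (1 - th)*(a - q2L - c))/2) < 1e-9

# q2*(cH) exceeds the complete-information quantity, q2*(cL) falls short of it
assert q2H > (a - 2*cH + c)/3 and q2L < (a - 2*cL + c)/3
print(round(q1, 4), round(q2H, 4), round(q2L, 4))  # 3.5 1.75 2.25
```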

Normal-form representation of static Bayesian games

ˆ Recall that the normal-form representation of a game of complete information is G = (Si , ui )i∈I

– Si is player i’s strategy space


– ui (si , s−i ) is player i’s payoff when he chooses the strategy si and the others choose s−i

ˆ In a simultaneous-move game of complete information a strategy for a player is simply an action

ˆ We can write G = (Ai , ui )i∈I where Ai is player i’s action space and ui (ai , a−i ) is player i’s payoff

ˆ The timing of a static game of complete information is as follows

1. the players simultaneously choose actions


2. payoffs are received

ˆ We develop the normal-form representation of a simultaneous-move game of incomplete information, also


called static Bayesian game

ˆ We should represent the idea that each player knows his or her payoff function but may be uncertain
about the other players’ payoff functions

ˆ Let player i’s possible payoff functions be represented by ui (ai , a−i ; ti ) where ti is called player i’s type

ˆ The type ti belongs to a set of possible types Ti also called type space

ˆ Each ti corresponds to a different payoff function that player i might have

ˆ For example, suppose that player i has two possible payoff functions

ˆ We would say that player i has two types ti1 and ti2

ˆ Player i’s type space is Ti = {ti1 , ti2 } and player i’s two payoff functions are ui (a; ti1 ) and ui (a; ti2 )

ˆ We can also represent the possibility that the player might have different sets of feasible actions

– Suppose for example that player i’s set of feasible actions is {a, b} with probability q and {a, b, c}
with probability 1 − q
– We can say that i has two types: ti1 and ti2 where the probability of ti1 is q

109
– We can define i’s feasible set of actions to be {a, b, c} for both types but define the payoff from taking
action c to be −∞ for type ti1

Another example

ˆ Consider the Cournot game previously presented

ˆ The firms’ actions are their quantity choices, q1 and q2

ˆ Firm 2 has two possible cost functions and thus two possible profit or payoff functions

π2 (q1 , q2 ; cL ) = [(a − q1 − q2 ) − cL ]q2

and
π2 (q1 , q2 ; cH ) = [(a − q1 − q2 ) − cH ]q2

ˆ Firm 1 has only one possible payoff function

π1 (q1 , q2 ; c) = [(a − q1 − q2 ) − c]q1

ˆ We say that firm 2’s type space is T2 = {cL , cH } and that firm 1’s type space is T1 = {c}

ˆ Saying that player i knows his or her own payoff function is equivalent to saying that player i knows his
or her type

ˆ Saying that player i may be uncertain about the other players’ payoff functions is equivalent to saying
that player i may be uncertain about the types of the other players, denoted by

t−i = (t1 , · · · , ti−1 , ti+1 , · · · , tn )

ˆ We use T−i to denote the set of all possible values of t−i , i.e.,

T−i ≡ ∏j≠i Tj = T1 × · · · × Ti−1 × Ti+1 × · · · × Tn

ˆ We use the probability distribution π(t−i |ti ) to denote player i’s belief about the other players’ types, t−i ,
given player i’s knowledge of his or her own type, ti

ˆ In many applications, the players’ types are independent, in which case π(t−i |ti ) does not depend on ti ,
so we can write player i’s beliefs as π(t−i )

ˆ Imagine two firms racing to develop a new technology

ˆ Each firm’s chance of success depends in part on how difficult the technology is to develop, which is not
known

110
ˆ Each firm knows only whether it has succeeded and not whether the other has

ˆ If firm 1 has succeeded, then it is more likely that the technology is easy to develop and so also more
likely that firm 2 has succeeded

ˆ Firm 1’s belief about firm 2’s type depends on firm 1’s knowledge of its own type

Definição 35. The normal-form representation of a static Bayesian game is

G = (Ai , Ti , pi , ui )i∈I

where

ˆ Ai is player i’s action space

ˆ Ti is player i’s type space

ˆ pi ∈ P rob(T ) is player i’s beliefs about T = Ti × T−i

ˆ ui : Ai × A−i × Ti 7→ [−∞, ∞) is player i’s payoff function

ui (ai , a−i ; ti )

The normal-form representation of a static Bayesian game is

G = (Ai , Ti , pi , ui )i∈I

ˆ Player i’s type ti is privately known by player i, determines player i’s payoff function ui (ai , a−i ; ti )

ˆ Player i’s belief

pi (t−i |ti ) = pi ({(ti , t−i )}) / pi ({ti } × T−i )

describes i’s uncertainty about the other players’ possible types t−i , given i’s own type ti

111
ˆ To simplify notations,

– pi ({(ti , t−i )}) is denoted by pi (ti , t−i ) or pi (t)


– pi ({ti } × T−i ) is denoted by pi (ti )

ˆ Therefore

pi (t−i |ti ) = pi (ti , t−i )/pi (ti )

ˆ Since player i observes his own type, we do not need to define the probability pi on the whole space T

ˆ One may consider as a primitive of the game the conditional probabilities

(pi (·|ti ))ti ∈Ti ∈ [P rob(T−i )]Ti

i.e., player i’s beliefs can be represented by a function

ti 7→ pi (·|ti )

from Ti to P rob(T−i )

Following Harsanyi (1968) we will assume that the timing of a static Bayesian game is as follows

1. Nature draws a type vector t = (ti )i∈I where ti is drawn from the set of possible types Ti

2. Nature reveals ti to player i but not to any other player

3. The players simultaneously choose actions, player i choosing ai from the feasible set Ai

4. Payoffs ui (ai , a−i ; ti ) are received

ˆ We can interpret a game of incomplete information as a game of imperfect information since at some move
in the game the player with the move does not know the complete history of the game thus far

ˆ Indeed, nature reveals player i’s type to player i but not to player j in step (2)

ˆ Player j does not know the complete history of the game when actions are chosen in step (3)

ˆ For some games, player i’s payoff may depend not only on the actions (ai , a−i ) and his own type ti , but also on all the other types t−i

ˆ In that case player i’s payoff is denoted by ui (ai , a−i ; ti , t−i )

ˆ We will assume that it is common knowledge that in step (1) of the timing, nature draws a type vector
t = (ti )i∈I according to a common prior probability distribution p ∈ P rob(T )

ˆ When nature reveals ti to player i, he can compute the belief pi (t−i |ti ) using Bayes’ rule

pi (t−i |ti ) = p(t−i |ti ) ≡ p(ti , t−i )/p(ti ) = p(ti , t−i ) / ∑τ−i ∈T−i p(ti , τ−i )

112
ˆ The other players can compute the various beliefs π(·|ti ) that player i might hold, depending on i’s type
ti

ˆ We will frequently assume that players’ types are independent, i.e., there exists qi ∈ P rob(Ti ) such that

p(t1 , · · · , tn ) = q1 (t1 ) × · · · × qn (tn )

ˆ In this case pi (t−i |ti ) does not depend on ti since

pi (t−i |ti ) = q1 (t1 ) × · · · × qi−1 (ti−1 ) × qi+1 (ti+1 ) × · · · × qn (tn )

ˆ In this case the other players know i’s belief about their types

Definition of Bayesian Nash equilibrium

ˆ In order to define an equilibrium concept for static Bayesian games, we must first define the player’s
strategies

ˆ Recall that a player’s strategy is a complete plan of action, specifying a feasible action in every contingency
in which the player might be called on to act

ˆ Given the timing of a static Bayesian game, in which nature begins the game by drawing the players’
types, a (pure) strategy for player i must specify a feasible action for each of player i’s possible types

Definição 36. A strategy for player i in the static Bayesian game G = (Ai , Ti , pi , ui )i∈I is a function

si : Ti 7→ Ai

which specifies for each type ti ∈ Ti an action si (ti ) from the feasible set Ai

ˆ In a Bayesian game the strategy spaces are constructed from the type and action spaces

ˆ Player i’s set of possible (pure) strategies, Si , is the set of all possible functions with domain Ti and range Ai , i.e.,

Si ≡ [Ai ]Ti

ˆ In a separating strategy, each type ti chooses a different action ai

ˆ In a pooling strategy, all types choose the same action

ˆ It may seem unnecessary to require player i’s strategy to specify a feasible action for each of player i’s possible types

113
ˆ Once nature has drawn a particular type and revealed it to a player, it may seem that the player need
not be concerned with the actions he should have taken had nature drawn some other type

ˆ But player i needs to consider what the other players will do

ˆ What they will do depends on what they think player i will do, for each ti in Ti (since they do not observe ti )

ˆ In deciding what to do once one type has been drawn, player i will have to think about what he would have done had any of the other types in Ti been drawn

ˆ When player j has to decide what to do, he should think about what player i may do for each possible type in Ti , since player j cannot observe player i’s type

Definição 37. In the static game G = (Ai , Ti , pi , ui )i∈I the profile of strategies

s∗ = (s∗i )i∈I

is a (pure strategy) Bayesian Nash equilibrium (BNE) if for each player i and for each of i’s types ti in Ti ,
the action s∗i (ti ) solves

arg max{ ∑t−i ∈T−i ui (ai , s∗−i (t−i ); (ti , t−i )) pi (t−i |ti ) : ai ∈ Ai }

where ui (ai , s∗−i (t−i ); t) is given by

ui (s∗1 (t1 ), · · · , s∗i−1 (ti−1 ), ai , s∗i+1 (ti+1 ), · · · , s∗n (tn ); t)

ˆ One may also write in condensed form

s∗i (ti ) ∈ arg max{Ep [ui (ai , s∗−i )|ti ] : ai ∈ Ai }

ˆ In a static BNE, no player wants to change his strategy, even if the change involves only one action by
one type

ˆ We can show that in a finite static Bayesian game (i.e., a game in which Ai and Ti are finite sets) there
exists a BNE, perhaps in mixed strategies

Proposição 13. Assume that (s∗i )i∈I is a Bayesian Nash equilibrium, i.e.,

∀ti , s∗i (ti ) ∈ arg max{Ep [ui (ai , s∗−i )|ti ] : ai ∈ Ai }

Then we have
s∗i ∈ arg max{Ep [ui (si , s∗−i )] : si ∈ Si = [Ai ]Ti }

114
where

Ep [ui (si , s∗−i )] ≡ ∑t∈T ui (si (ti ), s∗−i (t−i )) p(t).

Moreover, the converse is true.
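For finite games, the definition can be checked by exhaustive search. The sketch below uses a hypothetical example of my own (not from the text): player 1 has a single type, player 2 has two equally likely types, player 1 wants to match player 2’s action, type t1 wants to match player 1, and type t2 wants to mismatch. The brute-force search applies the BNE definition type by type.

```python
# Minimal brute-force BNE finder for a finite static Bayesian game.
from itertools import product

A = ["A", "B"]
types2 = ["t1", "t2"]
p = {t: 0.5 for t in types2}           # common prior over player 2's types

def u1(a1, a2, t):                     # player 1: payoff 1 on a match
    return 1.0 if a1 == a2 else 0.0

def u2(a1, a2, t):                     # type t1 matches, type t2 mismatches
    return (1.0 if a1 == a2 else 0.0) if t == "t1" else (1.0 if a1 != a2 else 0.0)

def is_bne(s1, s2):
    # player 1: s1 must maximize expected payoff over player 2's types
    eu = lambda a1: sum(p[t] * u1(a1, s2[t], t) for t in types2)
    if eu(s1) < max(eu(a) for a in A) - 1e-12:
        return False
    # player 2: each type's action must be optimal given s1
    return all(u2(s1, s2[t], t) >= max(u2(s1, a, t) for a in A) - 1e-12
               for t in types2)

bne = [(s1, s2["t1"], s2["t2"])
       for s1 in A
       for s2 in (dict(zip(types2, acts)) for acts in product(A, repeat=2))
       if is_bne(s1, s2)]
print(bne)  # [('A', 'A', 'B'), ('B', 'B', 'A')]
```

In both equilibria each of player 2’s types best-responds to player 1, and player 1 is indifferent ex ante, so the condition of the definition holds type by type.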

3.2 Applications
Mixed strategies revisited

ˆ Harsanyi (1973) suggested the following interpretation of mixed strategies

ˆ Player j’s mixed strategy represents player i’s uncertainty about j’s choice of a pure strategy

ˆ Player j’s choice in turn depends on the realization of a small amount of private information

ˆ More precisely, a mixed-strategy NE in a game of complete information can (almost always) be interpreted
as a pure-strategy BNE in a closely related game with a little bit of incomplete information

ˆ The crucial feature of a mixed-strategy NE is not that player j chooses a strategy randomly, but rather
that player i is uncertain about player j’s choice

The Battle of Sexes

Pat
Opera Fight
Opera 2, 1 0, 0
Chris
Fight 0, 0 1, 2

ˆ There are two pure-strategy Nash equilibria: (Opera,Opera) and (Fight,Fight)

ˆ And a mixed-strategy NE in which

– Chris plays Opera with probability 2/3


– Pat plays Fight with probability 2/3

The Battle of Sexes with incomplete information

ˆ Suppose that Chris and Pat are not quite sure of each other’s payoffs

ˆ For instance, suppose that Chris’s payoff if both attend the Opera is 2 + tc , where tc is privately known
by Chris

ˆ Pat’s payoff if both attend the Fight is 2 + tp , where tp is privately known by Pat

ˆ The parameters tc and tp are independent draws from a uniform distribution on [0, x], where x should be thought of as small relative to 2

115
ˆ All the other payoffs are the same

The abstract static Bayesian games in normal-form is

G = {Ac , Ap ; Tc , Tp ; pc , pp ; uc , up }

where

ˆ the action spaces are Ac = Ap = {Opera, F ight}

ˆ the type space are Tc = Tp = [0, x]

ˆ the beliefs are


pc (X|tc ) = pp (X|tp ) = λ(X)/x

for all X ⊂ [0, x], and the payoffs are as follows

Pat
Opera Fight
Opera 2 + tc , 1 0, 0
Chris
Fight 0, 0 1, 2 + tp

ˆ Fix two critical values c and p in [0, x]

ˆ Consider the strategy profile s∗ = (s∗c , s∗p ) defined as follows

ˆ Chris plays Opera if tc exceeds the critical value c and plays F ight otherwise, i.e.,
(
Opera if tc > c
s∗c (tc ) =
F ight if tc ≤ c

ˆ Pat plays F ight if tp exceeds the critical value p and plays Opera otherwise, i.e.,
(
F ight if tp > p
s∗p (tp ) =
Opera if tp ≤ p

– Chris plays Opera with probability (x − c)/x


– Pat plays F ight with probability (x − p)/x

ˆ For a given value of x, we will determine values of c and p such that these strategies are a BNE

ˆ Given Pat’s strategy s∗p , Chris’s expected payoff from playing Opera is

uc (Opera, s∗p ; tc ) = (p/x)(2 + tc ) + [1 − (p/x)] · 0 = (p/x)(2 + tc )

and from playing Fight it is

uc (Fight, s∗p ; tc ) = (p/x) · 0 + [1 − (p/x)] · 1 = 1 − (p/x)

116
ˆ Playing Opera is optimal (best response) if and only if

(p/x)(2 + tc ) ≥ 1 − (p/x), i.e., tc ≥ (x/p) − 3 ≡ c

ˆ Given Chris’s strategy s∗c , Pat’s expected payoff from playing Fight is

up (Fight, s∗c ; tp ) = [1 − (c/x)] · 0 + (c/x)(2 + tp ) = (c/x)(2 + tp )

and from playing Opera it is

up (Opera, s∗c ; tp ) = [1 − (c/x)] · 1 + (c/x) · 0 = 1 − (c/x)

ˆ Playing Fight is optimal if and only if

(c/x)(2 + tp ) ≥ 1 − (c/x), i.e., tp ≥ (x/c) − 3 ≡ p
ˆ Solving c = (x/p) − 3 and p = (x/c) − 3 simultaneously yields p = c and p² + 3p − x = 0

ˆ Solving the quadratic equation then shows that both

– the probability that Chris plays Opera, namely (x − c)/x, and


– the probability that Pat plays F ight, namely (x − p)/x,

equal

1 − [−3 + √(9 + 4x)]/(2x)
which approaches 2/3 as x approaches zero

As the incomplete information disappears (x → 0), the players’ behavior in this pure-strategy BNE of the
incomplete information game approaches the players’ behavior in the mixed-strategy NE in the original game
of complete information
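This limit can be sketched in a few lines of Python: solve p² + 3p − x = 0 for the critical value and evaluate the equilibrium probability (x − p)/x as x shrinks.

```python
# Purification limit in the incomplete-information Battle of the Sexes:
# the critical value is the positive root of p^2 + 3p - x = 0, and the
# probability that Chris plays Opera, (x - p)/x, tends to 2/3 as x -> 0.
import math

def prob_opera(x):
    pcrit = (-3 + math.sqrt(9 + 4 * x)) / 2   # positive root
    return (x - pcrit) / x

for x in (1.0, 0.1, 0.001):
    print(x, round(prob_opera(x), 4))

assert abs(prob_opera(1e-9) - 2/3) < 1e-6
```

By symmetry the same numbers give the probability that Pat plays Fight.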

An auction

ˆ Consider the following first-price, sealed-bid auction

ˆ There are two bidders, I = {1, 2}

117
ˆ Bidder i has a valuation vi for the good

– If bidder i gets the good and pays the price p, then i’s payoff is vi − p

ˆ The two players’ valuations are independently and uniformly distributed on [0, 1]

ˆ The players simultaneously submit their non-negative bids

ˆ The higher bidder wins the good and pays the price she bid; the other bidder gets and pays nothing

ˆ In case of a tie, the winner is determined by a flip of a coin

ˆ The bidders are risk-neutral and all this is common knowledge

The static Bayesian game associated to this problem is defined by

ˆ the action space Ai = [0, ∞)

ˆ the type space Ti = [0, 1]

ˆ Player i believes that vj is uniformly distributed on [0, 1]

∀V ∈ B([0, 1]), pi ({vj ∈ V }|vi ) = λ(V )

where λ is the Lebesgue measure on [0, 1]

ˆ Abusing notations, pi (·|vi ) is denoted pi (·) since it is independent of vi

ˆ Player i’s (expected) payoff function is

ui (b1 , b2 ; v1 , v2 ) = vi − bi if bi > bj ; (vi − bi )/2 if bi = bj ; 0 if bi < bj

ˆ A strategy for player i is a function


b̃i : vi 7→ b̃i (vi )

ˆ A profile (b̃1 , b̃2 ) is BNE if for each player i, for each valuation vi ∈ [0, 1], the value b̃i (vi ) belongs to
 
1
arg max (vi − bi )pi {bi > b̃j } + (vi − bi )pi {bi = b̃j } : bi ≥ 0
2

ˆ Recall that
pi {bi > b̃j } = λ{vj ∈ [0, 1] : bi > b̃j (vj )}

and
pi {bi = b̃j } = λ{vj ∈ [0, 1] : bi = b̃j (vj )}

An auction: existence of a linear equilibrium

118
ˆ We propose to look for a linear equilibrium of the form

b̃i : vi 7→ ai + ci vi

where ci > 0

ˆ We are not restricting the players’ strategy spaces to include only linear strategies

ˆ We allow players to choose arbitrary strategies but ask whether there is an equilibrium that is linear

ˆ Suppose that player j adopts the linear strategy

b̃j = aj + cj Id

where
Id : [0, 1] → [0, 1] is defined by Id(v) = v

ˆ For a given valuation vi , player i’s best response solves

max{(vi − bi )λ{bi > aj + cj Id} : bi ≥ 0}

where we recall that

λ{bi > aj + cj Id} = λ{vj ∈ [0, 1] : bi > aj + cj vj }

ˆ We have used the fact that λ{bi = b̃j } = 0

ˆ Observe that the best reply b̃i (vi ) must satisfy

aj ≤ b̃i (vi ) ≤ aj + cj

ˆ If bi belongs to [aj , aj + cj ] then

λ{bi > aj + cj Id} = λ[0, (bi − aj )/cj ) = (bi − aj )/cj

ˆ Player i’s best response is therefore


(
(vi + aj )/2 if vi ≥ aj
b̃i (vi ) =
aj if vi < aj

ˆ If 0 < aj < 1 then there are some values of vi such that vi < aj , in which case b̃i is not linear

ˆ Can we find a NE where aj ≥ 1 or aj ≤ 0?

ˆ Assume that aj ≤ 0; in this case player i’s best response is

b̃i (vi ) = aj /2 + vi /2

119
ˆ The function b̃i takes the form ai + ci Id where ai = aj /2 and ci = 1/2

ˆ Imposing the same condition on player j gives aj = ai /2, so ai = aj = 0 and ci = cj = 1/2

ˆ This implies that ((1/2)Id, (1/2)Id) is a BNE
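A quick numeric confirmation: against an opponent who bids b̃(v) = v/2 with v uniform on [0, 1], a grid search over own bids recovers vi /2 as the expected-payoff maximizer (a sketch; the grid resolution and test valuations are my own choices).

```python
# Best-response check for the first-price auction: if the opponent bids
# vj/2 with vj ~ U[0,1], then bidding bi wins with probability min(2*bi, 1),
# so the expected payoff is (vi - bi) * min(2*bi, 1), maximized at bi = vi/2.
def expected_payoff(vi, bi):
    win_prob = min(max(2 * bi, 0.0), 1.0)
    return (vi - bi) * win_prob

grid = [k / 10000 for k in range(5001)]   # candidate bids from 0 to 0.5
for vi in (0.3, 0.6, 0.9):
    best = max(grid, key=lambda b: expected_payoff(vi, b))
    assert abs(best - vi / 2) < 1e-3, (vi, best)
print("b(v) = v/2 maximizes the expected payoff on the grid")
```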

An auction: uniqueness

ˆ We propose to prove that there is a unique symmetric BNE which is the linear equilibrium already derived

ˆ A BNE is called symmetric if the players’ strategies are identical

ˆ We propose to prove that there is a single function b̃ such that (b̃, b̃) is a BNE

ˆ Since players’ valuations typically will be different, their bids typically will be different, even if both use
the same strategy

ˆ Suppose that player j adopts a strategy b̃ and assume that b̃ is strictly increasing and differentiable

ˆ For a given value of vi , player i’s optimal bid solves5

max{(vi − bi )λ{bi > b̃} : bi ≥ 0}

If bi ∈ Im(b̃) then

{bi > b̃} = [0, b̃−1 (bi ))

implying the first-order condition

−b̃−1 (b̃i (vi )) + (vi − b̃i (vi ))[b̃−1 ]′ (b̃i (vi )) = 0

ˆ In order to get a symmetric BNE we need to have b̃i = b̃

ˆ The first-order condition is then

−b̃−1 (b̃(vi )) + (vi − b̃(vi ))[b̃−1 ]′ (b̃(vi )) = 0

ˆ Since we have b̃−1 (b̃(vi )) = vi and

[b̃−1 ]′ (b̃(vi )) = 1/b̃′ (vi )

ˆ Then the function b̃ must satisfy

vi b̃′ (vi ) + b̃(vi ) = vi

ˆ Observe that

[Id × b̃]′ (vi ) = vi b̃′ (vi ) + b̃(vi )

ˆ This leads to

[Id × b̃ − (1/2)Id² ]′ = 0

5. Observe that λ{bi = b̃} = 0

120
ˆ Therefore there exists a constant k such that

∀vi ∈ [0, 1], vi b̃(vi ) = (1/2)vi² + k

ˆ We need a boundary condition to determine k

ˆ A player’s action should be individually rational: no player should bid more than his valuation, so we require b̃ ≤ Id

ˆ Then vi /2 + k/vi ≤ vi for every vi ∈ (0, 1], which forces k ≤ 0; since bids are non-negative, k ≥ 0

ˆ This implies k = 0 and

b̃ = (1/2)Id

A double auction

ˆ We consider the case in which a buyer and a seller each have private information about their valuations

ˆ The seller names an asking price ps

ˆ The buyer simultaneously names an offer price pb

ˆ If pb ≥ ps then trade occurs at price p = (pb + ps )/2

ˆ If pb < ps then no trade occurs

ˆ The buyer’s valuation for the seller’s good is vb , the seller’s valuation is vs

ˆ These valuations are private information and are drawn from independent uniform distribution on [0, 1]

ˆ If the buyer gets the good for price p then his utility is vb − p; if there is no trade the buyer’s utility is
zero

ˆ If the seller sells the good for price p then his utility is p − vs ; if there is no trade the seller’s utility is
zero

ˆ A strategy for the buyer is a function p̃b : vb 7→ p̃b (vb ) specifying the price the buyer will offer for each of
his possible valuation

ˆ A strategy for the seller is a function p̃s : vs 7→ p̃s (vs ) specifying the price the seller will demand for each
of his possible valuation

ˆ A profile of strategies (p̃b , p̃s ) is a BNE if the two following conditions hold

ˆ For each vb ∈ [0, 1], the price p̃b (vb ) solves

max{ ∫{pb ≥p̃s } [vb − (pb + p̃s (vs ))/2] λ(dvs ) : pb ≥ 0 }

121
where
{pb ≥ p̃s } = {vs ∈ [0, 1] : pb ≥ p̃s (vs )}

ˆ For each vs ∈ [0, 1], the price p̃s (vs ) solves

max{ ∫{ps ≤p̃b } [(ps + p̃b (vb ))/2 − vs ] λ(dvb ) : ps ≥ 0 }

where

{ps ≤ p̃b } = {vb ∈ [0, 1] : ps ≤ p̃b (vb )}

ˆ Equivalently, the profile of strategies (p̃b , p̃s ) is a BNE if the two following conditions hold

ˆ For each vb ∈ [0, 1], the price p̃b (vb ) solves

max{ [vb − (pb + E[p̃s |pb ≥ p̃s ])/2] λ{pb ≥ p̃s } : pb ≥ 0 }

where we recall that

E[p̃s |pb ≥ p̃s ] = [1/λ{pb ≥ p̃s }] ∫{pb ≥p̃s } p̃s (vs ) λ(dvs )

ˆ For each vs ∈ [0, 1], the price p̃s (vs ) solves

max{ [(ps + E[p̃b |ps ≤ p̃b ])/2 − vs ] λ{ps ≤ p̃b } : ps ≥ 0 }

ˆ There are many BNE; we propose to exhibit one of them in which trade occurs at a single price, if it occurs at all

ˆ For any x ∈ [0, 1],

– let the buyer’s strategy be to offer x if vb ≥ x and to offer 0 otherwise


– let the seller’s strategy be to demand x if vs ≤ x and to demand one otherwise

ˆ This profile of strategies is a BNE

122
ˆ Trade would be efficient for all (vs , vb ) pairs such that vb ≥ vs , but under this equilibrium it does not occur in the two regions where vb ≥ vs but either vb < x or vs > x

ˆ We propose to derive a linear Bayesian equilibrium

ˆ Suppose the seller’s strategy is p̃s : vs 7→ as + cs vs with cs > 0

ˆ If the buyer’s valuation is vb , his best reply p̃b (vb ) should solve

max{ [vb − (pb + (as + pb )/2)/2] · [pb − as ]+ /cs : pb ≥ 0 }

ˆ The first order condition for which yields

p̃b (vb ) = (2/3)vb + (1/3)as

ˆ Thus, if the seller plays a linear strategy, then the buyer’s best response is also linear

ˆ Analogously, suppose the buyer’s strategy is p̃b : vb 7→ ab + cb vb with cb > 0

ˆ If the seller’s valuation is vs , his best reply p̃s (vs ) should solve

max{ [(ps + E[p̃b |ps ≤ p̃b ])/2 − vs ] · (cb − [ps − ab ]+ )/cb : ps ≥ 0 }

where E[p̃b |ps ≤ p̃b ] equals ab + cb /2 if ps ≤ ab and (ab + cb + ps )/2 otherwise

ˆ The first order condition for which yields

p̃s (vs ) = (2/3)vs + (1/3)(ab + cb ) if vs ≥ ab − cb /2, and p̃s (vs ) = ab if vs < ab − cb /2

ˆ Thus, if the buyer plays a linear strategy, then the seller’s best response may also be linear

– It will be the case if ab ≤ cb /2

ˆ Assume ab ≤ cb /2

ˆ If the players’ linear strategies are to be best responses to each other then we get

cb = 2/3, cs = 2/3, ab = as /3, and as = (ab + cb )/3

ˆ We obtain ab = cb /8 implying that the condition ab ≤ cb /2 is satisfied

ˆ The linear strategies are then

p̃b (vb ) = (2/3)vb + 1/12 and p̃s (vs ) = (2/3)vs + 1/4

123
ˆ Trade occurs if and only if the pair (vs , vb ) is such that p̃b (vb ) ≥ p̃s (vs ), i.e., iff vb ≥ vs + (1/4)
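The fixed point can be confirmed numerically. The sketch below (grid resolution and test valuations are my own choices) checks that each player’s linear strategy attains the maximal expected payoff against the other’s, including the corner cases where no trade can occur.

```python
# Verify that pb(vb) = 2vb/3 + 1/12 and ps(vs) = 2vs/3 + 1/4 are mutual
# best responses in the double auction with uniform valuations on [0,1].
ab, cb, as_, cs = 1/12, 2/3, 1/4, 2/3        # 'as' is a Python keyword

def buyer_payoff(vb, pb):
    # seller asks as_ + cs*vs; trade iff vs <= (pb - as_)/cs
    q = min(max((pb - as_) / cs, 0.0), 1.0)  # probability of trade
    if q == 0.0:
        return 0.0
    exp_ask = as_ + cs * q / 2               # E[ask | trade]
    return (vb - (pb + exp_ask) / 2) * q

def seller_payoff(vs, ps):
    # buyer offers ab + cb*vb; trade iff vb >= (ps - ab)/cb
    lo = min(max((ps - ab) / cb, 0.0), 1.0)
    q = 1.0 - lo
    if q == 0.0:
        return 0.0
    exp_offer = ab + cb * (lo + 1.0) / 2     # E[offer | trade]
    return ((ps + exp_offer) / 2 - vs) * q

grid = [k / 2000 for k in range(2001)]
for v in (0.2, 0.5, 0.8):
    assert buyer_payoff(v, 2*v/3 + 1/12) >= max(buyer_payoff(v, p) for p in grid) - 1e-9
    assert seller_payoff(v, 2*v/3 + 1/4) >= max(seller_payoff(v, p) for p in grid) - 1e-9
print("linear strategies are mutual best responses on the grid")
```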

A double auction: comparing the solutions

ˆ In both cases, the one-price and linear equilibria, the most valuable trade (vs = 0 and vb = 1) does occur

ˆ The one-price equilibrium misses some valuable trades

– vs = 0 and vb = x − ε where ε is small

and achieves some trades that are worth next to nothing

– vs = x − ε and vb = x + ε where ε is small

ˆ The linear equilibrium, in contrast, misses all trades worth next to nothing but achieves all trades worth
at least 1/4

124
ˆ This suggests that the linear equilibrium may dominate the one-price equilibria, in terms of the expected
gains the players receive

ˆ One may wonder if it is possible to find other BNE for which the players might do even better

ˆ Myerson and Satterthwaite (JET, 1983) show that, for the uniform valuation distributions, the linear equilibrium yields higher expected gains for the players than any other Bayesian Nash equilibrium of the double auction

ˆ This implies that there is no BNE of the double auction in which trade occurs if and only if it is efficient
(i.e., if and only if vb ≥ vs )

3.3 The Revelation Principle


Mechanism Design

ˆ It is common for one of the players to be able to set the terms under which a strategic interaction will unfold

ˆ When the government decides to privatize one of its firms

– the government is the seller
– but it is also the agent who sets the rules of the game (the privatization rules)

ˆ An entrepreneur who decides to sell his firm can likewise set the terms under which the negotiation with potential buyers will take place

ˆ The Central Bank, in its auctions of government bonds, sets the rules of those auctions itself

Private information and mechanisms

ˆ The fact that some players hold private information is essential for the player who sets the rules to be able to maximize his payoff

ˆ When a firm (or a Central Bank bond) is sold, the value the buyer is willing to pay is generally not known

ˆ A rule of the game is called a mechanism

Questão 11. How should a mechanism be designed so that a given objective is attained?

Basic constraints on the mechanism

ˆ The player who has the power to design the mechanism is free to set the rules of the game

ˆ However, he faces two constraints
125
1. individual rationality: the mechanism designer cannot resort to any form of coercion
– the players involved in the mechanism must play voluntarily
2. incentive compatibility: the mechanism designer must hold reasonable expectations about the other players’ behavior
– the players involved in the mechanism will not play anything that is not an equilibrium of the mechanism consistent with their own interests

Example

ˆ A government has decided to sell a state-owned firm

ˆ Only one buyer qualified, in a prior screening process, to acquire the firm

ˆ There are two types of buyer:

– one type who assigns the firm a high value a > 0
– one type who assigns it a low value b, with a > b > 0

ˆ The government does not know the buyer’s type

ˆ But it knows he can be of one of two types, a or b

ˆ The possible buyer types, {a, b}, are common knowledge among the players

ˆ Let v be the amount paid for the firm

– If the buyer is of type t ∈ {a, b}, he obtains a surplus equal to t − v

ˆ The government seeks to sell the firm at the highest possible price

ˆ The government does not know the buyer’s type

ˆ It assigns probability p to the buyer being of type t = a

ˆ Suppose that a = 30, b = 10 and p = 0.5

Example: trivial mechanisms

ˆ The government has two simple mechanisms available

1. Ask the buyer what his type is or, equivalently, how much he is willing to pay for the firm
2. Set a price that the buyer, whatever his type, is willing to pay

ˆ In the first case, the buyer will always declare that he is of type b, who values the firm less

– As a consequence, the firm is sold at the value v = b

126
ˆ The outcome is the same as when the government sets a price acceptable to either of the two buyer types

ˆ There are alternatives that can produce better results for the government

Example: a more sophisticated mechanism

ˆ The government can set up a mechanism in which the sale is guaranteed whenever a price above v̂ = 17 is offered

ˆ If a lower value v is offered (it would then be v = b), the probability that the sale is actually completed is 50%

– In that case the government flips a fair coin
– Is this threat credible?

ˆ Since v̂ > b, the type t = b buyer prefers to run the risk of not completing the purchase, making the lower offer b

ˆ Is it worthwhile for the high-valuation buyer t = a to pay v̂?

ˆ If the high-valuation buyer t = a pays v̂, his surplus (payoff) is a − v̂ = 13

– This value is certain (riskless)

ˆ If the type t = a buyer offers the lower price (why offer more than b?), his expected surplus (being risk neutral) is

(1/2)(a − b) + (1/2) · 0 = (a − b)/2 = (30 − 10)/2 = 10

ˆ Thus it is a better deal for the high-valuation buyer t = a to pay the higher price set by the government and secure the acquisition of the firm

ˆ Is this scheme worthwhile for the government?

ˆ The government sells the firm to a high-valuation buyer with probability p

ˆ With a low-valuation buyer, there is a 50% chance the government sells, but also a 50% chance it cancels the sale

ˆ The government’s expected revenue (being risk neutral) is

P rob{t = a}v̂ + P rob{t = b}[(1/2)b + (1/2) · 0] = (1/2) × 17 + (1/4) × 10 + (1/4) × 0 = 11

ˆ The expected value 11 is greater than the 10 it would obtain otherwise
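The arithmetic of this example can be sketched in a few lines of Python (the numbers a = 30, b = 10, p = 0.5, v̂ = 17 and the 50% cancellation probability are those of the text).

```python
# Check the incentives and the expected revenue in the privatization example.
a, b, p, v_hat, theta = 30, 10, 0.5, 17, 0.5

# high type: sure surplus a - v_hat vs. risky surplus theta*(a - b)
assert a - v_hat == 13
assert theta * (a - b) == 10
assert a - v_hat > theta * (a - b)       # so type a pays v_hat

# low type: paying v_hat would give b - v_hat < 0, so he bids b and risks it
assert b - v_hat < 0

# government: 0.5*17 + 0.5*(0.5*10 + 0.5*0) = 11 > 10
revenue = p * v_hat + (1 - p) * theta * b
assert revenue == 11.0
print("expected revenue:", revenue)
```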

127
ˆ This mechanism is better for the government than the single-price mechanism for both types

Example: generalizing the mechanism

ˆ Let α > β > 0 and θ ∈ [0, 1] be the values characterizing the following mechanism

ˆ If the buyer pays α, the sale is guaranteed

ˆ If he pays β, there is a probability (1 − θ) that the government cancels the privatization

ˆ The high-valuation buyer (t = a) prefers to buy the firm paying α rather than run the risk of offering the low value, provided

a − α ≥ θ(a − β), i.e., a ≥ (α − θβ)/(1 − θ)

ˆ The low-valuation buyer (t = b) prefers to run the risk of offering the low value β rather than pay the high value, provided

θ(b − β) ≥ b − α, i.e., b ≤ (α − θβ)/(1 − θ)

ˆ The inequality

b ≤ (α − θβ)/(1 − θ) ≤ a

is called the incentive compatibility constraint

ˆ Thanks to it, each buyer type prefers to select the payment best suited to his own type

ˆ Setting α too high may push the high-valuation buyer into taking the risk of offering the lower value

ˆ The same problem arises if the probability (1 − θ) of cancelling the sale is too low

ˆ We must add the constraint that no buyer type can be coerced into acquiring the firm (i.e., into participating in the mechanism)

ˆ Each buyer type’s expected profit must be at least his opportunity cost (of not participating in the mechanism), i.e.,

a ≥ α and b ≥ β

ˆ This constraint is called the individual rationality constraint

ˆ Given the constraint

b ≤ (α − θβ)/(1 − θ) ≤ a

ˆ the government’s expected revenue is

pα + (1 − p)θβ

ˆ The problem is to find the values (α, β, θ) that maximize this expected revenue subject to the constraints of
128
1. incentive compatibility
b ≤ (α − θβ)/(1 − θ) ≤ a (IC)
2. individual rationality
a ≥ α and b ≥ β (IR)

Proposição 14. If (α, β, θ) is optimal then we must have

a = (α − θβ)/(1 − θ) and b = β

ˆ We then have

α = θb + (1 − θ)a and β = b

ˆ The government’s expected revenue is

pα + (1 − p)θβ = pa(1 − θ) + θb, i.e., pa + θ(b − pa)

ˆ There are two possible cases

1. If b < pa the government should set θ = 0, i.e., not sell the firm if the price offered is below α = a
2. If b > pa the government should set θ = 1, i.e., sell with certainty, as long as it obtains the minimum value b for the firm
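The two cases can be illustrated numerically. In the sketch below (the parameter pairs are my own choices), the expected revenue of the IC-binding mechanism is maximized over a grid of θ values: the optimum is θ = 0 when b < pa and θ = 1 when b > pa.

```python
# Expected revenue of the IC-binding mechanism: alpha = theta*b + (1-theta)*a,
# beta = b, so revenue = p*alpha + (1-p)*theta*b = p*a + theta*(b - p*a).
def revenue(a, b, p, theta):
    alpha = theta * b + (1 - theta) * a
    return p * alpha + (1 - p) * theta * b

grid = [k / 100 for k in range(101)]

# case b < p*a (here 10 < 15): revenue decreases in theta, optimum theta = 0
best = max(grid, key=lambda th: revenue(30, 10, 0.5, th))
assert best == 0.0 and revenue(30, 10, 0.5, 0.0) == 15.0

# case b > p*a (here 20 > 15): revenue increases in theta, optimum theta = 1
best = max(grid, key=lambda th: revenue(30, 20, 0.5, th))
assert best == 1.0 and revenue(30, 20, 0.5, 1.0) == 20.0
print("theta* = 0 when b < p*a; theta* = 1 when b > p*a")
```

Note that with a = 30, b = 10 and p = 0.5 the optimal mechanism yields pa = 15, which exceeds the 11 obtained in the earlier example, where the price v̂ = 17 left the high type’s incentive constraint slack.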

Revelation principle

ˆ When the mechanism guarantees the purchase only at the higher price (θ > 0), it leads the players to reveal their true characteristics (types) indirectly, through their decisions

ˆ The same outcome can be obtained via a mechanism under which the players are induced to announce their true characteristics

ˆ Instead of offering the firm at a high but certain price, or at a low but uncertain one

ˆ The government could simply have asked the buyer what his true type was

ˆ Announcing that, if the reported type were t = a, the firm would be offered to him with certainty, but at the higher value

ˆ And that, if the reported type were t = b, the firm would be offered at the lower value, but the sale would have only a 50% chance of actually being completed
129
ˆ O resultado desse mecanismo chamado de direto seria o mesmo, apesar da forma do jogo ser diferente

ˆ Os jogadores anunciariam seu verdadeiro tipo ao governo que em seguida atribuiria as recompensas ade-
quadas

ˆ Eles não teriam qualquer motivo para mentira

ˆ Esse resultado é chamado de princı́pio da revelação

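The direct mechanism just described can be written out and checked for truth-telling. A sketch with hypothetical values a = 10, b = 4, θ = 0.5 (the function name `payoff` and the numbers are assumptions of ours): announcing a buys the firm with certainty at α = θb + (1 − θ)a, announcing b buys it with probability θ at β = b.

```python
def payoff(t, announce, a, b, theta):
    """Expected surplus of a buyer with true valuation t in the direct
    mechanism: announce 'a' -> buy with certainty at alpha = theta*b + (1-theta)*a;
    announce 'b' -> buy with probability theta at beta = b."""
    alpha = theta * b + (1 - theta) * a
    if announce == 'a':
        return t - alpha
    return theta * (t - b)

a, b, theta = 10, 4, 0.5  # illustrative values
# Truth-telling is (weakly) optimal for both types:
print(payoff(a, 'a', a, b, theta) >= payoff(a, 'b', a, b, theta))  # True
print(payoff(b, 'b', a, b, theta) >= payoff(b, 'a', a, b, theta))  # True
```

Note that the high type is exactly indifferent: the (IC) constraint binds at the optimum, which is why the indirect and direct mechanisms yield the same payoffs.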
Fix

ˆ a set of players I

ˆ a family of types (Ti )i∈I

ˆ a family of priors (pi )i∈I with

pi = (pi(·|ti))ti∈Ti and pi(·|ti) ∈ Prob(T−i)

Revelation principle: mechanisms

Definition 38. A mechanism is a family


(Ai , ui )i∈I

where

ˆ Ai is a set of available actions for player i

ˆ ui : A × T → [−∞, ∞) where
ui (ai , a−i ; t)

is the payoff received by player i if he chooses ai , given that the other players choose a−i and players
type are t = (ti , t−i )

A strategy for this mechanism is a function si : Ti → Ai

Revelation principle: direct mechanisms

Definition 39. A direct mechanism is a mechanism (Bi, vi)i∈I where

∀i ∈ I, Bi = Ti

ˆ A strategy for a direct mechanism is a function τi : Ti → Ti

ˆ Each agent is asked to announce his type

130
Definition 40. A direct mechanism (vi)i∈I is said to be incentive compatible (or truth telling) if telling the
truth is a Bayesian Nash equilibrium, i.e., if the strategies

(Idi)i∈I where Idi(ti) = ti

form a BNE of the Bayesian game defined by the direct mechanism.

Theorem 16 (Revelation principle). Every payoff profile (πi∗)i∈I obtained in a BNE of any mechanism
(Ai, ui)i∈I can be obtained through an incentive compatible direct mechanism, i.e., there exists a direct
mechanism (vi)i∈I which is incentive compatible and for which (πi∗)i∈I is the payoff profile of its truth-telling
BNE:

πi∗(ti) = E^{pi}[vi(ti, Id−i; (ti, ·))] = ∫_{T−i} vi(ti, t−i; (ti, t−i)) pi(t−i|ti)

Proof. Define

vi(τi, τ−i; t) ≡ ui(s∗i(τi), s∗−i(τ−i); t)

where (s∗i)i∈I is the BNE of the mechanism (Ai, ui)i∈I leading to the payoff profile (πi∗)i∈I

131
Cap. 4 - Dynamic games of incomplete information
4.1 Introduction to Perfect Bayesian equilibrium
ˆ Consider the following dynamic game of complete but imperfect information

ˆ First, player 1 chooses among three actions: L, M, and R

ˆ If player 1 chooses R then the game ends without a move by player 2

ˆ If player 1 chooses either L or M then player 2 learns that R was not chosen

ˆ But he does not know which of L or M was chosen

ˆ Player 2 then chooses between two actions, L′ and R′ , after which the game ends

ˆ Payoffs are given in the extensive form in the previous figure

ˆ The normal-form representation of this game is

Player 2
L’ R’
L 2, 1 0, 0
Player 1 M 0, 2 0, 1
R 1, 3 1, 3

ˆ There are two pure-strategy Nash equilibria: (L, L′ ) and (R, R′ )

ˆ To determine whether these Nash equilibria are subgame perfect, we should define the game’s subgames

ˆ The game in consideration has no subgames, since a subgame must begin at a decision node that is a
singleton information set, and player 2's only decision node belongs to a non-singleton information set

ˆ Both (L, L′ ) and (R, R′ ) are SPNE

ˆ (R, R′ ) depends on a non-credible threat:

– If player 2 gets the move, then playing L′ dominates playing R′

132
– So player 1 should not be induced to play R by 2’s threat to play R′ if given the move

One way to strengthen the equilibrium concept so as to rule out the SPNE (R, R′ ) is to impose the following
requirements

Requirement 17 (1). At each information set, the player with the move must have a belief about which node in
the information set has been reached by the play of the game.

ˆ For a non-singleton information set, a belief is a probability over the nodes in the information set

ˆ For a singleton information set, the player’s belief puts probability one on the single decision node

Requirement 18 (2). Given their beliefs, the players' strategies must be sequentially rational in the sense that
at each information set the action taken by the player with the move (and the player's subsequent strategy) must
be optimal given the player's belief at that information set and the other players' subsequent strategies

ˆ A “subsequent strategy” is a complete plan of action covering every contingency that might arise after
the given information set has been reached

ˆ Requirement 1 implies that if the play of the game reaches player 2’s non-singleton information set then
player 2 must have a belief about which node has been reached

– Or equivalently, about whether player 1 has played L or M

ˆ This belief is represented by the probabilities p and 1 − p

ˆ Given player 2’s belief, the expected payoff

– from playing R′ is p · 0 + (1 − p) · 1 = 1 − p
– from playing L′ is p · 1 + (1 − p) · 2 = 2 − p

ˆ Since 2 − p > 1 − p for any value of p, Requirement 2 prevents player 2 from choosing R′

ˆ Requiring that each player have a belief and act optimally given this belief suffices to eliminate the
implausible equilibrium (R, R′ )

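The computation above holds for every belief p, which is the whole point of the dominance argument. A small sketch using the payoffs stated in the text:

```python
def expected_payoff(action, p):
    """Player 2's expected payoff at the non-singleton information set,
    given belief (p, 1 - p) over the nodes reached after L and M."""
    if action == "R'":
        return p * 0 + (1 - p) * 1   # = 1 - p
    return p * 1 + (1 - p) * 2       # L': = 2 - p

# L' strictly dominates R' for every belief p in [0, 1]:
print(all(expected_payoff("L'", k / 100) > expected_payoff("R'", k / 100)
          for k in range(101)))  # True
```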
ˆ Requirements 1 and 2 insist that the players have beliefs and act optimally given these beliefs

133
ˆ But these beliefs may not be reasonable

ˆ In order to impose further requirements on the players’ beliefs, we introduce the distinction between
information sets that are on the equilibrium path and those that are off the equilibrium path

Definition. For a given equilibrium in a given extensive-form game, an information set is

ˆ on the equilibrium path if it will be reached with positive probability if the game is played according to
the equilibrium strategies

ˆ off the equilibrium path if it is certain not to be reached if the game is played according to the equilibrium
strategies

ˆ “Equilibrium” can mean Nash, subgame perfect, Bayesian, or perfect Bayesian equilibrium

Requirement 19 (3). At information sets on the equilibrium path, beliefs are determined by Bayes' rule and the
players' equilibrium strategies

ˆ Consider the subgame perfect Nash equilibrium (L, L′ )

ˆ Player 2’s belief must be p = 1

ˆ Indeed, given player 1’s equilibrium strategy (namely L), player 2 knows which node in the information
set has been reached

ˆ To illustrate Requirement 3, suppose that there were a mixed-strategy equilibrium in which player 1 plays
L with probability q1 , M with probability q2 , and R with probability 1 − q1 − q2

ˆ Requirement 3 would force player 2's belief to be

p = q1/(q1 + q2)

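This belief is just Bayes' rule, conditioning on the information set being reached. A minimal sketch (the function name is ours):

```python
def belief_p(q1, q2):
    """Player 2's belief that the L node was reached, given that player 1
    mixes L, M, R with probabilities q1, q2, 1 - q1 - q2 and the
    information set has been reached (so q1 + q2 > 0)."""
    return q1 / (q1 + q2)

print(belief_p(0.375, 0.125))  # 0.75
print(belief_p(0.25, 0.25))    # 0.5
```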
ˆ Requirements 1 through 3 capture the spirit of a perfect Bayesian equilibrium

ˆ The crucial new feature of this equilibrium concept is due to Kreps and Wilson (Econometrica 1982)

134
ˆ An equilibrium no longer consists of just a strategy for each player but now also includes a belief for each
player at each information set at which the player has the move

ˆ Requirement 3 imposes that players hold reasonable beliefs on the equilibrium path

ˆ We will introduce Requirement 4 which imposes that agents’ beliefs are reasonable off the equilibrium
path

Requirement 20 (4). At information sets off the equilibrium path, beliefs are determined by Bayes' rule and the
players' equilibrium strategies where possible.

ˆ We will provide a more precise statement of “where possible” in each of the economic applications analyzed
subsequently

Definition 41. A perfect Bayesian equilibrium consists of strategies and beliefs satisfying Requirements
1 through 4.

ˆ To illustrate and motivate Requirement 4 we consider the following three-player game

ˆ This game has one subgame beginning at player 2’s singleton information set

ˆ The unique NE in this subgame between players 2 and 3 is (L, R′ )

ˆ The unique SPNE of the entire game is (D, L, R′ )

ˆ These strategies and the belief p = 1 for player 3 satisfy Requirements 1 through 3

ˆ They also trivially satisfy Requirement 4, since there is no information set off this equilibrium path, and
so constitute a PBE

135
ˆ Consider the strategies (A, L, L′ ), together with the belief p = 0

ˆ These strategies are a NE: no player wants to deviate unilaterally

ˆ These strategies and belief also satisfy Requirements 1 through 3

– Player 3 has a belief and acts optimally given it, and players 1 and 2 act optimally given the
subsequent strategies of the other players

ˆ This NE, namely (A, L, L′ ), is not subgame perfect

ˆ Because the unique NE of the game’s only subgame is (L, R′ )

ˆ Thus, Requirements 1 through 3 do not guarantee that the players' strategies form a SPNE

ˆ The problem is that player 3’s belief (p = 0) is inconsistent with player 2’s strategy, L

– but Requirements 1 through 3 impose no restrictions on 3’s belief because 3’s information set is not
reached if the game is played according to the specified strategies

ˆ Requirement 4, however, forces player 3’s belief to be determined by player 2’s strategy:

– if 2’s strategy is L then 3’s belief must be p = 1


– if 2’s strategy is R then 3’s belief must be p = 0

ˆ But, if 3’s belief is p = 1 then Requirement 2 forces 3’s strategy to be R′

ˆ So the strategies (A, L, L′ ) and the belief p = 0 do not satisfy Requirements 1 through 4

ˆ Consider the following modification of the previous game

136
ˆ Player 2 now has a third possible action, A′ , which ends the game

ˆ If player 1’s equilibrium strategy is A then player 3’s information set is off the equilibrium path

ˆ But now Requirement 4 may not determine 3’s belief from 2’s strategy

ˆ If 2’s strategy is A′ then Requirement 4 puts no restrictions on 3’s belief

ˆ But if 2’s strategy is to play L with probability q1 , R with probability q2 , and A′ with probability 1−q1 −q2 ,
where q1 + q2 > 0, then Requirement 4 dictates that 3’s belief be

q1
p=
q1 + q2

Concluding remarks

ˆ In a NE no player chooses a strictly dominated strategy

ˆ In a PBE, Requirements 1 and 2 are equivalent to insisting that no player’s strategy be strictly dominated
beginning at any information set

ˆ Nash and Bayesian Nash equilibrium do not share this feature at information sets off the equilibrium path

ˆ Even SPNE does not share this feature at some information sets off the equilibrium path, such as
information sets that are not contained in any subgame

ˆ In a PBE, players cannot threaten to play strategies that are strictly dominated beginning at any
information set off the equilibrium path

ˆ PBE makes the players' beliefs explicit

ˆ Such an equilibrium often cannot be constructed by working backwards through the game tree, as we did
to construct a SPNE

ˆ Requirement 2 determines a player’s action at a given information set based in part on the player’s belief
at that information set

137
ˆ If either Requirement 3 or 4 applies at this information set, then it determines the player's belief from the
players' actions higher up the game tree

ˆ But Requirement 2 determines these actions higher up the game tree based in part on the players'
subsequent strategies, including the action at the original information set

ˆ This circularity implies that a single pass working backwards through the tree will not suffice to compute
a PBE

4.2 Signaling Games


Perfect Bayesian equilibrium in signaling games

A signaling game is a dynamic game of incomplete information involving two players:

ˆ A Sender (S)

ˆ A Receiver (R)

The timing of the game is

1. Nature draws a type ti for the Sender from a finite set of feasible types T = {t1 , · · · , tI } according to a
probability distribution p ∈ Prob(T) with full support, i.e., p(ti) > 0 for every ti

2. The Sender observes ti and then chooses a message mj from a finite set of feasible messages M =
{m1 , · · · , mJ }

3. The Receiver observes mj but not ti and then chooses an action ak from a finite set of actions A =
{a1 , · · · , aK }

4. Payoffs are given by US (ti , mj , ak ) and UR (ti , mj , ak )

ˆ In many applications, the sets T , M and A are intervals on the real line, rather than finite sets

ˆ One may allow the set of feasible messages to depend on the type Nature draws

ˆ One may allow the set of feasible actions to depend on the message the Sender chooses

Job-market signaling

ˆ In Spence’s (QJE 1973) model of job-market signaling

– the Sender is the worker


– the Receiver is the market of prospective employers
– the type is the worker’s productive ability
– the message is the worker’s education choice
– the action is the wage paid by the market

138
Corporate investment and capital structure

ˆ In Myers and Majluf’s (JFE 1984) model of corporate investment and capital structure

– the Sender is a firm needing capital to finance a new project


– the Receiver is a potential investor
– the type is the profitability of the firm’s existing assets
– the message is the firm’s offer of an equity stake in return for financing
– the action is the investor’s decision about whether to invest

Monetary policy

ˆ A signaling game may be embedded within a richer game

– there could be an action by the Receiver before the Sender chooses the message in step 2
– there could be an action by the Sender after (or while) the Receiver chooses the action in step 3

ˆ Consider the following game:

In Vickers' (1986) model of monetary policy

ˆ the Federal Reserve has private information about its willingness to accept inflation in order to increase
employment

ˆ the Sender is the Federal Reserve

ˆ the Receiver is the market of employers

ˆ the type is the Fed’s willingness to accept inflation in order to increase employment

ˆ the message is the Fed’s choice of first-period inflation

ˆ the action is the employers’ expectation of second-period inflation

ˆ the employers’ expectation of first-period inflation precedes the signaling game

ˆ the Fed’s choice of second-period inflation follows it

PBE definition in signaling games

ˆ We consider an extensive form representation of a simple case: T = {t1 , t2 }, M = {m1 , m2 }, A = {a1 , a2 }


and Prob{t1} = p

139
ˆ A player’s strategy is a complete plan of action:

– a strategy specifies a feasible action in every contingency in which the player might be called upon
to act

ˆ In a signaling game:

– a pure strategy for the Sender is a function ti 7→ m(ti ) specifying which message will be chosen for
each type that Nature might draw
– a pure strategy for the Receiver is a function mj 7→ a(mj ) specifying which action will be chosen for
each message that the Sender might send

ˆ In the simple signaling game depicted before, the Sender and the Receiver both have four pure strategies

ˆ The Sender’s strategy m is said to be

– a pooling strategy if each type sends the same message


* i.e., if m is constant
– a separating strategy if each type sends a different message
* i..e, m is injective
– a partially pooling (or semi-separating) if it is neither pooling nor separating

ˆ We translate the informal statements of Requirements 1 through 3 into a formal definition of a PBE in a
signaling game

ˆ Requirement 1 is trivial when applied to the Sender since his choice occurs at a singleton information set

ˆ The Receiver, in contrast, chooses an action after observing the Sender’s message but without knowing
the Sender’s type

– There is one information set for each message the Sender might choose
– Each such information set has one node for each type Nature might have drawn

140
Requirement 21 (1). After observing any message mj from M , the Receiver must have a belief about which
types could have sent mj

ˆ Denote this belief by the probability distribution µ(·|mj) ∈ Prob(T)

Requirement 22 (2R). For each mj in M , the Receiver’s action a∗ (mj ) must maximize the Receiver’s expected
utility, given the belief µ(·|mj )

ˆ That is, a∗(mj) solves

max_{a∈A} Σ_{t∈T} µ(t|mj) UR(t, mj, a)

ˆ Requirement 2 also applies to the Sender, but the Sender has complete information

Requirement 23 (2S). For each ti in T , the Sender’s message m∗ (ti ) must maximize the Sender’s utility,
given the Receiver’s strategy a∗ (mj )

– That is, m∗(ti) solves

max_{m∈M} US(ti, m, a∗(m))

ˆ Given the Sender’s strategy ti 7→ m∗ (ti ), let Tj denote the set of types that send the message mj

Tj ≡ {ti ∈ T : m∗ (ti ) = mj }

or equivalently6
Tj = [m∗ ]−1 (mj )

– The type ti is a member of the set Tj if m∗(ti) = mj

ˆ Given a message mj ,

– if Tj is non-empty then the information set corresponding to the message mj is on the equilibrium
path
– otherwise, mj is not sent (at equilibrium) by any type and so the corresponding information set is
off the equilibrium path

For messages on the equilibrium path, one should apply Requirement 3 to the Receiver’s strategy

Requirement 24 (3). For each mj ∈ M , if there exists ti ∈ T such that m∗ (ti ) = mj , then the Receiver’s
belief at the information set corresponding to mj must follow from Bayes’ rule and the Sender’s strategy:

µ(ti|mj) = p(ti|[m∗]−1(mj)) = p(ti) / Σ_{τi∈Tj} p(τi) for every ti ∈ Tj (and µ(ti|mj) = 0 for ti ∉ Tj)

6. Rigorously, we should write [m∗]−1({mj}).

141
Definition 42. A pure-strategy perfect Bayesian equilibrium in a signaling game is

ˆ a pair of strategies (m∗ , a∗ ) where

– m∗ : ti 7→ m∗ (ti )
– a∗ : mj 7→ a∗ (mj )

ˆ a family of beliefs (µ(·|mj))mj∈M with each µ(·|mj) ∈ Prob(T)

satisfying Signaling Requirements (1), (2R), (2S), and (3)

ˆ Requirement 4 is vacuous in a signaling game

ˆ If the Sender’s strategy is pooling or separating then we call the equilibrium pooling or separating,
respectively

A simple signaling game


Consider the following example of a simple signaling game

ˆ Each type is equally likely to be drawn by Nature

ˆ The Receiver belief µ(·|L) at information set L is denoted (p, 1 − p)

ˆ The Receiver belief µ(·|R) at information set R is denoted (q, 1 − q)

There are four possible pure-strategy perfect Bayesian equilibria in this two-type, two-message game

ˆ Pooling on L

ˆ Pooling on R

ˆ Separating with t1 playing L and t2 playing R

ˆ Separating with t2 playing L and t1 playing R

142
A simple signaling game: pooling on L

ˆ Suppose there is an equilibrium (m∗, a∗, µ) in which the Sender's strategy is

m∗(t1) = m∗(t2) = L

ˆ Then the Receiver’s information set corresponding to L is on the equilibrium path

ˆ So the Receiver’s belief (p, 1 − p) at this information set is determined by Bayes’ rule and the Sender’s
strategy

ˆ This implies that

µ(t1|L) ≡ p = 0.5/(0.5 + 0.5) = 0.5
ˆ Given this belief µ, the Receiver’s best response following L is to play u

ˆ The Sender’s type t1 earns payoff of 1 and the Sender’s type t2 earns payoff of 2

ˆ To determine whether both “Sender types” are willing to choose L, we need to specify how the Receiver
would react to R

ˆ If the Receiver’s response to R is u, i.e., a∗ (R) = u then type t1 ’s payoff from playing R is 2, which exceeds
t1 ’s payoff of 1 from playing L

ˆ But if the Receiver’s response to R is d, i.e., a∗ (R) = d then t1 and t2 earn payoffs of 0 and 1 from playing
R, whereas they earn 1 and 2 from playing L

ˆ To get the pooling equilibrium on L, the Receiver’s response to R must be d, i.e., a∗ (R) = d

ˆ One has to check that a∗(R) = d is an optimal action with respect to the Receiver's belief at the
information set corresponding to R

ˆ Observe that
Eµ(·|R) [UR (·, R, d)] = q × 0 + (1 − q) × 2 = 2(1 − q)

ˆ and
Eµ(·|R) [UR (·, R, u)] = q × 1 + (1 − q) × 0 = q

ˆ Playing d is optimal for the Receiver for any q ≤ 2/3

Remark 26. The pair of strategies (m∗, a∗) defined by

m∗(t) = L, ∀t ∈ {t1, t2}

a∗(m) = u if m = L, and a∗(m) = d if m = R

and the beliefs m 7→ µ(·|m) defined by

µ(·|m) = (0.5, 0.5) if m = L, and µ(·|m) = (q, 1 − q) if m = R

form a pure-strategy PBE if q ≤ 2/3.

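The cutoff q ≤ 2/3 follows from comparing the two expected payoffs just computed; a small numerical check using only the Receiver payoffs stated in the text:

```python
def eu_after_R(action, q):
    """Receiver's expected payoff after message R with belief (q, 1 - q):
    E[U_R(., R, d)] = 2(1 - q) and E[U_R(., R, u)] = q."""
    return 2 * (1 - q) if action == 'd' else q

# d is optimal exactly when 2(1 - q) >= q, i.e. q <= 2/3:
print(eu_after_R('d', 0.5) >= eu_after_R('u', 0.5))    # True
print(eu_after_R('d', 0.75) >= eu_after_R('u', 0.75))  # False
```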
A simple signaling game: pooling on R

ˆ Suppose the Sender’s strategy is m∗ (t) = R for any t in T

ˆ Then q = 0.5 and the Receiver’s best response is a∗ (R) = d

ˆ Thus the contingent payoffs for the Sender are

US (t1 , R, d) = 0 and US (t2 , R, d) = 1

ˆ But t1 can earn 1 by playing L, since the Receiver’s best response to L is u for any value of p

Remark 27. There is no equilibrium in which the Sender plays m∗(t) = R for every t in T.

A simple signaling game: Separating with m∗ (t1 ) = L

ˆ Suppose the Sender’s strategy m∗ is defined by


(
∗ L if t = t1
m (t) =
R if t = t2

ˆ Both of the Receiver’s information sets are on the equilibrium path

ˆ So both beliefs are determined by Bayes’ rule and the Sender’s strategy

p=1 and q=0

ˆ The Receiver’s best responses to these beliefs are


(
u if m = L
a∗ (m) =
d if m = R

ˆ It remains to check whether the Sender's strategy is optimal given the Receiver's strategy a∗

ˆ It is not:

– if type t2 deviates by playing L rather than R,
– then the Receiver responds with u,
– earning t2 a payoff of 2,
– which exceeds t2 's payoff of 1 from playing R

A simple signaling game: Separating with m∗ (t1 ) = R

ˆ Suppose the Sender’s strategy m is defined by


(
∗ R if t = t1
m (t) =
L if t = t2

ˆ Both of the Receiver’s information sets are on the equilibrium path

ˆ So both beliefs are determined by Bayes’ rule and the Sender’s strategy

p=0 and q=1

ˆ The Receiver’s best response to these beliefs is

a∗ (m) = u, ∀m ∈ {L, R}

ˆ Both types t1 and t2 earn payoffs of 2

ˆ If t1 were to deviate by playing L, then the Receiver would react with u

ˆ t1 ’s payoff would then be 1, so there is no incentive for t1 to deviate from playing R

ˆ If t2 were to deviate by playing R, then the Receiver would react with u

ˆ t2 ’s payoff would then be 1, so there is no incentive for t2 to deviate from playing L

Remark 28. The pair of strategies (m∗, a∗) defined by

m∗(t1) = R and m∗(t2) = L

a∗(m) = u, ∀m ∈ {L, R}

and the beliefs m 7→ µ(·|m) defined by

µ(·|m) = (0, 1) if m = L, and µ(·|m) = (1, 0) if m = R

form a separating pure-strategy perfect Bayesian equilibrium

145
Job market signaling

ˆ We restate Spence’s (QJE 1973) model as an extensive-form game and describe some of its perfect Bayesian
equilibria

ˆ The timing is as follows

1. Nature determines a worker’s productive ability, η, which can be either high H or low L. The
probability that η = H is q
2. The worker learns his or her ability and then chooses a level of education, e ≥ 0
3. Two firms observe the worker’s education but not the worker’s ability, and then simultaneously make
wage offers to the worker
4. The worker accepts the higher of the two wage offers, flipping a coin in case of a tie

Payoffs

ˆ Let w denote the wage the worker accepts

ˆ The payoff to the worker is


w − c(η, e)

where c(η, e) is the cost to a worker with ability η of obtaining education e

ˆ The payoff to the firm that employs the worker is

y(η, e) − w

where y(η, e) is the output of a worker with ability η who has obtained education e

ˆ The payoff to the firm that does not employ the worker is zero

Assumption on production

ˆ We allow for the possibility that output increases not only with ability but also with education

ˆ We assume that high-ability workers are more productive, i.e.,

∀e, y(H, e) > y(L, e)

ˆ We assume that education does not reduce productivity, i.e.,

∀(η, e), ye (η, e) ≥ 0

where ye (η, e) = ∂y/∂e (η, e) is the marginal productivity of education for a worker of ability η at education e

146
Interpretation of education

ˆ We interpret differences in e as differences in the quality of a student’s performance

ˆ Not as differences in the duration of the student’s schooling

ˆ Thus, the game could apply to a cohort of high school graduates, or to a cohort of college graduates or
MBAs

ˆ Under this interpretation, e measures the number and kind of courses taken and the caliber of grades and
distinctions earned during an academic program of fixed length

ˆ Tuition costs (if they exist at all) are independent of e, so the cost function c(η, e) measures non-monetary
(or psychic) costs

ˆ Students of lower ability find it more difficult to achieve high grades at a given school, and also more
difficult to achieve the same grades at a more competitive school

ˆ Firm’s use of education as a signal thus reflects the fact that firms hire and pay more to the best graduates
of a given school and to the graduates of the best schools

Assumption on costs

ˆ The crucial assumption in Spence’s model is that low-ability workers find signaling more costly than do
high-ability workers

ˆ More precisely, we assume that the marginal cost of education is higher for low-ability than for high-ability
workers:

∀e, ce (L, e) > ce (H, e)

where ce (η, e) = ∂c/∂e (η, e) denotes the marginal cost of education for a worker of ability η at education e

Assumption on costs: Interpretation

ˆ Consider a worker believing that with education e1 he would get paid wage w1

ˆ We investigate the increase in wages that would be necessary to compensate this worker for an increase
in education from e1 to e2

ˆ The answer depends on the worker’s ability:

– Low-ability workers find it more difficult to acquire the extra education and so require a larger
increase in wages to compensate them for it:

∆w = w2 − w1 = ∫_{e1}^{e2} ∂c/∂e (η, e) de

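This integral can be made concrete with an explicit cost function. A sketch under the illustrative assumption c(η, e) = e²/(2η) with abilities L = 1 and H = 2 (our parameterization, not from the text), which satisfies ce(L, e) > ce(H, e) for all e > 0:

```python
def delta_w(eta, e1, e2):
    """Compensating wage increase for raising education from e1 to e2:
    the integral of c_e(eta, e) = e/eta from e1 to e2, which equals
    (e2**2 - e1**2) / (2 * eta)."""
    return (e2 ** 2 - e1 ** 2) / (2 * eta)

# The low-ability worker (eta = 1) requires the larger compensation:
print(delta_w(1, 1, 2))  # 1.5
print(delta_w(2, 1, 2))  # 0.75
```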
147
ˆ The graphical statement of this assumption is that low-ability workers have steeper indifference curves
than do high-ability workers

ˆ IL is an indifference curve of a low-ability worker

ˆ IH is an indifference curve of a high-ability worker

Competition among firms

ˆ Spence also assumes that competition among firms will drive expected profits to zero

ˆ One can build this assumption into our model by replacing the two firms in stage 3 with a single player
called the market

ˆ The market makes a single wage offer w and has the payoff

−[y(η, e) − w]2

ˆ Doing so would make the model belong to the class of one-Receiver signaling games defined previously

ˆ To maximize its expected payoff, as required by Signaling Requirement 2R, the market would offer a wage
equal to the expected output of a worker with education e, given the market’s belief about the worker’s
ability after observing e

w̃(e) = µ(H|e) × y(H, e) + [1 − µ(H|e)] × y(L, e) (W)

ˆ µ(H|e) is the market’s assessment of the probability that the worker’s ability is H

ˆ The purpose of having two firms bidding against each other in Stage 3 is to achieve the same result without
resorting to a fictitious player called the market

Firms’ beliefs

ˆ To guarantee that firms will always offer a wage equal to the worker’s expected output

148
ˆ We need to impose that, after observing education choice e, both firms hold the same belief about the
worker’s ability, again denoted µ(H|e)

ˆ Signaling Requirement 3 determines the belief that both firms must hold after observing a choice of e that
is on the equilibrium path

ˆ The assumption is that the firms also share a common belief after observing a choice of e that is off the
equilibrium path

ˆ Given this assumption, it follows that in any PBE the firms both offer the wage w̃(e) given in (W )

ˆ Equation (W ) replaces Signaling Requirement 2R for this two-Receiver model

The complete information case

ˆ First, consider temporarily that the worker’s ability is common knowledge among all the players, rather
than privately known by the worker

ˆ Competition between the two firms in Stage 3 implies that a worker of ability η with education e earns
the wage
ŵ(η, e) = y(η, e)

ˆ A worker with ability η therefore chooses e∗ (η) to solve

max_{e≥0} y(η, e) − c(η, e)

ˆ The associated wage (when it exists) is denoted by w∗ (η), i.e.,

w∗ (η) = y[η, e∗ (η)]

ˆ Assume that e 7→ y(η, e) is concave and e 7→ c(η, e) is strictly convex

ˆ Assume that y(η, ·) and c(η, ·) are such that

lim_{e→0+} [∂y/∂e (η, e) − ∂c/∂e (η, e)] > 0

and

lim_{e→∞} [∂y/∂e (η, e) − ∂c/∂e (η, e)] < 0

ˆ Then the maximization problem has a unique solution e∗ (η) satisfying

∂y/∂e (η, e∗ (η)) = ∂c/∂e (η, e∗ (η))

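As a worked instance of this first-order condition, take the illustrative forms y(η, e) = ηe and c(η, e) = e²/(2η) (assumptions of ours, not from the text; linear y is concave, e²/(2η) is strictly convex, and the limit conditions hold since η − e/η is positive near 0 and negative for large e). The FOC η = e/η gives e∗(η) = η² and w∗(η) = η³:

```python
def e_star(eta):
    """Solution of y_e = c_e under y(eta, e) = eta*e, c(eta, e) = e**2/(2*eta):
    eta = e/eta  =>  e*(eta) = eta**2."""
    return eta ** 2

def w_star(eta):
    """Full-information wage w*(eta) = y(eta, e*(eta)) = eta**3."""
    return eta * e_star(eta)

L_ab, H_ab = 1, 2
print(e_star(L_ab), e_star(H_ab))  # 1 4  (e*(L) < e*(H))
print(w_star(L_ab), w_star(H_ab))  # 1 8  (w*(L) < w*(H))
```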
149
ˆ We propose to strengthen the assumption

∀e ≥ 0, y(H, e) > y(L, e)

that states that high-ability workers are more productive

ˆ Assume that

inf_{e≥0} ye (H, e) ≥ max_{e≥0} ye (L, e)

ˆ The previous assumption is automatically satisfied if e 7→ y(η, e) is linear

Proposition 15. Under the previous assumption, one must have

e∗ (L) < e∗ (H) and w∗ (L) < w∗ (H)

The private information case

ˆ We now return to the assumption that the worker’s ability is private information

ˆ A low-ability worker could try to masquerade as a high-ability worker

ˆ Two cases can arise

– The additional effort c[L, e∗ (H)] − c[L, e∗ (L)] needed to obtain the education level e∗ (H) is not
compensated by the additional wage w∗ (H) − w∗ (L)
– The additional effort c[L, e∗ (H)] − c[L, e∗ (L)] needed to obtain the education level e∗ (H) is
compensated by the additional wage w∗ (H) − w∗ (L)

150
ˆ In the first case, the low-ability worker has no incentive to pretend to be a high-ability worker by choosing e∗ (H), i.e.,

w∗ (L) − c[L, e∗ (L)] ≥ w∗ (H) − c[L, e∗ (H)]

ˆ In the second case, the low-ability worker has an incentive to pretend to be a high-ability worker by choosing e∗ (H), i.e.,

w∗ (L) − c[L, e∗ (L)] ≤ w∗ (H) − c[L, e∗ (H)]

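Which case arises depends on the primitives. Under the illustrative parameterization y(η, e) = ηe, c(η, e) = e²/(2η), L = 1, H = 2 (our assumptions, with e∗(η) = η² and w∗(η) = η³ from the full-information problem), the first case obtains:

```python
def cost(eta, e):
    """Illustrative cost c(eta, e) = e**2 / (2 * eta) (an assumption)."""
    return e ** 2 / (2 * eta)

# Full-information outcomes under y(eta, e) = eta*e: e*(eta) = eta**2, w*(eta) = eta**3
e_L, w_L = 1, 1
e_H, w_H = 4, 8

truthful = w_L - cost(1, e_L)   # low type reveals: 1 - 0.5 = 0.5
mimicry = w_H - cost(1, e_H)    # low type mimics:  8 - 8.0 = 0.0
print(truthful >= mimicry)      # True: the first (no-mimicry) case
```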
Perfect Bayesian equilibria

ˆ Each kind of equilibrium

– pooling
– separating
– hybrid

can exist in this model

ˆ In a pooling equilibrium both worker-types choose a single level of education, say ep

ˆ Requirement 3 then implies that the firms' belief after observing ep must be the prior belief

µ(H|ep ) = q and µ(L|ep ) = 1 − q

151
ˆ This in turn implies that the wage offered by the firms after observing ep must be

wp = q × y(H, ep ) + (1 − q) × y(L, ep )

ˆ To complete the description of a pooling PBE, it remains

1. to specify the firms' belief µ(·|e) for out-of-equilibrium education choices e ≠ ep (Requirement 1)
2. these beliefs will then determine the firms' strategy e 7→ w̃(e) through

w̃(e) = µ(H|e) × y(H, e) + [1 − µ(H|e)] × y(L, e) (W)

(Requirement 2R)
3. to show that both worker-types' best response to the firms' strategy w̃ is to choose e = ep (Requirement 2S)

Pooling equilibrium

ˆ One possibility is that the firms believe that any education level other than ep implies that the worker
has low ability:

µ(H|e) = 0, ∀e ≠ ep

ˆ Nothing in the definition of PBE rules these beliefs out

– Requirements 1 through 2 put no restrictions on beliefs off the equilibrium path


– Requirement 4 is vacuous in a signaling game

ˆ The refinement we will introduce in a subsequent chapter will rule out the beliefs analyzed here

ˆ If the firm’s beliefs are (


0 for e 6= ep
µ(H|e) =
q for e = ep

ˆ Then Equation (W) implies that the firms’ strategy is


(
y(L, e) for e 6= ep
w(e) =
wp for e = ep

where we recall that


wp = q × y(H, ep ) + (1 − q) × y(L, ep )

ˆ A worker of ability η chooses e to solve

max_{e≥0} w̃(e) − c(η, e)

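A pooling candidate can be verified numerically. Under the illustrative forms y(η, e) = ηe, c(η, e) = e²/(2η), L = 1, H = 2 and prior q = 0.5 (all assumptions of ours), a deviation from ep is paid y(L, e) = e; the grid search below confirms that ep = 1 is a best response for both worker types, so it supports a pooling PBE:

```python
def pooling_payoff(eta, e, ep, q):
    """Worker's payoff w(e) - c(eta, e) under the pooling wage schedule:
    w(ep) = q*y(H, ep) + (1 - q)*y(L, ep) on path, w(e) = y(L, e) off path,
    with y(eta, e) = eta*e, c(eta, e) = e**2/(2*eta), L = 1, H = 2."""
    wage = (q * 2 * ep + (1 - q) * 1 * ep) if e == ep else 1 * e
    return wage - e ** 2 / (2 * eta)

ep, q = 1.0, 0.5
grid = [k / 100 for k in range(401)]            # candidate deviations on [0, 4]
for eta in (1, 2):                              # low and high ability
    on_path = pooling_payoff(eta, ep, ep, q)
    best_dev = max(pooling_payoff(eta, e, ep, q) for e in grid if e != ep)
    print(on_path >= best_dev)                  # True for both types
```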
152
ˆ Consider the following example

ˆ The low-ability worker’s indifference curve through the point [e∗ (L), w∗ (L)] lies below that type’s indif-
ference curve through (ep , wp )

ˆ This implies that the education ep is optimal for the low-ability worker

ˆ The high-ability worker’s indifference curve through the point (ep , wp ) lies above the wage function w =
y(L, e)

– This implies that the education ep is optimal for the high-ability worker
– This is because the solution e∗H to the maximization problem

max y(L, e) − c(H, e)


e≥0

will lead to a wage w̃(e∗H ) = y(L, e∗H )

Other pooling equilibria

ˆ In the previous example, many other pooling perfect Bayesian equilibria exist

ˆ Some of these equilibria involve a different education choice by the worker

ˆ Others involve the same education choice but different beliefs off the equilibrium path

153
ˆ Let ê denote a level of education between ep and e′

ˆ If we substitute ep by ê then the resulting belief and strategy for the firms, together with the strategy
e(η) = ê for both worker types, form another pooling PBE

ˆ Suppose that the firms’ belief is defined by



 0
 for e ≤ e′′ except for e = ep
µ(H|e) = q for e = ep


q for e > e′′

ˆ The firms’ strategy is then

w(e) = { y(L, e)   for e ≤ e′′ except for e = ep
         wp        for e = ep
         wp        for e > e′′ }

ˆ This belief and strategy for the firms, together with the strategy (e(L) = ep , e(H) = ep ) for the worker, form
a third pooling PBE

Separating equilibrium: the no-envy case

ˆ We now turn to separating equilibria

ˆ Consider again the no-envy example

ˆ The natural separating PBE involves the strategy

e(L) = e∗ (L) and e(H) = e∗ (H)

for the worker

ˆ Signaling Requirement 3 then determines the firms’ belief after observing either of these two education
levels
µ[H|e∗ (L)] = 0 and µ[H|e∗ (H)] = 1

ˆ Equation (W ) implies that the firms’ strategy is

w̃(e∗ (L)) = w∗ (L) = y[L, e∗ (L)]

and
w̃(e∗ (H)) = w∗ (H) = y[H, e∗ (H)]

ˆ To complete the description of this separating PBE, it remains

1. to specify the firms’ belief µ(H|e) for out-of-equilibrium education choices, i.e., values of e other than
e∗ (L) and e∗ (H)
2. which then determines the rest of the firms’ strategy w̃ through Equation (W )
3. to show that the best response for a worker of ability η to the firms’ strategy w̃ is to choose e∗ (η)

ˆ Consider the belief that the worker has high ability if e is at least e∗(H) but has low ability otherwise

µ(H|e) = { 0   for e < e∗(H)
           1   for e ≥ e∗(H) }

ˆ Equation (W) then implies that the firms’ strategy is

w̃(e) = { y(L, e)   for e < e∗(H)
          y(H, e)   for e ≥ e∗(H) }

ˆ Recall that e∗(H) is the high-ability worker’s best response to the wage function e ↦ y(H, e)

ˆ Since y(L, e) ≤ y(H, e) we get that e∗ (H) is still a best response to the wage function w̃

ˆ Recall that e∗(L) is the low-ability worker’s best response to the wage function e ↦ y(L, e) over all e ≥ 0;
since e∗(L) < e∗(H), it is therefore also a best response on the interval [0, e∗(H))

ˆ We should now solve the following maximization problem

max_{e ≥ e∗(H)}  y(H, e) − c(L, e)

ˆ Denote by f the function from [e∗ (H), ∞) to R defined by

f (e) ≡ y(H, e) − c(L, e)

ˆ Observe that
f ′ (e) = ye (H, e) − ce (L, e) ≤ ye (H, e) − ce (H, e) ≤ 0

ˆ This implies that


w∗ (H) − c[L, e∗ (H)]

is the highest payoff the low-ability worker can achieve among all choices of e ≥ e∗ (H)

ˆ Since we are in the no-envy case, we have

w∗ (L) − c[L, e∗ (L)] > w∗ (H) − c[L, e∗ (H)]

ˆ Implying that e∗ (L) is the worker’s best response to the strategy w̃
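The no-envy condition can also be checked numerically. The primitives below are assumed for illustration (not from the text) and are chosen so that the low type’s education costs are steep enough for no-envy to hold: y(η, e) = η(1 + e) and c(η, e) = k(η)e² with k(L) = 5, k(H) = 0.5.

```python
# Assumed illustrative primitives (not from the text), chosen so that
# the no-envy condition holds
kL, kH = 5.0, 0.5                      # cost scales: c(eta, e) = k(eta) * e**2
L, H = 1.0, 2.0

def y(eta, e): return eta * (1.0 + e)

# Complete-information optima e*(eta) by grid search
grid = [i / 1000 for i in range(0, 5001)]
eL = max(grid, key=lambda e: y(L, e) - kL * e**2)   # e*(L)
eH = max(grid, key=lambda e: y(H, e) - kH * e**2)   # e*(H)

wL, wH = y(L, eL), y(H, eH)            # w*(L), w*(H)

# No-envy: the low type prefers (e*(L), w*(L)) to mimicking (e*(H), w*(H))
assert wL - kL * eL**2 > wH - kL * eH**2
print(eL, eH)  # 0.1 2.0
```

Here the low type’s payoff from mimicking e∗(H) is far below its complete-information payoff, so the natural separating strategies are best responses.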

Separating equilibrium: the envy case

ˆ We now consider the envy case, which is more interesting

ˆ Now the high-ability worker cannot earn the high wage y(H, ·) simply by choosing the education e∗ (H)
that he should choose under complete information

ˆ To signal his ability, the high-ability worker must choose es where es > e∗ (H) is defined by

y(H, es ) − c(L, es ) = y(L, e∗ (L)) − c(L, e∗ (L))

ˆ This is because the low-ability worker would mimic any value of e between e∗(H) and es , tricking the firms
into believing that the worker has high ability
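The definition of es can be illustrated numerically. The primitives below are assumptions for illustration only — y(η, e) = η(1 + e), c(η, e) = e²/η, L = 1, H = 2 — and happen to put us in the envy case:

```python
# Assumed illustrative primitives (not from the text)
def y(eta, e): return eta * (1.0 + e)
def c(eta, e): return e**2 / eta

L, H = 1.0, 2.0
eL, eH = 0.5, 2.0                      # e*(eta) = eta**2/2 from the FOCs
piL = y(L, eL) - c(L, eL)              # low type's complete-info payoff: 1.25

# Envy: mimicking e*(H) beats the low type's own complete-info outcome
assert y(H, eH) - c(L, eH) > piL

# e_s solves y(H, e) - c(L, e) = piL for e > e*(H); bisection on [eH, 10]
lo, hi = eH, 10.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if y(H, mid) - c(L, mid) > piL:
        lo = mid
    else:
        hi = mid
es = 0.5 * (lo + hi)
print(round(es, 4))  # 2.3229, i.e. 1 + sqrt(1.75); note es > e*(H) = 2
```

As the text requires, the signaling level es exceeds the complete-information choice e∗(H).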

ˆ Formally, the natural separating PBE involves the strategy

e(L) = e∗ (L) and e(H) = es

for the worker

ˆ The equilibrium beliefs for the firm must satisfy

µ[H|e∗ (L)] = 0 and µ[H|es ] = 1

ˆ The equilibrium wage strategy for the firms must satisfy

w̃(e∗ (L)) = w∗ (L) = y(L, e∗ (L)) and w̃(es ) = y(H, es )

ˆ Actually this is the only equilibrium that survives the refinement we will introduce in a subsequent chapter

ˆ We propose the following specification of the firms’ out-of-equilibrium beliefs that supports this equilibrium
behavior

µ(H|e) = { 0   for e < es
           1   for e ≥ es }

ˆ The firms’ strategy is then

w̃(e) = { y(L, e)   for e < es
          y(H, e)   for e ≥ es }

ˆ Let us compute the best response of the low-ability worker

ˆ We already know that e∗ (L) is a best response among all choices of e < es

ˆ One should find the worker’s best response to the firms’ strategy among all choices of e ≥ es , i.e.,

max_{e ≥ es}  y(H, e) − c(L, e)

ˆ Denote by g the function defined by g(e) = y(H, e) − c(L, e) for all e ≥ es

ˆ Observe that
g ′ (e) = ye (H, e) − ce (L, e) ≤ ye (H, e) − ce (H, e)

ˆ Recall that the function e 7→ y(H, e) − c(H, e) is concave and

ye (H, e∗ (H)) − ce (H, e∗ (H)) = 0

implying that g ′ (e) ≤ 0 for all e ≥ es

ˆ Therefore, the worker’s best response to the firms’ strategy among all choices of e ≥ es is es

ˆ Since
w∗ (L) − c(L, e∗ (L)) = y(H, es ) − c(L, es )

ˆ The worker has two best responses: e∗ (L) and es

ˆ We will assume that this indifference is resolved in favor of e∗ (L)

– Alternatively, we could increase es by an arbitrarily small amount so that the low-ability worker would
strictly prefer e∗(L)

ˆ Let us now analyze the best response of the high-ability worker

ˆ Denote by h the function defined by h(e) = y(H, e) − c(H, e) for all e ≥ es

ˆ Since h is concave, we have

∀e ≥ es ,  h′(e) = ye (H, e) − ce (H, e) ≤ ye (H, e∗(H)) − ce (H, e∗(H)) = 0

ˆ This implies that the worker’s best response to the firms’ strategy among all choices of e ≥ es is es

ˆ What about the worker’s best response among all choices of e < es ?

ˆ Let π ∗ (L) be the payoff of the low-ability worker at point (e∗ (L), w∗ (L))

ˆ Denote by W (L, ·) the function defined by

W (L, e) = π ∗ (L) + c(L, e)

ˆ This is the equation of the indifference curve IL of the low-ability worker passing through (e∗ (L), w∗ (L))

ˆ Denote by W (H, ·) the function defined by

W (H, e) = [y(H, es ) − c(H, es )] + c(H, e)

ˆ This is the equation of the indifference curve IH of the high-ability worker passing through (es , w̃(es ))

ˆ By definition of es we have
W (L, es ) = W (H, es )

ˆ Observe that
∂W (H, e)/∂e − ∂W (L, e)/∂e = ce (H, e) − ce (L, e) < 0
ˆ Implying that the function e 7→ W (H, e) − W (L, e) is strictly decreasing

ˆ We then get that


∀e < es , W (H, e) > W (L, e)

ˆ By definition of e∗ (L), convexity of e 7→ c(L, e) and concavity of e 7→ y(L, e) we have

∀e ≥ 0, W (L, e) ≥ y(L, e)

ˆ This implies that W (H, e) > y(L, e) for all e < es

ˆ It follows that the indifference curve of the high-ability worker passing through (es , w̃(es )) is always above
the production function y(L, e), implying that any payoff among e < es is inferior to the one obtained at
es

ˆ There are other separating equilibria that involve a different education choice by the high-ability worker

– the low-ability worker always separates at e∗(L)

ˆ There are other separating equilibria that involve the education choices e∗ (L) and es but differ off the
equilibrium path

Hybrid equilibrium

ˆ We analyze the case of a hybrid equilibrium in which the low-ability worker randomizes

ˆ The high-ability worker chooses the education level eh (h for hybrid)

ˆ The low-ability worker randomizes between choosing eh with probability π and choosing eL with probability 1 − π

ˆ Signaling Requirement 3 then determines the firms’ belief after observing eh and eL

ˆ Bayes’ rule yields


µ(H|eL ) = 0 and µ(H|eh ) = q/[q + (1 − q)π]

ˆ Since the high-ability worker always chooses eh but the low-ability worker does so only with probability π,
observing eh makes it more likely that the worker has high ability, so µ(H|eh ) > q

ˆ Second, as π approaches zero, the low-ability worker almost never pools with the high-ability worker so
µ(H|eh ) approaches 1

ˆ Third, as π approaches one, the low-ability worker almost always pools with the high-ability worker so
µ(H|eh ) approaches the prior belief q

ˆ When the low-ability worker separates from the high-ability worker by choosing eL , the belief µ(H|eL ) = 0
implies the wage w(eL ) = y(L, eL )

ˆ We claim that eL = e∗ (L)

ˆ Suppose the low-ability worker separates by choosing some eL ≠ e∗(L)

ˆ Such separation yields the payoff y(L, eL ) − c(L, eL )

ˆ But choosing e∗(L) would yield a payoff of at least y[L, e∗(L)] − c[L, e∗(L)]

– or more if the firms’ belief µ[H|e∗ (L)] is greater than 0

ˆ The definition of e∗ (L) implies

y[L, e∗(L)] − c[L, e∗(L)] > y(L, e) − c(L, e), ∀e ≠ e∗(L)

ˆ For the low-ability worker to be willing to randomize between separating at e∗ (L) and pooling at eh

ˆ The wage wh ≡ w̃(eh ) must make that worker indifferent between the two

w∗ (L) − c[L, e∗ (L)] = wh − c(L, eh ) (P)

ˆ Recall that Equation (W ) and the definition of the belief µ(·|eh ) imply

wh = [q/(q + (1 − q)π)] × y(H, eh ) + [(1 − q)π/(q + (1 − q)π)] × y(L, eh )

ˆ For a given value of eh , if Equation (P ) yields wh < y(H, eh ) then there is a unique possible value for π
consistent with a hybrid equilibrium in which the low-ability worker randomizes between e∗ (L) and eh

ˆ If wh > y(H, eh ), then there does not exist a hybrid equilibrium involving eh

ˆ Observe that Equation (P ) yields wh < y(H, eh ) if and only if eh < es where es is the education chosen
by the high-ability worker in the separating equilibrium

ˆ Given wh < y(H, eh ), the probability r solves

r × y(H, eh ) + (1 − r) × y(L, eh ) = wh

ˆ This probability is the firms’ equilibrium belief µ(H|eh ), so

π = q(1 − r)/[r(1 − q)]

ˆ As eh approaches es , the probability r approaches 1 so π approaches 0

ˆ The separating equilibrium described previously is the limit of the hybrid equilibria considered here
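This limit can be seen numerically, reusing the assumed primitives from the earlier illustrations (none of which come from the text): y(η, e) = η(1 + e), c(η, e) = e²/η, L = 1, H = 2, q = 1/2, with es = 1 + √1.75 from the separating case.

```python
# Assumed illustrative primitives (not from the text)
def y(eta, e): return eta * (1.0 + e)
def c(eta, e): return e**2 / eta

L, H, q = 1.0, 2.0, 0.5
piL = y(L, 0.5) - c(L, 0.5)            # w*(L) - c(L, e*(L)) = 1.25
es = 1 + 1.75 ** 0.5                   # separating level e_s (envy case)

def hybrid_pi(eh):
    """Low type's pooling probability pi at eh (valid when eh < e_s)."""
    wh = piL + c(L, eh)                               # indifference (P)
    r = (wh - y(L, eh)) / (y(H, eh) - y(L, eh))       # belief mu(H|eh) = r
    return q * (1 - r) / (r * (1 - q))

for eh in (2.0, 2.2, 2.3):
    print(round(hybrid_pi(eh), 4))     # 0.3333, 0.1073, 0.0185
```

Note the additional requirement π ≤ 1: for eh too far below es the implied π exceeds 1, so with these numbers a hybrid equilibrium exists only for eh close enough to es. As eh rises toward es, r rises toward 1 and π falls toward 0, which is the limit described in the text.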

ˆ To complete the description of the hybrid PBE, we should define the firms’ beliefs off the equilibrium path
and check the workers’ best responses

ˆ Let µ(·|e) be defined as follows

µ(H|e) = { 0   for e < eh
           r   for e ≥ eh }

ˆ The firms’ strategy is then

w̃(e) = { y(L, e)                            for e < eh
          r × y(H, e) + (1 − r) × y(L, e)   for e ≥ eh }

ˆ It remains to check that the workers’ strategy

– e(L) = eh with probability π and e(L) = e∗ (L) with probability 1 − π


– e(H) = eh

is a best response to the firms’ strategy

Corporate investment and capital structure

ˆ Consider an entrepreneur who has started a company but needs outside financing to undertake an attractive new project

ˆ The entrepreneur has private information about the profitability of the existing company

ˆ The payoff of the new project cannot be disentangled from the payoff of the existing company

ˆ All that can be observed is the aggregate profit of the firm

ˆ Suppose the entrepreneur offers a potential investor an equity stake in the firm in exchange for the
necessary financing

ˆ Under what circumstances will the new project be undertaken?

ˆ What will the equity stake be?

ˆ Suppose that the profit of the existing company can be either high or low: π ∈ {H, L} with H > L > 0

ˆ The potential investor’s opportunity cost is r, i.e., there is an alternative investment possibility with rate
of return r

ˆ The required investment in the new project is I

ˆ The payoff will be R

ˆ The new project is attractive in the sense that the NPV is positive, i.e., R > I(1 + r)

The timing and the payoffs of the game are:

1. Nature determines the profit of the existing company

ˆ The probability that π = L is p

2. The entrepreneur learns π and then offers the potential investor an equity stake s, where 0 ≤ s ≤ 1

3. The investor observes s but not π and then decides either to accept or to reject the offer

4. Payoffs:

ˆ If the investor rejects the offer then the investor’s payoff is I(1 + r) and the entrepreneur’s payoff is
π
ˆ If the investor accepts s then the investor’s payoff is s(π + R) and the entrepreneur’s is (1 − s)(π + R)

ˆ Suppose that after receiving the offer s the investor believes that the probability that π = L is q(s)

ˆ Then the investor will accept s if and only if

s[qL + (1 − q)H + R] ≥ I(1 + r) (PC-I)

ˆ Suppose the profit of the existing company is π

ˆ The entrepreneur prefers to receive the financing at the cost of an equity stake of s if and only if

s ≤ R/(π + R)   (PC-E)

ˆ In a pooling PBE, the investor’s belief must be q(spo ) = p after receiving the equilibrium offer spo

ˆ The participation constraint (PC-E) is more difficult to satisfy for π = H than for π = L

ˆ Therefore, a pooling equilibrium (with “accepts” as an action) exists only if

I(1 + r)/[pL + (1 − p)H + R] ≤ R/(H + R)   (NC-p)

ˆ If p is close enough to zero, (NC-p) holds because R > I(1 + r)

ˆ If p is close enough to one, however, the necessary condition (NC-p) holds only if

R − I(1 + r) ≥ [I(1 + r)/R] (H − L)   (sNC-p)
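A quick numerical check of (NC-p); the parameter values below are assumed for illustration only: L = 50, H = 100, R = 60, I = 40, r = 0.10, so that I(1 + r) = 44 < R = 60 and the project has positive NPV.

```python
# Assumed illustrative numbers (not from the text)
L, H, R = 50.0, 100.0, 60.0
I0, r = 40.0, 0.10
cost = I0 * (1 + r)                    # I(1 + r) = 44 < R = 60: positive NPV

def pooling_exists(p):
    """(NC-p): the smallest stake the investor accepts under belief q = p
    also satisfies the high type's participation constraint (PC-E)."""
    s_min = cost / (p * L + (1 - p) * H + R)
    return s_min <= R / (H + R)

print(pooling_exists(0.1))  # True: little subsidization is needed
print(pooling_exists(0.9))  # False: 44/115 > 60/160, the high type walks away
```

This matches the discussion in the text: pooling survives when p is close to zero but can fail when p is close to one.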

ˆ In a pooling equilibrium, the high-profit type must subsidize the low-profit type

ˆ Setting q(spo ) = p yields that the investor accepts to finance the project if and only if

spo ≥ I(1 + r)/[pL + (1 − p)H + R] > I(1 + r)/(H + R)

ˆ If the investor were certain that π = H then he would accept the smaller equity stake

s_H^{sy} = I(1 + r)/(H + R)

ˆ The larger equity stake required in a pooling equilibrium may be so expensive that the high-profit firm
would prefer to forego the new project

ˆ A pooling equilibrium exists if p is close to zero, so that the cost of subsidization is small

ˆ Or if the profit from the new project outweighs the cost of subsidization

ˆ If (NC-p) fails then a pooling equilibrium does not exist

ˆ A separating equilibrium always exists, however

ˆ The low-profit type offers

sL = I(1 + r)/(L + R)

which the investor accepts

ˆ The high-profit type offers

sH < I(1 + r)/(H + R)

and the investor rejects

ˆ In such an equilibrium, investment is inefficiently low: the new project is certain to be profitable, but the
high-profit type foregoes the investment
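The separating outcome can be checked with the same assumed numbers as before (L = 50, H = 100, R = 60, I(1 + r) = 44; all values illustrative, not from the text):

```python
# Assumed illustrative numbers (not from the text)
L, H, R, cost = 50.0, 100.0, 60.0, 44.0   # cost = I(1 + r)

sL = cost / (L + R)                    # low type's offer
assert abs(sL * (L + R) - cost) < 1e-9 # investor exactly breaks even on sL

# Mimicking the accepted offer sL is unattractive to the high type,
# because sL exceeds its participation bound R/(H + R):
assert sL > R / (H + R)                # 0.4 > 0.375: high type foregoes

# The inefficiency: the project the high type foregoes has positive NPV
assert R > cost                        # 60 > 44
print(sL, R / (H + R))                 # 0.4 0.375
```

With these numbers the high type strictly prefers keeping the firm to selling a 40% stake, even though the new project is profitable.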

ˆ There is no way for the high-profit type to distinguish itself

ˆ Financing terms that are attractive to the high-profit type are even more attractive to the low-profit type

ˆ As Myers and Majluf (J. Fin. Econ. 1984) observe, the forces in this model push firms toward either
debt or internal sources of funds

ˆ Actually, Myers and Majluf analyze a large firm (with shareholders and a manager) rather than an
entrepreneur (who is both the manager and the sole shareholder)

ˆ We consider the possibility that the entrepreneur can offer debt as well as equity

ˆ Suppose the investor accepts the debt contract D

ˆ If the entrepreneur does not declare bankruptcy then the investor’s payoff is D and the entrepreneur’s is
π+R−D

ˆ If the entrepreneur does declare bankruptcy then the investor’s payoff is π + R and the entrepreneur’s is
zero

ˆ Since L > 0, there is always a pooling equilibrium: both profit-types offer the debt contract D = I(1 + r),
which the investor accepts

ˆ If L were sufficiently negative that R + L < I(1 + r), then the low-profit type could not repay this debt
so the investor would not accept the contract

ˆ A similar argument would apply if L and H represented expected rather than certain profits

ˆ Suppose that the type π means that the existing company’s profit will be

– π + K with probability 1/2


– π − K with probability 1/2

ˆ If L − K + R < I(1 + r), then there is probability 1/2 that the low-profit type will not be able to repay
the debt D = I(1 + r), so the investor will not accept the contract

Monetary policy

ˆ Consider a sequential-move game in which employers and workers negotiate nominal wages

ˆ After the negotiation, the monetary authority chooses the money supply, which in turn determines the
rate of inflation

ˆ If wage contracts cannot be perfectly indexed, employers and workers will try to anticipate inflation in
setting the wage

ˆ Once an imperfectly indexed nominal wage has been set, actual inflation above the anticipated level of
inflation will erode the real wage

ˆ This causes employers to expand employment and output

ˆ The monetary authority therefore faces a trade-off between the costs of inflation and the benefits of
reduced unemployment and increased output that follow from surprise inflation

We follow Barro and Gordon (J. Mon. Econ. 1983) and analyze a reduced-form version of this model in the
following game

ˆ First, employers form an expectation of inflation, π e

ˆ Second, the monetary authority observes this expectation and chooses actual inflation, π

ˆ The payoff to employers is −(π − π e )²: employers simply want to anticipate inflation correctly; they achieve
their maximum payoff when π = π e

ˆ The monetary authority would like inflation to be zero but output (y) to be at its efficient level (y ∗ )

ˆ The payoff to the monetary authority is

U (π, y) = −cπ² − (y − y ∗ )²

where the parameter c > 0 reflects the monetary authority’s trade-off between its two goals

ˆ Suppose the actual output is the following function of target output and surprise inflation

ỹ(π, π e ) = by ∗ + d(π − π e )

– Where b < 1 reflects the presence of monopoly power in product markets


– If there is no surprise inflation, π = π e , then actual output will be smaller than would be efficient
– Where d > 0 measures the effect of surprise inflation on output through real wages

ˆ We can then rewrite the monetary authority’s payoff as

W (π, π e ) = U (π, ỹ(π, π e )) = −cπ² − [(b − 1)y ∗ + d(π − π e )]²

ˆ We propose to solve the subgame-perfect outcome of this game

ˆ We first compute the monetary authority’s optimal choice of π given employers’ expectation π e :

π ∗ (π e ) = [d/(c + d²)] [(1 − b)y ∗ + dπ e ]

ˆ Since employers anticipate that the monetary authority will choose π ∗ (π e ), employers choose π e to maximize
−[π ∗ (π e ) − π e ]², which yields π ∗ (π e ) = π e , or

π e = [d(1 − b)/c] y ∗ ≡ π s

ˆ In this subgame-perfect outcome, the monetary authority is expected to inflate and does so
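A small numerical sketch of this subgame-perfect outcome; the parameter values (c = 2, d = 1, b = 0.5, y∗ = 10) are assumed for illustration only.

```python
# Assumed illustrative parameters (not from the text)
c, d, b, ystar = 2.0, 1.0, 0.5, 10.0

def pi_star(pie):
    """Monetary authority's best response to the expectation pie."""
    return d / (c + d**2) * ((1 - b) * ystar + d * pie)

pie = 0.0                              # iterate to the rational-expectations
for _ in range(200):                   # fixed point pi*(pie) = pie
    pie = pi_star(pie)

assert abs(pie - d * (1 - b) * ystar / c) < 1e-9   # closed form pi_s
print(round(pie, 4))  # 2.5
```

The iteration converges (the map is a contraction with factor d²/(c + d²) = 1/3 here) to the closed-form inflation rate π s = d(1 − b)y∗/c = 2.5.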

ˆ We consider a two-period version of the previous model and we add private information

ˆ In the two-period model, each player’s payoff is the sum of the player’s one-period payoffs

W (π1 , π1e ) + W (π2 , π2e )   and   −(π1 − π1e )² − (π2 − π2e )²

where πt is actual inflation in period t and πte is employers’ expectation (at end of period t−1 or beginning
of period t) of inflation in period t

ˆ We now assume that the parameter c is privately known by the monetary authority: c ∈ {S, W }

– c = S or W for “strong” and “weak” at fighting inflation where S > W > 0

1. Nature draws the monetary authority’s type, c

ˆ The probability that c = W is p

2. Employers form π1e , their expectation of first-period inflation

3. The monetary authority observes π1e and then chooses actual first-period inflation, π1

4. Employers observe π1 but not c, and then form π2e , their expectation of second-period inflation

5. The monetary authority observes π2e and then chooses actual second-period inflation, π2

ˆ There is a one-period signaling game embedded in this two-period monetary-policy game

ˆ The Sender’s message is the monetary authority’s first-period choice of inflation, π1

ˆ The Receiver’s action is employers’ second-period expectation of inflation, π2e

ˆ Employers’ first-period expectation of inflation and the monetary authority’s second-period choice of
inflation precede and follow the signaling game

ˆ If the monetary authority’s type is c then its optimal choice of π2 given the expectation π2e is

π2∗ (π2e , c) ≡ [d/(c + d²)] [(1 − b)y ∗ + dπ2e ]

ˆ Employers anticipate this

ˆ If employers begin the second period believing that the probability that c = W is q, then they will form
the expectation π2e (q) that maximizes

−q[π2∗ (π2e , W ) − π2e ]² − (1 − q)[π2∗ (π2e , S) − π2e ]²

Monetary policy: pooling equilibrium

ˆ In a pooling equilibrium, both types choose the same first-period inflation π1∗

ˆ Employers’ first-period expectation is π1e = π1∗

ˆ On the equilibrium path, employers begin the second period believing that the probability that c = W is
p and so form the expectation π2e (p)

ˆ Then the monetary authority of type c chooses its optimal second-period inflation given this expectation,
namely π2∗ [π2e (p), c], thus ending the game

Monetary policy: separating equilibrium

ˆ In a separating equilibrium, the two types choose different first-period inflation levels, say πW and πS

ˆ So employers’ first-period expectation is π1e = pπW + (1 − p)πS

ˆ After observing πW , employers begin the second period believing that c = W and so form the expectation
π2e (1) solution of the equation

π2e (1) = π2∗ (π2e (1), W ),  i.e.,  π2e (1) = [d(1 − b)/W ] y ∗

ˆ Likewise, observing πS leads to


π2e (0) = [d(1 − b)/S] y ∗

ˆ In equilibrium, the weak type then chooses π2∗ [π2e (1), W ] and the strong type π2∗ [π2e (0), S], ending the game

To complete the description of such an equilibrium it remains

1. to specify the Receiver’s out-of-equilibrium beliefs and actions

2. to check that no Sender-type has an incentive to deviate

3. in particular, to check that neither type has an incentive to mimic the other’s equilibrium behavior

ˆ The weak type might be tempted to choose πS in the first period, thereby inducing π2e (0) as the employers’
second-period expectation

ˆ And then choose π2∗ [π2e (0), W ] to end the game

ˆ Even if πS is uncomfortably low for the weak type, the ensuing expectation π2e (0) might be so low that
the weak type receives a huge payoff from the unanticipated inflation

π2∗ [π2e (0), W ] − π2e (0)

ˆ In a separating equilibrium, the strong type’s first period inflation must be low enough that the weak
type is not tempted to mimic the strong type

ˆ In spite of the subsequent benefit from unanticipated second-period inflation

4.3 Other applications of Signaling Games


Cheap-Talk games

ˆ Cheap-talk games are analogous to signaling games

ˆ But the Sender’s messages are just talk: costless, non-binding, non-verifiable claims

ˆ Such talk cannot be informative in Spence’s job market signaling game:

– a worker who simply announced “My ability is high” would not be believed

ˆ In other contexts, cheap talk can be informative

ˆ Stein (Am. Econ. Rev. 1989) shows that policy announcements by the Federal Reserve can be informative
but cannot be too precise

ˆ Matthews (Quarterly J. Econ. 1989) studies how a veto threat by the president can influence which bill
gets through Congress

ˆ One can also ask how to design environments to take advantage of cheap talk

ˆ Austen-Smith (1990), a very interesting paper, shows that in some settings debate among self-interested
legislators improves the social value of the eventual legislation

ˆ Farrell and Gibbons (1991) show that in some settings unionization improves social welfare because it
facilitates communication from the work force to management

ˆ In Spence’s job market model, cheap talk cannot be informative because all the Sender’s types have the
same preferences over the Receiver’s possible actions:

– all workers prefer higher wages, independent of ability

ˆ Let’s illustrate why uniformity of preferences over the Receiver’s possible actions vitiates cheap talk

ˆ Suppose there were a pure-strategy equilibrium in which one subset of Sender-types, T1 , sends one message,
m1

ˆ While another subset of types, T2 , sends another message, m2

ˆ In equilibrium, the Receiver will interpret mi as coming from Ti and so will take the optimal action given
this belief; denote this action by ai

ˆ Since all Sender-types have the same preferences over actions

– If one type prefers a1 to a2 , then all types have this preference and will send m1 rather than m2

ˆ This destroys the putative equilibrium

There are three necessary conditions for cheap talk to be informative

1. different Sender-types have different preferences over the Receiver’s actions

2. the Receiver prefers different actions depending on the Sender’s type

3. the Receiver’s preferences over actions must not be completely opposed to the Sender’s preferences

ˆ Suppose that the Receiver prefers low actions when the Sender’s type is low and high actions when
the Sender’s type is high
ˆ If low Sender-types prefer low actions and high types high actions, then communication can occur
ˆ If the Sender has the opposite preference then communication cannot occur because the Sender would
like to mislead the Receiver

ˆ Crawford and Sobel (Econometrica 1982) analyze an abstract model that satisfies these three necessary
conditions: they show that

– more communication can occur through cheap talk when the players’ preferences are more closely
aligned
– perfect communication cannot occur unless the players’ preferences are perfectly aligned

ˆ Each of economic applications (cheap talk of the Fed, veto threats, information transmission in debate,
union voice) involve complicated models of economic environments

ˆ We will only analyze abstract cheap-talk games

An abstract cheap-talk game

The timing of the simplest cheap-talk game is identical to the timing of a signaling game; only payoffs differ

1. Nature draws a type ti for the Sender from a set T = {t1 , ..., tI } of feasible types according to a probability
distribution p with full support, i.e., p(t) > 0 for every t ∈ T

2. The Sender observes ti and then chooses a message mj from a set of feasible messages M = {m1 , ..., mJ }

3. The Receiver observes mj (but not ti ) and then chooses an action ak from a set of feasible actions
A = {a1 , ..., aK }

4. Payoffs are given by US (ti , ak ) and UR (ti , ak )

ˆ The key feature of such a game is that the message has no direct effect on either the Sender’s or the
Receiver’s payoff

ˆ The only way the message can matter is through its information content

ˆ By changing the Receiver’s belief about the Sender’s type, a message can change the Receiver’s action

ˆ And thus, indirectly affect both players’ payoffs

ˆ We will assume that anything can be said in the sense that M = T

ˆ Because the simplest cheap-talk and signaling games have the same timing, the definitions of P BE in the
two games are identical

ˆ A pure-strategy PBE is a pair of strategies m∗ : T → M and a∗ : M → A, and a family (µ(·|mj ))mj ∈M


of beliefs over T satisfying Requirements (1), (2R), (2S) and (3)

ˆ In a cheap-talk game, a pooling equilibrium always exists

ˆ Because messages have no direct effect on the Sender’s payoff

ˆ If the Receiver will ignore all messages then pooling is a best response for the Sender; and if the Sender
is pooling then a best response for the Receiver is to ignore all messages

ˆ More formally, let ā denote the Receiver’s optimal action in a pooling equilibrium, i.e., ā solves

max_{ak ∈A}  Σ_{ti ∈T} p(ti ) UR (ti , ak )

ˆ Define a∗ by a∗ (mj ) = ā for every mj ∈ M

ˆ Fix an arbitrary message m̄ in M and define m∗ by m∗ (ti ) = m̄ for every ti ∈ T

ˆ Let µ(·|mj ) = p for every mj ∈ M

ˆ This is a pooling equilibrium
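A minimal computational sketch of this pooling construction; the two-type, two-action payoff numbers below are assumed for illustration only.

```python
# Assumed illustrative two-type, two-action example (not from the text)
T = ["t1", "t2"]
A = ["a1", "a2"]
p = {"t1": 0.6, "t2": 0.4}             # prior with full support

# Receiver's payoffs U_R(t, a): prefers a1 against t1 and a2 against t2
UR = {("t1", "a1"): 1.0, ("t1", "a2"): 0.0,
      ("t2", "a1"): 0.0, ("t2", "a2"): 1.0}

# The pooled action a-bar maximizes expected payoff under the prior;
# in the pooling PBE every message is answered with a-bar
abar = max(A, key=lambda a: sum(p[t] * UR[(t, a)] for t in T))
print(abar)  # a1, since p(t1) = 0.6 > 0.4
```

Because messages carry no direct payoff, ignoring them and playing ā is a best response for the Receiver, and any common message is a best response for the Sender.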

ˆ The interesting question therefore is whether non-pooling equilibria exist

ˆ We consider a two-type, two-action example

ˆ T = {tL , tH }, p(tL ) = p, and A = {aL , aH }

ˆ The payoffs are given in the following table (this is not a game in normal form!)

                     Sender’s type
                     tL        tH
Receiver’s   aL      x, 1      y, 0
action       aH      z, 0      w, 1

ˆ The first payoff in each cell is the Sender’s and the second the Receiver’s

ˆ We have chosen the Receiver’s payoffs so that the Receiver

– prefers the low action aL when the Sender’s type is low tL


– prefers the high action aH when the Sender’s type is high tH

ˆ To illustrate the first necessary condition, suppose both Sender-types have the same preferences over
actions

– For example, x > z and y > w


– Both types prefer aL to aH and both types would like the Receiver to believe that t = tL : the
Receiver cannot believe such a claim

ˆ To illustrate the third necessary condition, suppose the players’ preferences are completely opposed

– For example, z > x and y > w


– The low Sender-type prefers the high action and the high Sender-type the low action
– Then tL would like the Receiver to believe that t = tH and tH would like the Receiver to believe
that t = tL
– The Receiver cannot believe either of these claims

ˆ Consider now the case: x ≥ z and w ≥ y;


the players’ interests are perfectly aligned, in the sense that given the Sender’s type, the players (Sender
and Receiver) agree on which action should be taken

ˆ We exhibit a separating PBE

ˆ The Sender’s strategy is m∗ (t) = t for every t ∈ T

ˆ The Receiver’s beliefs are µ(tL |tL ) = 1 and µ(tL |tH ) = 0

ˆ The Receiver’s strategy is a∗ (tL ) = aL and a∗ (tH ) = aH

ˆ We consider now a special case of Crawford and Sobel’s model

ˆ The type, message and action spaces are continuous

ˆ The Sender’s type is uniformly distributed between 0 and 1

– T = [0, 1] and p = λ the Lebesgue measure

ˆ The message space is the type space M = T

ˆ The action space is the interval from 0 to 1, i.e., A = [0, 1]

ˆ The Receiver’s payoff function is UR (t, a) = −(a − t)²

ˆ The Sender’s payoff function is US (t, a) = −[a − (t + b)]²

ˆ When the Sender’s type is t, the Receiver’s optimal action is a = t, but according to the Sender’s
preferences the optimal action is a = t + b (strictly, min{1, t + b}, since actions lie in [0, 1])

ˆ Different Sender-types have different preferences over the Receiver’s actions (higher types prefer higher
actions)

ˆ The player’s preferences are not completely opposed

– The parameter b > 0 measures the similarity of the players’ preferences


– When b is closer to 0, the players’ interests are more closely aligned

ˆ We will prove the existence of partially pooling equilibria of the following form

ˆ The type space is divided into the n intervals

[0, x1 ), [x1 , x2 ), ..., [xn−1 , 1]

ˆ All the types in a given interval send the same message, but types in different intervals send different
messages

ˆ Given the value of the preference-similarity parameter b, there is a maximum number of intervals (or
“steps”) that can occur in equilibrium

ˆ This maximum number is denoted by n∗ (b), and partially pooling equilibria exist for each n ∈ {1, 2, ..., n∗ (b)}

ˆ A decrease in b increases n∗ (b):


more communication can occur through cheap talk when the players’ preferences are more closely aligned

ˆ n∗ (b) approaches infinity as b approaches zero: perfect communication cannot occur unless the players’
preferences are perfectly aligned

ˆ We characterize these partially pooling equilibria, starting with a two-step equilibrium, i.e., n = 2

ˆ Suppose all the types in [0, x1 ) send one message while those in [x1 , 1] send another

ˆ After receiving the message from the types in [0, x1 ), the Receiver will believe that the Sender’s type is
uniformly distributed on [0, x1 )

ˆ So the Receiver’s optimal action will be x1 /2

ˆ After receiving the message from the types in [x1 , 1], the Receiver’s optimal action will be (x1 + 1)/2

ˆ For the types in [0, x1 ) to be willing to send their message, it must be that all these types prefer the action
x1 /2 to the action (x1 + 1)/2

ˆ Likewise, all the types above x1 must prefer (x1 + 1)/2 to x1 /2

ˆ The Sender-type t

– prefers x1 /2 to (x1 + 1)/2 if the midpoint between these two actions exceeds that type’s optimal
action, t + b
– prefers (x1 + 1)/2 to x1 /2 if t + b exceeds the midpoint

ˆ For a two-step equilibrium to exist, x1 must be the type t whose optimal action t + b exactly equals the
midpoint between the two actions

x1 + b = (1/2) [x1 /2 + (x1 + 1)/2]

or x1 = (1/2) − 2b

ˆ Since x1 must be positive, a two-step equilibrium exists only if b < 1/4

ˆ For b ≥ 1/4 the players’ preferences are too dissimilar to allow even this limited communication
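The two-step cutoff can be verified numerically; the value of b below is assumed for illustration (any b < 1/4 works).

```python
# Assumed illustrative preference-similarity parameter (any b < 1/4)
b = 0.1
x1 = 0.5 - 2 * b                       # cutoff: x1 = 1/2 - 2b

a_low = x1 / 2                         # Receiver's action after the low message
a_high = (x1 + 1) / 2                  # Receiver's action after the high message

# Boundary type x1 has optimal action x1 + b, exactly at the midpoint
midpoint = 0.5 * (a_low + a_high)
assert abs((x1 + b) - midpoint) < 1e-9

# Types just below x1 prefer a_low; types just above prefer a_high
for t, closer in ((x1 - 0.01, a_low), (x1 + 0.01, a_high)):
    other = a_high if closer is a_low else a_low
    assert abs(closer - (t + b)) < abs(other - (t + b))
print(round(a_low, 4), round(a_high, 4))  # 0.15 0.65
```

The assertions confirm the incentive constraints that define the two-step partition.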

ˆ We still have to address the issue of messages that are off the equilibrium path

ˆ Let the Sender’s strategy be that all types t < x1 send the message 0

ˆ And all types t ≥ x1 send the message x1

ˆ Let the Receiver’s out-of-equilibrium belief after observing any message from (0, x1 ) be that t is uniformly
distributed on [0, x1 )

ˆ And after receiving any message from (x1 , 1] be that t is uniformly distributed on [x1 , 1]

ˆ We propose to characterize an n-step equilibrium

ˆ Assume the step [xk−1 , xk ) is of length c

ˆ To make the boundary type xk indifferent between the steps [xk−1 , xk ) and [xk , xk+1 )

ˆ One must have


(xk+1 + xk )/2 − (xk + b) = c/2 + b

or

(xk+1 − xk ) = (xk − xk−1 ) + 4b

ˆ Each step must be 4b longer than the last

ˆ In an n-step equilibrium, if the first step is of length d

ˆ Then the second must be of length d + 4b

ˆ The third of length d + 8b

ˆ The nth step must end exactly at t = 1, so we must have

d + (d + 4b) + ... + [d + (n − 1)4b] = 1

ˆ Recall that 1 + 2 + ... + (n − 1) = n(n − 1)/2

ˆ Therefore we have
n × d + n(n − 1) × 2b = 1 (NC)

ˆ Given any n such that n(n − 1) × 2b < 1, there exists a value of d that solves (NC)

ˆ And therefore there exists an n-step partially pooling equilibrium
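As a sketch, the condition (NC) can be solved and verified numerically (the values b = 0.05 and n = 3 are illustrative choices satisfying n(n − 1) × 2b < 1):

```python
b, n = 0.05, 3                         # illustrative values: n(n-1)*2b = 0.6 < 1
assert n * (n - 1) * 2 * b < 1

d = (1 - n * (n - 1) * 2 * b) / n      # first step's length, solving (NC)
steps = [d + k * 4 * b for k in range(n)]   # each step is 4b longer than the last

assert abs(sum(steps) - 1) < 1e-9      # the n steps exactly cover [0, 1]
assert all(abs((steps[k + 1] - steps[k]) - 4 * b) < 1e-12 for k in range(n - 1))
```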

ˆ The largest possible number of steps in such an equilibrium, n∗ (b), is the largest value n such that
n(n − 1) × 2b < 1

ˆ Therefore n∗ (b) is the largest integer less than

(1/2)[1 + √(1 + 2/b)]

ˆ Observe that n∗ (b) = 1 for b ≥ 1/4: no communication is possible if the players’ preferences are too
dissimilar

ˆ Moreover, n∗ (b) approaches infinity only as b approaches zero: perfect communication cannot occur unless
the players’ preferences are perfectly aligned
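A small Python sketch can confirm the closed form for n∗ (b) against a direct search over the condition n(n − 1) × 2b < 1 (the test values of b are arbitrary):

```python
import math

def n_star(b):
    """Largest n satisfying n(n-1)*2b < 1, found by direct search."""
    n = 1
    while (n + 1) * n * 2 * b < 1:
        n += 1
    return n

for b in (0.05, 0.1, 0.25, 0.4):
    closed_form = 0.5 * (1 + math.sqrt(1 + 2 / b))  # n*(b) is the largest integer below this
    assert n_star(b) == math.ceil(closed_form) - 1

assert n_star(0.25) == 1    # no communication is possible for b >= 1/4
```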

Sequential bargaining under asymmetric information

ˆ Consider a firm and a union bargaining over wages

ˆ For simplicity, assume that employment is fixed

ˆ The amount that union members earn if not employed, called the union’s reservation wage, is denoted by wr

ˆ The firm’s profit, denoted by π, is uniformly distributed on [πL , πH ]

ˆ The value of π is privately known by the firm

– The firm might have superior knowledge concerning new products in the planning stage

ˆ We simplify the analysis by assuming that wr = πL = 0

The bargaining game lasts at most two periods

1. In the first period, the union makes a wage offer, w1

ˆ If the firm accepts this offer then the game ends


ˆ The union’s payoff is w1 and the firm’s is π − w1
ˆ These payoffs are the present values of the wage and (net) profit streams that accrue to the players
over the life of the contract being negotiated

2. If the firm rejects this offer the game proceeds to the second period

ˆ The union makes a second wage offer, w2


ˆ If the firm accepts this offer then the present values of the players’ payoffs are δw2 for the union and
δ(π − w2 ) for the firm
ˆ δ reflects both discounting and the reduced life of the contract remaining after the first period
ˆ If the firm rejects the union’s second offer then the game ends and payoffs are zero for both players

ˆ A more realistic model might allow the bargaining to continue until an offer is accepted

ˆ Or might force the parties to submit to binding arbitration after a prolonged strike

ˆ Here we sacrifice realism for tractability

ˆ We refer to Sobel and Takahashi (Rev. Econ. Stud. 1983) for an infinite-horizon analysis

We begin by sketching the unique PBE of this game

ˆ The union’s first-period wage offer is

w1∗ = (2 − δ)² πH /[2(4 − 3δ)]

ˆ If the firm’s profit, π, exceeds

π1∗ = 2w1∗ /(2 − δ) = (2 − δ)πH /(4 − 3δ)

then the firm accepts w1∗ ; otherwise the firm rejects w1∗

ˆ If its first-period offer is rejected, the union updates its belief about the firm’s profit

– The union believes that π is uniformly distributed on [0, π1∗ ]

ˆ The union’s second-period wage offer (conditional on w1∗ being rejected) is

w2∗ = π1∗ /2 = (2 − δ)πH /[2(4 − 3δ)] < w1∗

ˆ If the firm’s profit, π, exceeds w2∗ then the firm accepts the offer; otherwise, it rejects it

ˆ We will refer interchangeably to one firm with many possible profit types and to many firms each with
its own profit level

ˆ In each period, high-profits firms accept the union’s offer

ˆ While low-profit firms reject it

ˆ The union’s second-period belief reflects the fact that high-profit firms accepted the first-period offer

ˆ In equilibrium, low-profit firms tolerate a one-period strike in order to convince the union that they are
low-profit and so induce the union to offer a lower second-period wage

ˆ Firms with very low profits find even the lower second-period offer intolerably high and so reject it, too

ˆ We propose an extensive-form representation of a simplified version of the game

ˆ There are only two values of π: πL and πH

ˆ The union has only two possible wage offers wL and wH

ˆ In this simplified game, the union has the move at three information sets: the union’s strategy consists
of three wage offers

1. The first-period offer, w1


2. The second-period offer, w2 (H) after w1 = wH is rejected
3. The second-period offer, w2 (L) after w1 = wL is rejected

ˆ These three moves occur at three non-singleton information sets, at which the union’s beliefs are denoted

(p, 1 − p), (q, 1 − q) and (r, 1 − r)

respectively

ˆ In the full game, a strategy for the union is a

1. first-period offer w1
2. a second-period offer function w1 7→ w̃2 (w1 ) that specifies the offer w2 to be made after each possible
offer w1 is rejected

ˆ Each of these moves occur at a non-singleton information set

ˆ There is one second-period information set for each different first-period wage offer the union might make

ˆ So there is a continuum of such information sets, rather than two in the simplified game

ˆ With both the lone first-period and the continuum of second-period information sets, there is one decision
node for each possible value of π (so a continuum of such nodes, rather than two for the simplified game)

ˆ At each information set, the union’s belief is a probability distribution over these nodes

ˆ We denote the union’s first-period belief by µ1 ∈ Prob([0, πH ])

ˆ The union’s second-period belief, after observing the first-period offer w1 has been rejected, is denoted by
µ2 (·|w1 )

ˆ A strategy for the firm involves two decisions

ˆ Let A1 (w1 |π) equal one if the firm would accept the first-period offer w1 when its profit is π, and zero if
the firm would reject w1 under these circumstances

ˆ Let A2 (w2 |π, w1 ) equal one if the firm would accept the second-period offer w2 when its profit is π and
the first-period offer was w1 , and zero if the firm would reject w2 under these circumstances

ˆ A strategy for the firm is a pair of functions (A1 , A2 ) with

A1 : (w1 , π) 7→ A1 (w1 |π) ∈ {0, 1}

and
A2 : (w2 , w1 , π) 7→ A2 (w2 |π, w1 ) ∈ {0, 1}

ˆ Since the firm has complete information throughout the game, its beliefs are trivial

ˆ The strategies (w̃1 , w̃2 ) and (A1 , A2 ), and the beliefs (µ1 , µ2 ) form a PBE if they satisfy Requirements 2,
3 and 4

ˆ Requirement 1 is satisfied by the mere existence of the union’s beliefs

ˆ We will show that there is a unique perfect Bayesian equilibrium

ˆ The simplest step of the argument is to apply Requirement 2 to the firm’s second-period decision A2 (w2 |π, w1 )

ˆ Since this is the last move of the game, the optimal decision for the firm is to accept w2 if and only if
π ≥ w2 ; the value of w1 is irrelevant
A2 (w2 |π, w1 ) = 1 if π ≥ w2 , and A2 (w2 |π, w1 ) = 0 if π < w2

ˆ Given the strategy A2 , we can apply Requirement 2 to the union’s second-period choice of a wage offer

ˆ w2 should maximize the union’s expected payoff, given the union’s belief µ2 and the firm’s subsequent
strategy A2

ˆ The difficult part of the argument is to determine the belief µ2

ˆ We temporarily consider the following one-period bargaining problem

ˆ Suppose the union believes that the firm’s profit is uniformly distributed on [0, π1 ], where for the moment
π1 is arbitrary

ˆ If the union offers w then the firm’s best response is:

– accept w if and only if π ≥ w

ˆ Thus the union’s problem can be stated as

max over w ≥ 0 of   w × Prob{firm accepts w} + 0 × Prob{firm rejects w}

where

Prob{firm accepts w} = (π1 − w)/π1

ˆ The optimal wage offer is therefore w∗ (π1 ) = π1 /2
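The one-period solution w∗ (π1 ) = π1 /2 is easy to confirm by brute force; the Python sketch below (with the arbitrary choice π1 = 2) maximizes the union’s expected payoff over a grid:

```python
pi1 = 2.0   # illustrative upper bound of the uniform support

def payoff(w):
    # union's expected payoff: wage times acceptance probability (pi1 - w)/pi1
    return w * (pi1 - w) / pi1 if w <= pi1 else 0.0

grid = [i * pi1 / 10000 for i in range(10001)]
w_best = max(grid, key=payoff)
assert abs(w_best - pi1 / 2) < 1e-3   # matches w*(pi1) = pi1/2
```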

ˆ We return (permanently) to the two-period problem

ˆ Assume that the union offers w1 in the first period and the firm expects the union to offer w2 in the
second period

ˆ The firm’s possible payoffs are

– π − w1 from accepting w1
– δ(π − w2 ) from rejecting w1 and accepting w2
– zero from rejecting both offers

ˆ The firm prefers accepting w1 to accepting w2 if π − w1 ≥ δ(π − w2 ), or

π ≥ (w1 − δw2 )/(1 − δ) ≡ π ∗ (w1 , w2 )

ˆ And the firm prefers accepting w1 to rejecting both offers if π − w1 ≥ 0

ˆ Thus for arbitrary values of w1 and w2 , firms with π ≥ max{π ∗ (w1 , w2 ), w1 } will accept w1 and the other
firms will reject

ˆ Since Requirement 2 dictates that the firm act optimally given the players’ subsequent strategies, we can
derive A1 (w1 |π) by replacing the arbitrary wage w2 by w̃2 (w1 ), i.e.,
A1 (w1 |π) = 1 if π ≥ max{π ∗ (w1 , w̃2 (w1 )), w1 }, and A1 (w1 |π) = 0 otherwise

ˆ We can derive µ2 , the union’s second-period belief at the information set reached if the first period offer
w1 is rejected

ˆ Requirement 4 dictates that the union’s belief be determined by Bayes’ rule and the firm’s strategy

ˆ Thus, given the first part of the firm’s strategy A1 just derived

ˆ The union’s belief must be that the types remaining in the second period are uniformly distributed on
[0, π̂1 (w1 , w̃2 )] where
π̂1 (w1 , w̃2 ) ≡ max{π ∗ (w1 , w̃2 (w1 )), w1 }

ˆ Given this belief, the union’s optimal second-period offer must be

w̃2 (w1 ) = w∗ (π̂1 (w1 , w̃2 )) = π̂1 (w1 , w̃2 )/2

ˆ It follows that w̃2 (w1 ) solves the implicit equation for w2 as a function of w1 :

2w2 = max{π ∗ (w1 , w2 ), w1 }

ˆ To solve this equation, suppose that w1 ≥ π ∗ (w1 , w2 )

ˆ Then 2w2 = w1 , i.e., w2 = w1 /2, so that π ∗ (w1 , w1 /2) = w1 (2 − δ)/[2(1 − δ)] > w1 for δ ∈ (0, 1),
contradicting w1 ≥ π ∗ (w1 , w2 )

ˆ Therefore, we must have 2w2 = π ∗ (w1 , w2 ), implying that

w̃2 (w1 ) = w1 /(2 − δ)
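This fixed point can be verified numerically; in the Python sketch below (δ = 0.9 and w1 = 1 are arbitrary values) the candidate w2 = w1 /(2 − δ) is checked against the implicit equation 2w2 = π ∗ (w1 , w2 ):

```python
delta, w1 = 0.9, 1.0   # illustrative values

w2 = w1 / (2 - delta)                       # candidate solution of 2*w2 = pi*(w1, w2)
pi_star = (w1 - delta * w2) / (1 - delta)   # indifference type pi*(w1, w2)

assert abs(2 * w2 - pi_star) < 1e-9         # the implicit equation holds
assert pi_star > w1                         # so max{pi*(w1, w2), w1} = pi*(w1, w2)
```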

ˆ Therefore, the union’s second-period belief at the information set reached if the first period offer w1 is
rejected

ˆ Is that the types remaining in the second period are uniformly distributed on

[0, π̃(w1 )] where π̃(w1 ) = π̂1 (w1 , w̃2 (w1 ))

ˆ Since w̃2 (w1 ) = w1 /(2 − δ) we get that

π̃(w1 ) = 2w1 /(2 − δ)
ˆ We have now reduced the game to a single-period optimization problem for the union

ˆ Given the union’s first-period wage offer, w1 , we have specified

– the firm’s optimal first-period response

A1 (w1 |π) = 1 ⇔ π ≥ π̃(w1 ) = 2w1 /(2 − δ)

– the union’s belief entering the second period

µ2 (·|w1 ) = [1/π̃(w1 )] · λ[0,π̃(w1 )) , the uniform distribution on [0, π̃(w1 ))

– the union’s optimal second-period offer

w̃2 (w1 ) = w1 /(2 − δ)

– the firm’s optimal second-period response

A2 (w2 |π, w1 ) = 1 ⇔ π ≥ w2

ˆ Thus, the union’s first-period wage offer w1 should be chosen to solve

w1 × µ1 {A1 (w1 |·) = 1} + Π2 (w1 ) × [1 − µ1 {A1 (w1 |·) = 1}]

where Π2 (w1 ) is the discounted second-period payoff conditional on the firm rejecting the offer w1 , i.e.,

Π2 (w1 ) = δ w̃2 (w1 ) × µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ]

ˆ Observe that

µ1 {A1 (w1 |·) = 1} = µ1 {π ≥ π̃(w1 )} = [πH − π̃(w1 )]/πH

ˆ Observe that
µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ] = µ2 ([w̃2 (w1 ), πH ]|w1 )

ˆ Since π̃(w1 ) = 2w̃2 (w1 ) we get that

µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ] = [π̃(w1 ) − w̃2 (w1 )]/π̃(w1 )

ˆ The union’s first-period wage offer w1∗ should be chosen to solve

max over w1 ≥ 0 of   w1 [πH − π̃(w1 )]/πH + δ w̃2 (w1 )[π̃(w1 ) − w̃2 (w1 )]/πH

ˆ The solution w1∗ is

w1∗ = (2 − δ)² πH /[2(4 − 3δ)]

ˆ If the firm’s profit, π, exceeds

π1∗ = 2w1∗ /(2 − δ) = (2 − δ)πH /(4 − 3δ)

then the firm accepts w1∗ ; otherwise, the firm rejects w1∗

ˆ If its first period offer is rejected, the union updates its belief about the firm’s profit: the union believes
that π is uniformly distributed on [0, π1∗ ]

ˆ The union’s second-period wage offer (conditional on w1∗ being rejected) is

w2∗ = π1∗ /2 = (2 − δ)πH /[2(4 − 3δ)] < w1∗

ˆ If the firm’s profit, π, exceeds w2∗ then the firm accepts the offer; otherwise, it rejects it
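As a check on the whole derivation, the Python sketch below compares the closed form for w1∗ with a grid search over the union’s reduced first-period problem (δ = 0.5 and πH = 1 are illustrative values):

```python
delta, pi_H = 0.5, 1.0   # illustrative parameter values

def objective(w1):
    """Union's reduced first-period problem, with types uniform on [0, pi_H]."""
    pi_tilde = min(2 * w1 / (2 - delta), pi_H)  # acceptance threshold, capped at pi_H
    w2 = w1 / (2 - delta)                       # union's second-period offer
    return (w1 * (pi_H - pi_tilde) + delta * w2 * max(pi_tilde - w2, 0)) / pi_H

grid = [i / 100000 for i in range(100001)]      # w1 grid on [0, 1]
w1_grid = max(grid, key=objective)
w1_closed = (2 - delta) ** 2 * pi_H / (2 * (4 - 3 * delta))
assert abs(w1_grid - w1_closed) < 1e-3          # grid optimum matches the closed form
```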

Reputation in the finitely repeated Prisoners’ Dilemma

ˆ Consider a stage game having a unique Nash equilibrium

ˆ Any finitely repeated game based on this stage game has a unique SPNE

– The Nash equilibrium of the stage game is played in every stage, after every history

ˆ A great deal of experimental evidence suggests that cooperation occurs frequently during finitely repeated
Prisoners’ Dilemmas

ˆ Especially in stages that are not too close to the end

ˆ Kreps, Milgrom, Roberts, and Wilson (J. Econ. Theory 1982) show that a reputation model offers an
explanation of this evidence

ˆ We introduce a new way of modeling asymmetric information

ˆ Rather than assume that one player has private information about his or her payoffs

ˆ We will assume that the player has private information about his or her feasible strategies

ˆ We will assume that with probability p the Row player can play only the Tit-for-Tat strategy

– This strategy begins the repeated game by cooperating and thereafter mimics the opponent’s previous
play

ˆ While with probability 1 − p the Row player can play any of the strategies available in the complete-
information repeated game (including Tit-for-Tat)

– This Row-type is called “rational”

ˆ Under this formulation, if the Row player ever deviates from the Tit-for-Tat strategy then it becomes
common knowledge that Row is rational

ˆ The spirit of KMRW’s analysis is that even if p is very small

– i.e., even if the Column player has only a tiny suspicion that the Row player might not be rational

ˆ This uncertainty can have a big effect

ˆ KMRW show that there is an upper bound on the number of stages in which either player finks in
equilibrium

ˆ This upper bound depends on p and on the stage-game payoffs but not on the number of stages in the
repeated game

ˆ Thus, in any equilibrium of a long enough repeated game, the fraction of stages in which both players
cooperate is large

The two key steps in KMRW’s argument are

1. If the Row player deviates from Tit-for-Tat then it becomes common knowledge that Row is rational

ˆ So neither player cooperates thereafter


ˆ So the rational Row has an incentive to mimic Tit-for-Tat

2. Given an assumption on the stage-game payoffs to be imposed below, the Column player’s best response
against Tit-for-Tat would be to cooperate until the last stage of the game

ˆ We will consider the complement of the analysis in KMRW

ˆ Rather than assume that p is small and analyze long repeated games

ˆ We will assume that p is large enough that there exists an equilibrium in which both players cooperate in
all but the last two stages of a (possibly short) repeated game

ˆ We begin with the two period case

The timing is

1. Nature draws a type for the Row player

ˆ With probability p, Row has only the Tit-for-Tat strategy available


ˆ With probability 1 − p, Row can play any strategy
ˆ Row learns his or her type, but Column does not learn Row’s type

2. Row and Column play the Prisoners’ Dilemma

ˆ The players’ choices in this stage game become common knowledge

3. Row and Column play the Prisoners’ Dilemma for a second and last time

4. Payoffs are received

ˆ The payoffs are the (undiscounted) sums of their stage-game payoffs

Column
Cooperate Fink
Cooperate 1, 1 b, a
Row
Fink a, b 0, 0

ˆ To make this stage game a Prisoners’ Dilemma, we assume that

a>1 and b<0

ˆ Recall that finking (F) strictly dominates cooperating (C) in the stage game, both for rational Row and
for Column

ˆ Since, in the last stage of this two-period game of incomplete information, Column will surely fink

ˆ Then, there is no reason for the rational Row to cooperate in the first stage

ˆ Tit-for-Tat begins the game by cooperating

ˆ Thus, the only move to be determined is Column’s first-period move (X)

ˆ This move is then mimicked by Tit-for-Tat in the second period

ˆ By choosing X = C, Column receives the expected payoff

p · 1 + (1 − p) · b

in the first period

ˆ Since Tit-for-Tat and the rational Row choose different moves in the first period

ˆ Column will begin the second period knowing whether Row is Tit-for-Tat or rational

ˆ The expected second-period payoff for the Column player is

p · a + (1 − p) · 0

ˆ This reflects Column’s uncertainty about Row’s type when deciding whether to cooperate or fink in the
first period

ˆ By choosing X = F , Column’s expected payoff in the first period is

p · a + (1 − p) · 0

ˆ In the second period, Tit-for-Tat mimics Column’s first-period fink, so both Row types fink; since Column
also finks in the last stage, his second-period payoff is 0
ˆ Therefore, Column will cooperate in the first period provided that

p + (1 − p)b ≥ 0 (C-1)

ˆ We hereafter assume that (C-1) holds
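Condition (C-1) is just the comparison of Column’s two expected total payoffs; the Python sketch below makes the comparison explicit, under the accounting that after X = F Tit-for-Tat finks in the second period so that period yields 0 (the stage-game payoffs a = 2, b = −1 and prior p = 0.6 are illustrative):

```python
a, b, p = 2.0, -1.0, 0.6   # illustrative payoffs (a > 1, b < 0) and prior p

# X = C: cooperate first, learn Row's type, then fink against Tit-for-Tat
coop = (p * 1 + (1 - p) * b) + p * a
# X = F: fink first; Tit-for-Tat then finks, so the second period yields 0
fink = (p * a + (1 - p) * 0) + 0

assert coop >= fink                 # equivalent to (C-1)
assert p + (1 - p) * b >= 0         # (C-1) itself
```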

ˆ Now consider the three-period case

ˆ If Column and the rational Row both cooperate in the first period

ˆ Then the equilibrium path for the second and third periods will be given by the equilibrium of the previous
two-period game with X = C

ˆ We will derive sufficient conditions for Column and the rational Row to cooperate in the first period and
get the following three-period path, called “cooperation equilibrium”

In this equilibrium
ˆ The payoff to the rational Row is 1 + a

ˆ The expected payoff to Column is

[p · 1 + (1 − p) · 1] + [p · 1 + (1 − p)b] + [p · a + (1 − p) · 0] = 1 + p + (1 − p)b + pa

ˆ If the rational Row finks in the first period

ˆ Then it becomes common knowledge that Row is rational

ˆ So both players fink in the second and third periods

ˆ Thus, the total payoff to the rational Row from finking in the first period is a

ˆ This is less than the cooperation equilibrium payoff 1 + a

ˆ The rational Row has no incentive to deviate from the strategy of the cooperation equilibrium

ˆ We next consider whether Column has an incentive to deviate

ˆ If Column finks in the first period then

– Tit-for-Tat will fink in the second period


– the rational Row will fink in the second period because Column is sure to fink in the last period

ˆ Having finked in the first period, Column must then decide whether to fink or cooperate in the second
period

ˆ If Column finks in the second period, then Tit-for-Tat will fink in the third period

ˆ The play will be as follows

ˆ Column’s payoff from this deviation is a

ˆ This is less than Column’s expected payoff in the cooperation equilibrium provided that

1 + p + (1 − p)b + pa ≥ a

ˆ Given (C-1), a sufficient condition for Column not to play this deviation is

1 + pa ≥ a (C-2)

ˆ Alternatively, Column could deviate by finking in the first period but cooperating in the second

ˆ In which case Tit-for-Tat would cooperate in the third period

ˆ The play would be as follows:

ˆ Column’s expected payoff from this deviation is a + b + pa

ˆ This is less than Column’s expected payoff in the cooperation equilibrium provided that

1 + p + (1 − p)b + pa ≥ a + b + pa

ˆ Given (C-1), a sufficient condition for Column not to play this deviation is

a+b≤1 (C-3)

ˆ We have shown that if (C-1), (C-2) and (C-3) hold

ˆ Then the cooperation equilibrium is the equilibrium path of a PBE of the three-period Prisoners’ Dilemma

ˆ For a given value of p, the payoffs a and b satisfy these three conditions if they belong to the shaded region

ˆ As p approaches zero, this shaded region vanishes
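The three sufficient conditions can be bundled into a simple membership test for the shaded region; in the Python sketch below the payoff and prior values are illustrative:

```python
def in_region(a, b, p):
    """Check (C-1), (C-2), (C-3) for the three-period cooperation equilibrium."""
    c1 = p + (1 - p) * b >= 0   # (C-1): Column cooperates in the two-period game
    c2 = 1 + p * a >= a         # (C-2): the fink-then-fink deviation is unprofitable
    c3 = a + b <= 1             # (C-3): the fink-then-cooperate deviation is unprofitable
    return c1 and c2 and c3

assert in_region(a=1.3, b=-0.4, p=0.5)       # illustrative point inside the region
assert not in_region(a=3.0, b=-0.4, p=0.5)   # (C-2) fails: 1 + 1.5 < 3
```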

4.4 Refinements of Perfect Bayesian Equilibrium


ˆ We defined a perfect Bayesian equilibrium to be strategies and beliefs satisfying Requirements 1 through
4

ˆ We observed that in such an equilibrium no player’s strategy can be strictly dominated beginning at any
information set

ˆ We now consider two further requirements on beliefs off the equilibrium path

ˆ The first additional requirement formalizes the following idea

Since PBE prevents player i from playing a strategy that is strictly dominated beginning at any information
set, it is not reasonable for player j to believe that i would play such a strategy

ˆ To make this idea more concrete, consider the following dynamic game with incomplete information

ˆ There are two pure-strategy perfect Bayesian equilibria

(L, L′ , p = 1) and (R, R′ , p ≤ 1/2)

ˆ In (L, L′ ), player 2’s information set is on the equilibrium path, so Requirement 3 dictates that p = 1

ˆ In (R, R′ ), this information set is off the equilibrium path but Requirement 4 puts no restriction on p

ˆ We thus require only that 2’s belief p make the action R′ optimal – i.e., p ≤ 1/2

ˆ The key feature of this example is that M is a strictly dominated strategy for player 1

ˆ Thus, it is not reasonable for player 2 to believe that 1 might have played M

ˆ Formally, it is not reasonable for 1 − p to be positive, so p must equal one

ˆ Therefore, the PBE (R, R′ , p ≤ 1/2) is not reasonable leaving (L, L′ , p = 1) as the only PBE satisfying
this requirement

ˆ Although M is strictly dominated, L is not

ˆ If L were strictly dominated (for instance, if player 1’s payoff of 3 were, say, 3/2)

ˆ Then the same argument would imply that it is not reasonable for p to be positive, but this would
contradict the earlier result that p must be one

ˆ In such a case, the new requirement would not restrict player 2’s out-of-equilibrium beliefs

ˆ In the previous example, M is strictly dominated (in the whole game)

ˆ This strict dominance is too strong a test

ˆ We will require that player j should not believe that player i might have played a strategy that is strictly
dominated beginning at any information set

ˆ For example, consider the following modification of the previous game

– We expand the game in such a way that player 2 has a move preceding 1’s move and has two choices
at this initial move
– Either end the game or give the move to 1 at 1’s information set
– Now M is not any more strictly dominated because if 2 ends the game at the initial node then L,
M , and R all yield the same payoff

Definition

ˆ Consider an information set at which player i has the move

ˆ The strategy s′i is strictly dominated beginning at this information set if there exists another
strategy si such that

– for every belief that i could hold at the given information set
– for each possible combination of the other players’ subsequent strategies⁸

⁸ A “subsequent strategy” is a complete plan of action covering every contingency that might arise after the given information set
has been reached

ˆ Player i’s expected payoff from taking the action specified by si at the given information set and playing
the subsequent strategy specified by si

ˆ Is strictly greater than the expected payoff from taking the action and playing the subsequent strategy
specified by s′i

Requirement 5. If possible, each player’s beliefs off the equilibrium path should place zero probability
on nodes that are reached only if another player plays a strategy that is strictly dominated beginning at some
information set

ˆ The qualification “If possible” in Requirement 5 covers the case that would arise in the previous game if
R dominated both M and L (as would occur if player 1’s payoff of 3 were 3/2)

ˆ In such a case, Requirement 1 dictates that player 2 have a belief, but it is not possible for this belief to
place zero probability on the nodes following both M and L

ˆ So Requirement 5 would not apply

ˆ To illustrate Requirement 5, consider the following signaling game

– In the payoffs (3, 2), the payoff 3 is the Sender’s payoff

ˆ The Sender strategy (m′ , m′′ ) means that type t1 chooses a message m′ and type t2 chooses the message
m′′ , i.e., the Sender strategy m̃ = (m′ , m′′ ) is given by
m̃(t) = m′ if t = t1 , and m̃(t) = m′′ if t = t2

ˆ The Receiver strategy (a′ , a′′ ) means that the Receiver chooses action a′ following L and a′′ following R,
i.e., the Receiver strategy ã = (a′ , a′′ ) is given by
ã(m) = a′ if m = L, and ã(m) = a′′ if m = R

ˆ We can check that the strategies and beliefs

{(L, L), (u, d), p = 0.5, q}

constitute a pooling PBE for any q ≥ 1/2

ˆ The key feature of this signaling game, however, is that it makes no sense for t1 to play R

– The strategies in which t1 plays R are strictly dominated beginning at the Sender’s information set
corresponding to t1
– Showing that (R, L) and (R, R) are strictly dominated beginning at this information set amounts to
exhibiting an alternative strategy for the Sender that yields a higher payoff for t1 for each strategy
the Receiver could play
– (L, R) is such a strategy: it yields at worst 2 for t1 , whereas (R, L) and (R, R) yield at best 1

ˆ The t1 -node in the Receiver’s information set following R can be reached only if the Sender plays a strategy
that is strictly dominated

ˆ Furthermore, the t2 -node in the Receiver’s information set following R can be reached by a strategy that
is not strictly dominated beginning at an information set, namely (L, R)

ˆ Requirement 5 dictates that q = 0

ˆ Since {(L, L), (u, d), p = 0.5, q} is a PBE only if q ≥ 1/2, such an equilibrium cannot satisfy Requirement
5

ˆ An equivalent way to impose Requirement 5 on the signaling game is as follows

Definition 43. In a signaling game, the message mj from M is dominated for type ti from T if there
exists another message mj ′ from M such that ti ’s lowest possible payoff from mj ′ is greater than ti ’s highest
possible payoff from mj :

min over ak ∈ A of US (ti , mj ′ , ak ) > max over ak ∈ A of US (ti , mj , ak )

Signaling Requirement 5. If the information set following mj is off the equilibrium path and mj is domi-
nated for type ti then (if possible) the Receiver’s belief µ(ti |mj ) should place zero probability on type ti

ˆ This is possible provided mj is not dominated for all types in T

ˆ The separating PBE


{(L, R), (u, u), p = 1, q = 0}

satisfies Signaling Requirement 5 trivially because there are no information sets off this equilibrium path

ˆ Suppose now that the Receiver’s payoffs when type t2 plays R are reversed:

– 1 from playing d and 0 from playing u

ˆ Now
{(L, L), (u, d), p = 0.5, q}

is a pooling PBE for any value of q

ˆ So
{(L, L), (u, d), p = 0.5, q = 0}

is a pooling PBE satisfying Requirement 5

ˆ In some games, there are perfect Bayesian equilibria that seem unreasonable but nonetheless satisfy
Requirement 5

ˆ Cho and Kreps (QJE 1987) proposed an additional refinement

ˆ We propose to discuss three aspects of their paper

1. the “Beer and Quiche” signaling game, which illustrates that unreasonable perfect Bayesian equilibria
can satisfy Signaling Requirement 5
2. a stronger version of Signaling Requirement 5, called the Intuitive Criterion
3. the application of the Intuitive Criterion to Spence’s job-market signaling game

The Beer and Quiche game

ˆ The Sender is one of two types

– “wimpy” (timid, coward, unadventurous) with probability 0.1


– “surly” (unfriendly, hostile, bad-tempered, threatening) with probability 0.9

ˆ The Sender’s message is the choice of whether to have beer or quiche for breakfast

ˆ The Receiver’s action is the choice of whether or not to duel with the Sender

ˆ The qualitative feature of the payoffs are that

– the wimpy type would prefer to have quiche for breakfast, the surly would prefer to have beer
– both types would prefer not to duel with the Receiver (and care about this more than about which
breakfast they have)
– the Receiver would prefer to duel with the wimpy type but not to duel with the surly type

ˆ In this game,

{m∗ , a∗ , p = 0.1, q}

with m∗ (t) = Quiche for every type t, and a∗ (m) = not if m = Quiche, duel if m = Beer,

is a pooling PBE for any q ≥ 1/2

ˆ This equilibrium satisfies Signaling Requirement 5, because Beer is not dominated for either Sender type

ˆ The Receiver’s belief off the equilibrium path does seem suspicious

ˆ If the Receiver unexpectedly observes Beer then the Receiver concludes that the Sender is at least as
likely to be wimpy as surly (i.e., q ≥ 1/2) even though

(a) the wimpy type cannot possibly improve on the equilibrium payoff of 3 by having Beer rather than
Quiche
(b) the surly type could improve on the equilibrium payoff of 2, by receiving the payoff of 3 that would
follow if the Receiver held a belief q < 1/2

ˆ Given (a) and (b), one might expect the surly type to choose Beer and then make the following speech:
Seeing me choose Beer should convince you that I am the surly type:

– choosing Beer could not possibly have improved the lot of the wimpy type, by (a)
– if choosing Beer will convince you that I am the surly type then doing so will improve my lot, by (b)

ˆ If such a speech is believed, it dictates that q = 0, which is incompatible with this pooling PBE
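This reasoning can be mechanized. The Python sketch below uses the standard Cho–Kreps payoffs (an assumption here: the preferred breakfast is worth 1 and avoiding a duel is worth 2, consistent with the equilibrium payoffs 3 and 2 cited in (a) and (b)) to check for which type Beer is equilibrium-dominated:

```python
# Sender payoffs in Beer and Quiche, indexed by (type, message, Receiver action).
# These specific numbers are an assumption consistent with the text's (a) and (b).
payoff = {
    ('wimpy', 'Quiche', 'not'): 3, ('wimpy', 'Quiche', 'duel'): 1,
    ('wimpy', 'Beer', 'not'): 2,   ('wimpy', 'Beer', 'duel'): 0,
    ('surly', 'Beer', 'not'): 3,   ('surly', 'Beer', 'duel'): 1,
    ('surly', 'Quiche', 'not'): 2, ('surly', 'Quiche', 'duel'): 0,
}
eq_payoff = {'wimpy': 3, 'surly': 2}   # payoffs in the pooling-on-Quiche equilibrium

def equilibrium_dominated(t, m):
    # m is equilibrium-dominated for t if even the best response to m pays less
    return eq_payoff[t] > max(payoff[(t, m, a)] for a in ('not', 'duel'))

assert equilibrium_dominated('wimpy', 'Beer')       # (a): wimpy cannot gain from Beer
assert not equilibrium_dominated('surly', 'Beer')   # (b): surly could gain from Beer
```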

Definition 44. Given a PBE in a signaling game, the message mj from M is equilibrium-dominated
for type ti from T if ti ’s equilibrium payoff, denoted by U ∗ (ti ), is greater than ti ’s highest possible payoff
from mj :

U ∗ (ti ) > max over ak ∈ A of US (ti , mj , ak )

Signaling Requirement 6. If the information set following mj is off the equilibrium path and mj is
equilibrium-dominated for type ti then (if possible) the Receiver’s belief µ(ti |mj ) should place zero probability on
type ti . This is possible provided mj is not equilibrium-dominated for all types in T

ˆ “Beer and Quiche” shows that a message mj can be equilibrium-dominated for ti without being dominated
for ti

ˆ If mj is dominated for ti , however, then mj must be equilibrium-dominated for ti

ˆ So imposing Signaling Requirement 6 makes Signaling Requirement 5 redundant

ˆ Arguments in this spirit are sometimes said to use forward induction

– because interpreting a deviation – i.e., in forming the belief µ(ti |mj ) – the Receiver asks whether the
Sender’s past behavior could have been rational
– whereas backwards induction assumes that future behavior will be rational

Spence’s job-market signaling game

ˆ Consider the envy case of the job-market signaling model

ˆ There is an enormous number of pooling, separating and hybrid perfect Bayesian equilibria in this model

ˆ Only one is consistent with Signaling Requirement 6

– ti = L chooses e∗ (L)
– ti = H chooses es

ˆ Remember that, in any PBE, worker’s wage is

w(e) = µ(H|e) · y(H, e) + (1 − µ(H|e)) · y(L, e)

ˆ Because
y[L, e∗ (L)] − c[L, e∗ (L)] > w(e) − c(L, e) ∀e > es

any education level e > es is dominated for the low-ability type

– in terms of Signaling Requirement 5


– therefore, µ(H|e) = 1 for all e > es

ˆ There is no other separating PBE satisfying Signaling Requirement 5

ˆ For any PBE with e(H) = ê, ê > es , a deviation would be to choose e ∈ [es , ê)

– see the previous figure

ˆ in any equilibrium that satisfies Signaling Requirement 5, type-H’s utility must be at least

y(H, es ) − c(H, es )

otherwise, the worker would deviate to (w, e) = (y(H, es ), es )

ˆ Some pooling and hybrid equilibria cannot satisfy Signaling Requirement 5

ˆ There are two cases, depending on whether q is low enough that the wage function

w = q · y(H, e) + (1 − q) · y(L, e)

lies below the high-ability worker’s indifference curve through the point [es , y(H, es )]:

(a) q is low enough that it does
(b) q is not low enough, so it does not

ˆ First, consider case (a):

ˆ no pooling equilibria satisfy Signaling Requirement 5

– type-H worker cannot achieve the utility y(H, es ) − c(H, es ) in such an equilibrium

ˆ no hybrid equilibrium in which the type-H worker does the randomizing satisfies Signaling Requirement 5

– the point (e, w) at which pooling occurs in such an equilibrium lies below the wage function w =
q · y(H, e) + (1 − q) · y(L, e)

ˆ no hybrid equilibrium in which the type-L worker does the randomizing satisfies Signaling Requirement 5

– the point (e, w) at which pooling occurs in such an equilibrium must be on the type-L’s indifference
curve through the point [e∗ (L), w∗ (L)]
– and so lies below the type-H’s indifference curve through the point [es , y(H, es )]

Remark 29. In this case, there is only one PBE satisfying Signaling Requirement 5.

ˆ Second, consider the case (b)

ˆ no hybrid equilibrium in which the type-L worker does the randomizing satisfies Signaling Requirement 5

– the point (e, w) at which pooling occurs in such an equilibrium must be on the type-L’s indifference
curve through the point [e∗ (L), w∗ (L)]
– and so lies below the type-H’s indifference curve through the point [es , y(H, es )]

ˆ pooling and hybrid equilibria in which the type-H worker does the randomizing can satisfy this requirement

– if the pooling occurs at a point (e, w) in the shaded region of the figure

ˆ However, such equilibria cannot satisfy Signaling Requirement 6

ˆ Consider a pooling equilibrium at ep shown in the figure

ˆ Education choices e > e′ are equilibrium-dominated for type-L worker

– even the highest wage that could be paid to a worker with education e, y(H, e)
– yields an (e, w) point below the type-L’s indifference curve through the point (ep , wp )

ˆ Education choices between e′ and e′′ are not equilibrium-dominated for the type-H worker

– if such a choice convinces the firms that the worker has high ability,
– then the firms will offer the wage y(H, e)
– which will make type-H better off than in the indicated pooling equilibrium

ˆ Thus, if e′ < e < e′′ , Signaling Requirement 6 implies µ(H|e) = 1

ˆ This in turn implies that the indicated pooling equilibrium cannot satisfy Signaling Requirement 6

Observação 30. This argument can be repeated for all the pooling and hybrid equilibria in the shaded
region in the figure

ˆ so the only PBE that satisfies Signaling Requirement 6 is the separating equilibrium previously discussed

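The equilibrium-dominance argument can be illustrated numerically. The functional forms below are illustrative assumptions, not the ones used in the text: output is independent of education, y(L, e) = 1 and y(H, e) = 2, costs are c(L, e) = e² and c(H, e) = e²/2, the prior is q = 1/2, and pooling occurs at ep = 0.

```python
# Illustrative (assumed) primitives: y(L,e) = 1, y(H,e) = 2,
# c(L,e) = e^2, c(H,e) = e^2 / 2, prior q = 1/2, pooling at e_p = 0.
q = 0.5
y_L, y_H = 1.0, 2.0
w_p = q * y_H + (1 - q) * y_L      # pooling wage: 1.5; pooling payoff at e_p = 0

# e is equilibrium-dominated for type L once even the best possible wage
# y(H,e) leaves L below the pooling payoff: 2 - e^2 < 1.5  <=>  e > e1.
e1 = (y_H - w_p) ** 0.5            # plays the role of e'

# e is NOT equilibrium-dominated for type H while 2 - e^2/2 > 1.5 <=> e < e2.
e2 = (2 * (y_H - w_p)) ** 0.5      # plays the role of e''

# For any e in (e1, e2), Signaling Requirement 6 forces mu(H|e) = 1, the
# firms pay y(H,e) = 2, and the type-H worker gains 2 - e^2/2 > w_p = 1.5,
# so the pooling equilibrium at e_p = 0 fails the requirement.
print(e1 < e2)   # the interval (e', e'') is nonempty: prints True
```

With these numbers, e′ = √0.5 ≈ 0.71 and e″ = 1, so the interval of pooling-breaking deviations is nonempty, exactly as the figure-based argument asserts.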
Special Topics
Financial Instability (Bank Runs)
To be written.

Stable Marriages (Matching)


Matching: the Gale and Shapley (1962) algorithm

Consider two groups of agents, group M = {m1 , m2 , · · · , mn } and group W = {w1 , w2 , · · · , wn }, and define N := {1, 2, · · · , n}.
The goal is to associate each element of M with one, and only one, element of W. That is, we want to construct an injective function x : M → W.
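In code (a sketch, not part of the original notes), such an assignment for n = 3 can be stored as a dict and its injectivity checked directly:

```python
# A candidate assignment x : M -> W for n = 3, stored as {i: j},
# meaning x(m_i) = w_j.
x = {1: 2, 2: 1, 3: 3}

def is_injective(x):
    """x is injective iff no two m's are sent to the same w."""
    return len(set(x.values())) == len(x)

print(is_injective(x))                    # True
print(is_injective({1: 2, 2: 2, 3: 3}))   # False: m1 and m2 share w2
```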
For every a ∈ {m, w} and every i ∈ N , the preference relation of agent ai is denoted by ≻ai . The relation ≻ai is assumed to be strict, complete, and transitive. That is,
* for every wi ∈ W,

[(mk ≻wi ml ) ∨ (ml ≻wi mk )] ∧ ¬[(mk ≻wi ml ) ∧ (ml ≻wi mk )], ∀k, l ∈ N (1)
(mj ≻wi mk ) ∧ (mk ≻wi ml ) ⇒ (mj ≻wi ml ), ∀j, k, l ∈ N (2)

* for every mi ∈ M,

[(wk ≻mi wl ) ∨ (wl ≻mi wk )] ∧ ¬[(wk ≻mi wl ) ∧ (wl ≻mi wk )], ∀k, l ∈ N (3)
(wj ≻mi wk ) ∧ (wk ≻mi wl ) ⇒ (wj ≻mi wl ), ∀j, k, l ∈ N (4)

**Obs.:** In the notation above, ∨ denotes the disjunction "or", ∧ denotes the conjunction "and", and ¬ denotes the negation "not".
Hence, the preference relation of ai (denoted by ≻ai ) can be represented by a permutation of the set N .
**Example 1:** Suppose n = 3, a = w, and i = 2. Then N = {1, 2, 3} and ai = w2 . The set of possible preference relations of w2 is

Rw2 = {(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)}. (5)

The preference relation ≻w2 = (2, 1, 3), for example, means that agent w2 considers
* option m2 strictly better than option m1 ;
* option m2 strictly better than option m3 ;
* option m1 strictly better than option m3 .

In other words,
* whenever m2 is an available option for w2 , w2 chooses m2 ;
* w2 chooses m1 only when m2 is unavailable and m1 is available;
* w2 chooses m3 only when neither m2 nor m1 is available.
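The choice rule described in Example 1 can be sketched in Python. The encoding below — a permutation stored as a tuple and scanned left to right — is an illustrative convention, not code from the notes:

```python
# Preference of w2 over {m1, m2, m3}, encoded as the permutation (2, 1, 3):
# entries are listed from most to least preferred.
pref_w2 = (2, 1, 3)

def best_available(pref, available):
    """Index of the most preferred option in `available` under `pref`."""
    for option in pref:        # scan in decreasing order of preference
        if option in available:
            return option
    return None                # nothing is available

print(best_available(pref_w2, {1, 2, 3}))  # m2 whenever available: 2
print(best_available(pref_w2, {1, 3}))     # m2 gone, m1 available: 1
print(best_available(pref_w2, {3}))        # only m3 left: 3
```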

**Example 2 (full description at [ams.org](http://www.ams.org/samplings/feature-column/fc-2015-03)):**
Suppose n = 4 and that the preference relations of mi and wi are given by the i-th row of the matrices M and W , respectively.

    ⎡ w1 w2 w3 w4 ⎤         ⎡ m4 m3 m1 m2 ⎤
M = ⎢ w1 w4 w3 w2 ⎥ ,   W = ⎢ m2 m4 m1 m3 ⎥         (6)
    ⎢ w2 w1 w3 w4 ⎥         ⎢ m4 m1 m2 m3 ⎥
    ⎣ w4 w2 w3 w1 ⎦         ⎣ m3 m2 m1 m4 ⎦

For example, the preference of m3 is given by ≻m3 = (2, 1, 3, 4).


The algorithm:
The goal of the algorithm is to compute the injective function x : M → W. This function is obtained as the limit of a sequence of functions {x0 , x1 , . . . , xT } with xt : M → W for every t. The value of T is chosen so that xT = xT −1 . The algorithm proceeds as follows:
Step 0: The first element of the sequence, x0 , is chosen in this step. Each m ∈ M is associated with the element of W most preferred by m, i.e.,

x0 (m) = min{≻m^(i) : wi ∈ W}, ∀m ∈ M, (7)

where ≻m^(i) denotes the i-th entry of ≻m .

ˆ Step 0.1: Consider an arbitrary element w ∈ W. The set of elements of M associated with w via x0 is

Mw^0 = {m ∈ M : x0 (m) = w} ⊆ M. (8)

From this set, w picks its most preferred element, min{≻w^(i) : mi ∈ Mw^0 }, and rejects the others.

ˆ Step 0.2: Consider an arbitrary element m ∈ M. The set of elements of W that have not rejected m is

Wm^0 = W \ {w ∈ W : (x0 (m) = w) ∧ (m ≠ min{≻w^(i) : mi ∈ Mw^0 })} ⊆ W. (9)

Step k > 0: Each m ∈ M is associated with the element of Wm^{k−1} most preferred by m, i.e.,

xk (m) = min{≻m^(i) : wi ∈ Wm^{k−1} }, ∀m ∈ M. (10)

ˆ Step k.1: Consider an arbitrary element w ∈ W. The set of elements of M associated with w via xk is

Mw^k = {m ∈ M : xk (m) = w} ⊆ M.

From this set, w picks its most preferred element, min{≻w^(i) : mi ∈ Mw^k }, and rejects the others.

ˆ Step k.2: Consider an arbitrary element m ∈ M. The set of elements of W that have not yet rejected m is

Wm^k = Wm^{k−1} \ {w ∈ W : (xk (m) = w) ∧ (m ≠ min{≻w^(i) : mi ∈ Mw^{k−1} })} ⊆ W.

Final step: The process continues until the iteration k at which xk (m) = xk−1 (m) for every m ∈ M. At this step, the algorithm sets x(m) = xk (m) for every m ∈ M.

Implementation in Python:

The implementation below uses the preference relations from Example 2 above.

import numpy as np

def initiate_Wmt():
    "Initialize the set of w's that have not yet rejected m as the set of all w's"
    Wm_t = {}
    for m in N:
        Wm_t[m-1] = set()
        for w in N:
            Wm_t[m-1].add(w)
    print('Wm_t =', Wm_t)
    return Wm_t

def compute_xt(Wm_t):
    "Given the set $W_m^t$ of w's that have not yet rejected m, compute $x_t(m)$"
    x_t = np.zeros((1, n), dtype=np.int8)
    for m in N:
        tt = n
        for w in Wm_t[m-1]:
            for ii in range(tt):
                if M[m-1, ii] == w:
                    x_t[0, m-1] = w
                    tt = ii
                    break
    print('x_t =', x_t)
    return x_t

def update_Mwt(x_t):
    "Given the assignment $x_t$, compute the set of m's assigned to each w"
    Mw_t = {}
    for w in N:
        Mw_t[w-1] = set()
        for m in N:
            if x_t[0, m-1] == w:
                Mw_t[w-1].add(m)
    print('Mw_t =', Mw_t)
    return Mw_t

def compute_xxt(Mw_t):
    "Given the set of m's assigned to each w, compute w's acceptance $xx_t$"
    xx_t = np.zeros((1, n), dtype=np.int8)
    for w in N:
        tt = n
        for m in Mw_t[w-1]:
            for ii in range(tt):
                if W[w-1, ii] == m:
                    xx_t[0, w-1] = m
                    tt = ii
                    break
    print('xx_t =', xx_t)
    return xx_t

def update_Wmt(x_t, xx_t, Wmt):
    "Using the assignment $x_t$ and w's acceptance $xx_t$, update the set of w's that have not yet rejected m"
    Wm_t = {}
    for m in N:
        Wm_t[m-1] = set()
        for w in Wmt[m-1]:
            if x_t[0, m-1] != w or m == xx_t[0, w-1]:
                Wm_t[m-1].add(w)
    print('Wm_t =', Wm_t)
    return Wm_t

# THIS IS THE MAIN PROGRAM...

n = 4

print(' Definição de parâmetros...')

N = np.linspace(1, n, n, endpoint=True, dtype=np.int8)
print('N =', N, end='\n\n')

# Initialize the preferences of each group of agents
M = np.array([[1, 2, 3, 4], [1, 4, 3, 2], [2, 1, 3, 4], [4, 2, 3, 1]])
W = np.array([[4, 3, 1, 2], [2, 4, 1, 3], [4, 1, 2, 3], [3, 2, 1, 4]])

print('M =')
print(M, end='\n\n')
print('W =')
print(W, end='\n\n')

print(' Inicialização de objetos...')

# Initialize the set of w's that have not yet rejected the m's
Wm_t = initiate_Wmt()

# Initialize the assignments: $x_0$
x_t = compute_xt(Wm_t)

# Initialize the set of m's assigned to each w
Mw_t = update_Mwt(x_t)

# Initialize each w's choice of m
xx_t = compute_xxt(Mw_t)

print('\n\nIteração até convergência...')

norm, it = 1, 0
while norm != 0:
    it += 1
    print('\nIteração', it)
    # Update the set of w's that have not yet rejected the m's
    Wm_t = update_Wmt(x_t, xx_t, Wm_t)

    # Store current proposals $x_t$ and compute $x_{t+1}$
    old_xt = np.copy(x_t)
    x_t = compute_xt(Wm_t)

    # Update $M_w^t$
    Mw_t = update_Mwt(x_t)

    # Update $xx_t$
    xx_t = compute_xxt(Mw_t)

    # Update norm
    norm = max(abs(x_t - old_xt)[0])
    print('norma =', norm, end='\n\n')

The program produces the following output:

Definição de parâmetros...
N = [1 2 3 4]

M =
[[1 2 3 4]
 [1 4 3 2]
 [2 1 3 4]
 [4 2 3 1]]

W =
[[4 3 1 2]
 [2 4 1 3]
 [4 1 2 3]
 [3 2 1 4]]

Inicialização de objetos...
Wm_t = {0: {1, 2, 3, 4}, 1: {1, 2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3, 4}}
x_t = [[1 1 2 4]]
Mw_t = {0: {1, 2}, 1: {3}, 2: set(), 3: {4}}
xx_t = [[1 3 0 4]]

Iteração até convergência...

Iteração 1
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3, 4}}
x_t = [[1 4 2 4]]
Mw_t = {0: {1}, 1: {3}, 2: set(), 3: {2, 4}}
xx_t = [[1 3 0 2]]
norma = 3

Iteração 2
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3}}
x_t = [[1 4 2 2]]
Mw_t = {0: {1}, 1: {3, 4}, 2: set(), 3: {2}}
xx_t = [[1 4 0 2]]
norma = 2

Iteração 3
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[1 4 1 2]]
Mw_t = {0: {1, 3}, 1: {4}, 2: set(), 3: {2}}
xx_t = [[3 4 0 2]]
norma = 1

Iteração 4
Wm_t = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[2 4 1 2]]
Mw_t = {0: {3}, 1: {1, 4}, 2: set(), 3: {2}}
xx_t = [[3 4 0 2]]
norma = 1

Iteração 5
Wm_t = {0: {3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[3 4 1 2]]
Mw_t = {0: {3}, 1: {4}, 2: {1}, 3: {2}}
xx_t = [[3 4 1 2]]
norma = 1

Iteração 6
Wm_t = {0: {3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[3 4 1 2]]
Mw_t = {0: {3}, 1: {4}, 2: {1}, 3: {2}}
xx_t = [[3 4 1 2]]
norma = 0
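As a sanity check (not part of the original notes), the limit matching x_t = [3, 4, 1, 2] — that is, m1 → w3, m2 → w4, m3 → w1, m4 → w2 — can be verified to be stable under the preferences of Example 2: no pair (m, w) prefers each other to their assigned partners.

```python
# Preference matrices from Example 2 (row i = preference list of m_i / w_i).
M = [[1, 2, 3, 4], [1, 4, 3, 2], [2, 1, 3, 4], [4, 2, 3, 1]]
W = [[4, 3, 1, 2], [2, 4, 1, 3], [4, 1, 2, 3], [3, 2, 1, 4]]

# Matching produced by the algorithm: m1->w3, m2->w4, m3->w1, m4->w2.
x = {1: 3, 2: 4, 3: 1, 4: 2}

def rank(pref_row, option):
    """Position of `option` in a preference list (0 = most preferred)."""
    return pref_row.index(option)

def is_stable(x, M, W):
    """True iff no pair (m, w) prefers each other to their assigned partners."""
    partner_of_w = {w: m for m, w in x.items()}
    for m, w_m in x.items():
        for w in range(1, 5):
            if w == w_m:
                continue
            m_w = partner_of_w[w]
            # (m, w) blocks x if m prefers w to w_m AND w prefers m to m_w.
            if rank(M[m-1], w) < rank(M[m-1], w_m) and rank(W[w-1], m) < rank(W[w-1], m_w):
                return False
    return True

print(is_stable(x, M, W))   # True: the computed matching has no blocking pair
```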

References
J. Bertrand. Review of Recherches sur le principe mathématique de la théorie des richesses. Journal des Savants,
499, 1883.

I.-K. Cho and D. M. Kreps. Signaling games and stable equilibria. The Quarterly Journal of Economics, 102
(2):179–221, 1987.

A.-A. Cournot. Recherches sur les principes mathématiques de la théorie des richesses par Augustin Cournot.
chez L. Hachette, 1838.

D. W. Diamond and P. H. Dybvig. Bank runs, deposit insurance, and liquidity. Journal of Political Economy,
91(3):401–419, 1983.

M. P. Espinosa and C. Rhee. Efficient wage bargaining as a repeated game. The Quarterly Journal of Economics,
104(3):565–588, 1989.

J. W. Friedman. A non-cooperative equilibrium for supergames. The Review of Economic Studies, 38(1):1–12,
1971.

D. Gale and L. S. Shapley. College admissions and the stability of marriage. The American Mathematical
Monthly, 69(1):9–15, 1962.

R. Gibbons. Game Theory for Applied Economists. Princeton University Press, 1992. ISBN 9781400835881.

J. C. Harsanyi. Games with incomplete information played by "Bayesian" players, I–III. Part II. Bayesian
equilibrium points. Management Science, pages 320–334, 1968.

J. C. Harsanyi. Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points.
International Journal of Game Theory, 2(1):1–23, 1973.

E. P. Lazear and S. Rosen. Rank-order tournaments as optimum labor contracts. Journal of Political Economy,
89(5):841–864, 1981.

W. Leontief. The pure theory of the guaranteed annual wage contract. Journal of Political Economy, 54(1):
76–79, 1946.

J. F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1):
48–49, 1950.

H. von Stackelberg. Marktform und Gleichgewicht. J. Springer, 1934.

