
Lecture Notes (Gibbons, 1992) - Game Theory

J. Bertolai

September 26, 2017

Contents
Game Theory: An Overview
    An example
    Economic Theory and Game Theory

Ch. 1 - Static Games of Complete Information
    1.1 Normal form games and Nash equilibrium
    1.2 Applications
    1.3 Mixed strategies and existence of equilibrium

Ch. 2 - Dynamic Games of Complete Information
    2.1 Dynamic games of complete and perfect information
    2.2 Two-stage games of complete but imperfect information
    2.3 Repeated games
    2.4 Dynamic games of complete but imperfect information

Ch. 3 - Static Games of Incomplete Information
    3.1 Static Bayesian games and Bayesian Nash equilibrium
    3.2 Applications
    3.3 The Revelation Principle

Ch. 4 - Dynamic Games of Incomplete Information
    4.1 Introduction to Perfect Bayesian equilibrium
    4.2 Signaling Games
    4.3 Other applications of Signaling Games
    4.4 Refinements of Perfect Bayesian Equilibrium

Special Topics
    Financial Instability (Bank runs)
    Stable Marriages (Matching)

References
Game Theory: An Overview

An example

Remark (Game Theory). Game theory provides predictions about how individuals will behave under a given set of rules (an institution).

Consider the following game:


                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

Question 1. What can we expect about the prisoners' behavior?

• confessing is best for Prisoner 1, no matter what Prisoner 2 does

• confessing is best for Prisoner 2, no matter what Prisoner 1 does

Prediction 1. (Confess, Confess) is a good prediction of the individuals' behavior.

"Equilibrium in Dominant Strategies"

Consider the following (new) game:


                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −3,  0
             Confess        0, −3   −6, −6

                   Prisoner's Dilemma

• the best choice for Prisoner 1 is: confess, if 1 expects that 2 will not confess; not confess, if 1 expects that 2 will confess

• the best choice for Prisoner 2 is: confess, if 2 expects that 1 will not confess; not confess, if 2 expects that 1 will confess

Prediction 2. There are two good predictions for the individuals' behavior: (not Confess, Confess) and (Confess, not Confess).

"Nash Equilibria"

Consider the following (new) game:

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −3, −3
             Confess       −3, −3   −6, −6

                   Prisoner's Dilemma

• the best choice for Prisoner 1 is: not confess, whether 1 expects 2 to confess or not

• the best choice for Prisoner 2 is: not confess, whether 2 expects 1 to confess or not

Prediction 3. There is only one good prediction for the individuals' behavior:

"Nash Equilibrium": (not Confess, not Confess)

Consider the following game (the general case):

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess    a, a     b, c
             Confess        c, b     d, d

                   Prisoner's Dilemma

Question 2. What can we expect about the prisoners' behavior?



• the best response for Prisoner 1 is:

  – not confess, if 1 expects that 2 will not confess and a ≥ c
  – confess, if 1 expects that 2 will not confess and a ≤ c
  – not confess, if 1 expects that 2 will confess and b ≥ d
  – confess, if 1 expects that 2 will confess and b ≤ d

• the best response for Prisoner 2 is analogous, given the symmetry of the game

• a good prediction (a Nash equilibrium) will be:

  – (nC, nC) if a − c ≥ 0
  – (nC, C) if a − c ≤ 0 and b − d ≥ 0
  – ( C, nC) if a − c ≤ 0 and b − d ≥ 0
  – ( C, C) if b − d ≤ 0
[Figure: Nash equilibria in the (x, y) = (a − c, b − d) plane: (nC, C) and (C, nC) in the region x ≤ 0, y ≥ 0; (nC, nC) wherever x ≥ 0; (C, C) wherever y ≤ 0; both (nC, nC) and (C, C) in the region x ≥ 0, y ≤ 0.]

Formalization:

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess    a, a     b, c
             Confess        c, b     d, d

                   Prisoner's Dilemma

• the set of players (prisoners) is given by I := {1, 2}

• the strategy set of player i ∈ I is given by Si := {nC, C}





• for each (s1, s2) ∈ S1 × S2, the payoff of player i ∈ I is

  ui(s1, s2) =  a  if (si, s−i) = (nC, nC)
                b  if (si, s−i) = (nC, C)
                c  if (si, s−i) = ( C, nC)
                d  if (si, s−i) = ( C, C)

• the set of player i's best responses to the conjecture s−i is given by

  Ri(s−i) := arg max_{σ∈Si} ui(σ, s−i)

Definition 1. The strategy profile (s1, s2) is a Nash equilibrium if s1 ∈ R1(s2) and s2 ∈ R2(s1). Equivalently, each si is a fixed point of Ri ∘ R−i:

  s1 ∈ R1(R2(s1)) and s2 ∈ R2(R1(s2))
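Definition 1 can be checked mechanically. The sketch below, in the spirit of the Python examples used later in these notes, computes the best-response sets Ri and enumerates the Nash equilibria of the general 2×2 game; the function names and the test payoffs are illustrative assumptions, not part of the text.

```python
# Sketch: best responses and Nash equilibria of the general 2x2 game above.
S = ('nC', 'C')  # strategy set of each player

def u(si, s_other, a, b, c, d):
    """Payoff of a player choosing si when the other chooses s_other."""
    return {('nC', 'nC'): a, ('nC', 'C'): b,
            ('C', 'nC'): c, ('C', 'C'): d}[(si, s_other)]

def R(s_other, a, b, c, d):
    """Best-response set R_i(s_-i): the argmax over own strategies."""
    best = max(u(s, s_other, a, b, c, d) for s in S)
    return {s for s in S if u(s, s_other, a, b, c, d) == best}

def nash(a, b, c, d):
    """Profiles (s1, s2) with s1 in R1(s2) and s2 in R2(s1)."""
    return [(s1, s2) for s1 in S for s2 in S
            if s1 in R(s2, a, b, c, d) and s2 in R(s1, a, b, c, d)]

# Original Prisoner's Dilemma: (a, b, c, d) = (-1, -9, 0, -6)
print(nash(-1, -9, 0, -6))  # -> [('C', 'C')]
```

Running it on the three example games reproduces Predictions 1-3 above.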

Theorem 1 (Nash et al. (1950)). In the n-player normal-form game

  G = {S1, S2, · · · , Sn; u1, u2, · · · , un},

• if n is finite and Si is finite for every i,

then there is at least one Nash equilibrium (possibly involving mixed strategies).

Mechanism Design

Question 3. What should the prisoners' possible sentences (a, b, c and d) be when society wants them to reveal the truth (C, C) and

• no one can be imprisoned for more than 1 year without some confession (a ≥ −1);

• no one can be imprisoned for more than 2 years on the basis of witness testimony (b ≥ −2);

• no one can be imprisoned for more than 10 years on the basis of a confession (c ≥ −10 and d ≥ −10); and

• society wishes to maximize −a − b − c − d?

                    P2
               nC        C
P1   nC      a, a      b, c
     C       c, b      d, d

     Prisoner's Dilemma

That is, how should the optimal truth-revealing mechanism be designed?

• if there were evidence of both prisoners' guilt, society would choose (a, b, c, d) = (−1, −2, −10, −10)

• since there is no evidence of guilt, no one will confess the crime if (a, b, c, d) = (−1, −2, −10, −10)

[Figure: under (a, b, c, d) = (−1, −2, −10, −10), the point (x, y) = (a − c, b − d) = (9, 8) lies in the region of the diagram where (nC, nC) is the unique Nash equilibrium.]

Question 4. How can the prisoners be persuaded to confess?

• the two individuals both confess only when y = b − d ≤ 0

  – this restriction is called the incentive constraint

• sentences are bounded (a ≥ −1, b ≥ −2 and c, d ≥ −10)

  – this restriction is called the feasibility constraint

• the optimal mechanism (a, b, c, d) solves

  max_{(a,b,c,d)} −(a + b + c + d)  s.t.  a ≥ −1,  b ≥ −2,  c, d ≥ −10,  b ≤ d,

that is, it is given by m∗ := (a∗, b∗, c∗, d∗) = (−1, −2, −10, −2)
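As a sanity check, the optimization above can be brute-forced. The sketch below assumes, for simplicity, that sentences are integer numbers of years in [−10, 0] (the text allows real values, but the optimum happens to be an integer vector):

```python
from itertools import product

# Enumerate integer sentence vectors (a, b, c, d) in [-10, 0]^4 satisfying
# the feasibility constraints and the incentive constraint b <= d,
# and keep the one maximizing -(a + b + c + d).
best, best_val = None, float('-inf')
for a, b, c, d in product(range(-10, 1), repeat=4):
    if a >= -1 and b >= -2 and c >= -10 and d >= -10 and b <= d:
        val = -(a + b + c + d)
        if val > best_val:
            best, best_val = (a, b, c, d), val

print(best, best_val)  # -> (-1, -2, -10, -2) 15
```

The search confirms m∗ = (−1, −2, −10, −2): a and c hit their feasibility bounds, while the incentive constraint b ≤ d forces d up to −2.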

Observation 1. "Society designs the mechanism to induce confession (C, C), but may end up in a situation (nC, nC) worse than expected."

Proof. Under the optimal mechanism m∗ = (−1, −2, −10, −2) we have

  x = a∗ − c∗ = 9 ≥ 0 and y = b∗ − d∗ = 0 ≤ 0

and, therefore, there are two Nash equilibria: (nC, nC) and (C, C).

[Figure: the point (x, y) = (9, 0) lies on the boundary of the diagram where both (nC, nC) and (C, C) are Nash equilibria.]

Application: Bank runs in equilibrium

• There are two individuals in the economy, called depositors.

• The depositors live for 3 periods, t = 0, 1, 2:

  – initial period (date 0)
  – short run (date 1)
  – long run (date 2)

They derive utility u(c1 + c2) = c1 + c2, where

  – c1 is consumption in period t = 1
  – c2 is consumption in period t = 2

and each has an initial endowment (at t = 0) of D units of resources.

• The individuals participate in a three-period banking arrangement:

  – investment decision (date 0)
  – short run (date 1)
  – long run (date 2)

• At date zero both deposit D units of resources in the bank.

• The bank receives 2D units of resources and places them in an investment that yields:

  – 2r units of resources at date 1 if liquidated in the short run
  – 2R units of resources at date 2 if liquidated in the long run

• If both depositors withdraw at date 1,

  – the investment is liquidated in the short run
  – each depositor receives r

• If only one of the depositors withdraws at date 1,

  – the investment is liquidated in the short run
  – the depositor who withdraws at date 1 receives D
  – the depositor who withdraws at date 2 receives 2r − D

• If both depositors withdraw at date 2,

  – the investment is liquidated only in the long run
  – each depositor receives R

The payoff matrix is

                           Depositor 2
                      correr          não correr
Depositor 1  correr        r, r          D, 2r − D
             não correr    2r − D, D     R, R

             Bank Run Game ("correr" = run, i.e. withdraw at date 1; "não correr" = not run, i.e. wait until date 2)

Computing the equilibrium (or equilibria)

• Graphically:

[Figure: equilibrium regions in the (x, y) = (R − D, r − D) plane, as in the general diagram, with C = correr and nC = não correr.]

• Computationally:
def payoffs(x, y):
    """For each strategy pair s1=x and s2=y, returns
    the payoff of player 1 and the payoff of player 2."""
    if x == 'correr':
        if y == 'correr':
            z = [r, r]
        else:
            z = [D, 2*r - D]
    else:
        if y == 'correr':
            z = [2*r - D, D]
        else:
            z = [R, R]
    return z

def NE(S=('correr', 'nao correr')):
    equilibrios = []
    # For each strategy profile s = (s1, s2)
    for s1 in S:
        for s2 in S:
            # check whether s is an equilibrium
            v = payoffs(s1, s2)
            eq = True
            for t in S:
                if payoffs(t, s2)[0] > v[0] or payoffs(s1, t)[1] > v[1]:
                    eq = False
                    break
            if eq:
                equilibrios.append('({},{})'.format(s1, s2))
    return equilibrios

– The program below uses the functions payoffs() and NE() to compute the set of Nash equilibria.

  * two cases will be studied
  * in both cases, the initial endowment is D = 1 and the long-run return is 20%, i.e., R = 1.2
  * in the first case, the short-run return is 10%, i.e., r = 1.1
  * in the second case, the short-run return is −10%, i.e., r = 0.9

– Case I: (D, R, r) = (1, 1.2, 1.1)

D, r, R = 1, 1.1, 1.2
eqs = NE()
print('The set of Nash equilibria is:')
print('{ ', end='')
for i, eq in enumerate(eqs):
    if i < len(eqs) - 1:
        print('{}, '.format(eq), end='')
    else:
        print('{} '.format(eq), end='')
print('}')

The program above produces the following output:

The set of Nash equilibria is:
{ (nao correr,nao correr) }

In this case, there is a single Nash equilibrium: (nao correr, nao correr). In this equilibrium,

  * all depositors wait to withdraw at t = 2
  * the investment project reaches maturity
  * the economy exploits the long-run return of 20%

– Case II: (D, R, r) = (1, 1.2, 0.9)

D, r, R = 1, 0.9, 1.2
eqs = NE()
print('The set of Nash equilibria is:')
print('{ ', end='')
for i, eq in enumerate(eqs):
    if i < len(eqs) - 1:
        print('{}, '.format(eq), end='')
    else:
        print('{} '.format(eq), end='')
print('}')

The program above produces the following output:

The set of Nash equilibria is:
{ (correr,correr), (nao correr,nao correr) }

In this case, another Nash equilibrium arises: (correr, correr). In this equilibrium,

  * all depositors rush to withdraw at t = 1
  * the investment project is liquidated before maturity
  * the economy fails to exploit the long-run return of 20%

Observation 2. The model's prediction in this case is:

  – either the economy will benefit from the long-run return (banking stability)
  – or the economy will be worse off than without the banking arrangement (banking instability)

but there is no certainty about which equilibrium will emerge.

Economic Theory and Game Theory

Economic Theory: the study of alternative ways of allocating scarce resources.

"Economics is a science which studies human behavior as a relationship between ends and scarce means which have alternative uses."

• the ultimate end of economics is the individual and his well-being

Example: how scarcity determines the optimal allocation

Consider an economy inhabited by I individuals

• with only J = 2 goods, whose endowments are ω^j ≥ 0, and no production

Definition 2. An allocation is a vector

  x = ((x_1^1, x_1^2), (x_2^1, x_2^2), . . . , (x_I^1, x_I^2)) ∈ R_+^{2I}

that specifies a consumption bundle (x_i^1, x_i^2) ∈ R_+^2 for each individual i ∈ {1, 2, . . . , I}. The allocation is said to be feasible if

  Σ_{i=1}^{I} x_i^j ≤ ω^j,  ∀j ∈ {1, 2}.
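Feasibility is easy to verify numerically. A minimal sketch (the endowments and bundles below are illustrative assumptions, not from the text):

```python
# Sketch: checking feasibility of an allocation (Definition 2).
def feasible(x, omega):
    """x[i][j] = consumption of good j by individual i;
    omega[j] = aggregate endowment of good j."""
    J = len(omega)
    return all(sum(xi[j] for xi in x) <= omega[j] for j in range(J))

omega = (10.0, 6.0)            # endowments of goods 1 and 2 (assumed values)
x = [(4.0, 2.0), (5.0, 3.0)]   # bundles of individuals 1 and 2 (assumed values)
print(feasible(x, omega))  # -> True, since 4+5 <= 10 and 2+3 <= 6
```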

• distribution of resources among the individuals

• Edgeworth box (I = 2)

Efficiency vs. Equity: what is a socially optimal allocation?

Definition 3. A feasible allocation x ∈ R_+^{2I} is Pareto optimal if there is no other feasible allocation x′ ∈ R_+^{2I} such that

  ui(x′i) ≥ ui(xi) for every i ∈ {1, 2, . . . , I}

and ui(x′i) > ui(xi) for some i.

• a minimal and consensual property

  – minimal: every optimal allocation must be Pareto optimal
  – consensual: there can be no waste under the optimal allocation

• Edgeworth box and the Contract Curve

Economics and Efficiency

Efficiency as a prediction criterion: Pareto efficiency is the rule, not the exception

• a point of broad consensus among economists

• Pareto improvements: why expect them to go unexploited?

• the main prediction criterion of economic theory

An equilibrium concept: competitive equilibrium – markets and prices

Definition 4. An allocation x∗ ∈ R_+^{2I} and a price vector p∗ = (p1∗, p2∗) ∈ R_+^2 constitute a competitive equilibrium if

• Utility maximization: for each consumer i, xi∗ solves

  max_{x∈R_+^2} ui(x)  s.t.  p1∗ x^1 + p2∗ x^2 ≤ p1∗ ωi^1 + p2∗ ωi^2

• Market clearing: aggregate demand equals aggregate supply,

  Σ_{i=1}^{I} x_i^{j∗} = Σ_{i=1}^{I} ω_i^j,  j = 1, 2

Theorem 2 (The First Fundamental Welfare Theorem). Every allocation resulting from a competitive equilibrium is Pareto optimal,

• if markets are complete:

  – every good is traded in a market
  – at a publicly known price

• if individuals are price takers:

  – they act in a perfectly competitive manner

Theorem 3 (The Second Fundamental Welfare Theorem). Every Pareto optimal allocation can be attained (supported or decentralized) as a competitive equilibrium,

• if the endowment of resources is suitably arranged

• if individuals' preferences are convex

• if individuals act as price takers

• if markets are complete

Game theory revolution

But what about the Prisoners' Dilemma?

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

• the outcome from the strategy profile (not Confess, not Confess) Pareto dominates the outcome from (Confess, Confess)

• however, (Confess, Confess) is the more reasonable outcome to expect

Another equilibrium concept: Nash equilibrium

Strategic interdependence:

• Each individual's welfare depends not only on his own actions but also on the actions of the other individuals

• The actions that are best for an individual to take may depend on what he expects the other players to do

Even more equilibrium concepts and some refinements:

• Equilibrium in Dominant Strategies

• Nash equilibrium

• Subgame-Perfect Nash equilibrium

• Bayesian Nash equilibrium

• Perfect Bayesian Nash equilibrium

• The Intuitive Criterion (Cho and Kreps (1987))

Mechanism Design

Question 5 (Choosing among games). How can games be designed so as to implement optimal allocations?

• Principal-Agent problems

  – moral hazard
  – adverse selection

• The Social Planner's problems (the social optimum)

  – fiscal policy
  – monetary policy
  – regulation

• The Revelation Principle
Ch. 1 - Static Games of Complete Information

Static Games of Complete Information:

• Static:

  – players simultaneously choose actions
  – the payoffs players receive depend on the combination of actions just chosen

• Complete information:

  – each player's payoff function is common knowledge among all players
  – example: auctions

Question 6. What is a game?

Definition 5. A game is a formal representation of a situation in which a number of individuals interact in a setting of strategic interdependence.

• Each individual's welfare depends not only on his own actions but also on the actions of the other individuals

• The actions that are best for an individual to take may depend on what he expects the other players to do

1.1 Normal form games and Nash equilibrium

Normal-form representation of games

In the normal-form representation of a game,

• each player simultaneously chooses a strategy

• the combination of strategies chosen by the players determines a payoff for each player

Example 1 (The prisoners' dilemma).

The environment

• Two suspects are arrested and charged with a crime

• The police lack sufficient evidence to convict the suspects, unless at least one confesses

• The suspects are held in separate cells

• The police explain the consequences that will follow from the actions they could take

Actions and payoffs

• If neither confesses, both will be convicted of a minor offense and sentenced to one month in jail

• If both confess, both will be sentenced to jail for six months

• If one confesses but the other does not, the confessor will be released immediately but the other will be sentenced to nine months in jail

  – six for the crime
  – three for obstructing justice

Matrix representation

• Each player has 2 strategies: Confess or not Confess

• It is implicitly assumed that each player dislikes staying in jail

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

General case:
The normal form representation

(a) Players

(b) Strategies

(c) Payoffs

Players: A finite set I of players

• We write "player i", where i is the name of the player and I is the collection of names

• We denote by n the number of players, i.e., n = #I

• The set I may be written I = {1, 2, · · · , n}

Strategies: The set of strategies available to player i is denoted by Si

• An element si ∈ Si is called a strategy (play or action)

• The set Si is called the strategy space and may have any structure: finite, countable, metric space, vector space

• The collection (si)i∈I = (s1, · · · , sn) is called a strategy profile and denoted by s

• Given an agent j and a profile s, we denote by (s−j; s′j) the new profile σ = (σi)i∈I defined by

  σi = s′j if i = j, and σi = si if i ≠ j

so that, for 1 < j < n,

  (s−j; s′j) = (s1, . . . , sj−1, s′j, sj+1, . . . , sn)

Payoffs:

• The payoff of player i is a function

  ui : Π_{j∈I} Sj → [−∞, +∞],  s ↦ ui(s)

where ui(s) is the payoff of player i when

  – he plays strategy si
  – and every other player j plays strategy sj

• We use the following notations interchangeably:

  ui(s) = ui((sj)j∈I) = ui(s−i; si) = ui(s1, s2, . . . , sn)

Definition 6. A game in normal form is a family

  G = (Si, ui)i∈I

where, for each i ∈ I,

• Si is a set

• ui is a function from S = Π_{j∈I} Sj to [−∞, +∞]

Observation 3. We now describe how to solve a game-theoretic problem.

ˆ What should we expect to observe in a game played by

– rational players

17
– who are fully knowledgeable about
* the structure of the game
* and each others’ rationality?

Simultaneous moves: In a normal form game the players choose their strategies simultaneously

ˆ This does not imply that they act simultaneously

ˆ It suffices that each choose his or her action without knowledge of the others’ choices

– Prisoners’ dilemma: the prisoners may reach decisions at arbitrary times but it must be in separate
cells
– Bidders in an sealed-bid auction

Iterated elimination of strictly dominated strategies

Definition 7 (Strictly dominated strategies). Consider a normal form game (Si, ui)i∈I.

• Let s′i and s′′i be two strategies in Si.

Strategy s′i is strictly dominated by strategy s′′i if,

• for each possible combination of the other players' strategies,

player i's payoff from playing s′i is strictly less than his payoff from playing s′′i.

• Formally,

  ui(s′i, s−i) < ui(s′′i, s−i),  ∀s−i ∈ Π_{k≠i} Sk

Rationality: Rational players do not play strictly dominated strategies.

Observation 4 (The prisoners' dilemma). For a prisoner, playing not Confess is strictly dominated by playing Confess.

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma

Assume we are player 1.

• If player 2 chooses Confess,

  – we prefer to play Confess and stay 6 months in jail
  – rather than play not Confess and stay 9 months in jail

• If player 2 chooses not Confess,

  – we prefer to play Confess and go free
  – rather than play not Confess and stay 1 month in jail

• A rational player will not choose to play not Confess

  – therefore, a rational player will choose to play Confess

The outcome reached by the two prisoners is (Confess, Confess).

• This results in a worse payoff for both players than (not Confess, not Confess) would

  – this inefficiency is a consequence of the lack of coordination

• The same happens in many other situations

  – the arms race
  – the free-rider problem in the provision of public goods

Iterated elimination

Question 8. Can we use the idea that "rational players do not play strictly dominated strategies" to find a solution to other games?

• Consider a game (in normal form) with two players,

  I = {1, 2}

• Player 1 has two available strategies,

  S1 = {Up, Down}

• Player 2 has three available strategies,

  S2 = {Left, Middle, Right}

• The payoffs are given by the following matrix

                      Player 2
               Left    Middle    Right
Player 1  Up    1, 0     1, 2     0, 1
          Down  0, 3     0, 1     2, 0

• for Player 1,

  – Up is not strictly dominated by Down
  – Down is not strictly dominated by Up

• for Player 2,

  – Right is strictly dominated by Middle
  – Player 2 will never play Right

• if Player 1 knows that Player 2 is rational,

  – then Player 1 can eliminate Right from Player 2's strategy set

• then both players can play the game as if it were the following game

                      Player 2
               Left    Middle
Player 1  Up    1, 0     1, 2
          Down  0, 3     0, 1

• for Player 1, the strategy Down is now strictly dominated by Up

• if Player 2 knows that

  – Player 1 is rational; and
  – Player 1 knows that Player 2 is rational,

then Player 2 can eliminate Down from S1

• now the game is as follows

                      Player 2
               Left    Middle
Player 1  Up    1, 0     1, 2

• for Player 2, the strategy Left is strictly dominated by Middle

Observation 5. By iterated elimination of strictly dominated strategies,

• the outcome of the game is (Up, Middle)

Definition 8 (Iterated elimination of strictly dominated strategies). The process just described is called iterated elimination of strictly dominated strategies.
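The process can be sketched in Python for two-player games with finitely many pure strategies (a minimal implementation; the dictionary-based encoding of payoffs is an assumption of this sketch, not notation from the text):

```python
# Sketch of iterated elimination of strictly dominated (pure) strategies
# for a two-player game with payoff dictionaries u1, u2 keyed by (s1, s2).
def iesds(S1, S2, u1, u2):
    S1, S2 = list(S1), list(S2)
    changed = True
    while changed:
        changed = False
        # remove any strategy of player 1 strictly dominated by another
        for s in S1[:]:
            if any(all(u1[(t, s2)] > u1[(s, s2)] for s2 in S2)
                   for t in S1 if t != s):
                S1.remove(s); changed = True
        # same for player 2
        for s in S2[:]:
            if any(all(u2[(s1, t)] > u2[(s1, s)] for s1 in S1)
                   for t in S2 if t != s):
                S2.remove(s); changed = True
    return S1, S2

# The 2x3 example from the text
S1, S2 = ['Up', 'Down'], ['Left', 'Middle', 'Right']
u1 = {('Up','Left'): 1, ('Up','Middle'): 1, ('Up','Right'): 0,
      ('Down','Left'): 0, ('Down','Middle'): 0, ('Down','Right'): 2}
u2 = {('Up','Left'): 0, ('Up','Middle'): 2, ('Up','Right'): 1,
      ('Down','Left'): 3, ('Down','Middle'): 1, ('Down','Right'): 0}
print(iesds(S1, S2, u1, u2))  # -> (['Up'], ['Middle'])
```

The elimination order (Right, then Down, then Left) matches the steps worked through above.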

Proposition 1. The set of strategy profiles that survives iterated elimination of strictly dominated strategies is independent of the order of deletion.

Drawbacks:

(i) Each step requires a further assumption about what the players know about each other's rationality

• to apply the process for an arbitrary number of steps, we need to assume that it is common knowledge that the players are rational

Definition 9. Players' rationality is common knowledge if

  – all the players are rational
  – all the players know that all the players are rational
  – and so on, ad infinitum

(ii) the process often produces a very imprecise prediction about the play of the game

Consider the following game

            L       C       R
    U     0, 4    4, 0    5, 3
    M     4, 0    0, 4    5, 3
    D     3, 5    3, 5    6, 6

• there are no strictly dominated strategies to be eliminated

• the process produces no prediction whatsoever about the play of the game

Question 9. Is there a stronger solution concept than IESDS, one which produces much tighter predictions in a very broad class of games?

Nash equilibrium: motivation and definition

Motivation:
Suppose that game theory makes a unique prediction about the strategy each player will choose.

• in order for this prediction to be compatible with incentives (i.e., correct), it is necessary that

  – each player be willing to choose the strategy predicted by the theory
  – each player's predicted strategy be that player's best response to the predicted strategies of the other players

• such a prediction could be called strategically stable or self-enforcing

  – no single player wants to deviate from his or her predicted strategy

Definition 10 (Nash Equilibrium). Consider a game G = (Si, ui)i∈I.

• A strategy profile s∗ = (s∗i)i∈I is a Nash equilibrium of G if, for each player i, the strategy s∗i is player i's best response to the strategies specified in s∗ for the other players.

Formally, s∗ = (s∗i)i∈I is a Nash equilibrium if

  ∀i ∈ I,  s∗i ∈ arg max{ui(si, s∗−i) : si ∈ Si}

Observation 6. The set arg max{ui(si, s∗−i) : si ∈ Si} need not be a singleton.

Interpretation
If the theory offers as a prediction

• a profile s′ = (s′i)i∈I that is not a Nash equilibrium,

then

• there exists at least one player who has an incentive to deviate from the theory's prediction.

Observation 7. If a convention is to develop about how to play a given game, then the strategies prescribed by the convention must be a Nash equilibrium; otherwise, at least one player will not abide by the convention.

Examples
In a 2-player game we can compute the set of NE as follows:

• for each player,

  – and for each strategy of this player,
    * determine the other player's best response to it
    * underline the corresponding payoff in the matrix

• a pair of strategies (a profile) is a NE if both corresponding payoffs are underlined in the matrix

            L       C       R
    U     0, 4    4, 0    5, 3
    M     4, 0    0, 4    5, 3
    D     3, 5    3, 5    6, 6

                      Player 2
               Left    Middle    Right
Player 1  Up    1, 0     1, 2     0, 1
          Down  0, 3     0, 1     2, 0

                        Prisoner 2
                  not Confess   Confess
Prisoner 1   not Confess   −1, −1   −9,  0
             Confess        0, −9   −6, −6

                   Prisoner's Dilemma
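The underlining test can be mechanized: a profile is a NE exactly when each player's payoff at that profile attains the maximum over that player's own deviations. A sketch for the 3×3 game above (the encoding is an assumption of this sketch):

```python
# Sketch: find pure-strategy Nash equilibria by checking, for each profile,
# that each payoff would be "underlined" (i.e., is a best response).
def pure_nash(S1, S2, u1, u2):
    eqs = []
    for s1 in S1:
        for s2 in S2:
            br1 = u1[(s1, s2)] >= max(u1[(t, s2)] for t in S1)
            br2 = u2[(s1, s2)] >= max(u2[(s1, t)] for t in S2)
            if br1 and br2:
                eqs.append((s1, s2))
    return eqs

# The 3x3 game from the text
S1, S2 = ['U', 'M', 'D'], ['L', 'C', 'R']
u1 = {('U','L'): 0, ('U','C'): 4, ('U','R'): 5,
      ('M','L'): 4, ('M','C'): 0, ('M','R'): 5,
      ('D','L'): 3, ('D','C'): 3, ('D','R'): 6}
u2 = {('U','L'): 4, ('U','C'): 0, ('U','R'): 3,
      ('M','L'): 0, ('M','C'): 4, ('M','R'): 3,
      ('D','L'): 5, ('D','C'): 5, ('D','R'): 6}
print(pure_nash(S1, S2, u1, u2))  # -> [('D', 'R')]
```

Note that (D, R) is a NE even though no strategy in this game is strictly dominated, illustrating Question 9.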

Nash equilibrium: a stronger solution

Consider a game G = (Si, ui)i∈I.

Proposition 2. If IESDS eliminates all but the strategy profile s∗ = (s∗i)i∈I, then s∗ is the unique NE of the game.

Theorem 4. If the strategy profile s∗ is a NE, then s∗ survives iterated elimination of strictly dominated strategies.

Observation 8. NE is a stronger solution concept than IESDS.

• there can be strategy profiles that survive IESDS but are not NE

• all NE survive IESDS

Question 10. Is NE too strong? Can we be sure that a Nash equilibrium exists?

• existence: Nash et al. (1950), for any finite game

• multiple equilibria: next example

A classic example:
The battle of the sexes
A man (Pat) and a woman (Chris) are trying to decide on an evening's entertainment.

• while at their workplaces, Pat and Chris must choose to attend either the opera or a rock concert

• both players would rather spend the evening together than apart

                   Pat
             Opera     Rock
Chris  Opera  2, 1     0, 0
       Rock   0, 0     1, 2

• there are two NE: (Opera, Opera) and (Rock, Rock)

In some games with multiple NE, one equilibrium stands out as the compelling solution

• in particular, a convention can develop

Theory's effort:
identify such a compelling equilibrium in different classes of games.
In the example above,

• the NE concept loses much of its appeal as a prediction of play

  – both equilibria seem equally compelling
  – neither can develop into a convention

1.2 Applications
Cournot model of duopoly

A model of duopoly: Cournot (1838)

• two firms, 1 and 2, producing the same good (a homogeneous product)

• q1 and q2: the quantities produced by the firms, respectively

• Q = q1 + q2: aggregate quantity on the market

• P(Q) = [a − Q]+: market-clearing price under Q

• C(qi) = c qi: total cost to firm i of producing quantity qi

  – there is no fixed cost
  – the marginal cost is constant at c
  – we assume c < a

• firms choose their quantities simultaneously

We now translate the problem into a normal form game

• There are two players, the two firms: I = {1, 2}

• The strategies available to each firm are the different quantities, Qi = [0, ∞)

• An element of Qi is denoted qi

  – one could reduce the set Qi to [0, a], since P(Q) = 0 for Q ≥ a

• The payoff of firm i under a profile (qi, qj) is its profit, defined by

  πi(qi, qj) = qi[P(qi + qj) − c] = qi([a − (qi + qj)]+ − c)

The game is then G = (Qi, πi)i∈I

Nash equilibrium (NE)

A strategy for finding a NE is to look for a necessary condition (and then check that it is sufficient)

• if (q1∗, q2∗) is a Nash equilibrium, then

  ∀i ∈ I,  qi∗ ∈ arg max{πi(qi; qj∗) : qi ≥ 0}

• we have

  πi(qi; qj∗) = qi[(a − c − qj∗) − qi]   if qi < a − qj∗
                −qi c                    if qi ≥ a − qj∗

• all strategies qi ≥ a − qj∗ are strictly dominated by qi = 0. Therefore,

  qi∗ ∈ arg max{πi(qi; qj∗) : 0 ≤ qi < a − qj∗}

and the first order condition is necessary and sufficient

• the derivative of the objective function is

  ∂πi/∂qi (qi, qj∗) = a − c − qj∗ − 2qi

• assuming that qi∗ ∈ (0, a − c) for each firm i, we have

  qi∗ = (a − qj∗ − c)/2

which yields

  qi∗ = (a − c)/3,  ∀i ∈ I

obs.: this is consistent with the assumption qi∗ ∈ (0, a − c)
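The equilibrium can also be found numerically by iterating the best-response map Ri(qj) = (a − c − qj)/2. A minimal sketch; the parameter values a = 10, c = 1 are illustrative assumptions:

```python
# Sketch: best-response iteration converging to the Cournot equilibrium.
a, c = 10.0, 1.0  # assumed demand intercept and marginal cost

def br(q):
    """Best response to the rival's quantity q (interior case)."""
    return max(0.0, (a - c - q) / 2.0)

q1, q2 = 0.0, 0.0
for _ in range(100):
    q1, q2 = br(q2), br(q1)  # each firm best-responds to the other

print(q1, q2)  # -> 3.0 3.0, i.e. (a - c)/3 for each firm
```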

Observation 9. There is a unique Nash equilibrium, called the Cournot equilibrium.

Interpretation
Each firm would like to be a monopolist in this market

• it would choose qi to maximize πi(qi, 0). The solution is

  qm = (a − c)/2

and the associated profit is

  πi(qm; 0) = (a − c)²/4

With the two firms,

• aggregate profits would be maximized by setting

  q1 + q2 = qm

which would occur with qi = qm/2

The problem with the strategy profile (qm/2, qm/2):

• the market price P(qm) is too high

  – at this price, each firm has an incentive to deviate by increasing its production
  – even though such a deviation drives down the market price, the deviating firm's profit still increases

In the Cournot equilibrium,

• the aggregate quantity is higher

• so the associated price is lower

and the temptation to increase output is reduced,

• just enough that each firm i is just deterred from increasing qi

Graphical solution

• if q1 < a − c, then firm 2's best response is

  R2(q1) = (a − q1 − c)/2

Likewise,

• if q2 < a − c, then firm 1's best response is

  R1(q2) = (a − q2 − c)/2

• The two best response functions intersect only once, at the equilibrium profile (q1∗, q2∗)

Cournot duopoly and iterated elimination

A third approach: iterated elimination of strictly dominated strategies

• if a unique strategy profile survives, then it is a Nash equilibrium

Proposition 3. The monopoly quantity qm = (a − c)/2 strictly dominates any higher quantity.

We can then consider the game G(3) = (Qi(3), πi)i∈I with

  Qi(3) = [0, qm]

Proof. Step 1: Assume qm + x + qj < a, with x > 0. Then

  πi(qm; qj) = qm[(a − c)/2 − qj]

while

  πi(qm + x; qj) = (qm + x)[(a − c)/2 − x − qj] = πi(qm; qj) − x(x + qj) < πi(qm; qj)

Step 2: Assume qm + x + qj ≥ a. Then the price is zero at qm + x, so

  πi(qm + x; qj) = −c(qm + x) < πi(qm; qj),

since πi(qm; qj) ≥ −c qm.

Proposition 4. Given that quantities exceeding qm = (a − c)/2 have been eliminated, the quantity qm/2 strictly dominates any lower quantity.

Formally,

• for any x ∈ (0, qm/2] we have

  πi[qm/2, qj] > πi[qm/2 − x, qj],  ∀qj ∈ [0, (a − c)/2]

Proof.

  πi(qm/2, qj) = (qm/2)[3(a − c)/4 − qj]

and

  πi(qm/2 − x, qj) = (qm/2 − x)[3(a − c)/4 + x − qj]
                   = πi(qm/2, qj) − x[(a − c)/2 + x − qj]
                   < πi(qm/2, qj),

since qj ≤ (a − c)/2.

After these two steps, the quantities remaining in each firm's strategy space are those in the interval

  [(a − c)/4, (a − c)/2]

• repeating these arguments leads to ever smaller intervals of remaining quantities

• in the limit (countably many steps are needed), these intervals converge to the single point qi∗ = (a − c)/3
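The interval dynamics can be traced numerically. In the sketch below (with illustrative values a = 10, c = 1, so qm = 4.5 and (a − c)/3 = 3), each round maps the surviving interval [lo, hi] to [R(hi), R(lo)], where R(q) = (a − c − q)/2 is the best-response function:

```python
# Sketch: the shrinking intervals produced by iterated elimination
# in the Cournot duopoly.
a, c = 10.0, 1.0  # assumed parameter values

def R(q):
    """Best response to an opponent quantity q."""
    return (a - c - q) / 2.0

lo, hi = 0.0, R(0.0)      # after the first elimination step: [0, qm]
intervals = [(lo, hi)]
for _ in range(60):
    lo, hi = R(hi), R(lo)  # eliminate below R(hi) and above R(lo)
    intervals.append((lo, hi))

print(intervals[1])  # -> (2.25, 4.5), i.e. [qm/2, qm] as in the text
print(lo, hi)        # both endpoints converge to (a - c)/3 = 3.0
```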

Cournot duopoly and iterated elimination

If we add one or more firms to Cournot's model,

• then the first step of elimination continues to hold

• but that's it

Observation 10. With more than two firms, IESDS yields only the imprecise prediction that each firm's quantity will not exceed the monopoly quantity.

Example: Three firms

• Q−i: the sum of the quantities chosen by the firms other than i

  πi(qi; Q−i) = qi(a − qi − Q−i − c)   if qi + Q−i < a
                −c qi                  if qi + Q−i ≥ a

ˆ it is again true that qm strictly dominates any higher quantity

∀x > 0; πi (qm ; Q−i ) > πi (qm + x; Q−i ); ∀Q−i > 0

Each firm reduces its strategy set to [0, qm ], but

ˆ no further strategies can be eliminated

Proposição 5. No quantity qi ∈ [0, qm ] is strictly dominated

Proof. For each qi ∈ [0, qm ] there is a Q−i such that

qi ∈ arg max{πi (qi′ , Q−i ) : qi′ ∈ [0, qm ]}

Indeed, we know that Q−i ∈ [0, 2qm ] = [0, a − c]. Fix qi ∈ [0, qm ] and recall that

πi (qi ; Q−i ) = qi (a − qi − Q−i − c) if qi + Q−i < a
πi (qi ; Q−i ) = −cqi if qi + Q−i ≥ a.

Then, the FOC a − 2qi − Q−i − c = 0 is satisfied at qi when Q−i = a − c − 2qi ∈ [0, a − c].
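The supporting-belief construction in the proof can be checked numerically; the values a = 10 and c = 1 below are illustrative.

```python
a, c = 10.0, 1.0
qm = (a - c) / 2.0   # monopoly quantity

def profit(qi, Q_other):
    # three-firm Cournot payoff, with Q_other the sum of rivals' quantities
    return qi * (a - qi - Q_other - c) if qi + Q_other < a else -c * qi

grid = [k * qm / 1000.0 for k in range(1001)]      # candidate quantities
results = {}
for qi in [0.0, qm / 4, qm / 2, 3 * qm / 4, qm]:
    Q_other = a - c - 2.0 * qi                     # feasible: lies in [0, 2*qm]
    results[qi] = max(grid, key=lambda q, Q=Q_other: profit(q, Q))

# each qi is recovered as the grid argmax against its supporting belief
```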

Bertrand model of duopoly

We consider a different model of how two duopolists might interact

ˆ Bertrand (1883) suggested that firms actually choose prices, rather than quantities as in Cournot’s model

We consider the case of differentiated products

ˆ firms 1 and 2 choose prices p1 and p2 , respectively

ˆ the quantity that consumers demand from firm i’s product is

qi (pi , pj ) = [a − pi + bpj ]+ where 0 < b < 2

ˆ b reflects the extent to which


firm i’s product is a substitute for firm j’s product

Observação 11. This is an unrealistic demand function

ˆ demand for firm i’s product is positive

ˆ even when firm i charges an arbitrarily high price,

provided firm j also charges a high enough price.

ˆ there are no fixed costs of production

ˆ marginal cost of production is constant at a value c ∈ (0, a)

The normal form game:

ˆ the set of players is I = {1, 2}

ˆ the strategy set Pi of player i is Pi = [0, ∞)

ˆ the payoff function corresponds to profits:

πi (pi , pj ) = qi (pi , pj )[pi − c] = [a − pi + bpj ]+ (pi − c)

The game is G = (Pi , πi )i∈I

Nash equilibrium:
the price pair (p∗1 , p∗2 ) is a Nash equilibrium if,

ˆ for each firm i the price p∗i solves

max{πi (pi , p∗j ) : 0 ≤ pi < ∞} = max{[a − pi + bp∗j ][pi − c] : c < pi < a + bp∗j }

ˆ the objective function’s derivative is

∂πi /∂pi (pi , p∗j ) = a + c + bp∗j − 2pi
and, therefore, the solution to firm i’s optimization problem is

p∗i = (a + bp∗j + c)/2

If (p∗1 , p∗2 ) is a Nash equilibrium, one must have

p∗1 = (a + bp∗2 + c)/2 and p∗2 = (a + bp∗1 + c)/2

ˆ if b < 2 then the unique Nash equilibrium is

p∗1 = p∗2 = (a + c)/(2 − b)
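The fixed point can be verified by iterating the two reaction functions; a = 10, b = 1, c = 2 below are illustrative values satisfying 0 < b < 2.

```python
a, b, c = 10.0, 1.0, 2.0        # illustrative differentiated-Bertrand parameters

p1 = p2 = 0.0
for _ in range(200):
    # each firm best-responds to the other's current price
    p1, p2 = (a + b * p2 + c) / 2.0, (a + b * p1 + c) / 2.0

closed_form = (a + c) / (2.0 - b)   # the unique Nash equilibrium price
```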

Final-offer arbitration

Consider a firm and a union which dispute wages

(i) the firm and the union simultaneously make offers

ˆ the firm offers the wage wf


ˆ the union offers the wage wu

(ii) an arbitrator chooses one of the two offers as the settlement

ˆ x: ideal settlement arbitrator would like to impose

The decision rule is as follows:

ˆ after observing the parties’ offers, wf and wu ,


the arbitrator simply chooses the offer that is closer to x

Provided that wf < wu

ˆ the arbitrator chooses wf if x < (wf + wu )/2

ˆ the arbitrator chooses wu if x > (wf + wu )/2

ˆ the arbitrator flips a coin if x = (wf + wu )/2

The arbitrator knows x but the parties do not

ˆ the parties believe that x is randomly distributed according to a probability measure µ on the Borel sets of [0, ∞)

ˆ the cumulative probability distribution is denoted by F

F (x̄) ≡ Pr{x ≤ x̄} = µ[0, x̄]

ˆ F : [0, ∞) → [0, 1] is differentiable, with derivative f

ˆ f represents the density function, i.e.,


∫[0,∞) h(x)µ(dx) = ∫[0,∞) h(x)f (x)dx

for every Borel measurable function h : [0, ∞) → R+

Given the offers wf and wu , the parties believe that

ˆ wf is chosen with probability

Pr{wf chosen} = µ[0, (wf + wu )/2] = F ((wf + wu )/2)

ˆ wu is chosen with probability

Pr{wu chosen} = µ((wf + wu )/2, ∞) = 1 − F ((wf + wu )/2)

and, therefore, expected wage settlement is given by

E(w) = wf × Pr{wf chosen} + wu × Pr{wu chosen}
= wf F ((wf + wu )/2) + wu [1 − F ((wf + wu )/2)]

We assume that

ˆ the firm wants to minimize E(w)

ˆ the union wants to maximize E(w)

If (wf∗ , wu∗ ) is a Nash equilibrium, then

ˆ wf∗ must solve

min{wf F ((wf + wu∗ )/2) + wu∗ [1 − F ((wf + wu∗ )/2)] : 0 ≤ wf < ∞}

ˆ wu∗ must solve

max{wf∗ F ((wf∗ + wu )/2) + wu [1 − F ((wf∗ + wu )/2)] : 0 ≤ wu < ∞}

Suppose that (wf∗ , wu∗ ) is strictly positive

ˆ FOC for the firm’s problem

(1/2)(wu∗ − wf∗ ) × f ((wu∗ + wf∗ )/2) = F ((wu∗ + wf∗ )/2)

ˆ FOC for the union’s problem

(1/2)(wu∗ − wf∗ ) × f ((wu∗ + wf∗ )/2) = 1 − F ((wu∗ + wf∗ )/2)

Therefore,

F ((wu∗ + wf∗ )/2) = 1/2
The average of the offers must equal

ˆ the median of the arbitrator’s preferred settlement

F ((wu∗ + wf∗ )/2) = 1/2

The gap between the offers must equal

ˆ the inverse of the value of the density function

ˆ at the median of the arbitrator’s preferred settlement

wu∗ − wf∗ = 1/f ((wu∗ + wf∗ )/2)

An example:
Suppose the arbitrator’s preferred settlement is normally distributed with mean m and variance σ 2 , i.e.,

f (x) = (1/√(2πσ 2 )) exp(−(x − m)2 /(2σ 2 ))

ˆ the median of the distribution equals the mean m

– the normal distribution is symmetric around its mean,

The necessary conditions are then translated into

(wu∗ + wf∗ )/2 = m and wu∗ − wf∗ = 1/f (m) = √(2πσ 2 )

The Nash equilibrium offers are

wu∗ = m + √(πσ 2 /2) and wf∗ = m − √(πσ 2 /2)
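A small numerical check of these two conditions under the normal distribution (m = 5 and σ = 2 are illustrative):

```python
import math

m, sigma = 5.0, 2.0   # illustrative mean and standard deviation

def f(x):             # normal density
    return math.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def F(x):             # normal CDF via the error function
    return 0.5 * (1 + math.erf((x - m) / (sigma * math.sqrt(2))))

half_gap = math.sqrt(math.pi * sigma ** 2 / 2)
wu_star, wf_star = m + half_gap, m - half_gap

median_cond = F((wu_star + wf_star) / 2)   # should equal 1/2
gap_cond = (wu_star - wf_star) * f(m)      # should equal 1
```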

In equilibrium,

ˆ the parties’ offers are centered around m

– m: expectation of the arbitrator’s preferred settlement

ˆ the gap between the offers increases with σ 2

– σ 2 : parties’ uncertainty about the arbitrator’s preferred settlement

A more aggressive offer

ˆ lower offer by the firm

ˆ higher offer by the union

yields a better payoff if it is chosen by the arbitrator

ˆ but is less likely to be chosen

When there is more uncertainty (i.e., σ 2 higher)

ˆ the parties can afford to be more aggressive

When there is hardly any uncertainty, in contrast,

ˆ neither party can afford to make an offer far from the mean

The problem of the Commons

ˆ consider the n farmers in a village: I = {1, · · · , n}

ˆ each summer, all the farmers graze their goats on the village green

ˆ during the spring, the farmers simultaneously choose how many goats to own

Let

ˆ gi : number of goats owned by farmer i

ˆ G = g1 + · · · + gn : total number of goats in the village

ˆ c > 0: the cost of buying and caring for a goat

ˆ v(G): the value (per goat) to a farmer of grazing a goat on the green

ˆ goats are continuously divisible

ˆ v : [0, Gmax ] → R+ is

– twice continuously differentiable


– v ′ < 0 and v ′′ < 0

The normal-form representation:

ˆ a strategy for farmer i is gi

ˆ the strategy space is Gi = [0, ∞) (we could have chosen Gi = [0, Gmax])

ˆ the payoff to farmer i

– from grazing gi goats


– when the numbers of goats of the other farmers are g−i

is
πi (gi , g−i ) = gi v(gi + σ[g−i ]) − cgi

where σ[g−i ] = Σk≠i gk

If (gi∗ )i∈I is a Nash equilibrium

ˆ then gi∗ is a solution to

max{gi v(gi + σ[g∗−i ]) − cgi : gi ≥ 0}

ˆ if gi∗ > 0, then the FOC is

v(gi∗ + σ[g∗−i ]) + gi∗ v ′ (gi∗ + σ[g∗−i ]) − c = 0

Summing over all farmers and dividing by n, we get

v(G∗ ) + (1/n)G∗ v ′ (G∗ ) − c = 0

where G∗ denotes Σi∈I gi∗
Social optimum

A social planner decides how many goats the “society” should graze on the village green

ˆ the planner should solve

max{Gv(G) − Gc : G ≥ 0}

independently of how the social profit is divided

ˆ the FOC is
v(Gs ) + Gs v ′ (Gs ) − c = 0

Lema 1. One must have G∗ > Gs .

Observação 12. Too many goats are grazed in the Nash equilibrium, compared to the social optimum

ˆ The common resource is overutilized

When a farmer considers the effect of adding one more goat, he focuses on

ˆ the cost of production: c

ˆ the additional benefit: v(gi + σ[g∗−i ])

ˆ the harm to his other goats: gi v ′ (gi + σ[g∗−i ])

He does not care about the effect of his action on the other farmers

ˆ this is the reason we have G∗ v ′ (G∗ )/n and not Gs v ′ (Gs )
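The overgrazing result can be illustrated with a concrete functional form satisfying the assumptions; v(G) = √(Gmax − G) with Gmax = 100, c = 1, and n = 10 below is a hypothetical choice, not from the text.

```python
import math

Gmax, c, n = 100.0, 1.0, 10   # illustrative parameters

v = lambda G: math.sqrt(Gmax - G)                  # v' < 0 and v'' < 0
dv = lambda G: -1.0 / (2.0 * math.sqrt(Gmax - G))

def solve(weight):
    # bisection on phi(G) = v(G) + weight*G*v'(G) - c, which is decreasing
    phi = lambda G: v(G) + weight * G * dv(G) - c
    lo, hi = 1e-9, Gmax - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

G_nash = solve(1.0 / n)    # equilibrium condition v(G*) + G* v'(G*)/n = c
G_social = solve(1.0)      # planner condition v(Gs) + Gs v'(Gs) = c
# G_nash > G_social: the common resource is overutilized in equilibrium
```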

1.3 Mixed strategies and existence of equilibrium


Non-existence: Matching pennies
Consider the following game

ˆ There are two players I = {i1 , i2 }

ˆ Each player’s strategy space is Si = {Heads, T ails}

ˆ The payoff of the game is as follows:

– Each player has a penny and must choose whether to display it with heads or tails facing up
– If the two pennies match then player i2 wins player i1 ’s penny
– If the pennies do not match then i1 wins i2 ’s penny

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

36
Proposição 6. There is no Nash equilibrium in pure strategies.

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

If the players’ strategies

ˆ match then player i1 prefers to switch strategies

ˆ do not match then i2 prefers to switch

This situation occurs in many games

ˆ Poker, battle

To overcome this difficulty, we introduce the notion of a mixed strategy

Mixed strategies

A mixed strategy is a probability measure (distribution) over the strategies in Si

ˆ A strategy in Si is called a pure strategy

ˆ The set of mixed strategies is denoted by P rob(Si ) or ∆Si

Definição 11. A mixed strategy p = (p(si ))si ∈Si of player i is a vector in RSi satisfying

∀si ∈ Si , psi = p(si ) ≥ 0 and Σsi ∈Si p(si ) = 1

ˆ if the mixed strategy p is such that there exists ŝi ∈ Si satisfying

∀si ∈ Si , p(si ) = 0 if si ≠ ŝi and p(si ) = 1 if si = ŝi

then p is denoted Dirac(ŝi ) or 1ŝi and (abusing notation) is identified with the pure strategy ŝi

Interpretation
A family p−i = (pj )j≠i of mixed strategies pj ∈ ∆(Sj ) can represent

ˆ agent i’s uncertainty about

ˆ which strategy each other agent j will play

Notation 5. The expected value of agent i’s payoff if he plays si believing that the other players will play
according to p−i is denoted by
ui (si , p−i )

and is defined by

ui (si , p−i ) ≡ Ep−i [ui (si )] = Σs−i ∈S−i [Πj≠i pj (sj )] ui (si , s−i )

where Πj≠i pj (sj ) = p−i (s−i )

Notation 6. If pi is a mixed strategy in ∆(Si ) we let p = (pj )j∈I and the expected value

Ep [ui ] = Σs∈S [Πj∈I pj (sj )] ui (si1 , . . . , sin ) = Σs∈S pi1 (si1 ) · · · pin (sin ) ui (si1 , . . . , sin )

is denoted by
ui (p)

Observe that

ui (p) = Σsi ∈Si pi (si ) ui (si , p−i )
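A small worked example of these formulas (the 2×2 payoff numbers are made up for illustration):

```python
from itertools import product

S1, S2 = ['T', 'B'], ['L', 'R']
u1 = {('T', 'L'): 3, ('T', 'R'): 0, ('B', 'L'): 1, ('B', 'R'): 2}
p1 = {'T': 0.25, 'B': 0.75}    # a mixed strategy for player i1
p2 = {'L': 0.5, 'R': 0.5}      # a mixed strategy for player i2

# u_i(p): sum over pure profiles, weighted by the product of probabilities
u1_p = sum(p1[s1] * p2[s2] * u1[(s1, s2)] for s1, s2 in product(S1, S2))

# the decomposition u_i(p) = sum over s_i of p_i(s_i) * u_i(s_i, p_-i)
u1_given = {s1: sum(p2[s2] * u1[(s1, s2)] for s2 in S2) for s1 in S1}
decomposed = sum(p1[s1] * u1_given[s1] for s1 in S1)
```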

Definição 12. We say that

ˆ there is no belief that player i could hold about the strategies the other players will choose

ˆ such that it would be optimal to play si

when
∀p−i ∈ Πj≠i ∆(Sj ), si ∉ arg max{Ep−i [ui (si )] : si ∈ Si }

In other words,

ˆ for every belief p−i that agent i could hold about the others,

ˆ there exists a pure strategy ŝi ∈ Si such that

Ep−i [ui (si )] < Ep−i [ui (ŝi )]

ˆ be careful, the strategy ŝi may depend on the belief p−i .

Proposição 7. Assume that the pure strategy si is strictly dominated by the pure strategy σi

∀s−i ∈ S−i , ui (si , s−i ) < ui (σi , s−i )

Then

ˆ there is no belief that player i could hold about the strategies the other players will choose such that
it would be optimal to play si .

More precisely, for every family p−i = (pj )j≠i of mixed strategies pj ∈ ∆(Sj ), we have

Ep−i [ui (si )] < Ep−i [ui (σi )]

In this case, the strategy σi improves the expected payoff independently of the belief p−i agent i holds about
the other players’ actions

Observação 13. The converse may not be true

Consider the following game


Player i2
L R
T 3,− 0,−
Player i1 M 0,− 3,−
B 1,− 1,−
For any belief pi2 agent i1 may have about i2 ’s strategies, the strategy B is never a best response
ˆ if pi2 (L) > 1/2 then i1 ’s best response is T

ˆ if pi2 (L) < 1/2 then i1 ’s best response is M

ˆ if pi2 (L) = 1/2 then i1 ’s best response is either T or M


However, the strategy B is not strictly dominated by another pure strategy

Consider the mixed strategy pi1 defined by

pi1 (T ) = 1/2, pi1 (M ) = 1/2 and pi1 (B) = 0

Such a probability will be denoted by1


pi1 = (1/2, 1/2, 0)

For any belief pi2 agent i1 may have about i2 ’s strategies,

ui1 (B, pi2 ) = ui1 (1B , pi2 ) = 1 < 3/2 = ui1 (pi1 , pi2 )

1
Sometimes one may find the notations pi1 = (1/2)Dirac(T ) + (1/2)Dirac(M ) or pi1 = (1/2)1T + (1/2)1M .
Observação 14. The strategy B is strictly dominated by the mixed strategy pi1 = (1/2, 1/2, 0)
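The dominance claim can be checked against a grid of beliefs; the payoffs are those of the table above, while the grid itself is just for illustration.

```python
u = {('T', 'L'): 3, ('T', 'R'): 0,
     ('M', 'L'): 0, ('M', 'R'): 3,
     ('B', 'L'): 1, ('B', 'R'): 1}          # player i1's payoffs

mix = {'T': 0.5, 'M': 0.5, 'B': 0.0}        # the dominating mixed strategy

min_gap = min(
    sum(mix[s] * (q * u[(s, 'L')] + (1 - q) * u[(s, 'R')]) for s in mix)
    - (q * u[('B', 'L')] + (1 - q) * u[('B', 'R')])
    for q in [k / 100.0 for k in range(101)]   # beliefs p(L) on a grid
)
# the mixed strategy beats B by 3/2 - 1 = 1/2 for every belief
```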

Observação 15. A given pure strategy can be a best response to a mixed strategy

ˆ even if the pure strategy is not a best response to any other pure strategy

Player i2
L R
T 3,− 0,−
Player i1 M 0,− 3,−
B 2,− 2,−

ˆ The pure strategy B is not a best response for player i1 to either L or R by player i2

ˆ but B is the best response for player i1 to the mixed strategy pi2 by player i2 provided that

1/3 < pi2 (L) < 2/3

Existence of Nash equilibrium

Nash equilibrium with mixed strategies


We fix a game G = (Si , ui )i∈I

Definição 13. A profile of mixed strategies p∗ = (p∗i )i∈I is a Nash equilibrium of the game G if

ˆ each player’s mixed strategy is a best response to the other players’ mixed strategies,

∀i ∈ I, p∗i ∈ arg max{ui (pi , p∗−i ) : pi ∈ ∆(Si )}.

The family p−i = (pj )j≠i represents player i’s uncertainty about which strategy each player j will choose

Observação 16. Fix three players i, j and k.


What player j believes about the possible strategies played by player i coincides with what player k believes

Consider an abstract game G = (Si , ui )i∈I

ˆ fix a family pi−i = (pij )j≠i of mixed strategies representing player i’s beliefs about the other players’ strategies

ˆ denote by Si∗ (pi−i ) the set of player i’s pure-strategy best responses, defined by

Si∗ (pi−i ) ≡ arg max{ui (si , pi−i ) : si ∈ Si }

– assume that Si is finite, then Si∗ (pi−i ) is non-empty

ˆ if pi is a mixed strategy in ∆(Si ), we denote by supp pi its support defined by

supp pi = {pi > 0} = {si ∈ Si : pi (si ) > 0}

Proposição 8. A mixed strategy p∗i is a best response to pi−i , i.e.,

p∗i ∈ arg max{ui (pi , pi−i ) : pi ∈ ∆(Si )}

if and only if the support of p∗i is a subset of the set of pure-strategy best responses to pi−i , i.e.,

{si ∈ Si : p∗i (si ) > 0} ≡ supp p∗i ⊂ Si∗ (pi−i )

In other words the set


arg max{ui (pi , pi−i ) : pi ∈ ∆(Si )}

of best responses to pi−i coincides with

P rob(Si∗ (pi−i )) = ∆(Si∗ (pi−i ))

NE with mixed strategies: An equivalent definition

Teorema 7. A profile of mixed strategies p∗ = (p∗i )i∈I is a Nash equilibrium of the game G if and only if

ˆ for every player i every pure strategy in the support of p∗i is a best response to the other players’ mixed
strategies

∀i ∈ I, supp p∗i ⊂ arg max{ui (si , p∗−i ) : si ∈ Si }

Interpretation 8. Players

ˆ have identical beliefs about other players’ possible actions or strategies

ˆ choose best response strategies consistent with these beliefs

Matching pennies

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

ˆ suppose that player i1 believes that player i2 will play

– Heads with probability q and


– Tails with probability 1 − q

ˆ given this belief we have

ui1 (Heads, (q, 1 − q)) = 1 − 2q and ui1 (T ails, (q, 1 − q)) = 2q − 1

ˆ player i1 ’s best response(s) is

– Heads if q < 1/2


– T ails if q > 1/2
– Heads and T ails if q = 1/2

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

Fix now a mixed strategy pi1 = (r, 1 − r) for player i1 , i.e.,

pi1 (Heads) = r and pi1 (T ails) = 1 − r

ˆ If agent i1 believes that i2 is playing the mixed strategy pi2 = (q, 1 − q)

ˆ Then we can compute the set of best responses

βi1 (pi2 ) ≡ arg max{ui1 (pi1 , pi2 ) : pi1 ∈ ∆(Si1 )}

ˆ Remember that we must have


βi1 (pi2 ) = P rob(Si∗1 (pi2 ))

ˆ since Si1 = {Heads, T ails}, there are only three possibilities

βi1 (pi2 ) = {Heads}, βi1 (pi2 ) = {T ails} or βi1 (pi2 ) = ∆(Si1 )

Observe that
ui1 (pi1 , pi2 ) = (2q − 1) + r(2 − 4q)

The mixed strategy pi1 = (r, 1 − r) solves

pi1 ∈ arg max{ui1 (qi1 , pi2 ) : qi1 ∈ P rob(Si1 )}

if and only if r belongs to the set

r ∗ (q) = arg max{(2q − 1) + r(2 − 4q) : r ∈ [0, 1]}

ˆ if q < 1/2 then r ∗ (q) = 1 and i1 ’s best response is to play the pure strategy Heads

ˆ if q > 1/2 then r ∗ (q) = 0 and i1 ’s best response is to play the pure strategy T ails

ˆ if q = 1/2 then r ∗ (q) = [0, 1] and any mixed strategy is a best response, i.e., i1 is indifferent between
Heads and T ails

Observação 17. The object q 7→ r ∗ (q) is called a correspondence.

Player i1 ’s best response (r ∗ (q), 1 − r ∗ (q)) to i2 ’s strategy (q, 1 − q)

Player i2
Heads Tails
Heads −1,1 1,−1
Player i1
Tails 1,−1 −1,1

ˆ assume now that player i2 plans to choose a mixed strategy pi2 = (q, 1 − q) i.e.,

pi2 (Heads) = q and pi2 (T ails) = 1 − q

ˆ if agent i2 believes that i1 is playing the mixed strategy pi1 = (r, 1 − r)

ˆ then we can compute the set of best responses

arg max{ui2 (pi1 , qi2 ) : qi2 ∈ ∆(Si2 )}

Observe that
ui2 (pi1 , pi2 ) = q(4r − 2) + (1 − 2r)

A mixed strategy pi2 = (q, 1 − q) is a best response to pi1 if and only if q belongs to the set

q ∗ (r) = arg max{q(4r − 2) + (1 − 2r) : q ∈ [0, 1]}

ˆ if r < 1/2 then q ∗ (r) = 0 and i2 ’s best response is to play the pure strategy T ails

ˆ if r > 1/2 then q ∗ (r) = 1 and i2 ’s best response is to play the pure strategy Heads

ˆ if r = 1/2 then q ∗ (r) = [0, 1] and any mixed strategy is a best response, i.e., i2 is indifferent between
Heads and T ails

Player i2 ’s best response (q ∗ (r), 1 − q ∗ (r)) to i1 ’s strategy (r, 1 − r)

Permuting q and r we get the following graph

We can draw in the same picture the best response correspondence of each player

A Nash equilibrium is a pair (p∗i1 , p∗i2 ) such that

p∗i ∈ arg max{ui (pi , p∗j ) : pi ∈ ∆(Si )}

ˆ The pair defined by p∗i1 = (r̂, 1 − r̂) and p∗i2 = (q̂, 1 − q̂)
is a Nash equilibrium if and only if
r̂ ∈ r ∗ (q̂) and q̂ ∈ q ∗ (r̂)

ˆ The unique Nash equilibrium of the Matching Pennies is then

p̂i1 = (1/2, 1/2) and p̂i2 = (1/2, 1/2)
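This equilibrium can be verified directly: against an opponent who mixes 1/2–1/2, every own mix earns expected payoff 0, so no deviation is profitable.

```python
U1 = {('H', 'H'): -1, ('H', 'T'): 1, ('T', 'H'): 1, ('T', 'T'): -1}

def ev1(r, q):
    # i1 plays Heads with prob. r, i2 plays Heads with prob. q
    return (r * q * U1[('H', 'H')] + r * (1 - q) * U1[('H', 'T')]
            + (1 - r) * q * U1[('T', 'H')] + (1 - r) * (1 - q) * U1[('T', 'T')])

def ev2(r, q):
    return -ev1(r, q)   # Matching Pennies is zero-sum

r_eq = q_eq = 0.5
best_dev_1 = max(ev1(r, q_eq) for r in [0.0, 0.25, 0.5, 0.75, 1.0])
best_dev_2 = max(ev2(r_eq, q) for q in [0.0, 0.25, 0.5, 0.75, 1.0])
# both maxima equal the equilibrium payoff of 0: deviations gain nothing
```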

The battle of the sexes


Pat
Opera Fight
Opera 2,1 0,0
Chris
Fight 0,0 1,2
Denote by

ˆ (q, 1 − q) the mixed strategy in which Pat plays Opera with probability q

ˆ (r, 1 − r) the mixed strategy in which Chris plays Opera with probability r

3 NE with mixed strategies

1. Pat and Chris play the pure strategy Opera

2. Pat and Chris play the pure strategy F ight

3. Pat plays the mixed strategy where Opera is chosen with probability 1/3 and Chris plays the mixed
strategy where Opera is chosen with probability 2/3
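The mixed equilibrium follows from the indifference conditions, which a short computation confirms (payoffs as in the table above):

```python
q = 1.0 / 3.0   # Pat's prob. of Opera, making Chris indifferent (2q = 1 - q)
r = 2.0 / 3.0   # Chris's prob. of Opera, making Pat indifferent (r = 2(1 - r))

chris_opera = 2 * q + 0 * (1 - q)   # Chris's payoff from Opera vs Pat's mix
chris_fight = 0 * q + 1 * (1 - q)   # ... and from Fight
pat_opera = 1 * r + 0 * (1 - r)     # Pat's payoff from Opera vs Chris's mix
pat_fight = 0 * r + 2 * (1 - r)     # ... and from Fight
```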

2 players with 2 pure strategies


Consider the problem of defining player i1 ’s best response (r, 1 − r) when player i2 plays (q, 1 − q)

Player i2
Left Right
Up x,− y,−
Player i1
Down z,− w,−
We discuss the four following cases

(i) x > z and y > w

(ii) x < z and y < w

(iii) x > z and y < w

(iv) x < z and y > w

Then we turn to the remaining cases involving x = z or y = w

ˆ In case (i), the pure strategy U p strictly dominates Down

ˆ In case (ii), the pure strategy Down strictly dominates U p

ˆ In cases (iii) and (iv), neither U p nor Down is strictly dominated

ˆ let q ′ = (w − y)/(x − z + w − y)

ˆ In case (iii) U p is optimal for q > q ′ and Down for q < q ′ , whereas in case (iv) the reverse is true

ˆ In both cases, any value of r is optimal when q = q ′

ˆ Observe that q ′ = 1 if x = z and q ′ = 0 if y = w

ˆ For cases involving x = z or y = w the best response correspondences are L-shaped (two adjacent sides
of the unit square)

ˆ If we add arbitrary payoffs for player i2

ˆ Then we can perform analogous computations and get the same 4 best-response correspondences

ˆ Fix any of the four best response correspondence for player i1

ˆ Fix any of the four best response correspondence for player i2

ˆ Checking all 16 possible pairs, there is always at least one intersection

We obtain the following qualitative features that can result: There can be

ˆ a single pure strategy Nash equilibrium

ˆ a single mixed strategy equilibrium

ˆ 2 pure strategy equilibria and a single mixed strategy equilibrium

Nash existence result

Teorema 9 (Nash). Consider a game G = (Si , ui )i∈I with a finite set of players I. If for each player i the set of
pure strategies Si is finite then there exists at least one Nash equilibrium with mixed strategies.

General existence result

Teorema 10. Consider a game G = (Si , ui )i∈I


and assume that for each player i,

(1) the set Si is a compact, convex and non-empty subset of Rni for some ni ∈ N
(2) the payoff function s → ui (s) is continuous on S = Πi∈I Si

(3) for each s−i ∈ S−i , the function si → ui (si , s−i ) is quasi-concave in the sense that

∀si ∈ Si , {s′i ∈ Si : ui (s′i , s−i ) ≥ ui (si , s−i )} is convex

Then there exists at least one pure strategy Nash equilibrium

Cap. 2 - Dynamic games of complete information
2.1 Dynamic games of complete and perfect information
Theory: Backwards induction

Important words

ˆ we introduce dynamic games

ˆ restrict our attention to games with complete information

– the players’ payoff functions are common knowledge

ˆ in this chapter we analyze dynamic games with complete but also perfect information

– at each move in the game


– the player with the move knows the full history of the play of the game thus far

ˆ the central issue in dynamic games is credibility

An example:
Consider the following 2-move game

1. player i1 chooses between giving player i2 $1,000 and giving player i2 nothing

2. player i2 observes player i1 ’s move and then chooses whether or not to explode a grenade that will kill
both players

Suppose that player i2 threatens to explode the grenade unless player i1 pays the $1, 000

ˆ if player i1 believes the threat, then


player i1 ’s best response is to pay the $1, 000

ˆ but player i1 should not believe the threat, because it is not credible:

– if player i2 were given the opportunity to carry out the threat


– player i2 would choose not to carry it out

ˆ player i1 should pay player i2 nothing

The framework
We analyze in this chapter the following class of dynamic games with complete and perfect information

ˆ there are 2 players and 2 moves

ˆ first, player i1 moves

ˆ then player i2 observes player i1 ’s move

ˆ then player i2 moves and the game ends

Description of a specific class of games

1. player i1 chooses an action ai1 from a feasible set Ai1

2. player i2 observes ai1 and then chooses an action ai2 from a feasible set Ai2

3. payoffs are ui1 (ai1 , ai2 ) and ui2 (ai1 , ai2 )

Other dynamic games with complete and perfect information


The key features of a dynamic game of complete and perfect information are that

1. the moves occur in sequence

2. all previous moves are observed before the next move is chosen

3. the players’ payoffs from each feasible combination of moves are common knowledge

Backwards induction
We solve a game from this class by backwards induction as follows:

ˆ when player i2 gets the move at the second stage of the game

ˆ he will face the following problem

max{ui2 (ai1 , ai2 ) : ai2 ∈ Ai2 } (with ai1 given)

ˆ assume that for each ai1 ∈ Ai1 , player i2 ’s optimization problem has a unique solution, denoted by Ri2 (ai1 )

– this is player i2 ’s reaction (or best response) to player i1 ’s action

ˆ recall that payoffs are common knowledge

– therefore player i1 can solve i2 ’s problem as well as i2 can

ˆ player i1 will anticipate player i2 ’s reaction to each action ai1 that i1 might take

ˆ thus player i1 ’s problem at the first stage amounts to

max{ui1 (ai1 , Ri2 (ai1 )) : ai1 ∈ Ai1 }

ˆ assume that the previous optimization problem for i1 also has a unique solution, denoted by a∗i1

Definição 14. The pair of actions (a∗i1 , Ri2 (a∗i1 )) is called the backwards induction outcome of this game

Backwards induction and credible threats

ˆ the backwards induction outcome does not involve non-credible threats

ˆ player i1 anticipates that player i2 will respond optimally to any action ai1 that i1 might choose, by
playing Ri2 (ai1 )

ˆ player i1 gives no credence to threats by player i2 to respond in ways that will not be in i2 ’s self-interest
when the second stage arrives
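For finite action sets, the two-stage procedure can be written down directly; the action names and payoff tables below are hypothetical.

```python
A1, A2 = ['a', 'b'], ['x', 'y']                     # hypothetical action sets
u1 = {('a', 'x'): 3, ('a', 'y'): 1, ('b', 'x'): 2, ('b', 'y'): 4}
u2 = {('a', 'x'): 1, ('a', 'y'): 2, ('b', 'x'): 1, ('b', 'y'): 0}

# stage 2: player i2's best response to each observed action of i1
R2 = {a1: max(A2, key=lambda a2: u2[(a1, a2)]) for a1 in A1}

# stage 1: i1 anticipates R2 and maximizes u1(a1, R2(a1))
a1_star = max(A1, key=lambda a1: u1[(a1, R2[a1])])
outcome = (a1_star, R2[a1_star])
```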

A 3-move game
Consider the following 3-move game: player i1 moves twice

1. Player i1 chooses L or R

ˆ L ends the game with payoffs of 2 to i1 and 0 to i2

2. Player i2 observes i1 ’s choice:


if i1 chose R then i2 can choose between L′ and R′

ˆ L′ ends the game with payoffs of 1 to both players

3. Player i1 observes i2 ’s choice2 :


if the earlier choices were R and R′ then i1 chooses L′′ or R′′ , both of which end the game

ˆ L′′ with payoffs of 3 to player i1 and 0 to player i2


ˆ R′′ with payoffs of 0 to player i1 and 2 to player i2

2
And recalls his own choice in the first stage

Let’s compute the backwards induction outcome of this game

ˆ we begin at the third stage, i.e., player i1 ’s second move

– the strategy L′′ is optimal

ˆ at the second stage, player i2 anticipates that if the game reaches the third stage then i1 will play L′′

– payoff of 1 from action L′

– payoff of 0 from action R′

at the second stage, the optimal action for player i2 is L′

ˆ at the first stage, player i1 anticipates that if the game reaches the second stage then i2 will play L′

– payoff of 2 from action L

– payoff of 1 from action R

the first stage choice for player i1 is L, thereby ending the game

Stackelberg model of duopoly

Von Stackelberg (1934) proposed a dynamic model of duopoly

ˆ a dominant (leader) firm moves first

ˆ a subordinate (follower) firm moves second

At some points in the history of the U.S. automobile industry, for example, General Motors has seemed to play
such a leadership role

ˆ as in the Cournot model, Stackelberg assumes that firms choose quantities

Timing of the game

1. firm i1 chooses the quantity qi1

2. firm i2 observes qi1 and then chooses a quantity qi2

3. the payoff to firm i is given by the profit function

πi (qi , qj ) = qi [P (Q) − c]

where

ˆ P (Q) = [a − Q]+ is the market-clearing price when the aggregate quantity on the market is Q =
q i1 + q i2
ˆ c is the constant marginal cost of production (no fixed costs)

Solving by backwards induction

ˆ we first compute i2 ’s reaction to an arbitrary quantity of i1

Ri2 (qi1 ) ≡ arg max{πi2 (qi1 , qi2 ) : qi2 ≥ 0}

which yields

Ri2 (qi1 ) = (a − c − qi1 )/2 if qi1 < a − c
Ri2 (qi1 ) = 0 if qi1 ≥ a − c

ˆ second, i1 can solve i2 ’s problem as well as i2 can solve it

ˆ firm i1 should anticipate that the quantity choice qi1 will be met with the reaction Ri2 (qi1 )

ˆ Firm i1 ’s problem in the first stage of the game amounts to

arg max{πi1 (qi1 , Ri2 (qi1 )) : qi1 ≥ 0}

ˆ The backwards induction outcome of the Stackelberg duopoly game is (qi∗1 , qi∗2 ) where

qi∗1 = (a − c)/2 and qi∗2 = Ri2 (qi∗1 ) = (a − c)/4
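A numerical backwards-induction check of this outcome (a = 10 and c = 1 are illustrative):

```python
a, c = 10.0, 1.0   # illustrative parameters

def R2(q1):
    # follower's reaction function from the second stage
    return max((a - c - q1) / 2.0, 0.0)

def leader_profit(q1):
    Q = q1 + R2(q1)
    return q1 * (max(a - Q, 0.0) - c)

# grid search over the leader's first-stage quantity
grid = [k * (a - c) / 10000.0 for k in range(10001)]
q1_star = max(grid, key=leader_profit)
q2_star = R2(q1_star)
# recovers q1* = (a - c)/2 and q2* = (a - c)/4
```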

Interpretation

ˆ in the Nash equilibrium of the Cournot game (simultaneous moves) each firm produces (a − c)/3

– thus aggregate quantity in the backwards induction outcome of the Stackelberg game, 3(a − c)/4, is
greater than in the Cournot-Nash equilibrium
– so the market clearing price is lower in the Stackelberg game

ˆ in the Stackelberg game, i1 could have chosen its Cournot quantity, (a − c)/3

– in which case i2 would have responded with its Cournot quantity

ˆ in the Stackelberg game, i1 could have achieved its Cournot profit level but chose to do otherwise

ˆ so i1 ’s profit in the Stackelberg game must exceed its profit in the Cournot game

ˆ but because the market clearing price is lower in the Stackelberg game

ˆ the aggregate profits are lower wrt. the Cournot outcome

ˆ therefore, the fact that i1 is better off implies that i2 is worse off

Observação 18. In game theory, having more information can make a player worse off.

ˆ more precisely, having it known to the other players that one has more information can make a player
worse off

ˆ in the Stackelberg game, the information in question is i1 ’s quantity

ˆ firm i2 knows i1 ’s action qi1

ˆ and firm i1 knows that i2 knows qi1

Wages and employment in a unionized firm

Leontief (1946) proposed the following model of the relationship between a firm and a monopoly union

ˆ The union is the monopoly seller of labor to the firm

ˆ The union has exclusive control over wages

ˆ But the firm has exclusive control over employment

ˆ The union’s utility function is U (w, L) where

– w is the wage the union demands from the firm


– L is employment

ˆ We assume that (w, L) → U (w, L) is increasing in both w and L

ˆ The firm’s profit function is


π(w, L) ≡ R(L) − wL

– R(L): revenue the firm can earn if it employs L workers

ˆ We assume that L 7→ R(L) is

– twice continuously differentiable

– strictly increasing (i.e., R′ > 0)
– strictly concave (i.e., R′′ < 0) and
– satisfies Inada’s conditions at 0 and ∞, i.e.,

limL→0+ R′ (L) = ∞ and limL→∞ R′ (L) = 0

Timing of the game

1. The union makes a wage demand, w

2. The firm observes and accepts w and then chooses employment, L

3. Payoffs are U (w, L) and π(w, L)

Backwards induction outcome of the game

ˆ First, we can characterize the firm’s best response L∗ (w) in stage 2 to an arbitrary wage demand w by
the union in stage 1

ˆ Given w the firm chooses L∗ (w) to solve

L∗ (w) ≡ arg max{π(w, L) : L ≥ 0}

ˆ If w > 0 then there is a unique solution L∗ (w) satisfying

R′ (L∗ (w)) = w

Firm’s isoprofit curves:

Fixing the wage level w′ on the vertical line

ˆ the firm’s choice of L is a point on the horizontal line {(L, w′ ) : L ≥ 0}

ˆ Holding L fixed, the firm does better when w is lower

– optimal L is such that the isoprofit curve through (L, w) is tangent to the constraint {(L, w′ ) : L ≥ 0}

Union’s indifference curves

ˆ Holding L fixed, the union does better when w is higher

ˆ Higher indifference curves represent higher utility levels for the union

We turn to the union’s problem at stage 1

ˆ The union can solve the firm’s second stage problem as well as the firm can solve it

ˆ The union should anticipate that the firm’s reaction to the wage demand w will be to choose the employ-
ment level L∗ (w)

ˆ Thus, the union’s problem at stage 1 amounts to solve

arg max{U (w, L∗ (w)) : w > 0}

ˆ The union would like to choose the wage demand w that yields the outcome (w, L∗ (w)) that is on the
highest possible indifference curve

The solution to the union’s problem, w∗ , is the wage demand such that

ˆ the union’s indifference curve through the point (L∗ (w∗ ), w∗ ) is tangent to the curve {(L∗ (w), w) : w > 0} at
that point
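A worked example under hypothetical functional forms, R(L) = √L and U (w, L) = (w − w0 )L with reservation wage w0 (neither functional form is in the text):

```python
w0 = 1.0   # hypothetical reservation wage

def L_star(w):
    # firm's stage-2 choice: R'(L) = 1/(2*sqrt(L)) = w  =>  L = 1/(4 w^2)
    return 1.0 / (4.0 * w * w)

def union_objective(w):
    # stage-1 objective, anticipating the firm's reaction L_star(w)
    return (w - w0) * L_star(w)

grid = [1.0 + k / 10000.0 for k in range(40001)]   # wage demands in [1, 5]
w_star = max(grid, key=union_objective)
# under these functional forms the optimal wage demand is w* = 2*w0
```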

Inefficiency

ˆ The backwards induction outcome (w∗ , L∗ (w∗ )) is inefficient

ˆ Both the union’s utility and the firm’s profit would be increased if (L, w) were in the shaded region

Repeated games
Espinosa and Rhee (1989) propose one answer to this puzzle

ˆ Based on the fact that the union and the firm negotiate repeatedly over time

ˆ There may exist an equilibrium of such a repeated game in which the union’s choice of w and the firm’s
choice of L lie in the shaded region

Sequential bargaining

ˆ Two players are bargaining over one dollar

ˆ They alternate in making offers

ˆ First player i1 makes a proposal that i2 can accept or reject

ˆ If i2 rejects then i2 makes a proposal that i1 can accept or reject

ˆ And so on

ˆ Each offer takes one period, and the players are impatient

– they discount payoffs received in later periods by a factor δ ∈ (0, 1) per period

Discount factor
The discount factor δ reflects the time-value of money

ˆ A dollar received at the beginning of one period can be put in the bank to earn interest, say at rate r per
period

– So this dollar will be worth 1 + r dollars at the beginning of the next period

ˆ Equivalently, a dollar to be received at the beginning of the next period is worth only 1/(1 + r) of a dollar
now

Let δ = 1/(1 + r). Then, a payoff π to be received

ˆ in the next period is worth only δπ now

ˆ two periods from now is worth only δ 2 π now, and so on

Observação 19. The value today of a future payoff is called the present value of that payoff.

The 3-period case


Timing of 3-period bargaining game

(1a) At the beginning of the first period, player i1 proposes to take a share s1 of the dollar, leaving 1 − s1 for
player i2

(1b) Player i2 either

ˆ accepts the offer:


the game ends and the payoffs s1 to i1 and 1 − s1 to i2 are immediately received
ˆ rejects the offer,
play continues to the second period

(2a) At the beginning of the second period, i2 proposes that player i1 take a share s2 of the dollar,3 leaving
1 − s2 for i2

(2b) Player i1 either

ˆ accepts the offer:


the game ends and the payoffs s2 to i1 and 1 − s2 to i2 are immediately received
ˆ rejects the offer:
play continues to the third period

(3) At the beginning of the third period,

ˆ i1 receives a share s of the dollar


ˆ i2 receives a share 1 − s of the dollar

where s ∈ (0, 1) is exogenously given

Backwards induction outcome


We first compute i2 ’s optimal offer if the second period is reached

ˆ Player i1 can receive s in the third period by rejecting i2 ’s offer of s2 this period

ˆ But the value this period of receiving s next period is only δs

ˆ Thus, i1 will

– accept s2 if s2 ≥ δs
– reject s2 if s2 < δs

ˆ We assume that each player will accept an offer if indifferent between accepting and rejecting

ˆ Player i2 ’s decision problem in the second period amounts to choosing between

– receiving 1 − δs this period by offering s2 = δs to player i1


– receiving 1 − s next period by offering player i1 any s2 < δs

ˆ The discounted value of the latter decision is δ(1 − s),

– which is less than 1 − δs available from the former option

ˆ So player i2 ’s optimal second-period offer is s∗2 = δs

3. st always goes to player i1 regardless of who made the offer

Remark 20. If play reaches the second period, player i2 will offer s∗2 and player i1 will accept.

ˆ Since i1 can solve i2 ’s second-period problem as well as player i2 can

ˆ Then i1 knows that i2 can receive 1 − s∗2 in the second period by rejecting i1 ’s offer of s1 this period

ˆ The value this period of receiving 1 − s∗2 next period is only δ(1 − s∗2 )

ˆ Thus player i2 will accept i1 ’s offer of s1 this period ⇔

1 − s1 ≥ δ(1 − s∗2 ) or s1 ≤ 1 − δ(1 − s∗2 )

ˆ Player i1 ’s first-period decision problem therefore amounts to choosing between

– receiving 1 − δ(1 − s∗2 ) this period by offering 1 − s1 = δ(1 − s∗2 ) to i2


– receiving s∗2 next period by offering 1 − s1 < δ(1 − s∗2 ) to i2

ˆ The discounted value of the latter option is δs∗2 = δ2 s

– which is less than the 1 − δ(1 − s∗2 ) = 1 − δ(1 − δs) available from the former option

ˆ Thus player i1 ’s optimal first-period offer is

s∗1 = 1 − δ(1 − s∗2 ) = 1 − δ(1 − δs)

Remark 21. The backwards induction outcome of this 3-period game is

ˆ i1 offers the settlement (s∗1 , 1 − s∗1 ) to i2 , who accepts
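The backwards-induction formulas above can be checked with a short numerical sketch (the values δ = 0.9 and s = 0.5 are illustrative, not from the text):

```python
# Backwards induction in the 3-period bargaining game, a minimal sketch.
# delta (discount factor) and s (exogenous third-period share) are illustrative.
delta, s = 0.9, 0.5

s2_star = delta * s                  # i2's optimal second-period offer to i1
s1_star = 1 - delta * (1 - s2_star)  # i1's optimal first-period share

# i1 prefers settling now over receiving s2_star one period later:
assert s1_star > delta * s2_star
```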

The infinite horizon case

ˆ The timing is as described previously

ˆ Except that the exogenous settlement in step (3) is replaced by an infinite sequence of steps (3a), (3b),
(4a), (4b), and so on

– Player i1 makes the offer in odd-numbered period


– Player i2 in even-numbered

ˆ Bargaining continues until one player accepts an offer

ˆ We would like to solve backwards

ˆ Because the game could go on infinitely, there is no last move at which to begin such an analysis

A solution was proposed by Shaked and Sutton (1984)

ˆ The game beginning in the third period (should it be reached) is identical to the game as a whole (beginning
in the first period)

ˆ In both cases (game beginning in the 3rd period or as a whole)

– player i1 makes the first offer


– the players alternate in making subsequent offers
– the bargaining continues until one player accepts an offer

ˆ Suppose that there is a backwards induction outcome of the game as a whole in which players i1 and i2
receive the payoffs s and 1 − s

ˆ We can use these payoffs in the game beginning in the third period, should it be reached

ˆ And then work backwards to the first period, as in the 3-period model, to compute a new backwards
induction outcome for the game as a whole

ˆ In this new backwards induction outcome, i1 will offer the settlement (f (s), 1 − f (s)) in the first period
and i2 will accept, where
f (s) = 1 − δ(1 − δs)

ˆ Let sH be the highest payoff player i1 can achieve in any backwards induction outcome of the game as a
whole

ˆ Using sH as the third-period payoff to player i1 , this will produce a new backwards induction outcome in
which player i1 ’s first-period payoff is f (sH )

ˆ Since s 7→ f (s) = 1 − δ + δ2 s is increasing, the payoff f (sH ) must coincide with sH

ˆ The only value of s that satisfies f (s) = s is 1/(1 + δ), which will be denoted by s∗

ˆ Actually we can prove that (s∗ , 1 − s∗ ) is the unique backwards-induction outcome of the game as a whole

– In the first period, i1 offers the settlement (s∗ , 1 − s∗ )


– Player i2 accepts
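The fixed-point argument of Shaked and Sutton can be illustrated by iterating the map f until convergence (a minimal sketch; δ = 0.9 is an illustrative value):

```python
# Fixed point of f(s) = 1 - delta*(1 - delta*s), a minimal sketch.
# delta = 0.9 is illustrative; f is a contraction with modulus delta**2 < 1.
delta = 0.9
f = lambda s: 1 - delta * (1 - delta * s)

s = 0.5  # arbitrary starting guess
for _ in range(200):
    s = f(s)  # iteration converges to the unique fixed point

s_star = 1 / (1 + delta)  # the closed-form solution of f(s) = s
```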

2.2 Two-stage games of complete but imperfect information


Theory: Subgame perfection

ˆ We continue to assume that play proceeds in a sequence of stages

ˆ The moves in all previous stages are observed before the next stage begins

ˆ However, we now allow there to be simultaneous moves within each stage

– The game has imperfect information

We will analyze the following simple game:

1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively

2. Players i3 and i4 observe the outcome of the first stage, (ai1 , ai2 ), and then simultaneously choose actions
ai3 and ai4 from feasible sets Ai3 and Ai4 , respectively

3. Payoffs are ui (ai1 , ..., ai4 )

ˆ The feasible action sets of players i3 and i4 in the second stage, Ai3 and Ai4 , could be allowed to depend
on the outcome of the first stage, (ai1 , ai2 )

ˆ In particular, there may be values of (ai1 , ai2 ) that end the game

ˆ One could allow for a longer sequence of stages either by allowing players to move in more than one stage
or by adding players

ˆ In some applications, players i3 and i4 are players i1 and i2

ˆ In other applications, either player i2 or player i4 is missing

ˆ We solve the game by using an approach in the spirit of backwards induction

ˆ The first step in working backwards from the end of the game involves solving a simultaneous-move game
between players i3 and i4 in stage 2, given the outcome of stage 1

ˆ We will assume that for each feasible outcome (ai1 , ai2 ) of the first-stage game, the second-stage game that
remains between players i3 and i4 has a unique Nash equilibrium denoted by (âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ If i1 and i2 anticipate that the second-stage behavior of i3 and i4 will be given by the functions âi3 and
âi4

ˆ Then the first-stage interaction between i1 and i2 amounts to the following simultaneous-move game

1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
2. Payoffs are
ui (ai1 , ai2 , âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ Suppose (a∗i1 , a∗i2 ) is the unique Nash equilibrium of this simultaneous-move game

ˆ We will call
(a∗i1 , a∗i2 , a∗i3 , a∗i4 )

the subgame-perfect outcome of this two-stage game, where

a∗i3 = âi3 (a∗i1 , a∗i2 ) and a∗i4 = âi4 (a∗i1 , a∗i2 )

Attractive feature 11.

ˆ Players i1 and i2 should not believe a threat by players i3 and i4 that the latter will respond with actions
that are not a Nash equilibrium in the remaining second-stage game

ˆ Because when play actually reaches the second stage at least one of i3 and i4 will not want to carry out
such a threat exactly because it is not a best response

Unattractive feature 12.

ˆ Suppose player i1 is also player i3 and that player i1 does not play a∗i1 in the first stage

ˆ Player i4 may then want to reconsider the assumption that player i3 (i.e., player i1 ) will play âi3 (ai1 , ai2 )
in the second stage

Bank runs

Diamond and Dybvig (1983)

ˆ Two investors have each deposited D with a bank

ˆ The bank has invested the deposits 2D in a long-term project

ˆ If the bank is forced to liquidate its investment before the project matures, a total of α(2D) can be
recovered, where
1/2 < α < 1

ˆ If the bank allows the investment to reach maturity, the project will pay out a total of β(2D), where
β>1

ˆ There are two dates at which investors can make withdrawals from the bank

– date 1 is before the bank’s investment matures


– date 2 is after

ˆ For simplicity we assume that there is no discounting

ˆ If both investors make withdrawals at date 1 then each receives αD and the game ends

ˆ If only one investor makes a withdrawal at date 1

– then that investor receives the whole deposit D,


– the other receives (2α − 1)D,
– and the game ends

ˆ Finally, if neither investor makes a withdrawal at date 1 then the project matures and the investors make
withdrawal decisions at date 2

ˆ If both investors make withdrawals at date 2 then each receives βD > D and the game ends

ˆ If only one investor makes a withdrawal at date 2 then that investor receives (2β − 1)D > βD, the other
receives D, and the game ends

ˆ Finally if neither investor makes a withdrawal at date 2 then the bank returns βD to each investor and
the game ends

withdraw don’t withdraw


withdraw αD, αD D, (2α − 1)D
don’t withdraw (2α − 1)D, D next stage

withdraw don’t withdraw


withdraw βD, βD (2β − 1)D, D
don’t withdraw D, (2β − 1)D βD, βD

ˆ To analyze this game, we work backwards

ˆ Consider the normal-form game at date 2

ˆ The strategy withdraw strictly dominates don’t withdraw

ˆ There is a unique Nash equilibrium in this game: both investors withdraw, leading to a payoff of (βD, βD)

ˆ Since there is no discounting, we can simply substitute this payoff into the normal-form game at date 1

Date 1

withdraw don’t withdraw


withdraw αD, αD D, (2α − 1)D
don’t withdraw (2α − 1)D, D βD, βD
This one-period version of the two-period game has two pure-strategy Nash equilibria:

1. both investors withdraw, leading to a payoff of (αD, αD)

2. both investors do not withdraw, leading to a payoff of (βD, βD)

ˆ The original 2-period bank runs game has two subgame perfect outcomes

1. both investors withdraw at date 1, yielding payoffs of (αD, αD)


2. both investors do not withdraw at date 1 but do withdraw at date 2, yielding payoffs of (βD, βD)

ˆ The first of these outcomes can be interpreted as a run on the bank

ˆ If investor i1 believes that investor i2 will withdraw at t = 1

– then investor i1 ’s best response is to withdraw,


– even though both investors would be better off if they waited until date 2 to withdraw

Remark 22. Since there are two subgame perfect equilibria,

ˆ this model does not predict when bank runs will occur,

ˆ but does show that they can occur as an equilibrium phenomenon
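The two equilibria of the date-1 reduced game can be verified with a brute-force best-response check (a sketch; α = 0.7, β = 1.5, D = 1 are illustrative values satisfying 1/2 < α < 1 and β > 1):

```python
# Pure-strategy NE of the date-1 reduced game, a minimal sketch.
# alpha, beta, D are illustrative values; 0 = withdraw, 1 = don't withdraw.
alpha, beta, D = 0.7, 1.5, 1.0
W, N = 0, 1

# payoffs[(a1, a2)] = (payoff to investor 1, payoff to investor 2)
payoffs = {
    (W, W): (alpha * D, alpha * D),
    (W, N): (D, (2 * alpha - 1) * D),
    (N, W): ((2 * alpha - 1) * D, D),
    (N, N): (beta * D, beta * D),  # continuation value of the date-2 NE
}

def is_nash(a1, a2):
    u1, u2 = payoffs[(a1, a2)]
    best1 = all(u1 >= payoffs[(d, a2)][0] for d in (W, N))
    best2 = all(u2 >= payoffs[(a1, d)][1] for d in (W, N))
    return best1 and best2

nash = [profile for profile in payoffs if is_nash(*profile)]  # two pure NE
```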

Tariffs and imperfect international competition

ˆ Consider two identical countries, denoted by i1 and i2

ˆ Each country has

– a government that chooses a tariff rate


– a firm that produces output for both home consumption and export
– consumers who buy on the home market from either the home firm or the foreign firm

ˆ If the total quantity on the market in country i is Qi , then the market clearing price is

Pi (Qi ) = [a − Qi ]+

ˆ The firm in country i (called firm i) produces hi for home consumption and ei for export, in particular
we have
Qi = hi + ej

ˆ The firms have a constant marginal cost c and no fixed costs (we assume that c < a)

ˆ The total cost of production for firm i is

Ci (hi , ei ) ≡ c(hi + ei )

ˆ The firms also incur tariff costs on exports

– if firm i exports ei to country j


– when government j has set the tariff rate tj

then firm i must pay tj ei to government j

Timing

1. The governments simultaneously choose tariff rates, ti1 and ti2

2. The firms observe the tariff rates and simultaneously choose quantities for home consumption and for
export (hi , ei )

3. Payoffs are profit to firms and total welfare to governments

ˆ Profit to firm i is

πi (ti , tj , hi , ei , hj , ej ) ≡ [a − (hi + ej )]+ hi + [a − (ei + hj )]+ ei − c(hi + ei ) − tj ei

ˆ Total welfare to government i,

Wi (ti , tj , hi , ei , hj , ej ) ≡ (1/2)Q2i + πi (ti , tj , hi , ei , hj , ej ) + ti ej

where total welfare is the sum of

– consumers’ surplus enjoyed by the consumers in country i,


– the profit earned by the firm i, and
– the tariff revenue collected by government i from firm j

Solution

ˆ Suppose the governments have chosen the tariffs ti1 and ti2

ˆ Assume that (h∗i1 , e∗i1 , h∗i2 , e∗i2 ) is a Nash equilibrium in the remaining game between firms i1 and i2

ˆ Then, for each i, (h∗i , e∗i ) must solve

arg max{πi (ti , tj , hi , ei , h∗j , e∗j ) : hi ≥ 0 and ei ≥ 0}

ˆ Firm i is maximizing profits on market i and market j

– h∗i must solve


arg max{hi [a − (hi + e∗j )]+ − chi : hi ≥ 0}

– e∗i must solve


arg max{ei [a − (ei + h∗j )]+ − (c + tj )ei : ei ≥ 0}

ˆ Assuming e∗j ≤ a − c, we have

h∗i = (a − e∗j − c)/2

ˆ Assuming h∗j ≤ a − c − tj , we have

e∗i = (a − h∗j − c − tj )/2
ˆ We obtain four equations with four unknowns

ˆ If ti ≤ (a − c)/2 for each player i, then the solutions are

h∗i (ti ) = (a − c + ti )/3 and e∗i (tj ) = (a − c − 2tj )/3

ˆ In the Cournot game, both firms were choosing the quantity (a − c)/3,

– but this result was derived under the assumption of symmetric marginal costs

ˆ In the equilibrium described above, the governments’ tariff choices make marginal costs asymmetric

– On market i, firm i’s marginal cost is c but firm j’s is c + ti


– Since firm j’s cost is higher it wants to produce less
– If firm j is going to produce less, then the market-clearing price will be higher, so firm i wants to
produce more

ˆ In equilibrium the function h∗i increases in ti and e∗j decreases (at a faster rate) in ti

ˆ Having solved the second-stage game that remains between the two firms after the governments choose
tariff rates

ˆ We can now represent the first-stage interaction between the two governments as the following simultaneous-
move game

ˆ First, the governments simultaneously choose tariff rates ti1 and ti2

ˆ Second, payoffs are


Wi (ti , tj , h∗i (ti ), e∗i (tj ), h∗j (tj ), e∗j (ti ))

ˆ We now solve for the Nash equilibrium of this game between the governments

ˆ We denote by (ti , tj ) 7→ Wi∗ (ti , tj ) the function defined by

Wi∗ (ti , tj ) ≡ Wi (ti , tj , h∗i (ti ), e∗i (tj ), h∗j (tj ), e∗j (ti ))

ˆ If (t∗i , t∗j ) is a Nash equilibrium of this game between governments then, for each i, the tariff t∗i must solve

arg max{Wi∗ (ti , t∗j ) : ti ≥ 0}

ˆ We propose to show that there exists a solution

(t∗i , t∗j ) ∈ (0, (a − c)/2) × (0, (a − c)/2)

ˆ Observe that if ti and t∗j belong to (0, (a − c)/2) then Wi∗ (ti , t∗j ) equals

(2(a − c) − ti )2 /18 + (a − c + ti )2 /9 + (a − c − 2t∗j )2 /9 + ti (a − c − 2ti )/3

ˆ A solution is t∗i = (a − c)/3 for each i, independent of t∗j

ˆ In this model, choosing a tariff rate of (a − c)/3 is a dominant strategy for each government

ˆ We then obtain the following firms’ quantity choices for the second-stage

h∗i (t∗i ) = 4(a − c)/9 and e∗i (t∗j ) = (a − c)/9

ˆ Thus, the subgame-perfect outcome of this tariff game is

t∗i1 = t∗i2 = (a − c)/3, h∗i1 = h∗i2 = 4(a − c)/9 and e∗i1 = e∗i2 = (a − c)/9
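The subgame-perfect outcome above can be verified numerically against the firms' second-stage best-response conditions (a sketch; a = 10 and c = 1 are illustrative values):

```python
# Numerical check of the tariff game's subgame-perfect outcome, a sketch.
# a = 10, c = 1 are illustrative; the formulas are those derived above.
a, c = 10.0, 1.0

t_star = (a - c) / 3       # dominant-strategy tariff
h_star = 4 * (a - c) / 9   # home-consumption quantity
e_star = (a - c) / 9       # export quantity

# Second-stage best responses: h_i = (a - e_j - c)/2, e_i = (a - h_j - c - t_j)/2
assert abs(h_star - (a - e_star - c) / 2) < 1e-12
assert abs(e_star - (a - h_star - c - t_star) / 2) < 1e-12
```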

ˆ If the governments had chosen tariff rates equal to 0

ˆ Then the aggregate quantity on each market would have been

Qi = 2(a − c)/3

just as in the Cournot model

ˆ The consumers’ surplus on market i is lower when the governments choose their dominant strategy tariffs
than it would be if they chose zero tariffs

ˆ In fact, zero tariffs is socially optimal, i.e., it is the solution of

arg max{Wi∗1 (ti1 , ti2 ) + Wi∗2 (ti1 , ti2 ) : ti1 ≥ 0 and ti2 ≥ 0}

ˆ There is an incentive for the governments to sign a treaty in which they commit to zero tariffs

Tournaments

Lazear and Rosen (1981)

ˆ Consider two workers J = {j1 , j2 } and their boss

ˆ Worker j produces output yj = ej + εj

– ej is effort and εj is noise

ˆ Production proceeds as follows:

1. The workers simultaneously choose non-negative effort levels: ej ≥ 0


2. The noise terms εj1 and εj2 are independently drawn from a density f : R → [0, ∞) with zero mean
3. The workers’ outputs are observed but their effort choices are not

ˆ The workers’ wages therefore can depend on their outputs but not directly on their effort levels

ˆ Suppose the boss decides to induce effort by having the workers compete in a tournament

ˆ The winner of the tournament is the worker with the higher output

– wH : wage earned by the winner of the tournament


– wL : the wage earned by the loser

ˆ The payoff to a worker from earning wage w and expending effort e is

u(w, e) = w − g(e)

– g(e): disutility under the effort level e


– g : [0, ∞) → [0, ∞) is twice continuously differentiable and satisfies g′ > 0 (strictly increasing) and
g ′′ > 0 (strictly convex)

ˆ The payoff to the boss is yj1 + yj2 − wH − wL

ˆ The boss is player i1 whose action ai1 is choosing the wages to be paid in the tournament, wH and wL

ˆ There is no player i2

ˆ Worker j1 is player i3 and worker j2 is player i4

ˆ Workers observe the wages chosen in the first stage and then simultaneously choose actions ai3 and ai4 ,
namely effort choices ej1 and ej2

ˆ Since outputs (and so also wages) are functions not only of the players actions but also of the noise term
εj1 and εj2 , we work with the players’ expected payoffs according to the density f

ˆ Suppose that the boss has chosen the wages wH and wL

ˆ Let (e∗j1 , e∗j2 ) be a Nash equilibrium of the remaining game between the workers

ˆ For each j, e∗j must solve


arg max{πj (wH , wL , ej , e∗k ) : ej ≥ 0}

where πj (wH , wL , ej , e∗k ) is the expected profit defined by

πj (wH , wL , ej , e∗k ) = wH Pr{yj (ej ) > yk (e∗k )} + wL Pr{yj (ej ) < yk (e∗k )} − g(ej )
= (wH − wL ) Pr{yj (ej ) > yk (e∗k )} + wL − g(ej )

where yj (ej ) = ej + εj and yk (e∗k ) = e∗k + εk

ˆ Assume e∗j is strictly positive

ˆ The first-order condition of the maximization problem is

(wH − wL ) ∂ Pr{yj (ej ) > yk (e∗k )}/∂ej = g′ (ej )

ˆ The worker j chooses ej such that the marginal disutility of extra effort, g ′ (ej ), equals the marginal gain
from extra effort

Observe that by Bayes’ rule

Prob{yj (ej ) > yk (e∗k )} = Prob{εj > e∗k + εk − ej } = ∫ Prob{εj > e∗k + z − ej | εk = z}f (z)dz

Since εj and εk are independent we have

Prob{εj > e∗k + z − ej | εk = z} = Prob{εj > e∗k + z − ej }

implying that4

Prob{yj (ej ) > yk (e∗k )} = ∫ [1 − F (e∗k − ej + z)]f (z)dz

The first order condition becomes


(wH − wL ) ∫ f (e∗k − ej + z)f (z)dz = g′ (ej )

ˆ If we look for symmetric Nash equilibria

e∗j = e∗k = e∗ (wH , wL )

we get

(wH − wL ) ∫R f (z)2 dz = g′ (e∗ (wH , wL ))

ˆ Since g is convex, g ′ is increasing

– a bigger prize for winning (i.e., a larger value of wH − wL ) induces more effort

Remark 23. Holding the prize constant,

ˆ it is not worthwhile to work hard when output is very noisy,

ˆ because the outcome of the tournament is likely to be determined by luck rather than effort

ˆ If εj is normally distributed with variance σ2 , then

∫ f (z)2 dz = 1/(2σ√π)

which decreases in σ, so e∗ (wH , wL ) decreases in σ
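The closed form ∫ f(z)2 dz = 1/(2σ√π) for normal noise can be checked by numerical integration (a sketch; σ = 2 is an illustrative value):

```python
import math

# Check that the integral of f(z)^2 equals 1/(2*sigma*sqrt(pi)) for the
# N(0, sigma^2) density, a numerical sketch; sigma = 2 is illustrative.
sigma = 2.0

def f(z):
    return math.exp(-z * z / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# composite trapezoidal rule on [-10*sigma, 10*sigma]
n = 20000
lo, hi = -10 * sigma, 10 * sigma
h = (hi - lo) / n
integral = (sum(f(lo + k * h) ** 2 for k in range(1, n))
            + 0.5 * (f(lo) ** 2 + f(hi) ** 2)) * h

closed_form = 1 / (2 * sigma * math.sqrt(math.pi))  # decreasing in sigma
```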

ˆ We work backwards to the first stage of the game


4. F is the cumulative distribution of f

ˆ Suppose that if the workers agree to participate in the tournament (rather than accept alternative em-
ployment)

ˆ Then they will respond to the wages wH and wL by playing the symmetric Nash equilibrium previously
exhibited

ˆ We ignore the possibility of asymmetric equilibria and of an equilibrium with “corner” solutions

ˆ Suppose that the workers’ alternative employment opportunity would provide utility Ua

ˆ In the symmetric NE each worker wins the tournament with probability 1/2

Prob{yj (e∗ (wH , wL )) > yk (e∗ (wH , wL ))} = 1/2

ˆ If the boss intends to induce the workers to participate in the tournament then he must choose wages
(wH , wL ) that satisfy
(1/2)wH + (1/2)wL − g(e∗ (wH , wL )) ≥ Ua (IR)
ˆ The boss chooses wages to maximize expected profit

2e∗ (wH , wL ) + E[ε1 + ε2 ] − (wH + wL ) = 2e∗ (wH , wL ) − (wH + wL )

subject to the restriction (IR)

ˆ Assume there exists a solution (w∗H , w∗L ) to the maximization problem with w∗L > 0

ˆ The participation restriction (IR) must be binding at the optimum, i.e., (w∗H , w∗L ) must be a solution to

wH + wL = 2Ua + 2g(e∗ (wH , wL )) (IRb)

ˆ Expected profit becomes


2[e∗ (w∗H , w∗L ) − Ua − g(e∗ (w∗H , w∗L ))]

ˆ The choice (w∗H , w∗L ) of the boss solves

max{e∗ (wH , wL ) − g(e∗ (wH , wL )) : wH ≥ wL ≥ 0}

under the binding restriction (IRb)

ˆ We denote by f ∗ the function defined by

∀δ ≥ 0, f ∗ (δ) = [g′ ]−1 (δξ) where ξ = ∫R f (z)2 dz

ˆ Observe that, from the FOC


e∗ (wH , wL ) = f ∗ (wH − wL )

ˆ We propose to replace the pair of variable (wH , wL ) by (δ, wL ) where δ = wH − wL

ˆ It follows that the choice (δ∗ , w∗L ) of the boss solves

max{f ∗ (δ) − g(f ∗ (δ)) : (δ, wL ) ≥ 0}

under the restriction

wL = Ua + g(f ∗ (δ)) − δ/2 (IRt)
ˆ Since the choice variable wL does not enter the objective function, the maximization problem is equivalent
to the following one

max{f ∗ (δ) − g(f ∗ (δ)) : δ ≥ 0}

under the restriction

wL = Ua + g(f ∗ (δ)) − δ/2 ≥ 0 (IR’)
ˆ Since we assumed that w∗L > 0

ˆ At the solution δ∗ the restriction (IR’) is not binding

Ua + g(f ∗ (δ∗ )) − δ∗ /2 > 0

ˆ It follows that the choice δ∗ of the boss satisfies the FOC

Ψ′ (δ∗ ) = 0

where the function Ψ is defined by


Ψ(δ) ≡ f ∗ (δ) − g(f ∗ (δ))

ˆ Thus the optimal induced effort e∗ (w∗H , w∗L ) satisfies

g′ (e∗ (w∗H , w∗L )) = 1

ˆ Remember that

(w∗H − w∗L ) ∫ f (z)2 dz = g′ (e∗ (w∗H , w∗L ))

ˆ Therefore the optimal wages satisfy
(w∗H − w∗L ) ∫ f (z)2 dz = 1

ˆ The pair (w∗H , w∗L ) is determined by the participation equation

w∗H + w∗L = 2Ua + 2g([g′ ]−1 (1))
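As a worked sketch of the two optimality conditions, suppose (illustratively, not from the text) that g(e) = e2 /2, the noise is normal with σ = 1, and Ua = 2; then [g′ ]−1 (1) = 1 and the two equations pin down the wages:

```python
import math

# Worked sketch of the boss's optimum under illustrative assumptions (not from
# the text): g(e) = e^2/2, normal noise with sigma = 1, outside option Ua = 2.
sigma, Ua = 1.0, 2.0
xi = 1 / (2 * sigma * math.sqrt(math.pi))  # integral of f(z)^2 for N(0, sigma^2)

effort = 1.0                             # optimal induced effort solves g'(e) = e = 1
prize = 1 / xi                           # wH - wL, from (wH - wL) * xi = g'(e*) = 1
total = 2 * Ua + 2 * (effort ** 2 / 2)   # wH + wL, from the binding (IR)

wH = (total + prize) / 2
wL = (total - prize) / 2  # positive here, consistent with the assumption wL* > 0
```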

2.3 Repeated games


Theory: Two-stage repeated games

Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4

ˆ Consider the Prisoners’ Dilemma described above

ˆ Suppose the two players play this simultaneous-move game twice

ˆ Suppose the outcome of the first play is observed before the second play begins

ˆ Suppose the payoff for the entire game is simply the sum of the payoffs from the two stages (no discounting)

This game, called the two-stage Prisoners’ Dilemma, belongs to the class of games analyzed in the previous
section

ˆ Players i3 and i4 are identical to players i1 and i2

ˆ The action spaces Ai3 and Ai4 are identical to Ai1 and Ai2

ˆ The payoff
ui (ai1 , ai2 , ai3 , ai4 )

is the sum of the payoffs from each stage

ˆ For each possible outcome of the first-stage game, (ai1 , ai2 ), the second-stage game that remains between
players i3 and i4 has a unique NE (âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ In the two-stage Prisoners’ Dilemma the unique equilibrium of the second-stage game is (L1 , L2 ), regardless
of the first-stage outcome

ˆ To compute the subgame perfect outcome of this game

ˆ We analyze the first-stage Prisoners’ Dilemma by taking into account that the outcome of the game
remaining in the second stage will be the NE (L1 , L2 ) with payoff (1, 1)

ˆ Thus the players’ first-stage interaction amounts to the one-shot game below

Player i2
L2 R2
L1 2,2 6,1
Player i1
R1 1,6 5,5

ˆ The above game has a unique NE (L1 , L2 )

ˆ The unique subgame perfect outcome of the two-stage Prisoners’ Dilemma is (L1 , L2 ) in the first-stage,
followed by (L1 , L2 ) in the second-stage

ˆ Cooperation, i.e., (R1 , R2 ), cannot be achieved in either stage of the subgame perfect outcome

ˆ This argument holds more generally

ˆ Let G = {Ai , ui }i∈I denote a static game of complete information

ˆ The payoff of player k is uk ((ai )i∈I ) where ai is chosen from the action set Ai

Definition 15. Given a static game G, let G(T ) denote the finitely repeated game in which G is played
T times:

ˆ the outcomes of all preceding plays are observed before the next play begins

ˆ the payoffs for G(T ) are simply the (discounted) sum of the payoffs from the T stage games

Proposition 9. If the stage game G has a unique NE then, for any finite T , the repeated game G(T ) has
a unique subgame-perfect outcome: the NE of G is played in every stage
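For the two-stage Prisoners’ Dilemma above, this can be checked by brute force (a minimal sketch; actions are encoded as 0 = L, 1 = R):

```python
# Brute-force check of the two-stage Prisoners' Dilemma analysis, a sketch.
# Actions 0 = L, 1 = R; payoffs from the stage game above.
stage = {(0, 0): (1, 1), (0, 1): (5, 0), (1, 0): (0, 5), (1, 1): (4, 4)}

def pure_nash(g):
    """All pure-strategy NE of a 2x2 bimatrix game given as a dict."""
    ne = []
    for (a1, a2), (u1, u2) in g.items():
        if all(u1 >= g[(d, a2)][0] for d in (0, 1)) and \
           all(u2 >= g[(a1, d)][1] for d in (0, 1)):
            ne.append((a1, a2))
    return ne

assert pure_nash(stage) == [(0, 0)]  # unique stage NE: (L1, L2)

# First-stage game when the unique stage NE is anticipated in stage 2:
v1, v2 = stage[(0, 0)]
first = {a: (u1 + v1, u2 + v2) for a, (u1, u2) in stage.items()}
assert pure_nash(first) == [(0, 0)]  # so (L1, L2) is played in both stages
```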

Player i2
L2 M2 R2
L1 1,1 5,0 0,0
Player i1 M1 0,5 4,4 0,0
R1 0,0 0,0 3,3

ˆ Consider the stage-game described above

ˆ There are two pure-strategy NE: (L1 , L2 ) and (R1 , R2 )

ˆ Suppose this stage game is played twice

ˆ We will show that there is a subgame perfect outcome of this repeated game in which the strategy (M1 , M2 )
is played in the first stage

ˆ We assume that in the first-stage, players anticipate that the second-stage outcome will be a NE of the
stage game

ˆ We have for this specific stage game, several Nash equilibria in the second stage

ˆ Players may anticipate that different first-stage outcomes will be followed by different stage-game equilibria
in the second stage

ˆ For example, suppose that players anticipate that (R1 , R2 ) will be the second-stage outcome if the first-
stage outcome is (M1 , M2 )

ˆ Players anticipate that (L1 , L2 ) will be the second-stage outcome if any of the eight other first-stage
outcomes occurs

ˆ The players’ first stage interaction then amounts to the following one-shot game

Player i2
L2 M2 R2
L1 2,2 6,1 1,1
Player i1 M1 1,6 7,7 1,1
R1 1,1 1,1 4,4

ˆ There are three pure-strategy Nash equilibria (L1 , L2 ), (M1 , M2 ) and (R1 , R2 )

ˆ Every NE of this one-shot game corresponds to a subgame perfect outcome of the original repeated game

Player i2
L2 M2 R2
L1 2,2 6,1 1,1
Player i1 M1 1,6 7,7 1,1
R1 1,1 1,1 4,4

ˆ Denote by ((w, x), (y, z)) the outcome of the repeated game

ˆ (w, x) is the first-stage outcome and (y, z) the second-stage outcome

ˆ The NE (L1 , L2 ) in the one-shot game above corresponds to the subgame-perfect outcome ((L1 , L2 ), (L1 , L2 ))
in the repeated game

ˆ The NE (R1 , R2 ) in the one-shot game above corresponds to the subgame-perfect outcome ((R1 , R2 ), (L1 , L2 ))
in the repeated game

ˆ These two subgame-perfect outcomes of the repeated game simply concatenate NE outcomes from the
stage game

Player i2
L2 M2 R2
L1 2,2 6,1 1,1
Player i1 M1 1,6 7,7 1,1
R1 1,1 1,1 4,4

ˆ The third NE of the one shot game, (M1 , M2 ) corresponds to the subgame-perfect outcome ((M1 , M2 ), (R1 , R2 ))
in the repeated game

ˆ This is a qualitatively different result

ˆ Because the anticipated second-stage outcome is (R1 , R2 ) following (M1 , M2 ), cooperation can be achieved
in the first stage of a subgame perfect outcome of the repeated game

This illustrates a more general point:

ˆ if G is a static game of complete information with multiple Nash equilibria then there may be subgame
perfect outcomes of the repeated game G(T ) in which, for any t < T , the outcome in stage t is not a NE
of G

Credible threats or promises about future behavior can influence current behavior

ˆ Subgame perfection may not embody a strong enough definition of credibility

ˆ In deriving the subgame perfect outcome ((M1 , M2 ), (R1 , R2 )) we assumed that the players anticipate that
(R1 , R2 ) will be the second-stage outcome if the first-stage outcome is (M1 , M2 ) and that (L1 , L2 ) will be
the second-stage outcome if any of the eight other first-stage outcomes occurs

ˆ But playing (L1 , L2 ) in the second stage, with its payoff of (1, 1), may seem silly when (R1 , R2 ), with its
payoff of (3, 3), is also available as a NE of the remaining stage game

ˆ It would seem natural for the players to “renegotiate” by introspection

ˆ They might reason that “bygones are bygones” and that the unanimously preferred stage-game equilibrium
(R1 , R2 ) should be played instead

ˆ If (R1 , R2 ) is to be the second-stage outcome after every first-stage outcome, then the incentive to play
(M1 , M2 ) in the first stage is destroyed

ˆ Indeed, in that case, the payoff (3, 3) has been added to each cell of the stage game

ˆ So Li is player i’s best response to Mj

ˆ To suggest a solution to this renegotiation problem, we consider the following modification of the stage
game

Player i2
L2 M2 R2 P2 Q2
L1 1,1 5,0 0,0 0,0 0,0
M1 0,5 4,4 0,0 0,0 0,0
Player i1 R1 0,0 0,0 3,3 0,0 0,0
P1 0,0 0,0 0,0 4,1/2 0,0
Q1 0,0 0,0 0,0 0,0 1/2,4

ˆ There are four pure-strategy NE

– (L1 , L2 ) and (R1 , R2 ), and now also (P1 , P2 ) and (Q1 , Q2 )

ˆ The players unanimously prefer (R1 , R2 ) to (L1 , L2 ), in other words, (R1 , R2 ) Pareto dominates (L1 , L2 )

ˆ There is no NE (x, y) such that the players unanimously prefer (x, y) to (P1 , P2 ), or (Q1 , Q2 ), or (R1 , R2 )

ˆ We say that (P1 , P2 ), (Q1 , Q2 ), and (R1 , R2 ) belong to the Pareto frontier of the payoffs to Nash equilibria
of the stage game

ˆ Suppose that the stage game is played twice, with the first-stage outcome observed before the second
stage begins

ˆ Suppose that players anticipate that the second-stage outcome will be as follows

– (R1 , R2 ) if the first-stage outcome is (M1 , M2 )


– (P1 , P2 ) if the first-stage outcome is (M1 , w) where w is anything but M2
– (Q1 , Q2 ) if the first-stage outcome is (x, M2 ) where x is anything but M1
– (R1 , R2 ) if the first-stage outcome is (y, z) where y is anything but M1 and z is anything but M2

ˆ The players’ first stage interaction then amounts to the following one-shot game

Player i2
L2 M2 R2 P2 Q2
L1 4,4 11/2,4 3,3 3,3 3,3
M1 4,11/2 7,7 4,1/2 4,1/2 4,1/2
Player i1 R1 3,3 1/2,4 6,6 3,3 3,3
P1 3,3 1/2,4 3,3 7,7/2 3,3
Q1 3,3 1/2,4 3,3 3,3 7/2,7

ˆ ((M1 , M2 ), (R1 , R2 )) is a subgame perfect outcome of the repeated game

ˆ More importantly, the difficulty raised in the previous example does not arise here

ˆ In the previous example, the only way to “punish” a player for deviating in the first stage from col-
laboration was to play a Pareto dominated equilibrium in the second stage, thereby also punishing the
punisher

ˆ Here, in contrast, there are three equilibria in the Pareto frontier

– One to reward good behavior by both players in the first stage


– Two others to be used not only to punish a player who deviates in the first stage but also to reward
the punisher
– In the second stage, the punisher cannot be persuaded to renegotiate the punishment
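The construction above can be verified computationally: build the induced first-stage game from the 5x5 stage game and the anticipated second-stage equilibria, and check that (M1 , M2 ) is a Nash equilibrium of it (a sketch):

```python
# Check that (M1, M2) is a NE of the induced first-stage game, a sketch.
# Actions 0..4 stand for L, M, R, P, Q; payoffs are the 5x5 stage game above.
L, M, R, P, Q = range(5)
z = (0.0, 0.0)
stage = [[(1, 1), (5, 0), z, z, z],
         [(0, 5), (4, 4), z, z, z],
         [z, z, (3, 3), z, z],
         [z, z, z, (4, 0.5), z],
         [z, z, z, z, (0.5, 4)]]

def continuation(a1, a2):
    """Anticipated second-stage equilibrium after the first-stage outcome."""
    if (a1, a2) == (M, M):
        return stage[R][R]  # reward cooperation with (R1, R2)
    if a1 == M:
        return stage[P][P]  # punish player 2, reward the punisher
    if a2 == M:
        return stage[Q][Q]  # punish player 1, reward the punisher
    return stage[R][R]

def first_stage(a1, a2):
    u, v = stage[a1][a2], continuation(a1, a2)
    return (u[0] + v[0], u[1] + v[1])

is_ne = (all(first_stage(M, M)[0] >= first_stage(d, M)[0] for d in range(5)) and
         all(first_stage(M, M)[1] >= first_stage(M, d)[1] for d in range(5)))
```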

Theory: Infinitely repeated games

ˆ A static game G is repeated infinitely, with the outcomes of all previous stages observed before the current
stage begins

ˆ We will define

– a player’s strategy
– a subgame
– a subgame perfect Nash equilibrium (SPNE)

ˆ The main theme is that credible threats or promises about future behavior can influence current behavior

– We will illustrate that even if the stage game G has a unique NE, there may be subgame perfect
outcomes of the infinitely repeated game in which no stage’s outcome is the NE of G

ˆ Suppose the following Prisoners’ Dilemma is to be repeated infinitely

Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4

ˆ The discount factor δ = 1/(1 + r) is the value today of a dollar to be received one stage later, where r is
the interest rate per stage

ˆ Given the discount factor δ and a player’s payoffs from an infinite sequence of stage games, we can compute
the present value of the payoffs

Definition 16. Given the discount factor δ, the present value of the infinite sequence of payoffs (πt )t≥1
is

π1 + δπ2 + δ2 π3 + · · · = ∑t≥1 δt−1 πt

ˆ We can also use δ to interpret a game that ends after a random number of repetitions

ˆ Suppose that after each stage is played a (weighted) coin is flipped to determine whether the game will
end

ˆ If the probability is p that the game ends immediately, then a payoff π to be received in the next stage
(if it is played) is worth only

[(1 − p)/(1 + r)]π
before this stage’s coin flip occurs

ˆ Likewise, a payoff π to be received two stages from now is worth only

[(1 − p)2 /(1 + r)2 ]π

ˆ Let δ = (1 − p)/(1 + r) then the present value

π1 + δπ2 + δ2 π3 + . . .

reflects both the time-value of money and the possibility that the game will end

Player i2
L2 R2
L1 1,1 5,0
Player i1
R1 0,5 4,4

ˆ Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ

ˆ We will show that cooperation (i.e., (R1 , R2 )) can occur in every stage of a subgame-perfect outcome of
the infinitely repeated game

– Even though the only NE in the stage game is noncooperation (i.e., (L1 , L2 ))

ˆ The argument is as follows:

– if the players cooperate today then they play a high-payoff equilibrium tomorrow
– otherwise they play a low-payoff equilibrium tomorrow

ˆ We do not need to add artificially the high-payoff equilibrium that might be played tomorrow

ˆ It is the strategy “continuing to cooperate tomorrow and thereafter”

ˆ Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage

ˆ Player i’s strategy is

– Play Ri in the first stage


– In stage t,
* if the outcome of all t − 1 preceding stages has been (R1 , R2 ) then play again Ri
* otherwise, play Li

ˆ This strategy is an example of a trigger strategy

ˆ Player i cooperates until someone fails to cooperate, which triggers a switch to noncooperation forever
after

ˆ If both players adopt this trigger strategy then the outcome of the infinitely repeated game will be (R1 , R2 )
in every stage

ˆ We will first show that if δ is close enough to 1 then it is a NE of the infinitely repeated game for both
players to adopt this strategy

ˆ We will also show that such a NE is subgame perfect

We propose to provide rigorous definitions of the following concepts for both finitely and infinitely repeated
games
1. a strategy in a repeated game

2. a subgame in a repeated game

3. a subgame-perfect Nash equilibrium (SPNE)

Definition 17. Given a stage game G = {Ai , ui }i∈I , let G(T, δ) denote the finitely repeated game in
which

ˆ G is played T times and the players share the discount factor δ

ˆ for each t , the outcomes of the t − 1 preceding plays are observed before the stage t begins

ˆ each player’s payoff in G(T, δ) is the present value of the player’s payoffs from the sequence of stage
games

Definition 18. Given a stage game G = {Ai , ui }i∈I , let G(∞, δ) denote the infinitely repeated game
in which

ˆ G is repeated forever and the players share the discount factor δ

ˆ for each t, the outcomes of the t − 1 preceding plays are observed before the stage t begins

81
ˆ each player’s payoff in G(∞, δ) is the present value of the player’s payoffs from the infinite sequence
of stage games

Definition 19. In the finitely repeated game G(T, δ) or the infinitely repeated game G(∞, δ), a player’s
strategy specifies the action the player will take in each stage, for each possible history of plays through
the previous stages

A history of plays up to stage t + 1 is a family

st = (s1 , s2 , . . . , st )

where

∀1 ≤ τ ≤ t, sτ = (ai,τ )i∈I ∈ S ≡ ∏_{i∈I} Ai

ˆ A strategy for player i in G(T, δ) is a function

fi : S(T ) → Ai

– The set of strategies for agent i in G(T, δ) is denoted by Fi (T )

– S(T ) = {∅} ∪ ∪_{t=1}^{T−1} S^t where S^t = S × · · · × S (t times)
ˆ A strategy for player i in G(∞, δ) is a function

fi : S(∞) → Ai

– The set of strategies for agent i in G(∞, δ) is denoted by Fi (∞)

– S(∞) = {∅} ∪ ∪_{t≥1} S^t where S^t = S × · · · × S (t times)
Interpretation 13.

– fi (∅) is the action of player i at stage 1


– fi (s1 , s2 , . . . , st ) is the action of player i at stage t + 1 , if the history of past plays is (s1 , s2 , . . . , st )

82
Consider a finitely repeated game G(T, δ)

Definition 20. A strategy profile f ∗ = (fi∗ )i∈I is a NE of the repeated game G(T, δ) if for each player i,
fi∗ is a best response to f−i∗ , i.e.,

fi∗ ∈ arg max{πi (fi , f−i∗ ) : fi ∈ Fi (T )}

Consider an infinitely repeated game G(∞, δ)

Definition 21. A strategy profile f ∗ = (fi∗ )i∈I is a NE of the infinitely repeated game G(∞, δ) if for each
player i, fi∗ is a best response to f−i∗ , i.e.,

fi∗ ∈ arg max{πi (fi , f−i∗ ) : fi ∈ Fi (∞)}

                      Player i2
                   L2         R2
Player i1   L1    1,1        5,0
            R1    0,5        4,4

ˆ Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ

ˆ Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage

ˆ Player i’s strategy is

– Play Ri in the first stage


– In stage t,
* if the outcome of all t − 1 preceding stages has been (R1 , R2 ) then play again Ri
* otherwise, play Li

ˆ In other words, we denote by fi∗ the trigger strategy defined by

fi∗ (∅) = Ri

and for any history st = (s1 , s2 , . . . , st ) up to stage t, the strategy at stage t + 1 is

fi∗ (st ) = Ri  if ∀τ ∈ {1, . . . , t}, sτ = (R1 , R2 )
fi∗ (st ) = Li  if ∃τ ∈ {1, . . . , t}, sτ ≠ (R1 , R2 )
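A minimal sketch of this trigger strategy as a function of the history, with actions encoded as the strings "R" and "L" (an encoding chosen here, not in the text):

```python
def trigger_strategy(history):
    """Trigger strategy f_i*: play R until any past outcome differs from (R, R),
    then play L forever.  A history is a tuple of stage outcomes (a_1, a_2)."""
    if all(outcome == ("R", "R") for outcome in history):
        return "R"  # includes the empty history: f_i*(emptyset) = R_i
    return "L"

print(trigger_strategy(()))                        # R  (first stage)
print(trigger_strategy((("R", "R"), ("R", "R"))))  # R  (cooperation so far)
print(trigger_strategy((("R", "R"), ("L", "R"))))  # L  (someone defected)
```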

83
Proposition 10. If δ ≥ 1/4 then the profile f ∗ defined above is a NE of the infinitely repeated Prisoners’
Dilemma

                      Player i2
                   L2         R2
Player i1   L1    1,1        5,0
            R1    0,5        4,4

Proof. ˆ Fix a player i; we shall prove that

fi∗ ∈ arg max{πi (fi , fj∗ ) : fi ∈ Fi (∞)}

ˆ We first compute πi (fi∗ , fj∗ )

ˆ Observe that the outcome Ot (f ∗ ) at stage t is (Ri , Rj ), implying that

πi (fi∗ , fj∗ ) = 4 + 4δ + 4δ² + · · · = 4/(1 − δ)

ˆ Now fix another strategy fi ≠ fi∗ and assume that fi (∅) ≠ fi∗ (∅), i.e., fi (∅) = Li

ˆ It follows that the outcome at the first stage is O1 (fi , fj∗ ) = (Li , Rj )

ˆ The outcome at the second stage is then O2 (fi , fj∗ ) = (ai,2 , Lj ) for some action ai,2 ∈ {Ri , Li }

ˆ Actually, for every stage t > 1 we have

Ot (fi , fj∗ ) = (ai,t , Lj )

for some action ai,t ∈ {Ri , Li }

ˆ This implies
πi (fi , fj∗ ) ≤ 5 + δ + δ² + · · · = 5 + δ/(1 − δ)

ˆ and therefore

πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −1 + 3δ/(1 − δ) = (4δ − 1)/(1 − δ)

ˆ We have thus proved that πi (fi∗ , fj∗ ) ≥ πi (fi , fj∗ )


Now fix a strategy fi ≠ fi∗ satisfying fi (∅) = fi∗ (∅), i.e., fi (∅) = Ri

ˆ Observe that the value of fi (s1 ) for an outcome s1 different from s^co_1 ≡ (R1 , R2 ) is irrelevant for the
payoff ui (O2 (fi , fj∗ ))

ˆ Assume that fi (s^co_1 ) ≠ fi∗ (s^co_1 ), i.e., fi (s^co_1 ) = Li

ˆ It follows that the outcome at the second stage is O2 (fi , fj∗ ) ≡ (fi (s^co_1 ), fj (s^co_1 )) = (Li , Rj )

84
ˆ The outcome at the third stage is then O3 (fi , fj∗ ) = (ai,3 , Lj ) for some action ai,3 ∈ {Ri , Li }

ˆ Actually, for every stage t > 2 we have

Ot (fi , fj∗ ) = (ai,t , Lj )

for some action ai,t ∈ {Ri , Li }

ˆ This implies
πi (fi , fj∗ ) ≤ 4 + 5δ + δ² + δ³ + · · · = 4 + 5δ + δ²/(1 − δ)

ˆ and therefore

πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −δ + 3δ²/(1 − δ)
                              = [δ/(1 − δ)][3δ − (1 − δ)] = δ(4δ − 1)/(1 − δ)

ˆ We have thus proved that πi (fi∗ , fj∗ ) ≥ πi (fi , fj∗ )

Now fix a strategy fi ≠ fi∗ and a stage t > 2

ˆ Assume that fi (sτ ) = fi∗ (sτ ) for every history sτ with τ ≤ t − 1

ˆ This implies that every outcome Oτ (fi , fj∗ ) coincides with s^co_τ ≡ (R1 , R2 )

ˆ We denote by s^co,τ the history (s^co_1 , . . . , s^co_τ )

ˆ Observe that the value of fi (st−1 ) for an outcome history st−1 different from s^co,t−1 is irrelevant for the
payoff ui (Ot (fi , fj∗ ))

Assume that fi (s^co,t−1 ) ≠ fi∗ (s^co,t−1 ), i.e., fi (s^co,t−1 ) = Li

It follows that the outcome at stage t is

Ot (fi , fj∗ ) ≡ (fi (s^co,t−1 ), fj (s^co,t−1 )) = (Li , Rj )

The outcome at stage t + 1 is then


Ot+1 (fi , fj∗ ) = (ai,t+1 , Lj )

for some action ai,t+1 ∈ {Ri , Li }

ˆ Actually, for every stage T > t we have

OT (fi , fj∗ ) = (ai,T , Lj )

for some action ai,T ∈ {Ri , Li }

ˆ This implies
πi (fi , fj∗ ) ≤ 4 + 4δ + · · · + 4δ^{t−2} + 5δ^{t−1} + δ^t + δ^{t+1} + . . .

85
ˆ and therefore

πi (fi∗ , fj∗ ) − πi (fi , fj∗ ) ≥ −δ^{t−1} + 3δ^t /(1 − δ)
                              = δ^{t−1} [3δ − (1 − δ)]/(1 − δ) = δ^{t−1} (4δ − 1)/(1 − δ)

ˆ We have thus proved that πi (fi∗ , fj∗ ) ≥ πi (fi , fj∗ )
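The three cases of the proof all reduce to the sign of 4δ − 1. A numerical sketch (truncating the infinite sums at a long horizon) comparing cooperation against the deviation bound used in the first case:

```python
def pv(stage_payoff, delta, horizon=5000):
    """Truncated present value of a per-stage payoff function (stage index from 0)."""
    return sum(delta ** t * stage_payoff(t) for t in range(horizon))

cooperate = lambda t: 4                 # (R1, R2) in every stage
deviate = lambda t: 5 if t == 0 else 1  # upper bound on payoffs after defecting at stage 1

for delta in (0.2, 0.25, 0.3):
    gain = pv(cooperate, delta) - pv(deviate, delta)
    print(delta, round(gain, 6))  # negative below 1/4, zero at 1/4, positive above
```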

86
Theory: Subgames

Definition 22.

ˆ Given a finitely repeated game G(T, δ)

ˆ Given a history st with t < T

The repeated game in which G is played T − t times after st is denoted by G(T − t, δ, st ) and is called the
subgame beginning at stage t + 1 following history st

Definition 23.

ˆ Given an infinitely repeated game G(∞, δ)

ˆ Given a history st with t ≥ 1

The repeated game in which G is played infinitely many times after st is denoted by G(∞, δ, st ) and is called
the subgame beginning at stage t + 1 following history st

Definition 24.

ˆ Given a strategy fi of a finitely repeated game G(T, δ)

ˆ Given a history st with t < T

We denote by fi (·|st ) the strategy of the subgame G(T − t, δ, st ) defined by

fi (σ τ |st ) = fi (st , σ τ )

for every history σ τ of G(T − t, δ, st ) with τ < T − t.

Definition 25.

ˆ Given a strategy fi of an infinitely repeated game G(∞, δ)

ˆ Given a history st with t ≥ 1

We denote by fi (·|st ) the strategy of the subgame G(∞, δ, st ) defined by

fi (σ τ |st ) = fi (st , σ τ )

87
for every history σ τ of G(∞, δ, st )

Definition 26. A subgame perfect equilibrium of a finitely repeated game G(T, δ) is a strategy profile f ∗ =
(fi∗ )i∈I which constitutes a NE of every subgame, i.e.,

ˆ f ∗ is a NE of G(T, δ)

ˆ for every stage t < T , for every history st ,

the strategy profile f ∗ (·|st ) ≡ (fi∗ (·|st ))i∈I is a NE of the subgame G(T − t, δ, st )

Remark 24.

ˆ Many possible histories st are off the equilibrium path, i.e., they differ from the outcome history

(O1 (f ∗ ), . . . , Ot (f ∗ ))

ˆ Requiring a NE after such histories captures the idea of credible threats or promises

Definition 27. A subgame perfect equilibrium of an infinitely repeated game G(∞, δ) is a strategy profile
f ∗ = (fi∗ )i∈I which constitutes a NE of every subgame, i.e.,

ˆ f ∗ is a NE of G(∞, δ)

ˆ for every stage t ≥ 1, for every history st ,

the strategy profile f ∗ (·|st ) ≡ (fi∗ (·|st ))i∈I is a NE of the subgame G(∞, δ, st )

ˆ Subgame perfect Nash equilibrium is a refinement of NE

– To be subgame perfect, the players’ strategies must first be a NE and must then pass an additional
test

ˆ The notion of subgame perfect equilibrium eliminates Nash equilibria in which the players’ threats or
promises are not credible

                      Player i2
                   L2         R2
Player i1   L1    1,1        5,0
            R1    0,5        4,4

88
ˆ Consider the infinitely repeated Prisoners’ Dilemma in which each player’s discount factor is δ

ˆ Suppose player i begins the infinitely repeated game by cooperating and then cooperates in each subse-
quent stage game if and only if both players have cooperated in every previous stage

ˆ Player i’s strategy is

– Play Ri in the first stage


– In stage t,
* if the outcome of all t − 1 preceding stages has been (R1 , R2 ) then play again Ri
* otherwise, play Li

ˆ In other words, we denote by fi∗ the trigger strategy defined by

fi∗ (∅) = Ri

and for any history st = (s1 , s2 , . . . , st ) up to stage t, the strategy at stage t + 1 is


fi∗ (st ) = Ri  if ∀τ ∈ {1, . . . , t}, sτ = (R1 , R2 )
fi∗ (st ) = Li  if ∃τ ∈ {1, . . . , t}, sτ ≠ (R1 , R2 )

Proposition 11. If δ ≥ 1/4 then the profile f ∗ defined above is a SPNE of the infinitely repeated Prisoners’
Dilemma

Proof. We already proved that f ∗ is a NE of the game G(∞, δ)

ˆ We only have to prove that for every T ≥ 1 and every possible history sT , the profile f ∗ (·|sT ) is a NE of
the subgame G(∞, δ, sT )

If no agent deviated up to period T , i.e.,

sT = (s^co_1 , s^co_2 , . . . , s^co_T )

ˆ Then for every history σ t of the subgame G(∞, δ, sT ), we have

∀i ∈ I, fi∗ (σ t |sT ) = fi∗ (σ t )

ˆ Therefore we can reproduce the arguments of the proof that f ∗ is a NE of G(∞, δ) to show that f ∗ (·|sT )
is a NE of G(∞, δ, sT )

Assume now that at least one agent deviated, i.e.,

sT ≠ (s^co_1 , s^co_2 , . . . , s^co_T )

ˆ The strategy fj∗ (·|sT ) is given by fj∗ (σ t |sT ) = Lj for any history σ t of the subgame G(∞, δ, sT )

89
ˆ The payoff πi (f ∗ (·|sT )) for agent i of the strategy fi∗ (·|sT ) is

ui (O1 (fi∗ (·|sT ), fj∗ (·|sT ))) + δui (O2 (fi∗ (·|sT ), fj∗ (·|sT ))) + . . .

ˆ Since Ot (fi∗ (·|sT ), fj∗ (·|sT )) = (Li , Lj ) for every t

ˆ We get that
πi (f ∗ (·|sT )) = 1 + δ + δ² + · · · = 1/(1 − δ)

Fix another strategy gi of player i in the subgame G(∞, δ, sT )

ˆ The payoff πi (gi , fj∗ (·|sT )) for agent i of the strategy gi is

ui (O1 (gi , fj∗ (·|sT ))) + δui (O2 (gi , fj∗ (·|sT ))) + . . .

ˆ Since for every t there exists an action ai,t ∈ {Li , Ri } such that Ot (gi , fj∗ (·|sT )) = (ai,t , Lj )

ˆ We get that
πi (gi , fj∗ (·|sT )) ≤ 1 + δ + δ² + · · · = πi (f ∗ (·|sT ))

Folk Theorem: Friedman (1971)


Consider an abstract stage game G = (Ai , ui )i∈I and the associated infinitely repeated game G(∞, δ)

Definition 28. A profile of payoffs x = (xi )i∈I is called feasible in the stage game G if it is a convex
combination of the pure strategy payoffs of G, i.e., if there exists a family (π k )k∈K of pure strategy payoffs
π k = (πik )i∈I such that

x = Σ_{k∈K} αk π k

where αk ≥ 0 and Σ_{k∈K} αk = 1.
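For the Prisoners' Dilemma above, the pure-strategy payoff profiles are (1,1), (5,0), (0,5) and (4,4). A brute-force sketch that checks (approximate) feasibility by searching over a grid of convex weights:

```python
import itertools

PURE = [(1, 1), (5, 0), (0, 5), (4, 4)]  # pure-strategy payoff profiles of the PD

def is_feasible(x, n=50, tol=0.05):
    """Approximate feasibility check: grid search over convex weights on PURE."""
    for a, b, c in itertools.product(range(n + 1), repeat=3):
        d = n - a - b - c
        if d < 0:
            continue
        w = (a / n, b / n, c / n, d / n)  # nonnegative weights summing to 1
        p1 = sum(wk * pk[0] for wk, pk in zip(w, PURE))
        p2 = sum(wk * pk[1] for wk, pk in zip(w, PURE))
        if abs(p1 - x[0]) <= tol and abs(p2 - x[1]) <= tol:
            return True
    return False

print(is_feasible((3, 3)))  # True: (1/3)(1,1) + (2/3)(4,4) = (3,3)
print(is_feasible((5, 5)))  # False: outside the convex hull
```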

Feasible payoffs: Prisoners’ dilemma

90
Average payoff

ˆ Consider an infinite sequence π = (π1 , π2 , . . . ) of payoffs for every stage

ˆ The present value V (π) is defined by

V (π) = π1 + δπ2 + δ2 π3 + . . .

ˆ If a payoff π̄ were received in every stage, the present value would be π̄/(1 − δ)

ˆ The average payoff of an infinite sequence π = (π1 , π2 , . . . ) of payoffs is the payoff π̄ that should be received
in every stage in order to achieve the same present value

π̄/(1 − δ) = V (π)

Definition 29. Given a discount factor δ, the average payoff of an infinite sequence of payoffs π =
(π1 , π2 , . . . ) is

(1 − δ) Σ_{t≥1} δ^{t−1} πt
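A quick sketch of this rescaling (illustrative streams, truncated at a long horizon): a constant stream averages to itself, which is what makes the average directly comparable to stage payoffs.

```python
def average_payoff(payoffs, delta):
    """(1 - delta) times the (truncated) present value of the stream."""
    pv = sum(delta ** (t - 1) * pi for t, pi in enumerate(payoffs, start=1))
    return (1 - delta) * pv

delta, horizon = 0.9, 2000
print(round(average_payoff([4.0] * horizon, delta), 6))          # 4.0: constant stream
print(round(average_payoff([5.0] + [1.0] * horizon, delta), 6))  # 5 - 4*delta = 1.4
```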

ˆ Since the average payoff is just a rescaling of the present value, maximizing the average payoff is equivalent
to maximizing the present value

ˆ The advantage of the average payoff over the present value is that the former is directly comparable to
the payoffs from the stage game

91
Theorem 14 (Folk Theorem: Friedman (1971)). Let

ˆ G = (Ai , ui )i∈I be a static game of complete information

ˆ e = (ei )i∈I denote the profile of payoffs from a NE of G

ˆ x = (xi )i∈I denote any other feasible profile of payoffs from G

If

ˆ xi > ei for every player i and

ˆ δ is sufficiently close to one,

then there exists a subgame perfect NE of the infinitely repeated game G(∞, δ) that achieves x = (xi )i∈I as
the profile of average payoffs

Folk Theorem: Prisoners’ dilemma

92
Collusion between Cournot duopolists

Cournot duopoly

ˆ Recall the static Cournot game

ˆ If the aggregate quantity on the market is Q = q1 + q2

ˆ Then the market clearing price is P (Q) = [a − Q]+

ˆ Each firm has a marginal cost of c > 0 and no fixed costs

ˆ Firms choose quantities simultaneously

ˆ In the unique NE, each firm produces the quantity (a − c)/3

ˆ This quantity is called the Cournot quantity and is denoted by qC

ˆ The equilibrium aggregate quantity, 2(a − c)/3, exceeds the monopoly quantity, qm ≡ (a − c)/2

ˆ Both firms would be better off if each produced half the monopoly quantity, qi = qm /2

Infinitely repeated Cournot duopoly

ˆ Consider the infinitely repeated game based on this Cournot stage game when both firms have the discount
factor δ

ˆ Consider the following trigger strategy for each firm

– Produce half the monopoly quantity, qm /2, in the first period


– In stage t,
* produce qm /2 if both firms have produced qm /2 in each of the t − 1 previous stages;
* otherwise, produce the Cournot quantity, qC

ˆ We propose to compute the values of δ for which it is a subgame perfect Nash equilibrium to play the
previous trigger strategy

ˆ The profit to one firm when both produce qm /2 is (a − c)²/8, which will be denoted by πm /2

ˆ The profit to one firm when both produce qC is (a − c)²/9, which will be denoted by πC

ˆ If firm i is going to produce qm /2 this period then the quantity that maximizes firm j’s profit this period
solves
arg max{[a − c − qj − (qm /2)]qj : qj ≥ 0}

ˆ The solution is qj = 3(a − c)/8, with associated profit of πd ≡ 9(a − c)²/64

ˆ It is a NE for both firms to play the trigger strategy when

[1/(1 − δ)] × (πm /2) ≥ πd + [δ/(1 − δ)] πC

93
ˆ This yields δ ≥ 9/17
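With a − c normalized to 1 (so πm/2 = 1/8, πC = 1/9, πd = 9/64), this incentive constraint can be checked numerically:

```python
# Normalize a - c = 1: half the monopoly profit, Cournot profit, deviation profit
half_pi_m, pi_C, pi_d = 1 / 8, 1 / 9, 9 / 64

def collusion_gain(delta):
    """LHS minus RHS of the no-deviation condition for the trigger strategy."""
    return half_pi_m / (1 - delta) - (pi_d + delta / (1 - delta) * pi_C)

print(abs(collusion_gain(9 / 17)) < 1e-12)  # True: the condition binds at delta = 9/17
print(collusion_gain(0.6) > 0)              # True: collusion sustainable
print(collusion_gain(0.5) > 0)              # False: not sustainable
```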

ˆ For the same reasons as in the infinitely repeated Prisoners’ Dilemma, this NE is subgame perfect

ˆ We propose to study what the firms can achieve if δ < 9/17

ˆ We first determine, for a given value of δ, the most profitable quantity the firms can produce if they both
play trigger strategies that switch forever to the Cournot quantity after any deviation

ˆ We know that such trigger strategies cannot support a quantity as low as qm /2

ˆ But for any value of δ it is a SPNE simply to repeat the Cournot quantity forever

ˆ This implies that the most profitable quantity that trigger strategies can support is between qm /2 and
qC

ˆ Consider the following trigger strategy

– Produce q ∗ in the first period


– In stage t,
* produce q ∗ if both firms have produced q ∗ in each of t − 1 previous periods;
* otherwise, produce the Cournot quantity, qC

ˆ The profit of one firm if both play q ∗ is π ∗ ≡ (a − c − 2q ∗ )q ∗

ˆ If firm i is going to produce q ∗ this period, then the quantity that maximizes firm j’s profit this period
solves
arg max{(a − c − qj − q ∗ )qj : qj ≥ 0}

ˆ The solution is qj = (a − c − q ∗ )/2 with associated profit

πd ≡ (a − c − q ∗ )²/4

ˆ It is a NE for both firms to play the trigger strategy given before provided that

[1/(1 − δ)] π ∗ ≥ πd + [δ/(1 − δ)] πC

ˆ Solving the resulting quadratic in q ∗ shows that the lowest value of q ∗ for which the trigger strategies are
a SPNE is
q ∗ (δ) = [(9 − 5δ)/(3(9 − δ))] (a − c)

ˆ The function δ ↦ q ∗ (δ) is decreasing and satisfies

lim_{δ→9/17} q ∗ (δ) = qm /2   and   lim_{δ→0} q ∗ (δ) = qC
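A quick numerical check of q∗(δ) and its limits, again normalizing a − c = 1 (so qm = 1/2 and qC = 1/3):

```python
q_m = 0.5     # monopoly quantity with a - c normalized to 1
q_C = 1 / 3   # Cournot quantity

def q_star(delta):
    """Lowest quantity sustainable by Cournot-reversion trigger strategies."""
    return (9 - 5 * delta) / (3 * (9 - delta))

print(abs(q_star(9 / 17) - q_m / 2) < 1e-12)  # True: reaches half the monopoly quantity
print(abs(q_star(0) - q_C) < 1e-12)           # True: no patience gives the Cournot quantity
print(q_star(0.3) < q_star(0.2))              # True: decreasing in delta
```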

94
ˆ We now explore the second approach, which involves threatening to administer the strongest credible
punishment

ˆ We propose to show that Abreu’s approach can achieve the monopoly outcome in our model when δ = 1/2
(which is less than 9/17)

ˆ Consider the following “two phase” strategy

– Produce half the monopoly quantity, qm /2, in the first period


– In stage t,
* produce qm /2 if both firms produced qm /2 in period t − 1
* produce qm /2 if both firms produced x in period t − 1
* Otherwise produce x

ˆ This strategy involves a one-period punishment phase

ˆ And a (potentially infinite) collusive phase in which the firm produces qm /2

ˆ The profit to one firm if both produce x is π(x) ≡ (a − c − 2x)x

ˆ Let V (x) denote the present value of receiving π(x) this period and half the monopoly profit forever after:

V (x) = π(x) + [δ/(1 − δ)] × (πm /2)

ˆ If firm i is going to produce x this period, then the quantity that maximizes firm j’s profit this period
solves
arg max{(a − c − qj − x)qj : qj ≥ 0}

ˆ The solution is qj = (a − c − x)/2, with associated profit

πdp (x) ≡ (a − c − x)²/4

Theorem 15. The “two-phase” strategy is a SPNE if and only if

[1/(1 − δ)] × (πm /2) ≥ πd + δV (x)     (1)

and

V (x) ≥ πdp (x) + δV (x)     (2)

ˆ For δ = 1/2, condition (1) is satisfied provided

x/(a − c) ∉ (1/8, 3/8)

95
ˆ For δ = 1/2, condition (2) is satisfied provided

x/(a − c) ∈ (3/10, 1/2)

ˆ For δ = 1/2, the two-phase strategy achieves the monopoly outcome as a SPNE provided that

x/(a − c) ∈ (3/8, 1/2)
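The two conditions at δ = 1/2 can be verified numerically over x/(a − c), with a − c normalized to 1 (the function names below are chosen here for illustration):

```python
# a - c normalized to 1; delta = 1/2
pi_m, pi_d, delta = 1 / 4, 9 / 64, 0.5

def pi_both(x):
    return (1 - 2 * x) * x   # per-firm profit when both produce x

def pi_dp(x):
    return (1 - x) ** 2 / 4  # profit from the best deviation against x

def V(x):
    return pi_both(x) + delta / (1 - delta) * pi_m / 2

def two_phase_spne(x):
    cond1 = pi_m / 2 / (1 - delta) >= pi_d + delta * V(x)  # no deviation while colluding
    cond2 = V(x) >= pi_dp(x) + delta * V(x)                # no deviation while punishing
    return cond1 and cond2

print(two_phase_spne(0.40))  # True: 3/8 < 0.40 < 1/2
print(two_phase_spne(0.25))  # False: inside (1/8, 3/8), the collusive-phase condition fails
```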

2.4 Dynamic games of complete but imperfect information


Extensive-form representation of games

Extensive-form representation

ˆ It may seem that static games must be represented in normal form and dynamic games in extensive
form

– This is not the case

ˆ Any game can be represented in either normal or extensive form

ˆ Although for some games one of the two forms is more convenient to analyze

ˆ We will discuss how static games can be represented using extensive form and how dynamic games can
be represented using normal form

Normal-form representation
The normal-form representation of a game specifies

1. the players in the game

2. the strategies available to each player

3. the payoff received by each player for each combination of strategies that could be chosen by the players

Definition 30. The extensive-form representation of a game specifies

1. the players in the game

2. (a) when each player has the move


(b) what each player can do at each of his or her opportunities to move
(c) what each player knows at each of his or her opportunities to move

96
3. the payoff received by each player for each combination of moves that could be chosen by the players

ˆ We already analyzed several games represented in extensive form

ˆ We propose to describe such games using game trees rather than words

Example: consider the following class of two-stage games of complete and perfect information

1. Player 1 chooses an action a1 from the feasible set A1 = {L, R}

2. Player 2 observes a1 and then chooses an action a2 from the set A2 = {L, R}

3. Payoffs are u1 (a1 , a2 ) and u2 (a1 , a2 ), as shown in the following game tree

ˆ We can extend in a straightforward manner the previous game tree to represent any dynamic game of
complete and perfect information:

– players move in sequence


– all previous moves are common knowledge before the next move is chosen
– the players’ payoffs from each feasible combination of moves are common knowledge

ˆ We propose to derive the normal-form representation of the previous dynamic game

ˆ To represent a dynamic game in normal form, we need to translate the information in the extensive form
into the description of each player’s strategy

97
Normal-form representation of dynamic games

Definition 31. A strategy for a player is a complete plan of action: it specifies a feasible action for the
player in every contingency in which the player might be called on to act

ˆ We could not apply the notion of Nash equilibrium to dynamic games of complete information if we
allowed a player’s strategy to leave the actions in some contingencies unspecified

ˆ For player j to compute a best response to player i’s strategy, j may need to consider how i would act in
every contingency, not just in the contingencies i thinks likely to arise

ˆ In the previous game, player 2 has two actions but four strategies

ˆ This is because there are two contingencies

Strategies of player 2:

Strategy 1 If player 1 plays L then play L′ , if player 1 plays R then play L′

f2 (a1 ) = L′ if a1 = L, and f2 (a1 ) = L′ if a1 = R

This strategy may be denoted by (L′ , L′ )

Strategy 2 If player 1 plays L then play L′ , if player 1 plays R then play R′

f2 (a1 ) = L′ if a1 = L, and f2 (a1 ) = R′ if a1 = R

This strategy may be denoted by (L′ , R′ )

Strategy 3 If player 1 plays L then play R′ , if player 1 plays R then play L′

f2 (a1 ) = R′ if a1 = L, and f2 (a1 ) = L′ if a1 = R

This strategy may be denoted by (R′ , L′ )

Strategy 4 If player 1 plays L then play R′ , if player 1 plays R then play R′

f2 (a1 ) = R′ if a1 = L, and f2 (a1 ) = R′ if a1 = R

This strategy may be denoted by (R′ , R′ )

ˆ Player 1 has two actions but only two strategies: play L or R

ˆ The reason is that player 1 has only one contingency in which he might be called upon to act

98
ˆ Player 1’s strategy space is equivalent to the action space A1 = {L, R}
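The strategy count can be checked mechanically: a strategy for player 2 is a function from A1 to A2, so there are |A2|^|A1| of them. A small sketch:

```python
import itertools

A1 = ["L", "R"]     # player 1's actions (and strategies)
A2 = ["L'", "R'"]   # player 2's actions

# A strategy for player 2 assigns an action to each contingency (each a1 in A1),
# so there are |A2| ** |A1| = 4 of them
strategies_2 = list(itertools.product(A2, repeat=len(A1)))
print(len(A1), len(strategies_2))  # 2 4
print(strategies_2)
```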

Recall the extensive-form representation

We can now derive the normal-form representation of the game from its extensive-form representation

                            Player 2
                (L′ , L′ )   (L′ , R′ )   (R′ , L′ )   (R′ , R′ )
Player 1   L     3, 1         3, 1         1, 2         1, 2
           R     2, 1         0, 0         2, 1         0, 0
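From this normal form, pure-strategy Nash equilibria can be found by brute force; a sketch (strategy labels such as "L'R'" are an encoding chosen here):

```python
payoffs = {   # (s1, s2) -> (u1, u2), from the table above
    ("L", "L'L'"): (3, 1), ("L", "L'R'"): (3, 1), ("L", "R'L'"): (1, 2), ("L", "R'R'"): (1, 2),
    ("R", "L'L'"): (2, 1), ("R", "L'R'"): (0, 0), ("R", "R'L'"): (2, 1), ("R", "R'R'"): (0, 0),
}
S1 = ["L", "R"]
S2 = ["L'L'", "L'R'", "R'L'", "R'R'"]

def is_nash(s1, s2):
    """True when neither player can gain by a unilateral deviation."""
    u1, u2 = payoffs[(s1, s2)]
    best1 = all(u1 >= payoffs[(t1, s2)][0] for t1 in S1)
    best2 = all(u2 >= payoffs[(s1, t2)][1] for t2 in S2)
    return best1 and best2

# The two pure NE: (L, (R', R')) and (R, (R', L'))
print([(s1, s2) for s1 in S1 for s2 in S2 if is_nash(s1, s2)])
```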
Extensive-form of static games

ˆ We turn to showing how a static (i.e., simultaneous-move) game can be represented in extensive form

ˆ In a static game players do not need to act simultaneously

ˆ It suffices that each choose a strategy without knowledge of the other’s choice

ˆ We can represent a simultaneous game between players 1 and 2 as follows

1. Player 1 chooses an action a1 from the feasible set A1


2. Player 2 does not observe player 1’s move but chooses an action a2 from the feasible set A2
3. Payoffs are u1 (a1 , a2 ) and u2 (a1 , a2 )

ˆ Alternatively, player 2 could move first and player 1 could then move without observing 2’s action

To represent that some player ignores the previous moves, we introduce the notion of a player’s information
set

Definition 32. An information set for a player is a collection of decision nodes satisfying

(i) the player has the move at every node in the information set

(ii) when the play of the game reaches a node in the information set, the player does not know which node
in the information set has (or has not) been reached

(iii) it is the largest set satisfying (i) and (ii)

ˆ Part (ii) implies that the player must have the same set of feasible actions at each decision node in an
information set

99
ˆ In an extensive-form game, we will indicate that a collection of decision nodes constitutes an information
set by connecting the nodes by a dotted line

Extensive-form of the Prisoners’ Dilemma

ˆ Fink = confess

Information set: an example

ˆ We propose a second example of the use of an information set in representing ignorance of a previous play

ˆ Consider the following dynamic game of complete but imperfect information

1. Player 1 chooses an action a1 from the feasible set A1 = {L, R}


2. Player 2 observes a1 and then chooses an action a2 from the feasible set A2 = {L′ , R′ }
3. Player 3 observes whether or not (a1 , a2 ) = (R, R′ ) and then chooses an action a3 from the feasible
set A3 = {L′′ , R′′ }

ˆ Player 3 has two information sets

1. a singleton information set following R by player 1 and R′ by player 2


2. a non-singleton information set that includes every other node at which player 3 has the move

100
Perfect and imperfect information

ˆ We previously defined perfect information to mean that at each move in the game the player with the
move knows the full history of the play of the game thus far

ˆ An equivalent definition is that every information set is a singleton

ˆ Imperfect information means that there is at least one non-singleton information set

ˆ The extensive-form representation of a simultaneous-move game (such as the Prisoners’ Dilemma) is a


game of imperfect information

Subgame-perfect Nash equilibrium

Subgames

ˆ We gave a formal definition of a subgame for repeated games

ˆ We extend this definition to general dynamic games of complete information in terms of the game’s
extensive-form representation

Definition 33. A subgame in an extensive-form game is a game that

(a) begins at a decision node n that is a singleton information set but is not the game’s first decision node

(b) includes all the decision and terminal nodes following n in the game tree but no nodes that do not follow
n

(c) does not cut any information sets, i.e., if a decision node n′ follows n in the game tree, then all other
nodes in the information set containing n must also follow n, and so must be included in the subgame

Subgames: example

ˆ There are two subgames, one beginning at each of player 2’s decision nodes

101
Subgames: example

ˆ There are no subgames

Subgames: example

ˆ There is only one subgame: it begins at player 3’s decision node following R by player 1 and R′ by player
2

ˆ Because of part (c), a subgame does not begin at either of player 2’s decision nodes, even though both of
these nodes are singleton information sets

102
Subgame perfect Nash equilibrium

Definition 34. A profile of strategies of a dynamic game with complete information is a subgame perfect
Nash equilibrium if it is a Nash equilibrium of the initial game and the players’ strategies restricted to every
subgame constitute a Nash equilibrium of the subgame

ˆ We already encountered two game solutions for dynamic games: backwards induction outcome and subgame
perfect outcome

ˆ The difference is that a SPNE is a collection of strategies and a strategy is a complete plan of actions

ˆ Whereas an outcome describes what will happen only in the contingencies that are expected to arise, not
in every contingency that might arise

Equilibrium vs outcome
Consider the standard two-stage game of complete and perfect information defined as follows

1. Player 1 chooses an action a1 from a feasible set A1

2. Player 2 observes a1 and then chooses an action a2 from a feasible set A2

3. Payoffs are u1 (a1 , a2 ) and u2 (a1 , a2 )

Assume that for each a1 in A1 , player 2’s optimization problem

arg max{u2 (a1 , a2 ) : a2 ∈ A2 }

has a unique solution, denoted by R2 (a1 )

ˆ Player 1’s problem at the first stage amounts to

arg max{u1 (a1 , R2 (a1 )) : a1 ∈ A1 }

ˆ Assume that the previous optimization problem for player 1 also has a unique solution, denoted by a∗1

ˆ The pair of actions (a∗1 , R2 (a∗1 )) is the backwards induction outcome of this game
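This backwards-induction construction can be sketched as code; the payoffs below are those of the example game tree used earlier in these notes, and uniqueness of the maximizers is assumed, as in the text:

```python
def backward_induction(A1, A2, u1, u2):
    """Player 2 best-responds to each a1; player 1 anticipates R2 (uniqueness assumed)."""
    def R2(a1):
        return max(A2, key=lambda a2: u2(a1, a2))
    a1_star = max(A1, key=lambda a1: u1(a1, R2(a1)))
    return a1_star, R2(a1_star)

# Payoffs of the example game tree used earlier in the notes
A1, A2 = ["L", "R"], ["L'", "R'"]
U = {("L", "L'"): (3, 1), ("L", "R'"): (1, 2), ("R", "L'"): (2, 1), ("R", "R'"): (0, 0)}
u1 = lambda a1, a2: U[(a1, a2)][0]
u2 = lambda a1, a2: U[(a1, a2)][1]

print(backward_induction(A1, A2, u1, u2))  # ('R', "L'")
```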

ˆ To define a SPNE we need to construct strategies

ˆ For player 1 a strategy coincides with an action since there is only one contingency in which player 1 can
be called upon to act – the beginning of the game

ˆ A strategy for player 2 is a function a1 ↦ f2 (a1 ) from A1 to A2

– R2 (a∗1 ) is an action but not a strategy


– the best response function R2 is a possible strategy for player 2

ˆ In this game, the subgames begin with player 2’s move in the second stage

ˆ There is one subgame for each player 1’s feasible action a1

103
Remark 25. The profile of strategies f ∗ ≡ (a∗1 , R2 ) is a SPNE

ˆ We have to show that f ∗ = (a∗1 , R2 ) is a NE and that the restriction to each subgame is also a NE

ˆ Subgames are simply single-person decision problems

– Being a NE reduces to requiring that player 2’s action be optimal in every subgame
– This is exactly the problem that the best-response function R2 solves

ˆ Now we have to prove that f ∗ is a Nash equilibrium

ˆ Recall that a∗1 satisfies

u1 (a∗1 , R2 (a∗1 )) ≥ u1 (a1 , R2 (a1 )) ∀a1 ∈ A1

implying that a∗1 is a best response to R2

ˆ R2 is a best response to a∗1 since

u2 (a∗1 , R2 (a∗1 )) ≥ u2 (a∗1 , f2 (a∗1 ))

for every strategy f2 : A1 → A2

Consider the standard two-stage game of complete but imperfect information defined as follows:

ˆ Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively

ˆ Players i3 and i4 observe the outcome of the first stage, (ai1 , ai2 ), and then simultaneously choose actions
ai3 and ai4 from feasible sets Ai3 and Ai4 , respectively

ˆ Payoffs are ui (ai1 , ..., ai4 )

ˆ We will assume that for each feasible outcome (ai1 , ai2 ) of the first game, the second-stage game that
remains between players i3 and i4 has a unique NE denoted by

(âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

ˆ Assume that (a∗i1 , a∗i2 ) is the unique NE of the first-stage interaction between i1 and i2 defined by the
following simultaneous-move game

1. Players i1 and i2 simultaneously choose actions ai1 and ai2 from feasible sets Ai1 and Ai2 , respectively
2. Payoffs are
ui (ai1 , ai2 , âi3 (ai1 , ai2 ), âi4 (ai1 , ai2 ))

Proposition 12. In the two-stage game of complete but imperfect information defined above, the subgame
perfect outcome is
(a∗i1 , a∗i2 , âi3 (a∗i1 , a∗i2 ), âi4 (a∗i1 , a∗i2 ))

104
but the subgame perfect Nash equilibrium is

(a∗i1 , a∗i2 , âi3 , âi4 )

Subgame perfect Nash equilibrium and credible threats


Consider the following dynamic game with complete and perfect information

ˆ The backwards induction outcome of the game is (R, L′ )

ˆ The SPNE is the profile (R, f2 ) where f2 : {L, R} → {L′ , R′ } is defined by

f2 (L) = R′ and f2 (R) = L′

Recall that the normal-form representation of this game is given by

                            Player 2
                (L′ , L′ )   (L′ , R′ )   (R′ , L′ )   (R′ , R′ )
Player 1   L     3, 1         3, 1         1, 2         1, 2
           R     2, 1         0, 0         2, 1         0, 0

ˆ There are two NE : (R, (R′ , L′ )) and (L, (R′ , R′ ))

ˆ The first one corresponds to the SPNE (R, f2 )

105
ˆ The second one corresponds to a non-credible threat of player 2

ˆ Player 2 is threatening to play R′ if player 1 plays R

ˆ If the threat works then 2 is not given the opportunity to carry out the threat

ˆ The threat should not work because it is not credible:

– if player 2 were given the opportunity to carry it out,


– then player 2 would decide to play L′ rather than R′

ˆ Observe that the players’ strategies do not constitute a NE in one of the subgames

106
Cap. 3 - Static Games of Incomplete Information
3.1 Static Bayesian games and Bayesian Nash equilibrium
Introduction

ˆ In a game of complete information the players’ payoff functions are common knowledge

ˆ In a game of incomplete information, at least one player is uncertain about another player’s payoff function

ˆ An example of a static game of incomplete information is a sealed-bid auction:

– each bidder knows his or her own valuation for the good being sold
– but each bidder does not know any other bidder’s valuation
– bids are submitted in sealed envelopes, so the players’ moves can be thought of as simultaneous

Cournot competition under asymmetric information

An example:

ˆ Consider a Cournot duopoly model with inverse demand given by P (Q) = a − Q, where Q = q1 + q2 is
the aggregate quantity in the market

ˆ Firm 1’s cost function is C1 (q1 ) = cq1

ˆ Firm 2’s cost function is

C2 (q2 ) = cH q2   with probability θ
C2 (q2 ) = cL q2   with probability 1 − θ

where cL < cH

ˆ Information is asymmetric:

– Firm 2 knows its cost function and firm 1’s


– Firm 1 knows its cost function and only that firm 2’s marginal cost is cH with probability θ and cL
with probability 1 − θ
– Firm 2 could be a new entrant to the industry or could have just invented a new technology

ˆ All of this is common knowledge

– firm 1 knows that firm 2 has superior information


– firm 2 knows that firm 1 knows this, and so on

ˆ Firm 2 may want to choose a different (and presumably lower) quantity if its marginal cost is high than
if it is low

ˆ Firm 1 should anticipate that firm 2 may tailor its quantity to its cost in this way

ˆ Let q2 (cH ) and q2 (cL ) denote firm 2’s quantity choices as a function of its cost

107
ˆ Let q1 denote firm 1’s single quantity choice

ˆ If firm 2’s cost is high, it will choose q2 (cH ) to solve

arg max{[(a − q1 − q2 ) − cH ]q2 : q2 ≥ 0}

ˆ If firm 2’s cost is low, it will choose q2 (cL ) to solve

arg max{[(a − q1 − q2 ) − cL ]q2 : q2 ≥ 0}

ˆ Firm 1 knows that firm 2’s cost is high with probability θ and should anticipate that firm 2’s quantity
choice will be q2 (cH ) or q2 (cL ), depending on firm 2’s cost

ˆ Firm 1 chooses q1 to solve


arg max{f1 (q1 , q2∗ ) : q1 ≥ 0}

where
f1 (q1 , q2∗ ) ≡ θ[(a − q1 − q2∗ (cH )) − c]q1 + (1 − θ)[(a − q1 − q2∗ (cL )) − c]q1

so as to maximize expected profits

ˆ The first order conditions are

q2∗ (cH ) = [a − q1∗ − cH ]/2
q2∗ (cL ) = [a − q1∗ − cL ]/2

and

q1∗ = {θ[a − q2∗ (cH ) − c] + (1 − θ)[a − q2∗ (cL ) − c]}/2
ˆ We assume that parameters are such that these FOCs characterize the solutions to the optimization
problems

The solutions to the three FOCs are

q2∗ (cH ) = (a − 2cH + c)/3 + [(1 − θ)/6](cH − cL )

q2∗ (cL ) = (a − 2cL + c)/3 − (θ/6)(cH − cL )

and

q1∗ = [a − 2c + θcH + (1 − θ)cL ]/3
ˆ Consider the Cournot equilibrium under complete information with costs c1 and c2

ˆ Provided that c1 and c2 are such that both equilibrium quantities are positive, firm i produces in equilibrium the quantity

q̂i = (a − 2ci + cj )/3

108
ˆ In the incomplete information case, q2∗ (cH ) is greater than (a − 2cH + c)/3 and q2∗ (cL ) is less than (a −
2cL + c)/3

ˆ This occurs because firm 2 not only tailors its quantity to its costs

ˆ But also responds to the fact that firm 1 cannot do so
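As a numeric sanity check, the following Python sketch plugs hypothetical parameter values (a = 10, c = 1, cH = 3, cL = 2, θ = 1/2; these numbers are my own, not from the text) into the closed-form solutions and verifies the three first-order conditions as well as the comparison with the complete-information quantities.

```python
# Verify the asymmetric-information Cournot solution numerically.
a, c, cH, cL, th = 10.0, 1.0, 3.0, 2.0, 0.5   # hypothetical parameters

q1 = (a - 2*c + th*cH + (1 - th)*cL) / 3
q2H = (a - 2*cH + c) / 3 + (1 - th)/6 * (cH - cL)
q2L = (a - 2*cL + c) / 3 - th/6 * (cH - cL)

# Each quantity must satisfy its first-order condition (mutual best responses)
assert abs(q2H - (a - q1 - cH)/2) < 1e-9
assert abs(q2L - (a - q1 - cL)/2) < 1e-9
assert abs(q1 - (th*(a - q2H - c) + (1 - th)*(a - q2L - c))/2) < 1e-9

# q2*(cH) exceeds the complete-information quantity, q2*(cL) falls short of it
assert q2H > (a - 2*cH + c)/3 and q2L < (a - 2*cL + c)/3
print(round(q1, 4), round(q2H, 4), round(q2L, 4))  # 3.5 1.75 2.25
```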

Normal-form representation of static Bayesian games

ˆ Recall that the normal-form representation of a game of complete information is G = (Si , ui )i∈I

– Si is player i’s strategy space


– ui (si , s−i ) is player i’s payoff when he chooses the strategy si and the others choose s−i

ˆ In a simultaneous-move game of complete information a strategy for a player is simply an action

ˆ We can write G = (Ai , ui )i∈I where Ai is player i’s action space and ui (ai , a−i ) is player i’s payoff

ˆ The timing of a static game of complete information is as follows

1. the players simultaneously choose actions


2. payoffs are received

ˆ We develop the normal-form representation of a simultaneous-move game of incomplete information, also


called static Bayesian game

ˆ We should represent the idea that each player knows his or her payoff function but may be uncertain
about the other players’ payoff functions

ˆ Let player i’s possible payoff functions be represented by ui (ai , a−i ; ti ) where ti is called player i’s type

ˆ The type ti belongs to a set of possible types Ti also called type space

ˆ Each ti corresponds to a different payoff function that player i might have

ˆ For example, suppose that player i has two possible payoff functions

ˆ We would say that player i has two types ti1 and ti2

ˆ Player i’s type space is Ti = {ti1 , ti2 } and player i’s two payoff functions are ui (a; ti1 ) and ui (a; ti2 )

ˆ We can also represent the possibility that the player might have different sets of feasible actions

– Suppose for example that player i’s set of feasible actions is {a, b} with probability q and {a, b, c}
with probability 1 − q
– We can say that i has two types: ti1 and ti2 where the probability of ti1 is q

109
– We can define i’s feasible set of actions to be {a, b, c} for both types but define the payoff from taking
action c to be −∞ for type ti1

Another example

ˆ Consider the Cournot game previously presented

ˆ The firms’ actions are their quantity choices, q1 and q2

ˆ Firm 2 has two possible cost functions and thus two possible profit or payoff functions

π2 (q1 , q2 ; cL ) = [(a − q1 − q2 ) − cL ]q2

and
π2 (q1 , q2 ; cH ) = [(a − q1 − q2 ) − cH ]q2

ˆ Firm 1 has only one possible payoff function

π1 (q1 , q2 ; c) = [(a − q1 − q2 ) − c]q1

ˆ We say that firm 2’s type space is T2 = {cL , cH } and that firm 1’s type space is T1 = {c}

ˆ Saying that player i knows his or her own payoff function is equivalent to saying that player i knows his
or her type

ˆ Saying that player i may be uncertain about the other players’ payoff functions is equivalent to saying
that player i may be uncertain about the types of the other players, denoted by

t−i = (t1 , · · · , ti−1 , ti+1 , · · · , tn )

ˆ We use T−i to denote the set of all possible values of t−i , i.e.,

T−i ≡ ∏j≠i Tj = T1 × · · · × Ti−1 × Ti+1 × · · · × Tn

ˆ We use the probability distribution π(t−i |ti ) to denote player i’s belief about the other players’ types, t−i ,
given player i’s knowledge of his or her own type, ti

ˆ In many applications, the players’ types are independent, in which case π(t−i |ti ) does not depend on ti ,
so we can write player i’s beliefs as π(t−i )

ˆ Imagine two firms racing to develop a new technology

ˆ Each firm’s chance of success depends in part on how difficult the technology is to develop, which is not
known

110
ˆ Each firm knows only whether it has succeeded and not whether the other has

ˆ If firm 1 has succeeded, then it is more likely that the technology is easy to develop and so also more
likely that firm 2 has succeeded

ˆ Firm 1’s belief about firm 2’s type depends on firm 1’s knowledge of its own type

Definição 35. The normal-form representation of a static Bayesian game is

G = (Ai , Ti , pi , ui )i∈I

where

ˆ Ai is player i’s action space

ˆ Ti is player i’s type space

ˆ pi ∈ P rob(T ) is player i’s beliefs about T = Ti × T−i

ˆ ui : Ai × A−i × Ti 7→ [−∞, ∞) is player i’s payoff function

ui (ai , a−i ; ti )

The normal-form representation of a static Bayesian game is

G = (Ai , Ti , pi , ui )i∈I

ˆ Player i’s type ti is privately known by player i, determines player i’s payoff function ui (ai , a−i ; ti )

ˆ Player i’s belief

pi (t−i |ti ) = pi ({(ti , t−i )}) / pi ({ti } × T−i )

describes i’s uncertainty about the other players’ possible types t−i , given i’s own type ti

111
ˆ To simplify notations,

– pi ({(ti , t−i )}) is denoted by pi (ti , t−i ) or pi (t)


– pi ({ti } × T−i ) is denoted by pi (ti )

ˆ Therefore

pi (t−i |ti ) = pi (ti , t−i )/pi (ti )

ˆ Since player i observes his own type, we do not need to define the probability pi on the whole space T

ˆ One may consider as a primitive of the game the conditional probabilities

(pi (·|ti ))ti ∈Ti ∈ [P rob(T−i )]Ti

i.e., player i’s beliefs can be represented by a function

ti 7→ pi (·|ti )

from Ti to P rob(T−i )

Following Harsanyi (1968) we will assume that the timing of a static Bayesian game is as follows

1. Nature draws a type vector t = (ti )i∈I where ti is drawn from the set of possible types Ti

2. Nature reveals ti to player i but not to any other player

3. The players simultaneously choose actions, player i choosing ai from the feasible set Ai

4. Payoffs ui (ai , a−i ; ti ) are received

ˆ We can interpret a game of incomplete information as a game of imperfect information since at some move
in the game the player with the move does not know the complete history of the game thus far

ˆ Indeed, nature reveals player i’s type to player i but not to player j in step (2)

ˆ Player j does not know the complete history of the game when actions are chosen in step (3)

ˆ For some games, player i’s payoff may depend not only on the actions (ai , a−i ) and his own type ti , but also on all the other types t−i

ˆ In that case player i’s payoff is denoted by ui (ai , a−i ; ti , t−i )

ˆ We will assume that it is common knowledge that in step (1) of the timing, nature draws a type vector
t = (ti )i∈I according to a common prior probability distribution p ∈ P rob(T )

ˆ When nature reveals ti to player i, he can compute the belief pi (t−i |ti ) using Bayes’ rule

pi (t−i |ti ) = p(t−i |ti ) ≡ p(ti , t−i )/p(ti ) = p(ti , t−i ) / ∑τ−i ∈T−i p(ti , τ−i )

112
ˆ The other players can compute the various beliefs π(·|ti ) that player i might hold, depending on i’s type
ti

ˆ We will frequently assume that players’ types are independent, i.e., there exists qi ∈ P rob(Ti ) such that

p(t1 , · · · , tn ) = q1 (t1 ) × · · · × qn (tn )

ˆ In this case pi (t−i |ti ) does not depend on ti since

pi (t−i |ti ) = q1 (t1 ) × · · · × qi−1 (ti−1 ) × qi+1 (ti+1 ) × · · · × qn (tn )

ˆ In this case the other players know i’s belief about their types

Definition of Bayesian Nash equilibrium

ˆ In order to define an equilibrium concept for static Bayesian games, we must first define the player’s
strategies

ˆ Recall that a player’s strategy is a complete plan of action, specifying a feasible action in every contingency
in which the player might be called on to act

ˆ Given the timing of a static Bayesian game, in which nature begins the game by drawing the players’
types, a (pure) strategy for player i must specify a feasible action for each of player i’s possible types

Definição 36. A strategy for player i in the static Bayesian game G = (Ai , Ti , pi , ui )i∈I is a function

si : Ti 7→ Ai

which specifies for each type ti ∈ Ti an action si (ti ) from the feasible set Ai

ˆ In a Bayesian game the strategy spaces are constructed from the type and action spaces

ˆ Player i’s set of possible (pure) strategies, Si , is the set of all possible functions with domain Ti and range Ai , i.e.,

Si ≡ [Ai ]Ti

ˆ In a separating strategy, each type ti chooses a different action ai

ˆ In a pooling strategy, all types choose the same action

ˆ It may seem unnecessary to require player i’s strategy to specify a feasible action for each of player i’s possible types

113
ˆ Once nature has drawn a particular type and revealed it to a player, it may seem that the player need
not be concerned with the actions he should have taken had nature drawn some other type

ˆ But player i needs to consider what the other players will do

ˆ What they will do depends on what they think player i will do, for each ti in Ti (since they do not observe ti )

ˆ In deciding what to do once one type has been drawn, player i will have to think about what he would have done had any of the other types in Ti been drawn

ˆ When player j has to decide what to do, he should think about what player i may do for each possible type in Ti , since player j cannot observe player i’s type

Definição 37. In the static game G = (Ai , Ti , pi , ui )i∈I the profile of strategies

s∗ = (s∗i )i∈I

is a (pure strategy) Bayesian Nash equilibrium (BNE) if for each player i and for each of i’s types ti in Ti ,
the action s∗i (ti ) solves

arg max{ ∑t−i ∈T−i ui (ai , s∗−i (t−i ); (ti , t−i )) pi (t−i |ti ) : ai ∈ Ai }

where ui (ai , s∗−i (t−i ); t) is given by

ui (s∗1 (t1 ), · · · , s∗i−1 (ti−1 ), ai , s∗i+1 (ti+1 ), · · · , s∗n (tn ); t)

ˆ One may also write in condensed form

s∗i (ti ) ∈ arg max{Ep [ui (ai , s∗−i )|ti ] : ai ∈ Ai }

ˆ In a static BNE, no player wants to change his strategy, even if the change involves only one action by
one type

ˆ We can show that in a finite static Bayesian game (i.e., a game in which Ai and Ti are finite sets) there
exists a BNE, perhaps in mixed strategies

Proposição 13. Assume that (s∗i )i∈I is a Bayesian Nash equilibrium, i.e.,

∀ti , s∗i (ti ) ∈ arg max{Ep [ui (ai , s∗−i )|ti ] : ai ∈ Ai }

Then we have
s∗i ∈ arg max{Ep [ui (si , s∗−i )] : si ∈ Si = [Ai ]Ti }

114
where

Ep [ui (si , s∗−i )] ≡ ∑t∈T ui (si (ti ), s∗−i (t−i )) p(t).

Moreover, the converse is true.
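For finite games, the definition can be checked by exhaustive search. The sketch below uses a hypothetical example of my own (not from the text): player 1 has a single type, player 2 has two equally likely types, player 1 wants to match player 2’s action, type t1 wants to match player 1, and type t2 wants to mismatch. The brute-force search applies the BNE definition type by type.

```python
# Minimal brute-force BNE finder for a finite static Bayesian game.
from itertools import product

A = ["A", "B"]
types2 = ["t1", "t2"]
p = {t: 0.5 for t in types2}           # common prior over player 2's types

def u1(a1, a2, t):                     # player 1: payoff 1 on a match
    return 1.0 if a1 == a2 else 0.0

def u2(a1, a2, t):                     # type t1 matches, type t2 mismatches
    return (1.0 if a1 == a2 else 0.0) if t == "t1" else (1.0 if a1 != a2 else 0.0)

def is_bne(s1, s2):
    # player 1: s1 must maximize expected payoff over player 2's types
    eu = lambda a1: sum(p[t] * u1(a1, s2[t], t) for t in types2)
    if eu(s1) < max(eu(a) for a in A) - 1e-12:
        return False
    # player 2: each type's action must be optimal given s1
    return all(u2(s1, s2[t], t) >= max(u2(s1, a, t) for a in A) - 1e-12
               for t in types2)

bne = [(s1, s2["t1"], s2["t2"])
       for s1 in A
       for s2 in (dict(zip(types2, acts)) for acts in product(A, repeat=2))
       if is_bne(s1, s2)]
print(bne)  # [('A', 'A', 'B'), ('B', 'B', 'A')]
```

In both equilibria each of player 2’s types best-responds to player 1, and player 1 is indifferent ex ante, so the condition of the definition holds type by type.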

3.2 Applications
Mixed strategies revisited

ˆ Harsanyi (1973) suggested the following interpretation of mixed strategies

ˆ Player j’s mixed strategy represents player i’s uncertainty about j’s choice of a pure strategy

ˆ Player j’s choice in turn depends on the realization of a small amount of private information

ˆ More precisely, a mixed-strategy NE in a game of complete information can (almost always) be interpreted
as a pure-strategy BNE in a closely related game with a little bit of incomplete information

ˆ The crucial feature of a mixed-strategy NE is not that player j chooses a strategy randomly, but rather
that player i is uncertain about player j’s choice

The Battle of Sexes

Pat
Opera Fight
Opera 2, 1 0, 0
Chris
Fight 0, 0 1, 2

ˆ There are two pure-strategy Nash equilibria: (Opera,Opera) and (Fight,Fight)

ˆ And a mixed-strategy NE in which

– Chris plays Opera with probability 2/3


– Pat plays Fight with probability 2/3

The Battle of Sexes with incomplete information

ˆ Suppose that Chris and Pat are not quite sure of each other’s payoffs

ˆ For instance, suppose that Chris’s payoff if both attend the Opera is 2 + tc , where tc is privately known
by Chris

ˆ Pat’s payoff if both attend the Fight is 2 + tp , where tp is privately known by Pat

ˆ The parameters tc and tp are independent draws from a uniform distribution on [0, x], where x should be thought of as small relative to 2

115
ˆ All the other payoffs are the same

The abstract static Bayesian games in normal-form is

G = {Ac , Ap ; Tc , Tp ; pc , pp ; uc , up }

where

ˆ the action spaces are Ac = Ap = {Opera, F ight}

ˆ the type space are Tc = Tp = [0, x]

ˆ the beliefs are


pc (X|tc ) = pp (X|tp ) = λ(X)/x

for all X ⊂ [0, x], and the payoffs are as follows

Pat
Opera Fight
Opera 2 + tc , 1 0, 0
Chris
Fight 0, 0 1, 2 + tp

ˆ Fix two critical values c and p in [0, x]

ˆ Consider the strategy profile s∗ = (s∗c , s∗p ) defined as follows

ˆ Chris plays Opera if tc exceeds the critical value c and plays F ight otherwise, i.e.,
(
Opera if tc > c
s∗c (tc ) =
F ight if tc ≤ c

ˆ Pat plays F ight if tp exceeds the critical value p and plays Opera otherwise, i.e.,
(
F ight if tp > p
s∗p (tp ) =
Opera if tp ≤ p

– Chris plays Opera with probability (x − c)/x


– Pat plays F ight with probability (x − p)/x

ˆ For a given value of x, we will determine values of c and p such that these strategies are a BNE

ˆ Given Pat’s strategy s∗p , Chris’s expected payoff from playing Opera is

uc (Opera, s∗p ; tc ) = (p/x)(2 + tc ) + [1 − (p/x)] · 0 = (p/x)(2 + tc )

and from playing Fight it is

uc (Fight, s∗p ; tc ) = (p/x) · 0 + [1 − (p/x)] · 1 = 1 − (p/x)

116
ˆ Playing Opera is optimal (best response) if and only if

(p/x)(2 + tc ) ≥ 1 − (p/x), i.e., tc ≥ (x/p) − 3 ≡ c

ˆ Given Chris’s strategy s∗c , Pat’s expected payoff from playing Fight is

up (Fight, s∗c ; tp ) = [1 − (c/x)] · 0 + (c/x)(2 + tp ) = (c/x)(2 + tp )

and from playing Opera it is

up (Opera, s∗c ; tp ) = [1 − (c/x)] · 1 + (c/x) · 0 = 1 − (c/x)

ˆ Playing Fight is optimal if and only if

(c/x)(2 + tp ) ≥ 1 − (c/x), i.e., tp ≥ (x/c) − 3 ≡ p
ˆ Solving c = (x/p) − 3 and p = (x/c) − 3 simultaneously yields p = c and p² + 3p − x = 0

ˆ Solving the quadratic equation then shows that both

– the probability that Chris plays Opera, namely (x − c)/x, and


– the probability that Pat plays F ight, namely (x − p)/x,

equal

1 − [−3 + √(9 + 4x)]/(2x)
which approaches 2/3 as x approaches zero

As the incomplete information disappears (x → 0), the players’ behavior in this pure-strategy BNE of the
incomplete information game approaches the players’ behavior in the mixed-strategy NE in the original game
of complete information
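This limit can be sketched in a few lines of Python: solve p² + 3p − x = 0 for the critical value and evaluate the equilibrium probability (x − p)/x as x shrinks.

```python
# Purification limit in the incomplete-information Battle of the Sexes:
# the critical value is the positive root of p^2 + 3p - x = 0, and the
# probability that Chris plays Opera, (x - p)/x, tends to 2/3 as x -> 0.
import math

def prob_opera(x):
    pcrit = (-3 + math.sqrt(9 + 4 * x)) / 2   # positive root
    return (x - pcrit) / x

for x in (1.0, 0.1, 0.001):
    print(x, round(prob_opera(x), 4))

assert abs(prob_opera(1e-9) - 2/3) < 1e-6
```

By symmetry the same numbers give the probability that Pat plays Fight.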

An auction

ˆ Consider the following first-price, sealed-bid auction

ˆ There are two bidders, I = {1, 2}

117
ˆ Bidder i has a valuation vi for the good

– If bidder i gets the good and pays the price p, then i’s payoff is vi − p

ˆ The two players’ valuations are independently and uniformly distributed on [0, 1]

ˆ The players simultaneously submit their non-negative bids

ˆ The higher bidder wins the good and pays the price she bid; the other bidder gets and pays nothing

ˆ In case of a tie, the winner is determined by a flip of a coin

ˆ The bidders are risk-neutral and all this is common knowledge

The static Bayesian game associated to this problem is defined by

ˆ the action space Ai = [0, ∞)

ˆ the type space Ti = [0, 1]

ˆ Player i believes that vj is uniformly distributed on [0, 1]

∀V ∈ B([0, 1]), pi ({vj ∈ V }|vi ) = λ(V )

where λ is the Lebesgue measure on [0, 1]

ˆ Abusing notations, pi (·|vi ) is denoted pi (·) since it is independent of vi

ˆ Player i’s (expected) payoff function is

ui (b1 , b2 ; v1 , v2 ) = vi − bi if bi > bj ; (vi − bi )/2 if bi = bj ; 0 if bi < bj

ˆ A strategy for player i is a function


b̃i : vi 7→ b̃i (vi )

ˆ A profile (b̃1 , b̃2 ) is BNE if for each player i, for each valuation vi ∈ [0, 1], the value b̃i (vi ) belongs to
 
1
arg max (vi − bi )pi {bi > b̃j } + (vi − bi )pi {bi = b̃j } : bi ≥ 0
2

ˆ Recall that
pi {bi > b̃j } = λ{vj ∈ [0, 1] : bi > b̃j (vj )}

and
pi {bi = b̃j } = λ{vj ∈ [0, 1] : bi = b̃j (vj )}

An auction: existence of a linear equilibrium

118
ˆ We propose to look for a linear equilibrium of the form

b̃i : vi 7→ ai + ci vi

where ci > 0

ˆ We are not restricting the players’ strategy spaces to include only linear strategies

ˆ We allow players to choose arbitrary strategies but ask whether there is an equilibrium that is linear

ˆ Suppose that player j adopts the linear strategy

b̃j = aj + cj Id

where
Id : [0, 1] → [0, 1] is defined by Id(v) = v

ˆ For a given valuation vi , player i’s best response solves

max{(vi − bi )λ{bi > aj + cj Id} : bi ≥ 0}

where we recall that

λ{bi > aj + cj Id} = λ{vj ∈ [0, 1] : bi > aj + cj vj }

ˆ We have used the fact that λ{bi = b̃j } = 0

ˆ Observe that the best reply b̃i (vi ) must satisfy

aj ≤ b̃i (vi ) ≤ aj + cj

ˆ If bi belongs to [aj , aj + cj ] then

λ{bi > aj + cj Id} = λ[0, (bi − aj )/cj ) = (bi − aj )/cj

ˆ Player i’s best response is therefore


(
(vi + aj )/2 if vi ≥ aj
b̃i (vi ) =
aj if vi < aj

ˆ If 0 < aj < 1 then there are some values of vi such that vi < aj , in which case b̃i is not linear

ˆ Can we find a NE where aj ≥ 1 or aj ≤ 0?

ˆ Assume that aj ≤ 0; in this case player i’s best response is

b̃i (vi ) = aj /2 + vi /2

119
ˆ The function b̃i takes the form ai + ci Id where ai = aj /2 and ci = 1/2

ˆ Imposing the same condition on player j gives aj = ai /2, so ai = aj = 0 and ci = cj = 1/2

ˆ This implies that ((1/2)Id, (1/2)Id) is a BNE
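A quick numeric confirmation: against an opponent who bids b̃(v) = v/2 with v uniform on [0, 1], a grid search over own bids recovers vi /2 as the expected-payoff maximizer (a sketch; the grid resolution and test valuations are my own choices).

```python
# Best-response check for the first-price auction: if the opponent bids
# vj/2 with vj ~ U[0,1], then bidding bi wins with probability min(2*bi, 1),
# so the expected payoff is (vi - bi) * min(2*bi, 1), maximized at bi = vi/2.
def expected_payoff(vi, bi):
    win_prob = min(max(2 * bi, 0.0), 1.0)
    return (vi - bi) * win_prob

grid = [k / 10000 for k in range(5001)]   # candidate bids from 0 to 0.5
for vi in (0.3, 0.6, 0.9):
    best = max(grid, key=lambda b: expected_payoff(vi, b))
    assert abs(best - vi / 2) < 1e-3, (vi, best)
print("b(v) = v/2 maximizes the expected payoff on the grid")
```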

An auction: uniqueness

ˆ We propose to prove that there is a unique symmetric BNE which is the linear equilibrium already derived

ˆ A BNE is called symmetric if the players’ strategies are identical

ˆ We propose to prove that there is a single function b̃ such that (b̃, b̃) is a BNE

ˆ Since players’ valuations typically will be different, their bids typically will be different, even if both use
the same strategy

ˆ Suppose that player j adopts a strategy b̃ and assume that b̃ is strictly increasing and differentiable

ˆ For a given value of vi , player i’s optimal bid solves5

max{(vi − bi )λ{bi > b̃} : bi ≥ 0}

If bi ∈ Im(b̃) then

{bi > b̃} = [0, b̃−1 (bi ))

implying the first-order condition

−b̃−1 (b̃i (vi )) + (vi − b̃i (vi ))[b̃−1 ]′ (b̃i (vi )) = 0

ˆ In order to get a symmetric BNE we need to have b̃i = b̃

ˆ The first-order condition is then

−b̃−1 (b̃(vi )) + (vi − b̃(vi ))[b̃−1 ]′ (b̃(vi )) = 0

ˆ Since we have b̃−1 (b̃(vi )) = vi and

[b̃−1 ]′ (b̃(vi )) = 1/b̃′ (vi )

ˆ Then the function b̃ must satisfy

vi b̃′ (vi ) + b̃(vi ) = vi

ˆ Observe that

[Id × b̃]′ (vi ) = vi b̃′ (vi ) + b̃(vi )

ˆ This leads to

[Id × b̃ − (1/2)Id² ]′ = 0

5. Observe that λ{bi = b̃} = 0

120
ˆ Therefore there exists a constant k such that

∀vi ∈ [0, 1], vi b̃(vi ) = (1/2)vi² + k

ˆ We need a boundary condition to determine k

ˆ A player’s action should be individually rational: no player should bid more than his valuation, so we require b̃ ≤ Id

ˆ Then vi /2 + k/vi ≤ vi for every vi ∈ (0, 1], which forces k ≤ 0; since bids are non-negative, k ≥ 0

ˆ This implies k = 0 and

b̃ = (1/2)Id

A double auction

ˆ We consider the case in which a buyer and a seller each have private information about their valuations

ˆ The seller names an asking price ps

ˆ The buyer simultaneously names an offer price pb

ˆ If pb ≥ ps then trade occurs at price p = (pb + ps )/2

ˆ If pb < ps then no trade occurs

ˆ The buyer’s valuation for the seller’s good is vb , the seller’s valuation is vs

ˆ These valuations are private information and are drawn from independent uniform distribution on [0, 1]

ˆ If the buyer gets the good for price p then his utility is vb − p; if there is no trade the buyer’s utility is
zero

ˆ If the seller sells the good for price p then his utility is p − vs ; if there is no trade the seller’s utility is
zero

ˆ A strategy for the buyer is a function p̃b : vb 7→ p̃b (vb ) specifying the price the buyer will offer for each of
his possible valuation

ˆ A strategy for the seller is a function p̃s : vs 7→ p̃s (vs ) specifying the price the seller will demand for each
of his possible valuation

ˆ A profile of strategies (p̃b , p̃s ) is a BNE if the two following conditions hold

ˆ For each vb ∈ [0, 1], the price p̃b (vb ) solves

max{ ∫{pb ≥p̃s } [vb − (pb + p̃s (vs ))/2] λ(dvs ) : pb ≥ 0 }

121
where
{pb ≥ p̃s } = {vs ∈ [0, 1] : pb ≥ p̃s (vs )}

ˆ For each vs ∈ [0, 1], the price p̃s (vs ) solves

max{ ∫{ps ≤p̃b } [(ps + p̃b (vb ))/2 − vs ] λ(dvb ) : ps ≥ 0 }

where

{ps ≤ p̃b } = {vb ∈ [0, 1] : ps ≤ p̃b (vb )}

ˆ Equivalently, the profile of strategies (p̃b , p̃s ) is a BNE if the two following conditions hold

ˆ For each vb ∈ [0, 1], the price p̃b (vb ) solves

max{ [vb − (pb + E[p̃s |pb ≥ p̃s ])/2] λ{pb ≥ p̃s } : pb ≥ 0 }

where we recall that

E[p̃s |pb ≥ p̃s ] = [1/λ{pb ≥ p̃s }] ∫{pb ≥p̃s } p̃s (vs ) λ(dvs )

ˆ For each vs ∈ [0, 1], the price p̃s (vs ) solves

max{ [(ps + E[p̃b |ps ≤ p̃b ])/2 − vs ] λ{ps ≤ p̃b } : ps ≥ 0 }

ˆ There are many BNE; we propose to exhibit one of them in which trade occurs at a single price, if it occurs at all

ˆ For any x ∈ [0, 1],

– let the buyer’s strategy be to offer x if vb ≥ x and to offer 0 otherwise


– let the seller’s strategy be to demand x if vs ≤ x and to demand one otherwise

ˆ This profile of strategies is a BNE

122
ˆ Trade would be efficient for all (vs , vb ) pairs such that vb ≥ vs , but under this equilibrium it does not occur in the two regions where vb ≥ vs but either vb < x or vs > x

ˆ We propose to derive a linear Bayesian equilibrium

ˆ Suppose the seller’s strategy is p̃s : vs 7→ as + cs vs with cs > 0

ˆ If the buyer’s valuation is vb , his best reply p̃b (vb ) should solve

max{ [vb − (pb + (as + pb )/2)/2] · [pb − as ]+ /cs : pb ≥ 0 }

ˆ The first order condition for which yields

p̃b (vb ) = (2/3)vb + (1/3)as

ˆ Thus, if the seller plays a linear strategy, then the buyer’s best response is also linear

ˆ Analogously, suppose the buyer’s strategy is p̃b : vb 7→ ab + cb vb with cb > 0

ˆ If the seller’s valuation is vs , his best reply p̃s (vs ) should solve

max{ [(ps + E[p̃b |ps ≤ p̃b ])/2 − vs ] · (cb − [ps − ab ]+ )/cb : ps ≥ 0 }

where E[p̃b |ps ≤ p̃b ] equals ab + cb /2 if ps ≤ ab and (ab + cb + ps )/2 otherwise

ˆ The first order condition for which yields

p̃s (vs ) = (2/3)vs + (1/3)(ab + cb ) if vs ≥ ab − cb /2, and p̃s (vs ) = ab if vs < ab − cb /2

ˆ Thus, if the buyer plays a linear strategy, then the seller’s best response may also be linear

– It will be the case if ab ≤ cb /2

ˆ Assume ab ≤ cb /2

ˆ If the players’ linear strategies are to be best responses to each other then we get

cb = 2/3, cs = 2/3, ab = as /3, and as = (ab + cb )/3

ˆ We obtain ab = cb /8 implying that the condition ab ≤ cb /2 is satisfied

ˆ The linear strategies are then

p̃b (vb ) = (2/3)vb + 1/12 and p̃s (vs ) = (2/3)vs + 1/4

123
ˆ Trade occurs if and only if the pair (vs , vb ) is such that p̃b (vb ) ≥ p̃s (vs ), i.e., iff vb ≥ vs + (1/4)
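The fixed point can be confirmed numerically. The sketch below (grid resolution and test valuations are my own choices) checks that each player’s linear strategy attains the maximal expected payoff against the other’s, including the corner cases where no trade can occur.

```python
# Verify that pb(vb) = 2vb/3 + 1/12 and ps(vs) = 2vs/3 + 1/4 are mutual
# best responses in the double auction with uniform valuations on [0,1].
ab, cb, as_, cs = 1/12, 2/3, 1/4, 2/3        # 'as' is a Python keyword

def buyer_payoff(vb, pb):
    # seller asks as_ + cs*vs; trade iff vs <= (pb - as_)/cs
    q = min(max((pb - as_) / cs, 0.0), 1.0)  # probability of trade
    if q == 0.0:
        return 0.0
    exp_ask = as_ + cs * q / 2               # E[ask | trade]
    return (vb - (pb + exp_ask) / 2) * q

def seller_payoff(vs, ps):
    # buyer offers ab + cb*vb; trade iff vb >= (ps - ab)/cb
    lo = min(max((ps - ab) / cb, 0.0), 1.0)
    q = 1.0 - lo
    if q == 0.0:
        return 0.0
    exp_offer = ab + cb * (lo + 1.0) / 2     # E[offer | trade]
    return ((ps + exp_offer) / 2 - vs) * q

grid = [k / 2000 for k in range(2001)]
for v in (0.2, 0.5, 0.8):
    assert buyer_payoff(v, 2*v/3 + 1/12) >= max(buyer_payoff(v, p) for p in grid) - 1e-9
    assert seller_payoff(v, 2*v/3 + 1/4) >= max(seller_payoff(v, p) for p in grid) - 1e-9
print("linear strategies are mutual best responses on the grid")
```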

A double auction: comparing the solutions

ˆ In both cases, the one-price and linear equilibria, the most valuable trade (vs = 0 and vb = 1) does occur

ˆ The one-price equilibrium misses some valuable trades

– vs = 0 and vb = x − ε where ε is small

and achieves some trades that are worth next to nothing

– vs = x − ε and vb = x + ε where ε is small

ˆ The linear equilibrium, in contrast, misses all trades worth next to nothing but achieves all trades worth
at least 1/4

124
ˆ This suggests that the linear equilibrium may dominate the one-price equilibria, in terms of the expected
gains the players receive

ˆ One may wonder if it is possible to find other BNE for which the players might do even better

ˆ Myerson and Satterthwaite (JET, 1983) show that, for the uniform valuation distributions, the linear equilibrium yields higher expected gains for the players than any other Bayesian Nash equilibrium of the double auction

ˆ This implies that there is no BNE of the double auction in which trade occurs if and only if it is efficient
(i.e., if and only if vb ≥ vs )

3.3 The Revelation Principle


Mechanism Design

ˆ It is common for one of the players to be able to set the terms under which a strategic interaction will unfold

ˆ When the government decides to privatize one of its firms

– the government is the seller
– but it is also the agent who sets the rules of the game (the privatization rules)

ˆ An entrepreneur who decides to sell his firm can likewise set the terms under which the negotiation with potential buyers will take place

ˆ The Central Bank, in its auctions of government bonds, sets the rules of those auctions itself

Private information and mechanisms

ˆ The fact that some players hold private information is essential for the player who sets the rules to be able to maximize his payoff

ˆ When a firm (or a Central Bank bond) is sold, the value the buyer is willing to pay is generally not known

ˆ A rule of the game is called a mechanism

Questão 11. How should a mechanism be designed so that a given objective is attained?

Basic constraints on the mechanism

ˆ The player who has the power to design the mechanism is free to set the rules of the game

ˆ However, he faces two constraints
125
1. individual rationality: the mechanism designer cannot resort to any form of coercion
– the players involved in the mechanism must play voluntarily
2. incentive compatibility: the mechanism designer must hold reasonable expectations about the other players’ behavior
– the players involved in the mechanism will not play anything that is not an equilibrium of the mechanism consistent with their own interests

Example

ˆ A government has decided to sell a state-owned firm

ˆ Only one buyer qualified, in a prior screening process, to acquire the firm

ˆ There are two types of buyer:

– one type who assigns the firm a high value a > 0
– one type who assigns it a low value b, with a > b > 0

ˆ The government does not know the buyer’s type

ˆ But it knows he can be of one of two types, a or b

ˆ The possible buyer types, {a, b}, are common knowledge among the players

ˆ Let v be the amount paid for the firm

– If the buyer is of type t ∈ {a, b}, he obtains a surplus equal to t − v

ˆ The government seeks to sell the firm at the highest possible price

ˆ The government does not know the buyer’s type

ˆ It assigns probability p to the buyer being of type t = a

ˆ Suppose that a = 30, b = 10 and p = 0.5

Example: trivial mechanisms

ˆ The government has two simple mechanisms available

1. Ask the buyer what his type is or, equivalently, how much he is willing to pay for the firm
2. Set a price that the buyer, whatever his type, is willing to pay

ˆ In the first case, the buyer will always declare that he is of type b, who values the firm less

– As a consequence, the firm is sold at the value v = b

126
ˆ The outcome is the same as when the government sets a price acceptable to either of the two buyer types

ˆ There are alternatives that can produce better results for the government

Example: a more sophisticated mechanism

ˆ The government can set up a mechanism in which the sale is guaranteed whenever a price above v̂ = 17 is offered

ˆ If a lower value v is offered (it would then be v = b), the probability that the sale is actually completed is 50%

– In that case the government flips a fair coin
– Is this threat credible?

ˆ Since v̂ > b, the type t = b buyer prefers to run the risk of not completing the purchase, making the lower offer b

ˆ Is it worthwhile for the high-valuation buyer t = a to pay v̂?

ˆ If the high-valuation buyer t = a pays v̂, his surplus (payoff) is a − v̂ = 13

– This value is certain (riskless)

ˆ If the type t = a buyer offers the lower price (why offer more than b?), his expected surplus (being risk neutral) is

(1/2)(a − b) + (1/2) · 0 = (a − b)/2 = (30 − 10)/2 = 10

ˆ Thus it is a better deal for the high-valuation buyer t = a to pay the higher price set by the government and secure the acquisition of the firm

ˆ Is this scheme worthwhile for the government?

ˆ The government sells the firm to a high-valuation buyer with probability p

ˆ With a low-valuation buyer, there is a 50% chance the government sells, but also a 50% chance it cancels the sale

ˆ The government’s expected revenue (being risk neutral) is

P rob{t = a}v̂ + P rob{t = b}[(1/2)b + (1/2) · 0] = (1/2) × 17 + (1/4) × 10 + (1/4) × 0 = 11

ˆ The expected value 11 is greater than the 10 it would obtain otherwise
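The arithmetic of this example can be sketched in a few lines of Python (the numbers a = 30, b = 10, p = 0.5, v̂ = 17 and the 50% cancellation probability are those of the text).

```python
# Check the incentives and the expected revenue in the privatization example.
a, b, p, v_hat, theta = 30, 10, 0.5, 17, 0.5

# high type: sure surplus a - v_hat vs. risky surplus theta*(a - b)
assert a - v_hat == 13
assert theta * (a - b) == 10
assert a - v_hat > theta * (a - b)       # so type a pays v_hat

# low type: paying v_hat would give b - v_hat < 0, so he bids b and risks it
assert b - v_hat < 0

# government: 0.5*17 + 0.5*(0.5*10 + 0.5*0) = 11 > 10
revenue = p * v_hat + (1 - p) * theta * b
assert revenue == 11.0
print("expected revenue:", revenue)
```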

127
ˆ This mechanism is better for the government than the single-price mechanism for both types

Example: generalizing the mechanism

ˆ Let α > β > 0 and θ ∈ [0, 1] be the values characterizing the following mechanism

ˆ If the buyer pays α, the sale is guaranteed

ˆ If he pays β, there is a probability (1 − θ) that the government cancels the privatization

ˆ The high-valuation buyer (t = a) prefers to buy the firm paying α rather than run the risk of offering the low value, provided

a − α ≥ θ(a − β), i.e., a ≥ (α − θβ)/(1 − θ)

ˆ The low-valuation buyer (t = b) prefers to run the risk of offering the low value β rather than pay the high value, provided

θ(b − β) ≥ b − α, i.e., b ≤ (α − θβ)/(1 − θ)

ˆ The inequality

b ≤ (α − θβ)/(1 − θ) ≤ a

is called the incentive compatibility constraint

ˆ Thanks to it, each buyer type prefers to select the payment best suited to his own type

ˆ Setting α too high may push the high-valuation buyer into taking the risk of offering the lower value

ˆ The same problem arises if the probability (1 − θ) of cancelling the sale is too low

ˆ We must add the constraint that no buyer type can be coerced into acquiring the firm (i.e., into participating in the mechanism)

ˆ Each buyer type’s expected profit must be at least his opportunity cost (of not participating in the mechanism), i.e.,

a ≥ α and b ≥ β

ˆ This constraint is called the individual rationality constraint

ˆ Given the constraint

b ≤ (α − θβ)/(1 − θ) ≤ a

ˆ the government’s expected revenue is

pα + (1 − p)θβ

ˆ The problem is to find the values (α, β, θ) that maximize this expected revenue subject to the constraints of
128
1. incentive compatibility
b ≤ (α − θβ)/(1 − θ) ≤ a (IC)
2. individual rationality
a ≥ α and b ≥ β (IR)

Proposição 14. If (α, β, θ) is optimal then we must have

a = (α − θβ)/(1 − θ) and b = β

ˆ We then have

α = θb + (1 − θ)a and β = b

ˆ The government’s expected revenue is

pα + (1 − p)θβ = pa(1 − θ) + θb, i.e., pa + θ(b − pa)

ˆ There are two possible cases

1. If b < pa the government should set θ = 0, i.e., not sell the firm if the price offered is below α = a
2. If b > pa the government should set θ = 1, i.e., sell with certainty, as long as it obtains the minimum value b for the firm
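The two cases can be illustrated numerically. In the sketch below (the parameter pairs are my own choices), the expected revenue of the IC-binding mechanism is maximized over a grid of θ values: the optimum is θ = 0 when b < pa and θ = 1 when b > pa.

```python
# Expected revenue of the IC-binding mechanism: alpha = theta*b + (1-theta)*a,
# beta = b, so revenue = p*alpha + (1-p)*theta*b = p*a + theta*(b - p*a).
def revenue(a, b, p, theta):
    alpha = theta * b + (1 - theta) * a
    return p * alpha + (1 - p) * theta * b

grid = [k / 100 for k in range(101)]

# case b < p*a (here 10 < 15): revenue decreases in theta, optimum theta = 0
best = max(grid, key=lambda th: revenue(30, 10, 0.5, th))
assert best == 0.0 and revenue(30, 10, 0.5, 0.0) == 15.0

# case b > p*a (here 20 > 15): revenue increases in theta, optimum theta = 1
best = max(grid, key=lambda th: revenue(30, 20, 0.5, th))
assert best == 1.0 and revenue(30, 20, 0.5, 1.0) == 20.0
print("theta* = 0 when b < p*a; theta* = 1 when b > p*a")
```

Note that with a = 30, b = 10 and p = 0.5 the optimal mechanism yields pa = 15, which exceeds the 11 obtained in the earlier example, where the price v̂ = 17 left the high type’s incentive constraint slack.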

Revelation principle

ˆ When the mechanism guarantees the purchase only at the higher price (θ > 0), it leads the players to reveal their true characteristics (types) indirectly, through their decisions

ˆ The same outcome can be obtained via a mechanism under which the players are induced to announce their true characteristics

ˆ Instead of offering the firm at a high but certain price, or at a low but uncertain one

ˆ The government could simply have asked the buyer what his true type was

ˆ Announcing that, if the reported type were t = a, the firm would be offered to him with certainty, but at the higher value

ˆ And that, if the reported type were t = b, the firm would be offered at the lower value, but the sale would have only a 50% chance of actually being completed
129
ˆ O resultado desse mecanismo chamado de direto seria o mesmo, apesar da forma do jogo ser diferente

ˆ Os jogadores anunciariam seu verdadeiro tipo ao governo que em seguida atribuiria as recompensas ade-
quadas

ˆ Eles não teriam qualquer motivo para mentira

ˆ Esse resultado é chamado de princı́pio da revelação

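The direct mechanism just described can be written out and checked for truth-telling. A sketch with hypothetical values a = 10, b = 4, θ = 0.5 (the function name `payoff` and the numbers are assumptions of ours): announcing a buys the firm with certainty at α = θb + (1 − θ)a, announcing b buys it with probability θ at β = b.

```python
def payoff(t, announce, a, b, theta):
    """Expected surplus of a buyer with true valuation t in the direct
    mechanism: announce 'a' -> buy with certainty at alpha = theta*b + (1-theta)*a;
    announce 'b' -> buy with probability theta at beta = b."""
    alpha = theta * b + (1 - theta) * a
    if announce == 'a':
        return t - alpha
    return theta * (t - b)

a, b, theta = 10, 4, 0.5  # illustrative values
# Truth-telling is (weakly) optimal for both types:
print(payoff(a, 'a', a, b, theta) >= payoff(a, 'b', a, b, theta))  # True
print(payoff(b, 'b', a, b, theta) >= payoff(b, 'a', a, b, theta))  # True
```

Note that the high type is exactly indifferent: the (IC) constraint binds at the optimum, which is why the indirect and direct mechanisms yield the same payoffs.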
Fix

ˆ a set of players I

ˆ a family of types (Ti )i∈I

ˆ a family of priors (pi )i∈I with

pi = (pi(·|ti))ti∈Ti and pi(·|ti) ∈ Prob(T−i)

Revelation principle: mechanisms

Definition 38. A mechanism is a family


(Ai , ui )i∈I

where

ˆ Ai is a set of available actions for player i

ˆ ui : A × T → [−∞, ∞) where
ui (ai , a−i ; t)

is the payoff received by player i if he chooses ai , given that the other players choose a−i and players
type are t = (ti , t−i )

A strategy for this mechanism is a function si : Ti → Ai

Revelation principle: direct mechanisms

Definition 39. A direct mechanism is a mechanism (Bi, vi)i∈I where

∀i ∈ I, Bi = Ti

ˆ A strategy for a direct mechanism is a function τi : Ti → Ti

ˆ Each agent is asked to announce his type

130
Definition 40. A direct mechanism (vi)i∈I is said to be incentive compatible (or truth telling) if telling the
truth is a Bayesian Nash equilibrium, i.e., if the strategies

(Idi)i∈I where Idi(ti) = ti

form a BNE of the Bayesian game defined by the direct mechanism.

Theorem 16 (Revelation principle). Every payoff profile (πi∗)i∈I obtained in a BNE of any mechanism
(Ai, ui)i∈I can be obtained through an incentive compatible direct mechanism, i.e., there exists a direct
mechanism (vi)i∈I which is incentive compatible and for which (πi∗)i∈I is the payoff profile of its truth-telling
BNE:

πi∗(ti) = E^{pi}[vi(ti, Id−i; (ti, ·))] = ∫_{T−i} vi(ti, t−i; (ti, t−i)) pi(t−i|ti)

Proof. Define

vi(τi, τ−i; t) ≡ ui(s∗i(τi), s∗−i(τ−i); t)

where (s∗i)i∈I is the BNE of the mechanism (Ai, ui)i∈I leading to the payoff profile (πi∗)i∈I

131
Cap. 4 - Dynamic games of incomplete information
4.1 Introduction to Perfect Bayesian equilibrium
ˆ Consider the following dynamic game of complete but imperfect information

ˆ First, player 1 chooses among three actions: L, M, and R

ˆ If player 1 chooses R then the game ends without a move by player 2

ˆ If player 1 chooses either L or M then player 2 learns that R was not chosen

ˆ But he does not know which of L or M was chosen

ˆ Player 2 then chooses between two actions, L′ and R′ , after which the game ends

ˆ Payoffs are given in the extensive form in the previous figure

ˆ The normal-form representation of this game is

Player 2
L’ R’
L 2, 1 0, 0
Player 1 M 0, 2 0, 1
R 1, 3 1, 3

ˆ There are two pure-strategy Nash equilibria: (L, L′ ) and (R, R′ )

ˆ To determine whether these Nash equilibria are subgame perfect, we should define the game’s subgames

ˆ The game in consideration has no subgames, since a subgame must begin at a decision node that is a
singleton information set, and player 2's only decision node belongs to a non-singleton information set

ˆ Both (L, L′ ) and (R, R′ ) are SPNE

ˆ (R, R′ ) depends on a non-credible threat:

– If player 2 gets the move, then playing L′ dominates playing R′

132
– So player 1 should not be induced to play R by 2’s threat to play R′ if given the move

One way to strengthen the equilibrium concept so as to rule out the SPNE (R, R′ ) is to impose the following
requirements

Requirement 17 (1). At each information set, the player with the move must have a belief about which node in
the information set has been reached by the play of the game.

ˆ For a non-singleton information set, a belief is a probability over the nodes in the information set

ˆ For a singleton information set, the player’s belief puts probability one on the single decision node

Requirement 18 (2). Given their beliefs, the players' strategies must be sequentially rational in the sense that
at each information set the action taken by the player with the move (and the player's subsequent strategy) must
be optimal given the player's belief at that information set and the other players' subsequent strategies

ˆ A “subsequent strategy” is a complete plan of action covering every contingency that might arise after
the given information set has been reached

ˆ Requirement 1 implies that if the play of the game reaches player 2’s non-singleton information set then
player 2 must have a belief about which node has been reached

– Or equivalently, about whether player 1 has played L or M

ˆ This belief is represented by the probabilities p and 1 − p

ˆ Given player 2’s belief, the expected payoff

– from playing R′ is p · 0 + (1 − p) · 1 = 1 − p
– from playing L′ is p · 1 + (1 − p) · 2 = 2 − p

ˆ Since 2 − p > 1 − p for any value of p, Requirement 2 prevents player 2 from choosing R′

ˆ Requiring that each player have a belief and act optimally given this belief suffices to eliminate the
implausible equilibrium (R, R′ )

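The computation above holds for every belief p, which is the whole point of the dominance argument. A small sketch using the payoffs stated in the text:

```python
def expected_payoff(action, p):
    """Player 2's expected payoff at the non-singleton information set,
    given belief (p, 1 - p) over the nodes reached after L and M."""
    if action == "R'":
        return p * 0 + (1 - p) * 1   # = 1 - p
    return p * 1 + (1 - p) * 2       # L': = 2 - p

# L' strictly dominates R' for every belief p in [0, 1]:
print(all(expected_payoff("L'", k / 100) > expected_payoff("R'", k / 100)
          for k in range(101)))  # True
```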
ˆ Requirements 1 and 2 insist that the players have beliefs and act optimally given these beliefs

133
ˆ But these beliefs may not be reasonable

ˆ In order to impose further requirements on the players’ beliefs, we introduce the distinction between
information sets that are on the equilibrium path and those that are off the equilibrium path

Definition. For a given equilibrium in a given extensive-form game, an information set is

ˆ on the equilibrium path if it will be reached with positive probability if the game is played according to
the equilibrium strategies

ˆ off the equilibrium path if it is certain not to be reached if the game is played according to the equilibrium
strategies

ˆ “Equilibrium” can mean Nash, subgame perfect, Bayesian, or perfect Bayesian equilibrium

Requirement 19 (3). At information sets on the equilibrium path, beliefs are determined by Bayes' rule and the
players' equilibrium strategies

ˆ Consider the subgame perfect Nash equilibrium (L, L′ )

ˆ Player 2’s belief must be p = 1

ˆ Indeed, given player 1’s equilibrium strategy (namely L), player 2 knows which node in the information
set has been reached

ˆ To illustrate Requirement 3, suppose that there were a mixed-strategy equilibrium in which player 1 plays
L with probability q1 , M with probability q2 , and R with probability 1 − q1 − q2

ˆ Requirement 3 would force player 2's belief to be

p = q1/(q1 + q2)

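This belief is just Bayes' rule, conditioning on the information set being reached. A minimal sketch (the function name is ours):

```python
def belief_p(q1, q2):
    """Player 2's belief that the L node was reached, given that player 1
    mixes L, M, R with probabilities q1, q2, 1 - q1 - q2 and the
    information set has been reached (so q1 + q2 > 0)."""
    return q1 / (q1 + q2)

print(belief_p(0.375, 0.125))  # 0.75
print(belief_p(0.25, 0.25))    # 0.5
```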
ˆ Requirements 1 through 3 capture the spirit of a perfect Bayesian equilibrium

ˆ The crucial new feature of this equilibrium concept is due to Kreps and Wilson (Econometrica 1982)

134
ˆ An equilibrium no longer consists of just a strategy for each player but now also includes a belief for each
player at each information set at which the player has the move

ˆ Requirement 3 imposes that players hold reasonable beliefs on the equilibrium path

ˆ We will introduce Requirement 4 which imposes that agents’ beliefs are reasonable off the equilibrium
path

Requirement 20 (4). At information sets off the equilibrium path, beliefs are determined by Bayes' rule and the
players' equilibrium strategies where possible.

ˆ We will provide a more precise statement of “where possible” in each of the economic applications analyzed
subsequently

Definition 41. A perfect Bayesian equilibrium consists of strategies and beliefs satisfying Requirements
1 through 4.

ˆ To illustrate and motivate Requirement 4 we consider the following three-player game

ˆ This game has one subgame beginning at player 2’s singleton information set

ˆ The unique NE in this subgame between players 2 and 3 is (L, R′ )

ˆ The unique SPNE of the entire game is (D, L, R′ )

ˆ These strategies and the belief p = 1 for player 3 satisfy Requirements 1 through 3

ˆ They also trivially satisfy Requirement 4, since there is no information set off this equilibrium path, and
so constitute a PBE

135
ˆ Consider the strategies (A, L, L′ ), together with the belief p = 0

ˆ These strategies are a NE: no player wants to deviate unilaterally

ˆ These strategies and belief also satisfy Requirements 1 through 3

– Player 3 has a belief and acts optimally given it, and players 1 and 2 act optimally given the
subsequent strategies of the other players

ˆ This NE, namely (A, L, L′ ), is not subgame perfect

ˆ Because the unique NE of the game’s only subgame is (L, R′ )

ˆ Thus, Requirements 1 through 3 do not guarantee that the players' strategies form a SPNE

ˆ The problem is that player 3’s belief (p = 0) is inconsistent with player 2’s strategy, L

– but Requirements 1 through 3 impose no restrictions on 3’s belief because 3’s information set is not
reached if the game is played according to the specified strategies

ˆ Requirement 4, however, forces player 3’s belief to be determined by player 2’s strategy:

– if 2’s strategy is L then 3’s belief must be p = 1


– if 2’s strategy is R then 3’s belief must be p = 0

ˆ But, if 3’s belief is p = 1 then Requirement 2 forces 3’s strategy to be R′

ˆ So the strategies (A, L, L′ ) and the belief p = 0 do not satisfy Requirements 1 through 4

ˆ Consider the following modification of the previous game

136
ˆ Player 2 now has a third possible action, A′ , which ends the game

ˆ If player 1’s equilibrium strategy is A then player 3’s information set is off the equilibrium path

ˆ But now Requirement 4 may not determine 3’s belief from 2’s strategy

ˆ If 2’s strategy is A′ then Requirement 4 puts no restrictions on 3’s belief

ˆ But if 2’s strategy is to play L with probability q1 , R with probability q2 , and A′ with probability 1−q1 −q2 ,
where q1 + q2 > 0, then Requirement 4 dictates that 3’s belief be

q1
p=
q1 + q2

Concluding remarks

ˆ In a NE no player chooses a strictly dominated strategy

ˆ In a PBE, Requirements 1 and 2 are equivalent to insisting that no player’s strategy be strictly dominated
beginning at any information set

ˆ Nash and Bayesian Nash equilibrium do not share this feature at information sets off the equilibrium path

ˆ Even SPNE does not share this feature at some information sets off the equilibrium path, such as
information sets that are not contained in any subgame

ˆ In a PBE, players cannot threaten to play strategies that are strictly dominated beginning at any
information set off the equilibrium path

ˆ PBE makes the players' beliefs explicit

ˆ Such an equilibrium often cannot be constructed by working backwards through the game tree, as we did
to construct a SPNE

ˆ Requirement 2 determines a player’s action at a given information set based in part on the player’s belief
at that information set

137
ˆ If either Requirement 3 or 4 applies at this information set, then it determines the player's belief from the
players' actions higher up the game tree

ˆ But Requirement 2 determines these actions higher up the game tree based in part on the players'
subsequent strategies, including the action at the original information set

ˆ This circularity implies that a single pass working backwards through the tree will not suffice to compute
a PBE

4.2 Signaling Games


Perfect Bayesian equilibrium in signaling games

A signaling game is a dynamic game of incomplete information involving two players:

ˆ A Sender (S)

ˆ A Receiver (R)

The timing of the game is

1. Nature draws a type ti for the Sender from a finite set of feasible types T = {t1 , · · · , tI } according to a
probability distribution p ∈ Prob(T) with full support, i.e., p(ti) > 0 for every ti

2. The Sender observes ti and then chooses a message mj from a finite set of feasible messages M =
{m1 , · · · , mJ }

3. The Receiver observes mj but not ti and then chooses an action ak from a finite set of actions A =
{a1 , · · · , aK }

4. Payoffs are given by US (ti , mj , ak ) and UR (ti , mj , ak )

ˆ In many applications, the sets T , M and A are intervals on the real line, rather than finite sets

ˆ One may allow the set of feasible messages to depend on the type Nature draws

ˆ One may allow the set of feasible actions to depend on the message the Sender chooses

Job-market signaling

ˆ In Spence’s (QJE 1973) model of job-market signaling

– the Sender is the worker


– the Receiver is the market of prospective employers
– the type is the worker’s productive ability
– the message is the worker’s education choice
– the action is the wage paid by the market

138
Corporate investment and capital structure

ˆ In Myers and Majluf’s (JFE 1984) model of corporate investment and capital structure

– the Sender is a firm needing capital to finance a new project


– the Receiver is a potential investor
– the type is the profitability of the firm’s existing assets
– the message is the firm’s offer of an equity stake in return for financing
– the action is the investor’s decision about whether to invest

Monetary policy

ˆ A signaling game may be embedded within a richer game

– there could be an action by the Receiver before the Sender chooses the message in step 2
– there could be an action by the Sender after (or while) the Receiver chooses the action in step 3

ˆ Consider the following game:

In Vickers' (1986) model of monetary policy

ˆ the Federal Reserve has private information about its willingness to accept inflation in order to increase
employment

ˆ the Sender is the Federal Reserve

ˆ the Receiver is the market of employers

ˆ the type is the Fed’s willingness to accept inflation in order to increase employment

ˆ the message is the Fed’s choice of first-period inflation

ˆ the action is the employers’ expectation of second-period inflation

ˆ the employers’ expectation of first-period inflation precedes the signaling game

ˆ the Fed’s choice of second-period inflation follows it

PBE definition in signaling games

ˆ We consider an extensive form representation of a simple case: T = {t1 , t2 }, M = {m1 , m2 }, A = {a1 , a2 }


and Prob{t1} = p

139
ˆ A player’s strategy is a complete plan of action:

– a strategy specifies a feasible action in every contingency in which the player might be called upon
to act

ˆ In a signaling game:

– a pure strategy for the Sender is a function ti 7→ m(ti ) specifying which message will be chosen for
each type that Nature might draw
– a pure strategy for the Receiver is a function mj 7→ a(mj ) specifying which action will be chosen for
each message that the Sender might send

ˆ In the simple signaling game depicted before, the Sender and the Receiver both have four pure strategies

ˆ The Sender’s strategy m is said to be

– a pooling strategy if each type sends the same message


* i.e., if m is constant
– a separating strategy if each type sends a different message
* i..e, m is injective
– a partially pooling (or semi-separating) if it is neither pooling nor separating

ˆ We translate the informal statements of Requirements 1 through 3 into a formal definition of a PBE in a
signaling game

ˆ Requirement 1 is trivial when applied to the Sender since his choice occurs at a singleton information set

ˆ The Receiver, in contrast, chooses an action after observing the Sender’s message but without knowing
the Sender’s type

– There is one information set for each message the Sender might choose
– Each such information set has one node for each type Nature might have drawn

140
Requirement 21 (1). After observing any message mj from M , the Receiver must have a belief about which
types could have sent mj

ˆ Denote this belief by the probability distribution µ(·|mj) ∈ Prob(T)

Requirement 22 (2R). For each mj in M , the Receiver’s action a∗ (mj ) must maximize the Receiver’s expected
utility, given the belief µ(·|mj )

ˆ That is, a∗(mj) solves

max_{a∈A} Σ_{t∈T} µ(t|mj) UR(t, mj, a)

ˆ Requirement 2 also applies to the Sender, but the Sender has complete information

Requirement 23 (2S). For each ti in T , the Sender’s message m∗ (ti ) must maximize the Sender’s utility,
given the Receiver’s strategy a∗ (mj )

– That is, m∗(ti) solves

max_{m∈M} US(ti, m, a∗(m))

ˆ Given the Sender’s strategy ti 7→ m∗ (ti ), let Tj denote the set of types that send the message mj

Tj ≡ {ti ∈ T : m∗ (ti ) = mj }

or equivalently6
Tj = [m∗ ]−1 (mj )

– The type ti is a member of the set Tj if m∗(ti) = mj

ˆ Given a message mj ,

– if Tj is non-empty then the information set corresponding to the message mj is on the equilibrium
path
– otherwise, mj is not sent (at equilibrium) by any type and so the corresponding information set is
off the equilibrium path

For messages on the equilibrium path, one should apply Requirement 3 to the Receiver’s strategy

Requirement 24 (3). For each mj ∈ M , if there exists ti ∈ T such that m∗ (ti ) = mj , then the Receiver’s
belief at the information set corresponding to mj must follow from Bayes’ rule and the Sender’s strategy:

µ(ti|mj) = p(ti|[m∗]−1(mj)) = p(ti) / Σ_{τi∈Tj} p(τi) for every ti ∈ Tj (and µ(ti|mj) = 0 for ti ∉ Tj)

6. Rigorously, we should write [m∗]−1({mj}).

141
Definition 42. A pure-strategy perfect Bayesian equilibrium in a signaling game is

ˆ a pair of strategies (m∗ , a∗ ) where

– m∗ : ti 7→ m∗ (ti )
– a∗ : mj 7→ a∗ (mj )

ˆ a family of beliefs (µ(·|mj))mj∈M with each µ(·|mj) ∈ Prob(T)

satisfying Signaling Requirements (1), (2R), (2S), and (3)

ˆ Requirement 4 is vacuous in a signaling game

ˆ If the Sender’s strategy is pooling or separating then we call the equilibrium pooling or separating,
respectively

A simple signaling game


Consider the following example of a simple signaling game

ˆ Each type is equally likely to be drawn by Nature

ˆ The Receiver belief µ(·|L) at information set L is denoted (p, 1 − p)

ˆ The Receiver belief µ(·|R) at information set R is denoted (q, 1 − q)

There are four possible pure-strategy perfect Bayesian equilibria in this two-type, two-message game

ˆ Pooling on L

ˆ Pooling on R

ˆ Separating with t1 playing L and t2 playing R

ˆ Separating with t2 playing L and t1 playing R

142
A simple signaling game: pooling on L

ˆ Suppose there is an equilibrium (m∗, a∗, µ) in which the Sender's strategy is

m∗(t1) = m∗(t2) = L

ˆ Then the Receiver’s information set corresponding to L is on the equilibrium path

ˆ So the Receiver’s belief (p, 1 − p) at this information set is determined by Bayes’ rule and the Sender’s
strategy

ˆ This implies that

µ(t1|L) ≡ p = 0.5/(0.5 + 0.5) = 0.5
ˆ Given this belief µ, the Receiver’s best response following L is to play u

ˆ The Sender’s type t1 earns payoff of 1 and the Sender’s type t2 earns payoff of 2

ˆ To determine whether both “Sender types” are willing to choose L, we need to specify how the Receiver
would react to R

ˆ If the Receiver’s response to R is u, i.e., a∗ (R) = u then type t1 ’s payoff from playing R is 2, which exceeds
t1 ’s payoff of 1 from playing L

ˆ But if the Receiver’s response to R is d, i.e., a∗ (R) = d then t1 and t2 earn payoffs of 0 and 1 from playing
R, whereas they earn 1 and 2 from playing L

ˆ To get the pooling equilibrium on L, the Receiver’s response to R must be d, i.e., a∗ (R) = d

ˆ One has to check that a∗(R) = d is an optimal action with respect to the Receiver's belief at the
information set corresponding to R

ˆ Observe that
Eµ(·|R) [UR (·, R, d)] = q × 0 + (1 − q) × 2 = 2(1 − q)

ˆ and
Eµ(·|R) [UR (·, R, u)] = q × 1 + (1 − q) × 0 = q

ˆ Playing d is optimal for the Receiver for any q ≤ 2/3

Remark 26. The pair of strategies (m∗, a∗) defined by

m∗(t) = L, ∀t ∈ {t1, t2}

a∗(m) = u if m = L, and a∗(m) = d if m = R

and the beliefs m 7→ µ(·|m) defined by

µ(·|m) = (0.5, 0.5) if m = L, and µ(·|m) = (q, 1 − q) if m = R

form a pure-strategy PBE if q ≤ 2/3.

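The cutoff q ≤ 2/3 follows from comparing the two expected payoffs just computed; a small numerical check using only the Receiver payoffs stated in the text:

```python
def eu_after_R(action, q):
    """Receiver's expected payoff after message R with belief (q, 1 - q):
    E[U_R(., R, d)] = 2(1 - q) and E[U_R(., R, u)] = q."""
    return 2 * (1 - q) if action == 'd' else q

# d is optimal exactly when 2(1 - q) >= q, i.e. q <= 2/3:
print(eu_after_R('d', 0.5) >= eu_after_R('u', 0.5))    # True
print(eu_after_R('d', 0.75) >= eu_after_R('u', 0.75))  # False
```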
A simple signaling game: pooling on R

ˆ Suppose the Sender’s strategy is m∗ (t) = R for any t in T

ˆ Then q = 0.5 and the Receiver’s best response is a∗ (R) = d

ˆ Thus the contingent payoffs for the Sender are

US (t1 , R, d) = 0 and US (t2 , R, d) = 1

ˆ But t1 can earn 1 by playing L, since the Receiver’s best response to L is u for any value of p

Remark 27. There is no equilibrium in which the Sender plays m∗(t) = R for every t in T.

A simple signaling game: Separating with m∗ (t1 ) = L

ˆ Suppose the Sender’s strategy m∗ is defined by


(
∗ L if t = t1
m (t) =
R if t = t2

ˆ Both of the Receiver’s information sets are on the equilibrium path

ˆ So both beliefs are determined by Bayes’ rule and the Sender’s strategy

p=1 and q=0

ˆ The Receiver’s best responses to these beliefs are


(
u if m = L
a∗ (m) =
d if m = R

ˆ It remains to check whether the Sender's strategy is optimal given the Receiver's strategy a∗

ˆ It is not:

– if type t2 deviates by playing L rather than R,
– then the Receiver responds with u,
– earning t2 a payoff of 2,
– which exceeds t2 's payoff of 1 from playing R

A simple signaling game: Separating with m∗ (t1 ) = R

ˆ Suppose the Sender’s strategy m is defined by


(
∗ R if t = t1
m (t) =
L if t = t2

ˆ Both of the Receiver’s information sets are on the equilibrium path

ˆ So both beliefs are determined by Bayes’ rule and the Sender’s strategy

p=0 and q=1

ˆ The Receiver’s best response to these beliefs is

a∗ (m) = u, ∀m ∈ {L, R}

ˆ Both types t1 and t2 earn payoffs of 2

ˆ If t1 were to deviate by playing L, then the Receiver would react with u

ˆ t1 ’s payoff would then be 1, so there is no incentive for t1 to deviate from playing R

ˆ If t2 were to deviate by playing R, then the Receiver would react with u

ˆ t2 ’s payoff would then be 1, so there is no incentive for t2 to deviate from playing L

Remark 28. The pair of strategies (m∗, a∗) defined by

m∗(t1) = R and m∗(t2) = L

a∗(m) = u, ∀m ∈ {L, R}

and the beliefs m 7→ µ(·|m) defined by

µ(·|m) = (0, 1) if m = L, and µ(·|m) = (1, 0) if m = R

form a separating pure-strategy perfect Bayesian equilibrium

145
Job market signaling

ˆ We restate Spence’s (QJE 1973) model as an extensive-form game and describe some of its perfect Bayesian
equilibria

ˆ The timing is as follows

1. Nature determines a worker’s productive ability, η, which can be either high H or low L. The
probability that η = H is q
2. The worker learns his or her ability and then chooses a level of education, e ≥ 0
3. Two firms observe the worker’s education but not the worker’s ability, and then simultaneously make
wage offers to the worker
4. The worker accepts the higher of the two wage offers, flipping a coin in case of a tie

Payoffs

ˆ Let w denote the wage the worker accepts

ˆ The payoff to the worker is


w − c(η, e)

where c(η, e) is the cost to a worker with ability η of obtaining education e

ˆ The payoff to the firm that employs the worker is

y(η, e) − w

where y(η, e) is the output of a worker with ability η who has obtained education e

ˆ The payoff to the firm that does not employ the worker is zero

Assumption on production

ˆ We allow for the possibility that output increases not only with ability but also with education

ˆ We assume that high-ability workers are more productive, i.e.,

∀e, y(H, e) > y(L, e)

ˆ We assume that education does not reduce productivity, i.e.,

∀(η, e), ye (η, e) ≥ 0

where ye (η, e) = ∂y/∂e (η, e) is the marginal productivity of education for a worker of ability η at education e

146
Interpretation of education

ˆ We interpret differences in e as differences in the quality of a student’s performance

ˆ Not as differences in the duration of the student’s schooling

ˆ Thus, the game could apply to a cohort of high school graduates, or to a cohort of college graduates or
MBAs

ˆ Under this interpretation, e measures the number and kind of courses taken and the caliber of grades and
distinctions earned during an academic program of fixed length

ˆ Tuition costs (if they exist at all) are independent of e, so the cost function c(η, e) measures non-monetary
(or psychic) costs

ˆ Students of lower ability find it more difficult to achieve high grades at a given school, and also more
difficult to achieve the same grades at a more competitive school

ˆ Firm’s use of education as a signal thus reflects the fact that firms hire and pay more to the best graduates
of a given school and to the graduates of the best schools

Assumption on costs

ˆ The crucial assumption in Spence’s model is that low-ability workers find signaling more costly than do
high-ability workers

ˆ More precisely, we assume that the marginal cost of education is higher for low-ability than for high-ability
workers:

∀e, ce (L, e) > ce (H, e)

where ce (η, e) = ∂c/∂e (η, e) denotes the marginal cost of education for a worker of ability η at education e

Assumption on costs: Interpretation

ˆ Consider a worker believing that with education e1 he would get paid wage w1

ˆ We investigate the increase in wages that would be necessary to compensate this worker for an increase
in education from e1 to e2

ˆ The answer depends on the worker’s ability:

– Low-ability workers find it more difficult to acquire the extra education and so require a larger
increase in wages to compensate them for it:

∆w = w2 − w1 = ∫_{e1}^{e2} ∂c/∂e (η, e) de

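This integral can be made concrete with an explicit cost function. A sketch under the illustrative assumption c(η, e) = e²/(2η) with abilities L = 1 and H = 2 (our parameterization, not from the text), which satisfies ce(L, e) > ce(H, e) for all e > 0:

```python
def delta_w(eta, e1, e2):
    """Compensating wage increase for raising education from e1 to e2:
    the integral of c_e(eta, e) = e/eta from e1 to e2, which equals
    (e2**2 - e1**2) / (2 * eta)."""
    return (e2 ** 2 - e1 ** 2) / (2 * eta)

# The low-ability worker (eta = 1) requires the larger compensation:
print(delta_w(1, 1, 2))  # 1.5
print(delta_w(2, 1, 2))  # 0.75
```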
147
ˆ The graphical statement of this assumption is that low-ability workers have steeper indifference curves
than do high-ability workers

ˆ IL is an indifference curve of a low-ability worker

ˆ IH is an indifference curve of a high-ability worker

Competition among firms

ˆ Spence also assumes that competition among firms will drive expected profits to zero

ˆ One can build this assumption into our model by replacing the two firms in stage 3 with a single player
called the market

ˆ The market makes a single wage offer w and has the payoff

−[y(η, e) − w]2

ˆ Doing so would make the model belong to the class of one-Receiver signaling games defined previously

ˆ To maximize its expected payoff, as required by Signaling Requirement 2R, the market would offer a wage
equal to the expected output of a worker with education e, given the market’s belief about the worker’s
ability after observing e

w̃(e) = µ(H|e) × y(H, e) + [1 − µ(H|e)] × y(L, e) (W)

ˆ µ(H|e) is the market’s assessment of the probability that the worker’s ability is H

ˆ The purpose of having two firms bidding against each other in Stage 3 is to achieve the same result without
resorting to a fictitious player called the market

Firms’ beliefs

ˆ To guarantee that firms will always offer a wage equal to the worker’s expected output

148
ˆ We need to impose that, after observing education choice e, both firms hold the same belief about the
worker’s ability, again denoted µ(H|e)

ˆ Signaling Requirement 3 determines the belief that both firms must hold after observing a choice of e that
is on the equilibrium path

ˆ The assumption is that the firms also share a common belief after observing a choice of e that is off the
equilibrium path

ˆ Given this assumption, it follows that in any PBE the firms both offer the wage w̃(e) given in (W )

ˆ Equation (W ) replaces Signaling Requirement 2R for this two-Receiver model

The complete information case

ˆ First, consider temporarily that the worker’s ability is common knowledge among all the players, rather
than privately known by the worker

ˆ Competition between the two firms in Stage 3 implies that a worker of ability η with education e earns
the wage
ŵ(η, e) = y(η, e)

ˆ A worker with ability η therefore chooses e∗ (η) to solve

max_{e≥0} y(η, e) − c(η, e)

ˆ The associated wage (when it exists) is denoted by w∗ (η), i.e.,

w∗ (η) = y[η, e∗ (η)]

ˆ Assume that e 7→ y(η, e) is concave and e 7→ c(η, e) is strictly convex

ˆ Assume that y(η, ·) and c(η, ·) are such that

lim_{e→0+} [∂y/∂e (η, e) − ∂c/∂e (η, e)] > 0

and

lim_{e→∞} [∂y/∂e (η, e) − ∂c/∂e (η, e)] < 0

ˆ Then the maximization problem has a unique solution e∗ (η) satisfying

∂y/∂e (η, e∗ (η)) = ∂c/∂e (η, e∗ (η))

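As a worked instance of this first-order condition, take the illustrative forms y(η, e) = ηe and c(η, e) = e²/(2η) (assumptions of ours, not from the text; linear y is concave, e²/(2η) is strictly convex, and the limit conditions hold since η − e/η is positive near 0 and negative for large e). The FOC η = e/η gives e∗(η) = η² and w∗(η) = η³:

```python
def e_star(eta):
    """Solution of y_e = c_e under y(eta, e) = eta*e, c(eta, e) = e**2/(2*eta):
    eta = e/eta  =>  e*(eta) = eta**2."""
    return eta ** 2

def w_star(eta):
    """Full-information wage w*(eta) = y(eta, e*(eta)) = eta**3."""
    return eta * e_star(eta)

L_ab, H_ab = 1, 2
print(e_star(L_ab), e_star(H_ab))  # 1 4  (e*(L) < e*(H))
print(w_star(L_ab), w_star(H_ab))  # 1 8  (w*(L) < w*(H))
```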
149
ˆ We propose to strengthen the assumption

∀e ≥ 0, y(H, e) > y(L, e)

that states that high-ability workers are more productive

ˆ Assume that

inf_{e≥0} ye (H, e) ≥ max_{e≥0} ye (L, e)

ˆ The previous assumption is automatically satisfied if e 7→ y(η, e) is linear

Proposition 15. Under the previous assumption, one must have

e∗ (L) < e∗ (H) and w∗ (L) < w∗ (H)

The private information case

ˆ We now return to the assumption that the worker’s ability is private information

ˆ A low-ability worker could try to masquerade as a high-ability worker

ˆ Two cases can arise

– The additional effort c[L, e∗ (H)] − c[L, e∗ (L)] needed to obtain the education level e∗ (H) is not
compensated by the additional wage w∗ (H) − w∗ (L)
– The additional effort c[L, e∗ (H)] − c[L, e∗ (L)] needed to obtain the education level e∗ (H) is
compensated by the additional wage w∗ (H) − w∗ (L)

150
ˆ In the first case, the low-ability worker has no incentive to pretend to be a high-ability worker by choosing e∗ (H), i.e.,

w∗ (L) − c[L, e∗ (L)] ≥ w∗ (H) − c[L, e∗ (H)]

ˆ In the second case, the low-ability worker has an incentive to pretend to be a high-ability worker by choosing e∗ (H), i.e.,

w∗ (L) − c[L, e∗ (L)] ≤ w∗ (H) − c[L, e∗ (H)]

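Which case arises depends on the primitives. Under the illustrative parameterization y(η, e) = ηe, c(η, e) = e²/(2η), L = 1, H = 2 (our assumptions, with e∗(η) = η² and w∗(η) = η³ from the full-information problem), the first case obtains:

```python
def cost(eta, e):
    """Illustrative cost c(eta, e) = e**2 / (2 * eta) (an assumption)."""
    return e ** 2 / (2 * eta)

# Full-information outcomes under y(eta, e) = eta*e: e*(eta) = eta**2, w*(eta) = eta**3
e_L, w_L = 1, 1
e_H, w_H = 4, 8

truthful = w_L - cost(1, e_L)   # low type reveals: 1 - 0.5 = 0.5
mimicry = w_H - cost(1, e_H)    # low type mimics:  8 - 8.0 = 0.0
print(truthful >= mimicry)      # True: the first (no-mimicry) case
```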
Perfect Bayesian equilibria

ˆ Each kind of equilibrium

– pooling
– separating
– hybrid

can exist in this model

ˆ In a pooling equilibrium both worker-types choose a single level of education, say ep

ˆ Requirement 3 then implies that the firms' belief after observing ep must be the prior belief

µ(H|ep ) = q and µ(L|ep ) = 1 − q

151
ˆ This in turn implies that the wage offered by the firms after observing ep must be

wp = q × y(H, ep ) + (1 − q) × y(L, ep )

ˆ To complete the description of a pooling PBE, it remains

1. to specify the firms' belief µ(·|e) for out-of-equilibrium education choices e ≠ ep (Requirement 1)
2. these beliefs will then determine the firms' strategy e 7→ w̃(e) through

w̃(e) = µ(H|e) × y(H, e) + [1 − µ(H|e)] × y(L, e) (W)

(Requirement 2R)
3. to show that both worker-types' best response to the firms' strategy w̃ is to choose e = ep (Requirement 2S)

Pooling equilibrium

ˆ One possibility is that the firms believe that any education level other than ep implies that the worker
has low ability:

µ(H|e) = 0, ∀e ≠ ep

ˆ Nothing in the definition of PBE rules these beliefs out

– Requirements 1 through 2 put no restrictions on beliefs off the equilibrium path


– Requirement 4 is vacuous in a signaling game

ˆ The refinement we will introduce in a subsequent chapter will rule out the beliefs analyzed here

ˆ If the firm’s beliefs are (


0 for e 6= ep
µ(H|e) =
q for e = ep

ˆ Then Equation (W) implies that the firms’ strategy is


(
y(L, e) for e 6= ep
w(e) =
wp for e = ep

where we recall that


wp = q × y(H, ep ) + (1 − q) × y(L, ep )

ˆ A worker of ability η chooses e to solve

max_{e≥0} w̃(e) − c(η, e)

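A pooling candidate can be verified numerically. Under the illustrative forms y(η, e) = ηe, c(η, e) = e²/(2η), L = 1, H = 2 and prior q = 0.5 (all assumptions of ours), a deviation from ep is paid y(L, e) = e; the grid search below confirms that ep = 1 is a best response for both worker types, so it supports a pooling PBE:

```python
def pooling_payoff(eta, e, ep, q):
    """Worker's payoff w(e) - c(eta, e) under the pooling wage schedule:
    w(ep) = q*y(H, ep) + (1 - q)*y(L, ep) on path, w(e) = y(L, e) off path,
    with y(eta, e) = eta*e, c(eta, e) = e**2/(2*eta), L = 1, H = 2."""
    wage = (q * 2 * ep + (1 - q) * 1 * ep) if e == ep else 1 * e
    return wage - e ** 2 / (2 * eta)

ep, q = 1.0, 0.5
grid = [k / 100 for k in range(401)]            # candidate deviations on [0, 4]
for eta in (1, 2):                              # low and high ability
    on_path = pooling_payoff(eta, ep, ep, q)
    best_dev = max(pooling_payoff(eta, e, ep, q) for e in grid if e != ep)
    print(on_path >= best_dev)                  # True for both types
```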
152
ˆ Consider the following example

ˆ The low-ability worker’s indifference curve through the point [e∗ (L), w∗ (L)] lies below that type’s indif-
ference curve through (ep , wp )

ˆ This implies that the education ep is optimal for the low-ability worker

ˆ The high-ability worker’s indifference curve through the point (ep , wp ) lies above the wage function w =
y(L, e)

– This implies that the education ep is optimal for the high-ability worker
– This is because the solution e∗H to the maximization problem

max y(L, e) − c(H, e)


e≥0

will lead to a wage w̃(e∗H ) = y(L, e∗H )

Other pooling equilibria

ˆ In the previous example, many other pooling perfect Bayesian equilibria exist

ˆ Some of these equilibria involve a different education choice by the worker

ˆ Others involve the same education choice but different beliefs off the equilibrium path

153
ˆ Let ê denote a level of education between ep and e′

ˆ If we substitute ep by ê then the resulting belief and strategy for the firms, together with the strategy
e(η) = ê for both worker types, form another pooling PBE

ˆ Suppose that the firms’ belief is defined by



 0
 for e ≤ e′′ except for e = ep
µ(H|e) = q for e = ep


q for e > e′′

ˆ The firms’ strategy is then

w(e) = { y(L, e)   for e ≤ e′′ except for e = ep
         wp        for e = ep
         wp        for e > e′′ }

ˆ This belief and strategy for the firms, together with the strategy (e(L) = ep , e(H) = ep ) for the worker, form
a third pooling PBE

Separating equilibrium: the no-envy case

ˆ We now turn to separating equilibria

ˆ Consider again the no-envy example

ˆ The natural separating PBE involves the strategy

e(L) = e∗ (L) and e(H) = e∗ (H)

for the worker

ˆ Signaling Requirement 3 then determines the firms’ belief after observing either of these two education
levels
µ[H|e∗ (L)] = 0 and µ[H|e∗ (H)] = 1

ˆ Equation (W ) implies that the firms’ strategy is

w̃(e∗ (L)) = w∗ (L) = y[L, e∗ (L)]

and
w̃(e∗ (H)) = w∗ (H) = y[H, e∗ (H)]

ˆ To complete the description of this separating PBE, it remains

1. to specify the firms’ belief µ(H|e) for out-of-equilibrium education choices, i.e., values of e other than
e∗ (L) and e∗ (H)
2. which then determines the rest of the firms’ strategy w̃ through Equation (W )
3. to show that the best response for a worker of ability η to the firms’ strategy w̃ is to choose e∗ (η)

ˆ Consider the belief that the worker has high ability if e is at least e∗(H) but has low ability otherwise

µ(H|e) = { 0   for e < e∗(H)
           1   for e ≥ e∗(H) }

ˆ Equation (W) then implies that the firms’ strategy is

w̃(e) = { y(L, e)   for e < e∗(H)
          y(H, e)   for e ≥ e∗(H) }

ˆ Recall that e∗(H) is the high-ability worker’s best response to the wage function e ↦ y(H, e)

ˆ Since y(L, e) ≤ y(H, e) we get that e∗ (H) is still a best response to the wage function w̃

ˆ Recall that e∗(L) is the low-ability worker’s best response to the wage function e ↦ y(L, e) over all e ≥ 0;
since e∗(L) < e∗(H), it is therefore also a best response on the interval [0, e∗(H))

ˆ We should now solve the following maximization problem

max_{e ≥ e∗(H)}  y(H, e) − c(L, e)

ˆ Denote by f the function from [e∗ (H), ∞) to R defined by

f (e) ≡ y(H, e) − c(L, e)

ˆ Observe that
f ′ (e) = ye (H, e) − ce (L, e) ≤ ye (H, e) − ce (H, e) ≤ 0

ˆ This implies that


w∗ (H) − c[L, e∗ (H)]

is the highest payoff the low-ability worker can achieve among all choices of e ≥ e∗ (H)

ˆ Since we are in the no-envy case, we have

w∗ (L) − c[L, e∗ (L)] > w∗ (H) − c[L, e∗ (H)]

ˆ Implying that e∗ (L) is the worker’s best response to the strategy w̃
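The no-envy condition can also be checked numerically. The primitives below are assumed for illustration (not from the text) and are chosen so that the low type’s education costs are steep enough for no-envy to hold: y(η, e) = η(1 + e) and c(η, e) = k(η)e² with k(L) = 5, k(H) = 0.5.

```python
# Assumed illustrative primitives (not from the text), chosen so that
# the no-envy condition holds
kL, kH = 5.0, 0.5                      # cost scales: c(eta, e) = k(eta) * e**2
L, H = 1.0, 2.0

def y(eta, e): return eta * (1.0 + e)

# Complete-information optima e*(eta) by grid search
grid = [i / 1000 for i in range(0, 5001)]
eL = max(grid, key=lambda e: y(L, e) - kL * e**2)   # e*(L)
eH = max(grid, key=lambda e: y(H, e) - kH * e**2)   # e*(H)

wL, wH = y(L, eL), y(H, eH)            # w*(L), w*(H)

# No-envy: the low type prefers (e*(L), w*(L)) to mimicking (e*(H), w*(H))
assert wL - kL * eL**2 > wH - kL * eH**2
print(eL, eH)  # 0.1 2.0
```

Here the low type’s payoff from mimicking e∗(H) is far below its complete-information payoff, so the natural separating strategies are best responses.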

Separating equilibrium: the envy case

ˆ We now consider the envy case, which is more interesting

ˆ Now the high-ability worker cannot earn the high wage y(H, ·) simply by choosing the education e∗ (H)
that he should choose under complete information

ˆ To signal his ability, the high-ability worker must choose es where es > e∗ (H) is defined by

y(H, es ) − c(L, es ) = y(L, e∗ (L)) − c(L, e∗ (L))

ˆ This is because the low-ability worker would mimic any value of e between e∗(H) and es , tricking the firms
into believing that the worker has high ability
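The definition of es can be illustrated numerically. The primitives below are assumptions for illustration only — y(η, e) = η(1 + e), c(η, e) = e²/η, L = 1, H = 2 — and happen to put us in the envy case:

```python
# Assumed illustrative primitives (not from the text)
def y(eta, e): return eta * (1.0 + e)
def c(eta, e): return e**2 / eta

L, H = 1.0, 2.0
eL, eH = 0.5, 2.0                      # e*(eta) = eta**2/2 from the FOCs
piL = y(L, eL) - c(L, eL)              # low type's complete-info payoff: 1.25

# Envy: mimicking e*(H) beats the low type's own complete-info outcome
assert y(H, eH) - c(L, eH) > piL

# e_s solves y(H, e) - c(L, e) = piL for e > e*(H); bisection on [eH, 10]
lo, hi = eH, 10.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if y(H, mid) - c(L, mid) > piL:
        lo = mid
    else:
        hi = mid
es = 0.5 * (lo + hi)
print(round(es, 4))  # 2.3229, i.e. 1 + sqrt(1.75); note es > e*(H) = 2
```

As the text requires, the signaling level es exceeds the complete-information choice e∗(H).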

ˆ Formally, the natural separating PBE involves the strategy

e(L) = e∗ (L) and e(H) = es

for the worker

ˆ The equilibrium beliefs for the firm must satisfy

µ[H|e∗ (L)] = 0 and µ[H|es ] = 1

ˆ The equilibrium wage strategy for the firms must satisfy

w̃(e∗ (L)) = w∗ (L) = y(L, e∗ (L)) and w̃(es ) = y(H, es )

ˆ Actually this is the only equilibrium that survives the refinement we will introduce in a subsequent chapter

ˆ We propose the following specification of the firms’ out-of-equilibrium beliefs that supports this equilibrium
behavior

µ(H|e) = { 0   for e < es
           1   for e ≥ es }

ˆ The firms’ strategy is then

w̃(e) = { y(L, e)   for e < es
          y(H, e)   for e ≥ es }

ˆ Let us compute the best response of the low-ability worker

ˆ We already know that e∗ (L) is a best response among all choices of e < es

ˆ One should find the worker’s best response to the firms’ strategy among all choices of e ≥ es , i.e.,

max_{e ≥ es}  y(H, e) − c(L, e)

ˆ Denote by g the function defined by g(e) = y(H, e) − c(L, e) for all e ≥ es

ˆ Observe that
g ′ (e) = ye (H, e) − ce (L, e) ≤ ye (H, e) − ce (H, e)

ˆ Recall that the function e 7→ y(H, e) − c(H, e) is concave and

ye (H, e∗ (H)) − ce (H, e∗ (H)) = 0

implying that g ′ (e) ≤ 0 for all e ≥ es

ˆ Therefore, the worker’s best response to the firms’ strategy among all choices of e ≥ es is es

ˆ Since
w∗ (L) − c(L, e∗ (L)) = y(H, es ) − c(L, es )

ˆ The worker has two best responses: e∗ (L) and es

ˆ We will assume that this indifference is resolved in favor of e∗ (L)

– Alternatively, we could increase es by an arbitrarily small amount so that the low-ability worker would
strictly prefer e∗(L)

ˆ Let us now analyze the best response of the high-ability worker

ˆ Denote by h the function defined by h(e) = y(H, e) − c(H, e) for all e ≥ es

ˆ Since h is concave, we have

∀e ≥ es ,  h′(e) = ye (H, e) − ce (H, e) ≤ ye (H, e∗(H)) − ce (H, e∗(H)) = 0

ˆ This implies that the worker’s best response to the firms’ strategy among all choices of e ≥ es is es

ˆ What about the worker’s best response among all choices of e < es ?

ˆ Let π ∗ (L) be the payoff of the low-ability worker at point (e∗ (L), w∗ (L))

ˆ Denote by W (L, ·) the function defined by

W (L, e) = π ∗ (L) + c(L, e)

ˆ This is the equation of the indifference curve IL of the low-ability worker passing through (e∗ (L), w∗ (L))

ˆ Denote by W (H, ·) the function defined by

W (H, e) = [y(H, es ) − c(H, es )] + c(H, e)

ˆ This is the equation of the indifference curve IH of the high-ability worker passing through (es , w̃(es ))

ˆ By definition of es we have
W (L, es ) = W (H, es )

ˆ Observe that
∂W (H, e)/∂e − ∂W (L, e)/∂e = ce (H, e) − ce (L, e) < 0
ˆ Implying that the function e 7→ W (H, e) − W (L, e) is strictly decreasing

ˆ We then get that


∀e < es , W (H, e) > W (L, e)

ˆ By definition of e∗ (L), convexity of e 7→ c(L, e) and concavity of e 7→ y(L, e) we have

∀e ≥ 0, W (L, e) ≥ y(L, e)

ˆ This implies that W (H, e) > y(L, e) for all e < es

ˆ It follows that the indifference curve of the high-ability worker passing through (es , w̃(es )) is always above
the production function y(L, e), implying that any payoff among e < es is inferior to the one obtained at
es

ˆ There are other separating equilibria that involve a different education choice by the high-ability worker

– the low-ability worker always separates at e∗(L)

ˆ There are other separating equilibria that involve the education choices e∗ (L) and es but differ off the
equilibrium path

Hybrid equilibrium

ˆ We analyze the case of a hybrid equilibrium in which the low-ability worker randomizes

ˆ The high-ability worker chooses the education level eh (h for hybrid)

ˆ The low-ability worker randomizes between choosing eh with probability π and choosing eL with probability 1 − π

ˆ Signaling Requirement 3 then determines the firms’ belief after observing eh and eL

ˆ Bayes’ rule yields


µ(H|eL ) = 0 and µ(H|eh ) = q/[q + (1 − q)π]

ˆ Since the high-ability worker always chooses eh but the low-ability worker does so only with probability π,
observing eh makes it more likely that the worker has high ability, so µ(H|eh ) > q

ˆ Second, as π approaches zero, the low-ability worker almost never pools with the high-ability worker so
µ(H|eh ) approaches 1

ˆ Third, as π approaches one, the low-ability worker almost always pools with the high-ability worker so
µ(H|eh ) approaches the prior belief q

ˆ When the low-ability worker separates from the high-ability worker by choosing eL , the belief µ(H|eL ) = 0
implies the wage w(eL ) = y(L, eL )

ˆ We claim that eL = e∗ (L)

ˆ Suppose the low-ability worker separates by choosing some eL ≠ e∗(L)

ˆ Such separation yields the payoff y(L, eL ) − c(L, eL )

ˆ But choosing e∗(L) would yield a payoff of at least y[L, e∗(L)] − c[L, e∗(L)]

– or more if the firms’ belief µ[H|e∗ (L)] is greater than 0

ˆ The definition of e∗ (L) implies

y[L, e∗(L)] − c[L, e∗(L)] > y(L, e) − c(L, e), ∀e ≠ e∗(L)

ˆ For the low-ability worker to be willing to randomize between separating at e∗ (L) and pooling at eh

ˆ The wage wh ≡ w̃(eh ) must make that worker indifferent between the two

w∗ (L) − c[L, e∗ (L)] = wh − c(L, eh ) (P)

ˆ Recall that Equation (W ) and the definition of the belief µ(·|eh ) imply

wh = [q/(q + (1 − q)π)] × y(H, eh ) + [(1 − q)π/(q + (1 − q)π)] × y(L, eh )

ˆ For a given value of eh , if Equation (P ) yields wh < y(H, eh ) then there is a unique possible value for π
consistent with a hybrid equilibrium in which the low-ability worker randomizes between e∗ (L) and eh

ˆ If wh > y(H, eh ), then there does not exist a hybrid equilibrium involving eh

ˆ Observe that Equation (P ) yields wh < y(H, eh ) if and only if eh < es where es is the education chosen
by the high-ability worker in the separating equilibrium

ˆ Given wh < y(H, eh ), the probability r solves

r × y(H, eh ) + (1 − r) × y(L, eh ) = wh

ˆ This probability is the firms’ equilibrium belief µ(H|eh ), so

π = q(1 − r)/[r(1 − q)]

ˆ As eh approaches es , the probability r approaches 1 so π approaches 0

ˆ The separating equilibrium described previously is the limit of the hybrid equilibria considered here
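This limit can be seen numerically, reusing the assumed primitives from the earlier illustrations (none of which come from the text): y(η, e) = η(1 + e), c(η, e) = e²/η, L = 1, H = 2, q = 1/2, with es = 1 + √1.75 from the separating case.

```python
# Assumed illustrative primitives (not from the text)
def y(eta, e): return eta * (1.0 + e)
def c(eta, e): return e**2 / eta

L, H, q = 1.0, 2.0, 0.5
piL = y(L, 0.5) - c(L, 0.5)            # w*(L) - c(L, e*(L)) = 1.25
es = 1 + 1.75 ** 0.5                   # separating level e_s (envy case)

def hybrid_pi(eh):
    """Low type's pooling probability pi at eh (valid when eh < e_s)."""
    wh = piL + c(L, eh)                               # indifference (P)
    r = (wh - y(L, eh)) / (y(H, eh) - y(L, eh))       # belief mu(H|eh) = r
    return q * (1 - r) / (r * (1 - q))

for eh in (2.0, 2.2, 2.3):
    print(round(hybrid_pi(eh), 4))     # 0.3333, 0.1073, 0.0185
```

Note the additional requirement π ≤ 1: for eh too far below es the implied π exceeds 1, so with these numbers a hybrid equilibrium exists only for eh close enough to es. As eh rises toward es, r rises toward 1 and π falls toward 0, which is the limit described in the text.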

ˆ To complete the description of the hybrid PBE, we should define the firms’ beliefs off the equilibrium path
and check the workers’ best responses

ˆ Let µ(·|e) be defined as follows

µ(H|e) = { 0   for e < eh
           r   for e ≥ eh }

ˆ The firms’ strategy is then

w̃(e) = { y(L, e)                            for e < eh
          r × y(H, e) + (1 − r) × y(L, e)   for e ≥ eh }

ˆ It remains to check that the workers’ strategy

– e(L) = eh with probability π and e(L) = e∗ (L) with probability 1 − π


– e(H) = eh

is a best response to the firms’ strategy

Corporate investment and capital structure

ˆ Consider an entrepreneur who has started a company but needs outside financing to undertake an attractive new project

ˆ The entrepreneur has private information about the profitability of the existing company

ˆ The payoff of the new project cannot be disentangled from the payoff of the existing company

ˆ All that can be observed is the aggregate profit of the firm

ˆ Suppose the entrepreneur offers a potential investor an equity stake in the firm in exchange for the
necessary financing

ˆ Under what circumstances will the new project be undertaken?

ˆ What will the equity stake be?

ˆ Suppose that the profit of the existing company can be either high or low: π ∈ {H, L} with H > L > 0

ˆ The potential investor’s opportunity cost is r, i.e., there is an alternative investment possibility with rate
of return r

ˆ The required investment in the new project is I

ˆ The payoff will be R

ˆ The new project is attractive in the sense that the NPV is positive, i.e., R > I(1 + r)

The timing and the payoffs of the game are:

1. Nature determines the profit of the existing company

ˆ The probability that π = L is p

2. The entrepreneur learns π and then offers the potential investor an equity stake s, where 0 ≤ s ≤ 1

3. The investor observes s but not π and then decides either to accept or to reject the offer

4. Payoffs:

ˆ If the investor rejects the offer then the investor’s payoff is I(1 + r) and the entrepreneur’s payoff is
π
ˆ If the investor accepts s then the investor’s payoff is s(π + R) and the entrepreneur’s is (1 − s)(π + R)

ˆ Suppose that after receiving the offer s the investor believes that the probability that π = L is q(s)

ˆ Then the investor will accept s if and only if

s[qL + (1 − q)H + R] ≥ I(1 + r) (PC-I)

ˆ Suppose the profit of the existing company is π

ˆ The entrepreneur prefers to receive the financing at the cost of an equity stake of s if and only if

s ≤ R/(π + R)   (PC-E)

ˆ In a pooling PBE, the investor’s belief must be q(spo ) = p after receiving the equilibrium offer spo

ˆ The participation constraint (PC-E) is more difficult to satisfy for π = H than for π = L

ˆ Therefore, a pooling equilibrium (with “accepts” as an action) exists only if

I(1 + r)/[pL + (1 − p)H + R] ≤ R/(H + R)   (NC-p)

ˆ If p is close enough to zero, (NC-p) holds because R > I(1 + r)

ˆ If p is close enough to one, however, the necessary condition (NC-p) holds only if

R − I(1 + r) ≥ [I(1 + r)/R] (H − L)   (sNC-p)
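A quick numerical check of (NC-p); the parameter values below are assumed for illustration only: L = 50, H = 100, R = 60, I = 40, r = 0.10, so that I(1 + r) = 44 < R = 60 and the project has positive NPV.

```python
# Assumed illustrative numbers (not from the text)
L, H, R = 50.0, 100.0, 60.0
I0, r = 40.0, 0.10
cost = I0 * (1 + r)                    # I(1 + r) = 44 < R = 60: positive NPV

def pooling_exists(p):
    """(NC-p): the smallest stake the investor accepts under belief q = p
    also satisfies the high type's participation constraint (PC-E)."""
    s_min = cost / (p * L + (1 - p) * H + R)
    return s_min <= R / (H + R)

print(pooling_exists(0.1))  # True: little subsidization is needed
print(pooling_exists(0.9))  # False: 44/115 > 60/160, the high type walks away
```

This matches the discussion in the text: pooling survives when p is close to zero but can fail when p is close to one.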

ˆ In a pooling equilibrium, the high-profit type must subsidize the low-profit type

ˆ Setting q(spo ) = p yields that the investor accepts to finance the project if and only if

spo ≥ I(1 + r)/[pL + (1 − p)H + R] > I(1 + r)/(H + R)

ˆ If the investor were certain that π = H then he would accept the smaller equity stake

s_H^{sy} = I(1 + r)/(H + R)

ˆ The larger equity stake required in a pooling equilibrium may be so expensive that the high-profit firm
would prefer to forego the new project

ˆ A pooling equilibrium exists if p is close to zero, so that the cost of subsidization is small

ˆ Or if the profit from the new project outweighs the cost of subsidization

ˆ If (NC-p) fails then a pooling equilibrium does not exist

ˆ A separating equilibrium always exists, however

ˆ The low-profit type offers

sL = I(1 + r)/(L + R)

which the investor accepts

ˆ The high-profit type offers

sH < I(1 + r)/(H + R)

and the investor rejects

ˆ In such an equilibrium, investment is inefficiently low: the new project is certain to be profitable, but the
high-profit type foregoes the investment
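The separating outcome can be checked with the same assumed numbers as before (L = 50, H = 100, R = 60, I(1 + r) = 44; all values illustrative, not from the text):

```python
# Assumed illustrative numbers (not from the text)
L, H, R, cost = 50.0, 100.0, 60.0, 44.0   # cost = I(1 + r)

sL = cost / (L + R)                    # low type's offer
assert abs(sL * (L + R) - cost) < 1e-9 # investor exactly breaks even on sL

# Mimicking the accepted offer sL is unattractive to the high type,
# because sL exceeds its participation bound R/(H + R):
assert sL > R / (H + R)                # 0.4 > 0.375: high type foregoes

# The inefficiency: the project the high type foregoes has positive NPV
assert R > cost                        # 60 > 44
print(sL, R / (H + R))                 # 0.4 0.375
```

With these numbers the high type strictly prefers keeping the firm to selling a 40% stake, even though the new project is profitable.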

ˆ There is no way for the high-profit type to distinguish itself

ˆ Financing terms that are attractive to the high-profit type are even more attractive to the low-profit type

ˆ As Myers and Majluf (J. Fin. Econ. 1984) observe, the forces in this model push firms toward either
debt or internal sources of funds

ˆ Actually, Myers and Majluf analyze a large firm (with shareholders and a manager) rather than an
entrepreneur (who is both the manager and the sole shareholder)

ˆ We consider the possibility that the entrepreneur can offer debt as well as equity

ˆ Suppose the investor accepts the debt contract D

ˆ If the entrepreneur does not declare bankruptcy then the investor’s payoff is D and the entrepreneur’s is
π+R−D

ˆ If the entrepreneur does declare bankruptcy then the investor’s payoff is π + R and the entrepreneur’s is
zero

ˆ Since L > 0, there is always a pooling equilibrium: both profit-types offer the debt contract D = I(1 + r),
which the investor accepts

ˆ If L were sufficiently negative that R + L < I(1 + r), then the low-profit type could not repay this debt
so the investor would not accept the contract

ˆ A similar argument would apply if L and H represented expected rather than certain profits

ˆ Suppose that the type π means that the existing company’s profit will be

– π + K with probability 1/2


– π − K with probability 1/2

ˆ If L − K + R < I(1 + r), then there is probability 1/2 that the low-profit type will not be able to repay
the debt D = I(1 + r), so the investor will not accept the contract

Monetary policy

ˆ Consider a sequential-move game in which employers and workers negotiate nominal wages

ˆ After the negotiation, the monetary authority chooses the money supply, which in turn determines the
rate of inflation

ˆ If wage contracts cannot be perfectly indexed, employers and workers will try to anticipate inflation in
setting the wage

ˆ Once an imperfectly indexed nominal wage has been set, actual inflation above the anticipated level of
inflation will erode the real wage

ˆ This causes employers to expand employment and output

ˆ The monetary authority therefore faces a trade-off between the costs of inflation and the benefits of
reduced unemployment and increased output that follow from surprise inflation

We follow Barro and Gordon (J. Mon. Econ. 1983) and analyze a reduced-form version of this model in the
following game

ˆ First, employers form an expectation of inflation, π e

ˆ Second, the monetary authority observes this expectation and chooses actual inflation, π

ˆ The payoff to employers is −(π − π e )²: employers simply want to anticipate inflation correctly; they achieve
their maximum payoff when π = π e

ˆ The monetary authority would like inflation to be zero but output (y) to be at its efficient level (y ∗ )

ˆ The payoff to the monetary authority is

U (π, y) = −cπ² − (y − y ∗ )²

where the parameter c > 0 reflects the monetary authority’s trade-off between its two goals

ˆ Suppose the actual output is the following function of target output and surprise inflation

ỹ(π, π e ) = by ∗ + d(π − π e )

– Where b < 1 reflects the presence of monopoly power in product markets


– If there is no surprise inflation, π = π e , then actual output will be smaller than would be efficient
– Where d > 0 measures the effect of surprise inflation on output through real wages

ˆ We can then rewrite the monetary authority’s payoff as

W (π, π e ) = U (π, ỹ(π, π e )) = −cπ² − [(b − 1)y ∗ + d(π − π e )]²

ˆ We propose to solve the subgame-perfect outcome of this game

ˆ We first compute the monetary authority’s optimal choice of π given employers’ expectation π e :

π ∗ (π e ) = [d/(c + d²)] [(1 − b)y ∗ + dπ e ]

ˆ Since employers anticipate that the monetary authority will choose π ∗ (π e ), employers choose π e to maximize
−[π ∗ (π e ) − π e ]², which yields π ∗ (π e ) = π e , or

π e = [d(1 − b)/c] y ∗ ≡ π s

ˆ In this subgame-perfect outcome, the monetary authority is expected to inflate and does so
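A small numerical sketch of this subgame-perfect outcome; the parameter values (c = 2, d = 1, b = 0.5, y∗ = 10) are assumed for illustration only.

```python
# Assumed illustrative parameters (not from the text)
c, d, b, ystar = 2.0, 1.0, 0.5, 10.0

def pi_star(pie):
    """Monetary authority's best response to the expectation pie."""
    return d / (c + d**2) * ((1 - b) * ystar + d * pie)

pie = 0.0                              # iterate to the rational-expectations
for _ in range(200):                   # fixed point pi*(pie) = pie
    pie = pi_star(pie)

assert abs(pie - d * (1 - b) * ystar / c) < 1e-9   # closed form pi_s
print(round(pie, 4))  # 2.5
```

The iteration converges (the map is a contraction with factor d²/(c + d²) = 1/3 here) to the closed-form inflation rate π s = d(1 − b)y∗/c = 2.5.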

ˆ We consider a two-period version of the previous model and we add private information

ˆ In the two-period model, each player’s payoff is the sum of the player’s one-period payoffs

W (π1 , π1e ) + W (π2 , π2e )   and   −(π1 − π1e )² − (π2 − π2e )²

where πt is actual inflation in period t and πte is employers’ expectation (at end of period t−1 or beginning
of period t) of inflation in period t

ˆ We now assume that the parameter c is privately known by the monetary authority: c ∈ {S, W }

– c = S or W for “strong” and “weak” at fighting inflation where S > W > 0

1. Nature draws the monetary authority’s type, c

ˆ The probability that c = W is p

2. Employers form π1e , their expectation of first-period inflation

3. The monetary authority observes π1e and then chooses actual first-period inflation, π1

4. Employers observe π1 but not c, and then form π2e , their expectation of second-period inflation

5. The monetary authority observes π2e and then chooses actual second-period inflation, π2

ˆ There is a one-period signaling game embedded in this two-period monetary-policy game

ˆ The Sender’s message is the monetary authority’s first-period choice of inflation, π1

ˆ The Receiver’s action is employers’ second-period expectation of inflation, π2e

ˆ Employers’ first-period expectation of inflation and the monetary authority’s second-period choice of
inflation precede and follow the signaling game

ˆ If the monetary authority’s type is c then its optimal choice of π2 given the expectation π2e is

π2∗ (π2e , c) ≡ [d/(c + d²)] [(1 − b)y ∗ + dπ2e ]

ˆ Employers anticipate this

ˆ If employers begin the second period believing that the probability that c = W is q, then they will form
the expectation π2e (q) that maximizes

−q[π2∗ (π2e , W ) − π2e ]² − (1 − q)[π2∗ (π2e , S) − π2e ]²

Monetary policy: pooling equilibrium

ˆ In a pooling equilibrium, both types choose the same first-period inflation π1∗

ˆ Employers’ first-period expectation is π1e = π1∗

ˆ On the equilibrium path, employers begin the second period believing that the probability that c = W is
p and so form the expectation π2e (p)

ˆ Then the monetary authority of type c chooses its optimal second-period inflation given this expectation,
namely π2∗ [π2e (p), c], thus ending the game

Monetary policy: separating equilibrium

ˆ In a separating equilibrium, the two types choose different first-period inflation levels, say πW and πS

ˆ So employers’ first-period expectation is π1e = pπW + (1 − p)πS

ˆ After observing πW , employers begin the second period believing that c = W and so form the expectation
π2e (1) solution of the equation

π2e (1) = π2∗ (π2e (1), W ),  i.e.,  π2e (1) = [d(1 − b)/W ] y ∗

ˆ Likewise, observing πS leads to


π2e (0) = [d(1 − b)/S] y ∗

ˆ In equilibrium, the weak type then chooses π2∗ [π2e (1), W ] and the strong type π2∗ [π2e (0), S], ending the game

To complete the description of such an equilibrium it remains

1. to specify the Receiver’s out-of-equilibrium beliefs and actions

2. to check that no Sender-type has an incentive to deviate

3. in particular, to check that neither type has an incentive to mimic the other’s equilibrium behavior

ˆ The weak type might be tempted to choose πS in the first period, thereby inducing π2e (0) as the employers’
second-period expectation

ˆ And then choose π2∗ [π2e (0), W ] to end the game

ˆ Even if πS is uncomfortably low for the weak type, the ensuing expectation π2e (0) might be so low that
the weak type receives a huge payoff from the unanticipated inflation

π2∗ [π2e (0), W ] − π2e (0)

ˆ In a separating equilibrium, the strong type’s first period inflation must be low enough that the weak
type is not tempted to mimic the strong type

ˆ In spite of the subsequent benefit from unanticipated second-period inflation

4.3 Other applications of Signaling Games


Cheap-Talk games

ˆ Cheap-talk games are analogous to signaling games

ˆ But the Sender’s messages are just talk: costless, non-binding, non-verifiable claims

ˆ Such talk cannot be informative in Spence’s job market signaling game:

– a worker who simply announced “My ability is high” would not be believed

ˆ In other contexts, cheap talk can be informative

ˆ Stein (Am. Econ. Rev. 1989) shows that policy announcements by the Federal Reserve can be informative
but cannot be too precise

ˆ Matthews (Quarterly J. Econ. 1989) studies how a veto threat by the president can influence which bill
gets through Congress

ˆ One can also ask how to design environments to take advantage of cheap talk

ˆ Austen-Smith (1990), a very interesting paper, shows that in some settings debate among self-interested
legislators improves the social value of the eventual legislation

ˆ Farrell and Gibbons (1991) show that in some settings unionization improves social welfare because it
facilitates communication from the work force to management

ˆ In Spence’s job market model, cheap talk cannot be informative because all the Sender’s types have the
same preferences over the Receiver’s possible actions:

– all workers prefer higher wages, independent of ability

ˆ Let’s illustrate why uniformity of preferences over the Receiver’s possible actions vitiates cheap talk

ˆ Suppose there were a pure-strategy equilibrium in which one subset of Sender-types, T1 , sends one message,
m1

ˆ While another subset of types, T2 , sends another message, m2

ˆ In equilibrium, the Receiver will interpret mi as coming from Ti and so will take the optimal action given
this belief; denote this action by ai

ˆ Since all Sender-types have the same preferences over actions

– If one type prefers a1 to a2 , then all types have this preference and will send m1 rather than m2

ˆ This destroys the putative equilibrium

There are three necessary conditions for cheap talk to be informative

1. different Sender-types have different preferences over the Receiver’s actions

2. the Receiver prefers different actions depending on the Sender’s type

3. the Receiver’s preferences over actions must not be completely opposed to the Sender’s preferences

ˆ Suppose that the Receiver prefers low actions when the Sender’s type is low and high actions when
the Sender’s type is high
ˆ If low Sender-types prefer low actions and high types high actions, then communication can occur
ˆ If the Sender has the opposite preference then communication cannot occur because the Sender would
like to mislead the Receiver

ˆ Crawford and Sobel (Econometrica 1982) analyze an abstract model that satisfies these three necessary
conditions: they show that

– more communication can occur through cheap talk when the players’ preferences are more closely
aligned
– perfect communication cannot occur unless the players’ preferences are perfectly aligned

ˆ Each of economic applications (cheap talk of the Fed, veto threats, information transmission in debate,
union voice) involve complicated models of economic environments

ˆ We will only analyze abstract cheap-talk games

An abstract cheap-talk game

The timing of the simplest cheap-talk game is identical to the timing of a signaling game; only payoffs differ

1. Nature draws a type ti for the Sender from a set T = {t1 , ..., tI } of feasible types according to a probability
distribution p with full support, i.e., p(t) > 0 for every t ∈ T

2. The Sender observes ti and then chooses a message mj from a set of feasible messages M = {m1 , ..., mJ }

3. The Receiver observes mj (but not ti ) and then chooses an action ak from a set of feasible actions
A = {a1 , ..., aK }

4. Payoffs are given by US (ti , ak ) and UR (ti , ak )

ˆ The key feature of such a game is that the message has no direct effect on either the Sender’s or the
Receiver’s payoff

ˆ The only way the message can matter is through its information content

ˆ By changing the Receiver’s belief about the Sender’s type, a message can change the Receiver’s action

ˆ And thus, indirectly affect both players’ payoffs

ˆ We will assume that anything can be said in the sense that M = T

ˆ Because the simplest cheap-talk and signaling games have the same timing, the definitions of P BE in the
two games are identical

ˆ A pure-strategy PBE is a pair of strategies m∗ : T → M and a∗ : M → A, and a family (µ(·|mj ))mj ∈M


of beliefs over T satisfying Requirements (1), (2R), (2S) and (3)

ˆ In a cheap-talk game, a pooling equilibrium always exists

ˆ Because messages have no direct effect on the Sender’s payoff

ˆ If the Receiver will ignore all messages then pooling is a best response for the Sender; and if the Sender
is pooling then a best response for the Receiver is to ignore all messages

ˆ More formally, let ā denote the Receiver’s optimal action in a pooling equilibrium, i.e., ā solves

max_{ak ∈A}  Σ_{ti ∈T} p(ti ) UR (ti , ak )

ˆ Define a∗ by a∗ (mj ) = ā for every mj ∈ M

ˆ Fix an arbitrary message m̄ in M and define m∗ by m∗ (ti ) = m̄ for every ti ∈ T

ˆ Let µ(·|mj ) = p for every mj ∈ M

ˆ This is a pooling equilibrium
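A minimal computational sketch of this pooling construction; the two-type, two-action payoff numbers below are assumed for illustration only.

```python
# Assumed illustrative two-type, two-action example (not from the text)
T = ["t1", "t2"]
A = ["a1", "a2"]
p = {"t1": 0.6, "t2": 0.4}             # prior with full support

# Receiver's payoffs U_R(t, a): prefers a1 against t1 and a2 against t2
UR = {("t1", "a1"): 1.0, ("t1", "a2"): 0.0,
      ("t2", "a1"): 0.0, ("t2", "a2"): 1.0}

# The pooled action a-bar maximizes expected payoff under the prior;
# in the pooling PBE every message is answered with a-bar
abar = max(A, key=lambda a: sum(p[t] * UR[(t, a)] for t in T))
print(abar)  # a1, since p(t1) = 0.6 > 0.4
```

Because messages carry no direct payoff, ignoring them and playing ā is a best response for the Receiver, and any common message is a best response for the Sender.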

ˆ The interesting question therefore is whether non-pooling equilibria exist

ˆ We consider a two-type, two-action example

ˆ T = {tL , tH }, p(tL ) = p, and A = {aL , aH }

ˆ The payoffs are given in the following table (this is not a game in normal form!)

                     Sender’s type
                     tL        tH
Receiver’s   aL      x, 1      y, 0
action       aH      z, 0      w, 1

ˆ The first payoff in each cell is the Sender’s and the second the Receiver’s

ˆ We have chosen the Receiver’s payoffs so that the Receiver

– prefers the low action aL when the Sender’s type is low tL


– prefers the high action aH when the Sender’s type is high tH

ˆ To illustrate the first necessary condition, suppose both Sender-types have the same preferences over
actions

– For example, x > z and y > w


– Both types prefer aL to aH and both types would like the Receiver to believe that t = tL : the
Receiver cannot believe such a claim

ˆ To illustrate the third necessary condition, suppose the players’ preferences are completely opposed

– For example, z > x and y > w


– The low Sender-type prefers the high action and the high Sender-type the low action
– Then tL would like the Receiver to believe that t = tH and tH would like the Receiver to believe
that t = tL
– The Receiver cannot believe either of these claims

ˆ Consider now the case: x ≥ z and w ≥ y;


the players’ interests are perfectly aligned, in the sense that given the Sender’s type, the players (Sender
and Receiver) agree on which action should be taken

ˆ We exhibit a separating PBE

ˆ The Sender’s strategy is m∗ (t) = t for every t ∈ T

ˆ The Receiver’s beliefs are µ(tL |tL ) = 1 and µ(tL |tH ) = 0

ˆ The Receiver’s strategy is a∗ (tL ) = aL and a∗ (tH ) = aH

ˆ We consider now a special case of Crawford and Sobel’s model

ˆ The type, message and action spaces are continuous

ˆ The Sender’s type is uniformly distributed between 0 and 1

– T = [0, 1] and p = λ the Lebesgue measure

ˆ The message space is the type space M = T

ˆ The action space is the interval from 0 to 1, i.e., A = [0, 1]

ˆ The Receiver’s payoff function is UR (t, a) = −(a − t)²

ˆ The Sender’s payoff function is US (t, a) = −[a − (t + b)]²

ˆ When the Sender’s type is t, the Receiver’s optimal action is a = t, but according to the Sender’s
preferences the optimal action is a = t + b (strictly, min{1, t + b}, since actions lie in [0, 1])

ˆ Different Sender-types have different preferences over the Receiver’s actions (higher types prefer higher
actions)

ˆ The player’s preferences are not completely opposed

– The parameter b > 0 measures the similarity of the players’ preferences


– When b is closer to 0, the players’ interests are more closely aligned

ˆ We will prove the existence of partially pooling equilibria of the following form

ˆ The type space is divided into the n intervals

[0, x1 ), [x1 , x2 ), ..., [xn−1 , 1]

ˆ All the types in a given interval send the same message, but types in different intervals send different
messages

ˆ Given the value of the preference-similarity parameter b, there is a maximum number of intervals (or
“steps”) that can occur in equilibrium

ˆ This maximum number is denoted by n∗ (b), and partially pooling equilibria exist for each n ∈ {1, 2, ..., n∗ (b)}

ˆ A decrease in b increases n∗ (b):


more communication can occur through cheap talk when the players’ preferences are more closely aligned

ˆ n∗ (b) approaches infinity as b approaches zero: perfect communication cannot occur unless the players’
preferences are perfectly aligned

ˆ We characterize these partially pooling equilibria, starting with a two-step equilibrium, i.e., n = 2

ˆ Suppose all the types in [0, x1 ) send one message while those in [x1 , 1] send another

ˆ After receiving the message from the types in [0, x1 ), the Receiver will believe that the Sender’s type is
uniformly distributed on [0, x1 )

ˆ So the Receiver’s optimal action will be x1 /2

ˆ After receiving the message from the types in [x1 , 1], the Receiver’s optimal action will be (x1 + 1)/2

ˆ For the types in [0, x1 ) to be willing to send their message, it must be that all these types prefer the action
x1 /2 to the action (x1 + 1)/2

ˆ Likewise, all the types above x1 must prefer (x1 + 1)/2 to x1 /2

ˆ The Sender-type t

– prefers x1 /2 to (x1 + 1)/2 if the midpoint between these two actions exceeds that type’s optimal
action, t + b
– prefers (x1 + 1)/2 to x1 /2 if t + b exceeds the midpoint

ˆ For a two-step equilibrium to exist, x1 must be the type t whose optimal action t + b exactly equals the
midpoint between the two actions

x1 + b = (1/2) [x1 /2 + (x1 + 1)/2]

or x1 = (1/2) − 2b

ˆ Since x1 must be positive, a two-step equilibrium exists only if b < 1/4

ˆ For b ≥ 1/4 the players’ preferences are too dissimilar to allow even this limited communication
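The two-step cutoff can be verified numerically; the value of b below is assumed for illustration (any b < 1/4 works).

```python
# Assumed illustrative preference-similarity parameter (any b < 1/4)
b = 0.1
x1 = 0.5 - 2 * b                       # cutoff: x1 = 1/2 - 2b

a_low = x1 / 2                         # Receiver's action after the low message
a_high = (x1 + 1) / 2                  # Receiver's action after the high message

# Boundary type x1 has optimal action x1 + b, exactly at the midpoint
midpoint = 0.5 * (a_low + a_high)
assert abs((x1 + b) - midpoint) < 1e-9

# Types just below x1 prefer a_low; types just above prefer a_high
for t, closer in ((x1 - 0.01, a_low), (x1 + 0.01, a_high)):
    other = a_high if closer is a_low else a_low
    assert abs(closer - (t + b)) < abs(other - (t + b))
print(round(a_low, 4), round(a_high, 4))  # 0.15 0.65
```

The assertions confirm the incentive constraints that define the two-step partition.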

ˆ We still have to address the issue of messages that are off the equilibrium path

ˆ Let the Sender’s strategy be that all types t < x1 send the message 0

ˆ And all types t ≥ x1 send the message x1

ˆ Let the Receiver’s out-of-equilibrium belief after observing any message from (0, x1 ) be that t is uniformly
distributed on [0, x1 )

ˆ And after receiving any message from (x1 , 1] be that t is uniformly distributed on [x1 , 1]

ˆ We propose to characterize an n-step equilibrium

ˆ Assume the step [xk−1 , xk ) is of length c

ˆ To make the boundary type xk indifferent between the steps [xk−1 , xk ) and [xk , xk+1 )

ˆ One must have


(xk+1 + xk )/2 − (xk + b) = c/2 + b

or

(xk+1 − xk ) = (xk − xk−1 ) + 4b

ˆ Each step must be 4b longer than the last

ˆ In an n-step equilibrium, if the first step is of length d

ˆ Then the second must be of length d + 4b

ˆ The third of length d + 8b

ˆ The nth step must end exactly at t = 1, so we must have

d + (d + 4b) + ... + [d + (n − 1)4b] = 1

ˆ Recall that 1 + 2 + ... + (n − 1) = n(n − 1)/2

ˆ Therefore we have
n × d + n(n − 1) × 2b = 1 (NC)

ˆ Given any n such that n(n − 1) × 2b < 1, there exists a value of d that solves (NC)

ˆ And therefore there exists an n-step partially pooling equilibrium
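As a sketch, the condition (NC) can be solved and verified numerically (the values b = 0.05 and n = 3 are illustrative choices satisfying n(n − 1) × 2b < 1):

```python
b, n = 0.05, 3                         # illustrative values: n(n-1)*2b = 0.6 < 1
assert n * (n - 1) * 2 * b < 1

d = (1 - n * (n - 1) * 2 * b) / n      # first step's length, solving (NC)
steps = [d + k * 4 * b for k in range(n)]   # each step is 4b longer than the last

assert abs(sum(steps) - 1) < 1e-9      # the n steps exactly cover [0, 1]
assert all(abs((steps[k + 1] - steps[k]) - 4 * b) < 1e-12 for k in range(n - 1))
```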

ˆ The largest possible number of steps in such an equilibrium, n∗ (b), is the largest value n such that
n(n − 1) × 2b < 1

ˆ Therefore n∗ (b) is the largest integer less than

(1/2)[1 + √(1 + 2/b)]

ˆ Observe that n∗ (b) = 1 for b ≥ 1/4: no communication is possible if the players’ preferences are too
dissimilar

ˆ Moreover, n∗ (b) approaches infinity only as b approaches zero: perfect communication cannot occur unless
the players’ preferences are perfectly aligned
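A small Python sketch can confirm the closed form for n∗ (b) against a direct search over the condition n(n − 1) × 2b < 1 (the test values of b are arbitrary):

```python
import math

def n_star(b):
    """Largest n satisfying n(n-1)*2b < 1, found by direct search."""
    n = 1
    while (n + 1) * n * 2 * b < 1:
        n += 1
    return n

for b in (0.05, 0.1, 0.25, 0.4):
    closed_form = 0.5 * (1 + math.sqrt(1 + 2 / b))  # n*(b) is the largest integer below this
    assert n_star(b) == math.ceil(closed_form) - 1

assert n_star(0.25) == 1    # no communication is possible for b >= 1/4
```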

Sequential bargaining under asymmetric information

ˆ Consider a firm and a union bargaining over wages

ˆ For simplicity, assume that employment is fixed

ˆ The amount that union members earn if not employed, called the union’s reservation wage, is denoted by wr

ˆ The firm’s profit, denoted by π, is uniformly distributed on [πL , πH ]

ˆ The value of π is privately known by the firm

– The firm might have superior knowledge concerning new products in the planning stage

ˆ We simplify the analysis by assuming that wr = πL = 0

The bargaining game lasts at most two periods

1. In the first period, the union makes a wage offer, w1

ˆ If the firm accepts this offer then the game ends


ˆ The union’s payoff is w1 and the firm’s is π − w1
ˆ These payoffs are the present values of the wage and (net) profit streams that accrue to the players
over the life of the contract being negotiated

2. If the firm rejects this offer the game proceeds to the second period

ˆ The union makes a second wage offer, w2


ˆ If the firm accepts this offer then the present values of the players’ payoffs are δw2 for the union and
δ(π − w2 ) for the firm
ˆ δ reflects both discounting and the reduced life of the contract remaining after the first period
ˆ If the firm rejects the union’s second offer then the game ends and payoffs are zero for both players

ˆ A more realistic model might allow the bargaining to continue until an offer is accepted

ˆ Or might force the parties to submit to binding arbitration after a prolonged strike

ˆ Here we sacrifice realism for tractability

ˆ We refer to Sobel and Takahashi (Rev. Econ. Stud. 1983) for an infinite-horizon analysis

We begin by sketching the unique PBE of this game

ˆ The union’s first-period wage offer is

w1∗ = (2 − δ)² πH /[2(4 − 3δ)]

ˆ If the firm’s profit, π, exceeds

π1∗ = 2w1∗ /(2 − δ) = (2 − δ)πH /(4 − 3δ)

then the firm accepts w1∗ ; otherwise the firm rejects w1∗

ˆ If its first-period offer is rejected, the union updates its belief about the firm’s profit

– The union believes that π is uniformly distributed on [0, π1∗ ]

ˆ The union’s second-period wage offer (conditional on w1∗ being rejected) is

w2∗ = π1∗ /2 = (2 − δ)πH /[2(4 − 3δ)] < w1∗

ˆ If the firm’s profit, π, exceeds w2∗ then the firm accepts the offer; otherwise, it rejects it

ˆ We will refer interchangeably to one firm with many possible profit types and to many firms each with
its own profit level

ˆ In each period, high-profits firms accept the union’s offer

ˆ While low-profit firms reject it

ˆ The union’s second-period belief reflects the fact that high-profit firms accepted the first-period offer

ˆ In equilibrium, low-profit firms tolerate a one-period strike in order to convince the union that they are
low-profit and so induce the union to offer a lower second-period wage

ˆ Firms with very low profits find even the lower second-period offer intolerably high and so reject it, too

ˆ We propose an extensive-form representation of a simplified version of the game

ˆ There are only two values of π: πL and πH

ˆ The union has only two possible wage offers wL and wH

ˆ In this simplified game, the union has the move at three information sets: the union’s strategy consists
of three wage offers

1. The first-period offer, w1


2. The second-period offer, w2 (H) after w1 = wH is rejected
3. The second-period offer, w2 (L) after w1 = wL is rejected

ˆ These three moves occur at three non-singleton information sets, at which the union’s beliefs are denoted

(p, 1 − p), (q, 1 − q) and (r, 1 − r)

respectively

ˆ In the full game, a strategy for the union is a

1. first-period offer w1
2. a second-period offer function w1 7→ w̃2 (w1 ) that specifies the offer w2 to be made after each possible
offer w1 is rejected

ˆ Each of these moves occur at a non-singleton information set

ˆ There is one second-period information set for each different first-period wage offer the union might make

ˆ So there is a continuum of such information sets, rather than two in the simplified game

ˆ With both the lone first-period and the continuum of second-period information sets, there is one decision
node for each possible value of π (so a continuum of such nodes, rather than two for the simplified game)

ˆ At each information set, the union’s belief is a probability distribution over these nodes

ˆ We denote the union’s first-period belief by µ1 ∈ Prob([0, πH ])

ˆ The union’s second-period belief, after observing the first-period offer w1 has been rejected, is denoted by
µ2 (·|w1 )

ˆ A strategy for the firm involves two decisions

ˆ Let A1 (w1 |π) equal one if the firm would accept the first-period offer w1 when its profit is π, and zero if
the firm would reject w1 under these circumstances

ˆ Let A2 (w2 |π, w1 ) equal one if the firm would accept the second-period offer w2 when its profit is π and
the first-period offer was w1 , and zero if the firm would reject w2 under these circumstances

ˆ A strategy for the firm is a pair of functions (A1 , A2 ) with

A1 : (w1 , π) 7→ A1 (w1 |π) ∈ {0, 1}

and
A2 : (w2 , w1 , π) 7→ A2 (w2 |π, w1 ) ∈ {0, 1}

ˆ Since the firm has complete information throughout the game, its beliefs are trivial

ˆ The strategies (w̃1 , w̃2 ) and (A1 , A2 ), and the beliefs (µ1 , µ2 ) form a PBE if they satisfy Requirements 2,
3 and 4

ˆ Requirement 1 is satisfied by the mere existence of the union’s beliefs

ˆ We will show that there is a unique perfect Bayesian equilibrium

ˆ The simplest step of the argument is to apply Requirement 2 to the firm’s second-period decision A2 (w2 |π, w1 )

ˆ Since this is the last move of the game, the optimal decision for the firm is to accept w2 if and only if
π ≥ w2 ; the value of w1 is irrelevant
A2 (w2 |π, w1 ) = 1 if π ≥ w2 , and A2 (w2 |π, w1 ) = 0 if π < w2

ˆ Given the strategy A2 , we can apply Requirement 2 to the union’s second-period choice of a wage offer

ˆ w2 should maximize the union’s expected payoff, given the union’s belief µ2 and the firm’s subsequent
strategy A2

ˆ The difficult part of the argument is to determine the belief µ2

ˆ We temporarily consider the following one-period bargaining problem

ˆ Suppose the union believes that the firm’s profit is uniformly distributed on [0, π1 ], where for the moment
π1 is arbitrary

ˆ If the union offers w then the firm’s best response is:

– accept w if and only if π ≥ w

ˆ Thus the union’s problem can be stated as

max over w ≥ 0 of   w × Prob{firm accepts w} + 0 × Prob{firm rejects w}

where

Prob{firm accepts w} = (π1 − w)/π1

ˆ The optimal wage offer is therefore w∗ (π1 ) = π1 /2
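The one-period solution w∗ (π1 ) = π1 /2 is easy to confirm by brute force; the Python sketch below (with the arbitrary choice π1 = 2) maximizes the union’s expected payoff over a grid:

```python
pi1 = 2.0   # illustrative upper bound of the uniform support

def payoff(w):
    # union's expected payoff: wage times acceptance probability (pi1 - w)/pi1
    return w * (pi1 - w) / pi1 if w <= pi1 else 0.0

grid = [i * pi1 / 10000 for i in range(10001)]
w_best = max(grid, key=payoff)
assert abs(w_best - pi1 / 2) < 1e-3   # matches w*(pi1) = pi1/2
```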

ˆ We return (permanently) to the two-period problem

ˆ Assume that the union offers w1 in the first period and the firm expects the union to offer w2 in the
second period

ˆ The firm’s possible payoffs are

– π − w1 from accepting w1
– δ(π − w2 ) from rejecting w1 and accepting w2
– zero from rejecting both offers

ˆ The firm prefers accepting w1 to accepting w2 if π − w1 ≥ δ(π − w2 ), or

π ≥ (w1 − δw2 )/(1 − δ) ≡ π ∗ (w1 , w2 )

ˆ And the firm prefers accepting w1 to rejecting both offers if π − w1 ≥ 0

ˆ Thus for arbitrary values of w1 and w2 , firms with π ≥ max{π ∗ (w1 , w2 ), w1 } will accept w1 and the other
firms will reject

ˆ Since Requirement 2 dictates that the firm act optimally given the players’ subsequent strategies, we can
derive A1 (w1 |π) by replacing the arbitrary wage w2 by w̃2 (w1 ), i.e.,
A1 (w1 |π) = 1 if π ≥ max{π ∗ (w1 , w̃2 (w1 )), w1 }, and A1 (w1 |π) = 0 otherwise

ˆ We can derive µ2 , the union’s second-period belief at the information set reached if the first period offer
w1 is rejected

ˆ Requirement 4 dictates that the union’s belief be determined by Bayes’ rule and the firm’s strategy

ˆ Thus, given the first part of the firm’s strategy A1 just derived

ˆ The union’s belief must be that the types remaining in the second period are uniformly distributed on
[0, π̂1 (w1 , w̃2 )] where
π̂1 (w1 , w̃2 ) ≡ max{π ∗ (w1 , w̃2 (w1 )), w1 }

ˆ Given this belief, the union’s optimal second-period offer must be

w̃2 (w1 ) = w∗ (π̂1 (w1 , w̃2 )) = π̂1 (w1 , w̃2 )/2

ˆ It follows that w̃2 (w1 ) solves the implicit equation for w2 as a function of w1 :

2w2 = max{π ∗ (w1 , w2 ), w1 }

ˆ To solve this equation, suppose that w1 ≥ π ∗ (w1 , w2 )

ˆ Then 2w2 = w1 , i.e., w2 = w1 /2, so that π ∗ (w1 , w1 /2) = w1 (2 − δ)/[2(1 − δ)] > w1 for δ ∈ (0, 1),
contradicting w1 ≥ π ∗ (w1 , w2 )

ˆ Therefore, we must have 2w2 = π ∗ (w1 , w2 ), implying that

w̃2 (w1 ) = w1 /(2 − δ)
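This fixed point can be verified numerically; in the Python sketch below (δ = 0.9 and w1 = 1 are arbitrary values) the candidate w2 = w1 /(2 − δ) is checked against the implicit equation 2w2 = π ∗ (w1 , w2 ):

```python
delta, w1 = 0.9, 1.0   # illustrative values

w2 = w1 / (2 - delta)                       # candidate solution of 2*w2 = pi*(w1, w2)
pi_star = (w1 - delta * w2) / (1 - delta)   # indifference type pi*(w1, w2)

assert abs(2 * w2 - pi_star) < 1e-9         # the implicit equation holds
assert pi_star > w1                         # so max{pi*(w1, w2), w1} = pi*(w1, w2)
```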

ˆ Therefore, the union’s second-period belief at the information set reached if the first period offer w1 is
rejected

ˆ Is that the types remaining in the second period are uniformly distributed on

[0, π̃(w1 )] where π̃(w1 ) = π̂1 (w1 , w̃2 (w1 ))

ˆ Since w̃2 (w1 ) = w1 /(2 − δ) we get that

π̃(w1 ) = 2w1 /(2 − δ)
ˆ We have now reduced the game to a single-period optimization problem for the union

ˆ Given the union’s first-period wage offer, w1 , we have specified

– the firm’s optimal first-period response

A1 (w1 |π) = 1 ⇔ π ≥ π̃(w1 ) = 2w1 /(2 − δ)

– the union’s belief entering the second period

µ2 (·|w1 ) = [1/π̃(w1 )] · λ[0,π̃(w1 )) , the uniform distribution on [0, π̃(w1 ))

– the union’s optimal second-period offer

w̃2 (w1 ) = w1 /(2 − δ)

– the firm’s optimal second-period response

A2 (w2 |π, w1 ) = 1 ⇔ π ≥ w2

ˆ Thus, the union’s first-period wage offer w1 should be chosen to solve

w1 × µ1 {A1 (w1 |·) = 1} + Π2 (w1 ) × [1 − µ1 {A1 (w1 |·) = 1}]

where Π2 (w1 ) is the discounted second-period payoff conditional on the firm rejecting the offer w1 , i.e.,

Π2 (w1 ) = δ w̃2 (w1 ) × µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ]

ˆ Observe that

µ1 {A1 (w1 |·) = 1} = µ1 {π ≥ π̃(w1 )} = [πH − π̃(w1 )]/πH

ˆ Observe that
µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ] = µ2 ([w̃2 (w1 ), πH ]|w1 )

ˆ Since π̃(w1 ) = 2w̃2 (w1 ) we get that

µ2 [{A2 (w̃2 (w1 )|·, w1 ) = 1}|w1 ] = [π̃(w1 ) − w̃2 (w1 )]/π̃(w1 )

ˆ The union’s first-period wage offer w1∗ should be chosen to solve

max over w1 ≥ 0 of   w1 [πH − π̃(w1 )]/πH + δ w̃2 (w1 )[π̃(w1 ) − w̃2 (w1 )]/πH

ˆ The solution w1∗ is

w1∗ = (2 − δ)² πH /[2(4 − 3δ)]

ˆ If the firm’s profit, π, exceeds

π1∗ = 2w1∗ /(2 − δ) = (2 − δ)πH /(4 − 3δ)

then the firm accepts w1∗ ; otherwise, the firm rejects w1∗

ˆ If its first period offer is rejected, the union updates its belief about the firm’s profit: the union believes
that π is uniformly distributed on [0, π1∗ ]

ˆ The union’s second-period wage offer (conditional on w1∗ being rejected) is

w2∗ = π1∗ /2 = (2 − δ)πH /[2(4 − 3δ)] < w1∗

ˆ If the firm’s profit, π, exceeds w2∗ then the firm accepts the offer; otherwise, it rejects it
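As a check on the whole derivation, the Python sketch below compares the closed form for w1∗ with a grid search over the union’s reduced first-period problem (δ = 0.5 and πH = 1 are illustrative values):

```python
delta, pi_H = 0.5, 1.0   # illustrative parameter values

def objective(w1):
    """Union's reduced first-period problem, with types uniform on [0, pi_H]."""
    pi_tilde = min(2 * w1 / (2 - delta), pi_H)  # acceptance threshold, capped at pi_H
    w2 = w1 / (2 - delta)                       # union's second-period offer
    return (w1 * (pi_H - pi_tilde) + delta * w2 * max(pi_tilde - w2, 0)) / pi_H

grid = [i / 100000 for i in range(100001)]      # w1 grid on [0, 1]
w1_grid = max(grid, key=objective)
w1_closed = (2 - delta) ** 2 * pi_H / (2 * (4 - 3 * delta))
assert abs(w1_grid - w1_closed) < 1e-3          # grid optimum matches the closed form
```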

Reputation in the finitely repeated Prisoners’ Dilemma

ˆ Consider a stage game having a unique Nash equilibrium

ˆ Any finitely repeated game based on this stage game has a unique SPNE

– The Nash equilibrium of the stage game is played in every stage, after every history

ˆ A great deal of experimental evidence suggests that cooperation occurs frequently during finitely repeated
Prisoners’ Dilemmas

ˆ Especially in stages that are not too close to the end

ˆ Kreps, Milgrom, Roberts, and Wilson (J. Econ. Theory 1982) show that a reputation model offers an
explanation of this evidence

ˆ We introduce a new way of modeling asymmetric information

ˆ Rather than assume that one player has private information about his or her payoffs

ˆ We will assume that the player has private information about his or her feasible strategies

ˆ We will assume that with probability p the Row player can play only the Tit-for-Tat strategy

– This strategy begins the repeated game by cooperating and thereafter mimics the opponent’s previous
play

ˆ While with probability 1 − p the Row player can play any of the strategies available in the complete-
information repeated game (including Tit-for-Tat)

– This Row-type is called “rational”

ˆ Under this formulation, if the Row player ever deviates from the Tit-for-Tat strategy then it becomes
common knowledge that Row is rational

ˆ The spirit of KMRW’s analysis is that even if p is very small

– i.e., even if the Column player has only a tiny suspicion that the Row player might not be rational

ˆ This uncertainty can have a big effect

ˆ KMRW show that there is an upper bound on the number of stages in which either player finks in
equilibrium

ˆ This upper bound depends on p and on the stage-game payoffs but not on the number of stages in the
repeated game

ˆ Thus, in any equilibrium of a long enough repeated game, the fraction of stages in which both players
cooperate is large

The two key steps in KMRW’s argument are

1. If the Row player deviates from Tit-for-Tat then it becomes common knowledge that Row is rational

ˆ So neither player cooperates thereafter


ˆ So the rational Row has an incentive to mimic Tit-for-Tat

2. Given an assumption on the stage-game payoffs to be imposed below, the Column player’s best response
against Tit-for-Tat would be to cooperate until the last stage of the game

ˆ We will consider the complement of the analysis in KMRW

ˆ Rather than assume that p is small and analyze long repeated games

ˆ We will assume that p is large enough that there exists an equilibrium in which both players cooperate in
all but the last two stages of a (possibly short) repeated game

ˆ We begin with the two period case

The timing is

1. Nature draws a type for the Row player

ˆ With probability p, Row has only the Tit-for-Tat strategy available


ˆ With probability 1 − p, Row can play any strategy
ˆ Row learns his or her type, but Column does not learn Row’s type

2. Row and Column play the Prisoners’ Dilemma

ˆ The players’ choices in this stage game become common knowledge

3. Row and Column play the Prisoners’ Dilemma for a second and last time

4. Payoffs are received

ˆ The payoffs are the (undiscounted) sums of their stage-game payoffs

Column
Cooperate Fink
Cooperate 1, 1 b, a
Row
Fink a, b 0, 0

ˆ To make this stage game a Prisoners’ Dilemma, we assume that

a>1 and b<0

ˆ Recall that finking (F) strictly dominates cooperating (C) in the stage game, both for rational Row and
for Column

ˆ Since, in the last stage of this two-period game of incomplete information, Column will surely fink

ˆ Then, there is no reason for the rational Row to cooperate in the first stage

ˆ Tit-for-Tat begins the game by cooperating

ˆ Thus, the only move to be determined is Column’s first-period move (X)

ˆ This move is then mimicked by Tit-for-Tat in the second period

ˆ By choosing X = C, Column receives the expected payoff

p · 1 + (1 − p) · b

in the first period

ˆ Since Tit-for-Tat and the rational Row choose different moves in the first period

ˆ Column will begin the second period knowing whether Row is Tit-for-Tat or rational

ˆ The expected second-period payoff for the Column player is

p · a + (1 − p) · 0

ˆ This reflects Column’s uncertainty about Row’s type when deciding whether to cooperate or fink in the
first period

ˆ By choosing X = F , Column’s expected payoff in the first period is

p · a + (1 − p) · 0

ˆ In the second period, Tit-for-Tat mimics Column’s first-period fink, so both Row types fink; since Column
also finks in the last stage, his second-period payoff is 0
ˆ Therefore, Column will cooperate in the first period provided that

p + (1 − p)b ≥ 0 (C-1)

ˆ We hereafter assume that (C-1) holds
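Condition (C-1) is just the comparison of Column’s two expected total payoffs; the Python sketch below makes the comparison explicit, under the accounting that after X = F Tit-for-Tat finks in the second period so that period yields 0 (the stage-game payoffs a = 2, b = −1 and prior p = 0.6 are illustrative):

```python
a, b, p = 2.0, -1.0, 0.6   # illustrative payoffs (a > 1, b < 0) and prior p

# X = C: cooperate first, learn Row's type, then fink against Tit-for-Tat
coop = (p * 1 + (1 - p) * b) + p * a
# X = F: fink first; Tit-for-Tat then finks, so the second period yields 0
fink = (p * a + (1 - p) * 0) + 0

assert coop >= fink                 # equivalent to (C-1)
assert p + (1 - p) * b >= 0         # (C-1) itself
```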

ˆ Now consider the three-period case

ˆ If Column and the rational Row both cooperate in the first period

ˆ Then the equilibrium path for the second and third periods will be given by the equilibrium of the previous
two-period game with X = C

ˆ We will derive sufficient conditions for Column and the rational Row to cooperate in the first period and
get the following three-period path, called “cooperation equilibrium”

In this equilibrium
ˆ The payoff to the rational Row is 1 + a

ˆ The expected payoff to Column is

[p · 1 + (1 − p) · 1] + [p · 1 + (1 − p)b] + [p · a + (1 − p) · 0] = 1 + p + (1 − p)b + pa

ˆ If the rational Row finks in the first period

ˆ Then it becomes common knowledge that Row is rational

ˆ So both players fink in the second and third periods

ˆ Thus, the total payoff to the rational Row from finking in the first period is a

ˆ This is less than the cooperation equilibrium payoff 1 + a

ˆ The rational Row has no incentive to deviate from the strategy of the cooperation equilibrium

ˆ We next consider whether Column has an incentive to deviate

ˆ If Column finks in the first period then

– Tit-for-Tat will fink in the second period


– the rational Row will fink in the second period because Column is sure to fink in the last period

ˆ Having finked in the first period, Column must then decide whether to fink or cooperate in the second
period

ˆ If Column finks in the second period, then Tit-for-Tat will fink in the third period

ˆ The play will be as follows

ˆ Column’s payoff from this deviation is a

ˆ This is less than Column’s expected payoff in the cooperation equilibrium provided that

1 + p + (1 − p)b + pa ≥ a

ˆ Given (C-1), a sufficient condition for Column not to play this deviation is

1 + pa ≥ a (C-2)

ˆ Alternatively, Column could deviate by finking in the first period but cooperating in the second

ˆ In which case Tit-for-Tat would cooperate in the third period

ˆ The play would be as follows:

ˆ Column’s expected payoff from this deviation is a + b + pa

ˆ This is less than Column’s expected payoff in the cooperation equilibrium provided that

1 + p + (1 − p)b + pa ≥ a + b + pa

ˆ Given (C-1), a sufficient condition for Column not to play this deviation is

a+b≤1 (C-3)

ˆ We have shown that if (C-1), (C-2) and (C-3) hold

ˆ Then the cooperation equilibrium is the equilibrium path of a PBE of the three-period Prisoners’ Dilemma

ˆ For a given value of p, the payoffs a and b satisfy these three conditions if they belong to the shaded region

ˆ As p approaches zero, this shaded region vanishes
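The three sufficient conditions can be bundled into a simple membership test for the shaded region; in the Python sketch below the payoff and prior values are illustrative:

```python
def in_region(a, b, p):
    """Check (C-1), (C-2), (C-3) for the three-period cooperation equilibrium."""
    c1 = p + (1 - p) * b >= 0   # (C-1): Column cooperates in the two-period game
    c2 = 1 + p * a >= a         # (C-2): the fink-then-fink deviation is unprofitable
    c3 = a + b <= 1             # (C-3): the fink-then-cooperate deviation is unprofitable
    return c1 and c2 and c3

assert in_region(a=1.3, b=-0.4, p=0.5)       # illustrative point inside the region
assert not in_region(a=3.0, b=-0.4, p=0.5)   # (C-2) fails: 1 + 1.5 < 3
```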

4.4 Refinements of Perfect Bayesian Equilibrium


ˆ We defined a perfect Bayesian equilibrium to be strategies and beliefs satisfying Requirements 1 through
4

ˆ We observed that in such an equilibrium no player’s strategy can be strictly dominated beginning at any
information set

ˆ We now consider two further requirements on beliefs off the equilibrium path

ˆ The first additional requirement formalizes the following idea

Since PBE prevents player i from playing a strategy that is strictly dominated beginning at any information
set, it is not reasonable for player j to believe that i would play such a strategy

ˆ To make this idea more concrete, consider the following dynamic game with incomplete information

ˆ There are two pure-strategy perfect Bayesian equilibria

(L, L′ , p = 1) and (R, R′ , p ≤ 1/2)

ˆ In (L, L′ ), player 2’s information set is on the equilibrium path, so Requirement 3 dictates that p = 1

ˆ In (R, R′ ), this information set is off the equilibrium path but Requirement 4 puts no restriction on p

ˆ We thus require only that 2’s belief p make the action R′ optimal – i.e., p ≤ 1/2

ˆ The key feature of this example is that M is a strictly dominated strategy for player 1

ˆ Thus, it is not reasonable for player 2 to believe that 1 might have played M

ˆ Formally, it is not reasonable for 1 − p to be positive, so p must equal one

ˆ Therefore, the PBE (R, R′ , p ≤ 1/2) is not reasonable leaving (L, L′ , p = 1) as the only PBE satisfying
this requirement

ˆ Although M is strictly dominated, L is not

ˆ If L were strictly dominated (for instance, if player 1’s payoff of 3 were, say, 3/2)

ˆ Then the same argument would imply that it is not reasonable for p to be positive, but this would
contradict the earlier result that p must be one

ˆ In such a case, the new requirement would not restrict player 2’s out-of-equilibrium beliefs

ˆ In the previous example, M is strictly dominated (in the whole game)

ˆ This strict dominance is too strong a test

ˆ We will require that player j should not believe that player i might have played a strategy that is strictly
dominated beginning at any information set

ˆ For example, consider the following modification of the previous game

– We expand the game in such a way that player 2 has a move preceding 1’s move and has two choices
at this initial move
– Either end the game or give the move to 1 at 1’s information set
– Now M is not any more strictly dominated because if 2 ends the game at the initial node then L,
M , and R all yield the same payoff

Definition

ˆ Consider an information set at which player i has the move

ˆ The strategy s′i is strictly dominated beginning at this information set if there exists another
strategy si such that

– for every belief that i could hold at the given information set
– for each possible combination of the other players’ subsequent strategies⁸

⁸ A “subsequent strategy” is a complete plan of action covering every contingency that might arise after the given information set
has been reached

ˆ Player i’s expected payoff from taking the action specified by si at the given information set and playing
the subsequent strategy specified by si

ˆ Is strictly greater than the expected payoff from taking the action and playing the subsequent strategy
specified by s′i

Requirement 5. If possible, each player’s beliefs off the equilibrium path should place zero probability
on nodes that are reached only if another player plays a strategy that is strictly dominated beginning at some
information set

ˆ The qualification “If possible” in Requirement 5 covers the case that would arise in the previous game if
R dominated both M and L (as would occur if player 1’s payoff of 3 were 3/2)

ˆ In such a case, Requirement 1 dictates that player 2 have a belief, but it is not possible for this belief to
place zero probability on the nodes following both M and L

ˆ So Requirement 5 would not apply

ˆ To illustrate Requirement 5, consider the following signaling game

– In the payoffs (3, 2), the payoff 3 is the Sender’s payoff

ˆ The Sender strategy (m′ , m′′ ) means that type t1 chooses a message m′ and type t2 chooses the message
m′′ , i.e., the Sender strategy m̃ = (m′ , m′′ ) is given by
m̃(t) = m′ if t = t1 , and m̃(t) = m′′ if t = t2

ˆ The Receiver strategy (a′ , a′′ ) means that the Receiver chooses action a′ following L and a′′ following R,
i.e., the Receiver strategy ã = (a′ , a′′ ) is given by
ã(m) = a′ if m = L, and ã(m) = a′′ if m = R

ˆ We can check that the strategies and beliefs

{(L, L), (u, d), p = 0.5, q}

constitute a pooling PBE for any q ≥ 1/2

ˆ The key feature of this signaling game, however, is that it makes no sense for t1 to play R

– The strategies in which t1 plays R are strictly dominated beginning at the Sender’s information set
corresponding to t1
– Showing that (R, L) and (R, R) are strictly dominated beginning at this information set amounts to
exhibiting an alternative strategy for the Sender that yields a higher payoff for t1 for each strategy
the Receiver could play
– (L, R) is such a strategy: it yields at worst 2 for t1 , whereas (R, L) and (R, R) yield at best 1

ˆ The t1 -node in the Receiver’s information set following R can be reached only if the Sender plays a strategy
that is strictly dominated

ˆ Furthermore, the t2 -node in the Receiver’s information set following R can be reached by a strategy that
is not strictly dominated beginning at an information set, namely (L, R)

ˆ Requirement 5 dictates that q = 0

ˆ Since {(L, L), (u, d), p = 0.5, q} is a PBE only if q ≥ 1/2, such an equilibrium cannot satisfy Requirement
5

ˆ An equivalent way to impose Requirement 5 on the signaling game is as follows

Definition 43. In a signaling game, the message mj from M is dominated for type ti from T if there
exists another message mj ′ from M such that ti ’s lowest possible payoff from mj ′ is greater than ti ’s highest
possible payoff from mj :

min over ak ∈ A of US (ti , mj ′ , ak ) > max over ak ∈ A of US (ti , mj , ak )

Signaling Requirement 5. If the information set following mj is off the equilibrium path and mj is domi-
nated for type ti then (if possible) the Receiver’s belief µ(ti |mj ) should place zero probability on type ti

ˆ This is possible provided mj is not dominated for all types in T

ˆ The separating PBE


{(L, R), (u, u), p = 1, q = 0}

satisfies Signaling Requirement 5 trivially because there are no information sets off this equilibrium path

ˆ Suppose now that the Receiver’s payoffs when type t2 plays R are reversed:

– 1 from playing d and 0 from playing u

ˆ Now
{(L, L), (u, d), p = 0.5, q}

is a pooling PBE for any value of q

ˆ So
{(L, L), (u, d), p = 0.5, q = 0}

is a pooling PBE satisfying Requirement 5

ˆ In some games, there are perfect Bayesian equilibria that seem unreasonable but nonetheless satisfy
Requirement 5

ˆ Cho and Kreps (QJE 1987) proposed an additional refinement

ˆ We propose to discuss three aspects of their paper

1. the “Beer and Quiche” signaling game, which illustrates that unreasonable perfect Bayesian equilibria
can satisfy Signaling Requirement 5
2. a stronger version of Signaling Requirement 5, called the Intuitive Criterion
3. the application of the Intuitive Criterion to Spence’s job-market signaling game

The Beer and Quiche game

ˆ The Sender is one of two types

– “wimpy” (timid, coward, unadventurous) with probability 0.1


– “surly” (unfriendly, hostile, bad-tempered, threatening) with probability 0.9

ˆ The Sender’s message is the choice of whether to have beer or quiche for breakfast

ˆ The Receiver’s action is the choice of whether or not to duel with the Sender

ˆ The qualitative feature of the payoffs are that

– the wimpy type would prefer to have quiche for breakfast, the surly would prefer to have beer
– both types would prefer not to duel with the Receiver (and care about this more than about which
breakfast they have)
– the Receiver would prefer to duel with the wimpy type but not to duel with the surly type

ˆ In this game,

{m∗ , a∗ , p = 0.1, q}

with m∗ (t) = Quiche for every type t, and a∗ (m) = not if m = Quiche, duel if m = Beer,

is a pooling PBE for any q ≥ 1/2

ˆ This equilibrium satisfies Signaling Requirement 5, because Beer is not dominated for either Sender type

ˆ The Receiver’s belief off the equilibrium path does seem suspicious

ˆ If the Receiver unexpectedly observes Beer then the Receiver concludes that the Sender is at least as
likely to be wimpy as surly (i.e., q ≥ 1/2) even though

(a) the wimpy type cannot possibly improve on the equilibrium payoff of 3 by having Beer rather than
Quiche
(b) the surly type could improve on the equilibrium payoff of 2, by receiving the payoff of 3 that would
follow if the Receiver held a belief q < 1/2

ˆ Given (a) and (b), one might expect the surly type to choose Beer and then make the following speech:
Seeing me choose Beer should convince you that I am the surly type:

– choosing Beer could not possibly have improved the lot of the wimpy type, by (a)
– if choosing Beer will convince you that I am the surly type then doing so will improve my lot, by (b)

ˆ If such a speech is believed, it dictates that q = 0, which is incompatible with this pooling PBE
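This reasoning can be mechanized. The Python sketch below uses the standard Cho–Kreps payoffs (an assumption here: the preferred breakfast is worth 1 and avoiding a duel is worth 2, consistent with the equilibrium payoffs 3 and 2 cited in (a) and (b)) to check for which type Beer is equilibrium-dominated:

```python
# Sender payoffs in Beer and Quiche, indexed by (type, message, Receiver action).
# These specific numbers are an assumption consistent with the text's (a) and (b).
payoff = {
    ('wimpy', 'Quiche', 'not'): 3, ('wimpy', 'Quiche', 'duel'): 1,
    ('wimpy', 'Beer', 'not'): 2,   ('wimpy', 'Beer', 'duel'): 0,
    ('surly', 'Beer', 'not'): 3,   ('surly', 'Beer', 'duel'): 1,
    ('surly', 'Quiche', 'not'): 2, ('surly', 'Quiche', 'duel'): 0,
}
eq_payoff = {'wimpy': 3, 'surly': 2}   # payoffs in the pooling-on-Quiche equilibrium

def equilibrium_dominated(t, m):
    # m is equilibrium-dominated for t if even the best response to m pays less
    return eq_payoff[t] > max(payoff[(t, m, a)] for a in ('not', 'duel'))

assert equilibrium_dominated('wimpy', 'Beer')       # (a): wimpy cannot gain from Beer
assert not equilibrium_dominated('surly', 'Beer')   # (b): surly could gain from Beer
```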

Definition 44. Given a PBE in a signaling game, the message mj from M is equilibrium-dominated
for type ti from T if ti ’s equilibrium payoff, denoted by U ∗ (ti ), is greater than ti ’s highest possible payoff
from mj :

U ∗ (ti ) > max over ak ∈ A of US (ti , mj , ak )

Signaling Requirement 6. If the information set following mj is off the equilibrium path and mj is
equilibrium-dominated for type ti then (if possible) the Receiver’s belief µ(ti |mj ) should place zero probability on
type ti . This is possible provided mj is not equilibrium-dominated for all types in T

ˆ “Beer and Quiche” shows that a message mj can be equilibrium-dominated for ti without being dominated
for ti

ˆ If mj is dominated for ti , however, then mj must be equilibrium-dominated for ti

ˆ So imposing Signaling Requirement 6 makes Signaling Requirement 5 redundant

ˆ Arguments in this spirit are sometimes said to use forward induction

– because interpreting a deviation – i.e., in forming the belief µ(ti |mj ) – the Receiver asks whether the
Sender’s past behavior could have been rational
– whereas backwards induction assumes that future behavior will be rational

Spence’s job-market signaling game

ˆ Consider the envy case of the job-market signaling model

ˆ There is an enormous number of pooling, separating and hybrid perfect Bayesian equilibria in this model

ˆ Only one is consistent with Signaling Requirement 6

– ti = L chooses e∗ (L)
– ti = H chooses es

ˆ Remember that, in any PBE, worker’s wage is

w(e) = µ(H|e) · y(H, e) + (1 − µ(H|e)) · y(L, e)

ˆ Because
y[L, e∗ (L)] − c[L, e∗ (L)] > w(e) − c(L, e) ∀e > es

any education level e > es is dominated for the low-ability type

– in terms of Signaling Requirement 5


– therefore, µ(H|e) = 1 for all e > es

ˆ There is no other separating PBE satisfying Signaling Requirement 5

ˆ For any PBE with e(H) = ê, ê > es , a deviation would be to choose e ∈ [es , ê)

– see the previous figure

ˆ in any equilibrium that satisfies Signaling Requirement 5, type-H’s utility must be at least

y(H, es ) − c(H, es )

otherwise, the worker would deviate to (w, e) = (y(H, es ), es )

ˆ Some pooling and hybrid equilibria cannot satisfy Signaling Requirement 5

ˆ There are two cases, depending on whether q is low enough that the wage function

w = q · y(H, e) + (1 − q) · y(L, e)

lies below the high-ability worker’s indifference curve through the point [es , y(H, es )]:

(a) q is low enough that it does
(b) q is not low enough, so it does not

ˆ First, consider case (a):

ˆ no pooling equilibria satisfy Signaling Requirement 5

– type-H worker cannot achieve the utility y(H, es ) − c(H, es ) in such an equilibrium

ˆ no hybrid equilibrium in which the type-H worker does the randomizing satisfies Signaling Requirement 5

– the point (e, w) at which pooling occurs in such an equilibrium lies below the wage function w =
q · y(H, e) + (1 − q) · y(L, e)

ˆ no hybrid equilibrium in which the type-L worker does the randomizing satisfies Signaling Requirement 5

– the point (e, w) at which pooling occurs in such an equilibrium must be on the type-L’s indifference
curve through the point [e∗ (L), w∗ (L)]
– and so lies below the type-H’s indifference curve through the point [es , y(H, es )]

Remark 29. In this case, there is only one PBE satisfying Signaling Requirement 5.

ˆ Second, consider the case (b)

ˆ no hybrid equilibrium in which the type-L worker does the randomizing satisfies Signaling Requirement 5

– the point (e, w) at which pooling occurs in such an equilibrium must be on the type-L’s indifference
curve through the point [e∗ (L), w∗ (L)]
– and so lies below the type-H’s indifference curve through the point [es , y(H, es )]

ˆ pooling and hybrid equilibria in which the type-H worker does the randomizing can satisfy this requirement

– if the pooling occurs at a point (e, w) in the shaded region of the figure

ˆ However, such equilibria cannot satisfy Signaling Requirement 6

ˆ Consider a pooling equilibrium at ep shown in the figure

ˆ Education choices e > e′ are equilibrium-dominated for type-L worker

– even the highest wage that could be paid to a worker with education e, y(H, e)
– yields an (e, w) point below the type-L’s indifference curve through the point (ep , wp )

ˆ Education choices between e′ and e′′ are not equilibrium-dominated for the type-H worker

– if such a choice convinces the firms that the worker has high ability,
– then the firms will offer the wage y(H, e)
– which will make type-H better off than in the indicated pooling equilibrium

ˆ Thus, if e′ < e < e′′ , Signaling Requirement 6 implies µ(H|e) = 1

ˆ This in turn implies that the indicated pooling equilibrium cannot satisfy Signaling Requirement 6

Observação 30. This argument can be repeated for all the pooling and hybrid equilibria in the shaded
region in the figure

ˆ so the only PBE that satisfies Signaling Requirement 6 is the separating equilibrium previously discussed

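The equilibrium-dominance argument can be illustrated numerically. The functional forms below are illustrative assumptions, not the ones used in the text: output is independent of education, y(L, e) = 1 and y(H, e) = 2, costs are c(L, e) = e² and c(H, e) = e²/2, the prior is q = 1/2, and pooling occurs at ep = 0.

```python
# Illustrative (assumed) primitives: y(L,e) = 1, y(H,e) = 2,
# c(L,e) = e^2, c(H,e) = e^2 / 2, prior q = 1/2, pooling at e_p = 0.
q = 0.5
y_L, y_H = 1.0, 2.0
w_p = q * y_H + (1 - q) * y_L      # pooling wage: 1.5; pooling payoff at e_p = 0

# e is equilibrium-dominated for type L once even the best possible wage
# y(H,e) leaves L below the pooling payoff: 2 - e^2 < 1.5  <=>  e > e1.
e1 = (y_H - w_p) ** 0.5            # plays the role of e'

# e is NOT equilibrium-dominated for type H while 2 - e^2/2 > 1.5 <=> e < e2.
e2 = (2 * (y_H - w_p)) ** 0.5      # plays the role of e''

# For any e in (e1, e2), Signaling Requirement 6 forces mu(H|e) = 1, the
# firms pay y(H,e) = 2, and the type-H worker gains 2 - e^2/2 > w_p = 1.5,
# so the pooling equilibrium at e_p = 0 fails the requirement.
print(e1 < e2)   # the interval (e', e'') is nonempty: prints True
```

With these numbers, e′ = √0.5 ≈ 0.71 and e″ = 1, so the interval of pooling-breaking deviations is nonempty, exactly as the figure-based argument asserts.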
Special Topics
Financial Instability (Bank Runs)
To be written.

Stable Marriages (Matching)


Matching: the Gale and Shapley (1962) algorithm

Consider two groups of agents, group M = {m1 , m2 , · · · , mn } and group W = {w1 , w2 , · · · , wn }, and define N := {1, 2, · · · , n}.
The goal is to associate each element of M with one, and only one, element of W. That is, we want to construct an injective function x : M → W.
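In code (a sketch, not part of the original notes), such an assignment for n = 3 can be stored as a dict and its injectivity checked directly:

```python
# A candidate assignment x : M -> W for n = 3, stored as {i: j},
# meaning x(m_i) = w_j.
x = {1: 2, 2: 1, 3: 3}

def is_injective(x):
    """x is injective iff no two m's are sent to the same w."""
    return len(set(x.values())) == len(x)

print(is_injective(x))                    # True
print(is_injective({1: 2, 2: 2, 3: 3}))   # False: m1 and m2 share w2
```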
For every a ∈ {m, w} and every i ∈ N , the preference relation of agent ai is denoted by ≻ai . The relation ≻ai is assumed to be strict, complete, and transitive. That is,
* for every wi ∈ W,

[(mk ≻wi ml ) ∨ (ml ≻wi mk )] ∧ ¬[(mk ≻wi ml ) ∧ (ml ≻wi mk )], ∀k, l ∈ N (1)
(mj ≻wi mk ) ∧ (mk ≻wi ml ) ⇒ (mj ≻wi ml ), ∀j, k, l ∈ N (2)

* for every mi ∈ M,

[(wk ≻mi wl ) ∨ (wl ≻mi wk )] ∧ ¬[(wk ≻mi wl ) ∧ (wl ≻mi wk )], ∀k, l ∈ N (3)
(wj ≻mi wk ) ∧ (wk ≻mi wl ) ⇒ (wj ≻mi wl ), ∀j, k, l ∈ N (4)

**Obs.:** In the notation above, ∨ denotes the disjunction "or", ∧ denotes the conjunction "and", and ¬ denotes the negation "not".
Hence, the preference relation of ai (denoted by ≻ai ) can be represented by a permutation of the set N .
**Example 1:** Suppose n = 3, a = w, and i = 2. Then N = {1, 2, 3} and ai = w2 . The set of possible preference relations of w2 is

Rw2 = {(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)}. (5)

The preference relation ≻w2 = (2, 1, 3), for example, means that agent w2 considers
* option m2 strictly better than option m1 ;
* option m2 strictly better than option m3 ;
* option m1 strictly better than option m3 .

In other words,
* whenever m2 is an available option for w2 , w2 chooses m2 ;
* w2 chooses m1 only when m2 is unavailable and m1 is available;
* w2 chooses m3 only when neither m2 nor m1 is available.
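The choice rule described in Example 1 can be sketched in Python. The encoding below — a permutation stored as a tuple and scanned left to right — is an illustrative convention, not code from the notes:

```python
# Preference of w2 over {m1, m2, m3}, encoded as the permutation (2, 1, 3):
# entries are listed from most to least preferred.
pref_w2 = (2, 1, 3)

def best_available(pref, available):
    """Index of the most preferred option in `available` under `pref`."""
    for option in pref:        # scan in decreasing order of preference
        if option in available:
            return option
    return None                # nothing is available

print(best_available(pref_w2, {1, 2, 3}))  # m2 whenever available: 2
print(best_available(pref_w2, {1, 3}))     # m2 gone, m1 available: 1
print(best_available(pref_w2, {3}))        # only m3 left: 3
```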

**Example 2 (full description at [ams.org](http://www.ams.org/samplings/feature-column/fc-2015-03)):**
Suppose n = 4 and that the preference relations of mi and wi are given by the i-th row of the matrices M and W , respectively.

    ⎡ w1 w2 w3 w4 ⎤         ⎡ m4 m3 m1 m2 ⎤
M = ⎢ w1 w4 w3 w2 ⎥ ,   W = ⎢ m2 m4 m1 m3 ⎥         (6)
    ⎢ w2 w1 w3 w4 ⎥         ⎢ m4 m1 m2 m3 ⎥
    ⎣ w4 w2 w3 w1 ⎦         ⎣ m3 m2 m1 m4 ⎦

For example, the preference of m3 is given by ≻m3 = (2, 1, 3, 4).


The algorithm:
The goal of the algorithm is to compute the injective function x : M → W. This function is obtained as the limit of a sequence of functions {x0 , x1 , . . . , xT } with xt : M → W for every t. The value of T is chosen so that xT = xT −1 . The algorithm proceeds as follows:
Step 0: The first element of the sequence, x0 , is chosen in this step. Each m ∈ M is associated with the element of W most preferred by m, i.e.,

x0 (m) = min{≻m^(i) : wi ∈ W}, ∀m ∈ M, (7)

where ≻m^(i) denotes the i-th entry of ≻m .

ˆ Step 0.1: Consider an arbitrary element w ∈ W. The set of elements of M associated with w via x0 is

Mw^0 = {m ∈ M : x0 (m) = w} ⊆ M. (8)

From this set, w picks its most preferred element, min{≻w^(i) : mi ∈ Mw^0 }, and rejects the others.

ˆ Step 0.2: Consider an arbitrary element m ∈ M. The set of elements of W that have not rejected m is

Wm^0 = W \ {w ∈ W : (x0 (m) = w) ∧ (m ≠ min{≻w^(i) : mi ∈ Mw^0 })} ⊆ W. (9)

Step k > 0: Each m ∈ M is associated with the element of Wm^{k−1} most preferred by m, i.e.,

xk (m) = min{≻m^(i) : wi ∈ Wm^{k−1} }, ∀m ∈ M. (10)

ˆ Step k.1: Consider an arbitrary element w ∈ W. The set of elements of M associated with w via xk is

Mw^k = {m ∈ M : xk (m) = w} ⊆ M.

From this set, w picks its most preferred element, min{≻w^(i) : mi ∈ Mw^k }, and rejects the others.

ˆ Step k.2: Consider an arbitrary element m ∈ M. The set of elements of W that have not yet rejected m is

Wm^k = Wm^{k−1} \ {w ∈ W : (xk (m) = w) ∧ (m ≠ min{≻w^(i) : mi ∈ Mw^{k−1} })} ⊆ W.

Final step: The process continues until the iteration k at which xk (m) = xk−1 (m) for every m ∈ M. At this step, the algorithm sets x(m) = xk (m) for every m ∈ M.

Implementation in Python:

The implementation below uses the preference relations from Example 2 above.

import numpy as np

def initiate_Wmt():
    "Initialize the set of w's that have not yet rejected m as the set of all w's"
    Wm_t = {}
    for m in N:
        Wm_t[m-1] = set()
        for w in N:
            Wm_t[m-1].add(w)
    print('Wm_t =', Wm_t)
    return Wm_t

def compute_xt(Wm_t):
    "Given the set $W_m^t$ of w's that have not yet rejected m, compute $x_t(m)$"
    x_t = np.zeros((1, n), dtype=np.int8)
    for m in N:
        tt = n
        for w in Wm_t[m-1]:
            for ii in range(tt):
                if M[m-1, ii] == w:
                    x_t[0, m-1] = w
                    tt = ii
                    break
    print('x_t =', x_t)
    return x_t

def update_Mwt(x_t):
    "Given the assignment $x_t$, compute the set of m's assigned to each w"
    Mw_t = {}
    for w in N:
        Mw_t[w-1] = set()
        for m in N:
            if x_t[0, m-1] == w:
                Mw_t[w-1].add(m)
    print('Mw_t =', Mw_t)
    return Mw_t

def compute_xxt(Mw_t):
    "Given the set of m's assigned to each w, compute w's acceptance $xx_t$"
    xx_t = np.zeros((1, n), dtype=np.int8)
    for w in N:
        tt = n
        for m in Mw_t[w-1]:
            for ii in range(tt):
                if W[w-1, ii] == m:
                    xx_t[0, w-1] = m
                    tt = ii
                    break
    print('xx_t =', xx_t)
    return xx_t

def update_Wmt(x_t, xx_t, Wmt):
    "Using the assignment $x_t$ and w's acceptance $xx_t$, update the set of w's that have not yet rejected m"
    Wm_t = {}
    for m in N:
        Wm_t[m-1] = set()
        for w in Wmt[m-1]:
            if x_t[0, m-1] != w or m == xx_t[0, w-1]:
                Wm_t[m-1].add(w)
    print('Wm_t =', Wm_t)
    return Wm_t

# THIS IS THE MAIN PROGRAM...

n = 4

print(' Definição de parâmetros...')

N = np.linspace(1, n, n, endpoint=True, dtype=np.int8)
print('N =', N, end='\n\n')

# Initialize the preferences of each group of agents
M = np.array([[1, 2, 3, 4], [1, 4, 3, 2], [2, 1, 3, 4], [4, 2, 3, 1]])
W = np.array([[4, 3, 1, 2], [2, 4, 1, 3], [4, 1, 2, 3], [3, 2, 1, 4]])

print('M =')
print(M, end='\n\n')
print('W =')
print(W, end='\n\n')

print(' Inicialização de objetos...')

# Initialize the set of w's that have not yet rejected the m's
Wm_t = initiate_Wmt()

# Initialize the assignments: $x_0$
x_t = compute_xt(Wm_t)

# Initialize the set of m's assigned to each w
Mw_t = update_Mwt(x_t)

# Initialize each w's choice of m
xx_t = compute_xxt(Mw_t)

print('\n\nIteração até convergência...')

norm, it = 1, 0
while norm != 0:
    it += 1
    print('\nIteração', it)
    # Update the set of w's that have not yet rejected the m's
    Wm_t = update_Wmt(x_t, xx_t, Wm_t)

    # Store current proposals $x_t$ and compute $x_{t+1}$
    old_xt = np.copy(x_t)
    x_t = compute_xt(Wm_t)

    # Update $M_w^t$
    Mw_t = update_Mwt(x_t)

    # Update $xx_t$
    xx_t = compute_xxt(Mw_t)

    # Update norm
    norm = max(abs(x_t - old_xt)[0])
    print('norma =', norm, end='\n\n')

The program produces the following output:

Definição de parâmetros...
N = [1 2 3 4]

M =
[[1 2 3 4]
 [1 4 3 2]
 [2 1 3 4]
 [4 2 3 1]]

W =
[[4 3 1 2]
 [2 4 1 3]
 [4 1 2 3]
 [3 2 1 4]]

Inicialização de objetos...
Wm_t = {0: {1, 2, 3, 4}, 1: {1, 2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3, 4}}
x_t = [[1 1 2 4]]
Mw_t = {0: {1, 2}, 1: {3}, 2: set(), 3: {4}}
xx_t = [[1 3 0 4]]

Iteração até convergência...

Iteração 1
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3, 4}}
x_t = [[1 4 2 4]]
Mw_t = {0: {1}, 1: {3}, 2: set(), 3: {2, 4}}
xx_t = [[1 3 0 2]]
norma = 3

Iteração 2
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 2, 3, 4}, 3: {1, 2, 3}}
x_t = [[1 4 2 2]]
Mw_t = {0: {1}, 1: {3, 4}, 2: set(), 3: {2}}
xx_t = [[1 4 0 2]]
norma = 2

Iteração 3
Wm_t = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[1 4 1 2]]
Mw_t = {0: {1, 3}, 1: {4}, 2: set(), 3: {2}}
xx_t = [[3 4 0 2]]
norma = 1

Iteração 4
Wm_t = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[2 4 1 2]]
Mw_t = {0: {3}, 1: {1, 4}, 2: set(), 3: {2}}
xx_t = [[3 4 0 2]]
norma = 1

Iteração 5
Wm_t = {0: {3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[3 4 1 2]]
Mw_t = {0: {3}, 1: {4}, 2: {1}, 3: {2}}
xx_t = [[3 4 1 2]]
norma = 1

Iteração 6
Wm_t = {0: {3, 4}, 1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 3}}
x_t = [[3 4 1 2]]
Mw_t = {0: {3}, 1: {4}, 2: {1}, 3: {2}}
xx_t = [[3 4 1 2]]
norma = 0
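As a sanity check (not part of the original notes), the limit matching x_t = [3, 4, 1, 2] — that is, m1 → w3, m2 → w4, m3 → w1, m4 → w2 — can be verified to be stable under the preferences of Example 2: no pair (m, w) prefers each other to their assigned partners.

```python
# Preference matrices from Example 2 (row i = preference list of m_i / w_i).
M = [[1, 2, 3, 4], [1, 4, 3, 2], [2, 1, 3, 4], [4, 2, 3, 1]]
W = [[4, 3, 1, 2], [2, 4, 1, 3], [4, 1, 2, 3], [3, 2, 1, 4]]

# Matching produced by the algorithm: m1->w3, m2->w4, m3->w1, m4->w2.
x = {1: 3, 2: 4, 3: 1, 4: 2}

def rank(pref_row, option):
    """Position of `option` in a preference list (0 = most preferred)."""
    return pref_row.index(option)

def is_stable(x, M, W):
    """True iff no pair (m, w) prefers each other to their assigned partners."""
    partner_of_w = {w: m for m, w in x.items()}
    for m, w_m in x.items():
        for w in range(1, 5):
            if w == w_m:
                continue
            m_w = partner_of_w[w]
            # (m, w) blocks x if m prefers w to w_m AND w prefers m to m_w.
            if rank(M[m-1], w) < rank(M[m-1], w_m) and rank(W[w-1], m) < rank(W[w-1], m_w):
                return False
    return True

print(is_stable(x, M, W))   # True: the computed matching has no blocking pair
```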

References
J. Bertrand. Review of Recherches sur le principe mathématique de la théorie des richesses. Journal des Savants,
499, 1883.

I.-K. Cho and D. M. Kreps. Signaling games and stable equilibria. The Quarterly Journal of Economics, 102
(2):179–221, 1987.

A.-A. Cournot. Recherches sur les principes mathématiques de la théorie des richesses par Augustin Cournot.
chez L. Hachette, 1838.

D. W. Diamond and P. H. Dybvig. Bank runs, deposit insurance, and liquidity. Journal of Political Economy,
91(3):401–419, 1983.

M. P. Espinosa and C. Rhee. Efficient wage bargaining as a repeated game. The Quarterly Journal of Economics,
104(3):565–588, 1989.

J. W. Friedman. A non-cooperative equilibrium for supergames. The Review of Economic Studies, 38(1):1–12,
1971.

D. Gale and L. S. Shapley. College admissions and the stability of marriage. The American Mathematical
Monthly, 69(1):9–15, 1962.

R. Gibbons. Game Theory for Applied Economists. Princeton University Press, 1992. ISBN 9781400835881.

J. C. Harsanyi. Games with incomplete information played by "Bayesian" players, I–III. Part II. Bayesian
equilibrium points. Management Science, pages 320–334, 1968.

J. C. Harsanyi. Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points.
International Journal of Game Theory, 2(1):1–23, 1973.

E. P. Lazear and S. Rosen. Rank-order tournaments as optimum labor contracts. Journal of Political Economy,
89(5):841–864, 1981.

W. Leontief. The pure theory of the guaranteed annual wage contract. Journal of Political Economy, 54(1):
76–79, 1946.

J. F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1):
48–49, 1950.

H. von Stackelberg. Marktform und Gleichgewicht. J. Springer, 1934.

