
DYNAMIC GAME OF COMPLETE INFORMATION

Dynamic games of complete information are games played sequentially. They are depicted in extensive form, which specifies: the players; when each player gets to move; the choices and information available to each player at each move; and the payoffs once the sequence of moves ends. This type of game is best explained by the following example. Suppose player 2 (say, me) surprises player 1 (you) as you leave an ATM with 40,000. I say, "Give me the money or I will blow us both up!" The game can then be depicted in the following extensive form, also known as a game tree; game trees are usually drawn as follows.

[Figure 1: Extensive form (game tree) of the bomb-threat game. Player 1 chooses Hand Over or Keep Money; at each of player 2's two nodes, player 2 chooses Blow Up or Don't. Payoffs (player 1's payoff, player 2's payoff): (Hand Over, Blow Up) = (-11, -9); (Hand Over, Don't) = (-1, +1); (Keep Money, Blow Up) = (-10, -10); (Keep Money, Don't) = (0, 0).]
The game in extensive form has the following structure. The tree has an initial node and other nodes. Every node except the initial node has predecessors, and the chain of predecessors gives a single path back to the initial node. Intuitively, a node determines the entire sequence of moves up to that point. Finally, there are the terminal nodes: they are not predecessors of anything and indicate all possible outcomes. Dynamic games of complete information may also include a nonplayer whose moves occur randomly with some positive probability. This nonplayer is called nature, and its moves are called moves by nature. Nature is neither player 1 nor player 2; some texts denote nature by player 0. In an extensive form, nature is depicted by a hollow circle, with the probabilities of its moves in parentheses. For example,


[Figure 2: A move by nature. The citizen chooses Commit crime or Don't; after Commit crime, nature (hollow circle) determines Get caught (probability 0.3) or Not caught (probability 0.7).]

A game of complete information is not the same thing as a game of perfect information. Complete information in game theory means that each player fully knows the other players' available strategies and payoffs; but in a dynamic game, players may not know which moves the others have already made. We therefore distinguish dynamic games with complete and perfect information from dynamic games with complete but imperfect information. Examples of dynamic games with complete and perfect information are chess and Stackelberg competition; examples of dynamic games with complete but imperfect information are poker and repeated simultaneous-move games. Hence, a game of complete but imperfect information means that players know each other's available strategies but do not observe which ones have actually been used. In extensive form, a game with imperfect information is depicted using information sets. Since a player does not observe all earlier moves, we partition each player's decision nodes into subsets called information sets. When player i has to move, he observes the information set containing the actual node, but he may not know which node within it has been reached. Consider the following Prisoner's Dilemma in extensive form.

[Figure 3: Prisoner's Dilemma in extensive form. Player 1 chooses Collude or Defect; player 2, whose two decision nodes lie in a single information set, then chooses Collude or Defect without observing player 1's move. Payoffs (player 1, player 2): (Collude, Collude) = (100, 100); (Collude, Defect) = (25, 120); (Defect, Collude) = (120, 25); (Defect, Defect) = (80, 80).]

Sometimes the information set is instead drawn as an oval enclosing the nodes, as follows:


[Figure 4: The same game as Figure 3, with player 2's information set drawn as an oval enclosing his two nodes; payoffs are as in Figure 3.]

As you can see, in a game of perfect information each information set contains exactly one node, whereas in a game of imperfect information an information set may contain more than one node; the information set in Figure 3 (or Figure 4) contains two nodes. Reconsider Figure 1.

[Figure 1, repeated: the bomb-threat game, with payoffs as given above.]

Player 1 has two pure strategies, Hand Over and Keep Money, which we abbreviate as H and K. How many strategies does player 2 have? You might think that player 2 has two pure strategies, Blow Up and Don't Blow Up, abbreviated B and D. However, this does not account for the fact that player 2 can act differently at each of his information sets. Therefore, player 2 has four pure strategies: (BB), (BD), (DB), and (DD). The first element in the parentheses gives player 2's choice if player 1 chooses H, and the second element gives player 2's choice if player 1 chooses K. For example, (BD) means that player 2 will blow both of them up if player 1 chooses H and will not blow up if player 1 chooses K. Hence, the strategy set of player 1 is {H, K} and player 2's strategy set is {BB, BD, DB, DD}. If player 1 instead had 3 strategies, it is left to the reader to work out how many strategies player 2 would have.
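To see the counting concretely, here is a minimal Python sketch (my own illustration, not part of the original text) that builds player 2's pure strategies as one action per information set; the labels H, K, B, and D are the ones used above.

```python
from itertools import product

# Player 2 has one information set for each of player 1's moves,
# and a pure strategy must specify an action (B or D) at each of them.
p1_moves = ["H", "K"]        # player 1's pure strategies
p2_actions = ["B", "D"]      # actions available at each of player 2's nodes

# A pure strategy for player 2 is a complete contingent plan:
# one action for every information set.
p2_strategies = list(product(p2_actions, repeat=len(p1_moves)))
print(p2_strategies)  # [('B','B'), ('B','D'), ('D','B'), ('D','D')]

# More generally, with k information sets and m actions at each,
# player 2 has m**k pure strategies.
print(len(list(product(p2_actions, repeat=3))))  # 8 strategies if player 1 had 3 moves
```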


We can write this extensive form in the strategic form of the bomb-threat game as follows.

1 \ 2    BB          BD         DB          DD
H        -11, -9     -11, -9    -1, 1       -1, 1
K        -10, -10    0, 0       -10, -10    0, 0
Table 1

Now consider the dominant strategies of each player. We see that DD weakly dominates all other strategies of player 2, and DB weakly dominates BB. To find the Nash Equilibria, circle the payoffs corresponding to the players' best responses. There are three pure strategy Nash Equilibria: (H, DB), (K, BD), and (K, DD). Pause and think about these strategies for a moment. (H, DB) means that player 1 hands over the money and player 2 chooses not to blow up if player 1 hands over but will do so if player 1 keeps the money. (K, BD) means that player 1 keeps the money and player 2 will blow up if player 1 hands over but will not do so if player 1 keeps the money. The first of these seems reasonable, but the second sounds strange. Both, however, involve threats that lie off the equilibrium path: player 2 threatens to use B in circumstances that will not occur. Player 2's threat to use B is not credible, since player 2 is better off playing D regardless of player 1's choice. We have learned in class how to find the rational choice by backward induction, sometimes called backward recursion.
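The circling of best responses can also be checked mechanically. Below is a small Python sketch (payoffs taken from Table 1; the function name and structure are my own) that enumerates all pure strategy profiles and keeps those from which neither player can profitably deviate.

```python
# Strategic form of the bomb-threat game (Table 1).
# payoffs[(s1, s2)] = (player 1's payoff, player 2's payoff)
payoffs = {
    ("H", "BB"): (-11, -9), ("H", "BD"): (-11, -9), ("H", "DB"): (-1, 1),    ("H", "DD"): (-1, 1),
    ("K", "BB"): (-10, -10), ("K", "BD"): (0, 0),   ("K", "DB"): (-10, -10), ("K", "DD"): (0, 0),
}
S1 = ["H", "K"]
S2 = ["BB", "BD", "DB", "DD"]

def pure_nash_equilibria():
    eqs = []
    for s1 in S1:
        for s2 in S2:
            u1, u2 = payoffs[(s1, s2)]
            # s1 must be a best response to s2, and s2 a best response to s1
            best1 = all(payoffs[(t1, s2)][0] <= u1 for t1 in S1)
            best2 = all(payoffs[(s1, t2)][1] <= u2 for t2 in S2)
            if best1 and best2:
                eqs.append((s1, s2))
    return eqs

print(pure_nash_equilibria())  # [('H', 'DB'), ('K', 'BD'), ('K', 'DD')]
```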

[Figure 5: The bomb-threat game tree; backward induction starts at player 2's two nodes.]

Consider player 2's left node: if player 1 chooses H, what will 2 choose at that node? Clearly, 2 chooses D. Likewise, at the right node, if player 1 chooses K, 2 will again choose D.

[Figure 6: The bomb-threat game tree with player 2's choice of D fixed at both nodes; player 1 compares H and K.]

Now player 1 recognizes that it is in 2's best interest to choose D regardless of player 1's strategy. Player 1 therefore chooses between H and K given 2's choices: if player 1 chooses H, he gets a payoff of -1; if he chooses K, he gets 0. Therefore player 1 chooses K.


[Figure 7: The bomb-threat game tree with the backward-induction solution: player 2 chooses D at both nodes and player 1 chooses K.]

Similarly, any finite game of perfect information can be solved by backward recursion. If there is no tie at any point in the process, there is a unique equilibrium; ties lead to multiple equilibria. A non-finite game that can be solved by backward recursion is the Stackelberg oligopoly game.

We will study the Stackelberg Oligopoly in the next chapter.
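Backward recursion is easy to mechanize on a finite tree of perfect information. The sketch below uses my own encoding of the bomb-threat game (it is not code from the text): at each decision node the mover picks the child that maximizes his own payoff, working from the terminal nodes upward.

```python
# A minimal game-tree encoding: a decision node is (player_index, {action: child});
# a terminal node is just the payoff tuple (player 1's payoff, player 2's payoff).
bomb_threat = (0, {
    "Hand over":  (1, {"Blow up": (-11, -9), "Don't": (-1, 1)}),
    "Keep money": (1, {"Blow up": (-10, -10), "Don't": (0, 0)}),
})

def backward_induction(node, history=()):
    """Solve a finite perfect-information tree by backward recursion,
    printing the optimal action at every decision node."""
    if isinstance(node[1], dict):                      # decision node
        player, moves = node
        results = {a: backward_induction(child, history + (a,))
                   for a, child in moves.items()}
        best = max(results, key=lambda a: results[a][player])
        print(f"after {history or 'the start'}: player {player + 1} chooses {best!r}")
        return results[best]
    return node                                        # terminal payoff tuple

print("equilibrium payoffs:", backward_induction(bomb_threat))
# Player 2 chooses Don't at both nodes, so player 1 keeps the money: payoffs (0, 0).
```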

Consider the next example, in which the game is of the imperfect information type. We model a situation in which a citizen decides whether to commit a crime, and a judge must decide whether to punish the citizen. The judge has imperfect information: an innocent citizen might be scared and therefore look suspicious, and the judge observes only whether the citizen looks suspicious. A guilty citizen who has committed a crime always appears suspicious, but an innocent citizen looks suspicious with probability 1/20. Hence, an innocent citizen is revealed as innocent with probability 19/20. We can write the game tree as follows.

[Figure 8: The judge game. The citizen chooses Commit or Don't. After Don't, nature determines whether the innocent citizen looks suspicious (probability 1/20) or looks innocent (probability 19/20). The judge observes only the citizen's appearance and chooses P (punish) or N (not punish) at each of his two information sets. Payoffs (citizen, judge): Commit and punished (-1, 1); Commit and not punished (1, -1); an innocent citizen punished (-1, -1); an innocent citizen not punished (0, 0).]

Each node in an information set must have the same number of branches coming out of it; otherwise, one could tell which node had been reached by counting the available choices. The judge has 2 information sets and 4 pure strategies: PP, PN, NP, and NN. Recall that in any dynamic game of perfect information, one can solve for Nash Equilibria by backward recursion. Before we go on, let us examine the meaning of subgames. A subgame is a part of a game that can be considered a game in its own right. Each game is a subgame of itself, and the other subgames are called proper subgames (just as with subsets and proper subsets). We often require an equilibrium to make sense not only in the whole game but in every subgame. This is the notion of subgame perfect Nash Equilibrium, abbreviated SPE.


A joint strategy is a subgame perfect Nash Equilibrium if it induces a Nash Equilibrium in every subgame. Any SPE is also a Nash Equilibrium. Why? First, what is a subgame? It has an initial node, which may or may not have predecessors; it includes all successors of that node, and it includes all other nodes in the same information set as any node in the subgame. In other words, a node x is said to define a subgame if, whenever y is a node following x and z is in the same information set as y, then z must also follow x. This is best explained by the figures below. We can now also answer the question above of why an SPE is also a Nash Equilibrium: the whole game is a subgame of itself, so an SPE must induce a Nash Equilibrium in the whole game.

[Figure 9: x defines a subgame.]

[Figure 10: x does not define a subgame.]
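The condition for x to define a subgame can be checked mechanically once the tree and its information sets are written down. The sketch below uses a hypothetical encoding of my own (node ids, a children map, and an information-set map); following the usual convention, it also requires x to be alone in its own information set.

```python
# Hypothetical encoding of a tree: children[node] lists its successors,
# info_set[node] names the information set containing that node.
children = {
    "root": ["x", "w"],
    "x": ["y1", "y2"], "w": ["y3"],
    "y1": [], "y2": [], "y3": [],
}
info_set = {"root": "I0", "x": "I1", "w": "I2",
            "y1": "I3", "y2": "I3", "y3": "I3"}  # y3 shares an info set with y1, y2

def descendants(node):
    out, stack = set(), [node]
    while stack:
        for c in children[stack.pop()]:
            out.add(c)
            stack.append(c)
    return out

def defines_subgame(x):
    """x defines a subgame iff x is alone in its information set and every node
    sharing an information set with a descendant of x is itself a descendant of x."""
    if sum(1 for n in info_set if info_set[n] == info_set[x]) > 1:
        return False
    below = descendants(x) | {x}
    return all(m in below
               for y in below
               for m in info_set
               if info_set[m] == info_set[y])

print(defines_subgame("x"))     # False: y3 shares an info set with y1 but does not follow x
print(defines_subgame("root"))  # True: the whole game is a subgame of itself
```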

We have seen that backward induction strategies constitute a subgame perfect Nash Equilibrium. This technique is nicely tailored to games with perfect information. It does not, however, immediately extend to games of complete but imperfect information. Consider figure 11, in which player 1 has the option of playing a coordination game with player 2. Let us try to apply the backward induction technique to it.

[Figure 11: Player 1 first chooses IN or OUT; OUT ends the game with payoffs (2, 2). After IN, player 1 chooses L or R, and player 2, whose two nodes form a single information set, chooses l or r without observing player 1's choice. Payoffs (player 1, player 2): (L, l) = (1, 3); (L, r) = (0, 0); (R, l) = (0, 0); (R, r) = (3, 1).]

As before, the first step is to locate all information sets such that, whatever action is chosen at that information set, the game subsequently ends.


For the game in figure 11, this isolates player 2's information set, i.e., the point in the game reached after player 1 has chosen IN and then L or R. Note that when it is player 2's turn to play, taking either action l or r ends the game. Now, according to the backward induction algorithm, the next step is to choose an optimal action for player 2 there. But here we are in trouble, since it is not at all clear which action is optimal for player 2: player 2's best action depends on the action taken by player 1. If player 1 chose L, then 2's best action is l, whereas if player 1 chose R, then 2 should instead choose r. There is no immediate way out of this difficulty because, by definition of the information set, player 2 does not know which action player 1 has taken. The way to proceed is to consider the subgame as a game in its own right. See figure 12.

[Figure 12: The coordination subgame following IN, considered as a game in its own right. Player 1 chooses L or R; player 2, without observing this choice, chooses l or r. Payoffs: (L, l) = (1, 3); (L, r) = (0, 0); (R, l) = (0, 0); (R, r) = (3, 1).]

Since this is a game with imperfect information, we can write it in strategic form as follows.

1 \ 2    l       r
L        1, 3    0, 0
R        0, 0    3, 1

As you can check, there are two pure strategy Nash equilibria of this subgame, (L,l) and (R, r). There is also a mixed strategy Nash equilibrium, but the discussion will be simplified if we ignore this for the time being. Let us suppose that when this subgame is reached in the course of playing the original game, one of these Nash equilibria will be played. Suppose we arbitrarily choose (L,l) as a Nash equilibrium. Consequently, the resulting payoff vector will be (1, 3) if this subgame is reached. We now can proceed by backward induction by replacing this entire subgame with the resulting payoff vector (1, 3). See figure 13. Once done, it is clear that player 1 will choose OUT at his first decision node.


[Figure 13: Left: the game of Figure 11. Right: the reduced game in which the entire coordination subgame has been replaced by the payoff vector (1, 3), so player 1 chooses between IN, yielding (1, 3), and OUT, yielding (2, 2).]
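The two-step procedure just described can be written out directly: compute the Nash equilibria of the subgame's strategic form, pick one, and fold its payoff vector back into player 1's first move. A minimal sketch, with payoffs from Figures 11-13 (all function and variable names are mine):

```python
# Strategic form of the coordination subgame reached after IN (Figure 12).
subgame = {("L", "l"): (1, 3), ("L", "r"): (0, 0),
           ("R", "l"): (0, 0), ("R", "r"): (3, 1)}
rows, cols = ["L", "R"], ["l", "r"]

def pure_nash(payoffs, rows, cols):
    eqs = []
    for a in rows:
        for b in cols:
            u1, u2 = payoffs[(a, b)]
            if all(payoffs[(a2, b)][0] <= u1 for a2 in rows) and \
               all(payoffs[(a, b2)][1] <= u2 for b2 in cols):
                eqs.append((a, b))
    return eqs

equilibria = pure_nash(subgame, rows, cols)
print(equilibria)                 # [('L', 'l'), ('R', 'r')]

# Step 2: arbitrarily pick one subgame equilibrium and fold its payoff vector
# back into player 1's first move (Figure 13).
continuation = subgame[equilibria[0]]          # (1, 3) under (L, l)
reduced = {"IN": continuation, "OUT": (2, 2)}
first_move = max(reduced, key=lambda a: reduced[a][0])   # player 1 maximizes own payoff
print(first_move)                 # 'OUT', since 2 > 1
```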

Altogether, the strategies derived are as follows: for player 1, OUT at his first decision node and L at his second; for player 2, l at his information set. Not only do these strategies constitute a Nash equilibrium in the subgame, they form a Nash equilibrium of the original game as well. Hence, we have seen an example verifying the claim that a subgame perfect Nash equilibrium must induce a Nash equilibrium in every subgame. Recall that there are two pure strategy Nash equilibria in the subgame, and we arbitrarily chose one of them. Had we chosen the other, the resulting strategies would have been quite different; nonetheless, those strategies, too, are subgame perfect. Now consider figure 14, in which there is a Nash equilibrium that is not subgame perfect.

[Figure 14: Left: a game tree in which the arrows depict a Nash equilibrium that is not subgame perfect; a proper subgame begins at player 2's node. Right: that subgame in isolation, with the arrow indicating a deviation that strictly improves player 2's payoff within the subgame.]


The pure strategy profile depicted by the arrows in the left game tree is a Nash equilibrium because neither player can improve his payoff by switching strategies, given the strategy of the other player. However, it is not subgame perfect. To see this, note that the strategies induced in the subgame beginning at player 2's node do not constitute a Nash equilibrium of that subgame. This is shown in the right game tree, where the subgame has been isolated and the arrow indicates a deviation that strictly improves player 2's payoff in that subgame.

Now consider the prisoner's dilemma repeated twice. Each player's payoff is the sum of her payoffs in the two rounds (discounting could also be allowed). At the end of round 1, both players' choices are observed by both players, and then round 2 choices are made. Consider an extensive form for the entire repeated game: in the figure on the next page, how many subgames are there? First, consider the following stage-game payoff matrix.

1 \ 2      Collude    Defect
Collude    5, 5       0, 6
Defect     6, 0       1, 1
Table 2
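To fix ideas, a player's total payoff in the twice-repeated game is her round-1 payoff plus her (possibly discounted) round-2 payoff. A tiny sketch using the stage payoffs of Table 2 (the discount factor argument and the encoding are my own illustration):

```python
# Stage-game payoffs from Table 2: stage[(a1, a2)] = (payoff to 1, payoff to 2)
stage = {("C", "C"): (5, 5), ("C", "D"): (0, 6),
         ("D", "C"): (6, 0), ("D", "D"): (1, 1)}

def repeated_payoff(round1, round2, delta=1.0):
    """Total payoffs when (a1, a2) are played in round 1 and round 2."""
    u1 = stage[round1]
    u2 = stage[round2]
    return tuple(x + delta * y for x, y in zip(u1, u2))

print(repeated_payoff(("C", "C"), ("D", "D"), delta=1.0))   # (6, 6): collude, then defect
print(repeated_payoff(("D", "D"), ("D", "D"), delta=1.0))   # (2, 2): defect in both rounds
```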

[Figure 15: Extensive form of the twice-repeated prisoner's dilemma. In round 1, player 1 chooses C or D and player 2 chooses C or D without observing player 1's move; after both round-1 choices are observed, the same simultaneous-move game is played again in round 2.]

There are 5 subgames in total: the whole game itself and the 4 proper subgames beginning at player 1's second-round nodes. Before moving on, we reconsider the judge game to see how nature takes part in the analysis.


[Figure 16: The judge game of Figure 8 again, with its only proper subgame indicated: the subgame beginning at the judge's singleton node, reached when an innocent citizen looks innocent. The judge's other node lies in a two-node information set and therefore does not begin a subgame.]

This can be written as a strategic form game using expected payoffs, as follows.

Citizen \ Judge    PP        PN              NP                NN
Commit a crime     -1, 1     -1, 1           1, -1             1, -1
Don't commit       -1, -1    -1/20, -1/20    -19/20, -19/20    0, 0
Table 3
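The entries of Table 3 are just expectations over nature's move; for example, against PN an innocent citizen is punished only when he happens to look suspicious. A quick sketch of the calculation (terminal payoffs reconstructed to be consistent with Table 3; names are mine):

```python
# Terminal payoffs (citizen, judge) consistent with Table 3:
#   guilty & punished: (-1, 1);   guilty & not punished: (1, -1)
#   innocent & punished: (-1, -1); innocent & not punished: (0, 0)
p_suspicious = 1 / 20            # probability an innocent citizen looks suspicious

def expected_payoff(citizen, judge_strategy):
    """judge_strategy = (action if suspicious, action if looks innocent)."""
    punish_suspicious, punish_innocent_look = (a == "P" for a in judge_strategy)
    if citizen == "Commit":      # a guilty citizen always looks suspicious
        return (-1, 1) if punish_suspicious else (1, -1)
    # innocent citizen: average over nature's draw of appearance
    u_sus = (-1, -1) if punish_suspicious else (0, 0)
    u_inn = (-1, -1) if punish_innocent_look else (0, 0)
    return tuple(p_suspicious * a + (1 - p_suspicious) * b for a, b in zip(u_sus, u_inn))

print(expected_payoff("Dont", "PN"))   # (-0.05, -0.05)  =  (-1/20, -1/20)
print(expected_payoff("Dont", "NP"))   # (-0.95, -0.95)  =  (-19/20, -19/20)
```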

There exists a unique pure strategy Nash Equilibrium: (Commit a crime, PP). Doesn't it sound strange that it is in people's best interest to commit a crime and that the judge always punishes, even some innocent people? However, it is easily seen that this Nash Equilibrium is not subgame perfect. When a citizen who does not commit a crime reaches the node "look innocent," the judge will choose N and not punish the innocent. Thus, (Commit, PP) is not credible, and we can strike out any strategy in which the judge punishes the innocent. It is left to the reader to find the Nash Equilibrium of this judge game. (Hint: we need a mixed strategy Nash equilibrium in the reduced strategic form game.)

We now consider the two-period prisoner's dilemma. Consider the game in Table 2 and assume the discount factor is δ > 0. If the game is repeated, is the outcome just a repetition of the one-period game? Can collusion be sustained? We examine the proper subgames to find the SPE. Suppose we consider the leftmost proper subgame, the one that follows (C, C) in Figure 15.
[Figure 17: The proper subgame following the first-round outcome (C, C). Both players simultaneously choose C or D; total payoffs (player 1, player 2): (C, C) = (5 + 5δ, 5 + 5δ); (C, D) = (5, 5 + 6δ); (D, C) = (5 + 6δ, 5); (D, D) = (5 + δ, 5 + δ).]


In each proper subgame, payoffs are increasing affine functions of the one-period payoffs; an increasing affine transformation is monotonic and leaves preferences unchanged. Therefore, in each proper subgame, (D, D) is the unique Nash Equilibrium. We now work backward to period one. Each player knows that both of them will choose D in period two, so total payoffs equal the first-period payoffs plus the constant δ (the discounted second-period payoff of 1), and the first-period decision amounts to the decision in the one-period game. To write down the strategies used in the SPE, one might think that a strategy is simply a pair of instructions (1st period action, 2nd period action). But there are four possible first-period outcomes, (C, C), (C, D), (D, C), and (D, D), and these represent four separate contingencies in which each player might be called upon to act. Therefore, each player's strategy consists of five instructions, (v, w, x, y, z), where v is the first-period action and w, x, y, and z are the second-period actions to be taken following the first-period outcomes (C, C), (C, D), (D, C), and (D, D), respectively. In this example, the SPE is {(D, D, D, D, D), (D, D, D, D, D)}: choose D in the first period and then choose D in the second period regardless of the first-period outcome. Therefore, in a prisoner's dilemma repeated any known finite number of times, the unique SPE has each player choosing D at every information set. More generally, if a game has a unique Nash Equilibrium in which each player i chooses a_i*, then the finitely repeated version of that game has a unique SPE in which each player chooses a_i* at each of his information sets.

So far, this is quite disappointing: we still have not found a theoretical reason why collusion occurs in oligopoly markets. We therefore move to infinitely repeated games. Let the number of periods be infinite, let \pi^i_k be player i's payoff in period k, and let the discount factor satisfy 0 < \delta < 1 so that the discounted present value of the payoff stream,

\pi^i = \sum_{k=1}^{\infty} \delta^{k-1} \pi^i_k,

is finite. We might also complicate the model by adding a probability 0 < q < 1 of continuing for an additional round at each period, and work with the expected payoff

E[\pi^i] = \sum_{k=1}^{\infty} (q\delta)^{k-1} \pi^i_k.

Since this is an infinitely repeated game, there is no last period from which to start a backward recursion. How can we analyze this problem? Remember that adding a constant to payoffs is a monotonic transformation and does not change preferences; so, even though the game has infinitely many periods, the SPE of the finitely repeated game, in which each player chooses D at every information set, is still an SPE here. This seems hopeless. But just as a strategic form game can have more than one Nash Equilibrium, there may also be more than one SPE! Each player seeks to maximize the discounted present value of his payoffs. This depends critically on the discount factor δ = 1/(1+r), where r is the discount rate. A player with a high discount rate wants to earn high payoffs in early periods even if it means sacrificing future profits, while a player with a low discount rate has a longer time horizon and is willing to sacrifice current payoff to earn higher payoffs in the future.


Thus, collusion might be sustained if each player is patient enough to sacrifice current payoff in order to collude and gain future payoff; that is, collusion might be sustained if the discount factor δ is sufficiently close to 1. Hence, there might be an SPE in which all players choose C at each point along the equilibrium path, so that the outcome has both players choosing C forever. This is not the same as each player choosing C at every information set: we have already shown that choosing C in every subgame is not possible, and choosing C at every information set is not even a Nash Equilibrium. The strategy under which players may end up choosing C forever is called the Grim Trigger Strategy (GTS). Of course, whether it works depends on the discount factor: if δ is close to zero, collusion might not be sustained, or might be sustained only in early periods. Players using the Grim Trigger Strategy choose C if and only if no one has ever chosen D; if someone chooses D, then both choose D forever as a punishment for deviating from the collusion. The outcome, choosing C forever, is then an SPE, depending on the value of the discount factor. If the discount factor is close to one (the discount rate is low), players are patient enough to wait for future payoffs; if the discount factor is close to zero (the discount rate is high), players are anxious to get payoffs in early periods. We must show that choosing C forever is an SPE.
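The Grim Trigger Strategy is easy to state as a rule mapping the history of play to an action. A minimal sketch (encoding and names are mine), simulating two GTS players and then a one-time deviator, just to illustrate the rule before we verify that it forms an SPE:

```python
def grim_trigger(my_history, their_history):
    """Play C as long as nobody has ever played D; otherwise play D forever."""
    return "C" if "D" not in my_history and "D" not in their_history else "D"

def play(strategy1, strategy2, rounds=6):
    h1, h2 = [], []
    for _ in range(rounds):
        a1 = strategy1(h1, h2)
        a2 = strategy2(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return h1, h2

# Two grim-trigger players cooperate forever.
print(play(grim_trigger, grim_trigger))
# (['C', 'C', 'C', 'C', 'C', 'C'], ['C', 'C', 'C', 'C', 'C', 'C'])

# A player who defects once (in round 3) triggers mutual defection afterwards.
def deviator(my_history, their_history):
    return "D" if len(my_history) == 2 else grim_trigger(my_history, their_history)

print(play(grim_trigger, deviator))
# (['C', 'C', 'C', 'D', 'D', 'D'], ['C', 'C', 'D', 'D', 'D', 'D'])
```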

Notice that there are two kinds of subgames. First, in a subgame where someone has previously played D, the GTS specifies D at every information set, and this is a Nash Equilibrium of that subgame. Second, in a subgame where no one has ever played D, the Grim Trigger Strategy gives the discounted present value

PV = 5 + 5\delta + 5\delta^2 + \cdots = \sum_{k=1}^{\infty} 5\delta^{k-1} = \frac{5}{1-\delta}.

We will check that following the GTS is a best response for each player.

Suppose that the payoff stream of player i, if i first chooses D after m > 0 periods of cooperation, is 5, 5, ..., 5 (m periods), 6, 1, 1, 1, .... The payoff stream from choosing C forever is 5, 5, 5, ....

The discounted present value from choosing D after m periods equals

\pi_{\text{defect}} = 5 + 5\delta + 5\delta^2 + \cdots + 5\delta^{m-1} + 6\delta^m + \delta^{m+1} + \delta^{m+2} + \cdots

The discounted present value from choosing C forever equals

\pi_{\text{collude}} = 5 + 5\delta + 5\delta^2 + \cdots + 5\delta^{m-1} + 5\delta^m + 5\delta^{m+1} + 5\delta^{m+2} + \cdots

Now, it should be clear that both players will cooperate if and only if

\pi_{\text{collude}} > \pi_{\text{defect}}

5 + 5\delta + \cdots + 5\delta^{m-1} + 5\delta^m + 5\delta^{m+1} + \cdots > 5 + 5\delta + \cdots + 5\delta^{m-1} + 6\delta^m + \delta^{m+1} + \delta^{m+2} + \cdots

4\delta^{m+1} + 4\delta^{m+2} + \cdots > \delta^m

4\delta^{m+1}(1 + \delta + \delta^2 + \cdots) > \delta^m

\frac{4\delta}{1-\delta} > 1

\delta > \frac{1}{5}.

Hence, if δ > 0.2, the GTS yields a Nash Equilibrium in every subgame, and thus the SPE is to choose C forever. Relating the discount factor to the discount rate, δ = 1/(1+r) > 1/5 gives r < 4. Therefore, if the discount factor is not too low, or equivalently if the discount rate is not too high, players are patient enough to sustain collusion, and cooperation can be supported as an SPE.
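The δ > 1/5 threshold can be checked numerically by comparing the two discounted streams derived above for any deviation date m; the sketch below truncates the infinite streams at a long finite horizon (a numerical approximation of my own).

```python
def pv(stream, delta):
    """Discounted present value of a payoff stream (period 1 discounted by delta**0)."""
    return sum(x * delta ** k for k, x in enumerate(stream))

def collude_vs_defect(delta, m, horizon=2000):
    """Compare cooperating forever with deviating after m cooperative periods,
    truncating the infinite streams at a long finite horizon."""
    collude = [5] * horizon
    defect = [5] * m + [6] + [1] * (horizon - m - 1)
    return pv(collude, delta), pv(defect, delta)

for delta in (0.1, 0.2, 0.3, 0.9):
    c, d = collude_vs_defect(delta, m=3)
    print(delta, "collude" if c > d else "defect", round(c - d, 4))
# For delta = 0.1 and 0.2, deviating is (weakly) better; above 1/5, colluding is strictly better.
```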

