Você está na página 1de 23

Game playing

Unit 4
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Outline
Games
Perfect play
minimax decisions
pruning
Resource limits and approximate evaluation
Games of chance
Games of imperfect information
Chapter 6 2
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Games vs. search problems
Unpredictable opponent solution is a strategy
specifying a move for every possible opponent reply
Time limits unlikely to nd goal, must approximate
Plan of attack:
Computer considers possible lines of play (Babbage, 1846)
Algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944)
Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948;
Shannon, 1950)
First chess program (Turing, 1951)
Machine learning to improve evaluation accuracy (Samuel, 195257)
Pruning to allow deeper search (McCarthy, 1956)
Chapter 6 3
www.StudentRockStars.com
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Types of games
deterministic chance
perfect information
imperfect information
chess, checkers,
go, othello
backgammon
monopoly
bridge, poker, scrabble
nuclear war
battleships,
blind tictactoe
Chapter 6 4
www.StudentRockStars.com
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Game tree (2-player, deterministic, turns)
X X
X X
X
X
X
X X
MAX (X)
MIN (O)
X X
O
O
O X O
O
O O
O O O
MAX (X)
X O X O X O X
X X
X
X
X X
MIN (O)
X O X X O X X O X
. . . . . . . . . . . .
. . .
. . .
. . .
TERMINAL
X X
1 0 +1 Utility
Chapter 6 5
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Minimax
Perfect play for deterministic, perfect-information games
Idea: choose move to position with highest minimax value
= best achievable payo against best play
E.g., 2-ply game:
MAX
3 12 8 6 4 2 14 5 2
MIN
3
A
1
A
3
A
2
A
13
A
12
A
11
A
21
A
23
A
22
A
33
A
32
A
31
3 2 2
Chapter 6 6
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Minimax algorithm
function Minimax-Decision(state) returns an action
inputs: state, current state in game
return the a in Actions(state) maximizing Min-Value(Result(a, state))
function Max-Value(state) returns a utility value
if Terminal-Test(state) then return Utility(state)
v
for a, s in Successors(state) do v Max(v, Min-Value(s))
return v
function Min-Value(state) returns a utility value
if Terminal-Test(state) then return Utility(state)
v
for a, s in Successors(state) do v Min(v, Max-Value(s))
return v
Chapter 6 7
www.StudentRockStars.com
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Properties of minimax
Complete??
Chapter 6 8
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Properties of minimax
Complete?? Only if tree is nite (chess has specic rules for this).
NB a nite strategy can exist even in an innite tree!
Optimal??
Chapter 6 9
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Properties of minimax
Complete?? Yes, if tree is nite (chess has specic rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??
Time complexity??
Chapter 6 10
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Properties of minimax
Complete?? Yes, if tree is nite (chess has specic rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??
Time complexity?? O(b
m
)
Space complexity??
Chapter 6 11
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Properties of minimax
Complete?? Yes, if tree is nite (chess has specic rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??
Time complexity?? O(b
m
)
Space complexity?? O(bm) (depth-rst exploration)
For chess, b 35, m 100 for reasonable games
exact solution completely infeasible
But do we need to explore every path?
Chapter 6 12
www.StudentRockStars.com
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
pruning example
MAX
3 12 8
MIN
3
3
Chapter 6 13
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
pruning example
MAX
3 12 8
MIN
3
2
2
X X
3
Chapter 6 14
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
pruning example
MAX
3 12 8
MIN
3
2
2
X X
14
14
3
Chapter 6 15
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
pruning example
MAX
3 12 8
MIN
3
2
2
X X
14
14
5
5
3
Chapter 6 16
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
pruning example
MAX
3 12 8
MIN
3
3
2
2
X X
14
14
5
5
2
2
3
Chapter 6 17
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Why is it called ?
..
..
..
MAX
MIN
MAX
MIN
V
is the best value (to max) found so far o the current path
If V is worse than , max will avoid it prune that branch
Dene similarly for min
Chapter 6 18
www.StudentRockStars.com
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
The algorithm
function Alpha-Beta-Decision(state) returns an action
return the a in Actions(state) maximizing Min-Value(Result(a, state))
function Max-Value(state, , ) returns a utility value
inputs: state, current state in game
, the value of the best alternative for max along the path to state
, the value of the best alternative for min along the path to state
if Terminal-Test(state) then return Utility(state)
v
for a, s in Successors(state) do
v Max(v, Min-Value(s, , ))
if v then return v
Max(, v)
return v
function Min-Value(state, , ) returns a utility value
same as Max-Value but with roles of , reversed
Chapter 6 19
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Properties of
Pruning does not aect nal result
Good move ordering improves eectiveness of pruning
With perfect ordering, time complexity = O(b
m/2
)
doubles solvable depth
A simple example of the value of reasoning about which computations are
relevant (a form of metareasoning)
Unfortunately, 35
50
is still impossible!
Chapter 6 20
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Resource limits
Standard approach:
Use Cutoff-Test instead of Terminal-Test
e.g., depth limit (perhaps add quiescence search)
Use Eval instead of Utility
i.e., evaluation function that estimates desirability of position
Suppose we have 100 seconds, explore 10
4
nodes/second
10
6
nodes per move 35
8/2
reaches depth 8 pretty good chess program
Chapter 6 21
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Evaluation functions
Black to move
White slightly better
White to move
Black winning
For chess, typically linear weighted sum of features
Eval(s) = w
1
f
1
(s) + w
2
f
2
(s) + . . . + w
n
f
n
(s)
e.g., w
1
= 9 with
f
1
(s) = (number of white queens) (number of black queens), etc.
Chapter 6 22
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m
Summary
Games are fun to work on! (and dangerous)
They illustrate several important points about AI
perfection is unattainable must approximate
good idea to think about what to think about
uncertainty constrains the assignment of values to states
optimal decisions depend on information state, not real state
Games are to AI as grand prix racing is to automobile design
Chapter 6 38
www.StudentRockStars.com
www.StudentRockStars.com
www.StudentRockStars.com
w
w
w
.
S
t
u
d
e
n
t
R
o
c
k
S
t
a
r
s
.
c
o
m

Você também pode gostar