Minimax Algorithm and its variations including Alpha-Beta and Rminimax

© All Rights Reserved


Overview of the Minimax Algorithm

A PRESENTATION BY ASHISH SABADE

The minimax algorithm is used in artificial intelligence for computer games

It is at the heart of almost every computer board game

Minimax chooses the path that maximizes the gain of the current player while minimizing the gain of the adversary

A search tree is generated depth-first, starting from the current game position and extending up to the end-of-game position

Can work in real time (i.e., not turn-based) with a timer (see iterative deepening, later)

Applies to games where:

Players take turns

Have perfect information

Chess, Checkers, Tactics

Most widely used in logic games. These games are also called zero-sum games, i.e., games in which one player's gain from a move is exactly balanced by the opponent's loss

Used in social situations

Minimax Algorithm-Components

Search tree :

Squares represent decision states (i.e., after a move)

Branches are decisions (i.e., the move)

Start at the root

Nodes at the end are leaf nodes

Unlike in binary trees, a node can have any number of children

Depends on the game situation

Levels usually called plies (a ply is one level)

Each ply is where "turn" switches to other player

Players called Min and Max

[Figure: search tree with the current board position at the root]

Minimax Algorithm-Description(1)

Assign points to the outcome of a game

Ex: Tic-Tac-Toe: X wins, value of 1. O wins, value -1.

Max (X) tries to maximize the point value, while Min (O) tries to minimize it

Assume both players play to the best of their ability

Each always makes the move that minimizes or maximizes the points

So, in choosing, Max will pick the move that yields the highest points, assuming Min will pick the move that yields the lowest points

Minimax Algorithm-Description(2)

A tree is generated and filled with values produced by the evaluation function

An evaluation function, also known as a heuristic evaluation function or static evaluation function, is a function used by game-playing programs to estimate the value or goodness of a position

The evaluation function is typically designed to prioritize speed over accuracy; it looks only at the current position and does not explore possible moves (hence "static")

E.g., for chess: f(P) = 200(K-K') + 9(Q-Q') + 5(R-R') + 3(B-B' + N-N') + (P-P') - 0.5(D-D' + S-S' + I-I') + 0.1(M-M')

K, Q, R, B, N, P are the numbers of white kings, queens, rooks, bishops, knights and pawns on the board

D, S, I are the numbers of doubled, backward and isolated white pawns

M represents white mobility (measured, say, as the number of legal moves available to White)

Primed symbols (K', Q', ...) are the corresponding counts for Black
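The formula above can be sketched in C as a plain weighted sum. This is only an illustration: the `SideFeatures` struct and its field names are hypothetical, and counting the features on a real board is left out.

```c
/* Hypothetical per-side feature counts; the field names are illustrative only. */
typedef struct {
    int kings, queens, rooks, bishops, knights, pawns;
    int doubled, backward, isolated; /* pawn-structure weaknesses */
    int mobility;                    /* number of legal moves available */
} SideFeatures;

/* Score from White's point of view, using the weights of the formula above.
 * `w` holds White's counts, `b` holds Black's (the primed symbols). */
double evaluate(const SideFeatures *w, const SideFeatures *b)
{
    return 200.0 * (w->kings - b->kings)
         + 9.0 * (w->queens - b->queens)
         + 5.0 * (w->rooks - b->rooks)
         + 3.0 * ((w->bishops - b->bishops) + (w->knights - b->knights))
         + 1.0 * (w->pawns - b->pawns)
         - 0.5 * ((w->doubled - b->doubled)
                  + (w->backward - b->backward)
                  + (w->isolated - b->isolated))
         + 0.1 * (w->mobility - b->mobility);
}
```

With equal material except for one extra white queen and equal mobility, the function returns 9.0; a negative result means the position favours Black.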

Step 1: Create a MAX node (the root)

Step 2: Generate the children nodes (MIN) using DFS

Step 4: Generate the children of A1

Step 5: Score the leaf nodes with the evaluation function. Values are filled in the nodes

Step 6: Fill A1 (a MIN node) with the minimum of its child node values

Step 7: Repeat steps 3 to 6 until all children of the main nodes (A1, A2, A3) are generated and filled

Step 8: Fill the root (MAX) with the maximum of the child node values; this tells us the best possible move that we should take

The following functions are used in the minimax algorithm:

Main function, which calls either the max or the min function depending on the player

int MinMax(int depth)

Max function, which returns the maximum value

int Max(int depth)

Min function, which returns the minimum value

int Min(int depth)

Evaluation function, which returns the goodness of the position for the player

int eval(move)

    int MinMax(int depth) {
        // White is Max, Black is Min
        if (turn == WHITE)
            return Max(depth);
        else
            return Min(depth);
    }

Function call: value = MinMax(5); // search 5 plies

    int Max(int depth) {
        int best = -INFINITY; // any first move will beat this
        int val;

        if (depth == 0)
            return Evaluate();

        GenerateLegalMoves();
        while (MovesLeft()) {
            MakeNextMove();
            val = Min(depth - 1); // Min's turn next
            UnMakeMove();
            if (val > best)
                best = val;
        }
        return best;
    }

    int Min(int depth) {
        int best = INFINITY; // different than MAX
        int val;

        if (depth == 0)
            return Evaluate();

        GenerateLegalMoves();
        while (MovesLeft()) {
            MakeNextMove();
            val = Max(depth - 1); // Max's turn next
            UnMakeMove();
            if (val < best) // different than MAX
                best = val;
        }
        return best;
    }
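The Max and Min routines rely on engine helpers (move generation, make/unmake) that the slides do not define. As a minimal self-contained sketch, the same recursion can be run on an explicit tree; the `Node` struct and the `minimax` name are assumptions made for this illustration, not part of the presentation.

```c
#include <limits.h>

#define MAX_CHILDREN 3

/* Hypothetical explicit game tree: a node with n_children == 0 is a leaf. */
typedef struct Node {
    int value;      /* static score; meaningful at leaves or at the depth limit */
    int n_children;
    const struct Node *children[MAX_CHILDREN];
} Node;

/* Returns the minimax value of `node`.
 * `maximizing` is nonzero when it is Max's turn at this node. */
int minimax(const Node *node, int depth, int maximizing)
{
    if (depth == 0 || node->n_children == 0)
        return node->value;

    int best = maximizing ? INT_MIN : INT_MAX;
    for (int i = 0; i < node->n_children; i++) {
        /* The turn switches at every ply. */
        int val = minimax(node->children[i], depth - 1, !maximizing);
        if (maximizing ? (val > best) : (val < best))
            best = val;
    }
    return best;
}
```

For a MAX root whose three MIN children have leaves (15, 20, 18), (12, 30, 25) and (10, 16, 40), the children evaluate to 15, 12 and 10, so `minimax(&root, 2, 1)` yields 15.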

Minimax Algorithm-Evaluation Function

Most important function in the minimax algorithm

Example for tic-tac-toe for the max player (for the min player, just return score * (-1))

    int eval(move) {
        if (game over)
            return 0;
        if (move blocks opponent's win)
            return highest_priority; // say highest priority is 5
        if (open move) {
            if (move in central area) return 4;
            if (move in corner area) return 3;
            if (move in horizontal block) return 1;
        }
        return 0; // no feature matched (default added so every path returns a value)
    }

Complete?

Yes (if the tree is finite)

Optimal?

Yes (against an optimal opponent)

Can it be beaten by an opponent playing sub-optimally?

No

Time complexity?

O(b^m), where b is the branching factor and m is the maximum depth of the tree

Space complexity?

O(bm) (depth-first search, generating all actions at once)

O(m) (backtracking search, generating actions one at a time)

Search techniques

Alpha-Beta Pruning(1)

MinMax searches the entire tree, even when part of it can be ignored

E.g., with a branching factor of only 3, a search to depth 5 already generates 364 nodes (1 + 3 + 9 + 27 + 81 + 243); chess, with roughly 35 legal moves per position, grows far faster

In general, stop evaluating a move as soon as it is found to be worse than a previously examined move

Such a move does not benefit the player, so it need not be evaluated any further

This saves processing time without affecting the final result

Hence, alpha-beta pruning is used to avoid generating worthless nodes

With alpha-beta pruning we can search to a greater depth in the same time, enabling the computer to make the best decision using less time

Alpha-Beta Pruning(2)

Two scores are passed around in the search

Alpha: the best score Max can already guarantee

Anything less than this is of no use (can be pruned), since we can already get alpha

Minimum score Max will get

Initially, negative infinity

Beta: the best score Min can already guarantee

Anything higher than this won't be used by the opponent

Maximum score Min will get

Initially, infinity

Beta <= Alpha: the current position is not the result of best play and can be pruned
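A minimal sketch of how the two bounds are threaded through the search, again on an explicit tree. The `Node` struct, the function name and the `visited` counter are assumptions added for illustration; the counter exists only to make the saving visible.

```c
#include <limits.h>

#define MAX_CHILDREN 3

/* Hypothetical explicit game tree: a node with n_children == 0 is a leaf. */
typedef struct Node {
    int value;
    int n_children;
    const struct Node *children[MAX_CHILDREN];
} Node;

/* Minimax with alpha-beta cutoffs. `*visited` counts every node examined. */
int alphabeta(const Node *node, int depth, int alpha, int beta,
              int maximizing, int *visited)
{
    (*visited)++;
    if (depth == 0 || node->n_children == 0)
        return node->value;

    int best = maximizing ? INT_MIN : INT_MAX;
    for (int i = 0; i < node->n_children; i++) {
        int val = alphabeta(node->children[i], depth - 1,
                            alpha, beta, !maximizing, visited);
        if (maximizing) {
            if (val > best) best = val;
            if (best > alpha) alpha = best;   /* Max can now guarantee `best` */
        } else {
            if (val < best) best = val;
            if (best < beta) beta = best;     /* Min can now guarantee `best` */
        }
        if (alpha >= beta)
            break;  /* the remaining siblings cannot change the result */
    }
    return best;
}
```

Called as `alphabeta(&root, 2, INT_MIN, INT_MAX, 1, &visited)` on a MAX root whose MIN children have leaves (15, 20, 18), (12, 30, 25) and (10, 16, 40), it returns 15 after examining 9 of the 13 nodes: once the first child guarantees 15, the second and third children are abandoned after their first leaf.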

Quiescence Search

It is a remedy for the horizon problem faced by AI engines for various games such as chess and Abalone

Emulates human behaviour by instructing the computer to search "interesting" positions to a greater depth than "quiet" ones, to make sure there are no hidden traps

Pseudocode:

    function quiescence_search(node, depth)
        if node appears quiet or node is a terminal node or depth = 0
            return estimated value of node
        else
            // One might use minimax or alpha-beta search here...
            search children of node using recursive applications of quiescence_search
            return estimated value of children // ...and here

    function normal_search(node, depth)
        if node is a terminal node
            return estimated value of node
        else if depth = 0
            if node appears quiet
                return estimated value of node
            else
                return estimated value from quiescence_search(node, reasonable_depth_value)
        else
            search children of node using recursive applications of normal_search
            return estimated value of children

Aspiration Window

Used to decrease the search space in an alpha-beta pruning algorithm

This technique can be generalized by setting a search window for each level of the tree

The upper and lower limits of that window are defined as the highest and the lowest reasonable values that a node at this level can have

If a node's value falls outside that window then, depending on the parent's type (max or min), we can interrupt the search and return that same value for this node

The window size should be different for every level of the tree

For higher levels -> bigger window

For lower levels -> smaller window
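The slides describe a per-level variant. As a simpler illustration, here is a sketch of the common root-level form: search with a narrow window around a guessed score and fall back to the full window if the result lands on or outside a bound. The `WindowSearch` callback type and the `clamp_search` stand-in are hypothetical; a real engine would pass its alpha-beta search instead.

```c
#include <limits.h>

/* Hypothetical callback: any search that accepts an (alpha, beta) window. */
typedef int (*WindowSearch)(int alpha, int beta, void *ctx);

/* Try a narrow window around `guess`; re-search with the full window if the
 * score fails low or high, because the narrow result is then only a bound. */
int aspiration_search(WindowSearch search, void *ctx, int guess, int delta)
{
    int alpha = guess - delta;
    int beta = guess + delta;
    int score = search(alpha, beta, ctx);
    if (score <= alpha || score >= beta)
        score = search(INT_MIN, INT_MAX, ctx);
    return score;
}

/* Stand-in for a real search: behaves like a fail-hard alpha-beta whose true
 * value is the int pointed to by `ctx`. */
static int clamp_search(int alpha, int beta, void *ctx)
{
    int true_value = *(int *)ctx;
    if (true_value <= alpha) return alpha;
    if (true_value >= beta) return beta;
    return true_value;
}
```

With a guess of 14 and a delta of 2, a true value of 15 is found in the first narrow search, while a true value of 40 first fails high at 16 and is then recovered by the full-window re-search. A good guess usually comes from the previous, shallower iteration.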

Transposition Tables

A common method applied when a problem consists of overlapping sub-problems is memoization (making a memo)

This method stores the results of solved sub-problems so that they can be reused later without repeating the same calculations

A hash table is created accordingly, using the Zobrist hashing technique

A unique number needs to be assigned to each different state of the game board

With a table of 4 mega-entries, the search becomes approximately three times faster
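A minimal sketch of Zobrist hashing for a tic-tac-toe board, assuming a 9-square board with two piece types. All names here are illustrative, and the splitmix64 generator is used only to fill the key table deterministically. Note that in practice the resulting number is not guaranteed to be unique: distinct positions can collide, so engines typically store extra data to verify a table hit.

```c
#include <stdint.h>

#define SQUARES 9
#define PIECES  2   /* 0 = X, 1 = O */

static uint64_t zobrist_keys[SQUARES][PIECES];

/* splitmix64: a small deterministic generator for the random keys. */
static uint64_t splitmix64(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Fill the table with one random key per (square, piece) pair. */
void zobrist_init(uint64_t seed)
{
    for (int s = 0; s < SQUARES; s++)
        for (int p = 0; p < PIECES; p++)
            zobrist_keys[s][p] = splitmix64(&seed);
}

/* Hash a full board; board[s] is -1 for empty, 0 for X, 1 for O. */
uint64_t zobrist_hash(const int board[SQUARES])
{
    uint64_t h = 0;
    for (int s = 0; s < SQUARES; s++)
        if (board[s] >= 0)
            h ^= zobrist_keys[s][board[s]];
    return h;
}

/* Incremental update: XOR a piece in (or back out) when making or
 * unmaking a move, without rehashing the whole board. */
uint64_t zobrist_toggle(uint64_t h, int square, int piece)
{
    return h ^ zobrist_keys[square][piece];
}
```

Because XOR is commutative, the same position reached through different move orders (a transposition) produces the same hash, which is what lets the table recognise it.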

Iterative Deepening

It is better to complete the search at a shallower, thus safer, depth and then steadily increase the depth while checking the remaining time

Just before the time limit is reached, the search at the increased depth is aborted and the move from the previously completed depth is retrieved

By using a transposition table, many of the evaluation function's values from the shallower search are stored and can be retrieved while searching deeper
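The control loop can be sketched as follows, assuming the engine exposes a depth-limited search that reports whether it finished before the deadline. The `SearchFn` type and the `demo_search` stand-in are hypothetical and exist only to make the sketch self-contained.

```c
#include <time.h>

/* Hypothetical hook: searches to `depth`, stores the best move in *move and
 * returns 1, or returns 0 if it had to abort because `deadline` passed. */
typedef int (*SearchFn)(int depth, clock_t deadline, int *move);

/* Search depth 1, 2, 3, ... and always keep the move of the last depth that
 * completed, so an aborted deeper search never replaces a finished result. */
int iterative_deepening(SearchFn search, int max_depth, double seconds,
                        int fallback_move)
{
    clock_t deadline = clock() + (clock_t)(seconds * CLOCKS_PER_SEC);
    int best_move = fallback_move;

    for (int depth = 1; depth <= max_depth; depth++) {
        int move;
        if (!search(depth, deadline, &move))
            break;                  /* aborted: discard the partial search */
        best_move = move;
        if (clock() >= deadline)
            break;                  /* no time left to start a deeper pass */
    }
    return best_move;
}

/* Stand-in search: pretends depths up to 3 finish in time and deeper ones
 * are aborted; the "move" is just depth * 10 so the result is recognisable. */
static int demo_search(int depth, clock_t deadline, int *move)
{
    (void)deadline;
    if (depth > 3)
        return 0;
    *move = depth * 10;
    return 1;
}
```

Here `iterative_deepening(demo_search, 10, 1.0, -1)` returns 30, the move from depth 3, because the depth-4 pass is reported as aborted. Note that `clock()` measures processor time; a real engine might prefer a wall-clock source.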

The better sorted the children of a node are, the more efficient alpha-beta pruning becomes

If they are sorted by an estimate of which move is expected to be better, the probability of earlier pruning increases

The gain from sorting the child nodes is more significant at higher levels of the tree, because deep sub-trees can be pruned earlier
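One simple way to realise this ordering, sketched here under the assumption that each candidate move already carries a cheap estimated score (for example from a shallower search), is to sort the moves before recursing. The `ScoredMove` struct and the function names are illustrative.

```c
#include <stdlib.h>

/* Hypothetical move record: an engine-specific move id plus a cheap estimate. */
typedef struct {
    int move;
    int estimate;
} ScoredMove;

/* qsort comparator: higher estimates first (the order Max wants to try). */
static int by_estimate_desc(const void *a, const void *b)
{
    const ScoredMove *x = a;
    const ScoredMove *y = b;
    return (y->estimate > x->estimate) - (y->estimate < x->estimate);
}

/* Sort candidate moves so the most promising ones are searched first. */
void order_moves(ScoredMove *moves, size_t count)
{
    qsort(moves, count, sizeof *moves, by_estimate_desc);
}
```

For a Min node the order would be reversed, so that the moves expected to give the lowest scores are tried first.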

[Figure: number of nodes traversed (y-axis) versus search depth (x-axis); the board game Abalone was used for the tests]

Random Minimax (Rminimax)

The traditional minimax algorithm is very useful, but it has some drawbacks

Drawbacks:

1) Too deterministic (predictable)

2) Does not take into account the human error factor, i.e., it assumes a player will always be perfectly rational

3) Controlling the strength of a player is not possible (only one level of difficulty is achievable)

Random minimax, which randomizes the choice among all possible paths of the game tree, is introduced

Rminimax relies heavily on the randomized shortest-path model

Let G be a directed graph containing a source node with index 1 and a destination node with index n

A non-negative local cost c_kk' >= 0 is associated with each of the arcs k -> k'

The set of all paths (including cycles) that go from 1 to n is denoted as P_1n

Each path ℘ in P_1n is composed of a sequence of arcs k -> k'

Let the total cost C(℘) of a path be the sum of these local costs along ℘

A probability will be assigned to each path, favouring nearly optimal paths having a low cost

The parameter θ controls the entropy of this distribution

When θ -> 0, almost equal probability is assigned to all paths

Rminimax Algorithm

Require:

G: The generated game tree obtained with the MINIMAX algorithm. The root of the game tree is k = 1.

θ > 0: The degree of randomization of the tree (θ -> infinity for a perfectly rational player; θ -> 0 for an almost completely random player).

c_kk' >= 0: The cost of each arc of the tree.

1. Assign z^b_n = 1 for each n ∈ N.

2. Recursively compute z^b_k.

3. Compute the corresponding p_kk' value.

4. return p_kk': the transition probabilities for the next play.
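The slides list these steps without formulas. As a hedged reconstruction, assuming the standard randomized shortest-path form with reference transition probabilities p^ref (for example, uniform over the legal moves) and taking N to be the set of terminal nodes, the quantities in steps 1 to 3 could be written as below. Both the symbol p^ref and that reading of N are assumptions, and the single-parameter form shown here does not attempt to reproduce how the Rminimax paper treats the two players separately.

```latex
\begin{align}
P(\wp) &= \frac{P^{\mathrm{ref}}(\wp)\,\exp\!\big(-\theta\,C(\wp)\big)}
               {\sum_{\wp' \in \mathcal{P}_{1n}} P^{\mathrm{ref}}(\wp')\,\exp\!\big(-\theta\,C(\wp')\big)} \\
z^{b}_{n} &= 1 \quad \text{for each } n \in N \\
z^{b}_{k} &= \sum_{k'} p^{\mathrm{ref}}_{kk'}\,\exp(-\theta\,c_{kk'})\,z^{b}_{k'} \\
p_{kk'} &= \frac{p^{\mathrm{ref}}_{kk'}\,\exp(-\theta\,c_{kk'})\,z^{b}_{k'}}{z^{b}_{k}}
\end{align}
```

By construction the p_kk' sum to 1 over the children k' of each node. As θ -> 0 they tend to the reference probabilities (a random player), and as θ grows the probability mass concentrates on the lowest-cost continuation.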

[Figure: normal minimax against player 2 (MIN). The CPU plays the max move for the current board (game state); no randomness is involved.]

[Figure: random minimax on the same board (game state). The CPU plays a randomized move, so the tree marks both the correct move and a wrong move that may be chosen because of Rminimax.]

Case 1: θ -> infinity

optimal minimax strategy (previous example)

Case 2: θ -> 0

random strategy (current example)

Research on the random minimax algorithm is still in progress

However, it is gaining popularity in game programming because of its non-deterministic nature

It is expected that various multiplayer and online games (having artificial intelligence) will soon implement the Rminimax algorithm

Further research is also expected to make the AI's behaviour compatible with the player's behaviour by modifying θ

This framework can also be applied to the Monte-Carlo search technique

[Figure: Rminimax algorithm applied to the game Tic-Tac-Toe. The horizontal axis represents the variation of θ1 for player 1, while the vertical axis shows the number of victories of player 1 over player 2, out of 100 games. When θ1 >> θ2, player 1 wins, while for θ1 << θ2 it is player 2 who leads the game.]

References

1)Rminimax: An Optimally Randomized MINIMAX Algorithm

IEEE TRANSACTIONS ON CYBERNETICS, VOL. 43, NO. 1, FEBRUARY 2013

Computational Intelligence and Games (CIG), 2012 IEEE Conference

Control and Decision Conference (CCDC), 2013

THANK YOU
