Editor's note: This paper was originally published in the 2000 Game Developer's Conference proceedings
Contents |
|
3.1 Game Trees And Minimax Search
The root of a game tree represents the current state of the game. Each node in the tree can have any number of child nodes. Each child of a node represents a new state after a legal move from the node's state. This continues until we reach a leaf, a node with no child nodes, in the game tree. We assign a payoff vector to each leaf in the game tree. In a generalized game tree, this payoff vector represents the utility of the final position to both players. In general, winning a game represents a positive utility or a player, while losing a game represents a negative utility. Since the game is a two-player zero-sum game, the utility for the first player must equal the negative of the utility for the second player. The utility for the side to move at the root of the tree is usually the only one given to save space.
In Figure 2,
an example of a game tree for a game of Naughts and Crosses (or
Tic-Tac-Toe) is given. Note that the two players take alternating turns at
different levels of the tree. X moves
![]() |
Figure 2. Example Of Naughts and Crosses Game
Tree. |
at the root, while the opponent, O, moves at the first level below the root. A position is normally categorized by how many levels down in the game tree it is located. The common term for this is ply. The root is said to be at ply 0, while the immediate successors of the root are said to be at ply 1, et cetera.
Naughts and Crosses, like chess and checkers, has only three possible outcomes for a player: win, loss or draw. Normally, we assign the payoff of +1, 0 and -1 to a win, draw or loss for the player to move, respectively. These payoffs are given in Figure 2 at the bottom of each leaf position, with respect to the player with the crosses.
We will give names to each player to simplify our discussion. Let us call the player to move in the initial position Max and the opponent Min. At each node in the tree where Max has to move, Max would like to play the move that maximizes the payoff. Thus, Max will assign the maximum score amongst the children to the node where Max makes a move. Similarly, Min will minimize the payoff to Max, since that will maximize Min's payoff. The maximum and minimum scores are taken at alternating levels of the tree, since Max and Min alternate turns.
In this way,
all nodes in the tree can be assigned a payoff or minimax value, starting
from the leaves of the tree and moving up the tree towards the root. In
Figure 3, we give minimax values for all nodes in our Naughts and
Crosses game tree (Figure 2). These minimax values tell us what the best
possible outcome for Max is in any position within the game tree, given
that Min will do its best to foil Max's plans.
Once the root of the
game tree has been assigned a minimax value, a best move for Max is
defined as a move which leads to the same minimax value as the root of the
tree. We can trace down the tree, always choosing moves that lead to the
same minimax value. This path of moves gives us an optimal line of play
for either player, and is known as a principal
![]() |
Figure 3. Naughts and Crosses Game Tree With Minimax
Values. |
variation. Note that in our game of Naughts and Crosses,
the side playing the Crosses will draw the game, but only if an X is
played in the lower central square. Playing to either square in the top
row can lead to a loss for the Crosses, if the opponent plays the best
move.
To compute the minimax value of a position, we can use any
algorithm that searches the whole game tree. A depth-first search uses
less memory than a best-first or breadth-first tree search algorithm, so
it is preferred in current game-tree search programs. In Figure 3, we show
two C functions which are the basis of a recursive depth-first search of a
game tree. By calling Maximize with a position p, we will get the minimax
value of position p as the output of the function after the entire game
tree has been searched.
In Listing 1, we have left out some of the details. For example, we have not defined what a position is, since this is game-dependent. There are three additional functions that would be required to implement the minimax search: (1) EndOfGame, which determines whether the game is over at the input position, returning TRUE if the game is over; (2) GameValue, which accepts a position as a parameter, determines who has won the game, and returns the payoff with respect to the player Max; and (3) GenerateSuccessors which generates an array of successor positions (p.succ) from the input position, and returns the number of successors to the calling procedure.
Note that
Maximize() and Minimize() recursively call one another until a position is
reached where the EndOfGame() function returns TRUE. As each successor of
a node is explored, gamma maintains the current assessment of the
position, based on all of the moves that have been searched so far. Once
all successors have been examined, the minimax value for that position has
been computed and stored in gamma, which can be returned to a higher level
within the tree (please refer to Listing
1).
The minimax algorithm can also determine which move yields
the score gamma, and return that up the tree as well. However, there is
only one place we are interested in the move choice: the root of the game
tree. We could write a special version of Maximize that returns a best
move and the minimax value.
This
formulation requires exactly the same amount of work as the matrix
formulation did, but further pruning can be done on this tree. The algorithmimproves on the typical minimax algorithm
by passing down bounds throughout the tree and can prune off branches that
can be shown to have no relevance on the minimax value of the game tree.
With the
algorithm, one can search the optimal number of terminal
positions required to determine the minimax value (if one always searches
the best move first at every node)! In practice, that doesn't happen (why
would you need to search if you already knew the best move?), so there are
variants on
such as NegaScout
that have been shown to be significantly better than
in practice.
However, it would still take a modern
computer millions of years to evaluate the full game tree for the game of
chess if one had to go all the way to the terminal nodes. How can we
control the size of the search?
________________________________________________________