Editor's note: This paper was originally published in the 2000 Game Developer's Conference proceedings

Contents

1. Introduction

3.1 Game Trees and Minimax Search

3.2 Iterative Deepening

4.0 Reimplementing A*

4.3 The History Heuristic

3.1 Game Trees And Minimax Search

The root of a game tree represents the current state of the game. Each node in the tree can have any number of child nodes. Each child of a node represents a new state after a legal move from the node's state. This continues until we reach a leaf, a node with no child nodes, in the game tree. We assign a payoff vector to each leaf in the game tree. In a generalized game tree, this payoff vector represents the utility of the final position to both players. In general, winning a game represents a positive utility for a player, while losing a game represents a negative utility. Since the game is a two-player zero-sum game, the utility for the first player must equal the negative of the utility for the second player. To save space, usually only the utility for the side to move at the root of the tree is given.

In Figure 2, an example of a game tree for a game of Naughts and Crosses (or Tic-Tac-Toe) is given. Note that the two players take alternating turns at different levels of the tree. X moves at the root, while the opponent, O, moves at the first level below the root. A position is normally categorized by how many levels down in the game tree it is located. The common term for this is ply. The root is said to be at ply 0, while the immediate successors of the root are said to be at ply 1, et cetera.

Figure 2. Example Of Naughts and Crosses Game Tree.

Naughts and Crosses, like chess and checkers, has only three possible outcomes for a player: win, loss or draw. Normally, we assign the payoffs +1, 0 and -1 to a win, draw or loss for the player to move, respectively. These payoffs are given in Figure 2 at the bottom of each leaf position, with respect to the player with the crosses.

We will give names to each player to simplify our discussion. Let us call the player to move in the initial position Max and the opponent Min. At each node in the tree where Max has to move, Max would like to play the move that maximizes the payoff. Thus, Max will assign the maximum score amongst the children to the node where Max makes a move. Similarly, Min will minimize the payoff to Max, since that will maximize Min's payoff. The maximum and minimum scores are taken at alternating levels of the tree, since Max and Min alternate turns.

In this way, all nodes in the tree can be assigned a payoff or minimax value, starting from the leaves of the tree and moving up the tree towards the root. In Figure 3, we give minimax values for all nodes in our Naughts and Crosses game tree (Figure 2). These minimax values tell us what the best possible outcome for Max is in any position within the game tree, given that Min will do its best to foil Max's plans.

Once the root of the game tree has been assigned a minimax value, a best move for Max is defined as a move which leads to the same minimax value as the root of the tree. We can trace down the tree, always choosing moves that lead to the same minimax value. This path of moves gives us an optimal line of play for either player, and is known as a principal variation. Note that in our game of Naughts and Crosses, the side playing the Crosses will draw the game, but only if an X is played in the lower central square. Playing to either square in the top row can lead to a loss for the Crosses, if the opponent plays the best move.

Figure 3. Naughts and Crosses Game Tree With Minimax Values.

To compute the minimax value of a position, we can use any algorithm that searches the whole game tree. A depth-first search uses less memory than a best-first or breadth-first tree search algorithm, so it is preferred in current game-tree search programs. In Listing 1, we show two C functions which are the basis of a recursive depth-first search of a game tree. By calling Maximize with a position p, we will get the minimax value of position p as the output of the function after the entire game tree has been searched.

In Listing 1, we have left out some of the details. For example, we have not defined what a position is, since this is game-dependent. There are three additional functions that would be required to implement the minimax search: (1) EndOfGame, which determines whether the game is over at the input position, returning TRUE if the game is over; (2) GameValue, which accepts a position as a parameter, determines who has won the game, and returns the payoff with respect to the player Max; and (3) GenerateSuccessors which generates an array of successor positions (p.succ) from the input position, and returns the number of successors to the calling procedure.

Note that Maximize() and Minimize() recursively call one another until a position is reached where the EndOfGame() function returns TRUE. As each successor of a node is explored, gamma maintains the current assessment of the position, based on all of the moves that have been searched so far. Once all successors have been examined, the minimax value for that position has been computed and stored in gamma, which can be returned to a higher level within the tree (please refer to Listing 1).
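Listing 1 itself is game-independent and is not reproduced here, but the mutual recursion it describes can be sketched on a tiny hard-coded game tree. In this sketch, the fixed seven-node tree and its array encoding are assumptions made purely for illustration; a real program would use the EndOfGame(), GameValue() and GenerateSuccessors() helpers on an actual game position instead.

```c
/* A toy, hard-coded game tree (an assumption for illustration only).
   Node 0 is the root with Max to move; interior nodes list two children;
   leaves carry payoffs with respect to Max. */
#define NODES 7
static const int child[NODES][2] = {
    {1, 2},                     /* root: Max to move  */
    {3, 4}, {5, 6},             /* ply 1: Min to move */
    {-1, -1}, {-1, -1}, {-1, -1}, {-1, -1}   /* leaves */
};
static const int payoff[NODES] = { 0, 0, 0, 1, -1, 0, 1 };

static int EndOfGame(int n) { return child[n][0] < 0; }
static int GameValue(int n) { return payoff[n]; }

static int Minimize(int n);

/* Return the minimax value of node n with Max to move. */
static int Maximize(int n)
{
    if (EndOfGame(n))
        return GameValue(n);
    int gamma = -1000;          /* worse than any real payoff */
    for (int i = 0; i < 2; i++) {
        int v = Minimize(child[n][i]);
        if (v > gamma)
            gamma = v;          /* Max keeps the largest score seen */
    }
    return gamma;
}

/* Return the minimax value of node n with Min to move. */
static int Minimize(int n)
{
    if (EndOfGame(n))
        return GameValue(n);
    int gamma = 1000;
    for (int i = 0; i < 2; i++) {
        int v = Maximize(child[n][i]);
        if (v < gamma)
            gamma = v;          /* Min keeps the smallest score seen */
    }
    return gamma;
}
```

Here Maximize(0) evaluates both Min nodes: the left subtree minimaxes to -1 and the right to 0, so the root's minimax value is 0, a draw with best play on both sides.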

The minimax algorithm can also determine which move yields the score gamma, and return that up the tree as well. However, there is only one place we are interested in the move choice: the root of the game tree. We could write a special version of Maximize that returns a best move and the minimax value.
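Such a root-only variant might look like the following sketch. For brevity it operates on an array of successor scores rather than positions (an assumption for illustration); in a real program each entry would be the value returned by Minimize() on one successor of the root.

```c
/* Root-level maximizer that reports the best move as well as the score.
   value[] holds the minimax value of each root successor (already
   computed, e.g. by Minimize()); nmoves is how many successors exist.
   The index of the best move is written through best_move. */
static int RootMaximize(const int value[], int nmoves, int *best_move)
{
    int gamma = -1000;          /* worse than any real payoff */
    *best_move = -1;
    for (int m = 0; m < nmoves; m++) {
        if (value[m] > gamma) { /* strictly better move found */
            gamma = value[m];
            *best_move = m;
        }
    }
    return gamma;
}
```

Everywhere else in the tree only the score is needed, so the interior search functions stay lean and just return gamma.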

This formulation requires exactly the same amount of work as the matrix formulation did, but further pruning can be done on this tree. The Alpha-Beta algorithm improves on the basic minimax algorithm by passing bounds down through the tree, which lets it prune branches that can be shown to have no bearing on the minimax value of the game tree. With Alpha-Beta, one can search the optimal number of terminal positions required to determine the minimax value (if one always searches the best move first at every node)! In practice, that doesn't happen (why would you need to search if you already knew the best move?), so there are variants of Alpha-Beta, such as NegaScout, that have been shown to be significantly better than plain Alpha-Beta in practice.
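A compact way to see how the bounds are passed down is the negamax formulation of Alpha-Beta, a common reformulation in which one function serves both players by negating scores between plies. The toy tree below is an assumption made for illustration, with leaf payoffs stored with respect to the side to move at the root.

```c
/* Negamax Alpha-Beta over a toy, hard-coded tree (an assumption for
   illustration).  Node 0 is the root; leaves carry payoffs with respect
   to the root player.  sign is +1 when the root player is to move at
   node n, -1 otherwise. */
#define NODES 7
static const int child[NODES][2] = {
    {1, 2}, {3, 4}, {5, 6},
    {-1, -1}, {-1, -1}, {-1, -1}, {-1, -1}
};
static const int payoff[NODES] = { 0, 0, 0, 1, -1, 0, 1 };

static int AlphaBeta(int n, int alpha, int beta, int sign)
{
    if (child[n][0] < 0)
        return sign * payoff[n];   /* leaf: payoff for side to move */
    for (int i = 0; i < 2; i++) {
        /* The child sees the window negated and swapped. */
        int v = -AlphaBeta(child[n][i], -beta, -alpha, -sign);
        if (v > alpha)
            alpha = v;             /* raise the lower bound */
        if (alpha >= beta)
            break;                 /* cutoff: this branch cannot matter */
    }
    return alpha;
}
```

Calling AlphaBeta(0, -1000, 1000, 1) returns the same minimax value a full search would, but whenever alpha meets beta the remaining siblings are skipped, since the opponent already has a way to avoid that line.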

However, it would still take a modern computer millions of years to evaluate the full game tree for the game of chess if one had to go all the way to the terminal nodes. How can we control the size of the search?


________________________________________________________

3.2 Iterative Deepening