alpha-beta pruning
Reduces the complexity of the minimax search by not considering branches of the game tree that cannot influence the final decision
Alpha-beta pruning returns the same outcome of the “vanilla” minimax search.
Two parameters:
- $\alpha$: It represents the value of the best move found so far for MAX (“at least”)
- $\beta$: It represents the value of the best move found so far for MIN ("at most")
The pruning works as follow:
- Update the a and b values as soon as possible
- Interrupt the search below a MAX node when the a value is larger or equal to the b value
- Interrupt the search below a MIN node when the b value is smaller or equal to the a value
Alpha–beta pruning can be applied to trees of any depth, and it is often possible to prune entire subtrees rather than just leaves. The general principle is this: consider a node $n$ somewhere in the tree such that Player has a choice of moving to $n$. If Player has a better choice either at the same level or at any point higher up in the tree, then Player will never move to $n$.
Complexity
Given the maximum number of legal moves in a state is b, minimax search will expand $O(b^h)$ nodes but the efficiency of alpha-beta pruning depends on the order in which nodes are examined. In the best case, alpha-beta pruning expands $O(b^{h/2})$ nodes.
In the average case (nodes considered in random order), alpha-beta pruning expands $O(b^{3h/4})$ nodes
Evaluation function
To make use of our limited computation time, we can cut off the search early and apply a heuristic evaluation function to states, effectively treating nonterminal nodes as if they were terminal. In other words, we replace the UTILITY function with EVAL, which estimates a state’s utility. We also replace the terminal test by a cutoff test, which must return true for terminal states, but is otherwise free to decide when to cut off the search, based on the search depth and any property of the state that it chooses to consider.
A heuristic evaluation function $Eval(s,p)$ returns an estimate of the expected utility of state $s$ to player $p$. For terminal states, it must be that $Eval(s,p) = Utility(s,p)$ and for nonterminal states, the evaluation must be somewhere between a loss and a win: $$Utility(loss,p) \le Evals(s,p) \le Utility(min,p)$$
What makes for a good evaluation function? First, the computation must not take too long! (The whole point is to search faster.) Second, the evaluation function should be strongly correlated with the actual chances of winning.
Most evaluation functions work by calculating various features of the state—for example, in chess, we would have features for the number of white pawns, black pawns, white queens, black queens, and so on.