Alpha beta pruning is an optimisation technique for the minimax algorithm. Through the course of this blog, we will discuss what alpha beta pruning means, we will discuss minimax algorithm, rules to find good ordering, and more.
- Introduction
- Minimax algorithm
- Key points in Alpha-beta Pruning
- Working of Alpha-beta Pruning
- Move Ordering in Pruning
- Rules to find Good ordering
- Codes in Python
Introduction
The word ‘pruning’ means cutting down branches and leaves. In data science pruning is a much-used term which refers to post and pre-pruning in decision trees and random forest. Alpha-beta pruning is nothing but the pruning of useless branches in decision trees. This alpha-beta pruning algorithm was discovered independently by researchers in the 1900s.
Alpha-beta pruning is an optimisation technique for the minimax algorithm which is discussed in the next section. The need for pruning came from the fact that in some cases decision trees become very complex. In that tree, some useless branches increase the complexity of the model. So, to avoid this, Alpha-Beta pruning comes to play so that the computer does not have to look at the entire tree. These unusual nodes make the algorithm slow. Hence by removing these nodes algorithm becomes fast.
Minimax algorithm
Minimax is a classic depth-first search technique for a sequential two-player game. The two players are called MAX and MIN. The minimax algorithm is designed for finding the optimal move for MAX, the player at the root node. The search tree is created by recursively expanding all nodes from the root in a depth-first manner until either the end of the game or the maximum search depth is reached. Let us explore this algorithm in detail.
As already mentioned, there are two players in the game, viz- Max and Min. Max plays the first step. Max’s task is to maximise its reward while Min’s task is to minimise Max’s reward, increasing its own reward at the same time. Let’s say Max can take actions a, b, or c. Which one of them will give Max the best reward when the game ends? To answer this question, we need to explore the game tree to a sufficient depth and assume that Min plays optimally to minimise the reward of Max.
Here is an example. Four coins are in a row and each player can pick up one coin or two coins on his/her turn. The player who picks up the last coin wins. Assuming that Max plays first, what move should Max make to win?
If Max picks two coins, then only two coins remain and Min can pick two coins and win. Thus picking up 1 coin shall maximise Max’s reward.
As you might have noticed, the nodes of the tree in the figure below have some values inscribed on them, these are called minimax value. The minimax value of a node is the utility of the node if it is a terminal node.
If the node is a non-terminal Max node, the minimax value of the node is the maximum of the minimax values of all of the node’s successors. On the other hand, if the node is a non-terminal Min node, the minimax value of the node is the minimum of the minimax values of all of the node’s successors.
Now we will discuss the idea behind the alpha beta pruning. If we apply alpha-beta pruning to the standard minimax algorithm it gives the same decision as that of standard algorithm but it prunes or cuts down the nodes that are unusual in decision tree i.e. which are not affecting the final decision made by the algorithm. This will help to avoid the complexity in the interpretation of complex trees.
Now let us discuss the intuition behind this technique. Let us try to find minimax decision in the below tree :
In this case,
Minimax Decision = MAX {MIN {3, 5, 10}, MIN {2, a, b}, MIN {2, 7, 3}}
= MAX {3, c, 2} = 3
Here in the above result you must have a doubt in your mind that how can we find the maximum from missing value. So, here is solution of your doubt also:
In the second node we choose the minimum value as c which is less than or equal to 2 i.e. c <= 2. Now If c <= 3 and we have to choose the max of 3, c, 2 the maximum value will be 3.
We have reached a decision without looking at those nodes. And this is where alpha-beta pruning comes into the play.
Key points in Alpha-beta Pruning
- Alpha: Alpha is the best choice or the highest value that we have found at any instance along the path of Maximizer. The initial value for alpha is – ∞.
- Beta: Beta is the best choice or the lowest value that we have found at any instance along the path of Minimizer. The initial value for alpha is + ∞.
- The condition for Alpha-beta Pruning is that α >= β.
- Each node has to keep track of its alpha and beta values. Alpha can be updated only when it’s MAX’s turn and, similarly, beta can be updated only when it’s MIN’s chance.
- MAX will update only alpha values and MIN player will update only beta values.
- The node values will be passed to upper nodes instead of values of alpha and beta during go into reverse of tree.
- Alpha and Beta values only be passed to child nodes.
Working of Alpha-beta Pruning
- We will first start with the initial move. We will initially define the alpha and beta values as the worst case i.e. α = -∞ and β= +∞. We will prune the node only when alpha becomes greater than or equal to beta.
2. Since the initial value of alpha is less than beta so we didn’t prune it. Now it’s turn for MAX. So, at node D, value of alpha will be calculated. The value of alpha at node D will be max (2, 3). So, value of alpha at node D will be 3.
3. Now the next move will be on node B and its turn for MIN now. So, at node B, the value of alpha beta will be min (3, ∞). So, at node B values will be alpha= – ∞ and beta will be 3.
In the next step, algorithms traverse the next successor of Node B which is node E, and the values of α= -∞, and β= 3 will also be passed.
4. Now it’s turn for MAX. So, at node E we will look for MAX. The current value of alpha at E is – ∞ and it will be compared with 5. So, MAX (- ∞, 5) will be 5. So, at node E, alpha = 5, Beta = 5. Now as we can see that alpha is greater than beta which is satisfying the pruning condition so we can prune the right successor of node E and algorithm will not be traversed and the value at node E will be 5.
6. In the next step the algorithm again comes to node A from node B. At node A alpha will be changed to maximum value as MAX (- ∞, 3). So now the value of alpha and beta at node A will be (3, + ∞) respectively and will be transferred to node C. These same values will be transferred to node F.
7. At node F the value of alpha will be compared to the left branch which is 0. So, MAX (0, 3) will be 3 and then compared with the right child which is 1, and MAX (3,1) = 3 still α remains 3, but the node value of F will become 1.
8. Now node F will return the node value 1 to C and will compare to beta value at C. Now its turn for MIN. So, MIN (+ ∞, 1) will be 1. Now at node C, α= 3, and β= 1 and alpha is greater than beta which again satisfies the pruning condition. So, the next successor of node C i.e. G will be pruned and the algorithm didn’t compute the entire subtree G.
Now, C will return the node value to A and the best value of A will be MAX (1, 3) will be 3.
The above represented tree is the final tree which is showing the nodes which are computed and the nodes which are not computed. So, for this example the optimal value of the maximizer will be 3.
Move Ordering in Pruning
The effectiveness of alpha – beta pruning is based on the order in which node is examined. Move ordering plays an important role in alpha beta pruning.
There are two types of move ordering in Alpha beta pruning:
- Worst Ordering: In some cases of alpha beta pruning none of the node pruned by the algorithm and works like standard minimax algorithm. This consumes a lot of time as because of alpha and beta factors and also not gives any effective results. This is called Worst ordering in pruning. In this case, the best move occurs on the right side of the tree.
- Ideal Ordering: In some cases of alpha beta pruning lot of the nodes pruned by the algorithm. This is called Ideal ordering in pruning. In this case, the best move occurs on the left side of the tree. We apply DFS hence it first search left of the tree and go deep twice as minimax algorithm in the same amount of time.
Rules to find Good ordering
- The best move happens from the lowest node
- Use domain knowledge while finding the best move
- Order of nodes should be in such a way that the best nodes will be computed first
Codes in Python:
class MinimaxABAgent: """ Minimax agent """ def __init__(self, max_depth, player_color): """ Initiation Parameters ---------- max_depth : int The max depth of the tree player_color : int The player's index as MAX in minimax algorithm """ self.max_depth = max_depth self.player_color = player_color self.node_expanded = 0 def choose_action(self, state): """ Predict the move using minimax algorithm Parameters ---------- state : State Returns ------- float, str: The evaluation or utility and the action key name """ self.node_expanded = 0 start_time = time.time() print("MINIMAX AB : Wait AI is choosing") list_action = AIElements.get_possible_action(state) eval_score, selected_key_action = self._minimax(0,state,True,float('-inf'),float('inf')) print("MINIMAX : Done, eval = %d, expanded %d" % (eval_score, self.node_expanded)) print("--- %s seconds ---" % (time.time() - start_time)) return (selected_key_action,list_action[selected_key_action]) def _minimax(self, current_depth, state, is_max_turn, alpha, beta): if current_depth == self.max_depth or state.is_terminal(): return AIElements.evaluation_function(state, self.player_color), "" self.node_expanded += 1 possible_action = AIElements.get_possible_action(state) key_of_actions = list(possible_action.keys()) shuffle(key_of_actions) #randomness best_value = float('-inf') if is_max_turn else float('inf') action_target = "" for action_key in key_of_actions: new_state = AIElements.result_function(state,possible_action[action_key]) eval_child, action_child = self._minimax(current_depth+1,new_state,not is_max_turn, alpha, beta) if is_max_turn and best_value < eval_child: best_value = eval_child action_target = action_key alpha = max(alpha, best_value) if beta <= alpha: break elif (not is_max_turn) and best_value > eval_child: best_value = eval_child action_target = action_key beta = min(beta, best_value) if beta <= alpha: break return best_value, action_target
In this document we have seen an important component of game theory. Although the minimax algorithm’s performance is good but the algorithm is slow. So to make it fast we use alpha-beta pruning algorithm which will cut down the unusual nodes from the decision tree to improve the performance. Nowadays fast and well-performed algorithm is widely used.
This article has been published from the source link without modifications to the text. Only the headline has been changed.