Dynamic Branch Prediction
The basic idea of dynamic branch prediction is to use the past branch behavior to predict the future. We need hardaware to dynamiccaly predict the outcome of a branch: the prediction will depend on the behavior at run time.
Two interacting hardware blocks placed in the instruction fetch stage are needed to predict the next instruction to fetch:
- Branch Outcome Predictor (BOP): predicts the direction of a branch (taken or not)
- Branch Target Predictor / Branch Target Buffer (BTB): predicts the branch target address (PTA _ Predicted Target Address) in case of taken branch
If the predicted outcome is a misspediction we will need to flush the fetched instruction resulting in a one cycle penalty (optimized MIPS).
1) Branch History Table (or Branch Prediction Buffer)
Table containing 1 bit for each entry that says whether the branch was recentky taken or not.
The table is indexed by the lower porition k-bit of the address of the branch instruction and, for locality, we would expect that the most significant bits of the branch address are not changed.
The table has no tags (every access is a hit) and the prediction bit could has been put there by another branch with the same low-order address bits: but it doesn’t matter. The prediction is just a hint!
A misprediction occurs when one of the following:
- The prediction is incorrect for that branch
- The same index has been referenced by two different branches, and the previous history refers to the other branch (This can occur because there is no tag check). To reduce this problem it is enough to increase the number of rows in the BHT (that is to increase k) or to use a hashing function (such as in GShare).
A major shortcoming of the 1-bit BHT is that in a loop branch, even if a branch is almost always taken and then not taken once, the 1-bit BHT will mispredict twice:
- At the last loop iteration, since the prediction bit will say taken, while we need to exit from the loop
- When we re-enter the loop, at the end of the first loop iteration we need to take the branch to stay in the loop, while the prediction bit say to exit from the loop.
We can consider a BHT with 2 bits to fix this shortcoming and allowing only one missprediction in loops.
In general it is possible to build an n bits BHT where when the counter is greater than or equal to one-half of its maximum value ($2^n-1$), the branch is predicted as taken.