Newsportal USENET - Re: Interpreters and indirect-branch prediction

On 11/13/2024 1:36 PM, MitchAlsup1 wrote:

On Wed, 13 Nov 2024 8:20:27 +0000, Anton Ertl wrote:

mitchalsup@aol.com (MitchAlsup1) writes:
For something like a bytecode interpreter, the prediction accuracy
of the jump predictor is going to be epically bad.
>
And with "jump predictor", you probably mean the indirect-branch
predictor.
>
On hardware of up to around 2005 where the indirect branches were at
best predicted by BTBs with 2-bit counters, a "bytecode interpreter"
with one shared indirect branch (as generated by the usual
for...{switch...{...}} idiom in C) results in misprediction rates of
80% [ertl&gregg03jilp].
That certainly qualifies as bad.

Can note that in my branch predictors, I ended up with a special case for "predict the opposite of last time" to try to deal with branches which alternated between being taken and not taken (which could get greater than a 50% mispredict with the more traditional approaches, like the 2-bit state model; the transitory states would allow it to adapt to these cases).
A more elaborate state machine could potentially predict simple RLE patterns, but designing the state-transition graph would be harder.
So, say:
   000: Weakly Not Taken
   001: Strong Not Taken
   010: Weakly Transitory Not Taken
   011: Strong Transitory Not Taken
   100: Weakly Taken
   101: Strong Taken
   110: Weakly Transitory Taken
   111: Strong Transitory Taken
Where the transitory states always flip-flop.
Had done better than a 2-bit state machine (with no transitory states). Adding additional "strength" levels did not help with accuracy in past attempts (actually seemed to make it slightly worse).
Could maybe augment this with a 2-bit periodic field:
   00: Period 0, 0000, 1111, 1010, 0101
   01: Period 1, 001001, 010010, 100100, -, 110110, 101101, 011011
   1z: Period 2, 00010001, ..., 11101110, ...
Though, with the longer periods always assuming a "weak" state, here the pattern also needs to be able to express the phase of the pattern.
Best state transition diagram is less obvious, correct prediction likely jumps to the next stage of the predicted pattern, whereas a mispredict jumps to a different pattern.
One other possibility being to use 5 or 6 bits of state and throw it at a genetic algorithm and see what the genetic algorithm comes up with (probably using some variants of the existing state machine as an initial seed guess).
Say, ranking each possible state machine based on how well it can predict a series of simple repetitive bit patterns of various periods.
Tempting to consider using a "majority of 8" mutator, as in this case one may want to limit mutation rate (and "majority of 3" might not be stable enough, majority of 5 lacks a cheap way to evaluate, but "majority of 8" allows representing each bit as a byte and using a lookup table). Well, or maybe "majority of 7".
May try messing with it if anyone is interested.
...

With one a non-shared indirect branch per VM
instruction implementation (as in typical threaded-code
implementations), we measured 50%-61% mispredictions for BTB with
2-bit counters in simulation [ertl&gregg03jilp] and similar results on
actual CPUs.
50% misprediction rate qualifies as bad.
>
We simulated 2-level history-based indirect branch predictors
[ertl&gregg03jilp], and they typically have less than 10%
mispredictions.
10% misprediction rate qualifies as "not so bad".

Date	Sujet	#	Auteur
23 Oct 24	Reverse engineering of Intel branch predictors	34	Thomas Koenig
23 Oct 24	Re: Reverse engineering of Intel branch predictors	24	MitchAlsup1
28 Oct 24	Re: Reverse engineering of Intel branch predictors	23	Stefan Monnier
5 Nov 24	Re: Reverse engineering of Intel branch predictors	22	MitchAlsup1
11 Nov 24	Re: Reverse engineering of Intel branch predictors	2	Thomas Koenig
11 Nov 24	Re: Reverse engineering of Intel branch predictors	1	MitchAlsup1
11 Nov 24	Re: Reverse engineering of Intel branch predictors	19	Stefan Monnier
11 Nov 24	Re: Reverse engineering of Intel branch predictors	18	MitchAlsup1
11 Nov 24	Re: Reverse engineering of Intel branch predictors	17	Stefan Monnier
12 Nov 24	Re: Reverse engineering of Intel branch predictors	16	MitchAlsup1
12 Nov 24	Re: Reverse engineering of Intel branch predictors	13	Stefan Monnier
12 Nov 24	Re: Reverse engineering of Intel branch predictors	12	MitchAlsup1
12 Nov 24	Re: Reverse engineering of Intel branch predictors	7	Stefan Monnier
13 Nov 24	Re: Reverse engineering of Intel branch predictors	6	Terje Mathisen
13 Nov 24	Re: Reverse engineering of Intel branch predictors	5	Stefan Monnier
13 Nov 24	Re: Reverse engineering of Intel branch predictors	4	Thomas Koenig
13 Nov 24	Re: Reverse engineering of Intel branch predictors	2	Stefan Monnier
14 Nov 24	Re: Reverse engineering of Intel branch predictors	1	Thomas Koenig
14 Nov 24	Interpreters and indirect-branch prediction (was: Reverse ...)	1	Anton Ertl
13 Nov 24	Interpreters and indirect-branch prediction	4	Anton Ertl
13 Nov 24	Re: Interpreters and indirect-branch prediction	3	MitchAlsup1
13 Nov 24	Re: Interpreters and indirect-branch prediction	2	BGB
14 Nov 24	Re: Interpreters and indirect-branch prediction	1	BGB
12 Nov 24	Re: Reverse engineering of Intel branch predictors	2	Brett
13 Nov 24	Re: Reverse engineering of Intel branch predictors	1	MitchAlsup1
1 Nov 24	Re: Reverse engineering of Intel branch predictors	9	Waldek Hebisch
1 Nov 24	Re: Reverse engineering of Intel branch predictors	1	MitchAlsup1
5 Nov 24	Re: Reverse engineering of Intel branch predictors	7	Stefan Monnier
5 Nov 24	Re: Reverse engineering of Intel branch predictors	1	MitchAlsup1
8 Nov 24	Re: Reverse engineering of Intel branch predictors	5	Waldek Hebisch
8 Nov 24	Re: Reverse engineering of Intel branch predictors	3	MitchAlsup1
10 Nov 24	Re: Reverse engineering of Intel branch predictors	2	Waldek Hebisch
10 Nov 24	Re: Reverse engineering of Intel branch predictors	1	MitchAlsup1
11 Nov 24	Re: Reverse engineering of Intel branch predictors	1	Stefan Monnier