Subject: Re: Reverse engineering of Intel branch predictors
From: monnier (at) *nospam* iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Date: 11 Nov 2024, 21:36:50
Organization: A noiseless patient Spider
Message-ID: <jwvcyj1sefl.fsf-monnier+comp.arch@gnu.org>
References: 1 2 3 4
User-Agent: Gnus/5.13 (Gnus v5.13)

>> I don't understand the "thus not needing prediction".  Loading IP from
>> memory takes time, doesn't it?  Depending on your memory hierarchy and
>> where the data is held, I'd say a minimum of 3 cycles and often more.
>> What do you do during those cycles?
>
> It is not that these things don't need prediction, it is that you do the
> prediction and then verify the prediction using different data.

I see, so you still need something similar to a BTB for operations like
JTT, but the delay until you can verify the prediction is shorter, which
should presumably reduce the cost of mispredictions.

> For example: The classical way to do dense switches is a LD of the
> target address and a jump to the target.  This requires verifying the
> address of the target.  Whereas if you predict as JTT does, you verify
> by matching the index number (which is known earlier), and since the
> table is read-only you don't need to verify the target address.

Hmm... but in order not to have bubbles, your prediction structure still
needs to give you a predicted target address (rather than a predicted
index number), right?

Also in order to be able to verify the index rather than the
target address, your prediction structure will *also* need to give you
the predicted index?

So, rather than a BTB which just gives you a predicted target address,
you'll need something that returns both a target address and the
corresponding index (and the correspondence needs to be reliable if we
verify only the index, tho I guess you could also verify both)?

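To make the question concrete, here is a minimal C sketch of the kind of
structure I have in mind.  Everything in it is made up for illustration:
the names (jt_entry, jt_predict, jt_verify, jt_update), the 64-entry
direct-mapped layout, and the hashing of the PC are all assumptions, not
a description of how JTT is actually implemented.  It only shows an
entry that carries both the predicted index and the predicted target,
with verification done on the index alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define JT_ENTRIES 64u          /* hypothetical predictor size */

/* One entry: unlike a plain BTB entry, it also remembers the index. */
typedef struct {
    bool     valid;
    uint64_t pc;                /* address of the table-jump instruction */
    uint32_t index;             /* predicted table index */
    uint64_t target;            /* table[index], recorded at training time */
} jt_entry;

typedef struct {
    jt_entry e[JT_ENTRIES];
} jt_predictor;

/* Direct-mapped slot selection (an assumption; real hashing would differ). */
static size_t jt_slot(uint64_t pc)
{
    return (size_t)((pc >> 2) % JT_ENTRIES);
}

/* Front end: on a hit, copy the (index, target) pair to *out and return
 * true, so fetch can continue at out->target without a bubble. */
bool jt_predict(const jt_predictor *p, uint64_t pc, jt_entry *out)
{
    assert(p != NULL && out != NULL);
    const jt_entry *e = &p->e[jt_slot(pc)];
    if (!e->valid || e->pc != pc)
        return false;
    *out = *e;
    return true;
}

/* Back end: compare only the index.  This is sound only if the table is
 * read-only and the entry's index and target were recorded together. */
bool jt_verify(const jt_entry *pred, uint32_t actual_index)
{
    assert(pred != NULL);
    return pred->index == actual_index;
}

/* Training: record index and target as one pair once the jump resolves. */
void jt_update(jt_predictor *p, uint64_t pc, uint32_t index, uint64_t target)
{
    assert(p != NULL);
    jt_entry *e = &p->e[jt_slot(pc)];
    e->valid  = true;
    e->pc     = pc;
    e->index  = index;
    e->target = target;
}
```

In this sketch the front end redirects to `target` as soon as
jt_predict() hits, while the back end calls jt_verify() as soon as the
index is computed, without waiting for the load from the table.
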
Stefan