Anton Ertl wrote:
For the record, Sprangle and I did the Agree predictor on a machine
EricP <ThatWouldBeTelling@thevillage.com> writes:
I have difficulty believing that the branch predictor values from some
thread in one process would have anything but a *negative* impact on a
random different thread in a different process.
This sounds very similar to the problem of aliasing of two different
branches in the branch predictor. The branch predictor researchers
have looked into that, and found that it does not pay off to tag
predictions with the branches they are for. The aliased branch is at
least as likely to benefit from the prediction as it is to suffer from
interference; as a further measure agree predictors [sprangle+97] were
proposed; I don't know if they ever made it into practical
application.
Yes, I assume aliasing is possible as one source of erroneous
predictions.
I view the branch predictor (BP) as a black box attached to the Fetch
stage. Fetch feeds BP with the current Fetch RIP virtual address
(FetRipVir) and gets back a hit/miss signal and, on a hit, the kind of
branch/jump it is supposed to be and a target virtual address
(TargRipVir) to fetch from next.

As the BP would use a subset of the FetRipVir bits to index its tables,
or the equivalent in the Branch History Register (BHR), it is possible
for BP to erroneously send Fetch off on a wild goose chase, triggering
I-TLB table walks and/or I-cache misses.

A similar effect to aliasing occurs on address space switch, because the
table indexes and PHT contents learned for one virtual address space are
completely different from those of another.

Then it becomes a matter of how quickly the mistake can be detected,
the previous path canceled and the correct path established, and at
what cost.
As for the idea of erasing the branch predictor on process switch:

Consider the case where your CPU-bound process has to make way for a
short time slice of an I/O-bound process, and once that has submitted
its next synchronous I/O request, your CPU-bound process gets control
again. The I/O-bound process tramples over only a small part of the
branch predictor state, but if you erase on process switch, all the
branch predictor state will be gone when the CPU-bound process gets
the CPU core again. That's the reason why we do not erase
microarchitectural state on context switch; we do it neither for
caches nor for branch predictors.
Caches are not erased because they (a) usually are physically indexed
and physically tagged and (b) use all physical address bits in the
index and tag.
If a cache is virtually indexed and tagged then it must be flushed on
address space switch, or its entries must also be tagged with an ASID.

Where branch predictors use addresses, they use fetch virtual addresses,
and any tables indexed by those VAs will be invalid in a different
process.
Also, to save space they often don't use the full address bits but a
subset, which leads to aliasing of BP info for different instructions.
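To make the aliasing point concrete, here is a minimal Python sketch of an untagged direction predictor indexed by a subset of the fetch virtual address bits. The table size, the choice of index bits, the initial counter state and the two addresses are all assumptions picked for illustration; no real design is being described.

```python
# Hypothetical untagged predictor: 2-bit saturating counters indexed by a
# subset of the fetch virtual address bits. All sizes are assumptions.
INDEX_BITS = 10  # assumed: 1024-entry table


def bp_index(fetch_va):
    """Drop the low 2 bits, keep the next INDEX_BITS bits as the index."""
    return (fetch_va >> 2) & ((1 << INDEX_BITS) - 1)


class UntaggedPredictor:
    def __init__(self):
        # Every counter starts at 1 (weakly not-taken).
        self.counters = [1] * (1 << INDEX_BITS)

    def predict(self, fetch_va):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[bp_index(fetch_va)] >= 2

    def update(self, fetch_va, taken):
        i = bp_index(fetch_va)
        c = self.counters[i]
        self.counters[i] = min(c + 1, 3) if taken else max(c - 1, 0)


bp = UntaggedPredictor()
branch_a = 0x00401000  # made-up address, trained below
branch_b = 0x7FFF1000  # different made-up address, same index bits

for _ in range(2):
    bp.update(branch_a, taken=True)

aliased = bp_index(branch_a) == bp_index(branch_b)
prediction_for_b = bp.predict(branch_b)
print(aliased, prediction_for_b)  # prints: True True
```

branch_b has never been executed, yet it inherits the "taken" prediction trained by branch_a, because nothing in the entry records which branch it belongs to. Whether that inherited prediction helps or hurts is exactly what is being argued in this thread.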
Moreover, another process will likely use some of the same libraries
the earlier process used, and will benefit from having the branches in
the library predicted (unless ASLR prevents them from using the same
entries in the branch predictor).
Even assuming this effect is significant, I don't think it justifies
opening a security hole by retaining the BP tables, any more than it
would justify retaining the TLB for the prior address space.

@InProceedings{sprangle+97,
  author =    {Eric Sprangle and Robert S. Chappell and Mitch Alsup
               and Yale N. Patt},
  title =     {The Agree Predictor: A Mechanism for Reducing
               Negative Branch History Interference},
  crossref =  {isca97},
  pages =     {284--291},
  annote =    {Reduces the number of conflict mispredictions by
               having the predictor entries predict whether or not
               some other predictor (say, a static predictor) is
               correct. This increases the chance that the
               predicted direction is correct in case of a
               conflict.}
}

@Proceedings{isca97,
  title =     {$24^\textit{th}$ Annual International Symposium on
               Computer Architecture},
  booktitle = {$24^\textit{th}$ Annual International Symposium on
               Computer Architecture},
  year =      {1997},
  key =       {ISCA 24}
}
Because if you retain
the predictor values then the new thread has to unlearn what it learned,
before it starts to learn values for the new thread. Whereas if the
predictor is flushed it can immediately learn its own values.
Unlearn? The only thing I can think of in that direction is that a
two-bit counter (for some history and maybe branch address) happens to
be in a state where two mispredictions instead of one are necessary
before the prediction changes. Anyway, branch prediction research
looked into the issue a long time ago and found that erasing on
context switch is a net loss.
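The two-bit counter remark can be checked with a few lines of Python. This is only a sketch of a standard saturating counter (0 and 1 predict not-taken, 2 and 3 predict taken); the helper names are made up for illustration.

```python
def step(counter, taken):
    """Advance a 2-bit saturating counter by one observed outcome."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)


def mispredictions_until_flip(counter, new_outcome):
    """Count mispredictions before the counter predicts new_outcome."""
    n = 0
    while (counter >= 2) != new_outcome:
        counter = step(counter, new_outcome)
        n += 1
    return n


# A leftover entry that is strongly taken, reused by a never-taken branch.
stale_strong = mispredictions_until_flip(3, new_outcome=False)
# A leftover entry that is only weakly taken.
stale_weak = mispredictions_until_flip(2, new_outcome=False)
# A leftover entry that already points the right way.
stale_lucky = mispredictions_until_flip(0, new_outcome=False)

print(stale_strong, stale_weak, stale_lucky)  # prints: 2 1 0
```

So for a single counter the worst case of inheriting stale state is one extra misprediction compared with a weak initial state, and in the lucky case the stale state costs nothing.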
- anton
In the above Agree Predictor the two-bit Pattern History Table (PHT)
is indexed by the multi-bit Branch History Table (BHT),
and the BHT must be retrained before it generates useful PHT indexes.

The Branch Bias Table (BBT) holds one bit per entry and is indexed by
the lower bits of the Fetch RIP XOR'ed with the BHT. Even though there
is only one bit to toggle to train it, the XOR with the BHT means it
too will only generate useful indexes to select that one bit after the
BHT is retrained. Until then it will be toggling the wrong bias bits.
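For readers who have not seen the scheme, here is a rough Python sketch of the agree idea as summarized in the annote of the BibTeX entry: the counters predict whether a per-branch bias bit is correct, not the direction itself. Everything concrete here is an assumption for illustration: the 8-bit history, the gshare-style XOR index, counters starting at "strongly agree", the bias bit set from the first observed outcome, and the bias bit keyed by branch address alone (which differs from the indexing described above).

```python
HIST_BITS = 8  # assumed history length and PHT size (256 counters)
MASK = (1 << HIST_BITS) - 1


class AgreePredictor:
    def __init__(self):
        self.history = 0                   # global branch history register
        self.pht = [3] * (1 << HIST_BITS)  # 2-bit counters, 3 = strongly agree
        self.bias = {}                     # branch address -> bias bit (assumed)

    def _index(self, pc):
        return ((pc >> 2) ^ self.history) & MASK

    def predict(self, pc):
        bias = self.bias.get(pc, False)    # unseen branch: assume not taken
        agree = self.pht[self._index(pc)] >= 2
        return bias if agree else not bias

    def update(self, pc, taken):
        bias = self.bias.setdefault(pc, taken)  # first outcome sets the bias
        i = self._index(pc)
        c = self.pht[i]
        # Train toward "agree" when the outcome matched the bias bit.
        self.pht[i] = min(c + 1, 3) if taken == bias else max(c - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & MASK


# Two made-up branches with opposite behaviour, interleaved.
bp = AgreePredictor()
always_taken, never_taken = 0x1000, 0x2000
mispredicts = 0
for _ in range(100):
    for pc, outcome in ((always_taken, True), (never_taken, False)):
        if bp.predict(pc) != outcome:
            mispredicts += 1
        bp.update(pc, outcome)

print(mispredicts)  # prints: 1
```

The only miss is the first encounter of the always-taken branch, before its bias bit is set. After that both branches push every counter they touch toward "agree", so it does not matter which PHT entries they happen to share. That is the sense in which negative interference is reduced.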

In other BPs there are set-associative Branch Target Buffers (BTB)
that remember the target virtual address a branch will go to.
The same goes for the Indirect Branch Predictor and the CALL/RET stack
predictor.

All of these would repeatedly send Fetch off on wild goose chases
until the current execution detects the mistakes, squashes any
instructions fetched along the erroneous path, cancels any pending
loads it triggered, and overwrites these entries.
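A small Python sketch of the stale-target problem follows. It models a BTB tagged only with virtual address bits and no ASID; it is direct-mapped for brevity, not set-associative, and the sizes, addresses and names are all made up for illustration.

```python
from dataclasses import dataclass

INDEX_BITS = 8  # assumed: 256 entries, direct-mapped for brevity


@dataclass
class BTBEntry:
    tag: int     # upper fetch VA bits; no ASID in this sketch
    kind: str    # e.g. "call", "jump", "cond"
    target: int  # target virtual address


class BTB:
    def __init__(self):
        self.entries = {}

    @staticmethod
    def _split(fetch_va):
        index = (fetch_va >> 2) & ((1 << INDEX_BITS) - 1)
        tag = fetch_va >> (2 + INDEX_BITS)
        return index, tag

    def install(self, fetch_va, kind, target):
        index, tag = self._split(fetch_va)
        self.entries[index] = BTBEntry(tag, kind, target)

    def lookup(self, fetch_va):
        index, tag = self._split(fetch_va)
        entry = self.entries.get(index)
        if entry is not None and entry.tag == tag:
            return entry
        return None  # miss: Fetch simply continues sequentially

    def flush(self):
        self.entries.clear()


btb = BTB()
# Process A trains a call at this VA.
btb.install(0x00401000, "call", 0x00402000)
# Address space switch without a flush: process B fetches the same VA,
# where it may have entirely different code, and still gets a hit.
stale = btb.lookup(0x00401000)
# With a flush on switch, the same lookup misses.
btb.flush()
after_flush = btb.lookup(0x00401000)

print(hex(stale.target), after_flush)  # prints: 0x402000 None
```

In the no-flush case Fetch would be redirected to 0x402000 in process B's address space, and the mistake would only be caught later in the pipeline, which is the wild goose chase described above.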