Subject : Re: Another security vulnerability
From : monnier (at) *nospam* iro.umontreal.ca (Stefan Monnier)
Newsgroups : comp.arch
Date : 05. Apr 2024, 17:54:50
Organization : A noiseless patient Spider
Message-ID : <jwvcyr3bwca.fsf-monnier+comp.arch@gnu.org>
References : 1 2 3 4 5 6 7 8 9 10 11
User-Agent : Gnus/5.13 (Gnus v5.13)
Since each chased pointer starts back at LSQ, the cost is no different
than an explicit Prefetch instruction, except without (a),(b) and (c)
having been applied first.
I thought the important difference is that the decision to prefetch or
not can be done dynamically based on past history.
Programmers and compilers are notoriously bad at predicting
branches (except for error branches), but ought to be quite good
at predicting prefetches. If a pointer is loaded, chances are
very high that it will be dereferenced.
I don't think it's that simple: prefetches only bring the data into L1
cache, so they're only useful if:
- The data is not already in L1.
- The data will be used soon (i.e. before it gets thrown away from the cache).
- The corresponding load doesn't occur right away.
In all other cases, the prefetch will be just wasted work.
It's easy for programmers to "predict" those (dependent) loads which will occur
right away, but those don't really benefit from a prefetch.
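To make the distinction concrete, here is a minimal sketch in C. It assumes
GCC or Clang (for `__builtin_prefetch`, which is only a hint to the hardware);
the `struct node` layout, the function names and `slow_work` are hypothetical
and exist only for illustration. The first loop is the "easy to predict" case
where the dependent load follows right away; the second puts a stretch of
other work between the prefetch and the dependent load, which is the only
case of the two where the prefetch has a chance of paying off.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
struct node {
    struct node *next;
    long payload;
};

/* Stand-in for a longer stretch of per-node work (hypothetical). */
long slow_work(long x)
{
    unsigned long acc = (unsigned long)x;
    for (unsigned i = 0; i < 64; i++)
        acc = acc * 31u + i;        /* unsigned, so wraparound is defined */
    return (long)(acc % 1000u);
}

/* The dependent load of next->next follows almost immediately, so the
   prefetch starts the access only a cycle or two early. */
long sum_prefetch_late(const struct node *n)
{
    long total = 0;
    while (n != NULL) {
        const struct node *next = n->next;
        __builtin_prefetch(next);   /* hint only; harmless if next is NULL */
        total += n->payload;
        n = next;
    }
    return total;
}

/* Here work() runs between the prefetch and the next dereference, so the
   memory access can overlap with it. */
long sum_prefetch_early(const struct node *n, long (*work)(long))
{
    assert(work != NULL);
    long total = 0;
    while (n != NULL) {
        const struct node *next = n->next;
        __builtin_prefetch(next);
        total += work(n->payload);
        n = next;
    }
    return total;
}
```

Both functions return the same result with or without the prefetch; whether
the second one is actually faster depends on the cache state and on how much
of that overlap the out-of-order hardware already finds by itself.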
E.g. if the dependent load is done 2 cycles later, performing a prefetch
lets you start the memory access 2 cycles early, but since that access
misses L1 it'll take more than 10 cycles, so shaving
2 cycles off isn't of great benefit.
Given that we're talking about performing a prefetch on the result of
a previous load, and loads tend to already have a fairly high latency
(3-5 cycles), "2 cycles later" really means "5-7 cycles after the
beginning of the load of that pointer". That can easily translate to 20
instructions later.
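The arithmetic behind those numbers can be sketched as follows. This is only
a back-of-the-envelope model: the function names are made up, the 10-cycle
figure is the lower bound quoted above, and the sustained rate of 3 to 4
instructions per cycle is an assumption chosen to show how 5-7 cycles can
turn into about 20 instructions.

```c
#include <assert.h>

/* Cycles of miss latency still exposed when the access is started
   lead_cycles before the data is needed. */
int exposed_latency(int miss_latency, int lead_cycles)
{
    assert(miss_latency >= 0 && lead_cycles >= 0);
    return lead_cycles >= miss_latency ? 0 : miss_latency - lead_cycles;
}

/* Rough instruction distance covered in a number of cycles at a given
   sustained IPC (an assumed figure, not a measured one). */
int instructions_in(int cycles, int ipc)
{
    assert(cycles >= 0 && ipc >= 0);
    return cycles * ipc;
}
```

With these, `exposed_latency(10, 2)` is 8, i.e. most of the miss is still
paid, while `instructions_in(5, 4)` and `instructions_in(7, 3)` give 20 and
21 instructions respectively.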
My gut feeling is that it's difficult for programmers to predict what
will happen more than 20 instructions ahead without looking at
detailed profiling.
Stefan