"Paul A. Clayton" <paaronclayton@gmail.com> writes:[snip]On 3/28/24 3:59 PM, MitchAlsup1 wrote:
For data-dependent (pointer chasing) prefetching, one would likeWhich is usually handled by the LLC when the address space isThis is the tidbit that prevents doing prefetches at/in the DRAM controller.>
The address so fetched needs translation !! And this requires dragging
stuff over to DRC that is not normally done.
With multiple memory channels having independent memory
controllers (a reasonable design I suspect), a memory controller
may have to send the prefetch request to another memory controller
anyway.
striped across multiple memory controllers.
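
To make the data dependence concrete, here is a rough C sketch; the
list layout, the 4-channel striping, and the 64-byte stripe size are
invented for illustration, not taken from any particular design:

#include <stddef.h>
#include <stdint.h>

struct node { struct node *next; uint64_t payload; };

/* Walk a linked list, prefetching one node ahead.  The address of the
   next node is only known after the current node has been loaded, and
   the stored pointer is a virtual address, so a prefetcher sitting
   at/in the DRAM controller would have to translate it first, which is
   the translation problem noted above. */
uint64_t walk(struct node *n)
{
    uint64_t sum = 0;
    while (n != NULL) {
        __builtin_prefetch(n->next);   /* data-dependent prefetch (GCC/Clang) */
        sum += n->payload;
        n = n->next;
    }
    return sum;
}

/* Invented striping: 64-byte granules interleaved over 4 channels.
   The line a fetched pointer names usually belongs to some other
   memory controller, while the LLC sees the whole physical address
   space, so that is where such a prefetch is naturally issued. */
unsigned channel_of(uint64_t paddr)
{
    return (unsigned)((paddr >> 6) & 3);   /* bits 7:6 pick 1 of 4 channels */
}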
>>>Most of the data was intended not to be cached. Instructions and
>>Busses on cores are reaching the stage where an entire cache line
>>is transferred in 1-cycle. With such busses, why define anything
>>smaller than a cache line ?? {other than uncacheable accesses}
>The Intel research chip was special-purpose targeting
>cache-unfriendly code. Reading 64 bytes when 99% of the time 56
>bytes would be unused is rather wasteful (and having more memory
>channels helps under high thread count).

Given the lack of both spatial and temporal locality in that
workload, one wonders if the data should be cached at all.
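
Roughly the access pattern in question, sketched in C (names and
sizes are arbitrary): with 8 useful bytes per 64-byte line, only
12.5% of the transferred data is ever touched.

#include <stddef.h>
#include <stdint.h>

/* Random 8-byte gathers over a table far larger than the caches.
   Each lookup uses one 64-bit word, so a cache-line-granular memory
   system moves 64 bytes to deliver 8 useful ones (8/64 = 12.5%
   utilization, i.e. the 56 wasted bytes mentioned above). */
uint64_t random_gather(const uint64_t *table, size_t n_words,
                       const uint32_t *idx, size_t n_lookups)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n_lookups; i++)
        sum += table[idx[i] % n_words];  /* ~1 word touched per line fetched */
    return sum;
}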
>>>Yes, but if the width of the on-chip network — which is what Mitch
>However, even for a "general purpose" processor, "word"-granular
>atomic operations could justify not having all data transfers be
>cache line size. (Such are rare compared with cache line loads
>from memory or other caches, but a design might have narrower
>connections for coherence, interrupts, etc. that could be used for
>small data communication.)

So long as the data transfer is cachable, the atomics can be handled
at the LLC, rather than the memory controller.
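
For concreteness, the kind of word-granular atomic being discussed,
in plain ISO C11; whether it is performed in the requesting core's
cache or as a far atomic at the LLC is exactly the design choice
above, and nothing vendor-specific is assumed here:

#include <stdatomic.h>
#include <stdint.h>

/* A word-granular atomic on ordinary cacheable data.  The coherence
   request it generates is still line-granular: either "give me the
   line in exclusive state" so the core does the read-modify-write in
   its own cache, or, in a design with far atomics, "perform this
   8-byte fetch-add at the point of coherence" (the LLC slice owning
   the line).  Either way it rides on the normal cache-line transfer
   machinery rather than requiring word-sized transfers on the main
   data network. */
_Atomic uint64_t counter;

uint64_t bump(void)
{
    return atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
}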