Subject: Re: DMA is obsolete
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 04. May 2025, 00:02:25
Organization: Rocksolid Light
Message-ID: <17ad830022e847950d47b90da1b555b7@www.novabbs.org>
References: 1 2 3 4 5 6
User-Agent: Rocksolid Light
On Sat, 3 May 2025 21:53:37 +0000, Scott Lurndal wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
>
Looking at
https://chipsandcheese.com/p/arms-cortex-a53-tiny-but-important, a
Cortex-A53 would not be up to it (at 1896MHz it can read <12GB/s and
write <18GB/s even to the L1 cache). However, Chester Lam notes: "A53
offers very low cache bandwidth compared to pretty much any other core
we’ve analyzed." I think, though, that a small in-order core like the
A53, but with enough load and store buffering and enough bandwidth to
I/O and the memory controller should not have a problem shoveling data
from or to a 400Gb/s NIC. With 128 bits/cycle in each direction one
would need one transfer per cycle in each direction at 3125MHz to
achieve 400Gb/s, or maybe 4GHz for a dual-issue core to allow for loop
overhead.
>
Running any SoC at 3+ GHz requires significant effort in the
back-end and to ensure timing closure on the front end (and
affects floorplanning). All this adds to the cost to build
and manufacture the chips.
>
It may be more productive to consider widening the internal
buses to be 256 or 512 bits wide.
Below 7nm there seems to be little reason for the main
interconnect not to be a cache line wide, or a cache line wide in
each direction. Your typical GPU will have 1024 wires into and out
of each shader core and several other big blocks.
Many cache lines are 512 bits wide (except for IBM at 4096 bits
wide).
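
For anyone who wants to check the arithmetic in this thread, here is a
minimal Python sketch. It assumes one full-width transfer per cycle in a
given direction and ignores loop and protocol overhead; the function name
required_clock_mhz is made up for illustration and is not from any post
above.

```python
def required_clock_mhz(link_gbps: float, bus_bits: int) -> float:
    """Clock in MHz needed to sustain link_gbps (gigabits per second)
    when the bus moves bus_bits bits per cycle in one direction.

    Assumption: exactly one full-width transfer completes every cycle.
    """
    bits_per_second = link_gbps * 1e9
    cycles_per_second = bits_per_second / bus_bits
    return cycles_per_second / 1e6


# Compare the 128-bit path from the quoted text with the wider
# 256-bit and 512-bit (cache-line-wide) buses suggested in the replies.
for width in (128, 256, 512):
    mhz = required_clock_mhz(400, width)
    print(f"{width:4d} bits/cycle -> {mhz:8.2f} MHz for 400Gb/s")
```

At 128 bits/cycle this gives the 3125MHz figure quoted above; at 256 and
512 bits/cycle the same 400Gb/s needs 1562.5MHz and 781.25MHz, which is
the point of widening the bus instead of raising the clock.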