john larkin <jl@glen--canyon.com> wrote:
> Since CPU cores are trivial nowadays - they cost a few cents each -
> the transputer concept may make sense again. We rely on an OS and
> compiler tricks to get apparent parallelism, and the price is
> complexity and bugs.
>
> Why not have a CPU per task? Each with a decent chunk of dedicated
> fast ram?
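
That one-CPU-per-task, message-passing arrangement is essentially the CSP
model the Transputer implemented. A minimal sketch of the idea, assuming
nothing beyond standard Go (goroutines standing in for per-task cores with
private memory, channels standing in for the links; the squaring stages are
invented purely for illustration):

package main

import "fmt"

// stage owns its "local memory" (the scratch slice) and talks to its
// neighbours only through its input and output channels.
func stage(in <-chan int, out chan<- int) {
    scratch := make([]int, 0, 64) // private state, never shared
    for v := range in {
        scratch = append(scratch, v) // local bookkeeping only
        out <- v * v                 // the per-task "work"
    }
    close(out)
}

func main() {
    a := make(chan int)
    b := make(chan int)
    c := make(chan int)

    go stage(a, b) // "core" 1
    go stage(b, c) // "core" 2

    go func() { // feeder
        for i := 1; i <= 4; i++ {
            a <- i
        }
        close(a)
    }()

    for v := range c {
        fmt.Println(v) // 1, 16, 81, 256: each value squared twice
    }
}

Each stage touches only its own scratch memory, so nothing has to arbitrate
a shared store - the property that gets harder to keep once the working set
grows to gigabytes.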

Intel tried that:
https://en.wikipedia.org/wiki/Xeon_Phi

(obviously using x86 was a bad idea, but apart from that...)

The issue is one of memory capacity and bandwidth. Many applications have a
large (GB-scale) dataset that doesn't partition nicely between multiple nodes.
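
As one concrete example of an access pattern that resists partitioning,
consider a sparse matrix-vector multiply. The toy CSR matrix below is
invented, but the gather it performs is the real issue: each row pulls
entries of x from arbitrary positions, so splitting a GB-scale x across
small per-core memories leaves nearly every core needing data that lives
elsewhere.

package main

import "fmt"

func main() {
    // 3x6 sparse matrix in CSR form: rowPtr, colIdx, vals.
    rowPtr := []int{0, 2, 4, 6}
    colIdx := []int{0, 5, 1, 4, 2, 3} // column indices jump across the whole of x
    vals := []float64{1, 2, 3, 4, 5, 6}
    x := []float64{1, 1, 1, 1, 1, 1} // in real workloads this is the GB-scale shared data

    y := make([]float64, 3)
    for row := 0; row < 3; row++ {
        for k := rowPtr[row]; k < rowPtr[row+1]; k++ {
            y[row] += vals[k] * x[colIdx[k]] // gathers from arbitrary positions in x
        }
    }
    fmt.Println(y) // [3 7 11]
}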

Even the largest FPGAs tend to have MB-scale amounts of memory on them, not
GB, because the memory density of a dedicated DRAM chip is so much better
than that of on-chip BRAMs. It turns out to be more efficient to use a large
external DRAM and drive it in a highly parallel way, pumping data through a
GPU-style core, than to have lots of little cores individually fetching
single words from their local BRAM. With the many-small-cores model you also
need a fabric for the little cores to communicate, whereas with one big DRAM
you get inter-core/thread communication for free - you just arrange to write
to a different part of the shared dataset and the next consumer picks it up.
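
A toy sketch of that hand-off through shared memory, again in plain Go with
invented sizes (goroutines stand in for cores, one slice stands in for the
big external DRAM):

package main

import (
    "fmt"
    "sync"
)

func main() {
    const workers = 4
    const chunk = 1 << 20 // 1M float64 per worker, ~8 MB each

    shared := make([]float64, workers*chunk) // stands in for the big external DRAM

    var wg sync.WaitGroup
    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func(w int) {
            defer wg.Done()
            region := shared[w*chunk : (w+1)*chunk] // this worker's own part of the dataset
            for i := range region {
                region[i] = float64(w) // "results" written straight into shared memory
            }
        }(w)
    }
    wg.Wait() // producers done; the consumer can now pick the data up

    var sum float64
    for _, v := range shared { // consumer reads the whole shared dataset directly
        sum += v
    }
    fmt.Println("sum =", sum) // (0+1+2+3) * chunk
}

The only coordination is waiting for the producers to finish; the data
itself never has to be copied across a fabric.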

You can of course put GDDR or HBM on an FPGA, but it's the same problem -
only a few memory devices have to be shared by numerous cores. Ultimately,
for large datasets, memory throughput beats latency hands down. This was
not such a problem in the Transputer's day, which is why that architecture
made sense.

Theo