Subject: Re: New ISA board to play with transputers
From: theom+news (at) *nospam* chiark.greenend.org.uk (Theo)
Newsgroups: sci.electronics.design
Date: 06 Jul 2025, 17:09:25
Organization: University of Cambridge, England
Message-ID: <fN*TFQgA@news.chiark.greenend.org.uk>
References: 1 2 3 4 5
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/5.10.0-35-amd64 (x86_64))
john larkin <jl@glen--canyon.com> wrote:
> Since CPU cores are trivial nowadays - they cost a few cents each -
> the transputer concept may make sense again. We rely on an OS and
> compiler tricks to get apparent parallelism, and the price is
> complexity and bugs.
>
> Why not have a CPU per task? Each with a decent chunk of dedicated
> fast ram?
Intel tried that:
https://en.wikipedia.org/wiki/Xeon_Phi
(obviously using x86 was a bad idea, but apart from that...)
The issue is one of memory capacity and bandwidth. Many applications have a
large (GB) dataset that doesn't partition nicely up between multiple nodes.
Even the largest FPGAs tend to have MB-scale amounts of memory on them, not
GB, because the memory density of a dedicated DRAM chip is so much better
than making on-chip BRAMs. It turns out to be more efficient to use a large
external DRAM and drive it in a highly parallel way, pumping data through a
GPU-style core, than it is to have lots of little cores individually
fetching single words from their local BRAM. With that model you also need
a fabric for the little cores to communicate, while with a big DRAM you get
inter-core/thread communication for free - you just arrange a write to a
different part of the shared dataset and the next consumer picks it up.
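
To make the "communication for free" point concrete, here's a minimal
pthreads sketch (my own illustration - the names and sizes are invented,
not from any real design): the producer writes its slice of one shared
array, the consumer picks it up after a barrier, and there's no link
fabric or message passing anywhere:

#include <pthread.h>
#include <stdio.h>

#define N (1 << 20)              /* one shared dataset in ordinary DRAM */

static int data[N];
static pthread_barrier_t done;   /* "my half is written" rendezvous */

/* Producer: fills the first half of the shared array. */
static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < N / 2; i++)
        data[i] = i;
    pthread_barrier_wait(&done); /* the barrier also synchronises memory */
    return NULL;
}

/* Consumer: waits, then reads the producer's half straight out of the
   shared array - the write itself was the communication. */
static void *consumer(void *arg)
{
    (void)arg;
    pthread_barrier_wait(&done);
    long long sum = 0;
    for (int i = 0; i < N / 2; i++)
        sum += data[i];
    printf("sum = %lld\n", sum);
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    pthread_barrier_init(&done, NULL, 2);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    pthread_barrier_destroy(&done);
    return 0;
}

Compile with -pthread; the point is that the "channel" is just an
address range in the shared dataset.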
You can of course put GDDR or HBM on an FPGA, but it's the same problem -
a handful of memory devices still has to be shared by numerous cores. Ultimately, memory
throughput beats latency hands down, especially for large datasets. This
was not such a problem in the Transputer's day, which is why that
architecture made sense.
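
The throughput-vs-latency gap is easy to see on any PC. A crude sketch
(my own, untested, sizes and numbers invented): chase dependent single
words through a big array - the "little core fetching single words"
case - then stream the same array sequentially:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N ((size_t)1 << 25)   /* 256 MB of 8-byte slots: no cache or BRAM helps */

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void)
{
    size_t *a = malloc(N * sizeof *a);
    if (!a) return 1;

    /* Sattolo's algorithm: one random cycle over the whole array, so
       every load in the chase depends on the previous one - pure DRAM
       latency, no prefetching possible. */
    for (size_t i = 0; i < N; i++) a[i] = i;
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* crude RNG, fine for a sketch */
        size_t t = a[i]; a[i] = a[j]; a[j] = t;
    }

    double t0 = now();
    size_t p = 0;
    for (size_t i = 0; i < N; i++)
        p = a[p];                        /* one dependent word at a time */
    double chase = now() - t0;

    t0 = now();
    size_t sum = 0;
    for (size_t i = 0; i < N; i++)
        sum += a[i];                     /* sequential, prefetch-friendly */
    double stream = now() - t0;

    printf("chase %.2fs  stream %.2fs  (p=%zu sum=%zu)\n",
           chase, stream, p, sum);
    free(a);
    return 0;
}

The chase pays full DRAM latency on every word while the stream runs at
bus bandwidth; on typical desktop hardware I'd expect roughly two orders
of magnitude between them, which is the same gap the little-BRAM cores
are fighting.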
Theo