Subject: Re: Microarch Club
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 25 Mar 2024, 23:17:03
Organization: Rocksolid Light
Message-ID: <80b47109a4c8c658ca495b97b9b10a54@www.novabbs.org>
References: 1 2
User-Agent: Rocksolid Light
BGB-Alt wrote:
On 3/21/2024 2:34 PM, George Musk wrote:
Thought this may be interesting:
https://microarch.club/
https://www.youtube.com/@MicroarchClub/videos
At least sort of interesting...
I guess one of the guys on there did a manycore VLIW architecture with the memory local to each of the cores. Seems like an interesting approach, though not sure how well it would work on a general purpose workload. This is also closer to what I had imagined when I first started working on this stuff, but it had drifted more towards a slightly more conventional design.
But, admittedly, this is for small-N cores; 16/32K of L1 with a shared L2 seemed like a better option than cores with a very large shared L1 cache.
You appear to be "starting to get it"; congratulations.
I am not sure that abandoning a global address space is such a great idea, as a lot of the "merits" can be gained instead by using weak coherence models (possibly with a shared 256K or 512K or so for each group of 4 cores, at which point it goes out to a higher latency global bus). In this case, the division into independent memory regions could be done in software.
Most of the last 50 years has been towards a single global address space.
It is unclear if my approach is "sufficiently minimal". There is more complexity than I would like in my ISA (and effectively turning it into the common superset of both my original design and RV64G doesn't really help matters here).
If going for a more minimal core optimized for perf/area, some stuff might be dropped. Would likely drop integer and floating-point divide
I think this is pound foolish even if penny wise.
again. Might also make sense to add an architectural zero register, and eliminate some number of encodings which exist merely because of the lack of a zero register (though, encodings are comparatively cheap, as the
I got an effective zero register without having to waste a register name to "get it". My 66000 gives you 32 registers of 64 bits each and you can put any bit pattern in any register and treat it as you like.
Accessing #0 takes 1/16 of a 5-bit encoding space, and is universally
available.
internal uArch has a zero register, and effectively treats immediate values as a special register as well, ...). Some of the debate is more related to the logic cost of dealing with some things in the decoder.
The problem is universal constants. RISCs are notably poor in their
support; however, this is still better than addressing modes which
require µCode.
Though, would likely still make a few decisions differently from those in RISC-V. Things like indexed load/store,
Absolutely
predicated ops (with a designated flag bit),
Predicated then and else clauses which are branch free.
{{Also good for constant time crypto in need of flow control...}}
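To make the constant-time point concrete, here is a minimal C sketch of what software has to do when the ISA offers no predication: compute both the "then" and "else" values and pick one with a mask. The names ct_select and ct_cswap are hypothetical, made up for this illustration; this is not My 66000 or BJX2 code. Note also that a C compiler is free to turn the mask idiom back into a branch, which is part of the argument for having predication in the ISA itself.

```c
#include <stdint.h>

/* Branch-free select (illustrative helper, not from any real library):
 * returns a when cond is nonzero, b otherwise. */
static uint64_t ct_select(uint64_t cond, uint64_t a, uint64_t b)
{
    /* mask is all ones when cond != 0, all zeros when cond == 0 */
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}

/* Branch-free conditional swap: exchanges *x and *y when cond is nonzero.
 * Both values are always read and written, whatever cond is. */
static void ct_cswap(uint64_t cond, uint64_t *x, uint64_t *y)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    uint64_t t = (*x ^ *y) & mask;  /* zero when no swap is wanted */
    *x ^= t;
    *y ^= t;
}
```

With hardware predication, the then-clause and else-clause would simply be issued under opposite predicates, and no mask arithmetic would be needed.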
and large-immediate encodings,
Nothing else is so poorly served in typical ISAs.
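As an illustration of how awkward constants get without large-immediate encodings, consider the usual RISC-V way of building a 32-bit constant: a LUI carrying the upper 20 bits followed by an ADDI carrying a signed 12-bit value. Because ADDI sign-extends its immediate, the upper part has to be rounded up whenever bit 11 of the constant is set. The C sketch below models only that 32-bit arithmetic; struct rv_imm32, rv_split_imm32 and rv_rebuild_imm32 are hypothetical names invented here, not part of any toolchain.

```c
#include <stdint.h>

/* Hypothetical holder for the two immediates of:
 *     lui  rd, hi20
 *     addi rd, rd, lo12
 */
struct rv_imm32 {
    uint32_t hi20;  /* 20-bit field for LUI */
    int32_t  lo12;  /* signed 12-bit field for ADDI, in [-2048, 2047] */
};

/* Split a 32-bit constant into the LUI and ADDI immediates. */
static struct rv_imm32 rv_split_imm32(uint32_t value)
{
    struct rv_imm32 r;
    /* Round the upper part up when bit 11 is set, to cancel the
     * sign extension ADDI applies to its immediate. */
    r.hi20 = ((value + 0x800u) >> 12) & 0xFFFFFu;
    r.lo12 = (int32_t)(value & 0xFFFu);
    if (r.lo12 >= 0x800)
        r.lo12 -= 0x1000;
    return r;
}

/* Recombine the way the two instructions would, modulo 2^32. */
static uint32_t rv_rebuild_imm32(struct rv_imm32 r)
{
    return (r.hi20 << 12) + (uint32_t)r.lo12;
}
```

For example, 0x12345FFF splits into hi20 = 0x12346 and lo12 = -1. That is two instructions and a dependency for one 32-bit constant; a full 64-bit constant needs a longer sequence or a load from memory.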
help enough with performance (relative to cost)
+40%
to be worth keeping (though, mostly because the alternatives are not so good in terms of performance).
Damage to pipeline ability less than -5%.