On 11/13/2024 4:20 AM, Anton Ertl wrote:
Terje Mathisen <terje.mathisen@tmsw.no> writes:
To me brilliant is something that still isn't obvious after learning
about it.
Why do you think it's less brilliant to recognize something obvious
that everybody else has overlooked?
Yeah.
This is how I feel about some of the deficiencies of standard RISC-V.
The stuff I want added, and have added as experiments, is not exactly non-obvious. Saw at least one person doubting that it would make much difference (namely in the use of "make the immediate bigger" prefixes).
But, experimentally, it does make enough of a difference that it should be worth considering, at least for performance-oriented use-cases (likely, not really needed for microcontrollers, where the priority is more "cheap CPU" rather than "fast CPU").
But, as I see it, if you can make binaries 40% smaller, and 35% faster, this is something that should be worth considering.
As opposed to the C extension which IME seems to only give around a 25-30% size reduction, and (with a CPU design that only does superscalar on properly aligned 32-bit instructions) actually makes performance slightly worse.
Granted, having both jumbo prefixes and the 'C' extension is likely the best case for code density (though BGBCC doesn't yet support the 'C' extension, so I can't test this).
I am half tempted to move the RV jumbo prefixes from
...-100-kkkkk-00-11011 (ALUIW block)
To:
...-100-kkkkk-00-00111 (JALR block)
For "technical reasons" (well, would also clean up the encoding conflict with an older/dropped "ADDIWU" instruction). TBD if worth the break in compatibility though (if I did so, might consider also claiming 1xx for jumbo prefixes, say, to give an extra bit so that "JIMM+JIMM+LUI" could have enough bits to encode F0..F31 as well, but there are other possibilities for how to encode this).
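As a rough sketch of what matching the current (ALUIW block) pattern amounts to: the bit positions below come purely from counting the fields in "...-100-kkkkk-00-11011" from the right (7 low bits, then 5 bits of kkkkk, then the 3-bit "100"). The function and constant names are made up for illustration, and the payload bits written as "..." are deliberately left uninterpreted.

```python
# Sketch only: matches the "...-100-kkkkk-00-11011" pattern as written above.
# Names are hypothetical; the upper "..." bits are not decoded here.

JUMBO_LOW7 = 0b0011011   # the "00-11011" field, bits 6..0
JUMBO_SEL3 = 0b100       # the "100" field, bits 14..12


def match_jumbo_prefix(word):
    """Return the 5-bit kkkkk field (bits 11..7) if the 32-bit word
    matches the pattern, otherwise None."""
    low7 = word & 0x7F
    sel3 = (word >> 12) & 0x7
    if low7 == JUMBO_LOW7 and sel3 == JUMBO_SEL3:
        return (word >> 7) & 0x1F
    return None


def make_pattern_word(kkkkk, upper_bits=0):
    """Build a word with the given kkkkk field; upper_bits fills the
    unspecified "..." part (bits 31..15) and is passed through as-is."""
    return ((upper_bits & 0x1FFFF) << 15) | (JUMBO_SEL3 << 12) \
        | ((kkkkk & 0x1F) << 7) | JUMBO_LOW7
```

For example, `match_jumbo_prefix(make_pattern_word(0b10101))` gives back 21, while a word with "000" in bits 14..12 of the same block does not match.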
Most of these features have historical precedent as well, so should in theory be "safe" (similar sorts of prefixes existed in Transputer and Java VM).
Granted, I have not found examples thus far in 1980s or 1990s RISC architectures (these sorts of prefixes didn't really seem to start appearing in RISCs until the early 2000s). Annoyingly, most precedent for the use of prefixes and prefix instructions seems to be in terms of CISC architectures.
The closest direct equivalent of the Jumbo_Imm prefix I am aware of didn't appear until MicroBlaze, which is cutting it a little close (and I have yet to verify if it existed in the original version of MicroBlaze). In any case, it will probably be safer in a few years (as MicroBlaze moves further outside of the 20 year window).
Register-Indexed Load/Store and similar were fairly widespread (80386, ARM32, and others), so should be safe.
Can note that also, in BJX2, the general ideas behind WEX encoding also had precedent (was in use in 1990s DSP architectures and similar), ...
Sometimes, there is an elegance in finding things sufficiently obvious that it is more a question why it is not more widespread.
Or, avoiding things that require a non-trivial leap in logic, or pose difficulty in verifying the logic chains.
Though, in terms of precedent, something like RISC-V is arguably fairly safe:
Its core ISA lacks anything that didn't already have precedent by the early 1980s.
But, as I see it, pretty much anything that has precedent earlier than ~2004 should be safe (which, as I see it, should include things like jumbo prefixes, etc).
...
There are, granted, potential gotchas, like the years of hassle that S3TC and depth-fail shadows and similar caused.
Where, S3TC should have been invalid, as it wasn't substantially different from what was already in common use in the 1980s.
Seemingly, the main arguably "novel" feature it had was defining the interpolated colors as 1/3 + 2/3 rather than 1-bit (A or B), or 3/8 + 5/8 (as in some earlier Apple image formats).
There was the "S2TC" workaround (just disallow interpolation entirely); theoretically though, someone could have just used DXT1/DXT5 mostly as is, but then redefined the interpolation as 3/8 + 5/8 as "close enough"...
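To put a number on "close enough", here is a small sketch comparing the two weightings on a single 8-bit channel. The truncating integer division and the function names are assumptions made for illustration; this is not taken from any particular decoder.

```python
# Sketch only: compares 2/3 + 1/3 weighting against 5/8 + 3/8 weighting
# for one 8-bit color channel. Rounding (truncation) is an assumption.

def interp_thirds(a, b):
    """Blend as 2/3 of a plus 1/3 of b."""
    return (2 * a + b) // 3


def interp_eighths(a, b):
    """Blend as 5/8 of a plus 3/8 of b."""
    return (5 * a + 3 * b) // 8


def max_channel_difference():
    """Largest absolute difference between the two blends over all
    pairs of 8-bit endpoint values."""
    return max(
        abs(interp_thirds(a, b) - interp_eighths(a, b))
        for a in range(256)
        for b in range(256)
    )
```

The exact difference between the two blends is (a - b)/24, so the worst case is the full-range pair: `interp_thirds(255, 0)` is 170 and `interp_eighths(255, 0)` is 159, a gap of 11 out of 255.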
Similarly, the depth-fail issue was also annoying. There was still depth-pass, but this had some annoying edge cases (the shadows would break if the camera was inside a shadow volume, requiring a workaround).
Depth-fail shadows should also be safe now.
...
Well, and people can freely use FAT32, or (in theory) NTFS, though with some limits (newer features may still not be safe). The design of NTFS itself is a bigger impediment to using it, though.
A person should also be able to do their own off-brand implementations of x86-64 (*) and 32-bit ARM and Thumb/Thumb2.
*: The original form of x86-64 should be safe, would mostly need to omit newer forms of SSE, and AVX, to be safe.
...
May not be obvious, but admittedly, I am more someone that tries to avoid "novelty" (often things like cost/benefit concerns and historical precedent are given more weight).
- anton