On 2/14/2025 3:52 PM, MitchAlsup1 wrote:
Strategy for low end processors::
It would take LESS total man-power world-wide and over-time to
simply make HW perform misaligned accesses.
>
I think the usual issue is that on low-end hardware, it is seen as
"better" to skip out on misaligned access in order to save some cost in
the L1 cache.
>
Though, I am not sure how this mixes with 16/32 ISAs: if one allows
misaligned 32-bit instructions, and allows a misaligned 32-bit
instruction to cross a cache-line boundary, one still has to deal with
essentially the same issues.
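As a rough illustration of what "dealing with it" means, here is a minimal C sketch of the two pieces: detecting that an access spans two cache lines, and assembling a misaligned 32-bit value from two aligned 32-bit reads. The 64-byte line size, the little-endian word model of memory, and the names `crosses_line` and `load32_misaligned` are assumptions for illustration, not anything from the thread or from a particular core.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_BYTES 64u  /* assumed cache-line size (hypothetical) */

/* Returns 1 if an access of `size` bytes at `addr` touches two cache lines. */
static int crosses_line(uint64_t addr, unsigned size)
{
    assert(size >= 1 && size <= LINE_BYTES);
    return (addr / LINE_BYTES) != ((addr + size - 1) / LINE_BYTES);
}

/* Model memory as an array of aligned 32-bit words, where byte address a
 * lives in bits [8*(a%4) +: 8] of mem_words[a/4] (little-endian layout).
 * A misaligned load reads the two neighbouring aligned words, then
 * shifts and merges them. */
static uint32_t load32_misaligned(const uint32_t *mem_words, uint64_t addr)
{
    uint64_t idx   = addr / 4;
    unsigned shift = (unsigned)(addr % 4) * 8;
    uint32_t lo    = mem_words[idx];

    if (shift == 0)
        return lo;                      /* aligned: a single read suffices */

    uint32_t hi = mem_words[idx + 1];   /* second aligned read */
    return (lo >> shift) | (hi << (32 - shift));
}
```

The same shift-and-merge step applies to a 32-bit instruction fetch that straddles a line; the extra cost is that the second word may sit in a different cache line and need its own lookup.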
Another related thing I can note is internal store-forwarding within the
L1 D$ to avoid RAW and WAW penalties for multiple accesses to the same
cache line.
IMHO:: Low end processors should not be doing ST->LD forwarding.
>
These still look like LDs to me.
Say, it is less convoluted to do:
MOV.X R24, (SP, 0)
MOV.X R26, (SP, 16)
MOV.X R28, (SP, 32)
MOV.X R30, (SP, 48)
Then again, I have heard that apparently there are libraries that rely
on the global-rounding-mode behavior, but I have also heard of such
libraries having issues or non-determinism when mixed with other
libraries which try to set a custom rounding mode when these modes
disagree.
It takes Round Nearest Odd to perform Kahan-Babuška Summation.
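For readers who have not met it, here is a minimal C sketch of compensated summation in the Neumaier variant of Kahan-Babuška. It runs under whatever rounding mode happens to be active and does not attempt to demonstrate the rounding-mode requirement claimed above; the names `abs_d` and `kbn_sum` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Local absolute value, so the sketch does not depend on libm. */
static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* Neumaier's variant of Kahan-Babuska compensated summation:
 * keep a running sum plus a separate compensation term that
 * accumulates the rounding error of each addition. */
static double kbn_sum(const double *x, size_t n)
{
    double sum  = 0.0;
    double comp = 0.0;

    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        double t = sum + x[i];
        if (abs_d(sum) >= abs_d(x[i]))
            comp += (sum - t) + x[i];   /* low-order bits of x[i] were lost */
        else
            comp += (x[i] - t) + sum;   /* low-order bits of sum were lost */
        sum = t;
    }
    return sum + comp;                  /* apply the correction once at the end */
}
```

With IEEE double arithmetic and no fast-math style reassociation, the input {1.0, 1e100, 1.0, -1e100} sums to 0.0 with a plain left-to-right loop and to 2.0 with `kbn_sum`. The correction terms are themselves rounded, which is why the rounding behavior of each add matters to this kind of code.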
>
>
I prefer my strategy instead:
FADD/FSUB/FMUL:
Hard-wired Round-Nearest / RNE.
Does not modify FPU flags.
FADDG/FSUBG/FMULG:
Dynamic Rounding;
May modify FPU flags.
Oh what fun, another RISC-V encoding mistake...
>
Can note that RISC-V burns 3 bits in FPU instructions to always encode a
rounding mode (whereas in my ISA, encoding a rounding mode other than
RNE or DYN requires a 64-bit encoding).
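To make the 3 bits concrete, here is a small C sketch of where the rm field sits in a 32-bit RISC-V OP-FP instruction (bits 14:12), using FADD.S as the example. The field values follow the RISC-V F extension; the helper names `rv_rm_field` and `rv_encode_fadd_s` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding-mode field values defined by the RISC-V F/D extensions.
 * Values 5 and 6 are reserved. */
enum rv_rm {
    RV_RNE = 0,  /* round to nearest, ties to even          */
    RV_RTZ = 1,  /* round towards zero                      */
    RV_RDN = 2,  /* round down                              */
    RV_RUP = 3,  /* round up                                */
    RV_RMM = 4,  /* round to nearest, ties to max magnitude */
    RV_DYN = 7   /* use the dynamic mode held in frm        */
};

/* Extract the 3-bit rm field (bits 14:12) from an OP-FP instruction word. */
static unsigned rv_rm_field(uint32_t insn)
{
    return (insn >> 12) & 0x7u;
}

/* Build FADD.S rd, rs1, rs2 with the given rounding mode. */
static uint32_t rv_encode_fadd_s(unsigned rd, unsigned rs1,
                                 unsigned rs2, unsigned rm)
{
    assert(rd < 32 && rs1 < 32 && rs2 < 32 && rm < 8);
    return ((uint32_t)0x00u << 25)   /* funct7 = 0000000 selects FADD.S */
         | ((uint32_t)rs2  << 20)
         | ((uint32_t)rs1  << 15)
         | ((uint32_t)rm   << 12)    /* the 3 bits in question          */
         | ((uint32_t)rd   << 7)
         | 0x53u;                    /* OP-FP major opcode              */
}
```

For example, `rv_encode_fadd_s(0, 1, 2, RV_DYN)` yields 0x0020F053. Those 3 bits are present in every such instruction, whether or not the program ever uses anything other than RNE or DYN.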