Re: "Mini" tags to reduce the number of op codes

Subject : Re: "Mini" tags to reduce the number of op codes
From : cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups : comp.arch
Date : 12 Apr 2024, 02:07:08
Organization : A noiseless patient Spider
Message-ID : <uva1fu$2010o$1@dont-email.me>
User-Agent : Mozilla Thunderbird
On 4/11/2024 6:06 PM, MitchAlsup1 wrote:
BGB-Alt wrote:
 
On 4/11/2024 1:46 PM, MitchAlsup1 wrote:
BGB wrote:
>
>
Win-win under constraints of Load-Store Arch. Otherwise, it depends.
>
Never seen a LD-OP architecture where the inbound memory can be in the Rs1 position of the instruction.
>
>
>
FWIW:
The LDSH / SHORI mechanism does provide a way to get 64-bit constants, and needs less encoding space than the LUI route.
>
   MOV Imm16, Rn
   SHORI Imm16, Rn
   SHORI Imm16, Rn
   SHORI Imm16, Rn
>
Granted, if each is a 1-cycle instruction, this still takes 4 clock cycles.
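
For anyone following along, the sequence above can be modelled in a few lines of Python. This is only a sketch: it assumes SHORI computes Rn = (Rn << 16) | Imm16 (consistent with the "Xn=(Xn<<32)|Imm32" fusion described later in this post), and it assumes MOV zero-extends its Imm16, which does not matter here because three shifts push those bits to the top of the register. The helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1

def mov_imm16(imm16):
    # MOV Imm16, Rn (zero-extension is an assumption of this sketch)
    return imm16 & 0xFFFF

def shori(rn, imm16):
    # SHORI Imm16, Rn: Rn = (Rn << 16) | Imm16, truncated to 64 bits
    return ((rn << 16) | (imm16 & 0xFFFF)) & MASK64

def build_const64(value):
    # Split the constant into four 16-bit chunks, most significant first,
    # then issue MOV followed by three SHORIs.
    chunks = [(value >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]
    rn = mov_imm16(chunks[0])
    for imm16 in chunks[1:]:
        rn = shori(rn, imm16)
    return rn

result = build_const64(0x0123456789ABCDEF)
```

Four 32-bit instructions come to 16 bytes, and each SHORI depends on the previous result, which is where the 4-cycle figure comes from.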
>
As compared to::
>
     CALK   Rd,Rs1,#imm64
>
Which takes 3 words (12 bytes) and executes in CALK cycles, the loading
of the constant is free !! (0 cycles) !! {{The above example uses at least
5 cycles to use the loaded/built constant.}}
>
 
The main reason one might want SHORI is that it can fit into a fixed-length 32-bit encoding.
 While 32-bit encoding is RISC mantra, it has NOT been shown to be best,
just simplest. Then, once you start widening the microarchitecture, it
is better to fetch wider than decode-issue so that you suffer least from boundary conditions. Once you start fetching wide OR have wide
decode-issue, you have ALL the infrastructure to do variable length
instructions. Thus, complaining that VLE is hard has already been
eradicated.
 
As noted, BJX2 is effectively VLE.
   Just now split into two sub-variants.
So, as for lengths:
   Baseline: 16/32/64/96
   XG2: 32/64/96
Original version was 16/32/48.
But, the original 48-bit encoding was dropped, mostly to make the rest of the encoding more orthogonal, and these were replaced with Jumbo prefixes. An encoding space exists where 48-bit ops could in theory be re-added to Baseline, but have not done so as it does not seem to be justifiable in a cost/benefit sense (and would still have some of the same drawbacks as the original 48-bit ops).
Had also briefly experimented with 24-bit ops, but these were quickly dropped due to "general suckage" (though, an alternate 16/24/32/48 encoding scheme could have theoretically given better code-density).
However, RISC-V is either 32-bit, or 16/32.
For now, I am not bothering with the 16-bit C extension, not so much for the sake of difficulty of dealing with VLE (the core can already deal with VLE), but more because the 'C' encodings are such a dog-chewed mess that I don't feel terribly inclined to bother with them.
But, like, I can't really compare BJX2 Baseline with RV64G in terms of code density, because this wouldn't be a fair comparison. Would need to compare code-density between Baseline and RV64GC, which would imply needing to actually support the C extension.
I could already claim a "win" here if I wanted, but as I see it, doing so would not be valid.
Theoretically, encoding space exists for bigger ops in RISC-V, but no one has defined ops there yet as far as I know. Also, the way RISC-V represents larger ops is very different.
However, comparing fixed-length against VLE when the VLE only has larger instructions, is still acceptable as I see it (even if larger instructions can still allow a more compact encoding in some cases).
Say, for example, as I see it, SuperH vs Thumb2 would still be a fair comparison, as would Thumb2 vs RV32GC, but Thumb2 vs RV32G would not.
Unless one only cares about "absolute code density" irrespective of keeping parity in terms of feature-set.

Also technically could be retrofitted onto RISC-V without any significant change, unlike some other options (as noted, I don't argue for adding Jumbo prefixes to RV on the basis that there is no real viable way to add them to RV, *).
 The issue is that once you do VLE, RISC-V's ISA is no longer helping you
get the job done, especially when you have to execute 40% more instructions.
 
Yeah.
As noted, I had already been beating RISC-V in terms of performance, only there was a shortfall in terms of ".text" size (for the XG2 variant).
Initially this was around a 16% delta, now down to around 5%. Nearly all of the size reduction thus far has been due to fiddling with stuff in my compiler.
In theory, BJX2 (XG2) should be able to win in terms of code-density, as the only cases where RISC-V has an advantage do not appear to be statistically significant.
As also noted, I am using "-ffunction-sections" and similar (to allow GCC to prune unreachable functions), otherwise there is "no contest" (easier to win against 540K than 290K...).

Sadly, the closest option to viable for RV would be to add the SHORI instruction and optionally pattern match it in the fetch/decode.
 
Or, say:
   LUI Xn, Imm20
   ADD Xn, Xn, Imm12
   SHORI Xn, Imm16
   SHORI Xn, Imm16
 
Then, combine LUI+ADD into a 32-bit load in the decoder (though probably only if the Imm12 is positive), and 2x SHORI into a combined "Xn=(Xn<<32)|Imm32" operation.
 
This could potentially get it down to 2 clock cycles.
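
A rough Python model of that fusion idea, as a sketch only. It assumes the standard RV64 behaviour for LUI (Imm20 into bits 31:12, sign-extended from bit 31) and treats the "ADD Xn, Xn, Imm12" above as RISC-V's ADDI with a sign-extended Imm12. SHORI is the proposed instruction, modelled as Xn = (Xn << 16) | Imm16. All function names are made up.

```python
MASK64 = (1 << 64) - 1

def sext(value, bits):
    # Sign-extend the low `bits` bits of value to a Python int.
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def lui(imm20):
    # RV64 LUI: Imm20 lands in bits 31:12, result sign-extended to 64 bits.
    return sext((imm20 & 0xFFFFF) << 12, 32) & MASK64

def add_imm(x, imm12):
    # RISC-V ADDI: add a sign-extended 12-bit immediate.
    return (x + sext(imm12, 12)) & MASK64

def shori(x, imm16):
    # Proposed SHORI: Xn = (Xn << 16) | Imm16.
    return ((x << 16) | (imm16 & 0xFFFF)) & MASK64

def unfused(imm20, imm12, hi16, lo16):
    # The four instructions executed one at a time.
    x = lui(imm20)
    x = add_imm(x, imm12)
    x = shori(x, hi16)
    return shori(x, lo16)

def fused(imm20, imm12, hi16, lo16):
    # Two fused operations. Only valid when Imm12 is non-negative,
    # since then LUI+ADD is a plain bit concatenation.
    assert 0 <= imm12 < 0x800
    x = ((imm20 & 0xFFFFF) << 12) | imm12        # fused 32-bit load
    imm32 = ((hi16 & 0xFFFF) << 16) | (lo16 & 0xFFFF)
    return ((x << 32) | imm32) & MASK64          # Xn = (Xn << 32) | Imm32
```

With a negative Imm12 the ADD borrows from the LUI half, so the decoder would need a real adder rather than just wiring bits together, which is presumably why one would restrict the fused case to positive Imm12. The LUI sign extension is harmless here since the shift by 32 pushes it out of the register.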
 Universal constants gets this down to 0 cycles......
 
Possibly.

*: To add a jumbo prefix, one needs an encoding that:
   Uses up a really big chunk of encoding space;
   Is otherwise illegal and unused.
RISC-V doesn't have anything here.
 Which is WHY you should not jump ship from SH to RV, but jump to an
ISA without these problems.
 
Of the options that were available at the time:
   SuperH: Simple encoding and decent code density;
   RISC-V: Seemed like it would have had worse code density.
     Though, it seems that RV beats SH in this area.
   Thumb: Uglier encoding and some more awkward limitations vs SH.
     Also, condition codes, etc.
   Thumb2: Was still patent-encumbered at the time.
   PowerPC: Bleh.
   ...
The main reason for RISC-V support is not due to "betterness", but rather because RISC-V is at least semi-popular (and not as bad as I initially thought, in retrospect).

Ironically, in XG2 mode, I still have 28x 24-bit chunks of encoding space that aren't yet used for anything, but aren't usable as normal encoding space mostly because if I put instructions in there (with the existing encoding schemes), I couldn't use all the registers (and they would not have predication or similar either). Annoyingly, the only types of encodings that would fit in there at present are 2RI Imm16 ops or similar (or maybe 3R 128-bit SIMD ops, where these ops only use encodings for R0..R31 anyways, interpreting the LSB of the register field as encoding R32..R63).
 Just another reason not to stay with what you have developed.
 In comparison, I reserve 6-major OpCodes so that a control transfer into
data is highly likely to get Undefined OpCode exceptions rather than a
try to execute what is in that data. Then, as it is, I still have 21-slots
in the major OpCode group free (27 if you count the permanently reserved).
 Much of this comes from side effects of Universal Constants.
 
An encoding that can MOV a 64-bit constant in 96-bits (12 bytes) and 1-cycle, is preferable....
>
A consuming instruction where you don't even use a register is better
still !!
 
Can be done, but thus far only with 33-bit immediate values. Luckily, Imm33s seems to address around 99% of uses (for normal ALU ops and similar).
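
As a quick illustration of why a signed 33-bit immediate goes a long way: it covers every value that fits in either a signed or an unsigned 32-bit integer. A minimal range check (the function name is made up):

```python
def fits_imm33s(value):
    # A signed 33-bit field holds -2**32 .. 2**32 - 1.
    return -(1 << 32) <= value < (1 << 32)

checks = {
    "uint32 max": fits_imm33s(0xFFFFFFFF),
    "int32 min": fits_imm33s(-0x80000000),
    "2**32": fits_imm33s(1 << 32),
}
```

Anything outside that range (full 64-bit masks, most double-precision FP bit patterns, far addresses) still needs a separate constant load.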
 What do you do when accessing data that the linker knows is more than 4GB away from IP ?? or known to be outside of 0-4GB ?? externs, GOT, PLT, ...
 
Had considered allowing an Imm57s case for SIMD immediates (4x S.E5.F8 or 2x S.E8.F19), which would have indirectly allowed the Imm57s case. By themselves though, the difference doesn't seem enough to justify the cost.
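
The bit budget works out as 4 x (1+5+8) = 56 bits or 2 x (1+8+19) = 56 bits, so either form fits in an Imm57s. Below is a sketch of the 4x S.E5.F8 packing in Python. It assumes S.E5.F8 is just IEEE binary16 with the low 2 fraction bits truncated, and that element 0 sits in the low bits of the immediate; neither detail is stated above, so treat both as guesses. The function names are made up.

```python
import struct

def half_bits(x):
    # Raw 16-bit pattern of x as an IEEE binary16 value.
    return int.from_bytes(struct.pack("<e", x), "little")

def to_se5f8(x):
    # Assumed S.E5.F8: binary16 with the low 2 fraction bits dropped (14 bits).
    return half_bits(x) >> 2

def pack4_se5f8(values):
    # Pack four 14-bit elements into one 56-bit immediate, element 0 lowest.
    assert len(values) == 4
    word = 0
    for i, v in enumerate(values):
        word |= to_se5f8(v) << (14 * i)
    return word

def unpack4_se5f8(word):
    # Expand each 14-bit element back to binary16 by zero-filling the fraction.
    out = []
    for i in range(4):
        bits = ((word >> (14 * i)) & 0x3FFF) << 2
        out.append(struct.unpack("<e", bits.to_bytes(2, "little"))[0])
    return out

imm = pack4_se5f8([1.0, -2.0, 0.5, 1.5])
```

The 2x S.E8.F19 form would be the same idea applied to binary32 with the low 4 fraction bits dropped.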
 While I admit that <basically> anything bigger than 50-bits will be fine
as displacements, they are not fine for constants and especially FP
constants and many bit twiddling constants.
 
The number of cases where this comes up is not statistically significant enough to have a meaningful impact on performance.
Fraction of a percent edge-cases are not deal-breakers, as I see it.
I do at least have some confidence that my stuff can be made usable on affordable FPGAs.
Some of the stuff you argue for, I don't feel is viable on this class of hardware.
Like, the challenge would be to, say, make a soft-processor and fit all of the stuff you are arguing for into an XC7S50 or similar (say, on one of the Arty boards or something).
Or, some other sub $400 or so FPGA board (that can be targeted with the free version of Vivado or similar...).
Something like a Lattice ECP5 is probably OK.
Though, Cyclone-V or Zynq is probably not, too much room for "cheating" there by leveraging the ARM cores...

Don't have enough bits in the encoding scheme to pull off a 3RI Imm64 in 12 bytes (and allowing a 16-byte encoding would have too steep of a cost increase to be worthwhile).
 And yet I did.
 
I am not saying it is impossible, only that I can't pull it off with my existing encoding.
I guess it could be possible if I burnt all of the remaining encoding bits on it (effectively 27-bit jumbo prefixes, + the WI bit in the final instruction).
This would preclude using these bits for anything else though.
Debatable if it is "worth it".

So, alas...
 Yes, alas..........
