Newsportal USENET - Re: Split instruction and immediate stream

On 3/8/2025 11:53 AM, MitchAlsup1 wrote:

On Sat, 8 Mar 2025 14:21:51 +0000, Thomas Koenig wrote:

There was a recent post to the gcc mailing list which showed an
interesting concept for dealing with large constants in an ISA:
splitting the instruction and constant streams. It can be found
at https://github.com/michaeljclark/glyph/ , and is named "glyph".
I knew a guy with that name at AMD--he did microcode--and did it well.

This was also posted to the RISC-V mailing list...

I think the problem the author is trying to solve is better addressed by
My 66000 (and I would absolutely _hate_ to write an assembler for it).
Still, I thought it worth mentioning.
I took a quick look, and it seems that
a) too few registers
b) too many OpCode bits
although it does look easy to parse.

Yeah, a bit of a rebalance is needed...
The design goes to 12 bit register fields for 64-bit ops, which is just absurd, and doesn't really leave enough bits for immediate encodings in the instruction formats.
If I were to do a vaguely similar design, probably:
   Bit 0 of each 16-bit word indicates a following word is present;
   16-bit ops have 2R with 16 registers;
   32-bit ops have 3R with 64 registers.
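A rough sketch of how the length rule above could work in a decoder: walk the 16-bit words and stop at the first one with bit 0 clear. The function names and the 4-word cap (taken from the largest format discussed in this post) are assumptions for illustration, not part of any existing design.

```python
def instruction_length_words(words, start=0, max_words=4):
    """Count the 16-bit words of the instruction starting at words[start].

    Bit 0 set   -> another word follows.
    Bit 0 clear -> this is the last word of the instruction.
    max_words=4 caps an instruction at 64 bits (assumed limit).
    """
    n = 0
    while True:
        if start + n >= len(words):
            raise ValueError("truncated instruction")
        w = words[start + n]
        n += 1
        if (w & 1) == 0:
            return n
        if n == max_words:
            raise ValueError("continuation bit set on the last allowed word")


def split_stream(words):
    """Split a flat list of 16-bit words into per-instruction lists."""
    out = []
    pos = 0
    while pos < len(words):
        n = instruction_length_words(words, pos)
        out.append(words[pos:pos + n])
        pos += n
    return out


if __name__ == "__main__":
    # One 16-bit, one 32-bit, one 48-bit and one 64-bit instruction.
    stream = [0x0000,
              0x0001, 0x0000,
              0x0001, 0x0001, 0x0002,
              0x0001, 0x0001, 0x0001, 0x0000]
    print([len(i) for i in split_stream(stream)])
```

Note that the length of an instruction is only known after looking at up to three continuation bits, one per word, which is part of why wider decode gets harder with this kind of scheme.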
Say, 16b:
   zzzz-mmmm-nnnn-zzz0 //2R
   zzzz-iiii-nnnn-zzz0 //2RI, Imm4
   zzzz-iiii-iiii-zzz0 //Imm8 (Branch, AddSP)
Then, 32b:
   mmnn-mmmm-nnnn-zzz1 zzzz-tttt-ttzz-zzz0 //3R
   mmnn-mmmm-nnnn-zzz1 zzzz-iiii-iizz-zzz0 //3RI, Imm6
   mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz0 //3RI, Imm10
   iinn-iiii-nnnn-zzz1 iiii-iiii-iizz-zzz0 //2RI, Imm16
   iiii-iiii-iiii-zzz1 iiii-iiii-iizz-zzz0 //Imm22 (Branch)
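To make the 3R layout concrete, here is a minimal sketch of packing and unpacking the three 6-bit register fields. It assumes the leftmost character of each pattern is bit 15, that the first listed word comes first in the stream, and that the leading "mm"/"nn" pairs are the high two bits of each register number (so the low four bits line up with the 16-bit forms). None of this is stated outright in the post, and the opcode ("z") bits are simply left at zero.

```python
def encode_3r(rm, rn, rt):
    """Pack Rm, Rn, Rt (0..63) into two 16-bit words; opcode bits left zero.

    word 0: mmnn-mmmm-nnnn-zzz1
    word 1: zzzz-tttt-ttzz-zzz0
    """
    for r in (rm, rn, rt):
        if not 0 <= r <= 63:
            raise ValueError("register number out of range")
    w0 = ((rm >> 4) << 14) | ((rn >> 4) << 12) \
        | ((rm & 0xF) << 8) | ((rn & 0xF) << 4) | 1   # bit 0 = 1: word follows
    w1 = rt << 6                                      # bit 0 = 0: last word
    return w0, w1


def decode_3r(w0, w1):
    """Recover (Rm, Rn, Rt) from the two words of a 32-bit 3R instruction."""
    if (w0 & 1) != 1 or (w1 & 1) != 0:
        raise ValueError("continuation bits do not describe a 32-bit op")
    rm = (((w0 >> 14) & 0x3) << 4) | ((w0 >> 8) & 0xF)
    rn = (((w0 >> 12) & 0x3) << 4) | ((w0 >> 4) & 0xF)
    rt = (w1 >> 6) & 0x3F
    return rm, rn, rt


if __name__ == "__main__":
    w0, w1 = encode_3r(45, 18, 63)
    print(hex(w0), hex(w1), decode_3r(w0, w1))
```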
Could have 48 and 64 bit encodings, which keep the same base layout as the 32-bit ops, but maybe extend immediate and opcode bits.
Say, 48-bit:
   mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz1
   iiii-iiii-iiii-iiz0 //3RI, Imm24
And, 64-bit:
   mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz1
   iiii-iiii-iiii-iiz1 zzzz-iiii-iiii-izz0 //3RI, Imm33
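As a sanity check on the immediate widths quoted in the comments, the patterns can be counted mechanically. This is only a small helper over the pattern strings as written above; the dictionary keys are labels made up for the example.

```python
from collections import Counter

# Bit patterns copied from the formats above (32-, 48- and 64-bit).
FORMATS = {
    "3R":        "mmnn-mmmm-nnnn-zzz1 zzzz-tttt-ttzz-zzz0",
    "3RI-Imm6":  "mmnn-mmmm-nnnn-zzz1 zzzz-iiii-iizz-zzz0",
    "3RI-Imm10": "mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz0",
    "2RI-Imm16": "iinn-iiii-nnnn-zzz1 iiii-iiii-iizz-zzz0",
    "Imm22":     "iiii-iiii-iiii-zzz1 iiii-iiii-iizz-zzz0",
    "3RI-Imm24": "mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz1 "
                 "iiii-iiii-iiii-iiz0",
    "3RI-Imm33": "mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz1 "
                 "iiii-iiii-iiii-iiz1 zzzz-iiii-iiii-izz0",
}


def field_widths(pattern):
    """Return {field letter: bit count} for one pattern string.

    Hyphens and spaces are separators; '0' and '1' are fixed bits and
    are checked for total length but not reported.
    """
    bits = [c for c in pattern if c.isalnum()]
    if len(bits) % 16 != 0:
        raise ValueError("pattern is not a whole number of 16-bit words")
    return dict(Counter(c for c in bits if c.isalpha()))


if __name__ == "__main__":
    for name, pattern in FORMATS.items():
        print(name, field_widths(pattern))
```

The "i" counts come out as 6, 10, 16, 22, 24 and 33, matching the comments, and the "m", "n" and "t" fields are 6 bits each, as needed for 64 registers.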
For register space, might make sense to map the 16-bit ops to R16..R31, but then organize the registers such that it has access to both callee-save and argument registers.
Say:
   R0 ..R3   ZR, LR, SP, GP
   R4 ..R15 Callee Save (12)
   R16..R23 Callee Save ( 8)
   R24..R27 Scratch    ( 4)
   R28..R31 Args 0..3   ( 4)
   R32..R43 Args 4..15 (12)
   R44..R51 Scratch    ( 8)
   R52..R63 Callee Save (12)
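A small sketch of what the 16-bit ops would see under this map, assuming a 4-bit register field f simply selects R(16+f). The class names and helper functions are invented for the example, and the per-class counts are computed from the ranges rather than taken from the numbers in parentheses.

```python
from collections import Counter

# (first, last, class) taken from the register ranges listed above.
REGISTER_CLASSES = [
    (0, 3, "special"),        # ZR, LR, SP, GP
    (4, 15, "callee-save"),
    (16, 23, "callee-save"),
    (24, 27, "scratch"),
    (28, 31, "arg"),          # Args 0..3
    (32, 43, "arg"),          # Args 4..15
    (44, 51, "scratch"),
    (52, 63, "callee-save"),
]


def reg_class(r):
    """Return the class of register number r (0..63)."""
    for first, last, name in REGISTER_CLASSES:
        if first <= r <= last:
            return name
    raise ValueError("register number out of range")


def reg_from_field16(field):
    """Map a 4-bit register field of a 16-bit op to R16..R31 (assumed)."""
    if not 0 <= field <= 15:
        raise ValueError("field must fit in 4 bits")
    return 16 + field


if __name__ == "__main__":
    window = Counter(reg_class(reg_from_field16(f)) for f in range(16))
    print(dict(window))   # what the 16-bit ops can reach
    print(dict(Counter(reg_class(r) for r in range(64))))
```

With this window the 16-bit ops reach 8 callee-save registers, 4 scratch registers and the first 4 argument registers, which is the mix the layout is aiming for.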
16b opcode map, possible:
   00tt-mmmm-nnnn-0000 //Store (B/W/L/Q), "MOV.x Rn, (Rm)"
   0100-iiii-nnnn-0000 MOV.Q Rn, (SP, Imm4*8)
   0101-iiii-nnnn-0000 MOV.X Xn, (SP, Imm4*8) //Pair
   0110-iiii-nnnn-0000 MOV.Q (SP, Imm4*8), Rn
   0111-iiii-nnnn-0000 MOV.X (SP, Imm4*8), Xn //Pair
   1ttt-mmmm-nnnn-0000 //Load (SB/SW/SL/Q, UB/UW/UL/X)
   0000-mmmm-nnnn-0010 ADD   Rm, Rn
   0001-mmmm-nnnn-0010 SUB   Rm, Rn
   0010-mmmm-nnnn-0010 ADDSL Rm, Rn
   0011-mmmm-nnnn-0010 SUBSL Rm, Rn
   0100-mmmm-nnnn-0010 -
   0101-mmmm-nnnn-0010 AND   Rm, Rn
   0110-mmmm-nnnn-0010 OR    Rm, Rn
   0111-mmmm-nnnn-0010 XOR   Rm, Rn
   ...
   0000-iiii-nnnn-0100 ADD   Imm4u, Rn
   0001-iiii-nnnn-0100 SUB   Imm4u, Rn
   0010-iiii-nnnn-0100 ADDSL Imm4u, Rn
   0011-iiii-nnnn-0100 SUBSL Imm4u, Rn
   0100-iiii-iiii-0100 ADD   Imm8u*8, SP
   0101-iiii-iiii-0100 SUB   Imm8u*8, SP
   0110-iiii-iiii-0100 BRA   Imm8u (+512B)
   0111-iiii-iiii-0100 BRA   Imm8n (-512B)
   ...
   00nn-iiii-nnnn-1010 ? MOV   Imm4u, Yn
   01nn-iiii-nnnn-1010 ? ADD   Imm4u, Yn
   10nn-iiii-nnnn-1010 ? MOV   Imm4n, Yn
   11nn-iiii-nnnn-1010 ? ADD   Imm4n, Yn
   mmnn-mmmm-nnnn-1100 ? MOV   Ym, Yn   //2R MOV
   mmnn-mmmm-nnnn-1110 ? ADD   Ym, Yn   //2R ADD
There are only a few ops which have access to the full GPR space, as this is very expensive for 16-bit ops, so best limited to only the most common cases.
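For illustration, a toy disassembler covering a few rows of the 16-bit map: the 2R ALU group (low nibble 0010), the Imm4u and SP-adjust rows (low nibble 0100), and the full-GPR MOV/ADD forms (low nibbles 1100 and 1110). It assumes the leftmost pattern character is bit 15, that 4-bit fields select R16..R31, and that the "Yn" forms glue their 6-bit register numbers together as high-two-bits plus low-four-bits. Everything not listed returns None; the function name and output syntax are made up here.

```python
ALU_2R = {0: "ADD", 1: "SUB", 2: "ADDSL", 3: "SUBSL",
          5: "AND", 6: "OR", 7: "XOR"}          # 0100 is unassigned
ALU_IMM4 = {0: "ADD", 1: "SUB", 2: "ADDSL", 3: "SUBSL"}


def disasm16(w):
    """Disassemble one 16-bit op, or return None if it is not covered."""
    if w & 1:
        raise ValueError("continuation bit set; not a 16-bit op")
    top = (w >> 12) & 0xF      # bits 15..12
    f_m = (w >> 8) & 0xF       # bits 11..8 (Rm or Imm4)
    f_n = (w >> 4) & 0xF       # bits 7..4  (Rn)
    low = w & 0xF              # bits 3..0

    if low == 0b0010 and top in ALU_2R:
        return f"{ALU_2R[top]} R{16 + f_m}, R{16 + f_n}"
    if low == 0b0100:
        if top in ALU_IMM4:
            return f"{ALU_IMM4[top]} {f_m}, R{16 + f_n}"
        if top in (4, 5):
            imm8 = (w >> 4) & 0xFF             # scaled by 8 bytes
            return f"{'ADD' if top == 4 else 'SUB'} {imm8 * 8}, SP"
    if low in (0b1100, 0b1110):
        # Full 6-bit register numbers: mmnn-mmmm-nnnn
        ym = (((w >> 14) & 0x3) << 4) | f_m
        yn = (((w >> 12) & 0x3) << 4) | f_n
        return f"{'MOV' if low == 0b1100 else 'ADD'} R{ym}, R{yn}"
    return None


if __name__ == "__main__":
    for word in (0x0122, 0x7F02, 0x0534, 0x5104, 0x9D2C, 0x4122):
        print(hex(word), disasm16(word))
```

The last two cases show the cost mentioned above: a 2R op over the full GPR space eats 12 of the 16 bits for register numbers alone, so only a couple of opcodes fit.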
...
The 32-bit opcode map, not laid out here, would likely be entirely disconnected from the 16-bit map.
The usual tradeoff applies, though: 16/32/48/64-bit encodings would make superscalar more difficult and more expensive than 32/64 alone.
But such a layout could potentially be good for code density, at least.
This is the best I could come up with on a quick-and-dirty pass, it seems...
Don't have much time right now, so will leave it at this.
...
