On 3/8/2025 11:53 AM, MitchAlsup1 wrote:
On Sat, 8 Mar 2025 14:21:51 +0000, Thomas Koenig wrote:
There was a recent post to the gcc mailing list which showed an
interesting concept for dealing with large constants in an ISA:
splitting the instruction and constant streams. It can be found
at https://github.com/michaeljclark/glyph/ , and is named "glyph".
I knew a guy with that name at AMD--he did microcode--and did it well.
This was also posted to the RISC-V mailing list...
I think the problem the author is trying to solve is better addressed by
My 66000 (and I would absolutely _hate_ to write an assembler for it).
Still, I thought it worth mentioning.
I took a quick look, and it seems that
a) too few registers
b) too many OpCode bits
although it does look easy to parse.
Yeah, a bit of a rebalance is needed...
The design goes to 12 bit register fields for 64-bit ops, which is just absurd, and doesn't really leave enough bits for immediate encodings in the instruction formats.
If I were to do a vaguely similar design, probably:
Bit 0 of each 16-bit word indicates a following word is present;
16-bit ops have 2R with 16 registers;
32-bit ops have 3R with 64 registers.
Say, 16b:
zzzz-mmmm-nnnn-zzz0 //2R
zzzz-iiii-nnnn-zzz0 //2RI, Imm4
zzzz-iiii-iiii-zzz0 //Imm8 (Branch, AddSP)
Then, 32b:
mmnn-mmmm-nnnn-zzz1 zzzz-tttt-ttzz-zzz0 //3R
mmnn-mmmm-nnnn-zzz1 zzzz-iiii-iizz-zzz0 //3RI, Imm6
mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz0 //3RI, Imm10
iinn-iiii-nnnn-zzz1 iiii-iiii-iizz-zzz0 //2RI, Imm16
iiii-iiii-iiii-zzz1 iiii-iiii-iizz-zzz0 //Imm22 (Branch)
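To make the 3R layout concrete, here is a minimal sketch of packing and unpacking the register fields. It assumes the diagrams are written MSB-first (leftmost character is bit 15), that the two-bit "mm"/"nn" groups are the high bits of the 6-bit register numbers, and that the first word is the one carrying the continuation bit. The function names and the way the opcode bits are returned are made up for illustration.

```python
def encode_3r(rm, rn, rt, op0=0, op1_hi=0, op1_lo=0):
    """Pack a 32-bit 3R op as two 16-bit words (hypothetical helper)."""
    # Word 0: mmnn-mmmm-nnnn-zzz1
    w0 = (((rm >> 4) & 0x3) << 14) | (((rn >> 4) & 0x3) << 12)
    w0 |= ((rm & 0xF) << 8) | ((rn & 0xF) << 4)
    w0 |= ((op0 & 0x7) << 1) | 1          # bit 0 = 1: another word follows
    # Word 1: zzzz-tttt-ttzz-zzz0
    w1 = ((op1_hi & 0xF) << 12) | ((rt & 0x3F) << 6)
    w1 |= (op1_lo & 0x1F) << 1            # bit 0 = 0: last word
    return w0, w1


def decode_3r(w0, w1):
    """Split a 32-bit 3R op back into register and opcode fields."""
    if (w0 & 1) != 1 or (w1 & 1) != 0:
        raise ValueError("not a 32-bit encoding")
    rm = (((w0 >> 14) & 0x3) << 4) | ((w0 >> 8) & 0xF)
    rn = (((w0 >> 12) & 0x3) << 4) | ((w0 >> 4) & 0xF)
    rt = (w1 >> 6) & 0x3F
    # Opcode bits are scattered: 3 in word 0, then 4 + 5 in word 1.
    op = ((w0 >> 1) & 0x7, (w1 >> 12) & 0xF, (w1 >> 1) & 0x1F)
    return {"rm": rm, "rn": rn, "rt": rt, "op": op}


w0, w1 = encode_3r(37, 58, 63)
fields = decode_3r(w0, w1)
```

That leaves 3 + 4 + 5 = 12 opcode bits for the 3R form under these assumptions.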
Could have 48 and 64 bit encodings, which keep the same base layout as the 32-bit ops, but maybe extend immediate and opcode bits.
Say, 48-bit:
mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz1
iiii-iiii-iiii-iiz0 //3RI, Imm24
And, 64-bit:
mmnn-mmmm-nnnn-zzz1 iiii-iiii-iizz-zzz1
iiii-iiii-iiii-iiz1 zzzz-iiii-iiii-izz0 //3RI, Imm33
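The length rule (bit 0 of each 16-bit word says whether another word follows) can be sketched as below. This is only an illustration: the 4-word cap reflects that 64-bit is the longest form listed here, and the function names are hypothetical.

```python
MAX_WORDS = 4  # assumption: 64-bit is the longest encoding


def insn_length(words, pos=0):
    """Return the number of 16-bit words in the instruction at words[pos]."""
    n = 0
    while True:
        if pos + n >= len(words):
            raise ValueError("truncated instruction")
        word = words[pos + n]
        n += 1
        if (word & 1) == 0:      # bit 0 clear: this was the last word
            return n
        if n == MAX_WORDS:
            raise ValueError("continuation bit set on the fourth word")


def split_stream(words):
    """Cut a list of 16-bit words into per-instruction slices."""
    out, pos = [], 0
    while pos < len(words):
        n = insn_length(words, pos)
        out.append(words[pos:pos + n])
        pos += n
    return out


stream = [0x0002, 0xB5A1, 0x0FC0, 0x0001, 0x0001, 0x0000]
lengths = [len(i) for i in split_stream(stream)]
```

Note that the start of each instruction is only known after walking the continuation bits of the previous one, so finding boundaries is inherently a serial chain.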
For register space, it might make sense to map the 16-bit ops to R16..R31, but then organize the registers so that the 16-bit ops have access to both callee-save and argument registers.
Say:
R0 ..R3 ZR, LR, SP, GP
R4 ..R15 Callee Save (12)
R16..R23 Callee Save ( 8)
R24..R27 Scratch ( 4)
R28..R31 Args 0..3 ( 4)
R32..R43 Args 4..15 (12)
R44..R51 Scratch ( 8)
R52..R63 Callee Save (12)
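As a quick check of what the 16-bit ops would see, here is a small sketch that encodes the register ranges above as a table and classifies the R16..R31 window. The class labels, the "16 + field" mapping, and the helper names are assumptions for illustration; the counts are computed from the ranges themselves.

```python
from collections import Counter

# (first, last, class), taken from the register ranges listed above.
REG_MAP = [
    (0, 3, "special"),        # ZR, LR, SP, GP
    (4, 15, "callee-save"),
    (16, 23, "callee-save"),
    (24, 27, "scratch"),
    (28, 31, "arg"),          # Args 0..3
    (32, 43, "arg"),          # Args 4..15
    (44, 51, "scratch"),
    (52, 63, "callee-save"),
]


def reg_class(r):
    """Look up the class of register number r (0..63)."""
    for lo, hi, cls in REG_MAP:
        if lo <= r <= hi:
            return cls
    raise ValueError(f"no such register: R{r}")


def reg16(field4):
    """Assumed mapping of a 4-bit register field in a 16-bit op onto R16..R31."""
    return 16 + (field4 & 0xF)


def arg_reg(i):
    """Register holding argument i; Args 0..15 are contiguous from R28."""
    if not 0 <= i <= 15:
        raise ValueError("only 16 argument registers")
    return 28 + i


window = Counter(reg_class(reg16(f)) for f in range(16))
```

With this mapping the 16-bit ops reach 8 callee-save, 4 scratch, and 4 argument registers.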
16b opcode map, possible:
00tt-mmmm-nnnn-0000 //Store (B/W/L/Q), "MOV.x Rn, (Rm)"
0100-iiii-nnnn-0000 MOV.Q Rn, (SP, Imm4*8)
0101-iiii-nnnn-0000 MOV.X Xn, (SP, Imm4*8) //Pair
0110-iiii-nnnn-0000 MOV.Q (SP, Imm4*8), Rn
0111-iiii-nnnn-0000 MOV.X (SP, Imm4*8), Xn //Pair
1ttt-mmmm-nnnn-0000 //Load (SB/SW/SL/Q, UB/UW/UL/X)
0000-mmmm-nnnn-0010 ADD Rm, Rn
0001-mmmm-nnnn-0010 SUB Rm, Rn
0010-mmmm-nnnn-0010 ADDSL Rm, Rn
0011-mmmm-nnnn-0010 SUBSL Rm, Rn
0100-mmmm-nnnn-0010 -
0101-mmmm-nnnn-0010 AND Rm, Rn
0110-mmmm-nnnn-0010 OR Rm, Rn
0111-mmmm-nnnn-0010 XOR Rm, Rn
...
0000-iiii-nnnn-0100 ADD Imm4u, Rn
0001-iiii-nnnn-0100 SUB Imm4u, Rn
0010-iiii-nnnn-0100 ADDSL Imm4u, Rn
0011-iiii-nnnn-0100 SUBSL Imm4u, Rn
0100-iiii-iiii-0100 ADD Imm8u*8, SP
0101-iiii-iiii-0100 SUB Imm8u*8, SP
0110-iiii-iiii-0100 BRA Imm8u (+512B)
0111-iiii-iiii-0100 BRA Imm8n (-512B)
...
00nn-iiii-nnnn-1010 ? MOV Imm4u, Yn
01nn-iiii-nnnn-1010 ? ADD Imm4u, Yn
10nn-iiii-nnnn-1010 ? MOV Imm4n, Yn
11nn-iiii-nnnn-1010 ? ADD Imm4n, Yn
mmnn-mmmm-nnnn-1100 ? MOV Ym, Yn //2R MOV
mmnn-mmmm-nnnn-1110 ? ADD Ym, Yn //2R ADD
There are only a few ops which have access to the full GPR space, as this is very expensive for 16-bit ops, so best limited to only the most common cases.
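A rough sketch of decoding part of this map, to show how the Rn and Yn forms differ. Several things here are guesses rather than anything stated above: bit strings are read MSB-first, 4-bit register fields map to R16..R31, "Imm4n"/"Imm8n" are treated as one-extended (value minus 16 or 256), and branch displacements are scaled by 2 bytes. The elided ("...") parts of the map simply decode to None.

```python
ALU_2R = {0: "ADD", 1: "SUB", 2: "ADDSL", 3: "SUBSL", 5: "AND", 6: "OR", 7: "XOR"}
ALU_IMM4 = {0: "ADD", 1: "SUB", 2: "ADDSL", 3: "SUBSL"}


def decode16(w):
    """Disassemble a few 16-bit ops; returns None for anything not covered."""
    if w & 1:
        raise ValueError("bit 0 set: not a 16-bit op")
    top = (w >> 12) & 0xF   # selector nibble (or mm/nn high bits for Yn forms)
    a = (w >> 8) & 0xF      # mmmm or iiii
    b = (w >> 4) & 0xF      # nnnn, or low half of an Imm8
    low = w & 0xF           # block selector, bit 0 always clear

    if low == 0x2 and top in ALU_2R:
        return f"{ALU_2R[top]} R{16 + a}, R{16 + b}"
    if low == 0x4:
        if top in ALU_IMM4:
            return f"{ALU_IMM4[top]} {a}, R{16 + b}"
        imm8 = (a << 4) | b
        if top == 4:
            return f"ADD {imm8 * 8}, SP"
        if top == 5:
            return f"SUB {imm8 * 8}, SP"
        if top == 6:
            return f"BRA {imm8 * 2:+d}"           # forward, in bytes
        if top == 7:
            return f"BRA {(imm8 - 256) * 2:+d}"   # backward, in bytes
    if low in (0xA, 0xC, 0xE):
        # Yn forms: 6-bit register number, high 2 bits in bits 13:12.
        yn = (((w >> 12) & 0x3) << 4) | b
        if low == 0xA:
            op = "ADD" if top & 0x4 else "MOV"
            imm = a - 16 if top & 0x8 else a
            return f"{op} {imm}, R{yn}"
        ym = (((w >> 14) & 0x3) << 4) | a
        return f"{'MOV' if low == 0xC else 'ADD'} R{ym}, R{yn}"
    return None


examples = [decode16(w) for w in (0x0122, 0x4024, 0x7FF4, 0x258A, 0xCF0C)]
```

The cost of the full-GPR forms is visible in the 2R MOV/ADD cases: two 6-bit register fields take 12 of the 16 bits, so with bit 0 reserved only 3 bits remain to select the operation.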
...
The 32-bit opcode map, not laid out here, would likely be entirely disconnected from the 16-bit map.
The usual tradeoff, though, is that 16/32/48/64-bit encodings would make superscalar more difficult and more expensive than 32/64.
But, such a layout could potentially be good for code density at least I guess.
Best I could come up with on a quick-and-dirty pass, it seems...
Don't have much time right now, so will leave it at this.
...