On Sat, 17 Aug 2024 22:05:03 +0000, Thomas Koenig wrote:
Brett <ggtgp@yahoo.com> wrote:
Thomas Koenig <tkoenig@netcologne.de> wrote:
Conceptually some of the modifier bits move into the opcode space, not
as clean but you have to squeeze those bits hard
It is a very fine point of semantics whether the modifier bits are part
of the opcode space or not. I happen to think that they are;
they are just in a (somewhat) different place and spelled a bit
differently, but it does not really matter how you look at it -
you need the bits to encode them.
To me, an instruction has 3 components:: Operands, Routing, and
Calculation. We mainly consider the calculation (ADD) to be the
instruction and gloss over what the operands are and how one
routes them to the places of calculation. My 66000 ISA directly
annotates the operands and the routing. This is what the
modifier bits do; they tell how to interpret the register
specifiers (Rn or #n), (Rn or -Rn) and when to substitute
another word or doubleword in the instruction stream as an
operand directly.
This does not add gates of delay to Operand routing because
all of the constant stuff is overlapped with the comparison
of register specifiers with pipeline result specifiers to
determine forwarding. Constants forward in the network prior
to register results, preventing any added delay.
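
A minimal sketch of the operand selection described above, assuming one set of modifier flags per operand. The flag names, the `Modifiers` class and `select_operand` are hypothetical; the post does not give the actual bit layout, and applying negation to immediates and stream constants as well as registers is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modifiers:
    """Hypothetical per-operand modifier flags (names are illustrative only)."""
    immediate: bool = False    # read the specifier as #n instead of Rn
    negate: bool = False       # use the negated value (-Rn instead of Rn)
    from_stream: bool = False  # take a constant word from the instruction stream


def select_operand(spec, mods, regs, stream_const=None):
    """Return the operand value chosen by the modifier flags."""
    if mods.from_stream:
        if stream_const is None:
            raise ValueError("instruction carries no constant")
        value = stream_const
    elif mods.immediate:
        value = spec           # the specifier field itself is the operand
    else:
        value = regs[spec]     # ordinary register read
    return -value if mods.negate else value
```

With `regs[7] = 40`, the same specifier 7 yields 40, 7 or -40 depending on the flags, which is the sense in which the modifiers annotate the operand rather than the calculation.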
One can come up with a few patterns that are not hard to
decode, and spread across several instruction types.
So, go right ahead. Find an encoding that a) encompasses all of
Mitch's functionality, b) has six bits for registers everywhere,
and c) does not drive the assembler writer crazy (that's me,
for Mitch's design) or hardware designer bonkers (where Mitch has
the experience).
Consider, for example, memory reference address modes for 1
instruction::
LDSB Rd,[Rp,disp16]
LDSB Rd,[IP,disp16]
and
LDSB Rd,[Rp,Ri<<s]
LDSB Rd,[Rp,0]
LDSB Rd,[IP,Ri<<s]
LDSB Rd,[Rp,,disp32]
LDSB Rd,[Rp,Ri<<s,disp32]
LDSB Rd,[IP,,disp32]
LDSB Rd,[IP,Ri<<s,disp32]
LDSB Rd,[Rp,,disp64]
LDSB Rd,[Rp,Ri<<s,disp64]
LDSB Rd,[IP,,disp64]
LDSB Rd,[IP,Ri<<s,disp64]
I use 2 instructions here::
1) a major OpCode with a 16-bit immediate
   R0 in the Rb position is a proxy for IP
2) a major OpCode and a MEME OpCode with 5 bits of Modifiers.
   R0 in the Rb position remains a proxy for IP
   R0 in the Ri position is a proxy for #0.
3) I still have 1 bit left over to denote participation in ATOMIC
   events.
   you get all sizes and signs of Load-Locked
   you get up to 8 LLs
   you can use as many Store-Conditionals as you need
   all interested 3rd parties see memory before or after the event
   and nothing in between.
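
The two proxy rules in the list above can be sketched as an effective-address calculation. This is only an illustration: `effective_address` and its parameter names are hypothetical, and which value of IP the hardware uses (this instruction or the next) is not stated in the post, so the caller supplies it.

```python
MASK64 = (1 << 64) - 1


def effective_address(regs, ip, rb, ri=0, scale=0, disp=0):
    """Compute base + (index << scale) + disp with the R0 proxy rules."""
    base = ip if rb == 0 else regs[rb]            # R0 as base stands for IP
    index = 0 if ri == 0 else regs[ri] << scale   # R0 as index stands for #0
    return (base + index + disp) & MASK64         # wrap to 64 bits
```

Because R0 doubles as IP in one field and as #0 in the other, forms such as [IP,disp16], [Rp,,disp32] and [Rp,Ri<<s,disp64] all fall out of one calculation without spending extra encoding bits on an addressing-mode selector.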
Using 6-bit registers I would be down by 3 bits, causing all sorts of
memory reference grief--leading to other compromises in ISA design
elsewhere.
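
The 3-bit figure presumably comes from the three register specifiers in a scaled-index memory reference (Rd, the base, and Ri), each growing by one bit. A short sketch of that arithmetic, assuming 5-bit specifiers today; `extra_bits` is a hypothetical helper:

```python
def extra_bits(register_fields, old_width=5, new_width=6):
    """Encoding bits lost when every register field grows from old to new width."""
    return register_fields * (new_width - old_width)


# A scaled-index memory reference names three registers: Rd, the base, and Ri.
memory_reference_cost = extra_bits(3)
```

The same helper gives a cost of one bit for an instruction with a single register field, and more for anything that names four registers.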
Based on the code I read out of Brian's compiler, there is no particular
need for 64 registers. I am already using only 72% of the instructions
{72% average, 70% geomean, 69% harmonic mean} that RISC-V requires
{same compiler, same optimizations, just different code generators}.
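
For readers comparing the three summary figures: they are different means of the same per-program ratios. The sketch below uses made-up ratios (the post gives only the summaries, not the underlying data) to show how each is computed with Python's standard `statistics` module.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Hypothetical per-benchmark ratios (My 66000 instructions / RISC-V instructions).
# These numbers are invented for illustration; they are not from the post.
ratios = [0.55, 0.68, 0.74, 0.80, 0.83]

summary = {
    "average": mean(ratios),
    "geomean": geometric_mean(ratios),
    "harmonic": harmonic_mean(ratios),
}
```

For positive values the arithmetic mean is never below the geometric mean, which is never below the harmonic mean, so the ordering 72% / 70% / 69% is the expected one; the small gaps suggest the ratios do not vary wildly from program to program.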
One can argue that having 64-bit displacements is not all that necessary.
But how does one take dusty-deck FORTRAN FEM programs and allow the
common blocks to grow bigger than 4 GB? This is the easiest way
to port code written 5 decades ago to use the sizes of memory they
need to run those "Great Big" FEM models today.
Let's start with the... BB1 instruction, which branches on bit
set in a register, so it needs a major opcode, a bit number, a
register number and a displacement. How do you propose to do that?
Shave one bit off the displacement?
Then proceed to Branch on Condition:: along with the standard
EQ0, NE0, GT0, GE0, LT0, LE0 conditions one gets with other encodings,
I also get FEQ0, FNE0, FGT0, FGE0, FLT0, FLE0, DEQ0, DNE0, DGT0,
DGE0, DLT0, DLE0 along with Interference, SVC, SVR, and RET.
{And I left out the unordered float/double comparisons, above.}
1 instruction, due mostly to NOT having condition codes.
and the four-register instructions like FMA...
I prefer 3-operand 1-result instead of 4-register. 4-register could
have 1-operand and 3 results and lacks decent specificity. 35 years
ago I used 3-register to describe Mc88100 and I regret that now.
I prefer FMAC instead of FMA--in hindsight I should have made it
FMAC and DMAC, but alas... I use FMAC to cover all 4 of::
x = y * z + q
x = y * -z + q
x = y * z - q
x = y * -z - q
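
The four forms above differ only in two sign controls, one on z and one on q. A minimal sketch, assuming finite inputs of moderate magnitude; `fmac` and its flag names are hypothetical, and the exact-rational arithmetic is just a way to imitate the single rounding of a fused operation in Python, not a description of the hardware.

```python
from fractions import Fraction


def fmac(y, z, q, negate_z=False, negate_q=False):
    """Compute y * (+/-z) + (+/-q) with a single rounding at the end."""
    if negate_z:
        z = -z
    if negate_q:
        q = -q
    exact = Fraction(y) * Fraction(z) + Fraction(q)  # exact rational arithmetic
    return float(exact)                              # one rounding, as in a fused op
```

With y = 2, z = 3, q = 1 the four flag combinations give 7, -5, 5 and -7. The sketch does not handle infinities, NaNs or the sign of zero, all of which a real implementation has to get right.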
Trying to wave a red flag in front of Mitch. ;)
I just happen to like FMA :-)
Of course, it might be possible to code FMA like AVX does, with
only three registers - 18 bits for three registers, plus two bits
for which one of them gets smashed for the result.
Why do I get the feeling the compiler guys would not like this ??
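
To make the bit count in the AVX-style suggestion concrete: three 6-bit register numbers take 18 bits, and a 2-bit selector names the register that is overwritten. The field order below is an arbitrary assumption for illustration, and `pack_fma3` / `unpack_fma3` are hypothetical names.

```python
REG_BITS = 6                      # six-bit register specifiers, as in the proposal
REG_MASK = (1 << REG_BITS) - 1


def pack_fma3(ra, rb, rc, dest_sel):
    """Pack three 6-bit register numbers and a 2-bit destination selector."""
    for r in (ra, rb, rc):
        if not 0 <= r <= REG_MASK:
            raise ValueError("register number out of range")
    if dest_sel not in (0, 1, 2):   # the fourth selector value is unused here
        raise ValueError("destination selector must be 0, 1 or 2")
    return (dest_sel << 18) | (ra << 12) | (rb << 6) | rc


def unpack_fma3(bits):
    """Recover (ra, rb, rc, dest_sel) from the 20-bit field."""
    return ((bits >> 12) & REG_MASK, (bits >> 6) & REG_MASK,
            bits & REG_MASK, (bits >> 18) & 0x3)
```

The encoding saves four bits over four independent 6-bit fields, at the price that one source is always destroyed, so the compiler has to insert a copy whenever all three inputs are still live.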
But - just making offhand suggestions won't cut it. You will
have to think about the layout of the instructions, how everything
fits in, and how needing one to four more bits per instruction
can be accommodated.