Subject: Re: Banked register files
From: ggtgp (at) *nospam* yahoo.com (Brett)
Newsgroups: comp.arch
Date: 20. Aug 2024, 01:23:11
Organization: A noiseless patient Spider
Message-ID: <va0k4v$32dgq$1@dont-email.me>
References: 1 2
User-Agent: NewsTap/5.5 (iPad)
MitchAlsup1 <mitchalsup@aol.com> wrote:
On Mon, 19 Aug 2024 21:46:07 +0000, Brett wrote:
Banked register files, a mental exercise at expanding the register file.
With three-operand RISC you have three 5-bit register
specifiers using 15 bits.
If instead you have eight banks of eight registers you have a 3 bit bank
specifier and three 3 bit register specifiers for 12 bits.
Now the banks need to talk to each other and so you would add a bit to
each register specifier to tell whether it uses the bank or the base
registers, for 72 registers total, not 64. So a 3 bit bank specifier
and three 4 bit register specifiers for 15 bits, the same as a 32
register RISC chip.
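The bit counts in the paragraphs above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function and parameter names are made up for illustration and are not from the post.

```python
def flat_bits(operands, reg_bits=5):
    """Register-specifier bits for a flat (monolithic) register file."""
    return operands * reg_bits


def banked_bits(operands, bank_bits=3, index_bits=3, base_flag=True):
    """One shared bank specifier plus one index per operand.

    base_flag adds one bit per operand that selects bank vs. base registers.
    """
    per_operand = index_bits + (1 if base_flag else 0)
    return bank_bits + operands * per_operand


def register_count(banks=8, regs_per_bank=8, base_regs=8):
    """Total architectural registers: all banks plus the shared base set."""
    return banks * regs_per_bank + base_regs


print(flat_bits(3))                     # 32-register RISC, three operands
print(banked_bits(3, base_flag=False))  # banks only, no base registers
print(banked_bits(3))                   # banks plus bank/base bit per operand
print(register_count())                 # 8 banks of 8 plus 8 base registers
```

Under these assumptions the four values come out as 15, 12, 15 and 72, matching the numbers quoted above.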
This covers 100% of instructions that smell like::
ADD R17,R17,R25
but covers 000% of the instructions that smell like::
ADD R7,R17,R25
I strongly suspect that it covers less than 50% of the 3-operand
instruction uses.
My description was bad; let's do an MC 68000 version: the base registers are
mostly for addressing, and the banks are integer/float.
The compiler can handle this easily with simple dependency grouping, and if
you need more than 8 registers you use the No base flag to write the total
to the base registers. So you have two chains that total in two banks, and
both write to the base registers, where the final totals of the two chains
are added.
This saves a lot of no-bank bits; only the result needs a no-bank override.
Simple code only uses base registers, or base plus one bank.
Call and return parameters are in the base registers, spilling to a bank if
you need more.
Since only the result needs an override, you can do 4 banks of 16
registers.
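One possible reading of the "4 banks of 16" variant, sketched as arithmetic. The field layout here is an assumption, not something the post spells out: a 2-bit bank specifier, three 4-bit register indexes, and a single override bit on the result only.

```python
def result_override_bits(operands=3, bank_bits=2, index_bits=4, override_bits=1):
    """Specifier bits when only the destination carries a no-bank override.

    Hypothetical layout: one shared bank field, one override bit for the
    result, and an index_bits-wide register index per operand.
    """
    return bank_bits + override_bits + operands * index_bits


print(result_override_bits())
```

With those assumed field widths the total is again 15 bits, the same budget as three 5-bit specifiers.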
Two operand plus 16 bit offset instructions would need to sacrifice one
bit of offset. Four operand instructions would save a bit.
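The trade-off by operand count follows from the same arithmetic, here for the 8-banks-of-8 scheme with a bank/base bit on every operand. Names are illustrative only.

```python
def flat_bits(operands, reg_bits=5):
    """Specifier bits with a flat 32-entry register file."""
    return operands * reg_bits


def banked_bits(operands, bank_bits=3, spec_bits=4):
    """3-bit bank field plus a 4-bit specifier (index + bank/base bit) per operand."""
    return bank_bits + operands * spec_bits


for n in (2, 3, 4):
    # Positive delta: banked costs more bits; negative: banked saves bits.
    delta = banked_bits(n) - flat_bits(n)
    print(n, flat_bits(n), banked_bits(n), delta)
```

Two operands cost one extra bit (11 vs 10, hence the lost offset bit), three operands break even at 15, and four operands save one bit (19 vs 20).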
As an extra bonus you now have another 3 bit field that could be another
source or destination if you are not using the bank register. But with
only eight base registers it can look hard to pull off using 4 or 5
registers at once. But maybe not if most of the addressing is in the bank
registers.
The frame pointer would be in the base registers, as it loads the other
pointers.
I guess the real question at this point is how are the banks used when:
a) calling a subroutine
b) returning from a subroutine
c) calling a method
d) calling an external subroutine
e) dealing with {lower bound, upper bound, stride} for each dimension
of a multi-dimensional array
f) how does the scheme work when INT-RF != FP-RF ??
The most general case for banked registers is loop unrolling. Eight
registers is not a lot so the first loop may use two banks, but now you
have 4 unrolls that are fairly trivial to set up.
Unrolling has become less and less necessary with GBOoO implementations.
Producing fewer instructions to encode the whole loop is more important.
Is this a good idea? Maybe, maybe not. This is a mental exercise; it
proves I am mental. ;)
Arguably less crazy than some other proposals.
Compiler people would have to be convinced to get on board as this
would disrupt their built-in idea that register files are monolithic.
How does banked compare to high registers? Not as good.
Intel could pull off something like this to one-up ARM: a new fixed-width
instruction set with a nice patent moat that fits the x86 mindset.
What fits the x86 mindset is the::
MEM Rd,[Rbase+Rindex<<scale+LargeDisplacement]
address mode.
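For reference, the effective address that this mode computes can be sketched as below. The function name, the 64-bit default width, and the sample values are assumptions for illustration; the scale is a shift count of 0 to 3, as in x86 (factors 1, 2, 4, 8).

```python
def effective_address(base, index, scale, displacement, width=64):
    """Compute base + (index << scale) + displacement, wrapped to `width` bits."""
    if scale not in (0, 1, 2, 3):
        raise ValueError("scale must be a shift count from 0 to 3")
    mask = (1 << width) - 1
    return (base + (index << scale) + displacement) & mask


# Element 3 of an array of 4-byte items located 0x20 bytes past the base.
print(hex(effective_address(0x1000, 3, 2, 0x20)))
```

The point of the mode is that one instruction does the add, shift and add that would otherwise take two or three separate instructions.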