Subject: Re: Tonights Tradeoff
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 11 Sep 2024, 23:32:34
Organization: Rocksolid Light
Message-ID: <29971782c81bff6ad49af488bc1b737c@www.novabbs.org>
User-Agent: Rocksolid Light
On Wed, 11 Sep 2024 15:48:03 +0000, Stephen Fuld wrote:
On 9/11/2024 6:54 AM, Robert Finch wrote:
>
snip
>
>
I have found that there can be a lot of registers available if they are
implemented in BRAMs. BRAMs have lots of depth compared to LUT RAMs.
BRAMs have a one cycle latency but that is just part of the pipeline. In
Q+ about 40k LUTs are being used just to keep track of registers
(rename mappings and checkpoints).
>
Given a lot of available registers I keep considering trying a VLIW
design similar to the Itanium, rotating registers and all. But I have a
lot invested in OoO.
>
>
Q+ has seven in-order pipeline stages before things get to the re-order
buffer.
So does the RISC-V BOOM.
Does each of these take a clock cycle? If so, that seems excessive.
What is your cost for a mis-predicted branch?
I have the My 66000 decoder at 4 stages (stage 4 renames up to 6
instructions), with the first 3 fetching and parsing instructions
{along with predicting flow control}.
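
To put rough numbers on the mispredict question, here is a first-order sketch in Python. It uses the usual textbook approximation, not anything measured on Q+, BOOM, or My 66000: the penalty is taken as the number of in-order front-end stages that must refill plus an assumed branch-resolution latency. Only the stage counts (4 and 7) come from the posts above; the function names, the resolve latency of 3 cycles, the 20% branch fraction, and the 5% mispredict rate are all made-up illustration values.

```python
def mispredict_penalty(front_end_stages: int, resolve_latency: int) -> int:
    """Cycles lost per mispredicted branch in a first-order model.

    front_end_stages: in-order stages an instruction passes through before
        it can issue (fetch, parse, decode, rename, ...).
    resolve_latency: ASSUMED cycles from issue until the branch outcome is
        known; a hypothetical parameter, not a figure from the thread.
    """
    return front_end_stages + resolve_latency


def cpi_overhead(branch_fraction: float, mispredict_rate: float, penalty: int) -> float:
    """Average extra cycles per instruction caused by mispredictions."""
    return branch_fraction * mispredict_rate * penalty


if __name__ == "__main__":
    # Stage counts are the ones quoted in the thread; every other number
    # below is an assumption chosen only to make the arithmetic concrete.
    for name, stages in (("4-stage front end", 4), ("7-stage front end", 7)):
        penalty = mispredict_penalty(stages, resolve_latency=3)
        extra = cpi_overhead(branch_fraction=0.2, mispredict_rate=0.05, penalty=penalty)
        print(f"{name}: penalty {penalty} cycles, +{extra:.3f} CPI")
```

The model ignores anything that shortens or lengthens the refill (early redirect from a later predictor, checkpoint restore time, fetch bubbles), so treat it as a way to see how the penalty scales with front-end depth rather than as a prediction for any real core.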