Re: Tonights Tradeoff

Liste des GroupesRevenir à c arch 
Sujet : Re: Tonights Tradeoff
De : mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Groupes : comp.arch
Date : 12. Sep 2024, 21:46:35
Autres entêtes
Organisation : Rocksolid Light
Message-ID : <9f0a142454e6ab2f1d1985e3af116b4b@www.novabbs.org>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13
User-Agent : Rocksolid Light
On Thu, 12 Sep 2024 19:28:19 +0000, Robert Finch wrote:

On 2024-09-12 12:46 p.m., MitchAlsup1 wrote:
On Thu, 12 Sep 2024 3:37:22 +0000, Robert Finch wrote:
>
On 2024-09-11 11:48 a.m., Stephen Fuld wrote:
On 9/11/2024 6:54 AM, Robert Finch wrote:
>
snip
>
>
I have found that there can be a lot of registers available if they
are implemented in BRAMs. BRAMs have lots of depth compared to LUT
RAMs. BRAMs have a one cycle latency but that is just part of the
pipeline. In Q+ about 40k LUTs are being used just to keep track of
registers. (rename mappings and checkpoints).
>
Given a lot of available registers I keep considering trying a VLIW
design similar to the Itanium, rotating register and all. But I have a
lot invested in OoO.
>
>
Q+ has seven in-order pipeline stages before things get to the re-
order buffer.
>
Does each of these take a clock cycle?  If so, that seems excessive.
What is your cost for a mis-predicted branch?
>
>
>
>
Each stage takes one clock cycle. Unconditional branches are detected at
the second stage and taken then so they do not consume as many clocks.
There are two extra stages to handle vector instructions. Those two
stages could be removed if vectors are not needed.
>
Mis-predicted branches are really expensive. They take about six clocks,
plus the seven clocks to refill the pipeline, so it is about 13 clocks.
Seems like it should be possible to reduce the number of clocks of
processing during the miss, but I have not got around to it yet. There
is a branch miss state machine that restores the checkpoint. Branches
need a lot of work yet.
>
In a machine I did in 1990-2 we would fetch down the alternate path
and put the recovery instructions in a buffer, so when a branch was
mispredicted, the instructions were already present.
>
So, you can't help the 6 cycles of branch verification latency,
but you can fix the pipeline refill latency.
>
We got 2.05 i/c on XLISP SPECnit 89 mostly because of the low backup
overhead.
>
>
That sounds like a good idea. The fetch typically idles for a few cycles
as it can fetch more instructions than can be consumed in a single
cycle. So, while it’s idling it could be fetching down an alternate
path. Part of the pipeline would need to be replicated doubling up on
the size. Then an A/B switch happens which selects the right pipeline.
You want the alternate path buffer to be staged up ready to go. You
do not necessarily have to dedicate any post decode pipeline stages to
them. You can fetch these from the buffer indexed by branch number,
so when the branch fires to execute the verify stuff, you are fetching
the backup instructions.
You CAN use the renamer state after the previously issued group so the
buffer contains already renamed registers--If you back up and use these
instructions you threw away the post issue parts of the renamer anyway.
This enables you to take a cycle to back up the renamer without penalty.

Would not want to queue to the reorder buffer from the alternate path,
as there is a bit of a bottleneck at queue. Not wondering what to do
about multiple branches. Multiple pipelines and more switches? Front-end
would look like a pipeline tree to handle multiple outstanding branches.
>
Was wondering what to do with the extra fetch bandwidth. Fetching two
cache-lines at once means there may have been up to 21 instructions
fetched. But its only a four-wide machine.
For my 6-wide machine I am fetching 1/2 a cache line twice for the
sequential path and 1/2 a cache line for the alternate path from
an 8 banked ICache.

                                           I was going to try and feed
multiple cores from the same cache. Core A is performance, core B is
average, and core C is economy using left-over bandwidth from A and B.
ARM's big little strategy, with some power philosophy, gives you that
average as a little core running at high voltage and frequency.

I can code the alternate path fetch up and try it in SIM, but it is too
large for my FPGA right now. Another config option. Might put the switch
before the rename stage. Nothing like squeezing a mega-LUT design into
100k LUTs. Getting a feel for the size of things. A two-wide in-order
core would easily fit. Even a simple two-wide out-of-order core would
likely fit, if one stuck to 32-bits and a RISC instruction set. A
four-wide OoO core with lots of features is pushing it.

Date Sujet#  Auteur
7 Sep 24 * Tonights Tradeoff99Robert Finch
7 Sep 24 `* Re: Tonights Tradeoff98MitchAlsup1
8 Sep 24  `* Re: Tonights Tradeoff97Robert Finch
8 Sep 24   `* Re: Tonights Tradeoff96MitchAlsup1
10 Sep 24    `* Re: Tonights Tradeoff95Robert Finch
10 Sep 24     +* Re: Tonights Tradeoff17BGB
10 Sep 24     i+* Re: Tonights Tradeoff12Robert Finch
10 Sep 24     ii+* Re: Tonights Tradeoff10BGB
11 Sep 24     iii`* Re: Tonights Tradeoff9Robert Finch
11 Sep 24     iii +* Re: Tonights Tradeoff7Stephen Fuld
11 Sep 24     iii i+- Re: Tonights Tradeoff1MitchAlsup1
12 Sep 24     iii i`* Re: Tonights Tradeoff5Robert Finch
12 Sep 24     iii i `* Re: Tonights Tradeoff4MitchAlsup1
12 Sep 24     iii i  `* Re: Tonights Tradeoff3Robert Finch
12 Sep 24     iii i   `* Re: Tonights Tradeoff2MitchAlsup1
13 Sep 24     iii i    `- Re: Tonights Tradeoff1MitchAlsup1
12 Sep 24     iii `- Re: Tonights Tradeoff1BGB
11 Sep 24     ii`- Re: Tonights Tradeoff1MitchAlsup1
11 Sep 24     i`* Re: Tonights Tradeoff4MitchAlsup1
12 Sep 24     i `* Re: Tonights Tradeoff3Thomas Koenig
12 Sep 24     i  `* Re: Tonights Tradeoff2BGB
12 Sep 24     i   `- Re: Tonights Tradeoff1Robert Finch
11 Sep 24     `* Re: Tonights Tradeoff77MitchAlsup1
15 Sep 24      `* Re: Tonights Tradeoff76Robert Finch
16 Sep 24       `* Re: Tonights Tradeoff75Robert Finch
24 Sep 24        `* Re: Tonights Tradeoff - Background Execution Buffers74Robert Finch
24 Sep 24         `* Re: Tonights Tradeoff - Background Execution Buffers73MitchAlsup1
26 Sep 24          `* Re: Tonights Tradeoff - Background Execution Buffers72Robert Finch
26 Sep 24           `* Re: Tonights Tradeoff - Background Execution Buffers71MitchAlsup1
27 Sep 24            `* Re: Tonights Tradeoff - Background Execution Buffers70Robert Finch
4 Oct 24             `* Re: Tonights Tradeoff - Background Execution Buffers69Robert Finch
4 Oct 24              +* Re: Tonights Tradeoff - Background Execution Buffers66Anton Ertl
4 Oct 24              i`* Re: Tonights Tradeoff - Background Execution Buffers65Robert Finch
5 Oct 24              i `* Re: Tonights Tradeoff - Background Execution Buffers64Anton Ertl
9 Oct 24              i  `* Re: Tonights Tradeoff - Background Execution Buffers63Robert Finch
9 Oct 24              i   +* Re: Tonights Tradeoff - Background Execution Buffers3MitchAlsup1
9 Oct 24              i   i+- Re: Tonights Tradeoff - Background Execution Buffers1Robert Finch
12 Oct 24              i   i`- Re: Tonights Tradeoff - Background Execution Buffers1BGB
12 Oct 24              i   +* Re: Tonights Tradeoff - Carry and Overflow58Robert Finch
12 Oct 24              i   i`* Re: Tonights Tradeoff - Carry and Overflow57MitchAlsup1
12 Oct 24              i   i `* Re: Tonights Tradeoff - Carry and Overflow56BGB
12 Oct 24              i   i  `* Re: Tonights Tradeoff - Carry and Overflow55Robert Finch
13 Oct 24              i   i   +* Re: Tonights Tradeoff - Carry and Overflow3MitchAlsup1
13 Oct 24              i   i   i`* Re: Tonights Tradeoff - ATOM2Robert Finch
13 Oct 24              i   i   i `- Re: Tonights Tradeoff - ATOM1MitchAlsup1
13 Oct 24              i   i   +- Re: Tonights Tradeoff - Carry and Overflow1BGB
31 Oct 24              i   i   `* Page fetching cache controller50Robert Finch
31 Oct 24              i   i    +- Re: Page fetching cache controller1MitchAlsup1
6 Nov 24              i   i    `* Re: Q+ Fibonacci48Robert Finch
17 Apr 25              i   i     `* Re: register sets47Robert Finch
17 Apr 25              i   i      `* Re: register sets46Stephen Fuld
17 Apr 25              i   i       +- Re: register sets1Robert Finch
17 Apr 25              i   i       `* Re: register sets44MitchAlsup1
18 Apr 25              i   i        `* Re: register sets43Robert Finch
18 Apr 25              i   i         `* Re: register sets42MitchAlsup1
20 Apr 25              i   i          `* Re: register sets41Robert Finch
21 Apr 25              i   i           `* Re: auto predicating branches40Robert Finch
21 Apr 25              i   i            `* Re: auto predicating branches39Anton Ertl
21 Apr 25              i   i             +- Is an instruction on the critical path? (was: auto predicating branches)1Anton Ertl
21 Apr 25              i   i             `* Re: auto predicating branches37MitchAlsup1
22 Apr 25              i   i              `* Re: auto predicating branches36Anton Ertl
22 Apr 25              i   i               +- Re: auto predicating branches1MitchAlsup1
22 Apr 25              i   i               `* Re: auto predicating branches34Anton Ertl
22 Apr 25              i   i                `* Re: auto predicating branches33MitchAlsup1
23 Apr 25              i   i                 +* Re: auto predicating branches3Stefan Monnier
23 Apr 25              i   i                 i`* Re: auto predicating branches2Anton Ertl
25 Apr 25              i   i                 i `- Re: auto predicating branches1MitchAlsup1
23 Apr 25              i   i                 `* Re: auto predicating branches29Anton Ertl
23 Apr 25              i   i                  `* Re: auto predicating branches28MitchAlsup1
24 Apr 25              i   i                   `* Re: asynch register rename27Robert Finch
27 Apr 25              i   i                    `* Re: fractional PCs26Robert Finch
27 Apr 25              i   i                     `* Re: fractional PCs25MitchAlsup1
28 Apr 25              i   i                      `* Re: fractional PCs24Robert Finch
28 Apr 25              i   i                       +* Re: fractional PCs13MitchAlsup1
29 Apr 25              i   i                       i`* Re: fractional PCs12Robert Finch
5 May 25              i   i                       i `* Re: control co-processor11Robert Finch
5 May 25              i   i                       i  `* Re: control co-processor10Al Kossow
5 May 25              i   i                       i   `* Re: control co-processor9Stefan Monnier
6 May 25              i   i                       i    +* Re: control co-processor2MitchAlsup1
7 May 25              i   i                       i    i`- Re: control co-processor1MitchAlsup1
7 May 25              i   i                       i    `* Scan chains (was: control co-processor)6Stefan Monnier
7 May 25              i   i                       i     +* Re: Scan chains (was: control co-processor)2Al Kossow
7 May 25              i   i                       i     i`- Re: Scan chains1Stefan Monnier
7 May 25              i   i                       i     `* Re: Scan chains3MitchAlsup1
7 May 25              i   i                       i      `* Re: Scan chains2Stefan Monnier
8 May 25              i   i                       i       `- Re: Scan chains1MitchAlsup1
29 Apr 25              i   i                       `* Re: fractional PCs10Robert Finch
29 Apr 25              i   i                        `* Re: fractional PCs9MitchAlsup1
30 Apr 25              i   i                         `* Re: fractional PCs8Robert Finch
30 Apr 25              i   i                          +* Re: fractional PCs6Thomas Koenig
1 May 25              i   i                          i+- Re: fractional PCs1Robert Finch
2 May 25              i   i                          i`* Re: fractional PCs4moi
2 May 25              i   i                          i +* Re: millicode, extracode, fractional PCs2John Levine
2 May 25              i   i                          i i`- Re: millicode, extracode, fractional PCs1moi
2 May 25              i   i                          i `- Re: fractional PCs1moi
30 Apr 25              i   i                          `- Re: fractional PCs1MitchAlsup1
13 Oct 24              i   `- Re: Tonights Tradeoff - Background Execution Buffers1Anton Ertl
4 Oct 24              +- Re: Tonights Tradeoff - Background Execution Buffers1BGB
6 Oct 24              `- Re: Tonights Tradeoff - Background Execution Buffers1MitchAlsup1

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal