Re: Stealing a Great Idea from the 6600

Subject : Re: Stealing a Great Idea from the 6600
From : kegs (at) *nospam* provalid.com (Kent Dickey)
Newsgroups : comp.arch
Date : 13. Jun 2024, 17:06:07
Organization : provalid.com
Message-ID : <v4f5de$2bfca$1@dont-email.me>
References : 1 2 3 4
User-Agent : trn 4.0-test76 (Apr 2, 2001)
In article <v04tpb$pqus$1@dont-email.me>,
Terje Mathisen  <terje.mathisen@tmsw.no> wrote:
> MitchAlsup1 wrote:
> BGB wrote:
>
> On 4/20/2024 5:03 PM, MitchAlsup1 wrote:
> Like, in-order superscalar isn't going to do crap if nearly every
> instruction depends on every preceding instruction. Even pipelining
> can't help much with this.
>
> Pipelining CREATED this (back to back dependencies). No amount of
> pipelining can eradicate RAW data dependencies.
>
> The compiler can shuffle the instructions into an order to limit the
> number of register dependencies and better fit the pipeline. But,
> then, most of the "hard parts" are already done (so it doesn't take
> much more for the compiler to flag which instructions can run in
> parallel).
>
> Compiler scheduling works for exactly 1 pipeline implementation and
> is suboptimal for all others.
>
> Well, yeah.
>
> OTOH, if your (definitely not my!) compiler can schedule a 4-wide static
> ordering of operations, then it will be very nearly optimal on 2-wide
> and 3-wide as well. (The difference is typically in a bit more loop
> setup and cleanup code than needed.)
>
> Hand-optimizing Pentium asm code did teach me to "think like a cpu",
> which is probably the only part of the experience which is still kind of
> relevant. :-)
>
> Terje
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"


This is a late reply, but an optimal static ordering for N-wide may be
very non-optimal for N-1 (or N-2, etc.).  As an example, assume a perfectly
scheduled 4-wide sequence of instructions, each labeled with its group
number and a letter A-D for its position in the group.  Each instruction
depends on the one in the same position of the previous group (A to A,
B to B, etc.), and there is also a dependency from D to A (INST1_A depends
on INST0_D).  Here's what the instruction groupings look like on a 4-way
machine:

INST0_A
INST0_B
INST0_C
INST0_D
-------
INST1_A
INST1_B
INST1_C
INST1_D
-------
INST2_A

There will obviously be other dependencies (say, INST2_A depends on INST0_B)
but they don't affect how this will be executed.
The ----- lines indicate group boundaries.  All instructions in a group
execute in the same cycle.  So the first 8 instructions take just 2 clocks
on a 4-wide.

If you run this sequence on a 3-wide, then the groupings will become:

INST0_A
INST0_B
INST0_C
-------
INST0_D
-------
INST1_A
INST1_B
INST1_C
-------
INST1_D
-------

What took 2 clocks on the 4-wide now takes 4 clocks on the 3-wide.  And
a different arrangement would take just 3 clocks:

INST0_A
INST0_B
INST0_D
-------
INST1_A
INST0_C
INST1_B
-------
INST1_C
INST1_D
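
The clock counts for this first example can be checked with a small
simulator.  This is only a minimal sketch: it assumes strict in-order
issue, results available one cycle after issue, and no resource limit
other than issue width.  The names cycles_in_order, deps_for and run are
made up for illustration.

```python
def cycles_in_order(program, width):
    """Count issue cycles for `program`, a list of (name, deps) in program order.

    Assumptions (not from the post): strict in-order issue, every result is
    available one cycle after issue, and issue width is the only resource limit.
    """
    issued_at = {}   # name -> cycle in which it issued
    cycle = 0
    slots = 0        # instructions already issued in the current cycle
    for name, deps in program:
        ready = max((issued_at[d] + 1 for d in deps), default=0)
        if ready > cycle or slots == width:
            # In-order: a stalled instruction closes the current group.
            cycle = max(cycle + 1, ready)
            slots = 0
        issued_at[name] = cycle
        slots += 1
    return cycle + 1 if program else 0


def deps_for(name):
    """Dependencies of the first example: A->A, B->B, C->C, D->D, plus D->A."""
    group, slot = int(name[4]), name[-1]      # "INST1_A" -> 1, "A"
    if group == 0:
        return set()
    deps = {f"INST{group - 1}_{slot}"}
    if slot == "A":
        deps.add(f"INST{group - 1}_D")
    return deps


def run(order, width):
    return cycles_in_order([(n, deps_for(n)) for n in order], width)


four_wide_order = ["INST0_A", "INST0_B", "INST0_C", "INST0_D",
                   "INST1_A", "INST1_B", "INST1_C", "INST1_D"]
rearranged = ["INST0_A", "INST0_B", "INST0_D", "INST1_A",
              "INST0_C", "INST1_B", "INST1_C", "INST1_D"]

print(run(four_wide_order, 4))  # 2
print(run(four_wide_order, 3))  # 4
print(run(rearranged, 3))       # 3
```

Under those assumptions the 4-wide order costs 4 clocks on the 3-wide,
and the rearranged order costs 3.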

-------------------------------

A similar problem occurs when the 4-wide is optimally scheduled, but doesn't
issue 4 instructions per cycle due to dependencies.  These dependencies can
hit at bad times for a 2-wide, making the schedule non-optimal there.  Here's
a new 4-wide sequence where INST1_A depends on INST0_C and INST0_A, and
INST2_* all depend on INST1_A, with this pattern repeating in even/odd groups.

INST0_A
INST0_B
INST0_C
-------
INST1_A
-------
INST2_A
INST2_B
INST2_C
-------
INST3_A
-------

This sequence takes 4 clocks on a 4-wide machine.

When run on a 2-wide machine, these are the groupings:

INST0_A
INST0_B
-------
INST0_C
-------
INST1_A
-------
INST2_A
INST2_B
-------
INST2_C
-------
INST3_A
-------

This takes 6 clocks.  But by moving INSTx_B, it could be faster:

INST0_A
INST0_C
-------
INST0_B
INST1_A
-------
INST2_A
INST2_C
-------
INST2_B
INST3_A
-------

Now it takes just 4 clocks.  So an optimal 4-wide schedule can be shown to
be very non-optimal on 3-wide or 2-wide systems.  And this isn't taking
into account other delays and resource limits (like number of loads and
stores supported per cycle).
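
The groupings for this second example can be reproduced with a short
sketch.  It assumes strict in-order issue, one-cycle latency, and issue
width as the only limit.  The dependency table DEPS is my reading of the
description (INST3_A depending on INST2_A and INST2_C follows from "this
pattern repeating"), and groups_in_order is a made-up name.

```python
def groups_in_order(order, deps, width):
    """Split `order` into issue groups for an in-order machine of `width`.

    Assumptions: unit latency, and a stalled instruction ends the current
    group (nothing younger may issue ahead of it).
    """
    groups = []      # groups[i] holds the names issued in cycle i
    issued_at = {}   # name -> cycle index
    for name in order:
        ready = max((issued_at[d] + 1 for d in deps[name]), default=0)
        if not groups or ready > len(groups) - 1 or len(groups[-1]) == width:
            groups.append([])
        issued_at[name] = len(groups) - 1
        groups[-1].append(name)
    return groups


# Dependencies as described in the post; INST3_A is assumed to repeat
# the INST1_A pattern one pair of groups later.
DEPS = {
    "INST0_A": set(), "INST0_B": set(), "INST0_C": set(),
    "INST1_A": {"INST0_A", "INST0_C"},
    "INST2_A": {"INST1_A"}, "INST2_B": {"INST1_A"}, "INST2_C": {"INST1_A"},
    "INST3_A": {"INST2_A", "INST2_C"},
}

four_wide_order = ["INST0_A", "INST0_B", "INST0_C", "INST1_A",
                   "INST2_A", "INST2_B", "INST2_C", "INST3_A"]
moved_b = ["INST0_A", "INST0_C", "INST0_B", "INST1_A",
           "INST2_A", "INST2_C", "INST2_B", "INST3_A"]

print(len(groups_in_order(four_wide_order, DEPS, 4)))  # 4
print(len(groups_in_order(four_wide_order, DEPS, 2)))  # 6
print(len(groups_in_order(moved_b, DEPS, 2)))          # 4
```

Printing the returned lists instead of their lengths gives the same
groupings as the listings above.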

It's an interesting problem as to how bad it can get.  With resource
limits, I suspect it can be an integer multiple worse, but with just
register dependencies, I'm not sure how bad it can get.  I just showed 50%
slower (6 clocks versus 4 on the 2-wide), but I'm not sure if 100% slower
is possible.
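
For tiny cases, one way to poke at this is brute force: try every valid
ordering and compare the best one against the 4-wide order.  The sketch
below does that for the second example on a 2-wide.  It assumes in-order
issue, unit latency, and register dependencies only.  The names
cycles_in_order and best_cycles are hypothetical, and the search is
exhaustive, so it is only practical for a handful of instructions.  It
says nothing about the worst case in general.

```python
from itertools import permutations


def cycles_in_order(order, deps, width):
    """Issue cycles for `order` on an in-order machine, or None if the
    order places an instruction before something it depends on."""
    issued_at, cycle, slots = {}, 0, 0
    for name in order:
        if any(d not in issued_at for d in deps[name]):
            return None                      # not a legal program order
        ready = max((issued_at[d] + 1 for d in deps[name]), default=0)
        if ready > cycle or slots == width:
            cycle, slots = cycle + 1, 0      # unit latency: one step is enough
        issued_at[name] = cycle
        slots += 1
    return cycle + 1 if order else 0


def best_cycles(deps, width):
    """Fewest cycles over all legal static orderings (exhaustive search)."""
    counts = (cycles_in_order(p, deps, width) for p in permutations(deps))
    return min(c for c in counts if c is not None)


# Second example from the post; INST3_A's dependencies are assumed to
# repeat the INST1_A pattern.
DEPS = {
    "INST0_A": set(), "INST0_B": set(), "INST0_C": set(),
    "INST1_A": {"INST0_A", "INST0_C"},
    "INST2_A": {"INST1_A"}, "INST2_B": {"INST1_A"}, "INST2_C": {"INST1_A"},
    "INST3_A": {"INST2_A", "INST2_C"},
}
four_wide_order = list(DEPS)   # dict order matches the 4-wide schedule

given = cycles_in_order(four_wide_order, DEPS, 2)
best = best_cycles(DEPS, 2)
print(given, best, given / best)  # 6 4 1.5
```

Swapping in a different dependency table and width is enough to test
other candidate sequences for a larger slowdown.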

Kent
