Re: Stealing a Great Idea from the 6600

Liste des GroupesRevenir à c arch 
Sujet : Re: Stealing a Great Idea from the 6600
De : mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Groupes : comp.arch
Date : 21. Apr 2024, 19:57:27
Autres entêtes
Organisation : Rocksolid Light
Message-ID : <152f8504112a37d8434c663e99cb36c5@www.novabbs.org>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13
User-Agent : Rocksolid Light
BGB wrote:

On 4/20/2024 5:03 PM, MitchAlsup1 wrote:
BGB wrote:
>
Compilers are notoriously unable to outguess a good branch predictor.
 

Errm, assuming the compiler is capable of things like general-case inlining and loop-unrolling.

I was thinking of simpler things, like shuffling operators between independent (sub)expressions to limit the number of register-register dependencies.

Like, in-order superscalar isn't going to do crap if nearly every instruction depends on every preceding instruction. Even pipelining can't help much with this.
Pipelining CREATED this (back to back dependencies). No amount of
pipelining can eradicate RAW data dependencies.

The compiler can shuffle the instructions into an order to limit the number of register dependencies and better fit the pipeline. But, then, most of the "hard parts" are already done (so it doesn't take much more for the compiler to flag which instructions can run in parallel).
Compiler scheduling works for exactly 1 pipeline implementation and
is suboptimal for all others.

Meanwhile, a naive superscalar may miss cases that could be run in parallel, if it is evaluating the rules "coarsely" (say, evaluating what is safe or not safe to run things in parallel based on general groupings of opcodes rather than the rules of specific opcodes; or, say, false-positive register alias if, say, part of the Imm field of a 3RI instruction is interpreted as a register ID, ...).

Granted, seemingly even a naive approach is able to get around 20% ILP out of "GCC -O3" output for RV64G...

But, the GCC output doesn't seem to be quite as weak as some people are claiming either.

ties the code to a specific pipeline structure, and becomes effectively moot with OoO CPU designs).
 OoO exists, in a practical sense, to abstract the pipeline out of the compiler; or conversely, to allow multiple implementations to run the
same compiled code optimally on each implementation.
 

Granted, but OoO isn't cheap.
But it does get the job done.

So, a case could be made that a "general use" ISA be designed without the use of explicit bundling. In my case, using the bundle flags also requires the code to use an instruction to signal to the CPU what configuration of pipeline it expects to run on, with the CPU able to fall back to scalar (or superscalar) execution if it does not match.
 Sounds like a bridge too far for your 8-wide GBOoO machine.
 

For sake of possible fancier OoO stuff, I upheld a basic requirement for the instruction stream:
The semantics of the instructions as executed in bundled order needs to be equivalent to that of the instructions as executed in sequential order.

In this case, the OoO CPU can entirely ignore the bundle hints, and treat "WEXMD" as effectively a NOP.

This would have broken down for WEX-5W and WEX-6W (where enforcing a parallel==sequential constraint effectively becomes unworkable, and/or renders the wider pipeline effectively moot), but these designs are likely dead anyways.

And, with 3-wide, the parallel==sequential order constraint remains in effect.

For the most part, thus far nearly everything has ended up as "Mode 2", namely:
   3 lanes;
     Lane 1 does everything;
     Lane 2 does Basic ALU ops, Shift, Convert (CONV), ...
     Lane 3 only does Basic ALU ops and a few CONV ops and similar.
       Lane 3 originally also did Shift, dropped to reduce cost.
     Mem ops may eat Lane 3, ...
 Try 6-lanes:
    1,2,3 Memory ops + integer ADD and Shifts
    4     FADD   ops + integer ADD and FMisc
    5     FMAC   ops + integer ADD
    6     CMP-BR ops + integer ADD
 

As can be noted, my thing is more a "LIW" rather than a "true VLIW".
Mine is neither LIW or VLIW but it definitely is LBIO through GBOoO

So, MEM/BRA/CMP/... all end up in Lane 1.

Lanes 2/3 effectively ending up used for fold over most of the ALU ops turning Lane 1 mostly into a wall of Load and Store instructions.

Where, say:
   Mode 0 (Default):
     Only scalar code is allowed, CPU may use superscalar (if available).
   Mode 1:
     2 lanes:
       Lane 1 does everything;
       Lane 2 does ALU, Shift, and CONV.
     Mem ops take up both lanes.
       Effectively scalar for Load/Store.
       Later defined that 128-bit MOV.X is allowed in a Mode 1 core.
Modeless.
 

Had defined wider modes, and ones that allow dual-lane IO and FPU instructions, but these haven't seen use (too expensive to support in hardware).
 
Had ended up with the ambiguous "extension" to the Mode 2 rules of allowing an FPU instruction to be executed from Lane 2 if there was not an FPU instruction in Lane 1, or allowing co-issuing certain FPU instructions if they effectively combine into a corresponding SIMD op.
 
In my current configurations, there is only a single memory access port.
 This should imply that your 3-wide pipeline is running at 90%-95% memory/cache saturation.
 

If you mean that execution is mostly running end-to-end memory operations, yeah, this is basically true.

Comparably, RV code seems to end up running a lot of non-memory ops in Lane 1, whereas BJX2 is mostly running lots of memory ops, with Lane 2 handling most of the ALU ops and similar (and Lane 3, occasionally).
One of the things that I notice with My 66000 is when you get all the constants you ever need at the calculation OpCodes, you end up with FEWER instructions that "go random places" such as instructions that
<well> paste constants together. This leave you with a data dependent
string of calculations with occasional memory references. That is::
universal constants gets rid of the easy to pipeline extra instructions
leaving the meat of the algorithm exposed.

 If you design around the notion of a 3R1W register file, FMAC and INSERT
fall out of the encoding easily. Done right, one can switch it into a 4R
or 4W register file for ENTER and EXIT--lessening the overhead of call/ret.
 

Possibly.

It looks like some savings could be possible in terms of prologs and epilogs.

As-is, these are generally like:
   MOV    LR, R18
   MOV    GBR, R19
   ADD    -192, SP
   MOV.X  R18, (SP, 176)  //save GBR and LR
   MOV.X  ...  //save registers
Why not an instruction that saves LR and GBR without wasting instructions
to place them side by side prior to saving them ??

   WEXMD  2  //specify that we want 3-wide execution here

   //Reload GBR, *1
   MOV.Q  (GBR, 0), R18
   MOV    0, R0  //special reloc here
   MOV.Q  (GBR, R0), R18
   MOV    R18, GBR
It is gorp like that that lead me to do it in HW with ENTER and EXIT.
Save registers to the stack, setup FP if desired, allocate stack on SP, and decide if EXIT also does RET or just reloads the file. This would require 2 free registers if done in pure SW, along with several MOVs...

   //Generate Stack Canary, *2
   MOV    0x5149, R18  //magic number (randomly generated)
   VSKG   R18, R18  //Magic (combines input with SP and magic numbers)
   MOV.Q  R18, (SP, 144)

   ...
   function-specific stuff
   ...

   MOV    0x5149, R18
   MOV.Q  (SP, 144), R19
   VSKC   R18, R19  //Validate canary
   ...

*1: This part ties into the ABI, and mostly exists so that each PE image can get GBR reloaded back to its own ".data"/".bss" sections (with
Universal displacements make GBR unnecessary as a memory reference can
be accompanied with a 16-bit, 32-bit, or 64-bit displacement. Yes, you can read GOT[#i] directly without a pointer to it.

multiple program instances in a single address space). But, does mean that pretty much every non-leaf function ends up needing to go through this ritual.
Universal constant solves the underlying issue.

*2: Pretty much any function that has local arrays or similar, serves to protect register save area. If the magic number can't regenerate a matching canary at the end of the function, then a fault is generated.
My 66000 can place the callee save registers in a place where user cannot
access them with LDs or modify them with STs. So malicious code cannot
damage the contract between ABI and core.

The cost of some of this starts to add up.

In isolation, not much, but if all this happens, say, 500 or 1000 times or more in a program, this can add up.
Was thinking about that last night. H&P "book" statistics say that call/ret
represents 2% of instructions executed. But if you add up the prologue and
epilogue instructions you find 8% of instructions are related to calling and returning--taking the problem from (at 2%) ignorable to (at 8%) a big
ticket item demanding something be done.
8% represents saving/restoring only 3 registers vis stack and associated SP
arithmetic. So, it can easily go higher.

....

Date Sujet#  Auteur
17 Apr 24 * Stealing a Great Idea from the 6600128John Savard
18 Apr 24 +* Re: Stealing a Great Idea from the 6600125MitchAlsup1
18 Apr 24 i`* Re: Stealing a Great Idea from the 6600124John Savard
18 Apr 24 i `* Re: Stealing a Great Idea from the 6600123MitchAlsup1
19 Apr 24 i  `* Re: Stealing a Great Idea from the 6600122John Savard
19 Apr 24 i   `* Re: Stealing a Great Idea from the 6600121John Savard
19 Apr 24 i    `* Re: Stealing a Great Idea from the 6600120MitchAlsup1
20 Apr 24 i     +* Re: Stealing a Great Idea from the 66002John Savard
21 Apr 24 i     i`- Re: Stealing a Great Idea from the 66001John Savard
20 Apr 24 i     `* Re: Stealing a Great Idea from the 6600117John Savard
20 Apr 24 i      `* Re: Stealing a Great Idea from the 6600116John Savard
20 Apr 24 i       `* Re: Stealing a Great Idea from the 6600115MitchAlsup1
20 Apr 24 i        +* Re: Stealing a Great Idea from the 6600105BGB
21 Apr 24 i        i`* Re: Stealing a Great Idea from the 6600104MitchAlsup1
21 Apr 24 i        i +* Re: Stealing a Great Idea from the 660063John Savard
21 Apr 24 i        i i+* Re: Stealing a Great Idea from the 660015John Savard
25 Apr 24 i        i ii`* Re: Stealing a Great Idea from the 660014Lawrence D'Oliveiro
25 Apr 24 i        i ii +* Re: Stealing a Great Idea from the 660012MitchAlsup1
25 Apr 24 i        i ii i+- Re: Stealing a Great Idea from the 66001Lawrence D'Oliveiro
30 Apr 24 i        i ii i`* Re: a bit of history, Stealing a Great Idea from the 660010John Levine
3 May 24 i        i ii i `* Re: a bit of history, Stealing a Great Idea from the 66009Anton Ertl
3 May 24 i        i ii i  +* Re: a bit of history, Stealing a Great Idea from the 66007John Levine
4 May 24 i        i ii i  i`* Re: a bit of history, Stealing a Great Idea from the 66006Thomas Koenig
4 May 24 i        i ii i  i +* Re: a bit of history, Stealing a Great Idea from the 66004John Levine
4 May 24 i        i ii i  i i`* Re: a bit of history, Stealing a Great Idea from the 66003MitchAlsup1
5 May 24 i        i ii i  i i `* Re: a bit of history, Stealing a Great Idea from the 66002Thomas Koenig
5 May 24 i        i ii i  i i  `- Re: a bit of history, Stealing a Great Idea from the 66001MitchAlsup1
28 Jul 24 i        i ii i  i `- Re: a bit of history, Stealing a Great Idea from the 66001Lawrence D'Oliveiro
3 May 24 i        i ii i  `- Re: a bit of history, Stealing a Great Idea from the 66001MitchAlsup1
25 Apr 24 i        i ii `- Re: Stealing a Great Idea from the 66001John Savard
21 Apr 24 i        i i`* Re: Stealing a Great Idea from the 660047MitchAlsup1
23 Apr 24 i        i i +* Re: Stealing a Great Idea from the 660045George Neuner
23 Apr 24 i        i i i`* Re: Stealing a Great Idea from the 660044MitchAlsup1
25 Apr 24 i        i i i `* Re: Stealing a Great Idea from the 660043George Neuner
26 Apr 24 i        i i i  `* Re: Stealing a Great Idea from the 660042BGB
26 Apr 24 i        i i i   `* Re: Stealing a Great Idea from the 660041MitchAlsup1
26 Apr 24 i        i i i    +* Re: Stealing a Great Idea from the 66002Anton Ertl
26 Apr 24 i        i i i    i`- Re: Stealing a Great Idea from the 66001MitchAlsup1
26 Apr 24 i        i i i    +* Re: Stealing a Great Idea from the 66004BGB
26 Apr 24 i        i i i    i+* Re: Stealing a Great Idea from the 66002MitchAlsup1
27 Apr 24 i        i i i    ii`- Re: Stealing a Great Idea from the 66001BGB
26 Apr 24 i        i i i    i`- Re: Stealing a Great Idea from the 66001MitchAlsup1
27 Apr 24 i        i i i    `* Re: Stealing a Great Idea from the 660034BGB
27 Apr 24 i        i i i     `* Re: Stealing a Great Idea from the 660033MitchAlsup1
28 Apr 24 i        i i i      `* Re: Stealing a Great Idea from the 660032BGB
28 Apr 24 i        i i i       `* Re: Stealing a Great Idea from the 660031MitchAlsup1
28 Apr 24 i        i i i        `* Re: Stealing a Great Idea from the 660030BGB
28 Apr 24 i        i i i         +* Re: Stealing a Great Idea from the 660024BGB
28 Apr 24 i        i i i         i`* Re: Stealing a Great Idea from the 660023BGB
28 Apr 24 i        i i i         i `* Re: Stealing a Great Idea from the 660022Thomas Koenig
28 Apr 24 i        i i i         i  `* Re: Stealing a Great Idea from the 660021BGB
28 Apr 24 i        i i i         i   `* Re: Stealing a Great Idea from the 660020BGB
28 Apr 24 i        i i i         i    +* Re: Stealing a Great Idea from the 66002Thomas Koenig
28 Apr 24 i        i i i         i    i`- Re: Stealing a Great Idea from the 66001BGB
29 Jul 24 i        i i i         i    +* Re: Stealing a Great Idea from the 660016Lawrence D'Oliveiro
29 Jul 24 i        i i i         i    i+* Re: Stealing a Great Idea from the 66006BGB
30 Jul 24 i        i i i         i    ii`* Re: Stealing a Great Idea from the 66005Lawrence D'Oliveiro
30 Jul 24 i        i i i         i    ii `* Re: Stealing a Great Idea from the 66004BGB
31 Jul 24 i        i i i         i    ii  `* Re: Stealing a Great Idea from the 66003Lawrence D'Oliveiro
31 Jul 24 i        i i i         i    ii   `* Re: Stealing a Great Idea from the 66002BGB
1 Aug 24 i        i i i         i    ii    `- Re: Stealing a Great Idea from the 66001Lawrence D'Oliveiro
29 Jul 24 i        i i i         i    i`* Re: Stealing a Great Idea from the 66009Terje Mathisen
29 Jul 24 i        i i i         i    i `* Re: Stealing a Great Idea from the 66008MitchAlsup1
30 Jul 24 i        i i i         i    i  +- Re: Stealing a Great Idea from the 66001Lawrence D'Oliveiro
30 Jul 24 i        i i i         i    i  +* Re: Stealing a Great Idea from the 66004Michael S
30 Jul 24 i        i i i         i    i  i`* Re: Stealing a Great Idea from the 66003MitchAlsup1
31 Jul 24 i        i i i         i    i  i `* Re: Stealing a Great Idea from the 66002BGB
1 Aug 24 i        i i i         i    i  i  `- Re: Stealing a Great Idea from the 66001Lawrence D'Oliveiro
1 Aug 24 i        i i i         i    i  `* Re: Stealing a Great Idea from the 66002Thomas Koenig
1 Aug 24 i        i i i         i    i   `- Re: Stealing a Great Idea from the 66001MitchAlsup1
29 Jul 24 i        i i i         i    `- Re: Stealing a Great Idea from the 66001George Neuner
28 Apr 24 i        i i i         `* Re: Stealing a Great Idea from the 66005MitchAlsup1
28 Apr 24 i        i i i          `* Re: Stealing a Great Idea from the 66004BGB
29 Apr 24 i        i i i           `* Re: Stealing a Great Idea from the 66003MitchAlsup1
29 Apr 24 i        i i i            `* Re: Stealing a Great Idea from the 66002BGB
29 Apr 24 i        i i i             `- Re: Stealing a Great Idea from the 66001Thomas Koenig
29 Apr 24 i        i i `- Re: Stealing a Great Idea from the 66001Tim Rentsch
21 Apr 24 i        i `* Re: Stealing a Great Idea from the 660040BGB
21 Apr 24 i        i  `* Re: Stealing a Great Idea from the 660039MitchAlsup1
22 Apr 24 i        i   +* Re: Stealing a Great Idea from the 66003BGB
22 Apr 24 i        i   i`* Re: Stealing a Great Idea from the 66002MitchAlsup1
22 Apr 24 i        i   i `- Re: Stealing a Great Idea from the 66001BGB
22 Apr 24 i        i   +* Re: Stealing a Great Idea from the 66002John Savard
22 Apr 24 i        i   i`- Re: Stealing a Great Idea from the 66001BGB
22 Apr 24 i        i   `* Re: Stealing a Great Idea from the 660033Terje Mathisen
22 Apr 24 i        i    +- Re: Stealing a Great Idea from the 66001BGB
13 Jun 24 i        i    `* Re: Stealing a Great Idea from the 660031Kent Dickey
13 Jun 24 i        i     +* Re: Stealing a Great Idea from the 660016Stefan Monnier
13 Jun 24 i        i     i`* Re: Stealing a Great Idea from the 660015BGB
13 Jun 24 i        i     i `* Re: Stealing a Great Idea from the 660014MitchAlsup1
14 Jun 24 i        i     i  `* Re: Stealing a Great Idea from the 660013BGB
18 Jun 24 i        i     i   `* Re: Stealing a Great Idea from the 660012MitchAlsup1
19 Jun 24 i        i     i    +* Re: Stealing a Great Idea from the 66008BGB
19 Jun 24 i        i     i    i`* Re: Stealing a Great Idea from the 66007MitchAlsup1
19 Jun 24 i        i     i    i +* Re: Stealing a Great Idea from the 66005BGB
19 Jun 24 i        i     i    i i`* Re: Stealing a Great Idea from the 66004MitchAlsup1
20 Jun 24 i        i     i    i i `* Re: Stealing a Great Idea from the 66003Thomas Koenig
20 Jun 24 i        i     i    i i  `* Re: Stealing a Great Idea from the 66002MitchAlsup1
21 Jun 24 i        i     i    i i   `- Re: Stealing a Great Idea from the 66001Thomas Koenig
20 Jun 24 i        i     i    i `- Re: Stealing a Great Idea from the 66001John Savard
19 Jun 24 i        i     i    +- Re: Stealing a Great Idea from the 66001Thomas Koenig
20 Jun 24 i        i     i    +- Re: Stealing a Great Idea from the 66001MitchAlsup1
31 Jul 24 i        i     i    `- Re: Stealing a Great Idea from the 66001Lawrence D'Oliveiro
13 Jun 24 i        i     +* Re: Stealing a Great Idea from the 660013MitchAlsup1
14 Jun 24 i        i     `- Re: Stealing a Great Idea from the 66001Terje Mathisen
22 Apr 24 i        `* Re: Stealing a Great Idea from the 66009John Savard
18 Apr 24 `* Re: Stealing a Great Idea from the 66002Lawrence D'Oliveiro

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal