Re: Arguments for a sane ISA 6-years later

Subject : Re: Arguments for a sane ISA 6-years later
From : cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups : comp.arch
Date : 30. Jul 2024, 23:47:44
Organization : A noiseless patient Spider
Message-ID : <v8bqik$17qhc$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10
User-Agent : Mozilla Thunderbird
On 7/30/2024 3:56 PM, Chris M. Thomasson wrote:
> On 7/30/2024 1:23 PM, BGB wrote:
>> On 7/30/2024 4:44 AM, Anton Ertl wrote:
>>> BGB <cr88192@gmail.com> writes:
>>>> Otherwise, stuff isn't going to fit into the FPGAs.
>>>>
>>>> Something like TSO is a lot of complexity for not much gain.
>>>
>>> Given that you are so constrained, the easiest corner to cut is to
>>> have only one core.  And then even sequential consistency is trivial
>>> to implement.
>>
>> On the XC7A100T, this is what I am doing...
>>
>> With the current feature-set, don't have enough resource budget to go dual core at present.
>>
>> I can go dual core on the XC7A200T though.
>>
>> Granted, one could argue that maybe one should not do such an elaborate CPU. Say, a case could be made for just doing a RISC-V implementation.
>>
>> There is an RV32GC implementation (dual-issue superscalar) that can run on the XC7A100T that, ironically, still takes most of the FPGA and can only run at ~ 25 or 33 MHz. Its IPC is pretty good, but it runs at a low clock-speed and is 32-bit.
>>
>> Only real way to make small/fast cores though is to make them single-issue and limit the feature-set (only doing a basic integer ISA).
> [...]
>
> Have you ever messed around with a Cell processor? Think of its vector processing units, or Synergistic Processing Elements (SPE) iirc. Also, iirc it was not that easy to program for. buffered DMA wrt the SPE's, again iirc. So, some games only used the "single" PPE unit. Iirc, they wanted more PPE units but that was not realized...

No real first-hand experience programming for it; I was in my early 20s when the PlayStation 3 came out, and wasn't really messing with much of anything beyond normal desktop PCs at the time.
I have a few times considered trying to pair a bigger core (such as one running the BJX2 main profile) with smaller cores (running a smaller BJX2 profile), but couldn't really get the smaller core small enough while still being useful for what I wanted to do with it.
While a moderately smaller core is possible by using a single-issue integer-only design, this is rather limited...
And, sticking two more feature-limited cores on an FPGA isn't terribly useful.
Nor is going tri-core or quad-core with minimalist cores.
Say:
   One core of my current configuration is more useful than, say:
   Two cores that do basic Integer+FPU+TLB;
   Four cores that only do Integer.
Like, say, an RV64I or RV32IM quad-core would not necessarily be all that useful.
Trying to fit the BJX2 core on an XC7S50, I needed to drop to 2-wide in order to fit it in with the fast SIMD unit.
It was a tradeoff between:
   3-wide, but 10 cycle SIMD ops;
   2-wide, with 3 cycle SIMD ops.
On the XC7S25 or XC7A35T, I have not really managed to fit much beyond simple integer cores. But these FPGAs are small enough that it is generally better to drop to 32-bit.
So, for example, an RV32IM is about what makes sense on an XC7S25 or XC7A35T.
Where the number in the part name is loosely correlated with the total LUT count:
   The XC7A100T has ~ 3x the LUTs of the XC7A35T.
But it is not exactly 1:1 between Artix and Spartan.
For Spartan, the number is closer to the number of kLUTs, but Artix has slightly fewer LUTs relative to the part number; so the XC7S25 and XC7A35T are fairly comparable.
As for whether to use A-Law or FP8 if I add SIMD ops for an 8-bit multiply that widens to Binary16, currently FP8 seems to be ahead:
   More popular (NVIDIA is also using FP8);
   More dynamic range;
   Will be slightly cheaper to implement;
   ...
Also torn between the more expensive route:
   Trying for a 3 cycle MAC operation;
   Would likely glue it onto the low-precision SIMD unit.
Or, the cheaper route:
   Trying for a 2 cycle PMUL;
   Likely via the CONV2 path.
Not likely worthwhile to put it in the 3-cycle MUL path:
   Would gain little performance-wise over converter ops;
   This was mostly used for more complex converters:
     Index-Color Packing;
     Color-Cell Encode;
     ...
The operation logic is likely fast enough that it could be put in a 2-cycle path.
Though, trying to shove it onto the front-end of a SIMD FADD is likely pushing it.
Multiplier logic likely something like:
   tSgnA=valA[7];
   tSgnB=valB[7];
   tExpA={ valA[6], !valA[6], valA[5:3] };
   tExpB={ valB[6], !valB[6], valB[5:3] };
   tFraA=valA[2:0];
   tFraB=valB[2:0];
   tZeroA=(valA[6:0]==7'h00);
   tZeroB=(valB[6:0]==7'h00);
   tSgnC=tSgnA^tSgnB;
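   tExpC0=tExpA+tExpB-15;  //both inputs carry the Binary16 bias (15); remove one copy
   tExpC1=tExpA+tExpB-14;  //same, plus 1 for when the fraction product is >= 2.0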
   tZeroC=tZeroA|tZeroB;
   case({tFraA, tFraB})
     6'b000_000: tFraC0=8'h40;
     6'b000_001: tFraC0=8'h48;
     ...
     6'b001_001: tFraC0=8'h51;
     ...
     6'b111_111: tFraC0=8'hE1;
   endcase
   if(tFraC0[7])
   begin
     tExpC=tExpC1;
     tFraC={tFraC0[7:0], 3'h0};
   end
   else
   begin
     tExpC=tExpC0;
     tFraC={tFraC0[6:0], 4'h0};
   end
   tValC={tSgnC, tExpC, tFraC[9:0]};
   if(tZeroC)
     tValC=16'h0000;
Which can most likely fit in a 2-cycle operation...
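
As a sanity check on the logic above, here is a rough software model in Python. It is only a sketch under some assumptions: the FP8 format is taken to be S.E4.M3 with bias 7 (which is what the exponent widening implies), exponent 0 is treated as a normal exponent (no subnormals), there is no Inf/NaN handling, and the Binary16 bias (15) is subtracted once from the exponent sum since both widened exponents include it. The function names and the choice to return None on overflow are made up for illustration; the listing above does not say what should happen in that case.

```python
import struct

FP8_BIAS = 7     # assumed: S.E4.M3 with bias 7
HALF_BIAS = 15   # Binary16 exponent bias


def fp8_to_float(v):
    """Decode an FP8 byte as the sketch sees it: hidden 1 always, no Inf/NaN."""
    if (v & 0x7F) == 0:
        return 0.0
    sign = -1.0 if (v & 0x80) else 1.0
    exp = (v >> 3) & 0xF
    fra = v & 0x7
    return sign * (8 + fra) / 8.0 * 2.0 ** (exp - FP8_BIAS)


def half_bits_to_float(h):
    """Interpret a 16-bit pattern as an IEEE Binary16 value."""
    return struct.unpack('<e', struct.pack('<H', h))[0]


def fp8_mul_to_half(a, b):
    """Multiply two FP8 bytes, returning a Binary16 bit pattern.

    Returns None when the exponent would not fit (hypothetical choice,
    the original listing does not handle overflow).
    """
    if (a & 0x7F) == 0 or (b & 0x7F) == 0:
        return 0x0000                          # tZeroC case
    sgn = ((a ^ b) >> 7) & 1
    # Same effect as { v[6], !v[6], v[5:3] }: re-bias from 7 to 15.
    exp_a = ((a >> 3) & 0xF) + (HALF_BIAS - FP8_BIAS)
    exp_b = ((b >> 3) & 0xF) + (HALF_BIAS - FP8_BIAS)
    # What the 64-entry case table holds: (8+fraA)*(8+fraB), 0x40..0xE1.
    fra_c0 = (8 + (a & 7)) * (8 + (b & 7))
    exp_c = exp_a + exp_b - HALF_BIAS
    if fra_c0 & 0x80:                          # product in [2.0, 4.0)
        exp_c += 1
        fra_c = (fra_c0 & 0x7F) << 3           # drop hidden bit 7
    else:                                      # product in [1.0, 2.0)
        fra_c = (fra_c0 & 0x3F) << 4           # drop hidden bit 6
    if exp_c > 30:
        return None                            # would collide with Inf/NaN
    return (sgn << 15) | (exp_c << 10) | fra_c


print(hex(fp8_mul_to_half(0x3C, 0x3C)))        # 1.5 * 1.5 = 2.25 -> 0x4080
```

Since a 4-bit by 4-bit significand product always fits in the 10-bit Binary16 fraction, the result should be exact wherever the exponent is in range, so comparing against fp8_to_float(a)*fp8_to_float(b) over all 65536 input pairs is a cheap way to validate the case table.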
...
