Re: My 66000 and High word facility

Subject: Re: My 66000 and High word facility
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.arch
Date : 12. Aug 2024, 22:58:45
Organization: A noiseless patient Spider
Message-ID : <v9dt2a$3fdml$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9
User-Agent : Mozilla Thunderbird
On 8/12/2024 3:12 PM, MitchAlsup1 wrote:
On Mon, 12 Aug 2024 19:27:22 +0000, BGB wrote:
 
On 8/12/2024 12:36 PM, MitchAlsup1 wrote:
On Mon, 12 Aug 2024 6:29:36 +0000, Anton Ertl wrote:
>
Brett <ggtgp@yahoo.com> writes:
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
Brett <ggtgp@yahoo.com> writes:
The lack of CPUs with 64 registers is what makes for a market:
that 4% that could benefit have no options to pick from.
>
They had:
>
SPARC: Ok, only 32 GPRs available at a time, but more in hardware
through the Window mechanism.
>
AMD29K: IIRC a 128-register stack and 64 additional registers
>
IA-64: 128 GPRs and 128 FPRs with register stack and rotating register
files to make good use of them.
>
All antiques no longer available.
>
SPARC is still available: <https://en.wikipedia.org/wiki/SPARC> says:
>
|Fujitsu will also discontinue their SPARC production [...] end-of-sale
|in 2029, of UNIX servers and a year later for their mainframe.
>
No word of when Oracle will discontinue (or has discontinued) sales,
but both companies introduced their last SPARC CPUs in 2017.
>
In any case, my point still stands: these architectures were
available, and the large number of registers failed to give them a
decisive advantage.  Maybe it even gave them a decisive disadvantage:
AMD29K and IA-64 never had OoO implementations, and SPARC got them
only with the Fujitsu SPARC64 V in 2002 and the Oracle SPARC T4 in
2011, years after Intel, MIPS, HP switched to OoO in 1995/1996 and
Power and Alpha switched in 1998 (POWER3, 21264).
>
Where is your 4% number coming from?
>
The 4% number is poor memory and a guess.
Here is an antique paper on the issue:
>
https://www.eecs.umich.edu/techreports/cse/00/CSE-TR-434-00.pdf
>
Interesting.  I only skimmed the paper, but I read a lot about
inlining and interprocedural register allocation.  SPARC's register
windows and AMD29K's and IA-64's register stacks were intended to be
useful for that, but somehow the other architectures did not suffer a
big-enough disadvantage to make them adopt one of these concepts, and
that's despite register windows/stacks working even for indirect calls
(e.g., method calls in the general case), where interprocedural
register allocation or inlining don't help.
>
It seems to me that with OoO the cycle cost of spilling and refilling
on call boundaries was lowered: the spills can be delayed until the
computation is complete, and the refills can start early because the
stack pointer tends to be available early.
>
And recent OoO CPUs even have zero-cycle store-to-load forwarding, so
even if the called function is short, the spilling and refilling
around it (if any) does not increase the latency of the value that's
spilled and refilled.  But that consideration is only relevant for
Intel APX; ARM A64 and RISC-V went for 32 registers several years
before zero-cycle store-to-load-forwarding was implemented.
>
One other optimization that they use the additional registers for is
"register promotion", i.e., putting values from memory into registers
for a while (if absence of aliasing can be proven).  One interesting
aspect here is that register promotion with 64 or 256 registers (RP-64
and RP-256) is usually not much better (if better at all) than
register promotion with 32 registers (RP-32); see Figure 1.  So
register promotion does not make a strong case for more registers,
either, at least in this paper.
>
With full access to constants, there is even less need to promote
addresses or immediates into registers as you can simply poof them
up any time you want one.
>
>
There are tradeoffs still, if constants need space to encode...
>
Inline is still better than a memory load, granted.
>
May make sense to consolidate multiple uses of a value into a register
rather than try encoding them as an immediate each time.
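
To make that tradeoff concrete, here is a rough size model. It is only a sketch under assumed numbers: a 64-bit immediate is taken to cost 2 extra 32-bit words per use, and materializing it once is taken to cost one MOV word plus the same 2 immediate words. The function names are hypothetical, and the model ignores register pressure and timing entirely.

```python
def inline_words(uses: int, imm_words: int = 2) -> int:
    """Extra code words when every use carries its own immediate."""
    return uses * imm_words


def hoisted_words(imm_words: int = 2, mov_words: int = 1) -> int:
    """Extra code words when the constant is loaded into a register once."""
    return mov_words + imm_words


def prefer_register(uses: int, imm_words: int = 2, mov_words: int = 1) -> bool:
    """True when loading once is strictly smaller than inlining every use."""
    return hoisted_words(imm_words, mov_words) < inline_words(uses, imm_words)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, inline_words(n), hoisted_words(), prefer_register(n))
```

With these assumed sizes, a constant used once is cheaper inline, and from two uses onward the register copy is smaller, at the price of tying up a register.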
 See polpak:: r8_erf()
  r8_erf:                                 ; @r8_erf
; %bb.0:
     fabs    r2,r1
     fcmp    r3,r2,#0x3EF00000
     bngt    r3,.LBB141_5
; %bb.1:
     fcmp    r3,r2,#4
     bngt    r3,.LBB141_6
; %bb.2:
     fcmp    r3,r2,#0x403A8B020C49BA5E
     bnlt    r3,.LBB141_7
; %bb.3:
     fmul    r3,r1,r1
     fdiv    r3,#1,r3
     mov    r4,#0x3F90B4FB18B485C7
     fmac    r4,r3,r4,#0x3FD38A78B9F065F6
     fadd    r5,r3,#0x40048C54508800DB
     fmac    r4,r3,r4,#0x3FD70FE40E2425B8
     fmac    r5,r3,r5,#0x3FFDF79D6855F0AD
     fmac    r4,r3,r4,#0x3FC0199D980A842F
     fmac    r5,r3,r5,#0x3FE0E4993E122C39
     fmac    r4,r3,r4,#0x3F9078448CD6C5B5
     fmac    r5,r3,r5,#0x3FAEFC42917D7DE7
     fmac    r4,r3,r4,#0x3F4595FD0D71E33C
     fmul    r4,r3,r4
     fmac    r3,r3,r5,#0x3F632147A014BAD1
     fdiv    r3,r4,r3
     fadd    r3,#0x3FE20DD750429B6D,-r3
     fdiv    r3,r3,r2
     br    .LBB141_4
.LBB141_5:
     fmul    r3,r1,r1
     fcmp    r2,r2,#0x3C9FFE5AB7E8AD5E
     sra    r2,r2,#8,#1
     cvtsd    r4,#0
     mux    r2,r2,r3,r4
     mov    r3,#0x3FC7C7905A31C322
     fmac    r3,r2,r3,#0x400949FB3ED443E9
     fadd    r4,r2,#0x403799EE342FB2DE
     fmac    r3,r2,r3,#0x405C774E4D365DA3
     fmac    r4,r2,r4,#0x406E80C9D57E55B8
     fmac    r3,r2,r3,#0x407797C38897528B
     fmac    r4,r2,r4,#0x40940A77529CADC8
     fmac    r3,r2,r3,#0x40A912C1535D121A
     fmul    r1,r3,r1
     fmac    r2,r2,r4,#0x40A63879423B87AD
     fdiv    r2,r1,r2
     mov    r1,r2
     ret
.LBB141_6:
     mov    r3,#0x3E571E703C5F5815
     fmac    r3,r2,r3,#0x3FE20DD508EB103E
     fadd    r4,r2,#0x402F7D66F486DED5
     fmac    r3,r2,r3,#0x4021C42C35B8BC02
     fmac    r4,r2,r4,#0x405D6C69B0FFCDE7
     fmac    r3,r2,r3,#0x405087A0D1C420D0
     fmac    r4,r2,r4,#0x4080C972E588749E
     fmac    r3,r2,r3,#0x4072AA2986ABA462
     fmac    r4,r2,r4,#0x4099558EECA29D27
     fmac    r3,r2,r3,#0x408B8F9E262B9FA3
     fmac    r4,r2,r4,#0x40A9B599356D1202
     fmac    r3,r2,r3,#0x409AC030C15DC8D7
     fmac    r4,r2,r4,#0x40B10A9E7CB10E86
     fmac    r3,r2,r3,#0x40A0062821236F6B
     fmac    r4,r2,r4,#0x40AADEBC3FC90DBD
     fmac    r3,r2,r3,#0x4093395B7FD2FC8E
     fmac    r4,r2,r4,#0x4093395B7FD35F61
     fdiv    r3,r3,r4
.LBB141_4:
     fmul    r4,r2,#16
     fmul    r4,r4,#0x3D800000
     rnd    r4,r4,#5
     fadd    r5,r2,-r4
     fadd    r2,r2,r4
     fmul    r4,r4,-r4
     fexp    r4,r4
     fmul    r2,r2,-r5
     fexp    r2,r2
     fmul    r2,r4,r2
     fadd    r2,#0,-r2
     fmac    r2,r2,r3,#0x3F000000
     fadd    r2,r2,#0x3F000000
     pdlt    r1,T
     fadd    r2,#0,-r2
     mov    r1,r2
     ret
.LBB141_7:
     fcmp    r1,r1,#0
     sra    r1,r1,#8,#1
     cvtsd    r2,#-1
     cvtsd    r3,#1
     mux    r2,r1,r3,r2
     mov    r1,r2
     ret
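
For anyone who wants to read the hex immediates in the listing as numbers, here is a minimal Python sketch. It assumes the 16-digit patterns are IEEE 754 Binary64 and the 8-digit ones are Binary32; that is an inference from the decoded values, not something stated in the thread. The helper names are made up for illustration.

```python
import struct


def f32_from_hex(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 Binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f64_from_hex(bits: int) -> float:
    """Interpret a 64-bit pattern as an IEEE 754 Binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


if __name__ == "__main__":
    # Immediates taken from the compare and scaling instructions above.
    for pattern in (0x3EF00000, 0x3D800000, 0x3F000000):
        print(hex(pattern), f32_from_hex(pattern))
    print(hex(0x403A8B020C49BA5E), f64_from_hex(0x403A8B020C49BA5E))
```

Under that reading, 0x3EF00000 is 0.46875, 0x3D800000 is 1/16, 0x3F000000 is 0.5, and 0x403A8B020C49BA5E is approximately 26.543.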
 All of the constants are used once!
 RISC-V takes 240 instructions and uses 342 words of
memory {.text, .data, .rodata}
 My 66000 takes 85 instructions and uses 169 words of
memory {.text, .data, .rodata}
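
As a quick check on those figures, the ratios can be worked out directly. This is plain arithmetic on the numbers quoted above; nothing here was measured independently, and the variable names are only for illustration.

```python
# Figures quoted in the post for r8_erf (instructions, memory words).
RISCV_INSNS, RISCV_WORDS = 240, 342
MY66000_INSNS, MY66000_WORDS = 85, 169

# Fraction of the RISC-V cost that My 66000 needs.
insn_ratio = MY66000_INSNS / RISCV_INSNS
word_ratio = MY66000_WORDS / RISCV_WORDS

# Average memory words per instruction (counts .data/.rodata too).
riscv_words_per_insn = RISCV_WORDS / RISCV_INSNS
my66000_words_per_insn = MY66000_WORDS / MY66000_INSNS

if __name__ == "__main__":
    print(f"instruction ratio: {insn_ratio:.3f}")
    print(f"memory ratio:      {word_ratio:.3f}")
    print(f"words/insn RISC-V:   {riscv_words_per_insn:.3f}")
    print(f"words/insn My 66000: {my66000_words_per_insn:.3f}")
```

So My 66000 needs roughly 35% of the instructions and about 49% of the memory. The per-instruction footprint is larger (about 1.99 words against 1.43) because the constants ride inside the instructions.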
FWIW:
   FADD Rm, Imm64f, Rn  //XG2 Only
   FADD Rm, Imm56f, Rn  //
And:
   FMUL Rm, Imm64f, Rn  //XG2 Only
   FMUL Rm, Imm56f, Rn  //
Do exist in BJX2 as an optional feature (as 96-bit jumbo encodings in the FPIMM extension).
64f and 56f use the same encoding, except that in Baseline mode only 56 bits can be encoded, with the low 8 bits filled with zeroes.
There is also Imm32f which is basically Binary64 truncated down to 32 bits (these cases can generally be encoded in a 64-bit encoding).
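
To illustrate which constants fit which form, here is a minimal sketch. It assumes "truncated" means keeping the high bits of the Binary64 pattern and zero-filling the rest, so a value fits Imm32f when its low 32 bits are zero and fits Imm56f when its low 8 bits are zero. The function names are hypothetical, and this is not the actual BJX2 assembler logic.

```python
import struct


def f64_bits(x: float) -> int:
    """Return the raw 64-bit IEEE 754 Binary64 pattern of x."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def fits_imm32f(x: float) -> bool:
    """Assumed rule: only the high 32 bits may be non-zero."""
    return (f64_bits(x) & 0xFFFFFFFF) == 0


def fits_imm56f(x: float) -> bool:
    """Low 8 bits must be zero, since they are zero-filled."""
    return (f64_bits(x) & 0xFF) == 0


def smallest_form(x: float) -> str:
    """Pick the narrowest immediate form that represents x exactly."""
    if fits_imm32f(x):
        return "Imm32f"
    if fits_imm56f(x):
        return "Imm56f"
    return "Imm64f"


if __name__ == "__main__":
    for value in (0.5, 1.0, 1.0 + 2.0 ** -44, 0.1):
        print(value, hex(f64_bits(value)), smallest_form(value))
```

Values such as 0.5 and 1.0 fit the 32-bit form, 1 + 2^-44 needs the 56-bit form, and 0.1 needs all 64 bits.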
But, whether or not it makes much difference over loading working values into temporary registers is debatable...
As can be noted, double precision FPU ops are not pipelined in my case.
Well, except FADDA / FMULA, which only have Imm32f variants; but this isn't a huge loss given these ops only guarantee approximately single precision; and a majority of typical FPU constants have their low-order bits as 0.
Though, annoyingly, Imm32f has 3 bits less precision than a standard Binary32 "float".
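
The 3-bit figure follows from the field widths, if one assumes Imm32f is the high 32 bits of a Binary64 pattern (1 sign bit, 11 exponent bits, and whatever fraction bits remain). A small sketch of that arithmetic, with a Binary32 value that loses exactly those bits; the names are illustrative only:

```python
import struct

BINARY64_EXP_BITS = 11
BINARY32_FRAC_BITS = 23

# Fraction bits left in a 32-bit slice of Binary64: 32 - sign - exponent.
IMM32F_FRAC_BITS = 32 - 1 - BINARY64_EXP_BITS
LOST_BITS = BINARY32_FRAC_BITS - IMM32F_FRAC_BITS


def f64_bits(x: float) -> int:
    """Return the raw 64-bit IEEE 754 Binary64 pattern of x."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def to_f32(x: float) -> float:
    """Round x to Binary32 and widen it back to a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.1 rounded to Binary32, then viewed as Binary64: the three low
# Binary32 fraction bits land in bits 31..29 of the low word.
low_word = f64_bits(to_f32(0.1)) & 0xFFFFFFFF

if __name__ == "__main__":
    print(IMM32F_FRAC_BITS, LOST_BITS, hex(low_word))
```

Here the low word comes out as 0xA0000000, so 0.1 as a "float" cannot be carried exactly in the assumed Imm32f form.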
Though, this was along with cases involving a 5-bit E3.F2 / 6-bit E3.F3 microfloat (in a 32-bit encoding; also part of FPIMM).
There are related encodings for SIMD vector immediate values, but these were part of a different "NNX" extension (which includes some features which were assumed to be primarily relevant to neural-net code). This includes things like SIMD immediates and SIMD instructions with built-in shuffle (kinda expensive).
But, I had left most of the FP8 stuff as part of GSVF (Floating Point SIMD) as they have more general SIMD / Audio / Graphics related uses.
...
