On Fri, 19 Jul 2024 20:25:51 -0000 (UTC)
Thomas Koenig <tkoenig@netcologne.de> wrote:
> MitchAlsup1 <mitchalsup@aol.com> wrote:
>> I, personally, have found many Newton-Raphson iterators that
>> converge faster using 1/SQRT(x) than using the SQRT(x) equivalent.
>
> I can well believe that.
>
> It is interesting to see what different architectures offer for
> faster reciprocals.
>
> POWER has fre and fres (double and single version) for approximate
> division, which are accurate to 1/256. These operations are quite
> fast, 4 to 7 cycles on POWER9, with up to 4 instructions per cycle,
> so obviously fully pipelined. With 1/256 accuracy, this could
> actually be the original Quake algorithm (or its modification)
> with a single Newton step, but this is of course much better in
> hardware, where exponent handling can be much simplified (and
> done only once).
>
> x86_64 has rcpss, accurate to 1/6144, with (looking at the
> instruction tables) a latency of 6 cycles on newer architectures
> and a throughput of 1/4.
It seems you looked at the wrong instruction table.
Here are the figures for some not-very-modern x86-64 cores:
Arch      Latency  Throughput (scalar/128b/256b)
Zen3      3        2/2/1
Skylake   4        1/1/1
Ice Lake  4        1/1/1
Power9    5-7      4/2/N/A
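To compare the rows, it helps to convert instruction throughput into results per cycle. The sketch below assumes that the throughput column counts instructions per cycle and that a 128-bit or 256-bit instruction produces 4 or 8 single-precision results respectively. The struct and function names are made up for illustration, and N/A is encoded as 0.

```c
/* Throughput figures from the table above, in instructions per cycle
   (assumed unit). A value of 0 stands for "N/A". */
struct rsqrt_tp {
    const char *arch;
    double scalar, v128, v256;
};

static const struct rsqrt_tp rsqrt_table[] = {
    { "Zen3",     2, 2, 1 },
    { "Skylake",  1, 1, 1 },
    { "Ice Lake", 1, 1, 1 },
    { "Power9",   4, 2, 0 },
};

/* Best-case single-precision results per cycle:
   instructions per cycle times lanes per instruction (1, 4 or 8). */
static double peak_results_per_cycle(struct rsqrt_tp t)
{
    double best = t.scalar * 1.0;
    if (t.v128 * 4.0 > best) best = t.v128 * 4.0;
    if (t.v256 * 8.0 > best) best = t.v256 * 8.0;
    return best;
}
```

Under those assumptions every row comes out at the same peak of 8 results per cycle, reached with a different vector width on each core.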
> So, if your business depends on calculating many inaccurate
> square roots, fast, buy a POWER :-)
If you have enough independent rsqrt operations to do, all four
processors have the same theoretical peak throughput, but x86 tends to
have more cores and to run at a faster clock. Lower latency also makes
peak throughput easier to achieve. And, depending on the target
precision, the higher initial precision of the x86 estimate means that
you can sometimes get away with one less NR iteration.
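A minimal sketch of what "estimate plus NR iterations" means, in portable C. The initial estimate here is the well-known Quake-style bit trick, standing in for a hardware estimate instruction; it is much cruder than any of the hardware estimates discussed above. The function names are illustrative only, and the code assumes x is a positive, normal single-precision value.

```c
#include <stdint.h>
#include <string.h>

/* Crude initial estimate of 1/sqrt(x) using the Quake-style bit trick.
   Stand-in for a hardware estimate instruction; assumes x > 0 and normal. */
static float rsqrt_estimate(float x)
{
    uint32_t i;
    float y;
    memcpy(&i, &x, sizeof i);        /* reinterpret the float bits */
    i = 0x5f3759dfu - (i >> 1);      /* halve and negate the exponent */
    memcpy(&y, &i, sizeof y);
    return y;
}

/* One Newton-Raphson step for f(y) = 1/y^2 - x,
   which simplifies to y' = y * (1.5 - 0.5 * x * y * y). */
static float rsqrt_nr_step(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* Estimate followed by a chosen number of NR steps. */
static float rsqrt_refined(float x, int steps)
{
    float y = rsqrt_estimate(x);
    for (int k = 0; k < steps; k++)
        y = rsqrt_nr_step(x, y);
    return y;
}
```

Each step roughly squares the relative error, so a more accurate starting estimate reaches a given target precision in fewer steps.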
Also, if what you really need is sqrt rather than rsqrt, then depending
on how much inaccuracy you can accept, calculating an accurate sqrt on
modern x86 can sometimes be a better solution than calculating an
approximation. It is less likely to be the case on POWER9.

Accurate sqrt (single precision)
Arch      Latency  Throughput (scalar/128b/256b)
Zen3      14       0.20/0.200/0.200
Skylake   12       0.33/0.333/0.167
Ice Lake  12       0.33/0.333/0.167
Power9    26       0.20/0.095/N/A

Accurate sqrt (double precision)
Arch      Latency  Throughput (scalar/128b/256b)
Zen3      20       0.111/0.111/0.111
Skylake   12       0.167/0.167/0.083
Ice Lake  12       0.167/0.167/0.083
Power9    36       0.111/0.067/N/A
> Other architectures I have tried don't seem to have it.

Arm64 has it. It is called FRSQRTE.

> Does it make sense? Well, if you want to calculate lots of Arrhenius
> equations, you don't need full accuracy and (like in Mitch's case)
> exp has become as fast as division, then it could actually make a
> lot of sense. It is still possible to add Newton steps afterwards,
> which is what gcc does if you add -mrecip -ffast-math.
I don't know about POWER, but on x86 I wouldn't do it.
I'd either use plain division, which is quite fast on modern cores,
or use NR to calculate an ordinary reciprocal. x86 provides an initial
estimate for that too (RCPSS).
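For illustration, here is a sketch of the NR reciprocal in portable C. The initial estimate is the textbook linear approximation 48/17 - 32/17 * m on the mantissa, used here in place of a hardware estimate such as RCPSS. The function names are hypothetical, and the code assumes x is positive, finite and normal.

```c
#include <math.h>

/* Linear initial estimate of 1/x: write x = m * 2^e with 0.5 <= m < 1,
   approximate 1/m by 48/17 - (32/17) * m, then scale by 2^-e.
   Stand-in for a hardware estimate; assumes x > 0, finite, normal. */
static float recip_estimate(float x)
{
    int e;
    float m = frexpf(x, &e);
    float y = 48.0f / 17.0f - (32.0f / 17.0f) * m;
    return ldexpf(y, -e);
}

/* One Newton-Raphson step for f(y) = 1/y - x: y' = y * (2 - x * y). */
static float recip_nr_step(float x, float y)
{
    return y * (2.0f - x * y);
}

/* Estimate followed by a chosen number of NR steps. */
static float recip_refined(float x, int steps)
{
    float y = recip_estimate(x);
    for (int k = 0; k < steps; k++)
        y = recip_nr_step(x, y);
    return y;
}
```

A quotient a/b is then a * recip_refined(b, n). Note that this product is not guaranteed to be correctly rounded the way a true division is.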