Subject : Re: Continuations
From : already5chosen (at) *nospam* yahoo.com (Michael S)
Newsgroups : comp.arch
Date : 28. Jul 2024, 09:39:04
Organization : A noiseless patient Spider
Message-ID : <20240728113904.00007675@yahoo.com>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14
User-Agent : Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
On Sat, 27 Jul 2024 20:06:11 -0400
"Paul A. Clayton" <paaronclayton@gmail.com> wrote:
> On 7/18/24 12:40 PM, MitchAlsup1 wrote:
> [snip]
>> Over the 20 cycles the multiplier is doing Goldschmidt iterations,
>> there are only 3 slots where a different instruction could sneak
>> through.
> Since initial iterations could (I think) use reduced precision, it
> might also be possible to use part of the multiplier for a
> different lower precision multiplication.
> (ISTR, an AMD implementation used different precision for
> different steps.)
I never looked at the Goldschmidt division algorithm before (partly
because it does not look practical for software implementations and
partly for no particular reason), so it's possible that I am missing
something obvious.
At first glance, it appears that cutting the precision of the F(1) and
F(2) factors is not a matter of reducing computational complexity. It
is a necessity for correctness.
If the number of significant bits in F(1) and F(2) is not cut up
front, we will have to calculate the resulting products N(i) with
impractically high precision.
With F(i) properly cut, the required precision for N(i) is still high:
something like 67 bits for the first iteration, 95 bits for the 2nd and
151 bits for the 3rd and, hopefully, last iteration.
I still haven't figured out whether the same sequence applies to D(i)
or whether it is possible to cut corners there.
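To make the width argument concrete, here is a minimal Python sketch. It
rests on assumptions that are not stated in the post: a 53-bit (binary64)
significand for N, and factor widths of 14, 28 and 56 bits, chosen only
because 53+14, +28, +56 reproduces the 67/95/151 sequence. The names
truncate, product_widths and goldschmidt are made up for illustration, and
the initial factor is computed exactly and then truncated as a stand-in
for a reciprocal table lookup. Note that truncate keeps fraction bits
while product_widths counts significant bits; the sketch ignores that
difference of about one bit.

```python
import math
from fractions import Fraction


def truncate(x, bits):
    """Round x down to a multiple of 2**-bits (keep `bits` fraction bits)."""
    scale = 1 << bits
    return Fraction(math.floor(x * scale), scale)


def product_widths(n_bits=53, f_bits=(14, 28, 56)):
    """Bits needed for each exact product N(i) = N(i-1) * F.

    The exact product of a w-bit and an f-bit integer needs up to w + f bits.
    The default widths are assumptions, not figures from any real design.
    """
    widths, w = [], n_bits
    for f in f_bits:
        w += f
        widths.append(w)
    return widths


def goldschmidt(n, d, f_bits=(14, 28, 56)):
    """Approximate n / d for 1 <= d < 2 using truncated factors.

    Returns (N, D): N approximates n / d, and D approaches 1 from below.
    """
    f = truncate(1 / d, f_bits[0])   # stand-in for a reciprocal table lookup
    n_i, d_i = n * f, d * f
    for bits in f_bits[1:]:
        f = truncate(2 - d_i, bits)  # next factor, cut to `bits` fraction bits
        n_i, d_i = n_i * f, d_i * f  # products are kept exact (not truncated)
    return n_i, d_i


if __name__ == "__main__":
    print(product_widths())
    q, d_final = goldschmidt(Fraction(1), Fraction(3, 2))
    print(float(q), float(1 - d_final))
```

Because the products are kept exact here, the remaining error comes only
from the truncated factors. The sketch says nothing about how far N(i) or
D(i) themselves could be truncated, which is the open question above.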
BTW, are we sure that modern processors really use Goldschmidt division?
Intel and AMD both have FMA latency=4 and worst-case DIVSD latency=14,
so doing it with a highly optimized Goldschmidt appears [borderline]
possible. The same applies to high-end Arm Inc. cores (worst-case
FDIV latency=15). But on Apple Firestorm the latency of FDIV.FP64 = 10,
which sounds too fast for Goldschmidt.
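A back-of-envelope check of that latency argument, as a rough sketch
only. It assumes that every dependent Goldschmidt step costs one full
multiplier latency, and that the 4-cycle FMA figure quoted for Intel and
AMD also applies to the Arm and Apple cores. Neither assumption comes
from the post, and a dedicated divider could well use a shorter
multiply path. The function name is made up for illustration.

```python
def full_multiplier_passes(div_latency, mul_latency=4):
    """Number of back-to-back multiplies of `mul_latency` cycles that fit
    inside a divide of `div_latency` cycles (integer division, no overlap)."""
    return div_latency // mul_latency


# Worst-case divide latencies as quoted in the post.
for name, div_latency in [
    ("Intel/AMD DIVSD", 14),
    ("Arm FDIV", 15),
    ("Apple Firestorm FDIV.FP64", 10),
]:
    print(name, full_multiplier_passes(div_latency))
```

Under these assumptions, 14 and 15 cycles leave room for three dependent
multiplies with 2 or 3 cycles to spare for lookup and rounding, while 10
cycles leave room for only two.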