Michael S <already5chosen@yahoo.com> wrote:
On Sun, 12 May 2024 20:55:03 -0000 (UTC)
Thomas Koenig <tkoenig@netcologne.de> wrote:
John Dallman <jgd@cix.co.uk> wrote:
In article <abe04jhkngt2uun1e7ict8vmf1fq8p7rnm@4ax.com>,
quadibloc@servername.invalid (John Savard) wrote:
I'm not really sure such floating-point precision is useful, but
I do remember some people telling me that higher float
precision is indeed something to be desired.
I would be in favour of 128-bit being available.
Me, too. Solving tricky linear systems, or obtaining derivatives
numerically (for example for Jacobians) eats up a _lot_ of
precision bits, and double precision can sometimes run into
trouble.
At least gcc and gfortran now support POWER's native 128-bit format
in hardware. On other systems, software emulation is used, which
is of course much slower.
Much slower?
I think, at least for matrix multiplication, my emulation on modern
x86 was within a factor of 1.5 of your measurements on POWER9.
I don't remember the exact timing, and it might be interesting to
revisit that (also considering that the gfortran code for matmul is
not optimized for 128-bit float and might have blown cache sizes,
plus it would be fair to compare compiler vs. compiler and assembler
vs. assembler).
I just looked it up - on POWER9, xsaddqp has 12 cycles of latency,
with one result per cycle, POWER10 has 12 to 13 cycles with two
results per cycle.
What can your code get on x86_64?