Subject: Re: Memory ordering
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 18 Nov 2024 08:11:04
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Nov18.081104@mips.complang.tuwien.ac.at>
User-Agent: xrn 10.11
"Chris M. Thomasson" <
chris.m.thomasson.1@gmail.com> writes:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
>What if you had to write code for a weakly ordered system, and the
>performance guidelines said to only use a membar when you absolutely
>have to. If you say something akin to "I do everything using
>std::memory_order_seq_cst", well, that is a violation right off the bat.
...
>I am trying to say you might not be hired if you only knew how to handle
>std::memory_order_seq_cst wrt C++... ?

I am not looking to be hired.

In any case, this cuts both ways: if you are an employer working on
multi-threaded software, say, for Windows or Linux, will you reduce
your pool of potential hires by including a requirement like the one
above, and then pay for longer development time and for the additional
hard-to-find bugs that come from overshooting that requirement?  Or do
you limit your software support to TSO hardware (for lack of widely
available SC hardware), and gain all the benefits: more potential
hires, reduced development time, and fewer bugs?
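
To make the trade-off concrete, here is a minimal C++ sketch (my own
illustration, not code from any particular project): the same
producer/consumer hand-off written once in the "seq_cst everywhere"
style and once with the weakest ordering that is still sufficient.
The second version is the kind of thing the guideline above asks for,
and getting it right (and keeping it right under maintenance) is where
the extra development time and the hard-to-find bugs come from.

#include <atomic>
#include <cassert>
#include <thread>

int payload;                       /* plain data handed to the consumer */

std::atomic<bool> ready_sc{false}; /* version 1: defaults to seq_cst */
std::atomic<bool> ready_ar{false}; /* version 2: acquire/release only */

void producer()
{
  payload = 42;

  /* Version 1: the default memory order is seq_cst; simple and always
     sufficient, but potentially a full barrier per operation on a
     weakly ordered machine. */
  ready_sc.store(true);

  /* Version 2: a release store is enough to publish 'payload'; no
     stronger ordering is paid for. */
  ready_ar.store(true, std::memory_order_release);
}

void consumer()
{
  /* Version 1: the seq_cst load pairs with the seq_cst store above. */
  while (!ready_sc.load())
    ;
  assert(payload == 42);

  /* Version 2: an acquire load pairs with the release store above. */
  while (!ready_ar.load(std::memory_order_acquire))
    ;
  assert(payload == 42);
}

int main()
{
  std::thread t1(producer), t2(consumer);
  t1.join();
  t2.join();
}

How much version 2 actually saves depends on the target: on some
weakly ordered machines (e.g., POWER) the seq_cst operations are
implemented with full barriers, while the release/acquire pair gets
away with cheaper one-way barriers; on TSO hardware (x86, say) only
the seq_cst store costs anything extra.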

I have compared arguments against strong memory ordering with those
against floating-point.  Von Neumann argued for fixed point as follows
<https://booksite.elsevier.com/9780124077263/downloads/historial%20perspectives/section_3.11.pdf>:

|[...] human time is consumed in arranging for the introduction of
|suitable scale factors. We only argue that the time consumed is a
|very small percentage of the total time we will spend in preparing an
|interesting problem for our machine. The first advantage of the
|floating point is, we feel, somewhat illusory. In order to have such
|a floating point, one must waste memory capacity which could
|otherwise be used for carrying more digits per word.

Kahan writes <https://people.eecs.berkeley.edu/~wkahan/SIAMjvnl.pdf>:

|Papers in 1947/8 by Bargman, Goldstein, Montgomery and von Neumann
|seemed to imply that 40-bit arithmetic would hardly ever deliver
|usable accuracy for the solution of so few as 100 linear equations in
|100 unknowns; but by 1954 engineers were solving bigger systems
|routinely and getting satisfactory accuracy from arithmetics with no
|more than 40 bits.

The flaw in the reasoning of the paper was:

|To solve it more easily without floating–point von Neumann had
|transformed equation Bx = c to B^TBx = B^Tc , thus unnecessarily
|doubling the number of sig. bits lost to ill-condition
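
To spell out the arithmetic behind that "doubling" (a standard fact
about the 2-norm condition number, not something taken from Kahan's
note): for a nonsingular B,

  \kappa_2(B^T B) = \kappa_2(B)^2

so the roughly log2(\kappa_2(B)) significant bits that ill-conditioning
costs when solving Bx = c become roughly twice as many when solving
B^T B x = B^T c.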

This is an example of how the supposed gains of the harder-to-use
interface (in this case the bits that floating point "wastes" on the
exponent) are more than eaten up by the software workaround that the
harder-to-use interface then requires.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>