Tell that to a lock-free stack. Many OS's use them.
If you want capable dual-core or quad-core processor integrated with

On Tue, 03 Dec 2024 08:32:52 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 11/18/2024 3:20 PM, Chris M. Thomasson wrote:
...
On 11/17/2024 11:11 PM, Anton Ertl wrote:
The flaw in the reasoning of the paper was:
>
|To solve it more easily without floating–point von Neumann had
|transformed equation Bx = c to B^TBx = B^Tc , thus unnecessarily
|doubling the number of sig. bits lost to ill-condition
>
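A short derivation of why the normal equations double the lost bits, assuming B is square and nonsingular and that conditioning is measured in the 2-norm (both are assumptions made for this sketch, not statements from the quoted paper):

```latex
% Singular value decomposition of B, with singular values
% \sigma_{\max} \ge \dots \ge \sigma_{\min} > 0:
B = U \Sigma V^{T}
\quad\Longrightarrow\quad
B^{T} B = V \Sigma^{2} V^{T}

% The singular values of B^T B are the squares of those of B, so
\kappa_2(B^{T}B)
  = \frac{\sigma_{\max}^{2}}{\sigma_{\min}^{2}}
  = \kappa_2(B)^{2}

% Roughly \log_2 \kappa bits are lost to ill-conditioning, hence
\log_2 \kappa_2(B^{T}B) = 2 \log_2 \kappa_2(B)
```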
This is an example of how the supposed gains that the
harder-to-use interface provides (in this case the bits "wasted"
on the exponent) are overcompensated by then having to use a
software workaround for the harder-to-use interface.
>
Don't tell me you want all of std::memory_order_* to default to
std::memory_order_seq_cst? If you're on a system that only has seq_cst
and nothing else, okay, but not on other weaker (memory order)
systems, right?
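
For readers less familiar with the C++ side of this question: operations on std::atomic that omit the ordering argument already use std::memory_order_seq_cst, and the weaker orderings are opt-in. The sketch below shows both forms in a simple flag-publication pattern. The names `ready`, `payload`, `producer`, `consumer` and `run_once` are illustrative and do not come from the thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> ready{false};
int payload = 0;  // plain (non-atomic) data published through `ready`

void producer() {
    payload = 42;
    // Explicitly weaker than the default: release pairs with the acquire load.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Spin until the flag is set; acquire makes the write to `payload` visible.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

int run_once() {
    payload = 0;
    ready.store(false);  // no ordering argument: defaults to memory_order_seq_cst
    std::thread t(producer);
    int seen = consumer();
    t.join();
    return seen;
}
```

Calling `run_once()` returns the value written by the producer. Replacing the two explicit orderings with the defaults would also be correct, only potentially slower on weakly ordered hardware, which is the trade-off being argued about here.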
I tell anyone who wants to read it to stop buying hardware without FP
for non-integer work, and with weak memory ordering for work that
needs concurrent programming. There are enough affordable offerings
with FP and TSO that we do not need to waste programming time and
increase the frequency of hard-to-find bugs by figuring out how to get
good performance out of hardware without FP hardware and with weak
memory ordering.
>
Those who enjoy the challenge of dealing with the unnecessary problems
of sub-par hardware can continue to enjoy that.
>
But when developing production software, as a manager don't let
programmers with this hobby horse influence your hardware and
development decisions. Give full support for FP and TSO hardware, and
limited support to weakly-ordered hardware. That limited support may
consist of using software implementations of FP (instead of designing
software for fixed point arithmetic). In case of hardware with weak
ordering the limited support could be to use memory barriers liberally
(without trying to minimize them at all; every memory barrier
elimination costs development time and increases the potential for
hard-to-find bugs), to use OS mechanisms for concurrency (rather
than, e.g., lock-free algorithms), or maybe even to support only
single-threaded operation.
>
Efficiently-implemented sequentially-consistent hardware would be even
more preferable, and if it was widely available, I would recommend
buying that over TSO hardware, but unfortunately we are not there yet.
>
- anton
FPGA then Arm Cortex-A is the only game in town right now, and probably
for a few years going forward. Typically these are old low-end cores. The FPU
is there, TSO is not.
Fortunately, in the majority of applications of these chips there is no need
for concurrent programming, but one is rarely 100% sure that the need
won't emerge when starting a project.
BTW, does your stance mean that you are strongly against A64FX?
My own stance is that people should not do lockless concurrent
programming. Period.
Well, almost period. Something like RCU in the Linux kernel is an exception.
Maybe atomic updates of statistical counters are another exception,
but only when one is sure that the application will never have to scale
above two dozen cores.
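
A minimal sketch of the statistical-counter case, assuming a single shared counter that only needs an accurate total and orders no other data. The names `events`, `record_event` and `count_events` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical statistics counter shared by all threads.
std::atomic<std::uint64_t> events{0};

void record_event() {
    // Relaxed is enough here: the increment is atomic, and no other
    // memory accesses need to be ordered relative to it.
    events.fetch_add(1, std::memory_order_relaxed);
}

std::uint64_t count_events(unsigned threads, unsigned per_thread) {
    events.store(0, std::memory_order_relaxed);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([per_thread] {
            for (unsigned i = 0; i < per_thread; ++i) record_event();
        });
    }
    // join() synchronizes with each thread's completion, so the
    // final load below observes every increment.
    for (auto& th : pool) th.join();
    return events.load(std::memory_order_relaxed);
}
```

Every increment contends for the same cache line, which is presumably the scaling concern behind the core-count caveat above. Per-thread counters summed on read are a common alternative when that contention starts to matter.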
Lockless programming is horrendously complicated and error-prone.
Sequential consistency removes only a small part of the potential
complications.