Re: Arguments for a sane ISA 6-years later

Liste des GroupesRevenir à c arch 
Sujet : Re: Arguments for a sane ISA 6-years later
De : cr88192 (at) *nospam* gmail.com (BGB)
Groupes : comp.arch
Date : 27. Jul 2024, 00:01:43
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <v819ss$31ob5$1@dont-email.me>
References : 1 2 3 4 5
User-Agent : Mozilla Thunderbird
On 7/26/2024 3:59 PM, MitchAlsup1 wrote:
On Fri, 26 Jul 2024 17:00:07 +0000, Anton Ertl wrote:
 
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 7/25/2024 1:09 PM, BGB wrote:
At least with a weak model, software knows that if it doesn't go through
the rituals, the memory will be stale.
>
There is no guarantee of staleness, only a lack of stronger ordering
guarantees.
>
The weak model is ideal for me. I know how to program for it
>
And the fact that this model is so hard to use that few others know
how to program for it make it ideal for you.
>
and it's more efficient
>
That depends on the hardware.
>
Yes, the Alpha 21164 with its imprecise exceptions was "more
efficient" than other hardware for a while, then the Pentium Pro came
along and gave us precise exceptions and more efficiency.  And
eventually the Alpha people learned the trick, too, and 21264 provided
precise exceptions (although they did not admit this) and more
efficieny.
>
Similarly, I expect that hardware that is designed for good TSO or
sequential consistency performance will run faster on code written for
this model than code written for weakly consistent hardware will run
on that hardware.
 According to Lamport; only the ATOMIC stuff needs sequential
consistency.
So, it is completely possible to have a causally consistent processor
that switches to sequential consistency when doing ATOMIC stuff and gain
performance when not doing ATOMIC stuff, and gain programmability when
doing atomic stuff.
 
Probably true.
Main thing that matters for consistency is things like mutex locks and shared buffers.
Often, for most everything else, consistency can be glossed over.
In a traditional weak model, one would flush the caches whenever locking a mutex or similar (to make sure that everything is written out before locking, and post-locking, up to date).
Where, here, one assumes that the only time memory is necessarily up to date, is after acquiring a mutex lock (but, for good measure, one can also do a flush when releasing the lock, such that any other threads that gain the lock will have an up-to-date view of anything that happened between gaining and releasing the mutex).
If a person is clever, they might try to sidestep the needs for the cache-flushing here, but this may result in any shared memory not being up to date.
This works a little better though if one assumes that any actively shared buffers are essentially read-only during the time each thread is doing its work (potentially followed by a consolidation phase, where all the threads flush their caches such that their view of memory is brought back in sync).

                   That's because software written for weakly
consistent hardware often has to insert barriers or atomic operations
just in case, and these operations are slow on hardware optimized for
weak consistency.
 The operations themselves are not slow. What is slow is delaying the
pipeline until it catches up to the stronger memory model before
proceeding.
How I attempted to do no-cache/volatile/atomic operations was roughly:
   If the cache line seen is not marked as volatile:
     Flush it;
   Fetch the line from memory, set it marked as volatile;
   Do the operation;
   Set up a mechanism to auto-flush the line.
If a volatile line is seen and we are not doing a volatile operation, flush it.
Auto-flush: Once we are not doing the volatile memory operation, look again at cache line, see that it is volatile, and flush it.
It is vaguely similar for TLB misses, where a TLB miss will load "whatever" from memory, but it is flagged so that the cache will auto-flush it at the nearest opportunity once the offending operation completes (though, with the slight difference that TLB Missed lines are unable to be marked as Dirty in a Store operation).

>
By contrast, one can design hardware for strong ordering such that the
slowness occurs only in those cases when actual (not potential)
communication between the cores happens, i.e., much less frequently.
 How would you do this for a 256-way banked memory system of the
NEC SX ?? I.E., the processor is not in charge of memory order--
the memory system is.
 
I have little idea personally how something like TSO could scale to manycore systems, or to systems where there is non-trivial communication latency (such as when threads are running across a LAN, or maybe the internet).
Meanwhile, weak consistency models are easier to scale up to high latency.
Say, as opposed to local cache flushing, the shared mutex lock effectively involves sending all of the dirty pages back to a server over a TCP socket or something; followed by re-downloading any of the shared pages afterwards (though, maybe with the server being able to signal which pages on the server might still be up-to-date from the client's POV and avoid the need to re-download them).
Granted, for high latency contexts, generally message-passing tends to become preferable to shared memory; but definitions here can get fuzzy.
Like, depending on how one looks at it, multiple players connected to a Minecraft server could be considered as a usage of a high-latency shared memory system (with the Minecraft terrain being essentially a sort of shared memory; albeit expressed via message passing over TCP/IP).
Though, for general use, it would make sense to limit the scope of "shared memory" to something more resembling a traditional "mmap()" style operation (with a mechanism to detect when pages are dirty and to then re-synchronize them).

>
and sometimes use cases do not care if they encounter "stale" data.
>
Great.  Unless these "sometimes" cases are more often than the cases
where you perform some atomic operation or barrier because of
potential, but not actual communication between cores, the weak model
is still slower than a well-implemented strong model.
>
- anton

Date Sujet#  Auteur
24 Jul 24 * Arguments for a sane ISA 6-years later63MitchAlsup1
25 Jul 24 `* Re: Arguments for a sane ISA 6-years later62BGB
25 Jul 24  +* Re: Arguments for a sane ISA 6-years later57Chris M. Thomasson
26 Jul 24  i`* Re: Arguments for a sane ISA 6-years later56Anton Ertl
26 Jul 24  i +* Re: Arguments for a sane ISA 6-years later20BGB
29 Jul 24  i i`* Re: Arguments for a sane ISA 6-years later19Anton Ertl
29 Jul 24  i i +* Intel overvoltage (was: Arguments for a sane ISA 6-years later)2Thomas Koenig
29 Jul 24  i i i`- Re: Intel overvoltage1BGB
29 Jul 24  i i `* Re: Arguments for a sane ISA 6-years later16BGB
30 Jul 24  i i  `* Re: Arguments for a sane ISA 6-years later15Anton Ertl
30 Jul 24  i i   `* Re: Arguments for a sane ISA 6-years later14BGB
30 Jul 24  i i    +* Re: Arguments for a sane ISA 6-years later2Chris M. Thomasson
31 Jul 24  i i    i`- Re: Arguments for a sane ISA 6-years later1BGB
1 Aug 24  i i    `* Re: Arguments for a sane ISA 6-years later11Anton Ertl
1 Aug 24  i i     +- Re: Arguments for a sane ISA 6-years later1Michael S
1 Aug 24  i i     +* Re: Arguments for a sane ISA 6-years later8MitchAlsup1
1 Aug 24  i i     i+- Re: Arguments for a sane ISA 6-years later1Michael S
2 Aug 24  i i     i`* Re: Arguments for a sane ISA 6-years later6MitchAlsup1
2 Aug 24  i i     i +- Re: Arguments for a sane ISA 6-years later1Michael S
4 Aug 24  i i     i `* Re: Arguments for a sane ISA 6-years later4MitchAlsup1
5 Aug 24  i i     i  `* Re: Arguments for a sane ISA 6-years later3Stephen Fuld
5 Aug 24  i i     i   `* Re: Arguments for a sane ISA 6-years later2Stephen Fuld
5 Aug 24  i i     i    `- Re: Arguments for a sane ISA 6-years later1MitchAlsup1
1 Aug 24  i i     `- Re: Arguments for a sane ISA 6-years later1BGB
26 Jul 24  i +* Re: Arguments for a sane ISA 6-years later20MitchAlsup1
27 Jul 24  i i+- Re: Arguments for a sane ISA 6-years later1BGB
29 Jul 24  i i`* Memory ordering (was: Arguments for a sane ISA 6-years later)18Anton Ertl
29 Jul 24  i i +* Re: Memory ordering15MitchAlsup1
29 Jul 24  i i i+* Re: Memory ordering6Chris M. Thomasson
29 Jul 24  i i ii`* Re: Memory ordering5MitchAlsup1
30 Jul 24  i i ii `* Re: Memory ordering4Michael S
31 Jul 24  i i ii  `* Re: Memory ordering3Chris M. Thomasson
31 Jul 24  i i ii   `* Re: Memory ordering2Chris M. Thomasson
31 Jul 24  i i ii    `- Re: Memory ordering1Chris M. Thomasson
30 Jul 24  i i i`* Re: Memory ordering8Anton Ertl
30 Jul 24  i i i +* Re: Memory ordering2Chris M. Thomasson
30 Jul 24  i i i i`- Re: Memory ordering1Chris M. Thomasson
31 Jul 24  i i i `* Re: Memory ordering5MitchAlsup1
31 Jul 24  i i i  +- Re: Memory ordering1Chris M. Thomasson
1 Aug 24  i i i  `* Re: Memory ordering3Anton Ertl
1 Aug 24  i i i   `* Re: Memory ordering2MitchAlsup1
2 Aug 24  i i i    `- Re: Memory ordering1Anton Ertl
29 Jul 24  i i `* Re: Memory ordering2Chris M. Thomasson
30 Jul 24  i i  `- Re: Memory ordering1Chris M. Thomasson
29 Jul 24  i +* Re: Arguments for a sane ISA 6-years later13Chris M. Thomasson
29 Jul 24  i i+* Re: Arguments for a sane ISA 6-years later9BGB
29 Jul 24  i ii`* Re: Arguments for a sane ISA 6-years later8Chris M. Thomasson
29 Jul 24  i ii +- Re: Arguments for a sane ISA 6-years later1Chris M. Thomasson
29 Jul 24  i ii +* Re: Arguments for a sane ISA 6-years later2BGB
29 Jul 24  i ii i`- Re: Arguments for a sane ISA 6-years later1Chris M. Thomasson
30 Jul 24  i ii `* Re: Arguments for a sane ISA 6-years later4jseigh
30 Jul 24  i ii  `* Re: Arguments for a sane ISA 6-years later3Chris M. Thomasson
31 Jul 24  i ii   `* Re: Arguments for a sane ISA 6-years later2jseigh
31 Jul 24  i ii    `- Re: Arguments for a sane ISA 6-years later1Chris M. Thomasson
29 Jul 24  i i+- Memory ordering (was: Arguments for a sane ISA 6-years later)1Anton Ertl
29 Jul 24  i i`* Re: Arguments for a sane ISA 6-years later2MitchAlsup1
29 Jul 24  i i `- Re: Arguments for a sane ISA 6-years later1BGB
6 Aug 24  i `* Re: Arguments for a sane ISA 6-years later2Chris M. Thomasson
6 Aug 24  i  `- Re: Arguments for a sane ISA 6-years later1Chris M. Thomasson
26 Jul 24  `* Re: Arguments for a sane ISA 6-years later4MitchAlsup1
27 Jul 24   +- Re: Arguments for a sane ISA 6-years later1BGB
28 Jul 24   `* Re: Arguments for a sane ISA 6-years later2Paul A. Clayton
28 Jul 24    `- Re: Arguments for a sane ISA 6-years later1MitchAlsup1

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal