Re: arm ldxr/stxr vs cas

Subject: Re: arm ldxr/stxr vs cas
From: jseigh_es00 (at) *nospam* xemaps.com (jseigh)
Newsgroups: comp.arch
Date: 07 Sep 2024, 16:02:56
Organization: A noiseless patient Spider
Message-ID: <vbhpv0$1de2c$1@dont-email.me>
User-Agent: Mozilla Thunderbird
On 9/6/24 15:57, MitchAlsup1 wrote:
On Fri, 6 Sep 2024 19:36:36 +0000, Chris M. Thomasson wrote:
 
On 9/5/2024 2:49 PM, jseigh wrote:
On 9/5/24 16:34, Chris M. Thomasson wrote:
On 9/5/2024 12:46 PM, MitchAlsup1 wrote:
On Thu, 5 Sep 2024 11:33:23 +0000, jseigh wrote:
>
On 9/4/2024 5:27 PM, MitchAlsup1 wrote:
On Mon, 2 Sep 2024 17:27:57 +0000, jseigh wrote:
>
I read that arm added the cas instruction because they didn't think
ldxr/stxr would scale well.  It wasn't clear to me as to why that
would be the case.  I would think the memory lock mechanism would
have really low overhead vs cas having to do an interlocked load
and store.  Unless maybe the memory lock size might be large
enough to cause false sharing issues.  Any ideas?
>
A pipeline lock between the LD part of a CAS and the ST part of a
CAS is essentially FREE. But the same is true for LL followed by
a later SC.
>
Older machines with looser than sequential consistency memory models
and running OoO have a myriad of problems with LL - SC. This is
why My 66000 architecture switches from causal consistency to
sequential consistency when it encounters <effectively> LL and
switches back after seeing SC.
>
No Fences necessary with causal consistency.
>
>
I'm not sure I entirely follow.  I was thinking of the effects on
cache.  In theory the SC could fail without having to get the current
cache line exclusive or at all.  CAS has to get it exclusive before
it can definitively fail.
>
A LL that takes a miss in L1 will perform a fetch with intent to modify,
so will a CAS. However, LL is allowed to silently fail if exclusive is
not returned from its fetch, deferring atomic failure to SC, while CAS
will fail when exclusive fails to return.
>
CAS should only fail when the comparands are not equal to each other.
Well, then there is the damn weak and strong CAS in C++11... ;^o
>
>
LL-SC is designed so that
when a failure happens, failure is visible at SC not necessarily at LL.
>
There are coherence protocols that allow the 2nd party to determine
if it returns exclusive or not. The example I know is when the 2nd
party is already performing an atomic event and it is better to fail
the starting atomic event than to fail an ongoing atomic event.
In My 66000 the determination is made under the notion of priority::
the higher priority thread is allowed to continue while the lower
priority thread takes the failure. The higher priority thread can
be the requestor (1st party) or the holder of data (2nd party)
while all interested observers (3rd parties) are in a position
to see what transpired and act accordingly (causal).
>
>
I'm not so sure about making the memory lock granularity same as
cache line size but that's an implementation decision I guess.
>
I do like the idea of detecting potential contention at the
start of LL/SC so you can do back off.  Right now the only way I
can detect contention is after the fact when the CAS fails and
I probably have the cache line exclusive at that point.  It's
pretty problematic.
>
I wonder if the ability to determine why a "weak" CAS failed might help.
They (weak) can fail for other reasons besides comparing comparands...
Well, would be a little too low level for a general atomic op in
C/C++11?
 One can detect that the CAS-line is no longer exclusive as a form
of weak failure, rather than waiting for the data to show up and
fail strongly on the compare.
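
For reference, portable C++ gives a partial answer to the question of
why a weak CAS failed. On failure, compare_exchange_weak writes the
value it read back into the expected argument, so if that value still
equals the original guess, the compare matched and the failure was
spurious. A minimal sketch follows; the names CasOutcome and
classify_weak_cas are hypothetical, and the hardware-level reason
(lost reservation, line no longer exclusive) is not exposed.

```cpp
#include <atomic>

// Hypothetical names, for illustration only.
enum class CasOutcome { Success, ValueMismatch, Spurious };

// Attempts one weak CAS and reports how it ended.
// On failure, `expected` holds the value that was actually read.
CasOutcome classify_weak_cas(std::atomic<int>& target, int& expected, int desired) {
    const int guess = expected;  // remember what we asked to compare against
    if (target.compare_exchange_weak(expected, desired)) {
        return CasOutcome::Success;
    }
    // If the value read still equals the guess, the comparison itself
    // matched and the store did not happen: a spurious (weak) failure.
    return (expected == guess) ? CasOutcome::Spurious : CasOutcome::ValueMismatch;
}
```

On targets where the weak form compiles to a single CAS instruction,
the Spurious outcome would not normally be expected to occur.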

There is no requirement for CAS to calculate the expected value in
any particular way, though typically the expected value is loaded
from the CAS target.  In fact you can use random values and it will
still work, it will just take a lot longer.  A typical optimization
for pushing onto a stack that you expect to be empty more often than
not is to initially use NULL as the expected value instead of loading
it from the stack anchor: a load immediate vs a load from storage.
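
The push optimization described above can be sketched roughly as
follows.  This is an illustration, not code from any particular
library: the Node layout and the anchor name head are assumptions.
The first attempt guesses that the stack is empty; if the guess is
wrong, the failed CAS supplies the real anchor value for the retry.

```cpp
#include <atomic>

// Hypothetical node type and stack anchor, for illustration only.
struct Node {
    int value;
    Node* next;
};

std::atomic<Node*> head{nullptr};

void push(Node* n) {
    // Guess "empty" instead of loading the anchor: a load immediate
    // rather than a load from storage.
    Node* expected = nullptr;
    n->next = expected;
    // On failure, compare_exchange_weak stores the current anchor value
    // into `expected`, so each retry links to what was actually observed.
    while (!head.compare_exchange_weak(expected, n,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
        n->next = expected;
    }
}
```

When the stack really is empty, the first CAS succeeds without ever
having read the anchor separately.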

x64 doesn't have an atomic 128-bit load, but cmpxchg16b works
OK nonetheless.  The two 64-bit loads just have to be effectively
atomic most of the time, or you can use the updated result from
cmpxchg16b.

aarch64 didn't have an atomic 128-bit load (LDP) early on.  You
have to do an LDXP/STXP to determine whether the load was atomic.
In practice, if you're doing an LDXP/STXP loop anyway, it doesn't
matter too much as long as you can handle the occasional
random 128-bit value.

I have had some success with after-the-fact contention backoff.
I get a 30% to 50% improvement in most cases.  The main challenge
is getting a 100+ nanosecond pause.  nanosleep() doesn't hack it.
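
One way to get a pause of that order without a system call is to
busy-wait on a clock with a CPU relax hint.  The sketch below is an
assumption about how such a backoff could look, not the method used
for the measurements above.  The names cpu_relax, spin_for and
add_with_backoff are hypothetical, as are the 100 ns starting pause
and the 1600 ns cap, and the real pause length depends on the cost
of reading the clock.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hint to the core that we are spinning (GCC/Clang builtins assumed).
inline void cpu_relax() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#elif defined(__aarch64__)
    __asm__ __volatile__("yield" ::: "memory");
#endif
}

// Busy-wait for roughly `d` without entering the kernel.
inline void spin_for(std::chrono::nanoseconds d) {
    const auto deadline = std::chrono::steady_clock::now() + d;
    while (std::chrono::steady_clock::now() < deadline) {
        cpu_relax();
    }
}

// Atomic add built from CAS, backing off only after a CAS has failed.
// Returns the value the counter held before the add.
std::uint64_t add_with_backoff(std::atomic<std::uint64_t>& counter,
                               std::uint64_t delta) {
    std::uint64_t expected = counter.load(std::memory_order_relaxed);
    std::chrono::nanoseconds pause{100};
    while (!counter.compare_exchange_strong(expected, expected + delta,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
        spin_for(pause);  // contention was detected after the fact
        if (pause < std::chrono::nanoseconds{1600}) {
            pause *= 2;   // exponential backoff up to the cap
        }
        expected = counter.load(std::memory_order_relaxed);  // refresh after waiting
    }
    return expected;
}
```

The strong form of CAS is used so that every failure reflects a real
change to the counter rather than a spurious one.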
Joe Seigh
