Sujet : Re: arm ldxr/stxr vs cas
De : jseigh_es00 (at) *nospam* xemaps.com (jseigh)
Groupes : comp.archDate : 08. Sep 2024, 12:53:59
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <vbk38p$1tmo8$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14
User-Agent : Mozilla Thunderbird
On 9/8/24 02:35, Chris M. Thomasson wrote:
On 9/7/2024 5:59 PM, MitchAlsup1 wrote:
On Sat, 7 Sep 2024 23:16:35 +0000, Chris M. Thomasson wrote:
>
On 9/7/2024 4:14 PM, Chris M. Thomasson wrote:
[...]
When I am using CAS I don't really expect it to fail willy nilly even if
the comparands are still the same. Weak vs Strong. Still irks me a bit.
;^)
>
There are algorithms out there, usually state machines that depend on
strong cas. When a CAS fails, it depends on it failing because the
comparands were actually different...
>
Leading to ABA failures::
>
Do you really want the following CAS to succeed ??
>
LD R19,[someMemoryValue]
..
interrupt delays program execution for 1 week
..
CAS R17,R19,[someMemoryLocation]
>
Given that the someMemoryLocation is accessible to other programs
while tis one is sleeping ??
>
Thus, it seems reasonable to fail a CAS when one cannot determine
if the memory location has been changed and changed back in the
mean time.
ABA, well that can happen with CAS and certain algorithms that use them. The good ol' version counter is pretty nice, but it still can fail. 64 bit words, 128 bit CAS. Loads to boot... ;^) Actually, I think Joe mentioned something interesting about CAS a long time ago wrt IBM... Candy Cane Striped books (Joe do you remember?) about a way to avoid live lock and failing in the os. I think Windows has some like it with their SList and SEH?
You can have an ABA problem with single word CAS if your values (usually
pointers) are one word in size. The solution to that is to use double
word CAS, DWCAS, with the other word being a monotonic sequence number. The logic being that it would take longer to wrap and reuse the sequence
number than any reasonable delay in execution. In the 70's s370
CDS was 64 bits (2 32 bit words) and it was estimated that it would
take a year to wrap a 32 bit counter.
The risc-v arch manual mentions that you need DWCAS or LL/SC to avoid
the ABA problem and that they chose LL/SC to avoid dealing with 128
bit atomic data sizes (their loss but I'm not going into that here).
Anyway if you want to avoid the ABA problem, you can't do it with
C/C++ atomics since they don't support DWCAS.
The candy striped Principle of Operation was an internal restricted
document with architecture and engineering notes added. The note
was that for short CAS loops fairness should be guaranteed so all
processors could make equal forward progress.
Speaking of live lock, in the 80's, VM/XA spin locks broke on a 4300 box. VMXA used a test and set loop without any back off. The 4300 TS
microcode timed out getting a lock and failed silently. So all the
cpu's spun forever trying to get a lock that was free. We sort
of figured out when I provided a patch that put a load and test
of the lock so we could put in a hardware break point to see if the
lock was actually not free. It suddenly started working because
the patch provide enough backoff for the microcode locking
to start working again.
Joe Seigh