Subject: Re: smrproxy v2
From: chris.m.thomasson.1 (at) *nospam* gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Date: 28 Oct 2024, 22:57:23
Organization: A noiseless patient Spider
Message-ID : <vfp1c3$16d9f$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13
User-Agent : Mozilla Thunderbird
On 10/28/2024 4:45 AM, jseigh wrote:
On 10/28/24 00:02, Chris M. Thomasson wrote:
On 10/27/2024 5:35 PM, jseigh wrote:
On 10/27/24 18:32, Chris M. Thomasson wrote:
>
The membar version? That's a store/load membar so it is expensive.
>
I was wondering in your c++ version if you had to use any seq_cst barriers. I think acquire/release should be good enough. Now, when I say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and things like that.
>
I take it that your pure C++ version has no atomic RMW, right? Just loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to ensure those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock, the only class of mutex logic that comes to mind that requires an explicit StoreLoad-style membar is Peterson's algorithm, and some others along those lines, so to speak. That is for the version built from plain stores and loads. Now, an RMW on x86 basically implies a StoreLoad because of the LOCK prefix; XCHG is special only in that its LOCK prefix is implied. For instance, the original SMR algo requires a StoreLoad as-is on x86/x64: MFENCE or a LOCK-prefixed instruction.
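To make the Peterson's case concrete, here is a minimal two-thread sketch using only atomic loads and stores, no RMW. It conservatively uses the default seq_cst ordering on every operation, because the algorithm depends on the store to `turn` being ordered before the following load of the other thread's flag, which is exactly the StoreLoad case. The names `peterson_lock` and `run_peterson_demo` are made up for illustration; this is not code from smrproxy or from anyone's proxy collector.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical two-thread Peterson lock: plain atomic loads and stores only.
struct peterson_lock {
    std::atomic<bool> flag[2]{{false}, {false}};
    std::atomic<int> turn{0};

    void lock(int self) {
        const int other = 1 - self;
        flag[self].store(true);   // seq_cst store: announce interest
        turn.store(other);        // seq_cst store: give way to the other thread
        // The loads below must not be satisfied ahead of the stores above.
        // seq_cst provides that store -> load ordering; acquire/release
        // alone would not.
        while (flag[other].load() && turn.load() == other) {
            std::this_thread::yield();
        }
    }

    void unlock(int self) {
        flag[self].store(false);  // seq_cst store, kept conservative
    }
};

// Two threads bump a plain (non-atomic) counter under the lock.
long run_peterson_demo(int iters) {
    assert(iters >= 0);
    peterson_lock lock;
    long counter = 0;             // protected by the lock only
    auto worker = [&](int self) {
        for (int i = 0; i < iters; ++i) {
            lock.lock(self);
            ++counter;
            lock.unlock(self);
        }
    };
    std::thread t0(worker, 0);
    std::thread t1(worker, 1);
    t0.join();
    t1.join();
    return counter;
}
```

Calling `run_peterson_demo(n)` should return `2 * n` if mutual exclusion holds. On x86 a seq_cst store is typically emitted as XCHG or as MOV followed by MFENCE, which is where the StoreLoad cost shows up.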
Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic fetch-add. It needs explicit membars (but no #StoreLoad) on SPARC in RMO mode. On x86, the LOCK prefix handles that wrt the RMWs themselves. This is a lot different than using plain stores and loads. The original SMR and Peterson's algo need that "store followed by a load from a different location" ordering to hold true, aka StoreLoad...
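For contrast with the store/load style, here is a rough sketch of the RMW style: a reference count where entering is a single fetch-add and leaving is a single fetch-sub. This is not the proxy collector itself; `ref_gate` and `run_ref_gate_demo` are hypothetical names, and the acquire/release orderings are an assumption for illustration. Whether they suffice depends on the rest of the algorithm built around the count.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical reference-counted gate; illustrative only.
struct ref_gate {
    std::atomic<long> refs{0};

    // One RMW. On x86 this is typically LOCK XADD, which acts as a full
    // barrier regardless of the memory_order requested here.
    void enter() { refs.fetch_add(1, std::memory_order_acquire); }

    // Returns true if the caller dropped the last reference.
    bool leave() {
        return refs.fetch_sub(1, std::memory_order_release) == 1;
    }
};

// Several threads enter and leave repeatedly; the count must end at zero.
long run_ref_gate_demo(int threads, int iters) {
    assert(threads >= 0 && iters >= 0);
    ref_gate gate;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&gate, iters] {
            for (int i = 0; i < iters; ++i) {
                gate.enter();
                gate.leave();
            }
        });
    }
    for (auto& th : pool) th.join();
    return gate.refs.load();
}
```

On a weakly ordered target the same source would get whatever barriers the compiler needs for acquire and release, which is where the explicit membars come in.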
Now, I don't think that a data-dependent load can act like a StoreLoad. I thought that they act sort of like an acquire, aka #LoadStore | #LoadLoad wrt SPARC. SPARC in RMO mode honors data dependencies. Now, the DEC Alpha is a different story... ;^)
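A small sketch of what a dependent load does order, under the assumption that the pointer is published with a release store. The reader loads the pointer and then reads through it, so the second load depends on the first. C++ has memory_order_consume for this, but current compilers generally treat it as acquire, so the sketch just uses acquire. `node` and `run_publish_demo` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct node {
    int value;
};

// The writer fills in a node and publishes its address; the reader waits for
// the pointer and then performs a data-dependent load through it.
int run_publish_demo() {
    node n{0};
    std::atomic<node*> head{nullptr};

    std::thread writer([&] {
        n.value = 42;                                // plain store
        head.store(&n, std::memory_order_release);   // publish
    });

    node* p = nullptr;
    while ((p = head.load(std::memory_order_acquire)) == nullptr) {
        std::this_thread::yield();
    }
    int seen = p->value;   // dependent load, ordered after the load of head
    assert(p == &n);

    writer.join();
    return seen;
}
```

This orders the later load after the earlier load. It says nothing about an earlier store by the reader being ordered before that load, which is the StoreLoad case that Peterson's and the original SMR need.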
For cmpxchg, it has full seq_cst. For other rmw atomics I don't
know. I have to ask on c.a. I think some data dependency and/or
control dependency might factor in.
Joe Seigh