Sujet : Re: smrproxy v2
De : chris.m.thomasson.1 (at) *nospam* gmail.com (Chris M. Thomasson)
Groupes : comp.lang.c++Date : 25. Nov 2024, 01:48:07
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <vi0hg6$2f499$2@dont-email.me>
References : 1 2 3 4 5 6 7
User-Agent : Mozilla Thunderbird
On 11/24/2024 4:37 PM, Chris M. Thomasson wrote:
On 11/24/2024 4:09 PM, jseigh wrote:
On 11/24/24 18:19, Chris M. Thomasson wrote:
On 11/24/2024 12:14 PM, jseigh wrote:
On 11/23/24 11:10, jseigh wrote:
On 11/21/24 15:17, jseigh wrote:
On 10/17/24 08:10, jseigh wrote:
I replaced the hazard pointer logic in smrproxy. It's now wait-free
instead of mostly wait-free. The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store. The unlock is same
as before, just a store.
>
It's way faster now.
>
It's on the feature/003 branch as a POC. I'm working on porting
it to c++ and don't want to waste any more time on c version.
>
No idea of it's a new algorithm. I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
>
>
I got a port to c++ working now. There are 5 proxy implementations
1) smrproxy v2
2) arcproxy - reference counted proxy
3) rwlock based proxy
4) mutex based proxy
5) an unsafe proxy with no locking
>
The testcase is templated so you can use any of the
5 proxy implementations without rewriting for each proxy
type. You can do apple to apple comparisons. I
realize that's the complete antithesis of current
programming practices but there you have it. :)
>
A bit of clean up and performance tuning now.
>
>
Ok, smrproxy lock/unlock is down to 0.6 nanoseconds now,
about what the C version was.
>
>
I've been using cpu time to measure performance. That's ok
for lock-free/wait-free locking. For normal mutexes and
shared locks, it doesn't measure wait time so those didn't
look as bad as they really were. You can add logic
to measure how long it takes to acquire a lock but that
adds significant overhead.
>
I remember back in the day when I was comparing and contrasting various lock/wait-free algorithms with their 100% lock-based counter parts. Some of the lock-based tests too so long that I just terminated the damn program. Iirc, a lock-free test would take around 5 minutes. The lock- based test would be around 30+ minutes. This was way back on c.p.t.
>
I set the iteration count as a parameter. Mutex can be particularly
slow with a lot of reader threads. I usually see about 1000 - 10000
times slower than smrproxy. rwlocks aren't as bad, about 200 x
slower.
>
Mutex, rwlock, and arcproxy use interlocked instructions so you
can get a really wide performance range based on cache geometry
and processor sets you run on.
Big time. My older proxy uses interlocked instructions as well. Except, it does not use any CAS instructions... :^)
https://pastebin.com/raw/f71480694
Wow, back in 2010! How time goes by. Shit... ;^o
Btw, my code for that link is using Relacy Race Detector. :^)