Re: Arguments for a sane ISA 6-years later

Subject: Re: Arguments for a sane ISA 6-years later
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.arch
Date: 29 Jul 2024, 19:20:44
Organization: A noiseless patient Spider
Message-ID: <v88mhu$jit2$1@dont-email.me>
References: 1 2 3 4 5 6
User-Agent: Mozilla Thunderbird
On 7/29/2024 12:38 PM, MitchAlsup1 wrote:
On Mon, 29 Jul 2024 3:32:52 +0000, Chris M. Thomasson wrote:
 
On 7/26/2024 10:00 AM, Anton Ertl wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 7/25/2024 1:09 PM, BGB wrote:
At least with a weak model, software knows that if it doesn't go through
the rituals, the memory will be stale.
>
There is no guarantee of staleness, only a lack of stronger ordering
guarantees.
>
The weak model is ideal for me. I know how to program for it
>
And the fact that this model is so hard to use that few others know
how to program for it makes it ideal for you.
>
and it's more efficient
>
That depends on the hardware.
>
Yes, the Alpha 21164 with its imprecise exceptions was "more
efficient" than other hardware for a while, then the Pentium Pro came
along and gave us precise exceptions and more efficiency.  And
eventually the Alpha people learned the trick, too, and 21264 provided
precise exceptions (although they did not admit this) and more
efficiency.
>
Similarly, I expect that hardware that is designed for good TSO or
sequential consistency performance will run faster on code written for
this model than code written for weakly consistent hardware will run
on that hardware.  That's because software written for weakly
consistent hardware often has to insert barriers or atomic operations
just in case, and these operations are slow on hardware optimized for
weak consistency.
>
By contrast, one can design hardware for strong ordering such that the
slowness occurs only in those cases when actual (not potential)
communication between the cores happens, i.e., much less frequently.
>
and sometimes use cases do not care if they encounter "stale" data.
>
Great.  Unless these "sometimes" cases are more often than the cases
where you perform some atomic operation or barrier because of
potential, but not actual communication between cores, the weak model
is still slower than a well-implemented strong model.
>
A strong model? You mean I don't have to use any memory barriers at all?
Tell that to SPARC in RMO mode... How strong? Even the x86 requires a
membar when a store followed by a load to another location shall be
respected wrt order. Store-Load. #StoreLoad over on SPARC. ;^)
 DRAM does not need this property, MMI/O does.
 
As I see it, it makes sense to treat MMIO separately from normal memory access:
   Memory may use a much weaker model;
   For MMIO, generally every access needs to be synchronous.
In most cases, for MMIO, it makes sense to pay this cost, even if it means that access is slower.
In my case though, some of the hardware uses DRAM-backed buffers, which makes memory consistency more of a pain.
But then one has to decide whether it is better to use normal memory access plus flushing, or No-Cache/Volatile access (which makes accesses consistent, but comes at a fairly steep performance penalty).
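
To make the tradeoff concrete, here is a minimal C sketch of the two options for a DRAM-backed hardware buffer. All names are hypothetical, and `cache_flush_range` is a stand-in no-op for whatever flush mechanism the target provides. Note that C `volatile` only constrains the compiler; whether an access actually bypasses the cache is assumed to depend on how the address is mapped.

```c
#include <stddef.h>
#include <stdint.h>

/* Volatile accessors: the compiler must perform each access as written
 * (no caching in registers, no merging, no dropping). */
static inline void mmio_write32(volatile uint32_t *reg, uint32_t v) { *reg = v; }
static inline uint32_t mmio_read32(volatile uint32_t *reg) { return *reg; }

/* Hypothetical hook: write back the cache lines covering [p, p+n).
 * Assumption: a real target would issue its flush operation here;
 * this placeholder does nothing. */
static void cache_flush_range(const void *p, size_t n) { (void)p; (void)n; }

/* Option A: fill the buffer with normal cached stores, then flush once. */
static void fill_cached_then_flush(uint32_t *buf, size_t n, uint32_t v)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = v;
    cache_flush_range(buf, n * sizeof *buf);
}

/* Option B: every store is a volatile (no-cache style) access.
 * Consistent without a flush, but each access pays the full cost. */
static void fill_volatile(volatile uint32_t *buf, size_t n, uint32_t v)
{
    for (size_t i = 0; i < n; i++)
        mmio_write32(&buf[i], v);
}
```

Option A pays one flush per buffer update; Option B pays on every access, which is where the steep penalty would come from for large buffers.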

If you can force everything to be #StoreLoad (*) and make it faster than
a handcrafted algo on a very weak memory system, well, hats off! I
thought it was easier for a HW guy to implement weak consistency? At the
cost of the increased complexity wrt programming the sucker! ;^)
 Or HW can have different order strengths based on where the PTE
sends the request. DRAM gets causal order, ATOMICs to DRAM get
sequential consistency, MMI/O gets sequential consistency,
Configuration gets strong ordering.
 Programmer has to do nothing.
 
In my case, thus far it is based partly on the high 4 bits of the address:
   0..B: virtual address, cached
   C: physical address, cached
   D: physical address, non-cached
   E: reserved
   F: MMIO, non-cached
For virtual memory, the TLB can also encode the use of non-cached memory.
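
The decode above can be sketched in C roughly as follows. The 48-bit address width (`ADDR_BITS`), the enum, and the function names are assumptions made for illustration; only the mapping from the high 4 bits to the region comes from the list.

```c
#include <stdint.h>

/* Hypothetical region kinds, one per entry in the list above. */
typedef enum {
    REGION_VIRT_CACHED,   /* 0..B: virtual address, cached */
    REGION_PHYS_CACHED,   /* C: physical address, cached */
    REGION_PHYS_NOCACHE,  /* D: physical address, non-cached */
    REGION_RESERVED,      /* E: reserved */
    REGION_MMIO_NOCACHE   /* F: MMIO, non-cached */
} region_t;

/* Assumption: addresses are 48 bits wide, so the "high 4 bits"
 * are bits 47..44. Adjust ADDR_BITS for a different width. */
#define ADDR_BITS 48

static region_t classify_addr(uint64_t addr)
{
    unsigned hi = (unsigned)((addr >> (ADDR_BITS - 4)) & 0xF);

    if (hi <= 0xB)
        return REGION_VIRT_CACHED;
    switch (hi) {
    case 0xC: return REGION_PHYS_CACHED;
    case 0xD: return REGION_PHYS_NOCACHE;
    case 0xE: return REGION_RESERVED;
    default:  return REGION_MMIO_NOCACHE;   /* hi == 0xF */
    }
}

/* Default cacheability by region only. For virtual addresses the TLB
 * entry may still mark a page non-cached, which this sketch ignores. */
static int region_is_cached(region_t r)
{
    return r == REGION_VIRT_CACHED || r == REGION_PHYS_CACHED;
}
```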
Considered, but not yet implemented, are memory operations that themselves encode the use of non-cached access.
Most likely, this would be as a special case of the existing LDOP/OPST mechanism (was also used for the RV64 AMO operations).
The same mechanism would behave as LoadOp, OpStore, or AMO, depending on how it was used. Currently, these behave as normal access, but it might make sense to allow them to signal Volatile/NoCache.
There is a bit in the encoding that BJX2 currently uses to signal Imm6, but which has no effect on the operation itself. This bit (in the operation) could possibly be repurposed to signal Volatile access, but there would still need to be a way to signal it in the encoding.
Well, and/or make XCHG special:
Say, Load+XCHG is assumed Volatile, as is Store+XCHG; in the latter case one ignores the result of the XCHG if the intention is merely to store something. If one intends to perform a non-volatile exchange, one uses a plain Load+Store sequence instead.
If other Volatile AMO operations may only be encoded in RISC-V Mode, this probably isn't a huge loss...
Well, and/or I drop the Imm6 case, which isn't really used much anyways.
But, could be used to encode things like "*(int *)ptr+=4;" as a single operation.
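
As a rough illustration of the three behaviors mentioned above (LoadOp, OpStore, AMO) plus the exchange case, here is a small C model of the data flow. The function names are hypothetical and the semantics are an assumption based on the description, not the actual BJX2 definition. Nothing here is atomic or volatile; it only shows which value ends up where.

```c
#include <stdint.h>

/* LoadOp (assumed): the register result is computed from a memory operand. */
static int32_t loadop_add(int32_t reg, const int32_t *mem)
{
    return reg + *mem;
}

/* OpStore (assumed): memory is updated in place, no register result.
 * With a small immediate this is the "*(int *)ptr += 4;" case. */
static void opstore_add(int32_t *mem, int32_t val)
{
    *mem += val;
}

/* AMO-style (assumed): memory is updated and the OLD value is returned,
 * as with RISC-V AMOADD. A real AMO does this as one indivisible step. */
static int32_t amo_add_model(int32_t *mem, int32_t val)
{
    int32_t old = *mem;
    *mem = old + val;
    return old;
}

/* Exchange: store the new value, return the old one. A caller that only
 * wants the store simply ignores the return value. */
static int32_t xchg_model(int32_t *mem, int32_t val)
{
    int32_t old = *mem;
    *mem = val;
    return old;
}
```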

>
(*) Not just #StoreLoad; for full consistency, you would need:
>
MEMBAR #StoreLoad | #LoadStore | #StoreStore | #LoadLoad
>
right?
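
For reference, the closest portable analogue to a full barrier in C11 is a sequentially consistent fence. The sketch below shows the store-then-load case (the #StoreLoad situation discussed above) with hypothetical names; how the fence is lowered, for example to a full MEMBAR on SPARC or a fence instruction on x86, is up to the compiler and target.

```c
#include <stdatomic.h>

/* Two flags for a Dekker-style handshake between two threads. */
static atomic_int flag0;
static atomic_int flag1;

/* Publish our own flag, then check the other side's flag.
 * Without the fence, the relaxed store and the relaxed load may be
 * reordered (store -> load), and both threads could read 0. */
static int publish_then_check(atomic_int *mine, atomic_int *other)
{
    atomic_store_explicit(mine, 1, memory_order_relaxed);

    /* Full fence: orders earlier loads and stores against later
     * loads and stores, including the store -> load case. */
    atomic_thread_fence(memory_order_seq_cst);

    return atomic_load_explicit(other, memory_order_relaxed);
}
```

If thread 0 calls `publish_then_check(&flag0, &flag1)` and thread 1 calls `publish_then_check(&flag1, &flag0)`, the fences ensure that at least one of the two calls returns 1.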
