On Tue, 30 Jul 2024 9:51:46 +0000, Anton Ertl wrote:
>mitchalsup@aol.com (MitchAlsup1) writes:
>>On Mon, 29 Jul 2024 13:21:10 +0000, Anton Ertl wrote:
>>>A problem with that approach is that this requires enough reorder
>>>buffering (or something equivalent, there may be something cheaper for
>>>this particular problem) to cover at least the shared-cache latency
>>>(usually L3, more with multiple sockets).
>>The depth of the execution window may be smaller than the time it takes
>>to send the required information around and have this core recognize
>>that it is out-of-order wrt memory.
>So if we don't want to stall for memory accesses all the time, we need
>a bigger execution window, either by making the reorder buffer larger,
>or by using a different, cheaper mechanism.

The Mc 88120 had a 96-wide execution window, which could be filled up in
16 cycles (optimistically) and was filled up in 32 cycles (average).
Given that DRAM is not going to take less than 20 ns, and a 5 GHz core,
the execution window is 1/3rd of what would be required to absorb
a cache miss all the way to DRAM. Add in 12-ish cycles for L2, 10
cycles for transporting the L2 miss to the memory controller, 5 cycles
between the memory controller and the DRAM controller--and sooner or
later it gets hard. So nobody tries to make the execution window big
enough to do that. (Actually, the Mc 88120 was built in ECL bus
technology and ran at only 100 MHz, so its puny 16-cycle EW really was
big enough to absorb a cache miss all the way to DRAM.....but that is
for another day.)
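
For concreteness, here is that arithmetic as a trivial C program,
using the cycle counts above (the 1/3rd figure counts DRAM alone;
the extra hops push it toward 1/4):

#include <stdio.h>

int main(void)
{
    double ghz      = 5.0;   /* core clock, GHz                        */
    double dram_ns  = 20.0;  /* raw DRAM access time, optimistic       */
    int l2_cycles   = 12;    /* L2 lookup                              */
    int l2_to_mc    = 10;    /* L2 miss -> memory controller transport */
    int mc_to_dramc = 5;     /* memory controller -> DRAM controller   */
    int ew_fill     = 32;    /* average EW fill time, cycles           */

    double dram_cycles = dram_ns * ghz;  /* 20 ns * 5 GHz = 100 cycles */
    double total = dram_cycles + l2_cycles + l2_to_mc + mc_to_dramc;

    printf("DRAM alone: %.0f cycles; EW covers 1/%.1f of it\n",
           dram_cycles, dram_cycles / ew_fill);
    printf("full miss path: %.0f cycles; EW covers 1/%.1f of it\n",
           total, total / ew_fill);
    return 0;
}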
>Concerning the cheaper mechanism, what I am thinking of is hardware
>checkpointing every, say, 200 cycles or so (subject to fine-tuning).
>The idea here is that communication between cores is very rare, so
>rolling back more cycles than the minimal necessary amount costs
>little on average (except that it looks bad on cache ping-pong
>microbenchmarks).

You lost me::
Colloquially, there are 2 uses of the word checkpointing:: a) what
HW does each time it inserts a branch into the EW, b) what an OS or
application does to be able to recover from a crash (from any
mechanism).
Neither is used to describe interactions between cores.
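
As a rough software analogy of what Anton seems to be proposing (all
names here are invented for illustration, not any real design): take
a full snapshot every ~200 cycles, and on a cross-core conflict roll
back to the last snapshot rather than to the exact instruction:

#define CHECKPOINT_INTERVAL 200    /* cycles, "subject to fine-tuning" */

struct arch_state {
    unsigned long long regs[32];
    unsigned long long pc;
};

struct checkpoint {
    unsigned long long cycle;      /* when the snapshot was taken */
    struct arch_state  state;      /* full architectural state    */
};

static struct checkpoint last_cp;

/* Called every cycle: take a fresh snapshot every ~200 cycles. */
void cycle_tick(unsigned long long now, const struct arch_state *cur)
{
    if (now - last_cp.cycle >= CHECKPOINT_INTERVAL) {
        last_cp.cycle = now;
        last_cp.state = *cur;
    }
}

/* Called when another core's traffic shows this core is out of order
   wrt memory: roll back to the last snapshot.  Cheap on average if
   cross-core communication is rare; bad on cache ping-pong. */
void on_remote_conflict(struct arch_state *cur)
{
    *cur = last_cp.state;
}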
<snip>
>>>>The operations themselves are not slow.
>>>Citation needed.
>>A MEMBAR dropped into the pipeline, when nothing is speculative, takes
>>no more time than an integer ADD. Only when there is speculation does
>>it have to take time to relax the speculation.
>Not sure what kind of speculation you mean here. On in-order cores
>like the non-Fujitsu SPARCs from before about 2010, memory barriers
>are expensive AFAIK, even though there is essentially no branch
>speculation on in-order cores.

You dropped 64 instructions into the EW, and AGEN performs 15 address
generations in the order permitted by operand arrival. These addresses
are routed to the L1s to determine who hits and who misses--all OoO.
Thus, the addresses are only in operand order and they touched the
caches in operand order.
There could also be branch mispredictions in the EW, causing many of
the AGENs to get thrown away after the branch is discovered to be
poorly predicted.
And on top of all of this, several FP instructions may have raised
exceptions.
EW pipelines are generally designed to "sort all this stuff out at
retirement"; occasionally, memory ordering issues are sorted out
prior to retirement by replaying OoO memory references.
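
In toy form, that retirement-time check might look something like
this (invented structures, a sketch of the general idea rather than
any real machine's load/store queue):

#include <stdbool.h>
#include <stddef.h>

struct mem_op {
    bool          is_store;
    bool          executed;  /* already touched the cache?    */
    unsigned long addr;
    unsigned long seq;       /* program order within the EW   */
};

/* Before a store retires: did a younger load to the same address
   already execute ahead of it in operand order?  If so, that load
   read stale data and must be replayed. */
bool store_needs_load_replay(const struct mem_op *win, size_t n,
                             const struct mem_op *st)
{
    for (size_t i = 0; i < n; i++)
        if (!win[i].is_store && win[i].executed &&
            win[i].seq > st->seq && win[i].addr == st->addr)
            return true;     /* replay win[i] and younger memory ops */
    return false;
}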
>Of course, if you mean speculation about the order of loads and
>stores, yes, if you don't have such speculation, the memory barriers
>are fast, but then loads are extremely slow.
A MEMBAR requires the memory order to catch up to the current point
before adding new AGENs to the problem space. If the memory order
is already SC then MEMBAR has nothing to do and is pushed through
the pipeline without delay.
So, the delay has to do with catching up with memory order, not with
pushing the MEMBAR through the pipeline.
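
Seen from software, the same point can be illustrated with standard
C11 fences (this example is mine, not from the original posts): in
the message-passing idiom below, the producer's fence only has to
wait until earlier accesses are properly ordered, so when the memory
order has already caught up it costs next to nothing.

#include <stdatomic.h>

int data;
atomic_int flag;

void producer(void)
{
    data = 42;                                  /* plain store       */
    atomic_thread_fence(memory_order_release);  /* the "MEMBAR"      */
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

int consumer(void)
{
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        ;                                       /* spin on the flag  */
    atomic_thread_fence(memory_order_acquire);  /* pairs with above  */
    return data;                                /* guaranteed 42     */
}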