Subject: Re: Memory ordering
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 01. Aug 2024, 16:54:55
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Aug1.175455@mips.complang.tuwien.ac.at>
References: 1 2 3 4 5 6 7 8 9
User-Agent: xrn 10.11

mitchalsup@aol.com (MitchAlsup1) writes:
>On Tue, 30 Jul 2024 9:51:46 +0000, Anton Ertl wrote:
>
>>mitchalsup@aol.com (MitchAlsup1) writes:
>>>The depth of the execution window may be smaller than the time it takes
>>>to send the required information around and have this core recognize
>>>that it is out-of-order wrt memory.
>>
>>So if we don't want to stall for memory accesses all the time, we need
>>a bigger execution window, either by making the reorder buffer larger,
>>or by using a different, cheaper mechanism.
>
>Mc 88120 had a 96-wide execution window, which could be filled up in
>16 cycles (optimistically) and was filled up in 32 cycles (average).
>Given that DRAM is not going to be less than 20 ns and a 5GHz core,
>the execution window is 1/3rd that which would be required to absorb
>a cache miss all the way to DRAM.

Relevant numbers for current cores are 400-600 instructions in the
reorder buffer, 6-8 instructions per cycle, and the core-to-core
latency is (for Bergamo) 30-40ns (90-120 cycles) within a CCX,
100-120ns (300-360 cycles) within a socket, with a 212ns (636 cycles)
worst case across sockets (data from
<https://chipsandcheese.com/2024/06/22/testing-amds-bergamo-zen-4c-spam/>);
I computed with a 3GHz clock rate, which fits Bergamo.  On the fast
desktop chips the clock rate is higher, but the latency is lower in
both ns and cycles (in particular, no dual-socket penalty); e.g., on
the Ryzen 7950X the core-to-core latency is <80ns (456 cycles)
<https://images.anandtech.com/doci/17585/AMD%20Ryzen%209%207950X%20Core%20to%20Core%20Latency%20Final.jpg>.

The reorder buffers (and integer register files and store buffers)
would need to be even larger to cover the 456 or 636 cycles (and maybe
even more latency than that is needed before you are sure that a load
or store is sequentially consistent), or alternatively one would need
a cheaper mechanism.
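
To put "even larger" into numbers, a back-of-the-envelope sketch of my
own, using the latencies and the 6-8 instructions per cycle quoted
above:

#include <stdio.h>

int main(void)
{
    /* core-to-core latencies from above, in cycles */
    int latency[] = {120, 360, 636, 456};
    const char *what[] = {"Bergamo, within CCX", "Bergamo, within socket",
                          "Bergamo, across sockets", "Ryzen 7950X"};
    for (int i = 0; i < 4; i++)
        printf("%-24s %3d cycles -> %4d..%4d instructions in flight\n",
               what[i], latency[i], 6 * latency[i], 8 * latency[i]);
    /* e.g. 636 cycles -> 3816..5088 instructions in flight, far beyond
       the 400-600 entries of current reorder buffers */
    return 0;
}
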
>>Concerning the cheaper mechanism, what I am thinking of is hardware
>>checkpointing every, say, 200 cycles or so (subject to fine-tuning).
>>The idea here is that communication between cores is very rare, so
>>rolling back more cycles than the minimal necessary amount costs
>>little on average (except that it looks bad on cache ping-pong
>>microbenchmarks).
>
>You lost me::
>
>Colloquially, there are 2 uses of the word checkpointing:: a) what
>HW does each time it inserts a branch into the EW, b) what an OS or
>application does to be able to recover from a crash (from any
>mechanism).

What is "EW"?

Anyway, here checkpointing would be a hardware mechanism that allows
rolling back to the state at the point when the checkpoint was made.
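
A very rough sketch in C of the mechanism I have in mind; the
structures, the 200-cycle interval, and the violation check are
placeholders for illustration, not a description of real hardware:

#include <stdbool.h>
#include <stdint.h>

#define CHECKPOINT_INTERVAL 200      /* cycles; subject to fine-tuning */

struct arch_state { uint64_t regs[32]; uint64_t pc; };

struct core {
    struct arch_state now;           /* current (speculative) state */
    struct arch_state saved;         /* state at the last checkpoint */
    uint64_t cycle, saved_cycle;
};

/* Stubs; a real model would drive these from the pipeline and the
   coherence fabric (the latter reporting when another core's accesses
   contradict the memory order this core speculated on). */
static void run_one_cycle(struct core *c) { c->now.pc += 4; }
static bool order_violation(const struct core *c) { (void)c; return false; }

static void step(struct core *c)
{
    if (c->cycle - c->saved_cycle >= CHECKPOINT_INTERVAL) {
        c->saved = c->now;           /* take a new checkpoint */
        c->saved_cycle = c->cycle;
    }
    run_one_cycle(c);
    if (order_violation(c)) {
        c->now = c->saved;           /* roll back and redo up to ~200
                                        cycles of work; cheap on average,
                                        because cross-core communication
                                        is rare */
        c->saved_cycle = c->cycle;
    }
    c->cycle++;
}

int main(void)
{
    struct core c = {0};
    for (int i = 0; i < 1000; i++)
        step(&c);
    return 0;
}
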
>A MEMBAR requires the memory order to catch up to the current point
>before adding new AGENs to the problem space. If the memory order
>is already SC then MEMBAR has nothing to do and is pushed through
>the pipeline without delay.

Yes, that's the slow implementation.  The fast implementation is to
implement sequential consistency all the time (by predicting and
speculating that memory accesses do not interfere with those of other
cores, and recovering when that speculation turns out to be wrong).
In such an implementation memory barriers are no-ops (and thus fast),
because the hardware already provides sequential consistency.
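
For illustration, the classic store-buffering pattern in C11 atomics
(a sketch of my own, with the seq_cst fences standing in for MEMBAR).
Sequential consistency forbids the outcome r1==0 && r2==0; a core that
already speculates SC gets that for free, so the fences have nothing
left to do:

#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

atomic_int x = 0, y = 0;
int r1, r2;

int t0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* the MEMBAR */
    r1 = atomic_load_explicit(&y, memory_order_relaxed);
    return 0;
}

int t1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* the MEMBAR */
    r2 = atomic_load_explicit(&x, memory_order_relaxed);
    return 0;
}

int main(void)
{
    thrd_t a, b;
    thrd_create(&a, t0, NULL);
    thrd_create(&b, t1, NULL);
    thrd_join(a, NULL);
    thrd_join(b, NULL);
    /* With the fences (or with seq_cst accesses) r1 == 0 && r2 == 0
       cannot happen; without them, a weakly ordered machine may produce
       it, because each store can still sit in a store buffer when the
       other core's load executes. */
    printf("r1=%d r2=%d\n", r1, r2);
    return 0;
}
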
>Then consider 2 Vector processors performing 2 STs (1 each) to
>non-overlapping addresses but with bank aliasing. Consider that
>the STs are scatter based and the bank conflicts random. There
>is no way to determine which store happened first or which
>element of each vector store happened first.

It's up to the architecture to define the order of stores and loads of
a given core. For sequential consistency you then interleave the
sequences coming from the cores in some convenient order. It does not
matter what happens earlier in some inertial system. It only matters
what your hardware decides should be treated as being earlier. The
hardware has a lot of freedom here, but the end result as visible to
the cores must be sequentially consistent (or, with a weaker memory
consistency model, consistent with that model).
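
To illustrate the freedom the hardware has: say core 0 performs stores
A then B (two elements of its scatter; the labels are purely for
illustration) and core 1 performs C then D.  Any global order that
keeps A before B and C before D is acceptable under SC; this little
program of mine enumerates the hardware's choices:

#include <stdio.h>
#include <string.h>

static const char ev[4] = {'A', 'B', 'C', 'D'};

static int before(const char *order, char first, char second)
{
    return strchr(order, first) < strchr(order, second);
}

static void emit(char *order, int depth, unsigned used)
{
    if (depth == 4) {
        order[4] = '\0';
        /* keep only global orders that respect each core's program
           order: A before B (core 0) and C before D (core 1) */
        if (before(order, 'A', 'B') && before(order, 'C', 'D'))
            printf("%s\n", order);
        return;
    }
    for (int i = 0; i < 4; i++)
        if (!(used & (1u << i))) {
            order[depth] = ev[i];
            emit(order, depth + 1, used | (1u << i));
        }
}

int main(void)
{
    char order[5];
    emit(order, 0, 0);   /* prints 6 of the 24 permutations */
    return 0;
}
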
- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>