Re: Microarchitectural support for counting

Subject : Re: Microarchitectural support for counting
From : anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups : comp.arch
Date : 26. Dec 2024, 15:56:30

Organization : Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID : <2024Dec26.155630@mips.complang.tuwien.ac.at>
User-Agent : xrn 10.11

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:

On 10/3/2024 7:00 AM, Anton Ertl wrote:
Two weeks ago Rene Mueller presented the paper "The Cost of Profiling
in the HotSpot Virtual Machine" at MPLR 2024. He reported that for
some programs the counters used for profiling the program result in
cache contention due to true or false sharing among threads.

The traditional software mitigation for that problem is to split the
counters into per-thread or per-core instances. But for heavily
multi-threaded programs running on machines with many cores the cost
of this mitigation is substantial.

...

For the HotSpot application, the
eventual answer was that they live with the cost of cache contention
for the programs that have that problem. After some minutes the hot
parts of the program are optimized, and cache contention is no longer
a problem.

...

If the per-thread counters are properly padded to an L2 cache line and
properly aligned on cache line boundaries, well, they should not cause
false sharing with other cache lines... Right?

Sure, that's what the first sentence of the second paragraph you cited
(and which I cited again) is about. Next, read the next sentence.

Maybe I should give an example (fully made up on the spot; read the
paper for real numbers): If HotSpot uses, on average, one counter per
conditional branch, and assuming a conditional branch every 10 static
instructions (each having, say, 4 bytes), with 1MB of generated code
and 8 bytes per counter, that's 200KB of counters. But these counters
are shared between all threads, so for code running on many cores you
get true and false sharing.

As mentioned, the usual mitigation is per-core counters. With a
256-core machine, we now have 51.2MB of counters for 1MB of executable
code. Now this is Java, so there might be quite a bit more executable
code and correspondingly more counters. They eventually decided that
the benefit of reduced cache coherence traffic is not worth that cost
(or the cost of a hardware mechanism), as described in the last
paragraph, from which I cited the important parts.
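
For anyone who wants to redo the arithmetic with other assumptions,
here is a small sketch in Python. It only uses the made-up numbers
from the example above; the constant and function names are invented
for illustration, and 1MB is taken as 10^6 bytes.

```python
# Back-of-the-envelope counter footprint, using the made-up numbers
# from the example (not measurements from the paper).
CODE_BYTES = 1_000_000         # 1MB of generated code (decimal MB assumed)
BYTES_PER_INSTRUCTION = 4      # assumed average instruction size
INSTRUCTIONS_PER_BRANCH = 10   # one conditional branch per 10 static instructions
BYTES_PER_COUNTER = 8
CORES = 256


def counter_bytes(code_bytes: int, copies: int = 1) -> int:
    """Bytes of profiling counters for `copies` instances of the counter set."""
    instructions = code_bytes // BYTES_PER_INSTRUCTION
    branches = instructions // INSTRUCTIONS_PER_BRANCH
    return branches * BYTES_PER_COUNTER * copies


shared = counter_bytes(CODE_BYTES)                   # one set shared by all threads
per_core = counter_bytes(CODE_BYTES, copies=CORES)   # one private set per core

print(f"shared counters:   {shared / 1e3:.0f} KB")
print(f"per-core counters: {per_core / 1e6:.1f} MB")
```

With these inputs the shared set comes to 200KB and the per-core
sets to 51.2MB, i.e., the per-core footprint grows linearly with the
number of cores while the code size stays the same.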

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>