Re: Microarchitectural support for counting

Subject : Re: Microarchitectural support for counting
From : anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups : comp.arch
Date : 26. Dec 2024, 15:56:30
Organisation : Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID : <2024Dec26.155630@mips.complang.tuwien.ac.at>
References : 1 2
User-Agent : xrn 10.11
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
>On 10/3/2024 7:00 AM, Anton Ertl wrote:
>> Two weeks ago Rene Mueller presented the paper "The Cost of Profiling
>> in the HotSpot Virtual Machine" at MPLR 2024.  He reported that for
>> some programs the counters used for profiling the program result in
>> cache contention due to true or false sharing among threads.
>>
>> The traditional software mitigation for that problem is to split the
>> counters into per-thread or per-core instances.  But for heavily
>> multi-threaded programs running on machines with many cores the cost
>> of this mitigation is substantial.
...
>> For the HotSpot application, the
>> eventual answer was that they live with the cost of cache contention
>> for the programs that have that problem.  After some minutes the hot
>> parts of the program are optimized, and cache contention is no longer
>> a problem.
...
>If the per-thread counters are properly padded to an L2 cache line and
>properly aligned on cache line boundaries, well, they should not cause
>false sharing with other cache lines... Right?

Sure, that's what the first sentence of the second paragraph you cited
(and which I cited again) is about.  Next, read the next sentence.

Maybe I should give an example (fully made up on the spot, read the
paper for real numbers): If HotSpot uses, on average, one counter per
conditional branch, and assuming a conditional branch every 10 static
instructions (each taking, say, 4 bytes), with 1MB of generated code
and 8 bytes per counter, that's 200KB of counters.  But these counters
are shared between all threads, so for code running on many cores you
get true and false sharing.
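
The arithmetic behind the 200KB figure can be checked with a minimal
Python sketch.  The function name and parameter names are hypothetical,
1MB is taken as 10^6 bytes (which is what yields the round figure), and
the inputs are the made-up numbers from the example, not measurements
from the paper.

```python
def shared_counter_bytes(code_bytes, bytes_per_insn=4,
                         insns_per_branch=10, bytes_per_counter=8):
    """Size of the profiling counters, assuming one counter per
    conditional branch (all parameters are illustrative assumptions)."""
    instructions = code_bytes // bytes_per_insn    # static instructions
    branches = instructions // insns_per_branch    # conditional branches
    return branches * bytes_per_counter            # one counter each


code_bytes = 1_000_000                             # 1MB of generated code
print(shared_counter_bytes(code_bytes))            # 200000, i.e. 200KB
```

With these assumptions the counters take one fifth of the space of the
code they profile, and every thread updates the same 200KB.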

As mentioned, the usual mitigation is per-core counters.  With a
256-core machine, we now have 51.2MB of counters for 1MB of executable
code.  Now this is Java, so there might be quite a bit more executable
code and correspondingly more counters.  They eventually decided that
the benefit of reduced cache coherence traffic is not worth that cost
(or the cost of a hardware mechanism), as described in the last
paragraph, from which I cited the important parts.
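
The per-core figure follows from replicating the whole counter set once
per core.  A minimal Python sketch, again with hypothetical names and
the made-up numbers from the example (200KB of shared counters, 1MB of
code, both in decimal units):

```python
def per_core_counter_bytes(shared_bytes, cores):
    """Total counter memory when every core gets a private copy
    of all counters (illustrative model, not HotSpot's layout)."""
    return shared_bytes * cores


shared_bytes = 200_000      # counters for 1MB of code, from the example
code_bytes = 1_000_000      # the generated code itself
total = per_core_counter_bytes(shared_bytes, cores=256)

print(total / 1e6)          # 51.2 (MB of counters)
print(total / code_bytes)   # 51.2 (counter bytes per byte of code)
```

The cost scales linearly with the core count, so in this model the
counters outweigh the code they profile as soon as there are more than
five cores.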

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

