Subject : Re: Parallel Forth on a 44 core machine
From : anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups : comp.lang.forth
Date : 18. Aug 2024, 14:42:33
Organisation : Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID : <2024Aug18.154233@mips.complang.tuwien.ac.at>
References : 1 2 3 4 5
User-Agent : xrn 10.11
mhx@iae.nl (mhx) writes:
> What I meant is severe slowdown when reading variables that are
> physically *close* to variables that belong to another process.
That is known as false sharing. The cache coherence protocols work at
the granularity of a cache line (usually 64 bytes). If core A
writes to a variable, and core B, say, reads one in the same cache
line, the cache coherence protocol first gives core A the cache line
in modified state (every other core has to invalidate its copy of
that line), and then core B has to wait until core A sends out the
data to the other cores.
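To make the granularity concrete, here is a minimal sketch in C (not
Forth) that checks whether two variables fall into the same cache line.
The 64-byte line size is the usual value mentioned above and is an
assumption here; the struct and the helper names are hypothetical and
only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; check your CPU's actual line size. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical layout: two variables packed right next to each other. */
struct packed_counters {
    uint64_t a;   /* written by core A */
    uint64_t b;   /* read by core B */
};

/* Index of the cache line that an address (or byte offset) falls into. */
static uintptr_t cache_line_of(uintptr_t addr)
{
    return addr / CACHE_LINE_SIZE;
}

/* Returns 1 if both addresses lie in the same cache line, else 0. */
static int same_cache_line(const void *p, const void *q)
{
    return cache_line_of((uintptr_t)p) == cache_line_of((uintptr_t)q);
}
```

For an instance of struct packed_counters that starts on a 64-byte
boundary, same_cache_line(&c.a, &c.b) returns 1: the two members are
only 8 bytes apart, so every write to a by core A invalidates the line
that core B needs for reading b.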
> It happens for both AMD and Intel on both Windows and Linux.
> Spacing such variables farther apart has dramatic impact but
> is quite inconvenient in most cases.
Yes, but if you want performance, you have to rearrange your data to
avoid false sharing.
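One common way to rearrange the data is to give each core's variable a
cache line of its own. The following is a sketch in C (not Forth),
assuming 64-byte cache lines and C11 alignment support; NCORES,
per_core and counter_for are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; check your CPU's actual line size. */
#define CACHE_LINE_SIZE 64

/* Hypothetical number of cores, chosen only for the example. */
#define NCORES 4

/* Aligning the member to 64 bytes raises the struct's alignment to 64
   and pads its size up to 64, so each element fills one cache line. */
struct padded_counter {
    alignas(CACHE_LINE_SIZE) uint64_t value;
};

/* Compile-time checks that the layout is what we expect. */
static_assert(sizeof(struct padded_counter) == CACHE_LINE_SIZE,
              "each counter should occupy exactly one cache line");
static_assert(alignof(struct padded_counter) == CACHE_LINE_SIZE,
              "each counter should start on a cache-line boundary");

/* One counter per core; neighbouring elements never share a line. */
static struct padded_counter per_core[NCORES];

/* Returns the address of the counter that belongs to the given core. */
static uint64_t *counter_for(unsigned core)
{
    return &per_core[core].value;
}
```

The price is memory: each 8-byte counter now takes 64 bytes. That is
the inconvenience mentioned above, traded for cores that no longer
invalidate each other's cache lines.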
> I don't recall that transputers had these problems.
Transputers have no shared memory and therefore no cache coherence
protocols.
- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2024: https://euro.theforth.net