Re: Cost of handling misaligned access

Subject : Re: Cost of handling misaligned access
From : cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups : comp.arch
Date : 03. Feb 2025, 22:40:21
Organization : A noiseless patient Spider
Message-ID : <vnrd49$1f52h$2@dont-email.me>
User-Agent : Mozilla Thunderbird
On 2/3/2025 1:41 PM, Thomas Koenig wrote:
> EricP <ThatWouldBeTelling@thevillage.com> wrote:
>
>> That is fine for code that is being actively maintained and backward
>> data structure compatibility is not required (like those inside a kernel).
>>
>> However for x86 there were a few billion lines of legacy code that likely
>> assumed 2-byte alignment, or followed the fp64 aligned to 32-bits advice,
>> and a C language that mandates structs be laid out in memory exactly as
>> specified (no automatic struct optimization). Also I seem to recall some
>> amount of squawking about SIMD when it required naturally aligned buffers.
>> As SIMD no longer requires alignment, presumably code no longer does so.
>
> Looking at Intel's optimization manual, they state in
> "15.6 DATA ALIGNMENT FOR INTEL® AVX"
>
> "Assembly/Compiler Coding Rule 65. (H impact, M generality) Align
> data to 32-byte boundary when possible. Prefer store alignment
> over load alignment."
>
> and further down, about AVX-512,
>
> "18.23.1 Align Data to 64 Bytes"
>
> "Aligning data to vector length is recommended. For best results,
> when using Intel AVX-512 instructions, align data to 64 bytes.
>
> When doing a 64-byte Intel AVX-512 unaligned load/store, every
> load/store is a cache-line split, since the cache-line is 64
> bytes. This is double the cache line split rate of Intel AVX2
> code that uses 32-byte registers. A high cache-line split rate in
> memory-intensive code can cause poor performance."
>
> This sounds reasonable, and good advice if you want to go
> down SIMD lane.
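
The cache-line split claim in the quoted manual text can be checked with a little address arithmetic. The following is a minimal sketch in C, not anything from the manual itself; the function names are made up for illustration, and it assumes a power-of-two line size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if an access of `size` bytes starting at `addr` touches more than
   one cache line of `line` bytes. Assumes `line` is a power of two and
   `size` is at least 1. */
static bool crosses_cache_line(uintptr_t addr, size_t size, size_t line)
{
    uintptr_t mask  = ~(uintptr_t)(line - 1);
    uintptr_t first = addr & mask;               /* line holding first byte */
    uintptr_t last  = (addr + size - 1) & mask;  /* line holding last byte  */
    return first != last;
}

/* Count how many of the `line` possible byte offsets within a line make
   an access of `size` bytes split across two lines. */
static unsigned count_splitting_offsets(size_t size, size_t line)
{
    unsigned n = 0;
    for (size_t off = 0; off < line; off++) {
        if (crosses_cache_line((uintptr_t)off, size, line))
            n++;
    }
    return n;
}
```

With a 64-byte line, `count_splitting_offsets(64, 64)` is 63 (every offset except 0 splits), while `count_splitting_offsets(32, 64)` is 31, which is roughly the factor of two the manual describes.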
 
This is, ironically, a place where SIMD via ganged registers has an advantage over SIMD via large monolithic registers.
With ganged registers, one can load/store them piecewise as needed, and use unaligned loads/stores on the narrower pieces (with the larger forms being able to strictly require natural alignment).
Though, granted, large monolithic registers are a more popular option than ganged registers.
And, with monolithic registers, you can make the registers larger without either effectively halving the number of longer registers, or needing to double the number of shorter registers.
But, at the cost that many of the high-order bits of the registers will be essentially wasted for code operating on narrower vectors.
Say, if one has:
   64x 64-bit vectors (group of 1);
   32x 128-bit vectors (group of 2);
   16x 256-bit vectors (group of 4);
   8x 512-bit vectors (group of 8).
If one wanted a 1024-bit vector, there is a choice:
   Live with only 4 vectors;
   Expand the size of the register file to 128x 64-bit vectors;
   Live with asymmetric wonk:
     Parts of the register space only being accessible at larger sizes.
   ...
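
The counts in the list above are just the register file size divided by the group size. A small sketch in C of that arithmetic follows; the function names are hypothetical, and the mapping of vector number to first register (vector v starts at register v times the group size) is an assumption for illustration, not something a particular ISA is known to mandate.

```c
#include <stddef.h>

enum { BASE_REG_BITS = 64 };  /* width of one physical register */

/* Number of vectors of `vec_bits` bits that a file of `num_regs` 64-bit
   registers can provide by ganging adjacent registers.
   Returns 0 if the width is not a non-zero multiple of 64. */
static unsigned ganged_vector_count(unsigned num_regs, unsigned vec_bits)
{
    if (vec_bits == 0 || vec_bits % BASE_REG_BITS != 0)
        return 0;
    return num_regs / (vec_bits / BASE_REG_BITS);
}

/* Index of the first 64-bit register of vector `v`, under the assumed
   mapping where groups are contiguous and group-size aligned. */
static unsigned ganged_base_reg(unsigned v, unsigned vec_bits)
{
    return v * (vec_bits / BASE_REG_BITS);
}
```

For a 64-entry file, `ganged_vector_count(64, 1024)` gives 4, and doubling the file to 128 entries gives 8.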
Though, with monolithic registers, each doubling of the register size also effectively mandates either a whole new set of instructions to deal with the larger size, or some other way to encode or specify the size (or, "who knows, it is whatever it is, software can figure it out"...).
This is less true of ganged registers.
   Say, if the CPU supported it, they could add, say:
     PADDX4.F //256-bit Binary32 ADD
     PSUBX4.F //256-bit Binary32 SUB
     PMULX4.F //256-bit Binary32 MUL
     ...
   While leaving everything else the same as before.
     The addition of wider load/store operations being optional.
     Don't have 256-bit Ld/St, use 128-bit Ld/St.
       Need fully unaligned access, use 64-bit Ld/St's.
     ...
This also makes it easy for narrower implementations to simply crack the instructions into 128-bit vector operations internally (which may themselves be implemented as two 64-bit vector ops running in parallel).
But, say, the pipeline could be designed internally around 64-bit vector ops, with a 4-wide machine able to do 256-bit vector operations mostly by supporting a 64-bit vector operation on each lane.
And, you can more easily "pretend" in the compiler to have whichever vector size you want. Code asks for 256-bit vectors but the target only has 128? Just fake it using 128-bit ops.
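
As a rough software model of the cracking described above, the sketch below builds a 256-bit Binary32 add out of two 128-bit adds, each of which is two 64-bit adds. The type and function names are hypothetical stand-ins chosen for this example (loosely echoing the PADDX4.F mnemonic), and plain C structs stand in for registers; it is not meant as how any actual hardware or compiler does it.

```c
#include <stddef.h>

/* Hypothetical register models: one 64-bit register holds 2 x Binary32. */
typedef struct { float f[2]; } vec64f;
typedef struct { vec64f r[2]; } vec128f;  /* gang of 2 registers */
typedef struct { vec64f r[4]; } vec256f;  /* gang of 4 registers */

/* 64-bit vector add: the only operation a minimal lane needs. */
static vec64f padd_f(vec64f a, vec64f b)
{
    vec64f out;
    out.f[0] = a.f[0] + b.f[0];
    out.f[1] = a.f[1] + b.f[1];
    return out;
}

/* 128-bit add, modeled as two independent 64-bit adds. */
static vec128f paddx_f(vec128f a, vec128f b)
{
    vec128f out;
    out.r[0] = padd_f(a.r[0], b.r[0]);
    out.r[1] = padd_f(a.r[1], b.r[1]);
    return out;
}

/* 256-bit add "faked" on a target that only has the 128-bit form:
   crack into a low half and a high half. */
static vec256f paddx4_f(vec256f a, vec256f b)
{
    vec256f out;
    for (int half = 0; half < 2; half++) {
        vec128f x, y, s;
        x.r[0] = a.r[2 * half];
        x.r[1] = a.r[2 * half + 1];
        y.r[0] = b.r[2 * half];
        y.r[1] = b.r[2 * half + 1];
        s = paddx_f(x, y);
        out.r[2 * half]     = s.r[0];
        out.r[2 * half + 1] = s.r[1];
    }
    return out;
}
```

Since every level is built from the same 64-bit pieces, code written against `paddx4_f` does not change whether the underlying machine runs one, two, or four 64-bit lanes at a time.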
But, granted, most ISAs aren't doing SIMD this way.
...

>> Also in going from 32 to 64 bits, data structures that contain pointers
>> now could find those 8-byte pointers aligned on 4-byte boundaries.
>
> This is mandated by the relevant ABI, and ABIs usually mandate
> alignment on natural boundaries.
>
>> While the Linux kernel may not use many misaligned values,
>> I'd guess there is a lot of application code that does.
>
> Unless it is generating external binary data (a _very_ bad idea,
> XDR was developed for a reason), there is no big reason to use
> unaligned data, unless somebody is playing fast and loose
> with C pointer types, and that is a bad idea anyway.
 
Unaligned access is often needed for speed.

> Alternatively, a compiler could use it to implement something like
> memcpy or memmove when it knows that unaligned accesses are safe.
 
Basically required unless you want them to be slow.
The aligned-only versions will almost invariably be slower, potentially significantly slower.
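
To illustrate the point, here is a minimal sketch in C of a copy loop that moves 8 bytes at a time without caring about the alignment of either pointer. The helper names are made up for this example. It relies on the usual fixed-size `memcpy` idiom, which compilers for targets with cheap misaligned access typically reduce to a single load or store; that lowering is an assumption about the compiler, not something the C language guarantees.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load a 64-bit value from any address, aligned or not. */
static inline uint64_t load_u64_any(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store a 64-bit value to any address, aligned or not. */
static inline void store_u64_any(void *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Copy n bytes between non-overlapping buffers, 8 bytes per step
   regardless of alignment, with a byte loop for the tail. */
static void copy_words_unaligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 8) {
        store_u64_any(d, load_u64_any(s));
        d += 8;
        s += 8;
        n -= 8;
    }
    for (; n > 0; n--)
        *d++ = *s++;
}
```

An aligned-only version would need a byte-wise prologue to reach alignment, and when source and destination are misaligned relative to each other it would also need shift-and-merge logic or byte copies for the whole body, which is where the slowdown comes from.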

> But it would be really interesting to have access to a system
> where unaligned accesses trap, in order to find (and fix) ABI
> issues and some undefined behavior on the C side.
It may make sense to add some form of categorical separation:
   Pointers that may be unaligned;
   Pointers that must be aligned.
Trapping on unaligned being a reasonable option for the latter case.
Really needs to be per-pointer or per-access though, and not a global flag (which makes it kind of useless).
Some compilers have __aligned and __unaligned keywords.
Something like "[[aligned]]" and "[[unaligned]]" could also make sense, with the default likely depending on type and implementation...
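
Lacking such keywords in standard C, the per-access distinction can be approximated with two accessor functions. This is only a sketch: the function names are hypothetical, and the `assert` merely stands in for the hardware trap on the "must be aligned" path.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Non-zero if p is naturally aligned for a 32-bit access. */
static inline int is_aligned_u32(const void *p)
{
    return ((uintptr_t)p % sizeof(uint32_t)) == 0;
}

/* "May be unaligned" category: valid for any address. */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* "Must be aligned" category: the check models a trap on a misaligned
   address, applied per access rather than via a global flag. */
static inline uint32_t load_u32_aligned(const void *p)
{
    uint32_t v;
    assert(is_aligned_u32(p));
    memcpy(&v, p, sizeof v);
    return v;
}
```

Because the choice is made at each call site, code that parses packed external data can use the unaligned form while the rest of the program keeps the checked form, which is the per-pointer or per-access granularity argued for above.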
