Subject: Re: Cost of handling misaligned access
From: tkoenig (at) *nospam* netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Date: 17 Feb 2025, 11:00:20
Organization: A noiseless patient Spider
Message-ID: <vov1bk$13df5$1@dont-email.me>
User-Agent: slrn/1.0.3 (Linux)

On 2025-02-17, Terje Mathisen <terje.mathisen@tmsw.no> wrote:
> No, the real problem is when a compiler wants to auto-vectorize any code
> working with 1/2/4/8 byte items: All of a sudden the alignment
> requirement went from the item stride to the vector register stride
> (16/32/64 bytes).
>
> The only way this can work is to have the compiler control _all_
> allocations to make sure they are properly aligned, including code in
> libraries, or the compiler will be forced to use vector load/store
> operations which do allow unaligned access.

Not necessarily the compiler's choice - compiler-generated code
has to deal with everything that conforms to the ABI, and if that
specifies 8-byte aligned pointers to doubles, the compiler cannot
assume otherwise unless directed.

Loop peeling might help, but becomes difficult when more than
one pointer is involved. Consider a dot product calculation
which you want to vectorize with 256-bit SIMD instructions,
with pointers a and b.

You then have to deal with the case (uintptr_t) a % 32 == 8
and (uintptr_t) b % 32 == 24, for example: peeling scalar
iterations until a is 32-byte aligned still leaves b misaligned.
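
To make the peeling problem concrete, here is a minimal sketch in portable C, not
taken from any particular compiler. The names peel_count and dot are made up for
illustration, and the code assumes sizeof(double) == 8 and 8-byte aligned pointers.
The scalar prologue brings a to a 32-byte boundary; b reaches one at the same time
only if it started with the same misalignment as a.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of leading elements to peel so that p reaches a 32-byte boundary.
   Assumes p is at least 8-byte aligned, so the misalignment is 0, 8, 16 or 24. */
static size_t peel_count(const double *p)
{
    size_t mis = (uintptr_t) p % 32;
    return ((32 - mis) % 32) / sizeof(double);
}

double dot(const double *a, const double *b, size_t n)
{
    size_t peel = peel_count(a);
    if (peel > n)
        peel = n;

    double sum = 0.0;
    size_t i = 0;

    /* Scalar prologue: runs until a + i is 32-byte aligned (or n is reached). */
    for (; i < peel; i++)
        sum += a[i] * b[i];

    /* Main loop: four doubles (256 bits) per iteration.
       a + i is 32-byte aligned here.  b + i is aligned only if
       (uintptr_t) b % 32 == (uintptr_t) a % 32 held on entry; otherwise
       a vectorized version needs unaligned loads for b. */
    double acc[4] = { 0.0, 0.0, 0.0, 0.0 };
    for (; i + 4 <= n; i += 4) {
        acc[0] += a[i]     * b[i];
        acc[1] += a[i + 1] * b[i + 1];
        acc[2] += a[i + 2] * b[i + 2];
        acc[3] += a[i + 3] * b[i + 3];
    }
    sum += acc[0] + acc[1] + acc[2] + acc[3];

    /* Scalar epilogue for the remaining 0 to 3 elements. */
    for (; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

With three pointers (say, an output array as well) the number of relative
misalignment cases grows further, which is why a single peeled prologue stops
being enough.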

Or you can use an extension, like __attribute__ ((aligned(32))), to
guarantee the alignment yourself.
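
A short sketch of what that can look like with GCC or Clang (both are compiler
extensions, not standard C). The array names, their size, and the function
dot_aligned are chosen here purely for illustration. The attribute aligns the
objects themselves; __builtin_assume_aligned passes the same promise through a
plain pointer, and the behaviour is undefined if the promise is false.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: place both arrays on 32-byte boundaries. */
static double a[1024] __attribute__ ((aligned (32)));
static double b[1024] __attribute__ ((aligned (32)));

double dot_aligned(const double *x, const double *y, size_t n)
{
    /* Tell the compiler that x and y are 32-byte aligned.  This is a
       promise made by the caller, not something the compiler checks. */
    const double *xa = __builtin_assume_aligned(x, 32);
    const double *ya = __builtin_assume_aligned(y, 32);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += xa[i] * ya[i];
    return sum;
}
```

Called as dot_aligned(a, b, 1024), the loop can be vectorized without a peeling
prologue for alignment, since the compiler is allowed to assume that both
pointers start on a 32-byte boundary.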