EricP <ThatWouldBeTelling@thevillage.com> wrote:
> SIMD absolutely requires, as a minimum, the ability to handle data
> that is only aligned according to the internal elements: An array of
> double can start on any address which is 0 mod 8, similar for
> float/u32 etc. This way you can go from 128 via 256 to 512 bit SIMD
> regs with no data alignment change.
>
> While the Linux kernel may not use many misaligned values,
> I'd guess there is a lot of application code that does.

I guess that much of that is simply "by accident" because
without alignment checks in hardware misalignment may happen
and nobody notices that there is a small performance problem.
I worked on a low-level program and fairly recently I got a
bunch of alignment errors. On AMD64 they were due to SSE
instructions used by 'memcpy', on 32-bit ARM due to the use of
double-precision floating point in 'memcpy'. It took some time to
find them: most things simply worked even without alignment, and
the offending cases were hard to trigger.
My personal feeling is that the best machine would have aligned
access with checks by default, but also special instructions
for unaligned access. That way code that does not need
unaligned access gets extra error checking, while code that
uses unaligned access pays a modest, essentially unavoidable
penalty.
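
A rough software analogue of that split, as a sketch only: the helper
names (is_aligned, load_u32_checked, load_u32_unaligned) are made up
for illustration, and the check is done in C rather than by a trapping
load instruction as I would prefer in hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if p is a multiple of `align`,
   which must be a power of two. */
static bool is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* "Default" access: refuses a misaligned address instead of silently
   accepting it. Returns false (and leaves *out alone) on misalignment. */
static bool load_u32_checked(const void *p, uint32_t *out)
{
    if (!is_aligned(p, sizeof(uint32_t)))
        return false;
    memcpy(out, p, sizeof *out);
    return true;
}

/* "Special instruction" access: the caller states explicitly that the
   address may be misaligned. memcpy keeps this well defined in C. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Code that never expects misalignment would call only load_u32_checked
and so get the error reported at the first bad address, instead of
depending on which instructions 'memcpy' happens to use.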
Of course, once an architecture officially supports unaligned
access, there will be binaries depending on this, and backward
compatibility will prevent a change to require alignment.
Concerning SIMD: the trouble here is increasing vector length and
consequently increasing alignment requirements. A lot of SIMD
code is memory-bound, and the current way of doing misaligned
access leads to worse performance. So there is really no good way
to solve this. In principle, a set of buffers of 2 cache lines
each, with appropriate shifters, could give optimal throughput,
but would probably lead to increased latency.
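
To make the "two buffers plus shifter" idea concrete, here is a scaled-down
sketch in C: 64-bit words stand in for cache lines, and the function
names (funnel_shift, load_u64_via_aligned) are invented for this
example. It assumes little-endian byte numbering within a word and says
nothing about how real hardware would implement it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine two adjacent aligned 64-bit words into the 64-bit value that
   starts `offset` bytes into the first one (little-endian numbering). */
static uint64_t funnel_shift(uint64_t lo, uint64_t hi, unsigned offset)
{
    assert(offset < 8);
    if (offset == 0)
        return lo;  /* avoid a shift by 64, which is undefined in C */
    return (lo >> (8 * offset)) | (hi << (8 * (8 - offset)));
}

/* Misaligned 64-bit load built only from aligned word reads.
   The caller must ensure words[byte_addr / 8 + 1] exists whenever
   byte_addr is not a multiple of 8. */
static uint64_t load_u64_via_aligned(const uint64_t *words, size_t byte_addr)
{
    size_t idx = byte_addr / 8;
    unsigned off = (unsigned)(byte_addr % 8);
    uint64_t lo = words[idx];
    uint64_t hi = off ? words[idx + 1] : 0;  /* second buffer only if needed */
    return funnel_shift(lo, hi, off);
}
```

For sequential access each aligned word is fetched once and reused as
the low half of the next result, which is where the throughput would
come from; the extra shift-and-merge stage is the latency cost.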