On 2025-02-03, Anton Ertl wrote:
> BGB <cr88192@gmail.com> writes:
>> On 2/2/2025 10:45 AM, EricP wrote:
>>> Digging deeper with performance counters reveals executing each unaligned
>>> load instruction results in ~505 executed instructions. P550 almost
>>> certainly doesn’t have hardware support for unaligned accesses.
>>> Rather, it’s likely raising a fault and letting an operating system
>>> handler emulate it in software."
>>
>> An emulation fault, or something similarly nasty...
>>
>> At that point, even turning any potentially unaligned load or store into
>> a runtime call is likely to be a lot cheaper.
>
> There are lots of potentially unaligned loads and stores. There are
> very few actually unaligned loads and stores: On Linux-Alpha every
> unaligned access is logged by default, and the number of
> unaligned-access entries in the logs of our machines was relatively
> small (on average a few per day). So trapping actual unaligned
> accesses was faster than replacing potential unaligned accesses with
> code sequences that synthesize the unaligned access from aligned
> accesses.

Pretty much.

If you compile regular C/C++ code that does not intentionally do any
nasty stuff, you will typically have zero unaligned loads and stores.
My machine still does not support unaligned accesses in hardware (it's
on the todo list), and it can run an awful lot of software without
problems.
The problem arises when the programmer *deliberately* does unaligned
loads and stores in order to improve performance. Or rather, if the
programmer knows that the hardware supports unaligned loads and stores,
he/she can use that to write faster code in some special cases.
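
To make the two options concrete, here is a minimal sketch in C. The
function names are made up for illustration. The first variant
synthesizes a 32-bit little-endian load from byte loads, so it never
issues a misaligned access. The second leaves the decision to the
compiler, which typically emits a single load on targets where it
knows unaligned accesses are supported.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a little-endian 32-bit value from four
   byte loads. No misaligned access is ever issued, whatever p is. */
static uint32_t load32_bytes(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: host-endian 32-bit load from any address.
   memcpy with a constant size is normally inlined by the compiler. */
static uint32_t load32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Note that the memcpy variant returns the value in host byte order, so
the two only agree on a little-endian machine.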
> Of course, if the cost of unaligned accesses is that high, you will
> avoid them in cases like block copies where cheap unaligned accesses
> would otherwise be beneficial.
>
> - anton
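
As an illustration of the block-copy case, one common trick is to copy
a short block with two 8-byte moves that may overlap. This is only a
sketch with a hypothetical function name, not anything taken from the
posts above. The second move starts at src + n - 8, which is usually
not 8-byte aligned, so the trick only pays off when unaligned accesses
are cheap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes, where 8 <= n <= 16, using two
   8-byte moves that overlap in the middle when n < 16. */
static void copy_8_to_16(void *dst, const void *src, size_t n)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    uint64_t head, tail;

    memcpy(&head, s, 8);          /* first 8 bytes */
    memcpy(&tail, s + n - 8, 8);  /* last 8 bytes, usually unaligned */
    memcpy(d, &head, 8);
    memcpy(d + n - 8, &tail, 8);
}
```

On a machine that traps on every misaligned access, a byte loop or an
aligned copy with shifting would be the safer choice.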