Re: Cost of handling misaligned access

Subject: Re: Cost of handling misaligned access
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 2 Feb 2025, 18:44:58
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Feb2.184458@mips.complang.tuwien.ac.at>
References: 1
User-Agent: xrn 10.11

EricP <ThatWouldBeTelling@thevillage.com> writes:
> The incremental cost is in a sequencer in the AGU for handling cache
> line and possibly virtual page straddles, and a small byte shifter to
> left shift the high order bytes. The AGU sequencer needs to know if the
> line straddles a page boundary, if not then increment the 6-bit physical
> line number within the 4 kB physical frame number, if yes then increment
> virtual page number and TLB lookup again and access the first line.
> (Slightly more if multiple page sizes are supported, but same idea.)
> For a load AGU merges the low and high fragments and forwards.
> ...
> The hardware cost appears trivial, especially within an OoO core.
> So there doesn't appear to be any reason to not handle this.
> Am I missing something?

The OS must also be able to keep both pages in physical memory until
the access is complete, or there will be no progress.  Should not be a
problem these days, but the 48 pages or so potentially needed by VAX
complicated the OS.
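
As a rough illustration of the straddle cases discussed above, here is a
minimal C sketch. It is not a model of any real AGU. The 64-byte line
size, the names LINE_BYTES and PAGE_BYTES, and the helpers straddles()
and merge_fragments() are all assumptions made for this example; only
the 4 kB page size comes from the quoted text.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES 64u    /* assumed cache line size */
#define PAGE_BYTES 4096u  /* 4 kB page, as in the quoted text */

/* True if an access of `size` bytes (size >= 1) starting at `addr`
   crosses a boundary of `unit` bytes; `unit` must be a power of two. */
static bool straddles(uint64_t addr, unsigned size, unsigned unit)
{
    return (addr & (unit - 1)) + size > unit;
}

/* Merge the two fragments of a little-endian load that straddles.
   `lo` holds the low `lo_bytes` bytes (1..7) taken from the first line,
   with its upper bytes zero; `hi` holds the bytes from the second line.
   The left shift plays the role of the "small byte shifter". */
static uint64_t merge_fragments(uint64_t lo, uint64_t hi, unsigned lo_bytes)
{
    return lo | (hi << (8 * lo_bytes));
}
```

Under these assumptions, a 4-byte access at 0x103e straddles a line but
not a page, while one at 0x1ffe straddles both, so both pages must be
resident for it to complete.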

Yes, hardware is not hard, there is software that benefits, and as a
result, modern architectures (including RISC-V) now support unaligned
accesses (except for atomic accesses).

> https://old.chipsandcheese.com/2025/01/26/inside-sifives-p550-microarchitecture/
> ...
> "This terrible unaligned access behavior is atypical even for low power
> cores. Arm's Cortex A75 only takes 15 cycles in the worst case of
> dependent accesses that are both misaligned.
>
> Digging deeper with performance counters reveals executing each unaligned
> load instruction results in ~505 executed instructions.

This is similar to what I measured on a U74 core from SiFive
<2024May14.073553@mips.complang.tuwien.ac.at>, so they probably use
the same solution.

> P550 almost certainly doesn't have hardware support for unaligned
> accesses. Rather, it's likely raising a fault and letting an operating
> system handler emulate it in software."

The architecture guarantees that unaligned accesses work, so the OS
might not have support for such emulation.  Another option would be to
trap into some kind of firmware-supplied fixup code, along the lines
of Alpha's PALcode.
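
For illustration only, the core of such a software fixup for a load can
be sketched in C as below. This is not SiFive's firmware or any OS's
actual handler; the function name emulate_misaligned_load() is
hypothetical, and little-endian byte order is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a little-endian value of `size` bytes
   (1..8) from individual byte loads, which can never be misaligned. */
static uint64_t emulate_misaligned_load(const uint8_t *addr, size_t size)
{
    uint64_t value = 0;
    for (size_t i = 0; i < size; i++)
        value |= (uint64_t)addr[i] << (8 * i);
    return value;
}
```

A real handler has considerably more to do than this loop: it has to
save state, decode the faulting instruction, check that the accessed
addresses are valid, sign- or zero-extend the result, write the
destination register, and return to the interrupted code.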

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

