Re: Cost of handling misaligned access

Subject: Re: Cost of handling misaligned access
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 04 Feb 2025, 23:17:33
Organization: Rocksolid Light
Message-ID: <8b692b1e78aca51552dfabc9bad80146@www.novabbs.org>
User-Agent: Rocksolid Light

On Tue, 4 Feb 2025 20:49:14 +0000, BGB wrote:

> On 2/4/2025 1:25 PM, Scott Lurndal wrote:
>> mitchalsup@aol.com (MitchAlsup1) writes:
-------------------
>> Comparing to the CISC architectures of the 60s and 70s,
>> it's not horrible.
>
> Well, vs a modern RISC style ISA, say, caller side:
>    MOV  R20, R10  //0c (SSC with following)
>    MOV  R21, R11  //1c
>    BSR  func     //2c (typically)
> Cost: 3 cycles.

     MOV  R1,R30
     MOV  R2,R28
     CALL func
3 instructions, which might take 1 cycle on a 3-wide machine. And when
BSR/CALL is visible at FETCH 2 cycles before it DECODEs, the
call overhead is 0 cycles.

> func:
>    ADD SP, -32, SP      //2c (1c penalty)
>    MOV.Q  LR, (SP, 24)  //1c
>    MOV.X  R18, (SP, 0)  //1c
>    ...
>    MOV.Q  (SP, 24), LR  //2c (1c penalty)
>    MOV.X  (SP, 0), R18  //1c
>    JMP    LR            //10c (*1)
>
> *1: Insufficient delay since LR reload, so branch predictor fails to
> handle this case.

This should be handled "just fine" by the call/return predictor.
It should not go through the indirect-branch predictor.
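
For readers unfamiliar with the distinction: a call/return predictor is
commonly built as a small return-address stack (RAS). Each call pushes the
address of the instruction after it, and each return pops the top entry as
its predicted target, so the prediction does not have to wait for LR to be
reloaded from memory. The Python sketch below is only a toy model of that
idea; the class name, the depth of 8, and the addresses are hypothetical and
do not describe either machine discussed here.

```python
class ReturnAddressStack:
    """Toy return-address stack (RAS); depth and policy are assumptions."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = []

    def on_call(self, return_address):
        # A call pushes the address of the instruction that follows it.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # overflow: drop the oldest entry
        self.entries.append(return_address)

    def predict_return(self):
        # A return pops the most recent entry; None means "no prediction".
        return self.entries.pop() if self.entries else None


ras = ReturnAddressStack()
ras.on_call(0x1008)            # outer call (hypothetical address)
ras.on_call(0x2010)            # nested call (hypothetical address)
inner = ras.predict_return()   # target predicted for the nested return
outer = ras.predict_return()   # target predicted for the outer return
print(hex(inner), hex(outer))
```

Returns are predicted in the reverse order of the calls, which is why a
matched call/return pair does not need the indirect-branch predictor.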

> Cost: 16 cycles.

func:
     ENTER  R30,R1,#32
     ...
     EXIT   R30,R1,#32
That is 9 instructions on your machine and 5 on mine. Also note that my
ISA loads the return address directly into IP, so FETCH can begin while
the other LDs are in progress. So, for the same amount of work, it
would take only 3 cycles (with a bunch of caveats).
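
The instruction counts can be checked with a quick tally. The Python sketch
below only counts the instructions listed in the two sequences in this post
(caller side plus entry and exit code, with the elided function body left
out); the variable and function names are illustrative.

```python
# Instruction sequences as listed in this post; the body ("...") is excluded.
bgb_sequence = {
    "caller":   ["MOV", "MOV", "BSR"],
    "prologue": ["ADD", "MOV.Q", "MOV.X"],
    "epilogue": ["MOV.Q", "MOV.X", "JMP"],
}
mitch_sequence = {
    "caller":   ["MOV", "MOV", "CALL"],
    "prologue": ["ENTER"],
    "epilogue": ["EXIT"],
}


def instruction_count(sequence):
    # Total number of instructions across caller, prologue, and epilogue.
    return sum(len(ops) for ops in sequence.values())


print(instruction_count(bgb_sequence), instruction_count(mitch_sequence))
```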
But in any event, these are about as low as one can expect, whereas
the 432 is close to 1000 cycles; we all complained about the VAX
when it was in the 20-30 cycle range of overhead.
As to why: the 432 changed the capability maps at call and return,
and since these were not cached,... the caller cannot see some of the
capabilities the callee has access to, and vice versa. With much
better caching of capabilities and modern bus widths, the 432 might only
be in the 40-50 cycle range of overhead.
Moral: do not do way more work than required.

....

