Re: Cost of handling misaligned access

Subject: Re: Cost of handling misaligned access
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.arch
Date: 05 Feb 2025, 05:55:14
Organization: A noiseless patient Spider
Message-ID: <vnuqvj$28pai$1@dont-email.me>
User-Agent: Mozilla Thunderbird
On 2/4/2025 4:17 PM, MitchAlsup1 wrote:
On Tue, 4 Feb 2025 20:49:14 +0000, BGB wrote:
 
On 2/4/2025 1:25 PM, Scott Lurndal wrote:
mitchalsup@aol.com (MitchAlsup1) writes:
-------------------
Comparing to the CISC architectures of the 60s and 70s,
it's not horrible.
>
>
Well, vs a modern RISC style ISA, say, caller side:
   MOV  R20, R10  //0c (SSC with following)
   MOV  R21, R11  //1c
   BSR  func     //2c (typically)
Cost: 3 cycles.
   MOV  R1,R30
   MOV  R2,R28
   CALL func
3 instructions, might be 1 cycle on a 3-wide machine. And when
BSR/CALL is visible at FETCH 2 cycles before it DECODEs, the
call overhead is 0 cycles.
 
In my case, branch costs are:
   2c: Taken, correctly predicted.
   1c: Not taken, correctly predicted.
   10c: Not predicted.
     2c: Branch proper;
     8c: Pipeline flush.
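
A rough sketch of the cost table above, in Python. The function name and its arguments are made up for illustration, and it assumes that any branch the predictor does not cover pays the full 10 cycles regardless of direction:

```python
def branch_cost(taken: bool, predicted: bool) -> int:
    """Cycle cost of one branch, using the figures quoted above."""
    BRANCH_PROPER = 2    # cycles for the branch itself
    PIPELINE_FLUSH = 8   # extra cycles when the prediction was missing or wrong
    if not predicted:
        # Assumption: the 10c case applies to taken and not-taken alike.
        return BRANCH_PROPER + PIPELINE_FLUSH
    return 2 if taken else 1
```

So a correctly predicted taken branch is `branch_cost(True, True)`, and the unpredicted case is `branch_cost(True, False)`.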

func:
   ADD SP, -32, SP      //2c (1c penalty)
   MOV.Q  LR, (SP, 24)  //1c
   MOV.X  R18, (SP, 0)  //1c
   ...
   MOV.Q  (SP, 24), LR  //2c (1c penalty)
   MOV.X  (SP, 0), R18  //1c
   JMP    LR            //10c (*1)
>
*1: Insufficient delay since LR reload, so branch predictor fails to
handle this case.
 This should be call/return predicted "just fine".
It should not be indirect predictor predicted.
 
No special call/return predictor here.
For JMP LR:
   2c: If no modification to LR exists in the ID2 or EXn stages.
   10c: If a modification exists.
For JMP Rn (generic):
   10 cycles.
LR is special and has a hot-path into the branch predictor.
But, it can only be used if LR has not been modified within a certain window.
In this example, it fails mostly because there are not enough cycles of delay (need roughly 5 instructions between the LR reload and JMP).
This is basically needed to avoid predicting the branch using a potentially stale value.
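
As a hedged sketch of that rule in Python: the real check looks at whether an LR write is in flight in the ID2 or EXn stages, whereas this approximates it as an instruction count, using the "roughly 5 instructions" figure as the window. `LR_SETTLE_WINDOW` and `jmp_cost` are hypothetical names:

```python
from typing import Optional

LR_SETTLE_WINDOW = 5  # assumed from "roughly 5 instructions" in the text

def jmp_cost(target_reg: str, instrs_since_lr_write: Optional[int] = None) -> int:
    """Approximate cycle cost of a register-indirect jump.

    instrs_since_lr_write: number of instructions between the last write
    to LR and the jump, or None if LR was not written recently.
    """
    if target_reg != "LR":
        return 10  # generic JMP Rn: no fast path into the predictor
    if instrs_since_lr_write is not None and instrs_since_lr_write < LR_SETTLE_WINDOW:
        return 10  # LR may be stale, so the predictor cannot use it
    return 2       # LR is stable, predicted like a normal taken branch
```

In the epilogue shown earlier there is only one instruction between the LR reload and the jump, so `jmp_cost("LR", 1)` lands in the 10-cycle case.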
For RV's JALR, it additionally requires that the displacement be 0.
For a moment, I was left feeling unsure about the use-case for the displacement on JALR, but then remembered RV can compose longer branches via AUIPC+JALR.
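
The AUIPC+JALR pairing works because AUIPC adds a 20-bit immediate shifted left by 12 to the PC, and JALR then adds its sign-extended 12-bit displacement. A minimal Python sketch of how an offset gets split between the two; `split_pcrel` and `rejoin` are hypothetical helper names, not part of any toolchain:

```python
def split_pcrel(offset: int):
    """Split a signed PC-relative offset into (hi20, lo12).

    hi20 goes in AUIPC, lo12 in JALR. Adding 0x800 before the shift
    compensates for JALR sign-extending its 12-bit displacement.
    """
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    return hi20, lo12

def rejoin(hi20: int, lo12: int) -> int:
    """Recompute the offset the way the hardware would: (hi20 << 12) + lo12."""
    return (hi20 << 12) + lo12
```

For example, `split_pcrel(0x1800)` gives `(2, -2048)`: the upper part rounds up and the displacement goes negative to make up the difference.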

Cost: 16 cycles.
 func:
     ENTER  R30,R1,#32
     ...
     EXIT   R30,R1,#32
 9 instructions on your machine, 5 on mine; also note: my ISA loads
the return address directly into IP so FETCH can begin while the
other LDs are in progress:: So, for the same amount of work, it
would take only 3 cycles (with a bunch of caveats).
 But in any event, these are down about as low as one can expect
whereas 432 is close to 1000 cycles, we all complained about VAX
when it was in the 20-30 cycle range of overhead.
as to why:: 432 changed the capabilities maps at call and return,
and since these were not cached,... caller cannot see some of the
capabilities the callee has access to, and vice versa. With a lot
better caching of capabilities and modern bus widths, 432 might only
be in the 40-50 cycle range of overhead.
 Moral:: Do not do way more work than required.
 
Yeah, basically.

....
