Re: Stack vs stackless operation

Subject: Re: Stack vs stackless operation
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Date: 26 Feb 2025, 18:46:13
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Feb26.184613@mips.complang.tuwien.ac.at>
User-Agent: xrn 10.11

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>mhx@iae.nl (mhx) writes:
>>: :=: ( a b -- ) \ exchange values among two variables
>>OVER @ >R  DUP @   ROT !   R> SWAP ! ;
>
>Another variant:
>
>: exchange ( addr1 addr2 -- )
>    dup @ rot !@ swap ! ;
>
>This uses the primitive
>
>'!@' ( u1 a-addr -- u2 ) gforth-experimental "store-fetch"
>   load U2 from A_ADDR, and store U1 there, as atomic operation
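
A side note for readers whose Forth system does not provide !@: going by the stack effect given above, a non-atomic stand-in can be written in standard Forth. This is only a sketch. The name store-fetch-na is made up here, and unlike the primitive it gives no atomicity guarantee.

```forth
\ Hypothetical non-atomic stand-in for !@ (name invented for this sketch).
: store-fetch-na ( u1 a-addr -- u2 )
    dup @ >r    \ fetch the old value U2 and park it on the return stack
    !           \ store U1 at A_ADDR
    r> ;        \ hand back the old value

variable x   5 x !
7 x store-fetch-na .   x @ .   \ prints 5 7
```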

I worry that the atomic part will result in it being slower than the
versions that do not use !@.  Let's measure that:

: exchange ( addr1 addr2 -- )
    over @ swap !@ swap ! ;

: :=: ( addr1 addr2 -- )
    OVER @ >R  DUP @   ROT !   R> SWAP ! ;

: bench-exchange ( addr1 addr2 -- )
    100000000 0 do 2dup exchange loop ;

: bench-:=: ( addr1 addr2 -- )
    100000000 0 do 2dup :=: loop ;

variable v1
variable v2

1 v1 !
2 v2 !
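
Before timing anything, it may be worth checking that both words really swap the two cells. The following sketch is not part of the benchmark file; the variables t1 and t2 are introduced only for this check, and the !@ version is compiled only where the system has that word (tested with the standard [DEFINED]).

```forth
: :=: ( addr1 addr2 -- )
    over @ >r  dup @  rot !  r> swap ! ;

[defined] !@ [if]
: exchange ( addr1 addr2 -- )
    over @ swap !@ swap ! ;
[then]

variable t1   variable t2
1 t1 !   2 t2 !

t1 t2 :=:   t1 @ . t2 @ .          \ prints 2 1
[defined] exchange [if]
t1 t2 exchange   t1 @ . t2 @ .     \ swaps back: prints 1 2
[then]
```

Note that the benchmark loops run an even number of times, so v1 and v2 end up with their initial values again.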

Measurement with
perf stat -e cycles -e instructions gforth-fast xxxx.fs -e "v1 v2 bench-exchange bye"
perf stat -e cycles -e instructions gforth-fast xxxx.fs -e "v1 v2 bench-:=: bye"

Results on a Zen4:

       exchange              :=:
       877_054_156       812_761_422      cycles
     3_708_692_329     3_908_642_117      instructions

So the !@ variant is indeed slower, but only a little (about 0.64
cycles per execution of these words); however, I would expect either a
big slowdown (from latency when dealing with the memory subsystem,
broadcasting to other cores, etc.) or none at all.
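
The per-iteration figures can be rechecked from the perf counts in the table. The sketch below does the arithmetic in Forth itself, scaled by 1000 so that integers suffice. The word per-iter*1000 and the constant iterations are names invented for this check, and a 64-bit system is assumed.

```forth
\ Per-iteration differences, scaled by 1000 to keep three decimals.
\ Assumes 64-bit cells: the instruction counts do not fit in 32 bits.
100000000 constant iterations

: per-iter*1000 ( count1 count2 -- n )
    - 1000 iterations */ ;   \ (count1-count2)*1000/iterations

877054156  812761422  per-iter*1000 .   \ cycles, exchange minus :=:  -> 642
3908642117 3708692329 per-iter*1000 .   \ instructions, :=: minus exchange -> 1999
```

That is about 0.64 additional cycles per iteration for the !@ version, even though it executes about 2 fewer instructions per iteration.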

And here's the code:
see-code exchange                     see-code :=:                          
$7EFDC12A06A8 over    1->2            $7FBD6B6A06A8 over    1->2            
7EFDC0DEA3B0:   mov     r15,$08[r10]  7FBD6B26B3B0:   mov     r15,$08[r10]  
$7EFDC12A06B0 @    2->2               $7FBD6B6A06B0 @    2->2               
7EFDC0DEA3B4:   mov     r15,[r15]     7FBD6B26B3B4:   mov     r15,[r15]     
$7EFDC12A06B8 swap    2->1            $7FBD6B6A06B8 >r    2->1              
7EFDC0DEA3B7:   mov     [r10],r15     7FBD6B26B3B7:   mov     -$08[r14],r15 
7EFDC0DEA3BA:   sub     r10,$08       7FBD6B26B3BB:   sub     r14,$08       
$7EFDC12A06C0 !@    1->1              $7FBD6B6A06C0 dup    1->2             
7EFDC0DEA3BE:   mov     rax,$08[r10]  7FBD6B26B3BF:   mov     r15,r13       
7EFDC0DEA3C2:   add     r10,$08       $7FBD6B6A06C8 @    2->2               
7EFDC0DEA3C6:   xchg    $00[r13],rax  7FBD6B26B3C2:   mov     r15,[r15]     
7EFDC0DEA3CA:   mov     r13,rax       $7FBD6B6A06D0 rot    2->3             
$7EFDC12A06C8 swap    1->2            7FBD6B26B3C5:   mov     r9,$08[r10]   
7EFDC0DEA3CD:   mov     r15,$08[r10]  7FBD6B26B3C9:   add     r10,$08       
7EFDC0DEA3D1:   add     r10,$08       $7FBD6B6A06D8 !    3->1               
$7EFDC12A06D0 !    2->0               7FBD6B26B3CD:   mov     [r9],r15      
7EFDC0DEA3D5:   mov     [r15],r13     $7FBD6B6A06E0 r>    1->2              
$7EFDC12A06D8 ;s    0->1              7FBD6B26B3D0:   mov     r15,[r14]     
7EFDC0DEA3D8:   mov     r13,$08[r10]  7FBD6B26B3D3:   add     r14,$08       
7EFDC0DEA3DC:   add     r10,$08       $7FBD6B6A06E8 swap    2->3            
7EFDC0DEA3E0:   mov     rbx,[r14]     7FBD6B26B3D7:   add     r10,$08       
7EFDC0DEA3E3:   add     r14,$08       7FBD6B26B3DB:   mov     r9,r13        
7EFDC0DEA3E7:   mov     rax,[rbx]     7FBD6B26B3DE:   mov     r13,[r10]     
7EFDC0DEA3EA:   jmp     eax           $7FBD6B6A06F0 !    3->1               
                                      7FBD6B26B3E1:   mov     [r9],r15      
                                      $7FBD6B6A06F8 ;s    1->1              
                                      7FBD6B26B3E4:   mov     rbx,[r14]     
                                      7FBD6B26B3E7:   add     r14,$08       
                                      7FBD6B26B3EB:   mov     rax,[rbx]     
                                      7FBD6B26B3EE:   jmp     eax           

The difference looks bigger than it is: There are lines for 4
additional primitives (no influence on performance) and 2 additional
instructions, resulting in a 6-line difference.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/
