EXECUTE implementation in native-code systems (was: nest-sys revisited)

Subject: EXECUTE implementation in native-code systems (was: nest-sys revisited)
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Date: 17 Mar 2025, 07:12:38
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Mar17.071238@mips.complang.tuwien.ac.at>
User-Agent: xrn 10.11

dxf <dxforth@gmail.com> writes:
>Would you agree 'nest-sys' are peculiar to colon definitions.  That
>EXECUTE is a different class of function.  It's not doing a 'call'
>as such and not leaving anything on the 'return stack'?

That's certainly the case for threaded-code implementations.

For native-code implementations the implementation of EXECUTE is
usually an indirect call; sometimes an indirect tail-call, i.e. a
jump.

In VFX64 5.43:

: foo execute ;  ok
see foo
FOO
( 0050A250    488BD3 )                MOV     RDX, RBX
( 0050A253    488B5D00 )              MOV     RBX, [RBP]
( 0050A257    488D6D08 )              LEA     RBP, [RBP+08]
( 0050A25B    48FFD2 )                CALL    RDX
( 0050A25E    C3 )                    RET/NEXT

However:

see execute
EXECUTE
( 004211B0    53 )                    PUSH    RBX
( 004211B1    488B5D00 )              MOV     RBX, [RBP]
( 004211B5    488D6D08 )              LEA     RBP, [RBP+08]
( 004211B9    C3 )                    RET/NEXT
( 10 bytes, 4 instructions )

The push-ret combination is an extremely slow form of an indirect
jump; so where is the return address (nest-sys) here?  It's the return
address of the surrounding call.  E.g., if you do

' + ' execute foo

the return address is the one pushed by the CALL RDX in FOO.


SwiftForth 4.0.0-RC89:

see foo
4519B7   4028CB ( EXECUTE ) JMP         E90F0FFBFF ok

That's a tail-call to EXECUTE.  When EXECUTE is not tail-called, the
code of EXECUTE is invoked with call:

: bar execute . ;  ok                                                         
see bar                                                                       
4519D3   4028CB ( EXECUTE ) CALL        E8F30EFBFF
4519D8   40B043 ( . ) JMP               E96696FBFF ok

see execute
4028CB   RBX RCX MOV                    488BCB
4028CE   0 [RBP] RBX MOV                488B5D00
4028D2   8 [RBP] RBP LEA                488D6D08
4028D6   4028DD JRCXZ                   E305
4028D8   RDI RCX ADD                    4801F9
4028DB   RCX JMP                        FFE1
4028DD   RET                            C3 ok

This special-cases 0 EXECUTE as a NOOP (the JRCXZ skips to the RET),
and also adds an offset (the image start?) to the xt before performing
the indirect jump, but if you ignore those parts, this EXECUTE does
the same things as VFX's, except that it uses the much faster indirect
jmp rather than push-ret.


lxf 1.7-172-983:
see foo                                                                       
 8692BC4  8050E6E  11  88C8000   5 normal  FOO

 8050E6E 8BC3                   mov     eax , ebx
 8050E70 8B5D00                 mov     ebx , [ebp]
 8050E73 8D6D04                 lea     ebp , [ebp+4h]
 8050E76 FFD0                   call     eax
 8050E78 C3                     ret     near

Here the EXECUTE is compiled inline and essentially implemented as an
indirect call.  lxf does not perform tail-call optimization.

see execute                                                                   
 868E2FC  88D6B47  11  88D475B  92 prim    EXECUTE

 88D6B47 8BC3                   mov     eax , ebx
 88D6B49 8B5D00                 mov     ebx , [ebp]
 88D6B4C 8D6D04                 lea     ebp , [ebp+4h]
 88D6B4F FFD0                   call     eax
 88D6B51 C3                     ret     near

The same code as FOO; after all, both words do the same thing.


iForth 5.1-mini (I think):

FORTH> ' foo idis
$10226000  : foo                        488BC04883ED088F4500      H.@H.m..E.
$1022600A  pop           rbx            5B                        [
$1022600B  or            rbx, rbx       4809DB                    H.[
$1022600E  je            $10226016 offset NEAR
                                        0F8402000000              ......
$10226014  call          rbx            FFD3                      .S
$10226016  ;                            488B45004883C508FFE0      H.E.H.E..` ok

The use of call here is interesting, because iForth uses RSP as the
data-stack pointer (e.g., the "pop rbx" moves the xt into rbx) and RBP
as the return-stack pointer.  Note the 10 bytes at the start of foo
that are not shown.  If I disassemble that code (into AT&T syntax), it
looks as follows:

   0x10226000:  mov    %rax,%rax
   0x10226003:  sub    $0x8,%rbp
   0x10226007:  pop    0x0(%rbp)
   0x1022600a:  pop    %rbx
   0x1022600b:  or     %rbx,%rbx
   0x1022600e:  je     0x10226016
   0x10226014:  call   *%rbx
   0x10226016:  mov    0x0(%rbp),%rax
   0x1022601a:  add    $0x8,%rbp
   0x1022601e:  jmp    *%rax

So here we see the first three and the last three instructions
disassembled (which "idis" does not do).  The third instruction moves
the return address from the RSP stack to the RBP stack, and the second
instruction adjusts RBP for that.  Note that this invocation via call
is not the usual way to invoke a colon definition from compiled code
in iForth.  E.g.:

FORTH> : x . . ;
FORTH> ' x idis
$10226940  : x                          488BC04883ED088F4500      H.@H.m..E.
$1022694A  lea           rbp, [rbp -8 +] qword
                                        488D6DF8                  H.mx
$1022694E  mov           [rbp 0 +] qword, $1022695B d#
                                        48C745005B692210          HGE.[i".
$10226956  jmp           .+A ( $1013888A ) offset NEAR
                                        E92F1FF1FF                i/.q.
$1022695B  jmp           .+A ( $1013888A ) offset NEAR
                                        E92A1FF1FF                i*.q.
$10226960  ;                            488B45004883C508FFE0      H.E.H.E..`

Note that both calls to "." jump to ".+A", i.e., they skip the first
three instructions.  The first invocation of "." pushes the return
address explicitly in the instructions at $1022694A and $1022694E, the
second invocation is a tail-call.

Back to EXECUTE: This means that iForth implements EXECUTE as pushing
the return address (in a convoluted way).


In the general case (non-tail EXECUTE), a compiled EXECUTE pushes the
return address in all these native-code systems.

This is not a problem for standard code, because colon definitions and
code following DOES> are not allowed to inspect stuff on the return
stack that they did not push there, and because other words either
don't access the return stack, or ticking them is non-standard (e.g.,
' R@ is non-standard).

Could it be done without call?  How would the return to the code after
the EXECUTE happen?  One way to do it would be as follows:

The code for general (non-tail) EXECUTE:

   ... stack adjustments
   mov rax, ra
   jmp rdx # execute the xt
ra:

and for a constant the xt code would be:

   ... stack adjustment
   mov rbx, const
   jmp rax

while for a colon definition the xt code would be:

   push rax
entry:      #entry point for compiled code
   ... code of the colon definition
   ret

The disadvantage of this scheme is that it pairs the ret not with a
call but with a push, so the hardware's return prediction fails and
the ret is mispredicted, which is slow.  It seems to me that if you
want to use ret for EXIT and call for compiled colon definitions,
having a call for a non-tail EXECUTE is the most efficient way to go.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/
