Code generation for DOES> in Gforth

Subject: Code generation for DOES> in Gforth
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Date: 21. Sep 2024, 18:25:51
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Sep21.192551@mips.complang.tuwien.ac.at>
User-Agent: xrn 10.11

I recently noticed that Gforth still used the following COMPILE,
implementation for words defined with CREATE...SET-DOES> (and
consequently also for words defined with CREATE...DOES>):

: does, ( xt -- ) does-check ['] does-xt peephole-compile, , ;

Ignore DOES-CHECK (it has to do with stack-depth checking, still
incomplete).  The rest means that it compiles the primitive DOES-XT
with the xt of the COMPILE,d word as immediate argument.  DOES-XT
pushes the body of the word and then EXECUTEs the xt that SET-DOES>
has registered for this word.  In most cases this is a colon
definition (always if DOES> is used), so the next thing that happens
is DOCOL, and then the code for the colon definition is run.

I have now replaced this with

: does, ( xt -- ) does-check dup >body lit, >extra @ compile, ;

What this does is to compile the body as a literal, and then it
COMPILE,s the xt that DOES-XT would EXECUTE.  In the common case of a
colon definition this compiles a call to the colon definition.  This
saves the overhead of accessing the doesfield and of dispatching on
its contents at run-time; all that is now done during compilation.
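
The difference between the two schemes can be sketched with a toy model
in standard Forth.  This is only an illustration, not Gforth's actual
header layout: the two-cell record and the names FETCH-ACTION,
TOY-FIVE, TOY-DOES-XT, TOY-DOES, and TOY-FIVE, are all made up for this
sketch.

```forth
\ Toy model (assumption): a "word" is two cells,
\ cell 0 = xt of its action, cell 1 = its body.
: fetch-action ( addr -- x ) @ ;
create toy-five  ' fetch-action ,  5 ,

\ Old scheme: fetch the action and EXECUTE it at run-time, every time.
: toy-does-xt ( toy -- x ) dup cell+ swap @ execute ;

\ New scheme: do the lookup once, while compiling: compile the body
\ address as a literal, then COMPILE, the action.
: toy-does, ( toy -- ) dup cell+ postpone literal @ compile, ;

\ Immediate helper so that TOY-DOES, runs in compilation state.
: toy-five, ( -- ) toy-five toy-does, ; immediate

: use-old ( -- x ) toy-five toy-does-xt ;
: use-new ( -- x ) toy-five, ;
```

Both USE-OLD and USE-NEW leave the stored value on the stack, but
USE-NEW no longer reads the action field when it runs.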

Let us first look at the generated code.  Consider the example:

: myconst create , does> @ ;
5 myconst five
: foo five ;

SIMPLE-SEE FOO shows:

old                               new
$7F6F5CAE6BC8 does-xt    1->1     $7F46A7EA92B8 lit    1->1
$7F6F5CAE6BD0 five                $7F46A7EA92C0 five
$7F6F5CAE6BD8 ;s    1->1  ok      $7F46A7EA92C8 call    1->1
                                  $7F46A7EA92D0 $7F46A7C0A168
                                  $7F46A7EA92D8 ;s    1->1
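
Read as source code, the new sequence is roughly what one would get by
writing the literal and the call by hand.  A minimal sketch in standard
Forth, where FETCH-IT is a hypothetical named stand-in for the
anonymous code after DOES> (FOO-OLD and FOO-NEW are likewise names
invented here):

```forth
: myconst ( n "name" -- ) create , does> ( -- n ) @ ;
5 myconst five

\ Stand-in (assumption) for the code after DOES> in MYCONST.
: fetch-it ( addr -- n ) @ ;

\ What the programmer writes.
: foo-old ( -- n ) five ;

\ Hand-written counterpart of "lit <body of five>, call <does-code>".
: foo-new ( -- n ) [ ' five >body ] literal fetch-it ;
```

Both definitions return the same value; the second spells out the
literal and the call that the new DOES, generates automatically.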

For the following microbenchmark:

: d1 ( "name" -- )
  create 0 ,
does> ( -- addr )
; \ yes, an empty DOES> exists in an application program
d1 z1

: bench-z1-comp ( -- )
    iterations 0 ?do
        1 z1 +!
    loop ;

I see the following results per iteration (startup overhead included)
on a Rocket Lake:

 old   new
 8.2   7.5 cycles:u
34.0  29.0 instructions:u
 5.2   4.2 branches:u

So five fewer instructions (including one branch), resulting in a small
speedup for this microbenchmark.
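
To try the microbenchmark, a complete file could look as follows.
ITERATIONS is not defined in the post, so the value below is an
assumption, chosen large enough that startup overhead is small
compared to the loop.  The counter names (cycles:u etc.) are Linux
perf event names, so the numbers were presumably collected with
something like perf stat; that, too, is an assumption.

```forth
\ Assumed iteration count; the post does not give the value used.
100000000 constant iterations

: d1 ( "name" -- )
  create 0 ,
does> ( -- addr )
;
d1 z1

: bench-z1-comp ( -- )
    iterations 0 ?do
        1 z1 +!
    loop ;

\ Run the loop once; afterwards the cell at Z1 holds the loop count.
bench-z1-comp
```

Dividing each measured counter by ITERATIONS gives per-iteration
figures comparable to those in the table.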
       
The Gforth image contained 129 occurrences of does-xt and after the
change it contains 12 (a part of the image is created with the
cross-compiler, which still compiles to DOES-XT).  As a result, the
image size and gforth-fast (AMD64) native-code size in bytes are as
follows:

  old     new
2189364 2193264 image
 448291  448659 native-code

The larger image is no surprise.  For the 117 replaced does-xts, the
threaded code grows by 2 cells each, and the meta-data grows
correspondingly.
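
The arithmetic behind these numbers, using Forth as a calculator.  The
counts and sizes are taken from the figures above; the 8-byte cell
size is an assumption for a 64-bit Gforth on AMD64, and the constant
names are invented for this sketch.

```forth
129 12 -           constant replaced      \ does-xt occurrences replaced
2193264 2189364 -  constant image-growth  \ bytes
replaced 2 * 8 *   constant code-growth   \ 2 extra 8-byte cells each

replaced .                     \ prints 117
image-growth .                 \ prints 3900
code-growth .                  \ prints 1872
image-growth code-growth - .   \ prints 2028
```

So the threaded code itself accounts for 1872 of the 3900 extra bytes;
the remaining 2028 bytes would then be the meta-data (and possibly
other effects).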

For the native code, the growth is less expected.  Let's see how
the code looks:

does-xt               lit call
add rbx,$10           mov $00[r13],r8
mov $00[r13],r8       sub r13,$08    
mov r8,-$08[rbx]      mov r8,$08[rbx]
sub r13,$08           mov rax,$18[rbx]
sub rbx,$08           sub r14,$08    
mov rax,-$08[r8]      add rbx,$20    
mov rdx,$18[rax]      mov [r14],rbx  
mov rax,-$10[rdx]     mov rbx,rax    
jmp eax               mov rax,[rbx]  
                      jmp eax        

34 bytes              35 bytes

Ok, it's larger, but at one byte per replaced does-xt that explains
only 117 of the 368 extra bytes.  Maybe the interaction with other
optimizations explains the rest.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
   EuroForth 2024: https://euro.theforth.net
