Subject: Re: Stack vs stackless operation
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Date: 27. Feb 2025, 23:03:55
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Feb27.230355@mips.complang.tuwien.ac.at>
References: 1 2 3 4 5 6 7 8
User-Agent: xrn 10.11

Paul Rubin <no.email@nospam.invalid> writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> Results (on Zen4):
>> gforth-fast (development): ...
>
>It's interesting how little difference there is with gforth-fast.  Could
>you also do gforth-itc?

gforth-itc (development):
           :=:       exchange             ex      ex-locals      exchange2
 7_527_256_553  5_224_615_325  6_825_283_178  9_238_357_501  7_036_128_309 cycles
13_127_503_990  9_326_561_471 12_927_054_153 16_927_820_825 12_027_146_677 instructions

For comparison: gforth-fast (development):
          :=:      exchange            ex     ex-locals     exchange2
  814_881_277   879_389_133   928_825_521   875_574_895   808_543_975 cycles
3_908_874_164 3_708_891_336 4_508_966_770 4_209_778_557 3_708_865_505 instructions

>exchange2 is a big win with VFX, suggesting its
>optimizer could do better with some of the other versions.

On VFX, exchange2 runs at the same speed and executes the same number
of instructions as :=:.  EX is slower because VFX does not analyse the
return stack, unlike the data stack. EX-LOCALS is slow because VFX's
locals implementation is not particularly good.
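
For readers who do not have the definitions from earlier in the thread
at hand, here is a minimal sketch of the three coding styles at issue:
data stack only, return stack, and locals.  The names XCHG-STACK,
XCHG-RSTACK and XCHG-LOCALS are hypothetical and the bodies are my
assumption of what such words might look like; they are not the
definitions that were benchmarked.  All three exchange the contents of
two cells.

```forth
\ Hypothetical illustrations only, not the benchmarked words.

\ Data stack only: everything stays visible to a data-stack analysis.
: xchg-stack ( addr1 addr2 -- )
  2dup @ swap @   \ addr1 addr2 x2 x1
  rot !           \ store x1 at addr2
  swap ! ;        \ store x2 at addr1

\ Return stack: x2 is parked on the return stack in between.
: xchg-rstack ( addr1 addr2 -- )
  dup @ >r        \ addr1 addr2      R: x2
  over @ swap !   \ store x1 at addr2
  r> swap ! ;     \ store x2 at addr1

\ Locals (Forth-2012 {: ... :} syntax).
: xchg-locals {: addr1 addr2 -- :}
  addr1 @ addr2 @ \ x1 x2
  addr1 !         \ store x2 at addr1
  addr2 ! ;       \ store x1 at addr2
```

A compiler that tracks only the data stack can keep every intermediate
value of the first variant in registers, while the other two depend on
how well it handles the return stack and locals.
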
To see what a better analysis can do, let's look at lxf:
          :=:            ex     ex-locals     exchange2
  502_740_029   502_189_567   502_134_842   502_043_217 cycles
1_701_663_782 1_701_657_866 1_701_677_273 1_701_684_186 instructions

The cycles and instructions are worse (except for ex-locals) than with
VFX, but that's due to inlining (which VFX does and lxf does not).
E.g., here's lxf's code for EX-LOCALS:
869204C 804FCE2 23 88C8000 5 normal EX-LOCALS
804FCE2 8B4500 mov eax , [ebp]
804FCE5 8B00 mov eax , [eax]
804FCE7 8BCB mov ecx , ebx
804FCE9 8B09 mov ecx , [ecx]
804FCEB 8B5500 mov edx , [ebp]
804FCEE 890A mov [edx] , ecx
804FCF0 8903 mov [ebx] , eax
804FCF2 8B5D04 mov ebx , [ebp+4h]
804FCF5 8D6D08 lea ebp , [ebp+8h]
804FCF8 C3 ret near
It's the same code as lxf produces for :=:.
The code lxf produces for EX and EXCHANGE2 is:
804FCF9 8BC3 mov eax , ebx
804FCFB 8B00 mov eax , [eax]
804FCFD 8B4D00 mov ecx , [ebp]
804FD00 8B09 mov ecx , [ecx]
804FD02 890B mov [ebx] , ecx
804FD04 8B5D00 mov ebx , [ebp]
804FD07 8903 mov [ebx] , eax
804FD09 8B5D04 mov ebx , [ebp+4h]
804FD0C 8D6D08 lea ebp , [ebp+8h]
804FD0F C3 ret near
- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/