Subject: Re: Stack vs stackless operation
From: antispam (at) *nospam* fricas.org (Waldek Hebisch)
Newsgroups: comp.lang.forth
Date: 26. Feb 2025, 01:50:52
Organization: To protect and to server
Message-ID: <vploha$3h0e8$2@paganini.bofh.team>
References: 1 2
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (Linux/6.1.0-9-amd64 (x86_64))
LIT <zbigniew2011@gmail.com> wrote:
So I did some quite basic testing with x86
fig-Forth for DOS. I devised 4 OOS words:
:=: (exchange the values of two variables)
pop BX
pop DI
mov AX,[BX]
xchg AX,[DI]
mov [BX],AX
jmp NEXT
++ (increment variable by one)
pop BX
inc WORD PTR [BX]
jmp NEXT
-- (similar to above, just uses dec -- not tested, it'll give same
result)
+> (add two variables then store result into third one)
pop DI
pop BX
mov CX,[BX]
pop BX
mov AX,[BX]
add AX,CX
mov [DI],AX
jmp NEXT
How the simplistic tests have been done:
7 VARIABLE V1
8 VARIABLE V2
9 VARIABLE V3
: TOOK ( t1 t2 -- )
DROP SPLIT TIME@ DROP SPLIT
ROT SWAP - CR ." It took " U. ." seconds and "
- 10 * U. ." milliseconds "
;
: TEST1
1000 0 DO 10000 0 DO
...expression...
LOOP LOOP
;
0 0 TIME! TIME@ TEST1 TOOK
The results are (for the following expressions):
Expression                       Time
V1 @ V2 @ + V3 !                 25s 430ms
V1 V2 V3 +>                      17s 240ms
1 V1 +!                          14s 60ms
V1 ++                            10s 820ms
V1 @ V3 ! V2 @ V1 ! V3 @ V2 !    40s 150ms
V1 V2 :=:                        15s 260ms
So there is a noticeable difference indeed.
If your expected use case is operations on variables, then
what you gain comes from merging @ and ! into the operations.
Since you still have the variables, the gain is at most a factor
of 2 (you replace things like V1 @ by plain V1). The cost is the
need for several extra operations. A potential alternative is
a pair of operations, say PUSH and POP, and a Forth compiler
that replaces a pair like V1 @ by PUSH(V1). Note that here
the address of V1 is intended to be part of PUSH (so it will
take as much space as separate V1 and @, but is only a
single primitive).
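
To make the PUSH/POP idea concrete, here is a rough model of the
semantics only. It is written in ANS-style Forth (for example Gforth),
not in the fig-Forth syntax quoted above, and PUSHER, POPPER, PUSH-V1,
PUSH-V2, POP-V3, SUM-PLAIN and SUM-FUSED are names made up for this
illustration. Each child word carries the variable's address in its own
body, which is where PUSH(V1) would carry it as an inline operand.

```forth
\ Sketch under the assumptions stated above; not a real compiler change.
VARIABLE V1   VARIABLE V2   VARIABLE V3
7 V1 !   8 V2 !   9 V3 !

\ Hypothetical defining words: the child stores the variable's address
\ in its body and fetches through it (PUSHER) or stores through it (POPPER).
: PUSHER ( addr "name" -- )  CREATE ,  DOES> ( -- x )  @ @ ;
: POPPER ( addr "name" -- )  CREATE ,  DOES> ( x -- )  @ ! ;

V1 PUSHER PUSH-V1
V2 PUSHER PUSH-V2
V3 POPPER POP-V3

: SUM-PLAIN ( -- )  V1 @ V2 @ + V3 ! ;          \ 7 words in the body
: SUM-FUSED ( -- )  PUSH-V1 PUSH-V2 + POP-V3 ;  \ 4 words in the body

SUM-FUSED  V3 @ .   \ prints 15
```

The DOES> words above will not be faster on a threaded-code system;
the saving only appears when PUSH and POP are real code primitives with
the address stored inline. The point of the sketch is the count: the
fused form has 4 words in the body instead of 7, the same 4 as
V1 V2 V3 +> needs, and PUSH and POP combine with any stack operation
rather than just addition.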
More generally, a simple "optimizer" that replaces short
sequences of Forth primitives by a different, shorter sequence
of primitives is likely to give a similar gain. However, the
chance of a match decreases with the length of the sequence.
Above you bet on relatively long sequences (and on the programmer
writing the alternative sequence). Shorter sequences have more
chance of matching, so you need a smaller number of them
for a similar gain.
One more thing: while simple memory-to-memory operations appear
with some frequency, the more typical pattern is an expression
that produces some value that is immediately used by another
operation, and a stack is a very good fit for such use. One can
do better than using the machine stack, namely by keeping things in
registers, but that means generating machine code and doing
optimization. OTOH, on 64-bit machines machine code is
very natural: machine instructions are typically smaller
than machine words (which are the natural unit for threaded
code), and Forth primitives are likely to produce a very
small number of instructions.
-- Waldek Hebisch