Subject: Re: LOOP (was: OOS approach revisited)
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Date: 28. Jun 2025, 17:04:07
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Jun28.180407@mips.complang.tuwien.ac.at>
References: 1 2 3 4 5 6 7
User-Agent: xrn 10.11
anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>You obviously ignore repeated refutations of your claims of superior
>performance for LOOP-instruction-based counted loops.  Maybe you
>should implement and measure such a counted loop yourself and compare
>it to the LOOP word on SwiftForth and VFX Forth.
Actually, it is possible to implement the LOOP word such that it uses
the LOOP instruction, by modifying DO/?DO, I, and J to go along with
it.
E.g., VFX generates the following code for LOOP
( 0050A25D 49FFC6 ) INC R14
( 0050A260 49FFC7 ) INC R15
( 0050A263 71F8 ) JNO 0050A25D
Here R14 contains the index (I), and R15 contains
(index-limit) xor 2^63. This value is conditioned such that INC
will set the overflow flag when index reaches limit.
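The overflow trick can be checked with a small simulation. The following is only a sketch in Python, not code from either Forth system: the function names are made up for illustration, registers are modelled as unsigned 64-bit integers, and the only hardware fact assumed is that INC sets the overflow flag exactly when its operand goes from 2^63-1 to 2^63.

```python
MASK = (1 << 64) - 1   # registers are modelled as unsigned 64-bit values
SIGN = 1 << 63         # the 2^63 bias from the text

def inc_sets_overflow(x):
    """INC sets OF exactly when the operand goes from 2^63-1 to 2^63."""
    return x == SIGN - 1

def count_iterations(start, limit):
    """Simulate the VFX-style LOOP: R14 = index, R15 = (index-limit) xor 2^63."""
    r14 = start & MASK
    r15 = ((start - limit) & MASK) ^ SIGN
    iterations = 0
    while True:
        iterations += 1                    # the loop body runs once
        r14 = (r14 + 1) & MASK             # INC R14
        overflow = inc_sets_overflow(r15)  # OF as set by INC R15
        r15 = (r15 + 1) & MASK             # INC R15
        if overflow:                       # JNO not taken: leave the loop
            return iterations, r14
```

For example, `count_iterations(0, 5)` runs the body five times and leaves the index at 5, i.e., the overflow happens in the iteration where the index reaches the limit.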
The benefit of the overflow-flag approach is only relevant for +LOOP.
We will ignore that for now, but return to it later.
Instead, you could keep limit-index in RCX. Then you could implement
a VFX-style LOOP as follows:
inc r14
loop <target>
The LOOP instruction will decrement RCX until it reaches 0, i.e.,
until index equals limit.
Ok, this still needs the additional INC instruction that you may want
to avoid. So let's look at SwiftForth. Here LOOP compiles to
4519C5 R14 INC 49FFC6
4519C8 4519BC JNO 0F81EEFFFFFF
So here R14 (not R15) contains (index-limit) xor 2^63. We will
discuss I later.
You could instead have limit-index in RCX, and then let the LOOP word
generate
loop <target>
Look, Ma, LOOP implemented with LOOP!
Now what about I? SwiftForth implements I by keeping limit xor 2^63
in R15. Then I is R15+R14 (and SwiftForth's I is implemented to
produce that computation).
For our LOOP-instruction-based LOOP, we could keep limit in, say, R15.
Then I is R15-RCX.
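Both ways of computing I can be checked in a few lines. This is a hedged sketch in Python with hypothetical function names; it models registers as unsigned 64-bit integers and models the LOOP instruction as "decrement RCX, jump back if the result is not zero". It is not taken from SwiftForth.

```python
MASK = (1 << 64) - 1   # registers are modelled as unsigned 64-bit values
SIGN = 1 << 63

def swiftforth_i(index, limit):
    """I as R15+R14 with R14 = (index-limit) xor 2^63, R15 = limit xor 2^63."""
    r14 = ((index - limit) & MASK) ^ SIGN
    r15 = (limit & MASK) ^ SIGN
    return (r15 + r14) & MASK          # the two 2^63 biases cancel mod 2^64

def loop_insn_indices(start, limit):
    """LOOP-instruction variant: RCX = limit-index, R15 = limit, I = R15-RCX."""
    rcx = (limit - start) & MASK
    r15 = limit & MASK
    seen = []
    while True:
        seen.append((r15 - rcx) & MASK)  # I, as the loop body would see it
        rcx = (rcx - 1) & MASK           # LOOP decrements RCX ...
        if rcx == 0:                     # ... and falls through at zero
            return seen
```

With `start=3` and `limit=7`, `loop_insn_indices` yields the indices 3, 4, 5, 6, and `swiftforth_i(5, 100)` gives back 5, because xor with 2^63 is the same as adding 2^63 modulo 2^64.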
Now what about +LOOP ? In the usual case the increment is a constant.
For a positive increment, you can implement +LOOP as
add rcx, increment
jnc <target>
For a negative increment, you can implement +LOOP as
add rcx, increment
jc <target>
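The carry-based termination test can be simulated as follows. This is a sketch under one loud assumption: the counter register is taken to hold index-limit (modulo 2^64), because that is the value for which the carry of the ADD marks a crossing of the boundary between limit-1 and limit; the sequences above do not spell this out. The function name is hypothetical.

```python
MASK = (1 << 64) - 1   # registers are modelled as unsigned 64-bit values

def plus_loop_const(start, limit, inc):
    """Simulate `add rcx, inc` followed by jnc (inc > 0) or jc (inc < 0).

    ASSUMPTION: rcx holds index-limit, not limit-index.
    Returns the indices seen by the loop body.
    """
    rcx = (start - limit) & MASK
    step = inc & MASK                # the constant as a 64-bit operand
    index = start
    seen = []
    while True:
        seen.append(index)
        total = rcx + step
        carry = total > MASK         # CF after the ADD
        rcx = total & MASK
        index += inc
        again = (not carry) if inc > 0 else carry   # jnc vs. jc
        if not again:
            return seen
```

`plus_loop_const(0, 10, 3)` visits 0, 3, 6, 9 and `plus_loop_const(10, 0, -3)` visits 10, 7, 4, 1, which matches the standard +LOOP rule of stopping when the index crosses between limit-1 and limit.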
Only when the increment is unknown at compile time do you have to
resort to something with jno, maybe
mov r14, rcx
add rcx, increment
btc r14, 63
add r14, increment
jno <target>
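The five-instruction sequence can be simulated the same way. Again a sketch with hypothetical names, assuming that RCX holds index-limit and that the increment may change sign from one iteration to the next; the overflow flag of the second ADD is modelled as signed 64-bit overflow.

```python
MASK = (1 << 64) - 1   # registers are modelled as unsigned 64-bit values
SIGN = 1 << 63

def to_signed(x):
    """Interpret an unsigned 64-bit value as two's complement."""
    return x - (1 << 64) if x & SIGN else x

def add_overflows(a, b):
    """OF after a 64-bit ADD: the signed result does not fit in 64 bits."""
    s = to_signed(a) + to_signed(b)
    return not (-(1 << 63) <= s < (1 << 63))

def plus_loop_var(start, limit, incs):
    """Simulate the run-time-increment +LOOP; ASSUMES rcx = index-limit.

    `incs` supplies one increment per iteration. Returns the indices seen.
    """
    rcx = (start - limit) & MASK
    index = start
    seen = []
    for inc in incs:
        seen.append(index)
        r14 = rcx                                  # mov r14, rcx
        rcx = (rcx + inc) & MASK                   # add rcx, increment
        r14 ^= SIGN                                # btc r14, 63
        overflow = add_overflows(r14, inc & MASK)  # add r14, increment
        index += inc
        if overflow:                               # jno not taken
            break
    return seen
```

With `start=5`, `limit=0` and the increments -2, 1, -4, -1, the body sees 5, 3, 4, 0 and the loop ends when the index steps from 0 to -1, whatever increments would have followed.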
You can't have everything:-)
Anyway, given that the LOOP instruction is slower than what SwiftForth
currently uses for the LOOP word on several CPUs, and not faster on
almost all others, nobody interested in performance will go there.
But if there ever is a performance advantage to using the LOOP
instruction, we have this option.  So there is no need to resort to
FOR ... NEXT even in that case.
- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/