LOOP (was: OOS approach revisited)

>Most CPUs have operators for register-based count-down loops
>that are blazingly fast.

Which "operators" do you have in mind, and what do you mean by
"blazingly fast"?

Anyway, we have discussed this repeatedly. E.g., in
<2022Feb13.231208@mips.complang.tuwien.ac.at> I replied to your
posting <f4b89e0b-2ded-4b18-8dc1-bba6dcda47bbn@googlegroups.com> and
cited earlier discussions on the topic:

|"minf...@arcor.de" <minforth@arcor.de> writes:
[...]
|>F.ex. match NEXT efficiently to x_86 processor LOOP instruction (counter in
|>_CX register)
|>and you'll happily count down from 5 to 1.
|
|Yes, but why would one do this? As we have established in an earlier
|discussion (see below), the LOOP instruction is typically not faster
|than a sequence of simpler instructions:
|
|<2018Jun6.184616@mips.complang.tuwien.ac.at>:
||minforth@arcor.de writes:
||>FOR..NEXT matches easily with the x86 LOOP instruction and ECX as counter.
||>Should do speedy enough. ;-)
||
||Have you measured it? I have
||<2017Mar14.183125@mips.complang.tuwien.ac.at>
||<2017Mar15.141411@mips.complang.tuwien.ac.at> and compared the
||following loops:
||
||.L5:                    .L5:
||  subq $1, %rax           loop .L5
||  jne .L5
||
||I found that for these loops Sandy Bridge, Haswell, and Skylake take
||~4 cycles per iteration using LOOP, and 1-2 cycles per iteration when
||using jne.
|
|<2018Jun7.141731@mips.complang.tuwien.ac.at>:
||cycles for 1000 iterations
||      K10     Excavator         Zen
||Phenom II Athlon X4 845 Ryzen 1600X
||     3021          1314        1051  loop
||     2020          1484        1051  sub; jne
||     2026          1489        1053  add; cmp; jne
|
|There is no performance advantage on modern AMD and Intel CPUs for the
|instruction LOOP over a good implementation of the Forth word LOOP (as
|in the third example).
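
To make the AMD table easier to read, here is a small sketch that
converts the quoted numbers into cycles per iteration and into a ratio
of "add; cmp; jne" to the LOOP instruction. The cycle counts are copied
from the table; the function names and the choice of "add; cmp; jne" as
the baseline are my assumptions for illustration, not part of the
original measurement.

```python
# Cycles for 1000 iterations, copied from the table quoted above.
ITERATIONS = 1000
CYCLES = {
    "loop":          {"Phenom II": 3021, "Athlon X4 845": 1314, "Ryzen 1600X": 1051},
    "sub; jne":      {"Phenom II": 2020, "Athlon X4 845": 1484, "Ryzen 1600X": 1051},
    "add; cmp; jne": {"Phenom II": 2026, "Athlon X4 845": 1489, "Ryzen 1600X": 1053},
}


def cycles_per_iteration(variant, cpu):
    """Average cycles spent on one loop iteration (hypothetical helper)."""
    return CYCLES[variant][cpu] / ITERATIONS


def loop_insn_speedup(cpu, baseline="add; cmp; jne"):
    """Baseline cycles divided by LOOP-instruction cycles.

    A value above 1 means the LOOP instruction was faster than the
    baseline sequence on that CPU; below 1 means it was slower.
    """
    return CYCLES[baseline][cpu] / CYCLES["loop"][cpu]


if __name__ == "__main__":
    for cpu in CYCLES["loop"]:
        per_iter = ", ".join(
            f"{variant}: {cycles_per_iteration(variant, cpu):.3f}"
            for variant in CYCLES
        )
        print(f"{cpu:14s} {per_iter}  ratio: {loop_insn_speedup(cpu):.2f}")
```

With these numbers the ratio is about 0.67 on the Phenom II, about 1.13
on the Athlon X4 845, and about 1.00 on the Ryzen 1600X, i.e., on the
order of one cycle per iteration either way on the two newer CPUs.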

>If they can be used within Forth-based loop constructs
>I would expect a greater speed increase than what you measured.

You obviously ignore repeated refutations of your claims of superior
performance for LOOP-instruction-based counted loops. Maybe you
should implement and measure such a counted loop yourself and compare
it to the LOOP word on SwiftForth and VFX Forth.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/
