Newsportal USENET - Re: Vector sum (was: Parsing timestamps?)

Subject: Re: Vector sum (was: Parsing timestamps?)
From: peter.noreply (at) *nospam* tin.it (peter)
Newsgroups: comp.lang.forth
Date: 19 Jul 2025, 14:24:48

Organization: A noiseless patient Spider
Message-ID: <20250719152448.0000757a@tin.it>
User-Agent: Claws Mail 4.3.0 (GTK 3.24.42; x86_64-w64-mingw32)

On Sat, 19 Jul 2025 10:18:15 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

peter <peter.noreply@tin.it> writes:
I did a test coding sum128 as a code word with AVX-512 instructions
and got the following results:

285,584,376 cycles:u
941,856,077 instructions:u

Timing was

timer-reset ' recursive-sum bench .elapsed 51 ms elapsed

so half the time of the original recursive version.
With 32 zmm registers I could also have done a sum256.

One could do sum128 with just 8 registers by performing the adds ASAP,
i.e., for sum32

vmovapd   zmm0, [rbx]
vmovapd   zmm1, [rbx+64]
vaddpd zmm0, zmm0, zmm1
vmovapd   zmm1, [rbx+128]
vmovapd   zmm2, [rbx+192]
vaddpd zmm1, zmm1, zmm2
vaddpd zmm0, zmm0, zmm1
; and then the Horizontal sum

And you can code this as:

vmovapd   zmm0, [rbx]
vaddpd zmm0, zmm0, [rbx+64]
vmovapd   zmm1, [rbx+128]
vaddpd zmm1, zmm1, [rbx+192]
vaddpd zmm0, zmm0, zmm1
; and then the Horizontal sum

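; Horizontal sum of zmm0

vextractf64x4 ymm1, zmm0, 1   ; upper 256 bits of zmm0
vaddpd ymm2, ymm1, ymm0       ; 8 lanes -> 4 partial sums

vextractf64x2 xmm3, ymm2, 1   ; upper 128 bits of ymm2
vaddpd ymm4, ymm3, ymm2       ; only the low 2 lanes are used below

vhaddpd xmm0, xmm4, xmm4      ; add the two remaining lanes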
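
The horizontal sum above halves the vector three times: 8 lanes to 4, 4 to 2, 2 to 1. A minimal Python model of that reduction, with plain lists standing in for registers (the name horizontal_sum is only for illustration, it is not part of lxf64 or any library):

```python
def horizontal_sum(lanes):
    """Fold the upper half onto the lower half until one lane is left."""
    lanes = list(lanes)
    n = len(lanes)
    if n == 0 or n & (n - 1):
        raise ValueError("lane count must be a power of two")
    while len(lanes) > 1:
        half = len(lanes) // 2
        # Same shape as extracting the upper half and doing a vaddpd:
        # lane i of the lower half gets lane i of the upper half added.
        lanes = [lanes[i] + lanes[i + half] for i in range(half)]
    return lanes[0]


zmm0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(horizontal_sum(zmm0))  # 36.0
```

Note that this tree order differs from a left-to-right scalar sum, so for general floating-point data the last bits of the result may differ.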

The SIMD instructions can also take a memory operand,
so I can do sum128 as:

code asum128b

movsd [r13-0x8], xmm0
lea r13, [r13-0x8]

vmovapd zmm0, [rbx]
vaddpd zmm0, zmm0, [rbx+64]
vaddpd zmm0, zmm0, [rbx+128]
vaddpd zmm0, zmm0, [rbx+192]
vaddpd zmm0, zmm0, [rbx+256]
vaddpd zmm0, zmm0, [rbx+320]
vaddpd zmm0, zmm0, [rbx+384]
vaddpd zmm0, zmm0, [rbx+448]
vaddpd zmm0, zmm0, [rbx+512]
vaddpd zmm0, zmm0, [rbx+576]
vaddpd zmm0, zmm0, [rbx+640]
vaddpd zmm0, zmm0, [rbx+704]
vaddpd zmm0, zmm0, [rbx+768]
vaddpd zmm0, zmm0, [rbx+832]
vaddpd zmm0, zmm0, [rbx+896]
vaddpd zmm0, zmm0, [rbx+960]

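; Horizontal sum of zmm0

vextractf64x4 ymm1, zmm0, 1   ; upper 256 bits of zmm0
vaddpd ymm2, ymm1, ymm0       ; 8 lanes -> 4 partial sums

vextractf64x2 xmm3, ymm2, 1   ; upper 128 bits of ymm2
vaddpd ymm4, ymm3, ymm2       ; only the low 2 lanes are used below

vpermilpd xmm5, xmm4, 1       ; bring lane 1 down to lane 0
vaddsd xmm0, xmm4, xmm5       ; scalar add: lane 0 + lane 1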

ret
end-code

This compiles to 154 bytes and 25 instructions.
The original sum128 is 2157 bytes and 513 instructions!
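
The shape of the code follows from the sizes involved: 128 doubles are 1024 bytes, i.e. 16 zmm-sized chunks at offsets 0 to 960, so one load plus 15 adds with memory operands. A Python model of what asum128b computes, as a sketch (lists stand in for registers; asum128_model and the constants are illustrative names, not lxf64 words):

```python
DOUBLE_BYTES = 8
ZMM_BYTES = 64
LANES = ZMM_BYTES // DOUBLE_BYTES        # 8 doubles per zmm register
N = 128                                  # elements summed by sum128
OFFSETS = list(range(0, N * DOUBLE_BYTES, ZMM_BYTES))  # 0, 64, ..., 960


def asum128_model(data):
    """Lane-wise accumulation of 16 chunks, then one horizontal sum."""
    if len(data) != N:
        raise ValueError("expected exactly 128 elements")
    acc = list(data[:LANES])                          # vmovapd zmm0, [rbx]
    for off in OFFSETS[1:]:                           # 15 x vaddpd zmm0, zmm0, [rbx+off]
        start = off // DOUBLE_BYTES
        chunk = data[start:start + LANES]
        acc = [a + b for a, b in zip(acc, chunk)]
    quad = [acc[i] + acc[i + 4] for i in range(4)]    # 8 lanes -> 4
    pair = [quad[i] + quad[i + 2] for i in range(2)]  # 4 lanes -> 2
    return pair[0] + pair[1]                          # 2 lanes -> 1


data = [float(i) for i in range(N)]
print(len(OFFSETS), OFFSETS[-1])   # 16 960
print(asum128_model(data))         # 8128.0
```

The 25 instructions are then 2 at the top (movsd, lea), 1 load, 15 adds, 6 for the horizontal sum, and the ret.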

Yes, the horizontal sum should only be done once.
I have only replaced sum128 with SIMD as a test.
Later I will do a complete example.

This asum128b does not change the timing but reduces
the number of instructions:

277,333,790 cycles:u
834,846,183 instructions:u # 3.01 insn per cycle
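
For reference, the arithmetic behind these counters, using only the numbers quoted in this thread (the variable names and the two helper functions are illustrative):

```python
# (cycles:u, instructions:u) as reported above
first_avx512 = (285_584_376, 941_856_077)
asum128b = (277_333_790, 834_846_183)


def ipc(cycles, instructions):
    """Instructions retired per cycle."""
    return instructions / cycles


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


print(f"IPC first version: {ipc(*first_avx512):.2f}")
print(f"IPC asum128b:      {ipc(*asum128b):.2f}")   # 3.01, matching the perf output
print(f"cycles:       {percent_change(first_avx512[0], asum128b[0]):+.1f}%")
print(f"instructions: {percent_change(first_avx512[1], asum128b[1]):+.1f}%")
```

Cycles drop by roughly 3% while instructions drop by roughly 11%, which fits the observation that the timing is essentially unchanged.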

Instead of doing the horizontal sum once for every sum128, it might be
more efficient (assuming the whole thing is not
cache-bandwidth-limited) to have the result of sum128 be a full SIMD
width, and then add them up with vaddpd instead of addsd, and do the
horizontal sum once in the end.
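
A sketch of the difference in Python, with lists standing in for SIMD registers. All function names are made up for illustration, and the input length is assumed to be a multiple of 128; the point is only that the second variant needs one horizontal sum no matter how many blocks are added:

```python
LANES = 8      # doubles per 512-bit register
BLOCK = 128    # elements handled by one sum128


def vector_add(a, b):
    """Lane-wise add, the job of one vaddpd."""
    return [x + y for x, y in zip(a, b)]


def block_partial(data, start):
    """Lane-wise partial sums of one 128-element block (no horizontal sum)."""
    acc = list(data[start:start + LANES])
    for i in range(start + LANES, start + BLOCK, LANES):
        acc = vector_add(acc, data[i:i + LANES])
    return acc


def sum_scalar_results(data):
    """sum128 returns a scalar: one horizontal sum per block."""
    total, hsums = 0.0, 0
    for start in range(0, len(data), BLOCK):
        total += sum(block_partial(data, start))   # horizontal sum, then addsd
        hsums += 1
    return total, hsums


def sum_vector_results(data):
    """sum128 returns a full SIMD width: one horizontal sum at the very end."""
    acc = [0.0] * LANES
    for start in range(0, len(data), BLOCK):
        acc = vector_add(acc, block_partial(data, start))   # vaddpd
    return sum(acc), 1


data = [float(i) for i in range(1024)]
print(sum_scalar_results(data))   # (523776.0, 8)
print(sum_vector_results(data))   # (523776.0, 1)
```

The two variants associate the additions differently, so with general floating-point data the results can differ in the last bits.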

But if the recursive part is to be programmed in Forth, we would need
a way to represent a SIMD width of data in Forth, maybe with a SIMD
stack. I see a few problems there:

* What to do about the mask registers of AVX-512? In the RISC-V
vector extension masks are stored in regular SIMD registers.

* There is a trend visible in ARM SVE and the RISC-V Vector extension
to have support for dealing with loops across longer vectors. Do we
also need to support something like that?
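
To make the two points concrete, here is a simplified Python model of two ways to handle a vector whose length is not a multiple of the register width: a per-lane mask for the tail (the AVX-512 style) and a loop that requests a vector length on each pass (the idea behind RISC-V's vsetvli). The function names and the lanes/vlmax parameters are illustrative, and the vl = min(...) rule is a simplification of what real hardware may choose.

```python
def masked_tail_sum(data, lanes=8):
    """Fixed-width chunks; a mask switches off lanes past the end."""
    total = 0.0
    for start in range(0, len(data), lanes):
        remaining = len(data) - start
        mask = [lane < remaining for lane in range(lanes)]   # like a k register
        chunk = [data[start + lane] if mask[lane] else 0.0
                 for lane in range(lanes)]                    # zero-masked load
        total += sum(chunk)
    return total


def strip_mined_sum(data, vlmax=8):
    """Each pass processes vl elements; vl shrinks on the last pass."""
    total = 0.0
    i = 0
    while i < len(data):
        vl = min(vlmax, len(data) - i)   # simplified stand-in for vsetvli
        total += sum(data[i:i + vl])
        i += vl
    return total


data = [float(i) for i in range(21)]     # 21 is not a multiple of 8
print(masked_tail_sum(data))             # 210.0
print(strip_mined_sum(data))             # 210.0
```

In the first variant the mask is extra state that a Forth SIMD stack would have to carry somewhere; in the second the loop control itself depends on the hardware vector length.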

For the RISC-V vector extension, see
<https://riscv.org/wp-content/uploads/2024/12/15.20-15.55-18.05.06.VEXT-bcn-v1.pdf>

One way to deal with all that would be to have a long-vector stack and
have something like my vector wordset
<https://github.com/AntonErtl/vectors>, where the sum of a vector
would be a word that is implemented in some lower-level way (e.g.,
assembly language); the sum of a vector is actually a planned, but not
yet existing feature of this wordset.

An advantage of having a (short) SIMD stack would be that one could
use SIMD operations for other uses where the long-vector wordset looks
too heavy-weight (or would need optimizations to get rid of the
long-vector overhead). The question is if enough such uses exist to
justify adding such a stack.

- anton

I will take a look at your vector implementation and see if it can be used
in lxf64.

BR
Peter
