Subject: Re: Cray style vectors
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 15 Mar 2024, 18:07:19
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Mar15.180719@mips.complang.tuwien.ac.at>
References: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
User-Agent: xrn 10.11

Terje Mathisen <terje.mathisen@tmsw.no> writes:
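> Here you can probably schedule the fixup to happen in parallel with the
> actual multiplication:
>
> ;; inputs in r9 & r10, result in rdx:rax, rbx & rcx as scratch
>
>   mov rax,r9    ;; All these can start in the first cycle
>   mul r10       ;; unsigned 64x64 -> 128-bit product in rdx:rax
>   mov rbx,r9    ;; The MOV can be handled by the renamer
>   sar r9,63     ;; r9 = all ones if the first input is negative, else 0
>   mov rcx,r10   ;; Ditto
>   sar r10,63    ;; r10 = all ones if the second input is negative, else 0
>
>   and rbx,r10   ;; Second set of ops: first input if the second is negative
>   and rcx,r9    ;; second input if the first is negative
>
>   add rbx,rcx   ;; Third cycle
>
>   sub rdx,rbx   ;; Do a single adjustment as soon as the MUL finishes
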
Of course on AMD64 you could just use imul instead.
RISC-V also supports signed as well as unsigned (and also
signed*unsigned) multiplication, and I think that's also the case for
ARM A64. But on Alpha this technique would be useful.
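
For an ISA that offers only an unsigned high multiply, the fixup quoted above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not code from the thread: the function names are made up, and the sketch relies on the `__int128` extension of GCC and Clang, on arithmetic right shift of negative values, and on modular conversion from unsigned to signed, none of which ISO C guarantees.

```c
#include <stdint.h>

/* Unsigned high half of a 64x64-bit product.
   Assumption: the compiler provides unsigned __int128 (GCC/Clang). */
static uint64_t mulhu64(uint64_t a, uint64_t b)
{
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* Signed high half built from the unsigned one plus the fixup:
   subtract b if a is negative, and subtract a if b is negative. */
static int64_t mulhs64_via_unsigned(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t mask_a = (uint64_t)0 - (ua >> 63); /* all ones if a < 0 */
    uint64_t mask_b = (uint64_t)0 - (ub >> 63); /* all ones if b < 0 */
    uint64_t fix = (ub & mask_a) + (ua & mask_b);
    return (int64_t)(mulhu64(ua, ub) - fix);
}

/* Reference result using a signed 128-bit multiply (same extension). */
static int64_t mulhs64_ref(int64_t a, int64_t b)
{
    return (int64_t)(((__int128)a * b) >> 64);
}

/* Returns 1 if the fixup version matches the reference for (a, b). */
static int agrees(int64_t a, int64_t b)
{
    return mulhs64_via_unsigned(a, b) == mulhs64_ref(a, b);
}
```

The masks play the role of the two `sar` results, and the single subtraction at the end corresponds to the final `sub rdx,rbx`; the low half of the product needs no adjustment.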
- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>