Anton Ertl wrote:
> ; RSI->a[n], RDX->b[n], RDI->sum[n], RCX=-n
> I have a similar problem for the carry and overflow bits in
> < http://www.complang.tuwien.ac.at/anton/tmp/carry.pdf >, and chose to
> let those bits not survive across calls; if there was a cheap solution
> for the problem, it would eliminate this drawback of my idea.

My 66000 ISA can encode the mpn_add_n() inner loop in 5 instructions,
whereas RISC-V encodes the inner loop in 11 instructions.
Source code:
void mpn_add_n( uint64_t *sum, uint64_t *a, uint64_t *b, int n )
{
    uint64_t c = 0;                       // carry between limbs
    for( int i = 0; i < n; i++ )
    {
        {c, sum[i]} = a[i] + b[i] + c;    // pseudocode: carry out -> c, low 64 bits -> sum[i]
    }
    return;
}
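The {c, sum[i]} assignment above is not legal C. A minimal portable sketch of the same loop, assuming the carry out is recovered by comparing the wrapped results (the temporaries t, s, c1 and c2 and the assert on n are my additions, not part of the original post), could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Portable C sketch of the pseudocode: sum[i] = a[i] + b[i] + carry,
   with the carry propagated from limb to limb. */
void mpn_add_n(uint64_t *sum, const uint64_t *a, const uint64_t *b, int n)
{
    assert(n >= 0);                   /* assumption: negative lengths are invalid */
    uint64_t c = 0;                   /* carry between limbs, always 0 or 1 */
    for (int i = 0; i < n; i++) {
        uint64_t t  = a[i] + b[i];    /* wraps modulo 2^64 */
        uint64_t c1 = t < a[i];       /* carry out of a[i] + b[i] */
        uint64_t s  = t + c;          /* add the incoming carry */
        uint64_t c2 = s < t;          /* carry out of adding the incoming carry */
        sum[i] = s;
        c = c1 | c2;                  /* at most one of the two can be set */
    }
}
```

The two compares per limb are the kind of bookkeeping an ISA without a carry facility has to spell out in instructions.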
Assembly code:
.global mpn_add_n
mpn_add_n:
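    MOV   R5,#0              // c
    MOV   R6,#0              // i
    VEC   R7,{}
    LDD   R8,[R2,R6<<3]      // a[i]
    LDD   R9,[R3,R6<<3]      // b[i]
    CARRY R5,{{IO}}          // R5 is carry in and carry out for the next ADD
    ADD   R10,R8,R9
    STD   R10,[R1,R6<<3]     // sum[i]
    LOOP  LT,R6,#1,R4        // i += 1, compare with n, branch
    RET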
So, adding a few "bells and whistles" to RISC-V does give you a
performance gain (1.38×); using a well-designed ISA gives you a
performance gain of 2.00× !! {{moral: don't stop too early}}
Note that all the register bookkeeping has disappeared !! because
of the indexed memory reference form.
As I count executing instructions, VEC does not execute, nor does
CARRY. CARRY causes the subsequent ADD to take c (R5) as its carry
input, and the carry produced by ADD goes back into c. LOOP performs
the ADD-CMP-BC sequence in a single instruction and in a single clock.
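To make the instruction count concrete, here is a rough C model of the loop as I read the listing. It is a sketch under assumptions, not a definition of the ISA: the function name model_inner_loop, the executed counter and the r5_out parameter are hypothetical, the loop is modeled as bottom-tested so n must be at least 1, and VEC and CARRY are treated as not counted, following the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the inner loop; variable names follow the registers
   in the listing. Returns the number of instructions counted as executed. */
uint64_t model_inner_loop(uint64_t *r1, const uint64_t *r2, const uint64_t *r3,
                          int64_t r4, uint64_t *r5_out)
{
    assert(r4 >= 1);                  /* assumption: bottom-tested loop, n >= 1 */
    uint64_t r5 = 0;                  /* MOV R5,#0 : c */
    int64_t  r6 = 0;                  /* MOV R6,#0 : i */
    uint64_t executed = 0;

    do {                              /* VEC: not counted */
        uint64_t r8 = r2[r6];         /* LDD R8 : a[i] */
        executed++;
        uint64_t r9 = r3[r6];         /* LDD R9 : b[i] */
        executed++;
        /* CARRY: not counted; the ADD below uses R5 as carry in and out */
        uint64_t t   = r8 + r9;
        uint64_t r10 = t + r5;
        r5 = (uint64_t)(t < r8) | (uint64_t)(r10 < t);
        executed++;                   /* ADD */
        r1[r6] = r10;                 /* STD : sum[i] */
        executed++;
        r6 += 1;                      /* LOOP: add, compare, branch as one */
        executed++;
    } while (r6 < r4);

    *r5_out = r5;                     /* final carry, left in R5 */
    return executed;
}
```

Under these assumptions the model counts five instructions per limb, which is the figure quoted for the inner loop.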