On 9/22/2024 3:43 PM, Paul A. Clayton wrote:
On 9/19/24 11:07 AM, EricP wrote:
[snip]
If the multiplier is pipelined with a latency of 5 and throughput of 1,
then MULL takes 5 cycles and MULL,MULH takes 6.
>
But those two multiplies still are tossing away 50% of their work.
I do not remember how multipliers are actually implemented — and
am not motivated to refresh my memory at the moment — but I
thought a multiply low would not need to generate the upper bits,
so I do not understand where your "50% of their work" is coming
from.
The high result needs the low result carry-out but not the rest of
the result. (An approximate multiply high for multiply by
reciprocal might be useful, avoiding the low result work. There
might also be ways that a multiplier could be configured to also
provide bit mixing similar to middle result for generating a
hash?)
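
To make the low/high dependency concrete, here is a rough software sketch of a 64x64 multiply built from 32-bit halves. It is only an arithmetic illustration, not a claim about how any real multiplier array is laid out; the names mul_low and mul_high and the 32-bit split are assumptions made for this example. The low result never forms the high-by-high partial product, and the high result takes only a carry from the low columns.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul_low(a, b):
    """Low 64 bits of a*b from 32-bit pieces (hypothetical helper).

    The a_hi*b_hi partial product is never formed, and only the low
    32 bits of the cross products contribute.
    """
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    cross = (a_lo * b_hi + a_hi * b_lo) & MASK32
    return (a_lo * b_lo + (cross << 32)) & MASK64


def mul_high(a, b):
    """High 64 bits of a*b from 32-bit pieces (hypothetical helper).

    From the low half, only the carry out of bit 63 is consumed.
    """
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    ll = a_lo * b_lo
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi
    # Sum of the column covering bits 32..63; its overflow is the
    # carry into bit 64, which is all the high half needs from below.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)
    carry = mid >> 32
    return (hh + (lh >> 32) + (hl >> 32) + carry) & MASK64
```

In this software form the high half still touches all four partial products to get its carry, while the low half uses three and discards the upper bits of two of them. How much of that carries over to a hardware partial-product tree is a separate question.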
I guess it might be interesting if one made a bigger multiplier out of
4-bit multipliers, in a way similar to a 4-bit shift-add.
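
A minimal sketch of that idea, assuming a radix-16 shift-add: each 4x4 product is shifted by its nibble position and accumulated. The names mul4x4, nibbles, and mul_from_nibbles are made up for illustration, and the plain Python multiply inside mul4x4 merely stands in for a 4x4 -> 8-bit hardware block.

```python
def mul4x4(x, y):
    """Stand-in for a 4x4 -> 8-bit multiplier block (hypothetical)."""
    assert 0 <= x < 16 and 0 <= y < 16
    return x * y


def nibbles(v, count):
    """Split v into `count` 4-bit digits, least significant first."""
    return [(v >> (4 * i)) & 0xF for i in range(count)]


def mul_from_nibbles(a, b, bits=32):
    """Full 2*bits-wide product of a and b using only 4x4 products."""
    n = bits // 4
    acc = 0
    for i, an in enumerate(nibbles(a, n)):
        for j, bn in enumerate(nibbles(b, n)):
            # Each partial product lands at nibble position i + j.
            acc += mul4x4(an, bn) << (4 * (i + j))
    return acc
```

For 32-bit operands this uses 64 of the 4x4 products. The sketch sums them sequentially, whereas hardware would presumably feed them into some kind of adder tree.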