Re: Microarch Club

Subject: Re: Microarch Club
From: terje.mathisen (at) *nospam* tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Date: 28 Mar 2024, 09:31:11
Organization: A noiseless patient Spider
Message-ID: <uu39sg$3fb7n$1@dont-email.me>
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 SeaMonkey/2.53.18.1
Scott Lurndal wrote:
mitchalsup@aol.com (MitchAlsup1) writes:
BGB wrote:
>
On 3/26/2024 5:27 PM, Michael S wrote:
>
>
For slightly less than 20 years ARM managed OK without integer divide.
Then in 2004 they added an integer divide instruction in ARMv7 (including
the ARMv7-M variant intended for small microcontroller cores like
Cortex-M3) and for the following 20 years instead of merely OK they are
doing great :-)
>
>
OK.
>
The point is they are doing better now after adding IDIV and FDIV.
>
I think both modern ARM and AMD Zen went over to "actually fast" integer
divide.
>
I think for a long time, the de facto integer divide was ~ 36-40 cycles
for 32-bit, and 68-72 cycles for 64-bit. This is also on par with what I
can get from a shift-add unit.
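
To illustrate where cycle counts of that order come from, here is a minimal sketch of a one-bit-per-step restoring (shift-subtract) divider. It is a software model under assumed names (`restoring_divide`, `width`), not a description of any particular core: the loop always runs `width` times, so a 32-bit divide needs 32 steps before any setup or sign-fixup overhead is added.

```python
def restoring_divide(n, d, width=32):
    """Unsigned restoring division, one quotient bit per step.

    Returns (quotient, remainder, steps). Hypothetical model for
    illustration only; a hardware unit would do one step per cycle.
    """
    mask = (1 << width) - 1
    n &= mask
    d &= mask
    if d == 0:
        raise ZeroDivisionError("division by zero")
    rem = 0
    quo = 0
    steps = 0
    for i in range(width - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        rem = (rem << 1) | ((n >> i) & 1)
        quo <<= 1
        # Trial subtract: keep the result only if it does not go negative.
        if rem >= d:
            rem -= d
            quo |= 1
        steps += 1
    return quo, rem, steps
```

With `width=64` the same loop takes 64 steps, which matches the rough doubling between the 32-bit and 64-bit figures quoted above.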
>
Those numbers are acceptable for shift-subtract division (including
SRT variants).
>
What I don't get is the reluctance to use the FP multiplier as a fast
divider (IBM 360/91). AMD Opteron used this means to achieve 17-cycle
FDIV and 22-cycle SQRT in 1998. Why should IDIV not be under 20 cycles ??
and with special casing of leading 1s and 0s average around 10 cycles ???
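
As a rough sketch of the multiplier-based idea, the following uses Newton-Raphson reciprocal refinement followed by a fix-up step. This is one of several multiply-based schemes and is not claimed to be what the 360/91 or the AMD parts implemented; the function names, the linear starting estimate, and the iteration count are all assumptions made for illustration. Each refinement step costs two multiplies and roughly doubles the number of correct bits.

```python
import math


def reciprocal_newton(d, iterations=4):
    """Approximate 1/d for d > 0 using only multiplies and subtracts."""
    m, e = math.frexp(d)              # d = m * 2**e with 0.5 <= m < 1
    x = 48.0 / 17.0 - (32.0 / 17.0) * m   # linear first guess for 1/m
    for _ in range(iterations):
        x = x * (2.0 - m * x)         # error is squared on every pass
    return math.ldexp(x, -e)          # undo the scaling: 1/d = (1/m) * 2**-e


def udiv32_via_multiply(n, d):
    """Unsigned 32-bit divide built on the reciprocal estimate above."""
    n &= 0xFFFFFFFF
    d &= 0xFFFFFFFF
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = int(n * reciprocal_newton(float(d)))
    r = n - q * d
    # The estimate can be off by a small amount; correct it exactly.
    while r < 0:
        q -= 1
        r += d
    while r >= d:
        q += 1
        r -= d
    return q, r
```

The correction loops at the end are what make the integer result exact; the reciprocal itself only needs to be close.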
Empirically, the ARM Cortex-M7 udiv instruction requires 3+[s/2] cycles
(where s is the number of significant digits in the quotient).
https://www.quinapalus.com/cm7cycles.html
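
A small sketch of the quoted 3+[s/2] estimate makes the data dependence visible. Two assumptions are mine, not the source's: s is taken to be the bit length of the quotient, and the bracket is treated as floor by default (the hypothetical `round_up` flag switches to ceiling, since the notation does not say which is meant).

```python
def udiv_cycles_estimate(n, d, round_up=False):
    """Estimated udiv latency from the empirical 3 + [s/2] formula.

    Assumes s = number of significant bits in the quotient; the
    rounding direction of [s/2] is a guess selected by round_up.
    """
    if d == 0:
        raise ZeroDivisionError("division by zero")
    s = (n // d).bit_length()
    half = (s + 1) // 2 if round_up else s // 2
    return 3 + half
```

Under these assumptions a quotient of zero costs 3 cycles while a full 32-bit quotient costs 19, so the latency depends directly on the operand values.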
That looks a lot like an SRT divider with early out?
Having variable timing DIV means that for any crypto operation (including hashes?) where you use modulo operations, said modulus _must_ be a known constant, otherwise information about it will leak from the timings, right?
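
For the known-constant case, here is a minimal sketch of how a remainder can be computed with a precomputed reciprocal, one multiply, and a branch-free final correction, so that no DIV instruction is involved. The names (`make_const_mod`, `mod_p`) and the 32-bit operand range are assumptions for illustration, and Python itself gives no timing guarantees; the point is only the shape of the operation sequence, which is the same for every input.

```python
def make_const_mod(p):
    """Build a remainder function for a fixed modulus 0 < p < 2**32."""
    if not 0 < p < 2**32:
        raise ValueError("modulus must fit in 32 bits and be nonzero")
    m = (1 << 64) // p                # precomputed once: floor(2**64 / p)

    def mod_p(x):
        x &= 0xFFFFFFFF
        q = (x * m) >> 64             # quotient estimate, low by at most 1
        r = x - q * p                 # therefore 0 <= r < 2*p
        t = r - p
        borrow = (t >> 64) & 1        # 1 if t is negative, else 0
        return t + (p & -borrow)      # add p back only when t went negative

    return mod_p
```

For example, `make_const_mod(7)(100)` evaluates the same multiply, shift, and masked add as any other input would, and yields the remainder 2.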
 
>
I submit that at 10 cycles for average latency, the need to invent screwy
forms of even faster division falls by the wayside {accurate or not}.
I agree.
Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
