Re: Microarch Club

Subject: Re: Microarch Club
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.arch
Date: 27 Mar 2024, 19:32:56
Organization: A noiseless patient Spider
Message-ID: <uu1op0$30i4b$1@dont-email.me>
References: 1 2 3 4 5 6 7
User-Agent: Mozilla Thunderbird
On 3/26/2024 5:27 PM, Michael S wrote:
On Tue, 26 Mar 2024 16:59:57 -0500
BGB-Alt <bohannonindustriesllc@gmail.com> wrote:
 
On 3/26/2024 2:16 PM, MitchAlsup1 wrote:
BGB wrote:
  
On 3/25/2024 5:17 PM, MitchAlsup1 wrote:
BGB-Alt wrote:
 
  
Say, "we have an instruction, but it is a boat anchor" isn't an
ideal situation (unless to be a placeholder for if/when it is not
a boat anchor).
>
If the boat anchor is a required unit of functionality, and I
believe IDIV and FPDIV is, it should be defined in ISA and if you
can't afford it find some way to trap rapidly so you can fix it up
without excessive overhead. Like a MIPS TLB reload. If you can't
get trap and emulate at sufficient performance, then add the HW to
perform the instruction.
>
Though, 32-bit ARM managed OK without integer divide.
>
For slightly less than 20 years ARM managed OK without integer divide.
Then in 2004 they added an integer divide instruction in ARMv7 (including
the ARMv7-M variant intended for small microcontroller cores like
Cortex-M3) and for the following 20 years instead of merely OK they are
doing great :-)
 
OK.
I think both modern ARM and AMD Zen went over to "actually fast" integer divide.
I think for a long time, the de facto integer divide latency was ~36-40 cycles for 32-bit and 68-72 cycles for 64-bit. This is also on par with what I can get from a shift-add unit.
On my BJX2 core, it is currently similar (36 and 68 cycles for 32-bit and 64-bit divide).
This works out faster than a generic shift-subtract divider (or using a runtime call which then sorts out what to do).
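For reference, the 1-bit-per-step approach can be modeled in C roughly as below. This is only a minimal sketch of a textbook restoring (shift-subtract) divider, not the actual BJX2 logic; the names `udiv32_result` and `udiv32_shift_subtract` are made up for illustration. Each loop iteration stands in for one cycle, so 32 iterations plus some setup and writeback overhead would land in the neighborhood of the 36-cycle figure.

```c
#include <assert.h>
#include <stdint.h>

/* Quotient and remainder of an unsigned 32-bit divide. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} udiv32_result;

/*
 * Hypothetical software model of a restoring shift-subtract divider.
 * One quotient bit is produced per iteration (one "cycle" in hardware).
 */
static udiv32_result udiv32_shift_subtract(uint32_t n, uint32_t d)
{
    assert(d != 0);          /* divide-by-zero handling is out of scope here */

    uint64_t rem = 0;        /* wider than 32 bits so the shift cannot overflow */
    uint32_t quot = 0;

    for (int i = 31; i >= 0; i--) {
        /* Shift the next dividend bit (MSB first) into the partial remainder. */
        rem = (rem << 1) | ((n >> i) & 1u);
        quot <<= 1;
        /* Trial subtract: keep the result only if it does not go negative. */
        if (rem >= d) {
            rem -= d;
            quot |= 1u;
        }
    }

    udiv32_result r = { quot, (uint32_t)rem };
    return r;
}
```

With this model, `udiv32_shift_subtract(100, 7)` yields quotient 14 and remainder 2. A 64-bit version would run the same loop 64 times, which matches the roughly doubled latency.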
A special case allows turning small divisors internally into divide-by-reciprocal, which allows for a 3-cycle divide special case. But, this is a LUT cost tradeoff.
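A rough C model of the small-divisor idea might look like the following. It is a sketch under assumptions: the cutoff of 64 (`SMALL_DIV_LIMIT`), the 32-bit fixed-point reciprocal format, and the single fix-up step are all choices made here for illustration, and the function names are hypothetical. The table stores floor(2^32 / d), so the multiply can come out one too low, and one compare-and-increment corrects it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cutoff: divisors 1..63 take the reciprocal fast path. */
#define SMALL_DIV_LIMIT 64u

/* Reciprocal table; in hardware this would be a fixed LUT/ROM of constants. */
static uint64_t recip_lut[SMALL_DIV_LIMIT];
static int recip_lut_ready = 0;

static void recip_lut_init(void)
{
    for (uint32_t d = 1; d < SMALL_DIV_LIMIT; d++)
        recip_lut[d] = (UINT64_C(1) << 32) / d;   /* floor(2^32 / d) */
    recip_lut_ready = 1;
}

/* Hypothetical model: fast path for small divisors, slow path otherwise. */
static uint32_t udiv32_fast_path(uint32_t n, uint32_t d)
{
    assert(d != 0);

    if (d >= SMALL_DIV_LIMIT)
        return n / d;        /* stands in for the slow iterative divider */

    if (!recip_lut_ready)
        recip_lut_init();

    /* Multiply by the reciprocal; the estimate is either exact or 1 too low. */
    uint32_t q = (uint32_t)(((uint64_t)n * recip_lut[d]) >> 32);

    /* Fix-up: recompute the remainder and bump the quotient if needed. */
    uint32_t r = n - q * d;
    if (r >= d)
        q++;

    return q;
}
```

The tradeoff mentioned above shows up directly: widening `SMALL_DIV_LIMIT` grows the table, so the fast path covers more divisors at the price of more LUTs.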
It could be possible in theory to support a general 3-cycle integer divide, but only if one can accept inexact results (it would still be faster than the software-based lookup-table strategy).
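To make "inexact" concrete, here is one way such a general approximate divide could be modeled in C. This is purely an illustrative sketch, not a description of any existing hardware: the 256-entry table, the midpoint rounding, and the names `clz32`, `approx_recip_lut`, and `udiv32_approx` are all assumptions. The divisor is normalized so its top bit is set, the next 8 bits index a reciprocal table, and the quotient is a single multiply and shift with no correction step.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed table of ~1/x for x in [1,2), scaled by 2^16, indexed by 8 fraction bits. */
static uint32_t approx_recip_lut[256];
static int approx_recip_ready = 0;

static void approx_recip_init(void)
{
    /* Entry i approximates 2^16 / (1 + (i + 0.5)/256) = 2^25 / (513 + 2*i). */
    for (uint32_t i = 0; i < 256; i++)
        approx_recip_lut[i] = (UINT32_C(1) << 25) / (513u + 2u * i);
    approx_recip_ready = 1;
}

/* Count leading zeros of a nonzero 32-bit value (portable loop version). */
static int clz32(uint32_t v)
{
    int n = 0;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical inexact divide: normalize, table lookup, multiply, shift. */
static uint32_t udiv32_approx(uint32_t n, uint32_t d)
{
    assert(d != 0);
    if (!approx_recip_ready)
        approx_recip_init();

    int shift = clz32(d);                 /* normalize so bit 31 of dn is set */
    uint32_t dn = d << shift;
    uint32_t idx = (dn >> 23) & 0xFFu;    /* 8 bits just below the leading 1 */

    /* n/d ~= n * (2^16 / x) >> (16 + 31 - shift), with x = dn / 2^31. */
    uint64_t prod = (uint64_t)n * approx_recip_lut[idx];
    return (uint32_t)(prod >> (47 - shift));
}
```

With this particular table the relative error stays around 0.2%, so `udiv32_approx(1000, 7)` happens to give the exact 142, while `udiv32_approx(0x80000000, 0x80000000)` gives 0 rather than 1. Whether that is acceptable depends entirely on the use case, and C's `/` operator could not be mapped onto it directly.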
But, it is debatable. Pure minimalism would likely favor leaving out divide (and a bunch of other stuff). Usual rationale being, say, to try to fit the entire ISA listing on a single page of paper or similar (vs having a listing with several hundred defined encodings).
Never mind that the commonly used ISAs (x86 and 64-bit ARM) have ISA listings that are considerably larger (thousands of encodings).
...

