Re: Making Lemonade (Floating-point format changes)

Subject : Re: Making Lemonade (Floating-point format changes)
From : terje.mathisen (at) *nospam* tmsw.no (Terje Mathisen)
Newsgroups : comp.arch
Date : 10. Jun 2024, 07:07:28
Organization : A noiseless patient Spider
Message-ID : <v46571$7cro$1@dont-email.me>
User-Agent : Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 SeaMonkey/2.53.18.2

Lawrence D'Oliveiro wrote:
> On Mon, 13 May 2024 21:16:48 +0000, MitchAlsup1 wrote:
>
>> Emulation is slow when trap overhead is large and not-slow when trap
>> overhead is small.
>
> I think it was a particular version of the old Mac OS, from around 1990 or
> so, that implemented a really amazing hack. Some 32-bit machines had
> hardware floating-point, others didn’t. So developers of
> numerics-intensive apps had to build two versions of their code, one with
> the floating-point instructions, the other with calls to Apple’s SANE
> library.
>
> The hack involved running code built to use hardware floating-point
> instructions, on hardware that didn’t have them. The instructions were of
> course trapped and emulated. But more than that, the system would patch
> the instruction that caused the trap, turning it into a direct call into
> the emulation routine. So after the first execution, each such instruction
> would run much faster. Until the code got unloaded from RAM and the patch
> was lost, of course.

This only works when each FP instruction is at least as long as a function call.

This particular approach was standard on PCs more or less from the very beginning (i.e. 1981++): You could build applications with direct 8087 instructions, with pure sw emulation via CALL FDIV_emulation etc., or in a mode where each emitted hw FP instruction was followed by enough NOPs to make the total length at least 5 bytes. This way the missing-HW trap handler could patch them into CALLs (possibly followed by one or more NOPs if the HW opcode was very long) instead.

Since all those 8087 instructions were _very_ slow (30-300 clock cycles?), executing an extra NOP or two made no discernible difference.
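
To make the patching step concrete, here is a rough sketch of it as a simulation on a byte buffer. It is not the code any real compiler runtime used: the function name patch_fp_site, the addresses, and the choice of the 5-byte near CALL encoding (opcode 0xE8 plus a 32-bit relative displacement, as on 32-bit x86) are all assumptions made for illustration, and nothing is actually emulated. A real handler would also have to decode the trapping instruction and pick the matching emulation routine.

```python
import struct

NOP = 0x90         # one-byte x86 NOP
CALL_REL32 = 0xE8  # assumed encoding: near CALL + 32-bit relative displacement
CALL_LEN = 5       # opcode byte + 4 displacement bytes


def patch_fp_site(code: bytearray, site: int, slot_len: int, emulator_addr: int) -> None:
    """Hypothetical helper: overwrite the padded FP instruction at `site`.

    `slot_len` is the length of the hardware instruction plus its NOP padding.
    The slot becomes a CALL to `emulator_addr`, followed by NOPs if the slot
    was longer than the CALL itself.
    """
    if slot_len < CALL_LEN:
        raise ValueError("slot is shorter than a CALL; cannot patch in place")
    # The displacement is relative to the instruction that follows the CALL.
    rel = emulator_addr - (site + CALL_LEN)
    code[site:site + CALL_LEN] = struct.pack("<Bi", CALL_REL32, rel)
    # Any leftover bytes of a long hardware instruction are turned into NOPs.
    for i in range(site + CALL_LEN, site + slot_len):
        code[i] = NOP


# A two-byte x87 instruction (D8 C1, FADD ST(0),ST(1)) padded with three NOPs
# to reach 5 bytes, followed by a RET (C3).
code = bytearray([0xD8, 0xC1, 0x90, 0x90, 0x90, 0xC3])
patch_fp_site(code, site=0, slot_len=5, emulator_addr=0x100)
print(code.hex(" "))  # e8 fb 00 00 00 c3
```

The length check is the point of the first sentence above: if the slot is shorter than a CALL, the patch would spill into the next instruction, so the padding has to be there before the code ever runs.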
Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
