On 9/1/2024 1:34 AM, Terje Mathisen wrote:
MitchAlsup1 wrote:
it is 53×53->106 to get correct rounding in 1 step.
>
It was a revelation to me when I wrote my first fp emulation code and
grok'ed how having a single guard bit followed by a sticky bit was
sufficient to do this for all rounding modes.
>
At that point I only needed to maintain enough intermediate bits to
guarantee I would still have those rounding bits after normalization.
>
This doesn't mean that I could skip calculating all the bits of the full
NxN->2N mantissa product, only that I didn't need to keep them all
around after normalization.
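
For reference, the guard-plus-sticky idea can be sketched roughly as below. This is only an illustration, not the emulation code referred to above: the names round_mantissa and rnd_mode are made up here, and it assumes the product has already been normalized so that exactly `drop` low bits are to be discarded.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum { RND_NEAREST_EVEN, RND_TOWARD_ZERO, RND_UP, RND_DOWN } rnd_mode;

/*
 * Round a normalized magnitude 'prod' by discarding its low 'drop' bits
 * (1 <= drop <= 63). Only two summary bits of the discarded part matter:
 *   guard  = the highest discarded bit
 *   sticky = OR of all discarded bits below the guard
 * The caller must still handle a carry out of the kept field
 * (renormalize and bump the exponent).
 */
static uint64_t round_mantissa(uint64_t prod, int drop, rnd_mode mode, int negative)
{
    uint64_t kept    = prod >> drop;
    unsigned guard   = (unsigned)((prod >> (drop - 1)) & 1u);
    unsigned sticky  = (prod & ((UINT64_C(1) << (drop - 1)) - 1)) != 0;
    unsigned inexact = guard | sticky;
    unsigned inc     = 0;

    switch (mode) {
    case RND_NEAREST_EVEN: /* above half, or exactly half with an odd LSB */
        inc = guard & (sticky | (unsigned)(kept & 1u));
        break;
    case RND_TOWARD_ZERO:  /* plain truncation */
        inc = 0;
        break;
    case RND_UP:           /* toward +inf: bump positive inexact values */
        inc = inexact && !negative;
        break;
    case RND_DOWN:         /* toward -inf: bump negative inexact magnitudes */
        inc = inexact && negative;
        break;
    }
    return kept + inc;
}
```

The full 2N-bit product is still needed to compute the sticky bit correctly; it just does not need to be carried around afterwards.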
>
OK.
>
It seemed, when I initially looked over the 1985 spec, that it only
required the intermediate result to be wider than the destination
(I seemingly missed the point that it also requires infinite precision).
>
Say, 54*54 => 68 bits: since 68 > 52, it would have worked under this
interpretation. Granted, this does turn it into a probability game
whether the result is correct or off by 1.
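
A scaled-down way to see the "probability game" (a toy sketch, not the actual FPU datapath): take 8-bit mantissas, form the full 16-bit product, and compare round-to-nearest-even of the full product against the same rounding applied after the low 5 product bits were thrown away. The widths 8/16/5 and the function names are assumptions chosen only to keep the search space small.

```c
#include <stdint.h>

/* Round-to-nearest-even of a normalized 16-bit product down to 8 bits. */
static unsigned rne16to8(unsigned p)
{
    unsigned kept   = p >> 8;
    unsigned guard  = (p >> 7) & 1u;
    unsigned sticky = (p & 0x7Fu) != 0;
    return kept + (guard & (sticky | (kept & 1u)));
}

/*
 * Count the 8-bit mantissa pairs (hidden bit set, so 128..255) whose
 * rounded product changes when the low 5 product bits are discarded
 * before rounding, i.e. when the sticky information is incomplete.
 */
static unsigned count_misrounded(void)
{
    unsigned bad = 0;
    for (unsigned a = 128; a < 256; a++) {
        for (unsigned b = 128; b < 256; b++) {
            unsigned p = a * b;          /* 15 or 16 significant bits */
            if (p < 0x8000u)
                p <<= 1;                 /* normalize: top bit at position 15 */
            unsigned truncated = p & ~0x1Fu;
            bad += rne16to8(p) != rne16to8(truncated);
        }
    }
    return bad;
}
```

For example, 254*178 = 45212 rounds to 177 with the full product, but to 176 once the low bits are dropped, because a slightly-above-half case then looks like an exact tie.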
>My point exactly,
But I have since noticed that it did specify computing to infinite
precision (in this version of the standard), which my FPU does not do.
>
>Something IEEE specifies but would require an intermediate of 2045
>
There was mention of some operations that I have generally not seen in
the ISA of real-world FPUs:
An FP remainder operator;
Converters to/from ASCII strings;
>Easier and better in SW.
An FP->Int truncate operator with the result still in FP format;
>RND (round) instruction.
Usually, one goes round-trip FP->Int->FP;
>Has underflow and overflow problems: 2^1022 -> int => overflow, ...
...
>More modern machines have RND; nobody will ever have REM.
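
To make the round-trip problem concrete, here is a small sketch in C for Binary64. The function names are made up for illustration; trunc_bits only shows roughly what an "FP->Int truncate, result still in FP" operation has to do, and is not a claim about how any particular RND instruction is implemented.

```c
#include <stdint.h>
#include <string.h>

/* Truncate toward zero while staying in FP format: clear the fraction
 * bits that lie below the binary point. Works for any magnitude. */
static double trunc_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    int e = (int)((u >> 52) & 0x7FF) - 1023;      /* unbiased exponent */
    if (e < 0)
        u &= UINT64_C(1) << 63;                   /* |x| < 1: keep only the sign */
    else if (e < 52)
        u &= ~((UINT64_C(1) << (52 - e)) - 1);    /* drop fractional bits */
    /* e >= 52: already integral (or Inf/NaN), leave unchanged */
    memcpy(&x, &u, sizeof x);
    return x;
}

/* The FP->Int->FP round trip is only defined when x fits in int64_t. */
static int roundtrip_in_range(double x)
{
    return x >= -9223372036854775808.0 && x < 9223372036854775808.0;
}

/* Caller must check roundtrip_in_range() first; otherwise the
 * conversion to int64_t is undefined behavior in C. */
static double trunc_roundtrip(double x)
{
    return (double)(int64_t)x;
}
```

Something like 1e300 is already an integer as far as trunc_bits is concerned, but it cannot go through the integer round trip at all.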
>
Seems like pretty much everyone offloaded these tasks to the C library.
>You could check for "inability to correctly round and trap on that
>
I had ended up with coverage of most of the rest, albeit still lacking a
"trap on denormal" handler (seemingly worked for MIPS and friends, *).
>
So, it seemed like it was getting pretty close to "could maybe pass the
1985 spec if one lawyers it...". Maybe not so much it seems, unless I
fix the FMUL issue (TBD if it can be done without significantly
increasing adder-chain latency).
>GPUs started out without even IEEE 754 formats and over many generations
>
It is possible I could also add a check to detect and trap multiplies
for cases where both values have non-zero low-order bits (allowing these
to also be emulated in software).
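
In software terms, such a check might look something like the sketch below. LOW_BITS is a hypothetical width (the post does not say how many low-order bits the hardware would examine), and the function names are invented for the example. The test is conservative: it may flag multiplies whose truncated result would have been correct anyway.

```c
#include <stdint.h>
#include <string.h>

#define LOW_BITS 20   /* hypothetical: not a figure from the actual FPU */

/* Extract the 52 stored fraction bits of a Binary64 value. */
static uint64_t frac_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u & ((UINT64_C(1) << 52) - 1);
}

/* Flag a multiply for software emulation when BOTH operands have
 * non-zero bits in the low LOW_BITS of their mantissas. If either
 * operand has those bits clear, the low partial products vanish. */
static int fmul_needs_trap(double a, double b)
{
    const uint64_t low_mask = (UINT64_C(1) << LOW_BITS) - 1;
    return (frac_bits(a) & low_mask) != 0 &&
           (frac_bits(b) & low_mask) != 0;
}
```

So 0.1 * 0.1 would trap, while 0.1 * 2.0 or 1.5 * 3.0 would take the fast path.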
>
So, went and added a flag for "Trap as needed to emulate full IEEE
semantics" to FPSCR, where the idea is that enabling this will cause it
to trap in cases where the FPU detects that the results would likely not
match the IEEE standard (if using FADDG/FSUBG/FMULG/..., generally if
fenv_access is enabled).
>
Might make sense to have a compiler option to assume fenv_access is
always enabled.
>
>
>
*: Though, from what I can gather, most of the N64 games and similar had
operated with this disabled (giving DAZ/FTZ semantics) which apparently
posed an annoyance for later emulators (things like moving platforms in
games like SMB64 would apparently slowly drift upwards or away from the
origin if the map was left running for long enough, etc; due to SSE and
similar tending to operate with denormals enabled).
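
For illustration, DAZ/FTZ-like behavior can be approximated in software by flushing subnormals by hand, roughly as below. This is a sketch for Binary32 with a made-up function name, not how any particular emulator handles it; an emulator would have to apply it to inputs (DAZ) and results (FTZ) of each operation.

```c
#include <stdint.h>
#include <string.h>

/* Replace a subnormal Binary32 value with a zero of the same sign.
 * Zeros, normal numbers, Inf and NaN pass through unchanged. */
static float flush_subnormal(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    if ((u & 0x7F800000u) == 0)   /* exponent field 0: zero or subnormal */
        u &= 0x80000000u;         /* keep only the sign bit */
    memcpy(&x, &u, sizeof x);
    return x;
}
```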
>
With FMAC (with single rounding, which is the interesting one) you can of
course get catastrophic cancellation, so you need all the 2N mantissa
bits of the multiplication plus the N bits from the addend, then you
either need a normalizer wide enough to take in any possible alignment
of the two parts, or you must have separate logic for each of the major
cases.
>
Yeah, for the 2008 spec onward, would also need this...
>
It is possible to provide it as a library call, but granted this makes
it slower.
>
>
There are FMAC instructions, but they are currently both slow and
double-rounded (so, not so useful). Well, except for Binary16 and
Binary32 which appear single-rounded mostly because they happen to be
performed internally as Binary64 (but are still slow).
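
The difference is easy to show for Binary32 in C. This is a sketch with made-up function names: the first version rounds the product to Binary32 before adding (two roundings), the second forms the product exactly in Binary64 (24x24 bits fit in 53) and rounds to Binary32 only at the end. The second can still double-round in the addition in rare cases, so it is "mostly" single-rounded rather than a true fused operation.

```c
/* Product rounded to Binary32, then the sum rounded again. */
static float mul_add_two_roundings(float a, float b, float c)
{
    volatile float p = a * b;   /* volatile forces rounding to Binary32 */
    return p + c;
}

/* Product held exactly in Binary64; a final conversion to Binary32. */
static float mul_add_via_double(float a, float b, float c)
{
    double p = (double)a * (double)b;   /* exact for Binary32 inputs */
    return (float)(p + (double)c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. Rounding it to Binary32 first loses the 2^-24 term (a tie, rounded to even), so the two-rounding version cancels to 0, while the Binary64 path returns 2^-24.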
>