> On Wed, 31 Jul 2024 23:31:35 +0000, BGB wrote:
>> So, say, we have common formats:
>>   Binary64, S.E11.F52, Common Use
>>   Binary32, S.E8.F23, Common Use
>>   Binary16, S.E5.F10, Less Common Use
>>
>> But, things get funky below this:
>>   A-Law: S.E3.F4 (Bias=8)
>>   FP8: S.E4.F3 (Bias=7) (E4M3 in NVIDIA terms)
>>   FP8U: E4.F4 (Bias=7)
>>   FP8S: E4.F3.S (Bias=7)
>>
>> Semi-absent in my case:
>>   BFloat16: S.E8.F7
>>     Can be faked in software in my case using Shuffle ops.
>>   NVIDIA E5M2 (S.E5.F2)
>>     Could be faked using RGBA32 pack/unpack ops.
>
> So, you have identified the problem: 8 bits contain insufficient
> exponent and fraction widths to be considered a standard format.
> Thus, in order to utilize 8-bit FP one needs several incarnations.
> This just points back at the problem: FP needs at least 10 bits.

I agree that fp10 is probably the shortest sane/useful version, but 1:3:4 does in fact contain enough exponent and mantissa bits to be considered an IEEE 754 format.
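
To make the "several incarnations" point concrete, here is a minimal Python sketch that decodes the 8-bit layouts listed in the thread and reports the range each one covers. It is an illustration under stated assumptions, not anyone's actual hardware behaviour: it assumes IEEE-style subnormals when the exponent field is zero, treats every exponent code as finite (no Inf/NaN reserved), and the names `decode_minifloat` and `summarize` are made up for this example. The bias of 3 for the 1:3:4 layout and the bias of 15 for E5M2 are not stated in the thread; they are the usual 2^(E-1)-1 choice.

```python
def decode_minifloat(bits, exp_bits, frac_bits, bias, sign_pos="high"):
    """Decode a small float from an integer bit pattern.

    sign_pos: "high" (S.E.F), "low" (E.F.S), or None (unsigned, E.F).
    Assumption: exp == 0 means subnormal; no codes reserved for Inf/NaN.
    """
    if sign_pos == "low":
        sign = bits & 1
        bits >>= 1                      # drop the sign, leaving E.F
    elif sign_pos == "high":
        sign = (bits >> (exp_bits + frac_bits)) & 1
    else:
        sign = 0
    frac = bits & ((1 << frac_bits) - 1)
    exp = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    if exp == 0:
        # Subnormal: no hidden leading 1, exponent fixed at 1 - bias.
        mag = frac * 2.0 ** (1 - bias - frac_bits)
    else:
        # Normal: hidden leading 1 in front of the fraction.
        mag = (1 + frac / (1 << frac_bits)) * 2.0 ** (exp - bias)
    return -mag if sign else mag


def summarize(exp_bits, frac_bits, bias):
    """Return (smallest normal, largest finite, relative step) for a layout."""
    min_normal = 2.0 ** (1 - bias)
    max_exp = (1 << exp_bits) - 1 - bias
    max_finite = (2 - 2.0 ** -frac_bits) * 2.0 ** max_exp
    return min_normal, max_finite, 2.0 ** -frac_bits


# (exp_bits, frac_bits, bias); the last two biases are assumptions.
FORMATS = {
    "A-Law S.E3.F4": (3, 4, 8),
    "FP8   S.E4.F3": (4, 3, 7),
    "FP8U  E4.F4":   (4, 4, 7),
    "E5M2  S.E5.F2": (5, 2, 15),
    "1:3:4 S.E3.F4": (3, 4, 3),
}

if __name__ == "__main__":
    for name, (e, f, b) in FORMATS.items():
        lo, hi, step = summarize(e, f, b)
        print(f"{name}: min normal {lo:g}, max {hi:g}, rel. step {step:g}")
```

Under these assumptions, S.E4.F3 with bias 7 reaches 480 at a relative step of 1/8, while S.E3.F4 with bias 8 has a finer step of 1/16 but never reaches 1.0. Each 8-bit layout trades range against precision differently, which is why more than one of them ends up being needed.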
The messages shown here come from Usenet.