Subject: Re: is double slower?
From: fir (at) *nospam* grunge.pl (fir)
Newsgroups: comp.lang.c
Date: 05 Nov 2024, 11:42:38
Organization: i2pn2 (i2pn.org)
Message-ID: <6729F69E.5000808@grunge.pl>
References: 1 2 3 4
User-Agent : Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
David Brown wrote:
On 05/11/2024 10:49, fir wrote:
David Brown wrote:
On 04/11/2024 08:53, fir wrote:
float takes less space, and when you keep arrays of floats, float is
for sure better (less space and less memory bandwidth, so i guess floats
can be up to twice as fast in some respects)
>
>
Certainly if you have a lot of them, then the memory bandwidth and cache
hit rate can make floats faster than doubles.
>
but when you do calculations on local variables rather than arrays,
is double slower than float?
>
I assume that for the calculations in question, the accuracy and range
of float are enough - otherwise the answer is obviously to use doubles.
>
>
This is going to depend on the cpu, the type of instructions, the source
code in question, the compiler and the options. So there is no single
easy answer.
>
You can, as Bonita suggested, look up instruction timing information at
agner.org for the cpu you are using (assuming it's an x86 device) to get
some idea of any fundamental differences in timings. Usually for modern
"big" processors, basic operations such as addition and multiplication
are single cycle or faster (i.e., multiple instructions can be done in
parallel) for float and double. But division, square root, and other
more complex operations can take a lot longer with doubles.
>
Next, consider if you can be using vector or SIMD operations. On some
devices, you can do that with floats but not doubles - and even if you
can use doubles, you can usually run floats at twice the rate.
>
>
In the source code, remember it is very easy to accidentally promote to
double when writing in C. If you want to stick to floats, make sure you
don't use double-precision constants - a missing "f" suffix can change a
whole expression into double calculations. Remember that it takes time
to convert between float and double.
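
A minimal sketch of the missing-suffix problem described above. The function names are made up for illustration, and the sizes in the comments assume the usual 4-byte float and 8-byte double:

```c
#include <stddef.h>

/* Missing "f" suffix: 0.5 is a double constant, so x is converted to
   double, the multiply is done in double, and the result is converted
   back to float on return. */
float half_promoted(float x)
{
    return x * 0.5;
}

/* With the suffix, the whole expression stays in float. */
float half_float(float x)
{
    return x * 0.5f;
}

/* sizeof reports the type each expression is evaluated in
   (typically 8 for the double version, 4 for the float one). */
size_t size_promoted(float x) { return sizeof(x * 0.5); }
size_t size_float(float x)    { return sizeof(x * 0.5f); }
```

Both functions return the same value for simple inputs; the difference is only in the conversions the compiler has to generate. gcc's "-Wdouble-promotion" should flag the first version.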
>
>
Then look at your compiler flags - these can make a big difference to
the speed of floating point code. I'm giving gcc flags, because those
are the ones I know - if you are using another compiler, look at the
details of its flags.
>
Obviously you want optimisation enabled if speed is relevant - -O2 is a
good start. Make sure you are optimising for the cpu(s) you are using -
"-march=native" is good for local programs, but you will want something
more specific if the binary needs to run on a variety of machines. The
closer you are to the exact cpu model, the better the code scheduling
and instruction choice can be.
>
Look closely at "-ffast-math" in the gcc manual. If that is suitable
for your code (and it often is), it can make a huge difference to
floating point intensive code. If it is unsuitable because you have
infinities, or need deterministic control of things like associativity,
it will make your results wrong.
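
To illustrate the associativity point, here is a small sketch (function names are hypothetical) of two groupings of the same float sum. It assumes strict IEEE arithmetic, i.e. compiled without "-ffast-math":

```c
/* Same three values, two groupings. Under strict IEEE rules the
   compiler must keep the grouping written in the source; with
   -ffast-math it is allowed to treat the two as interchangeable. */
float sum_left(float a, float b, float c)
{
    return (a + b) + c;
}

float sum_right(float a, float b, float c)
{
    return a + (b + c);
}
```

With a = 1e20f, b = -1e20f, c = 1.0f, sum_left gives 1.0f (the large values cancel first), while sum_right gives 0.0f (the 1.0f is lost when added to -1e20f). If your code relies on one particular grouping, "-ffast-math" is not safe for it.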
>
"-Wdouble-promotion" can be helpful to spot accidental use of doubles in
what you think is a float expression. "-Wfloat-equal" is a good idea,
especially if you are mixing floats and doubles. "-Wfloat-conversion"
will warn about implicit conversions from doubles to floats (or to
integers).
>
>
>
the code that seems to have sped up a bit when changing float to double is
>
>
I've tried to snip the bits that are important here.
>
inline float distance2d_(float x1, float y1, float x2, float y2)
{
    return sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
}
>
>
What happens here depends on what #include files you use. If you have
#include <math.h>, then "sqrt" is defined with doubles. So the
sum-of-squares expression is calculated using floats. Then this sum is
converted to a double (taking an extra instruction or two) before
calling double-precision sqrt. Then that result is converted back
to float to be returned.
>
If you have "#include <tgmath.h>", then "sqrt" here will be done as
float sqrtf, rather than double. But the library version of sqrtf()
might actually call sqrt (double). If you want to be sure, be explicit
with sqrtf().
>
And on many platforms, sqrt (float or double) uses a library function
for full IEEE compatibility. With "-ffast-math", you are telling the
compiler you promise that the operand for "sqrt" will be "nice", and it
can use a single hardware sqrt instruction. This will likely be a lot
faster, especially if the float version is used. (Disclaimer - I
haven't looked at this on modern x86 targets. Check yourself - I
recommend putting your code into godbolt.org and examining the assembly.)
>
>
In the code that uses this function, you are starting with integer types
that need to be converted to float to pass to the distance function, and
the result of the call is used in a float expression before being
converted to double.
>
In short, it is a complete mess of conversions. And unless you are
using something like gcc's "-ffast-math" to say "don't worry about the
minor details of IEEE, optimise akin to integer arithmetic", then the
compiler has to generate all these back-and-forth conversions.
>
>
Being consistent in your types is going to improve things, whether you
use floats or doubles. You might even be better off using integer
arithmetic in some points.
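
One place where integer arithmetic can help, as a sketch only: if the points are integers and you just need to know whether a point is inside the circle, you can compare squared distances and skip the sqrt and all conversions. The helper name and parameters are hypothetical, and this does not replace the fade calculation itself, which still needs the distance:

```c
#include <stdbool.h>

/* Hypothetical helper: true if (px,py) lies within radius r of (cx,cy).
   Compares squared distances, so no sqrt and no int <-> float
   conversions. long long keeps the squares from overflowing for
   typical screen-sized coordinates. */
bool within_radius(int cx, int cy, int px, int py, int r)
{
    long long dx = (long long)px - cx;
    long long dy = (long long)py - cy;
    return dx * dx + dy * dy <= (long long)r * r;
}
```

A test like this could be used to reject points outside the circle cheaply before doing any floating-point work.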
>
>
// here below was float ->
double p = (R - distance2d_(x,y,point[i].x,point[i].y));
>
>
Well, that's interesting.. especially as I was unaware of this sqrtf; I will look at it a bit later.
As for -ffast-math, I didn't notice a difference, though I only judged it by eye rather than measuring. I used it years ago but later disabled it, because I got a bug in one program which, as far as I recall, was caused by it
(I'm not sure, though; these days I rarely code at all, so I'm not very fresh on this kind of testing).
In fact I could optimise it much harder just by building a 45x45 table of that
fading circle and doing a lookup there. (Back then I did a big dose of this level of optimisation, but I know it belongs in the final stage of an app, since it generally makes the code harder to work on and to test various changes; as a final stage it is generally worth it if something runs 30-50% faster.)