Subject: Re: Computer architects leaving Intel...
From: tr.17687 (at) *nospam* z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Date: 12 Sep 2024, 14:06:46
Organization: A noiseless patient Spider
Message-ID: <86y13xf2c9.fsf@linuxsc.com>
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
[considering which way to copy with memmove()]
> If the two memory blocks don't overlap, memmove() can use the
> fastest stride. [...]
>
> The way to go for memmove() is:
>
> On hardware where positive stride is faster:
>
>     if (((uintptr)(dest-src)) >= len)
>         return memcpy_posstride(dest,src,len)
>     else
>         return memcpy_negstride(dest,src,len)
>
> On hardware where the negative stride is faster:
>
>     if (((uintptr)(src-dest)) >= len)
>         return memcpy_negstride(dest,src,len)
>     else
>         return memcpy_posstride(dest,src,len)
>
> And I expect that my test is undefined behaviour, but most people
> except the UB advocates should understand what I mean.

Code inside the implementation is allowed to exploit internal
knowledge.

> The benefit of this comparison over just comparing the addresses
> is that the branch will have a much lower miss rate.

It's a clever idea.  It suffers from a few shortcomings.

First, the type name is uintptr_t.  Also, uintptr_t might not
exist.

Second, uintptr_t might be small, leading to incorrect behavior
in some cases.  Better to use a large unsigned type that is
known to exist, either unsigned long long or uintmax_t.

Third, pointer subtraction is not guaranteed to work for large
differences because ptrdiff_t might not be big enough.  This is
just a technicality because presumably the implementation would
know how big ptrdiff_t is and wouldn't use this approach if it
were too small.  That said, it's something to keep in mind if the
code is meant to be used on other systems.

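To make the second point concrete, here is a rough sketch that models addresses as plain unsigned integers, so the subtraction is well defined and wraps around. The helper names ascending_is_safe and ascending_is_safe_16bit are made up for illustration, and the 16-bit mask is only a stand-in for a hypothetical unsigned type narrower than the address space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: addresses are modelled as unsigned integers.
   dest below src:            the difference wraps to a huge value, test passes.
   dest at or past src+len:   the difference is >= len, test passes.
   dest inside (src,src+len): the difference is < len, test fails. */
static bool ascending_is_safe(uintmax_t dest, uintmax_t src, uintmax_t len)
{
    return dest - src >= len;
}

/* Same test, but the difference is squeezed into 16 bits to model an
   unsigned type that is too small (an assumption made for illustration). */
static bool ascending_is_safe_16bit(uintmax_t dest, uintmax_t src, uintmax_t len)
{
    return ((dest - src) & 0xFFFFu) >= len;
}
```

With dest = 200, src = 65726 and len = 70000 the blocks overlap with dest below src, so an ascending copy is required. The wide test says so, while the 16-bit model sees a difference of 10 and would wrongly choose the descending copy. When dest equals src the test fails for any nonzero len, which only means the other direction is picked; either direction works in that case.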
Last but not least, having two different code blocks for the
different preferences is clunky.  The two blocks can be
combined by fusing the two test expressions into a single
expression, for example:

    #include <stddef.h>   /* size_t */

    #ifndef PREFER_UPWARDS
    #define PREFER_UPWARDS 1
    #endif/*PREFER_UPWARDS*/

    extern void* ascending_copy( void*, const void*, size_t );
    extern void* descending_copy( void*, const void*, size_t );

    void *
    good_memmove( void *vd, const void *vs, size_t n ){
        const char *d = vd;
        const char *s = vs;
        /* +0ull converts the pointer difference to unsigned long long,
           so a negative difference wraps to a very large value */
        _Bool upwards = PREFER_UPWARDS  ?  d-s +0ull >= n  :  s-d +0ull < n;

        return
            upwards
            ?  ascending_copy( vd, vs, n )
            :  descending_copy( vd, vs, n );
    }

Using the preprocessor symbol PREFER_UPWARDS to select between
the two preferences (ascending or descending) allows the choice
to be made by a -D compiler option, and we can expect the compiler
to optimize away the part of the test that is never used.
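
For anyone who wants to try this out, here is a minimal self-contained sketch. The two copy routines are plain byte loops standing in for whatever tuned routines an implementation would really use, and the names sketch_memmove and agrees_with_memmove are invented for this example. The check only moves bytes within a single array, so the pointer subtraction is well defined there.

```c
#include <stddef.h>
#include <string.h>

#ifndef PREFER_UPWARDS
#define PREFER_UPWARDS 1
#endif

/* Hypothetical stand-in: copy from the lowest address upwards. */
static void *ascending_copy(void *vd, const void *vs, size_t n)
{
    unsigned char *d = vd;
    const unsigned char *s = vs;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return vd;
}

/* Hypothetical stand-in: copy from the highest address downwards. */
static void *descending_copy(void *vd, const void *vs, size_t n)
{
    unsigned char *d = vd;
    const unsigned char *s = vs;
    for (size_t i = n; i-- > 0; )
        d[i] = s[i];
    return vd;
}

/* Same fused test as in the post, under a different name. */
static void *sketch_memmove(void *vd, const void *vs, size_t n)
{
    const char *d = vd;
    const char *s = vs;
    _Bool upwards = PREFER_UPWARDS ? d-s +0ull >= n : s-d +0ull < n;
    return upwards ? ascending_copy(vd, vs, n) : descending_copy(vd, vs, n);
}

/* Returns 1 if sketch_memmove and the library memmove leave two
   identical 32-byte buffers in the same state.  The caller must keep
   dst_off + n and src_off + n within 32. */
static int agrees_with_memmove(size_t dst_off, size_t src_off, size_t n)
{
    char a[32], b[32];
    for (size_t i = 0; i < sizeof a; i++)
        a[i] = b[i] = (char)('A' + i);
    sketch_memmove(a + dst_off, a + src_off, n);
    memmove(b + dst_off, b + src_off, n);
    return memcmp(a, b, sizeof a) == 0;
}
```

Building once as is and once with -DPREFER_UPWARDS=0 exercises both preferences; the result should match memmove either way, only the direction chosen for non-overlapping blocks differs.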