Subject: Re: Computer architects leaving Intel...
From: tr.17687 (at) *nospam* z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Date: 16 Sep 2024, 02:32:51
Organization: A noiseless patient Spider
Message-ID: <86jzfccri4.fsf@linuxsc.com>
References: 1 2 3 4 5
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)

kegs@provalid.com (Kent Dickey) writes:
[examples of descending loops with unsigned loop variables]
>
> This discussion wandered into many subthreads, but I only want to make
> one post and chose here.
>
> When you write code working on signed numbers and do something like:
>
>     (a < 0) || (a >= max)
>
> Then the compiler realizes if you treat 'a' as unsigned, this is just:
>
>     (unsigned)a >= max
>
> since any negative number, treated as unsigned, will be larger than the
> largest positive signed number. So, to do loops which count down and
> have any stride using an unsigned loop count:
>
>     for(u = start; u <= start; u -= step)
>
> With the usual caveats (start must be a valid signed number, and step
> cannot be so large that start + step crosses the signed boundary).

Clever, although maybe too tricky. Better if start and step are
also unsigned, in which case a safe test is easily seen to be
start + step > start.
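
To make the two idioms above concrete, here is a minimal C sketch. The
function names (out_of_range_signed, out_of_range_unsigned, count_down)
are hypothetical, max is assumed to be non-negative, and size_t merely
stands in for "some unsigned type". The guard in count_down is the
start + step > start test mentioned above.

```c
#include <stddef.h>

/* Range test as written for a signed value; assumes max >= 0. */
int out_of_range_signed(int a, int max)
{
    return (a < 0) || (a >= max);
}

/* The same test as one comparison: a negative a converts to a value
   above INT_MAX, so it compares >= any non-negative max. */
int out_of_range_unsigned(int a, int max)
{
    return (unsigned)a >= (unsigned)max;
}

/* Visits start, start - step, start - 2*step, ... while the value has
   not wrapped below zero. Returns the number of values visited and
   stores their sum in *sum. */
size_t count_down(size_t start, size_t step, unsigned long long *sum)
{
    size_t visited = 0;

    *sum = 0;
    /* Rejects step == 0 and any start + step that would wrap around. */
    if (!(start + step > start))
        return 0;

    /* When u - step wraps, u becomes larger than start and the loop ends. */
    for (size_t u = start; u <= start; u -= step) {
        *sum += u;
        visited++;
    }
    return visited;
}
```

With start = 10 and step = 3 the loop visits 10, 7, 4, 1 and then stops,
because 1 - 3 wraps to a value far above start.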

> But: unsigned numbers in C have some dangers, which no one here has
> mentioned. Some code presented comes CLOSE to being wrong, but gets
> lucky. With "int" being 32 bits, C promotion rules around unsigned
> ints, signed ints, and unsigned 64-bit can create trouble.
>
>     uint64_t dval; uint32_t val32; int a;
>
>     val32 = 1; dval = 1; a = 1;
>     dval = val32 - 2 + dval;
>
> C will do (val32 - 2) first, which is (1U - 2) which is 0xffff_ffff, and
> then add dval, and the result is 0x1_0000_0000.

Not really interesting. It's usually a mistake to mix different
types, whether or not the types have different signedness. Arithmetic
is one problem but assignment is another. Using the same type
throughout avoids surprises like this one.
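
The surprise is easy to reproduce. The following is a small sketch,
assuming a platform where int is 32 bits; the function names mixed_sum
and same_type_sum are made up for illustration.

```c
#include <stdint.h>

/* Mixed widths: val32 - 2 is evaluated in 32-bit unsigned arithmetic
   (assuming 32-bit int) and wraps before it is widened to 64 bits. */
uint64_t mixed_sum(uint32_t val32, uint64_t dval)
{
    return val32 - 2 + dval;
}

/* Same type throughout: the subtraction wraps in 64 bits, and the
   following addition wraps back, giving the arithmetic result. */
uint64_t same_type_sum(uint64_t val, uint64_t dval)
{
    return val - 2 + dval;
}
```

Under that assumption, mixed_sum(1, 1) yields 0x100000000, whereas
same_type_sum(1, 1) yields 0.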

> Signed numbers don't have this risk, so if you're doing known small loops,
> you can just use ints. If you're doing possibly large loops, just use
> int64_t.

I consider this bad advice. Loops are doing something with the loop
variable, and its type should be chosen according to how it is used.
If the loop variable represents an index, or a length, or count, it
should be unsigned (or unsigned long, etc). If the loop variable
represents degrees C or F, or some other naturally signed measure, it
should be signed (or maybe floating point). What kind of loop it
is, whether ascending or descending, or what the increment is, etc,
is secondary; a more important factor is what sort of value is
being represented, and in almost all cases that is what should
determine the type used.
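
As an illustration of choosing types by what the values represent, here
is a short sketch. The function count_below and its parameter names are
hypothetical: the index and the count are unsigned (size_t), while the
temperatures, being naturally signed, are plain int.

```c
#include <stddef.h>

/* Counts how many readings (degrees C, signed) lie below limit_c.
   The index i and the returned count are unsigned because they can
   never be negative. */
size_t count_below(const int *temps_c, size_t n, int limit_c)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (temps_c[i] < limit_c)
            count++;
    }
    return count;
}
```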

> Bringing it back to "architecture": as Anton Ertl has said, LP64 for
> C/C++ is a mistake. It should always have been ILP64, and this nonsense
> would go away. Any new architecture should make C ILP64 (looking at you
> RISC-V, missing yet another opportunity to not make the same mistakes as
> everyone else).

I believe this view is shortsighted. The big mistake is developers
hardcoding types everywhere - especially int, but also long, and
their unsigned variants. It's almost never a good idea to hardcode
a specific width (e.g., uint32_t) in a type name used for parameters
or local variables, yet doing so is a very common practice.
Names of types should reflect how the variable is meant to be used,
not the specifics of what sort of register it goes into. The more
firmly we cement our programs to specific hardware choices, the
greater the pain when those choices need to change, either due to
time or moving to a different platform. The key is to keep things
light and flexible, not encrusted onto fixed hardware choices like
barnacles on the hull of a ship.
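
One way to read this advice, sketched in C: name types after their role
and pick the underlying type in a single place. The typedef names
sample_count and sample_value, and the function max_sample, are
hypothetical and only illustrate the idea.

```c
#include <stddef.h>

/* Role-based names; the underlying types can change here without
   touching the code that uses them. */
typedef size_t sample_count;   /* how many samples: a count, so unsigned */
typedef long   sample_value;   /* a signed measurement */

/* Returns the largest sample, or 0 when n == 0. */
sample_value max_sample(const sample_value *samples, sample_count n)
{
    sample_value best = 0;

    for (sample_count i = 0; i < n; i++) {
        if (i == 0 || samples[i] > best)
            best = samples[i];
    }
    return best;
}
```

If the platform or the data later calls for a wider or narrower type,
only the two typedef lines need to change.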