Subject: Re: int a = a
From: david.brown (at) *nospam* hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Date: 21 Mar 2025, 10:44:05
Organization: A noiseless patient Spider
Message-ID : <vrjcd5$18m5n$1@dont-email.me>
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0
On 20/03/2025 20:46, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
On 20/03/2025 11:20, Keith Thompson wrote:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
The "could have been declared with the register storage class"
seems quite odd. And in fact it is quite odd.
>
I don't have the same reaction. The point of this phrase is that
undefined behavior occurs only for variables that don't have
their address taken. The phrase used describes that nicely.
Any questions related to "registerness" can be ignored, because
'register' in C really has nothing to do with hardware registers,
despite the name.
DR 338 is explicitly motivated by an IA-64 feature that applies only to
CPU registers. An object whose address is taken can't be stored (only)
in a register, so it can't have a NaT representation.
The phrase used is "could have been declared with register storage class
(never had its address taken)". Surely "never had its address taken"
would have been clear enough if CPU registers weren't a big part of the
motivation.
>
I too think the phrasing is a bit odd.
>
Just because a variable's address is taken, does not mean it cannot be
put in a cpu register by the compiler. If the variable is not
accessed in a way that actually requires putting it in memory, then
the compiler can put it in a cpu register (or otherwise optimise it).
So simply taking the address of a variable on IA-64 does not mean it
cannot be in a register, and thus does not necessarily mean it cannot
be NaT. Taking the address of a variable means the variable cannot be
declared "register", but it does not mean it cannot be /in/ a
register.
Sure, any variable that's stored in memory can be mirrored by holding
its value in a register.
    int n = 42; // Assume n is assigned a memory address
    printf("n+1=%d n+2=%d\n", n+1, n+2);
A compiler could plausibly store the value of n in a register before
computing n+1, and then reuse the register value to compute n+2.
Yes, of course. But there is also no necessity for variables to be in memory at all, nor for their location to be consistent. "Assume n is assigned a memory address" is a completely unwarranted assumption for almost all local variables. It is only if the address is taken, and used in some way that is beyond the optimiser, that the variable actually has to go in a fixed place in memory. Otherwise optimisers can and do keep data in registers, or move it in and out of registers and different stack slots according to convenience for efficient code.
    #include <stdint.h>
    #include <string.h>

    uint32_t float_to_uint(float f) {
        uint32_t u;
        /* Copy the object representation of f into u. */
        memcpy(&u, &f, sizeof u);
        return u;
    }
gcc compiles that to:

    float_to_uint:
            movd    eax, xmm0
            ret
So even though the addresses of the variable "u" and the parameter "f" are taken, and converted to char pointers, and passed to a function with external linkage, nothing is actually put in memory at all.
Thus the standard's wording, which treats the legality of using the "register" storage-class specifier as though it corresponds to cpu register usage, is, at best, wildly out of date.
(And there are some architectures where the cpu registers are directly mapped to memory, and can be accessed as memory locations or registers.)
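As a self-contained sketch of the same technique (not taken from the thread), the block below repeats float_to_uint with the includes it needs and adds a hypothetical inverse, uint_to_float. It assumes that float is a 32-bit IEEE 754 binary32 type, which the C standard does not guarantee; the static_assert only checks the size. Both functions take the addresses of their local objects, yet nothing in the code forces those objects to live in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this implementation. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Return the object representation of f as a 32-bit unsigned integer. */
uint32_t float_to_uint(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* address of u and f taken here */
    return u;
}

/* Hypothetical inverse helper: rebuild a float from its 32-bit pattern. */
float uint_to_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);   /* address of f and u taken here */
    return f;
}
```

Under the IEEE 754 assumption, float_to_uint(1.0f) yields 0x3F800000 and uint_to_float(0x3F800000) yields 1.0f. Whether a given compiler keeps u and f in registers depends on the compiler and its optimisation settings.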
My understanding is that IA-64 NaT (Not a Thing) representations
exist only for registers, and the NaT bit should be cleared when
a value is stored in the register.
The odd wording in the standard allows an IA-64 C compiler to
take advantage of NaT representations for their intended purpose.
It might impose some minor constraints on what machine code can be
generated, but *most* of the cases where a NaT could be accessed
are undefined behavior in C.
I see that, but I believe it would be much simpler and clearer if attempting to read an uninitialised and unassigned local variable were undefined behaviour in every case.
Alternatively, it could have said that the value is unspecified in every case. Then on the IA-64, the compiler would have to ensure that registers do not have their NaT bit set even if they are not initialised - this would not be a difficult task. Enabling use of the NaT bit for detection of bugs could then be a compiler option if implementations wanted to provide that feature.
It seems very strange to me that this is UB:

    int foo1(void) {
        int x;

        return x;
    }

while this is not:

    int foo2(void) {
        int x;

        int * p = &x;

        return x;
    }

(Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler
in its list.)

It strikes me that it would have been far simpler for the standard
simply to say that using the value of an uninitialised and unassigned
variable is undefined behaviour.
In C90, it was. C99 changed that, making the behavior defined if the
representation is not a trap representation.
For C99, a conforming IA-64 C compiler would have had to go out of its
way to avoid accessing NaT representations. For example, if you wrote
    {
        int n;
        n;
    }
the most straightforward IA-64 code would store n in a register and
not initialize it, resulting in a trap when the register is read.
A compiler might have to generate code to store an arbitrary value
in the register to avoid the trap.
I'm undecided on whether reading the value of an uninitialized
automatic object *should* be undefined behavior, but given that
it isn't, the C11 committee made the smallest possible change to
cater to IA-64 semantics.
IMHO, having it as UB is the best option, with unspecified behaviour as a second best option. The jumble that C11 has is not necessary for the IA-64, and clearly worse than the other two choices for architectures that don't have a NaT equivalent.