Subject : Re: int a = a
From : david.brown (at) *nospam* hesbynett.no (David Brown)
Groups : comp.lang.c
Date : 20. Mar 2025, 15:42:06
Organisation : A noiseless patient Spider
Message-ID : <vrh9fu$3e7sn$1@dont-email.me>
References : 1 2 3 4 5 6 7 8
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0
On 19/03/2025 21:34, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
> [...]
>> As far as I understand it (and I hope to be corrected if I am wrong),
>
> Your hope is about to be fulfilled.
>
>> "int a = a;" is not undefined behaviour as long as the implementation
>> does not have trap values for "int". It simply leaves "a" as an
>> unspecified value - just like "int a;" does. Thus it is not in any
>> way "worse" than "int a;" as far as C semantics are concerned. Any
>> difference is a matter of implementation - and the usual
>> implementation effect is to disable "not initialised" warnings.
>
> The behavior is undefined. In C11 and later (N1570 6.3.2.1p2):
>
>     Except when [...] an lvalue that does not have array type is
>     converted to the value stored in the designated object (and is no
>     longer an lvalue); this is called lvalue conversion.
>     [...]
>     If the lvalue designates an object of automatic storage duration that
>     could have been declared with the register storage class (never had
>     its address taken), and that object is uninitialized (not declared
>     with an initializer and no assignment to it has been performed prior
>     to use), the behavior is undefined.
OK. I had missed that for some reason. Elsewhere (6.7.9p10, under "initialization") the standard says the value is "indeterminate", which is defined as an "unspecified or trap" value.
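To make the quoted C11 rule concrete, here is a minimal sketch. The function names are made up for illustration, and the undefined variant is deliberately kept behind `#if 0` so that the file compiles and runs without triggering the behaviour under discussion.

```c
/* Well defined: 'a' has an initializer, so the lvalue conversion of 'a'
 * in the next declaration reads a stored value. */
int read_initialized(void)
{
    int a = 0;
    int b = a;
    return b;
}

#if 0
/* Undefined in C11 and later per 6.3.2.1p2: 'a' has automatic storage
 * duration, its address is never taken (so it could have been declared
 * 'register'), and it is read in its own initializer before any value
 * has been stored in it. Excluded from the build on purpose. */
int read_uninitialized(void)
{
    int a = a;
    return a;
}
#endif
```

Only the first function is safe to call; the second is there purely to show the pattern the standard text is describing.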
>> It is in much the same category as "(void) x;", which is an idiom for
>> skipping an "unused variable" or "unused parameter" warning.
>
> Unless I'm missing something, `(void)x` also has undefined behavior
> if x is uninitialized, though it's very likely to do nothing in
> practice.
The situation where "(void) x;" is most useful is, I would say, unused parameters. So there is no undefined behaviour there. And for other variables it is most likely in situations where you have assigned to the variable but then don't use it (perhaps you plan to use it later). Maybe you have "status = do_something();", and then don't actually make use of "status" - casting it to void tells both the compiler and the reader that you know "do_something()" is returning a status indicator, but that you are then ignoring it. If you are simply declaring a variable without initialising it and you don't want to use it and don't want to be warned about it, it's probably just as easy (and definitely avoids UB) to remove the declaration.
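A small sketch of the two uses described above. All names here (`on_event`, `do_something`, `run`) are hypothetical, chosen only to illustrate the idiom. In both cases the object cast to void already holds a value, so no uninitialised read is involved.

```c
#include <stddef.h>

/* Hypothetical callback whose signature forces a parameter we do not need.
 * A parameter always receives a value from the caller, so evaluating it
 * in "(void) context;" is well defined. */
int on_event(int event_id, void *context)
{
    (void) context;     /* silence "unused parameter" warnings */
    return event_id * 2;
}

/* Hypothetical operation that reports a status code. */
int do_something(void)
{
    return -1;
}

int run(void)
{
    int status = do_something();
    (void) status;      /* assigned above; deliberately ignored */
    return on_event(21, NULL);
}
```

The cast documents intent for the reader as much as it quiets the compiler: the status is known to exist and is being ignored on purpose.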
> Long digression follows.
>
> The "could have been declared with the register storage class" seems
> quite odd. And in fact it is quite odd.
>
> It's tempting to assume that `int n = n;` did not have undefined
> behavior prior to C11, or that accessing an automatic object whose
> address has not been taken does not have undefined behavior even
> in C11 or later, but it's not that simple.
>
> In C90, the non-normative Annex G (renamed to Annex J in later
> editions) says:
>
>     The behavior in the following circumstances is undefined:
>     [...]
>     - The value of an uninitialized object that has automatic storage
>       duration is used before a value is assigned (6.5.7).
>
> 6.5.7 discusses initialization, and says that "If an object that
> has automatic storage duration is not initialized explicitly, its
> value is indeterminate", and C90's definition of "undefined behavior"
> explicitly refers to use of indeterminately valued objects, though
> it's not 100% clear that using an indeterminate value *always*
> has undefined behavior.
>
> So in C90, `int n = n;` explicitly had undefined behavior, even if
> all possible bit representations for an object of type int correspond
> to valid values (C90 didn't mention "trap representations").
>
> C99 added a definition for "indeterminate value": "either an
> unspecified value or a trap representation", and drops the mention
> of indeterminate values in the definition of "undefined behavior".
> It dropped the reference to uninitialized objects in Annex G/J.
>
> I believe that in C99, `int n = n;` is well defined *if* int
> has no trap representations, or if the representation stored in
> the memory occupied by n happens not to be a trap representation.
> If int has trap representations, and that memory happens to contain
> such a representation, the behavior is undefined.
>
> I found a discussion in comp.std.c from 2023, subject "Does reading
> an uninitialized object have undefined behavior?".
>
> The discontinued IA-64/Itanium processor had something called
> "NaT", "Not a Thing". NaT representations exist only in CPU
> registers, not in memory. (Imagine an extra bit for each register
> indicating whether the register contains a "thing".) A NaT allows
> for representations that act like C trap representations (called
> non-value representations in C23) even for types with no trap
> representations (for example where all 2**N possible representations
> correspond to valid values) -- but again, only in CPU registers.
>
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
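The "extra bit for each register" picture can be sketched as a toy model in C. This is only an illustration of the idea as described above, not a description of real IA-64 semantics; the type and function names are invented for the example.

```c
#include <stdbool.h>

/* Toy model: a register carries a value plus a "not a thing" flag.
 * Memory in this model has no such flag, matching the remark that
 * NaT exists only in registers. */
struct toy_register {
    long value;
    bool nat;
};

/* Loading from memory always yields a "thing": every bit pattern
 * of the stored value is a valid value. */
struct toy_register load_from_memory(long stored)
{
    struct toy_register r = { stored, false };
    return r;
}

/* A register that has never been written is modelled as NaT. */
struct toy_register unwritten_register(void)
{
    struct toy_register r = { 0, true };
    return r;
}

/* Consuming a register: returns false where the toy machine would
 * fault, otherwise stores the value through 'out' and returns true. */
bool consume(struct toy_register r, long *out)
{
    if (r.nat)
        return false;
    *out = r.value;
    return true;
}
```

In this model an object whose address is never taken can live entirely in a register, so reading it before any write can fault even though its type has no trap representations in memory.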
> So the "could have been declared with the register storage class"
> wording was added in C11 specifically to cater to the IA64. This
> change would have been superfluous in C90, where the behavior was
> undefined anyway, but is a semantically significant change between
> C99 and C11. (If some future CPU has something like NaT that can
> be stored in memory, the wording might need to be updated yet again.)
>
> My takeaway is that if it requires this much research to determine
> whether accessing the value of an uninitialized object has undefined
> behavior (in which circumstances and which edition of the standard),
> I'll just avoid doing so altogether. I'll initialize objects
> when they're defined whenever practical. If it's not practical
> for some reason, I won't initialize it with some dummy value; I'll
> leave it uninitialized so the compiler has a chance to warn me if
> I accidentally use it before assigning a value to it.
Thanks for that explanation.
My opinions here match your "takeaway" entirely. Just because I have seen "int a = a;", and know how gcc (and perhaps other compilers) handle it, does not mean I think it is a good thing to write!
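For what it is worth, the takeaway might look like this in practice. The two functions are hypothetical examples, not code from any real project.

```c
/* A real value is available at the point of definition, so use it
 * as the initializer. */
int clamp_to_byte(int value)
{
    int result = value;
    if (result < 0)
        result = 0;
    if (result > 255)
        result = 255;
    return result;
}

/* No meaningful initial value exists, so 'sign' is left without a
 * dummy initializer and assigned on every path. If a branch were
 * missed, a compiler with "maybe uninitialised" diagnostics enabled
 * would have a chance to report it. */
int sign_of(int value)
{
    int sign;
    if (value > 0)
        sign = 1;
    else if (value < 0)
        sign = -1;
    else
        sign = 0;
    return sign;
}
```

Writing `int sign = 0;` or `int sign = sign;` in the second function would compile just as quietly, but it would also hide a forgotten branch from the compiler.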