Subject: Re: Constants and undefined behavior
From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.lang.c
Date: 11 Jun 2026, 12:50:04
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <110e7dc$s4$1@reader1.panix.com>
References: 1 2 3 4
User-Agent: trn 4.0-test77 (Sep 1, 2010)
In article <110cre9$13aa9$1@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
> cross@spitfire.i.gajendra.net (Dan Cross) writes:
> [...]
>> I see you did not read the other messages in the (sub)thread,
>> but ok, here it is again, in C:
>
```
term% cat what.c
#include <stdio.h>
int main(void) { for (unsigned int k = 0; k != 1; k += 2); return 0; }
void hello(void) { printf("Hello, World!\n"); }
term% clang --version | sed 1q
clang version 22.1.6
term% clang -Wall -pedantic -pedantic-errors -O1 -std=c23 -o what what.c
what.c:2:58: warning: for loop has empty body [-Wempty-body]
2 | int main(void) { for (unsigned int k = 0; k != 1; k += 2); return 0; }
| ^
what.c:2:58: note: put the semicolon on a separate line to silence this warning
1 warning generated.
term% ./what
Hello, World!
term%
```
>
> I see the same behavior.
>
> The following largely repeats what I've written previously in
> this thread.
>
> Apparently the authors of clang decided that this statement in N3220
> 6.8.6.p4:
>
>     An iteration statement may be assumed by the implementation to
>     terminate if its controlling expression is not a constant
>     expression, ...
>
> means that a program that violates that assumption has undefined
> behavior. I intensely dislike both the rule and the way it's stated,
> but I agree that the conclusion that the behavior is undefined is
> a reasonable one.
I think the behavior is technically "unspecified" in the sense of
the C standard, but yes, this is the important bit. The
controlling expression is not constant, and the loop doesn't meet
any of the other criteria set forth in sec 6.8.6 para 4;
therefore, the translator may assume it terminates. (It is
unspecified whether or not it does so; either behavior is correct.
GCC, for example, appears not to make the same assumption.)
> Of course since the behavior is undefined, *anything* could happen.
> I don't know what happened inside clang (or the minds of its
> maintainers) that caused it to generate code that executes a
> statement in the body of a function that's never called, but that's
> just one of the infinitely many allowed behaviors. A quick look at the
> generated code indicates that there's no x86-64 "retq" instruction
> for either main() or hello(), and apparently control falls through
> from the end of main() to the body of hello(). That seems weird.
Here's a slightly better version of `what.c` (that removes the
annoying "loop has empty body, put the semicolon on a separate
line" warning):
```
#include <stdio.h>
int main(void) { unsigned int k = 0; while (k != 1) k += 2; return 0; }
void hello(void) { printf("Hello, World!\n"); }
```
I think the reasoning goes something like this: in optimization
phase $n$, the compiler determines that `k` can never be 1 (it
starts at 0 and only ever changes by 2, so it stays even, even
across unsigned wraparound), and thus the loop does not
terminate, and therefore `return 0;` is unreachable, so it's
removed. Then, in phase $n + m$, for $m > 0$, it applies the
rules of sec 6.8.6 para 4, assumes that the loop must terminate
and can therefore be removed, and removes it. The `return` is
already gone. So what you're left with is a label that just
falls through into whatever is next in the object code; that
just happens to be `hello`.
> It might just be a bug (but not one that, as far as I can tell,
> violates the C standard).
It's known. It was known when first reported a couple of years
ago in the C++ context, and I suspect they know about it now. I
can ask someone who works on LLVM. I suspect the reasoning will
be that this is important to guarantee forward progress, and
that they can't solve the halting problem, therefore such loops
can be removed. If that causes your program to do something
weird, then, well, don't do that.
> A function whose body contains a construct that would have undefined
> behavior if the function were called (not the case here) does not
> cause undefined behavior if there are no calls to the function.
True, but irrelevant to the point I was making, which is that UB
can induce a "call" to a function, even without a reference to
it appearing in the source text.
- Dan C.