Re: Constants and undefined behavior

Subject: Re: Constants and undefined behavior
From: cross (at) *nospam* spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.lang.c
Date: 09. Jun 2026, 00:15:48
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <1107if4$6sk$1@reader1.panix.com>
References: 1 2 3 4
User-Agent: trn 4.0-test77 (Sep 1, 2010)
In article <11075os$3fm4u$1@kst.eternal-september.org>,
Keith Thompson  <Keith.S.Thompson+u@gmail.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <1100g0e$1lt8i$1@kst.eternal-september.org>,
Keith Thompson  <Keith.S.Thompson+u@gmail.com> wrote:
[...]
A naive compiler that performs no optimizations would generate
code for foo() that attempts to compute (INT_MAX+1)*0 step by
step, without recognizing the overflow, and that code would never
be executed.
>
Sure.  But a far more sophisticated translator (and I would
argue a nefarious one) could emulate that code, decide it was
UB, and immediately fail translation with an error.
>
I disagree.  That's not a sensible interpretation of what the
standard says.

I agree it's not sensible.  But sadly, the standard does not
seem to explicitly prohibit it, either.  This is the point: we
necessarily rely on a "reasonable interpretation" of the
standard to be able to usefully write C code.  An adversarial
interpretation is not sensible, but it appears that such is
possible given the standard as written.  This is a danger with a
language that is not formally specified.

A call to foo() would have undefined behavior if it occurred.

What I'm really trying to get at is that the behavior of
`int zero = (INT_MAX + 1)*0;` is undefined in all cases.  There
is no input for which it is valid at all.  It is qualitatively
different than other examples where UB cannot be detected
_except_ at runtime.
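
For contrast, here is a minimal sketch of the run-time-only case: an addition whose overflow depends entirely on the operands, guarded so that the overflowing addition is never evaluated.  The names `checked_add` and `check_checked_add` are hypothetical and do not come from the thread; this is one common way to write the guard, not the only one.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true when the sum is representable
   as an int.  Returns false and leaves *out untouched when the
   addition would overflow, so the overflow itself never happens. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}

/* Small self-check that doubles as a usage example. */
void check_checked_add(void)
{
    int r = 0;
    assert(!checked_add(INT_MAX, 1, &r));   /* would overflow: rejected */
    assert(r == 0);                         /* *out was not modified    */
    assert(checked_add(INT_MAX - 1, 1, &r));
    assert(r == INT_MAX);
}
```

No such guard can be written for `(INT_MAX + 1)*0`, because there are no operands to test: the overflow does not depend on any input.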

In particular, it does not become defined just because it's in a
function that is not called; the behavior is UB on its face.  It
is utterly meaningless as far as C is concerned; it is what
Regehr calls a "Type 3" function in his taxonomy at
https://blog.regehr.org/archives/213: it literally has no
definition.

There is no call to foo().

What I am further saying is that I do not see where the C
standard puts additional constraints on an implementation so
that it _must_ accept a program with such a construct in it, as
sensible as that may otherwise be (I actually don't think that
is very sensible, but that's my opinion).  The specific wording
of the standard appears to allow a compiler to halt translation
if it observes that expression, whether it's in a function that
is called or not.

I readily concede that I may be wrong.  But the arguments I have
heard opposing this interpretation are not well-supported by the
text.  I would be happy if someone could provide such an
argument that did not ultimately rely on either intuition or
assumptions about reasonable behavior, but so far, none have
been proffered.

Similarly:
>
   int a = ..., b = ...;
   int c;
   if (b != 0) {
       c = a / b;
   }
   else {
       c = 0;
   }
>
A division by zero would have undefined behavior if it occurred,
but it never occurs.  A compiler cannot reject the above code
because of UB that never happens.

This I also agree with.  But assuming this is in some function
that is otherwise well-defined, this is what Regehr calls a
"Type-1" function: there is no input for which it is undefined.

In this regard, it is qualitatively different than the `foo`
example that is the subject of this thread.  I suggest that that
qualitative difference actually matters.
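
To make the distinction concrete, here is a minimal sketch of the three categories as I read Regehr's taxonomy: defined for every input, defined for some inputs, defined for none.  The function names are hypothetical.  Note that the Type 1 version carries an extra guard that the quoted snippet lacks: on a two's complement implementation, `INT_MIN / -1` also overflows, so checking `b != 0` alone is not quite enough.

```c
#include <assert.h>
#include <limits.h>

/* Type 1: defined for every input.  Follows the guarded division
   quoted above, plus a guard for INT_MIN / -1. */
int type1_div(int a, int b)
{
    if (b == 0 || (a == INT_MIN && b == -1)) {
        return 0;
    }
    return a / b;
}

/* Type 2: defined for some inputs only; overflows when a == INT_MAX. */
int type2_inc(int a)
{
    return a + 1;
}

/* Type 3: no input makes this defined, since the overflow does not
   depend on the argument.  The thread's version uses the constant
   INT_MAX directly; a variable is used here only because some
   compilers warn about the constant form.  Never called below. */
int type3_zero(int a)
{
    int max = INT_MAX;
    (void)a;
    return (max + 1) * 0;
}

/* Exercises only the inputs for which behavior is defined. */
void check_taxonomy(void)
{
    assert(type1_div(7, 2) == 3);
    assert(type1_div(7, 0) == 0);
    assert(type1_div(INT_MIN, -1) == 0);
    assert(type2_inc(41) == 42);
}
```

Whether a translator may reject `type3_zero` merely for being present is exactly the open question; the sketch only shows what separates the three cases.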

[...]
>
It returns a status of 0 from main and does nothing else.
A conforming implementation *must* generate code that implements
that behavior.
>
I have yet to find or be shown a way in which the standard
actually guarantees that.
>
How does the standard guarantee *anything*?

The thrust of what I have been driving at is that the standard
actually guarantees a lot less than people take for granted.

This strictly conforming program:
>
   int main(void) { return 0; }
>
when executed returns a status of 0 from main and does nothing else.

Actually, does it?  It also implicitly closes the standard
input, output, and error streams.  That could have side effects.

Adding an uncalled function to the same source file doesn't change
that.

But it's not _just_ an uncalled function.  It's an uncalled
function that is manifestly gibberish because there is no input
for which that expression is well-defined.

I have not found evidence that the standard explicitly prohibits
a pathological compiler from doing something unexpected in that
case.  An adversarial read of the standard could allow a
compiler to treat this in a manner similar to a syntax error.

[...]
>
There was, once, a view that was almost universally shared that
UB was meant for things that could not be precisely described
because hardware was too varied.  We're well past that; now it's
a vehicle for compiler writers to make benchmarks faster, but is
(generally) hostile to programmers.  A lot of hay is made about
it in this group, but at the core, it's just (ironically) not
well-defined.
>
The standard doesn't say what UB is meant for.  It says what UB
*is*, and what constructs lead to it (by omission in some cases).
Any optimization tricks played by compiler implementers must be
based on that specification.

Yes.  Just so.  And it also says that anything not explicitly
stated in the standard is UB.

As we all know, the definition of UB in the standard is,
"behavior, upon use of a nonportable or erroneous program
construct or of erroneous data, for which this document imposes
no requirements."

Behavior is defined as, "external appearance or action".  Note
that this does not explicitly state that "behavior" is only
applicable during execution, and we know that the standard, as
written today, says that some behaviors are "undefined" _at
translation time_.  I cannot find something forbidding an
implementation from interpreting "external appearance or action"
to refer to the success or failure of translation and production
of an associated artifact.  Translation phase 7 then says that,
after all of the preprocessing and so forth, "the resulting
tokens are syntactically and semantically analyzed and
translated as a translation unit."  As written, a compiler could
certainly detect that that expression, whether executed or not,
is UB.

Indeed, sec 3.5.3 para 2, "Note 1 to entry", explicitly mentions
terminating translation as one of a few sample "undefined
behaviors".  It doesn't say that the compiler _has_ to do that,
but does not say that it _must not_, either.

Sec 3.5.3 para 4 ("Note 3 to entry") is the closest I see to
mandating the interpretation you and Rentsch have taken, but
that is specific to _execution time_, not _translation time_,
and the latter is not outright banned from responding to UB: the
text of the standard imposes no requirements in this context.
Dare I say that the translation-time behavior is undefined?

[...]
>
I agree.  printf("hello, world\n") must write that string to standard
output, which may be a file or an interactive device.  Just what
that means is unspecified or implementation-defined.  It might be
printed in EBCDIC or incised into clay tablets.  Closing stdout,
which occurs when main() terminates, might involve firing the tablet
or emitting control sequences for a screen reader.
>
Exactly.  It could also emit the string, "GOODBYE WORLD."
>
No, it couldn't.  It must emit "hello, world\n" in some form.
It must emit the character 'h' as represented in the execution
character set, followed by 'e', and so on.

I didn't say that it wouldn't; I was referring specifically to
the behavior on closing stdout.  You are right, it must emit
something corresponding to, "hello, world\n"; but what it does
after that is up to the implementation.  We agree that it could
emit a terminal reset sequence; there is no reason that sequence
couldn't be, "GOODBYE WORLD."  It'd be a weird one, but it's not
impossible.

[...]
>
This presupposes that the program is strictly conforming, but
in the limit, the standard can be interpreted in such a way that
if any statement in the program is provably UB (as this one is)
then the program cannot be said to be strictly conforming.
>
It's not UB if it's never called.  Behavior that doesn't happen is
not behavior.

See above.  The standard simply does not say that.  The standard
merely says that behavior is something that manifests as
"external appearance or action."  Translation is certainly an
action with an "external appearance" and nothing says that
behavior _during translation_ is any less "behavior" than
behavior during execution.  In fact, the standard explicitly
mentions undefined behavior and translation.

I did not presuppose that the program is strictly conforming.

Well, you kinda did: you said that the program is strictly
conforming, and then said that it must be accepted because it is
strictly conforming.  That acceptance is predicated on it being
strictly conforming.

I read the source code and determined that it meets the standard's
definition of a strictly conforming program.

I have presented what I think is an equally valid, alternative
reading of the text of the standard where that does not hold.

That reading is, admittedly, adversarial.  That does not mean it
is wrong.  I am saying that this is a weakness of the standard,
not a good interpretation.

40 years ago, people would have thought that the idea of a
post-modern compiler time-travelling in pursuit of optimization
when UB is detected during translation was an adversarial read
of the standard.  And yet, here we are.

[...]
>
Ok, so in that case, would we say that "`foo` has undefined
behavior?"  The qualification, "...if called" seems superfluous,
and I don't see anything in the standard that explicitly
disagrees.
>
The qualification "if called" is the whole point.

Except it's not.  The behavior of that expression is simply
undefined; whether executed or not, there's no way it _could_ be
defined.

[...]
>
UB can time-travel, however.  Because it's undefined, the
compiler is free to assume that it never executes, or that it
always executes.
>
"UB can time-travel" is perhaps an oversimplification.
>
An example is
a bug that occurred in the Linux kernel, something like:
>
   void func(int *ptr) {
       do_something_with(*ptr);
       if (ptr != NULL) {
           blah();
       }
   }
>
The compiler, on seeing the expression `*ptr`, assumed that `ptr` is
not null, and elided the test on the following line.
>
But even assuming that's valid, a compiler absolutely cannot assume that
an instance of UB always executes when, according to the semantics of the
program, it provably never executes.

Time travel is a term of art here.  I posted this link elsewhere in
the thread; I think Raymond Chen does a much better job explaining
it than I can:
https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=633

Reading a bit more, I think that C23 sec 3.5.3 para 4 appears
to be trying to rein that in.  Hope springs eternal.
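
For the kernel-style example quoted above, a minimal sketch of the usual repair is to test the pointer before the first dereference, so that no path reaches `*ptr` with a null pointer and the compiler has nothing to infer from the dereference.  The helper bodies and the two counters are hypothetical stand-ins, since the thread does not show what `do_something_with` and `blah` actually do.

```c
#include <assert.h>
#include <stddef.h>

int last_value;   /* hypothetical: records what do_something_with saw */
int blah_calls;   /* hypothetical: counts calls to blah               */

void do_something_with(int v) { last_value = v; }
void blah(void) { blah_calls++; }

/* The null check precedes the dereference, so func(NULL) is well
   defined and the test cannot be treated as redundant. */
void func(int *ptr)
{
    if (ptr == NULL) {
        return;
    }
    do_something_with(*ptr);
    blah();
}

/* Small self-check that doubles as a usage example. */
void check_func(void)
{
    int x = 5;
    func(NULL);               /* returns early, touches nothing */
    assert(blah_calls == 0);
    func(&x);
    assert(last_value == 5);
    assert(blah_calls == 1);
}
```

This changes what the function does for a null argument (it now does nothing at all), which is presumably what the original check was meant to protect in the first place.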

[...]
>
So any program that produces no output at all is strictly
conforming?  Then what about this?
>
#include <limits.h>
>
int
zero(void)
{
return (INT_MAX + 1) * 0;
}
>
int
main(void)
{
(void)zero();
return 0;
}
>
That's an interesting point.  A more terse example:
>
#include <limits.h>
int main(void) {
   int unused = INT_MAX + 1;
}

Sure.  Or consider this program:

```
#include <limits.h>

int
foo(int a)
{
    extern int int_max;
    int_max = INT_MAX + 1;
    return int_max;
}

int
main(void)
{
    return 0;
}
```

Suppose that no definition for `int_max` is provided; is this a
strictly conforming program?  Consider section 6.9.1, which
describes external definitions.  The relevant paragraph is 5,
which reads in part, "If an identifier declared with external
linkage is used in an expression somewhere in the entire program
there shall be exactly one external definition for the
identifier; otherwise, there shall be no more than one."

But as has been argued, `int_max` is not actually _used_, since
`foo` is never called.  If that holds, then this ought to be
accepted by a conforming implementation.  Yet this fails to
build with both gcc and clang; clearly, both consider `int_max`
to be "used".  Ok, so what about this?

```
#include <limits.h>

int
foo(int a)
{
    extern int int_max;
    if ((INT_MAX + 1)*0) {
        int_max = INT_MAX + 1;
    }
    return 0;
}

int
main(void)
{
    return 0;
}
```

This _does_ build.

So it appears that, at least for `gcc` and `clang`, merely not
calling `foo` is insufficient.

This program produces no output, yet clearly executes a function
that contains an expression that induces undefined behavior when
evaluated.  I suppose an argument could be made that it _might_
generate output due to UB, as UB imposes no requirements not to
do so, so perhaps the _absence_ of output depends on UB.
>
The program clearly has undefined behavior when executed, but no
output depends on that undefined behavior.  In my humble opinion,
this demonstrates a flaw in the standard's definition of "strictly
conforming program".  (As a programmer: Don't do that.)

That's kind of what I'm saying.  Though this interpretation
hinges on whether the absence of output can be defined as output
in some sense; in this case, the compiler could emit code that
says, "this program has UB", and I think that would be fine with
respect to the standard.

But the standard says that an implementation can stop
translating a program if it detects UB, and nothing appears to
limit that to functions that have been called from `main`.

[...]
>
In my ideal world, C would be rigorously defined with a precise
operational semantics.  That would be accompanied by an
explanatory document that presented those semantics in lay
terms in prose, similar to the standard now, for those who did
not want to drive Coq or something similar.  But at least we'd
have something definitive to define the language, so that when
there was apparent ambiguity, we had some objective metric by
which to judge.  The C standard, as written, is nowhere near as
precise as it should be.
 
I do not think that this will ever happen: not only would it be
very difficult to produce (as you noted elsethread), I think the
compiler writers would rebel if they felt that their UB hands
were tied by a formal specification.
>
"There are only two kinds of languages: the ones people complain
about and the ones nobody uses."

Yup.

- Dan C.

