On 2024-03-22, David Brown <david.brown@hesbynett.no> wrote:

You should read the footnotes to 5.1.1.2 "Translation phases".
Footnotes are not normative, but they are helpful in explaining the
meaning of the text. They note that compilers don't have to follow the
details of the translation phases, and that source files, translation
units, and translated translation units don't have to have one-to-one
correspondences.

Yes, I'm aware of that. For instance preprocessing can all be jumbled
into one process. But it has to produce that result.
Even if translation phases 7 and 8 are combined, the semantic analysis
of the individual translation unit has to appear to be settled before
linkage. So for instance a translation unit could incrementally emerge
from the semantic analysis steps, and those parts of it already analyzed
(phase 7) could start to be linked to other translation units (phase 8).

Again, you are inferring far too much here. The standard is /not/
limiting like this.

I'm just saying that certain information leakage is clearly permitted,
regardless of how the phases are integrated.

The point is that many things are local to a translation unit, such as
statics, type definitions, and so on. These are valid within the
translation unit (within their scope, of course), and independent of
identically named items in other translation units. It is about
defining a kind of "unit of compilation" for the language semantics - it
is /not/ restricting the behaviour of a compiler.

The standard also does not say what the output of "translation" is - it
does not have to be assembly or machine code. It can happily be an
internal format, as used by gcc and clang/llvm. It does not define what
"linking" is, or how the translated translation units are "collected
into a program image" - combining the partially compiled units,
optimising, and then generating a program image is well within that
definition.

(That can be inferred
from the rules which forbid semantic analysis across translation
units, only linkage.)

The rules do not forbid semantic analysis across translation units -
they merely do not /require/ it. You are making an inference without
any justification that I can see.

Translation phase 7 is clearly about a single translation unit in
isolation:
"The resulting tokens are syntactically and semantically analyzed
and translated as a translation unit."
Not: "as a combination of multiple translation units".

5.1.1.1 clearly refers to "[t]he separate translation units of a
program".

It does so all in terms of what a compiler /may/ do.

LTO pretends that the program is still divided into the same translation
units, while mingling them together in ways contrary to all those
chapter 5 descriptions.

No.

The conforming way to obtain LTO is to actually combine multiple
preprocessing translation units into one.

You could do that if you like (after manipulating things to handle
statics, type definitions, etc.).

That's why we can have a real world security issue caused by zeroing
being optimized away.

No, it is not. We have real-world security issues for all sorts of
reasons, including people mistakenly thinking they can force particular
types of code generation by calling functions in different source files.

In fact, that code generation is forced, when people do not use LTO,
which is not enabled by default.

No, it is not.
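
For readers who have not met the zeroing problem mentioned above, here is a minimal sketch in C. The function names, the buffer size and the hard-coded string are all made up for illustration; none of it comes from the thread. It assumes a typical optimising compiler, and it shows only the usual dead-store situation inside a single translation unit, not anything specific to LTO.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes zeros through a volatile-qualified
 * pointer. Compilers treat volatile accesses as something they must
 * not drop, so in practice these stores survive optimisation. */
static void secure_zero(void *p, size_t n)
{
    assert(p != NULL || n == 0);
    volatile unsigned char *vp = p;
    while (n--) {
        *vp++ = 0;
    }
}

/* Returns 1 on a match. The memset is a dead store to a local that is
 * never read again, so an optimiser is allowed to remove it. */
int check_password_naive(const char *input)
{
    char secret[16];
    strcpy(secret, "hunter2");            /* stand-in for real key material */
    int ok = (strcmp(input, secret) == 0);
    memset(secret, 0, sizeof secret);     /* may be optimised away */
    return ok;
}

/* Same logic, but the wipe goes through volatile stores. */
int check_password_careful(const char *input)
{
    char secret[16];
    strcpy(secret, "hunter2");
    int ok = (strcmp(input, secret) == 0);
    secure_zero(secret, sizeof secret);
    return ok;
}
```

Both functions return the same results; the only difference is whether the wipe is likely to remain in the generated code. C23 also provides memset_explicit for this purpose.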

The rules spelled out in ISO C allow us to unit test a translation
unit by linking it to some harness, and be sure it has exactly the
same behaviors when linked to the production program.

No, they don't.

If the unit you are testing calls something outside that unit, you may
get different behaviours when testing and when used in production.

Yes; if you do nonconforming things.

No one is suggesting doing "nonconforming things".

The only thing you can be sure of from testing is that if you find a bug
during testing, you have a bug in the code. You can never use testing
to be sure that the code works (with the exception of exhaustive testing
of all possible inputs, which is rarely practical).

LTO will break translation units that are simple enough to be trivially
proven to have a certain behavior.

Again, claiming this will not make it true. You need to update your
ideas about what observable behaviour actually is.

If I have some translation unit in which there is a function foo, such
that when I call foo, it then calls an external function bar, that's
observable.

5.1.2.2.1p6 lists the three things that C defines as "observable
behaviour". Function calls - internal or external - are not amongst these.

External calls are de facto observable, because we have it for granted
when we have a translation unit that calls a certain function, we can
supply another translation unit which supplies that function. In
that function we can communicate with the host environment to confirm
that it was called.

The phrase "de facto" is an admission that you understand that none of
this is part of the /actual/ standards. You have dropped from "the
official standards make this clear" down to "I think this".

All such boundaries are lost in the link stage, before observable
behaviour becomes relevant.

I can link that unit to a program which supplies bar,
containing a printf call, then call foo and verify that the printf call
is executed.

Yes, you can. The printf call - or, more exactly, the "input and output
dynamics" - are observable behaviour. The call to "bar", however, is not.

If bar does not call the function, then the observable behavior of
printf doesn't occur either; they are linked by logic / cause-and-effect.

Nonsense.
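
The foo/bar harness under discussion can be sketched as follows. This is an illustration only: the counter and run_harness are hypothetical names, and both functions are shown in one file for brevity, whereas the thread assumes foo and bar live in separate translation units.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter the harness uses to record calls to bar. */
static int bar_calls = 0;

/* Harness-supplied bar. The printf is I/O, which both sides of the
 * thread agree is observable behaviour. */
void bar(void)
{
    bar_calls++;
    printf("bar was called\n");
}

/* Unit under test: its only job is to call the external function bar. */
void foo(void)
{
    bar();
}

/* Calls foo once and returns how many times bar ran. */
int run_harness(void)
{
    bar_calls = 0;
    foo();
    assert(bar_calls == 1);
    return bar_calls;
}
```

Whether the generated code contains an actual call instruction or has bar's body inlined into foo, the output line and the counter value are the same. The sketch does not settle which side of the argument is right; it only shows what the harness can and cannot see.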

A behavior that is not itself formally classified as observable can be
discovered by logical linkage to be necessary for the production of
observable behavior. It can be an "if, and only if" linkage.
If an observable behavior B occurs if, and only if, some behavior A
occurs, then the fact of whether A occurs or not is de facto observable.

Calling it "de facto observable behaviour" is just confusing your
understanding here. But you can well say that if B is observed, that
means A must have happened.

The compiler, when compiling the source of "foo", will include a call to
"bar" when it does not have the source code (or other detailed semantic
information) for "bar" available at the time.

Translation phases 1 to 7 forbid processing material from another
translation unit.

Nope.

Conforming semantic analysis of a translation unit has
nothing but that translation unit.

Nope.

But you are mistaken to
think it does so because the call is "observable" or required by the C
standard.

Sure; let's say that the call can be tied to observable behavior
elsewhere such that the call occurs if and only if the observable
behavior occurs.

That would be a better way to put it. But it is still not the case here.

It does so because it cannot prove that /running/ the
function "bar" contains no observable behaviour, or otherwise affects
the observable behaviour of the program. The compiler cannot skip the
call unless it can be sure it is safe to do so - and if it knows nothing
about the implementation of "bar", it must assume the worst.

The compiler cannot do any of this if it is in a conforming mode.

The compiler can omit the call to "bar" if it is sure that it results in
no observable behaviour. It cannot omit it if it is not sure of this.
It is /that/ simple.

But sure, in the nonconforming LTO paradigm, which does have to adhere
to sane rules, that more or less follow what would have to happen if
multiple preprocessing translation units were merged at the token level
and thus analyzed together.

Wrong.

Sometimes the compiler may have additional information - such as if it
is declared with the gcc "const" or "pure" attributes (or the standardised
"unsequenced" and "reproducible" attributes in the draft for the next C
version after C23).

If the declarations are available only in another translation unit,
they cannot be taken into account when analyzing this translation unit.
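
As a small illustration of the attribute case mentioned above, the sketch below uses the GCC/Clang "const" function attribute. The function names are invented for the example, and the range check exists only to keep the arithmetic away from overflow. It assumes a compiler that accepts the __attribute__ syntax.

```c
#include <assert.h>

/* GCC/Clang extension: "const" promises that the result depends only
 * on the argument values and that the function has no side effects. */
__attribute__((const)) int square(int x);

/* With only the declaration above in view, a compiler is allowed to
 * evaluate square(n) once and reuse the result for the second use. */
int sum_of_two_squares(int n)
{
    assert(n >= -1000 && n <= 1000);  /* keep x * x well within int range */
    return square(n) + square(n);
}

/* Definition; in a real program this could sit in another translation
 * unit, with only the attributed declaration shared through a header. */
int square(int x)
{
    return x * x;
}
```

The point is that the promise travels with the declaration, so the caller's translation unit carries the extra information without seeing the body of square.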

Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7), we can take it for granted as a
done-and-dusted property of that translation unit that it calls bar
whenever its foo is invoked.

No, we can't - see above. Nothing in the C standards forbids any
additional analysis, or using other information in code generation.

Any semantic analysis performed must be that which is stated in
translation phase 7, which happens for one translation unit, before
considering linkage to other translation units.
What forbids it is that no semantic analysis activity is described as
taking place in translation phase 8, other than linkage.

The C standards also don't describe drinking coffee while waiting for
the compiler. Just because something is not mentioned, does not mean it
is forbidden!

Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have
to generate a call to foo. If LTO is able to determine that foo doesn't
do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.

There are always situations in which optimizations that have been
forbidden don't cause a problem, and are even desirable.

Can you give examples?

You already mentioned "-ffast-math" (and by implication, its various
subflags in gcc, clang and icc). These are clearly documented as
allowing some violations of the C standards (and not least, the IEEE
floating point standards, which are stricter than those of C).

Yes, and some people want that, learn how it works, and get their
programs working with it, all the while knowing that it's
nonconforming to IEEE and ISO C.

Indeed. I am "some people" in this context.

Another tool in the box.

Agreed.

(While I don't much like an "appeal to authority" argument, I think it's
worth noting that the major C / C++ compilers, gcc, clang/llvm and MSVC,
all support link-time optimisation. They also all work together with
both the C and C++ standards committees. It would be quite the scandal
if there were any truth in your claims and these compiler vendors were
all breaking the rules of the languages they help to specify!)

Why would it be?

It would run counter to the whole point of having a standard.

In the first place, all the implementations you mention have to be
explicitly put into a nondefault configuration in order to resemble
conforming ISO C implementations.

Yes, but they are clear about that. (At least, gcc is - I haven't read
the documentation for clang as thoroughly, and have barely touched MSVC.)

LTO is not even enabled by default (for good reasons).

The good reasons are that not all setups support it (it needs particular
linkers), it can significantly increase build times, it makes some kinds
of debugging nearly impossible, it plays badly with other tools such as
profilers and code coverage analysis, and you can have trouble if you
are doing weird things with compiler and linker file interaction or some
other kinds of non-standard C coding.

A few goofballs who maintain GNU/Linux distros are turning on LTO for
compiling upstream packages whose development they know nothing about
beyond ./configure && make. (Luckily, the projects themselves can take
countermeasures to defend against this.)
I think the fact that LTO is almost certainly nonconforming deserves
more attention, but not panic or anything like that.

If it /were/ nonconforming, I think that would deserve huge attention.
But it is not.

LTO should be made into a conforming feature that is optional.
Translation phase 8 can be split into 8 and 9. In 8, translation units
would be optionally partitioned into subsets. Each subset containing
two or more translation units would be subjected to further semantic
analysis, as a group, and turned into a subset translation unit.
Phase 9 would be the same as the former phase 8.
Whether an implementation supports subsetting and the manner in which
units are indicated for subsetting would be implementation-defined, but
it would be clear that there is a semantic difference, and that each
implementation must support a translation mode in which the subsetting
isn't performed.