Subject : Re: "The provenance memory model for C", by Jens Gustedt
From : david.brown (at) *nospam* hesbynett.no (David Brown)
Newsgroups : comp.lang.c
Date : 09. Jul 2025, 10:41:28
Organisation : A noiseless patient Spider
Message-ID : <104ldg8$5f8m$1@dont-email.me>
References : 1 2 3
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0
On 09/07/2025 04:39, BGB wrote:
On 7/2/2025 8:10 AM, Kaz Kylheku wrote:
On 2025-07-02, Alexis <flexibeast@gmail.com> wrote:
>
...
>
I don't have confidence in an author's understanding of C, if they
believe that ISO C defines the behavior of invalid pointers being
compared, such that this needs to be rectified by a private "patch"
of the text.
>
You might not be aware of it, but the author Jens Gustedt is a member of the C standards committee, and has been for some time. He is the most vocal, public and active member. I think that suggests he has quite a good understanding of C and the ISO standards! Not everyone agrees about his ideas and suggestions about how to move C forward - but that's fine (and it's fine by Jens, from what I have read). That's why there is a standards committee, with voting, rather than a BDFL.
The concept of pointer provenance can be expressed other than
as a textual patch against ISO C.
>
There have been plenty of papers and blogs written about pointer provenance (several by Gustedt) and how it could work. It's not a very easy thing to follow in any format. A patch to current C standards is perhaps the least easy to follow, but it is important for how the concept could be added to C.
It can be regarded as a language extension and documented similarly
to how a sane compiler documentor would do it.
>
"In this article, I will try to explain what this is all about, namely
on how a provenance model for pointers interferes with alias analysis of
modern compilers.
>
Well, no shit; provenance is often dynamic; whereas aliasing analysis
wants to be static.
>
For those that are not fluent with the terminology or
the concept we have a short intro what pointer aliasing is all about, a
review of existing tools to help the compiler and inherent difficulties
and then the proposed model itself. At the end there is a brief takeaway
that explains how to generally avoid complications and loss of
optimization opportunities that could result from mis-guided aliasing
analysis."
>
If you think that certain code could go faster because certain suspected
aliasing isn't actually taking place, then since C99 you were able to
spin the roulette wheel and use "restrict".
>
"restrict" can certainly be useful in some cases. There are also dozens of compiler extensions (such as gcc attributes) for giving the compiler extra information about aliasing.
So the aliasing analysis and its missed opportunities are the
programmer's responsibility.
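To make that responsibility concrete, here is a minimal sketch of what "restrict" promises. The function name vec_add is made up for this illustration. The qualifier on out tells the compiler that nothing written through out is also accessed through a or b during the call, so it need not reload a[i] or b[i] after each store. Nothing checks the promise; keeping it is up to the caller.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * "restrict" is a promise made by the caller: the elements written
 * through out are not accessed through a or b during the call.
 * Breaking that promise is undefined behaviour. */
void vec_add(size_t n, float *restrict out,
             const float *restrict a, const float *restrict b)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

A call such as vec_add(n, x, x, y) breaks the promise, because x is both written and read. A call such as vec_add(n, out, x, x) is fine, because x is only read.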
>
It's always better for the machine to miss opportunities than to miss
compile. :)
>
Agreed.
It is always better for the toolchain to be able to optimise automatically than to require manual intervention by the programmer. (It should go without saying that optimisations are only valid if they do not affect the observable behaviour of correct code.) Programmers are notoriously bad at figuring out what will affect the efficiency of their code, and will either under-use "restrict" where it could clearly and safely be used to speed up code, or over-use it, resulting in risky code.
If the compiler can't be sure that accesses don't alias, then of course it should assume that aliasing is possible.
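A small sketch of the "must assume aliasing" case; store_then_read is a hypothetical name used only here. Both parameters have the same type and nothing tells the compiler they are distinct, so it has to reload *a after the store through b.

```c
/* Hypothetical example.  a and b have the same type and may point to
 * the same object, so the compiler must allow for a == b and reload
 * *a after the store through b. */
int store_then_read(int *a, int *b)
{
    *a = 1;
    *b = 2;
    return *a;   /* 2 if a and b alias, 1 otherwise */
}
```

Both kinds of call are valid C, which is exactly why the reload cannot be optimised away without more information.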
The idea of pointer provenance is to let compilers (and programmers!) have a better understanding of when accesses are guaranteed to be alias-free, when they are guaranteed to be aliasing, and when there are no guarantees. This is useful for optimisation and program analysis (including static error checking). The more information the compiler has, the better.
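The "guaranteed alias-free" case can be as simple as the sketch below. It is an illustration made up for this point, not an example from the article, and keep_local is a hypothetical name. The address of local is never taken, so whatever object p was derived from, it cannot be local, and the compiler may fold the return value to a constant.

```c
/* Hypothetical example.  The address of "local" is never taken, so no
 * pointer supplied by the caller can refer to it. */
int keep_local(int *p)
{
    int local = 1;
    *p = 2;        /* cannot modify local */
    return local;  /* may be folded to "return 1;" */
}
```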
In my compiler, the default was to use a fairly conservative aliasing strategy.
...
With pointer operations, all stores can be assumed potentially aliasing unless restrict is used, regardless of type.
C does not require such a conservative assumption. And it is rare in practice, IME, for code to actually need to access the same data through different lvalue types (other than unsigned char). It is rarer still that such access is not better handled using type-punning unions or memcpy() - assuming the compiler handles memcpy() decently.
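For reference, the two alternatives look roughly like this. The function names are hypothetical, and the sketch assumes float is 32 bits wide, which is checked at compile time.

```c
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Copy the object representation.  Compilers typically turn a small
 * fixed-size memcpy() like this into a plain register move. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Type-punning through a union: write one member, read the other. */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

Neither version accesses a float object through a uint32_t lvalue, so neither relies on the compiler being conservative about aliasing. On an implementation that uses IEEE 754 binary32 for float, both return 0x3F800000 for 1.0f.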
Equally, this means that type-based alias analysis generally gives only small efficiency benefits in C code (but more in C++). The majority of situations where alias analysis, and a compiler's knowledge that accesses never alias (or always alias), would make a difference are between pointers or other lvalues of compatible types. That is why provenance tracking can have potentially significant benefits.
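A sketch of the contrast, with hypothetical function names. In the first function the types differ, so type-based analysis already lets the compiler load *k once before the loop. In the second the types are compatible, so type-based analysis says nothing, and only knowledge about where the pointers came from (or "restrict") could justify hoisting the load.

```c
#include <stddef.h>

/* Different types: a float lvalue cannot legitimately access an int
 * object, so type-based analysis lets the compiler load *k once. */
void scale_float(float *dst, const int *k, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] *= (float)*k;
    }
}

/* Compatible types: dst[i] might be the very object k points at, so
 * without further information *k must be reloaded after every store. */
void scale_int(int *dst, const int *k, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] *= *k;
    }
}
```

With v = {2, 3, 4}, the call scale_int(v, &v[0], 3) is well defined and leaves {4, 12, 16}, because the multiplier itself changes after the first store.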