On 2024-08-30, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Kaz Kylheku <643-408-1753@kylheku.com> writes:
On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
Bart <bc@freeuk.com> writes:
I think that these (with x, y having compatible scalar types):
>
x + 1 = y;
(x + 1) = y; // in case above was parsed differently
>
are both valid syntax in C. It will fail for a different reason:
a '+' term is not a valid lvalue.
>
The compiler must tell you that neither is valid C. That's
because what is required on each side of assignment is not
exactly the same thing. It's a distraction to argue about why
each is not valid C as both have errors that require a diagnostic
at compile time.
>
Bart is only saying that it's valid syntax, not that it's valid C.
>
According to the ISO C syntax (not taking into account constraints,
which are not syntax) that view is justified.
>
The second line is syntactically well-formed. The first line is
not.
>
Right, because the LHS of an assignment is a unary-expression.
`(x + 1)` can be parsed as a unary-expression, but `x + 1` cannot.
However, the compilers I've tried produce the same diagnostic (not a
syntax error message) for both. Probably they use a tweaked grammar
that allows a more general expression as the LHS of an assignment,
and catch errors later in semantic analysis, for the purpose of
producing diagnostics that are easier to understand. It's obvious
that in `x + 1 = y`, the programmer (probably) intended `x + 1`
to be the LHS of an assignment. These compilers (I tried gcc,
clang, and tcc) are clever enough to recognize that.
A standard operator precedence parsing algorithm such as Shunting Yard
cannot help but parse that.
The operator tokens + and = have to be
assigned a precedence and associativity level, and so the parse has to
be (x + 1) = y or else x + (1 = y).
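To make that concrete, here is a minimal sketch using precedence climbing (a close relative of Shunting Yard, not Shunting Yard itself). The operator table, the function names and the fixed 256-byte buffers are all made up for illustration; the point is only that once + and = both have a level, `x + 1 = y` parses without complaint:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy table: one precedence level per operator token. */
static int prec(char op)
{
    switch (op) {
    case '=': return 1;   /* lowest level, right-associative */
    case '+': return 2;
    case '*': return 3;
    default:  return 0;   /* not a binary operator */
    }
}

static int right_assoc(char op) { return op == '='; }

static const char *skip(const char *s) { while (*s == ' ') s++; return s; }

/* Parse operators of precedence >= min_prec; write a fully
   parenthesized rendering to out and return the unparsed remainder. */
static const char *parse_expr(const char *s, int min_prec,
                              char *out, size_t cap)
{
    char lhs[256], rhs[256], tmp[256];
    assert(out != NULL && cap > 0);
    s = skip(s);
    if (*s == '(') {                      /* parenthesized operand */
        s = skip(parse_expr(s + 1, 1, lhs, sizeof lhs));
        if (*s == ')') s++;
    } else {                              /* identifier or number */
        size_t n = 0;
        while (isalnum((unsigned char)*s) && n + 1 < sizeof lhs)
            lhs[n++] = *s++;
        lhs[n] = '\0';
    }
    for (;;) {
        s = skip(s);
        char op = *s;
        int p = prec(op);
        if (p == 0 || p < min_prec) break;
        s = parse_expr(s + 1, right_assoc(op) ? p : p + 1, rhs, sizeof rhs);
        snprintf(tmp, sizeof tmp, "(%s %c %s)", lhs, op, rhs);
        strcpy(lhs, tmp);
    }
    snprintf(out, cap, "%s", lhs);
    return s;
}

/* Convenience wrapper; returns a pointer to a static buffer. */
const char *render(const char *src)
{
    static char buf[256];
    parse_expr(src, 1, buf, sizeof buf);
    return buf;
}
```

With this table, render("x + 1 = y") and render("(x + 1) = y") both come out as ((x + 1) = y); the parser has no way to tell them apart.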
But precedence, in general, doesn't have to be ordered! It doesn't have
to have levels, or even partial ordering with transitivity. Precedence
can be such that for any pair of operators, we arbitrarily assign which
one is higher than the other, without regard for anything else.
Also: precedence can depend on order. It can be that in
X op1 Y op2 Z, where op1 is to the left of op2, op1 has
the higher precedence. But in X op2 Y op1 Z, op2 might have
the higher precedence. Or one order could have a defined
precedence but not the other.
In the C grammar, assignment breaks the cascading sequence. Whereas
most earlier rules refer to their immediate predecessors
(e.g. additive builds on multiplicative), assignment looks all
the way back to unary. What this means is that the assignment operator
has no defined precedence with regard to all the intermediate
operators between it and unary. Or, at least, when the other operator
is to the left:
x + 1 = y // + =: no defined precedence: ambiguous: syntax error
y = x + 1 // = +: defined precedence: good syntax
When the precedence is not defined in one of the two orders,
you can safely adopt the one from the other order, provided
everything is still diagnosed that should be diagnosed.
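As a rough sketch of what such an order-dependent relation could look like, here is a pairwise table indexed by (left operator, right operator), with a fallback that borrows from the opposite order when an entry is undefined. The enum names and both functions are hypothetical, and only two operators are modelled:

```c
#include <assert.h>
#include <stdbool.h>

enum op  { OP_ADD, OP_ASSIGN, OP_COUNT };
enum rel { LEFT_BINDS, RIGHT_BINDS, UNDEFINED };

/* relation[l][r]: which operator binds tighter in "X l Y r Z".
   The + then = entry is left undefined, mirroring the C grammar. */
static const enum rel relation[OP_COUNT][OP_COUNT] = {
    [OP_ADD]    = { [OP_ADD] = LEFT_BINDS,  [OP_ASSIGN] = UNDEFINED   },
    [OP_ASSIGN] = { [OP_ADD] = RIGHT_BINDS, [OP_ASSIGN] = RIGHT_BINDS },
};

/* Strict lookup: UNDEFINED means "syntax error" to a strict parser. */
enum rel strict_relation(enum op l, enum op r)
{
    assert(l < OP_COUNT && r < OP_COUNT);
    return relation[l][r];
}

/* Lenient lookup: if this order is undefined, adopt the winner from
   the opposite order. *borrowed tells the caller that a later pass
   still has to diagnose the construct. */
enum rel lenient_relation(enum op l, enum op r, bool *borrowed)
{
    enum rel here = strict_relation(l, r);
    *borrowed = false;
    if (here != UNDEFINED)
        return here;
    enum rel there = relation[r][l];
    if (there == UNDEFINED)
        return UNDEFINED;
    *borrowed = true;
    /* The operator that won on one side in the other order sits on
       the opposite side here, so the answer is mirrored. */
    return there == LEFT_BINDS ? RIGHT_BINDS : LEFT_BINDS;
}
```

For + followed by =, the strict lookup yields UNDEFINED, while the lenient one yields LEFT_BINDS with the borrowed flag set, i.e. the (x + 1) = y grouping plus a reminder to diagnose it later.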
The precedence not being defined means that the following parse
tree fragment is invalid:
      =
    +   y
  x   1
It cannot be that + is a left child of =. So the parse could be
allowed by defining the precedence; and then we can detect the invalid
condition by walking the parse tree, looking for assignment nodes that
have a left child that has no precedence relationship to assignment.
But as an AST it is valid because that same AST shape can be forced by
parentheses, and parentheses disappear in abstract syntax.
Any invalid syntax condition that can be removed using parentheses is
not worth enforcing at the parse level. If it's wrong to assign to
x + 1, you also need to diagnose when it's (x + 1). It's better to have
a single rule which catches both.
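A minimal sketch of such a single rule, as a walk over a toy AST. The node layout and names are invented for this example, and the lvalue test is deliberately simplified to "plain variable only" (real C accepts more forms, such as dereferences and subscripts):

```c
#include <assert.h>
#include <stddef.h>

enum kind { N_VAR, N_NUM, N_ADD, N_ASSIGN };

struct node {
    enum kind kind;
    const struct node *left, *right;   /* both NULL for leaves */
};

/* Toy assumption: only a plain variable is assignable. */
static int is_lvalue(const struct node *n)
{
    return n != NULL && n->kind == N_VAR;
}

/* Count assignment nodes whose left child is not assignable.
   Parentheses never appear in the tree, so "x + 1 = y" and
   "(x + 1) = y" reach this check as the same shape. */
int count_bad_assignments(const struct node *n)
{
    if (n == NULL)
        return 0;
    int bad = (n->kind == N_ASSIGN && !is_lvalue(n->left)) ? 1 : 0;
    return bad + count_bad_assignments(n->left)
               + count_bad_assignments(n->right);
}
```

Given a tree with N_ASSIGN at the root, an N_ADD node on its left and a variable on its right, the walk reports one bad assignment; swapping the children (y = x + 1) reports none.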
-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca