Subject: Re: Breaking a table of record rows into an array
From: Keith.S.Thompson+u (at) *nospam* gmail.com (Keith Thompson)
Newsgroups: comp.lang.awk
Date: 13 Mar 2024, 22:15:56
Other headers
Organization: None to speak of
Message-ID: <87h6h96df7.fsf@nosuchdomain.example.com>
References: 1 2 3 4 5 6 7
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Kaz Kylheku <433-929-6894@kylheku.com> writes:
> On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>> arnold@freefriends.org (Aharon Robbins) writes:
>>> In article <usqkgn$he7u$2@dont-email.me>,
>>> Ed Morton <mortonspam@gmail.com> wrote:
>>>> the effect of setting `NF` is
>>>> undefined behavior per POSIX and so will do different things in
>>>> different awk variants and even in 1 awk variant can behave differently
>>>> depending on whether you're setting it to a higher or lower than
>>>> original value
>>>
>>> This is not true. The effect of setting NF was well defined
>>> by the original awk book and also in POSIX.
>>>
>>> Decreasing NF throws away fields. Increasing NF adds the
>>> intervening fields with the null string as their values
>>> and rebuilds the record.
>>
>> I don't see that in the POSIX specification.
>
> The key is this:
>
>   References to nonexistent fields (that is, fields after $NF), shall
>   evaluate to the uninitialized value.
>
> NF is assignable, and fields after $NF do not exist. Thus if we
> have four fields and set NF = 3, then $4 doesn't exist.

That describes what happens if NF is modified by assignment, but I don't
see that it implies that such an assignment is allowed.

> That implies it must cease to exist; i.e. be destroyed. If setting NF = 4 were
> to restore $4 then that would mean it had continued to exist, but was only
> hidden.
>
> The behavior is present in GNU Awk, Mawk, BusyBox Awk and others.

I accept that most, quite possibly all, implementations of Awk allow
assignment to NF, with the semantics of dropping fields after $NF or
adding new fields if the value decreases or increases, respectively.
And on the basis of that, I accept that POSIX *should* specify the
behavior of assigning to NF -- especially if the original AWK book
defines it. The second edition briefly mentions modifying NF:
"Conversely, if NF changes, $0 is recomputed when its value is needed."
But I can imagine a hypothetical awk-like language in which assigning to
NF has undefined behavior. My question is, how does the POSIX
specification not describe that language?
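
For concreteness, here is a minimal sketch of the widely implemented behavior under discussion. It is not something the POSIX text spells out; it assumes an awk that behaves the way gawk, mawk and BusyBox awk are reported to behave in this thread. The shell function names are hypothetical and exist only to label the two cases.

```sh
# Hypothetical helper names; the awk programs are the point.

# Decreasing NF drops trailing fields and rebuilds $0. Raising NF again
# yields an empty $4 rather than restoring the old "d".
shrink_then_grow() {
    printf 'a b c d\n' | awk '{ NF = 3; print; print NF; NF = 4; print "[" $4 "]" }'
}

# Increasing NF appends empty fields; $0 is rebuilt using OFS.
grow_fields() {
    printf 'a b\n' | awk 'BEGIN { OFS = "-" } { NF = 4; print }'
}

shrink_then_grow
grow_fields
```

On those implementations I would expect the first function to print `a b c`, `3` and `[]`, and the second to print `a-b--`.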
Looking more closely at
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
it can be argued that assigning to NF *is* well defined, but it could be
much clearer. The syntax for a simple assignment is:
    lvalue '=' expr
where an lvalue is one of:
    NAME
    NAME '[' expr_list ']'
    '$' expr
and:
    The token NAME shall consist of a word that is not a keyword or a
    name of a built-in function and is not followed immediately (without
    any delimiters) by the '(' character.
Which implies that, for example, `NF = 10` is valid.
Also, NF is a "special variable", which weakly implies that it's
assignable.
On the other hand, it also implies that `foo = 42` is valid where `foo`
is the name of a user-defined function (gawk disallows it). It should
say that the name of a user-defined function is not an lvalue.
The POSIX description reads to me as if the authors just didn't think
about whether assigning to NF, or to user-defined function names, should
be permitted. The behavior of adding or removing fields when NF is
modified by assignment is, I suggest, something that should be stated
explicitly.
[...]

> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
> """
> NF
>   The number of fields in the current record. Inside a BEGIN action,
>   the use of NF is undefined unless a getline function without a var
>   argument is executed previously. Inside an END action, NF shall
>   retain the value it had for the last record read, unless a
>   subsequent, redirected, getline function without a var argument is
>   performed prior to entering the END action.
>
> This looks defective. The value of NF observed in END must obviously
> be the last stored one, however it was stored, whether by assignment
> or getline.
>
> Note that NF is also recalculated if $0 is assigned, which is
> explicitly required in the document; it is glaringly defective to
> be appearing to be making an exception for getline but not for
> assignment to $0.

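
The point about END can be checked with a small sketch. It assumes an awk that keeps NF across the end of input; the function names are hypothetical and only label the two ways NF gets stored.

```sh
# NF as left by an assignment to $0 in the main rule.
nf_after_record_assignment() {
    printf 'one two three\n' | awk '{ $0 = "a b c d e" } END { print NF }'
}

# NF as left by a direct assignment to NF.
nf_after_nf_assignment() {
    printf 'one two three\n' | awk '{ NF = 2 } END { print NF }'
}

nf_after_record_assignment
nf_after_nf_assignment
```

If NF in END is simply the last stored value, these print 5 and 2, even though the last record read had three fields.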
-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */