Re: "sed" question

Subject: Re: "sed" question
From: Keith.S.Thompson+u (at) *nospam* gmail.com (Keith Thompson)
Newsgroups: comp.lang.awk
Date: 08 Mar 2024, 05:06:00
Organization: None to speak of
Message-ID: <87zfv9mkpj.fsf@nosuchdomain.example.com>
References: 1 2 3 4 5 6 7 8
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Grant Taylor <gtaylor@tnetconsulting.net> writes:
On 3/7/24 18:09, Keith Thompson wrote:
I know that's what awk does, but I don't think I would have expected
it if I didn't know about it.
>
Okay.  I think that's a fair observation.
>
$0 is the current input line.
>
Or $0 is the current /record/ in awk parlance.
Yes.

If you don't change anything, or if you modify $0 itself, whitespace
between fields is preserved.
>
If you modify any of the fields, $0 is recomputed and whitespace
between tokens is collapsed.
>
I don't agree with that.
>
   % echo 'one  two   three' | awk '{print $0; print $1,$2,$3}'
   one  two   three
   one two three
>
I didn't /modify/ anything and awk does print the fields with
different white space.
That's just the semantics of print with comma-delimited arguments, just
like:
    % awk 'BEGIN{a="foo"; b="bar"; print a, b}'
    foo bar
Printing the values of $1, $2, and $3 doesn't change $0.  Writing to any
of $1, $2, $3, even with the same value, does change $0.
    $ echo 'one  two   three' | awk '{print $0; print $1,$2,$3; print $0; $2 = $2; print $0}'
    one  two   three
    one two three
    one  two   three
    one two three
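
The separator used when $0 is rebuilt is OFS, the same variable that
print puts between comma-separated arguments.  A minimal sketch (the
function name rebuild_with is made up for illustration):

```shell
# rebuild_with SEP: assign $1 to itself so that awk rebuilds $0,
# joining the fields with OFS set to SEP.
rebuild_with() {
    awk -v OFS="$1" '{ $1 = $1; print }'
}

echo 'one  two   three' | rebuild_with '-'    # one-two-three
echo 'one  two   three' | rebuild_with ' '    # one two three
```

With the default OFS (a single space) this is the collapsing effect
shown above.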

awk *could* have been defined to preserve inter-field whitespace
even when you modify individual fields,
>
I question the veracity of that.  Specifically when lengthening or
shortening the value of a field.  E.g. replacing "two" with
"fifteen". This is particularly germane when you look at $0 as a fixed
width formatted output.
But awk doesn't work with fixed-width data.  The length of each field,
and the length of $0, is variable.
If awk *purely* dealt with input lines only as lists of tokens, then
this:
    echo 'one  two   three' | awk '{print $0}'
would print "one two three" rather than "one  two   three" (and awk would
lose the ability to deal with arbitrarily formatted input).  The fact
that the inter-field whitespace is reset only when individual fields are
touched feels arbitrary to me.
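
For what it's worth, a field can be replaced without disturbing the
spacing by editing $0 by hand rather than assigning to the field.  A
rough sketch, assuming the default FS and single-line records (the
function name set_field and its arguments are hypothetical, not
anything awk provides):

```shell
# set_field N VALUE: replace field N on each line, copying the original
# whitespace through unchanged.  Assumes default field splitting.
set_field() {
    awk -v n="$1" -v new="$2" '{
        rest = $0; out = ""
        for (i = 1; i <= NF; i++) {
            match(rest, /[^ \t]+/)                  # next field in the remaining text
            field = substr(rest, RSTART, RLENGTH)
            if (i == n) field = new                 # swap in the new value
            out = out substr(rest, 1, RSTART - 1) field   # keep leading whitespace verbatim
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest                              # rest holds any trailing whitespace
    }'
}

echo 'one  two   three' | set_field 2 fifteen    # one  fifteen   three
```

Since $0 is only read and never has a field assigned, awk never
rebuilds it, so the runs of whitespace survive even though the
replacement is longer than the original field.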

and I think I would have found that more intuitive.
>
I don't agree.
>
(And ideally there would be a way to refer to that inter-field
whitespace.)
>
Remember, awk is meant for working on fields of data in a record.  By
default, the fields are delimited by white space characters.  I'll say
it this way: awk is meant for working on the non-white space
characters.  Or yet another way, awk is not meant for working on
white space characters.
Awk has strong builtin support for working on whitespace-delimited
fields, and that support tends to ignore the details of that whitespace.
But you can also write awk code that just deals with $0.
One trivial example:
    awk '{ count += length + 1 } END { print count }'
behaves similarly to `wc -c`, and counts whitespace characters just like
any other characters.
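
A quick way to check that one-liner against `wc -c`, wrapped in a
function for convenience (count_chars is just an illustrative name):

```shell
# count_chars: add up the length of each line plus one for its newline.
count_chars() {
    awk '{ count += length + 1 } END { print count }'
}

printf 'one  two   three\n' | count_chars    # 17
printf 'one  two   three\n' | wc -c          # 17 (some wc implementations pad the number)
```

The two agree for ASCII input that ends in a newline.  With multibyte
characters, length may count characters while `wc -c` counts bytes, so
the numbers can differ.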

The fact that modifying a field has the side effect of messing up $0
seems counterintuitive.
>
Maybe.
>
But I think it's one that is acceptable for what awk is intended to do.
It's also the existing behavior, and changing it would break things, so
I wouldn't suggest changing it.

Perhaps the behavior matches your intuition better than it matches
mine.
>
I sort of feel like you are wanting to / trying to use awk in places
where sed might be better.  sed just sees a string of text and is
ignorant of any structure without a carefully crafted RE to provide it.
Not really.  I'm just remarking on one particular awk feature that I
find a bit counterintuitive.
Awk is optimized for working on records consisting of fields, and not
caring much about how much whitespace there is between fields.  But it's
flexible enough to do *lots* of other things.
[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */

