Subject : Re: a sed question
From : janis_papanagnou+ng (at) *nospam* hotmail.com (Janis Papanagnou)
Newsgroups : comp.unix.shell
Date : 22. Dec 2024, 00:50:45
Organisation : A noiseless patient Spider
Message-ID : <vk7k8m$9b25$1@dont-email.me>
References : 1 2 3 4 5
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
On 21.12.2024 22:41, Keith Thompson wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 21.12.2024 13:17, Salvador Mirzo wrote:
[...]
As previously mentioned, 'sed' might not be the best choice for
developing such scripts; you might want to consider learning 'awk'.
>
$ git log --oneline | head -1 | awk '{print $1}'
2566d31
>
With Awk you don't need 'head'; it can be done like this:
>
$ git log --oneline | awk 'NR==1 {print $1}'
>
(For long input files you may want an early exit
...| awk 'NR==1 { print $1 ; exit(0) }'
but that's just an aside.)
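A rough sketch of what the early exit buys, assuming a generated temporary file as a hypothetical stand-in for a long input. The END rule is only there to report how many records awk actually read; it is not part of the original one-liner.

```shell
# Hypothetical stand-in for a long input file: 100000 generated lines.
tmp=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "id" i, "payload" }' > "$tmp"

# Without exit: prints field 1 of the first line, but still reads every line.
no_exit=$(awk 'NR == 1 { print $1 } END { print NR }' "$tmp")

# With exit: stops reading after the first line; END still runs and reports NR.
with_exit=$(awk 'NR == 1 { print $1; exit } END { print NR }' "$tmp")

printf '%s\n' "$no_exit" "$with_exit"
rm -f "$tmp"
```

Both variants print the same first field; only the record count reported by END differs, which is the work the early exit saves.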
[...]
This raises another issue: it's often possible to replace a command in a
pipeline that filters output with an option to the command that does the
same thing. There's no general rule for how to do this, since different
commands do things differently, but for the example above:
git log --oneline -n 1 | awk '{print $1}'
Yes. I just used the OP's sample to show the principle
(rather than making up an example of my own to illustrate the case).
In practice it goes even further: with Awk, typical pipeline sequences
built from utilities like cat, head, tail, grep, cut, sed, tr, wc, seq,
tee, etc. can usually all be expressed and combined in a single Awk
program. An additional benefit is that if you want to pass some context
information from a tool near the front of the pipe to a tool near the
other end, you can maintain arbitrary state within the Awk program.
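As a minimal sketch of both points, assuming made-up sample data (a "# run=..." header followed by "name:value" records, none of which comes from the thread): the first awk call stands in for a grep | cut | head chain, and the second carries a value from the first line along to every later record.

```shell
# Hypothetical sample data: a header line followed by "name:value" records.
data=$(printf '%s\n' '# run=42' 'alpha:1' 'beta:2' 'gamma:3')

# Conventional pipeline: drop the comment line, keep field 1, stop after two lines.
piped=$(printf '%s\n' "$data" | grep -v '^#' | cut -d: -f1 | head -n 2)

# The same selection done by a single awk process.
single=$(printf '%s\n' "$data" | awk -F: '!/^#/ { print $1; if (++n == 2) exit }')

# State kept inside awk: remember the run id from the header line and
# attach it to each later record, which separate pipeline stages cannot share.
tagged=$(printf '%s\n' "$data" | awk -F: '
    /^# run=/ { sub(/^# run=/, ""); run = $0; next }
    { print run ":" $1 }
')

printf '%s\n' "$piped" "$single" "$tagged"
```

Whether the single-process form is easier to read than the pipeline is a matter of taste; the state-carrying case is where Awk has the clearer advantage.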
Of course, if you can _reduce_ the amount of data at an early stage
(like in your 'git -n 1' sample) the earlier the better! (My 'git',
BTW, doesn't seem to support an option '-n'; which might be another
reason to let a standard tool like Awk do the task for which it has
been defined, text-processing.)
or even:
git log -n 1 --format=%h
I haven't memorized the "--format" option, so I don't generally use it in
ad-hoc one-liners, but I do use it in scripts. Note both of the above
commands avoid generating the entire list of log entries, which could
save significant time on a large repo.
Using unnecessary commands in pipelines is Mostly Harmless, but IMHO
it's good to think about how to do things more efficiently. See also
"Useless use of cat" (UUOC).
Janis