Subject : Re: Case Insensitive File Systems -- Torvalds Hates Them
From : invalid (at) *nospam* invalid.invalid (Richard Kettlewell)
Newsgroups : comp.os.linux.misc
Date : 08. May 2025, 09:22:51
Organisation : terraraq NNTP server
Message-ID : <wwvr00zo6jo.fsf@LkoBDZeT.terraraq.uk>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
User-Agent : Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
not@telling.you.invalid (Computer Nerd Kev) writes:
> Richard Kettlewell <invalid@invalid.invalid> wrote:
>> Second, the issue in shell is that newlines (or spaces) interact
>> badly with its approach to string handling: a filename can cause a
>> script to unexpectedly fail. For all that C has truly awful string
>> handling, it doesn't go awry just because there's a space or newline in
>> a string that it's working with.
>
> When dealing with programs like find, sort, uniq etc. it's more of
> a data format issue than a shell issue.
Partly agreed - but find/sort/etc are very much part of the wider shell
ecosystem and they inherit their assumptions from the shell data
model. Normally if you want to walk a file tree, or sort some data, in C
(or most other languages) you do that directly rather than calling these
programs. i.e. if you want to order a list in shell, you use sort;
whereas in C, the first thing you reach for is likely to be qsort().
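
To illustrate the qsort() point, here is a minimal sketch in C. The helper names sort_names and compare_names are made up for this example, and plain byte-wise strcmp() ordering is an assumption (a real program might want locale-aware collation instead). Because the strings never pass through a line-oriented text format, a space or newline inside a name is just another byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort(): each array element is a
   `const char *`, so the arguments point at those pointers. */
static int compare_names(const void *a, const void *b) {
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp(*sa, *sb);
}

/* Hypothetical helper: sort an array of n C strings in place,
   in byte-wise (strcmp) order. */
void sort_names(const char **names, size_t n) {
  if (n == 0)
    return; /* nothing to sort */
  assert(names != NULL);
  qsort(names, n, sizeof names[0], compare_names);
}
```

Called on an array such as {"b\nfile", "a file", "c"}, this reorders the three pointers and leaves every name intact, embedded newline included.
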
Also modern versions of the tools have long since recognized this and
sprouted options for separating input elements with nulls. That still
leaves them unsuitable for completely general data processing, and it
doesn’t fix shell’s other string-handling issues, but at least when it
comes to filenames it’s a _solved_ data format issue.
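
As a rough sketch of the consuming side, the C function below splits a buffer of NUL-terminated records, the format emitted by options such as GNU find's -print0 or sort's -z. The function name next_nul_record and the choice to treat an unterminated trailing fragment as incomplete are assumptions made for this example, not something any particular tool prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the next NUL-terminated record in
   buf[0..len), starting the search at *pos. On success, store the
   record's length in *rec_len and advance *pos past its terminator.
   Return NULL when no complete record remains. */
const char *next_nul_record(const char *buf, size_t len,
                            size_t *pos, size_t *rec_len) {
  assert(buf != NULL && pos != NULL && rec_len != NULL);
  if (*pos >= len)
    return NULL;
  const char *start = buf + *pos;
  const char *end = memchr(start, '\0', len - *pos);
  if (end == NULL)
    return NULL; /* trailing bytes without a terminator: incomplete */
  *rec_len = (size_t)(end - start);
  *pos += *rec_len + 1;
  return start;
}
```

Since NUL is the one byte that cannot occur in a Unix filename, a name containing a newline comes back as a single record rather than being split in two.
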
> Of course a C program can use any character to separate strings,
> but newlines are most common in existing UNIX tools for text string
> processing, and most easily human-readable, so it's convenient to
> use that data format. But it means assuming that newlines in
> filenames won't actually appear.
Elevating “convenient for humans” outside the scope where it actually
works well is probably one of the underlying errors in all this.
Humans can apply context, wider knowledge, etc to interpreting things;
computers just do exactly what they’ve been programmed to.
“To err is human, to really fuck things up it takes a computer.”
-- https://www.greenend.org.uk/rjk/