Subject : Re: Case Insensitive File Systems -- Torvalds Hates Them
From : blackhole.8.zarquon42 (at) *nospam* spamgourmet.com (Andreas Dehmel)
Newsgroups : comp.os.linux.misc comp.os.linux.advocacy
Date : 29. Apr 2025, 19:11:19
Organization : Zarquon's Den
Message-ID : <20250429201119.736dc05c@blackbird.dehmel-lan.de>
References : 1 2 3 4 5
User-Agent : Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu)
On Mon, 28 Apr 2025 18:56:18 +0000 Farley Flud <ff@linux.rocks> wrote:
> On Mon, 28 Apr 2025 11:12:42 -0700, John Ames wrote:
>> Just so, it seems to me. Of course it's many years too late for
>> *nix to course-correct on this, but it was a stupid design decision
>> in 1970 and it remains stupid now. Well, such is the nature of
>> things in this vale of sin and tears...
> Case insensitivity was only idiotic at the beginning, but now, in the
> age of Unicode, it is supremely idiotic.
> Consider the German "sharp s," which I cannot enter as UTF-8 here.
> But the lower case sharp s maps into TWO DIFFERENT upper case chars:
> <can't enter> and "SS," e.g. STRASSE or <can't enter>.
That merely illustrates the point that whoever decided to model it like
this in Unicode was truly a numbskull. For two reasons:
1) just because the result _looks_ like SS doesn't mean it has to be
two characters. A Unicode character can look like anything, even a full
word (and beyond). The only reason to use two characters would be
hyphenation, which in this case is explicitly forbidden. Someone didn't
understand the difference between syntax and semantics.
2) this transformation is not trivially invertible. No, you can't just
translate every SS back to ß, you'd pretty much need an AI to invert
this. Whenever you're introducing a transformation that's trivial in
one direction and extremely hard in the other, and you're not working
in cryptography, you're doing something extremely, horribly wrong.
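A minimal sketch of that round trip, assuming Python's built-in str.upper(), str.lower() and str.casefold() as a stand-in for the default Unicode case mappings (no locale tailoring). The word pair is just an illustration picked for this example:

```python
# Round trip of the German sharp s through Python's built-in case mappings.
word = "stra\u00dfe"          # "strasse" spelled with a sharp s
upper = word.upper()          # full uppercasing expands the sharp s to "SS"
back = upper.lower()          # lowercasing cannot know which "SS" was a sharp s

print(upper)                  # STRASSE
print(back)                   # strasse
print(back == word)           # False: the round trip lost information

# Two distinct words collapse to the same uppercase string, so no
# context-free rule can map the result back.
masse = "masse"               # "Masse" (mass)
masze = "ma\u00dfe"           # "Masse" spelled with a sharp s (measurements)
print(masse.upper() == masze.upper())    # True

# casefold() is the comparison-oriented mapping; it also sends sharp s to "ss".
print(word.casefold() == "strasse")      # True
```

Going from "SS" back to the right spelling needs a dictionary at the very least, which is the asymmetry described above.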
> There are special rules on case folding for thousands of Unicode chars
> and the "sharp s" example is one of the simplest.
I seriously doubt that, especially since many (most?) languages don't
even know what "case" is supposed to be in the first place (Japanese,
for one, and I'm pretty sure it's the same in Chinese and most other
Asian languages, which incidentally take up the most code points). And
even if it were true, that'd mean we'd need a couple of thousand
additional code points for these special cases, out of more than a
million -- who cares, the gender-neutral-smileys-crowd?
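A rough way to put numbers on this is to walk the whole code space and count. This is only a sketch: it assumes Python's str.upper() and str.lower() are a fair proxy for the Unicode case mappings (they apply the unconditional mappings only, no locale-specific rules), and the counts depend on the Unicode version bundled with the interpreter, so none are quoted here.

```python
import sys
import unicodedata

TOTAL = sys.maxunicode + 1    # size of the Unicode code space (0x110000)

cased = 0        # code points that change under upper() or lower()
expanding = []   # code points whose mapping is longer than one character

for cp in range(TOTAL):
    if 0xD800 <= cp <= 0xDFFF:    # surrogates are not characters, skip them
        continue
    ch = chr(cp)
    up, lo = ch.upper(), ch.lower()
    if up != ch or lo != ch:
        cased += 1
    if len(up) > 1 or len(lo) > 1:
        expanding.append(ch)

print("Unicode data version:", unicodedata.unidata_version)
print("code space size:", TOTAL)
print("code points with a case mapping:", cased)
print("of which map to several characters:", len(expanding))
```

The last figure is the set that would need dedicated single code points if every case mapping were to be one-to-one.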
Andreas Dehmel