Subject : Re: Sorting problem with Unix sort(1) with UTF-8 punctuation characters - locale issue
From : naddy (at) *nospam* mips.inka.de (Christian Weisgerber)
Newsgroups : comp.unix.shell
Date : 19. Feb 2025, 21:22:45
Message-ID : <slrnvrcfcl.3e0.naddy@lorvorc.mips.inka.de>
References : 1
User-Agent : slrn/1.0.3 (FreeBSD)

On 2025-02-19, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> If anything, I'd expected LC_COLLATE to have an effect on sorting.
> Then there's no locale with @isodate on that sort-defunct system.
> And clearing that LC_TIME locale or removing the "@isodate" part
> did not change anything; it needs that setting to a non-existing
> locale file to work correctly on the otherwise not correctly
> sorting system.

My working hypothesis would be that setting LC_TIME to a nonexistent
locale causes an error that invalidates the _whole_ locale setting
and causes a fallback to a default setting, likely the "C" locale.
You can check that sorting with LC_ALL=C or an invalid value like
LC_ALL=foobar will produce your "correct" result.
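
A minimal sketch of that check, assuming a POSIX shell. The sample lines are made up for illustration and are not the data from the original post; "foobar" is simply a locale name that should not exist.

```shell
# Hypothetical sample data (NOT the poster's data): mixed-case ASCII lines.
data=$(printf '%s\n' b A a B)

# Sort in the plain "C" locale, i.e. by byte value.
sorted_c=$(printf '%s\n' "$data" | LC_ALL=C sort)

# Sort with a deliberately nonexistent locale name.
sorted_bogus=$(printf '%s\n' "$data" | LC_ALL=foobar sort 2>/dev/null)

# Compare the two results.
if [ "$sorted_c" = "$sorted_bogus" ]; then
    echo "invalid locale sorted like C"
else
    echo "invalid locale sorted differently from C"
fi
```

If both runs agree, that is consistent with the fallback explanation, though it does not prove it. Substitute the real data to see whether the "correct" order is simply the C order.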
A corollary of this would be that your "sort-defunct" system uses
a different collation order than your "correctly" sorting system
for the de_DE.UTF-8 locale.
On the FreeBSD 14-STABLE system I'm typing this on, sorting your
example data with my typical C.UTF-8 locale produces your expected
result; sorting with de_DE.UTF-8 (or en_US.UTF-8) produces a different
order:
····**·······**················< abc1
···········**······**··········< efg2
·**·························**·< hij3
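
To compare locales side by side on a given system, something like the following sketch might help. The helper name sort_in_locale and the sample lines are hypothetical. Note that a locale which is not installed will not fail visibly here; sort then runs with whatever fallback the system picks, which is the very effect discussed above.

```shell
# Hypothetical helper: sort stdin under the locale named in $1.
sort_in_locale() {
    LC_ALL=$1 sort
}

# Made-up sample lines mixing letters and punctuation; replace with real data.
sample=$(printf '%s\n' 'b 1' 'a-2' 'B.3' 'A,4')

# Print the resulting order for each locale of interest.
for loc in C C.UTF-8 de_DE.UTF-8 en_US.UTF-8; do
    printf '== %s ==\n' "$loc"
    printf '%s\n' "$sample" | sort_in_locale "$loc"
done
```

Running the same loop on both machines and diffing the output would show whether the two systems really ship different collation tables for de_DE.UTF-8.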
Also, I have no idea what could be considered the "correct" sorting
order for this.

-- 
Christian "naddy" Weisgerber                          naddy@mips.inka.de