No matter how you choose to do it, you will get it wrong sometimes. Case-insensitive comparison has language-specific details in addition to the character in the Unicode tables. Should the lower-case version of "SS" be "ss" or "ß"? That depends on the language and the position of the letters. Should the capital of "ß" be "SS" or "ẞ"? Should the capital of "i" be "I" or "İ"? Some languages have a letter "dz" - some of those capitalise it as "DZ", others as "Dz".

On 5/7/2025 7:58 AM, Janis Papanagnou wrote:
> On 07.05.2025 12:08, BGB wrote:
>> [...]
>>
>> Though, if someone really must make something case-insensitive, a case
>> could be made for only supporting it for maybe Latin, Greek, and
>> Cyrillic.
>
> I don't understand what you want to say here; it just sounds strange
> to me. - Mind to elaborate?
>

Latin, Greek, and Cyrillic are the main alphabets which actually have a useful and reasonably well defined concept of "case", and thus "case folding" actually makes sense for these.

For most other places, it does not, and one can likely ignore rules for things outside of these alphabets. Can eliminate a bunch of rules for alphabets that don't actually have "case" as we would understand it.
By limiting rules in these ways, a simpler and more manageable set of rules is possible. Vs, say, actual Unicode rules, which tend to have stuff going on all over the place.
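As a rough illustration of such a restricted rule set, here is a minimal Python sketch. It is only one possible reading of the idea, not anyone's actual implementation: the names `CASED_SCRIPTS`, `fold_char`, `fold_restricted` and `equal_ignoring_case` are made up for this example, and using the Unicode character name prefix as a stand-in for "which alphabet is this" is an assumption made for brevity.

```python
import unicodedata

# Assumption for this sketch: a character belongs to one of the three
# alphabets if its Unicode name starts with one of these words.
CASED_SCRIPTS = ("LATIN", "GREEK", "CYRILLIC")


def fold_char(ch):
    """Fold one character, but only for the three chosen alphabets."""
    name = unicodedata.name(ch, "")
    if name.startswith(CASED_SCRIPTS):
        folded = ch.casefold()
        # Keep only one-to-one mappings; multi-character foldings
        # (such as "ß" -> "ss") are deliberately left alone.
        if len(folded) == 1:
            return folded
    return ch


def fold_restricted(s):
    """Apply the restricted folding to every character of s."""
    return "".join(fold_char(ch) for ch in s)


def equal_ignoring_case(a, b):
    """Compare two strings under the restricted folding."""
    return fold_restricted(a) == fold_restricted(b)
```

With these rules "README.TXT" and "readme.txt" compare equal, as do "ДОМ" and "дом", while "STRASSE" and "straße" do not, since the one-to-many folding of "ß" is skipped. Whether that last result is right is exactly the kind of language-specific question a simplified rule set cannot settle.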
Ligatures pose an issue, though; presumably the option is one of:
Case fold between ligatures, when both variants exist;
Treat the ligature as its own character;
Decompose and compare.
Though, FWIW, in my normalization code, I mostly ignored ligatures, as while they could be decomposed in many cases, they could only be recomposed for locales that actually use said ligature (like, in English, if AE and IJ started spontaneously merging into new characters, this would be weird and out of place; and having a filesystem layer that merely decomposed any ligatures it encountered would not be ideal).
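For the "decompose and compare" option, a small Python sketch might look like the following. It is an illustration under stated assumptions rather than the normalization code mentioned above: `EXTRA_DECOMPOSITIONS`, `decompose_ligatures` and `same_ignoring_ligatures` are hypothetical names, and the hand-written table is there because Æ and Œ have no decomposition in the Unicode data, whereas Ĳ and ﬁ have compatibility decompositions that NFKD applies.

```python
import unicodedata

# Hypothetical table for this sketch: these letters have no Unicode
# decomposition, so they are expanded by hand.
EXTRA_DECOMPOSITIONS = {"Æ": "AE", "æ": "ae", "Œ": "OE", "œ": "oe"}


def decompose_ligatures(s):
    """Expand ligatures into their component letters."""
    expanded = "".join(EXTRA_DECOMPOSITIONS.get(ch, ch) for ch in s)
    # NFKD covers ligatures with a compatibility decomposition
    # (for example U+0132 "Ĳ" -> "IJ", U+FB01 "ﬁ" -> "fi").
    # It also decomposes accented letters, which is harmless here
    # because both sides of a comparison get the same treatment.
    return unicodedata.normalize("NFKD", expanded)


def same_ignoring_ligatures(a, b):
    """Compare two names after decomposing ligatures on both sides."""
    return decompose_ligatures(a) == decompose_ligatures(b)
```

The decomposed form is used only as a comparison key here; the stored name is left untouched, which sidesteps the recomposition problem at the cost of treating "Æther" and "AEther" as the same name everywhere.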
>> Ideally, this would be better handled in a file-browser or
>> similar, and not in the VFS or FS driver itself.
>
> Janis