Re: locale/LC_CTYPE vs strcasecmp?

Subject: Re: locale/LC_CTYPE vs strcasecmp?
From: wbe (at) *nospam* UBEBLOCK.psr.com.invalid (Winston)
Newsgroups: comp.unix.bsd.freebsd.misc
Date: 27 Mar 2024, 17:16:40
Organization: A noiseless patient Spider
Message-ID: <ydbk6zhfhj.fsf@UBEblock.psr.com>
User-Agent: Gnus/5.13 (Gnus v5.13)
I originally posted:
>> The man page says strcasecmp_l() takes an explicit locale.
>> The implication is that strcasecmp() uses the current locale
>> (presumably as set by setlocale()).

to which Christian Weisgerber <naddy@mips.inka.de> kindly replied:
> Yes.
> src/lib/libc/string/strcasecmp.c:
>
> int
> strcasecmp(const char *s1, const char *s2)
> {
>         return strcasecmp_l(s1, s2, __get_locale());
> }
>
> :-)
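
For illustration, the explicit-locale variant can be called directly without changing the process-wide locale. This is a minimal sketch assuming a POSIX.1-2008 system that declares strcasecmp_l() in <strings.h> and newlocale() in <locale.h>; the helper name casecmp_in_locale is hypothetical and not part of libc.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes newlocale() and strcasecmp_l() */
#include <locale.h>
#include <strings.h>

/*
 * Hypothetical helper (not a libc function): compare s1 and s2
 * case-insensitively under the LC_CTYPE rules of the named locale,
 * leaving the global locale set by setlocale() untouched.
 * Returns 0 on success and stores the comparison in *result;
 * returns -1 if the locale cannot be created.
 */
int casecmp_in_locale(const char *name, const char *s1, const char *s2,
                      int *result)
{
    locale_t loc = newlocale(LC_CTYPE_MASK, name, (locale_t)0);

    if (loc == (locale_t)0)
        return -1;                 /* unknown or unavailable locale */

    *result = strcasecmp_l(s1, s2, loc);
    freelocale(loc);
    return 0;
}
```

Like strcasecmp() itself, this still compares one byte at a time, so it only demonstrates how the locale argument is supplied.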

>> After calling setlocale(LC_ALL, "uk_UA.UTF-8"), I'm seeing that
>> strcasecmp() is not, in fact, case-independently matching non-ASCII
>> UTF-8 strings: it's case sensitive (the ASCII equivalent in this
>> case being that "Abc" isn't matching "abc").

> UTF-8 characters are multibyte.  You need to convert the strings
> to wide characters and use wcscasecmp().
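
The conversion suggested above might look roughly like the following. It is a sketch under stated assumptions, not code from this thread: mb_to_wide and mb_casecmp are hypothetical helper names, the caller is assumed to have already selected a UTF-8 locale with setlocale(), and mbstowcs(NULL, s, 0) is assumed to return the required length, as POSIX systems do.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes wcscasecmp() */
#include <locale.h>               /* setlocale(), which callers invoke first */
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte string to a newly allocated
 * wide string using the current LC_CTYPE locale. Returns NULL on an
 * invalid multibyte sequence or allocation failure. */
static wchar_t *mb_to_wide(const char *s)
{
    size_t n = mbstowcs(NULL, s, 0);          /* wide chars needed, sans NUL */
    wchar_t *w;

    if (n == (size_t)-1)
        return NULL;                          /* invalid multibyte sequence */
    w = malloc((n + 1) * sizeof *w);
    if (w != NULL)
        mbstowcs(w, s, n + 1);                /* convert, including the NUL */
    return w;
}

/* Hypothetical helper: case-insensitive comparison of two multibyte
 * strings. Returns 0 on success with the wcscasecmp() result in *result,
 * or -1 if either string could not be converted. */
int mb_casecmp(const char *s1, const char *s2, int *result)
{
    wchar_t *w1 = mb_to_wide(s1);
    wchar_t *w2 = mb_to_wide(s2);
    int rc = -1;

    if (w1 != NULL && w2 != NULL) {
        *result = wcscasecmp(w1, w2);
        rc = 0;
    }
    free(w1);                                 /* free(NULL) is harmless */
    free(w2);
    return rc;
}
```

After a successful setlocale(LC_CTYPE, "uk_UA.UTF-8"), calling mb_casecmp("Абв", "абв", &r) would be expected to set r to 0, provided the locale's case tables cover those characters. Reporting conversion failure separately from the comparison result avoids confusing an invalid string with a match.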

That is what one would expect, and perfectly reasonable, but something
(I forget what now) led me to think that if strcasecmp() accepted UTF-8
locales, maybe it *would* be willing to fold case in them, just operating
one byte at a time instead of on whole multibyte characters.
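
To make the byte-at-a-time limitation concrete: in UTF-8, Cyrillic "А" (U+0410) is the bytes D0 90 and "а" (U+0430) is D0 B0, while the pair U+0420/U+0440 is D0 A0 versus D1 80, so even the lead byte can change between cases. A per-byte tolower() has no way to know which mapping applies. The function below is an illustrative sketch in the spirit of strcasecmp(), not the FreeBSD libc source, and the name bytewise_casecmp is hypothetical.

```c
#include <ctype.h>

/*
 * Hypothetical illustration: compare two strings by folding each byte
 * independently with tolower(). ASCII letters fold as expected, but the
 * bytes of a multibyte UTF-8 character are not letters on their own,
 * so they are compared unchanged.
 */
int bytewise_casecmp(const char *s1, const char *s2)
{
    const unsigned char *u1 = (const unsigned char *)s1;
    const unsigned char *u2 = (const unsigned char *)s2;

    while (tolower(*u1) == tolower(*u2)) {
        if (*u1 == '\0')
            return 0;              /* both strings ended together */
        u1++;
        u2++;
    }
    return tolower(*u1) - tolower(*u2);
}
```

In the default "C" locale this treats "Abc" and "aBC" as equal but reports a difference for "\xd0\x90" versus "\xd0\xb0", which matches the behaviour described above for non-ASCII strings.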

Thanks for confirming that, Christian.  Onward to upgrading this
code that should have been doing that already ...
 -WBE

