Subject : Re: Simple string conversion from UCS2 to ISO8859-1
From : janis_papanagnou+ng (at) *nospam* hotmail.com (Janis Papanagnou)
Newsgroups : comp.lang.c
Date : 22. Feb 2025, 05:29:14
Organization : A noiseless patient Spider
Message-ID : <vpbjqs$3qgam$1@dont-email.me>
References : 1 2 3 4
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
On 22.02.2025 04:00, Lawrence D'Oliveiro wrote:
> On Fri, 21 Feb 2025 13:42:13 +0100, pozz wrote:
>> Yes, I have to decide, but it is a very big problem (there are thousands
>> of Unicode symbols that can be approximated to another ISO8859-1 code).
>> I'm wondering if such an approximation is just implemented somewhere.
>
> If you look at NamesList.txt, you will see, next to each character,
> references to others that might be similar or related in some way.
> They say not to try to parse that file automatically, but I’ve had some
> success doing exactly that ... so far ...
I wonder why they say so, given that there's a syntax description
available on their pages (see the respective HTML file[*]).
BTW, I'm curious about this [informal] part of the syntax description:
    LF: <any sequence of a single ASCII 0A or 0D, or both>
It looks like they accept not only LF, CR, CR-LF, but also LF-CR.
Is the latter of any practical relevance?
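Read literally, that rule could be sketched in C roughly as below. This is only an illustration of my reading, not anything taken from the Unicode pages: the names eol_length and count_line_ends are made up, and I'm assuming "or both" means one 0A plus one 0D in either order, so that two identical bytes in a row count as two separate line ends.

```c
#include <stddef.h>

/* Hypothetical helper: length in bytes (0, 1 or 2) of one line
 * terminator starting at s[0], looking at no more than len bytes.
 * Assumption: LF, CR, CR-LF and LF-CR are each ONE terminator. */
size_t eol_length(const char *s, size_t len)
{
    if (len == 0 || (s[0] != '\n' && s[0] != '\r'))
        return 0;                       /* not at a line end */

    /* A second terminator byte joins the first only if it differs
     * from it; "\n\n" or "\r\r" are two line ends, not one. */
    if (len > 1 && (s[1] == '\n' || s[1] == '\r') && s[1] != s[0])
        return 2;

    return 1;
}

/* Hypothetical helper: count the line terminators in a buffer. */
size_t count_line_ends(const char *s, size_t len)
{
    size_t count = 0;
    size_t i = 0;

    while (i < len) {
        size_t k = eol_length(s + i, len - i);
        if (k > 0) {
            count++;
            i += k;                     /* skip the whole terminator */
        } else {
            i++;                        /* ordinary byte */
        }
    }
    return count;
}
```

With this reading, "\n\r" is consumed as a single line end just like "\r\n", while "\n\n" is still an empty line between two line ends.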
Janis
[*] https://www.unicode.org/Public/UCD/latest/ucd/NamesList.html