On 21/02/2025 13:05, Richard Damon wrote:
On 2/21/25 6:40 AM, pozz wrote:
I want to write a simple function that converts a UCS2 string into
ISO8859-1:
>
void ucs2_to_iso8859p1(char *ucs2, size_t size);
>
ucs2 is a string of hex digits such as "00480065006C006C006F" for
"Hello". I'm passing size because ucs2 isn't null terminated.
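
To make the input format concrete, here is a minimal sketch of decoding one character from such a string. It assumes each UCS2 code unit is written as exactly four hex digits, most significant digit first, as in the "Hello" example. The names hex_value and parse_ucs2_unit are hypothetical helpers, not part of the original post.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decode the four hex digits at p into one UCS2
   code unit. Assumes p points at four readable characters.
   Returns 0 on success, -1 if any of them is not a hex digit. */
static int parse_ucs2_unit(const char *p, unsigned *code)
{
    unsigned value = 0;
    for (size_t i = 0; i < 4; i++) {
        int digit = hex_value(p[i]);
        if (digit < 0)
            return -1;
        value = (value << 4) | (unsigned)digit;
    }
    *code = value;
    return 0;
}
```

With this, parse_ucs2_unit("0048", &code) would store 0x48 ('H') in code, and a conversion loop can advance through the buffer four characters at a time.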
>>
It is trivial to convert "0000"-"007F" chars: it's a simple cast from
unsigned int to char.
Note, I think you will find that it is 0000-00FF that match (as I
remember, ISO8859-1 was the starting base for Unicode).
>>
It isn't so simple to convert higher codes. For example, the small e
with grave "00E8" can be converted to 0xE8 in ISO8859-1, so it's
trivial again. But I saw the code "2019" (apostrophe) that can be
rendered as 0x27 in ISO8859-1.
To be correct, u2019 isn't 0x27; it's just a character that looks a lot
like it.
Yes, but as a first approximation, 0x27 is much better than '?' for u2019.
Is there a simplified mapping table that can be written with if/switch?
>
if (code < 0x80) {
    *dst++ = (char)code;
} else {
    switch (code) {
    case 0x2019: *dst++ = 0x27; break; // Apostrophe
    case 0x...: *dst++ = ...; break;
    default: *dst++ = ' ';
    }
}
>
I'm not looking for a very detailed and correct mapping, just a
"sufficient" implementation.
Then you have to decide which are sufficient mappings. No character
above FF *IS* the character below, but some have a close
approximation, so you will need to decide what to map.
Yes, I have to decide, but it is a very big problem (there are thousands
of Unicode symbols that can be approximated by an ISO8859-1 code).
I'm wondering whether such an approximation is already implemented
somewhere. For example, what does iconv() do in this case?
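
As a rough sketch of the if/switch approach discussed in this thread, the code below copies 0000-00FF through unchanged and approximates a handful of typographic characters. The function name ucs2_hex_to_latin1, the separate output buffer, the returned length, and every entry in the approximation table are assumptions made for illustration. Which characters deserve an entry remains a judgment call, and this is not what iconv() or any particular library does.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Illustrative approximation table for code points above 00FF.
   The entries are example choices, not an established mapping. */
static unsigned char approximate_latin1(unsigned code)
{
    switch (code) {
    case 0x2018: case 0x2019: return 0x27; /* single quotation marks -> ' */
    case 0x201C: case 0x201D: return 0x22; /* double quotation marks -> " */
    case 0x2013: case 0x2014: return 0x2D; /* en dash, em dash -> - */
    case 0x2022:              return 0xB7; /* bullet -> middle dot */
    default:                  return '?';  /* no approximation chosen */
    }
}

/* Converts size hex characters from src (four per UCS2 code unit) into
   ISO8859-1 bytes in dst, which must have room for size / 4 bytes.
   Trailing characters that do not form a full group are ignored.
   Returns the number of bytes written. */
size_t ucs2_hex_to_latin1(const char *src, size_t size, unsigned char *dst)
{
    size_t out = 0;
    for (size_t in = 0; in + 4 <= size; in += 4) {
        unsigned code = 0;
        int valid = 1;
        for (size_t i = 0; i < 4; i++) {
            int digit = hex_nibble(src[in + i]);
            if (digit < 0) {
                valid = 0;
                break;
            }
            code = (code << 4) | (unsigned)digit;
        }
        if (!valid)
            dst[out++] = '?';                 /* malformed group */
        else if (code <= 0xFF)
            dst[out++] = (unsigned char)code; /* same value in ISO8859-1 */
        else
            dst[out++] = approximate_latin1(code);
    }
    return out;
}
```

Using unsigned char for the output avoids the implementation-defined conversion that a cast to plain char would involve for values above 0x7F. Extending the table is a matter of adding case labels for whichever characters actually show up in the input.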