Re: Simple string conversion from UCS2 to ISO8859-1

Subject: Re: Simple string conversion from UCS2 to ISO8859-1
From: richard (at) *nospam* damon-family.org (Richard Damon)
Newsgroups: comp.lang.c
Date: 22 Feb 2025, 02:05:22
Organization: i2pn2 (i2pn.org)
Message-ID: <fff7432902b9f6c8daa6b1cc8369632e064187d7@i2pn2.org>
User-Agent: Mozilla Thunderbird
On 2/21/25 7:42 AM, pozz wrote:
On 21/02/2025 13:05, Richard Damon wrote:
On 2/21/25 6:40 AM, pozz wrote:
I want to write a simple function that converts a UCS2 string into ISO8859-1:
>
void ucs2_to_iso8859p1(char *ucs2, size_t size);
>
ucs2 is a string of the form "00480065006C006C006F" for "Hello". I'm passing size because ucs2 isn't null terminated.
>
Typically UCS2 strings ARE null terminated; it's just that the null is two bytes long.
 Sure, but this isn't an issue here.
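
Since the input described above is ASCII hex text rather than raw 16-bit units, the first step is turning each group of four hex characters into one code unit. A minimal sketch, assuming an ASCII-compatible execution character set; the helper names hex_digit_value and ucs2_hex_read are hypothetical and do not come from the thread:

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one ASCII hex digit to its value, or return -1 if it is not one.
 * Assumes 'A'..'F' and 'a'..'f' are contiguous, as they are in ASCII. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Read one UCS2 code unit from the four hex characters starting at `hex`.
 * Returns 0 and stores the value in *code on success; returns -1 as soon
 * as a non-hex character (including a terminating NUL) is found.
 */
static int ucs2_hex_read(const char *hex, uint16_t *code)
{
    uint16_t value = 0;
    for (size_t i = 0; i < 4; i++) {
        int d = hex_digit_value(hex[i]);
        if (d < 0) return -1;
        value = (uint16_t)((value << 4) | (unsigned)d);
    }
    *code = value;
    return 0;
}
```

A caller would step through the buffer four characters at a time, stopping at `size`, and hand each decoded code unit to whatever mapping is chosen.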
 
I know I could use iconv(), but I'm on an embedded platform without an OS and without an iconv() function.
>
It is trivial to convert "0000"-"007F" chars: it's a simple cast from unsigned int to char.
>
Note, I think you will find that it is 0000-00FF that match (as I remember, ISO8859-1 was the starting base for Unicode).
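
The point about 0000-00FF can be sketched as a single range check: any code unit up to 0x00FF keeps its numeric value in ISO8859-1, and everything above needs a separate decision. The function name ucs2_direct_latin1 is hypothetical, chosen only for this illustration:

```c
#include <stdint.h>

/*
 * Returns 1 and stores the ISO8859-1 byte in *out when `code` is in
 * 0x0000-0x00FF, where both encodings use the same numeric value.
 * Returns 0 and leaves *out untouched for anything above 0x00FF.
 */
static int ucs2_direct_latin1(uint16_t code, unsigned char *out)
{
    if (code > 0x00FF) return 0;
    *out = (unsigned char)code;
    return 1;
}
```

With this check, "00E8" (small e with grave) passes straight through as 0xE8, while "2019" is reported as not directly representable.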
>
>
It isn't so simple to convert higher codes. For example, the small e with grave "00E8" can be converted to 0xE8 in ISO8859-1, so it's trivial again. But I saw the code "2019" (apostrophe) that can be rendered as 0x27 in ISO8859-1.
>
To be correct, U+2019 isn't 0x27; it's just a character that looks a lot like it.
 Yes, but as a first approximation, 0x27 is much better than '?' for u2019.
And as such, it is a subjective decision that you need to make.

 
Is there a simplified mapping table that can be written with if/switch?
>
if (code < 0x80) {
   *dst++ = (char)code;
} else {
   switch (code) {
     case 0x2019: *dst++ = 0x27; break;  // Apostrophe
     case 0x...: *dst++ = ...; break;
     default: *dst++ = ' ';
   }
}
>
I'm not looking for a very detailed and correct mapping, just a "sufficient" implementation.
>
Then you have to decide which are sufficient mappings. No character above FF *IS* the character below, but some have a close approximation, so you will need to decide what to map.
Yes, I have to decide, but it is a very big problem (there are thousands of Unicode symbols that could be approximated by some ISO8859-1 code). I'm wondering if such an approximation is already implemented somewhere.
For example, what does iconv() do in this case?
Just look at its code, there will be open source versions of it.
The two real options are to just reject anything above 0xFF, or to have a big table/switch to handle some determined list of things that are "close enough".
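
Both options can live in one routine if it reports how the byte was obtained, so the caller decides whether an approximation or an unmapped code counts as an error. This is only a sketch: the table entries are illustrative picks, not a recommended or complete list, and the names approx_table, conv_result and ucs2_code_to_latin1 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One "close enough" substitution: a code above 0xFF and its stand-in. */
struct approx_entry {
    uint16_t code;
    unsigned char latin1;
};

/* Illustrative entries only; which substitutions are acceptable is a
 * per-application decision. */
static const struct approx_entry approx_table[] = {
    { 0x2013, 0x2D }, /* EN DASH                     -> hyphen-minus */
    { 0x2018, 0x27 }, /* LEFT SINGLE QUOTATION MARK  -> apostrophe   */
    { 0x2019, 0x27 }, /* RIGHT SINGLE QUOTATION MARK -> apostrophe   */
    { 0x201C, 0x22 }, /* LEFT DOUBLE QUOTATION MARK  -> quotation mark */
    { 0x201D, 0x22 }, /* RIGHT DOUBLE QUOTATION MARK -> quotation mark */
};

enum conv_result { CONV_EXACT, CONV_APPROX, CONV_UNMAPPED };

/*
 * Map one UCS2 code unit to an ISO8859-1 byte.
 * CONV_EXACT:    code was 0x0000-0x00FF and is copied unchanged.
 * CONV_APPROX:   code was found in approx_table.
 * CONV_UNMAPPED: no mapping known; *out is set to 0x3F ('?' in ASCII).
 */
static enum conv_result ucs2_code_to_latin1(uint16_t code, unsigned char *out)
{
    if (code <= 0x00FF) {
        *out = (unsigned char)code;
        return CONV_EXACT;
    }
    for (size_t i = 0; i < sizeof approx_table / sizeof approx_table[0]; i++) {
        if (approx_table[i].code == code) {
            *out = approx_table[i].latin1;
            return CONV_APPROX;
        }
    }
    *out = 0x3F;
    return CONV_UNMAPPED;
}
```

A strict caller treats anything other than CONV_EXACT as a failure; a lenient one accepts CONV_APPROX and perhaps the placeholder as well. A linear scan is adequate for a handful of entries; a much larger table would be better kept sorted and searched with a binary search.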

