Newsportal USENET - Re: Simple string conversion from UCS2 to ISO8859-1

Subject: Re: Simple string conversion from UCS2 to ISO8859-1
From: david.brown (at) *nospam* hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Date: 25 Feb 2025, 17:16:23

Other headers

Organization: A noiseless patient Spider
Message-ID: <vpkqcn$22c6h$1@dont-email.me>
References: 1 2 3 4 5 6 7 8
User-Agent: Mozilla Thunderbird

On 25/02/2025 15:53, pozz wrote:
> On 25/02/2025 13:18, Richard Damon wrote:
>> On 2/25/25 2:35 AM, pozz wrote:
>>> On 24/02/2025 21:13, Lawrence D'Oliveiro wrote:
>>>> On Mon, 24 Feb 2025 16:57:24 +0100, pozz wrote:
>>>>> On 22/02/2025 14:18, David Brown wrote:
>>>>>> My understanding here is that the OP is getting the UCS-2 encoded
>>>>>> string in from a modem, almost certainly on a serial line. The UCS-2
>>>>>> encoded data is itself a binary sequence of 16-bit code units, and the
>>>>>> modem firmware is sending each of those as four hex digits.
>>>>>
>>>>> Exactly. This is the reply to the AT+CMGR command that is standardized in
>>>>> 3GPP TS 27.005.
>>>>
>>>> Anything that is specifying the use of UCS-2 encoding automatically dates
>>>> itself to about the early-to-mid 1990s.
>>>
>>> Sincerely, I don't know why or when, but the LTE modem I'm using (Simcom A7672E) replies to AT+CMGR in two different formats:
>>>
>>> - what is described as GSM 7-bit alphabet (but it's really UTF-8 when non-ASCII chars are present)
>>> - UCS2
>>>
>>> Of course, in the header, it specifies the <dcs> (data coding scheme) so the receiver on the UART can interpret all the data correctly.
>>
>> Are you sure it is UCS2 and not UTF-16?
>>
>> Can it not handle characters not in the BMP?
>>
>> The difference between UCS2 and UTF-16 is that UCS2 is the encoding that predates the surrogate pairs added to extend it. The relationship is much like that of ASCII to UTF-8.
>
> Sincerely I don't know, the standard says UCS2

The standard used by modems here is UCS2, not UTF-16. As you point out, this was all standardised in the early 1990s (before UTF-16), as a standardisation of things that had already been in use before that. And once a telecom standard is made, it is set in stone and never changed. Unlike some things that adopted Unicode early using UCS2 (like Windows NT, Java, Qt, Python), the UCS2 use in established modem standard commands (like AT+CMGR) could not be, and was not, extended to UTF-16. There might be other AT commands supported by some modems that /do/ support UTF-8 or UTF-16, but existing standardised commands don't change.
For all Unicode code points supported by UCS2, the coding is the same as for UTF-16 (as Richard says, it's like the ASCII subset of UTF-8). So you can always treat UCS2 as UTF-16. Unicode characters outside this set simply have no representation in UCS2.
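
To make that concrete for the conversion this thread started with, here is a minimal sketch in C. It rests on assumptions that are not confirmed by the modem documentation quoted here: that the payload is a plain string of hex digits, four per UCS2 code unit, most significant digit first (so "00E8" means U+00E8), and that anything without an ISO 8859-1 equivalent may be replaced by '?'. The function name ucs2hex_to_latin1 is made up for illustration. It relies on the fact that the first 256 Unicode code points coincide with ISO 8859-1, so code units up to 0xFF can be copied directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Hypothetical helper: convert a string of hex digits (assumed to be
 * four digits per UCS2 code unit, most significant digit first) into
 * a NUL-terminated ISO 8859-1 string.
 *
 * Code units above 0xFF have no ISO 8859-1 equivalent and are replaced
 * by '?'. That includes 0xD800..0xDFFF, which would be surrogates if
 * the data were really UTF-16.
 *
 * Returns the number of characters written (not counting the NUL), or
 * -1 if the input is malformed or the output buffer is too small.
 */
long ucs2hex_to_latin1(const char *hex, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;

    while (*hex != '\0') {
        uint16_t unit = 0;

        /* Assemble one 16-bit code unit from four hex digits.
         * A premature NUL is not a hex digit, so it is rejected here
         * before any byte past the terminator is read. */
        for (int i = 0; i < 4; i++) {
            int v = hex_value(hex[i]);
            if (v < 0)
                return -1;
            unit = (uint16_t)((unit << 4) | (unsigned)v);
        }
        hex += 4;

        /* Keep one byte free for the terminating NUL. */
        if (n + 1 >= out_size)
            return -1;

        out[n++] = (unit <= 0xFF) ? (char)(unsigned char)unit : '?';
    }

    out[n] = '\0';
    return (long)n;
}
```

With those assumptions, calling ucs2hex_to_latin1("004300690061006F", buf, sizeof buf) fills buf with "Ciao" and returns 4, while a code unit such as "20AC" (the euro sign, which ISO 8859-1 lacks) comes out as '?'. If you would rather reject such input than substitute, return -1 at that point instead.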
