> On 25/02/2025 13:18, Richard Damon wrote:
>> On 2/25/25 2:35 AM, pozz wrote:
>>> On 24/02/2025 21:13, Lawrence D'Oliveiro wrote:
>>>> On Mon, 24 Feb 2025 16:57:24 +0100, pozz wrote:
>>>>> On 22/02/2025 14:18, David Brown wrote:
>>>>>> My understanding here is that the OP is getting the UCS-2 encoded
>>>>>> string in from a modem, almost certainly on a serial line. The UCS-2
>>>>>> encoded data is itself a binary sequence of 16-bit code units, and the
>>>>>> modem firmware is sending each of those as four hex digits.
>>>>>
>>>>> Exactly. This is the reply to the AT+CMGR command that is
>>>>> standardized in 3GPP TS 27.005.
>>>>
>>>> Anything that is specifying the use of UCS-2 encoding automatically dates
>>>> itself to about the early-to-mid 1990s.
>>>
>>> Sincerely, I don't know why or when, but the LTE modem I'm using
>>> (Simcom A7672E) replies to AT+CMGR in two different formats:
>>>
>>> - what is described as the GSM 7-bit alphabet (but it's really UTF-8
>>>   when non-ASCII chars are present)
>>>
>>> - UCS2
>>>
>>> Of course, in the header, it specifies the <dcs> (data coding scheme)
>>> so the receiver on the UART can interpret all the data correctly.
>>
>> Are you sure it is UCS2 and not UTF-16?
>>
>> Can it not handle characters not in the BMP?
>>
>> The difference between UCS2 and UTF-16 is that UCS2 is the character
>> set that predates the surrogate pairs added to extend it. It is very
>> much equivalent to the relationship of ASCII to UTF-8.
>
> Sincerely, I don't know; the standard says UCS2.

The standard used by modems here is UCS2, not UTF-16. As you point out,
this was all standardised in the early 1990s (before UTF-16), as a
standardisation of things that had already been in use before that. And
once a telecom standard is made, it is set in stone and never changed.
Unlike some things that adopted Unicode early using UCS2 (like
Windows NT, Java, Qt, Python), the UCS2 use in established modem standard
commands (like AT+CMGR) could not be, and was not, extended to UTF-16.
There might be other AT commands supported by some modems that /do/
support UTF-8 or UTF-16, but existing standardised commands don't change.
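
For anyone who wants to see the UCS2 versus UTF-16 distinction concretely, here is a minimal Python sketch. It assumes the payload arrives as big-endian 16-bit code units, each written as four hex digits, which is how the encoding is described earlier in the thread. The function names `decode_ucs2_hex` and `decode_utf16_hex` are hypothetical, not part of any modem library, and whether a given modem ever emits surrogate pairs in a field labelled UCS2 is not something this thread establishes.

```python
def decode_ucs2_hex(hex_text: str) -> str:
    """Decode hex-encoded 16-bit code units as strict UCS-2.

    Assumption: big-endian code units, four hex digits each.
    """
    raw = bytes.fromhex(hex_text.strip())  # ValueError on non-hex input
    if len(raw) % 2 != 0:
        raise ValueError("expected a whole number of 16-bit code units")
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    # Strict UCS-2 covers only the BMP and has no surrogate pairs,
    # so any unit in the surrogate range is rejected here.
    for unit in units:
        if 0xD800 <= unit <= 0xDFFF:
            raise ValueError(f"surrogate code unit 0x{unit:04X} is not valid UCS-2")
    return "".join(chr(unit) for unit in units)


def decode_utf16_hex(hex_text: str) -> str:
    """Decode the same kind of payload as UTF-16 (surrogate pairs allowed)."""
    return bytes.fromhex(hex_text.strip()).decode("utf-16-be")


if __name__ == "__main__":
    print(decode_ucs2_hex("00480069"))   # BMP-only text decodes either way
    print(decode_utf16_hex("D83DDE00"))  # a surrogate pair needs UTF-16
```

For text that stays inside the BMP the two decoders agree; they only differ once a code unit falls in the range 0xD800 to 0xDFFF, which strict UCS-2 has no meaning for.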