Subject : Re: OT: unicode (Was: Re: Upcoming gfortran 15 will contain unsigned numbers)
From : ldo (at) *nospam* nz.invalid (Lawrence D'Oliveiro)
Newsgroups : comp.lang.fortran
Date : 26. Nov 2024, 00:35:34
Organization : A noiseless patient Spider
Message-ID : <vi31k5$32sr1$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9
User-Agent : Pan/0.161 (Chasiv Yar; )
On Mon, 25 Nov 2024 08:35:48 -0300, Wolfgang Agnes wrote:
> It's a bit difficult to understand ``surrogates''.

The Unicode folks just decided that the ranges 0xD800-0xDBFF (1024 codes
of “high surrogates”) and 0xDC00-0xDFFF (1024 codes of “low surrogates”)
would be used in pairs to represent codes above 0xFFFF in UTF-16 encoding.
This gives an additional 1024×1024 = 1048576 different codes, exactly
enough to cover everything from 0x10000 to the top of the (current) Unicode
range, which officially goes up to 0x10FFFF. At least, that’s what they’re
saying right now.
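
The pairing arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the standard UTF-16 mapping; the function names `to_surrogates` and `from_surrogates` are made up for this example and do not come from any library.

```python
# Sketch of UTF-16 surrogate-pair arithmetic.
# The function names are hypothetical, chosen for this illustration only.

HIGH_START = 0xD800  # first of the 1024 high surrogates (0xD800-0xDBFF)
LOW_START = 0xDC00   # first of the 1024 low surrogates (0xDC00-0xDFFF)


def to_surrogates(cp):
    """Split a code point above 0xFFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside 0x10000..0x10FFFF")
    offset = cp - 0x10000               # 20-bit value, 0 .. 0xFFFFF
    high = HIGH_START + (offset >> 10)  # top 10 bits select the high surrogate
    low = LOW_START + (offset & 0x3FF)  # bottom 10 bits select the low surrogate
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)
```

For instance, `to_surrogates(0x1F600)` yields the pair `(0xD83D, 0xDE00)`, and the two extremes 0x10000 and 0x10FFFF map to `(0xD800, 0xDC00)` and `(0xDBFF, 0xDFFF)`, which is why 1024×1024 pairs cover the range with nothing left over.
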
In the full UCS-4 encoding, those ranges are considered invalid.
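
As a rough check of that last point, the sketch below tests whether a value falls in the surrogate ranges and asks Python's built-in UTF-32 codec whether it will encode it. The helper names are hypothetical; the codec behaviour shown is that of CPython 3, which refuses lone surrogates.

```python
# Sketch: surrogate values are not valid code points in a 32-bit encoding.
# Helper names are hypothetical, chosen for this illustration only.

def is_surrogate(cp):
    """True if cp lies in the high or low surrogate range."""
    return 0xD800 <= cp <= 0xDFFF


def is_valid_scalar(cp):
    """True if cp is in the Unicode range and is not a surrogate."""
    return 0 <= cp <= 0x10FFFF and not is_surrogate(cp)


def encodes_as_utf32(cp):
    """Ask Python's utf-32-be codec whether it accepts this code point."""
    if not 0 <= cp <= 0x10FFFF:
        return False  # chr() itself would reject values outside the range
    try:
        chr(cp).encode("utf-32-be")
    except UnicodeEncodeError:
        return False  # raised for lone surrogates
    return True
```

With these definitions, `encodes_as_utf32(0x1F600)` is true, while `encodes_as_utf32(0xD800)` and `encodes_as_utf32(0xDC00)` are false.
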