On 14/08/2024 12:34, Bart wrote:
> So the 50071 is the 2-byte UTF8 sequence.
> In that case I don't understand what you are testing for here. Is it an error for '×' to be 215, or an error for it not to be?
GCC handles this as multibyte, without decoding.
The result with GCC is 50071:
static_assert('×' == 50071);
The explanation is that GCC is doing:
256*195 + 151 = 50071
(Remember, the UTF-8 bytes were 195 151.)
> I don't understand. 'a' and 'b' each occupy one byte. Together they need two bytes.
GCC handles 'ab' the same way it handles '×'.
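To make the arithmetic concrete, here is a minimal sketch of that packing in C. It is not GCC's source: pack_bytes is a hypothetical helper name, and the base-256 rule is simply the one described above. The value of a multi-character constant is implementation-defined, so the sketch works on explicit byte values rather than on the literals themselves.

```c
#include <stddef.h>

/* Hypothetical helper, not GCC code: fold the bytes of a character
   constant into one number, most significant byte first (base 256). */
unsigned long pack_bytes(const unsigned char *bytes, size_t n)
{
    unsigned long value = 0;
    size_t i;

    for (i = 0; i < n; i++)
        value = value * 256 + bytes[i];
    return value;
}
```

With the bytes 195 151 this gives 256*195 + 151 = 50071, and with the bytes of 'a' 'b' (97 98, assuming ASCII) it gives 256*97 + 98 = 24930.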
> Who or what does that, and for what purpose? From what I've seen, only you have introduced it.
> And what is the test for, to ensure encoding is UTF8 in this ... source file? ... compiler?
MSVC has some checks; I don't know what the logic is.
> Where would the 'decoded 215' come into it?
215 is the value after decoding the UTF-8 bytes to produce the Unicode code point.
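For comparison, this is roughly what "decoding" means here. It is a minimal sketch, assuming a well-formed sequence of one to four bytes. decode_utf8 is a hypothetical helper name, and the sketch skips the overlong and surrogate checks a real decoder needs.

```c
#include <stddef.h>

/* Hypothetical helper: decode one UTF-8 sequence of exactly n bytes.
   Returns the code point, or -1 if the sequence is malformed.
   Sketch only: overlong forms and surrogates are not rejected. */
long decode_utf8(const unsigned char *s, size_t n)
{
    long cp;
    size_t need, i;

    if (n == 0)
        return -1;

    /* The lead byte gives the sequence length and the top bits. */
    if (s[0] < 0x80)                { cp = s[0];        need = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; need = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; need = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; need = 4; }
    else
        return -1;

    if (n != need)
        return -1;

    /* Each continuation byte (10xxxxxx) contributes six more bits. */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    return cp;
}
```

For the bytes 0xC3 0x97 (195 151) this returns 215, which is U+00D7, the code point of '×'.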
So my suggestion is to decode first.
> Why? What are you comparing? Both sides of == must use UTF8 or Unicode, but why introduce Unicode at all if apparently everything in source code and at compile time, as you yourself have stated, is UTF8?
The bad part of my suggestion is that we may have two different ways of producing the same value.
> I don't think so. If I run this program:
For instance, the number generated by 'ab' is the same as that of a decoded character:
'ab' == '𤤰'
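This kind of collision can be checked numerically. The sketch below assumes the base-256 packing rule for 'ab' and a decode-first rule for a single multibyte character. The names pack_bytes, decode3 and collides are hypothetical, and the byte sequence E6 85 A2 is just one illustrative UTF-8 sequence that decodes to 24930 decimal (0x6162 in hex); it is not taken from any compiler.

```c
#include <stddef.h>

/* Packing rule: base-256 accumulation, most significant byte first. */
unsigned long pack_bytes(const unsigned char *s, size_t n)
{
    unsigned long value = 0;
    size_t i;

    for (i = 0; i < n; i++)
        value = value * 256 + s[i];
    return value;
}

/* Decode a 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx).
   Illustrative only: the input is assumed to be well formed. */
unsigned long decode3(const unsigned char *s)
{
    return ((unsigned long)(s[0] & 0x0F) << 12)
         | ((unsigned long)(s[1] & 0x3F) << 6)
         |  (unsigned long)(s[2] & 0x3F);
}

/* Returns 1 when packed 'ab' equals the decoded 3-byte character. */
int collides(void)
{
    static const unsigned char ab[]  = { 'a', 'b' };          /* 97 98 in ASCII */
    static const unsigned char seq[] = { 0xE6, 0x85, 0xA2 };  /* decodes to 0x6162 */

    return pack_bytes(ab, 2) == decode3(seq);
}
```

Both sides come out as 24930 (assuming ASCII for 'a' and 'b'), so a two-character constant and a single decoded character can compare equal if the compiler packs one and decodes the other.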