Re: multi bytes character - how to make it defined behavior?

Subject: Re: multi bytes character - how to make it defined behavior?
From: ben (at) *nospam* bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.c
Date: 14 Aug 2024, 02:32:14
Organization: A noiseless patient Spider
Message-ID: <874j7ot04x.fsf@bsb.me.uk>
References: 1
User-Agent: Gnus/5.13 (Gnus v5.13)

Thiago Adams <thiago.adams@gmail.com> writes:

> static_assert('×' == 50071);

static_assert(U'×' == 215);

works, but then I don't know what you were trying to do.
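
As a minimal sketch of why the U-prefixed form is the portable one: a U'...' constant has type char32_t and its value is the code point itself, independent of how the source file is encoded. The function name below is hypothetical, and the universal character name \u00D7 is used instead of a literal × so the example does not depend on the source encoding.

```c
/* U'\u00D7' is a char32_t constant whose value is the code point,
   so this holds at compile time on any C11 (or later) compiler. */
_Static_assert(U'\u00D7' == 215, "U+00D7 is code point 215");

/* Hypothetical helper: returns the code point of the multiplication sign. */
unsigned long multiplication_sign_code_point(void)
{
    return (unsigned long)U'\u00D7';
}
```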

> GCC -  warning multi byte
> CLANG - error character too large
>
> I think instead of "multi bytes" we need "multi characters" - not
> bytes.
>
> We decode utf8 then we have the character to decide if it is multi char or
> not.

These terms can be confusing and I don't know exactly how you are using
them.  Basically I simply don't know what that second sentence is
saying.

> decoding '×' would consume bytes 195 and 151 the result is the decoded
> Unicode value of 215.

Yes, Unicode 215 is UTF-8 encoded as two bytes with values 195 and 151.
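
For reference, the two bytes can be derived by hand from the UTF-8 bit layout for code points in the range U+0080 to U+07FF (110xxxxx 10xxxxxx). The following is only a sketch; utf8_encode_2byte is a hypothetical helper, not a library function, and it deliberately handles just the two-byte range.

```c
#include <stddef.h>

/* Hypothetical helper: encode a code point in U+0080..U+07FF as two
   UTF-8 bytes.  Returns 2 on success, 0 if cp is outside that range. */
size_t utf8_encode_2byte(unsigned long cp, unsigned char out[2])
{
    if (cp < 0x80 || cp > 0x7FF)
        return 0;
    out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* lead byte: 110xxxxx */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* continuation: 10xxxxxx */
    return 2;
}
```

For cp = 215 (0xD7) this gives 0xC3 and 0x97, that is, 195 and 151.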

> It is not multi byte : 256*195 + 151 = 50071

If that × is UTF-8 encoded then it might look, to the compiler, just
like an old-fashioned multi-character character constant, as 'ab'
does.  Then again, it might not.  gcc and clang take different views on
the matter.

You can get clang to take the same view as gcc by writing

  static_assert('\xC3\x97' == 50071);

instead.  Now both gcc and clang see it for what it is: an old-fashioned
multi-character character constant.
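
The value of such a constant is implementation-defined, but the number in the assertion is consistent with plain base-256 packing of the bytes, most significant first.  A sketch of that arithmetic, assuming that packing rule (pack_bytes is a hypothetical helper, and it avoids writing a multi-character constant so it compiles without the warning):

```c
/* Hypothetical helper: fold bytes into one value, most significant
   byte first, i.e. value = value * 256 + next byte. */
unsigned long pack_bytes(const unsigned char *bytes, unsigned long n)
{
    unsigned long value = 0;
    for (unsigned long i = 0; i < n; i++)
        value = value * 256 + bytes[i];
    return value;
}
```

With the bytes 0xC3, 0x97 this yields 256*195 + 151 = 50071, and with 'a', 'b' (97, 98 in ASCII) it yields 24930.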

> On the other hand 'ab' is "multi character" resulting

The term for these things used to be "multi-byte character constant" and
they were highly non-portable.  The trouble is that the term "multi-byte
character" now refers to highly portable encodings like UTF-8.  Maybe
that's why gcc seems to have changed its warning from what you gave to:

  warning: multi-character character constant [-Wmultichar]

> 256 * 'a' + 'b' = 256*97+98= 24930
>
> One consequence is that
>
> 'ab' == '𤤰'
>
> But I don't think this is a problem. At least everything is defined.
>

--
Ben.
