Re: C23 thoughts and opinions

Subject: Re: C23 thoughts and opinions
From: bc (at) *nospam* freeuk.com (bart)
Newsgroups: comp.lang.c
Date: 15 Jun 2024, 20:27:41
Organization: A noiseless patient Spider
Message-ID: <v4kpvc$3jrmr$1@dont-email.me>
User-Agent: Mozilla Thunderbird
On 15/06/2024 18:17, David Brown wrote:
> On 15/06/2024 00:39, bart wrote:
>> On 14/06/2024 22:30, Keith Thompson wrote:
>>>
>>> Now that it's too late to change the definition, I've thought of
>>> something that I think would have been a better way to specify #embed.
>>>
>>> Define a new kind of string literal, with a "uc" prefix.  `uc"foo"` is
>>> of type `unsigned char[3]`.  (Or `const unsigned char[3]`, if that's not
>>> too radical.)  Unlike other string literals, there is no implicit
>>> terminating '\0'.  Arbitrary byte values can of course be specified in
>>> hexadecimal: uc"\x01\x02\x03\x04".  Since there's no terminating null
>>> character and C doesn't support zero-sized objects, uc"" is a syntax
>>> error.
>>>
>>> uc"..." string literals might be made even simpler, for example allowing
>>> only hex digits and not requiring \x (uc"01020304" rather than
>>> uc"\x01\x02\x03\x04").  That's probably overkill.  uc"..." literals
>>> could be useful in other contexts, and programmers will want
>>> flexibility.  Maybe something like hex"01020304" (embedded spaces could
>>> be ignored) could be defined in addition to uc"\x01\x02\x03\x04".
>>
>> That's something I added to string literals in my language within the last few months. Nothing to do with embedding (but it can make hex sequences within strings more efficient, if that approach was used).
>>
>> Writing byte-at-a-time hex data was always a bit fiddly:
>>
>>      0x12, 0x34, 0xAB, ...
>>      "\x12\x34\xAB...
>>
>> It was made worse by my preference for `x` being in lower case, and the hex digits in upper case, otherwise 0XABC or 0Xab or 0xab look wrong.
>>
>> What I did was create a new, variable-length string escape sequence that looks like this:
>>
>>    "ABC\h1234AB...\nopq"     // hex sequence between ABC & nopq
>>
>> Hex digits after \h or \H are read in pairs. White space is allowed between pairs:
>>
>>    "ABC\H 12 34 AB ...\nopq"
>>
>> The only thing I wasn't sure about was the closing backslash, which looks at first like another escape code. But I think it is sound, although it can still be tweaked.
>
>
> How often would something like that be useful?  I would have thought that it is rare to see something that is basically text but has enough odd non-printing characters (other than the common \n, \t, \e) to make it worth the fuss.  If you want to have binary data in something that looks like a string literal, then just use straight-up two hex digits per character - "4142431234ab".  It's simpler to generate and parse.  I don't see the benefit of something that mixes binary and text data.

That's not the same thing. That sequence "...1234..." occupies 4 bytes (with values 49 50 51 52), not two bytes (with values 0x12 and 0x34, or 18 and 52).
Here's an example of wanting to print '€4.99', first in C (note that my editor doesn't support Unicode so this stuff is needed):
    puts("\xE2\x82\xAC" "4.99");
The euro symbol occupies three bytes in UTF-8. It's awkward to type: it has loads of backslashes, it keeps switching case, and it needs more concentration.
Plus I had to split the string, since apparently \x doesn't stop at two hex digits; it keeps going. It would have read \xAC4, which overflows the 8-bit width of a character anyway, so I don't know what the point is of reading more than 2 hex digits.
Using my feature, it looks like this:
     println "\H E2 82 AC\4.99"
There must be loads of examples of wanting to write many byte values within strings, which in C can also be used to initialise byte arrays (a useful feature I've now adopted; see below).
Here's another example, in my language: the first 128 bytes of an EXE file, which are constant. It is currently defined like this, probably created with a script:
   []byte stubdata = (
     0x4D, 0x5A, 0x90, 0x00, 0x03, 0x00, 0x00, 0x00,
     0x04, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00,
     ...
Using the new escape, I can just copy and paste a dump, and use a text editor to add the string syntax needed around it, which took under a minute:
[]byte stubdata=
   b"\H 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00\"+
   b"\H B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00\"+
   b"\H 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\"+
   b"\H 00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00\"+
   b"\H 0E 1F BA 0E 00 B4 09 CD 21 B8 01 4C CD 21 54 68\"+
   b"\H 69 73 20 70 72 6F 67 72 61 6D 20 63 61 6E 6E 6F\"+
   b"\H 74 20 62 65 20 72 75 6E 20 69 6E 20 44 4F 53 20\"+
   b"\H 6D 6F 64 65 2E 0D 0D 0A 24 00 00 00 00 00 00 00\"+
   b"\H 50 45 00 00 64 86 04 00 00 00 00 00 00 00 00 00\"
(The 's'/'b' prefixes are needed for strings to have a type of (in C terms) char[] rather than char*, a detail that C glosses over via some magic. 's' gives you a zero terminator; 'b', as used here, doesn't. The "+" is used for compile-time string/data-string concatenation.)
In short, more is possible without needing to resort to tools. You can work directly from a hex dump.
