Subject: Re: C23 thoughts and opinions
From: Keith.S.Thompson+u (at) *nospam* gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Date: 28 May 2024, 21:44:47
Organization: None to speak of
Message-ID: <87zfs9y8cg.fsf@nosuchdomain.example.com>
References: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
scott@slp53.sl.home (Scott Lurndal) writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
scott@slp53.sl.home (Scott Lurndal) writes:
[...]
E.g. if the embedded file contained an array of some structure,
the binary format of the embedded file must match the binary format
that would be expected by the compiler (field sizes, alignment etc)
for an array of said structure.
>
The spec does say that the data in memory must match the data in the
file.
>
Where does it say that?
>
See <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf>
6.10.4. (N3220 is a C26 draft, but it's very close to C23.)
>
The spec says that #embed expands to a comma-delimited sequence of
integer constant expressions (and like anything, optimizations that
don't violate the specified behavior are allowed). If the
implementation-defined *embed element width* is CHAR_BIT (which is not
guaranteed), then you can expect the same data layout *if* you use it to
initialize an array of characters, preferably unsigned char.
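A minimal sketch of the unsigned char case, assuming a C23 compiler that can locate the resource. The file embeds its own source via __FILE__ so that no extra data file is needed; HAVE_SELF_EMBED, self_bytes and self_size are names made up for this illustration, and the fallback bytes are an arbitrary stand-in for compilers without #embed.

```c
#include <stddef.h>

/* Use #embed only when compiling as C23 and the resource can be found.
 * HAVE_SELF_EMBED is a hypothetical helper macro, not part of C23. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#  ifdef __has_embed
#    if __has_embed(__FILE__) == __STDC_EMBED_FOUND__
#      define HAVE_SELF_EMBED 1
#    endif
#  endif
#endif

/* With #embed, the initializer list is generated from the bytes of this
 * source file; the array length follows from the number of elements. */
static const unsigned char self_bytes[] = {
#ifdef HAVE_SELF_EMBED
#  embed __FILE__
#else
    /* Fallback for compilers without #embed: hand-written stand-in data. */
    0x23, 0x69, 0x6e, 0x63
#endif
};

/* Number of bytes in the array (equal to the file size when the embed
 * element width is CHAR_BIT, which is implementation-defined). */
size_t self_size(void)
{
    return sizeof self_bytes;
}
```

The element type matters here: the layout guarantee discussed above is only expected for character arrays, so the sketch deliberately uses unsigned char.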
>
>
"Implementations should take into account translation-time bit and
byte orders as well as execution-time bit and byte orders to more
appropriately represent the resource's binary data from the directive.
This maximizes the chance that, if the resource referenced at translation
time through the #embed directive is the same one accessed through
execution-time means, the data that is e.g. fread or similar into contiguous
storage will compare bit-for-bit equal to an array of character type initialized
from an #embed directive's expanded contents."
>
p. 172 n3220.pdf.
6.10.4.1p15. (Section references are more stable across drafts than
page numbers.)
Yes, but that applies only when #embed is used to initialize an
array of unsigned char (or of plain char if plain char is unsigned).
It's under "Recommended practice", so it doesn't override the
specified semantics.
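The recommended practice can be checked with something like the following sketch, which compares an unsigned char array initialized from #embed against the same file read with fread. It assumes a C23 compiler and again embeds its own source via __FILE__; HAVE_SELF_EMBED, embedded and embedded_matches_file are hypothetical names, and the function reports -1 whenever the comparison cannot be made (no #embed support, or the file cannot be opened at run time).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* HAVE_SELF_EMBED is a hypothetical helper macro, not part of C23. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#  ifdef __has_embed
#    if __has_embed(__FILE__) == __STDC_EMBED_FOUND__
#      define HAVE_SELF_EMBED 1
#    endif
#  endif
#endif

#ifdef HAVE_SELF_EMBED
static const unsigned char embedded[] = {
#  embed __FILE__
};
#endif

/* Returns 1 if the file at `path` is byte-for-byte equal to the embedded
 * data, 0 if it differs, and -1 if the comparison could not be made. */
int embedded_matches_file(const char *path)
{
#ifdef HAVE_SELF_EMBED
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Ask for one extra byte so that a longer file is detected. */
    unsigned char *buf = malloc(sizeof embedded + 1);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }
    size_t n = fread(buf, 1, sizeof embedded + 1, fp);
    int read_failed = ferror(fp);
    fclose(fp);

    int result;
    if (read_failed)
        result = -1;
    else
        result = (n == sizeof embedded && memcmp(buf, embedded, n) == 0);
    free(buf);
    return result;
#else
    (void)path;   /* no #embed available: nothing to compare against */
    return -1;
#endif
}
```

Calling embedded_matches_file(__FILE__) from the same translation unit is the intended use; a result of 1 is what the recommended practice aims for, but since it is not normative, an implementation with an unusual embed element width need not produce it.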
If I use #embed to initialize a struct with varying member types
and sizes, the compiler isn't going to treat the embedded file
as an object of that struct type. It's still going to generate a
sequence representing the bytes of the embedded file (or equivalent).
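To illustrate, here is a sketch that does not need #embed at all. FAKE_EMBED is a hypothetical macro standing in for what #embed would expand to for a 4-byte file; struct pair and the helper functions are likewise made up for the example. Each integer in the list initializes one member, so the result is quite different from reinterpreting the file's bytes as the struct.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the expansion of #embed on a 4-byte file
 * containing the bytes 01 02 03 04. */
#define FAKE_EMBED 0x01, 0x02, 0x03, 0x04

struct pair {
    uint16_t a;
    uint16_t b;
};

/* Each integer initializes one member, so four values yield two structs:
 * { 1, 2 } and { 3, 4 }. The file's byte layout plays no role. */
static const struct pair from_list[] = { FAKE_EMBED };

size_t pairs_from_list(void)
{
    return sizeof from_list / sizeof from_list[0];
}

struct pair pair_from_list(size_t i)
{
    return from_list[i];
}

/* By contrast, copying the same four bytes into one struct pair gives
 * members that depend on byte order (0x0201 or 0x0102 for member a). */
struct pair pair_from_bytes(void)
{
    static const unsigned char raw[] = { FAKE_EMBED };
    struct pair p = { 0, 0 };
    memcpy(&p, raw, sizeof raw < sizeof p ? sizeof raw : sizeof p);
    return p;
}
```

So a 4-byte file "becomes" 8 bytes of struct data when used as an initializer list, which is why matching the struct's binary format in the embedded file does not help.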
It feels like that paragraph *assumes* that #embed will be used
only to initialize arrays of unsigned char (or plain char if plain
char is unsigned) -- which is probably a reasonable assumption
in practice.
Using #embed for anything other than an array of unsigned char is
going to be unusual -- and in fact I'm thinking that it might have
been better to require #embed to work *only* when initializing such
an array, with undefined behavior otherwise. It would have avoided
the complexity of using unspecified heuristics to decide whether
to expand #embed to that sequence of integer constant expressions
or to something equivalent but more efficient.
-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */