Re: Representation of _Bool

Subject: Re: Representation of _Bool
From: Keith.S.Thompson+u (at) *nospam* gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Date: 17 Jan 2025, 22:34:53
Organization: None to speak of
Message-ID: <87ed116s5e.fsf@nosuchdomain.example.com>
References: 1 2
User-Agent: Gnus/5.13 (Gnus v5.13)

learningcpp1@gmail.com (m137) writes:
> Hi Keith,
>
> Thank you for posting this.

The message being referred to is one I posted Sun 2021-05-23, with
Message-ID <87tums515a.fsf@nosuchdomain.example.com>.  It's visible on
Google Groups at
<https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.

As others have suggested, please include attribution information when
posting a followup.  You don't need to quote the entire message,
but provide at least some context, particularly when the parent
message is old.

This is an update to that message.

> I noticed that the newer drafts of C23
> (N2912 onwards, I think) have replaced the term "trap representation"
> with "non-value representation":
> - **Trap representation** was last defined in [N2731
> 3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
> as "an object representation that need not represent a value of the
> object type."
> - **Non-value representation** is most recently defined in [N3435
> 3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
> as "an object representation that does not represent a value of the
> object type."
>
> The definition of non-value representation rules out object
> representations that represent a value of the object type from being
> non-value representations. So it seems to be stricter than the
> definition of trap representation, which does not seem to rule out such
> object representations from being trap representations. Is this
> interpretation correct?

I don't believe so.  As far as I can tell, a "non-value
representation" (C23 and later) is exactly the same thing as a "trap
representation" (C17 and earlier).  The older term was probably
considered unclear, since it could imply that a trap is required.
In fact, reading an object with a trap/non-value representation
has undefined behavior, which can include yielding the value you
might have expected.

> If so, what happens to the 254 trap representations that GCC and Clang
> reserve for `_Bool`?

I see no evidence in gcc's documentation that gcc treats
representations other than 0 or 1 as trap/non-value representations.
I see only two references to "trap representation", one for signed
integer types (saying that there are no trap representations)
and one regarding type-punning via unions.  There are no relevant
references to "padding bits".

I'm less familiar with clang's documentation, but I see no reference
to "trap representation" or "non-value representation".

We can get some information about this by running a test program.
See below.

> Assuming a width of 1, each of those 254 object
> representations represents a value in `_Bool`'s domain (the half whose
> value bit is 1 represents the value `true`, while the other half whose
> value bit is 0 represents the value `false`), so they cannot be thought
> of as non-value representations (since a non-value representation must
> be an object representation that **does not** represent a value of the
> object type).

Reading an object with a non-value representation has undefined
behavior.  If the observed value happens to be a valid value of the
object's type, that's still consistent with undefined behavior.
*Everything* is consistent with undefined behavior.

> I've been stuck on this for quite some time, so would be grateful for
> any guidance you could provide.

Editions of the C standard earlier than C23 were not entirely
clear about the representation of _Bool.  (C90 does not have _Bool
or bool.  C99 through C17 have _Bool as a keyword, with bool as
a macro defined in <stdbool.h>.  C23 has bool as a keyword, with
_Bool as an alternate spelling.)

In C99 and later, _Bool/bool is required to be an unsigned integer
type large enough to hold the values 0 and 1.  Its size must be at
least CHAR_BIT bits (which is at least 8).  The *rank* of _Bool is
less than the rank of all other standard integer types.

The rank implies that the range of values is a subset of the
range of values of any other unsigned integer type.  The rank does
*not* imply anything about relative sizes.  unsigned char has a
higher rank than bool, but bool could have additional padding bits
making sizeof(bool)>1.  (Probably no implementation does this.)
unsigned char has no padding bits.

C11 implies that _Bool can have more than one value bit, which
means it could represent values greater than 1 (but no greater
than UCHAR_MAX).

C23 (I'm using the N3096 draft) tightens the requirements, saying
that bool has exactly one value bit and (sizeof(bool)*CHAR_BIT)-1
padding bits -- again implying that sizeof(bool) might be greater
than 1, but forbidding values greater than 1.
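
The padding-bit arithmetic can be sketched in code.  This is only an
illustration, not part of the test program below.  BOOL_WIDTH is the
C23 <limits.h> macro; the names SKETCH_BOOL_WIDTH and
bool_padding_bits are hypothetical, and for pre-C23 modes the sketch
simply assumes a width of 1, which those editions do not guarantee.

```c
#include <assert.h>   /* static_assert (a macro in C11/C17, a keyword in C23) */
#include <limits.h>   /* CHAR_BIT, and BOOL_WIDTH in C23 */
#include <stdbool.h>
#include <stddef.h>

/* Number of value bits in bool.  C23 provides BOOL_WIDTH; before C23
   this sketch ASSUMES 1 value bit, which is typical but not required. */
#ifdef BOOL_WIDTH
#define SKETCH_BOOL_WIDTH BOOL_WIDTH
#else
#define SKETCH_BOOL_WIDTH 1
#endif

static_assert(SKETCH_BOOL_WIDTH >= 1, "bool needs at least one value bit");

/* Hypothetical helper: padding bits = total bits - value bits. */
static size_t bool_padding_bits(void) {
    return sizeof (bool) * CHAR_BIT - SKETCH_BOOL_WIDTH;
}
```

On an implementation where sizeof (bool) is 1 and CHAR_BIT is 8,
bool_padding_bits() yields 7.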

Typically in C17 and earlier, and always in C23, _Bool/bool will
have exactly 1 value bit and CHAR_BIT-1 padding bits.  Padding bits
do not contribute to the value of an object (so 0 and 1 are the
only possible values), but non-zero padding bits *may or may not*
create trap/non-value representations.  (A gratuitously exotic
implementation might use a representation other than 00000001 for
true, but 00000000 is guaranteed to be a representation for 0/false.)

As far as I can tell, the standard is silent on whether a bool object
with non-zero padding bits is a trap/non-value representation or not.

I wrote a test program to explore how bool is treated.  It uses
memcpy to set the representation of a bool object and then prints
the value of that object.  Source is at the bottom of this message.

If bool has no non-value representations, then the values of the
CHAR_BIT-1 padding bits must be ignored when reading a bool object,
and the value of such an object is determined only by its single
value bit, 0 or 1.  If it does have non-value representations,
then reading such an object has undefined behavior.
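
One way to inspect a bool object without risking that undefined
behavior is to look only at its bytes.  The following is a minimal
sketch; the helper name bool_repr_is_canonical is hypothetical.
It compares against the representations this implementation produces
for `false` and `true`, so an implementation with several valid
representations of the same value could be reported as non-canonical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the object representation of *p
   matches the one this implementation stores for `false` or `true`,
   and 0 otherwise.  The bytes are examined with memcmp only, so *p is
   never read as a bool. */
static int bool_repr_is_canonical(const bool *p) {
    const bool f = false;
    const bool t = true;
    assert(p != NULL);
    return memcmp(p, &f, sizeof *p) == 0 ||
           memcmp(p, &t, sizeof *p) == 0;
}
```

A caller can use this check to decide whether it is safe to read the
object as a bool at all.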

With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
when used in a condition and all other representations are treated
as true.  Converting the value of a bool object to another integer
type yields the value of its full 8-bit representation.  If a bool
object holds a representation other than 00000000 or 00000001,
it compares equal to both `true` and `false`.

This implies that bool has 1 value bit and 7 padding bits (as
required by C23) and that it has 2 value representations and 254
trap representations.  The observed behavior for the non-value
representations is the result of undefined behavior.  (gcc -std=c23
sets __STDC_VERSION__ to 202000L, not 202311L.  The documentation
acknowledges that support for C23 is experimental and incomplete.)

With clang 19.1.4, with "-std=c23", the behavior is consistent
with bool having no non-value representations.  The 7 padding bits
do not contribute to the value of a bool object.  Any bool object
with 0 as the low-order bit is treated as false in a condition and
yields 0 when converted to another integer type.  Any bool object
with 1 as the low-order bit is treated as true, and yields 1 when
converted to another integer type.  I presume the intent is for bool
to have 256 value representations and no non-value representations
(with the padding bits ignored as required), but it's also consistent
with bool having non-value representations and the observed behavior
being undefined.  It's not possible to determine with a test program
whether the output is the result of undefined behavior or not.
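
Since the two compilers disagree on representations other than
00000000 and 00000001, code that receives a byte from outside (a
file, a network buffer) can sidestep the question by converting the
value rather than copying the representation.  This is a sketch of
that approach, not something the test program does; the helper name
bool_from_byte is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper: converts a byte to bool by value.  The result
   is always a well-formed false or true, because the bool is produced
   by a comparison, not by copying the byte into the bool's storage. */
static bool bool_from_byte(unsigned char byte) {
    return byte != 0;
}
```

With this, a byte of 0x02 or 0xFF becomes an ordinary `true` that
converts to 1 and compares equal to `true` on any conforming
implementation.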

As far as I can tell, the question of whether bool has non-value
representations is unspecified but not implementation-defined,
meaning that an implementation is not required to document its
choice.

#include <stdio.h>
#include <string.h>
#include <limits.h>
#if __STDC_VERSION__ < 202311L
#include <stdbool.h>
#endif
int main(void) {
    printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
#if __STDC_VERSION__ < 202311L
    puts("Older than C23, using <stdbool.h>");
#else
    puts("C23 or later, using bool directly");
#endif
    printf("sizeof (unsigned char) = %zu, sizeof (bool) = %zu\n",
           sizeof (unsigned char), sizeof (bool));

    const bool no = false;
    const bool yes = true;
    unsigned char uc;
    memcpy(&uc, &no, 1);
    printf("false is represented as %d\n", (int)uc);
    memcpy(&uc, &yes, 1);
    printf("true  is represented as %d\n", (int)uc);

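    /* Store every possible byte value into a bool via memcpy, then
       read it back.  Assumes sizeof (bool) == 1.  Reading b has
       undefined behavior if the byte is a non-value representation. */
    for (int i = 0; i <= UCHAR_MAX; i ++) {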
        const unsigned char uc = i;
        bool b;
        memcpy(&b, &uc, 1);
        const unsigned char value = b;
        printf("uc = 0x%02x b = 0x%02x b is %s, b%sfalse, b%strue\n",
               (unsigned)uc,
               (unsigned)value,
               b ? "truthy" : "falsy ",
               b == false ? "==" : "!=",
               b == true  ? "==" : "!=");
    }
}

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
