Subject: Re: May a string span multiple, independent objects?
From: tr.17687 (at) *nospam* z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Date: 08 Aug 2024, 16:35:04
Organization: A noiseless patient Spider
Message-ID: <86zfpngh93.fsf@linuxsc.com>
References: 1 2 3 4
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)

Vincent Lefevre <vincent-news@vinc17.net> writes:

> In article <87zfqy6v54.fsf@bsb.me.uk>,
>   Ben Bacarisse <ben@bsb.me.uk> wrote:
>
>> James Kuyper <jameskuyper@alumni.caltech.edu> writes:
>>
>>> On 7/3/24 10:31, Vincent Lefevre wrote:
>>>
>>>> ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A
>>>> string is a contiguous sequence of characters terminated by and
>>>> including the first null character."
>>>>
>>>> But may a string span multiple, independent objects that happen
>>>> to be contiguous in memory?
>>>
>>> ...
>>>
>>>> For instance, is the following program valid and what does the
>>>> ISO C standard say about that?
>>>>
>>>> #include <stdio.h>
>>>> #include <string.h>
>>>>
>>>> typedef char *volatile vp;
>>>>
>>>> int main (void)
>>>> {
>>>>   char a = '\0', b = '\0';
>>>
>>> a and b are not guaranteed to be contiguous.
>>>
>>>>   vp p = &a, q = &b;
>>>>
>>>>   printf ("%p\n", (void *) p);
>>>>   printf ("%p\n", (void *) q);
>>>>   if (p + 1 == q)
>>>>     {
>>>
>>> That comparison is legal, and has well-defined behavior. It will
>>> be true only if they are in fact contiguous.
>>>
>>>>       a = 'x';
>>>>       printf ("%zd\n", strlen (p));
>>>
>>> Because strlen() must take a pointer to 'a' (which is treated, for
>>> these purposes, as an array of char of length 1), and increment it
>>> one past the end of that array, and then dereference that pointer
>>> to check whether it points at a null character, the behavior is
>>> undefined.
>>
>> I think this is slightly misleading. It suggests that the UB comes
>> from something strlen /must/ do, but strlen must be thought of as a
>> black box. We can't base anything on an assumed implementation.
>
> I agree (and note that strlen is not necessarily written in C).
>
>> But our conclusion is correct because there is explicit wording
>> covering this case. The section on "String function conventions"
>> (7.24.1) states:
>>
>>   "If an array is accessed beyond the end of an object, the
>>   behavior is undefined."
>
> Arguments of these functions are either arrays or strings, where
> a string is not defined as being an array (or a part of an array).
> So I don't see why this text, as written, would apply to strings.

Something that's important to understand is that the C standard is
not meant to be read as legalese or mathematicalese. Certainly the
authors are making an effort to be precise, but not always to the
degree that every sentence is entirely correct, or presents the
whole story, when considered just in isolation. To avoid being led
astray it helps to remember that and try to read holistically in
addition to reading passages individually.

In any case, the question here is easily resolved by noting the
description in paragraph 1 of 7.24.1 "String function conventions",
which says in part

    The header <string.h> declares one type and several functions,
    and defines one macro useful for manipulating arrays of
    character type and other objects treated as arrays of character
    type. [...] Various methods are used for determining the
    lengths of the arrays, but in all cases a char * or void *
    argument points to the initial (lowest addressed) character of
    the array.

Note especially the second part of the last sentence, starting with
"but in all cases". Arguments to functions in <string.h> always
refer to arrays, regardless of whether they might also refer to
strings.