Subject: Re: May a string span multiple, independent objects?
From: vincent-news (at) *nospam* vinc17.net (Vincent Lefevre)
Newsgroups: comp.std.c
Date: 04 Jul 2024, 14:22:26
Organization: a training zoo
Message-ID: <20240704130236$a100@vinc17.org>
References: 1 2 3
User-Agent: tin/2.6.4-20240531 ("Banff") (Linux/6.7.12-amd64 (x86_64))

In article <87zfqy6v54.fsf@bsb.me.uk>,
  Ben Bacarisse <ben@bsb.me.uk> wrote:

> James Kuyper <jameskuyper@alumni.caltech.edu> writes:
>
>> On 7/3/24 10:31, Vincent Lefevre wrote:
>>> ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A string
>>> is a contiguous sequence of characters terminated by and including
>>> the first null character."
>>>
>>> But may a string span multiple, independent objects that happen
>>> to be contiguous in memory?
>> ...
>>> For instance, is the following program valid and what does the ISO C
>>> standard say about that?
>>>
>>> #include <stdio.h>
>>> #include <string.h>
>>>
>>> typedef char *volatile vp;
>>>
>>> int main (void)
>>> {
>>>   char a = '\0', b = '\0';
>>
>> a and b are not guaranteed to be contiguous.
>>
>>>   vp p = &a, q = &b;
>>>
>>>   printf ("%p\n", (void *) p);
>>>   printf ("%p\n", (void *) q);
>>>   if (p + 1 == q)
>>>     {
>>
>> That comparison is legal, and has well-defined behavior. It will be true
>> only if they are in fact contiguous.
>>
>>>       a = 'x';
>>>       printf ("%zu\n", strlen (p));
>>
>> Because strlen() must take a pointer to 'a' (which is treated, for these
>> purposes, as an array of char of length 1), and increment it one past the
>> end of that array, and then dereference that pointer to check whether it
>> points at a null character, the behavior is undefined.
>
> I think this is slightly misleading.  It suggests that the UB comes from
> something strlen /must/ do, but strlen must be thought of as a black
> box.  We can't base anything on an assumed implementation.

I agree (and note that strlen is not necessarily written in C).

> But your conclusion is correct because there is explicit wording covering
> this case.  The section on "String function conventions" (7.24.1)
> states:
>
>   "If an array is accessed beyond the end of an object, the behavior is
>   undefined."

Arguments of these functions are either arrays or strings, where a
string is not defined as being an array (or a part of an array). So
I don't see why this text, as written, would apply to strings.
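
For contrast, here is a minimal sketch of the two operations whose status is not disputed in this thread: calling strlen() on a string that lies entirely within one array object, and testing adjacency with == without reading anything past the first object. The function names are made up for illustration and appear nowhere in the thread or in the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the string "x" lives entirely inside one array
   object, so strlen never reads beyond the end of that object. */
size_t len_within_one_object (void)
{
  char buf[2] = { 'x', '\0' };   /* a single object of two chars */
  return strlen (buf);
}

/* Hypothetical helper: nonzero if b points just past the char that a
   points to.  Only the pointer a + 1 is formed and compared; no byte
   beyond *a is read.  a must point to an actual char, not one past
   the end of an array. */
int adjacent (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  return a + 1 == b;
}
```

The open question is only what happens when adjacent (&a, &b) is true for two separately declared chars and strlen (&a) is then called; the sketch deliberately stops short of that call.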

BTW, the definition of an object is rather vague: "region of data
storage in the execution environment, the contents of which can
represent values". Nothing in it excludes contiguous areas from
together forming an object.

Similarly, malloc() is specified as allocating space for an object,
but this does not mean that an object initially exists in the
allocated space. Yet with the above restriction, the existence of an
object there would matter for being able to use memset() on this
storage area.
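
The malloc()/memset() pattern in question is the everyday one sketched below. This only illustrates common practice and does not settle how the standard's wording should be read; the helper name filled_string is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated string made of n
   copies of c, or NULL on failure.  The caller must free the result. */
char *filled_string (size_t n, char c)
{
  assert (c != '\0');            /* otherwise the string would be empty */
  if (n == SIZE_MAX)             /* n + 1 would wrap around to 0 */
    return NULL;

  char *s = malloc (n + 1);      /* space with no declared type yet */
  if (s == NULL)
    return NULL;

  memset (s, c, n);              /* first access to the allocated space */
  s[n] = '\0';                   /* terminate: s now holds a string */
  return s;
}
```

For example, filled_string (3, 'x') yields a pointer to "xxx", on which strlen() returns 3. The memset() call is the first thing that touches the allocated space, which is the situation the paragraph above is concerned with.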

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)