Subject: Re: May a string span multiple, independent objects?
From: jameskuyper (at) *nospam* alumni.caltech.edu (James Kuyper)
Newsgroups: comp.std.c
Date: 05. Jul 2024, 06:37:35
Organization: A noiseless patient Spider
Message-ID: <v680qv$34e6h$1@dont-email.me>
References: 1 2 3 4
User-Agent: Mozilla Thunderbird
On 7/4/24 09:22, Vincent Lefevre wrote:
In article <87zfqy6v54.fsf@bsb.me.uk>,
Ben Bacarisse <ben@bsb.me.uk> wrote:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
On 7/3/24 10:31, Vincent Lefevre wrote:
ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A string
is a contiguous sequence of characters terminated by and including
the first null character."
>
But may a string span multiple, independent objects that happen
to be contiguous in memory?
...
For instance, is the following program valid and what does the ISO C
standard say about that?
>
#include <stdio.h>
#include <string.h>
>
typedef char *volatile vp;
>
int main (void)
{
char a = '\0', b = '\0';
>
a and b are not guaranteed to be contiguous.
>
vp p = &a, q = &b;
>
printf ("%p\n", (void *) p);
printf ("%p\n", (void *) q);
if (p + 1 == q)
{
>
That comparison is legal, and has well-defined behavior. It will be true
only if they are in fact contiguous.
>
a = 'x';
printf ("%zd\n", strlen (p));
>
Because strlen() must take a pointer to 'a' (which is treated, for these
purposes, as an array of char of length 1), and increment it one past the
end of that array, and then dereference that pointer to check whether it
points at a null character, the behavior is undefined.
I think this is slightly misleading. It suggests that the UB comes from
something strlen /must/ do, but strlen must be thought of as a black
box. We can't base anything on an assumed implementation.
I agree (and note that strlen is not necessarily written in C).
But our conclusion is correct because there is explicit wording covering
this case. The section on "String function conventions" (7.24.1)
states:
"If an array is accessed beyond the end of an object, the behavior is
undefined."
Arguments of these functions are either arrays or strings, where a
string is not defined as being an array (or a part of an array). So
I don't see why this text, as written, would apply to strings.
BTW, the definition of an object is rather vague: "region of data
storage in the execution environment, the contents of which can
represent values". But it is not excluded that contiguous areas
can form an object.
Not everything you need to know about a term defined in the C standard
is included in its definition. Other parts of the standard tell you that
objects are created by declarations of identifiers for those objects
with static, thread_local, or automatic storage duration. Other parts
tell you that anonymous objects can be created by the presence of string
or compound literals. The description of the standard library tells you
that objects with allocated storage duration are created by calling
memory allocation functions.
Nowhere does it say that a larger C object can be created simply by
having two C objects that happen to be adjacent to each other.
The basic rule, even though it is not explicitly part of the definition
of "object", is that you don't have a C object unless some clause of the
C standard tells you that it is an object, and the clauses I've
summarized above are the only ones that do so.
Note: if they don't just "happen" to be adjacent - if the C standard
guarantees that two objects are adjacent to each other by reason of
being sub-objects of some larger object - then the existence of that
larger object is what makes the behavior defined when incrementing a
pointer into the first object through the second.
Similarly, malloc() is specified as allocating space for an object,
but this does not mean that one initially has an object in the
Actually, it does. "The lifetime of an allocated object extends from the
allocation until the deallocation." (7.24.3p1). It becomes an object as
soon as allocated.
"The effective type of an object for an access to its stored value is
the declared type of the object, if any." (6.5p6).
Note that allocated memory is the only kind that doesn't start out with
a declared type. That paragraph goes on to say that
"If a value is stored into an object having no declared type through an
lvalue having a type that is not a non-atomic character type, then the
type of the lvalue becomes the effective type of the object for that
access and for subsequent accesses that do not modify the stored value."
Note that this wording describes it as already being an object before
any value has been written into the allocated memory. The second way to
give allocated memory an effective type uses wording with that same
implication:
"If a value is copied into an object having no declared type using
memcpy or memmove, or is copied as an array of character type, then the
effective type of the modified object for that access and for subsequent
accesses that do not modify the value is the effective type of the
object from which the value is copied, if it has one."