Re: Implicit String-Literal Concatenation

Subject: Re: Implicit String-Literal Concatenation
From: Keith.S.Thompson+u (at) *nospam* gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Date: 8 Mar 2024, 00:46:01
Organization: None to speak of
Message-ID: <87frx1obba.fsf@nosuchdomain.example.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-03-07, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-03-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Mon, 04 Mar 2024 20:55:28 -0800, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Thu, 29 Feb 2024 14:14:52 -0800, Keith Thompson wrote:
"A *string* is a contiguous sequence of characters terminated by and
including the first null character."
>
So how come strlen(3) does not include the null?
 Because the *length of a string* is by definition "the number of bytes
preceding the null character".
>
So the “string” itself includes the null character, but its “length” does not?
>
That's correct. However, its size includes it.
>
 sizeof "abc" == 4
>
 strlen("abc") == 3
>
The abstract string does not include the null character;
we understand "abc" to be a three character string.
>
Sure, if you define "abstract string" that way.  I'll just note that C's
definition of the word "string" does include the terminating null
character, and does not talk about "abstract strings".  (A string in the
abstract machine clearly includes the null character, but that's a bit
of a stretch.)
>
Yes; "abstract machine" is not what I mean by abstract.
>
The concept of the abstract string lives in the semantics though.
>
When N strings are catenated together, their abstract strings are
juxtaposed together without any nulls in between, with only a single
null at the end.
True both for compile-time string literal catenation and for strcat().
But for the former, embedded null characters can slightly complicate
matters.  The value of a string literal isn't necessarily a string.
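#include <stdio.h>
int main(void) {
    const char s[] = "abc\0def" "ghi\0";
    puts(s);  /* stops at the first null character */
    /* Walk the whole array, showing each null character as \0. */
    for (size_t i = 0; i < sizeof s; i ++) {
        if (s[i] == '\0') {
            fputs("\\0", stdout);
        }
        else {
            putchar(s[i]);
        }
    }
    putchar('\n');
}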
Output:
abc
abc\0defghi\0\0

Furthermore, when a string is sent to a stream with %s or {f}puts,
the null byte is omitted, like in the calculation of length.
>
Clearly, there is a semantics that the part before the null byte
is the text processing payload; what I'm calling the abstract string.
Agreed.  To be clear, I like the idea of referring to the contents of a
string excluding the terminating null character as an "abstract string".

(With character encodings, it gets hairy. The part before the null
may be a UTF-8 sequence, where the abstract string consists of code
points. Which may be combining characters, so the True Scotsman's
abstract string is the sequence of characters.)
Yes.  With UTF-8, the term "abstract string" might reasonably refer
either to the sequence of bytes preceding the terminating '\0', or to
the sequence of code points.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */

