Subject : Re: relearning C: why does an in-place change to a char* segfault?
From : ben (at) *nospam* bsb.me.uk (Ben Bacarisse)
Newsgroups : comp.lang.c
Date : 01. Aug 2024, 11:53:48
Organization : A noiseless patient Spider
Message-ID : <87h6c4ecoz.fsf@bsb.me.uk>
References : 1 2
User-Agent : Gnus/5.13 (Gnus v5.13)
Mark Summerfield <mark@qtrac.eu> writes:

> The formatting was messed up by Pan.
>
> The function was:
>
> void uppercase_ascii(char *s) {
>     while (*s) {
>         *s = toupper(*s);

There's a tricky technicality with all of the character functions in
<ctype.h>. They take an int argument so that EOF (typically -1) can be
passed, but otherwise the argument must be an int "which shall be
representable as an unsigned char" or the behaviour is undefined.

If char is signed (as it very often is) then in some locales, like the
ISO-8859-* ones, many lower-case letters are negative so, to be 100%
portable, you should write

    *s = toupper((unsigned char)*s);

Now, since the behaviour is undefined, many implementations "do what you
want" but that only means you won't spot the bug by testing until the
code is linked to some old library that does not fix the issue!

>         s++;
>     }
> }

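For reference, the quoted function with the cast folded in might look
like the sketch below. The cast back to char on the assignment is an
addition for clarity only, and the comment about writable storage
reflects the thread's subject rather than anything in the code above.

```c
#include <ctype.h>

/* Upper-case a NUL-terminated string in place.
   The caller must pass writable storage (an array), not a string literal. */
void uppercase_ascii(char *s)
{
    while (*s) {
        /* Convert to unsigned char first so that toupper never
           receives a negative value other than EOF. */
        *s = (char)toupper((unsigned char)*s);
        s++;
    }
}
```

It would be called on an array such as `char buf[] = "hello";` rather
than on a pointer to a string literal.
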
Note that this does not crop up in a typical input loop:

    int ch;
    while ((ch = getchar()) != EOF)
        putchar(toupper(ch));

because the input function "obtains [the] character as an unsigned char
converted to an int".
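
To make the difference concrete, here is a small sketch of the two
values a character function could end up receiving. The helper names
are hypothetical, and signed char is used explicitly so the result does
not depend on whether plain char is signed. In ISO-8859-1 the letter
e-acute is byte 0xE9, which a signed 8-bit char holds as -23.

```c
#include <limits.h>

/* Hypothetical helper: the int value that results when a signed
   character is passed without a cast (it may be negative). */
int value_without_cast(signed char c)
{
    return c;
}

/* Hypothetical helper: the int value after converting to unsigned
   char first; the result is always in the range 0..UCHAR_MAX. */
int value_with_cast(signed char c)
{
    return (unsigned char)c;
}
```

With 8-bit chars, value_without_cast(-23) stays -23, which is outside
what toupper accepts, while value_with_cast(-23) is 233, the same value
getchar would hand back for that byte.
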
-- Ben.