Subject : Re: relearning C: why does an in-place change to a char* segfault?
From : 643-408-1753 (at) *nospam* kylheku.com (Kaz Kylheku)
Newsgroups : comp.lang.c
Date : 02. Aug 2024, 01:37:44
Organization : A noiseless patient Spider
Message-ID : <20240801172148.200@kylheku.com>
References : 1 2 3
User-Agent : slrn/pre1.0.4-9 (Linux)
On 2024-08-01, Bart <bc@freeuk.com> wrote:
> On 01/08/2024 20:39, Kaz Kylheku wrote:
>> On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
>>> This program segfaults at the commented line:
>>>
>>> #include <ctype.h>
>>> #include <stdio.h>
>>>
>>> void uppercase_ascii(char *s) {
>>>     while (*s) {
>>>         *s = toupper(*s); // SEGFAULT
>>>         s++;
>>>     }
>>> }
>>>
>>> int main() {
>>>     char* text = "this is a test";
>>
>> The "this is a test" object is a literal. It is part of the program's image.
>
> So is the text here:
>
>   char text[]="this is a test";
>
> But this can be changed without making the program self-modifying.

The array which is initialized by the literal is what can be
changed.
In this situation, the literal is just initializer syntax,
not required to be an object with an address.
But there could well be such an object in the program image,
especially if the array is automatic, and thus instantiated
many times.
If the program tries to search for that object and modify it,
it will run into UB.

> I guess it depends on what is classed as the program's 'image'.
>
> I'd say the image in the state it is in just after loading or just
> before execution starts (since certain fixups are needed). But some
> sections will be writable during execution, some not.

Programs can self-modify in ways designed into the run time.
The toaster has certain internal receptacles that can take
certain forks, according to some rules, which do not affect
the user operating the toaster according to the manual.

> The dangers are small, but there must be reasons why a dedicated
> section is normally used. gcc on Windows creates up to 19 sections, so
> it would be odd for literal strings to share with code.

One reason is that PC-relative addressing can be used by code to
find its literals. Since that usually has a limited range, it helps
to keep the literals with the code. Combining sections also reduces
size. The addressing is also relocatable, which is useful in shared
libs.
-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca