On 2025-02-15, Michael S <already5chosen@yahoo.com> wrote:
On Fri, 14 Feb 2025 20:51:38 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
Actually, in the same code, I'm also using the strtok() function
strtok() is one of the relatively small set of more problematic
functions in the C library that are not thread-safe.
The design of the strtok() API is not inherently unsafe against threads;
but it requires thread-local storage to be safe.
Since ISO C has threads now, it takes the opportunity to
explicitly remove any requirement for thread safety in strtok.
However, it is possible for an implementation to step forward and
make it thread safe. For instance, in a POSIX system, a thread-specific
key can be allocated for strtok on library initialization,
or the first use of strtok (via pthread_once).
  static pthread_key_t strtok_key;

  // ...

  if (pthread_key_create(&strtok_key, NULL))
    ...
Then strtok does
  char *strtok(char * restrict str, const char * restrict delim)
  {
    if (str == NULL)
      str = pthread_getspecific(strtok_key);

    ...

    // all return paths do this, if str has changed:
    pthread_setspecific(strtok_key, str);
    return ...;
  }
Only problem is that this will not perform anywhere near as well as
strtok_r, which specifies an inexpensive location for the context
pointer.
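
To make the outline above concrete, here is a minimal sketch of how it
might be filled in. The names ts_strtok, ts_strtok_key, ts_strtok_once and
ts_strtok_init are made up for illustration; this is not how any particular
C library implements strtok, only one way the thread-specific key idea could
be completed on a POSIX system.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical names; a sketch, not any real library's strtok. */
static pthread_key_t ts_strtok_key;
static pthread_once_t ts_strtok_once = PTHREAD_ONCE_INIT;

static void ts_strtok_init(void)
{
    /* No destructor: the saved pointer refers to caller-owned memory. */
    int err = pthread_key_create(&ts_strtok_key, NULL);
    assert(err == 0);
    (void) err;
}

char *ts_strtok(char *restrict str, const char *restrict delim)
{
    /* Allocate the key on first use, exactly once per process. */
    pthread_once(&ts_strtok_once, ts_strtok_init);

    if (str == NULL)
        str = pthread_getspecific(ts_strtok_key);
    if (str == NULL)
        return NULL;            /* no scan in progress on this thread */

    str += strspn(str, delim);  /* skip leading delimiters */
    if (*str == '\0') {
        pthread_setspecific(ts_strtok_key, NULL);
        return NULL;
    }

    char *end = str + strcspn(str, delim);
    if (*end != '\0')
        *end++ = '\0';          /* terminate token, step past delimiter */

    /* Save the resume position for this thread only. */
    pthread_setspecific(ts_strtok_key, end);
    return str;
}
```

Every call pays for pthread_once plus a pthread_getspecific and
pthread_setspecific pair, which is the overhead referred to above.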
If you only care about POSIX targets, then I'd recommend avoiding strtok
and using strtok_r().
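
For comparison, a small sketch of what strtok_r usage looks like. The
helper name split_words and its parameters are invented for this example;
only strtok_r itself is the POSIX function. The context pointer is an
ordinary local variable, so nothing is shared between threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line in place, store up to max token
   pointers in out, and return how many were stored. */
size_t split_words(char *line, const char *delim, char **out, size_t max)
{
    char *save;   /* scan context, owned by the caller of strtok_r */
    size_t n = 0;

    assert(line != NULL && delim != NULL && out != NULL);

    for (char *tok = strtok_r(line, delim, &save);
         tok != NULL && n < max;
         tok = strtok_r(NULL, delim, &save))
        out[n++] = tok;

    return n;
}
```

Note that, like strtok, this merges runs of delimiters and never yields an
empty token.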
I would recommend learning about strspn and strcspn, and writing
your own tokenizing loop:
  /* strtok-like loop: input variables are str and delim */

  for (;;) {
    /* skip delim chars to find start of tok */
    char *tok = str + strspn(str, delim);

    /* tokens must be nonempty */
    if (*tok == 0)
      break;

    /* OK; tok points to non-delim char.
       Find end of token: skip span of non-delim chars,
       starting from tok, not from str. */
    char *end = tok + strcspn(tok, delim);

    /* Record whether the end of the token is the end
       of the string. */
    char more = *end;

    /* null-terminate token */
    *end = 0;

    { /* process tok here */ }

    if (!more)
      break;

    /* If there is more material after the tok, point
       str there and continue */
    str = end + 1;
  }
The strtok function is ill-suited to many situations. For instance,
there are situations in which you do want empty tokens, like CSV, such
that ",abc,def," shows four tokens, two of them empty.
With the strspn and strcspn building blocks, you can easily whip up a
custom tokenizing loop that has the right semantics for the situation.
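
As an illustration of the empty-token case, here is one possible variant.
The function name split_fields and its out/max parameters are made up for
the example. The only real change from a strtok-like loop is that the
strspn step is dropped, so every delimiter ends a field, even an empty one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split str in place into fields separated by any
   char in delim, keeping empty fields. Stores up to max field pointers
   in out and returns how many were stored. */
size_t split_fields(char *str, const char *delim, char **out, size_t max)
{
    size_t n = 0;

    assert(str != NULL && delim != NULL && out != NULL);

    while (n < max) {
        /* No strspn here: a field starts right at str, even if empty. */
        char *end = str + strcspn(str, delim);
        char more = *end;

        *end = 0;           /* null-terminate field */
        out[n++] = str;

        if (!more)
            break;          /* that was the last field */

        str = end + 1;      /* next field begins after the delimiter */
    }
    return n;
}
```

Given ",abc,def," and "," as the delimiter set, this stores four fields:
an empty one, "abc", "def", and another empty one. It does not handle
quoted CSV fields; that would need more logic.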
We can also write our loop such that it restores the original
character that was overwritten in order to null-terminate the token,
simply by adding *end = more after the token has been processed. Thus
when the loop ends, the string is restored to its original state.
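
A sketch of the restoring variant, wrapped in a function so that it can be
tried out. The name longest_token is invented, and measuring token length
merely stands in for whatever processing is actually wanted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the length of the longest token in str.
   The string is modified temporarily but is intact again on return. */
size_t longest_token(char *str, const char *delim)
{
    size_t best = 0;

    assert(str != NULL && delim != NULL);

    for (;;) {
        char *tok = str + strspn(str, delim);
        if (*tok == 0)
            break;

        char *end = tok + strcspn(tok, delim);
        char more = *end;
        *end = 0;                   /* null-terminate token */

        size_t len = strlen(tok);   /* "process" tok: just measure it */
        if (len > best)
            best = len;

        *end = more;                /* put back the overwritten char */

        if (!more)
            break;
        str = end + 1;
    }
    return best;
}
```

The buffer still has to be writable, since each token is null-terminated
while it is being processed.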
I can understand code like that above without having to look up
anything, but if I see strtok or strtok_r code after many years of not
working with strtok, I will need a refresher on how exactly they define
a token.
-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca