scott@slp53.sl.home (Scott Lurndal) writes:
Rainer Weikusat <rweikusat@talktalk.net> writes:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
Rainer Weikusat <rweikusat@talktalk.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
Rainer Weikusat <rweikusat@talktalk.net> wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
[...]
Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.
Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (i.e., it points just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is
    while (p < e && *p - '0' < 10) ++p;
That's not too bad. And it's really a hell of a lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).
It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).
The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.
That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.
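To make the promotion issue concrete, here is a minimal sketch of the same loop with the subtraction forced into unsigned arithmetic. The function name skip_digits is hypothetical and not from the thread; the only change to the loop itself is the cast.

```c
#include <stddef.h>

/* Hypothetical helper: advance p over a run of ASCII digits, never past e.
 * Without the cast, *p is promoted to int, so a byte below '0' (a space,
 * say) yields a negative difference that still compares less than 10.
 * With the cast, the subtraction is done in unsigned arithmetic and such
 * a byte wraps around to a large value that fails the test. */
static const unsigned char *skip_digits(const unsigned char *p,
                                        const unsigned char *e)
{
    while (p < e && (unsigned)*p - '0' < 10)
        ++p;
    return p;
}
```

The return value is the position after the run of digits; it equals the starting position when no digit was consumed.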
That's the core part of matching something equivalent to the regex [0-9]+
and the only part of it which is at least remotely interesting.
But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.
Why do you believe that p being equivalent to the starting position
would be considered a "successful match", considering that this
obviously doesn't make any sense?
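For illustration, the success-or-failure notion under discussion could be added to the scanning loop roughly as follows. This is only a sketch: match_digits and its out-parameter are hypothetical names, and it assumes the match is anchored at the start of the buffer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: report whether [0-9]+ matches at the start of the
 * buffer [s, e). On success, *end is set to just past the matched digits;
 * on failure, *end is left untouched. */
static bool match_digits(const unsigned char *s, const unsigned char *e,
                         const unsigned char **end)
{
    const unsigned char *p = s;

    while (p < e && (unsigned)*p - '0' < 10)
        ++p;
    if (p == s)
        return false;   /* zero digits consumed: the match fails */
    *end = p;
    return true;
}
```

With this, "ZZZZZZ" and the empty string are both reported as failures, while a string starting with one or more digits succeeds.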
[...]
By the way, something that _would_ match `^[0-9]+$` might be:
[too much code]
Something which would match [0-9]+ in its first argument (if any) would
be:
#include <string.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 9) ++p;   /* skip everything that is not a digit */
    if (!c) exit(1);                        /* end of string reached, no digit found */
    return 0;
}
but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.
Personally, I'd use:
Albeit the converted value is only meaningful for strings of digits that
represent a number no greater than ULLONG_MAX...
$ cat /tmp/a.c
#include <stdint.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);   /* cp is set to the first unparsed character */
    (void)value;                          /* only the parse position matters here */
    if ((cp == argv[1])
        || (*cp != '\0')) {
        return 1;
    }
    return 0;
}
$ cc -o /tmp/a /tmp/a.c
$ /tmp/a 13254
$ echo $?
0
$ /tmp/a 23v23
$ echo $?
1
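One caveat with the strtoull approach: the function skips leading whitespace and accepts an optional sign, so arguments such as " 42" or "-1" also pass the end-pointer check, and an overflowing value is clamped rather than rejected unless errno is examined. A stricter variant might look like the sketch below; is_digit_string is a hypothetical name, and treating overflow as a failed match is an assumption rather than something the thread specifies.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true only if s consists of one or more decimal
 * digits whose value fits in an unsigned long long. */
static bool is_digit_string(const char *s)
{
    char *end;

    /* strtoull would skip leading whitespace and accept a sign, so insist
     * that the first character is already a digit. This also rejects the
     * empty string. */
    if (!isdigit((unsigned char)s[0]))
        return false;
    errno = 0;
    (void)strtoull(s, &end, 10);
    if (errno == ERANGE)
        return false;   /* value exceeds ULLONG_MAX */
    return *end == '\0';
}
```

Dropping the ERANGE check would accept digit strings of any length, which is closer to what the regex ^[0-9]+$ does.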