Liste des Groupes | Revenir à l misc |
Rainer Weikusat <rweikusat@talktalk.net> wrote:cross@spitfire.i.gajendra.net (Dan Cross) writes:>Rainer Weikusat <rweikusat@talktalk.net> wrote:>Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:>
>
[...]
>Personally I think that writing bulky procedural stuff for something>
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.
Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is
>
while (p < e && *p - '0' < 10) ++p;
>
That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).
It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).
The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.
That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.
But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.
By the way, something that _would_ match `^[0-9]+$` might be:
Les messages affichés proviennent d'usenet.