Re: Preliminary version of new regex matcher for gawk now available

Subject: Re: Preliminary version of new regex matcher for gawk now available
From: arnold (at) *nospam* skeeve.com (Aharon Robbins)
Newsgroups: comp.lang.awk
Date: 26 Jul 2024, 08:31:53
Organization: Arnold Robbins
Message-ID: <66a350e9$0$706$14726298@news.sunsite.dk>
References: 1 2
User-Agent: trn 4.0-test77 (Sep 1, 2010)
In article <v7tf29$2984r$1@dont-email.me>,
Janis Papanagnou  <janis_papanagnou+ng@hotmail.com> wrote:
> On 25.07.2024 11:44, Aharon Robbins wrote:
>> Hi All.
>>
>> I've been working with Mike Haertel (the original author of GNU grep)
>> for a number of months now.  He is writing a new regexp matcher for
>> use in gawk (and other places, as people desire).
>
[ clipped ]

> My system complains about -std=c++20 so I cannot test it. (I think
> I'll wait for a native C release.)

That will be a while. It's not hard to build current GCC from scratch
on a Linux system.

>> Questions, comments, and *bug reports* are welcome.
>
> Well, I skimmed through the txt file on Mike's git page to learn
> about the algorithm; especially the algorithm and its complexity
> is of interest to me. The document was not quite clear about that
> (or at least made me doubt) beyond the general and typical O(N*M)
> characteristics.

You can email Mike directly about the technical stuff. It's mostly
beyond me.  Or, just open an issue on the GitHub and ask questions there.

I forgot to mention what is likely the most important point about
the new matcher, which is that it is fully POSIX-compliant. The
existing GNU matchers are not, and likely never will be.  There's at
least one bug I reported a few years back in the GNU matchers
that MinRX doesn't have, also.
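
To make the POSIX point concrete: for alternation, POSIX asks for the longest of the leftmost matches, not the first alternative that happens to succeed. The following is only a minimal sketch using the system's regcomp()/regexec() from <regex.h>, not MinRX (whose API is not shown in this thread); the function name posix_match_length is made up for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compile `pattern` as a POSIX ERE, match it against
 * `text`, and return the length of the overall match.
 * Returns -1 if the pattern does not compile or does not match.
 */
int posix_match_length(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    int len = -1;

    assert(pattern != NULL && text != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* m[0] receives the offsets of the whole match. */
    if (regexec(&re, text, 1, m, 0) == 0)
        len = (int)(m[0].rm_eo - m[0].rm_so);

    regfree(&re);
    return len;
}
```

With the pattern "a|ab" and the text "ab", a matcher that follows the POSIX rule reports a length of 2 (the whole "ab"), whereas a first-alternative-wins matcher in the Perl style would stop after "a".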

This matcher also has advantages for me as the maintainer.

> Algorithm simplicity is nice but as I understand there's not yet
> performance comparisons done?

They will be done.  By the time MinRX is in gawk for real, it will
be performant, and in C.

> Unless it was a deliberate offer to use GNU Awk as a test bed.
> And "nearly-feature-complete implementation" (section Features)
> is not quite a fruitful marketing concept.

As far as I'm concerned, it is feature complete.  However, it
doesn't support POSIX BREs.

> I also wonder why BSD and GNU extensions are supported but not
> the very useful abbreviations for {some,all} Perl RE shortcuts.

Because they're just window dressing. I have no desire to be
perl compatible.  My needs are to be able to do what gawk currently
does, no more.
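
For anyone who misses the Perl shortcuts: POSIX bracket expressions cover the same ground, e.g. [[:digit:]] where Perl would use \d. A minimal sketch, again with the system's <regex.h> rather than MinRX; the function name find_digit_run is an assumption made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: find the first run of digits in `text` using the
 * POSIX ERE "[[:digit:]]+" (the counterpart of Perl's \d+).
 * On success stores the offset and length of the run and returns 1;
 * returns 0 if nothing matches or the pattern fails to compile.
 */
int find_digit_run(const char *text, int *start, int *length)
{
    regex_t re;
    regmatch_t m[1];
    int found = 0;

    assert(text != NULL && start != NULL && length != NULL);

    if (regcomp(&re, "[[:digit:]]+", REG_EXTENDED) != 0)
        return 0;

    if (regexec(&re, text, 1, m, 0) == 0) {
        *start = (int)m[0].rm_so;
        *length = (int)(m[0].rm_eo - m[0].rm_so);
        found = 1;
    }

    regfree(&re);
    return found;
}
```

Called on "abc 2024 x", this stores offset 4 and length 4. The same bracket expressions work unchanged inside an awk regexp.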

HTH.
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
