Re: bash aesthetics question: special characters in reg exp in [[ ... =~ ... ]]

Subject: Re: bash aesthetics question: special characters in reg exp in [[ ... =~ ... ]]
From: 643-408-1753 (at) *nospam* kylheku.com (Kaz Kylheku)
Newsgroups: comp.unix.shell
Date: 24 Jul 2024, 20:35:51
Organization: A noiseless patient Spider
Message-ID: <20240724112619.254@kylheku.com>
User-Agent: slrn/pre1.0.4-9 (Linux)
On 2024-07-24, Ben Bacarisse <ben@bsb.me.uk> wrote:
Kaz Kylheku <643-408-1753@kylheku.com> writes:
>
On 2024-07-23, Ben Bacarisse <ben@bsb.me.uk> wrote:
Kaz Kylheku <643-408-1753@kylheku.com> writes:
This matters when regexes are used for matching a prefix of the input;
if the regex is interpreted according to the theory, it should match
the longest possible prefix; it cannot ignore R3, which matches
thousands of symbols, because R2 matched three symbols.
>
This is more a consequence of the different views.  In the formal
theory there is no notion of "matching".  Regular expressions define
languages (i.e. sets of sequences of symbols) according to a recursive
set of rules.  The whole idea of an RE matching a string is from their
use in practical applications.
>
Under the set view, we can ask, what is the longest prefix of
the input which belongs to the language R1|R2. The answer is the
same for R2|R1, which denote the same set, since | corresponds
to set union.
>
What is "the input" in the set view?  The set view is simply a recursive
definition of the language.

It is a separate string under consideration.

We have a set, and are asking the question "what is the longest prefix
of the given string which is a member of the set".
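
A minimal sketch of that question in code, assuming Python's `re`
module purely for illustration (nothing in the thread uses Python, and
`longest_prefix_in_language` is a hypothetical helper name). It tests
each prefix for membership with `re.fullmatch`, longest first. Since
membership of a whole string does not depend on the order of the
branches, R1|R2 and R2|R1 should give the same answer:

```python
import re

def longest_prefix_in_language(pattern, text):
    """Return the longest prefix of text that belongs to the language
    denoted by pattern, or None if no prefix (not even "") belongs."""
    compiled = re.compile(pattern)
    # Try prefixes from longest to shortest; fullmatch is the membership test.
    for end in range(len(text), -1, -1):
        if compiled.fullmatch(text[:end]):
            return text[:end]
    return None

# Both orderings of the branch denote the same set, so both calls
# should report the same prefix.
print(longest_prefix_in_language("a|ab*", "abbbbbbbc"))
print(longest_prefix_in_language("ab*|a", "abbbbbbbc"))
```

This is quadratic at best and only meant to pin down the question being
asked, not to suggest how a matcher should be implemented.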

Broken regular expressions identify the longest prefix, except
when the | operator is used; then they just identify a prefix,
not necessarily longest.
>
What is a "broken" RE in the set view?

Inconsistency in being able to answer the question "what is the longest
prefix of the string which is a member of the set".

Broken regexes contain a pitfall: they deliver the right answer
for expressions like ab*. If the input is "abbbbbbbc",
they identify the entire "abbbbbbb" prefix. But if the branch
operator is used, as in "a|ab*", oops, they short-circuit.
The "a" matches a prefix of the input, and so that's done; no need
to match the "ab*" part of the branch.
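
The short-circuit can be reproduced with a small sketch in Python, whose
`re` module is a backtracking engine that accepts the first branch that
succeeds. Python is an assumption made here for illustration; it is not
the engine discussed in the thread, and a POSIX-conforming matcher is
specified to report the longest match instead:

```python
import re

text = "abbbbbbbc"

# Without a branch, the greedy star consumes every "b".
star_only = re.match(r"ab*", text).group()

# With the branch, the first alternative "a" succeeds and the engine stops.
short_first = re.match(r"a|ab*", text).group()

# Swapping the branches changes the result, although the set is the same.
long_first = re.match(r"ab*|a", text).group()

print(star_only, short_first, long_first)
```

The first and third results are the whole run of "a" followed by "b"s;
the second is just "a".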

The "a" prefix is in the language described by the expression; a
set element has been identified. But it's not the longest one.

It is an inconsistency. If the longest match is not required, why
bother finding one for "ab*"? For that expression, the "a" prefix could
just as well be returned.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
