Re: Experiences with match() subexpressions?

Subject : Re: Experiences with match() subexpressions?
From : janis_papanagnou+ng (at) *nospam* hotmail.com (Janis Papanagnou)
Newsgroups : comp.lang.awk
Date : 10. Apr 2025, 12:55:07
Organization : A noiseless patient Spider
Message-ID : <vt8bit$2uiq5$1@dont-email.me>
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
On 10.04.2025 13:08, Kenny McCormack wrote:
In article <vt7qs4$2gior$1@dont-email.me>,
Janis Papanagnou  <janis_papanagnou+ng@hotmail.com> wrote:
On 10.04.2025 09:06, Janis Papanagnou wrote:
I'm looking for subexpressions of regexp matches using the third
parameter of GNU Awk's match(). For example:

  data = "R=r1,R=r2,R=r3,E=e"
  match (data, /^(R=([^,]+),){2,5}E=(.+)$/, arr)

The result stored in 'arr' seems to be determined by the static
parenthesis structure, so with the pattern repetition {2,5} only
the last matched data in the subexpression (r3) seems to persist
in arr. I suppose there's no cute way to achieve what I wanted?

To clarify: what I wanted is access to the values "r1", "r2", "r3",
and "e" through 'arr'.
 
I have to admit that I (still) don't really understand how this match third
arg stuff works.

I've never used it before, but it seems to be quite simple: for every
parenthesized group in the regexp it provides (statically, as the
parentheses are written, from left to right) an array element holding
the text that the subexpression matched.

I.e., I can never predict what will happen, so I always
just dump out the array and try to reverse-engineer it each time I need to
use it.
 
I adapted your code into the following test script:
 
--- Cut Here ---
#!/bin/sh
gawk 'BEGIN {
    data = "R=r1,R=r2,R=r3,E=e"
    match (data, /^(R=([^,]+),){2,5}E=(.+)$/, arr)
    for (i in arr) print i,arr[i]
    }'
 
# To clarify; what I wanted is access of the values "r1", "r2", "r3",
# and "e" through 'arr'.
--- Cut Here ---
 
The output I get is:
 
--- Cut Here ---
0start 1
0length 18
3start 18
1start 11
2start 13
3length 1
2length 2
1length 5

The output above appears because 'arr' also stores additional elements
with the start positions and lengths of the matches.

I don't need those, so I'm only interested in the data entries below and
iterate over them with an index-counted loop...

0 R=r1,R=r2,R=r3,E=e

the whole expression

1 R=r3,

the expression in the first parenthesis

2 r3

the expression in the second, embedded parenthesis

3 e

the expression in the final parenthesis

--- Cut Here ---
 
After playing around a bit, I could not come up with any sensible way of
getting what you want to get.

Yeah, Arnold just told me the same; that it's impossible because the
underlying GNU regexp library doesn't support what I'm looking for.

What I considered as a possible workaround (in this case) is to spell out
the (...){2,5} repetition as a sequence of explicit (...) and (...)? expressions.
(But in the general case, for larger ranges than 2-5, that's neither
feasible nor sensible any more.)

 
As an alternative, it sounds like you could just split the
string on the comma; that would get you:

Yes, that was also how I did such things in the past. Only when I saw
that "third argument" to match() did I hope the two-level parsing could
be simplified into one step. The reason was that I thought I had seen
other languages (Perl, maybe?) that supported such a feature.

 
    R=r1
    R=r2
    R=r3
    E=e
 
Or, for finer control, you could use patsplit().

I think I'll do the parsing the straightforward two-step way as I did
before the GNU Awk specific functions were available; it's probably
also the clearest way to program that functionality.
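
A sketch of the two-step approach, using only standard awk functions so that it does not depend on gawk: first split on the comma, then separate key and value at the "=". The function name two_step_parse is invented, and the sketch leaves out any check that there are between two and five R fields.

```shell
#!/bin/sh
# Hypothetical helper: two-step parsing with portable awk functions only.
two_step_parse() {
    awk -v data="$1" 'BEGIN {
        # Step 1: break the record into KEY=value fields.
        n = split(data, field, ",")
        for (i = 1; i <= n; i++) {
            # Step 2: separate key and value at the first "=".
            eq = index(field[i], "=")
            if (eq == 0)
                continue    # not a KEY=value field, skip it
            print substr(field[i], 1, eq - 1), substr(field[i], eq + 1)
        }
    }'
}

two_step_parse "R=r1,R=r2,R=r3,E=e"
```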

Janis

