Subject: Re: Experiences with match() subexpressions?
From: gazelle (at) *nospam* shell.xmission.com (Kenny McCormack)
Newsgroups: comp.lang.awk
Date: 10. Apr 2025, 12:08:55
Organization: The official candy of the new Millennium
Message-ID: <vt88s7$1ghd2$1@news.xmission.com>
References: 1 2
User-Agent: trn 4.0-test77 (Sep 1, 2010)
In article <vt7qs4$2gior$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>On 10.04.2025 09:06, Janis Papanagnou wrote:
>> I'm looking for subexpressions of regexp-matches using GNU Awk's
>> third parameter of match(). For example
>>
>>   data = "R=r1,R=r2,R=r3,E=e"
>>   match (data, /^(R=([^,]+),){2,5}E=(.+)$/, arr)
>>
>> The result stored in 'arr' seems to be determined by the static
>> parenthesis structure, so with the pattern repetition {2,5} only
>> the last matched data in the subexpression (r3) seems to persist
>> in arr. - I suppose there's no cute way to achieve what I wanted?
>
>To clarify; what I wanted is access of the values "r1", "r2", "r3",
>and "e" through 'arr'.
I have to admit that I (still) don't really understand how this match third
arg stuff works. I.e., I can never predict what will happen, so I always
just dump out the array and try to reverse-engineer it each time I need to
use it.
I adapted your code into the following test script:
--- Cut Here ---
#!/bin/sh
gawk 'BEGIN {
data = "R=r1,R=r2,R=r3,E=e"
match (data, /^(R=([^,]+),){2,5}E=(.+)$/, arr)
for (i in arr) print i,arr[i]
}'
# To clarify; what I wanted is access of the values "r1", "r2", "r3",
# and "e" through 'arr'.
--- Cut Here ---
The output I get is:
--- Cut Here ---
0start 1
0length 18
3start 18
1start 11
2start 13
3length 1
2length 2
1length 5
0 R=r1,R=r2,R=r3,E=e
1 R=r3,
2 r3
3 e
--- Cut Here ---
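For reference, here is a minimal sketch of reading the same array by group number instead of dumping it. gawk stores the position data under the two-part subscripts arr[n, "start"] and arr[n, "length"], joined by SUBSEP, which is why the dump shows keys like "0start". The function name dump_groups is made up for illustration, and the upper bound of 3 is simply the number of parenthesis pairs in this particular regexp.

```shell
#!/bin/sh
# Hypothetical helper: list each parenthesized group of the match by number.
dump_groups() {
    gawk -v data="$1" 'BEGIN {
        # match() returns 0 when the regexp does not match at all.
        if (!match(data, /^(R=([^,]+),){2,5}E=(.+)$/, arr))
            exit 1
        # Group 0 is the whole match; groups 1..3 follow the order of
        # the opening parentheses in the regexp.
        for (n = 0; n <= 3; n++)
            printf "%d start=%d length=%d text=%s\n",
                   n, arr[n, "start"], arr[n, "length"], arr[n]
    }'
}

dump_groups "R=r1,R=r2,R=r3,E=e"
```

This prints the same values as the dump above, ordered by group number. Groups 1 and 2 sit inside the repeated part of the regexp, so they hold only the last repetition ("R=r3," and "r3").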
After playing around a bit, I could not come up with any sensible way of
getting what you want to get.
As an alternative, it sounds like you could just split the
string on the comma; that would get you:
R=r1
R=r2
R=r3
E=e
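A rough sketch of that split() approach follows, going one step further and cutting each piece at its first "=" so the bare values come out. The function name split_pairs is invented for this example. Nothing here is gawk-specific, so plain awk is used.

```shell
#!/bin/sh
# Hypothetical helper: print one "key value" line per comma-separated pair.
split_pairs() {
    awk -v data="$1" 'BEGIN {
        # split() fills field[1..n] and returns n.
        n = split(data, field, ",")
        for (i = 1; i <= n; i++) {
            # Cut at the first "=": key on the left, value on the right.
            eq = index(field[i], "=")
            print substr(field[i], 1, eq - 1), substr(field[i], eq + 1)
        }
    }'
}

split_pairs "R=r1,R=r2,R=r3,E=e"
```

Unlike the original regexp, this does no validation: it does not check that there are between 2 and 5 R fields or that the E field comes last.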
Or, for finer control, you could use patsplit().
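A possible patsplit() version, as a sketch only: the regexp /[RE]=[^,]+/ is my guess at what counts as a field, and the function name patsplit_pairs is made up. patsplit() is a gawk extension that collects the pieces that match the regexp, where split() collects the pieces between separators.

```shell
#!/bin/sh
# Hypothetical helper: extract "R=..." and "E=..." items with patsplit().
patsplit_pairs() {
    gawk -v data="$1" 'BEGIN {
        # patsplit() stores each match in item[1..n] and returns n.
        n = patsplit(data, item, /[RE]=[^,]+/)
        for (i = 1; i <= n; i++)
            # The key is the single letter before "="; the value starts at
            # the third character.
            print substr(item[i], 1, 1), substr(item[i], 3)
    }'
}

patsplit_pairs "R=r1,R=r2,R=r3,E=e"
```

Anything in the string that does not look like an R= or E= item is skipped rather than reported, which may or may not be what is wanted.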
-- 
The randomly chosen signature file that would have appeared here is more than 4
lines long.  As such, it violates one or more Usenet RFCs.  In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/Reaganomics