On 20.11.2024 12:30, Muttley@DastartdlyHQ.org wrote:
On Wed, 20 Nov 2024 11:51:11 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
On 20.11.2024 09:21, Muttley@DastartdlyHQ.org wrote:
On Tue, 19 Nov 2024 18:43:48 -0800
merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:
>
I'm often reminded of this as I've been coding very little in Perl these
days, and a lot more in languages like Dart, where the regex feels like
a clumsy bolt-on rather than a proper first-class citizen.
>
Regex itself is clumsy beyond simple search and replace patterns. A lot of
stuff I've seen done in regex would have been better done procedurally at the
expense of slightly more code but a LOT more readability. Also given it's
effectively a compact language with its own grammar and syntax IMO it should
not be the core part of any language as it can lead to a syntactic mess,
which is what often happens with Perl.
>
I wouldn't look at it that way. I've seen Regexps as part of languages
usually in well defined syntactical contexts. For example, like strings
are enclosed in "...", Regexps could be seen within /.../ delimiters.
GNU Awk (in recent versions) went towards first class "strongly typed"
Regexps which are then denoted by the @/.../ syntax.
>
I'm curious what you mean by Regexps presented in a "procedural" form.
Can you give some examples?
Anything that can be done in regex can obviously also be done procedurally.
At the point regex expressions become unwieldy - usually when substitution
variables raise their heads - I prefer procedural code as it's also often
easier to debug.
You haven't even tried to honestly answer my (serious) question.
With your statement above and your hostility below, it rather seems
you have no clue of what I am talking about.
In practice, given that a Regexp conforms to a FSA, any Regexp can be
precompiled and used multiple times. The thing I had used in Java - it
Precompiled regex is no more efficient than precompiled anything, it's all
just assembler at the bottom.
The Regexps are a way to specify the words of a regular language;
for pattern matching the expression gets interpreted or compiled; you
specify it, e.g., using strings of characters and meta-characters.
If you have a programming language where that string gets repeatedly
interpreted then it's slower than a precompiled Regexp expression.
I give you examples...
(1) DES encryption function
    (1a) ciphertext = des_encode (key, plaintext)
    (1b) cipher = des (key)
         ciphertext = cipher.encode (plaintext)
In case (1) you can either call the DES encryption (decryption) for
any (key, plaintext)-pair in a procedural function as in (1a), or
you can create the key-specific encryption once and encode various
texts with the same cipher object as in (1b).
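
To make the (1a)/(1b) shapes concrete, here is a minimal Python sketch.
It is NOT DES and not secure: a SHA-256 digest stands in for the
key-dependent setup, and the names toy_encode and ToyCipher are
hypothetical, chosen only to mirror des_encode and des above.

```python
import hashlib


def _key_setup(key: bytes) -> bytes:
    """Stand-in for a costly key-dependent setup step (NOT real DES)."""
    return hashlib.sha256(key).digest()


def _xor(stream: bytes, data: bytes) -> bytes:
    """XOR data with the repeating key stream (toy cipher, insecure)."""
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


def toy_encode(key: bytes, plaintext: bytes) -> bytes:
    # (1a) procedural form: the key setup is redone on every call.
    return _xor(_key_setup(key), plaintext)


class ToyCipher:
    # (1b) object form: the key setup is done once, in the constructor.
    def __init__(self, key: bytes) -> None:
        self._stream = _key_setup(key)

    def encode(self, plaintext: bytes) -> bytes:
        return _xor(self._stream, plaintext)


cipher = ToyCipher(b"secret")
for text in (b"first message", b"second message"):
    # Both forms give the same result; only the amount of repeated work differs.
    assert cipher.encode(text) == toy_encode(b"secret", text)
```

With a real cipher the setup step is the key schedule, so form (1b)
saves exactly that work on every call after the first.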
(2) regexp matching
    (2a) location = regexp (pattern, string)
    (2b) fsm = regexp (pattern)
         location = fsm.match (string)
In case (2) you can either do the match in a string with a pattern
in a procedural form as in (2a) or you can create the FSM for the
given Regexp just once and apply it on various strings as in (2b).
That's what I was talking about.
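
As an illustration of (2a) versus (2b), a small sketch using Python's
standard re module. The helper names find_2a and find_2b are
hypothetical. Two caveats: Python's engine is a backtracking matcher
rather than a pure FSM, and the re module keeps an internal cache of
recently compiled patterns, so in Python the measured difference
between the two forms is usually smaller than in a system that
re-interprets the pattern string on every call.

```python
import re

PATTERN = r"[0-9]+(ABC)?x*foo"


def find_2a(pattern: str, string: str) -> int:
    # (2a) pattern string and subject are passed together on every call.
    m = re.search(pattern, string)
    return m.start() if m else -1


def find_2b(compiled: re.Pattern, string: str) -> int:
    # (2b) the pattern was compiled once; only the subject varies.
    m = compiled.search(string)
    return m.start() if m else -1


fsm = re.compile(PATTERN)  # done once
for line in ("id 42ABCxxfoo", "7foo", "no match here"):
    # Same answer either way; (2b) skips the per-call pattern handling.
    assert find_2a(PATTERN, line) == find_2b(fsm, line)
```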
Only if the key (in (1)) or the pattern (in (2)) is static or "constant"
could that compilation (but only theoretically) be done in advance,
and an optimizing system may (or may not) precompile both to
[similar] assembler code. How should that work with regexps or DES?
The optimizing system would need knowledge of how to use the library
code (DES, Regexps, ...) to create binary structures based on the
algorithms (key initialization in DES, FSM generation in Regexps).
This is not done [statically].
Otherwise - i.e. the normal, expected case - there's an efficiency
difference to observe between the respective cases of (a) and (b).
then operate on that same object. (Since there's still typical Regexp
syntax involved I suppose that is not what you meant by "procedural"?)
If you don't know the difference between declarative syntax like regex and
procedural syntax then there's not much point continuing this discussion.
Why do you think so, and why are you saying that? - That wasn't and
still isn't the point. - You said upthread
"A lot of stuff I've seen done in regex would have been better done
procedurally at the expense of slightly more code but a LOT more
readability."
and I asked
"I'm curious what you mean by Regexps presented in a "procedural"
form. Can you give some examples?"
What you wanted to say wasn't clear to me, since you were complaining
about the _Regexp syntax_. So it couldn't have meant just writing
    regexp (pattern, string)   instead of   pattern ~ string
but rather to somehow(!) transform "pattern", say, like
/[0-9]+(ABC)?x*foo/, into something syntactically "better".
I was interested in that "somehow" (that I emphasized), and in an
example of what that would look like in your opinion.
If you're unable to answer that simple question then just take that
simple regexp /[0-9]+(ABC)?x*foo/ example and show us your preferred
procedural variant.
But my expectation is that you cannot provide any reasonable example
anyway.
Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.
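
For comparison, here is one possible hand-written rendering of
/[0-9]+(ABC)?x*foo/ in Python. No procedural variant was posted in the
thread, so this is only a sketch of what such code might look like;
the function names are hypothetical. It relies on the fact that no
part of this particular pattern can be confused with the part that
follows it, so a simple greedy scan suffices; patterns that need
backtracking would require more code.

```python
import re

DIGITS = "0123456789"


def find_procedural(s: str):
    """Return (start, end) of the leftmost match, or None."""
    n = len(s)
    for start in range(n):
        i = start
        # [0-9]+ : one or more ASCII digits
        while i < n and s[i] in DIGITS:
            i += 1
        if i == start:
            continue  # no digit at this position, try the next one
        # (ABC)? : optional literal "ABC"
        if s.startswith("ABC", i):
            i += 3
        # x* : zero or more 'x'
        while i < n and s[i] == "x":
            i += 1
        # foo : literal "foo"
        if s.startswith("foo", i):
            return (start, i + 3)
    return None


def find_regex(s: str):
    """The same search expressed with the regexp itself."""
    m = re.search(r"[0-9]+(ABC)?x*foo", s)
    return m.span() if m else None


for text in ("id 42ABCxxfoo!", "7foo", "ABCfoo", "12ABfoo", "1xfoo 2foo"):
    assert find_procedural(text) == find_regex(text)
```

The two functions agree on these inputs; the reader can judge which
form is easier to read and to maintain.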
Janis