Subject : Re: Static regex for embedded systems
From : gneuner2 (at) *nospam* comcast.net (George Neuner)
Newsgroups : comp.arch.embedded
Date : 23 Jan 2025, 00:33:52
Organisation : i2pn2 (i2pn.org)
Message-ID : <biv2pjp1kua2nbd1pu87mu85i3vnistcjg@4ax.com>
References : 1 2 3 4 5
User-Agent : ForteAgent/8.00.32.1272
On Wed, 22 Jan 2025 17:53:15 +0100, Stefan Reuther <stefan.news@arcor.de> wrote:
Am 22.01.2025 um 01:38 schrieb George Neuner:
On Tue, 21 Jan 2025 18:03:48 +0100, pozz <pozzugno@gmail.com> wrote:
(Personally, I have no problem with handcrafted parsers.)
So long as they are correct 8-)
>
Correctness has an inverse correlation with complexity, so optimize for
non-complexity.
>
I would implement a two-stage parser: first break the lines into a
buffer, then throw a bunch of statements like
>
if (Parser p(str); p.matchString("+")
&& p.matchTextUntil(":", &prefix)
&& p.matchWhitespace() ...)
>
at this, with Parser being a small C++ class wrapping the individual
matching operations (strncmp, strspn, etc.)
>
Surely this is more complex than a regex/template, but still easy enough
to be "obviously correct".
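
For illustration only, a minimal sketch of what such a Parser class might look like. The method names matchString, matchTextUntil and matchWhitespace are taken from the snippet above; their signatures and exact semantics, the atEnd() helper, and the use of a NUL-terminated C string as input are assumptions, not the poster's actual code.

```cpp
#include <cstring>
#include <string>

// Hypothetical Parser: each match*() consumes input on success and
// leaves the position untouched on failure.
class Parser {
public:
    explicit Parser(const char* s) : p_(s) {}

    // Consume `lit` if the remaining input starts with it.
    bool matchString(const char* lit) {
        std::size_t n = std::strlen(lit);
        if (std::strncmp(p_, lit, n) != 0) return false;
        p_ += n;
        return true;
    }

    // Copy the text before the first occurrence of `delim` into *out,
    // then consume that text and the delimiter. Fails if `delim` is absent.
    bool matchTextUntil(const char* delim, std::string* out) {
        const char* end = std::strstr(p_, delim);
        if (end == nullptr) return false;
        out->assign(p_, end);
        p_ = end + std::strlen(delim);
        return true;
    }

    // Skip a run of spaces and tabs. Assumption: an empty run also matches.
    bool matchWhitespace() {
        p_ += std::strspn(p_, " \t");
        return true;
    }

    // Assumed helper: true once the whole line has been consumed.
    bool atEnd() const { return *p_ == '\0'; }

private:
    const char* p_;  // current position in a NUL-terminated line
};
```

With something like this, each recognized line becomes one chained condition of the form shown above, and a failed chain simply falls through to the next candidate with a fresh Parser. On a target without heap allocation, std::string would have to be replaced by a fixed buffer or a pointer/length pair.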
>
Lex and Flex create table driven lexers (and driver code for them).
Under certain circumstances Flex can create far smaller tables than
Lex, but likely either would be massive overkill for the scenario you
described.
>
Maybe, maybe not. I find it hard to extrapolate to the complete task
from the two examples given. If there's hundreds of these templates,
that need to be matched bit-by-bit, I have the impression that lex would
be a quick and easy way to pull them out of a byte stream.
Agreed, the task is ambiguous, but my (possibly very wrong) impression
was of a relatively simple parser needing to recognize just a handful
of "commands".
"hundreds of templates" ... where "template" implies to me that there
is inline data to be extracted ... is more a job for Yacc/Bison than
for Lex/Flex.
But splitting it into lines first, and then tackling each line on its
own (...using lex, maybe? Or any other tool. Or a parser class.) might
be a good option as well. For example, this can answer the question
whether linefeeds are required to be \r\n, or whether a single \n also
suffices, in a central place. And if you decide that you want to do a
hard connection close if you see a \r or \n outside a \r\n sequence (to
prevent an attack such as SMTP smuggling), that would be easy.
>
Stefan
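
As a rough sketch of the line-splitting stage described above, assuming the strict policy (only \r\n ends a line, and any bare \r or \n is treated as an error so the caller can close the connection). The names splitLines and LineStatus are hypothetical, and std::string/std::vector are used for brevity rather than as a recommendation for a small target.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class LineStatus { Ok, BareCrOrLf };

// Hypothetical helper: move every complete "\r\n"-terminated line from
// `data` into `lines` (without the terminator). Bytes after the last
// complete line stay in `data` for the next call. On BareCrOrLf the
// contents of `data` and `lines` should be discarded by the caller.
LineStatus splitLines(std::string& data, std::vector<std::string>& lines) {
    std::size_t start = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        char c = data[i];
        if (c == '\n') return LineStatus::BareCrOrLf;  // LF with no CR before it
        if (c == '\r') {
            if (i + 1 == data.size()) break;           // CR at end: wait for more input
            if (data[i + 1] != '\n') return LineStatus::BareCrOrLf;
            lines.emplace_back(data, start, i - start);
            ++i;                                       // skip the LF of the pair
            start = i + 1;
        }
    }
    data.erase(0, start);  // keep only the unfinished tail
    return LineStatus::Ok;
}
```

The receive path would append incoming bytes to the buffer, call splitLines, and hand each resulting line to the per-line matcher. Accepting a lone \n instead would be a change confined to this one function.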