Re: Static regex for embedded systems

Subject: Re: Static regex for embedded systems
From: stefan.news (at) *nospam* arcor.de (Stefan Reuther)
Newsgroups: comp.arch.embedded
Date: 22 Jan 2025, 17:53:15
Message-ID : <vmrbac.3i8.1@stefan.msgid.phost.de>
User-Agent : Mozilla/5.0 (Windows NT 6.1; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.12.1 Hamster/2.1.0.1538
On 22.01.2025 at 01:38, George Neuner wrote:
> On Tue, 21 Jan 2025 18:03:48 +0100, pozz <pozzugno@gmail.com> wrote:
>> (Personally, I have no problem with handcrafted parsers.)
>
> So long as they are correct 8-)

Correctness has an inverse correlation with complexity, so optimize for
non-complexity.

I would implement a two-stage parser: first break the input into lines
in a buffer, then throw a bunch of statements like

   if (Parser p(str); p.matchString("+")
         && p.matchTextUntil(":", &prefix)
         && p.matchWhitespace() ...)

at each line, with Parser being a small C++ class wrapping the
individual matching operations (strncmp, strspn, etc.).

Admittedly this is more complex than a regex/template, but still simple
enough to be "obviously correct".
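
To make that a bit more concrete, here is a minimal sketch of what such
a class could look like. Only the three method names come from the
snippet above; the constructor, rest(), the parseResponse() helper and
the "+TAG: 42" input format are made up for illustration. It needs C++17
for the if-with-initializer, and it uses std::string for brevity where a
small target would more likely use fixed-size buffers.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical minimal Parser: a cursor into a NUL-terminated line.
class Parser {
public:
    explicit Parser(const char* s) : p_(s) {}

    // Consume the literal `lit` if the remaining input starts with it.
    bool matchString(const char* lit) {
        std::size_t n = std::strlen(lit);
        if (std::strncmp(p_, lit, n) != 0) return false;
        p_ += n;
        return true;
    }

    // Copy everything up to the first character from `delims` into *out,
    // then consume that text and the delimiter. Fails if none is found.
    bool matchTextUntil(const char* delims, std::string* out) {
        std::size_t n = std::strcspn(p_, delims);
        if (p_[n] == '\0') return false;
        out->assign(p_, n);
        p_ += n + 1;
        return true;
    }

    // Consume one or more spaces or tabs.
    bool matchWhitespace() {
        std::size_t n = std::strspn(p_, " \t");
        if (n == 0) return false;
        p_ += n;
        return true;
    }

    // Unconsumed remainder of the line.
    const char* rest() const { return p_; }

private:
    const char* p_;
};

// Hypothetical use: accept lines of the form "+<prefix>: <value>".
// On failure, *prefix may already have been overwritten.
bool parseResponse(const char* line, std::string* prefix, std::string* value) {
    if (Parser p(line); p.matchString("+")
            && p.matchTextUntil(":", prefix)
            && p.matchWhitespace()) {
        *value = p.rest();
        return true;
    }
    return false;
}
```

Each method either consumes input and returns true, or leaves the cursor
alone and returns false, so a chain of && stops at the first mismatch.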

> Lex and Flex create table driven lexers (and driver code for them).
> Under certain circumstances Flex can create far smaller tables than
> Lex, but likely either would be massive overkill for the scenario you
> described.

Maybe, maybe not. I find it hard to extrapolate from the two examples
given to the complete task. If there are hundreds of these templates
that need to be matched piece by piece, my impression is that lex would
be a quick and easy way to pull them out of a byte stream.

But splitting it into lines first, and then tackling each line on its
own (...using lex, maybe? Or any other tool. Or a parser class.) might
be a good option as well. For example, this can answer the question
whether linefeeds are required to be \r\n, or whether a single \n also
suffices, in a central place. And if you decide that you want to do a
hard connection close if you see a \r or \n outside a \r\n sequence (to
prevent an attack such as SMTP smuggling), that would be easy.
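
A rough sketch of such a central line splitter, assuming the strict
policy (only \r\n terminates a line, anything else is an error). The
names SplitResult and splitLines are invented for this example, and
std::string/std::vector stand in for whatever buffers the target can
afford.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class SplitResult { Ok, BareLineEnding };

// Split `data` into lines terminated by "\r\n" and append them to *lines.
// A '\r' not followed by '\n', or a '\n' not preceded by '\r', yields
// BareLineEnding so the caller can drop the connection.
// Bytes after the last complete line are stored in *rest; the caller is
// expected to prepend them to the next chunk it receives.
SplitResult splitLines(const std::string& data,
                       std::vector<std::string>* lines,
                       std::string* rest) {
    std::size_t start = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\n') {
            // The '\n' of a valid "\r\n" is skipped below, so any '\n'
            // seen here is a bare one.
            return SplitResult::BareLineEnding;
        }
        if (data[i] == '\r') {
            if (i + 1 == data.size()) break;  // cannot decide yet
            if (data[i + 1] != '\n') return SplitResult::BareLineEnding;
            lines->emplace_back(data, start, i - start);
            ++i;            // skip the '\n'
            start = i + 1;  // next line begins after it
        }
    }
    rest->assign(data, start, std::string::npos);
    return SplitResult::Ok;
}
```

If you later decide that a single \n should be accepted after all, this
function is the only place that has to change.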


  Stefan

