Subject: Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
From: olegsivokon (at) *nospam* gmail.com (Left Right)
Newsgroups: comp.lang.python
Date: 02 Oct 2024, 07:05:02
Message-ID: <mailman.27.1727877147.3018.python-list@python.org>
By that definition of "streaming", no parser can ever be streaming,
because there will be some constructs that must be read in their
entirety before a suitably-structured piece of output can be
emitted.
In the same email you replied to, I gave examples of languages for
which parsers can be streaming (in general): SCSI or IP. For some
languages (e.g., everything in the context-free family), streaming
parsers are _in general_ impossible, because there are pathological
cases like the one with parsing numbers. But this doesn't mean that
you cannot come up with a parser that is only useful _sometimes_.
And, in practice, languages like XML or JSON do well with streaming,
even though in general it's impossible.
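
To make the "only useful _sometimes_" point concrete, here is a minimal
sketch of an incremental reader for a top-level JSON array, built on the
standard library's json.JSONDecoder.raw_decode. The function name
iter_array_items and the chunking are made up for illustration; this is
not what any particular library does. It emits each element as soon as
the delimiter after it has been seen. That is one way to read the number
case: "12" at the end of a chunk may be the start of "123", so nothing
can be emitted until a "," or "]" arrives.

```python
import json


def iter_array_items(chunks):
    """Yield the elements of a top-level JSON array from text chunks.

    Hypothetical helper, illustration only. It is lenient about stray
    commas and re-parses the pending element on every new chunk.
    """
    decoder = json.JSONDecoder()
    buf = ""          # unparsed tail of the input seen so far
    started = False   # True once the opening "[" has been consumed
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip()
            if not buf:
                break  # nothing left to look at, need more input
            if not started:
                if buf[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buf, started = buf[1:], True
                continue
            if buf[0] == "]":
                return  # end of the array
            if buf[0] == ",":
                buf = buf[1:]
                continue
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # element is incomplete so far, need more input
            rest = buf[end:].lstrip()
            if rest[:1] not in (",", "]"):
                # No delimiter yet: a number such as "45" or "45e" may
                # still continue in the next chunk, so do not emit it.
                break
            yield value
            buf = rest
    raise ValueError("input ended before the array was closed")


# The chunk boundaries deliberately split a number and an exponent.
chunks = ['[1', '23, {"a"', ': [1, 2]}, 4', '5e', '1, "x"]']
items = list(iter_array_items(chunks))
```

For an array of modest-sized records this keeps only one element in
memory at a time. A single enormous element (one huge number or string)
still has to be buffered whole, which is the pathological case.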
I'm sorry if this comes as a surprise. On one hand, I don't want to
sound condescending; on the other hand, this is something that you'd
typically study in an automata theory class. Well, not exactly in the
very same words, but you should be able to figure this stuff out if
you had taken that class.