What am I missing? Handwavingly, start with the first digit, and as
long as the next character is a digit, multiply the accumulated result
by 10 (or the appropriate base) and add the next value. Oh, and handle
scientific notation as a special case, and perhaps fail spectacularly
instead of recovering gracefully in certain edge cases. And in the
pathological case of a single number with 60 billion digits, run out of
memory (and complain loudly to the person who claimed that the file
contained a "dataset"). But why do I need to start with the least
significant digit?
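
To make the handwaving concrete, here is a minimal sketch of that accumulate-as-you-go loop. The name `parse_ints` and the `DIGITS` table are made up for illustration, and the sketch only handles unsigned integers: no signs, no fractions, no scientific notation.

```python
from typing import Iterable, Iterator

# Digit alphabet for bases up to 36 (hypothetical helper table).
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def parse_ints(chars: Iterable[str], base: int = 10) -> Iterator[int]:
    """Yield every run of digits in `chars` as an int, reading left to right."""
    acc = None  # None means "not currently inside a number"
    for ch in chars:
        # Look the character up among the first `base` digits only.
        value = DIGITS.find(ch.lower(), 0, base)
        if value >= 0:
            # Most significant digit first: shift what we have, add the new digit.
            acc = value if acc is None else acc * base + value
        elif acc is not None:
            # A non-digit ends the current number.
            yield acc
            acc = None
    if acc is not None:
        # The input ended while still inside a number.
        yield acc
```

Since `chars` can be any iterable of single characters, the input never has to be in memory all at once, and the digits are consumed in the order they arrive, most significant first. A single 60-billion-digit number would still blow up `acc`, of course.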
> Streaming won't work because the file is gzipped. You have to receive
> the whole thing before you can unzip it. Once unzipped it will be even
> larger, and all in memory.
GZip is specifically designed to be streamed, so that's not a problem
(in principle). You would need a streaming GZip parser, though; a quick
search on PyPI turned up this package:
https://pypi.org/project/gzip-stream/
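
As a rough illustration of the principle, here is a sketch that uses only the standard library's `zlib` rather than the package above (whose API I have not checked). The name `gunzip_chunks` is made up, and the sketch assumes a single-member gzip stream arriving as an iterable of byte chunks.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def gunzip_chunks(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Decompress a gzip stream piece by piece, yielding data as it becomes available."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    # Emit whatever is still buffered once the input is exhausted.
    tail = decomp.flush()
    if tail:
        yield tail


# Small self-check: compress a payload, feed it back in 64-byte pieces.
payload = b'{"x": [1, 2, 3]}' * 1000
blob = gzip.compress(payload)
pieces = [blob[i:i + 64] for i in range(0, len(blob), 64)]
restored = b"".join(gunzip_chunks(pieces))
```

Each compressed chunk can be dropped as soon as it has been fed in, so neither the whole compressed file nor the whole decompressed output needs to sit in memory, provided the consumer of the output is also incremental.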

On Mon, Sep 30, 2024 at 6:20 PM Thomas Passin via Python-list
<python-list@python.org> wrote:
> On 9/30/2024 11:30 AM, Barry via Python-list wrote:
>> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list <python-list@python.org> wrote:
>>>
>>> import polars as pl
>>> pl.read_json("file.json")
>>>
>> This is not going to work unless the computer has a lot more than 60 GiB of RAM.
>>
>> As later suggested, a streaming parser is required.
>
> Streaming won't work because the file is gzipped. You have to receive
> the whole thing before you can unzip it. Once unzipped it will be even
> larger, and all in memory.