Subject: Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
From: olegsivokon (at) *nospam* gmail.com (Left Right)
Newsgroups: comp.lang.python
Date: 30 Sep 2024, 21:30:06
Message-ID : <mailman.13.1727724684.3018.python-list@python.org>
> Streaming won't work because the file is gzipped. You have to receive
> the whole thing before you can unzip it. Once unzipped it will be even
> larger, and all in memory.
GZip is specifically designed to be streamed, so that's not a problem
(in principle). You would need a streaming GZip decompressor, though. A
quick search on PyPI turned up this package:
https://pypi.org/project/gzip-stream/ .
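
To illustrate the point about streaming, here is a minimal sketch that uses only the standard library's zlib module, not the gzip-stream package linked above (I have not checked its API). It decompresses gzip data chunk by chunk, so the compressed input never has to be held in memory all at once. The fake_download() helper is a hypothetical stand-in for whatever yields chunks from the network, and the payload is made-up sample data.

```python
import gzip
import io
import zlib


def iter_gunzip(chunks):
    """Yield decompressed bytes from an iterable of gzip-compressed chunks."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
    for chunk in chunks:
        data = decomp.decompress(chunk)
        if data:
            yield data
    # Emit whatever is still buffered once the input is exhausted.
    tail = decomp.flush()
    if tail:
        yield tail


def fake_download(payload, chunk_size=16):
    """Hypothetical stand-in for a network response: yields small compressed chunks."""
    stream = io.BytesIO(gzip.compress(payload))
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Made-up sample data: 1000 small JSON lines.
payload = b'{"id": 1}\n' * 1000
restored = b"".join(iter_gunzip(fake_download(payload)))
```

The join at the end is only there to check the round trip. For the 60 GB case you would hand each decompressed piece to an incremental JSON parser instead of collecting them. Note also that this sketch assumes a single gzip member; a file made of several concatenated members would need extra handling.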
On Mon, Sep 30, 2024 at 6:20 PM Thomas Passin via Python-list
<python-list@python.org> wrote:
>
> On 9/30/2024 11:30 AM, Barry via Python-list wrote:
> >
> > > On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list <python-list@python.org> wrote:
> > >
> > > import polars as pl
> > > pl.read_json("file.json")
> >
> > This is not going to work unless the computer has a lot more than 60GiB of RAM.
> >
> > As later suggested, a streaming parser is required.
>
> Streaming won't work because the file is gzipped. You have to receive
> the whole thing before you can unzip it. Once unzipped it will be even
> larger, and all in memory.