Uwe, I am not using any options on either call. Should I be? I forgot to mention earlier that the Parquet file came from Spark/PySpark.
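For reference, a minimal sketch of the read path in question, plus a row-group-at-a-time variant that might bound peak memory. The file name is a placeholder for the Spark-written file, and both calls in the first part use plain defaults with no options:

```python
import pyarrow.parquet as pq

# The read path in question: plain defaults, no options on either call.
# "data.parquet" is a placeholder for the Spark-written file.
table = pq.read_table("data.parquet")
df = table.to_pandas()

# A possible way to bound peak memory: read one row group at a time
# instead of materializing the whole file at once. Spark usually writes
# several row groups per file, so this processes the data in chunks.
pf = pq.ParquetFile("data.parquet")
for i in range(pf.num_row_groups):
    chunk = pf.read_row_group(i)  # a pyarrow.Table for just this row group
    # ... process chunk (e.g. chunk.to_pandas()) and drop it before the next
```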
On Wed, Apr 25, 2018 at 1:32 PM Uwe L. Korn <uw...@xhochy.com> wrote:

> Hello Bryant,
>
> are you using any options on `pyarrow.parquet.read_table` or a possible
> `to_pandas` afterwards?
>
> Uwe
>
> On Wed, Apr 25, 2018, at 7:27 PM, Bryant Menn wrote:
> > I tried reading a Parquet file (<200MB, lots of text with snappy) using
> > read_table and saw the memory usage peak over 8GB before settling back
> > down to ~200MB. This surprised me, as I was expecting to be able to
> > handle a Parquet file of this size with much less RAM (doing some
> > processing with smaller VMs).
> >
> > I am not sure if this is expected, but I thought I might check with
> > everyone here and learn something new. Poking around, it seems to be
> > related to ParquetReader.read_all?
> >
> > Thanks in advance,
> > Bryant