Re: Metadata for partitioned datasets in pyarrow.parquet

2019-05-21 Thread Richard Zamora
> But it would be good if some people more familiar with Parquet could chime
> in here.
>
> Best,
> Joris
>
> On Thu, 16 May 2019 at 16:37, Richard Zamora wrote:
>
>> Note that I was asked to post here after making a similar comment

Metadata for partitioned datasets in pyarrow.parquet

2019-05-16 Thread Richard Zamora
Note that I was asked to post here after making a similar comment on GitHub (https://github.com/apache/arrow/pull/4236)… I am hoping to help improve the use of pyarrow.parquet within dask (https://github.com/dask/dask). To this end, I put together a simple notebook to explore how pyarrow.parque