Re: PyArrow and Parquet DELTA_BINARY_PACKED

2018-05-18 Thread Feras Salim
f
> this were the case with larger files, though (I'm not sure what
> fraction of a column chunk consists of data page headers vs. actual
> data in practice)
>
> - Wes
>
> On Tue, May 15, 2018 at 12:17 AM, Feras Salim wrote:
> > Hi Uwe,
> >
> > I'm qui

Re: PyArrow and Parquet DELTA_BINARY_PACKED

2018-05-14 Thread Feras Salim
ize statistics of the parts of the individual Parquet file.
>
> Uwe
>
> On Fri, May 11, 2018, at 3:07 AM, Feras Salim wrote:
> > Hi, I was wondering if I'm missing something or currently the
> > `DELTA_BINARY_PACKED` is only available for reading when it comes to
> > p

PyArrow and Parquet DELTA_BINARY_PACKED

2018-05-10 Thread Feras Salim
Hi, I was wondering whether I'm missing something, or whether `DELTA_BINARY_PACKED` is currently only available for reading Parquet files. I can't find a way for the writer to encode timestamp data with `DELTA_BINARY_PACKED`. Furthermore, I seem to get about a 10% increase in final file size wh
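
For context on why `DELTA_BINARY_PACKED` might suit timestamp columns: the encoding stores a first value and then bit-packs the differences between consecutive values, each taken relative to the smallest difference in its block. The following is a minimal pure-Python sketch of that idea only. It is not PyArrow's or parquet-cpp's implementation; the function name `delta_summary` and the sample timestamps are hypothetical, and the block/miniblock layout of the real format is not modelled.

```python
def delta_summary(values):
    """Return (first_value, min_delta, bit_width) for a list of ints.

    bit_width is the number of bits needed per value once every delta
    is stored relative to the smallest delta (illustrative only).
    """
    if len(values) < 2:
        return (values[0] if values else None, 0, 0)
    # Differences between consecutive values.
    deltas = [b - a for a, b in zip(values, values[1:])]
    min_delta = min(deltas)
    # Subtracting the minimum makes every stored delta non-negative.
    adjusted = [d - min_delta for d in deltas]
    bit_width = max(adjusted).bit_length()
    return values[0], min_delta, bit_width


# Hypothetical millisecond timestamps, roughly one second apart.
timestamps = [1526000000000, 1526000001000, 1526000002003, 1526000002999]
first, min_delta, width = delta_summary(timestamps)
plain_width = max(timestamps).bit_length()
print(f"first={first} min_delta={min_delta} "
      f"bits per delta={width} vs bits per raw value={plain_width}")
```

With nearly regular timestamps the adjusted deltas are tiny compared with the raw values, which is where the expected size reduction would come from; irregular data would need wider bit widths and benefit less.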