Hello Feras,

`DELTA_BINARY_PACKED` is currently only implemented in parquet-cpp on the
read path; the corresponding encoder implementation is still missing.

The change in file size is something I also don't understand. The only
difference between the two format versions here is that with version 1 we
encode uint32 columns as INT64, whereas with version 2 we can encode them as
UINT32, a type that is not available in version 1. It would be nice if you
could narrow down the issue, e.g. to the column that causes the increase in
size. You could also use the Java parquet-tools or parquet-cli to inspect the
size statistics of the individual parts of the Parquet file.

Uwe

On Fri, May 11, 2018, at 3:07 AM, Feras Salim wrote:
> Hi, I was wondering if I'm missing something, or if `DELTA_BINARY_PACKED`
> is currently only available for reading Parquet files. I can't find a way
> for the writer to encode timestamp data with `DELTA_BINARY_PACKED`.
> Furthermore, I see about a 10% increase in final file size when I change
> from version 1 to version 2 without changing anything else about the
> schema or data.
