Re: Enabling Snappy compression on Parquet

2018-08-22 Thread Patrick Duin
Thanks both for explaining! Snappy is doing fine for me at the moment, but I was curious about the other options. I'll have a look at the parquet tool and see if that can help me a bit as well.
On Wed, 22 Aug 2018 at 08:05, Jörn Franke wrote:
> No parquet and orc have internal compression which

Re: Enabling Snappy compression on Parquet

2018-08-21 Thread Jörn Franke
No, Parquet and ORC have internal compression, which must be used instead of the external compression that you are referring to. Internal compression can be decompressed in parallel, which is significantly faster. Internally, Parquet supports only snappy, gzip, lzo, brotli (2.4), lz4 (2.4), zstd (2.4).
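To illustrate why independently compressed blocks can be decompressed in parallel, here is a minimal sketch using only the Python standard library. It is not Parquet's actual on-disk layout: the fixed-size chunks stand in for Parquet's internally compressed units, zlib stands in for snappy (which is not in the standard library), and the helper names `compress_in_chunks` and `decompress_in_parallel` are made up for this example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def compress_in_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Compress each fixed-size chunk on its own, so every chunk is a
    self-contained compressed stream (a stand-in for internal compression)."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def decompress_in_parallel(chunks: List[bytes], workers: int = 4) -> bytes:
    """Decompress the independent chunks concurrently; pool.map keeps the
    original order, so the pieces can simply be joined back together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, chunks))


# Hypothetical sample data: 50,000 identical 20-byte rows.
data = b"hive,parquet,snappy\n" * 50_000

# "Internal" style: many small independent streams.
chunks = compress_in_chunks(data, chunk_size=64 * 1024)
restored = decompress_in_parallel(chunks)

# "External" style: one stream over the whole file, which a single
# reader has to decompress from start to end.
whole_file = zlib.compress(data)
```

With an externally compressed file such as `whole_file`, a reader has to work through one stream sequentially, whereas the chunked form lets several workers each take a piece.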

Re: Enabling Snappy compression on Parquet

2018-08-21 Thread Tanvi Thacker
Hi Patrick,
*What are other formats supported?* - As far as I know, you can set any compression with any format (ORC, Text with snappy, gzip, etc.). Are you looking for any specific format or compression?
*How can I verify a file is compressed and with what algorithm?* - You may check parquet-tools

Enabling Snappy compression on Parquet

2018-08-10 Thread Patrick Duin
Hi, I have some Hive tables in Parquet format and I am trying to find out how best to enable compression. I've done a bit of searching and the information is a bit scattered, but I found I can use this Hive property to enable compression. It needs to be set before doing an insert. set parquet.compressio