Thanks both for explaining!
Snappy is doing fine for me at the moment but I was curious about the other
options.
I'll have a look at the parquet tool and see if that can help me a bit as
well.
On Wed, 22 Aug 2018 at 08:05, Jörn Franke wrote:
No, Parquet and ORC have internal compression, which must be used instead of
the external compression that you are referring to.
Internal compression can be decompressed in parallel, which is significantly
faster. Internally, Parquet supports only snappy, gzip, lzo, brotli (2.4), lz4
(2.4), and zstd (2.4).
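
To illustrate why independently compressed blocks can be decompressed in parallel, here is a toy sketch in Python. It is not Parquet's or ORC's actual on-disk layout: it uses zlib from the standard library as a stand-in codec, and the block size and function names are hypothetical, chosen only for this example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # hypothetical block size, chosen only for this demo


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Compress each fixed-size block on its own ("internal" style)."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def decompress_blocks_parallel(blocks: list[bytes], workers: int = 4) -> bytes:
    """Decompress independent blocks concurrently and join them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the payload is reassembled correctly.
        return b"".join(pool.map(zlib.decompress, blocks))


def compress_whole(data: bytes) -> bytes:
    """Compress the whole payload as one stream ("external" style)."""
    return zlib.compress(data)


if __name__ == "__main__":
    payload = b"hive,parquet,snappy\n" * 50_000
    blocks = compress_blocks(payload)
    assert decompress_blocks_parallel(blocks) == payload
    assert zlib.decompress(compress_whole(payload)) == payload
    print(f"{len(blocks)} independent blocks vs. 1 monolithic stream")
```

Each block can go to a different worker because it does not depend on its neighbours, whereas the single stream has to be read from the start by one reader.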
Hi Patrick,
*What other formats are supported?*
- As far as I know, you can set any compression with any format (ORC, Text
with snappy, gzip, etc.). Are you looking for any specific format or
compression?
How can I verify a file is compressed and with what algorithm?
- You can check with parquet-tools.
Hi,
I have some Hive tables in Parquet format and I am trying to find out how
best to enable compression.
I've done a bit of searching and the information is a bit scattered, but I
found I can use this Hive property to enable compression. It needs to be set
before doing an insert.
set parquet.compressio