Re: Understanding ORC file format compression

2015-06-21 Thread Daniel Haviv
Hi Sreejesh,

The data in an ORC file is divided into stripes, and within each stripe the columns are divided into column groups. Compression is applied at the column-group level, so to answer your question: ORC files are splittable no matter which codec is used.

Daniel

> On 21 June 2015, at 10:56, sreejesh s
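To illustrate the point about compressing small units rather than the whole file, here is a minimal Python sketch. It is not the real ORC layout or reader: `write_stripes` and `read_stripe` are hypothetical helpers, and each "stripe" is simply an independently zlib-compressed block of rows. It only shows why a non-splittable codec does not make the file non-splittable when every block is compressed on its own.

```python
import zlib

# Hypothetical, simplified model (not the actual ORC format): each "stripe"
# is compressed on its own, so a reader can decode any one of them without
# touching the others.
def write_stripes(rows, rows_per_stripe):
    """Return a list of independently zlib-compressed byte blocks."""
    stripes = []
    for start in range(0, len(rows), rows_per_stripe):
        chunk = rows[start:start + rows_per_stripe]
        payload = "\n".join(chunk).encode("utf-8")
        stripes.append(zlib.compress(payload))
    return stripes


def read_stripe(stripes, index):
    """Decompress a single block, as one map task might for its own split."""
    return zlib.decompress(stripes[index]).decode("utf-8").split("\n")


rows = [f"row-{i}" for i in range(10)]
stripes = write_stripes(rows, rows_per_stripe=4)

# Ten rows at four rows per block give three blocks; each one can be handed
# to a separate reader and decoded in isolation.
print(len(stripes))
print(read_stripe(stripes, 1))
```

If the whole file were instead passed through `zlib.compress` once, a reader starting in the middle could not decode its part without processing everything before it, which is the situation the original question was worried about.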

Understanding ORC file format compression

2015-06-21 Thread sreejesh s
Hi,

As per my understanding, the available codecs for compressing a Hive table stored in the ORC file format are Zlib and Snappy. Both compression techniques are non-splittable. Does that mean queries on a Hive table stored as ORC and compressed will not run multiple map tasks in parallel? I know t