WHERE attribute_X1='1' AND attribute_X2='1'
) at ON jt.customerId = at.customerId
From: ptrst
To: user@hive.apache.org; Sanjay Subramanian
Sent: Wednesday, October 22, 2014 1:02 AM
Subject: Re: Optimize hive external tables with serde
ad 1) My files are not bigger than the block size. To be precise, all data from
one day is up to 2GB gzipped; unzipped it is ~15GB. The data is split in
one folder into files smaller than the block size (the block size in my case is 128MB,
files are ~100MB). I can transform them to another format if you think it
wil
1. The gzip files are not splittable, so gzip itself will make the queries
slower.
2. As a reference for JSON SerDes, here is an example from my blog:
http://bigdatalatte.wordpress.com/2014/08/21/denormalizing-json-arrays-in-hive/
3. Need to see your query first to try and optimize it
4. Even if y