Thanks Bejoy

On Tue, Aug 14, 2012 at 2:00 PM, Bejoy Ks <bejoy...@yahoo.com> wrote:
> Hi Sudeep
>
> You can also look at join optimizations like map join, bucketed map
> join, sort merge join etc. and choose the right one that fits your
> requirement.
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins
>
> Regards,
> Bejoy KS
>
> ------------------------------
> *From:* sudeep tokala <sudeeptok...@gmail.com>
> *To:* user@hive.apache.org
> *Sent:* Tuesday, August 14, 2012 11:00 PM
> *Subject:* Re: OPTIMIZING A HIVE QUERY
>
> hi Bertrand,
>
> Thanks for the reply.
>
> My question was: every join in a Hive query translates into a MapReduce
> job, and a MapReduce job goes through serialization and deserialization
> of objects. Isn't that an overhead?
>
> "Store data in a smarter way"? Can you please elaborate on this?
>
> Regards
> Sudeep
>
> On Tue, Aug 14, 2012 at 11:39 AM, Bertrand Dechoux <decho...@gmail.com> wrote:
>
> You may want to be clearer. Is your question: how can I change the
> serialization strategy of Hive? (If so, I'll let other users answer; I am
> also interested in the answer.)
>
> Otherwise the answer is simple. If you want to join data which cannot be
> held in memory, you need to serialize it. The only alternative is to
> store the data in a smarter way which would not require you to do the join.
> By the way, how do you know the serialization is the bottleneck?
>
> Bertrand
>
> On Tue, Aug 14, 2012 at 5:11 PM, sudeep tokala <sudeeptok...@gmail.com> wrote:
>
> On Tue, Aug 14, 2012 at 11:08 AM, sudeep tokala <sudeeptok...@gmail.com> wrote:
>
> Hi all,
>
> How do I avoid serialization and deserialization overhead in a Hive join
> query? Will this improve my query performance?
>
> Regards
> sudeep
>
> --
> Bertrand Dechoux
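The join optimizations Bejoy mentions can be sketched in HiveQL roughly as below. The table names (`orders`, `customers`, and their bucketed variants) are hypothetical; the `SET` properties are standard Hive settings, though exact behavior depends on the Hive version:

```sql
-- Map join: the small table is loaded into memory on each mapper,
-- so the reduce-side shuffle (and its serialization of join rows)
-- is avoided entirely. Table names here are made up for illustration.
SET hive.auto.convert.join=true;   -- let Hive convert to a map join when one side is small
SELECT /*+ MAPJOIN(c) */ o.order_id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id;

-- Bucketed map join / sort-merge bucket join: if both tables are
-- bucketed (and sorted) on the join key with compatible bucket counts,
-- matching buckets can be joined directly without a full shuffle.
SET hive.optimize.bucketmapjoin=true;
SET hive.optimize.bucketmapjoin.sortedmerge=true;
SELECT o.order_id, c.name
FROM orders_bucketed o
JOIN customers_bucketed c ON o.customer_id = c.id;
```

This does not eliminate serialization as such, but a successful map join removes the shuffle stage where most of the SerDe overhead in a common join occurs.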