> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins
>
> Regards,
> Bejoy KS
>
> --
> *From:* sudeep tokala
> *To:* user@hive.apache.org
> *Sent:* Tuesday, August 14, 2012 11:00 PM
> *Subject:* Re: OPTIMIZING A HIVE QUERY
>
Thanks for the reply Bertrand.
On Tue, Aug 14, 2012 at 2:12 PM, Bertrand Dechoux wrote:
>
> > My question was: does every join in a Hive query constitute a
> > MapReduce job?
>
> In the general case, yes. BUT if one side of your join is small enough
> (i.e. you can keep it all in memory), a hash join/map join can be
> performed, which is much more performant (no reduce phase is required).
> Bejoy KS has just provided the relevant link.
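
For reference, a minimal sketch of the map join described above (the table
names are placeholders, not from the thread):

  -- Let Hive convert a join to a map join automatically when the smaller
  -- side fits in memory (the size threshold is configurable via
  -- hive.mapjoin.smalltable.filesize).
  set hive.auto.convert.join=true;

  -- Or request a map join explicitly with a hint; small_table is assumed
  -- to be the side that can be held in memory.
  SELECT /*+ MAPJOIN(small_table) */ big_table.key, big_table.value
  FROM big_table
  JOIN small_table ON (big_table.key = small_table.key);

With a map join, the small table is loaded into an in-memory hash table on
each mapper, so the join completes map-side and no reduce phase is needed.
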
hi Bertrand,
Thanks for the reply.
My question was: does every join in a Hive query constitute a MapReduce
job?
A MapReduce job goes through serialization and deserialization of objects.
Isn't that an overhead?
Store data in a smarter way? Can you please elaborate on this?
Regards
Sudeep

You may want to be clearer. Is your question: how can I change the
serialization strategy of Hive? (If so, I'll let other users answer; I am
also interested in the answer.)
Else the answer is simple. If you want to join data which cannot be stored
in memory, you need to serialize it. The only real lever is to store the
data in a smarter way.
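
As one illustration of "storing the data in a smarter way" (table and
column names below are assumed for the example), a binary columnar format
such as RCFile is much cheaper to deserialize than delimited text:

  -- Create the table in RCFile format instead of plain text.
  CREATE TABLE page_views_rc (user_id BIGINT, url STRING)
  STORED AS RCFILE;

  -- Convert an existing text-format table (name assumed) once, so that
  -- later join queries read the cheaper binary representation.
  INSERT OVERWRITE TABLE page_views_rc
  SELECT user_id, url FROM page_views_text;
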
On Tue, Aug 14, 2012 at 11:08 AM, sudeep tokala wrote:
> Hi all,
>
> How can I avoid serialization and deserialization overhead in a Hive join
> query? Will this optimize my query performance?
>
> Regards
> sudeep
>