As suggested, looking at the explain plan should tell you whether a
map-join is being used.
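
A minimal sketch of that check, assuming hypothetical tables a, b1 and b2 joined on a hypothetical key column id (the join conditions of the original query are not shown in this thread). If the join was converted, the plan should list a "Map Join Operator" instead of a reduce-side join operator:

```sql
-- Table names and join keys are hypothetical, for illustration only.
EXPLAIN
SELECT a.col1, b1.col1, b2.col1
FROM a
JOIN b1 ON a.id = b1.id
JOIN b2 ON a.id = b2.id;
```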
Using a recent version of Hive on Tez would also give you a further
speedup, as map-joins are optimized further there.
On Tue, Mar 15, 2016 at 9:32 AM, sreebalineni . wrote:
> You can think of map joins.If clus
How about using Hive on Spark? I assume A is your fact table and the rest of
your tables are dimensions.
20 million rows is not that big. Is your fact table partitioned and, more
importantly, clustered (bucketed) by your dimension keys?
CLUSTERED BY (
prod_id,
cust_id,
time_id,
channel_id,
promo_i
> I have a query where I am joining with 10 other entities
Are you using Tez?
This looks like an obvious candidate for a broadcast join.
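
A sketch of the session settings that usually matter for this, assuming a Hive build with Tez available. The size threshold is an illustrative value, not a recommendation:

```sql
-- Run the query on Tez instead of MapReduce.
SET hive.execution.engine=tez;
-- Let Hive convert joins against small tables into map (broadcast) joins.
SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;
-- Combined size, in bytes, of the small tables that may be held in memory.
-- Illustrative value only; tune it to the memory of your containers.
SET hive.auto.convert.join.noconditionaltask.size=100000000;
```

On Tez, a converted join typically shows up in the explain plan as a BROADCAST_EDGE between vertices.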
Cheers,
Gopal
You can consider map joins. If the cluster is configured with the defaults,
this should already be happening; check the query profile.
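
One way to see how the cluster is actually configured is to print the relevant settings from the Hive CLI or Beeline. This is only a sketch; SET with a property name and no value prints the current value without changing it:

```sql
-- Whether Hive converts joins with small tables to map joins automatically.
SET hive.auto.convert.join;
-- Size thresholds (in bytes) that decide which tables count as "small".
SET hive.auto.convert.join.noconditionaltask.size;
SET hive.mapjoin.smalltable.filesize;
```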
On Tue, 15 Mar 2016 21:12 Himabindu sanka,
wrote:
> Hi Team,
>
> I have a query where I am joining with 10 other entities
>
> Like
>
> Select a.col1,b1.col1,b2.col1 from
>