From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: 18 January 2016 08:37
To: user@hive.apache.org
Subject: Re: optimize joins in hive 1.2.1
Do you have a data model?
Basically, modern technologies such as Hive, and also relational databases,
favor prejoining tables and working on big flat tables. The reason is that
they are distributed systems, and you should avoid transferring a lot of data
between nodes for each query.
Hence,
Hi Divya
Below are some quick tips that always help:
1. Partition your data set and filter on the partition keys when selecting
data, to reduce the amount of data read.
2. Also, if both data sets can be joined on the same partition key, then use
it in the join.
3. If one table being joined is a small table then you can
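
Tips 1 and 2 above might look roughly like this in HiveQL. This is only a
sketch: the tables orders and order_items, their columns, and the partition
column order_date are hypothetical and do not come from this thread.

```sql
-- Hypothetical tables, both partitioned by the same key (order_date).
CREATE TABLE orders (
  order_id    BIGINT,
  customer_id BIGINT
)
PARTITIONED BY (order_date STRING);

CREATE TABLE order_items (
  order_id BIGINT,
  sku      STRING,
  quantity INT
)
PARTITIONED BY (order_date STRING);

-- Tip 1: filter on the partition key so only the matching partitions are read.
-- Tip 2: include the shared partition key in the join condition.
SELECT o.order_id, o.customer_id, i.sku, i.quantity
FROM orders o
JOIN order_items i
  ON  o.order_date = i.order_date
  AND o.order_id   = i.order_id
WHERE o.order_date = '2016-01-17'
  AND i.order_date = '2016-01-17';
```

Running EXPLAIN DEPENDENCY on the SELECT should list the partitions it would
read, which is one way to check that the filter really prunes partitions.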
Hi,
Need tips/guidance to optimize (increase the performance of) joins over
billions of rows of data in Hive.
Any help would be appreciated.
Thanks,
Divya