Re: [jira] [Created] (HIVE-2845) Add support for index joins in Hive

Mahsa Mofidpoor Tue, 10 Jul 2012 17:13:40 -0700

Hello all,

On Tue, Mar 6, 2012 at 8:31 PM, Namit Jain (Created) (JIRA) <j...@apache.org
> wrote:


> Add support for index joins in Hive
> -----------------------------------
>
>                  Key: HIVE-2845
>                  URL: https://issues.apache.org/jira/browse/HIVE-2845
>              Project: Hive
>           Issue Type: New Feature
>             Reporter: Namit Jain
>
>
> Hive supports indexes, which are used for filters currently.
>
> It would be very useful to add support for index-based joins in Hive.
> If 2 tables A and B are being joined, and an index exists on the join key
> of A,
> B can be scanned (by the mappers), and for each row in B, a lookup for the
> corresponding row in A can be performed.


According to
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins, only
the last table, which is the one that is streamed, could be scanned by an
index, and in this case that table is B. Please correct me if I'm wrong.
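
To check that I am reading the proposal correctly, here is a minimal Python
sketch of the lookup described in the issue. It is only an illustration, not
Hive code: build_index and index_join are hypothetical names, the tables are
made-up sample rows, and an in-memory dict stands in for a Hive index on the
join key of A.

```python
from collections import defaultdict


def build_index(rows, key):
    """Map each join-key value to the positions of the rows holding it.

    Assumption: this dict is only a stand-in for an index on table A.
    """
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[key]].append(pos)
    return index


def index_join(table_a, index_a, table_b, key):
    """Stream B and, for each row, probe the index on A instead of scanning A."""
    for b_row in table_b:
        # .get() avoids inserting empty entries for keys missing from A.
        for pos in index_a.get(b_row[key], ()):
            yield {**table_a[pos], **b_row}


# Hypothetical sample data.
table_a = [{"id": 1, "name": "x"}, {"id": 2, "name": "y"}, {"id": 2, "name": "z"}]
table_b = [{"id": 2, "qty": 10}, {"id": 3, "qty": 5}]

index_a = build_index(table_a, "id")
joined = list(index_join(table_a, index_a, table_b, "id"))
```

Here B is read once from start to end, and A is touched only at the
positions the index returns, so rows of B with no match in A (id 3 above)
cost a single lookup.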

> This can be very useful for some usecases.
>
>

The process may be rewriting the original query as an added stage in the
physical optimizer, but it would not produce any different MapReduce jobs,
unlike what HiveSkewJoin does in the physical optimizer. If this
is effective, how would that query rewriting process work? Should it match
a "JOIN" rule, like HiveSkewJoin, and then replace the second "TS"?
How?

I am eager to implement this feature, and I was wondering whether the issue
could be assigned to me.
I would appreciate any hints or clues.

Regards,
Mahsa
