[jira] Commented: (HIVE-1642) Convert join queries to map-join based on size of table/row

Namit Jain (JIRA) Sun, 14 Nov 2010 22:21:40 -0800

    [ https://issues.apache.org/jira/browse/HIVE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931991#action_12931991 ]


Namit Jain commented on HIVE-1642:
----------------------------------

1. All the new parameters in HiveConf.java need to be added to hive-default.xml,
   along with comments.
2. Have you run all the tests with hive.auto.convert.join? All the diffs should
   be in the plans only, with no change in results.
3. DriverContext.java: add comments for the backup* fields.
   I think the logic can be simplified by having a backupTask in Task itself.
4. System.out.println in Driver.java.
5. Backup task is not generic.
6. ExecMapper: undoes the isLogInfoEnabled optimization put in by Siying.
7. ExplainTask: where is the backup task being printed?
8. Why do you need to make so many classes Serializable?
9. I don't see any explain plan in the new tests.

I was thinking about it - I think you can do all this backup task business much
more easily.
There is no need for any special casing: every task has a backup task
(currently only valid for joins, but nothing special from a task's point of
view).
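
As a rough sketch of that idea, assuming nothing about the real classes:
SketchTask, BackupTaskSketch and runWithBackup below are hypothetical names,
not Hive's Task or Driver API, and the map-join/common-join pairing is only an
illustrative guess at how a join would use the field. The point is that the
backup is a plain field on the task, and the caller follows it on failure
without knowing what kind of task it is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical stand-in for a task; not org.apache.hadoop.hive.ql.exec.Task.
class SketchTask {
    final String name;
    final IntSupplier body;   // returns 0 on success, non-zero on failure
    SketchTask backupTask;    // optional fallback; null when there is none

    SketchTask(String name, IntSupplier body) {
        this.name = name;
        this.body = body;
    }
}

class BackupTaskSketch {
    // Runs the task; on failure, follows the backup chain until one succeeds
    // or the chain ends. Records the names of the tasks that actually ran.
    static int runWithBackup(SketchTask task, List<String> executed) {
        SketchTask current = task;
        int rc = -1;   // returned unchanged only if task is null
        while (current != null) {
            executed.add(current.name);
            rc = current.body.getAsInt();
            if (rc == 0) {
                return 0;
            }
            current = current.backupTask;
        }
        return rc;
    }

    public static void main(String[] args) {
        SketchTask mapJoin = new SketchTask("map-join", () -> 1);    // simulated failure
        mapJoin.backupTask = new SketchTask("common-join", () -> 0); // assumed fallback
        List<String> executed = new ArrayList<>();
        int rc = runWithBackup(mapJoin, executed);
        System.out.println(rc + " " + executed);   // prints: 0 [map-join, common-join]
    }
}
```

Nothing in runWithBackup is join-specific; a task without a backup simply has
a null field.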

The change is needed in ConditionalTask execute. If a conditional task consists
of tasks 1, 2 and 3, and 1 is getting executed, we remove 2 and 3 from the
parents of their respective children, as if they never existed. This does not
work if there is 1 followed by 1.1, 2 followed by 2.2, etc., all with the
common child X (which is the case in your scenario) - we should fix that.
Instead of removing only the immediate child's parent, we check whether the
child has any remaining parents; if not, we recurse. This way, the conditional
task can contain a tree - you don't need the grand-child/mapred local task
(all that special logic) all over.
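
A minimal sketch of that recursive removal, under the assumption that each
task keeps both a parent list and a child list. GraphTask, ConditionalPruneSketch
and prune are hypothetical names for illustration, not the actual ConditionalTask
code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical task-graph node; not Hive's Task class.
class GraphTask {
    final String name;
    final List<GraphTask> parents = new ArrayList<>();
    final List<GraphTask> children = new ArrayList<>();

    GraphTask(String name) {
        this.name = name;
    }

    void addChild(GraphTask child) {
        children.add(child);
        child.parents.add(this);
    }
}

class ConditionalPruneSketch {
    // Detaches `task` from each of its children. A child left with no
    // remaining parents is unreachable, so it is pruned recursively.
    static void prune(GraphTask task, List<String> removed) {
        removed.add(task.name);
        for (GraphTask child : new ArrayList<>(task.children)) {
            child.parents.remove(task);
            if (child.parents.isEmpty()) {
                prune(child, removed);
            }
        }
        task.children.clear();
    }

    public static void main(String[] args) {
        GraphTask t1 = new GraphTask("1"), t11 = new GraphTask("1.1");
        GraphTask t2 = new GraphTask("2"), t22 = new GraphTask("2.2");
        GraphTask x = new GraphTask("X");
        t1.addChild(t11);
        t11.addChild(x);
        t2.addChild(t22);
        t22.addChild(x);

        List<String> removed = new ArrayList<>();
        prune(t2, removed);   // branch 1 was selected, so branch 2 is dropped
        System.out.println(removed + " X parents: " + x.parents.size());
        // prints: [2, 2.2] X parents: 1
    }
}
```

Here 2 and 2.2 disappear, while X survives because 1.1 is still its parent.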


I will continue to review more

> Convert join queries to map-join based on size of table/row
> -----------------------------------------------------------
>
>                 Key: HIVE-1642
>                 URL: https://issues.apache.org/jira/browse/HIVE-1642
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Liyin Tang
>             Fix For: 0.7.0
>
>         Attachments: hive_1642_1.patch
>
>
> Based on the number of rows and size of each table, Hive should automatically 
> be able to convert a join into map-join.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
