[ 
https://issues.apache.org/jira/browse/HIVE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505559#comment-14505559
 ] 

Gunther Hagleitner commented on HIVE-10323:
-------------------------------------------

Patch looks good. Minor nit: The condition for nextKeyGroup should be an else 
block.

Some other considerations:

- Maybe we should log the emit and spill intervals. Also warn if the former is 
greater than the latter?
- Looks like you emit before you put the current record into storage. Wouldn't 
it be better to do that afterwards?
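
To make the two points above concrete, here is a minimal sketch in Java. It is 
not taken from the patch: the class EmitIntervalSketch, its fields, and the 
buffer-clearing "emit" are all hypothetical stand-ins for the real operator. 
It logs both intervals once, warns when the emit interval exceeds the spill 
interval, and stores the current record before checking whether to emit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the Hive patch.
public class EmitIntervalSketch {
    private final int emitInterval;
    private final List<String> buffer = new ArrayList<>();
    private final List<String> warnings = new ArrayList<>();
    private int emitCount = 0;

    public EmitIntervalSketch(int emitInterval, int spillInterval) {
        this.emitInterval = emitInterval;
        // Log both intervals once so they show up in the task log.
        System.out.println("emit interval=" + emitInterval
                + ", spill interval=" + spillInterval);
        // Warn when the emit interval is larger than the spill interval.
        if (emitInterval > spillInterval) {
            String msg = "emit interval (" + emitInterval
                    + ") is greater than spill interval (" + spillInterval + ")";
            warnings.add(msg);
            System.err.println("WARN: " + msg);
        }
    }

    // Store the current record first, then decide whether to emit.
    public void process(String row) {
        buffer.add(row);
        if (buffer.size() >= emitInterval) {
            emitCount++;     // stand-in for producing join output
            buffer.clear();  // stand-in for releasing the emitted rows
        }
    }

    public int getEmitCount() { return emitCount; }
    public int getBuffered() { return buffer.size(); }
    public List<String> getWarnings() { return warnings; }
}
```

With this ordering the record that triggers the emit is part of the emitted 
batch. Whether that is the right behavior for the real operator is the open 
question raised above.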

Biggest concern: There's not a lot of testing going on. For one thing, I think 
you could set the emit interval low (2?) for all Tez tests and see if you get 
better coverage that way. If not, you should test all the combinations: left, 
right, outer, multi key, multi table, spill other tables, etc.
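
As a rough illustration of the combinations above, the following Java sketch 
generates one query per join type and key shape, each preceded by a SET of 
hive.join.emit.interval. The table names (t1, t2), column names (k1, k2), and 
the class JoinTestMatrix are assumptions made up for this example. It covers 
only two-table joins, so the multi-table and spill cases would still need 
their own tests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: table and column names are invented for illustration.
public class JoinTestMatrix {
    static List<String> queries(int emitInterval) {
        String[] joinTypes = {
            "LEFT OUTER JOIN", "RIGHT OUTER JOIN", "FULL OUTER JOIN"
        };
        String[] conditions = {
            "a.k1 = b.k1",                   // single key
            "a.k1 = b.k1 AND a.k2 = b.k2"    // multi key
        };
        List<String> out = new ArrayList<>();
        for (String joinType : joinTypes) {
            for (String condition : conditions) {
                out.add("SET hive.join.emit.interval=" + emitInterval + ";\n"
                        + "SELECT * FROM t1 a " + joinType
                        + " t2 b ON " + condition + ";");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Print every generated test query for a low emit interval.
        for (String q : queries(2)) {
            System.out.println(q);
            System.out.println();
        }
    }
}
```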

> Tez merge join operator does not honor hive.join.emit.interval
> --------------------------------------------------------------
>
>                 Key: HIVE-10323
>                 URL: https://issues.apache.org/jira/browse/HIVE-10323
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.2.0
>            Reporter: Vikram Dixit K
>            Assignee: Vikram Dixit K
>         Attachments: HIVE-10323.1.patch
>
>
> This affects efficiency in case of skews.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
