[jira] [Commented] (HIVE-2453) Need a way to categorize queries in hooks for improved logging

Ning Zhang (JIRA) Fri, 16 Sep 2011 22:01:17 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107024#comment-13107024
 ]


Ning Zhang commented on HIVE-2453:
----------------------------------

Kevin, I guess we cross-posted on the review board and here. As you have 
noticed, there is really not much difference in the resulting operator tree 
between sort by and order by, except that the latter requires a single reducer. 
However, they differ from the syntax point of view. Two queries may have 
different syntax but the same plan, and conversely two queries with very 
similar syntax may have different execution plans (e.g., a CommonJoin can be 
converted to a MapJoin at execution time). So I think for this task we should 
focus on tagging the syntax tree rather than the physical execution plan tree. 
We should probably examine the operator tree and tag it in one pre-exec hook. 

BTW, we may also need to capture "distribute by", which just distributes the 
key-value pairs based on their keys without sorting at the reducer. This is 
also an indicator for the analysis that the job needs a reduce phase. 
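
To make the tagging idea concrete, here is a rough sketch of what a 
syntax-based tagger could look like. It is only an illustration under stated 
assumptions: Node is a hypothetical stand-in for a parsed syntax-tree node, 
the Tag enum and the QueryTagger class are invented for this example, and the 
token strings are modeled on parser token names but treated here as plain 
strings. None of this is taken from the attached patch or from Hive's hook 
API.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class QueryTagger {
    /** Hypothetical clause categories, invented for this sketch. */
    public enum Tag { JOIN, GROUP_BY, ORDER_BY, SORT_BY, DISTRIBUTE_BY, NEEDS_REDUCE }

    /** Hypothetical stand-in for a syntax-tree node (not Hive's own class). */
    public static final class Node {
        final String token;
        final List<Node> children;

        public Node(String token, Node... children) {
            this.token = token;
            this.children = Arrays.asList(children);
        }
    }

    /** Walks the whole syntax tree and returns the clause tags it finds. */
    public static Set<Tag> tag(Node root) {
        Set<Tag> tags = EnumSet.noneOf(Tag.class);
        collect(root, tags);
        // Per the comment above: order by, sort by and distribute by each
        // indicate that the job needs a reduce phase.
        if (tags.contains(Tag.ORDER_BY) || tags.contains(Tag.SORT_BY)
                || tags.contains(Tag.DISTRIBUTE_BY)) {
            tags.add(Tag.NEEDS_REDUCE);
        }
        return tags;
    }

    // Depth-first traversal; token strings are assumptions for illustration.
    private static void collect(Node node, Set<Tag> tags) {
        switch (node.token) {
            case "TOK_JOIN":         tags.add(Tag.JOIN); break;
            case "TOK_GROUPBY":      tags.add(Tag.GROUP_BY); break;
            case "TOK_ORDERBY":      tags.add(Tag.ORDER_BY); break;
            case "TOK_SORTBY":       tags.add(Tag.SORT_BY); break;
            case "TOK_DISTRIBUTEBY": tags.add(Tag.DISTRIBUTE_BY); break;
            default: break; // other tokens carry no tag in this sketch
        }
        for (Node child : node.children) {
            collect(child, tags);
        }
    }
}
```

Because the tags come from the syntax alone, a CommonJoin that is later 
converted to a MapJoin would still be tagged JOIN, which is the stability 
argued for above. 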



> Need a way to categorize queries in hooks for improved logging
> --------------------------------------------------------------
>
>                 Key: HIVE-2453
>                 URL: https://issues.apache.org/jira/browse/HIVE-2453
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2453.1.patch.txt
>
>
> We need a way to categorize queries, such as whether or not they include a 
> join clause, a group by clause, etc., in the hooks.  This will allow for 
> better performance logging.
> Currently the only way I can find is to go through the operators in the 
> tasks, but which operators are used for the different types of queries may 
> change over time.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
