[ https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632523#comment-13632523 ]
Gunther Hagleitner commented on HIVE-4318:
------------------------------------------

I've repeated the experiments with [~pamelavagata]'s patch.

Summary: Sorry, but I am *still* consistently measuring a difference.

I set the experiment up again (I no longer had the EC2 instance from last week). I used the same data, same query, same codebase, etc., but I wasn't running on the same machine, so we should be careful when comparing these runs with the previous ones. I did use the same method as above to get the numbers.

First I created two builds: one by applying Pamela's patch ("fixed hooks") and one where I additionally commented out the operator hooks ("no hooks"). The results I got were:

no hooks: 44.4 seconds
fixed hooks: 46.2 seconds

Then I created two more builds: one by applying Pamela's patch and commenting out the counters ("fixed hooks no counters") and one by commenting out both the operator hooks and the counters ("no hooks no counters"). The results I got were:

no hooks no counters: 29.6 seconds
fixed hooks no counters: 32.3 seconds

> OperatorHooks hit performance even when not used
> ------------------------------------------------
>
>                 Key: HIVE-4318
>                 URL: https://issues.apache.org/jira/browse/HIVE-4318
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>         Environment: Ubuntu LXC (64 bit)
>            Reporter: Gopal V
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch, HIVE-4318.patch.pam.txt
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when they are not being used.
> A count(1) query was tested with and without the operator hook calls.
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have to forward the row: the more operators there are, the slower the query.
> The modification made to test this was
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
>        return;
>      }
>      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
> -    preProcessCounter();
> -    enterOperatorHooks(opHookContext);
> +    //preProcessCounter();
> +    //enterOperatorHooks(opHookContext);
>      processOp(row, tag);
> -    exitOperatorHooks(opHookContext);
> -    postProcessCounter();
> +    //exitOperatorHooks(opHookContext);
> +    //postProcessCounter();
>    }
> {code}
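
For a rough sense of scale, the wall-clock timings quoted in this message can be turned into relative slowdowns. The following is a minimal sketch only: the class and method names (HookOverhead, overheadPercent) are made up for illustration and are not part of Hive, and the inputs are single measurements, so the percentages are indicative at best.

```java
public class HookOverhead {

    /**
     * Relative slowdown of a measured run over a baseline run, in percent.
     * Hypothetical helper for illustration; not part of the Hive codebase.
     */
    public static double overheadPercent(double baselineSeconds, double measuredSeconds) {
        return (measuredSeconds - baselineSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        // "no hooks" vs. "fixed hooks", counters left enabled (seconds).
        System.out.printf("fixed hooks, counters on:  %.1f%%%n",
                overheadPercent(44.4, 46.2));

        // "no hooks no counters" vs. "fixed hooks no counters" (seconds).
        System.out.printf("fixed hooks, counters off: %.1f%%%n",
                overheadPercent(29.6, 32.3));

        // Original report: hook and counter calls both commented out
        // ("without") vs. the unmodified code ("with"), "Time taken" values.
        System.out.printf("original report:           %.1f%%%n",
                overheadPercent(35.907, 40.407));
    }
}
```

On these figures the patched hooks cost roughly 4% with counters enabled and roughly 9% with counters disabled, against roughly 12.5% in the original report, where the counter calls were commented out together with the hook calls. Since the new runs were made on a different machine, only the comparisons within each pair are meaningful.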