[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632523#comment-13632523
 ] 

Gunther Hagleitner commented on HIVE-4318:
------------------------------------------

I've repeated the experiments with [~pamelavagata]'s patch. Summary: Sorry, but 
I am *still* consistently measuring a difference.

I've set up the experiment again (I no longer had the EC2 instance from last 
week). I used the same data, the same query, and the same codebase, but I wasn't 
running on the same machine, so we should be careful when comparing these runs 
with the previous ones. I did use the same method as above to get the numbers.

First I created two builds: One by applying Pamela's patch ("fixed hooks") and 
one where I additionally commented out the operator hooks ("no hooks"). The 
results I got were:

no hooks: 44.4 seconds
fixed hooks: 46.2 seconds

Then I created two more builds: one by applying Pamela's patch and commenting 
out the counters ("fixed hooks no counters") and one by commenting out both the 
operator hooks and the counters ("no hooks no counters"). The results I got were:

no hooks no counters: 29.6 seconds
fixed hooks no counters: 32.3 seconds
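
To make the gap easier to read, the timings above can be turned into absolute 
and relative overheads. The following is only a minimal sketch of that 
arithmetic: the class and method names (HookOverhead, absoluteOverheadSeconds, 
relativeOverheadPercent) are made up for illustration and are not part of Hive, 
and the inputs are just the four wall-clock numbers quoted above.

```java
public final class HookOverhead {

    /** Absolute slowdown, in seconds, of a build relative to its baseline. */
    public static double absoluteOverheadSeconds(double baselineSec, double measuredSec) {
        return measuredSec - baselineSec;
    }

    /** Slowdown expressed as a percentage of the baseline run time. */
    public static double relativeOverheadPercent(double baselineSec, double measuredSec) {
        return 100.0 * (measuredSec - baselineSec) / baselineSec;
    }

    public static void main(String[] args) {
        // Wall-clock times (seconds) copied from the measurements in this comment.
        double noHooks = 44.4;
        double fixedHooks = 46.2;
        double noHooksNoCounters = 29.6;
        double fixedHooksNoCounters = 32.3;

        // Hook overhead with the counters still enabled.
        System.out.printf("with counters:    +%.1f s (%.1f%%)%n",
                absoluteOverheadSeconds(noHooks, fixedHooks),
                relativeOverheadPercent(noHooks, fixedHooks));

        // Hook overhead with the counters commented out.
        System.out.printf("without counters: +%.1f s (%.1f%%)%n",
                absoluteOverheadSeconds(noHooksNoCounters, fixedHooksNoCounters),
                relativeOverheadPercent(noHooksNoCounters, fixedHooksNoCounters));
    }
}
```

With these inputs the hooks cost roughly 1.8 seconds (about 4%) with counters 
enabled and roughly 2.7 seconds (about 9%) with counters commented out.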
                
> OperatorHooks hit performance even when not used
> ------------------------------------------------
>
>                 Key: HIVE-4318
>                 URL: https://issues.apache.org/jira/browse/HIVE-4318
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>         Environment: Ubuntu LXC (64 bit)
>            Reporter: Gopal V
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch, 
> HIVE-4318.patch.pam.txt
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when 
> they are not being used.
> The following shows a count(1) query tested with and without the operator 
> hook calls.
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have 
> to forward the row: the more operators there are, the slower the query.
> The modification made to test this was 
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
>        return;
>      }
>      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
> -    preProcessCounter();
> -    enterOperatorHooks(opHookContext);
> +    //preProcessCounter();
> +    //enterOperatorHooks(opHookContext);
>      processOp(row, tag);
> -    exitOperatorHooks(opHookContext);
> -    postProcessCounter();
> +    //exitOperatorHooks(opHookContext);
> +    //postProcessCounter();
>    }
> {code}
