[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140154#comment-14140154
 ] 

Prasanth J commented on HIVE-8188:
----------------------------------

I think it's because hash aggregation needs to estimate the size of the hash 
map. The values of the hash map are UDAFs, whose aggregation buffer size can be 
estimated if the aggregation buffer has the annotation 
"@AggregationType(estimable = true)". GroupByOperator.shouldBeFlushed() is 
called for every row that is added to the hash map. shouldBeFlushed() calls 
the isEstimable() helper function, which uses reflection every time to see if 
the aggregation function is estimable. Not sure why it is done this way, but 
yes, this will be slow as hell. This needs to be fixed.
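
One possible direction, as a rough sketch only: look up the annotation once per 
buffer class and cache the answer, so the per-row check becomes a map lookup 
instead of a reflective call. This is not the actual Hive code or the eventual 
patch. The class EstimableCache, the nested AggregationType annotation, and the 
SumBuffer/ListBuffer classes are hypothetical stand-ins declared here so the 
example compiles on its own.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class EstimableCache {

  // Hypothetical stand-in for the annotation mentioned above; declared
  // locally so that this sketch does not depend on Hive classes.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface AggregationType {
    boolean estimable() default false;
  }

  // Hypothetical buffer whose size can be estimated.
  @AggregationType(estimable = true)
  public static class SumBuffer {
    long sum;
  }

  // Hypothetical buffer without the annotation.
  public static class ListBuffer {
  }

  // One cached answer per buffer class.
  private static final ConcurrentMap<Class<?>, Boolean> CACHE =
      new ConcurrentHashMap<>();

  // Cheap per-row check: reflection runs only on a cache miss.
  public static boolean isEstimable(Object buffer) {
    return CACHE.computeIfAbsent(buffer.getClass(), EstimableCache::lookup);
  }

  // The reflective lookup that the per-row path should avoid repeating.
  private static boolean lookup(Class<?> clazz) {
    AggregationType annotation = clazz.getAnnotation(AggregationType.class);
    return annotation != null && annotation.estimable();
  }

  public static void main(String[] args) {
    System.out.println(isEstimable(new SumBuffer()));
    System.out.println(isEstimable(new ListBuffer()));
  }
}
```

An alternative with the same effect would be to compute the flag once when the 
operator is initialized and keep it in a field, since the set of aggregation 
buffer classes should not change between rows.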

> ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
> loop
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-8188
>                 URL: https://issues.apache.org/jira/browse/HIVE-8188
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 0.14.0
>            Reporter: Gopal V
>         Attachments: udf-deterministic.png
>
>
> When running a near-constant UDF, most of the CPU is burnt within the VM 
> trying to read the class annotations for every row.
> !udf-deterministic.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
