Szehon Ho created HIVE-20153:
--------------------------------

             Summary: Count and Sum UDF consume more memory in Hive 2+
                 Key: HIVE-20153
                 URL: https://issues.apache.org/jira/browse/HIVE-20153
             Project: Hive
          Issue Type: Bug
          Components: UDF
    Affects Versions: 2.3.2
            Reporter: Szehon Ho


While testing Hive 2, we noticed that queries with many count() and sum() 
aggregations run out of memory on the Hadoop side much sooner than in Hive 1.  
A heap dump shows that one of the main culprits is the 'uniqueObjects' field 
in GenericUDAFSum and GenericUDAFCount, which was added to support window 
functions.
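
To illustrate the kind of overhead described above, here is a minimal sketch. 
It is not Hive's actual code: the class names EagerCountBuffer and 
LazyCountBuffer are hypothetical, and only the field name 'uniqueObjects' 
comes from the heap dump. It assumes the set is allocated for every 
aggregation buffer even when no distinct tracking is needed, and shows lazy 
allocation as one possible mitigation.

```java
import java.util.HashSet;
import java.util.Set;

public class UniqueObjectsSketch {

    // Hypothetical buffer: the set exists for every buffer instance,
    // even when the query never asks for distinct values.
    static class EagerCountBuffer {
        long value;
        Set<Object> uniqueObjects = new HashSet<>();

        void add(Object row, boolean distinct) {
            if (distinct && !uniqueObjects.add(row)) {
                return; // already counted this value
            }
            value++;
        }
    }

    // Hypothetical alternative: the set is created only on first distinct use,
    // so plain count()/sum() buffers carry just a null reference.
    static class LazyCountBuffer {
        long value;
        Set<Object> uniqueObjects; // stays null until needed

        void add(Object row, boolean distinct) {
            if (distinct) {
                if (uniqueObjects == null) {
                    uniqueObjects = new HashSet<>();
                }
                if (!uniqueObjects.add(row)) {
                    return; // already counted this value
                }
            }
            value++;
        }
    }

    public static void main(String[] args) {
        EagerCountBuffer eager = new EagerCountBuffer();
        LazyCountBuffer lazy = new LazyCountBuffer();
        for (String row : new String[] {"a", "b", "a"}) {
            eager.add(row, false);
            lazy.add(row, false);
        }
        System.out.println("eager set allocated: " + (eager.uniqueObjects != null));
        System.out.println("lazy set allocated: " + (lazy.uniqueObjects != null));
        System.out.println("counts: " + eager.value + " / " + lazy.value);
    }
}
```

With one buffer per group-by key and per aggregation, an always-allocated 
set is multiplied by the number of buffers held in memory, which would be 
consistent with the growth seen in the heap dump.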



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
