Misha Dmitriev created HIVE-16166:
-------------------------------------

             Summary: HS2 may still waste up to 15% of memory on duplicate strings
                 Key: HIVE-16166
                 URL: https://issues.apache.org/jira/browse/HIVE-16166
             Project: Hive
          Issue Type: Improvement
            Reporter: Misha Dmitriev
            Assignee: Misha Dmitriev


A heap dump obtained from one of our users shows that 15% of memory is wasted 
on duplicate strings, despite the recent optimizations that I made. This time 
the duplicate strings come from different sources than before. See the 
attached excerpt from the JXRay (www.jxray.com) analysis.

Adding String.intern() calls in the appropriate places reduces the overhead of 
duplicate strings with this workload to ~6%. The remaining duplicates come 
mostly from JDK internal and MapReduce data structures, and thus are more 
difficult to fix.
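
To illustrate the kind of change meant here, below is a minimal sketch of 
interning a string field at construction time. It is not taken from the Hive 
patch: the class PartitionInfo, its location field, and the sample path are 
hypothetical names chosen for the example. Only String.intern() itself is the 
real JDK call.

```java
public class InternSketch {

    // Hypothetical holder class, standing in for any long-lived object
    // that keeps a String field likely to repeat across many instances.
    static final class PartitionInfo {
        private final String location;

        PartitionInfo(String location) {
            // intern() returns the canonical instance from the JVM string
            // pool, so equal strings end up sharing one object.
            this.location = (location == null) ? null : location.intern();
        }

        String getLocation() {
            return location;
        }
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents,
        // which is what duplicate strings look like in a heap dump.
        String first = new String("hdfs://nn/warehouse/t1");
        String second = new String("hdfs://nn/warehouse/t1");
        System.out.println("same object before interning: " + (first == second));

        PartitionInfo a = new PartitionInfo(first);
        PartitionInfo b = new PartitionInfo(second);
        System.out.println("same object after interning: "
                + (a.getLocation() == b.getLocation()));
    }
}
```

Interning has a cost of its own (a lookup in the string pool on every call), 
so it is only worth applying to fields that the heap dump shows to be heavily 
duplicated.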



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)