[ https://issues.apache.org/jira/browse/HIVE-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400630#comment-16400630 ]
Misha Dmitriev commented on HIVE-16879:
---------------------------------------

I agree about the negligible CPU performance impact of String.intern(), especially when compared with reduced heap size and GC time. Again, I think this is a good change, assuming that it's applied in the right place. However, my experience is that guessing doesn't always work when you try to determine where _exactly_ memory is wasted.

Do you have access to some running Hive instances where you would expect this to be a problem? Then, at a minimum, you can run 'jmap -histo:live' to get the number of Key instances and roughly estimate memory used by the strings that Keys reference. And the best thing would be to take a heap dump (jmap -dump:live,format=b,...) and analyze it with a tool, e.g. [www.jxray.com|http://www.jxray.com/], that immediately tells you the memory overhead of duplicate strings. You will immediately see whether Keys cause noticeable overhead, and/or what other classes cause it.

> Improve Cache Key
> -----------------
>
>                 Key: HIVE-16879
>                 URL: https://issues.apache.org/jira/browse/HIVE-16879
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>   Affects Versions: 3.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Trivial
>         Attachments: HIVE-16879.1.patch, HIVE-16879.2.patch
>
>
> Improve cache key for cache implemented in
> {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}.
> # Cache some of the key components themselves (db name, table name) using
> {{String}} intern method to conserve memory for repeated keys, to improve
> {{equals}} method as now references can be used for equality, and hashcodes
> will be cached as well as per {{String}} class hashcode method.
> # Upgrade _debug_ logging to not generate text unless required
> # Changed _equals_ method to check first for the item most likely to be
> different, column name

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
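
For illustration, here is a minimal sketch of the first and third points in the issue description (interning the db and table names, and comparing the column name first in equals). It is not the code from the attached patches: the class name CacheKeySketch, its fields, and its accessors are hypothetical, and it assumes a key made of just three strings.

```java
import java.util.Objects;

// Hypothetical sketch only; not the actual key class in AggregateStatsCache.
final class CacheKeySketch {
    private final String dbName;
    private final String tblName;
    private final String colName;

    CacheKeySketch(String dbName, String tblName, String colName) {
        // intern() returns the canonical instance from the JVM string pool,
        // so keys that repeat a db or table name share a single String object.
        this.dbName = Objects.requireNonNull(dbName).intern();
        this.tblName = Objects.requireNonNull(tblName).intern();
        // Assumption: column names vary the most, so this sketch does not intern them.
        this.colName = Objects.requireNonNull(colName);
    }

    String dbName() { return dbName; }

    String tblName() { return tblName; }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CacheKeySketch)) {
            return false;
        }
        CacheKeySketch that = (CacheKeySketch) other;
        // Check the component most likely to differ first. For the interned
        // fields, String.equals returns immediately when both sides are the
        // same object, which is the common case after interning.
        return colName.equals(that.colName)
            && tblName.equals(that.tblName)
            && dbName.equals(that.dbName);
    }

    @Override
    public int hashCode() {
        // String caches its own hash after the first computation, so shared
        // interned instances do not rehash their characters for each new key.
        return Objects.hash(dbName, tblName, colName);
    }
}
```

With this sketch, two keys built from separately allocated but equal db names end up holding the same String instance, which is the memory saving the description refers to. Whether that saving is noticeable in a real metastore is exactly what the histogram or heap dump suggested above would show.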