[ 
https://issues.apache.org/jira/browse/HIVE-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399668#comment-16399668
 ] 

Misha Dmitriev commented on HIVE-16879:
---------------------------------------

This looks like nice optimization work, assuming that the right things are 
optimized.

Did you measure that the duplicate strings referenced by the fields of Key 
indeed waste a noticeable amount of memory? If yes, what tool did you use and 
can you share your findings? Is it really the case that dbName and tblName 
cause enough duplication to benefit from interning, but colName does not?

> Improve Cache Key
> -----------------
>
>                 Key: HIVE-16879
>                 URL: https://issues.apache.org/jira/browse/HIVE-16879
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 3.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Trivial
>         Attachments: HIVE-16879.1.patch, HIVE-16879.2.patch
>
>
> Improve cache key for cache implemented in 
> {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}.
> # Cache some of the key components themselves (db name, table name) using 
> {{String}} intern method to conserve memory for repeated keys, to improve 
> {{equals}} method as now references can be used for equality, and hashcodes 
> will be cached as well as per {{String}} class hashcode method.
> # Upgrade _debug_ logging to not generate text unless required
> # Changed _equals_ method to check first for the item most likely to be 
> different, column name
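
For reference, items 1 and 3 of the description could look roughly like the sketch below. This is not the code from the attached patches: the class name StatsCacheKey and the helper sharesNamesWith are hypothetical, and the field names simply follow the dbName, tblName and colName mentioned in the comment. It assumes the key is immutable and that only the db and table names are interned.

```java
import java.util.Objects;

// Hypothetical stand-in for the cache key; NOT the actual Hive class.
final class StatsCacheKey {
    private final String dbName;
    private final String tblName;
    private final String colName;

    StatsCacheKey(String dbName, String tblName, String colName) {
        // intern() returns the canonical copy from the JVM string pool,
        // so keys with equal db/table names share one String instance.
        this.dbName = Objects.requireNonNull(dbName).intern();
        this.tblName = Objects.requireNonNull(tblName).intern();
        // Column name is left as-is (assumption: it is not interned).
        this.colName = Objects.requireNonNull(colName);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof StatsCacheKey)) {
            return false;
        }
        StatsCacheKey other = (StatsCacheKey) o;
        // Compare the column name first, since it is the field most
        // likely to differ. String.equals() already returns early when
        // both references are identical, so interned names stay cheap.
        return colName.equals(other.colName)
            && tblName.equals(other.tblName)
            && dbName.equals(other.dbName);
    }

    @Override
    public int hashCode() {
        // String caches its own hash code after the first computation.
        return Objects.hash(dbName, tblName, colName);
    }

    // Illustration helper (hypothetical): true when both keys hold the
    // very same interned String instances for db and table name.
    boolean sharesNamesWith(StatsCacheKey other) {
        return dbName == other.dbName && tblName == other.tblName;
    }
}
```

Whether this saves a noticeable amount of memory depends on how many distinct String instances with the same db/table name are actually retained by the cache, which is the measurement asked about above.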



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
