[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463221#comment-16463221 ]
Sahil Takiar commented on HIVE-19041:
-------------------------------------

I guess if you have a comment for the partition column, you will have tons of duplicate comments. I don't think we can selectively intern just the partition column comments, though. I'm not sure how much overhead it would introduce if we just blindly interned all comments. This would include table-, database-, and column-level comments.

[~mi...@cloudera.com] do interned strings ever get purged from the Java heap?

> Thrift deserialization of Partition objects should intern fields
> ----------------------------------------------------------------
>
>                 Key: HIVE-19041
>                 URL: https://issues.apache.org/jira/browse/HIVE-19041
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 3.0.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects are
> deserialized into Partition objects. The read method of these objects does
> not intern the inputformat, location, and outputformat fields, which causes a
> large number of duplicate Strings in the HMS memory. We should intern these
> fields during deserialization to reduce memory pressure.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
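
For readers following the thread, here is a minimal sketch of the duplication being discussed and of what interning changes. It is not the code from HIVE-19041.01.patch: the class `InternSketch` and the helpers `readField` and `internOrNull` are hypothetical names made up for illustration, and the only real API used is `String.intern()`. On the purge question, my understanding is that HotSpot JVMs since JDK 7 keep interned strings in the regular heap and can collect them once nothing else references them, but treat that as an assumption to verify for the JVM in use.

```java
class InternSketch {
    // Hypothetical stand-in for a field value read during deserialization.
    // new String(...) forces a distinct heap object per call, which mimics
    // each deserialized Partition carrying its own copy of the same text.
    static String readField(String raw) {
        return new String(raw);
    }

    // Null-safe wrapper (an assumption for this sketch): optional fields such
    // as comments may be unset, and calling intern() on null would throw.
    static String internOrNull(String s) {
        return s == null ? null : s.intern();
    }

    public static void main(String[] args) {
        String format = "org.apache.hadoop.mapred.TextInputFormat";

        // Two "deserialized" copies of the same input format name.
        String a = readField(format);
        String b = readField(format);

        // Equal contents, but two separate objects on the heap.
        System.out.println(a.equals(b)); // prints "true"
        System.out.println(a == b);      // prints "false"

        // After interning, both references point at one canonical instance,
        // so the duplicate copies become garbage.
        String ia = internOrNull(a);
        String ib = internOrNull(b);
        System.out.println(ia == ib);    // prints "true"
    }
}
```

The trade-off raised in the comment still applies: interning every comment adds a string-table lookup per field, which pays off only when values repeat often, as partition-level input formats and column comments tend to.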