[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296454#comment-15296454 ]
Naveen Gangam commented on HIVE-13749:
--------------------------------------

[~thejas] I have a better understanding of what is causing this issue. It appears that FileSystem.Cache (Hadoop API) is retaining instances of Configuration in its cache. Any time we call FileSystem.get(conf), as here
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1685
the conf object becomes part of the key for the map entry. The cache is meant to improve performance so that we don't have to re-create these FileSystem objects, but Hive's use of these APIs does not appear to use the cache efficiently. Other areas of the code contribute as well; for example, Path.getFileSystem() can add to this cache under the covers:
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java#L104

Caching can be turned off entirely by setting fs.%s.impl.disable.cache=true, where %s is the filesystem scheme (e.g. hdfs or s3). That might make this problem go away, but it has a performance overhead (I haven't measured it, though). Unfortunately, there is no means to selectively turn off caching on a per-call basis, so we have to fix this in the Hive code.

fs.close() would remove the entry from the cache. But we cannot call it every time we use this API, as that would be the same as disabling the cache entirely. So it is an easy choice to add fs.close() here
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1685
But for the other code in Warehouse, we need more data on cache hits and misses. I am working on instrumenting the FileSystem code to provide this info.

An alternate thought (I am not sure how feasible it is, though): since the FileSystem code does not appear to use the properties within this Configuration object itself, it may be safe to use a static instance of HiveConf on most calls to FileSystem, like mkdirs(), get() etc.
This way we would use the cache efficiently too. However, I am not sure whether there are session-specific properties that get used across calls to the FileSystem APIs. Thoughts? Thanks in advance.

> Memory leak in Hive Metastore
> -----------------------------
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Attachments: Top_Consumers7.html
>
>
> Looking at a heap dump of 10GB, a large number of Configuration objects (> 66k
> instances) are being retained. These objects, along with their retained sets, are
> occupying about 95% of the heap space. This leads to HMS crashes every few
> days.
> I will attach an exported snapshot from the Eclipse MAT.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
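
For readers following the comment above, here is a minimal toy model of the caching pattern it describes. This is not Hadoop's FileSystem.Cache: ToyConf, ToyFs and ToyCache are hypothetical names, and the key shape (scheme plus the identity of the conf object) is an assumption taken from the description in the comment, not from the Hadoop source. It only illustrates why a fresh conf per call makes such a cache grow, why closing after every call defeats it, and why a single shared conf keeps it at one entry.

```java
import java.util.HashMap;
import java.util.Map;

public class FsCacheSketch {
    /** Hypothetical stand-in for a per-request configuration object. */
    static final class ToyConf { }

    /** Hypothetical stand-in for a cached handle that keeps its conf reachable. */
    static final class ToyFs {
        final ToyConf conf;
        ToyFs(ToyConf conf) { this.conf = conf; }
    }

    /** Toy cache keyed by scheme plus conf identity (an assumption for illustration). */
    static final class ToyCache {
        private static final class Key {
            final String scheme;
            final ToyConf conf;
            Key(String scheme, ToyConf conf) { this.scheme = scheme; this.conf = conf; }
            @Override public boolean equals(Object o) {
                if (!(o instanceof Key)) return false;
                Key other = (Key) o;
                // Two keys match only if they refer to the very same conf object.
                return scheme.equals(other.scheme) && conf == other.conf;
            }
            @Override public int hashCode() {
                return 31 * scheme.hashCode() + System.identityHashCode(conf);
            }
        }

        private final Map<Key, ToyFs> map = new HashMap<>();

        /** Returns the cached handle, creating and caching one on a miss. */
        ToyFs get(String scheme, ToyConf conf) {
            return map.computeIfAbsent(new Key(scheme, conf), k -> new ToyFs(conf));
        }

        /** Models close(): drops the entry from the cache. */
        void close(String scheme, ToyConf conf) { map.remove(new Key(scheme, conf)); }

        int size() { return map.size(); }
    }

    /** A new conf on every call: every lookup is a miss and the cache keeps growing. */
    public static int entriesWithFreshConfPerCall(int calls) {
        ToyCache cache = new ToyCache();
        for (int i = 0; i < calls; i++) cache.get("hdfs", new ToyConf());
        return cache.size();
    }

    /** One shared conf: the first call misses, every later call hits. */
    public static int entriesWithSharedConf(int calls) {
        ToyCache cache = new ToyCache();
        ToyConf shared = new ToyConf();
        for (int i = 0; i < calls; i++) cache.get("hdfs", shared);
        return cache.size();
    }

    /** Closing after every call: nothing leaks, but nothing is ever reused either. */
    public static int entriesWithClosePerCall(int calls) {
        ToyCache cache = new ToyCache();
        for (int i = 0; i < calls; i++) {
            ToyConf conf = new ToyConf();
            cache.get("hdfs", conf);
            cache.close("hdfs", conf);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh conf per call: " + entriesWithFreshConfPerCall(1000));
        System.out.println("shared conf:         " + entriesWithSharedConf(1000));
        System.out.println("close per call:      " + entriesWithClosePerCall(1000));
    }
}
```

In this toy model the fresh-conf case ends with one entry per call, the shared-conf case ends with a single entry, and the close-per-call case ends empty. Whether a shared instance is safe in the real code depends on the open question raised in the comment about session-specific properties.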