[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302965#comment-15302965 ]

Naveen Gangam commented on HIVE-13749:
--------------------------------------

Oops, I just posted the patch to RB (https://reviews.apache.org/r/47918/) at the same time as this comment.

1) Isn't shutdown() called when an HMS request is fulfilled and the executor thread is being released back to the pool? So any new calls would potentially have a new UGI and a new instance of HiveConf. Also, calling closeAll() just removes the cached element. At worst, the FileSystem object is re-cached on a miss.

2) The other fixes address a similar issue on the HS2 side, where using the FileSystem APIs causes the Cache to grow. This issue is on the HMS side.

Regarding reproducing this locally: yes and no. I ran hundreds of iterations of beeline executing a script that creates a table and then drops it while randomly toggling the value of a hive conf property. Over 300 iterations, I got it to retain 60 instances, which is not quite the same "success" as the customer is having, I think because my test runs as a single user. Re-running the test with this fix, I have 8 instances retained, but none in this particular cache.

I have run with debug around this code and, during the drop table command, I can see an element being added to the cache. I am also waiting for logs from this customer, who is running with some instrumentation + fix. I can confirm that from those logs too.

Alternatively, in checkTrashPurgeCombination() we could add a close() to this FileSystem. In my testcase, this has been the primary reason for the retained instances.
{code}
HadoopShims.HdfsEncryptionShim shim = ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf);
{code}
Thoughts?
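
To make the cache growth discussed above concrete, here is a rough, self-contained sketch. It is a toy model only, not Hadoop or Hive code: FsCacheSketch, Identity, Key, Handle and simulate() are hypothetical names invented for illustration. It assumes the cache key includes a per-request user identity that is compared by object identity, so every request with a fresh identity misses the cache and adds an entry unless the handle is closed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class FsCacheSketch {
    /** Hypothetical stand-in for a per-request user identity; compared by object identity. */
    static final class Identity {
        final String name;
        Identity(String name) { this.name = name; }
    }

    /** Hypothetical cache key: scheme + authority + the identity object itself. */
    static final class Key {
        final String scheme;
        final String authority;
        final Identity user;

        Key(String scheme, String authority, Identity user) {
            this.scheme = scheme;
            this.authority = authority;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            // Assumption: two identities with the same name are still different keys.
            return scheme.equals(other.scheme)
                && authority.equals(other.authority)
                && user == other.user;
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, System.identityHashCode(user));
        }
    }

    /** Hypothetical cached handle; close() removes it from the cache. */
    static final class Handle implements AutoCloseable {
        private final Key key;
        Handle(Key key) { this.key = key; }

        @Override
        public void close() {
            CACHE.remove(key, this);
        }
    }

    static final Map<Key, Handle> CACHE = new HashMap<>();

    /** Returns the cached handle for this key, creating and caching one on a miss. */
    static Handle get(String scheme, String authority, Identity user) {
        return CACHE.computeIfAbsent(new Key(scheme, authority, user), Handle::new);
    }

    /** Runs the given number of requests, each with a fresh identity; returns the final cache size. */
    static int simulate(int requests, boolean closeAfterUse) {
        CACHE.clear();
        for (int i = 0; i < requests; i++) {
            Identity perRequest = new Identity("hive"); // same name, new object each time
            Handle handle = get("hdfs", "namenode:8020", perRequest);
            if (closeAfterUse) {
                handle.close();
            }
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("entries retained without close: " + simulate(300, false));
        System.out.println("entries retained with close:    " + simulate(300, true));
    }
}
```

In this model, a handle that is never closed stays reachable from the cache together with whatever it references, which is the retention pattern described above. Closing it after use trades that for a re-creation on the next miss.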

Thanks

> Memory leak in Hive Metastore
> -----------------------------
>
>                 Key: HIVE-13749
>                 URL: https://issues.apache.org/jira/browse/HIVE-13749
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 1.1.0
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>         Attachments: HIVE-13749.patch, Top_Consumers7.html
>
>
> Looking at a heap dump of 10GB, a large number of Configuration objects (> 66k
> instances) are being retained. These objects, along with their retained sets,
> occupy about 95% of the heap space. This leads to HMS crashes every few
> days.
> I will attach an exported snapshot from the eclipse MAT.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)