[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317001#comment-15317001 ]
Daniel Dai commented on HIVE-13749:
-----------------------------------

[~ngangam], what I used to do to diagnose this is to use patched Hadoop client libraries that capture the stack trace of every invocation of FileSystem.get, to understand exactly where the leak is coming from. I don't want to blindly remove entries at shutdown; besides, the UGI object might already be lost at that point, and you might not be able to remove them. Here is how I patch Hadoop:

{code}
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
@@ -20,6 +20,8 @@
 import java.io.Closeable;
 import java.io.FileNotFoundException;
 import java.io.IOException;
+import java.io.StringWriter;
+import java.io.PrintWriter;
 import java.lang.ref.WeakReference;
 import java.net.URI;
 import java.net.URISyntaxException;
@@ -2699,6 +2701,10 @@ private FileSystem getInternal(URI uri, Configuration conf, Key key) throws IOEx
     }
     fs.key = key;
     map.put(key, fs);
+    StringWriter sw = new StringWriter();
+    new Throwable("").printStackTrace(new PrintWriter(sw));
+    LOG.info("calling context for getInternal:" + sw.toString());
+    LOG.info("# of maps:" + map.size());
     if (conf.getBoolean("fs.automatic.close", true)) {
       toAutoClose.add(key);
     }
@@ -2752,6 +2758,7 @@ synchronized void closeAll(boolean onlyAutomatic) throws IOException {
     if (!exceptions.isEmpty()) {
       throw MultipleIOException.createIOException(exceptions);
     }
+    LOG.info("map size after closeAll:" + map.size());
   }

   private class ClientFinalizer implements Runnable {
@@ -2789,6 +2796,7 @@ synchronized void closeAll(UserGroupInformation ugi) throws IOException {
     if (!exceptions.isEmpty()) {
       throw MultipleIOException.createIOException(exceptions);
     }
+    LOG.info("map size after closeAll:" + map.size());
   }

   /** FileSystem.Cache.Key */
{code}

Here is how to instruct Hive to use this jar:
1. Put the attached hadoop-common.jar into $HIVE_HOME/lib
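As a side note, the stack-capture trick the patch relies on can be reproduced in plain Java outside Hadoop, which is handy for checking what the extra LOG.info lines will emit. This is only a sketch; the {{CallingContext}} class name is illustrative and not part of Hadoop:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Standalone sketch of the technique used in the patch: create a Throwable
// at the call site and print its stack trace into a String, so every cache
// insertion can be logged together with the code path that caused it.
public class CallingContext {
    static String callingContext() {
        StringWriter sw = new StringWriter();
        // The Throwable is never thrown; it only records the current stack.
        new Throwable("").printStackTrace(new PrintWriter(sw));
        return sw.toString();
    }

    public static void main(String[] args) {
        // In the patch this string goes to LOG.info(...); here we just print it.
        System.out.println("calling context for getInternal:" + callingContext());
    }
}
```

Grepping these "calling context" entries in the log and counting the distinct call sites shows which code path keeps inserting new FileSystem instances into the cache.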
2. export HADOOP_USER_CLASSPATH_FIRST=true
3. Make sure the fs cache is enabled
4. Restart hivemetastore
5. Collect hivemetastore.log

> Memory leak in Hive Metastore
> -----------------------------
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Attachments: HIVE-13749.patch, Top_Consumers7.html
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k instances) are being retained. These objects, along with their retained sets, occupy about 95% of the heap space. This leads to HMS crashes every few days.
> I will attach an exported snapshot from the Eclipse MAT.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)