[ https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020794#comment-17020794 ]
Rajesh Balamohan commented on HIVE-22753:
-----------------------------------------

Did some more debugging.

1. https://issues.apache.org/jira/browse/HIVE-22733 does not fix this issue. The mem leak is still observed with that fix applied.

2. HushableRandomAccessFileAppender stop() is getting invoked correctly as part of "Operation.cleanupOperationLog --> LogUtils.stopQueryAppender". However, due to some residual message in BatchEventProcessor, a "HushableRandomAccessFileAppender" with the same filename gets recreated immediately after stop() is invoked. E.g.

{noformat}
at sun.reflect.GeneratedMethodAccessor83.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:136)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:958)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:898)
at org.apache.logging.log4j.core.appender.routing.RoutingAppender.createAppender(RoutingAppender.java:271)
at org.apache.logging.log4j.core.appender.routing.RoutingAppender.getControl(RoutingAppender.java:255)
at org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:225)
at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:448)
at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:433)
at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:417)
at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79)
at org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:380)
at org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:152)
at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:45)
at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:29)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129)
{noformat}

So this leaves the object in the map forever, causing the memory leak. Yet to check how to prevent it from being reinstantiated immediately.

> Fix gradual mem leak: Operationlog related appenders should be cleared up on
> errors
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-22753
>                 URL: https://issues.apache.org/jira/browse/HIVE-22753
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png,
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of exception in SQLOperation, operational log does not get cleared
> up. This causes gradual build up of HushableRandomAccessFileAppender causing
> HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>
> Each HushableRandomAccessFileAppender holds internal ref to
> RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem
> leak.
> Related ticket: HIVE-18820

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
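
For readers following the comment above, the stop-then-recreate sequence can be illustrated with a toy model. This is only a sketch under stated assumptions: AppenderLeakSketch, FileAppenderStub, RouterStub and the "query-1.log" key are hypothetical names invented for illustration, not log4j2 or Hive classes, and a plain queue stands in for the async ring buffer drained by BatchEventProcessor. It models the leak only and does not propose a fix.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: none of these types are real log4j2 or Hive classes.
class AppenderLeakSketch {

    /** Hypothetical stand-in for an appender that owns a 256 KB buffer. */
    static final class FileAppenderStub {
        final ByteBuffer buffer = ByteBuffer.allocate(256 * 1024);
        boolean stopped;
    }

    /** Hypothetical stand-in for a router that lazily creates one appender per key. */
    static final class RouterStub {
        final Map<String, FileAppenderStub> appenders = new ConcurrentHashMap<>();

        void append(String key, String message) {
            // Lazy creation: any event for an unknown key builds a fresh appender.
            appenders.computeIfAbsent(key, k -> new FileAppenderStub());
        }

        void stop(String key) {
            // Cleanup path: remove the appender for this key and mark it stopped.
            FileAppenderStub removed = appenders.remove(key);
            if (removed != null) {
                removed.stopped = true;
            }
        }
    }

    /** Returns how many appenders remain registered after cleanup has run. */
    static int simulate() {
        RouterStub router = new RouterStub();
        Queue<String> pending = new ArrayDeque<>(); // stand-in for the async ring buffer
        String key = "query-1.log";                 // hypothetical route key

        router.append(key, "query started"); // first event creates the appender
        pending.add("residual message");     // queued, but not yet delivered
        router.stop(key);                     // operation cleanup removes the appender

        // The consumer drains the queue after stop(): the late event recreates the appender.
        while (!pending.isEmpty()) {
            router.append(key, pending.poll());
        }
        return router.appenders.size();
    }

    public static void main(String[] args) {
        System.out.println("appenders left after stop(): " + simulate());
    }
}
```

In this model simulate() returns 1: nothing calls stop() again for the recreated entry, so it and its 256 KB buffer stay in the map, which matches the behaviour described in point 2.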