[ https://issues.apache.org/jira/browse/FLINK-31092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693779#comment-17693779 ]
luoyuxia commented on FLINK-31092:
----------------------------------

While analyzing the heap dump, I found that most of the objects are `ServiceLoaderUtil#LoadResult(IllegalStateException)`, and the exception message is `Trying to access closed classloader.xxxx`. I think that's the cause of the OOM.

From the code:
{code:java}
static <T> List<LoadResult<T>> load(Class<T> clazz, ClassLoader classLoader) {
    List<LoadResult<T>> loadResults = new ArrayList<>();

    Iterator<T> serviceLoaderIterator = ServiceLoader.load(clazz, classLoader).iterator();

    while (true) {
        try {
            T next = serviceLoaderIterator.next();
            loadResults.add(new LoadResult<>(next));
        } catch (NoSuchElementException e) {
            break;
        } catch (Throwable t) {
            loadResults.add(new LoadResult<>(t));
        }
    }

    return loadResults;
}
{code}
It seems the loop never terminates when `serviceLoaderIterator.next()` keeps throwing an exception other than NoSuchElementException: each failed call adds another `LoadResult`, until the heap is exhausted.

The stack trace at the point where the OOM happens is as follows:
{code:java}
at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:48)
at org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:179)
at org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResources(FlinkUserCodeClassLoaders.java:213)
at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:348)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:364)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.flink.table.factories.ServiceLoaderUtil.load(ServiceLoaderUtil.java:42)
at org.apache.flink.table.factories.FactoryUtil.discoverFactories(FactoryUtil.java:805)
at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:524)
at org.apache.flink.table.factories.PlannerFactoryUtil.createPlanner(PlannerFactoryUtil.java:45)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.createStreamTableEnvironment(OperationExecutor.java:375)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.getTableEnvironment(OperationExecutor.java:332)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:190)
at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212)
at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl$$Lambda$1007.apply(<unknown string>)
at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:110)
at org.apache.flink.table.gateway.service.operation.OperationManager$$Lambda$1008.call(<unknown string>)
at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:242)
at org.apache.flink.table.gateway.service.operation.OperationManager$Operation$$Lambda$1010.run(<unknown string>)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
[~fsk119] Could you please have a look? Is it possible that a closed classloader is being accessed here?

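
To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It does not use Flink classes: `ServiceLoadSketch`, `AlwaysFailingIterator`, `countFailuresWithOriginalLoop` and `loadFailFast` are hypothetical names, and the failing iterator only imitates a closed classloader by throwing `IllegalStateException` on every `next()` call. The first method keeps the loop shape of `load` (with an artificial cap so the demo terminates); the second shows one possible guard, which is an assumption for discussion and not necessarily the right fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ServiceLoadSketch {

    /** Hypothetical stand-in for an iterator backed by a closed classloader. */
    static final class AlwaysFailingIterator implements Iterator<String> {
        int calls = 0;

        @Override
        public boolean hasNext() {
            return true;
        }

        @Override
        public String next() {
            calls++;
            // The iterator never advances, so every call fails the same way.
            throw new IllegalStateException("Trying to access closed classloader.");
        }
    }

    /** Same loop shape as the quoted load(), plus a cap so this demo stops. */
    static int countFailuresWithOriginalLoop(Iterator<String> it, int cap) {
        List<Throwable> failures = new ArrayList<>();
        while (failures.size() < cap) {
            try {
                it.next();
            } catch (NoSuchElementException e) {
                break; // the only exit of the original loop
            } catch (Throwable t) {
                failures.add(t); // without the cap, this list grows until OOM
            }
        }
        return failures.size();
    }

    /** One possible guard (an assumption): treat IllegalStateException as fatal. */
    static List<String> loadFailFast(Iterator<String> it) {
        List<String> results = new ArrayList<>();
        while (true) {
            try {
                results.add(it.next());
            } catch (NoSuchElementException e) {
                break;
            } catch (IllegalStateException e) {
                throw new IllegalStateException(
                        "Service iterator is unusable, aborting discovery", e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(countFailuresWithOriginalLoop(new AlwaysFailingIterator(), 1000));
        System.out.println(loadFailFast(Arrays.asList("a", "b").iterator()));
        try {
            loadFailFast(new AlwaysFailingIterator());
        } catch (IllegalStateException e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

The sketch deliberately leaves out the per-provider error handling of the real method (it no longer records failures as `LoadResult`), so it only illustrates why the loop needs a second exit condition.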
> Hive ITCases fail with OutOfMemoryError
> ---------------------------------------
>
>                 Key: FLINK-31092
>                 URL: https://issues.apache.org/jira/browse/FLINK-31092
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Assignee: luoyuxia
>            Priority: Critical
>              Labels: test-stability
>         Attachments: VisualVM-FLINK-31092.png
>
>
> We're experiencing an OutOfMemoryError where the heap space reaches the upper limit:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46161&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=23142
> {code}
> Feb 15 05:05:14 [INFO] Running org.apache.flink.table.catalog.hive.HiveCatalogITCase
> Feb 15 05:05:17 [INFO] java.lang.OutOfMemoryError: Java heap space
> Feb 15 05:05:17 [INFO] Dumping heap to java_pid9669.hprof ...
> Feb 15 05:05:28 [INFO] Heap dump file created [1957090051 bytes in 11.718 secs]
> java.lang.OutOfMemoryError: Java heap space
>       at org.apache.maven.surefire.booter.ForkedBooter.cancelPingScheduler(ForkedBooter.java:209)
>       at org.apache.maven.surefire.booter.ForkedBooter.acknowledgedExit(ForkedBooter.java:419)
>       at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:186)
>       at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562)
>       at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548)
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)