[ 
https://issues.apache.org/jira/browse/FLINK-31092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695034#comment-17695034
 ] 

Shengkai Fang edited comment on FLINK-31092 at 3/1/23 12:14 PM:
----------------------------------------------------------------

Hi, all. I think [~luoyuxia] is right. 
Once the SessionManager has closed the classloader, `ServiceLoaderUtil#load` 
keeps adding `LoadResult(IllegalStateException)` entries to the result list. 
This can happen when the session is closed while an operation is still running. 
However, it's difficult for the Gateway to forcibly cancel a task that has been 
submitted to the `ExecutorService` if the task doesn't respect the interrupted 
flag.

[~slinkydeveloper], [~twalthr] could you share some thoughts about this? Can we 
limit the types of exception caught here to prevent endless loading?
{code:java}
static <T> List<LoadResult<T>> load(Class<T> clazz, ClassLoader classLoader) {
    List<LoadResult<T>> loadResults = new ArrayList<>();

    Iterator<T> serviceLoaderIterator = ServiceLoader.load(clazz, classLoader).iterator();

    while (true) {
        try {
            T next = serviceLoaderIterator.next();
            loadResults.add(new LoadResult<>(next));
        } catch (NoSuchElementException e) {
            break;
        } catch (Throwable t) {
            loadResults.add(new LoadResult<>(t));
        }
    }

    return loadResults;
}
{code}
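
To make the question concrete, here is a rough sketch of one way to narrow the 
catch clause. It is not the actual Flink code or an agreed fix: 
`BoundedServiceLoading`, `drain` and the simplified `LoadResult` are 
hypothetical names, and it assumes `ServiceConfigurationError` is the only 
failure we want to record per provider. Anything else, such as the 
`IllegalStateException` from a closed classloader, propagates and ends the loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.ServiceConfigurationError;

final class BoundedServiceLoading {

    /** Hypothetical, simplified stand-in for LoadResult: a service or the error that replaced it. */
    static final class LoadResult<T> {
        final T service;
        final Throwable error;

        private LoadResult(T service, Throwable error) {
            this.service = service;
            this.error = error;
        }

        static <T> LoadResult<T> success(T service) {
            return new LoadResult<>(service, null);
        }

        static <T> LoadResult<T> failure(Throwable error) {
            return new LoadResult<>(null, error);
        }

        boolean hasFailed() {
            return error != null;
        }
    }

    /**
     * Drains the iterator. Only ServiceConfigurationError is recorded as a
     * per-provider failure (an assumption of this sketch); any other throwable
     * is not caught, so it propagates and the loop cannot spin forever.
     */
    static <T> List<LoadResult<T>> drain(Iterator<T> iterator) {
        List<LoadResult<T>> results = new ArrayList<>();
        while (true) {
            try {
                results.add(LoadResult.success(iterator.next()));
            } catch (NoSuchElementException e) {
                break; // normal end of iteration
            } catch (ServiceConfigurationError e) {
                results.add(LoadResult.failure(e)); // record and try the next provider
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // A healthy iterator is drained completely.
        System.out.println(drain(Arrays.asList("a", "b").iterator()).size());

        // Simulates an iterator that throws on every call, as described for a closed classloader.
        Iterator<String> closed = new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return true;
            }

            @Override
            public String next() {
                throw new IllegalStateException("classloader closed");
            }
        };
        try {
            drain(closed);
        } catch (IllegalStateException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

With this shape, a closed classloader surfaces as a single exception to the 
caller instead of an ever-growing result list. Which other throwable types 
should still be recorded rather than rethrown is left open.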



> Hive ITCases fail with OutOfMemoryError
> ---------------------------------------
>
>                 Key: FLINK-31092
>                 URL: https://issues.apache.org/jira/browse/FLINK-31092
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Assignee: luoyuxia
>            Priority: Blocker
>              Labels: test-stability
>         Attachments: -__w-2-s-flink-connectors-flink-connector-hive-target-surefire-reports-2023-02-15T05-01-18_982-jvmRun4.dump, VisualVM-FLINK-31092.png
>
>
> We're experiencing an OutOfMemoryError where the heap space reaches the upper 
> limit:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46161&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=23142
> {code}
> Feb 15 05:05:14 [INFO] Running org.apache.flink.table.catalog.hive.HiveCatalogITCase
> Feb 15 05:05:17 [INFO] java.lang.OutOfMemoryError: Java heap space
> Feb 15 05:05:17 [INFO] Dumping heap to java_pid9669.hprof ...
> Feb 15 05:05:28 [INFO] Heap dump file created [1957090051 bytes in 11.718 secs]
> java.lang.OutOfMemoryError: Java heap space
>       at org.apache.maven.surefire.booter.ForkedBooter.cancelPingScheduler(ForkedBooter.java:209)
>       at org.apache.maven.surefire.booter.ForkedBooter.acknowledgedExit(ForkedBooter.java:419)
>       at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:186)
>       at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562)
>       at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548)
> {code}


