[ https://issues.apache.org/jira/browse/FLINK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180577#comment-17180577 ]
Stephan Ewen commented on FLINK-16408:
--------------------------------------

Just one caveat: In previous debugging sessions, we found that Metaspace garbage collection seems to trail behind a bit, at least in some JVM versions. So if you submit the job every 1 second, the JVM GC might not keep up. And unlike heap GC (where a new object allocation will block if the GC needs to free up heap space first), it seems the JVM will not block class loading on Metaspace GC.

A test to check that would be to submit a new job every 10-15 seconds and see whether this can go on indefinitely. If that works, but submitting a job every 1 second does not, then you may have the case that there is no class leak, just class GC not keeping up.

> Bind user code class loader to lifetime of a slot
> -------------------------------------------------
>
>                 Key: FLINK-16408
>                 URL: https://issues.apache.org/jira/browse/FLINK-16408
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.9.2, 1.10.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>         Attachments: Metaspace-OOM.png
>
>
> In order to avoid class leaks due to creating multiple user code class
> loaders and loading class multiple times in a recovery case, I would suggest
> to bind the lifetime of a user code class loader to the lifetime of a slot.
> More precisely, the user code class loader should live at most as long as the
> slot which is using it.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
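
The interval test suggested in the comment could be scripted roughly as follows. This is only a sketch under assumptions, not something taken from the issue: the helper names (count_successful_submissions, interpret, flink_cli_submitter) are hypothetical, the job is assumed to be short-lived, and a failed submission is assumed to show up as a non-zero exit code of `flink run`. Reading a failure at the slow rate as a possible class leak is an inference from the comment, not a statement made in it.

```python
import subprocess
import time


def count_successful_submissions(submit, interval_s, attempts, sleep=time.sleep):
    """Call submit() up to `attempts` times, waiting `interval_s` seconds
    after each success. Returns how many submissions succeeded before the
    first failure (or `attempts` if none failed)."""
    for done in range(attempts):
        if not submit():
            return done
        sleep(interval_s)
    return attempts


def interpret(slow_ok, fast_ok):
    """Rough reading of the two runs, following the reasoning in the comment."""
    if slow_ok and not fast_ok:
        return "no leak indicated; class GC may not be keeping up"
    if not slow_ok:
        # Inference: failing even at the slow rate points towards a leak.
        return "possible class leak"
    return "no Metaspace problem observed"


def flink_cli_submitter(jar_path):
    """Hypothetical submitter: runs `flink run <jar>` and treats exit code 0
    as success. Assumes the flink CLI is on the PATH."""
    def submit():
        return subprocess.run(["flink", "run", jar_path]).returncode == 0
    return submit
```

A run would then compare the two rates, for example count_successful_submissions(flink_cli_submitter("job.jar"), 12, 500) against the same call with an interval of 1, where "job.jar" and the attempt count are placeholders. Passing the submit function and the sleep function in as arguments keeps the loop testable without a cluster.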