GitHub user zentol commented on the issue: https://github.com/apache/flink/pull/3709

eh... in charge? Whenever *anything* related to a job is requested from the web UI, the EGHolder is accessed. Suppose you have the job info page (/jobs/:jobid) open in a browser or something. The web UI periodically sends requests to the backend, which asks the EGHolder, which in turn asks the JM if it doesn't find the job in the cache.

Now, if we remove the suspended EG we will in fact keep polling the JM until the job is recovered. This is actually the same behavior that you would have if the job is suspended and the GC/guava cache kicks in right away, or if the job was resumed on another JM but you aren't refreshing the web UI (which should redirect to the current leader).

So for adding entries nothing changes; for removing entries the GC is still mostly in charge; we're just adding a small 2-line branch that invalidates suspended ExecutionGraphs and is triggered when a handler accesses the EGHolder.
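To make the lookup path described above concrete, here is a minimal sketch of a holder that drops a suspended graph on access and falls back to the JM. It is not the code from the PR: `GraphHolder`, `Graph`, `JobStatus`, `jobManagerLookup` and the `lookups` counter are all hypothetical names, a plain `HashMap` stands in for the real cache, and GC-driven eviction is not modeled at all.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these types are Flink classes.
final class SuspendedGraphCacheSketch {

    enum JobStatus { RUNNING, SUSPENDED }

    // Stand-in for an ExecutionGraph: only the job id and a mutable status.
    static final class Graph {
        final String jobId;
        volatile JobStatus status;

        Graph(String jobId, JobStatus status) {
            this.jobId = jobId;
            this.status = status;
        }
    }

    // Stand-in for the EGHolder: a cache in front of a JM lookup.
    static final class GraphHolder {
        private final Map<String, Graph> cache = new HashMap<>();
        private final Function<String, Graph> jobManagerLookup;
        int lookups = 0; // number of times the JM was asked (for illustration)

        GraphHolder(Function<String, Graph> jobManagerLookup) {
            this.jobManagerLookup = jobManagerLookup;
        }

        synchronized Graph get(String jobId) {
            Graph cached = cache.get(jobId);
            if (cached != null) {
                // The small extra branch: a suspended graph is stale, so drop it.
                if (cached.status == JobStatus.SUSPENDED) {
                    cache.remove(jobId);
                } else {
                    return cached;
                }
            }
            // Cache miss (or invalidated entry): ask the JM.
            lookups++;
            Graph fresh = jobManagerLookup.apply(jobId);
            if (fresh != null) {
                cache.put(jobId, fresh);
            }
            return fresh;
        }
    }
}
```

With this shape, a page that keeps polling a suspended job hits the JM on every request until the lookup returns a graph again, after which the cache serves it as before.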