[ 
https://issues.apache.org/jira/browse/FLINK-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972943#comment-15972943
 ] 

ASF GitHub Bot commented on FLINK-6295:
---------------------------------------

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/3709
  
    eh... in charge?
    
    Whenever *anything* related to a job is requested from the web-ui the 
EGHolder is accessed.
    
    Suppose you have the job info page (/jobs/:jobid) open in a browser or 
something. The WebUI periodically sends requests to the backend, which asks the 
EGHolder, which in turn asks the JM if it doesn't find the job in the cache. Now, 
if we remove the suspended EG we will in fact keep polling the JM until the job 
is recovered.
    
    This is actually the same behavior that you would have if the job is 
suspended and the GC/guava cache kicks in right away, or if the job was resumed on 
another JM but you aren't refreshing the webUI (which should redirect to the 
current leader).
    
    So for adding entries nothing changes; for removing entries the GC is still 
mostly in charge; we're just adding a small 2-line branch to invalidate 
suspended ExecutionGraphs that is activated if a handler accesses the EGHolder.
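
    To make the branch concrete, here is a rough sketch of the access path 
described above. It is not the code from the PR: every name in it 
(SuspendedGraphCacheSketch, Graph, Holder, jobManagerLookup) is made up for 
illustration, and a plain ConcurrentHashMap stands in for the real cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustration only; none of these types are Flink classes. */
final class SuspendedGraphCacheSketch {

    /** Hypothetical stand-in for a job status. */
    enum Status { RUNNING, SUSPENDED }

    /** Hypothetical stand-in for an ExecutionGraph. */
    static final class Graph {
        final String jobId;
        volatile Status status;

        Graph(String jobId, Status status) {
            this.jobId = jobId;
            this.status = status;
        }
    }

    /** Hypothetical stand-in for the EGHolder. */
    static final class Holder {
        private final Map<String, Graph> cache = new ConcurrentHashMap<>();
        // Assumed callback that asks the JM for a graph; returns null if unknown.
        private final Function<String, Graph> jobManagerLookup;
        // Counts JM round trips, only so the polling behavior is observable.
        private int jobManagerRequests = 0;

        Holder(Function<String, Graph> jobManagerLookup) {
            this.jobManagerLookup = jobManagerLookup;
        }

        Graph get(String jobId) {
            Graph cached = cache.get(jobId);
            if (cached != null) {
                if (cached.status != Status.SUSPENDED) {
                    return cached;
                }
                // The small branch: drop a suspended graph on access so the
                // next lookup goes back to the JM.
                cache.remove(jobId);
            }
            jobManagerRequests++;
            Graph fresh = jobManagerLookup.apply(jobId);
            if (fresh != null) {
                cache.put(jobId, fresh);
            }
            return fresh;
        }

        int jobManagerRequests() {
            return jobManagerRequests;
        }
    }
}
```

    With this shape, a handler hitting the holder for a suspended job causes 
one JM request per access until the JM knows the job again, after which the 
fresh graph is served from the cache.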


> use LoadingCache instead of WeakHashMap to lower latency
> --------------------------------------------------------
>
>                 Key: FLINK-6295
>                 URL: https://issues.apache.org/jira/browse/FLINK-6295
>             Project: Flink
>          Issue Type: Bug
>          Components: Webfrontend
>            Reporter: Tao Wang
>            Assignee: Tao Wang
>
> Now in ExecutionGraphHolder, which is used in many handlers, we use a 
> WeakHashMap to cache ExecutionGraph(s), which is only sensitive to garbage 
> collection.
> The latency is too high when the JVM rarely does GC, which leaves the status of 
> jobs or their tasks out of sync with the real ones.
> LoadingCache is a commonly used cache implementation from the guava lib; we can 
> use its time-based eviction to lower the latency of status updates.
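
For readers unfamiliar with time-based eviction, the idea in the description 
can be sketched without guava. The class below is a hand-rolled stand-in, not 
guava's LoadingCache and not the code proposed for Flink; the name 
TimedCacheSketch, the injected clock and the loader function are all 
assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Illustration only: entries are reloaded once they are older than ttlMillis. */
final class TimedCacheSketch<K, V> {

    private static final class Entry<V> {
        final V value;
        final long writtenAtMillis;

        Entry(V value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    // Injected clock so expiry does not depend on wall time or on GC.
    private final LongSupplier clockMillis;
    // Assumed loader, e.g. "ask the JM for this job"; may return null.
    private final Function<K, V> loader;

    TimedCacheSketch(long ttlMillis, LongSupplier clockMillis, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clockMillis = clockMillis;
        this.loader = loader;
    }

    V get(K key) {
        long now = clockMillis.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry != null && now - entry.writtenAtMillis < ttlMillis) {
            return entry.value; // still fresh
        }
        // Missing or expired: reload and restart the entry's lifetime.
        V loaded = loader.apply(key);
        if (loaded == null) {
            entries.remove(key);
            return null;
        }
        entries.put(key, new Entry<>(loaded, now));
        return loaded;
    }
}
```

The point of the sketch is that staleness is bounded by the configured lifetime 
rather than by when the garbage collector happens to run, which is the latency 
concern raised above.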



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
