[jira] [Commented] (FLINK-6295) Update suspended ExecutionGraph to lower latency

ASF GitHub Bot (JIRA) Mon, 24 Apr 2017 07:51:21 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981275#comment-15981275
 ]


ASF GitHub Bot commented on FLINK-6295:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/3709
  
    @WangTaoTheTonic I think the main source of confusion is the following: The 
cache does not cache any status. It really just duplicates the pointer to the live 
`ExecutionGraph` object (the `AccessExecutionGraph` and the `ExecutionGraph` 
are the same here; the names are an artifact of an earlier approach to creating a 
History Server).
    
    The only case that is problematic is the case where there are multiple 
execution graphs, which happens upon leader change.
    
    Another way to fix this would have been to remove the graph from the cache 
whenever leader status is lost.
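
To make the point about pointer duplication concrete, here is a minimal sketch. `LiveGraph` and `GraphHolder` are hypothetical stand-ins, not Flink's actual `ExecutionGraph` or `ExecutionGraphHolder`, and `onLeadershipLost()` is an assumed hook name used only to illustrate the alternative fix.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for the live ExecutionGraph; not Flink's class.
class LiveGraph {
    private volatile String state = "RUNNING";

    String getState() { return state; }

    void setState(String newState) { state = newState; }
}

// Hypothetical holder that caches references to live graphs, keyed by job id.
class GraphHolder {
    // Values are references to the live objects, not snapshots of their status.
    private final Map<String, LiveGraph> cache = new WeakHashMap<>();

    synchronized void put(String jobId, LiveGraph graph) {
        cache.put(jobId, graph);
    }

    synchronized LiveGraph get(String jobId) {
        return cache.get(jobId);
    }

    // Assumed hook: drop all cached references when leadership is lost, so a
    // suspended graph is never served after a leader change.
    synchronized void onLeadershipLost() {
        cache.clear();
    }
}
```

Because `get` returns the same object that was stored, a later `setState` call on the graph is visible through the holder immediately. A stale answer can only arise when a second graph is created for the same job after a leader change while the holder still points at the first one; clearing the cache on leadership loss avoids that.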


> Update suspended ExecutionGraph to lower latency
> ------------------------------------------------
>
>                 Key: FLINK-6295
>                 URL: https://issues.apache.org/jira/browse/FLINK-6295
>             Project: Flink
>          Issue Type: Bug
>          Components: Webfrontend
>            Reporter: Tao Wang
>            Assignee: Tao Wang
>
> Currently, ExecutionGraphHolder, which is used in many handlers, uses a 
> WeakHashMap to cache ExecutionGraph(s), so entries are evicted only by garbage 
> collection.
> When the JVM rarely runs GC, the latency is too high, so the reported status of 
> jobs or their tasks can differ from the real status.
> LoadingCache is a commonly used cache implementation from the Guava library; we 
> can use its time-based eviction to lower the latency of status updates.
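
The time-based eviction proposed in the issue can be sketched without any dependency. `ExpiringCache` below is a hypothetical, simplified stand-in for the idea, not Guava's `LoadingCache` API; the injectable clock is an assumption added so that the behavior is deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical minimal cache that reloads an entry once it is older than a TTL.
class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;   // returns the current time in milliseconds
    private final Function<K, V> loader; // fetches a fresh value for a key

    ExpiringCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        // Load on a miss, or reload when the entry has reached the TTL.
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

With this approach the staleness of a cached entry is bounded by the TTL instead of depending on when the JVM happens to run garbage collection. In production code one would pass `System::currentTimeMillis` as the clock.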



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
