[ 
https://issues.apache.org/jira/browse/FLINK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857340#comment-16857340
 ] 

vinoyang commented on FLINK-12662:
----------------------------------

Hi [~till.rohrmann], I have thought a bit about the solution. Basically, the 
design resembles that of {{ArchivedExecutionGraph}} and {{AccessExecutionGraph}}. 

IMO, we can reuse {{AccessExecutionGraph}} for the REST API. In addition, we 
need to introduce a new execution graph entity, named e.g. 
{{AttemptedExecutionGraph}}, to distinguish it from {{ArchivedExecutionGraph}}, 
which requires the job to be in a globally terminal state. We can generate an 
instance of {{AttemptedExecutionGraph}} when 
{{ExecutionGraph#tryRestartOrFail}} is called.

We may also need a similar class, named {{AttemptedExecutionGraphStore}}, 
to store the instances of {{AttemptedExecutionGraph}}, with file-based and 
in-memory implementations. We may also add a new entry to 
{{ExecutionGraphCache}}.
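
To make the idea a bit more concrete, here is a rough sketch of what the 
snapshot and an in-memory store could look like. Everything in it is 
hypothetical: the fields of {{AttemptedExecutionGraph}}, the 
{{put}}/{{getAttempts}} methods and the {{MemoryAttemptedExecutionGraphStore}} 
name are assumptions for illustration only, not existing Flink API. A real 
version would implement {{AccessExecutionGraph}} and also need a file-based 
store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical immutable snapshot of one failed attempt of a job that is still alive. */
final class AttemptedExecutionGraph {
    private final String jobId;        // assumption: a plain String stands in for the job id
    private final int attemptNumber;   // 1 for the first failover, 2 for the second, ...
    private final String failureCause; // assumption: short description of why the attempt failed
    private final long timestamp;      // when the snapshot was taken (epoch millis)

    AttemptedExecutionGraph(String jobId, int attemptNumber, String failureCause, long timestamp) {
        this.jobId = jobId;
        this.attemptNumber = attemptNumber;
        this.failureCause = failureCause;
        this.timestamp = timestamp;
    }

    String getJobId() { return jobId; }
    int getAttemptNumber() { return attemptNumber; }
    String getFailureCause() { return failureCause; }
    long getTimestamp() { return timestamp; }
}

/** Hypothetical store for failover snapshots; method names are assumptions. */
interface AttemptedExecutionGraphStore {
    void put(AttemptedExecutionGraph attempt);

    /** Returns the recorded attempts of a job in insertion order, or an empty list. */
    List<AttemptedExecutionGraph> getAttempts(String jobId);
}

/** In-memory variant; a file-based variant would implement the same interface. */
final class MemoryAttemptedExecutionGraphStore implements AttemptedExecutionGraphStore {
    private final Map<String, List<AttemptedExecutionGraph>> attempts = new ConcurrentHashMap<>();

    @Override
    public void put(AttemptedExecutionGraph attempt) {
        // Create the per-job list on first use, then append the new snapshot.
        attempts.computeIfAbsent(attempt.getJobId(), id -> new CopyOnWriteArrayList<>())
                .add(attempt);
    }

    @Override
    public List<AttemptedExecutionGraph> getAttempts(String jobId) {
        List<AttemptedExecutionGraph> found = attempts.get(jobId);
        if (found == null) {
            return Collections.emptyList();
        }
        // Hand out a read-only copy so callers cannot modify the store.
        return Collections.unmodifiableList(new ArrayList<AttemptedExecutionGraph>(found));
    }
}
```

In this sketch, the restart path would call {{put}} once per failover, and a 
REST handler could call {{getAttempts}} to list the intermediate failures of a 
job that has not yet reached a globally terminal state.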

This is a coarse-grained idea, WDYT?

 

> show jobs failover in history server as well
> --------------------------------------------
>
>                 Key: FLINK-12662
>                 URL: https://issues.apache.org/jira/browse/FLINK-12662
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / REST
>            Reporter: Su Ralph
>            Assignee: vinoyang
>            Priority: Major
>
> Currently, the history server 
> [https://ci.apache.org/projects/flink/flink-docs-release-1.8/monitoring/historyserver.html]
>  only shows completed jobs (completed, canceled, failed). It does not show any 
> intermediate failovers. 
> This makes it hard for the cluster administrator/developer to find the first 
> failure when two failovers happen. The feature request is to 
> - record each failover in the history server as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
