[ https://issues.apache.org/jira/browse/FLINK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857340#comment-16857340 ]
vinoyang commented on FLINK-12662:
----------------------------------

Hi [~till.rohrmann],

I have thought a bit about the solution. Basically, the design resembles the existing {{ArchivedExecutionGraph}} and {{AccessExecutionGraph}}. IMO, we can reuse {{AccessExecutionGraph}} for the REST API. In addition, we need to introduce a new execution graph entity, named e.g. {{AttemptedExecutionGraph}}, to distinguish it from {{ArchivedExecutionGraph}}, which requires the job to be in a globally terminal state. We can generate an instance of {{AttemptedExecutionGraph}} when {{ExecutionGraph#tryRestartOrFail}} is called.

We may also need a similar class named {{AttemptedExecutionGraphStore}} to store the {{AttemptedExecutionGraph}} instances, with file-based and in-memory implementations. And we may add a new entry in {{ExecutionGraphCache}}.

This is a coarse-grained idea, WDYT?

> show jobs failover in history server as well
> --------------------------------------------
>
>                 Key: FLINK-12662
>                 URL: https://issues.apache.org/jira/browse/FLINK-12662
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / REST
>            Reporter: Su Ralph
>            Assignee: vinoyang
>            Priority: Major
>
> Currently
> [https://ci.apache.org/projects/flink/flink-docs-release-1.8/monitoring/historyserver.html]
> only shows completed jobs (completed, canceled, failed). It does not show any
> intermediate failover.
> This makes it hard for the cluster administrator/developer to find the first
> failure when two failovers happen. The feature ask is to
> - record each failover in the history server as well.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)