[ https://issues.apache.org/jira/browse/FLINK-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269409#comment-17269409 ]
Till Rohrmann commented on FLINK-6042:
--------------------------------------

Taking your argument, why is it better to add the exception information method to the {{ArchivedExecutionGraph}}, thereby making it accessible to all {{AbstractExecutionGraphHandler}} handlers? Wouldn't it make sense to only provide access to the information a handler needs? In our case, one could give access to the {{AccessExecutionGraph}} for those handlers which extract information from the {{ExecutionGraph}} and maybe something like a {{FailureHistory}} for the {{JobExceptionsHandler}}. In the end the {{ArchivedExecutionGraph}} might also implement {{FailureHistory}}, but I think the important bit is to segregate the interfaces.

Thinking a step ahead, how would it work with the {{ArchivedExecutionGraph}} if we send multiple graphs because it changed over the job's lifetime? To which graph will the exception that ends a graph's lifetime be assigned?

> Display last n exceptions/causes for job restarts in Web UI
> -----------------------------------------------------------
>
>                 Key: FLINK-6042
>                 URL: https://issues.apache.org/jira/browse/FLINK-6042
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination, Runtime / Web Frontend
>    Affects Versions: 1.3.0
>            Reporter: Till Rohrmann
>            Assignee: Matthias
>            Priority: Major
>              Labels: pull-request-available
>
> Users requested that it would be nice to see the last {{n}} exceptions
> causing a job restart in the Web UI. This will help to more easily debug and
> operate a job.
> We could store the root causes for failures similar to how prior executions
> are stored in the {{ExecutionVertex}} using the {{EvictingBoundedList}} and
> then serve this information via the Web UI.
> _-- Update: January 21, 2021 --_
> The UI can already handle multiple exceptions through the Exception History.
> Right now, we list one or more exceptions which caused the job to fail.
> Instead, we could adapt it in a way that the history contains not only the
> exceptions of the most recent failure but one expandable entry per restart.
> If more than one exception is connected to a single restart, we would
> list their stacktraces within one expandable entry.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
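
The interface segregation suggested in the comment above could look roughly like the following sketch. It is illustrative only and not Flink code: every type in it ({{GraphView}}, {{BoundedFailureHistory}}, {{ArchivedGraph}}, {{ExceptionsHandler}}) is a hypothetical stand-in, {{FailureHistory}} follows the name floated in the comment without claiming any particular method set, and a plain {{ArrayDeque}} is used in place of the {{EvictingBoundedList}} mentioned in the issue description.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch only: all nested types are hypothetical, not Flink's actual API. */
final class FailureHistorySketch {

    /** Narrow view for a handler that only renders failures (assumed shape). */
    interface FailureHistory {
        /** Returns the retained root causes, oldest first. */
        List<String> getFailureCauses();
    }

    /** Stand-in for the read-only graph view that other handlers would use. */
    interface GraphView {
        String getJobName();
    }

    /** Keeps only the last n root causes, evicting the oldest on overflow. */
    static final class BoundedFailureHistory implements FailureHistory {
        private final int capacity;
        private final Deque<String> causes = new ArrayDeque<>();

        BoundedFailureHistory(int capacity) {
            if (capacity <= 0) {
                throw new IllegalArgumentException("capacity must be positive");
            }
            this.capacity = capacity;
        }

        void record(String rootCause) {
            if (causes.size() == capacity) {
                causes.removeFirst(); // drop the oldest entry
            }
            causes.addLast(rootCause);
        }

        @Override
        public List<String> getFailureCauses() {
            return new ArrayList<>(causes); // defensive copy
        }
    }

    /** An archived graph may implement both views, as the comment allows. */
    static final class ArchivedGraph implements GraphView, FailureHistory {
        private final String jobName;
        private final BoundedFailureHistory history;

        ArchivedGraph(String jobName, BoundedFailureHistory history) {
            this.jobName = jobName;
            this.history = history;
        }

        @Override
        public String getJobName() {
            return jobName;
        }

        @Override
        public List<String> getFailureCauses() {
            return history.getFailureCauses();
        }
    }

    /** Depends on FailureHistory only, so it cannot reach graph internals. */
    static final class ExceptionsHandler {
        private final FailureHistory history;

        ExceptionsHandler(FailureHistory history) {
            this.history = history;
        }

        String render() {
            return String.join("\n", history.getFailureCauses());
        }
    }

    public static void main(String[] args) {
        BoundedFailureHistory history = new BoundedFailureHistory(2);
        history.record("restart 1: IOException");
        history.record("restart 2: TimeoutException");
        history.record("restart 3: OutOfMemoryError");

        ArchivedGraph graph = new ArchivedGraph("example-job", history);
        // The handler is handed the narrow interface, not the whole graph type.
        System.out.println(new ExceptionsHandler(graph).render());
    }
}
```

With a capacity of 2, the handler in {{main}} renders only the two most recent causes. The sketch deliberately leaves open the question raised at the end of the comment, namely which graph a failure belongs to when several graphs are archived over a job's lifetime.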