[ 
https://issues.apache.org/jira/browse/FLINK-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-6042:
----------------------------
    Description: 
Users requested that it would be nice to see the last {{n}} exceptions causing 
a job restart in the Web UI. This will help to more easily debug and operate a 
job.

We could store the root causes for failures similar to how prior executions are 
stored in the {{ExecutionVertex}} using the {{EvictingBoundedList}} and then 
serve this information via the Web UI.

Update: January 21, 2021

The UI can already handle multiple exceptions through the Exception History. 
Right now, we list one or more exceptions which caused the job to fail. 
Instead, we could adapt it in a way that the history contains not only the 
exceptions of the most recent failure but one expandable entry per restart. If 
more than one exception is connected to a single restart, we would list 
their stack traces within one expandable entry.

  was:
Users requested that it would be nice to see the last {{n}} exceptions causing 
a job restart in the Web UI. This will help to more easily debug and operate a 
job.

We could store the root causes for failures similar to how prior executions are 
stored in the {{ExecutionVertex}} using the {{EvictingBoundedList}} and then 
serve this information via the Web UI.


> Display last n exceptions/causes for job restarts in Web UI
> -----------------------------------------------------------
>
>                 Key: FLINK-6042
>                 URL: https://issues.apache.org/jira/browse/FLINK-6042
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination, Runtime / Web Frontend
>    Affects Versions: 1.3.0
>            Reporter: Till Rohrmann
>            Assignee: Matthias
>            Priority: Major
>              Labels: pull-request-available
>
> Users requested that it would be nice to see the last {{n}} exceptions 
> causing a job restart in the Web UI. This will help to more easily debug and 
> operate a job.
> We could store the root causes for failures similar to how prior executions 
> are stored in the {{ExecutionVertex}} using the {{EvictingBoundedList}} and 
> then serve this information via the Web UI.
> Update: January 21, 2021
> The UI can already handle multiple exceptions through the Exception History. 
> Right now, we list one or more exceptions which caused the job to fail. 
> Instead, we could adapt it in a way that the history contains not only the 
> exceptions of the most recent failure but one expandable entry per restart. 
> If more than one exception is connected to a single restart, we would list 
> their stack traces within one expandable entry.
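
The idea described above (a bounded history with one entry per restart, where each entry can carry several exceptions) could be sketched roughly as follows. This is only an illustration under assumptions: the class name RestartExceptionHistory, its Entry type, and all method names are hypothetical and are not taken from the Flink code base, and a plain ArrayDeque stands in for the {{EvictingBoundedList}} mentioned in the description.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: keeps the failures of the last n restarts, evicting the oldest first. */
final class RestartExceptionHistory {

    /** One entry per restart: the root cause plus any other exceptions tied to that restart. */
    static final class Entry {
        final long timestampMillis;
        final Throwable rootCause;
        final List<Throwable> relatedFailures;

        Entry(long timestampMillis, Throwable rootCause, List<Throwable> relatedFailures) {
            this.timestampMillis = timestampMillis;
            this.rootCause = rootCause;
            // Defensive copy so later changes to the caller's list do not leak into the history.
            this.relatedFailures =
                    Collections.unmodifiableList(new ArrayList<>(relatedFailures));
        }
    }

    private final int maxEntries;
    private final Deque<Entry> entries = new ArrayDeque<>();

    RestartExceptionHistory(int maxEntries) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    /** Records one restart; drops the oldest entry once the bound is reached. */
    void recordRestart(long timestampMillis, Throwable rootCause, List<Throwable> relatedFailures) {
        if (entries.size() == maxEntries) {
            entries.removeFirst();
        }
        entries.addLast(new Entry(timestampMillis, rootCause, relatedFailures));
    }

    /** Returns a copy of the history, oldest restart first, e.g. for serving to a UI. */
    List<Entry> snapshot() {
        return new ArrayList<>(entries);
    }
}
```

With a bound of n, memory use stays constant no matter how often the job restarts, and a UI could render each Entry as one expandable item that shows the root cause first and the related stack traces underneath.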



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
