Hi Panagiotis

Thank you for starting this discussion. I think this FLIP is valuable and
can help users better analyze the causes of job failovers!

I have two comments:

1. How about adding more job information to FailureListenerContext? For
example, the job vertex, subtask, and TaskManager location. Users could
then compute statistics along different dimensions.
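
To make this concrete, here is a rough sketch of what I have in mind. All of
the names below (the getters on FailureListenerContext, SimpleFailureContext,
FailureStats) are hypothetical and only for illustration; they are not taken
from the FLIP, and the real interface may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical extension of the context; these getters are NOT in the FLIP.
interface FailureListenerContext {
    String getJobVertexName();
    int getSubtaskIndex();
    String getTaskManagerLocation();
}

// Plain immutable implementation, used only to drive this example.
final class SimpleFailureContext implements FailureListenerContext {
    private final String jobVertexName;
    private final int subtaskIndex;
    private final String taskManagerLocation;

    SimpleFailureContext(String jobVertexName, int subtaskIndex, String taskManagerLocation) {
        this.jobVertexName = jobVertexName;
        this.subtaskIndex = subtaskIndex;
        this.taskManagerLocation = taskManagerLocation;
    }

    @Override public String getJobVertexName() { return jobVertexName; }
    @Override public int getSubtaskIndex() { return subtaskIndex; }
    @Override public String getTaskManagerLocation() { return taskManagerLocation; }
}

// Hypothetical listener logic: counts failures per vertex and per TaskManager.
final class FailureStats {
    private final Map<String, Integer> byVertex = new HashMap<>();
    private final Map<String, Integer> byTaskManager = new HashMap<>();

    void onFailure(Throwable cause, FailureListenerContext ctx) {
        // merge() inserts 1 for a new key, otherwise adds 1 to the existing count.
        byVertex.merge(ctx.getJobVertexName(), 1, Integer::sum);
        byTaskManager.merge(ctx.getTaskManagerLocation(), 1, Integer::sum);
    }

    int failuresForVertex(String vertexName) {
        return byVertex.getOrDefault(vertexName, 0);
    }

    int failuresForTaskManager(String location) {
        return byTaskManager.getOrDefault(location, 0);
    }
}
```

With something like this, a listener could report, for example, that most
failures come from one vertex or from one TaskManager.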

2. Users may want to save results in the listener so that they can still get
the historical results after a JobManager failover. Can we provide a unified
implementation for such data storage requirements?
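
One possible shape for this, purely as a sketch: a small storage interface
that listeners write to and read from. FailureResultStore and
InMemoryFailureResultStore are hypothetical names, not part of the FLIP. The
in-memory version below is only a stand-in and would not survive a JobManager
failover; a durable implementation behind the same interface would be needed
for that.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage abstraction for listener results; NOT part of the FLIP.
interface FailureResultStore {
    void append(String jobId, String result);
    List<String> history(String jobId);
}

// In-memory stand-in. A real implementation would have to persist the data
// somewhere that outlives the JobManager process.
final class InMemoryFailureResultStore implements FailureResultStore {
    private final Map<String, List<String>> results = new ConcurrentHashMap<>();

    @Override
    public void append(String jobId, String result) {
        results
            .computeIfAbsent(jobId, k -> Collections.synchronizedList(new ArrayList<>()))
            .add(result);
    }

    @Override
    public List<String> history(String jobId) {
        List<String> stored = results.get(jobId);
        if (stored == null) {
            return Collections.emptyList();
        }
        // Copy under the list's lock so callers get a stable snapshot.
        synchronized (stored) {
            return new ArrayList<>(stored);
        }
    }
}
```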


Best,
shammon FY


On Saturday, March 18, 2023, Panagiotis Garefalakis <pga...@apache.org>
wrote:

> Hi everyone,
>
> This FLIP [1] proposes a pluggable interface for failure handling allowing
> users to implement custom failure logic using the plugin framework.
> Motivated by existing proposals [2] and tickets [3], this enables use-cases
> like: assigning particular types to failures (e.g., User or System),
> emitting custom metrics per type (e.g., application or platform), even
> exposing errors to downstream consumers (e.g., notification systems).
>
> Thanks to Piotr and Anton for the initial reviews and discussions!
>
> For anyone interested, the starting point would be the FLIP [1] that I
> created,
> describing the motivation and the proposed changes (part of the core,
> runtime and web).
>
> The intuition behind this FLIP is being able to execute custom logic on
> failures by exposing a FailureListener interface. Implementation by users
> can be simply loaded to the system as Jar files. FailureListeners may also
> decide to assign failure tags to errors (expressed as strings),
> that will then be exposed as metadata by the UI/Rest interfaces.
>
> Feedback is always appreciated! Looking forward to your thoughts!
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-304%3A+Pluggable+failure+handling+for+Apache+Flink
> [2]
> https://docs.google.com/document/d/1pcHg9F3GoDDeVD5GIIo2wO67Hmjgy0-hRDeuFnrMgT4
> [3] https://issues.apache.org/jira/browse/FLINK-20833
>
> Cheers,
> Panagiotis
>
