Hi Panagiotis,

Thank you for starting this discussion. I think this FLIP is valuable and can help users better analyze the causes of job failovers.
I have two comments:

1. How about adding more job information to FailureListenerContext, for example the job vertex, subtask, and TaskManager location? Users could then compute statistics along different dimensions.
2. Users may want to save results in the listener so that they can still retrieve historical results after a JobManager failover. Can we provide a unified implementation for such data storage requirements?

Best,
shammon FY

On Saturday, March 18, 2023, Panagiotis Garefalakis <pga...@apache.org> wrote:

> Hi everyone,
>
> This FLIP [1] proposes a pluggable interface for failure handling allowing
> users to implement custom failure logic using the plugin framework.
> Motivated by existing proposals [2] and tickets [3], this enables use-cases
> like: assigning particular types to failures (e.g., User or System),
> emitting custom metrics per type (e.g., application or platform), even
> exposing errors to downstream consumers (e.g., notification systems).
>
> Thanks to Piotr and Anton for the initial reviews and discussions!
>
> For anyone interested, the starting point would be the FLIP [1] that I
> created, describing the motivation and the proposed changes (part of the
> core, runtime and web).
>
> The intuition behind this FLIP is being able to execute custom logic on
> failures by exposing a FailureListener interface. Implementation by users
> can be simply loaded to the system as Jar files. FailureListeners may also
> decide to assign failure tags to errors (expressed as strings),
> that will then be exposed as metadata by the UI/Rest interfaces.
>
> Feedback is always appreciated! Looking forward to your thoughts!
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-304%3A+Pluggable+failure+handling+for+Apache+Flink
> [2] https://docs.google.com/document/d/1pcHg9F3GoDDeVD5GIIo2wO67Hmjgy0-hRDeuFnrMgT4
> [3] https://issues.apache.org/jira/browse/FLINK-20833
>
> Cheers,
> Panagiotis