Hello Users!

    I would like to notify an external endpoint when a streaming job has
restarted a certain number of times. While I could use a service to
continuously *poll* Flink metrics and identify failing jobs, I am looking to
invert the approach and have the job send the notification itself. We have
around 50 streaming jobs, and querying them on a continuous basis gets
challenging.

    Looking into [1], the intrusive way would be to perform the action at [2]
(not tested, though). I would be happy to hear suggestions and alternatives.
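
    To make the "job notifies" idea concrete, here is a minimal sketch of the
push-style logic, independent of where it would be hooked in. It is plain
Java with no Flink dependency; the class name RestartNotifier, the method
onRestart, and the callback signature are all hypothetical and are not part of
any Flink API. The assumption is that something in the job or cluster can call
onRestart() each time a restart happens.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

/**
 * Hypothetical helper (not a Flink class): counts restarts of one job and
 * invokes a callback exactly once, when the count reaches the threshold.
 */
final class RestartNotifier {
    private final String jobName;
    private final int threshold;
    // Callback receives (jobName, restartCount); it could, for example,
    // send an HTTP POST to the external endpoint.
    private final BiConsumer<String, Integer> notifier;
    private final AtomicInteger restarts = new AtomicInteger();

    RestartNotifier(String jobName, int threshold,
                    BiConsumer<String, Integer> notifier) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.jobName = jobName;
        this.threshold = threshold;
        this.notifier = notifier;
    }

    /** Records one restart and returns the updated restart count. */
    int onRestart() {
        int count = restarts.incrementAndGet();
        // Fire only on the exact crossing so the endpoint is not notified
        // again for every subsequent restart.
        if (count == threshold) {
            notifier.accept(jobName, count);
        }
        return count;
    }
}
```

    With a threshold of 3, the callback would run on the third call to
onRestart() and stay silent afterwards. Keeping the endpoint call behind a
callback means the counting logic can be tested without any network access.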


[1]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/execution/task_failure_recovery/#restart-strategies


[2]
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/FixedDelayRestartBackoffTimeStrategy.java#L68


Thanks
AK.
