Hi Satyam,
I'm not aware of an API that solves all of your problems at once. A
common pattern for handling failures in user code is to catch the
errors there and define a side output on the operator that pipes them
to dedicated sinks. However, such functionality does not exist in SQL
yet. For the sink part, it might be useful to look into the
StreamingFileSink [1], which provides better failure-handling
guarantees. Flink 1.11 will ship with a SQL streaming file sink.
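As a rough, self-contained sketch of that pattern in the DataStream
API (not SQL; Integer.parseInt stands in for real user logic, and the
"ERR" print sink stands in for a dedicated error sink):

  import org.apache.flink.streaming.api.datastream.DataStream;
  import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
  import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
  import org.apache.flink.streaming.api.functions.ProcessFunction;
  import org.apache.flink.util.Collector;
  import org.apache.flink.util.OutputTag;

  public class SideOutputErrors {
    public static void main(String[] args) throws Exception {
      StreamExecutionEnvironment env =
          StreamExecutionEnvironment.getExecutionEnvironment();
      DataStream<String> input = env.fromElements("1", "2", "oops", "3");

      // Tag for records that fail in user code; the anonymous subclass
      // preserves the generic type information.
      final OutputTag<String> errorTag = new OutputTag<String>("errors") {};

      SingleOutputStreamOperator<Integer> parsed =
          input.process(new ProcessFunction<String, Integer>() {
            @Override
            public void processElement(
                String value, Context ctx, Collector<Integer> out) {
              try {
                out.collect(Integer.parseInt(value));
              } catch (NumberFormatException e) {
                // Route the failure to the side output instead of
                // failing the whole job.
                ctx.output(errorTag, value);
              }
            }
          });

      parsed.print();                              // regular results
      parsed.getSideOutput(errorTag).print("ERR"); // dedicated error sink
      env.execute("side-output-example");
    }
  }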
Regards,
Timo
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
On 28.06.20 12:27, Satyam Shekhar wrote:
Hello,
I am using Flink as the query engine for running SQL queries on both
batch and streaming data. I use the Blink planner in batch mode for
the former and in streaming mode for the latter.
In my current setup, I execute batch queries synchronously via the
StreamTableEnvironment::execute method. The job uses an OutputFormat
to consume results in a StreamTableSink and send them to the user. If
there is an error/exception in the pipeline (possibly due to user
code), it is not reported to the OutputFormat or the sink. When an
error occurs after the invocation of the write method on the
OutputFormat, the implementation may falsely assume that the result is
successful and complete, since close is called in both the success and
failure cases. I can work around this by checking for exceptions
thrown by the execute method, but that adds extra latency due to the
job tear-down cost.
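For illustration, the workaround I have in mind looks roughly like
this (API names as of Flink 1.10; reportFailure is a hypothetical
error-reporting hook, and tEnv and the sink registration are set up
elsewhere):

  try {
      tEnv.sqlUpdate("INSERT INTO resultSink SELECT ...");
      // Blocks until the job finishes; failures anywhere in the
      // pipeline surface as an exception here, but only after the
      // job has been torn down.
      tEnv.execute("batch-query");
  } catch (Exception e) {
      // Only here can the caller learn that the rows already written
      // to the OutputFormat are incomplete.
      reportFailure(e);
  }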
A similar problem also exists for streaming jobs. In my setup,
streaming jobs are executed asynchronously via
StreamExecutionEnvironment::executeAsync. Since the sink interface has
no methods to receive errors from the pipeline, the user code has to
periodically track and manage persistent failures on its own.
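Concretely, the only out-of-band option I have found is to watch the
JobClient returned by executeAsync (getJobExecutionResult takes the
user ClassLoader as of Flink 1.10; handleJobFailure is a hypothetical
hook):

  import org.apache.flink.core.execution.JobClient;

  JobClient client = env.executeAsync("streaming-query");

  // The sink never sees pipeline errors, so user code must track the
  // job status separately.
  client.getJobExecutionResult(Thread.currentThread().getContextClassLoader())
        .whenComplete((result, error) -> {
            if (error != null) {
                handleJobFailure(error);
            }
        });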
Have I missed something in the API? Or is there some other way to get
access to the error status in user code?
Regards,
Satyam