Hi Mich,

at the moment there is not much support for handling such data-driven
exceptions (badly formatted data, late data, ...).
However, there is a proposal to improve this: FLIP-13 [1]. So it is a work
in progress.

It would be very helpful if you could check if the proposal would address
your use cases.

Until then, I guess that either the split operator or directly writing to
an external storage system would be the way to go.
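To illustrate the split approach: the idea is to catch the deserialization error in a map step, tag each record as good or bad, and then route the two tags to separate branches. The sketch below is plain Java with no Flink dependency, just to show the pattern; the `Tagged` wrapper, the `parse` helper, and the field names are hypothetical, and Flink's actual `split`/`select` operators work on a `DataStream` rather than a `java.util.stream.Stream`.

```java
import java.util.*;
import java.util.stream.*;

public class SplitSketch {
    // Hypothetical wrapper: holds either the parsed value or the raw
    // input that failed, plus a tag saying which branch it belongs to.
    static final class Tagged {
        final Integer value;   // parsed result, null on failure
        final String raw;      // original input, kept for inspection
        final boolean ok;
        Tagged(Integer value, String raw, boolean ok) {
            this.value = value; this.raw = raw; this.ok = ok;
        }
    }

    // Stage 1: catch the deserialization error and tag the record
    // instead of letting the exception interrupt the stream.
    static Tagged parse(String raw) {
        try {
            return new Tagged(Integer.parseInt(raw), raw, true);
        } catch (NumberFormatException e) {
            return new Tagged(null, raw, false);
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1", "oops", "3");

        List<Tagged> tagged = input.stream()
                .map(SplitSketch::parse)
                .collect(Collectors.toList());

        // Stage 2: "split" by tag, analogous to selecting the
        // "OK" and "error" outputs of Flink's split operator.
        List<Integer> good = tagged.stream().filter(t -> t.ok)
                .map(t -> t.value).collect(Collectors.toList());
        List<String> errors = tagged.stream().filter(t -> !t.ok)
                .map(t -> t.raw).collect(Collectors.toList());

        System.out.println(good);    // [1, 3]
        System.out.println(errors);  // [oops]
    }
}
```

The error branch can then be sent to a sink of its own (e.g. an external store) for later inspection, while the good branch continues through the normal transformations.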

Best,
Fabian

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-13+Side+Outputs+in+Flink

2016-11-11 2:33 GMT+01:00 Michel Betancourt <michelbetanco...@gmail.com>:

>
> Hi, new to Apache Flink.  Trying to find some solid input on how best to
> handle exceptions in streams -- specifically those that should not
> interrupt the stream.
>
> For example, if an error occurs during deserialization from bytes/Strings
> to your data-type, in my use-case I would rather queue the data for visual
> inspection than discard it and filter it out.
>
> One way of doing this is to diverge the stream so that good items take one
> path, while bad items take another.
>
> The closest thing I can find in Flink that can achieve this effect is the
> split operator. The caveat is that split does not also allow for inlined
> transformations.  In other words, the best use of split appears to be to
> first perform the logic that catches the exception, then pass the result
> into the next stage, which uses split to check for an exception and
> assign names to each branch of the decision, for example "OK" vs "error".
>
> Frameworks like RX (Reactive Extensions, eg RxJava) have built in
> functionality that allows the user to decide if they want to handle
> exceptions globally or specifically and resume if needed.  I was hoping to
> find similar operations in Flink but so far no luck.
>
> At any rate, it would be great to get some feedback to see if I am heading
> down the good path here, and whether there are any caveats / gotchas to be
> aware of?
>
> Thanks!
> Mich
>
>