Re: Discarding bad data in Stream

2018-02-19 Thread Niclas Hedhman
Thanks Fabian, I have seen Side Outputs and OutputTags but have not fully understood the mechanics yet. In my case, I don't need to keep bad records... And I think I will end up with flatMap() after all; it just becomes an internal documentation issue to provide relevant information... Thanks for your

Re: Discarding bad data in Stream

2018-02-19 Thread Fabian Hueske
Hi Niclas, I'd add a Filter to discard bad records directly. That makes the behavior explicit. If you need to do complex transformations that you don't want to do twice, the FlatMap approach would be the most efficient. If you'd like to keep the bad records, you can implement a Proces
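
The trade-off between the Filter and FlatMap approaches can be sketched as follows. This is a rough illustration only: it uses plain java.util.stream rather than Flink's DataStream API, and the class name DiscardSketch, the helper tryParse, and the rule "a record is bad if it does not parse as an integer" are all hypothetical stand-ins for whatever validation the real job performs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustration with plain Java streams, NOT the Flink DataStream API.
class DiscardSketch {

    // Hypothetical "expensive" transformation: empty result means a bad record.
    static Optional<Integer> tryParse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Filter approach: the discard step is explicit, but the record is parsed twice
    // (once to decide whether to keep it, once to transform it).
    static List<Integer> withFilter(List<String> input) {
        return input.stream()
                .filter(raw -> tryParse(raw).isPresent())
                .map(raw -> tryParse(raw).get())
                .collect(Collectors.toList());
    }

    // FlatMap approach: parse once and emit zero or one element per record.
    static List<Integer> withFlatMap(List<String> input) {
        return input.stream()
                .flatMap(raw -> {
                    Optional<Integer> parsed = tryParse(raw);
                    return parsed.isPresent()
                            ? Stream.of(parsed.get())
                            : Stream.<Integer>empty();
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("1", "oops", " 42 ", "");
        System.out.println(withFilter(records));
        System.out.println(withFlatMap(records));
    }
}
```

Both methods keep the same records; the difference is only that the filter variant states the discard rule as its own step, while the flatMap variant avoids repeating the transformation.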

Discarding bad data in Stream

2018-02-19 Thread Niclas Hedhman
Hi again, something that I don't find (easily) in the documentation is what the recommended method is to discard data from the stream. On one hand, I could always use flatMap(), even if it is "per message" since that allows me to return zero or one objects. DataStream stream = env.addSource(