Re: Splitting stream

Arvid Heise Mon, 10 May 2021 07:00:06 -0700

Hi Nikola,

If you just want to apply a different user function to each record
depending on the property "exist", the simplest way is to use a single map:


source -> map(if exist do this else that) -> sink
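
As a rough sketch of that single-map option, here is a plain-Java stand-in
that runs without a Flink dependency. The Event type, the id field and the
bodies of doThis/doThat are hypothetical placeholders; only the "exist"
property comes from the question.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Plain-Java stand-in for: source -> map(if exist do this else that) -> sink
final class BranchingMap {

    // Hypothetical record type; only the "exist" property comes from the thread.
    static final class Event {
        final String id;
        final boolean exist;

        Event(String id, boolean exist) {
            this.id = id;
            this.exist = exist;
        }
    }

    // Placeholder user functions ("do this" / "do that"); their bodies are made up.
    static String doThis(Event e) { return "this:" + e.id; }
    static String doThat(Event e) { return "that:" + e.id; }

    // The single map function: branch on "exist" for every record.
    static final Function<Event, String> BRANCH =
            e -> e.exist ? doThis(e) : doThat(e);

    // Stand-in for the pipeline; a List plays the role of source and sink.
    static List<String> run(List<Event> source) {
        return source.stream().map(BRANCH).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Event> source = List.of(new Event("a", true), new Event("b", false));
        System.out.println(run(source));
    }
}
```

In an actual job the same branch would presumably be passed to
DataStream#map, e.g. source.map(e -> e.exist ? doThis(e) : doThat(e)).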

If it turns out that you want to apply a different subgraph to each case,
you can filter the source twice and union the two branches:

source -> filter(if exist) -> do this -> union -> sink
source -> filter(if not exist) -> do that -^
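
The two-filter topology can be sketched the same way in plain Java, again
without a Flink dependency. Event, id, doThis and doThat are hypothetical
names, and Stream.concat is only a stand-in for the union step: it emits one
branch after the other, whereas a union of two live streams should not be
expected to preserve any particular order between the branches.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Plain-Java stand-in for:
//   source -> filter(if exist)     -> do this -> union -> sink
//   source -> filter(if not exist) -> do that -^
final class FilterUnion {

    // Hypothetical record type; only the "exist" property comes from the thread.
    static final class Event {
        final String id;
        final boolean exist;

        Event(String id, boolean exist) {
            this.id = id;
            this.exist = exist;
        }
    }

    // Placeholder subgraphs; each could be a chain of several operators.
    static String doThis(Event e) { return "this:" + e.id; }
    static String doThat(Event e) { return "that:" + e.id; }

    static List<String> run(List<Event> source) {
        // Branch B: records where "exist" is true.
        Stream<String> existing =
                source.stream().filter(e -> e.exist).map(FilterUnion::doThis);
        // Branch C: records where "exist" is false (the complementary filter).
        Stream<String> missing =
                source.stream().filter(e -> !e.exist).map(FilterUnion::doThat);
        // Stand-in for union: both branches must produce the same element type.
        return Stream.concat(existing, missing).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Event> source = List.of(
                new Event("a", true), new Event("b", false), new Event("c", true));
        System.out.println(run(source));
    }
}
```

In a Flink job the corresponding calls would be DataStream#filter on the
same source twice, followed by DataStream#union on the two results, which
requires both branches to have the same type.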

On Mon, May 10, 2021 at 3:07 PM Nikola Hrusov <n.hru...@gmail.com> wrote:

> Hi,
>
> I am trying to find some information on what is the best way to split a
> stream of the same data.
>
> For the given scenario: I have an object which has a property "exist"
>
> I want to split the stream based on this property, do something, and
> afterwards join it again into a single stream.
>
> Initial (A) -> Split stream based on exist (B) or not (C) -> union both
> streams (D)
>
> I could find some similar topics on StackOverflow:
> - https://stackoverflow.com/questions/53588554/apache-flink-using-filter-or-split-to-split-a-stream
> - https://stackoverflow.com/questions/61752728/how-to-get-output-of-the-values-that-are-not-matched-in-filter-function-in-apach
>
> but none of them really gives a definitive answer.
>
> What I am thinking about is using 1) filter or 2) side output.
>
> I know that one of the use cases of side output is that it can have
> different data types. That is not my case as it will be the same object
> going through the whole pipeline.
>
> So both options look more or less the same to me; however, I do not know
> the Flink internals as well as I would like at this point.
>
> Can some of you guys shed some light and perhaps tell me if I am mistaken
> in my thoughts?
>
> Thanks.
>
> Regards,
> Nikola
>
