Hi Sébastien,

there is always the possibility of reusing a stream. Given a
DataStream<Element> input, you can do the following:

KeyedStream<Element, KeyA> a = input.keyBy(x -> f(x));
KeyedStream<Element, KeyB> b = input.keyBy(x -> g(x));

This gives you two differently partitioned streams a and b. (Note that
KeyedStream takes a second type parameter, the key type; KeyA and KeyB here
stand for the return types of f and g.)
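
For example, each keyed stream can then be processed independently; every
event flows through both pipelines, since keyBy does not consume the input
(a sketch only, the process functions are placeholders):

```java
// Sketch: reusing the same input stream under two different partitionings.
// Element, f(), g() and the process functions are placeholders.
input.keyBy(x -> f(x))                        // partitioned by f(x)
     .process(new RuleSetAProcessFunction())  // hypothetical function
     .print();

input.keyBy(x -> g(x))                        // partitioned by g(x)
     .process(new RuleSetBProcessFunction())  // hypothetical function
     .print();
// Each incoming event is sent to BOTH pipelines.
```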

If you want to evaluate every event against the full set of rules, then you
could take a look at Flink's Broadcast State pattern [1]. It allows you to
broadcast a stream of rules to all parallel instances of an operator
processing a keyed input stream.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
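
A sketch of that pattern, following the documentation in [1] (Rule, Alert,
and Rule.getId() are hypothetical types for illustration):

```java
// Descriptor for the broadcast state holding the rules.
MapStateDescriptor<String, Rule> ruleStateDescriptor =
        new MapStateDescriptor<>("rules", String.class, Rule.class);

// Broadcast the rule stream to all parallel instances.
BroadcastStream<Rule> ruleBroadcastStream = ruleStream.broadcast(ruleStateDescriptor);

DataStream<Alert> alerts = input
        .keyBy(x -> f(x))
        .connect(ruleBroadcastStream)
        .process(new KeyedBroadcastProcessFunction<String, Element, Rule, Alert>() {
            @Override
            public void processElement(Element value, ReadOnlyContext ctx,
                                       Collector<Alert> out) throws Exception {
                // Evaluate the event against every rule seen so far.
                for (Map.Entry<String, Rule> entry :
                        ctx.getBroadcastState(ruleStateDescriptor).immutableEntries()) {
                    // ... match value against entry.getValue(), emit alerts ...
                }
            }

            @Override
            public void processBroadcastElement(Rule rule, Context ctx,
                                                Collector<Alert> out) throws Exception {
                // New or updated rules are stored in broadcast state.
                ctx.getBroadcastState(ruleStateDescriptor).put(rule.getId(), rule);
            }
        });
```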

Cheers,
Till

On Mon, Feb 17, 2020 at 11:10 PM theo.diefent...@scoop-software.de <
theo.diefent...@scoop-software.de> wrote:

> Hi Sebastian,
> I'd also highly recommend a recent Flink blog post to you where exactly
> this question is answered in quite some detail:
> https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html
> Best regards,
> Theo
> -------- Ursprüngliche Nachricht --------
> Von: Eduardo Winpenny Tejedor <eduardo.winpe...@gmail.com>
> Datum: Mo., 17. Feb. 2020, 21:07
> An: Lehuede sebastien <lehued...@gmail.com>
> Cc: user <user@flink.apache.org>
> Betreff: Re: Process stream multiple time with different KeyBy
>
>
> Hi Sebastien,
>
> Without being entirely sure of what's your use case/end goal I'll tell
> you (some of) the options Flink provides you for defining a flow.
>
> If your use case is to apply the same rule to each of your "swimlanes"
> of data (one with category=foo AND subcategory=bar, another with
> category=foo and another with category=bar) you can do this by
> implementing your own org.apache.flink.api.java.functions.KeySelector
> and passing it to keyBy. You'll just need to return a
> different key for each of your rules and the data will separate into the
> appropriate "swimlane".
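>
> A minimal sketch of such a key derivation (the field names and the
> "category|subcategory" key scheme are assumptions, not from the thread);
> in Flink this function would back the KeySelector passed to keyBy:

```java
// Hypothetical key scheme: "category|subcategory" when a subcategory is
// present, otherwise just the category. In Flink this could be used as:
//   input.keyBy((KeySelector<Event, String>) e ->
//           RuleKey.routeKey(e.getCategory(), e.getSubcategory()));
public class RuleKey {
    public static String routeKey(String category, String subcategory) {
        return subcategory == null ? category : category + "|" + subcategory;
    }
}
```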
>
> If your use case is to apply different rules to each swimlane then you
> can write a ProcessFunction that redirects elements to different *side
> outputs*. You can then apply different operations to each side output.
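>
> A minimal sketch of that pattern (Event, its fields, and the tag names
> are assumptions; note that one element may be emitted to several side
> outputs if it should be evaluated by several rules):

```java
// Sketch: routing one input stream into per-rule side outputs.
final OutputTag<Event> fooTag = new OutputTag<Event>("category-foo"){};
final OutputTag<Event> fooBarTag = new OutputTag<Event>("category-foo-subcategory-bar"){};

SingleOutputStreamOperator<Event> mainStream =
        input.process(new ProcessFunction<Event, Event>() {
            @Override
            public void processElement(Event e, Context ctx, Collector<Event> out) {
                // An element can go to more than one side output.
                if ("foo".equals(e.category)) {
                    ctx.output(fooTag, e);
                    if ("bar".equals(e.subcategory)) {
                        ctx.output(fooBarTag, e);
                    }
                }
                out.collect(e); // main output keeps everything
            }
        });

DataStream<Event> foo = mainStream.getSideOutput(fooTag);
DataStream<Event> fooBar = mainStream.getSideOutput(fooBarTag);
```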
>
> Your application could get tricky to evolve if the number of swimlanes
> or the operators are meant to change over time; you'd have to be
> careful about how the existing state fits into your new flows.
>
> Regards,
> Eduardo
>
> On Mon, Feb 17, 2020 at 7:06 PM Lehuede sebastien <lehued...@gmail.com>
> wrote:
> >
> > Hi all,
> >
> > I'm currently working on a Flink application where I match events
> against a set of rules. Initially I wanted to create streams dynamically
> based on the category of the events (events are JSON formatted and each has
> a field like "category":"foo" in it), but I'm blocked by the impossibility
> of creating streams at runtime.
> >
> > So, one of the solutions for me is to create a single Kafka topic and
> then use the "KeyBy" operator to match events with "category":"foo" against
> rules that also contain "category":"foo" in their specification.
> >
> > Now I have some cases where events and rules have one category and one
> subcategory. At this point I'm not sure about the "KeyBy" operator behavior.
> >
> > Example :
> >
> > Events have : "category":"foo" AND "subcategory":"bar"
> > Rule1 specification has : "category":"foo" AND "subcategory":"bar"
> > Rule2 specification has : "category':"foo"
> > Rule3 specification has : "category":"bar"
> >
> > In this case, my events need to be match against Rule1, Rule2 and Rule3.
> >
> > If I'm right: if I apply a multi-key "KeyBy()" on the "category" and
> "subcategory" fields and then apply two single-key "KeyBy()" operators on
> the "category" field, my events will be consumed by the first "KeyBy()"
> operator and no events will be streamed into the operators after it?
> >
> > Is there any way to process the same stream once with a multi-key
> KeyBy() and again with a single-key KeyBy()?
> >
> > Thanks !
> > Sébastien.
>
