Hi Sébastien,

You can always reuse a stream. Given a DataStream<Element> input, you can do the following:
KeyedStream<Element> a = input.keyBy(x -> f(x));
KeyedStream<Element> b = input.keyBy(x -> g(x));

This gives you two differently partitioned streams a and b.

If you want to evaluate every event against the full set of rules, then you could take a look at the Flink Broadcast State Pattern [1]. It allows you to broadcast a stream of rules to all operators of a keyed input stream.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html

Cheers,
Till

On Mon, Feb 17, 2020 at 11:10 PM theo.diefent...@scoop-software.de <theo.diefent...@scoop-software.de> wrote:
> Hi Sebastian,
> I'd also highly recommend a recent Flink blog post to you, where exactly this question was answered in quite some detail:
> https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html
> Best regards,
> Theo
>
> -------- Original message --------
> From: Eduardo Winpenny Tejedor <eduardo.winpe...@gmail.com>
> Date: Mon., 17. Feb. 2020, 21:07
> To: Lehuede sebastien <lehued...@gmail.com>
> Cc: user <user@flink.apache.org>
> Subject: Re: Process stream multiple time with different KeyBy
>
> Hi Sebastien,
>
> Without being entirely sure of what your use case/end goal is, I'll tell you (some of) the options Flink provides for defining a flow.
>
> If your use case is to apply the same rule to each of your "swimlanes" of data (one with category=foo AND subcategory=bar, another with category=foo and another with category=bar), you can do this by implementing your own org.apache.flink.api.java.functions.KeySelector for the keyBy function. You just need to return a different key for each of your rules, and the data will separate into the appropriate "swimlane".
>
> If your use case is to apply different rules to each swimlane, then you can write a ProcessFunction that redirects elements to different *side outputs*. You can then apply different operations to each side output.
> Your application could get tricky to evolve if the number of swimlanes or the operators are meant to change over time; you'd have to be careful how the existing state fits into your new flows.
>
> Regards,
> Eduardo
>
> On Mon, Feb 17, 2020 at 7:06 PM Lehuede sebastien <lehued...@gmail.com> wrote:
> >
> > Hi all,
> >
> > I'm currently working on a Flink application where I match events against a set of rules. At the beginning I wanted to dynamically create streams based on the category of the events (events are JSON formatted and each has a field like "category":"foo"), but I was stuck by the impossibility of creating streams at runtime.
> >
> > So one solution for me is to create a single Kafka topic and then use the "keyBy" operator to match events with "category":"foo" against rules that also contain "category":"foo" in their specification.
> >
> > Now I have some cases where events and rules have one category and one subcategory. At this point I'm not sure about the behavior of the "keyBy" operator.
> >
> > Example:
> >
> > Events have: "category":"foo" AND "subcategory":"bar"
> > Rule1 specification has: "category":"foo" AND "subcategory":"bar"
> > Rule2 specification has: "category":"foo"
> > Rule3 specification has: "category":"bar"
> >
> > In this case, my events need to be matched against Rule1, Rule2 and Rule3.
> >
> > If I understand correctly, if I apply a multiple-key "keyBy()" on the "category" and "subcategory" fields and then apply two single-key "keyBy()" operations on the "category" field, my events will be consumed by the first "keyBy()" operator and no events will be streamed into the operators after it?
> >
> > Is there any way to process the same stream once with a multi-key keyBy() and another time with a single-key keyBy()?
> >
> > Thanks!
> > Sebastien.
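As Till notes at the top of the thread, the answer to this last question is yes: the same DataStream can be keyed several times, and every event flows into each keyed lane independently. A minimal sketch of that idea (the Event POJO, its field names, and the source data are assumptions for illustration, not taken from the thread):

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KeyByReuseSketch {

    // Hypothetical event type; in practice this would be parsed from the JSON events.
    public static class Event {
        public String category;
        public String subcategory;
        public Event() {}
        public Event(String category, String subcategory) {
            this.category = category;
            this.subcategory = subcategory;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Event> input = env.fromElements(
                new Event("foo", "bar"),
                new Event("foo", null),
                new Event("bar", null));

        // Multi-key lane (for rules like Rule1): key by category + subcategory.
        // Using a composite String key keeps the lambda's return type simple.
        KeyedStream<Event, String> byCategoryAndSub =
                input.keyBy(e -> e.category + "|" + e.subcategory);

        // Single-key lane (for rules like Rule2/Rule3): key by category only.
        KeyedStream<Event, String> byCategory =
                input.keyBy(e -> e.category);

        // Both lanes receive every event from input, so one event can be
        // evaluated against multi-key and single-key rules independently.
        // Attach the rule evaluation (e.g. a KeyedProcessFunction) to each lane here.

        env.execute("keyBy reuse sketch");
    }
}
```

This compiles against the Flink streaming API on the classpath; the point is only that keyBy() does not consume the stream, so input can feed any number of differently keyed pipelines.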
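Eduardo's side-output suggestion could look roughly like this (a sketch; the Event type, field names, and tag ids are assumptions, not from the thread). Note that ctx.output() may be called several times per element, so one event can be routed into several lanes, matching Rule1 and Rule2 at once:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSketch {

    // Hypothetical event type, as if parsed from the JSON in the thread.
    public static class Event {
        public String category;
        public String subcategory;
        public Event() {}
        public Event(String c, String s) { category = c; subcategory = s; }
    }

    // Anonymous subclasses so Flink can capture the generic element type.
    static final OutputTag<Event> FOO_BAR_LANE = new OutputTag<Event>("foo-bar") {};
    static final OutputTag<Event> FOO_LANE = new OutputTag<Event>("foo") {};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Event> input = env.fromElements(
                new Event("foo", "bar"), new Event("foo", null), new Event("bar", null));

        SingleOutputStreamOperator<Event> routed = input
                .process(new ProcessFunction<Event, Event>() {
                    @Override
                    public void processElement(Event e, Context ctx, Collector<Event> out) {
                        // An element may be emitted to more than one side output,
                        // so an event with category=foo AND subcategory=bar
                        // reaches both the Rule1 and Rule2 lanes.
                        if ("foo".equals(e.category) && "bar".equals(e.subcategory)) {
                            ctx.output(FOO_BAR_LANE, e);
                        }
                        if ("foo".equals(e.category)) {
                            ctx.output(FOO_LANE, e);
                        }
                        out.collect(e); // the main output keeps the full stream
                    }
                });

        DataStream<Event> fooBarEvents = routed.getSideOutput(FOO_BAR_LANE);
        DataStream<Event> fooEvents = routed.getSideOutput(FOO_LANE);

        // Apply each rule's logic to its own lane here, then:
        env.execute("side output sketch");
    }
}
```

Unlike keying alone, this keeps the routing logic in one operator, which is convenient while the set of lanes is fixed; as Eduardo warns, evolving the lanes later requires care with existing state.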