Hi guys,

Thanks for your answers and sorry for the late reply.

My use case is: I receive events on one stream, and each event can contain:
- 1 field category
- 1 field subcategory
- 1 field category AND 1 field subcategory

Events are matched against rules, which can likewise contain:
- 1 field category
- 1 field subcategory
- 1 field category AND 1 field subcategory

Now, let's say I receive an event containing the fields category=foo and subcategory=bar. I want to be able to match this event against rules that also contain category=foo and subcategory=bar in their specification, but I also want to be able to match it against rules containing only category=foo OR only subcategory=bar.

That said, I think I already have a lot of information in your answers. I will definitely take a look at the Fraud Detection System example for the DynamicKeyFunction, and try to work with 2 different streams (one stream for events with a single key and one stream for events with multiple keys) as suggested by Till.

Thanks again!
Sébastien
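For concreteness, here is a minimal sketch of the matching semantics described above, using hypothetical Event and Rule classes (the class and field names are assumptions, not taken from the thread):

    // Hypothetical types for illustration only; field names are assumptions.
    public class RuleMatching {

        static class Event {
            String category;      // may be null
            String subcategory;   // may be null
        }

        static class Rule {
            String category;      // null = not constrained by category
            String subcategory;   // null = not constrained by subcategory
        }

        // An event matches a rule if every field the rule specifies is present
        // on the event with the same value; fields the rule leaves null are ignored.
        static boolean matches(Event e, Rule r) {
            boolean categoryOk = r.category == null || r.category.equals(e.category);
            boolean subcategoryOk = r.subcategory == null || r.subcategory.equals(e.subcategory);
            return categoryOk && subcategoryOk;
        }
    }

With this reading, an event with category=foo and subcategory=bar matches a rule specifying both fields, a rule specifying only category=foo, and a rule specifying only subcategory=bar.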
On Tue, Feb 18, 2020 at 6:16 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Sébastien,
>
> there is always the possibility to reuse a stream. Given a DataStream<Element> input, you can do the following:
>
> KeyedStream<Element, ?> a = input.keyBy(x -> f(x));
> KeyedStream<Element, ?> b = input.keyBy(x -> g(x));
>
> This gives you two differently partitioned streams a and b.
>
> If you want to evaluate every event against the full set of rules, then you could take a look at Flink's Broadcast State Pattern [1]. It allows you to broadcast a stream of rules to all operators of a keyed input stream.
>
> [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
>
> Cheers,
> Till
>
> On Mon, Feb 17, 2020 at 11:10 PM theo.diefent...@scoop-software.de <theo.diefent...@scoop-software.de> wrote:
>
>> Hi Sebastian,
>>
>> I'd also highly recommend a recent Flink blog post to you, where exactly this question was answered in quite some detail:
>> https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html
>>
>> Best regards,
>> Theo
>>
>> -------- Original Message --------
>> From: Eduardo Winpenny Tejedor <eduardo.winpe...@gmail.com>
>> Date: Mon., 17 Feb. 2020, 21:07
>> To: Lehuede sebastien <lehued...@gmail.com>
>> Cc: user <user@flink.apache.org>
>> Subject: Re: Process stream multiple time with different KeyBy
>>
>> Hi Sebastien,
>>
>> Without being entirely sure of what your use case/end goal is, I'll tell you (some of) the options Flink provides you for defining a flow.
>>
>> If your use case is to apply the same rule to each of your "swimlanes" of data (one with category=foo AND subcategory=bar, another with category=foo and another with category=bar), you can do this by implementing your own org.apache.flink.api.java.functions.KeySelector for the keyBy function. You just need to return a different key for each of your rules and the data will separate into the appropriate "swimlane".
>>
>> If your use case is to apply different rules to each swimlane, then you can write a ProcessFunction that redirects elements to different *side outputs*. You can then apply different operations to each side output.
>>
>> Your application could get tricky to evolve IF the number of swimlanes or the operators are meant to change over time; you'd have to be careful how the existing state fits into your new flows.
>>
>> Regards,
>> Eduardo
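As a rough illustration of the side-output option described above, a minimal sketch assuming a DataStream<Event> named events and a hypothetical Event POJO with getCategory()/getSubcategory() accessors (these names are illustrative, not from the thread):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    // Tags for the two extra "swimlanes"; the anonymous subclasses keep the type information.
    static final OutputTag<Event> CATEGORY_ONLY = new OutputTag<Event>("category-only") {};
    static final OutputTag<Event> SUBCATEGORY_ONLY = new OutputTag<Event>("subcategory-only") {};

    static SingleOutputStreamOperator<Event> route(DataStream<Event> events) {
        return events.process(new ProcessFunction<Event, Event>() {
            @Override
            public void processElement(Event event, Context ctx, Collector<Event> out) {
                boolean hasCategory = event.getCategory() != null;
                boolean hasSubcategory = event.getSubcategory() != null;
                if (hasCategory && hasSubcategory) {
                    out.collect(event);                   // main output: both fields present
                } else if (hasCategory) {
                    ctx.output(CATEGORY_ONLY, event);     // side output: category only
                } else if (hasSubcategory) {
                    ctx.output(SUBCATEGORY_ONLY, event);  // side output: subcategory only
                }
            }
        });
    }

    // Usage: each lane can then be keyed and processed independently.
    // SingleOutputStreamOperator<Event> main = route(events);
    // DataStream<Event> categoryOnly = main.getSideOutput(CATEGORY_ONLY);
    // DataStream<Event> subcategoryOnly = main.getSideOutput(SUBCATEGORY_ONLY);

Splitting once in a ProcessFunction keeps the routing condition in one place instead of repeating it in three separate filters over the same input.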
>> On Mon, Feb 17, 2020 at 7:06 PM Lehuede sebastien <lehued...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > I'm currently working on a Flink application where I match events against a set of rules. At the beginning I wanted to dynamically create streams following the category of the events (events are JSON formatted and I have a field like "category":"foo" in each event), but I'm stuck by the impossibility of creating streams at runtime.
>> >
>> > So, one of the solutions for me is to create a single Kafka topic and then use the "KeyBy" operator to match events with "category":"foo" against rules also containing "category":"foo" in their specification.
>> >
>> > Now I have some cases where events and rules have one category and one subcategory. At this point I'm not sure about the "KeyBy" operator behavior.
>> >
>> > Example:
>> >
>> > Events have: "category":"foo" AND "subcategory":"bar"
>> > Rule1 specification has: "category":"foo" AND "subcategory":"bar"
>> > Rule2 specification has: "category":"foo"
>> > Rule3 specification has: "category":"bar"
>> >
>> > In this case, my events need to be matched against Rule1, Rule2 and Rule3.
>> >
>> > If I'm right, if I apply a multiple-key "KeyBy()" on the "category" and "subcategory" fields and then apply two single-key "KeyBy()" on the "category" field, my events will be consumed by the first "KeyBy()" operator and no events will be streamed to the operators after?
>> >
>> > Is there any way to process the same stream once with a multi-key KeyBy() and again with a single-key KeyBy()?
>> >
>> > Thanks!
>> > Sébastien
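One possible way to evaluate the same event under several keys without keying the stream twice, in the spirit of the DynamicKeyFunction from the fraud-detection blog post linked above, is to fan each event out into one copy per candidate key and then apply a single keyBy. A minimal sketch, reusing the hypothetical Event POJO and the events stream from the earlier snippets (key format and names are assumptions):

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.KeyedStream;
    import org.apache.flink.util.Collector;

    // Emit one keyed copy of the event per candidate rule key, so a single
    // downstream keyBy can bring it together with rules keyed on "category",
    // on "subcategory", or on both.
    DataStream<Tuple2<String, Event>> keyedCopies = events
        .flatMap(new FlatMapFunction<Event, Tuple2<String, Event>>() {
            @Override
            public void flatMap(Event event, Collector<Tuple2<String, Event>> out) {
                String category = event.getCategory();
                String subcategory = event.getSubcategory();
                if (category != null) {
                    out.collect(Tuple2.of("category=" + category, event));
                }
                if (subcategory != null) {
                    out.collect(Tuple2.of("subcategory=" + subcategory, event));
                }
                if (category != null && subcategory != null) {
                    out.collect(Tuple2.of("category=" + category + "&subcategory=" + subcategory, event));
                }
            }
        });

    // Key every copy by its rule key; an event with category=foo and
    // subcategory=bar now reaches the partitions for all three key variants.
    KeyedStream<Tuple2<String, Event>, String> byRuleKey =
        keyedCopies.keyBy(copy -> copy.f0);

Rules could then be keyed with the same key format, or broadcast to this keyed stream as in the Broadcast State Pattern Till mentions, so each keyed copy of the event meets every rule whose specification it satisfies.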