Re: Advice - Drools in Flink

Maciek Próchniak Thu, 23 Jun 2016 13:16:38 -0700

Hi Anton,

I think I start to understand what you want to achieve. I would be avery interesting thing to do ;)If the drools are the piece that decides what is interesting and what isnot, then probably also drools should be responisble for keepingstate/aggregates - otherwise

Flink would keep too much stuff in memory.

I think that serializing session wouldn't be impossible. I still wonderhow to deal with clocks in Drools CEP - maybe the clock should be movedby incoming events - but then there are problems with out of order ones,or only by watermarks - but then you have worse latency.One thing that is still not so clear to me is - how do you want topartition the stream of tweets? I guess that it cannot be done by sometweet property - e.g. author, because otherwise users would have manysessions and aggregations would not be correct?I may be wrong, but I think that to leverage Flink scaling/statecapabilities the processing logic should be somehow Flink-aware - thatis, Flink should understand (at least partially) what are specific rules...



maciek


On 23/06/2016 16:54, Anton wrote:

Hi Maciek

Firstly, thanks for your replies and interest. It is really appreciated. I
am familiar with Drools but am new to Flink. Whereas you seem familiar with
both - so your feedback is really appreciated.

On Thu, Jun 23, 2016 at 8:30 AM, Maciek Próchniak <m...@touk.pl> wrote:

you mean - keeping working memory facts in Flink state and with each event
throw them into stateless session?

Yes kind of.

Can you elaborate a bit on your use case? What would be the state?

*Use case*
Flink ingests a large amount of tweets.
Application users can write rules to reason of tweet contents and
frequency.
For example, user John writes a rule that says:

*When I receive 3 tweets about Apache Flink within 1 day*

*Then Flink is popular*

*Fire new Popular("Flink") event*



*When I have not received a tweet about Apache Attic project ___ in one
month*

*then this project is not popular*
*Fire new UnPopular("Attic project") event*


Another important aspect of the use-case is the frequency of rule
changes/updates. There will be many users writing rules to reason over
incoming events - and these rules are expected to change often.

The main goal is to make Drools CEP stateful. Part of the challenge is
ensuring that new messages are routed to the correct drools session. But,
as Flink already seems to have scaling and state well done - it would be
nice to combine the two.
Please have a look at
http://blog.athico.com/2016/03/high-availability-drools-stateless.html

Also, I cannot comment on the strengths and weaknesses of Flink CEP, but
Drools CEP is certainly very mature - and when combined with the rule
language - very powerful. To be able to combine the statefulness of Flink
with Drools would be amazing!

The concern with using Stateful sessions is the possibility of memory
leaks. However, a stateless session is just a wrapper around stateless -
which calls dispose. So this makes me think, perhaps, when populating the
working memory, just extract all relevant events from the Flink state into
the Drools WM - and it could even be stateful - and then call displose
after calling fireAllRules. And do this each time a new message/event
arrives.

Would it be for example some event aggregations, which would be filtered by

the rules?

Yes.

Re: Advice - Drools in Flink

Reply via email to