Hi All,

I have the following questions.

1) Can we run Flink CEP on an event stream, on batch data, or both?
2) If streaming works, how long can we keep the stream stateful? Has
anyone successfully run stateful streaming for days or months (with or
without CEP)? Or is stateful streaming mainly meant to keep state for
only a few hours?

I have a use case where events are ingested from multiple sources. In
theory the sources are supposed to have the same events, but in practice
they will not. So when the events are ingested, the goal is to detect
where the "breaks" are (meaning missing events, i.e. an event that exists
in one source but not in another). As I understand it, this is a typical
case for CEP.
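
To make the "breaks" concrete, here is a minimal sketch in plain Python.
It is not Flink or CEP code. It assumes each event carries a source name
and an event id that is comparable across sources; the function name
find_breaks and the sample data are made up for illustration.

```python
from collections import defaultdict


def find_breaks(events, expected_sources):
    """Return {event_id: [sources that never delivered it]}.

    events: iterable of (source, event_id) tuples (hypothetical shape).
    expected_sources: every source that should deliver each event.
    """
    seen = defaultdict(set)  # event_id -> sources that delivered it
    for source, event_id in events:
        seen[event_id].add(source)

    expected = set(expected_sources)
    return {
        event_id: sorted(expected - sources)
        for event_id, sources in seen.items()
        if expected - sources  # keep only ids missing from some source
    }


# Hypothetical sample: e1 is in both sources, e2 and e3 are breaks.
events = [("A", "e1"), ("B", "e1"), ("A", "e2"), ("B", "e3")]
breaks = find_breaks(events, ["A", "B"])
```

In a streaming job the `seen` map would be keyed state per event id, and
the open question is how long that state has to live.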

Also, in this particular use case, events that were supposed to arrive 2
years ago can arrive today, and when that happens we need to update those
events in real time or near real time. There won't be many events that
were missed 2 years ago, but there will be a few. What would be the best
approach?

One solution I can think of is to do stateful CEP with a window of one
day, or whatever short time period covers most events, and to collect the
events that fall outside that period (the late ones) into a Kafka topic.
A separate stream would then analyze the time period the late events
belong to, construct the corresponding NFA, and run the events through it
again. Please let me know how this sounds, or if there is a better way to
do it.
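
The routing step of that idea could be sketched roughly as follows, again
in plain Python rather than the Flink API. It assumes events carry an
event-time timestamp in seconds, treats an event as late when it is more
than one window behind the newest timestamp seen so far, and uses a plain
list where the Kafka topic would be. The name route_events and the sample
data are made up for illustration.

```python
from collections import defaultdict

DAY = 24 * 60 * 60  # one-day window, in seconds


def route_events(events, window_seconds=DAY):
    """Split events into per-window buckets and a list of late events.

    events: iterable of (timestamp, source, event_id) in arrival order.
    An event is late if it is older than the newest timestamp seen so
    far minus one window length (an assumed lateness rule).
    """
    on_time = defaultdict(list)  # window start -> [(source, event_id)]
    late = []                    # stand-in for the "late events" topic
    max_ts = None                # newest event time seen so far

    for ts, source, event_id in events:
        if max_ts is None or ts > max_ts:
            max_ts = ts
        if ts < max_ts - window_seconds:
            late.append((ts, source, event_id))
        else:
            window_start = ts - ts % window_seconds
            on_time[window_start].append((source, event_id))

    return dict(on_time), late


# Hypothetical sample: two current events, then one from ~2 years earlier.
events = [
    (1000 * DAY + 10, "A", "e1"),
    (1000 * DAY + 20, "B", "e1"),
    (270 * DAY + 5, "A", "e0"),
]
on_time, late = route_events(events)
```

The second stream would then read the late list, load whatever state
exists for the affected period, and re-run the matching for it.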

Thanks!
