Hi there.
I’m new to Flink and I need to develop a CEP application using this
technology. At least, we are choosing a tool now. So my question is how well
Flink really fits my goals.
Business requirements:
There is a mobile application. Its data is collected in three Kafka topics,
each with multiple partitions. The topics do not contain ready-made per-user
events: first I need to join the data from all topics, and only after that do
we have a good, clean user event.
The order of events for each user matters.
The rules can look like this: a user does action A, then B, and then does not
do action C for some period. If such a sequence is received, the system
communicates with this user through different channels.
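
To make the rule concrete, here is a rough plain-Java sketch of the behaviour I
have in mind. It is not Flink code: the class AbsenceRule, its methods and the
in-memory maps are only my illustration, and it assumes that each user's events
already arrive in event-time order. Note that onTime() scans every stored
deadline, which is exactly the iteration I would like to avoid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java model of the rule "A, then B, then no C for a period".
// Nothing here is Flink API; the maps stand in for per-user state.
final class AbsenceRule {
    private enum Step { START, SEEN_A, SEEN_B }

    private final long waitMillis; // the waiting period from the rule
    private final Map<String, Step> stepByUser = new HashMap<>();
    private final Map<String, Long> deadlineByUser = new HashMap<>();

    AbsenceRule(long waitMillis) {
        this.waitMillis = waitMillis;
    }

    // Feed one clean user event (assumed to be in event-time order per user).
    void onEvent(String userId, String type, long eventTime) {
        Step step = stepByUser.getOrDefault(userId, Step.START);
        if (type.equals("A") && step == Step.START) {
            stepByUser.put(userId, Step.SEEN_A);
        } else if (type.equals("B") && step == Step.SEEN_A) {
            // Remember B together with B.event_time + period_waiting.
            stepByUser.put(userId, Step.SEEN_B);
            deadlineByUser.put(userId, eventTime + waitMillis);
        } else if (type.equals("C") && step == Step.SEEN_B
                && eventTime <= deadlineByUser.get(userId)) {
            // C arrived in time: clean B from the store, no communication.
            stepByUser.remove(userId);
            deadlineByUser.remove(userId);
        }
    }

    // Returns the users with B.event_time + period_waiting < now and clears them.
    List<String> onTime(long now) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Long> entry : deadlineByUser.entrySet()) {
            if (entry.getValue() < now) {
                due.add(entry.getKey());
            }
        }
        for (String userId : due) {
            stepByUser.remove(userId);
            deadlineByUser.remove(userId);
        }
        return due;
    }
}
```

For example, with a period of 1000 ms, a user who sends A and then B at time
100 would be returned by onTime() once the time passes 1100, unless a C event
for that user arrives first.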
We don't want to use an external DB, only Flink state.
My questions are:
1. Using Kafka as the input stream and assembling events by user.
I think the order of the clean user events will be wrong this way, because
the topics are not partitioned by user key and only one topic has this
field. So can I reorder these events by the event's time field?
2. State of events.
Can I query the state using SQL syntax? I don't want to iterate over all
records in the store to trigger a communication.
In the case described above (A -> B -> wait for period x -> no C ->
communication), the B event is stored in state. If C is received, the system
removes B from the store. We need to query the store and get all B records
with B.event_time + period_waiting < now_time.
Or can the CEP library do this job with a pattern?
3. Maybe this approach to the requirements is not correct enough. But again,
can Flink help implement this task?
Thanks,
Yuri L.