Hey everyone, I'm quite new to Apache Flink. I'm trying to build a system with Flink and wanted to hear your opinion and whether the proposed architecture is even possible with Flink. The environment for the system will be a microservice architecture handling messaging via async events.
I want to give you a brief description of the system: - there are a lot of sensors, which each produces a stream of data - on each stream of sensor data I want to match one or more patterns via Flink's CEP library - each of these sensors belongs to one or more administrative entities - each pattern belongs to one administrative entity and needs to be evaluated on one or more sensors of this entity - the user can change the connection of a sensor to an administrative entity as well as the sensors on which a pattern needs to be evaluated I hope this description is enough to give you an overview of the system. This is what I am thinking of doing: - I will have an Apache Kafka cluster and a Flink cluster running inside docker containers - I create a topic in Kafka for each administrative entity - for each entity I create a Flink job which consumes the corresponding topic - the Flink job creates a stream of the sensor data - it splits the stream to a stream for each sensor - for each pattern that hast to be evaluated on one stream I create a pattern stream This results in the following: - there will be a lot of Kafka topics - for each topic there will be one Flink job (-> a lot of jobs, too) - in each job there will be quite a lot of streams and patterns and therefore even more pattern streams The main questions that arose while thinking of this implementation: 1.) From other questions here, I know that there is currently no way to dynamically add taskmanagers to the Flink cluster. The proposed way to handle that, is to start up much more taskmanagers than first needed. Is it even possible to have a great number of jobs on one cluster? 2.) Would a viable alternative be to just dynamically start up a new cluster for each administrative entity? 3.) I also came to know, that Flink isn't able to handle dynamically created streams and patterns. I guess that is due to the fixed calculation of the execution graph at the jobs beginning. Is there a way to make Flink recalculate the graph of a running job? I also just found out about this [1] example, where they use scripts to hot deploy queries. I will look into that, too. Maybe that provides an acceptable solution for me, too. 4.) Is it even possible to have a great number of streams and patterns in one Flink job? Any comments and feedback are greatly appreciated. Thanks a lot in advance :) Best, Claudia [1]: https://techblog.king.com/rbea-scalable-real-time-analytics-king/