Hi, I am evaluating Flink for use in a stateful streaming application. Some information about the intended use:
- Will run in a Mesos cluster, deployed via Marathon in a Docker container
- Initial throughput ~100 messages per second (from Kafka)
- Will need to scale to 10x that soon after launch
- State will be much larger than the available memory

In order to get this out the door quickly, I am considering postponing the YARN / HA setup of a cluster, on the assumption that the current application can easily fit within a single JVM and handle the throughput. Hopefully, by the time I need more scale, Flink support for Mesos will be available and I can use that to distribute the job across the cluster with minimal code rewrite.

Questions:

1. Is this a viable approach? Any pitfalls to be aware of?
2. What is the correct term for this deployment mode? Single-node standalone? Local?
3. Will the RocksDB state backend work in single-JVM mode?
4. When the single JVM process becomes unhealthy and is restarted by Marathon, will Flink recover appropriately, or is failure recovery a function of HA?
5. How would I migrate the RocksDB state once I move to HA mode? Is there a straightforward path?

Thanks for your time,
Ryan