Hi all!
A while back we discussed introducing a dedicated streaming mode for
Flink. I would like to have a go at this and implement the changes, but
I want to discuss them first.
Here is a brief summary of why we wanted to introduce the dedicated
streaming mode:
Even though both batch and streaming are executed by the same execution
engine, a streaming setup of Flink differs a bit from a batch setup:
1) The streaming cluster starts an additional service to store the
distributed state snapshots.
2) Streaming mode uses memory a bit differently, so we should configure the
memory manager differently. This difference may eventually go away.
Concretely, to implement this, I was thinking about introducing the
following externally visible changes:
- Additional scripts "start-streaming-cluster.sh" and
"start-streaming-local.sh"
- An execution mode parameter for the TaskManager ("batch / streaming")
- An execution mode parameter for the JobManager ("batch / streaming")
- All local executors and mini clusters need a flag that specifies whether
they will start a streaming cluster or a pure batch cluster.
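To make the list above a bit more concrete, here is a rough sketch of how such a mode parameter could be parsed and what it might switch on. All names here (StreamingModeSketch, ExecutionMode, ClusterSettings, settingsFor) are made up for illustration and are not existing Flink classes; the only behavior taken from this mail is that streaming mode starts the additional state snapshot service.

```java
import java.util.Locale;

/** Hypothetical sketch only; none of these names exist in Flink. */
public class StreamingModeSketch {

    /** The two cluster flavors discussed in this mail. */
    enum ExecutionMode {
        BATCH,
        STREAMING;

        /** Parses a mode parameter such as "batch" or "streaming" (case-insensitive). */
        static ExecutionMode parse(String value) {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    /** Settings a JobManager, TaskManager, or mini cluster might derive from the mode. */
    static final class ClusterSettings {
        final ExecutionMode mode;
        final boolean startStateSnapshotService;

        ClusterSettings(ExecutionMode mode, boolean startStateSnapshotService) {
            this.mode = mode;
            this.startStateSnapshotService = startStateSnapshotService;
        }
    }

    /** Assumption: only streaming clusters start the state snapshot service. */
    static ClusterSettings settingsFor(ExecutionMode mode) {
        return new ClusterSettings(mode, mode == ExecutionMode.STREAMING);
    }

    public static void main(String[] args) {
        // Default to batch when no mode argument is given (an arbitrary choice for this sketch).
        ExecutionMode mode = ExecutionMode.parse(args.length > 0 ? args[0] : "batch");
        ClusterSettings settings = settingsFor(mode);
        System.out.println(settings.mode + " snapshotService=" + settings.startStateSnapshotService);
    }
}
```

A local executor or mini cluster could take the same enum as a constructor flag, so that all entry points agree on a single notion of the mode.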
Does anything else come to mind?
Greetings,
Stephan