Hi everyone!

What do you think about making the streaming execution mode of the system
explicit? That means that people start a Flink cluster explicitly in Batch
mode or in Streaming mode.

The rationale behind this idea is that I am not sure batch and streaming
clusters can really be shared in a meaningful way, since streaming programs
basically run forever. There are also further differences:

  - Memory Management: Streaming jobs currently do not use managed memory
(see [1] and [2])

  - Are streaming jobs inherently single user? Initially, I would say yes,
because you need to know that you have provisioned enough compute power to
keep up with your ingestion rate, and that no other job starts eating into
the shared resources you rely on (network / disk)

  - High Availability will probably look a bit different for a streaming
master and a batch master

Once we have figured out the co-existence of streaming and batch in the same
cluster better, we can remove this separation. This does not affect any
user programs, only the "ops" of the cluster.

Greetings,
Stephan


[1] https://issues.apache.org/jira/browse/FLINK-1368
[2] https://issues.apache.org/jira/browse/FLINK-1323
