Thank you for answering all my questions. My suggestion would be to start off with exposing an API to allow dynamically changing operator parallelism as the users of flink will be better able to decide the right scaling policy. Once this functionality is there, its just a matter of providing policies (ratio based, throughput based, back-pressure based). The web UI could be used for setting parallelism as well.
An analogy would be autoscaling provided by cloud providers. The features provided are: 1. Web UI for manually overriding parallelism (min, max, desired) 2. Metric based scaling policies It will be difficult for developers to think of a reasonable value for maxParallelism for each operator and like I explained above, sometimes even a small increase in parallelism is enough to bring things down. A UI / external policy based approach will allow for quick experimentation and fine tuning. I don't think it will be possible for flink developers to build one size fits all solution. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/