Hello, I'm developing a toy Flink application on my local machine before deploying the real one on a real cluster. Now I have to determine how many nodes I need for the cluster.
I have already read these documents:

- jobs and scheduling <https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/job_scheduling.html>
- programming model <https://ci.apache.org/projects/flink/flink-docs-release-1.2/concepts/programming-model.html>
- parallelism <https://flink.apache.org/faq.html#what-is-the-parallelism-how-do-i-set-it>

But I'm still a bit confused about how many nodes I need to run my application. For example, if I have the following code (from the docs):

<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n13927/Screen_Shot_2017-06-22_at_16.png>

- Does this mean that operations "on the same line" are executed on the same node? (That sounds a bit strange to me.)

Some things I would like to confirm:

- If the answer to the previous question is yes and I set the parallelism to '1', can I determine how many nodes I need by counting how many operations I have to perform?
- If I set the parallelism to 'N' but have fewer than 'N' nodes available, does Flink automatically distribute the processing across the available nodes?

I don't think my throughput and data load are relevant here; they are not heavy.

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/About-nodes-number-on-Flink-tp13927.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.