> On 30 Nov 2015, at 17:47, Kashmar, Ali <ali.kash...@emc.com> wrote: > Do the parallel instances of each task get distributed across the cluster or > is it possible that they all run on the same node?
Yes, slots are requested from all nodes of the cluster. But keep in mind that multiple tasks (forming a local pipeline) can be scheduled to the same slot (1 slot can hold many tasks). Have you seen this? https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/job_scheduling.html > If they can all run on the same node, what happens when that node crashes? > Does the job manager recreate them using the remaining open slots? What happens: The job manager tries to restart the program with the same parallelism. Thus if you have enough free slots available in your cluster, this works smoothly (so yes, the remaining/available slots are used) With a YARN cluster the task manager containers are restarted automatically. In standalone mode, you have to take care of this yourself. Does this help? – Ufuk