Is there a way to make a task cluster-parallelizable? I.e., ensure that
the parallel instances of the task are distributed across the cluster.
When I run my Flink job with a parallelism of 16, all the parallel tasks
are assigned to the first task manager.
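For reference, the job is roughly of this shape. A minimal sketch
against the 0.10-era DataStream API; the socket source, host, and port
are placeholders for the real input:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ParallelJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
            // Request 16 parallel subtasks per operator.
            env.setParallelism(16);
            // Placeholder source; the real job reads from elsewhere.
            env.socketTextStream("localhost", 9999)
               .print();
            env.execute("Parallel job");
        }
    }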

- Ali

On 2015-11-30, 2:18 PM, "Ufuk Celebi" <u...@apache.org> wrote:

>
>> On 30 Nov 2015, at 17:47, Kashmar, Ali <ali.kash...@emc.com> wrote:
>> Do the parallel instances of each task get distributed across the
>> cluster or is it possible that they all run on the same node?
>
>Yes, slots are requested from all nodes of the cluster. But keep in mind
>that multiple tasks (forming a local pipeline) can be scheduled to the
>same slot (1 slot can hold many tasks).
>
>Have you seen this?
>https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/job_scheduling.html
>
>> If they can all run on the same node, what happens when that node
>> crashes? Does the job manager recreate them using the remaining open
>> slots?
>
>What happens: The job manager tries to restart the program with the same
>parallelism. Thus, if you have enough free slots available in your
>cluster, this works smoothly (so yes, the remaining/available slots are
>used).
>
>With a YARN cluster the task manager containers are restarted
>automatically. In standalone mode, you have to take care of this yourself.
>
>
>Does this help?
>
>– Ufuk
>
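To make the slot mechanics above concrete: because one slot can hold a
whole local pipeline, a single task manager with 16 free slots can
legally host all 16 subtasks; giving each task manager fewer slots
(taskmanager.numberOfTaskSlots in flink-conf.yaml) forces the scheduler
to spread the job across more machines. On the restart side, a rough
sketch assuming the 0.10-era retry API; the retry count and source are
placeholders, so treat this as a sketch, not a recipe:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RestartingJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
            // On a task manager failure, the job manager re-runs the
            // program with the same parallelism, using the remaining
            // free slots in the cluster.
            env.setNumberOfExecutionRetries(3);
            env.setParallelism(16);
            env.socketTextStream("localhost", 9999) // placeholder source
               .print();
            env.execute("Restarting job");
        }
    }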
